GIDForums  

  #1  
Old 27-Nov-2002, 06:47
JdS JdS is offline
Senior Member
 
Join Date: Aug 2001
Location: KUL, Malaysia
Posts: 3,371

How to search a huge text file for data?


I have been trying for days to get this to work, but PHP's filesystem functions are obviously not my forte.

The text file I want to 'search' is the website's access log, which is huge.

So far I have tried:

PHP Code:

<?php

//  I hope to isolate all instances of visits by search engine spiders
//  for example: googlebot / zyborg / slurp / etc
$fd = fopen( 'access.log', 'r' );
while( !feof($fd) )
{
  $buffer = fgets( $fd, 4096 );
  if( strstr(strtolower($buffer), 'googlebot') )
  {
    echo $buffer."<br />\n";
  }
}

fclose ($fd);

?>


I am already stuck at this point: even with just one day's worth of logs, this piece of code times out.

If you run the Apache web server on your Win32 PC, you can usually find the access log file (and view sample data) at:
Code:
c:\program files\apache group\apache\logs\access.log
unless you've changed this path in httpd.conf...
  #2  
Old 27-Nov-2002, 15:33
pcxgamer pcxgamer is offline
Senior Member
 
Join Date: Sep 2002
Location: South Carolina, USA
Posts: 1,095
Is the file being opened OK? As it stands, you aren't doing any error checking at all. I'd change the open line to:
$fd = fopen('/path/to/access_log', 'r') or die("Couldn't open the file.");

Just a thought. Let me know whether this helps.
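For anyone following along, here is a minimal sketch of that check written as a small helper. The name open_log() is made up for illustration and is not from the original script. The PHP manual also warns that a while( !feof($fd) ) loop may never end when the file pointer is invalid, so a failed fopen() could by itself look like a time-out.

```php
<?php
// Sketch only: open_log() is a hypothetical helper, not code from the thread.
function open_log($path)
{
    // @ hides the PHP warning so the failure is reported once, below.
    $fd = @fopen($path, 'r');
    if ($fd === false) {
        // error_get_last() still records the suppressed warning.
        $err = error_get_last();
        $why = $err !== null ? $err['message'] : 'unknown error';
        throw new RuntimeException("Couldn't open $path: $why");
    }
    return $fd; // a valid file handle, safe to pass to fgets()
}
```

Throwing an exception instead of calling die() is just one design choice; it lets the calling page decide how to report the problem.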
__________________
If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.
  #3  
Old 28-Nov-2002, 09:41
JdS JdS is offline
Nope, it's not a matter of error checking... the file being searched is 47 MB (give or take 3 bytes)!

So it's no wonder it times out...
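One common way around a time-out like this is to lift PHP's execution limit and keep reading one line at a time. The sketch below assumes the host allows set_time_limit(); the names scan_log(), $spiders and $onMatch are illustrative only and are not part of the script discussed in this thread.

```php
<?php
// Sketch only: scan_log(), $spiders and $onMatch are illustrative names.
function scan_log($path, array $spiders, $onMatch)
{
    // Ask PHP to drop the execution time limit for this request.
    // Assumption: the host permits this; some shared hosts do not.
    @set_time_limit(0);

    $fd = @fopen($path, 'r');
    if ($fd === false) {
        return -1; // could not open the log
    }

    $count = 0;
    // Read one line at a time so memory use stays flat on a large file.
    while (($line = fgets($fd)) !== false) {
        foreach ($spiders as $spider) {
            // stripos() is a case-insensitive substring search.
            if (stripos($line, $spider) !== false) {
                call_user_func($onMatch, rtrim($line, "\r\n"), $spider);
                $count++;
                break; // one report per line is enough
            }
        }
    }
    fclose($fd);
    return $count; // number of matching lines
}
```

It could be called with array('googlebot', 'zyborg', 'slurp') and a callback that echoes each line through htmlspecialchars().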

And I think I've got a workaround, but it works rather clumsily.

I'll tell you more once I've sent an email to Scott of www.Vilitas.com, warning him that I want to try out this script sometime in the near future.
  #4  
Old 28-Nov-2002, 09:45
pcxgamer pcxgamer is offline
Oh OK, let me know.
  #5  
Old 29-Nov-2002, 05:46
JdS JdS is offline
Here's GIDGoogle™ ver. 0.0.1:

http://topsites.gidhelp.com/get_google.php

It grabs any Googlebot activity from your Apache logs, and it doesn't matter how large your log file is! Cool, I think.
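The thread does not say how the script copes with a log of any size. Purely as an assumption, and not a description of GIDGoogle itself, one way to get that property is to scan a limited number of lines per request and remember the byte offset where the scan stopped. The helper scan_chunk() and its parameters are hypothetical.

```php
<?php
// Sketch only: scan_chunk() is a hypothetical helper, not the GIDGoogle code.
function scan_chunk($path, $needle, $offset = 0, $maxLines = 5000)
{
    $fd = @fopen($path, 'r');
    if ($fd === false) {
        return false;
    }

    // Jump to where the previous request stopped.
    fseek($fd, $offset);

    $matches = array();
    $read = 0;
    while ($read < $maxLines && ($line = fgets($fd)) !== false) {
        $read++;
        if (stripos($line, $needle) !== false) {
            $matches[] = rtrim($line, "\r\n");
        }
    }

    // Remember the byte position so the next request can resume here.
    $next = ftell($fd);
    $stat = fstat($fd);
    fclose($fd);

    return array(
        'matches' => $matches,
        'offset'  => $next,
        'done'    => $next >= $stat['size'], // true once the end is reached
    );
}
```

Each request then does a bounded amount of work, and the page can call itself again with the returned offset until 'done' is true.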

Once I've moved the script into a class and added some other features, I will offer it as a download too...
  #6  
Old 29-Nov-2002, 06:01
pcxgamer pcxgamer is offline
Great job. Let me know when you're going to offer it as a download; I could use something like it on my server at maxipoint. I'm getting ready to submit my site to them.
  #7  
Old 26-May-2003, 15:02
jrobbio jrobbio is offline
Regular Member
 
Join Date: Jan 2003
Location: Loughborough, England
Posts: 840
Did you ever create a download for this?

Rob
  #8  
Old 27-May-2003, 10:27
JdS JdS is offline
No, it was seriously flawed, so I went the GIDTrackbot™ route instead!

Also, getting to grips with Linux lately has opened up many new ways to get the same results with much less processing overhead... One day I will work out a new way to pull this data out (but only from a Linux server).
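The post does not name any particular Linux tool. As one possible illustration, assuming a Linux host with the grep command on the PATH, the search can be handed to grep and only the matching lines read back into PHP. The wrapper name grep_log() is hypothetical.

```php
<?php
// Sketch only: grep_log() is a hypothetical wrapper; it assumes a
// Linux/Unix host where the grep command is available on the PATH.
function grep_log($path, $needle)
{
    // -i: ignore case, -F: treat the needle as a fixed string,
    // --: end of options. escapeshellarg() quotes both values for the shell.
    $cmd = 'grep -i -F -- ' . escapeshellarg($needle) . ' ' . escapeshellarg($path);

    $pipe = popen($cmd, 'r');
    if ($pipe === false) {
        return false;
    }

    // Only matching lines come through the pipe, so PHP does far less work.
    $matches = array();
    while (($line = fgets($pipe)) !== false) {
        $matches[] = rtrim($line, "\r\n");
    }
    pclose($pipe);
    return $matches;
}
```

Something like grep_log('/path/to/access_log', 'googlebot') would then return the matching lines as an array.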
 
 
