#1
Grub's distributed web crawling project
Good for webmasters running servers

I found out about this Grub project today: http://www.grub.org
It seems to have a lot of potential, since it uses people's spare bandwidth to reindex the databases. It turns out that Looksmart actually purchased it in January, probably to use with Wisenut, since their database is forever behind. However, their gain could be our gain: if you can spare bandwidth on your server, or just on your connection, you can have your site(s) crawled first. They haven't got it functioning if you are not running it on the same system as your website, but apparently this is coming soon. Because of this there isn't actually any gain for me in using it at the moment, but I am testing it out. The screen saver is actually really groovy, but it eats bandwidth if you don't set it up correctly, so they are thinking about putting in a bandwidth meter to make you aware of how much you are using. In conclusion, it is something to look out for, or to try out if you have a site whose content changes a lot, such as a news site. It should make things very competitive between the big 4 if they pull this off.
#2
Impressive idea. I'd have to agree that if it works, this will make things VERY nice for some of us...
#3
You can query the database here to see whether your URL is being crawled, and submit it if not:
http://brainbug.grub.org/urls.php
If you know how to work with XML, you can also look at what is in the database here:
http://www.grub.org/html/documents.php?op=services
#4
It turns out that it is probably intended for Wisenut, whose index updates are slow to say the least, with some interaction with the Zeal catalogue.
I find Wisenut incredibly hard to read, so until they fix that I still won't use it. There was one suggestion I read that had real potential: each individual webmaster would crawl and submit only their own websites, which would eliminate the massive requirements of the crawlers. I think that would make some webmasters very happy.
#5
Grub Teams

There are now teams available to join in the Grub project, so the available bandwidth is likely to increase massively as all the distributed computing teams use it as another battlefield. It should actually work well alongside some of the other projects, as they don't generally interfere with one another, so one machine can do even more.
Anyway, an update on the project: they sorted out the robots.txt issues a while back, so there are only a few irate people complaining about excessive crawling, and everything else is starting to look positive. No sign of a beta of the search engine yet, but it is on the cards before winter sets in, so hopefully it will revolutionise the search engine rankings, with Google faltering and MSN creating their own. Exciting times ahead.

Rob

P.S. The local crawling works pretty well too; however, there is still a problem with pages that cannot be added when the URL contains anything after the "?". This is to be fixed soon, and then I'll be very happy.
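On the robots.txt point above: for anyone wondering how a crawler decides whether it may fetch a page, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not how the Grub client itself does it. The robots.txt rules, the `grub-client` user-agent name and the example.com URLs are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The rules and the "grub-client"
# user-agent name are assumptions, not taken from the Grub project.
ROBOTS_TXT = """\
User-agent: grub-client
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, agent: str, url: str) -> bool:
    """Return True if the rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for page in ("http://example.com/private/page.html",
                 "http://example.com/news/today.html"):
        print(page, may_crawl(rules, "grub-client", page))
```

A webmaster who wants to keep a crawler out of part of a site would add a `Disallow` line for that crawler's user-agent; a well-behaved crawler runs a check like this before every fetch.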