Ok. I just had an interesting thing happen to me.
I blogged a couple of weeks ago about having a stat counter on my blogs on Multiply. It takes the IP of the person’s ISP and logs how many times they visit, etc.
I went to visit the counter and this is what I saw:
| VISITOR ANALYSIS |
| Referring Link | No referring link |
| Host Name | crawl-66-249-72-166.googlebot.com |
| IP Address | 66.249.72.166 [Label IP Address] |
| Country | United States |
| Region | New York |
| City | New York |
| ISP | Google Inc |
| Returning Visits | 0 |
| Visit Length | 0 seconds |
| VISITOR SYSTEM SPECS |
| Browser | |
| Operating System | |
| Resolution | Unknown |
| Javascript | Disabled |
Navigation Path
| Date | Time | WebPage |
| 12th May 2008 | 23:01:55 | No referring link |
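For anyone wondering whether a visitor like this really is Google and not something just pretending to be, the usual check is a reverse DNS lookup on the IP followed by a forward lookup on the host name it returns. Here is a minimal sketch of that idea. The function name `verify_googlebot`, the suffix list, and the injectable lookup parameters are my own assumptions for illustration, not anything the stat counter provides.

```python
import socket

# Assumption: genuine Google crawler hosts end in one of these domains.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def verify_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google crawler host
    and that host resolves back to the same IP.

    The lookup functions can be swapped out (hypothetical parameters)
    so the logic can be tried without touching the network.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, ip_list); keep the hostname.
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = socket.gethostbyname

    try:
        host = reverse_lookup(ip)
        if not host.lower().endswith(GOOGLE_CRAWLER_SUFFIXES):
            return False
        # Forward-confirm: the host name must map back to the original IP.
        return forward_lookup(host) == ip
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror).
        return False
```

Calling `verify_googlebot("66.249.72.166")` with no extra arguments would do real DNS lookups, so the answer depends on what DNS says at the time you run it.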
WTH is THIS??
I looked up googlebot (yes I googled it lol) and here is what it says on Wikipedia about it:
A Googlebot is a search bot used by Google. It collects documents from the web to build a searchable index for the Google search engine.
If a webmaster wishes to restrict the information on their site available to a Googlebot, or another well-behaved spider, they can do so with the appropriate directives in a robots.txt file,[1] or by adding the meta tag <meta name="Googlebot" content="noindex"> to the webpage. [2] Googlebot requests to Web servers are discernible from their user-agent string 'Googlebot'.
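To see what those robots.txt directives actually do, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `can_crawl` helper, and the example.com URLs are made up for illustration. On a hosted blog service you may not be able to edit robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of everything,
# but leave the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a well-behaved bot with this user agent may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_crawl("Googlebot", "http://example.com/blog/"))
    print(can_crawl("SomeOtherBot", "http://example.com/blog/"))
```

Note that this only describes what a polite crawler is supposed to do. Nothing in robots.txt physically stops a bot that chooses to ignore it.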
Googlebot has two versions, deepbot and freshbot. Deepbot, the deep crawler, tries to follow every link on the web and download as many pages as it can to the Google indexers. It completes this process about once a month. Freshbot crawls the web looking for fresh content. It visits websites that change frequently, according to how frequently they change. Currently Googlebot only follows HREF links and SRC links. [3]
Googlebot discovers pages by harvesting all of the links on every page it finds. It then follows these links to other web pages. New web pages must be linked to from another known page on the web in order to be crawled and indexed.
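The "harvesting all of the links" part is easier to picture with a toy example. This is only a rough sketch of the idea using Python's standard library, not how Googlebot is actually built. The `LinkHarvester` class and `harvest_links` function are hypothetical names, and the sketch simply collects every HREF and SRC attribute it finds on a page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkHarvester(HTMLParser):
    """Collect every href and src attribute found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Turn relative links into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def harvest_links(html, base_url):
    """Return the absolute URLs of all HREF and SRC links in `html`."""
    harvester = LinkHarvester(base_url)
    harvester.feed(html)
    return harvester.links
```

A crawler would then visit each harvested URL and repeat the process, which is why a page nobody links to never gets found this way.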
A problem which webmasters have often noted with the Googlebot is that it takes up an enormous amount of bandwidth. This can cause websites to exceed their bandwidth limit and be taken down temporarily. This is especially troublesome for mirror sites which host many gigabytes of data. Google provides "Webmaster Tools" that allow website owners to throttle the crawl rate.
Okay…exactly WHAT is Google doing on my blogs? And who is going to get this information? And why would anyone want to see what is in my blog? I understand it is a public document and I really have no problem with anyone seeing it. But, do I really need to be searchable? And what the hell is GOOGLE doing on a MULTIPLY site?
I swear…Big Brother is EVERYWHERE.