The follow or nofollow directive tells web crawlers whether to follow the links on a page. Nofollow is like adding a rel="nofollow" attribute to every link on the page. Nofollow evaporates PageRank, the raw search engine ranking authority passed from page to page via links. Even if you noindex a page, it is probably a bad idea to nofollow it. Let PageRank flow through to its final conclusion. Otherwise, you could be pouring perfectly good link juice down the drain. … [Read more...] about Have You Considered Privacy Issues When Using Robots.txt & The Robots Meta Tag?
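As a rough illustration of the directive described in the excerpt above, here is a minimal Python sketch that reads the robots meta tag from a page and reports whether its links would be followed. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample markup are all hypothetical, and real crawlers apply their own, more involved rules.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page: kept out of the index, but its links are still followed.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head></html>'
)

directives = robots_directives(SAMPLE_PAGE)
links_followed = "nofollow" not in directives
page_indexable = "noindex" not in directives
```

The sample page combines noindex with follow, which matches the excerpt's advice: the page stays out of the index while its links are still followed.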
Robots txt where to place
Cloaking: Those savvy to search engines know that Google hates cloaking, which is the act of showing a search engine something different from what a human visitor would see. It’s often associated with spam. There are plenty of cases where people have shown misleading content to a search engine, in hopes of getting a good ranking. One example is from 1999, when the FTC took action against a site that cloaked its content to rank for “innocent” searches such as Oklahoma tornadoes and then directed those searchers to porn sites. The idea of a publisher forcing a search engine to allow cloaking would be somewhat similar to a newspaper being forced to write whatever a subject demanded be written about them. … [Read more...] about ACAP Versus Robots.txt For Controlling Search Engines
Stephan Spencer is the creator of the 3-day immersive SEO seminar Traffic Control; an author of the O’Reilly books The Art of SEO, Google Power Search, and Social eCommerce; founder of the SEO agency Netconcepts (acquired in 2010); inventor of the SEO proxy technology GravityStream; and the host of two podcasts, The Optimized Geek and Marketing Speak. … [Read more...] about A Deeper Look At Robots.txt
I would recommend going even a bit further, and perhaps removing the robots.txt file completely. The general idea behind blocking some of those pages from crawling is to prevent them from being indexed. However, that's not really necessary -- websites can still be crawled, indexed and ranked fine with pages like their terms of service or shipping information indexed (sometimes that's even useful to the user :-)). I know many SEOs feel it is mandatory to have a robots.txt file and just have it say User-agent: * followed by Allow: /. Why bother, when crawlers will eat up your content anyway? … [Read more...] about Google: Remove The Robots.txt File Completely
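To make the point in the excerpt above concrete, the following Python sketch uses the standard library's `urllib.robotparser` to compare an allow-all robots.txt with an empty one. The empty string is only a stand-in for a removed file, and the user agent name `ExampleBot`, the helper `can_fetch`, and the example.com URL are assumptions made for illustration. It shows how one parser interprets the rules, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, url, user_agent="ExampleBot"):
    """Parse robots.txt content from a string and check one URL against it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The allow-all file mentioned in the excerpt.
ALLOW_ALL = "User-agent: *\nAllow: /\n"
# Stand-in for having no rules at all (assumption: treated like a missing file).
EMPTY = ""
# A blocking rule, for contrast.
BLOCK_TERMS = "User-agent: *\nDisallow: /terms-of-service\n"

URL = "https://example.com/terms-of-service"  # hypothetical page

allowed_with_allow_all = can_fetch(ALLOW_ALL, URL)
allowed_with_empty = can_fetch(EMPTY, URL)
allowed_with_block = can_fetch(BLOCK_TERMS, URL)
```

Under these assumptions, the allow-all file and the empty file give the same answer for the URL, and only the explicit Disallow rule changes it.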
Depending on the Host and UA, the official Yahoo! Slurp apparently does whatever it wants to. Note the subtle differences in the subdomains and UAs... This morning, the only Host to read/heed robots.txt was: … [Read more...] about Yahoo’s Crawler Not Listening To Robots.txt Directive?