Follow or nofollow instructs web crawlers whether or not to follow the links on a page. Nofollow is like adding a rel="nofollow" attribute to every link on the page. Nofollow evaporates PageRank, the raw search engine ranking authority passed from page to page via links. Even if you noindex a page, it is probably a bad idea to nofollow it. Let PageRank flow through to its final destination; otherwise, you could be pouring perfectly good link juice down the drain. … [Read more...] about Have You Considered Privacy Issues When Using Robots.txt & The Robots Meta Tag?
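To illustrate the distinction, here is a sketch of the page-level meta tag versus the per-link attribute (the URL and page are hypothetical examples, not from the original articles):

```html
<!-- Page-level directive: keep this page out of the index,
     but still let crawlers follow its links and pass PageRank -->
<meta name="robots" content="noindex,follow">

<!-- Per-link equivalent of a page-wide nofollow:
     this single link passes no PageRank -->
<a href="https://example.com/some-page" rel="nofollow">a link</a>
```

A page-wide "nofollow" in the meta tag behaves as if every anchor on the page carried the rel="nofollow" shown above.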
Business models are changing, and publishers need a protocol for expressing permissions of access and use that is flexible and extensible as new business models arise. ACAP will be entirely agnostic with respect to business models, but will ensure that revenues can be distributed appropriately. ACAP presents a win-win for the whole online publishing community, with the promise of more high-quality content and more innovation and investment in the online publishing sector. ACAP is for the large as well as the small, and even for individuals. It will benefit all content providers, whether they are working alone or through publishers. A future without publishers willing and able to invest in high-quality content and get a return on that investment is a future without high-quality content on the net. … [Read more...] about ACAP Versus Robots.txt For Controlling Search Engines
Pages you block with robots.txt disallow rules may still be in Google’s index and appear in the search results — especially if other sites link to them. Granted, a high ranking is pretty unlikely, since Google can’t “see” the page content; it has very little to go on other than the anchor text of inbound and internal links and the URL (plus the ODP title and description, if the page is listed in ODP/DMOZ). As a result, the URL of the page and, potentially, other publicly available information can appear in search results. However, no content from your pages will be crawled, indexed or displayed. … [Read more...] about A Deeper Look At Robots.txt
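The distinction is between blocking crawling and blocking indexing. A minimal sketch, using a hypothetical directory path:

```
# robots.txt — blocks CRAWLING of /private/, but a disallowed URL
# can still be indexed and shown in results if other sites link to it
User-agent: *
Disallow: /private/
```

To keep a page out of the results entirely, the usual approach is the opposite: allow crawling and serve a noindex directive on the page itself (via a robots meta tag or an X-Robots-Tag HTTP header), so the engine can see and honor it. A robots.txt disallow would prevent the crawler from ever reading that directive.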
This move may be an effort to present a united front in light of ongoing publisher attempts to create new search engine access standards with ACAP. It reflects the consistent messaging the search engines have offered about ACAP. For instance, Rob Jonas, Google’s head of media and publishing partnerships in Europe, said in March that “the general view is that the robots.txt protocol provides everything that most publishers need to do.” … [Read more...] about Yahoo!, Google, Microsoft Clarify Robots.txt Support
One of the announcements during the week of SES was Ask.com joining Google, MSN and Yahoo in supporting Sitemaps auto-discovery. This feature allows webmasters to specify the location of their sitemaps within their robots.txt file, eliminating the need to submit sitemaps to each engine separately. Keith Hogan of Ask.com mentioned this change and its impact in his presentation. Essentially, a sitemap is a simple XML file that lists URLs, along with information about those URLs, to help spiders do a better job of crawling a site. See www.sitemaps.org for more details. … [Read more...] about Up Close & Personal With Robots.txt
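Auto-discovery amounts to one extra line in robots.txt pointing at the sitemap. A minimal sketch, using a hypothetical site (the sitemap format follows the sitemaps.org protocol):

```
# robots.txt — the Sitemap directive can appear anywhere in the file
# and is what Ask, Google, MSN and Yahoo pick up automatically
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```

And the sitemap file it points to might look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2008-06-01</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```

With the directive in place, any engine that supports auto-discovery finds the sitemap on its next robots.txt fetch, with no per-engine submission needed.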