What robots.txt does not do is keep files out of search engine indexes. All it does is instruct search engine spiders not to crawl pages. Keep in mind that discovery and crawling are separate: discovery occurs as search engines find links in documents, and when search engines discover pages, they may or may not add them to their indexes. A page blocked from crawling can therefore still be listed if other documents link to it. … [Read more...] about Have You Considered Privacy Issues When Using Robots.txt & The Robots Meta Tag?
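To make the crawl-versus-index distinction concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `ExampleBot` user agent, the `example.com` URLs, and the `crawl_allowed` helper are all assumptions made up for illustration; none of them come from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, assumed for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the example.com URLs are placeholders.
blocked = crawl_allowed(
    ROBOTS_TXT, "ExampleBot", "https://www.example.com/private/report.html"
)
allowed = crawl_allowed(
    ROBOTS_TXT, "ExampleBot", "https://www.example.com/public/index.html"
)

# The answer only covers crawling. Nothing here tells us whether a search
# engine has discovered the URL through links or added it to its index.
print("private page crawlable:", blocked)
print("public page crawlable:", allowed)
```

The check answers one question only: may this spider fetch this URL? Whether the URL ends up in an index is decided elsewhere, which is why a Disallow rule is not a privacy control.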
Robots txt add sitemap
Overall, there are some ideas in ACAP that would be useful for the search engines to consider. However, there are many ideas outside of ACAP that would also be useful for them to consider. There’s nothing I see within ACAP that provides some type of crucial control that if only news publishers had, all their online woes would be over. What the news publishers really want are licensing agreements, and given that Google already has several of these without using ACAP (see Josh Cohen Of Google News On Paywalls, Partnerships & Working With Publishers), I can’t see that having it somehow advances any business model changes. … [Read more...] about ACAP Versus Robots.txt For Controlling Search Engines
"Preserve," with similar time limits available for "index," would stipulate whether a copy may be stored in a search engine’s cache. … [Read more...] about ACAP Launches, Robots.txt 2.0 For Blocking Search Engines?
Audience member Dave Naylor posed an interesting question: if you have both XML and TXT files, which gets priority? Dan Crow of Google replied that he was wary about going to an XML format because of this issue. He also mentioned that people have a harder time developing an XML file and run into more problems than with a TXT file. He has seen a number of invalid robots files, including some that contain JPEG graphics, and he is concerned that malformed data is a higher risk in XML files than in simple text files. … [Read more...] about Up Close & Personal With Robots.txt
Obviously, replace the URL above with the URL of your sitemap index file. Neither of these announcements was specific, but looking at the example provided by Yahoo, this does appear to support sitemaps exported into an XML file, not just an HTML file. … [Read more...] about Google, Yahoo, MSN add SiteMap Auto-Discovery
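As a rough illustration of auto-discovery, the sketch below reads a `Sitemap:` line out of a robots.txt file with Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The excerpt does not show the original URL, so the `example.com` sitemap address and the surrounding rules are placeholders assumed for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the Sitemap URL is a placeholder, not the one
# from the original announcement.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None when the file
# declares no Sitemap line.
sitemaps = parser.site_maps()
print(sitemaps)
```

A crawler that supports auto-discovery can find the sitemap this way without the site owner submitting it to each search engine separately.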