Stephan Spencer is the creator of the 3-day immersive SEO seminar Traffic Control; an author of the O’Reilly books The Art of SEO, Google Power Search, and Social eCommerce; founder of the SEO agency Netconcepts (acquired in 2010); inventor of the SEO proxy technology GravityStream; and the host of two podcasts, The Optimized Geek and Marketing Speak. … [Read more...] about A Deeper Look At Robots.txt
Where is the robots.txt file located?
There’s No Stopping Bad Behavior. A problem you will have with both robots.txt and the robots meta tag is that these instructions cannot enforce their own directives. While Google and Bing will certainly respect your instructions, someone using Screaming Frog, Xenu, or their own custom site crawler can simply ignore disallow and noindex directives. … [Read more...] about Have You Considered Privacy Issues When Using Robots.txt & The Robots Meta Tag?
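The point about enforcement can be seen in how crawler libraries actually work: respecting robots.txt is an explicit, optional step on the client side. A minimal sketch using Python's standard-library `urllib.robotparser` (the rules and URLs are illustrative placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt that disallows a private directory for all agents.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# A polite crawler consults the rules before fetching each URL.
allowed = rp.can_fetch("*", "https://example.com/private/report.html")
print(allowed)  # False -- the directive says not to fetch this URL

# Nothing stops a rogue crawler from skipping this check entirely and
# fetching the URL anyway: robots.txt is advisory, not access control.
```

If a page must stay private, put it behind authentication or block it at the server level; robots.txt only asks, it never prevents.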
Business models are changing, and publishers need a protocol to express permissions for access and use that is flexible and extensible as new business models arise. ACAP will be entirely agnostic with respect to business models, but will ensure that revenues can be distributed appropriately. ACAP presents a win-win for the whole online publishing community, with the promise of more high-quality content and more innovation and investment in the online publishing sector. ACAP is for large and small publishers alike, and even for individuals. It will benefit all content providers, whether they are working alone or through publishers. A future without publishers willing and able to invest in high-quality content and get a return on that investment is a future without high-quality content on the net. … [Read more...] about ACAP Versus Robots.txt For Controlling Search Engines
One of the announcements during the week of SES was Ask.com joining Google, MSN, and Yahoo in supporting Sitemaps autodiscovery. This feature allows webmasters to specify the location of their sitemaps within their robots.txt file. Keith Hogan of Ask.com discussed this change and its impact in his presentation. It will eliminate the need to submit sitemaps to each engine separately. Essentially, a sitemap is a simple XML file that lists URLs, along with information about those URLs, to help spiders do a better job of crawling a site. See www.sitemaps.org for more details. … [Read more...] about Up Close & Personal With Robots.txt
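Per the sitemaps.org protocol, autodiscovery is a single `Sitemap:` line in robots.txt. The line is independent of any `User-agent:` block and must use the sitemap's full, absolute URL (the domain below is a placeholder):

```
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```

Once this line is in place, any engine supporting autodiscovery will find the sitemap on its next crawl of robots.txt, with no per-engine submission required.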
Please don't do this. I get humans.txt, but this is content that should be on your site, discoverable and searchable. The same problems this is trying to solve will also be present in this solution. Yelp and all similar services need to get smarter and do the page scanning and indexing automatically. THIS WILL NOT SOLVE THAT PROBLEM. … [Read more...] about Googler: Don’t Use business.txt Files