Sunday, November 20, 2011

Block robots from specific page(s) in a Blogspot blog

Even if you want search engines to index your blog, there may be some pages that you don't want robots to index. You cannot control the robots.txt file in Blogger, but you can use template conditionals and the "robots" meta tag to keep individual pages out of the index of well-behaved robots (like Google).

For example, I have an EmailMeForm contact form, which I have set to redirect to a "thank you" page after the form is submitted. I decided to block this static "thank you" page from being indexed.

To block a static page, put this code somewhere in the <head> section of the template HTML (before </head>), for example just before the line <b:include data='blog' name='all-head-content'/>:
<b:if cond='data:blog.url == &quot;http://BLOGURL.blogspot.com/p/PAGE.html&quot;'> 
  <meta content='noindex,nofollow' name='robots'/>
</b:if>
Replace the URL http://BLOGURL.blogspot.com/p/PAGE.html with the actual URL of the page in your blog (in my blog it was http://mspotilas.blogspot.com/p/kiitos_09.html).
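
If you have more than one page to block, one simple approach is to repeat the conditional once per page. This is only a sketch: PAGE1.html and PAGE2.html are hypothetical placeholders, and I am assuming that each URL matches data:blog.url exactly as Blogger reports it for that page.

```xml
<!-- Hypothetical example: one conditional per static page to block -->
<b:if cond='data:blog.url == &quot;http://BLOGURL.blogspot.com/p/PAGE1.html&quot;'>
  <meta content='noindex,nofollow' name='robots'/>
</b:if>
<b:if cond='data:blog.url == &quot;http://BLOGURL.blogspot.com/p/PAGE2.html&quot;'>
  <meta content='noindex,nofollow' name='robots'/>
</b:if>
```

Only one of the conditions can be true on any given page, so at most one meta tag ends up in the output.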

Note that the above code does not hide single posts very well, only static pages. Posts are also shown on the front page and on archive pages, where a search robot would still get the contents of the post.
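
To illustrate the limitation, here is roughly what the same conditional would look like for a single post. The address http://BLOGURL.blogspot.com/YYYY/MM/POST.html is a placeholder for the post's own permalink, and I have not tested this variant. The meta tag is added only on that one page, so the front page and archive pages that list the post stay indexable.

```xml
<!-- Untested sketch: affects only the post's own permalink page -->
<b:if cond='data:blog.url == &quot;http://BLOGURL.blogspot.com/YYYY/MM/POST.html&quot;'>
  <meta content='noindex,nofollow' name='robots'/>
</b:if>
```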

You could turn off the indexing of post archive pages, like this:
<b:if cond='data:blog.pageType == &quot;archive&quot;'> 
  <meta content='noindex,nofollow' name='robots'/>
</b:if>
I have not tried this one, but it might trim down the number of search results, which may be desirable. If you use Google Custom Search Engine (CSE) in your blog, you can trim down the CSE search results by excluding the *.blogspot.com/*_archive.html URL pattern from CSE Sites, but that is another story.

And this is how to hide all static pages from robots, whatever the reason may be:
<b:if cond='data:blog.pageType == &quot;static_page&quot;'>
  <meta content='noindex,nofollow' name='robots'/>
</b:if>