20 June 2011

schema.org and Blogger don’t mix well together

There was a lot of talk a couple of months ago about the degradation of Google’s search results, the increasing number of less than relevant results, etc. The short-term response from Google was to manually demote some sites. I think that we are starting to see signs of their long-term strategy as well, with the integration of social search, specifically the +1 button, and with the push for a more semantic web through schema.org. Even though schema.org is a partnership with the two other big search engines, Bing and Yahoo!, Google still stands to gain the most from it, by virtue of its significantly bigger market share. Personally, I believe schema.org will have the larger impact on the quality of search results, because it enables the ‘dumb’ algorithm to ‘understand’ the web better: to see that some words or sections of a webpage are more important and to prioritize the webpage accordingly. On the other hand, +1’s or any other type of social signal will only show that a link is ‘important’ to some group of people (or demographic), without specifying the context, that is, why that piece of content was shared.

But how well does it work in real life? I have considered adding schema.org markup to my blog, especially to the section where I publish my book reviews. Unfortunately, the process is far from easy or straightforward. Blogger doesn’t offer any way (that I know of) to automate the process, so if you have a lot of posts to update, you will need a lot of time and patience, not to mention attention to detail: it’s easy to make mistakes while manually editing the HTML. What’s even worse is that Blogger doesn’t support <meta> tags in posts (e.g. for ratings), so some schema.org markup is completely unavailable. You can work around this limitation, but that complicates the process even further.
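To give an idea of what the manual editing involves, here is a minimal Python sketch that generates schema.org microdata for a book review, which could then be pasted into a post's HTML. The function name `review_markup` and its parameters are hypothetical, and the sample values are placeholders. The rating is written as visible text inside `<span>` elements rather than in a `<meta>` tag, which is one possible way around the limitation mentioned above; I can't promise that the testing tool will accept it.

```python
from html import escape


def review_markup(title, author, rating, best=5, body=""):
    """Return an HTML fragment with schema.org Review microdata.

    Hypothetical helper for illustration only; it is not part of Blogger.
    The rating is emitted as visible text inside <span> elements, so no
    <meta> tag is needed in the post.
    """
    return "\n".join([
        '<div itemscope itemtype="http://schema.org/Review">',
        '  <div itemprop="itemReviewed" itemscope itemtype="http://schema.org/Book">',
        f'    <span itemprop="name">{escape(title)}</span> by',
        f'    <span itemprop="author">{escape(author)}</span>',
        '  </div>',
        '  <div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">',
        f'    Rating: <span itemprop="ratingValue">{rating}</span> out of',
        f'    <span itemprop="bestRating">{best}</span>',
        '  </div>',
        f'  <div itemprop="reviewBody">{escape(body)}</div>',
        '</div>',
    ])


if __name__ == "__main__":
    # Sample values are placeholders, not taken from a real post.
    print(review_markup("Example Title", "Example Author", 4, body="A short review."))
```

Generating the fragment from a function like this at least keeps the attribute names consistent from one post to the next, even if each fragment still has to be pasted in by hand.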

I haven’t worked with WordPress extensively, but I think that platform should be easier to set up, with custom fields or a plug-in. Blogger’s widgets are not suited for something like this, because I don’t think you can make them generate custom content related to the current post – in my case the book title, author, cover and additional information. For that to work, the widget would have to look up some kind of external database, something that’s well beyond the technical knowledge of most bloggers, myself included. The easier solution would be for Blogger to integrate some of the schema.org markup in the editor, the way formatting is implemented at the moment: users would simply select a block of text and assign it a category from a list, with no need to know all the background details.
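The editor feature described above could boil down to something like the following sketch: take the selected range of the post text and wrap it in a tag carrying the chosen property. `wrap_selection` and its arguments are made-up names for illustration, not an existing Blogger feature, and the sketch assumes the text is plain text with no HTML tags in it.

```python
from html import escape


def wrap_selection(text, start, end, prop):
    """Wrap text[start:end] in a <span> carrying the given itemprop.

    Hypothetical helper; assumes plain text input and escapes it as HTML.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("selection is outside the text")
    before, selected, after = text[:start], text[start:end], text[end:]
    return (
        escape(before)
        + f'<span itemprop="{escape(prop)}">{escape(selected)}</span>'
        + escape(after)
    )
```

A real editor would also have to cope with selections that cross existing tags, which this sketch deliberately ignores.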

I finally tested schema.org on a single article, although I’m not sure I got it right. Testing that post with Google’s Rich Snippets testing tool produces an error saying that it can’t generate a preview, although further down the page the few elements I added are recognized. Maybe there is some minimal set of tags you have to add to the page, or maybe Google hasn’t yet implemented a visual representation of this metadata for the search results. In conclusion, this particular project of mine is on hold indefinitely, until I find an easier and more reliable way to make it work.

Since I’m on the subject of search results and their relevance, I was recently reminded of a big shortcoming of current search engines: they index the entire page, including portions irrelevant to the main content, like the content of widgets on blogs. Take for example Google’s own blogs: all of them have a “Powered by Blogger” widget; that text gets indexed and ALL the posts appear in searches for “blogger”, even if they have absolutely nothing to do with that query. The problem is far from new, but maybe schema.org could help solve it: you should be able to specify that some sections of a webpage are not relevant for search engines and shouldn’t be crawled, just as you can for an entire page using robots.txt.

1 comment:

  1. Yes, you are right. I even read about these tags on the Google Webmaster blog and went to implement them on my blog, but it gave an error.

    Thank you,
    Google Affiliate Marketer.
