For the last few years webmasters have been able to mark up their website HTML with structured data to make their content clearer for the search engines and, since 2009, to take advantage of rich snippets.

Then last June this process received a healthy injection when Bing, Google and Yahoo! banded together to launch Schema.org, a collaborative project which aims to create a common vocabulary of structured data tags. In their words:

“By adding additional tags to the HTML… you can help search engines and other applications better understand your content and display it in a useful, relevant way.”

Like existing projects such as microformats.org, Schema offers a shared vocabulary of markup terms, but one that is far more flexible and easier to access, which webmasters can reference with the microdata, microformats or RDFa languages.

But which of these three formats are people using to mark up their data, if any, and how attractive is Schema following recent developments at Google such as social and Search Plus Your World (SPYW), direct answers, recipe search and semantic search?

Structured Data and Schema.org

Schema attempts to create a common markup vocabulary and standardise the microdata format for HTML5 so that the search engines will be able to easily read the semantic meaning behind websites in the future. The search engines apparently chose microdata because it finds that Goldilocks zone between the alternative formats: the more complex RDFa and the oversimplified microformats.
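To make the difference between the three formats concrete, here is a minimal sketch of the same fact, a person’s name, written in each syntax. The name is a made-up placeholder, and the RDFa version assumes the simplified RDFa Lite attributes (vocab, typeof, property) rather than full RDFa.

```html
<!-- Microdata with the Schema.org vocabulary -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
</div>

<!-- RDFa Lite with the Schema.org vocabulary -->
<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>
</div>

<!-- Microformats (hCard): the meaning is carried by class names -->
<div class="vcard">
  <span class="fn">Jane Doe</span>
</div>
```

Microformats reuse the ordinary class attribute, which is a large part of why they are quick to add to existing HTML, whereas microdata and RDFa point at a vocabulary by URL.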

At the moment the search engines still support these other structured data formats and serve up rich snippets for microformat and RDFa marked sites. Last November they did announce they would try to start using RDFa, but if microdata is their primary focus then it could be worth learning this markup language and using it as per Schema’s advice.

For those who don’t want to learn microdata, microformats are much more accessible and quicker to use, which is handy if you have a ton of HTML to mark up on multiple sites. A couple of weeks ago Glenn Jones of Madgex gave an insightful presentation at BrightonSEO entitled Microformats and SEO. Since I had been planning to write about structured data anyway, it was a well-timed lecture! Glenn focused on microformats and he didn’t seem to think webmasters need fear them being phased out. In fact, at the moment it looks like most people are using microformats because of their relative simplicity. Until microdata becomes the standard, there doesn’t appear to be any harm in using microformats.

How and why should Webmasters use Structured Data?

Rich Snippets

There are definite benefits to marking up your HTML as structured data. First of all, Google and the Gang display rich snippets for some (but not all) webpages which have correctly marked up the attributes outlined in Schema’s vocabulary.

If you are familiar with rich snippets you will understand their potential. They make your listings more visually attractive and more informative, including detailed information and sometimes even images. Here is a rather delicious-looking snippet for a banana split recipe. What would otherwise be a static article practically becomes an advert:

[Image: Banana split rich snippet]

To see just how this differentiates a listing in the search engine results pages (SERPs), take a look at this eyetracking image from Moz’s Dr Pete:

[Image: Rich snippet eye-tracking]

It’s interesting to see just how eye-catching the snippet is, even more so than the top result. Last week in his talk on microformats, Glenn referenced the industry disagreement about how influential rich snippets really are in the SERPs. He pointed out Paul Bruemmer’s article on Search Engine Land, which reported a 30% increase in click-through rate (CTR) with structured markup in place, and compared it to Richard Baxter’s piece on SEO Gadget, which ‘only’ saw a 5% increase. Glenn gave his own approximation, suggesting that rich snippets probably increase CTR by around 10-25%.

Because of the visual attraction of rich snippets and the potential effect they have on CTR, SEOs constantly need to be aware of their presence in the SERPs. That’s why Linkdex has started to report on this when benchmarking your keywords.

In terms of actually marking up your content, if we take the banana split from earlier we can run the HTML through Google’s rather handy Rich Snippet Testing Tool to see the underlying markup:

[Image: Rich snippet testing tool]

Here the author has used microdata to mark up their HTML but has referenced data-vocabulary.org, an older alternative to Schema (it looks like this was published a while back in ’05). Note the different recipe vocabularies for Schema and data-vocabulary. You can see they have marked the reviews, provided a summary and linked to a photo, but they haven’t referenced ingredients or cooking time.

Note also the ability to link to a Google+ profile. Author information in the search results is something Google are pushing with rel=author (instead of the older, more general rel=me). As social signals continue to become a stronger ranking factor thanks to SPYW, tying in your existing content (especially content that renders rich snippets) is a great way of spreading it across the search and social channels.
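As a rough sketch of the authorship link, an article byline might point at the writer’s Google+ profile with rel=author. The profile URL, the about-page URL and the author name below are hypothetical placeholders, and the exact setup Google expects may differ from this.

```html
<!-- Byline linking the article to its author's profile (hypothetical URL) -->
<p>
  By <a rel="author" href="https://plus.google.com/YOUR_PROFILE_ID">Jane Doe</a>
</p>

<!-- The older, more general rel=me marks another page as representing the same person -->
<a rel="me" href="https://example.com/about-jane">About me</a>
```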

To help you mark up your HTML, Schema lists the vocabulary types and properties on its website, along with instructions and examples. For those just getting started with structured data it might be easier to look at microformats, as Glenn suggested, by going to their wiki and looking through the types and properties of the language.

Here is an example from the Schema website of a recipe marked up correctly with microdata. Note the use of itemtype, itemscope and itemprop, the main attributes of the language:

<div itemscope itemtype="http://schema.org/Recipe">

<span itemprop="name">Mom's World Famous Banana Bread</span>
By <span itemprop="author">John Smith</span>,
<meta itemprop="datePublished" content="2009-05-08">May 8, 2009
<img itemprop="image" src="bananabread.jpg" />

<span itemprop="description">This classic banana bread recipe comes from my mom -- the walnuts add a nice texture and flavor to the banana bread.</span>

Prep Time: <meta itemprop="prepTime" content="PT15M">15 minutes
Cook time: <meta itemprop="cookTime" content="PT1H">1 hour
Yield: <span itemprop="recipeYield">1 loaf</span>

<div itemprop="nutrition"
itemscope itemtype="http://schema.org/NutritionInformation">
Nutrition facts:
<span itemprop="calories">240 calories</span>,
<span itemprop="fatContent">9 grams fat</span>
</div>

Ingredients:
- <span itemprop="ingredients">3 or 4 ripe bananas, smashed</span>
- <span itemprop="ingredients">1 egg</span>
- <span itemprop="ingredients">3/4 cup of sugar</span>

Instructions:
<span itemprop="recipeInstructions">
Preheat the oven to 350 degrees. Mix in the ingredients in a bowl. Add the flour last. Pour the mixture into a loaf pan and bake for one hour.
</span>

140 comments:
<meta itemprop="interactionCount" content="UserComments:140" />
From Janel, May 5 -- thank you, great recipe!

</div>
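For comparison, the same banana bread recipe could be sketched with the hRecipe microformat, where class names carry the meaning instead of itemprop attributes. This is an abbreviated, illustrative version based on the hRecipe draft, not an example taken from the Schema or microformats sites.

```html
<div class="hrecipe">
  <span class="fn">Mom's World Famous Banana Bread</span>
  By <span class="author">John Smith</span>
  <img class="photo" src="bananabread.jpg" alt="Banana bread" />

  <span class="summary">This classic banana bread recipe comes from my mom.</span>

  Cook time: <span class="duration">1 hour</span>
  Yield: <span class="yield">1 loaf</span>

  Ingredients:
  <ul>
    <li class="ingredient">3 or 4 ripe bananas, smashed</li>
    <li class="ingredient">1 egg</li>
    <li class="ingredient">3/4 cup of sugar</li>
  </ul>

  Instructions:
  <div class="instructions">
    Preheat the oven to 350 degrees. Mix in the ingredients in a bowl.
    Add the flour last. Pour the mixture into a loaf pan and bake for one hour.
  </div>
</div>
```

There is no vocabulary URL and there are no new attributes to learn here, which illustrates why microformats are often described as the quicker option.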

Last week Google also updated rich snippets to include products and to allow HTML to be pasted directly into the Testing Tool. This means you no longer have to publish a page to test your snippets, which makes it easier to play around and test your markup.
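A product snippet relies on the same three microdata attributes as a recipe. The sketch below assumes the Schema.org Product, Offer and AggregateRating types; the product name, price and rating figures are invented for illustration.

```html
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Example Stand Mixer</span>

  <!-- Nested item: the overall rating for the product -->
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4.5</span>/5
    based on <span itemprop="reviewCount">12</span> reviews
  </div>

  <!-- Nested item: the offer carrying the price and its currency -->
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <meta itemprop="priceCurrency" content="GBP" />
    £<span itemprop="price">199.00</span>
  </div>
</div>
```

A fragment like this can be pasted into the Testing Tool on its own to check how the markup is read before anything is published.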

One obvious downside of marking up structured content is that it can be time-consuming. The question is whether the rewards outweigh the effort and how the return compares with other time-consuming tasks in the SEO checklist. If an SEO has numerous clients, all with extensive sites, it would be impractical to mark up everything, and unfortunately automated markup is “error prone”.

Still, the benefit of having rich snippets for some of your key pieces of content, especially when they are found in the top 10 for competitive searches, surely makes it worthwhile in these cases. But what about when Google doesn’t convert the markup into rich snippets? Is this simply wasted time? Well, going forward structured data will become more important for other aspects such as semantic search.

Semantic Search

Rich snippets are an example of how search engines can better understand and present your data. As they continue to invest in semantic search, this will become increasingly important. Rather than relying on URLs, keywords or tags, with structured data search engines will increasingly be able to see the structure of your pages and how they relate semantically.

Google have even started to experiment with customised results panes for recipe searches, as in the image below.

[Image: Custom recipe pane]

Look at the pane on the left which allows users to further segment their results by certain ingredients, cook times and calorie counts. Now, remember I said that the banana split rich snippet didn’t mark up these elements? If you tick the ‘vanilla ice cream’ box on the left, which the recipe does include, the listing disappears from the SERP because the markup is incomplete.
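A fuller version of that markup might look something like the sketch below, which adds the kind of details the banana split snippet left out (ingredients, timing and calories) using the same Schema.org Recipe properties as the banana bread example. The values are invented, and the original page used data-vocabulary.org, whose property names differ.

```html
<div itemscope itemtype="http://schema.org/Recipe">
  <span itemprop="name">Banana Split</span>

  <!-- Durations go in the content attribute in ISO 8601 format -->
  Prep time: <meta itemprop="prepTime" content="PT10M">10 minutes

  <div itemprop="nutrition" itemscope itemtype="http://schema.org/NutritionInformation">
    <span itemprop="calories">450 calories</span>
  </div>

  Ingredients:
  - <span itemprop="ingredients">1 banana</span>
  - <span itemprop="ingredients">3 scoops vanilla ice cream</span>
</div>
```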

This only reinforces the importance of marking up your content fully, preferably with microdata, and raises an interesting point going forward. If Schema expands its vocabulary it will surely put increasing pressure on webmasters to go back and update their markup. Similarly, if microdata becomes the standard, will old microformat code become useless or incomplete further down the line?

For the moment Google renders rich snippets for microformats, but there’s no telling what will happen with semantic search. I also imagine that these custom panes will become more common as the Schema vocabulary expands, which means that, even if you don’t publish recipes, this could still affect your site.

Direct Answers

Another part of semantic search is direct answers. Despite the ranking possibilities semantic search offers, some SEOs and webmasters worry that Google will access their marked up data and use it to bypass their sites altogether, as in the following example:

[Image: Direct Answer]

If this is the way Google are going then the ‘head in the sand’ approach probably isn’t the best method. If they aren’t drawing semantic information from your website then they will be happy to draw it from a competitor, complete with a referencing link at the top of the SERPs.

When you consider how click-worthy you can make your pages with rich snippets and consider ‘the Wikipedia Effect’ of answering one question only to ask another (I too am a victim of endless hours lost in the maze of Wikipedia), the effect of direct answers will likely be minimal. In the meantime webmasters can reap the rewards from rich snippets and receive additional conversions, links, shares, favourites, brand loyalty etc., all by helping Google understand their content.

User Experience is King

Google and the other engines want structured data and semantic search to work because it will arguably improve the user experience. And at the end of the day, this is their focus. Look at SPYW, look at recipe panes, localised search and direct answers – the SERPs are going to become more fluid and flexible and if webmasters don’t consider future-proofing and marking up their HTML with Schema’s semantic vocabulary (and probably with the microdata language) their competitors could further entrench themselves as authorities.

Going forward semantics might even evolve and play a more important role in rankings, alongside social. But at the moment all we know is that markup efforts are rewarded with rich snippets that increase CTR and that structured data helps the engines reference you as an authority.

If you have the resources to implement microdata then it’s worth it; otherwise it might be worth marking up key articles with microformats, especially for competitive searches where you can distinguish yourself with rich snippets and set yourself up for developments in Schema and semantic search.

With these kinds of things there are two possibilities: either the early adopters waste resources on something that has surfaced too early, or they profit the most by getting there first. With semantic search and Schema I tend to think the glass might be half full rather than half empty.