❧
There are over 100 elements in HTML5. Some are purely semantic, others are just containers for scripted APIs. Throughout the history of HTML, standards wonks have argued about which elements should be included in the language. Should HTML include a <figure> element? A <person> element? How about a <rant> element? Decisions are made, specs are written, authors author, implementors implement, and the web lurches ever forward.
Of course, HTML can’t please everyone. No standard can. Some ideas don’t make the cut. For example, there is no <person> element in HTML5. (There’s no <rant> element either, damn it!) There’s nothing stopping you from including a <person> element in a web page, but it won’t validate, it won’t work consistently across browsers, and it might conflict with future HTML specs if we want to add it later.
Right, so if making up your own elements isn’t the answer, what’s a semantically inclined web author to do? There have been attempts to extend previous versions of HTML. The most popular method is microformats, which uses the class and rel attributes in HTML. Another option is RDFa, which was originally designed to be used in XHTML but has been ported to HTML as well.
Microformats and RDFa each have their strengths and weaknesses. They take radically different approaches towards the same goal: extending web pages with additional semantics that are not part of the core HTML language. I don’t intend to turn this chapter into a format flamewar. (That would definitely require a <rant> element!) Instead, I want to focus on a third option developed using lessons learned from microformats and RDFa, and designed to be integrated into HTML5 itself: microdata.
❧
Each word in the following sentence is important, so pay attention.
Microdata annotates the DOM with scoped name/value pairs from custom vocabularies.
Now what does that mean? Let’s start from the end and work backwards. Microdata centers around custom vocabularies. Think of “the set of all HTML5 elements” as one vocabulary. This vocabulary includes elements to represent a section or an article, but it doesn’t include elements to represent a person or an event. If you want to represent a person on a web page, you’ll need to define your own vocabulary. Microdata lets you do this. Anyone can define a microdata vocabulary and start embedding custom properties in their own web pages.
The next thing to know about microdata is that it works with name/value pairs. Every microdata vocabulary defines a set of named properties. For example, a Person vocabulary could define properties like name and photo. To include a specific microdata property on your web page, you provide the property name in a specific place. Depending on where you declare the property name, microdata has rules about how to extract the property value. (More on this in the next section.)
Along with named properties, microdata relies heavily on the concept of “scoping.” The simplest way to think of microdata scoping is to think about the natural parent-child relationship of elements in the DOM. The <html> element usually contains two children, <head> and <body>. The <body> element usually contains multiple children, each of which may have child elements of their own. For example, your page might include an <h1> element within an <hgroup> element within a <header> element within the <body> element. A data table might contain <td> within <tr> within <table> (within <body>). Microdata re-uses the hierarchical structure of the DOM itself to provide a way to say “all the properties within this element are taken from this vocabulary.” This allows you to use more than one microdata vocabulary on the same page. You can even nest microdata vocabularies within other vocabularies, all by re-using the natural structure of the DOM. (I’ll show multiple examples of nested vocabularies throughout this chapter.)
Now, I’ve already touched on the DOM, but let me elaborate on that. Microdata is about applying additional semantics to data that’s already visible on your web page. Microdata is not designed to be a standalone data format. It’s a complement to HTML. As you’ll see in the next section, microdata works best when you’re already using HTML correctly, but the HTML vocabulary isn’t quite expressive enough. Microdata is great for fine-tuning the semantics of data that’s already in the DOM. If the data you’re semanti-fying isn’t in the DOM, you should step back and re-evaluate whether microdata is the right solution.
Does this sentence make more sense now? “Microdata annotates the DOM with scoped name/value pairs from custom vocabularies.” I hope so. Let’s see it in action.
❧
Defining your own microdata vocabulary is easy. First, you need a namespace, which is just a URL. The namespace URL could actually point to a working web page, although that’s not strictly required. Let’s say I want to create a microdata vocabulary that describes a person. If I own the data-vocabulary.org domain, I’ll use the URL http://data-vocabulary.org/Person as the namespace for my microdata vocabulary. That’s an easy way to create a globally unique identifier: pick a URL on a domain that you control.
In this vocabulary, I need to define some named properties. Let’s start with three basic properties:
name (your full name)
photo (a link to a picture of you)
url (a link to a site associated with you, like a weblog or a Google profile)
Some of these properties are URLs, others are plain text. Each of them lends itself to a natural form of markup, even before you start thinking about microdata or vocabularies or whatnot. Imagine that you have a profile page or an “about” page. Your name is probably marked up as a heading, like an <h1> element. Your photo is probably an <img> element, since you want people to see it. And any URLs associated with your profile are probably already marked up as hyperlinks, because you want people to be able to click them. For the sake of discussion, let’s say your entire profile is also wrapped in a <section> element to separate it from the rest of the page content. Thus:
↶ It’s all about me
<section>
<h1>Mark Pilgrim</h1>
<p><img src="http://www.example.com/photo.jpg" alt="[me smiling]"></p>
<p><a href="http://diveintomark.org/">weblog</a></p>
</section>
Microdata’s data model is name/value pairs. A microdata property name (like name or photo or url in this example) is always declared on an HTML element. The corresponding property value is then taken from the element’s DOM. For most HTML elements, the property value is simply the text content of the element. But there are a handful of exceptions.
Element | Value |
---|---|
<meta> | content attribute |
<audio>, <embed>, <iframe>, <img>, <source>, <video> | src attribute |
<a>, <area>, <link> | href attribute |
<object> | data attribute |
<time> | datetime attribute |
all other elements | text content |
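If it helps to see that rule as code, here is a minimal Python sketch of the lookup. It is not how any particular microdata processor is written. The helper name property_value and the dictionary are made up for illustration, and the dictionary covers only the elements this chapter names explicitly.

```python
# Which attribute carries the microdata value for a given element.
# Assumption: only the elements named in this chapter are listed here;
# every other element falls back to its text content.
VALUE_ATTRIBUTE = {
    "meta": "content",
    "img": "src",
    "a": "href",
    "object": "data",
    "time": "datetime",
}

def property_value(tag, attributes, text_content):
    """Return the microdata value of one element (hypothetical helper)."""
    attribute = VALUE_ATTRIBUTE.get(tag.lower())
    if attribute is None:
        # The "all other elements" rule: just the text.
        return text_content
    # Special-cased element: the text content is ignored entirely.
    return attributes.get(attribute, "")

print(property_value("h1", {}, "Mark Pilgrim"))
print(property_value("img", {"src": "http://www.example.com/photo.jpg"}, ""))
print(property_value("a", {"href": "http://diveintomark.org/"}, "dive into mark"))
```

The three calls print the name, the photo URL, and the link target, in that order. Note that the link text "dive into mark" never shows up.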
“Adding microdata” to your page is a matter of adding a few attributes to the HTML elements you already have. The first thing you always do is declare which microdata vocabulary you’re using, by adding an itemtype
attribute. The second thing you always do is declare the scope of the vocabulary, using an itemscope
attribute. In this example, all the data we want to semanti-fy is in a <section>
element, so we’ll declare the itemtype
and itemscope
attributes on the <section>
element.
<section itemscope itemtype="http://data-vocabulary.org/Person">
Your name is the first bit of data within the <section>
element. It’s wrapped in an <h1>
element. The <h1>
element doesn’t have any special processing in the HTML5 microdata data model, so it falls under the “all other elements” rule where the microdata property value is simply the text content of an element. (This would work equally well if your name was wrapped in a <p>
, <div>
, or <span>
element.)
<h1 itemprop="name">Mark Pilgrim</h1>
In English, this says “here is the name property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is Mark Pilgrim.”
Next up: the photo
property. This is supposed to be a URL. According to the HTML5 microdata data model, the “value” of an <img>
element is its src
attribute. Hey look, the URL of your profile photo is already in an <img src>
attribute. All you need to do is declare that the <img>
element is the photo
property.
<p><img itemprop="photo"
src="http://www.example.com/photo.jpg"
alt="[me smiling]"></p>
In English, this says “here is the photo property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is http://www.example.com/photo.jpg.”
Finally, the url
property is also a URL. According to the HTML5 microdata data model, the “value” of an <a>
element is its href
attribute. And once again, this fits perfectly with your existing markup. All you need to do is say that your existing <a>
element is the url
property:
<a itemprop="url" href="http://diveintomark.org/">dive into mark</a>
In English, this says “here is the url property of the http://data-vocabulary.org/Person vocabulary, and the value of the property is http://diveintomark.org/.”
Of course, if your markup looks a little different, that’s not a problem. You can add microdata properties and values to any HTML markup, even really gnarly 20th-century-era, tables-for-layout, Oh-God-why-did-I-agree-to-maintain-this markup. While I don’t recommend this kind of markup, it is still common, and you can still add microdata to it.
↶ For the love of God, don’t do this
<TABLE>
<TR><TD>Name<TD>Mark Pilgrim
<TR><TD>Link<TD>
<A href=# onclick=goExternalLink()>http://diveintomark.org/</A>
</TABLE>
For marking up the name
property, just add an itemprop
attribute on the table cell that contains the name. Table cells have no special rules in the microdata property value table, so they get the default value, “the microdata property is the text content.”
<TR><TD>Name<TD itemprop="name">Mark Pilgrim
Adding the url property looks trickier. This markup doesn’t use the <a> element properly. Instead of putting the link target in the href attribute, it has nothing useful in the href attribute and uses JavaScript in the onclick attribute to call a function (not shown) that extracts the URL and navigates to it. For extra “holy fuck, please stop doing that” bonus points, let’s pretend that the function also opens the link in a tiny popup window with no scroll bars. Wasn’t the internet fun last century?
Anyway, you can still convert this into a microdata property, you just need to be a little creative. Using the <a>
element directly is out of the question. The link target isn’t in the href
attribute, and there’s no way to override the rule that says “in an <a>
element, look for the microdata property value in the href
attribute.” But you can add a wrapper element around the entire mess, and use that to add the url
microdata property.
↶ This is what you get for subverting HTML
<TABLE itemscope itemtype="http://data-vocabulary.org/Person">
<TR><TD>Name<TD itemprop="name">Mark Pilgrim
<TR><TD>Link<TD>
<span itemprop="url">
<A href=# onclick=goExternalLink()>http://diveintomark.org/</A>
</span>
</TABLE>
Since the <span>
element has no special processing, it uses the default rule, “the microdata property is the text content.” “Text content” doesn’t mean “all the markup inside this element” (like you would get with, say, the innerHTML
DOM property). It means “just the text, ma’am.” In this case, http://diveintomark.org/
, the text content of the <a>
element inside the <span>
element.
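To make the “just the text” rule concrete, here is a small Python sketch using the standard library’s html.parser. The class and function names are mine, and trimming the surrounding whitespace is a convenience of this sketch, not something the microdata rules require.

```python
from html.parser import HTMLParser

class TextContent(HTMLParser):
    """Collects only character data, discarding every tag it meets."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def text_content(markup):
    """Return the text inside a fragment of markup, without the tags."""
    parser = TextContent()
    parser.feed(markup)
    parser.close()
    # Assumption: trimming leading/trailing whitespace for readability.
    return "".join(parser.chunks).strip()

wrapper = """<span itemprop="url">
<A href=# onclick=goExternalLink()>http://diveintomark.org/</A>
</span>"""

print(text_content(wrapper))  # http://diveintomark.org/
```

The useless href and the onclick handler are attributes, not text, so they simply vanish. All that survives is the URL that happened to be the link’s visible text.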
To sum up: you can add microdata properties to any markup. If you’re using HTML correctly, you’ll find it easier to add microdata than if your HTML markup sucks, but it can always be done.
❧
By the way, the starter examples in the previous section weren’t completely made up. There really is a microdata vocabulary for marking up information about people, and it really is that easy. Let’s take a closer look.
The easiest way to integrate microdata into a personal website is on your “about” page. You do have an “about” page, don’t you? If not, you can follow along as I extend this sample “about” page with additional semantics. The final result is here: person-plus-microdata.html.
Let’s look at the raw markup first, before any microdata properties have been added:
<section>
<img width="204" height="250"
src="http://diveintohtml5.org/examples/2000_05_mark.jpg"
alt="[Mark Pilgrim, circa 2000]">
<h1>Contact Information</h1>
<dl>
<dt>Name</dt>
<dd>Mark Pilgrim</dd>
<dt>Position</dt>
<dd>Developer advocate for Google, Inc.</dd>
<dt>Mailing address</dt>
<dd>
100 Main Street<br>
Anytown, PA 19999<br>
USA
</dd>
</dl>
<h1>My Digital Footprints</h1>
<ul>
<li><a href="http://diveintomark.org/">weblog</a></li>
<li><a href="http://www.google.com/profiles/pilgrim">Google profile</a></li>
<li><a href="http://www.reddit.com/user/MarkPilgrim">Reddit.com profile</a></li>
<li><a href="http://www.twitter.com/diveintomark">Twitter</a></li>
</ul>
</section>
The first thing you always need to do is declare the vocabulary you’re using, and the scope of the properties you want to add. You do this by adding the itemtype
and itemscope
attributes on the outermost element that contains the other elements that contain the actual data. In this case, that’s a <section>
element.
<section itemscope itemtype="http://data-vocabulary.org/Person">
[Follow along! Before: person.html, after: person-plus-microdata.html]
Now you can start defining microdata properties from the http://data-vocabulary.org/Person
vocabulary. But what are those properties? As it happens, you can see the list of properties by navigating to data-vocabulary.org/Person in your browser. The microdata specification does not require this, but I’d say it’s certainly a “best practice.” After all, if you want developers to actually use your microdata vocabulary, you need to document it. And where better to put your documentation than the vocabulary URL itself?
Property | Description |
---|---|
name | Name |
nickname | Nickname |
photo | An image link |
title | The person’s title (for example, “Financial Manager”) |
role | The person’s role (for example, “Accountant”) |
url | Link to a web page, such as the person’s home page |
affiliation | The name of an organization with which the person is associated (for example, an employer) |
friend | Identifies a social relationship between the person described and another person |
contact | Identifies a social relationship between the person described and another person |
acquaintance | Identifies a social relationship between the person described and another person |
address | The location of the person. Can have the subproperties street-address, locality, region, postal-code, and country-name. |
The first thing in this sample “about” page is a picture of me. Naturally, it’s marked up with an <img>
element. To declare that this <img>
element is my profile picture, all we need to do is add itemprop="photo"
to the <img>
element.
<img itemprop="photo" width="204" height="250"
src="http://diveintohtml5.org/examples/2000_05_mark.jpg"
alt="[Mark Pilgrim, circa 2000]">
Where’s the microdata property value? It’s already there, in the src
attribute. If you recall from the HTML5 microdata data model, the “value” of an <img>
element is its src
attribute. Every <img>
element has a src
attribute — otherwise it would just be a broken image — and the src
is always a URL. See? If you’re using HTML correctly, microdata is easy.
Furthermore, this <img> element isn’t alone on the page. It’s a child element of the <section> element, the one we just declared with the itemscope attribute. Microdata reuses the parent-child relationship of elements on the page to define the scoping of microdata properties. In plain English, we’re saying, “This <section> element represents a person. Any microdata properties you might find on the children of the <section> element are properties of that person.” If it helps, you can think of the <section> element as the subject of a sentence. The itemprop attribute represents the verb of the sentence, something like “is pictured at.” The microdata property value represents the object of the sentence.
This person [explicit, from <section itemscope itemtype="...">]
is pictured at [explicit, from <img itemprop="photo">]
http://diveintohtml5.org/examples/2000_05_mark.jpg [implicit, from <img src> attribute]
The subject only needs to be defined once, by putting itemscope
and itemtype
attributes on the outermost <section>
element. The verb is defined by putting the itemprop="photo"
attribute on the <img>
element. The object of the sentence doesn’t need any special markup at all, because the HTML5 microdata data model says that the property value of an <img>
element is its src
attribute.
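Here is that subject/verb/object reading as a Python sketch, again using the standard library’s html.parser. It is deliberately tiny: it understands one item and only <img> properties, and the class name PhotoSentence is invented for this example.

```python
from html.parser import HTMLParser

class PhotoSentence(HTMLParser):
    """Reports each <img itemprop> as a (subject, verb, object) tuple."""

    def __init__(self):
        super().__init__()
        self.subject = None   # set by the element carrying itemscope
        self.sentences = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemscope" in attributes:
            # Explicit subject: the vocabulary named by itemtype.
            self.subject = attributes.get("itemtype")
        if tag == "img" and "itemprop" in attributes:
            # Explicit verb: the property name.
            # Implicit object: whatever is already in the src attribute.
            self.sentences.append(
                (self.subject, attributes["itemprop"], attributes.get("src"))
            )

markup = """
<section itemscope itemtype="http://data-vocabulary.org/Person">
  <img itemprop="photo" width="204" height="250"
       src="http://diveintohtml5.org/examples/2000_05_mark.jpg"
       alt="[Mark Pilgrim, circa 2000]">
</section>
"""

reader = PhotoSentence()
reader.feed(markup)
reader.close()
for subject, verb, obj in reader.sentences:
    print(subject, verb, obj)
```

One sentence comes out: the Person vocabulary URL, then photo, then the image URL. Nothing in the markup had to repeat that URL for microdata’s benefit.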
Moving on to the next bit of markup, we see an <h1> header and the beginnings of a <dl> list. Neither the <h1> nor the <dl> needs to be marked up with microdata. Not every piece of HTML needs to be a microdata property. Microdata is about the properties themselves, not the markup or headers surrounding the properties. This <h1> isn’t a property; it’s just a header. Similarly, the <dt> that says “Name” isn’t a property; it’s just a label.
↶ Boring
Boring ⇝
<h1>Contact Information</h1>
<dl>
<dt>Name</dt>
<dd>Mark Pilgrim</dd>
So where is the real information? It’s in the <dd> element, so that’s where we need to put the itemprop attribute. Which property is it? It’s the name property. Where is the property value? It’s the text within the <dd> element. Does that need to be marked up? The HTML5 microdata data model says no: <dd> elements have no special processing, so the property value is just the text within the element.
↶ That’s my name, don’t wear it out
<dd itemprop="name">Mark Pilgrim</dd>
What did we just say, in English? “This person’s name is Mark Pilgrim.” Well OK then. Onward.
The next two properties are a little tricky. This is the markup, pre-microdata:
<dt>Position</dt>
<dd>Developer advocate for Google, Inc.</dd>
If you look at the definition of the Person vocabulary, the text “Developer advocate for Google, Inc.” actually encompasses two properties: title (“Developer advocate”) and affiliation (“Google, Inc.”). How can you express that in microdata? The short answer is, you can’t. Microdata doesn’t have a way to break up runs of text into separate properties. You can’t say “the first 18 characters of this text are one microdata property, and the last 12 characters of this text are another microdata property.”
But all is not lost. Imagine that you wanted to style the text “Developer advocate” in a different font from the text “Google, Inc.” CSS can’t do that either. So what would you do? You would first need to wrap the different bits of text in dummy elements, like <span>
, then apply different CSS rules to each <span>
element.
This technique is also useful for microdata. There are two distinct pieces of information here: a title
and an affiliation
. If you wrap each piece in a dummy <span>
element, you can declare that each <span>
is a separate microdata property.
<dt>Position</dt>
<dd><span itemprop="title">Developer advocate</span> for
<span itemprop="affiliation">Google, Inc.</span></dd>
Tada! “This person’s title is 'Developer advocate.' This person is employed by Google, Inc.” Two sentences, two microdata properties. A little more markup, but a worthwhile tradeoff.
The same technique is useful for marking up street addresses. The Person vocabulary defines an address
property, which itself is a microdata item. That means the address has its own vocabulary (http://data-vocabulary.org/Address
) and defines its own properties. The Address vocabulary defines 5 properties: street-address
, locality
, region
, postal-code
, and country-name
.
If you’re a programmer, you are probably familiar with dot notation to define objects and their properties. Think of the relationship like this:
Person
Person.address
Person.address.street-address
Person.address.locality
Person.address.region
Person.address.postal-code
Person.address.country-name
In this example, the entire street address is contained in a single <dd>
element. (Once again, the <dt>
element is just a label, so it plays no role in adding semantics with microdata.) Notating the address
property is easy. Just add an itemprop
attribute on the <dd>
element.
<dt>Mailing address</dt>
<dd itemprop="address">
But remember, the address property is itself a microdata item. That means we need to add the itemscope
and itemtype
attributes too.
<dt>Mailing address</dt>
<dd itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
We’ve seen all of this before, but only for top-level items. A <section>
element defines itemtype
and itemscope
, and all the elements within the <section>
element that define microdata properties are “scoped” within that specific vocabulary. But this is the first time we’ve seen nested scopes — defining a new itemtype
and itemscope
(on the <dd>
element) within an existing one (on the <section>
element). This nested scope works exactly like the HTML DOM. The <dd>
element has a certain number of child elements, all of which are scoped to the vocabulary defined on the <dd>
element. Once the <dd>
element is closed with a corresponding </dd>
tag, the scope reverts to the vocabulary defined by the parent element (<section>
, in this case).
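If you think better in code, here is one way to sketch that scoping rule in Python with the standard library’s html.parser. This is my own illustration, not the algorithm from the specification. It assumes every non-void element has an explicit closing tag, treats each itemprop as a single name, and knows only the value attributes this chapter mentions. The class name ScopedItems is made up.

```python
from html.parser import HTMLParser

VOID = {"area", "br", "embed", "hr", "img", "input", "link", "meta", "source"}
VALUE_ATTRIBUTE = {"meta": "content", "img": "src", "a": "href",
                   "object": "data", "time": "datetime"}

class ScopedItems(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []    # top-level items, in document order
        self.scopes = []   # items whose element is still open (innermost last)
        self.open = []     # one record per open, non-void element

    def add(self, name, value):
        # A property always belongs to the innermost open item.
        if self.scopes:
            self.scopes[-1]["properties"].setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        record = {"tag": tag, "scope": False, "name": None, "text": None}
        if "itemscope" in attributes:
            item = {"type": attributes.get("itemtype"), "properties": {}}
            if name:
                self.add(name, item)     # nested item: value of a parent property
            else:
                self.items.append(item)  # top-level item
            self.scopes.append(item)
            record["scope"] = True
        elif name and tag in VALUE_ATTRIBUTE:
            self.add(name, attributes.get(VALUE_ATTRIBUTE[tag], ""))
        elif name:
            record["name"], record["text"] = name, []  # value is the text content
        if tag not in VOID:
            self.open.append(record)

    def handle_data(self, data):
        for record in self.open:
            if record["text"] is not None:
                record["text"].append(data)

    def handle_endtag(self, tag):
        if tag not in [record["tag"] for record in self.open]:
            return  # stray end tag; ignore it
        while self.open:
            record = self.open.pop()
            if record["scope"]:
                self.scopes.pop()  # scope reverts to the enclosing item
            elif record["text"] is not None:
                self.add(record["name"], " ".join("".join(record["text"]).split()))
            if record["tag"] == tag:
                break

page = """
<section itemscope itemtype="http://data-vocabulary.org/Person">
  <dl>
    <dt>Name</dt>
    <dd itemprop="name">Mark Pilgrim</dd>
    <dt>Mailing address</dt>
    <dd itemprop="address" itemscope
        itemtype="http://data-vocabulary.org/Address">
      <span itemprop="locality">Anytown</span>,
      <span itemprop="region">PA</span>
    </dd>
  </dl>
  <a itemprop="url" href="http://diveintomark.org/">weblog</a>
</section>
"""

parser = ScopedItems()
parser.feed(page)
parser.close()
person = parser.items[0]
address = person["properties"]["address"][0]
print(person["properties"]["name"])       # ['Mark Pilgrim']
print(address["properties"]["locality"])  # ['Anytown']
print(person["properties"]["url"])        # ['http://diveintomark.org/']
```

The thing to notice is where locality ends up: on the Address item, not on the Person. And the url link, which comes after the closing </dd>, lands back on the Person. The stack of open scopes does all the work, which is exactly the “re-use the DOM hierarchy” idea.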
The properties of the Address vocabulary suffer from the same problem we encountered with the title and affiliation properties. There’s just one long run of text, but we want to break it up into five separate microdata properties. The solution is the same: wrap each distinct piece of information in a dummy <span> element, then declare microdata properties on each <span> element.
<dd itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
<span itemprop="street-address">100 Main Street</span><br>
<span itemprop="locality">Anytown</span>,
<span itemprop="region">PA</span>
<span itemprop="postal-code">19999</span><br>
<span itemprop="country-name">USA</span>
</dd>
</dl>
In English: “This person has a mailing address. The street address part of the mailing address is '100 Main Street.' The locality part is 'Anytown.' The region is 'PA.' The postal code is '19999.' The country name is 'USA.'” Easy peasy.
☞Q: Is this mailing address format US-specific?
A: No. The properties of the Address vocabulary are generic enough that they can describe most mailing addresses in the world. Not all addresses will have values for every property, but that’s OK. Some addresses might require fitting more than one “line” into a single property, but that’s OK too. For example, if your mailing address has a street address and a suite number, they would both go into the street-address subproperty:
<p itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
<span itemprop="street-address">
100 Main Street
Suite 415
</span>
...
</p>
There’s one more thing on this sample “about” page: a list of URLs. The Person vocabulary has a property for this, called url
. A url
property can be anything, really. (Well, it has to be a URL, but you probably guessed that.) What I mean is that the url
property is loosely defined. The property can be any sort of URL that you want to associate with a Person: a weblog, a photo gallery, or a profile on another site like Facebook or Twitter.
The other important thing to note here is that a single Person can have multiple url
properties. Technically, any property can appear more than once, but until now, we haven’t taken advantage of that. For example, you could have two photo
properties, each pointing to a different image URL. Here, I want to list four different URLs: my weblog, my Google profile page, my user profile on Reddit, and my Twitter account. In HTML, that’s a list of links: four <a>
elements, each in their own <li>
element. In microdata, each <a>
element gets an itemprop="url"
attribute.
<h1>My Digital Footprints</h1>
<ul>
<li><a href="http://diveintomark.org/"
itemprop="url">weblog</a></li>
<li><a href="http://www.google.com/profiles/pilgrim"
itemprop="url">Google profile</a></li>
<li><a href="http://www.reddit.com/user/MarkPilgrim"
itemprop="url">Reddit.com profile</a></li>
<li><a href="http://www.twitter.com/diveintomark"
itemprop="url">Twitter</a></li>
</ul>
According to the HTML5 microdata data model, <a> elements have special processing. The microdata property value is the href attribute, not the child text content. The text of each link is actually ignored by a microdata processor. Thus, in English, this says “This person has a URL at http://diveintomark.org/. This person has another URL at http://www.google.com/profiles/pilgrim. This person has another URL at http://www.reddit.com/user/MarkPilgrim. This person has another URL at http://www.twitter.com/diveintomark.”
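A quick Python sketch shows what “repeated property” means to a consumer: the values pile up in a list instead of overwriting each other. As before, this uses the standard library’s html.parser, the class name LinkProperties is invented, and the sketch looks only at <a> elements.

```python
from collections import defaultdict
from html.parser import HTMLParser

class LinkProperties(HTMLParser):
    """Gathers itemprop values from <a> elements, keeping repeats in order."""

    def __init__(self):
        super().__init__()
        self.properties = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "itemprop" in attributes:
            # The value is the href attribute; the link text is never consulted.
            self.properties[attributes["itemprop"]].append(attributes.get("href"))

links = """
<ul>
  <li><a href="http://diveintomark.org/"
         itemprop="url">weblog</a></li>
  <li><a href="http://www.google.com/profiles/pilgrim"
         itemprop="url">Google profile</a></li>
  <li><a href="http://www.reddit.com/user/MarkPilgrim"
         itemprop="url">Reddit.com profile</a></li>
  <li><a href="http://www.twitter.com/diveintomark"
         itemprop="url">Twitter</a></li>
</ul>
"""

collector = LinkProperties()
collector.feed(links)
collector.close()
for value in collector.properties["url"]:
    print(value)
```

Four links in, four url values out, in document order. The words “weblog” and “Twitter” appear nowhere in the result.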
I want to step back for just a moment and ask, “Why are we doing this?” Are we adding semantics just for the sake of adding semantics? Don’t get me wrong; I enjoy fiddling with angle brackets as much as the next webhead. But why microdata? Why bother?
There are two major classes of applications that consume HTML, and by extension, HTML5 microdata: web browsers and search engines.
For browsers, HTML5 defines a set of DOM APIs for extracting microdata items, properties, and property values from a web page. At time of writing (February 2011), no browser supports this API. Not a single one. So that’s… kind of a dead end, at least until browsers catch up and implement the client-side APIs.
The other major consumer of HTML is search engines. What could a search engine do with microdata properties about a person? Imagine this: instead of simply displaying the page title and an excerpt of text, the search engine could integrate some of that structured information and display it. Full name, job title, employer, address, maybe even a little thumbnail of a profile photo. Would that catch your attention? It would catch mine.
Google supports microdata as part of their Rich Snippets program. When Google’s web crawler parses your page and finds microdata properties that conform to the http://data-vocabulary.org/Person
vocabulary, it parses out those properties and stores them alongside the rest of the page data. Google even provides a handy tool to see how Google “sees” your microdata properties. Testing it against our sample microdata-enabled “about” page yields this output:
Item
  Type: http://data-vocabulary.org/person
  photo = http://diveintohtml5.org/examples/2000_05_mark.jpg
  name = Mark Pilgrim
  title = Developer advocate
  affiliation = Google, Inc.
  address = Item( 1 )
  url = http://diveintomark.org/
  url = http://www.google.com/profiles/pilgrim
  url = http://www.reddit.com/user/MarkPilgrim
  url = http://www.twitter.com/diveintomark
Item 1
  Type: http://data-vocabulary.org/address
  street-address = 100 Main Street
  locality = Anytown
  region = PA
  postal-code = 19999
  country-name = USA
It’s all there: the photo
property from the <img src>
attribute, all four URLs from the list of <a href>
attributes, even the address object (listed as “Item 1”) and all five of its subproperties.
And how does Google use all of this information? That depends. There are no hard and fast rules about how microdata properties should be displayed, which ones should be displayed, or whether they should be displayed at all. If someone searches for “Mark Pilgrim,” and Google determines that this “about” page should rank in the results, and Google decides that the microdata properties it originally found on that page are worth displaying, then the search result listing might look something like this:
About Mark Pilgrim
Anytown PA - Developer advocate - Google, Inc.
Excerpt from the page will show up here.
Excerpt from the page will show up here.
diveintohtml5.org/examples/person-plus-microdata.html - Cached - Similar pages
The first line, “About Mark Pilgrim,” is actually the title of the page, given in the <title>
element. That’s not terribly exciting; Google does that for every page. But the second line is full of information taken directly from the microdata annotations we added to the page. “Anytown PA” was part of the mailing address, marked up with the http://data-vocabulary.org/Address
vocabulary. “Developer advocate” and “Google, Inc.” were two properties from the http://data-vocabulary.org/Person
vocabulary (title
and affiliation
, respectively).
This is really quite amazing. You don’t need to be a large corporation making special deals with search engine vendors to customize your search result listings. Just take ten minutes and add a couple of HTML attributes to annotate the data you were already publishing anyway.
☞Q: I did everything you said, but my Google search result listing doesn’t look any different. What gives?
A: “Google does not guarantee that markup on any given page or site will be used in search results.” But even if Google decides not to use your microdata annotations, another search engine might. Like the rest of HTML5, microdata is an open standard that anyone can implement. It’s your job to provide as much data as possible. Let the rest of the world decide what to do with it. They might surprise you!
❧
Microdata isn’t limited to a single vocabulary. “About” pages are nice, but you probably only have one of them. Still hungry for more? Let’s learn how to mark up organizations and businesses.
Here is a sample page of business listings. Let’s look at the original HTML markup, without microdata.
<article>
<h1>Google, Inc.</h1>
<p>
1600 Amphitheatre Parkway<br>
Mountain View, CA 94043<br>
USA
</p>
<p>650-253-0000</p>
<p><a href="http://www.google.com/">Google.com</a></p>
</article>
[Follow along! Before: organization.html, after: organization-plus-microdata.html]
Short and sweet. All the information about the organization is contained within the <article>
element, so let’s start there.
<article itemscope itemtype="http://data-vocabulary.org/Organization">
As with marking up people, you need to set the itemscope
and itemtype
attributes on the outermost element. In this case, the outermost element is an <article>
element. The itemtype
attribute declares the microdata vocabulary you’re using (in this case, http://data-vocabulary.org/Organization
), and the itemscope
attribute declares that all of the properties you set on child elements relate to this vocabulary.
So what’s in the Organization vocabulary? It’s simple and straightforward. In fact, some of it should already look familiar.
Property | Description |
---|---|
name | The name of the organization (for example, “Initech”) |
url | Link to the organization’s home page |
address | The location of the organization. Can contain the subproperties street-address, locality, region, postal-code, and country-name. |
tel | The telephone number of the organization |
geo | Specifies the geographical coordinates of the location. Always contains two subproperties, latitude and longitude. |
The first bit of markup within the outermost <article>
element is an <h1>
. This <h1>
element contains the name of a business, so we’ll put an itemprop="name"
attribute directly on the <h1>
element.
<h1 itemprop="name">Google, Inc.</h1>
According to the HTML5 microdata data model, <h1>
elements don’t need any special processing. The microdata property value is simply the text content of the <h1>
element. In English, we just said “the name of the Organization is 'Google, Inc.'”
Next up is a street address. Marking up the address of an Organization works exactly the same way as marking up the address of a Person. First, add an itemprop="address"
attribute to the outermost element of the street address (in this case, a <p>
element). That states that this is the address
property of the Organization. But what about the properties of the address itself? We also need to define the itemtype
and itemscope
attributes to say that this is an Address item that has its own properties.
<p itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
Finally, we need to wrap each distinct piece of information in a dummy <span>
element so we can add the appropriate microdata property name (street-address
, locality
, region
, postal-code
, and country-name
) on each <span>
element.
<p itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
<span itemprop="street-address">1600 Amphitheatre Parkway</span><br>
<span itemprop="locality">Mountain View</span>,
<span itemprop="region">CA</span>
<span itemprop="postal-code">94043</span><br>
<span itemprop="country-name">USA</span>
</p>
In English, we just said “This organization has an address. The street address part is '1600 Amphitheatre Parkway'. The locality is 'Mountain View'. The region part is 'CA'. The postal code is '94043'. The name of the country is 'USA'.”
Next up: a telephone number for the Organization. Telephone numbers are notoriously tricky, and the exact syntax is country-specific. (And if you want to call another country, it’s even worse.) In this example, we have a United States telephone number, in a format suitable for calling from elsewhere in the United States.
<p itemprop="tel">650-253-0000</p>
(Hey, in case you didn’t notice, the Address vocabulary went out of scope when its <p>
element was closed. Now we’re back to defining properties in the Organization vocabulary.)
If you want to list more than one telephone number — maybe one for United States customers and one for international customers — you can do that. Any microdata property can be repeated. Just make sure each telephone number is in its own HTML element, separate from any label you may give it.
<p>
US customers: <span itemprop="tel">650-253-0000</span><br>
UK customers: <span itemprop="tel">00 + 1* + 6502530000</span>
</p>
According to the HTML5 microdata data model, neither the <p>
element nor the <span>
element has special processing. The value of the microdata tel
property is simply the text content. The Organization microdata vocabulary makes no attempt to subdivide the different parts of a telephone number. The entire tel
property is just free-form text. If you want to put the area code in parentheses, or use spaces instead of dashes to separate the numbers, you can do that. If a microdata-consuming client wants to parse the telephone number, that’s entirely up to them.
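Here is a minimal sketch of what "parsing the telephone number" might look like on the consuming side. It is written in Python, and the function name digits_only is hypothetical; nothing in the Organization vocabulary requires or defines this step. It simply throws away the punctuation so that differently formatted numbers can be compared.

```python
import re


def digits_only(tel):
    """Strip everything except digits from a free-form tel value.

    Hypothetical helper: the vocabulary treats tel as plain text, so any
    normalization like this is the consumer's own choice.
    """
    return re.sub(r"\D", "", tel)


print(digits_only("650-253-0000"))    # 6502530000
print(digits_only("(650) 253 0000"))  # 6502530000
```

Both spellings collapse to the same string, which is all a consumer needs in order to treat them as the same number. Anything smarter (country codes, extensions) is deliberately left out of this sketch.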
Next, we have another familiar property: url
. Just like associating a URL with a Person, you can associate a URL with an Organization. This could be the company’s home page, a contact page, product page, or anything else. If it’s a URL about, from, or belonging to the Organization, mark it up with an itemprop="url"
attribute.
<p><a itemprop="url" href="http://www.google.com/">Google.com</a></p>
According to the HTML5 microdata data model, the <a>
element has special processing. The microdata property value is the value of the href
attribute, not the link text. In English, this says “this organization is associated with the URL http://www.google.com/
.” It doesn’t say anything more specific about the association, and it doesn’t include the link text “Google.com.”
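The difference between "text content" and "special processing" is easier to see from the consumer's point of view. The following Python sketch uses only the standard library's html.parser module. The class and function names (PropertyValues, property_values) are hypothetical, it ignores nested items entirely, and it only knows about the two attribute-valued elements mentioned so far, so treat it as an illustration of the rule rather than a conforming microdata parser.

```python
from html.parser import HTMLParser

# Elements whose property value comes from an attribute, not the text.
# Only the two discussed so far; this table is an assumption of the sketch.
VALUE_ATTRIBUTE = {"a": "href", "img": "src"}


class PropertyValues(HTMLParser):
    """Collect (name, value) pairs from flat, non-nested itemprop elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = None  # (tag, property name, text parts) being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if tag in VALUE_ATTRIBUTE:
            # Special processing: the value is the attribute, not the text.
            self.pairs.append((name, attrs.get(VALUE_ATTRIBUTE[tag], "")))
        else:
            # No special processing: start collecting text content.
            self._open = (tag, name, [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            _, name, parts = self._open
            self.pairs.append((name, "".join(parts).strip()))
            self._open = None


def property_values(html):
    parser = PropertyValues()
    parser.feed(html)
    parser.close()
    return parser.pairs


SAMPLE = (
    '<h1 itemprop="name">Google, Inc.</h1>'
    '<p itemprop="tel">650-253-0000</p>'
    '<p><a itemprop="url" href="http://www.google.com/">Google.com</a></p>'
)
for name, value in property_values(SAMPLE):
    print(name, "=", value)
```

Note that the link text "Google.com" never shows up in the output: for the url property, only the href value is collected.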
Finally, I want to talk about geolocation. No, not the W3C Geolocation API. This is about how to mark up the physical location for an Organization, using microdata.
To date, all of our examples have focused on marking up visible data. That is, you have an <h1>
with a company name, so you add an itemprop
attribute to the <h1>
element to declare that the (visible) header text is, in fact, the name of an Organization. Or you have an <img>
element that points to a photo, so you add an itemprop
attribute to the <img>
element to declare that the (visible) image is a photo of a Person.
In this example, geolocation information isn’t like that. There is no visible text that gives the exact latitude and longitude (to four decimal places!) of the Organization. In fact, the organization.html example (without microdata) has no geolocation information at all. It has a link to Google Maps, but even the URL of that link does not contain latitude and longitude coordinates. (It contains similar information in a Google-specific format.) But even if we had a link to a hypothetical online mapping service that did take latitude and longitude coordinates as URL parameters, microdata has no way of separating out the different parts of a URL. You can’t declare that the first URL query parameter is the latitude and the second URL query parameter is the longitude and the rest of the query parameters are irrelevant.
To handle edge cases like this, HTML5 provides a way to annotate invisible data. This technique should only be used as a last resort. If there is a way to display or render the data you care about, you should do so. Invisible data that only machines can read tends to “go stale” quickly. That is, someone will come along later and update the visible text but forget to update the invisible data. This happens more often than you think, and it will happen to you too.
Still, there are cases where invisible data is unavoidable. Perhaps your boss really wants machine-readable geolocation information but doesn’t want to clutter up the interface with pairs of incomprehensible six-digit numbers. Invisible data is the only option. The only saving grace here is that you can put the invisible data immediately after the visible text that it describes, which may help remind the person who comes along later and updates the visible text that they need to update the invisible data right after it.
In this example, we can create a dummy <span>
element within the same <article>
element as all the other Organization properties, then put the invisible geolocation data inside the <span>
element.
<span itemprop="geo" itemscope
itemtype="http://data-vocabulary.org/Geo">
<meta itemprop="latitude" content="37.4149" />
<meta itemprop="longitude" content="-122.078" />
</span>
</article>
Geolocation information is defined in its own vocabulary, like the address of a Person or Organization. Therefore, this <span>
element needs three attributes:
itemprop="geo"
says that this element represents the geo
property of the surrounding Organization
itemtype="http://data-vocabulary.org/Geo"
says which microdata vocabulary this element’s properties conform to
itemscope
says that this element is the enclosing element for a microdata item with its own vocabulary (given in the itemtype
attribute). All the properties within this element are properties of http://data-vocabulary.org/Geo
, not the surrounding http://data-vocabulary.org/Organization
.
The next big question that this example answers is, “How do you annotate invisible data?” You use the <meta>
element. In previous versions of HTML, you could only use the <meta>
element within the <head>
of your page. In HTML5, you can use the <meta>
element anywhere. And that’s exactly what we’re doing here.
<meta itemprop="latitude" content="37.4149" />
According to the HTML5 microdata data model, the <meta>
element has special processing. The microdata property value is the content
attribute. Since this attribute is never visibly displayed, we have the perfect setup for unlimited quantities of invisible data. With great power comes great responsibility. In this case, the responsibility is on you to ensure that this invisible data stays in sync with the visible text around it.
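One small way to take that responsibility seriously is to sanity-check the invisible values whenever the page is built or edited. The Python sketch below is an assumption about how you might do that; the function name parse_geo is hypothetical and no such check is part of the microdata specification. It cannot tell you whether the coordinates are stale, but it does catch swapped or garbled values.

```python
def parse_geo(latitude, longitude):
    """Convert two content strings to floats and check their ranges.

    Hypothetical helper: raises ValueError for non-numeric or
    out-of-range values (for example, latitude and longitude swapped).
    """
    lat, lon = float(latitude), float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range: %r" % latitude)
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of range: %r" % longitude)
    return lat, lon


print(parse_geo("37.4149", "-122.078"))  # (37.4149, -122.078)
```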
There is no direct support for the Organization vocabulary in Google Rich Snippets, so I don’t have any pretty sample search result listings to show you. But organizations feature heavily in the next two case studies: events and reviews, and those are supported by Google Rich Snippets.
❧
Shit happens. Some shit happens at pre-determined times. Wouldn’t it be nice if you could tell search engines exactly when shit was about to happen? There’s an angle bracket for that.
Let’s start by looking at a sample schedule of my speaking engagements.
<article>
<h1>Google Developer Day 2009</h1>
<img width="300" height="200"
src="http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg"
alt="[Mark Pilgrim at podium]">
<p>
Google Developer Days are a chance to learn about Google
developer products from the engineers who built them. This
one-day conference includes seminars and “office hours”
on web technologies like Google Maps, OpenSocial, Android,
AJAX APIs, Chrome, and Google Web Toolkit.
</p>
<p>
<time datetime="2009-11-06T08:30+01:00">2009 November 6, 8:30</time>
–
<time datetime="2009-11-06T20:30+01:00">20:30</time>
</p>
<p>
Congress Center<br>
5th května 65<br>
140 21 Praha 4<br>
Czech Republic
</p>
<p><a href="http://code.google.com/intl/cs/events/developerday/2009/home.html">GDD/Prague home page</a></p>
</article>
[Follow along! Before: event.html, after: event-plus-microdata.html]
All the information about the event is contained within the <article>
element, so that’s where we need to put the itemtype
and itemscope
attributes.
<article itemscope itemtype="http://data-vocabulary.org/Event">
The URL for the Event vocabulary is http://data-vocabulary.org/Event
, which also happens to contain a nice little chart describing the vocabulary’s properties. And what are those properties?
Property | Description |
---|---|
summary | The name of the event |
url | Link to the event details page |
location | The location or venue of the event. Can optionally be represented by a nested Organization or Address. |
description | A description of the event |
startDate | The starting date and time of the event in ISO date format |
endDate | The ending date and time of the event in ISO date format |
duration | The duration of the event in ISO duration format |
eventType | The category of the event (for example, “Concert” or “Lecture”). This is a freeform string, not an enumerated attribute. |
geo | Specifies the geographical coordinates of the location. Always contains two subproperties, latitude and longitude. |
photo | A link to a photo or image related to the event |
The event’s name is in an <h1>
element. According to the HTML5 microdata data model, <h1>
elements have no special processing. The microdata property value is simply the text content of the <h1>
element. All we need to do is add the itemprop
attribute to declare that this <h1>
element contains the name of the event.
<h1 itemprop="summary">Google Developer Day 2009</h1>
In English, this says, “The name of this event is Google Developer Day 2009.”
This event listing has a photo, which can be marked up with the photo
property. As you would expect, the photo is already marked up with an <img>
element. Like the photo
property in the Person vocabulary, an Event photo is a URL. Since the HTML5 microdata data model says that the property value of an <img>
element is its src
attribute, the only thing we need to do is add the itemprop
attribute to the <img>
element.
<img itemprop="photo" width="300" height="200"
src="http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg"
alt="[Mark Pilgrim at podium]">
In English, this says, “The photo for this event is at http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg
.”
Next up is a longer description of the event, which is just a paragraph of freeform text.
<p itemprop="description">Google Developer Days are a chance to
learn about Google developer products from the engineers who built
them. This one-day conference includes seminars and “office
hours” on web technologies like Google Maps, OpenSocial,
Android, AJAX APIs, Chrome, and Google Web Toolkit.</p>
The next bit is something new. Events generally occur on specific dates and start and end at specific times. In HTML5, dates and times should be marked up with the <time>
element, and we are already doing that here. So the question becomes, how do we add microdata properties to these <time>
elements? Looking back at the HTML5 microdata data model, we see that the <time>
element has special processing. The value of a microdata property on a <time>
element is the value of the datetime
attribute. And hey, the startDate
and endDate
properties of the Event vocabulary take an ISO-style date, just like the datetime
attribute of a <time>
element. Once again, the semantics of the core HTML vocabulary dovetail nicely with semantics of our custom microdata vocabulary. Marking up start and end dates with microdata is as simple as using HTML correctly in the first place (using <time> elements to mark up dates and times), and adding a single itemprop attribute.
<p>
<time itemprop="startDate" datetime="2009-11-06T08:30+01:00">2009 November 6, 8:30</time>
–
<time itemprop="endDate" datetime="2009-11-06T20:30+01:00">20:30</time>
</p>
In English, this says, “This event starts on November 6, 2009, at 8:30 in the morning, and goes until November 6, 2009, at 20:30 (times local to Prague, GMT+1).”
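Because the property values are plain ISO strings, a consumer can hand them straight to a date library. As a small illustration (in Python, using only the standard datetime module; this is a sketch of what a consumer could do, not something the Event vocabulary prescribes):

```python
from datetime import datetime

# The two property values exactly as they appear in the datetime attributes.
start = datetime.fromisoformat("2009-11-06T08:30+01:00")
end = datetime.fromisoformat("2009-11-06T20:30+01:00")

duration = end - start
print(duration)            # 12:00:00
print(start.utcoffset())   # 1:00:00
```

The UTC offset survives the round trip, so the consumer knows these are Prague-local times without guessing.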
Next up is the location
property. The definition of the Event vocabulary says that this can be either an Organization or an Address. In this case, the event is being held at a venue that specializes in conferences, the Congress Center in Prague. Marking it up as an Organization allows us to include the name of the venue as well as its address.
First, let’s declare that the <p>
element that contains the address is the location
property of the Event, and that this element is also its own microdata item that conforms to the http://data-vocabulary.org/Organization
vocabulary.
<p itemprop="location" itemscope
itemtype="http://data-vocabulary.org/Organization">
Next, mark up the name of the Organization by wrapping the name in a dummy <span>
element and adding an itemprop
attribute to the <span>
element.
<span itemprop="name">Congress Center</span><br>
Due to the microdata scoping rules, this itemprop="name"
is defining a property in the Organization vocabulary, not the Event vocabulary. The <p>
element defined the beginning of the scope of the Organization properties, and that <p>
element hasn’t yet been closed with an </p>
tag. Any microdata properties we define here are properties of the most-recently-scoped vocabulary. Nested vocabularies are like a stack. We haven’t yet popped the stack, so we’re still talking about properties of the Organization.
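The stack analogy can be made concrete with a short Python sketch built on the standard library's html.parser module. Everything here is an assumption made for illustration: the names ItemStack and parse_items are hypothetical, repeated properties simply overwrite each other, and the list of void elements only covers the tags used in this chapter's examples. It is not a conforming microdata parser.

```python
from html.parser import HTMLParser

VOID = {"meta", "br", "img"}  # no closing tag, so never pushed
VALUE_ATTRIBUTE = {"a": "href", "img": "src",
                   "meta": "content", "time": "datetime"}


class ItemStack(HTMLParser):
    def __init__(self):
        super().__init__()
        self.top_level = []  # items that are not a property of another item
        self.items = []      # stack of items whose scope is still open
        self.elements = []   # stack of open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        owner = self.items[-1] if self.items else None
        record = {"tag": tag, "pops_item": False,
                  "name": None, "owner": None, "text": []}
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if name is not None and owner is not None:
                owner["properties"][name] = item
            else:
                self.top_level.append(item)
            self.items.append(item)  # push a new scope
            record["pops_item"] = True
        elif name is not None and owner is not None:
            if tag in VALUE_ATTRIBUTE:
                owner["properties"][name] = attrs.get(VALUE_ATTRIBUTE[tag], "")
            else:
                record["name"], record["owner"] = name, owner
        if tag not in VOID:
            self.elements.append(record)

    def handle_data(self, data):
        for record in self.elements:
            if record["name"] is not None:
                record["text"].append(data)

    def handle_endtag(self, tag):
        if not self.elements or self.elements[-1]["tag"] != tag:
            return  # stray or void closing tag; ignored in this sketch
        record = self.elements.pop()
        if record["name"] is not None:
            text = "".join(record["text"]).strip()
            record["owner"]["properties"][record["name"]] = text
        if record["pops_item"]:
            self.items.pop()  # closing tag ends the scope: pop the stack


def parse_items(html):
    parser = ItemStack()
    parser.feed(html)
    parser.close()
    return parser.top_level


SAMPLE = """
<article itemscope itemtype="http://data-vocabulary.org/Event">
  <h1 itemprop="summary">Google Developer Day 2009</h1>
  <p itemprop="location" itemscope
     itemtype="http://data-vocabulary.org/Organization">
    <span itemprop="name">Congress Center</span><br>
    <span itemprop="address" itemscope
          itemtype="http://data-vocabulary.org/Address">
      <span itemprop="locality">Praha 4</span>
    </span>
  </p>
</article>
"""
event = parse_items(SAMPLE)[0]
print(sorted(event["properties"]))                         # ['location', 'summary']
print(event["properties"]["location"]["properties"]["name"])  # Congress Center
```

The name property lands on the Organization, not the Event, purely because the Organization was on top of the stack when the <span> was opened.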
In fact, we’re going to add a third vocabulary onto the stack: an Address for the Organization for the Event.
<span itemprop="address" itemscope
itemtype="http://data-vocabulary.org/Address">
Once again, we want to mark up every piece of the address as a separate microdata property, so we need a slew of dummy <span>
elements to hang our itemprop
attributes onto. (If I’m going too fast for you here, go back and read about marking up the address of a Person and marking up the address of an Organization.)
<span itemprop="street-address">5th května 65</span><br>
<span itemprop="postal-code">140 21</span>
<span itemprop="locality">Praha 4</span><br>
<span itemprop="country-name">Czech Republic</span>
There are no more properties of the Address, so we close the <span>
element that started the Address scope, and pop the stack.
</span>
There are no more properties of the Organization, so we close the <p>
element that started the Organization scope, and pop the stack again.
</p>
Now we’re back to defining properties on the Event. The next property is geo
, to represent the physical location of the Event. This uses the same Geo vocabulary that we used to mark up the physical location of an Organization in the previous section. We need a <span>
element to act as the container; it gets the itemtype
and itemscope
attributes. Within that <span>
element, we need two <meta>
elements, one for the latitude
property and one for the longitude
property.
<span itemprop="geo" itemscope itemtype="http://data-vocabulary.org/Geo">
<meta itemprop="latitude" content="50.047893" />
<meta itemprop="longitude" content="14.4491" />
</span>
And we’ve closed the <span>
that contained the Geo properties, so we’re back to defining properties on the Event. The last property is the url
property, which should look familiar. Associating a URL with an Event works the same way as associating a URL with a Person and associating a URL with an Organization. If you’re using HTML correctly (marking up hyperlinks with <a href>
), then declaring that the hyperlink is a microdata url
property is simply a matter of adding the itemprop
attribute.
<p>
<a itemprop="url"
href="http://code.google.com/intl/cs/events/developerday/2009/home.html">
GDD/Prague home page
</a>
</p>
</article>
The sample event page also lists a second event, my speaking engagement at the ConFoo conference in Montréal. For brevity, I’m not going to go through that markup line by line. It’s essentially the same as the event in Prague: an Event item with nested Geo and Address items. I just mention it in passing to reiterate that a single page can have multiple events, each marked up with microdata.
According to Google’s Rich Snippets Testing Tool, this is the information that Google’s crawlers will glean from our sample event listing page:
Item
  Type: http://data-vocabulary.org/Event
  summary = Google Developer Day 2009
  eventType = conference
  photo = http://diveintohtml5.org/examples/gdd-2009-prague-pilgrim.jpg
  description = Google Developer Days are a chance to learn about Google developer products from the engineers who built them. This one-day conference includes seminars and office hours on web technologies like Goo...
  startDate = 2009-11-06T08:30+01:00
  endDate = 2009-11-06T20:30+01:00
  location = Item(__1)
  geo = Item(__3)
  url = http://code.google.com/intl/cs/events/developerday/2009/home.html
Item
  Id: __1
  Type: http://data-vocabulary.org/Organization
  name = Congress Center
  address = Item(__2)
Item
  Id: __2
  Type: http://data-vocabulary.org/Address
  street-address = 5th května 65
  postal-code = 140 21
  locality = Praha 4
  country-name = Czech Republic
Item
  Id: __3
  Type: http://data-vocabulary.org/Geo
  latitude = 50.047893
  longitude = 14.4491
As you can see, all the information we added in microdata is there. Properties that are separate microdata items are given internal IDs (Item(__1)
, Item(__2)
and so on). This is not part of the microdata specification. It’s just a convention that Google’s testing tool uses to linearize the sample output and show you the grouping of nested items and their properties.
Here is how Google might choose to represent this sample page in its search results. (Again, I have to preface this with the disclaimer that this is just an example. Google may change the format of their search results at any time, and there is no guarantee that Google will even pay attention to your microdata markup. Sorry to sound like a broken record, but our lawyers make me say these things.)
Mark Pilgrim’s event calendar
Excerpt from the page will show up here.
Excerpt from the page will show up here.
Google Developer Day 2009   Fri, Nov 6   Congress Center, Praha 4, Czech Republic
ConFoo.ca 2010   Wed, Mar 10   Hilton Montreal Bonaventure, Montréal, Québec, Canada
diveintohtml5.org/examples/event-plus-microdata.html - Cached - Similar pages
After the page title and auto-generated excerpt text, Google starts using the microdata markup we added to the page to display a little table of events. Note the date format: “Fri, Nov 6.” That is not a string that appeared anywhere in our HTML or microdata markup. We used two fully qualified ISO-formatted strings, 2009-11-06T08:30+01:00
and 2009-11-06T20:30+01:00
. Google took those two dates, figured out that they were on the same day, and decided to display a single date in a more friendly format.
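Nobody outside Google knows exactly how that formatting is done, but the general idea is easy to sketch. The Python below is purely an assumption about how a consumer could collapse a same-day range into one friendly label; the function name friendly_range is hypothetical, and the month and weekday names assume the default (English) C locale.

```python
from datetime import datetime


def friendly_range(start_iso, end_iso):
    """Return one short label if both ISO datetimes fall on the same day."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)

    def label(moment):
        # moment.day avoids the zero padding that %d would add ("Nov 06").
        return f"{moment:%a, %b} {moment.day}"

    if start.date() == end.date():
        return label(start)
    return f"{label(start)} - {label(end)}"


print(friendly_range("2009-11-06T08:30+01:00", "2009-11-06T20:30+01:00"))
# Fri, Nov 6
```

The point is that a display string like this can be derived from the ISO values, so there is no need to put it in your markup.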
Now look at the physical addresses. Google chose to display just the venue name + locality + country, not the exact street address. This is made possible by the fact that we split up the venue into five separate microdata properties: the name of the Organization, plus the street-address, postal-code, locality, and country-name of its Address. Google takes advantage of that to show an abbreviated address. Other consumers of the same microdata markup might make different choices about what to display or how to display it. There’s no right or wrong choice here. It’s up to you to provide as much data as possible, as accurately as possible. It’s up to the rest of the world to interpret it.
❧
Here’s another example of making the web (and possibly search result listings) better through markup: business and product reviews.
This is a short review I wrote of my favorite pizza place near my house. (This is a real restaurant, by the way. If you’re ever in Apex, NC, I highly recommend it.) Let’s look at the original markup:
<article>
<h1>Anna’s Pizzeria</h1>
<p>★★★★☆ (4 stars out of 5)</p>
<p>New York-style pizza right in historic downtown Apex</p>
<p>
Food is top-notch. Atmosphere is just right for a “neighborhood
pizza joint.” The restaurant itself is a bit cramped; if you’re
overweight, you may have difficulty getting in and out of your
seat and navigating between other tables. Used to give free
garlic knots when you sat down; now they give you plain bread
and you have to pay for the good stuff. Overall, it’s a winner.
</p>
<p>
100 North Salem Street<br>
Apex, NC 27502<br>
USA
</p>
<p>— reviewed by Mark Pilgrim, last updated March 31, 2010</p>
</article>
[Follow along! Before: review.html, after: review-plus-microdata.html]
This review is contained in an <article>
element, so that’s where we’ll put the itemtype
and itemscope
attributes. The namespace URL for this vocabulary is http://data-vocabulary.org/Review
.
<article itemscope itemtype="http://data-vocabulary.org/Review">
What are the available properties in the Review vocabulary? I’m glad you asked.
Property | Description |
---|---|
itemreviewed | The name of the item being reviewed. Can be a product, service, business, &c. |
rating | A numerical quality rating for the item, on a scale from 1 to 5. Can also be a nested http://data-vocabulary.org/Rating vocabulary to use a nonstandard scale. |
reviewer | The name of the author who wrote the review |
dtreviewed | The date that the item was reviewed in ISO date format |
summary | A short summary of the review |
description | The body of the review |
The first property is simple: itemreviewed
is just text, and here it’s contained in an <h1>
element, so that’s where we should put the itemprop
attribute.
<h1 itemprop="itemreviewed">Anna’s Pizzeria</h1>
I’m going to skip over the actual rating and come back to that at the end.
The next two properties are also straightforward. The summary
property is a short description of what you’re reviewing, and the description
property is the body of the review.
<p itemprop="summary">New York-style pizza right in historic downtown Apex</p>
<p itemprop="description">
Food is top-notch. Atmosphere is just right for a “neighborhood
pizza joint.” The restaurant itself is a bit cramped; if you’re
overweight, you may have difficulty getting in and out of your
seat and navigating between other tables. Used to give free
garlic knots when you sat down; now they give you plain bread
and you have to pay for the good stuff. Overall, it’s a winner.
</p>
The location
and geo
properties aren’t anything we haven’t tackled before. (If you’re just tuning in, check out marking up the address of a Person, marking up the address of an Organization, and marking up geolocation information from earlier in this chapter.)
<p itemprop="location" itemscope
itemtype="http://data-vocabulary.org/Address">
<span itemprop="street-address">100 North Salem Street</span><br>
<span itemprop="locality">Apex</span>,
<span itemprop="region">NC</span>
<span itemprop="postal-code">27502</span><br>
<span itemprop="country-name">USA</span>
</p>
<span itemprop="geo" itemscope
itemtype="http://data-vocabulary.org/Geo">
<meta itemprop="latitude" content="35.730796" />
<meta itemprop="longitude" content="-78.851426" />
</span>
The final line presents a familiar problem: it contains two bits of information in one element. The name of the reviewer is Mark Pilgrim
, and the review date is March 31, 2010
. How do we mark up these two distinct properties? Wrap them in their own elements and put an itemprop
attribute on each element. In fact, the date in this example should have been marked up with a <time>
element in the first place, so that provides a natural hook on which to hang our itemprop
attribute. The reviewer name can just be wrapped in a dummy <span>
element.
<p>— <span itemprop="reviewer">Mark Pilgrim</span>, last updated
<time itemprop="dtreviewed" datetime="2010-03-31">
March 31, 2010
</time>
</p>
</article>
OK, let’s talk ratings. The trickiest part of marking up a review is the rating. By default, ratings in the Review vocabulary are on a scale of 1–5, 1 being “terrible” and 5 being “awesome.” If you want to use a different scale, you can definitely do that. But let’s talk about the default scale first.
<p>★★★★☆ (<span itemprop="rating">4</span> stars out of 5)</p>
If you’re using the default 1–5 scale, the only property you need to mark up is the rating itself (4, in this case). But what if you want to use a different scale? You can do that; you just need to declare the limits of the scale you’re using. For example, if you wanted to use a 0–10 point scale, you would still declare the itemprop="rating"
property, but instead of giving the rating value directly, you would use a nested vocabulary of http://data-vocabulary.org/Rating
to declare the worst and best values in your custom scale and the actual rating value within that scale.
<p itemprop="rating" itemscope
itemtype="http://data-vocabulary.org/Rating">
★★★★★★★★★☆
(<span itemprop="value">9</span> on a scale of
<span itemprop="worst">0</span> to
<span itemprop="best">10</span>)
</p>
In English, this says “the product I’m reviewing has a rating value of 9 on a scale of 0–10.”
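Declaring worst and best is what lets a consumer compare ratings that use different scales. Here is a minimal Python sketch of that idea. The function name rating_fraction is hypothetical, and the Rating vocabulary itself says nothing about how consumers should normalize values; this is just one plausible approach.

```python
def rating_fraction(value, worst=1.0, best=5.0):
    """Return where value sits between worst (0.0) and best (1.0).

    The defaults mirror the default 1-5 scale of the Review vocabulary.
    """
    if best == worst:
        raise ValueError("worst and best must differ")
    return (value - worst) / (best - worst)


print(rating_fraction(4))         # 0.75 (4 on the default 1-5 scale)
print(rating_fraction(9, 0, 10))  # 0.9  (9 on the custom 0-10 scale)
```

Without the declared limits, a bare "9" would be impossible to place on the default scale at all.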
Did I mention that review microdata could affect search result listings? Oh yes, it can. Here is the “raw data” that the Google Rich Snippets tool extracted from my microdata-enhanced review:
Item
  Type: http://data-vocabulary.org/Review
  itemreviewed = Anna’s Pizzeria
  rating = 4
  summary = New York-style pizza right in historic downtown Apex
  description = Food is top-notch. Atmosphere is just right ...
  address = Item(__1)
  geo = Item(__2)
  reviewer = Mark Pilgrim
  dtreviewed = 2010-03-31
Item
  Id: __1
  Type: http://data-vocabulary.org/Organization
  street-address = 100 North Salem Street
  locality = Apex
  region = NC
  postal-code = 27502
  country-name = USA
Item
  Id: __2
  Type: http://data-vocabulary.org/Geo
  latitude = 35.730796
  longitude = -78.851426
And here (modulo the whims of Google, the phase of the moon, and so on and so forth) is what my review might look like in a search result listing:
Anna’s Pizzeria: review
★★★★☆ Review by Mark Pilgrim - Mar 31, 2010
Excerpt from the page will show up here.
Excerpt from the page will show up here.
diveintohtml5.org/examples/review-plus-microdata.html - Cached - Similar pages
Angle brackets don’t impress me much, but I have to admit, that’s pretty cool.
❧
This has been ‘“Distributed,” “Extensibility,” & Other Fancy Words.’ The full table of contents has more if you’d like to keep reading.
In association with Google Press, O’Reilly is distributing this book in a variety of formats, including paper, ePub, Mobi, and DRM-free PDF. The paid edition is called “HTML5: Up & Running,” and it is available now. This chapter is included in the paid edition.
Copyright MMIX–MMXI Mark Pilgrim