10 Things You Don’t Know about Structured Data

This is the presentation Martha van Berkel, CEO at Schema App, gave at State of Search in Dallas, Texas, on November 6th, 2019. The 20-minute presentation is transcribed below.

The 10 Things You Don’t Know about Structured Data list includes:
1. ROI Beyond Rich Results
2. Errors and Warnings + Content Strategy
3. Schema.org + Actions
4. “Proper” Connected Schema Markup
5. Main Entity of Page
6. Multi-Type Entity
7. Additional Type
8. Additional Property
9. Lists: Item List, Collection, Offer Catalog
10. Analytics

 

Here is a transcription of the webinar:

Welcome to Ten Things You Don’t Know About Structured Data.  My name is Martha van Berkel and I’m the CEO here at Schema App.

When I introduce myself, I like to do so with a knowledge graph. I explain that I’m an alumna of Cisco; I spent 14 years there doing online support strategy as well as product management. I’m an alumna of Queen’s University and MIT, and I have a very technical background in mathematics and engineering, as well as strategy and innovation. I’m the co-founder of Schema App. I’m Canadian and I used to own this awesome car that was an Austin Healey Sprite. My actual car was in the movie Losing Chase, which was directed by Kevin Bacon. Kevin Bacon used to drive my car. The reason I share that is that you have now inferred that you are three degrees from Kevin Bacon since I’m two degrees from Kevin Bacon. The inference that you just made about Kevin Bacon is the same type of inferencing that Google does when it’s trying to figure out the best answer to provide to searchers.

Today we’re going to talk about schema markup and structured data. I’ll use those terms interchangeably, although I do prefer schema markup because it’s actually meant to disambiguate the definition of things and structured data can be lots of different things. If you’re not familiar with it, it’s what helps search engines truly understand the content so that it can best match things with the searcher’s intent and also reward you with rich results.

So, let’s go through our ten things. The first one is that there’s ROI (or return on investment) beyond just the rich results and this is a theme that’s really been coming up just in the past couple months. John Mueller in a video states that it’s not just about rich results but really promoting understanding, which there’s great value in. In our work with customers, from small/medium business all the way up through large enterprise, but specifically focused on enterprise clients, we’re seeing outstanding results. Not just in impressions and click-through rates (or that kind of front end of the funnel), but all the way through that customer journey including impact to rank, but also more interaction with the content, more time on the sites, and eventually even higher revenue.

In May 2019, Google announced that they had a specific motivation for continuing to invest in structured data. It was during Google IO that they talked about how users are trying to reach out at different moments and through different services, so different times of their day and through different devices or surfaces (think of your phone, your car). So it’s a great opportunity for us as content creators, but building and maintaining those customer experiences is really hard and a lot of work. They said this is why they’re doing structured data: that if we focus on building amazing content and translating it into structured data, Google will then help reach those users across those different surfaces and those different experiences. Amazing! What it means is that when you’ve optimized with structured data, you’ve actually optimized not only for search but also for voice. 

The second thing I want to talk about is errors and warnings and content strategy. The schema markup process starts with strategy and then you go into authoring where you create the schema markup or the JSON-LD. You then have to get it on to the site through deployment. Over time you’ll need to maintain it, either if you get new results or new content or changes to the layout of your page. You then report and analyze it and then go back to strategy – it’s an iterative process. 

When you get Google errors and warnings, often it doesn’t actually mean that there’s anything wrong with your structured data. It just means that the required and recommended fields that they’re looking for in order to present to you and for you to qualify for those rich results aren’t being met. When you get a warning or an error you actually can just say, “Oh! We actually need to then address the fact that that content isn’t visible on the page.” Once you get that, you can add the content, optimize it, and get rid of those errors and warnings.

While the Google errors and warnings mean that your eligibility for those rich results could be at risk, they don’t actually mean that there’s anything wrong with how you’ve written your structured data.

One of the examples that I gave recently was that you also don’t want to add content that’s inappropriate. In this case, the agency was optimizing a site about hay and it was giving them a warning about a SKU missing. Well, hay doesn’t have a SKU. It isn’t really relevant with regards to the warning and isn’t even a piece of content that you’d want to add.

Another example is that I don’t put a salary on our job postings. That’s a choice I’m making for the business not to include that information. Again, that would be an error or warning that I would just dismiss because it’s not appropriate to our content.

All right, number three. Let’s talk a little bit about schema.org extensions and actions. The schema.org extensions are entire vocabularies that have been written for specific areas of the industry. There’s one for automotive, one for health and life science, one that’s just started around the Internet of Things, a bibliographic one around library sciences, and then finance (sometimes referred to as FIBO). The finance extension has actually been merged into the overall schema.org vocabulary. This allows for an even more specific language that you can use to describe things within these other industries. If ever you’re trying to describe a business, but it’s a medical business, or an article that is a medical article, you can look at schema.org for specific examples or specific classes that describe those things within those industries.

The other piece that is hugely underutilized is actions. There are many actions, roughly 14 or so specific actions that you can use.  Here’s a list of some of them: CreateAction, FindAction, PlayAction, SearchAction… I think these are going to become more relevant as we start interacting through these different services. You won’t always be looking at it with your eyes on a browser on your computer. You might be interacting with it through a voice channel or while you’re driving in your car. The actions that you can take, or that your assistant will give you options for, can actually be defined in the structured data so that you can have that interaction. This is an area that I would encourage you to have a look at as it allows the user to take that clickbait into action but also is understood and relevant across any surface.
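As a minimal sketch of what defining an action in structured data can look like, the Python below builds JSON-LD for a WebSite with a SearchAction. The site URL and the search URL pattern are hypothetical placeholders, not anything shown in the talk.

```python
import json

# Hypothetical site and search URL pattern; replace with your own.
website = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "url": "https://www.example.com/",
    "potentialAction": {
        "@type": "SearchAction",
        # {search_term_string} is a placeholder that the consumer fills in.
        "target": "https://www.example.com/search?q={search_term_string}",
        "query-input": "required name=search_term_string",
    },
}

# Serialize to the JSON-LD that would go inside a script tag on the page.
print(json.dumps(website, indent=2))
```

Other action types, such as PlayAction or FindAction, would hang off potentialAction in the same way.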

All right! Number four and probably my favorite: connected schema markup. The reason I like to talk about this is because a lot of people when they’re doing structured data are really just looking at trying to get those rich results. “Game over, I’m done. I get to just keep working on my stuff.”  But the fact is that in order to get to the understanding piece that John Mueller was talking about, you actually have to connect the dots in order to help understand the context of how things are connected. 

For example, in my introduction you now know how I’m connected to MIT, Schema App and Kevin Bacon. You can do the same thing with your content. I thought I’d give you some tips on what not to do, and then some tools in order for you to figure out how to really make sure that you’re fully optimizing on how information is connected. 

Firstly, do not put the same schema markup on every page. I’m talking about plugins or other things that put Organization markup on every page and the reason this is a no-go is because if the same markup describing exactly the same thing is on every page, then you actually don’t know which page is really talking about that thing. You want to be really specific so that you can bring clarity on what it is you do, so you don’t want to take a peanut butter approach to schema markup.

Another thing is that you don’t want to create islands of schema markup.  An example would be creating your locations, your organization and your products, but not describing how they’re connected. I’m going to give you some tips on how to do that appropriately. 

The most important thing you can do with your schema markup is make sure that you’re telling the story; you’re explaining how these things are all connected. This is really what then yields a knowledge graph. It’s not just doing schema markup but doing proper schema markup where it’s really well connected.
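One way to avoid islands is to give each data item an @id and have the other items point at it. The sketch below is only an illustration under assumed names and URLs (Example Co, the Dallas location, the widget are all hypothetical), and find_islands is a hypothetical helper that only follows direct single-object references.

```python
import json

ORG_ID = "https://www.example.com/#organization"  # hypothetical identifier

graph = [
    {"@type": "Organization", "@id": ORG_ID, "name": "Example Co"},
    {
        "@type": "LocalBusiness",
        "@id": "https://www.example.com/locations/dallas#location",
        "name": "Example Co Dallas",
        "parentOrganization": {"@id": ORG_ID},  # location -> organization
    },
    {
        "@type": "Product",
        "@id": "https://www.example.com/products/widget#product",
        "name": "Example Widget",
        "manufacturer": {"@id": ORG_ID},  # product -> organization
    },
]


def find_islands(items):
    """Return the @id of every item that neither references nor is
    referenced by another item (direct {"@id": ...} values only)."""
    ids = {item["@id"] for item in items}
    connected = set()
    for item in items:
        for value in item.values():
            if isinstance(value, dict) and value.get("@id") in ids:
                connected.add(item["@id"])
                connected.add(value["@id"])
    return sorted(ids - connected)


document = {"@context": "https://schema.org", "@graph": graph}
print(json.dumps(document, indent=2))
print(find_islands(graph))
```

An empty list from find_islands means every item in this small graph is tied to at least one other item.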

In this case we’re describing the organization; it was a company that had rented apartment complexes. It had different offices, but the Canadian office specifically had apartment complexes that had reviews, videos, and ratings. That apartment complex then had a multi-type entity (apartments) that were for rent (hence, also a product) that then had these additional features and properties. So you really want to explain how these things are connected.

Let me give you some other examples. Here we’re going to look at a food establishment with an amazing marketing video on it. How do we actually say how those things are connected? We don’t want to have two entities where we have just a video and just a food establishment – we want to make sure they’re embedded. 

If you go to Schema App, we have a free tool that helps you figure out how to connect them. Go to Resources -> Tools -> Schema Paths. Schema Paths was developed by my co-founder Mark when he got really tired of going to schema.org trying to figure out how he was going to relate two different things. Within Schema Paths you can just pick the two different classes, in this case a food establishment and a video object, and it will tell you how you can relate those two things.

For example, subjectOf is the appropriate way to connect the food establishment as the subject of the video. This would be a really clean way of doing it so that the primary entity on the page is the food establishment. It’s a page about the restaurant and then embedded in that would be subjectOf and then you would link to the data item or link to the video object.

You can also relate it the other way: if the page was primarily about the video object then you might use about and say that it’s primarily about this video. The video is then about the food establishment and that way I’m being very clear as to which one is the primary entity of page.
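Both directions can be sketched as JSON-LD built in Python. The restaurant name, video title and URLs are hypothetical; the point is only which item sits at the top level and which one is embedded.

```python
import json

# Page about the restaurant: the FoodEstablishment is the primary entity
# and the video is embedded under subjectOf.
restaurant_page = {
    "@context": "https://schema.org",
    "@type": "FoodEstablishment",
    "name": "Example Bistro",
    "url": "https://www.example.com/",
    "subjectOf": {
        "@type": "VideoObject",
        "name": "A Tour of Example Bistro",
        "contentUrl": "https://www.example.com/media/tour.mp4",
    },
}

# Page about the video: the VideoObject is the primary entity
# and the restaurant is embedded under about.
video_page = {
    "@context": "https://schema.org",
    "@type": "VideoObject",
    "name": "A Tour of Example Bistro",
    "contentUrl": "https://www.example.com/media/tour.mp4",
    "about": {"@type": "FoodEstablishment", "name": "Example Bistro"},
}

print(json.dumps(restaurant_page, indent=2))
print(json.dumps(video_page, indent=2))
```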

Let’s look at a couple other examples: a financial service/services in a financial organization. If we were looking at the path from Organization to Service, the financial organization offers financial services. Schema Paths tells me that there isn’t really a clean way of identifying how the Organization connects to that Service. Instead, we would do it at the Service level and we would say that the Service is provided by the Organization.
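A minimal sketch of connecting at the Service level, using the provider property. The organization and service names are hypothetical, and the choice of the FinancialService and FinancialProduct types is an assumption made for illustration.

```python
import json

# The Service points back to the organization that provides it.
service = {
    "@context": "https://schema.org",
    "@type": "FinancialProduct",  # a subtype of Service in schema.org
    "name": "Example Savings Account",
    "provider": {
        "@type": "FinancialService",  # hypothetical financial organization
        "name": "Example Financial",
        "url": "https://www.example.com/",
    },
}

print(json.dumps(service, indent=2))
```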

Next, and probably my favourite, a volcano and a gas station. Yes, there are two schema.org classes for these two things. Again, to relate it: the volcano can contain a gas station or the gas station has an areaServed which is the volcano. Maybe in Pompeii, just outside of Vesuvius there’s a gas station which has this areaServed which would then be a great way to link those two. As you can see, you can pretty much link anything. 

The other suggestion we have is to use very strong connectors. As you’re looking at linking things, use fields such as About, Mentions, subjectOf and hasPart. Let me show you some examples. In this news article, we would want to be very clear that it is about Jack Ward the cowboy and ideally use a Wikipedia entry in order to define who he is. In this news article you could say that it mentions Interstate 10 and also avocados. Again, we’re starting to link key topics that were within the article. In this news article you could say it’s the subjectOf the video object The Bees.

Now, a more advanced option is this example, which was a very long web page that talked about a specific service that was being delivered. On the page are many FAQs about that service, as well as blogs related to the service. We opted to classify it as a Collection Page because it’s got many different things and use a very strong connector (About) to specify it’s about the service,  and then hasPart FAQ. The FAQ then hasPart Questions and AcceptedAnswers and then it mentions BlogPostings.

As you can see the strong connector is the About field, and then the less strong ones are hasPart and mentions. This allows us to make sure that we’re still getting that rich result for the FAQ but that it’s all nicely connected so Google really understands how this information is related.
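The structure described above might be sketched as follows. All names, URLs and question text are hypothetical. One assumption to flag: the questions are attached to the FAQPage with mainEntity, which is the property Google's FAQ documentation uses, rather than hasPart as in the spoken description.

```python
import json

collection_page = {
    "@context": "https://schema.org",
    "@type": "CollectionPage",
    "url": "https://www.example.com/services/example-service/",
    # Strong connector: what the whole page is about.
    "about": {"@type": "Service", "name": "Example Service"},
    # Less strong connectors: the parts of the page and what it mentions.
    "hasPart": {
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": "What does Example Service include?",
                "acceptedAnswer": {
                    "@type": "Answer",
                    "text": "A hypothetical answer shown on the page.",
                },
            }
        ],
    },
    "mentions": [
        {
            "@type": "BlogPosting",
            "headline": "A Related Example Post",
            "url": "https://www.example.com/blog/related-post/",
        }
    ],
}

print(json.dumps(collection_page, indent=2))
```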

Number five: main entity of page, which is really identifying the primary topic of the page. This often comes up if there are different schema markup data items on a page. You could have a news article and a video that haven’t been embedded, and it leaves you wondering “what is this about?”

An example I once gave of this was an event site that had five different events on five different days. But what is the primary event that we’re talking about? If there are different items on the page, ideally you would have embedded them and linked them like we just talked about. But if you don’t, which one of these is actually the main entity of the page, the primary topic of the page? In order to identify that, what you can do is use the property mainEntityOfPage and add in the URL of the page it’s on. Now, you would only pick one of these to do it on and then that would actually indicate to Google that this is the main topic of that page.
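As a rough sketch of the five-event case, the Python below builds five Event items and sets mainEntityOfPage on only one of them. The page URL and event names are hypothetical.

```python
import json

PAGE_URL = "https://www.example.com/events/"  # hypothetical page URL

# Five hypothetical events listed on the same page.
events = [
    {"@type": "Event", "name": f"Example Event, Day {day}"}
    for day in range(1, 6)
]

# Mark only one of them as the main entity of the page.
events[0]["mainEntityOfPage"] = PAGE_URL

document = {"@context": "https://schema.org", "@graph": events}
print(json.dumps(document, indent=2))
```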

All right, number six: multi-type entities. This is where you can actually merge two different schema.org classes and use properties across both of them. Where we see this really commonly used is in situations like a house for sale or a house for rent. As you saw in my previous example, it’s both a product and a single-family home. Or a comedy event on-air is both a comedy event and a broadcast event. A book for sale is both a product and a book. A hotel room for rent is both a product and a hotel room. You really want to do this only when it adds more clarity. Multi-type entities are where you merge those two. I believe Schema App is one of the only tools out there that allows you to do this both within our Editor as well as in our Highlighter, either one page at a time or across many.
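In JSON-LD, a multi-type entity is expressed by giving @type a list. Here is a minimal sketch for the house-for-sale case; the listing name, room count and price are hypothetical values chosen for illustration.

```python
import json

house_for_sale = {
    "@context": "https://schema.org",
    # Two types merged on one entity.
    "@type": ["Product", "SingleFamilyResidence"],
    "name": "Example Family Home",
    # Property from the SingleFamilyResidence side.
    "numberOfRooms": 6,
    # Property from the Product side.
    "offers": {
        "@type": "Offer",
        "price": "450000",
        "priceCurrency": "USD",
    },
}

print(json.dumps(house_for_sale, indent=2))
```

Because both types are declared, properties from either class can sit on the same data item.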

Number seven: additional type. Additional type also adds clarity to a schema.org type that maybe isn’t in the vocabulary. I always joke that there’s no local business class for a marketing agency even though digital marketing agencies make up a large portion of the people doing schema markup.

In this example we’re going to look at an orthodontist, hence the smiling face.  You would identify the type, which in this case it would be a dentist. Then for additional type I would use a Wikipedia entry to define more clearly what type of dentist I want. Think of this as a way to further clarify the type that you’re using within schema.org.
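The orthodontist case could be sketched like this. The practice name is hypothetical, and the Wikipedia URL is an assumption about which entry one would pick to define the specialty.

```python
import json

orthodontist = {
    "@context": "https://schema.org",
    # Closest class available in the schema.org vocabulary.
    "@type": "Dentist",
    # External definition that narrows down what kind of dentist this is.
    "additionalType": "https://en.wikipedia.org/wiki/Orthodontics",
    "name": "Example Orthodontics",
    "url": "https://www.example.com/",
}

print(json.dumps(orthodontist, indent=2))
```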

Now I was in Texas when I first presented this, and this is a Texas Longhorn. It’s not just a Thing, but there’s no animal class in schema.org. What I could do is use additionalType and use the Wikidata entry to further define what this is. Wikidata is even more specific than Wikipedia since Wikipedia is often localized for different languages. When you see a number like this it means it’s identifying the entity in the Wikidata database, which then populates Wikipedia. In this case I’m saying it’s not just a Thing; it is very much a Texas Longhorn as defined by that Wikidata entry.

All right, number eight: additionalProperty. AdditionalProperty basically allows you to use free variables to further describe something within a Product and so forth. In this case we’re gonna look at a camera. You might want to call out some specific features, such as the number of megapixels. In this case you would use additionalProperty and you would create a data item using PropertyValue. PropertyValue is really a property-value pair where you define the relationship between a variable and what its actual number or quantity is. There are some properties already set in here, like a max value, measurement type, min value, name, etc. You always want to include name and then value; those would be the minimum. If you can find a unit code, such as a UN/CEFACT code for megapixels, you can go above and beyond and also use that.
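A minimal sketch of the camera example follows. The product name and the value of 24 are hypothetical, property_value is a hypothetical helper, and unitText is used here instead of unitCode because no specific unit code is given in the talk.

```python
import json


def property_value(name, value, unit_text=None):
    """Build a PropertyValue with the minimum of name and value."""
    item = {"@type": "PropertyValue", "name": name, "value": value}
    if unit_text is not None:
        item["unitText"] = unit_text
    return item


camera = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Camera",
    "additionalProperty": [
        property_value("Megapixels", 24, unit_text="MP"),
    ],
}

print(json.dumps(camera, indent=2))
```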

Number nine: using lists and different kinds of lists.  There are the basic kind of lists like a BreadcrumbList and a HowToSection and a HowToStep and an OfferCatalog. I decided to clarify when to use these lists. An item list is a list of any type of Thing. An OfferCatalog is a list of offerings, which could be a list of Products. So, if you had a landing page on your store you would have an OfferCatalog. A CollectionPage is really interesting because it can kind of put a container around a list and then allow you to connect it. You saw this example when I showed this previously where this CollectionPage allows us to have lists within it, but then also use strong features or properties such as About and Mentions.
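To make the difference concrete, here is a hedged sketch of an ItemList and an OfferCatalog built in Python. The helper functions and all names passed to them are hypothetical.

```python
import json


def item_list(names):
    """Build an ItemList of any kind of Thing, with positions starting at 1."""
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name}
            for i, name in enumerate(names, start=1)
        ],
    }


def offer_catalog(name, product_names):
    """Build an OfferCatalog: a list of offers, here each offering a Product."""
    return {
        "@context": "https://schema.org",
        "@type": "OfferCatalog",
        "name": name,
        "itemListElement": [
            {"@type": "Offer", "itemOffered": {"@type": "Product", "name": p}}
            for p in product_names
        ],
    }


print(json.dumps(item_list(["First thing", "Second thing"]), indent=2))
print(json.dumps(offer_catalog("Example Store", ["Widget", "Gadget"]), indent=2))
```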

Number ten: structured data really goes beyond search – it can also help you structure other things like your analytics. Schema App recently released a Trend Report for all our clients and this replaces the Google Structured Data Report that used to be in Search Console. It allows you to track your schema markup over time. We also have an offering called Enhanced Analytics which allows you to take the schema markup that you’ve built and add any property into Google Analytics and soon into Adobe Analytics.

In this example we wanted to ask the question “Which author gets the best results?” We’ve taken the schema markup out of our WordPress plugin, loaded it programmatically into Google Analytics, and lo and behold, the top authors are Martha and Schema (who Mark says is him in his Admin mode). What this tells me is that I should continue to write a lot of content because my posts are the ones that drive the most sessions within the website. You can imagine there are a lot of insights that you can start getting now from your search data if you reuse that structured data. If you’re interested in learning more, feel free to reach out to us at Schema App and we’re happy to show you more about Enhanced Analytics.
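The general idea of reusing markup as an analytics dimension can be sketched as pulling one property out of a page's JSON-LD. This is not how Enhanced Analytics is implemented; extract_property is a hypothetical helper, the sample post is made up, and sending the value to an analytics tool is left out.

```python
def extract_property(item, path):
    """Follow a dotted path such as 'author.name' through nested JSON-LD
    dictionaries; return None if any step is missing."""
    value = item
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Hypothetical markup for a single blog post.
post = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Example Post",
    "author": {"@type": "Person", "name": "Martha"},
}

# The extracted value could be recorded as a custom dimension per page view.
author_dimension = extract_property(post, "author.name")
print(author_dimension)
```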

So, those are the 10 things I had about structured data to help you better understand all the things that you can do beyond the basics and beyond just feature hunting. If you have any questions or are really passionate about learning more about structured data or how Schema App does structured data at scale, feel free to check out our other resources. These can be found under Resources, Tools, and Webinars. Other really helpful pieces around how to do a schema markup strategy, how to articulate the ROI of doing schema markup are all available there. You can also reach out to me directly and explore how you can partner with the world’s experts at doing schema markup at scale, on any platform, for any type of content. 

Thanks and have a great day. Happy Schema’ing!
