Schema Story: Mark van Berkel

Hi, and welcome to Schema Stories. My name is Martha van Berkel, and I’m here today to interview Mark van Berkel. Schema Stories are all about bringing to life the stories of real people working in the area of schema.org and semantic search, to make it more real, more than just technology. Mark van Berkel, who joins me, is the creator of Schema App and also my co-founder at Hunch Manifest. Mark, welcome.

Hi, thanks for having me.

Mark, please tell us a bit about yourself and your background.

So, I suppose I started working in 2001-2002 as a programmer and spent several years as a consultant doing some enterprise software. Then in 2005 I started a master’s degree in, basically, semantic technology. About half the credits went toward ontology design and engineering, as well as doing a proof-of-concept or recipe research labs, and so that’s where I got familiar with semantic technology. I then spent a couple more years in industry, and in 2012 I started the company and quickly figured out how I could apply semantic technology, because the software and technologies had gotten to a point where they were productive for business and for industry. When I was studying at the university it was still fairly theoretical; the tools weren’t ready to support it. So yeah, a long history of consulting, as well as technical product design.

So a lot of people know you mostly as the creator of Schema App. Although they hear my voice and see my face, you’re the one who came up with the idea, so tell us a bit about why you created Schema App and how it came to be.

The schema model, basically: the Schema.org group had published all of the recommendations as this kind of graph model, in RDFa, which is just a way to structure the actual classes and properties and how they all relate together. This comes from the semantic technology world, the graph relationship, and I’d been working productively in that technology, as I mentioned, from 2012 onward. So in 2014, I believe it was the summer, I put two and two together. We were doing digital marketing consulting at the time, and I figured that there were probably ways we could consume this model and put it to good use. Because I had familiarity with the graph database, I did some transformations and could plug it into the tools I was using, and so I was able to figure out, “Okay, well, I can create this generator and this way of creating and managing your schema data.” It was around that time that the schema group decided to adopt JSON-LD as the format, which played right into the tool, because then you’re not mucking around in the HTML. The JSON-LD format is nice and clean, so I can then export it. So there was just this opportunity to get this generator going and do the work for our clients. I created it for myself originally, and basically when Google got on board with JSON-LD, I made it public. And so let’s fast forward to today.
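
To make the JSON-LD point concrete, here is a minimal Python sketch of a generator along the lines Mark describes. It is not Schema App's code: the function names (`to_json_ld`, `to_script_tag`) and the sample event are made up for illustration, and the only assumption about the page is that the markup lives in its own `script` element rather than being woven into the HTML.

```python
import json


def to_json_ld(schema_type, properties):
    """Build a schema.org JSON-LD document as a plain Python dict."""
    doc = {"@context": "https://schema.org", "@type": schema_type}
    doc.update(properties)
    return doc


def to_script_tag(doc):
    """Wrap the JSON-LD in the script element that is dropped into the page."""
    body = json.dumps(doc, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    # "\/" is a legal JSON escape, so the data still parses to the same value.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical event used only to exercise the generator.
event = to_json_ld("Event", {
    "name": "Example Meetup",
    "startDate": "2016-06-01T18:00",
    "location": {"@type": "Place", "name": "Example Hall"},
})

print(to_script_tag(event))
```

Because the structured data sits in one self-contained block, it can be generated, exported and replaced without touching the rest of the page's HTML, which is the convenience Mark is pointing at.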

I still remember when you went and spoke at the Semantic Technology & Business Conference, and people were excited about what you had created, but everyone kept saying that really, you’re too early.

Yeah, that’s right. So, I had a conversation with Aaron Bradley at SemTech. I reached out to him ahead of the conference because I wanted to show him, and basically he said, “Well, have you seen a rich snippet yet?” The answer was no, so Google hadn’t been rewarding that format and syntax yet. So, a little early, but you know, there were some other uses for it, so I didn’t kill it yet, and then we figured that that format was coming down the pipeline as something that would be more broadly accepted for rich snippets. So now, today, JSON-LD is the preferred format for Google. So all the bells and whistles are there with JSON-LD, especially when you talk about email actions.


Since you first started thinking about semantic technology back in 2005, a lot has changed, both in the landscape of the application and in how useful it is to everyday folk. I don’t think you back then would have ever thought you would have a conversation with a marketer about semantic technology, and we’re continuing to watch it. What are you most excited about among the changes going on in the semantic and schema world?

Um, well, semantic technology is just a huge conversation because it has such broad applicability.

So maybe, how do you define semantic technology?

Yeah, so even that’s loose, but I’m particularly going to the NoSQL group of databases that drive the technology. So primarily I think about the graph relationship, the property-model kind of relationship databases. They underlie a lot of the technologies that semantic technology is built on, because of their flexibility to describe the relationships between things. Semantics is all about figuring out the meaning behind the words, so you need a really flexible way to describe your concepts. In schema.org, for example, they provided some flexible ways to describe the relationships between all the things and all the properties, and so that’s one area of semantic technology. There are other areas, like natural language processing, where you’re trying to pick up the meaning of words and pick out maybe people, places and things from unstructured data. There are some other areas of semantic technology too, but what I’d like to steer the conversation towards is semantic search marketing, or a little more the schema.org, structured data part of semantic technology.
So we built Schema App with semantic technologies, and the schema part is something that fits into it. From that vocabulary and from that area there are still a lot of opportunities. Schema.org started with rich snippets, so at first you could provide information about a recipe or event online. Then there were products and little stars. And then it got interesting when Google acquired Metaweb, I think that’s what it was called, the creators of Freebase. They were born out of that same era and were also very early; 2007 I think they were created, and acquired 2012 roughly. They just had this giant database of concepts and the relationships between those things, and it was popular in the industry. It got really interesting when Google consumed this as part of its Knowledge Graph, and in the couple of years since then we’ve seen Google feeding so many search engine results pages from its Knowledge Graph. So where do they get the Knowledge Graph? From Freebase, from Wikipedia, Wikidata, but also from schema.org.
So originally schema marketing was geared towards creating rich snippets, but now we have the Knowledge Graph, so that was kind of the second phase, and it’s fairly well untapped at this point. Most marketers see the low-hanging fruit of the rich snippets, or rich cards today. But marking up the rest of your content, whether it be certain types of events or certain services and things that don’t generate a rich snippet, telling someone that you’re a consulting company, that you have five lines of business and these are the services you offer, can be very important for getting Google to understand the concepts in your business. So just providing schema for other concepts, outside of those Google features, is a big area that has yet to be well tapped.
But in addition we’ve got some other things, like open data. There are a lot of governments and other types of industries publishing data sets about the things that they’re doing. Within cities, maybe it’s the sidewalks and all the geographic locations of sidewalks, or it might be about forestry or whatever. I saw sixty-four thousand open data sets in a tool last week, available across a lot of Western countries especially now. The UK is heavy on open data, and so is the United States government, and in Canada a bunch of cities are doing open data. So to bring this around, they’re all siloed chunks of data, which might be fine if you’re really interested in a narrow niche, but how can they better describe those data sets with schema? That’s the question I’m interested in for the next year or two. Schema is a vocabulary with which they can unambiguously declare what the data is about, the roads and all the concepts, so why not use this existing vocabulary to describe all that open data? There’s a bit of work already being done in the UK around job postings: their open data group has approved schema.org job postings as the vocabulary of their open data. So we can see that becoming more and more important in the months and couple of years to follow.
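
As a rough illustration of describing an open data set with the existing vocabulary, the Python sketch below maps a made-up catalogue record onto the schema.org `Dataset` type. The record layout, the `describe_dataset` helper and the example.org URLs are all assumptions for the example; `Dataset`, `DataDownload`, `spatialCoverage`, `distribution`, `encodingFormat` and `contentUrl` are terms from schema.org itself.

```python
import json


def describe_dataset(record):
    """Map a catalogue record (hypothetical layout) onto schema.org Dataset."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["summary"],
        "keywords": record["tags"],
        # Where the data applies, e.g. the city that published it.
        "spatialCoverage": {"@type": "Place", "name": record["city"]},
        # One DataDownload per file format the publisher offers.
        "distribution": [
            {"@type": "DataDownload", "encodingFormat": fmt, "contentUrl": url}
            for fmt, url in record["files"]
        ],
    }


# Invented record standing in for one entry of a city's open data portal.
sidewalks = {
    "title": "Sidewalk locations",
    "summary": "Geographic locations of city sidewalks.",
    "tags": ["sidewalks", "transportation"],
    "city": "Example City",
    "files": [("text/csv", "https://example.org/data/sidewalks.csv")],
}

dataset = describe_dataset(sidewalks)
print(json.dumps(dataset, indent=2))
```

Once every publisher declares its data sets with the same terms, a consumer can read "what is this about and where do I download it" without writing a separate mapping for each portal.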

So are you thinking that, as they standardize how they mark up their open data, that open data then gets connected with the overall knowledge graphs, and then all of a sudden you have this behemoth knowledge graph?

Well, yes. The knowledge graph, I mean, it’s not singular. It’s not one knowledge graph. Google has their knowledge graph, and they have an API, but there are lots of other companies that have knowledge graphs. Microsoft and Bing have their knowledge graph, Yahoo has their own separate knowledge graph, and, I don’t know, probably many dozens of companies have their own knowledge graphs, especially around their specific domain, that go very, very deep. They would, I’m sure, like to consume some of this other data, and if it’s more readily available to the world with a common vocabulary, it’s easier to consume. So that can feed into the knowledge graphs of many companies, including Google.

So one thing we’ve heard about, in our Schema Story with Mike Arneson and also in some other conversations with Aaron Bradley, is semantic analytics. I know you spend a lot of time thinking about semantic analytics. Can you talk a bit about what you are excited about, and where you see the possibility there?

So it’s an interesting area. Primarily, it’s: let’s take Google Analytics and extend it with your data. Right now it’s almost completely anonymous data. I’m talking about generic concepts that every website has: a user has certain characteristics, like their location, or what application they are using to browse, or what kind of phone, and then there are the sessions and the length of things. That’s all generic stuff. Semantic analytics is taking that semantic data, that structured data, schema.org is primarily what we are talking about, and injecting it into Google Analytics so that you have a richer description of the things that are driving revenue, or keeping people’s attention, or getting more social shares, whatever your goals are. In the world of publishing or news, you start dissecting reports based on the author. Obviously you would like to have that kind of report by author in Google Analytics, but they don’t make that available out of the box. So how do you start customizing for the categories of pages or posts, or for the keywords, or for the length of a blog post? Wouldn’t it be interesting to see at what blog length there is a material impact on the success of that campaign? Does it require 1,200 words compared to 800 words? What are the comparisons there? All sorts of different things are made available in schema.org. As long as you have that in your markup, you can now bring it into Google Analytics and create custom reports based on your business data. There were ways to get around that in years past, but I am really excited about providing that super simply. So, as you are onboarding with Schema App in the next several weeks, if you are in ecommerce or in the content industry, you get a bunch of data that can come from your website automatically through Schema App into Google Analytics, and you can start dissecting.
And then in the products world, you can start thinking about all the different reports you can generate, such as segmenting by the colour of your widgets, or by price, or by the manufacturer, or by whether or not there’s a video associated, or whether or not there are more than ten images. There are lots of ways you can slice the data so that you can better inform your future product marketing. I’m excited about the possibilities, and about how simple this can be for users and marketers: with the click of a button and maybe five minutes of time, they can have all these reports readily available in Analytics. That’s the exciting part. Another thing that I don’t hear much talk about in the area of schema data is that the structured data you provide for your content can also be used in other ways, like in-app search. If you think about WordPress, for example, there’s a fairly rudimentary keyword algorithm for ranking the blogs and articles in the out-of-the-box WordPress search. But consider that all of these articles also have the author, the length, the number of comments, maybe some ratings, the pictures, all the different things you can put in schema.org, and those can also be made available as search filters. So you can do some more faceted search just based on the schema data. The way in which you interact with these sites and services can be improved just by reusing schema in these different ways. These are just a couple of ideas that are emerging; there are lots more that can come out. Schema can also be used for competitive intelligence, for instance. There are companies that do product pricing intelligence: let’s say you have a certain line of products, they will monitor the prices of all the competitors. Now it’s fairly easy to consume schema.org data, so you can think of how many applications there are in that area, and I think it just goes on.
So making it available to the public means you are making it available for other, new types of use cases that have yet to emerge. With the Internet of Things, I think there’s opportunity. Because it’s a common vocabulary, you don’t have to map any of the concepts. Yeah, it’s an exciting time with all those different uses, and it all started with a rich snippet for a recipe.
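
The reuse Mark describes, whether for analytics reports, search filters or price monitoring, starts from the same step: reading the schema.org data back out of a page. The Python sketch below shows one way that step might look, using only the standard library. The `JsonLdExtractor` class, the `to_dimensions` helper, the dimension names and the sample page are invented for illustration, and sending the values on to an analytics product is deliberately left out because that part depends on the tool.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect every JSON-LD block found in an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            self.items.append(json.loads("".join(self._buffer)))


def to_dimensions(item):
    """Flatten one schema.org item into report-friendly fields (names are hypothetical)."""
    author = item.get("author", {})
    return {
        "type": item.get("@type"),
        "author": author.get("name") if isinstance(author, dict) else author,
        "word_count": item.get("wordCount"),
    }


# Made-up page carrying BlogPosting markup.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BlogPosting",
 "headline": "Example post", "wordCount": 1200,
 "author": {"@type": "Person", "name": "Example Author"}}
</script>
</head><body></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(PAGE)
dimensions = [to_dimensions(item) for item in extractor.items]
print(dimensions)
```

The same flattened fields could serve as custom report segments (author, word count) or as facets in a site search, which is why no per-site mapping is needed when everyone publishes with the common vocabulary.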

Mark, if people want to get a hold of you to ask questions, or to debate and discuss where they think schema.org is going, what is the best way to reach you?

So I am on Twitter, and it’s pretty easy to get a hold of me there at @vberkel, or by email at mark@hunchmanifest.com, and my sites are hunchmanifest.com and schemaapp.com. I’m behind there somewhere, but Twitter is probably the quickest way to get a hold of me.

Thank you so much for joining us today and sharing your thoughts on where you think schema.org is going and how you came up with Schema App. Thank you for joining us for today’s Schema Story.
