Schema App’s CEO Martha van Berkel sits down with SEO legend Bill Slawski to discuss schema markup and its role in the future of Search.
In this interview, Bill Slawski shares the insights he has gained from reading Google patents and being an advanced SEO practitioner. He talks about how entities have been a common theme in his blog posts, a result of his research on Google patents, and how he sees the role of entities evolving. My favourite part of the interview is when Bill describes entity IDs. Why? Because an entity ID is a tangible node in the knowledge graph and a centering point for defining things. It gets me thinking about who else can contribute to these IDs, whether they will become a standard, and if not, what the standard for entity IDs will be.
“If you do a search at Google Trends for an entity like Chicago Cubs or something like that, if you see a type that’s something other than a search term, if you see like Chicago Cubs, you look for it and it says baseball team, so it’s a type. So, it’s recognizing Chicago Cubs as an entity. If you look at the URL, the last few letters and numbers of the URL are the machine ID number. So, that’s using Google Trends to help track entities. Google did a blog post on reverse image search where they say they’re using machine IDs to track entities in images, so when you do a search for an image of an entity, it’s using machine ID numbers to help it find that, which ties into the Google Lens use of schema to find entities.”
Martha: Hello and welcome to Schema Stories. My name is Martha van Berkel. I’m CEO at Schema App and I’m absolutely delighted today to bring you an interview with Bill Slawski. Welcome Bill!
Bill: Thank you.
Martha: Bill, I love reading your articles, how you get so, so deep into the patents and try to connect the dots for those of us that aren’t reading those technical documents. I’m really excited for you to share your view on schema markup today and where you think search is going.
Bill: I started looking at the patents because I was curious about how things worked, and I wasn’t getting answers from most things I saw on the web. Sometimes we do see some in-depth, detailed information on Google’s developer pages, and sometimes they promise those things and we don’t see them. For instance, when they first introduced Google Lens at the Google I/O 2018 developers conference, they mentioned that they would be putting more information on the developers’ pages about how Lens uses schema after it recognizes objects in pictures. It would tell us if the picture was a picture of a band. It would look for schema markup that might be event-based and might tell us where the band was performing next. But we haven’t seen that on the developers’ pages yet. So, I’ve been relegated to looking for patents that might include that.
Martha: Love it. That’s awesome. Well, why don’t you start by telling us a bit about yourself and what you do in search.
Bill: Okay. I’ve been working in different agencies and as a solo practitioner since 1996. I worked in-house for a number of years for a company that helped people incorporate in Delaware. I started getting into forums, administrating and moderating forums, and kept running into the same questions over and over and over again. How do I increase my PageRank? What goes into PageRank? Why can’t Google read pictures of text? Now they probably can, but it’s just something they don’t do. It’s too computationally expensive for them to do. So, if you don’t include your address on your page as text, if you take a picture of text or design a logo that has your address in it, Google doesn’t know where you are.
Telling you more about myself: I work with an agency called Go Fish Digital. They’re located on the East Coast and I’m on the West Coast, so I’m sort of like a satellite office, but we do have some clients on the West Coast. I don’t get to spend too much time face to face with them, but it’s good to be in the same time zone, and they know they can catch up with me at the same time of day if it’s later in the afternoon or so on, which makes it convenient for them and for me.
Martha: Excellent! And can you tell me, in your journey in search and in your career, when did you first come across the Semantic Web or schema markup? How were you first introduced to it?
Bill: I read something I wrote in 2013. I had gone through my previous posts to see how often I mentioned certain topics and noticed that I’d mentioned the word “entities” in about 20% of my posts. So, it’s something I’ve always been writing about. The Internet of Things instead of Strings, right?
And I think a lot of that’s because I do write about patents, and there have been a number of patents from Google that all talked about entities and named entities and so on. The people I’m working with now on the East Coast I met at an SEO meetup, one where I was talking about named entities, which is kind of funny. It’s been useful for some of the projects we’ve worked on.
For instance, we worked with an apartment complex in Northern Virginia whose website was only four pages long. They were trying to sell apartments, and they hadn’t been having much luck. We started working with them, improving their site, bringing up things on their pages that they hadn’t included, like the fact that if you took an elevator down to the basement it opened up to the DC Metro, which allows you to commute everywhere in Northern Virginia, DC and Southern Maryland. You can go to 31 different Smithsonian Institutes, which don’t charge admission. So, if you have kids and need things to do on the weekend and you live in these apartments, you can bring the kids to the Smithsonians for pretty much the cost of hopping on the Metro.
Martha: Some people listening may not know what a named entity is. Can you maybe define that a little further for us?
Bill: Sure. Okay, an entity is a person, place or thing. A named entity is when you’re talking about a specific person, place or thing. So, if you are talking about restaurants, those are entities. If you talk about the restaurant down the street from you, it’s a named entity because it’s a specific entity. When Google talks about local search, they’ve pretty much created a search that’s based upon semantics, based upon entities and schema markup. You don’t need to use schema markup on your pages. You don’t even need to have a website to have a listing in Google Maps. But it’s probably smart to. They’ve been using schema more and more, and schema’s been growing a lot too. They’ve been including new things in it. There is an extension process for schema markup where people can submit new topics and expand what schema covers, how it works and so on. They’re using it in new ways. One of the things I saw recently was CNBC stating that they had received notification from Google that they could create how-to advertisements, where they could describe how to do certain things and pay for the right to have those published. So, as opposed to receiving featured snippets, people would receive how-to advertisements.
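As a rough illustration of the kind of markup Bill is describing for a named local entity, here is a minimal Python sketch that builds a schema.org `Restaurant` description as JSON-LD. The business name, address and helper function are hypothetical and exist only for this example; the property names (`servesCuisine`, `PostalAddress` and so on) come from the schema.org vocabulary.

```python
import json


def restaurant_jsonld(name, street, city, region, postal_code, cuisine):
    """Build a schema.org Restaurant description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Restaurant",
        "name": name,
        "servesCuisine": cuisine,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
    }


# Hypothetical business, used only for illustration.
markup = restaurant_jsonld(
    name="Example Noodle House",
    street="1 Example Street",
    city="Exampleville",
    region="VA",
    postal_code="00000",
    cuisine="Chinese",
)

# The printed JSON is what would go inside a JSON-LD script tag on the page.
print(json.dumps(markup, indent=2))
```

The point of the sketch is that the address is stated as text in named properties, rather than sitting inside a logo or an image where, as Bill notes earlier, a search engine may not read it.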
Martha: Very interesting. You spoke about extensions. I know the IoT extension is something that was in one of the last four releases of schema.org. This is all changing, and it’s changing fast, as you said with those new examples of how it’s being used in paid search. That actually got brought up in a conversation I had with someone from Google Canada this week. Where do you see this going, and more importantly, what are you most excited about in the evolving area of search?
Bill: Okay. One of the typical things we think of when we think about how a search engine crawls the web, like Googlebot, is that it follows links from page to page and indexes the content of those pages, the anchor text of those links, where the links go, whether they redirect and so on. An alternative to that, something that’s been referred to as open learning on the web, is that Google reads pages like it was a person. It tries to understand what’s being talked about, what topics are being covered. We used to have DMOZ, which used to be a starting point for focused crawls on the web when a search engine wanted to crawl the web and learn about different topics, because it could follow categories from DMOZ to links to pages, and that would give the search engine coverage of lots of topics instead of being very focused on a few things. This open learning on the web is something that was developed by a company called Wavy, which was at the University of Washington and headed by an AI researcher by the name of Oren Etzioni. I think I pronounced his name right, I’m not sure. He works for Paul Allen’s AI Institute at this point in time, but Google acquired the technology from them. They’re doing more crawling of facts on the web. They issued a paper in 2014 on something called Biperpedia, which talks about how they might use query streams to learn what people are searching for when they search for different topics, and they’re extracting information from crawling facts on the web based upon what they’re learning from these query streams. So, it helps them build ontologies about different topics on the web that they can use to answer questions when they do question-answering, when they try to show featured snippets. And we’re moving in that direction, where the web is becoming a big database.
The very second patent Google filed with the United States Patent Office was one Sergey Brin wrote. He came up with an algorithm called DIPRE. It doesn’t have quite the ring of PageRank, and it’s not BrinRank, he didn’t name it after himself like Larry did with the first patent, but it was a list of five books, their authors, their publishers and their publication dates. And he said that if we crawl the web and find these five books listed, with information about them like we have here, we can crawl all the other books that are in the same location, collect information about them, and then keep on repeating that. The next thing you know, we know what all the books on the web are.
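The bootstrapping loop Bill describes here (start from a few known books, learn how they are written about, use that to find more books, repeat) can be sketched in a few lines of Python. This is a heavily simplified illustration, not the patented algorithm: it only learns the text that sits between a title and an author, the corpus is made up, and the function names are hypothetical.

```python
import re

# Capitalized words separated by single spaces, e.g. "Frank Herbert".
NAME = r"([A-Z]\w*(?: [A-Z]\w*)*)"


def find_patterns(corpus, pairs):
    """Collect the text found between a known title and its author."""
    patterns = set()
    for text in corpus:
        for title, author in pairs:
            match = re.search(
                re.escape(title) + r"(.{1,20}?)" + re.escape(author), text
            )
            if match:
                patterns.add(match.group(1))
    return patterns


def apply_patterns(corpus, patterns):
    """Use each learned middle string to pull out (title, author) pairs."""
    found = set()
    for middle in patterns:
        regex = re.compile(NAME + re.escape(middle) + NAME)
        for text in corpus:
            found.update(regex.findall(text))
    return found


def bootstrap(corpus, seeds, rounds=5):
    """Alternate between learning patterns and finding pairs until nothing new turns up."""
    pairs = set(seeds)
    for _ in range(rounds):
        new_pairs = apply_patterns(corpus, find_patterns(corpus, pairs))
        if new_pairs <= pairs:
            break
        pairs |= new_pairs
    return pairs


# Made-up "web pages" for illustration.
corpus = [
    "Foundation, written by Isaac Asimov, is on the shelf.",
    "Dune, written by Frank Herbert, is next to it.",
    "Neuromancer, written by William Gibson, arrived today.",
    "Dune (Frank Herbert) and Hyperion (Dan Simmons) are classics.",
]

books = bootstrap(corpus, seeds={("Foundation", "Isaac Asimov")})
for title, author in sorted(books):
    print(title, "by", author)
```

Starting from a single seed, the first round learns the ", written by " pattern and finds Dune and Neuromancer; the second round uses Dune to learn the parenthesis pattern and finds Hyperion, which no seed or first-round pattern could have reached.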
Martha: How is this going to change how you, as an organization, manage your data? Do you think it will have an impact on how we explain who we are, what we do and the knowledge we have in our organizations?
Bill: It changes the way search works because it looks to properties of entities, information about them that people can search for. Entities include local entities like businesses, so we can search for something like a Chinese restaurant near a bookstore, and we can go find something to read and something to eat in the same lunch period.
Martha: I always talk about how it’s crawling the graph and connecting those dots, so it knows that the bookstore’s at this address and the Chinese restaurant’s at this address. So therefore, the bookstore sells books and the Chinese restaurant sells Chinese food, and it can connect those dots across.
Bill: All right. So, I did have a job in the courthouse where my supervisor liked to go on shopping trips and sometimes asked me to make maps for her using Google so that she could tell how to get from one shopping center to another to another. That’s something Google does easily. Sometimes they made that a little bit easier to do than other times. They’ve changed that around a few times. It is interesting, you know, the other types of queries we are seeing based on semantic approaches are things like “What was that movie where Robert Duvall said ‘I love the smell of napalm in the morning’?” That’s not necessarily asking for a webpage that shows the name Robert Duvall or that quote or whatever, but it’s asking about facts and properties: what movie was acted in by Robert Duvall and included that quote from him. The search results showed two video clips from Apocalypse Now at the top of the results, with Robert Duvall in them saying “I love the smell of napalm in the morning”. I bring up Robert Duvall because he lived about half an hour north of me when I lived in Virginia a few years ago, and he was really involved in charities and things like that.
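To make the “Chinese restaurant near a bookstore” query concrete, here is a small Python sketch, assuming each place has already been described with a type, a cuisine and coordinates. The place names, coordinates and the 0.5 km threshold are made up for illustration, and this is not a description of how any search engine actually answers such a query.

```python
from math import asin, cos, radians, sin, sqrt


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


# Hypothetical entities with a few schema.org-style properties.
places = [
    {"name": "Golden Wok", "type": "Restaurant", "servesCuisine": "Chinese", "geo": (38.8950, -77.0365)},
    {"name": "Corner Books", "type": "BookStore", "geo": (38.8955, -77.0370)},
    {"name": "Far Noodles", "type": "Restaurant", "servesCuisine": "Chinese", "geo": (38.9900, -77.2000)},
]


def restaurants_near_bookstores(places, cuisine, max_km=0.5):
    """Return (restaurant, bookstore) name pairs that are within max_km of each other."""
    bookstores = [p for p in places if p["type"] == "BookStore"]
    results = []
    for place in places:
        if place["type"] == "Restaurant" and place.get("servesCuisine") == cuisine:
            for shop in bookstores:
                if distance_km(place["geo"], shop["geo"]) <= max_km:
                    results.append((place["name"], shop["name"]))
    return results


print(restaurants_near_bookstores(places, "Chinese"))
```

The query is answered from properties of the entities (type, cuisine, location) rather than from keywords on a page, which is the shift Bill and Martha are describing.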
Martha: Very cool. Also that result is interesting because it doesn’t necessarily give the answer but a video, right? So, there’s a whole other inferring going on there around, you know, the content of the video and the entities and the properties within that video, right, which is sort of taking it to that next step.
Bill: So, I got into doing stuff with entities because I was working on the Baltimore.org website and we were trying to rank pages for things like best bar in Baltimore. Black history in Baltimore was one they wanted to rank for. We tried it, and there were too many well-done, Baltimore-based black history websites to rank well. We weren’t getting past a hundred or so in Google’s rankings. So, I sent a copywriter I was working with an instant message, saying “Could you write something that describes a walking tour of Baltimore from one place to another to another that mentions the famous historic churches, the colleges, the people, there’s the…
Martha: The entities, right?
Bill: Right. There’s a nine-foot tall statue of Billie Holiday in downtown Baltimore. There’s a seven-house mansion from Frederick Douglass as he aged. Originally he started out life as a slave and he escaped. He made lots of money, he moved back to Baltimore, bought a lot of property there and did well. You can visit these places, you can see them, and that was the purpose behind the Baltimore.org website: to get people to travel to Baltimore and see it. So, by creating a page that had about 5000 words on it about different entities that you could visit in person when you came to Baltimore, we were helping them fulfill the mission of their site.
Martha: Awesome, which actually kind of leads me into my next question, and I always ask, like, in a couple of sentences or maybe two sentences, can you articulate what you think the value is to organizations or companies of doing schema markup? What do you think the tangible outcomes are?
Bill: If you use structured data, you’re presenting more precise information to search engines, using data in formats that they expect people to search for, like you would with keyword research. Okay, schema means schema like in a database schema, how the database is organized. So if you’re using the right words, as determined by the people who make schema, who are subject matter experts, you’re most likely using terms that people will search for and that they expect to see on the pages they find, which is really helpful.
Martha: We… I’ve been talking to a couple customers recently about how schema.org vocabulary is actually insight into what the search engines want to know about different entities and that you can use it to understand that dictionary that you’re talking about.
Martha: So, you can actually think of it as input into your content planning. It can be really insightful and also helpful with performance and understanding as you go through that content development process.
Bill: I’ve tried to explain it to copywriters too.
Martha: I had it click for a couple of people last week, so I was excited to see how they think it’s more than just technical SEO, right, and how it really can play a role in the planning and strategy stage. So, let’s talk a little bit about the future because, you know, you have these insights into how Google thinks through your research and reading and experience. In my last interview I talked to Steve Macbeth at Microsoft, and he was talking about how, in his role in AI, they’re thinking about using schema to connect virtual reality with reality, similar to what you were talking about earlier with Google Lens, you know? Do you see any new consumers coming up beyond organic search or assistants or, now, this sort of virtual reality? Where do you think that next consumer of schema markup is going to come from?
Bill: Are you familiar with machine IDs?
Martha: I’m not. Tell me more.
Bill: Okay. When Google acquired Metaweb, they acquired Freebase, and Freebase would give entities machine IDs. So, Arnold Schwarzenegger, for instance, had one long string of letters and numbers that stood for Arnold Schwarzenegger. So…
Martha: This is like the Wikidata number that we have, that sort of came from Freebase?
Bill: It’s like the Wikidata number that came from that. If you do a search at Google Trends for an entity like Chicago Cubs or something like that, if you see a type that’s something other than a search term, if you see like Chicago Cubs, you look for it and it says baseball team, so it’s a type. So, it’s recognizing Chicago Cubs as an entity. If you look at the URL, the last few letters and numbers of the URL are the machine ID number. So, that’s using Google Trends to help track entities. It’s used in… Google did a blog post on reverse image search where they say they’re using machine IDs to track entities in images, so when you do a search for an image of an entity, it’s using machine ID numbers to help it find that, which ties into the Google Lens use of schema to find entities.
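Bill’s tip about reading the machine ID off the end of a Google Trends URL can be sketched with Python’s standard `urllib.parse` module. The sketch assumes the ID appears URL-encoded in the `q` query parameter and starts with `/m/` or `/g/`; that URL shape is an assumption for illustration, and the ID in the example is a placeholder rather than a real entity.

```python
from urllib.parse import parse_qs, urlparse


def machine_ids_from_trends_url(url):
    """Return machine-ID-style terms (starting with /m/ or /g/) found in the q parameter."""
    query = parse_qs(urlparse(url).query)  # parse_qs also decodes %2F back to "/"
    ids = []
    for value in query.get("q", []):
        # Several terms may be compared at once, separated by commas.
        for term in value.split(","):
            if term.startswith(("/m/", "/g/")):
                ids.append(term)
    return ids


# Hypothetical URL; "/m/0abc123" is a placeholder, not a real entity ID.
url = "https://trends.google.com/trends/explore?q=%2Fm%2F0abc123"
print(machine_ids_from_trends_url(url))
```

A plain keyword search such as `q=chicago%20cubs` yields an empty list, which matches Bill’s distinction between a search term and a recognized entity.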
Martha: And are these things like the entity nodes within the knowledge graph from Google’s perspective, or are these more like global standard entity IDs that can be used from a sort of data commons or open data stance?
Bill: They are sort of like a shortcut Google is using. You know, they’re doing search engine optimization too, but they’re actually optimizing the search engine, they’re trying to make it quicker for them. So, if you search for Arnold Schwarzenegger they say, oh, that’s this string of numbers and letters, so that’s all mentions of Kindergarten Cop, all mentions of The Terminator, all the different roles that he’s played. They know that’s the same person, the same entity.
Martha: And is that something you think that brands can control? So, if I’m Kellogg, could I actually start helping define that if I’m managing my schema markup in the knowledge graph?
Bill: Okay. So at Google I/O 2013, they came out with something they called the invisible sameAs, which meant that you could use a link element instead of an anchor element to show that an entity you’re talking about is the same as something that maybe you see in Wikipedia or something. So I had a client who was an app developer who came up with a payment app. They were started by Sprint, Verizon and AT&T. So, they were on most of the Android phones in the US, because those are pretty big carriers. They were putting a link, okay? So, the company we worked with, their name had changed. They were originally known as Isis, which maybe you don’t necessarily want to be known as, as a company, because it sounds a lot like a paramilitary group from the Middle East, okay? I started to do an audit for them, and one of the things I suggested was that they use this invisible sameAs, which was described at the Google I/O conference by a couple of Google developers. I figured Google wouldn’t have a problem with that because it’s something they came up with. So every time on the website they would call themselves Isis, they would use the sameAs link to their Wikipedia page, which shows that they’ve changed names from Isis to the new name, okay, which meant they weren’t the paramilitary group. Google could distinguish them from the guys in the Middle East. So they never implemented that, because Google bought the company two weeks before… two weeks after I sent them the audit. So, I missed the chance to do that with them.
Martha: But now, we actually see that in general schema markup, right, where you can use sameAs. Most people use it for social media, but you can also use it for stronger relationships, right? So, if you’re changing brands, you know, who you were and what it is, as well as alternate names, etc., right?
Bill: Right. You can, and they remark that you can use it in JSON to have it show in your knowledge panel.
Martha: Exactly. Well, one last question and then we’re going to wrap up because we’re out of time. So now, I look to you and a lot of your articles to stay on top of things that are changing, especially, again, looking at some of the things that we don’t see publicly in the public documentation from the search engines, you know? Who are the people you follow and watch to stay on top of trends, or that are inspiring you in your work?
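For readers who want to see what the JSON form of `sameAs` might look like for a renamed brand, here is a minimal Python sketch that builds an `Organization` description with `alternateName` and `sameAs` and wraps it in a JSON-LD script tag. The company names and URLs are placeholders and the helper function is hypothetical; only the property names come from the schema.org vocabulary.

```python
import json


def organization_jsonld(name, former_name, url, same_as):
    """Build a schema.org Organization description that links to other profiles of the same entity."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "alternateName": former_name,  # e.g. the name used before a rebrand
        "url": url,
        "sameAs": list(same_as),  # pages that describe this same entity
    }


# Placeholder names and URLs, for illustration only.
org = organization_jsonld(
    name="Example Pay",
    former_name="Example Wallet",
    url="https://www.example.com/",
    same_as=[
        "https://en.wikipedia.org/wiki/Example",
        "https://twitter.com/example",
    ],
)

# JSON-LD is typically embedded in the page inside a script tag like this one.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(org, indent=2)
    + "\n</script>"
)
print(script_tag)
```

Pointing `sameAs` at a reference page that documents the name change is the same disambiguation idea Bill proposed in his audit, expressed in JSON rather than as a link element.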
Bill: It’s a good question. I have a very large SEO list on Twitter, and I read through that usually a couple of times a day, because you never quite know what people might post there, and they’re SEOs from around the country, around the world. I’ve made a couple of other lists like that. I have a machine intelligence and deep learning list and a search engine list. I find that it’s helpful to look at those and see what people are talking about, because people do announce a lot of new things there.
Martha: Yeah. I absolutely love following the conversation especially again for thought leaders sort of around the world and in different areas of specialty and you know and it’s always fun, also sometimes when they throw in non-search related things really, so you get to know them as a person as well.
Bill: That does make it fun. Yeah.
Martha: So, Bill, thank you so much for your time. If people wanted to find you online, what’s the best way for them to find you?
Bill: They can find me on Twitter (@billslawski, @seobythesea). I spend a lot of time there. I blog at my own website, which is SEO by the Sea. I blog at the Go Fish Digital website. I had been doing some writing for some other sites. I don’t do that quite as often, because the Go Fish Digital site and the SEO by the Sea site keep me busy enough.
Martha: Excellent! I’ll put those links in for our listeners and for the people watching the interview, and again, a heartfelt thank you for spending time with us today and sharing your insights. I look forward to continuing to see you in the Twitter-dom as well as reading your ongoing articles as you go deep into how things are changing from a machine learning and semantic web perspective. So, thanks so much for your time today.
Bill: You’re welcome. Thank you!