Have Semantic Technologies Crossed the Chasm Yet?

September 14, 2010 at 11:44pm

This article kicks off a series of interviews on Semantic Technologies in the MIT Entrepreneurship Review with industry thought leaders including Thomas Tague (Thomson Reuters), Chris Messina (Google), David Recordon (Facebook), Will Hunsinger (Evri) and Jamie Taylor (Metaweb).

At first sight, the answer is yes. I recently attended the Semantic Technology Conference in San Francisco. What began in 2005 as a 300-person conference has grown into a 5-day event with over 1,300 participants this year and an impressive depth of workshops and panels. The conference is organized by Semantic Universe, an online platform with the goal of “educating the world about semantic technologies and applications”.

I have had the opportunity to talk to some of the key actors and innovators that have pushed semantic technologies and linked data forward over the past years since the term “Semantic Web” was first coined by Sir Tim Berners-Lee of the World Wide Web Consortium (W3C). The term takes on different meanings in different contexts: to some it is about representation of information in certain well-defined formats to make it machine-readable and easy to interpret; to others it is about web services and the aggregation of information to create valuable applications for users, while still others would highlight the artificial intelligence aspect and its use in tackling complex problems.

I have been personally drawn to the field of semantic technologies for some time, realizing the impact that these technologies will have on the way we consume information online as well as on the possibilities from an enterprise perspective. One thing I realized at the conference was that a lot of things that we take for granted today, like online recommendations, are already powered by semantic technologies. In fact, a lot of the conversations happening in the hallways between sessions were not just about technical topics like how best to construct OWL ontologies or how to structure SPARQL queries, but rather about business issues like designing the right monetization models, improving e-commerce with semantic technologies, and gauging the potential business impact of Facebook’s Open Graph, Twitter annotations or Google’s rich snippets. The New York Times, BBC, Newsweek, Tesco and Best Buy are some examples of companies that have been building and are relying on semantic technologies. To me, these are all strong indicators that semantic technologies have reached the tipping point.

Jamie Taylor, Minister of Information at Metaweb, the company behind Freebase, sees clear indications that semantic technologies have become more mainstream:  “Just the sheer size of the conference has increased pretty dramatically, as well as the diversity of people who actually have commercial offerings in terms of tools that matter to your typical webmaster, your typical content manager.” While there is still a strong academic track to semantic technologies, Taylor says, “it’s very interesting that sometimes semantic technologies have met the Web 2.0 lightweight user contribution-type model and as you add semantics into these types of systems – fairly lightweight semantics – all of a sudden they start getting much greater benefit.”

Managing one of the best-known semantic technology start-ups, Will Hunsinger, CEO of Evri, tells me that he has “seen a lot more activity in the last 12 months”. Naming Microsoft’s acquisition of Powerset and Apple’s acquisition of Siri as examples, he also points out that these “transactions have given validation that the technology is here and ready, but also that there is a path to liquidity.” His advice for startups and companies in the semantic technologies sector is to focus less on the technology itself and spend more time understanding consumers’ needs by asking themselves: “What does this technology do better than what’s out there such that you are going to solve a real problem?” For example, at Evri, he adds, “we create a better experience for the consumer applying the technology where it actually has a distinct advantage over keyword e.g. delivering precise results around general topics like “movies” or “reality tv”, understanding meaning and context (e.g. why is a particular entity popular right now) or even enabling consumers to follow topics over time”.

From a technological perspective, the recent developments around RDFa, a way of embedding RDF metadata directly in the HTML markup of a page so that authors can annotate their existing content, will further accelerate the growth of the Semantic Web. Drupal 7, one of the biggest open source content management systems used on hundreds of thousands of websites, comes with major RDFa functionality. The latest HTML5 draft has RDFa support in it. Facebook’s Open Graph protocol is based on RDFa. Google’s rich snippets support RDFa. According to a recent GigaOM report, Twitter Annotations are looking to use it.
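To make the idea of metadata embedded in markup more concrete, here is a minimal sketch in Python of how a program might read RDFa-style property/content pairs out of a page, in the way Open Graph tags are written. The page fragment, its values, and the names `PropertyExtractor` and `extract_properties` are made up for illustration; a real RDFa processor handles far more (prefixes, nested subjects, `about` and `typeof` attributes) than this sketch does.

```python
from html.parser import HTMLParser

# Hypothetical page fragment. The og: property names follow the Open Graph
# vocabulary; the values and URL are invented for this illustration.
SAMPLE_HTML = """
<html>
<head>
  <title>Example Movie</title>
  <meta property="og:title" content="Example Movie" />
  <meta property="og:type" content="movie" />
  <meta property="og:url" content="http://example.com/movies/example" />
  <meta name="description" content="Plain meta tag, no property attribute" />
</head>
</html>
"""


class PropertyExtractor(HTMLParser):
    """Collects property/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        # Only tags carrying a "property" attribute count as metadata here;
        # ordinary <meta name="..."> tags are ignored.
        if tag == "meta" and prop and content is not None:
            self.properties[prop] = content


def extract_properties(html):
    """Return a dict mapping property names to their content values."""
    parser = PropertyExtractor()
    parser.feed(html)
    return parser.properties


metadata = extract_properties(SAMPLE_HTML)
for name, value in metadata.items():
    print(name, "=", value)
```

The point of the sketch is that the publisher states explicitly what the page is about, so a consumer does not have to guess the title or type from free text.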

The benefits of semantic technologies with respect to making online search better are most obvious and to some extent already observable today. David Recordon, Senior Open Programs Manager at Facebook, sees some powerful applications in search, essentially “giving you a filter into the world based on your friends”. Thanks to semantic technologies built into the Facebook platform “developers [can] build on top of information which people have trusted Facebook with, whether that’s status updates or things they like, people they are connected to […]”. Google’s Open Web Advocate, Chris Messina, told me he agrees that social search will play a key role in the future: “we are starting to see Google integrating Twitter streams in search experience, hopefully providing users with more actionable information, providing a number of different opinions, more contextual data. It is certainly something Google is paying a lot of attention to – information that is contextual to the user, not just generic to the world.”

But what about exploiting the power of the semantic web by pulling in data from different sources, the premise of linked data? Thomas Tague, VP Platform Strategy at Thomson Reuters and in charge of the OpenCalais project, a free service to analyze and extract concepts from user-submitted texts or web sources, told me about the exciting opportunities he sees at the intersection of highly trusted monetized content and free web content. He says that “people are not going to make $100 million bets based on blog postings. But that blog posting may be an outlier, may be an initial indicator, maybe about a layoff at a factory or something like that, that the user can now immediately link back to Thomson Reuters data and gain insight and take action.” While Tague certainly shares the enthusiasm for the growth of semantic technologies and adoption of standards by industry participants, utilization of linked data remains low in his view. His short-term outlook for the linked data cloud therefore remains rather cautious: “There is a lot of talk about it, but with respect to our linked-data company information, people aren’t picking it up yet very much.”

So what can we expect in the near future? Jamie Taylor tells me that he thinks “the idea that you can aggregate is something very novel: all of a sudden my data is not limited to my data silo.” He distinguishes two types of data: core data, which must be managed by the organization to drive the core business, and context data--such as geo data. He believes that what “semantic technologies allow is in some sense to outsource [context data] to the community for maintenance.”
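The distinction between core data and context data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the store records, the geo records, the `ex:` property names and the helper functions are all hypothetical, and plain tuples stand in for RDF triples. What it shows is that once two datasets use the same identifiers, combining them amounts to taking their union.

```python
# Core data: maintained in-house (hypothetical store records).
core_triples = {
    ("store:42", "ex:name", "Downtown Store"),
    ("store:42", "ex:locatedIn", "geo:san_francisco"),
    ("store:43", "ex:name", "Harbor Store"),
    ("store:43", "ex:locatedIn", "geo:boston"),
}

# Context data: maintained by a community (hypothetical geo records).
context_triples = {
    ("geo:san_francisco", "ex:label", "San Francisco"),
    ("geo:san_francisco", "ex:state", "California"),
    ("geo:boston", "ex:label", "Boston"),
    ("geo:boston", "ex:state", "Massachusetts"),
}


def objects(graph, subject, predicate):
    """All values stated for a given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def stores_in_state(graph, state):
    """Names of stores whose location lies in the given state."""
    names = []
    for s, p, o in graph:
        # Follow the link from a store to its place, then check the state
        # recorded for that place.
        if p == "ex:locatedIn" and state in objects(graph, o, "ex:state"):
            names.extend(objects(graph, s, "ex:name"))
    return sorted(names)


# Because both datasets share the geo: identifiers, merging is a set union.
merged = core_triples | context_triples

print(stores_in_state(core_triples, "California"))
print(stores_in_state(merged, "California"))
```

On its own, the core dataset cannot answer the question because it knows nothing about states; after the union the answer falls out, and the organization never had to maintain the geographic facts itself.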

Overall, there seems to be consensus that as semantic technologies move out of the purely technical corner and beyond the innovators and early adopters in academia and government, content-heavy organizations and users like publishers or e-commerce sites will help these technologies cross the chasm as they see the largest benefit in applying the technology. As pointed out earlier, companies like The New York Times or Best Buy have already begun to build and rely on semantic technologies. As more and more companies start adopting linked data standards and share data in the linked data cloud, we will see more businesses created to derive value from aggregating data across different datasets to provide value to their users.

If this article has sparked your interest in semantic technologies, I can recommend a documentary by Kate Ray, a recent graduate from NYU with a major in Journalism/Psychology, who has contributed to the demystification of the Semantic Web through interviews with thought leaders, including Tim Berners-Lee, Clay Shirky, Chris Dixon, David Weinberger, Nova Spivack, Jason Shellen, Lee Feigenbaum, John Hebeler, Alon Halevy, David Karger and Abraham Bernstein. The clip has been viewed by more than 120,000 people so far. I asked Kate what motivated her to do the documentary: “My dad has been doing semantic web stuff for years, and my entire family never really knew what he was doing, so partly I was trying to make something that all these people here could show to their friends and family. I also had an academic interest in it.” Kate is now working on a company called Kommons, which she describes as a “Q&A forum built on top of Twitter; to let people ask questions to public figures - or anyone - and backing questions you agree with”.

MIT is at the forefront of exploring applications to commercialize linked data and semantic technologies, adding a new seminar, Linked Data Ventures, to the fall curriculum. The class will be taught by an all-star team consisting of Sir Tim Berners-Lee, Dr. Lalana Kagal, K. Krasnow Waterman, as well as Reed Sturtevant and Katie Rae. Computer science and business students will work in small teams to develop prototypes based on Semantic Web technologies.



About The Author
Rene Reinsberg is a graduate student at MIT in the Entrepreneurship and Innovation track. His interests span online identity & online trust, social graph analytics, linked data, web of intent and the future of the web more broadly.
Article Discussion: Have Semantic Technologies Crossed the Chasm Yet?

Fantastic article Rene. The notion of a semantic web is one that really has the ability to mystify people. You've done a great job with this article and I look forward to reading the entire series.

A few questions for you:

1) Do you think that a truly semantic web is achievable? It seems like making such a thing happen would be very anthropogenic. How do (and why should) we encourage developers to add metadata to their code that encourages a linked data cloud?

2) What makes a web service 'semantic-web-centric'? While some of the services you cited require developers to insert metadata into their website, others use AI to scour the web for relevant, linkable data. The former seems very near the idea of what is intended. The latter, however, seems to contradict the norm. Are both methods good examples, or is one the 'correct' way?

3) What is your take on including ontologies in the overall concept of the semantic web?

Again, great article Rene.

Monday, August 23, 2010 - 5:39am

Sergio, thanks - these are really good and thoughtful questions.

Re 1) I think it’s hard to predict how far we’ll get. In my opinion, there are some areas, in particular the public sector, e-commerce, publishing, and other content-driven applications, where developers can get quite a few advantages from adding metadata. I think you could probably compare today’s situation to the early days of SEO. Once everyone was doing SEO, you, as a company, had no choice but to do it as well if you wanted to be noticed or found online. I think we’ll see a similar pattern with the adoption of semantic technologies. By the way, Google acquired Metaweb today – this is a big deal and a clear indication of where search is headed: I just wrote a brief blog post about this, have a look if you are interested.

Re 2) Yes, I think both are indeed good examples. While you need technologies that transform the web’s unstructured content into structured, machine-readable content, you’ll also see developers adding metadata and tags to their content. In my opinion, both serve the higher purpose of moving towards the truly semantic web, as you call it.

Re 3) This is a tough question and you’ll hear arguments from very smart people on both sides. I think this could be an interesting topic for a future, follow-up article :-)

Monday, August 23, 2010 - 5:43am

Rene, can you please tell us more about the kinds of business models that people are thinking about for semantic technologies? What are some interesting examples there?

Monday, August 23, 2010 - 5:43am

Erdin, thanks for your question – one I asked myself and others; you’ll read about some interesting examples in the upcoming interviews. In particular, I think we’ll see (and are already seeing) innovative new business models around semantic technologies in publishing and e-commerce. In terms of types of businesses, I think you’ll see those who enable (infrastructure providers) and those who harvest (content providers, search) semantic technologies.

Monday, August 23, 2010 - 5:44am

I'll be avidly following the series to get up to speed with our local thought leader in semantic web.

Monday, August 23, 2010 - 5:45am

I feel educated about semantic technologies through this article. This is not a field I know a lot about and it's been great reading this broad overview.

Monday, August 23, 2010 - 5:45am
