Value Struggle: Data, API, or Presentation Layer

At a recent Semantic Web gathering at MIT, Maximilian Schich gave a talk about “Adding Art Research Data to the Giant Global Graph”. It sparked an interesting discussion regarding the valuable assets within some Linked Data systems and how to monetize them. It is a topic of interest to me, as I believe a key adoption hurdle for Linked Data and Semantic Web technologies is clarifying the value proposition to all parties involved (e.g. producers, distributors, and consumers).

Schich talked about his conundrum of what to do with an incredibly rich data set he has access to describing historical art and archeology information. He presented how the data is already well defined within their data model and retrieval scheme. He also outlined how they propose bridging the gap to embrace the current and near-future Linked Data standards. The question, then, was how they would be able to pay for all this work.

There was a lot of discussion around possibly licensing access to the data via SPARQL, but the mechanisms for metering don’t exist yet. Kingsley Idehen and some others discussed the possibility of adding support at the server layer for a formalized data response similar to HTTP 402 (i.e. “payment required”). It was clear, though, that work would need to be done in this area before adopting it as a reasonable direction.
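To make the metering idea a bit more concrete, here is a minimal sketch of what a server-side check might look like. It is purely illustrative: the `Account` record, the `meter_sparql_request` function, and the notion of prepaid query credits are my own assumptions, not anything proposed at the gathering or defined by the SPARQL specifications. The only real piece is the HTTP 402 status code itself.

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass
class Account:
    """Hypothetical per-client record; not part of any SPARQL standard."""
    prepaid_queries: int


def meter_sparql_request(accounts, api_key):
    """Decide how to answer one incoming SPARQL query.

    Returns an (HTTPStatus, message) pair. Unknown clients and clients
    with no remaining credit get 402 Payment Required; otherwise one
    credit is consumed and the query is allowed through.
    """
    account = accounts.get(api_key)
    if account is None or account.prepaid_queries <= 0:
        return HTTPStatus.PAYMENT_REQUIRED, "no query credit left"
    account.prepaid_queries -= 1
    return HTTPStatus.OK, "query accepted"


# Usage: a client with a single prepaid query.
accounts = {"demo-key": Account(prepaid_queries=1)}
first = meter_sparql_request(accounts, "demo-key")   # credit consumed
second = meter_sparql_request(accounts, "demo-key")  # credit exhausted
```

A real endpoint would also need billing, authentication, and some agreed way of telling the client how to pay, which is exactly the part that does not exist yet.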

At some point during the conversation, I asked Schich what he believes is his more valuable asset: the data itself, or the presentation layer. After having seen the GUI of the demo application showing how the user could retrieve the data, it was clear that a lot of domain expertise was required to design it. Further, when he showed us the data schema and example retrieval/traversal modes, it was even more obvious that the average researcher would have to interface via the GUI (even if the data is freely available and fully compliant with SemWeb standards).

With this in mind, my suggestion was that he consider opening up the data entirely, forgoing any programmatic metering, and possibly licensing commercial access to it (while allowing free non-commercial use). My proposal was that they focus on monetizing killer GUI products tuned for each of their specific user groups. In this way, they could serve both their institutional and individual users as appropriate.

Fortunately, Tim Berners-Lee jumped in and agreed. He clearly articulated (undoubtedly better than I could) the benefit of separating the data source from the presentation layer. Each could then be treated separately in the context of its use and license model.

Thinking about it later, I was reminded of the scene in Neal Stephenson’s Cryptonomicon when [spoiler alert] the characters find a pile of gold in the middle of the jungle on a remote Pacific island. It had been left there by the retreating Japanese army in World War II, and they were trying to figure out how to retrieve it through the unfriendly territory. Basically, what appears incredibly valuable at first glance (i.e. a pile of gold = a mountain of rich data) is nearly worthless without a way to get it out.

I hope, however, that Schich is able to find a better solution than was presented in Cryptonomicon.

BTW – Since I mentioned them already, it’s worth noting that both Kingsley and Tim are giving keynotes at the Linked Data Planet conference in New York on June 17th. Also, you can listen to an interview I did with Kingsley a couple of weeks ago as part of the DataPortability: In-Motion Podcast.
