There is something about the Internet that nurtures open data, and something about computers that nurtures closed. It is often necessary, but often painful, to make the jump from local, closed data to global, open data.

The Internet is all about data; open data and open data formats in particular. HTML, the fabric of the Web, links up vast amounts of open data created in an open data format. Email is sent either in the same HTML or (the most open of formats) simple text. Feed publishing shows that opening data can become an unstoppable flow. Indeed, Peer-to-Peer technology shows that opening data can become a torrent...

However, on our computers, data is often much less free. Take a closed format such as Microsoft Word: such a document can only properly be unlocked after paying for Microsoft Office. Take copyrighted data under DRM (Digital Rights Management, or 'Digital Restrictions Management' to its critics): music, film and the like have to be unlocked by paying for the key.

Even with nominally open formats, there's a tendency to tie one format to one application. You still need that application to unlock the data. And even object-oriented programmers follow the closed data rule: data must be wrapped in a class interface.

Application user interfaces, class interfaces and service interfaces are the bread-and-butter of the non-Internet computer domain. Behind each one sits an implementation process that mediates, controls and restricts the data and its formats. Such interfaces work best on the more tightly-controllable computer hosts.

They Just Don't Mix

But the processes and the people that like to control data don't mix too well with the Internet. Here are some examples of the incongruity (or outright clash) that can result:

  • It still jars to hit a link and find not an open format resource that the browser can handle, but a PDF or a Word document. And, of course, you need to have paid for Office or installed Acrobat to see it.
  • Google dips into these Word documents and PDFs, presumably legally, but in theory this right could be revoked at any time.
  • Some Web site owners get upset about people indexing or linking into resources on their site ('deep linking').
  • Web caches, including the Google cache, are in potential breach of copyright by copying data.
  • The music and film industries have witnessed the power of Peer-to-Peer to shake up their world of controllable data.
  • Taking the computer concept of 'interface and process' and translating it literally to the Internet - as RPC and RMI - has yet to achieve the success promised, although the vendors of SOA and WS-* still believe it's possible.

Now, these processes and people that like to control data obviously can't just avoid the Internet. So they'd be better off embracing the open data philosophy if they're going to do things here.

Embracing open data can take courage and open-mindedness. It means letting go of proprietary lock-in, copyright obsession, RPC, RMI and Service-Oriented Architectures. The lock-step imperative, procedural, workflow and object-oriented programming that works locally doesn't translate well onto the 'Net. The rules change.

Open Data Can Work

In fact, it is possible to make a living in this wild frontier, even when everyone can see, understand and copy your data! (For example, watch this blog for ways to make money on the Internet even when unlimited copying is allowed, based on originality or novelty backed up by a little crypto technology!)

I call the flip from closed data to open data the 'Imperative to Declarative Inversion'. It's the flip from 'How' to 'What' - from process-centric to data-centric programming.

In this upside-down world of What not How, the data format and its update formats are the public 'interface', and process animates things behind the scenes. Reading about REST will provide an excellent grounding in this way of thinking.
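As a rough illustration of 'the data format and its update formats are the public interface', here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `article` document, its field names and the `apply_update` helper are hypothetical, and the merge rule (a null value removes a field, nested objects merge recursively) is loosely borrowed from the JSON Merge Patch convention rather than anything this post prescribes.

```python
import json


def apply_update(document, update):
    """Return a new document with `update` merged in.

    Hypothetical merge rule, in the spirit of JSON Merge Patch:
    nested objects merge recursively, a None value removes a field,
    and any other value replaces what was there.
    """
    if not isinstance(update, dict):
        return update
    result = dict(document) if isinstance(document, dict) else {}
    for key, value in update.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = apply_update(result.get(key), value)
    return result


# The 'What': an open, self-describing document anyone can read or copy.
article = {
    "title": "Open Data",
    "tags": ["rest"],
    "author": {"name": "A", "email": "a@example.org"},
}

# The update format is also just data: it states the desired change,
# not the procedure for making it.
update = {"title": "Open Data Works", "author": {"email": None}}

updated = apply_update(article, update)
print(json.dumps(updated, indent=2, sort_keys=True))
```

Nothing here exposes a method such as `set_title()`; a client only needs to know the two document shapes, and whatever process applies the update stays behind the scenes.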

The main goals of this blog are to show the benefits of inverting (scalability, robustness, interoperability, value for money, and speed and flexibility of development) and to help overcome the understandable fears of inverting. In fact, I will also show how the Declarative approach works pretty well off-Net too - in the domain traditionally owned by process...