The vast majority of supposedly 'REST' Web APIs are simply abusing HTTP to carry function calls. I call these APIs 'Service-Trampled REST', or STREST.
STREST APIs come with specific costs which could stifle the two-way data Web (Web 2.0) if allowed to propagate unchecked. Although 'mashability' is a supposed benefit of the current proliferation of APIs, true interoperability and scalability can only be guaranteed by true REST interaction.
This is not an academic, purist or aesthetic stance, but one based on practical consequences, as I will explain.
STREST APIs are easy to spot. Look out for one or more of the following (there's a sketch of what these look like in practice after the list):
- Single URLs to GET and POST to - the 'service' endpoint (http://www.flickr.com/services/rest/).
- Similarly, function URLs (e.g. https://api.del.icio.us/v1/posts/add)
- Arguments to the 'function call', perhaps 'method' (flickr.groups.pools.getPhotos, flickr.blogs.postPhoto, etc), access key, etc.
- Internal ids as arguments (photo_id, etc.)
- Use of GET for 'function calls' that cause changes
- Use of POST for 'function calls' that only read data
- POSTing an entire 'function call' in XML, complete with name and arguments, to the service URL
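To make these symptoms concrete, here's the same operation in the two styles (the URLs and parameter names are hypothetical, loosely modelled on the APIs above):

    STREST:  POST http://strest.com/services/rest/
             method=photos.getInfo&photo_id=42&api_key=abc123

    REST:    GET http://restful.com/pics/42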
Note that we're not talking about the lack of PUT and DELETE here, nor any concept of URL opacity. You can REST easy without PUT or DELETE and with totally random-looking URLs.
What we see here is a straightforward hijacking of the HTTP protocol to use it as a function call protocol. Hijacking and abusing HTTP isn't REST.
And it comes with a number of costs...
STREST Breaks the GETable Web
By way of showing an exception that proves the rule, suppose we have the following:
GET http://strest.com/api?call=getPic&pic_id=42
In other words, the function call 'getPic(pic_id=42)'.
Now, this is the one case that works OK - that's really RESTable. It looks odd as a function call (better as http://restful.com/pics/42), but you can still cache it, bookmark it, index it and link to it.
However, look how easy it is to break things once you start thinking in function calls.
Many APIs ask you to put a special access key in as the first argument. Throwing needed data into the arguments list is quite normal in the land of functions. But here, cacheing and linking for everyone else is compromised - and actually destroyed if the key is tied to an IP address.
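For example (hypothetical key and URLs): every key-holder now has a different URL for the very same picture, so a shared cache entry or a shared link is useless to anyone else:

    GET http://strest.com/api?call=getPic&pic_id=42&api_key=alices-key
    GET http://strest.com/api?call=getPic&pic_id=42&api_key=bobs-key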
Another temptation is to unify these 'function calls' by saying you can or should use POST for all of them. Their creators see GET and POST as effectively equivalent, since you can use either to convey the function name and arguments and to return the results. Clearly, in the GET case above, using POST would also break linking and cacheing.
Some APIs take this to the next stage and suggest you POST an entire XML function call (XML-RPC- or SOAP-style) to their service URL. If the 'call' just returns a resource, that resource is again neither linkable nor cacheable.
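Such a POSTed 'call' might look like this (a hypothetical XML-RPC-style body, POSTed to the single service URL); note that the picture it fetches ends up with no URL of its own to link to or cache:

    POST http://strest.com/services/rest/
    Content-Type: text/xml

    <methodCall>
      <methodName>getPic</methodName>
      <params>
        <param><value><int>42</int></value></param>
      </params>
    </methodCall>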
This orphaning of resources is another consequence of function-call thinking: instead of giving all the resources their own URL, function-call APIs naturally drift into internal ids that are passed as arguments to the function call.
In the Web, the various parts of a URL are supposed to be combined by the web server to identify a single GETable resource. However, once in function-call land, there may be many of these internal data-identifying ids passed in as 'arguments' to the 'function'.
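Compare a 'function' that takes several such internal ids with a URL that composes them into one identifiable, GETable resource (both hypothetical):

    GET http://strest.com/api?call=getComment&user_id=7&pic_id=42&comment_id=3
    GET http://restful.com/users/7/pics/42/comments/3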
This is an aspect of the inversion between function calls and resource identification: service interfaces naturally publish function calls and hide data, whereas the Web publishes data and hides the mechanisms that animate it.
It's not just about syntactic sugar or arbitrary aesthetics: function-style URLs are one step away from Web-busting data-hiding.
Finally, some APIs unify the HTTP method the other way: by actually allowing GET operations that change server state.
Seeing them as function calls makes their creators indifferent not only to the unique qualities of idempotent, read-only calls (which break linking and cacheing when POST is used), but equally to those of state-changing calls (which break safety when GET is used).
This isn't just a purist objection: links should be followable by programs, engines, bots and people without fear of side-effects. It's a basic, fundamental expectation that is woven into every strand of the Web. At the very least, allowing a link that causes side-effects to be constructed and used via GET is muddying the waters and creating unnecessary confusion. Someone at some point is going to be tripped up by this particular abuse.
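A minimal sketch of how this goes wrong, assuming a hypothetical STREST API that deletes via GET: any link-following program - a prefetcher, a crawler, a link-checker - will trigger the side-effect just by doing its job.

    # A naive link-checker: it assumes, quite reasonably, that GET is safe.
    # If one of the links is http://strest.com/api?call=deletePic&pic_id=42,
    # 'checking' it deletes the picture.
    import urllib.request

    def check_links(urls):
        for url in urls:
            with urllib.request.urlopen(url) as resp:  # a plain GET
                print(url, resp.status)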
STREST Breaks the POSTable Web
Now, consider those APIs that allow updates or state-changing operations as well as data fetching ones - for example, tag this URL, upload this photo. Even when these state-changing function calls correctly use POST, there are still a couple of practical costs to casting these operations as function calls - specifically scalability and interoperability.
First, scalability: the function call is a bottleneck over those captive data items. All calls that change internal data have to go through the same host/port/path combination.
Load-balancing and locking then have to be applied by any one of a number of ad hoc heuristics. For example, load balancing could introspect the application ids in the function arguments to split the incoming calls in some optimal way over the data they refer to. More arbitrary load balancing implies independent, asynchronous locking of data affected by multiple parallel calls.
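A sketch of the difference, with hypothetical URLs and a deliberately trivial sharding scheme: with a single STREST endpoint the balancer needs application knowledge (which argument identifies the data?), whereas with real resource URLs the path alone is enough.

    from urllib.parse import urlparse, parse_qs

    BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

    def pick_backend_strest(url):
        # Must know that 'pic_id' is the argument that identifies the data.
        args = parse_qs(urlparse(url).query)
        pic_id = int(args["pic_id"][0])
        return BACKENDS[pic_id % len(BACKENDS)]

    def pick_backend_rest(url):
        # The resource's own path is all the balancer needs to see.
        path = urlparse(url).path            # e.g. /pics/42
        return BACKENDS[hash(path) % len(BACKENDS)]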
Functions are expensive and don't proliferate, but real Web URLs are cheap - and do. If you can POST to real, fetchable resources, then the handling of that POST can follow those resources around. And if a given resource just handles one POST event at a time, it (or the data on which it is based) doesn't need locking.
The behaviour that responds to a POST on a given type of resource can be replicated around the world: the ultimate load balancing and application distribution... And of course, the definition of that shared behaviour can be distributed around the world using GET! The Web doesn't need centralised, closed, singleton, bottleneck services.
While read-only operations can be scaled by cacheing, write operations only scale by parallelising them. The inherent parallelisability of declarative architectures such as the Web is often enough of an argument in their favour without even considering their simplicity, programmability and power.
Second, interoperability: the only way to interoperate is to have code written to each service interface specification that imports and exports data and choreographs (orchestrates?) a number of threads or workflows and their function calls across these services. The workflow may even be more-or-less dictated by the services. This code is tied intimately to the set of function calls of the specific services being used.
Again, there are many APIs with few implementations; usually one implementation per API. If the clients have to be built for each set of service APIs, you don't get general-purpose clients like browsers; you probably won't get to choose your client since the cost of these tailored applications is too high and the market too small. And you won't be able to switch transparently between services when there's only one service implementation per API.
Now compare with the Web. There is nothing more 'interoperable' than bumping into a Web page, finding a form to fill in, submitting it and getting a result. No discovery services or interface specifications are needed. You just follow a link you discovered, look at what's there (the 'interface'), POST, then see what happened. It's up to the resource how or whether it changes - whether it provides a simple edit interface or a more complex domain interaction. It may spontaneously change itself from other POSTs or from other event sources.
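A minimal sketch of that interaction, with a hypothetical page and field name - no WSDL, no discovery service, just GET, look, POST, look again:

    from urllib.parse import urlencode
    from urllib.request import urlopen

    page = urlopen("http://restful.com/pics/42").read()        # look at what's there
    body = urlencode({"comment": "Nice duck!"}).encode()
    reply = urlopen("http://restful.com/pics/42", data=body)   # supplying a body makes it a POST
    print(reply.status)                                        # see what happened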
In contrast to the many service APIs describing few or singleton implementations, the Web has few types and schemas describing many resources. In the Web World of interaction, there is a stronger tendency to standardise what you see - both on GETable and on POSTable resources.
HTML forms and XForms are the well-known document-oriented standards, for example, that actually dictate the 'schema' for POSTing. In Web 2.0, the XML from a Web API could imply the POST that would change it, by convention or by standardisation (as exemplified by the Atom Publishing Protocol).
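For example, in the Atom Publishing Protocol the thing you GET from a collection is the same kind of thing you POST back to it - an Atom entry (the collection URL below is hypothetical):

    POST /blog/entries HTTP/1.1
    Host: restful.com
    Content-Type: application/atom+xml;type=entry

    <entry xmlns="http://www.w3.org/2005/Atom">
      <title>REST, not STREST</title>
      <content type="text">Resources, not function calls.</content>
    </entry>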
Interoperability is baked into the Web and the Web Way of doing things. Of course, interoperability at the HTTP level doesn't imply interoperability at the level above that uses it. However, in the Web, this level is invariably standardised (or settled by convention) in data formats and schemas. If, on the other hand, HTTP is used to convey function calls at this level, each such interface is defined by one provider for their particular case.
A rare counter-example to this (whose rarity proves the rule) is the weblogUpdates.ping XML-RPC function for telling a blog tracker that a blog has been updated. This is implemented by nearly all blog trackers. It isn't actually a function call in the generally accepted sense, however: it's a notification. And you need to discover the URL by reading the documentation, since it is not self-describing.
Of course, it would be better implemented by allowing a POST of the same information as in the weblogUpdates.ping function call to an 'updatedblogs.xml' resource. That way, a GET of this updatedblogs.xml resource would allow the GETter to infer the ability to do the ping POST, after which another GET would confirm the addition.
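A sketch of that interaction (the resource name is the hypothetical one above, and the host is made up):

    GET  http://tracker.example/updatedblogs.xml   -> recent pings; their form implies what to POST
    POST http://tracker.example/updatedblogs.xml   -> the blog's name and URL, in the same form
    GET  http://tracker.example/updatedblogs.xml   -> confirms the blog now appears in the list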
In other words, if it looks like a duck and quacks like a duck, you can try submitting an 'eat this bread' POST, and it will probably eat it. If it looks like Atom Publishing Protocol you can probably try to submit something to the list.
There are various conventions to POST: search forms, comment submission forms, etc., which boost interoperability by simply doing what you'd expect them to do.
This blog, whose code and pages were hand-crafted in vim, still gets successfully spammed even without publishing a WSDL!
Of course, there will be exceptions, where you do need to know what a specific resource will accept and how it may react, but even here the resource itself to which you are POSTing can be filled with information about what you or your code can do.
REST, not STREST
Function calls may be OK for now with small-scale, early-adopter work, but there is a danger that the Service-Trampled style of HTTP abuse may get entrenched and propagated. This will be to our detriment later on when we are building a much bigger data Web, one that needs linking and cacheing to work, that needs to scale and to interoperate and that should be working with the Web's architecture, not against it.
If the Service-Trampled REST function-call pattern is propagated, it won't be Web 2.0, it'll be 'Web' Services. And it won't work. A World Wide Web of Functions will never happen; the World Wide Web of Resources already covers the globe.
Here's a quote from Roy T. Fielding (the discoverer of REST):
> The key thing to understand is that the value of the Web as a whole increases each time a new resource is made available to it. The systems that make use of that network effect gain in value as the Web gains in value, leaving behind those systems that remain isolated in a closed application world. Therefore, an architectural style that places emphasis on the identification and creation of resources, rather than invisible session state, is more appropriate for the Web.
... and more appropriate for Web 2.0! Unfortunately, it seems not to be the Web 2.0 Way, as yet.
It'd be a shame for Web 2.0 to become detached from the linkable, bookmarkable, indexable, cacheable, parallelisable and interoperable Web 1.0 that we've all built.
If more API designers drop the misguided STREST pattern and offer REST interfaces that aren't fatally Service-Trampled - perhaps learning from or adopting the Atom Publishing Protocol - we could yet prevent this.