http://duncan-cragg.org/blog/ - What Not How: Duncan Cragg on Declarative Architectures. Posts tagged 'atom'. All content, including photos and images, by Duncan Cragg. Copyright (c) Duncan Cragg, your rights preserved: see /CXL.html.

Deriving FOREST

http://duncan-cragg.org/blog/post/deriving-forest/ - 2009-11-25

Say we want to integrate multiple applications which handle order processing. OK, that's got to be one of the dullest starts to a blog post. Never mind, bear with me...

So, we have applications on separate servers for handling and driving data such as orders, product descriptions and catalogues, stock lists, price lists, tracking, packing notes and delivery notes, invoices, payments, etc.

We may choose an SOA approach, of course. But let's say our sponsors have heard of this cheaper alternative: REST! Which to them means 'using Web technology to save money'.

Now... suppose we push the time slider right back to before Mark Baker and the SOA -vs- REST Wars - or the 'SOAP -vs- REST Wars' as people naively called them. To when REST was simply (!) a description of the Web's architectural style...

What if we revisit the applicability of the Web, and its abstraction into REST, to the architecture of machine-to-machine distributed systems - to something like our order processing integration?

I think we'd quickly arrive at something that looks more like FOREST than, say, AtomPub...

Some pretty obvious things to notice about the Web and, indeed, REST:

  • The Web is essentially data on URLs of standard content types containing more URLs;
  • The Web is the Web because of the massive proliferation of links in that data;
  • REST mostly concerns itself with the consequences of GET, including cacheing;
  • The Web uses, I don't know, let's say 98% GET, 2% POST, around 0% other methods.

In other words, the Web, and its good qualities, are mostly based on:

GET URL -> HTML -> a.href=URL -> GET URL ..

When applying this Web/REST architectural style to our integration scenario, there are things that we can say right now with certainty will be different, but will have corresponding elements:

  • It's about data not documents, so HTML is probably going to be replaced by XML, although perhaps XHTML+Microformats or Atom would make a good compromise;
  • We have a choice of link specs: xhtml:a.href, atom:link.rel, xml:xlink; I don't think we'll be using XLink since no-one else seems to;
  • We'll probably use machine-generated URLs perhaps containing UUIDs, GUIDs or whatever.

In other words, we're not going to be spinning a hypermedia Web - it's more a 'hyperdata' Web.

So, in order to emulate the document Web in our hyperdata integration Web, we'll mostly be doing something like:

GET ID-URL -> XHTML -> a.href=ID-URL -> GET ID-URL ..
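That loop can be sketched in a few lines. Here's a minimal, illustrative simulation of the hyperdata loop - an in-memory stand-in for the network, with machine-minted UUID URLs and resources linking to each other; the resource shapes and field names are invented for this example, not part of any spec:

```python
# A sketch of: GET ID-URL -> data -> links -> GET ID-URL ...
import uuid

# Stand-in for the network: opaque, machine-minted URLs mapped to resources.
ORDER_URL = f"http://orders.example/{uuid.uuid4()}"
PAYMENT_URL = f"http://accounts.example/{uuid.uuid4()}"

WEB = {
    ORDER_URL:   {"type": "order",   "total": 42.0, "links": [PAYMENT_URL]},
    PAYMENT_URL: {"type": "payment", "amount": 42.0, "links": []},
}

def get(url):
    """Stand-in for an HTTP GET on an ID-URL."""
    return WEB[url]

def crawl(start_url):
    """Eagerly traverse links, search-engine style, collecting resource types."""
    seen, frontier, types = set(), [start_url], []
    while frontier:
        url = frontier.pop()
        if url in seen:
            continue
        seen.add(url)
        resource = get(url)
        types.append(resource["type"])
        frontier.extend(resource["links"])
    return types

print(crawl(ORDER_URL))  # the order first, then the payment it links to
```

The point isn't the crawler; it's that the whole interaction surface is just GET on opaque URLs yielding data that contains more URLs.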

 

Oh! I've got some slides of all this on Google Docs: we're up to Slide 2! Maybe right-click, open in a new window...

 

Symmetry - Slide 3

But by far the biggest difference between the Web and an integration scenario is that the asymmetry on the network goes away; even for a cross-enterprise integration.

Where the Web's browser clients and site servers have always been asymmetric - clients being hidden away and only able to establish outbound connections - machine-to-machine integration is fundamentally symmetric - all servers can be made visible to each other.

Now, in order to keep the well-studied benefits described in REST, including separation of concerns, we should aim to maintain the client-server, layered structure in the use of the protocol.

But clients can be servers and vice-versa!

So, in machine-to-machine integration scenarios, we have:

  • Two-way GETs on machine-minted URLs pointing at XHTML+Microformats or Atom content containing more links.

In other words:

  • A hyperdata Web both created and consumed by the applications being integrated.

All of the dynamic data or hyperdata items in our order processing scenario will be distributed across the many applications being integrated. Each application serves its part of the hyperdata Web to the others.

And, of course, the hyperdata joins all these applications up: a payment resource in the accounting application will point to an order resource in the order processing application, etc.

 

Interactions and Application State

Now, each application has its own set of business rules and constraints over the hyperdata parts that it governs.

So how exactly should the applications publishing those bits of the hyperdata Web interact? How do orders interact with packing notes and stock levels, with payments and accounts?

In the Web, you go to a site and jump some links. Each page brings in CSS, Javascript, images, maybe iframes: an eager assembly of pages from links to many resources, in contrast to the lazy, on-demand fetching of links from a user jumping them.

The browser at any time has a state that depends on the page and images, etc, currently being viewed, plus the history of previous pages, bookmarks, etc.

Search engines in the Web, without a user driving things, are eager to traverse links in order to do their work indexing pages. Order processing applications will probably have more in common with search engines than with browsers.

REST describes this in terms of hypermedia - links - driving 'application state'.

So we next need to decide what 'application state' is, in our Web- and REST-driven architecture for machine-to-machine distributed systems; where the user driving hypermedia link traversals is replaced by business rules or logic driving hyperdata link traversals.

Each integrated application has its own 'application state', so, to follow REST, this application state should be driven by the surrounding hyperdata of peer applications, according to those business rules.

 

Application State is Linked Resources - Slide 4

In fact - and this is a consequence of the symmetry of integration - 'application state' is those very resources that the application contributes to the hyperdata Web!

A stock tracking application's state is pretty well described by a bunch of resources describing the stock levels. A fulfilment application's state could be inferred by inspecting the outstanding packing notes.

We're not limited to the asymmetric browser-server of the Web, where the browser's 'application state' is never visible except when it POSTs something back.

It's more like a search engine, where you can publicly access an 'application state' that is entirely driven by the hypermedia crawled by the search bot. A search engine's application state is rendered into the results page resources you see when you do a search.

So the resources of each application in the order processing integration are driven by the surrounding, linked resources of the other applications.

You could rephrase REST's 'hypermedia as the engine of application state' when applying it to symmetric machine-to-machine integration in this neat way:

Hyperdata as the Engine of Hyperdata.

 

The Functional Observer Programming Model - Slide 5

So now, how do we program the hyperdata-driven-hyperdata of our integrated applications?

How do we animate the stock tracking hyperdata chunk over here in the face of today's packing notes in their hyperdata chunk over there?

That's what FOREST is all about!

The name 'FOREST' stands for 'Functional Observer REST'.

The words 'Functional Observer' describe FOREST's hyperdata-driven-hyperdata programming model. But it's much simpler than it sounds...

A FOREST resource in the hyperdata Web sets its next state as a Function of its current state plus the state of those other resources Observed by it via its links.

The best way to encode such state evolution is in rewrite rules or functions which match a resource and its linked resources on the left-hand side, then rewrite that resource's state on the right-hand side.
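As a hedged sketch of that Functional Observer model - next state as a Function of current state plus Observed linked state - here's a toy stock-level resource observing a packing note. All resource shapes, field names and the single rewrite rule are invented for illustration:

```python
# Toy hyperdata store: URL -> resource state, with 'observes' links.
RESOURCES = {
    "stock-1":   {"item": "widget", "level": 10, "observes": ["packing-1"]},
    "packing-1": {"item": "widget", "quantity": 3, "status": "packed"},
}

def next_state(resource, observed):
    """Rewrite rule: the left-hand side matches this resource plus its linked
    resources; the right-hand side is this resource's rewritten state."""
    new = dict(resource)
    for note in observed:
        if note["status"] == "packed" and note["item"] == resource["item"]:
            new["level"] = new["level"] - note["quantity"]
    return new

def animate(url):
    """One evaluation step: observe linked resources, derive the next state."""
    resource = RESOURCES[url]
    observed = [RESOURCES[link] for link in resource.get("observes", [])]
    RESOURCES[url] = next_state(resource, observed)
    return RESOURCES[url]

print(animate("stock-1")["level"])  # 10 widgets minus 3 packed = 7
```

A real FOREST implementation would re-run such functions whenever an observed resource changes; here a single step is enough to show the shape of the rule.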

 

Not Like AtomPub, then

So, quite a different conclusion from what is now the 'conventional REST' of the four verbs - GET, POST, PUT and DELETE.

Quite different from asymmetric, one-way application protocols as modelled by AtomPub, in which clients aren't considered worthy to hold their own resources, but are allowed only an inscrutable 'application state'.

By focusing on GET and the freedom in integration to be symmetric, we've arrived at a general distributed programming model, FOREST, that allows us to express business rules that drive an application's hyperdata in the context of another application's hyperdata.

Watch this blog (and Twitter), where I'll be talking more about the benefits of FOREST, its implementation, and, above all, offering examples of how it would work (once the code is ready enough!).

WS-Are-You-Sure | The REST Dialogues

http://duncan-cragg.org/blog/post/ws-are-you-sure-rest-dialogues/ - 2009-07-16

In an exclusive nine-part dialogue with an imaginary eBay Architect, we present an accessible discussion of the REST vs. SOA issue.

Although eBay have what they call a 'REST' interface, it is, in fact, a STREST interface, and only works for a few of the many function calls that they make available via SOAP.

In this dialogue series, I argue the case for eBay to adopt a truly REST approach to their integration API.

Part 8: WS-Are-You-Sure (Security, Reliable Messaging and Transactions)

Duncan Cragg: So, back to your list of Enterprise functions. We're on to what I'm going to call the 'WS-Are-You-Sure': Security, Reliable Messaging and Transactions.

Let's attack these starting with the Web!

eBay Architect: We could start with Security: authentication, authorisation and encryption. For example, you have to keep some information secret on eBay. Like Invoices, Offer details. Reserve price on an Item. And you have to ensure only the owners of data can change it.

DC: The simplest pattern for read security is to use HTTP Basic Authentication over TLS - that is, HTTPS. Simple, but well-supported and usually good enough.

But with HTTPS you lose some of the benefits of using intermediaries, such as cacheing. If those intermediaries are untrustworthy, then you can use message-level rather than transport-level security: encrypt the resource state being transferred.

eA: Can't I use WS-Security for this?

DC: Possibly! However, the benefits of cacheing may be lost in the time taken to package and unpackage each resource in turn. You may prefer a more lightweight approach, as suggested in the Atom and AtomPub specs.

eA: How does REST handle authorisation: such as read and write permissions?

DC: As I keep saying, REST is about much more than simple data read/write services. In REST we don't have the generic concept of authorisation on a specific process execution, such as a command that could cause state change.

REST infrastructure is about state transfer, which is thus really only about 'read permissions'.

Everything else is business logic: it's up to the target resource to manage its reaction to incoming non-GETs and to decide if or how it should change in response, according to internal integrity constraints and the identity of the source. Resources are masters of their own destiny and must be aware of the identity of interacting parties at that level.

eA: What can you do to secure the infrastructure level below the business logic?

DC: The department managing the infrastructure can see data going either out (GET) or in (POST), and can see the target URIs. They can thus do both server- (URI) and client- (request header) based security and partitioning.

For read permission, it's possible to implement a low-level lookup from the identity in the request header to whatever URIs they can GET. They can enforce simple rules at that level like 'only GETs are allowed on these URIs unless the client is in this list'. They can groom more and less sensitive traffic to different servers.
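That low-level lookup from request identity to GETtable URIs can be sketched very simply. The identities and URI prefixes below are made up for illustration; a real deployment would source them from its own directory:

```python
# Illustrative read-permission table: client identity -> URI prefixes it may GET.
READ_ACL = {
    "accounts-app": {"/orders/", "/invoices/"},
    "public":       {"/catalogue/"},
}

def allowed(identity, method, uri):
    """Infrastructure-level rule: only GETs, and only on granted URI prefixes."""
    if method != "GET":
        return False  # writes are a business-logic concern, not handled here
    prefixes = READ_ACL.get(identity, set())
    return any(uri.startswith(prefix) for prefix in prefixes)

print(allowed("accounts-app", "GET", "/orders/123"))  # True
print(allowed("public", "GET", "/orders/123"))        # False
```

Note how the rule stays entirely at the state-transfer level: it knows about identities and URIs, never about what the resources mean.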

eA: Any more Security advice?

DC: Paul Prescod has written some notes on REST security.

Finally, remember to keep sensitive data out of those highly-propagatable unencrypted URIs by using POST instead of GET when submitting queries; another reason to use URIs that are literally opaque, not just treated as opaque operationally.

 

Reliable Messaging

eA: Another of the WS-* specifications deals with Reliable Messaging. How does REST give me the assurances I need that an important message - such as a new Offer on an Item or a ResponseToBestOffer, or an Invoice - will be delivered? In the right order? I can't just rely on POST, as you suggested before, if I really care about this.

DC: In REST, there are no command messages that have to make it through. There's only state that may or may not need to be reliably transferred - or that may or may not need to be notified in a timely manner.

In the eBay example, as I described it before, "if you keep re-POSTing the same Invoice, or Item or Offer, it only gets created once".

eA: Ah! Define 'same'!

DC: If, as in this eBay example, the successful POST creates a server-side copy with its own new URI, then the Item, Invoice, etc, must have some uniquely identifying information on it. It could perhaps have a Message-ID header or get cheap, unique URIs minted for it from the server in advance. Alternatively, when the POSTed resource already has a URI itself on the 'client', then it's obviously the same each time it's POSTed.

When used as state notification, POST must be idempotent; repeatable.

So if the initial POST fails, just keep POSTing until you can see the appropriate response, whatever that may be in business terms. On the pull or poll side, keep GETing until you see what you expect.
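Here's a small simulation of that retry pattern: the Invoice carries its own unique ID, so re-POSTing after a lost response creates it only once on the server. The server store and the flaky network are simulated stand-ins, not any real API:

```python
# Idempotent POST as state notification: create-once on a unique ID.
import uuid

SERVER = {}             # invoice-id -> the server-side copy
LOSE_RESPONSE = [True]  # simulate the first response getting lost in transit

def post_invoice(invoice):
    """Simulated POST: stores the invoice at most once, keyed on its ID."""
    SERVER.setdefault(invoice["id"], dict(invoice))
    if LOSE_RESPONSE[0]:
        LOSE_RESPONSE[0] = False
        raise TimeoutError("response lost")  # the client never sees the 201
    return "201 Created"

invoice = {"id": str(uuid.uuid4()), "amount": 42.0}

attempts = 0
while attempts < 5:             # total retries: a business-logic decision
    attempts += 1
    try:
        post_invoice(invoice)
        break                   # appropriate response seen: stop re-POSTing
    except TimeoutError:
        continue                # keep POSTing until we can see the response

print(len(SERVER), attempts)    # the invoice exists exactly once, after 2 tries
```

The retry limit and what counts as the 'appropriate response' are exactly the knobs the business logic has to set, as discussed below.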

eA: So that's another issue you're side-stepping by dumping it into the business logic?!

DC: Only the business logic knows the following things: what signifies receipt of the notification; if it matters that the state didn't get through; how frequently to push or poll; whether it matters that state is out of date and by how much; and when to give up and tell someone.

Set the push or pull frequency and total number according to the business logic's view of the importance of that state transfer. Set cache control according to your domain's tolerance of stale data.

It's just like in real life: if something I sent doesn't get a response - in a form that is completely dependent on the type of recipient - then, after a time - which is also completely dependent on the type of recipient - I'll chase it up.

eA: Can't REST give any support here at all?

DC: Well, it would be easy enough to write a REST support library that implemented a simple API for specifying your constraints on a successful state transfer.

 

Transactions

eA: Now, when you're a site like eBay, dealing with money all the time, you need the assurance that transactions give you. You need to make sure accounts are always consistent. But I suppose, like before, you're going to tell me that it'll all be fine in the end, right?

DC: Hold on. Let's not mix up financial transactions and database transactions! We'll first talk about the need for atomic units of work. Then see how to support financial transaction business logic.

Also, we're talking about units of work in public view, not hidden behind resources. Inside, it's up to a resource to ensure that its integrity and consistency are maintained through its interactions with others, and it's free to use transactions to achieve that internally if it wants, without exposing that to its clients.

eA: OK - so now say that it'll all be fine in the end!

DC: In a distributed system, you have to decide on what to give up out of Consistency, Availability and Partition Tolerance.

I have to say that eBay are actually fully clued-up here: there's a paper about 'BASE' by Dan Pritchett, Technical Fellow at eBay, in which he discusses the benefits of Eventual Consistency - i.e., knowing that it'll all be fine in the end! Especially if you tidy things up eventually.

eA: Gah! Ya got me there!

DC: Essentially, the rule of thumb is, use ACID internally, use BASE externally.

We're back to the inevitable inversion from internal imperative thinking to external declarative thinking.

As an imperative programmer you're inclined to want to take your internal programming style out into the distributed world - to think single-thread, central control: 'begin - do work - commit'.

But the importance of Availability and Partition Tolerance in distributed systems usually outweighs the importance of Consistency, leading the wise architect to a more relaxed, less imperative, more declarative approach.

eA: Such as REST.

DC: Indeed. REST without transaction support.

REST isn't a database model: in the same way REST doesn't imply simple read/write services, it also doesn't imply inert data that needs locking. And resources in REST should model active domain data, not low-level, domain-independent transaction paraphernalia.

eA: How does REST without transactions work, BASE-ically, then?

DC: A handy phrase that sums it up is intention puts the system in tension.

You start by declaring your intention that some state be true, which puts the system in tension - a tension that can only be resolved by the application of business logic constraints over each player in parallel, until the whole system settles or resolves into a new, consistent state.

eA: Examples, please!

DC: Think about how you'd do the classic transfer of funds between accounts, in the real world of loosely interacting, self-determined parties. Say inside a big company before computers came along, between an office that handles one account and an office that handles the other.

Your key resource is a signed declaration (the intention) by the payer that they are happy to have funds passed to the payee. As long as this fact doesn't appear in one account or the other, you have work to do (there is tension in the business rules).

eA: You've got to run around real quick with a piece of paper.

DC: It doesn't even need to happen all at the same time: you can visit one office, check the funds are available and deduct the amount, then wander over to the other office and tell them to increase the payee's balance. If you get waylaid and the auditors come, there is always the signed declaration and the account history available to resolve the situation.

You can enforce the constraint that no money appears to be in two places with the business rule that the payee account is only increased if the payer's account has an entry corresponding to the signed declaration.
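The two offices and the signed declaration can be simulated to show the 'intention puts the system in tension' idea settling out. The account and declaration shapes, and the two business rules, are invented for this sketch:

```python
# The intention: a signed declaration that funds may pass from payer to payee.
declaration = {"id": "decl-1", "payer": "alice", "payee": "bob", "amount": 50}
accounts = {
    "alice": {"balance": 200, "history": []},
    "bob":   {"balance": 10,  "history": []},
}

def settle_step(decl, accounts):
    """Apply the business rules once; return True if anything changed."""
    payer, payee = accounts[decl["payer"]], accounts[decl["payee"]]
    # Rule 1: the payer honours the declaration if funds are available.
    if decl["id"] not in payer["history"] and payer["balance"] >= decl["amount"]:
        payer["balance"] -= decl["amount"]
        payer["history"].append(decl["id"])
        return True
    # Rule 2: the payee is only increased once the payer's entry exists,
    # so the money never appears to be in two places at once.
    if decl["id"] in payer["history"] and decl["id"] not in payee["history"]:
        payee["balance"] += decl["amount"]
        payee["history"].append(decl["id"])
        return True
    return False  # no rule fired: the tension is resolved

while settle_step(declaration, accounts):
    pass  # keep applying rules until the system settles

print(accounts["alice"]["balance"], accounts["bob"]["balance"])  # 150 60
```

Note that the two rules could run in either office at different times; the declaration and the account histories are always enough to resolve (or audit) the in-between states.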

eA: Mmm. Sounds a bit too loosely coupled to me.

DC: It's life outside of Central Control.

Consider hotel and flight booking: you don't lock the hotel and the flight while telling them all in a two-phase commit what your itinerary will be. You do 'optimistic locking' with compensation: if things don't work out, you cancel a booking. A system may tell you something is available, but when it comes to booking it may have just been taken.

The real, distributed, reactive world doesn't work in a lock-step fashion, so our distributed, reactive systems don't need to work that way to model it. Reality is much more like optimistic locking with the possibility of compensation or merge on conflict that, again, can only be defined at the business level.
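Optimistic booking with compensation can be sketched in a few lines. The inventory counts and booking functions below are simulated stand-ins, with the flight's last seat already gone so the compensation path actually runs:

```python
hotel_rooms, flight_seats = 1, 0  # the last flight seat was just taken

def book(kind):
    """Optimistic booking: availability may have changed since we looked."""
    global hotel_rooms, flight_seats
    if kind == "hotel" and hotel_rooms > 0:
        hotel_rooms -= 1
        return True
    if kind == "flight" and flight_seats > 0:
        flight_seats -= 1
        return True
    return False

def cancel(kind):
    """Compensation: undo a booking that is no longer wanted."""
    global hotel_rooms, flight_seats
    if kind == "hotel":
        hotel_rooms += 1
    else:
        flight_seats += 1

booked = []
for kind in ("hotel", "flight"):
    if book(kind):
        booked.append(kind)
    else:
        for done in booked:   # compensate: cancel what we already booked
            cancel(done)
        booked = []
        break

print(booked, hotel_rooms)    # nothing held; the hotel room was released
```

No locks, no coordinator: each party stays in charge of its own inventory, and the conflict is handled at the business level by cancelling.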

eA: Why not do your optimistic locking below that? HTTP has support for it, right?

DC: In the same way that REST can support read permissions but is at the wrong level for write permissions, which are a business level concern, there is an asymmetry in read versioning versus write versioning.

While using ETags is great for optimising the reading and cacheing of data, I wouldn't use them in the optimistic locking pattern for writes that is supported by HTTP. The proper place for handling a mismatch of versions in an interaction is not in the HTTP headers.

REST should be about state declaration and intention, not absolute write commands. Only the business logic governing the evolution of a resource knows if, for example, it can go ahead and respond anyway to an edit request, even though it's possible that the sender has an out-of-date copy of it.

(c) 2006-2009 Duncan Cragg

 

In Part 9: Web Objects Ask, They Never Tell

Note that the opinions of our imaginary eBay Architect don't necessarily represent or reflect in any way the official opinions of eBay or the opinions of anyone at eBay.

Indeed, I can't guarantee that the opinions of our real blogger necessarily represent or reflect in any way the official opinions of Roy Fielding...

Content-Types and URIs | The REST Dialogues

http://duncan-cragg.org/blog/post/content-types-and-uris-rest-dialogues/ - 2008-02-16

In an exclusive nine-part dialogue with an imaginary eBay Architect, we present an accessible discussion of the REST vs. SOA issue.

Although eBay have what they call a 'REST' interface, it is, in fact, a STREST interface, and only works for a few of the many function calls that they make available via SOAP (GetSearchResults, GetItem, GetCategoryListings, etc).

In this dialogue series, I argue the case for eBay to adopt a truly REST approach to their integration API.

Part 6: Content-Types and URIs

eBay Architect: OK, enough fancy REST or ROA interaction patterns! Let's get back to REST basics.

Duncan Cragg: That'll be content types and URIs, then.

eA: OK - I've a question about your standard content types. In particular, about Microformats.

DC: Go on.

eA: You keep on mentioning Microformats when offering examples of content type standardisation. But Microformats are always embedded into HTML (or, preferably XHTML).

Are you suggesting somehow taking them out and using them separately? Or do we have to always use them through a, possibly inappropriate, document model schema?

DC: Microformats currently 'tunnel' through mechanisms in XHTML such as the class attribute. They themselves have distinct model schemas, that ride on the document model schema.

eA: So can you use them apart from HTML?

DC: No, not really, although you can carry a single Microformat in a single document carrier, then squeeze out any document model content that isn't supporting the Microformat, perhaps leaving enough that it still renders sensibly in a browser if anyone looks.

The Microformats people won't like you doing it, I'm quite sure, but if you and I want to exchange a pure hCard, hCalendar, hResume or hReview, and nothing else, then we can use the minimal document model carrier, and have just one Microformat per resource.

eA: But why not use the original data schema, before it was Microformatted? Why not just use vCard and vCalendar?

DC: Or use Atom instead of hAtom! Of course, if vCard has an XML representation, you could use that - as long as the constituency of your clients is the right one and is big enough. There may be more code out there that 'gets' hCard than an XML vCard. And some Microformats - such as hResume and hReview - don't have an original schema and are based on abstracting from common or prior behaviour.

This is REST integration we're talking about, where data, not documents, are native, and we aim to search out the most popular and most widely understood data schemas - even if carried over documents - to maximise interoperability.

eA: OK, that seems fine. Although I'd point out that REST doesn't have the monopoly on interoperability. SOA does that too.

DC: Interoperability is best achieved by sharing millions of URIs dereferencing to a handful of standard content types, with interlinks across the Web of resources. ROAs (Resource-Oriented Architectures) do that. SOA doesn't.

eA: REST APIs don't always have to do it. In the previous example you went through, eBay and gBay could offer REST interfaces but not talk the same schemas and not allow cross-linking in the way you described. Or talk the same schemas but not recognise each other's Items and Offers.

DC: That would be walled-garden, silo-thinking. It's also 'API' thinking. Just opening up a port to your application, even one with correct use of GET and POST on well-organised domain URIs, isn't in the spirit of REST, and certainly isn't good enough for REST integration.

In REST we always aim to adopt the same schemas, to aim explicitly for interoperability. And linking between those resources, even cross-site, is fundamental to the REST way of thinking. If someone offers you a 'REST API' that uses unnecessary proprietary schemas that miss obvious interlinking opportunities, especially across to other sites, run away!

eA: Are there any real-world examples?

DC: A good, and ironic, example of this is Google's OpenSocial, at least in its earlier releases, which fails to achieve true cross-site openness even with a 'REST API' and shared schemas, because sites don't cross-link or actually allow data sharing. Also, the schemas are a strange extension of Atom, rather than using, for example, vCard as the basis for 'People Data'.

This hopefully will be fixed as the 'REST API' evolves and with the work going on in groups such as DataPortability, with agreement from the major operators.

eA: So much for the interoperability of that REST interface.

DC: The heart of good REST interoperability is the acceptance of standardised data at a 'foreign' URI, and the re-publishing of that foreign URI in your own standardised resources. It happens on the Web all the time, of course. We just need to copy the model for REST integration.

Hypermedia and (more importantly, here) 'hyperdata' is baked into REST, but is an afterthought in SOA. ROAs create an interlinked hyperdata landscape across sites and domains. I'm using 'hyperdata' here in the sense of interlinked data resources in REST integration, by analogy with hypermedia, not in its Semantic Web sense.

eA: Ah! But how do your little pure Microformat resources link up into this hyperdata landscape? Microformats can't link to each other, can they?

DC: It's true you may have to go and get involved in the Microformats movement in order to help define how to link an hCalendar event to a list of hCards of people attending. Or the hCard of a company to a list of hCards of its board members. Or an hReview to the hCalendar event being reviewed and the hCard of its author. Or to include the XFN list of links to friends' hCards inside a person's own hCard.

One indication that there's something not ideal in Microformats is the fact that you have to write someone's hCard out again and again for every page or site they appear on. If you could just link to a single hCard for that person it would be more efficient.

eA: But Microformats have a narrow charter: to decorate the document model with semantics. Any links are just part of the hypertext Web. It sounds like you're trying to make some kind of domain model out of them, with their own interlinks!

DC: Yup. When you start to think about the data of REST integration, the document carrier of Microformats and its often superfluous links can be a distraction. If the document links are relevant to the Microformat, or if people would use links within the Microformat once told what value they have, it would be worth pulling them out into the Microformat definition itself. Enhancing in-browser Microformat parsers to follow such links would then greatly enhance their utility.

All you have to do is find real-world examples, and propose it on the Microformat lists! Meantime, reuse the schemas and keep all your extensions public and backwards-compatible.

eA: What about all those 'rel-' decorations? You know, rel-tag, XFN, etc.

DC: Well, hAtom is the only Microformat that specifies nested rel-links: rel-tag, rel-bookmark and rel-enclosure. Otherwise, each Microformat is independent, and the rel-links are independent. Like I said, it may be worth going to the Microformat community and suggesting more such rel-links beyond hAtom.

 

URI Opacity

eA: So this RESTful data landscape of data wired up with URIs: it sounds a bit hard-wired: where do URIs as queries (and URI templates) fit into that tight mesh?

DC: URI templates fall into exactly the same category as standardised content types and schemas in terms of their level of abstraction and location in the stack. In other words, the right thing to do, if it's transparent URIs you want, is to standardise search URI templates across sites of a type.

eA: This is getting complicated. It's hard enough to get agreement on the content types of resources, never mind on URI formats as well!

DC: Indeed, and in fact, I believe that URIs should be opaque: they already are to HTTP, but also in our data landscape, a URI should point to a single, predictable resource.

The mechanism of querying that dataspace should be separated out from the mechanism of linking it up.

eA: A bit like GUIDs?

DC: Exactly. In Enterprise applications, you often see GUIDs (globally unique ids) being used, and never see them mixed up with search strings!

Transparent, query or template URIs are used either to be helpful or decorative, or as an acceptable optimisation - as long as you recognise that this is tunnelling through, or hijacking, the URI for a quick query string.

eA: Tunnelling? Hijacking? You've dismissed a long-standing convention, in the Web at least! How else do you do query fetches?

DC: A better solution is the query-POST-redirect pattern: the client POSTs their query, then the server redirects them to a linkable results resource on an opaque URI.

The POST query schema can then be properly standardised in a content type, or 'templated' in the REST integration equivalent of an HTML form.

It's an extra round trip, but only one IP packet in each direction; a redirect or a GET can fit into a single IP packet - the cost is only in the connection latency.
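Here's a small simulation of that query-POST-redirect pattern: the client POSTs a query, the server mints an opaque URI for a snapshotted results resource and redirects to it. The server internals, URI shapes and result items are all invented for illustration:

```python
import uuid

RESULTS = {}  # opaque results URI -> snapshotted results resource

def post_query(query):
    """Simulated POST of a standardised query; answers with a 303-style redirect."""
    uri = f"http://search.example/results/{uuid.uuid4()}"  # opaque, linkable
    RESULTS[uri] = {"query": query, "items": ["item-1", "item-2"]}  # snapshot
    return ("303 See Other", uri)

def get(uri):
    """Simulated GET: the results URI dereferences to the same snapshot forever."""
    return RESULTS[uri]

status, location = post_query({"keyword": "widget"})
results = get(location)
print(status, results["items"])
```

The redirect target is a first-class, linkable resource: anyone holding that URI gets the same snapshot, which is exactly what a query URI can't promise.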

eA: Why not just return the state of the resource you're redirecting to in the body of the redirect, to save even this round-trip?

DC: Yes, you could do that. It's not something seen in the hypermedia Web as far as I know, but this is REST integration, where we're able to come up with new sub-protocols like this - where HTTP response codes are often given much thought.

Further, the server can offer the option to snapshot this results resource, so that it's still exactly the same whenever the link is dereferenced - something you can't do with a query URI.
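The query-POST-redirect pattern, with the optional snapshot, can be sketched in a few lines of Python. This is a minimal in-memory illustration only: the catalogue, result store and URI scheme are invented for the example, and no real HTTP server is involved.

```python
import uuid

# In-memory sketch of query-POST-redirect: a POSTed query is materialised
# as a first-class results resource on an opaque URI, and the client is
# redirected to it. All data and URIs here are illustrative.
RESULTS = {}  # opaque URI -> snapshot of query results
CATALOGUE = {"/item/1": "laptop", "/item/2": "camera"}

def post_query(term):
    """Client POSTs a query; server creates a linkable results resource
    on an opaque URI and redirects the client to it."""
    matches = [uri for uri, name in CATALOGUE.items() if term in name]
    opaque_uri = "/results/" + uuid.uuid4().hex  # opaque, globally unique
    RESULTS[opaque_uri] = matches                # snapshot: stable under GET
    return 303, opaque_uri                       # 303 "See Other" redirect

def get(uri):
    """GET on the opaque URI always returns the same snapshot - something
    a query URI cannot promise."""
    return RESULTS[uri]

status, uri = post_query("laptop")
assert status == 303
assert get(uri) == ["/item/1"]
```

The snapshot in `RESULTS` is what makes the returned link stable: re-dereferencing it later yields the same result list even if the catalogue has since changed.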

eA: What would Tim Berners-Lee say about this? Is it in the spirit or letter of his vision for how HTTP and URIs should be used?

DC: I've no idea! However, in my opinion, when Tim didn't separate the concepts of a globally unique identifier returning exactly one resource from a query string returning maybe none, one or many resources (in a list), he started a good deal of unnecessary confusion, even if non-fatal in practical terms.

The phrase 'hackable URIs' sums up the situation. We might have been forced into creating slightly better user interfaces had the URI textbox been taken away from browsers.

Make your interface your content and have good search and information architecture to allow your (opaque) links to be discovered. If you know that human users - or search engines - will be interested in reading some links at the top of your information architecture, then go ahead and use just a few simple, meaningful addresses.

eA: You're venturing into controversy again! I'm sure I keep reading about designing nice URLs being good practice.

DC: There was a time when transparent URLs were considered important, but now everyone just uses Google! All the energy that's put into URL good manners and systems of URI templating and naming is just a distraction from the bigger effort of standardising content and defining schemas.

Opaque URIs keep content in the body where it can be given a Content-Type, instead of the headers - the URL line.

This is related to my preference to put 'write methods' such as PUT and DELETE into the body instead of the URL line.

eA: How exactly?

DC: The URL line should have a definite target - an opaque, globally-unique URI - and a content transfer direction - GET or POST.

The rest of the application-level interaction, including anything that will affect state and any searching and querying, should be in transferred bodies with standardised content types.
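As an illustration of keeping the 'write method' in the body rather than on the URL line, the sketch below dispatches on a typed edit document POSTed to the target, instead of on PUT or DELETE. The content type name, its schema and the store are all invented for this example.

```python
import json

# Illustrative store of resources keyed by opaque URI.
STORE = {"/order/42": {"status": "open"}}

def post(uri, content_type, body):
    """The URL line names only a target and a direction (POST); the
    nature of the change lives in the standardised, typed body."""
    if content_type != "application/x-example-edit+json":  # hypothetical type
        return 415  # Unsupported Media Type
    edit = json.loads(body)
    if edit["action"] == "update":
        STORE[uri].update(edit["fields"])
        return 200
    if edit["action"] == "delete":
        del STORE[uri]
        return 200
    return 400

post("/order/42", "application/x-example-edit+json",
     json.dumps({"action": "update", "fields": {"status": "paid"}}))
assert STORE["/order/42"]["status"] == "paid"
```

The point of the sketch is only the dispatch: everything that affects state is described in a body that can carry a Content-Type, leaving the URI opaque.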

(c) 2006-2008 Duncan Cragg

 

In Part 7: Business Conversations.

Note that the opinions of our imaginary eBay Architect don't necessarily represent or reflect in any way the official opinions of eBay or the opinions of anyone at eBay.

Indeed, I can't guarantee that the opinions of our real blogger necessarily represent or reflect in any way the official opinions of Roy Fielding...

http://duncan-cragg.org/blog/post/google-micro-conference/ Google Micro Conference 2007-10-05T11:22:00Z 2007-10-05T11:22:00Z

Last night's Google London Open Source Jam (also here) was on the subject of the 'Web' (didn't they invent that? Oh no, that was Microsoft).

This event has been getting better and better each time I've attended. There were some very interesting lightning talks held together with a tight structure and plenty of chance to chat, drink cold Leffe and eat cold pizza. And nick [transatlantic translation: 'steal'] the Green & Black's chocolate.

An ideal Micro Conference...

I arrived late (it starts at 6pm) and spent some time catching up with ex-Thoughtworks colleagues, so I missed Dion "Ajaxian" Almaer's Google Gears slideset from FOWA. Go there now and check it out.

Thus the first talk I saw was a nifty piece of widgetry by Steven Goodwin called WARP. In WARP, interacting with a page of 'applets' changed the URL to encode those applets' current state. If you link to the current page, it will always show that state. Very long URLs, you can imagine. None of that fancy Ajax stuff. RESTful, dare I say. Nice API server-side for unpacking your applet params.

A trip to the lavatories [transatlantic translation: 'restroom'/'bathroom'] revealed that they are, indeed, doing that Testing in the Toilet project in Google. It works, too! I learned something. Other intelligence on Google's Inner Workings includes confirmation of the beanbags and of the high quality, free grub to which I have already alluded.

A nice bloke from Yahoo! (Tom Hughes-Croucher: another spy?) came along to sell his idea that, in the collaborative world of open-minded hackers, we who run websites could help each other with our 404s. If I get a 404, I use the referrer link to tell you, via some RESTful POST, that your link to me is bust (assuming I don't intend to fix it myself).

I think the world is a little more selfish, so you need to decide which site hurts more - the one that sends its visitors to a dead-end, or the one delivering that dead-end to a new visitor. I suspect the latter, by a small margin, as it's not exactly a nice welcome. So it's up to them to let the new visitor down more gently, and to notify the publisher of the broken link at little or no cost to themselves. For example, a really sociable 404-ing site could just redirect the hapless visitor back to the referring page, adding '?broken=links' to the URL - hopefully to be picked up by log scanning scripts at the referring site.
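That gentler let-down amounts to a little URL surgery, assuming the referring page's URI arrives in the Referer header. The sketch below shows only the redirect-target construction, not a real 404 handler; the `broken=links` marker is the convention suggested above.

```python
from urllib.parse import urlsplit, urlunsplit

def broken_link_redirect(referrer):
    """Given the Referer URI of a visitor who hit a 404, build the URI to
    bounce them back to, tagged so the referrer's log-scanning scripts
    can spot their own broken link."""
    parts = urlsplit(referrer)
    query = (parts.query + "&" if parts.query else "") + "broken=links"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

assert (broken_link_redirect("http://example.org/articles?page=2")
        == "http://example.org/articles?page=2&broken=links")
```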

Next up, yours truly taking yet another chance to promote his excellent Micro Web thingy. Couple of people asked about it afterwards - including that nice chap from Yahoo! Also, a smart - and nice - chap called Toby (this one?) got me into a deep discussion on imperative vs. event-driven vs. state-driven programming. He was apparently an old-timer like me, as he was able to engage in dewy-eyed Functional Programming recollections. I managed to give out about four full colour printouts about the Micro Web, and to collect some good calling cards.

However, Joe Walnes, even a pint down in the pub afterwards, still refused to sign up for Micro Web duties. This in spite of over three years of intensive lobbying, including eight months of me working Trojan-horse-like in his kitchen, on The 2005 Implementation.

Another ex-Thoughtworks colleague, Simon Stewart, took yet another chance to promote his promising Webdriver thingy. And a very interesting project it is becoming. Still needs more work - on IE support, etc - but I'll probably be using it in my new job at the Financial Times.

Another ex-Thoughtworks colleague, Chris Matts, took a chance to promote his and Andy Pols' interesting new Dream Machine thingy. Perhaps a bit like Cambrian House - you put your dreams and ideas into it and people expand on them. Chris is a natural on-stage - and even used the age-old trick of promising lots of money for no effort, to get our attention at the start.

All I could come up with for the Micro Web was 'Cheaper, Wider, Faster'...

 

Updated: added reference to Dion Almaer, details about WARP, swapped in the picture of TWers that I was waiting for and fixed a minor blunder thanks to that ever-sharp ThoughtWorker, Dan Bodart..

http://duncan-cragg.org/blog/post/how-ruby-can-enable-web-20-platform/ How Ruby can enable the Web 2.0 Platform 2007-06-26T15:17:00Z 2007-06-26T15:17:00Z

Web 2.0's definition includes seeing the Web as an application platform. Which means it is in competition with Java and .Net, and with SOA, for both local and widely distributed applications.

If the Web is going to be a platform, the skills you need to learn to program it are the core Web 2.0 technologies such as Ajax, JSON, Atom, Microformats and OpenID.

And Ruby. This language, that's capturing the hearts of many Web 2.0 programmers, is ideal for easing the transition from the Java and .Net platforms to the Web platform, as I will show.

Even if you're part of a big company that is generally immune to the latest trends, the marriage of Ruby and the Web-as-platform may be something to prepare for. It could even displace your SOA agenda...

Few would disagree that the Ruby language is riding the wave generated by Ruby-on-Rails. In turn, Rails is riding the Web 2.0 wave, coming as it does from underpinning the very Web 2.0 37signals product suite.

Rails and Ruby have tapped into the tech Zeitgeist of friendly, simple and powerful. The speed with which the Ruby and Rails communities have delivered the key components of Web 2.0 is matched by the speed at which Ruby and Rails books are leaving the shelves.

What is the ideal platform of Web 2.0? Will it be Rails and Ruby? Will Ruby ride the Web 2.0 wave into the mainstream in the same way Java rode the Web 1.0 wave?

Well, here's the problem with that question: Web 2.0 is supposed to be primarily about the Web itself as the platform, as explained first by Tim O'Reilly and then by a thousand Web 2.0 vendors and industry watchers after him.

Web-as-platform is not just vendor hype or pundit hand-waving. Let's think about what O'Reilly meant by that.

 

Web as Platform

Web 2.0 is about making the Web more interactive, and thus able to support applications where Java and .Net would once have been considered the sole delivery platforms.

The fact that the technologies of the Web can be turned to this use is a shift with far-reaching implications.

Broadly, the shift we are seeing is from the one-way, static document delivery of Web 1.0 towards the two-way, dynamic data exchange of Web 2.0.

This fundamental repurposing is delivering more complex, interactive applications that work inside our browsers and which fully leverage the benefits of online operation.

Web 2.0 is bringing the user and their stuff into the very Web that they hitherto only passively consumed. This network-enablement of the user in turn enables their social networking and their shared creativity and self-expression.

Web 2.0 has tapped into a deep human need - a fact reflected in the vast traffic volumes and correspondingly vast valuations of Web 2.0 startups that we're currently seeing.

But Web 2.0 is not just for the startups: Enterprise Web 2.0 is coming! The bigco.com site is going to be looking a little, well, static and lifeless when compared to the new sites that are springing up everywhere, and that most of BigCo's employees are using. Further, BigCo can gain huge benefits from Web 2.0 approaches empowering and connecting those employees on the Intranet. And that Intranet is an ideal platform for deploying company-wide, interactive applications.

This shift in the Web to two-way dynamic data is being powered by a set of technologies that a Web platform programmer is going to have to learn.

 

Web 2.0 Platform Technologies

Anything that claims to be an application platform must support data. Web 2.0 is above all the data Web. Web 2.0 is about semantics, not free text and font sizes. Hence, it inevitably starts with data-oriented formats such as XHTML, YAML and JSON.

In Web 2.0 more than ever, we talk about data not documents and about separating data from its presentation. CSS is big in Web 2.0, for good reason (not just for gradient fills).

Inside the page of a self-respecting Web 2.0 application, you'll often find Microformats - again, semantics in the page: publishing concise data of widely-understood standard formats. Some of those Microformats may be tags, and in Web 2.0 the simplest and most powerful semantics are those little pivot points in Webspace.

Again, if you're going to be a general purpose platform, you need to be able to fetch, update, notify and display that data. Web 2.0 integration usually happens via JSON data structures and REST interfaces (some of which, especially those based on AtomPub, are true REST).

Following on from the data-like pages we serve to browsers come the data-like feeds we publish to feed readers and to other applications. After feeds, the core technology that gives Web 2.0 its dynamism and interactivity is Ajax and DHTML, and increasingly Comet (server push to the browser). The core technology that gives Web 2.0 its users is increasingly OpenID.

All of the above are open technologies. You can do Web 2.0 without proprietary technologies, just like Web 1.0. Indeed, keeping to the principles that made the Web successful is also essential to the success of Web 2.0. The Web platform is the first application platform that has to consider scalability and interoperability, and will ignore them at its cost. I have written before about open data, use of standard data formats and using REST properly to avoid creating unscalable, walled-garden sites. You don't need Flash or SilverLight, you don't need vast amounts of custom Javascript, you don't need function calls tying you to your servers.

 

Programming the Web 2.0 Platform

So, we've got the dynamic data that you'd expect of a would-be platform. But how to drive changes in those data? How do we program the Web platform to animate all this data?

All Rails programmers will know the above technology list; it comes with the territory. The Web 2.0 Platform can be very successfully powered by Rails and Ruby. Ruby and Rails make Web 2.0 applications simple and quick to program, addressing many of the needs of simple Web 2.0 applications out of the box. There's little doubt that Ruby and Rails will have a secure future riding the Web 2.0 wave.

However, for many Web 2.0 applications, programming may not even be necessary, at least not in the procedural or imperative style programmers expect.

Look back to the early '90s: 'Web 1.0' made a whole class of applications easy to write without programming: applications for navigating information. You just wrote in HTML, declaratively.

Now look back at the long path of evolution of Java, through J2EE, Spring, AOP, IoC, Domain Driven Design, POJOs. All trying to achieve the simple goal of 'remove all that MVC and persistence stuff and let us concentrate on business or domain objects'. But they never quite seemed to get it right.

But then Rails comes along, and has succeeded by simple virtue of concentrating on easy manipulation of the 'Intel Inside' of Web 2.0 - data.

It's reminiscent of the 'Naked Objects' approach to application building with minimal programming (just business or domain code in POJOs that expose state into the GUI and are transparently persisted). The Streamlined project takes Rails even further down this path. Rails' nearest competitor, Django, has an admin interface that works in a similar way, automatically generating edit pages based on the data model.

Web 2.0 is about data, about semantics. Web 2.0 is inherently declarative. So Web 2.0 applications can be written declaratively - Web 2.0 mashups can be just wired together and their data animated by business rules. A bit like programming spreadsheets.
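That spreadsheet-like idea can be sketched as data plus declarative rules: each derived value is declared as a function of other named data, and the whole set is re-settled whenever anything changes. The names, the naive fixed-point loop and the 20% tax rule are all invented for the example.

```python
# Data animated by declarative business rules, spreadsheet-style.
DATA = {"price": 100.0, "quantity": 3}
RULES = {
    "subtotal": lambda d: d["price"] * d["quantity"],
    "total":    lambda d: d["subtotal"] * 1.2,  # illustrative 20% tax
}

def settle(data, rules):
    """Re-run the rules until no value changes (a naive fixed point) -
    the declarative equivalent of a spreadsheet recalculation."""
    changed = True
    while changed:
        changed = False
        for name, rule in rules.items():
            try:
                value = rule(data)
            except KeyError:
                continue  # an input isn't derived yet; retry next pass
            if data.get(name) != value:
                data[name] = value
                changed = True
    return data

settle(DATA, RULES)
assert DATA["subtotal"] == 300.0
assert DATA["total"] == 360.0
```

Note that no imperative sequencing is declared anywhere: the rules state what must hold, and the engine works out when to re-derive each value.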

Teqlo, Coghead, Pipes, DabbleDB, LongJump, Popfly, AppExchange and Wyaworks are all examples of the different ways to program the new Web 2.0 platform without imperative code.

That's what we mean by Web-as-platform - not only is the underlying programming language irrelevant, it will often not even be needed, certainly for simple data manipulation applications and for many simple mashups. Being RESTful gives you a massive head start in this, of course.

While Rails is already in the game with its innate understanding of Web 2.0 techniques and philosophies, Ruby itself has a huge amount to offer the would-be declarative programmer, who is making the transition to this new Web platform from their traditional Java or .Net platform. In particular, it is easy to write your domain logic in a declarative style in Ruby: they call them 'DSLs' these days, but the idea is the same in most examples I've seen.

 

Web 2.0 - The Web Redux

Now, if you've been following this blog, you'll know I have a few opinions on Declarative Web 2.0 and on patterns for programming REST. Essentially I argue that, if you want to play in the Web 2.0 platform game, you don't want to be writing screeds of Javascript functions that call more functions on your servers.

I recently presented some ideas along these lines at the WWW2007 conference, entitled 'The Micro Web: putting the Web back into Web 2.0', where I also showed a demo written in Python.

This approach combines my Distributed Observer Pattern with Comet push to enable highly dynamic Web 2.0 applications to be coded RESTfully and declaratively, with zero Javascript. The Distributed Observer Pattern offers a clean programming model for animating the Web 2.0 dynamic-data technology set I described above.

I believe the Observer Pattern is core to the way we'll be programming when the Web 2.0 Platform hits mainstream. It enables the kind of event- and rule-driven programming that matches the characteristics of the Web 2.0 dynamic data platform. As a further killer benefit, it also directly addresses the optimal utilisation of multicore processors.

I am currently porting my Python implementation of this approach to Ruby, in the Redux project on Rubyforge. Redux stands for 'Ruby Event-Driven Update Exchange'. It uses the highly scalable EventMachine epoll-based event loop to power its event-driven architecture. This will be essential when Redux is asked to scale up a Comet-based application.

Like Rails, Redux will be a Web (2.0) application framework, but unlike Rails, it puts the Observer Pattern and event- and rule-driven programming at its core.

Redux's headline is 'Web 2.0 in-a-box' or 'Naked Objects on the Web'.

 

Conclusion

If you're in BigCo, and are responsible for setting BigCo's technical strategy, then train your Java devs up on Web 2.0 core technologies such as Ajax, JSON, Atom, Microformats and OpenID.

And fire up their enthusiasm by tapping into Ruby (perhaps via JRuby) on your way to the Web 2.0 platform.

Learn patterns for mashing and integrating. Learn about REST and event- and rule-driven programming, including declarative DSLs.

When this Web platform hits BigCo, you will probably find that its REST or ROA style makes your SOA integration strategy look rather complex and unwieldy.

Check out the Distributed Observer Pattern, and download Redux when it's done (I'll let you know if you subscribe here!).

In 2007 and beyond, it's the Web itself that's the platform, not Java or .Net. But if you want to get there via a language-based platform, Ruby could be the best way to transition to it.

Note: Everything I said about Ruby and Rails applies equally in technical terms to Python and Django, but regardless of the significant benefits of the latter, Ruby and Rails have the Web 2.0 market and mindshare. I'll probably switch this blog from Django to Redux sometime this year..

(c) 2007 Duncan Cragg

http://duncan-cragg.org/blog/post/distributed-observer-pattern-rest-dialogues/ The Distributed Observer Pattern | The REST Dialogues 2007-06-20T22:42:00Z 2007-06-20T22:42:00Z

In an exclusive nine-part dialogue with an imaginary eBay Architect, we present an accessible discussion of the REST vs. SOA issue.

Although eBay have what they call a 'REST' interface, it is, in fact, a STREST interface, and only works for a few of the many function calls that they make available via SOAP (GetSearchResults, GetItem, GetCategoryListings, etc).

In this dialogue series, I argue the case for eBay to adopt a truly REST approach to their integration API.

Part 5: The Distributed Observer Pattern

eBay Architect: So, can you summarise your argument that 'REST isn't just about reading and writing data', and explain your view on RESTful business logic?

Duncan Cragg: OK. The whole collection of related resources determines where things stand at any given time.

Resources are masters of their own destiny - guided by rules declared in the standard to which their content type conforms.

These rules, or business logic, run on notification of any declarations of the state of peer resources, or on arrival of any state via POST. Such peer states and POSTs are not commands, although it is possible to go ahead and define a special command or edit command content type.

The rules aim to satisfy the business or domain constraints on the mutual states of these resources - updating and creating resources accordingly and causing appropriate side-effects outside the server, such as financial transactions and emails.

These transformations are state driven. Even though the 'tension' in unresolved rules may be detected by events, that tension exists, not in those events as such, but in resource state.

eA: That sounds like a core difference to SOA.

DC: Indeed. It's a Resource-Oriented Architecture. And ROAs are declarative, not imperative like SOAs.

We have a world of resources declaring their current state, and resources settling into new states depending on the current state of related resources. These state changes can be driven by hard-coded resource animation logic, or by simpler, clearer, more scalable, declarative state transformation rules.

eA: Remind me of those patterns for notifying state change.

DC: Resource states are either polled via GET or actively notified via POST. Such actively POSTed state could be from a resource that also happens to be GETable, could be simply a link to such a resource, or could cause such a GETable resource to be created on the target server. Alternatively, the POSTed state could be considered too transient to record in a GETable resource, but can still trigger transformation in its target resource.

The above eBay examples used the pattern of 'server creates GETable copy of POSTed resource', and also 'second server hosts GETable copy of POST-notified resource'.

What I have described is a general programming model because, in general, such simple, declarative, transformational mechanisms are Turing Complete.

eA: I'm sure it's a novel perspective - even to RESTians! Again, do you have any high-level RESTian support for this?

DC: Any web resource that is a derivative of, or is dependent on, one or more other resources is using this approach.

Like I said before, there is an example of a similar approach by Joe Gregorio on his 'Well-Formed Web' site for alerting resources to peer resources of mutual interest.

Every time you would POST some data, consider making that data GETable and POST its URI instead, as a notification of the data existing.

eA: GETable POST data? You sure that's REST-compliant?

DC: In REST integration, things become more symmetric than in the client-server Web, or rather, the 'client-resource' Web. We can start to talk about the 'resource-resource' Web!

But anyway, we're already halfway to the symmetric resource-resource Web when we POST - not to a service, but to a URI. Resources can already both issue and receive state, which is a pretty symmetric state of affairs.

eA: I never thought of it that way - I keep forgetting that you can POST right back to a resource you just fetched.

DC: But think one step on: the POSTed data has a Content-Type but no URI!

Why not close the loop and have this POSTed data be a first-class resource (with a URI) that POSTs itself to the target? It can then itself GET that target, or be POSTed to by that target in return.

That really is a Resource-Oriented Architecture. Once resources are seen as equal and active participants in RESTful integration, it becomes irrelevant whether their state is transferred by GET or by POST.

eA: I'm still having trouble with this pattern of POST just being a pro-active GET.

DC: Making POSTed data GETable more correctly moves the responsibility to the target resource to fetch the incoming resource state when it's ready (rather than being bombarded by state it hasn't asked for).

Once the target is interested, updates can be POSTed directly as they happen, to prevent the target polling, or notification of an updated URI POSTed to trigger the target to re-GET the changed resource when it wants (thereby updating the caches).

eA: Hmm - makes clients look like servers..

DC: Since our 'clients' in REST integration are also 'servers' in other contexts, it is easier to set up client-side resources than on the browser-based Web. One objection to cookies on the Web is that they are state or resource that has no URI. So give your 'client' state a URI! And put any client-specific server resources on your own 'client' host.

eA: Is anyone doing this sort of thing?

DC: Well, in fact there are many examples of this POST-notification of a GETable resource already happening between web sites. Like submitting a link to your site to an indexing engine and letting it crawl (or poll) it.

Trackback pings are another example: POST a URI along with a sample of your page. And the Microformat rel-tag adds your article to Technorati's tag index when you ping their servers with the URI of the article.

Further, imagine POSTing to some new site a link to your hCard on your own server, to save you having to type your name and address again. And you'd never need to manually update sites when your address changes: just ping 'em all.

eA: Ah - but I thought all URIs should be GETable. The ping URI you're POSTing to in these examples isn't always one that you can also GET!

DC: Indeed - so think how much more powerful it would be if we did close the loop and provide or create a GETable resource to POST these notifications to.

For example, imagine a page containing an hCalendar event. Now point to it with a rel="attending" link. When the hCalendar discovers your intention (using a direct POST ping of your page's URI to the hCalendar page's URI - or perhaps through the referrer trick from people clicking through), it adds your referring page to a list of attendees inside the hCalendar. The hCalendar could either contain lists of backlinks to the attendee's pages, which may in turn carry hCards, or it could contain lists of complete hCards copied over.

eA: Sounds like a good use of Microformats.

DC: These examples make crawling and polling (even with If-Modified-Since et al) look like a clumsy version of the more proactive POST.

Web Feeds and general publish-subscribe are further examples where POST may be used to notify changes on a resource - giving the feed consumer first-class resource status with their own URI.

eA: I'd never think of using HTTP in this way.

DC: Obviously this only applies where the feed consumer is a visible and POSTable server and where timeliness is crucial. And probably where the number of subscribers is relatively small, unless asynchronous I/O and an event-driven architecture are employed, and you don't wait for the response to each POST.

This isn't done now simply because of the asymmetry of the current Web, an asymmetry which we are free of in REST integration.

eA: What about all those REST rules about idempotent and unsafe methods?

DC: We're not mixing GET and POST in that sense, just turning the tables on the asymmetric Web. GET is still cacheable, and we can POST a link to cause a cached GET.

I believe this is a more-constrained REST style, not disjoint from REST. It is at least an ROA! It may fall foul of REST's client-server constraint, since we're now in server-server territory with integration applications. Also, the concept of 'Hypertext as the Engine of Application State' may take some refitting to the mutual state dependency model. However, I believe it's most important to focus on maintaining the benefits of REST and its key elements of standard content types at URIs.

I call this symmetric REST integration style the 'Distributed Observer Pattern'.

eA: Quickly summarise the 'Distributed Observer Pattern'.

DC: OK, the Distributed Observer Pattern is 'symmetric REST'. A resource subscribes to a peer resource via a GET that supplies its own URI, and is notified of subsequent state changes in that resource through a POST back.

eA: That was too quick. Tell me the details!

DC: OK, here are four. First, a POST can be either the whole new state or the fact of the change, allowing the subscriber to GET the resource when it's ready (and thereby fill any caches).

Secondly, you can use either the Referer header or perhaps the Content-Location header in POST and GET requests to indicate the origin POSTer or GETter URI. Alternatively, you can send this origin resource URI using the Cookie header, echoing its use in the normal browser client-server case to identify the pseudo-resource of a browser user.

Thirdly, POSTed state notifications may arrive unsolicited by any prior GET subscription, when the POST target is clearly open to them (as in the ping notification examples). These can be seen as 'subscribe to anyone', and may be combined with a corresponding 'GET anyone' crawling process, without explicit subscription.

Finally, POST notifications may be targeted to single resources to ask them to update: the Distributed Observer Pattern way of achieving the client-server editing function. These now become 'edit suggestions' of the POSTer resource - putting the target back in control of its own destiny and integrity.
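Pulling those details together, here is a minimal in-memory sketch of the Distributed Observer Pattern: a GET carrying the observer's own URI doubles as a subscription, and subsequent state changes are POSTed back to that URI. The class, the registry and the use of Referer to carry the origin URI are illustrative conventions, not a wire protocol.

```python
# In-memory sketch of the Distributed Observer Pattern ('symmetric REST').
# All URIs, state and business rules here are invented for the example.
class Resource:
    def __init__(self, uri, state=None):
        self.uri, self.state, self.observers = uri, state, set()

    def get(self, referer=None):
        """GET returns state; a Referer URI doubles as a subscription."""
        if referer is not None:
            self.observers.add(referer)
        return self.state

    def post(self, new_state, registry):
        """Declare new state, then notify observers by POSTing it back."""
        self.state = new_state
        for uri in self.observers:
            registry[uri].on_notify(self.uri, new_state)

    def on_notify(self, peer_uri, peer_state):
        # Business rule: this resource's state is a function of its peer's.
        self.state = {"saw": peer_uri, "peer": peer_state}

registry = {}
item = registry["/item/1"] = Resource("/item/1", {"price": 10})
registry["/watch/1"] = Resource("/watch/1")

assert item.get(referer="/watch/1") == {"price": 10}  # GET subscribes
item.post({"price": 8}, registry)                     # change notifies observers
assert registry["/watch/1"].state == {"saw": "/item/1", "peer": {"price": 8}}
```

A real deployment would POST over HTTP to the observer's URI (or POST just a link, triggering a cache-filling re-GET), but the symmetry is the same: both ends are resources that can issue and receive state.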

eA: And why should I use the Distributed Observer Pattern?

DC: The Distributed Observer Pattern supports the programming model of inter-dependent resources whose own state is a function of their peers' state, driven by declarative rules. It's a very general ROA programming model.

(c) 2006-2007 Duncan Cragg

 

In Part 6: Content-Types and URIs.


http://duncan-cragg.org/blog/post/inter-enterprise-rest-integration-rest-dialogues/ Inter-Enterprise REST Integration | The REST Dialogues 2007-04-08T13:38:00Z 2007-04-08T13:38:00Z

In an exclusive nine-part dialogue with an imaginary eBay Architect, we present an accessible discussion of the REST vs. SOA issue.

Although eBay have what they call a 'REST' interface, it is, in fact, a STREST interface, and only works for a few of the many function calls that they make available via SOAP (GetSearchResults, GetItem, GetCategoryListings, etc).

In this dialogue series, I argue the case for eBay to adopt a truly REST approach to their integration API.

Part 4: Inter-Enterprise REST Integration

Duncan Cragg: OK - I've demonstrated how you can replace imperative, function-call API-driving with a clean, declarative, RESTful interaction, driven by simple business rules.

We had servers run by eBay and clients run by the public, in the same way your SOAP API is used.

eBay Architect: Ah: that's something SOA has that REST doesn't!

DC: What? What's that?

eA: Services are all about Enterprise Integration: about servers talking to servers. In REST you're all about clients talking to servers. The Web is essentially only browser clients talking to Web servers. With Web Services, you can do more serious Enterprise Integration.

DC: You never give up do you? So you want 'serious' integration. Is that within or between enterprises?

eA: Let's say between.

DC: Fine. We'll use the same example as before: it's just a variation on the Patterns used.

We can standardise a more general version of the eBay schemas for Items, Offers, ResponseToBestOffers and so on. Anyone can put their own Items, Offers, etc. up on their own servers, or on some public auction service site. Everyone can do auctions with eBay and with anyone else who decides to set up.

Even, say, a new Google auction site: let's call it 'gBay'!

eA: Ha! OK, let's go through this slowly: you have eBay and 'gBay' sites, with sets of users on each. Now Ernie wants to sell his old laptop on eBay, so creates a new Item for it. Gordon is registered on gBay and needs a cheap laptop.

DC: Great - well the first thing is search. As an interoperable site, gBay offers a broad search across both gBay sale Items and eBay ones - cached and indexed internally. The gBay search database would be filled by crawling eBay URIs and even by running queries on eBay.

eA: Mm. Have to check the T's & C's...

DC: So Gordon on gBay finds Ernie's laptop on eBay. The presentation of this eBay sale item will be given the gBay style, but calling out directly to the eBay data and images.

eA: OK, now let's say Gordon decides to make an offer.

DC: So an Offer resource is created on gBay referring to the laptop on eBay. Then through a notification, the Item on eBay is alerted to this Offer.

eA: What's notified, to where?

DC: There are a number of possible patterns. Before, we had the pattern of POSTing a resource to a server that then creates the GETable version.

However, now gBay is hosting the Offer, so the internal mechanisms for notification are no longer available.

So gBay could suggest an update through APP or a simpler POST to a collection of Offer entries within the eBay Item to point to this, now remote, Offer.

Perhaps the gBay Offer can simply be POSTed wholesale to the eBay Item.

Or just a link to it.

Or eBay may poll, read a feed or search gBay for new Offer URIs, putting them into Offer lists as they come up.

An unusual approach (thanks to Joe Gregorio) would be for gBay to GET the eBay Item, with the Offer marked in a Referer: header.
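Two of these options can be sketched in a few lines - an in-memory toy, with hypothetical class and field names, showing an eBay Item collecting remote Offer URIs both from a POSTed link and from Joe Gregorio's GET-with-Referer trick:

```python
# Toy in-memory Item resource; no HTTP, just the notification shapes.

class Item:
    def __init__(self, uri):
        self.uri = uri
        self.offer_uris = []

    def post_link(self, offer_uri):
        # gBay POSTs just a link to its remote Offer into the Item's list
        if offer_uri not in self.offer_uris:
            self.offer_uris.append(offer_uri)

    def get(self, headers):
        # gBay GETs the Item, marking its Offer in the Referer header
        referer = headers.get("Referer")
        if referer:
            self.post_link(referer)
        return {"uri": self.uri, "offers": list(self.offer_uris)}

item = Item("http://ebay.com/item/4243")
item.post_link("http://gbay.com/offer/99")
rep = item.get({"Referer": "http://gbay.com/offer/100"})
# rep["offers"] now lists both Offer URIs
```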

eA: Plenty of patterns to choose from. So there are some Offers on eBay, some on gBay. The Item lists its Offers in a rank as before, as they appear through this notification.

Now, let's say Ernie wants to accept Gordon's Offer on gBay.

DC: OK, assuming he can see the Offers the same regardless of host, he just chooses Gordon's Offer on the offer listing for his Item and accepts it.

eA: So we need to create a ResponseToBestOffer on eBay.

DC: Yes. Now the patterns are reversed, because eBay needs to notify gBay this time - of its ResponseToBestOffer.

 

Pub-Sub and Observer Pattern

DC: Again, it can do this by POSTing the ResponseToBestOffer to each Offer on gBay in turn, or can POST the actual Item itself to each Offer, where the Item has a link to the ResponseToBestOffer.

That would implement a logical subscription to the Item from each of the Offers on it.

eA: It sounds to me like POSTing several times to implement this pub-sub pattern is physically inefficient, even if it's logically correct. Especially when it's the same information repeated from eBay to gBay servers.

DC: Yes, indeed: a single notification to gBay would be better, letting gBay handle the propagation of subscription responses. This would in effect treat gBay as a proxy cache, and the notification as a cache invalidation event on gBay's copy of the eBay Item.

eA: What URI on gBay would you POST this eBay Item to?

DC: Something like http://gbay.com/ebay.com/item/4243 - to a copy of itself. You could also GET this cached copy if you wanted.
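The mirror-URI convention is simple enough to sketch directly - one possible scheme (mine, not a standard) nests the remote authority and path under the local host:

```python
# Derive a local cache URI for a remote resource, per the
# http://gbay.com/ebay.com/item/4243 convention above.

def mirror_uri(local_host, remote_uri):
    # strip the scheme, then nest authority+path under the local host
    stripped = remote_uri.split("://", 1)[1]
    return "http://%s/%s" % (local_host, stripped)

uri = mirror_uri("gbay.com", "http://ebay.com/item/4243")
# → "http://gbay.com/ebay.com/item/4243"
```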

eA: OK, what next?

DC: In gBay the losing Offers get updated on receipt of this ResponseToBestOffer state. Gordon's Offer gets set to 'won'. In eBay, all the losing Offers are updated to 'lost'. The laptop Item gets marked 'sold', with a link to the ResponseToBestOffer, which links to the Offer that won.

It is possible to implement this internally in eBay (and that pub-sub cache invalidation propagation in gBay) using the Observer Pattern and an event-driven server.

eA: Makes sense - you mean something like SEDA?

DC: Yep.

So the Offers all subscribe to the Item to watch for its status switching to 'sold' and to see if they won. Conversely, the Item can subscribe to the Offers: maybe the Offers could change or be withdrawn, and the Item needs to keep itself updated accordingly.
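This symmetric subscription can be sketched as a toy, in-memory Observer Pattern - hypothetical classes and rules of my own devising, ignoring withdrawal handling and everything HTTP:

```python
# Item and Offers observe each other; each applies its own rule when
# a peer it watches changes state.

class Resource:
    def __init__(self):
        self.observers = []

    def subscribe(self, other):
        self.observers.append(other)

    def notify(self):
        for o in self.observers:
            o.on_change(self)

class Offer(Resource):
    def __init__(self, uri):
        super().__init__()
        self.uri = uri
        self.status = "open"

    def on_change(self, item):
        # rule: when the Item is sold, I won only if it accepted me
        if item.status == "sold":
            self.status = "won" if item.accepted is self else "lost"
            self.notify()

class Item(Resource):
    def __init__(self):
        super().__init__()
        self.status = "for-sale"
        self.accepted = None

    def on_change(self, offer):
        # rule would react to changed/withdrawn Offers; elided here
        pass

    def accept(self, offer):
        self.accepted = offer
        self.status = "sold"
        self.notify()

item = Item()
a, b = Offer("http://gbay.com/offer/99"), Offer("http://ebay.com/offer/7")
for o in (a, b):
    item.subscribe(o)   # the Offers watch the Item...
    o.subscribe(item)   # ...and the Item watches the Offers
item.accept(a)
# a.status == "won", b.status == "lost"
```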

eA: Wow - symmetric subscription - the two-way Observer Pattern!

OK, what next?

DC: The eBay laptop Item resource will be further updated by its owner with paid, shipped, refunded, etc., as it currently is within eBay.

eA: Hold on, you're mixing patterns: you had the Observer Pattern on the Item just now: the Item observes the Offers. The Offers' state can be POSTed to the Item, whose own state may then change according to its rules.

But you then mix patterns by allowing a POST directly to the Item from the Item's owner, to update a couple of fields.

In one, the Item chooses what its state will be according to the state of its peers, and in the other, it's told, not according to a peer state, but some POST content type.

That doesn't seem neat or symmetric.

DC: It's true that these interaction styles differ: the Observer Pattern or pub-sub approach is peer-to-peer (resource-to-resource as equals watching each other); and in this scenario it's also server-to-server.

The direct edit request is more a client-server pattern, where the server resource - the Item - is considered under the control of a client.

However, the Item is always in control of its own state, and can even ignore a request by its owner if that request doesn't match its internal integrity rules.

The Item supporting both styles at the same time is absolutely fine.

Actually, you could see these two styles as aspects of the same peer-to-peer pattern: introduce a resource in the client that holds edit requests, to which the Item subscribes. It all ends up being much the same.

 

Transactions, Trust

eA: Right, now what if you have a race, where the ResponseToBestOffer is created at the same time as an Offer is changed or withdrawn?

Don't you need some kind of two-phase commit or distributed transaction logic?

DC: Of course not. It's the same as in the real world: as long as it all settles in the end and the rules are followed. The ResponseToBestOffer cites what state of the Offer it is accepting. If that changes for any reason, the ResponseToBestOffer is void.

It's about state and state consistency in REST, as opposed to the SOA style of maintaining total control at all times.

There will be temporary states that trigger the rules and that need to be resolved. That's the programming and distribution model. Tolerance of transient states is what makes this model so robust.
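'Cites what state of the Offer it is accepting' can be made concrete with a fingerprint of the Offer as seen at acceptance time - a sketch with made-up field names, using a hash where a real system might use an etag or version number:

```python
import hashlib

def fingerprint(offer):
    # deterministic digest of the Offer's current state
    return hashlib.sha256(repr(sorted(offer.items())).encode()).hexdigest()

offer = {"uri": "http://gbay.com/offer/99", "amount": 250}
response_to_best_offer = {"accepts": offer["uri"], "as-of": fingerprint(offer)}

def still_valid(response, current_offer):
    # the acceptance is void if the Offer has changed since it was cited
    return response["as-of"] == fingerprint(current_offer)

ok = still_valid(response_to_best_offer, offer)        # True
offer["amount"] = 200                                  # Offer changed/withdrawn
void = not still_valid(response_to_best_offer, offer)  # True: acceptance void
```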

eA: Surely there are some legal and contract issues? How is this exchange legally binding?

DC: You can digitally sign the Item, Offer and ResponseToBestOffer resources, and each side needs to keep records of the history. Then it's down to agreements between eBay and gBay and the local laws in force.
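The dialogue later notes that identity implies asymmetric (private/public key) crypto; purely to show the sign-verify-and-keep-records shape, here is a stdlib sketch using an HMAC with a hypothetical shared secret in place of a real signature:

```python
# NOT the real mechanism: an HMAC stands in for an asymmetric signature.

import hashlib
import hmac

SECRET = b"ebay-gbay-shared-secret"   # hypothetical agreed key

def sign(representation: bytes) -> str:
    return hmac.new(SECRET, representation, hashlib.sha256).hexdigest()

item_bytes = b'{"uri": "http://ebay.com/item/4243", "status": "sold"}'
record = {"body": item_bytes, "signature": sign(item_bytes)}  # kept for history

def verify(rec) -> bool:
    return hmac.compare_digest(rec["signature"], sign(rec["body"]))
```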

eA: What about buyer and seller ratings and feedback?

DC: Ernie in eBay and Gordon in gBay can happily publish feedback about each other, and Ernie will be able to see Gordon's rating via eBay's interface, or directly on gBay.

As for aggregated ratings from several buyer/seller interactions: a person's rating is a function of the ratings of all those they have dealt with. These ratings can be fetched by GET from remote sites, and combined with internally-held ratings, depending on the trust of one site over another site's ratings.
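The aggregation just described can be sketched as a trust-weighted mean - illustrative numbers and names, with remote ratings assumed already fetched by GET:

```python
def combined_rating(ratings_by_site, trust_by_site):
    """Trust-weighted mean of each site's average rating."""
    total = weight = 0.0
    for site, ratings in ratings_by_site.items():
        if not ratings:
            continue
        t = trust_by_site.get(site, 0.0)   # untrusted sites count for nothing
        total += t * (sum(ratings) / len(ratings))
        weight += t
    return total / weight if weight else None

rating = combined_rating(
    {"ebay.com": [5, 4, 5], "gbay.com": [3, 4]},
    {"ebay.com": 1.0, "gbay.com": 0.5},   # eBay trusts gBay's ratings at half weight
)
```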

eA: So how do we trust these ratings across sites?

DC: We have to trust eBay's judgement that gBay can be trusted. This is one of the basics of distributed systems. In a monolithic system you have a single trust domain: all parts can trust each other.

Split the application up across multiple trust domains and you need authentication and crypto. You can't get away from needing peer trust structures, built up explicitly through crypto, agreement and contract, and/or implicitly through past successful experience.

eA: Can you be more specific?

DC: Normally, a GET for a resource or a POST of some data comes with a header identifying the GETer or POSTer. The resource can also be signed by a user on a site or by the site itself as a proxy.

Or, if you have an agreement with the site, you just need to use https to ensure you've got a secure connection with that site, then needn't have individual signatures.

eA: Where's the Single Sign On and Identity in all this? We've got users working across multiple sites.

DC: Well, gBay is the holder of the Gordon identity or persona - and it manages his world view. Gordon on gBay needs his identity to mean something on eBay, but we don't want him to have to create an account on eBay or to have to tell gBay his eBay login details to work on both sites. So he expects gBay and eBay to have come to some agreements about technology and policy.

In REST, we don't have sessions and logins - we have identity, which implies asymmetric (private/public key) crypto for signatures and security. We have a number of tools available to us, including OpenID and https, as well as resource signing.

eA: Here's a question for you: how would you manage a single shopping trolley for Gordon on gBay, containing and allowing payment for eBay goods?

DC: A ShoppingTrolley resource, with links to eBay and gBay Items. At checkout, a smaller eBay-Items-only ShoppingTrolley resource is POSTed to eBay along with a CreditCard resource (again, you can sign the ShoppingTrolley and encrypt the data).
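The checkout split is mechanical enough to sketch - hypothetical URIs and fields, filtering the mixed trolley down to the Items a given host should receive:

```python
from urllib.parse import urlparse

def trolley_for_host(trolley, host):
    """The sub-trolley to POST to one host: only that host's Items."""
    return {
        "owner": trolley["owner"],
        "items": [u for u in trolley["items"] if urlparse(u).netloc == host],
    }

trolley = {
    "owner": "http://gbay.com/user/gordon",
    "items": [
        "http://ebay.com/item/4243",
        "http://gbay.com/item/17",
        "http://ebay.com/item/9001",
    ],
}
for_ebay = trolley_for_host(trolley, "ebay.com")
# for_ebay["items"] holds only the two eBay Item URIs
```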

eA: So, as eBay, why should we integrate the seller ratings of someone on gBay? Or get gBay's for-sale items coming up in our searches? Or accept Offers and ShoppingTrolleys from gBay? We don't control or trust them, and don't want to send traffic or business over to them.

DC: Fair enough, for now. I'm only describing what's technically possible. Like I said before, you may revisit your stance on interoperability and mutual agreements one day soon.

Also, what if your business decides this year to set up a commercial partnership with another similar business and the managers come to you asking how it's all going to work together internally?

You'll find having good REST interoperability already in place a huge asset for internal integration! You'll also find that an interop-friendly approach makes developing internal 'mashups' much easier.

 

Better Than SOA

eA: I still can't see why all this is better than our SOAP approach, though: it just seems like the same things are happening at the end of the day - that it's only a change of perspective.

DC: Well, a minute ago, you were challenging using REST for anything other than simple data manipulation. Now I've shown you how a REST approach can be easily extended into a clean, simple, scalable, interoperable, general, declarative programming model. And you're still not satisfied!

eA: Ha! OK. So tell me why this programming model is so scalable and interoperable compared with the SOAP API and normal function calls.

DC: It's scalable because of all the reasons I mentioned before: the cacheability of the basic data operations and their parallelisability through data partitioning by URI.

Plus now we have parallelisability of the application of the business rules. There's nothing more parallelisable than a declarative system.

eA: If you say so! OK, perhaps you could elaborate on that; it sounds like a new point.

DC: It is: when you're leading the computer step-by-step through a process, you have to handle concurrency yourself. That's the 'How' of 'What not How'.

Conversely, when you simply declare 'What' the rules are, the computer is free to go off and do things as concurrently as the rules and the data separation allow.
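'What not How' can be caricatured in a few lines: each resource's state is a pure rule over its peers' states, so an engine (here a naive fixpoint loop of my own invention) is free to pick the evaluation order, with no hand-written locking:

```python
# Toy declarative engine: apply each resource's rule until nothing changes.

def run_to_fixpoint(states, rules):
    """states: name -> value; rules: name -> fn(states) -> new value."""
    changed = True
    while changed:
        changed = False
        for name, rule in rules.items():   # order is the engine's choice
            new = rule(states)
            if new != states[name]:
                states[name] = new
                changed = True
    return states

states = {"item": "sold", "offer": "open"}
rules = {
    # declarative 'What': an Offer on a sold Item is decided, not open
    "offer": lambda s: "won" if s["item"] == "sold" else s["offer"],
    "item": lambda s: s["item"],
}
final = run_to_fixpoint(states, rules)
# final == {"item": "sold", "offer": "won"}
```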

eA: Mm. OK. Interoperability?

DC: It's interoperable again for the reasons I mentioned before. Firstly, the power of the URI: this scenario is a full player in the Web. You can share links to Items around and go fetch your Offers and Feedbacks with a simple HTTP GET. You can make things happen by POSTing to the relevant URI, given its content type.

There's also the expectation of standard Content-Types, sub-types and schemas in GET and POST, rather than custom eBay WSDLs and schemas, that I mentioned before.

eA: Like you said, you already mentioned these things. Anything to add now that we're doing business rules?

DC: Yes; when data is your interface and resource transformation your basic programming model, resource data types become part of your 'programming language'. As such, there is great benefit in sharing data types to allow such programming across multiple domain boundaries.

SOA, on the other hand, encourages inventing your own 'programming language' every time. It's a much more brittle model and mind-set.

You can't GET your RespondToBestOffer function call, but I can GET the ResponseToBestOffer! It's basically a more mashable approach to distributed programming.

(c) 2006-2007 Duncan Cragg

 

In Part 5: The Distributed Observer Pattern.

Note that the opinions of our imaginary eBay Architect don't necessarily represent or reflect in any way the official opinions of eBay or the opinions of anyone at eBay.

Indeed, I can't guarantee that the opinions of our real blogger necessarily represent or reflect in any way the official opinions of Roy Fielding...