Monday 30 July 2012

HL7 over HTTP: First try at a specification



This post is a first call for comments on a specification for HL7 over HTTP.

The draft is here: http://hl7api.sourceforge.net/hapi-hl7overhttp/specification.html

HL7 over HTTP is an idea I first floated on the HAPI mailing list last week, and since there was a fair bit of interest I thought I would try to cobble together a spec. The spec as it stands incorporates most of the feedback I received on Blogspot and on the HAPI mailing list, so it's probably time to start getting feedback on how that all looks on paper.

Some areas where I'm particularly interested in opinions:
  • For the internet protocol crowd: Is it really wise to specify that only the parts of RFC 2616 which are explicitly referenced in HoH are required to be supported? My hope is that this leads to an easier-to-implement spec (since features like redirect, multipart content, cache-control, etc. are not relevant to transactional system-to-system messaging). My fear is that we'll miss something critical (my first attempt skipped the "Host" header, which I then learned is an absolute must according to HTTP/1.1).
  • For the security crowd: Is SHA512 with RSA a good signature algorithm for message non-repudiation? At a glance it seems like it might be too Java-centric. (Would a CMS signature be better? A CMS signed message?)
I've also started a reference implementation. To encourage adoption, the hope is that it will be usable in a wide variety of circumstances (e.g. servlet, standalone application, drop-in LLP implementation), even in applications that don't otherwise use HAPI. More to come on that, but please get in touch if you would like to get involved.


Monday 23 July 2012

A call for a new HL7 v2 Transport (HL7 over HTTP)

It's time for a replacement for the aging HL7 Minimal Lower Layer Protocol (MLLP).

Here at University Health Network, we exchange about 2 million HL7 messages on an average day over our main ESB (using HAPI every step of the way, I might add). About 25% of this traffic is between us and 24 outside organizations. I'm sure that there are far larger HL7 infrastructures in the world, but we can at least say definitively that we have a big one. One of my longstanding gripes with the world of HL7 v2 is that despite all of the improvements made to HL7 over the years (and there have been many), the transport layer protocol has remained stuck in the 70s.

MLLP

As a reminder to the casual reader, MLLP works like this: A client (sender) system initiates a raw socket-based connection to a server (receiver) system. The client then sends a single start block character (0x0B), then the raw HL7 message being transmitted, then two closing characters: an end block character (0x1C) followed by a carriage return (0x0D).
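
The framing described above is small enough to sketch in a few lines. The Python below is only an illustration, assuming the usual MLLP byte values (0x0B to open, 0x1C followed by a carriage return 0x0D to close); the function names and the sample message are made up for this example and are not taken from HAPI.

```python
# MLLP framing bytes: start block, end block, trailing carriage return.
START_BLOCK = b"\x0b"
END_BLOCK = b"\x1c"
CARRIAGE_RETURN = b"\x0d"


def mllp_frame(message: bytes) -> bytes:
    """Wrap a raw HL7 message in MLLP framing bytes."""
    return START_BLOCK + message + END_BLOCK + CARRIAGE_RETURN


def mllp_unframe(frame: bytes) -> bytes:
    """Strip MLLP framing, raising ValueError if the frame is malformed."""
    if not frame.startswith(START_BLOCK):
        raise ValueError("missing start block")
    if not frame.endswith(END_BLOCK + CARRIAGE_RETURN):
        raise ValueError("missing end block")
    return frame[len(START_BLOCK):-2]


# Hypothetical sample message, for illustration only.
SAMPLE = (b"MSH|^~\\&|SENDER|FAC_A|RECEIVER|FAC_B|"
          b"20120723120000||ADT^A01|MSG0001|P|2.5\r")
```

Note that nothing in the frame carries credentials, a destination name, or any metadata at all: whatever the two parties need beyond the payload has to be agreed on out of band.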

MLLP has been successful because it's simple. It's also a nightmare to implement because it's so simple. There is no security built in, so you have to rely on an agreement with your trading partner around where you might put a username and password (if anywhere) in the message payload. It's generally assumed that you will use raw TCP and it's an uphill battle trying to convince people to implement secure sockets (SSL/TLS).

HTTP

So, how can this be better? I'd like to propose something simple. Let's borrow from any number of successful protocols in the last decade, and define a standard HTTP based transport.

Why HTTP?

I can think of plenty of reasons why HTTP is the obvious fit for moving HL7 v2 messages around.
  • Although it was obviously originally geared towards moving hypertext documents around, these days HTTP has become the transport of choice for all kinds of applications. SOAP/XML uses it to perform RPC. Subversion and Mercurial use it to move version control data around. DAV implementations use it as a kind of "FTP on steroids". The examples go on.
  • There is a good reason that all of those applications use HTTP: it takes care of all of the common infrastructure challenges. HTTP has authentication mechanisms built in. Encrypting HTTP (using TLS) is well understood and implemented everywhere. HTTP management tools are everywhere (governance tools, smart HTTP firewalls, etc.). HTTP can be compressed easily, and the protocol comes with built-in negotiation for this (and other) features. HTTP also tends to play well with content switches, load balancers, firewalls, and other network gear.
  • Tool support for HTTP is great by now. Almost every modern programming language already has a good HTTP client built in, and many of them ship a standalone HTTP server too. Modern ESBs and interface engines generally already support HTTP as well, so in theory this kind of transport should be possible today with minimal effort.

How would it work?

This is where I would like to ask for opinions of the many bright minds doing health information integration out there. The following are some scattered thoughts I have (all of which are nothing more than personal opinions).
  • Action: I propose the following:
    • The receiving system acts as an HTTP server, and listens for requests.
    • The sending system initiates an HTTP POST request, and posts the message being transmitted.
    • The receiving system processes the message and, assuming success, replies with an HTTP 200 response whose body is a standard HL7 ACK message.
  • Addressing: MLLP relies on a specific port being assigned to a given interface. HTTP on the other hand uses paths. So, if a receiving system has multiple incoming interfaces, the sender would identify which interface they wanted to talk to using a POST path.
  • Encryption: HTTP over TLS (commonly called HTTPS) should be a standard expectation. HL7 by its nature is used to transmit sensitive information, so discretion should be built in.
  • Non-Repudiation: Given the sensitivity of HL7 traffic, support for built-in non-repudiation would be a good thing. One option might be a standardized HTTP header, present in both the request and the response, carrying a PKCS#7 signature that covers the HL7 contents. This should probably be optional, since not every application requires non-repudiation.
  • Compression: Subject to both configuration and support negotiation, content compression using gzip could be employed.
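
To make the sender's side of these points concrete, here is a rough Python sketch of how a request might be assembled: the interface is selected by the POST path, and the body is optionally gzip-compressed. Everything specific in it is an assumption for illustration: the function name `build_hl7_post`, the `application/hl7-v2` media type, and the example host and path. It also skips negotiation entirely and simply compresses when asked.

```python
import gzip
import urllib.request


def build_hl7_post(base_url: str, interface_path: str, message: str,
                   compress: bool = False) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one HL7 v2 message."""
    body = message.encode("utf-8")
    headers = {
        # Assumed media type; the proposal above does not name one.
        "Content-Type": "application/hl7-v2; charset=utf-8",
    }
    if compress:
        # A real sender would only do this if the receiver is known to
        # accept gzip-encoded request bodies.
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # The path, not the port, identifies the receiving interface.
    url = base_url.rstrip("/") + "/" + interface_path.lstrip("/")
    return urllib.request.Request(url, data=body, headers=headers,
                                  method="POST")
```

For example, `build_hl7_post("https://hl7.example.org", "/interfaces/adt", msg, compress=True)` targets a hypothetical ADT interface over TLS; a second interface on the same server would differ only in its path.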

How would this look?

Here is an example of how such a transaction might look.
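
As one illustrative sketch (written for this text, not a captured trace), the exchange can be exercised end to end with Python's standard library: a tiny receiver that answers every POST with an HTTP 200 and an HL7 ACK, and a sender that posts one message to it. The sample message, the `/interfaces/adt` path, the media type, and all function names are assumptions. The demo runs over plain HTTP on the loopback interface purely to stay short; TLS is left out here, not recommended against.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

CONTENT_TYPE = "application/hl7-v2; charset=utf-8"  # assumed media type

# Hypothetical sample message, for illustration only.
SAMPLE = ("MSH|^~\\&|SENDER|FAC_A|RECEIVER|FAC_B|20120723120000||"
          "ADT^A01|MSG0001|P|2.5\rPID|1||12345\r")


def build_ack(message: str) -> str:
    """Build a minimal positive acknowledgement (MSA|AA) for a message."""
    msh = message.split("\r")[0].split("|")
    # After splitting on "|": msh[2]/msh[3] are the sending application and
    # facility, msh[4]/msh[5] the receiving ones, msh[9] the control ID,
    # msh[10] the processing ID and msh[11] the HL7 version.
    ack_msh = "|".join([
        "MSH", msh[1], msh[4], msh[5], msh[2], msh[3],
        time.strftime("%Y%m%d%H%M%S"), "", "ACK", "ACK" + msh[9],
        msh[10], msh[11],
    ])
    return ack_msh + "\r" + "MSA|AA|" + msh[9] + "\r"


class Hl7Handler(BaseHTTPRequestHandler):
    """Receiver: read the posted message, reply 200 with an ACK body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        message = self.rfile.read(length).decode("utf-8")
        ack = build_ack(message).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", CONTENT_TYPE)
        self.send_header("Content-Length", str(len(ack)))
        self.end_headers()
        self.wfile.write(ack)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def send_message(host: str, port: int, path: str,
                 message: str) -> Tuple[int, str]:
    """Sender: POST one message and return (status code, response body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", path, body=message.encode("utf-8"),
                     headers={"Content-Type": CONTENT_TYPE})
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


def demo() -> Tuple[int, str]:
    """Start a throwaway receiver on a free port and post SAMPLE to it."""
    server = HTTPServer(("127.0.0.1", 0), Hl7Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return send_message("127.0.0.1", server.server_port,
                            "/interfaces/adt", SAMPLE)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the status code and the ACK text. The ACK's MSH segment swaps the sender and receiver of the original message, and its MSA segment echoes the original control ID so the sender can match the acknowledgement to what it sent.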

Making This Happen

I would like to hear ideas, criticisms, opinions, etc. on this proposal. If there is actually agreement out there that this is a worthwhile idea to pursue, here are some things that could happen to move this along.
  • A formal specification should be drafted. It would probably make sense to reach out to HL7.org for opinions, but I don't think this needs to "grow up" through HL7 in order to be successful. If the HAPI community could trial something useful, it could probably be proposed as a next step for broader adoption.
  • A reference implementation should be created. I propose the creation of a new HAPI subproject which implements this specification, licensed under a separate, business friendly license (Apache?). Potentially this implementation could include conformance verification tools which could be worked into the HAPI TestPanel.
  • We could reach out to other open-source HL7 implementations (Camel/IPF, Mirth, NHAPI, etc.) to gauge support for the idea, in the hope that they might be early adopters.
So... comments? :)

Wednesday 18 July 2012

HAPI 2.0 and TestPanel 2.0 Released!

Just a quick note to say that HAPI 2.0 and TestPanel 2.0 are both out now.

The new releases are both packed with fixes and changes, all of which are available in the changelogs. Definitely worth checking out.

Saturday 10 March 2012

How popular are the various flavours of HL7 v2.x?

Last year, HAPI's binaries were made available in the Maven central repo. An interesting side effect of that change is that the Nexus tool you use to upload new releases gives statistics about which binaries get downloaded. Since HAPI has an individual JAR for each version of HL7 2.x, this makes the Maven repository a kind of informal popularity contest for HL7 versions.

Obviously one would be foolish to use these numbers for anything, but they do give some interesting hints on the market penetration and adoption of various versions. Here is what they show:

Artifact               Downloads   % of downloads
hapi-structures-v25         2095              15%
hapi-structures-v24         1550              11%
hapi-structures-v231        1499              11%
hapi-structures-v23         1167               8%
hapi-structures-v26         1028               7%
hapi-structures-v251         903               6%
hapi-structures-v22          872               6%
hapi-structures-v21          762               5%

I'm kind of surprised to see 2.5 as the clear winner. My guess would have been 2.3 or 2.4.