A Design Pattern for Recovery from TCP Connection Crashes in HTTP Applications



HTTP is used as the communication protocol for many distributed applications, supporting business and safety-critical services at worldwide scale. Despite their increasing importance, HTTP-based applications remain exposed to TCP connection crashes, which can cause large losses, both financial and reputational, for service users and providers. Typical techniques for achieving reliable HTTP communication rely on buffering and retransmitting complete HTTP messages, and are poorly suited to large messages. Stream-based approaches are more efficient because, after a crash, data transmission can resume from where it stopped. However, it is very difficult to know how much data was lost in a crash: TCP provides insufficient support for obtaining this information and none for recovering from connection crashes. This makes the design of any stream-based reliability mechanism a significant challenge. In this paper we propose a stream-based solution for reliable HTTP communication that is backward compatible with existing software. The mechanism is presented as a design solution and relieves developers from explicitly designing recovery code for handling connection crashes, providing a standardized way of building reliable applications. Our experimental evaluation shows that the solution is functional and incurs acceptable coding and runtime costs.
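The difference between whole-message retransmission and stream-based resumption can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the mechanism proposed in the paper: the names `ResumableSender`, `resume_from`, and `transfer` are hypothetical, no real sockets are involved, and the receiver's byte count is assumed to be available to the sender after the crash (which, as noted above, plain TCP does not provide).

```python
class ResumableSender:
    """Hypothetical sender that can rewind its stream position after a crash."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.sent = 0  # bytes handed to the connection so far

    def send(self, n: int) -> bytes:
        # Hand the next n bytes (or fewer at the end) to the connection.
        chunk = self.payload[self.sent:self.sent + n]
        self.sent += len(chunk)
        return chunk

    def resume_from(self, received: int) -> None:
        # Rewind to the byte count the receiver reports having received.
        if not 0 <= received <= self.sent:
            raise ValueError("receiver cannot have more bytes than were sent")
        self.sent = received


def transfer(payload: bytes, chunk_size: int, crash_at_chunk: int):
    """Simulate one connection crash; return (received data, total bytes sent)."""
    sender = ResumableSender(payload)
    received = bytearray()
    total_sent = 0
    chunk_index = 0
    while len(received) < len(payload):
        chunk = sender.send(chunk_size)
        total_sent += len(chunk)
        if chunk_index == crash_at_chunk:
            # Assumed failure model: this chunk is lost in flight.
            # After reconnecting, the sender resumes at the receiver's offset.
            sender.resume_from(len(received))
        else:
            received.extend(chunk)
        chunk_index += 1
    return bytes(received), total_sent


payload = bytes(range(256)) * 4  # 1024 bytes
data, total_sent = transfer(payload, chunk_size=100, crash_at_chunk=3)
```

In this run only the one lost 100-byte chunk is sent twice, so 1124 bytes cross the connection in total. Restarting the whole message after the same crash would cost the 400 bytes already sent plus another 1024.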


HTTP, Reliable Communication, TCP, Connection Crashes, Fault-tolerance, Stream-Based Solution


Dependable Distributed Systems


International Journal of Services Computing, May 2016
