File Exchange protocol

This page describes the underlying protocol used for the File Exchange.

The file exchange is used to transfer large or binary data, like the "files" stored in the BitRepository or the large results from requests.


Transporting files for bit-storage in the messages themselves is considered bad for the following reasons:

  • The bandwidth of the entire system would be limited by the bandwidth of the coordinating layer/message exchange system
  • It would introduce a considerable load on the coordinating layer

For those reasons, the protocol refers to files by URL. The protocol itself does not restrict which transfer protocol the URL uses. Which protocols can be used is left up to the specific implementation and the specific repository.

Below is a description of the specific implementation.

HTTPS as the underlying protocol

The protocol used for data exchange should have the following properties:

  • Be 8-bit clean - to maximize throughput
  • Be simple to handle in firewalls
  • Have widespread support
  • Support encryption/PKI
  • Support resumption/partial transfer of large files

All of these requirements are met by secure HTTP (HTTPS).

Application of HTTPS to data exchange

When transferring data back and forth between clients and pillars, every data transfer is initiated by a pillar. This requirement is introduced for two reasons. The first is that pillar machines are expected to be tucked away behind firewalls, possibly with NAT, which makes inbound connections to them very cumbersome. The second is security: every listening IP port on a server is a potential point of intrusion. Opening an outgoing connection is not without risk either, but the risk is limited to the actual duration of the transfer. The requirement does have an implication for data exchange: since no pillar accepts incoming connections, pillar-to-pillar transfers have to be indirect.

All communication *may* be encrypted and authenticated using server and client certificates (remember that the pillar is the HTTP client). In that case authenticity is guaranteed. Unencrypted transfers may have their uses, though - e.g. transferring data for presentation systems or simple harvesting of collections.
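As a rough illustration of the certificate setup, the sketch below builds a TLS context for a pillar acting as the HTTP client, using Python's standard ssl module. The function name make_pillar_tls_context and its parameters are assumptions made for this example and are not part of the protocol. The certificate paths are placeholders supplied by the caller.

```python
import ssl
from typing import Optional


def make_pillar_tls_context(
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
    ca_bundle: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS context for a pillar, which acts as the HTTP client.

    All names here are hypothetical; the paths are supplied by the caller.
    """
    # The default client context verifies the server certificate and hostname.
    context = ssl.create_default_context(cafile=ca_bundle)
    if client_cert is not None:
        # Present the pillar's own certificate so the server can authenticate it.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context
```

Calling the function without arguments gives server authentication only. Passing a client certificate and key adds authentication of the pillar as well.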

Because the pillar is always the HTTP client, the HTTP method GET is used for transferring data from client to pillar, while PUT is used for transferring data from pillar to client. This means that a "Put" message exchange will actually result in an HTTP GET and vice versa.
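The method mapping described above can be sketched as follows. This is a minimal illustration, assuming the message-level operations are identified by the strings "Put" and "Get". The helper build_transfer_request and the example URL are hypothetical. The request is only constructed here, not sent.

```python
import urllib.request
from typing import Optional

# Message-level operation -> HTTP method issued by the pillar (the HTTP client).
HTTP_METHOD_FOR_OPERATION = {
    "Put": "GET",  # client stores a file: the pillar fetches it from the URL
    "Get": "PUT",  # client retrieves a file: the pillar uploads it to the URL
}


def build_transfer_request(
    operation: str, file_url: str, payload: Optional[bytes] = None
) -> urllib.request.Request:
    """Build (but do not send) the HTTP request a pillar would issue."""
    method = HTTP_METHOD_FOR_OPERATION[operation]
    if method == "PUT" and payload is None:
        raise ValueError("a Get operation needs the file content to upload")
    # Only an upload carries a request body.
    body = payload if method == "PUT" else None
    return urllib.request.Request(file_url, data=body, method=method)
```

For example, build_transfer_request("Put", "https://fileexchange.example.org/file1") yields a request whose method is GET, which the pillar could then send with urllib.request.urlopen.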

Partial transfers are supported through the HTTP "Range" header. Using this feature may be desirable when only some segment of a file is needed - e.g. for streaming video - or for dividing the transfer of large files into more manageable chunks.
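To illustrate the chunked case, the sketch below splits a file of known size into byte ranges and formats each as a "Range" header value. The function names chunk_ranges and range_header are assumptions made for this example. HTTP byte ranges are inclusive at both ends, so the first 4 bytes of a file are "bytes=0-3".

```python
from typing import Iterator, Tuple


def chunk_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first_byte, last_byte) pairs covering total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for first in range(0, total_size, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield first, min(first + chunk_size, total_size) - 1


def range_header(first_byte: int, last_byte: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={first_byte}-{last_byte}"
```

A 10-byte file fetched in chunks of 4 bytes would use the header values "bytes=0-3", "bytes=4-7" and "bytes=8-9", each sent as the Range header of a separate GET request.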

Example

Clarifications

At first it may seem contradictory that messaging is initiated by the clients, while data transfers are initiated by a pillar. On the network layer, however, the pillar will always be the initiator of communications, not the target, as connections between pillar and queuing system are also initiated by the pillar.