Known side-effects from the current reading/validating strategy
Side-effects:
The payload is inaccessible when the "Content-Length" header is absent or invalid.
The WARC-Payload-Digest header is computed only for records with a defined payload whose leading header has been read. The WARC parser must therefore always identify and parse the HTTP response header; this cannot be optional.
For GZip'ed records a missing or invalid "Content-Length" is not a big problem, since we know the record ends when the GZip entry ends.
For uncompressed records, the payload input stream would have to look ahead for a valid WARC version line. At that point the payload stream should be closed and any bytes read beyond it pushed back onto the internal streams.
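
The look-ahead for uncompressed records could be sketched roughly as follows. This is only an illustration in Python, not the parser's actual code: `PushbackReader`, `read_payload` and the `VERSION_LINE` pattern are hypothetical names, and the sketch assumes the next record's version line starts on its own line and that the record is terminated by two CRLFs.

```python
import io
import re

# Assumed shape of a WARC version line, e.g. b"WARC/1.0\r\n".
VERSION_LINE = re.compile(rb"WARC/\d+\.\d+\r?\n")


class PushbackReader:
    """Hypothetical line reader that lets the caller push lines back."""

    def __init__(self, raw):
        self._raw = raw
        self._pushed = []  # stack of lines that were read too far

    def readline(self):
        # Serve pushed-back lines before touching the underlying stream.
        if self._pushed:
            return self._pushed.pop()
        return self._raw.readline()

    def unread(self, line):
        self._pushed.append(line)


def read_payload(reader):
    """Read payload lines until a WARC version line or end of input."""
    chunks = []
    while True:
        line = reader.readline()
        if not line:
            break  # end of input: the payload runs to the end
        if VERSION_LINE.fullmatch(line):
            # This line belongs to the next record: push it back and stop.
            reader.unread(line)
            break
        chunks.append(line)
    payload = b"".join(chunks)
    # Drop the two CRLFs that terminate a record, if present.
    if payload.endswith(b"\r\n\r\n"):
        payload = payload[:-4]
    return payload


# Usage: the reader is positioned just after the record header.
stream = io.BytesIO(b"hello\nworld\r\n\r\nWARC/1.0\r\nWARC-Type: resource\r\n")
reader = PushbackReader(stream)
payload = read_payload(reader)
next_line = reader.readline()
```

One limitation of this approach is that a payload which itself contains a line that looks like a WARC version line would be cut short, so the look-ahead is a heuristic rather than a reliable replacement for a valid "Content-Length".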