Adaptive RAM/file-backed HTTP bodies.


Keywords
http, buffer
Licenses
MIT/Apache-2.0

Documentation

The Body Image Project


A Rust language project providing separately usable but closely related crates:

The namesake body-image crate provides a uniform access strategy for HTTP body payloads which may be scattered across multiple allocations in RAM, or buffered to a temporary file and optionally memory-mapped. This effectively trades some file I/O cost for support of significantly larger bodies without risk of exhausting RAM.

The body-image-futio crate integrates the body-image crate with futures, http, hyper, and tokio for both client and server use.

The barc crate provides the Body Archive (BARC) container file format, with a reader and writer. It supports high-fidelity, human-readable serialization of complete HTTP request/response dialogs with additional metadata, and has broad use cases as test fixtures, for caching, or for web crawling.

The barc-cli crate provides a command line tool for printing, recording, de-/compressing, and copying BARC records.

See the above rustdoc links or the README(s) and CHANGELOG(s) under the individual crate directories for more details.
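
As a rough illustration of the strategy described for the body-image crate, the following standard-library-only sketch accumulates chunks in RAM and spills to a temporary file once a size threshold is crossed. The names `AdaptiveBody`, `Storage`, `max_ram`, `write_chunk` and `read_all` are hypothetical and chosen for this example; this is not the body-image API, and it omits memory mapping and cleanup of a partially written file on error.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counter used only to build unique temporary file names in this sketch.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

// Hypothetical storage states; not the body-image crate's types.
enum Storage {
    // Chunks kept as separate allocations in RAM.
    Ram(Vec<Vec<u8>>),
    // Body spilled to a temporary file.
    File { file: File, path: PathBuf },
}

pub struct AdaptiveBody {
    max_ram: usize,
    len: usize,
    storage: Storage,
}

impl AdaptiveBody {
    pub fn new(max_ram: usize) -> Self {
        AdaptiveBody { max_ram, len: 0, storage: Storage::Ram(Vec::new()) }
    }

    pub fn is_ram(&self) -> bool {
        matches!(self.storage, Storage::Ram(_))
    }

    pub fn len(&self) -> usize {
        self.len
    }

    // Append a chunk, moving everything to a file if `max_ram` would be exceeded.
    pub fn write_chunk(&mut self, chunk: &[u8]) -> io::Result<()> {
        if let Storage::Ram(chunks) = &self.storage {
            if self.len + chunk.len() > self.max_ram {
                let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
                let name = format!("adaptive-body-{}-{}.tmp", std::process::id(), id);
                let path = std::env::temp_dir().join(name);
                let mut file = OpenOptions::new()
                    .read(true)
                    .write(true)
                    .create_new(true)
                    .open(&path)?;
                // Copy the chunks already held in RAM before switching state.
                for c in chunks {
                    file.write_all(c)?;
                }
                self.storage = Storage::File { file, path };
            }
        }
        match &mut self.storage {
            Storage::Ram(chunks) => chunks.push(chunk.to_vec()),
            Storage::File { file, .. } => file.write_all(chunk)?,
        }
        self.len += chunk.len();
        Ok(())
    }

    // Uniform read access: the caller does not need to know where the body lives.
    pub fn read_all(&mut self) -> io::Result<Vec<u8>> {
        match &mut self.storage {
            Storage::Ram(chunks) => Ok(chunks.concat()),
            Storage::File { file, .. } => {
                file.seek(SeekFrom::Start(0))?;
                let mut out = Vec::with_capacity(self.len);
                file.read_to_end(&mut out)?;
                Ok(out)
            }
        }
    }
}

impl Drop for AdaptiveBody {
    fn drop(&mut self) {
        // Best-effort removal of the temporary file, if one was created.
        if let Storage::File { path, .. } = &self.storage {
            let _ = fs::remove_file(path);
        }
    }
}
```

With `AdaptiveBody::new(8)`, writing `b"hello"` keeps the body in RAM, while a further `b" world"` pushes the total to 11 bytes and moves it to a temporary file; `read_all` returns the same bytes in either state.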

Rationale

HTTP sets no limits on request or response body payload sizes, and in general-purpose libraries or services, we are reluctant to enforce the low maximum size constraints necessary to guarantee sufficient RAM and reliable software. This is exacerbated by all of the following:

  • The concurrent processing potential afforded by both threads and Rust's asynchronous facilities: The RAM available to each body is the total available RAM divided by the maximum number of request/response bodies in memory at any one point in time.

  • With chunked transfer encoding, we frequently don't know the size of the body until it is fully downloaded (no Content-Length header).

  • Transfer or Content-Encoding compression: Even if the compressed body fits in memory, the decompressed version may not, and its final size is not known in advance.

  • Constrained memory: Virtual machines and containers tend to have less RAM than our development environments, as do mobile devices. Swap space is frequently not configured, or if used, results in poor performance.
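
The arithmetic behind the first two points can be sketched as follows. The function and type names (`per_body_ram_budget`, `initial_placement`, `Placement`) are hypothetical and belong to no crate in this project, and the figures in the usage note are illustrative assumptions only.

```rust
// Where a body might be placed, decided before any of it has been read.
#[derive(Debug, PartialEq, Eq)]
pub enum Placement {
    // Declared length fits within the per-body budget.
    Ram,
    // Declared length exceeds the budget; buffer to disk from the start.
    Spill,
    // No Content-Length (e.g. chunked transfer encoding): start in RAM and
    // re-check the budget as each chunk arrives.
    DecideWhileStreaming,
}

// RAM available to each body when `max_concurrent_bodies` may be held at once.
// Returns None when the divisor is zero.
pub fn per_body_ram_budget(available_ram: u64, max_concurrent_bodies: u64) -> Option<u64> {
    available_ram.checked_div(max_concurrent_bodies)
}

// Assumes `content_length` describes the body as it will be held in memory;
// a compressed body that is decoded on receipt may still outgrow the budget.
pub fn initial_placement(content_length: Option<u64>, budget: u64) -> Placement {
    match content_length {
        Some(len) if len <= budget => Placement::Ram,
        Some(_) => Placement::Spill,
        None => Placement::DecideWhileStreaming,
    }
}
```

For example, if 2 GiB is set aside for bodies and up to 4096 bodies may be in memory at once, `per_body_ram_budget` yields 512 KiB per body, which is far smaller than many real-world payloads.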

Opinions and implementations differ on this topic. For example, HAProxy, which is a RAM-only proxy by design, recently introduced a "small object cache" limited by default to complete responses of 16 KiB. Nginx, by comparison, offers a hybrid RAM and disk design: when buffering proxied responses, by current defaults on x86_64 it keeps 64 KiB in RAM before buffering to disk, where the response is finally limited to 1 GiB.

This author thinks the operational trend toward denser virtual allocation instead of growth in per-instance RAM, combined with the increasing availability of fast solid-state disks (e.g. NVMe SSDs), makes hybrid approaches more favorable to more applications than was the case in the recent past.

License

This project is dual licensed under either of the following:

  • The Apache License, Version 2.0

  • The MIT License

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the body image project by you, as defined by the Apache License, shall be dual licensed as above, without any additional terms or conditions.