Security
Compression libraries can be attack vectors if not used carefully. This guide covers security considerations and best practices.
Decompression Bombs
Section titled “Decompression Bombs”A decompression bomb (zip bomb) is malicious compressed data that expands to an enormous size, causing denial of service.
The Threat
Section titled “The Threat”Example: A 42 KB file that expands to 4.5 PB (petabytes)Attackers use highly repetitive data that compresses extremely well:
Original: 4.5 PB of zerosCompressed: ~42 KBExpansion ratio: 100,000,000,000:1Protection
Section titled “Protection”Always use max_output_size for untrusted data:
const cz = @import("compressionz");
pub fn safeDecompress(untrusted_data: []const u8, allocator: std.mem.Allocator) ![]u8 { return cz.decompressWithOptions(.gzip, untrusted_data, allocator, .{ .max_output_size = 100 * 1024 * 1024, // 100 MB limit });}Recommended Limits
Section titled “Recommended Limits”| Context | Limit | Rationale |
|---|---|---|
| User uploads | 10-100 MB | Reasonable file sizes |
| API requests | 1-10 MB | Prevent resource exhaustion |
| Config files | 1 MB | Configs are small |
| Log entries | 10 KB | Individual entries |
| Internal systems | Higher or none | Trusted sources |
Example: HTTP Handler
Section titled “Example: HTTP Handler”const cz = @import("compressionz");
pub fn handleUpload(request: *Request, allocator: std.mem.Allocator) !Response { const content_encoding = request.headers.get("Content-Encoding"); const body = request.body;
const data = if (content_encoding) |encoding| blk: { if (std.mem.eql(u8, encoding, "gzip")) { break :blk cz.decompressWithOptions(.gzip, body, allocator, .{ .max_output_size = 10 * 1024 * 1024, // 10 MB }) catch |err| { if (err == error.OutputTooLarge) { return Response.badRequest("Request too large"); } return Response.badRequest("Invalid compressed data"); }; } break :blk body; } else body;
// Process data...}Input Validation
Section titled “Input Validation”Never Trust Compressed Data
Section titled “Never Trust Compressed Data”Compressed data can be:
- Corrupted (accidental or malicious)
- Wrong format
- Truncated
- Crafted to exploit vulnerabilities
Always Handle Errors
Section titled “Always Handle Errors”const result = cz.decompress(.zstd, data, allocator) catch |err| switch (err) { error.InvalidData => { log.warn("Received invalid compressed data", .{}); return error.BadInput; }, error.ChecksumMismatch => { log.warn("Compressed data failed integrity check", .{}); return error.CorruptedData; }, error.OutputTooLarge => { log.warn("Decompression bomb detected", .{}); return error.MaliciousInput; }, error.UnexpectedEof => { log.warn("Truncated compressed data", .{}); return error.IncompleteData; }, else => { log.err("Unexpected decompression error: {}", .{err}); return error.InternalError; },};Timing Attacks
Section titled “Timing Attacks”Compression ratios can leak information about plaintext content.
The CRIME/BREACH Attack
Section titled “The CRIME/BREACH Attack”If you compress data that includes:
- A secret (e.g., session token)
- Attacker-controlled input
The attacker can guess the secret by observing compressed size differences.
Mitigation
Section titled “Mitigation”Don’t compress data containing both secrets and user input:
// DANGEROUS: Combines secret and user inputconst bad_response = try std.fmt.allocPrint(allocator, "token={s}&user_input={s}", .{ secret_token, user_input });const compressed = try cz.compress(.gzip, bad_response, allocator);// Attacker can guess secret_token by varying user_input!
// SAFER: Compress separately or don't compress secretsconst response_data = try std.fmt.allocPrint(allocator, "data={s}", .{ user_input });const compressed = try cz.compress(.gzip, response_data, allocator);// Secret token sent via separate channel (e.g., cookie)Web Application Recommendations
Section titled “Web Application Recommendations”- Use random padding in responses
- Don’t compress pages with CSRF tokens and user input
- Consider disabling compression for authenticated pages
- Use per-request secrets that can’t be guessed incrementally
Memory Safety
Section titled “Memory Safety”compressionz is written in memory-safe Zig with vendored C libraries.
Buffer Overflows
Section titled “Buffer Overflows”The C libraries (zstd, zlib, brotli) are mature and well-audited. However:
- Always use the latest versions (vendored versions are tested)
- Report any crashes to help improve safety
Integer Overflows
Section titled “Integer Overflows”Size calculations use safe Zig arithmetic:
// Zig catches overflow at runtime (in safe builds)const size: usize = a +% b; // Wrapping add (explicit)const size: usize = a + b; // Panics on overflow (default)Dictionary Security
Section titled “Dictionary Security”If using dictionary compression:
Don’t Expose Dictionary Contents
Section titled “Don’t Expose Dictionary Contents”Dictionaries may contain sensitive patterns from training data.
// Dictionary trained on user dataconst dict = trainDictionary(user_messages);
// DANGEROUS: Sending dictionary to clienttry sendToClient(dict); // May leak private data patterns!
// SAFE: Keep dictionary server-side onlyconst compressed = try compressWithDict(message, dict);try sendToClient(compressed);Version Dictionaries
Section titled “Version Dictionaries”Changing dictionaries can break decompression:
pub const DictVersion = enum(u8) { v1 = 1, v2 = 2, current = 2,};
pub fn compress(data: []const u8, allocator: std.mem.Allocator) ![]u8 { const dict = dictionaries[@intFromEnum(DictVersion.current)]; const compressed = try cz.compressWithOptions(.zstd, data, allocator, .{ .dictionary = dict, });
// Prefix with version for forward compatibility var result = try allocator.alloc(u8, 1 + compressed.len); result[0] = @intFromEnum(DictVersion.current); @memcpy(result[1..], compressed); return result;}Secure Defaults
Section titled “Secure Defaults”Use Checksums
Section titled “Use Checksums”Enable checksums for data integrity:
// LZ4 frame with checksum (default)const compressed = try cz.compressWithOptions(.lz4, data, allocator, .{ .include_checksum = true, // Default});
// Zstd includes checksum by defaultconst zstd = try cz.compress(.zstd, data, allocator);
// Gzip includes CRC32 by defaultconst gzip = try cz.compress(.gzip, data, allocator);Verify on Decompression
Section titled “Verify on Decompression”Checksums are verified automatically:
const result = cz.decompress(.lz4, data, allocator) catch |err| { if (err == error.ChecksumMismatch) { // Data was corrupted or tampered with return error.IntegrityCheckFailed; } return err;};Security Checklist
Section titled “Security Checklist”- Set
max_output_sizefor untrusted data - Handle all decompression errors
- Don’t compress secrets with user input
- Keep dictionaries private
- Enable checksums for data integrity
- Validate data after decompression
- Log security-relevant errors
- Keep compressionz updated