Zstandard (Zstd)

Zstandard (Zstd) is a fast compression algorithm providing high compression ratios. It’s the recommended codec for most use cases.

| Property | Value |
| --- | --- |
| Developer | Meta (Facebook) |
| First release | 2016 |
| Implementation | Vendored C (zstd 1.5.7) |
| License | BSD |
| Level | Compress | Decompress | Ratio |
| --- | --- | --- | --- |
| fast | 12.2 GB/s | 11.4 GB/s | 99.9% |
| default | 12.0 GB/s | 11.6 GB/s | 99.9% |
| best | 1.3 GB/s | 12.1 GB/s | 99.9% |
- ✅ Streaming compression/decompression
- ✅ Dictionary compression
- ✅ Content checksum
- ✅ Auto-detection (magic bytes)
- ✅ Multiple compression levels
```zig
const cz = @import("compressionz");

// Compress
const compressed = try cz.compress(.zstd, data, allocator);
defer allocator.free(compressed);

// Decompress
const decompressed = try cz.decompress(.zstd, compressed, allocator);
defer allocator.free(decompressed);
```

```zig
// Fast compression (12+ GB/s)
const fast = try cz.compressWithOptions(.zstd, data, allocator, .{
    .level = .fast,
});

// Default (balanced)
const default = try cz.compressWithOptions(.zstd, data, allocator, .{
    .level = .default,
});

// Best compression (1.3 GB/s, maximum ratio)
const best = try cz.compressWithOptions(.zstd, data, allocator, .{
    .level = .best,
});
```
| Level | Use case |
| --- | --- |
| fastest | Real-time, CPU-bound |
| fast | General purpose, speed priority |
| default | Recommended (balanced) |
| better | Storage, archival |
| best | Maximum compression, one-time |
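The speed/ratio trade-off behaves the same way in any LZ-based codec. A quick stdlib Python sketch (using `zlib` as a stand-in, since the principle is codec-agnostic) shows higher levels spending more CPU for smaller output while decoding identically:

```python
import zlib

# Compressible sample data (repeated structured text)
data = b'{"level": "info", "msg": "request handled", "ms": 12}\n' * 2000

fast = zlib.compress(data, level=1)  # cheap, larger output
best = zlib.compress(data, level=9)  # expensive, smaller output

# Higher levels never decode differently; they only pack tighter
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data
assert len(best) <= len(fast)
```

The same invariant holds for zstd's `fast`/`default`/`best`: the level changes encoder effort, never the decoded bytes.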

Zstd excels at dictionary compression for small data with known patterns:

```zig
// Create or load a dictionary
const dictionary = @embedFile("my_dictionary.bin");

// Compress with dictionary
const compressed = try cz.compressWithOptions(.zstd, data, allocator, .{
    .dictionary = dictionary,
});
defer allocator.free(compressed);

// Decompress with the same dictionary
const decompressed = try cz.decompressWithOptions(.zstd, compressed, allocator, .{
    .dictionary = dictionary,
});
defer allocator.free(decompressed);
```

Without dictionary (1 KB JSON):

- Original: 1,024 bytes
- Compressed: 890 bytes (13% reduction)

With trained dictionary:

- Original: 1,024 bytes
- Compressed: 312 bytes (70% reduction)

Use the zstd CLI to train a dictionary:

```sh
# Collect representative samples
ls samples/*.json > sample_files.txt

# Train dictionary (32 KB is a good size)
zstd --train -o my_dictionary.bin --maxdict=32768 samples/*.json
```
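The mechanism behind the numbers above can be sketched with Python's stdlib: `zlib` supports preset dictionaries for DEFLATE, the idea zstd dictionaries generalize. Seeding the compressor with a representative sample lets a small payload reference shared structure instead of re-emitting it (illustrative analogy, not the zstd dictionary format):

```python
import zlib

# A representative sample acts as the "dictionary"
dictionary = b'{"user": {"id": 0, "name": "", "email": "", "active": true}}'

# A small payload sharing the dictionary's structure
payload = b'{"user": {"id": 42, "name": "ada", "email": "a@example.com", "active": true}}'

def compress(data: bytes, zdict: bytes = b"") -> bytes:
    c = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return c.compress(data) + c.flush()

def decompress(data: bytes, zdict: bytes = b"") -> bytes:
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(data)

plain = compress(payload)
with_dict = compress(payload, dictionary)

# Both sides must agree on the dictionary, exactly as with zstd
assert decompress(with_dict, dictionary) == payload
assert len(with_dict) < len(plain)
```

As with zstd, the dictionary is out-of-band: the compressed stream is useless to a decompressor that does not hold the same dictionary.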

Process large files without loading them into memory:

```zig
const cz = @import("compressionz");
const std = @import("std");

// Streaming compression
pub fn compressFile(allocator: std.mem.Allocator, input: []const u8, output: []const u8) !void {
    const in_file = try std.fs.cwd().openFile(input, .{});
    defer in_file.close();

    const out_file = try std.fs.cwd().createFile(output, .{});
    defer out_file.close();

    var comp = try cz.compressor(.zstd, allocator, out_file.writer(), .{});
    defer comp.deinit();

    var buf: [65536]u8 = undefined;
    while (true) {
        const n = try in_file.read(&buf);
        if (n == 0) break;
        try comp.writer().writeAll(buf[0..n]);
    }
    try comp.finish();
}

// Streaming decompression
pub fn decompressFile(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    const file = try std.fs.cwd().openFile(input, .{});
    defer file.close();

    var decomp = try cz.decompressor(.zstd, allocator, file.reader());
    defer decomp.deinit();

    return decomp.reader().readAllAlloc(allocator, 1024 * 1024 * 1024);
}
```
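The chunked feed/finish loop above is the universal streaming-compression pattern. A stdlib Python sketch of the same shape, using `zlib`'s incremental objects and in-memory streams (not the compressionz API):

```python
import io
import zlib

CHUNK = 64 * 1024  # same 64 KiB buffer size as the Zig example

def compress_stream(src, dst) -> None:
    """Feed fixed-size chunks through an incremental compressor, then finish."""
    comp = zlib.compressobj()
    while chunk := src.read(CHUNK):
        dst.write(comp.compress(chunk))
    dst.write(comp.flush())  # plays the role of comp.finish()

def decompress_stream(src, dst) -> None:
    decomp = zlib.decompressobj()
    while chunk := src.read(CHUNK):
        dst.write(decomp.decompress(chunk))
    dst.write(decomp.flush())

data = b"log line\n" * 100_000
compressed = io.BytesIO()
compress_stream(io.BytesIO(data), compressed)

restored = io.BytesIO()
decompress_stream(io.BytesIO(compressed.getvalue()), restored)
assert restored.getvalue() == data
```

Memory use stays bounded by the chunk size plus the codec's window, regardless of input size; the final `flush()`/`finish()` call is what emits the trailing frame data, so forgetting it truncates the stream.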

For advanced control:

```zig
const cz = @import("compressionz");

// Simple, with an explicit level
const compressed = try cz.zstd.compress(data, allocator, .fast);

// With a dictionary
const dict_compressed = try cz.zstd.compressWithDict(
    data,
    allocator,
    .default,
    dictionary,
);

// Decompression with dictionary and size limit
const decompressed = try cz.zstd.decompressWithDict(
    compressed,
    allocator,
    max_size, // null for no limit
    dictionary, // null if no dictionary
);
```
Zstd frame layout:

```
┌──────────────────────────────────────────┐
│ Magic Number (4 bytes): 0xFD2FB528       │
├──────────────────────────────────────────┤
│ Frame Header                             │
│   - Frame Header Descriptor (1 byte)     │
│   - Window Descriptor (0-1 bytes)        │
│   - Dictionary ID (0-4 bytes)            │
│   - Frame Content Size (0-8 bytes)       │
├──────────────────────────────────────────┤
│ Data Blocks                              │
│   - Block Header (3 bytes)               │
│   - Block Data (variable)                │
│   - ... more blocks ...                  │
├──────────────────────────────────────────┤
│ Checksum (4 bytes, optional)             │
└──────────────────────────────────────────┘
```
```zig
// Zstd magic number (little-endian)
const ZSTD_MAGIC: u32 = 0xFD2FB528;

// Detection
if (data.len >= 4 and
    data[0] == 0x28 and data[1] == 0xB5 and
    data[2] == 0x2F and data[3] == 0xFD)
{
    // It's Zstd
}
```
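The variable-width fields in the diagram are decoded mechanically from the one-byte Frame Header Descriptor. A Python sketch following the field layout specified in RFC 8878 (an illustration of the format, not the vendored C implementation):

```python
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # 0xFD2FB528 stored little-endian

# Frame_Content_Size field width, indexed by the 2-bit FCS flag
FCS_BYTES = {0: 0, 1: 2, 2: 4, 3: 8}
# Dictionary_ID field width, indexed by the 2-bit Dictionary_ID flag
DID_BYTES = {0: 0, 1: 1, 2: 2, 3: 4}

def parse_frame_header(data: bytes) -> dict:
    """Decode the descriptor byte that follows the magic number."""
    if data[:4] != ZSTD_MAGIC:
        raise ValueError("not a zstd frame")
    fhd = data[4]  # Frame Header Descriptor
    fcs_flag = fhd >> 6              # bits 7-6
    single_segment = (fhd >> 5) & 1  # bit 5
    checksum = (fhd >> 2) & 1        # bit 2
    did_flag = fhd & 0b11            # bits 1-0
    fcs_bytes = FCS_BYTES[fcs_flag]
    if fcs_flag == 0 and single_segment:
        fcs_bytes = 1  # single-segment frames always carry a content size
    return {
        "single_segment": bool(single_segment),
        "has_checksum": bool(checksum),
        "window_descriptor_bytes": 0 if single_segment else 1,
        "dictionary_id_bytes": DID_BYTES[did_flag],
        "frame_content_size_bytes": fcs_bytes,
    }

# Descriptor 0x04: content-checksum flag set, everything else default
hdr = parse_frame_header(ZSTD_MAGIC + bytes([0x04]))
assert hdr["has_checksum"] and hdr["window_descriptor_bytes"] == 1
```

This is how the optional checksum advertised in the feature list is signaled: bit 2 of the descriptor tells the decoder whether the 4-byte checksum trailer is present.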

Zstd uses a combination of techniques:

1. LZ77 matching: find repeated sequences
2. Finite State Entropy (FSE): entropy coding for the match/length sequences
3. Huffman coding: entropy coding for literals
4. Repeat offsets: cache recent match offsets

Its speed comes from being:

- Optimized for modern CPUs (cache-friendly)
- SIMD-accelerated (SSE2, AVX2, NEON)
- Efficient at entropy coding
- Tuned for real-world data patterns
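A toy version of the first and last techniques, greedy LZ77 matching with a repeat-offset cache (illustrative only; real zstd adds entropy coding and far smarter match search):

```python
def lz77_compress(data: bytes, window: int = 4096, min_match: int = 4):
    """Greedy LZ77: emit (offset, length) for repeats, raw bytes otherwise."""
    tokens, i, last_offset = [], 0, 0
    while i < len(data):
        best_len, best_off = 0, 0
        start = max(0, i - window)
        # Try the cached repeat offset first, then scan the window
        candidates = ([i - last_offset] if 0 < last_offset <= i else []) + list(range(start, i))
        for j in candidates:
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append(("match", best_off, best_len))
            last_offset = best_off  # repeat-offset cache
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens

def lz77_decompress(tokens) -> bytes:
    out = bytearray()
    for t in tokens:
        if t[0] == "lit":
            out.append(t[1])
        else:  # copy `length` bytes from `offset` back (may overlap itself)
            _, off, length = t
            for _ in range(length):
                out.append(out[-off])
    return bytes(out)

sample = b"abcabcabcabcxyz"
tokens = lz77_compress(sample)
assert lz77_decompress(tokens) == sample
```

The repeat-offset cache pays off on structured data (logs, JSON), where the same back-reference distance recurs; zstd keeps three such cached offsets and encodes hits to them very cheaply.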

Best for:

- General-purpose compression
- Databases and data stores
- Log file compression
- Network protocols
- Any case without specific requirements

Not ideal for:

- Maximum speed (use LZ4)
- Web compatibility (use Gzip/Brotli)
- Pure Zig requirement (use LZ4/Snappy)
| Metric | Zstd | LZ4 | Gzip |
| --- | --- | --- | --- |
| Compress | 12 GB/s | 36 GB/s | 2.4 GB/s |
| Decompress | 11.6 GB/s | 8.1 GB/s | 2.4 GB/s |
| Ratio | 99.9% | 99.5% | 99.2% |
| Dictionary | ✅ | ✅ | ❌ |
| Streaming | ✅ | ✅ | ✅ |