VelixLab
Embedded & Low-Level Engineering

RACSS: compress large static datasets and still access any record instantly

No bulk-decompression step. True O(1) random access to any record.
Designed for static / read-mostly data where stream compression becomes a bottleneck.


What it solves

Keeping large static datasets small on storage without “inflate-the-blob-first” workflows and without unpredictable tail latency.

What you get

Random-access retrieval from compressed form, deterministic decoding, tiny runtime state, and a small decompressor suitable for audit-friendly stacks.

Where it pays

CDN/edge replication, AI infra NVMe pressure, embedded flash/RAM budgets — places where footprint multiplies and latency spikes are expensive.

Want the raw technical details and the C demo decoder? It’s all linked from the RACSS pages.

Where RACSS is a direct win

  • CDN / edge caches / object storage: replicated static datasets where savings multiply across PoPs and tiers; better cache density per NVMe/SSD.
  • AI / LLM infrastructure: reduce NVMe pressure on GPU nodes for static side-data (tokenizer assets, ID maps, dictionaries, lookup tables, metadata catalogs); predictable tail latency.
  • Embedded / IoT / legacy devices: smaller flash footprint or more features on the same hardware; deterministic runtime behavior.

Not a general-purpose compressor

RACSS is optimized for collections of strings/records with random reads. If your workload is a single big stream, gzip/zstd are usually the right tool.

Predictable performance

No adaptive models, no global stream history. You can decode one record without touching unrelated data. This is the whole point.

Low integration risk

Small, auditable decompressor; minimal runtime state; designed for “boring” engineering: easy to integrate, test, and maintain.

How to evaluate RACSS quickly

A simple PoC flow (fast and measurable):

  1. Pick one representative static dataset (strings, metadata, catalogs, dictionaries).
  2. We produce a benchmark report: footprint vs gzip/zstd (including index overhead), random-access latency distribution (p50/p99), and peak memory during retrieval.
  3. If numbers are good: integrate the tiny decoder and validate on your target environment.

If you can’t share internal data, we can benchmark on a public proxy dataset with similar structure.

Contact us to discuss RACSS or to request a technical evaluation. We’ll reply by email.