store smaller.
retrieve faster.

the fastest compressed retrieval engine for ai.

90% smaller embeddings

minvector semantically compresses embeddings with high fidelity.

97% accuracy retained

maintains near-teacher semantic quality while drastically reducing size.

drop-in retrieval engine

provide chunked text → daapbase handles embedding, compression, storage, and retrieval.

compact storage and high-speed retrieval for ai workloads.

a compressed retrieval engine built for ai systems that rely on embeddings and semantic search.

90%: compact model that semantically compresses teacher embeddings
5.6×: faster retrieval with near-teacher accuracy
97%: preserves semantic structure while reducing dimensionality

1

semantic compression without quality loss

minvector reduces vector dimensions while maintaining semantic relationships and search accuracy.
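minvector's learned compression method is not detailed here, but the idea of shrinking dimensions while keeping semantic relationships can be illustrated with a much simpler stand-in: a random Johnson-Lindenstrauss-style projection from 768 to 128 dimensions already keeps cosine similarities roughly intact (a learned compressor aims to do far better). everything in this sketch is illustrative, not minvector's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

def compress(vectors, out_dim=128):
    """project vectors to a lower dimension with a random gaussian map
    (a johnson-lindenstrauss-style sketch, NOT minvector's learned method)."""
    in_dim = vectors.shape[1]
    proj = rng.normal(size=(in_dim, out_dim)) / np.sqrt(out_dim)
    return vectors @ proj

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# two 768-d "teacher" embeddings with a known relationship
x = rng.normal(size=768)
y = x + 0.3 * rng.normal(size=768)   # semantically close to x

small = compress(np.stack([x, y]))   # 128-d versions

# cosine similarity in 128-d stays close to the 768-d value
print(cosine(x, y), cosine(small[0], small[1]))
```

even this naive projection keeps the two similarity scores within a few hundredths of each other; the point of a learned compressor is to close that gap further at the same 6× size reduction.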

2

optimized for production scale

dramatically lower memory footprint and compute costs without sacrificing performance.

3

millisecond-level query latency

compact vectors enable faster ann search, reducing retrieval time by over 5×.
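the 5.6× figure above is the product's own benchmark; the underlying intuition is simply that exact similarity search costs scale linearly with dimension, so 128-d vectors need roughly 6× fewer multiply-adds per query than 768-d. a minimal brute-force timing sketch (numbers vary by machine; this is not the product's ann implementation):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # database size

# same corpus at teacher (768-d) and compressed (128-d) widths
db768 = rng.normal(size=(n, 768)).astype(np.float32)
db128 = rng.normal(size=(n, 128)).astype(np.float32)
q768 = rng.normal(size=768).astype(np.float32)
q128 = rng.normal(size=128).astype(np.float32)

def topk(db, q, k=10):
    scores = db @ q                       # brute-force inner-product search
    return np.argpartition(-scores, k)[:k]  # indices of the k best matches

for db, q, name in [(db768, q768, "768-d"), (db128, q128, "128-d")]:
    t0 = time.perf_counter()
    topk(db, q)
    print(name, f"{(time.perf_counter() - t0) * 1e3:.2f} ms")
```

on most hardware the 128-d pass finishes several times faster, and real ann indexes (hnsw, ivf) widen the gap further because smaller vectors also fit more of the index in cache.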

powering the next generation of ai workloads with efficient, scalable embeddings.

minvector: a new research approach to compact embeddings.

novel semantic compression method that preserves concept structure under extreme dimensionality reduction

128-dim vs 768-dim: up to 90% smaller embeddings
97%: cosine similarity vs teacher model
0.986: recall@10
7.88 ms vs 43.95 ms: retrieval latency
5.6× faster: search throughput

performance comparison

smaller footprint + lower latency = better performance

embedding footprint

% of baseline (lower is better)

minvector-128: 17%
pca-128: 17%
pq-768: 12%
teacher-768: 100%
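the ~17% footprint for the 128-dim variants follows from dimension count alone, assuming float32 storage (4 bytes per dimension); a quick sanity check:

```python
BYTES_PER_DIM = 4                      # float32
teacher = 768 * BYTES_PER_DIM          # 3072 bytes per vector
compact = 128 * BYTES_PER_DIM          # 512 bytes per vector
ratio = compact / teacher
print(f"{teacher} B -> {compact} B ({ratio:.1%} of baseline)")
# 3072 B -> 512 B (16.7% of baseline)
```

index overhead and metadata push the stored figure slightly above the raw 16.7%, consistent with the 17% shown in the chart.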

retrieval latency

milliseconds (lower is better)

minvector-128: 7.88 ms
pca-128: 14 ms
pq-768: 34.4 ms
teacher-768: 43.95 ms

minvector-128 achieves up to 5.6× faster retrieval while maintaining 90% compression.

how it works

01

provide chunked text

you bring your own segmentation strategy.

02

embed + semantic compression

daapbase embeds and compresses each chunk using minvector.

03

compact storage

optimized memory layer under the hood (chroma-backed).

04

fast retrieval

query text → receive top-matching chunks in milliseconds.
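the four steps above can be sketched end to end. the class and method names below (DaapBase, add, query) and the hash-based stand-in embedding are illustrative assumptions, not the published api; step 02 would really call the embedding model plus minvector, and step 03 would hit the chroma-backed store.

```python
import hashlib

class DaapBase:
    """toy sketch of the four-step flow; names and internals are
    illustrative assumptions, not the real daapbase api."""

    def __init__(self):
        self.chunks, self.vectors = [], []

    def _embed_compress(self, text):
        # stand-in for step 02 (embed + minvector compression): hash the
        # text into a small deterministic vector so the example runs
        # without any model
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255 for b in digest[:16]]

    def add(self, chunks):
        # steps 01 + 03: ingest caller-provided chunks, store compact vectors
        for chunk in chunks:
            self.chunks.append(chunk)
            self.vectors.append(self._embed_compress(chunk))

    def query(self, text, k=2):
        # step 04: rank stored chunks by cosine similarity to the query
        q = self._embed_compress(text)
        def cos(v):
            dot = sum(a * b for a, b in zip(q, v))
            return dot / ((sum(a * a for a in q) * sum(b * b for b in v)) ** 0.5)
        ranked = sorted(zip(self.chunks, self.vectors), key=lambda cv: -cos(cv[1]))
        return [chunk for chunk, _ in ranked[:k]]
```

usage mirrors the flow: `db.add(your_chunks)` once at ingest, then `db.query("question", k=5)` per request; the caller never touches vectors directly.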

key use-cases

where latency, cost, and scale drive roi

high-scale ai retrieval systems where latency and cost matter.

enterprise ai assistants and copilots serving thousands of semantic queries per hour.

large knowledge bases (millions–billions of chunks) requiring multilingual, cross-domain retrieval.

ai infrastructure optimization that reduces cost without sacrificing accuracy.

high-qps search systems requiring real-time semantic retrieval.

examples include customer support, legal search, and documentation portals.

what you get

minvector embeddings

compact, high-fidelity semantic vectors.

semantic compression engine

novel technique preserving structure while reducing dimensionality.

fast memory store

compact storage layer backed by optimized chroma indexing.

retrieval api

efficient top-k search with minimal latency.

pricing (coming soon)

pricing will be usage-based, scaling with number of stored chunks, retrieval volume, and footprint size. early-access users will receive free credits and discounted tiers.

store smaller.
retrieve faster.

join early access and experience the performance leap of minvector.