store smaller.
retrieve faster.

the fastest compressed retrieval engine for ai.

90% smaller embeddings

minvector semantically compresses embeddings with high fidelity.

97% accuracy retained

maintains near-teacher semantic quality while drastically reducing size.

drop-in retrieval engine

provide chunked text → daapbase handles embedding, compression, storage, and retrieval.

compact storage and high-speed retrieval for ai workloads.

a compressed retrieval engine built for ai systems that rely on embeddings and semantic search.

90%: compact model that semantically compresses teacher embeddings
5.6×: faster retrieval with near-teacher accuracy
97%: preserves semantic structure while reducing dimensionality

1

semantic compression without quality loss

minvector reduces vector dimensions while maintaining semantic relationships and search accuracy.
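minvector's learned compression method is not detailed here, but the idea of shrinking dimensions while keeping semantic relationships can be illustrated with a much simpler stand-in: a random Johnson-Lindenstrauss-style projection from 768 to 128 dimensions already keeps cosine similarities roughly intact (a learned compressor aims to do far better). everything in this sketch is illustrative, not minvector's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

def compress(vectors, out_dim=128):
    """project vectors to a lower dimension with a random gaussian map
    (a johnson-lindenstrauss-style sketch, NOT minvector's learned method)."""
    in_dim = vectors.shape[1]
    proj = rng.normal(size=(in_dim, out_dim)) / np.sqrt(out_dim)
    return vectors @ proj

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# two 768-d "teacher" embeddings with a known relationship
x = rng.normal(size=768)
y = x + 0.3 * rng.normal(size=768)   # semantically close to x

small = compress(np.stack([x, y]))   # 128-d versions

# cosine similarity in 128-d stays close to the 768-d value
print(cosine(x, y), cosine(small[0], small[1]))
```

even this naive projection keeps the two similarity scores within a few hundredths of each other; the point of a learned compressor is to close that gap further at the same 6× size reduction.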

2

optimized for production scale

dramatically lower memory footprint and compute costs without sacrificing performance.

3

millisecond-level query latency

compact vectors enable faster ann search, reducing retrieval time by over 5×.
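the 5.6× figure above is the product's own benchmark; the underlying intuition is simply that exact similarity search costs scale linearly with dimension, so 128-d vectors need roughly 6× fewer multiply-adds per query than 768-d. a minimal brute-force timing sketch (numbers vary by machine; this is not the product's ann implementation):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # database size

# same corpus at teacher (768-d) and compressed (128-d) widths
db768 = rng.normal(size=(n, 768)).astype(np.float32)
db128 = rng.normal(size=(n, 128)).astype(np.float32)
q768 = rng.normal(size=768).astype(np.float32)
q128 = rng.normal(size=128).astype(np.float32)

def topk(db, q, k=10):
    scores = db @ q                       # brute-force inner-product search
    return np.argpartition(-scores, k)[:k]  # indices of the k best matches

for db, q, name in [(db768, q768, "768-d"), (db128, q128, "128-d")]:
    t0 = time.perf_counter()
    topk(db, q)
    print(name, f"{(time.perf_counter() - t0) * 1e3:.2f} ms")
```

on most hardware the 128-d pass finishes several times faster, and real ann indexes (hnsw, ivf) widen the gap further because smaller vectors also fit more of the index in cache.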

powering the next generation of ai workloads with efficient, scalable embeddings.

minvector: a new research approach to compact embeddings.

novel semantic compression method that preserves concept structure under extreme dimensionality reduction

128-dim vs 768-dim: up to 90% smaller embeddings
97%: cosine similarity vs teacher model
0.986: recall@10
7.88 ms vs 43.95 ms: retrieval latency
5.6× faster: search throughput

performance comparison

smaller footprint + lower latency = better performance

embedding footprint

% of baseline (lower is better)

minvector-128: 17%
pca-128: 17%
pq-768: 12%
teacher-768: 100%
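the ~17% footprint for the 128-dim variants follows from dimension count alone, assuming float32 storage (4 bytes per dimension); a quick sanity check:

```python
BYTES_PER_DIM = 4                      # float32
teacher = 768 * BYTES_PER_DIM          # 3072 bytes per vector
compact = 128 * BYTES_PER_DIM          # 512 bytes per vector
ratio = compact / teacher
print(f"{teacher} B -> {compact} B ({ratio:.1%} of baseline)")
# 3072 B -> 512 B (16.7% of baseline)
```

index overhead and metadata push the stored figure slightly above the raw 16.7%, consistent with the 17% shown in the chart.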

retrieval latency

milliseconds (lower is better)

minvector-128: 7.88 ms
pca-128: 14 ms
pq-768: 34.4 ms
teacher-768: 43.95 ms

minvector-128 achieves up to 5.6× faster retrieval while maintaining 90% compression.

how it works

01

provide chunked text

you bring your own segmentation strategy.

02

embed + semantic compression

daapbase embeds and compresses each chunk using minvector.

03

compact storage

optimized memory layer under the hood (chroma-backed).

04

fast retrieval

query text → receive top-matching chunks in milliseconds.
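the four steps above can be sketched end to end. the class and method names below (DaapBase, add, query) and the hash-based stand-in embedding are illustrative assumptions, not the published api; step 02 would really call the embedding model plus minvector, and step 03 would hit the chroma-backed store.

```python
import hashlib

class DaapBase:
    """toy sketch of the four-step flow; names and internals are
    illustrative assumptions, not the real daapbase api."""

    def __init__(self):
        self.chunks, self.vectors = [], []

    def _embed_compress(self, text):
        # stand-in for step 02 (embed + minvector compression): hash the
        # text into a small deterministic vector so the example runs
        # without any model
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255 for b in digest[:16]]

    def add(self, chunks):
        # steps 01 + 03: ingest caller-provided chunks, store compact vectors
        for chunk in chunks:
            self.chunks.append(chunk)
            self.vectors.append(self._embed_compress(chunk))

    def query(self, text, k=2):
        # step 04: rank stored chunks by cosine similarity to the query
        q = self._embed_compress(text)
        def cos(v):
            dot = sum(a * b for a, b in zip(q, v))
            return dot / ((sum(a * a for a in q) * sum(b * b for b in v)) ** 0.5)
        ranked = sorted(zip(self.chunks, self.vectors), key=lambda cv: -cos(cv[1]))
        return [chunk for chunk, _ in ranked[:k]]
```

usage mirrors the flow: `db.add(your_chunks)` once at ingest, then `db.query("question", k=5)` per request; the caller never touches vectors directly.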

key use-cases

where latency, cost, and scale drive roi

high-scale ai retrieval systems where latency and cost matter.

enterprise ai assistants and copilots serving thousands of semantic queries per hour.

large knowledge bases (millions–billions of chunks) requiring multilingual, cross-domain retrieval.

ai infrastructure optimization that reduces cost without sacrificing accuracy.

high-qps search systems requiring real-time semantic retrieval.

examples include customer support, legal search, and documentation portals.

what you get

minvector embeddings

compact, high-fidelity semantic vectors.

semantic compression engine

novel technique preserving structure while reducing dimensionality.

fast memory store

compact storage layer backed by optimized chroma indexing.

retrieval api

efficient top-k search with minimal latency.

pricing (coming soon)

pricing will be usage-based, scaling with number of stored chunks, retrieval volume, and footprint size. early-access users will receive free credits and discounted tiers.

store smaller.
retrieve faster.

join early access and experience the performance leap of minvector.