A DuckDB extension for vector similarity search with filtered HNSW (ACORN-1) and RaBitQ binary quantization.
Fork of duckdb/duckdb-vss.
The upstream duckdb-vss extension has two limitations:
- Filtered search is broken.
WHEREclauses are applied after the HNSW index returns results, soSELECT ... WHERE category = 'X' ORDER BY distance LIMIT 10often returns fewer than 10 rows. - No vector compression. Every vector is stored as full F32, so the index memory scales linearly with dimensions.
This fork fixes both:
- ACORN-1 filtered search pushes filter predicates into HNSW graph traversal, ensuring filtered queries return the correct number of results.
- RaBitQ quantization compresses vectors to 1 bit per dimension (~21x memory reduction at 128 dims) with a rescore phase that preserves result quality.
-- Create a table with vectors CREATE TABLE items AS SELECT i AS id, array_value(random(), random(), random())::FLOAT[3] AS vec, (i % 5) AS category FROM range(10000) t(i); -- Standard HNSW index CREATE INDEX idx ON items USING HNSW (vec); -- Nearest neighbor search SELECT * FROM items ORDER BY array_distance(vec, [0.5, 0.5, 0.5]::FLOAT[3]) LIMIT 10; -- Filtered search (returns exactly 10 results matching the filter) SELECT * FROM items WHERE category = 1 ORDER BY array_distance(vec, [0.5, 0.5, 0.5]::FLOAT[3]) LIMIT 10;
No special syntax — the optimizer detects WHERE + ORDER BY distance + LIMIT and uses filtered HNSW search automatically.
For large-dimension vectors (128+), RaBitQ dramatically reduces index memory:
CREATE INDEX idx ON items USING HNSW (vec) WITH (quantization = 'rabitq');
Everything else works identically — queries, filters, persistence. The index stores binary-quantized vectors and rescores candidates against the original F32 vectors for exact ranking.
| Metric | 128 dims | 256 dims | 768 dims |
|---|---|---|---|
| F32 bytes/vec | 512 | 1024 | 3072 |
| RaBitQ bytes/vec | 24 | 40 | 104 |
| Compression | 21x | 26x | 30x |
-- Oversample factor: search N*oversample candidates, rescore to top-N (default 3) SET hnsw_rabitq_oversample = 10;
Higher oversample = better recall, slightly slower queries.
| Method | Recall@10 | Vec Memory | Compression |
|---|---|---|---|
| HNSW | 66.7% | 5000 KB | — |
| RaBitQ 3x | 66.7% | 234 KB | 21.3x |
| RaBitQ 10x | 83.3% | 234 KB | 21.3x |
RaBitQ 10x achieves higher recall than plain HNSW because the rescore phase reranks with exact distances.
Run the benchmark yourself: ./build/release/duckdb < benchmarks/rabitq_benchmark.sql
Filter predicates are evaluated during HNSW graph traversal using the ACORN-1 algorithm:
SELECT * FROM items WHERE category = 'X' ORDER BY array_distance(vec, [1,2,3]::FLOAT[3]) LIMIT 10;
Prepared statements work for parameterized queries:
PREPARE search AS SELECT * FROM items WHERE category = $2 ORDER BY array_distance(vec, $1::FLOAT[3]) LIMIT 10; EXECUTE search([1,2,3], 'X');
| Selectivity | Strategy |
|---|---|
| >60% | Standard HNSW (post-filter) |
| 1–60% | ACORN-1 (two-hop expansion) |
| <1% | Brute-force exact scan |
Thresholds are configurable:
SET hnsw_acorn_threshold = 0.6; -- default SET hnsw_bruteforce_threshold = 0.01; -- default
| Filter | Selectivity | Upstream | ACORN-1 |
|---|---|---|---|
| English only | ~60% | ~10/10 | 10/10 |
| Japanese only | ~3% | 0–1/10 | 10/10 |
| Korean only | ~1% | 0/10 | 10/10 |
| Rating >= 8.0 | ~5% | 0/10 | 10/10 |
| Metric | Option | Function | Operator |
|---|---|---|---|
| Euclidean (L2sq) | l2sq (default) |
array_distance |
<-> |
| Cosine | cosine |
array_cosine_distance |
<=> |
| Inner product | ip |
array_negative_inner_product |
— |
CREATE INDEX my_idx ON items USING HNSW (vec) WITH (metric = 'cosine'); CREATE INDEX my_idx ON items USING HNSW (vec) WITH (metric = 'cosine', quantization = 'rabitq');
CREATE INDEX idx ON items USING HNSW (vec) WITH ( metric = 'l2sq', -- distance metric (l2sq, cosine, ip) quantization = 'rabitq', -- vector quantization (rabitq, none) ef_construction = 200, -- graph construction search width ef_search = 200, -- query-time search width M = 16, -- max connections per node M0 = 32 -- max connections at layer 0 );
Runtime settings:
SET hnsw_ef_search = 100; -- override ef_search at query time SET hnsw_rabitq_oversample = 10; -- RaBitQ rescore oversample factor SET hnsw_acorn_threshold = 0.6; -- ACORN-1 selectivity threshold SET hnsw_bruteforce_threshold = 0.01; -- brute-force selectivity threshold
SELECT * FROM pragma_hnsw_index_info();
Returns: index name, table, metric, dimensions, count, capacity, memory usage, quantization type, bytes per vector, vector memory usage, levels, and per-level statistics.
The index supports mutations after creation. For best performance, create the index after bulk loading data.
Deletes are lazily marked — run PRAGMA hnsw_compact_index('idx') to reclaim space.
SET hnsw_enable_experimental_persistence = true;
Indexes persist across restarts when using a disk-backed database. The full index is serialized on checkpoint and deserialized on first access.
- Only
FLOATarray types are supported. - The index must fit in RAM (not buffer-managed).
- Subquery query vectors fall back to sequential scan — use literals or prepared statements.
- RaBitQ with inner product metric has lower recall than L2sq or cosine (the L2sq graph proxy doesn't align well with IP ordering).
make # release build make debug # debug build make test # run all tests
Binaries:
./build/release/duckdb # shell with extension loaded
./build/release/test/unittest # test runner
./build/release/extension/hnsw_acorn/hnsw_acorn.duckdb_extension # loadable extension