Hybrid retrieval: BM25 + dense embeddings + knowledge-graph rerank
Wikantik's search stack fuses Lucene BM25 with dense embedding cosine similarity using weighted Reciprocal Rank Fusion, then optionally reranks with a knowledge-graph proximity score — and falls back cleanly to BM25 alone if any upstream component is unavailable.
The problem with keyword-only search
Classic BM25 keyword search is fast and reliable, but it is fundamentally a vocabulary-matching exercise. A query for "index funds for retirement" will miss a page titled "Passive Investing Fundamentals" unless that page happens to contain those exact terms. For a wiki used by both humans exploring topics and AI agents answering questions, that gap is not a cosmetic issue; it is a retrieval failure that produces hallucinated or missing citations.
Dense embedding search solves the vocabulary problem by mapping both queries and documents into a shared semantic vector space. But dense search on its own can be brittle: it rewards topical similarity over exact relevance, and it fails silently when the embedding backend is unavailable.
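Hybrid retrieval combines the two: BM25 contributes exact-term precision, dense embeddings contribute semantic recall, and BM25 remains available when the embedding backend is not.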
How it works: three stages
Stage 1 — BM25 + dense fusion with weighted RRF
At query time, Wikantik runs two retrieval passes in parallel: a Lucene BM25 pass and a dense cosine-similarity pass against chunk embeddings. The two ranked lists are then merged using weighted Reciprocal Rank Fusion (RRF) — a rank-based combination method that is robust to score-scale differences between the two signals.
The key configuration properties are:
wikantik.search.hybrid.rrf.bm25-weight (default 1.0): weight applied to BM25 ranks
wikantik.search.hybrid.rrf.dense-weight (default 1.5): weight applied to dense ranks
wikantik.search.hybrid.rrf.k (default 60): the RRF smoothing constant
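As a rough illustration of the fusion step, the sketch below applies the standard weighted RRF formula, in which each list contributes weight / (k + rank) to a page's score. It is not Wikantik's implementation: the function name, the 1-based rank convention, and the sample page ids are assumptions made for the example; only the three default values come from the properties above.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, k=60):
    """Fuse ranked lists of page ids with weighted Reciprocal Rank Fusion.

    ranked_lists: dict mapping a signal name to page ids, best first.
    weights:      dict mapping the same signal names to their weights.
    Returns (page_id, fused_score) pairs, best first.
    """
    scores = defaultdict(float)
    for name, ranking in ranked_lists.items():
        weight = weights[name]
        # Ranks are 1-based here (an assumption): the top hit adds weight / (k + 1).
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += weight / (k + rank)
    # Highest fused score first; ties broken by page id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists for one query.
bm25_hits = ["PageA", "PageB", "PageC"]
dense_hits = ["PageC", "PageA", "PageD"]

fused = weighted_rrf(
    {"bm25": bm25_hits, "dense": dense_hits},
    {"bm25": 1.0, "dense": 1.5},  # documented default weights
    k=60,                          # documented default smoothing constant
)
fused_order = [page_id for page_id, _ in fused]
```

Because only ranks enter the formula, the raw BM25 and cosine scores never need to be normalised against each other. A larger k flattens the difference between adjacent ranks, while the weights decide which signal dominates when the two lists disagree.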
Dense retrieval operates at the chunk level — each page is broken into overlapping passages, each embedded separately — and the SUM_TOP_3 page aggregation strategy collapses chunk scores into a per-page score before fusion.
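The chunk-to-page aggregation can be pictured with a short sketch. It assumes that SUM_TOP_3 means "sum the three highest chunk scores for each page", which is inferred from the strategy's name rather than confirmed by the source; the function name and sample scores are hypothetical.

```python
import heapq
from collections import defaultdict


def sum_top_n(chunk_hits, n=3):
    """Collapse (page_id, chunk_score) hits into one score per page.

    Each page's score is the sum of its n best chunk scores, so a page with
    several relevant passages outranks a page with a single strong one,
    while long pages with many weak chunks gain nothing extra.
    """
    scores_by_page = defaultdict(list)
    for page_id, score in chunk_hits:
        scores_by_page[page_id].append(score)
    page_scores = {
        page_id: sum(heapq.nlargest(n, scores))
        for page_id, scores in scores_by_page.items()
    }
    # Best page first; ties broken by page id for determinism.
    return sorted(page_scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical cosine similarities for chunks of two pages.
chunk_hits = [
    ("PageA", 0.9), ("PageA", 0.5), ("PageA", 0.4), ("PageA", 0.3),
    ("PageB", 0.95), ("PageB", 0.2),
]
ranked_pages = sum_top_n(chunk_hits)
```

In this example PageB owns the single best chunk, but PageA ranks first because its three best chunks together score higher. Only the resulting page order is passed on to rank fusion.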
Stage 2 — knowledge-graph rerank
After fusion, an optional rerank stage reorders the fused list using the Knowledge Graph. The graph rerank resolves entities in the query against kg_nodes, then boosts pages whose mentioned entities are close (in Knowledge Graph hops) to those query entities. With the default wikantik.search.graph.boost = 0.2 and max-hops = 2, the boost lifts mid-list pages with strong graph proximity without overriding BM25 + dense top results.
Crucially, the graph rerank only reorders — it never adds or removes candidates. If no graph signal exists for a query, the stage returns the fused list unchanged.
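The reorder-only behaviour can be sketched as follows. The source does not state how the boost is computed, so the formula here (score multiplied by 1 + boost / (1 + hops)) is an assumption, as are the function names, the adjacency-dict graph, and the sample data. Only the defaults boost = 0.2 and max-hops = 2 come from the text.

```python
from collections import deque


def hop_distances(graph, sources, max_hops):
    """Breadth-first search: hop count from any source node, capped at max_hops."""
    dist = {node: 0 for node in sources}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def graph_rerank(fused, page_entities, graph, query_entities, boost=0.2, max_hops=2):
    """Reorder (page_id, score) pairs by graph proximity; never add or drop pages."""
    if not query_entities:
        return list(fused)  # no graph signal: fused list unchanged
    dist = hop_distances(graph, query_entities, max_hops)
    reranked = []
    for page_id, score in fused:
        hops = [dist[e] for e in page_entities.get(page_id, ()) if e in dist]
        # Assumed boost shape: closer entities give a larger multiplier.
        proximity = 1.0 / (1 + min(hops)) if hops else 0.0
        reranked.append((page_id, score * (1.0 + boost * proximity)))
    # sorted() is stable, so unboosted pages keep their relative order.
    return sorted(reranked, key=lambda item: -item[1])


# Hypothetical graph, entity mentions, and fused scores.
kg = {
    "index_fund": ["passive_investing"],
    "passive_investing": ["index_fund", "etf"],
    "etf": ["passive_investing"],
}
mentions = {"PageC": ["passive_investing"], "PageB": ["unrelated_topic"]}
fused_list = [("PageA", 0.040), ("PageB", 0.030), ("PageC", 0.029)]

reranked = graph_rerank(fused_list, mentions, kg, ["index_fund"])
unchanged = graph_rerank(fused_list, mentions, kg, [])
```

Here PageC sits one hop from the query entity and moves ahead of PageB, while PageA keeps the top position because a 0.2 boost is too small to close the gap.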
Fail-closed BM25 fallback
Every abnormal path in the dense and graph stages returns the BM25 result list unchanged rather than failing. If the embedding backend times out, the embedder returns empty and BM25 takes over seamlessly; after consecutive failures the circuit breaker trips OPEN, so later queries skip the embedding call entirely. This is not an afterthought; it is a design invariant enforced in HybridSearchService.rerank().
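The invariant can be illustrated with a minimal sketch. It is not the code in HybridSearchService.rerank(): the function names, the callable-based wiring, and the placeholder fusion step are all assumptions chosen to keep the example self-contained.

```python
def hybrid_search(query, bm25_search, dense_search, fuse):
    """Return fused results, or the untouched BM25 list on any dense failure."""
    bm25_results = bm25_search(query)
    try:
        dense_results = dense_search(query)
    except Exception:
        # Timeout, backend error, anything else: BM25 alone answers the query.
        return bm25_results
    if not dense_results:
        # An empty dense result (for example, the embedder returned nothing)
        # is treated the same way as a failure.
        return bm25_results
    return fuse(bm25_results, dense_results)


# Hypothetical stand-ins for the real retrieval passes.
def bm25_pass(query):
    return ["PageA", "PageB"]


def healthy_dense_pass(query):
    return ["PageC", "PageA"]


def broken_dense_pass(query):
    raise TimeoutError("embedding backend timed out")


def placeholder_fuse(bm25_results, dense_results):
    # Placeholder only: de-duplicated concatenation, not real rank fusion.
    return list(dict.fromkeys(dense_results + bm25_results))


healthy = hybrid_search("index funds", bm25_pass, healthy_dense_pass, placeholder_fuse)
degraded = hybrid_search("index funds", bm25_pass, broken_dense_pass, placeholder_fuse)
```

The point of the shape is that the BM25 pass has already completed before the dense pass is consulted, so there is always a valid result list to return.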
Production note: the dense backend on docker1 (the live Wikantik deployment) is lucene-hnsw — an in-process Lucene HNSW approximate nearest-neighbour index held in JVM RAM. It is faster than the brute-force inmemory backend on large corpora and avoids the extra database load of the pgvector backend. Switch backends with a single property: wikantik.search.dense.backend = inmemory | pgvector | lucene-hnsw.
Why it matters for RAG and agents
Retrieval-Augmented Generation (RAG) systems are only as good as their retrieval step. An agent that gets the wrong page returns a wrong or hallucinated answer. Hybrid retrieval reduces the chance of a retrieval miss compared to either BM25 or dense alone, particularly for queries that combine exact terminology with broader semantic intent, which is exactly how humans and agents tend to ask questions about a technical knowledge base.
The same /api/search endpoint that powers the search box in the wiki UI is the one the MCP tools call. Agents get first-class search, not a stripped-down API.
For deeper reading on the implementation, see the Hybrid Retrieval design doc in the live wiki.
Frequently asked questions
What embedding backends are supported?
Three backends are available via the wikantik.search.dense.backend property: inmemory (brute-force cosine scan; the config-file default for new installs), pgvector (delegates to a PostgreSQL HNSW index), and lucene-hnsw (in-process Lucene HNSW index, the docker1 production default). All three share the same fail-closed BM25 fallback.
What happens if the vector store is down?
Wikantik hybrid retrieval is fail-closed: every failure path in the dense retrieval stack returns the unmodified BM25 result list. The circuit breaker trips OPEN after consecutive embedding failures and automatically re-probes after a reset interval; during the OPEN period, search continues returning BM25 results. Search never goes dark because of an embedding outage.
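The breaker behaviour described above follows a common pattern, sketched here in generic form. The class, its method names, the failure threshold, and the reset interval are all assumptions for illustration; the source does not give Wikantik's actual values or API.

```python
import time


class CircuitBreaker:
    """Opens after consecutive failures, then allows a probe after a reset interval."""

    def __init__(self, failure_threshold=3, reset_interval=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.reset_interval = reset_interval        # assumed value, in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means CLOSED

    def allow(self):
        if self.opened_at is None:
            return True
        # OPEN: block calls until the reset interval has elapsed, then re-probe.
        return self.clock() - self.opened_at >= self.reset_interval

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # trip, or re-trip after a failed probe


def embed_or_empty(breaker, embed, text):
    """Return an embedding, or [] so the caller falls back to BM25."""
    if not breaker.allow():
        return []
    try:
        vector = embed(text)
    except Exception:
        breaker.record_failure()
        return []
    breaker.record_success()
    return vector


# Demonstration with a fake clock and hypothetical embedders.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_interval=10.0, clock=lambda: now[0])


def failing_embed(text):
    raise TimeoutError("embedding backend timed out")


def working_embed(text):
    return [0.1, 0.2]


first = embed_or_empty(breaker, failing_embed, "q")
second = embed_or_empty(breaker, failing_embed, "q")         # breaker trips OPEN
while_open = embed_or_empty(breaker, working_embed, "q")     # skipped: still OPEN
now[0] = 11.0                                                # reset interval elapsed
after_probe = embed_or_empty(breaker, working_embed, "q")    # probe succeeds
```

While the breaker is OPEN the embedder is not called at all, so a struggling backend is not hit with further requests and search latency stays at BM25 levels.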
Does hybrid retrieval work for both agents and humans?
Yes. The same /api/search endpoint is used by the web UI search box, the /knowledge-mcp MCP tools, and direct API callers. Agents benefit from the same BM25 + dense + KG rerank pipeline that humans get, plus the structured JSON response that makes results easy to process programmatically.