The Knowledge Graph: entities and relations extracted from your content
Wikantik's Knowledge Graph is built by running an LLM extraction pipeline over your page content to identify entities — concepts, people, organisations, technologies — and the co-mention and typed-relation edges between them. Agents can traverse this graph to answer questions that no keyword search can reach.
What a knowledge graph actually is
A knowledge graph is a structured representation of facts as a network: nodes are things (entities), edges are relationships between those things. Unlike a search index — which maps terms to documents — a knowledge graph maps concepts to one another, letting a reasoner follow chains of meaning rather than lists of matches.
For a wiki, this means: when an agent asks "which pages are most relevant to understanding distributed consensus?", a knowledge graph can surface pages that mention Raft, Paxos, and leader election — even if none of them explicitly contain the phrase "distributed consensus" — because the entity graph knows those concepts are related.
How Wikantik builds the Knowledge Graph
The Knowledge Graph is populated by an entity extraction pipeline that reads page content, identifies entities, and proposes nodes and edges for admin review. No entity enters the graph without human approval — which keeps the graph accurate rather than noisy.
Nodes: LLM-extracted entities
Each page is chunked into overlapping passages (kg_content_chunks), and the configured LLM backend (Ollama or Anthropic) extracts entity candidates from each chunk. Candidate nodes are written to kg_proposals with a confidence score; proposals above the confidence threshold and approved by an admin are promoted to kg_nodes.
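The two ideas in this step, overlapping chunks and promotion gated by both a threshold and admin approval, can be sketched as follows. This is a minimal illustration only: the window size, overlap, threshold value, and every function, class, and field name are assumptions made for the example, not Wikantik's actual implementation or schema.

```python
from dataclasses import dataclass


def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows; each window repeats the last `overlap` words of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


@dataclass
class Proposal:
    """Hypothetical stand-in for a row in kg_proposals."""
    name: str
    confidence: float
    approved: bool = False  # set by an admin during review


def promotable(proposals: list[Proposal], threshold: float = 0.7) -> list[Proposal]:
    """Keep proposals that are above the threshold AND approved by an admin."""
    return [p for p in proposals if p.confidence > threshold and p.approved]
```

The overlap means an entity mentioned near a chunk boundary still appears with some surrounding context in at least one chunk. Neither condition alone is enough for promotion in this sketch: a high-confidence proposal without approval stays out, and so does an approved proposal below the threshold.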
Nodes carry pgvector-backed embeddings (kg_node_embeddings) that enable semantic entity lookup — "find entities similar to 'Raft consensus algorithm'" resolves to the right nodes even with paraphrase variation.
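Conceptually, semantic entity lookup is a nearest-neighbour search over embedding vectors. The sketch below shows that idea in plain Python with made-up 3-dimensional vectors. In a real deployment the comparison would presumably run inside the database via pgvector rather than in application code, and the function names and vectors here are purely illustrative.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_entities(query_vec: list[float], node_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the k entity names whose embeddings are most similar to the query."""
    ranked = sorted(node_vecs, key=lambda name: cosine(query_vec, node_vecs[name]), reverse=True)
    return ranked[:k]


# Toy embeddings, invented for illustration; real embeddings have hundreds of dimensions.
nodes = {
    "Raft": [0.9, 0.1, 0.0],
    "Paxos": [0.8, 0.2, 0.1],
    "PostgreSQL": [0.0, 0.1, 0.9],
}

# A query embedding that points in roughly the same direction as the consensus entities.
closest = nearest_entities([1.0, 0.1, 0.0], nodes, k=2)
```

Because the ranking depends on vector direction rather than on shared words, a paraphrased query can still land on the right nodes.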
Edges: co-mention and typed relations
Edges in kg_edges are of two kinds:
- Co-mention edges: two entities that appear in the same passage are co-mentioned. These are inferred automatically by the extraction pipeline.
- Typed-relation edges: relationship predicates (e.g. links_to, part_of) between approved entities, promoted from the extraction pipeline's reviewed proposals or written by the admin curation tools. These are curated, not automatically inferred.
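Co-mention inference can be pictured as pair generation over the entities found in each passage. The sketch below is one plausible way to do it, not the pipeline's actual code. Counting how many passages mention a pair is an added assumption, since the text only says that co-mentioned entities get an edge.

```python
from collections import Counter
from itertools import combinations


def co_mention_edges(passages: list[set[str]]) -> Counter:
    """For each unordered entity pair, count the passages that mention both."""
    edges = Counter()
    for entities in passages:
        # Sorting gives every pair one canonical order, so (A, B) and (B, A) collapse.
        for a, b in combinations(sorted(entities), 2):
            edges[(a, b)] += 1
    return edges


# Entities found in two hypothetical passages.
edges = co_mention_edges([
    {"Raft", "Paxos"},
    {"Raft", "Paxos", "leader election"},
])
```

Typed-relation edges have no equivalent here because they are curated rather than derived mechanically.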
The graph rerank in hybrid search runs a multi-source breadth-first search (BFS) over kg_edges, limited to a 2-hop radius by default (wikantik.search.graph.max-hops = 2), and scores retrieval candidates by their proximity to the query entities.
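A multi-source BFS with a hop limit can be sketched as below. The adjacency-dict representation, the function names, and the 1 / (1 + hops) scoring formula are assumptions for illustration; the text states only that proximity to query entities drives the score, not how the score is computed.

```python
from collections import deque
from typing import Optional


def hop_distances(adjacency: dict[str, set[str]], sources: set[str], max_hops: int = 2) -> dict[str, int]:
    """Multi-source BFS: fewest hops from any source to each node, up to max_hops."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)  # all query entities start at distance 0
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # at the radius limit: record the node but do not expand it
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def proximity_score(hops: Optional[int]) -> float:
    """Hypothetical scoring: closer nodes score higher, unreachable nodes score 0."""
    return 0.0 if hops is None else 1.0 / (1 + hops)


# A small undirected toy graph.
graph = {
    "Raft": {"leader election"},
    "leader election": {"Raft", "heartbeat"},
    "heartbeat": {"leader election", "timeout"},
    "timeout": {"heartbeat"},
}
dist = hop_distances(graph, {"Raft"}, max_hops=2)
```

With the default radius, "timeout" sits three hops from "Raft" and is never reached, so anything tied only to it gets no graph boost in this sketch.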
Inclusion policy: curation, not noise
Not every page benefits from entity extraction, and running extraction over personal notes, system pages, or hobby content clutters the graph. Wikantik's cluster-primary inclusion policy solves this with a default-exclude model: pages are only admitted for extraction if their cluster has an explicit include policy in kg_cluster_policy.
Per-page overrides are always available: kg_include: true or kg_include: false in frontmatter wins over the cluster policy. System pages are always excluded regardless. The policy is managed via the admin UI at /admin/kg-policy, the REST API at /admin/kg-policy/*, or the bin/kg-policy.sh CLI.
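The precedence described above (system pages first, then the per-page override, then the cluster policy, then default-exclude) can be expressed as a small decision function. This is a sketch of the rules as stated here; the function name, parameters, and the "include" string value are hypothetical, and the authoritative model is the one documented in KgInclusionPolicy.

```python
from typing import Optional


def is_included(
    is_system_page: bool,
    frontmatter_kg_include: Optional[bool],  # None when the page sets no kg_include
    cluster_policy: Optional[str],           # None when the cluster has no policy row
) -> bool:
    """Decide whether a page is admitted for entity extraction."""
    if is_system_page:
        return False  # system pages are always excluded
    if frontmatter_kg_include is not None:
        return frontmatter_kg_include  # per-page override wins over the cluster
    return cluster_policy == "include"  # anything else falls back to default-exclude
```

For example, is_included(False, True, None) admits a page through its frontmatter override even though its cluster has no policy, while is_included(True, True, "include") still rejects a system page.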
Soft-exclude preserves existing kg_nodes and kg_edges rows when a cluster is moved to exclude — they disappear from retrieval immediately but can be reinstated just by re-including the cluster, with no LLM cost. Hard-delete is available when you want the storage back.
For the full policy decision model, see KgInclusionPolicy in the live wiki.
Knowledge Graph vs. Page Graph
These are two distinct subsystems — do not conflate them. The Page Graph contains edges that are real wikilinks authors wrote in page bodies. The Knowledge Graph contains nodes that are LLM-extracted entities and edges that are extracted or frontmatter-declared predicates. The Page Graph is for human navigation and authoring aids. The Knowledge Graph is for semantic retrieval and agent reasoning. See Page Graph vs Knowledge Graph for the full explanation.
How agents use the Knowledge Graph
Agents access the Knowledge Graph through the /knowledge-mcp server's 18 read-only tools. Retrieval queries that go through hybrid search automatically get the KG rerank. Dedicated Knowledge Graph traversal tools let agents explore entity relationships, find hub entities, and discover topically related pages through the graph rather than just through text matching.
Admin agents use the /wikantik-admin-mcp server's Knowledge Graph curation tools to review proposals, approve or reject nodes and edges, and manage the graph's quality over time.
Frequently asked questions
How does the Knowledge Graph differ from the Page Graph?
The Page Graph's edges are real wikilinks that authors write in page bodies. The Knowledge Graph's nodes are LLM-extracted entities and its edges are co-mention or typed-relation predicates extracted by the entity pipeline. They are distinct subsystems — the Page Graph is for navigation; the Knowledge Graph is for semantic retrieval and agent reasoning.
How does the inclusion policy work?
By default, pages are excluded from Knowledge Graph extraction unless their cluster has an explicit include policy in kg_cluster_policy. You can also override per-page with kg_include: true or kg_include: false in frontmatter. This keeps the graph curated and focused on content that actually benefits from entity extraction.
How do agents access the Knowledge Graph?
The /knowledge-mcp MCP server exposes Knowledge Graph traversal tools and hybrid retrieval with the KG rerank. Curators manage the graph via the /wikantik-admin-mcp tools and the /admin/knowledge-graph/* web interface.