The Knowledge Graph: entities and relations extracted from your content

Wikantik's Knowledge Graph is built by running an LLM extraction pipeline over your page content to identify entities — concepts, people, organisations, technologies — and the co-mention and typed-relation edges between them. Agents can traverse this graph to answer questions that no keyword search can reach.

What a knowledge graph actually is

A knowledge graph is a structured representation of facts as a network: nodes are things (entities), edges are relationships between those things. Unlike a search index — which maps terms to documents — a knowledge graph maps concepts to one another, letting a reasoner follow chains of meaning rather than lists of matches.

For a wiki, this means: when an agent asks "which pages are most relevant to understanding distributed consensus?", a knowledge graph can surface pages that mention Raft, Paxos, and leader election — even if none of them explicitly contain the phrase "distributed consensus" — because the entity graph knows those concepts are related.
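
As a rough illustration of the difference from keyword search, the sketch below ranks pages by how many entities related to a concept they mention. The page names and the tiny in-memory graph are hypothetical toy data, not Wikantik's schema or API.

```python
from collections import defaultdict

# Hypothetical toy data: concept -> related entities, entity -> pages mentioning it.
RELATED = {
    "distributed consensus": {"Raft", "Paxos", "leader election"},
}
MENTIONS = {
    "Raft": {"RaftInternals"},
    "Paxos": {"PaxosMadeSimple"},
    "leader election": {"RaftInternals", "ClusterFailover"},
}

def pages_for_concept(concept: str) -> list[str]:
    """Rank pages by how many entities related to `concept` they mention."""
    scores = defaultdict(int)
    for entity in RELATED.get(concept, set()):
        for page in MENTIONS.get(entity, set()):
            scores[page] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda page: (-scores[page], page))

ranked = pages_for_concept("distributed consensus")
```

None of the toy pages needs to contain the phrase "distributed consensus" to be returned; the match comes from the entity relations alone.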

How Wikantik builds the Knowledge Graph

The Knowledge Graph is populated by an entity extraction pipeline that reads page content, identifies entities, and proposes nodes and edges for admin review. No entity enters the graph without human approval — which keeps the graph accurate rather than noisy.

Extraction is chat inference, which makes it the most cost-sensitive part of the platform. It runs only when both wikantik.knowledge.enabled and the operator's wikantik.genai.mode cost ceiling allow it. Wikantik also works with the Knowledge Graph switched off entirely, running on BM25 search alone; see self-hosting & backup for the three cost tiers.

Nodes: LLM-extracted entities

Each page is chunked into overlapping passages (kg_content_chunks), and the configured LLM backend (Ollama or Anthropic) extracts entity candidates from each chunk. Candidate nodes are written to kg_proposals with a confidence score; proposals above the confidence threshold and approved by an admin are promoted to kg_nodes.
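
The chunk-then-propose flow can be sketched as follows. This is a minimal illustration under stated assumptions: the word-based chunk size, the overlap, the 0.7 threshold, and the Proposal fields are all hypothetical, and the real pipeline stores these rows in kg_content_chunks, kg_proposals, and kg_nodes rather than in memory.

```python
from dataclasses import dataclass

def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into passages of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last passage already reaches the end of the text
    return chunks

@dataclass
class Proposal:
    name: str
    confidence: float       # score assigned by the extraction step
    approved: bool = False  # set by an admin during review

def promotable(proposals: list[Proposal], threshold: float = 0.7) -> list[Proposal]:
    """Keep proposals that are above the threshold AND admin-approved."""
    return [p for p in proposals if p.confidence > threshold and p.approved]
```

Both conditions must hold: a high-confidence proposal that nobody has approved stays out of the graph, as does an approved one that falls below the threshold.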

Nodes carry pgvector-backed embeddings (kg_node_embeddings) that enable semantic entity lookup — "find entities similar to 'Raft consensus algorithm'" resolves to the right nodes even with paraphrase variation.
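
To show the idea behind embedding-based lookup, here is a small pure-Python sketch that ranks entities by cosine similarity to a query vector. The two-dimensional vectors and entity names are made-up toy values; in Wikantik the comparison would run against kg_node_embeddings in the database rather than in application code.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_entities(query_vec: list[float],
                     node_vecs: dict[str, list[float]],
                     k: int = 3) -> list[str]:
    """Return the k entity names whose embeddings are closest to the query."""
    ranked = sorted(node_vecs,
                    key=lambda name: cosine(query_vec, node_vecs[name]),
                    reverse=True)
    return ranked[:k]

# Toy embeddings (hypothetical values, for illustration only).
toy_nodes = {
    "Raft": [0.9, 0.1],
    "Paxos": [0.8, 0.3],
    "Kubernetes": [0.1, 0.9],
}
top_two = nearest_entities([1.0, 0.0], toy_nodes, k=2)
```

Because the ranking depends on vector direction rather than shared words, a paraphrased query can still land on the right entity.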

Edges: co-mention and typed relations

Edges in kg_edges are of two kinds:

Co-mention edges: link two entities that appear in the same passage. These are inferred automatically by the extraction pipeline.
Typed-relation edges: relationship predicates (e.g. partOf, implements, requires) between approved entities, promoted from the extraction pipeline's reviewed proposals or written by the admin curation tools. These are curated, not automatically inferred.
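
Co-mention inference is simple enough to sketch directly: every unordered pair of entities found in the same passage gets an edge, and repeated co-occurrences increase its weight. The function below is an illustrative assumption about the mechanics, not the pipeline's actual code; whether kg_edges stores a weight at all is not stated here.

```python
from collections import Counter
from itertools import combinations

def co_mention_edges(passages: list[set[str]]) -> Counter:
    """Count, for each unordered entity pair, the passages that mention both."""
    edges = Counter()
    for entities in passages:
        # Sorting gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(entities), 2):
            edges[(a, b)] += 1
    return edges

edges = co_mention_edges([
    {"Raft", "Paxos"},
    {"Raft", "Paxos", "etcd"},
])
```

Typed relations such as partOf cannot be derived this way, which is why they go through review or the curation tools instead.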

An experimental proximity-rerank stage can run a multi-source BFS over kg_edges (2-hop radius, wikantik.search.graph.max-hops = 2) to score retrieval candidates by entity proximity. This stage is off by default (wikantik.search.graph.boost = 0): measurements found no net ranking lift, so production search runs on BM25 plus dense retrieval only.
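
A multi-source BFS with a hop limit can be sketched as below. The adjacency dictionary stands in for kg_edges, and the scoring formula in proximity_score is a hypothetical choice made only to show how a boost of 0 disables the stage; the source does not specify how proximity is actually scored.

```python
from collections import deque

def hop_distances(adjacency: dict[str, list[str]],
                  seeds: list[str],
                  max_hops: int = 2) -> dict[str, int]:
    """Breadth-first search from all seeds at once, stopping at max_hops."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop radius
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def proximity_score(entities: list[str], dist: dict[str, int],
                    boost: float = 0.0) -> float:
    """Hypothetical scoring: nearer entities add more; boost=0 turns it off."""
    return boost * sum(1.0 / (1 + dist[e]) for e in entities if e in dist)

graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
reach = hop_distances(graph, ["A"])
```

Starting all seeds at distance 0 in one queue means each node is visited once, with its distance measured to the nearest seed.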

Inclusion policy: curation, not noise

Not every page benefits from entity extraction, and running extraction over personal notes, system pages, or hobby content clutters the graph. Wikantik's cluster-primary inclusion policy solves this with a default-exclude model: pages are only admitted for extraction if their cluster has an explicit include policy in kg_cluster_policy.

Per-page overrides are always available: kg_include: true or kg_include: false in frontmatter wins over the cluster policy. System pages are always excluded regardless. The policy is managed via the admin UI at /admin/kg-policy, the REST API at /admin/kg-policy/*, or the bin/kg-policy.sh CLI.
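
The precedence described above fits in a few lines. This is a sketch of the decision order only: the function name, its parameters, and the "include" string are assumptions for illustration, and KgInclusionPolicy remains the authoritative description.

```python
from typing import Optional

def is_included(is_system_page: bool,
                frontmatter_kg_include: Optional[bool],
                cluster_policy: Optional[str]) -> bool:
    """Decide whether a page is admitted for entity extraction."""
    if is_system_page:
        return False                   # system pages are always excluded
    if frontmatter_kg_include is not None:
        return frontmatter_kg_include  # per-page override beats the cluster
    # Default-exclude: only an explicit include policy admits the page.
    return cluster_policy == "include"
```

A page with no override in a cluster that has no policy row falls through to the last line and is excluded.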

Soft-exclude preserves existing kg_nodes and kg_edges rows when a cluster is moved to exclude — they disappear from retrieval immediately but can be reinstated just by re-including the cluster, with no LLM cost. Hard-delete is available when you want the storage back.

For the full policy decision model, see KgInclusionPolicy in the live wiki.

Knowledge Graph vs. Page Graph

These are two distinct subsystems — do not conflate them. The Page Graph contains edges that are real wikilinks authors wrote in page bodies. The Knowledge Graph contains nodes that are LLM-extracted entities and edges that are extracted or frontmatter-declared predicates. The Page Graph is for human navigation and authoring aids. The Knowledge Graph is for semantic retrieval and agent reasoning. See Page Graph vs Knowledge Graph for the full explanation.

How agents use the Knowledge Graph

Agents access the Knowledge Graph through the /knowledge-mcp server's 21 read-only tools. Dedicated Knowledge Graph traversal tools let agents explore entity relationships, find hub entities, and discover topically related pages through the graph rather than just through text matching. The ontology layer also exposes SPARQL queries and typed-IRI dereferencing for agents that need formal reasoning over the entity vocabulary.

Admin agents use the /wikantik-admin-mcp server's Knowledge Graph curation tools to review proposals, approve or reject nodes and edges, and manage the graph's quality over time.

Frequently asked questions

How does the Knowledge Graph differ from the Page Graph?

The Page Graph's edges are real wikilinks that authors write in page bodies. The Knowledge Graph's nodes are LLM-extracted entities and its edges are co-mention or typed-relation predicates extracted by the entity pipeline. They are distinct subsystems — the Page Graph is for navigation; the Knowledge Graph is for semantic retrieval and agent reasoning.

How does the inclusion policy work?

By default, pages are excluded from Knowledge Graph extraction unless their cluster has an explicit include policy in kg_cluster_policy. You can also override per-page with kg_include: true or kg_include: false in frontmatter. This keeps the graph curated and focused on content that actually benefits from entity extraction.

How do agents access the Knowledge Graph?

The /knowledge-mcp MCP server exposes 21 read-only tools, including dedicated Knowledge Graph traversal — entity lookup, hub discovery, typed-relation traversal — plus ontology/SPARQL access. Curators manage the graph via the /wikantik-admin-mcp tools and the /admin/knowledge-graph/* web interface.
