Knowledge¶
Portable SQLite knowledge bases with hybrid FTS5+vector search and AI synthesis. Short alias: kb.
Highlights¶
- Hybrid FTS5+vector search via `knowledge.search()`: keyword, semantic, or combined (hybrid) modes
- AI synthesis via `knowledge.ask()`: retrieves relevant chunks, then synthesises a concise answer with source citations
- Personal annotations via `knowledge.write()`: store rules, notes, and mistakes alongside indexed content
- Link-graph traversal via `knowledge.related()`: follow markdown hyperlinks between topics
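This page does not say how hybrid mode merges the keyword and vector result lists. One common technique is reciprocal rank fusion (RRF). The sketch below illustrates that general idea only and is not taken from the tool's implementation. The helper name `rrf_fuse`, the `rank_constant` value, and the sample topics are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(keyword_hits, vector_hits, rank_constant=60, limit=10):
    """Merge two best-first lists of topic ids with reciprocal rank fusion.

    Hypothetical helper: a topic's fused score is the sum of
    1 / (rank_constant + rank) over every list that contains it.
    """
    scores = defaultdict(float)
    for hits in (keyword_hits, vector_hits):
        for rank, topic in enumerate(hits, start=1):
            scores[topic] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by topic name for stability.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [topic for topic, _ in ranked[:limit]]


# Made-up result lists standing in for an FTS5 query and a vector query.
keyword_hits = ["python/tips/generators", "python/tips/loops", "python/asyncio/tasks"]
vector_hits = ["python/asyncio/tasks", "python/tips/generators", "python/contextlib"]
fused = rrf_fuse(keyword_hits, vector_hits)
```

Topics that appear in both lists rise to the top, which is the behaviour a combined mode is usually expected to have; passing only one list corresponds loosely to the `keyword` or `semantic` modes.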
Functions¶
| Function | Description |
|---|---|
| `knowledge.write(topic, content, db, ...)` | Write a personal annotation to the knowledge database |
| `knowledge.read(topic, db, ...)` | Read a single entry by topic |
| `knowledge.update(topic, db, ...)` | Update an existing entry |
| `knowledge.append(topic, db, ...)` | Append content to an existing entry |
| `knowledge.delete(topic, db, ...)` | Delete an entry by topic |
| `knowledge.search(q, db, ...)` | Hybrid FTS5+vector search (mode: hybrid/semantic/keyword) |
| `knowledge.ask(q, db, ...)` | Retrieve relevant chunks and synthesise an AI answer |
| `knowledge.grep(pattern, db, ...)` | Regex/text search across all content |
| `knowledge.related(topic, db, ...)` | Find entries linked from or to a given topic |
| `knowledge.list(db, ...)` | List entries (returns meta only, no content) |
| `knowledge.toc(db, ...)` | Display table of contents for a database or topic prefix |
| `knowledge.slice(topic, db, ...)` | Extract a section from a large entry |
| `knowledge.stats(db)` | Chunk counts, embedding coverage, and file size |
| `knowledge.info(db)` | Database metadata, path, and version |
| `knowledge.dbs()` | List all configured knowledge databases |
Key Parameters¶
| Parameter | Type | Description |
|---|---|---|
| `q` | str | Search or question text |
| `db` | str | Database name (as configured under `tools.knowledge.kb`) |
| `topic` | str | Entry topic path (e.g. `python/tips/generators`) |
| `mode` | str | Search mode: `"hybrid"` (default), `"semantic"` (vector-only), `"keyword"` (FTS5-only) |
| `k` | int | Max results (default: `config.search_limit`) |
| `category` | str | Entry category filter, one of: `reference`, `rule`, `note`, `mistake` |
| `source` | str | Filter by `meta.source` prefix |
| `direction` | str | For `knowledge.related()`: `"out"` (links from topic), `"in"` (links to topic), `"both"` |
| `depth` | int | For `knowledge.related()`: traversal depth (default: 1) |
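The `direction` and `depth` parameters describe a bounded walk over the link graph. The following is a minimal sketch of such a walk over an in-memory map, assuming a breadth-first strategy. The `LINKS` data, the `related_topics` name, and the default `direction` are hypothetical, and the real tool presumably reads links from the SQLite database instead.

```python
from collections import deque

# Hypothetical link graph: topic -> topics it links to.
LINKS = {
    "python/asyncio/tasks": ["python/asyncio/futures", "python/asyncio/eventloop"],
    "python/asyncio/futures": ["python/concurrent/futures"],
    "python/asyncio/eventloop": [],
    "python/concurrent/futures": [],
    "python/tips/async": ["python/asyncio/tasks"],
}


def related_topics(topic, direction="out", depth=1, links=LINKS):
    """Return topics reachable from `topic` within `depth` hops, in BFS order."""
    if direction not in ("out", "in", "both"):
        raise ValueError(f"unknown direction: {direction!r}")

    # Build the reverse map once so "in" lookups do not rescan every entry.
    incoming = {}
    for src, targets in links.items():
        for dst in targets:
            incoming.setdefault(dst, []).append(src)

    def neighbours(node):
        result = []
        if direction in ("out", "both"):
            result.extend(links.get(node, []))
        if direction in ("in", "both"):
            result.extend(incoming.get(node, []))
        return result

    seen = {topic}
    found = []
    queue = deque([(topic, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # Do not expand past the requested depth.
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found
```

With `depth=1` only direct links are returned; raising `depth` follows links of links, and the `seen` set keeps cycles from producing duplicates.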
Requires¶
- `OPENAI_API_KEY` in `secrets.yaml` (for embeddings and AI synthesis)
- `onetool-mcp[util]` extra (provides `sqlite-vec` and `python-frontmatter`)
Configuration¶
Required¶
`OPENAI_API_KEY` must be set in `secrets.yaml` for embeddings and `knowledge.ask()`.
Optional¶
| Key | Type | Default | Description |
|---|---|---|---|
| `tools.knowledge.model` | string | `""` | OpenAI embedding model. Falls back to `llm.embedding_model`; built-in default: `text-embedding-3-small`. |
| `tools.knowledge.base_url` | string | `""` | OpenAI-compatible API base URL. Empty = inherit from top-level `llm.base_url`. |
| `tools.knowledge.dimensions` | int | `1536` | Embedding dimensions. Must match the configured model. |
| `tools.knowledge.max_embedding_tokens` | int | `8191` | Max tokens per embedding input. |
| `tools.knowledge.embedding_batch_size` | int | `200` | Texts per embeddings API call. Range: 1-2048. |
| `tools.knowledge.search_limit` | int | `10` | Default max search results. Range: 1-100. |
| `tools.knowledge.search_extract` | int | `300` | Character limit for content extract in search results (0 = full). |
| `tools.knowledge.enrich_model` | string | `""` | LLM model for `knowledge.ask()` synthesis. Empty = falls back to top-level `llm.model`. |
| `tools.knowledge.min_chunk_chars` | int | `200` | Minimum body characters per chunk. Chunks below the threshold are merged. 0 disables merging. |
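The `min_chunk_chars` row says that undersized chunks are merged, but not into which neighbour. As a rough illustration, the sketch below accumulates text until the threshold is reached and folds any short remainder into the last chunk. That policy, the `merge_small_chunks` name, and the use of a blank line as the joiner are assumptions, not the tool's documented behaviour.

```python
def merge_small_chunks(chunks, min_chunk_chars=200):
    """Merge chunks so each result has at least `min_chunk_chars` characters.

    Assumed policy: short chunks are joined with the chunks that follow
    them; a short tail is appended to the last emitted chunk.
    A threshold of 0 disables merging.
    """
    if min_chunk_chars <= 0:
        return list(chunks)

    merged = []
    buffer = ""
    for chunk in chunks:
        buffer = f"{buffer}\n\n{chunk}" if buffer else chunk
        if len(buffer) >= min_chunk_chars:
            merged.append(buffer)
            buffer = ""
    if buffer:
        if merged:
            merged[-1] = f"{merged[-1]}\n\n{buffer}"
        else:
            merged.append(buffer)  # Nothing reached the threshold; keep the text anyway.
    return merged


# Five made-up chunks of 50, 250, 30, 300 and 10 characters.
sample = ["a" * 50, "b" * 250, "c" * 30, "d" * 300, "e" * 10]
result = merge_small_chunks(sample, min_chunk_chars=200)
```

Here the five inputs collapse into two chunks, and no text is dropped: joining the results reproduces the joined input.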
Project registry (under `tools.knowledge.kb`):

```yaml
tools:
  knowledge:
    model: text-embedding-3-small
    base_url: ""
    dimensions: 1536
    search_limit: 10
    search_extract: 300
    enrich_model: ""
    min_chunk_chars: 200
    kb:
      docs:
        db:
          path: kb/docs.db
          description: Scraped documentation
          embeddings_enabled: true
        scrape:
          output_base_dir: /path/to/scraped/docs
          sources:
            python:
              url: https://docs.python.org/3/
              url_prefix: /3/
```
Defaults¶
- If `tools.knowledge.base_url` is empty, it inherits from the top-level `llm.base_url`.
- If `tools.knowledge.model` is empty, it inherits from `llm.embedding_model`.
- If `tools.knowledge.enrich_model` is empty, it falls back to `llm.model`.
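The fallback rules above can be sketched as a small resolver over a parsed `onetool.yaml` mapping. This is an illustration under assumptions: the function name `resolve_knowledge_settings` and the plain-dict config shape are hypothetical, and the sample URL and chat model name are placeholders.

```python
# Built-in embedding default named in the Optional table.
BUILTIN_EMBEDDING_MODEL = "text-embedding-3-small"


def resolve_knowledge_settings(config):
    """Apply the documented fallbacks to a parsed config mapping."""
    llm = config.get("llm", {})
    kb = config.get("tools", {}).get("knowledge", {})
    return {
        # An empty string means "inherit", so `or` is used rather than dict defaults.
        "base_url": kb.get("base_url") or llm.get("base_url", ""),
        "model": kb.get("model") or llm.get("embedding_model") or BUILTIN_EMBEDDING_MODEL,
        "enrich_model": kb.get("enrich_model") or llm.get("model", ""),
    }


# Placeholder values for illustration only.
config = {
    "llm": {"base_url": "https://example.invalid/v1", "model": "some-chat-model"},
    "tools": {"knowledge": {"model": "", "base_url": "", "enrich_model": ""}},
}
settings = resolve_knowledge_settings(config)
```

With every `tools.knowledge` value left empty, the base URL and synthesis model come from the `llm` section, and the embedding model ends at the built-in default because `llm.embedding_model` is not set.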
Examples¶
```python
# Search a knowledge base (hybrid FTS5+vector)
knowledge.search(q='context managers', db='docs')

# Keyword-only search with more results
knowledge.search(q='yield generator', db='docs', mode='keyword', k=20)

# AI synthesis: retrieves relevant chunks, then answers
knowledge.ask(q='How do I configure authentication?', db='docs')

# Write a personal annotation
knowledge.write(topic='python/tips/loops', content='Use enumerate() for index access', db='docs', category='rule')

# Grep for a pattern across all content
knowledge.grep(pattern='def __init__', db='docs')

# Follow related topics via link graph
knowledge.related(topic='python/asyncio/tasks', db='docs', direction='out', depth=2)

# List all configured databases
knowledge.dbs()

# Check database stats
knowledge.stats(db='docs')

# Read a specific entry
knowledge.read(topic='python/tips/loops', db='docs')
```
CLI¶
The onetool kb command group handles offline knowledge base operations (scraping, indexing, and maintenance). All subcommands auto-detect onetool.yaml from the current directory.
Global options¶
| Option | Description |
|---|---|
| `-c, --config PATH` | Path to `onetool.yaml` (auto-detected from `./onetool.yaml` or `.onetool/onetool.yaml`) |
| `-s, --secrets PATH` | Path to secrets file (auto-detected alongside config if omitted) |
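The config auto-detection in the table can be pictured as a short lookup. This sketch assumes the two locations are tried in the order listed and that an explicit `--config` value always wins. Both points, along with the `find_config` name, are assumptions rather than documented behaviour.

```python
from pathlib import Path

# Candidate locations from the option table, checked in the listed order
# (the precedence between them is an assumption).
CANDIDATES = ("onetool.yaml", ".onetool/onetool.yaml")


def find_config(start_dir=".", explicit=None):
    """Return the config path to use, or None when nothing is found."""
    if explicit is not None:
        return Path(explicit)  # Mirrors -c/--config overriding auto-detection.
    base = Path(start_dir)
    for candidate in CANDIDATES:
        path = base / candidate
        if path.is_file():
            return path
    return None
```

Calling `find_config()` from a project root returns the first matching file, and `None` signals that the caller would need to pass `--config` explicitly.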
onetool kb scrape¶
Crawl all sources in a scrape project. Requires the `onetool-mcp[scrape]` extra and a Chromium browser installed with `playwright install chromium`.
| Option | Description |
|---|---|
| `--only TEXT` | Comma-separated source names to run (runs all if omitted) |
| `--resume` | Resume each source from `.state.json` if present |
| `--max-pages INT` | Hard limit on pages written per source (overrides config) |
| `--flat-files / --no-flat-files` | Write flat `::`-separated files instead of subdirectories |
| `--debug` | Write per-page debug artifacts (`cleaned.html`, `raw.html`, `screenshot.png`, `meta.json`) to `._debug/<slug>/` |
```shell
onetool kb scrape docs
onetool kb scrape docs --only python,stdlib --max-pages 200
onetool kb scrape docs --resume
```
onetool kb index¶
Index a project's scraped content into the knowledge database.
| Option | Description |
|---|---|
| `--path PATH` | Directory to index (overrides project's `output_base_dir`) |
| `--overwrite TEXT` | `skip` (default) or `update` |
```shell
onetool kb index docs
onetool kb index docs --overwrite update
onetool kb index docs --path /tmp/scraped
```
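When scraping and indexing are run together from a script, the two commands can be chained. The sketch below builds argument lists using only the flags documented on this page; the helper names are hypothetical, and it assumes the CLI exits with a non-zero status on failure so that `check=True` stops the pipeline.

```python
import subprocess


def scrape_command(project, only=None, max_pages=None, resume=False):
    """Build the argv list for `onetool kb scrape` from documented flags."""
    argv = ["onetool", "kb", "scrape", project]
    if only:
        argv += ["--only", ",".join(only)]  # Comma-separated source names.
    if max_pages is not None:
        argv += ["--max-pages", str(max_pages)]
    if resume:
        argv.append("--resume")
    return argv


def index_command(project, overwrite="skip", path=None):
    """Build the argv list for `onetool kb index`."""
    if overwrite not in ("skip", "update"):
        raise ValueError("overwrite must be 'skip' or 'update'")
    argv = ["onetool", "kb", "index", project, "--overwrite", overwrite]
    if path is not None:
        argv += ["--path", str(path)]
    return argv


def refresh(project, **scrape_options):
    """Scrape, then index with --overwrite update; stop on the first failure."""
    commands = (scrape_command(project, **scrape_options), index_command(project, "update"))
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `refresh("docs", only=["python"], resume=True)` would run `onetool kb scrape docs --only python --resume` followed by `onetool kb index docs --overwrite update`.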
onetool kb reindex¶
Backfill missing embeddings for all chunks in an existing database.
onetool kb stats¶
Print chunk counts, embedding coverage, and file size.
onetool kb info¶
Print database metadata, path, and version.
onetool kb export¶
Export all chunks (or a filtered subset) to a JSON file.
| Option | Description |
|---|---|
| `-o, --output PATH` | Output JSON file path (required) |
| `--category TEXT` | Filter by category |
| `--topic TEXT` | Filter by topic prefix |