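Personal Knowledge Base (RAG): Turn links and posts into a searchable knowledge base
Ingest URLs, tweets, articles, and similar inputs, then search them in one place; suited to long-term research and team knowledge reuse.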
GitHub · Discovered 2026-02-10 · Author: hesamsheikh
Prerequisites
- Prepare a folder-based corpus plan (notes/, links/, summaries/).
- Choose an embedding/retrieval backend and decide on a refresh cadence.
Steps
- Ingest URLs and social posts with metadata: source, author, date, topic.
- Chunk documents by semantic section, not fixed length.
- Create a retrieval prompt template that asks for citations in every answer.
- Run a weekly dedup and quality pass to remove stale or low-signal chunks.
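The steps above can be sketched in Python as follows. This is a minimal illustration, not the original author's implementation: it assumes documents are Markdown-style text where `#` headings mark semantic sections, and the names `Chunk`, `chunk_by_section`, `dedup`, `build_prompt`, and `PROMPT_TEMPLATE` are hypothetical. Dedup here only catches exact duplicates after whitespace and case normalization; near-duplicate or staleness checks would need something stronger.

```python
import hashlib
import re
from dataclasses import dataclass

# Assumption: sections are marked by Markdown-style headings.
HEADING = re.compile(r"^#{1,6}\s+(.+)$")


@dataclass
class Chunk:
    text: str
    section: str
    source: str   # URL or post link
    author: str
    date: str
    topic: str


def chunk_by_section(doc: str, meta: dict) -> list[Chunk]:
    """Split at headings instead of fixed lengths; attach metadata to each chunk."""
    sections = [("intro", [])]
    for line in doc.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(1).strip(), []))
        else:
            sections[-1][1].append(line)
    chunks = []
    for title, lines in sections:
        body = "\n".join(lines).strip()
        if body:  # skip headings that have no text of their own
            chunks.append(Chunk(text=body, section=title, **meta))
    return chunks


def dedup(chunks: list[Chunk]) -> list[Chunk]:
    """Drop chunks whose normalized text was already seen (exact-match dedup only)."""
    seen, unique = set(), []
    for chunk in chunks:
        normalized = " ".join(chunk.text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


PROMPT_TEMPLATE = (
    "Answer using only the numbered sources below. "
    "Cite every claim with its source number, e.g. [1].\n\n"
    "{context}\n\nQuestion: {question}\n"
)


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Render retrieved chunks as numbered, attributed sources for the model."""
    context = "\n\n".join(
        f"[{i}] {c.source} ({c.author}, {c.date}, {c.section})\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

Carrying the metadata on each chunk, rather than on the parent document only, lets the prompt cite a specific source and section for every retrieved passage.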
Commands
mkdir -p data/kb/{raw,processed,index}
npm run build
Verify
Ask three questions whose answers you already know, and check that each answer includes correct source references from your corpus.
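This check can be automated with a small script. The sketch below assumes your pipeline exposes some callable that takes a question and returns the answer text; `ask` and `run_checks` are hypothetical names, and the check is a plain substring match on the expected source identifiers, so it confirms that a reference appears, not that the cited passage actually supports the answer.

```python
from typing import Callable


def run_checks(ask: Callable[[str], str], cases: dict[str, list[str]]) -> dict[str, bool]:
    """Map each question to True if its answer mentions any expected source.

    `ask` is a stand-in for your own retrieval pipeline (an assumption).
    `cases` maps a known question to the source URLs or IDs it should cite.
    """
    results = {}
    for question, expected_sources in cases.items():
        answer = ask(question)
        results[question] = any(src in answer for src in expected_sources)
    return results
```

Any question that maps to False is worth inspecting by hand: the chunk may be missing from the index, or the prompt template may not be enforcing citations.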
Caveats
- Without metadata, retrieval quality drops sharply as the corpus grows.
- PII handling policy should be documented before importing private notes (needs verification).
Source attribution
This tip is aggregated from community/public sources and preserved with attribution.