sembr¶
A self-hosted intent radar — Reverse-RAG matching across RSS, NewsAPI, and Twitter.
sembr is a self-hosted intent radar. You describe what you care about once — "monitor Fed policy impact on emerging-market currencies" — and it continuously scans RSS feeds, news APIs, and social streams, matches articles to your intent via semantic vectors, and delivers LLM-analyzed digests from whatever angle you configure. No keyword lists, no hand-tuned filters. Telegram / Discord / Slack channels and a local LLM backend are scaffolded by the plugin seams and on the post-1.0 roadmap.
How it works¶
Traditional RAG: user query → search documents → answer.
Reverse RAG (sembr): pre-stored intent vector → continuously scan incoming articles → push semantic matches.
You describe what you care about once. sembr does the watching.
RSS Feeds
│
▼
BGE-M3 Embeddings (SiliconFlow API)
│
▼
Qdrant Vector Store
│ ◄── Intent vectors (stored at creation time)
▼
Semantic Matcher (Qdrant query_points per intent)
│
▼
LLM Summary (OpenAI-compatible chat completions)
│
▼
Email Digest (SMTP)
Each intent picks its own schedule — cron-mode (hourly / daily / weekly preset with a per-intent timezone) or event-mode (fire after N matching articles, or after T seconds since the first buffered match).
Quick links¶
- Getting Started — Docker Compose in 3 steps
- Configuration — every setting with its real default
- Architecture — design decisions explained
- Modules — per-module interface reference
Status¶
1.0 — first stable release.
- RSS ingestion + BGE-M3 embeddings via SiliconFlow API + Qdrant
- Intent CRUD via REST API, cron-mode and event-mode schedules
- LLM-generated summaries via any OpenAI-compatible chat-completions endpoint
- Email digest delivery via SMTP, rendered in each intent's own timezone with matcher-score badges per source
- Read-only monitoring dashboard with live log SSE
- Runtime settings editor — write the host
.envand recreate the affected containers in place
License¶
Apache-2.0 · Built by Peakstone Labs