Contributing
Development setup
Requirements
- Python 3.12.x
- Docker + Docker Compose (for E2E testing on macOS / Linux)
Install dev dependencies
Run static checks (any platform)
uv run python -m py_compile sembr/**/*.py
uv run python -c "import sembr, sembr.api, sembr.collector, sembr.embedder, sembr.vector_store, sembr.matcher, sembr.summarizer, sembr.notifier, sembr.db, sembr.dashboard, sembr.logbus; print('ok')"
uv run pytest tests/ -v
The full test suite runs under pytest-asyncio. No live Qdrant or SQLite required — the tests stub the network/IO surfaces.
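As a rough illustration of what stubbing a network surface can look like, the sketch below replaces a fetcher with `unittest.mock.AsyncMock`. The function `collect_titles`, the feed URL, and the entry shape are hypothetical and are not taken from the sembr codebase. In the real suite pytest-asyncio would drive the coroutine; `asyncio.run` is used here only so the sketch stands alone.

```python
import asyncio
from unittest.mock import AsyncMock

FEED_URL = "https://example.invalid/feed.xml"  # hypothetical URL, never contacted


async def collect_titles(fetch_feed, url):
    """Hypothetical code under test: awaits an injected fetcher and extracts titles."""
    entries = await fetch_feed(url)
    return [entry["title"] for entry in entries]


async def test_collect_titles_without_network():
    # AsyncMock stands in for the real network call, so no server is needed.
    fake_fetch = AsyncMock(return_value=[{"title": "first"}, {"title": "second"}])

    titles = await collect_titles(fake_fetch, FEED_URL)

    assert titles == ["first", "second"]
    # The stub also records how it was awaited, which lets the test check the call.
    fake_fetch.assert_awaited_once_with(FEED_URL)


# Standalone driver for this sketch only.
asyncio.run(test_collect_titles_without_network())
```

Passing the fetcher in as a parameter is one way to make the IO surface replaceable; patching a module attribute with `unittest.mock.patch` is another.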
Lint and format
Project structure
See Architecture for the data flow and design decisions, and Modules for per-module interface contracts. Each module has its own docs/modules/<name>.md documenting upstream / downstream / known constraints — read it before changing the corresponding code.
Adding a new RSS-style source
- Subclass sembr.collector.base.BaseSource. Implement fetch(since) returning tuple[int, int, list[RawArticle]] (items_seen, items_new, parsed) and config_schema() returning a JSON Schema dict
- Register in SOURCE_REGISTRY in sembr.collector.scheduler; the dashboard's create-feed form reads from this dict, so the new source appears immediately
- Add a unit test under tests/collector/
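The steps above might look roughly like the following sketch. So that it runs without the project installed, `BaseSource`, `RawArticle`, and `SOURCE_REGISTRY` are local stand-ins rather than the real sembr objects. Several details are assumptions: the `RawArticle` fields, a synchronous `fetch`, `config_schema` being a classmethod, the registry mapping a string key to a class, and "new" meaning published after `since`. Check the module docs for the actual contract.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RawArticle:
    """Stand-in for the project's RawArticle; the real fields may differ."""
    url: str
    title: str
    published: datetime


class BaseSource(ABC):
    """Stand-in for sembr.collector.base.BaseSource, reduced to the two documented methods."""

    @abstractmethod
    def fetch(self, since: datetime) -> tuple[int, int, list[RawArticle]]: ...

    @classmethod
    @abstractmethod
    def config_schema(cls) -> dict: ...


class StaticListSource(BaseSource):
    """Hypothetical source that serves articles from an in-memory list."""

    def __init__(self, items: list[RawArticle]):
        self._items = items

    def fetch(self, since: datetime) -> tuple[int, int, list[RawArticle]]:
        # Assumption: an item counts as new when it was published after `since`.
        new = [a for a in self._items if a.published > since]
        return len(self._items), len(new), new  # (items_seen, items_new, parsed)

    @classmethod
    def config_schema(cls) -> dict:
        # JSON Schema describing the per-feed settings a form could render.
        return {
            "type": "object",
            "properties": {"url": {"type": "string"}},
            "required": ["url"],
        }


# Stand-in for the registry dict; the key name is hypothetical.
SOURCE_REGISTRY: dict[str, type[BaseSource]] = {"static_list": StaticListSource}
```

A unit test for such a source would typically build a small list of articles on either side of a cutoff and assert on the three values returned by `fetch`.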
A pyproject.toml entry-points discovery layer is on the post-1.0 roadmap; the hardcoded registry is the contract today.
Adding a new notification channel
- Define a Pydantic config model with a unique type: Literal["x"] discriminator and any per-channel fields (recipient list, webhook URL, etc.)
- Add the new config to the Intent.channels discriminated union in sembr.models
- Subclass sembr.notifier.base.BaseChannel. Note that BaseChannel is a marker ABC with no abstract methods because per-channel send() signatures legitimately diverge; define whatever shape your channel needs
- Add an isinstance(ch, XConfig) arm to the dispatcher in sembr.main that calls into your channel
- Wrap your top-level send() in try / except and never raise: a delivery failure must not abort the remaining channels in the same tick or crash the summarizer's tick loop. The dispatcher independently logs and swallows; this is intentional defense-in-depth
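The never-raise rule and the isinstance dispatch can be sketched as below. Everything here is a stand-in so the example runs on the standard library alone: `WebhookConfig` is a dataclass standing in for a Pydantic model, `BaseChannel` is redefined locally, and `dispatch`, `WebhookChannel`, and the injected `post` callable are hypothetical names. The real `send()` may well be async and take different arguments.

```python
import logging
from abc import ABC
from dataclasses import dataclass
from typing import Literal

logger = logging.getLogger("sketch.notifier")


@dataclass
class WebhookConfig:
    """Stand-in for a Pydantic config model; `type` plays the discriminator role."""
    url: str
    type: Literal["webhook"] = "webhook"


class BaseChannel(ABC):
    """Stand-in for sembr.notifier.base.BaseChannel: a marker ABC, no abstract methods."""


class WebhookChannel(BaseChannel):
    def __init__(self, post):
        # `post` is a hypothetical callable (url, text) that performs the HTTP request.
        self._post = post

    def send(self, config: WebhookConfig, text: str) -> bool:
        """Deliver one message; report failure through the return value, never by raising."""
        try:
            self._post(config.url, text)
            return True
        except Exception:
            logger.exception("webhook delivery failed for %s", config.url)
            return False


def dispatch(channels, text, webhook_channel):
    """Hypothetical dispatcher: one isinstance arm per config type."""
    results = []
    for ch in channels:
        try:
            if isinstance(ch, WebhookConfig):
                results.append(webhook_channel.send(ch, text))
            else:
                results.append(False)  # unknown config type: nothing delivered
        except Exception:
            # Second layer of defense: log, swallow, and continue with the next channel.
            logger.exception("channel dispatch failed")
            results.append(False)
    return results
```

With this shape, a channel whose transport raises yields `False` for that entry while later channels in the same list are still attempted.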
Commit style
- feat: new feature
- fix: bug fix
- chore: tooling, deps, CI
- docs: documentation only
- test: test-only changes
License
By contributing you agree that your code will be licensed under Apache-2.0.