notifier¶
Push delivery layer. Receives a SummaryResult from the summarizer (via the lifespan-installed on_summary callback) and renders it for one or more configured channels. Today the only built-in channel is email.
Responsibility¶
- Define a marker base class so the dispatcher in main.py can route a result to all channels configured on an intent via simple isinstance checks
- Own the per-channel config schema: each channel ships its own Pydantic model so the API boundary validates channel parameters before they ever reach send
- Render an LLM-produced Markdown digest into HTML safe for HTML-only mail clients (escape user-supplied titles, scrub citation references the LLM hallucinated, embed the brand logo as a cid:-referenced inline image)
- Render per-citation metadata in the intent's timezone, not the deployment-wide one, because different intents may belong to different operators
- Surface the matcher's relevance score next to each cited article so a reader can tell at a glance which sources drove the match
- Never raise out of send / send_error: a delivery failure must not abort the remaining channels in the same tick or crash the summarizer's tick loop
- Provide a separate send_error path so that a broken prompt template results in an actionable email to the operator instead of a silently missed digest
Not in scope¶
- Match selection or summarization: the matcher and summarizer own those
- Retry / queue / dead-letter: notification_log's pending → sent / failed → dead state machine is on the roadmap; today a failed send is logged and dropped
- Push channels other than email: Telegram / Discord / Slack channels are scaffolded by the marker ABC but not yet implemented
- Outbound rate limiting: a future pre_push_hook on the summarizer is the seam for that, not the channel itself
Public interface¶
Marker base (base.py)¶
There is no abstract send because each channel takes a channel-specific config object whose shape would force the ABC into either Any or a generic protocol — both lose more typing power than they save. The dispatcher in main.py instead pattern-matches on the channel config type:
for ch in intent.channels:
    if isinstance(ch, EmailChannelConfig):
        await email_ch.send(result, config=ch, ...)
A future Telegram channel adds its own TelegramChannelConfig type and its own isinstance arm; nothing about the existing channels needs to change.
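The routing described above can be sketched in isolation. This is a minimal illustration, not the code in main.py: the dataclasses stand in for the real Pydantic models, and TelegramChannelConfig, RecordingChannel, and the dispatch function are hypothetical names introduced only for this example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class EmailChannelConfig:
    """Stand-in for the real Pydantic model; only the type matters here."""
    to: list[str]


@dataclass
class TelegramChannelConfig:
    """Hypothetical future channel config, not part of the module today."""
    chat_id: str


class RecordingChannel:
    """Fake channel that records calls instead of delivering anything."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, object]] = []

    async def send(self, result: str, *, config: object) -> None:
        self.sent.append((result, config))


async def dispatch(result, channels, email_ch, telegram_ch) -> None:
    """Route one result to every configured channel by config type."""
    for ch in channels:
        if isinstance(ch, EmailChannelConfig):
            await email_ch.send(result, config=ch)
        elif isinstance(ch, TelegramChannelConfig):
            # The only change a new channel needs: one more arm.
            await telegram_ch.send(result, config=ch)
        # Config types without an arm are skipped in this sketch.
```

Because each arm narrows the config to a concrete type, every channel's send can keep a precisely typed config parameter without a shared abstract signature.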
Email channel config (email.py)¶
from typing import Literal

from pydantic import BaseModel, EmailStr, Field


class EmailChannelConfig(BaseModel):
    type: Literal["email"] = "email"  # discriminator value
    to: list[EmailStr] = Field(min_length=1, max_length=50)
    cc: list[EmailStr] = Field(default_factory=list, max_length=20)
    bcc: list[EmailStr] = Field(default_factory=list, max_length=20)
type is a Pydantic discriminator value so Intent.channels (a list[Annotated[..., Field(discriminator="type")]]) can deserialize a heterogeneous list without ambiguity. RFC validation runs at the API boundary — by the time a config reaches send the addresses are already syntactically valid. The list bounds prevent fan-out abuse via a single intent.
Email channel (email.py)¶
class EmailChannel(BaseChannel):
    def __init__(self, settings: Settings) -> None: ...

    async def send(
        self,
        result: SummaryResult,
        *,
        config: EmailChannelConfig,
        intent_name: str,
        intent_timezone: str,
    ) -> None: ...

    async def send_error(
        self,
        intent_name: str,
        kind: str,    # "system" | "instruction"
        name: str,    # template name that failed
        reason: str,  # exception message; first line shown in subject
        *,
        config: EmailChannelConfig,
    ) -> None: ...
send:
- If settings.smtp_host is empty, log a warning and return. This is the "SMTP not configured yet" path operators hit on first boot, and it must not crash the summarizer's tick
- Pick the citation list: result.citations if populated, else [result.primary, *result.other_sources], else empty (for the rare case where the LLM returned a summary with no cited articles)
- Resolve intent_timezone to a ZoneInfo, falling back to UTC on ZoneInfoNotFoundError (logged warning)
- Convert each citation's published_at to a display string in that timezone; format the matcher's score (if present) as a 0.NN badge
- Convert the LLM's Markdown summary to HTML, replacing [N] references with <sup><a href="#cite-N">[N]</a></sup> anchors. References outside 1..len(citations) are silently dropped: LLMs occasionally hallucinate [7] in a 4-citation digest, and producing dead anchors hurts more than the missing reference
- Render templates/email_digest.html.jinja2 with autoescape on
- Wrap the HTML in multipart/related with the brand logo attached as an inline image (Content-ID: <sembr-logo>) when the bundled logo file is present; degrade to a single-part text/html message otherwise
- Build To / Cc headers from the config; Bcc is not placed in headers, only into the SMTP envelope (RCPT TO), so recipients cannot see who else is copied
- Hand the message off to smtplib on a background thread via asyncio.to_thread (smtplib is sync; the event loop must not block on a slow MTA)
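The citation-reference step can be illustrated with a small regex pass. This is a sketch under assumptions: the function name link_citations is hypothetical, and whether the real module rewrites references before or after the Markdown-to-HTML conversion is not stated here.

```python
import re

# Matches bare numeric references such as [1] or [12].
_REF = re.compile(r"\[(\d+)\]")


def link_citations(text: str, citation_count: int) -> str:
    """Turn in-range [N] references into anchors and drop the rest."""

    def _replace(match: re.Match) -> str:
        n = int(match.group(1))
        if 1 <= n <= citation_count:
            return f'<sup><a href="#cite-{n}">[{n}]</a></sup>'
        # Out-of-range reference (for example a hallucinated [7]): remove it
        # rather than emit an anchor that points nowhere.
        return ""

    return _REF.sub(_replace, text)
```

A pattern this simple would also match the label of a numeric Markdown link such as `[1](https://example.com)`, so a production version would need to run at a stage where such links are already resolved.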
send_error follows the same shape but uses a different template (email_template_error.html.jinja2) and a different subject ([sembr][error] {intent} — {kind} template '{name}' — {reason}). It exists because a renamed or syntactically broken prompt template produces no digest at all — the operator needs an active alert, not silence on the next cron tick.
Both methods are wrapped in a top-level try / except that logs and swallows. The dispatcher in main.py independently logs and swallows as well; this is intentional defense-in-depth, because an exception escaping either layer unnoticed is strictly worse than a duplicate log line.
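The log-and-swallow shape, combined with the asyncio.to_thread hand-off, might look roughly like the following. The helper name deliver_safely, the logger name, and the boolean return value are assumptions made for this sketch; the real methods return None.

```python
import asyncio
import logging
from typing import Callable

logger = logging.getLogger("notifier.sketch")  # hypothetical logger name


async def deliver_safely(blocking_send: Callable[[], None], *, channel: str) -> bool:
    """Run a blocking send in a worker thread; never let its failure escape."""
    try:
        # smtplib-style calls are synchronous, so keep them off the event loop.
        await asyncio.to_thread(blocking_send)
    except Exception:
        # Exception (not BaseException) so task cancellation still propagates.
        logger.exception("delivery via %s failed; continuing", channel)
        return False
    return True
```

Catching Exception rather than BaseException is a deliberate choice in this sketch: asyncio.CancelledError derives from BaseException, so shutting down the tick loop is not masked by the swallow.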
Templates¶
templates/email_digest.html.jinja2 — the digest layout. Inline-styled in addition to a <style> block because Outlook and several webmail clients still strip <style> tags. Uses multipart/related rather than multipart/alternative so the SpamAssassin MIME_HTML_ONLY rule does not fire.
templates/email_template_error.html.jinja2 — the operator-facing error layout, used by send_error. Renders the failed template kind, name, reason, and the resolved on-disk prompts_dir so the operator can find the file to fix.
templates/assets/logo.png — read once at module import. Missing or unreadable logo logs a warning and degrades the message to single-part HTML; sending continues.
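The logo degradation and the Bcc handling can be sketched with the standard library's email.message.EmailMessage. This is an illustration only: build_message and its parameters are hypothetical names, and the real module may assemble the MIME tree differently.

```python
from __future__ import annotations

from email.message import EmailMessage


def build_message(
    html: str,
    *,
    subject: str,
    sender: str,
    to: list[str],
    cc: list[str],
    bcc: list[str],
    logo_png: bytes | None,
) -> tuple[EmailMessage, list[str]]:
    """Return the message plus the SMTP envelope recipient list."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    if cc:
        msg["Cc"] = ", ".join(cc)
    # Deliberately no Bcc header: those addresses go into the envelope only.

    msg.set_content(html, subtype="html")
    if logo_png is not None:
        # Converts the message to multipart/related and attaches the logo
        # so the template can reference it as cid:sembr-logo.
        msg.add_related(logo_png, maintype="image", subtype="png", cid="<sembr-logo>")

    envelope = [*to, *cc, *bcc]
    return msg, envelope
```

The envelope list is what would be passed as the recipient argument to the SMTP call, which is how Bcc recipients receive the message without appearing in any header.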
Configuration¶
| Field | Default | Notes |
|---|---|---|
| smtp_host | "" | Empty disables email delivery: send becomes a no-op with a one-line warning. Set to e.g. smtp.gmail.com for Gmail, smtp.sendgrid.net for SendGrid |
| smtp_port | 587 | 587 for STARTTLS (the default), 465 for SMTP_SSL |
| smtp_username | "" | SMTP auth username; leave empty to skip AUTH |
| smtp_password | "" (SecretStr) | SMTP auth password; never logged or echoed |
| smtp_from | "" | From: header. Falls back to smtp_username when empty |
| smtp_use_starttls | True | Run STARTTLS after connecting on plain SMTP |
| smtp_use_ssl | False | Use SMTP_SSL directly (port 465 style). When True, smtp_use_starttls is ignored |
| display_timezone | Asia/Shanghai | Server-wide default timezone, surfaced to the dashboard. Not consulted for email rendering: the per-intent timezone is used. Kept for cross-channel UI consistency |
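How the TLS flags and the From fallback interact can be shown with a small pure function. This is a sketch: SmtpSettings is a stand-in dataclass mirroring the fields above, and transport_mode and from_address are hypothetical helper names, not functions of the module.

```python
from dataclasses import dataclass


@dataclass
class SmtpSettings:
    """Stand-in for the SMTP-related fields of the real Settings object."""
    smtp_host: str = ""
    smtp_port: int = 587
    smtp_username: str = ""
    smtp_from: str = ""
    smtp_use_starttls: bool = True
    smtp_use_ssl: bool = False


def transport_mode(s: SmtpSettings) -> str:
    """Decide how the connection would be opened."""
    if not s.smtp_host:
        return "disabled"   # send becomes a no-op with a warning
    if s.smtp_use_ssl:
        return "ssl"        # smtplib.SMTP_SSL; smtp_use_starttls is ignored
    if s.smtp_use_starttls:
        return "starttls"   # smtplib.SMTP followed by starttls()
    return "plain"


def from_address(s: SmtpSettings) -> str:
    """From: header value, falling back to the auth username."""
    return s.smtp_from or s.smtp_username
```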
The path string surfaced inside the send_error body (/app/prompts) is read directly from the module-level sembr.summarizer.templates.PROMPTS_DIR constant — not from a Settings field. The legacy Settings.prompts_dir was removed in the template-management refactor.
The per-intent timezone lives on the Intent row (intents.timezone, schema default 'UTC'). The dispatcher in main.py reads it at send time and threads it through EmailChannel.send(intent_timezone=...).
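Resolving the per-intent timezone with a safe fallback could be sketched as below. The function names, the logger name, and the display format string are assumptions; the sketch also falls back to datetime.timezone.utc rather than ZoneInfo("UTC") so that the fallback works even on hosts without a timezone database.

```python
import logging
from datetime import datetime, timezone, tzinfo
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

logger = logging.getLogger("notifier.sketch")  # hypothetical logger name


def resolve_timezone(name: str) -> tzinfo:
    """Return the named zone, or UTC when the name is unknown."""
    try:
        return ZoneInfo(name)
    except ZoneInfoNotFoundError:
        logger.warning("unknown timezone %r, falling back to UTC", name)
        return timezone.utc


def format_published(published_at: datetime, tz: tzinfo) -> str:
    """Render an aware datetime in the intent's timezone."""
    return published_at.astimezone(tz).strftime("%Y-%m-%d %H:%M")
```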
Dispatcher (dispatcher.py)¶
class ChannelOutcome:
    type: str  # channel type discriminator
    ok: bool
    error: str | None

async def dispatch_summary(
    conn: aiosqlite.Connection,
    email_ch: EmailChannel,
    result: SummaryResult,
    *,
    strict: bool = False,
    subject: str | None = None,
) -> list[ChannelOutcome]: ...
Routes a SummaryResult to every channel configured on the intent. Each channel runs independently — one failed delivery never aborts the others. In strict=False mode (the default cron tick path), errors are swallowed and logged. In strict=True mode (used by the aggregate-send endpoint), each ChannelOutcome includes the error message and the caller decides the HTTP status code.
dispatch_summary is the single dispatch point for both the standard cron-summary path and the aggregate-send path. The aggregate path can optionally override the email subject; when subject is None, the channel computes its own default from the intent name and date range.
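On the strict=True path the caller owns the HTTP status decision. One way a caller might map outcomes to a status code is sketched below; the mapping (200 / 207 / 502) and the function name status_for are purely illustrative assumptions, not the behavior of the aggregate-send endpoint.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ChannelOutcome:
    """Stand-in mirroring the fields listed above."""
    type: str
    ok: bool
    error: str | None = None


def status_for(outcomes: list[ChannelOutcome]) -> int:
    """Hypothetical mapping from per-channel outcomes to an HTTP status."""
    if all(o.ok for o in outcomes):
        return 200  # everything delivered (also the empty-list case)
    if any(o.ok for o in outcomes):
        return 207  # partial success: some channels failed
    return 502      # every channel failed
```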
_email_config_type() returns the EmailChannelConfig Pydantic model — useful for type-narrowing in callers that need to inspect the channel config before dispatch.
Upstream dependencies¶
- config.Settings: SMTP host / port / credentials / TLS flags
- sembr.summarizer.templates.PROMPTS_DIR: path string surfaced in the operator-facing error email
- summarizer.models.SummaryResult: input to send; Citation.score and Citation.published_at drive the per-source line in the rendered HTML
- db.intents.Intent: name, channels, timezone are read by the dispatcher in main.py and passed to send per call
Downstream consumers¶
- The dispatcher in main.py is the only caller of send and send_error. It is registered as the summarizer's on_summary and on_template_error callbacks via the lifespan setup
- Final delivery is handled by smtplib in a worker thread; failures bubble up through the channel's exception handler
Known constraints¶
- No retries / no DLQ today: a failed smtplib call is logged and the result is dropped. The notification_log table that the schema reserves for pending → sent / failed → dead state is not yet wired through this module. Cron-driven intents pick up missed deliveries on the next tick by virtue of the lookback window; event-driven intents lose the buffered tick on send failure
- HTML-only message body: there is no text/plain alternative. Plain-text-only mail clients render the raw HTML markup. Nearly all modern clients render HTML, so the trade-off is acceptable, but a future change should ship a multipart/alternative shell with a Markdown-source plain-text part for compliance and accessibility
- Synchronous SMTP via asyncio.to_thread: a slow or hung MTA holds a thread for the duration of the SMTP exchange. At 1.0 fan-out (a handful of intents firing per minute) this is fine. A high-fan-out deployment should swap smtplib for an async SMTP library or a queue-backed sender
- Single-channel today: only EmailChannelConfig is recognized by the dispatcher. The BaseChannel ABC and the per-channel config pattern were chosen so that adding a TelegramChannelConfig + a TelegramChannel only requires (a) a new isinstance arm in the dispatcher and (b) a new entry in the Intent.channels discriminated union, but the work has not happened yet
- Logo bytes loaded at import time: a few hundred KB held in memory for the lifetime of the process. Cheap and avoids re-reading per send; replace with a per-channel cache only if the deployment ships many large brand assets
- Citation score is a cosine similarity, not a probability: the badge displays the matcher's raw similarity score (typically 0.60–0.95 with the bundled embedder). Users new to ANN search may misread it as a confidence percentage. Document the scale in operator-facing docs, not in the email itself — header-level captioning would clutter the digest
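The multipart/alternative shell mentioned in the HTML-only constraint could be sketched with the standard library as follows. This is a possible future shape, not current behavior, and build_alternative is a hypothetical name; nesting the inline logo (a multipart/related part inside the HTML alternative) is left out to keep the example short.

```python
from email.message import EmailMessage


def build_alternative(markdown_source: str, html: str) -> EmailMessage:
    """Build a message with a plain-text part and an HTML part."""
    msg = EmailMessage()
    # First part: the Markdown source doubles as readable plain text.
    msg.set_content(markdown_source)
    # Second part: the rendered HTML; clients pick the richest part they support.
    msg.add_alternative(html, subtype="html")
    return msg
```

Ordering matters in multipart/alternative: parts go from least to most preferred, so the plain-text part comes first and the HTML part last.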