
LLM pipeline for daily news briefs and podcast-style summaries

Overview

A production news aggregation system that pulls from 30+ RSS feeds and produces a curated daily digest: stories are classified by topic, ranked by importance, and published with LLM-generated briefs and podcast-style audio summaries. Built solo in two weeks, it runs for under $0.15/day in LLM costs.

The demo above shows a typical morning browsing the site.


How The Pipeline Works

Every morning, the system ingests around 200-500 articles (depending on the day) from major news sources (FT, BBC, Reuters, WSJ, etc.). Many articles are duplicates: the same story covered by 10 different outlets. The pipeline figures out which articles belong together, decides what's truly important, and writes concise briefs for the top stories.

  • Ingest & Embed — RSS feeds are fetched and parsed. Each article is embedded using OpenAI’s text-embedding-3-small, truncated to 768 dimensions.
  • Cluster — Related articles are grouped using hierarchical clustering on the embedded vectors. This runs in scipy. A major international story (for example, the recent US operation in Venezuela) might have 20 articles from different sources, all grouped into one cluster.
  • Classify — Each cluster is assigned to one of 8 topics (UK, US, World, Europe, Business, Technology, Science, Arts). This is the first LLM call. The model also generates a one-sentence summary and filters out non-news content like opinion pieces or year-in-review articles.
  • Rank — Stories within each topic are then scored 1-10 for importance. The system runs this twice and compares results — if the same stories rank highly both times, the ranking is reliable. Final scores are averaged.
  • Enrich — Top stories get researched properly. The system searches the web using Brave Search API for full articles, extracts the content using Jina Reader, cleans up the noise with another LLM call, then generates the final brief from the actual article text.
  • Output — Finished stories hit a Postgres database, then flow to a static site, email subscribers, and audio podcasts via Modal TTS.
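The Cluster step above can be sketched with scipy's hierarchical clustering on the embedding vectors. This is a minimal sketch, not the production code: the distance threshold, linkage method, and toy vectors are assumptions.

```python
# Sketch of the Cluster stage: group article embeddings with scipy
# hierarchical clustering. Threshold and linkage are illustrative.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_articles(embeddings: np.ndarray, threshold: float = 0.35) -> np.ndarray:
    """Return a cluster label per article from cosine-distance linkage."""
    # Average linkage on cosine distance; articles closer than the
    # threshold end up in the same story cluster.
    Z = linkage(embeddings, method="average", metric="cosine")
    return fcluster(Z, t=threshold, criterion="distance")

# Toy example: two near-duplicate vectors and one unrelated one.
emb = np.array([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]])
labels = cluster_articles(emb)
# The first two articles share a label; the third is its own cluster.
```

In practice the same-story articles from different outlets embed close together, so a single distance cutoff is enough to collapse a 20-article international story into one cluster.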

Total runtime is about 15-20 minutes (it could be faster, but free-tier rate limits on external APIs slow it down). The digest publishes at around 9am UK time; emails are sent around 9:15am.

Pipeline architecture diagram

Key Design Decisions

Workflows over agents. No autonomous loops or agent orchestration. Each stage has explicit inputs, outputs and failure modes. When something breaks, I need to understand exactly which stage failed. Debugging an agent making unexpected tool calls is a lot more challenging. Every LLM call is tagged in Helicone by pipeline stage and topic, so cost spikes and quality issues trace back to specific components.
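The per-stage tagging can be sketched with Helicone's custom-property headers. The header names follow Helicone's public docs; the stage and topic labels here are illustrative, not the production taxonomy.

```python
# Sketch of per-stage tagging with Helicone custom properties.
# Each LLM request carries headers identifying its pipeline stage
# and topic, so spend and quality issues filter by component.
def helicone_properties(stage: str, topic: str) -> dict:
    """Per-request headers that tag an LLM call by stage and topic."""
    return {
        "Helicone-Property-Stage": stage,
        "Helicone-Property-Topic": topic,
    }

# With the OpenAI client routed through Helicone's gateway
# (base_url="https://oai.helicone.ai/v1"), the tags ride along on
# each request:
#
#   client.chat.completions.create(
#       model="gpt-4o-mini",
#       messages=[{"role": "user", "content": prompt}],
#       extra_headers=helicone_properties("classify", "Business"),
#   )
```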

Minimise LLM usage. The instinct with AI projects is to throw the LLM at everything. That gets expensive and unreliable quickly. Clustering uses scipy on embeddings. Image filtering uses alt-text matching before the vision model sees anything. The LLM only handles judgement calls.
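The alt-text pre-filter idea reduces to plain string matching before any vision-model call. A minimal sketch, with made-up candidate images and keywords:

```python
# Sketch of the cheap image pre-filter: match candidate images by
# alt text before the vision model sees anything. Data is illustrative.
def alt_text_matches(alt: str, story_keywords: list[str]) -> bool:
    """True if the image alt text mentions any story keyword."""
    alt_lower = alt.lower()
    return any(kw.lower() in alt_lower for kw in story_keywords)

candidates = [
    {"url": "a.jpg", "alt": "Protesters outside the Bank of England"},
    {"url": "b.jpg", "alt": "Stock photo of a laptop"},
]
# Only images that pass this string check go on to the vision model.
keep = [c for c in candidates if alt_text_matches(c["alt"], ["Bank of England"])]
```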

Structured outputs everywhere. Every LLM call uses JSON mode with Pydantic validation. Define a schema, include it in the prompt, validate the response. Zero parsing errors in production. When validation fails, I get a clear error about what’s wrong.
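The pattern looks roughly like this; the field names are illustrative, not the production schema.

```python
# Sketch of the structured-output pattern: define a Pydantic schema,
# request JSON mode, validate the raw model response against it.
from pydantic import BaseModel, Field, ValidationError

class ClusterLabel(BaseModel):
    topic: str
    summary: str
    is_news: bool
    importance: int = Field(ge=1, le=10)  # constraint enforced on parse

# Stand-in for a JSON-mode LLM response.
raw = '{"topic": "Business", "summary": "Markets fell.", "is_news": true, "importance": 7}'

try:
    label = ClusterLabel.model_validate_json(raw)
except ValidationError as e:
    # A failed validation names the exact offending field and constraint.
    print(e)
```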

Multi-stage enrichment. RSS feeds only give you a title and maybe two sentences. Not enough context to write a good brief — the LLM pads with filler or hallucinates facts. Searching for full articles and extracting their content provides 10-20x more context.
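A minimal sketch of the search-and-extract flow, assuming the public Brave Search and Jina Reader endpoints; query construction, key handling, and error recovery are simplified.

```python
# Sketch of the Enrich stage: find the full article via Brave Search,
# then pull clean text through Jina Reader's URL-prefix API.
import requests

BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"
JINA_READER = "https://r.jina.ai/"

def search_article(query: str, api_key: str) -> str:
    """Return the top web result URL for a story headline."""
    resp = requests.get(
        BRAVE_ENDPOINT,
        params={"q": query, "count": 3},
        headers={"X-Subscription-Token": api_key, "Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["web"]["results"][0]["url"]

def fetch_full_text(url: str) -> str:
    """Jina Reader prefixes any URL and returns LLM-friendly text."""
    return requests.get(JINA_READER + url, timeout=30).text
```

The extracted text then goes through a cleanup LLM call before the final brief is written, so the brief is grounded in the actual article rather than a two-sentence RSS snippet.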


Infrastructure

Everything runs on a single Hetzner VPS at €3.50/month. Cloudflare handles CDN and SSL termination. Caddy reverse-proxies to Docker containers (FastAPI + Postgres/pgvector) and serves the static frontend.

A cron job handles the daily cycle: pipeline at 08:20, frontend rebuild at 09:00, email distribution around 09:15.

The system is designed for graceful degradation. If Brave is down, stories proceed without web enrichment. If image search fails, stories publish without images. If Modal fails, the digest goes out without audio. External API failures don’t crash the pipeline.
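The degradation pattern amounts to running each optional stage behind a guard that logs the failure and substitutes a fallback. A sketch with illustrative names:

```python
# Sketch of graceful degradation: optional stages run under a guard
# that logs and returns a fallback instead of crashing the pipeline.
import logging

logger = logging.getLogger("pipeline")

def try_stage(fn, *args, fallback=None, stage=""):
    """Run an optional stage; on any error, log and degrade."""
    try:
        return fn(*args)
    except Exception:
        logger.exception("stage %s failed, continuing without it", stage)
        return fallback

def flaky_image_search(story):
    # Stand-in for an external API that happens to be down today.
    raise ConnectionError("image API down")

story = {"title": "Example"}
# The story publishes with no image rather than aborting the run.
image = try_stage(flaky_image_search, story, fallback=None, stage="images")
```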

Infrastructure diagram

Stack

Layer            Tech
Backend          Python, FastAPI, SQLAlchemy
Database         PostgreSQL + pgvector
LLM Gateway      Helicone → gpt-4o-mini, grok-4-1-fast-reasoning
Embeddings       text-embedding-3-small (768-dim)
Search           Brave Search API, Jina Reader
Audio            Modal
Infrastructure   Hetzner VPS, Docker, Caddy, Cloudflare

Metrics

Metric              Value
Articles ingested   ~500/day
Stories published   25-35/day
Pipeline runtime    15-20 min
LLM cost            $0.10-0.15/day
Infrastructure      €3.50/month