Case study
AI SaaS · own product · agency tool

From a URL to a 25-section SEO+UX audit PDF in three minutes.

Category
AI SaaS
Client
Own product (sitespector.app)
Years
2026
Role
Co-founder · Full-Stack · AI
SiteSpector landing page: “Dominate the search results with SiteSpector”
At a glance

A multi-tenant SEO+UX audit SaaS that paints a 25-section white-label PDF in three minutes — FastAPI orchestration on top of Screaming Frog, Lighthouse, Senuto and a Qdrant-powered RAG over Gemini embeddings.

2–3 min
URL → downloadable PDF — vs. half a day of manual audit
25+
PDF sections — technical / CWV / keywords / backlinks / AI strategy
6
Built-in personas — AI context tuned per role per audit
30 cr/audit
Stripe metering — 4 plans, 50–2000 cr/mo
01 — The problem

Site audits in 2026 are still a copy-paste graveyard.

Every agency we’ve worked alongside runs the same pipeline. Screaming Frog spits out a CSV. PageSpeed Insights gets eyeballed in three tabs. Senuto exports get pasted into a Google Doc. Ahrefs gets screenshotted. The lot gets stitched into a 60-page PDF in Canva three days later and shipped to the client, who reads the first two pages and emails back “ok but what should we do?”. Each tool charges per seat, none of them share a database, and the interpretation layer (turning a 14% Core Web Vitals failure rate into a prioritised execution plan with code) happens in one consultant’s head, on Friday, against a deadline.

Half the SaaS competitors solve the crawling half and ignore the strategy half. The other half generate beautiful reports nobody acts on. Nothing on the market combines a real professional crawler (Screaming Frog), real Web Vitals (Lighthouse), real Polish-market data (Senuto), an LLM that has read the audit (RAG over Gemini), and a multi-tenant workspace for agencies in one tool.

SiteSpector exists because the boring infrastructure had to be built: an orchestrator that fans out to four data sources in parallel, a RAG pipeline that doesn’t hallucinate, a PDF system with 25+ Jinja2 templates that doesn’t collapse the first time Senuto returns a dict instead of a list, and a credit model that lets one agency owner pay for the whole team.

02 — The vision

One URL, one button, one PDF. The rest is plumbing.

Three product principles drove every decision:

  • One audit, one button, one PDF. Paste a URL, pick a persona, wait three minutes, download a white-labelled report. No dashboard juggling, no exporting CSVs from four tools, no manual stitching. The orchestrator fans the work out to Screaming Frog, Lighthouse desktop, Lighthouse mobile and Senuto in parallel, runs Gemini analysis per section against a Qdrant RAG index, then renders WeasyPrint with embedded matplotlib charts.
  • Strategy is the product. Raw data is the price of entry. What you ship to the client is an execution plan — nine areas analysed, prioritised tasks, ready-made code snippets per recommendation. The headline number on the deliverable is “Quick Wins” in the strategy section, not “404s found” in the crawl section.
  • Multi-tenant by default. Workspaces > projects > audits, three-level hierarchy enforced by Postgres Row Level Security from day one. Stripe metering credits cover the whole team, not per-seat. The agency owner is the buyer; everyone else logs in for free.
03 — Who it’s for

Six built-in personas tune the AI per audience. The four users below get four very different reports from the same audit.

SiteSpector founders section on sitespector.app — Dawid Kawalec (Product & UX) and Piotr Chabros (Architecture & Infra), the two-person team behind the tool
SEO agency owner

White-labels the PDF, runs five client audits per Monday, lives in the project tree. Cares about monthly delivery cadence, team roles and a deliverable the client opens to page two.

Freelance SEO specialist

Buys Solo credits, runs an audit per pitch, leads with the execution plan and the code snippets. Doesn’t want a team workspace — wants a sharp, fast tool that makes them look senior.

In-house marketing manager

Cares about Polish-market visibility, tracks Senuto keyword movements and AI Overviews mentions month-over-month. The report is for the board, so it has to read like a strategy memo, not a CSV dump.

E-commerce founder

Crawls per category, watches CWV per product page, benchmarks against three competitor URLs. Cares about conversion-impacting issues first — broken canonicals before broken og:image.

04 — The architecture

A FastAPI orchestrator with four data sources, a Qdrant RAG, and a 25-template PDF renderer.

Three frontends share one domain: the marketing landing, the authenticated dashboard, and an SSE-driven chat panel that streams Gemini answers grounded in the audit. Caddy does path-based routing. The FastAPI backend orchestrates each audit as a background job on a Redis queue: Screaming Frog (CLI), Lighthouse desktop, Lighthouse mobile and Senuto run in parallel; results land in PostgreSQL 16 with Row Level Security. The same job indexes the audit into Qdrant via Gemini embeddings so the chat (and the AI execution-plan pass) can ground every answer in the actual data. WeasyPrint renders 25+ Jinja2 templates into one A4 PDF with embedded matplotlib charts. Everything runs on a single Hetzner CPX42 VPS (8 cores / 16 GB) — an upgrade from CPX21 once Lighthouse parallel execution became the bottleneck.
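The parallel fan-out described above can be sketched with asyncio. This is a minimal illustration, not the production orchestrator: the real pipeline runs as a background job on a Redis queue, and the four `run_*` coroutines and `run_audit_sources` below are hypothetical stand-ins for the actual integrations.

```python
import asyncio
from typing import Any, Dict


# Hypothetical stand-ins for the real integrations (CLI calls, API requests).
async def run_screaming_frog(url: str) -> Dict[str, Any]:
    return {"source": "screaming_frog", "url": url}


async def run_lighthouse(url: str, form_factor: str) -> Dict[str, Any]:
    return {"source": f"lighthouse_{form_factor}", "url": url}


async def run_senuto(url: str) -> Dict[str, Any]:
    return {"source": "senuto", "url": url}


async def run_audit_sources(url: str) -> Dict[str, Any]:
    """Run the four data sources concurrently and key the results by source."""
    jobs = {
        "crawl": run_screaming_frog(url),
        "lighthouse_desktop": run_lighthouse(url, "desktop"),
        "lighthouse_mobile": run_lighthouse(url, "mobile"),
        "senuto": run_senuto(url),
    }
    # return_exceptions=True: a failing source shows up as an exception object
    # in the result instead of aborting the whole gather call.
    results = await asyncio.gather(*jobs.values(), return_exceptions=True)
    return dict(zip(jobs.keys(), results))
```

Collecting exceptions per source is one possible design choice here: it lets a later stage decide whether a partial audit is still worth rendering.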

Marketing landing
Next.js 16 · container queries (ADR-037)
Dashboard app (auth)
React 19 · TanStack Query · SSE chat panel
Caddy reverse proxy · path-based routing · TLS
marketing ↔ app ↔ SSE stream — one domain, three apps
FastAPI + Pydantic orchestrator
Redis worker queue · per-audit job fan-out · SSE streams
PostgreSQL 16
RLS multi-tenant · pgvector
Qdrant + Gemini embeddings
RAG per audit · gemini-embedding-001
Screaming Frog · Lighthouse · Senuto · Gemini · WeasyPrint
parallel crawl + perf + market data → analysis → PDF with matplotlib charts
Hetzner CPX42 VPS (8 cores / 16 GB)
Docker multi-stage · Caddy · system Chromium · Screaming Frog binary in image
05 — Technical challenges

Six production incidents that became architectural decisions.

01

Lighthouse parallel OOM on a 2-core VPS

Problem. Audit pipeline kicked off desktop and mobile Lighthouse in parallel because that’s the obvious thing. On a CPX21 (2 cores / 4 GB), two headless Chromium processes drove load average to 31, both timed out, the whole audit failed. The job retried, made it worse, and the host became unresponsive.

Solution. Two-stage fix in backend/app/services/lighthouse.py. First: sequential execution with a 180-second per-pass timeout and explicit memory ceilings — ships immediately, audits get slower but stop failing. Then: VPS upgrade CPX21 → CPX42 (8 cores / 16 GB, commit c34fc72) and parallel execution restored behind a feature flag so we can fall back if the workload changes. The pipeline learned to live within real resources before it was allowed to assume bigger ones.
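A rough sketch of the two modes, assuming the Lighthouse invocation is wrapped in an async callable. The names `run_lighthouse_passes` and `run_pass` and the `parallel` flag are illustrative and are not taken from lighthouse.py; only the 180-second timeout comes from the description above.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# A pass runner takes (url, form_factor) and returns a report dict.
PassRunner = Callable[[str, str], Awaitable[dict]]


async def run_lighthouse_passes(
    url: str,
    run_pass: PassRunner,
    *,
    parallel: bool = False,   # hypothetical feature flag
    timeout_s: float = 180.0,  # per-pass timeout from the text
) -> Dict[str, Optional[dict]]:
    async def guarded(form_factor: str) -> Optional[dict]:
        try:
            return await asyncio.wait_for(run_pass(url, form_factor), timeout=timeout_s)
        except asyncio.TimeoutError:
            # A timed-out pass yields None instead of failing the whole audit.
            return None

    if parallel:
        desktop, mobile = await asyncio.gather(guarded("desktop"), guarded("mobile"))
    else:
        # Sequential mode: only one headless Chromium is alive at a time.
        desktop = await guarded("desktop")
        mobile = await guarded("mobile")
    return {"desktop": desktop, "mobile": mobile}
```

Keeping both code paths behind one flag means the fallback to sequential execution is a configuration change rather than a deploy.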

02

Gemini Embedding 429 ResourceExhausted

Problem. RAG indexing chunked the audit, sent everything to gemini-embedding-001 in batches of 100. With 3072-dimension vectors and the default TPM, Google rate-limited us after the third concurrent audit. SSE streams started returning 429s mid-flight, users saw indexing stuck halfway.

Solution. Three layers, all in rag_service.py (commits 476d667, 69c6a01, e6d4fde, ADR-039 / ADR-040). (1) Drop batch size to 10–20 chunks — small enough to stay under the per-call quota even with parallel audits. (2) Exponential backoff on 429 with full jitter, capped at 30 s. (3) A global asyncio.Semaphore in the embedding client so concurrent audits queue against one shared rate limit instead of fighting each other. Cost: a few extra seconds per audit. Benefit: never blocked again.
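Layers (2) and (3) can be illustrated roughly as follows. This is a sketch under assumptions: `RateLimitedError` stands in for whatever exception the Gemini SDK raises on a 429, and `embed_with_retry`, its parameters and the 1-second base delay are hypothetical. Only the 30-second cap and the full-jitter strategy come from the text.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, Sequence


class RateLimitedError(Exception):
    """Hypothetical stand-in for the SDK's 429 / ResourceExhausted error."""


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0) -> float:
    """Full jitter: pick uniformly between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))


async def embed_with_retry(
    embed: Callable[[Sequence[str]], Awaitable[List[List[float]]]],
    chunks: Sequence[str],
    semaphore: asyncio.Semaphore,  # shared by every concurrent audit
    *,
    max_attempts: int = 5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> List[List[float]]:
    attempt = 0
    while True:
        try:
            # All audits queue on the same semaphore, i.e. one shared rate limit.
            async with semaphore:
                return await embed(chunks)
        except RateLimitedError:
            attempt += 1
            if attempt >= max_attempts:
                raise
        # Sleep outside the semaphore so a waiting audit does not hold a slot.
        await sleep(backoff_delay(attempt - 1))
```

Passing the semaphore in, rather than creating it at import time, keeps it tied to the event loop that actually runs the embedding calls.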

03

RAG race condition + vector-dimension drift in Qdrant

Problem. Two bugs collapsed into one symptom. SSE generator was sharing the database session with the request handler; when the response closed early, the background embedding write hit a closed connection and silently dropped. Separately, switching embedding model bumped vector dimension from 768 to 3072 and old Qdrant collections rejected new vectors with no clear error.

Solution. Dedicated AsyncSessionLocal() inside the SSE generator so its writes outlive the streaming response (commits c7e08da, ae264c1). Plus rag_service.py::self_heal_rag: validate vector dimension against the live collection on every write, auto-recreate the collection if it drifts, and expose POST /api/audits/{id}/reindex-rag as the explicit escape hatch for the user. rag_indexed_at column tracks freshness so the UI can surface stale indexes instead of showing wrong answers.
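The dimension check can be illustrated without a running Qdrant instance. The sketch below uses an in-memory stand-in, so `InMemoryCollection` and `write_with_self_heal` are hypothetical names and none of this is the Qdrant client API; it only shows the validate-then-recreate logic described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryCollection:
    """Stand-in for a per-audit vector collection (not the Qdrant client)."""
    dim: int
    points: List[List[float]] = field(default_factory=list)


def write_with_self_heal(
    collections: Dict[str, InMemoryCollection],
    name: str,
    vectors: List[List[float]],
) -> bool:
    """Write vectors; recreate the collection first if its dimension drifted.

    Returns True when the collection had to be recreated.
    """
    if not vectors:
        return False
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors in one write must share a dimension")

    existing = collections.get(name)
    recreated = existing is not None and existing.dim != dim
    if existing is None or recreated:
        # Recreating drops the old points; in the real pipeline the audit
        # would then be reindexed from the source data.
        collections[name] = InMemoryCollection(dim=dim)
    collections[name].points.extend(vectors)
    return recreated
```

Returning whether a recreate happened gives the caller a hook for logging or for updating a freshness timestamp such as rag_indexed_at.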

04

Jinja2 PDF templates — 25+ ways to fail silently

Problem. WeasyPrint renders 25+ Jinja2 partials, one per audit section. A refactor moved macros.html and broke template resolution across 29 files at once. Senuto returns the top_keywords field as a dict in some plans and a list in others, so iteration crashed half the time. KPI numbers got escaped as plain text and rendered as <b>72%</b> in the PDF. None of these threw — they just produced broken-looking pages.

Solution. Three small utilities in backend/app/services/pdf/generator.py (commits cd8059a, bb735ae, 60aa579). _safe_extract() wraps every nested data access with explicit defaults and structured logging. as_list() normalises Senuto dict-or-list responses into one shape. The | safe filter is applied at exactly the boundaries where HTML is intentional. Template paths got grepped and fixed in one sweep across all 29 partials. Boring, mechanical, finally correct.

05

Workspace → Project → Audit hierarchy + RLS enforcement

Problem. Original schema had audits scoped to users. As soon as a real agency signed up, the model broke: they wanted shared client projects, role-based access, and the ability to reassign an orphaned audit (project_id IS NULL) to a project after the fact. ACL checks were scattered across endpoints; each one re-derived membership differently. A subtle privilege escalation was inevitable.

Solution. Three-level hierarchy in Postgres — workspaces own projects own audits — with Row Level Security policies enforcing membership at the database, not the application. Frontend gets one useWorkspaceProjects() hook driving the sidebar tree; backend gets one require_project_membership() dependency reused across backend/app/routers/projects.py and audits.py. PATCH /api/audits/{id}/assign-project handles the orphaned-audit migration (commits 7903be1, 2c44955). The dashboard’s rogue “create audit” flow that bypassed projects got deleted entirely.
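The idea of one shared membership check can be shown in plain Python. This is only an application-side illustration with hypothetical names (`Project`, `NotAMember`, `check_project_membership`) and an in-memory membership map; in the system described above the authoritative enforcement is Row Level Security in Postgres.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Project:
    id: int
    workspace_id: int


class NotAMember(PermissionError):
    """Raised when a user has no role in the workspace that owns a project."""


def check_project_membership(
    user_id: int,
    project: Project,
    memberships: Dict[int, Dict[int, str]],  # workspace_id -> {user_id: role}
) -> str:
    """Return the user's role in the project's workspace, or raise NotAMember."""
    role = memberships.get(project.workspace_id, {}).get(user_id)
    if role is None:
        raise NotAMember(f"user {user_id} cannot access project {project.id}")
    return role
```

Because access is derived from the project's workspace in exactly one place, every endpoint that calls this helper resolves membership the same way.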

06

Next.js landing OOM in a 512 MB container (BUG-040)

Problem. The marketing landing ran in its own container with a 512 MB memory limit because it has to coexist with the backend, the worker, Postgres, Redis and Qdrant on one VPS. Next.js production build started hitting the ceiling during static rendering of large MDX pages and the container OOM-killed itself on first deploy. The landing went down before it had its first visitor.

Solution. Three changes that together fit Next.js inside 512 MB. NODE_OPTIONS=--max-old-space-size=512 as a hard ceiling in the container env so V8 garbage-collects more aggressively. Multi-stage Dockerfile that only ships the standalone output (not the full node_modules). Separated landing and dashboard into independent services in docker-compose.yml so the landing’s memory pressure doesn’t reach the app. Each service gets its own resource budget and its own restart policy. Boring infrastructure that just stays up.

06 — The workflow

Three input fields. One async pipeline. One PDF.

URL plus up to three competitors plus optional country code goes into the dashboard. The backend orchestrator fans out four jobs in parallel — Screaming Frog crawls the structure, Lighthouse measures desktop and mobile Web Vitals, Senuto pulls Polish-market visibility, backlinks and AI Overviews. As each finishes, results stream into Postgres and trigger the next stage. Gemini analyses every section against a fresh Qdrant RAG index built from the audit itself; the execution-plan pass picks Quick Wins first. WeasyPrint renders 25+ Jinja2 partials with embedded matplotlib charts into one A4 PDF, white-labelled per workspace. Average wall-clock from button to download: 2–3 minutes.

SiteSpector 3-step workflow on sitespector.app: ‘Enter the URL → Wait 1–3 minutes → Review the Execution Plan’ flow
07 — Feature highlights

Eight production features on the landing, two more hidden in the dashboard.

The public surface advertises crawl, Web Vitals, visibility, AI Overviews monitoring, Action Cards, six personas, Execution Plan and white-label PDF. The interesting infrastructure sits behind the login.

SiteSpector feature grid on sitespector.app, “Everything you need for optimisation”: crawling SEO, CWV, visibility, AI Overviews, Action Cards, persona modes, Scoped Reports, Execution Plan with code
+ AI Chat over the audit (SSE-streamed)

Per-audit chat panel grounded in the Qdrant RAG index of that specific report. Gemini answers stream over SSE, every response cites the section it came from, file attachments get parsed into the same vector store, and the embedding pipeline retries on quota errors transparently.

+ Three PDF report types from one audit

Executive (board-ready, 6 pages), Standard (the default, ~25 pages), Full (everything including appendix and raw crawl tables). Same data, three audience tiers, three Jinja2 template sets, one rendering pipeline. White-labelled per workspace via Stripe-gated settings.

08 — Stack

Picked for orchestrated complexity on one VPS — not for a cloud bill.

Every layer below earns its place by being either the cheapest correct option or the only one that survives running next to a headless Chromium + a Postgres + a Qdrant + a Redis on a single host.

Layer | Tech | Why
Frontend | Next.js 16 (App Router) + React 19 | Server Components for the marketing landing, client components for the dashboard. One codebase, three route groups.
Frontend state | TanStack Query + Supabase JS client | Optimistic mutations against the audit pipeline; realtime audit-status subscriptions over Supabase channels.
UI | Tailwind 4 + container queries (ADR-037) | Sidebar panels resize against their container, not the viewport, so the dashboard works in every split-screen the user invents.
Backend | FastAPI + Pydantic + async SQLAlchemy | Native async/await for the parallel audit fan-out; Pydantic v2 schemas validate the AI boundary and the Stripe webhook payloads.
Queue | Redis + custom worker | One process per worker, one queue per stage, easy to inspect with redis-cli. No Celery, no broker drama.
Database | PostgreSQL 16 + pgvector + RLS | Multi-tenant enforced at the database, not the app. Vector column lives next to the data it indexes.
Vector DB | Qdrant (per-audit collection) | One collection per audit means dimension drift never poisons another audit; auto-recreate on validation failure.
AI | Google Gemini + gemini-embedding-001 | Analysis pass + embeddings from the same vendor: one quota to monitor, one SDK to mock in tests.
Crawler | Screaming Frog CLI (in Docker image) | The professional standard. Custom crawl.sh wrapper handles 403 detection, custom user-agents and exports JSON the API expects.
Performance | Lighthouse CLI (desktop + mobile) | Two runs per audit, parallel since the CPX42 upgrade. Same binary the Chrome team ships, so there is nothing custom to maintain.
Market data | Senuto API | Polish-market visibility, backlinks, keyword positions, AI Overviews. The local data Ahrefs doesn’t have.
Auth | Supabase Auth + RLS policies | Email/password + Google + GitHub OAuth out of the box. Row Level Security enforces workspace membership on every query.
Payments | Stripe metering (4 credit plans) | 1 audit = 30 credits. Plans 50–2000 cr/mo. Idempotent webhook handler keeps double-deliveries from double-charging.
PDF | WeasyPrint + Jinja2 (25+ templates) + matplotlib | HTML/CSS render to A4 with running headers/footers. SVG charts embed directly so the PDF looks the same on every viewer.
Deploy | Docker multi-stage + Caddy on Hetzner CPX42 | One VPS, one docker compose up -d, Caddy handles TLS and path-based routing. The whole product on one host.
09 — Results

Live, multi-tenant, on one VPS.

SiteSpector is in production on sitespector.app with active Stripe subscriptions across four credit plans, real agencies running real client audits, and a single Hetzner CPX42 VPS hosting every part of the system — landing, dashboard, FastAPI, worker, Postgres, Redis, Qdrant, all behind one Caddy reverse proxy.

SiteSpector metrics on sitespector.app, “Increase your visibility on Google in a few minutes”: 4 tools in one, $9.99 cheapest plan, 30 min full audit, 200 Execution Plan tasks
2–3 min
Full audit wall-clock — Screaming Frog + Lighthouse desk+mob + Senuto + Gemini + PDF.
25+ sections
Per PDF, each with its own Jinja2 template and embedded matplotlib charts.
6 / 3
Built-in personas / report types — the same audit, six AI tones, three audience tiers.
30 cr
Per audit — Stripe metering across 4 plans, 50–2000 cr/mo, workspace-wide.
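The credit figures translate into audits per month with simple integer division. A small worked sketch follows; `audits_covered` is a hypothetical helper, and only the two plan endpoints named above (50 and 2000 credits) are used as examples.

```python
from typing import Tuple

CREDITS_PER_AUDIT = 30  # from the text: 1 audit = 30 credits


def audits_covered(
    monthly_credits: int, credits_per_audit: int = CREDITS_PER_AUDIT
) -> Tuple[int, int]:
    """Return (full audits a balance covers, credits left over)."""
    if monthly_credits < 0 or credits_per_audit <= 0:
        raise ValueError("credits must be non-negative and the price positive")
    return divmod(monthly_credits, credits_per_audit)


smallest = audits_covered(50)    # (1, 20): one audit, 20 credits left
largest = audits_covered(2000)   # (66, 20): 66 audits, 20 credits left
```

So the smallest plan covers a single audit per month and the largest covers 66, with the balance shared across the whole workspace.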
10 — Engineering decisions

Five ADRs — the load-bearing “why we did it this way”.

Architecture Decision Records are how the team agrees once and stops re-litigating. The five below cover the parts of SiteSpector that look obvious from the outside and were anything but.

ADR-037

Container queries instead of media queries

Context. Sidebar panels (chat, audit overview, project tree) live inside resizable containers. Media queries against the viewport produced the wrong layout the moment the user dragged the splitter.

Decision. Tailwind 4 container queries everywhere in the dashboard layout. Components respond to their own container width, not the page.

Consequence. Sidebar can be 280 px or 720 px and the audit overview still looks right. The dashboard now works in every split-screen the user can invent.

ADR-038

SSE for AI chat and RAG indexing (not WebSockets)

Context. Both the chat panel and the long-running RAG indexer need to stream progress to the browser. WebSockets are the obvious answer; FastAPI’s native story for them is awkward.

Decision. Server-Sent Events for everything one-directional. FastAPI’s EventSourceResponse handles connection lifecycle cleanly; the browser’s native EventSource reconnects on drop without us writing a single line.

Consequence. One less moving part. SSE works through Caddy out of the box; reconnects are free; the chat panel stays open across deploys.
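Part of why SSE is a small moving part is that the wire format is plain text. The helper below is an illustrative sketch of that format and is not code from the project; `format_sse` is a hypothetical name.

```python
from typing import Optional


def format_sse(
    data: str, event: Optional[str] = None, event_id: Optional[str] = None
) -> str:
    """Serialise one Server-Sent Events message.

    Each field sits on its own line, multi-line payloads become several
    "data:" lines, and a blank line terminates the message.
    """
    lines = []
    if event_id is not None:
        # The browser sends the last seen id back as Last-Event-ID on reconnect.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"
```

A streaming endpoint would yield strings like these with the `text/event-stream` content type, and the browser's EventSource parses them without any client library.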

ADR-039

Batch embedding 10–20 chunks per call

Context. Gemini’s batchEmbedContents theoretically supports 100 chunks per call. In practice with 3072-dim vectors and concurrent audits, anything over ~30 chunks burned the TPM quota and the whole pipeline 429-ed.

Decision. Batch 10–20 chunks per call, tunable via env. Exponential backoff with jitter on 429. Global semaphore across audits so concurrent indexing queues against one rate limit.

Consequence. A few extra seconds per audit. Never quota-blocked again. The sweet spot was empirical, not from the docs.
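The env-tunable batching can be sketched as below. `EMBED_BATCH_SIZE` is a hypothetical variable name and the default of 16 is just one value inside the 10–20 range from the decision; the upper clamp of 100 reflects the per-call limit mentioned in the context.

```python
import os
from typing import Iterator, List, Sequence


def embed_batch_size(default: int = 16, low: int = 1, high: int = 100) -> int:
    """Read the batch size from the environment, clamped to [low, high]."""
    raw = os.environ.get("EMBED_BATCH_SIZE", "")
    try:
        size = int(raw)
    except ValueError:
        # Unset or malformed values fall back to the default.
        size = default
    return max(low, min(high, size))


def batched(chunks: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` chunks."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(chunks), size):
        yield list(chunks[start:start + size])
```

With 45 chunks and a batch size of 20, this yields batches of 20, 20 and 5, each of which would become one embedding call.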

ADR-040

Vector dimension validation + auto-recreate

Context. Switching embedding models mid-flight (768 → 3072 dims) broke existing Qdrant collections: they rejected the new vectors with no clear error.

Decision. Validate the live collection’s vector dimension against the embedder’s output on every write. If they don’t match, auto-recreate the collection and reindex from Postgres. Expose POST /api/audits/{id}/reindex-rag as the explicit user-facing escape hatch.

Consequence. Model swaps stopped being scary. The pipeline self-heals; users can force a reindex from the UI when they need to.

ADR-041

Workspace > Project > Audit hierarchy with RLS

Context. First schema had audits scoped to users. Agencies wanted shared client projects, role-based access, and reassignment of orphaned audits. ACL checks were duplicated across endpoints; each one re-derived membership differently.

Decision. Three-level hierarchy in Postgres with Row Level Security policies enforcing membership at the database. One require_project_membership() dependency in FastAPI; one useWorkspaceProjects() hook in React.

Consequence. A subtle privilege-escalation surface collapsed into a single policy file. Every new endpoint inherits the rule by default.

11 — Production scaling story

Four phases from a 2-core MVP to a production-grade single VPS.

SiteSpector never lived in a cloud cost report. It went from one cramped VPS to one less-cramped VPS, with all the workload-shaping in between — mitigations first, hardware last.

01

Phase 1 — Hetzner CPX21 (2 cores / 4 GB)

MVP shipped on the cheapest box that could host Postgres + FastAPI + a worker + Caddy + landing all at once. Lighthouse desktop and mobile ran in parallel because that was the obvious thing; RAG indexing went out in 100-chunk batches because Gemini’s docs said it could. Single-tenant, no Stripe yet, demo-friendly.

02

Phase 2 — Production incidents

Real workload hit. Gemini started returning 429s on the third concurrent audit (BUG-036). The landing OOM-killed itself on first deploy (BUG-040). Lighthouse parallel attempts pushed the host to load average 31. PDF rendering broke whenever Senuto returned a dict instead of a list (BUG-044). Every incident was logged in docs/bugs.md; every fix was referenced back from the commit.

03

Phase 3 — Software mitigations (no new hardware)

Six weeks of paying the rent on the same VPS. Sequential Lighthouse with explicit timeouts. Batch embedding down to 10–20 chunks with backoff and a global semaphore. _safe_extract() + as_list() normalised every Senuto and Gemini boundary in the PDF pipeline. NODE_OPTIONS=--max-old-space-size=512 caged the landing inside its container. Vector dimension validation + auto-recreate made model swaps non-events.

04

Phase 4 — Hetzner CPX42 (8 cores / 16 GB)

Hardware caught up to the workload. Parallel Lighthouse restored behind a feature flag so we could fall back instantly. RAG batch tunable up when the audit warrants it. Real headroom for chat throughput, multi-tenant concurrency, and the next two crawlers we already know we’ll add. The mitigations from Phase 3 stay in place — they’re cheap insurance, and they’ll be load-bearing again the next time the workload outpaces the box.
