Architecture
How the pieces fit together — system overview, identification cascade, offline-first sync, and the auth + token surface.
Rastrum is a static Astro site with a service worker, a Dexie outbox, and a Supabase Postgres + Edge Functions backend. Everything dynamic lives in the browser (PWA client) or in Deno Edge Functions — there is no Node server we run. The four diagrams below show how data flows between those layers.
01 System overview
Five layers. The PWA client is the centre: it reads and writes the local Dexie DB and renders pixels while the service worker caches the shell. The API and DB live in Supabase; binaries live in R2; external services are third-party.
Principal flows: the client loads models/photos from R2, calls Edge Functions to identify and export, and never talks to external services directly — the Edge Function brokers calls so operator API keys stay backend-side.
02 Identification cascade
Plugins are sorted by license cost (Free → Free-NC → Free-quota → BYO → Paid), then by confidence ceiling descending. The engine calls each in order until one returns confidence ≥ 0.7. The best-seen result is always written, even when nothing crosses the threshold.
The engine (src/lib/identifiers/cascade.ts) uses the same signature on the client and inside the identify Edge Function. That symmetry is what keeps the offline cascade coherent with the online one.
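The ordering and early-exit logic above can be sketched in TypeScript. The plugin interface, tier names, and `runCascade` signature here are illustrative assumptions, not the actual shape of src/lib/identifiers/cascade.ts:

```typescript
// Hypothetical plugin shape — the real interface may differ.
type Tier = "free" | "free-nc" | "free-quota" | "byo" | "paid";

interface IdResult { taxon: string; confidence: number }

interface IdentifierPlugin {
  name: string;
  tier: Tier;
  ceiling: number; // highest confidence this plugin can return
  identify(input: Uint8Array): Promise<IdResult | null>;
}

const TIER_ORDER: Tier[] = ["free", "free-nc", "free-quota", "byo", "paid"];
const THRESHOLD = 0.7;

async function runCascade(
  plugins: IdentifierPlugin[],
  input: Uint8Array,
): Promise<IdResult | null> {
  // Cheapest license first, then highest confidence ceiling.
  const ordered = [...plugins].sort(
    (a, b) =>
      TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier) ||
      b.ceiling - a.ceiling,
  );
  let best: IdResult | null = null;
  for (const plugin of ordered) {
    const result = await plugin.identify(input);
    // Track the best result seen so far — it is written even if
    // nothing ever crosses the threshold.
    if (result && (!best || result.confidence > best.confidence)) best = result;
    if (best && best.confidence >= THRESHOLD) break; // stop: good enough
  }
  return best;
}
```

Because the same function signature runs on the client and in the identify Edge Function, a BYO or paid plugin only ever fires when every cheaper tier has failed to clear the threshold.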
03 Offline-first sync flow
Every field capture writes to the Dexie outbox first. When the network returns, syncOutbox() uploads the blobs to R2 via a presigned URL, inserts the observation row, and fires the identification cascade in the Edge Function.
The outbox is the only path — even when you are online. That avoids a fork between the happy path and the offline path.
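A minimal in-memory sketch of that drain loop follows. The real client persists the queue in a Dexie (IndexedDB) table and the upload step spans a presigned R2 PUT, a row insert, and the Edge Function call; the entry shape and `syncOutbox` signature here are assumptions for illustration:

```typescript
// Illustrative outbox entry — the real Dexie schema may differ.
interface OutboxEntry {
  id: number;
  blob: Uint8Array;
  observation: Record<string, unknown>;
}

// In the real flow this wraps: presigned R2 upload, observation
// row insert, and the identify Edge Function call.
type Uploader = (entry: OutboxEntry) => Promise<void>;

async function syncOutbox(
  queue: OutboxEntry[],
  upload: Uploader,
): Promise<{ synced: number; remaining: OutboxEntry[] }> {
  const remaining: OutboxEntry[] = [];
  let synced = 0;
  for (const entry of queue) {
    try {
      await upload(entry);
      synced++; // success: entry leaves the outbox
    } catch {
      remaining.push(entry); // failure: keep for the next attempt
    }
  }
  return { synced, remaining };
}
```

Routing every write through this one loop, online or offline, is what keeps the happy path and the offline path from drifting apart.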
04 Auth + token surface
Five sign-in paths converge into Supabase auth.users; a trigger populates public.users; from there users mint personal rst_* tokens. Tokens are SHA-256 hashed — the REST API and MCP server verify the hash on every call.
rst_* tokens are shown in plaintext exactly once at creation; after that, only the SHA-256 hash lives in the DB. The REST API and MCP server share the same verifier, keeping scopes consistent across both surfaces.
05 Stack decisions
| Layer | Choice | Rationale |
|---|---|---|
| Frontend | Astro 5 (output: static) | Static site + PWA shell, no Node server, deploys to GitHub Pages. |
| Local DB | Dexie (IndexedDB) | Offline-first outbox; observations save locally and sync when online. |
| Backend | Supabase (Postgres + PostGIS + RLS) | Deno Edge Functions, row-level RLS, monthly partitioning, pg_cron schedules. |
| Object storage | Cloudflare R2 | Zero egress; observations, ONNX models and pmtiles served from media.rastrum.org. |
| In-browser AI | WebLLM + onnxruntime-web | Phi-3.5-vision (~2.4 GB), Llama-3.2-1B, EfficientNet-Lite0, BirdNET-Lite, MegaDetector. |
| Maps | MapLibre + pmtiles | OpenFreeMap online, offline Mexico pmtiles archive from R2 (~48 MB). |
06 External services
| Service | Role |
|---|---|
| PlantNet API | Plant ID (free 500/day quota) |
| Anthropic Claude | Haiku 4.5 vision (BYO key) |
| OAuth providers | Google + GitHub for Supabase auth |
| OpenMeteo | Weather backfill (no key) |
| OpenFreeMap | Base map style tiles |
07 Key trade-offs
- Static site, no Node server. Hosting is free on GitHub Pages; all dynamic logic runs in the browser or in Deno Edge Functions.
- Cascade by license cost, not by accuracy. Free plugins go first; BYO keys unlock cloud APIs without changing UX.
- R2 over Supabase Storage for media. Zero egress on R2.
- One outbox, one cascade engine. Same signature runs offline on the client and online in the Edge Function.
- Sensitive-species obscuration is denormalised: RLS reads from location_obscured (kept fresh by trigger).
- No mobile-native build yet. The PWA installs to home-screen on iOS Safari and Android Chrome; Capacitor is a v1.2 plan.