Rastrum

Architecture

How the pieces fit together — system overview, identification cascade, offline-first sync, and the auth + token surface.

Rastrum is a static Astro site with a service worker, a Dexie outbox, and a Supabase Postgres + Edge Functions backend. Everything dynamic lives in the browser (PWA client) or in Deno Edge Functions — there is no Node server we run. The four diagrams below show how data flows between those layers.

01 System overview

Five layers. The PWA client is the centre: it reads and writes the local Dexie DB and renders pixels while the service worker caches the shell. The API and DB live in Supabase; binaries live in R2; external services are third-party.

  • Client (PWA, browser): Astro 5 static with EN/ES parity; Service Worker shell cache; Dexie outbox in IndexedDB; on-device AI via WebLLM · ONNX.
  • CDN · R2 (media.rastrum.org): photos / audio / video under observations/<id>/...; ONNX models (birdnet · efficientnet); MX pmtiles, ~48 MB offline map.
  • API (Edge Functions · Deno): identify, api, mcp, export-dwca, share-card, enrich-environment, recompute-streaks, award-badges, get-upload-url, tokens.
  • Database (Postgres · PostGIS · RLS): observations, media_files, taxa, identifications, activity_events, users, badges, follows, watchlists, user_api_tokens (rst_*, SHA-256); pg_cron runs streaks + badges.
  • External (third-party): PlantNet (free 500/day quota), Anthropic Claude (BYO key), OAuth (Google · GitHub), OpenMeteo (no key), OpenFreeMap (base style).

Principal flows: the client loads models/photos from R2, calls Edge Functions to identify and export, and never talks to external services directly — the Edge Function brokers calls so operator API keys stay backend-side.
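Because the Edge Function is the only caller of external APIs, the operator key is read from server-side env and never reaches the browser. A minimal sketch of the URL-building half of such a broker, assuming PlantNet's public v2 identify endpoint (the helper name `buildPlantNetUrl` is illustrative, not Rastrum's actual code):

```typescript
// Hypothetical helper for a broker Edge Function: the browser sends only
// its image; the function attaches the server-held api-key and forwards.
// The endpoint shape follows PlantNet's public v2 API; treat it as an assumption.
export function buildPlantNetUrl(project: string, apiKey: string): string {
  const url = new URL(`https://my-api.plantnet.org/v2/identify/${project}`);
  url.searchParams.set("api-key", apiKey); // key stays server-side
  return url.toString();
}

// Inside the Edge Function body (Deno), roughly:
//   const upstream = buildPlantNetUrl("all", Deno.env.get("PLANTNET_KEY")!);
//   const res = await fetch(upstream, { method: "POST", body: form });
```

The same pattern applies to the Claude plugin, except the key there is the user's own (BYO) rather than the operator's.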

02 Identification cascade

Plugins are sorted by license cost (Free → Free-NC → Free-quota → BYO → Paid), then by confidence ceiling descending. The engine calls each plugin in order until one returns confidence ≥ 0.7. The best result seen so far is always written, even when nothing crosses the threshold.

  • Input: photo / audio / video Blob → registry.findFor (media · taxa).
  • Cascade engine: sort by license cost ↑ then confidence ↓; accept at ≥ 0.7.
  • Plugins: PlantNet (free-quota · cloud), BirdNET-Lite (free-nc · on-device), EfficientNet-Lite0 (free · on-device), Claude Haiku 4.5 (byo-key · cloud), Phi-3.5-vision (free · on-device · fallback), MegaDetector v5a (free-nc · camera trap).
  • Output: IDResult; the first result over the threshold wins. License tiers span free, free-nc, free-quota and byo-key.

The engine (src/lib/identifiers/cascade.ts) uses the same signature on the client and inside the identify Edge Function. That symmetry is what keeps the offline cascade coherent with the online one.
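The loop itself is small. A minimal sketch under stated assumptions: the `IdentifierPlugin` shape and plugin names are illustrative, while the license ordering and the 0.7 threshold come from the description above:

```typescript
// Sketch of the cascade loop; not Rastrum's actual cascade.ts.
type LicenseCost = "free" | "free-nc" | "free-quota" | "byo-key" | "paid";

interface IDResult { taxon: string; confidence: number; }

interface IdentifierPlugin {
  name: string;
  license: LicenseCost;
  ceiling: number; // confidence ceiling, used as the tiebreaker
  identify(media: Uint8Array): Promise<IDResult | null>;
}

const LICENSE_ORDER: LicenseCost[] = ["free", "free-nc", "free-quota", "byo-key", "paid"];
const ACCEPT = 0.7;

export async function runCascade(
  plugins: IdentifierPlugin[],
  media: Uint8Array,
): Promise<IDResult | null> {
  // Sort: license cost ascending, then confidence ceiling descending.
  const ordered = [...plugins].sort((a, b) =>
    LICENSE_ORDER.indexOf(a.license) - LICENSE_ORDER.indexOf(b.license) ||
    b.ceiling - a.ceiling,
  );
  let best: IDResult | null = null;
  for (const plugin of ordered) {
    const result = await plugin.identify(media);
    if (!result) continue;
    if (!best || result.confidence > best.confidence) best = result;
    if (result.confidence >= ACCEPT) break; // first result over threshold wins
  }
  return best; // best seen is returned even when nothing crossed 0.7
}
```

Because the function takes plain plugins and bytes, the same body can run in the browser against on-device models and in the identify Edge Function against cloud APIs.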

03 Offline-first sync flow

Every field capture writes to the Dexie outbox first. When the network returns, syncOutbox() uploads the blobs to R2 via a presigned URL, inserts the observation row, and fires the identification cascade in the Edge Function.

  • Capture (photo · audio · video, in the browser): getCurrentPosition · exifr.
  • Dexie outbox (observations, mediaBlobs, idQueue): survives offline.
  • Trigger: the online event or visibilitychange (online · visible) fires syncOutbox().
  • R2 upload: presigned PUT obtained from get-upload-url.
  • identify Edge Function: server-side cascade.
  • Postgres + RLS: observation INSERT, then media_files (sync_primary_id), identifications and activity_events.

The outbox is the only path — even when you are online. That avoids a fork between the happy path and the offline path.
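The drain loop can be sketched as a sequence of four awaited steps per pending observation. This is a minimal sketch with injected dependencies; the helper names (getUploadUrl, uploadBlob, insertObservation, requestIdentification) are illustrative stand-ins for the real Edge Function calls, not Rastrum's actual API:

```typescript
// Sketch of syncOutbox(); entries stay in Dexie until all four steps succeed.
interface PendingObservation {
  id: string;
  blob: Uint8Array;
  payload: Record<string, unknown>;
}

interface SyncDeps {
  getUploadUrl(id: string): Promise<string>;                // get-upload-url Edge Function
  uploadBlob(url: string, blob: Uint8Array): Promise<void>; // presigned PUT to R2
  insertObservation(payload: Record<string, unknown>): Promise<void>; // Postgres row
  requestIdentification(id: string): Promise<void>;         // identify Edge Function
}

export async function syncOutbox(
  outbox: PendingObservation[],
  deps: SyncDeps,
): Promise<string[]> {
  const synced: string[] = [];
  for (const obs of outbox) {
    try {
      const url = await deps.getUploadUrl(obs.id); // 1. presigned URL
      await deps.uploadBlob(url, obs.blob);        // 2. blob to R2
      await deps.insertObservation(obs.payload);   // 3. row into Postgres
      await deps.requestIdentification(obs.id);    // 4. server-side cascade
      synced.push(obs.id);                         // safe to clear from Dexie
    } catch {
      break; // network dropped again: leave the rest in the outbox
    }
  }
  return synced;
}
```

Injecting the dependencies keeps the loop testable and makes the "outbox is the only path" rule easy to enforce: even an online capture just enqueues and immediately drains.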

04 Auth + token surface

Five sign-in paths converge into Supabase auth.users; a trigger populates public.users; from there users mint personal rst_* tokens. Tokens are SHA-256 hashed — the REST API and MCP server verify the hash on every call.

  • Sign-in paths: magic link, OAuth (Google, GitHub), email OTP, passkey (WebAuthn), guest mode (no auth; local observations stay in Dexie until sign-in).
  • auth.users (Supabase JWT): a trigger populates public.users (profile · is_expert, RLS owner-self).
  • user_api_tokens: rst_<base32>, SHA-256 hashed, scopes[].
  • Consumers: REST API at /functions/v1/api/* (shell scripts, jobs) and the MCP server at /functions/v1/mcp (AI agents · JSON-RPC); both verify hash and scopes.

rst_* tokens are shown in plaintext exactly once at creation; after that, only the SHA-256 hash lives in the DB. The REST API and MCP server share the same verifier, keeping scopes consistent across both surfaces.

05 Stack decisions

  • Frontend: Astro 5 (output: static). Static site + PWA shell, no Node server, deploys to GitHub Pages.
  • Local DB: Dexie (IndexedDB). Offline-first outbox; observations save locally and sync when online.
  • Backend: Supabase (Postgres + PostGIS + RLS). Deno Edge Functions, row-level RLS, monthly partitioning, pg_cron schedules.
  • Object storage: Cloudflare R2. Zero egress; observations, ONNX models and pmtiles served from media.rastrum.org.
  • In-browser AI: WebLLM + onnxruntime-web. Phi-3.5-vision (~2.4 GB), Llama-3.2-1B, EfficientNet-Lite0, BirdNET-Lite, MegaDetector.
  • Maps: MapLibre + pmtiles. OpenFreeMap online, offline Mexico pmtiles archive from R2 (~48 MB).

06 External services

  • PlantNet API: plant ID (free 500/day quota).
  • Anthropic Claude: Haiku 4.5 vision (BYO key).
  • OAuth providers: Google + GitHub for Supabase auth.
  • OpenMeteo: weather backfill (no key).
  • OpenFreeMap: base map style tiles.

07 Key trade-offs

  • Static site, no Node server. Hosting is free on GitHub Pages; all dynamic logic runs in the browser or in Deno Edge Functions.
  • Cascade by license cost, not by accuracy. Free plugins go first; BYO keys unlock cloud APIs without changing UX.
  • R2 over Supabase Storage for media. Zero egress on R2.
  • One outbox, one cascade engine. Same signature runs offline on the client and online in the Edge Function.
  • Sensitive-species obscuration is denormalised: RLS reads from location_obscured (kept fresh by trigger).
  • No mobile-native build yet. The PWA installs to home-screen on iOS Safari and Android Chrome; Capacitor is a v1.2 plan.
