BirdNET-NG Architecture
A distributed bird sound identification system built on the BirdNET deep learning model.
Overview
BirdNET-NG decouples audio capture, processing, and visualization into independently deployable components that communicate over a message broker. This enables community-contributed satellites (microphone nodes), shared inference infrastructure, and multiple UI clients, all with multi-tenant isolation from day one.
graph TD
subgraph Satellites
S1["Satellite 1<br/>(Pi + Mic)"]
S2["Satellite 2<br/>(Pi + Mic)"]
PA["Phone App<br/>(Android)"]
end
S1 -->|WSS :443| T
S2 -->|WSS :443| T
PA -->|WSS :443| T
T["Traefik<br/>(reverse proxy)"]
T -->|HTTPS| Web["Web<br/>(nginx)"]
T -->|WSS| Mosq["Mosquitto<br/>(dynsec)"]
T -->|HTTPS| Docs["Docs"]
subgraph Hub Cluster
Web --> Hub["Hub API<br/>(Fastify)"]
Mosq --> Hub
Hub --> PG["PostgreSQL"]
Hub --> Redis["Redis +<br/>BullMQ"]
Redis --> W1["Worker 1<br/>(Python)"]
Redis --> WN["Worker N<br/>(Python)"]
end
T -->|HTTPS| MinIO["MinIO<br/>(S3 storage)"]
Hub --> MinIO
W1 --> MinIO
WN --> MinIO
style S1 fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
style S2 fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
style PA fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
style T fill:#4c1d95,stroke:#8b5cf6,color:#e2e8f0
style Web fill:#065f46,stroke:#22c55e,color:#e2e8f0
style Hub fill:#065f46,stroke:#22c55e,color:#e2e8f0
style Mosq fill:#7c2d12,stroke:#f97316,color:#e2e8f0
style Docs fill:#065f46,stroke:#22c55e,color:#e2e8f0
style PG fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
style Redis fill:#7c2d12,stroke:#f97316,color:#e2e8f0
style W1 fill:#713f12,stroke:#eab308,color:#e2e8f0
style WN fill:#713f12,stroke:#eab308,color:#e2e8f0
style MinIO fill:#4c1d95,stroke:#8b5cf6,color:#e2e8f0
Components
Satellite (packages/satellite)
Lightweight Node.js agent running on a Raspberry Pi with an attached microphone.
Audio Pipeline:
- Captures 3-second WAV chunks at 48kHz mono (BirdNET's native window)
- On-device pre-filtering: RMS silence gate + YAMNet VAD (per-category bird-likelihood scoring)
- Filter settings pushed from hub tenant settings via MQTT config channel
- Local SQLite outbox queue (sql.js), so capture never blocks on the network
- MQTT upload with acknowledgment and exponential backoff
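Two steps of this pipeline are easy to picture in code: the RMS silence gate and the retry delay for uploads. The following is a minimal sketch, not the satellite's actual implementation; the function names, the 0.01 threshold, and the 1 s base / 60 s cap are all hypothetical values chosen for illustration.

```typescript
// Root-mean-square level of a 16-bit PCM chunk, normalised to 0..1.
function rmsLevel(samples: Int16Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const x = samples[i] / 32768;
    sumSquares += x * x;
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Silence gate: chunks below the threshold are dropped before the outbox.
// minRms = 0.01 is a hypothetical default, not the project's setting.
function passesSilenceGate(samples: Int16Array, minRms: number = 0.01): boolean {
  return rmsLevel(samples) >= minRms;
}

// Retry delay for upload attempt n (0-based): base * 2^n, capped.
// baseMs and capMs are illustrative assumptions.
function uploadBackoffMs(attempt: number, baseMs: number = 1000, capMs: number = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Because the gate runs before the outbox, a silent chunk costs neither local storage nor bandwidth. A production backoff would usually add random jitter so that many satellites do not retry in lockstep.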
Recording Scheduler:
- Sunrise/sunset calculation from GPS coordinates (NOAA algorithm)
- Profile-based scheduling: dawn_chorus, night_migration, low_power, continuous
- Profiles pushed from hub via MQTT, resolved locally against sun times
- Capture loop pauses outside recording windows
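Resolving a profile against local sun times can be sketched as below. This assumes sunrise and sunset have already been computed (the NOAA calculation is not reproduced here), and every offset is a hypothetical placeholder: the real window definitions come from the hub.

```typescript
type RecordingProfile = "dawn_chorus" | "night_migration" | "low_power" | "continuous";

interface SunTimes { sunrise: Date; sunset: Date; }
interface RecordingWindow { start: Date; end: Date; }

function shiftMinutes(d: Date, minutes: number): Date {
  return new Date(d.getTime() + minutes * 60000);
}

// Offsets below are hypothetical; real profiles are pushed from the hub.
function windowFor(profile: RecordingProfile, sun: SunTimes): RecordingWindow | null {
  switch (profile) {
    case "dawn_chorus":
      return { start: shiftMinutes(sun.sunrise, -60), end: shiftMinutes(sun.sunrise, 120) };
    case "night_migration":
      return { start: shiftMinutes(sun.sunset, 60), end: shiftMinutes(sun.sunset, 360) };
    case "low_power":
      return { start: shiftMinutes(sun.sunrise, -30), end: shiftMinutes(sun.sunrise, 30) };
    case "continuous":
      return null; // no restriction: always record
  }
}

// The capture loop would call this before each chunk and pause when false.
function shouldRecord(profile: RecordingProfile, sun: SunTimes, now: Date): boolean {
  const w = windowFor(profile, sun);
  if (w === null) return true;
  const t = now.getTime();
  return t >= w.start.getTime() && t < w.end.getTime();
}
```

Keeping the resolution local means a satellite keeps following its schedule while offline, since it only needs its coordinates and the last profile it received.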
Heartbeat:
- Sends lightweight keepalive at configurable interval (default 30s, MQTT QoS 1)
- Reports state: recording, paused, scheduled_off, error
- Reports noise_floor_rms for adaptive filter calibration
- Reports satellite package version for version tracking
- Heartbeat interval configurable from hub tenant settings
- Decouples online status from audio chunk sending (satellite stays online even when all chunks are filtered)
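As an illustration, a heartbeat message and the hub-side online check might look like the sketch below. The state values, noise_floor_rms, and version come from the list above; the sent_at field, the function names, and the "three missed intervals" rule are assumptions made for this example.

```typescript
type HeartbeatState = "recording" | "paused" | "scheduled_off" | "error";

interface HeartbeatMessage {
  state: HeartbeatState;
  noise_floor_rms: number;
  version: string;
  sent_at: string; // ISO timestamp; hypothetical field
}

function buildHeartbeat(
  state: HeartbeatState,
  noiseFloorRms: number,
  version: string,
  now: Date,
): HeartbeatMessage {
  return { state, noise_floor_rms: noiseFloorRms, version, sent_at: now.toISOString() };
}

// Hub-side view: a satellite counts as online while heartbeats keep arriving,
// regardless of whether any audio chunk was uploaded.
// missedAllowed = 3 is an assumed tolerance, not a documented value.
function isSatelliteOnline(
  lastHeartbeatMs: number,
  nowMs: number,
  intervalMs: number = 30000,
  missedAllowed: number = 3,
): boolean {
  return nowMs - lastHeartbeatMs <= intervalMs * missedAllowed;
}
```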
Capture Modes: alsa (real hardware), simulate (sine wave), replay (pre-recorded WAV files)
Mobile App (packages/mobile)
Android phone satellite built with Capacitor.
- Native AudioRecord plugin (bypasses WebView audio limitations)
- GPS gate: requires location before recording starts (auto-acquire or manual entry)
- GPS auto-update toggle with configurable interval (30s–10min)
- Background mode for continuous recording when app is minimized
- Keep-screen-on toggle
- Settings redesigned as full page with clear GPS/manual location toggle
- Tap-to-record status bar (recording rectangle is the toggle)
- Verbose chunk logging toggle
- Default state: paused (not recording)
- All hub filter settings received and applied via MQTT
- Heartbeat and telemetry (battery, storage via Storage API)
- In-app log viewer for diagnostics
Hub API (packages/hub)
Central Fastify server coordinating the entire system. The hub can run in three modes via the HUB_MODE environment variable:
| Mode | What runs |
|---|---|
| full (default) | API server + all background workers in a single process |
| api | Stateless HTTP + WebSocket server only (scalable with replicas) |
| dispatcher | Background workers only: MQTT ingester, monitors, webhook dispatch, image worker (single instance) |
In full mode, everything runs in one process (the original behavior). In split mode, api and dispatcher run as separate containers. A Redis pub/sub event bridge carries events between them (e.g., detection alerts from the dispatcher are published to Redis and delivered to WebSocket clients by the API). The API process uses a lightweight MqttConfigPusher to push configuration to satellites without running the full MQTT ingester. The image worker writes stats to Redis and polls settings from the database, so it operates correctly in both modes.
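The mode switch can be pictured as a small mapping from HUB_MODE to the set of components a process starts. This is a sketch of the behavior described above, not the hub's real bootstrap code; the HubComponents shape and function names are hypothetical, and treating the Redis bridge as unused in full mode is an assumption.

```typescript
type HubMode = "full" | "api" | "dispatcher";

interface HubComponents {
  httpApi: boolean;           // HTTP + WebSocket server
  backgroundWorkers: boolean; // MQTT ingester, monitors, webhook dispatch, image worker
  redisEventBridge: boolean;  // only needed when api and dispatcher are separate
  mqttConfigPusher: boolean;  // lightweight config push used by the API process
}

// An unset or empty HUB_MODE falls back to the documented default, "full".
function parseHubMode(raw: string | undefined): HubMode {
  if (raw === undefined || raw === "") return "full";
  if (raw === "full" || raw === "api" || raw === "dispatcher") return raw;
  throw new Error(`Unsupported HUB_MODE: ${raw}`);
}

function componentsFor(mode: HubMode): HubComponents {
  switch (mode) {
    case "full":
      return { httpApi: true, backgroundWorkers: true, redisEventBridge: false, mqttConfigPusher: false };
    case "api":
      return { httpApi: true, backgroundWorkers: false, redisEventBridge: true, mqttConfigPusher: true };
    case "dispatcher":
      return { httpApi: false, backgroundWorkers: true, redisEventBridge: true, mqttConfigPusher: false };
  }
}
```

A real entry point would pass the HUB_MODE environment variable into parseHubMode and start only the components flagged true.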
Service Registry:
All service instances (API, dispatcher, web, workers) register themselves in Redis under birdnet:registry:{service}:{instanceId} with a 30-second TTL. Each registration includes IP address, version, uptime, memory usage, and service-specific metadata (e.g., worker job stats). The System page in the Platform Admin section queries this registry via GET /api/system/status to display a live overview of all running instances, infrastructure health (PostgreSQL, Redis, MQTT, MinIO), and a data summary (detections, queue, images, tenants, users). Detail rows are expandable for per-instance information.
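The registration pattern (a key per instance that disappears unless refreshed) can be sketched with an in-memory stand-in for Redis. The key format and the 30-second TTL are taken from the paragraph above; the ExpiringStore class, the InstanceInfo fields, and the function names are hypothetical, and a real deployment would use Redis key expiry instead.

```typescript
interface InstanceInfo {
  service: string;
  instanceId: string;
  ip: string;
  version: string;
  uptimeSeconds: number;
}

const REGISTRY_TTL_SECONDS = 30;

function registryKey(service: string, instanceId: string): string {
  return `birdnet:registry:${service}:${instanceId}`;
}

// In-memory stand-in for a key-value store with expiry; the clock is injected
// so the expiry behavior can be exercised without waiting.
class ExpiringStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private nowMs: () => number;

  constructor(nowMs: () => number) {
    this.nowMs = nowMs;
  }

  set(key: string, value: string, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.nowMs() + ttlSeconds * 1000 });
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.nowMs()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }
}

// Each instance would call this on a timer shorter than the TTL.
function registerInstance(store: ExpiringStore, info: InstanceInfo): void {
  store.set(registryKey(info.service, info.instanceId), JSON.stringify(info), REGISTRY_TTL_SECONDS);
}
```

The appeal of the TTL approach is that crashed instances need no cleanup: they simply stop refreshing and drop out of the System page once the key expires.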
Core Services (API):
- JWT cookie + Bearer token authentication
- Role-based access control with tenant isolation
- Satellite credential provisioning via Mosquitto dynamic security (MqttAdmin)
- WebSocket server for real-time client notifications
- MqttConfigPusher: pushes config to satellites via MQTT (in api mode)
MQTT Connection Resilience:
Both MqttAdmin (credential provisioning) and MqttConfigPusher (satellite config push) maintain persistent MQTT connections with auto-reconnect. When running multiple API instances (--scale api=N), each instance maintains its own MQTT connection. If a connection is temporarily lost (e.g., Mosquitto restart, network blip), API requests that need MQTT will wait up to 10 seconds for reconnection rather than failing immediately. This makes scaled deployments reliable without requiring sticky sessions or a shared MQTT connection.
Background Workers (Dispatcher):
- MQTT Ingester: receives audio, telemetry, and heartbeats; stores audio in MinIO; enqueues inference jobs via BullMQ
- SatelliteMonitor: checks for offline/degraded satellites every 60s
- DetectionWatcher: rarity checks, first-of-season flagging, WebSocket events, webhook dispatch. When a detection is the first observation of a species in the current season for a tenant, it sets is_first_of_season on the detection and generates an alert.
- SpeciesImageWorker: downloads bird thumbnails from Wikipedia (max 2 concurrent), stores in MinIO, with Wikimedia rate limiting (500 req/hr unauthenticated, 10,000 req/hr with OAuth 2.0)
- WebhookDispatcher: delivers webhook payloads with HMAC signatures
- AuditService: comprehensive audit logging of all user, admin, and detection actions to audit_log table (real client IPs via X-Forwarded-For)
Session Grouping: Audio chunks are grouped into recording sessions based on a 5-minute gap boundary. If the time between two consecutive chunks from the same satellite exceeds 5 minutes, a new session begins. The Sessions API exposes these groupings for the Timeline page, including per-session chunk counts and detection summaries.
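The grouping rule can be expressed in a few lines. This sketch assumes chunks carry a capture timestamp in epoch milliseconds; the ChunkStamp type and function name are hypothetical, and "exceeds 5 minutes" is read as strictly greater than.

```typescript
interface ChunkStamp {
  id: string;
  satelliteId: string;
  capturedAtMs: number; // epoch milliseconds
}

const SESSION_GAP_MS = 5 * 60 * 1000;

// Returns sessions as arrays of chunks, ordered by satellite and then by time.
function groupIntoSessions(chunks: ChunkStamp[]): ChunkStamp[][] {
  const sorted = chunks.slice().sort((a, b) =>
    a.satelliteId === b.satelliteId
      ? a.capturedAtMs - b.capturedAtMs
      : a.satelliteId < b.satelliteId ? -1 : 1,
  );
  const sessions: ChunkStamp[][] = [];
  let prev: ChunkStamp | undefined;
  for (const chunk of sorted) {
    const startsNew =
      prev === undefined ||
      prev.satelliteId !== chunk.satelliteId ||
      chunk.capturedAtMs - prev.capturedAtMs > SESSION_GAP_MS;
    if (startsNew) {
      sessions.push([chunk]);
    } else {
      sessions[sessions.length - 1].push(chunk);
    }
    prev = chunk;
  }
  return sessions;
}
```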
Inference Workers (packages/inference)
Python 3.11 processes running BirdNET TFLite.
- Pull jobs from BullMQ/Redis queue
- Download audio from MinIO, run inference via birdnetlib
- Geo-aware species filtering (GPS + date)
- Enhanced re-processing for uncertain detections (0.4–0.7 confidence):
  - Extended audio window (6s)
  - Frequency isolation (1–10kHz bandpass)
  - Cross-chunk correlation
- Scale with docker compose up --scale worker=N
Web UI (packages/web)
React 19 SPA served by nginx, responsive for mobile.
Pages:
- Dashboard: satellite fleet status, recent detections, active alerts
- Detections: 2-column flex card layout with SpectrogramPlayer (click-to-seek, drag-to-select region, loop playback, signal boost 1x–20x, speed 0.25x–2x, volume, download), vote buttons (Yes/No/Unsure) under confidence bar
- Satellites: sortable table layout (Name, Status, Profile, Version, Last seen) with search and status filter; satellite count (online/total) in header; row click opens a slide-in side panel with full details (General, Hardware, Schedule, Hub Configuration, Actions); version shown with amber warning when satellite version differs from hub version; tenant name shown in All Tenants view
- Analytics: daily trends, hourly activity, top species, biodiversity indices, weather correlation, migration patterns
- Compare: side-by-side location comparison with Jaccard similarity
- Alerts: custom alert rules (detection/absence/trend triggers) and 6 notification channels
- Workers: connected workers with IP-based dedup, queue stats (platform admin only)
- Members: tenant member management, invite links
- Timeline: chronological view of recording sessions with inline chunks and detections
- Species: card catalog of all detected species with stats, sorting, search
- Map: Leaflet.js interactive map with satellite markers and detection data
- Account: profile editing, password change, language preferences, self-delete
- Audit Log: paginated table of all admin and user actions (platform admin)
- Settings: tenant thresholds, watchlists; platform toggles, Wikimedia API, image cache
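For the Compare page, Jaccard similarity is the size of the intersection of two species sets divided by the size of their union. A minimal sketch follows; the function name is hypothetical, and returning 1 for two empty sets is a convention chosen here, not something the source specifies.

```typescript
// Jaccard similarity of two species lists: |A ∩ B| / |A ∪ B|.
// Duplicates within a list are ignored because the inputs are treated as sets.
function jaccardSimilarity(speciesA: string[], speciesB: string[]): number {
  const a = new Set(speciesA);
  const b = new Set(speciesB);
  if (a.size === 0 && b.size === 0) return 1; // convention for two empty sets
  let shared = 0;
  a.forEach((s) => {
    if (b.has(s)) shared++;
  });
  return shared / (a.size + b.size - shared);
}
```

With one location reporting Robin, Wren, and Blackbird and another reporting Wren, Blackbird, Dunnock, and Chaffinch, two species are shared out of five in total, giving a similarity of 0.4.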
Features:
- Species thumbnails from Wikipedia (downloaded in background, served from MinIO)
- Multi-language bird names (38 languages from BirdNET label files)
- Species name display: primary bold, secondary in parentheses, Latin italic, clickable thumbnail with lightbox
- Real-time WebSocket notifications for rare species (translated names)
- Responsive layout with hamburger menu on mobile, scrollable sidebar (hidden scrollbar)
- Spectrogram visualization (Cooley-Tukey FFT, Viridis colormap, 0–12kHz)
- Full i18n support (English + French, 500+ translation keys, locale-aware formatting)
- Language switcher in sidebar
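To make the spectrogram item concrete, here is a compact recursive radix-2 Cooley-Tukey FFT with a helper that finds the strongest frequency bin in one frame. It is a teaching sketch, not the Web UI's implementation: the names are hypothetical, the input length must be a power of two, and windowing and colormapping are left out.

```typescript
interface Complex { re: number; im: number; }

// Recursive radix-2 Cooley-Tukey FFT (input length must be a power of two).
function fft(input: Complex[]): Complex[] {
  const n = input.length;
  if (n === 0 || (n & (n - 1)) !== 0) throw new Error("length must be a power of two");
  if (n === 1) return [input[0]];
  const even: Complex[] = [];
  const odd: Complex[] = [];
  for (let i = 0; i < n; i += 2) {
    even.push(input[i]);
    odd.push(input[i + 1]);
  }
  const e = fft(even);
  const o = fft(odd);
  const out: Complex[] = new Array(n);
  for (let k = 0; k < n / 2; k++) {
    const angle = (-2 * Math.PI * k) / n;
    const wr = Math.cos(angle);
    const wi = Math.sin(angle);
    const tr = wr * o[k].re - wi * o[k].im; // twiddle factor times odd term
    const ti = wr * o[k].im + wi * o[k].re;
    out[k] = { re: e[k].re + tr, im: e[k].im + ti };
    out[k + n / 2] = { re: e[k].re - tr, im: e[k].im - ti };
  }
  return out;
}

// Index of the strongest bin among the first n/2 (the non-mirrored half).
function dominantBin(samples: number[]): number {
  const spectrum = fft(samples.map((s) => ({ re: s, im: 0 })));
  let best = 0;
  let bestMag = -1;
  for (let k = 0; k < samples.length / 2; k++) {
    const mag = Math.sqrt(spectrum[k].re * spectrum[k].re + spectrum[k].im * spectrum[k].im);
    if (mag > bestMag) {
      bestMag = mag;
      best = k;
    }
  }
  return best;
}

// Centre frequency of a bin for a given FFT size and sample rate.
function binFrequencyHz(bin: number, fftSize: number, sampleRateHz: number): number {
  return (bin * sampleRateHz) / fftSize;
}
```

At a 48 kHz sample rate the usable spectrum extends to 24 kHz, so a 0–12kHz display would keep only the bins whose centre frequency is at or below 12,000 Hz.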
MQTT Channels
birdnet/{tenant_id}/{satellite_id}/{channel}
| Channel | Direction | QoS | Purpose |
|---|---|---|---|
| audio | Satellite → Hub | 1 | Audio chunk (base64 WAV) |
| telemetry | Satellite → Hub | 1 | Battery, storage, CPU, GPS |
| heartbeat | Satellite → Hub | 1 | Keepalive with state + noise_floor_rms |
| config | Hub → Satellite | 2 | Recording profile + filter settings push |
| config-request | Satellite → Hub | 1 | Device-initiated config change request (respects lock) |
| update | Hub → Satellite | 1 | Remote update command (triggers update script on Pi) |
| ack | Hub → Satellite | 1 | Chunk acknowledgment |
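The topic scheme and the QoS column lend themselves to small typed helpers. The sketch below follows the birdnet/{tenant_id}/{satellite_id}/{channel} pattern and the table above; the helper names and the rule rejecting MQTT wildcard characters in identifiers are assumptions added for illustration.

```typescript
const MQTT_CHANNELS = ["audio", "telemetry", "heartbeat", "config", "config-request", "update", "ack"] as const;
type MqttChannel = (typeof MQTT_CHANNELS)[number];

interface TopicParts {
  tenantId: string;
  satelliteId: string;
  channel: MqttChannel;
}

function isMqttChannel(value: string): value is MqttChannel {
  return (MQTT_CHANNELS as readonly string[]).indexOf(value) !== -1;
}

// Identifiers must not be empty or contain topic separators or wildcards.
function isSafeSegment(value: string): boolean {
  return value.length > 0 && !/[\/+#]/.test(value);
}

function buildTopic(parts: TopicParts): string {
  if (!isSafeSegment(parts.tenantId) || !isSafeSegment(parts.satelliteId)) {
    throw new Error("tenantId and satelliteId must be non-empty and free of '/', '+', '#'");
  }
  return `birdnet/${parts.tenantId}/${parts.satelliteId}/${parts.channel}`;
}

function parseTopic(topic: string): TopicParts | null {
  const parts = topic.split("/");
  if (parts.length !== 4 || parts[0] !== "birdnet") return null;
  const [, tenantId, satelliteId, channel] = parts;
  if (!isSafeSegment(tenantId) || !isSafeSegment(satelliteId) || !isMqttChannel(channel)) return null;
  return { tenantId, satelliteId, channel };
}

// Per the table: config is published at QoS 2, every other channel at QoS 1.
function qosFor(channel: MqttChannel): 1 | 2 {
  return channel === "config" ? 2 : 1;
}
```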
Authentication
Two methods, resolved in order:
- Bearer token: Authorization: Bearer <token> (JWT from mobile login, or API key SHA-256 hashed)
- JWT cookie: session httpOnly cookie set by login endpoint
Roles (tenant-scoped): viewer < member < admin < owner
Platform admin: cross-tenant access, read-only protections, designated via DB flag or PLATFORM_ADMIN_EMAILS env var
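The resolution order and the role hierarchy can be sketched as two pure functions. This example only decides which credential to use and whether a role is sufficient; token verification is out of scope, and the type and function names are hypothetical.

```typescript
type TenantRole = "viewer" | "member" | "admin" | "owner";

// Ordered from least to most privileged: viewer < member < admin < owner.
const ROLE_ORDER: TenantRole[] = ["viewer", "member", "admin", "owner"];

function roleAtLeast(actual: TenantRole, required: TenantRole): boolean {
  return ROLE_ORDER.indexOf(actual) >= ROLE_ORDER.indexOf(required);
}

type ResolvedCredential = { method: "bearer" | "cookie"; token: string } | null;

// Bearer token first, then the "session" cookie, mirroring the order above.
function resolveCredential(headers: { authorization?: string; cookie?: string }): ResolvedCredential {
  const auth = headers.authorization;
  if (auth !== undefined && auth.slice(0, 7) === "Bearer " && auth.length > 7) {
    return { method: "bearer", token: auth.slice(7) };
  }
  const cookies = (headers.cookie ?? "").split(";");
  for (const raw of cookies) {
    const c = raw.trim();
    if (c.slice(0, 8) === "session=" && c.length > 8) {
      return { method: "cookie", token: c.slice(8) };
    }
  }
  return null;
}
```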
Multi-Factor Authentication (v0.27+)
Optional second factor for any user, mandatory for platform admins.
- TOTP (RFC 6238, ±30 s tolerance): scan a QR code with any authenticator app (Google Authenticator, 1Password, Authy, etc.). Secret stored AES-256-GCM-encrypted at rest with MFA_ENCRYPTION_KEY (32-byte hex env var)
- WebAuthn passkeys: Touch ID, Windows Hello, YubiKey, etc., registered against an authenticated session and used as an alternative second factor. Requires WEBAUTHN_RP_ID / WEBAUTHN_RP_NAME / WEBAUTHN_ORIGIN (auto-derived from BNG_APP_FQDN)
- Backup codes: 10 single-use Crockford-alphabet codes shown once at TOTP enrollment, bcrypt-hashed at rest, popped on consume. Regeneratable from /account (re-enter password)
- Login challenge: /login returns {mfaRequired, mfaToken} instead of the session cookie when MFA is enabled. The 5-min mfaToken JWT bridges the password and second-factor steps; up to 5 wrong codes per token (Redis-backed counter) before it's burned and the user has to re-enter the password
- Forced-enrollment gate: platform admins land on a full-page enrollment screen before the AppShell renders if they haven't enrolled yet
- Remember this device (30 days): opt-in cookie that skips the second factor on the same browser. Token is sha256-hashed at rest in mfa_remembered_devices; per-device list with revoke on /account. Logout does NOT clear the cookie (matches Google / GitHub / Microsoft)
- Reset escape hatches: platform admin "Reset MFA" action on PlatformUsers; CLI emergency reset at pnpm --filter @birdnet-ng/hub exec mfa:reset <email> for the case where the only platform admin loses both their device and backup codes
- Audit: every MFA action logged (enabled, disabled, login_failed, passkey_added/removed, reset_by_admin, etc.)
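The wrong-code limit on the login challenge can be sketched with a per-token counter. An in-memory Map stands in for the Redis-backed counter here, and the class name is hypothetical. The source says "up to 5 wrong codes per token"; this sketch assumes the fifth wrong code is the one that burns the token.

```typescript
const MAX_WRONG_CODES = 5;

// In-memory stand-in for the Redis-backed counter keyed by mfaToken.
class MfaAttemptCounter {
  private wrongByToken = new Map<string, number>();

  // Records a wrong code. Returns true while the token may still be used,
  // false once it is burned and the user must re-enter the password.
  recordWrongCode(mfaToken: string): boolean {
    const wrong = (this.wrongByToken.get(mfaToken) ?? 0) + 1;
    this.wrongByToken.set(mfaToken, wrong);
    return wrong < MAX_WRONG_CODES;
  }

  isBurned(mfaToken: string): boolean {
    return (this.wrongByToken.get(mfaToken) ?? 0) >= MAX_WRONG_CODES;
  }
}
```

Counting per token rather than per user means a fresh password entry starts a fresh budget, while brute-forcing a single 5-minute token stays bounded.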
Multi-Tenancy
- Row-level isolation via tenant_id on all data tables
- Users are global; membership via tenant_members join table
- Each user can have different roles in different tenants
- MQTT ACLs enforce per-satellite topic scoping
- MinIO paths prefixed by tenant: {tenant_id}/{satellite_id}/{chunk_id}.wav
- Species images are global (not tenant-scoped)
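As a small illustration of the tenant-prefixed storage layout, the helpers below build an audio object key and check that a key sits under a given tenant. The function names and the segment validation are assumptions for this sketch; the key pattern itself is the one listed above.

```typescript
// Builds {tenant_id}/{satellite_id}/{chunk_id}.wav, rejecting segments that
// could escape the tenant prefix.
function audioObjectKey(tenantId: string, satelliteId: string, chunkId: string): string {
  for (const part of [tenantId, satelliteId, chunkId]) {
    if (part === "" || part === ".." || part.indexOf("/") !== -1) {
      throw new Error(`Invalid path segment: "${part}"`);
    }
  }
  return `${tenantId}/${satelliteId}/${chunkId}.wav`;
}

// True only when the key lives under exactly this tenant's prefix.
function keyBelongsToTenant(key: string, tenantId: string): boolean {
  const prefix = `${tenantId}/`;
  return key.slice(0, prefix.length) === prefix;
}
```

Comparing against the prefix including its trailing slash matters: without it, tenant "t1" would appear to own keys that belong to tenant "t10".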
Database
PostgreSQL 17 with 71 migrations (001–071):
| Migration | Purpose |
|---|---|
| 001 | Core tables: tenants, satellites, audio_chunks, detections, alerts, telemetry |
| 002 | User auth: global users, tenant_members, invites, platform/tenant settings |
| 003 | User management: blocked flag, updated_at |
| 004 | Webhooks table |
| 005 | Satellite device_id for dedup |
| 006 | Watchlist species in tenant settings |
| 007 | Language settings on tenant |
| 008 | Per-user language preferences |
| 009 | Species images cache table |
| 010 | Satellite heartbeat_state and uptime |
| 011 | Nullable telemetry fields (mobile compatibility) |
| 012 | Species image download queue columns |
| 013 | User management: last_login, lockout, audit_log, login_attempts |
| 014 | Audio filter settings |
| 015 | First of season flag on detections |
| 016 | Voting mode user preference (show_voting) |
| 017 | Time format user preference (time_format) |
| 018 | Satellite config overrides table + config_locked flag |
| 019 | Detection comments table |
| 020 | Alert rules + alert channels tables |
| 021 | Scheduled exports table |
| 022 | Start-of-week user preference |
| 023 | First-of-day flag on detections |
| 024 | Satellite update status tracking |
| 025 | Satellite filter stats (JSONB) |
| 026 | Field notes + field note photos |
| 027 | Watcher-processed flag on detections |
| 028 | Mobile releases (APK upload) |
| 029 | Detection stitching (stitch_group_id, stitch_chunk_ids, is_stitch_primary) |
| 030 | Audio retention (pinned, audio_chunks.size_bytes + audio_purged_at, platform defaults) |
| 031 | Detection frequency band (superseded by 033) |
| 032 | Detection call window (superseded by 033) |
| 033 | Detection call_segments JSONB: array of {start_ms, end_ms, freq_min_hz, freq_max_hz} per call |
| 034 | Detection stitched_call_segments JSONB: combined segments on stitched primaries |
| 035 | Detection trust_score + neighbor_count + compute_trust_score() SQL function |
| 036 | species_frequency_bands: empirical per-species freq range from trusted detections |
| 037 | audio_chunks.rms: per-chunk amplitude for silent-bucket detection on the timeline |
| 038 | alert_rules gains system + system_key + cooldown_scope + message_template; seeds system rules per tenant |
| 039 | alert_channels.payload_template; webhooks migrated into alert_channels with type='webhook' |
| 040 | alert_rules.fire_mode: cooldown / once_until_clear |
| 041 | alert_rules.fire_mode adds on_state_change; satellite_offline switched to this mode |
| 042 | birdnet_models registry: multi-model BirdNET store + per-detection model_id |
| 043 | birdnet_models bundled-model relabeling cleanup |
| 044 | audio storage hard cap (purges keep-best/pinned when over the hard ceiling) |
| 045 | inference_mode platform setting + shadow_detections table for A/B compare |
| 046 | species_image_files: per-species multi-image gallery storage |
| 047 | clean duplicate image extras |
| 048 | audio_chunks.embedding: per-chunk Perch v2 acoustic embedding (1536-d float16) |
| 049 | birdnet_models.kind ('classifier' / 'embedding') + per-kind default index |
| 050 | drop builtin model rows from registry (worker falls back to birdnetlib's bundled .tflite) |
| 051 | silent_chunk_retention_hours platform setting (default 6h): purges chunks with no detections |
| 052 | YAMNet replaces spectral peak / noise-floor heuristics; add yamnet_min_bird_prob (default 0.05) |
| 053 | Satellite outbox capacity caps: drop retention_hours; add outbox_soft_size_mb (5000) / outbox_hard_size_mb (8000) / outbox_max_age_hours (720) |
| 054 | Detection visibility by category: species_category() SQL function classifies into bird/amphibian/insect/anthropogenic/human_voice/other_animal; 5 show_* booleans on tenant_settings (default false). satellites.outbox_stats JSONB |
| 055 | Split visibility from satellite drop: 5 drop_at_satellite booleans on tenant_settings (default true) + satellite_config_overrides (nullable). show_* remains UI-only; drop_* governs the YAMNet pre-upload drop and is per-satellite-overridable |
| 056 | detection_visible_for_tenant() SQL function: consults each detection's own tenant_settings inline so the visibility filter works on platform-admin "all tenants" queries |
| 057 | apk_keep_last_n platform setting (default 5): rolling-release cap for mobile_releases; APK retention sweep drops oldest beyond N |
| 058 | species_category() patched to cover BirdNET v2.4's five mammal labels (Canis lupus, Canis latrans, Sciurus carolinensis, Tamias striatus, Odocoileus virginianus), classified as other_animal so the show_other_animals visibility toggle hides them |
| 059 | tenant_settings.range_filter_threshold REAL DEFAULT 0.01: per-tenant probability cutoff on the eBird meta-model that narrows the BirdNET label set by lat/lon/week; default chosen lower than birdnetlib's 0.03 to keep resident species (Tawny Owl, Stock Dove, Scops Owl) that eBird under-reports |
| 060 | Temporal aggregation (Merlin-style FP suppression): detections.is_promoted; tenant_settings.temporal_aggregation_{enabled,count,window_minutes} (true/2/30); confidence_high default 0.85 (single-shot bypass). Tentative detections hidden from UI/exports/share/alerts until promoted by a 2nd hit within window |
| 061 | tenant_settings.tentative_retention_hours DEFAULT 24: retention sweep drops unpromoted detections older than this so the table stays bounded |
| 062 | detections.is_out_of_range: inference worker flags hits where BirdNET ≥ confidence_high but the species was eBird-rejected; surfaced as "Out of range" badge so vagrants aren't silently dropped |
| 063 | species_images + species_image_files: original_storage_key + original_content_type + original_file_size, a full-resolution copy alongside the 800px display thumb so the lightbox can serve and zoom the original |
| 064 | MFA: users.{mfa_enabled, mfa_secret_encrypted (AES-256-GCM), mfa_backup_codes_hash[] (bcrypt), mfa_enrolled_at} + user_passkeys table for WebAuthn credentials |
| 065 | mfa_remembered_devices: long-lived "trust this device" tokens (sha256-hashed, with user_agent + ip_address for the listing UI), 30-day default expiry |
| 066 | species_images.extras_exhausted_at: marks species whose Wikipedia image list is exhausted so the backfill loop rotates instead of looping on a partial gallery; weekly re-check window |
| 067 | satellites.stream ('satellite' / 'mobile'): set by heartbeat from each device, used by the hub to compare against the matching latest target instead of always comparing to the hub version |
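The temporal aggregation rule from migration 060 is compact enough to sketch. The defaults (enabled, count 2, 30-minute window, confidence_high 0.85) are the ones quoted in the table; the types, the function name, and the choice to count hits on either side of the detection are assumptions made for this illustration, and the real logic may live in SQL or in the worker.

```typescript
interface SpeciesHit {
  species: string;
  confidence: number;
  atMs: number; // epoch milliseconds
}

interface AggregationSettings {
  enabled: boolean;
  count: number;
  windowMinutes: number;
  confidenceHigh: number;
}

// Defaults quoted in migration 060.
const DEFAULT_AGGREGATION: AggregationSettings = {
  enabled: true,
  count: 2,
  windowMinutes: 30,
  confidenceHigh: 0.85,
};

// A hit is promoted when aggregation is off, when it clears the single-shot
// confidence bar, or when enough same-species hits fall within the window.
function isPromoted(
  hit: SpeciesHit,
  others: SpeciesHit[],
  settings: AggregationSettings = DEFAULT_AGGREGATION,
): boolean {
  if (!settings.enabled) return true;
  if (hit.confidence >= settings.confidenceHigh) return true; // single-shot bypass
  const windowMs = settings.windowMinutes * 60000;
  let hits = 1; // the hit itself
  for (const other of others) {
    if (other.species === hit.species && Math.abs(other.atMs - hit.atMs) <= windowMs) {
      hits++;
    }
  }
  return hits >= settings.count;
}
```

Under these defaults a lone 0.6-confidence hit stays tentative, and it becomes visible once a second hit of the same species arrives within 30 minutes.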
Species Images
Bird thumbnails are managed by a background download queue:
- When a species is first seen, it's queued in the species_images table
- SpeciesImageWorker processes the queue (max 2 concurrent, configurable delay)
- Fetches the original image URL from Wikipedia (pageimages prop), then downloads both the 800px display thumb and the full-resolution original in a single pass, stored as species-images/{slug}.{jpg|png} and species-images/{slug}_orig.{jpg|png} respectively
- Up to N additional images per species (species_extra_images_count) are pulled from the page's images prop; each extra also stores both sizes ({slug}_{position}.jpg + {slug}_{position}_orig.jpg)
- Proxy endpoint serves from MinIO with immutable cache headers; ?original=1 returns the full-resolution copy
- Frontend shows placeholder (pulsing dots) until download completes; the lightbox always uses originals when present, with click-to-zoom + drag-pan
- Backfill rotates across species ordered by (extras_count ASC, fetched_at ASC); species whose image list is exhausted are marked extras_exhausted_at and skipped for 7 days
- Platform admin can clear cache, retry failed downloads, adjust download delay
Rate limiting:
- Sliding 1-hour request window with auto-throttle (slow down at 80%, pause at 95%)
- Respects Retry-After header on 429 responses
- Unauthenticated: 500 req/hr (with proper User-Agent)
- Authenticated (OAuth 2.0): 10,000 req/hr
- Wikimedia OAuth token and contact email configurable in Platform Settings
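The sliding-window throttle can be sketched as a small class that remembers request timestamps from the last hour. The 80% and 95% thresholds are the ones listed above; the class and method names are hypothetical, timestamps are passed in explicitly to keep the example deterministic, and Retry-After handling is left out.

```typescript
type ThrottleDecision = "go" | "slow" | "pause";

const HOUR_MS = 60 * 60 * 1000;

class HourlyRequestWindow {
  private stamps: number[] = []; // request times in ms, oldest first
  private limitPerHour: number;

  constructor(limitPerHour: number) {
    this.limitPerHour = limitPerHour;
  }

  // Call once per outgoing request.
  record(nowMs: number): void {
    this.stamps.push(nowMs);
  }

  // Drops requests older than one hour, then compares usage to the limit.
  decide(nowMs: number): ThrottleDecision {
    while (this.stamps.length > 0 && this.stamps[0] <= nowMs - HOUR_MS) {
      this.stamps.shift();
    }
    const usage = this.stamps.length / this.limitPerHour;
    if (usage >= 0.95) return "pause";
    if (usage >= 0.8) return "slow";
    return "go";
  }
}
```

With the unauthenticated limit of 500 req/hr, this slows down from the 400th request in the window and pauses from the 475th, then recovers on its own as old requests age out.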
Versioning
Three independent SemVer streams (since vhub-0.28.0):
| Stream | Includes | Tag prefix |
|---|---|---|
| hub | hub, web, inference, shared, docs, root | vhub-X.Y.Z |
| satellite | packages/satellite only | vsat-X.Y.Z |
| mobile | packages/mobile only | vapk-X.Y.Z |
Hub-only bumps no longer flag every satellite as outdated. Bump per scope:
pnpm version:bump --scope hub patch
pnpm version:bump --scope satellite minor
pnpm version:bump --scope mobile patch
The hub at runtime knows the latest target for each stream (via bundled package.json + MAX(mobile_releases.version) for mobile) and exposes them at GET /api/system/versions. Each satellite reports its stream field on heartbeat so the hub can compare against the right target. Pi update script checks out the latest vsat-* tag; mobile versionCode derives from the SemVer (no more +b<timestamp> suffix).
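The per-stream comparison can be sketched as follows. It assumes plain X.Y.Z versions without pre-release suffixes; the function names and the shape of the targets object are hypothetical, and the version numbers in any usage are invented for illustration.

```typescript
type VersionStream = "hub" | "satellite" | "mobile";

// Parses a plain X.Y.Z version; pre-release and build suffixes are rejected.
function parseSemVer(version: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (m === null) throw new Error(`Not a plain X.Y.Z version: ${version}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Returns -1, 0, or 1, comparing major, then minor, then patch numerically.
function compareSemVer(a: string, b: string): number {
  const pa = parseSemVer(a);
  const pb = parseSemVer(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// A device is outdated only relative to the latest target of its own stream.
function isOutdated(
  device: { stream: VersionStream; version: string },
  latest: Record<VersionStream, string>,
): boolean {
  return compareSemVer(device.version, latest[device.stream]) < 0;
}
```

Because the lookup is keyed by the device's stream, bumping the hub target alone leaves every satellite and phone marked as current.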
Technology Stack
| Component | Technology |
|---|---|
| Satellite agent | TypeScript / Node.js |
| Local satellite DB | sql.js (SQLite) |
| Transport | MQTT (Mosquitto 2.x, dynamic security plugin) |
| Hub API | TypeScript / Fastify |
| Authentication | JWT cookies + Bearer API keys (bcrypt, SHA-256), MFA via TOTP (otplib) + WebAuthn (@simplewebauthn) |
| Job queue | BullMQ + Redis |
| Database | PostgreSQL 17 (71 migrations) |
| Object storage | MinIO (audio + species images, originals + 800px thumbs) |
| Inference | Python 3.11 / TFLite / birdnetlib |
| Web UI | TypeScript / React 19 / Mantine v9 / Vite / nginx |
| Mobile app | Capacitor (Android) / native AudioRecord plugin |
| Species translations | 38 languages from BirdNET label files |
| Documentation | Express + markdown-it |
| Reverse proxy | Traefik (HTTPS + WSS for MQTT) |
| Containers | Docker Compose (8 default, 9 with split mode) |
| Monorepo | pnpm workspaces |