BirdNET-NG Architecture

A distributed bird sound identification system built on the BirdNET deep learning model.

Overview

BirdNET-NG decouples audio capture, processing, and visualization into independently deployable components that communicate over a message broker. This enables community-contributed satellites (microphone nodes), shared inference infrastructure, and multiple UI clients, all with multi-tenant isolation from day one.

graph TD
    subgraph Satellites
        S1["🎙 Satellite 1<br/>(Pi + Mic)"]
        S2["🎙 Satellite 2<br/>(Pi + Mic)"]
        PA["📱 Phone App<br/>(Android)"]
    end

    S1 -->|WSS :443| T
    S2 -->|WSS :443| T
    PA -->|WSS :443| T

    T["🔀 Traefik<br/>(reverse proxy)"]
    T -->|HTTPS| Web["🌐 Web<br/>(nginx)"]
    T -->|WSS| Mosq["📡 Mosquitto<br/>(dynsec)"]
    T -->|HTTPS| Docs["📖 Docs"]

    subgraph Hub Cluster
        Web --> Hub["⚡ Hub API<br/>(Fastify)"]
        Mosq --> Hub
        Hub --> PG["🗄 PostgreSQL"]
        Hub --> Redis["⚙ Redis +<br/>BullMQ"]
        Redis --> W1["🐍 Worker 1<br/>(Python)"]
        Redis --> WN["🐍 Worker N<br/>(Python)"]
    end

    T -->|HTTPS| MinIO["💾 MinIO<br/>(S3 storage)"]
    Hub --> MinIO
    W1 --> MinIO
    WN --> MinIO

    style S1 fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style S2 fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style PA fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style T fill:#4c1d95,stroke:#8b5cf6,color:#e2e8f0
    style Web fill:#065f46,stroke:#22c55e,color:#e2e8f0
    style Hub fill:#065f46,stroke:#22c55e,color:#e2e8f0
    style Mosq fill:#7c2d12,stroke:#f97316,color:#e2e8f0
    style Docs fill:#065f46,stroke:#22c55e,color:#e2e8f0
    style PG fill:#1e3a5f,stroke:#3b82f6,color:#e2e8f0
    style Redis fill:#7c2d12,stroke:#f97316,color:#e2e8f0
    style W1 fill:#713f12,stroke:#eab308,color:#e2e8f0
    style WN fill:#713f12,stroke:#eab308,color:#e2e8f0
    style MinIO fill:#4c1d95,stroke:#8b5cf6,color:#e2e8f0

Components

Satellite (packages/satellite)

Lightweight Node.js agent running on a Raspberry Pi with an attached microphone.

Audio Pipeline:

  • Captures 3-second WAV chunks at 48kHz mono (BirdNET's native window)
  • On-device pre-filtering: RMS silence gate + YAMNet VAD (per-category bird-likelihood scoring)
  • Filter settings pushed from hub tenant settings via MQTT config channel
  • Local SQLite outbox queue (sql.js), so capture never blocks on the network
  • MQTT upload with acknowledgment and exponential backoff
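
The RMS silence gate in the pipeline above can be illustrated with a short Python sketch. This is not the satellite's actual TypeScript code: the function names and the 0.01 threshold are assumptions chosen for the example, and the input is assumed to be little-endian 16-bit PCM.

```python
import math
import struct


def rms_16bit(pcm: bytes) -> float:
    """Root-mean-square amplitude of little-endian 16-bit PCM, scaled to 0..1."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


def passes_silence_gate(pcm: bytes, threshold: float = 0.01) -> bool:
    """True when the chunk is loud enough to hand to the next filter stage.

    The 0.01 default is a hypothetical value, not a documented setting.
    """
    return rms_16bit(pcm) >= threshold


# Two tiny synthetic chunks: pure silence and a half-scale square wave.
silent = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 16384, -16384, 16384, -16384)
```

A chunk that fails the gate would be dropped before the more expensive YAMNet stage runs, which is the usual reason for putting the cheap check first.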

Recording Scheduler:

  • Sunrise/sunset calculation from GPS coordinates (NOAA algorithm)
  • Profile-based scheduling: dawn_chorus, night_migration, low_power, continuous
  • Profiles pushed from hub via MQTT, resolved locally against sun times
  • Capture loop pauses outside recording windows
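
Resolving a profile against local sun times might look like the following Python sketch. The window offsets and the `PROFILE_WINDOWS` table are hypothetical, since the actual profile definitions are not given here, and night_migration is left out because a window that spans midnight needs the next day's sunrise as well.

```python
from datetime import datetime, timedelta

# Hypothetical profile table: each window is (anchor, start offset, end offset)
# in minutes relative to that day's sunrise or sunset. Offsets are illustrative.
PROFILE_WINDOWS = {
    "dawn_chorus": [("sunrise", -60, 180)],
    "low_power": [("sunrise", -30, 60), ("sunset", -60, 30)],
}


def resolve_windows(profile, sunrise, sunset):
    """Turn a profile's relative windows into absolute (start, end) pairs."""
    anchors = {"sunrise": sunrise, "sunset": sunset}
    return [
        (anchors[name] + timedelta(minutes=start),
         anchors[name] + timedelta(minutes=end))
        for name, start, end in PROFILE_WINDOWS[profile]
    ]


def should_record(profile, now, sunrise, sunset):
    """True when the capture loop should be running at `now`."""
    if profile == "continuous":
        return True
    windows = resolve_windows(profile, sunrise, sunset)
    return any(start <= now < end for start, end in windows)


sunrise = datetime(2024, 5, 1, 6, 0)
sunset = datetime(2024, 5, 1, 20, 30)
```

Because the hub only pushes the profile name and the satellite computes sun times itself, a satellite that loses connectivity can keep following its schedule.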

Heartbeat:

  • Sends lightweight keepalive at configurable interval (default 30s, MQTT QoS 1)
  • Reports state: recording, paused, scheduled_off, error
  • Reports noise_floor_rms for adaptive filter calibration
  • Reports satellite package version for version tracking
  • Heartbeat interval configurable from hub tenant settings
  • Decouples online status from audio chunk sending (satellite stays online even when all chunks are filtered)

Capture Modes: alsa (real hardware), simulate (sine wave), replay (pre-recorded WAV files)

Mobile App (packages/mobile)

Android phone satellite built with Capacitor.

  • Native AudioRecord plugin (bypasses WebView audio limitations)
  • GPS gate: requires location before recording starts (auto-acquire or manual entry)
  • GPS auto-update toggle with configurable interval (30s–10min)
  • Background mode for continuous recording when app is minimized
  • Keep-screen-on toggle
  • Settings redesigned as full page with clear GPS/manual location toggle
  • Tap-to-record status bar (recording rectangle is the toggle)
  • Verbose chunk logging toggle
  • Default state: paused (not recording)
  • All hub filter settings received and applied via MQTT
  • Heartbeat and telemetry (battery, storage via Storage API)
  • In-app log viewer for diagnostics

Hub API (packages/hub)

Central Fastify server coordinating the entire system. The hub can run in three modes via the HUB_MODE environment variable:

| Mode | What runs |
|------|-----------|
| full (default) | API server + all background workers in a single process |
| api | Stateless HTTP + WebSocket server only (scalable with replicas) |
| dispatcher | Background workers only: MQTT ingester, monitors, webhook dispatch, image worker (single instance) |

In full mode, everything runs in one process (the original behavior). In split mode, api and dispatcher run as separate containers. A Redis pub/sub event bridge carries events between them (e.g., detection alerts from the dispatcher are published to Redis and delivered to WebSocket clients by the API). The API process uses a lightweight MqttConfigPusher to push configuration to satellites without running the full MQTT ingester. The image worker writes stats to Redis and polls settings from the database, so it operates correctly in both modes.

Service Registry: All service instances (API, dispatcher, web, workers) register themselves in Redis under birdnet:registry:{service}:{instanceId} with a 30-second TTL. Each registration includes IP address, version, uptime, memory usage, and service-specific metadata (e.g., worker job stats). The System page in the Platform Admin section queries this registry via GET /api/system/status to display a live overview of all running instances, infrastructure health (PostgreSQL, Redis, MQTT, MinIO), and a data summary (detections, queue, images, tenants, users). Detail rows are expandable for per-instance information.
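
As a rough illustration of the registry entries described above, the Python sketch below builds the key and a JSON payload. The key pattern and the 30-second TTL come from this section; the payload field names are assumptions, and the hub itself is written in TypeScript.

```python
import json

REGISTRY_TTL_SECONDS = 30  # TTL stated in this section


def registry_key(service: str, instance_id: str) -> str:
    """Key pattern from this section: birdnet:registry:{service}:{instanceId}."""
    return f"birdnet:registry:{service}:{instance_id}"


def registry_payload(ip, version, started_at, now, memory_bytes, metadata=None):
    """Serialize one registration. Field names here are hypothetical."""
    return json.dumps({
        "ip": ip,
        "version": version,
        "uptime_s": int(now - started_at),
        "memory_bytes": memory_bytes,
        "metadata": metadata or {},
    })


# With the redis-py client, a registration could then be written as
#   client.set(key, payload, ex=REGISTRY_TTL_SECONDS)
# and refreshed more often than the TTL so that live instances never expire.
key = registry_key("api", "a1b2")
payload = registry_payload("10.0.0.5", "0.28.0", 1000.0, 1090.0, 123456)
```

Relying on key expiry means a crashed instance disappears from the System page on its own, without any explicit deregistration step.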

Core Services (API):

  • JWT cookie + Bearer token authentication
  • Role-based access control with tenant isolation
  • Satellite credential provisioning via Mosquitto dynamic security (MqttAdmin)
  • WebSocket server for real-time client notifications
  • MqttConfigPusher: pushes config to satellites via MQTT (in api mode)

MQTT Connection Resilience:

Both MqttAdmin (credential provisioning) and MqttConfigPusher (satellite config push) maintain persistent MQTT connections with auto-reconnect. When running multiple API instances (--scale api=N), each instance maintains its own MQTT connection. If a connection is temporarily lost (e.g., Mosquitto restart, network blip), API requests that need MQTT will wait up to 10 seconds for reconnection rather than failing immediately. This makes scaled deployments reliable without requiring sticky sessions or a shared MQTT connection.
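
The wait-for-reconnect behaviour can be sketched in Python with a `threading.Event`. The class and function names are hypothetical and the real hub is TypeScript; only the 10-second bound comes from the paragraph above.

```python
import threading


class ConnectionGate:
    """Tracks whether the MQTT connection is currently up."""

    def __init__(self):
        self._connected = threading.Event()

    def on_connect(self):
        self._connected.set()

    def on_disconnect(self):
        self._connected.clear()

    def wait_until_connected(self, timeout_s: float = 10.0) -> bool:
        """Block until connected, or return False once the timeout elapses."""
        return self._connected.wait(timeout_s)


def publish_when_connected(gate, publish, topic, payload, timeout_s=10.0):
    """Wait for a reconnect instead of failing the request immediately."""
    if not gate.wait_until_connected(timeout_s):
        raise TimeoutError("MQTT connection not restored in time")
    publish(topic, payload)
```

The MQTT client's connect and disconnect callbacks would call `on_connect` and `on_disconnect`, so a request that arrives during a short broker restart simply blocks until the connection is back.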

Background Workers (Dispatcher):

  • MQTT Ingester: receives audio, telemetry, and heartbeats; stores audio in MinIO; enqueues inference jobs via BullMQ
  • SatelliteMonitor: checks for offline/degraded satellites every 60s
  • DetectionWatcher: rarity checks, first-of-season flagging, WebSocket events, webhook dispatch. When a detection is the first observation of a species in the current season for a tenant, it sets is_first_of_season on the detection and generates an alert.
  • SpeciesImageWorker: downloads bird thumbnails from Wikipedia (max 2 concurrent), stores in MinIO, with Wikimedia rate limiting (500 req/hr unauthenticated, 10,000 req/hr with OAuth 2.0)
  • WebhookDispatcher: delivers webhook payloads with HMAC signatures
  • AuditService: comprehensive audit logging of all user, admin, and detection actions to audit_log table (real client IPs via X-Forwarded-For)

Session Grouping: Audio chunks are grouped into recording sessions based on a 5-minute gap boundary. If the time between two consecutive chunks from the same satellite exceeds 5 minutes, a new session begins. The Sessions API exposes these groupings for the Timeline page, including per-session chunk counts and detection summaries.
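
The gap rule can be expressed in a few lines. The following Python sketch assumes the input is the list of chunk timestamps for a single satellite; the function name is illustrative and the hub's own implementation may differ.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=5)  # boundary stated in this section


def group_sessions(timestamps):
    """Split one satellite's chunk timestamps into recording sessions.

    A new session starts whenever the gap to the previous chunk
    exceeds SESSION_GAP; a gap of exactly five minutes stays in the
    same session.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= SESSION_GAP:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


base = datetime(2024, 5, 1, 6, 0, 0)
chunks = [
    base,
    base + timedelta(seconds=3),
    base + timedelta(minutes=5, seconds=3),  # exactly 5 min after the previous chunk
    base + timedelta(minutes=11),            # more than 5 min later: new session
]
sessions = group_sessions(chunks)
```

Per-session chunk counts, as exposed by the Sessions API, are then just the length of each group.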

Inference Workers (packages/inference)

Python 3.11 processes running BirdNET TFLite.

  • Pull jobs from BullMQ/Redis queue
  • Download audio from MinIO, run inference via birdnetlib
  • Geo-aware species filtering (GPS + date)
  • Enhanced re-processing for uncertain detections (0.4–0.7 confidence):
    • Extended audio window (6s)
    • Frequency isolation (1–10kHz bandpass)
    • Cross-chunk correlation
  • Scale with docker compose up --scale worker=N

Web UI (packages/web)

React 19 SPA served by nginx, responsive for mobile.

Pages:

  • Dashboard: satellite fleet status, recent detections, active alerts
  • Detections: 2-column flex card layout with SpectrogramPlayer (click-to-seek, drag-to-select region, loop playback, signal boost 1x–20x, speed 0.25x–2x, volume, download), vote buttons (Yes/No/Unsure) under confidence bar
  • Satellites: sortable table layout (Name, Status, Profile, Version, Last seen) with search and status filter; satellite count (online/total) in header; row click opens a slide-in side panel with full details (General, Hardware, Schedule, Hub Configuration, Actions); version shown with amber warning when satellite version differs from hub version; tenant name shown in All Tenants view
  • Analytics: daily trends, hourly activity, top species, biodiversity indices, weather correlation, migration patterns
  • Compare: side-by-side location comparison with Jaccard similarity
  • Alerts: custom alert rules (detection/absence/trend triggers) and 6 notification channels
  • Workers: connected workers with IP-based dedup, queue stats (platform admin only)
  • Members: tenant member management, invite links
  • Timeline: chronological view of recording sessions with inline chunks and detections
  • Species: card catalog of all detected species with stats, sorting, search
  • Map: Leaflet.js interactive map with satellite markers and detection data
  • Account: profile editing, password change, language preferences, self-delete
  • Audit Log: paginated table of all admin and user actions (platform admin)
  • Settings: tenant thresholds, watchlists; platform toggles, Wikimedia API, image cache
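
For the Compare page, Jaccard similarity between two locations is the size of the shared species set divided by the size of the combined set. A minimal Python sketch, assuming each location is represented by the set of species detected there (returning 0.0 for two empty sets is a choice made for this example):

```python
def jaccard_similarity(species_a, species_b) -> float:
    """|A ∩ B| / |A ∪ B| over two collections of species names."""
    a, b = set(species_a), set(species_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty locations count as dissimilar
    return len(a & b) / len(union)


garden = {"Turdus merula", "Erithacus rubecula", "Parus major"}
forest = {"Turdus merula", "Parus major", "Strix aluco"}
similarity = jaccard_similarity(garden, forest)  # 2 shared / 4 total
```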

Features:

  • Species thumbnails from Wikipedia (downloaded in background, served from MinIO)
  • Multi-language bird names (38 languages from BirdNET label files)
  • Species name display: primary bold, secondary in parentheses, Latin italic, clickable thumbnail with lightbox
  • Real-time WebSocket notifications for rare species (translated names)
  • Responsive layout with hamburger menu on mobile, scrollable sidebar (hidden scrollbar)
  • Spectrogram visualization (Cooley-Tukey FFT, Viridis colormap, 0–12kHz)
  • Full i18n support (English + French, 500+ translation keys, locale-aware formatting)
  • Language switcher in sidebar

MQTT Channels

birdnet/{tenant_id}/{satellite_id}/{channel}

| Channel | Direction | QoS | Purpose |
|---------|-----------|-----|---------|
| audio | Satellite → Hub | 1 | Audio chunk (base64 WAV) |
| telemetry | Satellite → Hub | 1 | Battery, storage, CPU, GPS |
| heartbeat | Satellite → Hub | 1 | Keepalive with state + noise_floor_rms |
| config | Hub → Satellite | 2 | Recording profile + filter settings push |
| config-request | Satellite → Hub | 1 | Device-initiated config change request (respects lock) |
| update | Hub → Satellite | 1 | Remote update command (triggers update script on Pi) |
| ack | Hub → Satellite | 1 | Chunk acknowledgment |
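
A small helper for building and parsing these topics could look like the Python sketch below. The topic pattern and QoS values are taken from the table above; the function names are illustrative.

```python
# QoS per channel, as listed in the channel table.
CHANNEL_QOS = {
    "audio": 1,
    "telemetry": 1,
    "heartbeat": 1,
    "config": 2,
    "config-request": 1,
    "update": 1,
    "ack": 1,
}


def build_topic(tenant_id: str, satellite_id: str, channel: str) -> str:
    """Format birdnet/{tenant_id}/{satellite_id}/{channel}."""
    if channel not in CHANNEL_QOS:
        raise ValueError(f"unknown channel: {channel}")
    return f"birdnet/{tenant_id}/{satellite_id}/{channel}"


def parse_topic(topic: str):
    """Return (tenant_id, satellite_id, channel) or raise ValueError."""
    parts = topic.split("/")
    if len(parts) != 4 or parts[0] != "birdnet" or parts[3] not in CHANNEL_QOS:
        raise ValueError(f"not a BirdNET-NG topic: {topic}")
    return parts[1], parts[2], parts[3]


topic = build_topic("t-42", "sat-7", "heartbeat")
```

Because the tenant and satellite IDs are positional, per-satellite ACLs can be written as a simple prefix on the topic.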

Authentication

Two methods, resolved in order:

  1. Bearer token: Authorization: Bearer <token> (a JWT from mobile login, or an API key, SHA-256 hashed)
  2. JWT cookie: session httpOnly cookie set by the login endpoint

Roles (tenant-scoped): viewer < member < admin < owner

Platform admin: cross-tenant access, read-only protections, designated via DB flag or PLATFORM_ADMIN_EMAILS env var
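
The resolution order and the role hierarchy can be sketched as follows. This is an illustrative Python version rather than the hub's Fastify code; the cookie name `session`, the lower-cased header key and the function names are assumptions.

```python
ROLE_ORDER = ["viewer", "member", "admin", "owner"]  # lowest to highest


def has_role(actual: str, required: str) -> bool:
    """True when `actual` is at least as privileged as `required`."""
    return ROLE_ORDER.index(actual) >= ROLE_ORDER.index(required)


def pick_credential(headers: dict, cookies: dict):
    """Try the Bearer token first, then fall back to the JWT cookie.

    Returns (method, token), or None when the request is unauthenticated.
    Assumes header names have already been lower-cased.
    """
    auth = headers.get("authorization", "")
    if auth.startswith("Bearer "):
        return ("bearer", auth[len("Bearer "):])
    if "session" in cookies:
        return ("cookie", cookies["session"])
    return None
```

Checking the Bearer header first lets API clients and the mobile app authenticate even when a browser cookie is also present on the request.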

Multi-Factor Authentication (v0.27+)

Optional second factor for any user, mandatory for platform admins.

  • TOTP (RFC 6238, ±30 s tolerance): scan a QR with any authenticator app (Google Authenticator, 1Password, Authy…). Secret stored AES-256-GCM-encrypted at rest with MFA_ENCRYPTION_KEY (32-byte hex env var)
  • WebAuthn passkeys: Touch ID, Windows Hello, YubiKey, etc., registered against an authenticated session and used as an alternative second factor. Requires WEBAUTHN_RP_ID/WEBAUTHN_RP_NAME/WEBAUTHN_ORIGIN (auto-derived from BNG_APP_FQDN)
  • Backup codes: 10 single-use Crockford-alphabet codes shown once at TOTP enrollment, bcrypt-hashed at rest, popped on consume. Regeneratable from /account (re-enter password)
  • Login challenge: /login returns {mfaRequired, mfaToken} instead of the session cookie when MFA is enabled. The 5-min mfaToken JWT bridges the password and second-factor steps; up to 5 wrong codes per token (Redis-backed counter) before it's burned and the user has to re-enter the password
  • Forced-enrollment gate: platform admins land on a full-page enrollment screen before the AppShell renders if they haven't enrolled yet
  • Remember this device (30 days): opt-in cookie that skips the second factor on the same browser. Token is sha256-hashed at rest in mfa_remembered_devices; per-device list with revoke on /account. Logout does NOT clear the cookie (matches Google / GitHub / Microsoft)
  • Reset escape hatches: platform admin "Reset MFA" action on PlatformUsers; CLI emergency reset via pnpm --filter @birdnet-ng/hub exec mfa:reset <email>, for the case where the only platform admin loses both their device and backup codes
  • Audit: every MFA action logged (enabled, disabled, login_failed, passkey_added/removed, reset_by_admin, etc.)
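
To make the TOTP tolerance concrete, here is a standard-library Python sketch of RFC 6238 verification that accepts the previous, current and next 30-second step. The hub uses otplib for this; the function names below are illustrative and are not taken from the hub code.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226) with HMAC-SHA1 and dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, now: float,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from `window` steps either side of the current one."""
    counter = int(now) // step
    return any(
        hmac.compare_digest(hotp(secret, counter + drift), submitted)
        for drift in range(-window, window + 1)
    )


# Shared secret from the RFC 6238 test vectors (ASCII "12345678901234567890").
RFC_SECRET = b"12345678901234567890"
```

With `window=1`, a code stays valid for one extra step in each direction, which corresponds to the ±30 s tolerance and absorbs small clock drift between the phone and the server.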

Multi-Tenancy

  • Row-level isolation via tenant_id on all data tables
  • Users are global; membership via tenant_members join table
  • Each user can have different roles in different tenants
  • MQTT ACLs enforce per-satellite topic scoping
  • MinIO paths prefixed by tenant: {tenant_id}/{satellite_id}/{chunk_id}.wav
  • Species images are global (not tenant-scoped)

Database

PostgreSQL 17 with 71 migrations (001–071):

| Migration | Purpose |
|-----------|---------|
| 001 | Core tables: tenants, satellites, audio_chunks, detections, alerts, telemetry |
| 002 | User auth: global users, tenant_members, invites, platform/tenant settings |
| 003 | User management: blocked flag, updated_at |
| 004 | Webhooks table |
| 005 | Satellite device_id for dedup |
| 006 | Watchlist species in tenant settings |
| 007 | Language settings on tenant |
| 008 | Per-user language preferences |
| 009 | Species images cache table |
| 010 | Satellite heartbeat_state and uptime |
| 011 | Nullable telemetry fields (mobile compatibility) |
| 012 | Species image download queue columns |
| 013 | User management: last_login, lockout, audit_log, login_attempts |
| 014 | Audio filter settings |
| 015 | First of season flag on detections |
| 016 | Voting mode user preference (show_voting) |
| 017 | Time format user preference (time_format) |
| 018 | Satellite config overrides table + config_locked flag |
| 019 | Detection comments table |
| 020 | Alert rules + alert channels tables |
| 021 | Scheduled exports table |
| 022 | Start-of-week user preference |
| 023 | First-of-day flag on detections |
| 024 | Satellite update status tracking |
| 025 | Satellite filter stats (JSONB) |
| 026 | Field notes + field note photos |
| 027 | Watcher-processed flag on detections |
| 028 | Mobile releases (APK upload) |
| 029 | Detection stitching (stitch_group_id, stitch_chunk_ids, is_stitch_primary) |
| 030 | Audio retention (pinned, audio_chunks.size_bytes + audio_purged_at, platform defaults) |
| 031 | Detection frequency band (superseded by 033) |
| 032 | Detection call window (superseded by 033) |
| 033 | Detection call_segments JSONB: array of {start_ms, end_ms, freq_min_hz, freq_max_hz} per call |
| 034 | Detection stitched_call_segments JSONB: combined segments on stitched primaries |
| 035 | Detection trust_score + neighbor_count + compute_trust_score() SQL function |
| 036 | species_frequency_bands: empirical per-species freq range from trusted detections |
| 037 | audio_chunks.rms: per-chunk amplitude for silent-bucket detection on the timeline |
| 038 | alert_rules gains system + system_key + cooldown_scope + message_template; seeds system rules per tenant |
| 039 | alert_channels.payload_template; webhooks migrated into alert_channels with type='webhook' |
| 040 | alert_rules.fire_mode: cooldown / once_until_clear |
| 041 | alert_rules.fire_mode adds on_state_change; satellite_offline switched to this mode |
| 042 | birdnet_models registry: multi-model BirdNET store + per-detection model_id |
| 043 | birdnet_models bundled-model relabeling cleanup |
| 044 | audio storage hard cap (purges keep-best/pinned when over the hard ceiling) |
| 045 | inference_mode platform setting + shadow_detections table for A/B compare |
| 046 | species_image_files: per-species multi-image gallery storage |
| 047 | clean duplicate image extras |
| 048 | audio_chunks.embedding: per-chunk Perch v2 acoustic embedding (1536-d float16) |
| 049 | birdnet_models.kind ('classifier' / 'embedding') + per-kind default index |
| 050 | drop builtin model rows from registry (worker falls back to birdnetlib's bundled .tflite) |
| 051 | silent_chunk_retention_hours platform setting (default 6h): purges chunks with no detections |
| 052 | YAMNet replaces spectral peak / noise-floor heuristics; add yamnet_min_bird_prob (default 0.05) |
| 053 | Satellite outbox capacity caps: drop retention_hours; add outbox_soft_size_mb (5000) / outbox_hard_size_mb (8000) / outbox_max_age_hours (720) |
| 054 | Detection visibility by category: species_category() SQL function classifies into bird/amphibian/insect/anthropogenic/human_voice/other_animal; 5 show_* booleans on tenant_settings (default false). satellites.outbox_stats JSONB |
| 055 | Split visibility from satellite drop: 5 drop_at_satellite booleans on tenant_settings (default true) + satellite_config_overrides (nullable). show_* remains UI-only; drop_* governs the YAMNet pre-upload drop and is per-satellite-overridable |
| 056 | detection_visible_for_tenant() SQL function: consults each detection's own tenant_settings inline so the visibility filter works on platform-admin "all tenants" queries |
| 057 | apk_keep_last_n platform setting (default 5): rolling-release cap for mobile_releases; APK retention sweep drops oldest beyond N |
| 058 | species_category() patched to cover BirdNET v2.4's five mammal labels (Canis lupus, Canis latrans, Sciurus carolinensis, Tamias striatus, Odocoileus virginianus), classified as other_animal so the show_other_animals visibility toggle hides them |
| 059 | tenant_settings.range_filter_threshold REAL DEFAULT 0.01: per-tenant probability cutoff on the eBird meta-model that narrows the BirdNET label set by lat/lon/week; default chosen lower than birdnetlib's 0.03 to keep resident species (Tawny Owl, Stock Dove, Scops Owl) that eBird under-reports |
| 060 | Temporal aggregation (Merlin-style FP suppression): detections.is_promoted; tenant_settings.temporal_aggregation_{enabled,count,window_minutes} (true/2/30); confidence_high default 0.85 (single-shot bypass). Tentative detections hidden from UI/exports/share/alerts until promoted by a 2nd hit within window |
| 061 | tenant_settings.tentative_retention_hours DEFAULT 24: retention sweep drops unpromoted detections older than this so the table stays bounded |
| 062 | detections.is_out_of_range: inference worker flags hits where BirdNET ≥ confidence_high but the species was eBird-rejected; surfaced as "Out of range" badge so vagrants aren't silently dropped |
| 063 | species_images + species_image_files: original_storage_key + original_content_type + original_file_size, a full-resolution copy alongside the 800px display thumb so the lightbox can serve and zoom the original |
| 064 | MFA: users.{mfa_enabled, mfa_secret_encrypted (AES-256-GCM), mfa_backup_codes_hash[] (bcrypt), mfa_enrolled_at} + user_passkeys table for WebAuthn credentials |
| 065 | mfa_remembered_devices: long-lived "trust this device" tokens (sha256-hashed, with user_agent + ip_address for the listing UI), 30-day default expiry |
| 066 | species_images.extras_exhausted_at: marks species whose Wikipedia image list is exhausted so the backfill loop rotates instead of looping on a partial gallery; weekly re-check window |
| 067 | satellites.stream ('satellite' / 'mobile'): set by heartbeat from each device, used by the hub to compare against the matching latest target instead of always comparing to the hub version |
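
The temporal aggregation rule introduced by migration 060 can be read as a small decision function. The Python sketch below uses the defaults listed there (confidence_high 0.85, count 2, window 30 minutes); the function name and the assumption that `earlier_hits` holds timestamps of the same species for the same tenant are illustrative.

```python
from datetime import datetime, timedelta


def is_promoted(confidence, detected_at, earlier_hits,
                confidence_high=0.85, count=2, window_minutes=30):
    """Decide whether a detection becomes visible.

    A single high-confidence hit bypasses aggregation; otherwise the
    species needs `count` hits (including this one) inside the window.
    """
    if confidence >= confidence_high:
        return True
    window_start = detected_at - timedelta(minutes=window_minutes)
    recent = [t for t in earlier_hits if window_start <= t <= detected_at]
    return len(recent) + 1 >= count


now = datetime(2024, 5, 1, 6, 30)
```

Detections that never reach the threshold stay tentative and would eventually be removed by the retention sweep from migration 061.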

Species Images

Bird thumbnails are managed by a background download queue:

  1. When a species is first seen, it's queued in species_images table
  2. SpeciesImageWorker processes the queue (max 2 concurrent, configurable delay)
  3. Fetches the original image URL from Wikipedia (pageimages prop), then downloads both the 800px display thumb and the full-resolution original in a single pass, stored as species-images/{slug}.{jpg|png} and species-images/{slug}_orig.{jpg|png} respectively
  4. Up to N additional images per species (species_extra_images_count) are pulled from the page's images prop; each extra also stores both sizes ({slug}_{position}.jpg + {slug}_{position}_orig.jpg)
  5. Proxy endpoint serves from MinIO with immutable cache headers; ?original=1 returns the full-resolution copy
  6. Frontend shows placeholder (pulsing dots) until download completes; the lightbox always uses originals when present, with click-to-zoom + drag-pan
  7. Backfill rotates across species ordered by (extras_count ASC, fetched_at ASC); species whose image list is exhausted are marked extras_exhausted_at and skipped for 7 days
  8. Platform admin can clear cache, retry failed downloads, adjust download delay

Rate limiting:

  • Sliding 1-hour request window with auto-throttle (slow down at 80%, pause at 95%)
  • Respects Retry-After header on 429 responses
  • Unauthenticated: 500 req/hr (with proper User-Agent)
  • Authenticated (OAuth 2.0): 10,000 req/hr
  • Wikimedia OAuth token and contact email configurable in Platform Settings
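
A sliding-window limiter with the 80% and 95% thresholds might be structured like this Python sketch. The class name and the three state labels are hypothetical; the thresholds and the 1-hour window come from the list above.

```python
from collections import deque


class SlidingWindowLimiter:
    """Tracks request timestamps over a sliding window."""

    def __init__(self, limit: int, window_s: float = 3600.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()

    def record(self, now: float) -> None:
        """Note that a request was sent at time `now` (seconds)."""
        self._stamps.append(now)

    def state(self, now: float) -> str:
        """Return 'normal', 'slow' (>= 80% used) or 'pause' (>= 95% used)."""
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        used = len(self._stamps)
        # Integer comparison avoids floating-point edge cases at the thresholds.
        if used * 100 >= self.limit * 95:
            return "pause"
        if used * 100 >= self.limit * 80:
            return "slow"
        return "normal"
```

A worker would call `state()` before each request, lengthen its delay when the answer is `slow`, and stop sending when it is `pause`. A 429 response with Retry-After would still take precedence over the local estimate.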

Versioning

Three independent SemVer streams (since vhub-0.28.0):

| Stream | Includes | Tag prefix |
|--------|----------|------------|
| hub | hub, web, inference, shared, docs, root | vhub-X.Y.Z |
| satellite | packages/satellite only | vsat-X.Y.Z |
| mobile | packages/mobile only | vapk-X.Y.Z |

Hub-only bumps no longer flag every satellite as outdated. Bump per scope:

pnpm version:bump --scope hub patch
pnpm version:bump --scope satellite minor
pnpm version:bump --scope mobile patch

At runtime, the hub knows the latest target for each stream (via bundled package.json + MAX(mobile_releases.version) for mobile) and exposes them at GET /api/system/versions. Each satellite reports its stream field on heartbeat so the hub can compare against the right target. The Pi update script checks out the latest vsat-* tag; the mobile versionCode derives from the SemVer (no more +b<timestamp> suffix).
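
The per-stream comparison can be illustrated with a short Python sketch. The tag prefixes come from this section; the function names are hypothetical, and the parser only handles plain X.Y.Z versions without pre-release suffixes.

```python
TAG_PREFIX = {"hub": "vhub-", "satellite": "vsat-", "mobile": "vapk-"}


def parse_version(text: str):
    """Parse 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def parse_tag(tag: str):
    """Map a release tag to (stream, version tuple)."""
    for stream, prefix in TAG_PREFIX.items():
        if tag.startswith(prefix):
            return stream, parse_version(tag[len(prefix):])
    raise ValueError(f"unknown tag prefix: {tag}")


def is_outdated(stream: str, reported: str, latest: dict) -> bool:
    """Compare a device's reported version with the target for its own stream."""
    return parse_version(reported) < latest[stream]


# Example targets, keyed by stream; the version numbers are made up.
latest = dict(parse_tag(t) for t in ("vhub-0.28.0", "vsat-1.10.0", "vapk-2.0.1"))
```

Comparing tuples of integers rather than strings matters here: as strings, "1.2.9" would sort after "1.10.0".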

Technology Stack

| Component | Technology |
|-----------|------------|
| Satellite agent | TypeScript / Node.js |
| Local satellite DB | sql.js (SQLite) |
| Transport | MQTT (Mosquitto 2.x, dynamic security plugin) |
| Hub API | TypeScript / Fastify |
| Authentication | JWT cookies + Bearer API keys (bcrypt, SHA-256), MFA via TOTP (otplib) + WebAuthn (@simplewebauthn) |
| Job queue | BullMQ + Redis |
| Database | PostgreSQL 17 (71 migrations) |
| Object storage | MinIO (audio + species images, originals + 800px thumbs) |
| Inference | Python 3.11 / TFLite / birdnetlib |
| Web UI | TypeScript / React 19 / Mantine v9 / Vite / nginx |
| Mobile app | Capacitor (Android) / native AudioRecord plugin |
| Species translations | 38 languages from BirdNET label files |
| Documentation | Express + markdown-it |
| Reverse proxy | Traefik (HTTPS + WSS for MQTT) |
| Containers | Docker Compose (8 default, 9 with split mode) |
| Monorepo | pnpm workspaces |