Fleet Management

This use case describes how to manage a large fleet of IoT devices across multiple tenants using Complete Device Management.

Stack context

The device management plane (ThingsBoard, hawkBit, WireGuard) lives in the Tenant-Stack. Platform-wide fleet dashboards live in the Provider-Stack (Grafana → Provider TimescaleDB).

Scenario

An industrial equipment manufacturer ships 500 Linux-based controllers to customers across three regions. Each customer is a separate tenant. The manufacturer needs to:

Provision all devices automatically when they first power on at the customer site.
Push firmware updates in a controlled, staged manner without disrupting production.
Monitor device health in real time.
Allow on-call engineers to remotely debug devices without VPN client software on their laptops.

Setup

1. Deploy a Tenant-Stack per Customer

Each customer operates an independent Tenant-Stack. Follow the Tenant Onboarding use case to provision the stack and connect it to the Provider-Stack via the JOIN workflow.

2. Assign Operators

Add operator users to the tenant Keycloak realm. They receive:

cdm-operator role → ThingsBoard Customer User, hawkBit read + trigger, Grafana Editor.

3. Pre-configure Device Images

Bake the following into the Yocto OS image before shipping:

/opt/cdm/enroll.sh        — enrollment script
/opt/cdm/ca-fingerprint   — Tenant step-ca Sub-CA fingerprint
/etc/cdm/device-config    — TENANT_API_URL, TB_MQTT_HOST, HAWKBIT_URL, TSDB_URL

The device ID is derived from the hardware serial number at first boot.
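
The first-boot step can be illustrated with a minimal sketch. It assumes /etc/cdm/device-config is a plain KEY=VALUE file and that the device ID is a normalized serial number with a `cdm-` prefix. Both the file format and the ID scheme are assumptions made for illustration, and the shipped enroll.sh may work differently.

```python
import re

# Keys listed for /etc/cdm/device-config in this guide.
REQUIRED_KEYS = ("TENANT_API_URL", "TB_MQTT_HOST", "HAWKBIT_URL", "TSDB_URL")


def parse_device_config(text: str) -> dict:
    """Parse KEY=VALUE lines (assumed format), skipping blanks and # comments."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        config[key.strip()] = value.strip().strip('"')
    missing = [k for k in REQUIRED_KEYS if k not in config]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return config


def derive_device_id(serial: str, prefix: str = "cdm") -> str:
    """Turn a hardware serial into a device ID (hypothetical naming scheme)."""
    cleaned = re.sub(r"[^a-z0-9]+", "-", serial.strip().lower()).strip("-")
    if not cleaned:
        raise ValueError("empty serial number")
    return f"{prefix}-{cleaned}"


# Hypothetical values, for illustration only.
sample = """
# baked into the image before shipping
TENANT_API_URL=https://api.tenant.example
TB_MQTT_HOST=mqtt.tenant.example
HAWKBIT_URL=https://hawkbit.tenant.example
TSDB_URL=https://tsdb.tenant.example
"""
config = parse_device_config(sample)
print(derive_device_id("SN 00A1/42"), config["TB_MQTT_HOST"])
```

Validating the required keys at parse time means a mis-built image fails at first boot with a clear message instead of failing later during enrollment.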

Day-to-Day Operations

Viewing the Fleet

Open ThingsBoard → Devices (filter by tenant or device profile cdm-x509).
The device list shows:
Online/offline status (last activity timestamp)
Current firmware version (sw_version attribute)
WireGuard IP
Active alarm count
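
The same fields can be summarized outside the UI, for example in a periodic report. The sketch below assumes the device records have already been exported into plain dictionaries. The key names (`last_activity_ms`, `sw_version`, `alarms`) and the 10-minute inactivity window are assumptions for illustration and are not taken from the ThingsBoard API.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def is_online(last_activity_ms: int, now: datetime,
              timeout: timedelta = timedelta(minutes=10)) -> bool:
    """Treat a device as online if it was active within the timeout window."""
    last_seen = datetime.fromtimestamp(last_activity_ms / 1000, tz=timezone.utc)
    return now - last_seen <= timeout


def summarize_fleet(devices: list, now: datetime) -> dict:
    """Count online/offline devices, firmware versions, and active alarms."""
    online = sum(1 for d in devices if is_online(d["last_activity_ms"], now))
    return {
        "online": online,
        "offline": len(devices) - online,
        "versions": dict(Counter(d["sw_version"] for d in devices)),
        "active_alarms": sum(d["alarms"] for d in devices),
    }


# Hypothetical exported records.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
now_ms = int(now.timestamp() * 1000)
fleet = [
    {"last_activity_ms": now_ms - 120_000, "sw_version": "1.4.0", "alarms": 0},
    {"last_activity_ms": now_ms - 3_600_000, "sw_version": "1.3.2", "alarms": 2},
]
print(summarize_fleet(fleet, now))
```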

Triggering a Fleet-Wide Firmware Update

Build and sign the RAUC bundle in CI/CD.
Upload to hawkBit (automate via REST API in your CI pipeline).
Create a rollout:
Group 1: 5% of devices (canary) — actionType: soft (device installs at next reboot)
Group 2: 25% — activated after Group 1 reaches 95% success
Group 3: 70% — activated after Group 2 reaches 95% success
Monitor in hawkBit Rollout view and Grafana OTA dashboard.
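
The staged plan above can be sketched as a small payload builder for a CI job. The field names (`targetPercentage`, `successCondition`, `successAction`) follow my reading of the hawkBit Management API rollout resource and should be checked against the hawkBit version you deploy. The sketch also assumes each group's percentage applies to the targets not yet assigned to an earlier group, which is why fleet-wide shares of 5/25/70 are converted. The rollout name, distribution set ID, and filter are hypothetical.

```python
import json


def group_sizes(total: int, shares: list) -> list:
    """Split a fleet by fleet-wide percentage shares; the last group takes the rest."""
    if sum(shares) != 100 or any(s <= 0 for s in shares):
        raise ValueError("shares must be positive and sum to 100")
    sizes = [total * s // 100 for s in shares[:-1]]
    sizes.append(total - sum(sizes))
    return sizes


def remaining_percentages(shares: list) -> list:
    """Convert fleet-wide shares into percentages of the still-unassigned targets."""
    if sum(shares) != 100 or any(s <= 0 for s in shares):
        raise ValueError("shares must be positive and sum to 100")
    result, remaining = [], 100.0
    for share in shares:
        result.append(round(100.0 * share / remaining, 2))
        remaining -= share
    return result


def build_rollout(name: str, distribution_set_id: int, target_filter: str,
                  shares: list, success_threshold: int = 95) -> dict:
    """Build a rollout request body (assumed shape, verify before use)."""
    groups = [
        {
            "name": f"group-{i}",
            "targetPercentage": pct,
            # Next group starts once this share of the group has succeeded.
            "successCondition": {"condition": "THRESHOLD",
                                 "expression": str(success_threshold)},
            "successAction": {"action": "NEXTGROUP", "expression": ""},
        }
        for i, pct in enumerate(remaining_percentages(shares), start=1)
    ]
    return {
        "name": name,
        "distributionSetId": distribution_set_id,
        "targetFilterQuery": target_filter,
        "type": "soft",
        "groups": groups,
    }


# Hypothetical rollout for the 500-controller scenario.
payload = build_rollout("fw-staged", 42, "name==*", [5, 25, 70])
print(group_sizes(500, [5, 25, 70]))
print(json.dumps(payload, indent=2))
```

For the 500 controllers in the scenario, the split works out to 25, 125, and 350 devices. The builder only produces the request body; sending it and starting the rollout are left to the CI pipeline.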

Handling a Failed Update

If a device reports ota_status: failure:

ThingsBoard raises an OTA Failure alarm.

Operator opens the Terminal Widget and inspects logs:

journalctl -u rauc-hawkbit-updater -n 50
rauc status

If the bundle was corrupt, re-upload a corrected version and re-trigger the deployment.
RAUC automatically reverts to the previous slot after failed boot attempts.

Scaling Considerations

Scale	Recommendation
< 100 devices	Single Docker Compose node is sufficient
100–1000 devices	Separate DB nodes (managed PostgreSQL, MySQL); keep app containers on Docker Compose
> 1000 devices	Move to Kubernetes with Helm charts; scale ThingsBoard and TimescaleDB horizontally
> 10,000 devices	Consider ThingsBoard PE (cluster mode), TimescaleDB distributed hypertables, and hawkBit cluster
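
The table can also be encoded as a lookup, for instance in a capacity-planning script. This is a minimal sketch: the tier labels are hypothetical, and the boundary handling (1000 falls into the second tier, 10,000 into the third) is one interpretation of the ranges above.

```python
def recommend_deployment(device_count: int) -> str:
    """Map a per-tenant device count to a sizing tier from the table."""
    if device_count < 0:
        raise ValueError("device_count must not be negative")
    if device_count < 100:
        return "single-node-compose"        # single Docker Compose node
    if device_count <= 1000:
        return "compose-with-separate-db"   # separate DB nodes, apps on Compose
    if device_count <= 10_000:
        return "kubernetes-helm"            # Kubernetes with Helm charts
    return "clustered"                      # ThingsBoard PE cluster mode and hawkBit cluster


for count in (50, 500, 5000, 50_000):
    print(count, recommend_deployment(count))
```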