High Available Mosquitto MQTT on Kubernetes

https://news.ycombinator.com/rss Hits: 7
Summary

In this post, we'll walk through a fully declarative, Kubernetes-native setup for running a highly available MQTT broker using Eclipse Mosquitto. This configuration uses core Kubernetes primitives (Deployments, Services, ConfigMaps, and RBAC), alongside a Traefik IngressRouteTCP to expose MQTT traffic externally. It introduces a lightweight, self-healing failover mechanism that automatically reroutes traffic to a secondary broker if the primary becomes unhealthy. The setup also demonstrates internal MQTT bridging, which propagates messages between the brokers. The big advantage over a single-Pod deployment (which, in case of node failure, Kubernetes will only restart after about 5 minutes) is that this setup has a downtime of only about 5 seconds and shared state, so all messages remain available after a failover.

[Figure: Diagram of the setup]

This guide assumes you have a working Kubernetes setup using Traefik. The version of Kubernetes/k3s used for this article is v1.32.2+k3s1. If you don't have such a cluster yet, check out my other Kubernetes posts.

In a typical Kubernetes deployment with a single Mosquitto pod, resilience is limited. If the node running the pod fails, Kubernetes can take up to 5 minutes to detect the failure and recover. This delay stems largely from the default 300-second toleration that Kubernetes gives pods for the node.kubernetes.io/not-ready and node.kubernetes.io/unreachable taints, which comes on top of the node-monitor-grace-period the controller manager waits before marking the node as unhealthy.
During this window, MQTT clients lose connectivity, messages are dropped, and systems depending on real-time messaging may suffer degraded...
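
The failover idea in the summary (prefer the primary broker, switch to the secondary when the primary is unhealthy) can be sketched roughly as below. This is only an illustration under assumptions, not the post's actual mechanism: the Service names `mosquitto-primary` and `mosquitto-secondary`, the port, and the plain TCP connect used as a health probe are all hypothetical.

```python
import socket
from typing import Callable, Tuple

# Hypothetical in-cluster Service names and port; the summary does not name them.
PRIMARY: Tuple[str, int] = ("mosquitto-primary", 1883)
SECONDARY: Tuple[str, int] = ("mosquitto-secondary", 1883)


def tcp_healthy(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def pick_backend(probe: Callable[[str, int], bool] = tcp_healthy) -> Tuple[str, int]:
    """Prefer the primary broker; fall back to the secondary when the probe fails."""
    return PRIMARY if probe(*PRIMARY) else SECONDARY
```

Run every few seconds, a check like this would bound the failover time by the probe interval plus the probe timeout. How the chosen backend is then applied to routing is left out here, since the summary does not describe that step.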

First seen: 2025-05-18 13:51

Last seen: 2025-05-18 19:52