ChirpStack MQTT gateway reconnect issue after gateway reboot

Hi,

I'm deploying ChirpStack and its dependencies in Kubernetes:

  • ChirpStack v4.18.0
  • Mosquitto 2.1.0
  • Redis + PostgreSQL

ChirpStack MQTT Forwarder v4.5.1 runs on the gateway as a Docker container:

  • Gateway connects over TLS to Traefik TCP ingress on port 8883
  • Traefik forwards to Mosquitto on port 1883

Problem

Initial startup usually works fine.

After rebooting/power-cycling the gateway:

  • The gateway MQTT forwarder starts
  • TLS handshake succeeds
  • But MQTT connect hangs and eventually times out:
2026-05-26T11:20:36.044Z INFO  [chirpstack_mqtt_forwarder::backend::semtech_udp] PULL_DATA received, random_token: 38782, remote: 127.0.0.1:44644
2026-05-26T11:20:36.044Z INFO  [chirpstack_mqtt_forwarder::backend::semtech_udp] Sending PULL_ACK, random_token: 38782, remote: 127.0.0.1:44644
2026-05-26T11:20:40.381Z ERROR [chirpstack_mqtt_forwarder::mqtt] MQTT error, error: Timeout
2026-05-26T11:20:41.497Z DEBUG [rustls::client::hs] Resuming session
2026-05-26T11:20:41.552Z DEBUG [rustls::client::hs] Using ciphersuite TLS13_AES_128_GCM_SHA256
2026-05-26T11:20:41.552Z DEBUG [rustls::client::tls13] Resuming using PSK
2026-05-26T11:20:41.554Z DEBUG [rustls::client::tls13] TLS1.3 encrypted extensions: ServerExtensions { unknown_extensions: {}, .. }
2026-05-26T11:20:41.554Z DEBUG [rustls::client::hs] ALPN protocol is None
2026-05-26T11:20:46.244Z INFO  [chirpstack_mqtt_forwarder::backend::semtech_udp] PULL_DATA received, random_token: 56554, remote: 127.0.0.1:44644
2026-05-26T11:20:46.244Z INFO  [chirpstack_mqtt_forwarder::backend::semtech_udp] Sending PULL_ACK, random_token: 56554, remote: 127.0.0.1:44644
2026-05-26T11:20:46.384Z ERROR [chirpstack_mqtt_forwarder::mqtt] MQTT error, error: Timeout

A plain TCP check with nc against the MQTT endpoint succeeds:

nc -vz <host> 8883
Connection to <host> 8883 port [tcp/secure-mqtt] succeeded!

The strange part:

If I delete/restart the ChirpStack pod, the gateway instantly reconnects and starts working again.

Example:

kubectl delete pod chirpstack-xxxxx

Immediately afterwards the gateway reconnects successfully.

Important observation

If I scale the ChirpStack deployment to 0 replicas:

kubectl scale deploy/chirpstack --replicas=0

then I can reboot/reset the gateway repeatedly and MQTT connections always work fine.

Once ChirpStack is running again, the gateway reconnect issue eventually returns.

This makes me suspect something in the ChirpStack MQTT gateway backend handling.

Mosquitto logs

When things work:

New client connected ... as iotgw-000121 (p5, c1, k30).
iotgw-000121 0 eu868/gateway/ca01dcfffe172f12/command/+

New client connected ... as chirpstack-gw-backend (p5, c1, k30).
chirpstack-gw-backend 0 $share/chirpstack/eu868/gateway/+/event/+

Then later:

Client iotgw-000121 closed its connection.

After this, the gateway forwarder times out reconnecting until I restart the ChirpStack pod.

Gateway MQTT Forwarder config

[mqtt]
server="ssl://<host>:8883"
client_id="iotgw-000121"
clean_session=tried both false/true
qos=0
keep_alive_interval="30s"

[backend]
enabled="semtech_udp"

ChirpStack gateway MQTT backend config

[regions.gateway.backend.mqtt]
topic_prefix="eu868"
server="tcp://chirpstack-mosquitto:1883"
client_id="chirpstack-gw-backend"
clean_session=tried both false/true
event_topic="gateway/{{ gateway_id }}/event/{{ event }}"
command_topic="gateway/{{ gateway_id }}/command/{{ command }}"

Mosquitto config

listener 1883
allow_anonymous true

Traefik TCP ingress

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
spec:
  entryPoints:
    - mqttsecure
  routes:
    - match: HostSNI(`*`)
      services:
        - name: chirpstack-mosquitto
          port: 1883
  tls: {}

Question

Has anyone seen ChirpStack's MQTT gateway backend get into a state where gateway reconnects fail until the ChirpStack pod is restarted?

1 Answer

Could this be related to the Traefik proxy? I have not worked much with Traefik, but since you are using it as ingress, stopping the ChirpStack pod may cause Traefik to update its internal routing, which could (temporarily) fix the ingress path to the MQTT broker.

I would suggest trying to connect directly to the MQTT broker (from the gateway). If that solves the issue, then dive deeper into Traefik (e.g. it could be related to timeouts).
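
Note that nc only shows that the TCP port accepts connections; it does not show whether the broker answers an MQTT CONNECT. A minimal sketch of a probe that does check this, assuming Python 3 is available on the gateway or on a machine in the same network. It speaks MQTT 3.1.1 (your forwarder appears to use MQTT 5, so treat this as a rough check only), and the client ID "probe-client" is a made-up name chosen so it does not collide with iotgw-000121:

```python
import socket
import ssl
import struct


def encode_remaining_length(n: int) -> bytes:
    """Encode the MQTT 'remaining length' variable-length integer."""
    out = bytearray()
    while True:
        byte, n = n % 128, n // 128
        if n > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_connect(client_id: str, keep_alive: int = 30) -> bytes:
    """Build an MQTT 3.1.1 CONNECT packet with the clean-session flag set."""
    cid = client_id.encode("utf-8")
    variable_header = (
        struct.pack("!H", 4) + b"MQTT"    # protocol name
        + bytes([0x04])                   # protocol level 4 = MQTT 3.1.1
        + bytes([0x02])                   # connect flags: clean session only
        + struct.pack("!H", keep_alive)   # keep-alive in seconds
    )
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


def probe(host: str, port: int, use_tls: bool, timeout: float = 5.0) -> int:
    """Send CONNECT and return the CONNACK return code (0 = accepted).

    Raises a timeout error if the broker never answers, which is the
    symptom described in the question.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        if use_tls:
            # Verifies the server certificate against the system CA store;
            # a private CA would need ctx.load_verify_locations(...).
            ctx = ssl.create_default_context()
            sock = ctx.wrap_socket(sock, server_hostname=host)
        sock.sendall(build_connect("probe-client"))  # hypothetical client ID
        reply = b""
        while len(reply) < 4:  # a 3.1.1 CONNACK is exactly 4 bytes
            chunk = sock.recv(4 - len(reply))
            if not chunk:
                raise ConnectionError("connection closed before CONNACK")
            reply += chunk
        if reply[0] != 0x20:
            raise ValueError(f"unexpected packet type: {reply[0]:#04x}")
        return reply[3]
    finally:
        sock.close()
```

Running probe("<host>", 8883, use_tls=True) goes through Traefik. Running it with use_tls=False against port 1883, however you choose to expose the broker directly (for example a temporary kubectl port-forward), bypasses Traefik. If the first call times out while the second returns 0, that points at the proxy rather than at Mosquitto or ChirpStack.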