AU915 LoRaWAN OTAA: JoinAccept “sent/scheduled” by gateway but 50% of devices never join (stuck in a JoinRequest loop)

Hello,

We’re troubleshooting a persistent downlink problem in a remote deployment. Uplinks (JoinRequest) are received reliably, but downlinks (JoinAccept) appear to be scheduled/sent by the gateway and network server, yet many end devices never receive them and remain stuck in a join loop.

Setup

  • Region: AU915, Class A, OTAA
  • Gateway: RAK7289v2 (ChirpStack Gateway OS 4.9.0)
  • Network Server: ChirpStack LNS running in Docker
  • End devices: Dragino PS-LB-NA

The deployment includes a “border gateway” plus two 1-hop relay gateways, but we also reproduced the issue in non-mesh mode.

Symptoms / impact

  • ~70 of 151 devices affected (the remainder work flawlessly).
  • Devices repeatedly transmit JoinRequest; gateway/LNS processes them, but devices never complete OTAA.
  • Rebooting gateway/services doesn’t resolve it.

One problematic device showed no incoming downlinks on serial while on-site; later it joined successfully when tested about 100 m away from the original location.

What we’ve checked / tried

Signal strength doesn’t explain it: failing devices include some with relatively strong RSSI/SNR.

Duplicate uplinks / dedup hypothesis tested: we suspected JoinRequests being heard by two gateways and JoinAccept being sent by the “wrong” one. We turned mesh/relay off overnight (border gateway only) and the issue persisted, so this seems unlikely.

TX path sanity check: the relay gateway receives an extremely strong signal from the border gateway (≈ -30 dBm at 4 km, which seems almost too strong), suggesting the border gateway can transmit, at least on that link.
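To put the "almost too strong" reading in context, here is a rough free-space path loss estimate. It is only a sketch: the 27 dBm TX power, the 5 dBi antenna gains, and the 915 MHz carrier are assumed placeholder values, not measurements from this deployment.

```python
import math

def fspl_db(distance_km: float, freq_mhz: float) -> float:
    """Free-space path loss in dB (distance in km, frequency in MHz)."""
    return 20 * math.log10(distance_km) + 20 * math.log10(freq_mhz) + 32.44

def expected_rssi_dbm(tx_power_dbm: float, tx_gain_dbi: float,
                      rx_gain_dbi: float, distance_km: float,
                      freq_mhz: float) -> float:
    """Idealised line-of-sight RSSI: no cable loss, fading, or obstructions."""
    return tx_power_dbm + tx_gain_dbi + rx_gain_dbi - fspl_db(distance_km, freq_mhz)

# Assumed values (NOT from the deployment): 27 dBm TX, 5 dBi antennas, 915 MHz.
loss = fspl_db(4.0, 915.0)
rssi = expected_rssi_dbm(27.0, 5.0, 5.0, 4.0, 915.0)
print(f"FSPL over 4 km: {loss:.1f} dB, ideal RSSI: {rssi:.1f} dBm")
```

Under those assumptions, even an ideal link lands tens of dB below -30 dBm, so the reported value may deserve a second look (RSSI reporting, units, or actual distance) before it is used as evidence that the TX path is healthy.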

Gateway logs show downlink scheduling/acks: gateway receives downlink commands and returns ACK items like “OK/IGNORED” (and sometimes “COLLISION_PACKET/OK”), and the concentrator logs show “Scheduled packet for TX”.
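A small helper for reading those ACK pairs, as a sketch only. It assumes that the items in a Class A downlink command are ordered RX1 then RX2, that the gateway emits the first item it can schedule, and that "IGNORED" marks an item skipped because an earlier one was already accepted. The function names are made up for illustration.

```python
from typing import Optional, Sequence

def scheduled_item(statuses: Sequence[str]) -> Optional[int]:
    """Return the index of the first item acknowledged as "OK", or None."""
    for index, status in enumerate(statuses):
        if status == "OK":
            return index
    return None

def describe_ack(statuses: Sequence[str]) -> str:
    """Summarise which receive window was queued for a Class A downlink."""
    index = scheduled_item(statuses)
    if index is None:
        return "nothing scheduled (" + "/".join(statuses) + ")"
    # Assumption: item 0 targets RX1 and item 1 targets RX2.
    window = "RX1" if index == 0 else "RX2"
    failed = [s for s in statuses[:index] if s != "OK"]
    note = f" after {'/'.join(failed)}" if failed else ""
    return f"queued for {window}{note}"

print(describe_ack(["OK", "IGNORED"]))           # RX1 accepted, RX2 skipped
print(describe_ack(["COLLISION_PACKET", "OK"]))  # RX1 rejected, fell back to RX2
```

If that reading is right, "COLLISION_PACKET/OK" means the JoinAccept went out in RX2 rather than RX1, which matters if the affected devices have a problem with one specific window.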

Open questions

Are there known RAK7289v2 / SX1302 failure modes where downlinks can be “scheduled” but not actually transmitted or are transmitted with poor signal?

Could this be timing/clock drift (progressively worsening over time), GPS/PPS issues, or an AU915 RX1/RX2 / channel-mask mismatch that only affects downlinks/joins on some devices?
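For anyone checking the RX1/RX2 angle, this is a minimal sketch of the default AU915 mapping from a JoinRequest uplink frequency to the expected JoinAccept downlink frequencies, following the LoRaWAN regional parameters as I understand them (RX1 channel = uplink channel modulo 8, RX2 fixed at 923.3 MHz, join windows opening 5 s and 6 s after the uplink). The function names are mine, and a device or LNS with non-default settings would differ.

```python
JOIN_ACCEPT_DELAY1_S = 5  # default RX1 delay for a JoinAccept
JOIN_ACCEPT_DELAY2_S = 6  # default RX2 delay for a JoinAccept
RX2_FREQUENCY_HZ = 923_300_000  # default AU915 RX2 frequency (500 kHz channel)

def au915_uplink_channel(freq_hz: int) -> int:
    """Map an AU915 uplink frequency to its channel index (0-71)."""
    offset_125 = freq_hz - 915_200_000  # channels 0-63, 200 kHz spacing
    if 0 <= offset_125 <= 63 * 200_000 and offset_125 % 200_000 == 0:
        return offset_125 // 200_000
    offset_500 = freq_hz - 915_900_000  # channels 64-71, 1.6 MHz spacing
    if 0 <= offset_500 <= 7 * 1_600_000 and offset_500 % 1_600_000 == 0:
        return 64 + offset_500 // 1_600_000
    raise ValueError(f"{freq_hz} Hz is not an AU915 uplink channel")

def rx1_frequency_hz(uplink_freq_hz: int) -> int:
    """RX1 downlink frequency: channel (uplink channel mod 8), 600 kHz spacing."""
    downlink_channel = au915_uplink_channel(uplink_freq_hz) % 8
    return 923_300_000 + downlink_channel * 600_000

uplink = 917_000_000  # example: channel 9, in sub-band 2
print(au915_uplink_channel(uplink), rx1_frequency_hz(uplink), RX2_FREQUENCY_HZ)
```

Comparing these values with the frequency and timing in the gateway's "Scheduled packet for TX" log lines is one way to spot a channel-plan mismatch between the LNS, the gateway, and the devices.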

We can share gateway logs and JoinRequest/JoinAccept examples if helpful.

Thank you so much for your help :-)

1 Answer

For the failing OTAA activations, do you see a JoinAccept in the LoRaWAN frames tab (ChirpStack web interface, device view)? (I believe you do, based on the info above.)

If that is the case, then this confirms that the downlink was accepted by the gateway (ChirpStack only shows the JoinAccept / downlinks after receiving a TX ACK). Note that the TX ACK is sent before the actual transmission; it is more a confirmation that the frame could be put in the queue.

That would mean the issue is at the gateway, or between the gateway and the device. E.g. I have seen the same issue in the past where the device > gateway path was very good in terms of link margin, but the gateway > device path was very bad (the issue in that case was the device antenna).

Things you could look at:

  • You should see a log message when the join-accept downlink has been sent
  • Using a spectrum analyzer could also help to debug if the gateway is actually transmitting the downlink (Air Spy SDR + https://www.gqrx.dk/ is what I often use).