A letter was sent to our Digital Trunked Radio System (DTRS) Stakeholders from Office of Public Safety Communications Director Peter Bangas this week. Below is the text from the letter.
DTRS Stakeholder,
On August 6, 2025, the Office of Public Safety Communications (OPSC) Digital Trunked Radio (DTR) network experienced a severe and unexpected outage that impacted public safety operations across the state. We sincerely apologize for the disruption and for the risk this event posed to the essential work of our users. This letter explains the root cause of the outage and details our immediate corrective actions and the long-term measures we are taking to prevent a recurrence.
Over the past five years, the OPSC has undertaken a project to convert the DTR network to an Internet Protocol (IP) based routed network. This has the tremendous advantage of allowing the network to automatically detect and respond to failures. Should a link, or even an entire site, fail, the network routers work together to calculate alternate paths and re-route radio traffic.
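As an illustration only, the re-routing idea can be sketched in a few lines of Python. The site names (A through E), the link layout, and the use of a simple breadth-first search are assumptions made for this example; the letter does not say which routing protocol or path algorithm the DTR routers use.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a list of sites or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    previous = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        site = queue.popleft()
        if site == dst:
            path = []
            while site is not None:  # walk back to the source
                path.append(site)
                site = previous[site]
            return path[::-1]
        for nxt in sorted(neighbors.get(site, ())):
            if nxt not in previous:
                previous[nxt] = site
                queue.append(nxt)
    return None                      # no surviving path

# Hypothetical five-site network with two routes from A to D.
links = {("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")}
primary = shortest_path(links, "A", "D")
# Simulate the B-D microwave link failing and recalculate.
backup = shortest_path(links - {("B", "D")}, "A", "D")
print(primary, backup)
```

In this sketch the traffic from A to D moves from the two-hop route through B to the longer route through C and E once the B-D link is removed, which is the behavior the routed network is meant to provide automatically.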
As part of this project, the OPSC worked with its microwave radio network vendor, Aviat Networks, to upgrade network router software throughout the state from July 21 through August 5, 2025. The software upgrade was intended to improve the ability of the network to detect failures and re-route traffic.
Unfortunately, a previously undiscovered bug existed in part of the software that calculates alternate paths. This bug could cause routers on the network to crash and reboot while performing these calculations. The chance of a crash was small, but increased with the number and frequency of link and site failures on the network.
Starting around 4 a.m. on August 6, unfavorable atmospheric conditions, including steep vertical temperature gradients and thermal inversion layers, combined with suspected Wi-Fi 6E interference, began causing frequent and significant microwave link failures across the eastern plains. Over the course of approximately three hours, the network was able to automatically recover from dozens of these failures.
At 7:12 a.m., a particularly intense and widespread atmospheric event caused severe simultaneous outages on multiple critical microwave radio links. While attempting to calculate alternate paths in response to this extreme event, one router encountered the software bug and crashed. This crash appeared to the network as a site failure and prompted additional path calculation events, which led additional routers to encounter the software bug and crash. As more routers crashed, more calculation events were triggered, creating a feedback loop that rapidly worsened.
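The feedback loop can be pictured with a very rough expected-value model. Everything in the sketch below is an assumption chosen for illustration: the router count, the per-calculation crash probabilities, and the simplification that every failure event makes every router recalculate. None of these figures come from the OPSC or Aviat Networks.

```python
def expected_crashes_per_round(routers, crash_probability, initial_events, rounds):
    """Expected crashes in each round when every crash becomes a new failure event.

    Simplifying assumption: each failure event causes all `routers` to
    recalculate paths, and each recalculation crashes a router with
    `crash_probability`. Crashed routers reboot and rejoin, so the count
    is not capped at the number of routers.
    """
    events = float(initial_events)
    history = []
    for _ in range(rounds):
        crashes = events * routers * crash_probability
        history.append(crashes)
        events = crashes            # each crash looks like a site failure
    return history

# Illustrative numbers only: 200 routers, 5 simultaneous link failures.
damped = expected_crashes_per_round(200, 0.001, 5, 4)   # 0.2 new crashes per event
runaway = expected_crashes_per_round(200, 0.01, 5, 4)   # 2 new crashes per event
print(damped)
print(runaway)
```

Under these assumptions the loop dies out when each failure event produces fewer than one new crash on average and grows round over round when it produces more than one, which is consistent with a network that recovered from dozens of failures overnight and then failed quickly during a larger event.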
By 7:15 a.m., the network was unable to transport traffic, forcing DTR sites statewide into a “site trunking” condition. In this condition, sites are able to transport voice traffic between radios associated with the same individual site, but not to radios associated with any other site in the network. Many dispatch center consoles were disconnected, and backup consolettes were subject to the same site trunking limitations as radios.
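The practical effect of site trunking can be stated as a simple rule. The function and site names below are hypothetical and are used only to restate the paragraph above; they do not describe any DTR software interface.

```python
def can_communicate(site_a, site_b, network_transport_up):
    """Return True if radios affiliated with site_a and site_b can reach each other.

    With the routed network up, any two sites can exchange voice traffic.
    In site trunking, only radios on the same site can.
    """
    return network_transport_up or site_a == site_b

# Hypothetical site names for illustration.
normal = can_communicate("Site 1", "Site 2", network_transport_up=True)
same_site = can_communicate("Site 1", "Site 1", network_transport_up=False)
cross_site = can_communicate("Site 1", "Site 2", network_transport_up=False)
print(normal, same_site, cross_site)
```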
This bug was not caught by Aviat Networks during software testing because testing took place on a smaller network with fewer and less frequent simulated failures than the actual operational network.
High Level Explanation of Steps Taken for Recovery (Timeline)
OPSC engineers were notified of the issue at 7:16 a.m. and had identified the breadth and severity of the problem by 7:20 a.m. Because the software bug caused each router to reboot cyclically, engineers could not remotely access the failing routers for diagnostic or corrective actions.
By 7:30 a.m., the OPSC had assembled an engineering strike team, including OPSC and vendor engineers from around the world, to identify and correct the problem. Regional OPSC technicians were standing by statewide to deploy to individual radio sites if necessary.
To correct the issue, the network was first segmented into isolated regions by remotely shutting down several strategically selected microwave radio links throughout the state. This reduced the number of recalculations that any individual router needed to perform. Then, individual routers were further manually isolated from the network by remotely shutting down the microwave radio links connecting them to their neighbors. This prevented the isolated router from attempting to perform path calculations and allowed it to boot up normally without encountering the software bug and crashing.
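The segmentation step can be sketched as a graph operation: shutting down selected links splits the network into smaller connected regions, and shutting down all of a router's links leaves it in a region of its own. The six-site chain and the choice of links below are assumptions for illustration and do not reflect the actual DTR topology.

```python
def regions(sites, links):
    """Group sites into connected regions, given the links still in service."""
    neighbors = {site: set() for site in sites}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, result = set(), []
    for start in sorted(sites):
        if start in seen:
            continue
        stack, region = [start], set()
        while stack:                       # depth-first walk of one region
            site = stack.pop()
            if site in region:
                continue
            region.add(site)
            stack.extend(neighbors[site] - region)
        seen |= region
        result.append(region)
    return result

# Hypothetical chain of six sites.
sites = {"A", "B", "C", "D", "E", "F"}
links = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")}

whole = regions(sites, links)
# Step 1: shut down two strategically chosen links to segment the network.
segmented = regions(sites, links - {("B", "C"), ("D", "E")})
# Step 2: also shut down C's remaining link so router C boots with no neighbors.
isolated = regions(sites, links - {("B", "C"), ("D", "E"), ("C", "D")})
print(len(whole), len(segmented), len(isolated))
```

In this sketch a failure event in one region is no longer visible to routers in the other regions, and a fully isolated router such as C has no neighbors to trigger a path calculation while it boots.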
Once the isolated router was allowed to boot, it could be remotely accessed and the software feature containing the fatal bug disabled. This procedure was manually performed first on core routers and then on dozens of additional routers. As more routers were fixed, the network became progressively more stable. An automated tool was then remotely deployed to recursively fix the remaining hundreds of routers throughout the network.
By approximately 8:45 a.m., the network had largely stabilized and most sites had been restored to normal operation. The engineering strike team continued to monitor the network, identifying and correcting isolated issues until the network was fully restored at approximately 9:30 a.m.
Steps Taken to Prevent Future Similar Incidents
Within 24 hours of the outage, Aviat Networks software developers had identified the root cause of the bug and released an updated software version to their quality assurance team. The quality assurance team, in turn, has developed new and substantially more rigorous test procedures that subject the software to test conditions that better simulate the challenges faced by the operational network. The new software release is undergoing testing now and is expected to be released to the OPSC by September 2, 2025. At that time, OPSC and Aviat Networks engineers will apply the new software to the statewide network over a two- to three-week period.
OPSC and Aviat Networks engineers have also developed a plan to implement technology to improve the performance of the challenging radio links in the eastern plains. This technology allows the links to intelligently monitor their own performance and react to atmospheric conditions and interference by automatically increasing their signal strength. It will be implemented shortly after the completion of the statewide software upgrade.
Finally, OPSC engineers are actively working to locate and shut down harmful interferers. With the proliferation of unlicensed Wi-Fi 6E devices, radio frequencies previously reserved for public safety communications have been opened for public use. This has led to greater instability network-wide and reduced the resilience of the network against atmospheric, weather, and other causes of microwave radio failures.
We thank you for your patience during the outage and for your ongoing support as we strive to continuously improve the network. We are committed to ensuring the highest level of reliability and will continue to review and enhance our system to prevent future disruptions and maintain a resilient communications system for all public safety users.
If you have questions or would like additional information, please contact me.
Sincerely,
Peter Bangas, Director
Office of Public Safety Communications
Colorado Division of Homeland Security and Emergency Management