[IBM-Aspera] - [AoC API] - Service Disruption
Incident Report for IBM-Aspera Service Status
Postmortem

This incident was caused by a recent upgrade to an application responsible for load-balancing requests to the AoC API. Upgrading this application is normally considered non-disruptive, since it has historically handled rolling restarts seamlessly. Our standard operating procedures require that all changes go through QA and are verified to be non-disruptive before going to production, which was the case here. However, one configuration setting in QA did not match production: the setting that enables or disables gzip compression. It was disabled in QA but enabled in production. As a result, an issue with gzip compression in the new version of the application went undetected. Performing a standard rollback of the application version change remediated the issue.

This scenario highlighted the need for additional scrutiny of configuration differences between our Prod and QA environments, many of which are legitimate. Fortunately, we are big proponents of GitOps here at Aspera, so looking back and doing a thorough check of our key configuration differences is a straightforward task.
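
As an illustration only, and not a description of Aspera's actual tooling, a check of this kind could be sketched as follows. It assumes both environments' settings have already been loaded from version control into nested dictionaries; the function names, the setting keys (such as `http.gzip`), and the allowlist of legitimate differences are all hypothetical.

```python
from typing import Any, Dict, Set, Tuple

# Placeholder used when a setting exists in only one environment.
MISSING = "<missing>"


def flatten(config: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested config dict into dotted paths, e.g. {"http.gzip": True}."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unexpected_differences(
    qa: Dict[str, Any],
    prod: Dict[str, Any],
    allowed: Set[str],
) -> Dict[str, Tuple[Any, Any]]:
    """Return settings whose QA and Prod values differ and are not allowlisted.

    `allowed` holds dotted paths that are expected to differ between
    environments (hypothetical examples: replica counts, hostnames).
    """
    qa_flat = flatten(qa)
    prod_flat = flatten(prod)
    diffs: Dict[str, Tuple[Any, Any]] = {}
    for path in sorted(set(qa_flat) | set(prod_flat)):
        if path in allowed:
            continue
        qa_value = qa_flat.get(path, MISSING)
        prod_value = prod_flat.get(path, MISSING)
        if qa_value != prod_value:
            diffs[path] = (qa_value, prod_value)
    return diffs


if __name__ == "__main__":
    # Hypothetical settings, loosely mirroring the mismatch described above.
    qa_config = {"replicas": 2, "http": {"gzip": False, "port": 8080}}
    prod_config = {"replicas": 6, "http": {"gzip": True, "port": 8080}}
    for path, (qa_value, prod_value) in unexpected_differences(
        qa_config, prod_config, allowed={"replicas"}
    ).items():
        print(f"{path}: QA={qa_value!r} Prod={prod_value!r}")
```

Keeping the legitimate differences in an explicit allowlist means any new, unreviewed mismatch is flagged before a change is promoted, rather than being discovered in production.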

We apologize for the inconvenience caused.

Posted Nov 12, 2021 - 00:45 UTC

Resolved
This incident has been resolved.
Posted Nov 12, 2021 - 00:24 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 11, 2021 - 23:16 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Nov 11, 2021 - 22:51 UTC
Investigating
Calls to /token are returning 400s. We are investigating.
Posted Nov 11, 2021 - 22:42 UTC
This incident affected: IBM-Aspera API Services (api.ibmaspera.com).