Users can't log in to Pipefy
Incident Report for Pipefy
Postmortem

Service Degradation Post-Mortem

Authors: SRE, Security and Support Teams

Date: 2021-04-20

Status: Resolved 

Summary: 

Due to the unavailability of the third-party authentication service used by Pipefy (Auth0), users were unable to log into Pipefy's platform. More information about the causes of the unavailability can be found directly on the vendor's status page.

Users who were already logged in when the incident began were not initially impacted, and temporary measures were implemented so that they would not need to re-authenticate while the service was unavailable.

Impact: 

Major - The authentication service was unavailable from Apr 20 16:08 (UTC) to Apr 21 00:13 (UTC).
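
For readers who want the length of the impact window, the short sketch below derives it from the two timestamps stated above. It only does the arithmetic on those timestamps and adds no information beyond them.

```python
from datetime import datetime, timezone

# Start and end of the impact window, as stated in the Impact section (UTC).
start = datetime(2021, 4, 20, 16, 8, tzinfo=timezone.utc)
end = datetime(2021, 4, 21, 0, 13, tzinfo=timezone.utc)

outage = end - start
hours, remainder = divmod(int(outage.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} h {minutes} min")  # prints "8 h 5 min"
```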

Root Cause: 

The identified root cause of this incident was a major outage in the Auth0 service, affecting mainly the US-1 region. 

Detection and resolution: 

The issue was detected by our internal monitoring system, which identified application failures. The unavailability was also reported by several users.

Because the root cause was a major outage in a third-party service, Pipefy's internal teams worked on minimizing the impact on users who were already logged in while waiting for the incident to be resolved on Auth0's side.

Action plan: Preventive action items

1. Implement improvements to the systems that monitor the authentication mechanisms. Due date: 2021-05-14
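
As an illustration of what monitoring an authentication dependency can look like, here is a minimal sketch of a synthetic health probe. It is not a description of Pipefy's actual monitoring: the URL, the latency threshold, and the function names `classify` and `probe` are all hypothetical, and the three-state result (up, degraded, down) is an assumption chosen to mirror the states mentioned in the updates below.

```python
import time
import urllib.error
import urllib.request

# Hypothetical values: neither the URL nor the threshold comes from the report.
HEALTH_URL = "https://auth.example.com/health"
SLOW_THRESHOLD_S = 2.0


def classify(status_code, latency_s, slow_threshold_s=SLOW_THRESHOLD_S):
    """Map one probe result to 'up', 'degraded', or 'down'."""
    if status_code is None or status_code >= 500:
        return "down"
    if status_code >= 400 or latency_s > slow_threshold_s:
        return "degraded"
    return "up"


def probe(url=HEALTH_URL, timeout_s=5.0):
    """Issue one GET request and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            status_code = response.status
    except urllib.error.HTTPError as err:
        status_code = err.code  # the server answered with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        status_code = None  # no answer at all (DNS failure, timeout, refused)
    return classify(status_code, time.monotonic() - start)
```

Keeping `classify` separate from the network call means the alerting rule can be unit-tested without reaching any real service. A scheduler would call `probe()` periodically and alert when the result stays at "down" or "degraded" for several consecutive checks.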

Posted Apr 29, 2021 - 13:27 UTC

Resolved
The login problem has been fixed and access to Pipefy has been restored.
As soon as the investigation is over, we’ll share further details about the causes, the fixes implemented, and the preventive actions planned.
Posted Apr 21, 2021 - 00:13 UTC
Update
We are continuing to closely watch performance in US-1.
https://status.auth0.com/
Posted Apr 20, 2021 - 23:14 UTC
Update
Based on the sustained performance for Auth0 US-1, Support Center, and the Auth0 Dashboard, we have now updated our status to monitoring. We are continuing to closely watch performance in US-1. We will now be moving to hourly updates.
https://status.auth0.com
Posted Apr 20, 2021 - 21:50 UTC
Update
We're still monitoring the performance of the third-party service, as users are now able to log in to the platform.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We continue to see performance improvements. We are working to fully restore all services to customers in our US-1 region."
Posted Apr 20, 2021 - 21:03 UTC
Update
We're still monitoring the performance of the third-party service, as users are now able to log in to the platform.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We continue to see additional customers that have moved to degraded performance. Our User Search v3 service is currently disabled, which can generate stale data when using `/api/v2/users` endpoints. Once the service is enabled again, all data will be brought up to current."
Posted Apr 20, 2021 - 20:44 UTC
Update
We're monitoring the performance of the third-party service, as users are now able to log in to the platform.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We continue to see customers that have moved to degraded performance. Systems are recovering and access to the Auth0 Dashboard has been restored. We are continuing to dedicate our full team's efforts on restoring services for all customers impacted by this incident."
Posted Apr 20, 2021 - 20:28 UTC
Monitoring
We're monitoring the performance of the third-party service, as users are now able to log in to the platform.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We are seeing customers in US-1 Production moving to degraded performance. We are continuing with all efforts to fully restore services for all customers."
Posted Apr 20, 2021 - 20:11 UTC
Update
As the performance of the service is still degraded, some users may still face issues logging in to the platform. Users that were already logged in are able to continue accessing Pipefy.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We are continuing to work on restoring services for our outage. We can communicate that users who are logged in are not impacted by this incident. We are executing all remediation steps for our incident protocol. Our entire technical and engineering teams are taking this as an all hands on deck situation to find resolution."
Posted Apr 20, 2021 - 19:51 UTC
Update
We are continuing to monitor the third-party service and working on solutions to mitigate the problem.
From Auth0's Status Page (https://status.auth0.com/incidents/zvjzyc7912g5), "We are continuing to work on restoring services as quickly as possible. As soon as we have an ETA for the restoration of services, we will update our status."
Posted Apr 20, 2021 - 19:31 UTC
Update
As noted in previous updates, the difficulties in logging in to our platform are caused by instability in the third-party mechanism we use to authenticate user logins. This mechanism, called Auth0, is used by thousands of online services to validate their users' login information and grant them access, keeping the applications safe and making sure that only users with the right credentials can log in.
Because this is a third-party vendor, we’re currently working to minimize the impact of this incident while waiting for further updates on their end.
The most recent update we’ve received from Auth0 (https://status.auth0.com/incidents/zvjzyc7912g5) was: “We are doing everything we can to resolve the situation as quickly as possible. We did attempt a database failover to resolve the situation, but it was not successful. We are continuing to dedicate our entire engineering team to a swift resolution. I understand the gravity of the impact it’s having on your team, customers, and business.”
Thousands of other online services across the globe that also use Auth0 in their applications are equally affected and waiting for further updates. We’re deeply sorry for the inconvenience and we’ll make sure to provide new information as soon as it becomes available.
Posted Apr 20, 2021 - 19:12 UTC
Update
We are continuing to monitor the third-party service and working on solutions to mitigate the problem.
Posted Apr 20, 2021 - 18:48 UTC
Update
We are still monitoring the situation so the service can be reestablished as soon as possible.
Posted Apr 20, 2021 - 17:59 UTC
Update
We are still monitoring the situation so the service can be reestablished as soon as possible.
Posted Apr 20, 2021 - 17:29 UTC
Update
We are still monitoring the situation so the service can be reestablished as soon as possible.
Posted Apr 20, 2021 - 17:01 UTC
Identified
We are still monitoring the situation so the service can be reestablished as soon as possible.
Posted Apr 20, 2021 - 16:33 UTC
Investigating
Our third-party login service is currently facing an incident, so our users can't log in. We are monitoring the situation so the service can be reestablished as soon as possible.
Posted Apr 20, 2021 - 16:08 UTC
This incident affected: Authentication.