24/7 Visibility with PagerDuty

What is the current state of your organization? Are any critical issues affecting your systems right now? If so, are the right people informed? Awareness of your organization’s issues and incidents is critical to serving your customers: you need constant visibility into your systems’ health to resolve incidents quickly when they arise. Recently, the SGG team established this type of visibility for one of their customers using PagerDuty.

When we started at our client, an exclusively e-commerce company, the only way to see backend issues was to search endlessly through logs. The Production Support team was putting out fires daily with zero insight. There was no way to predict when something would go wrong, and most issues were reported by the Sales Support team after customers called in to complain about the company’s site. By implementing Dynatrace, Splunk, and New Relic, we provided visibility the client had never had before. Alerting was set up through these tools, and the Production Support team was finally able to act on issues and incidents before customers complained. Unfortunately, these alerts went out only by email, so incidents that happened outside of business hours went unnoticed. Major jobs would fail in the evening without the right teams being informed, and the next day everyone’s time was spent picking up the pieces. This was a major pain point for our client, as unnoticed issues were hurting their ability to serve their customers.

We wanted to give our client a way to reduce incident resolution times and thus empower them to better serve their customers. To do this, we integrated PagerDuty with their systems. With PagerDuty, the appropriate teams receive high-priority alerts via phone call and text. On-call developers are alerted and able to tend to issues immediately. Furthermore, if the first line of support is unavailable, PagerDuty automatically contacts the secondary on-call developers and managers. PagerDuty also gives our client the ability to reassign incidents or add responders to an incident if needed. Since implementation, we have seen resolution times for failed jobs, website downtime, database issues, and similar incidents decrease drastically.
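
For readers curious how a monitoring job might raise such a high-priority alert, here is a minimal Python sketch, assuming PagerDuty's Events API v2. It is an illustration only, not the integration built for our client; the routing key, job name, and host name are placeholders.

```python
import json
import urllib.request

# Public endpoint of the PagerDuty Events API v2.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity values accepted by the Events API v2.
VALID_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source,
                        severity="critical", dedup_key=None):
    """Build the JSON body for a 'trigger' event.

    routing_key is the integration key of a PagerDuty service
    (a placeholder in this sketch).
    """
    if severity not in VALID_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,    # short description shown to the responder
            "source": source,      # affected host or system
            "severity": severity,
        },
    }
    if dedup_key is not None:
        # Events sharing a dedup_key are grouped into one incident.
        event["dedup_key"] = dedup_key
    return event


def send_event(event, timeout=10):
    """POST the event to PagerDuty and return the parsed JSON response."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A failed nightly job could then call something like send_event(build_trigger_event("YOUR_ROUTING_KEY", "Nightly order export failed", "batch-01", dedup_key="nightly-order-export")). Who gets called or texted, and when the alert escalates to secondary responders, is not part of this payload; that is configured in the escalation policy attached to the PagerDuty service.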

Faster incident resolution was not the only benefit our PagerDuty solution provided; visibility between teams also increased. Teams in large organizations often remain extremely focused on their own tasks, with little real visibility into the issues affecting other teams. Before PagerDuty, there was very little awareness of the issues other teams were facing. Upper management also had very limited visibility into incidents and issues, and would often hear of them through the grapevine. With PagerDuty in place, users now have full visibility into incidents and issues both within smaller teams and across the company. Upper management is informed of issues as they occur, can see who is working on each one, and receives updates right on their phones.

Finally, we have also seen a vast increase in communication regarding incidents. Because PagerDuty integrates easily with other popular communication applications that our client uses, such as Slack, teams can discuss incidents conveniently. Teams currently receive alerts from PagerDuty right within their Slack channels, where users can share updates, collaborate, and resolve issues clearly and efficiently.

Thanks to the implementation of PagerDuty, there is now full visibility and communication from top to bottom around critical issues and incidents. Our client can be confident that any issue affecting their systems will be communicated to the appropriate people and resolved in a timely manner. As a result, the PagerDuty solution has empowered our client to provide better service to their customers.