At its core, DevOps is a fundamentally data-driven practice. The ability to continuously improve the code that drives a product comes from understanding how it performs, what risks it introduces, and where the opportunities for improvement lie. Monitoring tools tap into each layer of a product’s technology stack to deliver the data needed to catch code errors early, improve operational efficiency, and respond rapidly to changes in usage.
And while DevOps monitoring is important for deployed code, the best monitoring practices seek to build on the security concept of “shifting left.” In other words, monitoring in DevOps begins much earlier in the software development lifecycle (SDLC). That helps unearth inefficiencies, risks, and potential customer-facing issues before they make it into production.
In this guide, we’ll explore the role of monitoring in a DevOps practice and answer key questions about DevOps monitoring tools including:
- What are DevOps monitoring tools?
- What is continuous monitoring in DevOps?
- Where should you implement monitoring tools in your DevOps practice?
- What are the capabilities you should look for in DevOps monitoring tools?
What are DevOps monitoring tools?
Monitoring is a core part of a successful DevOps practice. It is a critical way to detect and understand potential issues before they reach production, and to surface any issues that do show up in production.
To accomplish this, organizations will often leverage a number of DevOps monitoring tools such as crash-reporting tools, application performance monitoring (APM) platforms, and server monitoring tools to collect data across all stages of the software development lifecycle (SDLC) and surface actionable insights to improve operational performance.
DevOps monitoring tools enable organizations to build automated monitoring stages into key points within the SDLC to improve the performance of a code base, application, and its underlying infrastructure. And since there can potentially be thousands of moving parts within an organization’s SDLC, automation is critical to enable a consistent monitoring practice.
When properly implemented, DevOps monitoring tools help aggregate data, return actionable insights, and feed them back into the broader DevOps pipeline to surface any potential issues in the SDLC. This practice is frequently called continuous monitoring and draws on the DevOps idea of continuous improvement.
What is continuous monitoring in DevOps?
Not so long ago, monitoring was costly. Tools would take up precious system resources and require manual intervention. Moreover, the data these tools provided would often take time to parse through and act upon. As a result, organizations typically only monitored mission-critical processes such as coding issues and production-level performance.
Today, collecting data is much easier thanks to more advanced tooling, but the amount of data has also vastly increased. That means organizations now need to determine how best to manage, interpret, and act upon much larger volumes of data.
Continuous monitoring is a practice that seeks to solve this issue by building monitoring into every part of the SDLC. Its primary goal is to enable the rapid detection of any potential issues and provide real-time feedback.
A continuous monitoring practice uses a set of tools and an automated series of tests to evaluate new code and the production performance of an application and its underlying infrastructure. The aim is to provide an automated, 360-degree view of all systems and ensure the right people know when and where to intervene.
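As a rough illustration of routing issues to the right people, the sketch below checks a batch of metric readings against thresholds and tags each breach with an owning team. The metric names, thresholds, and team names are all hypothetical, and a real setup would pull readings from your monitoring tools rather than a hard-coded dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical alert rule: fire when a metric exceeds its threshold."""
    metric: str
    threshold: float
    owner: str  # team to notify (illustrative names only)


# Illustrative rules; real thresholds depend on your own systems.
RULES = [
    Rule("cpu_percent", 90.0, "infrastructure"),
    Rule("p95_latency_ms", 500.0, "application"),
    Rule("failed_builds_per_day", 5.0, "developer-experience"),
]


def evaluate(readings: dict[str, float], rules: list[Rule] = RULES) -> list[str]:
    """Return one alert message per rule whose metric exceeds its threshold."""
    alerts = []
    for rule in rules:
        value = readings.get(rule.metric)
        # Skip metrics that were not reported in this batch of readings.
        if value is not None and value > rule.threshold:
            alerts.append(
                f"[{rule.owner}] {rule.metric}={value} exceeds {rule.threshold}"
            )
    return alerts


if __name__ == "__main__":
    sample = {"cpu_percent": 95.0, "p95_latency_ms": 320.0}
    for alert in evaluate(sample):
        print(alert)
```

Keeping the rules as plain data, separate from the evaluation logic, is one way to let each team adjust its own thresholds without touching the code that runs the checks.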
The best continuous monitoring practices often prioritize collecting as much data as possible to audit systems in their entirety and analyze potential operational issues as well as compliance and security risks.
Where to implement monitoring tools in a DevOps practice
Just like the move to DevOps itself, establishing a successful DevOps monitoring strategy requires a mix of culture, process, and tooling. And while you can take inspiration from how other organizations manage monitoring, the precise model you adopt will be driven by the unique needs of your organization and your SDLC.
There are plenty of frameworks that offer guidance as to _what_ data to capture. But knowing where to implement monitoring is a question of optimization. What questions do you need to answer? What data do you need to get those answers? How will you act on that data? Who should be involved?
There are seven broad types of monitoring, each fitting into a different part of your DevOps practice. These include:
- Infrastructure monitoring: At the lowest level of your product’s technology stack, infrastructure monitoring helps you understand how constraints such as memory and CPU are impacting your application’s performance.
- Application performance monitoring (APM): Moving a level higher, APM surfaces signals about your application’s performance and provides insights into how to optimize your application for improved uptime and responsiveness.
- Development velocity monitoring: This monitoring practice measures your organizational velocity, or how quickly you are shipping new code to users and how fast you’re delivering value through your DevOps pipeline.
- Network monitoring: This helps your organization understand network performance and can help identify inefficiencies and, in the case of unusual traffic patterns, security breaches.
- User behavior monitoring: Unusual individual usage patterns often tell a story. Higher levels of failed password attempts, for example, may indicate a brute-force attack is taking place. And a new user accessing administrator pages might offer evidence of a privilege escalation.
- Security monitoring: Alongside a DevSecOps approach, security monitoring automates the discovery of vulnerabilities in code and dependencies.
- Configuration monitoring: In a DevOps practice, changes to infrastructure are a common part of delivering new and updated code. Monitoring configuration changes provides an overview of such modifications and early warning of unforeseen impacts.
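To make the user behavior example concrete, here is a minimal sketch of flagging a burst of failed password attempts with a sliding time window. The class name, the default limit of five failures, and the 60-second window are all assumptions chosen for illustration. A production system would typically rely on its identity provider or security tooling for this.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flag users with too many failed logins inside a sliding time window.

    Hypothetical helper for illustration; assumes timestamps (in seconds)
    arrive in non-decreasing order for each user.
    """

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # Per-user timestamps of recent failed attempts, oldest first.
        self._failures = defaultdict(deque)

    def record_failure(self, user: str, timestamp: float) -> bool:
        """Record a failed attempt; return True if the user should be flagged."""
        attempts = self._failures[user]
        attempts.append(timestamp)
        # Drop attempts that have fallen out of the window.
        while attempts and timestamp - attempts[0] > self.window_seconds:
            attempts.popleft()
        return len(attempts) > self.max_failures


if __name__ == "__main__":
    monitor = FailedLoginMonitor(max_failures=3, window_seconds=60)
    # Four failures within 30 seconds: only the fourth exceeds the limit.
    print([monitor.record_failure("alice", t) for t in (0, 10, 20, 30)])
```

The same pattern of counting events per key within a window can be adapted to other signals in the list, such as unusual traffic from a single network address.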
Capabilities you should look for in DevOps monitoring tools
There is a rich choice of tooling to help you build monitoring into your DevOps practice. The precise products you choose will depend on the shape of your SDLC and your application’s infrastructure. But there are two core initial questions you should ask when evaluating monitoring tooling:
Is it actionable? Does the tool integrate back into your DevOps pipeline and with your other tooling to enable you to automate actions and alerts based on its data?
Does it tell you something new? Generating more data is easy, but more data demands attention, fills up storage, and needs to be maintained. Choose tools that open up new avenues of monitoring, rather than those that offer marginal gains.
Expanding on those questions, your evaluation should consider how a tool performs in the following areas.
Does it offer a unified dashboard? Your product is the result of many services, libraries, and third-party products working together. A good monitoring dashboard will give you a bird’s-eye view of how the parts work together and make it easier to see alerts and areas of concern.
Does it cleanly integrate with your broader technology stack? Does the tool have dedicated integrations with the tooling you already use? Can you automatically deploy more containers when response times suffer? Will it stream log entries to your centralized log management tool? Does it have a REST API or support open standards, such as SNMP, that allow you to roll your own integrations?
Does it integrate alerts and notifications with your existing tools? Your monitoring tools should enable people to take timely action when manual intervention is needed. Does it support alerting directly or does it integrate with your existing notification tools?
Do its reporting capabilities integrate with your analytics tooling? Monitoring dashboards are excellent as a dedicated space, but many organizations have established reporting and analytics tools. Does the tooling you’re evaluating integrate with your organization’s chosen analytics platform?
What types of audit logs does a solution provide? Understanding how your system got to its current state is important, especially when something goes wrong. Audit logs provide an action-by-action record of what happened and which process or person was responsible. This helps with root cause analysis and provides a basis for learning where to make system improvements. What types of audit logs do your chosen solutions provide and how do they surface important information?
What are its data retention and storage needs? Monitoring tools generate large amounts of data. That makes it important to understand the ongoing storage needs, or the cloud costs, of keeping enough history to be useful without storing data beyond its useful life.
What types of diagnostics does a solution offer? Does the tool alert you to symptoms or does it help diagnose the underlying problem? More comprehensive tools, such as application performance management platforms, will help you understand what’s happening in complex situations such as multiple asynchronous microservices working together.
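When weighing the data retention question above, a back-of-the-envelope estimate can help. The sketch below multiplies event rate, event size, and retention period to approximate stored volume. The function name and the example figures (1,000 events per second, 500 bytes per event, 30 days) are assumptions for illustration only, and real tools add indexing overhead and apply compression in ways that vary by product.

```python
SECONDS_PER_DAY = 86_400


def estimate_storage_gb(
    events_per_second: float,
    bytes_per_event: float,
    retention_days: int,
    compression_ratio: float = 1.0,
) -> float:
    """Approximate stored volume in gigabytes (1 GB = 10**9 bytes).

    compression_ratio is raw size divided by stored size, so 4.0 means the
    tool stores data at a quarter of its raw size (an assumed figure).
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_bytes = events_per_second * bytes_per_event * retention_days * SECONDS_PER_DAY
    return raw_bytes / compression_ratio / 1e9


if __name__ == "__main__":
    # Hypothetical workload: 1,000 events/s at 500 bytes each, kept for 30 days.
    print(estimate_storage_gb(1_000, 500, 30))
    # The same workload with an assumed 4:1 compression ratio.
    print(estimate_storage_gb(1_000, 500, 30, compression_ratio=4.0))
```

Running the same estimate for several retention periods makes it easier to compare the cost of keeping more history against how often older data is actually used.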
Build your DevOps practice on GitHub
GitHub is an integrated platform that takes companies from idea to planning to production, combining a focused developer experience with powerful, fully managed development, automation, and test infrastructure.
“GitHub helps the company’s long-standing efforts to accelerate development by breaking down communication barriers, shortening feedback loops, and automating tasks wherever possible.”
Mike Artis, Director of System Engineering at ViacomCBS
Go from planning to building: Build roadmap plans right next to your codebase and quickly assign tasks to team members with powerful project boards and tables that are fully integrated into your project. Learn about GitHub Issues.
Increase developer velocity: Reduce the time to commit. Eliminate environment management and context switching for your developers. Simplify IT procurement and maintenance with a secure, managed space in the cloud. Explore Codespaces.
Automate everything: Automate all your software development workflows with GitHub Actions. Scale reliably and securely with powerful development, test, and automation infrastructure, fully managed by GitHub. Learn more about GitHub Actions.
Secure your code as you write it: Secure your code, dependencies, tokens, and sensitive data through the entire software development lifecycle and automatically resolve vulnerabilities. See how we help you stay secure.