SKILLS & TRAINING
Your engineers are probably capable of setting up and running an out-of-the-box monitoring solution without much trouble. But that responsibility typically falls on one person or a small team, so even though your servers and applications are in capable hands, continually customizing and tweaking the software to satisfy your needs is another matter. That specialization comes at a price.
YOUR WEAKEST LINK
Since only a select number of people in your organization will be well versed in the solution, they’ll likely get pulled away from their other projects any time something comes up that needs their attention. If they leave for another job, get sick, or take a holiday, who will you have in place to step in and take over the monitoring system? Depending on a few key engineers for monitoring restricts your ability to change or grow.
TOOL SPRAWL – TOOL EXPERTISE & TRAINING TIME
Over time, several IT teams within an organization might come to operate their own monitoring tools for their specific technology domains. There is no consistency across technical silos, so when incidents occur, there is often confusion about where to look and who is responsible. You might need to check one tool for the network, one for servers, one for storage and yet another for your applications.
Tool sprawl is not just a financial cost. It also takes up precious time, because you have to compare information from different tools to find out what is going on during an incident.
Not only are you dealing with troubleshooting delays because you must check multiple tools, you also add the cost of training new staff on each tool. We have seen organizations run out of experts for the different tools they have bought.
THINGS LEFT UNMONITORED
Since the scope of monitoring has grown ‘organically’ over time in many organizations, it is common that not all technical domains are covered by the monitoring.
If your monitoring does not cover all your infrastructure and applications at the level you need to prevent incidents, you could have a source of hidden costs or financial risk on your hands.
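One way to make such coverage gaps visible is to compare an asset inventory (for example an export from a CMDB) with the list of hosts your monitoring tools actually watch. The sketch below is only an illustration under that assumption. The function name `find_unmonitored` and the host names are hypothetical, and a real comparison would need to pull both lists from your own systems.

```python
def find_unmonitored(inventory, monitored):
    """Return the inventory assets that no monitoring tool covers, sorted by name."""

    def normalise(name):
        # Treat 'DB01 ' and 'db01' as the same asset.
        return name.strip().lower()

    covered = {normalise(name) for name in monitored}
    known_assets = {normalise(name) for name in inventory}
    # Anything in the inventory but not in the monitored set is a blind spot.
    return sorted(known_assets - covered)


# Hypothetical example data: three known assets, two of them monitored.
inventory = ["web01", "db01", "san01"]
monitored = ["web01", "DB01"]
print(find_unmonitored(inventory, monitored))
```

A simple set difference like this does not say whether an asset is monitored at the level you need, only whether it is monitored at all.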
MONITORING SYSTEMS DOWN
If your monitoring system goes down or loses its ability to send alerts, will you know? If the system running your monitoring software fails, do you have a backup so you can restore it? Are you monitoring your monitoring? What if you have a major outage that includes your monitoring platform? How long will it take to recover the monitoring?
Have you thought of the consequences and are you prepared to take the risk?
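A common way to "monitor the monitoring" is a heartbeat check: the monitoring system reports in at regular intervals, and an independent watchdog raises the alarm when the reports stop. The following is a minimal sketch of that idea, not a description of any particular product. The class name `HeartbeatWatchdog` and the 60-second limit are assumptions for illustration, and in practice the watchdog should run on infrastructure that does not fail together with the monitoring platform.

```python
import time


class HeartbeatWatchdog:
    """Tracks when the monitoring system last reported in."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        self.max_silence_seconds = max_silence_seconds
        self._clock = clock              # injectable so the logic can be tested
        self._last_beat = clock()

    def beat(self):
        # Called each time the monitoring system sends its heartbeat.
        self._last_beat = self._clock()

    def is_monitoring_silent(self):
        # True when no heartbeat has arrived within the allowed window.
        return self._clock() - self._last_beat > self.max_silence_seconds


# Hypothetical usage: expect a heartbeat at least once per minute.
watchdog = HeartbeatWatchdog(max_silence_seconds=60)
watchdog.beat()
print(watchdog.is_monitoring_silent())
```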
OWNERSHIP & PROCESS INTEGRATIONS
Monitoring is about technology, people and process. How is your monitoring organized, and who is responsible? Who owns the monitoring of the ICT environment?
What is the maturity level of processes like incident, problem, change, configuration, and capacity management? Is there an integration with the Service Desk and/or CMDB? Are reports derived from monitoring used as input for capacity management?
The best tools used by the best experts lose their value when there is no dedicated ownership for monitoring anchored in the organization and when there is insufficient integration in your operational processes.
PARAMETER OVERFLOW
When you set up monitoring tools, you need to configure, customize and tune them. Tool vendors focus on delivering a maximum number of features. Whenever you need to monitor a device or piece of software, you are confronted with two questions: what do you need to monitor for that specific technology, and how? Often there are many parameters, and it is not clear which ones are of value or what their ideal values should be. Add to that the different monitoring protocols and implementation options, and this becomes a difficult exercise. It is also a continuous one, because it must be repeated with each technology upgrade. Monitoring needs monitoring experts on a continuous basis because of changes in areas such as technology, applications, service levels, regulations, security requirements and audit requirements.
Who defines the what and how for monitoring devices or applications, and who decides what is relevant and what is merely nice to have? Is it mainly up to the individual engineer? What are the best practices, based on relevant experience? How can you be sure your monitoring is state-of-the-art?
FALSE POSITIVES
Organizations struggle with the vast number of alerts they receive from monitoring tools. Many of those alerts are false positives. The time IT teams spend identifying which alerts need attention and which are false positives adds up to a considerable loss of productivity.
Therefore, it is important to configure and customize any tool implementation in such a way that it contains all essential intelligence to reduce the number of false positive alerts and notifications.
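One simple example of such built-in intelligence is to alert only after several consecutive failed checks, so that a single transient glitch does not page anyone. The sketch below illustrates that one technique and is not a complete answer to false positives. The class name `ConsecutiveFailureAlert` and the threshold of three are assumptions chosen for the example.

```python
class ConsecutiveFailureAlert:
    """Fires an alert only after a run of consecutive failed checks."""

    def __init__(self, required_failures=3):
        if required_failures < 1:
            raise ValueError("required_failures must be at least 1")
        self.required_failures = required_failures
        self._streak = 0

    def record(self, check_ok):
        """Record one check result; return True when an alert should fire."""
        if check_ok:
            # A successful check resets the failure streak.
            self._streak = 0
            return False
        self._streak += 1
        # Fire exactly once, when the streak first reaches the threshold.
        return self._streak == self.required_failures


# Hypothetical check results: one isolated failure, then a sustained outage.
alert = ConsecutiveFailureAlert(required_failures=3)
for ok in [False, True, False, False, False, False]:
    if alert.record(ok):
        print("alert: check has failed 3 times in a row")
```

The trade-off is detection delay: a higher threshold suppresses more noise but also postpones the alert for a real outage by the same number of check intervals.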
AGENT DEPLOYMENTS
If your monitoring solution requires you to install agents that collect performance metrics to feed the database, the man-hours for agent deployments can add up fast, especially if staff have to administer and upgrade the agents on an ongoing basis to keep them up to date. The administration and maintenance of your agents can amount to a sizeable hidden cost.
LIMIT YOUR TCO FOR MONITORING
Monitoring is essential for effective IT operations, but it can also be costly. It can be especially costly if you fail to plan adequately for all the easy-to-overlook costs that lurk behind your basic monitoring cost calculations.
One-off implementation projects for monitoring tools do not work. Many organizations buy tools and leave the implementation and initial setup to a third party. But a monitoring platform that is not continually maintained, tuned and adjusted to the actual IT environment will become utterly useless in about a year’s time.
Monitoring has become an expert domain because it requires knowledge of monitoring tools, knowledge of technology-vendor parameters and it requires a methodology. Effective monitoring requires domain experts on a continuous basis.
To limit your TCO, you have two options: either invest in the skills and manpower to keep monitoring under control, taking into account the pitfalls mentioned in this paper, or use specialized services.