Managing Service Levels

Service Level Management (SLM) helps organizations monitor application performance against a defined set of objectives agreed to by internal or external service providers. By proactively managing service level performance, organizations can:

  • Provide a collaborative and productive environment to discuss issues
  • Leverage independent metrics that are easy to distribute via reports
  • Focus on end user and business process perspectives rather than internal event systems

There are dozens of organizations, consultants, and web sites that discuss Service Level Management (SLM), and as usual, the terminology and your mileage will certainly vary. In the context of Optimal Metrics, SLM matters because it helps focus the organization and its partners on the customers. In my searching online, most of the content on Service Level Management discusses the ITIL framework. It is a great framework and management tool, but it really is not a methodology. While I do like strategic frameworks, the engineer in me loves to jump into the process, technology, and people to get things done.

The closest reference I have found is ‘Implementing Service Level Management’, and the article has two wonderful pictures that sum up most of my approach to understanding SLM.

The summary of this great article is that the way to implement SLM is to leverage Performance Management disciplines to manage and monitor business processes.

Another way to look at SLM is from the top down. Many in IT do things the other way, from the bottom up. As the diagram shows, focusing on technology components, such as routers, switches, and servers, makes for a complicated approach. How do you measure across all the devices? With a router you can measure throughput in megabytes per second. A server can send SNMP trap alerts when the processor is more than 80% busy. But what does any of this do to manage the application? Event and alert management is great for telling the IT Operations team that a component is broken, but it can rarely tell you directly that the business process is not working right.
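To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Transaction` record, the "checkout" business process, the 3-second target, and the sample values are all hypothetical, and only the 80% CPU threshold comes from the example above. It shows a component-level check staying quiet while a top-down measure of the business process reports a problem.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One end-user attempt at a business process (hypothetical record)."""
    name: str
    seconds: float      # response time seen by the end user
    succeeded: bool     # did the business process complete?


def compliance(transactions, target_seconds):
    """Fraction of transactions that succeeded within the target time.

    Returns None when there is nothing to measure.
    """
    if not transactions:
        return None
    good = sum(
        1 for t in transactions
        if t.succeeded and t.seconds <= target_seconds
    )
    return good / len(transactions)


def cpu_alert(cpu_percent, threshold=80.0):
    """Bottom-up check: alert only when the processor is too busy."""
    return cpu_percent > threshold


# Made-up sample data: the server looks healthy, but half of the
# checkout attempts are either too slow or failed outright.
samples = [
    Transaction("checkout", 1.2, True),
    Transaction("checkout", 2.8, True),
    Transaction("checkout", 6.5, True),    # completed, but too slow
    Transaction("checkout", 0.9, False),   # fast, but failed
]

checkout_compliance = compliance(samples, target_seconds=3.0)
server_alerting = cpu_alert(cpu_percent=45.0)

print(f"checkout compliance: {checkout_compliance:.0%}")
print(f"CPU alert raised: {server_alerting}")
```

In this made-up data the CPU check raises no alert, yet only half of the checkout attempts meet the objective. That gap is what a component-level view alone would miss.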

So, in summary, the reason behind SLM is to provide metrics and management insight into the technology. This is key to running IT inside a company, and even more so when outsourcing occurs. SLM disciplines are required more and more as IT is disintermediated outside of the IT department. Think about the components in your standard B2C page: analytics, ads, content, CDNs, data centers. A lot of content is outside your control, so how do you make sense of it? How do you ensure the provider is doing the right thing?