With the ever-increasing complexity and dynamism of IT environments, providing an end-to-end view of the infrastructure is becoming both more difficult and more critical. To simplify the challenge that businesses face in operating and managing these environments, a number of standards, frameworks, processes and categories have been defined over the past few years.

IT Operations Management (ITOM) software is intended to represent all the tools needed to manage the provisioning, capacity, performance and availability of the computing, networking and application environment. IT Service Management (ITSM) focuses on how an organization manages IT services for customers. ITSM refers to all the activities involved in this process, which include planning, designing, delivering, operating, and controlling. Just to complete the picture, the IT Infrastructure Library (ITIL) is a framework or a set of ITSM best practices.

The latest category, defined by Gartner, is AIOps. AIOps platforms combine big data and machine learning functionality to support all primary IT operations functions through the scalable ingestion and analysis of the ever-increasing volume, variety and velocity of data generated by IT. The platform enables the concurrent use of multiple data sources, data collection methods, and analytical and presentation technologies.

Although some companies may market themselves as covering all of the functions of ITOM, ITSM and AIOps, the list of requirements is long, and in reality no single platform can deliver all of them to the level that businesses need. Instead, we now see solutions covering multiple, but not all, functions across these standards and frameworks.

With that in mind, there are some criteria that should be considered when selecting any solution in the IT service management, operations or AIOps spaces:

1. Evaluate Integrations.

Although it is gradually lessening, one of the largest challenges still seen in large organizations is the ‘silo’. This is where systems, skills, data and/or infrastructure are ‘owned’ by a specific team or department that works in isolation from other parts of the organization. Any solution being considered must be technology, location, vendor, data and domain agnostic. The ability of an IT solution to integrate quickly and seamlessly is critical to its success.

2. Data Normalization.

Ideally, before you make any decisions on IT solutions, you will have analyzed and understood what data you have, where it is stored, who ‘owns’ it and what value it has to the business. Having this understanding before implementing solutions that will use the data is a massive advantage; trying to figure out what everything is as it starts being collected is not a good approach.

Assuming that you have this understanding, once data sources have been integrated, you will need to be able to normalize the data so that it can be analyzed in a common and consistent way. Attempting to perform analysis across different data types and formats stored in different databases is complex to design and implement, and rarely delivers the expected results. Unifying and normalizing data so that it can be analyzed centrally, without the need for multiple complex mappings and transformations, is another major consideration.

3. Leverage Technology.

The introduction of cheap compute resources, the ubiquity of data and the adoption of new technologies such as Artificial Intelligence (AI) all mean that software-based root cause analysis (RCA) techniques can and should be implemented as a priority. AI, and especially Machine Learning, can process huge amounts of data in order to detect anomalies in real time, predict potential issues and uncover trends.
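
To make the two ideas above a little more concrete, here is a minimal Python sketch that first normalizes records from two sources into one common shape and then flags anomalies in the unified data. Everything in it is an assumption made for illustration: the two sources, their field names (`hostname`, `ts`, `node`, `latency_s` and so on) and the sample values are hypothetical, and a simple standard-deviation check stands in for the far more capable machine learning a real platform would use.

```python
from datetime import datetime, timezone
from statistics import mean, stdev


def from_source_a(raw: dict) -> dict:
    """Hypothetical source A: epoch seconds, latency already in milliseconds."""
    return {
        "host": raw["hostname"].lower(),
        "timestamp": datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        "latency_ms": float(raw["latency_ms"]),
    }


def from_source_b(raw: dict) -> dict:
    """Hypothetical source B: ISO 8601 timestamps, latency in seconds."""
    return {
        "host": raw["node"].lower(),
        "timestamp": datetime.fromisoformat(raw["time"]),
        "latency_ms": float(raw["latency_s"]) * 1000.0,
    }


def find_anomalies(records, threshold=2.0):
    """Return records whose latency lies more than `threshold` sample
    standard deviations away from the mean of all records."""
    values = [r["latency_ms"] for r in records]
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [r for r in records if abs(r["latency_ms"] - mu) / sigma > threshold]


# Made-up sample data in each source's native format.
raw_a = [
    {"hostname": "WEB-01", "ts": 1700000000, "latency_ms": 20},
    {"hostname": "WEB-01", "ts": 1700000060, "latency_ms": 22},
    {"hostname": "WEB-02", "ts": 1700000000, "latency_ms": 21},
    {"hostname": "WEB-02", "ts": 1700000060, "latency_ms": 19},
]
raw_b = [
    {"node": "db-01", "time": "2023-11-14T22:13:20+00:00", "latency_s": 0.021},
    {"node": "db-01", "time": "2023-11-14T22:14:20+00:00", "latency_s": 0.023},
    {"node": "db-02", "time": "2023-11-14T22:13:20+00:00", "latency_s": 0.020},
    {"node": "db-02", "time": "2023-11-14T22:14:20+00:00", "latency_s": 0.25},
]

# One mapping per source, applied once at ingestion; analysis sees one format.
records = [from_source_a(r) for r in raw_a] + [from_source_b(r) for r in raw_b]
anomalies = find_anomalies(records)
for record in anomalies:
    print(record["host"], record["timestamp"].isoformat(), record["latency_ms"])
```

The point of the design is that each source needs only one mapping, and the analysis code never has to know where a record came from.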

4. Real-Time. Pro-Active.

Organizations can no longer afford to be reactive and wait for a customer to phone their service desk to report an issue. It is easier than ever for customers to churn and move service providers and they will punish companies that do not provide the quality of service that has been agreed.

Artificial Intelligence technologies have made it possible to analyze huge amounts of data in order to detect anomalies in real time and to perform predictive analysis to prevent service outages. Combined with the ability to automate actions, these technologies allow the organization to move from reactive to pro-active.
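
As a rough illustration of what moving from reactive to pro-active can look like, the sketch below fits a straight-line trend to a few disk-usage samples, estimates how long it will be until a limit is reached, and triggers an automated action if that point falls inside a chosen lead time. The metric, the 90% limit, the 24-hour lead time and the `action` callback are all assumptions for the example; a production system would use a more robust forecasting model than a linear fit.

```python
from typing import Callable, List, Optional, Tuple

Sample = Tuple[float, float]  # (hours since start, percent of disk used)


def fit_trend(samples: List[Sample]) -> Tuple[float, float]:
    """Ordinary least-squares line through the samples: (slope, intercept)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until(limit: float, samples: List[Sample]) -> Optional[float]:
    """Hours from the latest sample until the trend reaches `limit`,
    or None if usage is flat or falling."""
    slope, intercept = fit_trend(samples)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - samples[-1][0])


def check_and_act(
    samples: List[Sample],
    limit: float = 90.0,
    lead_time_hours: float = 24.0,
    action: Callable[[str], None] = print,
) -> bool:
    """Run `action` if the limit is predicted to be hit within the lead time."""
    remaining = hours_until(limit, samples)
    if remaining is not None and remaining <= lead_time_hours:
        action(f"Predicted to reach {limit}% in about {remaining:.1f} hours")
        return True
    return False


# Made-up samples: usage growing by one percentage point per hour.
usage = [(0.0, 70.0), (1.0, 71.0), (2.0, 72.0), (3.0, 73.0)]
triggered = check_and_act(usage)
```

In practice the `action` callback is where automation plugs in, for example opening a ticket or expanding a volume before any customer notices a problem.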

5. Flexibility, Future Proofing and Openness.

The IT solutions that we are discussing are complex and can be expensive. Ensuring that any solution selected is not a ‘closed black box’ is essential. Whether this is the ability to create and modify your own machine learning policies, write your own automation logic, build your own dashboards or write reports, the ability to customize the implemented solution is key to ensuring adoption within the organization.

Although we can now use AI to predict short-term future events based on data, the ability to accurately predict what technologies will be with us in the next few years does not yet exist. Carefully reviewing the design and architecture of your IT solution to ensure that it is ‘future-proof’ is recommended. For example, is the product built on proprietary technology that may be difficult to maintain and support in the future? If the solution uses third-party products or services, can these be easily swapped out for others? Is the solution ‘open’, with well-defined APIs and integrations? Will the solution be able to perform and scale in line with your growth plans?
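
One way to picture the ‘easily swapped out’ question is a small interface that the rest of the solution depends on, with each third-party service hidden behind its own adapter. The Python sketch below is only an illustration of that design idea; the `Notifier` interface, the two adapters and the `AlertingPipeline` class are hypothetical names and do not describe any particular product.

```python
from typing import Protocol


class Notifier(Protocol):
    """The only contract the pipeline depends on."""

    def send(self, message: str) -> str: ...


class EmailNotifier:
    """Hypothetical adapter for an email service."""

    def send(self, message: str) -> str:
        return f"email: {message}"


class ChatNotifier:
    """Hypothetical adapter for a chat service."""

    def send(self, message: str) -> str:
        return f"chat: {message}"


class AlertingPipeline:
    """Core logic that knows nothing about which vendor sits behind it."""

    def __init__(self, notifier: Notifier) -> None:
        self.notifier = notifier

    def raise_alert(self, summary: str) -> str:
        return self.notifier.send(summary)


# Swapping the third-party service is a one-line change at construction time.
pipeline = AlertingPipeline(EmailNotifier())
first = pipeline.raise_alert("disk usage high on db-02")
pipeline = AlertingPipeline(ChatNotifier())
second = pipeline.raise_alert("disk usage high on db-02")
```

If a solution under evaluation is built this way, replacing one vendor's service means writing a new adapter and leaving the core logic untouched.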

Implementing ITOM, ITSM or AIOps solutions is now essential for any organization that wants to improve service and customer satisfaction levels, needs to become more pro-active or wants to improve its operational efficiency by removing silos and automating tasks. There are many options and solutions covering all of these areas and many more, but whichever you choose, I recommend taking your time and choosing carefully.