Many first heard of generative AI and large language models through late 2022 headlines on ChatGPT and its incredible ability to create original content and poetry. But as the technology has evolved, so too have its use cases.
Today, users across industries and departments can be found leveraging AI. As IT teams, in particular, dig deeper into GenAI capabilities, they’re learning to use the technology for a range of highly practical business applications. With the advent of GenAI, IT operations management (ITOM) has transformed, unburdening IT teams by autonomously monitoring, diagnosing, and remediating incidents across even the most complex enterprise IT estates.
The Evolution of AI in Operations Management
AI in support of operations management is nothing new. In fact, by the time the term AIOps entered the IT lexicon in 2016, traditional AI/ML capabilities like predictions and anomaly detection were already starting to take off in IT operations for alert management, performance forecasting, and related tasks.
In the years since, AI deployments for ITOM have grown more scalable and proactive in performing these essential functions. However, AIOps systems built on traditional AI/ML never fully delivered on their promise: an expert was constantly needed to decipher machine-generated outcomes, which were constrained by computational limits and a lack of broader context. More importantly, these systems lacked the business intelligence and analytical independence to decode the meaning of alerts, prioritize their criticality, and take corrective action.
Traditional AI/ML can only take ITOM so far before it needs to hand things off to a highly trained human analyst to dig into root cause analysis, scrutinize logs, determine corrective action, and guide colleagues in implementing a fix. Thankfully, these limitations are falling by the wayside in the face of new GenAI deployments that can increasingly handle some of the more advanced ITOM functions that previously only humans could perform.
GenAI Transforms IT Operations Management
Technically, the data generated by IT systems lends itself very well to the descriptive and semantic correlations that Large Language Models (LLMs) learn from their training corpus. LLMs are the subset of GenAI that will typically render the most significant and immediate value in ITOM use cases. Well-understood categories of data exhaust (metrics, events, traces, logs, etc.), coupled with documented best practices, industry standards like ITIL, and corporate nuances documented in knowledge bases, JIRA, and other similar repositories, make our industry a rich mine to reason over. No wonder GenAI is rapidly evolving ITOM beyond the era of human teams reviewing dashboards and performing repetitive tasks such as incident response, software deployment, and configuration changes. In the emerging state, AI-powered advisors will intuitively monitor and analyze systems, then serve up timely, persona-based insights on what each IT team member needs to know or act upon, synthesized by these massive models in a matter of seconds. This will help elevate traditionally reactive IT systems into proactive and self-optimizing ecosystems.
GenAI can prioritize and curate information by analyzing telemetry and accumulated knowledge to direct operators toward the most critical business issues relevant to their roles. GenAI can also equip IT teams with personalized guidance and actionable recommendations. Whether recommending adjustments in resource allocation during peak periods or suggesting upgrades based on performance trends, GenAI enables users at all skill levels to quickly tackle complex IT issues, allowing organizations to make informed decisions aligned with their strategic objectives.
ITOM systems can use GenAI to proactively deal with performance issues and pinpoint root causes while adapting suggestions to evolving patterns and specific business needs. This ensures the guidance remains relevant and timely, helping the organization avoid potential challenges. The net result is a shift from simply managing problems to anticipating and preventing them before they arise.
Blending Traditional AI and GenAI
While GenAI brings capabilities beyond what traditional AI can provide, this is thankfully not an either/or choice in which an enterprise must replace traditional AI with a more modern ITOM schema that relies entirely on GenAI. The best approaches overlay GenAI and LLMs on conventional AI systems for combined capabilities that are greater than the sum of their parts. Looking at broader use cases such as incident management, network operations, and security posture from this perspective will help IT practitioners identify the repetitive, pattern-bound, and time-consuming tasks that can be automated, moving toward auto-healing and auto-optimizing IT systems.
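To make the overlay idea concrete, here is a minimal sketch in Python rather than a reference implementation. It assumes a simple z-score check stands in for the traditional AI/ML layer, and that its findings are packaged into a text prompt for an LLM. The function names, the `cpu_percent` metric, the threshold, and the runbook note are all hypothetical, and the actual model call is left out because it depends on the deployment.

```python
from statistics import mean, stdev


def detect_anomalies(samples, threshold=2.5):
    """Traditional layer: return (index, value) pairs whose z-score exceeds threshold."""
    if len(samples) < 2:
        return []
    mu, sigma = mean(samples), stdev(samples)
    if sigma == 0:
        return []  # a flat series has no outliers
    return [(i, x) for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]


def build_incident_prompt(metric_name, anomalies, runbook_note):
    """GenAI layer input: turn the statistical findings into a prompt for an LLM."""
    lines = [f"Metric: {metric_name}", "Anomalous samples (index, value):"]
    lines += [f"  {i}, {value}" for i, value in anomalies]
    lines.append(f"Relevant runbook note: {runbook_note}")
    lines.append("Explain the likely cause and suggest a next step for the on-call engineer.")
    return "\n".join(lines)


# Hypothetical telemetry: steady CPU usage followed by one spike.
cpu_percent = [50, 51, 49, 50, 52, 48, 50, 51, 49, 50, 95]
anomalies = detect_anomalies(cpu_percent)
prompt = build_incident_prompt(
    "cpu_percent", anomalies, "Scale out the web tier when CPU stays above 90%."
)
print(prompt)
```

The division of labor is the point of the sketch: the statistical step stays cheap and deterministic, while the language model is only asked to interpret what that step has already flagged.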
Success has much to do with the implementation and configuration choices that transformation teams make in blending these technologies together as they build their AI systems in support of ITOM. They include the following considerations:
- One of the most fundamental priorities is to ensure consistent standards and advanced authentication protocols at the data layer. As organizations generate massive amounts of data daily, 90% of which is unstructured, these standards must hold as the data is converted to structured data. AI is only as good as the quality and accessibility of the data that feeds it, so teams should ensure that both traditional ML algorithms and more advanced LLM processes can access, share, and collaborate around data freely.
- Since data sharing must also happen securely, another priority is to consider private AI deployments that run both traditional and generative AI processes in a secure environment, or enclave, within the IT estate. Private AI lets organizations train models securely using their data, ensuring the resulting models remain internal. This is especially crucial for LLMs, which typically pull from the broader internet and could inadvertently compromise data privacy through shared datasets or proprietary methods exposed via collaborative training and algorithms.
- Additional trust-building measures should be taken to calibrate GenAI processes for maximum relevance and accuracy. These include retrieval-augmented generation (RAG) and prompt engineering, which enhance accuracy and business context around AI queries and processes to ensure that outputs are relevant and reliable for the use case. Finally, while GenAI can take many items off the ITOM analyst’s to-do list, there will always be a need to blend machine and human intelligence, so transformation teams should consider including proper visualization and decision support tools.
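As a rough illustration of the retrieval-augmented generation and prompt engineering mentioned in the last item, the sketch below ranks knowledge-base snippets against an operator's question and places the best matches into the prompt. It is a simplified stand-in: real deployments typically use embedding models and a vector store, whereas this example assumes plain bag-of-words cosine similarity from the standard library. The `KNOWLEDGE_BASE` entries, function names, and prompt wording are hypothetical, and no model is actually called.

```python
import math
import re
from collections import Counter

# Hypothetical internal knowledge-base snippets.
KNOWLEDGE_BASE = [
    "Runbook: restart the payment gateway service when the connection pool is exhausted.",
    "Policy: database schema changes require change advisory board approval.",
    "Runbook: rotate TLS certificates on the load balancer every 90 days.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below. If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


question = "payment gateway connection pool exhausted"
print(build_prompt(question, KNOWLEDGE_BASE, k=1))
```

Instructing the model to rely only on the retrieved context, and to say when that context is insufficient, is one common prompt-engineering choice for keeping answers tied to the organization's own documentation.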
Conclusion
GenAI can significantly transform IT Operations Management by proactively providing context-rich insights, accurate predictions, and actionable recommendations for managing the IT landscape. These capabilities empower users at all levels to align with organizational best practices and tackle IT challenges efficiently. As a result, businesses can prevent issues before they arise, optimize resource utilization, and foster greater innovation.
The post GenAI is Quickly Reinventing IT Operations, Leaving Many Behind appeared first on The New Stack.