Analytics-Powered IT Operations for Taking on the Cloud

The ability to change and evolve is mandatory for survival in the world of IT operations. Cloud storage and related processes are among the areas that offer new opportunities for growth, if companies know how to use them to their advantage.

Today’s business environment demands that organizations change and adapt rapidly to market dynamics while remaining in control. Businesses have approached this challenge by using business intelligence (BI) analytics tools to sift through enormous collections of data and catch opportunities that would otherwise have been missed. With effective data collection tools, data mining applications, and intelligent analytics, a business can cull through mountains of data, unearth new trends, and use this otherwise hidden information to make proactive, knowledge-driven decisions.

Much like the business side, the IT landscape has grown in complexity. IT must support a wider and growing range of technology environments running on virtualized or cloud platforms, as well as accelerated application release schedules. As a result, IT now faces near-overwhelming quantities of information.

Applications, application infrastructure, hypervisors, and other parts of the virtual and cloud stacks all create troves of performance, event, and availability data. The challenge isn’t finding data; it’s finding a way to make it useful.

Ironically, IT operations, the infrastructure that supports the organization, adheres to a static, process-driven paradigm. While the market is excited about new technologies like provisioning and automated deployment, the reality is that IT operations still relies heavily on configuration management database (CMDB) systems and manual workflows as its “nerve center” for supporting day-to-day internal and external services.

Information-enabled Decisions
Making information-enabled decisions is a philosophy that has been widely discussed and promoted in recent times. Yet IT operations has not fully realized it in the cloud era. What should be a relatively simple report can take a team of reporting “experts” days to assemble and analyze.

A New Approach Is Needed
By design, the cloud is abstracted. This abstraction is viewed as one of the cloud’s main benefits: components can be set up to meet demand without understanding the details of the technology that supports them. Yet the limited visibility into the contents of the virtual machines that make up a business system, together with the dynamic allocation of those virtual machines, makes monitoring changes and configuration very difficult. The same abstraction severely limits visibility into the stack. This puts system stability in jeopardy and exposes organizations to incidents that can escalate into outages and costly lost opportunities.

A new approach is clearly needed to handle the complexity and dynamics introduced by today's virtualization and cloud environments. It should handle the constantly changing, overwhelming amount of configuration data and deliver actionable information.

Driven by Static Processes
IT operations has been accustomed to running on static processes and well-defined workflows. The ITIL process for change management is but one example: there is a step for proposing a change, a step for requesting approval, and so on. A set of metrics measures performance, such as the number of changes that successfully went through.

The Cloud Is Not Static
The problem with taking this static approach to managing IT operations in the cloud is that cloud-based operations are not static. IT Ops can plan as much as possible, but it won’t ensure that everything will occur as planned.

From an operations management standpoint, the cloud is a complex beast to tame. In contrast to traditional IT architectures, where each silo can be controlled by IT operations, the cloud comprises many layers of interconnected resources, some of which are controlled by external providers or by users themselves. Additionally, changes happen too quickly to maintain a fully documented, detailed configuration database. A golden image of the environment can quickly drift away from reality as “live” configuration updates are made without updating the offline golden image.
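
To make the golden-image problem concrete, here is a minimal sketch of drift detection: it flattens two configuration snapshots and lists every setting where the live system no longer matches the golden image. The function names, the nested-dictionary snapshot format, and the sample values are all assumptions made for illustration; they are not taken from any particular tool.

```python
from typing import Any

MISSING = "<missing>"  # placeholder for a setting present on only one side


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested settings into dotted keys, e.g. {"app": {"tls": True}} -> {"app.tls": True}."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def find_drift(golden: dict[str, Any], live: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (golden_value, live_value)} for every setting that differs."""
    g, l = flatten(golden), flatten(live)
    return {
        key: (g.get(key, MISSING), l.get(key, MISSING))
        for key in sorted(g.keys() | l.keys())
        if g.get(key, MISSING) != l.get(key, MISSING)
    }


# Hypothetical snapshots: the live server was tuned by hand after imaging.
golden_image = {"os": {"kernel": "5.15"}, "app": {"pool_size": 20, "tls": True}}
live_server = {"os": {"kernel": "5.15"}, "app": {"pool_size": 50, "tls": True, "debug": True}}

drift = find_drift(golden_image, live_server)
for setting, (expected, actual) in drift.items():
    print(f"{setting}: golden={expected!r} live={actual!r}")
```

In practice the snapshots would come from whatever collects configuration in a given environment; the point of the sketch is only that drift becomes visible once both sides are captured in a comparable form.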

Growing Complexities in Hazy Clouds
Cloud computing, which has become a great marketing term, and cloud platforms hold exciting promise: reduced costs and management overhead; flexibility, scalability, and accessibility; and automated provisioning. However, IT should be careful not to underestimate the intricacy inherent in the cloud. Today, every major aspect of a datacenter is undergoing unprecedented change, including the entire application stack. Automation is critical for facilitating an agile environment and reaching a DevOps model for production. Monitoring, orchestration, provisioning, service catalog management, development, testing, and more must execute in unison, which is a tough thing for any operations team to accomplish. Newer delivery models like IaaS, PaaS, and SaaS also bring their own challenges that demand decisive action to reduce MTTR.

So, IT Ops must still understand how services are built, the underlying infrastructure, and how issues impact the datacenter.

Workflow Creates False Security
Enforced processes strengthen the belief that everything is under control. However, no organization can claim it operates completely within the bounds of all established processes and approvals. This false security undermines IT operations.

IT Operations Needs to Collect and Analyze
A popular, if simplified, picture of the brain describes two distinct hemispheres: the right side collects information, while the left side is cognitive and analyzes it, translating sensory input into usable data.

IT needs the same model for managing the cloud: operations must first know what is happening now, and then be able to analyze it.

For example, a patch or "minor" release to an application can change hundreds of parameters at a granular level. The higher rate of change in the cloud, along with the extra configuration layers (e.g., virtual machine and host configurations), heightens the complexity and increases the management effort. The application may then not function as planned. IT managers may confirm that the upgrade went through the proper processes, yet still see poor performance. They need to go into the fine details and trace every step, identifying the makeup of even minor changes and learning how each was deployed. Finally, managers need to take this enormous amount of data, covering configuration and granular changes, and search it to pinpoint the root cause. Such an endeavor requires enormous resources and time.
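
One way to shrink that search is to narrow the recorded changes to those made shortly before the incident and review the most recent first. The sketch below assumes that granular changes are already being captured as records; the `Change` fields, the 24-hour lookback window, and the sample data are hypothetical, and a short list of suspects is a starting point for investigation rather than proof of a root cause.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    when: datetime
    layer: str       # e.g. "application", "vm", "host"
    parameter: str
    old: str
    new: str


def candidate_causes(changes, incident_start, lookback=timedelta(hours=24)):
    """Return changes made within `lookback` before the incident, most recent first."""
    window_start = incident_start - lookback
    recent = [c for c in changes if window_start <= c.when <= incident_start]
    return sorted(recent, key=lambda c: c.when, reverse=True)


# Hypothetical change log spanning several configuration layers.
incident = datetime(2024, 3, 5, 14, 0)
changes = [
    Change(datetime(2024, 3, 1, 9, 0), "host", "numa_balancing", "on", "off"),
    Change(datetime(2024, 3, 5, 10, 30), "application", "db.pool_size", "20", "5"),
    Change(datetime(2024, 3, 5, 13, 45), "vm", "vcpu_count", "4", "2"),
    Change(datetime(2024, 3, 5, 15, 0), "application", "log_level", "info", "debug"),
]

suspects = candidate_causes(changes, incident)
for c in suspects:
    print(f"{c.when:%Y-%m-%d %H:%M} [{c.layer}] {c.parameter}: {c.old} -> {c.new}")
```

Here the change from four days earlier and the change made after the incident began both fall outside the window, leaving two suspects instead of four.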

Cloud Encourages Unapproved Processes
Now, self-service provisioning has multiplied the number of activities occurring outside of static processes, and IT Ops is no longer directly managing every environment. For example, an organization may set up a private cloud with a dynamic management system and allow the testing team to provision servers on a self-service basis. Traditionally, testing professionals would come to IT and request an environment, and IT would oversee and manage the entire process. IT was responsible for that server, was an integral part of the process, and knew what was happening. Now that the process is independent, testing can create an environment whenever it needs one, and IT has no visibility into what happens there.
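
A first step toward regaining that visibility is to reconcile what is actually running against what IT has on record. The sketch below assumes two sets of server names, one discovered from the cloud platform and one exported from the CMDB; the function name and the sample names are invented for illustration.

```python
def reconcile(discovered: set[str], recorded: set[str]) -> dict[str, list[str]]:
    """Compare running servers with recorded ones."""
    return {
        # Running but unknown to IT, e.g. self-provisioned test environments.
        "unmanaged": sorted(discovered - recorded),
        # Recorded but no longer running, so the record is out of date.
        "stale": sorted(recorded - discovered),
    }


# Hypothetical inventories.
running_now = {"web-01", "web-02", "test-env-17", "test-env-18"}
cmdb_records = {"web-01", "web-02", "batch-03"}

report = reconcile(running_now, cmdb_records)
print(report)
```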

Intelligent Analytics for Mountains of Data
The amount of data has grown dramatically in the cloud scenario. The mountains of dynamic information confronting IT are not trivial and cannot be managed with just a dashboard or a set of metrics. Monitoring systems often yield so much data that little of it becomes usable information. Handling it requires taking all the data, really a multi-dimensional universe, and dynamically analyzing it according to intelligent parameters, not just running a reporting tool. Intelligent analytics needs to know how to deal with this data and present it clearly, showing the specific areas that may affect performance.

Focus on What Matters Most
Day-to-day monitoring of system functions is shown in dashboards and other indicators for measuring IT performance. With the growth of dynamic change, traditional alarms can be triggered without contextual awareness, often flooding operators and system administrators. IT operations must then spend countless hours cleaning up cluttered event consoles, ticketing systems, and more.
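
As a rough illustration of adding context to alarms, the sketch below collapses repeated alerts into one entry per host and check, and attaches any change recorded on that host shortly before the first alert. The tuple layouts, the 30-minute window, and the sample events are assumptions for this example only.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_alerts(alerts, changes, window=timedelta(minutes=30)):
    """Collapse duplicate alerts and attach recent changes on the same host.

    alerts:  list of (time, host, check)
    changes: list of (time, host, description)
    """
    counts = Counter((host, check) for _, host, check in alerts)
    first_seen = {}
    for when, host, check in sorted(alerts):
        first_seen.setdefault((host, check), when)

    summary = []
    for (host, check), count in counts.most_common():  # noisiest first
        start = first_seen[(host, check)]
        related = [
            desc
            for when, changed_host, desc in changes
            if changed_host == host and start - window <= when <= start
        ]
        summary.append(
            {"host": host, "check": check, "count": count, "recent_changes": related}
        )
    return summary


# Hypothetical events: five latency alerts follow a configuration change on web-01.
t0 = datetime(2024, 1, 1, 12, 0)
alerts = [(t0 + timedelta(minutes=i), "web-01", "latency") for i in range(5)]
alerts.append((t0, "db-01", "disk"))
changes = [(t0 - timedelta(minutes=10), "web-01", "pool_size 20 -> 50")]

summary = summarize_alerts(alerts, changes)
for entry in summary:
    print(entry)
```

Six raw alerts become two entries, and the noisier one arrives with the change that preceded it, which is the kind of context a bare alarm lacks.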

IT Ops can’t manage with a strictly workflow-based approach; it needs an analytics-based one. This requires that IT review changes with automated tools that can analyze data and allow IT professionals to make “information-enabled decisions,” not simply take steps based on predefined processes.

A Shift in Paradigm to Analytics-driven Management
IT operations has been operating on the mistaken belief that once a process is established, everything will fall into place and everything will work. So why do problems happen? Because people do all kinds of things outside of the set process.

IT operations needs to make a fundamental shift, similar to the shift that business has undergone. Today there are many automated systems, and the business makes many changes that require IT’s support. Yet IT still looks at its operations as static processes, from development to testing to deployment. When issues arise, the incident management process is put into action, with a set of metrics to watch. This is not enough, and it is the main issue for IT operations to address as it moves into the era of the cloud.

CMCrossroads is a TechWell community.
