Business critical is a phrase we hear so often that it has lost much of its meaning. So how do we determine which applications or components within our business are genuinely business critical and which are simply important? We can establish mechanisms for measuring criticality, but we must also take care with where we shine the light of these measurements. Application environments such as SAP are often regarded as single monoliths, with the measurement of criticality being applied to the whole environment.
SAP environments are usually divided into Production and non-Production environments such as Development, Test and Pre-Production, but this still tends to treat each of these environments as an entity unto itself. Perhaps the key is to break the monolith down into smaller components when we make our assessment. We should ask whether all of the elements of an SAP environment are equal. Are they all business critical? The answer is almost certainly no, but how do we define different levels of criticality?
Many organisations use Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) as the yardstick of criticality for their systems, designs, processes and technology. Whilst these are useful reference points, they do not address two critical questions: how does the organisation define failure, and how does it measure the impact of failure?
Failure is often defined in a very binary way: either a system is available or it isn’t, and when it isn’t, it has failed. But is this a valid mechanism for properly measuring failure? Should we not be taking a more granular view of failure and defining it from a number of different perspectives? At Ensono we take a five-dimensional view of failure, known as PARIS: we measure Performance, Availability, Reliability, Insight and Security.
We have all abandoned online transactions because the system was simply too slow or because we were uncertain that something had processed properly, and we remember the experience, which then affects our decision whether to visit that site again. Availability is important, but a system that is available but not performant is a failed system. Likewise, users need a consistent, reliable experience. We all prefer an environment where we don’t start each session wondering how it will perform this time. Again, if we have a nagging doubt that the environment will sometimes not perform as expected, we will often avoid it.
The owners of the system need genuine insight into its performance both past and predicted to ensure that they are able to meet anticipated needs and avoid future performance, availability and reliability issues. Simply defining failure as something that has happened is not enough – we need to embrace the avoidance of issues as part of the failure calculation.
Last, but most definitely not least, is the security of the system. Any system that performs well and is consistently reliable is of little use if it is not secure. Failure needs to be defined through all of these lenses to be a valuable measure.
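To make the multi-lens idea concrete, here is a minimal sketch of how a single reading of a system might be checked against all five PARIS dimensions rather than against availability alone. The field names, the thresholds and the use of forecast capacity headroom as a stand-in for Insight are all our own illustrative assumptions, not a published Ensono specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParisReading:
    """One observation of a system across the five PARIS dimensions."""
    p95_response_ms: float    # Performance: 95th percentile response time
    is_available: bool        # Availability: did the system respond at all?
    error_rate: float         # Reliability: fraction of requests that failed
    days_of_headroom: float   # Insight: forecast days before capacity runs out
    open_critical_vulns: int  # Security: unpatched critical vulnerabilities


# Hypothetical limits, chosen purely for illustration.
MAX_P95_MS = 2000.0
MAX_ERROR_RATE = 0.01
MIN_HEADROOM_DAYS = 30.0


def failed_dimensions(reading: ParisReading) -> List[str]:
    """Return the PARIS dimensions on which this reading counts as a failure."""
    failures = []
    if reading.p95_response_ms > MAX_P95_MS:
        failures.append("Performance")
    if not reading.is_available:
        failures.append("Availability")
    if reading.error_rate > MAX_ERROR_RATE:
        failures.append("Reliability")
    if reading.days_of_headroom < MIN_HEADROOM_DAYS:
        failures.append("Insight")
    if reading.open_critical_vulns > 0:
        failures.append("Security")
    return failures


# A system that is "up" in the binary sense but too slow for its users.
slow_but_up = ParisReading(
    p95_response_ms=4500.0,
    is_available=True,
    error_rate=0.002,
    days_of_headroom=90.0,
    open_critical_vulns=0,
)
print(failed_dimensions(slow_but_up))
```

Under a binary definition the example system would be reported as healthy; under the multi-lens check it is flagged as failing on Performance.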
Having established the meaning of failure, we then need to examine the systems to which we will apply the measurement and the mitigations we will implement to react effectively when failure occurs. Defining ‘business critical systems’ is often contentious, especially as the available mitigation strategies vary significantly in cost. Applying the appropriate mitigations to each element therefore becomes an exercise in properly defining value.
SAP is usually defined at the very least as important and is often regarded as critical, but it becomes very expensive to protect the entire environment completely. This can lead to short cuts being taken: the DR environment is used for Test & Dev, seasonal capacity is repurposed outside the expected seasons, and the feeding and watering isn’t always quite as thorough as it could be because 24 by 7 teams are expensive and difficult to manage. The result is a compromise, largely because SAP is often seen as a single thing, whereas in reality it is a collection of elements, each of which should be assessed independently.
Breaking SAP into its constituent parts is relatively straightforward, as they are well defined. However, with the increasing use of the Internet of Things, Artificial Intelligence, Blockchain and data lakes, there are many connected and interdependent systems that are part of the SAP ecosystem. The analysis needs to encompass these elements as well, to determine which of these third-party elements are part of the business critical application chain.
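One way to picture this analysis is as a dependency map: start from a business process that is genuinely critical and follow its dependencies to see which components, including third-party ones, sit in its chain. The sketch below assumes a hand-maintained map, and every component name in it is hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: each key depends on the components listed.
DEPENDS_ON: Dict[str, List[str]] = {
    "order-to-cash": ["sap-erp", "iot-gateway"],
    "sap-erp": ["sap-database"],
    "iot-gateway": ["cloud-data-lake"],
    "management-reporting": ["cloud-data-lake", "ai-forecasting"],
}


def critical_chain(process: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component the given process depends on, directly or indirectly."""
    seen: Set[str] = set()
    queue = deque([process])
    while queue:
        node = queue.popleft()
        for dependency in depends_on.get(node, []):
            if dependency not in seen:  # also guards against circular dependencies
                seen.add(dependency)
                queue.append(dependency)
    return seen


print(sorted(critical_chain("order-to-cash", DEPENDS_ON)))
```

In this made-up landscape the data lake turns out to be part of the critical chain because the IoT gateway relies on it, whereas the AI forecasting service only supports reporting and could reasonably be given a lower level of protection.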
Many of these connected components are based on public cloud infrastructure, and that often drives a need to monitor the failure criteria differently. Many traditional SAP environments are located within data centres and are monitored using a ‘bottom-up’ approach, where the monitoring track starts at the hardware and runs through the operating system, via the database and into the application logs. In some more enlightened environments, the applications themselves are monitored via synthetic transactions, but even where this is done the results are rarely correlated with the data gathered from the bottom-up approach.
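As a rough illustration of what such a correlation could look like, the sketch below pairs each failed synthetic transaction with any infrastructure alerts raised shortly before it. The data shapes, the five-minute window and the sample transaction and alert names are assumptions made for the example; a real monitoring platform would supply its own.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

SyntheticFailure = Tuple[str, datetime]  # (transaction name, time it failed)
InfraAlert = Tuple[datetime, str]        # (time raised, alert description)


def correlate(
    synthetic_failures: List[SyntheticFailure],
    infra_alerts: List[InfraAlert],
    window: timedelta = timedelta(minutes=5),
) -> Dict[SyntheticFailure, List[str]]:
    """Map each failed synthetic transaction to the bottom-up alerts
    raised within `window` before the failure."""
    correlated = {}
    for failure in synthetic_failures:
        _, failed_at = failure
        correlated[failure] = [
            description
            for raised_at, description in infra_alerts
            if timedelta(0) <= failed_at - raised_at <= window
        ]
    return correlated


# Hypothetical sample data.
failures = [("create-sales-order", datetime(2024, 1, 1, 10, 4))]
alerts = [
    (datetime(2024, 1, 1, 10, 1), "database: high I/O wait"),
    (datetime(2024, 1, 1, 9, 0), "operating system: disk 80% full"),
]
for failure, related in correlate(failures, alerts).items():
    print(failure[0], "->", related)
```

A time window is a deliberately crude link between the two views; it is only meant to show that the top-down symptom and the bottom-up cause can be placed side by side.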
When we adopt systems and services that work within public cloud, we need to find an alternative approach to monitoring and management. The good news is that we need not be as concerned about some of the underlying infrastructure, as the hardware and most of the operating systems are managed for us. Rather than the more traditional bottom-up approach, we need to adopt a top-down focus for monitoring and management. In essence we need to replicate the user experience, but we also need a way of understanding where the issue lies when one arises. Finally, we need to ensure we have a complete view of the application landscape, including all of the ancillary systems and their importance within the overall application chain.
Finally, we need to understand our user personas. If we are to take a user-centric approach to monitoring, we must understand the expectations of the user populations that we are replicating. Our customers will have high expectations and the freedom to take their business elsewhere. Our suppliers may adjust pricing if we are seen as difficult to trade with because of our systems. Our internal staff, by contrast, are likely to be more tolerant or to have more manageable expectations.
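In monitoring terms, this suggests that the same measured response time may be acceptable for one persona and a failure for another. The short sketch below applies a different tolerance to each persona; the persona names and the limits in seconds are illustrative assumptions only.

```python
from typing import Dict, List

# Hypothetical maximum acceptable journey time, in seconds, per persona.
PERSONA_MAX_SECONDS: Dict[str, float] = {
    "customer": 2.0,   # high expectations, free to go elsewhere
    "supplier": 4.0,   # friction may be reflected in pricing
    "internal": 8.0,   # generally more tolerant
}


def personas_in_breach(journey_seconds: Dict[str, float]) -> List[str]:
    """Return the personas whose measured journey time exceeds their tolerance."""
    return sorted(
        persona
        for persona, seconds in journey_seconds.items()
        if seconds > PERSONA_MAX_SECONDS[persona]
    )


# The same three-second journey, measured for each persona.
measured = {"customer": 3.0, "supplier": 3.0, "internal": 3.0}
print(personas_in_breach(measured))
```

Here an identical three-second journey is a breach for customers but within tolerance for suppliers and internal staff.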
Managing Business Critical Applications requires that we understand our users and our complete ecosystems, and that we measure these against a strong framework of metrics that provides a complete picture of our environment.
For further details on Ensono’s Managed SAP on Azure, read the ebook ‘5 steps to mastering your SAP to Azure migration’.