SRE – Dynatrace has built-in support for the Site Reliability Engineering (SRE) methodology. The concept originated at Google in 2003. SRE strives to automate system management, maintenance, and troubleshooting tasks through the use of appropriate software.
A key task of the SRE team is to measure system performance and work systematically to improve it. Site reliability engineers start by surrounding a system with measurable indicators, which vary with the project, application, or line of business. However, the SRE team should focus not on low-level technical indicators (such as processor load) but on Service Level Indicators (SLIs), that is, indicators measured at the level of the service and tied to the business. A system serves its users better not when its processor is less loaded, but when it can handle more requests without compromising quality. Together with representatives of the business side, SRE engineers set Service Level Objectives (SLOs): target values for those indicators that define an acceptable level of reliability. Based on the collected signals, Dynatrace allows for:
- quick definition of the indicators needed to assess system reliability – SLI,
- quick definition of goals for the selected indicators – SLO,
- analysis of the error budget burn rate for defined SLOs, which shows ahead of time whether commitments are likely to be met within the evaluation period.
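
As a rough illustration of how an SLI, an SLO, and a burn rate relate to one another, the sketch below works through the arithmetic in Python. It is not Dynatrace's API or its exact computation: the names `Slo`, `sli`, `burn_rate`, and `days_to_exhaustion` are hypothetical, the SLI is assumed to be a simple ratio of good events to all events, and burn rate is taken in its common sense of observed error rate divided by the error rate the SLO allows.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical SLO: a target share of good events over a period."""
    name: str
    target: float       # e.g. 0.99 means 99% of requests must be good
    period_days: float  # length of the evaluation period (assumption)


def sli(good: int, total: int) -> float:
    """Service Level Indicator as the ratio of good events to all events."""
    if total <= 0:
        raise ValueError("total must be positive")
    return good / total


def burn_rate(slo: Slo, good: int, total: int) -> float:
    """Observed error rate relative to the error rate the SLO allows.

    A value of 1.0 consumes the error budget exactly by the end of the
    period; values above 1.0 consume it sooner.
    """
    allowed_error = 1.0 - slo.target
    if allowed_error <= 0:
        raise ValueError("a 100% target leaves no error budget to burn")
    observed_error = 1.0 - sli(good, total)
    return observed_error / allowed_error


def days_to_exhaustion(slo: Slo, rate: float) -> float:
    """Days until a full budget is used up if the given burn rate persists."""
    if rate <= 0:
        return math.inf  # nothing is being burned
    return slo.period_days / rate


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from a real system.
    slo = Slo(name="checkout availability", target=0.99, period_days=30)
    rate = burn_rate(slo, good=980, total=1000)
    print(f"SLI: {sli(980, 1000):.3f}")
    print(f"burn rate: {rate:.2f}")
    print(f"days until budget is exhausted: {days_to_exhaustion(slo, rate):.1f}")
```

With a 99% target and 980 good requests out of 1000, the observed error rate is about twice what the SLO allows, so the burn rate is about 2 and a 30-day budget would be used up roughly halfway through the period. That is the kind of early signal a burn rate analysis is meant to provide.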