
A service-level agreement (SLA) is a promise made to the user of a service that the availability and reliability of the service will meet a certain level of expectation. An SLA acts as a pact between the software provider and the software user or client. Depending on the contract, an SLA may also cover responsiveness to incidents and bugs. If an SLA is broken, a penalty may be incurred, such as a refund or a service subscription credit.
SLAs are an integral part of an IT vendor contract. An SLA pulls together information on all of the contracted services and their agreed-upon expected reliability into a single document. It clearly states metrics, responsibilities, and expectations so that, in the event of issues with the service, neither party can plead ignorance. It ensures that both sides have the same understanding of the requirements. Any significant contract without an associated SLA (reviewed by legal counsel) is open to deliberate or inadvertent misinterpretation. The SLA protects both parties in the agreement.
The types of SLA metrics required will depend on the services that are provided. Many items can be monitored as part of an SLA, but the scheme should be kept as simple as possible to avoid confusion and excessive cost on either side. The service's actual availability should not be much better than its SLO, and the availability SLO in the SLA is normally a looser objective than the internal availability SLO. That difference might be expressed in availability numbers: for instance, an availability SLO of 99.9% over one month in the SLA, with an internal availability SLO of 99.95%. Alternatively, the SLA might specify only a subset of the metrics that make up the internal SLO. The goal should be an equitable incorporation of best practices and requirements that maintains service performance and avoids additional costs.
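To make the gap between the two objectives concrete, here is a minimal Python sketch that converts an availability target into the downtime it allows. It assumes a 30-day month, and the function names `allowed_downtime_minutes` and `classify` are hypothetical names chosen for illustration, not part of any Google Cloud API.

```python
def allowed_downtime_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Minutes of downtime permitted in the window at the given availability target."""
    total_minutes = window_days * 24 * 60  # assumption: a 30-day month by default
    return total_minutes * (100.0 - availability_pct) / 100.0


def classify(downtime_minutes: float, sla_pct: float = 99.9, slo_pct: float = 99.95) -> str:
    """Report which objective, if any, the observed downtime has breached."""
    if downtime_minutes > allowed_downtime_minutes(sla_pct):
        return "SLA violated"
    if downtime_minutes > allowed_downtime_minutes(slo_pct):
        return "internal SLO missed, SLA still met"
    return "within internal SLO"


print(round(allowed_downtime_minutes(99.9), 1))   # external SLA budget: 43.2
print(round(allowed_downtime_minutes(99.95), 1))  # internal SLO budget: 21.6
print(classify(30.0))                             # internal SLO missed, SLA still met
```

Under these assumptions the internal objective allows roughly half the downtime of the external one, so a month with 30 minutes of downtime misses the internal SLO while still leaving a margin before any SLA penalty applies.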
This chapter from the book "Google Cloud for DevOps Engineers" explains how SLAs represent an external agreement with customers about the reliability of a service, what the consequences are if the agreement is violated, and how SLIs drive SLOs that inform SLAs.
Read more at 👉🏻 Defining SLAs