Service level objectives (SLOs)

This page defines customer-facing reliability targets for the SREDSimplify production environment (prd). It is a living document: when architecture or traffic changes, update SLOs and linked runbooks together.

Scope

Surface	Users	Notes
Web application (Next.js)	End customers and internal operators	Includes marketing pages and authenticated workspace
API (Spring Boot)	Web client and integrations	JWT on custom `auth` header per API contract
Python document service	Invoked from backend workflows	Long-running AI and document jobs

Availability SLOs (draft)

These are provisional targets until historical metrics are available to back them; until then, treat the percentages as design goals for setting alerting thresholds.

Service	Monthly availability target	Measurement
Web + API (synthetic or edge checks)	99.5%	Rolling 30 days
Background document jobs	99.0%	Job success rate over completed jobs
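The job success rate in the table can be computed as succeeded jobs divided by completed jobs. The following is a minimal sketch under assumptions: the names `JobCounts`, `job_success_rate`, and `meets_target` are hypothetical and not part of the codebase, and treating a window with no completed jobs as "not breached" is an illustrative choice rather than an agreed policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobCounts:
    """Hypothetical container for completed-job counts in one window."""
    succeeded: int
    failed: int


def job_success_rate(counts: JobCounts) -> Optional[float]:
    """Return succeeded / completed, or None when nothing completed."""
    completed = counts.succeeded + counts.failed
    if completed == 0:
        return None  # no completed jobs, so the rate is undefined
    return counts.succeeded / completed


def meets_target(counts: JobCounts, target: float = 0.99) -> bool:
    """Compare the success rate with the draft 99.0% target.

    Assumption: a window with no completed jobs counts as meeting the target.
    """
    rate = job_success_rate(counts)
    return rate is None or rate >= target
```

With 990 succeeded and 10 failed jobs the rate is 0.99, which just meets the draft target; 980 succeeded and 20 failed would miss it.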

Error budget (conceptual)

For a 99.5% monthly availability target measured over a rolling 30 days, the error budget is roughly 3.6 hours of combined outage (0.5% of 720 hours). When burn is high:

Triage with on-call or engineering lead.
Open or update a tracking issue with customer impact.
Link a postmortem if a user-visible failure occurred (example: Login outage postmortem).
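The budget arithmetic above, and one possible way to quantify "high burn", can be sketched as follows. The function names are hypothetical, and the burn-rate definition (observed unavailability divided by allowed unavailability) is a common convention assumed here, not something this page prescribes.

```python
def error_budget_hours(target: float, window_days: int = 30) -> float:
    """Allowed downtime in hours for an availability target over a window."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be a fraction strictly between 0 and 1")
    return (1.0 - target) * window_days * 24


def burn_rate(downtime_hours: float, elapsed_hours: float, target: float) -> float:
    """Observed unavailability divided by allowed unavailability.

    A value of 1.0 means the budget is being spent at exactly the pace that
    would exhaust it at the end of the window; above 1.0 is faster.
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be a fraction strictly between 0 and 1")
    observed_fraction = downtime_hours / elapsed_hours
    allowed_fraction = 1.0 - target
    return observed_fraction / allowed_fraction
```

For example, `error_budget_hours(0.995)` comes out at about 3.6, matching the figure above, and 1.8 hours of downtime in the first 72 hours of a window gives a burn rate of about 5 against the 99.5% target.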

Latency (draft)

Path class	Target (p95)	Notes
Authenticated workspace shell	Under 2s TTFB at edge	Excludes long AI runs
Core REST mutations	Under 5s server-side	AI-heavy endpoints may use async patterns

Document concrete probes and dashboards in your observability tool of choice; keep deep links out of this repo if they rotate frequently.
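As an illustration of how a probe result could be checked against the p95 targets in the table, here is a minimal sketch that uses the nearest-rank percentile method. The helper names and the choice of nearest-rank are assumptions; an observability tool may compute percentiles differently (for example with interpolation or histogram buckets).

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples_ms: Sequence[float], pct: int) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples_ms)
    # Nearest rank: the smallest rank covering at least pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def p95_within_target(samples_ms: Sequence[float], target_ms: float) -> bool:
    """True when the p95 latency is strictly under the target."""
    return percentile_nearest_rank(samples_ms, 95) < target_ms
```

Under these assumptions, the workspace shell target would be checked with `p95_within_target(ttfb_samples_ms, 2000)` and core REST mutations with a 5000 ms target, where `ttfb_samples_ms` is a hypothetical list of probe measurements.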

Dependencies that affect SLOs

PostgreSQL: primary data store; see the database runbooks linked from the Architecture hub.
Redis: quotas, auth tokens, and rate limits; see the Redis high memory runbook.
External LLM and document providers: third-party outages may consume error budget even when there is no code defect.

Redis high memory runbook
Login outage postmortem
Backend deployment notes
Tooling reference for CI/CD and release entry points
