London DevOps #98.2 - Keeping login from taking you down with an SRE approach to auth – Viola Lykova
Author: London DevOps
Uploaded: 2026-02-24
Views: 10
Description:
Treat authentication as a production-critical system with its own failure modes and operational risks. In this talk, I break down real-world auth incidents involving JWKS rotation errors, refresh token storms, clock drift, and session store outages. I show how to define SLIs and SLOs that measure user impact and how to build monitoring and alerting that expose real reliability problems. I demonstrate practical guardrails such as token caching, exponential backoff with jitter, circuit breakers, and feature-flagged degraded modes. Finally, I walk through an incident runbook that helps teams diagnose, mitigate, and recover from authentication failures safely and quickly.
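Of the guardrails listed, exponential backoff with jitter is the easiest to show in a few lines. The talk's own implementation is not reproduced on this page, so what follows is only a minimal sketch, assuming the common "full jitter" variant (each wait is drawn uniformly between zero and a capped exponential bound). The names `call_with_retry`, `retries`, `base`, and `cap` are hypothetical and chosen for illustration.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retries: int = 4,          # hypothetical default: retries after the first call
    base: float = 0.1,         # hypothetical default: first backoff bound, in seconds
    cap: float = 5.0,          # hypothetical default: upper limit for any single wait
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call fn(); on failure wait a jittered, exponentially growing delay and retry."""
    rng = rng or random.Random()
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                # Out of retries: surface the last error to the caller.
                raise
            # Full jitter: uniform in [0, min(cap, base * 2**attempt)].
            bound = min(cap, base * 2 ** attempt)
            sleep(rng.uniform(0.0, bound))
    raise AssertionError("unreachable")
```

The randomized wait is what spreads clients out: if many of them fail at the same moment (for example during a key rotation), fixed delays would send them all back at once, while jittered delays stagger the retries. In a real service the `except` clause would normally be narrowed to errors that are safe to retry.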
Viola is a Senior Software Engineer in fintech with an SRE mindset, focused on authentication as a production system. She cares about reliability, incident patterns, and the kind of testing that still holds up when traffic spikes, dependencies misbehave, or keys rotate at the worst possible time. Viola speaks on practical auth topics across security and reliability.
Thanks to our hosts AutogenAI, and our sponsors Adaptavist, Prism Digital and Tyme Technologies.