What you'll do

As an SRE at Airomeda you'll own the reliability of 47 production systems. You'll manage Kubernetes clusters across 4 active regions (fra1 · lon1 · ist1 · iad1), operate the Grafana + Loki + OpenTelemetry observability stack, and keep failover under 8 seconds.

You'll optimise GitHub Actions pipelines, run chaos experiments, and act as an infrastructure advisor when new client projects go live. The 99.95% SLA we promise clients is yours to defend.
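To put the 99.95% figure in concrete terms, here is a minimal sketch of the downtime budget it implies. The 30-day measurement window and the function name `downtime_budget_minutes` are assumptions made for illustration; the actual client contracts may define the window differently.

```python
def downtime_budget_minutes(sla_percent: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability target allows.

    window_days is an assumed measurement window, not a contract term.
    """
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # 99.95% over an assumed 30-day window
    print(round(downtime_budget_minutes(99.95), 1))
```

Under that assumption the budget comes to roughly 21.6 minutes per 30 days, which is why fast failover and quick alert triage matter so much in this role.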

Is this role for you?

We're looking for someone who sees infrastructure as an engineering product, not a utility. Someone who writes post-mortems to make the system stronger, not to assign blame. Someone who can tell a "noisy alert" from a "real signal" in under 60 seconds.

Responsibilities

→ Manage Kubernetes clusters across fra1 · lon1 · ist1 · iad1
→ Optimise CI/CD pipelines and improve their reliability
→ Participate in SLA monitoring, alerting and the on-call rotation
→ Run chaos engineering experiments to stress-test infrastructure
→ Contribute to infrastructure design for new client projects
→ Mentor the engineering team on observability tooling

Requirements

· 4+ years of Kubernetes production operations
· Terraform IaC; multi-region infrastructure experience
· Grafana, Prometheus, Loki, OpenTelemetry stack
· CI/CD pipeline design (GitHub Actions or GitLab CI)
· Zero-downtime deployment strategies (rolling, canary, blue-green)
· Linux sysadmin and network architecture
· Written English proficiency

What we offer

✓ Competitive salary + annual bonus
✓ Fully remote, or hybrid at our Istanbul Maslak office
✓ 12,000 TRY annual conference budget
✓ Mac or Linux workstation of choice
✓ Private health insurance
✓ On-call compensation

SRE / DevOps Engineer
