Most Important
- 7+ years in SRE, DevOps, or Platform Engineering roles managing production AWS workloads.
- Strong hands-on experience with:
  - EKS, Kubernetes networking, Helm, and autoscalers (Karpenter / Cluster Autoscaler).
  - Service mesh (Istio, Linkerd, AWS App Mesh) for security and traffic control.
  - AWS CDK (TypeScript or Python preferred) and CloudFormation.
- Solid grasp of SLI/SLO/error-budget concepts and monitoring/alerting best practices.
- Proficiency in infrastructure automation (Go, Python, or TypeScript, plus Bash).
- Experience implementing Policy-as-Code using OPA/Rego in CI/CD.
- Strong knowledge of cloud security (IAM, KMS, VPC design, encryption, OS hardening).
Nice to Have
- Experience with Internal Developer Portals (e.g., Backstage, Port, Cortex).
- Familiarity with:
  - KEDA, K8s autoscalers, and AWS cost optimization strategies.
  - Secrets management tools (Vault, Consul, AWS Secrets Manager).
  - Chaos/resilience engineering (Gremlin, Litmus, AWS FIS).
- Experience with MongoDB or similar cloud-native databases.
Personal Traits
- Excellent communication and teamwork skills.
- Strong analytical and troubleshooting abilities.
- Keen attention to detail and commitment to delivering high-quality work.
- Ability to handle multiple tasks efficiently and independently.