Contract Jobs
Remote
Posted 9 months ago
Key Responsibilities
- Design, implement, and maintain comprehensive monitoring, logging, and alerting solutions across our production and other environments
- Lead incident response and post-mortem analyses, establishing best practices for problem resolution
- Design and implement disaster recovery strategies and ensure regular testing
- Collaborate with development teams and other stakeholders to implement SLAs for critical services
- Optimize cloud infrastructure for performance, reliability, and cost-efficiency
- Develop and maintain automation for deployment, scaling, and recovery procedures
- Run and maintain our infrastructure with cookbooks using Terraform, GitLab CI/CD, and Kubernetes
- Respond to on-call incidents
Required Skills & Experience
- 4+ years of experience in SRE, DevOps, or similar roles
- Work in a variety of languages: Shell, Chef (recipes, cookbooks), Ansible (basic syntax, tasks, playbooks), and Python
- Strong experience with AWS services: Cognito, EC2, EKS, RDS, CloudWatch, etc.
- Proficient in Kubernetes administration and operations in production environments
- Experience with infrastructure as code using tools like Terraform or CloudFormation
- Strong scripting skills with Python, Bash, or similar languages
- Deep understanding of observability tools such as Prometheus, Grafana, ELK stack, and distributed tracing systems
- Provisioning and setup of metrics and alerts in Prometheus and Grafana; provisioning and setup of logs and queries for general questions
- Experience with PostgreSQL or similar database systems, including replication strategies
- Knowledge of network protocols, load balancing, and security best practices
- Experience with CI/CD pipelines and GitOps workflows
- Ability to manage and prioritize multiple incidents under pressure
- Exposure to observability solutions such as Splunk, Datadog, or Dynatrace
Preferred Qualifications
- AWS Certified Solutions Architect or DevOps Engineer certification
- Certified Kubernetes Administrator (CKA) certification
Job Features
| Job Category | IT Software |