procallistosolutions

Sr. SRE (Site Reliability Engineer)

Contract Jobs
Remote
Posted 9 months ago

Key Responsibilities

  • Design, implement, and maintain comprehensive monitoring, logging,
    and alerting solutions across our production and other environments
  • Lead incident response and post-mortem analyses, establishing best
    practices for problem resolution
  • Design and implement disaster recovery strategies and ensure regular
    testing
  • Collaborate with development teams and other stakeholders to
    implement SLAs for critical services
  • Optimize cloud infrastructure for performance, reliability, and cost-
    efficiency
  • Develop and maintain automation for deployment, scaling, and recovery
    procedures
  • Run and maintain our infrastructure with cookbooks using Terraform,
    GitLab CI/CD, and Kubernetes
  • Respond to on-call incidents

Required Skills & Experience

  1. 4+ years of experience in SRE, DevOps, or similar roles
  2. Working knowledge of a variety of languages and tools: Shell, Python,
    Chef (recipes, cookbooks), and Ansible (basic syntax, tasks, playbooks)
  3. Strong experience with AWS services: Cognito, EC2, EKS, RDS,
    CloudWatch, etc.
  4. Proficient in Kubernetes administration and operations in production
    environments
  5. Experience with infrastructure as code using tools like Terraform or
    CloudFormation
  6. Strong scripting skills with Python, Bash, or similar languages
  7. Deep understanding of observability tools such as Prometheus, Grafana,
    ELK stack, and distributed tracing systems
  8. Provisioning and setup of metrics and alerts in Prometheus and Grafana;
    provisioning and setup of logs and queries to answer general questions
  9. Experience with PostgreSQL or similar database systems, including
    replication strategies
  10. Knowledge of network protocols, load balancing, and security best
    practices
  11. Experience with CI/CD pipelines and GitOps workflows
  12. Ability to manage and prioritize multiple incidents under pressure
  13. Exposure to observability solutions such as Splunk, Datadog, and
    Dynatrace

Preferred Qualifications

  • AWS Certified Solutions Architect or DevOps Engineer certification
  • Certified Kubernetes Administrator (CKA) certification

Job Features

Job Category: IT Software
