Location: Domlur, Bangalore (In-Office)

Lead Platform Engineer

We are looking for a Lead Platform Engineer who brings deep expertise in infrastructure platforms, reliability, and developer experience. At Infraspec, you will drive technical direction, review architectural decisions, and set the bar for engineering excellence across teams.

What you will do:

  • Design, build, and manage infrastructure platforms that improve the flow of value by improving developer experience
  • Review RFCs and give feedback for improvements
  • Collaborate across teams and lead technical discussions with clients and internal teams
  • Mentor team members through mob-pairing and 1:1 pairing sessions
  • Drive internal alignment through workshops and mapping activities, and incorporate end-user needs and feedback into planning
  • Use methodologies like Agile and Lean to manage projects and deliverables
  • Document technical solutions and knowledge for team reference
  • Stay current with industry trends and emerging technologies

Required qualifications:

  • 8–12 years of experience in platform, infrastructure, or reliability engineering
  • Experience managing large-scale infrastructure systems
  • Excellent understanding of software reliability
  • Proficiency in at least one programming language
  • Excellent understanding of software delivery principles
  • Ability to adapt to different roles and technologies when needed

Required technical skills:

  • Prior experience building internal platforms, whether infrastructure platforms or business-focused platforms (e.g., a notification service)
  • Excellent understanding of application reliability, and the ability to enable product teams to build and emit the right application telemetry
  • Strong expertise in observability systems and effective incident management
  • Experience implementing alerting practices that reduce noise and help ensure system stability
  • Experience leading incident management drills that train teams on effective response protocols
  • Excellent understanding of the principles of distributed systems
  • Familiarity with the cloud-native ecosystem and tools such as Prometheus, OpenTelemetry, and Envoy
  • In-depth understanding of containers and container orchestrators like Kubernetes and Nomad
  • Excellent understanding of SQL and NoSQL databases and best practices for managing large-scale database clusters
  • Familiarity with best practices for architecting cloud environments, with a focus on workload reliability, cost, and security
  • Good grasp of Linux environments, virtualization, and networking
  • Strong understanding of Infrastructure as Code with tools such as Terraform, Pulumi, CloudFormation, and AWS CDK
  • Experience with configuration management tools like Ansible and Chef