Lead Site Reliability Engineer

Happyrobot
Job Overview

About Happyrobot

Happyrobot is a SaaS platform (plus dev tools) for building and deploying AI agents that talk on the phone. Freight brokers and other logistics enterprises use it to handle sales and customer service calls. We handle thousands of calls a day in production and are growing fast.

About The Role

You’ll keep thousands of concurrent AI agents talking on the phone with minimal latency. You’ll auto-scale AI deployments efficiently. You’ll architect a fault-tolerant multi-region infrastructure. You’ll help automate testing procedures. You like AI. You really like AI.

Responsibilities

Create a foundation for the reliability, scalability, observability, latency, and efficiency of the product.
Address issues that are found at scale.
Manage on-call rotations across continents.
Lead sustainable incident response and blameless postmortems.

Must Have

Experience working in computing, distributed systems, storage, or networking.
Excellent communication skills and the ability to explain technical concepts clearly.
Passion for AI-driven technologies.
Founder mindset: hard work + independence + ownership.
