Site Reliability Engineer

Fractal

San Francisco Bay Area
Post Date: October 6, 2024

Job Overview

Responsibilities:
Monitor system uptime and availability, ensuring functional and performance SLAs are met.
Respond to alerts from all critical infrastructure and resolve environment issues.
Participate in analyzing incident trends and identifying the root causes of issues.
Triage problems for critical services and build automation to prevent recurrence.
Influence and create new designs, architectures, standards, and methods for supporting the platform.
Understand C3 deployment automation flows in order to upgrade them as needed and to troubleshoot issues with system updates and upgrades effectively.
Participate in the on-call rotation.
Work cross-functionally with Services and Engineering teams.
Qualifications:
Demonstrated understanding of deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes/JVM-based infrastructure in AWS, GCP, and other public clouds.
Expertise in Linux operating systems, networking, and database concepts.
Experience deploying, upgrading, and troubleshooting Kubernetes clusters and workloads.
Experience with Cassandra (or another NoSQL alternative).
Expertise in cloud providers such as Amazon Web Services, Azure, and GCP.
Experience with configuration management systems such as Puppet.
Experience using Bash or Python to automate and monitor systems.
Experience with infrastructure-as-code (IaC) tools such as Ansible or Terraform.
Excellent problem-solving, critical thinking, and communication skills.
Experience supporting commercial SaaS solutions in a DevOps or system administrator role.
BS or MS in Computer Science or a related field, or equivalent professional experience.

Never pay anyone for a job application, test, or interview.

Safety Information

Safety Tips For Candidate

Do your own research (DYOR).
Always check the employer's offer.