Description:
Manus AI, a trailblazing company in general AI agents, is seeking a Senior Site Reliability Engineer (SRE) to join their on-site team in California. The company is committed to creating intelligent systems that do more than think—they execute and deliver. Manus AI integrates expertise across engineering, research, and business domains, fostering a dynamic and forward-thinking workplace.
In this critical role, you will ensure the high availability, scalability, and robustness of Manus AI’s infrastructure. You will lead initiatives to automate operations, manage containerized environments, and maintain performance across a range of production services. This is a hands-on, full-time position requiring deep technical expertise, problem-solving skills, and the ability to work effectively in a collaborative yet autonomous setting.
Key Responsibilities:
Manage and maintain container clusters and open-source component clusters across business lines
Build and enhance infrastructure operation platforms, including infrastructure management, CI/CD, monitoring, alerting, and logging systems
Respond swiftly to incidents and implement efficient solutions to minimize downtime
Optimize system architecture and deployment strategies to ensure production service availability
Drive automation initiatives to enhance operational efficiency and reduce manual processes
Collaborate with development teams to implement infrastructure-as-code and service reliability best practices
Participate in a 24/7 on-call rotation for mission-critical systems
Qualifications:
Bachelor's degree in Computer Science or related technical field preferred
5+ years of experience in systems operations or SRE roles
Proficiency with major public cloud platforms (AWS, Azure, GCP)
Strong Linux administration skills and day-to-day operational experience
Advanced scripting skills using Shell and Python
Deep understanding of internet technologies and of performance optimization for Nginx, MySQL, Redis, Kafka, Elasticsearch, and the JVM
Extensive hands-on experience with Kubernetes and Docker in production environments
Familiarity with CI/CD tools such as GitLab CI and Argo CD
Excellent troubleshooting and problem-solving skills under pressure
Strong communication and collaboration abilities, especially in remote settings
Self-driven with the ability to work independently while aligned with team goals
Fluency in both Chinese and English (working proficiency in both required)
About the Company:
Manus AI builds general AI agents capable of both reasoning and execution. Their agents are designed to enhance productivity by autonomously handling tasks across work and life, allowing users to rest while the AI handles the workload. The company offers a highly collaborative environment where professionals come together to innovate at the edge of AI capabilities.
| Organization | Manus AI |
| Industry | IT / Telecom / Software Jobs |
| Occupational Category | Senior Site Reliability Engineer |
| Job Location | California, USA |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Experienced Professional |
| Experience | 5 Years |
| Posted at | 2025-05-25 9:52 am |
| Expires on | 2026-01-06 |