Description:
F5 is seeking a Sr. Site Reliability Engineer to support and enhance the F5 Distributed Cloud product. This hands-on SRE role focuses on automation, infrastructure stability, observability, and reliability engineering for large-scale distributed networking and security systems. The engineer will be deeply involved in building resilient systems, improving automation pipelines, supporting on-call operations, and collaborating with cross-functional teams to ensure seamless delivery of cloud services.
Responsibilities:
Serve as a hands-on SRE focused on automation, reliability, and toil reduction.
Participate in on-call rotation to resolve production issues and maintain operational excellence.
Expand and improve infrastructure automation using Kubernetes, ArgoCD, Helm, Terraform, and Golang/Python.
Enhance observability systems (metrics, logs, monitoring, alerting).
Collaborate with application owners and SRE peers to execute the roadmap.
Design scalable, resilient, and secure distributed systems.
Ensure workloads are deployed and upgraded in Kubernetes with zero downtime.
Contribute to Disaster Recovery planning and migration activities.
Build, optimize, and maintain CI/CD pipelines.
Deploy and manage cloud workloads across AWS, GCP, or Azure.
Requirements:
U.S. citizenship is required due to the nature of the work.
8+ years of relevant experience (or equivalent via advanced degree).
Strong programming skills in Python or Golang, plus shell scripting.
Advanced Terraform expertise.
Solid networking fundamentals with experience across multiple network stack layers.
Deep SRE/DevOps experience with Linux and Kubernetes.
Practical experience debugging distributed systems and production workloads.
GitOps experience using Helm/Kustomize and ArgoCD/FluxCD.
Strong CI/CD knowledge and hands-on pipeline management.
Experience working with cloud platforms (AWS, GCP, Azure).
Production on-call experience and strong troubleshooting skills.
Strong communication and organizational skills across all levels.
Passion for automation, reliability engineering, and reducing operational toil.
Ability to work autonomously and take full ownership of solutions.
Collaborative approach, comfortable working across diverse teams.
Innovative mindset with enthusiasm for continuous learning.
Must be able to participate in an on-call rotation.
Must adhere to strong security principles and best practices.
Experience with disaster recovery and migrations is a plus.
Must work in hybrid mode based in San Jose, CA.
F5 may request verification of U.S. citizenship.
| Field | Details |
| --- | --- |
| Organization | F5 |
| Industry | Engineering Jobs |
| Occupational Category | Reliability Engineer |
| Job Location | New York, USA |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Experienced Professional |
| Experience | 8 Years |
| Posted at | 2025-11-19 2:43 pm |
| Expires on | 2026-01-03 |