Description
The Staff Linux System Administrator owns the availability, performance, security, and automation of large‑scale Linux server environments supporting manufacturing applications. This is a hands‑on role requiring deep Linux expertise, strong automation skills, and experience operating in complex, global environments. The role also serves as a technical leader and a point of contact for issues in critical production systems.
Responsibilities
- Own the administration, reliability, and performance of Linux server environments supporting 24×7 manufacturing systems
- Diagnose and resolve complex issues spanning Linux OS, virtualization, containers, networking, and hardware
- Design, implement, and test Business Continuity and Disaster Recovery (BC/DR) strategies and runbooks
- Drive operational excellence through proactive monitoring of system health, capacity, and performance metrics
- Automate operational tasks related to monitoring, alerting, configuration management, and reporting
- Serve as a technical leader and point of contact for critical issues, and mentor L1/L2 teams in a global support model
- Collaborate with architects and leadership to implement scalable, resilient infrastructure solutions
Minimum Qualifications
- 5+ years of Linux system administration experience in large enterprise or production environments
- Advanced expertise in Linux performance troubleshooting, performance tuning, and security hardening
- Hands‑on experience with Kubernetes platforms (OpenShift or OKD)
- Demonstrable experience with Ansible and Red Hat Ansible Automation Platform
- Working knowledge of scripting and automation using Bash/Shell and Python
- Experience with virtualized or hyper‑converged infrastructure (VMware, Nutanix, or equivalent)
- Good communication skills and ability to operate independently in a global environment
Preferred Qualifications
- Experience supporting manufacturing or critical production systems
- Hands‑on experience with Red Hat Advanced Cluster Management (RHACM) and Red Hat Advanced Cluster Security (RHACS)
- Familiarity with enterprise monitoring and observability tools (e.g., Splunk)
- Experience with enterprise backup and recovery platforms (e.g., Cohesity)
- Experience supporting enterprise server hardware (Dell, Lenovo, or equivalent)
- Prior participation in on‑call rotations and major incident response
- Experience contributing to infrastructure standards, procedures, and documentation