Description:
NOC Analyst
Role/Summary:
Responsible for 24x7 monitoring, incident management, and operational support of a large-scale hybrid infrastructure including servers, virtualization platforms, storage systems, network devices, and applications. Ensure high availability, performance, and reliability across all environments (Prod, DR, Non-Prod).
The following skills are mandatory. Do not submit candidates who lack them.
Technical Skills:
• Strong knowledge of:
o Windows & Linux server administration (basic troubleshooting, L1 and L1.5)
o Virtualization: VMware & Nutanix (L1 and L1.5)
o Storage systems: SAN/NAS, Isilon, Quantum or similar PB-scale storage
o Networking fundamentals: TCP/IP, DNS, VPN, Firewalls, Load Balancers (F5) (L1 and L1.5)
• Experience with monitoring tools (New Relic, Splunk, Nagios, Zabbix, Dynatrace, SCOM, etc.)
• Understanding of ITSM tools (ServiceNow preferred) for incident, change, and problem management, and of the Rubrik backup management tool.
Operational Skills:
• Incident management and escalation handling in 24x7 environments
• Strong troubleshooting and analytical skills
• Ability to correlate infrastructure, network, and application issues
• Strong communication and coordination skills
• Ability to work under pressure in critical outage scenarios
• Good documentation and reporting skills
Preferred Qualifications:
• ITIL Foundation certification
• Experience in large-scale enterprise or MSP environments
• Exposure to cloud or hybrid environments (AWS/Azure) is a plus.
Key Responsibilities:
Infrastructure Monitoring & Operations
• Monitor 1,200+ servers (Windows/Linux), virtualization platforms (VMware, Nutanix), and web servers for performance and availability.
• Oversee storage systems (PB-scale: Quantum, Isilon, NAS, SAN), ensuring uptime and capacity health
• Monitor network infrastructure (1,200+ devices), including switches, routers, firewalls, VPN tunnels, WAPs, and ISP circuits.
• Monitor and act on incidents and requests related to the infrastructure and tools hosted in the environment.
Incident & Event Management
• Perform L1/L2 triage for alerts, incidents, and outages across infrastructure and applications
• Ensure timely incident resolution, escalation, and communication as per SLAs
• Correlate alerts across tools to identify root causes and reduce noise
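As an illustration only, alert correlation of this kind might look like the minimal Python sketch below. It assumes alerts from different tools have already been normalized into a common record; the `Alert` fields, the `correlate` function, and the 300-second window are hypothetical choices for the example, not a description of the tooling used in this role.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Alert:
    source: str       # monitoring tool that raised the alert (hypothetical label)
    host: str         # affected host or device
    timestamp: float  # seconds since epoch
    message: str


def correlate(alerts, window_seconds=300):
    """Group alerts for the same host that arrive close together in time.

    An alert joins the current group if it occurs within window_seconds
    of the previous alert for that host; otherwise it starts a new group.
    """
    by_host = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        by_host[alert.host].append(alert)

    groups = []
    for items in by_host.values():
        current = [items[0]]
        for alert in items[1:]:
            if alert.timestamp - current[-1].timestamp <= window_seconds:
                current.append(alert)
            else:
                groups.append(current)
                current = [alert]
        groups.append(current)
    return groups
```

Each returned group can then be raised as a single incident instead of one ticket per alert, which is one simple way to reduce noise.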
Application & Service Monitoring
• Monitor 50+ applications across multiple environments (Prod, DR, UAT, Dev)
• Track service health, availability, and dependencies (web, middleware, backend systems)
Capacity & Performance Management
• Track utilization trends across compute, storage (multi-PB), and network
• Proactively identify bottlenecks and recommend optimizations
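As an illustration only, a utilization trend can be turned into a rough capacity forecast with a straight-line fit. The sketch below is a minimal example under stated assumptions: the `days_until_full` function, the 90% threshold, and the linear growth model are hypothetical and are not taken from any tool named in this posting.

```python
def days_until_full(samples, capacity, threshold=0.9):
    """Estimate days until usage reaches threshold * capacity.

    samples is a list of (day, used) pairs in chronological order.
    A least-squares line is fitted to the samples; the result is the
    number of days after the last sample at which the line crosses
    the limit, or None if usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None  # all samples taken on the same day

    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None  # no growth, so no projected exhaustion

    intercept = mean_y - slope * mean_x
    day_at_limit = (threshold * capacity - intercept) / slope
    last_day = samples[-1][0]
    return max(0.0, day_at_limit - last_day)
```

A straight-line fit is only a first approximation; real storage growth is often seasonal or bursty, so a forecast like this is better treated as a prompt for review than as a prediction.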
Change & Release Support
• Support infrastructure and application deployments, patches, and maintenance activities
• Validate system health before and after changes
Disaster Recovery & Resilience
• Support DR readiness for large-scale storage and application environments
• Participate in DR drills and failover validation
Reporting & Documentation
• Maintain operational dashboards, runbooks, and incident reports
• Provide daily/weekly health and SLA reports
| Organization | Net 2 Source |
| Industry | IT / Telecom / Software Jobs |
| Occupational Category | NOC Analyst |
| Job Location | Washington, USA |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Intermediate |
| Experience | 2 Years |
| Posted at | 2026-05-20 3:57 pm |
| Expires on | 2026-07-04 |