Denver Metro Area, Colorado
Security Clearance: Active TS/SCI Clearance is REQUIRED.
Hybrid Remote (2-3 days on site) OR 9/80 work week available
Role Summary and Position Objectives
This critical role applies software engineering principles to operations to build and run large-scale, fault-tolerant systems. You will be responsible for the continuous availability, scalability, and performance of mission-critical platforms used to support national security.
This position involves working with highly sensitive and classified information, requiring an Active Top Secret/SCI Security Clearance. A relocation package is available for this position.
As a Senior SRE, you will drive the architecture and implementation of reliable and efficient infrastructure by:
System Reliability & Resiliency: Ensuring the survivability and 24/7 uptime of mission-critical systems through robust design, proactive monitoring, and disaster recovery planning.
Automation and Toil Reduction: Designing, developing, and deploying automation tools and scripts to eliminate repetitive manual tasks (toil) across system administration, deployment, and configuration.
Infrastructure as Code (IaC): Developing and maintaining infrastructure using declarative tools (e.g., Terraform, Ansible) to ensure consistency, repeatability, and version control across all environments.
Configuration Management: Implementing and enforcing best practices for configuration using Policy as Code and Configuration as Code methodologies across large Linux environments.
Monitoring and Observability: Implementing advanced monitoring, logging, and alerting solutions that detect and resolve system issues based on symptoms, not just outages, and defining key Service Level Indicators (SLIs).
Incident Management: Serving as a technical leader during production incidents, conducting root cause analysis (RCA), and implementing preventative measures to drive continuous improvement.
Collaboration: Working closely with Software Development, Cyber Security, and Mission Operations teams across the entire Software Development Lifecycle (SDLC) to ensure services are designed for scalability and reliability.
Clearance & Experience: An Active TS/SCI Clearance combined with 5+ years of experience in a mission-critical SRE, DevOps, or highly-available Systems Engineering role.
Technical Depth: Expert-level administration and troubleshooting of Linux systems and strong proficiency in scripting languages (e.g., Python, Bash).
Leadership: Demonstrated success providing technical leadership, mentoring junior team members, and championing new ideas and SRE/DevOps best practices.
Communication: Strong presentation, documentation, and communication skills, with proven experience in negotiating technical solutions to meet challenging customer requirements.
Proactive Mindset: A commitment to ongoing learning and applying technology trends to solve operational challenges, always seeking win-win solutions.
Work/Life Balance: Flexible schedules, including the option for a 9/80 work schedule (every other Friday off).
Career Growth: An exciting career path with continuous learning, development, and advanced training opportunities.
Benefits: Competitive benefits, including 401(k) matching, flex time off, paid parental leave, comprehensive healthcare, health & wellness programs, and more.