Site Reliability Engineer
Responsibilities:
•Develops highly complex solutions (utilizing the available tech stack) to improve the ability to monitor application services effectively in a large-scale, complex environment. Suggests improvements to existing tools and monitoring thresholds.
•Provides highly complex technical assistance and operational guidelines for business operations and application development to ensure applications are running optimally in production, test, and development environments.
•Ensures that supported application services are highly available, reliable, and performant through monitoring, alerting, and notification. Designs, implements, and maintains new observability tools as necessary to ensure this coverage. Implements and maintains dashboards, bots, and other automation based on current operational needs and release changes. Evaluates improvements to the dashboards, bots, and other automation.
•Identifies repetitive, manual, and scalable tasks and automates them using scripting/programming languages or tools.
•Identifies key operational metrics and the data necessary to create them. Implements and maintains dashboards based on current operational needs. Tests and ensures that all infrastructure components meet proper performance and capacity standards.
Knowledge and Skill Areas:
•Advanced baseline knowledge of AWS Cloud Platform technologies, infrastructure, and practices in a production environment, including CloudWatch, CloudTrail, EKS, Lambda, Canaries, DynamoDB, RDS, PostgreSQL, S3, API Gateway, Elastic Load Balancer, OpenSearch, Grafana, AWS X-Ray, SQS, and Fault Injection Service (AWS FIS).
•GitLab, CDK (preferred), Terraform, Grafana, OpenSearch, Docker, and CI/CD pipelines.
•Coding languages, such as Python, TypeScript, Node.js, .NET, and Java; Infrastructure as Code, Configuration as Code, and Alerts and Monitoring as Code.
•Familiarity with deployment patterns and version control, the ITIL framework, resiliency concepts and disaster recovery, and chaos engineering.
Education and Experience:
Bachelor’s degree and a minimum of 5 years of related work experience
AWS Certifications
Equal Opportunity Employer. All qualified applicants will receive consideration for employment and will not be discriminated against based on race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, age, pregnancy, genetic information, or any other consideration prohibited by law or contract.
Must be legally authorized to work in the US without sponsorship for employment visa status now or in the future.
No third-party recruiting agencies, please.