DreamHost has an opportunity for a Site Reliability Engineer who is passionate about open source to join our team. In this role you’d collaborate with a small team of Developers and Engineers to keep our systems highly available, reliable, and well-performing. Succeeding in this role requires a combination of coding expertise and systems operations experience.
- Be the central point of contact on the health of our systems
- Communicate any areas of weakness or opportunities for improvement to all levels of stakeholders
- Design and develop solutions to improve the performance of our services
- Help guide systems architecture projects, from planning through execution
- Improve upon our incident response protocols
- Troubleshoot and fix bugs
- Document the development work that you’re responsible for
- Be an active participant in planning, standup, and retrospective meetings.
- Collaborate with the rest of your team to make code changes that solve problems for internal and external customers.
- Participate in code reviews, testing, and validation to ensure that the team creates quality, correct code.
- Be a part of an on-call rotation for responding to customer-facing emergencies.
- Perform solution-engineering activities such as defining technical specifications and prototyping.
- Participate in cross-departmental brainstorming sessions for deploying other DreamHost applications and services on DreamHost’s infrastructure.
- Maintain deep technical and business knowledge of DreamHost’s system architectures, ensuring continuous upgrade and integration of new capabilities.
- Exhibit a strong passion for developing, engineering, and automating highly scalable systems.
- BS/MS in Computer Science or relevant field preferred
- Possess a burning passion for innovation and open source technologies.
- Excellent communication skills.
- Fluent in multiple programming languages. Perl a plus.
- Strong Linux experience.
- Experience designing and implementing highly available systems.
- 5+ years of experience designing and writing complex, scalable applications with an emphasis on web applications.
- Experience with configuration management and container tooling such as Docker, Chef, Kubernetes, or Ansible.
- Experience with analytics and monitoring tools such as InfluxDB, Grafana, ELK or Prometheus
- Strong knowledge of continuous integration tools such as Jenkins or GitLab CI
- Good understanding of virtualization and infrastructure solutions including hypervisors, storage and network.
- Experience with container deployments using Kubernetes, Mesosphere, GCE, etc.
- Ability to dive deep into existing infrastructure and software to help solve problems
*It is DreamHost policy to provide equal employment opportunities to all employees and employment applicants.