The Platform team provides the foundation that empowers Fair’s engineers to build incredible things. We are expanding our team with a remote Senior Site Reliability Engineer position.
Reporting to the Platform Engineering Manager, you are customer-centric, have demonstrated expertise in multiple areas of software engineering, and are passionate about building operational excellence across our platform and delivering “5 Nines Availability” (99.999% uptime). This is a great opportunity for you if you have experience dealing with issues of scale, debugging low-level production problems, and improving the availability of systems.
Responsibilities:
- Oversee the site reliability and operation of our infrastructure and platform
- Design and champion SRE best practices from idea conception to delivery
- Maintain and evolve our Golang-based applications to provide a great experience for our customers (other engineers in the organization)
- Help grow the SRE wing of our platform team
- Automate processes as fully as possible to promote human-free operations
- Improve metrics on quality of service, incidents, and availability
- Participate in our follow-the-sun on-call duties while focusing on reducing incidents and the need for help through creative technological solutions or processes
- Help troubleshoot and debug production issues across all of our services
Required Qualifications:
- Bachelor's degree in computer science, applied mathematics, or a related field, or 5 years of equivalent work experience
- 8+ years of relevant work experience
- Relevant certifications in AWS, Kubernetes, Linux, database administration, networking, security, or Six Sigma are welcome
- 5+ years of experience in reliability engineering, software engineering, systems engineering, platform engineering, SRE, ops, or similar fields
- Expert knowledge of Go
- Expert knowledge of Linux
- Deep systems, cloud, and infrastructure knowledge
- Experience with the Python, Ruby, and Bash scripting languages
- Familiarity with AWS, Docker, Kubernetes, and Terraform
- Experience with integration and end-to-end testing in microservices environments
- Experience with microservices, distributed logging and tracing
- Experience managing monitoring systems
- Experience building CI/CD pipelines
- Familiarity with security concepts and best practices
- Experience with computer networks
Our Benefits:
- 100% coverage of medical, dental and vision benefits for employees AND their families
- Equity incentives
- Unlimited vacation package
- Up to four months of 100% paid parental leave
- Cell phone reimbursement
- 401(k)
- Employee referral rewards
- Diverse and inclusive culture
- Leadership, mentorship, and learning programs