Thursday, 13 February 2014

Senior Site Reliability Engineer #NFX01106 | Netflix, Inc. | Los Gatos, CA

Senior Site Reliability Engineer

Cloud and Platform Engineering

Netflix is the world's leading streaming video service, and our growth is accelerating. At Netflix, we are building new cloud management tools, pushing the limits of cloud-based technologies, and powering our explosive growth while at the same time improving the availability and reliability of our services.
 
In this role, your mission is to improve the availability of our distributed and cloud-based service. You can accomplish this by:

- Building automated alerting and visibility tools for Netflix Engineering teams
- Being the call leader for a service with millions of customers
- Working with individual service teams to adopt best practices for improving availability
- Inventing new best practices within our environment
 
About you:
You have been part of an operations or software engineering group that cared about getting that extra 9 of availability. You can jump on an outage, see it through to resolution, and then ask the right questions to keep the problem from recurring. You believe that automation is the only way to scale out a service and that any manual effort needs to be scrutinized, even if it is a 'one-off'.
 
While we actively seek out candidates who are familiar with our current stack, we care more about hiring people who can learn new technologies and adapt quickly.
 
Technologies we use:
- Linux on Amazon Web Services
- Git and Jenkins
- Python and Groovy for tool building
- Cassandra for scalable persistence
- More metrics than you can shake a stick at!
 


