Senior Site Reliability Engineer- Hadoop

Monday, 28 July 2014

Senior Site Reliability Engineer- Hadoop | Twitter, Inc. | San Francisco, CA

Infrastructure Operations | San Francisco, CA

About This Job

As a Hadoop Senior Site Reliability Engineer (SRE) at Twitter, you will work to improve the reliability and performance of our Hadoop clusters. You will work shoulder-to-shoulder with our engineering teams to design, build and operate the next generation of distributed storage and computation at Twitter, using technologies such as Apache Hadoop and Apache HBase in both batch-oriented and real-time contexts. The focus is on debugging, automation, availability, performance and, above all, efficiency at ‘reach every user on the planet’ scale. We have a wide range of opportunities for varying skill levels and experience.

Responsibilities

• Work within an engineering team to design, build, and maintain Hadoop clusters.
• Diagnose and troubleshoot complex distributed systems, and develop solutions that have a significant impact at our massive scale.
• Work cross-functionally with various teams such as: Analytics, Revenue, Growth, Linux kernel, JVM and Capacity Planning.
• Participate in building advanced tooling for testing, monitoring, administration, and operations of multiple clusters across datacenters, primarily in Python, Ruby, Shell and Java.
• Work with Hardware, Network, and Datacenter Operations teams to design next-gen storage and compute platforms.
• Work with open source technologies and have the freedom to release your work upstream to the open source community.
• Troubleshoot issues across the entire stack: hardware, software, application and network.
• Take part in a 24x7 on-call rotation.

Qualifications

• 5-7 or more years of experience managing services in a distributed, internet-scale *nix environment
• Familiarity with systems management tools (Puppet, Chef, Capistrano, etc)
• Demonstrable knowledge of TCP/IP, security and storage concepts
• Practical knowledge of shell scripting and at least one scripting language (Python, Ruby, Perl). Basic familiarity with troubleshooting Java, Python, Ruby and C/C++ code.
• Ability to prioritize tasks and work independently
• Track record of practical problem solving, excellent communication, and documentation skills
• BS or MS degree in Computer Science or Engineering, or equivalent experience.
• Plus: Experience with operating system internals, file systems, disk/storage technologies and storage protocols.
• Plus: Familiarity with debugging tools such as jstack, jmap, jhat and gdb

https://about.twitter.com/careers/positions?jvi=oa7HYfwG,Job
