Senior Site Reliability Engineer (Irvine/Remote)

Chaturbate · Aug 28th 2020

Apply on StackOverflow Careers

Site stats you will improve:

  • 728+ Nvidia P100/T4 GPUs

  • 32k+ physical cores across 24 carrier hotels with 6 Tbps capacity

  • 10k+ concurrent live video broadcasts

  • 400k+ concurrent live video streams

  • 26B+ weekly web requests

  • 95% of web requests completed in 59ms-72ms

  • 2M database queries per minute, average response 3.5ms

  • Redis Clusters handling 300k+ commands/sec

What you will do:

  • Performance analysis to identify sources of instability using data from APM and distributed telemetry tools

  • Analysis of complex systems to identify operational surprises and minimize downtime

  • Software engineering and patching to incrementally improve performance, scalability, and reliability

  • Infrastructure modifications in both a data center metal environment with advanced routing/switching and in the public cloud

  • Predictive failure analysis and disaster planning

  • Author new tools and automation to streamline the DevOps pipeline

  • Collaborate with Frontend/Backend engineering, QA, DevSecOps, and Data teams

  • Database and key-value store administration and configuration with a focus on uptime and performance

  • Incident response and postmortem reports

What you bring:

  • STEM degree and relevant experience as a Site Reliability Engineer

  • Exceptional problem solving skills

  • High proficiency in at least one programming language such as C, C++, Java, Python, or Go

  • High proficiency in Unix/Linux environment, excellent knowledge of internals (e.g., filesystems, system calls)

  • Networking knowledge (e.g., routing, switching, TCP stack) for both metal and cloud (VPC, Security Groups) environments

  • Experience in database administration and configuration

  • Experience with DevOps tools such as Ansible, Docker, and Kubernetes

  • On-call availability to respond to monitoring and alerting for core website functions as needed

  • Experience in growing data center teams (nice to have)

What you will receive:

  • A strong team of A-players

  • A robust engineering culture

  • Opportunity to make an impact on a highly popular product

  • Freedom to bring ideas to the table and to make technical decisions

  • Support and guidance from a highly professional and knowledgeable team

  • Flexible working environment

Recruiting Process

We value a sense of urgency and aspire to build a smooth and transparent recruiting process. The stages are:

  1. Phone screen with a recruiter

  2. Interview with CTO and Director of IT

  3. Team interview

We reserve the right to add selection stages to the process depending on the specific skills of each candidate.

Perks & Benefits:

  • Health and life insurance with dental and vision plans, 100% employer-sponsored for employees and dependents

  • 401k matching

  • Paid holidays, vacation, and sick days

  • Corporate Udemy account and professional development assistance
