Site Reliability Engineer

Higher Logic · Sep 10th 2020

Apply on StackOverflow Careers

This is a full-time position with the Engineering Team. The Site Reliability Engineer will work both collaboratively and independently on concurrent, complex projects to deliver technical solutions, execute roadmaps, and promote DevOps best practices within the organization. Success in this role depends on performing with a high degree of technical skill in a 24x7x365 global production environment while maintaining a positive attitude, a focus on solutions, and good working relationships with coworkers.

DevOps at Higher Logic has primary responsibility for service reliability in the production environment. We live at the juncture of Engineering, Operations, and Support, which means that we interact with large swathes of the company on a daily basis. The company frequently introduces new products, features and services; these changes require a flexible, thoughtful and forward-thinking approach to scalability and performance. As an SRE, you will have the opportunity to perform hands-on configuration and tuning of services while working to build out independent microservice architectures. Managing this environment requires a high level of individual knowledge and capabilities, coupled with optimism, focus, and close teamwork across the organization and the company.

Higher Logic operates at large scale, serving tens of millions of end users every day. The entire technical stack is well on its way to being fully cloud-native. No matter how much you know, you will learn and grow here.

Responsibilities

  • Assist with management and configuration of AWS cloud infrastructure components.

  • Support the Engineering team’s efforts by configuring and monitoring real-time alerting systems.

  • Help create strong feedback loops between all business lines for communicating, documenting and remediating operational incidents.

  • Actively support security and compliance functions.

  • Decrease incidence, scope and severity of operational failures (improve MTTR and MTBF).

  • Guide products to Production Readiness (scalability, observability, operability, resiliency, etc.).

  • Create, maintain and operate build and deployment automation and operations (CI/CD pipelines).

  • Provide tier three on-call technical support.

Qualifications

  • Familiarity with AWS services (EC2, S3, IAM).

  • Understanding of infrastructure as code (IaC) and the use of related tools such as CloudFormation, Terraform, Chef, Puppet, and Ansible.

  • Experience with SQL Server, IIS, HAProxy.

  • Desire to improve product, technology, people and process.

  • Appreciation of the value of diversity of opinions, approaches, and backgrounds.

  • Excellent communication and collaboration skills.

  • Understanding of the value provided by incremental solution delivery, POCs, MVPs, etc.

  • Bachelor’s degree or higher in Computer Science or MIS, or equivalent commercial experience.

Desired Qualifications

  • Two or more years of SRE work on Windows Server and Linux.

  • Experience with Autoscaling, RDS, Aurora, Postgres, ECS, Docker, Fargate, Redis, Memcached, S3, SQS, SES, SNS, Secrets Manager, Lambda, CloudWatch, Active Directory & ADFS, CI/CD, containers at scale.

  • Proficiency in at least one high-level language, such as Python, Bash, or PowerShell (C# preferred).

  • Familiarity with Agile (Kanban and Scrum).
