ShareStream · Nov 29th 2019
ShareStream Education is a leader in online video and media management solutions for academic institutions. Our team is passionate about building a great product that is continually evolving and providing a service that allows our customers to realize the vast potential of streaming media for education.
ShareStream Education is deeply committed to client success and to building strong relationships with our clients, whom we regard as partners.
Join us and contribute to changing the way online education takes place through the use of streaming media!
The Site Reliability Engineer will work remotely.
ShareStream is seeking a multitalented, dedicated Site Reliability Engineer who excels at automating engineering operations and building highly available, fault-tolerant systems. The Site Reliability Engineer will:
Enhance and operate the continuous integration and continuous delivery (CI/CD) pipeline for multiple applications
Operate the Kubernetes platform and perform day-to-day monitoring and maintenance
Automate upgrades, scaling, and other operational needs as required
Deploy new releases across multiple SaaS customers
Implement and operate a central logging solution as well as a central metrics solution
Develop operational playbooks and dashboards to monitor production SaaS environments
Contribute to managing AWS cost and resource usage
Work with the Engineering team to implement new technologies, including Istio, CephFS, Elasticsearch, and InfluxDB
Qualifications:
BS and/or MS degree in Computer Science or a related field
5+ years of engineering-operations experience for SaaS companies
Extensive experience building and operating distributed systems in Amazon Web Services (AWS)
Expert-level Linux skills (CentOS and Ubuntu)
Extensive experience with container-based software development and management using Docker and Kubernetes
Extensive experience with Jenkins
Extensive experience with Ansible, Chef, or Puppet
Expert in at least one scripting language, preferably Bash or Python
Intermediate-level software-development skills in Java or another object-oriented programming language are a strong plus
Experience managing backups and participating in disaster-recovery planning and testing is a strong plus