Booming Games Malta · Jan 14th 2021
We are growing and our Operations Department is looking for support to join our international team!
Responsibilities
Monitor and maintain systems across different geographical locations on a daily basis, ensuring that hardware, software, applications and the network operate at peak performance
Perform deep dives into both systemic and latent reliability issues; partner with software and systems engineers across the organization to produce and roll out fixes
Troubleshoot issues across the entire stack: hardware, software, application and network
Drive standardization efforts across multiple disciplines and services in conjunction with SREs throughout the organization
Identify and drive opportunities to improve automation for the company; scope and create automation for deployment, management and visibility of our services
Represent the SRE organization in design reviews and operational readiness exercises for new and existing services
Work with software engineers to improve deployment processes
Participate in the on-call rotation for production systems
Requirements
Sound fundamentals in operating systems, networking, and distributed systems
Strong familiarity with Linux systems administration and management best practices
Familiarity with container technologies: Kubernetes, CRI, Docker, namespaces, cgroups
Strong understanding of: Ethernet, VLANs, IPv4/IPv6, ARP, DHCP, DNS, and TCP
Familiarity with distributed system problems: leader election, Raft consensus, etc.
Solid understanding of systems and application design, including the operational trade-offs of various designs
Expert-level understanding of at least one public or private cloud technology such as Amazon AWS, Google GKE, or OpenStack
Practical knowledge of various aspects of service design, including messaging protocols and behavior, caching strategies and software design practices
Practical, intermediate knowledge of shell scripting; some Ruby is a plus
Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures
Excellent knowledge of Linux/UNIX systems administration and performance tuning
Comfortable configuring DNS, DHCP, and LAN/WAN technologies
Minimum of 5 years managing services in an internet-scale *nix environment
Must be able to communicate well with technical as well as non-technical colleagues to achieve business goals
Must be adaptable and able to focus on the simplest, most efficient and reliable solutions
Track record of practical problem solving, with excellent documentation skills and excellent written and interpersonal communication in English
Curiosity and an interest in networking, systems software, and distributed systems
Experience as a systems administrator or operations engineer
Experience with a 24/7 production environment
Experience with managed deployments providing software, platforms, or infrastructure as a service
Experience with Mellanox- and Vyatta-based networking gear is a plus
Experience with SuperMicro server and storage gear is a plus