Head of Infrastructure and DevOps
THE SHORT VERSION:
Why is now the time to join Ably?
We have just completed a financing round to fuel our growth. We have the best technology and the best people in the industry. Join us now and you’ll be in early at a business that is going places, you’ll learn a lot, you’ll work with the founding team, and you’ll have fun.
What makes Ably special?
Ably helps power next-generation digital experiences: ones which are live rather than static, where data is in motion rather than at rest. Things like live chat, realtime location tracking, live document collaboration, gaming and elearning. One of our customers even uses Ably for their air traffic control system for drones. Working at Ably means you are working on a cutting-edge product that is helping build the future.
What we can offer you
You will learn with the best. You will have autonomy and freedom to experiment and improve. You will be part of a dynamic team and a business that is taking off.
What we want in return
We want someone smart, ambitious, curious and motivated. Someone who is prepared to do their best and work their arse off to do great work and become outstanding at what they do.
SOME MORE DETAIL:
Ably is a global realtime data delivery platform that powers low-latency communication between internet-enabled devices. We solve the hardest parts of delivering the service reliably at scale so developers don’t have to. At its core, we provide a transport for developers to deliver realtime applications, notifications, data syndication, and synchronization at Internet scale. Our product is offered as a multi-tenanted, globally distributed, elastic, and completely redundant platform-as-a-service.
The Ably platform is a second-generation realtime platform, built from the ground up over more than three years to uniquely solve the realtime problems of the future, at scale. These problems include protocol agnosticism, guaranteed message delivery and reliable ordering, massive fan-out or fan-in, and service continuity during network outages and data centre failures.
We're delivering billions of messages to millions of devices for global companies like Yahoo, Computer Associates and Offerup. We're excited by the fact we're only getting started; we're growing quickly and on course to soon deliver trillions of messages.
We have a developer-first mindset in everything we do. We hide the complexity of our distributed interoperable platform and package it up for developers as a service with a simple API, great documentation and proactive support. We're looking to grow our team with great like-minded people.
You'll be responsible for maintaining and improving our global distributed infrastructure and services. You will be working alongside our deeply technical engineering team, who collectively bring a wealth of experience and broad technology skills, and in time you will build the infrastructure management team internally.
We are strong believers in using the right tools for the job when they exist. Where they don't, we've built a whole host of orchestration tools and shared services to help us deliver our global platform. Within our infrastructure, everything is automated, mostly covered by tests, completely replicable and ephemeral in design. The calibre of the infrastructure automation and services code, like that of our realtime service, is what excites and motivates us each day. If you enjoy solving hard architecture and infrastructure problems at tremendous scale, then you'll love working at Ably.
Our team is currently made up of a strong remote contingent; however, our base is in London and growing. This pivotal role in our team requires you to be working primarily from our London office, with plenty of flexibility with regard to your working hours and working from home.
Our infrastructure stack:
- Mostly AWS based, but this will likely include other clouds in future.
- Service languages: Go, Elixir, Node.js and some C.
- Infrastructure languages: Ruby, Bash.
- Architecture: Exclusively Docker containers for all services; servers are effectively ephemeral and disposed of frequently; code is packaged as slugs; data centres (circa 20) are isolated and autonomous; critical shared services always have redundancy baked in; manual configuration of any infrastructure is disallowed (all changes are rolled out using source control, environment-based configs and CLI commands).
- Data services: Cassandra (our realtime datastore, 3 regions, 6 data centres), Influx, Elastic, Kibana, Grafana, etc.
- Web: We use Rails & Heroku for simplicity. The web service is not part of our "core product" and thus has reduced uptime requirements.
See and for a taster on the lengths we go to at each layer in the stack to ensure 100% service uptime.
Day to day you can expect to be working on:
- Writing Ruby code for our infrastructure automation, orchestration, configuration and continuous integration testing of our infrastructure.
- Writing Go code for our core routing, worker and other shared services.
- Making extensive use of a wide range of AWS services. Whilst we primarily use AWS for our infrastructure, in time we expect that to change as we span other cloud services.
- Managing and developing our continuous integration services that test every aspect of the service, from infrastructure tools to our health servers, routers, realtime services, protocol adaptors and client libraries. Our CI environment is mature, yet we would like to continue to evolve it to help improve the robustness of the platform and reduce the risk of regressions.
- Being exposed to our other development environments such as Node.js and Elixir, both used extensively in our realtime services.
- Working with the realtime engineering team to ensure our infrastructure supports the ever changing networking, security and processing requirements.
- Collaborating with the team to design, discuss and implement new features and services.
- Diagnosing and fixing bugs in all areas of our platform. You will often be working at very low levels in the network stack to help diagnose difficult-to-identify distributed problems.
- Working with the engineering team to enable them to take responsibility for the complete lifecycle of the features and code they deliver, i.e. pull request, reviews, testing, deploy to staging and sandbox environments, then into production environments. We are strong believers in all developers being responsible for deploying their own code.
- Contributing to open source projects that we support or use in our products. All of our client libraries are open source as well and may require your support at times.
- Helping customers solve problems they are experiencing that may help us find bugs in the platform.
- Supporting the wider team with documentation and customer support.
- Suggesting new features or improvements to our protocol and API specifications.
What we offer:
- Salary range: £60k to £90k.
- Employee options: Yes, negotiable.
- Holidays: 25+ days excluding national holidays.
- Benefit from a flexible working environment in which working from home and managing your own working time is the norm.
- Work in an environment where code quality, technical challenges and delivery are what we all care about.
- Skills development is intrinsic to the job. We're largely working on unsolved problems each day, and as such, there is plenty of scope to widen your knowledge and skillset.
- Work with genuinely nice people who care.
**** NO AGENCIES PLEASE ****
What we are looking for:
- Experience: A minimum of three years of professional experience with Ruby and Go. Our infrastructure automation and orchestration layer requires you to be proficient in Ruby. Our shared services and routing layers require you to be proficient in Go. You should have experience using both statically and dynamically typed languages. Experience with Node.js and Elixir/Erlang is beneficial. You must have solid experience managing infrastructure and CI environments, and experience with distributed or large-scale infrastructure management is preferred.
- Leadership: Proven experience building and managing a team.
- Pragmatic: A problem solver excited by the prospect of working autonomously to solve problems and bring solutions to the team.
- Fast Learner: We’re looking for software engineers who thrive on applying their knowledge and learning new technologies. Our stack is diverse, and we expect it to continue to grow.
- Testing: Experience using testing frameworks and adopting test-driven development where applicable.
- Communication: We use tools such as Slack throughout the day to communicate; however, we believe in voice conversations to discuss and solve problems. You must be proficient in spoken and written English, be eager to collaborate with the engineering team and constructively welcome code reviews.
- Customers: Comfortable talking to customers and assisting them with their technical issues and integration.
- Open source: We prefer developers who have contributed back to the open source community, even if those contributions are small.