Scrapinghub · Apr 9th 2019
About the Job:
Crawlera is a smart downloader designed specifically for web crawling and scraping, removing the headaches of proxy management. It is part of the Scrapinghub platform, the world’s most comprehensive web crawling stack, which powers crawls of over 8 billion pages per month.
As an Erlang developer, you will help ensure the robustness of our services. You will learn to investigate production issues on a server executing customer requests. You will navigate a large codebase and find the least intrusive place for extensions. Besides the technical work, you will gain a holistic view of the product and improve the usability of the system with every task you complete. In this role, you will take part in brainstorming and delivering improvements to the core of Crawlera.
Job Responsibilities:
Develop, maintain and support a high-load distributed system.
Analyze our current and historical Crawlera usage to augment and enhance its routing and rotation logic.
Leverage the Scrapinghub platform to provide extended functionality, both to end users and for internal purposes.
Identify and resolve performance and scalability issues in distributed crawling.
Liaise with other platform teams to give Crawlera the best possible integration with the growing Scrapinghub platform.
Requirements:
2+ years of production experience with Erlang.
Strong communication in written and spoken English.
Strong knowledge of Linux/UNIX, HTTP and networking.
Python or Golang experience.
Familiarity with techniques and tools for crawling, extracting, and processing data.
Knowledge of ELK, Graylog, Docker and Mesos.
Strong record of open source activity.
Experience working with Lean principles and a Scrum SDLC.