Scrapinghub · Feb 4th 2019
About the Job:
Your key objective will be to advance Scrapinghub’s knowledge of web technologies and web scraping best practices.
This is not a production role. Instead, you’ll be given the time and resources to test hypotheses iteratively and with scientific rigor, and to produce a research-backed knowledge base for other developers at Scrapinghub.
Despite not working on specific customer projects, your work will help fuel growth across all of Scrapinghub’s Data business (Professional Services & Data on Demand). Your measures of success will be your ability to iterate quickly and produce assets that are useful to other Shubbers.
Job Responsibilities:
Create and execute well-designed experiments (repeatable, multiple treatments, testable variables, controls, replication) to learn more about how best to complete web scraping projects
Produce well-written, indexed reports of your findings (similar to publishing in an academic journal, though not nearly as lengthy)
Propose new experiments to run
Work with the Team Lead to prioritize the backlog of experiments
Maintain best practice guides for other Shubbers who will be implementing client solutions based on your findings
Propose changes to Scrapinghub’s other products (Crawlera, Scrapy Cloud, etc.) or Scrapy itself based on your findings
Job Requirements:
Excellent communication in written English.
A strong understanding of the Scientific Method and the ability to continuously implement a process that follows it with rigor.
Take a logical, measurement-backed approach to prioritizing projects, and enjoy working with others who do the same.
Familiarity with techniques and tools for crawling, extracting and processing data, asynchronous communication and distributed systems.
A strong knowledge of Python along with a broad general programming background; strong problem solver.
Enjoy working across several teams and communicating with your end customers (other Shubbers).