Scrapinghub · Feb 4th 2019
About the Job:
Your key objective will be to advance Scrapinghub’s knowledge of market technologies around web scraping, so we better understand how to harvest web data more effectively and efficiently.
This is not a production role. Instead, you’ll be given the time and resources to quickly hack together proofs of concept, test them, and produce a knowledge base for other developers at Scrapinghub (Shubbers). You’ll use Scrapinghub’s best-in-class tools, including Crawlera, the world’s smartest proxy network, designed specifically for web crawling and scraping.
Your measures of success will be your ability to iterate quickly and produce knowledge that is useful to other Shubbers.
Job Responsibilities:
Create and execute well-designed tests (repeatable, multiple treatments, testable variables, controls, replication) to learn how to best complete web scraping projects
Produce well-written reports of your findings for other Shubbers
Propose new ideas to test
Work with the Team Lead to prioritize the backlog of experiments
Maintain best practice guides for other Shubbers who will be implementing client solutions based on your findings
Propose changes to Scrapinghub’s other products (Crawlera, Scrapy Cloud, etc.) or to Scrapy itself based on your findings
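The responsibilities above center on repeatable, controlled experiments. As an illustration only (not an actual Scrapinghub tool), a minimal harness for comparing two treatments under identical seeded conditions might look like this, with the treatments here being simulated stand-ins for, say, two retry policies:

```python
import random
import statistics

def run_experiment(treatments, trials, seed=42):
    """Run each treatment the same number of trials under the same
    seeded conditions, so the experiment is repeatable and the
    treatments see paired inputs."""
    results = {}
    for name, treatment in treatments.items():
        rng = random.Random(seed)  # same seed per treatment -> paired trials
        outcomes = [treatment(rng) for _ in range(trials)]
        results[name] = {
            "mean": statistics.mean(outcomes),
            "stdev": statistics.pstdev(outcomes),
        }
    return results

# Hypothetical treatments: deterministic simulations, not real requests.
def baseline(rng):
    return 1.0 if rng.random() < 0.80 else 0.0   # ~80% simulated success

def candidate(rng):
    return 1.0 if rng.random() < 0.90 else 0.0   # ~90% simulated success

summary = run_experiment({"baseline": baseline, "candidate": candidate},
                         trials=500)
```

In a real experiment the treatments would issue actual crawls and the summary would feed into a written report; the seeded RNG is what makes a rerun reproduce the same numbers.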
Job Requirements:
A strong knowledge of Python and a broad general programming background; a strong problem solver.
Familiarity with techniques and tools for crawling, extracting, and processing data, as well as asynchronous communication and distributed systems.
A strong understanding of the Scientific Method and the ability to apply it rigorously and consistently.
Take a logical, measurement-backed approach to prioritizing projects, and enjoy working with others who do the same.
Enjoy working across several teams and communicating with your end customers (other Shubbers).
Excellent communication in written English.
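The asynchronous-communication requirement above is the core of efficient crawling: issue many requests concurrently so total time tracks the slowest response, not the sum. A minimal stdlib-only sketch of that pattern (the URLs and `fetch`/`crawl` names are illustrative, and the network call is simulated with a sleep):

```python
import asyncio

async def fetch(url, delay=0.01):
    """Stand-in for an async HTTP fetch: waits, then returns a fake body.
    In a real crawler this would be an aiohttp or Twisted request."""
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"

async def crawl(urls):
    # Schedule all fetches concurrently and gather results in order.
    tasks = [fetch(url) for url in urls]
    return await asyncio.gather(*tasks)

pages = asyncio.run(crawl(["http://example.com/a", "http://example.com/b"]))
```

With real requests the same structure lets a single process keep hundreds of connections in flight, which is what frameworks like Scrapy build on.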