New to Streaming Scraper

An in-progress web scraping project built with Python, R, and SQL. It retrieves TV and movie data from two sources, transforms the data, and stores them in a MySQL database.

The two sources are What's on Netflix (WON) and Rotten Tomatoes (RT). RT data are cleaned and transformed with Python, while WON data are cleaned and transformed with R.
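
As a rough illustration of the Python cleaning step for RT data, here is a minimal sketch. The record layout (`title`, `score`, `available`), the date format, and the function name `clean_rt_record` are assumptions made for this example, not the project's actual schema or code.

```python
from datetime import datetime
from typing import Optional


def clean_rt_record(raw: dict) -> dict:
    """Normalize one scraped record (hypothetical field names)."""
    # Collapse repeated whitespace and trim the title.
    title = " ".join((raw.get("title") or "").split())

    # Turn a score such as "88%" into an integer; anything else becomes None.
    score_text = (raw.get("score") or "").strip().rstrip("%")
    score: Optional[int] = int(score_text) if score_text.isdigit() else None

    # Parse a date such as "Mar 31, 2021" into ISO format (assumed input format).
    date_text = (raw.get("available") or "").strip()
    try:
        available = datetime.strptime(date_text, "%b %d, %Y").date().isoformat()
    except ValueError:
        available = None

    return {"title": title, "score": score, "available": available}
```

Missing or malformed values are mapped to `None` so that they can be stored as `NULL` in the database rather than stopping the run.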

All data are piped into a MySQL database, then retrieved for presentation in R.
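
The store-then-retrieve step can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` module as a stand-in for MySQL; the table name `titles`, its columns, and both function names are hypothetical. In this project the retrieval side is done in R, so `fetch_titles` is here only to show the round trip.

```python
import sqlite3


def store_titles(conn: sqlite3.Connection, rows: list) -> None:
    """Insert (title, source, score) rows into a hypothetical `titles` table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS titles (
            title  TEXT NOT NULL,
            source TEXT NOT NULL,  -- e.g. 'RT' or 'WON'
            score  INTEGER,
            PRIMARY KEY (title, source)
        )
        """
    )
    # Re-running the scraper replaces existing rows instead of duplicating them.
    conn.executemany(
        "INSERT OR REPLACE INTO titles (title, source, score) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def fetch_titles(conn: sqlite3.Connection, source: str) -> list:
    """Return (title, score) pairs for one source, ordered by title."""
    cur = conn.execute(
        "SELECT title, score FROM titles WHERE source = ? ORDER BY title",
        (source,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_titles(conn, [("Example Show", "RT", 90), ("Example Film", "WON", None)])
    print(fetch_titles(conn, "RT"))
```

Against a real MySQL server the SQL would need small changes: common Python MySQL drivers use `%s` placeholders rather than `?`, and `INSERT OR REPLACE` is SQLite syntax (MySQL offers `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`).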

Here is a high-level look at the pipeline:

[Figure: new-to-streaming pipeline diagram]

Current Directory Tree

[Figure: current directory tree]

GitHub

https://github.com/charlesdungy/new-to-streaming-scraper