How to Download & Unzip Zip Files in Python
Just a quick tutorial
Processing a large amount of data on a local machine can be a hassle sometimes, especially if our primary purpose is rather exploratory. I ran into this problem when I wanted to see the overall trend of open policing data.
The Stanford Open Policing Project gathers data on vehicular and pedestrian stops made by the police across the country. They offer a very well-organized series of datasets divided by location. Instead of collecting every file and building one large compilation of all available records, I needed only part of each dataset and could discard the rest.
So here I outline how I batch-processed the files: downloading multiple zip files from their website, extracting the CSV file from each, filtering the rows to a specific timeframe, and merging the results into a single data frame.
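As a rough preview of that workflow, here is a minimal sketch that uses only the standard library. It is not the exact code from this project: the function names, the `date` column name, and the assumption that dates are ISO-formatted (`YYYY-MM-DD`) are all placeholders for illustration, and it uses the `csv` module instead of a data frame library to stay dependency-free.

```python
import csv
import io
import zipfile
from datetime import date
from urllib.request import urlopen


def fetch_zip_bytes(url):
    """Download a zip archive and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def rows_from_zip(zip_bytes, start, end, date_column="date"):
    """Yield CSV rows from the archive whose date falls within [start, end].

    `date_column` is an assumed column name; adjust it to match the data.
    """
    # Open the archive in memory so nothing is written to disk.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore anything in the archive that is not a CSV
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    try:
                        day = date.fromisoformat(row[date_column])
                    except (KeyError, TypeError, ValueError):
                        continue  # skip rows with a missing or malformed date
                    if start <= day <= end:
                        yield row


def collect_rows(urls, start, end):
    """Download every archive in `urls` and merge the filtered rows."""
    rows = []
    for url in urls:
        rows.extend(rows_from_zip(fetch_zip_bytes(url), start, end))
    return rows
```

Because each archive is filtered as it is read, only the rows inside the timeframe are kept in memory. The resulting list of dictionaries can be passed to `pandas.DataFrame(rows)` if a data frame is the end goal.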
Web Scraping Links
First, we will collect all the download links available on their site. There are separate files for each available location.
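One way to gather those links, sketched with the standard library's `html.parser`, is shown below. The class and function names are illustrative, `page_url` stands in for whichever page lists the downloads, and the sketch assumes the download links appear as ordinary `<a href="...">` tags ending in `.zip` in the page's HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points to a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".zip"):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def extract_zip_links(html, base_url):
    """Return absolute URLs for all .zip links found in an HTML string."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch_zip_links(page_url):
    """Download a page and return the .zip links it contains."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_zip_links(html, page_url)
```

Keeping the parsing in `extract_zip_links` separate from the download makes it easy to try the logic on a saved copy of the page before making any network requests.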