Scrape latest data

Scrapes every available dataset from Socrata and stores them as newline-delimited JSON in this repository, to track changes over time through Git scraping.
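Git scraping works best when the stored files are byte-for-byte stable between runs, so that a commit shows only real changes. The following is a minimal sketch of one way to serialize records deterministically as newline-delimited JSON. It is not taken from this repository: the helper name `to_ndjson` and the assumption that each record carries an `id` field are hypothetical.

```python
import json


def to_ndjson(records, key="id"):
    """Serialize records as newline-delimited JSON in a stable order.

    Hypothetical helper: assumes each record is a dict and that `key`
    (here "id") identifies it. Sorting the records and their keys keeps
    the output identical when the upstream data has not changed.
    """
    ordered = sorted(records, key=lambda r: str(r.get(key, "")))
    lines = [json.dumps(r, sort_keys=True, ensure_ascii=False) for r in ordered]
    return "".join(line + "\n" for line in lines)
```

With sorted records and sorted keys, a Git diff of the output file shows one changed line per changed dataset instead of noise from reordering.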

  • socrata/ contains the latest datasets for a specific domain. This is updated twice a day.
  • socrata/ contains page view and download counts. This is updated only once a week; otherwise every fetch would record changed counts for many different datasets.

Run python socrata/ to scrape the data from Socrata and save it in the socrata/ directory.

Add --stats to include page view and download statistics in separate files.

Add --verbose for verbose output.
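As an illustration of how the two flags above could be declared, here is a short sketch using the standard library's `argparse`. The function name `parse_args` and the help strings are assumptions; the actual script may parse its options differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags described above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Scrape datasets from Socrata")
    # Both flags are simple on/off switches that default to False.
    parser.add_argument("--stats", action="store_true",
                        help="also write page view and download statistics")
    parser.add_argument("--verbose", action="store_true",
                        help="print progress information")
    return parser.parse_args(argv)
```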

Run this command to build a SQLite database from the .jsonl files in socrata/:

python socrata.db socrata
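For readers who want to see what such a build step involves, the sketch below loads every .jsonl file in a directory into a SQLite database using only the standard library. It is an assumption-laden illustration, not the command above: the function name `build_db`, the one-table-per-file layout, and the single `record` column holding raw JSON are all hypothetical choices.

```python
import json
import sqlite3
from pathlib import Path


def build_db(db_path, jsonl_dir):
    """Load each .jsonl file in jsonl_dir into its own table (hypothetical layout).

    Every line is stored as JSON text in a single `record` column, so no
    knowledge of the dataset schema is needed. Returns rows loaded per table.
    """
    conn = sqlite3.connect(db_path)
    counts = {}
    for path in sorted(Path(jsonl_dir).glob("*.jsonl")):
        table = path.stem
        # Quote the table name, since file stems may contain dots or dashes.
        quoted = '"' + table.replace('"', '""') + '"'
        conn.execute(f"CREATE TABLE IF NOT EXISTS {quoted} (record TEXT)")
        with path.open(encoding="utf-8") as fh:
            # Skip blank lines; re-serialize so each stored record is valid JSON.
            rows = [(json.dumps(json.loads(line), sort_keys=True),)
                    for line in fh if line.strip()]
        conn.executemany(f"INSERT INTO {quoted} (record) VALUES (?)", rows)
        counts[table] = len(rows)
    conn.commit()
    conn.close()
    return counts
```

Storing raw JSON keeps the loader independent of the dataset fields; individual fields can still be queried with SQLite's JSON functions where they are available.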


