Crawl and Visualize ICLR 2021 OpenReview Data

This Jupyter Notebook contains the data crawled from the ICLR 2021 OpenReview webpages, together with visualizations of that data. The list of submissions (sorted by average rating) can be found here.

Prerequisites

  • Python 3.7
  • selenium
  • pandas
  • seaborn
  • imageio
  • wordcloud
  • tqdm
  • edgewebdriver
    • NOTE: You can also use chromedriver by setting driver = webdriver.Chrome('chromedriver.exe').
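The one-line form in the note above passes the driver path as a positional argument, which matches the older Selenium 3 style. As a rough sketch, assuming Selenium 4 is installed instead, the path is passed through a Service object. The helper name make_driver and its parameters are hypothetical and are not taken from the crawl scripts.

```python
def make_driver(browser="edge", driver_path=None):
    """Return a Selenium WebDriver for Edge or Chrome (hypothetical helper)."""
    browser = browser.lower()
    if browser not in ("edge", "chrome"):
        raise ValueError(f"unsupported browser: {browser!r}")

    # Imported lazily so the argument check above works without Selenium.
    from selenium import webdriver

    if browser == "chrome":
        from selenium.webdriver.chrome.service import Service
        driver_cls = webdriver.Chrome
    else:
        from selenium.webdriver.edge.service import Service
        driver_cls = webdriver.Edge

    # Without an explicit path, Selenium tries to locate the driver itself.
    service = Service(executable_path=driver_path) if driver_path else Service()
    return driver_cls(service=service)
```

For example, make_driver("chrome", "chromedriver.exe") would stand in for the Chrome line in the note. Call driver.quit() when the crawl finishes so the browser process is released.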

Crawl Data

  1. Run crawl_paperlist.py to crawl the list of papers (~0.5h).
  2. Run crawl_reviews.py to crawl the reviews (~1.5h).
    • NOTE: currently only review ratings are crawled.
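Once the ratings are crawled, the submissions can be ranked by average rating. The sketch below assumes each rating arrives as a string that starts with the numeric score, such as "6: Marginally above acceptance threshold"; that format, the function names, and the sample data are assumptions for illustration and do not describe what crawl_reviews.py actually stores.

```python
import re
from statistics import mean


def parse_rating(text):
    """Extract the leading integer score from a rating string."""
    match = re.match(r"\s*(\d+)", text)
    if match is None:
        raise ValueError(f"no numeric rating in {text!r}")
    return int(match.group(1))


def rank_by_average(reviews):
    """reviews maps a paper title to its list of rating strings."""
    rows = []
    for title, texts in reviews.items():
        scores = [parse_rating(t) for t in texts]
        if scores:  # skip papers that have no reviews yet
            rows.append((title, scores, mean(scores)))
    # Highest average first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up sample data, not real submissions.
sample = {
    "Paper B": ["4: Ok but not good enough", "5: Marginally below threshold"],
    "Paper A": ["6: Marginally above acceptance threshold", "7: Good paper"],
}
ranked = rank_by_average(sample)
for title, scores, average in ranked:
    print(title, scores, average)
```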

Visualization

Keywords Frequency

The 50 most common keywords (case-insensitive) and their frequencies:

[Figure: keywords]
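A minimal sketch of how such a case-insensitive frequency count could be computed with the standard library. It assumes each submission carries a list of keyword strings and counts a keyword at most once per submission; both choices, along with the function name keyword_counts, are assumptions rather than the notebook's actual code.

```python
from collections import Counter


def keyword_counts(keyword_lists, top_n=50):
    """Return the top_n (keyword, count) pairs, ignoring case."""
    counter = Counter()
    for keywords in keyword_lists:
        # Normalize case and whitespace; the set counts each keyword
        # once per submission.
        counter.update({k.strip().lower() for k in keywords if k.strip()})
    return counter.most_common(top_n)


# Made-up sample data.
sample = [
    ["Deep Learning", "Reinforcement Learning"],
    ["deep learning", "Graph Neural Network"],
    ["Representation Learning", "deep learning "],
]
top = keyword_counts(sample, top_n=2)
print(top[0])
```

The resulting pairs can be turned into a dict and used as frequency input for a bar chart or a word cloud.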

Keywords Cloud

The word cloud built from the submission keywords shows the hot topics, including deep learning, reinforcement learning, representation learning, and graph neural networks.

[Figure: wordcloud]

Ratings Distribution

The distribution of reviewer ratings centers around 5 (mean: 5.169).

[Figure: ratings_dist]
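The distribution and its mean can be summarized without any plotting library. The sketch below assumes the ratings are already available as integers; the sample values are made up and will not reproduce the reported mean.

```python
from collections import Counter
from statistics import mean


def rating_histogram(ratings):
    """Return (score, count) pairs sorted by score."""
    counts = Counter(ratings)
    return [(score, counts[score]) for score in sorted(counts)]


# Made-up sample ratings.
sample_ratings = [3, 4, 5, 5, 5, 6, 6, 7]
hist = rating_histogram(sample_ratings)
avg = mean(sample_ratings)

# Simple text histogram, one '#' per review.
for score, count in hist:
    print(f"{score:>2} | {'#' * count}")
print(f"mean: {avg:.3f}")
```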

Keywords vs Ratings

The average reviewer ratings and keyword frequencies suggest that, to maximize your chance of getting higher ratings, you could use keywords such as deep generative models or normalizing flows.

[Figure: keyword_ratings]
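One way to relate keywords to ratings is to average the ratings of all submissions that share a keyword. The sketch below assumes each paper is given as a (keywords, average_rating) pair; the function name, the min_count threshold, and the sample data are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def keyword_average_ratings(papers, min_count=2):
    """papers: iterable of (keywords, average_rating) pairs."""
    by_keyword = defaultdict(list)
    for keywords, rating in papers:
        # Normalize case so "Deep Learning" and "deep learning" match.
        for keyword in {k.strip().lower() for k in keywords}:
            by_keyword[keyword].append(rating)
    rows = [
        (keyword, len(values), mean(values))
        for keyword, values in by_keyword.items()
        if len(values) >= min_count  # drop keywords seen too rarely
    ]
    # Highest average rating first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up sample data.
sample = [
    (["Normalizing Flows", "Deep Learning"], 6.5),
    (["normalizing flows"], 6.0),
    (["Deep Learning"], 5.0),
    (["Graph Neural Network"], 7.0),
]
table = keyword_average_ratings(sample)
for keyword, count, average in table:
    print(keyword, count, average)
```

The min_count filter keeps a keyword that appears on a single highly rated paper from dominating the ranking, which is why frequency is worth reading alongside the average.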
