# Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

This repository contains code for the paper *Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time* by Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon*, Simon Kornblith*, and Ludwig Schmidt* (* denotes equal contribution, alphabetical ordering).

Using this repository, you can reproduce the figure below, which shows that model soups (averaging the weights of multiple fine-tuned solutions) can outperform the best individual model. As an alternative to this repository, Cade Gordon has made a Colab notebook to explore model soups on CIFAR10.

## Code

There are five steps to reproduce the figure above: 1) downloading the models, 2) evaluating the individual models, 3) running the uniform soup, 4) running the greedy soup, and 5) making the plot.

Note that any of these steps can be skipped, i.e., you can immediately generate the plot above via `python main.py --plot`. You can also run the greedy soup without evaluating the individual models. This is because we have already completed all of the steps and saved the results files in this repository (e.g., `individual_model_results.jsonl`). If you do decide to rerun a step, the corresponding results file or plot is deleted and regenerated.

The exception is step 1, downloading the models. If you wish to run steps 2, 3, or 4 you must first run step 1.

### Installing dependencies and downloading datasets

To install the dependencies, either run the following code or see environment.md for more information.

```
conda env create -f environment.yml
conda activate model_soups
```

To download the datasets, see datasets.md. When required, set `--data-location` to the `$DATA_LOCATION` used in datasets.md.

### Step 1: Downloading the models

`python main.py --download-models --model-location <where models will be stored>`

This stores the models in the directory given by `--model-location`.

### Step 2: Evaluate individual models

`python main.py --eval-individual-models --data-location <where data is stored> --model-location <where models are stored>`

Note that this will first delete and then rewrite the file `individual_model_results.jsonl`.
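Results files such as `individual_model_results.jsonl` use the JSON Lines layout, with one JSON object per line. The following is a minimal sketch of how such a file could be loaded for inspection. The helper `read_jsonl` and the fields `name` and `score` in the demo are hypothetical and are not the repository's actual schema; check the keys in the real file before relying on them.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Return a list with one parsed JSON object per non-empty line."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records


# Demo on a throwaway file. The field names below are made up for
# illustration and are NOT taken from the repository's results files.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_results.jsonl")
    with open(demo_path, "w") as f:
        f.write(json.dumps({"name": "model_0", "score": 0.5}) + "\n")
        f.write(json.dumps({"name": "model_1", "score": 0.75}) + "\n")
    records = read_jsonl(demo_path)

# Pick the record with the highest (hypothetical) score.
best = max(records, key=lambda r: r["score"])
print(best["name"])
```

To look at the real file, `read_jsonl("individual_model_results.jsonl")[0].keys()` would list the fields that are actually stored.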

### Step 3: Uniform soup

`python main.py --uniform-soup --data-location <where data is stored> --model-location <where models are stored>`

Note that this will first delete and then rewrite the file `uniform_soup_results.jsonl`.
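Conceptually, a uniform soup is the element-wise average of the weights of all fine-tuned models. The sketch below illustrates the idea in plain Python, with each model represented as a dictionary from parameter names to lists of floats. This representation, the function name `uniform_soup`, and the toy parameter names are assumptions made for illustration; it is not the code in `main.py`, which operates on real model checkpoints.

```python
from typing import Dict, List

# Assumed toy representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def uniform_soup(models: List[Weights]) -> Weights:
    """Average every parameter element-wise across all models."""
    if not models:
        raise ValueError("need at least one model")
    n = len(models)
    soup: Weights = {}
    for name in models[0]:
        # zip(*...) groups the i-th value of this parameter from each model.
        columns = zip(*(m[name] for m in models))
        soup[name] = [sum(values) / n for values in columns]
    return soup


# Two toy "models" with identical parameter names and shapes.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

soup = uniform_soup([model_a, model_b])
print(soup)
```

The averaged model has the same number of parameters as any single ingredient, which is why a soup does not increase inference time.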

### Step 4: Greedy soup

`python main.py --greedy-soup --data-location <where data is stored> --model-location <where models are stored>`

Note that this will first delete and then rewrite the file `greedy_soup_results.jsonl`.
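As described in the paper, the greedy soup sorts the models by held-out validation accuracy and adds each one to the soup only if the averaged weights do at least as well on the held-out set as the current soup. The following is a minimal sketch of that procedure under simplifying assumptions: models are dictionaries of float lists, and `toy_score` is a made-up stand-in for held-out accuracy. The names `average`, `greedy_soup`, and `toy_score` are hypothetical and do not come from `main.py`.

```python
from typing import Callable, Dict, List

# Assumed toy representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average(models: List[Weights]) -> Weights:
    """Element-wise mean of the given models' parameters."""
    n = len(models)
    return {
        name: [sum(vals) / n for vals in zip(*(m[name] for m in models))]
        for name in models[0]
    }


def greedy_soup(models: List[Weights],
                score: Callable[[Weights], float]) -> Weights:
    """Add models in decreasing score order, keeping each one only if
    the averaged soup scores at least as well as the current soup."""
    ranked = sorted(models, key=score, reverse=True)
    ingredients = [ranked[0]]
    best = score(ranked[0])
    for candidate in ranked[1:]:
        trial_score = score(average(ingredients + [candidate]))
        if trial_score >= best:
            ingredients.append(candidate)
            best = trial_score
    return average(ingredients)


def toy_score(model: Weights) -> float:
    # Stand-in for held-out accuracy: higher when "w" is closer to 1.0.
    return -abs(model["w"][0] - 1.0)


models = [{"w": [10.0]}, {"w": [0.5]}, {"w": [2.0]}]
soup = greedy_soup(models, toy_score)
print(soup)
```

In this toy run the models with `w = 0.5` and `w = 2.0` are kept, while the one with `w = 10.0` is rejected because adding it would lower the score.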

### Step 5: Plot

`python main.py --plot`

Note that this will first delete and then rewrite the file `figure.png`.

### Note

If you want, you can run all steps with:

`python main.py --download-models --eval-individual-models --uniform-soup --greedy-soup --plot --data-location <where data is stored> --model-location <where models are stored>`

### Questions

If you have any questions, please feel free to raise an issue. We will answer any frequently asked questions here.

## Citing

If you found this repository useful, please consider citing:

```
@article{wortsman2022model,
title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
author={Wortsman, Mitchell and Ilharco, Gabriel and Gadre, Samir Yitzhak and Roelofs, Rebecca and Gontijo-Lopes, Raphael and Morcos, Ari S and Namkoong, Hongseok and Farhadi, Ali and Carmon, Yair and Kornblith, Simon and others},
journal={arXiv preprint arXiv:2203.05482},
year={2022}
}
```