DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

Feb 10, 2022

Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal
Paper: arXiv:2202.04053

PaintSkills – Visual Reasoning

Dataset

Download the data for the four skills (object.zip, count.zip, color.zip, and spatial.zip) from the Google Drive link, then unzip each archive.

unzip object.zip
unzip count.zip
unzip color.zip
unzip spatial.zip
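
If the unzip command is not available, the same step can be sketched with Python's standard zipfile module. This is only an illustration: the function name extract_skill_archives and its two directory arguments are hypothetical and not part of the repository, and the sketch assumes the four archives were downloaded unchanged into a single folder.

```python
import zipfile
from pathlib import Path

# Archive names as listed in the download instructions.
SKILL_ARCHIVES = ("object.zip", "count.zip", "color.zip", "spatial.zip")


def extract_skill_archives(archive_dir, dest_dir):
    """Extract every skill archive found in archive_dir into dest_dir.

    Hypothetical helper, not part of the repository.
    Returns the names of the archives that were extracted, so a caller
    can tell which downloads are still missing.
    """
    archive_dir = Path(archive_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    extracted = []
    for name in SKILL_ARCHIVES:
        archive = archive_dir / name
        if not archive.is_file():
            continue  # skip archives that have not been downloaded yet
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest_dir)
        extracted.append(name)
    return extracted
```

Calling extract_skill_archives(".", ".") from the download folder mirrors the four unzip commands above.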

Each skill directory has the following hierarchy:

{skill}/        # skill name (i.e., object, count, color, or spatial)
    # Images
    images/

    # Scene configuration
    scenes/
        {skill}_train.json
        {skill}_val.json

    # Bounding box annotations - only needed for DETR
    {skill}_train_bounding_boxes.json
    {skill}_val_bounding_boxes.json
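
To check that an unzipped skill directory matches the layout above, a short Python sketch along the following lines may help. The helper names (expected_paths, missing_paths, load_json) are hypothetical and not part of the repository. The sketch only builds the paths shown in the listing and makes no assumption about the contents of the JSON files.

```python
import json
from pathlib import Path

SKILLS = ("object", "count", "color", "spatial")
SPLITS = ("train", "val")


def expected_paths(root, skill):
    """Map a short label to each path the listing implies for one skill.

    Hypothetical helper; the layout is taken from the listing above.
    """
    skill_dir = Path(root) / skill
    paths = {"images": skill_dir / "images"}
    for split in SPLITS:
        # Scene configuration lives under scenes/.
        paths[f"scenes_{split}"] = skill_dir / "scenes" / f"{skill}_{split}.json"
        # Bounding box annotations (only needed for DETR) sit in the skill root.
        paths[f"boxes_{split}"] = skill_dir / f"{skill}_{split}_bounding_boxes.json"
    return paths


def missing_paths(root, skill):
    """Return the sorted labels of expected paths that do not exist on disk."""
    return sorted(
        label for label, path in expected_paths(root, skill).items()
        if not path.exists()
    )


def load_json(path):
    """Load a JSON file as-is; the schema is not assumed here."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

For example, missing_paths(".", "count") returns an empty list when the count directory is complete, and load_json(expected_paths(".", "count")["scenes_val"]) reads the validation scene configuration for inspection.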

Evaluation

Please see ./paintskills/detr/README.md for our DETR-based visual reasoning skill evaluation.

Acknowledgements

We thank the developers of DETR for publicly releasing their code.

Reference

Please cite our paper if you use our models in your work:

@article{Cho2022DallEval,
  title         = {DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers},
  author        = {Jaemin Cho and Abhay Zala and Mohit Bansal},
  year          = {2022},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  eprint        = {2202.04053}
}
