UNIMO

The paper proposes UNIMO, a unified-modal pre-training architecture that adapts effectively to both single-modal and multi-modal understanding and generation tasks. It aligns textual and visual representations into a shared semantic space via cross-modal contrastive learning, which lets a single model learn from text corpora, image collections, and image-text pairs.
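To illustrate the unified-modal idea only, the sketch below shows a single shared Transformer encoder that can consume text tokens, image-region features, or both in one sequence. This is not the official UNIMO implementation (which is released for PaddlePaddle); all class names, dimensions, and projection layers here are hypothetical.

```python
# Illustrative sketch (assumed names/dimensions), not the official UNIMO code:
# one shared Transformer encoder serving text-only, image-only, and paired inputs.
import torch
import torch.nn as nn


class UnifiedModalEncoder(nn.Module):
    def __init__(self, vocab_size=30522, hidden=768, image_feat_dim=2048,
                 layers=6, heads=12):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden)       # text tokens -> hidden
        self.image_proj = nn.Linear(image_feat_dim, hidden)     # region features -> hidden
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, text_ids=None, image_feats=None):
        # Build the input sequence from whichever modalities are present,
        # so the same encoder handles single-modal and multi-modal inputs.
        parts = []
        if text_ids is not None:
            parts.append(self.token_emb(text_ids))
        if image_feats is not None:
            parts.append(self.image_proj(image_feats))
        return self.encoder(torch.cat(parts, dim=1))


enc = UnifiedModalEncoder()
text = torch.randint(0, 30522, (2, 16))             # batch of token ids
regions = torch.randn(2, 10, 2048)                  # batch of detected region features
joint = enc(text_ids=text, image_feats=regions)     # multi-modal encoding
text_only = enc(text_ids=text)                      # single-modal encoding
```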

paper

GitHub