UniLM AI

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.)

The family of UniLM AI:

UniLM (v1@NeurIPS'19 | v2@ICML'20 | v3@ACL'21): unified pre-training for language understanding and generation
InfoXLM (v1@NAACL'21 | v2@ACL'21): multilingual/cross-lingual pre-trained models for 100+ languages
DeltaLM (NEW): encoder-decoder pre-training for language generation and translation for 100+ languages
MiniLM (v1@NeurIPS'20 | v2@ACL'21): small and fast pre-trained models for language understanding and generation
AdaLM (v1@ACL'21): domain, language, and task adaptation of pre-trained models
LayoutLM (v1@KDD'20 | v2@ACL'21): multimodal (text + layout/format + image) pre-training for Document AI (e.g. scanned documents, PDF, etc.)
LayoutXLM (NEW): multimodal (text + layout/format + image) pre-training for multilingual document understanding
LayoutReader (EMNLP'21): Pre-training of text and layout for reading order detection
BEiT (NEW): BERT Pre-Training of Image Transformers
UniSpeech (v1@ICML'21): Speech Pre-Training for ASR and TTS
s2s-ft: sequence-to-sequence fine-tuning toolkit
XLM-T (NEW): Multilingual NMT w/ pretrained cross-lingual encoders

News

  • August 2021: LayoutLMv2 and LayoutXLM are on HuggingFace
  • [Model Release] August, 2021: LayoutReader - Built with LayoutLM to improve general reading order detection.
  • [Model Release] August, 2021: DeltaLM - Encoder-decoder pre-training for language generation and translation.
  • August 2021: BEiT is on HuggingFace
  • [Model Release] July, 2021: BEiT - Towards BERT moment for CV
  • [Model Release] June, 2021: LayoutLMv2, LayoutXLM, MiniLMv2, and AdaLM.
  • May, 2021: LayoutLMv2, InfoXLMv2, MiniLMv2, UniLMv3, and AdaLM were accepted by ACL 2021.
  • April, 2021: LayoutXLM extends LayoutLM with multilingual support. A multilingual form understanding benchmark, XFUND, is also introduced; it includes forms with human-labeled key-value pairs in 7 languages (Chinese, Japanese, Spanish, French, Italian, German, Portuguese).
  • March, 2021: InfoXLM was accepted by NAACL 2021.
  • December 29th, 2020: LayoutLMv2 arrives with new SOTA results on a wide variety of Document AI tasks, including the DocVQA and SROIE leaderboards.
  • October 8th, 2020: T-ULRv2 (aka InfoXLM) reached SOTA on the XTREME leaderboard.
  • September, 2020: MiniLM was accepted by NeurIPS 2020.
  • July 16, 2020: InfoXLM (Multilingual UniLM) released on arXiv.
  • June, 2020: UniLMv2 was accepted by ICML 2020; LayoutLM was accepted by KDD 2020.
  • April 5, 2020: Multilingual MiniLM released!
  • September, 2019: UniLMv1 was accepted by NeurIPS 2019.

Release

***** New August, 2021: LayoutReader release *****

***** New August, 2021: DeltaLM release *****

***** New July, 2021: BEiT release *****

***** New June, 2021: LayoutXLM | AdaLM | MiniLMv2 release *****

***** New May, 2021: LayoutLMv2 | LayoutXLM release *****

  • LayoutLM 2.0 (December 29, 2020): multimodal pre-training for visually-rich document understanding that leverages text, layout, and image information in a single framework. It sets new SOTA results on a wide range of document understanding tasks, including FUNSD (0.7895 -> 0.8420), CORD (0.9493 -> 0.9601), SROIE (0.9524 -> 0.9781), Kleister-NDA (0.834 -> 0.852), RVL-CDIP (0.9443 -> 0.9564), and DocVQA (0.7295 -> 0.8672). "LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding", ACL 2021.
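
To make the size of the LayoutLM 2.0 improvements easier to compare, the score pairs listed above can be turned into absolute and relative gains. The snippet below is only a small illustrative sketch: the scores are copied from the entry above, the left-hand number of each pair is assumed to be the earlier score and the right-hand number the LayoutLMv2 score, and the names `SCORES` and `gains` are hypothetical helpers, not part of this repository. The benchmarks may use different metrics, so gains are best read per task rather than compared strictly across tasks.

```python
# Scores copied from the LayoutLM 2.0 release note: (earlier score, LayoutLMv2 score).
# Assumption: the arrow "a -> b" reads as "earlier score -> LayoutLMv2 score".
SCORES = {
    "FUNSD": (0.7895, 0.8420),
    "CORD": (0.9493, 0.9601),
    "SROIE": (0.9524, 0.9781),
    "Kleister-NDA": (0.834, 0.852),
    "RVL-CDIP": (0.9443, 0.9564),
    "DocVQA": (0.7295, 0.8672),
}


def gains(scores):
    """Return {task: (absolute_gain, relative_gain)} for each (old, new) pair."""
    return {
        task: (new - old, (new - old) / old)
        for task, (old, new) in scores.items()
    }


if __name__ == "__main__":
    # Print one line per task, e.g. the task name followed by both gains.
    for task, (abs_gain, rel_gain) in gains(SCORES).items():
        print(f"{task:<13} {abs_gain:+.4f} absolute  ({rel_gain:+.1%} relative)")
```

Under these assumptions, DocVQA shows the largest absolute gain (0.1377), while CORD shows the smallest (0.0108).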

***** February, 2020: UniLM v2 | MiniLM v1 | LayoutLM v1 | s2s-ft v1 release *****

***** October 1st, 2019: UniLM v1 release *****

GitHub - microsoft/unilm: UniLM AI - Large-scale Self-supervised Pre-training across Tasks, Languages, and Modalities