DiLBERT: Cheap Embeddings for Disease Related Medical NLP

Nov 25, 2021

DiLBERT

Repo for the paper “DiLBERT: Cheap Embeddings for Disease Related Medical NLP”

Pretrained Model

The pretrained model presented in the paper is available on the Hugging Face Hub:

from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("beatrice-portelli/DiLBERT")
model = AutoModelForMaskedLM.from_pretrained("beatrice-portelli/DiLBERT")
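Once loaded, a masked language model can be queried for the most likely tokens at a masked position. The snippet below is a minimal sketch of that, not code from the repo: the function names `top_k` and `fill_mask` and the `{mask}` placeholder convention are assumptions made for illustration, and it assumes `torch` is installed alongside `transformers`.

```python
def top_k(scores, k=5):
    """Return the indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def fill_mask(template, k=5, model_name="beatrice-portelli/DiLBERT"):
    """Suggest k tokens for the single {mask} placeholder in template.

    Hypothetical helper: the {mask} convention is ours, not the repo's.
    """
    # Imported lazily so top_k stays usable without these packages.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)
    model.eval()

    # Substitute the tokenizer's own mask token for the placeholder.
    text = template.format(mask=tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")

    # Locate the mask token in the encoded sequence.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_position = is_mask.nonzero(as_tuple=True)[0][0]

    with torch.no_grad():
        logits = model(**inputs).logits

    # Scores over the vocabulary at the masked position.
    scores = logits[0, mask_position].tolist()
    return [tokenizer.convert_ids_to_tokens(i) for i in top_k(scores, k)]
```

Calling `fill_mask("The patient was diagnosed with {mask}.")` downloads the model on first use and returns a list of candidate tokens.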

Contents

00_clean_corpus.py: document preprocessing and cleaning
01_build_tokenizer.py: build a tokenizer from scratch based on the current corpus
02_pretraine_model.py: pretraining script (see constants.py for architecture and pretraining parameters)
03_finetune.py: finetuning script (classification task)
04_test.py: test script (classification task)
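The cleaning rules used by 00_clean_corpus.py are not described here, so the following is only a rough sketch of what a preprocessing step of this kind might do before tokenizer training. Everything in it is an assumption made for illustration: the function names, the `min_chars` threshold, and the choice of rules (Unicode normalization, tag stripping, whitespace collapsing, exact-duplicate removal).

```python
import re
import unicodedata


def clean_document(text, min_chars=20):
    """Normalize one raw document; return None if too little text remains."""
    # Fold compatibility characters (e.g. non-breaking spaces, ligatures).
    text = unicodedata.normalize("NFKC", text)
    # Drop anything that looks like an HTML tag.
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    return text if len(text) >= min_chars else None


def clean_corpus(documents, min_chars=20):
    """Clean documents, dropping short ones and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for doc in documents:
        text = clean_document(doc, min_chars)
        if text is not None and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

The cleaned list could then be written out one document per line, a common input format for tokenizer training and pretraining scripts.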

