Create a tweet generator based on your favorite Tweeter in 5 minutes
I developed HuggingTweets to try to predict Elon Musk's next breakthrough ;)
This project fine-tunes a pre-trained transformer on a user's tweets using HuggingFace, an awesome open-source library for Natural Language Processing.
Training and results are automatically logged to W&B through the HuggingFace integration.
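The fine-tuning and logging flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `gpt2` checkpoint, the hyperparameters, the use of the `datasets` library, and the helper names `to_training_texts` and `fine_tune` are all assumptions made for the example.

```python
from typing import List


def to_training_texts(tweets: List[str], eos_token: str) -> List[str]:
    """Strip each tweet, drop empty ones, and append the end-of-text token."""
    return [t.strip() + eos_token for t in tweets if t.strip()]


def fine_tune(tweets: List[str], model_name: str = "gpt2", output_dir: str = "out"):
    """Fine-tune a causal language model on tweets (hypothetical helper)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from datasets import Dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # GPT-2 has no pad token, so the end-of-text token is reused for padding.
    # Caveat: the collator masks pad tokens in the labels, so EOS is masked too.
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    texts = to_training_texts(tweets, tokenizer.eos_token)
    dataset = Dataset.from_dict({"text": texts})
    dataset = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True,
        remove_columns=["text"],
    )

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=4,              # assumed value, tune for your data
        per_device_train_batch_size=8,   # assumed value
        report_to="wandb",               # send training metrics to W&B
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=dataset,
        # mlm=False selects the causal (next-token) language modeling objective.
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )
    trainer.train()
    return trainer
```

Calling `fine_tune(list_of_tweets)` requires `transformers`, `datasets`, and `wandb` to be installed, and a W&B login for the logging to work.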
If you just want to test the demo, click on the link below and share your predictions on Twitter with
To understand how the model works, check huggingtweets-dev.ipynb or use the following link.
My favorite sample is definitely the one for Andrej Karpathy, with the sentence starting with "I don't like":
I don't like this :) 9:20am: Forget this little low code and preprocessor optimization. Even if it's neat, for top-level projects. 9:27am: Other useful code examples? It's not kind of best code, :) 9:37am: Python drawing bug like crazy, restarts regular web browsing ;) 9:46am: Okay, I don't mind. Maybe I should try that out! I'll investigate it :) 10:00am: I think I should try Shigemitsu's imgur page. Or the minimalist website if you're after 10/10 results :) Also maybe Google ImageNet on "Yelp" instead :) 10:05am: Looking forward to watching it talk!
I had a lot of fun running predictions on other people too!
There is a lot more interesting research to do:
- test training top layers vs. bottom layers to see how it affects learning of the lexical field (subject of the content) vs. word predictions, and memorization vs. creativity;
- optimize data pre-processing (padding, end tokens, definition of one sample…);
- augment text data with adversarial approaches;
- test more models and do some fine-tuning;
- pre-train on a large Twitter dataset of many people;
- explore few-shot learning approaches, as we have limited data per user, though there are probably only a few writing styles;
- implement a pipeline to continuously train the network on new tweets;
- cluster users and identify topics, writing styles…
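
As one illustration of the pre-processing question in the list above (padding, end tokens, definition of one sample), here is a hedged sketch of an alternative to "one tweet per sample": packing several tweets into each sample, separated by an end token, so that less padding is needed. The `EOS` string, the word-based length budget, and the function name `pack_tweets` are assumptions for the example; a real pipeline would count tokenizer tokens rather than whitespace-separated words.

```python
from typing import List

# Assumed end-of-text marker (GPT-2 style); replace with your tokenizer's own.
EOS = "<|endoftext|>"


def pack_tweets(tweets: List[str], max_words: int) -> List[str]:
    """Greedily concatenate tweets into samples of at most max_words words.

    Every tweet is followed by EOS. A tweet that alone exceeds the budget
    still becomes its own (oversized) sample rather than being dropped.
    """
    samples: List[str] = []
    current: List[str] = []
    count = 0
    for tweet in tweets:
        text = tweet.strip()
        if not text:
            continue  # skip empty tweets
        n_words = len(text.split())
        # Flush the current sample if adding this tweet would exceed the budget.
        if current and count + n_words > max_words:
            samples.append(EOS.join(current) + EOS)
            current, count = [], 0
        current.append(text)
        count += n_words
    if current:
        samples.append(EOS.join(current) + EOS)
    return samples
```

For example, `pack_tweets(["a b", "c d e", "f"], max_words=4)` yields two samples: the first holds "a b", the second holds "c d e" and "f", each tweet terminated by the end token.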