Zero to Mastery Deep Learning with TensorFlow
All of the course materials for the Zero to Mastery Deep Learning with TensorFlow course.
This course will teach you the foundations of deep learning and TensorFlow as well as prepare you to pass the TensorFlow Developer Certification exam (optional).
Course structure
This course is code first. The goal is to get you writing deep learning code as soon as possible.
It is taught with the following mantra:
Code -> Concept -> Code -> Concept -> Code -> Concept
This means we write code first then step through the concepts behind it.
If you've got 6 months of experience writing Python code and a willingness to learn (most important), you'll be able to do the course.
Should you do this course?
Do you have 1+ years experience with deep learning and writing TensorFlow code?
If yes, then no, you shouldn't. Use your skills to build something.
If no, move onto the next question.
Have you done at least one beginner machine learning course and would like to learn about deep learning/pass the TensorFlow Developer Certification?
If yes, this course is for you.
If no, go and do a beginner machine learning course and if you decide you want to learn TensorFlow, this page will still be here.
Prerequisites
What do I need to know to go through this course?
- 6+ months writing Python code. Can you write a Python function which accepts and uses parameters? That’s good enough. If you don’t know what that means, spend another month or two writing Python code and then come back here.
- At least one beginner machine learning course. Are you familiar with the idea of training, validation and test sets? Do you know what supervised learning is? Have you used pandas, NumPy or Matplotlib before? If no to any of these, I’d suggest going through at least one machine learning course which teaches these first and then coming back.
- Comfortable using Google Colab/Jupyter Notebooks. This course uses Google Colab throughout. If you have never used Google Colab before, it works very similarly to Jupyter Notebooks with a few extra features. If you’re not familiar with Google Colab notebooks, I’d suggest going through the Introduction to Google Colab notebook.
- Plug: The Zero to Mastery beginner-friendly machine learning course (I also teach this) teaches all of the above (and this course is designed as a follow on).
Exercises & Extra-curriculum
To prevent the course from being 100+ hours (deep learning is a broad field), various external resources for different sections are recommended to pursue at your own discretion.
You can find solutions to the exercises in `extras/solutions/`; there's a notebook per set of exercises (one for 00, 01, 02... etc). Thank you to Ashik Shafi for all of the efforts creating these.
00. TensorFlow Fundamentals Exercises
1. Create a vector, scalar, matrix and tensor with values of your choosing using `tf.constant()`.
2. Find the shape, rank and size of the tensors you created in 1.
3. Create two tensors containing random values between 0 and 1 with shape `[5, 300]`.
4. Multiply the two tensors you created in 3 using matrix multiplication.
5. Multiply the two tensors you created in 3 using dot product.
6. Create a tensor with random values between 0 and 1 with shape `[224, 224, 3]`.
7. Find the min and max values of the tensor you created in 6 along the first axis.
8. Create a tensor with random values of shape `[1, 224, 224, 3]` then squeeze it to change the shape to `[224, 224, 3]`.
9. Create a tensor with shape `[10]` using your own choice of values, then find the index which has the maximum value.
10. One-hot encode the tensor you created in 9.
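As a starting point for the exercises above, here is a minimal sketch of the main calls involved. The values, variable names and seed are arbitrary choices for illustration, not taken from the course notebooks.

```python
import tensorflow as tf

# A scalar (rank 0), vector (rank 1), matrix (rank 2) and a rank-3 tensor.
scalar = tf.constant(7)
vector = tf.constant([10, 7])
matrix = tf.constant([[10, 7], [3, 4]])
tensor = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Rank is the number of dimensions, size is the total number of elements.
rank = tensor.ndim
size = int(tf.size(tensor))

# Random values between 0 and 1 (seeded so the sketch is repeatable).
tf.random.set_seed(42)
a = tf.random.uniform(shape=[5, 300])
b = tf.random.uniform(shape=[5, 300])

# Inner dimensions must match for matrix multiplication, so transpose one side.
product = tf.matmul(a, tf.transpose(b))  # shape (5, 5)

# Squeezing removes all dimensions of size 1.
image_batch = tf.random.uniform(shape=[1, 224, 224, 3])
image = tf.squeeze(image_batch)  # shape (224, 224, 3)

# Index of the largest value, then a one-hot encoding of the values.
values = tf.constant([3, 1, 4, 1, 5, 9, 2, 6, 5, 0])
max_index = int(tf.argmax(values))
one_hot = tf.one_hot(values, depth=10)  # shape (10, 10)
```

Note that `tf.matmul(a, b)` would fail here because both tensors have shape `[5, 300]`; working out which side to transpose is part of the exercise.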
00. TensorFlow Fundamentals Extra-curriculum
- Read through the list of TensorFlow Python APIs, pick one we haven't gone through in this notebook, reverse engineer it (write out the documentation code for yourself) and figure out what it does.
- Try to create a series of tensor functions to calculate your most recent grocery bill (it's okay if you don't use the names of the items, just the price in numerical form).
- How would you calculate your grocery bill for the month and for the year using tensors?
- Go through the TensorFlow 2.x quick start for beginners tutorial (be sure to type out all of the code yourself, even if you don't understand it).
- Are there any functions we used in here that match what's used in there? Which are the same? Which haven't you seen before?
- Watch the video "What's a tensor?" - a great visual introduction to many of the concepts we've covered in this notebook.
01. Neural network regression with TensorFlow Exercises
- Create your own regression dataset (or make the one we created in "Create data to view and fit" bigger) and build and fit a model to it.
- Try building a neural network with 4 Dense layers and fitting it to your own regression dataset, how does it perform?
- Try and improve the results we got on the insurance dataset, some things you might want to try include:
- Building a larger model (how does one with 4 dense layers go?).
- Increasing the number of units in each layer.
- Look up the documentation of Adam and find out what the first parameter is, what happens if you increase it by 10x?
- What happens if you train for longer (say 300 epochs instead of 200)?
- Import the Boston pricing dataset from TensorFlow `tf.keras.datasets` and model it.
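To make the "4 Dense layers" and Adam experiments above concrete, here is a rough sketch on a hypothetical toy dataset (`y = 2x + 10`, not the course's data). The layer sizes and epoch count are arbitrary assumptions. The first parameter of `tf.keras.optimizers.Adam` is `learning_rate`, which defaults to `0.001`, so `0.01` is the 10x increase the exercise asks about.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(42)

# Hypothetical toy regression data: y = 2x + 10.
X = np.arange(-100, 100, 4, dtype=np.float32).reshape(-1, 1)
y = 2 * X + 10

# Four Dense layers: three hidden layers plus a single-unit output layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

model.compile(
    loss="mae",
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),  # 10x the default
    metrics=["mae"],
)

# A short run for illustration; the exercises suggest training for longer.
history = model.fit(X, y, epochs=20, verbose=0)
```

Comparing `history.history["loss"]` across different learning rates and epoch counts is one way to answer the questions in the exercise.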
01. Neural network regression with TensorFlow Extra-curriculum
- MIT introduction deep learning lecture 1 - gives a great overview of what's happening behind all of the code we're running.
- Reading: 1-hour of Chapter 1 of Neural Networks and Deep Learning by Michael Nielsen - a great in-depth and hands-on example of the intuition behind neural networks.
- To practice your regression modelling with TensorFlow, I'd also encourage you to look through Lion Bridge's collection of datasets or Kaggle's datasets, find a regression dataset which sparks your interest and try to model.
02. Neural network classification with TensorFlow Exercises
- Play with neural networks in the TensorFlow Playground for 10-minutes. Especially try different values of the learning rate, what happens when you decrease it? What happens when you increase it?
- Replicate the model pictured in the TensorFlow Playground diagram below using TensorFlow code. Compile it using the Adam optimizer, binary crossentropy loss and accuracy metric. Once it's compiled check a summary of the model.
Try this network out for yourself on the TensorFlow Playground website. Hint: there are 5 hidden layers but the output layer isn't pictured, you'll have to decide what the output layer should be based on the input data.
- Create a classification dataset using Scikit-Learn's `make_moons()` function, visualize it and then build a model to fit it at over 85% accuracy.
- Train a model to get 88%+ accuracy on the fashion MNIST test set. Plot a confusion matrix to see the results after.
- Recreate TensorFlow's softmax activation function in your own code. Make sure it can accept a tensor and return that tensor after having the softmax function applied to it.
- Create a function (or write code) to visualize multiple image predictions for the fashion MNIST at the same time. Plot at least three different images and their prediction labels at the same time. Hint: see the classification tutorial in the TensorFlow documentation for ideas.
- Make a function to show an image of a certain class of the fashion MNIST dataset and make a prediction on it. For example, plot 3 images of the `T-shirt` class with their predictions.
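For the softmax exercise above, one possible approach is sketched below. The function name and the choice to work over the last axis are assumptions; try your own version first and use this only to compare.

```python
import tensorflow as tf


def softmax(x):
    """Apply softmax over the last axis of a tensor."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    # Subtracting the row maximum keeps exp() from overflowing
    # and does not change the final result.
    shifted = x - tf.reduce_max(x, axis=-1, keepdims=True)
    exps = tf.exp(shifted)
    return exps / tf.reduce_sum(exps, axis=-1, keepdims=True)


logits = tf.constant([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]])
probs = softmax(logits)  # each row sums to 1
```

A quick sanity check is to compare the output against `tf.nn.softmax(logits)`.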
02. Neural network classification with TensorFlow Extra-curriculum
- Watch 3Blue1Brown's neural networks video 2: Gradient descent, how neural networks learn. After you're done, write 100 words about what you've learned.
- If you haven't already, watch video 1: But what is a Neural Network?. Note the activation function they talk about at the end.
- Watch MIT's introduction to deep learning lecture 1 (if you haven't already) to get an idea of the concepts behind using linear and non-linear functions.
- Spend 1-hour reading Michael Nielsen's Neural Networks and Deep Learning book.
- Read the ML-Glossary documentation on activation functions. Which one is your favourite?
- After you've read the ML-Glossary, see which activation functions are available in TensorFlow by searching "tensorflow activation functions".
03. Computer vision & convolutional neural networks in TensorFlow Exercises
- Spend 20-minutes reading and interacting with the CNN explainer website.
- What are the key terms? e.g. explain convolution in your own words, pooling in your own words
- Play around with the "understanding hyperparameters" section in the CNN explainer website for 10-minutes.
- What is the kernel size?
- What is the stride?
- How could you adjust each of these in TensorFlow code?
- Take 10 photos of two different things and build your own CNN image classifier using the techniques we've built here.
- Find an ideal learning rate for a simple convolutional neural network model on the 10 class dataset.
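On the question of adjusting kernel size and stride in TensorFlow code, both are arguments of `tf.keras.layers.Conv2D`. The sketch below uses a hypothetical random image and arbitrary filter counts to show how they change the output shape.

```python
import tensorflow as tf

# Hypothetical batch containing one 224x224 RGB image.
images = tf.random.uniform(shape=[1, 224, 224, 3])

conv = tf.keras.layers.Conv2D(
    filters=10,       # number of feature maps to learn
    kernel_size=3,    # 3x3 sliding window
    strides=1,        # move the window 1 pixel at a time
    padding="valid",  # no padding, so the output shrinks slightly
)
pool = tf.keras.layers.MaxPool2D(pool_size=2)

features = conv(images)  # shape (1, 222, 222, 10)
pooled = pool(features)  # shape (1, 111, 111, 10)

# A stride of 2 skips every other position, roughly halving height and width.
strided = tf.keras.layers.Conv2D(filters=10, kernel_size=3, strides=2)(images)
```

With `padding="valid"`, each output dimension is `floor((input - kernel_size) / strides) + 1`, which gives 222 for a stride of 1 and 111 for a stride of 2 here.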
03. Computer vision & convolutional neural networks in TensorFlow Extra-curriculum
- Watch: MIT's Introduction to Deep Computer Vision lecture. This will give you a great intuition behind convolutional neural networks.
- Watch: Deep dive on mini-batch gradient descent by deeplearning.ai. If you're still curious about why we use batches to train models, this technical overview covers many of the reasons why.
- Read: CS231n Convolutional Neural Networks for Visual Recognition class notes. This will give a very deep understanding of what's going on behind the scenes of the convolutional neural network architectures we're writing.
- Read: "A guide to convolution arithmetic for deep learning". This paper goes through all of the mathematics running behind the scenes of our convolutional layers.
- Code practice: TensorFlow Data Augmentation Tutorial. For a more in-depth introduction on data augmentation with TensorFlow, spend an hour or two reading through this tutorial.
04. Transfer Learning in TensorFlow Part 1: Feature Extraction Exercises
- Build and fit a model using the same data we have here but with the MobileNetV2 architecture feature extraction (`mobilenet_v2_100_224/feature_vector`) from TensorFlow Hub, how does it perform compared to our other models?
- Name 3 different image classification models on TensorFlow Hub that we haven't used.
- Build a model to classify images of two different things you've taken photos of.
- You can use any feature extraction layer from TensorFlow Hub you like for this.
- You should aim to have at least 10 images of each class, for example to build a fridge versus oven classifier, you'll want 10 images of fridges and 10 images of ovens.
- What is the current best performing model on ImageNet?
- Hint: you might want to check sotabench.com for this.
04. Transfer Learning in TensorFlow Part 1: Feature Extraction Extra-curriculum
- Read through the TensorFlow Transfer Learning Guide and define the main two types of transfer learning in your own words.
- Go through the Transfer Learning with TensorFlow Hub tutorial on the TensorFlow website and rewrite all of the code yourself into a new Google Colab notebook making comments about what each step does along the way.
- We haven't covered fine-tuning with TensorFlow Hub in this notebook, but if you'd like to know more, go through the fine-tuning a TensorFlow Hub model tutorial on the TensorFlow homepage.
- Look into experiment tracking with Weights & Biases, how could you integrate it with our existing TensorBoard logs?
05. Transfer Learning in TensorFlow Part 2: Fine-tuning Exercises
- Use feature-extraction to train a transfer learning model on 10% of the Food Vision data for 10 epochs using `tf.keras.applications.EfficientNetB0` as the base model. Use the `ModelCheckpoint` callback to save the weights to file.
- Fine-tune the last 20 layers of the base model you trained in the first exercise for another 10 epochs. How did it go?
- Fine-tune the last 30 layers of the base model you trained in the first exercise for another 10 epochs. How did it go?
- Write a function to visualize an image from any dataset (train or test file) and any class (e.g. "steak", "pizza"... etc), visualize it and make a prediction on it using a trained model.
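The freezing and unfreezing steps in the exercises above could be sketched as follows. This is an illustration under assumptions: `weights=None` is used only to avoid a download (the exercise would use pretrained ImageNet weights), and the checkpoint path and monitored metric are hypothetical.

```python
import tensorflow as tf

# weights=None keeps this sketch offline; use weights="imagenet" for real
# transfer learning.
base_model = tf.keras.applications.EfficientNetB0(include_top=False, weights=None)

# Feature extraction: freeze every layer in the base model.
base_model.trainable = False

# Fine-tuning: unfreeze everything, then re-freeze all but the last 20 layers.
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

# Save only the best weights seen during training (hypothetical path).
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/model.weights.h5",
    save_weights_only=True,
    save_best_only=True,
    monitor="val_accuracy",
)
```

After changing any `trainable` flags, the model needs to be compiled again before calling `fit()` for the change to take effect. It is common to use a lower learning rate for the fine-tuning stage.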
05. Transfer Learning in TensorFlow Part 2: Fine-tuning Extra-curriculum
- Read the documentation on data augmentation in TensorFlow.
- Read the ULMFit paper (technical) for an introduction to the concept of freezing and unfreezing different layers.
- Read up on learning rate scheduling (there's a TensorFlow callback for this), how could this influence our model training?
- If you're training for longer, you probably want to reduce the learning rate as you go... the closer you get to the bottom of the hill, the smaller steps you want to take. Imagine it like finding a coin at the bottom of your couch. In the beginning your arm movements are going to be large and the closer you get, the smaller your movements become.
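The "smaller steps as you get closer" idea above can be expressed with the `tf.keras.callbacks.LearningRateScheduler` callback. The schedule below (hold for 5 epochs, then shrink by 10% per epoch) is an arbitrary assumption chosen for illustration.

```python
import tensorflow as tf


def decay_schedule(epoch, lr):
    """Hold the learning rate for 5 epochs, then reduce it by 10% each epoch."""
    if epoch < 5:
        return lr
    return lr * 0.9


lr_callback = tf.keras.callbacks.LearningRateScheduler(decay_schedule)

# Pass the callback to training, for example:
# model.fit(train_data, epochs=20, callbacks=[lr_callback])
```

The callback calls `decay_schedule` at the start of every epoch with the current epoch index and learning rate, and sets the optimizer's learning rate to the returned value.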
06. Transfer Learning in TensorFlow Part 3: Scaling-up Exercises
- Take 3 of your own photos of food and use the trained model to make predictions on them, share your predictions with the other students in Discord and show off your Food Vision model.
- Train a feature-extraction transfer learning model for 10 epochs on the same data and compare its performance versus a model which used feature extraction for 5 epochs and fine-tuning for 5 epochs (like we've used in this notebook). Which method is better?
- Recreate the first model (the feature extraction model) with `mixed_precision` turned on.
- Does it make the model train faster?
- Does it affect the accuracy or performance of our model?
- What are the advantages of using `mixed_precision` training?
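Turning mixed precision on is a one-line change via `tf.keras.mixed_precision.set_global_policy`. The sketch below uses a small hypothetical model (not the course's Food Vision model) to show what the policy changes inside a layer.

```python
import tensorflow as tf

# Compute in float16 where possible while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

inputs = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(16, activation="relu")(inputs)
x = tf.keras.layers.Dense(3)(x)
# Keep the output activation in float32 for numerical stability.
outputs = tf.keras.layers.Activation("softmax", dtype="float32")(x)
model = tf.keras.Model(inputs, outputs)

dense = model.layers[1]                # first Dense layer
compute_dtype = dense.compute_dtype    # "float16"
variable_dtype = dense.variable_dtype  # "float32"

# Restore the default policy so later code is unaffected.
tf.keras.mixed_precision.set_global_policy("float32")
```

Any speed-up depends on the hardware; the TensorFlow mixed precision guide describes which GPUs and TPUs benefit.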
06. Transfer Learning in TensorFlow Part 3: Scaling-up Extra-curriculum
- Spend 15-minutes reading up on the EarlyStopping callback. What does it do? How could we use it in our model training?
- Spend an hour reading about Streamlit. What does it do? How might you integrate some of the things we've done in this notebook in a Streamlit app?
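As a pointer for the EarlyStopping reading above, here is a rough sketch of the callback on hypothetical toy data. The model, data and `patience` value are arbitrary assumptions.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(42)
rng = np.random.default_rng(42)

# Hypothetical toy data: the target is the sum of four random features.
X = rng.random((200, 4)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")

# Stop once val_loss has not improved for 3 epochs in a row and
# roll back to the best weights seen so far.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True,
)

history = model.fit(
    X, y,
    validation_split=0.2,
    epochs=100,
    callbacks=[early_stopping],
    verbose=0,
)
epochs_run = len(history.history["loss"])  # may be fewer than 100
```

Setting `epochs` high and letting the callback decide when to stop saves guessing the right number of epochs up front.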
07. Milestone Project 1: Food Vision Big™ Exercises
Note: The chief exercise for Milestone Project 1 is to finish the "TODO" sections in the Milestone Project 1 Template notebook. After doing so, move onto the following.
- Use the same evaluation techniques on the large-scale Food Vision model as you did in the previous notebook (Transfer Learning Part 3: Scaling up). More specifically, it would be good to see:
- A confusion matrix between all of the model's predictions and true labels.
- A graph showing the f1-scores of each class.
- A visualization of the model making predictions on various images and comparing the predictions to the ground truth.
- For example, plot a sample image from the test dataset and have the title of the plot show the prediction, the prediction probability and the ground truth label.
- Take 3 of your own photos of food and use the Food Vision model to make predictions on them. How does it go? Share your images/predictions with the other students.
- Retrain the model (feature extraction and fine-tuning) we trained in this notebook, except this time use `EfficientNetB4` as the base model instead of `EfficientNetB0`. Do you notice an improvement in performance? Does it take longer to train? Are there any tradeoffs to consider?
- Name one important benefit of mixed precision training, how does this benefit take place?
07. Milestone Project 1: Food Vision Big™ Extra-curriculum
- Read up on learning rate scheduling and the learning rate scheduler callback. What is it? And how might it be helpful to this project?
- Read up on TensorFlow data loaders (improving TensorFlow data loading performance). Is there anything we've missed? What methods should you keep in mind whenever loading data in TensorFlow? Hint: check the summary at the bottom of the page for a great round up of ideas.
- Read up on the documentation for TensorFlow mixed precision training. What are the important things to keep in mind when using mixed precision training?
08. Introduction to NLP (Natural Language Processing) in TensorFlow Exercises
- Rebuild, compile and train `model_1`, `model_2` and `model_5` using the Keras Sequential API instead of the Functional API.
- Retrain the baseline model with 10% of the training data. How does it perform compared to the Universal Sentence Encoder model with 10% of the training data?
- Try fine-tuning the TF Hub Universal Sentence Encoder model by setting `trainable=True` when instantiating it as a Keras layer.

    import tensorflow as tf
    import tensorflow_hub as hub

    # We can use this encoding layer in place of our text_vectorizer and embedding layer
    sentence_encoder_layer = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4",
                                            input_shape=[],
                                            dtype=tf.string,
                                            trainable=True)  # turn training on to fine-tune the TensorFlow Hub model
- Retrain the best model you've got so far on the whole training set (no validation split). Then use this trained model to make predictions on the test dataset and format the predictions into the same format as the `sample_submission.csv` file from Kaggle (see the Files tab in Colab for what the `sample_submission.csv` file looks like). Once you've done this, make a submission to the Kaggle competition, how did your model perform?
- Combine the ensemble predictions using the majority vote (mode), how does this perform compared to averaging the prediction probabilities of each model?
- Make a confusion matrix with the best performing model's predictions on the validation set and the validation ground truth labels.
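To illustrate the difference between majority voting and averaging in the ensemble exercise above, here is a small sketch. The probabilities are made up for illustration and do not come from the course models.

```python
import numpy as np

# Hypothetical prediction probabilities from three binary classifiers
# (rows) on four samples (columns).
model_probs = np.array([
    [0.9, 0.4, 0.6, 0.2],
    [0.8, 0.6, 0.1, 0.3],
    [0.7, 0.3, 0.6, 0.9],
])

# Option 1: average the probabilities, then threshold at 0.5.
mean_preds = (model_probs.mean(axis=0) >= 0.5).astype(int)

# Option 2: threshold each model first, then take the majority vote (mode).
votes = (model_probs >= 0.5).astype(int)
vote_preds = (votes.sum(axis=0) > votes.shape[0] / 2).astype(int)

# mean_preds -> [1, 0, 0, 0]
# vote_preds -> [1, 0, 1, 0]
```

The two methods disagree on the third sample: two models vote for class 1 with low confidence (0.6) while one model is very confident in class 0 (0.1), which pulls the average below 0.5.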
08. Introduction to NLP (Natural Language Processing) in TensorFlow Extra-curriculum
To practice what you've learned, a good idea would be to spend an hour on 3 of the following (3-hours total, you could go through them all if you want) and then write a blog post about what you've learned.
- For an overview of the different problems within NLP and how to solve them read through:
- Go through MIT's Recurrent Neural Networks lecture. This will be one of the greatest additions to your understanding of what's happening behind the RNN models you've been building.
- Read through the word embeddings page on the TensorFlow website. Embeddings are such a large part of NLP. We've covered them throughout this notebook but extra practice would be well worth it. A good exercise would be to write out all the code in the guide in a new notebook.
- For more on RNNs in TensorFlow, read and reproduce the TensorFlow RNN guide. We've covered many of the concepts in this guide, but it's worth writing the code again for yourself.
- Text data doesn't always come in a nice package like the data we've downloaded. So if you're after more on preparing different text sources for use with your TensorFlow deep learning models, it's worth checking out the following:
- TensorFlow text loading tutorial.
- Reading text files with Python by Real Python.
- This notebook has focused on writing NLP code. For a mathematically rich overview of how NLP with Deep Learning happens, read Stanford's Natural Language Processing with Deep Learning lecture notes Part 1.
- For an even deeper dive, you could even do the whole CS224n (Natural Language Processing with Deep Learning) course.
- Great blog posts to read:
- Andrej Karpathy's The Unreasonable Effectiveness of RNNs dives into generating Shakespeare text with RNNs.
- Text Classification with NLP: Tf-Idf vs Word2Vec vs BERT by Mauro Di Pietro. An overview of different techniques for turning text into numbers and then classifying it.
- What are word embeddings? by Machine Learning Mastery.
- Other topics worth looking into:
- Attention mechanisms. These are a foundational component of the transformer architecture and also often add improvements to deep NLP models.
- Transformer architectures. This model architecture has recently taken the NLP world by storm, achieving state of the art on many benchmarks. However, it does take a little more processing to get off the ground, the HuggingFace Models (formerly HuggingFace Transformers) library is probably your best quick start.
09. Milestone Project 2: SkimLit Exercises
- Train `model_5` on all of the data in the training dataset for as many epochs as it takes until it stops improving. Since this might take a while, you might want to use:
- `tf.keras.callbacks.ModelCheckpoint` to save the model's best weights only.
- `tf.keras.callbacks.EarlyStopping` to stop the model from training once the validation loss has stopped improving for ~3 epochs.
- Check out the Keras guide on using pretrained GloVe embeddings. Can you get this working with one of our models?
- Hint: You'll want to incorporate it with a custom token Embedding layer.
- It's up to you whether or not you fine-tune the GloVe embeddings or leave them frozen.
- Try replacing the TensorFlow Hub Universal Sentence Encoder pretrained embedding with the TensorFlow Hub BERT PubMed expert (a language model pretrained on PubMed texts) pretrained embedding. Does this affect results?
- Note: Using the BERT PubMed expert pretrained embedding requires an extra preprocessing step for sequences (as detailed in the TensorFlow Hub guide).
- Does the BERT model beat the results mentioned in this paper? https://arxiv.org/pdf/1710.06071.pdf
- What happens if you were to merge our `line_number` and `total_lines` features for each sequence? For example, create an `X_of_Y` feature instead? Does this affect model performance?
- Another example: `line_number=1` and `total_lines=11` turns into `line_of_X=1_of_11`.
- Write a function (or series of functions) to take a sample abstract string, preprocess it (in the same way our model has been trained), make a prediction on each sequence in the abstract and return the abstract in the format:
- `PREDICTED_LABEL`: `SEQUENCE`
- `PREDICTED_LABEL`: `SEQUENCE`
- `PREDICTED_LABEL`: `SEQUENCE`
- `PREDICTED_LABEL`: `SEQUENCE`
- ...
- You can find your own unstructured RCT abstract from PubMed or try this one from: Baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis C virus (HCV) infection.
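The merged position feature and the output format described above can be sketched with plain Python. The function names, example labels and sentences are hypothetical; the model prediction step is left out and is the main part of the exercise.

```python
def make_line_of_total(line_number: int, total_lines: int) -> str:
    """Merge two positional features into one string, e.g. 1 and 11 -> '1_of_11'."""
    return f"{line_number}_of_{total_lines}"


def format_predictions(labels, sentences):
    """Pair each predicted label with its sentence, one 'LABEL: SEQUENCE' per line."""
    return "\n".join(f"{label}: {sentence}" for label, sentence in zip(labels, sentences))


# Hypothetical predictions for a two-sentence abstract.
example_labels = ["BACKGROUND", "METHODS"]
example_sentences = [
    "This study looked at treatment A.",
    "Patients were randomly assigned to two groups.",
]

feature = make_line_of_total(1, 11)
formatted = format_predictions(example_labels, example_sentences)
```

A string feature like `1_of_11` would still need to be turned into numbers (for example, one-hot encoded) before being passed to a model.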
09. Milestone Project 2: SkimLit Extra-curriculum
- For more on working with text/spaCy, see spaCy's advanced NLP course. If you're going to be working on production-level NLP problems, you'll probably end up using spaCy.
- For another look at how to approach a text classification problem like the one we've just gone through, I'd suggest going through Google's Machine Learning Course for text classification.
- Since our dataset has imbalanced classes (as with many real-world datasets), it might be worth looking into the TensorFlow guide for different methods of training a model with imbalanced classes.
10. Time series fundamentals and Milestone Project 3: BitPredict Exercises
- Does scaling the data help for univariate/multivariate data? (e.g. getting all of the values between 0 & 1)
- Try doing this for a univariate model (e.g. `model_1`) and a multivariate model (e.g. `model_6`) and see if it affects model training or evaluation results.
- Get the most up to date data on Bitcoin, train a model & see how it goes (our data goes up to May 18 2021).
- You can download the Bitcoin historical data for free from coindesk.com/price/bitcoin and clicking "Export Data" -> "CSV".
- For most of our models we used `WINDOW_SIZE=7`, but is there a better window size?
- Set up a series of experiments to find whether or not there's a better window size.
- For example, you might train 10 different models with `HORIZON=1` but with window sizes ranging from 2-12.
- Create a windowed dataset just like the ones we used for `model_1` using `tf.keras.preprocessing.timeseries_dataset_from_array()` and retrain `model_1` using the recreated dataset.
- For our multivariate modelling experiment, we added the Bitcoin block reward size as an extra feature to make our time series multivariate.
- Are there any other features you think you could add?
- If so, try it out, how do these affect the model?
- Make prediction intervals for future forecasts. To do so, one way would be to train an ensemble model on all of the data, make future forecasts with it and calculate the prediction intervals of the ensemble just like we did for `model_8`.
- For future predictions, try to make a prediction, retrain a model on the predictions, make a prediction, retrain a model, make a prediction, retrain a model, make a prediction (retrain a model each time a new prediction is made). Plot the results, how do they look compared to the future predictions where a model wasn't retrained for every forecast (`model_9`)?
- Throughout this notebook, we've only tried algorithms we've handcrafted ourselves. But it's worth seeing how a purpose built forecasting algorithm goes.
- Try out one of the extra algorithms listed in the modelling experiments part such as:
- Facebook's Kats library - there are many models in here, remember the machine learning practitioner's motto: experiment, experiment, experiment.
- LinkedIn's Greykite library
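For the windowed dataset exercise above, here is a minimal sketch of `timeseries_dataset_from_array()` on a hypothetical series of 20 values standing in for prices. It assumes `HORIZON = 1` (one label per window); larger horizons need a different target layout. The same function is also available as `tf.keras.utils.timeseries_dataset_from_array()`.

```python
import numpy as np
import tensorflow as tf

WINDOW_SIZE = 7  # number of past timesteps used as input
HORIZON = 1      # number of future timesteps to predict (this sketch assumes 1)

# Hypothetical stand-in for a price series: 0.0, 1.0, ..., 19.0
prices = np.arange(20, dtype="float32")

# data: drop the last HORIZON values so every window has a label after it.
# targets: targets[i] is the value that follows the window starting at index i.
dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    data=prices[:-HORIZON],
    targets=prices[WINDOW_SIZE:],
    sequence_length=WINDOW_SIZE,
    batch_size=4,
)

windows, labels = next(iter(dataset))
# windows[0] -> [0, 1, 2, 3, 4, 5, 6], labels[0] -> 7
```

Looping over different `WINDOW_SIZE` values and rebuilding the dataset each time is one way to set up the window size experiments.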
10. Time series fundamentals and Milestone Project 3: BitPredict Extra-curriculum
We've only really scratched the surface with time series forecasting and time series modelling in general. But the good news is, you've got plenty of hands-on coding experience with it already.
If you'd like to dig deeper in to the world of time series, I'd recommend the following:
- Forecasting: Principles and Practice is an outstanding online textbook which discusses at length many of the most important concepts in time series forecasting. I'd especially recommend reading at least Chapter 1 in full.
- I'd definitely recommend at least checking out chapter 1 as well as the chapter on forecasting accuracy measures.
- Introduction to machine learning and time series by Markus Loning goes through different time series problems and how to approach them. It focuses on using the `sktime` library (Scikit-Learn for time series), though the principles are applicable elsewhere.
- Why you should care about the Nate Silver vs. Nassim Taleb Twitter war by Isaac Faber is an outstanding discussion of the role of uncertainty, using election prediction as the example.
- TensorFlow time series tutorial - A tutorial on using TensorFlow to forecast weather time series data.
- The Black Swan by Nassim Nicholas Taleb - Nassim Taleb was a pit trader (a trader who trades on their own behalf) for 25 years, this book compiles many of the lessons he learned from first-hand experience. It changed my whole perspective on our ability to predict.
- 3 facts about time series forecasting that surprise experienced machine learning practitioners by Skander Hannachi, Ph.D - time series data is different to other kinds of data, if you've worked on other kinds of machine learning problems before, getting into time series might require some adjustments, Hannachi outlines 3 of the most common.
- World-class lectures by Jordan Kern, watching these will take you from 0 to 1 with time series problems:
- Time Series Analysis - how to analyse time series data.
- Time Series Modelling - different techniques for modelling time series data (many of which aren't deep learning).
11. Passing the TensorFlow Developer Certification Exercises
Preparing your brain
- Read through the TensorFlow Developer Certificate Candidate Handbook.
- Go through the Skills checklist section of the TensorFlow Developer Certification Candidate Handbook and create a notebook which covers all of the skills required, write code for each of these (this notebook can be used as a point of reference during the exam).
Example of mapping the Skills checklist section of the TensorFlow Developer Certification Candidate handbook to a notebook.
Preparing your computer
- Go through the PyCharm quick start tutorials to make sure you're familiar with PyCharm (the exam uses PyCharm, you can download the free version).
- Read through and follow the suggested steps in the setting up for the TensorFlow Developer Certificate Exam guide.
- After going through (2), go into PyCharm and make sure you can train a model in TensorFlow. The model and dataset in the example `image_classification_test.py` script on GitHub should be enough. If you can train and save the model in under 5-10 minutes, your computer will be powerful enough to train the models in the exam.
- Make sure you've got experience running models locally in PyCharm before taking the exam. Google Colab (what we used through the course) is a little different to PyCharm.
Before taking the exam make sure you can run TensorFlow code on your local machine in PyCharm. If the example `image_class_test.py` script can run completely in under 5-10 minutes on your local machine, your local machine can handle the exam (if not, you can use Google Colab to train, save and download models to submit for the exam).