The basic distribution probability Tutorial for Deep Learning Researchers
distribution-is-all-you-need
distribution-is-all-you-need is a basic probability distribution tutorial covering the most common distributions used in deep learning, with examples written in Python.
Overview of distribution probability
In Bayesian probability theory, if the posterior distributions p(θ | x) are in the same probability distribution family as the prior probability distribution p(θ), the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood function.
Conjugate prior, wikipedia

MultiClass
means that the random variable can take more than 2 values.
N Times
means that the number of trials is fixed in advance, which this tutorial treats as also considering the prior probability P(X).
To learn more about probability, I recommend reading [pattern recognition and machine learning, Bishop 2006].
distribution probabilities and features

Uniform distribution(continuous), code
 The uniform distribution has the same probability density everywhere on [a, b], which makes it the easiest distribution to work with.
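As a minimal sketch of the statement above, the density on [a, b] is the constant 1 / (b - a) and zero elsewhere. The function name `uniform_pdf` is illustrative and not taken from the repository code.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the continuous uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("require a < b")
    # Constant density inside the interval, zero outside it.
    return 1.0 / (b - a) if a <= x <= b else 0.0


inside = uniform_pdf(0.5, 0.0, 2.0)   # a point inside [0, 2]
outside = uniform_pdf(3.0, 0.0, 2.0)  # a point outside [0, 2]
print(inside, outside)
```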

Bernoulli distribution(discrete), code
 The Bernoulli distribution does not take a prior probability P(X) into account. Therefore, if we fit it by maximum likelihood, we are vulnerable to overfitting.
 We use binary cross-entropy as the loss for binary classification. It has the same form as the negative log of the Bernoulli distribution.
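The link between the Bernoulli distribution and binary cross-entropy can be checked numerically. The sketch below assumes a single label y in {0, 1} and a predicted probability p strictly between 0 and 1; the function names are illustrative.

```python
import math


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for x in {0, 1} with success probability p."""
    return p ** x * (1.0 - p) ** (1 - x)


def binary_cross_entropy(y: int, p: float) -> float:
    """Binary cross-entropy for one label y and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


p = 0.8
for y in (0, 1):
    nll = -math.log(bernoulli_pmf(y, p))  # negative log-likelihood
    bce = binary_cross_entropy(y, p)
    print(y, nll, bce)  # the two columns agree up to rounding
```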

Binomial distribution(discrete), code
 Binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments.
 The binomial distribution can be seen as taking prior information into account, because the number of trials n is specified in advance.
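A minimal sketch of the binomial probability mass function, assuming n independent trials with success probability p. With n = 1 it reduces to the Bernoulli distribution. The name `binomial_pmf` is illustrative.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)  # probabilities over k = 0..n should sum to 1
print(total)
```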

MultiBernoulli distribution, Categorical distribution(discrete), code
 The MultiBernoulli distribution, also called the categorical distribution, extends the Bernoulli distribution to more than 2 outcomes.
 Cross-entropy has the same form as the negative log of the MultiBernoulli distribution.
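The same check works for more than 2 classes. This sketch assumes a one-hot encoded label and strictly positive class probabilities; the function names are illustrative.

```python
import math


def categorical_nll(one_hot, probs):
    """Negative log of the categorical likelihood prod_k probs[k] ** one_hot[k]."""
    likelihood = 1.0
    for x_k, p_k in zip(one_hot, probs):
        likelihood *= p_k ** x_k
    return -math.log(likelihood)


def cross_entropy(one_hot, probs):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(x_k * math.log(p_k) for x_k, p_k in zip(one_hot, probs))


target = [0, 1, 0]
predicted = [0.2, 0.7, 0.1]
print(categorical_nll(target, predicted), cross_entropy(target, predicted))
```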

Multinomial distribution(discrete), code
 The multinomial distribution relates to the categorical distribution in the same way that the binomial distribution relates to the Bernoulli distribution.
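The relationship can be sketched with a small probability mass function: with a single trial the multinomial reduces to the categorical distribution, and with two categories it reduces to the binomial distribution. The name `multinomial_pmf` is illustrative.

```python
import math


def multinomial_pmf(counts, probs):
    """Probability of observing the given per-category counts."""
    n = sum(counts)
    # Multinomial coefficient n! / (c_1! * ... * c_K!), kept as an exact integer.
    coeff = math.factorial(n)
    for c in counts:
        coeff //= math.factorial(c)
    prob = float(coeff)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob


single_trial = multinomial_pmf([0, 1, 0], [0.2, 0.7, 0.1])  # categorical case
two_classes = multinomial_pmf([3, 7], [0.3, 0.7])           # binomial case
print(single_trial, two_classes)
```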

Beta distribution(continuous), code
 Beta distribution is conjugate to the binomial and Bernoulli distributions.
 Using conjugacy, we can obtain the posterior distribution more easily from a prior distribution we already know.
 The uniform distribution on [0, 1] is the special case of the beta distribution with alpha=1, beta=1.
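A minimal sketch of the conjugate update, assuming a Beta(alpha, beta) prior on the success probability and Bernoulli observations: the posterior is again a beta distribution with the observed counts added to the parameters. The example also contrasts the posterior mean with the maximum likelihood estimate on a tiny data set, and checks that Beta(1, 1) has constant density 1. All function names are illustrative.

```python
import math


def beta_posterior(alpha, beta, successes, failures):
    """Parameters of the beta posterior after Bernoulli observations."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


# Three successes in three trials: the MLE claims success is certain.
successes, failures = 3, 0
mle = successes / (successes + failures)
post_a, post_b = beta_posterior(1.0, 1.0, successes, failures)
posterior_mean = beta_mean(post_a, post_b)
print(mle, posterior_mean)

# Beta(1, 1) is flat on (0, 1), like the uniform distribution.
flat = [beta_pdf(x, 1.0, 1.0) for x in (0.1, 0.5, 0.9)]
print(flat)
```

With so few observations the maximum likelihood estimate jumps to 1.0, while the posterior mean stays below it, which is the overfitting concern raised for the Bernoulli distribution.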

Dirichlet distribution(continuous), code
 The Dirichlet distribution is conjugate to the multinomial distribution.
 If k=2, it reduces to the beta distribution.
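The k=2 case can be verified numerically. This sketch evaluates a Dirichlet density at a point (x, 1 - x) and compares it with the beta density at x; both functions are illustrative helpers written for this comparison.

```python
import math


def dirichlet_pdf(x, alpha):
    """Density of Dirichlet(alpha) at a point x on the probability simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(x_i) for x_i, a in zip(x, alpha))
    return math.exp(log_norm + log_kernel)


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


x = 0.3
d = dirichlet_pdf([x, 1.0 - x], [2.0, 5.0])
b = beta_pdf(x, 2.0, 5.0)
print(d, b)  # the two densities agree up to rounding
```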

Gamma distribution(continuous), code
 The beta distribution can be built from the gamma distribution: if X ~ Gamma(a, 1) and Y ~ Gamma(b, 1) are independent, then X / (X + Y) follows Beta(a, b).
 The exponential distribution and chi-squared distribution are special cases of the gamma distribution.
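The gamma-to-beta construction can be sketched by sampling. The example assumes independent draws from Gamma(a, 1) and Gamma(b, 1) and compares the sample mean of X / (X + Y) with the beta mean a / (a + b). The helper name and the seed are illustrative choices.

```python
import random


def beta_via_gamma(a: float, b: float, rng: random.Random) -> float:
    """Draw one Beta(a, b) sample as X / (X + Y) from independent gamma draws."""
    x = rng.gammavariate(a, 1.0)  # shape a, scale 1
    y = rng.gammavariate(b, 1.0)  # shape b, scale 1
    return x / (x + y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
a, b = 2.0, 5.0
samples = [beta_via_gamma(a, b, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, a / (a + b))  # the two values should be close
```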

Exponential distribution(continuous), code
 The exponential distribution is the special case of the gamma distribution in which alpha is 1.
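A short numerical check of this special case, assuming the shape/rate parameterization of the gamma density. The function names are illustrative.

```python
import math


def gamma_pdf(x: float, shape: float, rate: float) -> float:
    """Gamma density in the shape/rate parameterization, for x > 0."""
    return rate ** shape * x ** (shape - 1.0) * math.exp(-rate * x) / math.gamma(shape)


def exponential_pdf(x: float, rate: float) -> float:
    """Exponential density with the given rate, for x >= 0."""
    return rate * math.exp(-rate * x)


rate = 2.0
for x in (0.1, 0.5, 2.0):
    # With shape (alpha) equal to 1 the gamma density matches the exponential one.
    print(x, gamma_pdf(x, 1.0, rate), exponential_pdf(x, rate))
```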

Gaussian distribution(continuous), code
 The Gaussian distribution is a very common continuous probability distribution.
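For reference, a minimal sketch of the Gaussian density with mean mu and standard deviation sigma, cross-checked against `statistics.NormalDist` from the Python standard library. The function name `gaussian_pdf` is illustrative.

```python
import math
from statistics import NormalDist


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


manual = gaussian_pdf(1.0, 2.0, 3.0)
reference = NormalDist(2.0, 3.0).pdf(1.0)
print(manual, reference)  # the two values agree up to rounding
```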

Normal distribution(continuous), code
 The normal distribution here is the standardized Gaussian distribution: it has mean 0 and standard deviation 1.
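Standardizing means mapping a Gaussian value x to z = (x - mu) / sigma. The sketch below, with illustrative numbers, shows that the cumulative probability of x under the original Gaussian equals that of z under the distribution with mean 0 and standard deviation 1.

```python
import math
from statistics import NormalDist

mu, sigma = 10.0, 2.0
x = 13.0
z = (x - mu) / sigma  # standardized value

p_original = NormalDist(mu, sigma).cdf(x)  # P(X <= x) under N(mu, sigma)
p_standard = NormalDist().cdf(z)           # P(Z <= z) under mean 0, std 1
print(z, p_original, p_standard)
```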

Chi-squared distribution(continuous), code
 The chi-squared distribution with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables.
 The chi-squared distribution is a special case of the gamma distribution.
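The definition can be sketched directly by sampling: square k independent standard normal draws and add them up. The sample mean should then be close to k, the mean of a chi-squared distribution with k degrees of freedom. The helper name and the seed are illustrative.

```python
import random


def chi_squared_sample(k: int, rng: random.Random) -> float:
    """One chi-squared draw as a sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
k = 3
samples = [chi_squared_sample(k, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, k)  # the sample mean should be close to k
```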

Student-t distribution(continuous), code
 The t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails, meaning that it is more prone to producing values that fall far from its mean.
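The heavier tails can be seen by comparing densities. This sketch assumes nu = 3 degrees of freedom (an arbitrary choice for illustration) and evaluates the t density and the standard normal density at the center and far from it.

```python
import math


def student_t_pdf(x: float, nu: float) -> float:
    """Density of the Student-t distribution with nu degrees of freedom."""
    coeff = math.gamma((nu + 1.0) / 2.0) / (
        math.sqrt(nu * math.pi) * math.gamma(nu / 2.0)
    )
    return coeff * (1.0 + x * x / nu) ** (-(nu + 1.0) / 2.0)


def standard_normal_pdf(x: float) -> float:
    """Density of the Gaussian with mean 0 and standard deviation 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


nu = 3.0
for x in (0.0, 2.0, 4.0):
    print(x, student_t_pdf(x, nu), standard_normal_pdf(x))
```

Near the center the normal density is higher, while far from the mean the t density is much larger, which is what "heavier tails" refers to.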