probability-continuous.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Thu, 21 Apr 2016 16:20:38 +0300
Add "Law of total expectation", "Law of total variance", proofs for covariance.


=============================
 Continuous random variables
=============================
.. contents::
   :local:

Probability density function
============================

.. role:: def
   :class: def

:def:`Probability density function` (PDF) of a continuous random variable
:math:`X` is a function :math:`f_X` such that:

.. math::

   P(a ≤ X ≤ b) = ∫_{a, b}\ f_X(x) \ dx

   f_X(x) ≥ 0

   ∫_{-∞, +∞}\ f_X(x) \ dx = 1

The function :math:`f_X(x)` maps values :math:`x` of the random variable to
non-negative real numbers.

For a continuous random variable the probability of any single value is zero:

.. math:: P(X = a) = 0
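
The defining properties can be checked numerically. The sketch below is only an
illustration: the density :math:`f_X(x) = 2·x` on :math:`[0, 1]` and the helper
``integrate`` (a plain midpoint rule) are assumptions made for the example.

.. code-block:: python

   def integrate(f, lo, hi, steps=100_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   def pdf(x):
       """Hypothetical density: f(x) = 2*x on [0, 1], zero elsewhere."""
       return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


   total = integrate(pdf, 0.0, 1.0)        # normalization, should be close to 1
   p_interval = integrate(pdf, 0.25, 0.5)  # P(0.25 <= X <= 0.5) = 0.5**2 - 0.25**2
   p_point = integrate(pdf, 0.5, 0.5)      # zero-width interval, so P(X = 0.5) = 0

   print(total, p_interval, p_point)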

Expectation
===========

:def:`Expectation` of a continuous random variable is:

.. math:: μ = E[X] = ∫_{-∞, +∞}\ x·f_X(x) \ dx

Properties:

.. math::

   E[X + Y] = E[X] + E[Y]

   E[a·X] = a·E[X]

   E[a·X + b] = a·E[X] + b

Variance
========

:def:`Variance` of a continuous random variable is:

.. math:: var[X] = ∫_{-∞, +∞}\ (x-μ)²·f_X(x) \ dx

Properties:

.. math::

   var[a·X + b] = a²·var[X]

   var[X] = E[X²] - E²[X]
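
A numerical check of the expectation and variance formulas, as a rough sketch.
The density :math:`f_X(x) = 2·x` on :math:`[0, 1]`, the constants ``a``, ``b``
and the midpoint-rule helper ``integrate`` are assumptions chosen for the
example; for this density :math:`E[X] = 2/3` and :math:`var[X] = 1/18`.

.. code-block:: python

   def integrate(f, lo, hi, steps=100_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   def pdf(x):
       """Hypothetical density: f(x) = 2*x, used only on [0, 1]."""
       return 2.0 * x


   mean = integrate(lambda x: x * pdf(x), 0.0, 1.0)                   # E[X]
   var_def = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0.0, 1.0)  # definition
   second = integrate(lambda x: x * x * pdf(x), 0.0, 1.0)             # E[X^2]
   var_short = second - mean ** 2                                     # E[X^2] - E^2[X]

   # Moments of the linear transformation a*X + b, computed from the same density.
   a, b = 3.0, 1.0
   mean_y = integrate(lambda x: (a * x + b) * pdf(x), 0.0, 1.0)
   var_y = integrate(lambda x: (a * x + b - mean_y) ** 2 * pdf(x), 0.0, 1.0)

   print(mean, var_def, var_short, mean_y, var_y)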

Standard deviation
==================

:def:`Standard deviation` of a continuous random variable is:

.. math:: σ_X = sqrt(var[X])

Cumulative distribution functions
=================================

:def:`Cumulative distribution function` (CDF) of a random variable :math:`X` is:

.. math:: F_X(x) = P(X ≤ x) = ∫_{-∞, x}\ f_X(t) \ dt

So:

.. math::

   P(a ≤ X ≤ b) = F_X(b) - F_X(a) = ∫_{a,b}\ f_X(x) \ dx

   F_X(-∞) = 0

   F_X(+∞) = 1

and :math:`F_X(a) ≤ F_X(b)` for :math:`a ≤ b`.

Relation between CDF and PDF:

.. math:: (d(CDF(t))/dt)(x) = PDF(x)
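
The relation between CDF and PDF can be illustrated with a finite difference.
This is a sketch under assumptions: it uses the exponential distribution, whose
CDF is :math:`1 - exp(-λ·x)` for :math:`x ≥ 0`, with an arbitrarily chosen
``RATE``; the helper ``derivative`` is a hypothetical central-difference
approximation.

.. code-block:: python

   import math

   RATE = 1.5  # assumed parameter of the example distribution


   def pdf(x):
       """Exponential density, zero for negative x."""
       return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


   def cdf(x):
       """Exponential CDF, zero for negative x."""
       return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


   def derivative(f, x, h=1e-5):
       """Central-difference approximation of f'(x)."""
       return (f(x + h) - f(x - h)) / (2 * h)


   x = 0.7
   slope = derivative(cdf, x)        # should be close to pdf(x)
   p_interval = cdf(2.0) - cdf(1.0)  # P(1 <= X <= 2) = F(2) - F(1)

   print(slope, pdf(x), p_interval)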

Conditional probability
=======================

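:def:`Conditional probability` for a continuous random variable :math:`X` given
an event :math:`A` with :math:`P(A) > 0` is:

.. math:: P(X ∈ B | A) = ∫_{B}\ f_{X|A}(x) \ dx

When :math:`A` is an event of the form :math:`X ∈ A`, this equals:

.. math:: P(X ∈ B | A) = ∫_{A∩B}\ f_X(x) \ dx / P(A)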

:def:`Conditional expectation` of a continuous random variable is:

.. math:: E[X|A] = ∫_{-∞, +∞}\ x·f_{X|A}(x) \ dx

Properties:

.. math::

   E[g(X)|A] = ∫_{-∞, +∞}\ g(x)·f_{X|A}(x) \ dx
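
A small worked example of conditioning, as a sketch. It assumes :math:`X` is
uniform on :math:`[0, 1]` and takes the event :math:`A = \{0.5 ≤ X ≤ 1\}`; the
names ``cond_pdf``, ``A_LO``, ``A_HI`` and the midpoint-rule helper
``integrate`` are made up for the illustration. The conditional density is the
original density rescaled by :math:`1/P(A)` on :math:`A`.

.. code-block:: python

   def integrate(f, lo, hi, steps=100_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   def pdf(x):
       """Assumed density of X: uniform on [0, 1]."""
       return 1.0 if 0.0 <= x <= 1.0 else 0.0


   A_LO, A_HI = 0.5, 1.0             # the event A = {0.5 <= X <= 1}
   p_a = integrate(pdf, A_LO, A_HI)  # P(A) = 0.5


   def cond_pdf(x):
       """f_{X|A}(x) = f_X(x) / P(A) on A, zero outside A."""
       return pdf(x) / p_a if A_LO <= x <= A_HI else 0.0


   cond_mean = integrate(lambda x: x * cond_pdf(x), A_LO, A_HI)        # E[X|A]
   cond_second = integrate(lambda x: x * x * cond_pdf(x), A_LO, A_HI)  # E[X^2|A]

   print(p_a, cond_mean, cond_second)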

Independence
============

Random variables :math:`X` and :math:`Y` are :def:`independent` if for all :math:`x, y`:

.. math:: f_{X,Y}(x, y) = f_X(x)·f_Y(y)

Continuous uniform random variable
==================================

:def:`Continuous uniform random variable` on :math:`[a, b]` has PDF
:math:`f_X(x) = 1/(b-a)` for :math:`x` in :math:`[a, b]` and zero otherwise.

.. math::

   E[unif(a, b)] = (b+a)/2

   var[unif(a, b)] = (b-a)²/12

   σ = (b-a)/sqrt(12)

Proofs:

.. math::

   E[unif(a, b)] = ∫_{a, b}\ x·1/(b-a)·dx = x²/2/(b-a) |_{a, b} = (b²-a²)/(b-a)/2 = (b+a)/2

   E[unif²(a, b)] = ∫_{a, b} x²·1/(b-a)·dx = x³/3/(b-a) |_{a, b} = (b³-a³)/(b-a)/3 = (b²+b·a+a²)/3

   var[unif(a, b)] = E[unif²(a, b)] - E²[unif(a, b)] = (b²+b·a+a²)/3 - (b+a)²/4 = (b-a)²/12

.. note::

   In maxima::

     (%i4) factor((b^2+b*a+a^2)/3 - (a+b)^2/4);
                 2
          (b - a)
          --------
             12
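
The same formulas can be compared against a simulation. This is a rough Monte
Carlo sketch: the endpoints, the sample size and the seed are arbitrary
assumptions, and the estimates only approximate the exact values.

.. code-block:: python

   import random

   rng = random.Random(0)  # fixed seed so the run is repeatable
   a, b = 2.0, 5.0         # assumed endpoints of the interval
   n = 200_000

   samples = [rng.uniform(a, b) for _ in range(n)]
   mean = sum(samples) / n
   var = sum((x - mean) ** 2 for x in samples) / n

   print(mean, (a + b) / 2)        # sample mean vs. (b+a)/2
   print(var, (b - a) ** 2 / 12)   # sample variance vs. (b-a)^2/12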

Exponential random variables
============================

:def:`Exponential random variable` with parameter :math:`λ > 0` has PDF:

.. math:: f_X(x) = λ·exp(-λ·x)

for :math:`x ≥ 0`, and zero otherwise.

Properties:

.. math::

   E[exp(λ)] = 1/λ

   var[exp(λ)] = 1/λ²

Proof:

.. math::

  ∫_{-∞, +∞}\ f_X(x) \ dx = ∫_{0, +∞}\ λ·exp(-λ·x) \ dx = -exp(-λ·x) |_{0, +∞} = 1

  E[exp(λ)] = ∫_{0, +∞}\ x·λ·exp(-λ·x) \ dx = 1/λ

  E[exp²(λ)] = ∫_{0, +∞}\ x²·λ·exp(-λ·x) \ dx = 2/λ²

  var[exp(λ)] = E[exp²(λ)] - E²[exp(λ)] = 2/λ² - 1/λ² = 1/λ²

.. note::

   From maxima::

    (%i15) assume(lambda>0);
    (%o15)                           [lambda > 0]

    (%i16) integrate(lambda*%e^(-lambda*x),x,0,inf);
    (%o16)                                 1

    (%i17) integrate(x*lambda*%e^(-lambda*x),x,0,inf);
                                          1
    (%o17)                              ------
                                        lambda

    (%i18) integrate(x^2*lambda*%e^(-lambda*x),x,0,inf);
                                           2
    (%o18)                              -------
                                              2
                                        lambda
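
The three integrals can also be approximated numerically. A sketch under
assumptions: ``RATE`` is an arbitrary value of :math:`λ`, the infinite range is
truncated at ``UPPER`` (where the remaining tail is negligible), and
``integrate`` is a hypothetical midpoint-rule helper.

.. code-block:: python

   import math


   def integrate(f, lo, hi, steps=200_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   RATE = 2.0           # assumed value of lambda
   UPPER = 40.0 / RATE  # truncation point; the tail beyond it is about exp(-40)


   def pdf(x):
       """Exponential density for x >= 0."""
       return RATE * math.exp(-RATE * x)


   total = integrate(pdf, 0.0, UPPER)                         # close to 1
   mean = integrate(lambda x: x * pdf(x), 0.0, UPPER)         # close to 1/lambda
   second = integrate(lambda x: x * x * pdf(x), 0.0, UPPER)   # close to 2/lambda^2
   var = second - mean ** 2                                   # close to 1/lambda^2

   print(total, mean, second, var)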

Normal random variables
=======================

:def:`Normal random variable` with parameters :math:`μ` and :math:`σ > 0`
is defined by the PDF:

.. math:: f_X(x) = 1/sqrt(2·π)/σ·exp(-(x-μ)²/σ²/2)

The distribution is denoted :math:`norm(μ, σ²)`.

Properties:

.. math::

   E[norm(μ, σ²)] = μ

   var[norm(μ, σ²)] = σ²

Sum of two normal r.v.
======================

If :math:`Z = X + Y` where :math:`X ~ norm(μ_x, σ_x²)` and
:math:`Y ~ norm(μ_y, σ_y²)` are independent normal r.v. then:

.. math:: Z ~ norm(μ_x+μ_y, σ_x²+σ_y²)

Proof:

.. math::

   f_Z(z) = ∫_x\ f_X(x)·f_Y(z-x)\ dx

   = ∫_x\ 1/sqrt(2·π)/σ_x·exp(-(x-μ_x)²/σ_x²/2)·1/sqrt(2·π)/σ_y·exp(-(z-x-μ_y)²/σ_y²/2)\ dx

   = 1/sqrt(2·π·(σ_x² + σ_y²))·exp(-(z-μ_x-μ_y)²/(σ_x²+σ_y²)/2)
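
The statement can be sanity-checked by simulation. This is only a Monte Carlo
sketch: the parameter values, the seed and the sample size are assumptions. It
compares the sample mean and variance of :math:`Z` with the claimed parameters
and, as a crude shape check, the fraction of samples within one standard
deviation of the mean with the normal value :math:`erf(1/sqrt(2))`.

.. code-block:: python

   import math
   import random

   rng = random.Random(1)       # fixed seed so the run is repeatable
   MU_X, SIGMA_X = 1.0, 2.0     # assumed parameters of X
   MU_Y, SIGMA_Y = -3.0, 1.5    # assumed parameters of Y
   n = 200_000

   # Independent draws of X and Y, summed pairwise.
   zs = [rng.gauss(MU_X, SIGMA_X) + rng.gauss(MU_Y, SIGMA_Y) for _ in range(n)]

   mean_z = sum(zs) / n
   var_z = sum((z - mean_z) ** 2 for z in zs) / n

   mu_z = MU_X + MU_Y                                # claimed mean
   sigma_z = math.sqrt(SIGMA_X ** 2 + SIGMA_Y ** 2)  # claimed standard deviation

   within = sum(1 for z in zs if abs(z - mu_z) <= sigma_z) / n
   expected_within = math.erf(1 / math.sqrt(2))      # P(|Z - mu| <= sigma) for a normal r.v.

   print(mean_z, var_z, within, expected_within)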

Linear function of distribution
===============================

If :math:`Y = a·X + b` with :math:`a ≠ 0` then :math:`f_Y(y) = 1/|a|·f_X((y-b)/a)`.

Proof, for :math:`a > 0`:

.. math:: F_Y(y) = P(Y ≤ y) = P(a·X + b ≤ y) = P(X ≤ (y-b)/a) = F_X((y-b)/a)

so:

.. math:: f_Y(y) = d/dy\ F_Y(y) = d/dy\ F_X((y-b)/a) = 1/a·f_X((y-b)/a)

For :math:`a < 0` dividing by :math:`a` reverses the inequality:

.. math::

   P(Y > y) = P(a·X + b > y) = P(X < (y-b)/a) = F_X((y-b)/a)

   F_Y(y) = 1 - P(Y > y) = 1 - F_X((y-b)/a)

   f_Y(y) = d/dy\ F_Y(y) = -1/a·f_X((y-b)/a)

In both cases the factor equals :math:`1/|a|`, which gives the result.

If X has a uniform distribution with parameters :math:`c, d` then :math:`a·X + b`
also has a uniform distribution, with parameters :math:`a·c+b, a·d+b` for
:math:`a > 0` (the endpoints swap for :math:`a < 0`).

If X has an exponential distribution with parameter :math:`λ` then :math:`a·X`
also has an exponential distribution, with parameter :math:`λ/a`, for :math:`a > 0`.

If X has a normal distribution with parameters :math:`μ, σ²` then
:math:`a·X + b` also has a normal distribution, with parameters :math:`a·μ+b, (a·σ)²`.

Proofs.

When :math:`X ~ exp(λ)` and :math:`Y = a·X` with :math:`a > 0` then, for :math:`y ≥ 0`:

.. math:: f_Y(y) = 1/a·f_X(y/a) = λ/a·e^{-λ/a·y}

which is the PDF of :math:`exp(λ/a)`.

When :math:`X ~ norm(μ, σ²)` and :math:`Y = a·X + b` then:

.. math::

   f_Y(y) = 1/|a|·f_X((y-b)/a) = 1/|a|·1/sqrt(2·π)/σ·e^{-((y-b)/a - μ)²/σ²/2}

   = 1/sqrt(2·π)/(|a|·σ)·e^{-(y - (a·μ+b))²/(a·σ)²/2}

which is the PDF of :math:`norm(a·μ+b, (a·σ)²)`.
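
The normal case can be checked pointwise. In this sketch the helper names
``normal_pdf`` and ``linear_transform_pdf`` and all parameter values are
assumptions made for the illustration; a negative ``a`` is used on purpose to
exercise the :math:`1/|a|` factor.

.. code-block:: python

   import math


   def normal_pdf(x, mu, sigma):
       """Density of norm(mu, sigma^2) at x."""
       return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


   def linear_transform_pdf(f_x, a, b):
       """Return the density of Y = a*X + b given the density f_x of X (a != 0)."""
       if a == 0:
           raise ValueError("a must be non-zero")
       return lambda y: f_x((y - b) / a) / abs(a)


   mu, sigma = 1.0, 2.0   # assumed parameters of X
   a, b = -3.0, 0.5       # assumed coefficients of the transformation

   f_y = linear_transform_pdf(lambda x: normal_pdf(x, mu, sigma), a, b)

   # Largest difference from the density of norm(a*mu + b, (a*sigma)^2) at a few points.
   points = [-10.0, -2.5, 0.0, 4.0]
   max_gap = max(abs(f_y(y) - normal_pdf(y, a * mu + b, abs(a) * sigma)) for y in points)

   print(max_gap)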

Monotonic function of distribution
==================================

Let :math:`Y = g(X)` where :math:`g` is a strictly monotonic function on the range
:math:`[a, b]`. Then there is an inverse function :math:`h` with :math:`h(Y) = X`
on the range :math:`[g(a), g(b)]` (if :math:`g` is increasing) or on the range
:math:`[g(b), g(a)]` (if :math:`g` is decreasing). In that case:

.. math:: f_Y(y) = f_X(h(y))·|(d\ h(t)/dt)(y)|

Proof. Let :math:`g` be a monotonically increasing function. Thus:

.. math:: F_Y(y) = P(g(X) ≤ y) = P(X ≤ h(y)) = F_X(h(y))

and so:

.. math:: f_Y(y) = (d\ F_Y(t)/dt)(y) = (d\ F_X(h(t))/dt)(y) = f_X(h(y))·(d\ h(t)/dt)(y)

For a decreasing :math:`g` the inequality reverses, :math:`F_Y(y) = 1 - F_X(h(y))`,
so :math:`f_Y(y) = -f_X(h(y))·(d\ h(t)/dt)(y)`, which is non-negative because the
derivative of :math:`h` is negative.
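
A concrete instance, as a sketch: assume :math:`X` is uniform on :math:`[0, 1]`
and :math:`g(x) = e^x`, so :math:`h(y) = ln(y)` and the formula gives
:math:`f_Y(y) = 1/y` on :math:`[1, e]`. The function names and the
midpoint-rule helper ``integrate`` are assumptions of the example. The
absolute value in ``f_y`` changes nothing here because :math:`h` is increasing.

.. code-block:: python

   import math


   def integrate(f, lo, hi, steps=100_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   def f_x(x):
       """Assumed density of X: uniform on [0, 1]."""
       return 1.0 if 0.0 <= x <= 1.0 else 0.0


   def h(y):
       """Inverse of g(x) = exp(x)."""
       return math.log(y)


   def h_prime(y):
       """Derivative of the inverse function."""
       return 1.0 / y


   def f_y(y):
       """Density of Y = g(X) from the change-of-variable formula (y > 0)."""
       return f_x(h(y)) * abs(h_prime(y))


   total = integrate(f_y, 1.0, math.e)  # f_Y should integrate to 1 over [1, e]
   p_le_2 = integrate(f_y, 1.0, 2.0)    # P(Y <= 2) = P(X <= ln 2) = ln 2

   print(total, p_le_2, math.log(2.0))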

Convolution formula
===================

If :math:`Z = X + Y` and :math:`X` and :math:`Y` are independent r.v. then:

.. math:: f_Z(z) = ∫_x\ f_X(x)·f_Y(z-x)·dx

Proof:

Consider :math:`Z` conditioned on the event :math:`X=x`:

.. math:: f_{Z|X}(z|X=x) = f_{X+Y|X}(z|X=x) = f_{x+Y|X}(z|X=x)

Because of the independence of :math:`X` and :math:`Y`:

.. math:: f_{x+Y|X}(z|X=x) = f_{x+Y}(z) = f_Y(z-x)

Joint PDF of :math:`X` and :math:`Z` is:

.. math:: f_{X,Z}(x,z) = f_X(x)·f_{Z|X}(z|X=x) = f_X(x)·f_Y(z-x)

Integrating over :math:`x` we get:

.. math:: f_Z(z) = ∫_x\ f_{X,Z}(x,z)\ dx = ∫_x\ f_X(x)·f_Y(z-x)\ dx
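
The formula can be evaluated numerically. A sketch under assumptions: both
:math:`X` and :math:`Y` are taken uniform on :math:`[0, 1]`, for which the sum
has the triangular density :math:`z` on :math:`[0, 1]` and :math:`2 - z` on
:math:`[1, 2]`. The names ``convolve``, ``triangular_pdf`` and the
midpoint-rule helper ``integrate`` are hypothetical.

.. code-block:: python

   def integrate(f, lo, hi, steps=100_000):
       """Approximate the integral of f over [lo, hi] with the midpoint rule."""
       width = (hi - lo) / steps
       return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


   def uniform_pdf(x):
       """Density of unif(0, 1)."""
       return 1.0 if 0.0 <= x <= 1.0 else 0.0


   def convolve(f_x, f_y, z, lo, hi):
       """f_Z(z) as the integral of f_X(x) * f_Y(z - x) over x in [lo, hi]."""
       return integrate(lambda x: f_x(x) * f_y(z - x), lo, hi)


   def triangular_pdf(z):
       """Known density of the sum of two independent unif(0, 1) variables."""
       if 0.0 <= z <= 1.0:
           return z
       if 1.0 < z <= 2.0:
           return 2.0 - z
       return 0.0


   points = [0.25, 0.5, 1.0, 1.5, 2.5]
   max_gap = max(
       abs(convolve(uniform_pdf, uniform_pdf, z, 0.0, 1.0) - triangular_pdf(z))
       for z in points
   )

   print(max_gap)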

* https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions

Covariance
==========

Covariance of two r.v. is:

.. math:: cov(X, Y) = E[(X - E[X])·(Y - E[Y])]

Properties:

.. math:: cov(X, Y) = E[X·Y] - E[X]·E[Y]

.. math:: cov(X, X) = var(X)

.. math:: cov(a·X + b, Y) = a·cov(X, Y)

.. math:: cov(X, Y + Z) = cov(X, Y) + cov(X, Z)

.. math:: var(X + Y) = var(X) + var(Y) + 2·cov(X, Y)

Covariance of two independent r.v. is zero.

Proofs:

.. math::

   cov(X, Y) = E[(X - E[X])·(Y - E[Y])] = E[ X·Y - X·E[Y] - E[X]·Y + E[X]·E[Y] ]

   = E[X·Y] - E[X·E[Y]] - E[E[X]·Y] + E[E[X]·E[Y]]

   = E[X·Y] - E[X]·E[Y] - E[X]·E[Y] + E[X]·E[Y] = E[X·Y] - E[X]·E[Y]

.. math::

   cov(a·X + b, Y) = E[(a·X + b - E[a·X + b])·(Y - E[Y])]

   = E[(a·X + b - (a·E[X] + b))·(Y - E[Y])] = E[(a·X - a·E[X])·(Y - E[Y])]

   = a·E[(X - E[X])·(Y - E[Y])] = a·cov(X, Y)

.. math::

   cov(X, Y + Z) = E[(X - E[X])·(Y + Z - E[Y + Z])] = E[(X - E[X])·(Y - E[Y] + Z - E[Z])]

   = E[(X - E[X])·(Y - E[Y]) + (X - E[X])·(Z - E[Z])]

   = E[(X - E[X])·(Y - E[Y])] + E[(X - E[X])·(Z - E[Z])] = cov(X, Y) + cov(X, Z)

.. math::

   var(X) + var(Y) + 2·cov(X, Y) = E[X²] - (E[X])² + E[Y²] - (E[Y])² + 2·E[X·Y] - 2·E[X]·E[Y]

   = E[X² - X·E[X] + Y² - Y·E[Y] + 2·X·Y - X·E[Y] - Y·E[X]]

   = E[(X+Y)² - (X·E[X] + Y·E[Y] + X·E[Y] + Y·E[X])]

   = E[(X+Y)²] - E[(X+Y)·(E[X] + E[Y])] = E[(X+Y)²] - E[X+Y]·E[X+Y] = var(X+Y)

For independent r.v. :math:`X` and :math:`Y`:

.. math:: cov(X, Y) = E[(X - E[X])·(Y - E[Y])] = E[(X - E[X])]·E[(Y - E[Y])] = 0
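
The algebraic properties hold exactly for sample moments too, so they can be
checked on simulated data. This is a sketch: the helper names ``mean`` and
``cov``, the way the samples are generated, the constants and the seed are all
assumptions. Only the last check (independent samples) is approximate.

.. code-block:: python

   import random


   def mean(values):
       """Sample mean."""
       return sum(values) / len(values)


   def cov(u, v):
       """Sample covariance with 1/n normalization."""
       mu, mv = mean(u), mean(v)
       return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / len(u)


   rng = random.Random(2)  # fixed seed so the run is repeatable
   n = 10_000
   xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
   ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]  # correlated with xs by construction
   zs = [rng.gauss(0.0, 1.0) for _ in range(n)]      # drawn independently of xs

   a, b = -3.0, 7.0
   shortcut = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)  # E[XY] - E[X]E[Y]
   scaled = cov([a * x + b for x in xs], ys)                               # cov(aX + b, Y)
   summed = cov(xs, [y + z for y, z in zip(ys, zs)])                       # cov(X, Y + Z)
   sum_xy = [x + y for x, y in zip(xs, ys)]
   var_sum = cov(sum_xy, sum_xy)                                           # var(X + Y)

   print(cov(xs, ys), shortcut, scaled, summed, var_sum, cov(xs, zs))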

Correlation coefficient
=======================

Dimensionless version of covariance:

.. math:: ρ(X, Y) = E[(X-E[X])/σ_X·(Y-E[Y])/σ_Y] = cov(X, Y)/(σ_X·σ_Y)

It is defined only for cases when :math:`σ_X ≠ 0` and :math:`σ_Y ≠ 0`.

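By the Cauchy-Schwarz inequality :math:`-1 ≤ ρ(X, Y) ≤ +1`, and :math:`ρ(X, X) = 1`.

For independent r.v. :math:`ρ(X, Y) = 0`.

If :math:`|ρ(X, Y)| = 1` then :math:`X` and :math:`Y` are linearly dependent:
:math:`Y - E[Y] = c·(X - E[X])` for some constant :math:`c ≠ 0`, with
:math:`c > 0` when :math:`ρ = 1` and :math:`c < 0` when :math:`ρ = -1`.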

Properties:

.. math:: ρ(a·X + b, Y) = sign(a)·ρ(X, Y)
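
A short check of the sign property on simulated data, as a sketch. The helper
names ``mean``, ``cov`` and ``corr``, the sample construction and the seed are
assumptions; the sample correlation uses the same :math:`1/n` normalization
for covariance and variances.

.. code-block:: python

   import math
   import random


   def mean(values):
       """Sample mean."""
       return sum(values) / len(values)


   def cov(u, v):
       """Sample covariance with 1/n normalization."""
       mu, mv = mean(u), mean(v)
       return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / len(u)


   def corr(u, v):
       """Sample correlation coefficient cov(u, v) / (sigma_u * sigma_v)."""
       return cov(u, v) / math.sqrt(cov(u, u) * cov(v, v))


   rng = random.Random(3)  # fixed seed so the run is repeatable
   xs = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
   ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

   rho = corr(xs, ys)
   rho_flipped = corr([-4.0 * x + 1.0 for x in xs], ys)  # a < 0 flips the sign
   rho_linear = corr(xs, [3.0 * x - 2.0 for x in xs])    # exact linear relation

   print(rho, rho_flipped, rho_linear)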

Conditioned expectation
=======================

.. math:: E[X|Y] = ∫_X\ x·f_{X|Y}(x|Y)\ dx

Law of total expectation
========================

.. math:: E[X] = E[E[X|Y]]

Proof:

.. math::

   E[E[X|Y]] = ∫_Y\ f_Y(y)·∫_X\ x·f_{X|Y}(x|y)\ dx·dy

   = ∫_Y\ ∫_X\ x·f_Y(y)·f_{X|Y}(x|y)\ dx·dy = ∫_Y\ ∫_X\ x·f_{X,Y}(x,y)\ dx·dy

   = ∫_X\ x·∫_Y\ f_{X,Y}(x,y)\ dy·dx = ∫_X\ x·f_X(x)\ dx = E[X]

* https://en.wikipedia.org/wiki/Law_of_total_expectation
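
A simulation sketch of the law. The model is an assumption chosen for the
example: :math:`Y` is uniform on :math:`[0, 1]` and, given :math:`Y = y`,
:math:`X` is uniform on :math:`[0, y]`, so :math:`E[X|Y] = Y/2` and
:math:`E[X] = 1/4`. The seed and the sample size are arbitrary.

.. code-block:: python

   import random

   rng = random.Random(4)  # fixed seed so the run is repeatable
   n = 200_000

   ys = [rng.random() for _ in range(n)]       # Y ~ unif(0, 1)
   xs = [rng.uniform(0.0, y) for y in ys]      # X | Y=y ~ unif(0, y)

   mean_x = sum(xs) / n                        # direct estimate of E[X]
   mean_cond = sum(y / 2.0 for y in ys) / n    # estimate of E[E[X|Y]] with E[X|Y=y] = y/2

   print(mean_x, mean_cond)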

Iterated expectations with nested conditioning sets
===================================================

.. math:: E[X|A] = E[E[X|B]|A]

Conditional variance
====================

.. math:: var(X|Y=y) = E[(X - E[X|Y=y])² |Y=y]

* https://en.wikipedia.org/wiki/Conditional_variance

Law of total variance
=====================

.. math:: var(X) = E[var(X|Y)] + var(E[X|Y]) = E_Y[var_X(X|Y)] + var_Y(E_X[X|Y])

Proof:

.. math::

   var(X) = E[X²] - (E[X])² = E[E[X²|Y]] - (E[E[X|Y]])²

   = E[var(X|Y) + (E[X|Y])²] - (E[E[X|Y]])²

   = E[var(X|Y)] + E[(E[X|Y])²] - (E[E[X|Y]])² = E[var(X|Y)] + var(E[X|Y])

* https://en.wikipedia.org/wiki/Law_of_total_variance
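
A simulation sketch of the decomposition. The model is an assumption chosen
for the example: :math:`Y` is uniform on :math:`[0, 1]` and, given
:math:`Y = y`, :math:`X` is uniform on :math:`[0, y]`. Then
:math:`var(X|Y) = Y²/12` and :math:`E[X|Y] = Y/2`, which gives
:math:`E[var(X|Y)] = 1/36`, :math:`var(E[X|Y]) = 1/48` and
:math:`var(X) = 7/144`. The helper ``variance`` and the seed are hypothetical.

.. code-block:: python

   import random


   def variance(values):
       """Sample variance with 1/n normalization."""
       m = sum(values) / len(values)
       return sum((v - m) ** 2 for v in values) / len(values)


   rng = random.Random(5)  # fixed seed so the run is repeatable
   n = 200_000

   ys = [rng.random() for _ in range(n)]       # Y ~ unif(0, 1)
   xs = [rng.uniform(0.0, y) for y in ys]      # X | Y=y ~ unif(0, y)

   var_x = variance(xs)                        # direct estimate of var(X)
   within = sum(y * y / 12.0 for y in ys) / n  # E[var(X|Y)] with var(X|Y=y) = y^2/12
   between = variance([y / 2.0 for y in ys])   # var(E[X|Y]) with E[X|Y=y] = y/2

   print(var_x, within + between, within, between)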

Law of total covariance
=======================

* https://en.wikipedia.org/wiki/Law_of_total_covariance

Sum of normally distributed random variables
============================================

For independent :math:`X ~ norm(μ_X, σ_X²)` and :math:`Y ~ norm(μ_Y, σ_Y²)` the
random variable :math:`X+Y` also has a normal distribution:

.. math:: X + Y ~ norm(μ_X + μ_Y, σ_X² + σ_Y²)

* https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables