=============
Probability
=============
.. contents::
:local:
.. role:: def
:class: def
PMF
===
:def:`PMF` (:def:`probability mass function`, also called the
:def:`probability law` or :def:`probability distribution`) of a discrete random
variable is a function that, for a given value, gives the probability of that
value.
The following notations are used for a PMF:
.. math::
PMF(X = x) = P(X = x) = p_X(x) = P({ω ∈ Ω: X(ω) = x})
PMF(a ≤ X ≤ b) = P(a ≤ X ≤ b) = ∑_{a ≤ x ≤ b}\ P(X = x)
p_X(x) ≥ 0
∑_x\ p_X(x) = 1
where :math:`X` is a random variable on the space :math:`Ω` of outcomes, which
maps each outcome to a real number via :math:`X(ω)`.
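As a sketch (using a hypothetical two-coin example, not from the text), a PMF can be built directly from the outcome space :math:`Ω`:

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: Ω = outcomes of two fair coin tosses,
# X(ω) = number of heads; each outcome ω has probability 1/4.
omega = list(product("HT", repeat=2))
X = lambda w: w.count("H")

# p_X(x) = P({ω ∈ Ω: X(ω) = x})
pmf = {}
for w in omega:
    pmf[X(w)] = pmf.get(X(w), Fraction(0)) + Fraction(1, 4)
```

Both PMF axioms hold for the result: each :math:`p_X(x) ≥ 0` and the values sum to 1.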
Expected value
==============
:def:`Expected value` of a random variable :math:`X` with PMF :math:`p_X` is:
.. math::
   E[X] = Σ_{ω∈Ω} X(ω) * p(ω) = Σ_{x} x * p_X(x)
We write :math:`a ≤ X ≤ b` for :math:`∀ω∈Ω: a ≤ X(ω) ≤ b`.
If :math:`X ≥ 0` then :math:`E[X] ≥ 0`.
If :math:`a ≤ X ≤ b` then :math:`a ≤ E[X] ≤ b`.
If :math:`Y = g(X)` (:math:`∀ ω∈Ω Y(ω) = g(X(ω))`) then:
.. math::
E[Y] = Σ_{x} g(x) * p_X(x)
**Proof**:
.. math::
E[Y] = Σ_{y} y * p_Y(y)
= Σ_{y∈ℝ} y * Σ_{ω∈Ω: Y(ω)=y} p(ω)
= Σ_{y∈ℝ} y * Σ_{ω∈Ω: g(X(ω))=y} p(ω)
= Σ_{y∈ℝ} y * Σ_{x∈ℝ: g(x)=y} Σ_{ω∈Ω: X(ω) = x} p(ω)
= Σ_{y∈ℝ} y * Σ_{x∈ℝ: g(x)=y} p_X(x)
= Σ_{y∈ℝ} Σ_{x∈ℝ: g(x)=y} y * p_X(x)
= Σ_{x∈ℝ} Σ_{y∈ℝ: g(x)=y} y * p_X(x)
= Σ_{x} g(x) * p_X(x)
.. math::
E[a*X + b] = a*E[X] + b
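The definition of :math:`E[X]` and the linearity property above can be checked exactly for a small PMF (a sketch using a fair six-sided die, an example not from the text):

```python
from fractions import Fraction

def expectation(pmf):
    # E[X] = Σ_x x·p_X(x)
    return sum(x * p for x, p in pmf.items())

# Example: a fair six-sided die
die = {x: Fraction(1, 6) for x in range(1, 7)}

# PMF of a·X + b: each value x moves to a·x + b, probabilities unchanged
a, b = 3, 2
shifted = {a * x + b: p for x, p in die.items()}
```

Here `expectation(shifted)` equals `a * expectation(die) + b` exactly, with no floating-point error thanks to `Fraction`.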
Variance
========
:def:`Variance` is:
.. math::
var[X] = E[(X - E[X])²] = E[X²] - (E[X])²
:def:`Standard deviation` is:
.. math::
   σ_X = √(var[X])
Property:
.. math::
var(a*X + b) = a² · var[X]
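A sketch verifying :math:`var(a·X + b) = a²·var(X)` exactly on a fair die (the values :math:`a = -2, b = 5` are chosen arbitrarily):

```python
from fractions import Fraction

def expectation(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    # var[X] = E[(X - E[X])²]
    m = expectation(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}
a, b = -2, 5
transformed = {a * x + b: p for x, p in die.items()}
```

The shift :math:`b` drops out entirely, and the scale :math:`a` enters squared, so the sign of :math:`a` does not matter.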
Total probability theorem
=========================
Let :math:`A_i ∩ A_j = ∅` for :math:`i ≠ j` and :math:`∪_i\ A_i = Ω`:
.. math::
p_X(x) = Σ_i P(A_i)·p_{X|A_i}(x)
* https://en.wikipedia.org/wiki/Law_of_total_probability
Conditional PMF on event
========================
:def:`Conditional PMF on event` is:
.. math::
p_{X|A}(x) = P(X=x | A)
E[X|A] = ∑_x\ x·p_{X|A}(x)
Total expectation theorem
=========================
.. math::
E[X] = Σ_i\ P(A_i)·E[X|A_i]
To prove the theorem, multiply the total probability theorem by :math:`x` and
sum over :math:`x`.
Joint PMF
=========
:def:`Joint PMF` of random variables :math:`X_1,...,X_n` is:
.. math::
   p_{X_1,...,X_n}(x_1,...,x_n) = P(X_1 = x_1, ..., X_n = x_n)
Properties:
.. math::
E[X+Y] = E[X] + E[Y]
Conditional joint PMF
=====================
:def:`Conditional joint PMF` is:
.. math::
p_{X|Y}(x|y) = P(X=x | Y=y) = P(X=x \& Y=y) / P(Y=y)
So:
.. math::
p_{X,Y}(x,y) = p_Y(y)·p_{X|Y}(x|y) = p_X(x)·p_{Y|X}(y|x)
p_{X,Y,Z}(x,y,z) = p_Y(y)·p_{Z|Y}(z|y)·p_{X|Y,Z}(x|y,z)
∑_{x,y}\ p_{X,Y|Z}(x,y|z) = 1
Conditional expectation of joint PMF
====================================
:def:`Conditional expectation of joint PMF` is:
.. math::
E[X|Y=y] = ∑_x\ x·p_{X|Y}(x|y)
E[g(X)|Y=y] = ∑_x\ g(x)·p_{X|Y}(x|y)
Total probability theorem for joint PMF
=======================================
.. math::
p_X(x) = ∑_y\ p_Y(y)·p_{X|Y}(x|y)
Total expectation theorem for joint PMF
=======================================
.. math::
E[X] = ∑_y\ p_Y(y)·E[X|Y=y]
Proof:
.. math::
∑_y\ p_Y(y)·E[X|Y=y] = ∑_y\ p_Y(y)·∑_x\ x·p_{X|Y}(x|y)
= ∑_y\ ∑_x\ p_Y(y)·x·p_{X|Y}(x|y) = ∑_x\ ∑_y\ x·p_Y(y)·p_{X|Y}(x|y)
= ∑_x\ x·∑_y\ p_Y(y)·p_{X|Y}(x|y) = ∑_x\ x·p_X(x) = E[X]
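The theorem can be checked exactly on a small made-up joint PMF (the numbers below are illustrative, not from the text):

```python
from fractions import Fraction

# Illustrative joint PMF p_{X,Y}(x,y) on {0,1}×{0,1}
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4)}

# Marginal p_Y(y) = Σ_x p_{X,Y}(x,y)
p_Y = {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, Fraction(0)) + p

def E_X_given(y):
    # E[X|Y=y] = Σ_x x·p_{X|Y}(x|y) = Σ_x x·p_{X,Y}(x,y)/p_Y(y)
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_Y[y]

E_X = sum(x * p for (x, _), p in joint.items())
```

Here `sum(p_Y[y] * E_X_given(y) for y in p_Y)` reproduces `E_X` exactly.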
Conditional expectation as a random variable
============================================
:def:`Conditional expectation` :math:`E[X|Y]` is the random variable
defined as:
.. math:: E[X|Y](y) = E[X|Y=y]
Property:
.. math:: E[g(Y)·X|Y] = g(Y)·E[X|Y]
For an invertible function :math:`h`:
.. math:: E[X|h(Y)] = E[X|Y]
Proof:
.. math::
E[X|Y=y] = E[X|h(Y)=h(y)]
Law of Iterated Expectations
============================
.. math:: E[E[X|Y]] = E[X]
Proof (using total expectation theorem):
.. math::
   E[E[X|Y]] = ∑_y\ p_Y(y)·E[X|Y](y) = ∑_y\ p_Y(y)·E[X|Y=y] = E[X]
Generalisation of Law of Iterated Expectations:
.. math:: E[E[X|Y,Z]|Y] = E[X|Y]
Proof, for each value :math:`y` of :math:`Y`:
.. math::
E[X|Y=y] = ∑_x\ x·p_{X|Y}(x|Y=y) = ∑_x\ x·p_{X,Y}(x,y)/p_Y(y)
= ∑_x\ x·∑_z\ p_{X,Y,Z}(x,y,z)/p_Y(y)
= ∑_x\ x·∑_z\ p_{X|Y,Z}(x|Y=y,Z=z)·p_{Y,Z}(y,z)/p_Y(y)
= ∑_x\ x·∑_z\ p_{X|Y,Z}(x|Y=y,Z=z)·p_{Z|Y}(z|Y=y)
= ∑_x\ ∑_z\ x·p_{X|Y,Z}(x|Y=y,Z=z)·p_{Z|Y}(z|Y=y)
= ∑_z\ ∑_x\ x·p_{X|Y,Z}(x|Y=y,Z=z)·p_{Z|Y}(z|Y=y)
= ∑_z\ p_{Z|Y}(z|Y=y)·∑_x\ x·p_{X|Y,Z}(x|Y=y,Z=z)
   = ∑_z\ p_{Z|Y}(z|Y=y)·E[X|Y=y,Z=z] = E[E[X|Y,Z]|Y=y]
Conditional variance
====================
:def:`Conditional variance` of :math:`X` given :math:`Y` is the r.v.:
.. math:: var(X|Y)(y) = var(X|Y=y) = E[(X - E[X|Y=y])²|Y=y]
or in another notation:
.. math:: var(X|Y) = E[X²|Y] - (E[X|Y])²
Law of total variance
=====================
Taking the expectation over :math:`Y` of both sides of the conditional variance
identity:
.. math:: E[var(X|Y)] = E[E[X²|Y]] - E[(E[X|Y])²] = E[X²] - E[(E[X|Y])²]
On the other hand:
.. math:: var(E[X|Y]) = E[(E[X|Y])²] - (E[E[X|Y]])² = E[(E[X|Y])²] - (E[X])²
Adding the last two expressions:
.. math:: E[var(X|Y)] + var(E[X|Y]) = E[X²] - (E[X])² = var(X)
So:
.. math:: var(X) = E[var(X|Y)] + var(E[X|Y])
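A sketch checking :math:`var(X) = E[var(X|Y)] + var(E[X|Y])` exactly on a small made-up joint PMF (values chosen arbitrarily):

```python
from fractions import Fraction

# Made-up joint PMF p_{X,Y}(x,y)
joint = {(0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8),
         (2, 1): Fraction(1, 2), (3, 1): Fraction(1, 4)}

p_Y, p_X = {}, {}
for (x, y), p in joint.items():
    p_Y[y] = p_Y.get(y, Fraction(0)) + p
    p_X[x] = p_X.get(x, Fraction(0)) + p

def E(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    m = E(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def cond_X(y):
    # p_{X|Y}(·|y)
    return {x: p / p_Y[y] for (x, yy), p in joint.items() if yy == y}

# E[var(X|Y)]
E_var = sum(p_Y[y] * var(cond_X(y)) for y in p_Y)
# PMF of the r.v. E[X|Y]: value E[X|Y=y] occurs with probability p_Y(y)
pmf_EXY = {}
for y in p_Y:
    v = E(cond_X(y))
    pmf_EXY[v] = pmf_EXY.get(v, Fraction(0)) + p_Y[y]
```

With these numbers, `E_var + var(pmf_EXY)` reproduces `var(p_X)` exactly.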
Independence of r.v.
====================
r.v. :math:`X` and :math:`Y` are :def:`independent` if:
.. math::
∀_{x,y}: p_{X,Y}(x,y) = p_X(x)·p_Y(y)
So if two r.v. are independent:
.. math::
E[X·Y] = E[X]·E[Y]
var(X+Y) = var(X) + var(Y)
Convolution formula
===================
If :math:`Z = X + Y` where :math:`X` and :math:`Y` are independent r.v., then:
.. math:: p_Z(z) = ∑_x\ p_X(x)·p_Y(z-x)
Proof:
.. math::
   p_Z(z) = ∑_{x,y:x+y=z}\ P(X=x, Y=y) = ∑_x\ P(X=x, Y=z-x)
   = ∑_x\ P(X=x)·P(Y=z-x) = ∑_x\ p_X(x)·p_Y(z-x)
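The convolution formula as a sketch, computing the PMF of the sum of two fair dice (an example not from the text):

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

def convolve(p_X, p_Y):
    # p_Z(z) = Σ_x p_X(x)·p_Y(z-x) for independent X, Y
    p_Z = {}
    for x, px in p_X.items():
        for y, py in p_Y.items():
            p_Z[x + y] = p_Z.get(x + y, Fraction(0)) + px * py
    return p_Z

two_dice = convolve(die, die)
```

The familiar triangular shape appears: 7 is the most likely sum (6 of the 36 equally likely pairs), while 2 and 12 each have probability 1/36.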
Sum of a random number of r.v.
==============================
Let :math:`X_i` be independent, identically distributed r.v. and let :math:`Y =
∑_{i=1..N}\ X_i`, where :math:`N` is a r.v. independent of the :math:`X_i`. Then:
.. math::
E[Y|N=n] = n·E[X]
E[Y|N] = N·E[X]
Proof:
.. math:: E[Y|N=n] = E[∑_{i=1..N}\ X_i |N=n] = E[∑_{i=1..n}\ X_i] = ∑_{i=1..n}\ E[X_i] = n·E[X]
Variance of the sum of a random number of independent r.v.:
.. math:: var(∑_{i=1..N}\ X_i) = E[N]·var(X) + (E[X])²·var(N)
Proof:
.. math::
var(Y|N=n) = var[∑_{i=1..N}\ X_i|N=n] = var[∑_{i=1..n}\ X_i] = ∑_{i=1..n}\ var[X_i] = n·var(X)
var(Y) = E[var(Y|N)] + var(E[Y|N]) = E[N]·var(X) + (E[X])²·var(N)
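Both identities can be verified exactly by building the PMF of :math:`Y` via the total probability theorem (a sketch with fair dice for the :math:`X_i` and :math:`N` uniform on :math:`{1,2,3}`, made-up values):

```python
from fractions import Fraction

# Made-up example: X_i are fair dice, N is uniform on {1, 2, 3}
die = {x: Fraction(1, 6) for x in range(1, 7)}
p_N = {n: Fraction(1, 3) for n in (1, 2, 3)}

def E(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    m = E(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def convolve(a, b):
    out = {}
    for x, px in a.items():
        for y, py in b.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out

# p_Y(y) = Σ_n P(N=n)·p_{X_1+...+X_n}(y)  (total probability theorem)
p_Y = {}
partial = {0: Fraction(1)}            # PMF of the empty sum
for n in sorted(p_N):
    partial = convolve(partial, die)  # now the PMF of X_1 + ... + X_n
    for y, q in partial.items():
        p_Y[y] = p_Y.get(y, Fraction(0)) + p_N[n] * q
```

Here `E(p_Y)` equals `E(p_N) * E(die)` and `var(p_Y)` equals `E(p_N) * var(die) + E(die)**2 * var(p_N)`, exactly.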
Well known discrete r.v.
========================
Bernoulli random variable
-------------------------
:def:`Bernoulli random variable` with parameter :math:`p` is a random variable
that has two outcomes, denoted :math:`0` and :math:`1`, with probabilities:
.. math::
p_X(0) = 1 - p
p_X(1) = p
This random variable models a single trial of an experiment that results in
success or failure.
The :def:`indicator` of an event :math:`A` is the function::
I_A = 1 iff A occurs, else 0
.. math::
   p_{I_A}(1) = P(I_A = 1) = P(A)
I_A*I_B = I_{A∩B}
.. math::
E[bernoulli(p)] = 0*(1-p) + 1*p = p
   var[bernoulli(p)] = E[(bernoulli(p) - E[bernoulli(p)])²]
= (0-p)²·(1-p) + (1-p)²·p = p²·(1-p) + (1 - 2p + p²)·p
= p² - p³ + p - 2·p² + p³ = p·(1-p)
Discrete uniform random variable
--------------------------------
:def:`Discrete uniform random variable` is a r.v. with integer parameters
:math:`a` and :math:`b` (:math:`a ≤ b`), sample space :math:`{x ∈ ℤ: a ≤ x ≤ b}`,
and equal probability for each possible outcome:
.. math::
p_{unif(a,b)}(x) = 1 / (b-a+1)
.. math::
E[unif(a,b)] = Σ_{a ≤ x ≤ b} x * 1/(b-a+1)
= 1/(b-a+1) * Σ_{a ≤ x ≤ b} x
= 1/(b-a+1) * (Σ_{a ≤ x ≤ b} a + Σ_{0 ≤ x ≤ b-a} x)
= 1/(b-a+1) * ((b-a+1)*a + (b-a)*(b-a+1)/2)
= a + (b-a)/2
= (b+a)/2
.. math::
var[unif(a,b)] = E[unif²(a,b)] - E²[unif(a,b)]
= ∑_{a≤x≤b} x²/(b-a+1) - (b+a)²/4
= 1/(b-a+1)·(∑_{0≤x≤b} x² - ∑_{0≤x≤a-1} x²) - (b+a)²/4
   = 1/(b-a+1)·(b+3·b²+2·b³ - ((a-1)+3·(a-1)²+2·(a-1)³))/6 - (b+a)²/4
= (2·b² + 2·a·b + b + 2·a² - a)/6 - (b+a)²/4
= (b - a)·(b - a + 2) / 12
.. NOTE::
From Maxima::
sum(i^2,i,0,n), simpsum=true;
2 3
n + 3 n + 2 n
---------------
6
factor(b+3*b^2+2*b^3 - (a-1)-3*(a-1)^2-2*(a-1)^3);
2 2
(b - a + 1) (2 b + 2 a b + b + 2 a - a)
factor((2*b^2 + 2*a*b + b + 2*a^2 - a)/6 - (b+a)^2/4), simp=true;
(b - a) (2 - a + b)
-------------------
12
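The closed forms for the mean and variance can also be checked exactly for sample parameters (a sketch; :math:`a = 3, b = 10` are chosen arbitrarily):

```python
from fractions import Fraction

# Arbitrary example parameters
a, b = 3, 10
n = b - a + 1
pmf = {x: Fraction(1, n) for x in range(a, b + 1)}

mean = sum(x * p for x, p in pmf.items())
variance = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2
```

Both results match the derived formulas :math:`(b+a)/2` and :math:`(b-a)(b-a+2)/12`.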
Binomial random variable
------------------------
:def:`Binomial random variable` is a r.v. with parameters :math:`n` (a positive
integer) and :math:`p` from the interval :math:`(0,1)`, whose sample space is the
integers in the inclusive range :math:`[0, n]`:
.. math::
   p_{binom(n,p)}(x) = n!/(x!*(n-x)!) p^x (1-p)^{n-x}
Binomial random variable models the number of successes in :math:`n` independent
Bernoulli trials.
.. math::
E[binom(n,p)] = E[∑_{1≤x≤n} bernoulli(p)] = ∑_{1≤x≤n} E[bernoulli(p)] = n·p
var[binom(n,p)] = var[∑_{1≤x≤n} bernoulli(p)] = ∑_{1≤x≤n} var[bernoulli(p)] = n·p·(1-p)
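A sketch checking :math:`E = n·p` and :math:`var = n·p·(1-p)` directly from the PMF, exactly (arbitrary example parameters :math:`n = 5, p = 1/3`):

```python
from fractions import Fraction
from math import comb

# Arbitrary example parameters
n, p = 5, Fraction(1, 3)
# p_{binom(n,p)}(x) = C(n,x)·p^x·(1-p)^(n-x)
pmf = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

mean = sum(x * q for x, q in pmf.items())
variance = sum(x ** 2 * q for x, q in pmf.items()) - mean ** 2
```

Computing from the PMF agrees with the derivation via a sum of :math:`n` Bernoulli r.v. above.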
Geometric random variable
-------------------------
:def:`Geometric random variable` is a r.v. with parameter :math:`p` from the
half-open interval :math:`(0,1]`; the sample space is all positive integers:
.. math::
p_{geom(p)}(x) = p (1-p)^(x-1)
This random variable models the number of tosses of a biased coin until the
first success.
.. math::
E[geom(p)] = ∑_{x=1..∞} x·p·(1-p)^(x-1)
= p·∑_{x=1..∞} x·(1-p)^(x-1)
= p/(1-p)·∑_{x=0..∞} x·(1-p)^x
   = p/(1-p)·(1-p)/(1-(1-p))² = p/p² = 1/p
.. NOTE::
Maxima calculation::
load("simplify_sum");
simplify_sum(sum(k * x^k, k, 0, inf));
Is abs(x) - 1 positive, negative or zero?
negative;
Is x positive, negative or zero?
positive;
Is x - 1 positive, negative or zero?
negative;
x
------------
2
x - 2 x + 1
.. math::
E[(geom(p))²] = ∑_{x=1..∞} x²·p·(1-p)^(x-1)
= p·∑_{x=1..∞} x²·(1-p)^(x-1)
= p/(1-p)·∑_{x=0..∞} x²·(1-p)^x
= p/(1-p)·(1-p)·(1-p+1)/(1 - (1-p))³ = p·(2-p)/p³ = (2-p)/p²
.. NOTE::
Maxima calculation::
load("simplify_sum");
(%i3) assume(x>0);
(%o3) [x > 0]
(%i4) assume(x<1);
(%o4) [x < 1]
(%i8) simplify_sum(sum(k^2 * x^k, k, 0, inf));
2
x + x
(%o8) - -------------------
3 2
x - 3 x + 3 x - 1
So:
.. math:: var(geom(p)) = E[(geom(p))²] - E[geom(p)]² = (2-p)/p² - 1/p² = (1-p)/p²
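As a numeric sanity check of :math:`E[geom(p)] = 1/p` and :math:`var(geom(p)) = (1-p)/p²` (a sketch with an arbitrary :math:`p = 0.3`; the tail beyond :math:`x = 2000` is numerically negligible):

```python
# Arbitrary example parameter; truncate the infinite sum at x = 2000
p = 0.3
pmf = {x: p * (1 - p) ** (x - 1) for x in range(1, 2001)}

mean = sum(x * q for x, q in pmf.items())
second_moment = sum(x * x * q for x, q in pmf.items())
variance = second_moment - mean ** 2
```

The truncated sums agree with :math:`1/p` and :math:`(1-p)/p²` to well within floating-point tolerance.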