probability-discrete.rst
changeset 0 328995b5b8fd
child 5 60353d4d994d
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/probability-discrete.rst	Wed Mar 09 21:23:23 2016 +0200
@@ -0,0 +1,352 @@
+
+=============
+ Probability
+=============
+.. contents::
+   :local:
+
+.. role:: def
+   :class: def
+
+PMF
+===
+
+:def:`PMF` or :def:`probability mass function` or :def:`probability law` or
+:def:`probability distribution` of a discrete random variable is a function
+that, for a given value, gives the probability that the variable takes that
+value.
+
+The following notations are used to denote the PMF:
+
+.. math::
+
+   PMF(X = x) = P(X = x) = p_X(x) = P({ω ∈ Ω: X(ω) = x})
+
+where :math:`X` is a random variable on the sample space :math:`Ω` of outcomes,
+which are mapped to real numbers via :math:`X(ω)`.
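
A small sketch of the definition in Python, using a hypothetical experiment
(the sum of two fair dice) that is not from the notes themselves: the PMF
aggregates the probability of all outcomes :math:`ω` mapped to the same value.

```python
from fractions import Fraction
from collections import Counter

# Sample space Ω: ordered pairs of two fair dice, each outcome with p(ω) = 1/36.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = Fraction(1, 36)

# X(ω) = i + j maps each outcome to a number; the PMF sums p(ω)
# over all outcomes with X(ω) = x.
pmf = Counter()
for (i, j) in omega:
    pmf[i + j] += p

print(pmf[7])             # 1/6
print(sum(pmf.values()))  # 1 (a PMF sums to one)
```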
+
+Expected value
+==============
+
+:def:`Expected value` of a discrete random variable is:
+
+.. math::
+
+  E[X] = Σ_{ω∈Ω} X(ω)·p(ω) = Σ_{x} x·p_X(x)
+
+We write :math:`a ≤ X ≤ b` for :math:`∀ω∈Ω: a ≤ X(ω) ≤ b`.
+
+If :math:`X ≥ 0` then :math:`E[X] ≥ 0`.
+
+If :math:`a ≤ X ≤ b` then :math:`a ≤ E[X] ≤ b`.
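
The two sums in the definition can be checked against each other in Python; a
hypothetical example with one fair die and a non-identity mapping
:math:`X(ω) = ω \mod 2`:

```python
from fractions import Fraction
from collections import Counter

# One fair die; X(ω) = ω mod 2 maps the six outcomes onto {0, 1}.
p = Fraction(1, 6)
omega = range(1, 7)
X = lambda w: w % 2

# E[X] summed over outcomes ω ...
e_over_omega = sum(X(w) * p for w in omega)

# ... equals E[X] summed over values x with the PMF p_X.
pmf = Counter()
for w in omega:
    pmf[X(w)] += p
e_over_values = sum(x * px for x, px in pmf.items())

print(e_over_omega, e_over_values)  # 1/2 1/2
```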
+
+If :math:`Y = g(X)` (:math:`∀ ω∈Ω Y(ω) = g(X(ω))`) then:
+
+.. math::
+
+  E[Y] = Σ_{x} g(x) * p_X(x)
+
+**Proof**:
+
+.. math::
+
+  E[Y] = Σ_{y} y * p_Y(y)
+
+  = Σ_{y∈ℝ} y * Σ_{ω∈Ω: Y(ω)=y} p(ω)
+
+  = Σ_{y∈ℝ} y * Σ_{ω∈Ω: g(X(ω))=y} p(ω)
+
+  = Σ_{y∈ℝ} y * Σ_{x∈ℝ: g(x)=y} Σ_{ω∈Ω: X(ω) = x} p(ω)
+
+  = Σ_{y∈ℝ} y * Σ_{x∈ℝ: g(x)=y} p_X(x)
+
+  = Σ_{y∈ℝ} Σ_{x∈ℝ: g(x)=y} y * p_X(x)
+
+  = Σ_{x∈ℝ} Σ_{y∈ℝ: g(x)=y} y * p_X(x)
+
+  = Σ_{x} g(x) * p_X(x)
+
+Linearity of expectation:
+
+.. math::
+
+  E[a·X + b] = a·E[X] + b
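
Both the :math:`E[g(X)]` rule and :math:`E[a·X + b] = a·E[X] + b` can be
verified numerically; the loaded-die PMF below is a hypothetical example:

```python
from fractions import Fraction

# PMF of a hypothetical loaded die, as a dict x → p_X(x).
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

# E[g(X)] = Σ_x g(x)·p_X(x)
E = lambda g: sum(g(x) * px for x, px in pmf.items())

a, b = 3, 5
ex = E(lambda x: x)            # E[X] = 1/2 + 2/4 + 3/4 = 7/4
print(E(lambda x: a * x + b))  # 41/4
print(a * ex + b)              # 41/4, same by linearity
```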
+
+Variance
+========
+
+:def:`Variance` is:
+
+.. math::
+
+  var[X] = E[(X - E[X])^2] = E[X^2] - E^2[X]
+
+:def:`Standard deviation` is:
+
+.. math::
+
+  σ_X = √(var[X])
+
+Property:
+
+.. math::
+
+  var[a·X + b] = a²·var[X]
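
Both forms of the variance and the :math:`var[a·X + b] = a²·var[X]` property
can be checked on a small hypothetical PMF:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
E = lambda g: sum(g(x) * px for x, px in pmf.items())

mean = E(lambda x: x)                      # 1
var = E(lambda x: (x - mean) ** 2)         # by definition
var_alt = E(lambda x: x ** 2) - mean ** 2  # E[X²] - E²[X]
print(var, var_alt)                        # 1/2 1/2

# var[a·X + b] = a²·var[X]: the shift b does not change the spread.
a, b = 3, 7
pmf2 = {a * x + b: px for x, px in pmf.items()}
E2 = lambda g: sum(g(x) * px for x, px in pmf2.items())
m2 = E2(lambda x: x)
print(E2(lambda x: (x - m2) ** 2))         # 9/2 = a²·var[X]
```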
+
+
+Total probability theorem
+=========================
+
+Let :math:`A_i ∩ A_j = ∅` for :math:`i ≠ j` and :math:`∪_i A_i = Ω`:
+
+.. math::
+
+  p_X(x) = Σ_i P(A_i)·p_{X|A_i}(x)
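
A numeric check of the theorem, on a hypothetical two-dice setup partitioned
by the parity of the first die:

```python
from fractions import Fraction

# Two fair dice; partition Ω into A₁ = "first die even", A₂ = "first die odd".
p = Fraction(1, 36)
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
X = lambda w: w[0] + w[1]

def pmf_given(event):
    """Return P(A) and the conditional PMF p_{X|A}."""
    pa = sum(p for w in omega if event(w))
    out = {}
    for w in omega:
        if event(w):
            out[X(w)] = out.get(X(w), 0) + p / pa
    return pa, out

pa1, pmf1 = pmf_given(lambda w: w[0] % 2 == 0)
pa2, pmf2 = pmf_given(lambda w: w[0] % 2 == 1)

# p_X(7) recovered from the partition: Σ_i P(A_i)·p_{X|A_i}(7)
px7 = pa1 * pmf1.get(7, 0) + pa2 * pmf2.get(7, 0)
print(px7)  # 1/6
```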
+
+Conditional PMF
+===============
+
+:def:`Conditional PMF` is:
+
+.. math::
+
+  p_{X|A}(x) = P(X=x | A)
+
+  E[X|A] = ∑_x x·p_{X|A}(x)
+
+Total expectation theorem
+=========================
+
+.. math::
+
+  E[X] = Σ_i P(A_i)·E[X|A_i]
+
+To prove the theorem, multiply the total probability theorem by :math:`x` and
+sum over :math:`x`.
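
A quick numeric instance of the theorem, with one fair die partitioned into
the hypothetical events :math:`A_1 = \{ω ≤ 2\}` and :math:`A_2 = \{ω ≥ 3\}`:

```python
from fractions import Fraction

# One fair die: E[X] = 7/2. Conditional expectations on the partition:
pa1, e1 = Fraction(2, 6), Fraction(3, 2)  # P(A₁), E[X|A₁] = (1+2)/2
pa2, e2 = Fraction(4, 6), Fraction(9, 2)  # P(A₂), E[X|A₂] = (3+4+5+6)/4

print(pa1 * e1 + pa2 * e2)  # 7/2, as the theorem predicts
```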
+
+Joint PMF
+=========
+
+:def:`Joint PMF` of random variables :math:`X_1,...,X_n` is:
+
+.. math::
+
+   p_{X_1,...,X_n}(x_1,...,x_n) = P(X_1 = x_1 ∧ ... ∧ X_n = x_n)
+
+Properties:
+
+.. math::
+
+  E[X+Y] = E[X] + E[Y]
+
+Conditional PMF for joint PMF
+=============================
+
+:def:`Conditional PMF` is:
+
+.. math::
+
+  p_{X|Y}(x|y) = P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y)
+
+So:
+
+.. math::
+
+  p_{X,Y}(x,y) = p_Y(y)·p_{X|Y}(x|y) = p_X(x)·p_{Y|X}(y|x)
+
+  p_{X,Y,Z}(x,y,z) = p_Y(y)·p_{Z|Y}(z|y)·p_{X|Y,Z}(x|y,z)
+
+  ∑_{x,y} p_{X,Y|Z}(x,y|z) = 1
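
The multiplication rule can be verified on a small hypothetical joint PMF over
a 2×2 grid (the numbers below are made up for illustration):

```python
from fractions import Fraction

# A hypothetical joint PMF p_{X,Y}, as a dict (x, y) → probability.
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
         (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4)}

# Marginal p_Y and conditional p_{X|Y}.
p_y = {y: sum(p for (_, y2), p in joint.items() if y2 == y) for y in (0, 1)}
p_x_given_y = {(x, y): joint[(x, y)] / p_y[y] for (x, y) in joint}

# Multiplication rule: p_{X,Y}(x,y) = p_Y(y)·p_{X|Y}(x|y) for every cell.
assert all(joint[(x, y)] == p_y[y] * p_x_given_y[(x, y)] for (x, y) in joint)
print(p_x_given_y[(0, 1)])  # (3/8)/(5/8) = 3/5
```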
+
+Conditional expectation of joint PMF
+====================================
+
+:def:`Conditional expectation of joint PMF` is:
+
+.. math::
+
+  E[X|Y=y] = ∑_x x·p_{X|Y}(x|y)
+
+  E[g(X)|Y=y] = ∑_x g(x)·p_{X|Y}(x|y)
+
+Total probability theorem for joint PMF
+=======================================
+.. math::
+
+  p_X(x) = ∑_y p_Y(y)·p_{X|Y}(x|y)
+
+Total expectation theorem for joint PMF
+=======================================
+.. math::
+
+  E[X] = ∑_y p_Y(y)·E[X|Y=y]
+
+Independence of r.v.
+====================
+
+r.v. :math:`X` and :math:`Y` are :def:`independent` if:
+
+.. math::
+
+  ∀_{x,y}: p_{X,Y}(x,y) = p_X(x)·p_Y(y)
+
+So if two r.v. are independent:
+
+.. math::
+
+  E[X·Y] = E[X]·E[Y]
+
+  var(X+Y) = var(X) + var(Y)
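
For two independent fair coins (a hypothetical choice, the simplest factoring
joint PMF), :math:`E[X·Y] = E[X]·E[Y]` holds exactly:

```python
from fractions import Fraction

# Two independent fair coins: the joint PMF factors as p_X(x)·p_Y(y).
half = Fraction(1, 2)
joint = {(x, y): half * half for x in (0, 1) for y in (0, 1)}

E = lambda g: sum(g(x, y) * p for (x, y), p in joint.items())
print(E(lambda x, y: x * y))                  # 1/4
print(E(lambda x, y: x) * E(lambda x, y: y))  # 1/4 = E[X]·E[Y]
```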
+
+Well-known discrete r.v.
+========================
+
+Bernoulli random variable
+-------------------------
+
+:def:`Bernoulli random variable` with parameter :math:`p` is a random variable
+that has two outcomes, denoted :math:`0` and :math:`1`, with probabilities:
+
+.. math::
+
+  p_X(0) = 1 - p
+
+  p_X(1) = p
+
+This random variable models a single trial of an experiment that results in
+success or failure.
+
+:def:`Indicator` of event :math:`A` is the function::
+
+   I_A = 1 iff A occurs, else 0
+
+.. math::
+
+  p_{I_A}(1) = P(I_A = 1) = P(A)
+
+  I_A*I_B = I_{A∩B}
+
+.. math::
+
+  E[bernoulli(p)] = 0*(1-p) + 1*p = p
+
+  var[bernoulli(p)] = E[(bernoulli(p) - E[bernoulli(p)])²]
+
+   = (0-p)²·(1-p) + (1-p)²·p = p²·(1-p) + (1 - 2p + p²)·p
+
+   = p² - p³ + p - 2·p² + p³ = p·(1-p)
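
The derivation above can be checked by direct summation over the two-point
PMF; the helper function below is only a sketch:

```python
from fractions import Fraction

def bernoulli_moments(p):
    """Return (E[X], var[X]) of a Bernoulli(p) r.v. by direct summation."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * px for x, px in pmf.items())
    var = sum((x - mean) ** 2 * px for x, px in pmf.items())
    return mean, var

p = Fraction(1, 3)
print(bernoulli_moments(p) == (p, p * (1 - p)))  # True
```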
+
+Discrete uniform random variable
+--------------------------------
+
+:def:`Discrete uniform random variable` is a variable with integer parameters
+:math:`a ≤ b`, sample space :math:`{x ∈ ℤ: a ≤ x ≤ b}` and equal probability
+of each possible outcome:
+
+.. math::
+
+  p_{unif(a,b)}(x) = 1 / (b-a+1)
+
+.. math::
+
+  E[unif(a,b)] = Σ_{a ≤ x ≤ b} x * 1/(b-a+1)
+  = 1/(b-a+1) * Σ_{a ≤ x ≤ b} x
+
+  = 1/(b-a+1) * (Σ_{a ≤ x ≤ b} a + Σ_{0 ≤ x ≤ b-a} x)
+
+  = 1/(b-a+1) * ((b-a+1)*a + (b-a)*(b-a+1)/2)
+
+  = a + (b-a)/2
+  = (b+a)/2
+
+
+.. math::
+
+  var[unif(a,b)] = E[unif²(a,b)] - E²[unif(a,b)]
+
+  = ∑_{a≤x≤b} x²/(b-a+1) - (b+a)²/4
+
+  = 1/(b-a+1)·(∑_{0≤x≤b} x² - ∑_{0≤x≤a-1} x²) - (b+a)²/4
+
+  = 1/(b-a+1)·(b+3·b²+2·b³ - (a-1) - 3·(a-1)² - 2·(a-1)³)/6 - (b+a)²/4
+
+  = (2·b² + 2·a·b + b + 2·a² - a)/6 - (b+a)²/4
+
+  = (b - a)·(b - a + 2) / 12
+
+.. NOTE::
+
+   From Maxima::
+
+     sum(i^2,i,0,n), simpsum=true;
+
+              2      3
+       n + 3 n  + 2 n
+       ---------------
+             6
+
+     factor(b+3*b^2+2*b^3 - (a-1)-3*(a-1)^2-2*(a-1)^3);
+
+                       2                  2
+       (b - a + 1) (2 b  + 2 a b + b + 2 a  - a)
+
+     factor((2*b^2 + 2*a*b + b + 2*a^2 - a)/6 - (b+a)^2/4), simp=true;
+
+       (b - a) (2 - a + b)
+       -------------------
+               12
+
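
Both closed forms can be checked against direct summation; the parameter
values below are arbitrary:

```python
from fractions import Fraction

def unif_moments(a, b):
    """Mean and variance of unif(a, b) by direct summation over the support."""
    n = b - a + 1
    pmf = {x: Fraction(1, n) for x in range(a, b + 1)}
    mean = sum(x * px for x, px in pmf.items())
    var = sum((x - mean) ** 2 * px for x, px in pmf.items())
    return mean, var

a, b = 3, 10
mean, var = unif_moments(a, b)
print(mean == Fraction(a + b, 2))                  # True
print(var == Fraction((b - a) * (b - a + 2), 12))  # True
```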
+Binomial random variable
+------------------------
+
+:def:`Binomial random variable` is a r.v. with parameters :math:`n` (positive
+integer) and :math:`p` from the interval :math:`(0,1)`, with sample space the
+integers from the inclusive range :math:`[0, n]`:
+
+.. math::
+
+  p_{binom(n,p)}(x) = n!/(x!·(n-x)!)·p^x·(1-p)^{n-x}
+
+Binomial random variable models the number of successes in :math:`n`
+independent Bernoulli trials.
+
+If :math:`X_1,...,X_n` are independent :math:`bernoulli(p)` variables:
+
+.. math::
+
+  E[binom(n,p)] = E[∑_{i=1..n} X_i] = ∑_{i=1..n} E[X_i] = n·p
+
+  var[binom(n,p)] = var[∑_{i=1..n} X_i] = ∑_{i=1..n} var[X_i] = n·p·(1-p)
+
+Geometric random variable
+-------------------------
+
+:def:`Geometric random variable` is a r.v. with parameter :math:`p` from the
+half-open interval :math:`(0,1]`; the sample space is all positive integers:
+
+.. math::
+
+  p_{geom(p)}(x) = p·(1-p)^(x-1)
+
+This random variable models the number of tosses of a biased coin until the
+first success.
+
+.. math::
+
+  E[geom(p)] = ∑_{x=1..∞} x·p·(1-p)^(x-1)
+
+  = p·∑_{x=1..∞} x·(1-p)^(x-1)
+
+  = p/(1-p)·∑_{x=0..∞} x·(1-p)^x
+
+  = p/(1-p)·(1-p)/((1-p) - 1)² = p/p² = 1/p
+
+.. NOTE::
+
+   Maxima calculation::
+
+     load("simplify_sum");
+     simplify_sum(sum(k * x^k, k, 0, inf));
+       Is abs(x) - 1 positive, negative or zero?
+       negative;
+       Is x positive, negative or zero?
+       positive;
+       Is x - 1 positive, negative or zero?
+       negative;
+            x
+       ------------
+        2
+       x  - 2 x + 1
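
The geometric expectation :math:`E[geom(p)] = 1/p` can also be checked
numerically by truncating the infinite sum; the tail vanishes geometrically,
so a few hundred terms suffice:

```python
from fractions import Fraction

p = Fraction(1, 3)
geom_pmf = lambda x: p * (1 - p) ** (x - 1)

# Truncated version of E[geom(p)] = Σ_{x≥1} x·p·(1-p)^(x-1).
partial = sum(x * geom_pmf(x) for x in range(1, 200))
print(abs(float(partial) - float(1 / p)) < 1e-9)  # True: partial sum ≈ 3 = 1/p
```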