Conditional Distribution | Its 5 Important Properties

Conditional distribution

   It is interesting to discuss the conditional distribution of one random variable given another. We first briefly look at the conditional distribution for both discrete and continuous random variables, and then, after studying some prerequisites, we focus on conditional expectations.

Discrete conditional distribution

     With the help of the joint probability mass function, we define the conditional distribution of the discrete random variable X given Y, using conditional probability, as the distribution with the probability mass function

p_{X|Y}(x|y)=P\left\{ X=x|Y=y \right\}

=\frac{P\left\{ X=x, Y=y \right\}}{P\left\{ Y=y \right\}}

=\frac{p(x,y)}{p_{Y}(y)}

provided the denominator probability is greater than zero. Similarly we can write the conditional distribution function as

F_{X|Y}(x|y)=P \left\{ X\leq x|Y=y \right\}

=\sum_{a\leq x} p_{X|Y} (a|y)

If X and Y are independent random variables, then this conditional probability mass function reduces to

p_{X|Y} (x|y) = P \left\{ X=x|Y=y \right\}

=\frac{P\left\{ X=x, Y=y \right\}}{P\left\{ Y=y \right\}}

=P\left\{ X=x \right\}

So the discrete conditional distribution of X given Y is the distribution with the above probability mass function; the conditional distribution of Y given X is defined in the same way.

Examples on discrete conditional distribution

  1. Find the conditional probability mass function of the random variable X given Y=1, if the joint probability mass function of the random variables X and Y takes the values

p(0,0)=0.4 , p(0,1)=0.2, p(1,0)= 0.1, p(1,1)=0.3

Now first of all for the value Y=1 we have

p_{Y}(1)=\sum_{x}p(x,1)=p(0,1)+p(1,1)=0.5

so using the definition of the conditional probability mass function

p_{X|Y}(x|y)=P\left\{ X=x|Y=y \right\}

=\frac{P\left\{ X=x,Y=y \right\}}{P\left\{ Y=y \right\}}

=\frac{p(x,y)}{p_{Y}(y)}

we have

p_{X|Y}(0|1)=\frac{p(0,1)}{p_{Y}(1)}=\frac{2}{5}

and

p_{X|Y}(1|1)=\frac{p(1,1)}{p_{Y}(1)}=\frac{3}{5}
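As a quick numerical check of this example, the following minimal Python sketch (standard library only; the helper name is mine) tabulates the joint pmf and computes p_{X|Y}(x|1) exactly as in the definition above.

```python
# Joint pmf of (X, Y) from the example, stored as a dictionary
joint_pmf = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}

def conditional_pmf_given_y(joint, y):
    """Return p_{X|Y}(x|y) = p(x, y) / p_Y(y) for all x with p(x, y) > 0."""
    p_y = sum(p for (x, yy), p in joint.items() if yy == y)  # marginal p_Y(y)
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

print(conditional_pmf_given_y(joint_pmf, 1))  # {0: 0.4, 1: 0.6}, i.e. 2/5 and 3/5
```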

  • Obtain the conditional distribution of X given X+Y=n, where X and Y are independent Poisson random variables with parameters λ1 and λ2.

Since the random variables X and Y are independent, the conditional distribution has probability mass function

P\left\{ X=k|X + Y=n \right\} =\frac{P\left\{ X=k, X+Y =n \right\}}{P\left\{ X+Y=n \right\}}

=\frac{P\left\{ X=k, Y =n -k \right\}}{P\left\{ X+Y=n \right\}}

=\frac{P\left\{ X=k \right\} P\left\{ Y=n-k \right\}}{P\left\{ X+Y=n \right\}}

since the sum of independent Poisson random variables is again Poisson, so

P\left\{ X=k|X +Y =n \right\} =\frac{e^{-\lambda_{1}}\lambda_{1}^{k}}{k!}\frac{e^{-\lambda_{2}}\lambda_{2}^{n-k}}{(n-k)!}\left [ \frac{e^{-(\lambda_{1}+\lambda_{2})}(\lambda_{1}+\lambda_{2})^{n}}{n!} \right ]^{-1}

=\frac{n!}{(n-k)!k!}\frac{\lambda_{1}^{k}\lambda_{2}^{n-k}}{(\lambda_{1}+\lambda_{2})^{n}}

=\binom{n}{k} \left ( \frac{\lambda_{1}}{\lambda_{1}+\lambda_{2}} \right )^{k}\left ( \frac{\lambda_{2}}{\lambda_{1}+\lambda_{2}} \right )^{n-k}

Thus the conditional distribution of X given X+Y=n is the binomial distribution with parameters n and λ1/(λ1+λ2). The above case can be generalized to more than two random variables.
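As a hedged sanity check of this result (assuming scipy is available), the sketch below compares P{X = k | X + Y = n} computed directly from Poisson pmfs with the binomial pmf derived above, for illustrative values of λ1, λ2 and n.

```python
from scipy.stats import poisson, binom

lam1, lam2, n = 2.0, 3.0, 7   # illustrative parameter values
# Direct computation: P{X=k} P{Y=n-k} / P{X+Y=n}, using independence and
# the fact that X + Y is Poisson(lam1 + lam2)
direct = [poisson.pmf(k, lam1) * poisson.pmf(n - k, lam2) / poisson.pmf(n, lam1 + lam2)
          for k in range(n + 1)]
# Binomial(n, lam1/(lam1+lam2)) pmf from the closed form above
closed = [binom.pmf(k, n, lam1 / (lam1 + lam2)) for k in range(n + 1)]
print(max(abs(d - c) for d, c in zip(direct, closed)))  # ~0, floating point error only
```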

Continuous conditional distribution

   The continuous conditional distribution of the random variable X given Y=y is the continuous distribution with the probability density function

f_{X|Y}(x|y)=\frac{f(x,y)}{f_{Y}(y)}

provided the denominator density is greater than zero. Heuristically, for small dx and dy this density satisfies

f_{X|Y}(x|y)dx=\frac{f(x,y)dxdy}{f_{Y}(y)dy}

\approx \frac{P\left\{ x\leq X\leq x+dx, y\leq Y \leq y+ dy \right\}}{P\left\{ y\leq Y \leq y+dy \right\}}

=P\left\{ x\leq X \leq x+dx|y\leq Y\leq y+dy \right\}

thus the probability for such conditional density function is

P\left\{ X\in A|Y =y \right\} =\int_{A} f_{X|Y}(x|y)dx

In a similar way to the discrete case, if X and Y are independent continuous random variables, then

f_{X|Y}(x|y)=\frac{f(x,y)}{f_{Y}(y)}=\frac{f_{X}(x)f_{Y}(y)}{f_{Y}(y)} =f_{X}(x)

We can also condition a continuous random variable X on a discrete random variable N; using conditional probability on a small interval around x,

\frac{P\left\{ x< X< x+ dx|N =n \right\}}{dx} = \frac{P\left\{ N=n|x < X < x+ dx \right\}}{P\left\{ N=n \right\}} \frac{P\left\{ x< X < x+ dx \right\}}{dx}

\lim_{dx \to 0}\frac{P\left\{ x< X < x +dx|N =n \right\}}{dx} =\frac{P\left\{ N=n|X =x \right\}}{P\left\{ N=n \right\}} f(x)

so we can write it as

f_{X|N}(x|n)=\frac{P\left\{ N=n|X=x \right\}}{P\left\{ N=n \right\}}f(x)

Examples on continuous conditional distribution

  1. Calculate the conditional density function of the random variable X given Y=y, if the joint probability density function on the open unit square is given by

f(x,y)=\begin{cases} \frac{12}{5} x(2-x-y) \ \ 0< x< 1, \ \ 0< y< 1 \\ \ \ 0 \ \ \ \ otherwise \end{cases}

For the random variable X given Y=y with y in (0,1), using the above density function we have

f_{X|Y}(x|y)=\frac{f(x,y)}{f_{Y}(y)}

=\frac{f(x,y)}{\int_{-\infty}^{\infty} f(x,y)dx}

=\frac{x(2-x-y)}{\int_{0}^{1} x(2-x-y) dx}

=\frac{x(2-x-y)}{\frac{2}{3}-\frac{y}{2}}

=\frac{6x(2-x-y)}{4-3y}
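As a check that this conditional density integrates to one in x for every fixed y in (0, 1), here is a small symbolic sketch (sympy assumed available).

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f_cond = 6 * x * (2 - x - y) / (4 - 3 * y)            # conditional density derived above
print(sp.simplify(sp.integrate(f_cond, (x, 0, 1))))   # prints 1
```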

  • Calculate the conditional probability

P\left\{ X> 1|Y=y \right\}

if the joint probability density function is given by

f(x,y)=\begin{cases} \frac{e^{-\frac{x}{y}}e^{-y}}{y} \ \ 0< x< \infty , \ \ 0< y< \infty \\ \ \ 0 \ \ \ \ otherwise \end{cases}

To find the conditional probability we first require the conditional density function, which by definition is

f_{X|Y}(x|y)=\frac{f(x,y)}{f_{Y}(y)}

=\frac{e^{-x/y}e^{-y}/y}{e^{-y}\int_{0}^{\infty}(1/y)e^{-x/y}dx}

=\frac{1}{y}e^{-x/y}

now using this density function in the probability the conditional probability is

P\left\{ X> 1|Y=y \right\} =\int_{1}^{\infty}\frac{1}{y} e^{-x/y}dx

= -e^{-x/y} \Big|_{1}^{\infty}

= e^{-1/y}
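The result can also be confirmed numerically; the following sketch (scipy assumed available, with an illustrative value of y) integrates the conditional density from 1 to infinity.

```python
import numpy as np
from scipy.integrate import quad

y = 2.0                                               # illustrative value of y
cond_density = lambda x: (1.0 / y) * np.exp(-x / y)   # f_{X|Y}(x|y) derived above
prob, _ = quad(cond_density, 1, np.inf)
print(prob, np.exp(-1.0 / y))                         # both ~0.6065
```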

Conditional distribution of bivariate normal distribution

  We know that the bivariate normal distribution of the normal random variables X and Y, with the respective means, variances and the correlation coefficient ρ as parameters, has the joint probability density function

f(x,y)=\frac{1}{2\pi \sigma_{x}\sigma_{y}\sqrt{1-\rho ^{2}}} \exp\left\{ -\frac{1}{2(1-\rho ^{2})}\left [ \left ( \frac{x-\mu_{x}}{\sigma_{x}} \right )^{2} + \left ( \frac{y-\mu_{y}}{\sigma_{y}} \right )^{2} -2\rho \frac{(x-\mu_{x})(y-\mu_{y})}{\sigma_{x}\sigma_{y}} \right ] \right\}

So the conditional distribution of X given Y=y for such a bivariate normal distribution, following the definition of the conditional density function for continuous random variables and using the above joint density function, is

f_{X|Y}(x|y)=\frac{1}{\sqrt{2\pi }\sigma_{x}\sqrt{1-\rho ^{2}}} \exp\left\{ -\frac{1}{2\sigma_{x}^{2}(1-\rho ^{2})}\left [ x-\mu_{x} -\rho \frac{\sigma_{x}}{\sigma_{y}}(y-\mu_{y}) \right ]^{2} \right\}

By observing this we can say that this is normally distributed with the mean

\left ( \mu_{x} + \rho \frac{\sigma_{x}}{\sigma_{y}} (y-\mu_{y}) \right )

and variance

\sigma _{x}^{2}(1-\rho ^{2})

In a similar way, the conditional density function of Y given X is obtained by interchanging the roles of the parameters of X and Y.

The marginal density function of X can be obtained from the above joint density function by integrating out y,

f_{X}(x)=\int_{-\infty }^{\infty } f(x,y)dy

let us substitute in the integral

w=\frac{y-\mu_{y}}{\sigma_{y}}

the density function will now be

f_{X}(x)=\frac{1}{\sqrt{2\pi }\sigma_{x}}e^{-(x-\mu_{x})^{2}/2\sigma_{x}^{2}} \int_{-\infty }^{\infty }\frac{1}{\sqrt{2\pi (1-\rho ^{2})}} \exp\left\{ -\frac{\left ( w-\rho \frac{x-\mu_{x}}{\sigma_{x}} \right )^{2}}{2(1-\rho ^{2})} \right\} dw

since the total value of the normal density inside the integral is one by the definition of probability, the density function becomes

f_{X}(x)=\frac{1}{\sqrt{2\pi }\sigma_{x}}e^{-(x-\mu_{x})^{2}/2\sigma_{x}^{2}}

which is nothing but the density function of a normal random variable X with the usual mean and variance as the parameters.
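As a rough numerical illustration of the conditional mean and variance stated above, the following sketch simulates a bivariate normal pair with illustrative parameter values (chosen here, not taken from the text; numpy assumed) and inspects X for Y near a fixed value.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y, sig_x, sig_y, rho = 1.0, -2.0, 2.0, 3.0, 0.6     # illustrative parameters
cov = [[sig_x**2, rho * sig_x * sig_y], [rho * sig_x * sig_y, sig_y**2]]
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=2_000_000)

# Condition on Y being close to a fixed value y0 and inspect X
y0 = 0.5
sel = xy[np.abs(xy[:, 1] - y0) < 0.05, 0]
print(sel.mean(), mu_x + rho * sig_x / sig_y * (y0 - mu_y))  # conditional mean, ~2.0
print(sel.var(), sig_x**2 * (1 - rho**2))                    # conditional variance, ~2.56
```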

Joint Probability distribution of function of random variables

  So far we have discussed the joint probability distribution of two random variables. Now, if we have functions of such random variables, what is the joint probability distribution of those functions, and how do we calculate its density and distribution function? This situation arises frequently in real-life problems involving functions of random variables.

If Y1 and Y2 are functions of the jointly continuous random variables X1 and X2, then the joint density function of these two functions is

f_{Y_{1},Y_{2}}(y_{1},y_{2})=f_{X_{1},X_{2}}(x_{1},x_{2})\left | J(x_{1},x_{2}) \right |^{-1}

where the Jacobian is

J(x_{1},x_{2}) = \begin{vmatrix} \frac{\partial g_1}{\partial x_1} & \frac{\partial g_1}{\partial x_2} \\ \frac{\partial g_2}{\partial x_1} & \frac{\partial g_2}{\partial x_2} \end{vmatrix} \equiv \frac{\partial g_1}{\partial x_1}\frac{\partial g_2}{\partial x_2} - \frac{\partial g_1}{\partial x_2}\frac{\partial g_2}{\partial x_1} \neq 0

and Y1 = g1(X1, X2) and Y2 = g2(X1, X2) for some functions g1 and g2. Here g1 and g2 satisfy the conditions of the Jacobian, being continuous and having continuous partial derivatives.

Now the probability for such functions of random variables is

P\left\{ (Y_{1},Y_{2})\in C \right\}=\iint_{(x_{1},x_{2}) : (g_{1}(x_{1},x_{2}),g_{2}(x_{1},x_{2}))\in C} f_{X_{1},X_{2}}(x_{1},x_{2})dx_{1}dx_{2}
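To make the Jacobian in this formula concrete, here is a minimal sympy sketch (sympy assumed available) that computes J(x1, x2) symbolically; the transformations g1 and g2 below are just placeholders, anticipating the first example in the next section.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
g1 = x1 + x2            # placeholder transformation Y1 = g1(X1, X2)
g2 = x1 - x2            # placeholder transformation Y2 = g2(X1, X2)

J = sp.Matrix([[sp.diff(g1, x1), sp.diff(g1, x2)],
               [sp.diff(g2, x1), sp.diff(g2, x2)]]).det()
print(J)                # -2, so |J|^{-1} = 1/2 in the density formula
```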

Examples on Joint Probability distribution of function of random variables

  1. Find the joint density function of the random variables Y1 = X1 + X2 and Y2 = X1 - X2, where X1 and X2 are jointly continuous with joint probability density function f_{X1,X2}. Also discuss the result for different choices of distribution.

Here we first check the Jacobian

J(x_{1},x_{2}) = \begin{vmatrix} \frac{\partial g_1}{\partial x_1} & \frac{\partial g_1}{\partial x_2} \\ \frac{\partial g_2}{\partial x_1} & \frac{\partial g_2}{\partial x_2} \end{vmatrix}

since g1(x1, x2) = x1 + x2 and g2(x1, x2) = x1 - x2, so

J(x_{1},x_{2}) = \begin{vmatrix} 1 & 1 \\ 1 & -1 \end{vmatrix} =-2

Solving Y1 = X1 + X2 and Y2 = X1 - X2 for X1 = (Y1 + Y2)/2 and X2 = (Y1 - Y2)/2, the joint density function is

f_{Y_{1},Y_{2}}(y_{1},y_{2})=\frac{1}{2}f_{X_{1},X_{2}}\left ( \frac{y_{1}+y_{2}}{2},\frac{y_{1} - y_{2}}{2} \right )

If X1 and X2 are independent uniform random variables on (0,1), then

f_{Y_{1},Y_{2}} (y_{1},y_{2}) =\begin{cases} \frac{1}{2} \ \ 0 \leq y_{1} + y_{2} \leq 2 \ \ , \ \ 0\leq y_{1} - y_{2} \leq 2 \\ 0 \ \ otherwise \end{cases}

or if these random variables are independent exponential random variables with parameters λ1 and λ2 respectively, then

f_{Y_{1},Y_{2}} (y_{1},y_{2}) =\frac{\lambda_{1}\lambda_{2}}{2} \exp\left\{ -\lambda_{1}\frac{y_{1}+y_{2}}{2} -\lambda_{2}\frac{y_{1}-y_{2}}{2} \right\} \ \ y_{1}+y_{2}> 0 , \ \ y_{1}-y_{2}> 0

or if these random variables are independent standard normal random variables, then

f_{Y_{1},Y_{2}} (y_{1},y_{2}) =\frac{1}{4\pi }e^{-[(y_{1}+y_{2})^{2}/8 + (y_{1} -y_{2})^{2}/8]}

=\frac{1}{4\pi } e^{-\left ( y_{1}^{2} + y_{2}^{2}\right )/4}

=\frac{1}{\sqrt{4\pi }}e^{-y_{1}^{2}/4} \frac{1}{\sqrt{4\pi }}e^{-y_{2}^{2}/4}
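As a quick Monte Carlo check of the normal case (numpy assumed), Y1 = X1 + X2 and Y2 = X1 - X2 should each have variance 2 and be uncorrelated, in line with the factored density above.

```python
import numpy as np

rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal(1_000_000), rng.standard_normal(1_000_000)
y1, y2 = x1 + x2, x1 - x2
print(y1.var(), y2.var())            # both ~2, matching the N(0, 2) marginals
print(np.corrcoef(y1, y2)[0, 1])     # ~0, consistent with independence
```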

  • If X and Y are independent standard normal random variables with the joint density function

f(x,y)=\frac{1}{2\pi }e^{-(x^{2}+y^{2})/2}

calculate the joint distribution for the respective polar coordinates.

We convert X and Y into the polar coordinates r and θ using

r=g_{1}(x,y)=\sqrt{x^{2}+y^{2}} \ \ and \ \ \theta =g_{2} (x,y)= \tan^{-1}\frac{y}{x}

so the partial derivatives of these functions are

\frac{\partial g_{1}}{\partial x}=\frac{x}{\sqrt{x^{2}+y^{2}}}

\frac{\partial g_{2}}{\partial x}=\frac{1}{1+ (y/x)^{2}}\left ( \frac{-y}{x^{2}} \right ) =\frac{-y}{x^{2}+y^{2}}

\frac{\partial g_{1}}{\partial y}=\frac{y}{\sqrt{x^{2}+y^{2}}}

\frac{\partial g_{2}}{\partial y}=\frac{1}{x\left [ 1+(y/x)^{2} \right ]}=\frac{x}{x^{2}+y^{2}}

so the Jacobian using these functions is

J(x,y)=\frac{x^{2}}{(x^{2}+y^{2})^{3/2}} + \frac{y^{2}}{(x^{2}+y^{2})^{3/2}} =\frac{1}{\sqrt{x^{2}+y^{2}}}=\frac{1}{r}

If both random variables X and Y are greater than zero, then the conditional joint density function is

f(x,y|X > 0, Y > 0)=\frac{f(x,y)}{P(X > 0, Y> 0)}=\frac{2}{\pi }e^{-(x^{2}+y^{2})/2} \ \ x> 0, \ \ y> 0

Now, converting from Cartesian to polar coordinates using

r=\sqrt{x^{2}+y^{2}} \ \ and \ \ \theta =tan^{-1}\left ( \frac{y}{x} \right )

so the probability density function for the positive values will be

f(r,\theta|X > 0, Y> 0 )=\frac{2}{\pi }re^{-r^{2}/2} , \ \ 0< \theta < \frac{\pi }{2} , \ \ 0< r< \infty

For the other combinations of signs of X and Y, the density functions are obtained in a similar way:

f(r,\theta|X < 0, Y> 0 )=\frac{2}{\pi }re^{-r^{2}/2} , \ \ \pi /2 < \theta < \pi , \ \ 0< r< \infty

f(r,\theta|X < 0, Y< 0 )=\frac{2}{\pi }re^{-r^{2}/2} , \ \ \pi < \theta < 3\pi/2 , \ \ 0< r< \infty

f(r,\theta|X > 0, Y< 0 )=\frac{2}{\pi }re^{-r^{2}/2} , \ \ 3\pi/2 < \theta < 2\pi , \ \ 0< r< \infty

now from the average of the above densities we can state the density function as

f(r,\theta)=\frac{1}{2\pi }re^{-r^{2}/2} , \ \ 0 < \theta < 2\pi , \ \ 0< r< \infty

and the marginal density function of r, obtained from this joint density of the polar coordinates by integrating θ over the interval (0, 2π), is

f(r)=re^{-r^{2}/2} , \ \ 0< r< \infty
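The marginal density f(r) = r e^{-r^2/2} is the Rayleigh density, whose mean is sqrt(π/2); a minimal simulation sketch (numpy assumed) checks this together with the spread of the uniformly distributed angle.

```python
import numpy as np

rng = np.random.default_rng(2)
x, y = rng.standard_normal(1_000_000), rng.standard_normal(1_000_000)
r = np.hypot(x, y)                       # R = sqrt(X^2 + Y^2)
theta = np.arctan2(y, x)                 # angle in (-pi, pi]
print(r.mean(), np.sqrt(np.pi / 2))      # ~1.2533, mean of the density r e^{-r^2/2}
print(theta.var(), np.pi**2 / 3)         # variance of a uniform angle on a length-2*pi interval
```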

  • Find the joint density function of the functions of random variables

U=X+Y and V=X/(X+Y)

where X and Y are independent gamma random variables with parameters (α, λ) and (β, λ) respectively.

Using the definition of the gamma density and independence, the joint density function of the random variables X and Y is

f_{X,Y} (x,y)=\frac{\lambda e^{-\lambda x}(\lambda x)^{\alpha -1}}{\Gamma (\alpha )} \frac{\lambda e^{-\lambda y}(\lambda y)^{\beta -1}}{\Gamma (\beta )}

=\frac{\lambda ^{\alpha +\beta }}{\Gamma (\alpha ) \Gamma (\beta )} e^{-\lambda (x+y)} x^{\alpha -1}y^{\beta -1}

consider the given functions as

g1 (x,y) =x+y , g2 (x,y) =x/(x+y),

so the partial derivatives of these functions are

\frac{\partial g_{1}}{\partial x}=\frac{\partial g_{1}}{\partial y}=1

\frac{\partial g_{2}}{\partial x}=\frac{y}{(x+y)^{2}}

\frac{\partial g_{2}}{\partial y}=-\frac{x}{(x+y)^{2}}

now the Jacobian is

J(x,y) = \begin{vmatrix} 1 & 1 \\ \frac{y}{(x+y)^{2}} & \frac{-x}{(x+y)^{2}} \end{vmatrix} = -\frac{1}{x+y}

Solving the given equations for the variables x = uv and y = u(1-v), the probability density function is

f_{U,V}(u,v)=f_{X,Y} \left [ uv,u(1-v) \right ]u

=\frac{\lambda e^{-\lambda u}(\lambda u)^{\alpha +\beta -1}}{\Gamma (\alpha +\beta )} \frac{v^{\alpha -1}(1-v)^{\beta -1}\Gamma (\alpha +\beta )}{\Gamma (\alpha )\Gamma (\beta )}

where we use the beta function relation

B(\alpha ,\beta )=\int_{0}^{1}v^{\alpha -1}(1-v)^{\beta -1}dv

=\frac{\Gamma (\alpha )\Gamma (\beta )}{\Gamma (\alpha +\beta )}
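The factorization above shows that U = X + Y and V = X/(X + Y) are independent, with U gamma distributed with parameters (α + β, λ) and V beta distributed with parameters (α, β). A rough numpy check with illustrative parameter values (chosen here, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, beta, lam = 2.0, 3.0, 1.5          # illustrative parameters
x = rng.gamma(shape=alpha, scale=1 / lam, size=1_000_000)
y = rng.gamma(shape=beta, scale=1 / lam, size=1_000_000)
u, v = x + y, x / (x + y)

print(u.mean(), (alpha + beta) / lam)     # gamma(alpha+beta, lam) mean
print(v.mean(), alpha / (alpha + beta))   # beta(alpha, beta) mean
print(np.corrcoef(u, v)[0, 1])            # ~0, consistent with independence
```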

  • Calculate the joint probability density function for

Y1 = X1 + X2 + X3 , Y2 = X1 - X2 , Y3 = X1 - X3

where the random variables X1 , X2, X3 are the standard normal random variables.

Now let us calculate the Jacobian by using partial derivatives of

Y1 = X1 + X2 + X3 , Y2 = X1 - X2 , Y3 = X1 - X3

as

J = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{vmatrix} =3

simplifying for variables X1 , X2 and X3

X1 = (Y1 + Y2 + Y3)/3 , X2 = (Y1 - 2Y2 + Y3)/3 , X3 = (Y1 + Y2 - 2Y3)/3

we can generalize the joint density function as

f_{Y_{1} \cdot \cdot \cdot Y_{n}}(y_{1} \cdot \cdot \cdot y_{n})=f_{X_{1} \cdot \cdot \cdot X_{n}}(x_{1} \cdot \cdot \cdot x_{n})|J(x_{1} \cdot \cdot \cdot x_{n})|^{-1}

so we have

f_{Y_{1}, Y_{2},Y_{3}}(y_{1}, y_{2},y_{3})=\frac{1}{3}f_{X_{1},X_{2},X_{3}}\left ( \frac{y_{1}+y_{2}+y_{3}}{3}, \frac{y_{1}-2y_{2}+y_{3}}{3}, \frac{y_{1}+y_{2} -2y_{3}}{3} \right )

for the standard normal variables the joint probability density function is

f_{X_{1}, X_{2},X_{3}}(x_{1}, x_{2},x_{3})=\frac{1}{(2\pi )^{3/2}}e^{-\sum_{i=1}^{3}x_{i}^{2}/2}

hence

f_{Y_{1}, Y_{2}, Y_{3}}(y_{1}, y_{2}, y_{3})=\frac{1}{3(2\pi )^{3/2}}e^{-Q(y_{1},y_{2},y_{3})/2}

where the quadratic form in the exponent is

Q(y_{1},y_{2},y_{3})=\left ( \frac{(y_{1}+y_{2}+y_{3})}{3} \right )^{2} + \left ( \frac{(y_{1}-2y_{2}+y_{3})}{3} \right )^{2} + \left ( \frac{(y_{1}+y_{2}-2y_{3})}{3} \right )^{2}

=\frac{y_{1}^{2}}{3} + \frac{2}{3} y_{2}^{2} +\frac{2}{3} y_{3}^{2} -\frac{2}{3}y_{2}y_{3}
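One way to sanity-check this joint density is through the covariance structure implied by the definitions of Y1, Y2, Y3: Var(Y1) = 3, Var(Y2) = Var(Y3) = 2, Cov(Y2, Y3) = 1, and Y1 is uncorrelated with Y2 and Y3. A short simulation sketch (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal((1_000_000, 3))                  # X1, X2, X3 standard normal
y1, y2, y3 = x.sum(axis=1), x[:, 0] - x[:, 1], x[:, 0] - x[:, 2]

emp_cov = np.cov(np.vstack([y1, y2, y3]))
print(np.round(emp_cov, 2))   # approximately [[3, 0, 0], [0, 2, 1], [0, 1, 2]]
```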

  • Compute the joint density function of Y1, . . . , Yn and the marginal density function of Yn, where

Y_{i}= X_{1}+ \cdot \cdot \cdot +X_{i} \ \ i=1, \cdot \cdot \cdot ,n

and Xi are independent identically distributed exponential random variables with parameter λ.

for the random variables of the form

Y1 =X1 , Y2 =X1 + X2 , ……, Yn =X1 + ……+ Xn

the Jacobian of this transformation is the lower triangular determinant

J = \begin{vmatrix} 1 & 0 & 0 & \cdot \cdot \cdot & 0 \\ 1 & 1 & 0 & \cdot \cdot \cdot & 0 \\ 1 & 1 & 1 & \cdot \cdot \cdot & 0 \\ \cdot & \cdot & \cdot & & \cdot \\ 1 & 1 & 1 & \cdot \cdot \cdot & 1 \end{vmatrix}

and hence its value is one, and the joint density function for the exponential random variables is

f_{X_{1} \cdot \cdot \cdot X_{n}}(x_{1}, \cdot \cdot \cdot,x_{n})=\prod_{i=1}^{n}\lambda e^{-\lambda x_{i}} \ \ 0< x_{i}< \infty , \ \ i=1, \cdot \cdot \cdot ,n

and the values of the variables Xi will be

X_{1}=Y_{1} , X_{2}=Y_{2} -Y_{1} ,\cdot \cdot \cdot , X_{i}=Y_{i} -Y_{i-1}, \cdot \cdot \cdot, X_{n}=Y_{n} -Y_{n-1}

so the joint density function is

f_{Y_{1}, \cdot \cdot \cdot \cdot Y_{n}}(y_{1},y_{2}, \cdot \cdot \cdot \cdot y_{n})=f_{X_{1},\cdot \cdot \cdot \cdot ,X_{n}}(y_{1},y_{2} -y_{1},\cdot \cdot \cdot \cdot \cdot ,y_{i}-y_{i-1},\cdot \cdot \cdot ,y_{n}-y_{n-1} )

=\lambda ^{n} \exp\left\{ -\lambda \left [ y_{1} + \sum_{i=2}^{n}(y_{i}-y_{i-1}) \right ] \right\}

=\lambda ^{n} e^{-\lambda y_{n}}  \   \  0< y_{1}, 0< y_{i}-y_{i-1} , i=2, \cdot \cdot \cdot ,n

=\lambda ^{n} e^{-\lambda y_{n}} \  \ 0< y_{1} < y_{2} < \cdot \cdot \cdot  < y_{n}

Now to find the marginal density function of Yn we integrate out y1, y2, . . . one by one, as

f_{Y_{2}, \cdot \cdot \cdot \cdot Y_{n}} (y_{2}, \cdot \cdot \cdot \cdot y_{n})= \int_{0}^{y_{2}}\lambda ^{n} e^{-\lambda y_{n}}dy_{1}

=\lambda ^{n} y_{2} e^{-\lambda y_{n}} \ \ 0< y_{2} < y_{3} < \cdot \cdot \cdot < y_{n}

and

f_{Y_{3}, \cdot \cdot \cdot \cdot Y_{n}} (y_{3}, \cdot \cdot \cdot \cdot y_{n})= \int_{0}^{y_{3}}\lambda ^{n} y_{2} e^{-\lambda y_{n}}dy_{2}

=\frac{\lambda ^{n}}{2} y_{3}^{2} e^{-\lambda y_{n}} \ \ 0< y_{3} < y_{4} < \cdot \cdot \cdot < y_{n}

likewise

f_{Y_{4}, \cdot \cdot \cdot \cdot Y_{n}} (y_{4}, \cdot \cdot \cdot \cdot y_{n}) =\frac{\lambda ^{n}}{3!} y_{4}^{3} e^{-\lambda y_{n}} \ \ 0 < y_{4} < \cdot \cdot \cdot < y_{n}

if we continue this process we will get

f_{Y_{n}}(y_{n})=\lambda ^{n}\frac{y_{n}^{n-1}}{(n-1)!}e^{-\lambda y_{n}} \ \ 0< y_{n}

which is the marginal density function.
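Since this marginal is the gamma (Erlang) density with parameters n and λ, its mean and variance are n/λ and n/λ²; a brief simulation sketch with illustrative values of n and λ (numpy assumed) confirms this for partial sums of exponentials.

```python
import numpy as np

rng = np.random.default_rng(5)
n, lam = 5, 2.0                                    # illustrative values
x = rng.exponential(scale=1 / lam, size=(1_000_000, n))
y_n = x.sum(axis=1)                                # Y_n = X_1 + ... + X_n

print(y_n.mean(), n / lam)        # gamma(n, lam) mean
print(y_n.var(), n / lam**2)      # gamma(n, lam) variance
```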

Conclusion:

The conditional distribution for discrete and continuous random variables was discussed with different examples, in which independence of the random variables plays an important role. In addition, the joint distribution of functions of jointly continuous random variables was explained with suitable examples. If you require further reading, go through the links below.

For more posts on Mathematics, please refer to our Mathematics Page.

Wikipedia: https://en.wikipedia.org/wiki/joint_probability_distribution/

A First Course in Probability by Sheldon Ross

Schaum’s Outlines of Probability and Statistics

An Introduction to Probability and Statistics by Rohatgi and Saleh
