Conditional Expectation: 7 Facts You Should Know

Random variables that depend on one another require the calculation of conditional probabilities, which we have already discussed. We now discuss some further parameters for such random variables, namely the conditional expectation and conditional variance, for different types of random variables.

Conditional Expectation

The conditional probability mass function of a discrete random variable X given Y is defined as


[latex]p_{X|Y}(x|y)=P\left\{ X=x \mid Y=y \right\}= \frac{p(x,y)}{p_{Y}(y)}[/latex]

where pY(y)>0, so the conditional expectation of the discrete random variable X given Y=y, whenever pY(y)>0, is


[latex]E\left [ X|Y=y \right ] =\sum_{x}^{} xP \left \{ X=x|Y=y \right \}[/latex]


[latex]=\sum_{x}^{} xp_{X|Y}(x|y)[/latex]

the probability appearing in this expectation is the conditional probability.

Similarly, if X and Y are continuous, then the conditional probability density function of the random variable X given Y is

[latex]f_{X|Y}(x|y)=\frac{f(x,y)}{f_{Y}(y)}[/latex]

where f(x,y) is the joint probability density function and fY(y)>0, so the conditional expectation of the random variable X given Y=y is

[latex]E\left [ X|Y=y \right ]=\int_{-\infty}^{\infty}xf_{X|Y}(x|y)dx[/latex]

for all y such that fY(y)>0.

Just as all the properties of probability carry over to conditional probability, all the properties of mathematical expectation are satisfied by conditional expectation. For example, the conditional expectation of a function of a random variable is

[latex]E[g(X) \mid Y=y]=\begin{cases}
\sum_{x} g(x) p_{X \mid Y}(x \mid y) & \text{in the discrete case} \\
\int_{-\infty}^{\infty} g(x) f_{X \mid Y}(x \mid y) d x & \text{in the continuous case}
\end{cases}[/latex]

and the sum of random variables in conditional expectation will be

[latex]E\left[\sum_{i=1}^{n} X_{i} \mid Y=y\right]=\sum_{i=1}^{n} E\left[X_{i} \mid Y=y\right][/latex]

Conditional Expectation for the sum of binomial random variables

To find the conditional expectation of a sum of binomial random variables, let X and Y be independent binomial random variables, each with parameters n and p. We know that X+Y is then also a binomial random variable, with parameters 2n and p, so for the random variable X given X+Y=m the conditional expectation is obtained by calculating the probability

[latex]\begin{aligned}
P\{X=k \mid X+Y=m\} &=\frac{P\{X=k, X+Y=m\}}{P\{X+Y=m\}} \\
&=\frac{P\{X=k, Y=m-k\}}{P\{X+Y=m\}} \\
&=\frac{P\{X=k\} P\{Y=m-k\}}{P\{X+Y=m\}} \\
&=\frac{\binom{n}{k} p^{k}(1-p)^{n-k}\binom{n}{m-k} p^{m-k}(1-p)^{n-m+k}}{\binom{2n}{m} p^{m}(1-p)^{2n-m}} \\
&=\frac{\binom{n}{k}\binom{n}{m-k}}{\binom{2n}{m}}
\end{aligned}[/latex]

so the conditional distribution of X given X+Y=m is hypergeometric, and for the hypergeometric distribution with N=2n we know that

[latex]E[X]=E\left[X_{1}\right]+\cdots+E\left[X_{m}\right]=\frac{m n}{N}[/latex]

thus the conditional expectation of X given X+Y=m is


[latex]E[X \mid X+Y=m]=\frac{m}{2}[/latex]
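This result can be checked empirically by rejection sampling: simulate independent binomial pairs and keep only those with X+Y=m. A minimal sketch, where n, p, and m are illustrative choices, not values from the text:

```python
import random

# Monte Carlo check that E[X | X+Y=m] = m/2 for independent
# binomial X, Y with the same parameters n and p.
random.seed(0)
n, p, m = 10, 0.4, 8

samples = []
while len(samples) < 20000:
    x = sum(random.random() < p for _ in range(n))
    y = sum(random.random() < p for _ in range(n))
    if x + y == m:            # keep only trials where the condition holds
        samples.append(x)

estimate = sum(samples) / len(samples)
print(estimate)               # should be close to m/2 = 4.0
```

The conditional mean does not depend on p, only on m, which the simulation reflects.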


Find the conditional expectation


[latex]E[X \mid Y=y][/latex]

if the joint probability density function of continuous random variables X and Y is given as

[latex]f(x, y)=\frac{e^{-x / y} e^{-y}}{y}, \quad 0<x<\infty, \; 0<y<\infty[/latex]


To calculate the conditional expectation we require the conditional probability density function, so

[latex]\begin{aligned}
f_{X \mid Y}(x \mid y) &=\frac{f(x, y)}{f_{Y}(y)} \\
&=\frac{f(x, y)}{\int_{-\infty}^{\infty} f(x, y) d x} \\
&=\frac{(1 / y) e^{-x / y} e^{-y}}{\int_{0}^{\infty}(1 / y) e^{-x / y} e^{-y} d x} \\
&=\frac{(1 / y) e^{-x / y}}{\int_{0}^{\infty}(1 / y) e^{-x / y} d x} \\
&=\frac{1}{y} e^{-x / y}
\end{aligned}[/latex]

since for the continuous random variable the conditional expectation is

[latex]E[X \mid Y=y]=\int_{-\infty}^{\infty} x f_{X \mid Y}(x \mid y) d x[/latex]

hence for the given density function the conditional expectation would be

[latex]E[X \mid Y=y]=\int_{0}^{\infty} \frac{x}{y} e^{-x / y} d x=y[/latex]
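The integral can be sanity-checked numerically: the conditional density (1/y)e^{-x/y} is exponential with mean y, so a simple Riemann sum of x·f(x|y) should return y. A sketch with an illustrative choice of y and grid:

```python
import math

# Numeric check (left Riemann sum) that the conditional mean of the
# density f_{X|Y}(x|y) = (1/y) e^{-x/y} equals y.
y = 2.5
dx = 0.001
xs = [i * dx for i in range(int(50 * y / dx))]   # truncate the infinite range
mean = sum(x * (1 / y) * math.exp(-x / y) * dx for x in xs)
print(round(mean, 3))    # close to y = 2.5
```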

Expectation by conditioning

We can calculate the mathematical expectation with the help of the conditional expectation of X given Y as


[latex]E[X]=E[E[X \mid Y]][/latex]

for the discrete random variables this will be


[latex]E[X]=\sum_{y} E[X \mid Y=y] P\{Y=y\}[/latex]

which can be obtained as

[latex]\begin{aligned}
\sum_{y} E[X \mid Y=y] P\{Y=y\} &=\sum_{y} \sum_{x} x P\{X=x \mid Y=y\} P\{Y=y\} \\
&=\sum_{y} \sum_{x} x \frac{P\{X=x, Y=y\}}{P\{Y=y\}} P\{Y=y\} \\
&=\sum_{y} \sum_{x} x P\{X=x, Y=y\} \\
&=\sum_{x} x \sum_{y} P\{X=x, Y=y\} \\
&=\sum_{x} x P\{X=x\} \\
&=E[X]
\end{aligned}[/latex]

and for continuous random variables we can similarly show

[latex]E[X] =\int_{-\infty}^{\infty} E[X \mid Y=y] f_{Y}(y) d y[/latex]


A person is trapped underground in his building because the entrance is blocked by a heavy load. Fortunately, there are three pipelines through which he can come out: the first pipe takes him safely out after 3 hours, while the second returns him to his starting point after 5 hours and the third returns him after 7 hours. If he chooses any of these pipelines with equal probability, what is the expected time until he comes outside safely?


Let X be the random variable denoting the time in hours until the person comes out safely, and let Y denote the pipe he chooses first; then


[latex]E[X]=E[X \mid Y=1] P\{Y=1\}+E[X \mid Y=2] P\{Y=2\}+E[X \mid Y=3] P\{Y=3\}\\
=\frac{1}{3}(E[X \mid Y=1]+E[X \mid Y=2]+E[X \mid Y=3])[/latex]


[latex]E[X \mid Y=1]=3\\
E[X \mid Y=2]=5+E[X]\\
E[X \mid Y=3]=7+E[X][/latex]

If the person chooses the second pipe, he spends 5 hours in it and is back where he started, so his expected total time is


[latex]E[X \mid Y=2]=5+E[X][/latex]

so the expectation  will be


[latex]E[X]=\frac{1}{3}(3+5+E[X]+7+E[X]) \quad E[X]=15[/latex]
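The answer E[X]=15 can also be checked by direct simulation of the escape process, under the interpretation above that pipes 2 and 3 return the person to the start:

```python
import random

# Simulation of the trapped-person example: pipe 1 leads out after 3 hours,
# pipes 2 and 3 return him to the start after 5 and 7 hours respectively.
random.seed(1)

def escape_time():
    t = 0.0
    while True:
        pipe = random.choice([1, 2, 3])     # each pipe equally likely
        if pipe == 1:
            return t + 3                    # safely out after 3 more hours
        t += 5 if pipe == 2 else 7          # wasted time, then choose again

trials = 200000
avg = sum(escape_time() for _ in range(trials)) / trials
print(avg)   # close to 15
```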

Expectation of the sum of a random number of random variables using conditional expectation

Let N be a random number of random variables, independent of the sequence X1, X2, …, and let the sum of the random variables be [latex]\sum_{i=1}^{N} X_{i}[/latex]; then the expectation is

[latex]E\left[\sum_{1}^{N} X_{i}\right]=E\left[E\left[\sum_{1}^{N} X_{i} \mid N\right]\right][/latex]


[latex]E\left[\sum_{1}^{N} X_{i} \mid N=n\right]=E\left[\sum_{1}^{n} X_{i} \mid N=n\right]\\
=E\left[\sum_{1}^{n} X_{i}\right] \text{ by the independence of the } X_{i} \text{ and } N \\
=n E[X] \text{ where } E[X]=E\left[X_{i}\right][/latex]


[latex]E\left[\sum_{1}^{N} X_{i} \mid N\right]=N E[X][/latex]


[latex]E\left[\sum_{i=1}^{N} X_{i}\right]=E[N E[X]]=E[N] E[X][/latex]
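This identity (Wald's identity) is easy to verify by simulation. A minimal sketch with illustrative distributions: N uniform on {1,…,5}, so E[N]=3, and the Xi uniform on (0,2), so E[X]=1:

```python
import random

# Monte Carlo check of E[sum_{i=1}^N X_i] = E[N] * E[X],
# with N independent of the X_i.
random.seed(2)

trials = 100000
total = 0.0
for _ in range(trials):
    n = random.randint(1, 5)                        # random number of summands
    total += sum(random.uniform(0, 2) for _ in range(n))

print(total / trials)    # close to E[N]*E[X] = 3*1 = 3.0
```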

Correlation of bivariate distribution

If the probability density function of the bivariate random variable X and Y is

[latex]f(x, y)=\frac{1}{2 \pi \sigma_{x} \sigma_{y} \sqrt{1-\rho^{2}}} \exp \left\{-\frac{1}{2\left(1-\rho^{2}\right)}\left[\left(\frac{x-\mu_{x}}{\sigma_{x}}\right)^{2}+\left(\frac{y-\mu_{y}}{\sigma_{y}}\right)^{2}-2 \rho \frac{\left(x-\mu_{x}\right)\left(y-\mu_{y}\right)}{\sigma_{x} \sigma_{y}}\right]\right\}[/latex]


where [latex]\mu_{x}=E[X], \; \sigma_{x}^{2}=\operatorname{Var}(X), \; \mu_{y}=E[Y], \; \sigma_{y}^{2}=\operatorname{Var}(Y)[/latex]

then the correlation between the random variables X and Y for this bivariate distribution can be found as follows. Since the correlation is defined as

[latex]\operatorname{Corr}(X, Y)=\frac{\operatorname{Cov}(X, Y)}{\sigma_{x} \sigma_{y}}\\
=\frac{E[X Y]-\mu_{x} \mu_{y}}{\sigma_{x} \sigma_{y}}[/latex]

since the expectation using conditional expectation is

[latex]E[X Y]=E[E[X Y \mid Y]] [/latex]

and for the bivariate normal distribution the conditional distribution of X given Y has mean

[latex]\mu_{x}+\rho \frac{\sigma_{x}}{\sigma_{y}}\left(y-\mu_{y}\right)[/latex]

so the expectation of XY given Y is

[latex]E[X Y \mid Y]=Y E[X \mid Y]=Y\left(\mu_{x}+\rho \frac{\sigma_{x}}{\sigma_{y}}\left(Y-\mu_{y}\right)\right)[/latex]

this gives

[latex]\begin{aligned} E[X Y] &=E\left[Y \mu_{x}+\rho \frac{\sigma_{x}}{\sigma_{y}}\left(Y^{2}-\mu_{y} Y\right)\right] \\ &=\mu_{x} E[Y]+\rho \frac{\sigma_{x}}{\sigma_{y}} E\left[Y^{2}-\mu_{y} Y\right] \\ &=\mu_{x} \mu_{y}+\rho \frac{\sigma_{x}}{\sigma_{y}}\left(E\left[Y^{2}\right]-\mu_{y}^{2}\right) \\ &=\mu_{x} \mu_{y}+\rho \frac{\sigma_{x}}{\sigma_{y}} \operatorname{Var}(Y) \\ &=\mu_{x} \mu_{y}+\rho \sigma_{x} \sigma_{y} \end{aligned}[/latex]


[latex]\operatorname{Corr}(X, Y)=\frac{\rho \sigma_{x} \sigma_{y}}{\sigma_{x} \sigma_{y}}=\rho[/latex]
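That the parameter ρ in the density really is the correlation can be checked empirically using the standard construction Y = ρX + √(1−ρ²)Z from independent standard normals (an illustrative special case with zero means and unit variances):

```python
import math
import random

# Empirical check that rho is the correlation of the bivariate normal.
random.seed(3)
rho = 0.6
n = 100000
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    z = random.gauss(0, 1)
    xs.append(x)
    ys.append(rho * x + math.sqrt(1 - rho ** 2) * z)   # correlated normal

mx, my = sum(xs) / n, sum(ys) / n
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
sx = math.sqrt(sum((a - mx) ** 2 for a in xs) / n)
sy = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
corr = cov / (sx * sy)
print(round(corr, 2))    # close to rho = 0.6
```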

Variance of geometric distribution

In the geometric distribution, let us perform successive independent trials, each resulting in success with probability p. If N represents the time of the first success in this succession, then the variance of N, by definition, is

[latex]\operatorname{Var}(N)=E\left[N^{2}\right]-(E[N])^{2}[/latex]
Let the random variable Y=1 if the first trial results in success and Y=0 if the first trial results in failure. To find E[N²] we apply conditional expectation as

[latex]E\left[N^{2}\right]=E\left[E\left[N^{2} \mid Y\right]\right][/latex]


[latex]E\left[N^{2} \mid Y=1\right]=1\\
E\left[N^{2} \mid Y=0\right]=E\left[(1+N)^{2}\right][/latex]

If success occurs on the first trial, then N=1 and N²=1. If the first trial results in failure, then the total number of trials needed for the first success has the same distribution as 1 (the failed first trial) plus the necessary number of additional trials, that is, 1+N; hence

[latex]E\left[N^{2} \mid Y=0\right]=E\left[(1+N)^{2}\right][/latex]

Thus the expectation will be

[latex]E\left[N^{2}\right]=E\left[N^{2} \mid Y=1\right] P\{Y=1\}+E\left[N^{2} \mid Y=0\right] P\{Y=0\}\\
=p+(1-p) E\left[(1+N)^{2}\right]\\
=1+(1-p) E\left[2 N+N^{2}\right][/latex]

since the expectation of the geometric distribution is

[latex]E[N]=1 / p [/latex]

this gives


[latex]E\left[N^{2}\right]=1+\frac{2(1-p)}{p}+(1-p) E\left[N^{2}\right][/latex]

solving which gives

[latex]E\left[N^{2}\right]=\frac{2-p}{p^{2}}[/latex]

so the variance of the geometric distribution is

[latex]\begin{aligned}\operatorname{Var}(N) & =E\left[N^{2}\right]-(E[N])^{2} \\ = & \frac{2-p}{p^{2}}-\left(\frac{1}{p}\right)^{2} \\ = & \frac{1-p}{p^{2}}\end{aligned}[/latex]
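The formula Var(N) = (1−p)/p² can be verified by simulating the time of the first success directly; p here is an illustrative choice:

```python
import random

# Simulation check of Var(N) = (1-p)/p^2 for the geometric
# time of the first success.
random.seed(4)
p = 0.3

def first_success():
    n = 1
    while random.random() >= p:   # keep trying until a success occurs
        n += 1
    return n

trials = 200000
vals = [first_success() for _ in range(trials)]
mean = sum(vals) / trials
var = sum((v - mean) ** 2 for v in vals) / trials
print(var)   # close to (1-p)/p^2 = 0.7/0.09 ≈ 7.78
```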

Expectation of Minimum of sequence of uniform random variables

Let U1, U2, … be a sequence of uniform random variables over the interval (0, 1), and define N as

[latex]N=\min \left\{n: \sum_{i=1}^{n} U_{i}>1\right\}[/latex]

To find the expectation of N, define for any x ∈ [0, 1]

[latex]N(x)=\min \left\{n: \sum_{i=1}^{n} U_{i}>x\right\}[/latex]

and set the expectation of N(x) as

[latex]m(x)=E[N(x)][/latex]

To find this expectation we use the definition of conditional expectation for a continuous random variable

[latex]E[X \mid Y=y]=\int_{-\infty}^{\infty} x f_{X \mid Y}(x \mid y) d x[/latex]

now, conditioning on the first term of the sequence, we have

[latex]m(x)=\int_{0}^{1} E\left[N(x) \mid U_{1}=y\right] d y[/latex]

here we get

[latex]E\left[N(x) \mid U_{1}=y\right]=\left\{\begin{array}{ll}1 & \text { if } y>x \\ 1+m(x-y) & \text { if } y \leq x\end{array}\right.[/latex]

because when the first uniform value is y ≤ x, the number of additional uniform random variables needed has the same distribution as at the start, except that we now add uniform random variables until their sum surpasses x − y.

so, using this value of the expectation, the integral becomes

[latex]m(x)=1+\int_{0}^{x} m(x-y) d y\\
=1+\int_{0}^{x} m(u) d u \text{ by letting }u=x-y[/latex]

if we differentiate this equation we get

[latex]m^{\prime}(x)=m(x)[/latex]

now integrating this gives

[latex]\log [m(x)]=x+c[/latex]


[latex]m(x)=k e^{x}[/latex]

since m(0)=1 we get k=1, so

[latex]m(x)=e^{x}[/latex]
and m(1) = e; thus the expected number of uniform random variables over the interval (0, 1) that must be added until their sum surpasses 1 is equal to e.
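This is a classic result, and easy to confirm by simulation: repeatedly count how many Uniform(0,1) draws it takes for the running sum to exceed 1, and average the counts:

```python
import random

# Simulation of N = min{n : U_1 + ... + U_n > 1}; its mean is e.
random.seed(5)

def count_until_over_one():
    total, n = 0.0, 0
    while total <= 1:
        total += random.random()   # add another Uniform(0,1) draw
        n += 1
    return n

trials = 200000
avg = sum(count_until_over_one() for _ in range(trials)) / trials
print(avg)   # close to e ≈ 2.71828
```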

Probability using conditional expectation

We can also find probabilities using conditional expectation, just as we found expectations with it. To see this, consider an event E and a random variable X defined as

[latex]X=\left\{\begin{array}{ll}1 & \text { if } E \text { occurs } \\ 0 & \text { if } E \text { does not occur }\end{array}\right.[/latex]

from the definition of this random variable and expectation clearly

[latex]E[X \mid Y=y]=P(E \mid Y=y) \text{ for any random variable } Y[/latex]

now, by taking the expectation by conditioning, we have

[latex]P(E)=\sum_{y} P(E \mid Y=y) P(Y=y) \quad \text{if } Y \text{ is discrete}\\
=\int_{-\infty}^{\infty} P(E \mid Y=y) f_{Y}(y) d y \quad \text{if } Y \text{ is continuous}[/latex]


Compute the probability mass function of the random variable X, if U is a uniform random variable on the interval (0,1) and the conditional distribution of X given U=p is binomial with parameters n and p.


Conditioning on the value of U, the probability is

[latex]\begin{aligned} P\{X=i\} &=\int_{0}^{1} P\{X=i \mid U=p\} f_{U}(p) d p \\ &=\int_{0}^{1} P\{X=i \mid U=p\} d p \\ &=\frac{n !}{i !(n-i) !} \int_{0}^{1} p^{i}(1-p)^{n-i} d p \end{aligned}[/latex]

we have the result

[latex]\int_{0}^{1} p^{i}(1-p)^{n-i} d p=\frac{i !(n-i) !}{(n+1) !}[/latex]

so we will get

[latex]P[X=i]=\frac{1}{n+1} \quad i=0, \ldots, n[/latex]
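In other words, mixing a binomial over a uniform success probability makes X uniform on {0,…,n}. A quick simulation check, with n an illustrative choice:

```python
import random

# Draw p ~ Uniform(0,1), then X ~ Binomial(n, p); the resulting X
# should be uniform on {0, ..., n}, each value with probability 1/(n+1).
random.seed(6)
n = 5
trials = 120000
counts = [0] * (n + 1)
for _ in range(trials):
    p = random.random()
    x = sum(random.random() < p for _ in range(n))
    counts[x] += 1

probs = [c / trials for c in counts]
print([round(q, 2) for q in probs])   # each close to 1/(n+1) ≈ 0.167
```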


What is the probability that X < Y, if X and Y are independent continuous random variables with probability density functions fX and fY respectively?


By using conditional expectation and conditional probability

[latex]\begin{aligned} P\{X<Y\} &=\int_{-\infty}^{\infty} P\{X<Y \mid Y=y\} f_{Y}(y) d y \\ &=\int_{-\infty}^{\infty} P\{X<y \mid Y=y\} f_{Y}(y) d y \\ &=\int_{-\infty}^{\infty} P\{X<y\} f_{Y}(y) d y \quad \text{by independence} \\ &=\int_{-\infty}^{\infty} F_{X}(y) f_{Y}(y) d y \end{aligned}[/latex]


where

[latex]F_{X}(y)=\int_{-\infty}^{y} f_{X}(x) d x[/latex]


Calculate the distribution of the sum of independent continuous random variables X and Y.


To find the distribution of X+Y, we compute the probability of the sum by conditioning as follows

[latex]\begin{aligned}P(X+Y<a) &=\int_{-\infty}^{\infty} P\{X+Y<a \mid Y=y\} f_{Y}(y) d y \\ &=\int_{-\infty}^{\infty} P\{X+y<a \mid Y=y\} f_{Y}(y) d y \\ &=\int_{-\infty}^{\infty} P\{X<a-y\} f_{Y}(y) d y \\ &=\int_{-\infty}^{\infty} F_{X}(a-y) f_{Y}(y) d y \end{aligned}[/latex]
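The final convolution formula can be checked numerically for a concrete case. For two independent Exp(1) variables (an illustrative choice), the sum has the Gamma(2,1) distribution, so the exact answer is 1 − e^{−a}(1+a):

```python
import math

# Numeric check of P(X+Y < a) = ∫ F_X(a-y) f_Y(y) dy for two
# independent Exp(1) random variables.
a = 2.0
dy = 0.0001
integral = sum(
    (1 - math.exp(-(a - y))) * math.exp(-y) * dy   # F_X(a-y) * f_Y(y)
    for y in [i * dy for i in range(int(a / dy))]
)
exact = 1 - math.exp(-a) * (1 + a)    # Gamma(2,1) CDF at a
print(round(integral, 4), round(exact, 4))
```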


The conditional expectation for discrete and continuous random variables was discussed with different examples, covering several types of random variables and using independent random variables and joint distributions under various conditions. How to find expectations and probabilities using conditional expectation was also explained with examples. For further reading, go through the books below, or for more articles on Probability, please follow our Mathematics pages.

A first course in probability by Sheldon Ross

Schaum’s Outlines of Probability and Statistics

An introduction to probability and statistics by ROHATGI and SALEH


I am DR. Mohammed Mazhar Ul Haque , Assistant professor in Mathematics. Having 12 years of experience in teaching. Having vast knowledge in Pure Mathematics , precisely on Algebra. Having the immense ability of problem designing and solving. Capable of Motivating candidates to enhance their performance. I love to contribute to Lambdageeks to make Mathematics Simple , Interesting & Self Explanatory for beginners as well as experts. Let's connect through LinkedIn -
