I am Dr. Mohammed Mazhar Ul Haque. I hold a Ph.D. in Mathematics and work as an Assistant Professor of Mathematics, with 12 years of teaching experience. I have broad knowledge of pure mathematics, particularly algebra, a strong ability to design and solve problems, and I enjoy motivating candidates to improve their performance.
I love contributing to LambdaGeeks to make mathematics simple, interesting, and self-explanatory for beginners and experts alike.
A normal distribution has zero skewness, so the answer to the common question "can a normal distribution be skewed?" is no: the normal curve is symmetric, with no tail pulling to either side, which is exactly what zero skewness means. The normal distribution curve is bell shaped and symmetric about its mean.
Since skewness is the lack of symmetry in a curve, the presence of symmetry in the curve implies the absence of skewness.
How do you tell if the data is normally distributed?
To check whether data are normally distributed, sketch a histogram: if the resulting curve is symmetric, the data are approximately normal, and the question "can a normal distribution be skewed?" answers itself once the concept of skewness is clear. Sketching a histogram or curve in every case is tedious and time consuming, so in practice statistical tests such as the Anderson-Darling (AD) statistic are more useful for deciding whether data are normally distributed.
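As a quick illustration, here is a minimal sketch of running the Anderson-Darling normality test in Python; it assumes SciPy is available, and the sample data are made up for illustration:

```python
# Hypothetical sketch: Anderson-Darling normality test with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=67, scale=9, size=500)  # illustrative sample

result = stats.anderson(data, dist='norm')
print("AD statistic:", result.statistic)
for crit, sig in zip(result.critical_values, result.significance_level):
    verdict = "consistent with normal" if result.statistic < crit else "not normal"
    print(f"at {sig}% significance: critical value {crit} -> {verdict}")
```

If the AD statistic stays below the critical value at the chosen significance level, the hypothesis of normality is not rejected.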
Data that follow a normal distribution have zero skewness, while the curve of a skewed distribution lacks this symmetry; we illustrate the normal case with the following example:
Example: Find the percentage of scores lying between 70 and 80 if the mathematics scores of university students are normally distributed with mean 67 and standard deviation 9.
Solution:
To find the percentage of scores we use the probabilities of the normal distribution discussed earlier; we first convert to the standard normal variate, and then use the standard normal table, via the conversion
Z=(X-μ)/σ
We want the percentage of scores between 70 and 80, so we standardize the values 70 and 80 with the given mean 67 and standard deviation 9; this gives
Z = (70 - 67)/9 = 0.333
and
Z = (80 - 67)/9 = 1.444
The probability corresponds to the shaded region between z = 0.333 and z = 1.444 under the standard normal curve. From the table of the standard normal variate,
P(z > 0.333) = 0.3707 and P(z > 1.444) = 0.0749, so
P(0.333 < z < 1.444) = P(z > 0.333) - P(z > 1.444) = 0.3707 - 0.0749 = 0.2958
so 29.58% of students will score between 70 and 80.
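A minimal sketch verifying this worked example numerically, assuming SciPy is available:

```python
# Verify P(70 < X < 80) for X ~ Normal(mean=67, sd=9).
from scipy.stats import norm

p = norm.cdf(80, loc=67, scale=9) - norm.cdf(70, loc=67, scale=9)
print(round(p, 4))  # ~0.295, i.e. roughly 29.5% of students
```

The small difference from 29.58% comes from rounding the z values when reading the printed table.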
In this example the skewness of the curve is zero and the curve is symmetric; to check whether real data are normally distributed, we perform hypothesis tests such as the one sketched above.
How do you tell if a distribution is skewed left or right?
A distribution is skewed if its curve has a right tail or a left tail, and from the nature of the tail we can judge whether the distribution is positively or negatively skewed; the concept of skewness is discussed in detail in the articles on positively and negatively skewed distributions. If the tail stretches to the left side, the distribution is skewed left, and if it stretches to the right side, the distribution is skewed right. The most reliable check is the ordering of the central tendencies: if mean < median < mode the distribution is left skewed, and if mean > median > mode the distribution is right skewed.
The measures for calculating left or right skewness are given in detail in the article on skewness; a quick numerical check is sketched below.
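Here is a minimal sketch (assuming NumPy and SciPy; the exponential sample is purely illustrative) of judging the skew direction from the central tendencies and the sample skewness coefficient:

```python
# Judge skew direction from sample statistics; the data is illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=1000)  # a right-skewed sample

mean, median = np.mean(sample), np.median(sample)
g1 = stats.skew(sample)
print(f"mean={mean:.2f}, median={median:.2f}, skew={g1:.2f}")
# mean > median and a positive skew coefficient -> right (positively) skewed
```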
What is an acceptable skewness?
Since skewness, as discussed earlier, is the lack of symmetry, the acceptable range must be made clear. The question "can a normal distribution be skewed?" arises here because the normal distribution has zero skewness: a distribution whose skewness is near zero is the more acceptable one. So after testing for skewness, a value close to zero is acceptable, with the exact tolerance depending on the requirement and the range set by the client.
In brief, acceptable skewness is skewness close to zero, as the requirement dictates.
How skewed is too skewed?
Skewness is the statistical measure of the asymmetry present in the curve of a distribution. Using the measures of skewness discussed below, if the computed value is far from zero the distribution is too skewed, while a value at or near zero indicates an essentially symmetric distribution.
How do you determine normal distribution?
To determine whether a distribution is normal, check whether it is symmetric: if symmetry is present and the skewness is zero, the distribution is a normal distribution. The detailed methods and techniques were already discussed in the article on the normal distribution.
Do outliers skew data?
If any observation lies unusually far from the rest of the data, it is known as an outlier, and in most cases outliers are responsible for the skewness of a distribution; because of their unusual position, outliers pull the distribution to one side, so we can say that outliers skew data. Outliers do not skew the data in every case, however: they skew it only when they fall systematically on one side, producing a left- or right-tailed curve.
Detailed discussions of the normal distribution and skewed distributions appeared in the previous articles.
Skewed Distribution | skewed distribution definition
A distribution in which symmetry is absent and whose curve shows a tail on either the left or the right side is known as a skewed distribution; skewness is thus the asymmetry present in the curve or histogram, in contrast to a symmetric or normal curve.
Depending on the measures of central tendency, we can evaluate whether a distribution is skewed: there are specific relations between mean, median, and mode in left-tailed and right-tailed skewed distributions.
normal distribution vs skewed | normal vs skewed distribution
| Normal distribution | Skewed distribution |
| --- | --- |
| The curve is symmetric | The curve is not symmetric |
| The measures of central tendency mean, median, and mode are equal | The measures of central tendency mean, median, and mode are not equal |
| mean = median = mode | mean > median > mode or mean < median < mode |

Normal distribution vs skewed distribution
skewed distribution examples in real life
Skewed distributions occur in a number of real-life situations, such as ticket sales for a particular show or movie across different months, records of athletes' performances in competition, stock market returns, fluctuations in real-estate rates, life cycles of specific species, income variation, exam scores, and many other competitive outcomes. Distribution curves showing asymmetry occur frequently in applications.
difference between symmetrical and skewed distribution | symmetrical and skewed distribution
The main difference between symmetric and skewed distributions lies in the relation between the central tendencies mean, median, and mode: as the names suggest, the curve of a symmetric distribution is symmetric, while the curve of a skewed distribution is not and shows a right tail, a left tail, or both. Distributions differ only in the nature of their skewness and symmetry, so all probability distributions can be classified into these two main categories.
To find the nature of a distribution, whether symmetric or skewed, we must either draw its curve or compute the coefficient of skewness with the help of absolute or relative measures.
highly skewed distribution
If the modal (highest) value of a distribution differs from the mean and median, the distribution is skewed; if the highest value coincides with equal mean and median, the distribution is symmetric. A highly skewed distribution may be positive or negative, and its modal value can be found using the coefficient of skewness.
Negatively skewed distribution| which is a negatively skewed distribution
A negatively skewed distribution is any distribution in which the central tendencies follow the order mean < median < mode and the coefficient of skewness is negative; it is also known as a left-skewed distribution because the tail of the graph or plot of the information lies to the left.
The coefficient of skewness for a negatively skewed distribution can easily be found with the usual methods for computing coefficients of skewness.
negatively skewed distribution example
If 150 students performed in an examination as given below, find the nature of the skewness of the distribution:
| marks | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| freq | 12 | 40 | 18 | 0 | 12 | 42 | 14 | 12 |
Solution: To find the nature of the skewness of the distribution we calculate the coefficient of skewness, for which we require the mean, mode, median, and standard deviation of the given information; we compute these with the help of the following table:
| class interval | f | mid value x | c.f. | d' = (x - 35)/10 | f·d' | f·d'^2 |
| --- | --- | --- | --- | --- | --- | --- |
| 0-10 | 12 | 5 | 12 | -3 | -36 | 108 |
| 10-20 | 40 | 15 | 52 | -2 | -80 | 160 |
| 20-30 | 18 | 25 | 70 | -1 | -18 | 18 |
| 30-40 | 0 | 35 | 70 | 0 | 0 | 0 |
| 40-50 | 12 | 45 | 82 | 1 | 12 | 12 |
| 50-60 | 42 | 55 | 124 | 2 | 84 | 168 |
| 60-70 | 14 | 65 | 138 | 3 | 42 | 126 |
| 70-80 | 12 | 75 | 150 | 4 | 48 | 192 |
| total | 150 | | | | 52 | 784 |
so the measures will be

mean = 35 + 10(52/150) = 38.47
variance = 10^2 [784/150 - (52/150)^2] = 510.65, so σ = 22.60
median = 40 + (75 - 70)/12 × 10 = 44.17 (median class 40-50)
mode = 50 + (42 - 12)/(2×42 - 12 - 14) × 10 = 55.17 (modal class 50-60)

hence the coefficient of skewness for the distribution is

Sk = (mean - mode)/σ = (38.47 - 55.17)/22.60 ≈ -0.74

which is negative, so the distribution is negatively skewed.
negatively skewed distribution mean median mode
In a negatively skewed distribution the mean, median, and mode are in ascending order, which corresponds to a tail on the left side of the distribution curve; the central tendencies follow exactly the reverse pattern of a positively skewed distribution, and the curve of a negatively skewed distribution is the mirror image of the positively skewed one. So mean < median < mode in a negatively skewed distribution.
negatively skewed distribution curve
The curve of a negatively skewed distribution is left-skewed and lacks symmetry, whether drawn as a histogram or as a continuous curve.
Since skewness measures the asymmetry present in a distribution, the curve of a negatively skewed distribution shows the asymmetry on the left side.
positively skewed normal distribution
A positively skewed "normal-like" distribution is a continuous distribution whose curve resembles the normal curve but gathers part of the information into a right tail; such a curve is asymmetric about the median, and the central tendencies follow the descending order mean > median > mode.
FAQs
Why chi square distribution is positively skewed
The chi-square distribution takes values from zero to infinity, and its curve gathers information in a right tail; it therefore shows a right-skewed curve, so the chi-square distribution is a positively skewed distribution.
Is Poisson distribution positively skewed
Yes, the Poisson distribution is positively skewed, since its information scatters toward a right tail, so the nature of its plot is positively skewed.
Why does negative binomial distribution always positively skew
The negative binomial distribution is always positively skewed because it is the generalization of the Pascal distribution, which is always positively skewed; hence so is the negative binomial distribution.
Does skewness have any impact on linear regression models if my dependent variable and my interaction variable are positively skewed?
Skewness of the dependent variable and the interaction variable does not imply that the regression errors are skewed, and vice versa: skewed errors do not imply that the variables themselves are skewed.
A curve of plotted observations represents skewness when its shape is not symmetric; in other words, the lack of symmetry in the graph of the given information represents the skewness of the data set. Depending on whether the tail lies to the right or to the left, the skewness is known as positive or negative, and the corresponding distribution is known as a positively skewed or negatively skewed distribution.
The mean, median, and mode show the nature of the distribution: if the curve is symmetric these measures of central tendency are equal, while for skewed distributions they vary as either mean > median > mode or mean < median < mode.
Variance and Skewness
| Variance | Skewness |
| --- | --- |
| The amount of variability can be obtained using the variance | The direction of variability can be obtained using the skewness |
| Measures of variation are applied in business and economics | Measures of skewness are applied in the medical and life sciences |

variance and skewness
Measure of Skewness
The measure of skewness is very helpful for finding the degree and the direction of asymmetry of a frequency distribution, whether positive or negative; a graph shows the sign of the skewness but not its exact magnitude, so these statistical measures give the magnitude of the lack of symmetry.
To be useful, a measure of skewness should be:
Unit free, so that distributions with the same or different units can be compared.
Zero for a symmetric distribution, and positive or negative for positively or negatively skewed distributions respectively.
Varying as we move from negative skewness to positive skewness.
There are two types of measure of skewness
Absolute Measure of skewness
Relative Measure of skewness
Absolute Measure of skewness
In a symmetric distribution the mean, mode, and median coincide, so in the absolute measures of skewness the differences between these central tendencies give the extent of the asymmetry and its nature (positive or negative); however, absolute measures in different units are not useful when comparing two sets of information.
The Absolute skewness can be obtained using
Skewness(Sk)=Mean-Median
Skewness(Sk)=Mean-Mode
Skewness(Sk)=(Q3-Q2)-(Q2-Q1)
Relative Measure of skewness
Relative measures of skewness are used to compare the skewness of two or more distributions by eliminating the influence of variation; a relative measure of skewness is known as a coefficient of skewness. The following are the important relative measures of skewness.
Karl Pearson’s Coefficient of Skewness
This method is the one most often used to calculate skewness; Karl Pearson's coefficient is

Sk = (Mean - Mode)/σ

This coefficient of skewness is positive for a positively skewed distribution, negative for a negatively skewed one, and zero for a symmetric distribution; it usually lies between -1 and +1. If the mode is not well defined, Karl Pearson's coefficient is calculated instead using the formula

Sk = 3(Mean - Median)/σ

If we use this relation, then Karl Pearson's coefficient lies between -3 and +3.
2. Bowley's Coefficient of Skewness | Quartile measure of skewness
Bowley's coefficient of skewness uses the quartiles, so it is also known as the quartile measure of skewness:

Sk = [(Q3 - Q2) - (Q2 - Q1)]/(Q3 - Q1)

or we can write it as

Sk = (Q3 + Q1 - 2Q2)/(Q3 - Q1)

This coefficient is zero if the distribution is symmetric, positive for a positively skewed distribution, and negative for a negatively skewed one; the value of Sk lies between -1 and +1.
3. Kelly’s Coefficient of Skewness
In this measure of skewness the percentiles and deciles are used; the coefficient is

Sk = (P90 + P10 - 2P50)/(P90 - P10)

which involves the 90th, 50th, and 10th percentiles, and using deciles we can write it as

Sk = (D9 + D1 - 2D5)/(D9 - D1)

in which the 9th, 5th, and 1st deciles are used.
4. β and γ Coefficient of Skewness | Measure of skewness based on moments
Using the central moments, the β coefficient of skewness can be defined as

β1 = μ3^2 / μ2^3

This coefficient gives the value zero for a symmetric distribution, but it does not indicate the direction of the skewness, positive or negative; this drawback is removed by taking the signed square root of beta:

γ1 = √β1 = μ3 / μ2^(3/2)

which gives a positive value for positively skewed and a negative value for negatively skewed distributions.
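A minimal sketch of computing β1 and γ1 from sample central moments, assuming SciPy is available and using an illustrative gamma-distributed sample:

```python
# Moment-based skewness: beta1 = mu3^2 / mu2^3, gamma1 = mu3 / mu2^(3/2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.gamma(shape=2.0, size=2000)  # illustrative right-skewed data

mu2 = stats.moment(x, moment=2)  # second central moment
mu3 = stats.moment(x, moment=3)  # third central moment
beta1 = mu3**2 / mu2**3
gamma1 = mu3 / mu2**1.5
print(beta1, gamma1)  # gamma1 > 0 here, signalling positive skewness
```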
Examples of skewness
1. Using the following information, find the coefficient of skewness:
| Wages | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No. of people | 12 | 18 | 35 | 42 | 50 | 45 | 20 | 8 |
Solution: To find the coefficient of skewness we will use Karl Pearson's coefficient:
| class | frequency (f) | mid-value (x) | fx | fx^2 |
| --- | --- | --- | --- | --- |
| 0-10 | 12 | 5 | 60 | 300 |
| 10-20 | 18 | 15 | 270 | 4050 |
| 20-30 | 35 | 25 | 875 | 21875 |
| 30-40 | 42 | 35 | 1470 | 51450 |
| 40-50 | 50 | 45 | 2250 | 101250 |
| 50-60 | 45 | 55 | 2475 | 136125 |
| 60-70 | 20 | 65 | 1300 | 84500 |
| 70-80 | 8 | 75 | 600 | 45000 |
| total | 230 | | 9300 | 444550 |
The Karl Pearson coefficient of skewness requires the mean, mode, and standard deviation:

mean = 9300/230 = 40.43
σ = √(444550/230 - (40.43)^2) = √297.8 ≈ 17.26

The modal class is the class with maximum frequency, 40-50, and the respective frequencies are f1 = 50, f0 = 42, f2 = 45, thus

mode = 40 + (50 - 42)/(2×50 - 42 - 45) × 10 = 40 + 80/13 = 46.15

so the coefficient of skewness will be

Sk = (mean - mode)/σ = (40.43 - 46.15)/17.26 ≈ -0.33

which shows negative skewness.
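A minimal sketch reproducing this grouped-data computation in Python (NumPy assumed); the modal-class formula for the mode is hard-coded from the table above:

```python
# Mean and sigma from the frequency table, mode from the modal class,
# then Karl Pearson's coefficient (mean - mode) / sigma.
import numpy as np

mids = np.arange(5, 80, 10)                    # mid-values 5, 15, ..., 75
f = np.array([12, 18, 35, 42, 50, 45, 20, 8])  # frequencies (N = 230)

N = f.sum()
mean = (f * mids).sum() / N
sigma = np.sqrt((f * mids**2).sum() / N - mean**2)

# modal class 40-50: L=40, f1=50, f0=42, f2=45, h=10
mode = 40 + (50 - 42) / (2 * 50 - 42 - 45) * 10
sk = (mean - mode) / sigma
print(round(mean, 2), round(mode, 2), round(sigma, 2), round(sk, 3))
```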
2. Find the coefficient of skewness for the frequency distribution of the marks of 150 students in a certain examination:
| marks | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| freq | 10 | 40 | 20 | 0 | 10 | 40 | 16 | 14 |
Solution: To calculate the coefficient of skewness we require the mean, mode, median, and standard deviation of the given information, so we form the following table:
| class interval | f | mid value x | c.f. | d' = (x - 35)/10 | f·d' | f·d'^2 |
| --- | --- | --- | --- | --- | --- | --- |
| 0-10 | 10 | 5 | 10 | -3 | -30 | 90 |
| 10-20 | 40 | 15 | 50 | -2 | -80 | 160 |
| 20-30 | 20 | 25 | 70 | -1 | -20 | 20 |
| 30-40 | 0 | 35 | 70 | 0 | 0 | 0 |
| 40-50 | 10 | 45 | 80 | 1 | 10 | 10 |
| 50-60 | 40 | 55 | 120 | 2 | 80 | 160 |
| 60-70 | 16 | 65 | 136 | 3 | 48 | 144 |
| 70-80 | 14 | 75 | 150 | 4 | 56 | 224 |
| total | 150 | | | | 64 | 808 |
now the measures will be

mean = 35 + 10(64/150) = 39.27
variance = 10^2 [808/150 - (64/150)^2] = 520.46, so σ ≈ 22.81
median = 40 + (75 - 70)/10 × 10 = 45 (median class 40-50)

Here the distribution is bimodal (the classes 10-20 and 50-60 both have frequency 40), so the mode is not well defined and we use the median form of Karl Pearson's coefficient; hence the coefficient of skewness for the distribution is

Sk = 3(mean - median)/σ = 3(39.27 - 45)/22.81 ≈ -0.75

so the distribution is negatively skewed.
3. Find the mean, variance and coefficient of skewness of distribution whose first four moments about 5 are 2,20,40 and 50.
Solution: Since the first four moments about 5 are m1 = 2, m2 = 20, m3 = 40, m4 = 50, we can write

mean = 5 + m1 = 5 + 2 = 7
μ2 = m2 - m1^2 = 20 - 4 = 16, so the variance is 16
μ3 = m3 - 3 m1 m2 + 2 m1^3 = 40 - 120 + 16 = -64

so the coefficient of skewness is

γ1 = μ3 / μ2^(3/2) = -64/64 = -1
Positively skewed distribution definition|Right skewed distribution meaning
A positively skewed distribution is any distribution in which the information lacks symmetry and the measures of central tendency, i.e. mean, median, and mode, satisfy mean > median > mode.
In other words, the positively skewed distribution is the distribution in which the central tendencies follow the order mean > median > mode, with the extra weight on the right side of the curve.
If we sketch the information of such a distribution, the curve is right tailed, which is why a positively skewed distribution is also known as a right-skewed distribution.
In this right-skewed curve the mode is the smallest measure of central tendency and the mean is the largest.
positively skewed distribution example|example of right skewed distribution
1. For a positively skewed (right-skewed) distribution the coefficient of skewness is 0.64; find the mode and median of the distribution if the mean and standard deviation are 59.2 and 13 respectively.
Solution: The given values are mean = 59.2, Sk = 0.64, and σ = 13, so using the relation Sk = (mean - mode)/σ,

mode = mean - Sk·σ = 59.2 - 0.64 × 13 = 50.88

and using the empirical relation mode = 3 median - 2 mean,

median = (mode + 2 mean)/3 = (50.88 + 118.4)/3 ≈ 56.43
2. Find the standard deviation of a positively skewed distribution whose coefficient of skewness is 1.28, with mean 164 and mode 100.
Solution: In the same way, using the given information and the formula Sk = (mean - mode)/σ for a positively skewed distribution,

σ = (164 - 100)/1.28 = 50

so the standard deviation will be 50.
3. If the sum of the first and third quartiles is 200 with median 76, find the value of the third quartile of a frequency distribution that is positively skewed with coefficient of skewness 1.2.
Solution: To find the third quartile we use Bowley's relation between the coefficient of skewness and the quartiles. The given information is Q1 + Q3 = 200, Q2 = 76, and Sk = 1.2, so

Sk = (Q3 + Q1 - 2Q2)/(Q3 - Q1) ⟹ 1.2 = (200 - 152)/(Q3 - Q1) ⟹ Q3 - Q1 = 40

From the two equations Q3 + Q1 = 200 and Q3 - Q1 = 40 we get Q3 = 120 and Q1 = 80,

so the value of the third quartile is 120.
4. Find the coefficient of skewness for the following information
| x | 93-97 | 98-102 | 103-107 | 108-112 | 113-117 | 118-122 | 123-127 | 128-132 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f | 2 | 5 | 12 | 17 | 14 | 6 | 3 | 1 |
Solution: Here we will use Bowley's measure of skewness based on quartiles; first we convert to class boundaries and cumulative frequencies:
| class | frequency | cumulative frequency |
| --- | --- | --- |
| 92.5-97.5 | 2 | 2 |
| 97.5-102.5 | 5 | 7 |
| 102.5-107.5 | 12 | 19 |
| 107.5-112.5 | 17 | 36 |
| 112.5-117.5 | 14 | 50 |
| 117.5-122.5 | 6 | 56 |
| 122.5-127.5 | 3 | 59 |
| 127.5-132.5 | 1 | 60 |
| total | N = 60 | |
Since N/4 = 15, the 15th observation lies in the class 102.5-107.5; N/2 = 30 lies in 107.5-112.5; and 3N/4 = 45 lies in 112.5-117.5. So

Q1 = 102.5 + (15 - 7)/12 × 5 = 105.833
Q3 = 112.5 + (45 - 36)/14 × 5 = 115.714

and the median is

Q2 = 107.5 + (30 - 19)/17 × 5 = 110.735

thus

Sk = (Q3 + Q1 - 2Q2)/(Q3 - Q1) = (115.714 + 105.833 - 221.471)/(115.714 - 105.833) ≈ 0.008

which indicates a (slightly) positively skewed distribution.
where is the mean in a positively skewed distribution
We know that a positively skewed distribution is a right-skewed distribution, so the curve has a right tail; the mean, being the measure most affected by the large values in the tail, is pulled toward that tail. Since in a positively (right-) skewed distribution mean > median > mode, the mean lies to the right of the median.
Right skewed distribution mean median mode|relationship between mean median and mode in positively skewed distribution
In a positively skewed (right-skewed) distribution the measures of central tendency satisfy mean > median > mode: the mode is the smallest, then the median, and the mean is the largest, lying closest to the right tail of the curve of the information.
So the relationship between mean, median, and mode in a positively skewed distribution is this increasing order, and with the help of the differences between these central tendencies the coefficient of skewness can be calculated; thus the mean, median, and mode also reveal the nature of the skewness.
positively skewed distribution graph|positively skewed distribution curve
Whether drawn as a smooth curve or as a histogram of discrete information, the graph of a positively skewed distribution is right tailed: the bulk of the data lies on the left of the curve, and the tail stretching to the right is the longer one, since skewness describes the shape of the distribution.
Such curves lack symmetry in every respect.
positively skewed score distribution
If scores follow a positively skewed distribution, with mean > median > mode and a right-tailed score curve in which the mean is pulled up by the large values, the distribution is known as a positively skewed score distribution. All the properties and rules for this distribution are the same as for any positively skewed (right-skewed) distribution.
positive skew frequency distribution
In a positively skewed frequency distribution, most of the frequencies are concentrated at the smaller values; the positive-skew frequency distribution is simply the positively skewed (right-skewed) distribution, whose curve is right tailed.
positive vs negative skewed distribution|positively skewed distribution vs negatively skewed
| Positively skewed distribution | Negatively skewed distribution |
| --- | --- |
| The information is distributed so that the mean is the largest central tendency and the mode the smallest | The information is distributed so that the mean is the smallest central tendency and the mode the largest |
| The curve is right tailed | The curve is left tailed |
| mean > median > mode | mean < median < mode |
FAQs
How do you know if a distribution is positively or negatively skewed
The skewness is positive if mean > median > mode and negative if mean < median < mode.
From the distribution curve we can also judge: if the curve is right tailed the skewness is positive, and if it is left tailed the skewness is negative.
How do you determine positive skewness
Positive skewness can be determined by computing the coefficient of skewness (a positive value means positive skewness), by plotting the distribution curve (a right tail means positive skewness), or by checking that mean > median > mode.
What does a positive skew represent
A positive skew represents that the scores of the distribution spread toward large values, the curve is right tailed, and the mean is the largest measure of central tendency.
How do you interpret a right skewed histogram
If a histogram is right skewed then the distribution is positively skewed, with mean > median > mode.
In distributions that are skewed to the right what is the relationship of the mean median and mode
The relationship is mean>median>mode
Conclusion:
Skewness is an important concept of statistics that quantifies the asymmetry, or lack of symmetry, present in a probability distribution; depending on its positive or negative value a distribution is classified as positively skewed or negatively skewed. In this article the basic concepts were discussed in brief with examples; if you require further reading, go through the references on skewed distributions.
The Hermite polynomial occurs widely in applications as an orthogonal function. The Hermite polynomials arise as the series solutions of the Hermite differential equation.
Hermite’s Equation
The differential equation of second order with specific coefficients as
d^2y/dx^2 - 2x dy/dx + 2ny = 0
is known as Hermite's equation; by solving this differential equation we obtain the polynomial solutions known as the Hermite polynomials.
Let us find the solution of the equation
d^2y/dx^2 - 2x dy/dx + 2ny = 0
with the help of the power-series method: assume a solution of the form

y = Σ_{k=0}^∞ a_k x^{m+k}, a_0 ≠ 0

Now substituting y, dy/dx, and d^2y/dx^2 into Hermite's equation, we have

Σ_k a_k (m+k)(m+k-1) x^{m+k-2} - 2 Σ_k a_k (m+k-n) x^{m+k} = 0

Since k starts at 0 and, as we assumed, cannot be negative, the lowest-degree term is x^{m-2}, obtained by taking k = 0 in the first summation (the second summation contributes no such term); equating its coefficient to zero gives
a_0 m(m-1) = 0 ⟹ m = 0 or m = 1
since a_0 ≠ 0. Equating the coefficient of x^{m-1} (k = 1 in the first summation) to zero in the same way gives a_1 (m+1)m = 0,
and equating the coefficient of x^{m+k} to zero,
a_{k+2}(m+k+2)(m+k+1) - 2a_k(m+k-n) = 0
we can write it as
a_{k+2} = [2(m+k-n) / ((m+k+2)(m+k+1))] a_k
if m = 0:
a_{k+2} = [2(k-n) / ((k+2)(k+1))] a_k
if m = 1:
a_{k+2} = [2(k+1-n) / ((k+3)(k+2))] a_k
For these two cases we now examine successive values of k. For m = 0, if a_1 = 0 then a_3 = a_5 = a_7 = … = a_{2r+1} = 0, while a_0 generates the even coefficients; putting k = 0, 1, 2, 3, … and substituting the values of a_0, a_2, a_4, … we have

y_1(x) = a_0 [1 - (2n/2!) x^2 + (2^2 n(n-2)/4!) x^4 - …]

and for m = 1, with a_1 = 0, putting k = 0, 1, 2, 3, … in

a_{k+2} = [2(k+1-n) / ((k+3)(k+2))] a_k

the solution will be

y_2(x) = a_0 [x - (2(n-1)/3!) x^3 + (2^2 (n-1)(n-3)/5!) x^5 - …]

so the complete solution is

y(x) = A y_1(x) + B y_2(x)

where A and B are arbitrary constants.
Hermite Polynomial
The solution of Hermite's equation is of the form y(x) = A y_1(x) + B y_2(x), where y_1(x) and y_2(x) are the series found above.
One of these series terminates when n is a non-negative integer: y_1 terminates if n is even, and y_2 terminates if n is odd. One can easily verify that for n = 0, 1, 2, 3, 4, 5 these polynomials are
1, x, 1 - 2x^2, x - (2/3)x^3, 1 - 4x^2 + (4/3)x^4, x - (4/3)x^3 + (4/15)x^5
so the polynomial solutions of Hermite's equation are constant multiples of these polynomials; the multiple normalized so that the term with the highest power of x has the form 2^n x^n is denoted H_n(x) and is known as the Hermite polynomial.
Generating function of Hermite polynomial
The Hermite polynomials are usually defined through the generating-function relation

e^{2xt - t^2} = Σ_{n=0}^∞ H_n(x) t^n / n!

Expanding and collecting powers of t gives

H_n(x) = Σ_{k=0}^{[n/2]} (-1)^k [n! / (k!(n-2k)!)] (2x)^{n-2k}

where [n/2] is the greatest integer less than or equal to n/2. This shows that H_n(x) is a polynomial of degree n in x and
H_n(x) = 2^n x^n + π_{n-2}(x)
where π_{n-2}(x) is a polynomial of degree n-2 in x; H_n is an even function of x for even n and an odd function of x for odd n, so
H_n(-x) = (-1)^n H_n(x)
The first few Hermite polynomials are
H_0(x) = 1
H_1(x) = 2x
H_2(x) = 4x^2 - 2
H_3(x) = 8x^3 - 12x
H_4(x) = 16x^4 - 48x^2 + 12
H_5(x) = 32x^5 - 160x^3 + 120x
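A minimal sketch, assuming NumPy is available, that generates these values from the three-term recurrence H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x) and cross-checks them against numpy.polynomial.hermite, which uses the same physicists' convention:

```python
# Evaluate Hermite polynomials by recurrence and verify against NumPy.
from numpy.polynomial import hermite as H

def hermite_value(n, x):
    """Evaluate H_n(x) by the recurrence H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = 1.0, 2.0 * x  # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2 * x * h - 2 * k * h_prev
    return h

x = 0.7
for n in range(6):
    coeffs = [0] * n + [1]  # select H_n in the Hermite basis
    print(n, hermite_value(n, x), H.hermval(x, coeffs))
```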
Generating function of Hermite polynomial by Rodrigues' formula
The Hermite polynomial can also be obtained from the generating function via the Rodrigues formula

H_n(x) = (-1)^n e^{x^2} d^n/dx^n (e^{-x^2})

Starting from the generating-function relation e^{2xt - t^2} = Σ H_n(x) t^n/n! and using Maclaurin's theorem, we have

H_n(x) = [∂^n/∂t^n e^{2xt - t^2}]_{t=0}

By putting z = x - t we get e^{2xt - t^2} = e^{x^2 - z^2} and ∂/∂t = -d/dz, so

H_n(x) = (-1)^n e^{x^2} [d^n/dz^n e^{-z^2}]_{z=x}

and for t = 0, so z = x, this gives the Rodrigues formula.
The same result can be shown in another way: differentiate the generating function repeatedly with respect to t and with respect to x, take the limit as t tends to zero at each stage, and compare the two resulting expressions; differentiating n times and putting t = 0 yields the same values of H_n(x) as the Rodrigues formula.
Example on Hermite Polynomial
1. Find the ordinary polynomial form of the given Hermite expression.
Solution: Using the definition of the Hermite polynomials and the relations above, we substitute the explicit polynomials H_0, H_1, H_2, … and simplify.
2. Express the given ordinary polynomial in terms of Hermite polynomials.
Solution: We write the given polynomial as a linear combination of Hermite polynomials,
and equating the coefficients of like powers of x,
we obtain the required Hermite expansion.
Orthogonality of Hermite Polynomial | Orthogonal property of Hermite Polynomial
The important characteristic of the Hermite polynomials is their orthogonality with respect to the weight e^{-x^2}:

∫_{-∞}^{∞} e^{-x^2} H_m(x) H_n(x) dx = 2^n n! √π δ_{mn}

To prove this orthogonality, recall the generating function

e^{2xt - t^2} = Σ_n H_n(x) t^n/n!, and likewise e^{2xs - s^2} = Σ_m H_m(x) s^m/m!

so multiplying these two relations and integrating within infinite limits,

∫_{-∞}^{∞} e^{-x^2} e^{2xt - t^2} e^{2xs - s^2} dx = Σ_m Σ_n (s^m t^n)/(m! n!) ∫_{-∞}^{∞} e^{-x^2} H_m(x) H_n(x) dx

and since the exponent on the left combines as -x^2 + 2xt - t^2 + 2xs - s^2 = -(x - s - t)^2 + 2st,

∫_{-∞}^{∞} e^{-(x - s - t)^2} e^{2st} dx = √π e^{2st} = √π Σ_n 2^n s^n t^n / n!

Using this value in the expression above and equating the coefficients of s^m t^n on both sides gives

∫_{-∞}^{∞} e^{-x^2} H_m(x) H_n(x) dx = 2^n n! √π δ_{mn}

which shows the orthogonal property of the Hermite polynomials. The same result can also be shown by means of the recurrence relations.
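A minimal numerical check of this orthogonality relation, assuming NumPy, using Gauss-Hermite quadrature (which integrates polynomial integrands against the weight e^{-x^2} exactly):

```python
# Check: integral of e^{-x^2} H_m(x) H_n(x) dx = 2^n n! sqrt(pi) delta_{mn}.
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

nodes, weights = hermgauss(20)  # exact for polynomials up to degree 39

def inner(m, n):
    cm = [0] * m + [1]
    cn = [0] * n + [1]
    return np.sum(weights * hermval(nodes, cm) * hermval(nodes, cn))

print(inner(3, 2))  # ~0 by orthogonality
print(inner(3, 3), 2**3 * math.factorial(3) * math.sqrt(math.pi))  # equal
```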
Example on orthogonality of Hermite Polynomial
1. Evaluate the integral ∫_{-∞}^{∞} e^{-x^2} H_3(x) H_2(x) dx.
Solution: By the orthogonality property of the Hermite polynomials, since the values here are m = 3 and n = 2 with m ≠ n, the integral is

∫_{-∞}^{∞} e^{-x^2} H_3(x) H_2(x) dx = 0

2. Evaluate the given integral of a product of Hermite polynomials against the weight e^{-x^2}.
Solution: Using the orthogonality property of the Hermite polynomials, the integral vanishes when the two indices differ, and for equal indices m = n its value is 2^n n! √π.
Recurrence relations of Hermite polynomial
The values of the Hermite polynomials can easily be found from the recurrence relations

1. H_n'(x) = 2n H_{n-1}(x)
2. H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)
3. H_n'(x) = 2x H_n(x) - H_{n+1}(x)
4. H_n''(x) - 2x H_n'(x) + 2n H_n(x) = 0

These relations can easily be obtained with the help of the definition and the properties above.
Proofs: 1. We know the Hermite equation
y'' - 2xy' + 2ny = 0
and the generating-function relation e^{2xt - t^2} = Σ H_n(x) t^n/n!; taking the partial derivative with respect to x, we can write it as
2t e^{2xt - t^2} = Σ H_n'(x) t^n/n!
From these two relations, substituting the series on the left and replacing n by n-1,
2 Σ H_{n-1}(x) t^n/(n-1)! = Σ H_n'(x) t^n/n!
by equating the coefficients of t^n,
so the required result is
H_n'(x) = 2n H_{n-1}(x)
2. In a similar way, differentiating the generating-function relation partially with respect to t,
(2x - 2t) e^{2xt - t^2} = Σ H_n(x) n t^{n-1}/n!
the n = 0 term vanishes, so substituting the series for e^{2xt - t^2}
and equating the coefficients of t^n,
2x H_n(x) - 2n H_{n-1}(x) = H_{n+1}(x)
thus H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)
3. To prove this result we eliminate H_{n-1} from
H_n'(x) = 2n H_{n-1}(x)
and
H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x)
so we get the result
H_n'(x) = 2x H_n(x) - H_{n+1}(x)
4. To prove this result we differentiate H_n'(x) = 2x H_n(x) - H_{n+1}(x) with respect to x,
and we get the relation
H_n''(x) = 2 H_n(x) + 2x H_n'(x) - H_{n+1}'(x)
substituting the value H_{n+1}'(x) = 2(n+1) H_n(x), i.e. relation 1 with n replaced by n+1,
which gives
H_n''(x) - 2x H_n'(x) + 2n H_n(x) = 0
Examples on Recurrence relations of Hermite polynomial
1. Show that
H_{2n}(0) = (-1)^n 2^{2n} (1/2)_n, where (1/2)_n is the Pochhammer symbol.
Solution:
To show the result we use the explicit formula
H_{2n}(x) = Σ_{k=0}^{n} (-1)^k [(2n)! / (k!(2n-2k)!)] (2x)^{2n-2k}
taking x = 0, only the k = n term survives, so we get
H_{2n}(0) = (-1)^n (2n)!/n! = (-1)^n 2^{2n} (1/2)_n
2. Show that
H'_{2n+1}(0) = (-1)^n 2^{2n+1} (3/2)_n
Solution:
Since from the recurrence relation
H_n'(x) = 2n H_{n-1}(x)
here replace n by 2n+1, so
H'_{2n+1}(x) = 2(2n+1) H_{2n}(x)
taking x = 0 and using the previous example,
H'_{2n+1}(0) = 2(2n+1)(-1)^n 2^{2n} (1/2)_n = (-1)^n 2^{2n+1} (3/2)_n
3. Find the value of
H_{2n+1}(0)
Solution
Since H_{2n+1}(x) contains only odd powers of x (recall H_n(-x) = (-1)^n H_n(x)),
using x = 0 here gives
H_{2n+1}(0) = 0
4. Find the value of H'_{2n}(0).
Solution:
We have the recurrence relation
H_n'(x) = 2n H_{n-1}(x)
here replace n by 2n:
H'_{2n}(x) = 2(2n) H_{2n-1}(x)
put x = 0:
H'_{2n}(0) = 4n H_{2n-1}(0) = 4n × 0 = 0
5. Show that
d^m/dx^m {H_n(x)} = 2^m [n!/(n-m)!] H_{n-m}(x)
Solution:
Using the recurrence relation
H_n'(x) = 2n H_{n-1}(x)
repeatedly,
d^2/dx^2 {H_n(x)} = 2^2 n(n-1) H_{n-2}(x)
and
d^3/dx^3 {H_n(x)} = 2^3 n(n-1)(n-2) H_{n-3}(x)
so differentiating m times in this way
gives
d^m/dx^m {H_n(x)} = 2^m n(n-1)⋯(n-m+1) H_{n-m}(x) = 2^m [n!/(n-m)!] H_{n-m}(x)
6. Show that
H_n(-x) = (-1)^n H_n(x)
Solution:
Replacing x by -x and t by -t in the generating function leaves e^{2xt - t^2} unchanged, so we can write
Σ H_n(-x)(-t)^n/n! = Σ H_n(x) t^n/n!
and comparing the coefficients of t^n gives H_n(-x) = (-1)^n H_n(x).
7. Evaluate the given integral and show the stated identity.
Solution: To solve the integral, use integration by parts, and then differentiation under the integral sign with respect to x; using
H_n'(x) = 2n H_{n-1}(x)
and
H_m'(x) = 2m H_{m-1}(x)
together with the orthogonality relation, and since
δ_{n,m-1} = δ_{n+1,m}
the value of the integral follows.
Conclusion:
The Hermite polynomial is a specific polynomial that occurs frequently in applications; its basic definition, generating function, recurrence relations, and related examples were discussed in brief here. If you require further reading, go through the standard references on special functions.
In probability theory, Chebyshev's inequality and the central limit theorem deal with situations where we want to approximate the probability distribution of a sum of a large number of random variables. Before looking at the limit theorems we review some inequalities that provide bounds for probabilities when the mean, and possibly the variance, are known.
Markov’s inequality
Markov's inequality for a random variable X that takes only non-negative values states that, for a > 0,

P(X ≥ a) ≤ E[X]/a

Chebyshev's inequality follows from it: if σ^2 and μ are the variance and mean of the random variable X, then for k > 0

P(|X - μ| ≥ k) ≤ σ^2/k^2

To prove this, apply Markov's inequality to the non-negative random variable (X - μ)^2 with the constant a = k^2:

P((X - μ)^2 ≥ k^2) ≤ E[(X - μ)^2]/k^2 = σ^2/k^2

and this is equivalent to the stated bound, since clearly (X - μ)^2 ≥ k^2 if and only if |X - μ| ≥ k.
Examples of Markov's and Chebyshev's inequalities:
1. If the weekly production of a specific item is taken as a random variable with mean 50, find the probability of the production exceeding 75 in a week; and what is the probability that the production in a week lies between 40 and 60, given that the variance for that week is 25?
Solution: Let the random variable X be the production of the item for a week. To bound the probability of the production exceeding 75 we use Markov's inequality:

P(X > 75) ≤ E[X]/75 = 50/75 = 2/3

Now, for the probability that the production lies between 40 and 60, with variance 25, we use Chebyshev's inequality:

P(|X - 50| ≥ 10) ≤ σ^2/10^2 = 25/100 = 1/4

so

P(40 < X < 60) = 1 - P(|X - 50| ≥ 10) ≥ 3/4

which shows that the probability that the weekly production lies between 40 and 60 is at least 3/4.
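A minimal simulation sketch comparing Chebyshev's bound with an estimated probability; the gamma model below is only an illustrative choice matching mean 50 and variance 25, not part of the original problem:

```python
# Compare Chebyshev's bound with a simulated tail probability.
import numpy as np

rng = np.random.default_rng(3)
mu, var = 50.0, 25.0
# gamma with shape k and scale s has mean k*s and variance k*s^2
s = var / mu            # 0.5
k = mu / s              # 100
x = rng.gamma(k, s, size=100_000)

print("P(|X-50| >= 10) simulated:", np.mean(np.abs(x - mu) >= 10))
print("Chebyshev upper bound    :", var / 10**2)  # 0.25
```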
2. Show that the upper bound provided by Chebyshev's inequality is not particularly close to the actual value of the probability.
Solution:
Consider a random variable X uniformly distributed over the interval (0, 10), so with mean 5 and variance 25/3. By Chebyshev's inequality we can write

P(|X - 5| > 4) ≤ (25/3)/4^2 ≈ 0.52

but the actual probability will be

P(|X - 5| > 4) = P(X < 1) + P(X > 9) = 0.2

which is far from the bound. Likewise, if we take the random variable X as normally distributed with mean μ and variance σ^2, Chebyshev's inequality gives

P(|X - μ| > 2σ) ≤ 1/4

but the actual probability is P(|X - μ| > 2σ) = 2(1 - Φ(2)) ≈ 0.0456.
Weak Law of Large Numbers
The weak law for a sequence of independent and identically distributed random variables X_1, X_2, … with finite mean μ states that for any ε > 0,

P(|(X_1 + ⋯ + X_n)/n - μ| ≥ ε) → 0 as n → ∞

Chebyshev's inequality is the main tool in such proofs. For example, it shows that the only random variables having variance equal to 0 are those which are constant with probability 1: by Chebyshev's inequality, for n greater than or equal to 1,

P(|X - μ| > 1/n) = 0

and by the continuity of probability,

0 = lim_{n→∞} P(|X - μ| > 1/n) = P(|X - μ| > 0)

which proves the result. To prove the weak law we assume the variance is also finite for each random variable in the sequence; then the expectation and variance of the average are E[(X_1+⋯+X_n)/n] = μ and Var((X_1+⋯+X_n)/n) = σ^2/n, so Chebyshev's inequality gives P(|(X_1+⋯+X_n)/n - μ| ≥ ε) ≤ σ^2/(nε^2) → 0.
The central limit theorem is one of the most important results in probability theory, as it shows that the distribution of a sum of a large number of random variables is approximately normal; besides providing a method for finding approximate probabilities for sums of independent random variables, it also explains why the empirical frequencies of so many natural populations exhibit bell-shaped (normal) curves. Before giving the detailed statement of this theorem we use the following result:
"If the sequence of random variables Z_1, Z_2, … has distribution functions F_{Z_n} and moment generating functions M_{Z_n}, and if M_{Z_n}(t) → M_Z(t) for all t, then F_{Z_n}(t) → F_Z(t) at all points where F_Z is continuous."
Central limit theorem: For a sequence of independent and identically distributed random variables X_1, X_2, …, each having mean μ and variance σ^2, the distribution of the standardized sum

(X_1 + ⋯ + X_n - nμ)/(σ√n)

tends to the standard normal as n tends to infinity: for every real value a,

P((X_1 + ⋯ + X_n - nμ)/(σ√n) ≤ a) → Φ(a)
Proof: To prove the result, consider first the case of mean zero and variance one, i.e. μ = 0 and σ^2 = 1, and assume the moment generating function M(t) of the X_i exists and is finite valued; the moment generating function of X_i/√n will be

E[exp(t X_i/√n)] = M(t/√n)

hence the moment generating function of the sum Σ X_i/√n will be [M(t/√n)]^n.
Now let L(t) = log M(t),
so that L(0) = 0, L'(0) = μ = 0, L''(0) = E[X^2] = 1.
To complete the proof we must show that [M(t/√n)]^n → e^{t^2/2},
or its equivalent form n L(t/√n) → t^2/2,
which follows by two applications of L'Hôpital's rule.
Hence this shows the result for mean zero and variance 1, and the same result follows for the general case by applying the argument to the standardized variables X_i* = (X_i - μ)/σ; for each a we then have the convergence of the distribution functions stated above.
Example of the Central Limit theorem
An astronomer wants to measure the distance, in light years, from his laboratory to a star. Because of atmospheric changes, each measurement is not exact but carries some error, so he plans to take a sequence of observations and use their average as the estimated distance. If he considers the measured values to be independent and identically distributed random variables with mean d (the actual distance) and variance 4 (light years squared), how many measurements must he make so that the estimated value is within 0.5 of the actual value?
Solution: Let us consider the measurements as independent random variables in a sequence X_1, X_2, …, X_n. By the central limit theorem,

Z_n = (Σ_i X_i - nd)/(2√n)

is approximately standard normal, so the probability will be

P(-0.5 ≤ X̄_n - d ≤ 0.5) = P(-0.5 √n/2 ≤ Z_n ≤ 0.5 √n/2) ≈ 2Φ(√n/4) - 1

so to get 95 percent accuracy the astronomer should measure n* distances, where

2Φ(√(n*)/4) - 1 = 0.95, i.e. Φ(√(n*)/4) = 0.975

so from the normal distribution table we can write

√(n*)/4 = 1.96, giving n* = (7.84)^2 ≈ 61.47

which says the measurement should be done 62 times. This can also be observed with the help of Chebyshev's inequality: since E[X̄_n] = d and Var(X̄_n) = 4/n,

P(|X̄_n - d| > 0.5) ≤ 4/(n (0.5)^2) = 16/n

hence for n = 16/0.05 = 320 we are certain, with probability at least 0.95, that the error in the measured distance of the star from the lab is at most 0.5.
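A minimal sketch of the sample-size computation, assuming SciPy for the normal quantile:

```python
# Find n so that P(|Xbar - d| <= 0.5) >= 0.95 when each measurement
# has variance 4 (sigma = 2).
import math
from scipy.stats import norm

z = norm.ppf(0.975)                  # ~1.96
n = math.ceil((z * 2 / 0.5) ** 2)    # CLT-based sample size
print(n)                             # 62, matching the worked answer

# Chebyshev-based alternative: 16/n <= 0.05  ->  n = 320
print(math.ceil(16 / 0.05))
```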
2. The number of students admitted to an engineering course is Poisson distributed with mean 100. It is decided that if 120 or more students are admitted, the teaching will be in two sections, otherwise in one section only. What is the probability that there will be two sections for the course?
Solution: Following the Poisson distribution, the exact solution

P(X ≥ 120) = Σ_{i=120}^{∞} e^{-100} 100^i/i!

obviously does not give a convenient numerical value. If we consider the random variable X as the number of students admitted, then by the central limit theorem (a Poisson variable with mean 100 is approximately normal with mean 100 and variance 100), with the continuity correction this can be written as

P(X ≥ 120) = P(X ≥ 119.5) ≈ P((X - 100)/√100 ≥ 1.95) ≈ 1 - Φ(1.95) ≈ 0.0256

which is the numerical value.
3. Calculate the probability that the sum on ten dice, when rolled, lies between 30 and 40, including 30 and 40.
Solution: Here consider the dice as X_i for the ten values of i; each has mean E[X_i] = 7/2 and variance Var(X_i) = 35/12, so the sum X has mean 35 and variance 350/12. Using the central limit theorem with the continuity correction,

P(29.5 ≤ X ≤ 40.5) ≈ 2Φ(5.5/√(350/12)) - 1 = 2Φ(1.0184) - 1 ≈ 0.69

thus the probability for the sum of the dice is about 69 percent.
4. An evaluator grades exams whose grading times are independent with mean 20 minutes and standard deviation 4 minutes; out of 50 exams, find the probability that the evaluator grades the first 25 exams within the first 450 minutes.
Solution: Let the time required to grade the i-th exam be the random variable X_i, so the total time is X = Σ_{i=1}^{25} X_i, with E[X] = 25 × 20 = 500 and Var(X) = 25 × 16 = 400. By the central limit theorem,

P(X ≤ 450) ≈ P(Z ≤ (450 - 500)/20) = Φ(-2.5) ≈ 0.006
Central Limit theorem for independent random variables
For a sequence of independent but not necessarily identically distributed random variables X_1, X_2, …, with means μ_i and variances σ_i^2, the central limit theorem still holds provided
each X_i is uniformly bounded, and
the sum of the variances Σ σ_i^2 is infinite; then for every real a,

P((Σ_i (X_i - μ_i))/√(Σ_i σ_i^2) ≤ a) → Φ(a) as n → ∞
Strong Law of Large Numbers
The strong law of large numbers is a crucial result of probability theory: it says that the average of a sequence of independent random variables with a common distribution converges, with probability one, to the mean of that same distribution.
Statement: For a sequence of independent and identically distributed random variables X_1, X_2, …, each having finite mean μ, with probability one,

(X_1 + X_2 + ⋯ + X_n)/n → μ as n → ∞
Proof: To prove this, consider first the case where the mean of each random variable is zero, and consider the partial sums S_n = X_1 + X_2 + ⋯ + X_n; the key step is to bound the fourth power of the sums, E[S_n^4]. After expanding S_n^4, the terms of the forms E[X_i^3 X_j], E[X_i^2 X_j X_k], and E[X_i X_j X_k X_l] (with distinct indices) vanish, since by independence the mean of each such product factors and is zero; with the help of counting the combinations of pairs, the expansion of the series reduces to

E[S_n^4] = n E[X^4] + 3n(n-1)(E[X^2])^2

so, since both moments are bounded by some constant K, we get the inequality

E[Σ_n S_n^4/n^4] = Σ_n E[S_n^4]/n^4 < ∞

hence, by the convergence of this series, S_n^4/n^4 → 0 with probability one, and therefore S_n/n → 0 with probability one. If the mean μ of each random variable is not zero, then applying the argument to the deviations X_i - μ we can write, with probability one,

lim_{n→∞} Σ_i (X_i - μ)/n = 0

or

lim_{n→∞} Σ_i X_i/n = μ

which is the required result.
One Sided Chebyshev Inequality
The one-sided Chebyshev inequality for a random variable X with mean zero and finite variance σ^2 states that, for a > 0,

P(X ≥ a) ≤ σ^2/(σ^2 + a^2)

To prove this, consider b > 0 and note that X ≥ a is the same event as X + b ≥ a + b, so

P(X ≥ a) ≤ P((X + b)^2 ≥ (a + b)^2) ≤ E[(X + b)^2]/(a + b)^2 = (σ^2 + b^2)/(a + b)^2

and minimizing the right-hand side over b (the minimum occurs at b = σ^2/a) gives the required inequality. For a random variable with mean μ and variance σ^2 we can write it as

P(X - μ ≥ a) ≤ σ^2/(σ^2 + a^2) and P(μ - X ≥ a) ≤ σ^2/(σ^2 + a^2), a > 0
Example:
Find an upper bound on the probability that the production of a company, treated as a random variable, will be at least 120, if the production of this company has mean 100 and variance 400.
Solution: Using the one-sided Chebyshev inequality,

P(X ≥ 120) = P(X - 100 ≥ 20) ≤ 400/(400 + 20^2) = 1/2

so the probability that the production within a week is at least 120 is at most 1/2. The bound obtained from Markov's inequality would be

P(X ≥ 120) ≤ E[X]/120 = 100/120 = 5/6

so the one-sided Chebyshev inequality gives the sharper upper bound for the probability.
Example:
One hundred pairs are formed from two hundred persons, one hundred men and one hundred women; find an upper bound on the probability that at most thirty pairs will consist of a man and a woman.
Solution:
Let the random variable X_i, i = 1, …, 100, be

X_i = 1 if the i-th man is paired with a woman, and 0 otherwise

so the number of man-woman pairs can be expressed as X = Σ_i X_i. Since every man is equally likely to be paired with any of the remaining 199 people, of whom one hundred are women, the mean is

E[X] = Σ_i E[X_i] = 100 × 100/199 ≈ 50.25

and a similar computation gives Var(X) ≈ 25.126. The two-sided Chebyshev inequality gives P(X ≤ 30) ≤ P(|X - 50.25| ≥ 20.25) ≤ 25.126/(20.25)^2 ≈ 0.061; we can improve the bound by using the one-sided Chebyshev inequality:

P(X ≤ 30) = P(X - 50.25 ≤ -20.25) ≤ 25.126/(25.126 + (20.25)^2) ≈ 0.058

which tells us that the probability of pairing at most 30 men with women is less than six percent.
Chernoff Bound
If the moment generating function M(t) = E[e^{tX}] is already known, then for t > 0

P(X ≥ a) = P(e^{tX} ≥ e^{ta}) ≤ e^{-ta} M(t)

as follows from Markov's inequality applied to e^{tX}. In the same way we can write, for t < 0,

P(X ≤ a) ≤ e^{-ta} M(t)

Thus the Chernoff bound can be defined as

P(X ≥ a) ≤ min_{t>0} e^{-ta} M(t), P(X ≤ a) ≤ min_{t<0} e^{-ta} M(t)

and the inequality e^{-ta} M(t) holds for all values of t, positive in the first case and negative in the second.
Chernoff bounds for the standard normal random variable
The Chernoff bound for a standard normal random variable, whose moment generating function is M(t) = e^{t^2/2}, is

P(X ≥ a) ≤ e^{-ta} e^{t^2/2}, t > 0

so minimizing the right-hand side over t (the minimum is at t = a) gives, for a > 0,

P(X ≥ a) ≤ e^{-a^2/2}

and for a < 0 it is

P(X ≤ a) ≤ e^{-a^2/2}
Chernoff bounds for the Poisson random variable
The Chernoff bound for a Poisson random variable with parameter λ, whose moment generating function is M(t) = e^{λ(e^t - 1)}, is

P(X ≥ i) ≤ e^{λ(e^t - 1)} e^{-it}, t > 0

so minimizing the right-hand side over t (the minimum is at e^t = i/λ, valid when i > λ) gives

P(X ≥ i) ≤ e^{-λ} (eλ)^i / i^i
Example on Chernoff Bounds
In a game a player is equally likely to win or lose one unit, independently of any past score; find the Chernoff bound for the probability that the total winnings after n plays are at least a.
Solution: Let X_i denote the outcome of the i-th play, so that the probabilities are

P(X_i = 1) = P(X_i = -1) = 1/2

and for the sequence of n plays let S_n = Σ_i X_i. The moment generating function of each play is

E[e^{tX}] = (e^t + e^{-t})/2

Here, using the expansions of the exponential terms,

(e^t + e^{-t})/2 = Σ_n t^{2n}/(2n)! ≤ Σ_n (t^2/2)^n/n! = e^{t^2/2}

so we have E[e^{tX}] ≤ e^{t^2/2}. Now applying the multiplicative property of moment generating functions for independent summands,

E[e^{t S_n}] = (E[e^{tX}])^n ≤ e^{n t^2/2}

This gives the inequality

P(S_n ≥ a) ≤ e^{-ta} e^{n t^2/2}, t > 0

hence, minimizing over t (the minimum is at t = a/n, for a, n > 0),

P(S_n ≥ a) ≤ e^{-a^2/2n}
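A minimal simulation sketch, assuming NumPy, comparing this Chernoff bound with an estimated probability for illustrative values of n and a:

```python
# Chernoff bound P(S_n >= a) <= exp(-a^2 / 2n) for a fair +/-1 game.
import numpy as np

rng = np.random.default_rng(4)
n, a = 100, 20
steps = rng.choice([-1, 1], size=(50_000, n))
s = steps.sum(axis=1)

print("simulated P(S_n >= a):", np.mean(s >= a))       # ~0.03
print("Chernoff bound       :", np.exp(-a**2 / (2 * n)))  # ~0.135
```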
Conclusion:
The inequalities and limit theorems for large numbers were discussed, and worked examples for the probability bounds were given to convey the ideas; normal and Poisson random variables and moment generating functions were used to demonstrate the concepts easily. If you require further reading, go through the books below or, for more articles on probability, follow our mathematics pages.
A First Course in Probability by Sheldon Ross
Schaum's Outlines of Probability and Statistics
An Introduction to Probability and Statistics by Rohatgi and Saleh
Probability is a fundamental concept in mathematics that allows us to quantify uncertainty and make predictions about the likelihood of events occurring. It plays a crucial role in various fields, including statistics, economics, physics, and computer science. In this section, we will explore the definition of probability and its importance in mathematics, as well as the axioms that form the foundation of probability theory.
Definition of Probability and Its Importance in Math
Probability can be defined as a measure of the likelihood of an event occurring. It is represented as a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty. The concept of probability is essential in mathematics because it helps us analyze and understand uncertain situations.
In real life, we encounter probabilistic situations every day. For example, when flipping a fair coin, we know that the probability of it landing on heads is 0.5. Similarly, when rolling a fair six-sided die, the probability of rolling a specific number, say 3, is 1/6. By understanding and applying probability, we can make informed decisions and assess risks in various scenarios.
Probability theory provides a systematic framework for studying and analyzing uncertain events. It allows us to mathematically model and analyze random phenomena, such as coin flips, dice rolls, and card games. By using probability theory, we can calculate the likelihood of different outcomes, estimate the expected value of random variables, and make predictions based on available data.
Axioms of Probability Theory
To ensure a consistent and coherent approach to probability, mathematicians have established a set of axioms that form the foundation of probability theory. These axioms provide a rigorous framework for defining and manipulating probabilities. Let’s take a closer look at the three axioms of probability:
Non-negativity: The probability of any event is always a non-negative number. In other words, the probability of an event cannot be negative.
Additivity: For any collection of mutually exclusive events (events that cannot occur simultaneously), the probability of the union of these events is equal to the sum of their individual probabilities. This axiom allows us to calculate the probability of complex events by considering the probabilities of their constituent parts.
Normalization: The probability of the entire sample space (the set of all possible outcomes) is equal to 1. This axiom ensures that the total probability of all possible outcomes is always 1, providing a consistent framework for probability calculations.
By adhering to these axioms, we can ensure that our calculations and reasoning about probabilities are logically sound and consistent. These axioms, along with other probability concepts, such as conditional probability, independence, and Bayes’ theorem, form the building blocks of probability theory.
In the upcoming sections, we will delve deeper into probability theory, exploring various probability concepts, examples, exercises, and calculations. By understanding the axioms and principles of probability, we can develop a solid foundation for tackling more complex probability problems and applying probability in real-world scenarios.
Problems on Probability and Its Axioms
Example 1: Restaurant Menu Combinations
Imagine you’re at a restaurant with a diverse menu, offering a variety of appetizers, entrees, and desserts. Let’s say there are 5 appetizers, 10 entrees, and 3 desserts to choose from. How many different combinations of a meal can you create?
To solve this problem, we can use the fundamental principle of counting. The principle states that if there are m ways to do one thing and n ways to do another, then there are m * n ways to do both.
In this case, we can multiply the number of choices for each course: 5 appetizers * 10 entrees * 3 desserts = 150 different combinations of a meal.
Example 2: Probability of Item Purchases
Suppose you’re running an online store and you want to analyze the probability of customers purchasing certain items together. Let’s say you have 100 customers, and you track their purchase history. Out of these customers, 30 have bought item A, 40 have bought item B, and 20 have bought both items A and B. What is the probability that a randomly selected customer has bought either item A or item B?
To solve this problem, we can use the principle of inclusion-exclusion. This principle allows us to calculate the probability of the union of two events by subtracting the probability of their intersection.
First, we calculate the probability of buying item A or item B separately. The probability of buying item A is 30/100 = 0.3, and the probability of buying item B is 40/100 = 0.4.
Next, we calculate the probability of buying both item A and item B. This is given by the intersection of the two events, which is 20/100 = 0.2.
To find the probability of buying either item A or item B, we add the probabilities of buying each item and subtract the probability of buying both items: 0.3 + 0.4 – 0.2 = 0.5.
Therefore, the probability that a randomly selected customer has bought either item A or item B is 0.5.
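A minimal sketch checking this inclusion-exclusion computation both by the formula and by counting the customers directly (the counts are the hypothetical ones from the example):

```python
# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
n, n_a, n_b, n_ab = 100, 30, 40, 20
p_either = n_a / n + n_b / n - n_ab / n
n_either = n_a + n_b - n_ab        # customers buying A or B (or both)
print(p_either, n_either / n)      # both give 0.5
```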
Example 3: Probability of Card Occurrences
Let’s consider a standard deck of 52 playing cards. What is the probability of drawing a heart or a diamond from the deck?
To solve this problem, we need to determine the number of favorable outcomes (drawing a heart or a diamond) and the total number of possible outcomes (drawing any card from the deck).
There are 13 hearts and 13 diamonds in a deck, so the number of favorable outcomes is 13 + 13 = 26.
The total number of possible outcomes is 52 (since there are 52 cards in a deck).
Therefore, the probability of drawing a heart or a diamond is 26/52 = 0.5.
Example 4: Probability of Temperature Occurrences
Suppose you are interested in predicting the weather for the next day. You have observed that over the past year, the probability of a hot day is 0.3, the probability of a cold day is 0.2, and the probability of a rainy day is 0.4. What is the probability that tomorrow will be either hot or cold, but not rainy?
To solve this problem, we can use the probability addition rule. The rule states that the probability of the union of two mutually exclusive events is the sum of their individual probabilities.
In this case, the events “hot day” and “cold day” are mutually exclusive, meaning they cannot occur at the same time. Therefore, we can simply add their probabilities: 0.3 + 0.2 = 0.5.
Therefore, the probability that tomorrow will be either hot or cold, but not rainy, is 0.5.
Example 5: Probability of Card Denominations and Suits
Consider a standard deck of 52 playing cards. What is the probability of drawing a card that is either a king or a spade?
To solve this problem, we need to determine the number of favorable outcomes (drawing a king or a spade) and the total number of possible outcomes (drawing any card from the deck).
There are 4 kings and 13 spades in a deck, so the number of favorable outcomes is 4 + 13 = 17.
The total number of possible outcomes is 52 (since there are 52 cards in a deck).
Therefore, the probability of drawing a card that is either a king or a spade is 17/52 ≈ 0.327.
Example 6: Probability of Pen Colors
Suppose you have a bag containing 5 red pens, 3 blue pens, and 2 green pens. What is the probability of randomly selecting a red or blue pen from the bag?
To solve this problem, we need to determine the number of favorable outcomes (selecting a red or blue pen) and the total number of possible outcomes (selecting any pen from the bag).
There are 5 red pens and 3 blue pens in the bag, so the number of favorable outcomes is 5 + 3 = 8.
The total number of possible outcomes is 5 + 3 + 2 = 10 (since there are 5 red pens, 3 blue pens, and 2 green pens in the bag).
Therefore, the probability of randomly selecting a red or blue pen from the bag is 8/10 = 0.8.
Example 7: Probability of Committee Formation
Suppose there are 10 people, 5 men and 5 women, and you need to form a committee of 3 people. What is the probability that you select 2 men and 1 woman for the committee?
To solve this problem, we need to determine the number of favorable outcomes (selecting 2 men and 1 woman) and the total number of possible outcomes (selecting any 3 people from the group of 10).
First, we calculate the number of ways to select 2 men from a group of 5 men: C(5, 2) = 10.
Next, we calculate the number of ways to select 1 woman from a group of 5 women: C(5, 1) = 5.
To find the total number of favorable outcomes, we multiply the number of ways to select 2 men by the number of ways to select 1 woman: 10 * 5 = 50.
The total number of possible outcomes is the number of ways to select any 3 people from a group of 10: C(10, 3) = 120.
Therefore, the probability of selecting 2 men and 1 woman for the committee is 50/120 ≈ 0.417.
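A minimal sketch verifying the committee probability with Python's math.comb, under the stated assumption that the group splits into 5 men and 5 women:

```python
# Committee probability: choose 2 of 5 men and 1 of 5 women out of C(10, 3).
from math import comb

favorable = comb(5, 2) * comb(5, 1)  # 10 * 5 = 50
total = comb(10, 3)                  # 120
print(favorable / total)             # 50/120 ~ 0.417
```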
Example 8: Probability of Suit Occurrences in a Card Hand
Consider a standard deck of 52 playing cards. What is the probability of drawing a hand of 5 cards that contains at least one card of each suit (hearts, diamonds, clubs, and spades)?
To solve this problem, we need to determine the number of favorable outcomes (drawing a hand with at least one card of each suit) and the total number of possible outcomes (drawing any hand of 5 cards from the deck).
First, we count the hands containing all four suits. One suit must contribute two cards: there are C(4, 1) = 4 ways to choose that suit, C(13, 2) = 78 ways to choose its two cards, and 13³ = 2,197 ways to choose one card from each of the other three suits, giving 4 × 78 × 2,197 = 685,464 favorable hands.
Next, we calculate the total number of possible outcomes, which is the number of ways to draw any 5 cards from a deck of 52: C(52, 5) = 2,598,960.
Therefore, the probability of drawing a hand of 5 cards that contains at least one card of each suit is 685,464/2,598,960 ≈ 0.264.
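A minimal Monte Carlo sketch (standard library only) checking this probability; since only the suit of each card matters for the event, the deck is represented by suit labels alone:

```python
# Estimate P(a 5-card hand contains all four suits); exact value ~0.2637.
import random

suits = [s for s in range(4) for _ in range(13)]  # 52 cards, suits only
trials, hits = 100_000, 0
for _ in range(trials):
    hand = random.sample(suits, 5)  # deal 5 cards without replacement
    if len(set(hand)) == 4:         # all four suits present
        hits += 1
print(hits / trials)
```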
Example 9: Probability of choosing the same letter from two words
When it comes to probability, we often encounter interesting problems that challenge our understanding of the subject. Let’s consider an example that involves choosing the same letter from two words.
Suppose we have two words, “apple” and “banana.” We want to determine the probability of randomly selecting the same letter from both words. To solve this problem, we need to break it down into smaller steps.
First, let’s list all the letters in each word:
Word 1: “apple”
Word 2: “banana”
Now, we can calculate the probability of choosing the same letter by considering which letters the two words share. Let's go through the process step by step:
Selecting a letter from the first word:
The word “apple” has five letters, namely ‘a’, ‘p’, ‘p’, ‘l’, and ‘e’, so each position is selected with probability 1 out of 5.
Selecting a letter from the second word:
The word “banana” has six letters, namely ‘b’, ‘a’, ‘n’, ‘a’, ‘n’, and ‘a’, so each position is selected with probability 1 out of 6.
Calculating the probability of choosing the same letter:
The only letter the two words have in common is ‘a’, so the two draws match only if both are ‘a’. The probability of drawing ‘a’ from “apple” is 1/5, and the probability of drawing ‘a’ from “banana” is 3/6 = 1/2. Since the draws are independent, we multiply the probabilities together.
The probability of selecting the same letter is (1/5) × (1/2) = 1/10.
Therefore, the probability of choosing the same letter from the words “apple” and “banana” is 1/10.
What are the important properties of conditional expectation and how do they relate to problems on probability and its axioms?
The concept of conditional expectation is a fundamental concept in probability theory, and it has important properties that can help us solve problems related to probability and its axioms. To understand these properties and their relationship to probability problems, it is essential to delve into the properties of conditional expectation. These properties provide insights into how conditional expectations behave and can be used to calculate expectations and probabilities in various scenarios. By understanding them, we can bridge the gap between probability and its axioms and the idea of conditional expectation, enabling us to tackle complex probability problems with confidence.
Frequently Asked Questions
1. What is the importance of probability in math?
Probability is important in math because it allows us to quantify uncertainty and make predictions based on available information. It provides a framework for analyzing and understanding random events and their likelihood of occurrence.
2. How would you define probability and its axioms?
Probability is a measure of the likelihood of an event occurring. It is defined using three axioms:
The probability of any event is a non-negative number.
The probability of the entire sample space is 1.
The probability of the union of mutually exclusive events is equal to the sum of their individual probabilities.
3. What are the three axioms of probability?
The three axioms of probability are:
Non-negativity: The probability of any event is a non-negative number.
Normalization: The probability of the entire sample space is 1.
Additivity: The probability of the union of mutually exclusive events is equal to the sum of their individual probabilities.
4. What are the axioms of expected utility theory?
The axioms of expected utility theory are a set of assumptions that describe how individuals make decisions under uncertainty. They include the axioms of completeness, transitivity, continuity, and independence.
5. What are the axioms of probability theory?
The axioms of probability theory are the fundamental principles that govern the behavior of probabilities. They include the axioms of non-negativity, normalization, and additivity.
6. Can you provide some solved problems on axioms of probability?
Certainly! Here is an example:
Problem: A fair six-sided die is rolled. What is the probability of rolling an even number?
Solution: Since the die is fair, it has six equally likely outcomes: {1, 2, 3, 4, 5, 6}. Out of these, three are even numbers: {2, 4, 6}. Therefore, the probability of rolling an even number is 3/6 = 1/2.
7. Where can I find probability problems and answers?
You can find probability problems and answers in various resources such as textbooks, online math websites, and educational platforms. Additionally, there are specific websites that provide probability problems and solutions, such as Math-Aids Answers.
8. Are there any probability examples available?
Yes, there are many probability examples available. Some common examples include flipping a coin, rolling dice, drawing cards from a deck, and selecting balls from an urn. These examples help illustrate how probability concepts can be applied in different scenarios.
9. What are some probability formulas and rules?
There are several probability formulas and rules that are commonly used, including:
Addition Rule: P(A or B) = P(A) + P(B) – P(A and B)
Multiplication Rule: P(A and B) = P(A) * P(B|A)
Complement Rule: P(A’) = 1 – P(A)
Conditional Probability: P(A|B) = P(A and B) / P(B)
Bayes’ Theorem: P(A|B) = P(B|A) * P(A) / P(B)
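These rules can be checked numerically by exhaustive enumeration over a small sample space; the following is an illustrative Python sketch (the two-dice events A and B below are our own choices, not from the original text).

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == 6            # first die shows 6
B = lambda w: w[0] + w[1] >= 10    # sum is at least 10

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(lambda w: A(w) or B(w)) == prob(A) + prob(B) - prob(lambda w: A(w) and B(w))

# Conditional probability and Bayes' theorem agree:
p_a_given_b = prob(lambda w: A(w) and B(w)) / prob(B)
p_b_given_a = prob(lambda w: A(w) and B(w)) / prob(A)
assert p_a_given_b == p_b_given_a * prob(A) / prob(B)

print(p_a_given_b)  # 1/2
```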
10. Can you suggest some probability exercises for practice?
Certainly! Here are a few probability exercises you can try:
A bag contains 5 red balls and 3 blue balls. What is the probability of drawing a red ball?
Two dice are rolled. What is the probability of getting a sum of 7?
A deck of cards is shuffled and one card is drawn. What is the probability of drawing a heart?
A jar contains 10 red marbles and 5 green marbles. If two marbles are drawn without replacement, what is the probability of getting two red marbles?
A spinner is divided into 8 equal sections numbered 1 to 8. What is the probability of landing on an even number?
These exercises will help you practice applying probability concepts and calculations.
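For two of these exercises (the sum of 7 and the two red marbles), a short enumeration sketch like the following can be used to confirm the answers of 1/6 and 3/7; it is an illustrative check, not part of the exercises themselves.

```python
from fractions import Fraction
from itertools import product, combinations

# Exercise: two dice, probability of a sum of 7.
dice = list(product(range(1, 7), repeat=2))
p_sum7 = Fraction(sum(1 for a, b in dice if a + b == 7), len(dice))
print(p_sum7)  # 1/6

# Exercise: 10 red and 5 green marbles, two drawn without replacement.
marbles = ["R"] * 10 + ["G"] * 5
pairs = list(combinations(range(len(marbles)), 2))
both_red = sum(1 for i, j in pairs if marbles[i] == "R" and marbles[j] == "R")
print(Fraction(both_red, len(pairs)))  # 3/7
```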
Random variables that depend on one another require the calculation of conditional probabilities, which we have already discussed; now we will discuss some further parameters for such random variables or experiments, namely the conditional expectation and conditional variance, for different types of random variables.
Conditional Expectation
The conditional probability mass function of the discrete random variable X given Y is defined as
pX|Y(x|y) = P(X = x|Y = y) = p(x,y)/pY(y), for pY(y) > 0
where p(x,y) is the joint probability mass function, and the conditional expectation of X given Y = y is
E[X|Y = y] = Σx x pX|Y(x|y)
In a similar way, if X and Y are continuous, then the conditional probability density function of the random variable X given Y is
fX|Y(x|y) = f(x,y)/fY(y)
where f(x,y) is the joint probability density function and y is such that fY(y) > 0, so the conditional expectation of the random variable X given Y = y will be
E[X|Y = y] = ∫ x fX|Y(x|y) dx
for all y with fY(y) > 0.
As all the properties of probability apply to conditional probability, so too all the properties of mathematical expectation are satisfied by conditional expectation; for example, the conditional expectation of a function of a random variable will be
E[g(X)|Y = y] = Σx g(x) pX|Y(x|y)
in the discrete case (with the corresponding integral in the continuous case), and the conditional expectation of a sum of random variables will be
E[X1 + X2 + … + Xn|Y = y] = E[X1|Y = y] + … + E[Xn|Y = y]
Conditional Expectation for the sum of binomial random variables
To find the conditional expectation of the sum of binomial random variables X and Y, independent with common parameters n and p, we know that X + Y will also be a binomial random variable with parameters 2n and p, so for the random variable X given X + Y = m the conditional expectation is obtained by calculating the probability
P(X = k|X + Y = m) = P(X = k)P(Y = m − k)/P(X + Y = m)
since we know that
P(X = k) = C(n,k) p^k (1 − p)^(n−k), P(Y = m − k) = C(n, m − k) p^(m−k) (1 − p)^(n−m+k) and P(X + Y = m) = C(2n, m) p^m (1 − p)^(2n−m)
the powers of p and (1 − p) cancel, giving
P(X = k|X + Y = m) = C(n,k)C(n, m − k)/C(2n, m)
which is the hypergeometric distribution; thus the conditional expectation of X given X + Y = m is
E[X|X + Y = m] = m/2
To calculate a conditional expectation in the continuous case we require the conditional probability density function, so for a given joint density f(x,y) we first form
fX|Y(x|y) = f(x,y)/fY(y)
since for the continuous random variable the conditional expectation is
E[X|Y = y] = ∫ x fX|Y(x|y) dx
hence for any given density function the conditional expectation is obtained by computing this conditional density and then carrying out the integration.
Expectation by conditioning||Expectation by conditional expectation
We can calculate the mathematical expectation with the help of the conditional expectation of X given Y as
E[X] = E[E[X|Y]]
for discrete random variables this will be
E[X] = Σy E[X|Y = y] P(Y = y)
which can be obtained as
E[X] = Σx x pX(x) = Σx Σy x p(x,y) = Σy (Σx x pX|Y(x|y)) pY(y) = Σy E[X|Y = y] pY(y)
and for continuous random variables we can similarly show
E[X] = ∫ E[X|Y = y] fY(y) dy
Example:
A person is trapped in his building underground, as the entrance is blocked due to some heavy load. Fortunately there are three pipelines through which he can try to come out: the first pipe takes him safely out after 3 hours, the second returns him to the starting point after 5 hours, and the third returns him to the starting point after 7 hours. If he chooses any of these pipelines with equal probability each time, what is the expected time until he comes outside safely?
Solution:
Let X be the random variable denoting the time in hours until the person comes out safely, and let Y denote the pipe he chooses initially, so
E[X] = E[X|Y = 1]P(Y = 1) + E[X|Y = 2]P(Y = 2) + E[X|Y = 3]P(Y = 3)
since
E[X|Y = 1] = 3, E[X|Y = 2] = 5 + E[X], E[X|Y = 3] = 7 + E[X]
If the person chooses the second pipe, he spends 5 hours in it but is then back where he started, so his expected additional time is E[X]; the same reasoning gives the conditional expectation for the third pipe. So the expectation will be
E[X] = (1/3)(3 + 5 + E[X] + 7 + E[X]) = 5 + (2/3)E[X]
hence E[X] = 15, that is, the expected time until he comes out safely is 15 hours.
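A short Monte Carlo simulation is a convenient way to corroborate the value of 15 hours; the following Python sketch mimics the pipe-choosing process described above.

```python
import random

# Monte Carlo check of E[X] = 15 for the trapped-person example:
# pipe 1 leads out after 3 hours; pipes 2 and 3 return the person
# to the start after 5 and 7 hours respectively.
def escape_time():
    t = 0.0
    while True:
        pipe = random.choice([1, 2, 3])
        if pipe == 1:
            return t + 3
        t += 5 if pipe == 2 else 7

random.seed(0)
n = 200_000
print(sum(escape_time() for _ in range(n)) / n)  # ≈ 15
```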
Expectation of sum of random number of random variables using conditional expectation
Let N be a random number of random variables, i.e. let the sum of the random variables be
SN = X1 + X2 + … + XN
where X1, X2, … are independent and identically distributed and N is independent of them; then the expectation is
E[SN] = E[E[SN|N]]
since
E[SN|N = n] = E[X1 + … + Xn] = nE[X]
as the Xi are identically distributed, this gives E[SN|N] = NE[X], and thus
E[SN] = E[NE[X]] = E[N]E[X]
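The identity E[SN] = E[N]E[X] can be checked by simulation; in the sketch below, the choices N ~ Poisson(4) and Xi ~ Uniform(0,1) are illustrative assumptions, giving an expected sum of 4 × 0.5 = 2.

```python
import numpy as np

# Monte Carlo check of E[sum_{i=1}^N X_i] = E[N] E[X]:
# N ~ Poisson(4) (E[N] = 4) and X_i ~ Uniform(0,1) (E[X] = 0.5),
# with N independent of the X_i.
rng = np.random.default_rng(0)
trials = 200_000
counts = rng.poisson(4, size=trials)
sums = np.array([rng.uniform(0, 1, size=k).sum() for k in counts])
print(sums.mean())  # ≈ 2
```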
Correlation of bivariate distribution
If the probability density function of the bivariate normal random variables X and Y is
f(x,y) = 1/(2π σx σy √(1 − ρ²)) exp{ −1/(2(1 − ρ²)) [ ((x − μx)/σx)² − 2ρ(x − μx)(y − μy)/(σx σy) + ((y − μy)/σy)² ] }
where μx and μy are the means, σx and σy the standard deviations, and ρ the remaining parameter of the distribution,
then the correlation between the random variables X and Y for this bivariate distribution is ρ,
since correlation is defined as
Corr(X,Y) = Cov(X,Y)/(σx σy) = (E[XY] − μx μy)/(σx σy)
since the expectation using conditional expectation is
E[XY] = E[E[XY|Y]]
for the normal distribution the conditional distribution of X given Y has mean
E[X|Y = y] = μx + ρ(σx/σy)(y − μy)
now the expectation of XY given Y is
E[XY|Y] = Y E[X|Y] = Y(μx + ρ(σx/σy)(Y − μy))
this gives, using E[Y(Y − μy)] = Var(Y) = σy²,
E[XY] = μx μy + ρ(σx/σy)σy² = μx μy + ρ σx σy
hence
Corr(X,Y) = (E[XY] − μx μy)/(σx σy) = ρ
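A sampling check of this result is straightforward; in the sketch below the parameter values are illustrative, and the sample correlation should come out close to ρ = 0.6.

```python
import numpy as np

# Sampling check that the parameter rho of the bivariate normal
# density is indeed Corr(X, Y); parameter values are illustrative.
rng = np.random.default_rng(0)
mu = [1.0, -2.0]
sigma_x, sigma_y, rho = 2.0, 3.0, 0.6
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal(mu, cov, size=500_000).T
print(np.corrcoef(x, y)[0, 1])  # ≈ 0.6
```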
Variance of geometric distribution
In the geometric distribution we perform successive independent trials, each resulting in success with probability p. If N represents the trial of the first success in this succession, then the variance of N, by definition, will be
Var(N) = E[N²] − (E[N])²
Let the random variable Y = 1 if the first trial results in success and Y = 0 if the first trial results in failure; now to find E[N²] we apply conditional expectation as
E[N²] = E[E[N²|Y]]
since
E[N²|Y = 1] = 1 and E[N²|Y = 0] = E[(1 + N)²]
if the success is in the first trial then N = 1 and N² = 1; if a failure occurs in the first trial, then to get the first success the total number of trials has the same distribution as 1 (the first trial, which resulted in failure) plus the necessary number of additional trials, that is, 1 + N. Thus the expectation will be
E[N²] = p · 1 + (1 − p)E[(1 + N)²] = 1 + (1 − p)E[2N + N²]
since the expectation of the geometric distribution is E[N] = 1/p, so
E[N²] = 1 + 2(1 − p)/p + (1 − p)E[N²]
hence
pE[N²] = 1 + 2(1 − p)/p
and
E[N²] = (2 − p)/p²
so the variance of the geometric distribution will be
Var(N) = E[N²] − (E[N])² = (2 − p)/p² − 1/p² = (1 − p)/p²
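The mean 1/p and variance (1 − p)/p² can be corroborated with a quick simulation; the value p = 0.3 below is an illustrative choice.

```python
import random

# Monte Carlo check of E[N] = 1/p and Var(N) = (1-p)/p^2 for the
# number of trials up to and including the first success.
def first_success(p):
    n = 1
    while random.random() >= p:
        n += 1
    return n

random.seed(0)
p, trials = 0.3, 200_000
samples = [first_success(p) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print(mean, var)  # ≈ 3.33 and ≈ 7.78
```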
Expectation of the minimum number of uniform random variables whose sum exceeds one
For the sequence of uniform random variables U1, U2, … over the interval (0, 1), N is defined as
N = min{n : U1 + U2 + … + Un > 1}
more generally, for any x ∈ [0, 1], the value of N is
N(x) = min{n : U1 + U2 + … + Un > x}
we will set the expectation of N(x) as
m(x) = E[N(x)]
to find this expectation we use the definition of conditional expectation for a continuous random variable, conditioning on the first term of the sequence:
m(x) = ∫ from 0 to 1 of E[N(x)|U1 = y] dy
here we get
E[N(x)|U1 = y] = 1 if y > x, and E[N(x)|U1 = y] = 1 + m(x − y) if y ≤ x
the second case holds because, when the first uniform value is y ≤ x, the process starts over: the remaining number of uniform random variables needed has the same distribution as in the original problem, and we keep adding uniform random variables until their sum surpasses x − y.
so using this value of the expectation, the value of the integral will be
m(x) = 1 + ∫ from 0 to x of m(x − y) dy = 1 + ∫ from 0 to x of m(u) du
if we differentiate this equation we get
m′(x) = m(x)
and
m′(x)/m(x) = 1
now integrating this gives
log m(x) = x + c
hence
m(x) = k e^x
the value of k = 1 since m(0) = 1, so
m(x) = e^x
and m(1) = e, so the expected number of uniform random variables over the interval (0, 1) that need to be added until their sum surpasses 1 is equal to e.
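This classical result, m(1) = e, is easy to verify by simulation; the sketch below repeatedly counts how many uniforms are needed before the running sum exceeds 1.

```python
import random

# Monte Carlo check that the expected number of Uniform(0,1) values
# needed for their sum to exceed 1 is e ≈ 2.71828.
def count_until_sum_exceeds_one():
    total, n = 0.0, 0
    while total <= 1.0:
        total += random.random()
        n += 1
    return n

random.seed(0)
trials = 500_000
print(sum(count_until_sum_exceeds_one() for _ in range(trials)) / trials)
```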
Probability using conditional Expectation || probabilities using conditioning
We can also find probabilities by using conditional expectation, in the same way that we found expectations by conditioning; to see this, consider an event E and define the random variable X as
X = 1 if E occurs, and X = 0 if E does not occur
from the definition of this random variable and of expectation, clearly
E[X] = P(E) and E[X|Y = y] = P(E|Y = y)
now by conditioning, in either sense we have
P(E) = Σy P(E|Y = y)P(Y = y) if Y is discrete, and P(E) = ∫ P(E|Y = y) fY(y) dy if Y is continuous
Example:
Compute the probability mass function of the random variable X, if U is the uniform random variable on the interval (0,1) and the conditional distribution of X given U = p is binomial with parameters n and p.
Solution:
For the value of U, the probability by conditioning is
P(X = i) = ∫ from 0 to 1 of P(X = i|U = p) fU(p) dp = ∫ from 0 to 1 of C(n,i) p^i (1 − p)^(n−i) dp
we have the result
∫ from 0 to 1 of p^i (1 − p)^(n−i) dp = i!(n − i)!/(n + 1)!
so we will get
P(X = i) = C(n,i) · i!(n − i)!/(n + 1)! = 1/(n + 1), i = 0, 1, …, n
that is, X is equally likely to take each of the values 0, 1, …, n.
Example:
What is the probability that X < Y, if X and Y are independent continuous random variables with probability density functions fX and fY respectively?
Solution:
By using conditional expectation and conditional probability, conditioning on Y,
P(X < Y) = ∫ P(X < Y|Y = y) fY(y) dy = ∫ P(X < y) fY(y) dy = ∫ FX(y) fY(y) dy
as
FX(y) = ∫ from −∞ to y of fX(x) dx is the cumulative distribution function of X.
Example:
Calculate the distribution of the sum of the continuous independent random variables X and Y.
Solution:
To find the distribution of X + Y we find the probability of the sum by conditioning on Y as follows
FX+Y(a) = P(X + Y ≤ a) = ∫ P(X + Y ≤ a|Y = y) fY(y) dy = ∫ P(X ≤ a − y) fY(y) dy = ∫ FX(a − y) fY(y) dy
which is the convolution of the distributions of X and Y.
Conclusion:
The conditional expectation for discrete and continuous random variables was discussed with different examples, considering some types of these random variables and using independent random variables and joint distributions under various conditions. How to find expectations and probabilities using conditional expectation was also explained with examples. If you require further reading, go through the books below, or for more articles on probability, please follow our Mathematics pages.
The moment generating function is a very important function which generates the moments of a random variable, which involve the mean, standard deviation, variance and so on; with the help of the moment generating function alone we can find the basic moments as well as the higher moments. In this article we will see moment generating functions for different discrete and continuous random variables. The moment generating function (MGF) is defined with the help of mathematical expectation, denoted by M(t), as
M(t) = E[e^(tX)]
whose derivatives, evaluated at t = 0, generate the respective moments. These moments we collect by differentiating the moment generating function; for example, the first moment, or mean, we obtain by differentiating once as
M′(t) = d/dt E[e^(tX)]
interchanging differentiation and expectation (which is justified for the distributions considered here), we can write this as
M′(t) = E[d/dt e^(tX)] = E[X e^(tX)]
and
M″(t) = E[X² e^(tX)]
if t = 0 the above moments will be
M′(0) = E[X]
and
M″(0) = E[X²]
In general we can say that
d^n/dt^n M(t) = E[X^n e^(tX)]
hence
d^n/dt^n M(t) evaluated at t = 0 equals E[X^n]
Moment generating function of Binomial distribution||Binomial distribution moment generating function||MGF of Binomial distribution||Mean and Variance of Binomial distribution using moment generating function
The moment generating function of the random variable X which is binomially distributed with parameters n and p follows from the probability function of the binomial distribution as
M(t) = E[e^(tX)] = Σk e^(tk) C(n,k) p^k (1 − p)^(n−k) = (pe^t + 1 − p)^n
which is the result by the binomial theorem; now differentiating and putting in the value t = 0,
M′(t) = n(pe^t + 1 − p)^(n−1) pe^t, so E[X] = M′(0) = np
which is the mean or first moment of the binomial distribution; similarly the second moment will be
E[X²] = M″(0) = n(n − 1)p² + np
so the variance of the binomial distribution will be
Var(X) = E[X²] − (E[X])² = n(n − 1)p² + np − n²p² = np(1 − p)
which are the standard mean and variance of the binomial distribution; similarly, the higher moments can also be found using this moment generating function.
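The differentiation of the moment generating function can also be carried out symbolically; the sketch below uses the sympy library (assumed to be available) to recover np and np(1 − p) from M(t) = (pe^t + 1 − p)^n.

```python
import sympy as sp

# Symbolic check of the binomial moments via the MGF
# M(t) = (p*e^t + 1 - p)^n.
t, n, p = sp.symbols('t n p', positive=True)
M = (p * sp.exp(t) + 1 - p) ** n

m1 = sp.diff(M, t).subs(t, 0)       # E[X]
m2 = sp.diff(M, t, 2).subs(t, 0)    # E[X^2]
var = sp.simplify(m2 - m1**2)       # Var(X)

print(sp.simplify(m1))  # n*p
print(var)              # n*p*(1 - p), possibly in an expanded form
```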
Moment generating function of Poisson distribution||Poisson distribution moment generating function||MGF of Poisson distribution||Mean and Variance of Poisson distribution using moment generating function
If we have the random variable X which is Poisson distributed with parameter λ, then the moment generating function of this distribution will be
M(t) = E[e^(tX)] = Σk e^(tk) e^(−λ) λ^k/k! = e^(−λ) Σk (λe^t)^k/k! = exp{λ(e^t − 1)}
now differentiating this will give
M′(t) = λe^t exp{λ(e^t − 1)} and M″(t) = (λe^t)² exp{λ(e^t − 1)} + λe^t exp{λ(e^t − 1)}
this gives
E[X] = M′(0) = λ, E[X²] = M″(0) = λ² + λ, Var(X) = E[X²] − (E[X])² = λ
which gives mean and variance both equal to λ for the Poisson distribution, in agreement with the known result.
Moment generating function of Exponential distribution||Exponential distribution moment generating function||MGF of Exponential distribution||Mean and Variance of Exponential distribution using moment generating function
The moment generating function of the exponential random variable X, following the definition, is
M(t) = E[e^(tX)] = ∫ from 0 to ∞ of e^(tx) λe^(−λx) dx = λ/(λ − t), for t < λ
here the value of t must be less than the parameter λ; now differentiating this will give
M′(t) = λ/(λ − t)² and M″(t) = 2λ/(λ − t)³
which provides the moments
E[X] = M′(0) = 1/λ and E[X²] = M″(0) = 2/λ²
clearly
Var(X) = E[X²] − (E[X])² = 1/λ²
which are the mean and variance of the exponential distribution.
Moment generating function of Normal distribution||Normal distribution moment generating function||MGF of Normal distribution||Mean and Variance of Normal distribution using moment generating function
The moment generating function for continuous distributions is defined the same way as for discrete ones, so the moment generating function of the standard normal distribution, with probability density function f(z) = (1/√(2π)) e^(−z²/2), will be
M_Z(t) = E[e^(tZ)] = (1/√(2π)) ∫ e^(tz) e^(−z²/2) dz
this integration we can solve by completing the square, since tz − z²/2 = −(z − t)²/2 + t²/2:
M_Z(t) = e^(t²/2) (1/√(2π)) ∫ e^(−(z−t)²/2) dz = e^(t²/2)
since the value of the remaining integration is 1. Thus the moment generating function of the standard normal variate is
M_Z(t) = e^(t²/2)
from this we can find the moment generating function of any general normal random variable X with mean μ and variance σ² by using the relation X = μ + σZ, so
M_X(t) = E[e^(t(μ + σZ))] = e^(μt) M_Z(σt)
thus
M_X(t) = exp{μt + σ²t²/2}
so differentiation gives us
M′(t) = (μ + σ²t) exp{μt + σ²t²/2} and M″(t) = ((μ + σ²t)² + σ²) exp{μt + σ²t²/2}
thus
E[X] = M′(0) = μ and E[X²] = M″(0) = μ² + σ²
so the variance will be
Var(X) = E[X²] − (E[X])² = σ²
Moment generating function of Sum of random variables
The moment generating function of a sum of random variables has the important property that it equals the product of the moment generating functions of the respective independent random variables; that is, for independent random variables X and Y, the moment generating function of the sum X + Y is
M_{X+Y}(t) = E[e^(t(X+Y))] = E[e^(tX) e^(tY)] = E[e^(tX)]E[e^(tY)] = M_X(t) M_Y(t)
here the expectation of the product factors because e^(tX) and e^(tY) are functions of the independent random variables X and Y, by the property of mathematical expectation. In what follows we will find the moment generating functions of sums for different distributions.
Sum of Binomial random variables
If the random variables X and Y are binomially distributed with parameters (n,p) and (m,p) respectively, then the moment generating function of their sum X + Y will be
M_{X+Y}(t) = M_X(t)M_Y(t) = (pe^t + 1 − p)^n (pe^t + 1 − p)^m = (pe^t + 1 − p)^(n+m)
which is the moment generating function of a binomial distribution; the parameters for the sum are (n + m, p).
Sum of Poisson random variables
The distribution of the sum of independent random variables X and Y, which are Poisson distributed with respective means λ1 and λ2, we can find as follows: since
M_X(t) = exp{λ1(e^t − 1)} and M_Y(t) = exp{λ2(e^t − 1)}
then for the sum of the random variables X + Y, with parameter λ1 + λ2,
the moment generating function will be
M_{X+Y}(t) = M_X(t)M_Y(t) = exp{(λ1 + λ2)(e^t − 1)}
which is the moment generating function of a Poisson random variable whose mean (and variance) is the sum λ1 + λ2.
Sum of random number of random variables
To find the moment generating function of the sum of a random number of random variables, let us assume the random variable
Y = Σ from i=1 to N of Xi
where the random variables X1, X2, … are a sequence of independent and identically distributed random variables of any type, and N is a non-negative integer-valued random variable independent of them; then, since
E[e^(tY)|N = n] = (M_X(t))^n
the moment generating function will be
M_Y(t) = E[(M_X(t))^N]
which gives the moment generating function of Y on differentiation as
M_Y′(t) = E[N(M_X(t))^(N−1) M_X′(t)]
hence
E[Y] = M_Y′(0) = E[N]E[X]
in a similar way, differentiating two times will give
M_Y″(t) = E[N(N − 1)(M_X(t))^(N−2)(M_X′(t))² + N(M_X(t))^(N−1) M_X″(t)]
which gives
E[Y²] = M_Y″(0) = E[N(N − 1)](E[X])² + E[N]E[X²]
thus the variance will be
Var(Y) = E[N]Var(X) + (E[X])² Var(N)
Example of Chi-square random variable
Calculate the moment generating function of the chi-squared random variable with n degrees of freedom.
Solution: Consider the chi-squared random variable with n degrees of freedom as
X = Z1² + Z2² + … + Zn²
the sum of squares of n independent standard normal variables; then, by independence, the moment generating function will be
M(t) = E[e^(tX)] = (E[e^(tZ²)])^n
so it remains to evaluate, for a standard normal Z,
E[e^(tZ²)] = (1/√(2π)) ∫ e^(tz²) e^(−z²/2) dz = (1/√(2π)) ∫ e^(−z²/(2σ²)) dz = σ = (1 − 2t)^(−1/2), where σ² = (1 − 2t)^(−1) and t < 1/2
because the normal density with mean 0 and variance σ² integrates to 1. Hence
M(t) = (1 − 2t)^(−n/2)
which is the required moment generating function for n degrees of freedom.
Example of Uniform random variable
Find the moment generating function of the random variable X which is binomially distributed with parameters n and p, where p is itself the value of a uniform random variable Y on the interval (0,1); that is, the conditional distribution of X given Y = p is binomial with parameters n and p.
Solution: To find the moment generating function of the random variable X we condition on Y:
E[e^(tX)|Y = p] = (pe^t + 1 − p)^n
using the moment generating function of the binomial distribution; since Y is the uniform random variable on the interval (0,1), integrating over its density gives
M_X(t) = ∫ from 0 to 1 of (pe^t + 1 − p)^n dp = (e^(t(n+1)) − 1)/((n + 1)(e^t − 1))
Joint moment generating function
The joint moment generating function for n random variables X1, X2, …, Xn is
M(t1, t2, …, tn) = E[e^(t1X1 + t2X2 + … + tnXn)]
where t1, t2, …, tn are real numbers; from the joint moment generating function we can find the individual moment generating functions as
M_{Xi}(t) = E[e^(tXi)] = M(0, …, 0, t, 0, …, 0), with t in the i-th place
Theorem: The random variables X1, X2, …, Xn are independent if and only if the joint moment generating function factors as
M(t1, t2, …, tn) = M_{X1}(t1) M_{X2}(t2) … M_{Xn}(tn)
Proof: Let us assume that the given random variables X1, X2, …, Xn are independent; then
M(t1, …, tn) = E[e^(t1X1) … e^(tnXn)] = E[e^(t1X1)] … E[e^(tnXn)] = M_{X1}(t1) … M_{Xn}(tn)
Now assume that the joint moment generating function satisfies this factorization equation; to prove that the random variables X1, X2, …, Xn are independent, we use the result that the joint moment generating function uniquely determines the joint distribution (this is another important result, which requires proof). Since the factorized function is also the joint moment generating function of independent random variables with the same marginal distributions, uniqueness shows that the joint distribution must be that of independent random variables; hence the necessary and sufficient condition is proved.
Example of Joint Moment generating function
1. Calculate the joint moment generating function of the random variables X + Y and X − Y, where X and Y are independent normal random variables with means μx and μy and the same variance σ² (the normality assumption is what makes the conclusion below possible).
Solution: The joint moment generating function of the sum X + Y and the difference X − Y is
E[e^(t1(X+Y) + t2(X−Y))] = E[e^((t1+t2)X)]E[e^((t1−t2)Y)] = exp{(t1 + t2)μx + σ²(t1 + t2)²/2} exp{(t1 − t2)μy + σ²(t1 − t2)²/2} = exp{t1(μx + μy) + σ²t1²} exp{t2(μx − μy) + σ²t2²}
since the cross terms in t1t2 cancel; as this joint moment generating function factors into a function of t1 alone times a function of t2 alone, and the joint moment generating function determines the joint distribution, from this we can conclude that X + Y and X − Y are independent random variables.
2. Consider an experiment in which events are either counted or uncounted: the total number of events X is Poisson distributed with mean λ, and each event is counted with probability p, independently of the others. Show that the number of counted and the number of uncounted events are independent Poisson random variables with respective means λp and λ(1 − p).
Solution: We will consider X as the number of events and Xc as the number of counted events, so the number of uncounted events is X − Xc; the joint moment generating function to compute is
E[e^(sXc + t(X−Xc))]
conditioning on X and using the moment generating function of the binomial distribution (given X = n, Xc is binomial with parameters n and p),
E[e^(sXc + t(X−Xc))|X = n] = e^(tn) E[e^((s−t)Xc)|X = n] = e^(tn)(pe^(s−t) + 1 − p)^n = (pe^s + (1 − p)e^t)^n
and taking the expectation of this over the Poisson random variable X will give
E[e^(sXc + t(X−Xc))] = exp{λ(pe^s + (1 − p)e^t − 1)} = exp{λp(e^s − 1)} exp{λ(1 − p)(e^t − 1)}
which factors into the moment generating functions of Poisson random variables with means λp and λ(1 − p); hence the counted and uncounted numbers of events are independent with these respective means.
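A simulation of this thinning procedure supports the conclusion; in the sketch below λ = 5 and p = 0.3 are illustrative values, and the sample covariance of the counted and uncounted totals should be close to zero.

```python
import numpy as np

# Monte Carlo check of Poisson thinning: with X ~ Poisson(lam) events,
# each counted independently with probability p, the counted and
# uncounted totals should be independent Poissons with means lam*p
# and lam*(1-p).
rng = np.random.default_rng(0)
lam, p, trials = 5.0, 0.3, 500_000
x = rng.poisson(lam, size=trials)
counted = rng.binomial(x, p)       # X_c given X is Binomial(X, p)
uncounted = x - counted

print(counted.mean(), uncounted.mean())  # ≈ 1.5 and ≈ 3.5
print(np.cov(counted, uncounted)[0, 1])  # ≈ 0 (uncorrelated)
```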
Conclusion:
By using the standard definition of the moment generating function, the moments of different distributions such as the binomial, Poisson, normal, and exponential were discussed, and the moment generating functions for sums of these random variables, whether discrete or continuous, as well as the joint moment generating function, were obtained with suitable examples. If you require further reading, go through the books below.
COVARIANCE, VARIANCE OF SUMS, AND CORRELATIONS OF RANDOM VARIABLES
The statistical parameters of random variables of different natures are easy to obtain and understand using the definition of the expectation of a random variable; in the following we will find some of these parameters with the help of the mathematical expectation of random variables.
Moments of the number of events that occur
So far we know that the expectation of different powers of a random variable gives the moments of the random variable, and we know how to find the expectation of the number of events that occur; now we are interested in the expectation of the number of pairs of events that occur. If X represents the number of events that occur, then for the events A1, A2, …, An define the indicator variable Ii as
Ii = 1 if Ai occurs, and Ii = 0 otherwise
the expectation of X in the discrete sense will be
E[X] = Σ from i=1 to n of P(Ai)
because the random variable X is
X = Σ from i=1 to n of Ii
now to find the expectation of the number of pairs of events that occur we use the combination
C(X,2) = X(X − 1)/2 = Σ over i<j of Ii Ij
this gives the expectation as
E[X(X − 1)]/2 = Σ over i<j of P(Ai Aj)
from this we get the expectation of X squared and the value of the variance also, by
E[X²] = 2Σ over i<j of P(Ai Aj) + E[X] and Var(X) = E[X²] − (E[X])²
By using this discussion we focus different kinds of random variable to find such moments.
Moments of binomial random variables
If p is the probability of success in each of n independent trials, then let Ai denote the event that trial i is a success, so
P(Ai1 Ai2 … Aik) = p^k
and hence the factorial moments are
E[X(X − 1) … (X − k + 1)] = n(n − 1) … (n − k + 1) p^k
this expectation we can obtain successively for values of k greater than 3 as well; let us find it for k = 3:
E[X(X − 1)(X − 2)] = n(n − 1)(n − 2)p³
so, using X(X − 1)(X − 2) = X³ − 3X² + 2X,
E[X³] = n(n − 1)(n − 2)p³ + 3E[X²] − 2E[X] = n(n − 1)(n − 2)p³ + 3n(n − 1)p² + np
using this iteration we can get the higher moments in the same way.
Moments of hypergeometric random variables
The moments of this random variable we will understand with the help of an example: suppose n pens are randomly selected from a box containing N pens, of which m are blue. Let Ai denote the event that the i-th pen selected is blue; then X, the number of blue pens selected, is equal to the number of the events A1, A2, …, An that occur, and because the i-th pen selected is equally likely to be any of the N pens, of which m are blue,
P(Ai) = m/N
and so
E[X] = Σ P(Ai) = nm/N
for pairs, P(Ai Aj) = (m/N)((m − 1)/(N − 1)) for i ≠ j, and this gives
E[X(X − 1)] = 2Σ over i<j of P(Ai Aj) = n(n − 1)m(m − 1)/(N(N − 1))
so the variance of the hypergeometric random variable will be
Var(X) = E[X(X − 1)] + E[X] − (E[X])² = n(n − 1)m(m − 1)/(N(N − 1)) + nm/N − (nm/N)² = (nm/N)(1 − m/N)(N − n)/(N − 1)
in a similar way, for the higher moments,
E[X(X − 1) … (X − k + 1)] = n(n − 1) … (n − k + 1) m(m − 1) … (m − k + 1)/(N(N − 1) … (N − k + 1))
hence the factorial moments all follow the same pattern.
Moments of the negative hypergeometric random variables
Consider the example of a package containing n + m vaccines, of which n are special and m are ordinary; these vaccines are removed one at a time, with each new removal equally likely to be any of the vaccines remaining in the package. Now let the random variable Y denote the number of vaccines that need to be withdrawn until a total of r special vaccines have been removed; Y has a negative hypergeometric distribution, which is related to the hypergeometric distribution in much the same way as the negative binomial is to the binomial. To find the probability mass function, note that the k-th draw gives the r-th special vaccine when the first k − 1 draws give r − 1 special and k − r ordinary vaccines and the k-th draw is special:
P(Y = k) = [C(n, r − 1) C(m, k − r)/C(n + m, k − 1)] · (n − r + 1)/(n + m − k + 1)
now write the random variable Y as
Y = r + X
where X is the number of ordinary vaccines withdrawn before the r-th special one; for the events Ai, i = 1, …, m, where Ai occurs if ordinary vaccine i is withdrawn before the r-th special vaccine,
P(Ai) = r/(n + 1)
as ordinary vaccine i is equally likely to occupy any position relative to the n special vaccines, so
E[X] = Σ from i=1 to m of P(Ai) = mr/(n + 1) and E[Y] = r + mr/(n + 1)
hence to find the variance of Y we must know the variance of X, so, with
P(Ai Aj) = r(r + 1)/((n + 1)(n + 2)), i ≠ j
E[X(X − 1)] = m(m − 1)r(r + 1)/((n + 1)(n + 2))
hence
Var(Y) = Var(X) = m(m − 1)r(r + 1)/((n + 1)(n + 2)) + mr/(n + 1) − (mr/(n + 1))²
COVARIANCE
The relationship between two random variables can be represented by the statistical parameter covariance; before the definition of the covariance of two random variables X and Y, recall that, for two functions g and h of independent random variables X and Y respectively, the expectation satisfies
E[g(X)h(Y)] = E[g(X)]E[h(Y)]
using this relation of expectation we can define covariance as follows:
“The covariance between random variable X and random variable Y, denoted by cov(X,Y), is defined as
cov(X,Y) = E[(X − E[X])(Y − E[Y])]”
using the definition of expectation and expanding, we get
cov(X,Y) = E[XY] − E[X]E[Y]
it is clear that if the random variables X and Y are independent then
cov(X,Y) = E[X]E[Y] − E[X]E[Y] = 0
but the converse is not true; for example, if
P(X = 0) = P(X = 1) = P(X = −1) = 1/3
and we define the random variable Y as
Y = 0 if X ≠ 0, and Y = 1 if X = 0
so
XY = 0 always, hence E[XY] = 0, and also E[X] = 0, giving cov(X,Y) = 0
here clearly X and Y are not independent, but the covariance is zero.
Properties of covariance
The covariance between random variables X and Y has the following properties:
cov(X,Y) = cov(Y,X)
cov(X,X) = var(X)
cov(aX,Y) = a cov(X,Y)
cov(Σi Xi, Σj Yj) = Σi Σj cov(Xi, Yj)
Using the definition of the covariance, the first three properties are immediate, and the fourth property follows by considering
μi = E[Xi] and νj = E[Yj], so that E[Σ Xi] = Σ μi and E[Σ Yj] = Σ νj
now by definition
cov(ΣXi, ΣYj) = E[(ΣXi − Σμi)(ΣYj − Σνj)] = E[Σi Σj (Xi − μi)(Yj − νj)] = Σi Σj cov(Xi, Yj)
Variance of the sums
The important result from these properties is
var(Σ from i=1 to n of Xi) = Σ from i=1 to n of var(Xi) + 2Σ over i<j of cov(Xi, Xj)
as
var(ΣXi) = cov(ΣXi, ΣXj) = Σi Σj cov(Xi, Xj) = Σi var(Xi) + Σ over i≠j of cov(Xi, Xj)
If the Xi's are pairwise independent then
var(Σ from i=1 to n of Xi) = Σ from i=1 to n of var(Xi)
Example: Variance of a binomial random variable
If X is the random variable
X = X1 + X2 + … + Xn
where the Xi are independent Bernoulli random variables such that
Xi = 1 with probability p and Xi = 0 with probability 1 − p
then find the variance of the binomial random variable X with parameters n and p.
Solution:
Since the Xi are independent, the variance of the sum is the sum of the variances:
var(X) = var(X1) + … + var(Xn)
so for a single variable we have
var(Xi) = E[Xi²] − (E[Xi])² = p − p² (since Xi² = Xi, so E[Xi²] = E[Xi] = p)
so the variance is
var(X) = np(1 − p)
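A quick numeric check of var(X) = np(1 − p) by summing Bernoulli indicators, with the illustrative values n = 20 and p = 0.4:

```python
import numpy as np

# Numeric check that Var(X) = n*p*(1-p) when X is a sum of n
# independent Bernoulli(p) indicators.
rng = np.random.default_rng(0)
n, p, trials = 20, 0.4, 200_000
x = rng.binomial(1, p, size=(trials, n)).sum(axis=1)  # sum of Bernoullis
print(x.var())  # ≈ 20 * 0.4 * 0.6 = 4.8
```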
Example
For independent and identically distributed random variables Xi with mean μ and variance σ², define the sample mean and the sample variance (the average squared deviation) as
X̄ = (X1 + … + Xn)/n and S² = Σ(Xi − X̄)²/(n − 1)
then compute var(X̄) and E[S²].
Solution:
By using the above property and the definition we have
var(X̄) = (1/n²) var(ΣXi) = (1/n²) Σ var(Xi) = σ²/n
now for the random variable S², writing Xi − X̄ = (Xi − μ) − (X̄ − μ) and expanding,
(n − 1)S² = Σ(Xi − μ)² − n(X̄ − μ)²
take the expectation of both sides:
(n − 1)E[S²] = nσ² − n var(X̄) = nσ² − σ² = (n − 1)σ², so E[S²] = σ²
Example:
Find the covariance of indicator functions for the events A and B.
Solution:
For the events A and B the indicator functions are
IA = 1 if A occurs, 0 otherwise, and IB = 1 if B occurs, 0 otherwise
so the expectations of these are
E[IA] = P(A), E[IB] = P(B), E[IA IB] = P(AB)
thus the covariance is
cov(IA, IB) = P(AB) − P(A)P(B) = P(B)[P(A|B) − P(A)]
Example:
Show that
cov(X̄, Xi − X̄) = 0
where the Xi are independent random variables with common variance σ² and X̄ is their sample mean.
Solution:
The covariance, using the properties and the definition, will be
cov(X̄, Xi − X̄) = cov(X̄, Xi) − cov(X̄, X̄) = cov(X̄, Xi) − var(X̄) = σ²/n − σ²/n = 0
where we used cov(X̄, Xi) = cov(Σj Xj/n, Xi) = var(Xi)/n = σ²/n by independence, and var(X̄) = σ²/n.
Example:
Calculate the mean and variance of the random variable S, the sum of the n sampled values, when a set of N people each has an opinion about a certain subject measured by a real number v representing the person's "strength of feeling" about the subject. Let vi represent the strength of feeling of person i, which is unknown; to collect information, a sample of n people from the N is taken at random, and these n people are questioned so that their values vi are obtained.
Solution
Let us define the indicator function as
Ii = 1 if person i is in the random sample, 0 otherwise
thus we can express S as
S = Σ from i=1 to N of vi Ii
and its expectation as
E[S] = Σ vi E[Ii] = (n/N) Σ from i=1 to N of vi
this gives the variance as
var(S) = Σ vi² var(Ii) + 2Σ over i<j of vi vj cov(Ii, Ij)
since
E[Ii] = n/N, so var(Ii) = (n/N)(1 − n/N), and E[Ii Ij] = (n/N)((n − 1)/(N − 1))
we have
cov(Ii, Ij) = n(n − 1)/(N(N − 1)) − (n/N)² = −(n/N)(N − n)/(N(N − 1))
we know the identity
(Σ vi)² = Σ vi² + 2Σ over i<j of vi vj
so
var(S) = (n/N)(1 − n/N) Σ vi² − (n(N − n)/(N²(N − 1)))[(Σ vi)² − Σ vi²] = (n(N − n)/(N − 1))(Σ vi²/N − v̄²)
so the mean and variance for the said random variable will be
E[S] = n v̄ and var(S) = (n(N − n)/(N − 1))(Σ vi²/N − v̄²), where v̄ = Σ vi/N
Conclusion:
Covariance measures the relationship between two random variables, and using the covariance the variance of a sum of random variables was obtained for different random variables; the covariance and the various moments were found with the help of the definition of expectation. If you require further reading, go through the books below.
In this article we will discuss the conditional variance and prediction using conditional expectation, for different kinds of random variables, with some examples.
The conditional variance of the random variable X given Y is defined in a similar way to the conditional expectation of X given Y, as
Var(X|Y) = E[(X − E[X|Y])²|Y]
that is, the conditional variance is the conditional expectation of the squared difference between the random variable and its conditional expectation, given the value of Y; equivalently, Var(X|Y) = E[X²|Y] − (E[X|Y])².
This is similar in form to the relation between the unconditional variance and expectation, which was
Var(X) = E[X²] − (E[X])²
and we can find the (unconditional) variance with the help of the conditional variance, using the conditional variance formula
Var(X) = E[Var(X|Y)] + Var(E[X|Y])
Example of conditional variance
Find the mean and variance of the number of travelers who board the bus, if the people arriving at the bus depot follow a Poisson process with rate λ (so the number arriving by time t has mean λt), and the bus arrives at the depot at a time uniformly distributed over the interval (0, T), independently of the arrivals.
Solution:
To find the mean and variance, let Y be the random variable for the time the bus arrives and, for any time t, let N(t) be the number of people who have arrived by time t; then
E[N(Y)|Y = t] = E[N(t)|Y = t]
by the independence of Y and N(t)
=λt
since N(t) is Poisson with mean λt. Hence
E[N(Y)|Y]=λY
so taking expectations gives
E[N(Y)] = λE[Y] = λT/2
To obtain Var(N(Y)), we use the conditional variance formula
thus
Var(N(Y)|Y) = λY
E[N(Y)|Y] = λY
Hence, from the conditional variance formula,
Var(N(Y)) = E[λY] + Var(λY)
= λT/2 + λ²T²/12
where we have used the fact that Var(Y) = T²/12.
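The mean λT/2 and variance λT/2 + λ²T²/12 can be corroborated by simulating the bus example; λ = 2 and T = 3 below are illustrative values.

```python
import numpy as np

# Monte Carlo check of the bus example: Y ~ Uniform(0, T) and, given
# Y = t, N(Y) ~ Poisson(lam * t). Expect mean lam*T/2 = 3 and
# variance lam*T/2 + lam^2*T^2/12 = 6.
rng = np.random.default_rng(0)
lam, T, trials = 2.0, 3.0, 500_000
y = rng.uniform(0, T, size=trials)
n = rng.poisson(lam * y)

print(n.mean())  # ≈ 3
print(n.var())   # ≈ 6
```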
Variance of a sum of a random number of random variables
Consider a sequence of independent and identically distributed random variables X1, X2, X3, … and another random variable N, independent of this sequence; we will find the variance of the sum S = Σ from i=1 to N of Xi as
Var(S) = E[N]Var(X) + (E[X])²Var(N)
using
E[S|N] = N E[X] and Var(S|N) = N Var(X)
which is obvious from the definitions of variance and conditional variance applied to the sum of the sequence; hence, by the conditional variance formula,
Var(S) = E[Var(S|N)] + Var(E[S|N]) = E[N]Var(X) + (E[X])²Var(N)
Prediction
In prediction, the value of one random variable is predicted on the basis of an observation of another random variable. For the prediction of the random variable Y, if the observed random variable is X, we use g(X) as the function which gives the predicted value; obviously we try to choose g(X) close to Y, and for this the best g is g(X) = E[Y|X], since it minimizes the mean square error, by the inequality
E[(Y − g(X))²] ≥ E[(Y − E[Y|X])²]
This inequality we can get as follows:
E[(Y − g(X))²|X] = E[(Y − E[Y|X] + E[Y|X] − g(X))²|X] = E[(Y − E[Y|X])²|X] + E[(E[Y|X] − g(X))²|X] + 2E[(Y − E[Y|X])(E[Y|X] − g(X))|X]
However, given X, E[Y|X] − g(X), being a function of X, can be treated as a constant. Thus the cross term vanishes:
E[(Y − E[Y|X])(E[Y|X] − g(X))|X] = (E[Y|X] − g(X)) E[Y − E[Y|X]|X] = 0
which gives the required inequality, since the middle term is non-negative; taking expectations of both sides completes the argument.
Examples on Prediction
1. It is observed that the height of a person is six feet. What is the prediction of his son's height after he is grown, if the son's height, given that the father's height is x inches, is normally distributed with mean x + 1 and variance 4?
Solution: Let X be the random variable denoting the height (in inches) of the person and Y the random variable for the height of the son; then the random variable Y is
Y = X + e + 1
here e represents a normal random variable, independent of the random variable X, with mean zero and variance four.
So the prediction for the son's height is
E[Y|X = 72] = 72 + E[e] + 1 = 73
so the predicted height of the son is 73 inches after he is grown.
2. Consider the example of sending signals from location A to location B: if the signal value s is sent from A, the value received at B is normally distributed with mean s and variance 1, while the signal S sent from A is itself normally distributed with mean μ and variance σ². How should we predict the signal value S sent from location A when the value r is received at location B?
Solution: The signal values S and R here denote normally distributed random variables; first we find the conditional density function of S given R as
fS|R(s|r) = fS,R(s,r)/fR(r) = fS(s) fR|S(r|s)/fR(r) = K e^(−(s−μ)²/(2σ²)) e^(−(r−s)²/2)
this K does not depend on s; now, combining the exponents and completing the square in s,
−(s − μ)²/(2σ²) − (r − s)²/2 = −((1 + σ²)/(2σ²)) s² + (μ/σ² + r)s + C1 = −((1 + σ²)/(2σ²))(s − (μ + rσ²)/(1 + σ²))² + C2
here also C1 and C2 do not depend on s, so the value of the conditional density function is
fS|R(s|r) = C exp{ −(s − (μ + rσ²)/(1 + σ²))² / (2σ²/(1 + σ²)) }
C is also independent of s. Thus, given that the value received at B is r, the signal S sent from location A is normal with mean and variance
E[S|R = r] = (μ + rσ²)/(1 + σ²) and Var(S|R = r) = σ²/(1 + σ²)
and the mean square error in this situation is
E[(S − E[S|R])²|R = r] = σ²/(1 + σ²)
Linear Predictor
Sometimes we cannot find the joint probability density function, yet the mean, variance, and correlation of the two random variables are known. In such a situation the linear predictor of one random variable with respect to the other is very helpful, since it can be chosen to minimize the mean square error. So, for the linear predictor a + bX of the random variable Y with respect to the random variable X, we take a and b to minimize
E[(Y − (a + bX))²]
Now differentiating partially with respect to a and b, we get
∂/∂a E[(Y − a − bX)²] = −2E[Y − a − bX] = 0
∂/∂b E[(Y − a − bX)²] = −2E[X(Y − a − bX)] = 0
solving these two equations for a and b, we get
b = cov(X,Y)/var(X) = ρ σy/σx and a = E[Y] − bE[X]
thus minimizing this expectation gives the linear predictor as
a + bX = μy + ρ(σy/σx)(X − μx)
where μx and μy are the respective means of the random variables X and Y; the error of the linear predictor is obtained from the expectation of the squared difference,
E[(Y − μy − ρ(σy/σx)(X − μx))²] = σy²(1 − ρ²)
This error will be nearer to zero when the correlation is perfectly positive or perfectly negative, that is, when the correlation coefficient ρ is either +1 or −1.
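The slope cov(X,Y)/var(X), the intercept, and the error σy²(1 − ρ²) can all be estimated from simulated data; the parameter values in the sketch below are illustrative.

```python
import numpy as np

# Sketch of the best linear predictor: generate correlated data, fit
# a and b from the sample moments, and compare the mean square error
# with sigma_y^2 * (1 - rho^2).
rng = np.random.default_rng(0)
mu_x, mu_y, sig_x, sig_y, rho = 0.0, 5.0, 1.0, 2.0, 0.8
cov = [[sig_x**2, rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=200_000).T

b = np.cov(x, y, ddof=0)[0, 1] / x.var()   # slope: Cov(X,Y)/Var(X)
a = y.mean() - b * x.mean()                # intercept
mse = np.mean((y - (a + b * x)) ** 2)

print(b)    # ≈ rho * sig_y / sig_x = 1.6
print(mse)  # ≈ sig_y^2 * (1 - rho^2) = 1.44
```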
Conclusion
The conditional variance for discrete and continuous random variables was discussed with different examples; one of the important applications of conditional expectation, namely prediction, was also explained with suitable examples and with the best linear predictor. If you require further reading, go through the links below.