
Chapter 5: The Normal Distribution and the Central Limit Theorem

The Normal distribution is the familiar bell-shaped distribution. It is probably the most important distribution in statistics, mainly because of its link with the Central Limit Theorem, which states that any large sum of independent, identically distributed random variables is approximately Normal:

X₁ + X₂ + ... + Xₙ ~ approx Normal   if X₁, ..., Xₙ are i.i.d. and n is large.

Before studying the Central Limit Theorem, we look at the Normal distribution and some of its general properties.

5.1 The Normal Distribution

The Normal distribution has two parameters: the mean, μ, and the variance, σ². They satisfy -∞ < μ < ∞ and σ² > 0.

We write X ~ Normal(μ, σ²), or X ~ N(μ, σ²).

Probability density function, f_X(x):

f_X(x) = (1/√(2πσ²)) exp{-(x - μ)²/(2σ²)}   for -∞ < x < ∞.

Distribution function, F_X(x):

There is no closed form for the distribution function of the Normal distribution. If X ~ Normal(μ, σ²), then F_X(x) can only be calculated by computer.

R command: F_X(x) = pnorm(x, mean=μ, sd=sqrt(σ²)).

Mean and variance: For X ~ Normal(μ, σ²), E(X) = μ and Var(X) = σ².

Linear transformations: If X ~ Normal(μ, σ²), then for any constants a and b,

aX + b ~ Normal(aμ + b, a²σ²).

In particular, put a = 1/σ and b = -μ/σ; then

X ~ Normal(μ, σ²)   ⟹   (X - μ)/σ ~ Normal(0, 1).

Z ~ Normal(0, 1) is called the standard Normal random variable.
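For readers working outside R, a minimal sketch of the same calculation: Python's standard library (3.8+) provides statistics.NormalDist, whose cdf method plays the role of pnorm (the x values below are arbitrary illustrations, not from the notes):

```python
from statistics import NormalDist

# F_X(x) for X ~ Normal(mu, sigma^2); mirrors R's pnorm(x, mean=mu, sd=sqrt(sigma^2))
mu, sigma2 = 0.0, 1.0
X = NormalDist(mu=mu, sigma=sigma2 ** 0.5)

print(X.cdf(0.0))    # 0.5, by symmetry of the Normal density about mu
print(X.cdf(1.96))   # approximately 0.975
```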

Proof that aX + b ~ Normal(aμ + b, a²σ²):

Let X ~ Normal(μ, σ²), and let Y = aX + b. We wish to find the distribution of Y. Use the change of variable technique.

1) y(x) = ax + b is monotone, so we can apply the Change of Variable technique.

2) Let y = y(x) = ax + b for -∞ < x < ∞.

3) Then x = x(y) = (y - b)/a for -∞ < y < ∞.

4) |dx/dy| = |1/a| = 1/|a|.

5) So f_Y(y) = f_X(x(y)) |dx/dy| = f_X((y - b)/a) · (1/|a|).   (⋆)

But X ~ Normal(μ, σ²), so f_X(x) = (1/√(2πσ²)) exp{-(x - μ)²/(2σ²)}.

Thus

f_X((y - b)/a) = (1/√(2πσ²)) exp{-((y - b)/a - μ)²/(2σ²)}
              = (1/√(2πσ²)) exp{-(y - (aμ + b))²/(2a²σ²)}.

Returning to (⋆),

f_Y(y) = f_X((y - b)/a) · (1/|a|) = (1/√(2πa²σ²)) exp{-(y - (aμ + b))²/(2a²σ²)}   for -∞ < y < ∞.

But this is the p.d.f. of a Normal(aμ + b, a²σ²) random variable. So, if X ~ Normal(μ, σ²), then aX + b ~ Normal(aμ + b, a²σ²).
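This result is easy to check empirically. A minimal simulation sketch (the constants μ, σ, a, b below are arbitrary choices for illustration): the sample mean and variance of aX + b should match aμ + b and a²σ².

```python
import random
from statistics import fmean, pvariance

random.seed(1)
mu, sigma, a, b = 5.0, 2.0, 3.0, -1.0

# Draw X ~ Normal(mu, sigma^2) and transform: Y = aX + b
y = [a * random.gauss(mu, sigma) + b for _ in range(200_000)]

print(fmean(y))      # close to a*mu + b = 14
print(pvariance(y))  # close to a^2 * sigma^2 = 36
```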

Sums of Normal random variables

If X and Y are independent, with X ~ Normal(μ₁, σ₁²) and Y ~ Normal(μ₂, σ₂²), then

X + Y ~ Normal(μ₁ + μ₂, σ₁² + σ₂²).

More generally, if X₁, X₂, ..., Xₙ are independent, and Xᵢ ~ Normal(μᵢ, σᵢ²) for i = 1, ..., n, then

a₁X₁ + a₂X₂ + ... + aₙXₙ ~ Normal((a₁μ₁ + ... + aₙμₙ), (a₁²σ₁² + ... + aₙ²σₙ²)).

For mathematicians: properties of the Normal distribution

1. Proof that ∫_{-∞}^{∞} f_X(x) dx = 1.

The full proof that

∫_{-∞}^{∞} f_X(x) dx = ∫_{-∞}^{∞} (1/√(2πσ²)) exp{-(x - μ)²/(2σ²)} dx = 1

relies on the following result:

FACT: ∫_{-∞}^{∞} e^{-y²} dy = √π.

This result is non-trivial to prove. See Calculus courses for details.

Using this result, the proof that ∫_{-∞}^{∞} f_X(x) dx = 1 follows by using the change of variable y = (x - μ)/(√2 σ) in the integral.

2. Proof that E(X) = μ.

E(X) = ∫_{-∞}^{∞} x f_X(x) dx = ∫_{-∞}^{∞} x (1/√(2πσ²)) e^{-(x-μ)²/(2σ²)} dx.

Change the variable of integration: let z = (x - μ)/σ; then x = σz + μ and dx/dz = σ. Then

E(X) = ∫_{-∞}^{∞} (σz + μ) · (1/√(2πσ²)) · e^{-z²/2} · σ dz
     = ∫_{-∞}^{∞} (σz/√(2π)) e^{-z²/2} dz + μ ∫_{-∞}^{∞} (1/√(2π)) e^{-z²/2} dz.

The first integrand is an odd function of z (i.e. g(-z) = -g(z)), so it integrates to 0 over the range -∞ to ∞. The second integral is the p.d.f. of N(0, 1), which integrates to 1.

Thus E(X) = 0 + μ × 1 = μ.

3. Proof that Var(X) = σ².

Var(X) = E[(X - μ)²]
       = ∫_{-∞}^{∞} (x - μ)² (1/√(2πσ²)) e^{-(x-μ)²/(2σ²)} dx
       = σ² ∫_{-∞}^{∞} (1/√(2π)) z² e^{-z²/2} dz      (putting z = (x - μ)/σ)
       = σ² { (1/√(2π)) [-z e^{-z²/2}]_{-∞}^{∞} + ∫_{-∞}^{∞} (1/√(2π)) e^{-z²/2} dz }   (integration by parts)
       = σ² {0 + 1}
       = σ².

5.2 The Central Limit Theorem (CLT)

...also known as the Piece of Cake Theorem.

The Central Limit Theorem (CLT) is one of the most fundamental results in statistics. In its simplest form, it states that if a large number of independent random variables are drawn from any distribution, then the distribution of their sum (or alternatively their sample average) always converges to the Normal distribution.

Theorem (The Central Limit Theorem): Let X₁, ..., Xₙ be independent r.v.s with mean μ and variance σ², from ANY distribution. For example, Xᵢ ~ Binomial(n, p) for each i, so μ = np and σ² = np(1 - p).

Then the sum Sₙ = X₁ + ... + Xₙ = Σᵢ₌₁ⁿ Xᵢ has a distribution that tends to Normal as n → ∞.

The mean of the Normal distribution is E(Sₙ) = Σᵢ₌₁ⁿ E(Xᵢ) = nμ.

The variance of the Normal distribution is

Var(Sₙ) = Var(Σᵢ₌₁ⁿ Xᵢ) = Σᵢ₌₁ⁿ Var(Xᵢ)   (because X₁, ..., Xₙ are independent)
        = nσ².

So Sₙ = X₁ + X₂ + ... + Xₙ → Normal(nμ, nσ²) as n → ∞.

Notes:

1. This is a remarkable theorem, because the limit holds for any distribution of X₁, ..., Xₙ.

2. A sufficient condition on X for the Central Limit Theorem to apply is that Var(X) is finite. Other versions of the Central Limit Theorem relax the conditions that X₁, ..., Xₙ are independent and have the same distribution.

3. The speed of convergence of Sₙ to the Normal distribution depends upon the distribution of X. Skewed distributions converge more slowly than symmetric Normal-like distributions. It is usually safe to assume that the Central Limit Theorem applies whenever n ≥ 30. It might apply for n as small as 4.
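Note 3 can be illustrated by simulation. As a sketch (the choice of Exponential(1), a skewed distribution, and of n = 30 are illustrative, not from the notes), we standardize sums of n exponentials and compare a probability with the standard Normal value:

```python
import random
from statistics import NormalDist

random.seed(3)
n, reps = 30, 20_000
mu = sigma = 1.0  # Exponential(1) has mean 1 and standard deviation 1

# Standardized sum: (S_n - n*mu) / sqrt(n * sigma^2)
def z():
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n * mu) / (n ** 0.5 * sigma)

# Compare the empirical P(Z < 1) with the standard Normal probability
frac = sum(1 for _ in range(reps) if z() < 1.0) / reps
print(frac, NormalDist().cdf(1.0))  # both near 0.84
```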

Distribution of the sample mean, X̄, using the CLT

Let X₁, ..., Xₙ be independent, identically distributed with mean E(Xᵢ) = μ and variance Var(Xᵢ) = σ² for all i.

The sample mean, X̄, is defined as:

X̄ = (X₁ + X₂ + ... + Xₙ)/n.

So X̄ = Sₙ/n, where Sₙ = X₁ + ... + Xₙ ~ approx Normal(nμ, nσ²) by the CLT. Because X̄ is a scalar multiple of a Normal r.v. as n grows large, X̄ itself is approximately Normal for large n:

(X₁ + X₂ + ... + Xₙ)/n ~ approx Normal(μ, σ²/n)   as n → ∞.

The following three statements of the Central Limit Theorem are equivalent:

X̄ = (X₁ + X₂ + ... + Xₙ)/n ~ approx Normal(μ, σ²/n)   as n → ∞.

Sₙ = X₁ + X₂ + ... + Xₙ ~ approx Normal(nμ, nσ²)   as n → ∞.

(Sₙ - nμ)/√(nσ²) = (X̄ - μ)/√(σ²/n) ~ approx Normal(0, 1)   as n → ∞.

The essential point to remember about the Central Limit Theorem is that large sums or sample means of independent random variables converge to a Normal distribution, whatever the distribution of the original r.v.s.

More general version of the CLT

A more general form of the CLT states that, if X₁, ..., Xₙ are independent, and E(Xᵢ) = μᵢ, Var(Xᵢ) = σᵢ² (not necessarily all equal), then

Zₙ = Σᵢ₌₁ⁿ (Xᵢ - μᵢ) / √(Σᵢ₌₁ⁿ σᵢ²) → Normal(0, 1)   as n → ∞.

Other versions of the CLT relax the condition that X₁, ..., Xₙ are independent.

The Central Limit Theorem in action: simulation studies

The following simulation study illustrates the Central Limit Theorem, making use of several of the techniques learnt in STATS 210. We will look particularly at how fast the distribution of Sₙ converges to the Normal distribution.

Example 1: Triangular distribution: f_X(x) = 2x for 0 < x < 1.

[Figure: the triangular p.d.f. f(x) = 2x on (0, 1).]

Find E(X) and Var(X):

μ = E(X) = ∫₀¹ x f_X(x) dx
         = ∫₀¹ 2x² dx
         = [2x³/3]₀¹
         = 2/3.

σ² = Var(X) = E(X²) - {E(X)}²
            = ∫₀¹ x² f_X(x) dx - (2/3)²
            = ∫₀¹ 2x³ dx - 4/9
            = [2x⁴/4]₀¹ - 4/9
            = 1/18.

Let Sₙ = X₁ + ... + Xₙ, where X₁, ..., Xₙ are independent. Then

E(Sₙ) = E(X₁ + ... + Xₙ) = nμ = 2n/3,

Var(Sₙ) = Var(X₁ + ... + Xₙ) = nσ²   by independence,

so Var(Sₙ) = n/18.

So Sₙ ~ approx Normal(2n/3, n/18) for large n, by the Central Limit Theorem.
The graph shows histograms of 10,000 values of Sₙ = X₁ + ... + Xₙ for n = 1, 2, 3, and 10, with the Normal(nμ, nσ²) = Normal(2n/3, n/18) p.d.f. superimposed on each. Even for n as low as 10, the Normal curve is a very good approximation.

[Figure: four histogram panels of Sₙ for n = 1, 2, 3, 10, each overlaid with the approximating Normal p.d.f.]

Example 2: U-shaped distribution: f_X(x) = (3/2)x² for -1 < x < 1.

[Figure: the U-shaped p.d.f. on (-1, 1).]

We find that E(X) = μ = 0 and Var(X) = σ² = 3/5. (Exercise)

Let Sₙ = X₁ + ... + Xₙ, where X₁, ..., Xₙ are independent. Then

E(Sₙ) = E(X₁ + ... + Xₙ) = nμ = 0,

Var(Sₙ) = Var(X₁ + ... + Xₙ) = nσ²   by independence,

so Var(Sₙ) = 3n/5.

So Sₙ ~ approx Normal(0, 3n/5) for large n, by the CLT. Even with this highly non-Normal distribution for X, the Normal curve provides a good approximation to Sₙ = X₁ + ... + Xₙ for n as small as 10.

[Figure: four histogram panels of Sₙ for n = 1, 2, 3, 10, each overlaid with the approximating Normal p.d.f.]
Normal approximation to the Binomial distribution, using the CLT

Let Y ~ Binomial(n, p). We can think of Y as the sum of n Bernoulli random variables:

Y = X₁ + X₂ + ... + Xₙ,   where Xᵢ = 1 if trial i is a "success" (prob = p), and 0 otherwise (prob = 1 - p).

So Y = X₁ + ... + Xₙ, and each Xᵢ has μ = E(Xᵢ) = p and σ² = Var(Xᵢ) = p(1 - p).

Thus by the CLT,

Y = X₁ + X₂ + ... + Xₙ → Normal(nμ, nσ²) = Normal(np, np(1 - p)).

Thus,

Bin(n, p) → Normal(np, np(1 - p))   as n → ∞ with p fixed,

where np is the mean of Bin(n, p) and np(1 - p) is its variance. The Binomial distribution is therefore well approximated by the Normal distribution when n is large, for any fixed value of p.

The Normal distribution is also a good approximation to the Poisson(λ) distribution when λ is large:

Poisson(λ) → Normal(λ, λ)   when λ is large.

[Figure: histograms of Binomial(n = 100, p = 0.5) and Poisson(λ = 100), each closely matching a Normal curve.]
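As a sketch, we can compare an exact Binomial(100, 0.5) probability with its Normal approximation. The continuity correction of 0.5 used below is a standard refinement not discussed above; the threshold 55 is an arbitrary illustration:

```python
import math
from statistics import NormalDist

n, p = 100, 0.5

# Exact P(Y <= 55) for Y ~ Binomial(100, 0.5), summed term by term
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(56))

# Normal(np, np(1-p)) approximation, with a continuity correction of 0.5
approx = NormalDist(n * p, math.sqrt(n * p * (1 - p))).cdf(55.5)

print(exact, approx)  # both near 0.864
```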

Why the Piece of Cake Theorem?

• The Central Limit Theorem makes whole realms of statistics into a piece of cake.

• After seeing a theorem this good, you deserve a piece of cake!

5.3 Confidence intervals

Example: Remember the margin of error for an opinion poll? An opinion pollster wishes to estimate the level of support for Labour in an upcoming election. She interviews n people about their voting preferences. Let p be the true, unknown level of support for the Labour party in New Zealand. Let X be the number of the n people interviewed by the opinion pollster who plan to vote Labour. Then X ~ Binomial(n, p).

At the end of Chapter 2, we said that the maximum likelihood estimator for p is p̂ = X/n.

In a large sample (large n), we now know that

X ~ approx Normal(np, npq),   where q = 1 - p.

So p̂ = X/n ~ approx Normal(p, pq/n)   (linear transformation of a Normal r.v.),

so (p̂ - p)/√(pq/n) ~ approx Normal(0, 1).

Now if Z ~ Normal(0, 1), we find (using a computer) that the 95% central probability region of Z is from -1.96 to +1.96:

P(-1.96 < Z < 1.96) = 0.95.

Check in R: pnorm(1.96, mean=0, sd=1) - pnorm(-1.96, mean=0, sd=1)

[Figure: the Normal(0, 1) density, with central area 0.95 between -1.96 and 1.96 and area 0.025 in each tail.]

Putting Z = (p̂ - p)/√(pq/n), we obtain

P(-1.96 < (p̂ - p)/√(pq/n) < 1.96) ≈ 0.95.

Rearranging to put the unknown p in the middle:

P(p̂ - 1.96√(pq/n) < p < p̂ + 1.96√(pq/n)) ≈ 0.95.
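As a sketch of this calculation, the poll numbers below (370 Labour supporters out of n = 1000 respondents) are hypothetical, and the unknown pq is estimated by p̂(1 - p̂), a standard substitution:

```python
from statistics import NormalDist

# Hypothetical poll: 370 of n = 1000 respondents plan to vote Labour
x, n = 370, 1000
p_hat = x / n

z = NormalDist().inv_cdf(0.975)            # approximately 1.96
se = (p_hat * (1 - p_hat) / n) ** 0.5      # estimated sqrt(pq/n)

lo, hi = p_hat - z * se, p_hat + z * se
print(f"95% CI for p: ({lo:.3f}, {hi:.3f})")
```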

Confidence intervals for the Poisson λ parameter

We saw in Section 3.6 that if X₁, ..., Xₙ are independent, identically distributed with Xᵢ ~ Poisson(λ), then the maximum likelihood estimator of λ is

λ̂ = X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ.

Now E(Xᵢ) = μ = λ and Var(Xᵢ) = σ² = λ, for i = 1, ..., n.

Thus, when n is large,

λ̂ = X̄ ~ approx Normal(μ, σ²/n)   by the Central Limit Theorem.

In other words,

λ̂ ~ approx Normal(λ, λ/n)   as n → ∞.

We use the same transformation as before to find approximate 95% confidence intervals for λ as n grows large. Let

Z = (λ̂ - λ)/√(λ/n).

We have Z ~ approx Normal(0, 1) for large n. Thus:

P(-1.96 < (λ̂ - λ)/√(λ/n) < 1.96) ≈ 0.95.

Rearranging to put the unknown λ in the middle:

P(λ̂ - 1.96√(λ/n) < λ < λ̂ + 1.96√(λ/n)) ≈ 0.95.

This gives an approximate 95% confidence interval for λ from λ̂ - 1.96√(λ̂/n) to λ̂ + 1.96√(λ̂/n).
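A sketch of the analogous computation for λ; the count data below (e.g. weekly accident counts) are hypothetical:

```python
from statistics import fmean

# Hypothetical Poisson counts, e.g. weekly accident counts over 20 weeks
counts = [3, 5, 2, 4, 6, 3, 4, 5, 2, 4, 3, 5, 4, 3, 4, 5, 3, 4, 2, 4]
n = len(counts)

lam_hat = fmean(counts)        # MLE of lambda is the sample mean
se = (lam_hat / n) ** 0.5      # since Var(lam_hat) is approximately lambda/n

lo, hi = lam_hat - 1.96 * se, lam_hat + 1.96 * se
print(f"lambda_hat = {lam_hat:.2f}, 95% CI: ({lo:.2f}, {hi:.2f})")
```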

Why is this so good?

It's clear that it's important to measure the precision, or reliability, of an estimate; otherwise the estimate is almost worthless. However, we have already seen various measures of precision: variance, standard error, coefficient of variation, and now confidence intervals. Why do we need so many?

• The true variance of an estimator, e.g. Var(λ̂), is the most convenient quantity to work with mathematically. However, it is on a non-intuitive scale (squared deviation from the mean), and it usually depends upon the unknown parameter, e.g. λ.

• The standard error is se(λ̂) = √(V̂ar(λ̂)). It is an estimate of the square root of the true variance, Var(λ̂). Because of the square root, the standard error is a direct measure of deviation from the mean, rather than squared deviation from the mean. This means it is measured in more intuitive units. However, it is still unclear how we should comprehend the information that the standard error gives us.

• The beauty of the Central Limit Theorem is that it gives us an incredibly easy way of understanding what the standard error is telling us, using Normal-based asymptotic confidence intervals as computed in the previous two examples. Although it is beyond the scope of this course to see why, the Central Limit Theorem guarantees that almost any maximum likelihood estimator will be Normally distributed as long as the sample size n is large enough, subject only to fairly mild conditions. Thus, if we can find an estimate of the variance, e.g. V̂ar(λ̂), we can immediately convert it to an estimated 95% confidence interval using the Normal formulation:

λ̂ - 1.96√(V̂ar(λ̂))  to  λ̂ + 1.96√(V̂ar(λ̂)),

or equivalently,

λ̂ - 1.96 se(λ̂)  to  λ̂ + 1.96 se(λ̂).

The confidence interval has an easily-understood interpretation: on 95% of occasions that we conduct a random experiment and build a confidence interval, the interval will contain the true parameter.

So the Central Limit Theorem has given us an incredibly simple and powerful way of converting from a hard-to-understand measure of precision, se(λ̂), to a measure that is easily understood and relevant to the problem at hand.
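The "95% of occasions" interpretation can itself be checked by simulation: build many intervals from fresh Poisson samples and count how often they contain the true λ. A sketch (the true λ, n, and number of repetitions are arbitrary; the Poisson sampler uses Knuth's algorithm, adequate for small λ):

```python
import math
import random

random.seed(5)
true_lam, n, reps = 4.0, 50, 2_000

def poisson(lam):
    # Knuth's algorithm: multiply uniforms until the product drops below e^(-lambda)
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod < limit:
            return k
        k += 1

covered = 0
for _ in range(reps):
    xs = [poisson(true_lam) for _ in range(n)]
    lam_hat = sum(xs) / n
    half = 1.96 * (lam_hat / n) ** 0.5
    # Does the interval lam_hat +/- 1.96*sqrt(lam_hat/n) contain the true lambda?
    if lam_hat - half < true_lam < lam_hat + half:
        covered += 1

print(covered / reps)  # close to the nominal 0.95
```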

Brilliant!

