
Deriving & Understanding the Variance Formulas

Max H. Farrell

BUS 41100

August 28, 2015

The purpose of this handout is to derive the variance formulas that we discussed in class and show why they take the form they do. In class we went over the formulas at an intuitive level, and discussed which features of the data impacted them and in what ways. But why were those the right formulas? This handout tries to answer that question, which may also shed some light on how the formulas work. One important issue will be the forming of prediction intervals, and the reason we care about $V[e_f]$ versus $V[\hat{Y}_f]$. The main punchline here is this: a prediction interval is a range of likely values for $Y_f$, not $E[Y \mid X = X_f] = \beta_0 + \beta_1 X_f$.

Throughout, we will only look at simple linear regression. The formulas are easier to understand, and the intuition is entirely the same. (The formulas are the same too, just reinterpreting $X$ as vectors or matrices as appropriate.) Also, just like in class we will do everything conditional on $X$. That means we will treat the values $X_1, X_2, \ldots, X_n$ as fixed numbers; they are not random. This doesn't change any of the intuition either.

Contents

1 Notation

2 Week 2: Sampling Distributions, Derivation 1

2.1 Nonconstant variance

3 Week 2: Sampling Distributions, Derivation 2

3.1 Weighted Average Formulas for $b_0$, $b_1$, and $\hat{Y}$

3.2 Variance Derivations for $b_0$ and $b_1$

3.3 Variance Derivations for $\hat{Y}_f$ and Prediction Intervals

3.4 Understanding Prediction Intervals, Which are for $Y_f$

4 Week 4: Variance of $e_j$ and leverage

1 Notation

Remember that the model is

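\[ Y = \beta_0 + \beta_1 X + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2). \]

Note that $V[Y_i \mid X_i] = \sigma^2$.

The sample variances of $X$ and $Y$ are:

\[ s_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \quad \text{and} \quad s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2. \]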

The sample covariance is

\[ s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}) Y_i. \]

The sample correlation is:

\[ r_{xy} = \frac{s_{xy}}{s_x s_y}. \]
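The definitions above can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values and the helper names (`mean`, `sample_cov`, `sample_var`, `sample_corr`) are made up for illustration and are not part of the handout. It also confirms that the two forms of the sample covariance agree.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_cov(x, y):
    """Sample covariance s_xy, using the n - 1 divisor."""
    x_bar, y_bar = mean(x), mean(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (len(x) - 1)

def sample_var(x):
    """Sample variance s_x^2, which is the covariance of x with itself."""
    return sample_cov(x, x)

def sample_corr(x, y):
    """Sample correlation r_xy = s_xy / (s_x * s_y)."""
    return sample_cov(x, y) / math.sqrt(sample_var(x) * sample_var(y))

# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

s_xy = sample_cov(x, y)

# Shortcut form of the covariance: sum of (X_i - X_bar) * Y_i over n - 1.
x_bar = mean(x)
s_xy_short = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / (len(x) - 1)

print("s_x^2 =", sample_var(x))
print("s_xy  =", s_xy, "shortcut:", s_xy_short)
print("r_xy  =", sample_corr(x, y))
```

The shortcut works because the deviations $X_i - \bar{X}$ sum to zero, so subtracting $\bar{Y}$ from each $Y_i$ changes nothing.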

2 Week 2: Sampling Distributions, Derivation 1

In this section I give a fairly simple derivation of the sampling distribution of the slope estimate $b_1$.

Similar derivations apply to the intercept estimate, $b_0$, a forecast, $\hat{Y}_f$, and anything from multiple linear regression.

Recall from week 1 that

\[ b_1 = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)} = \frac{s_{xy}}{s_x^2}. \]

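Using this, and plugging in the definition $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, we get

\begin{align*}
b_1 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X}) Y_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \{\beta_0 + \beta_1 X_i + \varepsilon_i\}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&= \beta_0 \frac{\sum_{i=1}^{n} (X_i - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} + \beta_1 \frac{\sum_{i=1}^{n} (X_i - \bar{X}) X_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} + \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \varepsilon_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \\
&= 0 + \beta_1 + \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \varepsilon_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2}.
\end{align*}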

The last equality comes from these two facts:

\[ \sum_{i=1}^{n} (X_i - \bar{X}) = 0 \qquad \text{and} \qquad \sum_{i=1}^{n} (X_i - \bar{X}) X_i = \sum_{i=1}^{n} (X_i - \bar{X})^2, \]

which you can verify by direct calculation (just like you did on homework zero!).
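As a numerical sanity check, the sketch below verifies the two facts and the resulting decomposition of $b_1$ on one simulated data set. The parameter values, the design points, and the variable names are assumptions made up for this example; only the Python standard library is used.

```python
import random

random.seed(0)

# Hypothetical "true" parameters; in practice these are unknown.
beta0, beta1, sigma = 1.0, 2.0, 0.5

# Fixed design points: we condition on X, so these are not random.
x = [0.5, 1.0, 1.5, 2.5, 4.0, 5.5]

# One draw of the errors and the implied outcomes.
eps = [random.gauss(0.0, sigma) for _ in x]
y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, eps)]

x_bar = sum(x) / len(x)
dev = [xi - x_bar for xi in x]      # deviations X_i - X_bar
sxx = sum(d * d for d in dev)       # sum of squared deviations, (n - 1) * s_x^2

# Fact 1: the deviations sum to zero.
fact1 = sum(dev)
# Fact 2: sum of (X_i - X_bar) * X_i equals the sum of squared deviations.
fact2 = sum(d * xi for d, xi in zip(dev, x))

# Slope computed from the data, and from the decomposition beta1 + noise term.
b1 = sum(d * yi for d, yi in zip(dev, y)) / sxx
b1_decomposed = beta1 + sum(d * ei for d, ei in zip(dev, eps)) / sxx

print("fact 1 (should be ~0):", fact1)
print("fact 2 vs sxx:", fact2, sxx)
print("b1:", b1, "decomposition:", b1_decomposed)
```

The two values of the slope agree up to floating-point rounding, which is exactly what the last line of the derivation says.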

So we have shown that

\[ b_1 = \beta_1 + \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \varepsilon_i}{(n-1) s_x^2}. \]

The first term is just $\beta_1$, which is just some fixed number (even though we don't know it). The second term is Normally distributed, either by the Central Limit Theorem or because the $\varepsilon_i$ are assumed to be Normal. Therefore $b_1$ is also Normally distributed, but what Normal distribution? A Normal distribution is characterized by the mean and variance. The mean is easy: the second term has mean zero, because the $\varepsilon_i$ have mean zero, and so the mean of $b_1$ is just $\beta_1$. To compute the variance, remember that $\beta_1$ is just a number, so it has no variance, and that we are treating the $X_i$ as fixed numbers. Therefore:

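\[ V[b_1] = V\left[ \frac{\sum_{i=1}^{n} (X_i - \bar{X}) \varepsilon_i}{(n-1) s_x^2} \right] = \frac{1}{((n-1) s_x^2)^2} \sum_{i=1}^{n} (X_i - \bar{X})^2 V[\varepsilon_i]. \]

(The second equality also uses that the $\varepsilon_i$ are independent of one another, so the variance of the sum is the sum of the variances.) Now, we use the assumption that the $\varepsilon_i$ have constant variance, $\sigma^2$. Then we can pull it out of the summation, and we get the result from class: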

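\[ V[b_1] = \sigma^2 \frac{1}{((n-1) s_x^2)^2} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sigma^2 \frac{1}{((n-1) s_x^2)^2} (n-1) s_x^2 = \sigma^2 \frac{1}{(n-1) s_x^2}. \]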

We have thus shown that

\[ b_1 \sim N\left( \beta_1, \; \sigma^2 \frac{1}{(n-1) s_x^2} \right). \tag{1} \]

This exactly matches the result from class and it matches Equation (6) below.
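One way to build confidence in this result is a small Monte Carlo experiment: hold the $X_i$ fixed, redraw the errors many times, recompute $b_1$ each time, and compare the spread of those estimates with $\sigma^2 / ((n-1) s_x^2)$. The sketch below does this with the Python standard library. The parameter values, the design, the seed, and the number of replications are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(41100)

# Hypothetical true parameters and a fixed design (we condition on X).
beta0, beta1, sigma = 1.0, 2.0, 1.5
x = [float(i) for i in range(1, 11)]    # X = 1, 2, ..., 10

x_bar = sum(x) / len(x)
dev = [xi - x_bar for xi in x]
sxx = sum(d * d for d in dev)           # equals (n - 1) * s_x^2

def simulate_b1():
    """Draw fresh Normal errors, build Y, and return the slope estimate."""
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    return sum(d * yi for d, yi in zip(dev, y)) / sxx

draws = [simulate_b1() for _ in range(20000)]

theoretical_var = sigma ** 2 / sxx      # variance formula derived above
simulated_var = statistics.variance(draws)
simulated_mean = statistics.mean(draws)

print("mean of b1 draws:", simulated_mean, "(beta1 =", beta1, ")")
print("variance of b1 draws:", simulated_var)
print("sigma^2 / ((n-1) s_x^2):", theoretical_var)
```

The simulated mean should sit close to $\beta_1$ and the simulated variance close to the formula, with the gap shrinking as the number of replications grows.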

2.1 Nonconstant variance

To see what happens when the variance is not constant, return to the penultimate step above:

\[ V[b_1] = \frac{1}{((n-1) s_x^2)^2} \sum_{i=1}^{n} (X_i - \bar{X})^2 V[\varepsilon_i]. \]

Suppose every $\varepsilon_i$ can have a different variance; call it $\sigma_i^2$. Then we can't pull anything out of the summation! We just get:

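\[ V[b_1] = \frac{1}{((n-1) s_x^2)^2} \sum_{i=1}^{n} (X_i - \bar{X})^2 \sigma_i^2, \]

and therefore

\[ b_1 \sim N\left( \beta_1, \; \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sigma_i^2}{((n-1) s_x^2)^2} \right). \]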

To deal with this, we either need to do a variance-stabilizing transformation or do heteroskedasticity-robust inference, both of which we discuss in Week 4.
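To see how much the nonconstant-variance formula can differ from the constant-variance one, the sketch below simulates errors whose standard deviation grows with $X$. The specific choice $\sigma_i = 0.3 X_i$, the parameter values, and the "naive" comparison that plugs the average variance into the constant-variance formula are all assumptions made for this illustration, not something taken from the handout.

```python
import random
import statistics

random.seed(4)

# Hypothetical true parameters and a fixed design (we condition on X).
beta0, beta1 = 1.0, 2.0
x = [float(i) for i in range(1, 11)]

# Assumed heteroskedasticity: error standard deviation proportional to X.
sigma_i = [0.3 * xi for xi in x]

x_bar = sum(x) / len(x)
dev = [xi - x_bar for xi in x]
sxx = sum(d * d for d in dev)           # equals (n - 1) * s_x^2

# Nonconstant-variance formula: weighted sum of the sigma_i^2.
het_var = sum(d * d * s * s for d, s in zip(dev, sigma_i)) / sxx ** 2

# Naive alternative: constant-variance formula with the average variance.
avg_sigma2 = sum(s * s for s in sigma_i) / len(sigma_i)
naive_var = avg_sigma2 / sxx

def simulate_b1():
    """Draw heteroskedastic Normal errors and return the slope estimate."""
    y = [beta0 + beta1 * xi + random.gauss(0.0, s) for xi, s in zip(x, sigma_i)]
    return sum(d * yi for d, yi in zip(dev, y)) / sxx

draws = [simulate_b1() for _ in range(20000)]
simulated_var = statistics.variance(draws)

print("simulated variance of b1:", simulated_var)
print("nonconstant-variance formula:", het_var)
print("naive constant-variance formula:", naive_var)
```

In this setup the noisiest observations are also among those farthest from $\bar{X}$, so they get large weights $(X_i - \bar{X})^2$ and the naive formula understates the true variance of $b_1$.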

3 Week 2: Sampling Distributions, Derivation 2

Here I carefully derive all the variance formulas for $b_0$, $b_1$, and $\hat{Y}$ using a weighted-average representation that will prove very useful. This representation is the first thing introduced, in the next subsection.

3.1 Weighted Average Formulas for $b_0$, $b_1$, and $\hat{Y}$

We will show that $b_0$, $b_1$, and $\hat{Y}$ are all just weighted averages of the outcomes $Y_i$. This kind of makes sense in the following way: in regression our aim is to extract a "general, on-average" trend for $Y$ given $X$. Even more precisely, we are estimating the conditional expectation, and since expectations are just averages, it makes sense that our estimators are just averages. This is more than just a nice coincidence; it's an important idea in terms of the type of estimation we're doing and how we get the results we do.

Begin with the slope coefficient, $b_1$. Recall from class that $b_1 = r_{xy} s_y / s_x$. Let's write out exactly what that means, and re-write the formula: