
Multivariate Analysis Homework 1

A49109720 Yi-Chen Zhang

March 16, 2018

4.2. Consider a bivariate normal population with $\mu_1 = 0$, $\mu_2 = 2$, $\sigma_{11} = 2$, $\sigma_{22} = 1$, and $\rho_{12} = 0.5$.

(a) Write out the bivariate normal density.

(b) Write out the squared generalized distance expression $(x-\mu)^T\Sigma^{-1}(x-\mu)$ as a function of $x_1$ and $x_2$.

(c) Determine (and sketch) the constant-density contour that contains 50% of the probability.

Sol. (a) The multivariate normal density is defined by the following equation:

$$f(x) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right\}.$$

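In this question, we have $p = 2$, $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, $\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$, $\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}$, and $\rho_{12} = \dfrac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}}$, so that $\sigma_{12} = 0.5\sqrt{2}\sqrt{1} = \frac{\sqrt{2}}{2}$. Note that

$$\mu = \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & 1 \end{pmatrix}, \quad |\Sigma| = 2 - \left(\frac{\sqrt{2}}{2}\right)^2 = \frac{3}{2}, \quad |\Sigma|^{1/2} = \sqrt{\frac{3}{2}}, \quad \Sigma^{-1} = \frac{2}{3}\begin{pmatrix} 1 & -\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & 2 \end{pmatrix}.$$

So the bivariate normal density is

$$f(x) = \frac{1}{(2\pi)^{2/2}\sqrt{\frac{3}{2}}} \exp\left\{-\frac{1}{2}\begin{pmatrix} x_1 & x_2 - 2 \end{pmatrix} \frac{2}{3}\begin{pmatrix} 1 & -\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 - 2 \end{pmatrix}\right\} = \frac{1}{\sqrt{6}\,\pi} \exp\left\{-\frac{1}{3}\left[x_1^2 - \sqrt{2}\,x_1(x_2-2) + 2(x_2-2)^2\right]\right\}.$$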

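(b) The squared generalized distance is

$$(x-\mu)^T\Sigma^{-1}(x-\mu) = \begin{pmatrix} x_1 & x_2 - 2 \end{pmatrix} \frac{2}{3}\begin{pmatrix} 1 & -\frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 - 2 \end{pmatrix} = \frac{2}{3}\left[x_1^2 - \sqrt{2}\,x_1(x_2-2) + 2(x_2-2)^2\right].$$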

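(c) For $\alpha = 0.5$, the solid ellipsoid of points $(x_1, x_2)$ satisfying $(x-\mu)^T\Sigma^{-1}(x-\mu) \le \chi^2_{p,\alpha} = c^2$ has probability 50%. From the quantile function in R we have $\chi^2_{2,0.5}$ = qchisq(0.5,df=2) = 1.3863, therefore $c = \sqrt{1.3863} = 1.1774$. The eigenvalues of $\Sigma$ are $(\lambda_1, \lambda_2) = (2.3660, 0.6340)$ with eigenvectors

$$\begin{pmatrix} e_1 & e_2 \end{pmatrix} = \begin{pmatrix} 0.8881 & -0.4597 \\ 0.4597 & 0.8881 \end{pmatrix}.$$

Therefore, the half-lengths of the axes are $c\sqrt{\lambda_1} = 1.8111$ and $c\sqrt{\lambda_2} = 0.9375$, in the directions $e_1$ and $e_2$ from the center $\mu = (0, 2)^T$. The contour is plotted in Figure 1.

[Figure 1: Contour that contains 50% of the probability (axes: $x_1$, $x_2$)]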
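The numbers in parts (a) to (c) can be checked with a few lines of base R. This is only a minimal sketch and not the appendix code; the names `rho`, `s12`, `dist2`, `c_val`, and `half_axes` are illustrative choices that do not appear in the original solution.

```r
# Check of Problem 4.2 using base R only (all variable names are illustrative).
mu    <- c(0, 2)
rho   <- 0.5
s12   <- rho * sqrt(2) * sqrt(1)      # sigma_12 = rho_12 * sqrt(sigma_11) * sqrt(sigma_22)
Sigma <- matrix(c(2, s12, s12, 1), nrow = 2)

det_Sigma <- det(Sigma)               # determinant used in part (a)
Sigma_inv <- solve(Sigma)             # inverse used in parts (a) and (b)

# Squared generalized distance in the closed form derived in part (b)
dist2 <- function(x1, x2) {
  (2 / 3) * (x1^2 - sqrt(2) * x1 * (x2 - 2) + 2 * (x2 - 2)^2)
}

# Part (c): half-lengths of the axes of the 50% contour, c * sqrt(lambda_i)
c_val     <- sqrt(qchisq(0.5, df = 2))
lambda    <- eigen(Sigma)$values      # eigenvalues in decreasing order
half_axes <- c_val * sqrt(lambda)

print(round(c(det = det_Sigma, c = c_val, lambda = lambda, axes = half_axes), 4))
```

Comparing `dist2()` with the matrix form `t(x - mu) %*% Sigma_inv %*% (x - mu)` at any test point is a quick way to confirm the algebra in part (b).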

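4.4. Let $X$ be $N_3(\mu, \Sigma)$ with $\mu^T = (2, -3, 1)$ and

$$\Sigma = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 3 & 2 \\ 1 & 2 & 2 \end{pmatrix}.$$

(a) Find the distribution of $3X_1 - 2X_2 + X_3$.

(b) Relabel the variables if necessary, and find a $2 \times 1$ vector $a$ such that $X_2$ and $X_2 - a^T\begin{pmatrix} X_1 \\ X_3 \end{pmatrix}$ are independent.

Sol. (a) Let $a = (3, -2, 1)^T$, then $a^TX = 3X_1 - 2X_2 + X_3$. Therefore,

$$a^TX \sim N(a^T\mu, a^T\Sigma a),$$

where

$$a^T\mu = \begin{pmatrix} 3 & -2 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix} = 13 \quad \text{and} \quad a^T\Sigma a = \begin{pmatrix} 3 & -2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 3 & 2 \\ 1 & 2 & 2 \end{pmatrix}\begin{pmatrix} 3 \\ -2 \\ 1 \end{pmatrix} = 9.$$

The distribution of $3X_1 - 2X_2 + X_3$ is the univariate normal $N(13, 9)$.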

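(b) Let $a = (a_1, a_2)^T$, then $Y = X_2 - a^T\begin{pmatrix} X_1 \\ X_3 \end{pmatrix} = -a_1X_1 + X_2 - a_2X_3$.

Now, let $A = \begin{pmatrix} 0 & 1 & 0 \\ -a_1 & 1 & -a_2 \end{pmatrix}$, then $AX = \begin{pmatrix} X_2 \\ Y \end{pmatrix} \sim N(A\mu, A\Sigma A^T)$, where

$$A\Sigma A^T = \begin{pmatrix} 0 & 1 & 0 \\ -a_1 & 1 & -a_2 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 3 & 2 \\ 1 & 2 & 2 \end{pmatrix}\begin{pmatrix} 0 & -a_1 \\ 1 & 1 \\ 0 & -a_2 \end{pmatrix} = \begin{pmatrix} 3 & -a_1 - 2a_2 + 3 \\ -a_1 - 2a_2 + 3 & a_1^2 - 2a_1 - 4a_2 + 2a_1a_2 + 2a_2^2 + 3 \end{pmatrix}.$$

Since $X_2$ and $Y$ are jointly normal, they are independent exactly when their covariance is zero, which requires $-a_1 - 2a_2 + 3 = 0$.

So we have the vector

$$a = \begin{pmatrix} 3 \\ 0 \end{pmatrix} + c\begin{pmatrix} -2 \\ 1 \end{pmatrix}, \quad \text{for } c \in \mathbb{R}.$$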
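As a numerical cross-check of both parts, the following base R sketch recomputes the mean and variance in (a) and evaluates the covariance between $X_2$ and $Y$ for a few members of the solution family in (b). The names `a_lin`, `cov_X2_Y`, and `c_par` are hypothetical helpers introduced only for this illustration.

```r
# Check of Problem 4.4 in base R (names are illustrative, not from the solution).
mu    <- c(2, -3, 1)
Sigma <- matrix(c(1, 1, 1,
                  1, 3, 2,
                  1, 2, 2), nrow = 3, byrow = TRUE)

# Part (a): mean and variance of 3*X1 - 2*X2 + X3
a_lin    <- c(3, -2, 1)
mean_lin <- sum(a_lin * mu)
var_lin  <- drop(t(a_lin) %*% Sigma %*% a_lin)

# Part (b): Cov(X2, Y) with Y = X2 - a1*X1 - a2*X3 and a = (3, 0) + c_par * (-2, 1)
cov_X2_Y <- function(c_par) {
  a <- c(3, 0) + c_par * c(-2, 1)
  A <- rbind(c(0, 1, 0), c(-a[1], 1, -a[2]))
  (A %*% Sigma %*% t(A))[1, 2]
}

print(c(mean = mean_lin, var = var_lin))
print(sapply(c(-1, 0, 2.5), cov_X2_Y))   # covariance for three choices of c_par
```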

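4.6. Let $X$ be distributed as $N_3(\mu, \Sigma)$, where $\mu^T = (1, -1, 2)$ and

$$\Sigma = \begin{pmatrix} 4 & 0 & -1 \\ 0 & 5 & 0 \\ -1 & 0 & 2 \end{pmatrix}.$$

Which of the following random variables are independent? Explain.

(a) $X_1$ and $X_2$

(b) $X_1$ and $X_3$

(c) $X_2$ and $X_3$

(d) $(X_1, X_3)$ and $X_2$

(e) $X_1$ and $X_1 + 3X_2 - 2X_3$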

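Sol. Because $X$ is multivariate normal, components are independent exactly when their covariance is zero.

(a) $\sigma_{12} = \sigma_{21} = 0$, so $X_1$ and $X_2$ are independent.

(b) $\sigma_{13} = \sigma_{31} = -1$, so $X_1$ and $X_3$ are not independent.

(c) $\sigma_{23} = \sigma_{32} = 0$, so $X_2$ and $X_3$ are independent.

(d) We rearrange the variables in the order $(X_1, X_3, X_2)$ and partition the covariance matrix. The new covariance matrix is

$$\Sigma = \left(\begin{array}{cc|c} 4 & -1 & 0 \\ -1 & 2 & 0 \\ \hline 0 & 0 & 5 \end{array}\right).$$

The off-diagonal block is zero, so $(X_1, X_3)$ and $X_2$ are independent.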

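(e) Let $A = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 3 & -2 \end{pmatrix}$, then $AX = \begin{pmatrix} X_1 \\ X_1 + 3X_2 - 2X_3 \end{pmatrix}$ and $AX \sim N(A\mu, A\Sigma A^T)$, where

$$A\Sigma A^T = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 3 & -2 \end{pmatrix}\begin{pmatrix} 4 & 0 & -1 \\ 0 & 5 & 0 \\ -1 & 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 3 \\ 0 & -2 \end{pmatrix} = \begin{pmatrix} 4 & 6 \\ 6 & 61 \end{pmatrix}.$$

The covariance is $6 \ne 0$, so $X_1$ and $X_1 + 3X_2 - 2X_3$ are not independent.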
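The covariance entries used in (a) to (e) can be read off or recomputed in base R. The sketch below is illustrative only; `cov_12`, `cov_13_with_2`, and `ASA` are names chosen here and are not part of the original solution.

```r
# Check of Problem 4.6 in base R (names are illustrative).
Sigma <- matrix(c( 4, 0, -1,
                   0, 5,  0,
                  -1, 0,  2), nrow = 3, byrow = TRUE)

# Parts (a)-(d): under joint normality, zero covariance is equivalent to independence
cov_12        <- Sigma[1, 2]
cov_13        <- Sigma[1, 3]
cov_23        <- Sigma[2, 3]
cov_13_with_2 <- Sigma[c(1, 3), 2]     # covariances of (X1, X3) with X2

# Part (e): covariance matrix of (X1, X1 + 3*X2 - 2*X3)
A   <- rbind(c(1, 0, 0), c(1, 3, -2))
ASA <- A %*% Sigma %*% t(A)
print(ASA)
```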

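4.7. Refer to Exercise 4.6 and specify each of the following.

(a) The conditional distribution of $X_1$, given that $X_3 = x_3$.

(b) The conditional distribution of $X_1$, given that $X_2 = x_2$ and $X_3 = x_3$.

Sol. We use Result 4.6 from the textbook. Let $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N(\mu, \Sigma)$ with $\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$ and $|\Sigma_{22}| > 0$. Then

$$X_1 \mid X_2 = x_2 \sim N\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right).$$

(a)

$$X_1 \mid X_3 = x_3 \sim N\left(1 + (-1)(2)^{-1}(x_3 - 2),\; 4 - (-1)(2)^{-1}(-1)\right) \;\Rightarrow\; X_1 \mid X_3 = x_3 \sim N\left(-\tfrac{1}{2}x_3 + 2,\; \tfrac{7}{2}\right).$$

(b)

$$X_1 \mid X_2 = x_2, X_3 = x_3 \sim N\left(1 + \begin{pmatrix} 0 & -1 \end{pmatrix}\begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}^{-1}\begin{pmatrix} x_2 - (-1) \\ x_3 - 2 \end{pmatrix},\; 4 - \begin{pmatrix} 0 & -1 \end{pmatrix}\begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}^{-1}\begin{pmatrix} 0 \\ -1 \end{pmatrix}\right) \;\Rightarrow\; X_1 \mid X_2 = x_2, X_3 = x_3 \sim N\left(-\tfrac{1}{2}x_3 + 2,\; \tfrac{7}{2}\right).$$

The two answers coincide because $X_2$ is independent of $(X_1, X_3)$, so conditioning on $X_2$ adds no information about $X_1$.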
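The conditional mean and variance formula from Result 4.6 can be wrapped in a small helper and evaluated at sample values. This is a sketch under stated assumptions: `cond_normal` is a hypothetical function written for this illustration, and the conditioning values $x_2 = 0$ and $x_3 = 4$ are arbitrary test points, not values from the exercise.

```r
# Conditional distribution of X[i1] given X[i2] = x2 for a multivariate normal.
# cond_normal is a hypothetical helper; it assumes i1 is a single index.
cond_normal <- function(mu, Sigma, i1, i2, x2) {
  S12     <- Sigma[i1, i2, drop = FALSE]
  S22_inv <- solve(Sigma[i2, i2, drop = FALSE])
  list(
    mean = drop(mu[i1] + S12 %*% S22_inv %*% (x2 - mu[i2])),
    var  = drop(Sigma[i1, i1, drop = FALSE] - S12 %*% S22_inv %*% t(S12))
  )
}

mu    <- c(1, -1, 2)
Sigma <- matrix(c( 4, 0, -1,
                   0, 5,  0,
                  -1, 0,  2), nrow = 3, byrow = TRUE)

# Arbitrary test values: x3 = 4 in (a); x2 = 0 and x3 = 4 in (b)
part_a <- cond_normal(mu, Sigma, i1 = 1, i2 = 3,       x2 = 4)
part_b <- cond_normal(mu, Sigma, i1 = 1, i2 = c(2, 3), x2 = c(0, 4))

print(unlist(part_a))
print(unlist(part_b))
```

With $x_3 = 4$ the closed form $-\tfrac{1}{2}x_3 + 2$ gives a conditional mean of 0, which both calls should reproduce together with the variance $\tfrac{7}{2}$.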

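4.16. Let $X_1$, $X_2$, $X_3$, and $X_4$ be independent $N_p(\mu, \Sigma)$ random vectors.

(a) Find the marginal distributions for each of the random vectors

$$V_1 = \tfrac{1}{4}X_1 - \tfrac{1}{4}X_2 + \tfrac{1}{4}X_3 - \tfrac{1}{4}X_4 \quad \text{and} \quad V_2 = \tfrac{1}{4}X_1 + \tfrac{1}{4}X_2 - \tfrac{1}{4}X_3 - \tfrac{1}{4}X_4.$$

(b) Find the joint density of the random vectors $V_1$ and $V_2$ defined in (a).

Sol. (a) By Result 4.8 in the textbook, a linear combination $\sum_{i=1}^{n} c_iX_i$ has the distribution

$$N_p\left(\sum_{i=1}^{n} c_i\,\mu,\; \left(\sum_{i=1}^{n} c_i^2\right)\Sigma\right).$$

For both $V_1$ and $V_2$ the coefficients sum to 0 and their squares sum to $\tfrac{1}{4}$, so we have $V_1 \sim N_p(0, \tfrac{1}{4}\Sigma)$ and $V_2 \sim N_p(0, \tfrac{1}{4}\Sigma)$.

(b) Also by Result 4.8, $V_1$ and $V_2$ are jointly multivariate normal with covariance matrix

$$\begin{pmatrix} \left(\sum_{i=1}^{n} c_i^2\right)\Sigma & (b^Tc)\,\Sigma \\ (b^Tc)\,\Sigma & \left(\sum_{j=1}^{n} b_j^2\right)\Sigma \end{pmatrix},$$

with $c = (\tfrac{1}{4}, -\tfrac{1}{4}, \tfrac{1}{4}, -\tfrac{1}{4})^T$ and $b = (\tfrac{1}{4}, \tfrac{1}{4}, -\tfrac{1}{4}, -\tfrac{1}{4})^T$. Since $b^Tc = 0$, the joint distribution of $V_1$ and $V_2$ is

$$\begin{pmatrix} V_1 \\ V_2 \end{pmatrix} \sim N_{2p}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tfrac{1}{4}\Sigma & 0 \\ 0 & \tfrac{1}{4}\Sigma \end{pmatrix}\right).$$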
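The three scalar quantities that drive Result 4.8 here (the coefficient sums, the sums of squares, and the cross product $b^Tc$) are simple to verify in base R. The vector names `c_vec` and `b_vec` are illustrative and were chosen to avoid masking R's `c()` function.

```r
# Coefficient checks for Problem 4.16 (c_vec and b_vec are illustrative names).
c_vec <- c(1, -1,  1, -1) / 4   # coefficients of V1
b_vec <- c(1,  1, -1, -1) / 4   # coefficients of V2

mean_mult_V1 <- sum(c_vec)          # multiplier of mu in the mean of V1
mean_mult_V2 <- sum(b_vec)          # multiplier of mu in the mean of V2
var_mult_V1  <- sum(c_vec^2)        # multiplier of Sigma in Cov(V1)
var_mult_V2  <- sum(b_vec^2)        # multiplier of Sigma in Cov(V2)
cross_mult   <- sum(b_vec * c_vec)  # multiplier of Sigma in Cov(V1, V2)

print(c(mean_mult_V1, mean_mult_V2, var_mult_V1, var_mult_V2, cross_mult))
```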

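4.18. Find the maximum likelihood estimates of the $2 \times 1$ mean vector $\mu$ and the $2 \times 2$ covariance matrix $\Sigma$ based on the random sample

$$X = \begin{pmatrix} 3 & 6 \\ 4 & 4 \\ 5 & 7 \\ 4 & 7 \end{pmatrix}$$

from a bivariate normal population.

Sol. Since the random samples $X_1$, $X_2$, $X_3$, and $X_4$ are from a normal population, the maximum likelihood estimates of $\mu$ and $\Sigma$ are $\bar{X}$ and $\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})^T$. Therefore,

$$\hat{\mu} = \bar{X} = \begin{pmatrix} 4 \\ 6 \end{pmatrix} \quad \text{and} \quad \hat{\Sigma} = \frac{1}{4}\sum_{i=1}^{4}(X_i - \bar{X})(X_i - \bar{X})^T = \begin{pmatrix} 1/2 & 1/4 \\ 1/4 & 3/2 \end{pmatrix}.$$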
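The estimates can be reproduced in base R. Note that `cov()` divides by $n - 1$, so the maximum likelihood estimate is obtained by rescaling it by $(n-1)/n$ or by dividing the centered cross-product by $n$ directly. This is a minimal sketch; `mu_hat`, `Xc`, and `Sigma_hat` are illustrative names.

```r
# Maximum likelihood estimates for Problem 4.18 (names are illustrative).
X <- matrix(c(3, 6,
              4, 4,
              5, 7,
              4, 7), ncol = 2, byrow = TRUE)
n <- nrow(X)

mu_hat    <- colMeans(X)            # MLE of the mean vector
Xc        <- sweep(X, 2, mu_hat)    # subtract the column means from each row
Sigma_hat <- crossprod(Xc) / n      # MLE of Sigma: divisor n, not n - 1

# Equivalent route through the unbiased sample covariance
Sigma_hat_alt <- (n - 1) / n * cov(X)

print(mu_hat)
print(Sigma_hat)
```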

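4.19. Let $X_1$, $X_2$, ..., $X_{20}$ be a random sample of size $n = 20$ from an $N_6(\mu, \Sigma)$ population. Specify each of the following completely.

(a) The distribution of $(X_1 - \mu)^T\Sigma^{-1}(X_1 - \mu)$

(b) The distributions of $\bar{X}$ and $\sqrt{n}(\bar{X} - \mu)$

(c) The distribution of $(n-1)S$

Sol. (a) $(X_1 - \mu)^T\Sigma^{-1}(X_1 - \mu)$ is distributed as $\chi^2_6$.

(b) $\bar{X}$ is distributed as $N_6(\mu, \tfrac{1}{20}\Sigma)$ and $\sqrt{n}(\bar{X} - \mu)$ is distributed as $N_6(0, \Sigma)$.

(c) $(n-1)S$ has a Wishart distribution: it is distributed as $\sum_{i=1}^{20-1} Z_iZ_i^T$, where the $Z_i$ are independent $N_6(0, \Sigma)$. We write this as $W_6(19, \Sigma)$, i.e., the Wishart distribution with dimensionality 6, degrees of freedom 19, and covariance matrix $\Sigma$.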

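4.21. Let $X_1, \ldots, X_{60}$ be a random sample of size 60 from a four-variate normal distribution having mean $\mu$ and covariance $\Sigma$. Specify each of the following completely.

(a) The distribution of $\bar{X}$

(b) The distribution of $(X_1 - \mu)^T\Sigma^{-1}(X_1 - \mu)$

(c) The distribution of $n(\bar{X} - \mu)^T\Sigma^{-1}(\bar{X} - \mu)$

(d) The approximate distribution of $n(\bar{X} - \mu)^TS^{-1}(\bar{X} - \mu)$

Sol. (a) $\bar{X}$ is distributed as $N_4(\mu, \tfrac{1}{60}\Sigma)$.

(b) $(X_1 - \mu)^T\Sigma^{-1}(X_1 - \mu)$ is distributed as $\chi^2_4$.

(c) $n(\bar{X} - \mu)^T\Sigma^{-1}(\bar{X} - \mu)$ is distributed as $\chi^2_4$.

(d) Since $60 \gg 4$, $n(\bar{X} - \mu)^TS^{-1}(\bar{X} - \mu)$ can be approximated as $\chi^2_4$.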

4.23. Consider the annual rates of return (including dividends) on the Dow-Jones industrial average for the years 1996-2005. These data, multiplied by 100, are

-0.6  3.1  25.3  -16.8  -7.1  -6.2  25.2  22.6  26.0

Use these 10 observations to complete the following.

(a) Construct a Q-Q plot. Do the data seem to be normally distributed? Explain.

(b) Carry out a test of normality based on the correlation coefficient $r_Q$. Let the significance level be $\alpha = 0.1$.

Sol. (a) The Q-Q plot of these data is shown in Figure 2. The sample quantiles appear to be close to the theoretical quantiles. However, Q-Q plots are not particularly informative unless the sample size is moderate to large, for instance $n \ge 20$. There can be quite a bit of variability in the straightness of the Q-Q plot for small samples, even when the observations are known to come from a normal population.

[Figure 2: Normal Q-Q plot (axes: Theoretical Quantiles, Sample Quantiles)]

(b) From (4-31) in the textbook, $r_Q$ is defined by

$$r_Q = \frac{\sum_{j=1}^{n}(x_{(j)} - \bar{x})(q_{(j)} - \bar{q})}{\sqrt{\sum_{j=1}^{n}(x_{(j)} - \bar{x})^2}\,\sqrt{\sum_{j=1}^{n}(q_{(j)} - \bar{q})^2}}.$$

Using the data, we have $r_Q = 0.9351$. The R code for this calculation is given in the Appendix. The nine listed values give $n = 9$. From Table 4.2 in the textbook, the critical point for the test of normality at the 10% level of significance ($\alpha = 0.1$) with $n = 9$ lies between 0.9032 (the tabulated value for $n = 5$) and 0.9351 (the tabulated value for $n = 10$). Since $r_Q = 0.9351$ exceeds the critical point, we do not reject the hypothesis of normality.

4.26. Exercise 1.2 gives the age $x_1$, measured in years, as well as the selling price $x_2$, measured in thousands of dollars, for $n = 10$ used cars. These data are reproduced as follows:

$x_1$:  1      2      3      3      4      5      6     8     9     11
$x_2$:  18.95  19.00  17.95  15.54  14.00  12.95  8.94  7.49  6.00  3.99

(a) Use the results of Exercise 1.2 to calculate the squared statistical distances $(x_j - \bar{x})^TS^{-1}(x_j - \bar{x})$, $j = 1, 2, \ldots, 10$, where $x_j^T = (x_{j1}, x_{j2})$.

(b) Using the distances in Part (a), determine the proportion of the observations falling within the estimated 50% probability contour of a bivariate normal distribution.

(c) Order the distances in Part (a) and construct a chi-square plot.

(d) Given the results in Parts (b) and (c), are these data approximately bivariate normal? Explain.

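Sol. (a) From Exercise 1.2 we have

$$\bar{x} = \begin{pmatrix} \bar{x}_1 \\ \bar{x}_2 \end{pmatrix} = \begin{pmatrix} 5.2 \\ 12.481 \end{pmatrix} \quad \text{and} \quad S = \begin{pmatrix} 10.6222 & -17.7102 \\ -17.7102 & 30.8544 \end{pmatrix}.$$

The squared statistical distances $d_j^2 = (x_j - \bar{x})^TS^{-1}(x_j - \bar{x})$, $j = 1, \ldots, 10$, are calculated and listed below.

$d_1^2$  $d_2^2$  $d_3^2$  $d_4^2$  $d_5^2$  $d_6^2$  $d_7^2$  $d_8^2$  $d_9^2$  $d_{10}^2$
1.8753   2.0203   2.9009   0.7352   0.3105   0.0176   3.7329   0.8165   1.3753   4.2152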

(b) We plot the data points and the 50% probability contour (the blue ellipse) in Figure 3. An observation lies inside the contour when $d_j^2 \le \chi^2_{2,0.5} = 1.3863$, so observations 4, 5, 6, 8, and 9 fall within the estimated 50% probability contour. The proportion is $5/10 = 0.5$.

[Figure 3: Contour of a bivariate normal (axes: $x_1$, $x_2$; points labeled by observation number)]

(c) The squared distances in Part (a) are ordered as below. The chi-square plot is shown in Figure 4.

$d_6^2$  $d_5^2$  $d_4^2$  $d_8^2$  $d_9^2$  $d_1^2$  $d_2^2$  $d_3^2$  $d_7^2$  $d_{10}^2$
0.0176   0.3105   0.7353   0.8165   1.3753   1.8753   2.0203   2.9009   3.7329   4.2153

[Figure 4: Chi-square plot (axes: Theoretical Quantiles, Sample Quantiles)]

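(d) Given the results in Parts (b) and (c), we conclude that these data are approximately bivariate normal: half of the observations fall within the 50% contour, as expected, and most of the points in the chi-square plot lie close to the theoretical line.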

Appendix

R code for Problem 4.2 (c).

> library(ellipse)
> library(MASS)
> library(mvtnorm)
> set.seed(123)
>
> mu <- c(0,2)
> Sigma <- matrix(c(2,sqrt(2)/2,sqrt(2)/2,1), nrow=2, ncol=2)
> # simulate points for the background scatter plot
> X <- mvrnorm(n=10000, mu=mu, Sigma=Sigma)
> lambda <- eigen(Sigma)$values
> Gamma <- eigen(Sigma)$vectors
> # 50% contour, shifted so that it is centered at mu
> elps <- t(t(ellipse(Sigma, level=0.5, npoints=1000))+mu)
> chi <- qchisq(0.5,df=2)
> c <- sqrt(chi)
> # half-lengths of the axes, c*sqrt(lambda_i)
> factor <- c*sqrt(lambda)
> plot(X[,1],X[,2])
> lines(elps)
> points(mu[1], mu[2])
> # axes drawn from the center mu along each eigenvector
> segments(mu[1],mu[2],factor[1]*Gamma[1,1]+mu[1],factor[1]*Gamma[2,1]+mu[2])
> segments(mu[1],mu[2],factor[2]*Gamma[1,2]+mu[1],factor[2]*Gamma[2,2]+mu[2])

R code for Problem 4.23.

> x <- c(-0.6, 3.1, 25.3, -16.8, -7.1, -6.2, 25.2, 22.6, 26.0)
> # (a) normal Q-Q plot with reference line
> qqnorm(x)
> qqline(x)
> # (b) correlation between ordered data and normal quantiles
> y <- sort(x)
> n <- length(y)
> p <- ((1:n)-0.5)/n
> q <- qnorm(p)
> rQ <- cor(y,q)

R code for Problem 4.26.

> n <- 10
> x1 <- c(1,2,3,3,4,5,6,8,9,11)
> x2 <- c(18.95, 19.00, 17.95, 15.54, 14.00, 12.95, 8.94, 7.49, 6.00, 3.99)
> X <- cbind(x1,x2)
> Xbar <- colMeans(X)
> S <- cov(X)
> Sinv <- solve(S)
>
> # (a) squared statistical distances d_j^2
> d <- diag(t(t(X)-Xbar)%*%Sinv%*%(t(X)-Xbar))
>
> # (b) 50% contour; observations inside are labeled in blue, outside in red
> library(ellipse)
> p <- 2
> elps <- t(t(ellipse(S, level=0.5, npoints=1000))+Xbar)
> plot(X[,1],X[,2],type="n")
> index <- d < qchisq(0.5,df=p)
> text(X[,1][index],X[,2][index],(1:n)[index],col="blue")
> text(X[,1][!index],X[,2][!index],(1:n)[!index],col="red")
> lines(elps,col="blue")
>
> # (c) ordered distances and chi-square plot
> names(d) <- 1:10
> sort(d)
> qqplot(qchisq(ppoints(500),df=p), d, main="",
+ xlab="Theoretical Quantiles", ylab="Sample Quantiles")
> qqline(d,distribution=function(x){qchisq(x,df=p)})