STAT 263/363: Experimental Design Winter 2016/17

Lecture 1 | January 9

Lecturer: Minyong Lee    Scribe: Zachary del Rosario

1.1 Design of Experiments

Why perform Design of Experiments (DOE)? There are at least two reasons:

1. To do more efficient data gathering

2. To infer causality

By efficiency, we mean reducing the variance of some estimate $V(\hat{\theta})$ (say, from a regression) for some fixed number of samples $n$. For inferring causality, note that a regression by itself is "passive": it only allows us to learn some correlation. (There are some cases where regression without a designed experiment can still infer causality; this is not the case for arbitrary experiments, though.) The best way to get at causality in a reliable way is to design an experiment.

Definition 1. An experiment is a purposeful setting of input variables with observations of corresponding output variable(s). This may include (purposeful) randomization of the input variables.

Definition 2. A controlled experiment is an experiment which changes only one variable, in order to isolate the result. Note that this also includes (purposeful) randomization.

Example: (Ch. 1 of Modern Experimental Design) A math teacher has many students in their class (40), and is considering a new approach for teaching math concepts. In order to test this new technique, they are considering splitting the class to compare the old and new approaches. Suppose there are an equal number of male and female students, and the teacher has settled on performing an even split of 20 students under the old approach and 20 under the new method. There are at least 3 ways to perform this split:

1. Split by gender (20 boys and 20 girls)

2. Split randomly

3. Block the students by gender, then randomly assign

Approach (1) is a terrible idea, as it introduces a confounding factor (aka lurking variable); the performance difference between the groups could be attributed to either the teaching method or the gender of the students. Approach (2) is better, but may still suffer from the confound if the gender ratios are not equal. Approach (3) guarantees an equal ratio in both groups; it is the best of these options.
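The following is a minimal Python sketch of approach (3): block by gender, then randomize within each block. The student labels, the seed, and the helper name blocked_assignment are illustrative assumptions, not part of the notes.

    import random

    def blocked_assignment(students, block_key, seed=0):
        # Approach (3): group students into blocks, then randomize within each block.
        rng = random.Random(seed)
        blocks = {}
        for s in students:
            blocks.setdefault(block_key(s), []).append(s)
        assignment = {}
        for members in blocks.values():
            rng.shuffle(members)
            half = len(members) // 2
            for s in members[:half]:
                assignment[s] = "old"
            for s in members[half:]:
                assignment[s] = "new"
        return assignment

    # 20 boys and 20 girls, as in the classroom example above.
    students = ["boy_%d" % i for i in range(20)] + ["girl_%d" % i for i in range(20)]
    groups = blocked_assignment(students, block_key=lambda s: s.split("_")[0])
    # Each treatment group now contains exactly 10 boys and 10 girls.
    print(sum(groups[s] == "old" for s in students if s.startswith("boy")))   # 10
    print(sum(groups[s] == "old" for s in students if s.startswith("girl")))  # 10

By construction the gender ratio is identical in the two treatment groups, so gender cannot confound the comparison of teaching methods.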


1.2 Paired Data

Example: (Shoe sole robustness) A group of experimenters want to quantify the robustness of two different types of shoe soles, the A and B versions. To this end, they study 10 kids, and assign them specially made shoes with a different sole for each foot. They randomly assign subjects to have an A left sole and B right sole, or vice versa, and measure the wear on each sole after a set period of time. The results are shown in Figure 1.1.

[Figure 1.1. Example paired data; on average A is more robust than B. This is most clear from the paired data, but less clear from the estimated densities, which hide the pairwise nature of the data.]

Note that from considering the pairwise differences, it is clear that A is more robust than B. This is less clear from the distributions, where we ignore the pairwise nature of the data. This suggests that a pairwise test will have more power, an idea which can be made rigorous by comparing the two statistical tests.

1.2.1 Unpaired t-test

Given two random variables $X \sim N(\mu_X, \sigma^2)$ and $Y \sim N(\mu_Y, \sigma^2)$, suppose we have measurements $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$. We define a t-statistic

$$ t = \frac{\bar{x} - \bar{y}}{s\sqrt{1/m + 1/n}}, \qquad (1.1) $$

where $\bar{x} = \sum_{i=1}^m x_i / m$, $\bar{y} = \sum_{i=1}^n y_i / n$, and $s^2 = \dfrac{\sum_{i=1}^m (x_i - \bar{x})^2 + \sum_{i=1}^n (y_i - \bar{y})^2}{m + n - 2}$. In this case, we have $t \sim t_{m+n-2}$ under the null hypothesis $\mu_X = \mu_Y$.
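As a quick sanity check of Equation 1.1, the sketch below computes the pooled two-sample t-statistic by hand and compares it against scipy.stats.ttest_ind, whose default equal_var=True uses the same pooled-variance statistic. The simulated data, means, and seed are made-up values for illustration only.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(loc=10.0, scale=2.0, size=12)   # m = 12 samples of X
    y = rng.normal(loc=11.0, scale=2.0, size=15)   # n = 15 samples of Y

    m, n = len(x), len(y)
    # Pooled variance estimate s^2, as in Equation 1.1.
    s2 = (np.sum((x - x.mean())**2) + np.sum((y - y.mean())**2)) / (m + n - 2)
    t_manual = (x.mean() - y.mean()) / np.sqrt(s2 * (1/m + 1/n))

    t_scipy, p_scipy = stats.ttest_ind(x, y)        # equal_var=True by default
    print(t_manual, t_scipy)                        # the two statistics agree
    print(2 * stats.t.sf(abs(t_manual), df=m + n - 2), p_scipy)  # two-sided p-values agree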


1.2.2 Paired t-test

Given two random variables $X, Y$ whose difference $D = X - Y$ is normally distributed, $D \sim N(\mu, \sigma^2)$, suppose we have data pairs $(x_i, y_i)$ for $i = 1, \ldots, n$, and define $d_i = x_i - y_i$. We define a t-statistic

$$ t = \frac{\bar{d}}{s_D / \sqrt{n}}, \qquad (1.2) $$

where $\bar{d} = \sum_{i=1}^n d_i / n$ and $s_D^2 = \sum_{i=1}^n (d_i - \bar{d})^2 / (n-1)$. In this case, we have $t \sim t_{n-1}$ under the null hypothesis $\mu_D = 0$.

Comparing Equations 1.1 and 1.2, we see that their numerators are identical, while their denominators (variance estimates) are not. Since we reject the null hypothesis when $t$ is large, the smaller the variance estimate the easier it is to reject (a large variance estimate in the denominator shrinks the statistic). Note that in the unpaired case, we add together the contributions from $X$ and $Y$, while in the paired case we consider only the differences; thus typically $s_D \leq s$. (Note that this argument breaks down if $X$ and $Y$ are independent.)
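To illustrate the power argument above, here is a hedged sketch comparing the paired test (Equation 1.2) with the unpaired test (Equation 1.1) on simulated data with a strong per-subject effect, loosely in the spirit of the shoe-sole example. The sample size, effect size, noise levels, and seed are assumptions made for the illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 10
    subject = rng.normal(0.0, 3.0, size=n)          # large per-subject (per-kid) variation
    wear_a = 10.0 + subject + rng.normal(0.0, 1.0, size=n)
    wear_b = 11.0 + subject + rng.normal(0.0, 1.0, size=n)   # sole B wears ~1 unit more

    t_unpaired, p_unpaired = stats.ttest_ind(wear_a, wear_b)   # ignores the pairing
    t_paired, p_paired = stats.ttest_rel(wear_a, wear_b)       # uses the differences d_i

    # With strongly correlated pairs, s_D is much smaller than the pooled s,
    # so the paired test typically gives a larger |t| and a smaller p-value.
    print(abs(t_unpaired), p_unpaired)
    print(abs(t_paired), p_paired)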

1.3 Normal Theory via One-way ANOVA

Suppose we have $Y_{ij} \sim N(\mu_i, \sigma^2)$ for $i = 1, \ldots, K$, $j = 1, \ldots, n_i$, with $N = \sum_{i=1}^K n_i$. We use the model $Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$, where the effects $\alpha_i$ sum to zero, i.e. $\sum_{i=1}^K \alpha_i = 0$. We may test the null hypothesis $\mu_1 = \cdots = \mu_K$ by using the $-2\log$ likelihood (what does this terminology mean?)

$$ \frac{1}{\sigma^2} \sum_{i=1}^K \sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, \qquad (1.3) $$

which is asymptotically ($n \to \infty$) distributed as a $\chi^2$ variable. However, we may apply a different test for finite $n$; this leads to the ANOVA F-test.
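The statistic in Equation 1.3 can be checked by simulation. The sketch below is an illustration under stated assumptions (normal data, known σ, a balanced design, and the null hypothesis true); it verifies that the statistic's average is close to K − 1, the mean of a χ² variable with K − 1 degrees of freedom.

    import numpy as np

    rng = np.random.default_rng(2)
    K, n_i, sigma, mu = 4, 25, 2.0, 5.0            # K groups of equal size, sigma known
    reps = 5000

    stats_13 = np.empty(reps)
    for r in range(reps):
        y = rng.normal(mu, sigma, size=(K, n_i))    # null hypothesis: all means equal
        group_means = y.mean(axis=1)
        grand_mean = y.mean()
        # Statistic (1.3): (1/sigma^2) * sum_i sum_j (Ybar_i - Ybar)^2
        stats_13[r] = n_i * np.sum((group_means - grand_mean)**2) / sigma**2

    print(stats_13.mean())   # close to K - 1 = 3, the mean of a chi^2_{K-1} variable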

1.3.1 One-way ANOVA Identity and Test

With the model above, we have

$$ \sum_{i=1}^K \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^K \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 + \sum_{i=1}^K \sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, $$

$$ SS_{\text{total}} = SS_{\text{error}} + SS_{\text{treatment}}. \qquad (1.4) $$

(Note that $SS_{\text{error}}$ is sometimes called $SS_{\text{within}}$, and $SS_{\text{treatment}}$ is aka $SS_{\text{between}}$.)

To test the null hypothesis defined above, we define the following F-statistic

$$ F = \frac{SS_{\text{treatment}} / (K-1)}{SS_{\text{error}} / (N-K)}, \qquad (1.5) $$

which is distributed according to $F \sim F_{K-1,\,N-K}$. Under an alternative hypothesis $H_A$ with fixed $\alpha_i$, one can show that

$$ E(SS_{\text{treatment}}) = \sum_{i=1}^K n_i\, E\big((\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2\big) = \sum_{i=1}^K n_i \alpha_i^2 + (K-1)\sigma^2, $$

$$ E(SS_{\text{error}}) = (N-K)\sigma^2, \qquad (1.6) $$

and our $F$ statistic defined above is distributed according to a non-central $F'$ distribution,

$$ F \sim F'_{K-1,\,N-K}\!\left(\tfrac{1}{\sigma^2} \sum_{i=1}^K n_i \alpha_i^2\right). $$
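Below is a small Monte Carlo check of Equation 1.6 and of the non-central F distribution. It assumes a balanced design (equal $n_i$, so the stated formulas apply exactly with $\sum_i \alpha_i = 0$); the group count, effect sizes, and σ are illustrative values.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    K, n_i, sigma = 3, 20, 1.5
    N = K * n_i
    alpha = np.array([-0.5, 0.0, 0.5])              # fixed effects, summing to zero
    reps = 4000

    ss_treat = np.empty(reps)
    f_stats = np.empty(reps)
    for r in range(reps):
        y = rng.normal(0.0, sigma, size=(K, n_i)) + alpha[:, None]
        group_means = y.mean(axis=1)
        grand_mean = y.mean()
        sst = n_i * np.sum((group_means - grand_mean)**2)
        sse = np.sum((y - group_means[:, None])**2)
        ss_treat[r] = sst
        f_stats[r] = (sst / (K - 1)) / (sse / (N - K))

    # Simulated E(SS_treatment) vs the formula sum_i n_i alpha_i^2 + (K-1) sigma^2.
    print(ss_treat.mean(), n_i * np.sum(alpha**2) + (K - 1) * sigma**2)
    # Simulated mean of F vs the mean of the non-central F distribution.
    nc = n_i * np.sum(alpha**2) / sigma**2
    print(f_stats.mean(), stats.ncf.mean(K - 1, N - K, nc))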

1.3.2 ANOVA Table

When computing an ANOVA via standard statistical software (e.g. R), a table similar to the following will be produced. Table 1.3.2 labels the entries of such a table.

Source     | df    | SS             | MS                     | F
Treatment  | K-1   | SS_treatment   | SS_treatment / (K-1)   | MS_treatment / MS_error
Error      | N-K   | SS_error       | SS_error / (N-K)       |
Total      | N-1   | SS_total       |                        |
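The sketch below builds the table above from raw data, computing each column directly and checking the F statistic and p-value against scipy.stats.f_oneway. The three groups of measurements are made up purely for illustration.

    import numpy as np
    from scipy import stats

    groups = [np.array([4.2, 5.1, 4.8, 5.5, 4.9]),
              np.array([5.9, 6.3, 5.7, 6.1]),
              np.array([4.0, 4.4, 4.1, 4.6, 4.3, 4.2])]

    K = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()

    ss_treatment = sum(len(g) * (g.mean() - grand_mean)**2 for g in groups)
    ss_error = sum(np.sum((g - g.mean())**2) for g in groups)
    ms_treatment = ss_treatment / (K - 1)
    ms_error = ss_error / (N - K)
    F = ms_treatment / ms_error
    p = stats.f.sf(F, K - 1, N - K)

    print("Treatment", K - 1, ss_treatment, ms_treatment, F, p)
    print("Error    ", N - K, ss_error, ms_error)
    print("Total    ", N - 1, ss_treatment + ss_error)
    print(stats.f_oneway(*groups))   # same F and p-value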

1.4 Post-hoc Analysis of ANOVA Results

An ANOVA can only detect whether a difference between treatments exists; further/different analysis is required to determine which treatments differ. There are various methods to test for these differences. (Note that here we are performing sequential tests; i.e., we choose to perform further statistical tests based on the results of an initial test. It can be difficult to analyze the statistical effects of this sequential procedure.)

1.4.1 Pairwise t-tests

If there are $K$ treatments, then there are $\binom{K}{2}$ pairwise t-tests to perform, with null hypotheses $\mu_i - \mu_j = 0$. However, a blind testing of each possible pairwise comparison will yield a large number of false discoveries; supposing a type 1 error rate of $\alpha$, we expect to see $\alpha \binom{K}{2}$ false positives.
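The sketch below illustrates this false-discovery problem by simulation: under a global null (all treatment means equal), it runs all $\binom{K}{2}$ pairwise t-tests and counts rejections at level α, repeating the experiment many times. The group count, sample sizes, and seed are illustrative assumptions.

    import numpy as np
    from itertools import combinations
    from scipy import stats

    rng = np.random.default_rng(4)
    K, n_i, alpha = 6, 15, 0.05
    n_pairs = K * (K - 1) // 2                      # C(K, 2) = 15 pairwise tests
    reps = 2000

    false_positives = np.empty(reps)
    for r in range(reps):
        groups = rng.normal(0.0, 1.0, size=(K, n_i))      # global null: all means equal
        pvals = [stats.ttest_ind(groups[i], groups[j]).pvalue
                 for i, j in combinations(range(K), 2)]
        false_positives[r] = np.sum(np.array(pvals) < alpha)

    # Average number of false positives per experiment, close to alpha * C(K, 2) = 0.75.
    print(false_positives.mean(), alpha * n_pairs)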

1.4.2 Bonferroni

To solve the false discovery issue of pairwise testing, the Bonferroni correction enforces a significance level of $\alpha' = \alpha / \binom{K}{2}$, where $\alpha$ is the overall desired significance level. This is a (very) conservative approach.


1.4.3 Scheffé

The Scheffé test (apparently?) builds simultaneous confidence intervals for contrasts. It is intended to solve the pairwise issue noted above.
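For concreteness, here is a hedged sketch of the textbook Scheffé simultaneous confidence interval for a single contrast, using the usual critical value $\sqrt{(K-1) F_{1-\alpha;\,K-1,\,N-K}}$; the data and the particular contrast are made-up illustrations, and this shows the standard form of the method rather than anything specific to these notes.

    import numpy as np
    from scipy import stats

    groups = [np.array([4.2, 5.1, 4.8, 5.5, 4.9]),
              np.array([5.9, 6.3, 5.7, 6.1]),
              np.array([4.0, 4.4, 4.1, 4.6, 4.3, 4.2])]
    c = np.array([1.0, -0.5, -0.5])                 # contrast: group 1 vs average of 2 and 3

    K = len(groups)
    N = sum(len(g) for g in groups)
    n = np.array([len(g) for g in groups])
    means = np.array([g.mean() for g in groups])
    ms_error = sum(np.sum((g - g.mean())**2) for g in groups) / (N - K)

    estimate = np.dot(c, means)
    se = np.sqrt(ms_error * np.sum(c**2 / n))
    crit = np.sqrt((K - 1) * stats.f.ppf(0.95, K - 1, N - K))   # Scheffe critical value
    print(estimate - crit * se, estimate + crit * se)           # 95% simultaneous CI

Because the same critical value covers every possible contrast, the interval is wider than a single t-based interval; that is the price of simultaneity.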

1.4.4 Benjamini-Hochberg Procedure

The Benjamini-Hochberg procedure is intended to control the false discovery rate (FDR) at some specified level. This is in contrast with the Bonferroni approach, which controls the probability of making at least one false discovery. Thus, this is a less conservative approach than Bonferroni.
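To make the contrast concrete, here is a minimal sketch of both corrections applied to the same made-up vector of p-values: Bonferroni tests each p-value against $\alpha/m$, while Benjamini-Hochberg finds the largest $k$ with $p_{(k)} \leq (k/m)\,\alpha$ and rejects the $k$ smallest p-values. The function names and the p-values themselves are illustrative assumptions.

    import numpy as np

    def bonferroni_reject(pvals, alpha=0.05):
        # Reject H_i if p_i <= alpha / m; controls the family-wise error rate.
        m = len(pvals)
        return np.asarray(pvals) <= alpha / m

    def benjamini_hochberg_reject(pvals, alpha=0.05):
        # Controls the false discovery rate at level alpha.
        p = np.asarray(pvals)
        m = len(p)
        order = np.argsort(p)
        thresholds = alpha * (np.arange(1, m + 1) / m)
        below = p[order] <= thresholds
        reject = np.zeros(m, dtype=bool)
        if below.any():
            k = np.max(np.nonzero(below)[0])        # largest k with p_(k) <= (k/m) * alpha
            reject[order[:k + 1]] = True            # reject the k smallest p-values
        return reject

    pvals = [0.001, 0.008, 0.012, 0.03, 0.20, 0.55]    # made-up p-values from 6 tests
    print(bonferroni_reject(pvals).sum())              # 2 rejections (0.001, 0.008)
    print(benjamini_hochberg_reject(pvals).sum())      # 4 rejections (up through 0.03)

On this toy input the less conservative Benjamini-Hochberg rule rejects more hypotheses than Bonferroni, as the discussion above suggests.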