
Spring 2008 - Stat C141 / Bioeng C141 - Statistics for Bioinformatics

Course Website: http://www.stat.berkeley.edu/users/hhuang/141C-2008.html
Section Website: http://www.stat.berkeley.edu/users/mgoldman

GSI Contact Info:

Megan Goldman

mgoldman@stat.berkeley.edu
Office Hours: 342 Evans, M 10-11, Th 3-4, and by appointment

1 Why is multiple testing a problem?

Say you have a set of hypotheses that you wish to test simultaneously. The first idea that might come to mind is to test each hypothesis separately, using some level of significance. At first blush, this doesn't seem like a bad idea. However, consider a case where you have 20 hypotheses to test, and a significance level of 0.05. What's the probability of observing at least one significant result just due to chance?

P(at least one significant result) = 1 - P(no significant results)
                                   = 1 - (1 - 0.05)^20
                                   ≈ 0.64

So, with 20 tests being considered, we have a 64% chance of observing at least one significant result, even if all of the tests are actually not significant. In genomics and other biology-related fields, it's not unusual for the number of simultaneous tests to be quite a bit larger than 20... and the probability of getting a significant result simply due to chance keeps going up. Methods for dealing with multiple testing frequently call for adjusting α in some way, so that the probability of observing at least one significant result due to chance remains below your desired significance level.
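As a quick sanity check, you can do this arithmetic directly in R (a minimal sketch; 20 and 0.05 are just the values from the example above):

# Probability of at least one significant result among 20 independent
# tests at the 0.05 level, when every null hypothesis is actually true
1 - (1 - 0.05)^20
# [1] 0.6415141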

2 The Bonferroni correction

The Bonferroni correction sets the significance cut-off at α/n. For example, in the example above, with 20 tests and α = 0.05, you'd only reject a null hypothesis if the p-value is less than 0.05/20 = 0.0025. The Bonferroni correction tends to be a bit too conservative. To demonstrate this, let's calculate the probability of observing at least one significant result when using the correction just described:

P(at least one significant result) = 1 - P(no significant results)
                                   = 1 - (1 - 0.0025)^20
                                   ≈ 0.0488

Here, we're just a shade under our desired 0.05 level. We benefit here from assuming that all tests are independent of each other. In practical applications, that is often not the case. Depending on the correlation structure of the tests, the Bonferroni correction could be extremely conservative, leading to a high rate of false negatives.
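In R, the same cut-off can be applied either by dividing α by the number of tests or with the built-in p.adjust function (a side note; p.adjust is base R and isn't used in the original handout):

alpha <- 0.05
n <- 20
alpha / n   # 0.0025, the per-test cut-off
# Equivalently, scale the p-values up and compare to alpha
# (pvals here is a hypothetical vector of n p-values):
# p.adjust(pvals, method = "bonferroni") <= alpha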

3 The False Discovery Rate

In large-scale multiple testing (as often happens in genomics), you may be better served by controlling the false discovery rate (FDR). This is defined as the expected proportion of false positives among all significant results. The FDR works by estimating some rejection region so that, on average, FDR < α.
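The standard recipe for this is the Benjamini-Hochberg step-up rule, which the code in Section 5.3 approximates. A minimal sketch (the procedure itself isn't spelled out in these notes, and the function name bh_reject is mine):

# Benjamini-Hochberg: with sorted p-values p(1) <= ... <= p(m), find the
# largest k with p(k) <= (k/m) * alpha, then reject everything <= p(k)
bh_reject <- function(pvals, alpha = 0.05) {
  m <- length(pvals)
  sorted <- sort(pvals)
  passed <- which(sorted <= (seq_len(m) / m) * alpha)
  if (length(passed) == 0) return(rep(FALSE, m))
  pvals <= sorted[max(passed)]  # TRUE = reject, in the original order
}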

4 The positive False Discovery Rate

The positive false discovery rate (pFDR) is a bit of a wrinkle on the FDR. Here, you try to control the probability that the null hypothesis is true, given that the test rejected the null. This method works by first fixing the rejection region, then estimating α, which is quite the opposite of how the FDR is handled. For gory levels of detail, see the Storey paper the professor has linked to from the class website.
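In symbols (these definitions come from the Storey paper, not from the notes themselves): with V false positives out of R total rejections,

FDR = E[(V/R) · 1{R > 0}]    versus    pFDR = E[V/R | R > 0],

so the pFDR conditions on having made at least one rejection, rather than counting the no-rejection case as zero error.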

5 Comparing the three

First, let's make some data. For kicks and grins, we'll use the random normals in such a way that we'll know what the result of each hypothesis test should be.

x <- c(rnorm(900), rnorm(100, mean = 3))
p <- pnorm(x, lower.tail = F)

These functions should all be familiar from the first few weeks of class. Here, we've made a vector, x, of length 1000. The first 900 entries are random numbers with a standard normal distribution. The last 100 are random numbers from a normal distribution with mean 3 and sd 1. Note that I didn't need to indicate the sd of 1 in the second bit; it's the default value.

The second line of code is finding the p-values for a hypothesis test on each value of x. The hypothesis being tested is that the value of x is not different from 0, given the entries are drawn from a standard normal distribution. The alternative is one-sided, claiming that the value is larger than 0.

Now, in this case, we know the truth: The first 900 observations should fail to reject the null hypothesis: they are, in fact, drawn from a standard normal distribution, and any difference between the observed value and 0 is just due to chance. The last 100 observations should reject the null hypothesis: the difference between these values and 0 is not due to chance alone.

Let's take a look at our p-values, adjust them in various ways, and see what sort of results we get. Note that, since we all will have our own random vectors, your figures will probably be a bit different from mine. They should be pretty close, however.
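If you'd rather your numbers match exactly from run to run, you can fix the random seed before generating x (a small aside, not in the original notes; the seed value 141 is arbitrary):

set.seed(141)  # any fixed integer makes the rnorm() draws reproducible
x <- c(rnorm(900), rnorm(100, mean = 3))
p <- pnorm(x, lower.tail = F)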

5.1 No corrections

test <- p > 0.05
summary(test[1:900])
summary(test[901:1000])

The two summary lines will give me a count of how many p-values were above and below 0.05. Based on my random vector, I get something that looks like this:

> summary(test[1:900])
   Mode   FALSE    TRUE
logical      46     854
> summary(test[901:1000])
   Mode   FALSE    TRUE
logical      88      12

The type I error rate (false positives) is 46/900 = 0.0511. The type II error rate (false negatives) is 12/100 = 0.12. Note that the type I error rate is awfully close to our α, 0.05. This isn't a coincidence: α can be thought of as some target type I error rate.
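Since R treats TRUE as 1 and FALSE as 0 in arithmetic, you can also compute the two error rates directly rather than reading them off the summaries (a minimal sketch using the same test vector as above):

# Type I error rate: rejections among the 900 true nulls
mean(!test[1:900])
# Type II error rate: non-rejections among the 100 real effects
mean(test[901:1000])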

5.2 Bonferroni correction

We have α = 0.05, and 1000 tests, so the Bonferroni correction will have us looking for p-values smaller than 0.05/1000 = 0.00005:

> bonftest <- p > 0.00005
> summary(bonftest[1:900])
   Mode   FALSE    TRUE
logical       1     899
> summary(bonftest[901:1000])
   Mode   FALSE    TRUE
logical      23      77

Here, the type I error rate is 1/900 = 0.0011, but the type II error rate has skyrocketed to 77/100 = 0.77. We've reduced our false positives at the expense of false negatives. Ask yourself: which is worse, false positives or false negatives? Note: there isn't a firm answer. It really depends on the context of the problem and the consequences of each type of error.
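The same cut can be expressed with p.adjust, which scales each p-value up by the number of tests and caps the result at 1 (a side note; the handout applies the threshold by hand):

# Equivalent to bonftest above: p * 1000 > 0.05 iff p > 0.00005
bonftest2 <- p.adjust(p, method = "bonferroni") > 0.05
all(bonftest2 == bonftest)  # should be TRUE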

5.3 FDR

For the FDR, we want to consider the ordered p-values. We'll see if the kth ordered p-value is larger than k × 0.05/1000:

psort <- sort(p)
fdrtest <- NULL
for (i in 1:1000) fdrtest <- c(fdrtest, p[i] > match(p[i], psort) * .05/1000)

Let's parse this bit of code. I want the string of trues and falses to be in the same order as the original p-values, so we can easily pick off the errors. I start by creating a separate variable, psort, which holds the same values as p, but sorted from smallest to largest. Say I want to test only the first entry of p:

> p[1] > match(p[1], psort) * .05/1000
[1] TRUE

p[1] picks off the first entry from the vector p. match(p[1], psort) looks through the vector psort, finds the first value that's exactly equal to p[1], and returns which entry of the vector it is. In my random vector, match(p[1], psort) returns 619. That means that, if you put all the p-values in order from smallest to largest, the 619th smallest value is the one that appears first in my vector. The value you get might differ pretty wildly in this case. Anyhow, on to see how the errors go:

> summary(fdrtest[1:900])
   Mode   FALSE    TRUE
logical       3     897
> summary(fdrtest[901:1000])
   Mode   FALSE    TRUE
logical      70      30

Now we have a type I error rate of 3/900 = 0.0033. The type II error rate is 30/100 = 0.30, a big improvement over the Bonferroni correction!
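Base R can also do this kind of FDR control for you via the Benjamini-Hochberg adjustment (a side note, not part of the original handout; the loop above compares each p-value to its own cut-off, which is a simplified version of the full step-up rule, so the two can disagree on a few borderline values):

# BH-adjusted p-values; comparing them to 0.05 implements the step-up rule
fdrtest2 <- p.adjust(p, method = "BH") > 0.05
summary(fdrtest2[1:900])
summary(fdrtest2[901:1000])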

5.4 pFDR

The pFDR is an awful lot more involved, coding-wise. Mercifully, someone's already written the package for us.

> library(qvalue)
> pfdrtest <- qvalue(p)$qvalues > .05

The qvalue function returns several things. By putting $qvalues after the function call, we say we only really want the bit of the output they call "qvalues".

> summary(pfdrtest[1:900])
   Mode   FALSE    TRUE
logical       3     897
> summary(pfdrtest[901:1000])
   Mode   FALSE    TRUE
logical      70      30

I seem to get the same results as in the regular FDR, at least at the 5% level. Let's take a look at the cumulative number of significant calls for various levels of α and the different corrections:

alpha        0.0001  0.001   0.01    0.025   0.05    0.1
Uncorrected      31     57     93     118     134    188
Bonferroni        0      6     13      21      24     31
FDR               0     19     44      63      73     91
pFDR              0     20     48      64      73     93

Here's how the type I errors do:

alpha        0.0001  0.001   0.01    0.025   0.05    0.1
Uncorrected  0.0011  0.0022  0.0144  0.0344  0.0511  0.1056
Bonferroni   0       0       0       0.0011  0.0011  0.0011
FDR          0       0       0.0011  0.0022  0.0033  0.0122
pFDR         0       0       0.0011  0.0022  0.0033  0.0144

... and type II errors:

alpha        0.0001  0.001   0.01    0.025   0.05    0.1
Uncorrected  0.70    0.45    0.20    0.13    0.12    0.07
Bonferroni   1       0.94    0.87    0.80    0.77    0.70
FDR          1       0.81    0.57    0.39    0.30    0.20
pFDR         1       0.80    0.53    0.38    0.30    0.20
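If you want to rebuild the counts table for your own random vector, the base-R adjustments make it a few lines (a sketch; the pFDR column is omitted here because it needs the external qvalue package, and your counts will differ slightly from mine since the data are random):

alphas <- c(0.0001, 0.001, 0.01, 0.025, 0.05, 0.1)
sig_counts <- sapply(alphas, function(a) c(
  Uncorrected = sum(p <= a),
  Bonferroni  = sum(p.adjust(p, "bonferroni") <= a),
  FDR         = sum(p.adjust(p, "BH") <= a)
))
colnames(sig_counts) <- alphas
sig_counts  # rows: correction method; columns: alpha level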