[PDF] Introduction to the Bootstrap - Harvard Medical School

Tibshirani (1993) (Full details concerning this series are available from the Publishers ) An Introduction to the Bootstrap Bradley Efron Department of Statistics








[PDF] Efrons bootstrap

1 Dec 2010 · The bootstrap was introduced by Brad Efron in the late 1970s. It is a computer-intensive method for approximating the sampling distribution of any 



[PDF] An Introduction to Bootstrap

An Introduction to the Bootstrap BRADLEY EFRON Department of Statistics Stanford University and ROBERT J TIBSHIRANI Department of Preventative 






Introduction to Efron (1979) Bootstrap Methods: Another Look at the

Efron (1979) Bootstrap Methods: Another Look at the Jackknife Rudolf J Beran University of California at Berkeley It is not unusual, in the history of statistics, 



[PDF] THE AUTOMATIC CONSTRUCTION OF BOOTSTRAP

BOOTSTRAP CONFIDENCE INTERVALS By Bradley Efron Balasubramanian Narasimhan Stanford University Technical Report No 2018-07 October 2018



[PDF] An Introduction to the Bootstrap

Bootstrap Bradley Efron Department of Statistics Stanford University and Robert J Tibshirani 12 Confidence intervals based on bootstrap "tables" 153



[PDF] Introduction to the Bootstrap

1 jui 2003 · A modern alternative to the traditional approach is the bootstrapping method, introduced by Efron (1979). The bootstrap is a computer-intensive 


Design and Analysis of Cross-Over Trials B. Jones and M.G. Kenward (1989)
Symmetric Multivariate and Related Distributions K.-T. Fang, S. Kotz and K. Ng (1989)
38 Cyclic Designs J.A. John (1987)
P. Walley (1990)
Inspection Errors for Attributes in Quality Control N.L. Johnson, S. Kotz and X. Wu (1991)
45 The Analysis of Contingency Tables, 2nd edition B.S. Everitt (1992)
46 The Analysis of Quantal Response Data B.J.T. Morgan (1992)
47 Longitudinal Data with Serial Correlation: A State-Space Approach R.H. Jones (1993)
Differential Geometry and Statistics M.K. Murray and J.W. Rice (1993)
50 Chaos and Networks: Statistical and Probabilistic Aspects Edited by O. Barndorff-Nielsen et al. (1993)
Number Theoretic Methods in Statistics K.-T. Fang and W. Yuan (1993)
M. Pesonen (1993)
I.J. Lauder (1994)
57 An Introduction to the Bootstrap B. Efron and R. Tibshirani (1993)

(Full details concerning this series are available from the Publishers.)

An Introduction to the Bootstrap

Bradley Efron
Department of Statistics
Stanford University

and

Robert J. Tibshirani

CHAPMAN & HALL/CRC
Boca Raton London New York Washington, D.C.

Efron, Bradley.
An introduction to the bootstrap / Brad Efron, Rob Tibshirani.
p. cm.
Includes bibliographical references.
ISBN 0-412-04231-2
1. Bootstrap (Statistics). I. Tibshirani, Robert. II. Title.
QA276.8.E3745 1993
519.5'44-dc20 93-4489 CIP

TO CHERYL, CHARLIE, RYAN AND JULIE


First CRC Press reprint 1998
Originally published by Chapman & Hall
© 1993 by Chapman & Hall
© 1998 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 0-412-04231-2
Library of Congress Card Number 93-4489
Printed in the United States of America 2 3 4 5 6 7 8 9 0
Printed on acid-free paper

AND TO THE MEMORY OF RUPERT G. MILLER, JR.

PREFACE

... would like to thank his wife Cheryl for her understanding and support during this entire project, and his parents for a lifetime of encouragement. He gratefully acknowledges the support of the ...

Palo Alto and Toronto
June 1993

Bradley Efron
Robert Tibshirani

CHAPTER 1

Introduction

... experience ... seen statistical techniques become the analytic methods of choice in biomedical science, psychology, education, economics, communi... areas. Recently, traditional sciences like geology, physics, and astronomy have begun to make increasing use of statistical methods as they focus on areas that demand informational efficiency, such as ...

Most people ... devices we are not very good at picking out patterns from a sea of noisy data. To put it another way, we are all too good at picking ... optimal methods for finding a real signal in a noisy background, ... random patterns.

(1) How should I collect my data?
(2) How should I analyze and summarize the data that I've collected?
(3) How accurate are my data summaries?

...ference. The bootstrap is a recently developed technique for making ... because it requires modern computer power to simplify the often ... The explanations that we will give for the bootstrap, and other ... We will see examples of much more complicated summaries in later chapters. One advantage of using a good experimental design is a simplification of ... rates is in ... but their implementation has. The modern computer lets us ap... mathematical assumptions. Our primary purpose in the book is to ... be applied in a wide variety of real data-analytic situations.

... inference, are illustrated in the New York Times excerpt of Figure 1.1. A study was done to see if small aspirin doses would prevent ... aspirin study were collected in a particularly efficient way: by a controlled, ... placebo, ... statisticians keeping a secret code of who received which substance. ... to succeed.

The elaborate precautions of a controlled, randomized, ... while ...

HEART ATTACK RISK FOUND TO BE CUT BY TAKING ASPIRIN

LIFESAVING EFFECTS SEEN

Study Finds Benefit of Tablet Every Other Day Is Much Greater Than Expected

By HAROLD M. SCHMECK Jr.

A major nationwide study shows that a single aspirin tablet every other day can sharply reduce a man's risk of heart attack and death from heart attack.

The lifesaving effects were so dramatic that the study was halted in mid-December so that the results could be reported as soon as possible to the participants and to the medical profession in general.

The magnitude of the beneficial effect was far greater than expected, Dr. Charles H. Hennekens of Harvard, principal investigator in the research, said in a telephone interview. The risk of myocardial infarction, the technical name for heart attack, was cut almost in half.

'Extreme Beneficial Effect'

A special report said the results showed "a statistically extreme beneficial effect" from the use of aspirin. The report is to be published Thursday in The New England Journal of Medicine.

In recent years smaller studies have demonstrated that a person who has had one heart attack can reduce the risk of a second by taking aspirin, but there had been no proof that the beneficial effect would extend to the general male population.

Dr. Claude Lenfant, the director of the National Heart Lung and Blood Institute, said the findings were "extremely important," but he said the general public should not take the report as an indication that everyone should start taking aspirin.

... 1987. Reproduced by permission of the New York Times.

                   heart attacks
                   (fatal plus non-fatal)    subjects
aspirin group:            104                 11037
placebo group:            189                 11034

$$\hat{\theta} = \frac{104/11037}{189/11034} = .55. \qquad (1.1)$$

If believable, the aspirin-takers only have 55% as many heart attacks as placebo-takers.

... we would see if we could treat all subjects, and not just a sample of them. The value $\hat{\theta} = .55$ is only an estimate of $\theta$. The sample seems large here, 22071 subjects in all, but the conclusion that aspirin works is really based on a smaller number, the 293 observed heart attacks. How do we know that $\hat{\theta}$ might not come out much less favorably if the experiment were run again?

... allows us to make the following inference: the true value of $\theta$ lies in the interval

$$.43 < \theta < .70 \qquad (1.2)$$

with 95% confidence. Statement (1.2) is a classical confidence interval, of the type discussed in Chapters ... and 22. It says that if we ran a much bigger experiment, with millions of subjects, the ... We almost ... aspirin was actually harmful. It is really rather amazing that the same data that give us an estimated value, $\hat{\theta} = .55$ in this case, also can give us a good idea of the estimate's accuracy.

Statistical inference is serious business. A lot can ride on the decision of whether or not an observed effect is real. The aspirin study tracked strokes as well as heart attacks, with the following results:

                   strokes    subjects
aspirin group:       119       11037
placebo group:        98       11034        (1.3)

For strokes, the ratio of rates is

$$\hat{\theta} = \frac{119/11037}{98/11034} = 1.21. \qquad (1.4)$$

It now looks like ...

$$.93 < \theta < 1.59 \qquad (1.5)$$

with 95% confidence. This includes the neutral value $\theta = 1$, at which aspirin would be no better or worse than placebo vis-a-vis strokes. In ... was found to be significantly beneficial for preventing heart attacks, ...

The bootstrap is a data-based simulation method for statistical inference, which can be used to produce inferences like (1.2) and (1.5). The use of the term bootstrap derives from the phrase to pull oneself up by one's bootstrap, widely thought to be based on one of ... by Rudolph Erich Raspe. (The Baron had fallen to the bottom of a deep lake. Just when it looked like all was lost, he thought to pick himself up by his own bootstraps.) It is not the same as the ... computer from a set of core instructions, though the derivation is similar.

Here is how the bootstrap works in the stroke example. We create two populations: the first consisting of 119 ones and 11037 - 119 = 10918 zeroes, and the second consisting of 98 ones and 11034 - 98 = 10936 zeroes. We draw with replacement a sample of 11037 items from the first population, and a sample of 11034 items from the second population. ...

$$\hat{\theta}^* = \frac{\text{Proportion of ones in bootstrap sample \#1}}{\text{Proportion of ones in bootstrap sample \#2}}. \qquad (1.6)$$

We repeat this process a large number of times, say 1000 times, and obtain 1000 bootstrap replicates $\hat{\theta}^*$. This process is easy to implement on a computer, as we will see later. These 1000 replicates ... data. For example, the standard deviation turned out to be 0.17 in a batch of 1000 replicates that we generated. The value 0.17 is ... indicates that the ratio $\hat{\theta} = 1.21$ is only a little more than ... cannot be ruled out. A rough 95% confidence interval like (1.5) ... replicates, which in this case turned out to be (.93, 1.60). In ... of inferences like (1.2) and (1.5), producing them in an automatic way even in situations much more complicated than the aspirin study.
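The bootstrap procedure for the stroke example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the function names `bootstrap_ratio_replicates` and `percentile_interval`, the fixed seed, and the use of the 2.5% and 97.5% points of the sorted replicates for the rough interval are all choices made here, since the extracted text does not say how the interval (.93, 1.60) was read off. Results vary with the random draws, so the output should land near, not exactly on, the 0.17 and (.93, 1.60) quoted in the text.

```python
import random
import statistics


def bootstrap_ratio_replicates(events_1, n_1, events_2, n_2,
                               n_replicates=1000, seed=0):
    """Return bootstrap replicates of the ratio of two event rates.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    # Each population is a list of ones (strokes) and zeroes (no stroke).
    population_1 = [1] * events_1 + [0] * (n_1 - events_1)
    population_2 = [1] * events_2 + [0] * (n_2 - events_2)
    replicates = []
    for _ in range(n_replicates):
        # Draw with replacement, keeping the original group sizes.
        sample_1 = rng.choices(population_1, k=n_1)
        sample_2 = rng.choices(population_2, k=n_2)
        ones_2 = sum(sample_2)
        if ones_2 == 0:
            continue  # ratio undefined; effectively impossible with these counts
        replicates.append((sum(sample_1) / n_1) / (ones_2 / n_2))
    return replicates


def percentile_interval(values, lower=0.025, upper=0.975):
    """Rough interval from sorted replicates (an assumed percentile rule)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


if __name__ == "__main__":
    # Stroke counts from table (1.3): aspirin group first, placebo group second.
    reps = bootstrap_ratio_replicates(119, 11037, 98, 11034)
    print("standard deviation of replicates:", round(statistics.stdev(reps), 3))
    print("rough 95% interval:", percentile_interval(reps))
```

The sketch resamples the literal lists of ones and zeroes to mirror the description in the text; drawing the number of ones from a binomial distribution would be equivalent and faster, but hides the resampling step.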

The terminology of statistical summaries and inferences, like regression, correlation, analysis of variance, discriminant analysis, ... come the lingua franca of all disciplines that deal with noisy data. We will be examining what this language means and how it works in practice.

The particular goal of bootstrap theory is a computer-based implementation of basic statistical concepts. In some ways it is easier to ...

1.1 An overview of this book

... statistical accuracy. The bootstrap does not work in isolation but rather is applied to a wide variety of statistical procedures. Part of ... regression.

Here is a ... introduces the bootstrap estimate of standard error for a simple mean. ... and may be skimmed by readers eager to get to the details of ... defines ... to be ... Chapter 4 also shows that many familiar statistics can ... plugging in ... estimation for a mean, and shows how the usual textbook formula can be derived as a simple plug-in estimate.

The bootstrap is defined in Chapter 6, for estimating the standard error of a statistic from a single sample. The bootstrap standard error estimate is a plug-in estimate that rarely can be com... for approximating it. ... errors in two complicated examples: a principal components analysis and a curve fitting problem. Up to ... discussed. ... structures is discussed in Chapter 8. A two-sample problem and a time-series analysis are described.

Regression analysis and the bootstrap are discussed and illus... applied in a number of different ways and the results are discussed in two examples. The use of the bootstrap for estimation of bias is the topic of ...