
Gerhard Bohm, Günter Zech
Introduction to Statistics and Data Analysis for Physicists
Verlag Deutsches Elektronen-Synchrotron

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet.

Publisher and distribution:
Deutsches Elektronen-Synchrotron in der Helmholtz-Gemeinschaft
Zentralbibliothek
D-22603 Hamburg
Phone: (040) 8998-3602
Fax: (040) 8998-4440

Cover design: Christine Iezzi, Deutsches Elektronen-Synchrotron, Zeuthen
Printing: Druckerei, Deutsches Elektronen-Synchrotron
Copyright: Gerhard Bohm, Günter Zech
ISBN 978-3-935702-41-6
DOI 10.3204/DESY-BOOK/statistics (e-book)

The book may not be reproduced by printing, photocopying, or other methods, nor processed using electronic systems, without the written permission of the authors.

Preface

There is a large number of excellent statistics books. Nevertheless, we think that it is justified to complement them by another textbook with the focus on modern applications in nuclear and particle physics. To this end we have included a large number of related examples and figures in the text. We emphasize less the mathematical foundations but appeal to the intuition of the reader.

Data analysis in modern experiments is unthinkable without simulation techniques. We discuss in some detail how to apply Monte Carlo simulation to parameter estimation, deconvolution, and goodness-of-fit tests. We also sketch modern developments like artificial neural nets, bootstrap methods, boosted decision trees, and support vector machines.

Likelihood is a central concept of statistical analysis and its foundation is the likelihood principle. We discuss this concept in more detail than is usually done in textbooks and base the treatment of inference problems as far as possible on the likelihood function only, as is common in the majority of the nuclear and particle physics community. In this way point and interval estimation, error propagation, combining results, and inference of discrete and continuous parameters are treated consistently. We apply Bayesian methods where the likelihood function is not sufficient to proceed to sensible results, for instance in handling systematic errors, in deconvolution problems, and in some cases when nuisance parameters have to be eliminated, but we avoid improper prior densities. Goodness-of-fit and significance tests, where no likelihood function exists, are based on standard frequentist methods.

Our textbook is based on lecture notes from a course given to master physics students at the University of Siegen, Germany, a few years ago. The content has been considerably extended since then.
A preliminary German version is published as an electronic book at the DESY library. The present book is addressed mainly to master and Ph.D. students but also to physicists who are interested in an introduction to recent developments in statistical methods of data analysis in particle physics. When reading the book, some parts can be skipped, especially in the first five chapters. Where necessary, back references are included.

We welcome comments, suggestions, and indications of mistakes and typing errors. We are prepared to discuss or answer questions on specific statistical problems. We acknowledge the technical support provided by DESY and the University of Siegen.

February 2010,

Gerhard Bohm, Günter Zech
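As a taste of the likelihood-centered approach the preface describes, here is a minimal sketch (illustrative only, not taken from the book): a maximum-likelihood point estimate of the mean lifetime tau of an exponential distribution f(t; tau) = exp(-t/tau)/tau, comparing the closed-form estimator with a coarse scan of the log-likelihood. All names and the grid choice are ours.

```python
# Illustrative sketch: maximum-likelihood estimation of the decay
# parameter tau of an exponential distribution from a simulated sample.
import math
import random

random.seed(1)
tau_true = 2.0
sample = [random.expovariate(1.0 / tau_true) for _ in range(10_000)]

def log_likelihood(tau, data):
    """Sum of log f(t; tau) = -log(tau) - t/tau over the sample."""
    n = len(data)
    return -n * math.log(tau) - sum(data) / tau

# For the exponential the MLE has a closed form: the sample mean.
tau_hat = sum(sample) / len(sample)

# A coarse scan of the log-likelihood confirms the maximum sits there.
grid = [1.0 + 0.01 * k for k in range(200)]
tau_scan = max(grid, key=lambda t: log_likelihood(t, sample))

print(tau_hat, tau_scan)  # both close to tau_true = 2.0
```

For more than one parameter, or when no closed form exists, the scan is replaced by a numerical maximizer; the likelihood function itself stays the central object, as in the chapters on parameter inference below.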

Contents

1 Introduction: Probability and Statistics  1
1.1 The Purpose of Statistics  1
1.2 Event, Observation and Measurement  2
1.3 How to Define Probability?  3
1.4 Assignment of Probabilities to Events  4
1.5 Outline of this Book  6

2 Basic Probability Relations  9
2.1 Random Events and Variables  9
2.2 Probability Axioms and Theorems  10
2.2.1 Axioms  10
2.2.2 Conditional Probability, Independence, and Bayes' Theorem  11

3 Probability Distributions and their Properties  15
3.1 Definition of Probability Distributions  16
3.1.1 Discrete Distributions  16
3.1.2 Continuous Distributions  16
3.1.3 Empirical Distributions  20
3.2 Expected Values  20
3.2.1 Definition and Properties of the Expected Value  21
3.2.2 Mean Value  22
3.2.3 Variance  23
3.2.4 Skewness  26
3.2.5 Kurtosis (Excess)  26
3.2.6 Discussion  27
3.2.7 Examples  28
3.3 Moments and Characteristic Functions  32
3.3.1 Moments  32
3.3.2 Characteristic Function  33
3.3.3 Examples  36
3.4 Transformation of Variables  38
3.4.1 Calculation of the Transformed Density  39
3.4.2 Determination of the Transformation Relating two Distributions  41
3.5 Multivariate Probability Densities  42
3.5.1 Probability Density of two Variables  43
3.5.2 Moments  44
3.5.3 Transformation of Variables  46
3.5.4 Reduction of the Number of Variables  47
3.5.5 Determination of the Transformation between two Distributions  50
3.5.6 Distributions of more than two Variables  51
3.5.7 Independent, Identically Distributed Variables  52
3.5.8 Angular Distributions  53
3.6 Some Important Distributions  55
3.6.1 The Binomial Distribution  55
3.6.2 The Multinomial Distribution  58
3.6.3 The Poisson Distribution  58
3.6.4 The Uniform Distribution  65
3.6.5 The Normal Distribution  65
3.6.6 The Exponential Distribution  69
3.6.7 The χ² Distribution  70
3.6.8 The Gamma Distribution  72
3.6.9 The Lorentz and the Cauchy Distributions  74
3.6.10 The Log-normal Distribution  75
3.6.11 Student's t Distribution  75
3.6.12 The Extreme Value Distributions  77

4 Measurement Errors  81
4.1 General Considerations  81
4.1.1 Importance of Error Assignments  81
4.1.2 Verification of Assigned Errors  82
4.1.3 The Declaration of Errors  82
4.1.4 Definition of Measurement and its Error  83
4.2 Different Types of Measurement Uncertainty  84
4.2.1 Statistical Errors  84
4.2.2 Systematic Errors (G. Bohm)  88
4.2.3 Systematic Errors (G. Zech)  90
4.2.4 Controversial Examples  94
4.3 Linear Propagation of Errors  94
4.3.1 Error Propagation  94
4.3.2 Error of a Function of Several Measured Quantities  95
4.3.3 Averaging Uncorrelated Measurements  98
4.3.4 Averaging Correlated Measurements  98
4.3.5 Several Functions of Several Measured Quantities  100
4.3.6 Examples  101
4.4 Biased Measurements  103
4.5 Confidence Intervals  104

5 Monte Carlo Simulation  107
5.1 Introduction  107
5.2 Generation of Statistical Distributions  109
5.2.1 Computer Generated Pseudo Random Numbers  109
5.2.2 Generation of Distributions by Variable Transformation  110
5.2.3 Simple Rejection Sampling  115
5.2.4 Importance Sampling  116
5.2.5 Treatment of Additive Probability Densities  119
5.2.6 Weighting Events  120
5.2.7 Markov Chain Monte Carlo  120
5.3 Solution of Integrals  123
5.3.1 Simple Random Selection Method  123
5.3.2 Improved Selection Method  126
5.3.3 Weighting Method  127
5.3.4 Reduction to Expected Values  128
5.3.5 Stratified Sampling  129
5.4 General Remarks  129

6 Parameter Inference I  131
6.1 Introduction  131
6.2 Inference with Given Prior  133
6.2.1 Discrete Hypotheses  133
6.2.2 Continuous Parameters  135
6.3 Definition and Visualization of the Likelihood  137
6.4 The Likelihood Ratio  140
6.5 The Maximum Likelihood Method for Parameter Inference  142
6.5.1 The Recipe for a Single Parameter  143
6.5.2 Examples  144
6.5.3 Likelihood Inference for Several Parameters  148
6.5.4 Combining Measurements  151
6.5.5 Normally Distributed Variates and χ²  151
6.5.6 Likelihood of Histograms  152
6.5.7 Extended Likelihood  154
6.5.8 Complicated Likelihood Functions  155
6.5.9 Comparison of Observations with a Monte Carlo Simulation  155
6.5.10 Parameter Estimate of a Signal Contaminated by Background  160
6.6 Inclusion of Constraints  163
6.6.1 Introduction  163
6.6.2 Eliminating Redundant Parameters  164
6.6.3 Gaussian Approximation of Constraints  166
6.6.4 The Method of Lagrange Multipliers  167
6.6.5 Conclusion  168
6.7 Reduction of the Number of Variates  168
6.7.1 The Problem  168
6.7.2 Two Variables and a Single Linear Parameter  169
6.7.3 Generalization to Several Variables and Parameters  169
6.7.4 Non-linear Parameters  171
6.8 Method of Approximated Likelihood Estimator  171
6.9 Nuisance Parameters  174
6.9.1 Nuisance Parameters with Given Prior  175
6.9.2 Factorizing the Likelihood Function  176
6.9.3 Parameter Transformation, Restructuring  177
6.9.4 Profile Likelihood  179
6.9.5 Integrating out the Nuisance Parameter  181
6.9.6 Explicit Declaration of the Parameter Dependence  181
6.9.7 Advice  181

7 Parameter Inference II  183
7.1 Likelihood and Information  183
7.1.1 Sufficiency  183
7.1.2 The Conditionality Principle  185
7.1.3 The Likelihood Principle  186
7.1.4 Bias of Maximum Likelihood Results  187
7.1.5 Stopping Rules  190
7.2 Further Methods of Parameter Inference  191
7.2.1 The Moments Method  191
7.2.2 The Least Square Method  195
7.2.3 Linear Regression  198
7.3 Comparison of Estimation Methods  199

8 Interval Estimation  201
8.1 Introduction  201
8.2 Error Intervals  202
8.2.1 Parabolic Approximation  203
8.2.2 General Situation  204
8.3 Error Propagation  205
8.3.1 Averaging Measurements  205
8.3.2 Approximating the Likelihood Function  208
8.3.3 Incompatible Measurements  209
8.3.4 Error Propagation for a Scalar Function of a Single Parameter  210
8.3.5 Error Propagation for a Function of Several Parameters  210
8.4 One-sided Confidence Limits  214
8.4.1 General Case  214
8.4.2 Upper Poisson Limits, Simple Case  215
8.4.3 Poisson Limit for Data with Background  216
8.4.4 Unphysical Parameter Values  219
8.5 Summary  219

9 Deconvolution  221
9.1 Introduction  221
9.1.1 The Problem  221
9.1.2 Deconvolution by Matrix Inversion  224
9.1.3 The Transfer Matrix  226
9.1.4 Regularization Methods  226
9.2 Deconvolution of Histograms  227
9.2.1 Fitting the Bin Content  227
9.2.2 Iterative Deconvolution  231
9.2.3 Regularization of the Transfer Matrix  232
9.3 Binning-free Methods  234
9.3.1 Iterative Deconvolution  234
9.3.2 The Satellite Method  235
9.3.3 The Maximum Likelihood Method  237
9.4 Comparison of the Methods  239
9.5 Error Estimation for the Deconvoluted Distribution  241

10 Hypothesis Tests  245
10.1 Introduction  245
10.2 Some Definitions  246
10.2.1 Single and Composite Hypotheses  246
10.2.2 Test Statistic, Critical Region and Significance Level  246
10.2.3 Errors of the First and Second Kind, Power of a Test  247
10.2.4 P-Values  248
10.2.5 Consistency and Bias of Tests  248
10.3 Goodness-of-Fit Tests  250
10.3.1 General Remarks  250
10.3.2 P-Values  252
10.3.3 The χ² Test in Generalized Form  254
10.3.4 The Likelihood Ratio Test  261
10.3.5 The Kolmogorov-Smirnov Test  263
10.3.6 Tests of the Kolmogorov-Smirnov and Cramer-von Mises Families  265
10.3.7 Neyman's Smooth Test  266
10.3.8 The L2 Test  268
10.3.9 Comparing a Data Sample to a Monte Carlo Sample and the Metric  269
10.3.10 The k-Nearest Neighbor Test  270
10.3.11 The Energy Test  270
10.3.12 Tests Designed for Specific Problems  272
10.3.13 Comparison of Tests  273
10.4 Two-Sample Tests  275
10.4.1 The Problem  275
10.4.2 The χ² Test  276
10.4.3 The Likelihood Ratio Test  276
10.4.4 The Kolmogorov-Smirnov Test  277
10.4.5 The Energy Test  277
10.4.6 The k-Nearest Neighbor Test  278
10.5 Significance of Signals  279
10.5.1 Introduction  279
10.5.2 The Likelihood Ratio Test  281
10.5.3 Tests Based on the Signal Strength  286

11 Statistical Learning  289
11.1 Introduction  289
11.2 Smoothing of Measurements and Approximation by Analytic Functions  291
11.2.1 Smoothing Methods  292
11.2.2 Approximation by Orthogonal Functions  294
11.2.3 Wavelets  298
11.2.4 Spline Approximation  300
11.2.5 Approximation by a Combination of Simple Functions  302
11.2.6 Example  302
11.3 Linear Factor Analysis and Principal Components  303
11.4 Classification  309
11.4.1 The Discriminant Analysis  311
11.4.2 Artificial Neural Networks  312
11.4.3 Weighting Methods  319
11.4.4 Decision Trees  322
11.4.5 Bagging and Random Forest  325
11.4.6 Comparison of the Methods  326

12 Auxiliary Methods  329
12.1 Probability Density Estimation  329
12.1.1 Introduction  329
12.1.2 Fixed Interval Methods  330
12.1.3 Fixed Number and Fixed Volume Methods  333
12.1.4 Kernel Methods  333
12.1.5 Problems and Discussion  334
12.2 Resampling Techniques  336
12.2.1 Introduction  336
12.2.2 Definition of Bootstrap and Simple Examples  337
12.2.3 Precision of the Error Estimate  339
12.2.4 Confidence Limits  339
12.2.5 Precision of Classifiers  340
12.2.6 Random Permutations  340
12.2.7 Jackknife and Bias Correction  340

13 Appendix  343
13.1 Large Number Theorems  343
13.1.1 The Chebyshev Inequality and the Law of Large Numbers  343
13.1.2 The Central Limit Theorem  344
13.2 Consistency, Bias and Efficiency of Estimators  345
13.2.1 Consistency  345
13.2.2 Bias of Estimates  345
13.2.3 Efficiency  346
13.3 Properties of the Maximum Likelihood Estimator  347
13.3.1 Consistency  347
13.3.2 Efficiency  348
13.3.3 Asymptotic Form of the Likelihood Function  349
13.3.4 Properties of the Maximum Likelihood Estimate for Small Samples  350
13.4 Error of Background-Contaminated Parameter Estimates  350
13.5 Frequentist Confidence Intervals  353
13.6 Comparison of Different Inference Methods  355
13.6.1 A Few Examples  355
13.6.2 The Frequentist Approach  358
13.6.3 The Bayesian Approach  358
13.6.4 The Likelihood Ratio Approach  359
13.6.5 Conclusion  359
13.7 P-Values for EDF-Statistics  359
13.8 Comparison of two Histograms, Goodness-of-Fit and Parameter Estimation  361
13.8.1 Comparison of two Poisson Numbers with Different