
Chapter 6

Putting Statistics to Work: the Normal Distribution

Productive inference from sample to population requires that the appropriate statistic be used to characterize various probabilities associated with the distributions of interest. As we may hypothetically have an infinite number of means as well as an infinite number of standard deviations that describe potential distributions, we therefore have a problem to solve in that we do not have an infinite number of statistical procedures to deal with every possible distribution. Does this mean that we can't use statistics to analyze a vast majority of our data? Thankfully, no. While each distribution is unique, most distributions can be grouped with other distributions that share important characteristics. These groups of similar distributions can be further characterized by an 'ideal' (i.e., theoretical) distribution that typifies the important characteristics. Statistics applicable to the entire group of similar distributions can then be developed based upon our knowledge of the ideal distribution. Perhaps the most important ideal distribution used is the 'normal' distribution (Figure 6.1). Once one understands the characteristics of the normal distribution, knowledge of other distributions is easily obtained.

Figure 6.1. The normal distribution.

Most people are familiar with the normal distribution described as a "bell-shaped curve," perhaps as a scale for grading. The bell-shaped curve is nothing but a special case of the normal distribution; the words "bell-shaped" describe the general shape of the distribution, and the word "curve" is used as a synonym for distribution. While we generally refer to the normal distribution, there are really many different normal distributions. In fact, there are as many different normal distributions as there are possible means and standard deviations, both theoretically and in the real world. However, all of these normal distributions share five characteristics.

1. Symmetry. If divided into left and right halves, each half is a mirror image of the other.

2. The maximum height of the distribution is at the mean. One of the consequences of this stipulation and number 1 above is that the mean, the mode and the median have identical values.

3. The area under a normal distribution sums to unity. This characteristic is simpler than it sounds but is important because of how we use the normal distribution. Areas within the theoretical distribution as a geometric form represent probabilities of events that range from 0 to 1 (i.e., 0% to 100%). The phrase 'sums to unity' means that all of the probabilities represented by the area under the normal distribution sum to 1 and thus represent all possible outcomes. Each half of the symmetrical distribution, of which the mean is the center, represents half (.5) of the probabilities.

4. Normal distributions are theoretically asymptotic at both ends, or tails, of the distribution. If we were to follow a point along the slope of the curve toward the tail to infinity, the point would incrementally become ever closer to zero without ever quite reaching it. This aspect of the normal distribution is necessary because we need to consider every possible variate to infinity. Put another way, every single possible variate can be assigned some probability of occurring, even if it is astronomically small.

5. The distribution of means of multiple samples from a normal distribution will have a tendency to be normally distributed. Considering this commonality among normal distributions requires thinking about means somewhat differently. As you know, means characterize groupings of variates. In this special context we need to consider calculating individual means on repeated samples, and plotting these means as variates that collectively create a new distribution that is composed of means. Accordingly, this new distribution has a tendency to be normally distributed. This issue will be further discussed in Chapter 7.

With these commonalities in mind, let us further consider some of the differences among normal distributions. First, as Figures 3.17, 3.18 and 3.19 show, normal distributions may be conceptualized as leptokurtic, platykurtic, or mesokurtic. Additionally, any combination of means and standard deviations is possible, and there is no necessary relationship between the mean and the standard deviation for any given distribution. Normal distributions may have different means and the same standard deviation (Figure 6.2) or the same mean and different standard deviations (Figure 6.3).
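Characteristics 3 through 5 can be checked numerically. The sketch below is my own illustration in Python, not part of the text: it first integrates the normal density with a simple Riemann sum to confirm the area sums to unity, then simulates the distribution of sample means described in characteristic 5 (borrowing, purely for concreteness, the Thomomys bottae figures used later in the chapter).

```python
import math
import random
import statistics

def normal_pdf(y, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and s.d. sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Characteristic 3: the area under the curve sums to unity.  A Riemann sum
# over mu +/- 8 sigma captures essentially all of the area, since the
# asymptotic tails (characteristic 4) contribute almost nothing.
step = 0.001
total_area = sum(normal_pdf(-8.0 + i * step) * step for i in range(int(16.0 / step)))
print(round(total_area, 4))  # very close to 1.0

# Characteristic 5: means of repeated samples are themselves (approximately)
# normally distributed, centered on mu with a much smaller spread.
random.seed(42)  # reproducible illustration
mu, sigma, n = 5.7, 0.48, 54
sample_means = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(2000)
]
print(round(statistics.mean(sample_means), 2))   # close to mu = 5.7
print(round(statistics.stdev(sample_means), 3))  # close to sigma / sqrt(n), about 0.065
```

The spread of the distribution of means shrinks as the sample size grows, a point taken up again in Chapter 7.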

Figure 6.2. Two normal distributions with different means and the same standard deviation.

Figure 6.3. Two normal distributions with the same mean and different standard deviations.

If σ is large, variates are generally far from the mean. If σ is small, most variates are relatively close to the mean. Regardless of the standard deviation, variates near the mean in a normal distribution are more common, and therefore more probable, than variates in the tails of the distribution. One of the most useful aspects of normal distributions is that regardless of the value of σ or µ (Figure 6.4):

µ ± 1σ contains 68.26% of all variates

µ ± 2σ contains 95.44% of all variates

µ ± 3σ contains 99.74% of all variates
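These percentages can be recovered from the cumulative distribution function of the standard normal. A quick check (my own illustration, using Python's standard library):

```python
from statistics import NormalDist

z_std = NormalDist()  # standard normal: mu = 0, sigma = 1

# Area between mu - k*sigma and mu + k*sigma for k = 1, 2, 3.
coverage = {k: z_std.cdf(k) - z_std.cdf(-k) for k in (1, 2, 3)}
for k, area in coverage.items():
    print(k, round(100 * area, 2))
# 68.27, 95.45, 99.73 -- the same quantities the chapter quotes,
# rounded slightly differently.
```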

Figure 6.4. Percentages of variates within 1, 2, and 3 standard deviations from µ.

It is also possible to express this relationship in terms of more commonly used percentages. For example:

50% of all items fall between µ ± .674σ

95% of all items fall between µ ± 1.96σ

99% of all items fall between µ ± 2.58σ
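Going in this direction means inverting the relationship: given a central percentage, find the multiplier of σ. A sketch of how one might verify these figures with the inverse CDF (again my own illustration, not from the text):

```python
from statistics import NormalDist

z_std = NormalDist()

# For a central probability p, the multiplier k in "mu +/- k*sigma" is the
# z value that leaves (1 - p) / 2 in each tail, i.e. inv_cdf((1 + p) / 2).
multipliers = {p: z_std.inv_cdf((1 + p) / 2) for p in (0.50, 0.95, 0.99)}
for p, k in multipliers.items():
    print(p, round(k, 3))
# about 0.674, 1.96, and 2.576 (which the chapter rounds up to 2.58)
```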

If µ ± 1σ contains 68.26% of all variates, µ ± 2σ contains 95.44% of all variates, and µ ± 3σ contains 99.74% of all variates (Figure 6.4), we know that any value beyond µ ± 2σ is a rare event, expected less than 5 times in 100, and a value beyond µ ± 3σ is rarer still, expected less than 1 time in 100. This characteristic of the normal distribution allows us to consider the probability of individual variates occurring within a geometric space under the distribution. As the probability space (i.e., the sum of the area of probability we are considering) under the normal distribution = 1.0, we know that the percentages mentioned above may be converted to probabilities. When we consider the relationship between a distribution and an individual variate of that distribution, we know that the probability is .6826 that the variate is within µ ± 1σ; .9544 that the variate is within µ ± 2σ; and .9974 that the variate is within µ ± 3σ (Figure 6.5).

Figure 6.5. Standard deviations as areas under the normal curve expressed as probabilities.

The probabilities illustrated in Figure 6.5 are unchanging for all normal distributions regardless of their means or standard deviations. Furthermore, probabilities may be calculated for any area under the curve. For example, we might be interested in the area between two points on the axis, or between one point and the mean, or between one point and infinity. These areas under the curve do vary depending on the location and the shape of the distribution as described by the mean and the standard deviation. In other words, there are as many relationships between any individual variate and the probabilities associated with normal distributions as there are different possible means and standard deviations. All are infinite in number.

In order to most effectively use the normal distribution to generate probabilities, statisticians have created the standard normal distribution. The standard normal distribution has, by definition, µ = 0 and σ = 1. Rather than calculate probabilities of areas under the curve for every possible mean and standard deviation, it is easiest to convert any distribution to the standard normal. This transformation occurs through the calculation of z, where:

Formula 6.1:

z = (Yᵢ − µ) / σ

The calculation of z establishes the difference between any variate and the mean (Yᵢ − µ), and expresses that difference in standard deviation units (by dividing by σ). In other words, the product of the formula, called a z-score, is how many standard deviations Yᵢ is from µ in the standard normal distribution. Appendix A is a table of areas under the curve of the standard normal distribution. Once we have a z-score, it is possible to use Appendix A to determine the exact probabilities under the curve. To illustrate this point, let us consider the following example.

Donald K. Grayson, in his analysis of the microfauna from Hidden Cave, Nevada, notes that only one species of pocket gopher, Thomomys bottae, occurs in the area today, although it is possible for other species to have been represented in the past. Grayson (1985:144) presents the following descriptive statistics on mandibular alveolar lengths in mm for modern Thomomys bottae: Ȳ = 5.7, s = .48, and n = 54. Specimen number HC-215 has a value Yᵢ = 6.4. What is the probability of obtaining a value between the mean, Ȳ = 5.7, and Yᵢ = 6.4 (Figure 6.6)?

Figure 6.6. Illustration of the relationship between the sample mean and the variate of 6.4.

Since we do not have the population parameters, we substitute the sample values for the mean and standard deviation.

z = (Yᵢ − Ȳ) / s = (6.4 − 5.7) / .48

z = 1.46
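The arithmetic of Formula 6.1 can also be expressed directly. A minimal sketch (the function name `z_score` is my own, not anything from the chapter):

```python
def z_score(y_i, mean, sd):
    """Formula 6.1: how many standard deviation units y_i lies from the mean."""
    return (y_i - mean) / sd

# Grayson's modern Thomomys bottae sample: mean 5.7 mm, s = .48.
# Specimen HC-215 has an alveolar length of 6.4 mm.
z = z_score(6.4, 5.7, 0.48)
print(round(z, 2))  # 1.46
```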

This value for z tells us that 6.4 is 1.46 standard deviations from the mean. Is this a common or rare event? We know in general it is common, as the value lies between one and two standard deviations from the mean. Yet, we might be interested in the exact probability. These probabilities for areas under the standard normal distribution can be found in Appendix A. Values expressed in Appendix A are probabilities in the area between z and the mean. To find the probability for 1.46 standard deviation units, look down the left side of the table until the value 1.4 is located. Follow this row until it intersects with the column value for .06. At that intersection is the value .4279, which represents the probability of a variate falling between the mean and z = 1.46. A value in that interval is therefore a common event.

In addition to determining the probability found in the above example, we can also find the probability of having a value greater than, and less than, 6.4 (z = 1.46). Since we know that the total probability represented in the curve is equal to 1.0, and that .50 lies on each side of the mean, we can determine that .5 + .4279 = .9279 equals the probability of a value less than z = 1.46, and 1 − .9279 = .0721 represents the probability of a value greater than z = 1.46. We could then conclude that a value larger than z = 1.46 would approach being a rare event, something that we would expect approximately only 7 times out of 100.

The above example illustrates finding probabilities based on areas under the normal curve. It should be noted that we cannot determine exact values that represent points on the line, because points are infinitesimally small. To illustrate this, we did not determine above the probability of a value z = 1.46; only values greater or lesser than this value, or the probability of a value between z = 1.46 and the mean. The probability of the point z = 1.46 cannot be measured. If it is absolutely necessary to find the area that closely relates to 1.46, one should look for the area under the curve between 1.455 and 1.465.
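The same areas can be computed with the standard normal CDF instead of the table in Appendix A. A sketch using Python's standard library (my illustration, not the chapter's method):

```python
from statistics import NormalDist

z_std = NormalDist()
z = 1.46  # specimen HC-215, from the worked example above

between_mean_and_z = z_std.cdf(z) - 0.5  # the Appendix A tabled value
below_z = 0.5 + between_mean_and_z       # probability of a smaller value
above_z = 1 - below_z                    # the "rare" upper tail

print(round(between_mean_and_z, 4))  # about .4279
print(round(below_z, 4))             # about .9279
print(round(above_z, 4))             # about .0721
```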
Note that Appendix A only presents values for areas where Yᵢ is greater than the mean. What happens if Yᵢ is less than the mean? Another example will serve to illustrate this point. What is the probability of an alveolar length between 5.3 mm and 6.8 mm (Figure 6.7)?

Figure 6.7. Illustration of the relationship between the sample mean and the variates 5.3 and 6.8.

We can illustrate this probability in the following way:

Pr{5.3 < Yᵢ < 6.8}

Pr{(Y₁ − Ȳ)/s < z < (Y₂ − Ȳ)/s}

Pr{(5.3 − 5.7)/.48 < z < (6.8 − 5.7)/.48}

Pr{−.83 < z < 2.29}

Since the normal curve is symmetrical, it is possible to ignore the negative sign on −.83 and use Appendix A to find the area between this value and the mean. The tabled value for .83 is .2967. The tabled value for 2.29 is .4890. Since we are interested in the area under the curve between −.83 and 2.29, we can sum the two individual probabilities to determine that

Pr{−.83 < z < 2.29} = .2967 + .4890 = .7857.
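A CDF-based check of this two-sided probability (again my own sketch, using the standard library rather than Appendix A):

```python
from statistics import NormalDist

z_std = NormalDist()

# Pr{-.83 < z < 2.29}: the CDF handles negative z directly, so the
# symmetry argument needed with the printed table is unnecessary here.
prob = z_std.cdf(2.29) - z_std.cdf(-0.83)
print(round(prob, 4))  # about .7857

# Equivalently, sum the two tabled areas on either side of the mean:
left = 0.5 - z_std.cdf(-0.83)   # about .2967
right = z_std.cdf(2.29) - 0.5   # about .4890
print(round(left + right, 4))   # same result
```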

You will note that we used a new kind of notation in the preceding example. Unlike many of the symbols previously discussed, this notation does not provide instructions for computation. Instead, it describes the problem we wish to solve. Pr is the symbol indicating we are determining a probability. The area inside the brackets {} is called the probability space. It indicates exactly what probability we wish to find: in this case, the probability of a variate with a value between 5.3 and 6.8. While it may seem tedious, it is important that you explicitly write and draw a sketch of your probability space. It is a useful and easy way to keep track of the probability space you are after while ensuring you do not make a simple mistake.

The normal distribution is incredibly useful for a number of reasons. For example, we may now conclude that specimen HC-215 from Hidden Cave does not differ in a significant manner from the modern population of Thomomys bottae. If it did, that might have led us to suggest that another species of Thomomys was present at Hidden Cave in the past - a conclusion of considerable paleoenvironmental and archaeological significance. This and similar uses fall under hypothesis testing, the subject of the next chapter.

References Cited

Grayson, D.K. 1985. The paleontology of Hidden Cave: Birds and mammals. In The Archaeology of Hidden Cave, Nevada, edited by D. H. Thomas, pp. 125-161. American Museum of Natural History Anthropological Papers 66(1).