
[PDF] Statistics Intermediate Normal Distribution and Standard Scores 255_6lecture3_intermediate.pdf

Statistics Intermediate

Normal Distribution and Standard Scores

Session 3

Oscar BARRERA

oscardavid.barrerarodriguez@sciencespo.fr

February 6, 2018

Previous lectureStandard Scores

Grades' distribution

1. The best (N-1) scores on problem sets 30%, the midterm 30%, the final exam 30%, and participation 10%.
2. Three controls 20% (each), participation 10%, final exam 30%.
3. Two controls 20% (each), problem sets 20%, participation 10%, final exam 30%.

Oscar BARRERAStatistics Intermediate

Outline

1. Previous Lecture
   - Central tendency
   - Variability
2. Standard Scores
   - Definition
   - Characteristics
   - Normal Distribution

Measures of Central Tendency

In a normal distribution the most frequent scores cluster near the center and less frequent scores fall into the tails.

Central tendency means most scores (about 68%) in a normally distributed set of data tend to cluster in the central area, within one standard deviation of the mean. [Come back to characteristics!]

Central tendency and dispersion

There are at least three characteristics you look for in a descriptive statistic to represent a set of data.

1. Representative: A good descriptive statistic should be similar to many scores in a distribution (high frequency).
2. Well balanced: Neither greater-than nor less-than scores are overrepresented.
3. Inclusive: It should take individual values from the distribution into account so no value is left out.

Central tendency measures

- The mode (Mo) is the most frequently occurring score in a distribution.
- The median (Md) is the middle score of a distribution.
- Quartiles
- Mean

Dispersion measures

- Range
- Sum of squares
- Variance
- Standard deviation

Central tendency measures: The mode

The mode (Mo) is the most frequently occurring score in a distribution.

Limitations

- Multi-modal: There can be more than one mode.
- Lack of representativeness: It may not be a good representative of all values.

Central tendency measures: The median

The median (Md) is the middle score of a distribution: half of the scores lie to its left and half to its right (the 50th percentile). It is a better measure of central tendency than the mode since it balances the distribution perfectly.

How to find it? Two simple steps:

1. Determine the median's location.
2. Find the value at that location.

The procedure differs depending on whether you have an even or an odd number of scores.

Limitation: The median does not take into account the actual values of the scores in a set of data.
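The two steps can be sketched in Python. This is a minimal illustration and not part of the original slides: the function name `median_of` and the sample scores are assumptions, and the location rule (the middle position for an odd number of scores, the average of the two middle scores for an even number) is the usual textbook convention.

```python
def median_of(scores):
    """Return the median by locating the middle position of the sorted scores."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: one score sits exactly in the middle.
        return ordered[middle]
    # Even number of scores: the location falls between two scores,
    # so the median is the average of those two.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_scores = [7, 3, 9, 5, 1]        # hypothetical data, n = 5
even_scores = [7, 3, 9, 5, 1, 11]   # hypothetical data, n = 6

print(median_of(odd_scores))    # middle score of 1, 3, 5, 7, 9
print(median_of(even_scores))   # average of the two middle scores, 5 and 7
```

Note that only the ordering of the scores matters here: replacing 11 by 1100 would leave the median unchanged, which is the limitation mentioned above.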

Central tendency measures: Quartiles

Quartiles split a distribution into fourths or quarters of the distribution. There are actually three quartile points in a distribution:

- The 1st quartile (Q1) separates the lower 25% of the scores from the upper 75% of the scores.
- The 2nd quartile (Q2) is the median and separates the lower 50% of the scores from the upper 50%.
- The 3rd quartile (Q3) separates the upper 25% of the scores from the lower 75%.

Steps: Determining the quartile value is like determining the median.

Central tendency measures: Understanding Sigma

The symbol $\Sigma$ in maths means summation. $\Sigma$ means to add all values to the right of $\Sigma$ (say, variable X). $\Sigma X$ means to sum up all values that belong to variable X.

IMPORTANT!!

$\Sigma$ is a grouping symbol like a set of parentheses, and everything to the right of $\Sigma$ must be completed before summing the resulting values.
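The grouping rule is easiest to see by comparing the sum of squared scores with the square of the summed scores. The snippet below is a small illustrative sketch; the variable names and the four scores are assumptions, not taken from the slides.

```python
x = [1, 2, 3, 4]  # hypothetical scores for variable X

# Sigma X: add the raw scores.
sum_of_x = sum(x)

# Sigma X^2: everything to the right of Sigma is done first,
# so each score is squared before the squares are summed.
sum_of_squared_x = sum(value ** 2 for value in x)

# (Sigma X)^2: the parentheses force the summation first, then the square.
square_of_sum = sum(x) ** 2

print(sum_of_x, sum_of_squared_x, square_of_sum)
```

The last two quantities differ, which is why the order of operations around the summation sign matters.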

Central tendency measures: The mean

The mean is the arithmetic average of all the scores in a distribution. It is the most often used measure of central tendency.

1. It evenly balances a distribution, so both the large and small values are equally represented.
2. It takes into account all individual values.

To estimate it, in two steps:

1. Add together all the scores in a distribution ($\sum X$).
2. Divide that sum by the total number of scores in the distribution.

Sample: $\bar{X} = M = \frac{\sum X}{n}$

Population: $\mu = \frac{\sum X}{N}$

Mean: Limitations

One problem: the mean takes individual values into account. Because of this, it can be influenced by extremely large or extremely small values (outliers). Specifically, extremely small values pull the mean down and extremely large values pull the mean up.

This only occurs in skewed distributions, where it is better to use the median as a measure of central tendency (Why?).
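A quick numerical check makes the effect of an outlier concrete. The sketch below uses Python's standard `statistics` module; the data values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

values = [20, 22, 25, 27, 30]          # hypothetical scores, no outlier
values_with_outlier = values + [300]   # same scores plus one extreme value

# Without the outlier the mean and the median are close together.
print(mean(values), median(values))

# The single extreme value pulls the mean up sharply,
# while the median barely moves.
print(mean(values_with_outlier), median(values_with_outlier))
```

The median only depends on the ordering of the scores, so one extreme value shifts it by at most one position, whereas the mean absorbs the full size of the outlier.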

What to use, when?

- Nominal data: the mode is the best measure of central tendency (e.g., sex).
- Ordinal scale: the median may be more appropriate, because the individual values on an ordinal scale are meaningless.
- Interval or ratio scale: the mean is generally preferred (it counts all observations).

The Mean, Median, and Mode in Normal and Skewed Distributions

The positions of the mean, median, and mode are affected by whether a distribution is normally distributed or skewed.

In normally distributed data: mean = median = mode.

- 50% of the scores must lie above the center point, which is also the mode, and the other 50% of the scores must lie below.
- Because the distribution is perfectly symmetrical, the differences between the mean and the values larger than the mean must cancel out the differences between the mean and the values smaller than the mean.

Negatively skewed: the tail of the distribution is to the left and the hump is to the right.

Positively skewed: the tail of the distribution is to the right and the hump is to the left.

So if the mean differs from the median/mode, the distribution is skewed. The median is better as the measure of central tendency when the distribution is positively or negatively skewed.

Best example: income.

Box Plots

A box plot is a way to present the dispersion of scores in a distribution by using five pieces of statistical information: the minimum value, the first quartile, the median, the third quartile, and the maximum value.
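The five values behind a box plot can be computed directly. The following is a minimal sketch using `statistics.quantiles` from the Python standard library (Python 3.8 or later); the function name `five_number_summary` and the data are assumptions. Quartile conventions differ between textbooks and software, so Q1 and Q3 may not match a hand calculation exactly.

```python
from statistics import quantiles


def five_number_summary(scores):
    """Return (minimum, Q1, median, Q3, maximum) for a list of scores."""
    ordered = sorted(scores)
    # method="inclusive" treats the data as the complete set of scores;
    # other quartile conventions can give slightly different Q1 and Q3.
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


scores = [9, 1, 5, 3, 7, 2, 8, 4, 6]  # hypothetical data
print(five_number_summary(scores))
```

The box of the plot spans Q1 to Q3 with a line at the median, and the whiskers extend toward the minimum and maximum.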

Variability

Variability is simply the differences among items, which could be differences in eye color, hair color, height, weight, sex, intelligence, etc. There are several measures of variability. I will go from the easiest to the more complex ones:

- The range is the largest value minus the smallest value.
- The sum of squares is the sum of the squared deviation scores from the mean and measures the summed or total variation in a set of data.
- The variance is the average sum of squares.
- The standard deviation is the square root of the variance.

The range

The range is the largest value minus the smallest value. It provides information about the area a distribution covers, but does not say anything about individual values.

Sum of squares

The sum of squares is the sum of the squared deviation scores from the mean and measures the summed or total variation in a set of data.

$SS = \sum (X - \bar{X})^2$

- It could be equal to zero if there is no variability and all the scores are equal.
- It can't be negative (mathematically impossible).

Limitations

- The sum of squares measures the total variation among scores in a distribution; it does not measure average variability.
- We want a measure of variability that takes into account both the variation of the scores and the number of scores in a distribution.

Variance

The sample variance $S^2$ is the average sum of the squared deviation scores from a mean. It measures the average variability among scores in a distribution.

$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{SS}{n}$

The one issue with sample variance is that it is in squared units, not the original unit of measurement.

Standard Deviation

Once the sample variance has been calculated, calculating the sample standard deviation (S) is as simple as taking the square root of the variance.

$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{SS}{n}} = \sqrt{S^2}$

The standard deviation measures the average deviation between a score and the mean of a distribution.
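The chain from sum of squares to variance to standard deviation can be traced in a few lines of Python. This sketch follows the formulas on the slides, which divide by n; many textbooks divide by n - 1 when estimating a population variance from a sample. The function names and data are assumptions made for illustration.

```python
from math import sqrt


def sum_of_squares(scores):
    """SS: sum of the squared deviations of each score from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)


def variance(scores):
    """S^2 = SS / n, as on the slides (dividing by n, not n - 1)."""
    return sum_of_squares(scores) / len(scores)


def standard_deviation(scores):
    """S: the square root of the variance, back in the original units."""
    return sqrt(variance(scores))


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5

print(sum_of_squares(scores))      # total variation
print(variance(scores))            # average variation, in squared units
print(standard_deviation(scores))  # average deviation, in original units
```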

Comparing Scores across Distributions

To understand why comparing across distributions is problematic, let's think about an example.

- You have a friend, say Dennis, who thinks he is smarter than everybody.
- Both you and Dennis take Statistics in the same semester but with different professors.
- You have a test in that course on the same day, and the content of the test is exactly the same.
- On the test, Dennis obtains 18 out of 20 points and you obtain 15 out of 20 points.

Of course, Dennis starts bragging and boasting about his statistics intelligence. But who really did better on this test?

Say each test was on the same material and included the same content, but the make-up of each test was slightly different (e.g., more multiple choice questions for one section, slightly differently worded questions).

So, let's assume that your class' mean test grade was M = 13 points and Dennis' class mean was M = 16 points.

Now, who did better?

It seems you and Dennis performed equally, because your scores were each 2 points above your respective class mean.

However, because the means of each class are different, this suggests the underlying distributions in each class are different. A 2 point difference in one class may be different than a 2 point difference in the other.

So, who did better, you or Dennis?

What you need is a standard measure. Fortunately, we already know of such a measure: the standard deviation.

Remember:

- The standard deviation is the average difference between a score (X) and the mean (M) of a distribution.
- Although the value of the standard deviation changes depending on the data in the distribution, a standard deviation means the same thing in any distribution.

When you measure the number of standard deviations between your test score and the mean of your class, you can compare that difference, which is in numbers of standard deviations, across classes. But how do you compute your score in standard deviations?

The measure of the number of standard deviations between any raw score and the mean of a distribution is a standard score, or z-score.

Standard (z) Scores

A standard score (z-score) measures the number of standard deviations between a score (X) and the mean ($\bar{X}$ or $\mu$).

In the population: $z = \frac{X - \mu}{\sigma}$

In a sample: $z = \frac{X - \bar{X}}{s}$

Let's go back to you and Dennis.

- Assume the standard deviation for your class is s = 1 and the standard deviation for Dennis' class is s = 2.
- Remember, your score was X = 15 and Dennis' was X = 18.
- Your class' mean was M = 13 and Dennis' class mean was M = 16.

So, we have:

$Z_{you} = \frac{15 - 13}{1} = 2$ and $Z_{dennis} = \frac{18 - 16}{2} = 1$

Because most scores in your class were similar to the mean, deviations from your class' mean are unlikely, but because there is higher variability in Dennis' class, deviations from the mean are not surprising.
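The comparison can be reproduced with a one-line function. The helper name `z_score` is an assumption made for this sketch; the scores, means, and standard deviations are the ones given in the example.

```python
def z_score(raw_score, mean, sd):
    """Number of standard deviations between a raw score and the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (raw_score - mean) / sd


# Values from the example: your class has M = 13, s = 1;
# Dennis' class has M = 16, s = 2.
z_you = z_score(15, mean=13, sd=1)
z_dennis = z_score(18, mean=16, sd=2)

print(z_you, z_dennis)
print("You did better" if z_you > z_dennis else "Dennis did better")
```

Both raw scores are 2 points above their class mean, but once each gap is expressed in standard deviations the two performances are no longer equal.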

Of course, Dennis starts bragging and boasting about his statistics intelligence. But who really did better on this test? YOU =)

Characteristics of Standard (z) Scores

1. They can be used to compare across distributions, as was done in the example. Raw scores cannot be compared directly unless the mean and standard deviation are identical in each distribution or the raw scores come from the same distribution.
2. Larger standard scores indicate larger deviations (distances) from the mean.
3. Standard scores are defined as the distance between a score and its mean in standard deviations. You are 2 SD above your M; Dennis is 1 SD above his M.
4. Standard scores can be positive or negative.
5. What would be the mean, variance, and standard deviation of a distribution of z-scores?

What would be the mean, variance, and standard deviation of a distribution of z-scores?

The mean of the z-distribution will be zero, and the variance and standard deviation will be equal to one.
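This property can be checked numerically by standardizing any set of scores. The sketch below uses hypothetical data and the divide-by-n standard deviation from the earlier slides; the variable names are assumptions.

```python
from math import sqrt

scores = [10, 12, 13, 15, 20]  # hypothetical raw scores

n = len(scores)
m = sum(scores) / n
sd = sqrt(sum((x - m) ** 2 for x in scores) / n)  # divide-by-n SD

# Convert every raw score into a z-score.
z_scores = [(x - m) / sd for x in scores]

# Mean and variance of the z-scores themselves.
z_mean = sum(z_scores) / n
z_variance = sum((z - z_mean) ** 2 for z in z_scores) / n

# Up to floating-point rounding, these are 0 and 1.
print(round(z_mean, 10), round(z_variance, 10))
```

Subtracting the mean centers the scores on zero, and dividing by the standard deviation rescales their spread to one, whatever the original units were.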

Standard Scores and the Normal Distribution

Two characteristics of a normal distribution:

1. Mean = Median = Mode. This implies further characteristics, like:
   - a unique hump
   - perfect symmetry
2. Central tendency: the proportion of scores between one standard deviation below and one standard deviation above the mean is approximately 0.6826.


Remember, in a normal distribution most scores lie at the hump in the middle and fewer scores lie in the tails.

Normal distribution: Characteristics

1. The height of the curve reflects the probability of observing that particular score (X). As the distance between a score (X) and the mean increases, the height of the curve and the probability of observing that score decrease.
2. As you move farther and farther away from the mean in standard deviations, or z-scores, the probability of being that far away becomes smaller and smaller.
3. Regardless of the underlying data, as long as it is normally distributed, these proportions/probabilities will be found for any z-score. About 0.68 of the data falls within the range Mean +/- 1 SD.
4. With the mean and standard deviation of a distribution, we can calculate a z-score for any raw score (the graph applies to any data).
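The claim that the same proportions hold for any normal distribution can be verified with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `proportion_within` is an assumption, and the second distribution reuses the class mean and standard deviation from the earlier example purely as an illustration.

```python
from statistics import NormalDist


def proportion_within(dist, k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


standard_normal = NormalDist(mu=0, sigma=1)
class_scores = NormalDist(mu=13, sigma=1)  # mean and SD from the example class

# Both distributions give the same proportion within 1 SD of the mean.
print(round(proportion_within(standard_normal, 1), 4))
print(round(proportion_within(class_scores, 1), 4))
```

The computed value agrees with the tabulated 0.6826 up to rounding in the last digit.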

Probabilities and Proportions in the Normal Distribution

This is a fragment of a Standard Normal (z) Table, which you will find in the appendix of any statistics book. (Useful? What for?) It can be used to look up proportions of scores above or below a z-score within a normal distribution.

Remember the example of Dennis and YOU? What share of the scores are below/above YOURS? What share are below/above Dennis'?

What share of the scores are below/above YOURS? What share are below/above Dennis'?

Below:

$P(X \leq 15)$

And for Dennis:

$P(X \leq 18)$

and above:

$P(X \geq 15)$

And for Dennis:

$P(X \geq 18)$

Let's come back to the table and review some properties.

- Column 1 lists z-scores that increase from 0.00 to 4.00, mainly in increments of 0.01.
- Column 2 gives the proportion of scores under the normal curve that lie between the z-score and the mean.
- Column 3 gives the proportion of scores beyond that z-score and into the tail of the distribution.
- There are no negative z-scores in the table: the curve is symmetric, so the sign only tells you the direction of the z-score relative to the mean.
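The three columns can be regenerated for any z-score from the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the function name `table_row` is an assumption, and the printed values are rounded to five decimals as in typical printed tables.

```python
from statistics import NormalDist


def table_row(z):
    """Return (|z|, proportion between mean and z, proportion beyond z)."""
    z = abs(z)  # the table lists only non-negative z; the sign gives direction
    cumulative = NormalDist().cdf(z)  # proportion at or below z
    between_mean_and_z = cumulative - 0.5
    beyond_z = 1 - cumulative
    return z, between_mean_and_z, beyond_z


for z in (1, 2, -2):
    row_z, col2, col3 = table_row(z)
    print(row_z, round(col2, 5), round(col3, 5))
```

For every row, columns 2 and 3 add up to 0.5, since together they cover exactly one half of the curve.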

What share of the scores are below/above YOURS? What share are below/above Dennis'?

Between the mean and the score (Column 2 of the table):

$P(13 \leq X \leq 15) = P(0 \leq Z \leq 2) = 0.47725$

And for Dennis:

$P(16 \leq X \leq 18) = P(0 \leq Z \leq 1) = 0.34134$

And the cumulative?

$P(X \leq 15) = P(Z \leq 2) = 0.5 + 0.47725 \approx 0.9772$

And for Dennis:

$P(X \leq 18) = P(Z \leq 1) = 0.5 + 0.34134 \approx 0.84$

More examples:

$P(X \geq 15) = ?$

And for Dennis:

$P(X \geq 18) = ?$

The answers:

$P(X \geq 15) = 1 - P(X \leq 15) = 1 - P(Z \leq 2) = 1 - 0.9772 = 0.0228$

And for Dennis:

$P(X \geq 18) = 1 - P(X \leq 18) = 1 - P(Z \leq 1) = 1 - 0.84 = 0.16$
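These table lookups can be cross-checked in code. The sketch below assumes, as the slides do, that the scores in each class are normally distributed with the stated means and standard deviations; it uses `statistics.NormalDist` from the Python standard library, and the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: test scores in each class follow a normal distribution.
your_class = NormalDist(mu=13, sigma=1)
dennis_class = NormalDist(mu=16, sigma=2)

# Share of scores at or below each raw score (cumulative proportion).
below_you = your_class.cdf(15)
below_dennis = dennis_class.cdf(18)

# Share of scores above: the complement of the cumulative proportion.
above_you = 1 - below_you
above_dennis = 1 - below_dennis

print(round(below_you, 4), round(above_you, 4))
print(round(below_dennis, 4), round(above_dennis, 4))
```

Only about 2% of your class scored above you, while about 16% of Dennis' class scored above him, which matches the conclusion drawn from the z-scores.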
