These are lecture notes for an introductory course on probability taught as part of the first-year program at École des Ponts ParisTech. The course is currently under the direction of Aurélien Alfonsi (CERMICS) and based on the following textbook:

B. Jourdain, Probabilités et statistique, Ellipses, 2009; 2nd edition, 2016.

Chapter 2

Discrete random variables

2.1 Probability measure

From now on, we consider a random experiment which may have infinitely many possible outcomes. The set of all possible outcomes of this experiment is still called the sample space and denoted $\Omega$. In the present lesson we will call event any subset of $\Omega$. For the sake of accuracy, let us mention that this definition actually raises an issue when $\Omega$ is uncountably infinite. The power set $\mathcal{P}(\Omega)$ is indeed proven to be "too big" to serve as the set of events in many natural settings. One of the main troubles is the possible non-existence of a satisfying probability measure on $\mathcal{P}(\Omega)$. Therefore, we must restrict the set of events to a subset of $\mathcal{P}(\Omega)$. Such a set must be compatible with the usual operations on sets, namely the complement, union and intersection, which leads to the definition of a $\sigma$-algebra.

Definition 2.1.1 ($\sigma$-algebra). A $\sigma$-algebra on a set $\Omega$ is a class $\mathcal{A}$ of subsets of $\Omega$ ($\mathcal{A} \subset \mathcal{P}(\Omega)$) such that

(i) $\Omega \in \mathcal{A}$;

(ii) if $A \in \mathcal{A}$, then $A^{c} \in \mathcal{A}$;

(iii) if for all $n \in \mathbb{N}$, $A_n \in \mathcal{A}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{A}$.

We then say that $(\Omega, \mathcal{A})$ is a measurable space.

For the sake of accuracy again, let us mention that a map between two measurable spaces is in general not compatible with their respective underlying $\sigma$-algebras, hence the following definition.

Definition 2.1.2 (Measurable map). Let $(E, \mathcal{E})$ and $(F, \mathcal{F})$ be two measurable spaces. We call measurable map between $(E, \mathcal{E})$ and $(F, \mathcal{F})$ any map $f \colon E \to F$ such that
$$\forall B \in \mathcal{F}, \quad f^{-1}(B) = \{x \in E \mid f(x) \in B\} \in \mathcal{E}.$$

From now on, we will consider that the sample space $\Omega$ is endowed with a $\sigma$-algebra $\mathcal{A}$ which is seen as the set of events. To be accurate, one should then consider that $A \subset \Omega$ is an event iff $A \in \mathcal{A}$. However, all the $\sigma$-algebras we will encounter in the present lesson will always be big enough to contain all the subsets we will consider. Therefore, we will still call event any subset of $\Omega$. For similar reasons, we will purposely never worry about the measurability of a function, as all the functions encountered in the present lessons will be measurable. Definitions 2.1.1 and 2.1.2 can therefore be ignored at first reading.

Definition 2.1.3 (Probability measure). A probability measure on a sample space $\Omega$ is a map $\mathbb{P}$ from the set of events to $[0,1]$ such that

(i) $\mathbb{P}(\Omega) = 1$;

(ii) if $(A_i)_{i \in I}$ is an at most countable disjoint family of events, then
$$\mathbb{P}\Big(\bigcup_{i \in I} A_i\Big) = \sum_{i \in I} \mathbb{P}(A_i).$$

If $\mathcal{A}$ denotes the class of events, then we say that $(\Omega, \mathcal{A}, \mathbb{P})$ is a probability space.

Remark 2.1.4. A map $\mathbb{P}$ which satisfies (ii) is called $\sigma$-additive.

Throughout the rest of the present chapter and the next ones, and unless explicitly mentioned otherwise, $(\Omega, \mathcal{A}, \mathbb{P})$ will always refer to a probability space such that

- $\Omega$ is the sample space we work on;
- $\mathcal{A}$ is the class of events (which can be harmlessly considered to be the power set $\mathcal{P}(\Omega)$);
- $\mathbb{P}$ is the probability measure which measures the likelihood of each event.

Except for the uniform distribution, the definitions and propositions given in the previous chapter for a finite sample space hold for a general sample space, including the notions of conditional probability, independence and Bayes' theorem.
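As a concrete illustration of Definition 2.1.3 (our addition, not part of the original notes): when $\Omega$ is at most countable and $\mathcal{A} = \mathcal{P}(\Omega)$, any family $(p_\omega)_{\omega \in \Omega}$ of nonnegative numbers such that $\sum_{\omega \in \Omega} p_\omega = 1$ defines a probability measure by
$$\mathbb{P}(A) = \sum_{\omega \in A} p_\omega,$$
and $\sigma$-additivity follows from the fact that nonnegative series can be summed in any order.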

2.2 Discrete random variables

2.2.1 Definition

We introduce here the fundamental notion of random variable, which is an expression whose value depends on the outcome of a random experiment. In this chapter we only consider random variables which have finitely or countably infinitely many different possible values.

Definition 2.2.1 (Discrete random variable). A discrete random variable is a (measurable) map $X \colon \Omega \to E$ where $E$ is an at most countable set.

For any subset $A \subset E$, we denote
$$\{X \in A\} = X^{-1}(A) = \{\omega \in \Omega \mid X(\omega) \in A\}.$$
The family $(\mathbb{P}(\{X = x\}))_{x \in E}$ is called the probability distribution of $X$.

Example 2.2.2. Let $A$ be an event. We recall that the indicator function of $A$ is defined by
$$\forall \omega \in \Omega, \quad \mathbb{1}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{else.} \end{cases}$$
Therefore, $\mathbb{1}_A \colon \Omega \to \{0,1\}$ is a discrete random variable.

Definition 2.2.3 (Equality in distribution). Let $E$ be an at most countable set and $X \colon \Omega \to E$ and $Y \colon \Omega \to E$ be two discrete random variables. We say that $X$ and $Y$ are equal in distribution if the probability distribution of $X$ is equal to the probability distribution of $Y$, that is
$$\forall z \in E, \quad \mathbb{P}(\{X = z\}) = \mathbb{P}(\{Y = z\}).$$

In that case, we denote $X \stackrel{d}{=} Y$.
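Equality in distribution does not mean equality as maps. For instance (an example we add for illustration): on $\Omega = \{H, T\}$ with $\mathbb{P}(\{H\}) = \mathbb{P}(\{T\}) = 1/2$, the random variables $X = \mathbb{1}_{\{H\}}$ and $Y = 1 - X$ differ on every outcome, yet $X \stackrel{d}{=} Y$, since both take the values $0$ and $1$ with probability $1/2$.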

2.2.2 Independence

Definition 2.2.4 (Independence). Let $E$ and $F$ be two at most countable sets. Two random variables $X \colon \Omega \to E$ and $Y \colon \Omega \to F$ are said to be independent if
$$\forall (x,y) \in E \times F, \quad \mathbb{P}(\{X = x, Y = y\}) = \mathbb{P}(\{X = x\})\,\mathbb{P}(\{Y = y\}).$$

In that case, we denote $X \perp\!\!\!\perp Y$.

We already saw in the previous chapter a notion of independence which concerns events. Those two notions coincide in the following sense: the event $A$ is independent of the event $B$ iff the random variable $\mathbb{1}_A$ is independent of the random variable $\mathbb{1}_B$.

Definition 2.2.5 (Mutual independence). Let $n \in \mathbb{N}^*$ and, for all $k \in \{1,\dots,n\}$, let $E_k$ be an at most countable set and $X_k \colon \Omega \to E_k$ be a discrete random variable. The family of discrete random variables $(X_1,\dots,X_n)$ is said to be mutually independent if
$$\forall (x_1,\dots,x_n) \in E_1 \times \cdots \times E_n, \quad \mathbb{P}(\{X_1 = x_1, \dots, X_n = x_n\}) = \mathbb{P}(\{X_1 = x_1\}) \times \cdots \times \mathbb{P}(\{X_n = x_n\}).$$

We say that a family $(X_i)_{i \in I}$ of discrete random variables is mutually independent if any finite subfamily of $(X_i)_{i \in I}$ is mutually independent. We say that a family $(X_i)_{i \in I}$ of discrete random variables is pairwise independent if for all $i, j \in I$ such that $i \neq j$, $X_i$ is independent of $X_j$.

Remark 2.2.6. If $(X_i)_{i \in I}$ is mutually independent, then it is pairwise independent, but the converse is not true (see the counterexample below).

Definition 2.2.7 (i.i.d.). A family of discrete random variables is called independent and identically distributed, usually abbreviated i.i.d., if this family is mutually independent and all its elements are equal in distribution.
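A classical counterexample for Remark 2.2.6 (added here for illustration): let $X$ and $Y$ be independent random variables with $\mathbb{P}(\{X = 0\}) = \mathbb{P}(\{X = 1\}) = 1/2$ and likewise for $Y$, and set $Z = (X + Y) \bmod 2$. Each of the pairs $(X,Y)$, $(X,Z)$ and $(Y,Z)$ is made of independent variables, so the family $(X, Y, Z)$ is pairwise independent; but
$$\mathbb{P}(\{X = 1, Y = 1, Z = 1\}) = 0 \neq \tfrac{1}{8} = \mathbb{P}(\{X = 1\})\,\mathbb{P}(\{Y = 1\})\,\mathbb{P}(\{Z = 1\}),$$
so the family is not mutually independent.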

2.2.3 Common discrete probability distributions

We present here the most common discrete probability distributions.

2.2.3.1 The degenerate univariate distribution

Definition 2.2.8. Let $\alpha \in \mathbb{R}$. We say that a discrete random variable $X$ follows the degenerate univariate distribution with parameter $\alpha$ if
$$\mathbb{P}(\{X = \alpha\}) = 1.$$

In that case, we denote $X \sim \delta_\alpha$.

The degenerate univariate distribution is the distribution of an almost surely constant discrete random variable. It is a way to see a deterministic variable as a particular case of random variable.

2.2.3.2 The Bernoulli distribution

Definition 2.2.9. Let $p \in [0,1]$. We say that a discrete random variable $X$ follows the Bernoulli distribution with parameter $p$ if
$$\mathbb{P}(\{X = 1\}) = p \quad \text{and} \quad \mathbb{P}(\{X = 0\}) = 1 - p.$$

In that case, we denote $X \sim \mathcal{B}(p)$.

Remark 2.2.10.
- For any event $A$, $\mathbb{1}_A \sim \mathcal{B}(\mathbb{P}(A))$.
- $X$ follows the Bernoulli distribution with parameter $p$ iff
$$\forall x \in \{0,1\}, \quad \mathbb{P}(\{X = x\}) = p^x (1-p)^{1-x}.$$
- $\mathcal{B}(0) = \delta_0$ and $\mathcal{B}(1) = \delta_1$.

The Bernoulli distribution models a random experiment which has two possible outcomes: head or tail, true or false, yes or no, etc. This kind of experiment is called a Bernoulli trial. When it makes sense, we usually interpret the event $\{X = 1\}$ as the success of the experiment and $\{X = 0\}$ as the failure. Therefore, $p$ is often interpreted as the probability of success of a Bernoulli trial.

2.2.3.3 The binomial distribution

Definition 2.2.11. Let $n \in \mathbb{N}^*$ and $p \in [0,1]$. We say that a discrete random variable $X$ follows the binomial distribution with parameters $n$ and $p$ if
$$\forall k \in \{0,\dots,n\}, \quad \mathbb{P}(\{X = k\}) = \binom{n}{k} p^k (1-p)^{n-k}.$$

In that case, we denote $X \sim \mathcal{B}(n,p)$.

Remark 2.2.12. $\mathcal{B}(1,p) = \mathcal{B}(p)$.

Proposition 2.2.13. Let $n \in \mathbb{N}^*$ and $p \in [0,1]$. Let $X \colon \Omega \to \mathbb{N}$ be a discrete random variable and $X_1,\dots,X_n$ be i.i.d. random variables, each having a Bernoulli distribution with parameter $p$. Then
$$X \sim \mathcal{B}(n,p) \iff X \stackrel{d}{=} X_1 + \cdots + X_n.$$

Proof. Let us show that $X_1 + \cdots + X_n \sim \mathcal{B}(n,p)$. Let $k \in \{0,\dots,n\}$. By $\sigma$-additivity and mutual independence,
$$\mathbb{P}(\{X_1 + \cdots + X_n = k\}) = \mathbb{P}\Big(\bigcup_{\substack{x_1,\dots,x_n \in \{0,1\} \\ x_1 + \cdots + x_n = k}} \{X_1 = x_1, \dots, X_n = x_n\}\Big) = \sum_{\substack{x_1,\dots,x_n \in \{0,1\} \\ x_1 + \cdots + x_n = k}} \mathbb{P}(\{X_1 = x_1, \dots, X_n = x_n\})$$
$$= \sum_{\substack{x_1,\dots,x_n \in \{0,1\} \\ x_1 + \cdots + x_n = k}} \mathbb{P}(\{X_1 = x_1\}) \times \cdots \times \mathbb{P}(\{X_n = x_n\}) = \sum_{\substack{x_1,\dots,x_n \in \{0,1\} \\ x_1 + \cdots + x_n = k}} p^{x_1}(1-p)^{1-x_1} \times \cdots \times p^{x_n}(1-p)^{1-x_n}$$
$$= \sum_{\substack{x_1,\dots,x_n \in \{0,1\} \\ x_1 + \cdots + x_n = k}} p^k (1-p)^{n-k} = C \times p^k (1-p)^{n-k},$$
where $C$ is the cardinality of the set $\{(x_1,\dots,x_n) \in \{0,1\}^n \mid x_1 + \cdots + x_n = k\}$. Choosing $x_1,\dots,x_n \in \{0,1\}$ such that $x_1 + \cdots + x_n = k$ is equivalent to assigning $1$ to $k$ components of a vector of $\{0,1\}^n$ and $0$ to the $n-k$ other components. The latter is itself equivalent to choosing $k$ elements among $n$. We deduce that $C = \binom{n}{k}$. This proves that $X_1 + \cdots + X_n$ follows a binomial distribution with parameters $n$ and $p$. Therefore, $X \sim \mathcal{B}(n,p)$ iff $X$ is equal in distribution to $X_1 + \cdots + X_n$.

The best understanding of the binomial distribution is given by Proposition 2.2.13. For $X \sim \mathcal{B}(n,p)$, $\mathbb{P}(X = k)$ is the probability that exactly $k$ successes occur in $n$ independent Bernoulli trials.
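As a quick numerical sanity check of Proposition 2.2.13 (our addition, not part of the original notes; it assumes Python with only the standard library), one can compare the empirical distribution of a sum of $n$ independent Bernoulli samples with the binomial probabilities:

    import math
    import random

    def bernoulli(p):
        # One Bernoulli(p) sample: 1 with probability p, 0 otherwise.
        return 1 if random.random() < p else 0

    def binomial_pmf(n, p, k):
        # P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k).
        return math.comb(n, k) * p**k * (1 - p)**(n - k)

    n, p, trials = 10, 0.3, 200_000
    counts = [0] * (n + 1)
    for _ in range(trials):
        s = sum(bernoulli(p) for _ in range(n))  # X_1 + ... + X_n
        counts[s] += 1

    for k in range(n + 1):
        print(k, counts[k] / trials, binomial_pmf(n, p, k))

The empirical frequencies in the second column should approach the exact probabilities in the third as the number of trials grows.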


2.2.3.4 The Poisson distribution

Definition 2.2.14. Let $\lambda > 0$. We say that a discrete random variable $X$ follows the Poisson distribution with parameter $\lambda$ if
$$\forall n \in \mathbb{N}, \quad \mathbb{P}(\{X = n\}) = e^{-\lambda} \frac{\lambda^n}{n!}.$$

In that case, we denote $X \sim \mathcal{P}(\lambda)$.

A random variable $X$ which describes the number of events happening during a certain time interval is typically modeled by a Poisson distribution.
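One can check (a standard computation, added here for completeness) that these probabilities indeed sum to $1$, using the series expansion of the exponential:
$$\sum_{n \in \mathbb{N}} e^{-\lambda} \frac{\lambda^n}{n!} = e^{-\lambda} \sum_{n \in \mathbb{N}} \frac{\lambda^n}{n!} = e^{-\lambda} e^{\lambda} = 1.$$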

2.2.3.5 The geometric distribution

Definition 2.2.15. Let $p \in (0,1]$. We say that a discrete random variable $X$ follows the geometric distribution with parameter $p$ if
$$\forall n \in \mathbb{N}^*, \quad \mathbb{P}(\{X = n\}) = p(1-p)^{n-1}.$$

In that case, we denote $X \sim \mathrm{Geo}(p)$.

Remark 2.2.16. $\mathrm{Geo}(1) = \delta_1$.

Proposition 2.2.17. Let $p \in (0,1]$. Let $(X_n)_{n \in \mathbb{N}^*}$ be a family of i.i.d. random variables, each having a Bernoulli distribution with parameter $p$. Then
$$X \sim \mathrm{Geo}(p) \iff X \stackrel{d}{=} \inf\{n \in \mathbb{N}^* \mid X_n = 1\}.$$

Proof. Let $G = \inf\{n \in \mathbb{N}^* \mid X_n = 1\}$. Let us show that $G \sim \mathrm{Geo}(p)$. Let $n \in \mathbb{N}^*$. By definition of the infimum and mutual independence of $(X_k)_{k \in \mathbb{N}^*}$, we have
$$\mathbb{P}(\{G = n\}) = \mathbb{P}(\{X_1 = 0, \dots, X_{n-1} = 0, X_n = 1\}) = \mathbb{P}(\{X_1 = 0\}) \times \cdots \times \mathbb{P}(\{X_{n-1} = 0\}) \times \mathbb{P}(\{X_n = 1\}) = (1-p)^{n-1} \times p,$$
so $G \sim \mathrm{Geo}(p)$. Therefore, $X \sim \mathrm{Geo}(p)$ iff $X \stackrel{d}{=} G$.

Proposition 2.2.17 gives us a better understanding of the geometric distribution. For $X \sim \mathrm{Geo}(p)$, $\mathbb{P}(X = n)$ is the probability that exactly $n$ attempts are needed to witness the first success in a series of independent Bernoulli trials.
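Similarly to the Poisson case (again a standard check, not in the original text), the geometric probabilities sum to $1$ by the geometric series: for $p \in (0,1)$,
$$\sum_{n \in \mathbb{N}^*} p(1-p)^{n-1} = p \sum_{m \in \mathbb{N}} (1-p)^m = \frac{p}{1 - (1-p)} = 1,$$
and for $p = 1$ the sum reduces to its first term, which equals $1$.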

2.2.4 Marginal distribution

Let $E$ and $F$ be two at most countable sets. Let $X \colon \Omega \to E$ and $Y \colon \Omega \to F$ be two discrete random variables. Since $E \times F$ is at most countable, the map
$$(X,Y) \colon \Omega \to E \times F, \quad \omega \mapsto (X(\omega), Y(\omega))$$
is a discrete random variable. The probability distribution of $X$ (resp. $Y$) is called the first (resp. second) marginal distribution of $(X,Y)$. The probability distribution of $(X,Y)$ is called the joint probability distribution of $X$ and $Y$.

For all $x \in E$, by $\sigma$-additivity, we have
$$\mathbb{P}(\{X = x\}) = \mathbb{P}\Big(\bigcup_{y \in F} \{X = x, Y = y\}\Big) = \sum_{y \in F} \mathbb{P}(\{X = x, Y = y\}).$$
Similarly, for all $y \in F$, $\mathbb{P}(\{Y = y\}) = \sum_{x \in E} \mathbb{P}(\{X = x, Y = y\})$. We deduce that the marginal distributions can be deduced from the joint probability distribution. However, the converse is in general not true. Nevertheless, let us mention that in the particular case of independence between $X$ and $Y$, the joint probability distribution can be deduced from the marginal distributions. Indeed, if $X$ is independent of $Y$, then for all $(x,y) \in E \times F$, $\mathbb{P}(\{X = x, Y = y\}) = \mathbb{P}(\{X = x\})\,\mathbb{P}(\{Y = y\})$.

Exercise 2.2.18. Let $E$ and $F$ be two at most countable sets. Let $X \colon \Omega \to E$ and $Y \colon \Omega \to F$ be two discrete random variables such that there exist $c \in \mathbb{R}$, $\mu \colon E \to \mathbb{R}_+$ and $\nu \colon F \to \mathbb{R}_+$ which satisfy
$$\forall (x,y) \in E \times F, \quad \mathbb{P}(\{X = x, Y = y\}) = c\,\mu(x)\,\nu(y).$$

1. Compute $c$.

2. What can we say about $X$ and $Y$?
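To see why the marginals do not determine the joint distribution, here is a small example of our own (not in the original notes). Take $E = F = \{0,1\}$ and compare the two joint distributions
$$\mathbb{P}(\{X = i, Y = j\}) = \tfrac{1}{4} \text{ for all } i,j \in \{0,1\} \qquad \text{and} \qquad \mathbb{P}(\{X = 0, Y = 0\}) = \mathbb{P}(\{X = 1, Y = 1\}) = \tfrac{1}{2}.$$
Both have the same marginals ($X$ and $Y$ uniform on $\{0,1\}$), but only the first one makes $X$ and $Y$ independent; in the second, $X = Y$ almost surely.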

2.3 Expected value and variance

2.3.1 Expected value

Definition 2.3.1 (Expected value). Let $E$ be an at most countable subset of $\mathbb{R}$ and $X \colon \Omega \to E$ be a real-valued discrete random variable. We say that $X$ is integrable, and denote $X \in L^1$, if $\sum_{x \in E} |x|\,\mathbb{P}(\{X = x\}) < +\infty$. In that case, the expected value of $X$ is denoted $\mathbb{E}[X]$ and defined by
$$\mathbb{E}[X] = \sum_{x \in E} x\,\mathbb{P}(\{X = x\}).$$

Remark 2.3.2.
- The integrability of $X$ and its expected value depend only on the probability distribution of $X$;
- if $E$ is finite, then $X$ is integrable;
- if $\lambda \in \mathbb{R}$, then $\mathbb{E}[\lambda] = \lambda$;
- if $A$ is an event, then $\mathbb{E}[\mathbb{1}_A] = \mathbb{P}(A)$.

Proposition 2.3.3. Let $X$ and $Y$ be two integrable discrete random variables.

(i) Linearity: for all $\lambda \in \mathbb{R}$, $\lambda X + Y$ is an integrable discrete random variable and $\mathbb{E}[\lambda X + Y] = \lambda\,\mathbb{E}[X] + \mathbb{E}[Y]$;

(ii) Positivity and non-degeneracy: if $\mathbb{P}(X \geq 0) = 1$, then $\mathbb{E}[X] \geq 0$. If in addition $\mathbb{E}[X] = 0$, then $\mathbb{P}(X = 0) = 1$.

(iii) Monotonicity: if $\mathbb{P}(X \leq Y) = 1$, then $\mathbb{E}[X] \leq \mathbb{E}[Y]$.

Proof. Let $E$ and $F$ be the two at most countable sets such that $X \colon \Omega \to E$ and $Y \colon \Omega \to F$.

(i) Let $\lambda \in \mathbb{R}$ and $G = \{\lambda x + y \mid (x,y) \in E \times F\}$. The set $G$ is at most countable, so $\lambda X + Y \colon \Omega \to G$ is a discrete random variable. Using $\sigma$-additivity for the second equality, Fubini's theorem (for nonnegative series) for the fourth equality and the triangle inequality for the first inequality, we have
$$\sum_{z \in G} |z|\,\mathbb{P}(\{\lambda X + Y = z\}) = \sum_{z \in G} |z|\,\mathbb{P}\Big(\bigcup_{\substack{(x,y) \in E \times F \\ \lambda x + y = z}} \{X = x, Y = y\}\Big) = \sum_{z \in G} |z| \sum_{\substack{(x,y) \in E \times F \\ \lambda x + y = z}} \mathbb{P}(\{X = x, Y = y\})$$
$$= \sum_{z \in G} \sum_{(x,y) \in E \times F} \mathbb{1}_{\{\lambda x + y = z\}}\,|z|\,\mathbb{P}(\{X = x, Y = y\}) = \sum_{(x,y) \in E \times F} \sum_{z \in G} \mathbb{1}_{\{\lambda x + y = z\}}\,|z|\,\mathbb{P}(\{X = x, Y = y\})$$
$$= \sum_{(x,y) \in E \times F} |\lambda x + y|\,\mathbb{P}(\{X = x, Y = y\}) \leq |\lambda| \sum_{(x,y) \in E \times F} |x|\,\mathbb{P}(\{X = x, Y = y\}) + \sum_{(x,y) \in E \times F} |y|\,\mathbb{P}(\{X = x, Y = y\})$$
$$= |\lambda| \sum_{x \in E} |x|\,\mathbb{P}(\{X = x\}) + \sum_{y \in F} |y|\,\mathbb{P}(\{Y = y\}) < +\infty,$$
so $\lambda X + Y$ is integrable. We now reproduce the same calculation as above but we remove the absolute values. This time we use Fubini's theorem for absolutely convergent series, and the triangle inequality becomes an equality, so that
$$\sum_{z \in G} z\,\mathbb{P}(\{\lambda X + Y = z\}) = \sum_{z \in G} \sum_{(x,y) \in E \times F} \mathbb{1}_{\{\lambda x + y = z\}}\,z\,\mathbb{P}(\{X = x, Y = y\}) = \sum_{(x,y) \in E \times F} \sum_{z \in G} \mathbb{1}_{\{\lambda x + y = z\}}\,z\,\mathbb{P}(\{X = x, Y = y\})$$
$$= \lambda \sum_{(x,y) \in E \times F} x\,\mathbb{P}(\{X = x, Y = y\}) + \sum_{(x,y) \in E \times F} y\,\mathbb{P}(\{X = x, Y = y\}) = \lambda \sum_{x \in E} x\,\mathbb{P}(\{X = x\}) + \sum_{y \in F} y\,\mathbb{P}(\{Y = y\}) = \lambda\,\mathbb{E}[X] + \mathbb{E}[Y].$$

(ii) If $\mathbb{P}(\{X \geq 0\}) = 1$, then for all $x \in E \cap \mathbb{R}_-^*$, $\mathbb{P}(\{X = x\}) = 0$, so
$$\mathbb{E}[X] = \sum_{x \in E} x\,\mathbb{P}(\{X = x\}) = \sum_{x \in E \cap \mathbb{R}_+} x\,\mathbb{P}(\{X = x\}) + \sum_{x \in E \cap \mathbb{R}_-^*} x\,\mathbb{P}(\{X = x\}) = \sum_{x \in E \cap \mathbb{R}_+} x\,\mathbb{P}(\{X = x\}) \geq 0.$$
If in addition $\mathbb{E}[X] = 0$, then $\sum_{x \in E \cap \mathbb{R}_+} x\,\mathbb{P}(\{X = x\}) = 0$, so for all $x \in E \cap \mathbb{R}_+$, $x\,\mathbb{P}(\{X = x\}) = 0$. We deduce that for all $x \in E$ such that $x \neq 0$, $\mathbb{P}(\{X = x\}) = 0$, so $\mathbb{P}(\{X = 0\}) = 1$.

(iii) If $\mathbb{P}(\{X \leq Y\}) = 1$, then $\mathbb{P}(\{Y - X \geq 0\}) = 1$, so by (i) and (ii), $\mathbb{E}[Y] - \mathbb{E}[X] = \mathbb{E}[Y - X] \geq 0$, hence $\mathbb{E}[Y] \geq \mathbb{E}[X]$.

The next proposition, known as the law of the unconscious statistician, usually abbreviated LOTUS, is very useful in practice.

Proposition 2.3.4 (LOTUS). Let $E$ be an at most countable subset of $\mathbb{R}$, $X \colon \Omega \to E$ be a discrete random variable and $f \colon E \to \mathbb{R}$ be a (measurable) map. Then $f(X) \colon \omega \in \Omega \mapsto f(X(\omega))$ is a real-valued discrete random variable. Moreover,
$$f(X) \in L^1 \iff \sum_{x \in E} |f(x)|\,\mathbb{P}(\{X = x\}) < +\infty.$$

In that case,
$$\mathbb{E}[f(X)] = \sum_{x \in E} f(x)\,\mathbb{P}(\{X = x\}).$$

Proof. Since $E$ is at most countable, $f(E)$ is at most countable as well, so $f(X) \colon \Omega \to f(E)$ is a real-valued discrete random variable. Using $\sigma$-additivity for the second equality and Fubini's theorem (for nonnegative series) for the fourth equality, we have
$$\sum_{y \in f(E)} |y|\,\mathbb{P}(\{f(X) = y\}) = \sum_{y \in f(E)} |y|\,\mathbb{P}\Big(\bigcup_{\substack{x \in E \\ f(x) = y}} \{X = x\}\Big) = \sum_{y \in f(E)} |y| \sum_{\substack{x \in E \\ f(x) = y}} \mathbb{P}(\{X = x\})$$
$$= \sum_{y \in f(E)} \sum_{x \in E} \mathbb{1}_{\{f(x) = y\}}\,|y|\,\mathbb{P}(\{X = x\}) = \sum_{x \in E} \sum_{y \in f(E)} \mathbb{1}_{\{f(x) = y\}}\,|y|\,\mathbb{P}(\{X = x\}) = \sum_{x \in E} |f(x)|\,\mathbb{P}(\{X = x\}).$$
Therefore, $f(X) \in L^1 \iff \sum_{y \in f(E)} |y|\,\mathbb{P}(\{f(X) = y\}) < +\infty \iff \sum_{x \in E} |f(x)|\,\mathbb{P}(\{X = x\}) < +\infty$. Suppose now that $f(X) \in L^1$. We reproduce the same calculation as above but we remove the absolute values. This time we use Fubini's theorem for absolutely convergent series, so that
$$\mathbb{E}[f(X)] = \sum_{y \in f(E)} y\,\mathbb{P}(\{f(X) = y\}) = \sum_{y \in f(E)} \sum_{x \in E} \mathbb{1}_{\{f(x) = y\}}\,y\,\mathbb{P}(\{X = x\}) = \sum_{x \in E} \sum_{y \in f(E)} \mathbb{1}_{\{f(x) = y\}}\,y\,\mathbb{P}(\{X = x\}) = \sum_{x \in E} f(x)\,\mathbb{P}(\{X = x\}).$$

Proposition 2.3.5. Let $E$ and $F$ be two at most countable sets and $X \colon \Omega \to E$ and $Y \colon \Omega \to F$ be two discrete random variables.

(i) If $X$ is independent of $Y$, then for all (measurable) maps $f \colon E \to \mathbb{R}$ and $g \colon F \to \mathbb{R}$ such that $f(X), g(Y) \in L^1$, we have $f(X)g(Y) \in L^1$ and
$$\mathbb{E}[f(X)g(Y)] = \mathbb{E}[f(X)]\,\mathbb{E}[g(Y)]. \tag{2.1}$$

(ii) Conversely, if (2.1) holds for all (measurable) bounded maps $f \colon E \to \mathbb{R}$ and $g \colon F \to \mathbb{R}$, then $X$ is independent of $Y$.

Proof. (i) Suppose that $X$ is independent of $Y$. Let $f \colon E \to \mathbb{R}$ and $g \colon F \to \mathbb{R}$ be two (measurable) maps. Let $h \colon E \times F \to \mathbb{R}$ be defined for all $(x,y) \in E \times F$ by $h(x,y) = f(x)g(y)$. By Fubini's theorem for nonnegative series and independence of $X$ and $Y$, we get
$$\sum_{(x,y) \in E \times F} |h(x,y)|\,\mathbb{P}(\{X = x, Y = y\}) = \sum_{x \in E} \sum_{y \in F} |f(x)g(y)|\,\mathbb{P}(\{X = x\})\,\mathbb{P}(\{Y = y\}) = \sum_{x \in E} |f(x)|\,\mathbb{P}(\{X = x\}) \sum_{y \in F} |g(y)|\,\mathbb{P}(\{Y = y\}) < +\infty.$$
According to Proposition 2.3.4, $h(X,Y) = f(X)g(Y) \in L^1$. Using LOTUS, Fubini's theorem for absolutely convergent series and independence of $X$ and $Y$, we have
$$\mathbb{E}[f(X)g(Y)] = \mathbb{E}[h(X,Y)] = \sum_{(x,y) \in E \times F} h(x,y)\,\mathbb{P}(\{X = x, Y = y\}) = \sum_{x \in E} \sum_{y \in F} f(x)g(y)\,\mathbb{P}(\{X = x\})\,\mathbb{P}(\{Y = y\})$$
$$= \sum_{x \in E} f(x)\,\mathbb{P}(\{X = x\}) \sum_{y \in F} g(y)\,\mathbb{P}(\{Y = y\}) = \mathbb{E}[f(X)]\,\mathbb{E}[g(Y)].$$

(ii) Suppose now that (2.1) holds for all (measurable) bounded maps $f \colon E \to \mathbb{R}$ and $g \colon F \to \mathbb{R}$. Let $(x,y) \in E \times F$. Then (2.1) for $f = \mathbb{1}_{\{x\}}$ and $g = \mathbb{1}_{\{y\}}$ writes $\mathbb{E}[\mathbb{1}_{\{X = x, Y = y\}}] = \mathbb{E}[\mathbb{1}_{\{X = x\}}]\,\mathbb{E}[\mathbb{1}_{\{Y = y\}}]$, that is $\mathbb{P}(\{X = x, Y = y\}) = \mathbb{P}(\{X = x\})\,\mathbb{P}(\{Y = y\})$. So $X$ is independent of $Y$.

2.3.2 Variance

Definition 2.3.6 (Variance). Let $E$ be an at most countable subset of $\mathbb{R}$ and $X \colon \Omega \to E$ be a real-valued discrete random variable. We say that $X$ is square-integrable, and denote $X \in L^2$, if $\sum_{x \in E} x^2\,\mathbb{P}(\{X = x\}) < +\infty$. In that case, the variance of $X$ is denoted $\mathrm{Var}\,X$ and defined by
$$\mathrm{Var}\,X = \mathbb{E}[(X - \mathbb{E}[X])^2].$$

The square root of the variance of $X$ is called the standard deviation of $X$.

Remark 2.3.7. According to LOTUS, $X \in L^2$ iff $X^2 \in L^1$.
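For instance (a worked example we add; it is not in the original notes): if $X \sim \mathcal{B}(p)$, then $X \in L^2$ since $X$ takes finitely many values, $\mathbb{E}[X] = p$, and by LOTUS
$$\mathrm{Var}\,X = \mathbb{E}[(X - p)^2] = (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1-p).$$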

Proposition 2.3.8. Let $X$ be a square-integrable discrete random variable. Then

(i) $X$ is integrable.

(ii) $\mathrm{Var}\,X = \mathbb{E}[X^2] - \mathbb{E}[X]^2$.

(iii) For all $a, b \in \mathbb{R}$, $aX + b \in L^2$ and $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}\,X$.

Proof. (i) We have $|X| \leq \frac{1}{2}(1 + X^2)$. By hypothesis, $X^2 \in L^1$, so $|X|$ is bounded from above by an integrable random variable. Therefore, $X \in L^1$.

(ii) Let $\mu = \mathbb{E}[X]$, which is well defined according to (i). Using LOTUS, we get

$$\mathrm{Var}\,X = \mathbb{E}[(X - \mathbb{E}[X])^2] = \sum_{x \in E} (x - \mu)^2\,\mathbb{P}(\{X = x\}) = \sum_{x \in E} x^2\,\mathbb{P}(\{X = x\}) - 2\mu \sum_{x \in E} x\,\mathbb{P}(\{X = x\}) + \mu^2 \sum_{x \in E} \mathbb{P}(\{X = x\})$$
$$= \sum_{x \in E} x^2\,\mathbb{P}(\{X = x\}) - 2\mu^2 + \mu^2 = \mathbb{E}[X^2] - \mathbb{E}[X]^2.$$

(iii) Let $a, b \in \mathbb{R}$. Then $(aX + b)^2 = a^2X^2 + 2abX + b^2$. By hypothesis and (i), $a^2X^2$, $2abX$ and $b^2$ are integrable, so $aX + b \in L^2$. Moreover, by linearity, $\mathbb{E}[aX + b] = a\,\mathbb{E}[X] + b$, so
$$\mathrm{Var}(aX + b) = \mathbb{E}[(aX + b - a\,\mathbb{E}[X] - b)^2] = \mathbb{E}[a^2 (X - \mathbb{E}[X])^2] = a^2\,\mathrm{Var}\,X.$$