
Information Theory (5XSE0)

Ch.0: Mathematical Preliminaries

Hamdi Joudeh

TU/e (Q3 2020-2021)

1 Probability

Sets: Sets are denoted using calligraphic font, e.g. $\mathcal{A} = \{1, 2, 3, 4, 5\}$. If $a$ is an element of $\mathcal{A}$, we write $a \in \mathcal{A}$. The number of elements in $\mathcal{A}$ (i.e. cardinality) is denoted by $|\mathcal{A}|$. If $\mathcal{S}$ is contained in $\mathcal{A}$, we write $\mathcal{S} \subseteq \mathcal{A}$, i.e. $\mathcal{S}$ is a subset of $\mathcal{A}$. Here $\mathcal{S}$ can be equal to $\mathcal{A}$. If $\mathcal{S}$ is strictly contained in $\mathcal{A}$, we write $\mathcal{S} \subset \mathcal{A}$. For a pair of sets $\mathcal{A}$ and $\mathcal{B}$, the set difference is defined as

$$\mathcal{A} \setminus \mathcal{B} \triangleq \{\text{elements in } \mathcal{A} \text{ and not in } \mathcal{B}\}.$$

The union of two sets is denoted by $\mathcal{A} \cup \mathcal{B}$, while their intersection is denoted by $\mathcal{A} \cap \mathcal{B}$. The empty set, which contains no elements, is denoted by $\emptyset$. If two sets do not overlap, then their intersection is the empty set, e.g. $\{1,2,3\} \cap \{4,5,6\} = \emptyset$. For any set $\mathcal{A}$, we have $\emptyset \cup \mathcal{A} = \mathcal{A}$, $\emptyset \cap \mathcal{A} = \emptyset$, and $\emptyset \subseteq \mathcal{A}$.
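These operations map directly onto Python's built-in set type; a minimal sketch (the particular sets are arbitrary):

# Set notation from above, using Python's built-in set type
A = {1, 2, 3, 4, 5}
B = {4, 5, 6}

print(3 in A)                   # membership a in A         -> True
print(len(A))                   # cardinality |A|           -> 5
print({1, 2} <= A, {1, 2} < A)  # subset and strict subset  -> True True
print(A - B)                    # set difference A \ B      -> {1, 2, 3}
print(A | B)                    # union                     -> {1, 2, 3, 4, 5, 6}
print(A & B)                    # intersection              -> {4, 5}
print({1, 2, 3} & {4, 5, 6})    # disjoint sets             -> set()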

Probability Space: Consider a random experiment, e.g. tossing a coin or rolling a die. The sample space $\Omega$ is the set of all possible outcomes of the random experiment in question. In this course, we mostly work with discrete sample spaces, where $\Omega$ is finite or countably infinite, i.e. $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ for some $n \geq 1$. An event is a subset of $\Omega$. Events include the certain event $\Omega$ and the empty event $\emptyset$ (or impossible event). An event $\mathcal{A}$ occurs when the outcome of the random experiment is an element in $\mathcal{A}$. A probability measure $\mathbb{P}$ assigns a real number between 0 and 1 to each event, such that

$$\mathbb{P}[\Omega] = 1, \quad \text{and} \quad \mathbb{P}[\mathcal{A} \cup \mathcal{B}] = \mathbb{P}[\mathcal{A}] + \mathbb{P}[\mathcal{B}] \ \text{ if } \ \mathcal{A} \cap \mathcal{B} = \emptyset.$$

The above implies that $\mathbb{P}[\emptyset] = 0$. Note that we write $\mathbb{P}[\mathcal{A}]$ or $\mathbb{P}\{\mathcal{A}\}$ to denote the probability of event $\mathcal{A}$.

For an event $\mathcal{A}$, the complement $\mathcal{A}^c$ is defined as $\mathcal{A}^c \triangleq \Omega \setminus \mathcal{A}$.

Example 1. Consider a random experiment involving tossing a coin and observing the outcome. Here we have $\Omega = \{H, T\}$, where H denotes heads and T denotes tails. The set of all events (i.e. the event space) is given by $\{\emptyset, \{H\}, \{T\}, \{H, T\}\}$. If the coin is fair, then we will have $\mathbb{P}\{H\} = \mathbb{P}\{T\} = 0.5$.

Joint Probability: For two events $\mathcal{A}$ and $\mathcal{B}$, their joint probability is given by $\mathbb{P}[\mathcal{A} \cap \mathcal{B}]$, i.e. the probability that both $\mathcal{A}$ and $\mathcal{B}$ occur. In the above example, $\mathbb{P}[\{H\} \cap \{T\}] = 0$, as a coin cannot be simultaneously heads and tails. These are called mutually exclusive or disjoint events.

Independence: Two events $\mathcal{A}$ and $\mathcal{B}$ are independent if and only if their joint probability is equal to the product of their probabilities, i.e.

$$\mathcal{A} \text{ and } \mathcal{B} \text{ are independent} \iff \mathbb{P}[\mathcal{A} \cap \mathcal{B}] = \mathbb{P}[\mathcal{A}]\,\mathbb{P}[\mathcal{B}].$$

Note that mutual exclusiveness and independence are not the same thing.

Example 2. Consider another experiment of tossing a pair of fair coins and observing the outcomes. Here each outcome is an ordered pair and the sample space is given by $\Omega = \{(H,H), (H,T), (T,H), (T,T)\}$. The event space consists of all $2^4 = 16$ subsets of $\Omega$ (including $\emptyset$ and $\Omega$ itself). Assuming the coins are independent, each of the 4 outcomes will have a probability of 0.25. The event of observing at least one head is given by $\{(H,H), (H,T), (T,H)\}$. This has a probability of 0.75.
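As a quick sanity check, the counting in Example 2 can be reproduced with a few lines of Python (a minimal sketch; the helper names are arbitrary):

from itertools import chain, combinations, product

# Sample space of two fair coins: ordered pairs of outcomes
omega = list(product("HT", repeat=2))        # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
prob = {outcome: 0.25 for outcome in omega}  # independent fair coins: each outcome has probability 1/4

# Event space: all subsets of the sample space, 2^4 = 16 events
events = list(chain.from_iterable(combinations(omega, r) for r in range(len(omega) + 1)))
print(len(events))                           # 16

# Probability of the event "at least one head"
at_least_one_head = [w for w in omega if "H" in w]
print(sum(prob[w] for w in at_least_one_head))  # 0.75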

Conditional Probability: For events $\mathcal{A}$ and $\mathcal{B}$, the conditional probability of $\mathcal{A}$ given $\mathcal{B}$ is defined as

$$\mathbb{P}[\mathcal{A} \mid \mathcal{B}] \triangleq \frac{\mathbb{P}[\mathcal{A} \cap \mathcal{B}]}{\mathbb{P}[\mathcal{B}]}.$$

Note that $\mathbb{P}[\mathcal{A} \cap \mathcal{B}] = \mathbb{P}[\mathcal{A} \mid \mathcal{B}]\,\mathbb{P}[\mathcal{B}]$. Since $\mathbb{P}[\mathcal{A} \cap \mathcal{B}] = \mathbb{P}[\mathcal{B} \cap \mathcal{A}]$, we can exchange the order and write $\mathbb{P}[\mathcal{A} \cap \mathcal{B}] = \mathbb{P}[\mathcal{B} \cap \mathcal{A}] = \mathbb{P}[\mathcal{B} \mid \mathcal{A}]\,\mathbb{P}[\mathcal{A}]$. This leads to Bayes' rule

$$\mathbb{P}[\mathcal{B} \mid \mathcal{A}] = \frac{\mathbb{P}[\mathcal{A} \mid \mathcal{B}]\,\mathbb{P}[\mathcal{B}]}{\mathbb{P}[\mathcal{A}]}.$$

Law of Total Probability: Let $\mathcal{B}_1, \mathcal{B}_2, \ldots, \mathcal{B}_n$ be a partition of the sample space, i.e. all $n$ events are disjoint and their union is the sample space. Then

$$\mathbb{P}[\mathcal{A}] = \sum_{i=1}^{n} \mathbb{P}[\mathcal{A} \cap \mathcal{B}_i] = \sum_{i=1}^{n} \mathbb{P}[\mathcal{A} \mid \mathcal{B}_i]\,\mathbb{P}[\mathcal{B}_i].$$

As a special case, we have $\mathbb{P}[\mathcal{A}] = \mathbb{P}[\mathcal{A} \mid \mathcal{B}]\,\mathbb{P}[\mathcal{B}] + \mathbb{P}[\mathcal{A} \mid \mathcal{B}^c]\,\mathbb{P}[\mathcal{B}^c]$.

Example 3. Assume that there is a 1% chance of being infected with a particular virus, and you take a test which is 95% accurate, i.e. $\mathbb{P}[\text{+ve test} \mid \text{infected}] = \mathbb{P}[\text{-ve test} \mid \text{not infected}] = 0.95$. Given that you test positive, what is the probability that you have the virus? Let $I$ denote the event of being infected, and $P$ be the event of testing positive. Note that $I^c$ is the event of not being infected, while $P^c$ is the event of testing negative. We have $\mathbb{P}[I] = 0.01$ and $\mathbb{P}[P \mid I] = \mathbb{P}[P^c \mid I^c] = 0.95$. We want to find $\mathbb{P}[I \mid P]$. This is calculated as follows:

$$\mathbb{P}[I \mid P] = \frac{\mathbb{P}[P \mid I]\,\mathbb{P}[I]}{\mathbb{P}[P]} = \frac{\mathbb{P}[P \mid I]\,\mathbb{P}[I]}{\mathbb{P}[P \mid I]\,\mathbb{P}[I] + \mathbb{P}[P \mid I^c]\,\mathbb{P}[I^c]} = \frac{\mathbb{P}[P \mid I]\,\mathbb{P}[I]}{\mathbb{P}[P \mid I]\,\mathbb{P}[I] + \big(1 - \mathbb{P}[P^c \mid I^c]\big)\big(1 - \mathbb{P}[I]\big)} = \frac{0.95 \times 0.01}{0.95 \times 0.01 + 0.05 \times 0.99} \approx 0.16.$$

Despite testing positive, the probability that you are infected is only about 16%.
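The same posterior can be re-evaluated numerically; the sketch below simply plugs the numbers of Example 3 into the total-probability and Bayes expressions (the variable names are ad hoc):

# Example 3, re-evaluated numerically: P[I | P] via Bayes' rule and total probability
p_I = 0.01                      # prior probability of infection
p_pos_given_I = 0.95            # P[+ve test | infected]
p_neg_given_not_I = 0.95        # P[-ve test | not infected]

p_pos_given_not_I = 1 - p_neg_given_not_I                      # false-positive rate
p_pos = p_pos_given_I * p_I + p_pos_given_not_I * (1 - p_I)    # law of total probability

p_I_given_pos = p_pos_given_I * p_I / p_pos                    # Bayes' rule
print(round(p_I_given_pos, 3))                                 # ~0.161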

Chain Rule: From the definition of the conditional probability, we get

$$\mathbb{P}[\mathcal{A}_1 \cap \mathcal{A}_2 \cap \cdots \cap \mathcal{A}_n] = \mathbb{P}[\mathcal{A}_1]\,\mathbb{P}[\mathcal{A}_2 \mid \mathcal{A}_1]\,\mathbb{P}[\mathcal{A}_3 \mid \mathcal{A}_1 \cap \mathcal{A}_2] \cdots \mathbb{P}[\mathcal{A}_n \mid \mathcal{A}_1 \cap \mathcal{A}_2 \cap \cdots \cap \mathcal{A}_{n-1}].$$

Note that this expansion can also be written in reverse order (or any other order of events).

Union Bound: The following inequality holds (verify it!):

$$\mathbb{P}[\mathcal{A}_1 \cup \mathcal{A}_2 \cup \cdots \cup \mathcal{A}_n] \leq \sum_{i=1}^{n} \mathbb{P}[\mathcal{A}_i].$$

Random Variables: A random variable is a function that assigns numerical values to outcomes of a random experiment, e.g. $X: \Omega \to \mathcal{X}$ maps each outcome $\omega$ in $\Omega$ to a corresponding value $X(\omega)$ from the set $\mathcal{X}$. For instance, we can define a binary random variable for the coin toss experiment from earlier, such that $\mathcal{X} = \{0, 1\}$, $X(H) = 1$ and $X(T) = 0$. From now on, the outcome argument $\omega$ in $X(\omega)$ will be dropped, and a random variable $X(\omega)$ will be simply denoted by $X$.
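Viewed computationally, a discrete random variable is just a function on the sample space; a minimal sketch for the coin-toss variable above, assuming the fair-coin measure from Example 1:

# A random variable as a function on the sample space (coin toss from Example 1)
omega = ["H", "T"]
prob = {"H": 0.5, "T": 0.5}            # fair coin

def X(outcome):                        # X(H) = 1, X(T) = 0
    return 1 if outcome == "H" else 0

# Probability that X takes the value 1: sum the measure over {w : X(w) = 1}
p_X_equals_1 = sum(prob[w] for w in omega if X(w) == 1)
print(p_X_equals_1)                    # 0.5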

2 Random Variables

Random variables are denoted by uppercase letters, e.g. $X, Y, Z$. A random variable $X$ takes values on a set $\mathcal{X}$, which we often refer to as the alphabet of $X$. In this course, we mainly focus on real-valued discrete random variables, i.e. $\mathcal{X}$ is a countable set and its elements are real numbers.

Probability Mass Function: The probability mass function (pmf) of $X$ is given by

$$p_X(x) \triangleq \mathbb{P}[X = x], \quad \text{for all } x \in \mathcal{X},$$

which assigns a probability between 0 and 1 to each value in $\mathcal{X}$. A pmf satisfies

$$\sum_{x \in \mathcal{X}} p_X(x) = 1 \quad \text{and} \quad p_X(x) \geq 0, \quad \text{for all } x \in \mathcal{X}.$$

A pmf $p_X(x)$ is also referred to as the distribution of $X$. For brevity, we will often drop the subscript in $p_X(x)$ and write it as $p(x)$, where the argument $x$ identifies $p(x)$ as a pmf of $X$.

Expectation: The expected value of a random variable $X$ is defined as

$$\mathbb{E} X \triangleq \sum_{x \in \mathcal{X}} x\, p_X(x).$$

We sometimes use brackets, as in $\mathbb{E}[X]$. If $f(X)$ is some function of $X$, then $\mathbb{E}[f(X)] = \sum_{x \in \mathcal{X}} p_X(x) f(x)$.

Variance: The variance of a random variable $X$ is defined as

$$\mathrm{var}(X) \triangleq \mathbb{E}\big[(X - \mathbb{E} X)^2\big].$$

The variance is also equivalently given by $\mathrm{var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$. Verify this!
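The expectation and variance definitions translate directly into sums over the alphabet; the sketch below also checks the identity $\mathrm{var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$ on an arbitrarily chosen pmf:

# An arbitrary pmf over the alphabet {1, 2, 3}, stored as {value: probability}
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

E_X = sum(x * p for x, p in pmf.items())                  # E[X]
E_X2 = sum(x**2 * p for x, p in pmf.items())              # E[X^2], i.e. E[f(X)] with f(x) = x^2
var_X = sum((x - E_X)**2 * p for x, p in pmf.items())     # var(X) = E[(X - E[X])^2]

print(E_X)                     # 2.1
print(var_X, E_X2 - E_X**2)    # both 0.49 (up to floating-point rounding)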

Example 4 (Bernoulli). A Bernoulli random variable is a binary random variable with an alphabet $\mathcal{X} = \{0, 1\}$ and a pmf of

$$p_X(x) = \begin{cases} 1 - p, & x = 0 \\ p, & x = 1 \end{cases}$$

where $0 \leq p \leq 1$ is a parameter. We use $\mathrm{Bern}(p)$ to denote a Bernoulli distribution with parameter $p$, and $X \sim \mathrm{Bern}(p)$ means that $X$ has a distribution $\mathrm{Bern}(p)$. A Bernoulli random variable represents a coin toss, which is fair when $p = 0.5$ and biased otherwise. The expected value and variance are given by $\mathbb{E} X = p$ and $\mathrm{var}(X) = p(1 - p)$, respectively (verify this!).
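For Example 4, the Bernoulli mean and variance can be checked directly from the pmf (the value of $p$ below is arbitrary):

p = 0.3                                    # arbitrary Bernoulli parameter
pmf = {0: 1 - p, 1: p}                     # Bern(p)

E_X = sum(x * q for x, q in pmf.items())
var_X = sum((x - E_X)**2 * q for x, q in pmf.items())

print(E_X, p)                              # both 0.3: E[X] = p
print(var_X, p * (1 - p))                  # both ~0.21: var(X) = p(1 - p)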

Joint pmf: Consider a pair of random variables $X$ and $Y$ taking values on $\mathcal{X}$ and $\mathcal{Y}$, respectively. The joint pmf is given by

$$p_{XY}(x, y) \triangleq \mathbb{P}[X = x, Y = y], \quad \text{for all } (x, y) \in \mathcal{X} \times \mathcal{Y}.$$

Note that $\mathcal{X} \times \mathcal{Y} \triangleq \{(x, y) : x \in \mathcal{X}, y \in \mathcal{Y}\}$ is the set of all pairs $(x, y)$, known as the Cartesian product of $\mathcal{X}$ and $\mathcal{Y}$. The pair $(X, Y)$ can be seen as a vector-valued random variable with a pmf $p_{XY}(x, y)$, which satisfies $\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} p_{XY}(x, y) = 1$ and $p_{XY}(x, y) \geq 0$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$.

Marginal pmfs: From a joint pmf $p_{XY}(x, y)$, we obtain the marginal pmfs of $X$ and $Y$ as follows:

$$p_X(x) = \sum_{y \in \mathcal{Y}} p_{XY}(x, y) \quad \text{and} \quad p_Y(y) = \sum_{x \in \mathcal{X}} p_{XY}(x, y).$$

Independence: The two random variables $X$ and $Y$ are independent if and only if their joint pmf is equal to the product of their marginal pmfs, i.e.

$$p_{XY}(x, y) = p_X(x)\, p_Y(y).$$
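As an illustration of marginalization and the independence test above, here is a minimal sketch over a small joint pmf stored as a dictionary (the particular joint table is arbitrary):

# A joint pmf p_XY over X = {0,1} and Y = {0,1}, stored as {(x, y): probability}
p_xy = {(0, 0): 0.3, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.2}

xs = sorted({x for x, _ in p_xy})
ys = sorted({y for _, y in p_xy})

# Marginals: sum the joint pmf over the other variable
p_x = {x: sum(p_xy[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(p_xy[(x, y)] for x in xs) for y in ys}

# Independence: check p_XY(x, y) == p_X(x) * p_Y(y) for every pair
independent = all(abs(p_xy[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for x in xs for y in ys)
print(p_x, p_y, independent)   # {0: 0.6, 1: 0.4} {0: 0.5, 1: 0.5} True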

Conditional pmfs: The conditional pmf of $X$ given $Y = y$ is defined as

$$p_{X|Y}(x|y) \triangleq \mathbb{P}[X = x \mid Y = y] = \frac{p_{XY}(x, y)}{p_Y(y)}.$$

Similarly, the conditional pmf of $Y$ given $X = x$ is given by

$$p_{Y|X}(y|x) \triangleq \mathbb{P}[Y = y \mid X = x] = \frac{p_{XY}(x, y)}{p_X(x)}.$$

Using brief notation, $p_{X|Y}(x|y)$ and $p_{Y|X}(y|x)$ are denoted by $p(x|y)$ and $p(y|x)$, respectively. We use this brief notation when there is no ambiguity. Note that each conditional pmf is a distribution, e.g. each value $y \in \mathcal{Y}$ defines a new (conditional) distribution for $X$ given by $p(x|y)$.

Note that if $X$ and $Y$ are independent, then we have

$$p(x|y) = p(x).$$

In this case, regardless of the value $Y$ takes, the distribution of $X$ remains unchanged. Similarly, we will also have $p(y|x) = p(y)$.

Example 5. Consider a pair of Bernoulli (or binary) random variables $X$ and $Y$ with a joint distribution given in the following table:

            Y = 0    Y = 1
    X = 0    0.5      0.1
    X = 1    0.2      0.2

Here the marginal distributions are given by

$$p_X(x) = \begin{cases} 0.6, & x = 0 \\ 0.4, & x = 1 \end{cases} \quad \text{and} \quad p_Y(y) = \begin{cases} 0.7, & y = 0 \\ 0.3, & y = 1. \end{cases}$$

The conditional distributions of $X$ given $Y = 0$ and $X$ given $Y = 1$ are given by

$$p_{X|Y}(x|0) = \begin{cases} 5/7, & x = 0 \\ 2/7, & x = 1 \end{cases} \quad \text{and} \quad p_{X|Y}(x|1) = \begin{cases} 1/3, & x = 0 \\ 2/3, & x = 1. \end{cases}$$

Work out the conditional distributions of $Y$ given $X$.
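The numbers in Example 5 can be reproduced mechanically from the joint table; a minimal sketch (working out $p_{Y|X}$ is left as the exercise above):

# Joint pmf of Example 5, stored as {(x, y): probability}
p_xy = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.2}

# Marginals
p_x = {x: p_xy[(x, 0)] + p_xy[(x, 1)] for x in (0, 1)}    # {0: 0.6, 1: 0.4}
p_y = {y: p_xy[(0, y)] + p_xy[(1, y)] for y in (0, 1)}    # {0: 0.7, 1: 0.3}

# Conditional pmf of X given Y = y:  p(x|y) = p(x, y) / p_Y(y)
p_x_given_y = {(x, y): p_xy[(x, y)] / p_y[y] for x in (0, 1) for y in (0, 1)}
print(p_x_given_y[(0, 0)], p_x_given_y[(1, 0)])    # ~5/7, ~2/7
print(p_x_given_y[(0, 1)], p_x_given_y[(1, 1)])    # ~1/3, ~2/3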

Chain Rule: Let $X_1, X_2, \ldots, X_n$ have a joint pmf of $p(x_1, x_2, \ldots, x_n)$. We have

$$p(x_1, x_2, \ldots, x_n) = p(x_1)\, p(x_2|x_1)\, p(x_3|x_1, x_2) \cdots p(x_n|x_1, x_2, \ldots, x_{n-1}),$$

which follows from the definition of the conditional pmf. This expansion can be written more briefly as $p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i|x_1, \ldots, x_{i-1})$. When $i = 1$ here, we have $p(x_i|x_1, \ldots, x_{i-1}) = p(x_i) = p(x_1)$, i.e. the sequence $x_1, \ldots, x_{i-1}$ is empty in this case.

Independent and Identically Distributed (i.i.d.): A sequence of random variables $X_1, X_2, \ldots, X_n$ is i.i.d. if all random variables are independent and have the same marginal distribution $p_X(x)$, i.e.

$$p(x_1, x_2, \ldots, x_n) = p_X(x_1)\, p_X(x_2) \cdots p_X(x_n) = \prod_{i=1}^{n} p_X(x_i).$$
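For instance, under the i.i.d. factorization above, the probability of a particular binary sequence is just a product of marginals; a minimal sketch for a $\mathrm{Bern}(p)$ sequence (the value of $p$ and the sequence are arbitrary):

from math import prod

p = 0.3                                    # Bern(p) marginal, chosen arbitrarily
p_X = {0: 1 - p, 1: p}

sequence = [1, 0, 0, 1, 1]                 # a particular realization x_1, ..., x_n
p_sequence = prod(p_X[x] for x in sequence)
print(p_sequence)                          # 0.3 * 0.7 * 0.7 * 0.3 * 0.3 = 0.01323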

Example 6 (Binomial). Consider an i.i.d. sequence $X_1, X_2, \ldots, X_n$ in which each entry is a Bernoulli random variable with parameter $p$ (i.e. has a $\mathrm{Bern}(p)$ distribution). The number of ones in this random sequence is another random variable given by $K = \sum_{i=1}^{n} X_i$, which takes values on the set $\{0, 1, 2, \ldots, n\}$. $K$ has a binomial distribution with parameters $(n, p)$ and a pmf given by

$$p_K(k) = \binom{n}{k} p^k (1 - p)^{n-k},$$

where $\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$ is the binomial coefficient (i.e. $n$ choose $k$). The expected value and variance are given by $\mathbb{E} K = np$ and $\mathrm{var}(K) = np(1 - p)$, respectively (verify this!).
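The binomial pmf, mean and variance of Example 6 can be checked by direct summation ($n$ and $p$ below are arbitrary):

from math import comb

n, p = 10, 0.3                                         # arbitrary binomial parameters
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(sum(pmf.values()))                               # ~1.0 (a valid pmf)
E_K = sum(k * q for k, q in pmf.items())
var_K = sum((k - E_K)**2 * q for k, q in pmf.items())
print(E_K, n * p)                                      # both 3.0: E[K] = np
print(var_K, n * p * (1 - p))                          # both ~2.1: var(K) = np(1 - p)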

Weak Law of Large Numbers: Consider a random variable $X$ with distribution $p(x)$ and a finite expected value, i.e. $\mathbb{E} X < \infty$. The expected value can be estimated by applying the empirical mean $\bar{X}_n \triangleq \frac{1}{n} \sum_{i=1}^{n} X_i$ to an i.i.d. sample $X_1, X_2, \ldots, X_n$ drawn from $p(x)$. How good is this estimate? The weak law of large numbers says that the probability that the empirical mean estimate is bad goes to zero as $n$ grows large. This is formally stated as follows.

Theorem 1 (Weak Law of Large Numbers). For every $\epsilon > 0$, we have

$$\lim_{n \to \infty} \mathbb{P}\left\{ \left| \bar{X}_n - \mathbb{E} X \right| > \epsilon \right\} = 0.$$

Proof. This proof is part of the first assignment. In particular, you will be guided through the steps of showing that the following inequality holds:

$$\mathbb{P}\left\{ \left| \bar{X}_n - \mathbb{E} X \right| > \epsilon \right\} \leq \frac{\mathrm{var}(X)}{n \epsilon^2}.$$

Theorem 1 follows directly as the right-hand side of the above inequality converges to 0 for any $\epsilon > 0$. This simple proof of the weak law of large numbers also requires that $\mathrm{var}(X) < \infty$.
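Theorem 1 can be visualized by simulation: for i.i.d. $\mathrm{Bern}(p)$ samples, the fraction of "bad" empirical means shrinks with $n$ and stays below the $\mathrm{var}(X)/(n\epsilon^2)$ bound once that bound is non-trivial (a rough sketch; the parameters and sample sizes are arbitrary):

import random

random.seed(0)
p, eps, trials = 0.3, 0.05, 1000            # Bern(p) source, tolerance, Monte Carlo trials

def empirical_mean(n):
    return sum(random.random() < p for _ in range(n)) / n

for n in (10, 100, 1000):
    bad = sum(abs(empirical_mean(n) - p) > eps for _ in range(trials)) / trials
    bound = p * (1 - p) / (n * eps**2)      # var(X) / (n * eps^2)
    print(n, bad, round(bound, 3))          # the fraction of bad estimates shrinks as n grows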

3 Convexity

Let's take a small break from probability and look at convexity. We will combine the two through Jensen's inequality further on. An open interval on the real line is denoted by $(a, b)$, for some $a < b$. Similarly, a closed interval is denoted by $[a, b]$. We use $f(x)$ to denote a real-valued function of a real variable $x$.

Convex Function: A function $f(x)$ is said to be convex over an interval $(a, b)$ if we have

$$f\big(\lambda x_1 + (1 - \lambda) x_2\big) \leq \lambda f(x_1) + (1 - \lambda) f(x_2)$$

for every $x_1, x_2 \in (a, b)$ and $0 \leq \lambda \leq 1$. The function $f(x)$ is strictly convex if the inequality is strict for all $\lambda \in (0, 1)$, and equality holds only if $\lambda = 0$ or $\lambda = 1$. Roughly speaking, convex functions curve upwards and often look like a cup (i.e. $\cup$). This is seen from the definition by noting that a convex function always lies below any chord. Examples of convex functions are $x^2$, $e^x$ and $|x|$ (see below). The opposite of convex functions are concave functions.

[Figure: plots of the convex functions $x^2$ and $e^x$.]

Concave Function: A function $f(x)$ is said to be concave over an interval $(a, b)$ if we have

$$f\big(\lambda x_1 + (1 - \lambda) x_2\big) \geq \lambda f(x_1) + (1 - \lambda) f(x_2)$$

for every $x_1, x_2 \in (a, b)$ and $0 \leq \lambda \leq 1$. The function $f(x)$ is strictly concave if the inequality is strict for all $\lambda \in (0, 1)$, and equality holds only if $\lambda = 0$ or $\lambda = 1$. From the above definition, it follows that a function $f(x)$ is concave if $-f(x)$ is convex. Hence, roughly speaking, concave functions curve downwards and often look like a cap (i.e. $\cap$). A concave function always lies above any chord. Examples of concave functions are $\log(x)$, $\sqrt{x}$ and $-cx^2$ (see below). Note that an affine (or linear) function given by $cx + d$ is both convex and concave (verify this!).

[Figure: plots of the concave functions $\log_2(x)$ and $-cx^2$.]

Second Derivative: Now assume that the function $f(x)$ is twice differentiable, i.e. its second derivative $f''(x)$ exists. If $f''(x) \geq 0$ on an interval $(a, b)$, then the function is convex on that interval. Strictness in the inequality implies strict convexity. That is,

$$f''(x) \geq 0 \implies f(x) \text{ is convex}, \qquad f''(x) > 0 \implies f(x) \text{ is strictly convex}. \quad (1)$$

The second derivative captures the change in the slope of the function. The slope of a convex function is non-decreasing, or increasing when the function is strictly convex (check this for $x^2$ and $e^x$). Note that (1) immediately implies the following statement for concave functions:

$$f''(x) \leq 0 \implies f(x) \text{ is concave}, \qquad f''(x) < 0 \implies f(x) \text{ is strictly concave}. \quad (2)$$
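A quick numerical illustration of these definitions, checking the chord inequality for the convex function $x^2$ and the concave function $\log(x)$ on a grid (the grid and tolerance are arbitrary):

import math

def convex_on_grid(f, points, lambdas):
    """Check f(l*x1 + (1-l)*x2) <= l*f(x1) + (1-l)*f(x2) for all grid choices."""
    return all(
        f(l * x1 + (1 - l) * x2) <= l * f(x1) + (1 - l) * f(x2) + 1e-12
        for x1 in points for x2 in points for l in lambdas
    )

points = [0.1 * i for i in range(1, 50)]        # grid inside (0, 5)
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]

print(convex_on_grid(lambda x: x**2, points, lambdas))          # True: x^2 is convex
print(convex_on_grid(math.log, points, lambdas))                # False: log(x) is not convex
print(convex_on_grid(lambda x: -math.log(x), points, lambdas))  # True: -log(x) is convex, so log(x) is concave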

We now give a proof* (no need to learn this proof) for the statement in (1).

Proof. Suppose that we have a function $f(x)$ with $f''(x) \geq 0$ for all $x \in (a, b)$. From Taylor's theorem, for any $x_0 \in (a, b)$, we can express $f(x)$ as follows:

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x^\star)}{2}(x - x_0)^2,$$

where $x^\star$ is between $x_0$ and $x$. Since $x^\star$ is also in $(a, b)$, we have $f''(x^\star) \geq 0$. Therefore, by removing the non-negative term $\frac{f''(x^\star)}{2}(x - x_0)^2$, we may only reduce $f(x)$, that is

$$f(x) \geq f(x_0) + f'(x_0)(x - x_0). \quad (3)$$

Now let's set $x_0 = \lambda x_1 + (1 - \lambda) x_2$ for some $x_1, x_2 \in (a, b)$ and $\lambda \in [0, 1]$. On the other hand, we take $x$ to be either $x_1$ or $x_2$. Combining with (3), we obtain the two following inequalities:

$$x = x_1 \implies f(x_1) \geq f(x_0) + f'(x_0)(1 - \lambda)(x_1 - x_2), \quad (4)$$
$$x = x_2 \implies f(x_2) \geq f(x_0) + f'(x_0)\lambda(x_2 - x_1). \quad (5)$$

Multiplying (4) by $\lambda$ and (5) by $(1 - \lambda)$ and adding the resulting inequalities, we obtain

$$\begin{aligned}
\lambda f(x_1) + (1 - \lambda) f(x_2) &\geq \lambda\big[f(x_0) + f'(x_0)(1 - \lambda)(x_1 - x_2)\big] + (1 - \lambda)\big[f(x_0) + f'(x_0)\lambda(x_2 - x_1)\big] \\
&= f(x_0) + \lambda(1 - \lambda)f'(x_0)(x_1 - x_2) + (1 - \lambda)\lambda f'(x_0)(x_2 - x_1) \\
&= f(x_0) \\
&= f\big(\lambda x_1 + (1 - \lambda) x_2\big).
\end{aligned}$$

This holds for any $x_1, x_2 \in (a, b)$ and $\lambda \in [0, 1]$, as we can always choose $x_0$ from $(a, b)$ for the Taylor expansion such that $x_0 = \lambda x_1 + (1 - \lambda) x_2$. Therefore, $f(x)$ is convex, which completes the proof.

Jensen's Inequality: We now put together probability and convexity.

Theorem 2 (Jensen's Inequality). Let $X$ be a random variable and let $f$ be a convex function. We have

$$\mathbb{E}[f(X)] \geq f(\mathbb{E}[X]).$$

If $f$ is strictly convex, then the inequality is strict. Moreover, if $f$ is strictly convex and $\mathbb{E}[f(X)] = f(\mathbb{E}[X])$, then we must have $X = \mathbb{E} X$ (i.e. $X$ is a constant).

Proof. Let's assume that $X$ takes values in $\mathcal{X} = \{x_1, x_2, \ldots, x_k\}$ with probabilities $p_1, p_2, \ldots, p_k$. If $k = 2$, the inequality follows directly from the definition of convexity. In particular, we have $\lambda = p_1$ and $1 - \lambda = 1 - p_1 = p_2$, for which we obtain

$$\mathbb{E}[f(X)] = p_1 f(x_1) + p_2 f(x_2) \geq f(p_1 x_1 + p_2 x_2) = f(\mathbb{E} X). \quad (6)$$

Now we wish to show that the statement holds for any $k$. We do this by induction. In particular, let's suppose that the statement holds for any $k - 1$ mass points. This means that for a pmf with $k - 1$ mass points given by $p'_1, p'_2, \ldots, p'_{k-1}$, we have

$$\sum_{i=1}^{k-1} p'_i f(x_i) \geq f\left( \sum_{i=1}^{k-1} p'_i x_i \right). \quad (7)$$

Now let's go back to our pmf with $k$ mass points $p_1, p_2, \ldots, p_k$. We split this pmf into $p_1, p_2, \ldots, p_{k-1}$ and $p_k$. By normalizing $p_1, p_2, \ldots, p_{k-1}$, we obtain a pmf with $k - 1$ mass points as follows:

$$p'_i = \frac{p_i}{\sum_{j=1}^{k-1} p_j} = \frac{p_i}{1 - p_k}, \quad \text{for all } i \in \{1, 2, \ldots, k-1\}.$$

We now proceed as follows

$$\begin{aligned}
\sum_{i=1}^{k} p_i f(x_i) &= p_k f(x_k) + \sum_{i=1}^{k-1} p_i f(x_i) \\
&= p_k f(x_k) + (1 - p_k) \sum_{i=1}^{k-1} \frac{p_i}{1 - p_k} f(x_i) \\
&= p_k f(x_k) + (1 - p_k) \sum_{i=1}^{k-1} p'_i f(x_i) \\
&\geq p_k f(x_k) + (1 - p_k) f\left( \sum_{i=1}^{k-1} p'_i x_i \right) \quad (8) \\
&\geq f\left( p_k x_k + (1 - p_k) \sum_{i=1}^{k-1} p'_i x_i \right) \quad (9) \\
&= f\left( \sum_{i=1}^{k} p_i x_i \right).
\end{aligned}$$

Note that the inequality in (8) is due to our hypothesis that the statement holds for $k - 1$ (see (7)). On the other hand, (9) follows from the definition of convexity (see (6)).

Based on the hypothesis (or assumption) that Jensen's inequality holds for $k - 1$ mass points, we have shown that it must also hold for $k$. Since we know that it holds for the case with 2 mass points, this means that it must hold for the case with 3 mass points, and so on until we reach any $k$. This completes the proof.
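To close, Theorem 2 can be checked numerically for a particular pmf and a particular convex function, say $f(x) = x^2$ (both choices arbitrary):

# Numerical check of Jensen's inequality E[f(X)] >= f(E[X]) for the convex f(x) = x^2
pmf = {-1: 0.2, 0: 0.3, 2: 0.5}                  # an arbitrary pmf on X = {-1, 0, 2}

def f(x):
    return x**2

E_X = sum(x * p for x, p in pmf.items())         # E[X] = 0.8
E_fX = sum(f(x) * p for x, p in pmf.items())     # E[f(X)] = 2.2

print(E_fX, ">=", f(E_X))                        # 2.2 >= 0.64, as Theorem 2 predicts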