THE SECOND DERIVATIVE
1. Curvature- concavity and convexity. An intuitive definition: a function is said to be convex at an interval if for all pairs of points on the graph
Quasi-Concave Programming
for differentiable quasi-concave functions.4. In speaking of a quasi-concave function some specific domain of definition
Lecture 14 The Last One!
27 ago 2015 Convex and Concave Functions. Definition. Suppose X is a convex subset of Rn. A function f : X ? R is: concave if.
Concave functions in economics 1. Preliminaries 1 2. Concave
John Riley. 6. We have the following alternative definition. Definition: Concave function. The differentiable function f is concave on X if for any 0.
1 Theory of convex functions
1 mar 2016 Convex concave
Quasi-Concave Programming Author(s): Kenneth J. Arrow and Alain
which is an alternative definition of concavity for differentiable functions. The inequality (1.3) states that if f(x) is concave it lies everywhere on or.
Concave and Convex Functions1
Concave and Convex Functions1. 1 Basic Definitions. Definition 1. Let C ? RN be non-empty and convex and let f : C ? R. 1. (a) f is concave iff for any a
Existence and Uniqueness of Equilibrium Points for Concave N
The existence of an equilibrium point for this concave n-person game is show DEFINITION: The function o(x r) will be called diagonally strictly concave ...
CS281B/Stat241B. Statistical Learning Theory. Lecture 14.
Mixable loss: Reduction to mix loss. Crux: exp-concavity is convenient but too strong. Definition: We say L(a x) is ?-mixable if.
COEFFICIENT ESTIMATES FOR BI-CONCAVE FUNCTIONS 1
13 feb 2018 Introduction Preliminaries and DeFinition. The knowledge on bi-concave univalent functions is based on univalent
CS281B/Stat241B. Statistical Learning Theory.
Lecture 14.
Wouter M. Koolen
Convex lossesExp-concave lossesMixable lossesThe gradient trickSpecialists 1 Menu Today we solve new online learning problems by reducing themto problems/algorithms/analyses we already cracked before. 2Prediction with Expert Advice
Prediction with expert advice:
Protocol:Fort= 1,2,...-
Experts announce actionsa1t,...,aKt? A.
Learner chooses an actionat? A.
Adversary reveals outcomext? X.
Learner incurs lossL(at,xt).
Goal: small regret w.r.t. best expert.
3Convex: Reduction to dot loss
SayL(a,x)is[0,1]-bounded and convex inafor eachx:
K k=1w kL(ak,x)≥ L? K? k=1w kak,x?Then we can feed Hedge?kt=L(akt,xt).
Hedge outputswt. Play the mean actionat=?Kk=1wktakt. K k=1w ktL(akt,xt)Dot lossw?
t?t≥ L(at,xt)? actual loss Dot-loss bound translates to convex bounded lossL. RT/2lnK4
Exp-concave: Reduction to mix loss
Definition:We sayL(a,x)isη-exp-concaveinafor eachxif K k=1w The AA outputswt. Play the mean actionat=?Kk=1wktakt. -ln? K? k=1w kte-ηL(akt,xt)?Mix loss ofwtonη?t≥ηL?
K? k=1w ktakt,xt? actual loss 5Mix-loss bound translates to exp-concave lossL:
R 6Example: square loss is exp-concave
Let's consider
L(a,x) = (a-x)2
whereA=X= [-1,+1]. Findηsuch thatLisη-exp-concave by testing negative second derivative: 2 ∂a2e-η(a-x)2=∂ ∂a-2e-η(a-x)2η(a-x) =e-η(a-x)2η?4η(a-x)2-2?ForX=A= [-Y,+Y]we findη=1
8Y2. 7Mixable loss: Reduction to mix loss
Crux: exp-concavity is convenient buttoo strong.
Definition:We sayL(a,x)isη-mixableif
ηln?
K? k=1w ke-ηL(ak,x)? Mapping fromw,a1,...,aKto witnessacalledsubstitution function. Mixable losses behave just enough like the mix loss to carry the AA regret bound through. R 8Square loss is mixable
Square loss is mixable withη=1
2. The substitution function is
w,a1,...,aK?→m 12(-1)-m1
2(+1) 4 wheremη(x) =-1ηln?Kk=1wke-η(ak-x)2
See (Vovk 1990, Haussler, Kivinen, Warmuth, 1998)
9Mixable loss list
Popular mixable losses:mix loss, log loss, entropic losssquare loss, Brier lossHellinger lossA=X= [0,1]:
L(a,x):=1
2?(⎷
1-x-⎷
1-a)2+ (⎷
x-⎷ a)? Characterisation of mixability: (Van Erven, Reid, Williamson 2012). 10Gradient trick
11Gradient trick
Abusing an algorithm that can compete with the bestexpertto in fact compete with the bestconvex combination(cf portfolios).Assume convex lossL(w,x):
L(w,x)≥ L(wt,x) + (w-wt)?wL(wt,x)?
First-order expansion around algorithm's actionwt
Idea: feed Hedge?t=?wL(wt,x)(may need restriction + translation + scaling to make this[0,1]bounded). Getwt. Playwt. 12Imaginary regret upper bounds the actual regret:
T t=1w? t?t-minkT t=1? k t= maxkT t=1? w? t?t-?kt? = max wT t=1(wt-w)??w?(wt,x) ≥maxwT t=1(L(wt,x)- L(w,x)) T? t=1L(wt,x)-maxwT t=1L(w,x) Caveat: even if original loss was nice (mixable/curved/...), the imagined lossw?→w??wL(wt,x)islinear. Regret of order⎷ T. 13Specialists
14Motivation
Not all expert predictions/actions available every round.Missing dataNoiseToo expensive ($/time/memory)
How to model missingness? Adversarial.
How to redefine the objective? New variant of regret.How to still do something optimal? Upgrade of AA.
15Mix loss game with specialists
Protocol:Fort= 1,2,...-
Adversary picks the subsetAt?[K]ofawakespecialists. Learner chooses a distributionwton awake specialistsAt. Adversary reveals loss vector?t?(-∞,∞]At.Learner's loss is themix loss-ln??
k?Atwt,ke-?t,k? 16Objective
There is no loss when a specialist is asleep.
Regret w.r.t. specialistj: only measured during rounds wherejis awake R j T=? t?[T] j?At-ln?? k?Atw kte-?kt? t?[T] j?At? j t 17Specialist AA
Definition:TheSpecialist Aggregating Algorithm(SAA) maintains a distributionut. It starts uniformuk1= 1/K.In roundtwith awake expertsAt, SAA predict with
w kt=ut(k|At) =ukt1{k?At} j?AtujtUpdate:
u k t+1=?????u k te-?kt j?Atukte-?kt? j?Atuktk?At u k tk /?AtAA update relative to awake setAt
18What makes this tick
Consider the sequence??1,??2obtained by completing?1,?2by assigning in each round the SAA mix loss to all the asleep specialists. Theorem:SAA on?and AA on??produce identical weightsut=w?t and suffer identical mix loss.Proof: (homework) 19Specialist regret bound for SAA
The AA has small regret w.r.t. expertj:
lnK≥T? t=1-ln? K? k=1w ?tke-??tk? -T? t=1? tj T? t=1-ln?? k?Atw kte-?kt? t=[T] t?Aj? j t-? t=[T] t/?Aj-ln?? k?Atw kte-?kt? t=[T] t?Aj-ln?? k?Atw kte-?kt? t=[T] t?Aj? j t =Rj T Adversary more power (sleeping) but regret stilllnK: SAA minimax for specialist mix-loss regret game. 20Discussion
Convex bounded losses are easier than dot loss.Mixable losses are easier than mix loss.Gradient trick allows us to compete with mixtures (at a cost)Specialists extension deals with missing data (at no cost).
21quotesdbs_dbs14.pdfusesText_20
[PDF] concealed carry permit
[PDF] concealed carry permit paperwork
[PDF] concealed carry questionnaire
[PDF] concealed handgun application
[PDF] concentrated acid hydrolysis of biomass
[PDF] concentrated solution
[PDF] concentrated sulfuric acid hydrolysis
[PDF] concentration aqueous solution meaning
[PDF] concentration calculations worksheet answer key
[PDF] concentration calculations worksheet answers
[PDF] concentration calculations worksheet gcse
[PDF] concentration calculations worksheet gcse tes
[PDF] concentration calculations worksheet key
[PDF] concentration calculations worksheet pdf