[PDF] epigraph package latex
[PDF] epigraph width latex
[PDF] epinephrine dilution chart
[PDF] episode timeskip one piece
[PDF] épithélium de revêtement
[PDF] épithélium de revêtement pdf
[PDF] épithélium définition
[PDF] épithélium glandulaire
[PDF] épithélium malpighien
[PDF] épithélium prismatique
[PDF] épithélium pseudostratifié
[PDF] épitope
[PDF] eprep udemy suite of courses
[PDF] épreuve allemand lv2 bac s 2019
[PDF] epreuve bac anglais 2020
Epigraphprojections forfastgeneralcon vexprogramming
Po-WeiWangPOWEIW@CS.CMU.EDU
MachineLearningDepartment, Carnegie MellonUniv ersity,Pittsburgh,P A15213USA
MattW ytockMWYTOCK@STANFORD.EDU
ElectricalEngineeringDepartment, StanfordUniv ersity,Stanford, CA94305 USA
J.ZicoKolter ZKOLTER@CS.CMU.EDU
SchoolofComputer Science,Carnegie MellonUni versity, Pittsburgh,PA15213 USA
Abstract
Thispaper develops anapproachforefÞciently
solvinggeneralcon vex optimizationproblems speciÞedas disciplinedconv exprograms (DCP), acommongeneral-purpose modelingframe- work.SpeciÞcallywe developanalgorithm baseduponf astepigraphprojections,projections ontotheepigraph ofacon vex function,anap- proachcloselylink edtoproximal operatormeth- ods.We showthatbyusingthese operators,we cansolve anydisciplinedconve xprogramwith- outtransformingthe problemtoa standardcone form,asis donebycurrent DCPlibraries.W e thendev elopalargelibraryofefÞcientepigraph projectionoperators,mirroring ande xtending workonfastproximal algorithms,forman ycom- monconv exfunctions.Finally,weevaluate the performanceof thealgorithm,and showit often achievesorderofmagnitudespeedupso vere xist- inggeneral-purposeoptimization solvers.
1.Intr oduction
Althoughconv exoptimizationtechniquesunderlyalarge numberofmachine learningalgorithms,there hasbeena traditionaltensionbetween generalpurposeoptimization methodsandspeci alizedalgorithms.General purposeal- gorithms,ex empliÞedbystandardconeformsolverslike linear,secondorder,and semideÞnitecone solvers,with theadditionof modelinglanguagessuch ascvx(Grant &
Boyd,2008)orcvxpy (Diamond&Bo yd,2015)(which
convertproblemstotheseforms),pro videav eryßexi- bleÒrapidprototypingÓ framework forconv exoptimiza-
Proceedingsofthe33
rd
InternationalConference onMachine
Learning,New York,NY,USA, 2016.JMLR:W&CPvolume
48.Copyright 2016bytheauthor(s).
tion.Howe ver,thesemethodstypicallydonotscalebe- yondmedium-sizeproblems, andarenot well-suitedtothe largerproblemsthatmake upmuchcurrent machinelearn- ingwork. Thishasleadtothe development ofanum- berofspecialized solversfor manyproblems ofinterest includingsupportv ectormachines (Platt,1999;Shalev-
Shwartzetal.,2011; Chang&Lin, 2011),rankingap-
proaches(Joachims,2002), sparseinv ersecov ariancees- timation(Friedmanet al.,2008; Hsiehetal., 2014),etc, justtoname avery smallnumberof examples.Recently , manyspecialpurposealgorithmsha ve startedtouse proxi- maloper atorsasak eyb uildingblock. Thiscurrent worklooks tobeginbridgingtheg apbetween specializedandgeneral purposessolvers. TheÞrstmain contributionofthiswork isamethod forconv ertingany disciplinedconv exprogram(DCP)intoaformthatcanbe directlysolv edbyproximalmethods,withoutcon verting toconeform. Doingsorequires anew setofoperators forepigraphprojections,anoperator relatedtob utdis- tinctfrommost existingproximal operators;speciÞcally, forsomecon vex functionf:R n
θR,epigraphprojec-
tionsare optimizationproblemsof theform epi f (v,s)=ar gmin x,t {λx#vλ 2 2 +(t#s) 2 |f(x)$t}(1) i.e.,they areaprojectionofsome (v%R n ,s%R)ontothe epigraphset{(x,t)|f(x)$t}.Thus,the secondmain epigraphprojectionalgorithms (plussomeadditional prox- imaloperators),which inturncan solvea verybroad range ofDCPswithout ev erresortingto standardconetransfor- mations.To buildthesefastepigraph projectionsopera- torswede velopne woptimizationapproaches,including animplicit dualNewton methodanda O(n)Òsum-of-clipÓ solver,bothofwhichwedetail inlatersections. Finally,we implementthesemethods inageneric optimizationsolver , approaches,oftenan orderof magnitudeormore. Epigraphprojections forfastgeneralconv exprogramming
2.Backgr ound
Disciplinedconv exprogrammingDisciplinedcon vex
workforconve xprograms thatallowsverygeneralconvex problemstobe speciÞedusing arelativ elysmallset ofcon- vexatomicfunctionandsetofcompositional rulesthatpre- serveconvexity .Forinstance,abasicrulestatesthatfor h:R k
θR,g
i :R n
θR,andone ofthefollo wingholds
foreachi=1,...,k: ¥g i convex,hconvexnondecreasinginargumenti ¥g i concave,hconcavenonincreasinginargumenti ¥g i afÞne,hconvex, thenf=h(g 1 (x),...,g k (x)),iscon vex (seee.g.Boyd& issufÞcient butnotnecessaryforaproblem tobecon vex: theÒlog-sum-expÓ functionisconvex, butcannot beveri- Þedassuch bythe above rules.Howe ver, log-sum-expcan berepresentedas aseparateatomic functionthatis conve x (andmonotonic initsar guments).Inpractice, mostcon- vexproblemscanbewrittenusing theDCPruleset witha relativelysmallsetofatoms. Inadditionto thisveriÞcation, DCPlibraries alsoprovide a meansofcon vertingthe problemstoastandardformcone problem.Eachfunctional atomalso providesan graphim- plementationofthe function,arepresentation ofthefunc- tionasthe solutiontoa linearconeproblem. Fore xample, theθ 1 normhasthe graphform
λxλ
1 =min y 1 T y,subjectto#y$x$y.(2) Onceanoptimization problemhasbeen veriÞed asconv ex bytheDCP rules,wereplace allinstancesof DCPatoms withtheircorresponding epigraphimplementation(intro- ducingnew variablesasneeded).Theresulting problem, whichmustbe itselfalinear coneproblem,is guaranteed tobeequi valentto theoriginaloptimizationproblem,and canthenbe solvedby standardformsolv ers.
ProximalmethodsArecenttrend inmachinelearning
optimizationmethods hasbeenthe increaseddev elopment ofalgorithmsbased uponproximaloperators. Given acon- vexfunctionf:R n
θR,theproximal operatoris deÞned
as prox θf (v)=ar gmin x
λf(x)+
1 2
λx#vλ
2 2 .(3)
Crucially,formanyfunctions f,includingman ynon-
smoothfunctions,we cancompute theproximaloperator inclosedf or m(orifnot,atleastcomputeittonumerical precisionvery efÞciently).Forinstance,the proximaloper- atorforthe θ 1 normisgi venby softthresholding(Donoho,
1995),prox
θθáθ1
(v)=m ax{|v|#λ,0}ásign(v),whereall operatorsare appliedelementwise. Generallyspeaking,Òproximal algorithmsÓreferto anyop- timizationmethodthat usesaproximal operatorinits iter- ation.Suchalgorithms arenot new, withtheoriginal proxi- malpointalgorithm proposedin1976 (Rockafellar,1976), buttheyhav eseenincreasedusageinrecent years,often inconjunction withtheincreasing useofθ 1 regularization; seee.g.(P arikh&Bo yd,2013)forrecentsurve yofse veral suchmethods. Operatorsplittingtechniques areoneclass ofproximalal- gorithm,whichsolv eacomposite optimizationproblem minimize x f(x)+g(x),(4) typicallybye xploitingfast proximaloperatorsforfandg. Ageneralre view ofoperatorsplittingalgorithmsisgiven in(Ryu& Boyd,2016), andtwo algorithmsofparticular recentinterestare Douglas-Rachfordsplitting (Douglas& Rachford,1956) andthealternating directionmethodof multipliers(ADMM)(Gabay &Mercier,1976; Boydet al.,
2011).Ultimately, ouralgorithmusesADMMtosolv ethe
resultingproblemsafter transformingthem toasuitable form,thoughother operatorsplitting methodscanbe ap- pliedaswell. Ofparticularrele vanceto ourproblemis the splittingconesolv er(SCS) (OÕDonoghueetal.,2013),an applicationofADMM tolinearcone programs;together withexisting DCPlibraries,thissystemisv erycompara- bletoour own(a solver capableofsolvingDCPproblems usingproximalmethods), andcomparisonsto SCSwillbe oneofthe primaryfocusesin latersections.Also related istheTFOCS algorithm(Becker etal.,2011), whichuses Þrstordermethods tosolv esev eralclassesof coneprob- lems,thoughit doesnotsolv egeneralDCPs. Inthegeneral themeofproximal algorithms,thereha ve beenafe wpapersthat doconsiderepigraphprojections inalimited manner.F orexample, ToÞghietal.(2014) useaepigraph projectiononto atotal-variation-type norm describedvia alinearprogram forimagedecon volution; Chierchiaetal. (2012)describeprojections ontothe epi- graphoftheθ 2 andθ norms(inthelatercaseusinganaive O(nlogn)algorithm),for afew specializedoptimization problems;andHarizano vet al.(2013)useanepigraphpro- jectionontoa veryspeciÞc functionuseful forimagepro- cessing.Noneof theseapproaches relatetosolving general optimizationproblems, nordothe ythede velopthe range ofepigraphprojections thatweco ver.
3.Conv ertingDCPstoproximalformwith
epigraphprojections
Inthissection, weshow thatany DCPcanbe solvedus-
ingaproximal algorithmthatemplo ysstandardproximal operatorsplusepigraph projections.Inparticular weshow thatany DCPcomposedentirelyofatomsfrom someset offunctionsG(butwhichcanbe composedaccordingto Epigraphprojections forfastgeneralconv exprogramming theDCP rulesettoform muchmorecomple xfunctionsthat donotadmit efÞcientproximal operatorsorepigraph pro- jections)canbe solvedusing onlypr oximaloperatorsand epigraphprojectionsforfunctions gθG;thus,if weimple- mentproximaland epigraphprojectionoperators forev ery elementina DCPatomlibrary ,wecan directlysolve the problemwithoute ver usinganyoftheDCPgraphimple- mentations,but justtherelevantoperators directly.This is formalizedas follows
Theorem1.ConsidertheDCP optimizationproblem
minimize x f 0 (x) subjecttof i (x)λ0,i=1,...,p Ax=b (5) withoptimizationvariable x,afÞne constraintsAθR mθn andbθR m ,andcon vexfunctions f i :R n #Rbuiltvia theDCP rulesetfrom convex atomsinasetG.Thenthis optimizationproblem canbewrittenequivalentlyas minimize øx N i=1 f i (øx) subjectto
Aøx=
b (6) forsomee xpandedsetof variablesøxθR øn suchthatx= øx I0 forsomeset I 0 ;afÞne constraints
AθR
ømθøn
and bθR øm ;and convex functions f:R øn #Roftheform f i (øx)= g(øx Ii
I{g(øx
Ii )λøx ti or(7) forsomeatomic functiongθGwhereI i selectssomesub- jectofthe indicesandt i $θIselectstheepigr aphvariable. Again,thekey pointhereis thatalthoughatomgθGad- mitsefÞcient operators,thefunctionsf i madeupof com- positionsofthese functionscanbe muchmorecomple x (onesimplee xamplefora robustSVMisbelow). Butthe theoremstates thattheproblem canbetransformed intoan equivalentonethatonlyrequiresproximaloperators and epigraphprojectionsfor functionsin theatomset G. Theproof ofTherem1 isstraightforw ard,but requires aslightlymore formaldeÞnitionof therepresentationof DCPfunctions;we describethisonly brießyhere,with moredetailabout DCPsingeneral available inGrantet al. (2006).Each DCPfunctionis representedasa expression tree,whereeach non-leafnodein thetreemust beaDCP atom(whichitself canbe eitherconv ex,conca ve,or afÞne), andeachleaf nodemustbe av ariableora constant.For ex- ample,anθ robustSVM(seeAppendix A.4fordetails aboutthederi vationof thisobjectiveterm)canbewritten astheoptimization problem minimize 2 2 2 m i=1 max{0,1&y i T x i +%P T 1 (8) 2 2
θbmKλKt{0,á}+1θλáλ
2
θθ#/B;(y)XθP
T Figure1.ADCPe xpressiontreerepresenting anθθrobustSVM optimizationobjectiv e. Figure1sho wstheDCP expressiontreeforthisobjecti ve: 2 2 1 ,andmax{0,á}atomsarecon vex, thesum,+ and'atomsareaf Þne(the'atomunderDCP musthav e aconstanton oneside), theleafnode #isav ariable,and theλ,1,&diag(y)XandP T leafnodesare constants.
ProofofTheorem1. Theproofproceeds byconsidering
eachDCPfunction f i asane xpressiontree ofdepthn, andemploys atypeofÒbottomupÓ reductiontoreduce thetreeto oneofle veln&1,plussome additionalequal- ityandepigraph constraints.We selectsomeleaf node atlev eln,whichwe denotel 1 andconsiderits immedi- ateparentgandsiblings,l 2 ,...,l n .We introduceanew variable land,basedupon whethergisafÞne, convex,or concave,weaddtheconstraintg(l 1 ,...,l n l: weaddan equalityconstraintfor gafÞneandplacethese constraintsintothe
Aøx=
bmatrix;andwe addλcon- straintsfor gconvexand(constraintsfor gconcave, andintroduce theappropriateepigraph indicatorfunctions f i =I{g(l 1 ,...,l n l}inthesetw ocases. Finally, wereplacethe entiresubtreewith the lvariable.Werepeat thisprocessfor allleav esatle veln,resultingin anequiv a- lentexpression treeofdepthn&1(plusadditionalequality constraintsandepigraph indicators). Whenwereach theroot nodeofthe treeforthe objective functionf 0 ,wesimply addthee xpressionitself asafunc- tion f i =g(orbyw ayofoptimization, weimmediately terminateifthe rootnode isanaddition operatorandthe nodesatdepth 2alreadyha veef Þcientproximaloperators orepigraphprojections). For therootnodes ofexpres- siontreesfor f i ,i(1,weadd theepigraphconstraint f i =I(g(l 1 ,...,l n l)plustheconstraint l=0.
ExampleThetheoremis bestillustratedby example,so
weconsiderhere howwe canusethis approachtotrans- formtherob ustSVM problem(8).Thisobjectiv ecom- posesthehinge loss,alinear functionofthe parameters#, andan θ 1 norminsidethe hingeloss,and thereisno prox- Epigraphprojections forfastgeneralconv exprogramming imaloperator thatcanbe applieddirectlyto theentiretop- levelobjectiveterms.Howe ver,theobjectivestill admitsa
DCPformulation,as shownin Figure1,and applyingthe
operationsfromTheorem 1(using atomsforthe squared 2 norm,hingefunction max{0,á},andθ 1 norm)resultsi n thenew variablesandconstraints l 1 θP T x= l 1 l 2quotesdbs_dbs17.pdfusesText_23