Learning Convex Optimization Models

Akshay Agrawal, Shane Barratt, Stephen Boyd

Authors listed in alphabetical order. Emails: {akshayka,sbarratt,boyd}@stanford.edu.

June 19, 2020

Abstract. A convex optimization model predicts an output from an input by solving a convex optimization problem. The class of convex optimization models is large, and includes as special cases many well-known models like linear and logistic regression. We propose a heuristic for learning the parameters in a convex optimization model from a dataset of input-output pairs, using recently developed methods for differentiating the solution of a convex optimization problem with respect to its parameters. We describe three general classes of convex optimization models: maximum a posteriori (MAP) models, utility maximization models, and agent models, and present a numerical experiment for each.

1 Introduction

1.1 Convex optimization models

We consider the problem of learning to predict outputs $y \in \mathcal{Y}$ from inputs $x \in \mathcal{X}$, given a dataset of input-output pairs $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^N$, with $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$. We assume that $\mathcal{Y} \subseteq \mathbf{R}^m$ is a convex set, but make no assumptions on $\mathcal{X}$. In this paper, we specifically consider models $\phi : \mathcal{X} \to \mathcal{Y}$ that predict the output $y$ by solving a convex optimization problem that depends on the input $x$. We call such models convex optimization models. While convex optimization has historically played a large role in fitting machine learning models, we emphasize that in this paper, we solve convex optimization problems to perform inference.

A convex optimization model has the form

$$\phi(x; \theta) = \mathop{\mathrm{argmin}}_{y \in \mathcal{Y}} \; E(x, y, \theta), \qquad (1)$$

where the objective function $E : \mathcal{X} \times \mathcal{Y} \times \Theta \to \mathbf{R} \cup \{+\infty\}$ is convex in its second argument, and $\theta \in \Theta$ is a parameter belonging to a set $\Theta$ of allowable parameters. The objective function $E$ is the model's energy function, and the quantity $E(x, y, \theta)$ is the energy of $y$ given $x$; the energy $E(x, y, \theta)$ can depend arbitrarily on $x$ and $\theta$, as long as it is convex in $y$. Infinite values of $E$ encode additional constraints on the prediction, since $E(x, y, \theta) = +\infty$ implies $\phi(x; \theta) \neq y$. Evaluating a convex optimization model at $x$ corresponds to finding an output $y \in \mathcal{Y}$ of minimum energy. The function $\phi$ is in general set-valued, since the convex optimization problem in (1) may have zero, one, or many solutions. Throughout this paper, we only consider the case where the argmin exists and is unique.

Convex optimization models are particularly well-suited for problems in which the outputs $y \in \mathcal{Y}$ are known to have structure. For example, if the outputs are probability mass functions, we can take $\mathcal{Y}$ to be the probability simplex; if they are sorted vectors, we can take $\mathcal{Y}$ to be the monotone cone; or if they are covariance matrices, we can take $\mathcal{Y}$ to be the set of symmetric positive semidefinite matrices. In all cases, convex optimization models provide an efficient way of searching over a structured set to produce predictions satisfying known priors.

Because convex optimization models can depend arbitrarily on $x$ and $\theta$, they are quite general. We will see that they include familiar models for regression and classification, such as linear and logistic regression, as specific instances. In the basic examples of linear and logistic regression, the corresponding convex optimization models have analytical solutions. But in most cases, convex optimization models must be evaluated by a numerical algorithm.

Learning a parametric model requires tuning the parameter $\theta$ to make good predictions on $\mathcal{D}$, and ultimately on held-out input-output pairs. In this paper, we present a gradient method for learning the parameters in a convex optimization model; this learning problem is in general non-convex, since the solution map of a convex optimization model is a complicated function. Our method uses the fact that the solution map is often differentiable, and its derivative can be computed efficiently, without differentiating through each step of the numerical solver [1, 2, 6, 27].
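To make the definition concrete, the sketch below expresses one particular convex optimization model with CVXPY and cvxpylayers, the libraries implementing the differentiation machinery of [1, 2]. The energy function $E(x, y, \theta) = \|Ax + b - y\|_2^2$ with $\theta = (A, b)$ and $\mathcal{Y}$ the probability simplex is our own illustrative choice, not a model prescribed by the paper, and the dimensions are arbitrary.

```python
# A minimal sketch of a convex optimization model (1), assuming the
# illustrative energy E(x, y, theta) = ||A x + b - y||_2^2, theta = (A, b),
# with Y the probability simplex. Requires cvxpy, cvxpylayers, and torch.
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n, m = 10, 5  # input and output dimensions (arbitrary)

# The layer takes z = A x + b as its parameter; computing z outside the
# layer keeps the problem DPP-compliant, while gradients still reach (A, b).
y = cp.Variable(m)
z = cp.Parameter(m)
problem = cp.Problem(
    cp.Minimize(cp.sum_squares(z - y)),  # energy, convex in y
    [y >= 0, cp.sum(y) == 1],            # Y: the probability simplex
)
phi = CvxpyLayer(problem, parameters=[z], variables=[y])

# Evaluate the model at an input x, for the current parameter theta = (A, b).
A = torch.randn(m, n, requires_grad=True)
b = torch.randn(m, requires_grad=True)
x = torch.randn(n)
y_hat, = phi(A @ x + b)  # solves the problem; differentiable in (A, b)
```

Because the prediction is the simplex element of minimum energy, this instance realizes the structured-outputs use case described above: every prediction is automatically a probability mass function.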

Outline.

Our learning method is presented in §2 for the general case. In the following three sections, we describe general classes of convex optimization models with particular forms or interpretations. In §3, we interpret convex optimization models as solving a maximum a posteriori (MAP) inference task, and we give examples of these MAP models in regression, classification, and graphical models. In §4, we show how convex optimization models can be used to model utility-maximizing processes. In §5, we give examples of modeling agents using the framework of stochastic control. In §6, we present numerical experiments of learning convex optimization models for several prediction tasks.

1.2 Related work

Structured prediction.

Structured prediction refers to supervised learning problems where the output has known structure [12]. A common approach to structured prediction is energy-based models, which associate a scalar energy to each output, and select a value of the output that minimizes the energy, subject to constraints on the output [44]. Most energy-based models are learned by reducing the energy for input-output pairs in the training set and increasing it for other pairs [56, 55, 57, 30]. More recently, the authors of [19, 20] proposed a method for end-to-end learning of energy networks by unrolled optimization. Indeed, a convex optimization model can be viewed as a form of energy-based learning where the energy function is convex in the output. For example, input-convex neural networks (ICNNs) [9] can be viewed as a convex optimization model where the energy function is an ICNN. We also note that several authors have proposed using structured prediction methods as the final layer of a deep neural network [51, 59, 29]; of particular note is [37], in which the authors used a second-order cone program (SOCP) as their final layer.

Inverse optimization.

Inverse optimization refers to the problem of recovering the structure or parameters of an optimization problem, given solutions to it [5, 41]. In general, inverse optimization is very difficult. One special case where it is tractable is when the optimization problem is a linear program and the loss function is convex in the parameters [5], and another is when the optimization problem is convex and the parameters enter in a certain way [28, 42]. This paper can be viewed as a heuristic method for inverse optimization for general convex optimization problems.

Differentiable optimization.

There has been significant recent interest in differentiating the solution maps of optimization problems; these differentiable solution maps are sometimes called optimization layers. The paper [8] showed how quadratic programs can be embedded as optimization layers in machine learning pipelines, by implicitly differentiating the KKT conditions (as in the early works [34, 35]). Recently, [2, 6] showed how to efficiently differentiate through convex cone programs by applying the implicit function theorem to a residual map introduced in [27], and [1] showed how to differentiate through convex optimization problems by an automatable reduction to convex cone programs; our method for learning convex optimization models builds on this recent work. Optimization layers have been used in many applications, including control [7, 11, 15, 3], game-playing [46, 45], computer graphics [37], combinatorial tasks [58, 52, 53, 21], automatic repair of optimization problems [14], and data fitting more generally [9, 17, 1, 6, 10]. Differentiable optimization for nonconvex problems is often performed numerically by differentiating each individual step of a numerical solver [33, 48, 32, 36], although sometimes it is done implicitly; see, e.g., [7, 47, 4].

Bilevel optimization.

The task of minimizing the training error of a convex optimization model can be interpreted as a bilevel optimization problem, i.e., an optimization problem in which some of the variables are constrained to be optimal for another optimization problem [31]. In our case, the optimization problem is to minimize the model's training error, subject to the constraint that the predicted output is the solution to a convex optimization problem.

2 Learning convex optimization models

In this section we describe a general method for learning the parameter $\theta$ in a convex optimization model, given a dataset consisting of input-output pairs $(x_1, y_1), \ldots, (x_N, y_N) \in \mathcal{X} \times \mathcal{Y}$. We let $\hat y_i = \phi(x_i; \theta)$ denote the prediction of $y_i$ based on $x_i$, for $i = 1, \ldots, N$. These predictions depend on $\theta$, but we suppress this dependency to lighten the notation.

2.1 Learning problem

The fidelity of a convex optimization model's predictions is measured by a loss function $L : \mathcal{Y} \times \mathcal{Y} \to \mathbf{R}$. The value $L(\hat y_i, y_i)$ is the loss for the $i$th data point; the lower the loss, the better the prediction. Through $\hat y_i$, this loss depends on the parameter $\theta$. Our ultimate goal is to construct a model that generalizes, i.e., makes accurate predictions for input-output pairs not present in $\mathcal{D}$. To this end, we first partition the data pair indices into two sets, a training set $\mathcal{T} \subseteq \{1, \ldots, N\}$ and a validation set $\mathcal{V} = \{1, \ldots, N\} \setminus \mathcal{T}$. We define the average training loss as

$$L(\theta) = \frac{1}{|\mathcal{T}|} \sum_{i \in \mathcal{T}} L(\hat y_i, y_i).$$

We fit the model by choosing $\theta$ to minimize the average training loss plus a regularizer $R : \Theta \to \mathbf{R} \cup \{\infty\}$, i.e., by solving the optimization problem

$$\text{minimize} \quad L(\theta) + R(\theta), \qquad (2)$$

with variable $\theta$. The regularizer measures how compatible $\theta$ is with prior knowledge, and we assume that $R(\theta) = \infty$ for $\theta \notin \Theta$, i.e., the regularizer encodes the constraint $\theta \in \Theta$. We describe below a gradient-based method to (approximately) solve problem (2).

We can check how well a convex optimization model generalizes by computing its average loss on the validation set,

$$L^{\mathrm{val}}(\theta) = \frac{1}{|\mathcal{V}|} \sum_{i \in \mathcal{V}} L(\hat y_i, y_i).$$

In some cases, the model or learning procedure depends on parameters other than $\theta$, called hyper-parameters. It is common to learn multiple models over a grid of hyper-parameter values and use the model with the lowest validation loss.
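Continuing the hypothetical model phi from the earlier sketch, the following shows how the training and validation losses might be computed; the squared-error loss, the synthetic data, and the particular train/validation split are assumptions made for illustration.

```python
# Sketch: average training and validation losses for the model sketched
# earlier, assuming the squared-error loss L(y_hat, y) = ||y_hat - y||_2^2.
def average_loss(indices):
    losses = []
    for i in indices:
        y_hat, = phi(A @ xs[i] + b)  # prediction hat{y}_i
        losses.append(torch.sum((y_hat - ys[i]) ** 2))
    return torch.stack(losses).mean()

N = 40
xs = [torch.randn(n) for _ in range(N)]
ys = [torch.softmax(torch.randn(m), dim=0) for _ in range(N)]  # outputs in Y
train, val = range(30), range(30, N)  # index sets T and V
L_train, L_val = average_loss(train), average_loss(val)
```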

2.2 A gradient-based learning method

In general, $L(\theta)$ is not convex, so we must resort to an approximate or heuristic method for learning the parameters. One could consider zeroth-order methods, e.g., evolutionary strategies [39], Bayesian optimization [49], or random search [54]. Instead, we use a first-order method, taking advantage of the fact that the convex optimization model is often differentiable in the parameter $\theta$.
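As one plausible instantiation of such a first-order method, continuing the sketch above, the loop below runs plain gradient descent on the average training loss, with PyTorch propagating gradients through the solution map via cvxpylayers. The optimizer, step size, and iteration count are assumptions, not the paper's prescription, and no regularizer $R$ is included.

```python
# Sketch of a first-order learning loop: gradients of the training loss
# flow through the argmin in (1) via cvxpylayers' implicit differentiation.
opt = torch.optim.SGD([A, b], lr=1e-2)
for it in range(10):
    opt.zero_grad()
    loss = average_loss(train)  # L(theta) on the training set
    loss.backward()             # d loss / d(A, b), through the solution map
    opt.step()
    print(f"iter {it}: training loss {loss.item():.4f}")
```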