Big Data Analytics: Optimization and Randomization



Tianbao Yang, Department of Computer Science, The University of Iowa, IA, USA
Qihang Lin, Department of Management Sciences, The University of Iowa, IA, USA
Rong Jin, Department of Computer Science and Engineering, Michigan State University, MI, USA, and Institute of Data Science and Technologies at Alibaba Group, Seattle, USA

Tutorial@SIGKDD 2015
Sydney, Australia
August 10, 2015

Yang, Lin, Jin. Tutorial for KDD'15, August 10, 2015.

URL: http://www.cs.uiowa.edu/~tyng/kdd15-tutorial.pdf

Some Claims

No:
This tutorial is not an exhaustive literature survey.
It is not a survey on different machine learning/data mining algorithms.

Yes:
It is about how to efficiently solve machine learning/data mining problems (formulated as optimization) for big data.

Outline

Part I: Basics

Part II: Optimization

Part III: Randomization

Part I: Basics

Outline

1. Basics
   Introduction
   Notations and Definitions

Basics: Introduction

Three Steps for Machine Learning

Data, Model, Optimization

[Figure: distance to optimal objective versus iterations (0 to 100), with curves labeled $0.5^T$, $1/T^2$, and $1/T$]

Big Data Challenge

Big Data
Big Model: 60 million parameters

Learning as Optimization

Ridge Regression Problem:

$$\min_{w \in \mathbb{R}^d} \; \underbrace{\frac{1}{n} \sum_{i=1}^{n} (y_i - w^\top x_i)^2}_{\text{Empirical Loss}} + \underbrace{\frac{\lambda}{2} \|w\|_2^2}_{\text{Regularization}}$$

$x_i \in \mathbb{R}^d$: $d$-dimensional feature vector
$y_i \in \mathbb{R}$: target variable
$w \in \mathbb{R}^d$: model parameters
$n$: number of data points
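As a minimal sketch of the ridge regression problem above, the following NumPy code evaluates the objective and solves its first-order optimality condition in closed form. The function names, the synthetic data, and the value of `lam` are illustrative assumptions, not part of the tutorial; a direct solve like this is only practical for small `d`.

```python
import numpy as np

def ridge_objective(w, X, y, lam):
    """(1/n) * sum_i (y_i - w^T x_i)^2 + (lam/2) * ||w||_2^2."""
    residual = y - X @ w
    return residual @ residual / len(y) + 0.5 * lam * (w @ w)

def ridge_closed_form(X, y, lam):
    """Solve (2/n) X^T (X w - y) + lam * w = 0 for w."""
    n, d = X.shape
    A = X.T @ X + 0.5 * n * lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.standard_normal(200)

lam = 0.1
w_hat = ridge_closed_form(X, y, lam)

# The gradient of the objective should vanish at the minimizer.
grad = 2.0 / len(y) * X.T @ (X @ w_hat - y) + lam * w_hat
print("objective:", ridge_objective(w_hat, X, y, lam))
print("gradient norm:", np.linalg.norm(grad))
```

The factor `0.5 * n * lam` comes from matching the $\frac{1}{n}$ scaling of the loss with the $\frac{\lambda}{2}$ scaling of the regularizer.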

Classification Problems:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell(y_i w^\top x_i) + \frac{\lambda}{2} \|w\|_2^2$$

$y_i \in \{+1, -1\}$: label
Loss function $\ell(z)$, where $z = y w^\top x$:
1. SVMs: (squared) hinge loss $\ell(z) = \max(0, 1 - z)^p$, where $p = 1, 2$
2. Logistic Regression: $\ell(z) = \log(1 + \exp(-z))$
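The loss functions listed above can be sketched in NumPy as follows. The function names and the tiny example data are assumptions made for illustration; `np.logaddexp` is used only to evaluate $\log(1 + \exp(-z))$ without overflow.

```python
import numpy as np

def hinge_loss(z, p=1):
    """(Squared) hinge loss max(0, 1 - z)^p, with p = 1 or 2."""
    return np.maximum(0.0, 1.0 - z) ** p

def logistic_loss(z):
    """log(1 + exp(-z)), computed stably as logaddexp(0, -z)."""
    return np.logaddexp(0.0, -z)

def classification_objective(w, X, y, lam, loss):
    """(1/n) * sum_i loss(y_i * w^T x_i) + (lam/2) * ||w||_2^2."""
    z = y * (X @ w)
    return loss(z).mean() + 0.5 * lam * (w @ w)

# Two toy points with labels in {+1, -1} (assumed data).
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w = np.zeros(2)

z = np.array([2.0, 0.0, -1.0])
print(hinge_loss(z))        # margins >= 1 incur no loss
print(hinge_loss(z, p=2))
print(logistic_loss(z))
print(classification_objective(w, X, y, 0.1, hinge_loss))
```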

Feature Selection:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top x_i, y_i) + \lambda \|w\|_1$$

$\ell_1$ regularization: $\|w\|_1 = \sum_{i=1}^{d} |w_i|$
$\lambda$ controls sparsity level
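One way to see how $\lambda$ controls sparsity is the soft-thresholding operator, which minimizes $\frac{1}{2}\|w - v\|_2^2 + \lambda \|w\|_1$ over $w$. This operator is not taken from the slide above; it is a standard illustration, and the vector `v` and the values of `lam` are assumed. Larger `lam` sets more coordinates exactly to zero.

```python
import numpy as np

def l1_norm(w):
    """||w||_1 = sum_i |w_i|."""
    return np.abs(w).sum()

def soft_threshold(v, lam):
    """Minimizer of 0.5 * ||w - v||_2^2 + lam * ||w||_1 (coordinate-wise)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

v = np.array([3.0, -0.5, 1.2, -2.0])
for lam in (0.1, 1.0, 2.5):
    w = soft_threshold(v, lam)
    print(f"lam={lam}: w={w}, nonzeros={np.count_nonzero(w)}, l1={l1_norm(w):.2f}")
```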

Feature Selection using Elastic Net:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top x_i, y_i) + \lambda \left( \|w\|_1 + \gamma \|w\|_2^2 \right)$$

Elastic net regularizer, more robust than the $\ell_1$ regularizer

Multi-class/Multi-task Learning:

$$\min_{W} \; \frac{1}{n} \sum_{i=1}^{n} \ell(W x_i, y_i) + \lambda \, r(W), \qquad W \in \mathbb{R}^{K \times d}$$

$r(W) = \|W\|_F^2 = \sum_{k=1}^{K} \sum_{j=1}^{d} W_{kj}^2$: Frobenius norm
$r(W) = \|W\|_* = \sum_i \sigma_i$: nuclear norm (sum of singular values)
$r(W) = \|W\|_{1,\infty} = \sum_{j=1}^{d} \|W_{:j}\|_\infty$: $\ell_{1,\infty}$ mixed norm
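The three matrix regularizers above can be computed directly with NumPy. This is a small sketch; the function names and the example matrix `W` (with $K = 2$, $d = 3$) are assumptions for illustration.

```python
import numpy as np

def frobenius_sq(W):
    """||W||_F^2: sum of squared entries."""
    return np.sum(W ** 2)

def nuclear_norm(W):
    """||W||_*: sum of singular values."""
    return np.linalg.svd(W, compute_uv=False).sum()

def l1_inf_norm(W):
    """||W||_{1,inf}: sum over columns j of max_k |W_kj|."""
    return np.abs(W).max(axis=0).sum()

W = np.array([[3.0, 0.0, 1.0],
              [0.0, 4.0, -2.0]])

print("Frobenius^2:", frobenius_sq(W))
print("nuclear:", nuclear_norm(W))
print("l1,inf:", l1_inf_norm(W))
```

The $\ell_{1,\infty}$ norm is zero for a column only when the corresponding feature is unused by all $K$ classes or tasks, which is why it is associated with selecting features jointly.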

Regularized Empirical Loss Minimization

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top x_i, y_i) + R(w)$$

Both $\ell$ and $R$ are convex functions.
Extensions to matrix cases are possible (sometimes straightforward).
Extensions to kernel methods can be combined with randomized approaches.
Extensions to non-convex problems (e.g., deep learning) are in progress.

Data Matrices and Machine Learning

The Instance-feature Matrix: $X \in \mathbb{R}^{n \times d}$

$$X = \begin{pmatrix} x_1^\top \\ x_2^\top \\ \vdots \\ x_n^\top \end{pmatrix}$$

The output vector:

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \in \mathbb{R}^{n \times 1}$$

continuous $y_i \in \mathbb{R}$: regression (e.g., house price)
discrete, e.g., $y_i \in \{1, 2, 3\}$: classification (e.g., species of iris)

The Instance-Instance Matrix: $K \in \mathbb{R}^{n \times n}$
Similarity Matrix
Kernel Matrix

Some machine learning tasks are formulated on the kernel matrix:
Clustering
Kernel Methods
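As a rough sketch of how an instance-instance matrix is formed from the instance-feature matrix $X$, the code below builds a linear kernel and a Gaussian (RBF) kernel. The choice of these two kernels, the bandwidth `sigma`, and the random data are assumptions for illustration; the slides do not fix a particular kernel.

```python
import numpy as np

def linear_kernel(X):
    """K_ij = x_i^T x_j."""
    return X @ X.T

def gaussian_kernel(X, sigma=1.0):
    """K_ij = exp(-||x_i - x_j||_2^2 / (2 * sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
n, d = 6, 3
X = rng.standard_normal((n, d))

K = gaussian_kernel(X, sigma=1.0)
print(K.shape)   # n x n, independent of d
```

Because $K$ has $n^2$ entries, storing it explicitly becomes expensive when the number of instances $n$ is large.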

The Feature-Feature Matrix: $C \in \mathbb{R}^{d \times d}$
Covariance Matrix
Distance Metric Matrix

Some machine learning tasks require the covariance matrix:
Principal Component Analysis
Top-$k$ Singular Value (Eigen-Value) Decomposition of the Covariance Matrix
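A minimal sketch of principal component analysis through the top-$k$ eigen-decomposition of the covariance matrix is given below. The $\frac{1}{n}$ normalization, the function names, and the synthetic data are assumptions; a $\frac{1}{n-1}$ normalization is equally common.

```python
import numpy as np

def covariance_matrix(X):
    """C = (1/n) * Xc^T Xc, where Xc is X with column means removed."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def top_k_eig(C, k):
    """Top-k eigenvalues and eigenvectors of a symmetric matrix C."""
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

C = covariance_matrix(X)          # d x d feature-feature matrix
vals, vecs = top_k_eig(C, k=2)
Z = (X - X.mean(axis=0)) @ vecs   # n x k projection onto the top-k components
print(vals, Z.shape)
```

The eigenvalues of $C$ equal the squared singular values of the centered data matrix divided by $n$, which is the link between the eigen-value and singular value views named above.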

Why Is Learning from Big Data Challenging?

High per-iteration cost

High memory cost

High communication cost

Large iteration complexity

Basics: Notations and Definitions

Norms

Vector $x \in \mathbb{R}^d$

Euclidean vector norm: $\|x\|_2 = \sqrt{x^\top x} = \sqrt{\sum_{i=1}^{d} x_i^2}$
$\ell_p$-norm of a vector: $\|x\|_p = \left( \sum_{i=1}^{d} |x_i|^p \right)^{1/p}$, where $p \ge 1$
1. $\ell_2$ norm: $\|x\|_2 = \sqrt{\sum_{i=1}^{d} x_i^2}$
2. $\ell_1$ norm: $\|x\|_1 = \sum_{i=1}^{d} |x_i|$
3. $\ell_\infty$ norm: $\|x\|_\infty = \max_i |x_i|$
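The vector norms above can be checked numerically with a short NumPy sketch. The helper `p_norm` and the example vector are assumptions for illustration.

```python
import numpy as np

def p_norm(x, p):
    """(sum_i |x_i|^p)^(1/p) for p >= 1."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])

l2 = p_norm(x, 2)            # Euclidean norm
l1 = p_norm(x, 1)            # sum of absolute values
linf = np.max(np.abs(x))     # largest absolute entry

print(l2, l1, linf)
# As p grows, the p-norm approaches the infinity norm.
print(p_norm(x, 50))
```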

Matrix Factorization

Singular value decomposition (SVD): $X = U \Sigma V^\top$
$X_k = U_k \Sigma_k V_k^\top$: top-$k$ approximation
Pseudo inverse: $X^\dagger = V \Sigma^{-1} U^\top$
QR factorization: $X = QR$ ($n \ge d$)
$Q \in \mathbb{R}^{n \times d}$: orthonormal columns
$R \in \mathbb{R}^{d \times d}$: upper triangular matrix
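The factorizations above can be sketched with NumPy as follows, assuming a random full-column-rank matrix so that every singular value is nonzero and $\Sigma^{-1}$ exists. The sizes `n`, `d`, `k` and the data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 8, 4, 2
X = rng.standard_normal((n, d))   # full column rank with probability 1

# Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Top-k approximation keeps the k largest singular values.
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Pseudo inverse from the SVD (valid here because all s > 0).
X_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

# Reduced QR factorization: Q is n x d, R is d x d upper triangular.
Q, R = np.linalg.qr(X)

print("spectral error of X_k:", np.linalg.norm(X - X_k, 2), "vs", s[k])
print("pinv matches:", np.allclose(X_pinv, np.linalg.pinv(X)))
```

The spectral-norm error of the top-$k$ approximation equals the $(k+1)$-th singular value, which is what the first printed line compares.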

Norms

Matrix $X \in \mathbb{R}^{n \times d}$

Frobenius norm: $\|X\|_F = \sqrt{\operatorname{tr}(X^\top X)} = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{d} X_{ij}^2}$
Spectral (induced) norm of a matrix: $\|X\|_2 = \max_{\|u\|_2 = 1} \|Xu\|_2$
$\|X\|_2 = \sigma_1$ (maximum singular value)
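A short numerical check of the two matrix norms, using a small example matrix chosen here for illustration:

```python
import numpy as np

X = np.array([[3.0, 0.0],
              [0.0, 4.0],
              [0.0, 0.0]])

# Frobenius norm via the trace identity sqrt(tr(X^T X)).
fro = np.sqrt(np.trace(X.T @ X))

# Spectral norm as the largest singular value.
spec = np.linalg.svd(X, compute_uv=False)[0]

print(fro, spec)
```

The spectral norm never exceeds the Frobenius norm, since the latter is the square root of the sum of all squared singular values.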

Convex Optimization

$$\min_{x \in \mathcal{X}} f(x)$$

$\mathcal{X}$ is a convex domain: for any $x, y \in \mathcal{X}$ and $\alpha \in [0, 1]$, their convex combination $\alpha x + (1 - \alpha) y \in \mathcal{X}$
$f(x)$ is a convex function

Convex Function

$$f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y), \quad \forall x, y \in \mathcal{X}, \ \alpha \in [0, 1]$$

$$f(x) \ge f(y) + \nabla f(y)^\top (x - y), \quad \forall x, y \in \mathcal{X}$$

Local optimum is global optimum
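The first-order condition $f(x) \ge f(y) + \nabla f(y)^\top (x - y)$ can be checked numerically for a specific convex function. The sketch below uses the one-dimensional logistic loss $\log(1 + \exp(-x))$ on a grid; the grid and the choice of function are assumptions for illustration, and a grid check is evidence rather than a proof.

```python
import numpy as np

def f(x):
    """Logistic loss log(1 + exp(-x)), evaluated stably."""
    return np.logaddexp(0.0, -x)

def grad_f(x):
    """Derivative of log(1 + exp(-x)): -1 / (1 + exp(x))."""
    return -1.0 / (1.0 + np.exp(x))

xs = np.linspace(-5.0, 5.0, 101)

# gap[i, j] = f(x_i) - (f(y_j) + f'(y_j) * (x_i - y_j)) for all grid pairs.
tangent = f(xs)[None, :] + grad_f(xs)[None, :] * (xs[:, None] - xs[None, :])
gap = f(xs)[:, None] - tangent
min_gap = gap.min()

print("smallest gap over the grid:", min_gap)  # non-negative up to round-off
```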

Convex vs Strongly Convex

Convex function:

$$f(x) \ge f(y) + \nabla f(y)^\top (x - y), \quad \forall x, y \in \mathcal{X}$$

Strongly convex function:

$$f(x) \ge f(y) + \nabla f(y)^\top (x - y) + \frac{\lambda}{2} \|x - y\|_2^2, \quad \forall x, y \in \mathcal{X}$$

$\lambda$: strong convexity constant
Global optimum is unique
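As a worked example of the strong convexity inequality, consider the squared $\ell_2$ regularizer $R(x) = \frac{\lambda}{2}\|x\|_2^2$ (the symbol $R$ is used here only for this illustration). Expanding the right-hand side shows that the inequality holds with equality, so $R$ is $\lambda$-strongly convex:

```latex
\begin{aligned}
R(x) &= \tfrac{\lambda}{2}\|x\|_2^2, \qquad \nabla R(y) = \lambda y, \\
R(y) + \nabla R(y)^\top (x - y) + \tfrac{\lambda}{2}\|x - y\|_2^2
  &= \tfrac{\lambda}{2}\|y\|_2^2 + \lambda y^\top x - \lambda \|y\|_2^2
     + \tfrac{\lambda}{2}\|x\|_2^2 - \lambda x^\top y + \tfrac{\lambda}{2}\|y\|_2^2 \\
  &= \tfrac{\lambda}{2}\|x\|_2^2 = R(x).
\end{aligned}
```

Adding a convex function to a $\lambda$-strongly convex one keeps the sum $\lambda$-strongly convex, so any convex loss plus this regularizer has a unique global optimum.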

Non-smooth function vs Smooth function

Non-smooth function
Lipschitz continuous: e.g., absolute loss $f(x) = |x|$
Subgradient: $f(x) \ge f(y) + \partial f(y)^\top (x - y)$
[Figure: $|x|$, a non-smooth function, with a sub-gradient line]

Smooth function
e.g., logistic loss $f(x) = \log(1 + \exp(-x))$
[Figure: $\log(1 + \exp(-x))$ with the tangent line $f(y) + f'(y)(x - y)$ at $y$ and a quadratic function above $f(x)$]
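The defining inequalities for these two classes did not survive in the text above. The standard definitions are sketched below; the constant names $G$ and $L$ are assumptions and may differ from the symbols used on the original slides.

```latex
\begin{aligned}
&\text{Lipschitz continuous with constant } G: &
  |f(x) - f(y)| &\le G \, \|x - y\|_2, \quad \forall x, y \in \mathcal{X} \\
&\text{Smooth with constant } L: &
  \|\nabla f(x) - \nabla f(y)\|_2 &\le L \, \|x - y\|_2, \quad \forall x, y \in \mathcal{X} \\
&\text{Quadratic upper bound implied by smoothness:} &
  f(x) &\le f(y) + \nabla f(y)^\top (x - y) + \tfrac{L}{2} \|x - y\|_2^2
\end{aligned}
```

For the two examples named above, the absolute loss $|x|$ is Lipschitz continuous with $G = 1$, and the logistic loss $\log(1 + \exp(-x))$ is smooth with $L = \tfrac{1}{4}$, because its second derivative $\sigma(x)(1 - \sigma(x))$, with $\sigma(x) = 1/(1 + \exp(-x))$, never exceeds $\tfrac{1}{4}$.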
