Matrix Multiplication
Matrix Multiplication Brian Krummel February 28, 2020 Today we want to de ne matrix multiplication The idea is to de ne matrix multiplication as a composition of linear transformations In particular, let A be an m n matrix and B be a n p matrix Let T A(X) = AX and T B(X) = BX be the corresponding matrix transformations We
Matrix Multiplication - Michigan State University
Matrix Multiplication Simplify Write "undefined" for expressions that are undefined 1) 0 2 −2 −5 ⋅ 6 −6 3 0 2) 6 −3 ⋅ −5 4 3) −5 −5
Notes on Matrix Multiplication and the Transitive Closure
Notes on Matrix Multiplication and the Transitive Closure Instructor: Sandy Irani An n m matrix over a set S is an array of elements from S with n rows and m columns Each element in a matrix is called an entry The entry in row i and column j is denoted by A i;j A matrix is called a square matrix if the number of rows is equal to the number
Section 24 - Properties of Matrix-Matrix Multiplication
Matrix-Matrix Multiplication is Associative Let A, B, and C be matrices of conforming dimensions Then (AB)C = A(BC): Proof Let e j equal the jth unit basis vector Then (AB)Ce j = (AB)c
CS 140 : Matrix multiplication
Sequential Matrix Multiplication Simple mathematics, but getting good performance is complicated by memory hierarchy --- cache issues Naïve 3-loop matrix multiply
Matrix Multiplication - SageMath
J: matrix of Jordan blocks for eigenvalues P: nonsingular matrix A smith_form() triple with: D == U*A*V D: elementary divisors on diagonal U, V: with unit determinant A LU() triple with: P*A == L*U P: a permutation matrix L: lower triangular matrix, U: upper triangular matrix A QR() pair with: A == Q*R Q: a unitary matrix, R: upper triangular
Matrix-vector Multiplication
matrix Sequential algorithm complexity: (n2) – multiplying n elements of each row of the matrix times n elements of the vector Parallel algorithm computational complexity: (n2/p) Communication complexity of all-gather: (log p + n) Why? All processes sending log p results to one process Assuming that p is a square number
2 Approximating Matrix Multiplication
2 2 Approximating matrix multiplication by random sampling We will start by considering a very simple randomized algorithm to approximate the product of two matrices Matrix multiplication is a fundamental linear algebraic problem, and this randomized algorithm for it is of interest in its own right
[PDF] signes qu'un homme prend du plaisir
[PDF] arts visuels cycle 2 arbre printemps
[PDF] arts visuels arbres cycle 2
[PDF] arbre arts visuels cycle 3
[PDF] arbres arts visuels
[PDF] les arbres en arts plastiques ? l'école
[PDF] arbre arts visuels cycle 2
[PDF] arbre arts plastiques maternelle
[PDF] comment rediger un exercice de math
[PDF] redaction maths prepa
[PDF] académie de créteil recrutement sans concours
[PDF] date titularisation enseignant stagiaire 2017
[PDF] liste titularisation creteil 2016
[PDF] exemple de corpus corrigé poesie
Stat260/CS294: Randomized Algorithms for Matrices and Data
Lecture 2 - 09/09/2013
Lecture 2: Approximating Matrix Multiplication
Lecturer: Michael Mahoney Scribe: Michael MahoneyWarning: these notes are still very rough. They provide more details on what we discussed in class,
but there may still be some errors, incomplete/imprecise statements, etc. in them.2 Approximating Matrix Multiplication
Approximating the product of two matrices with random sampling or random projection methods isa fundamental operation that is of interest in and of itself as well as since it is used in a critical way
as a primitive for many RandNLA algorithms. In this class, we will introduce a basic algorithm; and in this and the next few classes, we will discusses several related methods. Here is the reading for today. Drineas, Kannan, and Mahoney, \Fast Monte Carlo Algorithms for Matrices I: ApproximatingMatrix Multiplication"
2.1 Some notation
Before proceeding, here is some notation that we will use. For a vectorx2Rnwe letkxk2=Pn i=1jxij21=2denote its Euclidean length. For a matrixA2Rmnwe letA(j),j= 1;:::;n, denote thej-th column ofAas a column vector andA(i),i= 1;:::;m, denote thei-th row ofA as a row vector. We denote matrix norms bykAk, using subscripts to distinguish between various norms. Of particular interest will be the Frobenius norm which is dened by kAkF=v uutm X i=1n X j=1A2ij;(1)
and the spectral norm which is dened by kAk2= sup x2Rn; x6=0kAxk2kxk2:(2) These norms are related to each other as:kAk2 kAkFpnkAk2. Both of these norms provide a measure of the \size" of the matrixA. There are several situations in which we will be interested in measuring the size of a matrix|e.g., the error that is incurred by a random sampling or random projection process can be viewed as a matrix, and we will be interested in showing that it is small in an appropriate norm. 12.2 Approximating matrix multiplication by random sampling
We will start by considering a very simple randomized algorithm to approximate the product of two matrices. Matrix multiplication is a fundamental linear algebraic problem, and this randomizedalgorithm for it is of interest in its own right. In addition, this algorithm is of interest since matrix
multiplication is a primitive that is used|often \under the hood" or within the analysis|for many many other randomized algorithms for many many other matrix problems, and thus this algorithm will appear|either explicitly or implicitly, e.g., within the analysis|throughout the course. The problem is the following: given an arbitrarymnmatrixAand an arbitrarynpmatrixB, compute, exactly or approximately, the productAB. As a starting point, the well-known three-loopalgorithm to solve this problem is the following.Algorithm 1Vanilla three-look matrix multiplication algorithm.Input:AnmnmatrixAand annpmatrixB
Output:The productAB
1:fori= 1 tomdo
2:forj= 1 topdo
3:(AB)ij= 0
4:fork= 1 tondo
5:(AB)ik+=AijBjk
6:end for
7:end for
8:end for
9:ReturnABThe running time of this algorithm isO(mnp) time, which isO(n3) time ifm=n=p. Note in
particular that this algorithm loops over all pairs of elements in the product matrix and computes that element as a dot product or inner product between theithrow ofAand thejthcolumn ofB. The question of interest here is: can we solve this problem more quickly? There has been a lot of work on Strassen-like algorithms, which say that one can use a recursive procedure to decrease the running time too(n3) time. For a range of reasons, these algorithms are rarely-used in practice. They will not be our focus; but, since some of our randomized algorithms will call traditional algorithms as black boxes, Strassen-like algorithms can be used to speed up running times of those black boxes, theoretically at least. Here, we will consider a dierent approach: a randomized algorithm that randomly samples columns and rows of the matricesAandB. The key insight is that one shouldnotthink of matrix multi- plication as looping over elements in the product matrix and computing, say, the (i;j)th element of the product matrix as the dot product between theithrowofAand thejthcolumnofB, as is common. That is, the usual perspective is that the elements of the product matrix should be viewed as (AB)ik=nX j=1A ijBjk=A(i)B(j); where eachA(i)B(j)2Ris a number, computed as the inner product of two vectors inRn. Insteadof this, one should think of matrix multiplication as returning a matrix that equals the sum of outer
products ofcolumnsofAand the correspondingrowsofB, i.e., as the sum of rank-one matrices. 2Recall that
AB=nX i=1A (i)B(i);(3) where eachA(i)B(i)2Rmpis anmprank-one matrix, computed as the outer product of two vectors inRn. Viewing matrix multiplication as the sum of outer productssuggests, by analogy with the sum of numbers, that we should sample rank-1 components, to minimize their size, according to their size. Recall that, if we were summing numbers, that we could sample (and rescale|see below) according to any probability distribution, and in particular the uniform distribution, and obtain an unbiased estimator of the sum; but that if we want to minimize the variance of the estimator that we should sample (and rescale) according to the size or magnitude of the numbers. Well, the same is true inthe case of matrices. Since the role of these probabilities will be important in what follows, we will
leave then unspecied as input to this algorithm, and we will return below to what probabilities should or could be used in this algorithm.Given that background, here is ourBasicMatrixMultiplicationalgorithm.Algorithm 2TheBasicMatrixMultiplicationalgorithm.Input:AnmnmatrixA, annpmatrixB, a positive integerc, and probabilitiesfpigni=1.
Output:MatricesCandRsuch thatCRAB
1:fort= 1 tocdo
2:Pickit2 f1;:::;ngwith probabilityPr[it=k] =pk, in i.i.d. trials, with replacement
3:SetC(t)=A(it)=pcp
itandR(t)=B(it)=pcp it.4:end for
5:ReturnCandR.Basically, what we want to show for this algorithm is that
AB=nX i=1A (i)B(i)) 1c c X t=11p itA(it)B(it)) =CR: In particular, we will want to show thatkABCRkis small, for appropriate matrix norms.Not surprisingly, the extent to which we will be able to do this will depend strongly on the sampling
probabilitiesfpigni=1and the number of samplesc, in ways that we will describe in detail. For much of what follows, it will be convenient to express this and related subsequent algorithms in a standardized matrix notation that we will call thesampling matrix formalism. To do so, letS2Rncbe a matrix such that
S ij=1 if theithcolumn ofAis chosen in thejthindependent trial0 otherwise
and letD2Rccbe a diagonal matrix such that D tt= 1=pcp it 3 With this notation, we can write the output of theBasicMatrixMultiplicationalgorithm and what it is doing asC=ASD,R= (SD)TB, andCR=ASD(SD)TB=ASSBAB:
Here,S=SDis just a way to absorb the diagonal rescaling matrix into the sampling matrix; we will do this often, and we will often refer to it simply asS|since one nearly always rescales in a standard way when one samples, the meaning should be clear from context.2.3 Sampling probabilities and implementation issues
For approximating the product of two matrices, as well as for all of the other problems we will consider this semester, one can always sample uniformly at random. Unfortunately, as we willshow, by doing so one would do very poorly in general, in the sense that it is very easy to construct
matrices for which uniform sampling will perform extremely poorly. Thus, a central question in everything we will do will be: how should we construct the random sample? Sincefpigni=1are essentially importance sampling probabilities, informally we would like to choose samples that are more important. Quantifying this is a little subtle for some of the other problems we will consider, but for approximating the product of two matrices, it is quite easy. Recall that we are basically trying to approximate the product of two matrices by sampling randomly rank-one components in the sum Eqn. (3). Thus, by analogy with biasing oneself toward larger terms in the sum of numbers, in order to minimize variance, we would like to bias our random sample toward rank-one components that are larger. The notion of size or magnitude of a rank-one matrix that we will use is the spectral norm of the rank-one components. That is, we will choose columns of Aand rows ofBaccording to a probability distribution that is proportional to