Derivative of a Matrix : Data Science Basics

1
Math for CS, Lecture 4: Linear Least Squares Problem
Topics: over-determined systems; minimization problem (least squares norm); normal equations; singular value decomposition.

2
Linear Least Squares: Example
Consider an equation for a stretched beam: $Y = x_1 + x_2 T$, where $x_1$ is the original length, $T$ is the force applied, and $x_2$ is the inverse coefficient of stiffness. Suppose that the following measurements were taken:
T    10      15      20
Y    11.60   11.85   12.25
They correspond to the overdetermined system
$11.60 = x_1 + 10\,x_2$
$11.85 = x_1 + 15\,x_2$
$12.25 = x_1 + 20\,x_2$
which cannot be satisfied exactly.

3
Linear Least Squares: Definition
Problem: given A (m x n) with m ≥ n and b (m x 1), find x (n x 1) that minimizes $\|Ax - b\|_2$.
If m > n, there are more equations than unknowns, and generally no x satisfies Ax = b exactly. This is an overdetermined system.

4
Solution Approaches
There are three different algorithms for computing the least squares minimum:
1. Normal equations (cheap, less accurate).
2. QR decomposition.
3. SVD (expensive, more reliable).
The first algorithm is the fastest and the least accurate of the three. SVD, on the other hand, is the slowest and the most accurate.

5
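Normal Equations 1
Minimize the squared Euclidean norm of the residual vector $r = b - Ax$:
$$\min_x \|r\|_2^2 = (b - Ax)^T (b - Ax) = x^T A^T A x - 2\, x^T A^T b + b^T b.$$
To minimize, we take the derivative with respect to $x$ and set it to zero:
$$2 A^T A x - 2 A^T b = 0,$$
which reduces to an (n x n) linear system commonly known as the NORMAL EQUATIONS:
$$A^T A x = A^T b.$$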

6
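Normal Equations 2
The beam system in matrix form:
$$\underbrace{\begin{pmatrix} 1 & 10 \\ 1 & 15 \\ 1 & 20 \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} 11.60 \\ 11.85 \\ 12.25 \end{pmatrix}}_{b}, \qquad \min_x \|Ax - b\|_2.$$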

7
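Normal Equations 3
We must solve the system $A^T A x = A^T b$. For the beam measurements (T = 10, 15, 20; Y = 11.60, 11.85, 12.25) the values are
$$A^T A = \begin{pmatrix} 3 & 45 \\ 45 & 725 \end{pmatrix}, \qquad A^T b = \begin{pmatrix} 35.70 \\ 538.75 \end{pmatrix}.$$
The matrix $(A^T A)^{-1} A^T$ is called the pseudo-inverse of $A$.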
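As a rough check of the normal equations on the beam measurements, here is a minimal Python sketch. The function name `solve_normal_equations` and the use of Cramer's rule for the 2 x 2 system are illustrative assumptions, not part of the lecture.

```python
# Beam measurements from the example: applied force T and measured length Y.
T = [10.0, 15.0, 20.0]
Y = [11.60, 11.85, 12.25]


def solve_normal_equations(t, y):
    """Fit y ~ x1 + x2 * t by solving the 2 x 2 system (A^T A) x = A^T y.

    Hypothetical helper for illustration; A has rows [1, t_i].
    """
    m = len(t)
    # Entries of A^T A = [[m, s_t], [s_t, s_tt]].
    s_t = sum(t)
    s_tt = sum(ti * ti for ti in t)
    # Entries of A^T y = [s_y, s_ty].
    s_y = sum(y)
    s_ty = sum(ti * yi for ti, yi in zip(t, y))
    # Cramer's rule for the 2 x 2 system.
    det = m * s_tt - s_t * s_t
    x1 = (s_tt * s_y - s_t * s_ty) / det
    x2 = (m * s_ty - s_t * s_y) / det
    return x1, x2


x1, x2 = solve_normal_equations(T, Y)
print(x1, x2)  # approximately 10.925 and 0.065
```

Under this fit, the estimated original length is about 10.925 and the inverse stiffness about 0.065.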

8
QR factorization 1
A matrix Q is said to be orthogonal if its columns are orthonormal, i.e. $Q^T Q = I$.
Orthogonal transformations preserve the Euclidean norm, since
$$\|Qv\|_2^2 = (Qv)^T (Qv) = v^T Q^T Q v = v^T v = \|v\|_2^2.$$
Orthogonal matrices can transform vectors in various ways, such as rotations or reflections, but they do not change the Euclidean length of a vector. Hence, they preserve the solution to a linear least squares problem.
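The norm-preservation property can be checked numerically. The sketch below uses a 2 x 2 rotation as one example of an orthogonal matrix; the helper names `rotation`, `matvec`, and `norm2`, the angle, and the test vector are arbitrary choices made for illustration.

```python
import math


def rotation(theta):
    """2 x 2 rotation matrix, one example of an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def norm2(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(vi * vi for vi in v))


Q = rotation(0.7)  # arbitrary angle, chosen only for illustration
v = [3.0, 4.0]
Qv = matvec(Q, v)
print(norm2(v), norm2(Qv))  # the two lengths agree up to rounding
```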

9
QR factorization 2
Any matrix A (m x n) with m ≥ n can be represented as A = Q·R, where Q (m x n) has orthonormal columns and R (n x n) is upper triangular.

10
QR factorization 3
Given A, let its QR decomposition be A = Q·R, where Q is an (m x n) matrix with orthonormal columns and R is upper triangular. QR factorization transforms the linear least squares problem into a triangular least squares problem:
$$Q R x = b \;\Rightarrow\; R x = Q^T b \;\Rightarrow\; x = R^{-1} Q^T b.$$
Matlab code: [not preserved in this transcript]
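As a stand-in for the slide's Matlab listing, here is a minimal Python sketch of least squares via QR. It assumes modified Gram-Schmidt for the factorization, chosen for brevity (library routines typically use Householder reflections), and the names `qr_gram_schmidt` and `lstsq_qr` are hypothetical.

```python
import math


def qr_gram_schmidt(A):
    """Thin QR (Q is m x n, R is n x n) of a full-column-rank matrix A.

    A is a list of rows. Uses modified Gram-Schmidt, an assumption made
    for this sketch rather than the method used by the lecture.
    """
    m, n = len(A), len(A[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]
        for k in range(j):
            # Remove the component of v along the already computed column q_k.
            R[k][j] = sum(Q[i][k] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = math.sqrt(sum(vi * vi for vi in v))
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    return Q, R


def lstsq_qr(A, b):
    """Solve min ||Ax - b||_2 through R x = Q^T b."""
    Q, R = qr_gram_schmidt(A)
    m, n = len(A), len(A[0])
    c = [sum(Q[i][j] * b[i] for i in range(m)) for j in range(n)]  # c = Q^T b
    x = [0.0] * n
    for j in reversed(range(n)):  # back substitution on the triangular system
        s = c[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))
        x[j] = s / R[j][j]
    return x


# Beam data from the example: columns of A are [1, T].
A = [[1.0, 10.0], [1.0, 15.0], [1.0, 20.0]]
b = [11.60, 11.85, 12.25]
x = lstsq_qr(A, b)
print(x)  # approximately [10.925, 0.065]
```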

11
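Singular Value Decomposition
Normal equations and QR decomposition only work for full-rank matrices (i.e. rank(A) = n). If A is rank-deficient, then there are infinitely many solutions to the least squares problem, and we can use algorithms based on the SVD.
Given the SVD
$$A = U \Sigma V^T,$$
where U (m x m) and V (n x n) are orthogonal and Σ is an (m x n) diagonal matrix (the singular values of A), the minimal-norm solution corresponds to
$$x = V \Sigma^{+} U^T b = \sum_{\sigma_i \neq 0} \frac{u_i^T b}{\sigma_i}\, v_i,$$
where $\Sigma^{+}$ is the (n x m) diagonal matrix with entries $1/\sigma_i$ for the nonzero singular values and zero elsewhere.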

12
Singular Value Decomposition
Matlab code: [not preserved in this transcript]
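As a stand-in for the slide's Matlab listing, the following Python sketch computes a minimum-norm least squares solution through the singular vectors. It is deliberately restricted to matrices with two columns, so that the right singular vectors can be found in closed form from the 2 x 2 matrix A^T A. The function name `min_norm_lstsq_two_columns` and the tolerance `rel_tol` are assumptions made for this example.

```python
import math


def min_norm_lstsq_two_columns(A, b, rel_tol=1e-10):
    """Minimum-norm least squares solution for an m x 2 matrix A (list of rows).

    Hypothetical helper: sums (u_i^T b / sigma_i) v_i over the singular
    values that are not (numerically) zero.
    """
    # Entries of the symmetric 2 x 2 matrix A^T A = [[p, q], [q, r]].
    p = sum(row[0] * row[0] for row in A)
    q = sum(row[0] * row[1] for row in A)
    r = sum(row[1] * row[1] for row in A)
    # Right singular vectors are the eigenvectors of A^T A (a rotation by theta).
    theta = 0.5 * math.atan2(2.0 * q, p - r)
    V = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]
    # A v_i = sigma_i u_i, so sigma_i is the length of A v_i.
    AV = [[row[0] * v[0] + row[1] * v[1] for row in A] for v in V]
    sigmas = [math.sqrt(sum(t * t for t in av)) for av in AV]
    x = [0.0, 0.0]
    for v, av, sigma in zip(V, AV, sigmas):
        if sigma > rel_tol * max(sigmas):  # skip numerically zero singular values
            # (u_i^T b) / sigma_i with u_i = A v_i / sigma_i.
            coef = sum(t * bi for t, bi in zip(av, b)) / (sigma * sigma)
            x = [x[0] + coef * v[0], x[1] + coef * v[1]]
    return x


# Rank-deficient example: both columns are identical, so rank(A) = 1.
A = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x = min_norm_lstsq_two_columns(A, b)
print(x)  # close to [1.0, 1.0]
```

Every x with x1 + x2 = 2 minimizes the residual in this example; the SVD-based formula picks the one of smallest Euclidean norm.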

13
Linear algebra review: SVD

14
Approximation by a low-rank matrix

15
Geometric Interpretation of the SVD
The image of the unit sphere under any m x n matrix is a hyperellipse.
[Figure: the unit sphere with vectors v1, v2, and its image, a hyperellipse with axes labeled σ·v1, σ·v2]

16
Left and Right Singular Vectors
We can define the properties of A in terms of the shape of AS, the image of the unit sphere S under A.
[Figure: the unit sphere S with vectors v1, v2, and the hyperellipse AS with semiaxes labeled σ·u1, σ·u2]
The singular values of A are the lengths of the principal semiaxes of AS, usually written in non-increasing order: $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n$.
The n left singular vectors of A are the unit vectors $\{u_1, \dots, u_n\}$, oriented in the directions of the principal semiaxes of AS and numbered in correspondence with $\{\sigma_i\}$.
The n right singular vectors of A are the unit vectors $\{v_1, \dots, v_n\}$ of S, which are the preimages of the principal semiaxes of AS: $A v_i = \sigma_i u_i$.

17
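Singular Value Decomposition
$$A v_i = \sigma_i u_i, \qquad 1 \le i \le n.$$
In matrix form, $A V = U \Sigma$, and therefore
$$A = U \Sigma V^{*}.$$
The matrices U and V are orthogonal and Σ is diagonal. This factorization is the singular value decomposition.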

18
Matrices in the Diagonal Form
Every matrix is diagonal in an appropriate basis:
Any vector b (m x 1) can be expanded in the basis of left singular vectors of A, $\{u_i\}$.
Any vector x (n x 1) can be expanded in the basis of right singular vectors of A, $\{v_i\}$.
Their coordinates in these new expansions are
$$b' = U^{*} b, \qquad x' = V^{*} x.$$
Then the relation b = Ax can be expressed in terms of b' and x':
$$b = Ax \;\Leftrightarrow\; U^{*} b = U^{*} U \Sigma V^{*} x \;\Leftrightarrow\; b' = \Sigma x'.$$

19
Rank of A
Let p = min{m, n}, and let r ≤ p denote the number of nonzero singular values of A. Then the rank of A equals r, the number of nonzero singular values.
Proof: The rank of a diagonal matrix equals the number of its nonzero entries, and in the decomposition $A = U \Sigma V^{*}$, U and V are of full rank.

20
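Determinant of A
For A (m x m),
$$|\det(A)| = \prod_{i=1}^{m} \sigma_i.$$
Proof: The determinant of a product of square matrices is the product of their determinants. The determinant of a unitary matrix is 1 in absolute value, since $U^{*} U = I$. Therefore,
$$|\det(A)| = |\det(U \Sigma V^{*})| = |\det(U)|\,|\det(\Sigma)|\,|\det(V^{*})| = |\det(\Sigma)| = \prod_{i=1}^{m} \sigma_i.$$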
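The determinant identity can be checked on a small case. The sketch below is limited to real 2 x 2 matrices, where the singular values follow in closed form from the eigenvalues of A^T A; the function name `singular_values_2x2` and the test matrix are choices made for illustration.

```python
import math


def singular_values_2x2(A):
    """Singular values of a real 2 x 2 matrix, largest first.

    Hypothetical helper: square roots of the eigenvalues of A^T A.
    """
    (a, b), (c, d) = A
    # A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean = 0.5 * (p + r)
    rad = math.hypot(0.5 * (p - r), q)
    # Clamp at zero to guard against a tiny negative value from rounding.
    return math.sqrt(mean + rad), math.sqrt(max(mean - rad, 0.0))


A = [[1.0, 2.0], [3.0, 4.0]]
s1, s2 = singular_values_2x2(A)
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(s1 * s2, abs(det))  # both are 2 up to rounding, although det(A) = -2
```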

21
A in terms of singular vectors
A (m x n) can be written as a sum of r rank-one matrices:
$$A = \sum_{j=1}^{r} \sigma_j u_j v_j^{*}. \qquad (1)$$
Proof: If we write Σ as a sum of matrices $\Sigma_i$, where $\Sigma_i = \mathrm{diag}(0, \dots, \sigma_i, \dots, 0)$, then (1) follows from
$$A = U \Sigma V^{*}. \qquad (2)$$

22
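Norm of the matrix
The L2 norm of a vector is defined as
$$\|x\|_2 = \sqrt{x^{*} x} = \sqrt{\sum_{i} |x_i|^2}. \qquad (1)$$
The L2 norm of a matrix is defined as
$$\|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2}.$$
Therefore,
$$\|A\|_2 = \sqrt{\max_i \lambda_i} = \sigma_1,$$
where $\lambda_i$ are the eigenvalues of $A^{*} A$.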
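To illustrate that the matrix 2-norm equals the largest singular value, the sketch below compares the closed-form value for a real 2 x 2 matrix with a brute-force maximum of ||Ax|| over sampled unit vectors. The function names, the sample count, and the test matrix are assumptions made for this example.

```python
import math


def spectral_norm_2x2(A):
    """sigma_1 = sqrt(largest eigenvalue of A^T A) for a real 2 x 2 matrix."""
    (a, b), (c, d) = A
    # A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    lam_max = 0.5 * (p + r) + math.hypot(0.5 * (p - r), q)
    return math.sqrt(lam_max)


def sampled_norm_2x2(A, samples=3600):
    """Largest ||A x||_2 over unit vectors x sampled on the unit circle."""
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x = (math.cos(t), math.sin(t))
        Ax = (A[0][0] * x[0] + A[0][1] * x[1],
              A[1][0] * x[0] + A[1][1] * x[1])
        best = max(best, math.hypot(Ax[0], Ax[1]))
    return best


A = [[1.0, 2.0], [3.0, 4.0]]
sigma_1 = spectral_norm_2x2(A)
estimate = sampled_norm_2x2(A)
print(sigma_1, estimate)  # the sampled maximum approaches sigma_1 from below
```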

23
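Matrix Approximation in SVD basis
For any ν with 0 ≤ ν ≤ r, define
$$A_\nu = \sum_{j=1}^{\nu} \sigma_j u_j v_j^{*}. \qquad (1)$$
If ν = p = min{m, n}, define $\sigma_{\nu+1} = 0$. Then
$$\|A - A_\nu\|_2 = \inf_{\mathrm{rank}(B) \le \nu} \|A - B\|_2 = \sigma_{\nu+1}.$$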
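The approximation error can be checked for ν = 1 on a real 2 x 2 matrix. This is a minimal sketch: the helper names and the test matrix are hypothetical, and the closed-form singular vector only applies to the 2 x 2 case.

```python
import math


def top_right_singular_vector(A):
    """Unit vector v1 maximizing ||A v||_2 for a real 2 x 2 matrix."""
    (a, b), (c, d) = A
    # A^T A = [[p, q], [q, r]]; its leading eigenvector is a rotation by theta.
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    theta = 0.5 * math.atan2(2.0 * q, p - r)
    return [math.cos(theta), math.sin(theta)]


def apply(A, v):
    """Matrix-vector product for a 2 x 2 matrix stored as a list of rows."""
    return [A[i][0] * v[0] + A[i][1] * v[1] for i in range(2)]


def spectral_norm(A):
    """||A||_2 = ||A v1||_2 with v1 the top right singular vector."""
    Av = apply(A, top_right_singular_vector(A))
    return math.hypot(Av[0], Av[1])


A = [[1.0, 2.0], [3.0, 4.0]]
v1 = top_right_singular_vector(A)
Av1 = apply(A, v1)  # equals sigma_1 * u_1
# Rank-one approximation A_1 = sigma_1 u_1 v_1^T = (A v_1) v_1^T.
A1 = [[Av1[i] * v1[j] for j in range(2)] for i in range(2)]
E = [[A[i][j] - A1[i][j] for j in range(2)] for i in range(2)]

v2 = [-v1[1], v1[0]]  # second right singular vector, orthogonal to v1
Av2 = apply(A, v2)
sigma_2 = math.hypot(Av2[0], Av2[1])
print(spectral_norm(E), sigma_2)  # the error norm matches sigma_2
```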
