20.3 Perform matrix factorization using irlba
Video duration: 4m
Ever since the Netflix Prize, matrix factorization has proven very popular. If you want to factorize your own matrix, you can use the irlba package from Bryan Lewis. The name stands for implicitly restarted Lanczos bidiagonalization, and it is a very fast, memory-efficient algorithm for matrix factorization.

To try it, first let's get some data. On my website at jaredlander.com I posted an adjacency matrix in sparse matrix format. We can get it by running load(url("http://www.jaredlander.com/data/adjacency.rdata")). If we look at the object adj, we see the sparse adjacency matrix.

The whole point of factorizing a matrix is to break it down into its approximate components: the matrix A can be broken down into a left matrix U, a diagonal matrix D, and the transpose of a right matrix V. By decomposing it into these parts and then multiplying them back together, we get an approximation of the original, which is good for imputing missing values or making recommendations.

To do this, we load the irlba package and save the result as matDec, for matrix decomposition: matDec <- irlba(adj, nu = 3, nv = 3). We need to specify two values: nu, the number of left singular vectors, and nv, the number of right singular vectors. I'll set them both to three for now. And the decomposition is done. We can look at each of the pieces individually. If we look at the structure of this decomposed object, we have d, which is just a vector but represents the diagonal values, u and v, the number of iterations, and some other controls; no need to worry about them too much. The beautiful thing about irlba is that it can take both dense matrices and sparse matrices.

Now that we have our decomposition, we should multiply it back together to find the predicted adjacency matrix. We want to keep things in the spirit of sparse matrices, so we will load the Matrix package.
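The steps described above can be sketched as follows. Since the adjacency file from jaredlander.com may not be reachable in every setting, this sketch builds a small random sparse 0/1 matrix as a stand-in for the adjacency matrix; the irlba call itself matches the one described in the video.

```r
library(Matrix)   # sparse matrix support
library(irlba)    # implicitly restarted Lanczos bidiagonalization

# Stand-in for the adjacency matrix from the video:
# a random sparse binary matrix (hypothetical example data)
set.seed(42)
adj <- Matrix(rbinom(50 * 40, size = 1, prob = 0.1),
              nrow = 50, ncol = 40, sparse = TRUE)

# In the video, the real data is loaded instead with:
# load(url("http://www.jaredlander.com/data/adjacency.rdata"))

# nu = number of left singular vectors, nv = number of right
matDec <- irlba(adj, nu = 3, nv = 3)

# d is a vector of singular values (the diagonal of D);
# u and v hold the left and right singular vectors
str(matDec)
```

Note that irlba happily accepts either a dense base-R matrix or a sparse Matrix object, which is what makes it practical for large adjacency matrices.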
We set adjPrime <- matDec$u %*% Diagonal(x = matDec$d) %*% t(matDec$v). We just want to look at the rounded values, so I will say round(adjPrime, 2). And we see here, spread out over a few lines, the approximations for our adjacency matrix. Now, if you recall, the adjacency matrix is all ones and zeros in this case, so what we would do is find some cutoff: where a predicted value is above that measure, predict a one; below it, predict a zero. As matrix factorization becomes more popular, it's good to have a powerful tool to quickly decompose a matrix, and irlba by Bryan Lewis is seeing rapid improvement and is really becoming an essential tool for that process.
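The reconstruction and cutoff step might look like this. The same hypothetical random matrix stands in for the real adjacency data, and the 0.5 cutoff is an illustrative choice, not one given in the video.

```r
library(Matrix)
library(irlba)

# Hypothetical stand-in for the adjacency matrix
set.seed(42)
adj <- Matrix(rbinom(50 * 40, size = 1, prob = 0.1),
              nrow = 50, ncol = 40, sparse = TRUE)
matDec <- irlba(adj, nu = 3, nv = 3)

# Multiply the factors back together: A is approximately U D V'
adjPrime <- matDec$u %*% Diagonal(x = matDec$d) %*% t(matDec$v)

# Inspect the rounded approximations
round(adjPrime[1:5, 1:5], 2)

# Apply a cutoff (0.5 here is an arbitrary illustrative choice):
# above the cutoff predict a one, below it predict a zero
predicted <- (adjPrime > 0.5) * 1
```

Diagonal() from the Matrix package keeps the diagonal factor sparse, so the multiplication stays in the spirit of sparse matrices as the video suggests.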