Linear Algebra review - Matrix-matrix multiplication

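Abstract: This article is the transcript of lesson 17, "Matrix Multiplication", from chapter 3, "Linear Algebra Review", of Andrew Ng's Machine Learning course. I transcribed it word by word while studying the videos and edited it to make it more concise and easier to read, so that I can refer back to it later. I am sharing it here. If you find any errors, corrections are welcome, with my sincere thanks. I also hope it helps with your own studies.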

In this video (article) we talk about matrix-matrix multiplication, or how to multiply two matrices together. Later, when we talk about the method in linear regression for solving for the parameters \theta _{0} and \theta _{1} all in one shot, without needing an iterative algorithm like gradient descent, it turns out that matrix-matrix multiplication is one of the key steps that you need to know.

So, let's start with an example. Let's say I have two matrices, and I want to multiply them together. Let me just work through this example first, and then I'll tell you a little bit about what happened. The first thing I'm going to do is pull out the first column of the matrix on the right. Then I'm going to take the matrix on the left and multiply it by a vector, which is just this first column. It turns out that if I do that, I get the vector \begin{bmatrix} 11\\ 9 \end{bmatrix}. This is the same matrix-vector multiplication that you saw in the last video (article). The second thing I'm going to do is pull out the second column of the matrix on the right, and then take the matrix on the left and multiply it by that second column. It turns out that you get \begin{bmatrix} 10\\ 14 \end{bmatrix}. Then I'm just going to take these two results and put them together, and that will be my answer. So the output of this product is the 2\times 2 matrix \begin{bmatrix} 11 & 10\\ 9 & 14 \end{bmatrix}. That was the mechanics of how to multiply a matrix by another matrix: you look at the second matrix one column at a time, and you assemble the answers. We'll step through this much more carefully in a second, but I also want to point out that this first example multiplies a 2\times 3 matrix by a 3\times 2 matrix, and the outcome of this product turns out to be a 2\times 2 matrix.
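The column-at-a-time procedure above can be sketched in plain Python. The two input matrices appear only in the original figure, which is not reproduced in this text, so the values of `A` and `B` below are assumptions chosen to be consistent with the stated results; the helper name `mat_vec` is also just illustrative.

```python
# Assumed inputs: the source figure is missing, so these values are
# illustrative and were picked to reproduce the results quoted in the text.
A = [[1, 3, 2],
     [4, 0, 1]]          # 2x3 matrix (the one "on the left")
B = [[1, 3],
     [0, 1],
     [5, 2]]             # 3x2 matrix (the one "on the right")

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# Pull out each column of B and multiply A by it.
first_column = [row[0] for row in B]
second_column = [row[1] for row in B]
c1 = mat_vec(A, first_column)     # [11, 9]
c2 = mat_vec(A, second_column)    # [10, 14]

# Put the two result vectors side by side as the columns of the product.
C = [[c1[i], c2[i]] for i in range(len(A))]
print(C)  # [[11, 10], [9, 14]]
```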

Let's actually look at the details. I have a matrix A, and I want to multiply it with a matrix B; the result will be some new matrix C. It turns out that you can only multiply together matrices whose dimensions match. So A is an m\times n matrix, that is, m rows and n columns. I am going to multiply that with an n\times o matrix, and it turns out this n here (the number of columns of A) must match this n here (the number of rows of B), so the number of columns in the first matrix must equal the number of rows in the second matrix. The result of this product will be an m\times o matrix, like the matrix C here. In the previous video, everything we did corresponded to the special case of o being equal to 1, which was the case of B being a vector. But now we're going to deal with values of o larger than 1. So, here's how you multiply the two matrices together. I'm going to take the first column of B, treat it as a vector, and multiply the matrix A with it. The result will be an m\times 1 vector, and I'm going to put that over here (green box in C). Then I'm going to take the second column of B, which is another n\times 1 vector, and multiply A with it. The result will be an m-dimensional vector, which we'll put there (magenta box in C). Then I'm going to take the third column, multiply A by it, and get another m-dimensional vector, and so on, until you get to the last column of B, which gives the last column of C.
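As a rough sketch of the general rule, the function below checks that the inner dimensions match and then builds C one column at a time. The name `matmul_by_columns` and the sample values are assumptions made for illustration; they do not come from the lecture.

```python
def matmul_by_columns(A, B):
    """Multiply an m x n matrix A by an n x o matrix B, one column of B at a time."""
    m, n = len(A), len(A[0])
    if len(B) != n:
        raise ValueError(
            f"columns of A ({n}) must equal rows of B ({len(B)})"
        )
    o = len(B[0])

    columns = []
    for j in range(o):
        # Treat the j-th column of B as an n-dimensional vector.
        b_j = [B[k][j] for k in range(n)]
        # A times that vector is an m-dimensional vector: the j-th column of C.
        columns.append([sum(A[i][k] * b_j[k] for k in range(n)) for i in range(m)])

    # Reassemble the o column vectors into an m x o matrix stored as rows.
    return [[columns[j][i] for j in range(o)] for i in range(m)]

# Special case o = 1: B is a single column, so this is matrix-vector multiplication.
A = [[1, 3, 2],
     [4, 0, 1]]
print(matmul_by_columns(A, [[1], [0], [5]]))  # [[11], [9]]
```

Passing a B whose row count differs from the column count of A raises `ValueError`, which mirrors the requirement that the dimensions match.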

Just to say that again: the i^{th} column of the matrix C is obtained by multiplying the matrix A with the i^{th} column of the matrix B, for the values of i equal to 1, 2, up to o.
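The same statement can be written compactly. The symbols b_{i} for the i^{th} column of B and c_{i} for the i^{th} column of C are notation introduced here for illustration; they are not used in the lecture.

B = \begin{bmatrix} b_{1} & b_{2} & \cdots & b_{o} \end{bmatrix} \quad\Rightarrow\quad C = AB = \begin{bmatrix} Ab_{1} & Ab_{2} & \cdots & Ab_{o} \end{bmatrix}, \qquad c_{i} = Ab_{i} \ \text{for}\ i = 1, 2, \ldots, o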

Let's look at just one more example. Let's say I want to multiply together these two matrices. What I'm going to do is first pull out the first column of my second matrix and multiply the first matrix by this vector; the result is \begin{bmatrix} 9\\ 15 \end{bmatrix}. Next, I'm going to pull out the second column of the second matrix and do the corresponding calculation, with the result \begin{bmatrix} 7\\ 12 \end{bmatrix}. So, the product of these two matrices is going to be \begin{bmatrix} 9 & 7\\ 15 & 12 \end{bmatrix}.

Finally, let me show you one more neat trick you can do with matrix-matrix multiplication. Let's say, as before, that we have four houses whose prices we want to predict, only now we have three competing hypotheses, shown here on the right. If you want to apply all three competing hypotheses to all four of the houses, it turns out that you can do that very efficiently using a matrix-matrix multiplication. Here on the left is my usual matrix. What I'm going to do is construct another matrix, where the first column is this \begin{bmatrix} -40\\ 0.25 \end{bmatrix}, the second column is this \begin{bmatrix} 200\\ 0.1 \end{bmatrix}, and so on. It turns out that if you multiply these two matrices, the first column of the result (in blue) is the result of multiplying the first matrix by the first column of the second matrix. This is exactly the set of housing prices predicted by the first hypothesis. The same holds for the second and third columns. So by constructing these two matrices, you can very quickly apply all three hypotheses to all four house sizes and get all twelve predicted prices output by your three hypotheses on your four houses. Even better, there are lots of good linear algebra libraries that will do this multiplication step for you.
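Here is a minimal sketch of that trick in plain Python. Only the first two hypothesis columns, (-40, 0.25) and (200, 0.1), are given in the text. The house sizes and the third hypothesis column below are assumptions filled in for illustration, because the original figure is not reproduced here.

```python
# Assumed values: house sizes and the third hypothesis are illustrative only.
house_sizes = [2104, 1416, 1534, 852]

# The "usual matrix": a column of ones next to the house sizes (4x2).
X = [[1, size] for size in house_sizes]

# One hypothesis per column: the intercept on top, the slope below (2x3).
H = [[-40,  200, -150],    # third entry is an assumed value
     [0.25, 0.1,  0.4]]    # third entry is an assumed value

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# Multiply X by each column of H; each result is one hypothesis's predictions.
columns = [mat_vec(X, [H[0][j], H[1][j]]) for j in range(len(H[0]))]

# Assemble the columns into a 4x3 matrix: rows are houses, columns are hypotheses.
predictions = [[col[i] for col in columns] for i in range(len(X))]

first_hypothesis = [row[0] for row in predictions]
print(first_hypothesis)  # [486.0, 314.0, 343.5, 173.0]
```

The result has four rows and three columns, so it holds all twelve predictions at once. In practice a linear algebra library would carry out the multiplication itself.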

<end>


Reposted from blog.csdn.net/edward_wang1/article/details/103172166