Assume I have an array `a` of matrices, e.g. of shape `(N, 3, 3)` = N 3x3 matrices, and an array `b` of shape `(i, 3, k)`.

I want these behaviors for the dot product `a * b`:

- If `i = N`, the result should be an `(N, 3, k)` array where the first element is `a[0].dot(b[0])`, the second `a[1].dot(b[1])`, and so on.
- If `i = 1`, then each element of `a` must be dot-multiplied by `b[0]` and the resulting shape should again be `(N, 3, k)`.
If I just use `numpy.dot()` the result is almost right, but the resulting shapes are not what I expect. Is there a way to do this easily and efficiently, and make it general for any dimension?
You are right that in the `(N, 3, 3) * (N, 3, k)` case you cannot use `np.dot` directly, because the result will have shape `(N, 3, N, k)`. You would effectively have to extract the `N` diagonal elements from axes 0 and 2, which is a whole lot of unnecessary computation. The `(N, 3, 3) * (1, 3, k)` case can be solved using `np.dot` if you post-apply a `squeeze` to remove the unnecessary third axis: `result = a.dot(b).squeeze()`.
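A minimal sketch of that `squeeze` approach, with arbitrarily chosen sizes `N = 4`, `k = 2`:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((4, 3, 3))   # N = 4 matrices
b = rng.random((1, 3, 2))   # i = 1, k = 2

# np.dot pairs the last axis of a with the second-to-last axis of b,
# so the raw result has shape (4, 3, 1, 2).
raw = a.dot(b)
print(raw.shape)        # (4, 3, 1, 2)

# Squeezing drops the length-1 third axis, giving the desired (N, 3, k).
result = raw.squeeze()
print(result.shape)     # (4, 3, 2)
```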
The good news is that you don't need `np.dot` to get a dot product. Here are three alternatives:
Most simply, use the `@` operator, equivalent to `np.matmul`, which requires the leading dimensions to broadcast together:

```python
a @ b
np.matmul(a, b)
```
If your matrices are not in the last two dimensions, you can transpose them to be. This may be inefficient because it will likely copy the data. In those cases, read on.
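For concreteness, a sketch of both cases with `@` (arbitrary sizes `N = 4`, `k = 2`):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((4, 3, 3))        # N = 4 matrices

# Case i = N: matrices are paired element-wise along the first axis.
b_n = rng.random((4, 3, 2))
out_n = a @ b_n
print(out_n.shape)               # (4, 3, 2)
# out_n[0] == a[0].dot(b_n[0]), out_n[1] == a[1].dot(b_n[1]), ...

# Case i = 1: the single matrix in b broadcasts against all N matrices of a.
b_1 = rng.random((1, 3, 2))
out_1 = a @ b_1
print(out_1.shape)               # (4, 3, 2)
# out_1[n] == a[n].dot(b_1[0]) for every n
```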
Another popular solution is to use `np.einsum` to explicitly specify the matching axes and the axes to sum over:

```python
np.einsum('ijk,ikl->ijl', a, b)
```

Broadcasting will take care of both the `i = N` and `i = 1` cases.
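As a sketch, the batched `i = N` case with `np.einsum` (arbitrary `N = 4`, `k = 2`); per the broadcasting note above, the `i = 1` case works the same way:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((4, 3, 3))   # N = 4
b = rng.random((4, 3, 2))   # i = N, k = 2

# 'ijk,ikl->ijl': sum over k (the contracted inner axis), keep batch axis i.
out = np.einsum('ijk,ikl->ijl', a, b)
print(out.shape)            # (4, 3, 2)

# Matches the batched matrix product.
print(np.allclose(out, a @ b))   # True
```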
And of course you can always compute the sum-product by hand using `*` / `np.multiply` and `np.sum`:

```python
(a[..., None] * b[:, None, ...]).sum(axis=2)
```
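A sketch checking the hand-rolled version against `np.matmul` (arbitrary `N = 4`, `k = 2`):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((4, 3, 3))
b = rng.random((4, 3, 2))

# a[..., None]     has shape (4, 3, 3, 1)
# b[:, None, ...]  has shape (4, 1, 3, 2)
# Their product broadcasts to (4, 3, 3, 2); summing over axis 2
# contracts the inner dimension, leaving (4, 3, 2).
out = (a[..., None] * b[:, None, ...]).sum(axis=2)
print(out.shape)                 # (4, 3, 2)
print(np.allclose(out, a @ b))   # True
```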