Given two matrices, I want to create a new array holding the sum of squared differences between every row of the first and every row of the second, but I cannot find a way to do it without loops.
To make this concrete, here is an example. I would like to replace the following for-loop with vectorized numpy operations:
import numpy as np

a = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
b = np.array([[1.2, 2.3, 3.4], [4.5, 5.6, 7.8], [9.10, 10.11, 11.12]])

# summed[i, j] is the squared distance between row i of a and row j of b
summed = np.ones((2, 3))
for i, aSample in enumerate(a):
    for j, bSample in enumerate(b):
        summed[i, j] = np.sum(np.power(aSample - bSample, 2))

>>> summed
array([[ 18.29  , 112.45  , 308.6765],
       [  7.49  ,  79.65  , 251.0165]])
These are just example matrices. In my use case both matrices have tens of thousands of rows, so their shapes are more like (20000, 1000). Is there a way to do this efficiently with numpy?
EDIT: @Blorgon provided correct results, but in my case I could not allocate the larger intermediate array that broadcasting with np.newaxis creates. The solution by @MadPhysicist successfully calculated the distances between the vectors within memory limits.
>>> from scipy.spatial.distance import cdist
>>> cdist(a, b)**2
array([[ 18.29  , 112.45  , 308.6765],
       [  7.49  ,  79.65  , 251.0165]])
The problem with this approach is that it takes a square root and then undoes it. The advantage is that it does not use a large intermediate array. You can avoid some intermediates in numpy like this:
>>> diff = b - a[:, np.newaxis]
>>> np.power(diff, 2, out=diff).sum(axis=2)
array([[ 18.29  , 112.45  , 308.6765],
       [  7.49  ,  79.65  , 251.0165]])
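Note that `diff` above still has shape (rows of a, rows of b, columns), which for two (20000, 1000) inputs would be on the order of terabytes in float64. As a rough sketch of two ways around both the square root and that intermediate: `cdist` accepts `metric="sqeuclidean"`, which returns squared distances directly, and the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y lets plain numpy do the work with one matrix product. The helper name `sq_dists_expanded` is made up for illustration, and the expansion can lose precision to cancellation when rows are nearly identical, so treat it as an option to test on your data rather than a drop-in replacement.

```python
import numpy as np
from scipy.spatial.distance import cdist

a = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
b = np.array([[1.2, 2.3, 3.4], [4.5, 5.6, 7.8], [9.10, 10.11, 11.12]])

# Option 1: let cdist compute squared Euclidean distances, with no sqrt.
sq_cdist = cdist(a, b, metric="sqeuclidean")


# Option 2 (hypothetical helper): expand the squared difference so the
# only large array is the (rows of x, rows of y) result itself.
def sq_dists_expanded(x, y):
    x_sq = np.einsum("ij,ij->i", x, x)   # squared norm of each row of x
    y_sq = np.einsum("ij,ij->i", y, y)   # squared norm of each row of y
    out = x @ y.T                        # pairwise dot products
    out *= -2.0
    out += x_sq[:, np.newaxis]
    out += y_sq[np.newaxis, :]
    np.maximum(out, 0.0, out=out)        # clip tiny negatives from round-off
    return out


sq_expanded = sq_dists_expanded(a, b)
```

Both results have shape (2, 3) here and should agree with `summed` from the question up to floating-point error. For the full-size problem the (20000, 20000) output alone is about 3.2 GB in float64, so processing `a` in row blocks may still be necessary.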