I have a 2D numpy array. I'm trying to compute the similarities between its rows and put them into a `similarities` array. Is this possible without a loop? Thanks for your time!

    # ratings.shape = (943, 1682)
    arri = np.zeros(943)
    arri = np.where(arri == 0)[0]
    arrj = np.zeros(943)
    arrj = np.where(arrj == 0)[0]
    similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
    similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])

I want to build a 2D array `similarities` in which `similarities[i, j]` is the difference between row `i` and row `j` of `ratings`.

[ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)][1]

[1]: https://i.stack.imgur.com/gtst9.png

## Answer

The problem is how numpy iterates through the array when you index a two-dimensional array with two index arrays.

First some setup:

    import numpy

    ratings = numpy.arange(1, 6)
    indicesX = numpy.indices((ratings.shape[0],1))[0]
    indicesY = numpy.indices((ratings.shape[0],1))[0]

`ratings`: `[1 2 3 4 5]`

`indicesX`: `[[0][1][2][3][4]]`

`indicesY`: `[[0][1][2][3][4]]`

Now let's see what your program produces:

    similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
    similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])

`similarities`:

    [[0. 0. 0. 0. 0.]
     [0. 1. 0. 0. 0.]
     [0. 0. 2. 0. 0.]
     [0. 0. 0. 3. 0.]
     [0. 0. 0. 0. 4.]]

As you can see, numpy iterates over `similarities` basically like the following:

    for i in range(5):
        similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])

`similarities`:

    [[0. 0. 0. 0. 0.]
     [0. 1. 0. 0. 0.]
     [0. 0. 2. 0. 0.]
     [0. 0. 0. 3. 0.]
     [0. 0. 0. 0. 4.]]
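To see this pairing in isolation, here is a small sketch on a made-up 3x3 array (the names `grid`, `rows` and `cols` are only for illustration and do not come from the question). Two index arrays of the same shape are matched up element by element, so they select one entry per pair rather than every combination:

```python
import numpy

# Hypothetical 3x3 array, used only to illustrate the indexing rule.
grid = numpy.arange(9).reshape(3, 3)

rows = numpy.array([0, 1, 2])
cols = numpy.array([0, 1, 2])

# Same-shape index arrays are paired element-wise: (0,0), (1,1), (2,2).
paired = grid[rows, cols]

# Shapes (3, 1) and (1, 3) broadcast against each other, so every
# (row, col) combination is selected instead.
all_pairs = grid[rows[:, None], cols[None, :]]

print(paired)           # the three diagonal entries
print(all_pairs.shape)  # one entry per (row, col) combination
```

That is why only the diagonal of `similarities` gets filled in when both index arrays simply count from 0 to 4.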

Now instead we need indices like the following to iterate through the entire array:

    indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
    indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]

We can build them as follows:

    # Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
    indicesX = indicesX.reshape(indicesX.shape[0])
    indicesX = numpy.tile(indicesX, ratings.shape[0])
    indicesY = numpy.repeat(indicesY, ratings.shape[0])

`indicesX`: `[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]`

`indicesY`: `[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]`

Perfect! Now just call `similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])` again and we see:

`similarities`:

    [[0. 1. 2. 3. 4.]
     [1. 0. 1. 2. 3.]
     [2. 1. 0. 1. 2.]
     [3. 2. 1. 0. 1.]
     [4. 3. 2. 1. 0.]]
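For the one-dimensional `ratings` used here, the same matrix can probably be obtained without building index arrays at all, by letting numpy broadcast a column against a row. This is an alternative sketch, not the approach described above, and it yields integers rather than floats because nothing is written into a pre-allocated float array:

```python
import numpy

ratings = numpy.arange(1, 6)

# ratings[:, None] has shape (5, 1) and ratings[None, :] has shape (1, 5);
# subtracting them broadcasts to a (5, 5) matrix of all pairwise differences.
similarities = numpy.abs(ratings[:, None] - ratings[None, :])

print(similarities)
```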

Here is the whole code again:

    import numpy

    ratings = numpy.arange(1, 6)
    indicesX = numpy.indices((ratings.shape[0],1))[0]
    indicesY = numpy.indices((ratings.shape[0],1))[0]
    similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
    indicesX = indicesX.reshape(indicesX.shape[0])
    indicesX = numpy.tile(indicesX, ratings.shape[0])
    indicesY = numpy.repeat(indicesY, ratings.shape[0])
    similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
    print(similarities)
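The `ratings` array in the question is two-dimensional, so the difference between two rows is itself a vector and has to be reduced to a single number before it fits into `similarities[i, j]`. The sketch below assumes, purely for illustration, that the sum of absolute differences is the measure you want; the question does not say which one is intended. The small random `ratings` array is a stand-in for the real data as well:

```python
import numpy

# Stand-in data: 6 rows x 4 columns instead of the real (943, 1682) array.
rng = numpy.random.default_rng(0)
ratings = rng.integers(0, 6, size=(6, 4)).astype(float)

# Shape (6, 1, 4) minus shape (1, 6, 4) broadcasts to (6, 6, 4):
# entry [i, j] holds the element-wise difference of row i and row j.
diffs = ratings[:, None, :] - ratings[None, :, :]

# ASSUMPTION: reduce each difference vector with a sum of absolute values.
similarities = numpy.abs(diffs).sum(axis=2)

print(similarities.shape)
```

Be aware that the intermediate `diffs` array holds n * n * m values. For the real shape (943, 1682) that is roughly 1.5 billion elements, so in practice you may need to process the rows in chunks.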

## PS

You commented on your own post to improve it. When you want to improve your question, you should edit it instead of commenting on it.