This is my code, which uses a text file to do some calculations. I want to make it more dynamic rather than static, the way it is right now. I need some help shortening the code, specifically the part where I calculate the sum. I am a bit new to Python, so please bear with me.
```python
import numpy as np
import math

# Getting all the values in the file except the first line
file = open("history.txt", "r")
next(file)
lines = np.array([list(map(int, line.split())) for line in file])
mylist = np.unique(lines, axis=0)
print(f"Positive entries: {len(mylist)}")
file.close()

# Getting the first line of the file
file = open("history.txt", "r")
first_line = [list(map(int, line.split())) for line in file]
No_items = first_line[0][1]
No_Customers = first_line[0][0]
file.close()

# Creating a zeros array
vectors = np.zeros((No_items, No_Customers))
rows = [i[1] - 1 for i in lines]
# print(rows)
columns = [x[0] - 1 for x in lines]
# print(columns)
vectors[rows, columns] = 1
print(vectors)

number_of_vectors = len(vectors) * (len(vectors) - 1)

def calc_angle(x, y):
    norm_x = np.linalg.norm(x)
    norm_y = np.linalg.norm(y)
    cos_theta = np.dot(x, y) / (norm_x * norm_y)
    theta = math.degrees(math.acos(cos_theta))
    return theta

sum = (calc_angle(vectors[0],vectors[1]) + calc_angle(vectors[0],vectors[2]) + calc_angle(vectors[0],vectors[3]) + calc_angle(vectors[0],vectors[4])
       + calc_angle(vectors[1],vectors[0]) + calc_angle(vectors[1],vectors[2]) + calc_angle(vectors[1],vectors[3]) + calc_angle(vectors[1],vectors[4])
       + calc_angle(vectors[2],vectors[0]) + calc_angle(vectors[2],vectors[1]) + calc_angle(vectors[2],vectors[3]) + calc_angle(vectors[2],vectors[4])
       + calc_angle(vectors[3],vectors[0]) + calc_angle(vectors[3],vectors[1]) + calc_angle(vectors[3],vectors[2]) + calc_angle(vectors[3],vectors[4])
       + calc_angle(vectors[4],vectors[0]) + calc_angle(vectors[4],vectors[1]) + calc_angle(vectors[4],vectors[2]) + calc_angle(vectors[4],vectors[3]))

print(sum/number_of_vectors)
```
Answer
Why use for loops at all? You can do this in a vectorized way.
Vectorized to work as pairwise
Here is a completely vectorized approach without using np.vectorize signatures. This takes in a matrix containing row vectors and runs a pairwise calc_angle on it before taking a sum.
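```python
def calc_angle(vectors):
    norm = np.linalg.norm(vectors, axis=-1)  # norm of each row vector
    cos_theta = np.dot(vectors, vectors.T) / np.outer(norm, norm)  # pairwise cosines
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against rounding error before arccos
    theta = np.degrees(np.arccos(cos_theta))
    np.fill_diagonal(theta, 0)  # angle of a vector with itself
    return theta

np.sum(calc_angle(vectors))  # takes in one set of vectors
```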
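As a sanity check, the pairwise version can be compared against an explicit loop over all ordered pairs, which is what the question's long sum spells out by hand. The sketch below is only an illustration: the `vectors` array is made-up stand-in data (not the contents of `history.txt`), the names `calc_angle_pair` and `calc_angle_matrix` are hypothetical, and the clipping of the cosine is an added safeguard against rounding error.

```python
import itertools
import math

import numpy as np


def calc_angle_pair(x, y):
    """Angle in degrees between two 1-D vectors (scalar version)."""
    cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Added guard: keep the cosine inside [-1, 1] despite rounding error.
    return math.degrees(math.acos(min(1.0, max(-1.0, cos_theta))))


def calc_angle_matrix(vectors):
    """Pairwise angles in degrees between the rows of a 2-D array."""
    norm = np.linalg.norm(vectors, axis=-1)
    cos_theta = np.dot(vectors, vectors.T) / np.outer(norm, norm)
    theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    np.fill_diagonal(theta, 0)
    return theta


# Made-up stand-in for the item/customer matrix built from the file.
vectors = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
], dtype=float)

n = len(vectors)

# Explicit loop over every ordered pair (i, j) with i != j.
loop_sum = sum(
    calc_angle_pair(vectors[i], vectors[j])
    for i, j in itertools.permutations(range(n), 2)
)

# Same quantity from the vectorized pairwise matrix.
matrix_sum = np.sum(calc_angle_matrix(vectors))

# Average over the n * (n - 1) ordered pairs, as in the question.
average_angle = matrix_sum / (n * (n - 1))
print(loop_sum, matrix_sum, average_angle)
```

Both sums should agree up to floating-point error, and neither depends on the number of vectors being hard-coded to five.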
Vectorized to work for 2 independent vectors
This takes two independent vectors (or sets of vectors) as inputs instead of one matrix, so you can also compare two different sets of vectors if you don’t want pairwise angles within a single set. Note that the function zeroes the diagonal, which is only meaningful when both inputs are the same set.
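```python
def calc_angle(x, y):
    norm_x = np.linalg.norm(x, axis=-1)
    norm_y = np.linalg.norm(y, axis=-1)
    cos_theta = np.dot(x, y.T) / np.outer(norm_x, norm_y)  # cosines between rows of x and rows of y
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against rounding error before arccos
    theta = np.degrees(np.arccos(cos_theta))
    np.fill_diagonal(theta, 0)  # only meaningful when x and y are the same set
    return theta

np.sum(calc_angle(vectors, vectors))  # takes 2 sets of vectors (could be the same)
```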
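When the two inputs really are different sets, entry `(i, j)` of the result pairs row `i` of the first set with row `j` of the second, so there is no "same vector" diagonal to zero out. A minimal sketch of that case, assuming a hypothetical helper name `cross_angles` and made-up data, with an added clip on the cosine as a safeguard:

```python
import numpy as np


def cross_angles(x, y):
    """Angles in degrees between every row of x and every row of y."""
    norm_x = np.linalg.norm(x, axis=-1)
    norm_y = np.linalg.norm(y, axis=-1)
    cos_theta = np.dot(x, y.T) / np.outer(norm_x, norm_y)
    # Added guard: clip to [-1, 1] to absorb rounding error before arccos.
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


# Made-up data: 2 query vectors compared against 3 reference vectors.
queries = np.array([[1.0, 0.0], [1.0, 1.0]])
references = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

angles = cross_angles(queries, references)  # shape (2, 3)
print(angles)
```

The result has one row per query and one column per reference; here the angles are approximately 0, 90 and 45 degrees for the first query and 45, 45 and 0 degrees for the second.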
Using np.vectorize
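It’s not very efficient, since np.vectorize is essentially a Python-level loop provided for convenience rather than performance, but it helps when you are stuck with non-vectorizable code.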
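```python
def calc_angle(x, y):
    norm_x = np.linalg.norm(x)
    norm_y = np.linalg.norm(y)
    cos_theta = np.dot(x, y) / (norm_x * norm_y)
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against rounding error before arccos
    theta = np.degrees(np.arccos(cos_theta))
    return theta

# The signature tells NumPy that each call consumes two length-k vectors and returns a scalar
calc_angle_vec = np.vectorize(calc_angle, signature='(k),(k)->()')
# Broadcasting (1, n, k) against (n, 1, k) yields an (n, n) matrix of angles
sums = calc_angle_vec(vectors[None,:,:], vectors[:,None,:])
np.fill_diagonal(sums, 0)
output = np.sum(sums)
```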
Broad idea:
- These approaches compute calc_angle for every pair of vectors. First, take the norm of each vector over the last axis.
- Then the dot product of the matrix with its transpose, divided by the outer product of the norms, gives the cosine of the angle between each pair.
- The cosines are then passed to np.arccos and np.degrees, which are vectorized and convert them to the corresponding angles in degrees.
- Next, fill the diagonal with 0, because each diagonal entry compares a vector with itself.
- Finally, reduce the matrix with np.sum.
- Note: I have replaced the math functions with their equivalent NumPy functions because those support vectorization.
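To make the steps concrete, here is a small walk-through on three made-up vectors whose angles are easy to check by hand (90 degrees between the two axes, 45 degrees between each axis and the diagonal). The data is purely illustrative, and the clip step is an added safeguard rather than one of the steps listed above.

```python
import numpy as np

# Made-up row vectors: x axis, y axis, and the diagonal between them.
vectors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Step 1: norm of each row vector over the last axis.
norm = np.linalg.norm(vectors, axis=-1)

# Step 2: pairwise dot products, normalised by the outer product of the norms.
cos_theta = np.dot(vectors, vectors.T) / np.outer(norm, norm)

# Added guard (assumption): keep cosines inside the domain of arccos.
cos_theta = np.clip(cos_theta, -1.0, 1.0)

# Step 3: convert cosines to angles in degrees.
theta = np.degrees(np.arccos(cos_theta))

# Step 4: a vector's angle with itself is 0.
np.fill_diagonal(theta, 0)

# Step 5: reduce to a single number.
total = np.sum(theta)

# Average over the n * (n - 1) ordered pairs, as the question computes it.
n = len(vectors)
average = total / (n * (n - 1))
print(theta)
print(total, average)
```

Each unordered pair appears twice in the matrix, so the total is roughly 2 * (90 + 45 + 45) = 360 degrees and the average over the six ordered pairs is roughly 60 degrees.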