I have the data set below (*Data*). I build a histogram with the code below to get `n`, the number of points in each bin (the frequency). I then divide each frequency by the total number of points to get the probability of each bin (*bin_probability*).

Now I want the probability for each point, as a list. For example, if point 1 falls in bin 1, its probability is the first value in the array, 0.65; if point 2 falls in bin 5, its probability is 0.05, and so on. **How do I map each point to its respective bin_probability so that I have a list of probabilities for each point (in this case 20 probabilities)?**

    import numpy as np

    Data = [4.33, 4.11, 6.33, 5.67, 3.24, 6.74, 24.6, 6.43, 4.122, 9.67,
            9.99, 3.44, 5.66, 3.54, 5.34, 6.55, 5.78, 3.56, 1.55, 5.45]

    # 5 bins, matching the 5 probabilities printed below
    n, bin_edges = np.histogram(Data, bins=5)
    totalcount = np.sum(n)
    bin_probability = n / totalcount
    print(bin_probability)
    # >> array([0.65, 0.3, 0., 0., 0.05])

Many thanks for your help!

## Answer

Based on @kcsquared's link above, `np.digitize` gives the bin number of each point. The variable *bins_per_point* is an array of 20 elements, one per data point, each holding the (1-based) number of the bin that point falls in. The `np.fmin` call caps that number at the number of bins, so the maximum value, which sits exactly on the last edge, is assigned to the last bin. The *probability_perpoint* variable then looks up the bin probability for each point.

    bins_per_point = np.fmin(np.digitize(Data, bin_edges), len(bin_edges)-1)
    probability_perpoint = [bin_probability[bins_per_point[i]-1] for i in range(len(Data))]
    # >> array([0.1 , 0.1 , 0.15, 0.1 , 0.05, 0.15, 0.55, 0.15, 0.1 , 0.2 ,
    #           0.2 , 0.05, 0.1 , 0.05, 0.1 , 0.15, 0.1 , 0.05, 0.05, 0.1 ])
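The same lookup can also be written as one indexing step instead of a list comprehension. This is only a sketch: it assumes `bins=5` (which matches the five probabilities printed in the question), and `bin_index` and `probability_per_point` are illustrative names that are not in the original post.

```python
import numpy as np

Data = [4.33, 4.11, 6.33, 5.67, 3.24, 6.74, 24.6, 6.43, 4.122, 9.67,
        9.99, 3.44, 5.66, 3.54, 5.34, 6.55, 5.78, 3.56, 1.55, 5.45]

# Assumption: 5 bins, as implied by the 5 values shown in the question.
n, bin_edges = np.histogram(Data, bins=5)
bin_probability = n / n.sum()

# np.digitize returns 1-based bin numbers. The maximum value lands one past
# the last bin, because np.histogram closes its final bin on the right,
# so clip it back and shift to 0-based indices.
bin_index = np.clip(np.digitize(Data, bin_edges), 1, len(n)) - 1

# Index the per-bin probabilities with the per-point bin indices.
probability_per_point = bin_probability[bin_index]
print(probability_per_point)
```

Indexing with an integer array returns a NumPy array with one entry per data point, so no explicit loop is needed.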

To verify, the bin probabilities sum to 1.

    np.sum(bin_probability)
    # >> 1
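Another way to check the mapping itself, rather than only the bin probabilities, is to recount the points per bin from the per-point indices and compare with `n`. Again a sketch under the same `bins=5` assumption; `bin_index`, `recount`, `mapping_ok`, and `total` are hypothetical names used for illustration.

```python
import numpy as np

Data = [4.33, 4.11, 6.33, 5.67, 3.24, 6.74, 24.6, 6.43, 4.122, 9.67,
        9.99, 3.44, 5.66, 3.54, 5.34, 6.55, 5.78, 3.56, 1.55, 5.45]

# Assumption: 5 bins, as implied by the 5 values shown in the question.
n, bin_edges = np.histogram(Data, bins=5)
bin_probability = n / n.sum()

# 0-based bin index of every point (maximum value clipped into the last bin).
bin_index = np.clip(np.digitize(Data, bin_edges), 1, len(n)) - 1

# Count how many points were mapped to each bin; this should reproduce n.
recount = np.bincount(bin_index, minlength=len(n))
mapping_ok = np.array_equal(recount, n)

# The per-bin probabilities should sum to 1 (up to floating-point error).
total = bin_probability.sum()
print(mapping_ok, np.isclose(total, 1.0))
```

Note that it is the per-bin probabilities that sum to 1; the per-point list repeats a bin's probability for every point in that bin, so its sum is generally not 1.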