dec = 0.1 data = np.array([100,200,300,400,500])
I have a for loop like this
y = np.zeros(len(data)) for i in range(len(data)): if i == 0: y[i] = (1.0 - dec) * data[i] else: y[i] = (1.0 - dec) * data[i] + (dec * y[i - 1])
Output y is:
array([ 90. , 189. , 288.9 , 388.89 , 488.889])
And now I want to do the above calculation without a loop, so if I break the code and do
data = (1.0 - dec) * data data[1:] = (1.0 - dec) * data[1:] + (dec * data)
Output data is:
array([ 90, 189, 279, 369, 459])
When you compare y and data output first two values are correct because it is getting multiplied with data which makes sense but later on it should continue as the loop does in loop code, so how can we achieve that? Is there a function that can handle this? This is mainly to optimize my code so that it runs faster for thousands of data.
The expected output is the same as the y output.
We can do this with
scipy.linalg.toeplitz to make a matrix of shifts of the data and then multiplying that by powers of
dec and summing columns:
import numpy as np from scipy.linalg import toeplitz dec = 0.1 data = np.array([100,200,300,400,500]) decs = np.power(dec, np.arange(len(data))) r = np.zeros_like(data) r = data toep = toeplitz(r, data) output = (1 - dec) * np.sum(toep * decs.reshape(-1, 1), axis=0)
decs is a vector of powers of
print(decs) #[1.e+00 1.e-01 1.e-02 1.e-03 1.e-04]
Next, we use
toeplitz to make a matrix of shifts of
print(toep) #[[100 200 300 400 500] # [ 0 100 200 300 400] # [ 0 0 100 200 300] # [ 0 0 0 100 200] # [ 0 0 0 0 100]])
Finally we reshape
decs into a column, multiply it by
toep and sum along columns. This result needs to be scaled by
1 - dec.
This all works because we can rewrite our definition of
data[i] as a sum instead of recursively:
y[i] = (1.0 - dec) * data[i] + (dec * y[i - 1]) y[i] = (1.0 - dec) * data[i] + (dec * ((1.0 - dec) * data[i - 1] + (dec * y[i - 2]))) ... y[i] = (1.0 - dec) * (data[i] + dec * data[i - 1] + dec ** 2 * data[i - 2] + ... dec ** i * data) y[i] = (1.0 - dec) * sum(dec ** j * data[i - j] for j in range(i + 1))
This can be proven by induction.
From there it follows from rewriting those sums as columns of a matrix and translating that matrix to a calculation in numpy/scipy.