Calculation on my for loop and want to do it without for loop using some function

dec = 0.1
data = np.array([100,200,300,400,500])

I have a for loop like this

y = np.zeros(len(data))
for i in range(len(data)):
    if i == 0:
        y[i] = (1.0 - dec) * data[i]
    else:
        y[i] = (1.0 - dec) * data[i] + (dec * y[i - 1])

Output y is:

array([ 90.   , 189.   , 288.9  , 388.89 , 488.889])

And now I want to do the above calculation without a loop, so if I break the code and do

data[0] = (1.0 - dec) * data[0]
data[1:] = (1.0 - dec) * data[1:] + (dec * data[0])

Output data is:

array([ 90, 189, 279, 369, 459])

When you compare y and data output first two values are correct because it is getting multiplied with data[0] which makes sense but later on it should continue as the loop does in loop code, so how can we achieve that? Is there a function that can handle this? This is mainly to optimize my code so that it runs faster for thousands of data.

The expected output is the same as the y output.

Answer

We can do this with scipy.linalg.toeplitz to make a matrix of shifts of the data and then multiplying that by powers of dec and summing columns:

import numpy as np
from scipy.linalg import toeplitz

dec = 0.1
data = np.array([100,200,300,400,500])

decs = np.power(dec, np.arange(len(data)))

r = np.zeros_like(data)
r[0] = data[0]
toep = toeplitz(r, data)

output = (1 - dec) * np.sum(toep * decs.reshape(-1, 1), axis=0)

First decs is a vector of powers of dec:

print(decs) 
#[1.e+00 1.e-01 1.e-02 1.e-03 1.e-04]

Next, we use toeplitz to make a matrix of shifts of data:

print(toep)
#[[100 200 300 400 500]
# [  0 100 200 300 400]
# [  0   0 100 200 300]
# [  0   0   0 100 200]
# [  0   0   0   0 100]])

Finally we reshape decs into a column, multiply it by toep and sum along columns. This result needs to be scaled by 1 - dec.

This all works because we can rewrite our definition of data[i] as a sum instead of recursively:

y[i] = (1.0 - dec) * data[i] + (dec * y[i - 1])
y[i] = (1.0 - dec) * data[i] + (dec * ((1.0 - dec) * data[i - 1] + (dec * y[i - 2])))
...
y[i] = (1.0 - dec) * (data[i] + dec * data[i - 1] + dec ** 2 * data[i - 2] + ... dec ** i * data[0])
y[i] = (1.0 - dec) * sum(dec ** j * data[i - j] for j in range(i + 1))

This can be proven by induction.

From there it follows from rewriting those sums as columns of a matrix and translating that matrix to a calculation in numpy/scipy.