How can I speed up scipy.integrate.quad?

Consider the following code:

import numpy as np
from scipy import integrate
import time


def disFunc(x, n):
    y = np.power(x, n - 1) * np.exp(-x)
    return y


def disFunc2(t, n, beta):
    y = integrate.quad(disFunc, a=0, b=beta * t, args=(n,))
    return y[0]


def disFunc3(func):
    y = np.vectorize(func)
    return y


maxCounter = 6000
stepSize = 0.001
n = 5
beta = 25
t = np.cumsum(np.ones(maxCounter) * stepSize) - stepSize
x = disFunc3(disFunc2)
start_time = time.time()
y = x(t, n, beta)
time_taken = time.time() - start_time
print(time_taken)

Works like a charm, but it is way too slow (1.85s). How can I speed it up?

Answer

If you just want to integrate the function you gave here:

Note that the integral you are computing is the lower incomplete gamma function: integrating x^(n-1) * exp(-x) from 0 to beta * t gives gamma(n, beta * t). SciPy provides scipy.special.gammainc, which evaluates the lower incomplete gamma function regularised by the (complete) gamma function, i.e. its output is divided by Gamma(n). You can therefore evaluate the integral far more efficiently with these specialised functions:

from scipy import special
import time

# ... n, beta and t defined as in the question ...

start_time = time.time()
y = special.gammainc(n, beta * t)
y *= special.gamma(n)  # Reverse the regularisation
time_taken = (time.time() - start_time)
print(time_taken)
print(y)

The time this takes will vary for different values of n, but for n = 5.5, this ran in ~0.0005 seconds on my machine, compared to 0.3 seconds for the method below.
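
As a quick sanity check (a minimal sketch reusing disFunc, n and beta from above; t_check is just an arbitrary sample point I've introduced for illustration), you can confirm that the gammainc expression agrees with a single quad call:

import numpy as np
from scipy import integrate, special


def disFunc(x, n):
    return np.power(x, n - 1) * np.exp(-x)


n = 5
beta = 25
t_check = 0.2  # arbitrary sample point, purely for illustration

via_quad = integrate.quad(disFunc, 0, beta * t_check, args=(n,))[0]
via_gamma = special.gammainc(n, beta * t_check) * special.gamma(n)
print(via_quad, via_gamma)  # should agree to within quad's default tolerance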

If you want to integrate arbitrary functions for which a closed form doesn’t exist:

Here’s an idea that improves the mathematics of the approach you’re already using, rather than resorting to JIT compilation. It runs ~10 times faster on my machine (your code takes ~2.2 seconds for me; this takes ~0.3 seconds):

import numpy as np
from scipy import integrate
import time


def disFunc(x, n):
    y = np.power(x, n - 1) * np.exp(-x)
    return y


def disFunc2(t_prev, t_cur, n, beta):
    y = integrate.quad(disFunc, a=beta * t_prev, b=beta * t_cur, args=(n,))
    return y[0]


def disFunc3(func):
    y = np.vectorize(func)
    return y


maxCounter = 6000
stepSize = 0.001
n = 5
beta = 25
t = np.cumsum(np.ones(maxCounter) * stepSize) - stepSize
x = disFunc3(disFunc2)
start_time = time.time()
y = x(t[:-1], t[1:], n, beta)
time_taken = time.time() - start_time
print(time_taken)
print(np.cumsum(y))

The idea uses a property of the integral: the integral over a range [a, b] is equal to the integral over [a, c] plus the integral over [c, b], for any c between a and b. So, rather than calculating the integral over (0, beta * t) each time, we calculate the integral over (beta * prev_t, beta * t) (where prev_t is the previous value of t), then take a cumulative sum. This is faster because far fewer iterations of the quadrature are needed for an integral over a much smaller range.
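
To make that property concrete, here is a small standalone check (a sketch: the split point c and the bounds a, b are arbitrary illustrative values I've chosen): one quad call over [a, b] matches the sum of two quad calls over [a, c] and [c, b].

import numpy as np
from scipy import integrate


def disFunc(x, n):
    return np.power(x, n - 1) * np.exp(-x)


n = 5
a, c, b = 0.0, 2.0, 5.0  # any c between a and b works

whole = integrate.quad(disFunc, a, b, args=(n,))[0]
parts = integrate.quad(disFunc, a, c, args=(n,))[0] + integrate.quad(disFunc, c, b, args=(n,))[0]
print(whole, parts)  # identical up to quadrature tolerance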

It should be noted that this does skip the first integral (from 0 to beta * t[0]), so you might want to calculate that separately and add it to the front of the array.
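
For completeness, here is a minimal sketch of that fix, continuing from the variables in the code above (disFunc, n, beta, t and y). With this particular t, t[0] is 0, so the extra integral is simply 0, but the pattern generalises:

# Integral of the skipped first segment, from 0 to beta * t[0].
first = integrate.quad(disFunc, a=0, b=beta * t[0], args=(n,))[0]

# Prepend it so the cumulative sum lines up with the original t grid again.
y_full = np.concatenate(([first], y))
print(np.cumsum(y_full))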