Create Population Data

Create a population of 5000 individuals and their number of phone pick-ups per day to be used for later sampling. Here’s the code I came up with:

def get_population(pickups, pop_size, std):
    pop = np.random.randint(0, pickups, 5000)
    mean = np.mean(pop)
    std = np.std(pop)
    return pop, mean, std

I used the given assertions to come up with the function, so I’m not even sure what I should be returning:

pop_pickups, pop_mean, pop_std = get_population(45, 5000, 42)
assert np.abs(pop_mean - 45) < 0.5, "Get population problem, testing pop_mean, population mean returned does not match expected population mean"
assert np.abs(pop_std - np.sqrt(45)) < 0.5, "Get population problem, testing pop_std, population standard deviation returned does not match expected standard deviation"

I assumed I needed to generate the population, then take the mean and std of said pop. But my code triggered the assertion error for incorrect mean. The ultimate goal is to visualize the bootstrap of a single mean.

Answer

Check the documentation of numpy.random.randint: https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html

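You’re generating 5000 integers drawn uniformly from 0 to 44 (the upper bound of randint is exclusive). The mean will be around (0 + 44)/2 = 22, not around 45, and the standard deviation will be around 13, not sqrt(45) ≈ 6.7. Your code runs fine, but the assert tests expect a different mean and STD than a uniform draw produces: a mean of 45 with a standard deviation of sqrt(45) is what a Poisson distribution with parameter 45 would give.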
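To see the mismatch concretely, here is a small check of what randint produces. It is only a sketch: the seed and the variable names (uniform_mean, expected_std, and so on) are my own choices, not part of the question.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only to make the run reproducible

pickups = 45
pop = np.random.randint(0, pickups, 5000)  # integers 0..44, upper bound excluded

uniform_mean = np.mean(pop)
uniform_std = np.std(pop)

# Theoretical values for a discrete uniform distribution on 0..44
expected_mean = (pickups - 1) / 2              # 22.0
expected_std = np.sqrt((pickups**2 - 1) / 12)  # roughly 13

print("mean:", uniform_mean, "expected:", expected_mean)
print("std: ", uniform_std, "expected:", expected_std)
```

Both values land far from 45 and sqrt(45), so neither assert from the question can pass with a uniform population.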


Edit: If you want to generate a sample following a Poisson distribution, you can use this:

import numpy as np

param = 45  # Poisson parameter
nb_samples = 5000  # population size from the question
sample = np.random.poisson(param, nb_samples)

assert np.abs(param - np.mean(sample)) < 0.5, "Get population problem, testing pop_mean, population mean returned does not match expected population mean"

Note that this code prints nothing when it runs: an assert that passes is silent, and only a failing one raises an AssertionError.
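Folded back into the shape of the function from the question, a Poisson version might look like the sketch below. One loud assumption: I am treating the third argument (42 in the test call) as a random seed, since the original code overwrote it anyway and the assignment text is not shown. If it means something else in your assignment, adjust accordingly. The parameter name seed and the use of np.random.default_rng are my choices.

```python
import numpy as np

def get_population(pickups, pop_size, seed=None):
    """Draw pop_size Poisson-distributed daily pick-up counts with mean pickups."""
    # Assumption: the third argument is a seed, not a standard deviation
    rng = np.random.default_rng(seed)
    pop = rng.poisson(pickups, pop_size)  # uses pop_size instead of a hardcoded 5000
    return pop, np.mean(pop), np.std(pop)

pop_pickups, pop_mean, pop_std = get_population(45, 5000, 42)

# Same checks as in the question: Poisson(45) has mean 45 and std sqrt(45)
assert np.abs(pop_mean - 45) < 0.5
assert np.abs(pop_std - np.sqrt(45)) < 0.5
```

For a Poisson distribution the variance equals the parameter, which is why the test compares the standard deviation against np.sqrt(45).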