I have a dataframe containing customers.
I want to create a new column named group_user, which would take only 3 values: 0, 1, and 2.
I want these values to be assigned randomly to customers in balanced proportions.
The output would be:

ID   group_user
341  1
127  0
389  2
You could try this:
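>>> import numpy as np
>>> import pandas as pd
>>> lst = [0, 1, 2]
>>> # Tile lst to cover every row, trim, shuffle; to_numpy() drops the index so pandas cannot re-align (and un-shuffle) the values
>>> df['group_user'] = pd.Series(np.tile(lst, len(df) // len(lst) + 1)[:len(df)]).sample(frac=1).to_numpy()
>>> df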
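This works for a dataframe of any length and a list of any length. When the number of rows is not a multiple of the list length, the group sizes differ by at most one.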
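If it helps, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the function name assign_balanced_groups, the seed parameter, and the sample customers dataframe are all made up for illustration, and it uses NumPy's Generator.permutation for the shuffle instead of Series.sample.

```python
import numpy as np
import pandas as pd


def assign_balanced_groups(df, values, seed=None):
    """Return a copy of df with a shuffled, balanced 'group_user' column.

    Hypothetical helper, not part of pandas or NumPy.
    """
    rng = np.random.default_rng(seed)
    # Repeat the values enough times to cover every row, then trim to len(df).
    repeats = -(-len(df) // len(values))  # ceiling division
    labels = np.tile(values, repeats)[:len(df)]
    out = df.copy()
    # rng.permutation returns a plain NumPy array, so the assignment is
    # positional and is not re-aligned on the dataframe index.
    out["group_user"] = rng.permutation(labels)
    return out


# Made-up sample data, for illustration only.
customers = pd.DataFrame({"ID": [341, 127, 389, 512, 78, 903, 250]})
result = assign_balanced_groups(customers, [0, 1, 2], seed=0)
counts = result["group_user"].value_counts()
print(result)
print(counts)
```

Passing a seed makes the assignment reproducible; leave it as None to get a different shuffle on each run.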