I want to compute cumulative values under a condition: I only want to add a value when Condition == New, and the cumulative values should be grouped by Name. Say I have the data frame df:
Name | Condition | Value
A | New | 5
A | Not New | 7
A | New | 15
B | Old | 20
B | New | 14
B | Old | 3
I am looking for the output:
Name | Condition | Value | Cumulative Value
A | New | 5 | 5
A | Not New | 7 | 5
A | New | 15 | 20
B | Old | 20 | 0
B | New | 14 | 14
B | Old | 3 | 14
For each entry where Condition == New, I add the current Value to the running Cumulative Value. Otherwise, the Cumulative Value stays unchanged. When a new Name starts, the Cumulative Value resets: to zero if that Name's first Condition is ‘Old’, or to the first Value if its first Condition is ‘New’.
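To pin down the rule, here is a naive plain-Python sketch of the behaviour I am after. The names `rows`, `totals`, and `cumulative` are only illustrative, and the loop is meant as a specification of the expected result rather than the pandas solution I am looking for.

```python
# Rows from the example data frame: (Name, Condition, Value)
rows = [
    ("A", "New", 5),
    ("A", "Not New", 7),
    ("A", "New", 15),
    ("B", "Old", 20),
    ("B", "New", 14),
    ("B", "Old", 3),
]

totals = {}       # running total per Name, starting from 0
cumulative = []   # one entry per row, in input order

for name, condition, value in rows:
    running = totals.get(name, 0)
    if condition == "New":
        running += value   # only 'New' rows contribute
    totals[name] = running
    cumulative.append(running)

print(cumulative)
```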
I am stuck with the code
df['Cumulative Value'] = df.groupby(['Name'])['Value'].cumsum()
which groups by Name and sums every Value regardless of Condition. I am not sure how to add an extra condition that checks the string in the Condition column and keeps the Cumulative Value unchanged when the condition does not match.
You can replace Value with 0 wherever Condition is not equal to New, and then take the cumulative sum within each Name:
df['Cumulative Value'] = df.Value.where(df.Condition == 'New', 0).groupby(df.Name).cumsum()
df

  Name Condition  Value  Cumulative Value
0    A       New      5                 5
1    A   Not New      7                 5
2    A       New     15                20
3    B       Old     20                 0
4    B       New     14                14
5    B       Old      3                14
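For anyone who wants to reproduce this end to end, here is a self-contained sketch that builds the example frame from the question and splits the one-liner into two steps. The intermediate name `masked` is my own and is not required; the sketch assumes Condition holds the exact string 'New' with no extra whitespace or different casing.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Condition": ["New", "Not New", "New", "Old", "New", "Old"],
    "Value": [5, 7, 15, 20, 14, 3],
})

# Keep Value where Condition is 'New'; substitute 0 everywhere else
masked = df["Value"].where(df["Condition"] == "New", 0)

# Running total of the masked values within each Name
df["Cumulative Value"] = masked.groupby(df["Name"]).cumsum()

print(df)
```

Because the non-matching rows contribute 0, the running total simply carries forward on those rows, and grouping by Name restarts it for each new Name.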