I have a Pandas dataframe with columns A, B, C and D. I would like to add a Desired Column as follows:

Grouping by `['A', 'B', 'C']`, I would like the Desired Column to show the cumulative sum of the **FIRST CONSECUTIVE** *True* values in column D.

| A | B | C | D | Desired Column |
|---|---|---|---|---|
| 100 | AAA | 001 | False | 0 |
| 100 | AAA | 001 | False | 0 |
| 200 | BBB | 055 | True | 1 |
| 200 | BBB | 055 | True | 2 |
| 200 | BBB | 055 | True | 3 |
| 200 | BBB | 055 | False | 3 |
| 200 | BBB | 055 | True | 3 |
| 300 | CCC | 099 | False | 0 |
| 300 | CCC | 099 | True | 0 |

A False value stops the cumulative sum in a group, and any True values after that False are not considered.

I then want to use this table to calculate an aggregated one:

| A | B | C | Max(Desired Column) |
|---|---|---|---|
| 100 | AAA | 001 | 0 |
| 200 | BBB | 055 | 3 |
| 300 | CCC | 099 | 0 |

Thanks for your help!

## Answer

You can use `cummin` to mark all values after the first `False` in each group as `False`, and then calculate `cumsum`:

    df['Desired Column'] = df.groupby(['A', 'B', 'C']).D.transform(lambda x: x.cummin().cumsum())
    df

         A    B   C      D  Desired Column
    0  100  AAA   1  False               0
    1  100  AAA   1  False               0
    2  200  BBB  55   True               1
    3  200  BBB  55   True               2
    4  200  BBB  55   True               3
    5  200  BBB  55  False               3
    6  200  BBB  55   True               3
    7  300  CCC  99  False               0
    8  300  CCC  99   True               0
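To see why the `cummin`/`cumsum` combination works, it can help to look at the two steps on a single group first. The sketch below rebuilds the sample data by hand from the question; the names `group`, `mask`, `counts` and `keys` are only illustrative, and it assumes a reasonably recent pandas where `cummin` accepts a boolean Series.

```python
import pandas as pd

# One group in isolation: three leading True values, a False, then another True.
group = pd.Series([True, True, True, False, True])

# cummin() keeps True only until the first False; everything after stays False.
mask = group.cummin()

# cumsum() then counts the surviving leading True values.
counts = mask.cumsum()

# The same idea applied per group to the sample data from the question.
df = pd.DataFrame({
    "A": [100, 100, 200, 200, 200, 200, 200, 300, 300],
    "B": ["AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB", "CCC", "CCC"],
    "C": [1, 1, 55, 55, 55, 55, 55, 99, 99],
    "D": [False, False, True, True, True, False, True, False, True],
})
keys = ["A", "B", "C"]
df["Desired Column"] = df.groupby(keys)["D"].transform(
    lambda x: x.cummin().cumsum()
)
print(df)
```

Because `False < True`, the running minimum can never return to `True` once a `False` has been seen, which is what discards the later `True` values in the 200 and 300 groups.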

If you only need the aggregate column, then you can just find the index of the first `False` with `argmin`:

    df.groupby(['A', 'B', 'C'], as_index=False).D.agg(
        lambda x: len(x) if x.all() else x.argmin()
    )

         A    B   C  D
    0  100  AAA   1  0
    1  200  BBB  55  3
    2  300  CCC  99  0
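As a cross-check, a possible alternative (not part of the answer above) is to sum the `cummin` mask per group, which counts the leading `True` values directly and needs no special case for all-`True` groups. The sketch below compares it with the `argmin` expression on the sample data; the helper names `leading_true_argmin` and `leading_true_sum` and the output column name `leading_true` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [100, 100, 200, 200, 200, 200, 200, 300, 300],
    "B": ["AAA", "AAA", "BBB", "BBB", "BBB", "BBB", "BBB", "CCC", "CCC"],
    "C": [1, 1, 55, 55, 55, 55, 55, 99, 99],
    "D": [False, False, True, True, True, False, True, False, True],
})
keys = ["A", "B", "C"]


def leading_true_argmin(x):
    # Position of the first False, or the group length if there is no False.
    return len(x) if x.all() else x.argmin()


def leading_true_sum(x):
    # Number of True values before the first False.
    return int(x.cummin().sum())


via_argmin = df.groupby(keys)["D"].agg(leading_true_argmin)
via_sum = df.groupby(keys)["D"].agg(leading_true_sum)

# Flatten the group keys back into columns for a table like the one requested.
result = via_sum.reset_index(name="leading_true")
print(result)
```

Both versions should agree on this data; which one reads better is largely a matter of taste.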