Shape after pd.concat does not match the original data shape

Hi, I am filtering my data in several steps, based on negative values and on a count column. However, when I concat the filtered pieces back together, the result does not have the same shape as the original data.

Below is the code for your reference:

import pandas as pd

df = pd.DataFrame({'A':[1, 2, 3, 4, 5], 'B':[6, 7, -2, -3, 10],'C':[9, -3, -7, 6, 10],'D':[-4,6,5,7,-2]})
df.shape  # (5, 4)
####################################

# rows with a negative value in B, C or D, and rows where A <= 1
negative_vlaue = df[(df['B']<=-1)|(df['C']<=-1)|(df['D']<=-1)]
inv_countone=df[df['A']<=1]

##############################################

# rows with a non-negative value in B, C or D, restricted to A > 1
mydata=df[(df['B']>-1)|(df['C']>-1)|(df['D']>-1)]
mydata=mydata[mydata['A']>1]
################################################

finaldata=pd.concat([mydata,negative_vlaue,inv_countone])

finaldata.shape  # (10, 4), not (5, 4)

Answer

IIUC, you need to add your mydata and negative_vlaue dataframes together (element-wise, aligned on the index) and then use combine_first to bring in inv_countone:

final_data = (mydata.add(negative_vlaue)).combine_first(inv_countone)

Prints:

>>> final_data

      A     B     C     D
0   1.0   6.0   9.0  -4.0
1   4.0  14.0  -6.0  12.0
2   6.0  -4.0 -14.0  10.0
3   8.0  -6.0  12.0  14.0
4  10.0  20.0  20.0  -4.0

I can’t be fully sure that this is what you need, as you did not specify your desired outcome, but it uses the filters you listed above and keeps the original shape.
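To make the two steps easier to follow, here is a minimal sketch that runs them separately on the data from the question. The intermediate name `summed` is only for illustration. Note that rows 1 to 4 appear in both mydata and negative_vlaue, so their values come out doubled, which may or may not be what you want.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, -2, -3, 10],
                   'C': [9, -3, -7, 6, 10], 'D': [-4, 6, 5, 7, -2]})

# Same filters as in the question
negative_vlaue = df[(df['B'] <= -1) | (df['C'] <= -1) | (df['D'] <= -1)]
inv_countone = df[df['A'] <= 1]
mydata = df[(df['B'] > -1) | (df['C'] > -1) | (df['D'] > -1)]
mydata = mydata[mydata['A'] > 1]

# add() aligns on index labels: rows 1-4 exist in both frames and are summed;
# row 0 exists only in negative_vlaue, so every cell in it becomes NaN.
summed = mydata.add(negative_vlaue)

# combine_first() fills those NaN cells with the values from inv_countone (row 0).
final_data = summed.combine_first(inv_countone)

print(summed)
print(final_data.shape)
```

The NaN row created by add() is also why the result is float rather than int.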

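pd.concat is not the solution here because, by default, it stacks the dataframes row-wise (axis=0). Your three dataframes overlap (the same rows of df appear in more than one of them), so the result has as many rows as all three inputs combined rather than as many as df: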

>>> finaldata_concat = pd.concat([mydata, negative_vlaue, inv_countone])
>>> len(mydata) + len(negative_vlaue) + len(inv_countone) == len(finaldata_concat)
True
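If the goal is instead to split df into "removed" and "kept" rows and later concat them back to the original shape, one option is to build a single boolean mask and use its negation, so that each row lands in exactly one piece. This is a different technique from the one above and only a sketch, assuming that is the intent. The names `removed_mask`, `removed`, `kept` and `restored` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, -2, -3, 10],
                   'C': [9, -3, -7, 6, 10], 'D': [-4, 6, 5, 7, -2]})

# True for rows that have a negative value in B, C or D
has_negative = (df[['B', 'C', 'D']] <= -1).any(axis=1)
# True for rows where A <= 1
low_count = df['A'] <= 1

# A row is removed if either condition holds; ~ gives the exact complement
removed_mask = has_negative | low_count
removed = df[removed_mask]
kept = df[~removed_mask]

# The two pieces do not overlap, so concat restores the original row count
restored = pd.concat([kept, removed]).sort_index()
print(len(kept), len(removed), restored.shape)
```

With the sample data every row has at least one negative value in B, C or D, so `kept` is empty here. That is also why negative_vlaue in the question already contains all 5 rows.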