# Add two dataframes with related elements but different shapes

main_df

ID A B C D E F G
01 1 0 0 0 1 0 0
02 0 0 0 0 0 1 1
03 1 0 1 0 0 0 0
04 1 0 0 1 0 0 0

sub_df

ID B C D E
01 1 0 0 1
02 0 1 1 0
04 1 0 1 0

I want to add sub_df onto main_df and replace every value greater than 1 with 1 (all elements in main_df should contain only 0s and 1s).

The final result should look like this:

ID A B C D E F G
01 1 1 0 0 1 0 0
02 0 0 1 1 0 1 1
03 1 0 1 0 0 0 0
04 1 1 0 1 0 0 0

I’ve tried append() and merge(), but they only append one dataframe to the other. I would have to write another Python function that loops through the dataframe to do the calculation. Is there a better way to complete the task?

Use `pandas.DataFrame.add` with `fill_value=0`.

Set `ID` as the index of both frames if it is not already (here `df1` is `main_df` and `df2` is `sub_df`):

```python
df1 = df1.set_index("ID")
df2 = df2.set_index("ID")
```

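`add` aligns the two frames on their index and column labels. Where a label exists in only one frame, `pandas` substitutes `fill_value` for the missing cell before adding, so rows and columns absent from `df2` are left unchanged.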

```python
new_df = df1.add(df2, fill_value=0)
```
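To make the alignment step concrete, here is a minimal sketch that rebuilds the two frames from the question and adds them. It assumes the IDs are read as integers (so `01` becomes `1`) and uses the question's names `main_df` and `sub_df` in place of `df1` and `df2`.

```python
import pandas as pd

# Rebuild the frames from the question; IDs are assumed to be integers here.
main_df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4],
        "A": [1, 0, 1, 1],
        "B": [0, 0, 0, 0],
        "C": [0, 0, 1, 0],
        "D": [0, 0, 0, 1],
        "E": [1, 0, 0, 0],
        "F": [0, 1, 0, 0],
        "G": [0, 1, 0, 0],
    }
).set_index("ID")
sub_df = pd.DataFrame(
    {
        "ID": [1, 2, 4],
        "B": [1, 0, 1],
        "C": [0, 1, 0],
        "D": [0, 1, 1],
        "E": [1, 0, 0],
    }
).set_index("ID")

# Labels are aligned on both axes; cells missing from one frame count as 0.
new_df = main_df.add(sub_df, fill_value=0)

# Both frames hold 1 at (ID 1, E) and (ID 4, D), so those sums exceed 1.
print(new_df.loc[1, "E"], new_df.loc[4, "D"])
# Alignment typically upcasts the result to float.
print(new_df.dtypes.unique())
```

The sums above 1 at those two cells are what the next step has to cap.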

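Then cast with `astype` so that every nonzero value becomes 1 and every zero stays 0.

Note that this is a bit hacky: `astype(bool)` turns any nonzero value into `True`, so decimals such as 0.5 would also become 1.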

```python
print(new_df.astype(bool).astype(int))
```
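To see why the cast is fragile with non-integer data, here is a small sketch on made-up values that are not from the question. It contrasts the `bool` round trip with `clip(upper=1)`, which is an alternative introduced here for comparison.

```python
import pandas as pd

# Hypothetical values for illustration: a fraction, a count, zero, a negative.
s = pd.Series([0.5, 2.0, 0.0, -1.0])

# Any nonzero value becomes True, so 0.5 and -1.0 both end up as 1.
as_flag = s.astype(bool).astype(int)

# Capping only the upper bound leaves values at or below 1 untouched.
capped = s.clip(upper=1)

print(as_flag.tolist())
print(capped.tolist())
```

For the 0/1 data in the question the two agree, so the caveat only matters if fractional or negative values can appear.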

Or use a plain comparison that replaces only the values greater than 1, without the conversion to int:

```python
new_df.mask(new_df.gt(1), 1)
```

Output:

```
    A  B  C  D  E  F  G
ID
1   1  1  0  0  1  0  0
2   0  0  1  1  0  1  1
3   1  0  1  0  0  0  0
4   1  1  0  1  0  0  0
```
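The integers shown above are what the `astype` variant prints. `mask` on its own keeps the dtype of `new_df`, which is typically float after `add`. The sketch below, on two hand-typed columns of the summed frame, shows one way to get integers from the `mask` variant as well; the variable names are illustrative.

```python
import pandas as pd

# Two columns of the summed frame, typed in by hand as floats for illustration.
new_df = pd.DataFrame(
    {"D": [0.0, 1.0, 0.0, 2.0], "E": [2.0, 0.0, 0.0, 0.0]},
    index=pd.Index([1, 2, 3, 4], name="ID"),
)

# mask replaces cells where the condition holds; the floats stay floats here.
masked = new_df.mask(new_df.gt(1), 1)

# An explicit cast turns the capped floats into integers.
masked_int = masked.astype(int)

# For non-negative whole numbers this agrees with the bool round trip.
flagged = new_df.astype(bool).astype(int)
print(masked_int.equals(flagged))
```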