Finding duplicates and creating sets in pandas

Input:

Part no A B C D
A1 0.25 0.2 0.3 0.4
A2 0.26 0.3 0.3 0.4
A3 0.3 0.3 0.3 0.3
A4 0.7 0.3 0.3 0.3
A5 0.8 0.4 0.45 0.46

I need to assign a set number to rows whose values in column A are duplicates within a tolerance of +/-0.1.

Expected output:

Part no A B C D Set
A1 0.25 0.2 0.3 0.4 1
A2 0.26 0.3 0.3 0.4 1
A3 0.3 0.3 0.3 0.3 1
A4 0.7 0.3 0.3 0.3 2
A5 0.8 0.4 0.45 0.46 2

Answer

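Sort by A, then use cumsum to number the groups, starting a new group wherever the gap to the previous row exceeds the tolerance: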

# If your dataframe is not already sorted by column 'A'
df = df.sort_values('A')

# A new set starts wherever the gap to the previous row exceeds the tolerance
df['Set'] = df['A'].sub(df['A'].shift()).abs().ge(0.1000000001).cumsum().add(1)
>>> df
  Part no     A    B     C     D  Set
0      A1  0.25  0.2  0.30  0.40    1
1      A2  0.26  0.3  0.30  0.40    1
2      A3  0.30  0.3  0.30  0.30    1
3      A4  0.70  0.3  0.30  0.30    2
4      A5  0.80  0.4  0.45  0.46    2
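
To make the chained one-liner easier to follow, here is a sketch that splits it into named steps. It uses only column A from the question; the names `a`, `gap`, `starts_new`, and `set_id` are illustrative and not part of the original answer.

```python
import pandas as pd

# Column A from the question, already sorted
a = pd.Series([0.25, 0.26, 0.30, 0.70, 0.80], name="A")

# Distance to the previous row; the first row has no predecessor, so it is NaN
gap = a.sub(a.shift()).abs()

# True where the gap exceeds the tolerance; a NaN gap compares as False
starts_new = gap.ge(0.1000000001)

# Running count of breaks, shifted by one so numbering starts at 1
set_id = starts_new.cumsum().add(1)

print(pd.DataFrame({"A": a, "gap": gap, "starts_new": starts_new, "Set": set_id}))
```

Only the row for 0.70 is flagged in `starts_new`, which is why the set number changes exactly once.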

The threshold is 0.1000000001 rather than 0.1 to absorb floating-point rounding error in the differences. You can also use np.isclose.
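
The rounding issue can be checked in plain Python with the two largest values of A from the question. This is only a small illustration; the variable names are made up for the example.

```python
# Gap between A5 (0.8) and A4 (0.7) from the question
gap = 0.8 - 0.7

# Binary floating point cannot represent these decimals exactly,
# so the computed gap lands slightly above 0.1
gap_exceeds_tolerance = gap > 0.1

# Against the padded threshold the same gap counts as within tolerance
gap_exceeds_padded = gap >= 0.1000000001

print(gap, gap_exceeds_tolerance, gap_exceeds_padded)
```

With standard 64-bit floats this prints `0.10000000000000009 True False`, so comparing against exactly 0.1 would push A5 into a third set.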

With np.isclose:

>>> import numpy as np
>>> df['Set'] = np.cumsum(~np.isclose(df['A'], df['A'].shift(), atol=0.1))
>>> df
  Part no     A    B     C     D  Set
0      A1  0.25  0.2  0.30  0.40    1
1      A2  0.26  0.3  0.30  0.40    1
2      A3  0.30  0.3  0.30  0.30    1
3      A4  0.70  0.3  0.30  0.30    2
4      A5  0.80  0.4  0.45  0.46    2
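
For reuse, the same idea can be wrapped in a function. The following is a minimal sketch, not the answer's own code: `assign_sets` and its `pad` parameter are hypothetical names, the padding value mirrors the 0.1000000001 threshold above, and the frame is assumed to have a unique index so that the labels can be aligned back to the original row order.

```python
import numpy as np
import pandas as pd


def assign_sets(frame, column="A", tol=0.1, pad=1e-10):
    """Hypothetical helper: label runs of rows whose sorted neighbours differ by at most tol."""
    ordered = frame.sort_values(column)
    values = ordered[column].to_numpy(dtype=float)
    # The first row always opens a set; later rows open one when they are
    # not within tol (plus a small pad for rounding error) of their predecessor.
    opens_set = np.ones(len(values), dtype=bool)
    opens_set[1:] = ~np.isclose(values[1:], values[:-1], rtol=0.0, atol=tol + pad)
    labels = pd.Series(np.cumsum(opens_set), index=ordered.index)
    # assign() aligns on the index, so the caller's row order is preserved
    return frame.assign(Set=labels)


# Data from the question, deliberately shuffled to exercise the sorting step
df = pd.DataFrame({
    "Part no": ["A4", "A1", "A5", "A3", "A2"],
    "A": [0.70, 0.25, 0.80, 0.30, 0.26],
})
print(assign_sets(df))

# Chaining: every neighbouring gap is about 0.08, so all four rows share one set
# even though the first and last values differ by 0.24
chain = pd.DataFrame({"A": [0.25, 0.33, 0.41, 0.49]})
print(assign_sets(chain))
```

The second example shows a property of both approaches above: each row is compared only with its predecessor, so a set can span more than the tolerance from end to end. If every member must lie within 0.1 of the first value in its set, a different grouping rule would be needed.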