Turning object columns except one into integer (pd.to_numeric not working; data listed as object and errored as float)

I have run into a problem where my data is listed as object at some point, and then I get an error about it being a ‘float’.

My columns are currently typed as object:

Household    Energy   Water   Food
Brunt         23%      34%    43%
Liv           17%      29%    54%
Rowan         37%      22%    41%
Lisz          32%      32%    36%

I’m trying to remove the ‘%’, convert the values to decimal form, and cast them to int with the following:

df.update(df.apply(lambda x : pd.to_numeric(x.str.rstrip('%'),downcast='integer',errors='coerce'))/100)

But when I check with df.info(), the columns are still listed as object.

I’m trying to plot this as a stacked horizontal bar chart using matplotlib, with the % value shown on each bar. But when I try the following:

df.plot( 
    x = 'Household', 
    kind = 'barh', 
    stacked = True, 
    title = 'Cluster 1021 HH Consumption', 
    mark_right = True) 

df_total = df.iloc[:,-17:-1].sum(axis=0)
df_rel = df[df.columns[1:]].div(df_total, 0)*100
  
for n in df_rel:
    for i, (cs, ab, pc, tot) in enumerate(zip(df.iloc[:, 1:].cumsum(1)[n], df_rel[n], df_total[n])): 
        plt.text(tot, i, str(tot), va='center')
        plt.text(cs - ab/2, i, str(np.round(pc, 1)) + '%', va='center', ha='center')

It gives me TypeError: 'float' object is not iterable in the second for loop. How can this be? How can I fix this?

Answer

First, create a list of the columns that contain the '%' symbol and that you want to convert to int:

col=df.columns[1:]

The output of above code is Index(['Energy', 'Water', 'Food'], dtype='object')

Then use a for loop with the str.replace() and astype() methods:

for x in col:
    df[x]=df[x].str.replace('%','').astype(int)

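Now if you run df.info(), it will show these columns with an integer dtype. Note that this keeps the values as whole percentages (23, not 0.23); dividing by 100 to get the decimal form would make the columns float rather than int.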
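For reference, here is a minimal self-contained sketch of the same conversion using the sample data from the question, followed by one possible way to compute the bar labels. It assumes the labels are meant to be each household's share of its own row total; the names value_cols, row_total, cumulative and labels are made up for illustration. The TypeError in the question most likely arises because df_total is built with sum(axis=0), so df_total[n] is a single number, and zip() cannot iterate over a single number. Summing per row (axis=1) gives one total per household instead.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Household": ["Brunt", "Liv", "Rowan", "Lisz"],
    "Energy": ["23%", "17%", "37%", "32%"],
    "Water": ["34%", "29%", "22%", "32%"],
    "Food": ["43%", "54%", "41%", "36%"],
})

value_cols = df.columns[1:]  # every column except 'Household'

# Strip the trailing '%' and convert to numbers. Assigning the result back
# replaces the columns, so the numeric dtype is kept.
df[value_cols] = df[value_cols].apply(lambda s: pd.to_numeric(s.str.rstrip("%")))

# Assumption: labels show each value as a share of its row (household) total.
row_total = df[value_cols].sum(axis=1)                # one total per household
df_rel = df[value_cols].div(row_total, axis=0) * 100  # row-wise percentages
cumulative = df[value_cols].cumsum(axis=1)            # right edge of each bar segment

# Every argument to zip() is now a per-row Series, so it can be iterated.
labels = []
for col in value_cols:
    for i, (cs, ab, pc) in enumerate(zip(cumulative[col], df[col], df_rel[col])):
        # x = middle of the segment, y = row index, text = formatted percentage
        labels.append((cs - ab / 2, i, f"{pc:.1f}%"))
```

After calling df.plot(x='Household', kind='barh', stacked=True), each (x, y, text) tuple in labels could be passed to plt.text(x, y, text, va='center', ha='center').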
