I have two columns from a dataframe that I want to get the correlation coefficient for: df['a'] and df['b']. There are around 15 or 20 rows of data.
I assign these to "col1" and "col2" and try to call the corr method:
col1 = df['a']
col2 = df['b']
corr = col1.corr(col2, method="pearson")
I get an error: 'float' object has no attribute 'shape'
If I import the stats library and try:
I get a correlation coefficient. So what did I do wrong on the first one?
In answer to one of the comments: I checked the types of col1 and col2, and both are Series. I thought this would work because of this page in the documentation: https://pandas.pydata.org/docs/reference/api/pandas.Series.corr.html. It gives no indication that you need to specify that this is a Series rather than a DataFrame.
I also checked the type of the full dataframe, and it comes back as DataFrame.
The full dataframe is 21 columns with an index. I only want to get the Correlation Coefficient for two of the columns.
Here is a subset of the data I get if I print col1 and col2:
Name: a, dtype: object
Name: b, dtype: object
Is the index of Country causing the problem?
Either df is a Series:
>>> df
a    10.0
b    12.0
dtype: float64
or a column of your dataframe has the wrong type:
>>> df
      a     b
0  10.0  20.0
1  12.0  22.0
>>> df.dtypes
a    float64
b     object
dtype: object
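Since the printed columns in the question show dtype: object, the second case looks like the likely one. Here is a minimal sketch of how you might check and fix it. The country codes and values are made up for illustration, and I am assuming the columns hold numbers stored as Python objects. Whether the original error reproduces may depend on your pandas version, so the sketch only shows the conversion and the resulting call.

```python
import pandas as pd

# Hypothetical data: numeric values stored with dtype=object,
# indexed by a made-up "Country" label like in the question.
df = pd.DataFrame(
    {"a": [10.0, 12.0, 15.0, 11.0], "b": [20.0, 22.0, 27.0, 21.0]},
    index=["AT", "BE", "CH", "DE"],
    dtype=object,
)
df.index.name = "Country"

# Both columns report "object" here, which is the warning sign.
print(df.dtypes)

# Convert to a numeric dtype; errors="coerce" turns anything that
# cannot be parsed as a number into NaN instead of raising.
col1 = pd.to_numeric(df["a"], errors="coerce")
col2 = pd.to_numeric(df["b"], errors="coerce")

# With float64 columns, Series.corr computes the Pearson coefficient.
corr = col1.corr(col2, method="pearson")
print(corr)
```

A string index such as Country should not be a problem in itself: Series.corr aligns the two Series on their index labels, and here both come from the same dataframe. If the conversion produces NaN values, it is worth looking at those rows, since they point to entries that were not numbers to begin with.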