I am trying to run an analysis in Jupyter. When I run the code below I get a NameError, even though I defined df at the beginning.

df = pd.read_csv('dowjones.csv', index_col=0);
df['rm'] = 100 * (np.log(df.DJIA) - np.log(df.DJIA.shift(1)))
df.head()
  1. This is the first cell, where I initially defined df.
df = df.dropna()
formula = 'MSFTtrans ~ rm'
results2 = smf.ols(formula, df).fit(cov_type = 'HAC', cov_kwds={'maxlags':10,'use_correction':True})
print(results2.summary())
  1. This is the second cell, which I ran next.
NameError                                 Traceback (most recent call last)
<ipython-input-3-b46efd5c722d> in <module>
      2 
      3 
----> 4 df = df.dropna()
      5 formula = 'MSFTtrans ~ rm'
      6 results2 = smf.ols(formula, df).fit(cov_type = 'HAC', cov_kwds={'maxlags':10,'use_correction':True})

NameError: name 'df' is not defined
  1. This is the error I got, saying that df is not defined.

Answer

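You do not need the semicolon at the end of df = pd.read_csv(). It is harmless in Python (in Jupyter a trailing semicolon only suppresses the displayed output of the last expression in a cell), so it is not what causes the NameError, but it is better style to drop it.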
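
As a quick sanity check on the semicolon, here is a minimal sketch in plain Python. The variable names are made up for illustration and are not from the question. It suggests that a trailing semicolon does not prevent the assignment from happening:

```python
# A trailing semicolon just ends the statement; the assignment still runs.
values = [1, 2, 3];
total = sum(values);

print(values)
print(total)
```

Both names are defined after this runs, so the semicolon on its own should not produce a NameError.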

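Also, run the first cell and then the second cell. The NameError means the first cell has not been executed in the current kernel session (for example, it was never run, or the kernel was restarted afterwards), so df does not exist when the second cell calls df.dropna(). Variables only exist once the cell that defines them has actually run; having the code written in the notebook is not enough.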
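
To illustrate the run-order point, here is a minimal sketch that simulates two notebook cells sharing one namespace, using only the standard library. The names run_cell, cell_1 and cell_2 are hypothetical, and a plain list stands in for the DataFrame. This is not how Jupyter is implemented, just the same effect in miniature:

```python
# One shared dict plays the role of the kernel's namespace.
namespace = {}

# Hypothetical stand-ins for the two cells in the question.
cell_1 = "df = [1.0, None, 3.0]"                  # defines df
cell_2 = "df = [x for x in df if x is not None]"  # uses df, like df.dropna()


def run_cell(source, ns):
    """Execute one 'cell' in the shared namespace and report the outcome."""
    try:
        exec(source, ns)
        return "ok"
    except NameError as err:
        return f"NameError: {err}"


# Running the second cell first fails, because df does not exist yet.
first_attempt = run_cell(cell_2, namespace)
print(first_attempt)

# Running the cells in order works.
run_cell(cell_1, namespace)
second_attempt = run_cell(cell_2, namespace)
print(second_attempt, namespace["df"])
```

In the notebook itself, the equivalent fix is to run the cell containing pd.read_csv first (or use "Run All" after a kernel restart) before running the cell that calls df.dropna().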