I am using pandas.to_sql to write data to an existing MySQL table. The code has been running in a crontab job for weeks without fail.
I started to get the following error: ValueError: MySQL identifier cannot be entirely numeric
thisweek.to_sql(name='bs_reporting_weeklymetrics', con = cnx, flavor = 'mysql', if_exists = 'append', index=False)
As you can see the table name is not numeric.
This is caused by an update in pandas 0.16.1, where I was using a previous version prior (I think 0.14.XX) EDIT: this will be fixed in pandas 0.16.2
With this update there is new code in the io.sql package for to_sql that is checking the table name and all column names for numeric characters:
def _get_valid_mysql_name(name): # Filter for unquoted identifiers # See http://dev.mysql.com/doc/refman/5.0/en/identifiers.html uname = _get_unicode_name(name) if not len(uname): raise ValueError("Empty table or column name specified") basere = r'[0-9,a-z,A-Z$_]' for c in uname: if not re.match(basere, c): if not (0x80 < ord(c) < 0xFFFF): raise ValueError("Invalid MySQL identifier '%s'" % uname) if not re.match(r'[^0-9]', uname): raise ValueError('MySQL identifier cannot be entirely numeric') return '`' + uname + '`'
re.match(r'[0-9], uname) returns as None if the uname value is only numeric OR STARTS with a numeric character. I think this is a bug because MySQL supports column names which contain and start with numeric characters (which I had ’90DayTrailingAvgRevenue’)
You can update the pandas code to use:
if re.match(r'[0-9][0-9]*$', uname):
inplace of that line. This changes the regular expression to look for 1 or more numeric characters and the end of line, so that it qualifies the entire name as numeric and not just the first character. It also switches to a positive match instead of negative, so I removed ‘not’
If you don’t want to mess with the pandas package, then I suggest renaming your columns to not start with numerics.