I have 3000 .docx files spread across several directories and subdirectories. I need to prepare a list consisting of each filename and the information extracted from the tables in that file. I have already collected all the .docx files in the list targets_in_dir, separating them from non-relevant files.
Question: I would like to iterate through targets_in_dir and extract all tables from each docx:
    len_target =len(targets_in_dir)
    file_processed=[]
    string_tables=[]

    for i in len_target:

        doc = docx.Document(targets_in_dir[i])
        file_processed.append(targets_ind[i])

        for table in doc.tables:
            for row in table.rows:
                for cell in row.cells:
                    str.split('MANUFACTURER')
                    string_tables.append(cell.text)
I get the error 'int' object is not iterable
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-39-4847866a9234> in <module>
          4 string_tables=[]
          5
    ----> 6 for i in len_target:
          7
          8     doc = docx.Document(targets_in_dir[i])

    TypeError: 'int' object is not iterable
What am I doing wrong?
Answer
It looks like you are trying to iterate over len_target = len(targets_in_dir), which is an int. Because an int is not an iterable object, your for loop fails.
A for loop needs an iterable object, such as a list or a range, to work.
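Changing the loop to

    for i in range(len_target):
        # do stuff

or

    for i in targets_in_dir:
        # do stuff

is a good place to start. Note that in the second form i is the list element itself (the file path), not an index, so targets_in_dir[i] in the loop body becomes just i.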
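To make the difference concrete, here is a small sketch that reproduces the error and shows the working loop forms side by side. The file names are made up for illustration, and enumerate is an extra option not used in the question, handy if you want both the index and the path.

```python
# Hypothetical file names standing in for targets_in_dir.
targets = ["a.docx", "b.docx", "c.docx"]

# Iterating over an int reproduces the TypeError from the question.
try:
    for i in len(targets):
        pass
except TypeError as exc:
    error_message = str(exc)

# Fix 1: iterate over indices with range().
by_index = [targets[i] for i in range(len(targets))]

# Fix 2: iterate over the list elements directly (usually the most readable).
direct = [path for path in targets]

# Optional: enumerate() yields (index, element) pairs when both are needed.
with_index = list(enumerate(targets))
```

All three working forms visit the same elements in the same order; they only differ in whether the loop variable is an index, an element, or both.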
Also, your file_processed.append(targets_ind[i]) has a typo: targets_ind should be targets_in_dir.
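Putting both fixes together, a minimal sketch of the whole loop could look like the following. It assumes python-docx is installed and that targets_in_dir holds file paths; the names cell_texts and extract_tables are my own, and returning a dict keyed by path is just one possible way to keep each filename next to its extracted text.

```python
def cell_texts(doc):
    """Return the text of every cell in every table of a document object."""
    return [
        cell.text
        for table in doc.tables
        for row in table.rows
        for cell in row.cells
    ]


def extract_tables(paths):
    """Map each .docx path to the list of cell texts from its tables."""
    # python-docx is imported here so cell_texts stays usable without it.
    import docx

    results = {}
    for path in paths:  # iterate over the paths themselves, no index needed
        results[path] = cell_texts(docx.Document(path))
    return results
```

You would then call extract_tables(targets_in_dir) and filter the returned cell texts for the entries you care about, for example those following a 'MANUFACTURER' cell.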