Openpyxl – Transfer range of rows from a worksheet to another

I want to take a certain part of data from a sheet and copy it to another sheet.

So far, I have a dictionary whose keys are the start rows and whose values are the end rows.

Using this, I would like to do the following:

-Get the first range from sheet0 and append it to sheet1

-Get the second range from sheet0 and append it to sheet2

-Get the third range from sheet0 and append it to sheet3

I tried the following:

#First range starts at 1 and ends at 34, second range from 34-52 and third from 52-75
myDict = {1: 34, 34: 52, 52: 75}

#store all the sheets, ignoring main sheet
sheet = wb.worksheets[1:] 


for item in myDict:
    for col in ws.iter_cols(min_row=item, max_row=myDict[item], min_col=1 , max_col=ws.max_column):
        for cell in col:
            for z in sheet:
                z.append(col)

Another approach was to use a function and lists:

startRow=[1,34,52]
endRow=[34,52,75]

def addRange(first, second):
    for col in ws.iter_cols(min_row=first, max_row=second, min_col=1 , max_col=ws.max_column):
        for cell in col:
            for z in sheet:
                z.append(col)    

#Call function    
for start, end in zip(startRow, endRow):
    addRange(start, end)  

Both attempts fail with the same error: “ValueError: Cells cannot be copied from other worksheets”.

Does anyone have a clue what I am missing here?

Thanks in advance!

Answer

from openpyxl import load_workbook
from itertools import product

filename = 'wetransfer-a483c9/testFile.xlsx'
wb = load_workbook(filename)

sheets = wb.sheetnames[1:]

Here, sheets is ['Table 1', 'Table 2', 'Table 3'].

# access the main worksheet
ws = wb['Main']

First, get the boundaries (start and end rows) of each table.

span = []
for row in ws:
    for cell in row:
        if (cell.value
                and (cell.column == 2)  # restrict search to column2, which is where the Table entries are
                # this also avoids the int error, since integers are not iterable
                and ("Table" in cell.value)):
            span.append(cell.row)

# add sheet's length -> allows us to effectively capture the data boundaries
span.append(ws.max_row + 1)

Result span : [1, 29, 42, 58]
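The search above relies on column 2 holding only text. As a rough sketch of the same idea without a workbook, the snippet below scans a plain list standing in for one column and uses an isinstance check so that numbers and empty cells are skipped. The function name find_table_rows and the sample values are made up for illustration.

```python
def find_table_rows(column_values, marker="Table"):
    """Return the 1-based row numbers whose value is text containing marker."""
    rows = []
    for row_number, value in enumerate(column_values, start=1):
        # Only strings can be searched with "in"; ints, floats and None are skipped.
        if isinstance(value, str) and marker in value:
            rows.append(row_number)
    return rows


# Hypothetical column contents: headers mixed with numbers and an empty cell.
column_b = ["Table 1", 10, None, "Table 2", 3.5, "Table 3", "x"]

span = find_table_rows(column_b)
# Add one past the last row so the final table also gets an end boundary.
span.append(len(column_b) + 1)
```

With these sample values the header rows are 1, 4 and 6, and the extra boundary is 8.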

Second, pair up the boundaries and convert them to strings. Each table ends on the row just before the next table starts, so the end of each pair is the next start minus 1. openpyxl addresses row ranges as strings such as '1:28', with 1-based row numbers and both ends included.

boundaries = [":".join(map(str,(start, end-1)))  for start, end in zip(span,span[1:])]

Result boundaries : ['1:28', '29:41', '42:57']
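To make the pairing step easier to follow, here is a small sketch in plain Python using the span values shown above. The helper name row_ranges is made up for illustration.

```python
def row_ranges(span):
    """Turn [start1, start2, ..., end_marker] into inclusive 'start:end' strings."""
    ranges = []
    # zip(span, span[1:]) yields (1, 29), (29, 42), (42, 58) for the sample below.
    for start, next_start in zip(span, span[1:]):
        # The table ends on the row just before the next one starts.
        ranges.append(f"{start}:{next_start - 1}")
    return ranges


span = [1, 29, 42, 58]
boundaries = row_ranges(span)
```

Each consecutive pair of start rows becomes one inclusive range, so three ranges come out of four span entries.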

Third, create a Cartesian product of the main sheet with the zipped boundaries and sheets. Because boundaries and sheets are zipped, each boundary is paired with exactly one table:

#table 1 is bound to 1:28,
#table 2 is bound to 29:41, ...

Next, we combine the main sheet with the pair, so main sheet is paired with (table 1, 1:28). The same main sheet is paired with (table 2, 29:41) ...
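The shape of that combination can be checked without a workbook. In this sketch the string "Main" is only a stand-in for the worksheet object; the other values are the ones listed above.

```python
from itertools import product

main_sheet = "Main"  # stand-in for the real worksheet object
boundaries = ["1:28", "29:41", "42:57"]
sheets = ["Table 1", "Table 2", "Table 3"]

# Each item has the form (main_sheet, (row_range, table_name)).
combos = list(product([main_sheet], zip(boundaries, sheets)))

# With a single main sheet, the product adds nothing beyond the zipped pairs.
pairs = list(zip(boundaries, sheets))
```

Since there is only one main sheet, looping over zip(boundaries, sheets) directly would visit the same pairs; product mainly helps if several source sheets had to be combined with every pair.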

Fourth, get the data within the ranges. Since we have successfully paired the main sheet with every pair of table and boundary, we can safely get the data for that particular region and shift it to the particular table.

So for “Table 1”, the loop reads only rows 1:28 of the main sheet, since that range is bound to this table. When it is done with “Table 1”, the loop moves on to “Table 2” and reads only rows 29:41, and so on.

for main,(ref, table) in product([ws],zip(boundaries, sheets)):

    sheet_content = main[ref]
    # append row to the specified table

    for row in sheet_content:
        #here we iterate through the main sheet
        #get one row of data
        #append it to the table
        #move to the next row, append to the table beneath the previous one
        #and repeat the process till the boundary has been exhausted
        wb[table].append([cell.value for cell in row])
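# The loop appends cell values, not the cells themselves, and that is what avoids the ValueError from the question: append() refuses Cell objects that belong to a different worksheet. Below is a minimal in-memory sketch of both cases. The sheet names "Scratch" and "Table 1" and the sample rows are made up for illustration.

```python
from openpyxl import Workbook

wb = Workbook()
main = wb.active
main.title = "Main"
for i in range(1, 7):
    main.append([i, f"row {i}"])  # six sample rows with two columns each

scratch = wb.create_sheet("Scratch")
target = wb.create_sheet("Table 1")

# Case 1: passing Cell objects from "Main" to another sheet is rejected.
first_row_cells = next(main.iter_rows(min_row=1, max_row=1))
try:
    scratch.append(first_row_cells)
    error_raised = False
except ValueError:
    error_raised = True

# Case 2: passing plain values works, here for rows 2 to 4 of "Main".
for row in main["2:4"]:
    target.append([cell.value for cell in row])

copied = [[cell.value for cell in row] for row in target.iter_rows()]
```

Note that only values are transferred this way; styles, number formats and merged cells are not copied.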
    

Finally, save your file.

wb.save(filename)
