Copying the segment from one Excel file to another with python: xlrd and xlsxwriter

I am trying to copy an entire segment of an Excel sheet to another file. The segment is actually a header/description, which mainly describes the attributes of the file, the date it was created, etc… All this takes some cells at first five rows and first 3 columns, say from A1:C3. Here’s the code I’ve written (for sake of example, made only for 3 rows):

import xlsxwriter
import xlrd


#### open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
# list of populated header rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
while c <= 2:
#### read original xcel book 3 rows by loop - counter is futher below
         data = [sheet.cell_value(c, col) for col in range(sheet.ncols)]
         #print data
#### write rows to the new excel book

         workbook = xlsxwriter.Workbook('tty_header.xlsx')
         worksheet = workbook.add_worksheet()
         worksheet.write_row(row_header_list[i], data)
         print i,c,row_header_list[i], data
         i+=1
         c+=1
         print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
         workbook.close()

JavaScript
​x
 
import xlsxwriter
import xlrd
​
​
#### open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
# list of populated header rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
while c <= 2:
#### read original xcel book 3 rows by loop - counter is futher below
         data = [sheet.cell_value(c, col) for col in range(sheet.ncols)]
         #print data
#### write rows to the new excel book
​
         workbook = xlsxwriter.Workbook('tty_header.xlsx')
         worksheet = workbook.add_worksheet()
         worksheet.write_row(row_header_list[i], data)
         print i,c,row_header_list[i], data
         i+=1
         c+=1
         print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
         workbook.close()
​

The counters, data, list values – everything seems to be correct and on time, according to print commands, however, when I run this code, in the newly created file only 3’rd row gets populated, rows 1 and 2 are EMPTY. Don’t understand why… To test the issue, made another example-a really inelegant one – no looping, control lists, etc-just blunt approach:

import xlsxwriter
import xlrd

# open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
data1 = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
data2 = [sheet.cell_value(1, col) for col in range(sheet.ncols)]
data3 = [sheet.cell_value(2, col) for col in range(sheet.ncols)]
data4 = [sheet.cell_value(3, col) for col in range(sheet.ncols)]

### new excelbook
workbook = xlsxwriter.Workbook('tty_header2.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row('A1', data1)
worksheet.write_row('A2', data2)
worksheet.write_row('A3', data3)
worksheet.write_row('A4', data4)

workbook.close()

JavaScript
 
import xlsxwriter
import xlrd
​
# open original excelbook
workbook = xlrd.open_workbook('hello.xlsx')
sheet = workbook.sheet_by_index(0)
data1 = [sheet.cell_value(0, col) for col in range(sheet.ncols)]
data2 = [sheet.cell_value(1, col) for col in range(sheet.ncols)]
data3 = [sheet.cell_value(2, col) for col in range(sheet.ncols)]
data4 = [sheet.cell_value(3, col) for col in range(sheet.ncols)]
​
### new excelbook
workbook = xlsxwriter.Workbook('tty_header2.xlsx')
worksheet = workbook.add_worksheet()
worksheet.write_row('A1', data1)
worksheet.write_row('A2', data2)
worksheet.write_row('A3', data3)
worksheet.write_row('A4', data4)
​
workbook.close()
​

In THIS case everything worked out fine and all the needed data was transferred. Anyone can explain me what is wrong with the first one? Thank you.

Additional trouble I have is if I, after placing header, start to populate columns, the header values become NULL. That is despite me, starting column population from the cell below the “header” cell(in the code, I provide below it’s column 1, starting from cell 6. Any ideas on how to solve it?

workbook = xlrd.open_workbook('tty_header2.xlsx.xlsx')
sheet = workbook.sheet_by_index(0)

data = [sheet.cell_value(row, 2) for row in range(23, sheet.nrows)]
print  data

##### writing new file with xlswriter 
workbook = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook.add_worksheet('A')
worksheet.write_column('A6', data)
workbook.close()

JavaScript
 
workbook = xlrd.open_workbook('tty_header2.xlsx.xlsx')
sheet = workbook.sheet_by_index(0)
​
data = [sheet.cell_value(row, 2) for row in range(23, sheet.nrows)]
print  data
​
##### writing new file with xlswriter 
workbook = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook.add_worksheet('A')
worksheet.write_column('A6', data)
workbook.close()
​

UPDATE: Here’s the revised code, after Mike’s correction:

import xlsxwriter
import xlrd


# open original excelbook and access first sheet
workbook = xlrd.open_workbook('hello_.xlsx')
sheet = workbook.sheet_by_index(0)

# define description rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0

#create second file, add first sheet
workbook2 = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook2.add_worksheet('A')

# read original xcel book 5 rows by loop - counter is futher below
while c <= 5:

         data = [sheet.cell_value(c, col) for col in range(1,5)]
#print data


# write rows to the new excel book

         worksheet.write_row(row_header_list[i], data)
#   print "those are initial values",i,c,row_header_list[i], data
         i+=1
         c+=1
#  print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data



####### works !!! xlrd - copy some columns, disclaiming 23 first rows and writing data to the new file


columnB_data = [sheet.cell_value(row, 2) for row in range(23, 72)]
print  columnB_data

##### writing new file with xlswriter - works, without (!!!) converting data to tuple
worksheet.write_column('A5', columnB_data)

columnG_data = [sheet.cell_value(row, 6) for row in range(23, 72)]
#worksheet = workbook.add_worksheet('B')
print columnG_data
worksheet.write_column('B5', columnG_data)

worksheet = workbook.add_worksheet('C')
columnC_dta = [sheet.cell_value(row, 7) for row in range(23, 72)]
print columnC_dta
worksheet.write_column('A5', columnC_dta)

#close workbook2
workbook2.close()

JavaScript
 
import xlsxwriter
import xlrd
​
​
# open original excelbook and access first sheet
workbook = xlrd.open_workbook('hello_.xlsx')
sheet = workbook.sheet_by_index(0)
​
# define description rows
row_header_list = ['A1','A2','A3','A4','A5']
i = 0
c = 0
​
#create second file, add first sheet
workbook2 = xlsxwriter.Workbook('try2.xlsx')
worksheet = workbook2.add_worksheet('A')
​
# read original xcel book 5 rows by loop - counter is futher below
while c <= 5:
​
         data = [sheet.cell_value(c, col) for col in range(1,5)]
#print data
​
​
# write rows to the new excel book
​
         worksheet.write_row(row_header_list[i], data)
#   print "those are initial values",i,c,row_header_list[i], data
         i+=1
         c+=1
#  print "new i is", i, "new c is", c, "list value", row_header_list[i],"data is", data
​
​
​
####### works !!! xlrd - copy some columns, disclaiming 23 first rows and writing data to the new file
​
​
columnB_data = [sheet.cell_value(row, 2) for row in range(23, 72)]
print  columnB_data
​
##### writing new file with xlswriter - works, without (!!!) converting data to tuple
worksheet.write_column('A5', columnB_data)
​
columnG_data = [sheet.cell_value(row, 6) for row in range(23, 72)]
#worksheet = workbook.add_worksheet('B')
print columnG_data
worksheet.write_column('B5', columnG_data)
​
worksheet = workbook.add_worksheet('C')
columnC_dta = [sheet.cell_value(row, 7) for row in range(23, 72)]
print columnC_dta
worksheet.write_column('A5', columnC_dta)
​
#close workbook2
workbook2.close()
​

After running this I get the following error “Traceback (most recent call last): File “C:/Users/Michael/PycharmProjects/untitled/cleaner.py”, line 28, in worksheet.write_row(row_header_list[i], data) IndexError: list index out of range Exception Exception: Exception(‘Exception caught in workbook destructor. Explicit close() may be required for workbook.’,) in > ignored”. The “line 28” refers to:

worksheet.write_row(row_header_list[i], data)

JavaScript
 
worksheet.write_row(row_header_list[i], data)
​

running the entire segment from the beginning to finalizing the loop seems to be fine and provide correct output, thus the problem is down below. If I use the explicit close method, as suggested, I will not be able to use add_sheet method again, since it’ll run over my current sheet. In the provided documentation there are “sheet.activate” and “sheet.select” methods, but they seem to be for cosmetic improvement reasons. I have tried to place the xlsxwriter’s work into a different variable (although if I place all the “copying” process at the top, I don’t ming “workbook” being run over) – didn’t help

Answer

You create new output file with the same name in each loop:

while c <= 2:
     #...
     workbook = xlsxwriter.Workbook('tty_header.xlsx')
     worksheet = workbook.add_worksheet()

JavaScript
 
while c <= 2:
     #...
     workbook = xlsxwriter.Workbook('tty_header.xlsx')
     worksheet = workbook.add_worksheet()
​

Therefore, you overwrite the file in each loop and only the last row gets saved.

Just move this out of the loop:

workbook = xlsxwriter.Workbook('tty_header.xlsx')
worksheet = workbook.add_worksheet()
while c <= 2:
     #...

workbook.close()

JavaScript
 
workbook = xlsxwriter.Workbook('tty_header.xlsx')
worksheet = workbook.add_worksheet()
while c <= 2:
     #...
​
workbook.close()
​

Advertisement

Answer