Skip to content
Advertisement

Get combined/merged cells value

I’m coding a new python script that need to extract data from google sheets, but there are many cells which are merged/combined, and only the top-left cell from this merge has the value. It’s important to have that value on all the merged cells.

How can I do that?

Python 3.8.5 + gspread 3.6.0

Note: every comment “trying to get…”, the code right below it should return the same value as the previous code.

Spreadsheet test: https://docs.google.com/spreadsheets/d/17Dyxufu1y1ouBCPkf5Y7Vt1UW70WroK0Moy_DD7bZKc/edit?usp=sharing

Code for reproducing the problem:

import gspread
from oauth2client.service_account import ServiceAccountCredentials
import os
import pprint

here = os.path.dirname(os.path.abspath(__file__))
secret = os.path.join(here, 'credentials.json')

scope = ['https://spreadsheets.google.com/feeds']
creds = ServiceAccountCredentials.from_json_keyfile_name(secret, scope)

client = gspread.authorize(creds)

sheet = client.open_by_key('17Dyxufu1y1ouBCPkf5Y7Vt1UW70WroK0Moy_DD7bZKc')
ws = sheet.sheet1
pp = pprint.PrettyPrinter()

#getting the FIRST text
result = ws.acell('A1')
pp.pprint('A1: '+result.value)

#trying to get the SAME text on the cell col+1
result = ws.acell('A2')
pp.pprint('A2: '+result.value)

#getting the 'simple_cell'
result = ws.acell('C2')
pp.pprint('C2: '+result.value)

#getting the 'row_merged'
result = ws.acell('D2')
pp.pprint('D2: '+result.value)

#trying to get 'row_merged' on row+1
result = ws.acell('E2')
pp.pprint('E2: '+result.value)

#getting the 'col_merged'
result = ws.acell('D6')
pp.pprint('D6: '+result.value)

#trying to get 'col_merged' on col+1
result = ws.acell('D7')
pp.pprint('D7: '+result.value)

The output is like this:

('A1: just to confirm, the value "row_merged" has been put to cell D2 originally. Value "col_merged" is in D6 initially. Since it's merged, the expected result should be found on all merged cells')
'A2: '
'C2: simple cell'
'D2: row_merged'
'E2: '
'D6: col_merged'
'D7: '
PS C:UsersjoaovDesktop>

The point is: A2 must be equals to A1… E2=D2, D7==D6… But it seems there’s no way of handling merged cells with gspread.

Advertisement

Answer

I believe your goal as follows.

  • You want to retrieve the values from the merged cells using gspread of python.

In your sample Spreadsheet, for example, the cells “A1:L12” are merged. Under this condition, when the values are retrieved from the merged cells of “A1:L12” using Sheets API, just to confirm, the value "row_merged" has been put to cell D2 originally. Value "col_merged" is in D6 initially. Since it's merged, the expected result should be found on all merged cells is retrieved only cell “A1”. So in order to retrieve the same values from all cells of “A1:L12”, it is required to put the values to the cells “A2:L12”. Unfortunately, it seems that there are no methods for directly achieving this situation in Sheets API. So, in this answer, I would like to propose this using a script.

By the way, about result = ws.acell('A2') in your script, this cell is not merged and it’s empty. So in this case, I think that the empty value is correct. I thought that you might have wanted to check the cell “B1”. In this answer, this is also considered.

Sample script:

spreadsheet_id = '###'  # Please set the Spreadsheet ID.
sheet_name = 'Sheet1'  # Please set the sheet name.

client = gspread.authorize(credentials)
access_token = client.auth.token
url = "https://sheets.googleapis.com/v4/spreadsheets/" + 
    spreadsheet_id + "?fields=sheets&ranges=" + sheet_name
res = requests.get(url, headers={"Authorization": "Bearer " + access_token})
obj = res.json()
# print(obj['sheets'][0]['merges'])

sheet = client.open_by_key(spreadsheet_id)
ws = sheet.worksheet(sheet_name)

# 1. All values are retrieved.
values = ws.get_all_values()

# 2. Put the values to the merged cells.
if 'merges' in obj['sheets'][0].keys():
    for e in obj['sheets'][0]['merges']:
        value = values[e['startRowIndex']][e['startColumnIndex']]
        rows = len(values)
        if rows < e['endRowIndex']:
            for i in range(0, e['endRowIndex'] - rows):
                values.append([''])
        for r in range(e['startRowIndex'], e['endRowIndex']):
            cols = len(values[r])
            if cols < e['endColumnIndex']:
                values[r].extend([''] * (e['endColumnIndex'] - cols))
            for c in range(e['startColumnIndex'], e['endColumnIndex']):
                values[r][c] = value

# For A1
print('A1: '+values[0][0])

# For B1
# output: just to confirm, the value "row_merged" has been put to cell D2 originally. Value "col_merged" is in D6 initially. Since it's merged, the expected result should be found on all merged cells
print('B1: '+values[0][1])

# For C2
# output: simple cell
print('C2: '+values[1][2])

# For D2
# output: row_merged
print('D2: '+values[1][3])

# For E2
# output: row_merged
print('E2: '+values[1][4])

# For D6
# output: col_merged
print('D6: '+values[5][3])

# For D7
# output: col_merged
print('D7: '+values[6][3])

Note:

  • In this sample script, the method of “spreadsheets.get” in Sheets API is used with requests using the access token retrieved from client = gspread.authorize(credentials) of gspread.
  • In this sample script, the values are processed in the list. So when you retrieved the values from the merged cells, please retrieve them from the list values.

Reference:

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement