Crafting a python dictionary based on a .properties file

I want to parse the keys and values of a .properties file into a Python dictionary. The .properties file I’m parsing uses the following syntax (keys and values are examples):

key1.subkey1.subsubkey1=value1
key1.subkey1.subsubkey2=value2
key1.subkey2=value3
key2=value4

So each value corresponds to a key consisting of one or more levels separated by periods. The goal is to create a Python dictionary where each key maps to a dictionary containing its value (under 'val') and its subkeys (under 'subKeys'). The dictionary should be recursively iterable, so each level should follow the same structure.

The previous example should result in the following kind of dictionary:

'subKeys': 
  'key1':
    'subKeys':
      'subkey1': 
        'subKeys':
          'subsubkey1': 
            'val': 'value1'
          'subsubkey2': 
            'val': 'value2'
      'subkey2': 
        'val': 'value3'
  'key2':
    'val': 'value4'
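
For reference, the target structure above can be written as a Python literal and walked recursively. This is only a sketch of what "recursively iterable" could mean here; the helper name walk is hypothetical and not part of the original code.

```python
# The expected result from the example above, as a plain Python literal.
expected = {
    "subKeys": {
        "key1": {
            "subKeys": {
                "subkey1": {
                    "subKeys": {
                        "subsubkey1": {"val": "value1"},
                        "subsubkey2": {"val": "value2"},
                    }
                },
                "subkey2": {"val": "value3"},
            }
        },
        "key2": {"val": "value4"},
    }
}


def walk(node, prefix=""):
    """Yield (dotted_key, value) for every node that carries a 'val'."""
    if "val" in node:
        yield prefix, node["val"]
    # Every level has the same shape, so the same function handles all of them.
    for name, child in node.get("subKeys", {}).items():
        dotted = name if not prefix else prefix + "." + name
        yield from walk(child, dotted)


pairs = list(walk(expected))
for dotted_key, value in pairs:
    print(dotted_key, "=", value)
```

Because every level uses the same 'val' / 'subKeys' layout, one recursive function is enough to visit the whole tree.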

I’m parsing the file with the following Python code:

def setKeyAndValue(storageDict, rowParts):
    keyParts = rowParts[0].split('.')
    if not keyParts[0] in outputDict:
        storageDict[keyParts[0]] = {}
    newObj = storageDict[keyParts[0]]
    for i in range(len(keyParts)):
        if i == len(keyParts)-1:
            # Reached the end of the key, save value to dictionary
            newObj["val"] = rowParts[1]
        else :
            # Not yet at the end of the key
            if "subKeys" not in newObj:
                newObj["subKeys"] = {}
            if keyParts[i+1] not in newObj["subKeys"]:
                newObj["subKeys"][keyParts[i+1]] = {}
            newObj = newObj["subKeys"][keyParts[i+1]]

f = open("FILEPATH.properties", "r")
outputDict = {}
outputDict["subKeys"] = {}
outputDictSubKeys = outputDict["subKeys"]
for row in f:
    if not row.startswith('#') and not row.startswith('//'):
        parts = row.split('=', 1)
        if  len(parts)== 2:
            setKeyAndValue(outputDictSubKeys, parts)  
f.close()

The resulting dictionary (outputDict) is missing two key-value pairs (key1.subkey1.subsubkey1=value1, key1.subkey1.subsubkey2=value2):

'subKeys': 
  'key1':
    'subKeys':
      'subkey2': 
        'val': 'value3'
  'key2':
    'val': 'value4'

I’m pretty sure the problem is with the following row:

newObj = newObj["subKeys"][keyParts[i+1]]

I’m replacing newObj within the dictionary with each iteration of the loop.

Is there a way to tweak this existing algorithm to make it work, and if not, how should I start over? Efficiency is not an issue, the properties-file isn’t very large.

Answer

I copied your function, tested it, and made a few changes. The code below works:

def setKeyAndValue(storageDict, rowParts):
    keyParts = rowParts[0].split('.')
    # Look the key up in the dictionary being filled, not in the global outputDict
    if keyParts[0] not in storageDict.keys():
        storageDict[keyParts[0]] = {}
    newObj = storageDict[keyParts[0]]
    for i in range(len(keyParts)):
        if i == len(keyParts) - 1:
            # Reached the end of the key, save value to dictionary
            newObj["val"] = rowParts[1]
        else:
            # Not yet at the end of the key
            if "subKeys" not in newObj:
                newObj["subKeys"] = {}
            if keyParts[i+1] not in newObj["subKeys"]:
                newObj["subKeys"][keyParts[i+1]] = {}
            newObj = newObj["subKeys"][keyParts[i+1]]


def main():
    # Sample rows standing in for the lines of the .properties file
    rows = [
        'key1.subkey1.subsubkey1=value1',
        'key1.subkey1.subsubkey2=value2',
        'key1.subkey2=value3',
        'key2=value4'
    ]
    ans = {}
    ans1 = {
        'subKeys': ans
    }

    for row in rows:
        parts = row.split('=', 1)
        setKeyAndValue(ans, parts)
    print(ans1)

main()

The output is (key order may differ between Python versions):

{'subKeys': {'key2': {'val': 'value4'}, 'key1': {'subKeys': {'subkey2': {'val': 'value3'}, 'subkey1': {'subKeys': {'subsubkey1': {'val': 'value1'}, 'subsubkey2': {'val': 'value2'}}}}}}}

I replaced your outputDict variable in the membership check with storageDict.keys() (plain storageDict works as well) and wrote a sample main function. Try running it yourself and see whether it works for you.
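
The effect of that one-line change can be checked in isolation. The sketch below reuses the question's variable names and reconstructs the state after the first row has been stored; it is an illustration only, not part of the original code.

```python
# State after the first row (key1.subkey1.subsubkey1=value1) has been stored.
outputDict = {
    "subKeys": {
        "key1": {"subKeys": {"subkey1": {"subKeys": {"subsubkey1": {"val": "value1"}}}}}
    }
}
storageDict = outputDict["subKeys"]  # the dictionary setKeyAndValue receives

# Original test: looks for 'key1' at the top level, which only holds 'subKeys'.
original_check = "key1" not in outputDict    # True, so key1 would be reset to {}
# Changed test: looks in the dictionary that is actually being filled.
changed_check = "key1" not in storageDict    # False, so key1 is left alone

print(original_check, changed_check)
```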

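I think the cause is that your outputDict only ever contains the single key subKeys, so the condition "not keyParts[0] in outputDict" is always true. Every call therefore replaces the dictionary previously stored under the top-level key with an empty one, which is why only the last row written under key1 survives.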
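
If starting over is an option, the same structure can be built more compactly with dict.setdefault. This is a minimal sketch rather than a drop-in replacement: the names set_key_and_value and parse_properties are hypothetical, the comment prefixes (# and //) are taken from the question's loop, and stripping whitespace and the trailing newline from keys and values is an assumption about the desired behaviour.

```python
def set_key_and_value(storage, dotted_key, value):
    """Store value under dotted_key, creating intermediate levels on demand."""
    parts = dotted_key.split(".")
    # setdefault returns the existing entry, or inserts and returns a new {}.
    node = storage.setdefault(parts[0], {})
    for part in parts[1:]:
        node = node.setdefault("subKeys", {}).setdefault(part, {})
    node["val"] = value


def parse_properties(lines):
    """Build the nested dictionary from an iterable of 'key=value' lines."""
    root = {"subKeys": {}}
    for line in lines:
        line = line.strip()  # assumption: surrounding whitespace is not significant
        if not line or line.startswith(("#", "//")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without '='
            set_key_and_value(root["subKeys"], key.strip(), value.strip())
    return root


sample = [
    "# a comment\n",
    "key1.subkey1.subsubkey1=value1\n",
    "key1.subkey1.subsubkey2=value2\n",
    "key1.subkey2=value3\n",
    "key2=value4\n",
]
result = parse_properties(sample)
print(result)
```

A file object can be passed instead of the sample list, for example inside a with open(...) block, since iterating over a file yields its lines.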
