I want to parse the keys and values of a .properties file into a Python dictionary. The .properties file I’m parsing uses the following syntax (keys and values are examples):
    key1.subkey1.subsubkey1=value1
    key1.subkey1.subsubkey2=value2
    key1.subkey2=value3
    key2=value4
So each value corresponds to a key consisting of one or more levels separated by periods. The goal is to create a Python dictionary in which each key maps to a dictionary containing its value and its subkeys. The dictionary should be recursively iterable, so every level should follow the same structure.
The previous example should result in the following kind of dictionary:
    {
        'subKeys': {
            'key1': {
                'subKeys': {
                    'subkey1': {
                        'subKeys': {
                            'subsubkey1': {'val': 'value1'},
                            'subsubkey2': {'val': 'value2'}
                        }
                    },
                    'subkey2': {'val': 'value3'}
                }
            },
            'key2': {'val': 'value4'}
        }
    }
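To illustrate what "recursively iterable" means for this layout, here is a minimal sketch that walks such a dictionary and turns it back into dotted keys. The helper name iter_properties is hypothetical and only serves the illustration; it assumes every level uses the 'subKeys' and 'val' keys exactly as shown above.

```python
def iter_properties(node, prefix=''):
    """Yield (dotted_key, value) pairs from a nested 'subKeys'/'val' dict."""
    if 'val' in node:
        yield prefix, node['val']
    for name, child in node.get('subKeys', {}).items():
        child_prefix = name if not prefix else prefix + '.' + name
        # Every child has the same shape as its parent, so recurse into it.
        for pair in iter_properties(child, child_prefix):
            yield pair


expected = {
    'subKeys': {
        'key1': {
            'subKeys': {
                'subkey1': {
                    'subKeys': {
                        'subsubkey1': {'val': 'value1'},
                        'subsubkey2': {'val': 'value2'},
                    }
                },
                'subkey2': {'val': 'value3'},
            }
        },
        'key2': {'val': 'value4'},
    }
}

# Flatten the nested structure back into the original key/value pairs.
flat = dict(iter_properties(expected))
```

Because each level follows the same structure, the same function handles the root and every nested key without special cases.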
I’m parsing the file with the following Python code:
    def setKeyAndValue(storageDict, rowParts):
        keyParts = rowParts[0].split('.')
        if not keyParts[0] in outputDict:
            storageDict[keyParts[0]] = {}
        newObj = storageDict[keyParts[0]]
        for i in range(len(keyParts)):
            if i == len(keyParts)-1:
                # Reached the end of the key, save value to dictionary
                newObj["val"] = rowParts[1]
            else:
                # Not yet at the end of the key
                if "subKeys" not in newObj:
                    newObj["subKeys"] = {}
                if keyParts[i+1] not in newObj["subKeys"]:
                    newObj["subKeys"][keyParts[i+1]] = {}
                newObj = newObj["subKeys"][keyParts[i+1]]

    f = open("FILEPATH.properties", "r")
    outputDict = {}
    outputDict["subKeys"] = {}
    outputDictSubKeys = outputDict["subKeys"]
    for row in f:
        if not row.startswith('#') and not row.startswith('//'):
            parts = row.split('=', 1)
            if len(parts) == 2:
                setKeyAndValue(outputDictSubKeys, parts)
    f.close()
The resulting dictionary (outputDict) is missing two key-value pairs (key1.subkey1.subsubkey1=value1, key1.subkey1.subsubkey2=value2):
    {
        'subKeys': {
            'key1': {
                'subKeys': {
                    'subkey2': {'val': 'value3'}
                }
            },
            'key2': {'val': 'value4'}
        }
    }
I’m pretty sure the problem is with the following row:
newObj = newObj["subKeys"][keyParts[i+1]]
I’m replacing newObj within the dictionary with each iteration of the loop.
Is there a way to tweak this existing algorithm to make it work, and if not, how should I start over? Efficiency is not an issue, since the properties file isn’t very large.
Answer
I copied your function, tested it, and made a few changes. The code below works correctly.
    def setKeyAndValue(storageDict, rowParts):
        print(rowParts)  # debug output: show each parsed row
        keyParts = rowParts[0].split('.')
        # Check the dictionary that was passed in, not the global outputDict
        if not keyParts[0] in storageDict.keys():
            storageDict[keyParts[0]] = {}
        newObj = storageDict[keyParts[0]]
        for i in range(len(keyParts)):
            if i == len(keyParts)-1:
                # Reached the end of the key, save value to dictionary
                newObj["val"] = rowParts[1]
            else:
                # Not yet at the end of the key
                if "subKeys" not in newObj:
                    newObj["subKeys"] = {}
                if keyParts[i+1] not in newObj["subKeys"]:
                    newObj["subKeys"][keyParts[i+1]] = {}
                newObj = newObj["subKeys"][keyParts[i+1]]

    def main():
        rows = [
            'key1.subkey1.subsubkey1=value1',
            'key1.subkey1.subsubkey2=value2',
            'key1.subkey2=value3',
            'key2=value4'
        ]
        ans = {}
        ans1 = {'subKeys': ans}
        for row in rows:
            parts = row.split('=', 1)
            setKeyAndValue(ans, parts)
        print(ans1)

    main()
The final dictionary it prints is:
{'subKeys': {'key2': {'val': 'value4'}, 'key1': {'subKeys': {'subkey2': {'val': 'value3'}, 'subkey1': {'subKeys': {'subsubkey1': {'val': 'value1'}, 'subsubkey2': {'val': 'value2'}}}}}}}
I replaced the outputDict reference inside the function with storageDict.keys() and wrote a sample main function. Try running it yourself to see whether it works for you.
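I think the problem is that your outputDict contains only the subKeys key, so the condition "not keyParts[0] in outputDict" is always true. As a result, every call replaces the dictionary previously stored for that top-level key with an empty one, and only the last row for each top-level key survives.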
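If you would rather start over, one possible variant is sketched below. It is not the code from the question: it swaps the explicit membership checks for dict.setdefault, which returns the existing child when there is one and only creates an empty dictionary when the key is missing, so earlier entries cannot be overwritten. The names set_key_and_value and parse_properties are hypothetical, and the sketch assumes lines of the form key=value with no whitespace around the key. It also strips each line, which drops the trailing newline that the original code would keep in the value.

```python
def set_key_and_value(root, dotted_key, value):
    """Store value under dotted_key in the nested 'subKeys'/'val' layout."""
    node = root
    for part in dotted_key.split('.'):
        # setdefault returns the existing child if present, so nothing
        # that was stored by an earlier row gets replaced.
        node = node.setdefault('subKeys', {}).setdefault(part, {})
    node['val'] = value


def parse_properties(lines):
    """Build the nested dictionary from an iterable of 'key=value' lines."""
    root = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the two comment styles used in the question.
        if not line or line.startswith('#') or line.startswith('//'):
            continue
        key, sep, value = line.partition('=')
        if sep:  # only lines that actually contain '=' are stored
            set_key_and_value(root, key, value)
    return root


sample = [
    '# a comment line',
    'key1.subkey1.subsubkey1=value1',
    'key1.subkey1.subsubkey2=value2',
    'key1.subkey2=value3',
    'key2=value4',
]
result = parse_properties(sample)
print(result)
```

To read from a file instead of a list, you could pass an open file object to parse_properties, since iterating over a file yields its lines.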