I’m using PyYAML. Is there a way to define a YAML anchor so that it won’t be part of the data structure loaded by yaml.load? (I can remove “wifi_parm” from the dictionary afterwards, but I’m looking for a smarter way.)
example.yaml:

    wifi_parm: &wifi_params
      ssid: 1
      key: 2
    test1:
      name: connectivity
      <<: *wifi_params
    test2:
      name: connectivity_5ghz
      <<: *wifi_params
load_example.py:

    import pprint

    import yaml

    with open('aaa.yaml', 'r') as f:
        # PyYAML 6 and later require an explicit Loader argument
        result = yaml.load(f, Loader=yaml.SafeLoader)
    pprint.pprint(result)
prints:
    {'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
     'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1},
     'wifi_parm': {'key': 2, 'ssid': 1}}
I need:
    {'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
     'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1}}
Answer
The anchor information in PyYAML is discarded before you get the result from yaml.load(). This is according to the YAML 1.1 specification that PyYAML follows (… anchor names are a serialization detail and are discarded once composing is completed). This has not changed in the YAML 1.2 specification (from 2009). You cannot do this in PyYAML by walking over your result (recursively) and testing what values might be anchors, without extensively modifying the parser.
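As a small illustration of the point above, the following sketch loads the question’s data with PyYAML and inspects the result. It inlines the YAML as a string (the name `DOC` is only for this example) and uses `yaml.safe_load` rather than reading a file. What comes back are ordinary dicts, so nothing in the loaded structure says which key carried the anchor.

```python
import yaml  # PyYAML

# The question's data, inlined so the example is self-contained.
DOC = """\
wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  name: connectivity
  <<: *wifi_params
"""

result = yaml.safe_load(DOC)

# The anchored mapping is still present, as a plain dict with no anchor metadata.
print(type(result["wifi_parm"]))

# The merge has already been expanded into an ordinary dict as well.
print(result["test1"])
```

With only this information available, removing the key by its known name is about the only option on the PyYAML side.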
In my ruamel.yaml (which is YAML 1.2) in round-trip mode, I preserve the anchors and aliases for anchors that are actually used to alias mappings or sequences (anchors and aliases are currently not preserved for scalars, nor are “unused” anchors):
    import sys

    import ruamel.yaml

    yaml = ruamel.yaml.YAML()
    with open('aaa.yaml') as f:
        result = yaml.load(f)
    yaml.dump(result, sys.stdout)
gives:
    wifi_parm: &wifi_params
      ssid: 1
      key: 2
    test1:
      <<: *wifi_params
      name: connectivity
    test2:
      <<: *wifi_params
      name: connectivity_5ghz
You can then walk the mapping (or, recursively, the tree), find the anchored node and delete it, without knowing the key’s name:
    import sys

    import ruamel.yaml
    from ruamel.yaml.comments import merge_attrib

    yaml = ruamel.yaml.YAML()
    with open('aaa.yaml') as f:
        result = yaml.load(f)

    keys_to_delete = []
    for k in result:
        v = result[k]
        # remember the keys whose value carries an anchor
        if v.yaml_anchor():
            keys_to_delete.append(k)
        for merge_data in v.merge:
            # update the dict with the merge data
            v.update(merge_data[1])
            delattr(v, merge_attrib)
    for k in keys_to_delete:
        del result[k]
    yaml.dump(result, sys.stdout)
gives:
    test1:
      name: connectivity
      ssid: 1
      key: 2
    test2:
      name: connectivity_5ghz
      ssid: 1
      key: 2
Doing this generically and recursively (i.e. for anchors and aliases that are anywhere in the tree) is possible as well. The update would be as easy as above, but you would need to keep track of how to delete the anchored node, and that node doesn’t have to be a mapping value: it could be a sequence item or a scalar.