
YAML anchor definitions loading in PyYAML

I’m using PyYAML. Is there a way to define a YAML anchor so that it won’t be part of the data structure loaded by yaml.load? (I can remove “wifi_parm” from the dictionary afterwards, but I’m looking for a smarter way.)

aaa.yaml:

wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  name: connectivity
  <<: *wifi_params
test2:
  name: connectivity_5ghz
  <<: *wifi_params

load_example.py:

import yaml
import pprint

with open('aaa.yaml', 'r') as f:
    result = yaml.safe_load(f)  # yaml.load() without an explicit Loader fails on PyYAML 6+
pprint.pprint(result)

prints:

{'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
 'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1},
 'wifi_parm': {'key': 2, 'ssid': 1}}

I need:

{'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
 'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1}}

Answer

PyYAML discards the anchor information before you get the result from yaml.load(). This follows the YAML 1.1 specification that PyYAML implements (… anchor names are a serialization detail and are discarded once composing is completed), and it has not changed in the YAML 1.2 specification (from 2009). Without extensively modifying the parser, you cannot do this in PyYAML by walking over your result (recursively) and testing which values might have been anchors.
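If you have to stay with PyYAML, the fallback mentioned in the question (removing the helper key by name after loading) is probably the simplest option, since the loaded result no longer says which key carried the anchor. A minimal sketch, assuming the helper keys are known in advance; `load_without` and `helper_keys` are hypothetical names for illustration and not part of PyYAML:

```python
import yaml

DOC = """\
wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  name: connectivity
  <<: *wifi_params
test2:
  name: connectivity_5ghz
  <<: *wifi_params
"""


def load_without(text, helper_keys):
    """Load YAML with PyYAML, then drop the top-level keys that
    exist only to hold anchor definitions (hypothetical helper)."""
    data = yaml.safe_load(text)  # merge keys (<<) are expanded while loading
    for key in helper_keys:
        data.pop(key, None)      # ignore helper keys that are not present
    return data


result = load_without(DOC, ["wifi_parm"])
```

This only works because the caller names the keys to drop; it does not detect anchors.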

In my ruamel.yaml (which supports YAML 1.2), round-trip mode preserves the anchors and aliases for anchors that are actually used to alias mappings or sequences (anchors and aliases are currently not preserved for scalars, nor are “unused” anchors):

import sys

import ruamel.yaml

yaml = ruamel.yaml.YAML()

with open('aaa.yaml') as f:
    result = yaml.load(f)

yaml.dump(result, sys.stdout)

gives:

wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  <<: *wifi_params
  name: connectivity
test2:
  <<: *wifi_params
  name: connectivity_5ghz

You can then walk the mapping (or, recursively, the tree), find the anchored node and delete it, without knowing the key’s name:

import sys

import ruamel.yaml
from ruamel.yaml.comments import merge_attrib

yaml = ruamel.yaml.YAML()
with open('aaa.yaml') as f:
    result = yaml.load(f)

keys_to_delete = []
for k in result:
    v = result[k]
    # a value that carries an anchor is a definition to be removed later
    if v.yaml_anchor():
        keys_to_delete.append(k)
    for merge_data in v.merge:  # update the dict with the merge data
        v.update(merge_data[1])
        delattr(v, merge_attrib)  # drop the merge information from this mapping
# delete after the loop, so the mapping does not change size during iteration
for k in keys_to_delete:
    del result[k]

yaml.dump(result, sys.stdout)

gives:

test1:
  name: connectivity
  ssid: 1
  key: 2
test2:
  name: connectivity_5ghz
  ssid: 1
  key: 2

Doing this generically and recursively (i.e. for anchors and aliases anywhere in the tree) is possible as well. The update would be as easy as above, but you would need to keep track of how to delete each anchored node: it does not have to be a mapping value, it could also be a sequence item or a scalar.

User contributions licensed under: CC BY-SA