How do you control the order in which PyYAML outputs key/value pairs when serializing a Python dictionary?
I’m using YAML as a simple serialization format in a Python script. My YAML-serialized objects represent a sort of “document”, so for maximum user-friendliness, I’d like my object’s “name” field to appear first in the file. Of course, since the value returned by my object’s __getstate__ is a dictionary, and Python dictionaries are unordered, the “name” field will be serialized to an arbitrary position in the output.
e.g.
>>> import yaml
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         return self.__dict__.copy()
...
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
otherstuff: blah
name: obj-20111227
Answer
It took me a few hours of digging through the PyYAML docs and tickets, but I eventually discovered this comment, which lays out some proof-of-concept code for serializing an OrderedDict as a normal YAML map while maintaining the key order.
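A minimal Python 3 sketch of that idea (not the code from the comment itself) is a custom representer registered for OrderedDict. It assumes a reasonably recent PyYAML; the function name represent_ordereddict and the extra 'author' key are made up for illustration.

```python
from collections import OrderedDict

import yaml


def represent_ordereddict(dumper, data):
    # Hand represent_mapping a list of (key, value) pairs rather than a dict:
    # a list has no .items(), so PyYAML emits the pairs in the order given
    # instead of sorting them by key.
    return dumper.represent_mapping('tag:yaml.org,2002:map', list(data.items()))


# Register the representer on PyYAML's default Dumper.
yaml.add_representer(OrderedDict, represent_ordereddict)

d = OrderedDict()
d['name'] = 'obj-20111227'
d['otherstuff'] = 'blah'
d['author'] = 'me'  # hypothetical field, only here to show a non-alphabetical order survives

text = yaml.dump(d, default_flow_style=False)
print(text)
```

Because the node is tagged as a plain map, the output carries no Python-specific tag and loads back as an ordinary dictionary.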
Applied to my original code, the solution looks something like:
>>> import yaml
>>> from collections import OrderedDict
>>> def dump_anydict_as_map(anydict):
...     yaml.add_representer(anydict, _represent_dictorder)
...
>>> def _represent_dictorder(self, data):
...     if isinstance(data, Document):
...         return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
...     else:
...         return self.represent_mapping('tag:yaml.org,2002:map', data.items())
...
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         d = OrderedDict()
...         d['name'] = self.name
...         d['otherstuff'] = self.otherstuff
...         return d
...
>>> dump_anydict_as_map(Document)
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
name: obj-20111227
otherstuff: blah
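On newer setups a different, simpler technique may be enough. The sketch below assumes Python 3.7+ (where dictionaries keep insertion order) and PyYAML 5.1+ (which added the sort_keys argument to yaml.dump); it is an alternative to the representer approach above, not part of it. The 'author' attribute is hypothetical and exists only so that assignment order differs from alphabetical order.

```python
import yaml


class Document(object):
    def __init__(self, name):
        # Attributes land in __dict__ in assignment order (Python 3.7+).
        self.name = name
        self.otherstuff = 'blah'
        self.author = 'me'  # hypothetical field for illustration


doc = Document('obj-20111227')

# sort_keys=False (PyYAML 5.1+) keeps the mapping's own key order
# instead of sorting keys alphabetically.
text = yaml.dump(doc, sort_keys=False, default_flow_style=False, indent=4)
print(text)
```

Unlike the plain-map representer, this keeps the !!python/object tag on the first line, so the document can still be loaded back into a Document instance by a loader that permits Python object tags.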