Skip to content
Advertisement

How can I control what scalar form PyYAML uses for my data?

I’ve got an object with a short string attribute, and a long multi-line string attribute. I want to write the short string as a YAML quoted scalar, and the multi-line string as a literal scalar:

my_obj.short = "Hello"
my_obj.long = "Line1nLine2nLine3"

I’d like the YAML to look like this:

short: "Hello"
long: |
  Line1
  Line2
  Line3

How can I instruct PyYAML to do this? If I call yaml.dump(my_obj), it produces a dict-like output:

{long: 'line1

    line2

    line3

    ', short: Hello}

(Not sure why long is double-spaced like that…)

Can I dictate to PyYAML how to treat my attributes? I’d like to affect both the order and style.

Advertisement

Answer

Based on Any yaml libraries in Python that support dumping of long strings as block literals or folded blocks?

import yaml
from collections import OrderedDict

class quoted(str):
    pass

def quoted_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"')
yaml.add_representer(quoted, quoted_presenter)

class literal(str):
    pass

def literal_presenter(dumper, data):
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
yaml.add_representer(literal, literal_presenter)

def ordered_dict_presenter(dumper, data):
    return dumper.represent_dict(data.items())
yaml.add_representer(OrderedDict, ordered_dict_presenter)

d = OrderedDict(short=quoted("Hello"), long=literal("Line1nLine2nLine3n"))

print(yaml.dump(d))

Output

short: "Hello"
long: |
  Line1
  Line2
  Line3
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement