How to JSON serialize sets?

Question

I have a Python set that contains objects with __hash__ and __eq__ methods in order to make certain no duplicates are included in the collection. I need to json encode this result set, but passing even an empty set to the json.dumps method raises a TypeError. I know I can create an extension to the json.JSONE…

Accepted Answer

JSON notation has only a handful of native datatypes (objects, arrays, strings, numbers, booleans, and null), so anything serialized in JSON needs to be expressed as one of these types.

As shown in the json module docs, this conversion can be done automatically by a JSONEncoder and JSONDecoder, but then you would be giving up some other structure you might need (if you convert sets to a list, then you lose the ability to recover regular lists; if you convert sets to a dictionary using dict.fromkeys(s), then you lose the ability to recover dictionaries).

A more sophisticated solution is to build out a custom type that can coexist with the other native JSON types. This lets you store nested structures that include lists, sets, dicts, decimals, datetime objects, etc.:

    from json import dumps, loads, JSONEncoder, JSONDecoder
    import pickle

    class PythonObjectEncoder(JSONEncoder):
        def default(self, obj):
            # default() is only called for objects the json module cannot
            # encode natively, so lists, dicts, strings, etc. never get here.
            try:
                # latin-1 maps every byte to one code point, so the pickle
                # bytes survive the round trip through a JSON string.
                return {'_python_object': pickle.dumps(obj).decode('latin-1')}
            except (pickle.PickleError, TypeError, AttributeError):
                # Unpicklable objects can raise any of these; fall back to
                # the base class, which raises the usual TypeError.
                return super().default(obj)

    def as_python_object(dct):
        # object_hook: called for every decoded JSON object (dict).
        if '_python_object' in dct:
            return pickle.loads(dct['_python_object'].encode('latin-1'))
        return dct

Here is a sample session showing that it can handle lists, dicts, and sets (the order of the set elements may vary):

    >>> from decimal import Decimal
    >>> data = [1, 2, 3, set(['knights', 'who', 'say', 'ni']), {'key': 'value'}, Decimal('3.14')]
    >>> j = dumps(data, cls=PythonObjectEncoder)
    >>> loads(j, object_hook=as_python_object)
    [1, 2, 3, {'knights', 'say', 'who', 'ni'}, {'key': 'value'}, Decimal('3.14')]

Alternatively, it may be useful to use a more general purpose serialization technique such as YAML, Twisted Jelly, or Python's pickle module. These each support a much greater range of datatypes.
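If sets are the only non-JSON type you need, the same "tagged object" idea also works without pickle, which keeps the output readable and avoids unpickling data you did not produce yourself. The following is only a minimal sketch of that variation, not part of the approach above: the names TaggedSetEncoder and decode_tagged_set and the "__set__" marker key are made up for illustration, and it assumes the set elements are themselves JSON-serializable and hashable after decoding (strings and numbers are fine; custom objects with __hash__ and __eq__ would need their own handling).

```python
import json


class TaggedSetEncoder(json.JSONEncoder):
    """Hypothetical encoder: wraps sets in a tagged dict so they can be recovered."""

    def default(self, obj):
        if isinstance(obj, (set, frozenset)):
            # "__set__" is an arbitrary marker chosen for this sketch.
            # Sorting by repr only makes the output deterministic.
            return {"__set__": sorted(obj, key=repr)}
        # Everything else falls through to the base class (raises TypeError).
        return super().default(obj)


def decode_tagged_set(dct):
    """object_hook: turn {"__set__": [...]} back into a set."""
    if set(dct) == {"__set__"}:
        return set(dct["__set__"])
    return dct


data = [1, 2, 3, {"knights", "who", "say", "ni"}, {"key": "value"}, [4, 5]]
encoded = json.dumps(data, cls=TaggedSetEncoder)
decoded = json.loads(encoded, object_hook=decode_tagged_set)

# For contrast: the lossy approach turns the set into a plain list,
# so it can no longer be told apart from a real list after decoding.
lossy = json.loads(json.dumps({"ni"}, default=list))
```

Because the set is stored under its own marker key, the regular list [4, 5] and the regular dict {'key': 'value'} come back unchanged, while the tagged entry comes back as a set. Note that a frozenset would also come back as a plain set in this sketch.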

Answer