How to distinguish between floats, ints and scientific notation

Question

I'm writing a custom JSON compressor. It is going to be reading numbers in all formats. How do I print the values of the JSON in the format in which they are given, when reading the file with json.load()? I would also like to preserve the type. Example of a file it would have to read would be: I would also want it to be able

Accepted Answer

So, if I understand correctly, you want to keep the formatting of the JSON numbers even when a value is given as a float. I think the only way to do this is to change the type in your JSON, i.e. to add quotes around the numeric elements. This can be done with a regex:

```python
import json
import re

data = """{"a":301, "b":301.0, "c":3.01E2, "d":"301", "e":"301.0", "f":"3.01E2", "g": true, "h":"hello"}"""

# the crucial part: enclosing float/int in quotes:
pattern = re.compile(r'(?<=:)\s*([+-]?\d+(?:\.\d*(?:E-?\d+)?)?)\b')
data_str = pattern.sub(r'"\1"', data)

val_dict = json.loads(data)  # the values as normally read by the json module
type_dict = {k: type(v) for k, v in val_dict.items()}  # their types
repr_dict = json.loads(data_str)  # the representations (every number is a string there)

# using Pandas for pretty formatting
import pandas as pd
df = pd.DataFrame([val_dict, type_dict, repr_dict], index=["Value", "Type", "Repr."]).T
print(df)
```

Output:

```
    Value             Type   Repr.
a     301    <class 'int'>     301
b   301.0  <class 'float'>   301.0
c   301.0  <class 'float'>  3.01E2
d     301    <class 'str'>     301
e   301.0    <class 'str'>   301.0
f  3.01E2    <class 'str'>  3.01E2
g    True   <class 'bool'>    True
h   hello    <class 'str'>   hello
```

Here are the details of the regex:

- `([+-]?\d+(?:\.\d*(?:E-?\d+)?)?)` is our matching group, consisting of:
  - `[+-]?`: an optional leading + or -
  - `\d+`: one or more digits, optionally followed by
  - `(?:\.\d*(?:E-?\d+)?)?`: a non-capturing group made of
    - `\.`: a dot
    - `\d*`: zero or more digits
    - optionally, an `E` with an optional minus `-`, followed by one or more digits `\d+`
- `\b` specifies a word boundary (so that the match doesn't cut a series of digits short)
- `(?<=:)` is a lookbehind, ensuring the expression is directly preceded by `:` (we don't add quotes around existing strings)
- `\s*`: any whitespace before the expression is ignored/removed
- `\1` is a back reference to our (1st) group, so we replace the whole match with `"\1"`

Edit: slightly changed the regex to replace numbers directly following `:` and to take a leading +/- into account.
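As a complementary sketch (not part of the original answer), the standard `json` module also accepts `parse_int` and `parse_float` hooks, and each hook receives the number's source text as a string. Assuming it is acceptable to wrap numbers in a small helper object, the hypothetical `RawNumber` class and the `keep_int` / `keep_float` functions below keep the original spelling next to the parsed value without rewriting the JSON text:

```python
import json


class RawNumber:
    """Hypothetical helper: stores a number's source text next to its parsed value."""

    def __init__(self, text, value):
        self.text = text    # spelling exactly as it appeared in the JSON
        self.value = value  # the int or float Python would normally produce

    def __repr__(self):
        return f"RawNumber({self.text!r}, {self.value!r})"


def keep_int(text):
    # Called by the json parser for every integer literal.
    return RawNumber(text, int(text))


def keep_float(text):
    # Called by the json parser for every literal with a fraction or exponent.
    return RawNumber(text, float(text))


data = '{"a":301, "b":301.0, "c":3.01E2, "d":"301", "g": true, "h":"hello"}'
parsed = json.loads(data, parse_int=keep_int, parse_float=keep_float)

for key, item in parsed.items():
    if isinstance(item, RawNumber):
        # Print the original spelling and the type it would normally have.
        print(key, item.text, type(item.value).__name__)
    else:
        print(key, item, type(item).__name__)
```

Because the hooks are invoked by the parser itself, digits inside strings such as "12:30" are left alone and numbers nested in arrays are covered too. Note that strict JSON does not allow a leading `+`, so `json.loads` rejects such input.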

Answer