I have a list of geographical postcodes that take the format xxxx (a string of numbers). However, in the process of gathering and treating the data, the leading zero has been lost in cases where the postcode begins with '0'. I need to reinstate the leading '0' in such cases.
Postcodes either occur singly as xxxx, or they occur as a range in my list, xxxx-xxxx.
Have:
v = ['821-322', '877', '2004-2218', '2022']
Desired output:
['0821-0322', '0877', '2004-2218', '2022']
  ^    ^       ^
Attempt:
for i in range(len(v)):
    v[i] = re.sub(pattern, '0' + pattern, v)
However, I’m struggling with the regex pattern, and how to simply get the desired result.
There is no requirement to use re.sub(). Any simple solution will do.
Answer
You should use f-string formatting instead!
Here is a one-liner to solve your problem:
>>> v = ['821-322', '877', '2004-2218', '2022']
>>> ["-".join([f'{i:0>4}' for i in x.split("-")]) for x in v]
['0821-0322', '0877', '2004-2218', '2022']
A more verbose example is this:
v = ['821-322', '877', '2004-2218', '2022']
newv = []
for number in v:
    num_holder = []
    # Split the numbers on "-", returns a list of one if no split occurs
    for num in number.split("-"):
        # Append the formatted string to num_holder
        num_holder.append(f'{num:0>4}')
    # After each number has been formatted correctly, create a
    # new string which is joined together with a "-" once again and append it to newv
    newv.append("-".join(num_holder))
print(newv)
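Since the question mentions re.sub() and asks for any simple solution, here are two alternatives to the f-string. This is a rough sketch rather than part of the approach above. The helper names pad_with_zfill and pad_with_regex are made up for illustration, and the code assumes every postcode part consists only of digits.

```python
import re

v = ['821-322', '877', '2004-2218', '2022']

def pad_with_zfill(postcode, width=4):
    # Hypothetical helper: split on "-", left-pad each part with zeros
    # using str.zfill, then join the parts back together.
    return "-".join(part.zfill(width) for part in postcode.split("-"))

def pad_with_regex(postcode, width=4):
    # Hypothetical helper: replace every run of digits with its
    # zero-padded form; the "-" separator is left untouched.
    return re.sub(r"\d+", lambda m: m.group().zfill(width), postcode)

print([pad_with_zfill(x) for x in v])
print([pad_with_regex(x) for x in v])
```

Both calls should give the same list as the f-string version. The re.sub() variant takes a function as the replacement, which avoids having to write a pattern that decides whether a zero is missing.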
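You can read more about how f-strings work in the Python documentation on formatted string literals, and the “mini-language” used by the formatter is described in the documentation section titled “Format Specification Mini-Language”.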
The short version of the explanation is this:
f'{num:0>4}'
f tells the interpreter that a format string follows.
{} inside the string tells the formatter that it is a replacement field and should be “calculated”.
num inside the braces is a reference to a variable.
: tells the formatter that a format specifier follows.
0 is the character that should be used to ‘fill’ the string.
> is the alignment of num within the new string; > means aligned to the right.
4 is the minimum number of characters that we want the resulting string to have. If num is already 4 or more characters long, the formatter does nothing.
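To see what each part of the format specifier does, it can help to vary one piece at a time. The snippet below is only an illustration; the values other than '821' are made up for the demonstration.

```python
num = '821'

padded_right = f'{num:0>4}'       # fill with '0', align right, width 4
padded_left = f'{num:0<4}'        # same fill and width, but align left
star_fill = f'{num:*>6}'          # a different fill character and width
exact_width = f'{"2004":0>4}'     # already 4 characters: unchanged
too_long = f'{"12345":0>4}'       # longer than 4 characters: unchanged

print(padded_right)   # 0821
print(padded_left)    # 8210
print(star_fill)      # ***821
print(exact_width)    # 2004
print(too_long)       # 12345
```

Left alignment puts the zeros after the digits, which is why > is the alignment needed for restoring a leading zero.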