I am web-scraping some data and got something like this: “735 𝚆𝚒𝚕𝚕𝚒𝚊𝚖 𝚃 𝙼𝚘𝚛𝚛𝚒𝚜𝚜𝚎𝚢 𝙱𝚕𝚟𝚍, 𝙳𝚘𝚛𝚌𝚑𝚎𝚜𝚝𝚎𝚛, 𝙼𝙰 02122 Dorchester MA 02121”. How do I convert it to normal text in Python?
Answer
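Those letters are monospace characters from Unicode's Mathematical Alphanumeric Symbols block, not ordinary ASCII. You can run the string through Unicode compatibility normalization, which maps them back to plain letters: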
import unicodedata

unicodedata.normalize('NFKD', '735 𝚆𝚒𝚕𝚕𝚒𝚊𝚖 𝚃 𝙼𝚘𝚛𝚛𝚒𝚜𝚜𝚎𝚢 𝙱𝚕𝚟𝚍, 𝙳𝚘𝚛𝚌𝚑𝚎𝚜𝚝𝚎𝚛, 𝙼𝙰 02122')
# '735 William T Morrissey Blvd, Dorchester, MA 02122'
Here’s a REPL screenshot that demonstrates it works:
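For use inside a scraper, the same call can be wrapped in a small helper. This is only a sketch: `to_plain_text` is a hypothetical name, and defaulting to NFKC rather than the NFKD form used in the answer is an assumption about what scraped text usually needs. Both forms fold the monospace letters to ASCII, but NFKD leaves accented letters split into a base letter plus a combining mark, while NFKC recomposes them.

```python
import unicodedata


def to_plain_text(text: str, form: str = "NFKC") -> str:
    """Fold compatibility characters (e.g. mathematical monospace letters)
    to their plain equivalents.

    Hypothetical helper for illustration; it only wraps unicodedata.normalize.
    """
    return unicodedata.normalize(form, text)


scraped = "735 𝚆𝚒𝚕𝚕𝚒𝚊𝚖 𝚃 𝙼𝚘𝚛𝚛𝚒𝚜𝚜𝚎𝚢 𝙱𝚕𝚟𝚍, 𝙳𝚘𝚛𝚌𝚑𝚎𝚜𝚝𝚎𝚛, 𝙼𝙰 02122"
print(to_plain_text(scraped))

# Difference between the two compatibility forms on an accented letter:
print(len(to_plain_text("\u00e9", "NFKD")))  # 2: 'e' followed by a combining acute accent
print(len(to_plain_text("\u00e9", "NFKC")))  # 1: a single precomposed 'é'
```

Note that compatibility normalization also folds other characters, such as ligatures and full-width forms, so it may change more than just the styled letters.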