Which unicode characters can be used in python3 scripts?

Question

Some unicode characters can be used to name variables, functions, etc. without any problems, e.g. α. Other unicode characters raise an error, e.g. ∇. Which unicode characters can be used to form valid expressions in python? Which unicode characters will raise a SyntaxError? And, is there a reasonable means of including unicode characters that raise errors in python scripts? I

Accepted Answer

https://docs.python.org/3/reference/lexical_analysis.html#identifiers says:Within the ASCII range (U+0001..U+007F), the valid characters for identifiers are the same as in Python 2.x: the uppercase and lowercase letters A through Z, the underscore _ and, except for the first character, the digits 0 through 9.Python 3.0 introduces additional characters from outside the ASCII range (see PEP 3131). For these characters, the classification uses the version of the Unicode Character Database as included in the unicodedata module.Let&#8217;s check your specific characters:~$ python3Python 3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0] on linuxType "help", "copyright", "credits" or "license" for more information.>>> import unicodedata>>> unicodedata.category('α')'Ll'>>> unicodedata.category('∇')'Sm'>>> α is categorized as &#8220;Letter, lower-case&#8221;, while ∇ is a &#8220;symbol, math&#8221;. The former category is allowed in identifiers, but the latter is not. The full list of allowed character categories is available in the docs link at the top.

Advertisement

Answer