a = '\xe6\xb8\xac\xe8\xa9\xa6'
print(bytes(a, 'latin-1').decode('utf-8'))
a = input("input:")
print(bytes(a, 'latin-1').decode('utf-8'))
The first print shows the result correctly, while the second just prints back the string I entered.
output:
測試
input:\xe6\xb8\xac\xe8\xa9\xa6
\xe6\xb8\xac\xe8\xa9\xa6

Process finished with exit code 0
Answer
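The transformation is a bit tricky, because input() returns the literal characters you typed (a backslash, then x, e, 6, and so on) rather than the characters that a string literal with escape sequences would produce: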
# Use r'' to simulate what input() returns
a = r'\xe6\xb8\xac\xe8\xa9\xa6'
print(a.encode('ascii').decode('unicode-escape').encode('latin-1').decode('utf-8'))
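To see why the extra steps are needed, it may help to compare the string literal from the question with the text that input() would hand back. The short sketch below assumes the same six bytes as in the question; the variable names `literal` and `typed` are only illustrative.

```python
# A normal string literal: Python processes each \xNN escape,
# so the result holds 6 characters.
literal = '\xe6\xb8\xac\xe8\xa9\xa6'

# A raw string stands in for input(): nothing is processed,
# so the result holds 24 characters (backslash, x, and two hex digits, six times).
typed = r'\xe6\xb8\xac\xe8\xa9\xa6'

print(len(literal))        # 6
print(len(typed))          # 24
print(typed[0] == '\\')    # True: the first character is a real backslash
```

Because `typed` already consists of plain ASCII characters, encoding it as Latin-1 and decoding as UTF-8 simply returns the same text, which matches the behaviour described in the question.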
Step by step, the transformation looks like this:
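# Step 0 (initial)
print(a)
\xe6\xb8\xac\xe8\xa9\xa6

# Step 1: str -> bytes, the backslashes are still literal
print(a.encode('ascii'))
b'\\xe6\\xb8\\xac\\xe8\\xa9\\xa6'

# Step 2: process the \xNN escapes, giving one character per byte value
print(a.encode('ascii').decode('unicode-escape'))
æ¸¬è©¦

# Step 3: map each character back to a single byte
print(a.encode('ascii').decode('unicode-escape').encode('latin-1'))
b'\xe6\xb8\xac\xe8\xa9\xa6'

# Step 4 (final): interpret those bytes as UTF-8
print(a.encode('ascii').decode('unicode-escape').encode('latin-1').decode('utf-8'))
測試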
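The steps above can be wrapped in a small helper so they can be applied directly to whatever input() returns. This is only a sketch: the function name `decode_escaped_utf8` is hypothetical, and it assumes the typed text is pure ASCII containing \xNN escapes that spell out valid UTF-8 bytes.

```python
def decode_escaped_utf8(text: str) -> str:
    # Hypothetical helper; the name and signature are illustrative only.
    raw = text.encode('ascii')                # str -> bytes, backslashes still literal
    unescaped = raw.decode('unicode-escape')  # process the \xNN escapes
    utf8_bytes = unescaped.encode('latin-1')  # one character -> one byte
    return utf8_bytes.decode('utf-8')         # read those bytes as UTF-8


# A raw string stands in for the value returned by input().
typed = r'\xe6\xb8\xac\xe8\xa9\xa6'
print(decode_escaped_utf8(typed))  # 測試
```

If the text contains non-ASCII characters, the first step raises UnicodeEncodeError, and if the escaped bytes are not valid UTF-8, the last step raises UnicodeDecodeError, so real input may need a try/except around the call.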