I am trying to parse an XML file that looks like this. I want to extract information about the katagorie, i.e. ID, parent ID, etc.: I am trying this but I get this error: even though I am already using encode('utf-8') in my code. How can I get rid of this error? Answer EDIT 2 If want to find regarding nested
Tag: utf-8
UTF-8 characters in python string even after decoding from UTF-8?
I’m working on converting portions of XHTML to JSON objects. I finally got everything into JSON form, but some UTF-8 character codes are being printed. Example: This should be: This is just one example of UTF-8 codes coming through. How can I go through the string and replace every instance of a UTF-8 code with the character it represents? Answer
Can’t figure out error: “UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0xdc in position 0: ordinal not in range(128)”
I have a program that communicates with another machine that sends (or is supposed to send) ASCII characters. The code below is how I write to and read from the machine. def writeCode(send): address = 'COM4' I get the error on the "out += ser.read(1).decode('ascii')" line. I looked online but most of the advice seems to be based around if you
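The error in this question means that a byte outside the ASCII range (0x00 to 0x7F) arrived, so a strict ASCII decode cannot succeed. The following is a minimal sketch of two ways to inspect such data without crashing. The function names and the sample bytes are hypothetical, and it is only an assumption that replacing bad bytes is acceptable for this device; the question does not say what the machine actually sends.

```python
def is_ascii(raw: bytes) -> bool:
    """Return True if every byte is in the ASCII range 0x00-0x7F."""
    return all(b < 0x80 for b in raw)


def decode_lenient(raw: bytes) -> str:
    """Decode as ASCII, substituting U+FFFD for each non-ASCII byte."""
    return raw.decode("ascii", errors="replace")


# Hypothetical reply containing the byte from the error message (0xdc).
sample = b"\xdcOK"

print(is_ascii(sample))              # tells you whether strict decoding would fail
print(repr(decode_lenient(sample)))  # shows where the unexpected bytes are
```

If the replacement character shows up regularly, the serial settings (baud rate, parity) or the device's real encoding are worth checking before masking the bytes.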
Convert utf-16 to utf-8 using python
I am trying to convert a huge CSV file from UTF-16 to UTF-8 using Python, and below is the code: But this code uses a lot of memory and fails with a MemoryError. Please help me with an alternative method. Answer One option is to convert the file line by line: or you could open the files with your desired encoding:
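The line-by-line approach mentioned in the answer can be sketched as follows. This is not the original answer's code, which is not shown here; the function name and file paths are hypothetical. It relies only on the built-in open() with an explicit encoding, so at most one line is held in memory at a time.

```python
def convert_utf16_to_utf8(src_path: str, dst_path: str) -> None:
    """Re-encode a UTF-16 text file as UTF-8, one line at a time."""
    # newline="" turns off newline translation, so the original line
    # endings (for example \r\n in a CSV file) are copied unchanged.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Hypothetical usage:
# convert_utf16_to_utf8("input_utf16.csv", "output_utf8.csv")
```

The "utf-16" codec reads the byte order mark to pick the endianness; a file without one may need "utf-16-le" or "utf-16-be" instead.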
Unicode decode mismatch on emojis when using json loads
I have a list of UTF-8 encoded objects such as: and decode it as follows: I notice that some emojis are not converted as expected, as shown below: However, when I decode an individual string, I get the expected output: I’m not sure why the first approach using json.loads gives an unexpected output. Can someone provide any pointers? Answer
Python requests and LanguageTool encoding error
I am trying to post text data to a LanguageTool server. My text includes trademark symbols, copyright symbols, etc. On my first attempt I just posted the text like so: I received an error from requests: Following this post I updated my request as follows: Now requests does not raise an error, but the LanguageTool server complains that it cannot decode
Converting utf-8 encoded to string from user input in python
The first one prints the result correctly, while the second one just prints the string I entered. Output: Answer The transformation is a bit tricky: Follow the transformation:
UTF-8 decoding doesn’t decode special characters in python
Hi, I have the following data (abstracted) that comes from an API. I’m using the following code to decode the data bytes: cleanhtml is a regex function that I’ve created to remove HTML tags from the returned data (it’s working correctly). However, decode('utf-8') is not converting escape sequences like \u00e1. My expected output is: I’ve tried to use replace(“\u00e1”, “á”)
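A literal \u00e1 that survives decode('utf-8') is usually a JSON escape sequence rather than UTF-8, so it is the JSON parser that turns it into "á". The sketch below assumes the API returns JSON; the payload and the function name are hypothetical, since the original data is not shown.

```python
import json


def parse_payload(raw: bytes) -> dict:
    """Decode the bytes as UTF-8, then let the JSON parser resolve \\uXXXX escapes."""
    text = raw.decode("utf-8")  # still contains the six characters \u00e1
    return json.loads(text)     # the escape becomes the single character á


# Hypothetical API response body.
raw = b'{"title": "Ol\\u00e1 mundo"}'
print(parse_payload(raw)["title"])
```

Calling replace() for each escape is not needed once the data has been parsed as JSON.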
How can I read a byte array file of strings?
There is a file with the following contents: This is my attempt to read the lines and convert them to readable UTF-8 characters, but the output file still shows the same strings: The output file is: As you can see, the problem exists for the input line but not for the target and prediction lines (they are scrambled, but that’s okay) Answer It
How to encode a webscraped image link in UTF-8 to ASCII but still have a functional link?
I’m trying to scrape a link to an image to use it in my Kivy app. The problem is that the image address contains Polish characters (ę, ł, ó, ą) and I get this error: Full error traceback: Here is an example where you can see what I mean. One picture loads normally, without errors, the second
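One common way to get an ASCII-only link that still points at the same resource is to percent-encode the path and query as UTF-8. The sketch below uses urllib.parse from the standard library; the function name and the example URL are hypothetical, and it assumes the host name itself is already ASCII (an internationalized host would need IDNA encoding, which is not handled here).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the path and query of a URL."""
    parts = urlsplit(url)
    # "%" is kept safe so sequences that are already encoded are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


# Hypothetical image address containing Polish characters.
print(to_ascii_url("https://example.com/zdjęcia/łódź.jpg"))
```

The result contains only ASCII characters, and servers that expect UTF-8 paths decode it back to the original address.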