After looking all over the Internet, I’ve come to this.
Let’s say I have already made a text file that reads:
Hello World
Well, I want to remove the very last character (in this case d) from this text file.
So now the text file should look like this: Hello Worl
But I have no idea how to do this.
All I want, more or less, is a single backspace function for text files on my HDD.
This needs to work on Linux as that’s what I’m using.
Answer
Use fileobject.seek() to seek 1 position from the end, then use file.truncate() to remove the remainder of the file:
    import os

    with open(filename, 'rb+') as filehandle:
        filehandle.seek(-1, os.SEEK_END)
        filehandle.truncate()
This works fine for single-byte encodings. If you have a fixed-width multi-byte encoding (such as UTF-32, or UTF-16 text that contains no surrogate pairs), you need to seek back enough bytes from the end to account for a single codepoint.
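As a rough sketch of the fixed-width case, the same seek-and-truncate idea can take the character width as a parameter. The function name truncate_last_fixed_width_char and its width argument are made up for illustration, and the code assumes every character in the file occupies exactly width bytes (true for UTF-32, and for UTF-16 only when no surrogate pairs are present):

```python
import os


def truncate_last_fixed_width_char(path, width):
    """Remove the final `width` bytes of the file at `path`.

    Hypothetical helper: `width` is the number of bytes per character,
    for example 4 for UTF-32.
    """
    with open(path, 'rb+') as filehandle:
        # Seeking to the end returns the file size in bytes.
        size = filehandle.seek(0, os.SEEK_END)
        if size < width:
            # Not enough bytes for a whole character; leave the file alone.
            return
        filehandle.seek(-width, os.SEEK_END)
        filehandle.truncate()
```

Calling truncate_last_fixed_width_char(filename, 4) on a UTF-32 file would drop its last codepoint; a byte order mark at the start of the file is not affected because only the tail is touched.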
For variable-byte encodings, it depends on the codec if you can use this technique at all. For UTF-8, you need to find the first byte (from the end) where bytevalue & 0xC0 != 0x80 is true, and truncate from that point on. That ensures you don’t truncate in the middle of a multi-byte UTF-8 codepoint:
    import os

    with open(filename, 'rb+') as filehandle:
        # move to the last byte, then scan backward until a non-continuation byte is found
        filehandle.seek(-1, os.SEEK_END)
        # read(1) returns a 1-byte string, so convert it to an integer before masking
        while ord(filehandle.read(1)) & 0xC0 == 0x80:
            # we just read 1 byte, which moved the file position forward;
            # skip back 2 bytes to move to the byte before the current one.
            filehandle.seek(-2, os.SEEK_CUR)
        # the last byte read is our truncation point; move back to it.
        filehandle.seek(-1, os.SEEK_CUR)
        filehandle.truncate()
Note that UTF-8 is a superset of ASCII, so the above works for ASCII-encoded files too.
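If you want this as a reusable "backspace" for UTF-8 (and therefore ASCII) files, one possible way to wrap the approach above is sketched below. The function name remove_last_utf8_char is invented for illustration, and the sketch tracks the position explicitly so that an empty file is left untouched instead of raising an error on the initial seek:

```python
import os


def remove_last_utf8_char(path):
    """Remove the last UTF-8 encoded character from the file at `path`.

    Hypothetical helper; assumes Python 3 and a file that holds UTF-8 text.
    """
    with open(path, 'rb+') as filehandle:
        # Start at the end of the file; an empty file gives position 0.
        position = filehandle.seek(0, os.SEEK_END)
        # Step backward one byte at a time until the byte at `position`
        # is not a UTF-8 continuation byte (10xxxxxx).
        while position > 0:
            position -= 1
            filehandle.seek(position)
            if filehandle.read(1)[0] & 0xC0 != 0x80:
                break
        # Cut the file at the start of the last character.
        filehandle.truncate(position)
```

Calling remove_last_utf8_char(filename) on a file containing Hello World leaves Hello Worl, and calling it again removes the l.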