Remove very last character in file

After looking all over the Internet, I’ve come to this.

Let’s say I have already made a text file that reads: Hello World

Well, I want to remove the very last character (in this case d) from this text file.

So now the text file should look like this: Hello Worl

But I have no idea how to do this.

All I want, more or less, is a single backspace function for text files on my HDD.

This needs to work on Linux as that’s what I’m using.

Answer

Use the file object's seek() method to move to 1 byte before the end of the file, then call its truncate() method to remove everything from that position on:

import os

with open(filename, 'rb+') as filehandle:
    filehandle.seek(-1, os.SEEK_END)
    filehandle.truncate()

This works fine for single-byte encodings. If you have a multi-byte encoding, you need to seek back enough bytes from the end to cover one whole codepoint: 4 bytes for UTF-32, and 2 bytes for UTF-16 unless the last character is a surrogate pair, which takes 4.
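
For a fixed-width encoding such as UTF-32, where every codepoint occupies exactly 4 bytes, the same idea might look like the sketch below. The function name remove_last_fixed_width_char and its width parameter are made up for illustration, and the sketch assumes the file length is a whole multiple of the character width.

```python
import os

def remove_last_fixed_width_char(path, width=4):
    """Drop the last `width` bytes of the file (hypothetical helper).

    width=4 matches UTF-32; a file shorter than `width` is left untouched.
    """
    with open(path, 'rb+') as filehandle:
        # seek() returns the new absolute position, i.e. the file size here
        size = filehandle.seek(0, os.SEEK_END)
        if size >= width:
            # shrink the file so the final character is cut off
            filehandle.truncate(size - width)
```

Checking the size first also avoids the OSError that seeking to a negative position would raise on an empty file.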

For variable-byte encodings, whether you can use this technique at all depends on the codec. For UTF-8, you need to find the first byte (scanning backward from the end) where bytevalue & 0xC0 != 0x80 is true, and truncate from that point on. That ensures you don’t truncate in the middle of a multi-byte UTF-8 codepoint:

import os

with open(filename, 'rb+') as filehandle:
    # move to the last byte, then scan backward until a non-continuation byte is found
    filehandle.seek(-1, os.SEEK_END)
    # read(1) returns a bytes object, so index it to get the integer byte value
    while filehandle.read(1)[0] & 0xC0 == 0x80:
        # we just read 1 byte, which moved the file position forward,
        # skip back 2 bytes to move to the byte before the current.
        filehandle.seek(-2, os.SEEK_CUR)

    # last read byte is our truncation point, move back to it.
    filehandle.seek(-1, os.SEEK_CUR)
    filehandle.truncate()

Note that UTF-8 is a superset of ASCII, so the above works for ASCII-encoded files too.
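
If you want the "single backspace" from the question as a reusable function, the UTF-8 scan could be wrapped roughly as follows. This is only a sketch: the name remove_last_utf8_char is hypothetical, and it assumes the file holds valid UTF-8. It tracks the position explicitly so that an empty file is simply left empty instead of raising an error on the first seek.

```python
import os

def remove_last_utf8_char(path):
    """Remove the final UTF-8 codepoint from the file at `path` (hypothetical helper)."""
    with open(path, 'rb+') as filehandle:
        # start at the end of the file; seek() returns the absolute position
        pos = filehandle.seek(0, os.SEEK_END)
        # walk backward one byte at a time
        while pos > 0:
            pos -= 1
            filehandle.seek(pos)
            byte = filehandle.read(1)[0]
            # continuation bytes look like 0b10xxxxxx; anything else starts a codepoint
            if byte & 0xC0 != 0x80:
                break
        # pos is the start of the last codepoint (or 0 for an empty file)
        filehandle.truncate(pos)
```

Calling remove_last_utf8_char('hello.txt') on a file containing Hello World would leave Hello Worl, and a file ending in a multi-byte character such as é loses that whole character rather than half of it.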
