discord.py: too big variable?

I’m very new to Python and programming in general, and I’m looking to make a Discord bot that has a lot of hand-written chat lines to randomly pick from and send back to the user. Making a really huge variable holding a list of sentences seems like a bad idea. Is there a way that I can store the chat lines in a different file and have the bot pick from the lines in that file? Or is there anything else that would be better, and how would I do it?

Answer

I’ll interpret this question as “how large a variable is too large”, to which the answer is pretty simple: a variable is too large when it becomes a problem. So, how can a variable become a problem? The big one is that the machine could run out of memory, and an OOM killer (out-of-memory killer) or similar will stop your program. How would you know if your variable is causing these issues? Pretty simple: your program crashes.

If the variable is static (with a size fully known at compile-time or prior to interpretation), you can calculate how much RAM it will take. (This is a bit finicky with Python, so it might be easier to load it up at runtime and figure it out with a profiler.) If it’s more than ~500 megabytes, you should be concerned. Over a gigabyte, and you’ll probably want to reconsider your approach[^0]. So, what do you do then?
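For the "load it up at runtime and measure it" route, here is a minimal sketch using the standard library's `tracemalloc` module instead of a full profiler. The helper name `measure_allocation` and the generated sample lines are made up for illustration; the numbers you get will depend on your Python version and platform.

```python
import tracemalloc


def measure_allocation(build):
    """Call build() and report how much memory Python allocated for it.

    Returns (result, current_bytes, peak_bytes). Only allocations made
    through Python's allocator while tracing is active are counted.
    """
    tracemalloc.start()
    result = build()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current, peak


# Stand-in data: 100,000 generated sentences instead of real chat lines.
lines, current, peak = measure_allocation(
    lambda: [f"chat line number {i}" for i in range(100_000)]
)
print(f"{len(lines)} lines, about {current / 1024 / 1024:.1f} MiB held")
```

For a list of hand-written sentences the result will most likely be far below the rough thresholds mentioned above.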

As suggested by @FishballNooodles, you can store your data line by line in a file and read the lines into a list. Unfortunately, the code they’ve provided still reads the entire file into memory. If that turns out to be too much for your machine, you’ve got a few options, non-exhaustively listed below.

  1. Consume a random number of newlines from the file when you need a line of text. You would look at one character at a time, compare it to \n (the newline character), and read the line once you’ve encountered the requested number of newlines. This is O(n) worst case with respect to the number of lines in the file.

  2. Rather than storing the text you need at a given index, store its location in a file. Then, you can seek to the location (which is probably O(1)), and read the text. This requires an O(n) construction cost at the start of the program, but would work much better at runtime.

  3. Use an actual database. It’s usually better not to reinvent the wheel. If you’re just storing plain text, this is probably overkill, but don’t discount it.
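Option 2 could be sketched roughly like this. It assumes a UTF-8 text file with one chat line per line; the function names are made up for illustration, and blank lines are skipped as a design choice rather than a requirement.

```python
import random


def build_line_index(path):
    """Scan the file once and record the byte offset where each line starts.

    This is the O(n) construction cost; only the offsets (one int per
    line) stay in memory, not the text itself.
    """
    offsets = []
    with open(path, "rb") as f:
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:  # end of file
                break
            if line.strip():  # ignore blank lines
                offsets.append(pos)
    return offsets


def read_line_at(path, offset):
    """Seek straight to a recorded offset and read just that one line."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("utf-8").rstrip("\r\n")


def random_line(path, offsets):
    """Pick one indexed line at random and load only that line from disk."""
    return read_line_at(path, random.choice(offsets))
```

With a hypothetical `chatlines.txt`, you would call `offsets = build_line_index("chatlines.txt")` once when the bot starts, then `random_line("chatlines.txt", offsets)` each time you need a reply. The file is opened in binary mode so that the recorded offsets are plain byte positions that `seek` can use directly.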

[^0]: These numbers are actually just random. If you control the server environment on which you run the code, then you can probably come up with some more precise signposts.

User contributions licensed under: CC BY-SA