
How can I make this code consume less RAM?

I have these two functions, and when I run them my kernel dies very quickly — after appending about 10 files to the dataframe. What can I do to prevent this? Unfortunately the JSON files are large (approx. 150 MB each, and I have dozens of them) and I have no idea how to join them together.

(code snippet not preserved in this copy)

EDIT: Following @jlandercy's answer, I've made this:

(code snippet not preserved in this copy)

and I get this error:

(error message not preserved in this copy)

What’s wrong? I’ve uploaded the two smallest JSON files here: https://drive.google.com/drive/folders/1xlC-kK6NLGr0isdy1Ln2tzGmel45GtPC?usp=sharing


Answer

You are facing multiple issues in your original approach:

  • Multiple copies of the dataframe: df = df.drop(...);
  • The whole dataset kept in RAM because of append;
  • An unnecessary for loop to filter rows; use boolean indexing instead.
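On the last point, boolean indexing replaces a row-by-row loop with a single vectorized mask. A minimal sketch (the column names and the filter condition are hypothetical, since the original code is not shown):

```python
import pandas as pd

df = pd.DataFrame({"user": ["a", "b", "c"], "score": [10, 55, 80]})

# Instead of looping over rows and appending matches one by one,
# build a boolean mask and select all matching rows at once:
filtered = df[df["score"] > 50]

print(filtered["user"].tolist())  # rows 'b' and 'c' match
```

Besides being far faster, this avoids the repeated intermediate copies that a loop-and-append pattern creates.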

Here is a baseline snippet to solve your problem, based on the data sample you provided:

(code snippet not preserved in this copy)

It loads the files one by one into RAM, then appends the filtered rows to disk rather than keeping them in RAM. These fixes drastically reduce RAM usage, which should stay at no more than about twice the size of the biggest JSON file.
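The approach described above might be sketched like this (the file names, column name, and filter condition are assumptions, since the original snippet is not preserved in this copy):

```python
import json
import pandas as pd

# Create two tiny sample JSON files so the sketch is runnable;
# in practice these would be your ~150 MB files.
json.dump([{"value": 1}, {"value": -2}], open("part1.json", "w"))
json.dump([{"value": 3}], open("part2.json", "w"))

files = ["part1.json", "part2.json"]
out_path = "filtered.csv"

for i, path in enumerate(files):
    # Load one file at a time, so only a single file sits in RAM.
    df = pd.read_json(path)
    # Filter with boolean indexing (the condition here is a placeholder).
    df = df[df["value"] > 0]
    # Append the filtered rows to disk instead of growing a DataFrame,
    # writing the CSV header only for the first file.
    df.to_csv(out_path, mode="w" if i == 0 else "a",
              header=(i == 0), index=False)
```

Because each iteration releases the previous file's dataframe, peak memory is bounded by the largest single file rather than the sum of all of them.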

User contributions licensed under: CC BY-SA