I just spent a few hours tracking down a massive memory leak in the Python code I’m working on for my current project. Running the code on even a single object would eat up 4 GB of memory on my computer, which slowed the entire system down to the point that other programs were unusable, even if they didn’t crash outright. After I fixed the leak, memory usage stayed between 200 and 204 MB for the entire time the program was running.
As it turns out, the problem was that every pass through the loop created several plots and, even though I was writing them to files, never closed them. My only plt.clf() call came after the end of the loop, so it only affected the last plot object.
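Here is a minimal sketch of the corrected pattern, with each figure closed inside the loop. This is not my actual project code: `save_plots`, the `plot_N.png` file names, and the Agg backend are all made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, since we only write files
from matplotlib import pyplot as plt


def save_plots(datasets, out_dir):
    """Write one PNG per (x, y) dataset, closing each figure inside the loop."""
    paths = []
    for i, (x, y) in enumerate(datasets):
        fig, ax = plt.subplots()
        ax.plot(x, y)
        path = os.path.join(out_dir, f"plot_{i}.png")  # hypothetical file name
        fig.savefig(path)
        plt.close(fig)  # release this figure before the next iteration starts
        paths.append(path)
    return paths
```

If the `plt.close(fig)` line were moved below the loop, only the last figure would be closed and pyplot would keep holding on to all the others.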
The gist of it is that there are only a few common reasons for memory leaks in Python:
- There’s some object you are opening (repeatedly) but not closing.
- You are repeatedly adding data to a list/dict and not removing the old data (for instance, appending data to an array instead of replacing the array).
- There’s a memory leak in one of the libraries you’re calling (unlikely).
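To illustrate the second reason, here is a small sketch that uses the standard-library `tracemalloc` module to compare a step that keeps appending to a list with one that replaces its data. This is not how I found my leak; `measure_growth`, `leaky_step`, `bounded_step`, and `history` are hypothetical names invented for the example.

```python
import tracemalloc

history = []  # hypothetical container that is never emptied


def leaky_step():
    """Append new data on every call and keep all of the old data."""
    history.append(list(range(1000)))


def bounded_step():
    """Build the same data, but let it be freed when the call returns."""
    latest = list(range(1000))
    return len(latest)


def measure_growth(step, iterations):
    """Return the change in traced memory (bytes) after calling step() repeatedly."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        step()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


leaky_growth = measure_growth(leaky_step, 200)
bounded_growth = measure_growth(bounded_step, 200)
```

The growth for `leaky_step` should scale with the number of iterations, while `bounded_step` should stay roughly flat. That is the signature to look for in a real loop.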
I haven’t successfully tried the methods listed in the above link (they seem to be for heavier-duty programming than we use), but that first reason is likely to be a problem for BDNYC: we open plot figures, we open the BDNYC database, and we open .fits files with astropy. Remember to close things properly.
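One way to make "close things properly" automatic is a `with` block. The sketch below uses only the standard library and a throwaway SQLite file as a stand-in; the `sources` table and both function names are hypothetical, and treating the BDNYC database this way is an assumption on my part. As far as I know, astropy's `fits.open` can be used in a `with` statement in the same way.

```python
import sqlite3
from contextlib import closing


def make_demo_db(db_path):
    """Create a tiny stand-in database with a hypothetical 'sources' table."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO sources (name) VALUES (?)", [("a",), ("b",), ("c",)]
        )
        conn.commit()


def count_rows(db_path, n_queries):
    """Open and query the database repeatedly, closing the connection each time."""
    total = 0
    for _ in range(n_queries):
        # closing() guarantees conn.close() runs, even if the query raises
        with closing(sqlite3.connect(db_path)) as conn:
            total += conn.execute("SELECT COUNT(*) FROM sources").fetchone()[0]
    return total
```

Because the connection is closed at the end of every `with` block, running the loop more times does not pile up open connections.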
For the record, this seems to be the correct sequence for making a matplotlib plot:
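    from matplotlib import pyplot

    fig = pyplot.figure(1)
    ax = fig.add_subplot(111)   # or fig.add_axes()
    ax.plot(<blah>)
    ax.set_xlabel(<label>)
    (...)
    fig.savefig(<filename>)
    fig.clf()                   # clear the figure's contents
    pyplot.close(fig)           # close the figure so pyplot stops tracking it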
That last line was completely new to me, but is apparently necessary if you do it this way. Matplotlib also works without explicitly creating a figure, adding axes to it, and then issuing plot commands on those axes. If you do use it this way, though, you can edit multiple figures and their subplots at the same time: just specify which axis variable you want to add the plot element to. To my knowledge, this is a clear advantage over IDL, which can only operate on one plot at a time; trying to open a new one closes the old one.
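As a rough sketch of what working on multiple figures at once looks like, the example below keeps two figures open and switches between their axes freely. The variable names and the data are invented, and the Agg backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed for this sketch
from matplotlib import pyplot

# Two figures open at once: one with a single panel, one with two stacked panels
fig_a = pyplot.figure(1)
ax_a = fig_a.add_subplot(111)
fig_b = pyplot.figure(2)
ax_b1 = fig_b.add_subplot(211)
ax_b2 = fig_b.add_subplot(212)

x = [0, 1, 2, 3]
ax_a.plot(x, [v ** 2 for v in x])
ax_b1.plot(x, x)
ax_a.set_xlabel("x")               # back to the first figure, no re-selecting needed
ax_b2.plot(x, [3 - v for v in x])

n_open = len(pyplot.get_fignums())  # both figures are still open here
n_lines_a = len(ax_a.lines)

# Clean up every figure explicitly once you are done with it
for fig in (fig_a, fig_b):
    fig.clf()
    pyplot.close(fig)
```

In real use, each figure would get a `fig.savefig(...)` call before being cleared and closed.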