Tuesday, 17 December 2019

GMemoryMonitor (low-memory-monitor, 2nd phase)

TL;DR

Use GMemoryMonitor in glib 2.63.3 and newer in your applications to lower overall memory usage, and detect low memory conditions.

low-memory-monitor

To start with, let's come back to low-memory-monitor, announced at the end of August.

It's not really a “low memory monitor”. I know, the name is deceiving, but it actually monitors memory pressure stalls, and how hard it is for the kernel to allocate memory when applications need it. The longer it takes to allocate memory, the longer the kernel takes to allocate it, usually because it needs to move memory around to make room for a big allocation, when an application starts up for example, or prepares an in-memory buffer for saving.

It is not a daemon that will kill programs on low memory. It's not a user-space out-of-memory killer, and does not take those policy decisions. It can however be configured to ask the kernel to do that. The kernel doesn't really know what it's doing though, and user-space isn't helping either, so best disable that for now...

As listed in low-memory-monitor's README (and in the announcement post), there were a number of similar projects around, but none that would offer everything we needed, eg.:
  • Has a D-Bus interface to propagate low memory conditions
  • Requires Linux 5.2's kernel memory pressure stalls information (Android's lowmemorykiller daemon has loads of code to get the same information from the kernel for older versions, and it really is quite a lot of code)
  • Written in a compiled language to save on startup/memory usage costs (around 500 lines of C code, as counted by sloccount)
  • Built-in policy, based upon values used in Android and Endless OS
 GMemoryMonitor

Next up, in our effort to limit memory usage, we'll need some help from applications. That's where GMemoryMonitor comes in. It's simple enough, listen to the low-memory-warning signal and free some image thumbnails, index caches, or dump some data to disk, when you receive a signal.

The signal also gives you a “warning level”, with 255 being when low-memory-monitor would trigger the kernel's OOM killer, and lower values different levels of “try to be a good citizen”.

The more astute amongst you will have noticed that low-memory-monitor runs as root, on the system bus, and wonder how those new fangled (5 years old today!) sandboxed applications would receive those signals. Fear not! Support for a portal version of GMemoryMonitor landed in xdg-desktop-portal on the same day as in glib. Everything tied together with installed tests that use the real xdg-desktop-portal to test the portal and unsandboxed versions.

How about an OOM killer?

By using memory pressure stall information, we receive information about the state of the kernel before getting into swapping that'd cause the machine to become unusable. This also means that, as our threshold for keeping everything ticking is low, if we were to kill high memory consumers, we'd get a butter smooth desktop, but, based on my personal experience, your browser and your mail client would take it in turns disappearing from your desktop in a way that you wouldn't even notice.

We'll definitely need to think about our next step in application state management, and changing our running applications paradigm.

Distributions should definitely disable the OOM killer for now, and possibly try their hands at upstream some systemd OOMPolicy and OOMScoreAdjust options for system daemons.

Conclusion

Creating low-memory-monitor was easy enough, getting everything else in place was decidedly more complicated. In addition to requiring changes to glib, xdg-desktop-portal and python-dbusmock, it also required a lot of work on the glib CI to save me from having to write integration tests in C that would have required a lot of scaffolding. So thanks to all involved in particular Philip Withnall for his patience reviewing my changes.

6 comments:

bmillemathias said...

Hi Bastien,

Hope you're fine.

If applications start to use this information, don't you fear of any some side-effects when all applications will start at the same very moment to empty caches or trigger calls to make room to memory that would render the system unresponsive ?

I understand this is not on the GMemoryMonitor responsibility, but just asking.

Bye.

Bastien Nocera said...

I don't think that multiple applications empty their caches at the same time would have much of an impact. In the worst case, we need to figure out which application to send this information to, but we won't need to change the client API if that were the case.

Philip said...

How do I use it in my application? Do you have some example code?

Bastien Nocera said...

> Do you have some example code?

Is anything about the API documentation unclear?

Sam said...

Thanks for the clear explanations. Really interesting project!

How do you test the project? I can see some automated API tests for the GMemoryMonitor API, but I wonder if you also have some tricks for doing manual testing -- maybe you set up a VM and then trigger a high memory pressure situation somehow?

Bastien Nocera said...

> How do you test the project?

There's a "fill-memory" test program in the low-memory-monitor sources. And 3 months of dogfooding :)

If somebody wanted to spend more time automating this, I think that porting Android's lmkd test suite would be very useful (even if we already use the same trigger points, so there's not that much left to test for).