Non-Child Process Exit Notification Support

Daniel Colascione submitted some code to support processes knowing when others have terminated. Normally a process can tell when its own child processes have ended, but not unrelated processes, or at least not trivially. Daniel's patch created a new file in the /proc directory entry for each process—a file called "exithand" that is readable by any other process. If the target process is still running, attempts to read() its exithand file will simply block, forcing the querying process to wait. When the target process ends, the read() operation will complete, and the querying process will thereby know that the target process has ended.

It may not be immediately obvious why such a thing would be useful. After all, non-child processes are by definition unrelated. Why would the kernel want to support them keeping tabs on each other? Daniel gave a concrete example, saying:

Android's lmkd kills processes in order to free memory in response to various memory pressure signals. It's desirable to wait until a killed process actually exits before moving on (if needed) to killing the next process. Since the processes that lmkd kills are not lmkd's children, lmkd currently lacks a way to wait for a process to actually die after being sent SIGKILL.

Daniel explained that on Android, the lmkd process currently would simply keep checking the proc directory for the existence of each process it tried to kill. By implementing this new interface, instead of continually polling the process, lmkd could simply wait until the read() operation completed, thus saving the CPU cycles needed for continuous polling.

And more generally, Daniel said in a later email:

I want to get polling loops out of the system. Polling loops are bad for wakeup attribution, bad for power, bad for priority inheritance, and bad for latency. There's no right answer to the question "How long should I wait before checking $CONDITION again?". If we can have an explicit waitqueue interface to something, we should. Besides, PID polling is vulnerable to PID reuse, whereas this mechanism (just like anything based on struct pid) is immune to it.

Joel Fernandes suggested, as an alternative, using ptrace() to get the process exit notifications, instead of creating a whole new file under /proc. Daniel explained: