.. _active_mm:

=========
Active MM
=========

::

 List:       linux-kernel
 Subject:    Re: active_mm
 From:       Linus Torvalds <torvalds () transmeta ! com>
 Date:       1999-07-30 21:36:24

 Cc'd to linux-kernel, because I don't write explanations all that often,
 and when I do I feel better about more people reading them.

 On Fri, 30 Jul 1999, David Mosberger wrote:
 >
 > Is there a brief description someplace on how "mm" vs. "active_mm" in
 > the task_struct are supposed to be used?  (My apologies if this was
 > discussed on the mailing lists---I just returned from vacation and
 > wasn't able to follow linux-kernel for a while).

 Basically, the new setup is:

  - we have "real address spaces" and "anonymous address spaces". The
    difference is that an anonymous address space doesn't care about the
    user-level page tables at all, so when we do a context switch into an
    anonymous address space we just leave the previous address space
    active.

    The obvious use for an "anonymous address space" is any thread that
    doesn't need any user mappings - all kernel threads basically fall into
    this category, but even "real" threads can temporarily say that for
    some amount of time they are not going to be interested in user space,
    and that the scheduler might as well try to avoid wasting time on
    switching the VM state around. Currently only the old-style bdflush
    sync does that.

  - "tsk->mm" points to the "real address space". For an anonymous process,
    tsk->mm will be NULL, for the logical reason that an anonymous process
    really doesn't _have_ a real address space at all.

  - however, we obviously need to keep track of which address space we
    "stole" for such an anonymous user. For that, we have "tsk->active_mm",
    which shows what the currently active address space is.

    The rule is that for a process with a real address space (ie tsk->mm is
    non-NULL) the active_mm obviously always has to be the same as the real
    one.

    For an anonymous process, tsk->mm == NULL, and tsk->active_mm is the
    "borrowed" mm while the anonymous process is running. When the
    anonymous process gets scheduled away, the borrowed address space is
    returned and cleared.

 To support all that, the "struct mm_struct" now has two counters: a
 "mm_users" counter that is how many "real address space users" there are,
 and a "mm_count" counter that is the number of "lazy" users (ie anonymous
 users) plus one if there are any real users.

 Usually there is at least one real user, but it could be that the real
 user exited on another CPU while a lazy user was still active, so you do
 actually get cases where you have an address space that is _only_ used by
 lazy users. That is often a short-lived state, because once that thread
 gets scheduled away in favour of a real thread, the "zombie" mm gets
 released because "mm_count" becomes zero.

 Also, a new rule is that _nobody_ ever has "init_mm" as a real MM any
 more. "init_mm" should be considered just a "lazy context when no other
 context is available", and in fact it is mainly used just at bootup when
 no real VM has yet been created. So code that used to check

     if (current->mm == &init_mm)

 should generally just do

     if (!current->mm)

 instead (which makes more sense anyway - the test is basically one of "do
 we have a user context", and is generally done by the page fault handler
 and things like that).

 Anyway, I put a pre-patch-2.3.13-1 on ftp.kernel.org just a moment ago,
 because it slightly changes the interfaces to accommodate the alpha (who
 would have thought it, but the alpha actually ends up having one of the
 ugliest context switch codes - unlike the other architectures where the MM
 and register state is separate, the alpha PALcode joins the two, and you
 need to switch both together).

 (From http://marc.info/?l=linux-kernel&m=93337278602211&w=2)
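
The borrowing rules and the two counters described in the mail can be
illustrated with a small userspace model. This is only a sketch, not kernel
code: every name starting with ``toy_`` is hypothetical and invented for this
example, the model is single-threaded and uses plain integers instead of
atomic counters, and the "switch" does nothing except the bookkeeping
described above. It walks through the "zombie" mm case: a real user exits
while a lazy user still holds the address space, and the mm is released only
once the lazy user is scheduled away.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model of the rules in the mail; not kernel code. */
struct toy_mm {
    int mm_users;   /* number of real address space users */
    int mm_count;   /* lazy users, plus one while mm_users > 0 */
    int freed;      /* set once mm_count has dropped to zero */
};

struct toy_task {
    struct toy_mm *mm;        /* NULL for an anonymous thread */
    struct toy_mm *active_mm; /* address space currently in use */
};

/* A new mm starts with one real user, which accounts for one mm_count. */
static void toy_mm_init(struct toy_mm *mm)
{
    mm->mm_users = 1;
    mm->mm_count = 1;
    mm->freed = 0;
}

/* Drop one mm_count reference; release the mm when none are left. */
static void toy_mmdrop(struct toy_mm *mm)
{
    if (--mm->mm_count == 0)
        mm->freed = 1;
}

/* Drop one real user; the last one gives up the shared "plus one". */
static void toy_mmput(struct toy_mm *mm)
{
    if (--mm->mm_users == 0)
        toy_mmdrop(mm);
}

/* Bookkeeping for a switch from prev to next. */
static void toy_switch(struct toy_task *prev, struct toy_task *next)
{
    struct toy_mm *oldmm = prev->active_mm;

    assert(oldmm != NULL);

    if (!next->mm) {
        /* Anonymous thread: borrow whatever is active, as a lazy user. */
        next->active_mm = oldmm;
        oldmm->mm_count++;
    } else {
        /* Real address space: active_mm always equals mm. */
        next->active_mm = next->mm;
    }

    if (!prev->mm) {
        /* Anonymous thread scheduled away: return and clear the loan. */
        prev->active_mm = NULL;
        toy_mmdrop(oldmm);
    }
}

/* Returns 1 if the "zombie" mm scenario behaves as the mail describes. */
static int toy_zombie_scenario(void)
{
    struct toy_mm mm_a, mm_b;
    struct toy_task a, k, b;

    toy_mm_init(&mm_a);
    toy_mm_init(&mm_b);
    a.mm = &mm_a; a.active_mm = &mm_a;  /* real user of mm_a */
    k.mm = NULL;  k.active_mm = NULL;   /* anonymous thread */
    b.mm = &mm_b; b.active_mm = NULL;   /* real user of mm_b, not yet run */

    toy_switch(&a, &k);                 /* k borrows mm_a lazily */
    if (k.active_mm != &mm_a || mm_a.mm_count != 2)
        return 0;

    /* The real user exits (in the mail: on another CPU) while k runs. */
    a.mm = NULL;
    a.active_mm = NULL;
    toy_mmput(&mm_a);
    if (mm_a.mm_users != 0 || mm_a.mm_count != 1 || mm_a.freed)
        return 0;                       /* mm_a has only a lazy user now */

    toy_switch(&k, &b);                 /* lazy user is scheduled away */
    return mm_a.freed && k.active_mm == NULL && b.active_mm == &mm_b;
}
```

Calling ``toy_zombie_scenario()`` from a ``main()`` exercises the whole
sequence. The split into two counters is what lets the model distinguish
"nobody maps user space through this mm any more" (``mm_users`` is zero) from
"nobody references this mm at all" (``mm_count`` is zero).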