mm: memcg/percpu: account percpu memory to memory cgroups
Percpu memory is becoming more and more widely used by various subsystems,
and the total amount of memory controlled by the percpu allocator can make
a good part of the total memory.
As an example, bpf maps can consume a lot of percpu memory, and they are
created by a user. Also, some cgroup internals (e.g. memory controller
statistics) can be quite large. On a machine with many CPUs and big
number of cgroups they can consume hundreds of megabytes.
So the lack of memcg accounting is creating a breach in the memory
isolation. Similar to the slab memory, percpu memory should be accounted
by default.
To implement the perpcu accounting it's possible to take the slab memory
accounting as a model to follow. Let's introduce two types of percpu
chunks: root and memcg. What makes memcg chunks different is an
additional space allocated to store memcg membership information. If
__GFP_ACCOUNT is passed on allocation, a memcg chunk should be be used.
If it's possible to charge the corresponding size to the target memory
cgroup, allocation is performed, and the memcg ownership data is recorded.
System-wide allocations are performed using root chunks, so there is no
additional memory overhead.
To implement a fast reparenting of percpu memory on memcg removal, we
don't store mem_cgroup pointers directly: instead we use obj_cgroup API,
introduced for slab accounting.
[akpm@linux-foundation.org: fix CONFIG_MEMCG_KMEM=n build errors and warning]
[akpm@linux-foundation.org: move unreachable code, per Roman]
[cuibixuan@huawei.com: mm/percpu: fix 'defined but not used' warning]
Link: http://lkml.kernel.org/r/6d41b939-a741-b521-a7a2-e7296ec16219@huawei.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Bixuan Cui <cuibixuan@huawei.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200623184515.4132564-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/percpu-stats.c b/mm/percpu-stats.c
index 3255806..c8400a2 100644
--- a/mm/percpu-stats.c
+++ b/mm/percpu-stats.c
@@ -34,11 +34,15 @@ static int find_max_nr_alloc(void)
{
struct pcpu_chunk *chunk;
int slot, max_nr_alloc;
+ enum pcpu_chunk_type type;
max_nr_alloc = 0;
- for (slot = 0; slot < pcpu_nr_slots; slot++)
- list_for_each_entry(chunk, &pcpu_slot[slot], list)
- max_nr_alloc = max(max_nr_alloc, chunk->nr_alloc);
+ for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++)
+ for (slot = 0; slot < pcpu_nr_slots; slot++)
+ list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot],
+ list)
+ max_nr_alloc = max(max_nr_alloc,
+ chunk->nr_alloc);
return max_nr_alloc;
}
@@ -129,6 +133,9 @@ static void chunk_map_stats(struct seq_file *m, struct pcpu_chunk *chunk,
P("cur_min_alloc", cur_min_alloc);
P("cur_med_alloc", cur_med_alloc);
P("cur_max_alloc", cur_max_alloc);
+#ifdef CONFIG_MEMCG_KMEM
+ P("memcg_aware", pcpu_is_memcg_chunk(pcpu_chunk_type(chunk)));
+#endif
seq_putc(m, '\n');
}
@@ -137,6 +144,7 @@ static int percpu_stats_show(struct seq_file *m, void *v)
struct pcpu_chunk *chunk;
int slot, max_nr_alloc;
int *buffer;
+ enum pcpu_chunk_type type;
alloc_buffer:
spin_lock_irq(&pcpu_lock);
@@ -202,18 +210,18 @@ static int percpu_stats_show(struct seq_file *m, void *v)
chunk_map_stats(m, pcpu_reserved_chunk, buffer);
}
- for (slot = 0; slot < pcpu_nr_slots; slot++) {
- list_for_each_entry(chunk, &pcpu_slot[slot], list) {
- if (chunk == pcpu_first_chunk) {
- seq_puts(m, "Chunk: <- First Chunk\n");
- chunk_map_stats(m, chunk, buffer);
-
-
- } else {
- seq_puts(m, "Chunk:\n");
- chunk_map_stats(m, chunk, buffer);
+ for (type = 0; type < PCPU_NR_CHUNK_TYPES; type++) {
+ for (slot = 0; slot < pcpu_nr_slots; slot++) {
+ list_for_each_entry(chunk, &pcpu_chunk_list(type)[slot],
+ list) {
+ if (chunk == pcpu_first_chunk) {
+ seq_puts(m, "Chunk: <- First Chunk\n");
+ chunk_map_stats(m, chunk, buffer);
+ } else {
+ seq_puts(m, "Chunk:\n");
+ chunk_map_stats(m, chunk, buffer);
+ }
}
-
}
}