.. _page_frags:

==============
Page fragments
==============

A page fragment is an arbitrary-length, arbitrary-offset area of memory
which resides within a 0 or higher order compound page. Multiple
fragments within that page are individually refcounted, in the page's
reference counter.

The page_frag functions, page_frag_alloc and page_frag_free, provide a
simple allocation framework for page fragments. This is used by the
network stack and network device drivers to provide a backing region of
memory for use as either an sk_buff->head, or to be used in the "frags"
portion of skb_shared_info.

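The reference counting behind this can be illustrated with a small
userspace model. This is a sketch, not kernel code: model_page, frag_alloc
and frag_free are hypothetical names, the 4096-byte page size is an
assumption, and the page is passed to frag_free explicitly, whereas the
kernel's page_frag_free derives the page from the fragment address.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical stand-in for a page: one counter shared by all fragments. */
struct model_page {
	unsigned int refcount;	/* allocator's reference plus one per live fragment */
	unsigned int offset;	/* next free byte; grows upward in this model */
	int released;		/* set once the last reference is dropped */
	unsigned char data[MODEL_PAGE_SIZE];
};

static void model_page_init(struct model_page *page)
{
	page->refcount = 1;	/* reference held by the allocator itself */
	page->offset = 0;
	page->released = 0;
}

/* Drop one reference; the page is released when the count reaches zero. */
static void model_page_put(struct model_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		page->released = 1;
}

/* Carve len bytes out of the page and take one reference for the fragment. */
static void *frag_alloc(struct model_page *page, unsigned int len)
{
	void *addr;

	if (len > MODEL_PAGE_SIZE - page->offset)
		return NULL;	/* not enough room left in this page */
	addr = page->data + page->offset;
	page->offset += len;
	page->refcount++;	/* models one get_page() per fragment */
	return addr;
}

/* Freeing a fragment only drops its reference on the backing page. */
static void frag_free(struct model_page *page)
{
	model_page_put(page);
}
```

In this model the page outlives the allocator's own reference for as long
as any fragment is still live, which is the property the network stack
relies on when fragments from one page end up in different buffers.
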
In order to make use of the page fragment APIs a backing page fragment
cache is needed. This provides a central point for the fragment allocation
and allows multiple calls to make use of a cached page. The advantage of
doing this is that multiple calls to get_page, which can be expensive at
allocation time, can be avoided. However, due to the nature of this
caching, any calls to the cache must be protected by either a per-cpu
limitation, or a per-cpu limitation combined with disabling interrupts
while the fragment allocation executes.

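A minimal userspace sketch of how such a cache can avoid a get_page per
allocation is given here. It assumes the cache charges a batch of
references to the page once and then hands them out from a private
counter. All names (model_frag_cache, model_cache_refill, model_frag_alloc,
model_frag_free) and both constants are hypothetical, and locking is left
out entirely, which is only safe under the per-cpu restriction described
above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE	4096u	/* assumed page size */
#define MODEL_BIAS	4096u	/* assumed number of references charged up front */

struct model_page {
	unsigned int refcount;		/* stands in for the page's reference counter */
	unsigned int refcount_updates;	/* how often the shared counter was written */
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Hypothetical cache: one page plus a private count of unused references. */
struct model_frag_cache {
	struct model_page *page;
	unsigned int offset;	/* bytes still available; allocation walks downward */
	unsigned int bias;	/* pre-charged references not yet handed out */
};

/* Attach a page and charge all references with a single counter update. */
static void model_cache_refill(struct model_frag_cache *nc,
			       struct model_page *page)
{
	page->refcount = MODEL_BIAS;
	page->refcount_updates = 1;
	nc->page = page;
	nc->offset = MODEL_PAGE_SIZE;
	nc->bias = MODEL_BIAS;
}

/* Allocate a fragment without touching the shared page counter. */
static void *model_frag_alloc(struct model_frag_cache *nc, unsigned int len)
{
	if (nc->page == NULL || len > nc->offset || nc->bias == 0)
		return NULL;	/* a real cache would refill at this point */
	nc->offset -= len;
	nc->bias--;		/* hand one pre-charged reference to the caller */
	return nc->page->data + nc->offset;
}

/* Freeing a fragment returns its reference to the shared counter. */
static void model_frag_free(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
	page->refcount_updates++;
}
```

The invariant in this model is that the page counter always equals the
cache's remaining bias plus the number of live fragments, so any number of
allocations costs only the one counter update made at refill time.
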
The network stack uses two separate caches per CPU to handle fragment
allocation. The netdev_alloc_cache is used by callers making use of the
__netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is
used by callers of the __napi_alloc_frag and __napi_alloc_skb calls. The
main difference between these two sets of calls is the context in which
they may be called. The "netdev" prefixed functions are usable in any
context as these functions will disable interrupts, while the "napi"
prefixed functions are only usable within the softirq context.

Many network device drivers use a similar methodology for allocating page
fragments, but the page fragments are cached at the ring or descriptor
level. In order to enable these cases it is necessary to provide a generic
way of tearing down a page cache. For this reason __page_frag_cache_drain
was implemented. It allows for freeing multiple references from a single
page via a single call. The advantage of doing this is that it allows for
cleaning up the multiple references that were added to a page in order to
avoid calling get_page per allocation.

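The teardown case can be sketched in userspace as follows. The model
assumes a driver that charges a batch of references to a page when it
sets up a receive buffer and gives one away with each fragment it passes
up the stack. model_rx_buffer and the model_* functions are hypothetical
names; only the behaviour of model_cache_drain, subtracting several
references at once and releasing the page when none remain, is meant to
mirror what the text says about __page_frag_cache_drain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_page {
	unsigned int refcount;	/* stands in for the page's reference counter */
	bool released;		/* set once the last reference is dropped */
};

/* Hypothetical per-descriptor state a driver might keep for an Rx buffer. */
struct model_rx_buffer {
	struct model_page *page;
	unsigned int bias;	/* references taken up front and not yet given away */
};

/* Subtract count references in one step; release the page on zero. */
static void model_cache_drain(struct model_page *page, unsigned int count)
{
	assert(page->refcount >= count);
	page->refcount -= count;
	if (page->refcount == 0)
		page->released = true;
}

/* Set up a buffer, charging all of its references to the page at once. */
static void model_rx_buffer_init(struct model_rx_buffer *buf,
				 struct model_page *page, unsigned int bias)
{
	page->refcount = bias;
	page->released = false;
	buf->page = page;
	buf->bias = bias;
}

/* Pass one fragment up the stack: one reference now belongs to the receiver. */
static void model_rx_buffer_give_frag(struct model_rx_buffer *buf)
{
	assert(buf->bias > 0);
	buf->bias--;
}

/* Teardown: return every unused reference with a single drain call. */
static void model_rx_buffer_teardown(struct model_rx_buffer *buf)
{
	model_cache_drain(buf->page, buf->bias);
	buf->page = NULL;
	buf->bias = 0;
}
```

After teardown in this model the page is still held by whichever
fragments were given away, and it is released only once those references
have been dropped as well.
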
Alexander Duyck, Nov 29, 2016.