Blame - Documentation/locking/seqlock.rst - SHIFTPHONES/mainline/linux

blob: 64405e5da63e4379df29ea77b800b1473a075a25 [file] [log] [blame]

Ahmed S. Darwish	0d24f65	2020-07-20 17:55:07 +0200	[diff] [blame]	1	======================================
				2	Sequence counters and sequential locks
				3	======================================
				4
				5	Introduction
				6	============
				7
				8	Sequence counters are a reader-writer consistency mechanism with
				9	lockless readers (read-only retry loops), and no writer starvation. They
				10	are used for data that's rarely written to (e.g. system time), where the
				11	reader wants a consistent set of information and is willing to retry if
				12	that information changes.
				13
				14	A data set is consistent when the sequence count at the beginning of the
				15	read side critical section is even and the same sequence count value is
				16	read again at the end of the critical section. The data in the set must
				17	be copied out inside the read side critical section. If the sequence
				18	count has changed between the start and the end of the critical section,
				19	the reader must retry.
				20
				21	Writers increment the sequence count at the start and the end of their
				22	critical section. After starting the critical section the sequence count
				23	is odd and indicates to the readers that an update is in progress. At
				24	the end of the write side critical section the sequence count becomes
				25	even again which lets readers make progress.
				26
				27	A sequence counter write side critical section must never be preempted
				28	or interrupted by read side sections. Otherwise the reader will spin for
				29	the entire scheduler tick due to the odd sequence count value and the
				30	interrupted writer. If that reader belongs to a real-time scheduling
				31	class, it can spin forever and the kernel will livelock.
				32
				33	This mechanism cannot be used if the protected data contains pointers,
				34	as the writer can invalidate a pointer that the reader is following.
				35
				36
				37	.. _seqcount_t:
				38
				39	Sequence counters (``seqcount_t``)
				40	==================================
				41
				42	This is the the raw counting mechanism, which does not protect against
				43	multiple writers. Write side critical sections must thus be serialized
				44	by an external lock.
				45
				46	If the write serialization primitive is not implicitly disabling
				47	preemption, preemption must be explicitly disabled before entering the
				48	write side section. If the read section can be invoked from hardirq or
				49	softirq contexts, interrupts or bottom halves must also be respectively
				50	disabled before entering the write section.
				51
				52	If it's desired to automatically handle the sequence counter
				53	requirements of writer serialization and non-preemptibility, use
				54	:ref:`seqlock_t` instead.
				55
				56	Initialization::
				57
				58	/* dynamic */
				59	seqcount_t foo_seqcount;
				60	seqcount_init(&foo_seqcount);
				61
				62	/* static */
				63	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);
				64
				65	/* C99 struct init */
				66	struct {
				67	.seq = SEQCNT_ZERO(foo.seq),
				68	} foo;
				69
				70	Write path::
				71
				72	/* Serialized context with disabled preemption */
				73
				74	write_seqcount_begin(&foo_seqcount);
				75
				76	/* ... [[write-side critical section]] ... */
				77
				78	write_seqcount_end(&foo_seqcount);
				79
				80	Read path::
				81
				82	do {
				83	seq = read_seqcount_begin(&foo_seqcount);
				84
				85	/* ... [[read-side critical section]] ... */
				86
				87	} while (read_seqcount_retry(&foo_seqcount, seq));
				88
				89
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	90	.. _seqcount_locktype_t:
				91
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	92	Sequence counters with associated locks (``seqcount_LOCKNAME_t``)
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	93	-----------------------------------------------------------------
				94
				95	As discussed at :ref:`seqcount_t`, sequence count write side critical
				96	sections must be serialized and non-preemptible. This variant of
				97	sequence counters associate the lock used for writer serialization at
				98	initialization time, which enables lockdep to validate that the write
				99	side critical sections are properly serialized.
				100
				101	This lock association is a NOOP if lockdep is disabled and has neither
				102	storage nor runtime overhead. If lockdep is enabled, the lock pointer is
				103	stored in struct seqcount and lockdep's "lock is held" assertions are
				104	injected at the beginning of the write side critical section to validate
				105	that it is properly protected.
				106
				107	For lock types which do not implicitly disable preemption, preemption
				108	protection is enforced in the write side function.
				109
				110	The following sequence counters with associated locks are defined:
				111
				112	- ``seqcount_spinlock_t``
				113	- ``seqcount_raw_spinlock_t``
				114	- ``seqcount_rwlock_t``
				115	- ``seqcount_mutex_t``
				116	- ``seqcount_ww_mutex_t``
				117
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	118	The sequence counter read and write APIs can take either a plain
				119	seqcount_t or any of the seqcount_LOCKNAME_t variants above.
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	120
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	121	Initialization (replace "LOCKNAME" with one of the supported locks)::
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	122
				123	/* dynamic */
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	124	seqcount_LOCKNAME_t foo_seqcount;
				125	seqcount_LOCKNAME_init(&foo_seqcount, &lock);
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	126
				127	/* static */
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	128	static seqcount_LOCKNAME_t foo_seqcount =
				129	SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock);
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	130
				131	/* C99 struct init */
				132	struct {
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	133	.seq = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock),
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	134	} foo;
				135
				136	Write path: same as in :ref:`seqcount_t`, while running from a context
Ahmed S. Darwish	cf48647	2020-12-06 17:21:41 +0100	[diff] [blame]	137	with the associated write serialization lock acquired.
Ahmed S. Darwish	55f3560	2020-07-20 17:55:15 +0200	[diff] [blame]	138
				139	Read path: same as in :ref:`seqcount_t`.
				140
Ahmed S. Darwish	80793c3	2020-08-27 13:40:39 +0200	[diff] [blame]	141
				142	.. _seqcount_latch_t:
				143
				144	Latch sequence counters (``seqcount_latch_t``)
				145	----------------------------------------------
				146
				147	Latch sequence counters are a multiversion concurrency control mechanism
				148	where the embedded seqcount_t counter even/odd value is used to switch
				149	between two copies of protected data. This allows the sequence counter
				150	read path to safely interrupt its own write side critical section.
				151
				152	Use seqcount_latch_t when the write side sections cannot be protected
				153	from interruption by readers. This is typically the case when the read
				154	side can be invoked from NMI handlers.
				155
				156	Check `raw_write_seqcount_latch()` for more information.
				157
				158
Ahmed S. Darwish	0d24f65	2020-07-20 17:55:07 +0200	[diff] [blame]	159	.. _seqlock_t:
				160
				161	Sequential locks (``seqlock_t``)
				162	================================
				163
				164	This contains the :ref:`seqcount_t` mechanism earlier discussed, plus an
				165	embedded spinlock for writer serialization and non-preemptibility.
				166
				167	If the read side section can be invoked from hardirq or softirq context,
				168	use the write side function variants which disable interrupts or bottom
				169	halves respectively.
				170
				171	Initialization::
				172
				173	/* dynamic */
				174	seqlock_t foo_seqlock;
				175	seqlock_init(&foo_seqlock);
				176
				177	/* static */
				178	static DEFINE_SEQLOCK(foo_seqlock);
				179
				180	/* C99 struct init */
				181	struct {
				182	.seql = __SEQLOCK_UNLOCKED(foo.seql)
				183	} foo;
				184
				185	Write path::
				186
				187	write_seqlock(&foo_seqlock);
				188
				189	/* ... [[write-side critical section]] ... */
				190
				191	write_sequnlock(&foo_seqlock);
				192
				193	Read path, three categories:
				194
				195	1. Normal Sequence readers which never block a writer but they must
				196	retry if a writer is in progress by detecting change in the sequence
				197	number. Writers do not wait for a sequence reader::
				198
				199	do {
				200	seq = read_seqbegin(&foo_seqlock);
				201
				202	/* ... [[read-side critical section]] ... */
				203
				204	} while (read_seqretry(&foo_seqlock, seq));
				205
				206	2. Locking readers which will wait if a writer or another locking reader
				207	is in progress. A locking reader in progress will also block a writer
				208	from entering its critical section. This read lock is
				209	exclusive. Unlike rwlock_t, only one locking reader can acquire it::
				210
				211	read_seqlock_excl(&foo_seqlock);
				212
				213	/* ... [[read-side critical section]] ... */
				214
				215	read_sequnlock_excl(&foo_seqlock);
				216
				217	3. Conditional lockless reader (as in 1), or locking reader (as in 2),
				218	according to a passed marker. This is used to avoid lockless readers
				219	starvation (too much retry loops) in case of a sharp spike in write
				220	activity. First, a lockless read is tried (even marker passed). If
				221	that trial fails (odd sequence counter is returned, which is used as
				222	the next iteration marker), the lockless read is transformed to a
				223	full locking read and no retry loop is necessary::
				224
				225	/* marker; even initialization */
				226	int seq = 0;
				227	do {
				228	read_seqbegin_or_lock(&foo_seqlock, &seq);
				229
				230	/* ... [[read-side critical section]] ... */
				231
				232	} while (need_seqretry(&foo_seqlock, seq));
				233	done_seqretry(&foo_seqlock, seq);
				234
				235
				236	API documentation
				237	=================
				238
				239	.. kernel-doc:: include/linux/seqlock.h