Fix NULL pointer dereference in proc_sys_compare
The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
because it really just essentially replaces a memcmp(). The filesystem
is supposed to just compare the two names with whatever case-independent
or other function.
And when I say 'is supposed to', I obviously mean that 'procfs does odd
things, and actually looks at the dentry that we don't even pass down,
rather than just the name'. Which results in problems, because we
actually call d_compare before we have even verified that the dentry is
still hashed at all.
And that causes a problm since the inode that procfs looks at may have
been free'd and the d_inode pointer is NULL. procfs just assumes that
all dentries are positive, since procfs itself never generates a
negative one. But memory pressure will still result in the dentry
getting torn down, and as it is removed by RCU, it still remains visible
on some lists - and to d_compare.
If the filesystem just did a name comparison, we wouldn't care. And we
could just fix procfs to know about negative dentries too. But rather
than have the low-level filesystems know about internal VFS details,
just move the check for a unhashed dentry up a bit, so that we will only
call d_compare on dentries that are still active.
The actual oops this caused didn't look like a NULL pointer dereference
because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
to get at its internal proc_inode information from the inode pointer,
and accessed a field below the inode. So the oops would look something
like
BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50
and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
ppc64 (Hugh Dickins).
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/fs/dcache.c b/fs/dcache.c
index 80e9395..e7a1a99 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1395,6 +1395,10 @@
if (dentry->d_parent != parent)
goto next;
+ /* non-existing due to RCU? */
+ if (d_unhashed(dentry))
+ goto next;
+
/*
* It is safe to compare names since d_move() cannot
* change the qstr (protected by d_lock).
@@ -1410,10 +1414,8 @@
goto next;
}
- if (!d_unhashed(dentry)) {
- atomic_inc(&dentry->d_count);
- found = dentry;
- }
+ atomic_inc(&dentry->d_count);
+ found = dentry;
spin_unlock(&dentry->d_lock);
break;
next: