Merge "ASoC: tlv320aic3x: Add reset inverted DT property"
diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index e09db01..e916c1e 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -86,6 +86,13 @@
The unit size is one block, now only support configuring in range
of [1, 512].
+What: /sys/fs/f2fs/<disk>/umount_discard_timeout
+Date: January 2019
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:
+ Set timeout to issue discard commands during umount.
+ Default: 5 secs
+
What: /sys/fs/f2fs/<disk>/max_victim_search
Date: January 2014
Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
diff --git a/Documentation/accounting/psi.txt b/Documentation/accounting/psi.txt
new file mode 100644
index 0000000..4fb40fe
--- /dev/null
+++ b/Documentation/accounting/psi.txt
@@ -0,0 +1,180 @@
+================================
+PSI - Pressure Stall Information
+================================
+
+:Date: April, 2018
+:Author: Johannes Weiner <hannes@cmpxchg.org>
+
+When CPU, memory or IO devices are contended, workloads experience
+latency spikes, throughput losses, and run the risk of OOM kills.
+
+Without an accurate measure of such contention, users are forced to
+either play it safe and under-utilize their hardware resources, or
+roll the dice and frequently suffer the disruptions resulting from
+excessive overcommit.
+
+The psi feature identifies and quantifies the disruptions caused by
+such resource crunches and the time impact it has on complex workloads
+or even entire systems.
+
+Having an accurate measure of productivity losses caused by resource
+scarcity aids users in sizing workloads to hardware--or provisioning
+hardware according to workload demand.
+
+As psi aggregates this information in realtime, systems can be managed
+dynamically using techniques such as load shedding, migrating jobs to
+other systems or data centers, or strategically pausing or killing low
+priority or restartable batch jobs.
+
+This allows maximizing hardware utilization without sacrificing
+workload health or risking major disruptions such as OOM kills.
+
+Pressure interface
+==================
+
+Pressure information for each resource is exported through the
+respective file in /proc/pressure/ -- cpu, memory, and io.
+
+The format for CPU is as such:
+
+some avg10=0.00 avg60=0.00 avg300=0.00 total=0
+
+and for memory and IO:
+
+some avg10=0.00 avg60=0.00 avg300=0.00 total=0
+full avg10=0.00 avg60=0.00 avg300=0.00 total=0
+
+The "some" line indicates the share of time in which at least some
+tasks are stalled on a given resource.
+
+The "full" line indicates the share of time in which all non-idle
+tasks are stalled on a given resource simultaneously. In this state
+actual CPU cycles are going to waste, and a workload that spends
+extended time in this state is considered to be thrashing. This has
+severe impact on performance, and it's useful to distinguish this
+situation from a state where some tasks are stalled but the CPU is
+still doing productive work. As such, time spent in this subset of the
+stall state is tracked separately and exported in the "full" averages.
+
+The ratios are tracked as recent trends over ten, sixty, and three
+hundred second windows, which gives insight into short term events as
+well as medium and long term trends. The total absolute stall time is
+tracked and exported as well, to allow detection of latency spikes
+which wouldn't necessarily make a dent in the time averages, or to
+average trends over custom time frames.
+
+Monitoring for pressure thresholds
+==================================
+
+Users can register triggers and use poll() to be woken up when resource
+pressure exceeds certain thresholds.
+
+A trigger describes the maximum cumulative stall time over a specific
+time window, e.g. 100ms of total stall time within any 500ms window to
+generate a wakeup event.
+
+To register a trigger user has to open psi interface file under
+/proc/pressure/ representing the resource to be monitored and write the
+desired threshold and time window. The open file descriptor should be
+used to wait for trigger events using select(), poll() or epoll().
+The following format is used:
+
+<some|full> <stall amount in us> <time window in us>
+
+For example writing "some 150000 1000000" into /proc/pressure/memory
+would add 150ms threshold for partial memory stall measured within
+1sec time window. Writing "full 50000 1000000" into /proc/pressure/io
+would add 50ms threshold for full io stall measured within 1sec time window.
+
+Triggers can be set on more than one psi metric and more than one trigger
+for the same psi metric can be specified. However for each trigger a separate
+file descriptor is required to be able to poll it separately from others,
+therefore for each trigger a separate open() syscall should be made even
+when opening the same psi interface file.
+
+Monitors activate only when system enters stall state for the monitored
+psi metric and deactivates upon exit from the stall state. While system is
+in the stall state psi signal growth is monitored at a rate of 10 times per
+tracking window.
+
+The kernel accepts window sizes ranging from 500ms to 10s, therefore min
+monitoring update interval is 50ms and max is 1s. Min limit is set to
+prevent overly frequent polling. Max limit is chosen as a high enough number
+after which monitors are most likely not needed and psi averages can be used
+instead.
+
+When activated, psi monitor stays active for at least the duration of one
+tracking window to avoid repeated activations/deactivations when system is
+bouncing in and out of the stall state.
+
+Notifications to the userspace are rate-limited to one per tracking window.
+
+The trigger will de-register when the file descriptor used to define the
+trigger is closed.
+
+Userspace monitor usage example
+===============================
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <poll.h>
+#include <string.h>
+#include <unistd.h>
+
+/*
+ * Monitor memory partial stall with 1s tracking window size
+ * and 150ms threshold.
+ */
+int main() {
+ const char trig[] = "some 150000 1000000";
+ struct pollfd fds;
+ int n;
+
+ fds.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
+ if (fds.fd < 0) {
+ printf("/proc/pressure/memory open error: %s\n",
+ strerror(errno));
+ return 1;
+ }
+ fds.events = POLLPRI;
+
+ if (write(fds.fd, trig, strlen(trig) + 1) < 0) {
+ printf("/proc/pressure/memory write error: %s\n",
+ strerror(errno));
+ return 1;
+ }
+
+ printf("waiting for events...\n");
+ while (1) {
+ n = poll(&fds, 1, -1);
+ if (n < 0) {
+ printf("poll error: %s\n", strerror(errno));
+ return 1;
+ }
+ if (fds.revents & POLLERR) {
+ printf("got POLLERR, event source is gone\n");
+ return 0;
+ }
+ if (fds.revents & POLLPRI) {
+ printf("event triggered!\n");
+ } else {
+ printf("unknown event received: 0x%x\n", fds.revents);
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+Cgroup2 interface
+=================
+
+In a system with a CONFIG_CGROUP=y kernel and the cgroup2 filesystem
+mounted, pressure stall information is also tracked for tasks grouped
+into cgroups. Each subdirectory in the cgroupfs mountpoint contains
+cpu.pressure, memory.pressure, and io.pressure files; the format is
+the same as the /proc/pressure/ files.
+
+Per-cgroup psi monitors can be specified and used the same way as
+system-wide ones.
diff --git a/Documentation/arm/kernel_mode_neon.txt b/Documentation/arm/kernel_mode_neon.txt
index 5254527..b9e060c 100644
--- a/Documentation/arm/kernel_mode_neon.txt
+++ b/Documentation/arm/kernel_mode_neon.txt
@@ -6,7 +6,7 @@
* Use only NEON instructions, or VFP instructions that don't rely on support
code
* Isolate your NEON code in a separate compilation unit, and compile it with
- '-mfpu=neon -mfloat-abi=softfp'
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp'
* Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
NEON code
* Don't sleep in your NEON code, and be aware that it will be executed with
@@ -87,7 +87,7 @@
Therefore, the recommended and only supported way of using NEON/VFP in the
kernel is by adhering to the following rules:
* isolate the NEON code in a separate compilation unit and compile it with
- '-mfpu=neon -mfloat-abi=softfp';
+ '-march=armv7-a -mfpu=neon -mfloat-abi=softfp';
* issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
into the unit containing the NEON code from a compilation unit which is *not*
built with the GCC flag '-mfpu=neon' set.
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 4cc07ce..73950fd 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -717,6 +717,12 @@
$PERIOD duration. If only one number is written, $MAX is
updated.
+ cpu.pressure
+ A read-only nested-key file which exists on non-root cgroups.
+
+ Shows pressure stall information for CPU. See
+ Documentation/accounting/psi.txt for details.
+
5-2. Memory
@@ -925,6 +931,12 @@
Swap usage hard limit. If a cgroup's swap usage reaches this
limit, anonymous meomry of the cgroup will not be swapped out.
+ memory.pressure
+ A read-only nested-key file which exists on non-root cgroups.
+
+ Shows pressure stall information for memory. See
+ Documentation/accounting/psi.txt for details.
+
5-2-2. Usage Guidelines
@@ -1055,6 +1067,12 @@
8:16 rbps=2097152 wbps=max riops=max wiops=max
+ io.pressure
+ A read-only nested-key file which exists on non-root cgroups.
+
+ Shows pressure stall information for IO. See
+ Documentation/accounting/psi.txt for details.
+
5-3-2. Writeback
diff --git a/Documentation/device-mapper/dm-bow.txt b/Documentation/device-mapper/dm-bow.txt
new file mode 100644
index 0000000..e3fc4d2
--- /dev/null
+++ b/Documentation/device-mapper/dm-bow.txt
@@ -0,0 +1,99 @@
+dm_bow (backup on write)
+========================
+
+dm_bow is a device mapper driver that uses the free space on a device to back up
+data that is overwritten. The changes can then be committed by a simple state
+change, or rolled back by removing the dm_bow device and running a command line
+utility over the underlying device.
+
+dm_bow has three states, set by writing ‘1’ or ‘2’ to /sys/block/dm-?/bow/state.
+It is only possible to go from state 0 (initial state) to state 1, and then from
+state 1 to state 2.
+
+State 0: dm_bow collects all trims to the device and assumes that these mark
+free space on the overlying file system that can be safely used. Typically the
+mount code would create the dm_bow device, mount the file system, call the
+FITRIM ioctl on the file system then switch to state 1. These trims are not
+propagated to the underlying device.
+
+State 1: All writes to the device cause the underlying data to be backed up to
+the free (trimmed) area as needed in such a way as they can be restored.
+However, the writes, with one exception, then happen exactly as they would
+without dm_bow, so the device is always in a good final state. The exception is
+that sector 0 is used to keep a log of the latest changes, both to indicate that
+we are in this state and to allow rollback. See below for all details. If there
+isn't enough free space, writes are failed with -ENOSPC.
+
+State 2: The transition to state 2 triggers replacing the special sector 0 with
+the normal sector 0, and the freeing of all state information. dm_bow then
+becomes a pass-through driver, allowing the device to continue to be used with
+minimal performance impact.
+
+Usage
+=====
+dm-bow takes one command line parameter, the name of the underlying device.
+
+dm-bow will typically be used in the following way. dm-bow will be loaded with a
+suitable underlying device and the resultant device will be mounted. A file
+system trim will be issued via the FITRIM ioctl, then the device will be
+switched to state 1. The file system will now be used as normal. At some point,
+the changes can either be committed by switching to state 2, or rolled back by
+unmounting the file system, removing the dm-bow device and running the command
+line utility. Note that rebooting the device will be equivalent to unmounting
+and removing, but the command line utility must still be run
+
+Details of operation in state 1
+===============================
+
+dm_bow maintains a type for all sectors. A sector can be any of:
+
+SECTOR0
+SECTOR0_CURRENT
+UNCHANGED
+FREE
+CHANGED
+BACKUP
+
+SECTOR0 is the first sector on the device, and is used to hold the log of
+changes. This is the one exception.
+
+SECTOR0_CURRENT is a sector picked from the FREE sectors, and is where reads and
+writes from the true sector zero are redirected to. Note that like any backup
+sector, if the sector is written to directly, it must be moved again.
+
+UNCHANGED means that the sector has not been changed since we entered state 1.
+Thus if it is written to or trimmed, the contents must first be backed up.
+
+FREE means that the sector was trimmed in state 0 and has not yet been written
+to or used for backup. On being written to, a FREE sector is changed to CHANGED.
+
+CHANGED means that the sector has been modified, and can be further modified
+without further backup.
+
+BACKUP means that this is a free sector being used as a backup. On being written
+to, the contents must first be backed up again.
+
+All backup operations are logged to the first sector. The log sector has the
+format:
+--------------------------------------------------------
+| Magic | Count | Sequence | Log entry | Log entry | …
+--------------------------------------------------------
+
+Magic is a magic number. Count is the number of log entries. Sequence is 0
+initially. A log entry is
+
+-----------------------------------
+| Source | Dest | Size | Checksum |
+-----------------------------------
+
+When SECTOR0 is full, the log sector is backed up and another empty log sector
+created with sequence number one higher. The first entry in any log entry with
+sequence > 0 therefore must be the log of the backing up of the previous log
+sector. Note that sequence is not strictly needed, but is a useful sanity check
+and potentially limits the time spent trying to restore a corrupted snapshot.
+
+On entering state 1, dm_bow has a list of free sectors. All other sectors are
+unchanged. Sector0_current is selected from the free sectors and the contents of
+sector 0 are copied there. The sector 0 is backed up, which triggers the first
+log entry to be written.
+
diff --git a/Documentation/devicetree/bindings/arm/msm/msm.txt b/Documentation/devicetree/bindings/arm/msm/msm.txt
index 7422877..4f8e4f5 100644
--- a/Documentation/devicetree/bindings/arm/msm/msm.txt
+++ b/Documentation/devicetree/bindings/arm/msm/msm.txt
@@ -146,6 +146,9 @@
- SDA429
compatible = "qcom,sda429"
+- SDM429W
+ compatible = "qcom,sdm429w"
+
- QM215
compatible = "qcom, qm215"
@@ -366,6 +369,7 @@
compatible = "qcom,sdm429-cdp"
compatible = "qcom,sdm429-mtp"
compatible = "qcom,sdm429-qrd"
+compatible = "qcom,sdm429w-qrd"
compatible = "qcom,sda429-cdp"
compatible = "qcom,sda429-mtp"
compatible = "qcom,sdm439-cdp"
diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
index b640a37..564ccc6 100644
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -125,6 +125,9 @@
disable_ext_identify Disable the extension list configured by mkfs, so f2fs
does not aware of cold files such as media files.
inline_xattr Enable the inline xattrs feature.
+noinline_xattr Disable the inline xattrs feature.
+inline_xattr_size=%u Support configuring inline xattr size, it depends on
+ flexible inline xattr feature.
inline_data Enable the inline data feature: New created small(<~3.4k)
files can be written into inode block.
inline_dentry Enable the inline dir feature: data in new created
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 719c4cd..8d9ff03 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2780,6 +2780,10 @@
nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.
+ nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
+ check bypass). With this option data leaks are possible
+ in the system.
+
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.
@@ -2787,7 +2791,7 @@
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
- nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2
+ nospectre_v2 [X86,PPC_FSL_BOOK3E] Disable all mitigations for the Spectre variant 2
(indirect branch prediction) vulnerability. System may
allow data leaks with this option, which is equivalent
to spectre_v2=off.
@@ -3409,6 +3413,10 @@
before loading.
See Documentation/blockdev/ramdisk.txt.
+ psi= [KNL] Enable or disable pressure stall information
+ tracking.
+ Format: <bool>
+
psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
probe for; one of (bare|imps|exps|lifebook|any).
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 827622e..c88a0a8 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -405,6 +405,7 @@
minimum RTT when it is moved to a longer path (e.g., due to traffic
engineering). A longer window makes the filter more resistant to RTT
inflations such as transient congestion. The unit is seconds.
+ Possible values: 0 - 86400 (1 day)
Default: 300
tcp_moderate_rcvbuf - BOOLEAN
diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
index d2fbeeb..54504c5 100644
--- a/Documentation/printk-formats.txt
+++ b/Documentation/printk-formats.txt
@@ -31,6 +31,15 @@
Raw pointer value SHOULD be printed with %p. The kernel supports
the following extended format specifiers for pointer types:
+Pointer Types:
+
+Pointers printed without a specifier extension (i.e unadorned %p) are
+hashed to give a unique identifier without leaking kernel addresses to user
+space. On 64 bit machines the first 32 bits are zeroed. If you _really_
+want the address see %px below.
+
+ %p abcdef12 or 00000000abcdef12
+
Symbols/Function Pointers:
%pF versatile_init+0x0/0x110
@@ -58,12 +67,24 @@
Kernel Pointers:
- %pK 0x01234567 or 0x0123456789abcdef
+ %pK 01234567 or 0123456789abcdef
For printing kernel pointers which should be hidden from unprivileged
users. The behaviour of %pK depends on the kptr_restrict sysctl - see
Documentation/sysctl/kernel.txt for more details.
+Unmodified Addresses:
+
+ %px 01234567 or 0123456789abcdef
+
+ For printing pointers when you _really_ want to print the address. Please
+ consider whether or not you are leaking sensitive information about the
+ Kernel layout in memory before printing pointers with %px. %px is
+ functionally equivalent to %lx. %px is preferred to %lx because it is more
+ uniquely grep'able. If, in the future, we need to modify the way the Kernel
+ handles printing pointers it will be nice to be able to find the call
+ sites.
+
Struct Resources:
%pr [mem 0x60000000-0x6fffffff flags 0x2200] or
diff --git a/Documentation/siphash.txt b/Documentation/siphash.txt
new file mode 100644
index 0000000..908d348
--- /dev/null
+++ b/Documentation/siphash.txt
@@ -0,0 +1,175 @@
+ SipHash - a short input PRF
+-----------------------------------------------
+Written by Jason A. Donenfeld <jason@zx2c4.com>
+
+SipHash is a cryptographically secure PRF -- a keyed hash function -- that
+performs very well for short inputs, hence the name. It was designed by
+cryptographers Daniel J. Bernstein and Jean-Philippe Aumasson. It is intended
+as a replacement for some uses of: `jhash`, `md5_transform`, `sha_transform`,
+and so forth.
+
+SipHash takes a secret key filled with randomly generated numbers and either
+an input buffer or several input integers. It spits out an integer that is
+indistinguishable from random. You may then use that integer as part of secure
+sequence numbers, secure cookies, or mask it off for use in a hash table.
+
+1. Generating a key
+
+Keys should always be generated from a cryptographically secure source of
+random numbers, either using get_random_bytes or get_random_once:
+
+siphash_key_t key;
+get_random_bytes(&key, sizeof(key));
+
+If you're not deriving your key from here, you're doing it wrong.
+
+2. Using the functions
+
+There are two variants of the function, one that takes a list of integers, and
+one that takes a buffer:
+
+u64 siphash(const void *data, size_t len, const siphash_key_t *key);
+
+And:
+
+u64 siphash_1u64(u64, const siphash_key_t *key);
+u64 siphash_2u64(u64, u64, const siphash_key_t *key);
+u64 siphash_3u64(u64, u64, u64, const siphash_key_t *key);
+u64 siphash_4u64(u64, u64, u64, u64, const siphash_key_t *key);
+u64 siphash_1u32(u32, const siphash_key_t *key);
+u64 siphash_2u32(u32, u32, const siphash_key_t *key);
+u64 siphash_3u32(u32, u32, u32, const siphash_key_t *key);
+u64 siphash_4u32(u32, u32, u32, u32, const siphash_key_t *key);
+
+If you pass the generic siphash function something of a constant length, it
+will constant fold at compile-time and automatically choose one of the
+optimized functions.
+
+3. Hashtable key function usage:
+
+struct some_hashtable {
+ DECLARE_HASHTABLE(hashtable, 8);
+ siphash_key_t key;
+};
+
+void init_hashtable(struct some_hashtable *table)
+{
+ get_random_bytes(&table->key, sizeof(table->key));
+}
+
+static inline hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input)
+{
+ return &table->hashtable[siphash(input, sizeof(*input), &table->key) & (HASH_SIZE(table->hashtable) - 1)];
+}
+
+You may then iterate like usual over the returned hash bucket.
+
+4. Security
+
+SipHash has a very high security margin, with its 128-bit key. So long as the
+key is kept secret, it is impossible for an attacker to guess the outputs of
+the function, even if being able to observe many outputs, since 2^128 outputs
+is significant.
+
+Linux implements the "2-4" variant of SipHash.
+
+5. Struct-passing Pitfalls
+
+Often times the XuY functions will not be large enough, and instead you'll
+want to pass a pre-filled struct to siphash. When doing this, it's important
+to always ensure the struct has no padding holes. The easiest way to do this
+is to simply arrange the members of the struct in descending order of size,
+and to use offsetendof() instead of sizeof() for getting the size. For
+performance reasons, if possible, it's probably a good thing to align the
+struct to the right boundary. Here's an example:
+
+const struct {
+ struct in6_addr saddr;
+ u32 counter;
+ u16 dport;
+} __aligned(SIPHASH_ALIGNMENT) combined = {
+ .saddr = *(struct in6_addr *)saddr,
+ .counter = counter,
+ .dport = dport
+};
+u64 h = siphash(&combined, offsetofend(typeof(combined), dport), &secret);
+
+6. Resources
+
+Read the SipHash paper if you're interested in learning more:
+https://131002.net/siphash/siphash.pdf
+
+
+~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~
+
+HalfSipHash - SipHash's insecure younger cousin
+-----------------------------------------------
+Written by Jason A. Donenfeld <jason@zx2c4.com>
+
+On the off-chance that SipHash is not fast enough for your needs, you might be
+able to justify using HalfSipHash, a terrifying but potentially useful
+possibility. HalfSipHash cuts SipHash's rounds down from "2-4" to "1-3" and,
+even scarier, uses an easily brute-forcable 64-bit key (with a 32-bit output)
+instead of SipHash's 128-bit key. However, this may appeal to some
+high-performance `jhash` users.
+
+Danger!
+
+Do not ever use HalfSipHash except for as a hashtable key function, and only
+then when you can be absolutely certain that the outputs will never be
+transmitted out of the kernel. This is only remotely useful over `jhash` as a
+means of mitigating hashtable flooding denial of service attacks.
+
+1. Generating a key
+
+Keys should always be generated from a cryptographically secure source of
+random numbers, either using get_random_bytes or get_random_once:
+
+hsiphash_key_t key;
+get_random_bytes(&key, sizeof(key));
+
+If you're not deriving your key from here, you're doing it wrong.
+
+2. Using the functions
+
+There are two variants of the function, one that takes a list of integers, and
+one that takes a buffer:
+
+u32 hsiphash(const void *data, size_t len, const hsiphash_key_t *key);
+
+And:
+
+u32 hsiphash_1u32(u32, const hsiphash_key_t *key);
+u32 hsiphash_2u32(u32, u32, const hsiphash_key_t *key);
+u32 hsiphash_3u32(u32, u32, u32, const hsiphash_key_t *key);
+u32 hsiphash_4u32(u32, u32, u32, u32, const hsiphash_key_t *key);
+
+If you pass the generic hsiphash function something of a constant length, it
+will constant fold at compile-time and automatically choose one of the
+optimized functions.
+
+3. Hashtable key function usage:
+
+struct some_hashtable {
+ DECLARE_HASHTABLE(hashtable, 8);
+ hsiphash_key_t key;
+};
+
+void init_hashtable(struct some_hashtable *table)
+{
+ get_random_bytes(&table->key, sizeof(table->key));
+}
+
+static inline hlist_head *some_hashtable_bucket(struct some_hashtable *table, struct interesting_input *input)
+{
+ return &table->hashtable[hsiphash(input, sizeof(*input), &table->key) & (HASH_SIZE(table->hashtable) - 1)];
+}
+
+You may then iterate like usual over the returned hash bucket.
+
+4. Performance
+
+HalfSipHash is roughly 3 times slower than JenkinsHash. For many replacements,
+this will not be a problem, as the hashtable lookup isn't the bottleneck. And
+in general, this is probably a good sacrifice to make for the security and DoS
+resistance of HalfSipHash.
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 3ff58a8..d1908e5 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -13,7 +13,7 @@
- VM ioctls: These query and set attributes that affect an entire virtual
machine, for example memory layout. In addition a VM ioctl is used to
- create virtual cpus (vcpus).
+ create virtual cpus (vcpus) and devices.
Only run VM ioctls from the same process (address space) that was used
to create the VM.
@@ -24,6 +24,11 @@
Only run vcpu ioctls from the same thread that was used to create the
vcpu.
+ - device ioctls: These query and set attributes that control the operation
+ of a single device.
+
+ device ioctls must be issued from the same process (address space) that
+ was used to create the VM.
2. File descriptors
-------------------
@@ -32,10 +37,11 @@
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
-ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
-and return a file descriptor pointing to it. Finally, ioctls on a vcpu
-fd can be used to control the vcpu, including the important task of
-actually running guest code.
+ioctls. A KVM_CREATE_VCPU or KVM_CREATE_DEVICE ioctl on a VM fd will
+create a virtual cpu or device and return a file descriptor pointing to
+the new resource. Finally, ioctls on a vcpu or device fd can be used
+to control the vcpu or device. For vcpus, this includes the important
+task of actually running guest code.
In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of unix domain socket. These
diff --git a/MAINTAINERS b/MAINTAINERS
index 34f509a..5ea427b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11081,6 +11081,13 @@
F: arch/arm/mach-s3c24xx/bast-ide.c
F: arch/arm/mach-s3c24xx/bast-irq.c
+SIPHASH PRF ROUTINES
+M: Jason A. Donenfeld <Jason@zx2c4.com>
+S: Maintained
+F: lib/siphash.c
+F: lib/test_siphash.c
+F: include/linux/siphash.h
+
TI DAVINCI MACHINE SUPPORT
M: Sekhar Nori <nsekhar@ti.com>
M: Kevin Hilman <khilman@kernel.org>
diff --git a/Makefile b/Makefile
index f5e6a01..d2f585a 100755
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
VERSION = 4
PATCHLEVEL = 9
-SUBLEVEL = 163
+SUBLEVEL = 172
EXTRAVERSION =
NAME = Roaring Lionus
@@ -516,7 +516,7 @@
ifeq ($(shell $(srctree)/scripts/clang-android.sh $(CC) $(CLANG_FLAGS)), y)
$(error "Clang with Android --target detected. Did you specify CLANG_TRIPLE?")
endif
-GCC_TOOLCHAIN_DIR := $(dir $(shell which $(LD)))
+GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
@@ -720,7 +720,7 @@
endif
ifdef CONFIG_CFI_CLANG
-cfi-clang-flags += -fsanitize=cfi
+cfi-clang-flags += -fsanitize=cfi $(call cc-option, -fsplit-lto-unit)
DISABLE_CFI_CLANG := -fno-sanitize=cfi
ifdef CONFIG_MODULES
cfi-clang-flags += -fsanitize-cfi-cross-dso
@@ -747,8 +747,7 @@
endif
ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-KBUILD_CFLAGS += $(call cc-option,-Oz,-Os)
-KBUILD_CFLAGS += $(call cc-disable-warning,maybe-uninitialized,)
+KBUILD_CFLAGS += -Os $(call cc-disable-warning,maybe-uninitialized,)
else
ifdef CONFIG_PROFILE_ALL_BRANCHES
KBUILD_CFLAGS += -O2 $(call cc-disable-warning,maybe-uninitialized,)
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 0684fd2..f82393f 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -209,7 +209,7 @@ __arc_copy_from_user(void *to, const void __user *from, unsigned long n)
*/
"=&r" (tmp), "+r" (to), "+r" (from)
:
- : "lp_count", "lp_start", "lp_end", "memory");
+ : "lp_count", "memory");
return n;
}
@@ -438,7 +438,7 @@ __arc_copy_to_user(void __user *to, const void *from, unsigned long n)
*/
"=&r" (tmp), "+r" (to), "+r" (from)
:
- : "lp_count", "lp_start", "lp_end", "memory");
+ : "lp_count", "memory");
return n;
}
@@ -658,7 +658,7 @@ static inline unsigned long __arc_clear_user(void __user *to, unsigned long n)
" .previous \n"
: "+r"(d_char), "+r"(res)
: "i"(0)
- : "lp_count", "lp_start", "lp_end", "memory");
+ : "lp_count", "memory");
return res;
}
@@ -691,7 +691,7 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)
" .previous \n"
: "+r"(res), "+r"(dst), "+r"(src), "=r"(val)
: "g"(-EFAULT), "r"(count)
- : "lp_count", "lp_start", "lp_end", "memory");
+ : "lp_count", "memory");
return res;
}
diff --git a/arch/arc/kernel/head.S b/arch/arc/kernel/head.S
index 1f945d0..208bf2c 100644
--- a/arch/arc/kernel/head.S
+++ b/arch/arc/kernel/head.S
@@ -107,6 +107,7 @@
; r2 = pointer to uboot provided cmdline or external DTB in mem
; These are handled later in handle_uboot_args()
st r0, [@uboot_tag]
+ st r1, [@uboot_magic]
st r2, [@uboot_arg]
#endif
diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
index 9119bea..9f96120 100644
--- a/arch/arc/kernel/setup.c
+++ b/arch/arc/kernel/setup.c
@@ -32,6 +32,7 @@ unsigned int intr_to_DE_cnt;
/* Part of U-boot ABI: see head.S */
int __initdata uboot_tag;
+int __initdata uboot_magic;
char __initdata *uboot_arg;
const struct machine_desc *machine_desc;
@@ -400,6 +401,8 @@ static inline bool uboot_arg_invalid(unsigned long addr)
#define UBOOT_TAG_NONE 0
#define UBOOT_TAG_CMDLINE 1
#define UBOOT_TAG_DTB 2
+/* We always pass 0 as magic from U-boot */
+#define UBOOT_MAGIC_VALUE 0
void __init handle_uboot_args(void)
{
@@ -415,6 +418,11 @@ void __init handle_uboot_args(void)
goto ignore_uboot_args;
}
+ if (uboot_magic != UBOOT_MAGIC_VALUE) {
+ pr_warn(IGNORE_ARGS "non zero uboot magic\n");
+ goto ignore_uboot_args;
+ }
+
if (uboot_tag != UBOOT_TAG_NONE &&
uboot_arg_invalid((unsigned long)uboot_arg)) {
pr_warn(IGNORE_ARGS "invalid uboot arg: '%px'\n", uboot_arg);
diff --git a/arch/arc/lib/memcpy-archs.S b/arch/arc/lib/memcpy-archs.S
index d61044d..ea14b0b 100644
--- a/arch/arc/lib/memcpy-archs.S
+++ b/arch/arc/lib/memcpy-archs.S
@@ -25,15 +25,11 @@
#endif
#ifdef CONFIG_ARC_HAS_LL64
-# define PREFETCH_READ(RX) prefetch [RX, 56]
-# define PREFETCH_WRITE(RX) prefetchw [RX, 64]
# define LOADX(DST,RX) ldd.ab DST, [RX, 8]
# define STOREX(SRC,RX) std.ab SRC, [RX, 8]
# define ZOLSHFT 5
# define ZOLAND 0x1F
#else
-# define PREFETCH_READ(RX) prefetch [RX, 28]
-# define PREFETCH_WRITE(RX) prefetchw [RX, 32]
# define LOADX(DST,RX) ld.ab DST, [RX, 4]
# define STOREX(SRC,RX) st.ab SRC, [RX, 4]
# define ZOLSHFT 4
@@ -41,8 +37,6 @@
#endif
ENTRY_CFI(memcpy)
- prefetch [r1] ; Prefetch the read location
- prefetchw [r0] ; Prefetch the write location
mov.f 0, r2
;;; if size is zero
jz.d [blink]
@@ -72,8 +66,6 @@
lpnz @.Lcopy32_64bytes
;; LOOP START
LOADX (r6, r1)
- PREFETCH_READ (r1)
- PREFETCH_WRITE (r3)
LOADX (r8, r1)
LOADX (r10, r1)
LOADX (r4, r1)
@@ -117,9 +109,7 @@
lpnz @.Lcopy8bytes_1
;; LOOP START
ld.ab r6, [r1, 4]
- prefetch [r1, 28] ;Prefetch the next read location
ld.ab r8, [r1,4]
- prefetchw [r3, 32] ;Prefetch the next write location
SHIFT_1 (r7, r6, 24)
or r7, r7, r5
@@ -162,9 +152,7 @@
lpnz @.Lcopy8bytes_2
;; LOOP START
ld.ab r6, [r1, 4]
- prefetch [r1, 28] ;Prefetch the next read location
ld.ab r8, [r1,4]
- prefetchw [r3, 32] ;Prefetch the next write location
SHIFT_1 (r7, r6, 16)
or r7, r7, r5
@@ -204,9 +192,7 @@
lpnz @.Lcopy8bytes_3
;; LOOP START
ld.ab r6, [r1, 4]
- prefetch [r1, 28] ;Prefetch the next read location
ld.ab r8, [r1,4]
- prefetchw [r3, 32] ;Prefetch the next write location
SHIFT_1 (r7, r6, 8)
or r7, r7, r5
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 353563e..628715e 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1502,6 +1502,7 @@
bool "Support for hot-pluggable CPUs"
select GENERIC_IRQ_MIGRATION
depends on SMP
+ select GENERIC_IRQ_MIGRATION
help
Say Y here to experiment with turning CPUs off and on. CPUs
can be controlled through /sys/devices/system/cpu.
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 38ed62f..f1b4ea9 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -1385,7 +1385,21 @@
@ Preserve return value of efi_entry() in r4
mov r4, r0
- bl cache_clean_flush
+
+ @ our cache maintenance code relies on CP15 barrier instructions
+ @ but since we arrived here with the MMU and caches configured
+ @ by UEFI, we must check that the CP15BEN bit is set in SCTLR.
+ @ Note that this bit is RAO/WI on v6 and earlier, so the ISB in
+ @ the enable path will be executed on v7+ only.
+ mrc p15, 0, r1, c1, c0, 0 @ read SCTLR
+ tst r1, #(1 << 5) @ CP15BEN bit set?
+ bne 0f
+ orr r1, r1, #(1 << 5) @ CP15 barrier instructions
+ mcr p15, 0, r1, c1, c0, 0 @ write SCTLR
+ ARM( .inst 0xf57ff06f @ v7+ isb )
+ THUMB( isb )
+
+0: bl cache_clean_flush
bl cache_off
@ Set parameters for booting zImage according to boot protocol
diff --git a/arch/arm/boot/dts/lpc32xx.dtsi b/arch/arm/boot/dts/lpc32xx.dtsi
index b5841fa..0d20aad 100644
--- a/arch/arm/boot/dts/lpc32xx.dtsi
+++ b/arch/arm/boot/dts/lpc32xx.dtsi
@@ -230,7 +230,7 @@
status = "disabled";
};
- i2s1: i2s@2009C000 {
+ i2s1: i2s@2009c000 {
compatible = "nxp,lpc3220-i2s";
reg = <0x2009C000 0x1000>;
};
@@ -273,7 +273,7 @@
status = "disabled";
};
- i2c1: i2c@400A0000 {
+ i2c1: i2c@400a0000 {
compatible = "nxp,pnx-i2c";
reg = <0x400A0000 0x100>;
interrupt-parent = <&sic1>;
@@ -284,7 +284,7 @@
clocks = <&clk LPC32XX_CLK_I2C1>;
};
- i2c2: i2c@400A8000 {
+ i2c2: i2c@400a8000 {
compatible = "nxp,pnx-i2c";
reg = <0x400A8000 0x100>;
interrupt-parent = <&sic1>;
@@ -295,7 +295,7 @@
clocks = <&clk LPC32XX_CLK_I2C2>;
};
- mpwm: mpwm@400E8000 {
+ mpwm: mpwm@400e8000 {
compatible = "nxp,lpc3220-motor-pwm";
reg = <0x400E8000 0x78>;
status = "disabled";
@@ -394,7 +394,7 @@
#gpio-cells = <3>; /* bank, pin, flags */
};
- timer4: timer@4002C000 {
+ timer4: timer@4002c000 {
compatible = "nxp,lpc3220-timer";
reg = <0x4002C000 0x1000>;
interrupts = <3 IRQ_TYPE_LEVEL_LOW>;
@@ -412,7 +412,7 @@
status = "disabled";
};
- watchdog: watchdog@4003C000 {
+ watchdog: watchdog@4003c000 {
compatible = "nxp,pnx4008-wdt";
reg = <0x4003C000 0x1000>;
clocks = <&clk LPC32XX_CLK_WDOG>;
@@ -451,7 +451,7 @@
status = "disabled";
};
- timer1: timer@4004C000 {
+ timer1: timer@4004c000 {
compatible = "nxp,lpc3220-timer";
reg = <0x4004C000 0x1000>;
interrupts = <17 IRQ_TYPE_LEVEL_LOW>;
@@ -475,14 +475,14 @@
status = "disabled";
};
- pwm1: pwm@4005C000 {
+ pwm1: pwm@4005c000 {
compatible = "nxp,lpc3220-pwm";
reg = <0x4005C000 0x4>;
clocks = <&clk LPC32XX_CLK_PWM1>;
status = "disabled";
};
- pwm2: pwm@4005C004 {
+ pwm2: pwm@4005c004 {
compatible = "nxp,lpc3220-pwm";
reg = <0x4005C004 0x4>;
clocks = <&clk LPC32XX_CLK_PWM2>;
diff --git a/arch/arm/boot/dts/sama5d2-pinfunc.h b/arch/arm/boot/dts/sama5d2-pinfunc.h
index 8a394f3..ee65702 100644
--- a/arch/arm/boot/dts/sama5d2-pinfunc.h
+++ b/arch/arm/boot/dts/sama5d2-pinfunc.h
@@ -517,7 +517,7 @@
#define PIN_PC9__GPIO PINMUX_PIN(PIN_PC9, 0, 0)
#define PIN_PC9__FIQ PINMUX_PIN(PIN_PC9, 1, 3)
#define PIN_PC9__GTSUCOMP PINMUX_PIN(PIN_PC9, 2, 1)
-#define PIN_PC9__ISC_D0 PINMUX_PIN(PIN_PC9, 2, 1)
+#define PIN_PC9__ISC_D0 PINMUX_PIN(PIN_PC9, 3, 1)
#define PIN_PC9__TIOA4 PINMUX_PIN(PIN_PC9, 4, 2)
#define PIN_PC10 74
#define PIN_PC10__GPIO PINMUX_PIN(PIN_PC10, 0, 0)
diff --git a/arch/arm/configs/ranchu_defconfig b/arch/arm/configs/ranchu_defconfig
index e217631..f59f38c 100644
--- a/arch/arm/configs/ranchu_defconfig
+++ b/arch/arm/configs/ranchu_defconfig
@@ -185,7 +185,6 @@
CONFIG_TABLET_USB_HANWANG=y
CONFIG_TABLET_USB_KBTAB=y
CONFIG_INPUT_MISC=y
-CONFIG_INPUT_KEYCHORD=y
CONFIG_INPUT_UINPUT=y
CONFIG_INPUT_GPIO=y
# CONFIG_SERIO_SERPORT is not set
diff --git a/arch/arm/configs/spyro-perf_defconfig b/arch/arm/configs/spyro-perf_defconfig
index 03396f4..cfefe78 100644
--- a/arch/arm/configs/spyro-perf_defconfig
+++ b/arch/arm/configs/spyro-perf_defconfig
@@ -360,6 +360,7 @@
CONFIG_REGULATOR=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_REGULATOR_PROXY_CONSUMER=y
+CONFIG_REGULATOR_GPIO=y
CONFIG_REGULATOR_CPR=y
CONFIG_REGULATOR_CPR4_APSS=y
CONFIG_REGULATOR_CPRH_KBSS=y
diff --git a/arch/arm/configs/spyro_defconfig b/arch/arm/configs/spyro_defconfig
index fcf060a..421e21c 100644
--- a/arch/arm/configs/spyro_defconfig
+++ b/arch/arm/configs/spyro_defconfig
@@ -369,6 +369,7 @@
CONFIG_REGULATOR=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_REGULATOR_PROXY_CONSUMER=y
+CONFIG_REGULATOR_GPIO=y
CONFIG_REGULATOR_CPR=y
CONFIG_REGULATOR_CPR4_APSS=y
CONFIG_REGULATOR_CPRH_KBSS=y
diff --git a/arch/arm/crypto/sha256-armv4.pl b/arch/arm/crypto/sha256-armv4.pl
index fac0533..f64e841 100644
--- a/arch/arm/crypto/sha256-armv4.pl
+++ b/arch/arm/crypto/sha256-armv4.pl
@@ -205,10 +205,11 @@
.global sha256_block_data_order
.type sha256_block_data_order,%function
sha256_block_data_order:
+.Lsha256_block_data_order:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ sha256_block_data_order
#else
- adr r3,sha256_block_data_order
+ adr r3,.Lsha256_block_data_order
#endif
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
ldr r12,.LOPENSSL_armcap
diff --git a/arch/arm/crypto/sha256-core.S_shipped b/arch/arm/crypto/sha256-core.S_shipped
index 555a1a8..72c2480 100644
--- a/arch/arm/crypto/sha256-core.S_shipped
+++ b/arch/arm/crypto/sha256-core.S_shipped
@@ -86,10 +86,11 @@
.global sha256_block_data_order
.type sha256_block_data_order,%function
sha256_block_data_order:
+.Lsha256_block_data_order:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ sha256_block_data_order
#else
- adr r3,sha256_block_data_order
+ adr r3,.Lsha256_block_data_order
#endif
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
ldr r12,.LOPENSSL_armcap
diff --git a/arch/arm/crypto/sha512-armv4.pl b/arch/arm/crypto/sha512-armv4.pl
index a2b11a8..5fe3364 100644
--- a/arch/arm/crypto/sha512-armv4.pl
+++ b/arch/arm/crypto/sha512-armv4.pl
@@ -267,10 +267,11 @@
.global sha512_block_data_order
.type sha512_block_data_order,%function
sha512_block_data_order:
+.Lsha512_block_data_order:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ sha512_block_data_order
#else
- adr r3,sha512_block_data_order
+ adr r3,.Lsha512_block_data_order
#endif
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
ldr r12,.LOPENSSL_armcap
diff --git a/arch/arm/crypto/sha512-core.S_shipped b/arch/arm/crypto/sha512-core.S_shipped
index 3694c4d..de9bd7f 100644
--- a/arch/arm/crypto/sha512-core.S_shipped
+++ b/arch/arm/crypto/sha512-core.S_shipped
@@ -134,10 +134,11 @@
.global sha512_block_data_order
.type sha512_block_data_order,%function
sha512_block_data_order:
+.Lsha512_block_data_order:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ sha512_block_data_order
#else
- adr r3,sha512_block_data_order
+ adr r3,.Lsha512_block_data_order
#endif
#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__)
ldr r12,.LOPENSSL_armcap
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 513e03d..8331cb0 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -10,6 +10,8 @@
#define sev() __asm__ __volatile__ ("sev" : : : "memory")
#define wfe() __asm__ __volatile__ ("wfe" : : : "memory")
#define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
+#else
+#define wfe() do { } while (0)
#endif
#if __LINUX_ARM_ARCH__ >= 7
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index e53638c..61e1d08 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -24,7 +24,6 @@
#ifndef __ASSEMBLY__
struct irqaction;
struct pt_regs;
-extern void migrate_irqs(void);
extern void asm_do_IRQ(unsigned int, struct pt_regs *);
void handle_IRQ(unsigned int, struct pt_regs *);
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index b2c5d79..f59a196 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -80,7 +80,11 @@ extern void release_thread(struct task_struct *);
unsigned long get_wchan(struct task_struct *p);
#if __LINUX_ARM_ARCH__ == 6 || defined(CONFIG_ARM_ERRATA_754327)
-#define cpu_relax() smp_mb()
+#define cpu_relax() \
+ do { \
+ smp_mb(); \
+ __asm__ __volatile__("nop; nop; nop; nop; nop; nop; nop; nop; nop; nop;"); \
+ } while (0)
#else
#define cpu_relax() barrier()
#endif
diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
old mode 100644
new mode 100755
index 42d3974..407527f
--- a/arch/arm/kernel/irq.c
+++ b/arch/arm/kernel/irq.c
@@ -31,7 +31,6 @@
#include <linux/smp.h>
#include <linux/init.h>
#include <linux/seq_file.h>
-#include <linux/ratelimit.h>
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/kallsyms.h>
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index b18c1ea..ef6b27f 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -87,8 +87,11 @@ void machine_crash_nonpanic_core(void *unused)
set_cpu_online(smp_processor_id(), false);
atomic_dec(&waiting_for_crash_ipi);
- while (1)
+
+ while (1) {
cpu_relax();
+ wfe();
+ }
}
static void machine_kexec_mask_interrupts(void)
diff --git a/arch/arm/kernel/patch.c b/arch/arm/kernel/patch.c
index 69bda1a..1f665ac 100644
--- a/arch/arm/kernel/patch.c
+++ b/arch/arm/kernel/patch.c
@@ -15,7 +15,7 @@ struct patch {
unsigned int insn;
};
-static DEFINE_SPINLOCK(patch_lock);
+static DEFINE_RAW_SPINLOCK(patch_lock);
static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
__acquires(&patch_lock)
@@ -32,7 +32,7 @@ static void __kprobes *patch_map(void *addr, int fixmap, unsigned long *flags)
return addr;
if (flags)
- spin_lock_irqsave(&patch_lock, *flags);
+ raw_spin_lock_irqsave(&patch_lock, *flags);
else
__acquire(&patch_lock);
@@ -47,7 +47,7 @@ static void __kprobes patch_unmap(int fixmap, unsigned long *flags)
clear_fixmap(fixmap);
if (flags)
- spin_unlock_irqrestore(&patch_lock, *flags);
+ raw_spin_unlock_irqrestore(&patch_lock, *flags);
else
__release(&patch_lock);
}
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index b7a75a0..9316300429 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -254,7 +254,7 @@ int __cpu_disable(void)
/*
* OK - migrate IRQs away from this CPU
*/
- migrate_irqs();
+ irq_migrate_all_off_this_cpu();
/*
* Flush user cache and TLB mappings, and then remove this CPU
@@ -615,8 +615,10 @@ static void ipi_cpu_stop(unsigned int cpu)
local_fiq_disable();
local_irq_disable();
- while (1)
+ while (1) {
cpu_relax();
+ wfe();
+ }
}
static DEFINE_PER_CPU(struct completion *, cpu_completion);
diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index 0bee233..314cfb2 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -93,7 +93,7 @@ extern const struct unwind_idx __start_unwind_idx[];
static const struct unwind_idx *__origin_unwind_idx;
extern const struct unwind_idx __stop_unwind_idx[];
-static DEFINE_SPINLOCK(unwind_lock);
+static DEFINE_RAW_SPINLOCK(unwind_lock);
static LIST_HEAD(unwind_tables);
/* Convert a prel31 symbol to an absolute address */
@@ -201,7 +201,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
/* module unwind tables */
struct unwind_table *table;
- spin_lock_irqsave(&unwind_lock, flags);
+ raw_spin_lock_irqsave(&unwind_lock, flags);
list_for_each_entry(table, &unwind_tables, list) {
if (addr >= table->begin_addr &&
addr < table->end_addr) {
@@ -213,7 +213,7 @@ static const struct unwind_idx *unwind_find_idx(unsigned long addr)
break;
}
}
- spin_unlock_irqrestore(&unwind_lock, flags);
+ raw_spin_unlock_irqrestore(&unwind_lock, flags);
}
pr_debug("%s: idx = %p\n", __func__, idx);
@@ -529,9 +529,9 @@ struct unwind_table *unwind_table_add(unsigned long start, unsigned long size,
tab->begin_addr = text_addr;
tab->end_addr = text_addr + text_size;
- spin_lock_irqsave(&unwind_lock, flags);
+ raw_spin_lock_irqsave(&unwind_lock, flags);
list_add_tail(&tab->list, &unwind_tables);
- spin_unlock_irqrestore(&unwind_lock, flags);
+ raw_spin_unlock_irqrestore(&unwind_lock, flags);
return tab;
}
@@ -543,9 +543,9 @@ void unwind_table_del(struct unwind_table *tab)
if (!tab)
return;
- spin_lock_irqsave(&unwind_lock, flags);
+ raw_spin_lock_irqsave(&unwind_lock, flags);
list_del(&tab->list);
- spin_unlock_irqrestore(&unwind_lock, flags);
+ raw_spin_unlock_irqrestore(&unwind_lock, flags);
kfree(tab);
}
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 27f4d962..b3ecffb 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -38,7 +38,7 @@
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
- NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
+ NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index 2c40aea..c691b90 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,7 +14,7 @@
MODULE_LICENSE("GPL");
#ifndef __ARM_NEON__
-#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
+#error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
#endif
/*
diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-imx6q.c
index bfeb25a..326e870 100644
--- a/arch/arm/mach-imx/cpuidle-imx6q.c
+++ b/arch/arm/mach-imx/cpuidle-imx6q.c
@@ -16,30 +16,23 @@
#include "cpuidle.h"
#include "hardware.h"
-static atomic_t master = ATOMIC_INIT(0);
-static DEFINE_SPINLOCK(master_lock);
+static int num_idle_cpus = 0;
+static DEFINE_SPINLOCK(cpuidle_lock);
static int imx6q_enter_wait(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
- if (atomic_inc_return(&master) == num_online_cpus()) {
- /*
- * With this lock, we prevent other cpu to exit and enter
- * this function again and become the master.
- */
- if (!spin_trylock(&master_lock))
- goto idle;
+ spin_lock(&cpuidle_lock);
+ if (++num_idle_cpus == num_online_cpus())
imx6_set_lpm(WAIT_UNCLOCKED);
- cpu_do_idle();
- imx6_set_lpm(WAIT_CLOCKED);
- spin_unlock(&master_lock);
- goto done;
- }
+ spin_unlock(&cpuidle_lock);
-idle:
cpu_do_idle();
-done:
- atomic_dec(&master);
+
+ spin_lock(&cpuidle_lock);
+ if (num_idle_cpus-- == num_online_cpus())
+ imx6_set_lpm(WAIT_CLOCKED);
+ spin_unlock(&cpuidle_lock);
return index;
}
diff --git a/arch/arm/mach-omap2/display.c b/arch/arm/mach-omap2/display.c
index 70b3eaf..5ca7e29 100644
--- a/arch/arm/mach-omap2/display.c
+++ b/arch/arm/mach-omap2/display.c
@@ -115,6 +115,7 @@ static int omap4_dsi_mux_pads(int dsi_id, unsigned lanes)
u32 enable_mask, enable_shift;
u32 pipd_mask, pipd_shift;
u32 reg;
+ int ret;
if (dsi_id == 0) {
enable_mask = OMAP4_DSI1_LANEENABLE_MASK;
@@ -130,7 +131,11 @@ static int omap4_dsi_mux_pads(int dsi_id, unsigned lanes)
return -ENODEV;
}
- regmap_read(omap4_dsi_mux_syscon, OMAP4_DSIPHY_SYSCON_OFFSET, ®);
+ ret = regmap_read(omap4_dsi_mux_syscon,
+ OMAP4_DSIPHY_SYSCON_OFFSET,
+ ®);
+ if (ret)
+ return ret;
reg &= ~enable_mask;
reg &= ~pipd_mask;
diff --git a/arch/arm/mach-omap2/prm_common.c b/arch/arm/mach-omap2/prm_common.c
index f1ca947..9e14604 100644
--- a/arch/arm/mach-omap2/prm_common.c
+++ b/arch/arm/mach-omap2/prm_common.c
@@ -533,8 +533,10 @@ void omap_prm_reset_system(void)
prm_ll_data->reset_system();
- while (1)
+ while (1) {
cpu_relax();
+ wfe();
+ }
}
/**
diff --git a/arch/arm/mach-s3c24xx/mach-osiris-dvs.c b/arch/arm/mach-s3c24xx/mach-osiris-dvs.c
index 262ab07..f4fdfca 100644
--- a/arch/arm/mach-s3c24xx/mach-osiris-dvs.c
+++ b/arch/arm/mach-s3c24xx/mach-osiris-dvs.c
@@ -70,16 +70,16 @@ static int osiris_dvs_notify(struct notifier_block *nb,
switch (val) {
case CPUFREQ_PRECHANGE:
- if (old_dvs & !new_dvs ||
- cur_dvs & !new_dvs) {
+ if ((old_dvs && !new_dvs) ||
+ (cur_dvs && !new_dvs)) {
pr_debug("%s: exiting dvs\n", __func__);
cur_dvs = false;
gpio_set_value(OSIRIS_GPIO_DVS, 1);
}
break;
case CPUFREQ_POSTCHANGE:
- if (!old_dvs & new_dvs ||
- !cur_dvs & new_dvs) {
+ if ((!old_dvs && new_dvs) ||
+ (!cur_dvs && new_dvs)) {
pr_debug("entering dvs\n");
cur_dvs = true;
gpio_set_value(OSIRIS_GPIO_DVS, 0);
diff --git a/arch/arm/plat-samsung/Kconfig b/arch/arm/plat-samsung/Kconfig
index e8229b9..3265b8f 100644
--- a/arch/arm/plat-samsung/Kconfig
+++ b/arch/arm/plat-samsung/Kconfig
@@ -258,7 +258,7 @@
config SAMSUNG_PM_CHECK
bool "S3C2410 PM Suspend Memory CRC"
- depends on PM
+ depends on PM && (PLAT_S3C24XX || ARCH_S3C64XX || ARCH_S5PV210)
select CRC32
help
Enable the PM code's memory area checksum over sleep. This option
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 2d53e26..bca2c3c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1264,6 +1264,10 @@
def_bool y
depends on COMPAT && SYSVIPC
+config KEYS_COMPAT
+ def_bool y
+ depends on COMPAT && KEYS
+
endmenu
menu "Power management options"
diff --git a/arch/arm64/boot/dts/qcom/msm8937-pinctrl.dtsi b/arch/arm64/boot/dts/qcom/msm8937-pinctrl.dtsi
index c7d8ff1..94cc560 100644
--- a/arch/arm64/boot/dts/qcom/msm8937-pinctrl.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8937-pinctrl.dtsi
@@ -514,7 +514,7 @@
function = "sec_mi2s";
};
configs {
- pins = "gpio2", "gpio3";
+ pins = "gpio12", "gpio13";
drive-strength = <8>; /* 8 MA */
bias-disable; /* No PULL */
output-high;
@@ -526,7 +526,7 @@
function = "sec_mi2s";
};
configs {
- pins = "gpio2", "gpio3";
+ pins = "gpio12", "gpio13";
drive-strength = <2>; /* 2 MA */
bias-pull-down; /* PULL DOWN */
};
diff --git a/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dts b/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dts
index 951720e..c9d952d 100644
--- a/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dts
+++ b/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dts
@@ -20,5 +20,5 @@
model = "Qualcomm Technologies, Inc. SDM429 QRD Spyro";
compatible = "qcom,sdm429-qrd", "qcom,sdm429", "qcom,qrd";
qcom,board-id = <0xb 6>;
- qcom,pmic-id = <0x010016 0x25 0x0 0x0>;
+ qcom,pmic-id = <0x0002001b 0x0 0x0 0x0>;
};
diff --git a/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dtsi b/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dtsi
index b0f8539..4f7ccde 100644
--- a/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm429-qrd-spyro-evt.dtsi
@@ -15,6 +15,32 @@
#include "sdm429w-pm660.dtsi"
#include "sdm429w-camera-sensor-spyro.dtsi"
+&gpio_key_active {
+ mux {
+ pins = "gpio91", "gpio127", "gpio128", "gpio35", "gpio126";
+ function = "gpio";
+ };
+
+ config {
+ pins = "gpio91", "gpio127", "gpio128", "gpio35", "gpio126";
+ drive-strength = <2>;
+ bias-pull-up;
+ };
+};
+
+&gpio_key_suspend {
+ mux {
+ pins = "gpio91", "gpio127", "gpio128", "gpio35", "gpio126";
+ function = "gpio";
+ };
+
+ config {
+ pins = "gpio91", "gpio127", "gpio128", "gpio35", "gpio126";
+ drive-strength = <2>;
+ bias-pull-up;
+ };
+};
+
&cdc_pdm_lines_act {
mux {
pins = "gpio68", "gpio73", "gpio74";
@@ -44,6 +70,35 @@
qcom,dsi-pref-prim-pan = <&dsi_hx8399c_hd_vid>;
};
+&i2c_5 {
+ status = "disabled";
+};
+
+&vol_up {
+ gpios = <&tlmm 35 0x1>;
+};
+
+&gpio_keys {
+ function_1: function_1 {
+ label = "function_1";
+ gpios = <&tlmm 127 0x1>;
+ linux,input-type = <1>;
+ linux,code = <116>;
+ debounce-interval = <15>;
+ linux,can-disable;
+ gpio-key,wakeup;
+ };
+
+ function_2: function_2 {
+ label = "function_2";
+ gpios = <&tlmm 126 0x1>;
+ linux,input-type = <1>;
+ linux,code = <117>;
+ debounce-interval = <15>;
+ linux,can-disable;
+ gpio-key,wakeup;
+ };
+};
&soc {
/delete-node/ qcom,spm@b1d2000;
@@ -196,8 +251,8 @@
asoc-wsa-codec-prefixes;
qcom,audio-routing =
"CDC_CONN", "MCLK",
- "QUAT_MI2S_RX", "DIGIT_REGULATOR",
- "TX_I2S_CLK", "DIGIT_REGULATOR",
+ "QUAT_MI2S_RX", "DIGITAL_REGULATOR",
+ "TX_I2S_CLK", "DIGITAL_REGULATOR",
"DMIC1", "Digital Mic1",
"DMIC2", "Digital Mic2";
qcom,cdc-dmic-gpios = <&cdc_dmic_gpios>;
@@ -234,6 +289,10 @@
compatible = "qcom,msm-digital-codec";
reg = <0xc0f0000 0x0>;
qcom,no-analog-codec;
+ cdc-vdd-digital-supply = <&pm660_l9>;
+ qcom,cdc-vdd-digital-voltage = <1800000 1800000>;
+ qcom,cdc-vdd-digital-current = <10000>;
+ qcom,cdc-on-demand-supplies = "cdc-vdd-digital";
};
cdc_dmic_gpios: cdc_dmic_pinctrl {
diff --git a/arch/arm64/boot/dts/qcom/sdm429w-camera-sensor-spyro.dtsi b/arch/arm64/boot/dts/qcom/sdm429w-camera-sensor-spyro.dtsi
index 1990d84..f0dec67 100644
--- a/arch/arm64/boot/dts/qcom/sdm429w-camera-sensor-spyro.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm429w-camera-sensor-spyro.dtsi
@@ -250,13 +250,16 @@
&cam_sensor_front_sleep>;
gpios = <&tlmm 27 0>,
<&tlmm 33 0>,
+ <&tlmm 66 0>,
<&tlmm 38 0>;
qcom,gpio-vana= <1>;
- qcom,gpio-reset = <2>;
- qcom,gpio-req-tbl-num = <0 1 2>;
- qcom,gpio-req-tbl-flags = <1 0 0>;
+ qcom,gpio-vdig= <2>;
+ qcom,gpio-reset = <3>;
+ qcom,gpio-req-tbl-num = <0 1 2 3>;
+ qcom,gpio-req-tbl-flags = <1 0 0 0>;
qcom,gpio-req-tbl-label = "CAMIF_MCLK1",
"CAM_AVDD1",
+ "CAM_DVDD1",
"CAM_RESET1";
qcom,sensor-position = <0x1>;
qcom,sensor-mode = <0>;
diff --git a/arch/arm64/boot/dts/qcom/sdm429w-pm660.dtsi b/arch/arm64/boot/dts/qcom/sdm429w-pm660.dtsi
index 7acc2d6..a959680 100644
--- a/arch/arm64/boot/dts/qcom/sdm429w-pm660.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm429w-pm660.dtsi
@@ -177,11 +177,12 @@
};
qcom,lpass@c200000 {
- /delete-property/ vdd_cx-supply;
+ vdd_cx-supply = <&pm660_s2_level>;
};
qcom,pronto@a21b000 {
/delete-property/ vdd_pronto_pll-supply;
+ vdd_pronto_pll-supply = <&pm660_l12>;
};
qcom,wcnss-wlan@0a000000 {
@@ -193,6 +194,13 @@
/delete-property/ qcom,iris-vddpa-supply;
/delete-property/ qcom,iris-vdddig-supply;
/delete-property/ qcom,wcnss-adc_tm;
+ qcom,pronto-vddmx-supply = <&pm660_s2_level_ao>;
+ qcom,pronto-vddcx-supply = <&pm660_s1_level>;
+ qcom,pronto-vddpx-supply = <&pm660_l13>;
+ qcom,iris-vddxo-supply = <&pm660_l12>;
+ qcom,iris-vddrfa-supply = <&pm660_l5>;
+ qcom,iris-vdddig-supply = <&pm660_l13>;
+ qcom,wcnss-adc_tm = <&pm660_adc_tm>;
};
qcom,csid@1b30000 {
@@ -240,12 +248,11 @@
};
&pil_mss {
- /delete-property/ vdd_mss-supply;
- /delete-property/ vdd_cx-supply;
- /delete-property/ vdd_cx-voltage;
- /delete-property/ vdd_mx-supply;
- /delete-property/ vdd_mx-uV;
- /delete-property/ vdd_pll-supply;
+ vdd_mss-supply = <&S6A>;
+ vdd_cx-supply = <&pm660_s1_level_ao>;
+ vdd_mx-supply = <&pm660_s2_level_ao>;
+ vdd_pll-supply = <&L12A>;
+ qcom,vdd_pll = <1800000>;
};
&cci {
diff --git a/arch/arm64/boot/dts/qcom/sdm439-qrd.dtsi b/arch/arm64/boot/dts/qcom/sdm439-qrd.dtsi
index c382f06..3d2ff51 100644
--- a/arch/arm64/boot/dts/qcom/sdm439-qrd.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm439-qrd.dtsi
@@ -130,13 +130,13 @@
};
&soc {
- gpio_keys {
+ gpio_keys: gpio_keys {
compatible = "gpio-keys";
label = "gpio-keys";
pinctrl-names = "default";
pinctrl-0 = <&gpio_key_active>;
- vol_up {
+ vol_up: vol_up {
label = "volume_up";
gpios = <&tlmm 91 0x1>;
linux,input-type = <1>;
diff --git a/arch/arm64/configs/cuttlefish_defconfig b/arch/arm64/configs/cuttlefish_defconfig
index 5aead19..b87deb9 100644
--- a/arch/arm64/configs/cuttlefish_defconfig
+++ b/arch/arm64/configs/cuttlefish_defconfig
@@ -6,6 +6,7 @@
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_CGROUP_FREEZER=y
@@ -57,8 +58,6 @@
CONFIG_ARM64_SW_TTBR0_PAN=y
CONFIG_ARM64_LSE_ATOMICS=y
CONFIG_RANDOMIZE_BASE=y
-CONFIG_CMDLINE="console=ttyAMA0"
-CONFIG_CMDLINE_EXTEND=y
# CONFIG_EFI is not set
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_COMPAT=y
@@ -81,6 +80,7 @@
CONFIG_UNIX=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_INTERFACE=y
+CONFIG_XFRM_STATISTICS=y
CONFIG_NET_KEY=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
@@ -182,6 +182,7 @@
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_HTB=y
CONFIG_NET_SCH_NETEM=y
+CONFIG_NET_SCH_INGRESS=y
CONFIG_NET_CLS_U32=y
CONFIG_NET_CLS_BPF=y
CONFIG_NET_EMATCH=y
@@ -216,6 +217,7 @@
CONFIG_DM_VERITY=y
CONFIG_DM_VERITY_FEC=y
CONFIG_DM_VERITY_AVB=y
+CONFIG_DM_BOW=y
CONFIG_NETDEVICES=y
CONFIG_NETCONSOLE=y
CONFIG_NETCONSOLE_DYNAMIC=y
@@ -282,6 +284,7 @@
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
+CONFIG_SERIAL_OF_PLATFORM=y
CONFIG_SERIAL_AMBA_PL011=y
CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
CONFIG_VIRTIO_CONSOLE=y
@@ -383,6 +386,7 @@
# CONFIG_MMC_BLOCK is not set
CONFIG_RTC_CLASS=y
# CONFIG_RTC_SYSTOHC is not set
+CONFIG_RTC_DRV_PL030=y
CONFIG_RTC_DRV_PL031=y
CONFIG_VIRTIO_PCI=y
# CONFIG_VIRTIO_PCI_LEGACY is not set
@@ -412,6 +416,7 @@
CONFIG_QUOTA=y
CONFIG_QFMT_V2=y
CONFIG_FUSE_FS=y
+CONFIG_OVERLAY_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_TMPFS=y
diff --git a/arch/arm64/configs/ranchu64_defconfig b/arch/arm64/configs/ranchu64_defconfig
index 704d9b5..4195a35 100644
--- a/arch/arm64/configs/ranchu64_defconfig
+++ b/arch/arm64/configs/ranchu64_defconfig
@@ -188,7 +188,6 @@
CONFIG_INPUT_JOYSTICK=y
CONFIG_INPUT_TABLET=y
CONFIG_INPUT_MISC=y
-CONFIG_INPUT_KEYCHORD=y
CONFIG_INPUT_UINPUT=y
CONFIG_INPUT_GPIO=y
# CONFIG_SERIO_SERPORT is not set
diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index 3363560..7bc459d 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -74,12 +74,13 @@
beq 10f
ext v0.16b, v0.16b, v0.16b, #1 /* rotate out the mac bytes */
b 7b
-8: mov w7, w8
+8: cbz w8, 91f
+ mov w7, w8
add w8, w8, #16
9: ext v1.16b, v1.16b, v1.16b, #1
adds w7, w7, #1
bne 9b
- eor v0.16b, v0.16b, v1.16b
+91: eor v0.16b, v0.16b, v1.16b
st1 {v0.16b}, [x0]
10: str w8, [x3]
ret
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
old mode 100644
new mode 100755
index a891bb6..e69df28
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -30,8 +30,8 @@ do { \
" prfm pstl1strm, %2\n" \
"1: ldxr %w1, %2\n" \
insn "\n" \
-"2: stlxr %w3, %w0, %2\n" \
-" cbnz %w3, 1b\n" \
+"2: stlxr %w0, %w3, %2\n" \
+" cbnz %w0, 1b\n" \
" dmb ish\n" \
"3:\n" \
" .pushsection .fixup,\"ax\"\n" \
@@ -56,23 +56,23 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
switch (op) {
case FUTEX_OP_SET:
- __futex_atomic_op("mov %w0, %w4",
+ __futex_atomic_op("mov %w3, %w4",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_ADD:
- __futex_atomic_op("add %w0, %w1, %w4",
+ __futex_atomic_op("add %w3, %w1, %w4",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_OR:
- __futex_atomic_op("orr %w0, %w1, %w4",
+ __futex_atomic_op("orr %w3, %w1, %w4",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_ANDN:
- __futex_atomic_op("and %w0, %w1, %w4",
+ __futex_atomic_op("and %w3, %w1, %w4",
ret, oldval, uaddr, tmp, ~oparg);
break;
case FUTEX_OP_XOR:
- __futex_atomic_op("eor %w0, %w1, %w4",
+ __futex_atomic_op("eor %w3, %w1, %w4",
ret, oldval, uaddr, tmp, oparg);
break;
default:
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 0c721bd..12e073f 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -535,8 +535,7 @@
/* GICv3 system register access */
mrs x0, id_aa64pfr0_el1
ubfx x0, x0, #24, #4
- cmp x0, #1
- b.ne 3f
+ cbz x0, 3f
mrs_s x0, ICC_SRE_EL2
orr x0, x0, #ICC_SRE_EL2_SRE // Set ICC_SRE_EL2.SRE==1
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index e017a94..72a660a 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -231,24 +231,33 @@ int kgdb_arch_handle_exception(int exception_vector, int signo,
static int kgdb_brk_fn(struct pt_regs *regs, unsigned int esr)
{
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
kgdb_handle_exception(1, SIGTRAP, 0, regs);
- return 0;
+ return DBG_HOOK_HANDLED;
}
NOKPROBE_SYMBOL(kgdb_brk_fn)
static int kgdb_compiled_brk_fn(struct pt_regs *regs, unsigned int esr)
{
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
compiled_break = 1;
kgdb_handle_exception(1, SIGTRAP, 0, regs);
- return 0;
+ return DBG_HOOK_HANDLED;
}
NOKPROBE_SYMBOL(kgdb_compiled_brk_fn);
static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned int esr)
{
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
kgdb_handle_exception(1, SIGTRAP, 0, regs);
- return 0;
+ return DBG_HOOK_HANDLED;
}
NOKPROBE_SYMBOL(kgdb_step_brk_fn);
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index d2b1b62..17f6471 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -450,6 +450,9 @@ kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr)
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
int retval;
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
/* return error if this is not our step */
retval = kprobe_ss_hit(kcb, instruction_pointer(regs));
@@ -466,6 +469,9 @@ kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr)
int __kprobes
kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr)
{
+ if (user_mode(regs))
+ return DBG_HOOK_ERROR;
+
kprobe_handler(regs);
return DBG_HOOK_HANDLED;
}
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
old mode 100644
new mode 100755
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c1d02d1..c568c51 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -721,11 +721,12 @@ void __init hook_debug_fault_code(int nr,
debug_fault_info[nr].name = name;
}
-asmlinkage int __exception do_debug_exception(unsigned long addr,
+asmlinkage int __exception do_debug_exception(unsigned long addr_if_watchpoint,
unsigned int esr,
struct pt_regs *regs)
{
const struct fault_info *inf = debug_fault_info + DBG_ESR_EVT(esr);
+ unsigned long pc = instruction_pointer(regs);
struct siginfo info;
int rv;
@@ -736,19 +737,19 @@ asmlinkage int __exception do_debug_exception(unsigned long addr,
if (interrupts_enabled(regs))
trace_hardirqs_off();
- if (user_mode(regs) && instruction_pointer(regs) > TASK_SIZE)
+ if (user_mode(regs) && pc > TASK_SIZE)
arm64_apply_bp_hardening();
- if (!inf->fn(addr, esr, regs)) {
+ if (!inf->fn(addr_if_watchpoint, esr, regs)) {
rv = 1;
} else {
pr_alert("Unhandled debug exception: %s (0x%08x) at 0x%016lx\n",
- inf->name, esr, addr);
+ inf->name, esr, pc);
info.si_signo = inf->sig;
info.si_errno = 0;
info.si_code = inf->code;
- info.si_addr = (void __user *)addr;
+ info.si_addr = (void __user *)pc;
arm64_notify_die("", regs, &info, 0);
rv = 0;
}
diff --git a/arch/h8300/Makefile b/arch/h8300/Makefile
index e1c02ca..073bba6 100644
--- a/arch/h8300/Makefile
+++ b/arch/h8300/Makefile
@@ -23,7 +23,7 @@
LDFLAGS += $(ldflags-y)
ifeq ($(CROSS_COMPILE),)
-CROSS_COMPILE := h8300-unknown-linux-
+CROSS_COMPILE := $(call cc-cross-prefix, h8300-unknown-linux- h8300-linux-)
endif
core-y += arch/$(ARCH)/kernel/ arch/$(ARCH)/mm/
diff --git a/arch/m68k/Makefile b/arch/m68k/Makefile
index f0dd9fc..a229d28 100644
--- a/arch/m68k/Makefile
+++ b/arch/m68k/Makefile
@@ -58,7 +58,10 @@
cpuflags-$(CONFIG_M5206) := $(call cc-option,-mcpu=5206,-m5200)
KBUILD_AFLAGS += $(cpuflags-y)
-KBUILD_CFLAGS += $(cpuflags-y) -pipe
+KBUILD_CFLAGS += $(cpuflags-y)
+
+KBUILD_CFLAGS += -pipe -ffreestanding
+
ifdef CONFIG_MMU
# without -fno-strength-reduce the 53c7xx.c driver fails ;-(
KBUILD_CFLAGS += -fno-strength-reduce -ffixed-a2
diff --git a/arch/m68k/kernel/time.c b/arch/m68k/kernel/time.c
index 4e5aa2f..87160b4 100644
--- a/arch/m68k/kernel/time.c
+++ b/arch/m68k/kernel/time.c
@@ -14,6 +14,7 @@
#include <linux/export.h>
#include <linux/module.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/kernel.h>
#include <linux/param.h>
#include <linux/string.h>
diff --git a/arch/microblaze/kernel/heartbeat.c b/arch/microblaze/kernel/heartbeat.c
index 4643e3a..2022130 100644
--- a/arch/microblaze/kernel/heartbeat.c
+++ b/arch/microblaze/kernel/heartbeat.c
@@ -9,6 +9,7 @@
*/
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/io.h>
#include <asm/setup.h>
diff --git a/arch/mips/include/asm/jump_label.h b/arch/mips/include/asm/jump_label.h
index e776725..e4456e4 100644
--- a/arch/mips/include/asm/jump_label.h
+++ b/arch/mips/include/asm/jump_label.h
@@ -21,15 +21,15 @@
#endif
#ifdef CONFIG_CPU_MICROMIPS
-#define NOP_INSN "nop32"
+#define B_INSN "b32"
#else
-#define NOP_INSN "nop"
+#define B_INSN "b"
#endif
static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
{
- asm_volatile_goto("1:\t" NOP_INSN "\n\t"
- "nop\n\t"
+ asm_volatile_goto("1:\t" B_INSN " 2f\n\t"
+ "2:\tnop\n\t"
".pushsection __jump_table, \"aw\"\n\t"
WORD_INSN " 1b, %l[l_yes], %0\n\t"
".popsection\n\t"
diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
index 7913a5c..b9c7887 100644
--- a/arch/mips/kernel/scall64-o32.S
+++ b/arch/mips/kernel/scall64-o32.S
@@ -125,7 +125,7 @@
subu t1, v0, __NR_O32_Linux
move a1, v0
bnez t1, 1f /* __NR_syscall at offset 0 */
- lw a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
+ ld a1, PT_R4(sp) /* Arg1 for __NR_syscall case */
.set pop
1: jal syscall_trace_enter
diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index f0a0e6d..2d965d9 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -138,6 +138,13 @@
PERCPU_SECTION(1 << CONFIG_MIPS_L1_CACHE_SHIFT)
#endif
+#ifdef CONFIG_MIPS_ELF_APPENDED_DTB
+ .appended_dtb : AT(ADDR(.appended_dtb) - LOAD_OFFSET) {
+ *(.appended_dtb)
+ KEEP(*(.appended_dtb))
+ }
+#endif
+
#ifdef CONFIG_RELOCATABLE
. = ALIGN(4);
@@ -162,11 +169,6 @@
__appended_dtb = .;
/* leave space for appended DTB */
. += 0x100000;
-#elif defined(CONFIG_MIPS_ELF_APPENDED_DTB)
- .appended_dtb : AT(ADDR(.appended_dtb) - LOAD_OFFSET) {
- *(.appended_dtb)
- KEEP(*(.appended_dtb))
- }
#endif
/*
* Align to 64K in attempt to eliminate holes before the
diff --git a/arch/mips/loongson64/lemote-2f/irq.c b/arch/mips/loongson64/lemote-2f/irq.c
index cab5f43..d371f02 100644
--- a/arch/mips/loongson64/lemote-2f/irq.c
+++ b/arch/mips/loongson64/lemote-2f/irq.c
@@ -102,7 +102,7 @@ static struct irqaction ip6_irqaction = {
static struct irqaction cascade_irqaction = {
.handler = no_action,
.name = "cascade",
- .flags = IRQF_NO_THREAD,
+ .flags = IRQF_NO_THREAD | IRQF_NO_SUSPEND,
};
void __init mach_init_irq(void)
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index 2e674e1..656984e 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -323,6 +323,8 @@ extern int _parisc_requires_coherency;
#define parisc_requires_coherency() (0)
#endif
+extern int running_on_qemu;
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_PARISC_PROCESSOR_H */
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index c3a532a..2e5216c 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -206,12 +206,6 @@ void __cpuidle arch_cpu_idle(void)
static int __init parisc_idle_init(void)
{
- const char *marker;
-
- /* check QEMU/SeaBIOS marker in PAGE0 */
- marker = (char *) &PAGE0->pad0;
- running_on_qemu = (memcmp(marker, "SeaBIOS", 8) == 0);
-
if (!running_on_qemu)
cpu_idle_poll_ctrl(1);
diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c
index 2e66a88..581b0c6 100644
--- a/arch/parisc/kernel/setup.c
+++ b/arch/parisc/kernel/setup.c
@@ -403,6 +403,9 @@ void start_parisc(void)
int ret, cpunum;
struct pdc_coproc_cfg coproc_cfg;
+ /* check QEMU/SeaBIOS marker in PAGE0 */
+ running_on_qemu = (memcmp(&PAGE0->pad0, "SeaBIOS", 8) == 0);
+
cpunum = smp_processor_id();
set_firmware_width_unlocked();
diff --git a/arch/parisc/kernel/time.c b/arch/parisc/kernel/time.c
index 47ef8fd..22754e0 100644
--- a/arch/parisc/kernel/time.c
+++ b/arch/parisc/kernel/time.c
@@ -299,7 +299,7 @@ static int __init init_cr16_clocksource(void)
* The cr16 interval timers are not syncronized across CPUs, so mark
* them unstable and lower rating on SMP systems.
*/
- if (num_online_cpus() > 1) {
+ if (num_online_cpus() > 1 && !running_on_qemu) {
clocksource_cr16.flags = CLOCK_SOURCE_UNSTABLE;
clocksource_cr16.rating = 0;
}
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index bc11d709f..062b006 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -129,7 +129,7 @@
select ARCH_HAS_GCOV_PROFILE_ALL
select GENERIC_SMP_IDLE_THREAD
select GENERIC_CMOS_UPDATE
- select GENERIC_CPU_VULNERABILITIES if PPC_BOOK3S_64
+ select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
select GENERIC_TIME_VSYSCALL_OLD
select GENERIC_CLOCKEVENTS
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
@@ -165,6 +165,11 @@
select HAVE_ARCH_HARDENED_USERCOPY
select HAVE_KERNEL_GZIP
+config PPC_BARRIER_NOSPEC
+ bool
+ default y
+ depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
+
config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index e0baba1..f3daa17 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -121,4 +121,10 @@ extern s64 __ashrdi3(s64, int);
extern int __cmpdi2(s64, s64);
extern int __ucmpdi2(u64, u64);
+/* Patch sites */
+extern s32 patch__call_flush_count_cache;
+extern s32 patch__flush_count_cache_return;
+
+extern long flush_count_cache;
+
#endif /* _ASM_POWERPC_ASM_PROTOTYPES_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index 798ab37..80024c4 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -77,6 +77,27 @@ do { \
#define smp_mb__before_spinlock() smp_mb()
+#ifdef CONFIG_PPC_BOOK3S_64
+#define NOSPEC_BARRIER_SLOT nop
+#elif defined(CONFIG_PPC_FSL_BOOK3E)
+#define NOSPEC_BARRIER_SLOT nop; nop
+#endif
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+/*
+ * Prevent execution of subsequent instructions until preceding branches have
+ * been fully resolved and are no longer executing speculatively.
+ */
+#define barrier_nospec_asm NOSPEC_BARRIER_FIXUP_SECTION; NOSPEC_BARRIER_SLOT
+
+// This also acts as a compiler barrier due to the memory clobber.
+#define barrier_nospec() asm (stringify_in_c(barrier_nospec_asm) ::: "memory")
+
+#else /* !CONFIG_PPC_BARRIER_NOSPEC */
+#define barrier_nospec_asm
+#define barrier_nospec()
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
#include <asm-generic/barrier.h>
#endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/powerpc/include/asm/code-patching-asm.h b/arch/powerpc/include/asm/code-patching-asm.h
new file mode 100644
index 0000000..ed7b144
--- /dev/null
+++ b/arch/powerpc/include/asm/code-patching-asm.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Copyright 2018, Michael Ellerman, IBM Corporation.
+ */
+#ifndef _ASM_POWERPC_CODE_PATCHING_ASM_H
+#define _ASM_POWERPC_CODE_PATCHING_ASM_H
+
+/* Define a "site" that can be patched */
+.macro patch_site label name
+ .pushsection ".rodata"
+ .balign 4
+ .global \name
+\name:
+ .4byte \label - .
+ .popsection
+.endm
+
+#endif /* _ASM_POWERPC_CODE_PATCHING_ASM_H */
diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index b4ab1f4..ab934f8 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -28,6 +28,8 @@ unsigned int create_cond_branch(const unsigned int *addr,
unsigned long target, int flags);
int patch_branch(unsigned int *addr, unsigned long target, int flags);
int patch_instruction(unsigned int *addr, unsigned int instr);
+int patch_instruction_site(s32 *addr, unsigned int instr);
+int patch_branch_site(s32 *site, unsigned long target, int flags);
int instr_is_relative_branch(unsigned int instr);
int instr_is_relative_link_branch(unsigned int instr);
diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h
index 0bf8202..175128e 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -213,6 +213,25 @@ void setup_feature_keys(void);
FTR_ENTRY_OFFSET 951b-952b; \
.popsection;
+#define NOSPEC_BARRIER_FIXUP_SECTION \
+953: \
+ .pushsection __barrier_nospec_fixup,"a"; \
+ .align 2; \
+954: \
+ FTR_ENTRY_OFFSET 953b-954b; \
+ .popsection;
+
+#define START_BTB_FLUSH_SECTION \
+955: \
+
+#define END_BTB_FLUSH_SECTION \
+956: \
+ .pushsection __btb_flush_fixup,"a"; \
+ .align 2; \
+957: \
+ FTR_ENTRY_OFFSET 955b-957b; \
+ FTR_ENTRY_OFFSET 956b-957b; \
+ .popsection;
#ifndef __ASSEMBLY__
@@ -220,6 +239,8 @@ extern long stf_barrier_fallback;
extern long __start___stf_entry_barrier_fixup, __stop___stf_entry_barrier_fixup;
extern long __start___stf_exit_barrier_fixup, __stop___stf_exit_barrier_fixup;
extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup;
+extern long __start__btb_flush_fixup, __stop__btb_flush_fixup;
#endif
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 9d97810..9587d30 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -316,10 +316,12 @@
#define H_CPU_CHAR_BRANCH_HINTS_HONORED (1ull << 58) // IBM bit 5
#define H_CPU_CHAR_THREAD_RECONFIG_CTRL (1ull << 57) // IBM bit 6
#define H_CPU_CHAR_COUNT_CACHE_DISABLED (1ull << 56) // IBM bit 7
+#define H_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54) // IBM bit 9
#define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0
#define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1
#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2
+#define H_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58) // IBM bit 5
#ifndef __ASSEMBLY__
#include <linux/types.h>
diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index c4ced1d..48e8f1f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -225,6 +225,7 @@
/* Misc instructions for BPF compiler */
#define PPC_INST_LBZ 0x88000000
#define PPC_INST_LD 0xe8000000
+#define PPC_INST_LDX 0x7c00002a
#define PPC_INST_LHZ 0xa0000000
#define PPC_INST_LWZ 0x80000000
#define PPC_INST_LHBRX 0x7c00062c
@@ -232,6 +233,7 @@
#define PPC_INST_STB 0x98000000
#define PPC_INST_STH 0xb0000000
#define PPC_INST_STD 0xf8000000
+#define PPC_INST_STDX 0x7c00012a
#define PPC_INST_STDU 0xf8000001
#define PPC_INST_STW 0x90000000
#define PPC_INST_STWU 0x94000000
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index c73750b..bbd35ba 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -437,7 +437,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_601)
.machine push ; \
.machine "power4" ; \
lis scratch,0x60000000@h; \
- dcbt r0,scratch,0b01010; \
+ dcbt 0,scratch,0b01010; \
.machine pop
/*
@@ -780,4 +780,25 @@ END_FTR_SECTION_IFCLR(CPU_FTR_601)
.long 0x2400004c /* rfid */
#endif /* !CONFIG_PPC_BOOK3E */
#endif /* __ASSEMBLY__ */
+
+/*
+ * Helper macro for exception table entries
+ */
+#define EX_TABLE(_fault, _target) \
+ stringify_in_c(.section __ex_table,"a";)\
+ stringify_in_c(.balign 4;) \
+ stringify_in_c(.long (_fault) - . ;) \
+ stringify_in_c(.long (_target) - . ;) \
+ stringify_in_c(.previous)
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define BTB_FLUSH(reg) \
+ lis reg,BUCSR_INIT@h; \
+ ori reg,reg,BUCSR_INIT@l; \
+ mtspr SPRN_BUCSR,reg; \
+ isync;
+#else
+#define BTB_FLUSH(reg)
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
#endif /* _ASM_POWERPC_PPC_ASM_H */
diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h
index 44989b2..759597b 100644
--- a/arch/powerpc/include/asm/security_features.h
+++ b/arch/powerpc/include/asm/security_features.h
@@ -22,6 +22,7 @@ enum stf_barrier_type {
void setup_stf_barrier(void);
void do_stf_barrier_fixups(enum stf_barrier_type types);
+void setup_count_cache_flush(void);
static inline void security_ftr_set(unsigned long feature)
{
@@ -59,6 +60,9 @@ static inline bool security_ftr_enabled(unsigned long feature)
// Indirect branch prediction cache disabled
#define SEC_FTR_COUNT_CACHE_DISABLED 0x0000000000000020ull
+// bcctr 2,0,0 triggers a hardware assisted count cache flush
+#define SEC_FTR_BCCTR_FLUSH_ASSIST 0x0000000000000800ull
+
// Features indicating need for Spectre/Meltdown mitigations
@@ -74,6 +78,9 @@ static inline bool security_ftr_enabled(unsigned long feature)
// Firmware configuration indicates user favours security over performance
#define SEC_FTR_FAVOUR_SECURITY 0x0000000000000200ull
+// Software required to flush count cache on context switch
+#define SEC_FTR_FLUSH_COUNT_CACHE 0x0000000000000400ull
+
// Features enabled by default
#define SEC_FTR_DEFAULT \
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 3f160cd..862ebce 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -8,6 +8,7 @@ extern void ppc_printk_progress(char *s, unsigned short hex);
extern unsigned int rtas_data;
extern unsigned long long memory_limit;
+extern bool init_mem_is_free;
extern unsigned long klimit;
extern void *zalloc_maybe_bootmem(size_t size, gfp_t mask);
@@ -50,6 +51,26 @@ enum l1d_flush_type {
void setup_rfi_flush(enum l1d_flush_type, bool enable);
void do_rfi_flush_fixups(enum l1d_flush_type types);
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void setup_barrier_nospec(void);
+#else
+static inline void setup_barrier_nospec(void) { };
+#endif
+void do_barrier_nospec_fixups(bool enable);
+extern bool barrier_nospec_enabled;
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void do_barrier_nospec_fixups_range(bool enable, void *start, void *end);
+#else
+static inline void do_barrier_nospec_fixups_range(bool enable, void *start, void *end) { };
+#endif
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+void setup_spectre_v2(void);
+#else
+static inline void setup_spectre_v2(void) {};
+#endif
+void do_btb_flush_fixups(void);
#endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 8b3b46b..229c91b 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -90,6 +90,8 @@ static inline int prrn_is_enabled(void)
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_core_id(cpu) (cpu_to_core_id(cpu))
+
+int dlpar_cpu_readd(int cpu);
#endif
#endif
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 31913b3..da85215 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -269,6 +269,7 @@ do { \
__chk_user_ptr(ptr); \
if (!is_kernel_addr((unsigned long)__gu_addr)) \
might_fault(); \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__typeof__(*(ptr)))__gu_val; \
__gu_err; \
@@ -280,8 +281,10 @@ do { \
unsigned long __gu_val = 0; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
might_fault(); \
- if (access_ok(VERIFY_READ, __gu_addr, (size))) \
+ if (access_ok(VERIFY_READ, __gu_addr, (size))) { \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
+ } \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err; \
})
@@ -292,6 +295,7 @@ do { \
unsigned long __gu_val; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
__chk_user_ptr(ptr); \
+ barrier_nospec(); \
__get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
__gu_err; \
@@ -348,15 +352,19 @@ static inline unsigned long __copy_from_user_inatomic(void *to,
switch (n) {
case 1:
+ barrier_nospec();
__get_user_size(*(u8 *)to, from, 1, ret);
break;
case 2:
+ barrier_nospec();
__get_user_size(*(u16 *)to, from, 2, ret);
break;
case 4:
+ barrier_nospec();
__get_user_size(*(u32 *)to, from, 4, ret);
break;
case 8:
+ barrier_nospec();
__get_user_size(*(u64 *)to, from, 8, ret);
break;
}
@@ -366,6 +374,7 @@ static inline unsigned long __copy_from_user_inatomic(void *to,
check_object_size(to, n, false);
+ barrier_nospec();
return __copy_tofrom_user((__force void __user *)to, from, n);
}
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1388578..d80fbf0 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -44,9 +44,10 @@
obj-$(CONFIG_VDSO32) += vdso32/
obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_ppc970.o cpu_setup_pa6t.o
-obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_power.o security.o
+obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_power.o
obj-$(CONFIG_PPC_BOOK3S_64) += mce.o mce_power.o
obj-$(CONFIG_PPC_BOOK3E_64) += exceptions-64e.o idle_book3e.o
+obj-$(CONFIG_PPC_BARRIER_NOSPEC) += security.o
obj-$(CONFIG_PPC64) += vdso64/
obj-$(CONFIG_ALTIVEC) += vecemu.o
obj-$(CONFIG_PPC_970_NAP) += idle_power4.o
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 3841d74..bdd88f9 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -34,6 +34,7 @@
#include <asm/ftrace.h>
#include <asm/ptrace.h>
#include <asm/export.h>
+#include <asm/barrier.h>
/*
* MSR_KERNEL is > 0x10000 on 4xx/Book-E since it include MSR_CE.
@@ -347,6 +348,15 @@
ori r10,r10,sys_call_table@l
slwi r0,r0,2
bge- 66f
+
+ barrier_nospec_asm
+ /*
+ * Prevent the load of the handler below (based on the user-passed
+ * system call number) being speculatively executed until the test
+ * against NR_syscalls and branch to .66f above has
+ * committed.
+ */
+
lwzx r10,r10,r0 /* Fetch system call handler [ptr] */
mtlr r10
addi r9,r1,STACK_FRAME_OVERHEAD
@@ -698,6 +708,9 @@
mtcr r10
lwz r10,_LINK(r11)
mtlr r10
+ /* Clear the exception_marker on the stack to avoid confusing stacktrace */
+ li r10, 0
+ stw r10, 8(r11)
REST_GPR(10, r11)
mtspr SPRN_SRR1,r9
mtspr SPRN_SRR0,r12
@@ -932,6 +945,9 @@
mtcrf 0xFF,r10
mtlr r11
+ /* Clear the exception_marker on the stack to avoid confusing stacktrace */
+ li r10, 0
+ stw r10, 8(r1)
/*
* Once we put values in SRR0 and SRR1, we are in a state
* where exceptions are not recoverable, since taking an
@@ -969,6 +985,9 @@
mtlr r11
lwz r10,_CCR(r1)
mtcrf 0xff,r10
+ /* Clear the exception_marker on the stack to avoid confusing stacktrace */
+ li r10, 0
+ stw r10, 8(r1)
REST_2GPRS(9, r1)
.globl exc_exit_restart
exc_exit_restart:
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index e24ae0f..390ebf4 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -26,6 +26,7 @@
#include <asm/page.h>
#include <asm/mmu.h>
#include <asm/thread_info.h>
+#include <asm/code-patching-asm.h>
#include <asm/ppc_asm.h>
#include <asm/asm-offsets.h>
#include <asm/cputable.h>
@@ -38,6 +39,7 @@
#include <asm/context_tracking.h>
#include <asm/tm.h>
#include <asm/ppc-opcode.h>
+#include <asm/barrier.h>
#include <asm/export.h>
#ifdef CONFIG_PPC_BOOK3S
#include <asm/exception-64s.h>
@@ -78,6 +80,11 @@
std r0,GPR0(r1)
std r10,GPR1(r1)
beq 2f /* if from kernel mode */
+#ifdef CONFIG_PPC_FSL_BOOK3E
+START_BTB_FLUSH_SECTION
+ BTB_FLUSH(r10)
+END_BTB_FLUSH_SECTION
+#endif
ACCOUNT_CPU_USER_ENTRY(r13, r10, r11)
2: std r2,GPR2(r1)
std r3,GPR3(r1)
@@ -180,6 +187,15 @@
clrldi r8,r8,32
15:
slwi r0,r0,4
+
+ barrier_nospec_asm
+ /*
+ * Prevent the load of the handler below (based on the user-passed
+ * system call number) being speculatively executed until the test
+ * against NR_syscalls and branch to .Lsyscall_enosys above has
+ * committed.
+ */
+
ldx r12,r11,r0 /* Fetch system call handler [ptr] */
mtctr r12
bctrl /* Call handler */
@@ -473,6 +489,57 @@
li r3,0
b .Lsyscall_exit
+#ifdef CONFIG_PPC_BOOK3S_64
+
+#define FLUSH_COUNT_CACHE \
+1: nop; \
+ patch_site 1b, patch__call_flush_count_cache
+
+
+#define BCCTR_FLUSH .long 0x4c400420
+
+.macro nops number
+ .rept \number
+ nop
+ .endr
+.endm
+
+.balign 32
+.global flush_count_cache
+flush_count_cache:
+ /* Save LR into r9 */
+ mflr r9
+
+ .rept 64
+ bl .+4
+ .endr
+ b 1f
+ nops 6
+
+ .balign 32
+ /* Restore LR */
+1: mtlr r9
+ li r9,0x7fff
+ mtctr r9
+
+ BCCTR_FLUSH
+
+2: nop
+ patch_site 2b patch__flush_count_cache_return
+
+ nops 3
+
+ .rept 278
+ .balign 32
+ BCCTR_FLUSH
+ nops 7
+ .endr
+
+ blr
+#else
+#define FLUSH_COUNT_CACHE
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
/*
* This routine switches between two different tasks. The process
* state of one is saved on its kernel stack. Then the state
@@ -504,6 +571,8 @@
std r23,_CCR(r1)
std r1,KSP(r3) /* Set old stack pointer */
+ FLUSH_COUNT_CACHE
+
#ifdef CONFIG_SMP
/* We need a sync somewhere here to make sure that if the
* previous task gets rescheduled on another CPU, it sees all
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index ca03eb2..423b525 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -295,7 +295,8 @@
andi. r10,r11,MSR_PR; /* save stack pointer */ \
beq 1f; /* branch around if supervisor */ \
ld r1,PACAKSAVE(r13); /* get kernel stack coming from usr */\
-1: cmpdi cr1,r1,0; /* check if SP makes sense */ \
+1: type##_BTB_FLUSH \
+ cmpdi cr1,r1,0; /* check if SP makes sense */ \
bge- cr1,exc_##n##_bad_stack;/* bad stack (TODO: out of line) */ \
mfspr r10,SPRN_##type##_SRR0; /* read SRR0 before touching stack */
@@ -327,6 +328,30 @@
#define SPRN_MC_SRR0 SPRN_MCSRR0
#define SPRN_MC_SRR1 SPRN_MCSRR1
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define GEN_BTB_FLUSH \
+ START_BTB_FLUSH_SECTION \
+ beq 1f; \
+ BTB_FLUSH(r10) \
+ 1: \
+ END_BTB_FLUSH_SECTION
+
+#define CRIT_BTB_FLUSH \
+ START_BTB_FLUSH_SECTION \
+ BTB_FLUSH(r10) \
+ END_BTB_FLUSH_SECTION
+
+#define DBG_BTB_FLUSH CRIT_BTB_FLUSH
+#define MC_BTB_FLUSH CRIT_BTB_FLUSH
+#define GDBELL_BTB_FLUSH GEN_BTB_FLUSH
+#else
+#define GEN_BTB_FLUSH
+#define CRIT_BTB_FLUSH
+#define DBG_BTB_FLUSH
+#define MC_BTB_FLUSH
+#define GDBELL_BTB_FLUSH
+#endif
+
#define NORMAL_EXCEPTION_PROLOG(n, intnum, addition) \
EXCEPTION_PROLOG(n, intnum, GEN, addition##_GEN(n))
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index a620203..7b98c73 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -31,6 +31,16 @@
*/
#define THREAD_NORMSAVE(offset) (THREAD_NORMSAVES + (offset * 4))
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#define BOOKE_CLEAR_BTB(reg) \
+START_BTB_FLUSH_SECTION \
+ BTB_FLUSH(reg) \
+END_BTB_FLUSH_SECTION
+#else
+#define BOOKE_CLEAR_BTB(reg)
+#endif
+
+
#define NORMAL_EXCEPTION_PROLOG(intno) \
mtspr SPRN_SPRG_WSCRATCH0, r10; /* save one register */ \
mfspr r10, SPRN_SPRG_THREAD; \
@@ -42,6 +52,7 @@
andi. r11, r11, MSR_PR; /* check whether user or kernel */\
mr r11, r1; \
beq 1f; \
+ BOOKE_CLEAR_BTB(r11) \
/* if from user, start at top of this thread's kernel stack */ \
lwz r11, THREAD_INFO-THREAD(r10); \
ALLOC_STACK_FRAME(r11, THREAD_SIZE); \
@@ -127,6 +138,7 @@
stw r9,_CCR(r8); /* save CR on stack */\
mfspr r11,exc_level_srr1; /* check whether user or kernel */\
DO_KVM BOOKE_INTERRUPT_##intno exc_level_srr1; \
+ BOOKE_CLEAR_BTB(r10) \
andi. r11,r11,MSR_PR; \
mfspr r11,SPRN_SPRG_THREAD; /* if from user, start at top of */\
lwz r11,THREAD_INFO-THREAD(r11); /* this thread's kernel stack */\
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index bf4c602..60a0aee 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -452,6 +452,13 @@
mfcr r13
stw r13, THREAD_NORMSAVE(3)(r10)
DO_KVM BOOKE_INTERRUPT_DTLB_MISS SPRN_SRR1
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
mfspr r10, SPRN_DEAR /* Get faulting address */
/* If we are faulting a kernel address, we have to use the
@@ -546,6 +553,14 @@
mfcr r13
stw r13, THREAD_NORMSAVE(3)(r10)
DO_KVM BOOKE_INTERRUPT_ITLB_MISS SPRN_SRR1
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
+
mfspr r10, SPRN_SRR0 /* Get faulting address */
/* If we are faulting a kernel address, we have to use the
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index 30b89d5..3b1c3bb 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -72,7 +72,15 @@ int module_finalize(const Elf_Ehdr *hdr,
do_feature_fixups(powerpc_firmware_features,
(void *)sect->sh_addr,
(void *)sect->sh_addr + sect->sh_size);
-#endif
+#endif /* CONFIG_PPC64 */
+
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+ sect = find_section(hdr, sechdrs, "__spec_barrier_fixup");
+ if (sect != NULL)
+ do_barrier_nospec_fixups_range(barrier_nospec_enabled,
+ (void *)sect->sh_addr,
+ (void *)sect->sh_addr + sect->sh_size);
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
sect = find_section(hdr, sechdrs, "__lwsync_fixup");
if (sect != NULL)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 1c141d5..609f0e8 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -153,7 +153,7 @@ void __giveup_fpu(struct task_struct *tsk)
save_fpu(tsk);
msr = tsk->thread.regs->msr;
- msr &= ~MSR_FP;
+ msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX))
msr &= ~MSR_VSX;
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index adfa63e..4f28296 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -547,6 +547,7 @@ static int vr_get(struct task_struct *target, const struct user_regset *regset,
/*
* Copy out only the low-order word of vrsave.
*/
+ int start, end;
union {
elf_vrreg_t reg;
u32 word;
@@ -555,8 +556,10 @@ static int vr_get(struct task_struct *target, const struct user_regset *regset,
vrsave.word = target->thread.vrsave;
+ start = 33 * sizeof(vector128);
+ end = start + sizeof(vrsave);
ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &vrsave,
- 33 * sizeof(vector128), -1);
+ start, end);
}
return ret;
@@ -594,6 +597,7 @@ static int vr_set(struct task_struct *target, const struct user_regset *regset,
/*
* We use only the first word of vrsave.
*/
+ int start, end;
union {
elf_vrreg_t reg;
u32 word;
@@ -602,8 +606,10 @@ static int vr_set(struct task_struct *target, const struct user_regset *regset,
vrsave.word = target->thread.vrsave;
+ start = 33 * sizeof(vector128);
+ end = start + sizeof(vrsave);
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &vrsave,
- 33 * sizeof(vector128), -1);
+ start, end);
if (!ret)
target->thread.vrsave = vrsave.word;
}
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 2277df8..30542e8 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -9,11 +9,121 @@
#include <linux/device.h>
#include <linux/seq_buf.h>
+#include <asm/asm-prototypes.h>
+#include <asm/code-patching.h>
+#include <asm/debug.h>
#include <asm/security_features.h>
+#include <asm/setup.h>
unsigned long powerpc_security_features __read_mostly = SEC_FTR_DEFAULT;
+enum count_cache_flush_type {
+ COUNT_CACHE_FLUSH_NONE = 0x1,
+ COUNT_CACHE_FLUSH_SW = 0x2,
+ COUNT_CACHE_FLUSH_HW = 0x4,
+};
+static enum count_cache_flush_type count_cache_flush_type = COUNT_CACHE_FLUSH_NONE;
+
+bool barrier_nospec_enabled;
+static bool no_nospec;
+static bool btb_flush_enabled;
+#ifdef CONFIG_PPC_FSL_BOOK3E
+static bool no_spectrev2;
+#endif
+
+static void enable_barrier_nospec(bool enable)
+{
+ barrier_nospec_enabled = enable;
+ do_barrier_nospec_fixups(enable);
+}
+
+void setup_barrier_nospec(void)
+{
+ bool enable;
+
+ /*
+ * It would make sense to check SEC_FTR_SPEC_BAR_ORI31 below as well.
+ * But there's a good reason not to. The two flags we check below are
+ * both are enabled by default in the kernel, so if the hcall is not
+ * functional they will be enabled.
+ * On a system where the host firmware has been updated (so the ori
+ * functions as a barrier), but on which the hypervisor (KVM/Qemu) has
+ * not been updated, we would like to enable the barrier. Dropping the
+ * check for SEC_FTR_SPEC_BAR_ORI31 achieves that. The only downside is
+ * we potentially enable the barrier on systems where the host firmware
+ * is not updated, but that's harmless as it's a no-op.
+ */
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+ security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
+
+ if (!no_nospec)
+ enable_barrier_nospec(enable);
+}
+
+static int __init handle_nospectre_v1(char *p)
+{
+ no_nospec = true;
+
+ return 0;
+}
+early_param("nospectre_v1", handle_nospectre_v1);
+
+#ifdef CONFIG_DEBUG_FS
+static int barrier_nospec_set(void *data, u64 val)
+{
+ switch (val) {
+ case 0:
+ case 1:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!!val == !!barrier_nospec_enabled)
+ return 0;
+
+ enable_barrier_nospec(!!val);
+
+ return 0;
+}
+
+static int barrier_nospec_get(void *data, u64 *val)
+{
+ *val = barrier_nospec_enabled ? 1 : 0;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec,
+ barrier_nospec_get, barrier_nospec_set, "%llu\n");
+
+static __init int barrier_nospec_debugfs_init(void)
+{
+ debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL,
+ &fops_barrier_nospec);
+ return 0;
+}
+device_initcall(barrier_nospec_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+static int __init handle_nospectre_v2(char *p)
+{
+ no_spectrev2 = true;
+
+ return 0;
+}
+early_param("nospectre_v2", handle_nospectre_v2);
+void setup_spectre_v2(void)
+{
+ if (no_spectrev2)
+ do_btb_flush_fixups();
+ else
+ btb_flush_enabled = true;
+}
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
+#ifdef CONFIG_PPC_BOOK3S_64
ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
{
bool thread_priv;
@@ -46,25 +156,39 @@ ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, cha
return sprintf(buf, "Vulnerable\n");
}
+#endif
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
{
- if (!security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR))
- return sprintf(buf, "Not affected\n");
+ struct seq_buf s;
- return sprintf(buf, "Vulnerable\n");
+ seq_buf_init(&s, buf, PAGE_SIZE - 1);
+
+ if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
+ if (barrier_nospec_enabled)
+ seq_buf_printf(&s, "Mitigation: __user pointer sanitization");
+ else
+ seq_buf_printf(&s, "Vulnerable");
+
+ if (security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31))
+ seq_buf_printf(&s, ", ori31 speculation barrier enabled");
+
+ seq_buf_printf(&s, "\n");
+ } else
+ seq_buf_printf(&s, "Not affected\n");
+
+ return s.len;
}
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
{
- bool bcs, ccd, ori;
struct seq_buf s;
+ bool bcs, ccd;
seq_buf_init(&s, buf, PAGE_SIZE - 1);
bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED);
ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED);
- ori = security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31);
if (bcs || ccd) {
seq_buf_printf(&s, "Mitigation: ");
@@ -77,17 +201,23 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
if (ccd)
seq_buf_printf(&s, "Indirect branch cache disabled");
- } else
- seq_buf_printf(&s, "Vulnerable");
+ } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) {
+ seq_buf_printf(&s, "Mitigation: Software count cache flush");
- if (ori)
- seq_buf_printf(&s, ", ori31 speculation barrier enabled");
+ if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW)
+ seq_buf_printf(&s, " (hardware accelerated)");
+ } else if (btb_flush_enabled) {
+ seq_buf_printf(&s, "Mitigation: Branch predictor state flush");
+ } else {
+ seq_buf_printf(&s, "Vulnerable");
+ }
seq_buf_printf(&s, "\n");
return s.len;
}
+#ifdef CONFIG_PPC_BOOK3S_64
/*
* Store-forwarding barrier support.
*/
@@ -235,3 +365,71 @@ static __init int stf_barrier_debugfs_init(void)
}
device_initcall(stf_barrier_debugfs_init);
#endif /* CONFIG_DEBUG_FS */
+
+static void toggle_count_cache_flush(bool enable)
+{
+ if (!enable || !security_ftr_enabled(SEC_FTR_FLUSH_COUNT_CACHE)) {
+ patch_instruction_site(&patch__call_flush_count_cache, PPC_INST_NOP);
+ count_cache_flush_type = COUNT_CACHE_FLUSH_NONE;
+ pr_info("count-cache-flush: software flush disabled.\n");
+ return;
+ }
+
+ patch_branch_site(&patch__call_flush_count_cache,
+ (u64)&flush_count_cache, BRANCH_SET_LINK);
+
+ if (!security_ftr_enabled(SEC_FTR_BCCTR_FLUSH_ASSIST)) {
+ count_cache_flush_type = COUNT_CACHE_FLUSH_SW;
+ pr_info("count-cache-flush: full software flush sequence enabled.\n");
+ return;
+ }
+
+ patch_instruction_site(&patch__flush_count_cache_return, PPC_INST_BLR);
+ count_cache_flush_type = COUNT_CACHE_FLUSH_HW;
+ pr_info("count-cache-flush: hardware assisted flush sequence enabled\n");
+}
+
+void setup_count_cache_flush(void)
+{
+ toggle_count_cache_flush(true);
+}
+
+#ifdef CONFIG_DEBUG_FS
+static int count_cache_flush_set(void *data, u64 val)
+{
+ bool enable;
+
+ if (val == 1)
+ enable = true;
+ else if (val == 0)
+ enable = false;
+ else
+ return -EINVAL;
+
+ toggle_count_cache_flush(enable);
+
+ return 0;
+}
+
+static int count_cache_flush_get(void *data, u64 *val)
+{
+ if (count_cache_flush_type == COUNT_CACHE_FLUSH_NONE)
+ *val = 0;
+ else
+ *val = 1;
+
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get,
+ count_cache_flush_set, "%llu\n");
+
+static __init int count_cache_flush_debugfs_init(void)
+{
+ debugfs_create_file("count_cache_flush", 0600, powerpc_debugfs_root,
+ NULL, &fops_count_cache_flush);
+ return 0;
+}
+device_initcall(count_cache_flush_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+#endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index bf0f712..5e7d70c 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -918,6 +918,9 @@ void __init setup_arch(char **cmdline_p)
if (ppc_md.setup_arch)
ppc_md.setup_arch();
+ setup_barrier_nospec();
+ setup_spectre_v2();
+
paging_init();
/* Initialize the MMU context management stuff. */
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index d929afa..bdf2f7b 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -746,12 +746,25 @@ int sys_rt_sigreturn(unsigned long r3, unsigned long r4, unsigned long r5,
if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
&uc_transact->uc_mcontext))
goto badframe;
- }
- else
- /* Fall through, for non-TM restore */
+ } else
#endif
- if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
- goto badframe;
+ {
+ /*
+ * Fall through, for non-TM restore
+ *
+ * Unset MSR[TS] on the thread regs since MSR from user
+ * context does not have MSR active, and recheckpoint was
+ * not called since restore_tm_sigcontexts() was not called
+ * also.
+ *
+ * If not unsetting it, the code can RFID to userspace with
+ * MSR[TS] set, but without CPU in the proper state,
+ * causing a TM bad thing.
+ */
+ current->thread.regs->msr &= ~MSR_TS_MASK;
+ if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
+ goto badframe;
+ }
if (restore_altstack(&uc->uc_stack))
goto badframe;
diff --git a/arch/powerpc/kernel/swsusp_asm64.S b/arch/powerpc/kernel/swsusp_asm64.S
index 988f38d..82d8aae 100644
--- a/arch/powerpc/kernel/swsusp_asm64.S
+++ b/arch/powerpc/kernel/swsusp_asm64.S
@@ -179,7 +179,7 @@
sld r3, r3, r0
li r0, 0
1:
- dcbf r0,r3
+ dcbf 0,r3
addi r3,r3,0x20
bdnz 1b
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index c16fddb..50d3650 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -153,8 +153,25 @@
*(__rfi_flush_fixup)
__stop___rfi_flush_fixup = .;
}
-#endif
+#endif /* CONFIG_PPC64 */
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+ . = ALIGN(8);
+ __spec_barrier_fixup : AT(ADDR(__spec_barrier_fixup) - LOAD_OFFSET) {
+ __start___barrier_nospec_fixup = .;
+ *(__barrier_nospec_fixup)
+ __stop___barrier_nospec_fixup = .;
+ }
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+ . = ALIGN(8);
+ __spec_btb_flush_fixup : AT(ADDR(__spec_btb_flush_fixup) - LOAD_OFFSET) {
+ __start__btb_flush_fixup = .;
+ *(__btb_flush_fixup)
+ __stop__btb_flush_fixup = .;
+ }
+#endif
EXCEPTION_TABLE(0)
NOTES :kernel :notes
diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S
index 81bd8a07..612b7f6 100644
--- a/arch/powerpc/kvm/bookehv_interrupts.S
+++ b/arch/powerpc/kvm/bookehv_interrupts.S
@@ -75,6 +75,10 @@
PPC_LL r1, VCPU_HOST_STACK(r4)
PPC_LL r2, HOST_R2(r1)
+START_BTB_FLUSH_SECTION
+ BTB_FLUSH(r10)
+END_BTB_FLUSH_SECTION
+
mfspr r10, SPRN_PID
lwz r8, VCPU_HOST_PID(r4)
PPC_LL r11, VCPU_SHARED(r4)
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index 990db69..fa88f64 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -277,6 +277,13 @@ int kvmppc_core_emulate_mtspr_e500(struct kvm_vcpu *vcpu, int sprn, ulong spr_va
vcpu->arch.pwrmgtcr0 = spr_val;
break;
+ case SPRN_BUCSR:
+ /*
+ * If we are here, it means that we have already flushed the
+ * branch predictor, so just return to guest.
+ */
+ break;
+
/* extra exceptions */
#ifdef CONFIG_SPE_POSSIBLE
case SPRN_IVOR32:
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 753d591..14535ad 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -14,12 +14,20 @@
#include <asm/page.h>
#include <asm/code-patching.h>
#include <asm/uaccess.h>
+#include <asm/setup.h>
+#include <asm/sections.h>
int patch_instruction(unsigned int *addr, unsigned int instr)
{
int err;
+ /* Make sure we aren't patching a freed init section */
+ if (init_mem_is_free && init_section_contains(addr, 4)) {
+ pr_debug("Skipping init section patching addr: 0x%px\n", addr);
+ return 0;
+ }
+
__put_user_size(instr, addr, 4, err);
if (err)
return err;
@@ -32,6 +40,22 @@ int patch_branch(unsigned int *addr, unsigned long target, int flags)
return patch_instruction(addr, create_branch(addr, target, flags));
}
+int patch_branch_site(s32 *site, unsigned long target, int flags)
+{
+ unsigned int *addr;
+
+ addr = (unsigned int *)((unsigned long)site + *site);
+ return patch_instruction(addr, create_branch(addr, target, flags));
+}
+
+int patch_instruction_site(s32 *site, unsigned int instr)
+{
+ unsigned int *addr;
+
+ addr = (unsigned int *)((unsigned long)site + *site);
+ return patch_instruction(addr, instr);
+}
+
unsigned int create_branch(const unsigned int *addr,
unsigned long target, int flags)
{
diff --git a/arch/powerpc/lib/copypage_power7.S b/arch/powerpc/lib/copypage_power7.S
index a84d333..ca5fc8f 100644
--- a/arch/powerpc/lib/copypage_power7.S
+++ b/arch/powerpc/lib/copypage_power7.S
@@ -45,13 +45,13 @@
.machine push
.machine "power4"
/* setup read stream 0 */
- dcbt r0,r4,0b01000 /* addr from */
- dcbt r0,r7,0b01010 /* length and depth from */
+ dcbt 0,r4,0b01000 /* addr from */
+ dcbt 0,r7,0b01010 /* length and depth from */
/* setup write stream 1 */
- dcbtst r0,r9,0b01000 /* addr to */
- dcbtst r0,r10,0b01010 /* length and depth to */
+ dcbtst 0,r9,0b01000 /* addr to */
+ dcbtst 0,r10,0b01010 /* length and depth to */
eieio
- dcbt r0,r8,0b01010 /* all streams GO */
+ dcbt 0,r8,0b01010 /* all streams GO */
.machine pop
#ifdef CONFIG_ALTIVEC
@@ -83,7 +83,7 @@
li r12,112
.align 5
-1: lvx v7,r0,r4
+1: lvx v7,0,r4
lvx v6,r4,r6
lvx v5,r4,r7
lvx v4,r4,r8
@@ -92,7 +92,7 @@
lvx v1,r4,r11
lvx v0,r4,r12
addi r4,r4,128
- stvx v7,r0,r3
+ stvx v7,0,r3
stvx v6,r3,r6
stvx v5,r3,r7
stvx v4,r3,r8
diff --git a/arch/powerpc/lib/copyuser_power7.S b/arch/powerpc/lib/copyuser_power7.S
index da0c568..3916948 100644
--- a/arch/powerpc/lib/copyuser_power7.S
+++ b/arch/powerpc/lib/copyuser_power7.S
@@ -327,13 +327,13 @@
.machine push
.machine "power4"
/* setup read stream 0 */
- dcbt r0,r6,0b01000 /* addr from */
- dcbt r0,r7,0b01010 /* length and depth from */
+ dcbt 0,r6,0b01000 /* addr from */
+ dcbt 0,r7,0b01010 /* length and depth from */
/* setup write stream 1 */
- dcbtst r0,r9,0b01000 /* addr to */
- dcbtst r0,r10,0b01010 /* length and depth to */
+ dcbtst 0,r9,0b01000 /* addr to */
+ dcbtst 0,r10,0b01010 /* length and depth to */
eieio
- dcbt r0,r8,0b01010 /* all streams GO */
+ dcbt 0,r8,0b01010 /* all streams GO */
.machine pop
beq cr1,.Lunwind_stack_nonvmx_copy
@@ -388,26 +388,26 @@
li r11,48
bf cr7*4+3,5f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
addi r4,r4,16
-err3; stvx v1,r0,r3
+err3; stvx v1,0,r3
addi r3,r3,16
5: bf cr7*4+2,6f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
err3; lvx v0,r4,r9
addi r4,r4,32
-err3; stvx v1,r0,r3
+err3; stvx v1,0,r3
err3; stvx v0,r3,r9
addi r3,r3,32
6: bf cr7*4+1,7f
-err3; lvx v3,r0,r4
+err3; lvx v3,0,r4
err3; lvx v2,r4,r9
err3; lvx v1,r4,r10
err3; lvx v0,r4,r11
addi r4,r4,64
-err3; stvx v3,r0,r3
+err3; stvx v3,0,r3
err3; stvx v2,r3,r9
err3; stvx v1,r3,r10
err3; stvx v0,r3,r11
@@ -433,7 +433,7 @@
*/
.align 5
8:
-err4; lvx v7,r0,r4
+err4; lvx v7,0,r4
err4; lvx v6,r4,r9
err4; lvx v5,r4,r10
err4; lvx v4,r4,r11
@@ -442,7 +442,7 @@
err4; lvx v1,r4,r15
err4; lvx v0,r4,r16
addi r4,r4,128
-err4; stvx v7,r0,r3
+err4; stvx v7,0,r3
err4; stvx v6,r3,r9
err4; stvx v5,r3,r10
err4; stvx v4,r3,r11
@@ -463,29 +463,29 @@
mtocrf 0x01,r6
bf cr7*4+1,9f
-err3; lvx v3,r0,r4
+err3; lvx v3,0,r4
err3; lvx v2,r4,r9
err3; lvx v1,r4,r10
err3; lvx v0,r4,r11
addi r4,r4,64
-err3; stvx v3,r0,r3
+err3; stvx v3,0,r3
err3; stvx v2,r3,r9
err3; stvx v1,r3,r10
err3; stvx v0,r3,r11
addi r3,r3,64
9: bf cr7*4+2,10f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
err3; lvx v0,r4,r9
addi r4,r4,32
-err3; stvx v1,r0,r3
+err3; stvx v1,0,r3
err3; stvx v0,r3,r9
addi r3,r3,32
10: bf cr7*4+3,11f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
addi r4,r4,16
-err3; stvx v1,r0,r3
+err3; stvx v1,0,r3
addi r3,r3,16
/* Up to 15B to go */
@@ -565,25 +565,25 @@
addi r4,r4,16
bf cr7*4+3,5f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
VPERM(v8,v0,v1,v16)
addi r4,r4,16
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
addi r3,r3,16
vor v0,v1,v1
5: bf cr7*4+2,6f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
VPERM(v8,v0,v1,v16)
err3; lvx v0,r4,r9
VPERM(v9,v1,v0,v16)
addi r4,r4,32
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
err3; stvx v9,r3,r9
addi r3,r3,32
6: bf cr7*4+1,7f
-err3; lvx v3,r0,r4
+err3; lvx v3,0,r4
VPERM(v8,v0,v3,v16)
err3; lvx v2,r4,r9
VPERM(v9,v3,v2,v16)
@@ -592,7 +592,7 @@
err3; lvx v0,r4,r11
VPERM(v11,v1,v0,v16)
addi r4,r4,64
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
err3; stvx v9,r3,r9
err3; stvx v10,r3,r10
err3; stvx v11,r3,r11
@@ -618,7 +618,7 @@
*/
.align 5
8:
-err4; lvx v7,r0,r4
+err4; lvx v7,0,r4
VPERM(v8,v0,v7,v16)
err4; lvx v6,r4,r9
VPERM(v9,v7,v6,v16)
@@ -635,7 +635,7 @@
err4; lvx v0,r4,r16
VPERM(v15,v1,v0,v16)
addi r4,r4,128
-err4; stvx v8,r0,r3
+err4; stvx v8,0,r3
err4; stvx v9,r3,r9
err4; stvx v10,r3,r10
err4; stvx v11,r3,r11
@@ -656,7 +656,7 @@
mtocrf 0x01,r6
bf cr7*4+1,9f
-err3; lvx v3,r0,r4
+err3; lvx v3,0,r4
VPERM(v8,v0,v3,v16)
err3; lvx v2,r4,r9
VPERM(v9,v3,v2,v16)
@@ -665,27 +665,27 @@
err3; lvx v0,r4,r11
VPERM(v11,v1,v0,v16)
addi r4,r4,64
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
err3; stvx v9,r3,r9
err3; stvx v10,r3,r10
err3; stvx v11,r3,r11
addi r3,r3,64
9: bf cr7*4+2,10f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
VPERM(v8,v0,v1,v16)
err3; lvx v0,r4,r9
VPERM(v9,v1,v0,v16)
addi r4,r4,32
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
err3; stvx v9,r3,r9
addi r3,r3,32
10: bf cr7*4+3,11f
-err3; lvx v1,r0,r4
+err3; lvx v1,0,r4
VPERM(v8,v0,v1,v16)
addi r4,r4,16
-err3; stvx v8,r0,r3
+err3; stvx v8,0,r3
addi r3,r3,16
/* Up to 15B to go */
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index cf1398e..e6ed0ec 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -277,8 +277,101 @@ void do_rfi_flush_fixups(enum l1d_flush_type types)
(types & L1D_FLUSH_MTTRIG) ? "mttrig type"
: "unknown");
}
+
+void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
+{
+ unsigned int instr, *dest;
+ long *start, *end;
+ int i;
+
+ start = fixup_start;
+ end = fixup_end;
+
+ instr = 0x60000000; /* nop */
+
+ if (enable) {
+ pr_info("barrier-nospec: using ORI speculation barrier\n");
+ instr = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+ patch_instruction(dest, instr);
+ }
+
+ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
+}
+
#endif /* CONFIG_PPC_BOOK3S_64 */
+#ifdef CONFIG_PPC_BARRIER_NOSPEC
+void do_barrier_nospec_fixups(bool enable)
+{
+ void *start, *end;
+
+ start = PTRRELOC(&__start___barrier_nospec_fixup),
+ end = PTRRELOC(&__stop___barrier_nospec_fixup);
+
+ do_barrier_nospec_fixups_range(enable, start, end);
+}
+#endif /* CONFIG_PPC_BARRIER_NOSPEC */
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+void do_barrier_nospec_fixups_range(bool enable, void *fixup_start, void *fixup_end)
+{
+ unsigned int instr[2], *dest;
+ long *start, *end;
+ int i;
+
+ start = fixup_start;
+ end = fixup_end;
+
+ instr[0] = PPC_INST_NOP;
+ instr[1] = PPC_INST_NOP;
+
+ if (enable) {
+ pr_info("barrier-nospec: using isync; sync as speculation barrier\n");
+ instr[0] = PPC_INST_ISYNC;
+ instr[1] = PPC_INST_SYNC;
+ }
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+ patch_instruction(dest, instr[0]);
+ patch_instruction(dest + 1, instr[1]);
+ }
+
+ printk(KERN_DEBUG "barrier-nospec: patched %d locations\n", i);
+}
+
+static void patch_btb_flush_section(long *curr)
+{
+ unsigned int *start, *end;
+
+ start = (void *)curr + *curr;
+ end = (void *)curr + *(curr + 1);
+ for (; start < end; start++) {
+ pr_devel("patching dest %lx\n", (unsigned long)start);
+ patch_instruction(start, PPC_INST_NOP);
+ }
+}
+
+void do_btb_flush_fixups(void)
+{
+ long *start, *end;
+
+ start = PTRRELOC(&__start__btb_flush_fixup);
+ end = PTRRELOC(&__stop__btb_flush_fixup);
+
+ for (; start < end; start += 2)
+ patch_btb_flush_section(start);
+}
+#endif /* CONFIG_PPC_FSL_BOOK3E */
+
void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
{
long *start, *end;
diff --git a/arch/powerpc/lib/memcpy_power7.S b/arch/powerpc/lib/memcpy_power7.S
index 786234f..193909a 100644
--- a/arch/powerpc/lib/memcpy_power7.S
+++ b/arch/powerpc/lib/memcpy_power7.S
@@ -261,12 +261,12 @@
.machine push
.machine "power4"
- dcbt r0,r6,0b01000
- dcbt r0,r7,0b01010
- dcbtst r0,r9,0b01000
- dcbtst r0,r10,0b01010
+ dcbt 0,r6,0b01000
+ dcbt 0,r7,0b01010
+ dcbtst 0,r9,0b01000
+ dcbtst 0,r10,0b01010
eieio
- dcbt r0,r8,0b01010 /* GO */
+ dcbt 0,r8,0b01010 /* GO */
.machine pop
beq cr1,.Lunwind_stack_nonvmx_copy
@@ -321,26 +321,26 @@
li r11,48
bf cr7*4+3,5f
- lvx v1,r0,r4
+ lvx v1,0,r4
addi r4,r4,16
- stvx v1,r0,r3
+ stvx v1,0,r3
addi r3,r3,16
5: bf cr7*4+2,6f
- lvx v1,r0,r4
+ lvx v1,0,r4
lvx v0,r4,r9
addi r4,r4,32
- stvx v1,r0,r3
+ stvx v1,0,r3
stvx v0,r3,r9
addi r3,r3,32
6: bf cr7*4+1,7f
- lvx v3,r0,r4
+ lvx v3,0,r4
lvx v2,r4,r9
lvx v1,r4,r10
lvx v0,r4,r11
addi r4,r4,64
- stvx v3,r0,r3
+ stvx v3,0,r3
stvx v2,r3,r9
stvx v1,r3,r10
stvx v0,r3,r11
@@ -366,7 +366,7 @@
*/
.align 5
8:
- lvx v7,r0,r4
+ lvx v7,0,r4
lvx v6,r4,r9
lvx v5,r4,r10
lvx v4,r4,r11
@@ -375,7 +375,7 @@
lvx v1,r4,r15
lvx v0,r4,r16
addi r4,r4,128
- stvx v7,r0,r3
+ stvx v7,0,r3
stvx v6,r3,r9
stvx v5,r3,r10
stvx v4,r3,r11
@@ -396,29 +396,29 @@
mtocrf 0x01,r6
bf cr7*4+1,9f
- lvx v3,r0,r4
+ lvx v3,0,r4
lvx v2,r4,r9
lvx v1,r4,r10
lvx v0,r4,r11
addi r4,r4,64
- stvx v3,r0,r3
+ stvx v3,0,r3
stvx v2,r3,r9
stvx v1,r3,r10
stvx v0,r3,r11
addi r3,r3,64
9: bf cr7*4+2,10f
- lvx v1,r0,r4
+ lvx v1,0,r4
lvx v0,r4,r9
addi r4,r4,32
- stvx v1,r0,r3
+ stvx v1,0,r3
stvx v0,r3,r9
addi r3,r3,32
10: bf cr7*4+3,11f
- lvx v1,r0,r4
+ lvx v1,0,r4
addi r4,r4,16
- stvx v1,r0,r3
+ stvx v1,0,r3
addi r3,r3,16
/* Up to 15B to go */
@@ -499,25 +499,25 @@
addi r4,r4,16
bf cr7*4+3,5f
- lvx v1,r0,r4
+ lvx v1,0,r4
VPERM(v8,v0,v1,v16)
addi r4,r4,16
- stvx v8,r0,r3
+ stvx v8,0,r3
addi r3,r3,16
vor v0,v1,v1
5: bf cr7*4+2,6f
- lvx v1,r0,r4
+ lvx v1,0,r4
VPERM(v8,v0,v1,v16)
lvx v0,r4,r9
VPERM(v9,v1,v0,v16)
addi r4,r4,32
- stvx v8,r0,r3
+ stvx v8,0,r3
stvx v9,r3,r9
addi r3,r3,32
6: bf cr7*4+1,7f
- lvx v3,r0,r4
+ lvx v3,0,r4
VPERM(v8,v0,v3,v16)
lvx v2,r4,r9
VPERM(v9,v3,v2,v16)
@@ -526,7 +526,7 @@
lvx v0,r4,r11
VPERM(v11,v1,v0,v16)
addi r4,r4,64
- stvx v8,r0,r3
+ stvx v8,0,r3
stvx v9,r3,r9
stvx v10,r3,r10
stvx v11,r3,r11
@@ -552,7 +552,7 @@
*/
.align 5
8:
- lvx v7,r0,r4
+ lvx v7,0,r4
VPERM(v8,v0,v7,v16)
lvx v6,r4,r9
VPERM(v9,v7,v6,v16)
@@ -569,7 +569,7 @@
lvx v0,r4,r16
VPERM(v15,v1,v0,v16)
addi r4,r4,128
- stvx v8,r0,r3
+ stvx v8,0,r3
stvx v9,r3,r9
stvx v10,r3,r10
stvx v11,r3,r11
@@ -590,7 +590,7 @@
mtocrf 0x01,r6
bf cr7*4+1,9f
- lvx v3,r0,r4
+ lvx v3,0,r4
VPERM(v8,v0,v3,v16)
lvx v2,r4,r9
VPERM(v9,v3,v2,v16)
@@ -599,27 +599,27 @@
lvx v0,r4,r11
VPERM(v11,v1,v0,v16)
addi r4,r4,64
- stvx v8,r0,r3
+ stvx v8,0,r3
stvx v9,r3,r9
stvx v10,r3,r10
stvx v11,r3,r11
addi r3,r3,64
9: bf cr7*4+2,10f
- lvx v1,r0,r4
+ lvx v1,0,r4
VPERM(v8,v0,v1,v16)
lvx v0,r4,r9
VPERM(v9,v1,v0,v16)
addi r4,r4,32
- stvx v8,r0,r3
+ stvx v8,0,r3
stvx v9,r3,r9
addi r3,r3,32
10: bf cr7*4+3,11f
- lvx v1,r0,r4
+ lvx v1,0,r4
VPERM(v8,v0,v1,v16)
addi r4,r4,16
- stvx v8,r0,r3
+ stvx v8,0,r3
addi r3,r3,16
/* Up to 15B to go */
diff --git a/arch/powerpc/lib/string_64.S b/arch/powerpc/lib/string_64.S
index 57ace35..11e6372 100644
--- a/arch/powerpc/lib/string_64.S
+++ b/arch/powerpc/lib/string_64.S
@@ -192,7 +192,7 @@
mtctr r6
mr r8,r3
14:
-err1; dcbz r0,r3
+err1; dcbz 0,r3
add r3,r3,r9
bdnz 14b
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 5f84433..1e93dbc 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -62,6 +62,7 @@
#endif
unsigned long long memory_limit;
+bool init_mem_is_free;
#ifdef CONFIG_HIGHMEM
pte_t *kmap_pte;
@@ -396,6 +397,7 @@ void __init mem_init(void)
void free_initmem(void)
{
ppc_md.progress = ppc_printk_progress;
+ init_mem_is_free = true;
free_initmem_default(POISON_FREE_INITMEM);
}
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 0ef83c2..9cad2ed 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1540,13 +1540,6 @@ static void reset_topology_timer(void)
#ifdef CONFIG_SMP
-static void stage_topology_update(int core_id)
-{
- cpumask_or(&cpu_associativity_changes_mask,
- &cpu_associativity_changes_mask, cpu_sibling_mask(core_id));
- reset_topology_timer();
-}
-
static int dt_update_callback(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -1559,7 +1552,7 @@ static int dt_update_callback(struct notifier_block *nb,
!of_prop_cmp(update->prop->name, "ibm,associativity")) {
u32 core_id;
of_property_read_u32(update->dn, "reg", &core_id);
- stage_topology_update(core_id);
+ rc = dlpar_cpu_readd(core_id);
rc = NOTIFY_OK;
}
break;
diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index eb82d78..b7e9c09 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -69,6 +69,13 @@
std r15,EX_TLB_R15(r12)
std r10,EX_TLB_CR(r12)
#ifdef CONFIG_PPC_FSL_BOOK3E
+START_BTB_FLUSH_SECTION
+ mfspr r11, SPRN_SRR1
+ andi. r10,r11,MSR_PR
+ beq 1f
+ BTB_FLUSH(r10)
+1:
+END_BTB_FLUSH_SECTION
std r7,EX_TLB_R7(r12)
#endif
TLB_MISS_PROLOG_STATS
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 89f7007..7b1d172 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -51,6 +51,8 @@
#define PPC_LIS(r, i) PPC_ADDIS(r, 0, i)
#define PPC_STD(r, base, i) EMIT(PPC_INST_STD | ___PPC_RS(r) | \
___PPC_RA(base) | ((i) & 0xfffc))
+#define PPC_STDX(r, base, b) EMIT(PPC_INST_STDX | ___PPC_RS(r) | \
+ ___PPC_RA(base) | ___PPC_RB(b))
#define PPC_STDU(r, base, i) EMIT(PPC_INST_STDU | ___PPC_RS(r) | \
___PPC_RA(base) | ((i) & 0xfffc))
#define PPC_STW(r, base, i) EMIT(PPC_INST_STW | ___PPC_RS(r) | \
@@ -65,7 +67,9 @@
#define PPC_LBZ(r, base, i) EMIT(PPC_INST_LBZ | ___PPC_RT(r) | \
___PPC_RA(base) | IMM_L(i))
#define PPC_LD(r, base, i) EMIT(PPC_INST_LD | ___PPC_RT(r) | \
- ___PPC_RA(base) | IMM_L(i))
+ ___PPC_RA(base) | ((i) & 0xfffc))
+#define PPC_LDX(r, base, b) EMIT(PPC_INST_LDX | ___PPC_RT(r) | \
+ ___PPC_RA(base) | ___PPC_RB(b))
#define PPC_LWZ(r, base, i) EMIT(PPC_INST_LWZ | ___PPC_RT(r) | \
___PPC_RA(base) | IMM_L(i))
#define PPC_LHZ(r, base, i) EMIT(PPC_INST_LHZ | ___PPC_RT(r) | \
@@ -85,17 +89,6 @@
___PPC_RA(a) | ___PPC_RB(b))
#define PPC_BPF_STDCX(s, a, b) EMIT(PPC_INST_STDCX | ___PPC_RS(s) | \
___PPC_RA(a) | ___PPC_RB(b))
-
-#ifdef CONFIG_PPC64
-#define PPC_BPF_LL(r, base, i) do { PPC_LD(r, base, i); } while(0)
-#define PPC_BPF_STL(r, base, i) do { PPC_STD(r, base, i); } while(0)
-#define PPC_BPF_STLU(r, base, i) do { PPC_STDU(r, base, i); } while(0)
-#else
-#define PPC_BPF_LL(r, base, i) do { PPC_LWZ(r, base, i); } while(0)
-#define PPC_BPF_STL(r, base, i) do { PPC_STW(r, base, i); } while(0)
-#define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0)
-#endif
-
#define PPC_CMPWI(a, i) EMIT(PPC_INST_CMPWI | ___PPC_RA(a) | IMM_L(i))
#define PPC_CMPDI(a, i) EMIT(PPC_INST_CMPDI | ___PPC_RA(a) | IMM_L(i))
#define PPC_CMPW(a, b) EMIT(PPC_INST_CMPW | ___PPC_RA(a) | \
diff --git a/arch/powerpc/net/bpf_jit32.h b/arch/powerpc/net/bpf_jit32.h
index a8cd7e2..81a9045 100644
--- a/arch/powerpc/net/bpf_jit32.h
+++ b/arch/powerpc/net/bpf_jit32.h
@@ -122,6 +122,10 @@ DECLARE_LOAD_FUNC(sk_load_byte_msh);
#define PPC_NTOHS_OFFS(r, base, i) PPC_LHZ_OFFS(r, base, i)
#endif
+#define PPC_BPF_LL(r, base, i) do { PPC_LWZ(r, base, i); } while(0)
+#define PPC_BPF_STL(r, base, i) do { PPC_STW(r, base, i); } while(0)
+#define PPC_BPF_STLU(r, base, i) do { PPC_STWU(r, base, i); } while(0)
+
#define SEEN_DATAREF 0x10000 /* might call external helpers */
#define SEEN_XREG 0x20000 /* X reg is used */
#define SEEN_MEM 0x40000 /* SEEN_MEM+(1<<n) = use mem[n] for temporary
diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h
index 62fa758..bb944b6 100644
--- a/arch/powerpc/net/bpf_jit64.h
+++ b/arch/powerpc/net/bpf_jit64.h
@@ -86,6 +86,26 @@ DECLARE_LOAD_FUNC(sk_load_byte);
(imm >= SKF_LL_OFF ? func##_negative_offset : func) : \
func##_positive_offset)
+/*
+ * WARNING: These can use TMP_REG_2 if the offset is not at word boundary,
+ * so ensure that it isn't in use already.
+ */
+#define PPC_BPF_LL(r, base, i) do { \
+ if ((i) % 4) { \
+ PPC_LI(b2p[TMP_REG_2], (i)); \
+ PPC_LDX(r, base, b2p[TMP_REG_2]); \
+ } else \
+ PPC_LD(r, base, i); \
+ } while(0)
+#define PPC_BPF_STL(r, base, i) do { \
+ if ((i) % 4) { \
+ PPC_LI(b2p[TMP_REG_2], (i)); \
+ PPC_STDX(r, base, b2p[TMP_REG_2]); \
+ } else \
+ PPC_STD(r, base, i); \
+ } while(0)
+#define PPC_BPF_STLU(r, base, i) do { PPC_STDU(r, base, i); } while(0)
+
#define SEEN_FUNC 0x1000 /* might call external helpers */
#define SEEN_STACK 0x2000 /* uses BPF stack */
#define SEEN_SKB 0x4000 /* uses sk_buff */
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index bdbbc32..e7d78f9 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -265,7 +265,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
* goto out;
*/
- PPC_LD(b2p[TMP_REG_1], 1, bpf_jit_stack_tailcallcnt(ctx));
+ PPC_BPF_LL(b2p[TMP_REG_1], 1, bpf_jit_stack_tailcallcnt(ctx));
PPC_CMPLWI(b2p[TMP_REG_1], MAX_TAIL_CALL_CNT);
PPC_BCC(COND_GT, out);
@@ -278,7 +278,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
/* prog = array->ptrs[index]; */
PPC_MULI(b2p[TMP_REG_1], b2p_index, 8);
PPC_ADD(b2p[TMP_REG_1], b2p[TMP_REG_1], b2p_bpf_array);
- PPC_LD(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_array, ptrs));
+ PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_array, ptrs));
/*
* if (prog == NULL)
@@ -288,7 +288,7 @@ static void bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32
PPC_BCC(COND_EQ, out);
/* goto *(prog->bpf_func + prologue_size); */
- PPC_LD(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_prog, bpf_func));
+ PPC_BPF_LL(b2p[TMP_REG_1], b2p[TMP_REG_1], offsetof(struct bpf_prog, bpf_func));
#ifdef PPC64_ELF_ABI_v1
/* skip past the function descriptor */
PPC_ADDI(b2p[TMP_REG_1], b2p[TMP_REG_1],
@@ -620,7 +620,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
* the instructions generated will remain the
* same across all passes
*/
- PPC_STD(dst_reg, 1, bpf_jit_stack_local(ctx));
+ PPC_BPF_STL(dst_reg, 1, bpf_jit_stack_local(ctx));
PPC_ADDI(b2p[TMP_REG_1], 1, bpf_jit_stack_local(ctx));
PPC_LDBRX(dst_reg, 0, b2p[TMP_REG_1]);
break;
@@ -676,7 +676,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
PPC_LI32(b2p[TMP_REG_1], imm);
src_reg = b2p[TMP_REG_1];
}
- PPC_STD(src_reg, dst_reg, off);
+ PPC_BPF_STL(src_reg, dst_reg, off);
break;
/*
@@ -723,7 +723,7 @@ static int bpf_jit_build_body(struct bpf_prog *fp, u32 *image,
break;
/* dst = *(u64 *)(ul) (src + off) */
case BPF_LDX | BPF_MEM | BPF_DW:
- PPC_LD(dst_reg, src_reg, off);
+ PPC_BPF_LL(dst_reg, src_reg, off);
break;
/*
diff --git a/arch/powerpc/platforms/83xx/suspend-asm.S b/arch/powerpc/platforms/83xx/suspend-asm.S
index 3d1ecd2..8137f77 100644
--- a/arch/powerpc/platforms/83xx/suspend-asm.S
+++ b/arch/powerpc/platforms/83xx/suspend-asm.S
@@ -26,13 +26,13 @@
#define SS_MSR 0x74
#define SS_SDR1 0x78
#define SS_LR 0x7c
-#define SS_SPRG 0x80 /* 4 SPRGs */
-#define SS_DBAT 0x90 /* 8 DBATs */
-#define SS_IBAT 0xd0 /* 8 IBATs */
-#define SS_TB 0x110
-#define SS_CR 0x118
-#define SS_GPREG 0x11c /* r12-r31 */
-#define STATE_SAVE_SIZE 0x16c
+#define SS_SPRG 0x80 /* 8 SPRGs */
+#define SS_DBAT 0xa0 /* 8 DBATs */
+#define SS_IBAT 0xe0 /* 8 IBATs */
+#define SS_TB 0x120
+#define SS_CR 0x128
+#define SS_GPREG 0x12c /* r12-r31 */
+#define STATE_SAVE_SIZE 0x17c
.section .data
.align 5
@@ -103,6 +103,16 @@
stw r7, SS_SPRG+12(r3)
stw r8, SS_SDR1(r3)
+ mfspr r4, SPRN_SPRG4
+ mfspr r5, SPRN_SPRG5
+ mfspr r6, SPRN_SPRG6
+ mfspr r7, SPRN_SPRG7
+
+ stw r4, SS_SPRG+16(r3)
+ stw r5, SS_SPRG+20(r3)
+ stw r6, SS_SPRG+24(r3)
+ stw r7, SS_SPRG+28(r3)
+
mfspr r4, SPRN_DBAT0U
mfspr r5, SPRN_DBAT0L
mfspr r6, SPRN_DBAT1U
@@ -493,6 +503,16 @@
mtspr SPRN_IBAT7U, r6
mtspr SPRN_IBAT7L, r7
+ lwz r4, SS_SPRG+16(r3)
+ lwz r5, SS_SPRG+20(r3)
+ lwz r6, SS_SPRG+24(r3)
+ lwz r7, SS_SPRG+28(r3)
+
+ mtspr SPRN_SPRG4, r4
+ mtspr SPRN_SPRG5, r5
+ mtspr SPRN_SPRG6, r6
+ mtspr SPRN_SPRG7, r7
+
lwz r4, SS_SPRG+0(r3)
lwz r5, SS_SPRG+4(r3)
lwz r6, SS_SPRG+8(r3)
diff --git a/arch/powerpc/platforms/cell/cpufreq_spudemand.c b/arch/powerpc/platforms/cell/cpufreq_spudemand.c
index 88301e5..5d8e8b6 100644
--- a/arch/powerpc/platforms/cell/cpufreq_spudemand.c
+++ b/arch/powerpc/platforms/cell/cpufreq_spudemand.c
@@ -22,6 +22,7 @@
#include <linux/cpufreq.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/module.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
@@ -48,7 +49,7 @@ static int calc_freq(struct spu_gov_info_struct *info)
cpu = info->policy->cpu;
busy_spus = atomic_read(&cbe_spu_info[cpu_to_node(cpu)].busy_spus);
- CALC_LOAD(info->busy_spus, EXP, busy_spus * FIXED_1);
+ info->busy_spus = calc_load(info->busy_spus, EXP, busy_spus * FIXED_1);
pr_debug("cpu %d: busy_spus=%d, info->busy_spus=%ld\n",
cpu, busy_spus, info->busy_spus);
diff --git a/arch/powerpc/platforms/cell/spufs/sched.c b/arch/powerpc/platforms/cell/spufs/sched.c
index 460f5f3..1149c34 100644
--- a/arch/powerpc/platforms/cell/spufs/sched.c
+++ b/arch/powerpc/platforms/cell/spufs/sched.c
@@ -24,6 +24,7 @@
#include <linux/errno.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/sched/rt.h>
#include <linux/kernel.h>
#include <linux/mm.h>
@@ -986,9 +987,9 @@ static void spu_calc_load(void)
unsigned long active_tasks; /* fixed-point */
active_tasks = count_active_contexts() * FIXED_1;
- CALC_LOAD(spu_avenrun[0], EXP_1, active_tasks);
- CALC_LOAD(spu_avenrun[1], EXP_5, active_tasks);
- CALC_LOAD(spu_avenrun[2], EXP_15, active_tasks);
+ spu_avenrun[0] = calc_load(spu_avenrun[0], EXP_1, active_tasks);
+ spu_avenrun[1] = calc_load(spu_avenrun[1], EXP_5, active_tasks);
+ spu_avenrun[2] = calc_load(spu_avenrun[2], EXP_15, active_tasks);
}
static void spusched_wake(unsigned long data)
@@ -1070,9 +1071,6 @@ void spuctx_switch_state(struct spu_context *ctx,
}
}
-#define LOAD_INT(x) ((x) >> FSHIFT)
-#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
-
static int show_spu_loadavg(struct seq_file *s, void *private)
{
int a, b, c;
diff --git a/arch/powerpc/platforms/embedded6xx/wii.c b/arch/powerpc/platforms/embedded6xx/wii.c
index 3fd683e..2914529 100644
--- a/arch/powerpc/platforms/embedded6xx/wii.c
+++ b/arch/powerpc/platforms/embedded6xx/wii.c
@@ -104,6 +104,10 @@ unsigned long __init wii_mmu_mapin_mem2(unsigned long top)
/* MEM2 64MB@0x10000000 */
delta = wii_hole_start + wii_hole_size;
size = top - delta;
+
+ if (__map_without_bats)
+ return delta;
+
for (bl = 128<<10; bl < max_size; bl <<= 1) {
if (bl * 2 > size)
break;
diff --git a/arch/powerpc/platforms/powernv/opal-msglog.c b/arch/powerpc/platforms/powernv/opal-msglog.c
index 39d6ff9..d10bad1 100644
--- a/arch/powerpc/platforms/powernv/opal-msglog.c
+++ b/arch/powerpc/platforms/powernv/opal-msglog.c
@@ -98,7 +98,7 @@ static ssize_t opal_msglog_read(struct file *file, struct kobject *kobj,
}
static struct bin_attribute opal_msglog_attr = {
- .attr = {.name = "msglog", .mode = 0444},
+ .attr = {.name = "msglog", .mode = 0400},
.read = opal_msglog_read
};
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 17203ab..365e2b6 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -77,6 +77,12 @@ static void init_fw_feat_flags(struct device_node *np)
if (fw_feature_is("enabled", "fw-count-cache-disabled", np))
security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED);
+ if (fw_feature_is("enabled", "fw-count-cache-flush-bcctr2,0,0", np))
+ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST);
+
+ if (fw_feature_is("enabled", "needs-count-cache-flush-on-context-switch", np))
+ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE);
+
/*
* The features below are enabled by default, so we instead look to see
* if firmware has *disabled* them, and clear them if so.
@@ -123,6 +129,7 @@ static void pnv_setup_rfi_flush(void)
security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));
setup_rfi_flush(type, enable);
+ setup_count_cache_flush();
}
static void __init pnv_setup_arch(void)
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index a1b63e0..7a2beed 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -785,6 +785,25 @@ static int dlpar_cpu_add_by_count(u32 cpus_to_add)
return rc;
}
+int dlpar_cpu_readd(int cpu)
+{
+ struct device_node *dn;
+ struct device *dev;
+ u32 drc_index;
+ int rc;
+
+ dev = get_cpu_device(cpu);
+ dn = dev->of_node;
+
+ rc = of_property_read_u32(dn, "ibm,my-drc-index", &drc_index);
+
+ rc = dlpar_cpu_remove_by_index(drc_index);
+ if (!rc)
+ rc = dlpar_cpu_add(drc_index);
+
+ return rc;
+}
+
int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
{
u32 count, drc_index;
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 91ade77..adb09ab 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -475,6 +475,12 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result)
if (result->character & H_CPU_CHAR_COUNT_CACHE_DISABLED)
security_ftr_set(SEC_FTR_COUNT_CACHE_DISABLED);
+ if (result->character & H_CPU_CHAR_BCCTR_FLUSH_ASSIST)
+ security_ftr_set(SEC_FTR_BCCTR_FLUSH_ASSIST);
+
+ if (result->behaviour & H_CPU_BEHAV_FLUSH_COUNT_CACHE)
+ security_ftr_set(SEC_FTR_FLUSH_COUNT_CACHE);
+
/*
* The features below are enabled by default, so we instead look to see
* if firmware has *disabled* them, and clear them if so.
@@ -525,6 +531,7 @@ void pseries_setup_rfi_flush(void)
security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR);
setup_rfi_flush(types, enable);
+ setup_count_cache_flush();
}
static void __init pSeries_setup_arch(void)
diff --git a/arch/s390/appldata/appldata_os.c b/arch/s390/appldata/appldata_os.c
index 08b9e94..731dc7e 100644
--- a/arch/s390/appldata/appldata_os.c
+++ b/arch/s390/appldata/appldata_os.c
@@ -17,15 +17,12 @@
#include <linux/kernel_stat.h>
#include <linux/netdevice.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <asm/appldata.h>
#include <asm/smp.h>
#include "appldata.h"
-
-#define LOAD_INT(x) ((x) >> FSHIFT)
-#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
-
/*
* OS data
*
diff --git a/arch/sh/drivers/heartbeat.c b/arch/sh/drivers/heartbeat.c
index 49bace4..c6d9604 100644
--- a/arch/sh/drivers/heartbeat.c
+++ b/arch/sh/drivers/heartbeat.c
@@ -21,6 +21,7 @@
#include <linux/init.h>
#include <linux/platform_device.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/timer.h>
#include <linux/io.h>
#include <linux/slab.h>
diff --git a/arch/sparc/kernel/led.c b/arch/sparc/kernel/led.c
index 3ae36f3..44a3ed9 100644
--- a/arch/sparc/kernel/led.c
+++ b/arch/sparc/kernel/led.c
@@ -8,6 +8,7 @@
#include <linux/jiffies.h>
#include <linux/timer.h>
#include <linux/uaccess.h>
+#include <linux/sched/loadavg.h>
#include <asm/auxio.h>
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 4714061..0ae73ee 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -112,7 +112,7 @@ show_signal_msg(struct pt_regs *regs, int sig, int code,
if (!printk_ratelimit())
return;
- printk("%s%s[%d]: segfault at %lx ip %p (rpc %p) sp %p error %x",
+ printk("%s%s[%d]: segfault at %lx ip %px (rpc %px) sp %px error %x",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->pc, (void *)regs->u_regs[UREG_I7],
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 643c149..8c85aaf 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -152,7 +152,7 @@ show_signal_msg(struct pt_regs *regs, int sig, int code,
if (!printk_ratelimit())
return;
- printk("%s%s[%d]: segfault at %lx ip %p (rpc %p) sp %p error %x",
+ printk("%s%s[%d]: segfault at %lx ip %px (rpc %px) sp %px error %x",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->tpc, (void *)regs->u_regs[UREG_I7],
diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index ad8f206..68cf65f 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -149,7 +149,7 @@ static void show_segv_info(struct uml_pt_regs *regs)
if (!printk_ratelimit())
return;
- printk("%s%s[%d]: segfault at %lx ip %p sp %p error %x",
+ printk("%s%s[%d]: segfault at %lx ip %px sp %px error %x",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), FAULT_ADDRESS(*fi),
(void *)UPT_IP(regs), (void *)UPT_SP(regs),
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index eb431d8..aa30715 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2052,14 +2052,8 @@
If unsure, leave at the default value.
config HOTPLUG_CPU
- bool "Support for hot-pluggable CPUs"
+ def_bool y
depends on SMP
- ---help---
- Say Y here to allow turning CPUs off and on. CPUs can be
- controlled through /sys/devices/system/cpu.
- ( Note: power management support will enable this option
- automatically on SMP systems. )
- Say N if you want to disable CPU hotplug.
config BOOTPARAM_HOTPLUG_CPU0
bool "Set default setting of cpu0_hotpluggable"
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index 3b7156f..3b16935 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -100,7 +100,7 @@
AFLAGS_header.o += -I$(objtree)/$(obj)
$(obj)/header.o: $(obj)/zoffset.h
-LDFLAGS_setup.elf := -T
+LDFLAGS_setup.elf := -m elf_i386 -T
$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
$(call if_changed,ld)
diff --git a/arch/x86/configs/i386_ranchu_defconfig b/arch/x86/configs/i386_ranchu_defconfig
index c679e2d..f691e2e 100644
--- a/arch/x86/configs/i386_ranchu_defconfig
+++ b/arch/x86/configs/i386_ranchu_defconfig
@@ -253,7 +253,6 @@
CONFIG_TABLET_USB_KBTAB=y
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_INPUT_MISC=y
-CONFIG_INPUT_KEYCHORD=y
CONFIG_INPUT_UINPUT=y
CONFIG_INPUT_GPIO=y
# CONFIG_SERIO is not set
diff --git a/arch/x86/configs/x86_64_cuttlefish_defconfig b/arch/x86/configs/x86_64_cuttlefish_defconfig
index 4da2b7e..270da00 100644
--- a/arch/x86/configs/x86_64_cuttlefish_defconfig
+++ b/arch/x86/configs/x86_64_cuttlefish_defconfig
@@ -9,6 +9,7 @@
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_CGROUPS=y
@@ -85,6 +86,7 @@
CONFIG_UNIX=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_INTERFACE=y
+CONFIG_XFRM_STATISTICS=y
CONFIG_NET_KEY=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
@@ -92,6 +94,7 @@
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_NET_IPGRE_DEMUX=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
@@ -152,6 +155,7 @@
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=y
CONFIG_NETFILTER_XT_MATCH_HELPER=y
CONFIG_NETFILTER_XT_MATCH_IPRANGE=y
+# CONFIG_NETFILTER_XT_MATCH_L2TP is not set
CONFIG_NETFILTER_XT_MATCH_LENGTH=y
CONFIG_NETFILTER_XT_MATCH_LIMIT=y
CONFIG_NETFILTER_XT_MATCH_MAC=y
@@ -192,9 +196,11 @@
CONFIG_IP6_NF_TARGET_REJECT=y
CONFIG_IP6_NF_MANGLE=y
CONFIG_IP6_NF_RAW=y
+CONFIG_L2TP=y
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_HTB=y
CONFIG_NET_SCH_NETEM=y
+CONFIG_NET_SCH_INGRESS=y
CONFIG_NET_CLS_U32=y
CONFIG_NET_CLS_BPF=y
CONFIG_NET_EMATCH=y
@@ -234,6 +240,7 @@
CONFIG_DM_VERITY=y
CONFIG_DM_VERITY_FEC=y
CONFIG_DM_ANDROID_VERITY=y
+CONFIG_DM_BOW=y
CONFIG_NETDEVICES=y
CONFIG_NETCONSOLE=y
CONFIG_NETCONSOLE_DYNAMIC=y
@@ -244,8 +251,8 @@
CONFIG_PPP_BSDCOMP=y
CONFIG_PPP_DEFLATE=y
CONFIG_PPP_MPPE=y
-CONFIG_PPPOLAC=y
-CONFIG_PPPOPNS=y
+CONFIG_PPTP=y
+CONFIG_PPPOL2TP=y
CONFIG_USB_RTL8152=y
CONFIG_USB_USBNET=y
# CONFIG_USB_NET_AX8817X is not set
@@ -287,7 +294,6 @@
CONFIG_TABLET_USB_HANWANG=y
CONFIG_TABLET_USB_KBTAB=y
CONFIG_INPUT_MISC=y
-CONFIG_INPUT_KEYCHORD=y
CONFIG_INPUT_UINPUT=y
CONFIG_INPUT_GPIO=y
# CONFIG_SERIO_I8042 is not set
@@ -435,6 +441,7 @@
CONFIG_QFMT_V2=y
CONFIG_AUTOFS4_FS=y
CONFIG_FUSE_FS=y
+CONFIG_OVERLAY_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_PROC_KCORE=y
diff --git a/arch/x86/configs/x86_64_ranchu_defconfig b/arch/x86/configs/x86_64_ranchu_defconfig
index f094365..cafa3df 100644
--- a/arch/x86/configs/x86_64_ranchu_defconfig
+++ b/arch/x86/configs/x86_64_ranchu_defconfig
@@ -250,7 +250,6 @@
CONFIG_TABLET_USB_KBTAB=y
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_INPUT_MISC=y
-CONFIG_INPUT_KEYCHORD=y
CONFIG_INPUT_UINPUT=y
CONFIG_INPUT_GPIO=y
# CONFIG_SERIO is not set
diff --git a/arch/x86/crypto/poly1305-avx2-x86_64.S b/arch/x86/crypto/poly1305-avx2-x86_64.S
index eff2f41..ec234c4 100644
--- a/arch/x86/crypto/poly1305-avx2-x86_64.S
+++ b/arch/x86/crypto/poly1305-avx2-x86_64.S
@@ -321,6 +321,12 @@
vpaddq t2,t1,t1
vmovq t1x,d4
+ # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 ->
+ # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small
+ # amount. Careful: we must not assume the carry bits 'd0 >> 26',
+ # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit
+ # integers. It's true in a single-block implementation, but not here.
+
# d1 += d0 >> 26
mov d0,%rax
shr $26,%rax
@@ -359,16 +365,16 @@
# h0 += (d4 >> 26) * 5
mov d4,%rax
shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
+ lea (%rax,%rax,4),%rax
+ add %rax,%rbx
# h4 = d4 & 0x3ffffff
mov d4,%rax
and $0x3ffffff,%eax
mov %eax,h4
# h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
+ mov %rbx,%rax
+ shr $26,%rax
add %eax,h1
# h0 = h0 & 0x3ffffff
andl $0x3ffffff,%ebx
diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86/crypto/poly1305-sse2-x86_64.S
index 338c748..639d976 100644
--- a/arch/x86/crypto/poly1305-sse2-x86_64.S
+++ b/arch/x86/crypto/poly1305-sse2-x86_64.S
@@ -251,16 +251,16 @@
# h0 += (d4 >> 26) * 5
mov d4,%rax
shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
+ lea (%rax,%rax,4),%rax
+ add %rax,%rbx
# h4 = d4 & 0x3ffffff
mov d4,%rax
and $0x3ffffff,%eax
mov %eax,h4
# h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
+ mov %rbx,%rax
+ shr $26,%rax
add %eax,h1
# h0 = h0 & 0x3ffffff
andl $0x3ffffff,%ebx
@@ -518,6 +518,12 @@
paddq t2,t1
movq t1,d4
+ # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 ->
+ # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small
+ # amount. Careful: we must not assume the carry bits 'd0 >> 26',
+ # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit
+ # integers. It's true in a single-block implementation, but not here.
+
# d1 += d0 >> 26
mov d0,%rax
shr $26,%rax
@@ -556,16 +562,16 @@
# h0 += (d4 >> 26) * 5
mov d4,%rax
shr $26,%rax
- lea (%eax,%eax,4),%eax
- add %eax,%ebx
+ lea (%rax,%rax,4),%rax
+ add %rax,%rbx
# h4 = d4 & 0x3ffffff
mov d4,%rax
and $0x3ffffff,%eax
mov %eax,h4
# h1 += h0 >> 26
- mov %ebx,%eax
- shr $26,%eax
+ mov %rbx,%rax
+ shr $26,%rax
add %eax,h1
# h0 = h0 & 0x3ffffff
andl $0x3ffffff,%ebx
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 51a858e..756dc94 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -47,10 +47,8 @@
export CPPFLAGS_vdso.lds += -P -C
-VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \
- -Wl,--no-undefined \
- -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \
- $(DISABLE_LTO)
+VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \
+ -z max-page-size=4096
$(obj)/vdso64.so.dbg: $(src)/vdso.lds $(vobjs) FORCE
$(call if_changed,vdso)
@@ -96,10 +94,8 @@
#
CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds)
-VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \
- -Wl,-soname=linux-vdso.so.1 \
- -Wl,-z,max-page-size=4096 \
- -Wl,-z,common-page-size=4096
+VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \
+ -z max-page-size=4096
# 64-bit objects to re-brand as x32
vobjs64-for-x32 := $(filter-out $(vobjs-nox32),$(vobjs-y))
@@ -127,7 +123,7 @@
$(call if_changed,vdso)
CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds)
-VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1
+VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1
# This makes sure the $(obj) subdirectory exists even though vdso32/
# is not a kbuild sub-make subdirectory.
@@ -165,14 +161,13 @@
# The DSO images are built using a special linker script.
#
quiet_cmd_vdso = VDSO $@
- cmd_vdso = $(CC) -nostdlib -o $@ \
+ cmd_vdso = $(LD) -nostdlib -o $@ \
$(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \
- -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \
+ -T $(filter %.lds,$^) $(filter %.o,$^) && \
sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@'
-VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \
- $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) \
- $(filter --target=% --gcc-toolchain=%,$(KBUILD_CFLAGS))
+VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \
+ $(call ld-option, --build-id) -Bsymbolic
GCOV_PROFILE := n
#
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index afb222b..de050d5 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -113,22 +113,39 @@ static __initconst const u64 amd_hw_cache_event_ids
};
/*
- * AMD Performance Monitor K7 and later.
+ * AMD Performance Monitor K7 and later, up to and including Family 16h:
*/
static const u64 amd_perfmon_event_map[PERF_COUNT_HW_MAX] =
{
- [PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
- [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
- [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d,
- [PERF_COUNT_HW_CACHE_MISSES] = 0x077e,
- [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
- [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
- [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */
- [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x00d1, /* "Dispatch stalls" event */
+ [PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
+ [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x077d,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x077e,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
+ [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
+ [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x00d0, /* "Decoder empty" event */
+ [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x00d1, /* "Dispatch stalls" event */
+};
+
+/*
+ * AMD Performance Monitor Family 17h and later:
+ */
+static const u64 amd_f17h_perfmon_event_map[PERF_COUNT_HW_MAX] =
+{
+ [PERF_COUNT_HW_CPU_CYCLES] = 0x0076,
+ [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0xff60,
+ [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c2,
+ [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c3,
+ [PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x0287,
+ [PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x0187,
};
static u64 amd_pmu_event_map(int hw_event)
{
+ if (boot_cpu_data.x86 >= 0x17)
+ return amd_f17h_perfmon_event_map[hw_event];
+
return amd_perfmon_event_map[hw_event];
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 1ce6ae3..c42c9d50 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -996,12 +996,12 @@ static inline int intel_pmu_init(void)
return 0;
}
-static inline int intel_cpuc_prepare(struct cpu_hw_event *cpuc, int cpu)
+static inline int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
{
return 0;
}
-static inline void intel_cpuc_finish(struct cpu_hw_event *cpuc)
+static inline void intel_cpuc_finish(struct cpu_hw_events *cpuc)
{
}
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9a8167b..83b5b29 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -487,6 +487,7 @@ struct kvm_vcpu_arch {
bool tpr_access_reporting;
u64 ia32_xss;
u64 microcode_version;
+ u64 arch_capabilities;
/*
* Paging state of the vcpu
diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h
index 8e9dbe7..5cc2ce4 100644
--- a/arch/x86/include/asm/suspend_32.h
+++ b/arch/x86/include/asm/suspend_32.h
@@ -11,7 +11,13 @@
/* image of the saved processor state */
struct saved_context {
- u16 es, fs, gs, ss;
+ /*
+ * On x86_32, all segment registers, with the possible exception of
+ * gs, are saved at kernel entry in pt_regs.
+ */
+#ifdef CONFIG_X86_32_LAZY_GS
+ u16 gs;
+#endif
unsigned long cr0, cr2, cr3, cr4;
u64 misc_enable;
bool misc_enable_saved;
diff --git a/arch/x86/include/asm/suspend_64.h b/arch/x86/include/asm/suspend_64.h
index 2bd96b4..70175191 100644
--- a/arch/x86/include/asm/suspend_64.h
+++ b/arch/x86/include/asm/suspend_64.h
@@ -19,8 +19,20 @@
*/
struct saved_context {
struct pt_regs regs;
- u16 ds, es, fs, gs, ss;
- unsigned long gs_base, gs_kernel_base, fs_base;
+
+ /*
+ * User CS and SS are saved in current_pt_regs(). The rest of the
+ * segment selectors need to be saved and restored here.
+ */
+ u16 ds, es, fs, gs;
+
+ /*
+ * Usermode FSBASE and GSBASE may not match the fs and gs selectors,
+ * so we save them separately. We save the kernelmode GSBASE to
+ * restore percpu access after resume.
+ */
+ unsigned long kernelmode_gs_base, usermode_gs_base, fs_base;
+
unsigned long cr0, cr2, cr3, cr4, cr8;
u64 misc_enable;
bool misc_enable_saved;
@@ -29,8 +41,7 @@ struct saved_context {
u16 gdt_pad; /* Unused */
struct desc_ptr gdt_desc;
u16 idt_pad;
- u16 idt_limit;
- unsigned long idt_base;
+ struct desc_ptr idt;
u16 ldt;
u16 tss;
unsigned long tr;
diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 1fdcb8c..79183bc 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -215,6 +215,9 @@ privcmd_call(unsigned call,
__HYPERCALL_DECLS;
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
+ if (call >= PAGE_SIZE / sizeof(hypercall_page[0]))
+ return -EINVAL;
+
stac();
asm volatile(CALL_NOSPEC
: __HYPERCALL_5PARAM
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 03b6e5c..f7430d8 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -287,7 +287,7 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
tgt_rip = next_rip + o_dspl;
n_dspl = tgt_rip - orig_insn;
- DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
+ DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);
if (tgt_rip - orig_insn >= 0) {
if (n_dspl - 2 <= 127)
@@ -341,7 +341,7 @@ static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
sync_core();
local_irq_restore(flags);
- DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
+ DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
instr, a->instrlen - a->padlen, a->padlen);
}
@@ -359,7 +359,7 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];
- DPRINTK("alt table %p -> %p", start, end);
+ DPRINTK("alt table %px, -> %px", start, end);
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite previously scanned alternative code.
@@ -383,14 +383,14 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
continue;
}
- DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
+ DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
a->cpuid >> 5,
a->cpuid & 0x1f,
instr, a->instrlen,
replacement, a->replacementlen, a->padlen);
- DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
- DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
+ DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
+ DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
memcpy(insnbuf, replacement, a->replacementlen);
insnbuf_sz = a->replacementlen;
@@ -411,7 +411,7 @@ void __init_or_module apply_alternatives(struct alt_instr *start,
a->instrlen - a->replacementlen);
insnbuf_sz += a->instrlen - a->replacementlen;
}
- DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+ DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);
text_poke_early(instr, insnbuf, insnbuf_sz);
}
diff --git a/arch/x86/kernel/cpu/cyrix.c b/arch/x86/kernel/cpu/cyrix.c
index d39cfb2..311d0fa 100644
--- a/arch/x86/kernel/cpu/cyrix.c
+++ b/arch/x86/kernel/cpu/cyrix.c
@@ -121,7 +121,7 @@ static void set_cx86_reorder(void)
setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */
/* Load/Store Serialize to mem access disable (=reorder it) */
- setCx86_old(CX86_PCR0, getCx86_old(CX86_PCR0) & ~0x80);
+ setCx86(CX86_PCR0, getCx86(CX86_PCR0) & ~0x80);
/* set load/store serialize from 1GB to 4GB */
ccr3 |= 0xe0;
setCx86(CX86_CCR3, ccr3);
@@ -132,11 +132,11 @@ static void set_cx86_memwb(void)
pr_info("Enable Memory-Write-back mode on Cyrix/NSC processor.\n");
/* CCR2 bit 2: unlock NW bit */
- setCx86_old(CX86_CCR2, getCx86_old(CX86_CCR2) & ~0x04);
+ setCx86(CX86_CCR2, getCx86(CX86_CCR2) & ~0x04);
/* set 'Not Write-through' */
write_cr0(read_cr0() | X86_CR0_NW);
/* CCR2 bit 2: lock NW bit and set WT1 */
- setCx86_old(CX86_CCR2, getCx86_old(CX86_CCR2) | 0x14);
+ setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x14);
}
/*
@@ -150,14 +150,14 @@ static void geode_configure(void)
local_irq_save(flags);
/* Suspend on halt power saving and enable #SUSP pin */
- setCx86_old(CX86_CCR2, getCx86_old(CX86_CCR2) | 0x88);
+ setCx86(CX86_CCR2, getCx86(CX86_CCR2) | 0x88);
ccr3 = getCx86(CX86_CCR3);
setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */
/* FPU fast, DTE cache, Mem bypass */
- setCx86_old(CX86_CCR4, getCx86_old(CX86_CCR4) | 0x38);
+ setCx86(CX86_CCR4, getCx86(CX86_CCR4) | 0x38);
setCx86(CX86_CCR3, ccr3); /* disable MAPEN */
set_cx86_memwb();
@@ -293,7 +293,7 @@ static void init_cyrix(struct cpuinfo_x86 *c)
/* GXm supports extended cpuid levels 'ala' AMD */
if (c->cpuid_level == 2) {
/* Enable cxMMX extensions (GX1 Datasheet 54) */
- setCx86_old(CX86_CCR7, getCx86_old(CX86_CCR7) | 1);
+ setCx86(CX86_CCR7, getCx86(CX86_CCR7) | 1);
/*
* GXm : 0x30 ... 0x5f GXm datasheet 51
@@ -316,7 +316,7 @@ static void init_cyrix(struct cpuinfo_x86 *c)
if (dir1 > 7) {
dir0_msn++; /* M II */
/* Enable MMX extensions (App note 108) */
- setCx86_old(CX86_CCR7, getCx86_old(CX86_CCR7)|1);
+ setCx86(CX86_CCR7, getCx86(CX86_CCR7)|1);
} else {
/* A 6x86MX - it has the bug. */
set_cpu_bug(c, X86_BUG_COMA);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 756634f..775c23d 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -914,6 +914,8 @@ int __init hpet_enable(void)
return 0;
hpet_set_mapping();
+ if (!hpet_virt_address)
+ return 0;
/*
* Read the period and check for a sane value:
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 8771766..9954a60 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -352,6 +352,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
#endif
default:
WARN_ON_ONCE(1);
+ return -EINVAL;
}
/*
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 64a70b2..3f3cfec 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -545,6 +545,7 @@ void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
unsigned long *sara = stack_addr(regs);
ri->ret_addr = (kprobe_opcode_t *) *sara;
+ ri->fp = sara;
/* Replace the return addr with trampoline addr */
*sara = (unsigned long) &kretprobe_trampoline;
@@ -746,15 +747,21 @@ __visible __used void *trampoline_handler(struct pt_regs *regs)
unsigned long flags, orig_ret_address = 0;
unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
kprobe_opcode_t *correct_ret_addr = NULL;
+ void *frame_pointer;
+ bool skipped = false;
INIT_HLIST_HEAD(&empty_rp);
kretprobe_hash_lock(current, &head, &flags);
/* fixup registers */
#ifdef CONFIG_X86_64
regs->cs = __KERNEL_CS;
+ /* On x86-64, we use pt_regs->sp for return address holder. */
+ frame_pointer = ®s->sp;
#else
regs->cs = __KERNEL_CS | get_kernel_rpl();
regs->gs = 0;
+ /* On x86-32, we use pt_regs->flags for return address holder. */
+ frame_pointer = ®s->flags;
#endif
regs->ip = trampoline_address;
regs->orig_ax = ~0UL;
@@ -776,8 +783,25 @@ __visible __used void *trampoline_handler(struct pt_regs *regs)
if (ri->task != current)
/* another task is sharing our hash bucket */
continue;
+ /*
+ * Return probes must be pushed on this hash list correct
+ * order (same as return order) so that it can be poped
+ * correctly. However, if we find it is pushed it incorrect
+ * order, this means we find a function which should not be
+ * probed, because the wrong order entry is pushed on the
+ * path of processing other kretprobe itself.
+ */
+ if (ri->fp != frame_pointer) {
+ if (!skipped)
+ pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n");
+ skipped = true;
+ continue;
+ }
orig_ret_address = (unsigned long)ri->ret_addr;
+ if (skipped)
+ pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n",
+ ri->rp->kp.addr);
if (orig_ret_address != trampoline_address)
/*
@@ -795,6 +819,8 @@ __visible __used void *trampoline_handler(struct pt_regs *regs)
if (ri->task != current)
/* another task is sharing our hash bucket */
continue;
+ if (ri->fp != frame_pointer)
+ continue;
orig_ret_address = (unsigned long)ri->ret_addr;
if (ri->rp && ri->rp->handler) {
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index e783a5d..55f0487 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -367,7 +367,7 @@
* Per-cpu symbols which need to be offset from __per_cpu_load
* for the boot processor.
*/
-#define INIT_PER_CPU(x) init_per_cpu__##x = x + __per_cpu_load
+#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load
INIT_PER_CPU(gdt_page);
INIT_PER_CPU(irq_stack_union);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 510cfc0..b636a1e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2579,15 +2579,13 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
* CR0/CR3/CR4/EFER. It's all a bit more complicated if the vCPU
* supports long mode.
*/
- cr4 = ctxt->ops->get_cr(ctxt, 4);
if (emulator_has_longmode(ctxt)) {
struct desc_struct cs_desc;
/* Zero CR4.PCIDE before CR0.PG. */
- if (cr4 & X86_CR4_PCIDE) {
+ cr4 = ctxt->ops->get_cr(ctxt, 4);
+ if (cr4 & X86_CR4_PCIDE)
ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PCIDE);
- cr4 &= ~X86_CR4_PCIDE;
- }
/* A 32-bit code segment is required to clear EFER.LMA. */
memset(&cs_desc, 0, sizeof(cs_desc));
@@ -2601,13 +2599,16 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
if (cr0 & X86_CR0_PE)
ctxt->ops->set_cr(ctxt, 0, cr0 & ~(X86_CR0_PG | X86_CR0_PE));
- /* Now clear CR4.PAE (which must be done before clearing EFER.LME). */
- if (cr4 & X86_CR4_PAE)
- ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE);
+ if (emulator_has_longmode(ctxt)) {
+ /* Clear CR4.PAE before clearing EFER.LME. */
+ cr4 = ctxt->ops->get_cr(ctxt, 4);
+ if (cr4 & X86_CR4_PAE)
+ ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PAE);
- /* And finally go back to 32-bit mode. */
- efer = 0;
- ctxt->ops->set_msr(ctxt, MSR_EFER, efer);
+ /* And finally go back to 32-bit mode. */
+ efer = 0;
+ ctxt->ops->set_msr(ctxt, MSR_EFER, efer);
+ }
smbase = ctxt->ops->get_smbase(ctxt);
if (emulator_has_longmode(ctxt))
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 01eb045..9a6d258 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3940,14 +3940,25 @@ static int avic_incomplete_ipi_interception(struct vcpu_svm *svm)
kvm_lapic_reg_write(apic, APIC_ICR, icrl);
break;
case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: {
+ int i;
+ struct kvm_vcpu *vcpu;
+ struct kvm *kvm = svm->vcpu.kvm;
struct kvm_lapic *apic = svm->vcpu.arch.apic;
/*
- * Update ICR high and low, then emulate sending IPI,
- * which is handled when writing APIC_ICR.
+ * At this point, we expect that the AVIC HW has already
+ * set the appropriate IRR bits on the valid target
+ * vcpus. So, we just need to kick the appropriate vcpu.
*/
- kvm_lapic_reg_write(apic, APIC_ICR2, icrh);
- kvm_lapic_reg_write(apic, APIC_ICR, icrl);
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ bool m = kvm_apic_match_dest(vcpu, apic,
+ icrl & KVM_APIC_SHORT_MASK,
+ GET_APIC_DEST_FIELD(icrh),
+ icrl & KVM_APIC_DEST_MASK);
+
+ if (m && !avic_vcpu_is_running(vcpu))
+ kvm_vcpu_wake_up(vcpu);
+ }
break;
}
case AVIC_IPI_FAILURE_INVALID_TARGET:
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a24d080..e41b212 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -714,7 +714,6 @@ struct vcpu_vmx {
u64 msr_guest_kernel_gs_base;
#endif
- u64 arch_capabilities;
u64 spec_ctrl;
u32 vm_entry_controls_shadow;
@@ -3209,12 +3208,6 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
msr_info->data = to_vmx(vcpu)->spec_ctrl;
break;
- case MSR_IA32_ARCH_CAPABILITIES:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has_arch_capabilities(vcpu))
- return 1;
- msr_info->data = to_vmx(vcpu)->arch_capabilities;
- break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
break;
@@ -3376,11 +3369,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
MSR_TYPE_W);
break;
- case MSR_IA32_ARCH_CAPABILITIES:
- if (!msr_info->host_initiated)
- return 1;
- vmx->arch_capabilities = data;
- break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -5468,8 +5456,6 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
++vmx->nmsrs;
}
- vmx->arch_capabilities = kvm_get_arch_capabilities();
-
vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
/* 22.2.1, 20.8.1 */
@@ -5965,6 +5951,7 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
static int handle_triple_fault(struct kvm_vcpu *vcpu)
{
vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
+ vcpu->mmio_needed = 0;
return 0;
}
@@ -7046,6 +7033,10 @@ static int get_vmx_mem_address(struct kvm_vcpu *vcpu,
/* Addr = segment_base + offset */
/* offset = base + [index * scale] + displacement */
off = exit_qualification; /* holds the displacement */
+ if (addr_size == 1)
+ off = (gva_t)sign_extend64(off, 31);
+ else if (addr_size == 0)
+ off = (gva_t)sign_extend64(off, 15);
if (base_is_valid)
off += kvm_register_read(vcpu, base_reg);
if (index_is_valid)
@@ -7088,10 +7079,16 @@ static int get_vmx_mem_address(struct kvm_vcpu *vcpu,
/* Protected mode: #GP(0)/#SS(0) if the segment is unusable.
*/
exn = (s.unusable != 0);
- /* Protected mode: #GP(0)/#SS(0) if the memory
- * operand is outside the segment limit.
+
+ /*
+ * Protected mode: #GP(0)/#SS(0) if the memory operand is
+ * outside the segment limit. All CPUs that support VMX ignore
+ * limit checks for flat segments, i.e. segments with base==0,
+ * limit==0xffffffff and of type expand-up data or code.
*/
- exn = exn || (off + sizeof(u64) > s.limit);
+ if (!(s.base == 0 && s.limit == 0xffffffff &&
+ ((s.type & 8) || !(s.type & 4))))
+ exn = exn || (off + sizeof(u64) > s.limit);
}
if (exn) {
kvm_queue_exception_e(vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a35fee..8285142 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2197,6 +2197,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (msr_info->host_initiated)
vcpu->arch.microcode_version = data;
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated)
+ return 1;
+ vcpu->arch.arch_capabilities = data;
+ break;
case MSR_EFER:
return set_efer(vcpu, data);
case MSR_K7_HWCR:
@@ -2473,6 +2478,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_UCODE_REV:
msr_info->data = vcpu->arch.microcode_version;
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_arch_capabilities(vcpu))
+ return 1;
+ msr_info->data = vcpu->arch.arch_capabilities;
+ break;
case MSR_MTRRcap:
case 0x200 ... 0x2ff:
return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data);
@@ -6769,6 +6780,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
}
if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) {
vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
+ vcpu->mmio_needed = 0;
r = 0;
goto out;
}
@@ -7671,6 +7683,7 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
int r;
+ vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
kvm_vcpu_mtrr_init(vcpu);
r = vcpu_load(vcpu);
if (r)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 0da3945..f7f8aa6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -836,7 +836,7 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
if (!printk_ratelimit())
return;
- printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
+ printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",
task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
tsk->comm, task_pid_nr(tsk), address,
(void *)regs->ip, (void *)regs->sp, error_code);
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 53cace2..054e276 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -82,12 +82,8 @@ static void __save_processor_state(struct saved_context *ctxt)
/*
* descriptor tables
*/
-#ifdef CONFIG_X86_32
store_idt(&ctxt->idt);
-#else
-/* CONFIG_X86_64 */
- store_idt((struct desc_ptr *)&ctxt->idt_limit);
-#endif
+
/*
* We save it here, but restore it only in the hibernate case.
* For ACPI S3 resume, this is loaded via 'early_gdt_desc' in 64-bit
@@ -103,22 +99,18 @@ static void __save_processor_state(struct saved_context *ctxt)
/*
* segment registers
*/
-#ifdef CONFIG_X86_32
- savesegment(es, ctxt->es);
- savesegment(fs, ctxt->fs);
+#ifdef CONFIG_X86_32_LAZY_GS
savesegment(gs, ctxt->gs);
- savesegment(ss, ctxt->ss);
-#else
-/* CONFIG_X86_64 */
- asm volatile ("movw %%ds, %0" : "=m" (ctxt->ds));
- asm volatile ("movw %%es, %0" : "=m" (ctxt->es));
- asm volatile ("movw %%fs, %0" : "=m" (ctxt->fs));
- asm volatile ("movw %%gs, %0" : "=m" (ctxt->gs));
- asm volatile ("movw %%ss, %0" : "=m" (ctxt->ss));
+#endif
+#ifdef CONFIG_X86_64
+ savesegment(gs, ctxt->gs);
+ savesegment(fs, ctxt->fs);
+ savesegment(ds, ctxt->ds);
+ savesegment(es, ctxt->es);
rdmsrl(MSR_FS_BASE, ctxt->fs_base);
- rdmsrl(MSR_GS_BASE, ctxt->gs_base);
- rdmsrl(MSR_KERNEL_GS_BASE, ctxt->gs_kernel_base);
+ rdmsrl(MSR_GS_BASE, ctxt->kernelmode_gs_base);
+ rdmsrl(MSR_KERNEL_GS_BASE, ctxt->usermode_gs_base);
mtrr_save_fixed_ranges(NULL);
rdmsrl(MSR_EFER, ctxt->efer);
@@ -178,6 +170,9 @@ static void fix_processor_context(void)
write_gdt_entry(desc, GDT_ENTRY_TSS, &tss, DESC_TSS);
syscall_init(); /* This sets MSR_*STAR and related */
+#else
+ if (boot_cpu_has(X86_FEATURE_SEP))
+ enable_sep_cpu();
#endif
load_TR_desc(); /* This does ltr */
load_mm_ldt(current->active_mm); /* This does lldt */
@@ -186,9 +181,12 @@ static void fix_processor_context(void)
}
/**
- * __restore_processor_state - restore the contents of CPU registers saved
- * by __save_processor_state()
- * @ctxt - structure to load the registers contents from
+ * __restore_processor_state - restore the contents of CPU registers saved
+ * by __save_processor_state()
+ * @ctxt - structure to load the registers contents from
+ *
+ * The asm code that gets us here will have restored a usable GDT, although
+ * it will be pointing to the wrong alias.
*/
static void notrace __restore_processor_state(struct saved_context *ctxt)
{
@@ -211,46 +209,52 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
write_cr2(ctxt->cr2);
write_cr0(ctxt->cr0);
- /*
- * now restore the descriptor tables to their proper values
- * ltr is done i fix_processor_context().
- */
-#ifdef CONFIG_X86_32
+ /* Restore the IDT. */
load_idt(&ctxt->idt);
-#else
-/* CONFIG_X86_64 */
- load_idt((const struct desc_ptr *)&ctxt->idt_limit);
-#endif
/*
- * segment registers
+ * Just in case the asm code got us here with the SS, DS, or ES
+ * out of sync with the GDT, update them.
*/
-#ifdef CONFIG_X86_32
+ loadsegment(ss, __KERNEL_DS);
+ loadsegment(ds, __USER_DS);
+ loadsegment(es, __USER_DS);
+
+ /*
+ * Restore percpu access. Percpu access can happen in exception
+ * handlers or in complicated helpers like load_gs_index().
+ */
+#ifdef CONFIG_X86_64
+ wrmsrl(MSR_GS_BASE, ctxt->kernelmode_gs_base);
+#else
+ loadsegment(fs, __KERNEL_PERCPU);
+ loadsegment(gs, __KERNEL_STACK_CANARY);
+#endif
+
+ /* Restore the TSS, RO GDT, LDT, and usermode-relevant MSRs. */
+ fix_processor_context();
+
+ /*
+ * Now that we have descriptor tables fully restored and working
+ * exception handling, restore the usermode segments.
+ */
+#ifdef CONFIG_X86_64
+ loadsegment(ds, ctxt->es);
loadsegment(es, ctxt->es);
loadsegment(fs, ctxt->fs);
- loadsegment(gs, ctxt->gs);
- loadsegment(ss, ctxt->ss);
+ load_gs_index(ctxt->gs);
/*
- * sysenter MSRs
+ * Restore FSBASE and GSBASE after restoring the selectors, since
+ * restoring the selectors clobbers the bases. Keep in mind
+ * that MSR_KERNEL_GS_BASE is horribly misnamed.
*/
- if (boot_cpu_has(X86_FEATURE_SEP))
- enable_sep_cpu();
-#else
-/* CONFIG_X86_64 */
- asm volatile ("movw %0, %%ds" :: "r" (ctxt->ds));
- asm volatile ("movw %0, %%es" :: "r" (ctxt->es));
- asm volatile ("movw %0, %%fs" :: "r" (ctxt->fs));
- load_gs_index(ctxt->gs);
- asm volatile ("movw %0, %%ss" :: "r" (ctxt->ss));
-
wrmsrl(MSR_FS_BASE, ctxt->fs_base);
- wrmsrl(MSR_GS_BASE, ctxt->gs_base);
- wrmsrl(MSR_KERNEL_GS_BASE, ctxt->gs_kernel_base);
+ wrmsrl(MSR_KERNEL_GS_BASE, ctxt->usermode_gs_base);
+#elif defined(CONFIG_X86_32_LAZY_GS)
+ loadsegment(gs, ctxt->gs);
#endif
- fix_processor_context();
-
do_fpu_end();
x86_platform.restore_sched_clock_state();
mtrr_bp_restore();
diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
index 25012ab..ce5f431 100644
--- a/arch/x86/realmode/rm/Makefile
+++ b/arch/x86/realmode/rm/Makefile
@@ -47,7 +47,7 @@
targets += realmode.lds
$(obj)/realmode.lds: $(obj)/pasyms.h
-LDFLAGS_realmode.elf := --emit-relocs -T
+LDFLAGS_realmode.elf := -m elf_i386 --emit-relocs -T
CPPFLAGS_realmode.lds += -P -C -I$(objtree)/$(obj)
targets += realmode.elf
diff --git a/arch/xtensa/kernel/stacktrace.c b/arch/xtensa/kernel/stacktrace.c
index 7538d80..4835930 100644
--- a/arch/xtensa/kernel/stacktrace.c
+++ b/arch/xtensa/kernel/stacktrace.c
@@ -272,10 +272,14 @@ static int return_address_cb(struct stackframe *frame, void *data)
return 1;
}
+/*
+ * level == 0 is for the return address from the caller of this function,
+ * not from this function itself.
+ */
unsigned long return_address(unsigned level)
{
struct return_addr_data r = {
- .skip = level + 1,
+ .skip = level,
};
walk_stackframe(stack_pointer(NULL), return_address_cb, &r);
return r.addr;
diff --git a/block/bio.c b/block/bio.c
index 8bd6e0d..7c0f0d1 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1230,8 +1230,11 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
}
}
- if (bio_add_pc_page(q, bio, page, bytes, offset) < bytes)
+ if (bio_add_pc_page(q, bio, page, bytes, offset) < bytes) {
+ if (!map_data)
+ __free_page(page);
break;
+ }
len -= bytes;
offset = 0;
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 90d73a2..9a4e877 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -85,17 +85,17 @@ static int hash_walk_new_entry(struct crypto_hash_walk *walk)
int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err)
{
unsigned int alignmask = walk->alignmask;
- unsigned int nbytes = walk->entrylen;
walk->data -= walk->offset;
- if (nbytes && walk->offset & alignmask && !err) {
- walk->offset = ALIGN(walk->offset, alignmask + 1);
- nbytes = min(nbytes,
- ((unsigned int)(PAGE_SIZE)) - walk->offset);
- walk->entrylen -= nbytes;
+ if (walk->entrylen && (walk->offset & alignmask) && !err) {
+ unsigned int nbytes;
+ walk->offset = ALIGN(walk->offset, alignmask + 1);
+ nbytes = min(walk->entrylen,
+ (unsigned int)(PAGE_SIZE - walk->offset));
if (nbytes) {
+ walk->entrylen -= nbytes;
walk->data += walk->offset;
return nbytes;
}
@@ -115,7 +115,7 @@ int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err)
if (err)
return err;
- if (nbytes) {
+ if (walk->entrylen) {
walk->offset = 0;
walk->pg++;
return hash_walk_next(walk);
@@ -189,6 +189,21 @@ static int ahash_setkey_unaligned(struct crypto_ahash *tfm, const u8 *key,
return ret;
}
+static int ahash_nosetkey(struct crypto_ahash *tfm, const u8 *key,
+ unsigned int keylen)
+{
+ return -ENOSYS;
+}
+
+static void ahash_set_needkey(struct crypto_ahash *tfm)
+{
+ const struct hash_alg_common *alg = crypto_hash_alg_common(tfm);
+
+ if (tfm->setkey != ahash_nosetkey &&
+ !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY))
+ crypto_ahash_set_flags(tfm, CRYPTO_TFM_NEED_KEY);
+}
+
int crypto_ahash_setkey(struct crypto_ahash *tfm, const u8 *key,
unsigned int keylen)
{
@@ -200,20 +215,16 @@ int crypto_ahash_setkey(struct crypto_ahash *tfm, const u8 *key,
else
err = tfm->setkey(tfm, key, keylen);
- if (err)
+ if (unlikely(err)) {
+ ahash_set_needkey(tfm);
return err;
+ }
crypto_ahash_clear_flags(tfm, CRYPTO_TFM_NEED_KEY);
return 0;
}
EXPORT_SYMBOL_GPL(crypto_ahash_setkey);
-static int ahash_nosetkey(struct crypto_ahash *tfm, const u8 *key,
- unsigned int keylen)
-{
- return -ENOSYS;
-}
-
static inline unsigned int ahash_align_buffer_size(unsigned len,
unsigned long mask)
{
@@ -482,8 +493,7 @@ static int crypto_ahash_init_tfm(struct crypto_tfm *tfm)
if (alg->setkey) {
hash->setkey = alg->setkey;
- if (!(alg->halg.base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY))
- crypto_ahash_set_flags(hash, CRYPTO_TFM_NEED_KEY);
+ ahash_set_needkey(hash);
}
if (alg->export)
hash->export = alg->export;
diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index f654965..de81f71 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -52,7 +52,7 @@ static int crypto_pcbc_encrypt_segment(struct blkcipher_desc *desc,
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
- u8 *iv = walk->iv;
+ u8 * const iv = walk->iv;
do {
crypto_xor(iv, src, bsize);
@@ -76,7 +76,7 @@ static int crypto_pcbc_encrypt_inplace(struct blkcipher_desc *desc,
int bsize = crypto_cipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
- u8 *iv = walk->iv;
+ u8 * const iv = walk->iv;
u8 tmpbuf[bsize];
do {
@@ -89,8 +89,6 @@ static int crypto_pcbc_encrypt_inplace(struct blkcipher_desc *desc,
src += bsize;
} while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize);
-
return nbytes;
}
@@ -130,7 +128,7 @@ static int crypto_pcbc_decrypt_segment(struct blkcipher_desc *desc,
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
- u8 *iv = walk->iv;
+ u8 * const iv = walk->iv;
do {
fn(crypto_cipher_tfm(tfm), dst, src);
@@ -142,8 +140,6 @@ static int crypto_pcbc_decrypt_segment(struct blkcipher_desc *desc,
dst += bsize;
} while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize);
-
return nbytes;
}
@@ -156,7 +152,7 @@ static int crypto_pcbc_decrypt_inplace(struct blkcipher_desc *desc,
int bsize = crypto_cipher_blocksize(tfm);
unsigned int nbytes = walk->nbytes;
u8 *src = walk->src.virt.addr;
- u8 *iv = walk->iv;
+ u8 * const iv = walk->iv;
u8 tmpbuf[bsize];
do {
@@ -169,8 +165,6 @@ static int crypto_pcbc_decrypt_inplace(struct blkcipher_desc *desc,
src += bsize;
} while ((nbytes -= bsize) >= bsize);
- memcpy(walk->iv, iv, bsize);
-
return nbytes;
}
diff --git a/crypto/shash.c b/crypto/shash.c
index 4f047c7..a1c7609 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -52,6 +52,13 @@ static int shash_setkey_unaligned(struct crypto_shash *tfm, const u8 *key,
return err;
}
+static void shash_set_needkey(struct crypto_shash *tfm, struct shash_alg *alg)
+{
+ if (crypto_shash_alg_has_setkey(alg) &&
+ !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY))
+ crypto_shash_set_flags(tfm, CRYPTO_TFM_NEED_KEY);
+}
+
int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
unsigned int keylen)
{
@@ -64,8 +71,10 @@ int crypto_shash_setkey(struct crypto_shash *tfm, const u8 *key,
else
err = shash->setkey(tfm, key, keylen);
- if (err)
+ if (unlikely(err)) {
+ shash_set_needkey(tfm, shash);
return err;
+ }
crypto_shash_clear_flags(tfm, CRYPTO_TFM_NEED_KEY);
return 0;
@@ -367,7 +376,8 @@ int crypto_init_shash_ops_async(struct crypto_tfm *tfm)
crt->final = shash_async_final;
crt->finup = shash_async_finup;
crt->digest = shash_async_digest;
- crt->setkey = shash_async_setkey;
+ if (crypto_shash_alg_has_setkey(alg))
+ crt->setkey = shash_async_setkey;
crypto_ahash_set_flags(crt, crypto_shash_get_flags(shash) &
CRYPTO_TFM_NEED_KEY);
@@ -389,9 +399,7 @@ static int crypto_shash_init_tfm(struct crypto_tfm *tfm)
hash->descsize = alg->descsize;
- if (crypto_shash_alg_has_setkey(alg) &&
- !(alg->base.cra_flags & CRYPTO_ALG_OPTIONAL_KEY))
- crypto_shash_set_flags(hash, CRYPTO_TFM_NEED_KEY);
+ shash_set_needkey(hash, alg);
return 0;
}
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 4a349b7..125669f 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -4527,7 +4527,49 @@ static struct hash_testvec poly1305_tv_template[] = {
.psize = 80,
.digest = "\x13\x00\x00\x00\x00\x00\x00\x00"
"\x00\x00\x00\x00\x00\x00\x00\x00",
- },
+ }, { /* Regression test for overflow in AVX2 implementation */
+ .plaintext = "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff\xff\xff\xff\xff"
+ "\xff\xff\xff\xff",
+ .psize = 300,
+ .digest = "\xfb\x5e\x96\xd8\x61\xd5\xc7\xc8"
+ "\x78\xe5\x87\xcc\x2d\x5a\x22\xe1",
+ }
};
/* NHPoly1305 test vectors from https://github.com/google/adiantum */
diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 667dc5c..ea05731 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -2069,21 +2069,29 @@ static int __init intel_opregion_present(void)
return opregion;
}
+/* Check if the chassis-type indicates there is no builtin LCD panel */
static bool dmi_is_desktop(void)
{
const char *chassis_type;
+ unsigned long type;
chassis_type = dmi_get_system_info(DMI_CHASSIS_TYPE);
if (!chassis_type)
return false;
- if (!strcmp(chassis_type, "3") || /* 3: Desktop */
- !strcmp(chassis_type, "4") || /* 4: Low Profile Desktop */
- !strcmp(chassis_type, "5") || /* 5: Pizza Box */
- !strcmp(chassis_type, "6") || /* 6: Mini Tower */
- !strcmp(chassis_type, "7") || /* 7: Tower */
- !strcmp(chassis_type, "11")) /* 11: Main Server Chassis */
+ if (kstrtoul(chassis_type, 10, &type) != 0)
+ return false;
+
+ switch (type) {
+ case 0x03: /* Desktop */
+ case 0x04: /* Low Profile Desktop */
+ case 0x05: /* Pizza Box */
+ case 0x06: /* Mini Tower */
+ case 0x07: /* Tower */
+ case 0x10: /* Lunch Box */
+ case 0x11: /* Main Server Chassis */
return true;
+ }
return false;
}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index 201c7ce..98b513d 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -202,11 +202,15 @@ static int create_of_modalias(struct acpi_device *acpi_dev, char *modalias,
{
struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER };
const union acpi_object *of_compatible, *obj;
+ acpi_status status;
int len, count;
int i, nval;
char *c;
- acpi_get_name(acpi_dev->handle, ACPI_SINGLE_NAME, &buf);
+ status = acpi_get_name(acpi_dev->handle, ACPI_SINGLE_NAME, &buf);
+ if (ACPI_FAILURE(status))
+ return -ENODEV;
+
/* DT strings are all in lower case */
for (c = buf.pointer; *c != '\0'; c++)
*c = tolower(*c);
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 06cf742..31a0760 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -307,6 +307,13 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm,
return -EINVAL;
}
+ if (out_obj->type != ACPI_TYPE_BUFFER) {
+ dev_dbg(dev, "%s unexpected output object type cmd: %s type: %d\n",
+ dimm_name, cmd_name, out_obj->type);
+ rc = -EINVAL;
+ goto out;
+ }
+
if (call_pkg) {
call_pkg->nd_fw_size = out_obj->buffer.length;
memcpy(call_pkg->nd_payload + call_pkg->nd_size_in,
@@ -325,13 +332,6 @@ int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm,
return 0;
}
- if (out_obj->package.type != ACPI_TYPE_BUFFER) {
- dev_dbg(dev, "%s:%s unexpected output object type cmd: %s type: %d\n",
- __func__, dimm_name, cmd_name, out_obj->type);
- rc = -EINVAL;
- goto out;
- }
-
if (IS_ENABLED(CONFIG_ACPI_NFIT_DEBUG)) {
dev_dbg(dev, "%s:%s cmd: %s output length: %d\n", __func__,
dimm_name, cmd_name, out_obj->buffer.length);
diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c
index ad0b13a..4a76000 100644
--- a/drivers/acpi/sbs.c
+++ b/drivers/acpi/sbs.c
@@ -443,9 +443,13 @@ static int acpi_ac_get_present(struct acpi_sbs *sbs)
/*
* The spec requires that bit 4 always be 1. If it's not set, assume
- * that the implementation doesn't support an SBS charger
+ * that the implementation doesn't support an SBS charger.
+ *
+ * And on some MacBooks a status of 0xffff is always returned, no
+ * matter whether the charger is plugged in or not, which is also
+ * wrong, so ignore the SBS charger for those too.
*/
- if (!((status >> 4) & 0x1))
+ if (!((status >> 4) & 0x1) || status == 0xffff)
return -ENODEV;
sbs->charger_present = (status >> 15) & 0x1;
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 342c4de..01d5f5d 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -668,6 +668,26 @@ struct binder_transaction {
};
/**
+ * struct binder_object - union of flat binder object types
+ * @hdr: generic object header
+ * @fbo: binder object (nodes and refs)
+ * @fdo: file descriptor object
+ * @bbo: binder buffer pointer
+ * @fdao: file descriptor array
+ *
+ * Used for type-independent object copies
+ */
+struct binder_object {
+ union {
+ struct binder_object_header hdr;
+ struct flat_binder_object fbo;
+ struct binder_fd_object fdo;
+ struct binder_buffer_object bbo;
+ struct binder_fd_array_object fdao;
+ };
+};
+
+/**
* binder_proc_lock() - Acquire outer lock for given binder_proc
* @proc: struct binder_proc to acquire
*
@@ -2204,26 +2224,34 @@ static void binder_cleanup_transaction(struct binder_transaction *t,
}
/**
- * binder_validate_object() - checks for a valid metadata object in a buffer.
+ * binder_get_object() - gets object and checks for valid metadata
+ * @proc: binder_proc owning the buffer
* @buffer: binder_buffer that we're parsing.
- * @offset: offset in the buffer at which to validate an object.
+ * @offset: offset in the @buffer at which to validate an object.
+ * @object: struct binder_object to read into
*
* Return: If there's a valid metadata object at @offset in @buffer, the
- * size of that object. Otherwise, it returns zero.
+ * size of that object. Otherwise, it returns zero. The object
+ * is read into the struct binder_object pointed to by @object.
*/
-static size_t binder_validate_object(struct binder_buffer *buffer, u64 offset)
+static size_t binder_get_object(struct binder_proc *proc,
+ struct binder_buffer *buffer,
+ unsigned long offset,
+ struct binder_object *object)
{
- /* Check if we can read a header first */
+ size_t read_size;
struct binder_object_header *hdr;
size_t object_size = 0;
- if (buffer->data_size < sizeof(*hdr) ||
- offset > buffer->data_size - sizeof(*hdr) ||
+ read_size = min_t(size_t, sizeof(*object), buffer->data_size - offset);
+ if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
!IS_ALIGNED(offset, sizeof(u32)))
return 0;
+ binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
+ offset, read_size);
- /* Ok, now see if we can read a complete object. */
- hdr = (struct binder_object_header *)(buffer->data + offset);
+ /* Ok, now see if we read a complete object. */
+ hdr = &object->hdr;
switch (hdr->type) {
case BINDER_TYPE_BINDER:
case BINDER_TYPE_WEAK_BINDER:
@@ -2252,10 +2280,13 @@ static size_t binder_validate_object(struct binder_buffer *buffer, u64 offset)
/**
* binder_validate_ptr() - validates binder_buffer_object in a binder_buffer.
+ * @proc: binder_proc owning the buffer
* @b: binder_buffer containing the object
+ * @object: struct binder_object to read into
* @index: index in offset array at which the binder_buffer_object is
* located
- * @start: points to the start of the offset array
+ * @start_offset: points to the start of the offset array
+ * @object_offsetp: offset of @object read from @b
* @num_valid: the number of valid offsets in the offset array
*
* Return: If @index is within the valid range of the offset array
@@ -2266,34 +2297,46 @@ static size_t binder_validate_object(struct binder_buffer *buffer, u64 offset)
* Note that the offset found in index @index itself is not
* verified; this function assumes that @num_valid elements
* from @start were previously verified to have valid offsets.
+ * If @object_offsetp is non-NULL, then the offset within
+ * @b is written to it.
*/
-static struct binder_buffer_object *binder_validate_ptr(struct binder_buffer *b,
- binder_size_t index,
- binder_size_t *start,
- binder_size_t num_valid)
+static struct binder_buffer_object *binder_validate_ptr(
+ struct binder_proc *proc,
+ struct binder_buffer *b,
+ struct binder_object *object,
+ binder_size_t index,
+ binder_size_t start_offset,
+ binder_size_t *object_offsetp,
+ binder_size_t num_valid)
{
- struct binder_buffer_object *buffer_obj;
- binder_size_t *offp;
+ size_t object_size;
+ binder_size_t object_offset;
+ unsigned long buffer_offset;
if (index >= num_valid)
return NULL;
- offp = start + index;
- buffer_obj = (struct binder_buffer_object *)(b->data + *offp);
- if (buffer_obj->hdr.type != BINDER_TYPE_PTR)
+ buffer_offset = start_offset + sizeof(binder_size_t) * index;
+ binder_alloc_copy_from_buffer(&proc->alloc, &object_offset,
+ b, buffer_offset, sizeof(object_offset));
+ object_size = binder_get_object(proc, b, object_offset, object);
+ if (!object_size || object->hdr.type != BINDER_TYPE_PTR)
return NULL;
+ if (object_offsetp)
+ *object_offsetp = object_offset;
- return buffer_obj;
+ return &object->bbo;
}
/**
* binder_validate_fixup() - validates pointer/fd fixups happen in order.
+ * @proc: binder_proc owning the buffer
* @b: transaction buffer
- * @objects_start start of objects buffer
- * @buffer: binder_buffer_object in which to fix up
- * @offset: start offset in @buffer to fix up
- * @last_obj: last binder_buffer_object that we fixed up in
- * @last_min_offset: minimum fixup offset in @last_obj
+ * @objects_start_offset: offset to start of objects buffer
+ * @buffer_obj_offset: offset to binder_buffer_object in which to fix up
+ * @fixup_offset: start offset in @buffer to fix up
+ * @last_obj_offset: offset to last binder_buffer_object that we fixed
+ * @last_min_offset: minimum fixup offset in object at @last_obj_offset
*
* Return: %true if a fixup in buffer @buffer at offset @offset is
* allowed.
@@ -2324,63 +2367,83 @@ static struct binder_buffer_object *binder_validate_ptr(struct binder_buffer *b,
* C (parent = A, offset = 16)
* D (parent = B, offset = 0) // B is not A or any of A's parents
*/
-static bool binder_validate_fixup(struct binder_buffer *b,
- binder_size_t *objects_start,
- struct binder_buffer_object *buffer,
+static bool binder_validate_fixup(struct binder_proc *proc,
+ struct binder_buffer *b,
+ binder_size_t objects_start_offset,
+ binder_size_t buffer_obj_offset,
binder_size_t fixup_offset,
- struct binder_buffer_object *last_obj,
+ binder_size_t last_obj_offset,
binder_size_t last_min_offset)
{
- if (!last_obj) {
+ if (!last_obj_offset) {
/* Nothing to fix up in */
return false;
}
- while (last_obj != buffer) {
+ while (last_obj_offset != buffer_obj_offset) {
+ unsigned long buffer_offset;
+ struct binder_object last_object;
+ struct binder_buffer_object *last_bbo;
+ size_t object_size = binder_get_object(proc, b, last_obj_offset,
+ &last_object);
+ if (object_size != sizeof(*last_bbo))
+ return false;
+
+ last_bbo = &last_object.bbo;
/*
* Safe to retrieve the parent of last_obj, since it
* was already previously verified by the driver.
*/
- if ((last_obj->flags & BINDER_BUFFER_FLAG_HAS_PARENT) == 0)
+ if ((last_bbo->flags & BINDER_BUFFER_FLAG_HAS_PARENT) == 0)
return false;
- last_min_offset = last_obj->parent_offset + sizeof(uintptr_t);
- last_obj = (struct binder_buffer_object *)
- (b->data + *(objects_start + last_obj->parent));
+ last_min_offset = last_bbo->parent_offset + sizeof(uintptr_t);
+ buffer_offset = objects_start_offset +
+ sizeof(binder_size_t) * last_bbo->parent,
+ binder_alloc_copy_from_buffer(&proc->alloc, &last_obj_offset,
+ b, buffer_offset,
+ sizeof(last_obj_offset));
}
return (fixup_offset >= last_min_offset);
}
static void binder_transaction_buffer_release(struct binder_proc *proc,
struct binder_buffer *buffer,
- binder_size_t *failed_at)
+ binder_size_t failed_at,
+ bool is_failure)
{
- binder_size_t *offp, *off_start, *off_end;
int debug_id = buffer->debug_id;
+ binder_size_t off_start_offset, buffer_offset, off_end_offset;
binder_debug(BINDER_DEBUG_TRANSACTION,
- "%d buffer release %d, size %zd-%zd, failed at %pK\n",
+ "%d buffer release %d, size %zd-%zd, failed at %llx\n",
proc->pid, buffer->debug_id,
- buffer->data_size, buffer->offsets_size, failed_at);
+ buffer->data_size, buffer->offsets_size,
+ (unsigned long long)failed_at);
if (buffer->target_node)
binder_dec_node(buffer->target_node, 1, 0);
- off_start = (binder_size_t *)(buffer->data +
- ALIGN(buffer->data_size, sizeof(void *)));
- if (failed_at)
- off_end = failed_at;
- else
- off_end = (void *)off_start + buffer->offsets_size;
- for (offp = off_start; offp < off_end; offp++) {
+ off_start_offset = ALIGN(buffer->data_size, sizeof(void *));
+ off_end_offset = is_failure ? failed_at :
+ off_start_offset + buffer->offsets_size;
+ for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
+ buffer_offset += sizeof(binder_size_t)) {
struct binder_object_header *hdr;
- size_t object_size = binder_validate_object(buffer, *offp);
+ size_t object_size;
+ struct binder_object object;
+ binder_size_t object_offset;
+ binder_alloc_copy_from_buffer(&proc->alloc, &object_offset,
+ buffer, buffer_offset,
+ sizeof(object_offset));
+ object_size = binder_get_object(proc, buffer,
+ object_offset, &object);
if (object_size == 0) {
pr_err("transaction release %d bad object at offset %lld, size %zd\n",
- debug_id, (u64)*offp, buffer->data_size);
+ debug_id, (u64)object_offset, buffer->data_size);
continue;
}
- hdr = (struct binder_object_header *)(buffer->data + *offp);
+ hdr = &object.hdr;
switch (hdr->type) {
case BINDER_TYPE_BINDER:
case BINDER_TYPE_WEAK_BINDER: {
@@ -2438,28 +2501,25 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
case BINDER_TYPE_FDA: {
struct binder_fd_array_object *fda;
struct binder_buffer_object *parent;
- uintptr_t parent_buffer;
- u32 *fd_array;
+ struct binder_object ptr_object;
+ binder_size_t fda_offset;
size_t fd_index;
binder_size_t fd_buf_size;
+ binder_size_t num_valid;
+ num_valid = (buffer_offset - off_start_offset) /
+ sizeof(binder_size_t);
fda = to_binder_fd_array_object(hdr);
- parent = binder_validate_ptr(buffer, fda->parent,
- off_start,
- offp - off_start);
+ parent = binder_validate_ptr(proc, buffer, &ptr_object,
+ fda->parent,
+ off_start_offset,
+ NULL,
+ num_valid);
if (!parent) {
pr_err("transaction release %d bad parent offset",
debug_id);
continue;
}
- /*
- * Since the parent was already fixed up, convert it
- * back to kernel address space to access it
- */
- parent_buffer = parent->buffer -
- binder_alloc_get_user_buffer_offset(
- &proc->alloc);
-
fd_buf_size = sizeof(u32) * fda->num_fds;
if (fda->num_fds >= SIZE_MAX / sizeof(u32)) {
pr_err("transaction release %d invalid number of fds (%lld)\n",
@@ -2473,9 +2533,29 @@ static void binder_transaction_buffer_release(struct binder_proc *proc,
debug_id, (u64)fda->num_fds);
continue;
}
- fd_array = (u32 *)(parent_buffer + (uintptr_t)fda->parent_offset);
- for (fd_index = 0; fd_index < fda->num_fds; fd_index++)
- task_close_fd(proc, fd_array[fd_index]);
+ /*
+ * the source data for binder_buffer_object is visible
+ * to user-space and the @buffer element is the user
+ * pointer to the buffer_object containing the fd_array.
+ * Convert the address to an offset relative to
+ * the base of the transaction buffer.
+ */
+ fda_offset =
+ (parent->buffer - (uintptr_t)buffer->user_data) +
+ fda->parent_offset;
+ for (fd_index = 0; fd_index < fda->num_fds;
+ fd_index++) {
+ u32 fd;
+ binder_size_t offset = fda_offset +
+ fd_index * sizeof(fd);
+
+ binder_alloc_copy_from_buffer(&proc->alloc,
+ &fd,
+ buffer,
+ offset,
+ sizeof(fd));
+ task_close_fd(proc, fd);
+ }
} break;
default:
pr_err("transaction release %d bad object type %x\n",
@@ -2672,9 +2752,8 @@ static int binder_translate_fd_array(struct binder_fd_array_object *fda,
struct binder_transaction *in_reply_to)
{
binder_size_t fdi, fd_buf_size, num_installed_fds;
+ binder_size_t fda_offset;
int target_fd;
- uintptr_t parent_buffer;
- u32 *fd_array;
struct binder_proc *proc = thread->proc;
struct binder_proc *target_proc = t->to_proc;
@@ -2692,23 +2771,33 @@ static int binder_translate_fd_array(struct binder_fd_array_object *fda,
return -EINVAL;
}
/*
- * Since the parent was already fixed up, convert it
- * back to the kernel address space to access it
+ * the source data for binder_buffer_object is visible
+ * to user-space and the @buffer element is the user
+ * pointer to the buffer_object containing the fd_array.
+ * Convert the address to an offset relative to
+ * the base of the transaction buffer.
*/
- parent_buffer = parent->buffer -
- binder_alloc_get_user_buffer_offset(&target_proc->alloc);
- fd_array = (u32 *)(parent_buffer + (uintptr_t)fda->parent_offset);
- if (!IS_ALIGNED((unsigned long)fd_array, sizeof(u32))) {
+ fda_offset = (parent->buffer - (uintptr_t)t->buffer->user_data) +
+ fda->parent_offset;
+ if (!IS_ALIGNED((unsigned long)fda_offset, sizeof(u32))) {
binder_user_error("%d:%d parent offset not aligned correctly.\n",
proc->pid, thread->pid);
return -EINVAL;
}
for (fdi = 0; fdi < fda->num_fds; fdi++) {
- target_fd = binder_translate_fd(fd_array[fdi], t, thread,
- in_reply_to);
+ u32 fd;
+
+ binder_size_t offset = fda_offset + fdi * sizeof(fd);
+
+ binder_alloc_copy_from_buffer(&target_proc->alloc,
+ &fd, t->buffer,
+ offset, sizeof(fd));
+ target_fd = binder_translate_fd(fd, t, thread, in_reply_to);
if (target_fd < 0)
goto err_translate_fd_failed;
- fd_array[fdi] = target_fd;
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, offset,
+ &target_fd, sizeof(fd));
}
return 0;
@@ -2718,38 +2807,48 @@ static int binder_translate_fd_array(struct binder_fd_array_object *fda,
* installed so far.
*/
num_installed_fds = fdi;
- for (fdi = 0; fdi < num_installed_fds; fdi++)
- task_close_fd(target_proc, fd_array[fdi]);
+ for (fdi = 0; fdi < num_installed_fds; fdi++) {
+ u32 fd;
+ binder_size_t offset = fda_offset + fdi * sizeof(fd);
+ binder_alloc_copy_from_buffer(&target_proc->alloc,
+ &fd, t->buffer,
+ offset, sizeof(fd));
+ task_close_fd(target_proc, fd);
+ }
return target_fd;
}
static int binder_fixup_parent(struct binder_transaction *t,
struct binder_thread *thread,
struct binder_buffer_object *bp,
- binder_size_t *off_start,
+ binder_size_t off_start_offset,
binder_size_t num_valid,
- struct binder_buffer_object *last_fixup_obj,
+ binder_size_t last_fixup_obj_off,
binder_size_t last_fixup_min_off)
{
struct binder_buffer_object *parent;
- u8 *parent_buffer;
struct binder_buffer *b = t->buffer;
struct binder_proc *proc = thread->proc;
struct binder_proc *target_proc = t->to_proc;
+ struct binder_object object;
+ binder_size_t buffer_offset;
+ binder_size_t parent_offset;
if (!(bp->flags & BINDER_BUFFER_FLAG_HAS_PARENT))
return 0;
- parent = binder_validate_ptr(b, bp->parent, off_start, num_valid);
+ parent = binder_validate_ptr(target_proc, b, &object, bp->parent,
+ off_start_offset, &parent_offset,
+ num_valid);
if (!parent) {
binder_user_error("%d:%d got transaction with invalid parent offset or type\n",
proc->pid, thread->pid);
return -EINVAL;
}
- if (!binder_validate_fixup(b, off_start,
- parent, bp->parent_offset,
- last_fixup_obj,
+ if (!binder_validate_fixup(target_proc, b, off_start_offset,
+ parent_offset, bp->parent_offset,
+ last_fixup_obj_off,
last_fixup_min_off)) {
binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n",
proc->pid, thread->pid);
@@ -2763,10 +2862,10 @@ static int binder_fixup_parent(struct binder_transaction *t,
proc->pid, thread->pid);
return -EINVAL;
}
- parent_buffer = (u8 *)((uintptr_t)parent->buffer -
- binder_alloc_get_user_buffer_offset(
- &target_proc->alloc));
- *(binder_uintptr_t *)(parent_buffer + bp->parent_offset) = bp->buffer;
+ buffer_offset = bp->parent_offset +
+ (uintptr_t)parent->buffer - (uintptr_t)b->user_data;
+ binder_alloc_copy_to_buffer(&target_proc->alloc, b, buffer_offset,
+ &bp->buffer, sizeof(bp->buffer));
return 0;
}
@@ -2891,9 +2990,10 @@ static void binder_transaction(struct binder_proc *proc,
int ret;
struct binder_transaction *t;
struct binder_work *tcomplete;
- binder_size_t *offp, *off_end, *off_start;
+ binder_size_t buffer_offset = 0;
+ binder_size_t off_start_offset, off_end_offset;
binder_size_t off_min;
- u8 *sg_bufp, *sg_buf_end;
+ binder_size_t sg_buf_offset, sg_buf_end_offset;
struct binder_proc *target_proc = NULL;
struct binder_thread *target_thread = NULL;
struct binder_node *target_node = NULL;
@@ -2902,7 +3002,7 @@ static void binder_transaction(struct binder_proc *proc,
uint32_t return_error = 0;
uint32_t return_error_param = 0;
uint32_t return_error_line = 0;
- struct binder_buffer_object *last_fixup_obj = NULL;
+ binder_size_t last_fixup_obj_off = 0;
binder_size_t last_fixup_min_off = 0;
struct binder_context *context = proc->context;
int t_debug_id = atomic_inc_return(&binder_last_id);
@@ -3166,11 +3266,11 @@ static void binder_transaction(struct binder_proc *proc,
ALIGN(tr->offsets_size, sizeof(void *)) +
ALIGN(extra_buffers_size, sizeof(void *)) -
ALIGN(secctx_sz, sizeof(u64));
- char *kptr = t->buffer->data + buf_offset;
- t->security_ctx = (uintptr_t)kptr +
- binder_alloc_get_user_buffer_offset(&target_proc->alloc);
- memcpy(kptr, secctx, secctx_sz);
+ t->security_ctx = (uintptr_t)t->buffer->user_data + buf_offset;
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, buf_offset,
+ secctx, secctx_sz);
security_release_secctx(secctx, secctx_sz);
secctx = NULL;
}
@@ -3178,12 +3278,13 @@ static void binder_transaction(struct binder_proc *proc,
t->buffer->transaction = t;
t->buffer->target_node = target_node;
trace_binder_transaction_alloc_buf(t->buffer);
- off_start = (binder_size_t *)(t->buffer->data +
- ALIGN(tr->data_size, sizeof(void *)));
- offp = off_start;
- if (copy_from_user(t->buffer->data, (const void __user *)(uintptr_t)
- tr->data.ptr.buffer, tr->data_size)) {
+ if (binder_alloc_copy_user_to_buffer(
+ &target_proc->alloc,
+ t->buffer, 0,
+ (const void __user *)
+ (uintptr_t)tr->data.ptr.buffer,
+ tr->data_size)) {
binder_user_error("%d:%d got transaction with invalid data ptr\n",
proc->pid, thread->pid);
return_error = BR_FAILED_REPLY;
@@ -3191,8 +3292,13 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_copy_data_failed;
}
- if (copy_from_user(offp, (const void __user *)(uintptr_t)
- tr->data.ptr.offsets, tr->offsets_size)) {
+ if (binder_alloc_copy_user_to_buffer(
+ &target_proc->alloc,
+ t->buffer,
+ ALIGN(tr->data_size, sizeof(void *)),
+ (const void __user *)
+ (uintptr_t)tr->data.ptr.offsets,
+ tr->offsets_size)) {
binder_user_error("%d:%d got transaction with invalid offsets ptr\n",
proc->pid, thread->pid);
return_error = BR_FAILED_REPLY;
@@ -3217,17 +3323,30 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_bad_offset;
}
- off_end = (void *)off_start + tr->offsets_size;
- sg_bufp = (u8 *)(PTR_ALIGN(off_end, sizeof(void *)));
- sg_buf_end = sg_bufp + extra_buffers_size;
+ off_start_offset = ALIGN(tr->data_size, sizeof(void *));
+ buffer_offset = off_start_offset;
+ off_end_offset = off_start_offset + tr->offsets_size;
+ sg_buf_offset = ALIGN(off_end_offset, sizeof(void *));
+ sg_buf_end_offset = sg_buf_offset + extra_buffers_size;
off_min = 0;
- for (; offp < off_end; offp++) {
+ for (buffer_offset = off_start_offset; buffer_offset < off_end_offset;
+ buffer_offset += sizeof(binder_size_t)) {
struct binder_object_header *hdr;
- size_t object_size = binder_validate_object(t->buffer, *offp);
+ size_t object_size;
+ struct binder_object object;
+ binder_size_t object_offset;
- if (object_size == 0 || *offp < off_min) {
+ binder_alloc_copy_from_buffer(&target_proc->alloc,
+ &object_offset,
+ t->buffer,
+ buffer_offset,
+ sizeof(object_offset));
+ object_size = binder_get_object(target_proc, t->buffer,
+ object_offset, &object);
+ if (object_size == 0 || object_offset < off_min) {
binder_user_error("%d:%d got transaction with invalid offset (%lld, min %lld max %lld) or object.\n",
- proc->pid, thread->pid, (u64)*offp,
+ proc->pid, thread->pid,
+ (u64)object_offset,
(u64)off_min,
(u64)t->buffer->data_size);
return_error = BR_FAILED_REPLY;
@@ -3236,8 +3355,8 @@ static void binder_transaction(struct binder_proc *proc,
goto err_bad_offset;
}
- hdr = (struct binder_object_header *)(t->buffer->data + *offp);
- off_min = *offp + object_size;
+ hdr = &object.hdr;
+ off_min = object_offset + object_size;
switch (hdr->type) {
case BINDER_TYPE_BINDER:
case BINDER_TYPE_WEAK_BINDER: {
@@ -3251,6 +3370,9 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_translate_failed;
}
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, object_offset,
+ fp, sizeof(*fp));
} break;
case BINDER_TYPE_HANDLE:
case BINDER_TYPE_WEAK_HANDLE: {
@@ -3264,6 +3386,9 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_translate_failed;
}
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, object_offset,
+ fp, sizeof(*fp));
} break;
case BINDER_TYPE_FD: {
@@ -3279,14 +3404,23 @@ static void binder_transaction(struct binder_proc *proc,
}
fp->pad_binder = 0;
fp->fd = target_fd;
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, object_offset,
+ fp, sizeof(*fp));
} break;
case BINDER_TYPE_FDA: {
+ struct binder_object ptr_object;
+ binder_size_t parent_offset;
struct binder_fd_array_object *fda =
to_binder_fd_array_object(hdr);
+ size_t num_valid = (buffer_offset - off_start_offset) *
+ sizeof(binder_size_t);
struct binder_buffer_object *parent =
- binder_validate_ptr(t->buffer, fda->parent,
- off_start,
- offp - off_start);
+ binder_validate_ptr(target_proc, t->buffer,
+ &ptr_object, fda->parent,
+ off_start_offset,
+ &parent_offset,
+ num_valid);
if (!parent) {
binder_user_error("%d:%d got transaction with invalid parent offset or type\n",
proc->pid, thread->pid);
@@ -3295,9 +3429,11 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_bad_parent;
}
- if (!binder_validate_fixup(t->buffer, off_start,
- parent, fda->parent_offset,
- last_fixup_obj,
+ if (!binder_validate_fixup(target_proc, t->buffer,
+ off_start_offset,
+ parent_offset,
+ fda->parent_offset,
+ last_fixup_obj_off,
last_fixup_min_off)) {
binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n",
proc->pid, thread->pid);
@@ -3314,14 +3450,15 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_translate_failed;
}
- last_fixup_obj = parent;
+ last_fixup_obj_off = parent_offset;
last_fixup_min_off =
fda->parent_offset + sizeof(u32) * fda->num_fds;
} break;
case BINDER_TYPE_PTR: {
struct binder_buffer_object *bp =
to_binder_buffer_object(hdr);
- size_t buf_left = sg_buf_end - sg_bufp;
+ size_t buf_left = sg_buf_end_offset - sg_buf_offset;
+ size_t num_valid;
if (bp->length > buf_left) {
binder_user_error("%d:%d got transaction with too large buffer\n",
@@ -3331,9 +3468,13 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_bad_offset;
}
- if (copy_from_user(sg_bufp,
- (const void __user *)(uintptr_t)
- bp->buffer, bp->length)) {
+ if (binder_alloc_copy_user_to_buffer(
+ &target_proc->alloc,
+ t->buffer,
+ sg_buf_offset,
+ (const void __user *)
+ (uintptr_t)bp->buffer,
+ bp->length)) {
binder_user_error("%d:%d got transaction with invalid offsets ptr\n",
proc->pid, thread->pid);
return_error_param = -EFAULT;
@@ -3342,14 +3483,16 @@ static void binder_transaction(struct binder_proc *proc,
goto err_copy_data_failed;
}
/* Fixup buffer pointer to target proc address space */
- bp->buffer = (uintptr_t)sg_bufp +
- binder_alloc_get_user_buffer_offset(
- &target_proc->alloc);
- sg_bufp += ALIGN(bp->length, sizeof(u64));
+ bp->buffer = (uintptr_t)
+ t->buffer->user_data + sg_buf_offset;
+ sg_buf_offset += ALIGN(bp->length, sizeof(u64));
- ret = binder_fixup_parent(t, thread, bp, off_start,
- offp - off_start,
- last_fixup_obj,
+ num_valid = (buffer_offset - off_start_offset) *
+ sizeof(binder_size_t);
+ ret = binder_fixup_parent(t, thread, bp,
+ off_start_offset,
+ num_valid,
+ last_fixup_obj_off,
last_fixup_min_off);
if (ret < 0) {
return_error = BR_FAILED_REPLY;
@@ -3357,7 +3500,10 @@ static void binder_transaction(struct binder_proc *proc,
return_error_line = __LINE__;
goto err_translate_failed;
}
- last_fixup_obj = bp;
+ binder_alloc_copy_to_buffer(&target_proc->alloc,
+ t->buffer, object_offset,
+ bp, sizeof(*bp));
+ last_fixup_obj_off = object_offset;
last_fixup_min_off = 0;
} break;
default:
@@ -3437,7 +3583,8 @@ static void binder_transaction(struct binder_proc *proc,
err_bad_parent:
err_copy_data_failed:
trace_binder_transaction_failed_buffer_release(t->buffer);
- binder_transaction_buffer_release(target_proc, t->buffer, offp);
+ binder_transaction_buffer_release(target_proc, t->buffer,
+ buffer_offset, true);
if (target_node)
binder_dec_node_tmpref(target_node);
target_node = NULL;
@@ -3716,7 +3863,7 @@ static int binder_thread_write(struct binder_proc *proc,
binder_node_inner_unlock(buf_node);
}
trace_binder_transaction_buffer_release(buffer);
- binder_transaction_buffer_release(proc, buffer, NULL);
+ binder_transaction_buffer_release(proc, buffer, 0, false);
binder_alloc_free_buf(&proc->alloc, buffer);
break;
}
@@ -4324,9 +4471,7 @@ static int binder_thread_read(struct binder_proc *proc,
trd->data_size = t->buffer->data_size;
trd->offsets_size = t->buffer->offsets_size;
- trd->data.ptr.buffer = (binder_uintptr_t)
- ((uintptr_t)t->buffer->data +
- binder_alloc_get_user_buffer_offset(&proc->alloc));
+ trd->data.ptr.buffer = (uintptr_t)t->buffer->user_data;
trd->data.ptr.offsets = trd->data.ptr.buffer +
ALIGN(t->buffer->data_size,
sizeof(void *));
@@ -5384,7 +5529,7 @@ static void print_binder_transaction_ilocked(struct seq_file *m,
seq_printf(m, " node %d", buffer->target_node->debug_id);
seq_printf(m, " size %zd:%zd data %pK\n",
buffer->data_size, buffer->offsets_size,
- buffer->data);
+ buffer->user_data);
}
static void print_binder_work_ilocked(struct seq_file *m,
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 38bfa7a..34454b0 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -28,6 +28,8 @@
#include <linux/slab.h>
#include <linux/sched.h>
#include <linux/list_lru.h>
+#include <linux/uaccess.h>
+#include <linux/highmem.h>
#include "binder_alloc.h"
#include "binder_trace.h"
@@ -65,9 +67,8 @@ static size_t binder_alloc_buffer_size(struct binder_alloc *alloc,
struct binder_buffer *buffer)
{
if (list_is_last(&buffer->entry, &alloc->buffers))
- return (u8 *)alloc->buffer +
- alloc->buffer_size - (u8 *)buffer->data;
- return (u8 *)binder_buffer_next(buffer)->data - (u8 *)buffer->data;
+ return alloc->buffer + alloc->buffer_size - buffer->user_data;
+ return binder_buffer_next(buffer)->user_data - buffer->user_data;
}
static void binder_insert_free_buffer(struct binder_alloc *alloc,
@@ -117,9 +118,9 @@ static void binder_insert_allocated_buffer_locked(
buffer = rb_entry(parent, struct binder_buffer, rb_node);
BUG_ON(buffer->free);
- if (new_buffer->data < buffer->data)
+ if (new_buffer->user_data < buffer->user_data)
p = &parent->rb_left;
- else if (new_buffer->data > buffer->data)
+ else if (new_buffer->user_data > buffer->user_data)
p = &parent->rb_right;
else
BUG();
@@ -134,17 +135,17 @@ static struct binder_buffer *binder_alloc_prepare_to_free_locked(
{
struct rb_node *n = alloc->allocated_buffers.rb_node;
struct binder_buffer *buffer;
- void *kern_ptr;
+ void __user *uptr;
- kern_ptr = (void *)(user_ptr - alloc->user_buffer_offset);
+ uptr = (void __user *)user_ptr;
while (n) {
buffer = rb_entry(n, struct binder_buffer, rb_node);
BUG_ON(buffer->free);
- if (kern_ptr < buffer->data)
+ if (uptr < buffer->user_data)
n = n->rb_left;
- else if (kern_ptr > buffer->data)
+ else if (uptr > buffer->user_data)
n = n->rb_right;
else {
/*
@@ -184,9 +185,9 @@ struct binder_buffer *binder_alloc_prepare_to_free(struct binder_alloc *alloc,
}
static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
- void *start, void *end)
+ void __user *start, void __user *end)
{
- void *page_addr;
+ void __user *page_addr;
unsigned long user_page_addr;
struct binder_lru_page *page;
struct vm_area_struct *vma = NULL;
@@ -260,18 +261,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
page->alloc = alloc;
INIT_LIST_HEAD(&page->lru);
- ret = map_kernel_range_noflush((unsigned long)page_addr,
- PAGE_SIZE, PAGE_KERNEL,
- &page->page_ptr);
- flush_cache_vmap((unsigned long)page_addr,
- (unsigned long)page_addr + PAGE_SIZE);
- if (ret != 1) {
- pr_err("%d: binder_alloc_buf failed to map page at %pK in kernel\n",
- alloc->pid, page_addr);
- goto err_map_kernel_failed;
- }
- user_page_addr =
- (uintptr_t)page_addr + alloc->user_buffer_offset;
+ user_page_addr = (uintptr_t)page_addr;
ret = vm_insert_page(vma, user_page_addr, page[0].page_ptr);
if (ret) {
pr_err("%d: binder_alloc_buf failed to map page at %lx in userspace\n",
@@ -309,8 +299,6 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
continue;
err_vm_insert_page_failed:
- unmap_kernel_range((unsigned long)page_addr, PAGE_SIZE);
-err_map_kernel_failed:
__free_page(page->page_ptr);
page->page_ptr = NULL;
err_alloc_page_failed:
@@ -336,8 +324,8 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
struct binder_buffer *buffer;
size_t buffer_size;
struct rb_node *best_fit = NULL;
- void *has_page_addr;
- void *end_page_addr;
+ void __user *has_page_addr;
+ void __user *end_page_addr;
size_t size, data_offsets_size;
int ret;
@@ -431,15 +419,15 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
"%d: binder_alloc_buf size %zd got buffer %pK size %zd\n",
alloc->pid, size, buffer, buffer_size);
- has_page_addr =
- (void *)(((uintptr_t)buffer->data + buffer_size) & PAGE_MASK);
+ has_page_addr = (void __user *)
+ (((uintptr_t)buffer->user_data + buffer_size) & PAGE_MASK);
WARN_ON(n && buffer_size != size);
end_page_addr =
- (void *)PAGE_ALIGN((uintptr_t)buffer->data + size);
+ (void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data + size);
if (end_page_addr > has_page_addr)
end_page_addr = has_page_addr;
- ret = binder_update_page_range(alloc, 1,
- (void *)PAGE_ALIGN((uintptr_t)buffer->data), end_page_addr);
+ ret = binder_update_page_range(alloc, 1, (void __user *)
+ PAGE_ALIGN((uintptr_t)buffer->user_data), end_page_addr);
if (ret)
return ERR_PTR(ret);
@@ -452,7 +440,7 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
__func__, alloc->pid);
goto err_alloc_buf_struct_failed;
}
- new_buffer->data = (u8 *)buffer->data + size;
+ new_buffer->user_data = (u8 __user *)buffer->user_data + size;
list_add(&new_buffer->entry, &buffer->entry);
new_buffer->free = 1;
binder_insert_free_buffer(alloc, new_buffer);
@@ -478,8 +466,8 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
return buffer;
err_alloc_buf_struct_failed:
- binder_update_page_range(alloc, 0,
- (void *)PAGE_ALIGN((uintptr_t)buffer->data),
+ binder_update_page_range(alloc, 0, (void __user *)
+ PAGE_ALIGN((uintptr_t)buffer->user_data),
end_page_addr);
return ERR_PTR(-ENOMEM);
}
@@ -514,14 +502,15 @@ struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc,
return buffer;
}
-static void *buffer_start_page(struct binder_buffer *buffer)
+static void __user *buffer_start_page(struct binder_buffer *buffer)
{
- return (void *)((uintptr_t)buffer->data & PAGE_MASK);
+ return (void __user *)((uintptr_t)buffer->user_data & PAGE_MASK);
}
-static void *prev_buffer_end_page(struct binder_buffer *buffer)
+static void __user *prev_buffer_end_page(struct binder_buffer *buffer)
{
- return (void *)(((uintptr_t)(buffer->data) - 1) & PAGE_MASK);
+ return (void __user *)
+ (((uintptr_t)(buffer->user_data) - 1) & PAGE_MASK);
}
static void binder_delete_free_buffer(struct binder_alloc *alloc,
@@ -536,7 +525,8 @@ static void binder_delete_free_buffer(struct binder_alloc *alloc,
to_free = false;
binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
"%d: merge free, buffer %pK share page with %pK\n",
- alloc->pid, buffer->data, prev->data);
+ alloc->pid, buffer->user_data,
+ prev->user_data);
}
if (!list_is_last(&buffer->entry, &alloc->buffers)) {
@@ -546,23 +536,24 @@ static void binder_delete_free_buffer(struct binder_alloc *alloc,
binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
"%d: merge free, buffer %pK share page with %pK\n",
alloc->pid,
- buffer->data,
- next->data);
+ buffer->user_data,
+ next->user_data);
}
}
- if (PAGE_ALIGNED(buffer->data)) {
+ if (PAGE_ALIGNED(buffer->user_data)) {
binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
"%d: merge free, buffer start %pK is page aligned\n",
- alloc->pid, buffer->data);
+ alloc->pid, buffer->user_data);
to_free = false;
}
if (to_free) {
binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC,
"%d: merge free, buffer %pK do not share page with %pK or %pK\n",
- alloc->pid, buffer->data,
- prev->data, next ? next->data : NULL);
+ alloc->pid, buffer->user_data,
+ prev->user_data,
+ next ? next->user_data : NULL);
binder_update_page_range(alloc, 0, buffer_start_page(buffer),
buffer_start_page(buffer) + PAGE_SIZE);
}
@@ -588,8 +579,8 @@ static void binder_free_buf_locked(struct binder_alloc *alloc,
BUG_ON(buffer->free);
BUG_ON(size > buffer_size);
BUG_ON(buffer->transaction != NULL);
- BUG_ON(buffer->data < alloc->buffer);
- BUG_ON(buffer->data > alloc->buffer + alloc->buffer_size);
+ BUG_ON(buffer->user_data < alloc->buffer);
+ BUG_ON(buffer->user_data > alloc->buffer + alloc->buffer_size);
if (buffer->async_transaction) {
alloc->free_async_space += size + sizeof(struct binder_buffer);
@@ -600,8 +591,9 @@ static void binder_free_buf_locked(struct binder_alloc *alloc,
}
binder_update_page_range(alloc, 0,
- (void *)PAGE_ALIGN((uintptr_t)buffer->data),
- (void *)(((uintptr_t)buffer->data + buffer_size) & PAGE_MASK));
+ (void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data),
+ (void __user *)(((uintptr_t)
+ buffer->user_data + buffer_size) & PAGE_MASK));
rb_erase(&buffer->rb_node, &alloc->allocated_buffers);
buffer->free = 1;
@@ -657,7 +649,6 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
struct vm_area_struct *vma)
{
int ret;
- struct vm_struct *area;
const char *failure_string;
struct binder_buffer *buffer;
@@ -668,28 +659,9 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
goto err_already_mapped;
}
- area = get_vm_area(vma->vm_end - vma->vm_start, VM_ALLOC);
- if (area == NULL) {
- ret = -ENOMEM;
- failure_string = "get_vm_area";
- goto err_get_vm_area_failed;
- }
- alloc->buffer = area->addr;
- alloc->user_buffer_offset =
- vma->vm_start - (uintptr_t)alloc->buffer;
+ alloc->buffer = (void __user *)vma->vm_start;
mutex_unlock(&binder_alloc_mmap_lock);
-#ifdef CONFIG_CPU_CACHE_VIPT
- if (cache_is_vipt_aliasing()) {
- while (CACHE_COLOUR(
- (vma->vm_start ^ (uint32_t)alloc->buffer))) {
- pr_info("%s: %d %lx-%lx maps %pK bad alignment\n",
- __func__, alloc->pid, vma->vm_start,
- vma->vm_end, alloc->buffer);
- vma->vm_start += PAGE_SIZE;
- }
- }
-#endif
alloc->pages = kzalloc(sizeof(alloc->pages[0]) *
((vma->vm_end - vma->vm_start) / PAGE_SIZE),
GFP_KERNEL);
@@ -707,7 +679,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
goto err_alloc_buf_struct_failed;
}
- buffer->data = alloc->buffer;
+ buffer->user_data = alloc->buffer;
list_add(&buffer->entry, &alloc->buffers);
buffer->free = 1;
binder_insert_free_buffer(alloc, buffer);
@@ -725,9 +697,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
alloc->pages = NULL;
err_alloc_pages_failed:
mutex_lock(&binder_alloc_mmap_lock);
- vfree(alloc->buffer);
alloc->buffer = NULL;
-err_get_vm_area_failed:
err_already_mapped:
mutex_unlock(&binder_alloc_mmap_lock);
pr_err("%s: %d %lx-%lx %s failed %d\n", __func__,
@@ -771,7 +741,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
int i;
for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
- void *page_addr;
+ void __user *page_addr;
bool on_lru;
if (!alloc->pages[i].page_ptr)
@@ -784,12 +754,10 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
"%s: %d: page %d at %pK %s\n",
__func__, alloc->pid, i, page_addr,
on_lru ? "on lru" : "active");
- unmap_kernel_range((unsigned long)page_addr, PAGE_SIZE);
__free_page(alloc->pages[i].page_ptr);
page_count++;
}
kfree(alloc->pages);
- vfree(alloc->buffer);
}
mutex_unlock(&alloc->mutex);
if (alloc->vma_vm_mm)
@@ -804,7 +772,7 @@ static void print_binder_buffer(struct seq_file *m, const char *prefix,
struct binder_buffer *buffer)
{
seq_printf(m, "%s %d: %pK size %zd:%zd:%zd %s\n",
- prefix, buffer->debug_id, buffer->data,
+ prefix, buffer->debug_id, buffer->user_data,
buffer->data_size, buffer->offsets_size,
buffer->extra_buffers_size,
buffer->transaction ? "active" : "delivered");
@@ -938,10 +906,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
if (vma) {
trace_binder_unmap_user_start(alloc, index);
- zap_page_range(vma,
- page_addr +
- alloc->user_buffer_offset,
- PAGE_SIZE, NULL);
+ zap_page_range(vma, page_addr, PAGE_SIZE, NULL);
trace_binder_unmap_user_end(alloc, index);
@@ -951,7 +916,6 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
trace_binder_unmap_kernel_start(alloc, index);
- unmap_kernel_range(page_addr, PAGE_SIZE);
__free_page(page->page_ptr);
page->page_ptr = NULL;
@@ -1018,3 +982,173 @@ int binder_alloc_shrinker_init(void)
}
return ret;
}
+
+/**
+ * check_buffer() - verify that buffer/offset is safe to access
+ * @alloc: binder_alloc for this proc
+ * @buffer: binder buffer to be accessed
+ * @offset: offset into @buffer data
+ * @bytes: bytes to access from offset
+ *
+ * Check that the @offset/@bytes are within the size of the given
+ * @buffer and that the buffer is currently active and not freeable.
+ * Offsets must also be multiples of sizeof(u32). The kernel is
+ * allowed to touch the buffer in two cases:
+ *
+ * 1) when the buffer is being created:
+ * (buffer->free == 0 && buffer->allow_user_free == 0)
+ * 2) when the buffer is being torn down:
+ * (buffer->free == 0 && buffer->transaction == NULL).
+ *
+ * Return: true if the buffer is safe to access
+ */
+static inline bool check_buffer(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t offset, size_t bytes)
+{
+ size_t buffer_size = binder_alloc_buffer_size(alloc, buffer);
+
+ return buffer_size >= bytes &&
+ offset <= buffer_size - bytes &&
+ IS_ALIGNED(offset, sizeof(u32)) &&
+ !buffer->free &&
+ (!buffer->allow_user_free || !buffer->transaction);
+}
+
+/**
+ * binder_alloc_get_page() - get kernel pointer for given buffer offset
+ * @alloc: binder_alloc for this proc
+ * @buffer: binder buffer to be accessed
+ * @buffer_offset: offset into @buffer data
+ * @pgoffp: address to copy final page offset to
+ *
+ * Lookup the struct page corresponding to the address
+ * at @buffer_offset into @buffer->user_data. If @pgoffp is not
+ * NULL, the byte-offset into the page is written there.
+ *
+ * The caller is responsible to ensure that the offset points
+ * to a valid address within the @buffer and that @buffer is
+ * not freeable by the user. Since it can't be freed, we are
+ * guaranteed that the corresponding elements of @alloc->pages[]
+ * cannot change.
+ *
+ * Return: struct page
+ */
+static struct page *binder_alloc_get_page(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ pgoff_t *pgoffp)
+{
+ binder_size_t buffer_space_offset = buffer_offset +
+ (buffer->user_data - alloc->buffer);
+ pgoff_t pgoff = buffer_space_offset & ~PAGE_MASK;
+ size_t index = buffer_space_offset >> PAGE_SHIFT;
+ struct binder_lru_page *lru_page;
+
+ lru_page = &alloc->pages[index];
+ *pgoffp = pgoff;
+ return lru_page->page_ptr;
+}
+
+/**
+ * binder_alloc_copy_user_to_buffer() - copy src user to tgt user
+ * @alloc: binder_alloc for this proc
+ * @buffer: binder buffer to be accessed
+ * @buffer_offset: offset into @buffer data
+ * @from: userspace pointer to source buffer
+ * @bytes: bytes to copy
+ *
+ * Copy bytes from source userspace to target buffer.
+ *
+ * Return: bytes remaining to be copied
+ */
+unsigned long
+binder_alloc_copy_user_to_buffer(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ const void __user *from,
+ size_t bytes)
+{
+ if (!check_buffer(alloc, buffer, buffer_offset, bytes))
+ return bytes;
+
+ while (bytes) {
+ unsigned long size;
+ unsigned long ret;
+ struct page *page;
+ pgoff_t pgoff;
+ void *kptr;
+
+ page = binder_alloc_get_page(alloc, buffer,
+ buffer_offset, &pgoff);
+ size = min_t(size_t, bytes, PAGE_SIZE - pgoff);
+ kptr = kmap(page) + pgoff;
+ ret = copy_from_user(kptr, from, size);
+ kunmap(page);
+ if (ret)
+ return bytes - size + ret;
+ bytes -= size;
+ from += size;
+ buffer_offset += size;
+ }
+ return 0;
+}
+
+static void binder_alloc_do_buffer_copy(struct binder_alloc *alloc,
+ bool to_buffer,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ void *ptr,
+ size_t bytes)
+{
+ /* All copies must be 32-bit aligned and 32-bit size */
+ BUG_ON(!check_buffer(alloc, buffer, buffer_offset, bytes));
+
+ while (bytes) {
+ unsigned long size;
+ struct page *page;
+ pgoff_t pgoff;
+ void *tmpptr;
+ void *base_ptr;
+
+ page = binder_alloc_get_page(alloc, buffer,
+ buffer_offset, &pgoff);
+ size = min_t(size_t, bytes, PAGE_SIZE - pgoff);
+ base_ptr = kmap_atomic(page);
+ tmpptr = base_ptr + pgoff;
+ if (to_buffer)
+ memcpy(tmpptr, ptr, size);
+ else
+ memcpy(ptr, tmpptr, size);
+ /*
+ * kunmap_atomic() takes care of flushing the cache
+ * if this device has VIVT cache arch
+ */
+ kunmap_atomic(base_ptr);
+ bytes -= size;
+ pgoff = 0;
+ ptr = ptr + size;
+ buffer_offset += size;
+ }
+}
+
+void binder_alloc_copy_to_buffer(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ void *src,
+ size_t bytes)
+{
+ binder_alloc_do_buffer_copy(alloc, true, buffer, buffer_offset,
+ src, bytes);
+}
+
+void binder_alloc_copy_from_buffer(struct binder_alloc *alloc,
+ void *dest,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ size_t bytes)
+{
+ binder_alloc_do_buffer_copy(alloc, false, buffer, buffer_offset,
+ dest, bytes);
+}
+
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
index fb3238c..b60d161 100644
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -22,6 +22,7 @@
#include <linux/vmalloc.h>
#include <linux/slab.h>
#include <linux/list_lru.h>
+#include <uapi/linux/android/binder.h>
extern struct list_lru binder_alloc_lru;
struct binder_transaction;
@@ -30,16 +31,16 @@ struct binder_transaction;
* struct binder_buffer - buffer used for binder transactions
* @entry: entry alloc->buffers
* @rb_node: node for allocated_buffers/free_buffers rb trees
- * @free: true if buffer is free
- * @allow_user_free: describe the second member of struct blah,
- * @async_transaction: describe the second member of struct blah,
- * @debug_id: describe the second member of struct blah,
- * @transaction: describe the second member of struct blah,
- * @target_node: describe the second member of struct blah,
- * @data_size: describe the second member of struct blah,
- * @offsets_size: describe the second member of struct blah,
- * @extra_buffers_size: describe the second member of struct blah,
- * @data:i describe the second member of struct blah,
+ * @free: %true if buffer is free
+ * @allow_user_free: %true if user is allowed to free buffer
+ * @async_transaction: %true if buffer is in use for an async txn
+ * @debug_id: unique ID for debugging
+ * @transaction: pointer to associated struct binder_transaction
+ * @target_node: struct binder_node associated with this buffer
+ * @data_size: size of @transaction data
+ * @offsets_size: size of array of offsets
+ * @extra_buffers_size: size of space for other objects (like sg lists)
+ * @user_data: user pointer to base of buffer space
*
* Bookkeeping structure for binder transaction buffers
*/
@@ -58,7 +59,7 @@ struct binder_buffer {
size_t data_size;
size_t offsets_size;
size_t extra_buffers_size;
- void *data;
+ void __user *user_data;
};
/**
@@ -81,7 +82,6 @@ struct binder_lru_page {
* (invariant after init)
* @vma_vm_mm: copy of vma->vm_mm (invarient after mmap)
* @buffer: base of per-proc address space mapped via mmap
- * @user_buffer_offset: offset between user and kernel VAs for buffer
* @buffers: list of all buffers for this proc
* @free_buffers: rb tree of buffers available for allocation
* sorted by size
@@ -102,8 +102,7 @@ struct binder_alloc {
struct mutex mutex;
struct vm_area_struct *vma;
struct mm_struct *vma_vm_mm;
- void *buffer;
- ptrdiff_t user_buffer_offset;
+ void __user *buffer;
struct list_head buffers;
struct rb_root free_buffers;
struct rb_root allocated_buffers;
@@ -162,26 +161,24 @@ binder_alloc_get_free_async_space(struct binder_alloc *alloc)
return free_async_space;
}
-/**
- * binder_alloc_get_user_buffer_offset() - get offset between kernel/user addrs
- * @alloc: binder_alloc for this proc
- *
- * Return: the offset between kernel and user-space addresses to use for
- * virtual address conversion
- */
-static inline ptrdiff_t
-binder_alloc_get_user_buffer_offset(struct binder_alloc *alloc)
-{
- /*
- * user_buffer_offset is constant if vma is set and
- * undefined if vma is not set. It is possible to
- * get here with !alloc->vma if the target process
- * is dying while a transaction is being initiated.
- * Returning the old value is ok in this case and
- * the transaction will fail.
- */
- return alloc->user_buffer_offset;
-}
+unsigned long
+binder_alloc_copy_user_to_buffer(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ const void __user *from,
+ size_t bytes);
+
+void binder_alloc_copy_to_buffer(struct binder_alloc *alloc,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ void *src,
+ size_t bytes);
+
+void binder_alloc_copy_from_buffer(struct binder_alloc *alloc,
+ void *dest,
+ struct binder_buffer *buffer,
+ binder_size_t buffer_offset,
+ size_t bytes);
#endif /* _LINUX_BINDER_ALLOC_H */
diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
index 8bd7bce..b727089 100644
--- a/drivers/android/binder_alloc_selftest.c
+++ b/drivers/android/binder_alloc_selftest.c
@@ -102,11 +102,12 @@ static bool check_buffer_pages_allocated(struct binder_alloc *alloc,
struct binder_buffer *buffer,
size_t size)
{
- void *page_addr, *end;
+ void __user *page_addr;
+ void __user *end;
int page_index;
- end = (void *)PAGE_ALIGN((uintptr_t)buffer->data + size);
- page_addr = buffer->data;
+ end = (void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data + size);
+ page_addr = buffer->user_data;
for (; page_addr < end; page_addr += PAGE_SIZE) {
page_index = (page_addr - alloc->buffer) / PAGE_SIZE;
if (!alloc->pages[page_index].page_ptr ||
diff --git a/drivers/android/binder_trace.h b/drivers/android/binder_trace.h
index b11dffc..7674231 100644
--- a/drivers/android/binder_trace.h
+++ b/drivers/android/binder_trace.h
@@ -296,7 +296,7 @@ DEFINE_EVENT(binder_buffer_class, binder_transaction_failed_buffer_release,
TRACE_EVENT(binder_update_page_range,
TP_PROTO(struct binder_alloc *alloc, bool allocate,
- void *start, void *end),
+ void __user *start, void __user *end),
TP_ARGS(alloc, allocate, start, end),
TP_STRUCT__entry(
__field(int, proc)
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index b758633..39676b3 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -116,7 +116,6 @@ void wakeup_source_drop(struct wakeup_source *ws)
if (!ws)
return;
- del_timer_sync(&ws->timer);
__pm_relax(ws);
}
EXPORT_SYMBOL_GPL(wakeup_source_drop);
@@ -204,6 +203,13 @@ void wakeup_source_remove(struct wakeup_source *ws)
list_del_rcu(&ws->entry);
spin_unlock_irqrestore(&events_lock, flags);
synchronize_srcu(&wakeup_srcu);
+
+ del_timer_sync(&ws->timer);
+ /*
+ * Clear timer.function to make wakeup_source_not_registered() treat
+ * this wakeup source as not registered.
+ */
+ ws->timer.function = NULL;
}
EXPORT_SYMBOL_GPL(wakeup_source_remove);
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 326b9ba..6914c6e 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3752,7 +3752,7 @@ static unsigned int floppy_check_events(struct gendisk *disk,
if (time_after(jiffies, UDRS->last_checked + UDP->checkfreq)) {
if (lock_fdc(drive))
- return -EINTR;
+ return 0;
poll_drive(false, 0);
process_fd_request();
}
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d7d2494..b3e432a 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -82,7 +82,6 @@
static DEFINE_IDR(loop_index_idr);
static DEFINE_MUTEX(loop_index_mutex);
-static DEFINE_MUTEX(loop_ctl_mutex);
static int max_part;
static int part_shift;
@@ -1034,7 +1033,7 @@ static int loop_clr_fd(struct loop_device *lo)
*/
if (atomic_read(&lo->lo_refcnt) > 1) {
lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
return 0;
}
@@ -1084,12 +1083,12 @@ static int loop_clr_fd(struct loop_device *lo)
if (!part_shift)
lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
loop_unprepare_queue(lo);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
/*
- * Need not hold loop_ctl_mutex to fput backing file.
- * Calling fput holding loop_ctl_mutex triggers a circular
+ * Need not hold lo_ctl_mutex to fput backing file.
+ * Calling fput holding lo_ctl_mutex triggers a circular
* lock dependency possibility warning as fput can take
- * bd_mutex which is usually taken before loop_ctl_mutex.
+ * bd_mutex which is usually taken before lo_ctl_mutex.
*/
fput(filp);
return 0;
@@ -1402,7 +1401,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
struct loop_device *lo = bdev->bd_disk->private_data;
int err;
- mutex_lock_nested(&loop_ctl_mutex, 1);
+ mutex_lock_nested(&lo->lo_ctl_mutex, 1);
switch (cmd) {
case LOOP_SET_FD:
err = loop_set_fd(lo, mode, bdev, arg);
@@ -1411,7 +1410,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
err = loop_change_fd(lo, bdev, arg);
break;
case LOOP_CLR_FD:
- /* loop_clr_fd would have unlocked loop_ctl_mutex on success */
+ /* loop_clr_fd would have unlocked lo_ctl_mutex on success */
err = loop_clr_fd(lo);
if (!err)
goto out_unlocked;
@@ -1452,7 +1451,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
}
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
out_unlocked:
return err;
@@ -1585,16 +1584,16 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
switch(cmd) {
case LOOP_SET_STATUS:
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
err = loop_set_status_compat(
lo, (const struct compat_loop_info __user *) arg);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_GET_STATUS:
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
err = loop_get_status_compat(
lo, (struct compat_loop_info __user *) arg);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
case LOOP_SET_CAPACITY:
case LOOP_CLR_FD:
@@ -1639,7 +1638,7 @@ static void __lo_release(struct loop_device *lo)
if (atomic_dec_return(&lo->lo_refcnt))
return;
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
/*
* In autoclear mode, stop the loop thread
@@ -1656,7 +1655,7 @@ static void __lo_release(struct loop_device *lo)
loop_flush(lo);
}
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
}
static void lo_release(struct gendisk *disk, fmode_t mode)
@@ -1702,10 +1701,10 @@ static int unregister_transfer_cb(int id, void *ptr, void *data)
struct loop_device *lo = ptr;
struct loop_func_table *xfer = data;
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_encryption == xfer)
loop_release_xfer(lo);
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
return 0;
}
@@ -1872,6 +1871,7 @@ static int loop_add(struct loop_device **l, int i)
if (!part_shift)
disk->flags |= GENHD_FL_NO_PART_SCAN;
disk->flags |= GENHD_FL_EXT_DEVT;
+ mutex_init(&lo->lo_ctl_mutex);
atomic_set(&lo->lo_refcnt, 0);
lo->lo_number = i;
spin_lock_init(&lo->lo_lock);
@@ -1984,19 +1984,19 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
ret = loop_lookup(&lo, parm);
if (ret < 0)
break;
- mutex_lock(&loop_ctl_mutex);
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_state != Lo_unbound) {
ret = -EBUSY;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
}
if (atomic_read(&lo->lo_refcnt) > 0) {
ret = -EBUSY;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
break;
}
lo->lo_disk->private_data = NULL;
- mutex_unlock(&loop_ctl_mutex);
+ mutex_unlock(&lo->lo_ctl_mutex);
idr_remove(&loop_index_idr, lo->lo_number);
loop_remove(lo);
break;
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index a923e74..60f0fd2 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -55,6 +55,7 @@ struct loop_device {
spinlock_t lo_lock;
int lo_state;
+ struct mutex lo_ctl_mutex;
struct kthread_worker worker;
struct task_struct *worker_task;
bool use_dio;
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index ff42808..a46f188 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -265,6 +265,7 @@
/* #define ERRLOGMASK (CD_WARNING|CD_OPEN|CD_COUNT_TRACKS|CD_CLOSE) */
/* #define ERRLOGMASK (CD_WARNING|CD_REG_UNREG|CD_DO_IOCTL|CD_OPEN|CD_CLOSE|CD_COUNT_TRACKS) */
+#include <linux/atomic.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/major.h>
@@ -3683,9 +3684,9 @@ static struct ctl_table_header *cdrom_sysctl_header;
static void cdrom_sysctl_register(void)
{
- static int initialized;
+ static atomic_t initialized = ATOMIC_INIT(0);
- if (initialized == 1)
+ if (!atomic_add_unless(&initialized, 1, 1))
return;
cdrom_sysctl_header = register_sysctl_table(cdrom_root_table);
@@ -3696,8 +3697,6 @@ static void cdrom_sysctl_register(void)
cdrom_sysctl_settings.debug = debug;
cdrom_sysctl_settings.lock = lockdoor;
cdrom_sysctl_settings.check = check_media_type;
-
- initialized = 1;
}
static void cdrom_sysctl_unregister(void)
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 87d8c3c..53125ec 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -379,7 +379,7 @@
config R3964
tristate "Siemens R3964 line discipline"
- depends on TTY
+ depends on TTY && BROKEN
---help---
This driver allows synchronous communication with devices using the
Siemens R3964 packet protocol. Unless you are dealing with special
diff --git a/drivers/char/hpet.c b/drivers/char/hpet.c
index 50272fe..818a8d4 100644
--- a/drivers/char/hpet.c
+++ b/drivers/char/hpet.c
@@ -376,7 +376,7 @@ static __init int hpet_mmap_enable(char *str)
pr_info("HPET mmap %s\n", hpet_mmap_enabled ? "enabled" : "disabled");
return 1;
}
-__setup("hpet_mmap", hpet_mmap_enable);
+__setup("hpet_mmap=", hpet_mmap_enable);
static int hpet_mmap(struct file *file, struct vm_area_struct *vma)
{
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
index 3fa2f8a..1c5c431 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -73,7 +73,7 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
if (!vi->busy) {
vi->busy = true;
- init_completion(&vi->have_data);
+ reinit_completion(&vi->have_data);
register_buffer(vi, buf, size);
}
diff --git a/drivers/char/random.c b/drivers/char/random.c
index a9ce2f7..4b0edcd 100755
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -893,6 +893,7 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
if (crng == &primary_crng && crng_init < 2) {
numa_crng_init();
crng_init = 2;
+ spin_unlock_irqrestore(&crng->lock, flags);
process_random_ready_list();
wake_up_interruptible(&crng_init_wait);
pr_notice("random: crng init done\n");
@@ -908,8 +909,9 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
urandom_warning.missed);
urandom_warning.missed = 0;
}
+ } else {
+ spin_unlock_irqrestore(&crng->lock, flags);
}
- spin_unlock_irqrestore(&crng->lock, flags);
}
static inline void maybe_reseed_primary_crng(void)
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index fa0f668..d29f784 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -102,19 +102,29 @@ static int crb_recv(struct tpm_chip *chip, u8 *buf, size_t count)
struct crb_priv *priv = dev_get_drvdata(&chip->dev);
unsigned int expected;
- /* sanity check */
- if (count < 6)
+ /* A sanity check that the upper layer wants to get at least the header
+ * as that is the minimum size for any TPM response.
+ */
+ if (count < TPM_HEADER_SIZE)
return -EIO;
+ /* If this bit is set, according to the spec, the TPM is in
+ * unrecoverable condition.
+ */
if (ioread32(&priv->cca->sts) & CRB_CTRL_STS_ERROR)
return -EIO;
- memcpy_fromio(buf, priv->rsp, 6);
- expected = be32_to_cpup((__be32 *) &buf[2]);
- if (expected > count || expected < 6)
+ /* Read the first 8 bytes in order to get the length of the response.
+ * We read exactly a quad word in order to make sure that the remaining
+ * reads will be aligned.
+ */
+ memcpy_fromio(buf, priv->rsp, 8);
+
+ expected = be32_to_cpup((__be32 *)&buf[2]);
+ if (expected > count || expected < TPM_HEADER_SIZE)
return -EIO;
- memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
+ memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
return expected;
}
diff --git a/drivers/char/tpm/tpm_i2c_atmel.c b/drivers/char/tpm/tpm_i2c_atmel.c
index 95ce2e9..cc4e642 100644
--- a/drivers/char/tpm/tpm_i2c_atmel.c
+++ b/drivers/char/tpm/tpm_i2c_atmel.c
@@ -65,7 +65,15 @@ static int i2c_atmel_send(struct tpm_chip *chip, u8 *buf, size_t len)
dev_dbg(&chip->dev,
"%s(buf=%*ph len=%0zx) -> sts=%d\n", __func__,
(int)min_t(size_t, 64, len), buf, len, status);
- return status;
+
+ if (status < 0)
+ return status;
+
+ /* The upper layer does not support incomplete sends. */
+ if (status != len)
+ return -E2BIG;
+
+ return 0;
}
static int i2c_atmel_recv(struct tpm_chip *chip, u8 *buf, size_t count)
diff --git a/drivers/clk/clk-twl6040.c b/drivers/clk/clk-twl6040.c
index 7b222a5..82d615f 100644
--- a/drivers/clk/clk-twl6040.c
+++ b/drivers/clk/clk-twl6040.c
@@ -41,6 +41,43 @@ static int twl6040_pdmclk_is_prepared(struct clk_hw *hw)
return pdmclk->enabled;
}
+static int twl6040_pdmclk_reset_one_clock(struct twl6040_pdmclk *pdmclk,
+ unsigned int reg)
+{
+ const u8 reset_mask = TWL6040_HPLLRST; /* Same for HPPLL and LPPLL */
+ int ret;
+
+ ret = twl6040_set_bits(pdmclk->twl6040, reg, reset_mask);
+ if (ret < 0)
+ return ret;
+
+ ret = twl6040_clear_bits(pdmclk->twl6040, reg, reset_mask);
+ if (ret < 0)
+ return ret;
+
+ return 0;
+}
+
+/*
+ * TWL6040A2 Phoenix Audio IC erratum #6: "PDM Clock Generation Issue At
+ * Cold Temperature". This affects cold boot and deeper idle states it
+ * seems. The workaround consists of resetting HPPLL and LPPLL.
+ */
+static int twl6040_pdmclk_quirk_reset_clocks(struct twl6040_pdmclk *pdmclk)
+{
+ int ret;
+
+ ret = twl6040_pdmclk_reset_one_clock(pdmclk, TWL6040_REG_HPPLLCTL);
+ if (ret)
+ return ret;
+
+ ret = twl6040_pdmclk_reset_one_clock(pdmclk, TWL6040_REG_LPPLLCTL);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
static int twl6040_pdmclk_prepare(struct clk_hw *hw)
{
struct twl6040_pdmclk *pdmclk = container_of(hw, struct twl6040_pdmclk,
@@ -48,8 +85,20 @@ static int twl6040_pdmclk_prepare(struct clk_hw *hw)
int ret;
ret = twl6040_power(pdmclk->twl6040, 1);
- if (!ret)
- pdmclk->enabled = 1;
+ if (ret)
+ return ret;
+
+ ret = twl6040_pdmclk_quirk_reset_clocks(pdmclk);
+ if (ret)
+ goto out_err;
+
+ pdmclk->enabled = 1;
+
+ return 0;
+
+out_err:
+ dev_err(pdmclk->dev, "%s: error %i\n", __func__, ret);
+ twl6040_power(pdmclk->twl6040, 0);
return ret;
}
diff --git a/drivers/clk/ingenic/cgu.c b/drivers/clk/ingenic/cgu.c
index e8248f9..4dec9e9 100644
--- a/drivers/clk/ingenic/cgu.c
+++ b/drivers/clk/ingenic/cgu.c
@@ -364,16 +364,16 @@ ingenic_clk_round_rate(struct clk_hw *hw, unsigned long req_rate,
struct ingenic_clk *ingenic_clk = to_ingenic_clk(hw);
struct ingenic_cgu *cgu = ingenic_clk->cgu;
const struct ingenic_cgu_clk_info *clk_info;
- long rate = *parent_rate;
+ unsigned int div = 1;
clk_info = &cgu->clock_info[ingenic_clk->idx];
if (clk_info->type & CGU_CLK_DIV)
- rate /= ingenic_clk_calc_div(clk_info, *parent_rate, req_rate);
+ div = ingenic_clk_calc_div(clk_info, *parent_rate, req_rate);
else if (clk_info->type & CGU_CLK_FIXDIV)
- rate /= clk_info->fixdiv.div;
+ div = clk_info->fixdiv.div;
- return rate;
+ return DIV_ROUND_UP(*parent_rate, div);
}
static int
@@ -393,7 +393,7 @@ ingenic_clk_set_rate(struct clk_hw *hw, unsigned long req_rate,
if (clk_info->type & CGU_CLK_DIV) {
div = ingenic_clk_calc_div(clk_info, parent_rate, req_rate);
- rate = parent_rate / div;
+ rate = DIV_ROUND_UP(parent_rate, div);
if (rate != req_rate)
return -EINVAL;
diff --git a/drivers/clk/ingenic/cgu.h b/drivers/clk/ingenic/cgu.h
index 09700b2..a22f654 100644
--- a/drivers/clk/ingenic/cgu.h
+++ b/drivers/clk/ingenic/cgu.h
@@ -78,7 +78,7 @@ struct ingenic_cgu_mux_info {
* @reg: offset of the divider control register within the CGU
* @shift: number of bits to left shift the divide value by (ie. the index of
* the lowest bit of the divide value within its control register)
- * @div: number of bits to divide the divider value by (i.e. if the
+ * @div: number to divide the divider value by (i.e. if the
* effective divider value is the value written to the register
* multiplied by some constant)
* @bits: the size of the divide value in bits
diff --git a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
index 6ea5401..7f12812 100644
--- a/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
+++ b/drivers/clk/sunxi-ng/ccu-sun6i-a31.c
@@ -252,9 +252,9 @@ static SUNXI_CCU_GATE(ahb1_mmc1_clk, "ahb1-mmc1", "ahb1",
static SUNXI_CCU_GATE(ahb1_mmc2_clk, "ahb1-mmc2", "ahb1",
0x060, BIT(10), 0);
static SUNXI_CCU_GATE(ahb1_mmc3_clk, "ahb1-mmc3", "ahb1",
- 0x060, BIT(12), 0);
+ 0x060, BIT(11), 0);
static SUNXI_CCU_GATE(ahb1_nand1_clk, "ahb1-nand1", "ahb1",
- 0x060, BIT(13), 0);
+ 0x060, BIT(12), 0);
static SUNXI_CCU_GATE(ahb1_nand0_clk, "ahb1-nand0", "ahb1",
0x060, BIT(13), 0);
static SUNXI_CCU_GATE(ahb1_sdram_clk, "ahb1-sdram", "ahb1",
diff --git a/drivers/clocksource/exynos_mct.c b/drivers/clocksource/exynos_mct.c
index 7f6fed9..fb0cf8b 100644
--- a/drivers/clocksource/exynos_mct.c
+++ b/drivers/clocksource/exynos_mct.c
@@ -388,6 +388,13 @@ static void exynos4_mct_tick_start(unsigned long cycles,
exynos4_mct_write(tmp, mevt->base + MCT_L_TCON_OFFSET);
}
+static void exynos4_mct_tick_clear(struct mct_clock_event_device *mevt)
+{
+ /* Clear the MCT tick interrupt */
+ if (readl_relaxed(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1)
+ exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET);
+}
+
static int exynos4_tick_set_next_event(unsigned long cycles,
struct clock_event_device *evt)
{
@@ -404,6 +411,7 @@ static int set_state_shutdown(struct clock_event_device *evt)
mevt = container_of(evt, struct mct_clock_event_device, evt);
exynos4_mct_tick_stop(mevt);
+ exynos4_mct_tick_clear(mevt);
return 0;
}
@@ -420,8 +428,11 @@ static int set_state_periodic(struct clock_event_device *evt)
return 0;
}
-static void exynos4_mct_tick_clear(struct mct_clock_event_device *mevt)
+static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id)
{
+ struct mct_clock_event_device *mevt = dev_id;
+ struct clock_event_device *evt = &mevt->evt;
+
/*
* This is for supporting oneshot mode.
* Mct would generate interrupt periodically
@@ -430,16 +441,6 @@ static void exynos4_mct_tick_clear(struct mct_clock_event_device *mevt)
if (!clockevent_state_periodic(&mevt->evt))
exynos4_mct_tick_stop(mevt);
- /* Clear the MCT tick interrupt */
- if (readl_relaxed(reg_base + mevt->base + MCT_L_INT_CSTAT_OFFSET) & 1)
- exynos4_mct_write(0x1, mevt->base + MCT_L_INT_CSTAT_OFFSET);
-}
-
-static irqreturn_t exynos4_mct_tick_isr(int irq, void *dev_id)
-{
- struct mct_clock_event_device *mevt = dev_id;
- struct clock_event_device *evt = &mevt->evt;
-
exynos4_mct_tick_clear(mevt);
evt->event_handler(evt);
diff --git a/drivers/cpufreq/pxa2xx-cpufreq.c b/drivers/cpufreq/pxa2xx-cpufreq.c
index ce345bf..a24e9f0 100644
--- a/drivers/cpufreq/pxa2xx-cpufreq.c
+++ b/drivers/cpufreq/pxa2xx-cpufreq.c
@@ -192,7 +192,7 @@ static int pxa_cpufreq_change_voltage(const struct pxa_freqs *pxa_freq)
return ret;
}
-static void __init pxa_cpufreq_init_voltages(void)
+static void pxa_cpufreq_init_voltages(void)
{
vcc_core = regulator_get(NULL, "vcc_core");
if (IS_ERR(vcc_core)) {
@@ -208,7 +208,7 @@ static int pxa_cpufreq_change_voltage(const struct pxa_freqs *pxa_freq)
return 0;
}
-static void __init pxa_cpufreq_init_voltages(void) { }
+static void pxa_cpufreq_init_voltages(void) { }
#endif
static void find_freq_tables(struct cpufreq_frequency_table **freq_table,
diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
index 4353025..4bb154f 100644
--- a/drivers/cpufreq/tegra124-cpufreq.c
+++ b/drivers/cpufreq/tegra124-cpufreq.c
@@ -134,6 +134,8 @@ static int tegra124_cpufreq_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, priv);
+ of_node_put(np);
+
return 0;
out_switch_to_pllx:
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 65bb6fd..72bc6e6 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -18,6 +18,7 @@
#include <linux/hrtimer.h>
#include <linux/tick.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/math64.h>
#include <linux/module.h>
@@ -130,10 +131,6 @@ struct menu_device {
int interval_ptr;
};
-
-#define LOAD_INT(x) ((x) >> FSHIFT)
-#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
-
static inline int get_loadavg(unsigned long load)
{
return LOAD_INT(load) * 10 + LOAD_FRAC(load) / 10;
diff --git a/drivers/crypto/amcc/crypto4xx_alg.c b/drivers/crypto/amcc/crypto4xx_alg.c
index 4afca39..e3b8beb 100644
--- a/drivers/crypto/amcc/crypto4xx_alg.c
+++ b/drivers/crypto/amcc/crypto4xx_alg.c
@@ -138,7 +138,8 @@ static int crypto4xx_setkey_aes(struct crypto_ablkcipher *cipher,
sa = (struct dynamic_sa_ctl *) ctx->sa_in;
ctx->hash_final = 0;
- set_dynamic_sa_command_0(sa, SA_NOT_SAVE_HASH, SA_NOT_SAVE_IV,
+ set_dynamic_sa_command_0(sa, SA_NOT_SAVE_HASH, (cm == CRYPTO_MODE_CBC ?
+ SA_SAVE_IV : SA_NOT_SAVE_IV),
SA_LOAD_HASH_FROM_SA, SA_LOAD_IV_FROM_STATE,
SA_NO_HEADER_PROC, SA_HASH_ALG_NULL,
SA_CIPHER_ALG_AES, SA_PAD_TYPE_ZERO,
diff --git a/drivers/crypto/amcc/crypto4xx_core.c b/drivers/crypto/amcc/crypto4xx_core.c
index c7524bb..7d066fa 100644
--- a/drivers/crypto/amcc/crypto4xx_core.c
+++ b/drivers/crypto/amcc/crypto4xx_core.c
@@ -646,6 +646,15 @@ static u32 crypto4xx_ablkcipher_done(struct crypto4xx_device *dev,
addr = dma_map_page(dev->core_dev->device, sg_page(dst),
dst->offset, dst->length, DMA_FROM_DEVICE);
}
+
+ if (pd_uinfo->sa_va->sa_command_0.bf.save_iv == SA_SAVE_IV) {
+ struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
+
+ crypto4xx_memcpy_from_le32((u32 *)req->iv,
+ pd_uinfo->sr_va->save_iv,
+ crypto_skcipher_ivsize(skcipher));
+ }
+
crypto4xx_ret_sg_desc(dev, pd_uinfo);
if (ablk_req->base.complete != NULL)
ablk_req->base.complete(&ablk_req->base, 0);
diff --git a/drivers/crypto/amcc/crypto4xx_trng.c b/drivers/crypto/amcc/crypto4xx_trng.c
index 677ca17..368c559 100644
--- a/drivers/crypto/amcc/crypto4xx_trng.c
+++ b/drivers/crypto/amcc/crypto4xx_trng.c
@@ -80,8 +80,10 @@ void ppc4xx_trng_probe(struct crypto4xx_core_device *core_dev)
/* Find the TRNG device node and map it */
trng = of_find_matching_node(NULL, ppc4xx_trng_match);
- if (!trng || !of_device_is_available(trng))
+ if (!trng || !of_device_is_available(trng)) {
+ of_node_put(trng);
return;
+ }
dev->trng_base = of_iomap(trng, 0);
of_node_put(trng);
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 0d743c6..88caca3 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -2131,6 +2131,7 @@ static void init_aead_job(struct aead_request *req,
if (unlikely(req->src != req->dst)) {
if (!edesc->dst_nents) {
dst_dma = sg_dma_address(req->dst);
+ out_options = 0;
} else {
dst_dma = edesc->sec4_sg_dma +
sec4_sg_index *
diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
index 1cfa1d9..f8786f6 100644
--- a/drivers/dma/imx-dma.c
+++ b/drivers/dma/imx-dma.c
@@ -290,7 +290,7 @@ static inline int imxdma_sg_next(struct imxdma_desc *d)
struct scatterlist *sg = d->sg;
unsigned long now;
- now = min(d->len, sg_dma_len(sg));
+ now = min_t(size_t, d->len, sg_dma_len(sg));
if (d->len != IMX_DMA_LENGTH_LOOP)
d->len -= now;
diff --git a/drivers/dma/qcom/hidma.c b/drivers/dma/qcom/hidma.c
index e244e10..5444f39 100644
--- a/drivers/dma/qcom/hidma.c
+++ b/drivers/dma/qcom/hidma.c
@@ -131,24 +131,25 @@ static void hidma_process_completed(struct hidma_chan *mchan)
desc = &mdesc->desc;
last_cookie = desc->cookie;
+ llstat = hidma_ll_status(mdma->lldev, mdesc->tre_ch);
+
spin_lock_irqsave(&mchan->lock, irqflags);
+ if (llstat == DMA_COMPLETE) {
+ mchan->last_success = last_cookie;
+ result.result = DMA_TRANS_NOERROR;
+ } else {
+ result.result = DMA_TRANS_ABORTED;
+ }
+
dma_cookie_complete(desc);
spin_unlock_irqrestore(&mchan->lock, irqflags);
- llstat = hidma_ll_status(mdma->lldev, mdesc->tre_ch);
dmaengine_desc_get_callback(desc, &cb);
dma_run_dependencies(desc);
spin_lock_irqsave(&mchan->lock, irqflags);
list_move(&mdesc->node, &mchan->free);
-
- if (llstat == DMA_COMPLETE) {
- mchan->last_success = last_cookie;
- result.result = DMA_TRANS_NOERROR;
- } else
- result.result = DMA_TRANS_ABORTED;
-
spin_unlock_irqrestore(&mchan->lock, irqflags);
dmaengine_desc_callback_invoke(&cb, &result);
diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index d032032..f37a6ef 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -1311,6 +1311,7 @@ static enum dma_status rcar_dmac_tx_status(struct dma_chan *chan,
enum dma_status status;
unsigned long flags;
unsigned int residue;
+ bool cyclic;
status = dma_cookie_status(chan, cookie, txstate);
if (status == DMA_COMPLETE || !txstate)
@@ -1318,10 +1319,11 @@ static enum dma_status rcar_dmac_tx_status(struct dma_chan *chan,
spin_lock_irqsave(&rchan->lock, flags);
residue = rcar_dmac_chan_get_residue(rchan, cookie);
+ cyclic = rchan->desc.running ? rchan->desc.running->cyclic : false;
spin_unlock_irqrestore(&rchan->lock, flags);
/* if there's no residue, the cookie is complete */
- if (!residue)
+ if (!residue && !cyclic)
return DMA_COMPLETE;
dma_set_residue(txstate, residue);
diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
index 3722b9d..22f7f0c 100644
--- a/drivers/dma/tegra20-apb-dma.c
+++ b/drivers/dma/tegra20-apb-dma.c
@@ -635,7 +635,10 @@ static void handle_cont_sngl_cycle_dma_done(struct tegra_dma_channel *tdc,
sgreq = list_first_entry(&tdc->pending_sg_req, typeof(*sgreq), node);
dma_desc = sgreq->dma_desc;
- dma_desc->bytes_transferred += sgreq->req_len;
+ /* if we dma for long enough the transfer count will wrap */
+ dma_desc->bytes_transferred =
+ (dma_desc->bytes_transferred + sgreq->req_len) %
+ dma_desc->bytes_requested;
/* Callback need to be call */
if (!dma_desc->cb_count)
diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c
index 236004b..9faa09e 100644
--- a/drivers/firmware/efi/memattr.c
+++ b/drivers/firmware/efi/memattr.c
@@ -93,7 +93,7 @@ static bool entry_is_valid(const efi_memory_desc_t *in, efi_memory_desc_t *out)
if (!(md->attribute & EFI_MEMORY_RUNTIME))
continue;
- if (md->virt_addr == 0) {
+ if (md->virt_addr == 0 && md->phys_addr != 0) {
/* no virtual mapping has been installed by the stub */
break;
}
diff --git a/drivers/gpio/gpio-adnp.c b/drivers/gpio/gpio-adnp.c
index 8ff7b0d..3b68c03 100644
--- a/drivers/gpio/gpio-adnp.c
+++ b/drivers/gpio/gpio-adnp.c
@@ -132,8 +132,10 @@ static int adnp_gpio_direction_input(struct gpio_chip *chip, unsigned offset)
if (err < 0)
goto out;
- if (err & BIT(pos))
- err = -EACCES;
+ if (value & BIT(pos)) {
+ err = -EPERM;
+ goto out;
+ }
err = 0;
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index 6f9c9ac..75f30a0 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -837,14 +837,16 @@ static void omap_gpio_unmask_irq(struct irq_data *d)
if (trigger)
omap_set_gpio_triggering(bank, offset, trigger);
- /* For level-triggered GPIOs, the clearing must be done after
- * the HW source is cleared, thus after the handler has run */
- if (bank->level_mask & BIT(offset)) {
- omap_set_gpio_irqenable(bank, offset, 0);
- omap_clear_gpio_irqstatus(bank, offset);
- }
-
omap_set_gpio_irqenable(bank, offset, 1);
+
+ /*
+ * For level-triggered GPIOs, clearing must be done after the source
+ * is cleared, thus after the handler has run. OMAP4 needs this done
+ * after enabing the interrupt to clear the wakeup status.
+ */
+ if (bank->level_mask & BIT(offset))
+ omap_clear_gpio_irqstatus(bank, offset);
+
raw_spin_unlock_irqrestore(&bank->lock, flags);
}
diff --git a/drivers/gpio/gpio-pxa.c b/drivers/gpio/gpio-pxa.c
index 7a63058..32d22bd 100644
--- a/drivers/gpio/gpio-pxa.c
+++ b/drivers/gpio/gpio-pxa.c
@@ -774,6 +774,9 @@ static int pxa_gpio_suspend(void)
struct pxa_gpio_bank *c;
int gpio;
+ if (!pchip)
+ return 0;
+
for_each_gpio_bank(gpio, c, pchip) {
c->saved_gplr = readl_relaxed(c->regbase + GPLR_OFFSET);
c->saved_gpdr = readl_relaxed(c->regbase + GPDR_OFFSET);
@@ -792,6 +795,9 @@ static void pxa_gpio_resume(void)
struct pxa_gpio_bank *c;
int gpio;
+ if (!pchip)
+ return;
+
for_each_gpio_bank(gpio, c, pchip) {
/* restore level with set/clear */
writel_relaxed(c->saved_gplr, c->regbase + GPSR_OFFSET);
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index b59441d..4a95974 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3069,6 +3069,7 @@ static int drm_dp_mst_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs
msg.u.i2c_read.transactions[i].i2c_dev_id = msgs[i].addr;
msg.u.i2c_read.transactions[i].num_bytes = msgs[i].len;
msg.u.i2c_read.transactions[i].bytes = msgs[i].buf;
+ msg.u.i2c_read.transactions[i].no_stop_bit = !(msgs[i].flags & I2C_M_STOP);
}
msg.u.i2c_read.read_i2c_device_id = msgs[num - 1].addr;
msg.u.i2c_read.num_bytes_read = msgs[num - 1].len;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6509031..26c4bef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1600,7 +1600,8 @@ __vma_matches(struct vm_area_struct *vma, struct file *filp,
if (vma->vm_file != filp)
return false;
- return vma->vm_start == addr && (vma->vm_end - vma->vm_start) == size;
+ return vma->vm_start == addr &&
+ (vma->vm_end - vma->vm_start) == PAGE_ALIGN(size);
}
/**
diff --git a/drivers/gpu/drm/msm/sde/sde_crtc.c b/drivers/gpu/drm/msm/sde/sde_crtc.c
index 2c7f758..f8e3f23 100644
--- a/drivers/gpu/drm/msm/sde/sde_crtc.c
+++ b/drivers/gpu/drm/msm/sde/sde_crtc.c
@@ -757,6 +757,25 @@ static bool sde_crtc_mode_fixup(struct drm_crtc *crtc,
return true;
}
+static int _sde_crtc_get_ctlstart_timeout(struct drm_crtc *crtc)
+{
+ struct drm_encoder *encoder;
+ int rc = 0;
+
+ if (!crtc || !crtc->dev)
+ return 0;
+
+ list_for_each_entry(encoder,
+ &crtc->dev->mode_config.encoder_list, head) {
+ if (encoder->crtc != crtc)
+ continue;
+
+ if (sde_encoder_get_intf_mode(encoder) == INTF_MODE_CMD)
+ rc += sde_encoder_get_ctlstart_timeout_state(encoder);
+ }
+
+ return rc;
+}
static void _sde_crtc_setup_blend_cfg(struct sde_crtc_mixer *mixer,
struct sde_plane_state *pstate, struct sde_format *format)
@@ -3184,7 +3203,13 @@ static void sde_crtc_atomic_begin(struct drm_crtc *crtc,
if (unlikely(!sde_crtc->num_mixers))
return;
- _sde_crtc_blend_setup(crtc, old_state, true);
+ if (_sde_crtc_get_ctlstart_timeout(crtc)) {
+ _sde_crtc_blend_setup(crtc, old_state, false);
+ SDE_ERROR("border fill only commit after ctlstart timeout\n");
+ } else {
+ _sde_crtc_blend_setup(crtc, old_state, true);
+ }
+
_sde_crtc_dest_scaler_setup(crtc);
/* cancel the idle notify delayed work */
diff --git a/drivers/gpu/drm/msm/sde/sde_encoder.c b/drivers/gpu/drm/msm/sde/sde_encoder.c
index 34777b30..0237b99 100644
--- a/drivers/gpu/drm/msm/sde/sde_encoder.c
+++ b/drivers/gpu/drm/msm/sde/sde_encoder.c
@@ -3229,6 +3229,24 @@ int sde_encoder_idle_request(struct drm_encoder *drm_enc)
return 0;
}
+int sde_encoder_get_ctlstart_timeout_state(struct drm_encoder *drm_enc)
+{
+ struct sde_encoder_virt *sde_enc = NULL;
+ int i, count = 0;
+
+ if (!drm_enc)
+ return 0;
+
+ sde_enc = to_sde_encoder_virt(drm_enc);
+
+ for (i = 0; i < sde_enc->num_phys_encs; i++) {
+ count += atomic_read(&sde_enc->phys_encs[i]->ctlstart_timeout);
+ atomic_set(&sde_enc->phys_encs[i]->ctlstart_timeout, 0);
+ }
+
+ return count;
+}
+
/**
* _sde_encoder_trigger_flush - trigger flush for a physical encoder
* drm_enc: Pointer to drm encoder structure
diff --git a/drivers/gpu/drm/msm/sde/sde_encoder.h b/drivers/gpu/drm/msm/sde/sde_encoder.h
index a4cf467..62b0c9b 100644
--- a/drivers/gpu/drm/msm/sde/sde_encoder.h
+++ b/drivers/gpu/drm/msm/sde/sde_encoder.h
@@ -266,4 +266,11 @@ int sde_encoder_in_clone_mode(struct drm_encoder *enc);
*/
void sde_encoder_control_idle_pc(struct drm_encoder *enc, bool enable);
+/**
+ * sde_encoder_get_ctlstart_timeout_state - checks if ctl start timeout happened
+ * @drm_enc: Pointer to drm encoder structure
+ * @Return: non zero value if ctl start timeout occurred
+ */
+int sde_encoder_get_ctlstart_timeout_state(struct drm_encoder *enc);
+
#endif /* __SDE_ENCODER_H__ */
diff --git a/drivers/gpu/drm/msm/sde/sde_encoder_phys.h b/drivers/gpu/drm/msm/sde/sde_encoder_phys.h
index 163a36d..ab16156 100644
--- a/drivers/gpu/drm/msm/sde/sde_encoder_phys.h
+++ b/drivers/gpu/drm/msm/sde/sde_encoder_phys.h
@@ -267,6 +267,7 @@ struct sde_encoder_irq {
* @pending_retire_fence_cnt: Atomic counter tracking the pending retire
* fences that have to be signalled.
* @pending_kickoff_wq: Wait queue for blocking until kickoff completes
+ * @ctlstart_timeout: Indicates if ctl start timeout occurred
* @irq: IRQ tracking structures
* @cont_splash_single_flush Variable to check if single flush is enabled.
* @cont_splash_settings Variable to store continuous splash settings.
@@ -301,6 +302,7 @@ struct sde_encoder_phys {
atomic_t pending_kickoff_cnt;
atomic_t pending_retire_fence_cnt;
wait_queue_head_t pending_kickoff_wq;
+ atomic_t ctlstart_timeout;
struct sde_encoder_irq irq[INTR_IDX_MAX];
u32 cont_splash_single_flush;
bool cont_splash_settings;
diff --git a/drivers/gpu/drm/msm/sde/sde_encoder_phys_cmd.c b/drivers/gpu/drm/msm/sde/sde_encoder_phys_cmd.c
index 4409408..d6e8fd3 100644
--- a/drivers/gpu/drm/msm/sde/sde_encoder_phys_cmd.c
+++ b/drivers/gpu/drm/msm/sde/sde_encoder_phys_cmd.c
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2015-2018 The Linux Foundation. All rights reserved.
+ * Copyright (c) 2015-2019 The Linux Foundation. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 and
@@ -193,6 +193,7 @@ static void sde_encoder_phys_cmd_pp_tx_done_irq(void *arg, int irq_idx)
phys_enc,
SDE_ENCODER_FRAME_EVENT_SIGNAL_RETIRE_FENCE);
atomic_add_unless(&phys_enc->pending_ctlstart_cnt, -1, 0);
+ atomic_set(&phys_enc->ctlstart_timeout, 0);
}
/* notify all synchronous clients first, then asynchronous clients */
@@ -292,6 +293,7 @@ static void sde_encoder_phys_cmd_ctl_start_irq(void *arg, int irq_idx)
ctl = phys_enc->hw_ctl;
atomic_add_unless(&phys_enc->pending_ctlstart_cnt, -1, 0);
+ atomic_set(&phys_enc->ctlstart_timeout, 0);
time_diff_us = ktime_us_delta(ktime_get(), cmd_enc->rd_ptr_timestamp);
@@ -1054,6 +1056,7 @@ static void sde_encoder_phys_cmd_disable(struct sde_encoder_phys *phys_enc)
SDE_ERROR("invalid encoder\n");
return;
}
+ atomic_set(&phys_enc->ctlstart_timeout, 0);
SDE_DEBUG_CMDENC(cmd_enc, "pp %d state %d\n",
phys_enc->hw_pp->idx - PINGPONG_0,
phys_enc->enable_state);
@@ -1176,6 +1179,9 @@ static int _sde_encoder_phys_cmd_wait_for_ctl_start(
"ctl start interrupt wait failed\n");
else
ret = 0;
+
+ if (sde_encoder_phys_cmd_is_master(phys_enc))
+ atomic_inc_return(&phys_enc->ctlstart_timeout);
}
return ret;
@@ -1476,6 +1482,7 @@ struct sde_encoder_phys *sde_encoder_phys_cmd_init(
atomic_set(&phys_enc->pending_retire_fence_cnt, 0);
atomic_set(&cmd_enc->pending_rd_ptr_cnt, 0);
atomic_set(&cmd_enc->pending_vblank_cnt, 0);
+ atomic_set(&phys_enc->ctlstart_timeout, 0);
init_waitqueue_head(&phys_enc->pending_kickoff_wq);
init_waitqueue_head(&cmd_enc->pending_vblank_wq);
atomic_set(&cmd_enc->autorefresh.kickoff_cnt, 0);
diff --git a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
index 434d1e2..cd37d00e 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/tvnv17.c
@@ -750,7 +750,9 @@ static int nv17_tv_set_property(struct drm_encoder *encoder,
/* Disable the crtc to ensure a full modeset is
* performed whenever it's turned on again. */
if (crtc)
- drm_crtc_force_disable(crtc);
+ drm_crtc_helper_set_mode(crtc, &crtc->mode,
+ crtc->x, crtc->y,
+ crtc->primary->fb);
}
return 0;
diff --git a/drivers/gpu/drm/radeon/evergreen_cs.c b/drivers/gpu/drm/radeon/evergreen_cs.c
index d960d39..f09388e 100644
--- a/drivers/gpu/drm/radeon/evergreen_cs.c
+++ b/drivers/gpu/drm/radeon/evergreen_cs.c
@@ -1299,6 +1299,7 @@ static int evergreen_cs_handle_reg(struct radeon_cs_parser *p, u32 reg, u32 idx)
return -EINVAL;
}
ib[idx] += (u32)((reloc->gpu_offset >> 8) & 0xffffffff);
+ break;
case CB_TARGET_MASK:
track->cb_target_mask = radeon_get_ib_value(p, idx);
track->cb_dirty = true;
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index c7e6c98..51d34e7 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -846,7 +846,7 @@ static void
vc4_crtc_reset(struct drm_crtc *crtc)
{
if (crtc->state)
- __drm_atomic_helper_crtc_destroy_state(crtc->state);
+ vc4_crtc_destroy_state(crtc, crtc->state);
crtc->state = kzalloc(sizeof(struct vc4_crtc_state), GFP_KERNEL);
if (crtc->state)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
index aec6e9e..55884cb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
@@ -531,11 +531,9 @@ static int vmw_fb_set_par(struct fb_info *info)
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC)
};
- struct drm_display_mode *old_mode;
struct drm_display_mode *mode;
int ret;
- old_mode = par->set_mode;
mode = drm_mode_duplicate(vmw_priv->dev, &new_mode);
if (!mode) {
DRM_ERROR("Could not create new fb mode.\n");
@@ -546,11 +544,7 @@ static int vmw_fb_set_par(struct fb_info *info)
mode->vdisplay = var->yres;
vmw_guess_mode_timing(mode);
- if (old_mode && drm_mode_equal(old_mode, mode)) {
- drm_mode_destroy(vmw_priv->dev, mode);
- mode = old_mode;
- old_mode = NULL;
- } else if (!vmw_kms_validate_mode_vram(vmw_priv,
+ if (!vmw_kms_validate_mode_vram(vmw_priv,
mode->hdisplay *
DIV_ROUND_UP(var->bits_per_pixel, 8),
mode->vdisplay)) {
@@ -613,8 +607,8 @@ static int vmw_fb_set_par(struct fb_info *info)
schedule_delayed_work(&par->local_work, 0);
out_unlock:
- if (old_mode)
- drm_mode_destroy(vmw_priv->dev, old_mode);
+ if (par->set_mode)
+ drm_mode_destroy(vmw_priv->dev, par->set_mode);
par->set_mode = mode;
drm_modeset_unlock_all(vmw_priv->dev);
diff --git a/drivers/gpu/ipu-v3/ipu-common.c b/drivers/gpu/ipu-v3/ipu-common.c
index 99c813a..57d22bc 100644
--- a/drivers/gpu/ipu-v3/ipu-common.c
+++ b/drivers/gpu/ipu-v3/ipu-common.c
@@ -884,8 +884,8 @@ static struct ipu_devtype ipu_type_imx51 = {
.cpmem_ofs = 0x1f000000,
.srm_ofs = 0x1f040000,
.tpm_ofs = 0x1f060000,
- .csi0_ofs = 0x1f030000,
- .csi1_ofs = 0x1f038000,
+ .csi0_ofs = 0x1e030000,
+ .csi1_ofs = 0x1e038000,
.ic_ofs = 0x1e020000,
.disp0_ofs = 0x1e040000,
.disp1_ofs = 0x1e048000,
@@ -900,8 +900,8 @@ static struct ipu_devtype ipu_type_imx53 = {
.cpmem_ofs = 0x07000000,
.srm_ofs = 0x07040000,
.tpm_ofs = 0x07060000,
- .csi0_ofs = 0x07030000,
- .csi1_ofs = 0x07038000,
+ .csi0_ofs = 0x06030000,
+ .csi1_ofs = 0x06038000,
.ic_ofs = 0x06020000,
.disp0_ofs = 0x06040000,
.disp1_ofs = 0x06048000,
diff --git a/drivers/hid/i2c-hid/Makefile b/drivers/hid/i2c-hid/Makefile
index 832d8f9..099e1ce 100644
--- a/drivers/hid/i2c-hid/Makefile
+++ b/drivers/hid/i2c-hid/Makefile
@@ -3,3 +3,6 @@
#
obj-$(CONFIG_I2C_HID) += i2c-hid.o
+
+i2c-hid-objs = i2c-hid-core.o
+i2c-hid-$(CONFIG_DMI) += i2c-hid-dmi-quirks.o
diff --git a/drivers/hid/i2c-hid/i2c-hid.c b/drivers/hid/i2c-hid/i2c-hid-core.c
similarity index 96%
rename from drivers/hid/i2c-hid/i2c-hid.c
rename to drivers/hid/i2c-hid/i2c-hid-core.c
index ce2b800..850527d 100644
--- a/drivers/hid/i2c-hid/i2c-hid.c
+++ b/drivers/hid/i2c-hid/i2c-hid-core.c
@@ -42,6 +42,7 @@
#include <linux/i2c/i2c-hid.h>
#include "../hid-ids.h"
+#include "i2c-hid.h"
/* quirks to control the device */
#define I2C_HID_QUIRK_SET_PWR_WAKEUP_DEV BIT(0)
@@ -724,6 +725,7 @@ static int i2c_hid_parse(struct hid_device *hid)
char *rdesc;
int ret;
int tries = 3;
+ char *use_override;
i2c_hid_dbg(ihid, "entering %s\n", __func__);
@@ -742,26 +744,37 @@ static int i2c_hid_parse(struct hid_device *hid)
if (ret)
return ret;
- rdesc = kzalloc(rsize, GFP_KERNEL);
+ use_override = i2c_hid_get_dmi_hid_report_desc_override(client->name,
+ &rsize);
- if (!rdesc) {
- dbg_hid("couldn't allocate rdesc memory\n");
- return -ENOMEM;
- }
+ if (use_override) {
+ rdesc = use_override;
+ i2c_hid_dbg(ihid, "Using a HID report descriptor override\n");
+ } else {
+ rdesc = kzalloc(rsize, GFP_KERNEL);
- i2c_hid_dbg(ihid, "asking HID report descriptor\n");
+ if (!rdesc) {
+ dbg_hid("couldn't allocate rdesc memory\n");
+ return -ENOMEM;
+ }
- ret = i2c_hid_command(client, &hid_report_descr_cmd, rdesc, rsize);
- if (ret) {
- hid_err(hid, "reading report descriptor failed\n");
- kfree(rdesc);
- return -EIO;
+ i2c_hid_dbg(ihid, "asking HID report descriptor\n");
+
+ ret = i2c_hid_command(client, &hid_report_descr_cmd,
+ rdesc, rsize);
+ if (ret) {
+ hid_err(hid, "reading report descriptor failed\n");
+ kfree(rdesc);
+ return -EIO;
+ }
}
i2c_hid_dbg(ihid, "Report Descriptor: %*ph\n", rsize, rdesc);
ret = hid_parse_report(hid, rdesc, rsize);
- kfree(rdesc);
+ if (!use_override)
+ kfree(rdesc);
+
if (ret) {
dbg_hid("parsing report descriptor failed\n");
return ret;
@@ -899,12 +912,19 @@ static int i2c_hid_fetch_hid_descriptor(struct i2c_hid *ihid)
int ret;
/* i2c hid fetch using a fixed descriptor size (30 bytes) */
- i2c_hid_dbg(ihid, "Fetching the HID descriptor\n");
- ret = i2c_hid_command(client, &hid_descr_cmd, ihid->hdesc_buffer,
- sizeof(struct i2c_hid_desc));
- if (ret) {
- dev_err(&client->dev, "hid_descr_cmd failed\n");
- return -ENODEV;
+ if (i2c_hid_get_dmi_i2c_hid_desc_override(client->name)) {
+ i2c_hid_dbg(ihid, "Using a HID descriptor override\n");
+ ihid->hdesc =
+ *i2c_hid_get_dmi_i2c_hid_desc_override(client->name);
+ } else {
+ i2c_hid_dbg(ihid, "Fetching the HID descriptor\n");
+ ret = i2c_hid_command(client, &hid_descr_cmd,
+ ihid->hdesc_buffer,
+ sizeof(struct i2c_hid_desc));
+ if (ret) {
+ dev_err(&client->dev, "hid_descr_cmd failed\n");
+ return -ENODEV;
+ }
}
/* Validate the length of HID descriptor, the 4 first bytes:
diff --git a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
new file mode 100644
index 0000000..cac262a
--- /dev/null
+++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
@@ -0,0 +1,377 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Quirks for I2C-HID devices that do not supply proper descriptors
+ *
+ * Copyright (c) 2018 Julian Sax <jsbc@gmx.de>
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/dmi.h>
+#include <linux/mod_devicetable.h>
+
+#include "i2c-hid.h"
+
+
+struct i2c_hid_desc_override {
+ union {
+ struct i2c_hid_desc *i2c_hid_desc;
+ uint8_t *i2c_hid_desc_buffer;
+ };
+ uint8_t *hid_report_desc;
+ unsigned int hid_report_desc_size;
+ uint8_t *i2c_name;
+};
+
+
+/*
+ * descriptors for the SIPODEV SP1064 touchpad
+ *
+ * This device does not supply any descriptors and on windows a filter
+ * driver operates between the i2c-hid layer and the device and injects
+ * these descriptors when the device is prompted. The descriptors were
+ * extracted by listening to the i2c-hid traffic that occurs between the
+ * windows filter driver and the windows i2c-hid driver.
+ */
+
+static const struct i2c_hid_desc_override sipodev_desc = {
+ .i2c_hid_desc_buffer = (uint8_t [])
+ {0x1e, 0x00, /* Length of descriptor */
+ 0x00, 0x01, /* Version of descriptor */
+ 0xdb, 0x01, /* Length of report descriptor */
+ 0x21, 0x00, /* Location of report descriptor */
+ 0x24, 0x00, /* Location of input report */
+ 0x1b, 0x00, /* Max input report length */
+ 0x25, 0x00, /* Location of output report */
+ 0x11, 0x00, /* Max output report length */
+ 0x22, 0x00, /* Location of command register */
+ 0x23, 0x00, /* Location of data register */
+ 0x11, 0x09, /* Vendor ID */
+ 0x88, 0x52, /* Product ID */
+ 0x06, 0x00, /* Version ID */
+ 0x00, 0x00, 0x00, 0x00 /* Reserved */
+ },
+
+ .hid_report_desc = (uint8_t [])
+ {0x05, 0x01, /* Usage Page (Desktop), */
+ 0x09, 0x02, /* Usage (Mouse), */
+ 0xA1, 0x01, /* Collection (Application), */
+ 0x85, 0x01, /* Report ID (1), */
+ 0x09, 0x01, /* Usage (Pointer), */
+ 0xA1, 0x00, /* Collection (Physical), */
+ 0x05, 0x09, /* Usage Page (Button), */
+ 0x19, 0x01, /* Usage Minimum (01h), */
+ 0x29, 0x02, /* Usage Maximum (02h), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x06, /* Report Count (6), */
+ 0x81, 0x01, /* Input (Constant), */
+ 0x05, 0x01, /* Usage Page (Desktop), */
+ 0x09, 0x30, /* Usage (X), */
+ 0x09, 0x31, /* Usage (Y), */
+ 0x15, 0x81, /* Logical Minimum (-127), */
+ 0x25, 0x7F, /* Logical Maximum (127), */
+ 0x75, 0x08, /* Report Size (8), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x81, 0x06, /* Input (Variable, Relative), */
+ 0xC0, /* End Collection, */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x05, /* Usage (Touchpad), */
+ 0xA1, 0x01, /* Collection (Application), */
+ 0x85, 0x04, /* Report ID (4), */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x02, /* Collection (Logical), */
+ 0x15, 0x00, /* Logical Minimum (0), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x09, 0x47, /* Usage (Touch Valid), */
+ 0x09, 0x42, /* Usage (Tip Switch), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x75, 0x03, /* Report Size (3), */
+ 0x25, 0x05, /* Logical Maximum (5), */
+ 0x09, 0x51, /* Usage (Contact Identifier), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x03, /* Report Count (3), */
+ 0x81, 0x03, /* Input (Constant, Variable), */
+ 0x05, 0x01, /* Usage Page (Desktop), */
+ 0x26, 0x44, 0x0A, /* Logical Maximum (2628), */
+ 0x75, 0x10, /* Report Size (16), */
+ 0x55, 0x0E, /* Unit Exponent (14), */
+ 0x65, 0x11, /* Unit (Centimeter), */
+ 0x09, 0x30, /* Usage (X), */
+ 0x46, 0x1A, 0x04, /* Physical Maximum (1050), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x46, 0xBC, 0x02, /* Physical Maximum (700), */
+ 0x26, 0x34, 0x05, /* Logical Maximum (1332), */
+ 0x09, 0x31, /* Usage (Y), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x02, /* Collection (Logical), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x09, 0x47, /* Usage (Touch Valid), */
+ 0x09, 0x42, /* Usage (Tip Switch), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x75, 0x03, /* Report Size (3), */
+ 0x25, 0x05, /* Logical Maximum (5), */
+ 0x09, 0x51, /* Usage (Contact Identifier), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x03, /* Report Count (3), */
+ 0x81, 0x03, /* Input (Constant, Variable), */
+ 0x05, 0x01, /* Usage Page (Desktop), */
+ 0x26, 0x44, 0x0A, /* Logical Maximum (2628), */
+ 0x75, 0x10, /* Report Size (16), */
+ 0x09, 0x30, /* Usage (X), */
+ 0x46, 0x1A, 0x04, /* Physical Maximum (1050), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x46, 0xBC, 0x02, /* Physical Maximum (700), */
+ 0x26, 0x34, 0x05, /* Logical Maximum (1332), */
+ 0x09, 0x31, /* Usage (Y), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x02, /* Collection (Logical), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x09, 0x47, /* Usage (Touch Valid), */
+ 0x09, 0x42, /* Usage (Tip Switch), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x75, 0x03, /* Report Size (3), */
+ 0x25, 0x05, /* Logical Maximum (5), */
+ 0x09, 0x51, /* Usage (Contact Identifier), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x03, /* Report Count (3), */
+ 0x81, 0x03, /* Input (Constant, Variable), */
+ 0x05, 0x01, /* Usage Page (Desktop), */
+ 0x26, 0x44, 0x0A, /* Logical Maximum (2628), */
+ 0x75, 0x10, /* Report Size (16), */
+ 0x09, 0x30, /* Usage (X), */
+ 0x46, 0x1A, 0x04, /* Physical Maximum (1050), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x46, 0xBC, 0x02, /* Physical Maximum (700), */
+ 0x26, 0x34, 0x05, /* Logical Maximum (1332), */
+ 0x09, 0x31, /* Usage (Y), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x02, /* Collection (Logical), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x09, 0x47, /* Usage (Touch Valid), */
+ 0x09, 0x42, /* Usage (Tip Switch), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x75, 0x03, /* Report Size (3), */
+ 0x25, 0x05, /* Logical Maximum (5), */
+ 0x09, 0x51, /* Usage (Contact Identifier), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x03, /* Report Count (3), */
+ 0x81, 0x03, /* Input (Constant, Variable), */
+ 0x05, 0x01, /* Usage Page (Desktop), */
+ 0x26, 0x44, 0x0A, /* Logical Maximum (2628), */
+ 0x75, 0x10, /* Report Size (16), */
+ 0x09, 0x30, /* Usage (X), */
+ 0x46, 0x1A, 0x04, /* Physical Maximum (1050), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x46, 0xBC, 0x02, /* Physical Maximum (700), */
+ 0x26, 0x34, 0x05, /* Logical Maximum (1332), */
+ 0x09, 0x31, /* Usage (Y), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x55, 0x0C, /* Unit Exponent (12), */
+ 0x66, 0x01, 0x10, /* Unit (Seconds), */
+ 0x47, 0xFF, 0xFF, 0x00, 0x00,/* Physical Maximum (65535), */
+ 0x27, 0xFF, 0xFF, 0x00, 0x00,/* Logical Maximum (65535), */
+ 0x75, 0x10, /* Report Size (16), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x09, 0x56, /* Usage (Scan Time), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x09, 0x54, /* Usage (Contact Count), */
+ 0x25, 0x7F, /* Logical Maximum (127), */
+ 0x75, 0x08, /* Report Size (8), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x05, 0x09, /* Usage Page (Button), */
+ 0x09, 0x01, /* Usage (01h), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x81, 0x02, /* Input (Variable), */
+ 0x95, 0x07, /* Report Count (7), */
+ 0x81, 0x03, /* Input (Constant, Variable), */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x85, 0x02, /* Report ID (2), */
+ 0x09, 0x55, /* Usage (Contact Count Maximum), */
+ 0x09, 0x59, /* Usage (59h), */
+ 0x75, 0x04, /* Report Size (4), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x25, 0x0F, /* Logical Maximum (15), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x85, 0x07, /* Report ID (7), */
+ 0x09, 0x60, /* Usage (60h), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0x95, 0x07, /* Report Count (7), */
+ 0xB1, 0x03, /* Feature (Constant, Variable), */
+ 0x85, 0x06, /* Report ID (6), */
+ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */
+ 0x09, 0xC5, /* Usage (C5h), */
+ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */
+ 0x75, 0x08, /* Report Size (8), */
+ 0x96, 0x00, 0x01, /* Report Count (256), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0xC0, /* End Collection, */
+ 0x06, 0x00, 0xFF, /* Usage Page (FF00h), */
+ 0x09, 0x01, /* Usage (01h), */
+ 0xA1, 0x01, /* Collection (Application), */
+ 0x85, 0x0D, /* Report ID (13), */
+ 0x26, 0xFF, 0x00, /* Logical Maximum (255), */
+ 0x19, 0x01, /* Usage Minimum (01h), */
+ 0x29, 0x02, /* Usage Maximum (02h), */
+ 0x75, 0x08, /* Report Size (8), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0xC0, /* End Collection, */
+ 0x05, 0x0D, /* Usage Page (Digitizer), */
+ 0x09, 0x0E, /* Usage (Configuration), */
+ 0xA1, 0x01, /* Collection (Application), */
+ 0x85, 0x03, /* Report ID (3), */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x02, /* Collection (Logical), */
+ 0x09, 0x52, /* Usage (Device Mode), */
+ 0x25, 0x0A, /* Logical Maximum (10), */
+ 0x95, 0x01, /* Report Count (1), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0xC0, /* End Collection, */
+ 0x09, 0x22, /* Usage (Finger), */
+ 0xA1, 0x00, /* Collection (Physical), */
+ 0x85, 0x05, /* Report ID (5), */
+ 0x09, 0x57, /* Usage (57h), */
+ 0x09, 0x58, /* Usage (58h), */
+ 0x75, 0x01, /* Report Size (1), */
+ 0x95, 0x02, /* Report Count (2), */
+ 0x25, 0x01, /* Logical Maximum (1), */
+ 0xB1, 0x02, /* Feature (Variable), */
+ 0x95, 0x06, /* Report Count (6), */
+ 0xB1, 0x03, /* Feature (Constant, Variable),*/
+ 0xC0, /* End Collection, */
+ 0xC0 /* End Collection */
+ },
+ .hid_report_desc_size = 475,
+ .i2c_name = "SYNA3602:00"
+};
+
+
+static const struct dmi_system_id i2c_hid_dmi_desc_override_table[] = {
+ {
+ .ident = "Teclast F6 Pro",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TECLAST"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "F6 Pro"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ {
+ .ident = "Teclast F7",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TECLAST"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "F7"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ {
+ .ident = "Trekstor Primebook C13",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Primebook C13"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ {
+ .ident = "Trekstor Primebook C11",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Primebook C11"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ {
+ .ident = "Direkt-Tek DTLAPY116-2",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Direkt-Tek"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "DTLAPY116-2"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ {
+ .ident = "Mediacom Flexbook Edge 11",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "MEDIACOM"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "FlexBook edge11 - M-FBE11"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
+ { } /* Terminate list */
+};
+
+
+struct i2c_hid_desc *i2c_hid_get_dmi_i2c_hid_desc_override(uint8_t *i2c_name)
+{
+ struct i2c_hid_desc_override *override;
+ const struct dmi_system_id *system_id;
+
+ system_id = dmi_first_match(i2c_hid_dmi_desc_override_table);
+ if (!system_id)
+ return NULL;
+
+ override = system_id->driver_data;
+ if (strcmp(override->i2c_name, i2c_name))
+ return NULL;
+
+ return override->i2c_hid_desc;
+}
+
+char *i2c_hid_get_dmi_hid_report_desc_override(uint8_t *i2c_name,
+ unsigned int *size)
+{
+ struct i2c_hid_desc_override *override;
+ const struct dmi_system_id *system_id;
+
+ system_id = dmi_first_match(i2c_hid_dmi_desc_override_table);
+ if (!system_id)
+ return NULL;
+
+ override = system_id->driver_data;
+ if (strcmp(override->i2c_name, i2c_name))
+ return NULL;
+
+ *size = override->hid_report_desc_size;
+ return override->hid_report_desc;
+}
diff --git a/drivers/hid/i2c-hid/i2c-hid.h b/drivers/hid/i2c-hid/i2c-hid.h
new file mode 100644
index 0000000..a8c19ae
--- /dev/null
+++ b/drivers/hid/i2c-hid/i2c-hid.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef I2C_HID_H
+#define I2C_HID_H
+
+
+#ifdef CONFIG_DMI
+struct i2c_hid_desc *i2c_hid_get_dmi_i2c_hid_desc_override(uint8_t *i2c_name);
+char *i2c_hid_get_dmi_hid_report_desc_override(uint8_t *i2c_name,
+ unsigned int *size);
+#else
+static inline struct i2c_hid_desc
+ *i2c_hid_get_dmi_i2c_hid_desc_override(uint8_t *i2c_name)
+{ return NULL; }
+static inline char *i2c_hid_get_dmi_hid_report_desc_override(uint8_t *i2c_name,
+ unsigned int *size)
+{ return NULL; }
+#endif
+
+#endif
diff --git a/drivers/hid/intel-ish-hid/ipc/ipc.c b/drivers/hid/intel-ish-hid/ipc/ipc.c
index 0c9ac4d..41d4453 100644
--- a/drivers/hid/intel-ish-hid/ipc/ipc.c
+++ b/drivers/hid/intel-ish-hid/ipc/ipc.c
@@ -92,7 +92,10 @@ static bool check_generated_interrupt(struct ishtp_device *dev)
IPC_INT_FROM_ISH_TO_HOST_CHV_AB(pisr_val);
} else {
pisr_val = ish_reg_read(dev, IPC_REG_PISR_BXT);
- interrupt_generated = IPC_INT_FROM_ISH_TO_HOST_BXT(pisr_val);
+ interrupt_generated = !!pisr_val;
+ /* only busy-clear bit is RW, others are RO */
+ if (pisr_val)
+ ish_reg_write(dev, IPC_REG_PISR_BXT, pisr_val);
}
return interrupt_generated;
@@ -795,11 +798,11 @@ int ish_hw_start(struct ishtp_device *dev)
{
ish_set_host_rdy(dev);
+ set_host_ready(dev);
+
/* After that we can enable ISH DMA operation and wakeup ISHFW */
ish_wakeup(dev);
- set_host_ready(dev);
-
/* wait for FW-initiated reset flow */
if (!dev->recvd_hw_ready)
wait_event_interruptible_timeout(dev->wait_hw_ready,
diff --git a/drivers/hid/intel-ish-hid/ishtp/bus.c b/drivers/hid/intel-ish-hid/ishtp/bus.c
index 2565215..0de18c7 100644
--- a/drivers/hid/intel-ish-hid/ishtp/bus.c
+++ b/drivers/hid/intel-ish-hid/ishtp/bus.c
@@ -628,7 +628,8 @@ int ishtp_cl_device_bind(struct ishtp_cl *cl)
spin_lock_irqsave(&cl->dev->device_list_lock, flags);
list_for_each_entry(cl_device, &cl->dev->device_list,
device_link) {
- if (cl_device->fw_client->client_id == cl->fw_client_id) {
+ if (cl_device->fw_client &&
+ cl_device->fw_client->client_id == cl->fw_client_id) {
cl->device = cl_device;
rv = 0;
break;
@@ -688,6 +689,7 @@ void ishtp_bus_remove_all_clients(struct ishtp_device *ishtp_dev,
spin_lock_irqsave(&ishtp_dev->device_list_lock, flags);
list_for_each_entry_safe(cl_device, n, &ishtp_dev->device_list,
device_link) {
+ cl_device->fw_client = NULL;
if (warm_reset && cl_device->reference_count)
continue;
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
old mode 100644
new mode 100755
diff --git a/drivers/hwtracing/intel_th/gth.c b/drivers/hwtracing/intel_th/gth.c
index 33e0936..98a4cb5 100644
--- a/drivers/hwtracing/intel_th/gth.c
+++ b/drivers/hwtracing/intel_th/gth.c
@@ -599,11 +599,15 @@ static void intel_th_gth_unassign(struct intel_th_device *thdev,
{
struct gth_device *gth = dev_get_drvdata(&thdev->dev);
int port = othdev->output.port;
+ int master;
spin_lock(>h->gth_lock);
othdev->output.port = -1;
othdev->output.active = false;
gth->output[port].output = NULL;
+ for (master = 0; master <= TH_CONFIGURABLE_MASTERS; master++)
+ if (gth->master[master] == port)
+ gth->master[master] = -1;
spin_unlock(>h->gth_lock);
}
diff --git a/drivers/hwtracing/stm/core.c b/drivers/hwtracing/stm/core.c
index f260922..6efa2ed 100644
--- a/drivers/hwtracing/stm/core.c
+++ b/drivers/hwtracing/stm/core.c
@@ -253,6 +253,9 @@ static int find_free_channels(unsigned long *bitmap, unsigned int start,
;
if (i == width)
return pos;
+
+ /* step over [pos..pos+i) to continue search */
+ pos += i;
}
return -1;
@@ -565,7 +568,7 @@ static int stm_char_policy_set_ioctl(struct stm_file *stmf, void __user *arg)
{
struct stm_device *stm = stmf->stm;
struct stp_policy_id *id;
- int ret = -EINVAL;
+ int ret = -EINVAL, wlimit = 1;
u32 size;
if (stmf->output.nr_chans)
@@ -593,8 +596,10 @@ static int stm_char_policy_set_ioctl(struct stm_file *stmf, void __user *arg)
if (id->__reserved_0 || id->__reserved_1)
goto err_free;
- if (id->width < 1 ||
- id->width > PAGE_SIZE / stm->data->sw_mmiosz)
+ if (stm->data->sw_mmiosz)
+ wlimit = PAGE_SIZE / stm->data->sw_mmiosz;
+
+ if (id->width < 1 || id->width > wlimit)
goto err_free;
ret = stm_file_assign(stmf, id->id, id->width);
diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
index 45d6771..59c08d5 100644
--- a/drivers/i2c/busses/i2c-cadence.c
+++ b/drivers/i2c/busses/i2c-cadence.c
@@ -382,8 +382,10 @@ static void cdns_i2c_mrecv(struct cdns_i2c *id)
* Check for the message size against FIFO depth and set the
* 'hold bus' bit if it is greater than FIFO depth.
*/
- if (id->recv_count > CDNS_I2C_FIFO_DEPTH)
+ if ((id->recv_count > CDNS_I2C_FIFO_DEPTH) || id->bus_hold_flag)
ctrl_reg |= CDNS_I2C_CR_HOLD;
+ else
+ ctrl_reg = ctrl_reg & ~CDNS_I2C_CR_HOLD;
cdns_i2c_writereg(ctrl_reg, CDNS_I2C_CR_OFFSET);
@@ -440,8 +442,11 @@ static void cdns_i2c_msend(struct cdns_i2c *id)
* Check for the message size against FIFO depth and set the
* 'hold bus' bit if it is greater than FIFO depth.
*/
- if (id->send_count > CDNS_I2C_FIFO_DEPTH)
+ if ((id->send_count > CDNS_I2C_FIFO_DEPTH) || id->bus_hold_flag)
ctrl_reg |= CDNS_I2C_CR_HOLD;
+ else
+ ctrl_reg = ctrl_reg & ~CDNS_I2C_CR_HOLD;
+
cdns_i2c_writereg(ctrl_reg, CDNS_I2C_CR_OFFSET);
/* Clear the interrupts in interrupt status register. */
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 586e557..36100f4 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -794,7 +794,7 @@ static const struct i2c_algorithm tegra_i2c_algo = {
/* payload size is only 12 bit */
static struct i2c_adapter_quirks tegra_i2c_quirks = {
.max_read_len = 4096,
- .max_write_len = 4096,
+ .max_write_len = 4096 - 12,
};
static const struct tegra_i2c_hw_feature tegra20_i2c_hw = {
diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
index 7484aac..80d82c6 100644
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -3250,16 +3250,16 @@ static s32 i2c_smbus_xfer_emulated(struct i2c_adapter *adapter, u16 addr,
the underlying bus driver */
break;
case I2C_SMBUS_I2C_BLOCK_DATA:
+ if (data->block[0] > I2C_SMBUS_BLOCK_MAX) {
+ dev_err(&adapter->dev, "Invalid block %s size %d\n",
+ read_write == I2C_SMBUS_READ ? "read" : "write",
+ data->block[0]);
+ return -EINVAL;
+ }
if (read_write == I2C_SMBUS_READ) {
msg[1].len = data->block[0];
} else {
msg[0].len = data->block[0] + 1;
- if (msg[0].len > I2C_SMBUS_BLOCK_MAX + 1) {
- dev_err(&adapter->dev,
- "Invalid block write size %d\n",
- data->block[0]);
- return -EINVAL;
- }
for (i = 1; i <= data->block[0]; i++)
msgbuf0[i] = data->block[i];
}
diff --git a/drivers/iio/accel/kxcjk-1013.c b/drivers/iio/accel/kxcjk-1013.c
index 7846368..780f886 100644
--- a/drivers/iio/accel/kxcjk-1013.c
+++ b/drivers/iio/accel/kxcjk-1013.c
@@ -1340,6 +1340,8 @@ static int kxcjk1013_resume(struct device *dev)
mutex_lock(&data->mutex);
ret = kxcjk1013_set_mode(data, OPERATION);
+ if (ret == 0)
+ ret = kxcjk1013_set_range(data, data->range);
mutex_unlock(&data->mutex);
return ret;
diff --git a/drivers/iio/adc/ad_sigma_delta.c b/drivers/iio/adc/ad_sigma_delta.c
index 22c4c17..a1d072e 100644
--- a/drivers/iio/adc/ad_sigma_delta.c
+++ b/drivers/iio/adc/ad_sigma_delta.c
@@ -121,6 +121,7 @@ static int ad_sd_read_reg_raw(struct ad_sigma_delta *sigma_delta,
if (sigma_delta->info->has_registers) {
data[0] = reg << sigma_delta->info->addr_shift;
data[0] |= sigma_delta->info->read_mask;
+ data[0] |= sigma_delta->comm;
spi_message_add_tail(&t[0], &m);
}
spi_message_add_tail(&t[1], &m);
diff --git a/drivers/iio/adc/at91_adc.c b/drivers/iio/adc/at91_adc.c
index e3e2155..dd9f228 100644
--- a/drivers/iio/adc/at91_adc.c
+++ b/drivers/iio/adc/at91_adc.c
@@ -704,23 +704,29 @@ static int at91_adc_read_raw(struct iio_dev *idev,
ret = wait_event_interruptible_timeout(st->wq_data_avail,
st->done,
msecs_to_jiffies(1000));
- if (ret == 0)
- ret = -ETIMEDOUT;
- if (ret < 0) {
- mutex_unlock(&st->lock);
- return ret;
- }
- *val = st->last_value;
-
+ /* Disable interrupts, regardless if adc conversion was
+ * successful or not
+ */
at91_adc_writel(st, AT91_ADC_CHDR,
AT91_ADC_CH(chan->channel));
at91_adc_writel(st, AT91_ADC_IDR, BIT(chan->channel));
- st->last_value = 0;
- st->done = false;
+ if (ret > 0) {
+ /* a valid conversion took place */
+ *val = st->last_value;
+ st->last_value = 0;
+ st->done = false;
+ ret = IIO_VAL_INT;
+ } else if (ret == 0) {
+ /* conversion timeout */
+ dev_err(&idev->dev, "ADC Channel %d timeout.\n",
+ chan->channel);
+ ret = -ETIMEDOUT;
+ }
+
mutex_unlock(&st->lock);
- return IIO_VAL_INT;
+ return ret;
case IIO_CHAN_INFO_SCALE:
*val = st->vref_mv;
diff --git a/drivers/iio/adc/exynos_adc.c b/drivers/iio/adc/exynos_adc.c
index c15756d..eb8b735c 100644
--- a/drivers/iio/adc/exynos_adc.c
+++ b/drivers/iio/adc/exynos_adc.c
@@ -916,7 +916,7 @@ static int exynos_adc_remove(struct platform_device *pdev)
struct iio_dev *indio_dev = platform_get_drvdata(pdev);
struct exynos_adc *info = iio_priv(indio_dev);
- if (IS_REACHABLE(CONFIG_INPUT)) {
+ if (IS_REACHABLE(CONFIG_INPUT) && info->input) {
free_irq(info->tsirq, info);
input_unregister_device(info->input);
}
diff --git a/drivers/iio/gyro/bmg160_core.c b/drivers/iio/gyro/bmg160_core.c
index 821919d..b5a5517 100644
--- a/drivers/iio/gyro/bmg160_core.c
+++ b/drivers/iio/gyro/bmg160_core.c
@@ -583,11 +583,10 @@ static int bmg160_read_raw(struct iio_dev *indio_dev,
case IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY:
return bmg160_get_filter(data, val);
case IIO_CHAN_INFO_SCALE:
- *val = 0;
switch (chan->type) {
case IIO_TEMP:
- *val2 = 500000;
- return IIO_VAL_INT_PLUS_MICRO;
+ *val = 500;
+ return IIO_VAL_INT;
case IIO_ANGL_VEL:
{
int i;
@@ -595,6 +594,7 @@ static int bmg160_read_raw(struct iio_dev *indio_dev,
for (i = 0; i < ARRAY_SIZE(bmg160_scale_table); ++i) {
if (bmg160_scale_table[i].dps_range ==
data->dps_range) {
+ *val = 0;
*val2 = bmg160_scale_table[i].scale;
return IIO_VAL_INT_PLUS_MICRO;
}
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index dd18b74..a2322b2 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -1872,8 +1872,10 @@ static int abort_rpl(struct c4iw_dev *dev, struct sk_buff *skb)
}
mutex_unlock(&ep->com.mutex);
- if (release)
+ if (release) {
+ close_complete_upcall(ep, -ECONNRESET);
release_ep_resources(ep);
+ }
c4iw_put_ep(&ep->com);
return 0;
}
@@ -3567,7 +3569,6 @@ int c4iw_ep_disconnect(struct c4iw_ep *ep, int abrupt, gfp_t gfp)
if (close) {
if (abrupt) {
set_bit(EP_DISC_ABORT, &ep->com.history);
- close_complete_upcall(ep, -ECONNRESET);
ret = send_abort(ep);
} else {
set_bit(EP_DISC_CLOSE, &ep->com.history);
diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index 5e99390..ec13884 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -805,8 +805,8 @@ void mlx4_ib_destroy_alias_guid_service(struct mlx4_ib_dev *dev)
unsigned long flags;
for (i = 0 ; i < dev->num_ports; i++) {
- cancel_delayed_work(&dev->sriov.alias_guid.ports_guid[i].alias_guid_work);
det = &sriov->alias_guid.ports_guid[i];
+ cancel_delayed_work_sync(&det->alias_guid_work);
spin_lock_irqsave(&sriov->alias_guid.ag_work_lock, flags);
while (!list_empty(&det->cb_list)) {
cb_ctx = list_entry(det->cb_list.next,
diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index 39a4888..5dc920f 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -39,7 +39,7 @@
#include "mlx4_ib.h"
-#define CM_CLEANUP_CACHE_TIMEOUT (5 * HZ)
+#define CM_CLEANUP_CACHE_TIMEOUT (30 * HZ)
struct id_map_entry {
struct rb_node node;
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index 46b6497..49d55a0 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -497,11 +497,6 @@ static int rvt_set_page(struct ib_mr *ibmr, u64 addr)
if (unlikely(mapped_segs == mr->mr.max_segs))
return -ENOMEM;
- if (mr->mr.length == 0) {
- mr->mr.user_base = addr;
- mr->mr.iova = addr;
- }
-
m = mapped_segs / RVT_SEGSZ;
n = mapped_segs % RVT_SEGSZ;
mr->mr.map[m]->segs[n].vaddr = (void *)addr;
@@ -518,17 +513,24 @@ static int rvt_set_page(struct ib_mr *ibmr, u64 addr)
* @sg_nents: number of entries in sg
* @sg_offset: offset in bytes into sg
*
+ * Overwrite rvt_mr length with mr length calculated by ib_sg_to_pages.
+ *
* Return: number of sg elements mapped to the memory region
*/
int rvt_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
int sg_nents, unsigned int *sg_offset)
{
struct rvt_mr *mr = to_imr(ibmr);
+ int ret;
mr->mr.length = 0;
mr->mr.page_shift = PAGE_SHIFT;
- return ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset,
- rvt_set_page);
+ ret = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, rvt_set_page);
+ mr->mr.user_base = ibmr->iova;
+ mr->mr.iova = ibmr->iova;
+ mr->mr.offset = ibmr->iova - (u64)mr->mr.map[0]->segs[0].vaddr;
+ mr->mr.length = (size_t)ibmr->length;
+ return ret;
}
/**
@@ -559,6 +561,7 @@ int rvt_fast_reg_mr(struct rvt_qp *qp, struct ib_mr *ibmr, u32 key,
ibmr->rkey = key;
mr->mr.lkey = key;
mr->mr.access_flags = access;
+ mr->mr.iova = ibmr->iova;
atomic_set(&mr->mr.lkey_invalid, 0);
return 0;
diff --git a/drivers/input/keyboard/cap11xx.c b/drivers/input/keyboard/cap11xx.c
index 4401be2..3c53aa5 100644
--- a/drivers/input/keyboard/cap11xx.c
+++ b/drivers/input/keyboard/cap11xx.c
@@ -75,9 +75,7 @@
struct cap11xx_led {
struct cap11xx_priv *priv;
struct led_classdev cdev;
- struct work_struct work;
u32 reg;
- enum led_brightness new_brightness;
};
#endif
@@ -233,30 +231,21 @@ static void cap11xx_input_close(struct input_dev *idev)
}
#ifdef CONFIG_LEDS_CLASS
-static void cap11xx_led_work(struct work_struct *work)
-{
- struct cap11xx_led *led = container_of(work, struct cap11xx_led, work);
- struct cap11xx_priv *priv = led->priv;
- int value = led->new_brightness;
-
- /*
- * All LEDs share the same duty cycle as this is a HW limitation.
- * Brightness levels per LED are either 0 (OFF) and 1 (ON).
- */
- regmap_update_bits(priv->regmap, CAP11XX_REG_LED_OUTPUT_CONTROL,
- BIT(led->reg), value ? BIT(led->reg) : 0);
-}
-
-static void cap11xx_led_set(struct led_classdev *cdev,
- enum led_brightness value)
+static int cap11xx_led_set(struct led_classdev *cdev,
+ enum led_brightness value)
{
struct cap11xx_led *led = container_of(cdev, struct cap11xx_led, cdev);
+ struct cap11xx_priv *priv = led->priv;
- if (led->new_brightness == value)
- return;
-
- led->new_brightness = value;
- schedule_work(&led->work);
+ /*
+ * All LEDs share the same duty cycle as this is a HW
+ * limitation. Brightness levels per LED are either
+ * 0 (OFF) and 1 (ON).
+ */
+ return regmap_update_bits(priv->regmap,
+ CAP11XX_REG_LED_OUTPUT_CONTROL,
+ BIT(led->reg),
+ value ? BIT(led->reg) : 0);
}
static int cap11xx_init_leds(struct device *dev,
@@ -299,7 +288,7 @@ static int cap11xx_init_leds(struct device *dev,
led->cdev.default_trigger =
of_get_property(child, "linux,default-trigger", NULL);
led->cdev.flags = 0;
- led->cdev.brightness_set = cap11xx_led_set;
+ led->cdev.brightness_set_blocking = cap11xx_led_set;
led->cdev.max_brightness = 1;
led->cdev.brightness = LED_OFF;
@@ -312,8 +301,6 @@ static int cap11xx_init_leds(struct device *dev,
led->reg = reg;
led->priv = priv;
- INIT_WORK(&led->work, cap11xx_led_work);
-
error = devm_led_classdev_register(dev, &led->cdev);
if (error) {
of_node_put(child);
diff --git a/drivers/input/keyboard/matrix_keypad.c b/drivers/input/keyboard/matrix_keypad.c
index c64d874..2e12e31 100644
--- a/drivers/input/keyboard/matrix_keypad.c
+++ b/drivers/input/keyboard/matrix_keypad.c
@@ -220,7 +220,7 @@ static void matrix_keypad_stop(struct input_dev *dev)
keypad->stopped = true;
spin_unlock_irq(&keypad->lock);
- flush_work(&keypad->work.work);
+ flush_delayed_work(&keypad->work);
/*
* matrix_keypad_scan() will leave IRQs enabled;
* we should disable them now.
diff --git a/drivers/input/keyboard/st-keyscan.c b/drivers/input/keyboard/st-keyscan.c
index de7be4f..ebf9f64 100644
--- a/drivers/input/keyboard/st-keyscan.c
+++ b/drivers/input/keyboard/st-keyscan.c
@@ -153,6 +153,8 @@ static int keyscan_probe(struct platform_device *pdev)
input_dev->id.bustype = BUS_HOST;
+ keypad_data->input_dev = input_dev;
+
error = keypad_matrix_key_parse_dt(keypad_data);
if (error)
return error;
@@ -168,8 +170,6 @@ static int keyscan_probe(struct platform_device *pdev)
input_set_drvdata(input_dev, keypad_data);
- keypad_data->input_dev = input_dev;
-
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
keypad_data->base = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(keypad_data->base))
diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
old mode 100644
new mode 100755
diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
index c08ee8f..bd95a6a 100644
--- a/drivers/input/misc/Makefile
+++ b/drivers/input/misc/Makefile
@@ -42,7 +42,6 @@
obj-$(CONFIG_HP_SDC_RTC) += hp_sdc_rtc.o
obj-$(CONFIG_INPUT_IMS_PCU) += ims-pcu.o
obj-$(CONFIG_INPUT_IXP4XX_BEEPER) += ixp4xx-beeper.o
-obj-$(CONFIG_INPUT_KEYCHORD) += keychord.o
obj-$(CONFIG_INPUT_KEYSPAN_REMOTE) += keyspan_remote.o
obj-$(CONFIG_INPUT_KXTJ9) += kxtj9.o
obj-$(CONFIG_INPUT_M68K_BEEP) += m68kspkr.o
diff --git a/drivers/input/misc/keychord.c b/drivers/input/misc/keychord.c
deleted file mode 100644
index 82fefdf..0000000
--- a/drivers/input/misc/keychord.c
+++ /dev/null
@@ -1,467 +0,0 @@
-/*
- * drivers/input/misc/keychord.c
- *
- * Copyright (C) 2008 Google, Inc.
- * Author: Mike Lockwood <lockwood@android.com>
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
-*/
-
-#include <linux/poll.h>
-#include <linux/slab.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/spinlock.h>
-#include <linux/fs.h>
-#include <linux/miscdevice.h>
-#include <linux/keychord.h>
-#include <linux/sched.h>
-
-#define KEYCHORD_NAME "keychord"
-#define BUFFER_SIZE 16
-
-MODULE_AUTHOR("Mike Lockwood <lockwood@android.com>");
-MODULE_DESCRIPTION("Key chord input driver");
-MODULE_SUPPORTED_DEVICE("keychord");
-MODULE_LICENSE("GPL");
-
-#define NEXT_KEYCHORD(kc) ((struct input_keychord *) \
- ((char *)kc + sizeof(struct input_keychord) + \
- kc->count * sizeof(kc->keycodes[0])))
-
-struct keychord_device {
- struct input_handler input_handler;
- int registered;
-
- /* list of keychords to monitor */
- struct input_keychord *keychords;
- int keychord_count;
-
- /* bitmask of keys contained in our keychords */
- unsigned long keybit[BITS_TO_LONGS(KEY_CNT)];
- /* current state of the keys */
- unsigned long keystate[BITS_TO_LONGS(KEY_CNT)];
- /* number of keys that are currently pressed */
- int key_down;
-
- /* second input_device_id is needed for null termination */
- struct input_device_id device_ids[2];
-
- spinlock_t lock;
- wait_queue_head_t waitq;
- unsigned char head;
- unsigned char tail;
- __u16 buff[BUFFER_SIZE];
- /* Bit to serialize writes to this device */
-#define KEYCHORD_BUSY 0x01
- unsigned long flags;
- wait_queue_head_t write_waitq;
-};
-
-static int check_keychord(struct keychord_device *kdev,
- struct input_keychord *keychord)
-{
- int i;
-
- if (keychord->count != kdev->key_down)
- return 0;
-
- for (i = 0; i < keychord->count; i++) {
- if (!test_bit(keychord->keycodes[i], kdev->keystate))
- return 0;
- }
-
- /* we have a match */
- return 1;
-}
-
-static void keychord_event(struct input_handle *handle, unsigned int type,
- unsigned int code, int value)
-{
- struct keychord_device *kdev = handle->private;
- struct input_keychord *keychord;
- unsigned long flags;
- int i, got_chord = 0;
-
- if (type != EV_KEY || code >= KEY_MAX)
- return;
-
- spin_lock_irqsave(&kdev->lock, flags);
- /* do nothing if key state did not change */
- if (!test_bit(code, kdev->keystate) == !value)
- goto done;
- __change_bit(code, kdev->keystate);
- if (value)
- kdev->key_down++;
- else
- kdev->key_down--;
-
- /* don't notify on key up */
- if (!value)
- goto done;
- /* ignore this event if it is not one of the keys we are monitoring */
- if (!test_bit(code, kdev->keybit))
- goto done;
-
- keychord = kdev->keychords;
- if (!keychord)
- goto done;
-
- /* check to see if the keyboard state matches any keychords */
- for (i = 0; i < kdev->keychord_count; i++) {
- if (check_keychord(kdev, keychord)) {
- kdev->buff[kdev->head] = keychord->id;
- kdev->head = (kdev->head + 1) % BUFFER_SIZE;
- got_chord = 1;
- break;
- }
- /* skip to next keychord */
- keychord = NEXT_KEYCHORD(keychord);
- }
-
-done:
- spin_unlock_irqrestore(&kdev->lock, flags);
-
- if (got_chord) {
- pr_info("keychord: got keychord id %d. Any tasks: %d\n",
- keychord->id,
- !list_empty_careful(&kdev->waitq.task_list));
- wake_up_interruptible(&kdev->waitq);
- }
-}
-
-static int keychord_connect(struct input_handler *handler,
- struct input_dev *dev,
- const struct input_device_id *id)
-{
- int i, ret;
- struct input_handle *handle;
- struct keychord_device *kdev =
- container_of(handler, struct keychord_device, input_handler);
-
- /*
- * ignore this input device if it does not contain any keycodes
- * that we are monitoring
- */
- for (i = 0; i < KEY_MAX; i++) {
- if (test_bit(i, kdev->keybit) && test_bit(i, dev->keybit))
- break;
- }
- if (i == KEY_MAX)
- return -ENODEV;
-
- handle = kzalloc(sizeof(*handle), GFP_KERNEL);
- if (!handle)
- return -ENOMEM;
-
- handle->dev = dev;
- handle->handler = handler;
- handle->name = KEYCHORD_NAME;
- handle->private = kdev;
-
- ret = input_register_handle(handle);
- if (ret)
- goto err_input_register_handle;
-
- ret = input_open_device(handle);
- if (ret)
- goto err_input_open_device;
-
- pr_info("keychord: using input dev %s for fevent\n", dev->name);
- return 0;
-
-err_input_open_device:
- input_unregister_handle(handle);
-err_input_register_handle:
- kfree(handle);
- return ret;
-}
-
-static void keychord_disconnect(struct input_handle *handle)
-{
- input_close_device(handle);
- input_unregister_handle(handle);
- kfree(handle);
-}
-
-/*
- * keychord_read is used to read keychord events from the driver
- */
-static ssize_t keychord_read(struct file *file, char __user *buffer,
- size_t count, loff_t *ppos)
-{
- struct keychord_device *kdev = file->private_data;
- __u16 id;
- int retval;
- unsigned long flags;
-
- if (count < sizeof(id))
- return -EINVAL;
- count = sizeof(id);
-
- if (kdev->head == kdev->tail && (file->f_flags & O_NONBLOCK))
- return -EAGAIN;
-
- retval = wait_event_interruptible(kdev->waitq,
- kdev->head != kdev->tail);
- if (retval)
- return retval;
-
- spin_lock_irqsave(&kdev->lock, flags);
- /* pop a keychord ID off the queue */
- id = kdev->buff[kdev->tail];
- kdev->tail = (kdev->tail + 1) % BUFFER_SIZE;
- spin_unlock_irqrestore(&kdev->lock, flags);
-
- if (copy_to_user(buffer, &id, count))
- return -EFAULT;
-
- return count;
-}
-
-/*
- * serializes writes on a device. can use mutex_lock_interruptible()
- * for this particular use case as well - a matter of preference.
- */
-static int
-keychord_write_lock(struct keychord_device *kdev)
-{
- int ret;
- unsigned long flags;
-
- spin_lock_irqsave(&kdev->lock, flags);
- while (kdev->flags & KEYCHORD_BUSY) {
- spin_unlock_irqrestore(&kdev->lock, flags);
- ret = wait_event_interruptible(kdev->write_waitq,
- ((kdev->flags & KEYCHORD_BUSY) == 0));
- if (ret)
- return ret;
- spin_lock_irqsave(&kdev->lock, flags);
- }
- kdev->flags |= KEYCHORD_BUSY;
- spin_unlock_irqrestore(&kdev->lock, flags);
- return 0;
-}
-
-static void
-keychord_write_unlock(struct keychord_device *kdev)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&kdev->lock, flags);
- kdev->flags &= ~KEYCHORD_BUSY;
- spin_unlock_irqrestore(&kdev->lock, flags);
- wake_up_interruptible(&kdev->write_waitq);
-}
-
-/*
- * keychord_write is used to configure the driver
- */
-static ssize_t keychord_write(struct file *file, const char __user *buffer,
- size_t count, loff_t *ppos)
-{
- struct keychord_device *kdev = file->private_data;
- struct input_keychord *keychords = 0;
- struct input_keychord *keychord;
- int ret, i, key;
- unsigned long flags;
- size_t resid = count;
- size_t key_bytes;
-
- if (count < sizeof(struct input_keychord) || count > PAGE_SIZE)
- return -EINVAL;
- keychords = kzalloc(count, GFP_KERNEL);
- if (!keychords)
- return -ENOMEM;
-
- /* read list of keychords from userspace */
- if (copy_from_user(keychords, buffer, count)) {
- kfree(keychords);
- return -EFAULT;
- }
-
- /*
- * Serialize writes to this device to prevent various races.
- * 1) writers racing here could do duplicate input_unregister_handler()
- * calls, resulting in attempting to unlink a node from a list that
- * does not exist.
- * 2) writers racing here could do duplicate input_register_handler() calls
- * below, resulting in a duplicate insertion of a node into the list.
- * 3) a double kfree of keychords can occur (in the event that
- * input_register_handler() fails below.
- */
- ret = keychord_write_lock(kdev);
- if (ret) {
- kfree(keychords);
- return ret;
- }
-
- /* unregister handler before changing configuration */
- if (kdev->registered) {
- input_unregister_handler(&kdev->input_handler);
- kdev->registered = 0;
- }
-
- spin_lock_irqsave(&kdev->lock, flags);
- /* clear any existing configuration */
- kfree(kdev->keychords);
- kdev->keychords = 0;
- kdev->keychord_count = 0;
- kdev->key_down = 0;
- memset(kdev->keybit, 0, sizeof(kdev->keybit));
- memset(kdev->keystate, 0, sizeof(kdev->keystate));
- kdev->head = kdev->tail = 0;
-
- keychord = keychords;
-
- while (resid > 0) {
- /* Is the entire keychord entry header present ? */
- if (resid < sizeof(struct input_keychord)) {
- pr_err("keychord: Insufficient bytes present for header %zu\n",
- resid);
- goto err_unlock_return;
- }
- resid -= sizeof(struct input_keychord);
- if (keychord->count <= 0) {
- pr_err("keychord: invalid keycode count %d\n",
- keychord->count);
- goto err_unlock_return;
- }
- key_bytes = keychord->count * sizeof(keychord->keycodes[0]);
- /* Do we have all the expected keycodes ? */
- if (resid < key_bytes) {
- pr_err("keychord: Insufficient bytes present for keycount %zu\n",
- resid);
- goto err_unlock_return;
- }
- resid -= key_bytes;
-
- if (keychord->version != KEYCHORD_VERSION) {
- pr_err("keychord: unsupported version %d\n",
- keychord->version);
- goto err_unlock_return;
- }
-
- /* keep track of the keys we are monitoring in keybit */
- for (i = 0; i < keychord->count; i++) {
- key = keychord->keycodes[i];
- if (key < 0 || key >= KEY_CNT) {
- pr_err("keychord: keycode %d out of range\n",
- key);
- goto err_unlock_return;
- }
- __set_bit(key, kdev->keybit);
- }
-
- kdev->keychord_count++;
- keychord = NEXT_KEYCHORD(keychord);
- }
-
- kdev->keychords = keychords;
- spin_unlock_irqrestore(&kdev->lock, flags);
-
- ret = input_register_handler(&kdev->input_handler);
- if (ret) {
- kfree(keychords);
- kdev->keychords = 0;
- keychord_write_unlock(kdev);
- return ret;
- }
- kdev->registered = 1;
-
- keychord_write_unlock(kdev);
-
- return count;
-
-err_unlock_return:
- spin_unlock_irqrestore(&kdev->lock, flags);
- kfree(keychords);
- keychord_write_unlock(kdev);
- return -EINVAL;
-}
-
-static unsigned int keychord_poll(struct file *file, poll_table *wait)
-{
- struct keychord_device *kdev = file->private_data;
-
- poll_wait(file, &kdev->waitq, wait);
-
- if (kdev->head != kdev->tail)
- return POLLIN | POLLRDNORM;
-
- return 0;
-}
-
-static int keychord_open(struct inode *inode, struct file *file)
-{
- struct keychord_device *kdev;
-
- kdev = kzalloc(sizeof(struct keychord_device), GFP_KERNEL);
- if (!kdev)
- return -ENOMEM;
-
- spin_lock_init(&kdev->lock);
- init_waitqueue_head(&kdev->waitq);
- init_waitqueue_head(&kdev->write_waitq);
-
- kdev->input_handler.event = keychord_event;
- kdev->input_handler.connect = keychord_connect;
- kdev->input_handler.disconnect = keychord_disconnect;
- kdev->input_handler.name = KEYCHORD_NAME;
- kdev->input_handler.id_table = kdev->device_ids;
-
- kdev->device_ids[0].flags = INPUT_DEVICE_ID_MATCH_EVBIT;
- __set_bit(EV_KEY, kdev->device_ids[0].evbit);
-
- file->private_data = kdev;
-
- return 0;
-}
-
-static int keychord_release(struct inode *inode, struct file *file)
-{
- struct keychord_device *kdev = file->private_data;
-
- if (kdev->registered)
- input_unregister_handler(&kdev->input_handler);
- kfree(kdev->keychords);
- kfree(kdev);
-
- return 0;
-}
-
-static const struct file_operations keychord_fops = {
- .owner = THIS_MODULE,
- .open = keychord_open,
- .release = keychord_release,
- .read = keychord_read,
- .write = keychord_write,
- .poll = keychord_poll,
-};
-
-static struct miscdevice keychord_misc = {
- .fops = &keychord_fops,
- .name = KEYCHORD_NAME,
- .minor = MISC_DYNAMIC_MINOR,
-};
-
-static int __init keychord_init(void)
-{
- return misc_register(&keychord_misc);
-}
-
-static void __exit keychord_exit(void)
-{
- misc_deregister(&keychord_misc);
-}
-
-module_init(keychord_init);
-module_exit(keychord_exit);
diff --git a/drivers/input/rmi4/rmi_f11.c b/drivers/input/rmi4/rmi_f11.c
index f798f42..275f957 100644
--- a/drivers/input/rmi4/rmi_f11.c
+++ b/drivers/input/rmi4/rmi_f11.c
@@ -1198,7 +1198,7 @@ static int rmi_f11_initialize(struct rmi_function *fn)
ctrl->ctrl0_11[11] = ctrl->ctrl0_11[11] & ~BIT(0);
rc = f11_write_control_regs(fn, &f11->sens_query,
- &f11->dev_controls, fn->fd.query_base_addr);
+ &f11->dev_controls, fn->fd.control_base_addr);
if (rc)
dev_warn(&fn->dev, "Failed to write control registers\n");
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index ca22483..c1233d0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2599,7 +2599,12 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
/* Everything is mapped - write the right values into s->dma_address */
for_each_sg(sglist, s, nelems, i) {
- s->dma_address += address + s->offset;
+ /*
+ * Add in the remaining piece of the scatter-gather offset that
+ * was masked out when we were determining the physical address
+ * via (sg_phys(s) & PAGE_MASK) earlier.
+ */
+ s->dma_address += address + (s->offset & ~PAGE_MASK);
s->dma_length = s->length;
}
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 63110fb..d51734e 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -143,7 +143,7 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
for (tmp = dev; tmp; tmp = tmp->bus->self)
level++;
- size = sizeof(*info) + level * sizeof(struct acpi_dmar_pci_path);
+ size = sizeof(*info) + level * sizeof(info->path[0]);
if (size <= sizeof(dmar_pci_notify_info_buf)) {
info = (struct dmar_pci_notify_info *)dmar_pci_notify_info_buf;
} else {
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 86e3496..28feb17 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1636,6 +1636,9 @@ static void iommu_disable_protect_mem_regions(struct intel_iommu *iommu)
u32 pmen;
unsigned long flags;
+ if (!cap_plmr(iommu->cap) && !cap_phmr(iommu->cap))
+ return;
+
raw_spin_lock_irqsave(&iommu->register_lock, flags);
pmen = readl(iommu->reg + DMAR_PMEN_REG);
pmen &= ~DMA_PMEN_EPM;
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index d68a552..3085b47 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -207,7 +207,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
if (dma != virt_to_phys(table))
goto out_unmap;
}
- kmemleak_ignore(table);
+ if (lvl == 2)
+ kmemleak_ignore(table);
return table;
out_unmap:
diff --git a/drivers/irqchip/irq-mbigen.c b/drivers/irqchip/irq-mbigen.c
index 05d87f6..406bfe6 100644
--- a/drivers/irqchip/irq-mbigen.c
+++ b/drivers/irqchip/irq-mbigen.c
@@ -160,6 +160,9 @@ static void mbigen_write_msg(struct msi_desc *desc, struct msi_msg *msg)
void __iomem *base = d->chip_data;
u32 val;
+ if (!msg->address_lo && !msg->address_hi)
+ return;
+
base += get_mbigen_vec_reg(d->hwirq);
val = readl_relaxed(base);
diff --git a/drivers/isdn/hardware/mISDN/hfcmulti.c b/drivers/isdn/hardware/mISDN/hfcmulti.c
index 480c2d7..8feb8e9 100644
--- a/drivers/isdn/hardware/mISDN/hfcmulti.c
+++ b/drivers/isdn/hardware/mISDN/hfcmulti.c
@@ -4370,7 +4370,8 @@ setup_pci(struct hfc_multi *hc, struct pci_dev *pdev,
if (m->clock2)
test_and_set_bit(HFC_CHIP_CLOCK2, &hc->chip);
- if (ent->device == 0xB410) {
+ if (ent->vendor == PCI_VENDOR_ID_DIGIUM &&
+ ent->device == PCI_DEVICE_ID_DIGIUM_HFC4S) {
test_and_set_bit(HFC_CHIP_B410P, &hc->chip);
test_and_set_bit(HFC_CHIP_PCM_MASTER, &hc->chip);
test_and_clear_bit(HFC_CHIP_PCM_SLAVE, &hc->chip);
diff --git a/drivers/leds/leds-lp55xx-common.c b/drivers/leds/leds-lp55xx-common.c
index 5377f22..e265595 100644
--- a/drivers/leds/leds-lp55xx-common.c
+++ b/drivers/leds/leds-lp55xx-common.c
@@ -201,7 +201,7 @@ static void lp55xx_firmware_loaded(const struct firmware *fw, void *context)
if (!fw) {
dev_err(dev, "firmware request failed\n");
- goto out;
+ return;
}
/* handling firmware data is chip dependent */
@@ -214,9 +214,9 @@ static void lp55xx_firmware_loaded(const struct firmware *fw, void *context)
mutex_unlock(&chip->lock);
-out:
/* firmware should be released for other channel use */
release_firmware(chip->fw);
+ chip->fw = NULL;
}
static int lp55xx_request_firmware(struct lp55xx_chip *chip)
diff --git a/drivers/leds/trigger/ledtrig-heartbeat.c b/drivers/leds/trigger/ledtrig-heartbeat.c
index 410c39c..4600f95 100644
--- a/drivers/leds/trigger/ledtrig-heartbeat.c
+++ b/drivers/leds/trigger/ledtrig-heartbeat.c
@@ -17,6 +17,7 @@
#include <linux/slab.h>
#include <linux/timer.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/leds.h>
#include <linux/reboot.h>
#include "../leds.h"
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 121df39..fadad29 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -582,4 +582,17 @@
any more after all the data blocks it covers have been verified anyway.
If unsure, say N.
+
+config DM_BOW
+ tristate "Backup block device"
+ depends on BLK_DEV_DM
+ select DM_BUFIO
+ ---help---
+ This device-mapper target takes a device and keeps a log of all
+ changes using free blocks identified by issuing a trim command.
+ This can then be restored by running a command line utility,
+ or committed by simply replacing the target.
+
+ If unsure, say N.
+
endif # MD
diff --git a/drivers/md/Makefile b/drivers/md/Makefile
index c8dec9c..6535d50 100644
--- a/drivers/md/Makefile
+++ b/drivers/md/Makefile
@@ -62,6 +62,7 @@
obj-$(CONFIG_DM_LOG_WRITES) += dm-log-writes.o
obj-$(CONFIG_DM_REQ_CRYPT) += dm-req-crypt.o
obj-$(CONFIG_DM_ANDROID_VERITY) += dm-android-verity.o
+obj-$(CONFIG_DM_BOW) += dm-bow.o
ifeq ($(CONFIG_DM_UEVENT),y)
dm-mod-objs += dm-uevent.o
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 5a5c1f1..463ce67 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -215,7 +215,9 @@ STORE(__cached_dev)
d_strtoul(writeback_rate_d_term);
d_strtoul_nonzero(writeback_rate_p_term_inverse);
- d_strtoi_h(sequential_cutoff);
+ sysfs_strtoul_clamp(sequential_cutoff,
+ dc->sequential_cutoff,
+ 0, UINT_MAX);
d_strtoi_h(readahead);
if (attr == &sysfs_clear_stats)
@@ -645,8 +647,17 @@ STORE(__bch_cache_set)
c->error_limit = strtoul_or_return(buf) << IO_ERROR_SHIFT;
/* See count_io_errors() for why 88 */
- if (attr == &sysfs_io_error_halflife)
- c->error_decay = strtoul_or_return(buf) / 88;
+ if (attr == &sysfs_io_error_halflife) {
+ unsigned long v = 0;
+ ssize_t ret;
+
+ ret = strtoul_safe_clamp(buf, v, 0, UINT_MAX);
+ if (!ret) {
+ c->error_decay = v / 88;
+ return size;
+ }
+ return ret;
+ }
sysfs_strtoul(journal_delay_ms, c->journal_delay_ms);
sysfs_strtoul(verify, c->verify);
diff --git a/drivers/md/bcache/sysfs.h b/drivers/md/bcache/sysfs.h
index 0526fe9..e7a3c12 100644
--- a/drivers/md/bcache/sysfs.h
+++ b/drivers/md/bcache/sysfs.h
@@ -80,9 +80,16 @@ do { \
#define sysfs_strtoul_clamp(file, var, min, max) \
do { \
- if (attr == &sysfs_ ## file) \
- return strtoul_safe_clamp(buf, var, min, max) \
- ?: (ssize_t) size; \
+ if (attr == &sysfs_ ## file) { \
+ unsigned long v = 0; \
+ ssize_t ret; \
+ ret = strtoul_safe_clamp(buf, v, min, max); \
+ if (!ret) { \
+ var = v; \
+ return size; \
+ } \
+ return ret; \
+ } \
} while (0)
#define strtoul_or_return(cp) \
diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index cdf8d25..6fb6dd6 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -68,6 +68,9 @@ static inline bool should_writeback(struct cached_dev *dc, struct bio *bio,
in_use > CUTOFF_WRITEBACK_SYNC)
return false;
+ if (bio_op(bio) == REQ_OP_DISCARD)
+ return false;
+
if (dc->partial_stripes_expensive &&
bcache_dev_stripe_dirty(dc, bio->bi_iter.bi_sector,
bio_sectors(bio)))
diff --git a/drivers/md/dm-bow.c b/drivers/md/dm-bow.c
new file mode 100644
index 0000000..4e7f6c0
--- /dev/null
+++ b/drivers/md/dm-bow.c
@@ -0,0 +1,1233 @@
+/*
+ * Copyright (C) 2018 Google Limited.
+ *
+ * This file is released under the GPL.
+ */
+
+#include "dm.h"
+#include "dm-bufio.h"
+#include "dm-core.h"
+
+#include <linux/crc32.h>
+#include <linux/module.h>
+
+#define DM_MSG_PREFIX "bow"
+#define SECTOR_SIZE 512
+
+struct log_entry {
+ u64 source;
+ u64 dest;
+ u32 size;
+ u32 checksum;
+} __packed;
+
+struct log_sector {
+ u32 magic;
+ u16 header_version;
+ u16 header_size;
+ u32 block_size;
+ u32 count;
+ u32 sequence;
+ sector_t sector0;
+ struct log_entry entries[];
+} __packed;
+
+/*
+ * MAGIC is BOW in ascii
+ */
+#define MAGIC 0x00574f42
+#define HEADER_VERSION 0x0100
+
+/*
+ * A sorted set of ranges representing the state of the data on the device.
+ * Use an rb_tree for fast lookup of a given sector
+ * Consecutive ranges are always of different type - operations on this
+ * set must merge matching consecutive ranges.
+ *
+ * Top range is always of type TOP
+ */
+struct bow_range {
+ struct rb_node node;
+ sector_t sector;
+ enum {
+ INVALID, /* Type not set */
+ SECTOR0, /* First sector - holds log record */
+ SECTOR0_CURRENT,/* Live contents of sector0 */
+ UNCHANGED, /* Original contents */
+ TRIMMED, /* Range has been trimmed */
+ CHANGED, /* Range has been changed */
+ BACKUP, /* Range is being used as a backup */
+ TOP, /* Final range - sector is size of device */
+ } type;
+ struct list_head trimmed_list; /* list of TRIMMED ranges */
+};
+
+static const char * const readable_type[] = {
+ "Invalid",
+ "Sector0",
+ "Sector0_current",
+ "Unchanged",
+ "Free",
+ "Changed",
+ "Backup",
+ "Top",
+};
+
+enum state {
+ TRIM,
+ CHECKPOINT,
+ COMMITTED,
+};
+
+struct bow_context {
+ struct dm_dev *dev;
+ u32 block_size;
+ u32 block_shift;
+ struct workqueue_struct *workqueue;
+ struct dm_bufio_client *bufio;
+ struct mutex ranges_lock; /* Hold to access this struct and/or ranges */
+ struct rb_root ranges;
+ struct dm_kobject_holder kobj_holder; /* for sysfs attributes */
+ atomic_t state; /* One of the enum state values above */
+ u64 trims_total;
+ struct log_sector *log_sector;
+ struct list_head trimmed_list;
+ bool forward_trims;
+};
+
+sector_t range_top(struct bow_range *br)
+{
+ return container_of(rb_next(&br->node), struct bow_range, node)
+ ->sector;
+}
+
+u64 range_size(struct bow_range *br)
+{
+ return (range_top(br) - br->sector) * SECTOR_SIZE;
+}
+
+static sector_t bvec_top(struct bvec_iter *bi_iter)
+{
+ return bi_iter->bi_sector + bi_iter->bi_size / SECTOR_SIZE;
+}
+
+/*
+ * Find the first range that overlaps with bi_iter
+ * bi_iter is set to the size of the overlapping sub-range
+ */
+static struct bow_range *find_first_overlapping_range(struct rb_root *ranges,
+ struct bvec_iter *bi_iter)
+{
+ struct rb_node *node = ranges->rb_node;
+ struct bow_range *br;
+
+ while (node) {
+ br = container_of(node, struct bow_range, node);
+
+ if (br->sector <= bi_iter->bi_sector
+ && bi_iter->bi_sector < range_top(br))
+ break;
+
+ if (bi_iter->bi_sector < br->sector)
+ node = node->rb_left;
+ else
+ node = node->rb_right;
+ }
+
+ WARN_ON(!node);
+ if (!node)
+ return NULL;
+
+ if (range_top(br) - bi_iter->bi_sector
+ < bi_iter->bi_size >> SECTOR_SHIFT)
+ bi_iter->bi_size = (range_top(br) - bi_iter->bi_sector)
+ << SECTOR_SHIFT;
+
+ return br;
+}
+
+void add_before(struct rb_root *ranges, struct bow_range *new_br,
+ struct bow_range *existing)
+{
+ struct rb_node *parent = &(existing->node);
+ struct rb_node **link = &(parent->rb_left);
+
+ while (*link) {
+ parent = *link;
+ link = &((*link)->rb_right);
+ }
+
+ rb_link_node(&new_br->node, parent, link);
+ rb_insert_color(&new_br->node, ranges);
+}
+
+/*
+ * Given a range br returned by find_first_overlapping_range, split br into a
+ * leading range, a range matching the bi_iter and a trailing range.
+ * Leading and trailing may end up size 0 and will then be deleted. The
+ * new range matching the bi_iter is then returned and should have its type
+ * and type specific fields populated.
+ * If bi_iter runs off the end of the range, bi_iter is truncated accordingly
+ */
+static int split_range(struct bow_context *bc, struct bow_range **br,
+ struct bvec_iter *bi_iter)
+{
+ struct bow_range *new_br;
+
+ if (bi_iter->bi_sector < (*br)->sector) {
+ WARN_ON(true);
+ return -EIO;
+ }
+
+ if (bi_iter->bi_sector > (*br)->sector) {
+ struct bow_range *leading_br =
+ kzalloc(sizeof(*leading_br), GFP_KERNEL);
+
+ if (!leading_br)
+ return -ENOMEM;
+
+ *leading_br = **br;
+ if (leading_br->type == TRIMMED)
+ list_add(&leading_br->trimmed_list, &bc->trimmed_list);
+
+ add_before(&bc->ranges, leading_br, *br);
+ (*br)->sector = bi_iter->bi_sector;
+ }
+
+ if (bvec_top(bi_iter) >= range_top(*br)) {
+ bi_iter->bi_size = (range_top(*br) - (*br)->sector)
+ * SECTOR_SIZE;
+ return 0;
+ }
+
+ /* new_br will be the beginning, existing br will be the tail */
+ new_br = kzalloc(sizeof(*new_br), GFP_KERNEL);
+ if (!new_br)
+ return -ENOMEM;
+
+ new_br->sector = (*br)->sector;
+ (*br)->sector = bvec_top(bi_iter);
+ add_before(&bc->ranges, new_br, *br);
+ *br = new_br;
+
+ return 0;
+}
+
+/*
+ * Sets type of a range. May merge range into surrounding ranges
+ * Since br may be invalidated, always sets br to NULL to prevent
+ * usage after this is called
+ */
+static void set_type(struct bow_context *bc, struct bow_range **br, int type)
+{
+ struct bow_range *prev = container_of(rb_prev(&(*br)->node),
+ struct bow_range, node);
+ struct bow_range *next = container_of(rb_next(&(*br)->node),
+ struct bow_range, node);
+
+ if ((*br)->type == TRIMMED) {
+ bc->trims_total -= range_size(*br);
+ list_del(&(*br)->trimmed_list);
+ }
+
+ if (type == TRIMMED) {
+ bc->trims_total += range_size(*br);
+ list_add(&(*br)->trimmed_list, &bc->trimmed_list);
+ }
+
+ (*br)->type = type;
+
+ if (next->type == type) {
+ if (type == TRIMMED)
+ list_del(&next->trimmed_list);
+ rb_erase(&next->node, &bc->ranges);
+ kfree(next);
+ }
+
+ if (prev->type == type) {
+ if (type == TRIMMED)
+ list_del(&(*br)->trimmed_list);
+ rb_erase(&(*br)->node, &bc->ranges);
+ kfree(*br);
+ }
+
+ *br = NULL;
+}
+
+static struct bow_range *find_free_range(struct bow_context *bc)
+{
+ if (list_empty(&bc->trimmed_list)) {
+ DMERR("Unable to find free space to back up to");
+ return NULL;
+ }
+
+ return list_first_entry(&bc->trimmed_list, struct bow_range,
+ trimmed_list);
+}
+
+static sector_t sector_to_page(struct bow_context const *bc, sector_t sector)
+{
+ WARN_ON((sector & (((sector_t)1 << (bc->block_shift - SECTOR_SHIFT)) - 1))
+ != 0);
+ return sector >> (bc->block_shift - SECTOR_SHIFT);
+}
+
+static int copy_data(struct bow_context const *bc,
+ struct bow_range *source, struct bow_range *dest,
+ u32 *checksum)
+{
+ int i;
+
+ if (range_size(source) != range_size(dest)) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ if (checksum)
+ *checksum = sector_to_page(bc, source->sector);
+
+ for (i = 0; i < range_size(source) >> bc->block_shift; ++i) {
+ struct dm_buffer *read_buffer, *write_buffer;
+ u8 *read, *write;
+ sector_t page = sector_to_page(bc, source->sector) + i;
+
+ read = dm_bufio_read(bc->bufio, page, &read_buffer);
+ if (IS_ERR(read)) {
+ DMERR("Cannot read page %llu",
+ (unsigned long long)page);
+ return PTR_ERR(read);
+ }
+
+ if (checksum)
+ *checksum = crc32(*checksum, read, bc->block_size);
+
+ write = dm_bufio_new(bc->bufio,
+ sector_to_page(bc, dest->sector) + i,
+ &write_buffer);
+ if (IS_ERR(write)) {
+ DMERR("Cannot write sector");
+ dm_bufio_release(read_buffer);
+ return PTR_ERR(write);
+ }
+
+ memcpy(write, read, bc->block_size);
+
+ dm_bufio_mark_buffer_dirty(write_buffer);
+ dm_bufio_release(write_buffer);
+ dm_bufio_release(read_buffer);
+ }
+
+ dm_bufio_write_dirty_buffers(bc->bufio);
+ return 0;
+}
+
+/****** logging functions ******/
+
+static int add_log_entry(struct bow_context *bc, sector_t source, sector_t dest,
+ unsigned int size, u32 checksum);
+
+static int backup_log_sector(struct bow_context *bc)
+{
+ struct bow_range *first_br, *free_br;
+ struct bvec_iter bi_iter;
+ u32 checksum = 0;
+ int ret;
+
+ first_br = container_of(rb_first(&bc->ranges), struct bow_range, node);
+
+ if (first_br->type != SECTOR0) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ if (range_size(first_br) != bc->block_size) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ free_br = find_free_range(bc);
+ /* No space left - return this error to userspace */
+ if (!free_br)
+ return -ENOSPC;
+ bi_iter.bi_sector = free_br->sector;
+ bi_iter.bi_size = bc->block_size;
+ ret = split_range(bc, &free_br, &bi_iter);
+ if (ret)
+ return ret;
+ if (bi_iter.bi_size != bc->block_size) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ ret = copy_data(bc, first_br, free_br, &checksum);
+ if (ret)
+ return ret;
+
+ bc->log_sector->count = 0;
+ bc->log_sector->sequence++;
+ ret = add_log_entry(bc, first_br->sector, free_br->sector,
+ range_size(first_br), checksum);
+ if (ret)
+ return ret;
+
+ set_type(bc, &free_br, BACKUP);
+ return 0;
+}
+
+static int add_log_entry(struct bow_context *bc, sector_t source, sector_t dest,
+ unsigned int size, u32 checksum)
+{
+ struct dm_buffer *sector_buffer;
+ u8 *sector;
+
+ if (sizeof(struct log_sector)
+ + sizeof(struct log_entry) * (bc->log_sector->count + 1)
+ > bc->block_size) {
+ int ret = backup_log_sector(bc);
+
+ if (ret)
+ return ret;
+ }
+
+ sector = dm_bufio_new(bc->bufio, 0, §or_buffer);
+ if (IS_ERR(sector)) {
+ DMERR("Cannot write boot sector");
+ dm_bufio_release(sector_buffer);
+ return -ENOSPC;
+ }
+
+ bc->log_sector->entries[bc->log_sector->count].source = source;
+ bc->log_sector->entries[bc->log_sector->count].dest = dest;
+ bc->log_sector->entries[bc->log_sector->count].size = size;
+ bc->log_sector->entries[bc->log_sector->count].checksum = checksum;
+ bc->log_sector->count++;
+
+ memcpy(sector, bc->log_sector, bc->block_size);
+ dm_bufio_mark_buffer_dirty(sector_buffer);
+ dm_bufio_release(sector_buffer);
+ dm_bufio_write_dirty_buffers(bc->bufio);
+ return 0;
+}
+
+static int prepare_log(struct bow_context *bc)
+{
+ struct bow_range *free_br, *first_br;
+ struct bvec_iter bi_iter;
+ u32 checksum = 0;
+ int ret;
+
+ /* Carve out first sector as log sector */
+ first_br = container_of(rb_first(&bc->ranges), struct bow_range, node);
+ if (first_br->type != UNCHANGED) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ if (range_size(first_br) < bc->block_size) {
+ WARN_ON(1);
+ return -EIO;
+ }
+ bi_iter.bi_sector = 0;
+ bi_iter.bi_size = bc->block_size;
+ ret = split_range(bc, &first_br, &bi_iter);
+ if (ret)
+ return ret;
+ first_br->type = SECTOR0;
+ if (range_size(first_br) != bc->block_size) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+ /* Find free sector for active sector0 reads/writes */
+ free_br = find_free_range(bc);
+ if (!free_br)
+ return -ENOSPC;
+ bi_iter.bi_sector = free_br->sector;
+ bi_iter.bi_size = bc->block_size;
+ ret = split_range(bc, &free_br, &bi_iter);
+ if (ret)
+ return ret;
+ free_br->type = SECTOR0_CURRENT;
+
+ /* Copy data */
+ ret = copy_data(bc, first_br, free_br, NULL);
+ if (ret)
+ return ret;
+
+ bc->log_sector->sector0 = free_br->sector;
+
+ /* Find free sector to back up original sector zero */
+ free_br = find_free_range(bc);
+ if (!free_br)
+ return -ENOSPC;
+ bi_iter.bi_sector = free_br->sector;
+ bi_iter.bi_size = bc->block_size;
+ ret = split_range(bc, &free_br, &bi_iter);
+ if (ret)
+ return ret;
+
+ /* Back up */
+ ret = copy_data(bc, first_br, free_br, &checksum);
+ if (ret)
+ return ret;
+
+ /*
+ * Set up our replacement boot sector - it will get written when we
+ * add the first log entry, which we do immediately
+ */
+ bc->log_sector->magic = MAGIC;
+ bc->log_sector->header_version = HEADER_VERSION;
+ bc->log_sector->header_size = sizeof(*bc->log_sector);
+ bc->log_sector->block_size = bc->block_size;
+ bc->log_sector->count = 0;
+ bc->log_sector->sequence = 0;
+
+ /* Add log entry */
+ ret = add_log_entry(bc, first_br->sector, free_br->sector,
+ range_size(first_br), checksum);
+ if (ret)
+ return ret;
+
+ set_type(bc, &free_br, BACKUP);
+ return 0;
+}
+
+static struct bow_range *find_sector0_current(struct bow_context *bc)
+{
+ struct bvec_iter bi_iter;
+
+ bi_iter.bi_sector = bc->log_sector->sector0;
+ bi_iter.bi_size = bc->block_size;
+ return find_first_overlapping_range(&bc->ranges, &bi_iter);
+}
+
+/****** sysfs interface functions ******/
+
+static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *buf)
+{
+ struct bow_context *bc = container_of(kobj, struct bow_context,
+ kobj_holder.kobj);
+
+ return scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&bc->state));
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct bow_context *bc = container_of(kobj, struct bow_context,
+ kobj_holder.kobj);
+ enum state state, original_state;
+ int ret;
+
+ state = buf[0] - '0';
+ if (state < TRIM || state > COMMITTED) {
+ DMERR("State value %d out of range", state);
+ return -EINVAL;
+ }
+
+ mutex_lock(&bc->ranges_lock);
+ original_state = atomic_read(&bc->state);
+ if (state != original_state + 1) {
+ DMERR("Invalid state change from %d to %d",
+ original_state, state);
+ ret = -EINVAL;
+ goto bad;
+ }
+
+ DMINFO("Switching to state %s", state == CHECKPOINT ? "Checkpoint"
+ : state == COMMITTED ? "Committed" : "Unknown");
+
+ if (state == CHECKPOINT) {
+ ret = prepare_log(bc);
+ if (ret) {
+ DMERR("Failed to switch to checkpoint state");
+ goto bad;
+ }
+ } else if (state == COMMITTED) {
+ struct bow_range *br = find_sector0_current(bc);
+ struct bow_range *sector0_br =
+ container_of(rb_first(&bc->ranges), struct bow_range,
+ node);
+
+ ret = copy_data(bc, br, sector0_br, 0);
+ if (ret) {
+ DMERR("Failed to switch to committed state");
+ goto bad;
+ }
+ }
+ atomic_inc(&bc->state);
+ ret = count;
+
+bad:
+ mutex_unlock(&bc->ranges_lock);
+ return ret;
+}
+
+static ssize_t free_show(struct kobject *kobj, struct kobj_attribute *attr,
+ char *buf)
+{
+ struct bow_context *bc = container_of(kobj, struct bow_context,
+ kobj_holder.kobj);
+ u64 trims_total;
+
+ mutex_lock(&bc->ranges_lock);
+ trims_total = bc->trims_total;
+ mutex_unlock(&bc->ranges_lock);
+
+ return scnprintf(buf, PAGE_SIZE, "%llu\n", trims_total);
+}
+
+static struct kobj_attribute attr_state = __ATTR_RW(state);
+static struct kobj_attribute attr_free = __ATTR_RO(free);
+
+static struct attribute *bow_attrs[] = {
+ &attr_state.attr,
+ &attr_free.attr,
+ NULL
+};
+
+static struct kobj_type bow_ktype = {
+ .sysfs_ops = &kobj_sysfs_ops,
+ .default_attrs = bow_attrs,
+ .release = dm_kobject_release
+};
+
+/****** constructor/destructor ******/
+
+static void dm_bow_dtr(struct dm_target *ti)
+{
+ struct bow_context *bc = (struct bow_context *) ti->private;
+ struct kobject *kobj;
+
+ while (rb_first(&bc->ranges)) {
+ struct bow_range *br = container_of(rb_first(&bc->ranges),
+ struct bow_range, node);
+
+ rb_erase(&br->node, &bc->ranges);
+ kfree(br);
+ }
+ if (bc->workqueue)
+ destroy_workqueue(bc->workqueue);
+ if (bc->bufio)
+ dm_bufio_client_destroy(bc->bufio);
+
+ kobj = &bc->kobj_holder.kobj;
+ if (kobj->state_initialized) {
+ kobject_put(kobj);
+ wait_for_completion(dm_get_completion_from_kobject(kobj));
+ }
+
+ kfree(bc->log_sector);
+ kfree(bc);
+}
+
+static int dm_bow_ctr(struct dm_target *ti, unsigned int argc, char **argv)
+{
+ struct bow_context *bc;
+ struct bow_range *br;
+ int ret;
+ struct mapped_device *md = dm_table_get_md(ti->table);
+
+ if (argc != 1) {
+ ti->error = "Invalid argument count";
+ return -EINVAL;
+ }
+
+ bc = kzalloc(sizeof(*bc), GFP_KERNEL);
+ if (!bc) {
+ ti->error = "Cannot allocate bow context";
+ return -ENOMEM;
+ }
+
+ ti->num_flush_bios = 1;
+ ti->num_discard_bios = 1;
+ ti->num_write_same_bios = 1;
+ ti->private = bc;
+
+ ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table),
+ &bc->dev);
+ if (ret) {
+ ti->error = "Device lookup failed";
+ goto bad;
+ }
+
+ if (bc->dev->bdev->bd_queue->limits.max_discard_sectors == 0) {
+ bc->dev->bdev->bd_queue->limits.discard_granularity = 1 << 12;
+ bc->dev->bdev->bd_queue->limits.max_hw_discard_sectors = 1 << 15;
+ bc->dev->bdev->bd_queue->limits.max_discard_sectors = 1 << 15;
+ bc->forward_trims = false;
+ } else {
+ bc->forward_trims = true;
+ }
+
+ bc->block_size = bc->dev->bdev->bd_queue->limits.logical_block_size;
+ bc->block_shift = ilog2(bc->block_size);
+ bc->log_sector = kzalloc(bc->block_size, GFP_KERNEL);
+ if (!bc->log_sector) {
+ ti->error = "Cannot allocate log sector";
+ goto bad;
+ }
+
+ init_completion(&bc->kobj_holder.completion);
+ ret = kobject_init_and_add(&bc->kobj_holder.kobj, &bow_ktype,
+ &disk_to_dev(dm_disk(md))->kobj, "%s",
+ "bow");
+ if (ret) {
+ ti->error = "Cannot create sysfs node";
+ goto bad;
+ }
+
+ mutex_init(&bc->ranges_lock);
+ bc->ranges = RB_ROOT;
+ bc->bufio = dm_bufio_client_create(bc->dev->bdev, bc->block_size, 1, 0,
+ NULL, NULL);
+ if (IS_ERR(bc->bufio)) {
+ ti->error = "Cannot initialize dm-bufio";
+ ret = PTR_ERR(bc->bufio);
+ bc->bufio = NULL;
+ goto bad;
+ }
+
+ bc->workqueue = alloc_workqueue("dm-bow",
+ WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM
+ | WQ_UNBOUND, num_online_cpus());
+ if (!bc->workqueue) {
+ ti->error = "Cannot allocate workqueue";
+ ret = -ENOMEM;
+ goto bad;
+ }
+
+ INIT_LIST_HEAD(&bc->trimmed_list);
+
+ br = kzalloc(sizeof(*br), GFP_KERNEL);
+ if (!br) {
+ ti->error = "Cannot allocate ranges";
+ ret = -ENOMEM;
+ goto bad;
+ }
+
+ br->sector = ti->len;
+ br->type = TOP;
+ rb_link_node(&br->node, NULL, &bc->ranges.rb_node);
+ rb_insert_color(&br->node, &bc->ranges);
+
+ br = kzalloc(sizeof(*br), GFP_KERNEL);
+ if (!br) {
+ ti->error = "Cannot allocate ranges";
+ ret = -ENOMEM;
+ goto bad;
+ }
+
+ br->sector = 0;
+ br->type = UNCHANGED;
+ rb_link_node(&br->node, bc->ranges.rb_node,
+ &bc->ranges.rb_node->rb_left);
+ rb_insert_color(&br->node, &bc->ranges);
+
+ ti->discards_supported = true;
+
+ return 0;
+
+bad:
+ dm_bow_dtr(ti);
+ return ret;
+}
+
+/****** Handle writes ******/
+
+static int prepare_unchanged_range(struct bow_context *bc, struct bow_range *br,
+ struct bvec_iter *bi_iter,
+ bool record_checksum)
+{
+ struct bow_range *backup_br;
+ struct bvec_iter backup_bi;
+ sector_t log_source, log_dest;
+ unsigned int log_size;
+ u32 checksum = 0;
+ int ret;
+ int original_type;
+ sector_t sector0;
+
+ /* Find a free range */
+ backup_br = find_free_range(bc);
+ if (!backup_br)
+ return -ENOSPC;
+
+ /* Carve out a backup range. This may be smaller than the br given */
+ backup_bi.bi_sector = backup_br->sector;
+ backup_bi.bi_size = min(range_size(backup_br), (u64) bi_iter->bi_size);
+ ret = split_range(bc, &backup_br, &backup_bi);
+ if (ret)
+ return ret;
+
+ /*
+ * Carve out a changed range. This will not be smaller than the backup
+ * br since the backup br is smaller than the source range and iterator
+ */
+ bi_iter->bi_size = backup_bi.bi_size;
+ ret = split_range(bc, &br, bi_iter);
+ if (ret)
+ return ret;
+ if (range_size(br) != range_size(backup_br)) {
+ WARN_ON(1);
+ return -EIO;
+ }
+
+
+ /* Copy data over */
+ ret = copy_data(bc, br, backup_br, record_checksum ? &checksum : NULL);
+ if (ret)
+ return ret;
+
+ /* Add an entry to the log */
+ log_source = br->sector;
+ log_dest = backup_br->sector;
+ log_size = range_size(br);
+
+ /*
+ * Set the types. Note that since set_type also amalgamates ranges
+ * we have to set both sectors to their final type before calling
+ * set_type on either
+ */
+ original_type = br->type;
+ sector0 = backup_br->sector;
+ if (backup_br->type == TRIMMED)
+ list_del(&backup_br->trimmed_list);
+ backup_br->type = br->type == SECTOR0_CURRENT ? SECTOR0_CURRENT
+ : BACKUP;
+ br->type = CHANGED;
+ set_type(bc, &backup_br, backup_br->type);
+
+ /*
+ * Add the log entry after marking the backup sector, since adding a log
+ * can cause another backup
+ */
+ ret = add_log_entry(bc, log_source, log_dest, log_size, checksum);
+ if (ret) {
+ br->type = original_type;
+ return ret;
+ }
+
+ /* Now it is safe to mark this backup successful */
+ if (original_type == SECTOR0_CURRENT)
+ bc->log_sector->sector0 = sector0;
+
+ set_type(bc, &br, br->type);
+ return ret;
+}
+
+static int prepare_free_range(struct bow_context *bc, struct bow_range *br,
+ struct bvec_iter *bi_iter)
+{
+ int ret;
+
+ ret = split_range(bc, &br, bi_iter);
+ if (ret)
+ return ret;
+ set_type(bc, &br, CHANGED);
+ return 0;
+}
+
+static int prepare_changed_range(struct bow_context *bc, struct bow_range *br,
+ struct bvec_iter *bi_iter)
+{
+ /* Nothing to do ... */
+ return 0;
+}
+
+static int prepare_one_range(struct bow_context *bc,
+ struct bvec_iter *bi_iter)
+{
+ struct bow_range *br = find_first_overlapping_range(&bc->ranges,
+ bi_iter);
+ switch (br->type) {
+ case CHANGED:
+ return prepare_changed_range(bc, br, bi_iter);
+
+ case TRIMMED:
+ return prepare_free_range(bc, br, bi_iter);
+
+ case UNCHANGED:
+ case BACKUP:
+ return prepare_unchanged_range(bc, br, bi_iter, true);
+
+ /*
+ * We cannot track the checksum for the active sector0, since it
+ * may change at any point.
+ */
+ case SECTOR0_CURRENT:
+ return prepare_unchanged_range(bc, br, bi_iter, false);
+
+ case SECTOR0: /* Handled in the dm_bow_map */
+ case TOP: /* Illegal - top is off the end of the device */
+ default:
+ WARN_ON(1);
+ return -EIO;
+ }
+}
+
+struct write_work {
+ struct work_struct work;
+ struct bow_context *bc;
+ struct bio *bio;
+};
+
+static void bow_write(struct work_struct *work)
+{
+ struct write_work *ww = container_of(work, struct write_work, work);
+ struct bow_context *bc = ww->bc;
+ struct bio *bio = ww->bio;
+ struct bvec_iter bi_iter = bio->bi_iter;
+ int ret = 0;
+
+ kfree(ww);
+
+ mutex_lock(&bc->ranges_lock);
+ do {
+ ret = prepare_one_range(bc, &bi_iter);
+ bi_iter.bi_sector += bi_iter.bi_size / SECTOR_SIZE;
+ bi_iter.bi_size = bio->bi_iter.bi_size
+ - (bi_iter.bi_sector - bio->bi_iter.bi_sector)
+ * SECTOR_SIZE;
+ } while (!ret && bi_iter.bi_size);
+
+ mutex_unlock(&bc->ranges_lock);
+
+ if (!ret) {
+ bio->bi_bdev = bc->dev->bdev;
+ submit_bio(bio);
+ } else {
+ DMERR("Write failure with error %d", -ret);
+ bio->bi_error = ret;
+ bio_endio(bio);
+ }
+}
+
+static int queue_write(struct bow_context *bc, struct bio *bio)
+{
+ struct write_work *ww = kmalloc(sizeof(*ww), GFP_NOIO | __GFP_NORETRY
+ | __GFP_NOMEMALLOC | __GFP_NOWARN);
+ if (!ww) {
+ DMERR("Failed to allocate write_work");
+ return -ENOMEM;
+ }
+
+ INIT_WORK(&ww->work, bow_write);
+ ww->bc = bc;
+ ww->bio = bio;
+ queue_work(bc->workqueue, &ww->work);
+ return DM_MAPIO_SUBMITTED;
+}
+
+static int handle_sector0(struct bow_context *bc, struct bio *bio)
+{
+ int ret = DM_MAPIO_REMAPPED;
+
+ if (bio->bi_iter.bi_size > bc->block_size) {
+ struct bio * split = bio_split(bio,
+ bc->block_size >> SECTOR_SHIFT,
+ GFP_NOIO,
+ fs_bio_set);
+ if (!split) {
+ DMERR("Failed to split bio");
+ bio->bi_error = -ENOMEM;
+ bio_endio(bio);
+ return DM_MAPIO_SUBMITTED;
+ }
+
+ bio_chain(split, bio);
+ split->bi_iter.bi_sector = bc->log_sector->sector0;
+ split->bi_bdev = bc->dev->bdev;
+ submit_bio(split);
+
+ if (bio_data_dir(bio) == WRITE)
+ ret = queue_write(bc, bio);
+ } else {
+ bio->bi_iter.bi_sector = bc->log_sector->sector0;
+ }
+
+ return ret;
+}
+
+static int add_trim(struct bow_context *bc, struct bio *bio)
+{
+ struct bow_range *br;
+ struct bvec_iter bi_iter = bio->bi_iter;
+
+ DMDEBUG("add_trim: %llu, %u",
+ (unsigned long long)bio->bi_iter.bi_sector,
+ bio->bi_iter.bi_size);
+
+ do {
+ br = find_first_overlapping_range(&bc->ranges, &bi_iter);
+
+ switch (br->type) {
+ case UNCHANGED:
+ if (!split_range(bc, &br, &bi_iter))
+ set_type(bc, &br, TRIMMED);
+ break;
+
+ case TRIMMED:
+ /* Nothing to do */
+ break;
+
+ default:
+ /* No other case is legal in TRIM state */
+ WARN_ON(true);
+ break;
+ }
+
+ bi_iter.bi_sector += bi_iter.bi_size / SECTOR_SIZE;
+ bi_iter.bi_size = bio->bi_iter.bi_size
+ - (bi_iter.bi_sector - bio->bi_iter.bi_sector)
+ * SECTOR_SIZE;
+
+ } while (bi_iter.bi_size);
+
+ bio_endio(bio);
+ return DM_MAPIO_SUBMITTED;
+}
+
+static int remove_trim(struct bow_context *bc, struct bio *bio)
+{
+ struct bow_range *br;
+ struct bvec_iter bi_iter = bio->bi_iter;
+
+ DMDEBUG("remove_trim: %llu, %u",
+ (unsigned long long)bio->bi_iter.bi_sector,
+ bio->bi_iter.bi_size);
+
+ do {
+ br = find_first_overlapping_range(&bc->ranges, &bi_iter);
+
+ switch (br->type) {
+ case UNCHANGED:
+ /* Nothing to do */
+ break;
+
+ case TRIMMED:
+ if (!split_range(bc, &br, &bi_iter))
+ set_type(bc, &br, UNCHANGED);
+ break;
+
+ default:
+ /* No other case is legal in TRIM state */
+ WARN_ON(true);
+ break;
+ }
+
+ bi_iter.bi_sector += bi_iter.bi_size / SECTOR_SIZE;
+ bi_iter.bi_size = bio->bi_iter.bi_size
+ - (bi_iter.bi_sector - bio->bi_iter.bi_sector)
+ * SECTOR_SIZE;
+
+ } while (bi_iter.bi_size);
+
+ return DM_MAPIO_REMAPPED;
+}
+
+int remap_unless_illegal_trim(struct bow_context *bc, struct bio *bio)
+{
+ if (!bc->forward_trims && bio_op(bio) == REQ_OP_DISCARD) {
+ bio->bi_error = -EINVAL;
+ bio_endio(bio);
+ return DM_MAPIO_SUBMITTED;
+ } else {
+ bio->bi_bdev = bc->dev->bdev;
+ return DM_MAPIO_REMAPPED;
+ }
+}
+
+/****** dm interface ******/
+
+static int dm_bow_map(struct dm_target *ti, struct bio *bio)
+{
+ int ret = DM_MAPIO_REMAPPED;
+ struct bow_context *bc = ti->private;
+
+ if (likely(bc->state.counter == COMMITTED))
+ return remap_unless_illegal_trim(bc, bio);
+
+ if (bio_data_dir(bio) == READ && bio->bi_iter.bi_sector != 0)
+ return remap_unless_illegal_trim(bc, bio);
+
+ if (atomic_read(&bc->state) != COMMITTED) {
+ enum state state;
+
+ mutex_lock(&bc->ranges_lock);
+ state = atomic_read(&bc->state);
+ if (state == TRIM) {
+ if (bio_op(bio) == REQ_OP_DISCARD)
+ ret = add_trim(bc, bio);
+ else if (bio_data_dir(bio) == WRITE)
+ ret = remove_trim(bc, bio);
+ else
+ /* pass-through */;
+ } else if (state == CHECKPOINT) {
+ if (bio->bi_iter.bi_sector == 0)
+ ret = handle_sector0(bc, bio);
+ else if (bio_data_dir(bio) == WRITE)
+ ret = queue_write(bc, bio);
+ else
+ /* pass-through */;
+ } else {
+ /* pass-through */
+ }
+ mutex_unlock(&bc->ranges_lock);
+ }
+
+ if (ret == DM_MAPIO_REMAPPED)
+ return remap_unless_illegal_trim(bc, bio);
+
+ return ret;
+}
+
+static void dm_bow_tablestatus(struct dm_target *ti, char *result,
+ unsigned int maxlen)
+{
+ char *end = result + maxlen;
+ struct bow_context *bc = ti->private;
+ struct rb_node *i;
+ int trimmed_list_length = 0;
+ int trimmed_range_count = 0;
+ struct bow_range *br;
+
+ if (maxlen == 0)
+ return;
+ result[0] = 0;
+
+ list_for_each_entry(br, &bc->trimmed_list, trimmed_list)
+ if (br->type == TRIMMED) {
+ ++trimmed_list_length;
+ } else {
+ scnprintf(result, end - result,
+ "ERROR: non-trimmed entry in trimmed_list");
+ return;
+ }
+
+ if (!rb_first(&bc->ranges)) {
+ scnprintf(result, end - result, "ERROR: Empty ranges");
+ return;
+ }
+
+ if (container_of(rb_first(&bc->ranges), struct bow_range, node)
+ ->sector) {
+ scnprintf(result, end - result,
+ "ERROR: First range does not start at sector 0");
+ return;
+ }
+
+ for (i = rb_first(&bc->ranges); i; i = rb_next(i)) {
+ struct bow_range *br = container_of(i, struct bow_range, node);
+
+ result += scnprintf(result, end - result, "%s: %llu",
+ readable_type[br->type],
+ (unsigned long long)br->sector);
+ if (result >= end)
+ return;
+
+ result += scnprintf(result, end - result, "\n");
+ if (result >= end)
+ return;
+
+ if (br->type == TRIMMED)
+ ++trimmed_range_count;
+
+ if (br->type == TOP) {
+ if (br->sector != ti->len) {
+ scnprintf(result, end - result,
+ "\nERROR: Top sector is incorrect");
+ }
+
+ if (&br->node != rb_last(&bc->ranges)) {
+ scnprintf(result, end - result,
+ "\nERROR: Top sector is not last");
+ }
+
+ break;
+ }
+
+ if (!rb_next(i)) {
+ scnprintf(result, end - result,
+ "\nERROR: Last range not of type TOP");
+ return;
+ }
+
+ if (br->sector > range_top(br)) {
+ scnprintf(result, end - result,
+ "\nERROR: sectors out of order");
+ return;
+ }
+ }
+
+ if (trimmed_range_count != trimmed_list_length)
+ scnprintf(result, end - result,
+ "\nERROR: not all trimmed ranges in trimmed list");
+}
+
+static void dm_bow_status(struct dm_target *ti, status_type_t type,
+ unsigned int status_flags, char *result,
+ unsigned int maxlen)
+{
+ switch (type) {
+ case STATUSTYPE_INFO:
+ if (maxlen)
+ result[0] = 0;
+ break;
+
+ case STATUSTYPE_TABLE:
+ dm_bow_tablestatus(ti, result, maxlen);
+ break;
+ }
+}
+
+int dm_bow_prepare_ioctl(struct dm_target *ti, struct block_device **bdev,
+ fmode_t *mode)
+{
+ struct bow_context *bc = ti->private;
+ struct dm_dev *dev = bc->dev;
+
+ *bdev = dev->bdev;
+ /* Only pass ioctls through if the device sizes match exactly. */
+ return ti->len != i_size_read(dev->bdev->bd_inode) >> SECTOR_SHIFT;
+}
+
+static int dm_bow_iterate_devices(struct dm_target *ti,
+ iterate_devices_callout_fn fn, void *data)
+{
+ struct bow_context *bc = ti->private;
+
+ return fn(ti, bc->dev, 0, ti->len, data);
+}
+
+static struct target_type bow_target = {
+ .name = "bow",
+ .version = {1, 1, 1},
+ .module = THIS_MODULE,
+ .ctr = dm_bow_ctr,
+ .dtr = dm_bow_dtr,
+ .map = dm_bow_map,
+ .status = dm_bow_status,
+ .prepare_ioctl = dm_bow_prepare_ioctl,
+ .iterate_devices = dm_bow_iterate_devices,
+};
+
+int __init dm_bow_init(void)
+{
+ int r = dm_register_target(&bow_target);
+
+ if (r < 0)
+ DMERR("registering bow failed %d", r);
+ return r;
+}
+
+void dm_bow_exit(void)
+{
+ dm_unregister_target(&bow_target);
+}
+
+MODULE_LICENSE("GPL");
+
+module_init(dm_bow_init);
+module_exit(dm_bow_exit);
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 8ca97b2..b8a935a 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -3295,6 +3295,13 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
as.argc = argc;
as.argv = argv;
+ /* make sure metadata and data are different devices */
+ if (!strcmp(argv[0], argv[1])) {
+ ti->error = "Error setting metadata or data device";
+ r = -EINVAL;
+ goto out_unlock;
+ }
+
/*
* Set default pool features.
*/
@@ -4177,6 +4184,12 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
tc->sort_bio_list = RB_ROOT;
if (argc == 3) {
+ if (!strcmp(argv[0], argv[2])) {
+ ti->error = "Error setting origin device";
+ r = -EINVAL;
+ goto bad_origin_dev;
+ }
+
r = dm_get_device(ti, argv[2], FMODE_READ, &origin_dev);
if (r) {
ti->error = "Error opening origin device";
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index f1ad80b..3d2cd54 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3798,6 +3798,8 @@ static int raid10_run(struct mddev *mddev)
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
mddev->sync_thread = md_register_thread(md_do_sync, mddev,
"reshape");
+ if (!mddev->sync_thread)
+ goto out_free_conf;
}
return 0;
@@ -4489,7 +4491,6 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
atomic_inc(&r10_bio->remaining);
read_bio->bi_next = NULL;
generic_make_request(read_bio);
- sector_nr += nr_sectors;
sectors_done += nr_sectors;
if (sector_nr <= last)
goto read_more;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index d15e29d..c3815fa 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6977,6 +6977,8 @@ static int raid5_run(struct mddev *mddev)
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
mddev->sync_thread = md_register_thread(md_do_sync, mddev,
"reshape");
+ if (!mddev->sync_thread)
+ goto abort;
}
/* Ok, everything is just fine now */
diff --git a/drivers/media/i2c/mt9m111.c b/drivers/media/i2c/mt9m111.c
index 72e71b7..a614587 100644
--- a/drivers/media/i2c/mt9m111.c
+++ b/drivers/media/i2c/mt9m111.c
@@ -974,6 +974,8 @@ static int mt9m111_probe(struct i2c_client *client,
mt9m111->rect.top = MT9M111_MIN_DARK_ROWS;
mt9m111->rect.width = MT9M111_MAX_WIDTH;
mt9m111->rect.height = MT9M111_MAX_HEIGHT;
+ mt9m111->width = mt9m111->rect.width;
+ mt9m111->height = mt9m111->rect.height;
mt9m111->fmt = &mt9m111_colour_fmts[0];
mt9m111->lastpage = -1;
mutex_init(&mt9m111->power_lock);
diff --git a/drivers/media/platform/msm/vidc_3x/msm_vidc.c b/drivers/media/platform/msm/vidc_3x/msm_vidc.c
index 77452f0..07804d2 100644
--- a/drivers/media/platform/msm/vidc_3x/msm_vidc.c
+++ b/drivers/media/platform/msm/vidc_3x/msm_vidc.c
@@ -1084,7 +1084,7 @@ static inline int vb2_bufq_init(struct msm_vidc_inst *inst,
q->ops = msm_venc_get_vb2q_ops();
q->mem_ops = &msm_vidc_vb2_mem_ops;
q->drv_priv = inst;
- q->allow_zero_bytesused = !V4L2_TYPE_IS_OUTPUT(type);
+ q->allow_zero_bytesused = 1;
return vb2_queue_init(q);
}
diff --git a/drivers/media/platform/mx2_emmaprp.c b/drivers/media/platform/mx2_emmaprp.c
index e68d271..8354ad2 100644
--- a/drivers/media/platform/mx2_emmaprp.c
+++ b/drivers/media/platform/mx2_emmaprp.c
@@ -288,7 +288,7 @@ static void emmaprp_device_run(void *priv)
{
struct emmaprp_ctx *ctx = priv;
struct emmaprp_q_data *s_q_data, *d_q_data;
- struct vb2_buffer *src_buf, *dst_buf;
+ struct vb2_v4l2_buffer *src_buf, *dst_buf;
struct emmaprp_dev *pcdev = ctx->dev;
unsigned int s_width, s_height;
unsigned int d_width, d_height;
@@ -308,8 +308,8 @@ static void emmaprp_device_run(void *priv)
d_height = d_q_data->height;
d_size = d_width * d_height;
- p_in = vb2_dma_contig_plane_dma_addr(src_buf, 0);
- p_out = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
+ p_in = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+ p_out = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
if (!p_in || !p_out) {
v4l2_err(&pcdev->v4l2_dev,
"Acquiring kernel pointers to buffers failed\n");
diff --git a/drivers/media/platform/s5p-g2d/g2d.c b/drivers/media/platform/s5p-g2d/g2d.c
index 62c0dec..5f6ccf4 100644
--- a/drivers/media/platform/s5p-g2d/g2d.c
+++ b/drivers/media/platform/s5p-g2d/g2d.c
@@ -498,7 +498,7 @@ static void device_run(void *prv)
{
struct g2d_ctx *ctx = prv;
struct g2d_dev *dev = ctx->dev;
- struct vb2_buffer *src, *dst;
+ struct vb2_v4l2_buffer *src, *dst;
unsigned long flags;
u32 cmd = 0;
@@ -513,10 +513,10 @@ static void device_run(void *prv)
spin_lock_irqsave(&dev->ctrl_lock, flags);
g2d_set_src_size(dev, &ctx->in);
- g2d_set_src_addr(dev, vb2_dma_contig_plane_dma_addr(src, 0));
+ g2d_set_src_addr(dev, vb2_dma_contig_plane_dma_addr(&src->vb2_buf, 0));
g2d_set_dst_size(dev, &ctx->out);
- g2d_set_dst_addr(dev, vb2_dma_contig_plane_dma_addr(dst, 0));
+ g2d_set_dst_addr(dev, vb2_dma_contig_plane_dma_addr(&dst->vb2_buf, 0));
g2d_set_rop4(dev, ctx->rop);
g2d_set_flip(dev, ctx->flip);
diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c b/drivers/media/platform/s5p-jpeg/jpeg-core.c
index 1da2c94..c89922f 100644
--- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
+++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
@@ -789,14 +789,14 @@ static void skip(struct s5p_jpeg_buffer *buf, long len);
static void exynos4_jpeg_parse_decode_h_tbl(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+ struct vb2_v4l2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
struct s5p_jpeg_buffer jpeg_buffer;
unsigned int word;
int c, x, components;
jpeg_buffer.size = 2; /* Ls */
jpeg_buffer.data =
- (unsigned long)vb2_plane_vaddr(vb, 0) + ctx->out_q.sos + 2;
+ (unsigned long)vb2_plane_vaddr(&vb->vb2_buf, 0) + ctx->out_q.sos + 2;
jpeg_buffer.curr = 0;
word = 0;
@@ -826,14 +826,14 @@ static void exynos4_jpeg_parse_decode_h_tbl(struct s5p_jpeg_ctx *ctx)
static void exynos4_jpeg_parse_huff_tbl(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+ struct vb2_v4l2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
struct s5p_jpeg_buffer jpeg_buffer;
unsigned int word;
int c, i, n, j;
for (j = 0; j < ctx->out_q.dht.n; ++j) {
jpeg_buffer.size = ctx->out_q.dht.len[j];
- jpeg_buffer.data = (unsigned long)vb2_plane_vaddr(vb, 0) +
+ jpeg_buffer.data = (unsigned long)vb2_plane_vaddr(&vb->vb2_buf, 0) +
ctx->out_q.dht.marker[j];
jpeg_buffer.curr = 0;
@@ -885,13 +885,13 @@ static void exynos4_jpeg_parse_huff_tbl(struct s5p_jpeg_ctx *ctx)
static void exynos4_jpeg_parse_decode_q_tbl(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+ struct vb2_v4l2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
struct s5p_jpeg_buffer jpeg_buffer;
int c, x, components;
jpeg_buffer.size = ctx->out_q.sof_len;
jpeg_buffer.data =
- (unsigned long)vb2_plane_vaddr(vb, 0) + ctx->out_q.sof;
+ (unsigned long)vb2_plane_vaddr(&vb->vb2_buf, 0) + ctx->out_q.sof;
jpeg_buffer.curr = 0;
skip(&jpeg_buffer, 5); /* P, Y, X */
@@ -916,14 +916,14 @@ static void exynos4_jpeg_parse_decode_q_tbl(struct s5p_jpeg_ctx *ctx)
static void exynos4_jpeg_parse_q_tbl(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+ struct vb2_v4l2_buffer *vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
struct s5p_jpeg_buffer jpeg_buffer;
unsigned int word;
int c, i, j;
for (j = 0; j < ctx->out_q.dqt.n; ++j) {
jpeg_buffer.size = ctx->out_q.dqt.len[j];
- jpeg_buffer.data = (unsigned long)vb2_plane_vaddr(vb, 0) +
+ jpeg_buffer.data = (unsigned long)vb2_plane_vaddr(&vb->vb2_buf, 0) +
ctx->out_q.dqt.marker[j];
jpeg_buffer.curr = 0;
@@ -1264,13 +1264,16 @@ static int s5p_jpeg_querycap(struct file *file, void *priv,
return 0;
}
-static int enum_fmt(struct s5p_jpeg_fmt *sjpeg_formats, int n,
+static int enum_fmt(struct s5p_jpeg_ctx *ctx,
+ struct s5p_jpeg_fmt *sjpeg_formats, int n,
struct v4l2_fmtdesc *f, u32 type)
{
int i, num = 0;
+ unsigned int fmt_ver_flag = ctx->jpeg->variant->fmt_ver_flag;
for (i = 0; i < n; ++i) {
- if (sjpeg_formats[i].flags & type) {
+ if (sjpeg_formats[i].flags & type &&
+ sjpeg_formats[i].flags & fmt_ver_flag) {
/* index-th format of type type found ? */
if (num == f->index)
break;
@@ -1297,11 +1300,11 @@ static int s5p_jpeg_enum_fmt_vid_cap(struct file *file, void *priv,
struct s5p_jpeg_ctx *ctx = fh_to_ctx(priv);
if (ctx->mode == S5P_JPEG_ENCODE)
- return enum_fmt(sjpeg_formats, SJPEG_NUM_FORMATS, f,
+ return enum_fmt(ctx, sjpeg_formats, SJPEG_NUM_FORMATS, f,
SJPEG_FMT_FLAG_ENC_CAPTURE);
- return enum_fmt(sjpeg_formats, SJPEG_NUM_FORMATS, f,
- SJPEG_FMT_FLAG_DEC_CAPTURE);
+ return enum_fmt(ctx, sjpeg_formats, SJPEG_NUM_FORMATS, f,
+ SJPEG_FMT_FLAG_DEC_CAPTURE);
}
static int s5p_jpeg_enum_fmt_vid_out(struct file *file, void *priv,
@@ -1310,11 +1313,11 @@ static int s5p_jpeg_enum_fmt_vid_out(struct file *file, void *priv,
struct s5p_jpeg_ctx *ctx = fh_to_ctx(priv);
if (ctx->mode == S5P_JPEG_ENCODE)
- return enum_fmt(sjpeg_formats, SJPEG_NUM_FORMATS, f,
+ return enum_fmt(ctx, sjpeg_formats, SJPEG_NUM_FORMATS, f,
SJPEG_FMT_FLAG_ENC_OUTPUT);
- return enum_fmt(sjpeg_formats, SJPEG_NUM_FORMATS, f,
- SJPEG_FMT_FLAG_DEC_OUTPUT);
+ return enum_fmt(ctx, sjpeg_formats, SJPEG_NUM_FORMATS, f,
+ SJPEG_FMT_FLAG_DEC_OUTPUT);
}
static struct s5p_jpeg_q_data *get_q_data(struct s5p_jpeg_ctx *ctx,
@@ -2027,15 +2030,15 @@ static void s5p_jpeg_device_run(void *priv)
{
struct s5p_jpeg_ctx *ctx = priv;
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *src_buf, *dst_buf;
+ struct vb2_v4l2_buffer *src_buf, *dst_buf;
unsigned long src_addr, dst_addr, flags;
spin_lock_irqsave(&ctx->jpeg->slock, flags);
src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
- src_addr = vb2_dma_contig_plane_dma_addr(src_buf, 0);
- dst_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
+ src_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
+ dst_addr = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
s5p_jpeg_reset(jpeg->regs);
s5p_jpeg_poweron(jpeg->regs);
@@ -2108,7 +2111,7 @@ static void exynos4_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
struct s5p_jpeg_fmt *fmt;
- struct vb2_buffer *vb;
+ struct vb2_v4l2_buffer *vb;
struct s5p_jpeg_addr jpeg_addr = {};
u32 pix_size, padding_bytes = 0;
@@ -2127,7 +2130,7 @@ static void exynos4_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
vb = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
}
- jpeg_addr.y = vb2_dma_contig_plane_dma_addr(vb, 0);
+ jpeg_addr.y = vb2_dma_contig_plane_dma_addr(&vb->vb2_buf, 0);
if (fmt->colplanes == 2) {
jpeg_addr.cb = jpeg_addr.y + pix_size - padding_bytes;
@@ -2145,7 +2148,7 @@ static void exynos4_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
static void exynos4_jpeg_set_jpeg_addr(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb;
+ struct vb2_v4l2_buffer *vb;
unsigned int jpeg_addr = 0;
if (ctx->mode == S5P_JPEG_ENCODE)
@@ -2153,7 +2156,7 @@ static void exynos4_jpeg_set_jpeg_addr(struct s5p_jpeg_ctx *ctx)
else
vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
- jpeg_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
+ jpeg_addr = vb2_dma_contig_plane_dma_addr(&vb->vb2_buf, 0);
if (jpeg->variant->version == SJPEG_EXYNOS5433 &&
ctx->mode == S5P_JPEG_DECODE)
jpeg_addr += ctx->out_q.sos;
@@ -2268,7 +2271,7 @@ static void exynos3250_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
struct s5p_jpeg_fmt *fmt;
- struct vb2_buffer *vb;
+ struct vb2_v4l2_buffer *vb;
struct s5p_jpeg_addr jpeg_addr = {};
u32 pix_size;
@@ -2282,7 +2285,7 @@ static void exynos3250_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
fmt = ctx->cap_q.fmt;
}
- jpeg_addr.y = vb2_dma_contig_plane_dma_addr(vb, 0);
+ jpeg_addr.y = vb2_dma_contig_plane_dma_addr(&vb->vb2_buf, 0);
if (fmt->colplanes == 2) {
jpeg_addr.cb = jpeg_addr.y + pix_size;
@@ -2300,7 +2303,7 @@ static void exynos3250_jpeg_set_img_addr(struct s5p_jpeg_ctx *ctx)
static void exynos3250_jpeg_set_jpeg_addr(struct s5p_jpeg_ctx *ctx)
{
struct s5p_jpeg *jpeg = ctx->jpeg;
- struct vb2_buffer *vb;
+ struct vb2_v4l2_buffer *vb;
unsigned int jpeg_addr = 0;
if (ctx->mode == S5P_JPEG_ENCODE)
@@ -2308,7 +2311,7 @@ static void exynos3250_jpeg_set_jpeg_addr(struct s5p_jpeg_ctx *ctx)
else
vb = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
- jpeg_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
+ jpeg_addr = vb2_dma_contig_plane_dma_addr(&vb->vb2_buf, 0);
exynos3250_jpeg_jpgadr(jpeg->regs, jpeg_addr);
}
diff --git a/drivers/media/platform/sh_veu.c b/drivers/media/platform/sh_veu.c
index 15a562a..a4f5932 100644
--- a/drivers/media/platform/sh_veu.c
+++ b/drivers/media/platform/sh_veu.c
@@ -276,13 +276,13 @@ static void sh_veu_process(struct sh_veu_dev *veu,
static void sh_veu_device_run(void *priv)
{
struct sh_veu_dev *veu = priv;
- struct vb2_buffer *src_buf, *dst_buf;
+ struct vb2_v4l2_buffer *src_buf, *dst_buf;
src_buf = v4l2_m2m_next_src_buf(veu->m2m_ctx);
dst_buf = v4l2_m2m_next_dst_buf(veu->m2m_ctx);
if (src_buf && dst_buf)
- sh_veu_process(veu, src_buf, dst_buf);
+ sh_veu_process(veu, &src_buf->vb2_buf, &dst_buf->vb2_buf);
}
/* ========== video ioctls ========== */
diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index 9f2a64c..21102ea 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -1203,7 +1203,7 @@ static void uvc_ctrl_fill_event(struct uvc_video_chain *chain,
__uvc_query_v4l2_ctrl(chain, ctrl, mapping, &v4l2_ctrl);
- memset(ev->reserved, 0, sizeof(ev->reserved));
+ memset(ev, 0, sizeof(*ev));
ev->type = V4L2_EVENT_CTRL;
ev->id = v4l2_ctrl.id;
ev->u.ctrl.value = value;
diff --git a/drivers/media/usb/uvc/uvc_video.c b/drivers/media/usb/uvc/uvc_video.c
index 48503f30..dcca723 100644
--- a/drivers/media/usb/uvc/uvc_video.c
+++ b/drivers/media/usb/uvc/uvc_video.c
@@ -638,6 +638,14 @@ void uvc_video_clock_update(struct uvc_streaming *stream,
if (!uvc_hw_timestamps_param)
return;
+ /*
+ * We will get called from __vb2_queue_cancel() if there are buffers
+ * done but not dequeued by the user, but the sample array has already
+ * been released at that time. Just bail out in that case.
+ */
+ if (!clock->samples)
+ return;
+
spin_lock_irqsave(&clock->lock, flags);
if (clock->count < clock->size)
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index fa14101..d16128f 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -1235,7 +1235,7 @@ static u32 user_flags(const struct v4l2_ctrl *ctrl)
static void fill_event(struct v4l2_event *ev, struct v4l2_ctrl *ctrl, u32 changes)
{
- memset(ev->reserved, 0, sizeof(ev->reserved));
+ memset(ev, 0, sizeof(*ev));
ev->type = V4L2_EVENT_CTRL;
ev->id = ctrl->id;
ev->u.ctrl.changes = changes;
diff --git a/drivers/media/v4l2-core/videobuf2-v4l2.c b/drivers/media/v4l2-core/videobuf2-v4l2.c
index a29ddca..ae67919 100644
--- a/drivers/media/v4l2-core/videobuf2-v4l2.c
+++ b/drivers/media/v4l2-core/videobuf2-v4l2.c
@@ -146,7 +146,6 @@ static void vb2_warn_zero_bytesused(struct vb2_buffer *vb)
return;
check_once = true;
- WARN_ON(1);
pr_warn("use of bytesused == 0 is deprecated and will be removed in the future,\n");
if (vb->vb2_queue->allow_zero_bytesused)
diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
index 9966891..296c711 100644
--- a/drivers/misc/lkdtm.h
+++ b/drivers/misc/lkdtm.h
@@ -43,7 +43,9 @@ void lkdtm_EXEC_KMALLOC(void);
void lkdtm_EXEC_VMALLOC(void);
void lkdtm_EXEC_RODATA(void);
void lkdtm_EXEC_USERSPACE(void);
+void lkdtm_EXEC_NULL(void);
void lkdtm_ACCESS_USERSPACE(void);
+void lkdtm_ACCESS_NULL(void);
/* lkdtm_rodata.c */
void lkdtm_rodata_do_nothing(void);
diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
index b72fb64..289c2d3 100644
--- a/drivers/misc/lkdtm_core.c
+++ b/drivers/misc/lkdtm_core.c
@@ -217,7 +217,9 @@ struct crashtype crashtypes[] = {
CRASHTYPE(EXEC_VMALLOC),
CRASHTYPE(EXEC_RODATA),
CRASHTYPE(EXEC_USERSPACE),
+ CRASHTYPE(EXEC_NULL),
CRASHTYPE(ACCESS_USERSPACE),
+ CRASHTYPE(ACCESS_NULL),
CRASHTYPE(WRITE_RO),
CRASHTYPE(WRITE_RO_AFTER_INIT),
CRASHTYPE(WRITE_KERN),
diff --git a/drivers/misc/lkdtm_perms.c b/drivers/misc/lkdtm_perms.c
index 45f1c0f..1a9dcda 100644
--- a/drivers/misc/lkdtm_perms.c
+++ b/drivers/misc/lkdtm_perms.c
@@ -160,6 +160,11 @@ void lkdtm_EXEC_USERSPACE(void)
vm_munmap(user_addr, PAGE_SIZE);
}
+void lkdtm_EXEC_NULL(void)
+{
+ execute_location(NULL, CODE_AS_IS);
+}
+
void lkdtm_ACCESS_USERSPACE(void)
{
unsigned long user_addr, tmp = 0;
@@ -191,6 +196,19 @@ void lkdtm_ACCESS_USERSPACE(void)
vm_munmap(user_addr, PAGE_SIZE);
}
+void lkdtm_ACCESS_NULL(void)
+{
+ unsigned long tmp;
+ unsigned long *ptr = (unsigned long *)NULL;
+
+ pr_info("attempting bad read at %px\n", ptr);
+ tmp = *ptr;
+ tmp += 0xc0dec0de;
+
+ pr_info("attempting bad write at %px\n", ptr);
+ *ptr = tmp;
+}
+
void __init lkdtm_perms_init(void)
{
/* Make sure we can write to __ro_after_init values during __init */
diff --git a/drivers/misc/qseecom.c b/drivers/misc/qseecom.c
index 00cf8e2..ab44c87 100644
--- a/drivers/misc/qseecom.c
+++ b/drivers/misc/qseecom.c
@@ -1268,7 +1268,7 @@ static int qseecom_register_listener(struct qseecom_dev_handle *data,
new_entry->listener_in_use = false;
list_add_tail(&new_entry->list, &qseecom.registered_listener_list_head);
- pr_warn("Service %d is registered\n", rcvd_lstnr.listener_id);
+ pr_debug("Service %d is registered\n", rcvd_lstnr.listener_id);
return ret;
}
@@ -1327,7 +1327,7 @@ static int __qseecom_unregister_listener(struct qseecom_dev_handle *data,
kzfree(ptr_svc);
data->released = true;
- pr_warn("Service %d is unregistered\n", data->listener.id);
+ pr_debug("Service %d is unregistered\n", data->listener.id);
return ret;
}
@@ -3993,7 +3993,7 @@ static int qseecom_receive_req(struct qseecom_dev_handle *data)
if (wait_event_interruptible(this_lstnr->rcv_req_wq,
__qseecom_listener_has_rcvd_req(data,
this_lstnr))) {
- pr_warn("Interrupted: exiting Listener Service = %d\n",
+ pr_debug("Interrupted: exiting Listener Service = %d\n",
(uint32_t)data->listener.id);
/* woken up for different reason */
return -ERESTARTSYS;
diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
index 8fa478c..619457b 100644
--- a/drivers/mmc/host/davinci_mmc.c
+++ b/drivers/mmc/host/davinci_mmc.c
@@ -1120,7 +1120,7 @@ static inline void mmc_davinci_cpufreq_deregister(struct mmc_davinci_host *host)
{
}
#endif
-static void __init init_mmcsd_host(struct mmc_davinci_host *host)
+static void init_mmcsd_host(struct mmc_davinci_host *host)
{
mmc_davinci_reset_ctrl(host, 1);
diff --git a/drivers/mmc/host/omap.c b/drivers/mmc/host/omap.c
index a4bf14e..21dfce2 100644
--- a/drivers/mmc/host/omap.c
+++ b/drivers/mmc/host/omap.c
@@ -920,7 +920,7 @@ static inline void set_cmd_timeout(struct mmc_omap_host *host, struct mmc_reques
reg &= ~(1 << 5);
OMAP_MMC_WRITE(host, SDIO, reg);
/* Set maximum timeout */
- OMAP_MMC_WRITE(host, CTO, 0xff);
+ OMAP_MMC_WRITE(host, CTO, 0xfd);
}
static inline void set_data_timeout(struct mmc_omap_host *host, struct mmc_request *req)
diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
index c763b40..3e13969 100644
--- a/drivers/mmc/host/pxamci.c
+++ b/drivers/mmc/host/pxamci.c
@@ -181,7 +181,7 @@ static void pxamci_dma_irq(void *param);
static void pxamci_setup_data(struct pxamci_host *host, struct mmc_data *data)
{
struct dma_async_tx_descriptor *tx;
- enum dma_data_direction direction;
+ enum dma_transfer_direction direction;
struct dma_slave_config config;
struct dma_chan *chan;
unsigned int nob = data->blocks;
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 01c8b90..fb20e82 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1092,8 +1092,7 @@ static bool sdhci_needs_reset(struct sdhci_host *host, struct mmc_request *mrq)
return (!(host->flags & SDHCI_DEVICE_DEAD) &&
((mrq->cmd && mrq->cmd->error) ||
(mrq->sbc && mrq->sbc->error) ||
- (mrq->data && ((mrq->data->error && !mrq->data->stop) ||
- (mrq->data->stop && mrq->data->stop->error))) ||
+ (mrq->data && mrq->data->stop && mrq->data->stop->error) ||
(host->quirks & SDHCI_QUIRK_RESET_AFTER_REQUEST)));
}
@@ -1147,6 +1146,16 @@ static void sdhci_finish_data(struct sdhci_host *host)
MMC_TRACE(host->mmc, "%s: 0x24=0x%08x\n", __func__,
sdhci_readl(host, SDHCI_PRESENT_STATE));
+ /*
+ * The controller needs a reset of internal state machines upon error
+ * conditions.
+ */
+ if (data->error) {
+ if (!host->cmd || host->cmd == data_cmd)
+ sdhci_do_reset(host, SDHCI_RESET_CMD);
+ sdhci_do_reset(host, SDHCI_RESET_DATA);
+ }
+
if ((host->flags & (SDHCI_REQ_USE_DMA | SDHCI_USE_ADMA)) ==
(SDHCI_REQ_USE_DMA | SDHCI_USE_ADMA))
sdhci_adma_table_post(host, data);
@@ -1171,17 +1180,6 @@ static void sdhci_finish_data(struct sdhci_host *host)
if (data->stop &&
(data->error ||
!data->mrq->sbc)) {
-
- /*
- * The controller needs a reset of internal state machines
- * upon error conditions.
- */
- if (data->error) {
- if (!host->cmd || host->cmd == data_cmd)
- sdhci_do_reset(host, SDHCI_RESET_CMD);
- sdhci_do_reset(host, SDHCI_RESET_DATA);
- }
-
/*
* 'cap_cmd_during_tfr' request must not use the command line
* after mmc_command_done() has been called. It is upper layer's
@@ -2987,7 +2985,7 @@ static void sdhci_timeout_data_timer(unsigned long data)
* *
\*****************************************************************************/
-static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask)
+static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask, u32 *intmask_p)
{
u16 auto_cmd_status;
if (!host->cmd) {
@@ -3036,16 +3034,9 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask)
host->cmd->error = -EILSEQ;
}
+ /* Treat data command CRC error the same as data CRC error */
+
/*
- * If this command initiates a data phase and a response
- * CRC error is signalled, the card can start transferring
- * data - the card may have received the command without
- * error. We must not terminate the mmc_request early.
- *
- * If the card did not receive the command or returned an
- * error which prevented it sending data, the data phase
- * will time out.
- *
* Even in case of cmd INDEX OR ENDBIT error we
* handle it the same way.
*/
@@ -3053,6 +3044,7 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask)
(((intmask & (SDHCI_INT_CRC | SDHCI_INT_TIMEOUT)) ==
SDHCI_INT_CRC) || (host->cmd->error == -EILSEQ))) {
host->cmd = NULL;
+ *intmask_p |= SDHCI_INT_DATA_CRC;
return;
}
@@ -3435,7 +3427,7 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
if ((host->quirks2 & SDHCI_QUIRK2_SLOW_INT_CLR) &&
(host->clock <= 400000))
udelay(40);
- sdhci_cmd_irq(host, intmask & SDHCI_INT_CMD_MASK);
+ sdhci_cmd_irq(host, intmask & SDHCI_INT_CMD_MASK, &intmask);
}
if (intmask & SDHCI_INT_DATA_MASK) {
diff --git a/drivers/mmc/host/tmio_mmc_pio.c b/drivers/mmc/host/tmio_mmc_pio.c
index 7005676..0fc1f73 100644
--- a/drivers/mmc/host/tmio_mmc_pio.c
+++ b/drivers/mmc/host/tmio_mmc_pio.c
@@ -675,7 +675,7 @@ static bool __tmio_mmc_sdcard_irq(struct tmio_mmc_host *host,
return false;
}
-static void tmio_mmc_sdio_irq(int irq, void *devid)
+static bool tmio_mmc_sdio_irq(int irq, void *devid)
{
struct tmio_mmc_host *host = devid;
struct mmc_host *mmc = host->mmc;
@@ -684,7 +684,7 @@ static void tmio_mmc_sdio_irq(int irq, void *devid)
unsigned int sdio_status;
if (!(pdata->flags & TMIO_MMC_SDIO_IRQ))
- return;
+ return false;
status = sd_ctrl_read16(host, CTL_SDIO_STATUS);
ireg = status & TMIO_SDIO_MASK_ALL & ~host->sdcard_irq_mask;
@@ -697,6 +697,8 @@ static void tmio_mmc_sdio_irq(int irq, void *devid)
if (mmc->caps & MMC_CAP_SDIO_IRQ && ireg & TMIO_SDIO_STAT_IOIRQ)
mmc_signal_sdio_irq(mmc);
+
+ return ireg;
}
irqreturn_t tmio_mmc_irq(int irq, void *devid)
@@ -718,9 +720,10 @@ irqreturn_t tmio_mmc_irq(int irq, void *devid)
if (__tmio_mmc_sdcard_irq(host, ireg, status))
return IRQ_HANDLED;
- tmio_mmc_sdio_irq(irq, devid);
+ if (tmio_mmc_sdio_irq(irq, devid))
+ return IRQ_HANDLED;
- return IRQ_HANDLED;
+ return IRQ_NONE;
}
EXPORT_SYMBOL(tmio_mmc_irq);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 24a3433..9316972 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3134,8 +3134,12 @@ static int bond_netdev_event(struct notifier_block *this,
return NOTIFY_DONE;
if (event_dev->flags & IFF_MASTER) {
+ int ret;
+
netdev_dbg(event_dev, "IFF_MASTER\n");
- return bond_master_netdev_event(event, event_dev);
+ ret = bond_master_netdev_event(event, event_dev);
+ if (ret != NOTIFY_DONE)
+ return ret;
}
if (event_dev->flags & IFF_SLAVE) {
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 7f64a76..ebfbaf8 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -630,22 +630,6 @@ qca8k_adjust_link(struct dsa_switch *ds, int port, struct phy_device *phy)
qca8k_port_set_status(priv, port, 1);
}
-static int
-qca8k_phy_read(struct dsa_switch *ds, int phy, int regnum)
-{
- struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-
- return mdiobus_read(priv->bus, phy, regnum);
-}
-
-static int
-qca8k_phy_write(struct dsa_switch *ds, int phy, int regnum, u16 val)
-{
- struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-
- return mdiobus_write(priv->bus, phy, regnum, val);
-}
-
static void
qca8k_get_strings(struct dsa_switch *ds, int port, uint8_t *data)
{
@@ -961,8 +945,6 @@ static struct dsa_switch_ops qca8k_switch_ops = {
.setup = qca8k_setup,
.adjust_link = qca8k_adjust_link,
.get_strings = qca8k_get_strings,
- .phy_read = qca8k_phy_read,
- .phy_write = qca8k_phy_write,
.get_ethtool_stats = qca8k_get_ethtool_stats,
.get_sset_count = qca8k_get_sset_count,
.get_eee = qca8k_get_eee,
diff --git a/drivers/net/ethernet/8390/mac8390.c b/drivers/net/ethernet/8390/mac8390.c
index b928390..0fdc9ad 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -156,8 +156,6 @@ static void dayna_block_output(struct net_device *dev, int count,
#define memcpy_fromio(a, b, c) memcpy((a), (void *)(b), (c))
#define memcpy_toio(a, b, c) memcpy((void *)(a), (b), (c))
-#define memcmp_withio(a, b, c) memcmp((a), (void *)(b), (c))
-
/* Slow Sane (16-bit chunk memory read/write) Cabletron uses this */
static void slow_sane_get_8390_hdr(struct net_device *dev,
struct e8390_pkt_hdr *hdr, int ring_page);
@@ -237,19 +235,26 @@ static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
static enum mac8390_access __init mac8390_testio(volatile unsigned long membase)
{
- unsigned long outdata = 0xA5A0B5B0;
- unsigned long indata = 0x00000000;
+ u32 outdata = 0xA5A0B5B0;
+ u32 indata = 0;
+
/* Try writing 32 bits */
- memcpy_toio(membase, &outdata, 4);
- /* Now compare them */
- if (memcmp_withio(&outdata, membase, 4) == 0)
+ nubus_writel(outdata, membase);
+ /* Now read it back */
+ indata = nubus_readl(membase);
+ if (outdata == indata)
return ACCESS_32;
+
+ outdata = 0xC5C0D5D0;
+ indata = 0;
+
/* Write 16 bit output */
word_memcpy_tocard(membase, &outdata, 4);
/* Now read it back */
word_memcpy_fromcard(&indata, membase, 4);
if (outdata == indata)
return ACCESS_16;
+
return ACCESS_UNKNOWN;
}
diff --git a/drivers/net/ethernet/atheros/atlx/atl2.c b/drivers/net/ethernet/atheros/atlx/atl2.c
index 2ff4658..097a0bf 100644
--- a/drivers/net/ethernet/atheros/atlx/atl2.c
+++ b/drivers/net/ethernet/atheros/atlx/atl2.c
@@ -1338,13 +1338,11 @@ static int atl2_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
struct net_device *netdev;
struct atl2_adapter *adapter;
- static int cards_found;
+ static int cards_found = 0;
unsigned long mmio_start;
int mmio_len;
int err;
- cards_found = 0;
-
err = pci_enable_device(pdev);
if (err)
return err;
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 53a506b..95874c1 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -104,6 +104,10 @@ static int bcm_sysport_set_rx_csum(struct net_device *dev,
priv->rx_chk_en = !!(wanted & NETIF_F_RXCSUM);
reg = rxchk_readl(priv, RXCHK_CONTROL);
+ /* Clear L2 header checks, which would prevent BPDUs
+ * from being received.
+ */
+ reg &= ~RXCHK_L2_HDR_DIS;
if (priv->rx_chk_en)
reg |= RXCHK_EN;
else
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 737f0f6..45ea271 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -959,6 +959,8 @@ static void bnxt_tpa_start(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
tpa_info = &rxr->rx_tpa[agg_id];
if (unlikely(cons != rxr->rx_next_cons)) {
+ netdev_warn(bp->dev, "TPA cons %x != expected cons %x\n",
+ cons, rxr->rx_next_cons);
bnxt_sched_reset(bp, rxr);
return;
}
@@ -1377,14 +1379,16 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
}
cons = rxcmp->rx_cmp_opaque;
- rx_buf = &rxr->rx_buf_ring[cons];
- data = rx_buf->data;
if (unlikely(cons != rxr->rx_next_cons)) {
int rc1 = bnxt_discard_rx(bp, bnapi, raw_cons, rxcmp);
+ netdev_warn(bp->dev, "RX cons %x != expected cons %x\n",
+ cons, rxr->rx_next_cons);
bnxt_sched_reset(bp, rxr);
return rc1;
}
+ rx_buf = &rxr->rx_buf_ring[cons];
+ data = rx_buf->data;
prefetch(data);
agg_bufs = (le32_to_cpu(rxcmp->rx_cmp_misc_v1) & RX_CMP_AGG_BUFS) >>
@@ -1400,11 +1404,17 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_napi *bnapi, u32 *raw_cons,
rx_buf->data = NULL;
if (rxcmp1->rx_cmp_cfa_code_errors_v2 & RX_CMP_L2_ERRORS) {
+ u32 rx_err = le32_to_cpu(rxcmp1->rx_cmp_cfa_code_errors_v2);
+
bnxt_reuse_rx_data(rxr, cons, data);
if (agg_bufs)
bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs);
rc = -EIO;
+ if (rx_err & RX_CMPL_ERRORS_BUFFER_ERROR_MASK) {
+ netdev_warn(bp->dev, "RX buffer error %x\n", rx_err);
+ bnxt_sched_reset(bp, rxr);
+ }
goto next_rx;
}
diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c
index da142f6..18ddd24 100644
--- a/drivers/net/ethernet/cavium/thunder/nic_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nic_main.c
@@ -999,7 +999,7 @@ static void nic_handle_mbx_intr(struct nicpf *nic, int vf)
case NIC_MBOX_MSG_CFG_DONE:
/* Last message of VF config msg sequence */
nic_enable_vf(nic, vf, true);
- goto unlock;
+ break;
case NIC_MBOX_MSG_SHUTDOWN:
/* First msg in VF teardown sequence */
if (vf >= nic->num_vf_en)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index c75d4ea9..71f228c 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -162,6 +162,17 @@ static int nicvf_check_pf_ready(struct nicvf *nic)
return 1;
}
+static void nicvf_send_cfg_done(struct nicvf *nic)
+{
+ union nic_mbx mbx = {};
+
+ mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
+ if (nicvf_send_msg_to_pf(nic, &mbx)) {
+ netdev_err(nic->netdev,
+ "PF didn't respond to CFG DONE msg\n");
+ }
+}
+
static void nicvf_read_bgx_stats(struct nicvf *nic, struct bgx_stats_msg *bgx)
{
if (bgx->rx)
@@ -1178,7 +1189,6 @@ int nicvf_open(struct net_device *netdev)
struct nicvf *nic = netdev_priv(netdev);
struct queue_set *qs = nic->qs;
struct nicvf_cq_poll *cq_poll = NULL;
- union nic_mbx mbx = {};
netif_carrier_off(netdev);
@@ -1267,8 +1277,7 @@ int nicvf_open(struct net_device *netdev)
nicvf_enable_intr(nic, NICVF_INTR_RBDR, qidx);
/* Send VF config done msg to PF */
- mbx.msg.msg = NIC_MBOX_MSG_CFG_DONE;
- nicvf_write_to_mbx(nic, &mbx);
+ nicvf_send_cfg_done(nic);
return 0;
cleanup:
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 89acf7b..b73d9ba 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -120,7 +120,7 @@ static void enic_init_affinity_hint(struct enic *enic)
for (i = 0; i < enic->intr_count; i++) {
if (enic_is_err_intr(enic, i) || enic_is_notify_intr(enic, i) ||
- (enic->msix[i].affinity_mask &&
+ (cpumask_available(enic->msix[i].affinity_mask) &&
!cpumask_empty(enic->msix[i].affinity_mask)))
continue;
if (zalloc_cpumask_var(&enic->msix[i].affinity_mask,
@@ -149,7 +149,7 @@ static void enic_set_affinity_hint(struct enic *enic)
for (i = 0; i < enic->intr_count; i++) {
if (enic_is_err_intr(enic, i) ||
enic_is_notify_intr(enic, i) ||
- !enic->msix[i].affinity_mask ||
+ !cpumask_available(enic->msix[i].affinity_mask) ||
cpumask_empty(enic->msix[i].affinity_mask))
continue;
err = irq_set_affinity_hint(enic->msix_entry[i].vector,
@@ -162,7 +162,7 @@ static void enic_set_affinity_hint(struct enic *enic)
for (i = 0; i < enic->wq_count; i++) {
int wq_intr = enic_msix_wq_intr(enic, i);
- if (enic->msix[wq_intr].affinity_mask &&
+ if (cpumask_available(enic->msix[wq_intr].affinity_mask) &&
!cpumask_empty(enic->msix[wq_intr].affinity_mask))
netif_set_xps_queue(enic->netdev,
enic->msix[wq_intr].affinity_mask,
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 5bb019d..551b2a9 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -2820,6 +2820,7 @@ int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool dereset)
dsaf_dev = dev_get_drvdata(&pdev->dev);
if (!dsaf_dev) {
dev_err(&pdev->dev, "dsaf_dev is NULL\n");
+ put_device(&pdev->dev);
return -ENODEV;
}
@@ -2827,6 +2828,7 @@ int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool dereset)
if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
dev_err(dsaf_dev->dev, "%s v1 chip doesn't support RoCE!\n",
dsaf_dev->ae_dev.name);
+ put_device(&pdev->dev);
return -ENODEV;
}
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 6855b33..8bbedfc 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -2121,7 +2121,7 @@ static int e1000_request_msix(struct e1000_adapter *adapter)
if (strlen(netdev->name) < (IFNAMSIZ - 5))
snprintf(adapter->rx_ring->name,
sizeof(adapter->rx_ring->name) - 1,
- "%s-rx-0", netdev->name);
+ "%.14s-rx-0", netdev->name);
else
memcpy(adapter->rx_ring->name, netdev->name, IFNAMSIZ);
err = request_irq(adapter->msix_entries[vector].vector,
@@ -2137,7 +2137,7 @@ static int e1000_request_msix(struct e1000_adapter *adapter)
if (strlen(netdev->name) < (IFNAMSIZ - 5))
snprintf(adapter->tx_ring->name,
sizeof(adapter->tx_ring->name) - 1,
- "%s-tx-0", netdev->name);
+ "%.14s-tx-0", netdev->name);
else
memcpy(adapter->tx_ring->name, netdev->name, IFNAMSIZ);
err = request_irq(adapter->msix_entries[vector].vector,
@@ -5291,8 +5291,13 @@ static void e1000_watchdog_task(struct work_struct *work)
/* 8000ES2LAN requires a Rx packet buffer work-around
* on link down event; reset the controller to flush
* the Rx packet buffer.
+ *
+ * If the link is lost the controller stops DMA, but
+ * if there is queued Tx work it cannot be done. So
+ * reset the controller to flush the Tx packet buffers.
*/
- if (adapter->flags & FLAG_RX_NEEDS_RESTART)
+ if ((adapter->flags & FLAG_RX_NEEDS_RESTART) ||
+ e1000_desc_unused(tx_ring) + 1 < tx_ring->count)
adapter->flags |= FLAG_RESTART_NOW;
else
pm_schedule_suspend(netdev->dev.parent,
@@ -5315,14 +5320,6 @@ static void e1000_watchdog_task(struct work_struct *work)
adapter->gotc_old = adapter->stats.gotc;
spin_unlock(&adapter->stats64_lock);
- /* If the link is lost the controller stops DMA, but
- * if there is queued Tx work it cannot be done. So
- * reset the controller to flush the Tx packet buffers.
- */
- if (!netif_carrier_ok(netdev) &&
- (e1000_desc_unused(tx_ring) + 1 < tx_ring->count))
- adapter->flags |= FLAG_RESTART_NOW;
-
/* If reset is necessary, do it outside of interrupt context. */
if (adapter->flags & FLAG_RESTART_NOW) {
schedule_work(&adapter->reset_task);
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index 2aae6f8..a526637 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -58,6 +58,8 @@ static int __init fm10k_init_module(void)
/* create driver workqueue */
fm10k_workqueue = alloc_workqueue("%s", WQ_MEM_RECLAIM, 0,
fm10k_driver_name);
+ if (!fm10k_workqueue)
+ return -ENOMEM;
fm10k_dbg_init();
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index 5b12022..526d07e 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -2886,7 +2886,7 @@ static int mv643xx_eth_shared_probe(struct platform_device *pdev)
ret = mv643xx_eth_shared_of_probe(pdev);
if (ret)
- return ret;
+ goto err_put_clk;
pd = dev_get_platdata(&pdev->dev);
msp->tx_csum_limit = (pd != NULL && pd->tx_csum_limit) ?
@@ -2894,6 +2894,11 @@ static int mv643xx_eth_shared_probe(struct platform_device *pdev)
infer_hw_params(msp);
return 0;
+
+err_put_clk:
+ if (!IS_ERR(msp->clk))
+ clk_disable_unprepare(msp->clk);
+ return ret;
}
static int mv643xx_eth_shared_remove(struct platform_device *pdev)
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index c92ffdf..d98b874 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2050,7 +2050,7 @@ static int mvneta_rx_hwbm(struct mvneta_port *pp, int rx_todo,
if (unlikely(!skb))
goto err_drop_frame_ret_pool;
- dma_sync_single_range_for_cpu(dev->dev.parent,
+ dma_sync_single_range_for_cpu(&pp->bm_priv->pdev->dev,
rx_desc->buf_phys_addr,
MVNETA_MH_SIZE + NET_SKB_PAD,
rx_bytes,
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index dae9dcf..e528338 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2633,6 +2633,8 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev)
if (!priv->cmd.context)
return -ENOMEM;
+ if (mlx4_is_mfunc(dev))
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
down_write(&priv->cmd.switch_sem);
for (i = 0; i < priv->cmd.max_cmds; ++i) {
priv->cmd.context[i].token = i;
@@ -2658,6 +2660,8 @@ int mlx4_cmd_use_events(struct mlx4_dev *dev)
down(&priv->cmd.poll_sem);
priv->cmd.use_events = 1;
up_write(&priv->cmd.switch_sem);
+ if (mlx4_is_mfunc(dev))
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
return err;
}
@@ -2670,6 +2674,8 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev)
struct mlx4_priv *priv = mlx4_priv(dev);
int i;
+ if (mlx4_is_mfunc(dev))
+ mutex_lock(&priv->cmd.slave_cmd_mutex);
down_write(&priv->cmd.switch_sem);
priv->cmd.use_events = 0;
@@ -2677,9 +2683,12 @@ void mlx4_cmd_use_polling(struct mlx4_dev *dev)
down(&priv->cmd.event_sem);
kfree(priv->cmd.context);
+ priv->cmd.context = NULL;
up(&priv->cmd.poll_sem);
up_write(&priv->cmd.switch_sem);
+ if (mlx4_is_mfunc(dev))
+ mutex_unlock(&priv->cmd.slave_cmd_mutex);
}
struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index 9d1a7d5..7994430 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -2677,13 +2677,13 @@ static int qp_get_mtt_size(struct mlx4_qp_context *qpc)
int total_pages;
int total_mem;
int page_offset = (be32_to_cpu(qpc->params2) >> 6) & 0x3f;
+ int tot;
sq_size = 1 << (log_sq_size + log_sq_sride + 4);
rq_size = (srq|rss|xrc) ? 0 : (1 << (log_rq_size + log_rq_stride + 4));
total_mem = sq_size + rq_size;
- total_pages =
- roundup_pow_of_two((total_mem + (page_offset << 6)) >>
- page_shift);
+ tot = (total_mem + (page_offset << 6)) >> page_shift;
+ total_pages = !tot ? 1 : roundup_pow_of_two(tot);
return total_pages;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 029e856..dc809c2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -45,7 +45,9 @@ int mlx5e_create_tir(struct mlx5_core_dev *mdev,
if (err)
return err;
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
list_add(&tir->list, &mdev->mlx5e_res.td.tirs_list);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
return 0;
}
@@ -53,8 +55,10 @@ int mlx5e_create_tir(struct mlx5_core_dev *mdev,
void mlx5e_destroy_tir(struct mlx5_core_dev *mdev,
struct mlx5e_tir *tir)
{
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
mlx5_core_destroy_tir(mdev, tir->tirn);
list_del(&tir->list);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
}
static int mlx5e_create_mkey(struct mlx5_core_dev *mdev, u32 pdn,
@@ -114,6 +118,7 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev)
}
INIT_LIST_HEAD(&mdev->mlx5e_res.td.tirs_list);
+ mutex_init(&mdev->mlx5e_res.td.list_lock);
return 0;
@@ -151,6 +156,7 @@ int mlx5e_refresh_tirs_self_loopback_enable(struct mlx5_core_dev *mdev)
MLX5_SET(modify_tir_in, in, bitmask.self_lb_en, 1);
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
list_for_each_entry(tir, &mdev->mlx5e_res.td.tirs_list, list) {
err = mlx5_core_modify_tir(mdev, tir->tirn, in, inlen);
if (err)
@@ -159,6 +165,7 @@ int mlx5e_refresh_tirs_self_loopback_enable(struct mlx5_core_dev *mdev)
out:
kvfree(in);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
return err;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index d5e8ac8..54872f8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1365,7 +1365,7 @@ static int mlx5e_get_module_info(struct net_device *netdev,
break;
case MLX5_MODULE_ID_SFP:
modinfo->type = ETH_MODULE_SFF_8472;
- modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
+ modinfo->eeprom_len = MLX5_EEPROM_PAGE_LENGTH;
break;
default:
netdev_err(priv->netdev, "%s: cable type not recognized:0x%x\n",
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 43d7c83..0bad09d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -368,10 +368,6 @@ int mlx5_query_module_eeprom(struct mlx5_core_dev *dev,
size -= offset + size - MLX5_EEPROM_PAGE_LENGTH;
i2c_addr = MLX5_I2C_ADDR_LOW;
- if (offset >= MLX5_EEPROM_PAGE_LENGTH) {
- i2c_addr = MLX5_I2C_ADDR_HIGH;
- offset -= MLX5_EEPROM_PAGE_LENGTH;
- }
MLX5_SET(mcia_reg, in, l, 0);
MLX5_SET(mcia_reg, in, module, module_num);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 22a5916e..e3ed70a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1565,7 +1565,7 @@ static void mlxsw_sp_port_get_prio_strings(u8 **p, int prio)
int i;
for (i = 0; i < MLXSW_SP_PORT_HW_PRIO_STATS_LEN; i++) {
- snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
+ snprintf(*p, ETH_GSTRING_LEN, "%.29s_%.1d",
mlxsw_sp_port_hw_prio_stats[i].str, prio);
*p += ETH_GSTRING_LEN;
}
@@ -1576,7 +1576,7 @@ static void mlxsw_sp_port_get_tc_strings(u8 **p, int tc)
int i;
for (i = 0; i < MLXSW_SP_PORT_HW_TC_STATS_LEN; i++) {
- snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
+ snprintf(*p, ETH_GSTRING_LEN, "%.29s_%.1d",
mlxsw_sp_port_hw_tc_stats[i].str, tc);
*p += ETH_GSTRING_LEN;
}
@@ -2059,11 +2059,11 @@ mlxsw_sp_port_set_link_ksettings(struct net_device *dev,
if (err)
return err;
+ mlxsw_sp_port->link.autoneg = autoneg;
+
if (!netif_running(dev))
return 0;
- mlxsw_sp_port->link.autoneg = autoneg;
-
mlxsw_sp_port_admin_status_set(mlxsw_sp_port, false);
mlxsw_sp_port_admin_status_set(mlxsw_sp_port, true);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
index f8df530..7308777 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_bpf_jit.c
@@ -756,15 +756,10 @@ wrp_alu64_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
static int
wrp_alu32_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
- enum alu_op alu_op, bool skip)
+ enum alu_op alu_op)
{
const struct bpf_insn *insn = &meta->insn;
- if (skip) {
- meta->skip = true;
- return 0;
- }
-
wrp_alu_imm(nfp_prog, insn->dst_reg * 2, alu_op, insn->imm);
wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
@@ -1017,7 +1012,7 @@ static int xor_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
static int xor_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
- return wrp_alu32_imm(nfp_prog, meta, ALU_OP_XOR, !~meta->insn.imm);
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_XOR);
}
static int and_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -1027,7 +1022,7 @@ static int and_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
static int and_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
- return wrp_alu32_imm(nfp_prog, meta, ALU_OP_AND, !~meta->insn.imm);
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_AND);
}
static int or_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -1037,7 +1032,7 @@ static int or_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
static int or_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
- return wrp_alu32_imm(nfp_prog, meta, ALU_OP_OR, !meta->insn.imm);
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_OR);
}
static int add_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -1047,7 +1042,7 @@ static int add_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
static int add_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
- return wrp_alu32_imm(nfp_prog, meta, ALU_OP_ADD, !meta->insn.imm);
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_ADD);
}
static int sub_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
@@ -1057,7 +1052,7 @@ static int sub_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
static int sub_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
{
- return wrp_alu32_imm(nfp_prog, meta, ALU_OP_SUB, !meta->insn.imm);
+ return wrp_alu32_imm(nfp_prog, meta, ALU_OP_SUB);
}
static int shl_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 71836a7..480883a 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -457,7 +457,7 @@ static int ravb_dmac_init(struct net_device *ndev)
RCR_EFFS | RCR_ENCF | RCR_ETS0 | RCR_ESF | 0x18000000, RCR);
/* Set FIFO size */
- ravb_write(ndev, TGC_TQP_AVBMODE1 | 0x00222200, TGC);
+ ravb_write(ndev, TGC_TQP_AVBMODE1 | 0x00112200, TGC);
/* Timestamp enable */
ravb_write(ndev, TCCR_TFEN, TCCR);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 20a2b01..2c04a07 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1747,11 +1747,6 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
if (ret < 0)
pr_warn("%s: failed debugFS registration\n", __func__);
#endif
- /* Start the ball rolling... */
- pr_debug("%s: DMA RX/TX processes started...\n", dev->name);
- priv->hw->dma->start_tx(priv->ioaddr);
- priv->hw->dma->start_rx(priv->ioaddr);
-
/* Dump DMA/MAC registers */
if (netif_msg_hw(priv)) {
priv->hw->mac->dump_regs(priv->hw);
@@ -1779,6 +1774,11 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
if (priv->tso)
priv->hw->dma->enable_tso(priv->ioaddr, 1, STMMAC_CHAN0);
+ /* Start the ball rolling... */
+ pr_debug("%s: DMA RX/TX processes started...\n", dev->name);
+ priv->hw->dma->start_tx(priv->ioaddr);
+ priv->hw->dma->start_rx(priv->ioaddr);
+
return 0;
}
@@ -1796,8 +1796,6 @@ static int stmmac_open(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
int ret;
- stmmac_check_ether_addr(priv);
-
if (priv->hw->pcs != STMMAC_PCS_RGMII &&
priv->hw->pcs != STMMAC_PCS_TBI &&
priv->hw->pcs != STMMAC_PCS_RTBI) {
@@ -2931,6 +2929,20 @@ static int stmmac_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
return ret;
}
+static int stmmac_set_mac_address(struct net_device *ndev, void *addr)
+{
+ struct stmmac_priv *priv = netdev_priv(ndev);
+ int ret = 0;
+
+ ret = eth_mac_addr(ndev, addr);
+ if (ret)
+ return ret;
+
+ priv->hw->mac->set_umac_addr(priv->hw, ndev->dev_addr, 0);
+
+ return ret;
+}
+
#ifdef CONFIG_DEBUG_FS
static struct dentry *stmmac_fs_dir;
@@ -3137,7 +3149,7 @@ static const struct net_device_ops stmmac_netdev_ops = {
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = stmmac_poll_controller,
#endif
- .ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_mac_address = stmmac_set_mac_address,
};
/**
@@ -3341,6 +3353,8 @@ int stmmac_dvr_probe(struct device *device,
if (ret)
goto error_hw_init;
+ stmmac_check_ether_addr(priv);
+
ndev->netdev_ops = &stmmac_netdev_ops;
ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 4a2609c..72fb55c 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -463,7 +463,12 @@ static int ipvlan_nl_changelink(struct net_device *dev,
struct ipvl_port *port = ipvlan_port_get_rtnl(ipvlan->phy_dev);
int err = 0;
- if (data && data[IFLA_IPVLAN_MODE]) {
+ if (!data)
+ return 0;
+ if (!ns_capable(dev_net(ipvlan->phy_dev)->user_ns, CAP_NET_ADMIN))
+ return -EPERM;
+
+ if (data[IFLA_IPVLAN_MODE]) {
u16 nmode = nla_get_u16(data[IFLA_IPVLAN_MODE]);
err = ipvlan_set_port_mode(port, nmode);
@@ -530,6 +535,8 @@ static int ipvlan_link_new(struct net *src_net, struct net_device *dev,
struct ipvl_dev *tmp = netdev_priv(phy_dev);
phy_dev = tmp->phy_dev;
+ if (!ns_capable(dev_net(phy_dev)->user_ns, CAP_NET_ADMIN))
+ return -EPERM;
} else if (!netif_is_ipvlan_port(phy_dev)) {
err = ipvlan_port_create(phy_dev);
if (err < 0)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 09deef4..a9bbdce 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -319,7 +319,6 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner)
err = device_register(&bus->dev);
if (err) {
pr_err("mii_bus %s failed to register\n", bus->id);
- put_device(&bus->dev);
return -EINVAL;
}
diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 2e3f932..a90b7af 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -536,6 +536,7 @@ static void pptp_sock_destruct(struct sock *sk)
pppox_unbind_sock(sk);
}
skb_queue_purge(&sk->sk_receive_queue);
+ dst_release(rcu_dereference_protected(sk->sk_dst_cache, 1));
}
static int pptp_create(struct net *net, struct socket *sock, int kern)
diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c
index cfd81eb..ddceed3 100644
--- a/drivers/net/slip/slhc.c
+++ b/drivers/net/slip/slhc.c
@@ -153,7 +153,7 @@ slhc_init(int rslots, int tslots)
void
slhc_free(struct slcompress *comp)
{
- if ( comp == NULLSLCOMPR )
+ if ( IS_ERR_OR_NULL(comp) )
return;
if ( comp->tstate != NULLSLSTATE )
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 375b681..3eb6d48 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1163,6 +1163,12 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
return -EINVAL;
}
+ if (netdev_has_upper_dev(dev, port_dev)) {
+ netdev_err(dev, "Device %s is already an upper device of the team interface\n",
+ portname);
+ return -EBUSY;
+ }
+
if (port_dev->features & NETIF_F_VLAN_CHALLENGED &&
vlan_uses_dev(dev)) {
netdev_err(dev, "Device %s is VLAN challenged and team device has VLAN set up\n",
@@ -1251,6 +1257,23 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
goto err_option_port_add;
}
+ /* set promiscuity level to new slave */
+ if (dev->flags & IFF_PROMISC) {
+ err = dev_set_promiscuity(port_dev, 1);
+ if (err)
+ goto err_set_slave_promisc;
+ }
+
+ /* set allmulti level to new slave */
+ if (dev->flags & IFF_ALLMULTI) {
+ err = dev_set_allmulti(port_dev, 1);
+ if (err) {
+ if (dev->flags & IFF_PROMISC)
+ dev_set_promiscuity(port_dev, -1);
+ goto err_set_slave_promisc;
+ }
+ }
+
netif_addr_lock_bh(dev);
dev_uc_sync_multiple(port_dev, dev);
dev_mc_sync_multiple(port_dev, dev);
@@ -1267,6 +1290,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
return 0;
+err_set_slave_promisc:
+ __team_option_inst_del_port(team, port);
+
err_option_port_add:
team_upper_dev_unlink(team, port);
@@ -1312,6 +1338,12 @@ static int team_port_del(struct team *team, struct net_device *port_dev)
team_port_disable(team, port);
list_del_rcu(&port->list);
+
+ if (dev->flags & IFF_PROMISC)
+ dev_set_promiscuity(port_dev, -1);
+ if (dev->flags & IFF_ALLMULTI)
+ dev_set_allmulti(port_dev, -1);
+
team_upper_dev_unlink(team, port);
netdev_rx_handler_unregister(port_dev);
team_port_disable_netpoll(port);
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 03ba0f1c..9942054 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1194,9 +1194,6 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
u32 rxhash;
ssize_t n;
- if (!(tun->dev->flags & IFF_UP))
- return -EIO;
-
if (!(tun->flags & IFF_NO_PI)) {
if (len < sizeof(pi))
return -EINVAL;
@@ -1273,9 +1270,11 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
err = skb_copy_datagram_from_iter(skb, 0, from, len);
if (err) {
+ err = -EFAULT;
+drop:
this_cpu_inc(tun->pcpu_stats->rx_dropped);
kfree_skb(skb);
- return -EFAULT;
+ return err;
}
err = virtio_net_hdr_to_skb(skb, &gso, tun_is_little_endian(tun));
@@ -1331,7 +1330,16 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb_probe_transport_header(skb, 0);
rxhash = skb_get_hash(skb);
+
+ rcu_read_lock();
+ if (unlikely(!(tun->dev->flags & IFF_UP))) {
+ err = -EIO;
+ rcu_read_unlock();
+ goto drop;
+ }
+
netif_rx_ni(skb);
+ rcu_read_unlock();
stats = get_cpu_ptr(tun->pcpu_stats);
u64_stats_update_begin(&stats->syncp);
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 134eb18..d51ad14 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -890,13 +890,14 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */
{QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */
{QMI_FIXED_INTF(0x2001, 0x7e35, 4)}, /* D-Link DWM-222 */
+ {QMI_FIXED_INTF(0x2020, 0x2031, 4)}, /* Olicard 600 */
{QMI_FIXED_INTF(0x2020, 0x2033, 4)}, /* BroadMobi BM806U */
{QMI_FIXED_INTF(0x0f3d, 0x68a2, 8)}, /* Sierra Wireless MC7700 */
{QMI_FIXED_INTF(0x114f, 0x68a2, 8)}, /* Sierra Wireless MC7750 */
{QMI_FIXED_INTF(0x1199, 0x68a2, 8)}, /* Sierra Wireless MC7710 in QMI mode */
{QMI_FIXED_INTF(0x1199, 0x68a2, 19)}, /* Sierra Wireless MC7710 in QMI mode */
- {QMI_FIXED_INTF(0x1199, 0x68c0, 8)}, /* Sierra Wireless MC7304/MC7354 */
- {QMI_FIXED_INTF(0x1199, 0x68c0, 10)}, /* Sierra Wireless MC7304/MC7354 */
+ {QMI_QUIRK_SET_DTR(0x1199, 0x68c0, 8)}, /* Sierra Wireless MC7304/MC7354, WP76xx */
+ {QMI_QUIRK_SET_DTR(0x1199, 0x68c0, 10)},/* Sierra Wireless MC7304/MC7354 */
{QMI_FIXED_INTF(0x1199, 0x901c, 8)}, /* Sierra Wireless EM7700 */
{QMI_FIXED_INTF(0x1199, 0x901f, 8)}, /* Sierra Wireless EM7355 */
{QMI_FIXED_INTF(0x1199, 0x9041, 8)}, /* Sierra Wireless MC7305/MC7355 */
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 373713f..b6ee0c1 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1380,6 +1380,14 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
goto drop;
}
+ rcu_read_lock();
+
+ if (unlikely(!(vxlan->dev->flags & IFF_UP))) {
+ rcu_read_unlock();
+ atomic_long_inc(&vxlan->dev->rx_dropped);
+ goto drop;
+ }
+
stats = this_cpu_ptr(vxlan->dev->tstats);
u64_stats_update_begin(&stats->syncp);
stats->rx_packets++;
@@ -1387,6 +1395,9 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
u64_stats_update_end(&stats->syncp);
gro_cells_receive(&vxlan->gro_cells, skb);
+
+ rcu_read_unlock();
+
return 0;
drop:
@@ -2362,6 +2373,8 @@ static void vxlan_uninit(struct net_device *dev)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
+ gro_cells_destroy(&vxlan->gro_cells);
+
vxlan_fdb_delete_default(vxlan);
free_percpu(dev->tstats);
@@ -3112,7 +3125,6 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
- gro_cells_destroy(&vxlan->gro_cells);
list_del(&vxlan->next);
unregister_netdevice_queue(dev, head);
}
@@ -3363,10 +3375,8 @@ static void __net_exit vxlan_exit_net(struct net *net)
/* If vxlan->dev is in the same netns, it has already been added
* to the list by the previous loop.
*/
- if (!net_eq(dev_net(vxlan->dev), net)) {
- gro_cells_destroy(&vxlan->gro_cells);
+ if (!net_eq(dev_net(vxlan->dev), net))
unregister_netdevice_queue(vxlan->dev, &list);
- }
}
unregister_netdevice_many(&list);
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 21aec5c..bbfe7be 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -4277,7 +4277,7 @@ static void ath10k_tpc_config_disp_tables(struct ath10k *ar,
rate_code[i],
type);
snprintf(buff, sizeof(buff), "%8d ", tpc[j]);
- strncat(tpc_value, buff, strlen(buff));
+ strlcat(tpc_value, buff, sizeof(tpc_value));
}
tpc_stats->tpc_table[type].pream_idx[i] = pream_idx;
tpc_stats->tpc_table[type].rate_code[i] = rate_code[i];
diff --git a/drivers/net/wireless/ath/wil6210/cfg80211.c b/drivers/net/wireless/ath/wil6210/cfg80211.c
index 30ced6d..4883bc6 100644
--- a/drivers/net/wireless/ath/wil6210/cfg80211.c
+++ b/drivers/net/wireless/ath/wil6210/cfg80211.c
@@ -1424,6 +1424,12 @@ static int _wil_cfg80211_merge_extra_ies(const u8 *ies1, u16 ies1_len,
u8 *buf, *dpos;
const u8 *spos;
+ if (!ies1)
+ ies1_len = 0;
+
+ if (!ies2)
+ ies2_len = 0;
+
if (ies1_len == 0 && ies2_len == 0) {
*merged_ies = NULL;
*merged_len = 0;
@@ -1433,17 +1439,19 @@ static int _wil_cfg80211_merge_extra_ies(const u8 *ies1, u16 ies1_len,
buf = kmalloc(ies1_len + ies2_len, GFP_KERNEL);
if (!buf)
return -ENOMEM;
- memcpy(buf, ies1, ies1_len);
+ if (ies1)
+ memcpy(buf, ies1, ies1_len);
dpos = buf + ies1_len;
spos = ies2;
- while (spos + 1 < ies2 + ies2_len) {
+ while (spos && (spos + 1 < ies2 + ies2_len)) {
/* IE tag at offset 0, length at offset 1 */
u16 ielen = 2 + spos[1];
if (spos + ielen > ies2 + ies2_len)
break;
if (spos[0] == WLAN_EID_VENDOR_SPECIFIC &&
- !_wil_cfg80211_find_ie(ies1, ies1_len, spos, ielen)) {
+ (!ies1 || !_wil_cfg80211_find_ie(ies1, ies1_len,
+ spos, ielen))) {
memcpy(dpos, spos, ielen);
dpos += ielen;
}
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index e58a50d..c21f8bd 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -475,7 +475,7 @@ static void iwl_pcie_rx_allocator(struct iwl_trans *trans)
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
struct iwl_rb_allocator *rba = &trans_pcie->rba;
struct list_head local_empty;
- int pending = atomic_xchg(&rba->req_pending, 0);
+ int pending = atomic_read(&rba->req_pending);
IWL_DEBUG_RX(trans, "Pending allocation requests = %d\n", pending);
@@ -530,11 +530,13 @@ static void iwl_pcie_rx_allocator(struct iwl_trans *trans)
i++;
}
+ atomic_dec(&rba->req_pending);
pending--;
+
if (!pending) {
- pending = atomic_xchg(&rba->req_pending, 0);
+ pending = atomic_read(&rba->req_pending);
IWL_DEBUG_RX(trans,
- "Pending allocation requests = %d\n",
+ "Got more pending allocation requests = %d\n",
pending);
}
@@ -546,12 +548,15 @@ static void iwl_pcie_rx_allocator(struct iwl_trans *trans)
spin_unlock(&rba->lock);
atomic_inc(&rba->req_ready);
+
}
spin_lock(&rba->lock);
/* return unused rbds to the allocator empty list */
list_splice_tail(&local_empty, &rba->rbd_empty);
spin_unlock(&rba->lock);
+
+ IWL_DEBUG_RX(trans, "%s, exit.\n", __func__);
}
/*
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index 780acf2..e9ec1da 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -3167,7 +3167,7 @@ static int hwsim_get_radio_nl(struct sk_buff *msg, struct genl_info *info)
goto out_err;
}
- genlmsg_reply(skb, info);
+ res = genlmsg_reply(skb, info);
break;
}
diff --git a/drivers/net/wireless/marvell/libertas_tf/if_usb.c b/drivers/net/wireless/marvell/libertas_tf/if_usb.c
index e0ade40..4b53920 100644
--- a/drivers/net/wireless/marvell/libertas_tf/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas_tf/if_usb.c
@@ -433,8 +433,6 @@ static int __if_usb_submit_rx_urb(struct if_usb_card *cardp,
skb_tail_pointer(skb),
MRVDRV_ETH_RX_PACKET_BUFFER_SIZE, callbackfn, cardp);
- cardp->rx_urb->transfer_flags |= URB_ZERO_PACKET;
-
lbtf_deb_usb2(&cardp->udev->dev, "Pointer for rx_urb %p\n",
cardp->rx_urb);
ret = usb_submit_urb(cardp->rx_urb, GFP_ATOMIC);
diff --git a/drivers/net/wireless/mediatek/mt7601u/eeprom.h b/drivers/net/wireless/mediatek/mt7601u/eeprom.h
index 662d127..57b503a 100644
--- a/drivers/net/wireless/mediatek/mt7601u/eeprom.h
+++ b/drivers/net/wireless/mediatek/mt7601u/eeprom.h
@@ -17,7 +17,7 @@
struct mt7601u_dev;
-#define MT7601U_EE_MAX_VER 0x0c
+#define MT7601U_EE_MAX_VER 0x0d
#define MT7601U_EEPROM_SIZE 256
#define MT7601U_DEFAULT_TX_POWER 6
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00.h b/drivers/net/wireless/ralink/rt2x00/rt2x00.h
index f68d492..822833a 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00.h
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00.h
@@ -669,7 +669,6 @@ enum rt2x00_state_flags {
CONFIG_CHANNEL_HT40,
CONFIG_POWERSAVING,
CONFIG_HT_DISABLED,
- CONFIG_QOS_DISABLED,
CONFIG_MONITORING,
/*
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c b/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
index 987c7c4..55036ce 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
@@ -666,19 +666,9 @@ void rt2x00mac_bss_info_changed(struct ieee80211_hw *hw,
rt2x00dev->intf_associated--;
rt2x00leds_led_assoc(rt2x00dev, !!rt2x00dev->intf_associated);
-
- clear_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags);
}
/*
- * Check for access point which do not support 802.11e . We have to
- * generate data frames sequence number in S/W for such AP, because
- * of H/W bug.
- */
- if (changes & BSS_CHANGED_QOS && !bss_conf->qos)
- set_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags);
-
- /*
* When the erp information has changed, we should perform
* additional configuration steps. For all other changes we are done.
*/
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
index 68b620b..9a15a69 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
@@ -201,15 +201,18 @@ static void rt2x00queue_create_tx_descriptor_seq(struct rt2x00_dev *rt2x00dev,
if (!rt2x00_has_cap_flag(rt2x00dev, REQUIRE_SW_SEQNO)) {
/*
* rt2800 has a H/W (or F/W) bug, device incorrectly increase
- * seqno on retransmited data (non-QOS) frames. To workaround
- * the problem let's generate seqno in software if QOS is
- * disabled.
+ * seqno on retransmitted data (non-QOS) and management frames.
+ * To workaround the problem let's generate seqno in software.
+ * Except for beacons which are transmitted periodically by H/W
+ * hence hardware has to assign seqno for them.
*/
- if (test_bit(CONFIG_QOS_DISABLED, &rt2x00dev->flags))
- __clear_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags);
- else
+ if (ieee80211_is_beacon(hdr->frame_control)) {
+ __set_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags);
/* H/W will generate sequence number */
return;
+ }
+
+ __clear_bit(ENTRY_TXD_GENERATE_SEQ, &txdesc->flags);
}
/*
diff --git a/drivers/net/wireless/rsi/rsi_common.h b/drivers/net/wireless/rsi/rsi_common.h
index d3fbe33..a13f08f 100644
--- a/drivers/net/wireless/rsi/rsi_common.h
+++ b/drivers/net/wireless/rsi/rsi_common.h
@@ -75,7 +75,6 @@ static inline int rsi_kill_thread(struct rsi_thread *handle)
atomic_inc(&handle->thread_done);
rsi_set_event(&handle->event);
- wait_for_completion(&handle->completion);
return kthread_stop(handle->task);
}
diff --git a/drivers/net/wireless/ti/wlcore/main.c b/drivers/net/wireless/ti/wlcore/main.c
index 5438975..17d32ce 100644
--- a/drivers/net/wireless/ti/wlcore/main.c
+++ b/drivers/net/wireless/ti/wlcore/main.c
@@ -1058,8 +1058,11 @@ static int wl12xx_chip_wakeup(struct wl1271 *wl, bool plt)
goto out;
ret = wl12xx_fetch_firmware(wl, plt);
- if (ret < 0)
- goto out;
+ if (ret < 0) {
+ kfree(wl->fw_status);
+ kfree(wl->raw_fw_status);
+ kfree(wl->tx_res_if);
+ }
out:
return ret;
diff --git a/drivers/net/wireless/virt_wifi.c b/drivers/net/wireless/virt_wifi.c
index 52f34b3..d255147 100644
--- a/drivers/net/wireless/virt_wifi.c
+++ b/drivers/net/wireless/virt_wifi.c
@@ -365,7 +365,6 @@ static struct wiphy *virt_wifi_make_wiphy(void)
wiphy->bands[NL80211_BAND_5GHZ] = &band_5ghz;
wiphy->bands[NL80211_BAND_60GHZ] = NULL;
- wiphy->regulatory_flags = REGULATORY_WIPHY_SELF_MANAGED;
wiphy->interface_modes = BIT(NL80211_IFTYPE_STATION);
priv = wiphy_priv(wiphy);
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index d8d189d..66a089d 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -492,7 +492,7 @@ static unsigned long nd_label_offset(struct nvdimm_drvdata *ndd,
static int __pmem_label_update(struct nd_region *nd_region,
struct nd_mapping *nd_mapping, struct nd_namespace_pmem *nspm,
- int pos)
+ int pos, unsigned long flags)
{
u64 cookie = nd_region_interleave_set_cookie(nd_region);
struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
@@ -530,7 +530,7 @@ static int __pmem_label_update(struct nd_region *nd_region,
memcpy(nd_label->uuid, nspm->uuid, NSLABEL_UUID_LEN);
if (nspm->alt_name)
memcpy(nd_label->name, nspm->alt_name, NSLABEL_NAME_LEN);
- nd_label->flags = __cpu_to_le32(NSLABEL_FLAG_UPDATING);
+ nd_label->flags = __cpu_to_le32(flags);
nd_label->nlabel = __cpu_to_le16(nd_region->ndr_mappings);
nd_label->position = __cpu_to_le16(pos);
nd_label->isetcookie = __cpu_to_le64(cookie);
@@ -922,13 +922,13 @@ static int del_labels(struct nd_mapping *nd_mapping, u8 *uuid)
int nd_pmem_namespace_label_update(struct nd_region *nd_region,
struct nd_namespace_pmem *nspm, resource_size_t size)
{
- int i;
+ int i, rc;
for (i = 0; i < nd_region->ndr_mappings; i++) {
struct nd_mapping *nd_mapping = &nd_region->mapping[i];
struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
struct resource *res;
- int rc, count = 0;
+ int count = 0;
if (size == 0) {
rc = del_labels(nd_mapping, nspm->uuid);
@@ -946,7 +946,20 @@ int nd_pmem_namespace_label_update(struct nd_region *nd_region,
if (rc < 0)
return rc;
- rc = __pmem_label_update(nd_region, nd_mapping, nspm, i);
+ rc = __pmem_label_update(nd_region, nd_mapping, nspm, i,
+ NSLABEL_FLAG_UPDATING);
+ if (rc)
+ return rc;
+ }
+
+ if (size == 0)
+ return 0;
+
+ /* Clear the UPDATING flag per UEFI 2.7 expectations */
+ for (i = 0; i < nd_region->ndr_mappings; i++) {
+ struct nd_mapping *nd_mapping = &nd_region->mapping[i];
+
+ rc = __pmem_label_update(nd_region, nd_mapping, nspm, i, 0);
if (rc)
return rc;
}
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 74257ac..9bc5f55 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -138,6 +138,7 @@ bool nd_is_uuid_unique(struct device *dev, u8 *uuid)
bool pmem_should_map_pages(struct device *dev)
{
struct nd_region *nd_region = to_nd_region(dev->parent);
+ struct nd_namespace_common *ndns = to_ndns(dev);
struct nd_namespace_io *nsio;
if (!IS_ENABLED(CONFIG_ZONE_DEVICE))
@@ -149,6 +150,9 @@ bool pmem_should_map_pages(struct device *dev)
if (is_nd_pfn(dev) || is_nd_btt(dev))
return false;
+ if (ndns->force_raw)
+ return false;
+
nsio = to_nd_namespace_io(dev);
if (region_intersects(nsio->res.start, resource_size(&nsio->res),
IORESOURCE_SYSTEM_RAM,
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index d6aa59c..ba9aa84 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -515,7 +515,7 @@ static unsigned long init_altmap_base(resource_size_t base)
static unsigned long init_altmap_reserve(resource_size_t base)
{
- unsigned long reserve = PHYS_PFN(SZ_8K);
+ unsigned long reserve = PFN_UP(SZ_8K);
unsigned long base_pfn = PHYS_PFN(base);
reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn);
diff --git a/drivers/parport/parport_pc.c b/drivers/parport/parport_pc.c
index bdce067..02e6485 100644
--- a/drivers/parport/parport_pc.c
+++ b/drivers/parport/parport_pc.c
@@ -1377,7 +1377,7 @@ static struct superio_struct *find_superio(struct parport *p)
{
int i;
for (i = 0; i < NR_SUPERIOS; i++)
- if (superios[i].io != p->base)
+ if (superios[i].io == p->base)
return &superios[i];
return NULL;
}
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index dedb120..6663b76 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3866,6 +3866,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9128,
/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c14 */
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9130,
quirk_dma_func1_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9170,
+ quirk_dma_func1_alias);
/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c47 + c57 */
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9172,
quirk_dma_func1_alias);
diff --git a/drivers/pinctrl/meson/pinctrl-meson8b.c b/drivers/pinctrl/meson/pinctrl-meson8b.c
index cbe5f5c..e1b689f 100644
--- a/drivers/pinctrl/meson/pinctrl-meson8b.c
+++ b/drivers/pinctrl/meson/pinctrl-meson8b.c
@@ -662,7 +662,7 @@ static const char * const sd_a_groups[] = {
static const char * const sdxc_a_groups[] = {
"sdxc_d0_0_a", "sdxc_d13_0_a", "sdxc_d47_a", "sdxc_clk_a",
- "sdxc_cmd_a", "sdxc_d0_1_a", "sdxc_d0_13_1_a"
+ "sdxc_cmd_a", "sdxc_d0_1_a", "sdxc_d13_1_a"
};
static const char * const pcm_a_groups[] = {
diff --git a/drivers/platform/x86/intel_ips.c b/drivers/platform/x86/intel_ips.c
index 55663b3..58dcee5 100644
--- a/drivers/platform/x86/intel_ips.c
+++ b/drivers/platform/x86/intel_ips.c
@@ -68,6 +68,7 @@
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/seq_file.h>
#include <linux/string.h>
#include <linux/tick.h>
diff --git a/drivers/power/supply/charger-manager.c b/drivers/power/supply/charger-manager.c
index e664ca7..13f23c0 100644
--- a/drivers/power/supply/charger-manager.c
+++ b/drivers/power/supply/charger-manager.c
@@ -1212,7 +1212,6 @@ static int charger_extcon_init(struct charger_manager *cm,
if (ret < 0) {
pr_info("Cannot register extcon_dev for %s(cable: %s)\n",
cable->extcon_name, cable->name);
- ret = -EINVAL;
}
return ret;
@@ -1634,7 +1633,7 @@ static int charger_manager_probe(struct platform_device *pdev)
if (IS_ERR(desc)) {
dev_err(&pdev->dev, "No platform data (desc) found\n");
- return -ENODEV;
+ return PTR_ERR(desc);
}
cm = devm_kzalloc(&pdev->dev,
diff --git a/drivers/regulator/act8865-regulator.c b/drivers/regulator/act8865-regulator.c
index 7652477..39e8d60 100644
--- a/drivers/regulator/act8865-regulator.c
+++ b/drivers/regulator/act8865-regulator.c
@@ -131,7 +131,7 @@
* ACT8865 voltage number
*/
#define ACT8865_VOLTAGE_NUM 64
-#define ACT8600_SUDCDC_VOLTAGE_NUM 255
+#define ACT8600_SUDCDC_VOLTAGE_NUM 256
struct act8865 {
struct regmap *regmap;
@@ -222,7 +222,8 @@ static const struct regulator_linear_range act8600_sudcdc_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(3000000, 0, 63, 0),
REGULATOR_LINEAR_RANGE(3000000, 64, 159, 100000),
REGULATOR_LINEAR_RANGE(12600000, 160, 191, 200000),
- REGULATOR_LINEAR_RANGE(19000000, 191, 255, 400000),
+ REGULATOR_LINEAR_RANGE(19000000, 192, 247, 400000),
+ REGULATOR_LINEAR_RANGE(41400000, 248, 255, 0),
};
static struct regulator_ops act8865_ops = {
diff --git a/drivers/regulator/s2mpa01.c b/drivers/regulator/s2mpa01.c
index 92f8875..2daf751 100644
--- a/drivers/regulator/s2mpa01.c
+++ b/drivers/regulator/s2mpa01.c
@@ -303,13 +303,13 @@ static const struct regulator_desc regulators[] = {
regulator_desc_ldo(2, STEP_50_MV),
regulator_desc_ldo(3, STEP_50_MV),
regulator_desc_ldo(4, STEP_50_MV),
- regulator_desc_ldo(5, STEP_50_MV),
+ regulator_desc_ldo(5, STEP_25_MV),
regulator_desc_ldo(6, STEP_25_MV),
regulator_desc_ldo(7, STEP_50_MV),
regulator_desc_ldo(8, STEP_50_MV),
regulator_desc_ldo(9, STEP_50_MV),
regulator_desc_ldo(10, STEP_50_MV),
- regulator_desc_ldo(11, STEP_25_MV),
+ regulator_desc_ldo(11, STEP_50_MV),
regulator_desc_ldo(12, STEP_50_MV),
regulator_desc_ldo(13, STEP_50_MV),
regulator_desc_ldo(14, STEP_50_MV),
@@ -320,11 +320,11 @@ static const struct regulator_desc regulators[] = {
regulator_desc_ldo(19, STEP_50_MV),
regulator_desc_ldo(20, STEP_50_MV),
regulator_desc_ldo(21, STEP_50_MV),
- regulator_desc_ldo(22, STEP_25_MV),
- regulator_desc_ldo(23, STEP_25_MV),
+ regulator_desc_ldo(22, STEP_50_MV),
+ regulator_desc_ldo(23, STEP_50_MV),
regulator_desc_ldo(24, STEP_50_MV),
regulator_desc_ldo(25, STEP_50_MV),
- regulator_desc_ldo(26, STEP_50_MV),
+ regulator_desc_ldo(26, STEP_25_MV),
regulator_desc_buck1_4(1),
regulator_desc_buck1_4(2),
regulator_desc_buck1_4(3),
diff --git a/drivers/regulator/s2mps11.c b/drivers/regulator/s2mps11.c
index d838e77..1fe1c18 100644
--- a/drivers/regulator/s2mps11.c
+++ b/drivers/regulator/s2mps11.c
@@ -376,7 +376,7 @@ static const struct regulator_desc s2mps11_regulators[] = {
regulator_desc_s2mps11_ldo(32, STEP_50_MV),
regulator_desc_s2mps11_ldo(33, STEP_50_MV),
regulator_desc_s2mps11_ldo(34, STEP_50_MV),
- regulator_desc_s2mps11_ldo(35, STEP_50_MV),
+ regulator_desc_s2mps11_ldo(35, STEP_25_MV),
regulator_desc_s2mps11_ldo(36, STEP_50_MV),
regulator_desc_s2mps11_ldo(37, STEP_50_MV),
regulator_desc_s2mps11_ldo(38, STEP_50_MV),
@@ -386,8 +386,8 @@ static const struct regulator_desc s2mps11_regulators[] = {
regulator_desc_s2mps11_buck1_4(4),
regulator_desc_s2mps11_buck5,
regulator_desc_s2mps11_buck67810(6, MIN_600_MV, STEP_6_25_MV),
- regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_6_25_MV),
- regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_6_25_MV),
+ regulator_desc_s2mps11_buck67810(7, MIN_600_MV, STEP_12_5_MV),
+ regulator_desc_s2mps11_buck67810(8, MIN_600_MV, STEP_12_5_MV),
regulator_desc_s2mps11_buck9,
regulator_desc_s2mps11_buck67810(10, MIN_750_MV, STEP_12_5_MV),
};
diff --git a/drivers/rtc/rtc-lib.c b/drivers/rtc/rtc-lib.c
index e6bfb9c..5b136bd 100644
--- a/drivers/rtc/rtc-lib.c
+++ b/drivers/rtc/rtc-lib.c
@@ -52,13 +52,11 @@ EXPORT_SYMBOL(rtc_year_days);
*/
void rtc_time64_to_tm(time64_t time, struct rtc_time *tm)
{
- unsigned int month, year;
- unsigned long secs;
+ unsigned int month, year, secs;
int days;
/* time must be positive */
- days = div_s64(time, 86400);
- secs = time - (unsigned int) days * 86400;
+ days = div_s64_rem(time, 86400, &secs);
/* day of the week, 1970-01-01 was a Thursday */
tm->tm_wday = (days + 4) % 7;
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index be17de9..11c6335 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -4508,6 +4508,14 @@ static int dasd_symm_io(struct dasd_device *device, void __user *argp)
usrparm.psf_data &= 0x7fffffffULL;
usrparm.rssd_result &= 0x7fffffffULL;
}
+ /* at least 2 bytes are accessed and should be allocated */
+ if (usrparm.psf_data_len < 2) {
+ DBF_DEV_EVENT(DBF_WARNING, device,
+ "Symmetrix ioctl invalid data length %d",
+ usrparm.psf_data_len);
+ rc = -EINVAL;
+ goto out;
+ }
/* alloc I/O data area */
psf_data = kzalloc(usrparm.psf_data_len, GFP_KERNEL | GFP_DMA);
rssd_result = kzalloc(usrparm.rssd_result_len, GFP_KERNEL | GFP_DMA);
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 2abcd33..abe460e 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -652,6 +652,20 @@ static void zfcp_erp_strategy_memwait(struct zfcp_erp_action *erp_action)
add_timer(&erp_action->timer);
}
+void zfcp_erp_port_forced_reopen_all(struct zfcp_adapter *adapter,
+ int clear, char *dbftag)
+{
+ unsigned long flags;
+ struct zfcp_port *port;
+
+ write_lock_irqsave(&adapter->erp_lock, flags);
+ read_lock(&adapter->port_list_lock);
+ list_for_each_entry(port, &adapter->port_list, list)
+ _zfcp_erp_port_forced_reopen(port, clear, dbftag);
+ read_unlock(&adapter->port_list_lock);
+ write_unlock_irqrestore(&adapter->erp_lock, flags);
+}
+
static void _zfcp_erp_port_reopen_all(struct zfcp_adapter *adapter,
int clear, char *id)
{
@@ -1306,6 +1320,9 @@ static void zfcp_erp_try_rport_unblock(struct zfcp_port *port)
struct zfcp_scsi_dev *zsdev = sdev_to_zfcp(sdev);
int lun_status;
+ if (sdev->sdev_state == SDEV_DEL ||
+ sdev->sdev_state == SDEV_CANCEL)
+ continue;
if (zsdev->port != port)
continue;
/* LUN under port of interest */
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index b326f05..a39a745 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -68,6 +68,8 @@ extern void zfcp_erp_clear_port_status(struct zfcp_port *, u32);
extern int zfcp_erp_port_reopen(struct zfcp_port *, int, char *);
extern void zfcp_erp_port_shutdown(struct zfcp_port *, int, char *);
extern void zfcp_erp_port_forced_reopen(struct zfcp_port *, int, char *);
+extern void zfcp_erp_port_forced_reopen_all(struct zfcp_adapter *adapter,
+ int clear, char *dbftag);
extern void zfcp_erp_set_lun_status(struct scsi_device *, u32);
extern void zfcp_erp_clear_lun_status(struct scsi_device *, u32);
extern void zfcp_erp_lun_reopen(struct scsi_device *, int, char *);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 3afb200..bdb257e 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -326,6 +326,10 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
int ret = SUCCESS, fc_ret;
+ if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) {
+ zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p");
+ zfcp_erp_wait(adapter);
+ }
zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
zfcp_erp_wait(adapter);
fc_ret = fc_block_scsi_eh(scpnt);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 3493d44..d5e510f 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -283,6 +283,8 @@ static void virtio_ccw_drop_indicators(struct virtio_ccw_device *vcdev)
{
struct virtio_ccw_vq_info *info;
+ if (!vcdev->airq_info)
+ return;
list_for_each_entry(info, &vcdev->virtqueues, node)
drop_airq_indicator(info->vq, vcdev->airq_info);
}
@@ -424,7 +426,7 @@ static int virtio_ccw_read_vq_conf(struct virtio_ccw_device *vcdev,
ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_VQ_CONF);
if (ret)
return ret;
- return vcdev->config_block->num;
+ return vcdev->config_block->num ?: -ENOENT;
}
static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 2f872f7..5252dd5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -10,6 +10,7 @@
*/
#include "hisi_sas.h"
+#include "../libsas/sas_internal.h"
#define DRV_NAME "hisi_sas"
#define DEV_IS_GONE(dev) \
@@ -1128,9 +1129,18 @@ static void hisi_sas_port_deformed(struct asd_sas_phy *sas_phy)
static void hisi_sas_phy_disconnected(struct hisi_sas_phy *phy)
{
+ struct asd_sas_phy *sas_phy = &phy->sas_phy;
+ struct sas_phy *sphy = sas_phy->phy;
+ struct sas_phy_data *d = sphy->hostdata;
+
phy->phy_attached = 0;
phy->phy_type = 0;
phy->port = NULL;
+
+ if (d->enable)
+ sphy->negotiated_linkrate = SAS_LINK_RATE_UNKNOWN;
+ else
+ sphy->negotiated_linkrate = SAS_PHY_DISABLED;
}
void hisi_sas_phy_down(struct hisi_hba *hisi_hba, int phy_no, int rdy)
diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c
index e3ffd24..97aeadd 100644
--- a/drivers/scsi/libfc/fc_rport.c
+++ b/drivers/scsi/libfc/fc_rport.c
@@ -1935,7 +1935,6 @@ static void fc_rport_recv_logo_req(struct fc_lport *lport, struct fc_frame *fp)
FC_RPORT_DBG(rdata, "Received LOGO request while in state %s\n",
fc_rport_state(rdata));
- rdata->flags &= ~FC_RP_STARTED;
fc_rport_enter_delete(rdata, RPORT_EV_STOP);
mutex_unlock(&rdata->rp_mutex);
kref_put(&rdata->kref, rdata->local_port->tt.rport_destroy);
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index c79743d..2ffe104 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1448,7 +1448,13 @@ static int iscsi_xmit_task(struct iscsi_conn *conn)
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx))
return -ENODATA;
+ spin_lock_bh(&conn->session->back_lock);
+ if (conn->task == NULL) {
+ spin_unlock_bh(&conn->session->back_lock);
+ return -ENODATA;
+ }
__iscsi_get_task(task);
+ spin_unlock_bh(&conn->session->back_lock);
spin_unlock_bh(&conn->session->frwd_lock);
rc = conn->session->tt->xmit_task(task);
spin_lock_bh(&conn->session->frwd_lock);
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 5de024a..5b1c37e 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3956,6 +3956,7 @@ int megasas_alloc_cmds(struct megasas_instance *instance)
if (megasas_create_frame_pool(instance)) {
dev_printk(KERN_DEBUG, &instance->pdev->dev, "Error creating frame DMA pool\n");
megasas_free_cmds(instance);
+ return -ENOMEM;
}
return 0;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 1d57b34..5de4741 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -219,7 +219,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
sdev = kzalloc(sizeof(*sdev) + shost->transportt->device_size,
- GFP_ATOMIC);
+ GFP_KERNEL);
if (!sdev)
goto out;
@@ -796,7 +796,7 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
*/
sdev->inquiry = kmemdup(inq_result,
max_t(size_t, sdev->inquiry_len, 36),
- GFP_ATOMIC);
+ GFP_KERNEL);
if (sdev->inquiry == NULL)
return SCSI_SCAN_NO_RESPONSE;
@@ -1094,7 +1094,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
if (!sdev)
goto out;
- result = kmalloc(result_len, GFP_ATOMIC |
+ result = kmalloc(result_len, GFP_KERNEL |
((shost->unchecked_isa_dma) ? __GFP_DMA : 0));
if (!result)
goto out_free_sdev;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
old mode 100644
new mode 100755
index f2acfbc..4fba09b
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1290,11 +1290,6 @@ static void sd_release(struct gendisk *disk, fmode_t mode)
scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
}
- /*
- * XXX and what if there are packets in flight and this close()
- * XXX is followed by a "rmmod sd_mod"?
- */
-
scsi_disk_put(sdkp);
}
@@ -2737,6 +2732,58 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer)
sdkp->ws10 = 1;
}
+/*
+ * Determine the device's preferred I/O size for reads and writes
+ * unless the reported value is unreasonably small, large, not a
+ * multiple of the physical block size, or simply garbage.
+ */
+static bool sd_validate_opt_xfer_size(struct scsi_disk *sdkp,
+ unsigned int dev_max)
+{
+ struct scsi_device *sdp = sdkp->device;
+ unsigned int opt_xfer_bytes =
+ sdkp->opt_xfer_blocks * sdp->sector_size;
+
+ if (sdkp->opt_xfer_blocks == 0)
+ return false;
+
+ if (sdkp->opt_xfer_blocks > dev_max) {
+ sd_first_printk(KERN_WARNING, sdkp,
+ "Optimal transfer size %u logical blocks " \
+ "> dev_max (%u logical blocks)\n",
+ sdkp->opt_xfer_blocks, dev_max);
+ return false;
+ }
+
+ if (sdkp->opt_xfer_blocks > SD_DEF_XFER_BLOCKS) {
+ sd_first_printk(KERN_WARNING, sdkp,
+ "Optimal transfer size %u logical blocks " \
+ "> sd driver limit (%u logical blocks)\n",
+ sdkp->opt_xfer_blocks, SD_DEF_XFER_BLOCKS);
+ return false;
+ }
+
+ if (opt_xfer_bytes < PAGE_SIZE) {
+ sd_first_printk(KERN_WARNING, sdkp,
+ "Optimal transfer size %u bytes < " \
+ "PAGE_SIZE (%u bytes)\n",
+ opt_xfer_bytes, (unsigned int)PAGE_SIZE);
+ return false;
+ }
+
+ if (opt_xfer_bytes & (sdkp->physical_block_size - 1)) {
+ sd_first_printk(KERN_WARNING, sdkp,
+ "Optimal transfer size %u bytes not a " \
+ "multiple of physical block size (%u bytes)\n",
+ opt_xfer_bytes, sdkp->physical_block_size);
+ return false;
+ }
+
+ sd_first_printk(KERN_INFO, sdkp, "Optimal transfer size %u bytes\n",
+ opt_xfer_bytes);
+ return true;
+}
+
/**
* sd_revalidate_disk - called the first time a new disk is seen,
* performs disk spin up, read_capacity, etc.
@@ -2801,18 +2848,10 @@ static int sd_revalidate_disk(struct gendisk *disk)
dev_max = min_not_zero(dev_max, sdkp->max_xfer_blocks);
q->limits.max_dev_sectors = logical_to_sectors(sdp, dev_max);
- /*
- * Determine the device's preferred I/O size for reads and writes
- * unless the reported value is unreasonably small, large, or
- * garbage.
- */
- if (sdkp->opt_xfer_blocks &&
- sdkp->opt_xfer_blocks <= dev_max &&
- sdkp->opt_xfer_blocks <= SD_DEF_XFER_BLOCKS &&
- sdkp->opt_xfer_blocks * sdp->sector_size >= PAGE_SIZE)
+ if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
rw_max = q->limits.io_opt =
- sdkp->opt_xfer_blocks * sdp->sector_size;
- else
+ sdkp->opt_xfer_blocks * sdp->sector_size;
+ } else
rw_max = min_not_zero(logical_to_sectors(sdp, dev_max),
(sector_t)BLK_DEF_MAX_SECTORS);
@@ -3120,11 +3159,23 @@ static void scsi_disk_release(struct device *dev)
{
struct scsi_disk *sdkp = to_scsi_disk(dev);
struct gendisk *disk = sdkp->disk;
-
+ struct request_queue *q = disk->queue;
+
spin_lock(&sd_index_lock);
ida_remove(&sd_index_ida, sdkp->index);
spin_unlock(&sd_index_lock);
+ /*
+ * Wait until all requests that are in progress have completed.
+ * This is necessary to avoid that e.g. scsi_end_request() crashes
+ * due to clearing the disk->private_data pointer. Wait from inside
+ * scsi_disk_release() instead of from sd_release() to avoid that
+ * freezing and unfreezing the request queue affects user space I/O
+ * in case multiple processes open a /dev/sd... node concurrently.
+ */
+ blk_mq_freeze_queue(q);
+ blk_mq_unfreeze_queue(q);
+
disk->private_data = NULL;
put_disk(disk);
put_device(&sdkp->device->sdev_gendev);
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
old mode 100644
new mode 100755
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index cbc8e93..7ba0031 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -693,7 +693,6 @@ static int virtscsi_device_reset(struct scsi_cmnd *sc)
return FAILED;
memset(cmd, 0, sizeof(*cmd));
- cmd->sc = sc;
cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
.type = VIRTIO_SCSI_T_TMF,
.subtype = cpu_to_virtio32(vscsi->vdev,
@@ -752,7 +751,6 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
return FAILED;
memset(cmd, 0, sizeof(*cmd));
- cmd->sc = sc;
cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
.type = VIRTIO_SCSI_T_TMF,
.subtype = VIRTIO_SCSI_T_TMF_ABORT_TASK,
diff --git a/drivers/soc/qcom/icnss.c b/drivers/soc/qcom/icnss.c
index aa5d09e..ac36b27 100644
--- a/drivers/soc/qcom/icnss.c
+++ b/drivers/soc/qcom/icnss.c
@@ -49,6 +49,7 @@
#include <soc/qcom/service-notifier.h>
#include <soc/qcom/socinfo.h>
#include <soc/qcom/ramdump.h>
+#include <linux/thermal.h>
#include "wlan_firmware_service_v01.h"
@@ -481,6 +482,9 @@ static struct icnss_priv {
uint32_t fw_early_crash_irq;
struct completion unblock_shutdown;
bool is_ssr;
+ struct thermal_cooling_device *tcdev;
+ unsigned long curr_thermal_state;
+ unsigned long max_thermal_state;
} *penv;
#ifdef CONFIG_ICNSS_DEBUG
@@ -2430,6 +2434,115 @@ static int icnss_driver_event_fw_ready_ind(void *data)
return ret;
}
+static int icnss_tcdev_get_max_state(struct thermal_cooling_device *tcdev,
+ unsigned long *thermal_state)
+{
+ struct icnss_priv *priv = tcdev->devdata;
+
+ *thermal_state = priv->max_thermal_state;
+
+ return 0;
+}
+
+
+static int icnss_tcdev_get_cur_state(struct thermal_cooling_device *tcdev,
+ unsigned long *thermal_state)
+{
+ struct icnss_priv *priv = tcdev->devdata;
+
+ *thermal_state = priv->curr_thermal_state;
+
+ return 0;
+}
+
+
+static int icnss_tcdev_set_cur_state(struct thermal_cooling_device *tcdev,
+ unsigned long thermal_state)
+{
+ struct icnss_priv *priv = tcdev->devdata;
+ struct device *dev = &priv->pdev->dev;
+ int ret = 0;
+
+ priv->curr_thermal_state = thermal_state;
+
+ if (!priv->ops || !priv->ops->set_therm_state)
+ return 0;
+
+ icnss_pr_vdbg("Cooling device set current state: %ld",
+ thermal_state);
+
+ ret = priv->ops->set_therm_state(dev, thermal_state);
+
+ if (ret)
+ icnss_pr_err("Setting Current Thermal State Failed: %d\n", ret);
+
+ return 0;
+}
+
+static struct thermal_cooling_device_ops icnss_cooling_ops = {
+ .get_max_state = icnss_tcdev_get_max_state,
+ .get_cur_state = icnss_tcdev_get_cur_state,
+ .set_cur_state = icnss_tcdev_set_cur_state,
+};
+
+int icnss_thermal_register(struct device *dev, unsigned long max_state)
+{
+ struct icnss_priv *priv = dev_get_drvdata(dev);
+ int ret = 0;
+
+ priv->max_thermal_state = max_state;
+
+ if (of_find_property(dev->of_node, "#cooling-cells", NULL)) {
+ priv->tcdev = thermal_of_cooling_device_register(dev->of_node,
+ "icnss", priv,
+ &icnss_cooling_ops);
+ if (IS_ERR_OR_NULL(priv->tcdev)) {
+ ret = PTR_ERR(priv->tcdev);
+ icnss_pr_err("Cooling device register failed: %d\n",
+ ret);
+ } else {
+ icnss_pr_vdbg("Cooling device registered");
+ }
+ } else {
+ icnss_pr_dbg("Cooling device registration not supported");
+ ret = -EOPNOTSUPP;
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL(icnss_thermal_register);
+
+void icnss_thermal_unregister(struct device *dev)
+{
+ struct icnss_priv *priv = dev_get_drvdata(dev);
+
+ if (!IS_ERR_OR_NULL(priv->tcdev))
+ thermal_cooling_device_unregister(priv->tcdev);
+
+ priv->tcdev = NULL;
+}
+EXPORT_SYMBOL(icnss_thermal_unregister);
+
+int icnss_get_curr_therm_state(struct device *dev,
+ unsigned long *thermal_state)
+{
+ struct icnss_priv *priv = dev_get_drvdata(dev);
+ int ret = 0;
+
+ if (IS_ERR_OR_NULL(priv->tcdev)) {
+ ret = PTR_ERR(priv->tcdev);
+ icnss_pr_err("Get current thermal state failed: %d\n", ret);
+ return ret;
+ }
+
+ icnss_pr_vdbg("Cooling device current state: %ld",
+ priv->curr_thermal_state);
+
+ *thermal_state = priv->curr_thermal_state;
+ return ret;
+}
+EXPORT_SYMBOL(icnss_get_curr_therm_state);
+
static int icnss_driver_event_register_driver(void *data)
{
int ret = 0;
diff --git a/drivers/soc/qcom/qcom_gsbi.c b/drivers/soc/qcom/qcom_gsbi.c
index 09c669e..038abc3 100644
--- a/drivers/soc/qcom/qcom_gsbi.c
+++ b/drivers/soc/qcom/qcom_gsbi.c
@@ -138,7 +138,7 @@ static int gsbi_probe(struct platform_device *pdev)
struct resource *res;
void __iomem *base;
struct gsbi_info *gsbi;
- int i;
+ int i, ret;
u32 mask, gsbi_num;
const struct crci_config *config = NULL;
@@ -221,7 +221,10 @@ static int gsbi_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, gsbi);
- return of_platform_populate(node, NULL, NULL, &pdev->dev);
+ ret = of_platform_populate(node, NULL, NULL, &pdev->dev);
+ if (ret)
+ clk_disable_unprepare(gsbi->hclk);
+ return ret;
}
static int gsbi_remove(struct platform_device *pdev)
diff --git a/drivers/soc/qcom/socinfo.c b/drivers/soc/qcom/socinfo.c
index b6214eb..c9489e3 100644
--- a/drivers/soc/qcom/socinfo.c
+++ b/drivers/soc/qcom/socinfo.c
@@ -642,6 +642,9 @@ static struct msm_soc_info cpu_of_id[] = {
/* QM215 ID */
[386] = {MSM_CPU_QM215, "QM215"},
+
+ /* SDM429W IDs*/
+ [416] = {MSM_CPU_SDM429W, "SDM429W"},
/* Uninitialized IDs are not known to run Linux.
* MSM_CPU_UNKNOWN is set to 0 to ensure these IDs are
* considered as unknown CPU.
@@ -1629,6 +1632,10 @@ static void * __init setup_dummy_socinfo(void)
dummy_socinfo.id = 364;
strlcpy(dummy_socinfo.build_id, "sda429 - ",
sizeof(dummy_socinfo.build_id));
+ } else if (early_machine_is_sdm429w()) {
+ dummy_socinfo.id = 416;
+ strlcpy(dummy_socinfo.build_id, "sdm429w - ",
+ sizeof(dummy_socinfo.build_id));
} else if (early_machine_is_mdm9607()) {
dummy_socinfo.id = 290;
strlcpy(dummy_socinfo.build_id, "mdm9607 - ",
diff --git a/drivers/soc/tegra/fuse/fuse-tegra.c b/drivers/soc/tegra/fuse/fuse-tegra.c
index de2c1bf..c4f5e5b 100644
--- a/drivers/soc/tegra/fuse/fuse-tegra.c
+++ b/drivers/soc/tegra/fuse/fuse-tegra.c
@@ -131,13 +131,17 @@ static int tegra_fuse_probe(struct platform_device *pdev)
/* take over the memory region from the early initialization */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
fuse->base = devm_ioremap_resource(&pdev->dev, res);
- if (IS_ERR(fuse->base))
- return PTR_ERR(fuse->base);
+ if (IS_ERR(fuse->base)) {
+ err = PTR_ERR(fuse->base);
+ fuse->base = base;
+ return err;
+ }
fuse->clk = devm_clk_get(&pdev->dev, "fuse");
if (IS_ERR(fuse->clk)) {
dev_err(&pdev->dev, "failed to get FUSE clock: %ld",
PTR_ERR(fuse->clk));
+ fuse->base = base;
return PTR_ERR(fuse->clk);
}
@@ -146,8 +150,10 @@ static int tegra_fuse_probe(struct platform_device *pdev)
if (fuse->soc->probe) {
err = fuse->soc->probe(fuse);
- if (err < 0)
+ if (err < 0) {
+ fuse->base = base;
return err;
+ }
}
if (tegra_fuse_create_sysfs(&pdev->dev, fuse->soc->info->size,
diff --git a/drivers/soc/tegra/pmc.c b/drivers/soc/tegra/pmc.c
index 9685f9b8..a12710c 100644
--- a/drivers/soc/tegra/pmc.c
+++ b/drivers/soc/tegra/pmc.c
@@ -512,16 +512,10 @@ EXPORT_SYMBOL(tegra_powergate_power_off);
*/
int tegra_powergate_is_powered(unsigned int id)
{
- int status;
-
if (!tegra_powergate_is_valid(id))
return -EINVAL;
- mutex_lock(&pmc->powergates_lock);
- status = tegra_powergate_state(id);
- mutex_unlock(&pmc->powergates_lock);
-
- return status;
+ return tegra_powergate_state(id);
}
/**
diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c
index 3f3751e..f2209ec 100644
--- a/drivers/spi/spi-pxa2xx.c
+++ b/drivers/spi/spi-pxa2xx.c
@@ -1668,6 +1668,7 @@ static int pxa2xx_spi_probe(struct platform_device *pdev)
platform_info->enable_dma = false;
} else {
master->can_dma = pxa2xx_spi_can_dma;
+ master->max_dma_len = MAX_DMA_LEN;
}
}
diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c
index caeac66..4cb72a8 100644
--- a/drivers/spi/spi-ti-qspi.c
+++ b/drivers/spi/spi-ti-qspi.c
@@ -457,8 +457,8 @@ static void ti_qspi_enable_memory_map(struct spi_device *spi)
ti_qspi_write(qspi, MM_SWITCH, QSPI_SPI_SWITCH_REG);
if (qspi->ctrl_base) {
regmap_update_bits(qspi->ctrl_base, qspi->ctrl_reg,
- MEM_CS_EN(spi->chip_select),
- MEM_CS_MASK);
+ MEM_CS_MASK,
+ MEM_CS_EN(spi->chip_select));
}
qspi->mmap_enabled = true;
}
@@ -470,7 +470,7 @@ static void ti_qspi_disable_memory_map(struct spi_device *spi)
ti_qspi_write(qspi, 0, QSPI_SPI_SWITCH_REG);
if (qspi->ctrl_base)
regmap_update_bits(qspi->ctrl_base, qspi->ctrl_reg,
- 0, MEM_CS_MASK);
+ MEM_CS_MASK, 0);
qspi->mmap_enabled = false;
}
diff --git a/drivers/staging/android/ion/ion_dummy_driver.c b/drivers/staging/android/ion/ion_dummy_driver.c
index b23f2c7..1991626 100644
--- a/drivers/staging/android/ion/ion_dummy_driver.c
+++ b/drivers/staging/android/ion/ion_dummy_driver.c
@@ -38,11 +38,6 @@ static struct ion_platform_heap dummy_heaps[] = {
.name = "system",
},
{
- .id = ION_HEAP_TYPE_SYSTEM_CONTIG,
- .type = ION_HEAP_TYPE_SYSTEM_CONTIG,
- .name = "system contig",
- },
- {
.id = ION_HEAP_TYPE_CARVEOUT,
.type = ION_HEAP_TYPE_CARVEOUT,
.name = "carveout",
@@ -76,34 +71,30 @@ static int __init ion_dummy_init(void)
return -ENOMEM;
- /* Allocate a dummy carveout heap */
- carveout_ptr = alloc_pages_exact(
- dummy_heaps[ION_HEAP_TYPE_CARVEOUT].size,
- GFP_KERNEL);
- if (carveout_ptr)
- dummy_heaps[ION_HEAP_TYPE_CARVEOUT].base =
- virt_to_phys(carveout_ptr);
- else
- pr_err("ion_dummy: Could not allocate carveout\n");
-
- /* Allocate a dummy chunk heap */
- chunk_ptr = alloc_pages_exact(
- dummy_heaps[ION_HEAP_TYPE_CHUNK].size,
- GFP_KERNEL);
- if (chunk_ptr)
- dummy_heaps[ION_HEAP_TYPE_CHUNK].base = virt_to_phys(chunk_ptr);
- else
- pr_err("ion_dummy: Could not allocate chunk\n");
-
for (i = 0; i < dummy_ion_pdata.nr; i++) {
struct ion_platform_heap *heap_data = &dummy_ion_pdata.heaps[i];
- if (heap_data->type == ION_HEAP_TYPE_CARVEOUT &&
- !heap_data->base)
- continue;
+ if (heap_data->type == ION_HEAP_TYPE_CARVEOUT) {
+ /* Allocate a dummy carveout heap */
+ carveout_ptr = alloc_pages_exact(heap_data->size,
+ GFP_KERNEL);
+ if (!carveout_ptr) {
+ pr_err("ion_dummy: Could not allocate carveout\n");
+ continue;
+ }
+ heap_data->base = virt_to_phys(carveout_ptr);
+ }
- if (heap_data->type == ION_HEAP_TYPE_CHUNK && !heap_data->base)
- continue;
+ if (heap_data->type == ION_HEAP_TYPE_CHUNK) {
+ /* Allocate a dummy chunk heap */
+ chunk_ptr = alloc_pages_exact(heap_data->size,
+ GFP_KERNEL);
+ if (!chunk_ptr) {
+ pr_err("ion_dummy: Could not allocate chunk\n");
+ continue;
+ }
+ heap_data->base = virt_to_phys(chunk_ptr);
+ }
heaps[i] = ion_heap_create(heap_data);
if (IS_ERR_OR_NULL(heaps[i])) {
@@ -114,20 +105,28 @@ static int __init ion_dummy_init(void)
}
return 0;
err:
- for (i = 0; i < dummy_ion_pdata.nr; ++i)
- ion_heap_destroy(heaps[i]);
+ for (i = 0; i < dummy_ion_pdata.nr; ++i) {
+ struct ion_platform_heap *heap_data = &dummy_ion_pdata.heaps[i];
+
+ if (!IS_ERR_OR_NULL(heaps[i]))
+ ion_heap_destroy(heaps[i]);
+
+ if (heap_data->type == ION_HEAP_TYPE_CARVEOUT) {
+ if (carveout_ptr) {
+ free_pages_exact(carveout_ptr,
+ heap_data->size);
+ }
+ carveout_ptr = NULL;
+ }
+ if (heap_data->type == ION_HEAP_TYPE_CHUNK) {
+ if (chunk_ptr) {
+ free_pages_exact(chunk_ptr, heap_data->size);
+ }
+ chunk_ptr = NULL;
+ }
+ }
kfree(heaps);
- if (carveout_ptr) {
- free_pages_exact(carveout_ptr,
- dummy_heaps[ION_HEAP_TYPE_CARVEOUT].size);
- carveout_ptr = NULL;
- }
- if (chunk_ptr) {
- free_pages_exact(chunk_ptr,
- dummy_heaps[ION_HEAP_TYPE_CHUNK].size);
- chunk_ptr = NULL;
- }
return err;
}
device_initcall(ion_dummy_init);
@@ -138,19 +137,26 @@ static void __exit ion_dummy_exit(void)
ion_device_destroy(idev);
- for (i = 0; i < dummy_ion_pdata.nr; i++)
- ion_heap_destroy(heaps[i]);
- kfree(heaps);
+ for (i = 0; i < dummy_ion_pdata.nr; ++i) {
+ struct ion_platform_heap *heap_data = &dummy_ion_pdata.heaps[i];
- if (carveout_ptr) {
- free_pages_exact(carveout_ptr,
- dummy_heaps[ION_HEAP_TYPE_CARVEOUT].size);
- carveout_ptr = NULL;
+ if (!IS_ERR_OR_NULL(heaps[i]))
+ ion_heap_destroy(heaps[i]);
+
+ if (heap_data->type == ION_HEAP_TYPE_CARVEOUT) {
+ if (carveout_ptr) {
+ free_pages_exact(carveout_ptr,
+ heap_data->size);
+ }
+ carveout_ptr = NULL;
+ }
+ if (heap_data->type == ION_HEAP_TYPE_CHUNK) {
+ if (chunk_ptr) {
+ free_pages_exact(chunk_ptr, heap_data->size);
+ }
+ chunk_ptr = NULL;
+ }
}
- if (chunk_ptr) {
- free_pages_exact(chunk_ptr,
- dummy_heaps[ION_HEAP_TYPE_CHUNK].size);
- chunk_ptr = NULL;
- }
+ kfree(heaps);
}
__exitcall(ion_dummy_exit);
diff --git a/drivers/staging/comedi/comedidev.h b/drivers/staging/comedi/comedidev.h
index dcb6376..35432fb 100644
--- a/drivers/staging/comedi/comedidev.h
+++ b/drivers/staging/comedi/comedidev.h
@@ -984,6 +984,8 @@ int comedi_dio_insn_config(struct comedi_device *, struct comedi_subdevice *,
unsigned int mask);
unsigned int comedi_dio_update_state(struct comedi_subdevice *,
unsigned int *data);
+unsigned int comedi_bytes_per_scan_cmd(struct comedi_subdevice *s,
+ struct comedi_cmd *cmd);
unsigned int comedi_bytes_per_scan(struct comedi_subdevice *s);
unsigned int comedi_nscans_left(struct comedi_subdevice *s,
unsigned int nscans);
diff --git a/drivers/staging/comedi/drivers.c b/drivers/staging/comedi/drivers.c
index 1736248..8ca5493 100644
--- a/drivers/staging/comedi/drivers.c
+++ b/drivers/staging/comedi/drivers.c
@@ -390,11 +390,13 @@ unsigned int comedi_dio_update_state(struct comedi_subdevice *s,
EXPORT_SYMBOL_GPL(comedi_dio_update_state);
/**
- * comedi_bytes_per_scan() - Get length of asynchronous command "scan" in bytes
+ * comedi_bytes_per_scan_cmd() - Get length of asynchronous command "scan" in
+ * bytes
* @s: COMEDI subdevice.
+ * @cmd: COMEDI command.
*
* Determines the overall scan length according to the subdevice type and the
- * number of channels in the scan.
+ * number of channels in the scan for the specified command.
*
* For digital input, output or input/output subdevices, samples for
* multiple channels are assumed to be packed into one or more unsigned
@@ -404,9 +406,9 @@ EXPORT_SYMBOL_GPL(comedi_dio_update_state);
*
* Returns the overall scan length in bytes.
*/
-unsigned int comedi_bytes_per_scan(struct comedi_subdevice *s)
+unsigned int comedi_bytes_per_scan_cmd(struct comedi_subdevice *s,
+ struct comedi_cmd *cmd)
{
- struct comedi_cmd *cmd = &s->async->cmd;
unsigned int num_samples;
unsigned int bits_per_sample;
@@ -423,6 +425,29 @@ unsigned int comedi_bytes_per_scan(struct comedi_subdevice *s)
}
return comedi_samples_to_bytes(s, num_samples);
}
+EXPORT_SYMBOL_GPL(comedi_bytes_per_scan_cmd);
+
+/**
+ * comedi_bytes_per_scan() - Get length of asynchronous command "scan" in bytes
+ * @s: COMEDI subdevice.
+ *
+ * Determines the overall scan length according to the subdevice type and the
+ * number of channels in the scan for the current command.
+ *
+ * For digital input, output or input/output subdevices, samples for
+ * multiple channels are assumed to be packed into one or more unsigned
+ * short or unsigned int values according to the subdevice's %SDF_LSAMPL
+ * flag. For other types of subdevice, samples are assumed to occupy a
+ * whole unsigned short or unsigned int according to the %SDF_LSAMPL flag.
+ *
+ * Returns the overall scan length in bytes.
+ */
+unsigned int comedi_bytes_per_scan(struct comedi_subdevice *s)
+{
+ struct comedi_cmd *cmd = &s->async->cmd;
+
+ return comedi_bytes_per_scan_cmd(s, cmd);
+}
EXPORT_SYMBOL_GPL(comedi_bytes_per_scan);
static unsigned int __comedi_nscans_left(struct comedi_subdevice *s,
diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c b/drivers/staging/comedi/drivers/ni_mio_common.c
index 0fa85d5..fe03a41 100644
--- a/drivers/staging/comedi/drivers/ni_mio_common.c
+++ b/drivers/staging/comedi/drivers/ni_mio_common.c
@@ -3477,6 +3477,7 @@ static int ni_cdio_check_chanlist(struct comedi_device *dev,
static int ni_cdio_cmdtest(struct comedi_device *dev,
struct comedi_subdevice *s, struct comedi_cmd *cmd)
{
+ unsigned int bytes_per_scan;
int err = 0;
int tmp;
@@ -3506,9 +3507,12 @@ static int ni_cdio_cmdtest(struct comedi_device *dev,
err |= comedi_check_trigger_arg_is(&cmd->convert_arg, 0);
err |= comedi_check_trigger_arg_is(&cmd->scan_end_arg,
cmd->chanlist_len);
- err |= comedi_check_trigger_arg_max(&cmd->stop_arg,
- s->async->prealloc_bufsz /
- comedi_bytes_per_scan(s));
+ bytes_per_scan = comedi_bytes_per_scan_cmd(s, cmd);
+ if (bytes_per_scan) {
+ err |= comedi_check_trigger_arg_max(&cmd->stop_arg,
+ s->async->prealloc_bufsz /
+ bytes_per_scan);
+ }
if (err)
return 3;
diff --git a/drivers/staging/comedi/drivers/ni_usb6501.c b/drivers/staging/comedi/drivers/ni_usb6501.c
index 5036eeb..2f174a3 100644
--- a/drivers/staging/comedi/drivers/ni_usb6501.c
+++ b/drivers/staging/comedi/drivers/ni_usb6501.c
@@ -472,10 +472,8 @@ static int ni6501_alloc_usb_buffers(struct comedi_device *dev)
size = usb_endpoint_maxp(devpriv->ep_tx);
devpriv->usb_tx_buf = kzalloc(size, GFP_KERNEL);
- if (!devpriv->usb_tx_buf) {
- kfree(devpriv->usb_rx_buf);
+ if (!devpriv->usb_tx_buf)
return -ENOMEM;
- }
return 0;
}
@@ -527,6 +525,9 @@ static int ni6501_auto_attach(struct comedi_device *dev,
if (!devpriv)
return -ENOMEM;
+ mutex_init(&devpriv->mut);
+ usb_set_intfdata(intf, devpriv);
+
ret = ni6501_find_endpoints(dev);
if (ret)
return ret;
@@ -535,9 +536,6 @@ static int ni6501_auto_attach(struct comedi_device *dev,
if (ret)
return ret;
- mutex_init(&devpriv->mut);
- usb_set_intfdata(intf, devpriv);
-
ret = comedi_alloc_subdevices(dev, 2);
if (ret)
return ret;
diff --git a/drivers/staging/comedi/drivers/vmk80xx.c b/drivers/staging/comedi/drivers/vmk80xx.c
index a004aed..1800eb3 100644
--- a/drivers/staging/comedi/drivers/vmk80xx.c
+++ b/drivers/staging/comedi/drivers/vmk80xx.c
@@ -691,10 +691,8 @@ static int vmk80xx_alloc_usb_buffers(struct comedi_device *dev)
size = usb_endpoint_maxp(devpriv->ep_tx);
devpriv->usb_tx_buf = kzalloc(size, GFP_KERNEL);
- if (!devpriv->usb_tx_buf) {
- kfree(devpriv->usb_rx_buf);
+ if (!devpriv->usb_tx_buf)
return -ENOMEM;
- }
return 0;
}
@@ -809,6 +807,8 @@ static int vmk80xx_auto_attach(struct comedi_device *dev,
devpriv->model = board->model;
+ sema_init(&devpriv->limit_sem, 8);
+
ret = vmk80xx_find_usb_endpoints(dev);
if (ret)
return ret;
@@ -817,8 +817,6 @@ static int vmk80xx_auto_attach(struct comedi_device *dev,
if (ret)
return ret;
- sema_init(&devpriv->limit_sem, 8);
-
usb_set_intfdata(intf, devpriv);
if (devpriv->model == VMK8055_MODEL)
diff --git a/drivers/staging/iio/adc/ad7192.c b/drivers/staging/iio/adc/ad7192.c
index 4dc9ca3..b82a4ab 100644
--- a/drivers/staging/iio/adc/ad7192.c
+++ b/drivers/staging/iio/adc/ad7192.c
@@ -109,10 +109,10 @@
#define AD7192_CH_AIN3 BIT(6) /* AIN3 - AINCOM */
#define AD7192_CH_AIN4 BIT(7) /* AIN4 - AINCOM */
-#define AD7193_CH_AIN1P_AIN2M 0x000 /* AIN1(+) - AIN2(-) */
-#define AD7193_CH_AIN3P_AIN4M 0x001 /* AIN3(+) - AIN4(-) */
-#define AD7193_CH_AIN5P_AIN6M 0x002 /* AIN5(+) - AIN6(-) */
-#define AD7193_CH_AIN7P_AIN8M 0x004 /* AIN7(+) - AIN8(-) */
+#define AD7193_CH_AIN1P_AIN2M 0x001 /* AIN1(+) - AIN2(-) */
+#define AD7193_CH_AIN3P_AIN4M 0x002 /* AIN3(+) - AIN4(-) */
+#define AD7193_CH_AIN5P_AIN6M 0x004 /* AIN5(+) - AIN6(-) */
+#define AD7193_CH_AIN7P_AIN8M 0x008 /* AIN7(+) - AIN8(-) */
#define AD7193_CH_TEMP 0x100 /* Temp senseor */
#define AD7193_CH_AIN2P_AIN2M 0x200 /* AIN2(+) - AIN2(-) */
#define AD7193_CH_AIN1 0x401 /* AIN1 - AINCOM */
diff --git a/drivers/staging/vt6655/device_main.c b/drivers/staging/vt6655/device_main.c
index ab96629..22e5116 100644
--- a/drivers/staging/vt6655/device_main.c
+++ b/drivers/staging/vt6655/device_main.c
@@ -977,8 +977,6 @@ static void vnt_interrupt_process(struct vnt_private *priv)
return;
}
- MACvIntDisable(priv->PortOffset);
-
spin_lock_irqsave(&priv->lock, flags);
/* Read low level stats */
@@ -1067,8 +1065,6 @@ static void vnt_interrupt_process(struct vnt_private *priv)
}
spin_unlock_irqrestore(&priv->lock, flags);
-
- MACvIntEnable(priv->PortOffset, IMR_MASK_VALUE);
}
static void vnt_interrupt_work(struct work_struct *work)
@@ -1078,14 +1074,17 @@ static void vnt_interrupt_work(struct work_struct *work)
if (priv->vif)
vnt_interrupt_process(priv);
+
+ MACvIntEnable(priv->PortOffset, IMR_MASK_VALUE);
}
static irqreturn_t vnt_interrupt(int irq, void *arg)
{
struct vnt_private *priv = arg;
- if (priv->vif)
- schedule_work(&priv->interrupt_work);
+ schedule_work(&priv->interrupt_work);
+
+ MACvIntDisable(priv->PortOffset);
return IRQ_HANDLED;
}
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 80205f3..b6c4f55 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4084,9 +4084,9 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
struct se_cmd *se_cmd = &cmd->se_cmd;
if (se_cmd->se_tfo != NULL) {
- spin_lock(&se_cmd->t_state_lock);
+ spin_lock_irq(&se_cmd->t_state_lock);
se_cmd->transport_state |= CMD_T_FABRIC_STOP;
- spin_unlock(&se_cmd->t_state_lock);
+ spin_unlock_irq(&se_cmd->t_state_lock);
}
}
spin_unlock_bh(&conn->cmd_lock);
diff --git a/drivers/thermal/int340x_thermal/int3400_thermal.c b/drivers/thermal/int340x_thermal/int3400_thermal.c
index 5836e55..d4c374c 100644
--- a/drivers/thermal/int340x_thermal/int3400_thermal.c
+++ b/drivers/thermal/int340x_thermal/int3400_thermal.c
@@ -20,6 +20,13 @@ enum int3400_thermal_uuid {
INT3400_THERMAL_PASSIVE_1,
INT3400_THERMAL_ACTIVE,
INT3400_THERMAL_CRITICAL,
+ INT3400_THERMAL_ADAPTIVE_PERFORMANCE,
+ INT3400_THERMAL_EMERGENCY_CALL_MODE,
+ INT3400_THERMAL_PASSIVE_2,
+ INT3400_THERMAL_POWER_BOSS,
+ INT3400_THERMAL_VIRTUAL_SENSOR,
+ INT3400_THERMAL_COOLING_MODE,
+ INT3400_THERMAL_HARDWARE_DUTY_CYCLING,
INT3400_THERMAL_MAXIMUM_UUID,
};
@@ -27,6 +34,13 @@ static u8 *int3400_thermal_uuids[INT3400_THERMAL_MAXIMUM_UUID] = {
"42A441D6-AE6A-462b-A84B-4A8CE79027D3",
"3A95C389-E4B8-4629-A526-C52C88626BAE",
"97C68AE7-15FA-499c-B8C9-5DA81D606E0A",
+ "63BE270F-1C11-48FD-A6F7-3AF253FF3E2D",
+ "5349962F-71E6-431D-9AE8-0A635B710AEE",
+ "9E04115A-AE87-4D1C-9500-0F3E340BFE75",
+ "F5A35014-C209-46A4-993A-EB56DE7530A1",
+ "6ED722A7-9240-48A5-B479-31EEF723D7CF",
+ "16CAF1B7-DD38-40ED-B1C1-1B8A1913D531",
+ "BE84BABF-C4D4-403D-B495-3128FD44dAC1",
};
struct int3400_thermal_priv {
@@ -271,10 +285,9 @@ static int int3400_thermal_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, priv);
- if (priv->uuid_bitmap & 1 << INT3400_THERMAL_PASSIVE_1) {
- int3400_thermal_ops.get_mode = int3400_thermal_get_mode;
- int3400_thermal_ops.set_mode = int3400_thermal_set_mode;
- }
+ int3400_thermal_ops.get_mode = int3400_thermal_get_mode;
+ int3400_thermal_ops.set_mode = int3400_thermal_set_mode;
+
priv->thermal = thermal_zone_device_register("INT3400 Thermal", 0, 0,
priv, &int3400_thermal_ops,
&int3400_thermal_params, 0, 0);
diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
old mode 100644
new mode 100755
index 53e3186..b5b3bfa
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -475,4 +475,27 @@
help
Console support for OKL4 Microvisor virtual ttys.
+config LDISC_AUTOLOAD
+ bool "Automatically load TTY Line Disciplines"
+ default y
+ help
+ Historically the kernel has always automatically loaded any
+ line discipline that is in a kernel module when a user asks
+ for it to be loaded with the TIOCSETD ioctl, or through other
+ means. This is not always the best thing to do on systems
+ where you know you will not be using some of the more
+ "ancient" line disciplines, so prevent the kernel from doing
+ this unless the request is coming from a process with the
+ CAP_SYS_MODULE permissions.
+
+ Say 'Y' here if you trust your userspace users to do the right
+ thing, or if you have only provided the line disciplines that
+ you know you will be using, or if you wish to continue to use
+ the traditional method of on-demand loading of these modules
+ by any user.
+
+ This functionality can be changed at runtime with the
+ dev.tty.ldisc_autoload sysctl, this configuration option will
+ only set the default value of this functionality.
+
endif # TTY
diff --git a/drivers/tty/serial/8250/8250_of.c b/drivers/tty/serial/8250/8250_of.c
index 7a8b5fc..f89dfde 100644
--- a/drivers/tty/serial/8250/8250_of.c
+++ b/drivers/tty/serial/8250/8250_of.c
@@ -97,6 +97,10 @@ static int of_platform_serial_setup(struct platform_device *ofdev,
if (of_property_read_u32(np, "reg-offset", &prop) == 0)
port->mapbase += prop;
+ /* Compatibility with the deprecated pxa driver and 8250_pxa drivers. */
+ if (of_device_is_compatible(np, "mrvl,mmp-uart"))
+ port->regshift = 2;
+
/* Check for registers offset within the devices address range */
if (of_property_read_u32(np, "reg-shift", &prop) == 0)
port->regshift = prop;
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index e82b347..2c38b3a 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -1330,6 +1330,30 @@ static int pci_default_setup(struct serial_private *priv,
return setup_port(priv, port, bar, offset, board->reg_shift);
}
+static int pci_pericom_setup(struct serial_private *priv,
+ const struct pciserial_board *board,
+ struct uart_8250_port *port, int idx)
+{
+ unsigned int bar, offset = board->first_offset, maxnr;
+
+ bar = FL_GET_BASE(board->flags);
+ if (board->flags & FL_BASE_BARS)
+ bar += idx;
+ else
+ offset += idx * board->uart_offset;
+
+ if (idx==3)
+ offset = 0x38;
+
+ maxnr = (pci_resource_len(priv->dev, bar) - board->first_offset) >>
+ (board->reg_shift + 3);
+
+ if (board->flags & FL_REGION_SZ_CAP && idx >= maxnr)
+ return 1;
+
+ return setup_port(priv, port, bar, offset, board->reg_shift);
+}
+
static int
ce4100_serial_setup(struct serial_private *priv,
const struct pciserial_board *board,
@@ -2097,6 +2121,16 @@ static struct pci_serial_quirk pci_serial_quirks[] __refdata = {
.exit = pci_plx9050_exit,
},
/*
+ * Pericom (Only 7954 - It have a offset jump for port 4)
+ */
+ {
+ .vendor = PCI_VENDOR_ID_PERICOM,
+ .device = PCI_DEVICE_ID_PERICOM_PI7C9X7954,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ /*
* PLX
*/
{
@@ -2126,6 +2160,111 @@ static struct pci_serial_quirk pci_serial_quirks[] __refdata = {
.setup = pci_default_setup,
.exit = pci_plx9050_exit,
},
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SDB,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM_4S,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4DB,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM232_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SMDB,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_COM_4SM,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4S,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_MPCIE_ICM232_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM422_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM485_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SM,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
+ {
+ .vendor = PCI_VENDOR_ID_ACCESIO,
+ .device = PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4SM,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_pericom_setup,
+ },
/*
* SBS Technologies, Inc., PMC-OCTALPRO 232
*/
@@ -4976,10 +5115,10 @@ static struct pci_device_id serial_pci_tbl[] = {
*/
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_2SDB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM_2S,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SDB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
@@ -4988,10 +5127,10 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_2DB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM232_2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4DB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
@@ -5000,10 +5139,10 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_2SMDB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_COM_2SM,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SMDB,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
@@ -5012,13 +5151,13 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_1,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7951 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM485_2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM422_4,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
@@ -5027,16 +5166,16 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_2S,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4S,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_MPCIE_ICM232_2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_4,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7954 },
@@ -5045,13 +5184,13 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_2SM,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7954 },
+ pbn_pericom_PI7C9X7952 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM422_4,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7958 },
+ pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM485_4,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7958 },
+ pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM422_8,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7958 },
@@ -5060,19 +5199,19 @@ static struct pci_device_id serial_pci_tbl[] = {
pbn_pericom_PI7C9X7958 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_4,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7958 },
+ pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM232_8,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7958 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_4SM,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7958 },
+ pbn_pericom_PI7C9X7954 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_COM_8SM,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_pericom_PI7C9X7958 },
{ PCI_VENDOR_ID_ACCESIO, PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4SM,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_pericom_PI7C9X7958 },
+ pbn_pericom_PI7C9X7954 },
/*
* Topic TP560 Data/Fax/Voice 56k modem (reported by Evan Clarke)
*/
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
index 5a341b1..ef688aa 100644
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -175,6 +175,8 @@ struct atmel_uart_port {
unsigned int pending_status;
spinlock_t lock_suspended;
+ bool hd_start_rx; /* can start RX during half-duplex operation */
+
int (*prepare_rx)(struct uart_port *port);
int (*prepare_tx)(struct uart_port *port);
void (*schedule_rx)(struct uart_port *port);
@@ -241,6 +243,12 @@ static inline void atmel_uart_write_char(struct uart_port *port, u8 value)
#endif
+static inline int atmel_uart_is_half_duplex(struct uart_port *port)
+{
+ return (port->rs485.flags & SER_RS485_ENABLED) &&
+ !(port->rs485.flags & SER_RS485_RX_DURING_TX);
+}
+
#ifdef CONFIG_SERIAL_ATMEL_PDC
static bool atmel_use_pdc_rx(struct uart_port *port)
{
@@ -492,9 +500,9 @@ static void atmel_stop_tx(struct uart_port *port)
/* Disable interrupts */
atmel_uart_writel(port, ATMEL_US_IDR, atmel_port->tx_done_mask);
- if ((port->rs485.flags & SER_RS485_ENABLED) &&
- !(port->rs485.flags & SER_RS485_RX_DURING_TX))
+ if (atmel_uart_is_half_duplex(port))
atmel_start_rx(port);
+
}
/*
@@ -511,8 +519,7 @@ static void atmel_start_tx(struct uart_port *port)
return;
if (atmel_use_pdc_tx(port) || atmel_use_dma_tx(port))
- if ((port->rs485.flags & SER_RS485_ENABLED) &&
- !(port->rs485.flags & SER_RS485_RX_DURING_TX))
+ if (atmel_uart_is_half_duplex(port))
atmel_stop_rx(port);
if (atmel_use_pdc_tx(port))
@@ -809,10 +816,14 @@ static void atmel_complete_tx_dma(void *arg)
*/
if (!uart_circ_empty(xmit))
atmel_tasklet_schedule(atmel_port, &atmel_port->tasklet_tx);
- else if ((port->rs485.flags & SER_RS485_ENABLED) &&
- !(port->rs485.flags & SER_RS485_RX_DURING_TX)) {
- /* DMA done, stop TX, start RX for RS485 */
- atmel_start_rx(port);
+ else if (atmel_uart_is_half_duplex(port)) {
+ /*
+ * DMA done, re-enable TXEMPTY and signal that we can stop
+ * TX and start RX for RS485
+ */
+ atmel_port->hd_start_rx = true;
+ atmel_uart_writel(port, ATMEL_US_IER,
+ atmel_port->tx_done_mask);
}
spin_unlock_irqrestore(&port->lock, flags);
@@ -1166,6 +1177,10 @@ static int atmel_prepare_rx_dma(struct uart_port *port)
sg_dma_len(&atmel_port->sg_rx)/2,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT);
+ if (!desc) {
+ dev_err(port->dev, "Preparing DMA cyclic failed\n");
+ goto chan_err;
+ }
desc->callback = atmel_complete_rx_dma;
desc->callback_param = port;
atmel_port->desc_rx = desc;
@@ -1253,9 +1268,20 @@ atmel_handle_transmit(struct uart_port *port, unsigned int pending)
struct atmel_uart_port *atmel_port = to_atmel_uart_port(port);
if (pending & atmel_port->tx_done_mask) {
- /* Either PDC or interrupt transmission */
atmel_uart_writel(port, ATMEL_US_IDR,
atmel_port->tx_done_mask);
+
+ /* Start RX if flag was set and FIFO is empty */
+ if (atmel_port->hd_start_rx) {
+ if (!(atmel_uart_readl(port, ATMEL_US_CSR)
+ & ATMEL_US_TXEMPTY))
+ dev_warn(port->dev, "Should start RX, but TX fifo is not empty\n");
+
+ atmel_port->hd_start_rx = false;
+ atmel_start_rx(port);
+ return;
+ }
+
atmel_tasklet_schedule(atmel_port, &atmel_port->tasklet_tx);
}
}
@@ -1382,8 +1408,7 @@ static void atmel_tx_pdc(struct uart_port *port)
atmel_uart_writel(port, ATMEL_US_IER,
atmel_port->tx_done_mask);
} else {
- if ((port->rs485.flags & SER_RS485_ENABLED) &&
- !(port->rs485.flags & SER_RS485_RX_DURING_TX)) {
+ if (atmel_uart_is_half_duplex(port)) {
/* DMA done, stop TX, start RX for RS485 */
atmel_start_rx(port);
}
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 6c0c6c5d..ec24e82 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -148,8 +148,10 @@ static int configure_kgdboc(void)
char *cptr = config;
struct console *cons;
- if (!strlen(config) || isspace(config[0]))
+ if (!strlen(config) || isspace(config[0])) {
+ err = 0;
goto noconfig;
+ }
kgdboc_io_ops.is_console = 0;
kgdb_tty_driver = NULL;
diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c
index 8a3e926..5331baf 100644
--- a/drivers/tty/serial/max310x.c
+++ b/drivers/tty/serial/max310x.c
@@ -1323,6 +1323,8 @@ static int max310x_spi_probe(struct spi_device *spi)
if (spi->dev.of_node) {
const struct of_device_id *of_id =
of_match_device(max310x_dt_ids, &spi->dev);
+ if (!of_id)
+ return -ENODEV;
devtype = (struct max310x_devtype *)of_id->data;
} else {
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 6ff53b6..bcb9979 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -834,19 +834,9 @@ static void sci_transmit_chars(struct uart_port *port)
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
uart_write_wakeup(port);
- if (uart_circ_empty(xmit)) {
+ if (uart_circ_empty(xmit))
sci_stop_tx(port);
- } else {
- ctrl = serial_port_in(port, SCSCR);
- if (port->type != PORT_SCI) {
- serial_port_in(port, SCxSR); /* Dummy read */
- sci_clear_SCxSR(port, SCxSR_TDxE_CLEAR(port));
- }
-
- ctrl |= SCSCR_TIE;
- serial_port_out(port, SCSCR, ctrl);
- }
}
/* On SH3, SCIF may read end-of-break as a space->mark char */
diff --git a/drivers/tty/serial/sprd_serial.c b/drivers/tty/serial/sprd_serial.c
index 699447a..747560f 100644
--- a/drivers/tty/serial/sprd_serial.c
+++ b/drivers/tty/serial/sprd_serial.c
@@ -36,7 +36,7 @@
#define SPRD_FIFO_SIZE 128
#define SPRD_DEF_RATE 26000000
#define SPRD_BAUD_IO_LIMIT 3000000
-#define SPRD_TIMEOUT 256
+#define SPRD_TIMEOUT 256000
/* the offset of serial registers and BITs for them */
/* data registers */
@@ -63,6 +63,7 @@
/* interrupt clear register */
#define SPRD_ICLR 0x0014
+#define SPRD_ICLR_TIMEOUT BIT(13)
/* line control register */
#define SPRD_LCR 0x0018
@@ -298,7 +299,8 @@ static irqreturn_t sprd_handle_irq(int irq, void *dev_id)
return IRQ_NONE;
}
- serial_out(port, SPRD_ICLR, ~0);
+ if (ims & SPRD_IMSR_TIMEOUT)
+ serial_out(port, SPRD_ICLR, SPRD_ICLR_TIMEOUT);
if (ims & (SPRD_IMSR_RX_FIFO_FULL |
SPRD_IMSR_BREAK_DETECT | SPRD_IMSR_TIMEOUT))
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 7333d64..eb61a07 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -362,7 +362,13 @@ static irqreturn_t cdns_uart_isr(int irq, void *dev_id)
cdns_uart_handle_tx(dev_id);
isrstatus &= ~CDNS_UART_IXR_TXEMPTY;
}
- if (isrstatus & CDNS_UART_IXR_RXMASK)
+
+ /*
+ * Skip RX processing if RX is disabled as RXEMPTY will never be set
+ * as read bytes will not be removed from the FIFO.
+ */
+ if (isrstatus & CDNS_UART_IXR_RXMASK &&
+ !(readl(port->membase + CDNS_UART_CR) & CDNS_UART_CR_RX_DIS))
cdns_uart_handle_rx(dev_id, isrstatus);
spin_unlock(&port->lock);
@@ -1255,7 +1261,7 @@ static void cdns_uart_console_write(struct console *co, const char *s,
*
* Return: 0 on success, negative errno otherwise.
*/
-static int __init cdns_uart_console_setup(struct console *co, char *options)
+static int cdns_uart_console_setup(struct console *co, char *options)
{
struct uart_port *port = &cdns_uart_port[co->index];
int baud = 9600;
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 41b9a7c..ca9c82e 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -25,7 +25,7 @@
* Byte threshold to limit memory consumption for flip buffers.
* The actual memory limit is > 2x this amount.
*/
-#define TTYB_DEFAULT_MEM_LIMIT 65536
+#define TTYB_DEFAULT_MEM_LIMIT (640 * 1024UL)
/*
* We default to dicing tty buffer allocations to this many characters
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 84f7b83..08e47de 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -520,6 +520,8 @@ void proc_clear_tty(struct task_struct *p)
tty_kref_put(tty);
}
+extern void tty_sysctl_init(void);
+
/**
* proc_set_tty - set the controlling terminal
*
@@ -3709,6 +3711,7 @@ void console_sysfs_notify(void)
*/
int __init tty_init(void)
{
+ tty_sysctl_init();
cdev_init(&tty_cdev, &tty_fops);
if (cdev_add(&tty_cdev, MKDEV(TTYAUX_MAJOR, 0), 1) ||
register_chrdev_region(MKDEV(TTYAUX_MAJOR, 0), 1, "/dev/tty") < 0)
diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index 4ab518d..3eb3f2a 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -155,6 +155,13 @@ static void put_ldops(struct tty_ldisc_ops *ldops)
* takes tty_ldiscs_lock to guard against ldisc races
*/
+#if defined(CONFIG_LDISC_AUTOLOAD)
+ #define INITIAL_AUTOLOAD_STATE 1
+#else
+ #define INITIAL_AUTOLOAD_STATE 0
+#endif
+static int tty_ldisc_autoload = INITIAL_AUTOLOAD_STATE;
+
static struct tty_ldisc *tty_ldisc_get(struct tty_struct *tty, int disc)
{
struct tty_ldisc *ld;
@@ -169,6 +176,8 @@ static struct tty_ldisc *tty_ldisc_get(struct tty_struct *tty, int disc)
*/
ldops = get_ldops(disc);
if (IS_ERR(ldops)) {
+ if (!capable(CAP_SYS_MODULE) && !tty_ldisc_autoload)
+ return ERR_PTR(-EPERM);
request_module("tty-ldisc-%d", disc);
ldops = get_ldops(disc);
if (IS_ERR(ldops))
@@ -774,3 +783,41 @@ void tty_ldisc_deinit(struct tty_struct *tty)
tty_ldisc_put(tty->ldisc);
tty->ldisc = NULL;
}
+
+static int zero;
+static int one = 1;
+static struct ctl_table tty_table[] = {
+ {
+ .procname = "ldisc_autoload",
+ .data = &tty_ldisc_autoload,
+ .maxlen = sizeof(tty_ldisc_autoload),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ { }
+};
+
+static struct ctl_table tty_dir_table[] = {
+ {
+ .procname = "tty",
+ .mode = 0555,
+ .child = tty_table,
+ },
+ { }
+};
+
+static struct ctl_table tty_root_table[] = {
+ {
+ .procname = "dev",
+ .mode = 0555,
+ .child = tty_dir_table,
+ },
+ { }
+};
+
+void tty_sysctl_init(void)
+{
+ register_sysctl_table(tty_root_table);
+}
diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c
index 64c6af2c..e96e3a5 100644
--- a/drivers/usb/chipidea/core.c
+++ b/drivers/usb/chipidea/core.c
@@ -901,8 +901,15 @@ static int ci_hdrc_probe(struct platform_device *pdev)
} else if (ci->platdata->usb_phy) {
ci->usb_phy = ci->platdata->usb_phy;
} else {
+ ci->usb_phy = devm_usb_get_phy_by_phandle(dev->parent, "phys",
+ 0);
ci->phy = devm_phy_get(dev->parent, "usb-phy");
- ci->usb_phy = devm_usb_get_phy(dev->parent, USB_PHY_TYPE_USB2);
+
+ /* Fallback to grabbing any registered USB2 PHY */
+ if (IS_ERR(ci->usb_phy) &&
+ PTR_ERR(ci->usb_phy) != -EPROBE_DEFER)
+ ci->usb_phy = devm_usb_get_phy(dev->parent,
+ USB_PHY_TYPE_USB2);
/* if both generic PHY and USB PHY layers aren't enabled */
if (PTR_ERR(ci->phy) == -ENOSYS &&
diff --git a/drivers/usb/common/common.c b/drivers/usb/common/common.c
index 5ef8da6..64c7640 100644
--- a/drivers/usb/common/common.c
+++ b/drivers/usb/common/common.c
@@ -148,6 +148,8 @@ enum usb_dr_mode of_usb_get_dr_mode_by_phy(struct device_node *np, int arg0)
do {
controller = of_find_node_with_property(controller, "phys");
+ if (!of_device_is_available(controller))
+ continue;
index = 0;
do {
if (arg0 == -1) {
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
index f794741..876679a 100644
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -763,21 +763,18 @@ void usb_destroy_configuration(struct usb_device *dev)
return;
if (dev->rawdescriptors) {
- for (i = 0; i < dev->descriptor.bNumConfigurations &&
- i < USB_MAXCONFIG; i++)
+ for (i = 0; i < dev->descriptor.bNumConfigurations; i++)
kfree(dev->rawdescriptors[i]);
kfree(dev->rawdescriptors);
dev->rawdescriptors = NULL;
}
- for (c = 0; c < dev->descriptor.bNumConfigurations &&
- c < USB_MAXCONFIG; c++) {
+ for (c = 0; c < dev->descriptor.bNumConfigurations; c++) {
struct usb_host_config *cf = &dev->config[c];
kfree(cf->string);
- for (i = 0; i < cf->desc.bNumInterfaces &&
- i < USB_MAXINTERFACES; i++) {
+ for (i = 0; i < cf->desc.bNumInterfaces; i++) {
if (cf->intf_cache[i])
kref_put(&cf->intf_cache[i]->ref,
usb_release_interface_cache);
diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index 7dae981..4809581 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -1901,14 +1901,11 @@ int usb_runtime_idle(struct device *dev)
return -EBUSY;
}
-int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
+static int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
{
struct usb_hcd *hcd = bus_to_hcd(udev->bus);
int ret = -EPERM;
- if (enable && !udev->usb2_hw_lpm_allowed)
- return 0;
-
if (hcd->driver->set_usb2_hw_lpm) {
ret = hcd->driver->set_usb2_hw_lpm(hcd, udev, enable);
if (!ret)
@@ -1918,6 +1915,24 @@ int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
return ret;
}
+int usb_enable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ if (!udev->usb2_hw_lpm_capable ||
+ !udev->usb2_hw_lpm_allowed ||
+ udev->usb2_hw_lpm_enabled)
+ return 0;
+
+ return usb_set_usb2_hardware_lpm(udev, 1);
+}
+
+int usb_disable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ if (!udev->usb2_hw_lpm_enabled)
+ return 0;
+
+ return usb_set_usb2_hardware_lpm(udev, 0);
+}
+
#endif /* CONFIG_PM */
struct bus_type usb_bus_type = {
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 0ff6ae3..d6c3df6 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3179,8 +3179,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
}
/* disable USB2 hardware LPM */
- if (udev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(udev, 0);
+ usb_disable_usb2_hardware_lpm(udev);
if (usb_disable_ltm(udev)) {
dev_err(&udev->dev, "Failed to disable LTM before suspend\n.");
@@ -3226,8 +3225,7 @@ int usb_port_suspend(struct usb_device *udev, pm_message_t msg)
usb_enable_ltm(udev);
err_ltm:
/* Try to enable USB2 hardware LPM again */
- if (udev->usb2_hw_lpm_capable == 1)
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
if (udev->do_remote_wakeup)
(void) usb_disable_remote_wakeup(udev);
@@ -3512,8 +3510,7 @@ int usb_port_resume(struct usb_device *udev, pm_message_t msg)
hub_port_logical_disconnect(hub, port1);
} else {
/* Try to enable USB2 hardware LPM */
- if (udev->usb2_hw_lpm_capable == 1)
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
/* Try to enable USB3 LTM and LPM */
usb_enable_ltm(udev);
@@ -4350,7 +4347,7 @@ static void hub_set_initial_usb2_lpm_policy(struct usb_device *udev)
if ((udev->bos->ext_cap->bmAttributes & cpu_to_le32(USB_BESL_SUPPORT)) ||
connect_type == USB_PORT_CONNECT_TYPE_HARD_WIRED) {
udev->usb2_hw_lpm_allowed = 1;
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
}
}
@@ -5516,8 +5513,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
/* Disable USB2 hardware LPM.
* It will be re-enabled by the enumeration process.
*/
- if (udev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(udev, 0);
+ usb_disable_usb2_hardware_lpm(udev);
/* Disable LPM and LTM while we reset the device and reinstall the alt
* settings. Device-initiated LPM settings, and system exit latency
@@ -5627,7 +5623,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
done:
/* Now that the alt settings are re-installed, enable LTM and LPM. */
- usb_set_usb2_hardware_lpm(udev, 1);
+ usb_enable_usb2_hardware_lpm(udev);
usb_unlocked_enable_lpm(udev);
usb_enable_ltm(udev);
usb_release_bos_descriptor(udev);
diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 255fecc..f193717 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1181,8 +1181,7 @@ void usb_disable_device(struct usb_device *dev, int skip_ep0)
dev->actconfig->interface[i] = NULL;
}
- if (dev->usb2_hw_lpm_enabled == 1)
- usb_set_usb2_hardware_lpm(dev, 0);
+ usb_disable_usb2_hardware_lpm(dev);
usb_unlocked_disable_lpm(dev);
usb_disable_ltm(dev);
diff --git a/drivers/usb/core/sysfs.c b/drivers/usb/core/sysfs.c
index c953a0f..1a232b4 100644
--- a/drivers/usb/core/sysfs.c
+++ b/drivers/usb/core/sysfs.c
@@ -494,7 +494,10 @@ static ssize_t usb2_hardware_lpm_store(struct device *dev,
if (!ret) {
udev->usb2_hw_lpm_allowed = value;
- ret = usb_set_usb2_hardware_lpm(udev, value);
+ if (value)
+ ret = usb_enable_usb2_hardware_lpm(udev);
+ else
+ ret = usb_disable_usb2_hardware_lpm(udev);
}
usb_unlock_device(udev);
diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h
index fbff25f..dde0e99 100644
--- a/drivers/usb/core/usb.h
+++ b/drivers/usb/core/usb.h
@@ -84,7 +84,8 @@ extern int usb_remote_wakeup(struct usb_device *dev);
extern int usb_runtime_suspend(struct device *dev);
extern int usb_runtime_resume(struct device *dev);
extern int usb_runtime_idle(struct device *dev);
-extern int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable);
+extern int usb_enable_usb2_hardware_lpm(struct usb_device *udev);
+extern int usb_disable_usb2_hardware_lpm(struct usb_device *udev);
#else
@@ -104,7 +105,12 @@ static inline int usb_autoresume_device(struct usb_device *udev)
return 0;
}
-static inline int usb_set_usb2_hardware_lpm(struct usb_device *udev, int enable)
+static inline int usb_enable_usb2_hardware_lpm(struct usb_device *udev)
+{
+ return 0;
+}
+
+static inline int usb_disable_usb2_hardware_lpm(struct usb_device *udev)
{
return 0;
}
diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c
index 14c18f3..b392577 100644
--- a/drivers/usb/gadget/configfs.c
+++ b/drivers/usb/gadget/configfs.c
@@ -329,6 +329,7 @@ static ssize_t gadget_dev_desc_UDC_store(struct config_item *item,
gi->composite.gadget_driver.udc_name = NULL;
goto err;
}
+ schedule_work(&gi->work);
}
mutex_unlock(&gi->lock);
return len;
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
old mode 100644
new mode 100755
index 15c67c1..8bcc6ff
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1188,9 +1188,12 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
*/
if (epfile->ep == ep) {
usb_ep_dequeue(ep->ep, req);
+ spin_unlock_irq(&epfile->ffs->eps_lock);
+ wait_for_completion(done);
interrupted = ep->status < 0;
+ } else {
+ spin_unlock_irq(&epfile->ffs->eps_lock);
}
- spin_unlock_irq(&epfile->ffs->eps_lock);
}
if (interrupted) {
diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
index 5815120..8e83649 100644
--- a/drivers/usb/gadget/function/f_hid.c
+++ b/drivers/usb/gadget/function/f_hid.c
@@ -340,20 +340,20 @@ static ssize_t f_hidg_write(struct file *file, const char __user *buffer,
req->complete = f_hidg_req_complete;
req->context = hidg;
+ spin_unlock_irqrestore(&hidg->write_spinlock, flags);
+
status = usb_ep_queue(hidg->in_ep, hidg->req, GFP_ATOMIC);
if (status < 0) {
ERROR(hidg->func.config->cdev,
"usb_ep_queue error on int endpoint %zd\n", status);
- goto release_write_pending_unlocked;
+ goto release_write_pending;
} else {
status = count;
}
- spin_unlock_irqrestore(&hidg->write_spinlock, flags);
return status;
release_write_pending:
spin_lock_irqsave(&hidg->write_spinlock, flags);
-release_write_pending_unlocked:
hidg->write_pending = 0;
spin_unlock_irqrestore(&hidg->write_spinlock, flags);
diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
index b09392f..37acb0e 100644
--- a/drivers/usb/host/ehci-hub.c
+++ b/drivers/usb/host/ehci-hub.c
@@ -779,12 +779,12 @@ static struct urb *request_single_step_set_feature_urb(
atomic_inc(&urb->use_count);
atomic_inc(&urb->dev->urbnum);
urb->setup_dma = dma_map_single(
- hcd->self.controller,
+ hcd->self.sysdev,
urb->setup_packet,
sizeof(struct usb_ctrlrequest),
DMA_TO_DEVICE);
urb->transfer_dma = dma_map_single(
- hcd->self.controller,
+ hcd->self.sysdev,
urb->transfer_buffer,
urb->transfer_buffer_length,
DMA_FROM_DEVICE);
diff --git a/drivers/usb/host/xhci-rcar.c b/drivers/usb/host/xhci-rcar.c
index 0e4535e..64ee815 100644
--- a/drivers/usb/host/xhci-rcar.c
+++ b/drivers/usb/host/xhci-rcar.c
@@ -192,5 +192,6 @@ int xhci_rcar_init_quirk(struct usb_hcd *hcd)
xhci_rcar_is_gen3(hcd->self.controller))
xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
+ xhci->quirks |= XHCI_TRUST_TX_LENGTH;
return xhci_rcar_download_firmware(hcd);
}
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 32cb995..fe5db15 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1635,10 +1635,13 @@ static void handle_port_status(struct xhci_hcd *xhci,
}
}
- if ((temp & PORT_PLC) && (temp & PORT_PLS_MASK) == XDEV_U0 &&
- DEV_SUPERSPEED_ANY(temp)) {
+ if ((temp & PORT_PLC) &&
+ DEV_SUPERSPEED_ANY(temp) &&
+ ((temp & PORT_PLS_MASK) == XDEV_U0 ||
+ (temp & PORT_PLS_MASK) == XDEV_U1 ||
+ (temp & PORT_PLS_MASK) == XDEV_U2)) {
xhci_dbg(xhci, "resume SS port %d finished\n", port_id);
- /* We've just brought the device into U0 through either the
+ /* We've just brought the device into U0/1/2 through either the
* Resume state after a device remote wakeup, or through the
* U3Exit state after a host-initiated resume. If it's a device
* initiated remote wake, don't pass up the link state change,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 93043ed..016f7a1 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -311,6 +311,7 @@ struct xhci_op_regs {
*/
#define PORT_PLS_MASK (0xf << 5)
#define XDEV_U0 (0x0 << 5)
+#define XDEV_U1 (0x1 << 5)
#define XDEV_U2 (0x2 << 5)
#define XDEV_U3 (0x3 << 5)
#define XDEV_INACTIVE (0x6 << 5)
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index 41713e5..109f5da 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -77,6 +77,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(0x10C4, 0x804E) }, /* Software Bisque Paramount ME build-in converter */
{ USB_DEVICE(0x10C4, 0x8053) }, /* Enfora EDG1228 */
{ USB_DEVICE(0x10C4, 0x8054) }, /* Enfora GSM2228 */
+ { USB_DEVICE(0x10C4, 0x8056) }, /* Lorenz Messtechnik devices */
{ USB_DEVICE(0x10C4, 0x8066) }, /* Argussoft In-System Programmer */
{ USB_DEVICE(0x10C4, 0x806F) }, /* IMS USB to RS422 Converter Cable */
{ USB_DEVICE(0x10C4, 0x807A) }, /* Crumb128 board */
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index b88a722..f54931a 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -604,6 +604,8 @@ static const struct usb_device_id id_table_combined[] = {
.driver_info = (kernel_ulong_t)&ftdi_jtag_quirk },
{ USB_DEVICE(FTDI_VID, FTDI_NT_ORIONLXM_PID),
.driver_info = (kernel_ulong_t)&ftdi_jtag_quirk },
+ { USB_DEVICE(FTDI_VID, FTDI_NT_ORIONLX_PLUS_PID) },
+ { USB_DEVICE(FTDI_VID, FTDI_NT_ORION_IO_PID) },
{ USB_DEVICE(FTDI_VID, FTDI_SYNAPSE_SS200_PID) },
{ USB_DEVICE(FTDI_VID, FTDI_CUSTOMWARE_MINIPLEX_PID) },
{ USB_DEVICE(FTDI_VID, FTDI_CUSTOMWARE_MINIPLEX2_PID) },
diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h
index ddf5ab9..15d220e 100644
--- a/drivers/usb/serial/ftdi_sio_ids.h
+++ b/drivers/usb/serial/ftdi_sio_ids.h
@@ -566,7 +566,9 @@
/*
* NovaTech product ids (FTDI_VID)
*/
-#define FTDI_NT_ORIONLXM_PID 0x7c90 /* OrionLXm Substation Automation Platform */
+#define FTDI_NT_ORIONLXM_PID 0x7c90 /* OrionLXm Substation Automation Platform */
+#define FTDI_NT_ORIONLX_PLUS_PID 0x7c91 /* OrionLX+ Substation Automation Platform */
+#define FTDI_NT_ORION_IO_PID 0x7c92 /* Orion I/O */
/*
* Synapse Wireless product ids (FTDI_VID)
diff --git a/drivers/usb/serial/mos7720.c b/drivers/usb/serial/mos7720.c
index 135eb04..ea20322 100644
--- a/drivers/usb/serial/mos7720.c
+++ b/drivers/usb/serial/mos7720.c
@@ -368,8 +368,6 @@ static int write_parport_reg_nonblock(struct mos7715_parport *mos_parport,
if (!urbtrack)
return -ENOMEM;
- kref_get(&mos_parport->ref_count);
- urbtrack->mos_parport = mos_parport;
urbtrack->urb = usb_alloc_urb(0, GFP_ATOMIC);
if (!urbtrack->urb) {
kfree(urbtrack);
@@ -390,6 +388,8 @@ static int write_parport_reg_nonblock(struct mos7715_parport *mos_parport,
usb_sndctrlpipe(usbdev, 0),
(unsigned char *)urbtrack->setup,
NULL, 0, async_complete, urbtrack);
+ kref_get(&mos_parport->ref_count);
+ urbtrack->mos_parport = mos_parport;
kref_init(&urbtrack->ref_count);
INIT_LIST_HEAD(&urbtrack->urblist_entry);
diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index b2b7c12..9f96dd2 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1066,7 +1066,8 @@ static const struct usb_device_id option_ids[] = {
.driver_info = RSVD(3) },
{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x6613)}, /* Onda H600/ZTE MF330 */
{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x0023)}, /* ONYX 3G device */
- { USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9000)}, /* SIMCom SIM5218 */
+ { USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9000), /* SIMCom SIM5218 */
+ .driver_info = NCTRL(0) | NCTRL(1) | NCTRL(2) | NCTRL(3) | RSVD(4) },
/* Quectel products using Qualcomm vendor ID */
{ USB_DEVICE(QUALCOMM_VENDOR_ID, QUECTEL_PRODUCT_UC15)},
{ USB_DEVICE(QUALCOMM_VENDOR_ID, QUECTEL_PRODUCT_UC20),
@@ -1941,10 +1942,12 @@ static const struct usb_device_id option_ids[] = {
.driver_info = RSVD(4) },
{ USB_DEVICE_INTERFACE_CLASS(0x2001, 0x7e35, 0xff), /* D-Link DWM-222 */
.driver_info = RSVD(4) },
- { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e01, 0xff, 0xff, 0xff) }, /* D-Link DWM-152/C1 */
- { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e02, 0xff, 0xff, 0xff) }, /* D-Link DWM-156/C1 */
- { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x7e11, 0xff, 0xff, 0xff) }, /* D-Link DWM-156/A3 */
- { USB_DEVICE_INTERFACE_CLASS(0x2020, 0x4000, 0xff) }, /* OLICARD300 - MT6225 */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e01, 0xff, 0xff, 0xff) }, /* D-Link DWM-152/C1 */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e02, 0xff, 0xff, 0xff) }, /* D-Link DWM-156/C1 */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x7e11, 0xff, 0xff, 0xff) }, /* D-Link DWM-156/A3 */
+ { USB_DEVICE_INTERFACE_CLASS(0x2020, 0x2031, 0xff), /* Olicard 600 */
+ .driver_info = RSVD(4) },
+ { USB_DEVICE_INTERFACE_CLASS(0x2020, 0x4000, 0xff) }, /* OLICARD300 - MT6225 */
{ USB_DEVICE(INOVIA_VENDOR_ID, INOVIA_SEW858) },
{ USB_DEVICE(VIATELECOM_VENDOR_ID, VIATELECOM_PRODUCT_CDS7) },
{ USB_DEVICE_AND_INTERFACE_INFO(WETELECOM_VENDOR_ID, WETELECOM_PRODUCT_WMD200, 0xff, 0xff, 0xff) },
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2383caf..a54dbfe 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -863,8 +863,12 @@ static int vhost_new_umem_range(struct vhost_umem *umem,
u64 start, u64 size, u64 end,
u64 userspace_addr, int perm)
{
- struct vhost_umem_node *tmp, *node = kmalloc(sizeof(*node), GFP_ATOMIC);
+ struct vhost_umem_node *tmp, *node;
+ if (!size)
+ return -EFAULT;
+
+ node = kmalloc(sizeof(*node), GFP_ATOMIC);
if (!node)
return -ENOMEM;
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 72e914d..3cefd60 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -640,7 +640,7 @@ static int vhost_vsock_set_cid(struct vhost_vsock *vsock, u64 guest_cid)
hash_del_rcu(&vsock->hash);
vsock->guest_cid = guest_cid;
- hash_add_rcu(vhost_vsock_hash, &vsock->hash, guest_cid);
+ hash_add_rcu(vhost_vsock_hash, &vsock->hash, vsock->guest_cid);
spin_unlock_bh(&vhost_vsock_lock);
return 0;
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index d95ae09..baa1510 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -54,10 +54,11 @@ static void pwm_backlight_power_on(struct pwm_bl_data *pb, int brightness)
if (err < 0)
dev_err(pb->dev, "failed to enable power supply\n");
+ pwm_enable(pb->pwm);
+
if (pb->enable_gpio)
gpiod_set_value_cansleep(pb->enable_gpio, 1);
- pwm_enable(pb->pwm);
pb->enabled = true;
}
@@ -66,12 +67,12 @@ static void pwm_backlight_power_off(struct pwm_bl_data *pb)
if (!pb->enabled)
return;
- pwm_config(pb->pwm, 0, pb->period);
- pwm_disable(pb->pwm);
-
if (pb->enable_gpio)
gpiod_set_value_cansleep(pb->enable_gpio, 0);
+ pwm_config(pb->pwm, 0, pb->period);
+ pwm_disable(pb->pwm);
+
regulator_disable(pb->power_supply);
pb->enabled = false;
}
diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 0dcdca1..780c22f 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -425,6 +425,9 @@ static void fb_do_show_logo(struct fb_info *info, struct fb_image *image,
{
unsigned int x;
+ if (image->width > info->var.xres || image->height > info->var.yres)
+ return;
+
if (rotate == FB_ROTATE_UR) {
for (x = 0;
x < num && image->dx + image->width <= info->var.xres;
diff --git a/drivers/video/fbdev/msm/mdss_dsi.c b/drivers/video/fbdev/msm/mdss_dsi.c
index 5d8398e..5a477ba 100644
--- a/drivers/video/fbdev/msm/mdss_dsi.c
+++ b/drivers/video/fbdev/msm/mdss_dsi.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
+/* Copyright (c) 2012-2019, The Linux Foundation. All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 and
@@ -1015,7 +1015,8 @@ static ssize_t mdss_dsi_cmd_write(struct file *file, const char __user *p,
static int mdss_dsi_cmd_flush(struct file *file, fl_owner_t id)
{
struct buf_data *pcmds = file->private_data;
- int blen, len, i;
+ unsigned int len;
+ int blen, i;
char *buf, *bufp, *bp;
struct dsi_ctrl_hdr *dchdr;
@@ -1059,7 +1060,7 @@ static int mdss_dsi_cmd_flush(struct file *file, fl_owner_t id)
while (len >= sizeof(*dchdr)) {
dchdr = (struct dsi_ctrl_hdr *)bp;
dchdr->dlen = ntohs(dchdr->dlen);
- if (dchdr->dlen > len || dchdr->dlen < 0) {
+ if (dchdr->dlen > (len - sizeof(*dchdr)) || dchdr->dlen < 0) {
pr_err("%s: dtsi cmd=%x error, len=%d\n",
__func__, dchdr->dtype, dchdr->dlen);
kfree(buf);
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 8977f40..2f09294 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1040,6 +1040,8 @@ struct virtqueue *vring_create_virtqueue(
GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
if (queue)
break;
+ if (!may_reduce_num)
+ return NULL;
}
if (!num)
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 072e759..a8ff430 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -59,6 +59,8 @@ enum {
Opt_cache_loose, Opt_fscache, Opt_mmap,
/* Access options */
Opt_access, Opt_posixacl,
+ /* Lock timeout option */
+ Opt_locktimeout,
/* Error token */
Opt_err
};
@@ -78,6 +80,7 @@ static const match_table_t tokens = {
{Opt_cachetag, "cachetag=%s"},
{Opt_access, "access=%s"},
{Opt_posixacl, "posixacl"},
+ {Opt_locktimeout, "locktimeout=%u"},
{Opt_err, NULL}
};
@@ -126,6 +129,7 @@ static int v9fs_parse_options(struct v9fs_session_info *v9ses, char *opts)
#ifdef CONFIG_9P_FSCACHE
v9ses->cachetag = NULL;
#endif
+ v9ses->session_lock_timeout = P9_LOCK_TIMEOUT;
if (!opts)
return 0;
@@ -298,6 +302,23 @@ static int v9fs_parse_options(struct v9fs_session_info *v9ses, char *opts)
#endif
break;
+ case Opt_locktimeout:
+ r = match_int(&args[0], &option);
+ if (r < 0) {
+ p9_debug(P9_DEBUG_ERROR,
+ "integer field, but no integer?\n");
+ ret = r;
+ continue;
+ }
+ if (option < 1) {
+ p9_debug(P9_DEBUG_ERROR,
+ "locktimeout must be a greater than zero integer.\n");
+ ret = -EINVAL;
+ continue;
+ }
+ v9ses->session_lock_timeout = (long)option * HZ;
+ break;
+
default:
continue;
}
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 443d12e..ce6ca9f 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -116,6 +116,7 @@ struct v9fs_session_info {
struct list_head slist; /* list of sessions registered with v9fs */
struct backing_dev_info bdi;
struct rw_semaphore rename_sem;
+ long session_lock_timeout; /* retry interval for blocking locks */
};
/* cache_validity flags */
diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h
index 5a0db6d..aaee1e6 100644
--- a/fs/9p/v9fs_vfs.h
+++ b/fs/9p/v9fs_vfs.h
@@ -40,6 +40,9 @@
*/
#define P9_LOCK_TIMEOUT (30*HZ)
+/* flags for v9fs_stat2inode() & v9fs_stat2inode_dotl() */
+#define V9FS_STAT2INODE_KEEP_ISIZE 1
+
extern struct file_system_type v9fs_fs_type;
extern const struct address_space_operations v9fs_addr_operations;
extern const struct file_operations v9fs_file_operations;
@@ -61,8 +64,10 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
struct inode *inode, umode_t mode, dev_t);
void v9fs_evict_inode(struct inode *inode);
ino_t v9fs_qid2ino(struct p9_qid *qid);
-void v9fs_stat2inode(struct p9_wstat *, struct inode *, struct super_block *);
-void v9fs_stat2inode_dotl(struct p9_stat_dotl *, struct inode *);
+void v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode,
+ struct super_block *sb, unsigned int flags);
+void v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode,
+ unsigned int flags);
int v9fs_dir_release(struct inode *inode, struct file *filp);
int v9fs_file_open(struct inode *inode, struct file *file);
void v9fs_inode2stat(struct inode *inode, struct p9_wstat *stat);
@@ -83,4 +88,18 @@ static inline void v9fs_invalidate_inode_attr(struct inode *inode)
}
int v9fs_open_to_dotl_flags(int flags);
+
+static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size)
+{
+ /*
+ * 32-bit need the lock, concurrent updates could break the
+ * sequences and make i_size_read() loop forever.
+ * 64-bit updates are atomic and can skip the locking.
+ */
+ if (sizeof(i_size) > sizeof(long))
+ spin_lock(&inode->i_lock);
+ i_size_write(inode, i_size);
+ if (sizeof(i_size) > sizeof(long))
+ spin_unlock(&inode->i_lock);
+}
#endif
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index 48db9a9..cb6c403 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -105,7 +105,6 @@ static int v9fs_dir_readdir(struct file *file, struct dir_context *ctx)
int err = 0;
struct p9_fid *fid;
int buflen;
- int reclen = 0;
struct p9_rdir *rdir;
struct kvec kvec;
@@ -138,11 +137,10 @@ static int v9fs_dir_readdir(struct file *file, struct dir_context *ctx)
while (rdir->head < rdir->tail) {
err = p9stat_read(fid->clnt, rdir->buf + rdir->head,
rdir->tail - rdir->head, &st);
- if (err) {
+ if (err <= 0) {
p9_debug(P9_DEBUG_VFS, "returned %d\n", err);
return -EIO;
}
- reclen = st.size+2;
over = !dir_emit(ctx, st.name, strlen(st.name),
v9fs_qid2ino(&st.qid), dt_type(&st));
@@ -150,8 +148,8 @@ static int v9fs_dir_readdir(struct file *file, struct dir_context *ctx)
if (over)
return 0;
- rdir->head += reclen;
- ctx->pos += reclen;
+ rdir->head += err;
+ ctx->pos += err;
}
}
}
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 398a3ed..79ff727 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -154,6 +154,7 @@ static int v9fs_file_do_lock(struct file *filp, int cmd, struct file_lock *fl)
uint8_t status = P9_LOCK_ERROR;
int res = 0;
unsigned char fl_type;
+ struct v9fs_session_info *v9ses;
fid = filp->private_data;
BUG_ON(fid == NULL);
@@ -189,6 +190,8 @@ static int v9fs_file_do_lock(struct file *filp, int cmd, struct file_lock *fl)
if (IS_SETLKW(cmd))
flock.flags = P9_LOCK_FLAGS_BLOCK;
+ v9ses = v9fs_inode2v9ses(file_inode(filp));
+
/*
* if its a blocked request and we get P9_LOCK_BLOCKED as the status
* for lock request, keep on trying
@@ -202,7 +205,8 @@ static int v9fs_file_do_lock(struct file *filp, int cmd, struct file_lock *fl)
break;
if (status == P9_LOCK_BLOCKED && !IS_SETLKW(cmd))
break;
- if (schedule_timeout_interruptible(P9_LOCK_TIMEOUT) != 0)
+ if (schedule_timeout_interruptible(v9ses->session_lock_timeout)
+ != 0)
break;
/*
* p9_client_lock_dotl overwrites flock.client_id with the
@@ -442,7 +446,11 @@ v9fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
i_size = i_size_read(inode);
if (iocb->ki_pos > i_size) {
inode_add_bytes(inode, iocb->ki_pos - i_size);
- i_size_write(inode, iocb->ki_pos);
+ /*
+ * Need to serialize against i_size_write() in
+ * v9fs_stat2inode()
+ */
+ v9fs_i_size_write(inode, iocb->ki_pos);
}
return retval;
}
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index f8ab4a6..ddd1eb6 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -538,7 +538,7 @@ static struct inode *v9fs_qid_iget(struct super_block *sb,
if (retval)
goto error;
- v9fs_stat2inode(st, inode, sb);
+ v9fs_stat2inode(st, inode, sb, 0);
v9fs_cache_inode_get_cookie(inode);
unlock_new_inode(inode);
return inode;
@@ -1078,7 +1078,7 @@ v9fs_vfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
if (IS_ERR(st))
return PTR_ERR(st);
- v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb);
+ v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb, 0);
generic_fillattr(d_inode(dentry), stat);
p9stat_free(st);
@@ -1156,12 +1156,13 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr)
* @stat: Plan 9 metadata (mistat) structure
* @inode: inode to populate
* @sb: superblock of filesystem
+ * @flags: control flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE)
*
*/
void
v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode,
- struct super_block *sb)
+ struct super_block *sb, unsigned int flags)
{
umode_t mode;
char ext[32];
@@ -1202,10 +1203,11 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode,
mode = p9mode2perm(v9ses, stat);
mode |= inode->i_mode & ~S_IALLUGO;
inode->i_mode = mode;
- i_size_write(inode, stat->length);
+ if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE))
+ v9fs_i_size_write(inode, stat->length);
/* not real number of blocks, but 512 byte ones ... */
- inode->i_blocks = (i_size_read(inode) + 512 - 1) >> 9;
+ inode->i_blocks = (stat->length + 512 - 1) >> 9;
v9inode->cache_validity &= ~V9FS_INO_INVALID_ATTR;
}
@@ -1402,9 +1404,9 @@ int v9fs_refresh_inode(struct p9_fid *fid, struct inode *inode)
{
int umode;
dev_t rdev;
- loff_t i_size;
struct p9_wstat *st;
struct v9fs_session_info *v9ses;
+ unsigned int flags;
v9ses = v9fs_inode2v9ses(inode);
st = p9_client_stat(fid);
@@ -1417,16 +1419,13 @@ int v9fs_refresh_inode(struct p9_fid *fid, struct inode *inode)
if ((inode->i_mode & S_IFMT) != (umode & S_IFMT))
goto out;
- spin_lock(&inode->i_lock);
/*
* We don't want to refresh inode->i_size,
* because we may have cached data
*/
- i_size = inode->i_size;
- v9fs_stat2inode(st, inode, inode->i_sb);
- if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
- inode->i_size = i_size;
- spin_unlock(&inode->i_lock);
+ flags = (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) ?
+ V9FS_STAT2INODE_KEEP_ISIZE : 0;
+ v9fs_stat2inode(st, inode, inode->i_sb, flags);
out:
p9stat_free(st);
kfree(st);
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index c3dd0d4..425bc1a 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -143,7 +143,7 @@ static struct inode *v9fs_qid_iget_dotl(struct super_block *sb,
if (retval)
goto error;
- v9fs_stat2inode_dotl(st, inode);
+ v9fs_stat2inode_dotl(st, inode, 0);
v9fs_cache_inode_get_cookie(inode);
retval = v9fs_get_acl(inode, fid);
if (retval)
@@ -496,7 +496,7 @@ v9fs_vfs_getattr_dotl(struct vfsmount *mnt, struct dentry *dentry,
if (IS_ERR(st))
return PTR_ERR(st);
- v9fs_stat2inode_dotl(st, d_inode(dentry));
+ v9fs_stat2inode_dotl(st, d_inode(dentry), 0);
generic_fillattr(d_inode(dentry), stat);
/* Change block size to what the server returned */
stat->blksize = st->st_blksize;
@@ -607,11 +607,13 @@ int v9fs_vfs_setattr_dotl(struct dentry *dentry, struct iattr *iattr)
* v9fs_stat2inode_dotl - populate an inode structure with stat info
* @stat: stat structure
* @inode: inode to populate
+ * @flags: ctrl flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE)
*
*/
void
-v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode)
+v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode,
+ unsigned int flags)
{
umode_t mode;
struct v9fs_inode *v9inode = V9FS_I(inode);
@@ -631,7 +633,8 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode)
mode |= inode->i_mode & ~S_IALLUGO;
inode->i_mode = mode;
- i_size_write(inode, stat->st_size);
+ if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE))
+ v9fs_i_size_write(inode, stat->st_size);
inode->i_blocks = stat->st_blocks;
} else {
if (stat->st_result_mask & P9_STATS_ATIME) {
@@ -661,8 +664,9 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode)
}
if (stat->st_result_mask & P9_STATS_RDEV)
inode->i_rdev = new_decode_dev(stat->st_rdev);
- if (stat->st_result_mask & P9_STATS_SIZE)
- i_size_write(inode, stat->st_size);
+ if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE) &&
+ stat->st_result_mask & P9_STATS_SIZE)
+ v9fs_i_size_write(inode, stat->st_size);
if (stat->st_result_mask & P9_STATS_BLOCKS)
inode->i_blocks = stat->st_blocks;
}
@@ -928,9 +932,9 @@ v9fs_vfs_get_link_dotl(struct dentry *dentry,
int v9fs_refresh_inode_dotl(struct p9_fid *fid, struct inode *inode)
{
- loff_t i_size;
struct p9_stat_dotl *st;
struct v9fs_session_info *v9ses;
+ unsigned int flags;
v9ses = v9fs_inode2v9ses(inode);
st = p9_client_getattr_dotl(fid, P9_STATS_ALL);
@@ -942,16 +946,13 @@ int v9fs_refresh_inode_dotl(struct p9_fid *fid, struct inode *inode)
if ((inode->i_mode & S_IFMT) != (st->st_mode & S_IFMT))
goto out;
- spin_lock(&inode->i_lock);
/*
* We don't want to refresh inode->i_size,
* because we may have cached data
*/
- i_size = inode->i_size;
- v9fs_stat2inode_dotl(st, inode);
- if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
- inode->i_size = i_size;
- spin_unlock(&inode->i_lock);
+ flags = (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) ?
+ V9FS_STAT2INODE_KEEP_ISIZE : 0;
+ v9fs_stat2inode_dotl(st, inode, flags);
out:
kfree(st);
return 0;
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
index de3ed86..06ff1a9 100644
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -165,7 +165,7 @@ static struct dentry *v9fs_mount(struct file_system_type *fs_type, int flags,
goto release_sb;
}
d_inode(root)->i_ino = v9fs_qid2ino(&st->qid);
- v9fs_stat2inode_dotl(st, d_inode(root));
+ v9fs_stat2inode_dotl(st, d_inode(root), 0);
kfree(st);
} else {
struct p9_wstat *st = NULL;
@@ -176,7 +176,7 @@ static struct dentry *v9fs_mount(struct file_system_type *fs_type, int flags,
}
d_inode(root)->i_ino = v9fs_qid2ino(&st->qid);
- v9fs_stat2inode(st, d_inode(root), sb);
+ v9fs_stat2inode(st, d_inode(root), sb, 0);
p9stat_free(st);
kfree(st);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f605bfa..84051b2 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3018,11 +3018,11 @@ static int __do_readpage(struct extent_io_tree *tree,
*/
if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) &&
prev_em_start && *prev_em_start != (u64)-1 &&
- *prev_em_start != em->orig_start)
+ *prev_em_start != em->start)
force_bio_submit = true;
if (prev_em_start)
- *prev_em_start = em->orig_start;
+ *prev_em_start = em->start;
free_extent_map(em);
em = NULL;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 242584a..a67143c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -385,6 +385,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
+ /*
+ * If the fs is mounted with nologreplay, which requires it to be
+ * mounted in RO mode as well, we can not allow discard on free space
+ * inside block groups, because log trees refer to extents that are not
+ * pinned in a block group's free space cache (pinning the extents is
+ * precisely the first phase of replaying a log tree).
+ */
+ if (btrfs_test_opt(fs_info, NOLOGREPLAY))
+ return -EROFS;
+
rcu_read_lock();
list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
dev_list) {
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index af6a776..5aa07de 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2395,8 +2395,9 @@ static noinline void finish_parity_scrub(struct btrfs_raid_bio *rbio,
bitmap_clear(rbio->dbitmap, pagenr, 1);
kunmap(p);
- for (stripe = 0; stripe < rbio->real_stripes; stripe++)
+ for (stripe = 0; stripe < nr_data; stripe++)
kunmap(page_in_rbio(rbio, stripe, pagenr, 0));
+ kunmap(p_page);
}
__free_page(p_page);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 47d11a3..a36bb75 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3343,9 +3343,16 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
}
btrfs_release_path(path);
- /* find the first key from this transaction again */
+ /*
+ * Find the first key from this transaction again. See the note for
+ * log_new_dir_dentries, if we're logging a directory recursively we
+ * won't be holding its i_mutex, which means we can modify the directory
+ * while we're logging it. If we remove an entry between our first
+ * search and this search we'll not find the key again and can just
+ * bail.
+ */
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
- if (WARN_ON(ret != 0))
+ if (ret != 0)
goto done;
/*
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1fe447e..dbb78c3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6439,10 +6439,10 @@ static int btrfs_check_chunk_valid(struct btrfs_root *root,
}
if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) ||
- (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes < 1) ||
+ (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes != 2) ||
(type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) ||
(type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) ||
- (type & BTRFS_BLOCK_GROUP_DUP && num_stripes > 2) ||
+ (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 &&
num_stripes != 1)) {
btrfs_err(root->fs_info,
diff --git a/fs/buffer.c b/fs/buffer.c
index 3477a14..9915966 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3077,6 +3077,13 @@ void guard_bio_eod(int op, struct bio *bio)
/* Uhhuh. We've got a bio that straddles the device size! */
truncated_bytes = bio->bi_iter.bi_size - (maxsector << 9);
+ /*
+ * The bio contains more than one segment which spans EOD, just return
+ * and let IO layer turn it into an EIO
+ */
+ if (truncated_bytes > bvec->bv_len)
+ return;
+
/* Truncate the bio.. */
bio->bi_iter.bi_size -= truncated_bytes;
bvec->bv_len -= truncated_bytes;
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index cec2569..2ffc7fe 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1471,6 +1471,7 @@ void ceph_dentry_lru_del(struct dentry *dn)
unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn)
{
struct ceph_inode_info *dci = ceph_inode(dir);
+ unsigned hash;
switch (dci->i_dir_layout.dl_dir_hash) {
case 0: /* for backward compat */
@@ -1478,8 +1479,11 @@ unsigned ceph_dentry_hash(struct inode *dir, struct dentry *dn)
return dn->d_name.hash;
default:
- return ceph_str_hash(dci->i_dir_layout.dl_dir_hash,
+ spin_lock(&dn->d_lock);
+ hash = ceph_str_hash(dci->i_dir_layout.dl_dir_hash,
dn->d_name.name, dn->d_name.len);
+ spin_unlock(&dn->d_lock);
+ return hash;
}
}
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 6cbd0d8..67cb9d0 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1187,6 +1187,15 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap,
list_add(&ci->i_prealloc_cap_flush->i_list, &to_remove);
ci->i_prealloc_cap_flush = NULL;
}
+
+ if (drop &&
+ ci->i_wrbuffer_ref_head == 0 &&
+ ci->i_wr_ref == 0 &&
+ ci->i_dirty_caps == 0 &&
+ ci->i_flushing_caps == 0) {
+ ceph_put_snap_context(ci->i_head_snapc);
+ ci->i_head_snapc = NULL;
+ }
}
spin_unlock(&ci->i_ceph_lock);
while (!list_empty(&to_remove)) {
diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 411e9df..3a76ae0 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -563,7 +563,12 @@ void ceph_queue_cap_snap(struct ceph_inode_info *ci)
old_snapc = NULL;
update_snapc:
- if (ci->i_head_snapc) {
+ if (ci->i_wrbuffer_ref_head == 0 &&
+ ci->i_wr_ref == 0 &&
+ ci->i_dirty_caps == 0 &&
+ ci->i_flushing_caps == 0) {
+ ci->i_head_snapc = NULL;
+ } else {
ci->i_head_snapc = ceph_get_snap_context(new_snapc);
dout(" new snapc is %p\n", new_snapc);
}
diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c
index 9156be5..4660208 100644
--- a/fs/cifs/cifs_dfs_ref.c
+++ b/fs/cifs/cifs_dfs_ref.c
@@ -271,9 +271,9 @@ static void dump_referral(const struct dfs_info3_param *ref)
{
cifs_dbg(FYI, "DFS: ref path: %s\n", ref->path_name);
cifs_dbg(FYI, "DFS: node path: %s\n", ref->node_name);
- cifs_dbg(FYI, "DFS: fl: %hd, srv_type: %hd\n",
+ cifs_dbg(FYI, "DFS: fl: %d, srv_type: %d\n",
ref->flags, ref->server_type);
- cifs_dbg(FYI, "DFS: ref_flags: %hd, path_consumed: %hd\n",
+ cifs_dbg(FYI, "DFS: ref_flags: %d, path_consumed: %d\n",
ref->ref_flag, ref->path_consumed);
}
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 4ed4736..5367b684 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1157,6 +1157,7 @@ cifsFileInfo_get_locked(struct cifsFileInfo *cifs_file)
}
struct cifsFileInfo *cifsFileInfo_get(struct cifsFileInfo *cifs_file);
+void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_hdlr);
void cifsFileInfo_put(struct cifsFileInfo *cifs_file);
#define CIFS_CACHE_READ_FLG 1
@@ -1651,6 +1652,7 @@ GLOBAL_EXTERN spinlock_t gidsidlock;
#endif /* CONFIG_CIFS_ACL */
void cifs_oplock_break(struct work_struct *work);
+void cifs_queue_oplock_break(struct cifsFileInfo *cfile);
extern const struct slow_work_ops cifs_oplock_break_ops;
extern struct workqueue_struct *cifsiod_wq;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5b287d3..4b350ac 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1201,6 +1201,11 @@ cifs_parse_devname(const char *devname, struct smb_vol *vol)
const char *delims = "/\\";
size_t len;
+ if (unlikely(!devname || !*devname)) {
+ cifs_dbg(VFS, "Device name not specified.\n");
+ return -EINVAL;
+ }
+
/* make sure we have a valid UNC double delimiter prefix */
len = strspn(devname, delims);
if (len != 2)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8ec2963..e7f1773 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -358,13 +358,31 @@ cifsFileInfo_get(struct cifsFileInfo *cifs_file)
return cifs_file;
}
-/*
- * Release a reference on the file private data. This may involve closing
- * the filehandle out on the server. Must be called without holding
- * tcon->open_file_lock and cifs_file->file_info_lock.
+/**
+ * cifsFileInfo_put - release a reference of file priv data
+ *
+ * Always potentially wait for oplock handler. See _cifsFileInfo_put().
*/
void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
{
+ _cifsFileInfo_put(cifs_file, true);
+}
+
+/**
+ * _cifsFileInfo_put - release a reference of file priv data
+ *
+ * This may involve closing the filehandle @cifs_file out on the
+ * server. Must be called without holding tcon->open_file_lock and
+ * cifs_file->file_info_lock.
+ *
+ * If @wait_for_oplock_handler is true and we are releasing the last
+ * reference, wait for any running oplock break handler of the file
+ * and cancel any pending one. If calling this function from the
+ * oplock break handler, you need to pass false.
+ *
+ */
+void _cifsFileInfo_put(struct cifsFileInfo *cifs_file, bool wait_oplock_handler)
+{
struct inode *inode = d_inode(cifs_file->dentry);
struct cifs_tcon *tcon = tlink_tcon(cifs_file->tlink);
struct TCP_Server_Info *server = tcon->ses->server;
@@ -411,7 +429,8 @@ void cifsFileInfo_put(struct cifsFileInfo *cifs_file)
spin_unlock(&tcon->open_file_lock);
- oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break);
+ oplock_break_cancelled = wait_oplock_handler ?
+ cancel_work_sync(&cifs_file->oplock_break) : false;
if (!tcon->need_reconnect && !cifs_file->invalidHandle) {
struct TCP_Server_Info *server = tcon->ses->server;
@@ -1627,8 +1646,20 @@ cifs_setlk(struct file *file, struct file_lock *flock, __u32 type,
rc = server->ops->mand_unlock_range(cfile, flock, xid);
out:
- if (flock->fl_flags & FL_POSIX && !rc)
+ if (flock->fl_flags & FL_POSIX) {
+ /*
+ * If this is a request to remove all locks because we
+ * are closing the file, it doesn't matter if the
+ * unlocking failed as both cifs.ko and the SMB server
+ * remove the lock on file close
+ */
+ if (rc) {
+ cifs_dbg(VFS, "%s failed rc=%d\n", __func__, rc);
+ if (!(flock->fl_flags & FL_CLOSE))
+ return rc;
+ }
rc = locks_lock_file_wait(file, flock);
+ }
return rc;
}
@@ -2797,14 +2828,16 @@ cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from)
* these pages but not on the region from pos to ppos+len-1.
*/
written = cifs_user_writev(iocb, from);
- if (written > 0 && CIFS_CACHE_READ(cinode)) {
+ if (CIFS_CACHE_READ(cinode)) {
/*
- * Windows 7 server can delay breaking level2 oplock if a write
- * request comes - break it on the client to prevent reading
- * an old data.
+ * We have read level caching and we have just sent a write
+ * request to the server thus making data in the cache stale.
+ * Zap the cache and set oplock/lease level to NONE to avoid
+ * reading stale data from the cache. All subsequent read
+ * operations will read new data from the server.
*/
cifs_zap_mapping(inode);
- cifs_dbg(FYI, "Set no oplock for inode=%p after a write operation\n",
+ cifs_dbg(FYI, "Set Oplock/Lease to NONE for inode=%p after write\n",
inode);
cinode->oplock = 0;
}
@@ -3899,6 +3932,7 @@ void cifs_oplock_break(struct work_struct *work)
cinode);
cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
}
+ _cifsFileInfo_put(cfile, false /* do not wait for ourself */);
cifs_done_oplock_break(cinode);
}
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 57c938f..786f67b 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -771,43 +771,50 @@ cifs_get_inode_info(struct inode **inode, const char *full_path,
} else if ((rc == -EACCES) && backup_cred(cifs_sb) &&
(strcmp(server->vals->version_string, SMB1_VERSION_STRING)
== 0)) {
- /*
- * For SMB2 and later the backup intent flag is already
- * sent if needed on open and there is no path based
- * FindFirst operation to use to retry with
- */
+ /*
+ * For SMB2 and later the backup intent flag is already
+ * sent if needed on open and there is no path based
+ * FindFirst operation to use to retry with
+ */
- srchinf = kzalloc(sizeof(struct cifs_search_info),
- GFP_KERNEL);
- if (srchinf == NULL) {
- rc = -ENOMEM;
- goto cgii_exit;
- }
+ srchinf = kzalloc(sizeof(struct cifs_search_info),
+ GFP_KERNEL);
+ if (srchinf == NULL) {
+ rc = -ENOMEM;
+ goto cgii_exit;
+ }
- srchinf->endOfSearch = false;
+ srchinf->endOfSearch = false;
+ if (tcon->unix_ext)
+ srchinf->info_level = SMB_FIND_FILE_UNIX;
+ else if ((tcon->ses->capabilities &
+ tcon->ses->server->vals->cap_nt_find) == 0)
+ srchinf->info_level = SMB_FIND_FILE_INFO_STANDARD;
+ else if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM)
srchinf->info_level = SMB_FIND_FILE_ID_FULL_DIR_INFO;
+ else /* no srvino useful for fallback to some netapp */
+ srchinf->info_level = SMB_FIND_FILE_DIRECTORY_INFO;
- srchflgs = CIFS_SEARCH_CLOSE_ALWAYS |
- CIFS_SEARCH_CLOSE_AT_END |
- CIFS_SEARCH_BACKUP_SEARCH;
+ srchflgs = CIFS_SEARCH_CLOSE_ALWAYS |
+ CIFS_SEARCH_CLOSE_AT_END |
+ CIFS_SEARCH_BACKUP_SEARCH;
- rc = CIFSFindFirst(xid, tcon, full_path,
- cifs_sb, NULL, srchflgs, srchinf, false);
- if (!rc) {
- data =
- (FILE_ALL_INFO *)srchinf->srch_entries_start;
+ rc = CIFSFindFirst(xid, tcon, full_path,
+ cifs_sb, NULL, srchflgs, srchinf, false);
+ if (!rc) {
+ data = (FILE_ALL_INFO *)srchinf->srch_entries_start;
- cifs_dir_info_to_fattr(&fattr,
- (FILE_DIRECTORY_INFO *)data, cifs_sb);
- fattr.cf_uniqueid = le64_to_cpu(
- ((SEARCH_ID_FULL_DIR_INFO *)data)->UniqueId);
- validinum = true;
+ cifs_dir_info_to_fattr(&fattr,
+ (FILE_DIRECTORY_INFO *)data, cifs_sb);
+ fattr.cf_uniqueid = le64_to_cpu(
+ ((SEARCH_ID_FULL_DIR_INFO *)data)->UniqueId);
+ validinum = true;
- cifs_buf_release(srchinf->ntwrk_buf_start);
- }
- kfree(srchinf);
- if (rc)
- goto cgii_exit;
+ cifs_buf_release(srchinf->ntwrk_buf_start);
+ }
+ kfree(srchinf);
+ if (rc)
+ goto cgii_exit;
} else
goto cgii_exit;
@@ -1715,6 +1722,10 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry,
if (rc == 0 || rc != -EBUSY)
goto do_rename_exit;
+ /* Don't fall back to using SMB on SMB 2+ mount */
+ if (server->vals->protocol_id != 0)
+ goto do_rename_exit;
+
/* open-file renames don't work across directories */
if (to_dentry->d_parent != from_dentry->d_parent)
goto do_rename_exit;
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index 50559a8..5e75df6 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -494,8 +494,7 @@ is_valid_oplock_break(char *buffer, struct TCP_Server_Info *srv)
CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2,
&pCifsInode->flags);
- queue_work(cifsoplockd_wq,
- &netfile->oplock_break);
+ cifs_queue_oplock_break(netfile);
netfile->oplock_break_cancelled = false;
spin_unlock(&tcon->open_file_lock);
@@ -592,6 +591,28 @@ void cifs_put_writer(struct cifsInodeInfo *cinode)
spin_unlock(&cinode->writers_lock);
}
+/**
+ * cifs_queue_oplock_break - queue the oplock break handler for cfile
+ *
+ * This function is called from the demultiplex thread when it
+ * receives an oplock break for @cfile.
+ *
+ * Assumes the tcon->open_file_lock is held.
+ * Assumes cfile->file_info_lock is NOT held.
+ */
+void cifs_queue_oplock_break(struct cifsFileInfo *cfile)
+{
+ /*
+ * Bump the handle refcount now while we hold the
+ * open_file_lock to enforce the validity of it for the oplock
+ * break handler. The matching put is done at the end of the
+ * handler.
+ */
+ cifsFileInfo_get(cfile);
+
+ queue_work(cifsoplockd_wq, &cfile->oplock_break);
+}
+
void cifs_done_oplock_break(struct cifsInodeInfo *cinode)
{
clear_bit(CIFS_INODE_PENDING_OPLOCK_BREAK, &cinode->flags);
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index efd72e1..f7a9ada 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -305,7 +305,7 @@ coalesce_t2(char *second_buf, struct smb_hdr *target_hdr)
remaining = tgt_total_cnt - total_in_tgt;
if (remaining < 0) {
- cifs_dbg(FYI, "Server sent too much data. tgt_total_cnt=%hu total_in_tgt=%hu\n",
+ cifs_dbg(FYI, "Server sent too much data. tgt_total_cnt=%hu total_in_tgt=%u\n",
tgt_total_cnt, total_in_tgt);
return -EPROTO;
}
diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
index 98c25b9..7e93d57 100644
--- a/fs/cifs/smb2maperror.c
+++ b/fs/cifs/smb2maperror.c
@@ -1034,7 +1034,8 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
{STATUS_UNFINISHED_CONTEXT_DELETED, -EIO,
"STATUS_UNFINISHED_CONTEXT_DELETED"},
{STATUS_NO_TGT_REPLY, -EIO, "STATUS_NO_TGT_REPLY"},
- {STATUS_OBJECTID_NOT_FOUND, -EIO, "STATUS_OBJECTID_NOT_FOUND"},
+ /* Note that ENOATTTR and ENODATA are the same errno */
+ {STATUS_OBJECTID_NOT_FOUND, -ENODATA, "STATUS_OBJECTID_NOT_FOUND"},
{STATUS_NO_IP_ADDRESSES, -EIO, "STATUS_NO_IP_ADDRESSES"},
{STATUS_WRONG_CREDENTIAL_HANDLE, -EIO,
"STATUS_WRONG_CREDENTIAL_HANDLE"},
diff --git a/fs/cifs/smb2misc.c b/fs/cifs/smb2misc.c
index e96a74d..9994d15 100644
--- a/fs/cifs/smb2misc.c
+++ b/fs/cifs/smb2misc.c
@@ -474,7 +474,6 @@ smb2_tcon_has_lease(struct cifs_tcon *tcon, struct smb2_lease_break *rsp,
__u8 lease_state;
struct list_head *tmp;
struct cifsFileInfo *cfile;
- struct TCP_Server_Info *server = tcon->ses->server;
struct cifs_pending_open *open;
struct cifsInodeInfo *cinode;
int ack_req = le32_to_cpu(rsp->Flags &
@@ -494,14 +493,26 @@ smb2_tcon_has_lease(struct cifs_tcon *tcon, struct smb2_lease_break *rsp,
cifs_dbg(FYI, "lease key match, lease break 0x%x\n",
le32_to_cpu(rsp->NewLeaseState));
- server->ops->set_oplock_level(cinode, lease_state, 0, NULL);
-
if (ack_req)
cfile->oplock_break_cancelled = false;
else
cfile->oplock_break_cancelled = true;
- queue_work(cifsoplockd_wq, &cfile->oplock_break);
+ set_bit(CIFS_INODE_PENDING_OPLOCK_BREAK, &cinode->flags);
+
+ /*
+ * Set or clear flags depending on the lease state being READ.
+ * HANDLE caching flag should be added when the client starts
+ * to defer closing remote file handles with HANDLE leases.
+ */
+ if (lease_state & SMB2_LEASE_READ_CACHING_HE)
+ set_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2,
+ &cinode->flags);
+ else
+ clear_bit(CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2,
+ &cinode->flags);
+
+ cifs_queue_oplock_break(cfile);
kfree(lw);
return true;
}
@@ -645,8 +656,8 @@ smb2_is_valid_oplock_break(char *buffer, struct TCP_Server_Info *server)
CIFS_INODE_DOWNGRADE_OPLOCK_TO_L2,
&cinode->flags);
spin_unlock(&cfile->file_info_lock);
- queue_work(cifsoplockd_wq,
- &cfile->oplock_break);
+
+ cifs_queue_oplock_break(cfile);
spin_unlock(&tcon->open_file_lock);
spin_unlock(&cifs_tcp_ses_lock);
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 2db7968..97d8e2a 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -1377,6 +1377,15 @@ smb2_downgrade_oplock(struct TCP_Server_Info *server,
}
static void
+smb21_downgrade_oplock(struct TCP_Server_Info *server,
+ struct cifsInodeInfo *cinode, bool set_level2)
+{
+ server->ops->set_oplock_level(cinode,
+ set_level2 ? SMB2_LEASE_READ_CACHING_HE :
+ 0, 0, NULL);
+}
+
+static void
smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock,
unsigned int epoch, bool *purge_cache)
{
@@ -1681,7 +1690,7 @@ struct smb_version_operations smb21_operations = {
.print_stats = smb2_print_stats,
.is_oplock_break = smb2_is_valid_oplock_break,
.handle_cancelled_mid = smb2_handle_cancelled_mid,
- .downgrade_oplock = smb2_downgrade_oplock,
+ .downgrade_oplock = smb21_downgrade_oplock,
.need_neg = smb2_need_neg,
.negotiate = smb2_negotiate,
.negotiate_wsize = smb2_negotiate_wsize,
@@ -1765,7 +1774,7 @@ struct smb_version_operations smb30_operations = {
.dump_share_caps = smb2_dump_share_caps,
.is_oplock_break = smb2_is_valid_oplock_break,
.handle_cancelled_mid = smb2_handle_cancelled_mid,
- .downgrade_oplock = smb2_downgrade_oplock,
+ .downgrade_oplock = smb21_downgrade_oplock,
.need_neg = smb2_need_neg,
.negotiate = smb2_negotiate,
.negotiate_wsize = smb2_negotiate_wsize,
@@ -1855,7 +1864,7 @@ struct smb_version_operations smb311_operations = {
.dump_share_caps = smb2_dump_share_caps,
.is_oplock_break = smb2_is_valid_oplock_break,
.handle_cancelled_mid = smb2_handle_cancelled_mid,
- .downgrade_oplock = smb2_downgrade_oplock,
+ .downgrade_oplock = smb21_downgrade_oplock,
.need_neg = smb2_need_neg,
.negotiate = smb2_negotiate,
.negotiate_wsize = smb2_negotiate_wsize,
diff --git a/fs/dcache.c b/fs/dcache.c
index 75bbe7d..96c150d 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1522,7 +1522,7 @@ static void check_and_drop(void *_data)
{
struct detach_data *data = _data;
- if (!data->mountpoint && !data->select.found)
+ if (!data->mountpoint && list_empty(&data->select.dispose))
__d_drop(data->select.start);
}
@@ -1564,17 +1564,15 @@ void d_invalidate(struct dentry *dentry)
d_walk(dentry, &data, detach_and_collect, check_and_drop);
- if (data.select.found)
+ if (!list_empty(&data.select.dispose))
shrink_dentry_list(&data.select.dispose);
+ else if (!data.mountpoint)
+ return;
if (data.mountpoint) {
detach_mounts(data.mountpoint);
dput(data.mountpoint);
}
-
- if (!data.mountpoint && !data.select.found)
- break;
-
cond_resched();
}
}
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 108df2e..81be3ba 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -396,6 +396,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
s->s_blocksize_bits = 10;
s->s_magic = DEVPTS_SUPER_MAGIC;
s->s_op = &devpts_sops;
+ s->s_d_op = &simple_dentry_operations;
s->s_time_gran = 1;
error = -ENOMEM;
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 6cb042b..6fcb29b 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -724,7 +724,8 @@ static loff_t ext2_max_size(int bits)
{
loff_t res = EXT2_NDIR_BLOCKS;
int meta_blocks;
- loff_t upper_limit;
+ unsigned int upper_limit;
+ unsigned int ppb = 1 << (bits-2);
/* This is calculated to be the largest file size for a
* dense, file such that the total number of
@@ -738,24 +739,34 @@ static loff_t ext2_max_size(int bits)
/* total blocks in file system block size */
upper_limit >>= (bits - 9);
-
- /* indirect blocks */
- meta_blocks = 1;
- /* double indirect blocks */
- meta_blocks += 1 + (1LL << (bits-2));
- /* tripple indirect blocks */
- meta_blocks += 1 + (1LL << (bits-2)) + (1LL << (2*(bits-2)));
-
- upper_limit -= meta_blocks;
- upper_limit <<= bits;
-
+ /* Compute how many blocks we can address by block tree */
res += 1LL << (bits-2);
res += 1LL << (2*(bits-2));
res += 1LL << (3*(bits-2));
- res <<= bits;
- if (res > upper_limit)
- res = upper_limit;
+ /* Does block tree limit file size? */
+ if (res < upper_limit)
+ goto check_lfs;
+ res = upper_limit;
+ /* How many metadata blocks are needed for addressing upper_limit? */
+ upper_limit -= EXT2_NDIR_BLOCKS;
+ /* indirect blocks */
+ meta_blocks = 1;
+ upper_limit -= ppb;
+ /* double indirect blocks */
+ if (upper_limit < ppb * ppb) {
+ meta_blocks += 1 + DIV_ROUND_UP(upper_limit, ppb);
+ res -= meta_blocks;
+ goto check_lfs;
+ }
+ meta_blocks += 1 + ppb;
+ upper_limit -= ppb * ppb;
+ /* tripple indirect blocks for the rest */
+ meta_blocks += 1 + DIV_ROUND_UP(upper_limit, ppb) +
+ DIV_ROUND_UP(upper_limit, ppb*ppb);
+ res -= meta_blocks;
+check_lfs:
+ res <<= bits;
if (res > MAX_LFS_FILESIZE)
res = MAX_LFS_FILESIZE;
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index f976111..4b7cc1a 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -391,7 +391,7 @@ static inline void ext4_update_inode_fsync_trans(handle_t *handle,
{
struct ext4_inode_info *ei = EXT4_I(inode);
- if (ext4_handle_valid(handle)) {
+ if (ext4_handle_valid(handle) && !is_handle_aborted(handle)) {
ei->i_sync_tid = handle->h_transaction->t_tid;
if (datasync)
ei->i_datasync_tid = handle->h_transaction->t_tid;
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 08fca4a..fe76d09 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -79,7 +79,7 @@ ext4_unaligned_aio(struct inode *inode, struct iov_iter *from, loff_t pos)
struct super_block *sb = inode->i_sb;
int blockmask = sb->s_blocksize - 1;
- if (pos >= i_size_read(inode))
+ if (pos >= ALIGN(i_size_read(inode), sb->s_blocksize))
return 0;
if ((pos | iov_iter_alignment(from)) & blockmask)
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index 58229c1..d2844fe 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -1217,6 +1217,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
ext4_lblk_t offsets[4], offsets2[4];
Indirect chain[4], chain2[4];
Indirect *partial, *partial2;
+ Indirect *p = NULL, *p2 = NULL;
ext4_lblk_t max_block;
__le32 nr = 0, nr2 = 0;
int n = 0, n2 = 0;
@@ -1258,7 +1259,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
}
- partial = ext4_find_shared(inode, n, offsets, chain, &nr);
+ partial = p = ext4_find_shared(inode, n, offsets, chain, &nr);
if (nr) {
if (partial == chain) {
/* Shared branch grows from the inode */
@@ -1283,13 +1284,11 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
partial->p + 1,
(__le32 *)partial->bh->b_data+addr_per_block,
(chain+n-1) - partial);
- BUFFER_TRACE(partial->bh, "call brelse");
- brelse(partial->bh);
partial--;
}
end_range:
- partial2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
+ partial2 = p2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
if (nr2) {
if (partial2 == chain2) {
/*
@@ -1319,16 +1318,14 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
(__le32 *)partial2->bh->b_data,
partial2->p,
(chain2+n2-1) - partial2);
- BUFFER_TRACE(partial2->bh, "call brelse");
- brelse(partial2->bh);
partial2--;
}
goto do_indirects;
}
/* Punch happened within the same level (n == n2) */
- partial = ext4_find_shared(inode, n, offsets, chain, &nr);
- partial2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
+ partial = p = ext4_find_shared(inode, n, offsets, chain, &nr);
+ partial2 = p2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
/* Free top, but only if partial2 isn't its subtree. */
if (nr) {
@@ -1385,11 +1382,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
partial->p + 1,
partial2->p,
(chain+n-1) - partial);
- BUFFER_TRACE(partial->bh, "call brelse");
- brelse(partial->bh);
- BUFFER_TRACE(partial2->bh, "call brelse");
- brelse(partial2->bh);
- return 0;
+ goto cleanup;
}
/*
@@ -1404,8 +1397,6 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
partial->p + 1,
(__le32 *)partial->bh->b_data+addr_per_block,
(chain+n-1) - partial);
- BUFFER_TRACE(partial->bh, "call brelse");
- brelse(partial->bh);
partial--;
}
if (partial2 > chain2 && depth2 <= depth) {
@@ -1413,11 +1404,21 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
(__le32 *)partial2->bh->b_data,
partial2->p,
(chain2+n2-1) - partial2);
- BUFFER_TRACE(partial2->bh, "call brelse");
- brelse(partial2->bh);
partial2--;
}
}
+
+cleanup:
+ while (p && p > chain) {
+ BUFFER_TRACE(p->bh, "call brelse");
+ brelse(p->bh);
+ p--;
+ }
+ while (p2 && p2 > chain2) {
+ BUFFER_TRACE(p2->bh, "call brelse");
+ brelse(p2->bh);
+ p2--;
+ }
return 0;
do_indirects:
@@ -1425,7 +1426,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
switch (offsets[0]) {
default:
if (++n >= n2)
- return 0;
+ break;
nr = i_data[EXT4_IND_BLOCK];
if (nr) {
ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 1);
@@ -1433,7 +1434,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
}
case EXT4_IND_BLOCK:
if (++n >= n2)
- return 0;
+ break;
nr = i_data[EXT4_DIND_BLOCK];
if (nr) {
ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 2);
@@ -1441,7 +1442,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
}
case EXT4_DIND_BLOCK:
if (++n >= n2)
- return 0;
+ break;
nr = i_data[EXT4_TIND_BLOCK];
if (nr) {
ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 3);
@@ -1450,5 +1451,5 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
case EXT4_TIND_BLOCK:
;
}
- return 0;
+ goto cleanup;
}
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index e70500d..f8ce348 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -756,6 +756,13 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
if ((flags & BLKDEV_DISCARD_SECURE) && !blk_queue_secure_erase(q))
return -EOPNOTSUPP;
+ /*
+ * We haven't replayed the journal, so we cannot use our
+ * block-bitmap-guided storage zapping commands.
+ */
+ if (test_opt(sb, NOLOAD) && ext4_has_feature_journal(sb))
+ return -EROFS;
+
if (copy_from_user(&range, (struct fstrim_range __user *)arg,
sizeof(range)))
return -EFAULT;
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 58e6b8a..aef2a24 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -907,11 +907,18 @@ static int add_new_gdb_meta_bg(struct super_block *sb,
memcpy(n_group_desc, o_group_desc,
EXT4_SB(sb)->s_gdb_count * sizeof(struct buffer_head *));
n_group_desc[gdb_num] = gdb_bh;
+
+ BUFFER_TRACE(gdb_bh, "get_write_access");
+ err = ext4_journal_get_write_access(handle, gdb_bh);
+ if (err) {
+ kvfree(n_group_desc);
+ brelse(gdb_bh);
+ return err;
+ }
+
EXT4_SB(sb)->s_group_desc = n_group_desc;
EXT4_SB(sb)->s_gdb_count++;
kvfree(o_group_desc);
- BUFFER_TRACE(gdb_bh, "get_write_access");
- err = ext4_journal_get_write_access(handle, gdb_bh);
return err;
}
@@ -1928,7 +1935,8 @@ int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
le16_to_cpu(es->s_reserved_gdt_blocks);
n_group = n_desc_blocks * EXT4_DESC_PER_BLOCK(sb);
n_blocks_count = (ext4_fsblk_t)n_group *
- EXT4_BLOCKS_PER_GROUP(sb);
+ EXT4_BLOCKS_PER_GROUP(sb) +
+ le32_to_cpu(es->s_first_data_block);
n_group--; /* set to last group number */
}
@@ -2039,6 +2047,10 @@ int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
free_flex_gd(flex_gd);
if (resize_inode != NULL)
iput(resize_inode);
- ext4_msg(sb, KERN_INFO, "resized filesystem to %llu", n_blocks_count);
+ if (err)
+ ext4_warning(sb, "error (%d) occurred during "
+ "file system resize", err);
+ ext4_msg(sb, KERN_INFO, "resized filesystem to %llu",
+ ext4_blocks_count(es));
return err;
}
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index d27d3c4..5029480 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -306,8 +306,9 @@ static int f2fs_write_meta_pages(struct address_space *mapping,
goto skip_write;
/* collect a number of dirty meta pages and write together */
- if (wbc->for_kupdate ||
- get_pages(sbi, F2FS_DIRTY_META) < nr_pages_to_skip(sbi, META))
+ if (wbc->sync_mode != WB_SYNC_ALL &&
+ get_pages(sbi, F2FS_DIRTY_META) <
+ nr_pages_to_skip(sbi, META))
goto skip_write;
/* if locked failed, cp will flush dirty pages instead */
@@ -405,7 +406,7 @@ static int f2fs_set_meta_page_dirty(struct page *page)
if (!PageDirty(page)) {
__set_page_dirty_nobuffers(page);
inc_page_count(F2FS_P_SB(page), F2FS_DIRTY_META);
- SetPagePrivate(page);
+ f2fs_set_page_private(page, 0);
f2fs_trace_pid(page);
return 1;
}
@@ -956,7 +957,7 @@ void f2fs_update_dirty_page(struct inode *inode, struct page *page)
inode_inc_dirty_pages(inode);
spin_unlock(&sbi->inode_lock[type]);
- SetPagePrivate(page);
+ f2fs_set_page_private(page, 0);
f2fs_trace_pid(page);
}
@@ -1259,10 +1260,17 @@ static void update_ckpt_flags(struct f2fs_sb_info *sbi, struct cp_control *cpc)
else
__clear_ckpt_flags(ckpt, CP_DISABLED_FLAG);
+ if (is_sbi_flag_set(sbi, SBI_CP_DISABLED_QUICK))
+ __set_ckpt_flags(ckpt, CP_DISABLED_QUICK_FLAG);
+ else
+ __clear_ckpt_flags(ckpt, CP_DISABLED_QUICK_FLAG);
+
if (is_sbi_flag_set(sbi, SBI_QUOTA_SKIP_FLUSH))
__set_ckpt_flags(ckpt, CP_QUOTA_NEED_FSCK_FLAG);
- else
- __clear_ckpt_flags(ckpt, CP_QUOTA_NEED_FSCK_FLAG);
+ /*
+ * TODO: we count on fsck.f2fs to clear this flag until we figure out
+ * missing cases which clear it incorrectly.
+ */
if (is_sbi_flag_set(sbi, SBI_QUOTA_NEED_REPAIR))
__set_ckpt_flags(ckpt, CP_QUOTA_NEED_FSCK_FLAG);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 22e0c99..a7a8757 100755
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -306,9 +306,10 @@ static inline void __submit_bio(struct f2fs_sb_info *sbi,
for (; start < F2FS_IO_SIZE(sbi); start++) {
struct page *page =
mempool_alloc(sbi->write_io_dummy,
- GFP_NOIO | __GFP_ZERO | __GFP_NOFAIL);
+ GFP_NOIO | __GFP_NOFAIL);
f2fs_bug_on(sbi, !page);
+ zero_user_segment(page, 0, PAGE_SIZE);
SetPagePrivate(page);
set_page_private(page, (unsigned long)DUMMY_WRITTEN_PAGE);
lock_page(page);
@@ -1617,6 +1618,9 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
if (last_block > last_block_in_file)
last_block = last_block_in_file;
+ /* just zeroing out page which is beyond EOF */
+ if (block_in_file >= last_block)
+ goto zero_out;
/*
* Map blocks using the previous result first.
*/
@@ -1629,16 +1633,11 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
* Then do more f2fs_map_blocks() calls until we are
* done with this page.
*/
- map.m_flags = 0;
+ map.m_lblk = block_in_file;
+ map.m_len = last_block - block_in_file;
- if (block_in_file < last_block) {
- map.m_lblk = block_in_file;
- map.m_len = last_block - block_in_file;
-
- if (f2fs_map_blocks(inode, &map, 0,
- F2FS_GET_BLOCK_DEFAULT))
- goto set_error_page;
- }
+ if (f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT))
+ goto set_error_page;
got_it:
if ((map.m_flags & F2FS_MAP_MAPPED)) {
block_nr = map.m_pblk + block_in_file - map.m_lblk;
@@ -1653,6 +1652,7 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
DATA_GENERIC))
goto set_error_page;
} else {
+zero_out:
zero_user_segment(page, 0, PAGE_SIZE);
if (!PageUptodate(page))
SetPageUptodate(page);
@@ -1940,8 +1940,13 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
if (fio->need_lock == LOCK_REQ)
f2fs_unlock_op(fio->sbi);
err = f2fs_inplace_write_data(fio);
- if (err && PageWriteback(page))
- end_page_writeback(page);
+ if (err) {
+ if (f2fs_encrypted_file(inode))
+ fscrypt_pullback_bio_page(&fio->encrypted_page,
+ true);
+ if (PageWriteback(page))
+ end_page_writeback(page);
+ }
trace_f2fs_do_write_data_page(fio->page, IPU);
set_inode_flag(inode, FI_UPDATE_WRITE);
return err;
@@ -2392,7 +2397,8 @@ static void f2fs_write_failed(struct address_space *mapping, loff_t to)
down_write(&F2FS_I(inode)->i_mmap_sem);
truncate_pagecache(inode, i_size);
- f2fs_truncate_blocks(inode, i_size, true, true);
+ if (!IS_NOQUOTA(inode))
+ f2fs_truncate_blocks(inode, i_size, true);
up_write(&F2FS_I(inode)->i_mmap_sem);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
@@ -2673,14 +2679,11 @@ static void f2fs_dio_submit_bio(struct bio *bio, struct inode *inode,
{
struct f2fs_private_dio *dio;
bool write = (bio_op(bio) == REQ_OP_WRITE);
- int err;
dio = f2fs_kzalloc(F2FS_I_SB(inode),
sizeof(struct f2fs_private_dio), GFP_NOFS);
- if (!dio) {
- err = -ENOMEM;
+ if (!dio)
goto out;
- }
dio->inode = inode;
dio->orig_end_io = bio->bi_end_io;
@@ -2826,12 +2829,10 @@ void f2fs_invalidate_page(struct page *page, unsigned int offset,
clear_cold_data(page);
- /* This is atomic written page, keep Private */
if (IS_ATOMIC_WRITTEN_PAGE(page))
return f2fs_drop_inmem_page(inode, page);
- set_page_private(page, 0);
- ClearPagePrivate(page);
+ f2fs_clear_page_private(page);
}
int f2fs_release_page(struct page *page, gfp_t wait)
@@ -2845,8 +2846,7 @@ int f2fs_release_page(struct page *page, gfp_t wait)
return 0;
clear_cold_data(page);
- set_page_private(page, 0);
- ClearPagePrivate(page);
+ f2fs_clear_page_private(page);
return 1;
}
@@ -2914,12 +2914,8 @@ int f2fs_migrate_page(struct address_space *mapping,
return -EAGAIN;
}
- /*
- * A reference is expected if PagePrivate set when move mapping,
- * however F2FS breaks this for maintaining dirty page counts when
- * truncating pages. So here adjusting the 'extra_count' make it work.
- */
- extra_count = (atomic_written ? 1 : 0) - page_has_private(page);
+ /* one extra reference was held for atomic_write page */
+ extra_count = atomic_written ? 1 : 0;
rc = migrate_page_move_mapping(mapping, newpage,
page, NULL, mode, extra_count);
if (rc != MIGRATEPAGE_SUCCESS) {
@@ -2940,9 +2936,10 @@ int f2fs_migrate_page(struct address_space *mapping,
get_page(newpage);
}
- if (PagePrivate(page))
- SetPagePrivate(newpage);
- set_page_private(newpage, page_private(page));
+ if (PagePrivate(page)) {
+ f2fs_set_page_private(newpage, page_private(page));
+ f2fs_clear_page_private(page);
+ }
migrate_page_copy(newpage, page);
diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index f05b37e..d00ba9b 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -522,30 +522,16 @@ void f2fs_destroy_stats(struct f2fs_sb_info *sbi)
kvfree(si);
}
-int __init f2fs_create_root_stats(void)
+void __init f2fs_create_root_stats(void)
{
- struct dentry *file;
-
f2fs_debugfs_root = debugfs_create_dir("f2fs", NULL);
- if (!f2fs_debugfs_root)
- return -ENOMEM;
- file = debugfs_create_file("status", S_IRUGO, f2fs_debugfs_root,
- NULL, &stat_fops);
- if (!file) {
- debugfs_remove(f2fs_debugfs_root);
- f2fs_debugfs_root = NULL;
- return -ENOMEM;
- }
-
- return 0;
+ debugfs_create_file("status", S_IRUGO, f2fs_debugfs_root, NULL,
+ &stat_fops);
}
void f2fs_destroy_root_stats(void)
{
- if (!f2fs_debugfs_root)
- return;
-
debugfs_remove_recursive(f2fs_debugfs_root);
f2fs_debugfs_root = NULL;
}
diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index 8fd0360..9ef8b10 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -728,7 +728,7 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,
!f2fs_truncate_hole(dir, page->index, page->index + 1)) {
f2fs_clear_radix_tree_dirty_tag(page);
clear_page_dirty_for_io(page);
- ClearPagePrivate(page);
+ f2fs_clear_page_private(page);
ClearPageUptodate(page);
clear_cold_data(page);
inode_dec_dirty_pages(dir);
@@ -800,6 +800,10 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct f2fs_dentry_ptr *d,
if (de->name_len == 0) {
bit_pos++;
ctx->pos = start_pos + bit_pos;
+ printk_ratelimited(
+ "%s, invalid namelen(0), ino:%u, run fsck to fix.",
+ KERN_WARNING, le32_to_cpu(de->ino));
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
continue;
}
@@ -810,7 +814,8 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct f2fs_dentry_ptr *d,
/* check memory boundary before moving forward */
bit_pos += GET_DENTRY_SLOTS(le16_to_cpu(de->name_len));
- if (unlikely(bit_pos > d->max)) {
+ if (unlikely(bit_pos > d->max ||
+ le16_to_cpu(de->name_len) > F2FS_NAME_LEN)) {
f2fs_msg(sbi->sb, KERN_WARNING,
"%s: corrupted namelen=%d, run fsck to fix.",
__func__, le16_to_cpu(de->name_len));
@@ -891,7 +896,7 @@ static int f2fs_readdir(struct file *file, struct dir_context *ctx)
page_cache_sync_readahead(inode->i_mapping, ra, file, n,
min(npages - n, (pgoff_t)MAX_DIR_RA_PAGES));
- dentry_page = f2fs_get_lock_data_page(inode, n, false);
+ dentry_page = f2fs_find_data_page(inode, n);
if (IS_ERR(dentry_page)) {
err = PTR_ERR(dentry_page);
if (err == -ENOENT) {
@@ -909,11 +914,11 @@ static int f2fs_readdir(struct file *file, struct dir_context *ctx)
err = f2fs_fill_dentries(ctx, &d,
n * NR_DENTRY_IN_BLOCK, &fstr);
if (err) {
- f2fs_put_page(dentry_page, 1);
+ f2fs_put_page(dentry_page, 0);
break;
}
- f2fs_put_page(dentry_page, 1);
+ f2fs_put_page(dentry_page, 0);
}
out_free:
fscrypt_fname_free_buffer(&fstr);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index a434980..5a1d95e 100755
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -193,6 +193,8 @@ enum {
#define DEF_CP_INTERVAL 60 /* 60 secs */
#define DEF_IDLE_INTERVAL 5 /* 5 secs */
#define DEF_DISABLE_INTERVAL 5 /* 5 secs */
+#define DEF_DISABLE_QUICK_INTERVAL 1 /* 1 secs */
+#define DEF_UMOUNT_DISCARD_TIMEOUT 5 /* 5 secs */
struct cp_control {
int reason;
@@ -256,7 +258,7 @@ struct discard_entry {
/* max discard pend list number */
#define MAX_PLIST_NUM 512
#define plist_idx(blk_num) ((blk_num) >= MAX_PLIST_NUM ? \
- (MAX_PLIST_NUM - 1) : (blk_num - 1))
+ (MAX_PLIST_NUM - 1) : ((blk_num) - 1))
enum {
D_PREP, /* initial */
@@ -312,6 +314,7 @@ struct discard_policy {
bool sync; /* submit discard with REQ_SYNC flag */
bool ordered; /* issue discard by lba order */
unsigned int granularity; /* discard granularity */
+ int timeout; /* discard timeout for put_super */
};
struct discard_cmd_control {
@@ -455,7 +458,6 @@ struct f2fs_flush_device {
/* for inline stuff */
#define DEF_INLINE_RESERVED_SIZE 1
-#define DEF_MIN_INLINE_SIZE 1
static inline int get_extra_isize(struct inode *inode);
static inline int get_inline_xattr_addrs(struct inode *inode);
#define MAX_INLINE_DATA(inode) (sizeof(__le32) * \
@@ -1098,6 +1100,7 @@ enum {
SBI_IS_SHUTDOWN, /* shutdown by ioctl */
SBI_IS_RECOVERED, /* recovered orphan/data */
SBI_CP_DISABLED, /* CP was disabled last mount */
+ SBI_CP_DISABLED_QUICK, /* CP was disabled quickly */
SBI_QUOTA_NEED_FLUSH, /* need to flush quota info in CP */
SBI_QUOTA_SKIP_FLUSH, /* skip flushing quota in current CP */
SBI_QUOTA_NEED_REPAIR, /* quota file may be corrupted */
@@ -1109,6 +1112,7 @@ enum {
DISCARD_TIME,
GC_TIME,
DISABLE_TIME,
+ UMOUNT_DISCARD_TIMEOUT,
MAX_TIME,
};
@@ -1237,8 +1241,6 @@ struct f2fs_sb_info {
unsigned int nquota_files; /* # of quota sysfile */
- u32 s_next_generation; /* for NFS support */
-
/* # of pages, see count_type */
atomic_t nr_pages[NR_COUNT_TYPE];
/* # of allocated blocks */
@@ -1798,13 +1800,12 @@ static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type)
{
atomic_inc(&sbi->nr_pages[count_type]);
- if (count_type == F2FS_DIRTY_DATA || count_type == F2FS_INMEM_PAGES ||
- count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA ||
- count_type == F2FS_RD_DATA || count_type == F2FS_RD_NODE ||
- count_type == F2FS_RD_META)
- return;
-
- set_sbi_flag(sbi, SBI_IS_DIRTY);
+ if (count_type == F2FS_DIRTY_DENTS ||
+ count_type == F2FS_DIRTY_NODES ||
+ count_type == F2FS_DIRTY_META ||
+ count_type == F2FS_DIRTY_QDATA ||
+ count_type == F2FS_DIRTY_IMETA)
+ set_sbi_flag(sbi, SBI_IS_DIRTY);
}
static inline void inode_inc_dirty_pages(struct inode *inode)
@@ -2156,10 +2157,17 @@ static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
get_pages(sbi, F2FS_RD_META) || get_pages(sbi, F2FS_WB_DATA) ||
get_pages(sbi, F2FS_WB_CP_DATA) ||
get_pages(sbi, F2FS_DIO_READ) ||
- get_pages(sbi, F2FS_DIO_WRITE) ||
- atomic_read(&SM_I(sbi)->dcc_info->queued_discard) ||
- atomic_read(&SM_I(sbi)->fcc_info->queued_flush))
+ get_pages(sbi, F2FS_DIO_WRITE))
return false;
+
+ if (SM_I(sbi) && SM_I(sbi)->dcc_info &&
+ atomic_read(&SM_I(sbi)->dcc_info->queued_discard))
+ return false;
+
+ if (SM_I(sbi) && SM_I(sbi)->fcc_info &&
+ atomic_read(&SM_I(sbi)->fcc_info->queued_flush))
+ return false;
+
return f2fs_time_over(sbi, type);
}
@@ -2300,11 +2308,12 @@ static inline void f2fs_change_bit(unsigned int nr, char *addr)
#define F2FS_EXTENTS_FL 0x00080000 /* Inode uses extents */
#define F2FS_EA_INODE_FL 0x00200000 /* Inode used for large EA */
#define F2FS_EOFBLOCKS_FL 0x00400000 /* Blocks allocated beyond EOF */
+#define F2FS_NOCOW_FL 0x00800000 /* Do not cow file */
#define F2FS_INLINE_DATA_FL 0x10000000 /* Inode has inline data. */
#define F2FS_PROJINHERIT_FL 0x20000000 /* Create with parents projid */
#define F2FS_RESERVED_FL 0x80000000 /* reserved for ext4 lib */
-#define F2FS_FL_USER_VISIBLE 0x304BDFFF /* User visible flags */
+#define F2FS_FL_USER_VISIBLE 0x30CBDFFF /* User visible flags */
#define F2FS_FL_USER_MODIFIABLE 0x204BC0FF /* User modifiable flags */
/* Flags we can manipulate with through F2FS_IOC_FSSETXATTR */
@@ -2791,9 +2800,9 @@ static inline int get_inline_xattr_addrs(struct inode *inode)
#define F2FS_OLD_ATTRIBUTE_SIZE (offsetof(struct f2fs_inode, i_addr))
#define F2FS_FITS_IN_INODE(f2fs_inode, extra_isize, field) \
- ((offsetof(typeof(*f2fs_inode), field) + \
+ ((offsetof(typeof(*(f2fs_inode)), field) + \
sizeof((f2fs_inode)->field)) \
- <= (F2FS_OLD_ATTRIBUTE_SIZE + extra_isize)) \
+ <= (F2FS_OLD_ATTRIBUTE_SIZE + (extra_isize))) \
static inline void f2fs_reset_iostat(struct f2fs_sb_info *sbi)
{
@@ -2822,8 +2831,8 @@ static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi,
#define __is_large_section(sbi) ((sbi)->segs_per_sec > 1)
-#define __is_meta_io(fio) (PAGE_TYPE_OF_BIO(fio->type) == META && \
- (!is_read_io(fio->op) || fio->is_meta))
+#define __is_meta_io(fio) (PAGE_TYPE_OF_BIO((fio)->type) == META && \
+ (!is_read_io((fio)->op) || (fio)->is_meta))
bool f2fs_is_valid_blkaddr(struct f2fs_sb_info *sbi,
block_t blkaddr, int type);
@@ -2855,13 +2864,33 @@ static inline bool is_valid_data_blkaddr(struct f2fs_sb_info *sbi,
return true;
}
+static inline void f2fs_set_page_private(struct page *page,
+ unsigned long data)
+{
+ if (PagePrivate(page))
+ return;
+
+ get_page(page);
+ SetPagePrivate(page);
+ set_page_private(page, data);
+}
+
+static inline void f2fs_clear_page_private(struct page *page)
+{
+ if (!PagePrivate(page))
+ return;
+
+ set_page_private(page, 0);
+ ClearPagePrivate(page);
+ f2fs_put_page(page, 0);
+}
+
/*
* file.c
*/
int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync);
void f2fs_truncate_data_blocks(struct dnode_of_data *dn);
-int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock,
- bool buf_write);
+int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock);
int f2fs_truncate(struct inode *inode);
int f2fs_getattr(struct vfsmount *mnt, struct dentry *dentry,
struct kstat *stat);
@@ -3034,7 +3063,7 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr);
bool f2fs_is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr);
void f2fs_drop_discard_cmd(struct f2fs_sb_info *sbi);
void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi);
-bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi);
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi);
void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
struct cp_control *cpc);
void f2fs_dirty_to_prefree(struct f2fs_sb_info *sbi);
@@ -3356,7 +3385,7 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
int f2fs_build_stats(struct f2fs_sb_info *sbi);
void f2fs_destroy_stats(struct f2fs_sb_info *sbi);
-int __init f2fs_create_root_stats(void);
+void __init f2fs_create_root_stats(void);
void f2fs_destroy_root_stats(void);
#else
#define stat_inc_cp_count(si) do { } while (0)
@@ -3394,7 +3423,7 @@ void f2fs_destroy_root_stats(void);
static inline int f2fs_build_stats(struct f2fs_sb_info *sbi) { return 0; }
static inline void f2fs_destroy_stats(struct f2fs_sb_info *sbi) { }
-static inline int __init f2fs_create_root_stats(void) { return 0; }
+static inline void __init f2fs_create_root_stats(void) { }
static inline void f2fs_destroy_root_stats(void) { }
#endif
@@ -3654,8 +3683,6 @@ extern void f2fs_build_fault_attr(struct f2fs_sb_info *sbi, unsigned int rate,
#define f2fs_build_fault_attr(sbi, rate, type) do { } while (0)
#endif
-#endif
-
static inline bool is_journalled_quota(struct f2fs_sb_info *sbi)
{
#ifdef CONFIG_QUOTA
@@ -3668,3 +3695,5 @@ static inline bool is_journalled_quota(struct f2fs_sb_info *sbi)
#endif
return false;
}
+
+#endif
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index d350497..6633e84 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -591,8 +591,7 @@ static int truncate_partial_data_page(struct inode *inode, u64 from,
return 0;
}
-int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock,
- bool buf_write)
+int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct dnode_of_data dn;
@@ -600,7 +599,6 @@ int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock,
int count = 0, err = 0;
struct page *ipage;
bool truncate_page = false;
- int flag = buf_write ? F2FS_GET_BLOCK_PRE_AIO : F2FS_GET_BLOCK_PRE_DIO;
trace_f2fs_truncate_blocks_enter(inode, from);
@@ -610,7 +608,7 @@ int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock,
goto free_partial;
if (lock)
- __do_map_lock(sbi, flag, true);
+ f2fs_lock_op(sbi);
ipage = f2fs_get_node_page(sbi, inode->i_ino);
if (IS_ERR(ipage)) {
@@ -648,7 +646,7 @@ int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock,
err = f2fs_truncate_inode_blocks(inode, free_from);
out:
if (lock)
- __do_map_lock(sbi, flag, false);
+ f2fs_unlock_op(sbi);
free_partial:
/* lastly zero out the first data page */
if (!err)
@@ -683,7 +681,7 @@ int f2fs_truncate(struct inode *inode)
return err;
}
- err = f2fs_truncate_blocks(inode, i_size_read(inode), true, false);
+ err = f2fs_truncate_blocks(inode, i_size_read(inode), true);
if (err)
return err;
@@ -771,7 +769,6 @@ int f2fs_setattr(struct dentry *dentry, struct iattr *attr)
{
struct inode *inode = d_inode(dentry);
int err;
- bool size_changed = false;
if (unlikely(f2fs_cp_error(F2FS_I_SB(inode))))
return -EIO;
@@ -846,8 +843,6 @@ int f2fs_setattr(struct dentry *dentry, struct iattr *attr)
down_write(&F2FS_I(inode)->i_sem);
F2FS_I(inode)->last_disk_size = i_size_read(inode);
up_write(&F2FS_I(inode)->i_sem);
-
- size_changed = true;
}
__setattr_copy(inode, attr);
@@ -861,7 +856,7 @@ int f2fs_setattr(struct dentry *dentry, struct iattr *attr)
}
/* file size may changed here */
- f2fs_mark_inode_dirty_sync(inode, size_changed);
+ f2fs_mark_inode_dirty_sync(inode, true);
/* inode change will produce dirty node pages flushed by checkpoint */
f2fs_balance_fs(F2FS_I_SB(inode), true);
@@ -1265,7 +1260,7 @@ static int f2fs_collapse_range(struct inode *inode, loff_t offset, loff_t len)
new_size = i_size_read(inode) - len;
truncate_pagecache(inode, new_size);
- ret = f2fs_truncate_blocks(inode, new_size, true, false);
+ ret = f2fs_truncate_blocks(inode, new_size, true);
up_write(&F2FS_I(inode)->i_mmap_sem);
if (!ret)
f2fs_i_size_write(inode, new_size);
@@ -1450,7 +1445,7 @@ static int f2fs_insert_range(struct inode *inode, loff_t offset, loff_t len)
f2fs_balance_fs(sbi, true);
down_write(&F2FS_I(inode)->i_mmap_sem);
- ret = f2fs_truncate_blocks(inode, i_size_read(inode), true, false);
+ ret = f2fs_truncate_blocks(inode, i_size_read(inode), true);
up_write(&F2FS_I(inode)->i_mmap_sem);
if (ret)
return ret;
@@ -1654,6 +1649,8 @@ static int f2fs_ioc_getflags(struct file *filp, unsigned long arg)
flags |= F2FS_ENCRYPT_FL;
if (f2fs_has_inline_data(inode) || f2fs_has_inline_dentry(inode))
flags |= F2FS_INLINE_DATA_FL;
+ if (is_inode_flag_set(inode, FI_PIN_FILE))
+ flags |= F2FS_NOCOW_FL;
flags &= F2FS_FL_USER_VISIBLE;
@@ -1966,11 +1963,11 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
break;
case F2FS_GOING_DOWN_NEED_FSCK:
set_sbi_flag(sbi, SBI_NEED_FSCK);
+ set_sbi_flag(sbi, SBI_CP_DISABLED_QUICK);
+ set_sbi_flag(sbi, SBI_IS_DIRTY);
/* do checkpoint only */
ret = f2fs_sync_fs(sb, 1);
- if (ret)
- goto out;
- break;
+ goto out;
default:
ret = -EINVAL;
goto out;
@@ -1986,6 +1983,9 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
out:
if (in != F2FS_GOING_DOWN_FULLSYNC)
mnt_drop_write_file(filp);
+
+ trace_f2fs_shutdown(sbi, in, ret);
+
return ret;
}
@@ -2649,8 +2649,8 @@ static int f2fs_ioc_set_pin_file(struct file *filp, unsigned long arg)
__u32 pin;
int ret = 0;
- if (!inode_owner_or_capable(inode))
- return -EACCES;
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
if (get_user(pin, (__u32 __user *)arg))
return -EFAULT;
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index b8676a2..8422a13 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -316,7 +316,7 @@ bool f2fs_recover_inline_data(struct inode *inode, struct page *npage)
clear_inode_flag(inode, FI_INLINE_DATA);
f2fs_put_page(ipage, 1);
} else if (ri && (ri->i_inline & F2FS_INLINE_DATA)) {
- if (f2fs_truncate_blocks(inode, 0, false, false))
+ if (f2fs_truncate_blocks(inode, 0, false))
return false;
goto process_inline;
}
@@ -488,7 +488,7 @@ static int f2fs_add_inline_entries(struct inode *dir, void *inline_dentry)
return 0;
punch_dentry_pages:
truncate_inode_pages(&dir->i_data, 0);
- f2fs_truncate_blocks(dir, 0, false, false);
+ f2fs_truncate_blocks(dir, 0, false);
f2fs_remove_dirty_inode(dir);
return err;
}
@@ -677,6 +677,12 @@ int f2fs_read_inline_dir(struct file *file, struct dir_context *ctx,
if (IS_ERR(ipage))
return PTR_ERR(ipage);
+ /*
+ * f2fs_readdir was protected by inode.i_rwsem, it is safe to access
+ * ipage without page's lock held.
+ */
+ unlock_page(ipage);
+
inline_dentry = inline_data_addr(inode, ipage);
make_dentry_ptr_inline(inode, &d, inline_dentry);
@@ -685,7 +691,7 @@ int f2fs_read_inline_dir(struct file *file, struct dir_context *ctx,
if (!err)
ctx->pos = d.max;
- f2fs_put_page(ipage, 1);
+ f2fs_put_page(ipage, 0);
return err < 0 ? err : 0;
}
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index bec5296..d44e6de 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -14,6 +14,7 @@
#include "f2fs.h"
#include "node.h"
#include "segment.h"
+#include "xattr.h"
#include <trace/events/f2fs.h>
@@ -248,6 +249,20 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page)
return false;
}
+ if (f2fs_has_extra_attr(inode) &&
+ f2fs_sb_has_flexible_inline_xattr(sbi) &&
+ f2fs_has_inline_xattr(inode) &&
+ (!fi->i_inline_xattr_size ||
+ fi->i_inline_xattr_size > MAX_INLINE_XATTR_SIZE)) {
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+ f2fs_msg(sbi->sb, KERN_WARNING,
+ "%s: inode (ino=%lx) has corrupted "
+ "i_inline_xattr_size: %d, max: %zu",
+ __func__, inode->i_ino, fi->i_inline_xattr_size,
+ MAX_INLINE_XATTR_SIZE);
+ return false;
+ }
+
if (F2FS_I(inode)->extent_tree) {
struct extent_info *ei = &F2FS_I(inode)->extent_tree->largest;
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 98fe5a7..331ff8b 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -10,6 +10,7 @@
#include <linux/pagemap.h>
#include <linux/sched.h>
#include <linux/ctype.h>
+#include <linux/random.h>
#include <linux/dcache.h>
#include <linux/namei.h>
#include <linux/quotaops.h>
@@ -50,7 +51,7 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode)
inode->i_blocks = 0;
inode->i_mtime = inode->i_atime = inode->i_ctime =
F2FS_I(inode)->i_crtime = current_time(inode);
- inode->i_generation = sbi->s_next_generation++;
+ inode->i_generation = prandom_u32();
if (S_ISDIR(inode->i_mode))
F2FS_I(inode)->i_current_depth = 1;
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 46fdaeb..93e9d42 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1922,7 +1922,9 @@ static int f2fs_write_node_pages(struct address_space *mapping,
f2fs_balance_fs_bg(sbi);
/* collect a number of dirty node pages and write together */
- if (get_pages(sbi, F2FS_DIRTY_NODES) < nr_pages_to_skip(sbi, NODE))
+ if (wbc->sync_mode != WB_SYNC_ALL &&
+ get_pages(sbi, F2FS_DIRTY_NODES) <
+ nr_pages_to_skip(sbi, NODE))
goto skip_write;
if (wbc->sync_mode == WB_SYNC_ALL)
@@ -1961,7 +1963,7 @@ static int f2fs_set_node_page_dirty(struct page *page)
if (!PageDirty(page)) {
__set_page_dirty_nobuffers(page);
inc_page_count(F2FS_P_SB(page), F2FS_DIRTY_NODES);
- SetPagePrivate(page);
+ f2fs_set_page_private(page, 0);
f2fs_trace_pid(page);
return 1;
}
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index eaf7be5..696da20 100755
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -191,8 +191,7 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
f2fs_trace_pid(page);
- set_page_private(page, (unsigned long)ATOMIC_WRITTEN_PAGE);
- SetPagePrivate(page);
+ f2fs_set_page_private(page, (unsigned long)ATOMIC_WRITTEN_PAGE);
new = f2fs_kmem_cache_alloc(inmem_entry_slab, GFP_NOFS);
@@ -215,7 +214,8 @@ void f2fs_register_inmem_page(struct inode *inode, struct page *page)
}
static int __revoke_inmem_pages(struct inode *inode,
- struct list_head *head, bool drop, bool recover)
+ struct list_head *head, bool drop, bool recover,
+ bool trylock)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct inmem_pages *cur, *tmp;
@@ -227,7 +227,16 @@ static int __revoke_inmem_pages(struct inode *inode,
if (drop)
trace_f2fs_commit_inmem_page(page, INMEM_DROP);
- lock_page(page);
+ if (trylock) {
+ /*
+ * to avoid deadlock in between page lock and
+ * inmem_lock.
+ */
+ if (!trylock_page(page))
+ continue;
+ } else {
+ lock_page(page);
+ }
f2fs_wait_on_page_writeback(page, DATA, true, true);
@@ -270,8 +279,7 @@ static int __revoke_inmem_pages(struct inode *inode,
ClearPageUptodate(page);
clear_cold_data(page);
}
- set_page_private(page, 0);
- ClearPagePrivate(page);
+ f2fs_clear_page_private(page);
f2fs_put_page(page, 1);
list_del(&cur->list);
@@ -318,13 +326,19 @@ void f2fs_drop_inmem_pages(struct inode *inode)
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct f2fs_inode_info *fi = F2FS_I(inode);
- mutex_lock(&fi->inmem_lock);
- __revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
- if (!list_empty(&fi->inmem_ilist))
- list_del_init(&fi->inmem_ilist);
- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
- mutex_unlock(&fi->inmem_lock);
+ while (!list_empty(&fi->inmem_pages)) {
+ mutex_lock(&fi->inmem_lock);
+ __revoke_inmem_pages(inode, &fi->inmem_pages,
+ true, false, true);
+
+ if (list_empty(&fi->inmem_pages)) {
+ spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+ if (!list_empty(&fi->inmem_ilist))
+ list_del_init(&fi->inmem_ilist);
+ spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+ }
+ mutex_unlock(&fi->inmem_lock);
+ }
clear_inode_flag(inode, FI_ATOMIC_FILE);
fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
@@ -354,8 +368,7 @@ void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
kmem_cache_free(inmem_entry_slab, cur);
ClearPageUptodate(page);
- set_page_private(page, 0);
- ClearPagePrivate(page);
+ f2fs_clear_page_private(page);
f2fs_put_page(page, 0);
trace_f2fs_commit_inmem_page(page, INMEM_INVALIDATE);
@@ -429,12 +442,15 @@ static int __f2fs_commit_inmem_pages(struct inode *inode)
* recovery or rewrite & commit last transaction. For other
* error number, revoking was done by filesystem itself.
*/
- err = __revoke_inmem_pages(inode, &revoke_list, false, true);
+ err = __revoke_inmem_pages(inode, &revoke_list,
+ false, true, false);
/* drop all uncommitted pages */
- __revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
+ __revoke_inmem_pages(inode, &fi->inmem_pages,
+ true, false, false);
} else {
- __revoke_inmem_pages(inode, &revoke_list, false, false);
+ __revoke_inmem_pages(inode, &revoke_list,
+ false, false, false);
}
return err;
@@ -542,9 +558,13 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
static int __submit_flush_wait(struct f2fs_sb_info *sbi,
struct block_device *bdev)
{
- struct bio *bio = f2fs_bio_alloc(sbi, 0, true);
+ struct bio *bio;
int ret;
+ bio = f2fs_bio_alloc(sbi, 0, false);
+ if (!bio)
+ return -ENOMEM;
+
bio->bi_opf = REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH;
bio->bi_bdev = bdev;
ret = submit_bio_wait(bio);
@@ -868,6 +888,9 @@ int f2fs_disable_cp_again(struct f2fs_sb_info *sbi)
if (holes[DATA] > ovp || holes[NODE] > ovp)
return -EAGAIN;
+ if (is_sbi_flag_set(sbi, SBI_CP_DISABLED_QUICK) &&
+ dirty_segments(sbi) > overprovision_segments(sbi))
+ return -EAGAIN;
return 0;
}
@@ -1036,6 +1059,7 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
dpolicy->max_requests = DEF_MAX_DISCARD_REQUEST;
dpolicy->io_aware_gran = MAX_PLIST_NUM;
+ dpolicy->timeout = 0;
if (discard_type == DPOLICY_BG) {
dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
@@ -1058,6 +1082,8 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
} else if (discard_type == DPOLICY_UMOUNT) {
dpolicy->max_requests = UINT_MAX;
dpolicy->io_aware = false;
+ /* we need to issue all to keep CP_TRIMMED_FLAG */
+ dpolicy->granularity = 1;
}
}
@@ -1420,7 +1446,14 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
int i, issued = 0;
bool io_interrupted = false;
+ if (dpolicy->timeout != 0)
+ f2fs_update_time(sbi, dpolicy->timeout);
+
for (i = MAX_PLIST_NUM - 1; i >= 0; i--) {
+ if (dpolicy->timeout != 0 &&
+ f2fs_time_over(sbi, dpolicy->timeout))
+ break;
+
if (i + 1 < dpolicy->granularity)
break;
@@ -1607,7 +1640,7 @@ void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi)
}
/* This comes from f2fs_put_super */
-bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi)
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi)
{
struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
struct discard_policy dpolicy;
@@ -1615,6 +1648,7 @@ bool f2fs_wait_discard_bios(struct f2fs_sb_info *sbi)
__init_discard_policy(sbi, &dpolicy, DPOLICY_UMOUNT,
dcc->discard_granularity);
+ dpolicy.timeout = UMOUNT_DISCARD_TIMEOUT;
__issue_discard_cmd(sbi, &dpolicy);
dropped = __drop_discard_cmd(sbi);
@@ -3161,10 +3195,10 @@ int f2fs_inplace_write_data(struct f2fs_io_info *fio)
stat_inc_inplace_blocks(fio->sbi);
err = f2fs_submit_page_bio(fio);
- if (!err)
+ if (!err) {
update_device_state(fio);
-
- f2fs_update_iostat(fio->sbi, fio->io_type, F2FS_BLKSIZE);
+ f2fs_update_iostat(fio->sbi, fio->io_type, F2FS_BLKSIZE);
+ }
return err;
}
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index a77f76f..5c7ed04 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -865,7 +865,7 @@ static inline void wake_up_discard_thread(struct f2fs_sb_info *sbi, bool force)
}
}
mutex_unlock(&dcc->cmd_lock);
- if (!wakeup)
+ if (!wakeup || !is_idle(sbi, DISCARD_TIME))
return;
wake_up:
dcc->discard_wake = 1;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 3e0250b..e1446a5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -269,7 +269,7 @@ static int f2fs_set_qf_name(struct super_block *sb, int qtype,
if (!qname) {
f2fs_msg(sb, KERN_ERR,
"Not enough memory for storing quotafile name");
- return -EINVAL;
+ return -ENOMEM;
}
if (F2FS_OPTION(sbi).s_qf_names[qtype]) {
if (strcmp(F2FS_OPTION(sbi).s_qf_names[qtype], qname) == 0)
@@ -586,7 +586,7 @@ static int parse_options(struct super_block *sb, char *options)
case Opt_io_size_bits:
if (args->from && match_int(args, &arg))
return -EINVAL;
- if (arg > __ilog2_u32(BIO_MAX_PAGES)) {
+ if (arg <= 0 || arg > __ilog2_u32(BIO_MAX_PAGES)) {
f2fs_msg(sb, KERN_WARNING,
"Not support %d, larger than %d",
1 << arg, BIO_MAX_PAGES);
@@ -821,6 +821,8 @@ static int parse_options(struct super_block *sb, char *options)
}
if (test_opt(sbi, INLINE_XATTR_SIZE)) {
+ int min_size, max_size;
+
if (!f2fs_sb_has_extra_attr(sbi) ||
!f2fs_sb_has_flexible_inline_xattr(sbi)) {
f2fs_msg(sb, KERN_ERR,
@@ -834,14 +836,15 @@ static int parse_options(struct super_block *sb, char *options)
"set with inline_xattr option");
return -EINVAL;
}
- if (!F2FS_OPTION(sbi).inline_xattr_size ||
- F2FS_OPTION(sbi).inline_xattr_size >=
- DEF_ADDRS_PER_INODE -
- F2FS_TOTAL_EXTRA_ATTR_SIZE -
- DEF_INLINE_RESERVED_SIZE -
- DEF_MIN_INLINE_SIZE) {
+
+ min_size = sizeof(struct f2fs_xattr_header) / sizeof(__le32);
+ max_size = MAX_INLINE_XATTR_SIZE;
+
+ if (F2FS_OPTION(sbi).inline_xattr_size < min_size ||
+ F2FS_OPTION(sbi).inline_xattr_size > max_size) {
f2fs_msg(sb, KERN_ERR,
- "inline xattr size is out of range");
+ "inline xattr size is out of range: %d ~ %d",
+ min_size, max_size);
return -EINVAL;
}
}
@@ -915,6 +918,10 @@ static int f2fs_drop_inode(struct inode *inode)
sb_start_intwrite(inode->i_sb);
f2fs_i_size_write(inode, 0);
+ f2fs_submit_merged_write_cond(F2FS_I_SB(inode),
+ inode, NULL, 0, DATA);
+ truncate_inode_pages_final(inode->i_mapping);
+
if (F2FS_HAS_BLOCKS(inode))
f2fs_truncate(inode);
@@ -1048,7 +1055,7 @@ static void f2fs_put_super(struct super_block *sb)
}
/* be sure to wait for any on-going discard commands */
- dropped = f2fs_wait_discard_bios(sbi);
+ dropped = f2fs_issue_discard_timeout(sbi);
if ((f2fs_hw_support_discard(sbi) || f2fs_hw_should_discard(sbi)) &&
!sbi->discard_blks && !dropped) {
@@ -1458,9 +1465,16 @@ static int f2fs_enable_quotas(struct super_block *sb);
static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
{
+ unsigned int s_flags = sbi->sb->s_flags;
struct cp_control cpc;
- int err;
+ int err = 0;
+ int ret;
+ if (s_flags & MS_RDONLY) {
+ f2fs_msg(sbi->sb, KERN_ERR,
+ "checkpoint=disable on readonly fs");
+ return -EINVAL;
+ }
sbi->sb->s_flags |= MS_ACTIVE;
f2fs_update_time(sbi, DISABLE_TIME);
@@ -1468,18 +1482,24 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
while (!f2fs_time_over(sbi, DISABLE_TIME)) {
mutex_lock(&sbi->gc_mutex);
err = f2fs_gc(sbi, true, false, NULL_SEGNO);
- if (err == -ENODATA)
+ if (err == -ENODATA) {
+ err = 0;
break;
+ }
if (err && err != -EAGAIN)
- return err;
+ break;
}
- err = sync_filesystem(sbi->sb);
- if (err)
- return err;
+ ret = sync_filesystem(sbi->sb);
+ if (ret || err) {
+ err = ret ? ret: err;
+ goto restore_flag;
+ }
- if (f2fs_disable_cp_again(sbi))
- return -EAGAIN;
+ if (f2fs_disable_cp_again(sbi)) {
+ err = -EAGAIN;
+ goto restore_flag;
+ }
mutex_lock(&sbi->gc_mutex);
cpc.reason = CP_PAUSE;
@@ -1488,7 +1508,9 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
sbi->unusable_block_count = 0;
mutex_unlock(&sbi->gc_mutex);
- return 0;
+restore_flag:
+ sbi->sb->s_flags = s_flags; /* Restore MS_RDONLY status */
+ return err;
}
static void f2fs_enable_checkpoint(struct f2fs_sb_info *sbi)
@@ -2026,6 +2048,12 @@ void f2fs_quota_off_umount(struct super_block *sb)
set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
}
}
+ /*
+ * In case of checkpoint=disable, we must flush quota blocks.
+ * This can cause NULL exception for node_inode in end_io, since
+ * put_super already dropped it.
+ */
+ sync_filesystem(sb);
}
static void f2fs_truncate_quota_inode_pages(struct super_block *sb)
@@ -2706,6 +2734,8 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
sbi->interval_time[DISCARD_TIME] = DEF_IDLE_INTERVAL;
sbi->interval_time[GC_TIME] = DEF_IDLE_INTERVAL;
sbi->interval_time[DISABLE_TIME] = DEF_DISABLE_INTERVAL;
+ sbi->interval_time[UMOUNT_DISCARD_TIMEOUT] =
+ DEF_UMOUNT_DISCARD_TIMEOUT;
clear_sbi_flag(sbi, SBI_NEED_FSCK);
for (i = 0; i < NR_COUNT_TYPE; i++)
@@ -3029,10 +3059,11 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
struct f2fs_super_block *raw_super;
struct inode *root;
int err;
- bool retry = true, need_fsck = false;
+ bool skip_recovery = false, need_fsck = false;
char *options = NULL;
int recovery, i, valid_super_block;
struct curseg_info *seg_i;
+ int retry_cnt = 1;
try_onemore:
err = -EINVAL;
@@ -3104,7 +3135,6 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_maxbytes = sbi->max_file_blocks <<
le32_to_cpu(raw_super->log_blocksize);
sb->s_max_links = F2FS_LINK_MAX;
- get_random_bytes(&sbi->s_next_generation, sizeof(u32));
#ifdef CONFIG_QUOTA
sb->dq_op = &f2fs_quota_operations;
@@ -3207,6 +3237,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_QUOTA_NEED_FSCK_FLAG))
set_sbi_flag(sbi, SBI_QUOTA_NEED_REPAIR);
+ if (__is_set_ckpt_flags(F2FS_CKPT(sbi), CP_DISABLED_QUICK_FLAG)) {
+ set_sbi_flag(sbi, SBI_CP_DISABLED_QUICK);
+ sbi->interval_time[DISABLE_TIME] = DEF_DISABLE_QUICK_INTERVAL;
+ }
/* Initialize device list */
err = f2fs_scan_devices(sbi);
@@ -3294,7 +3328,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_root = d_make_root(root); /* allocate root dentry */
if (!sb->s_root) {
err = -ENOMEM;
- goto free_root_inode;
+ goto free_node_inode;
}
err = f2fs_register_sysfs(sbi);
@@ -3316,7 +3350,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
goto free_meta;
if (unlikely(is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)))
- goto skip_recovery;
+ goto reset_checkpoint;
/* recover fsynced data */
if (!test_opt(sbi, DISABLE_ROLL_FORWARD)) {
@@ -3333,11 +3367,13 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
if (need_fsck)
set_sbi_flag(sbi, SBI_NEED_FSCK);
- if (!retry)
- goto skip_recovery;
+ if (skip_recovery)
+ goto reset_checkpoint;
err = f2fs_recover_fsync_data(sbi, false);
if (err < 0) {
+ if (err != -ENOMEM)
+ skip_recovery = true;
need_fsck = true;
f2fs_msg(sb, KERN_ERR,
"Cannot recover all fsync data errno=%d", err);
@@ -3353,14 +3389,14 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
goto free_meta;
}
}
-skip_recovery:
+reset_checkpoint:
/* f2fs_recover_fsync_data() cleared this already */
clear_sbi_flag(sbi, SBI_POR_DOING);
if (test_opt(sbi, DISABLE_CHECKPOINT)) {
err = f2fs_disable_checkpoint(sbi);
if (err)
- goto free_meta;
+ goto sync_free_meta;
} else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) {
f2fs_enable_checkpoint(sbi);
}
@@ -3373,7 +3409,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
/* After POR, we can run background GC thread.*/
err = f2fs_start_gc_thread(sbi);
if (err)
- goto free_meta;
+ goto sync_free_meta;
}
kvfree(options);
@@ -3393,8 +3429,14 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
cur_cp_version(F2FS_CKPT(sbi)));
f2fs_update_time(sbi, CP_TIME);
f2fs_update_time(sbi, REQ_TIME);
+ clear_sbi_flag(sbi, SBI_CP_DISABLED_QUICK);
return 0;
+sync_free_meta:
+ /* safe to flush all the data */
+ sync_filesystem(sbi->sb);
+ retry_cnt = 0;
+
free_meta:
#ifdef CONFIG_QUOTA
f2fs_truncate_quota_inode_pages(sb);
@@ -3408,6 +3450,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
* falls into an infinite loop in f2fs_sync_meta_pages().
*/
truncate_inode_pages_final(META_MAPPING(sbi));
+ /* evict some inodes being cached by GC */
+ evict_inodes(sb);
f2fs_unregister_sysfs(sbi);
free_root_inode:
dput(sb->s_root);
@@ -3451,8 +3495,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
kvfree(sbi);
/* give only one another chance */
- if (retry) {
- retry = false;
+ if (retry_cnt > 0 && skip_recovery) {
+ retry_cnt--;
shrink_dcache_sb(sb);
goto try_onemore;
}
@@ -3553,9 +3597,7 @@ static int __init init_f2fs_fs(void)
err = register_filesystem(&f2fs_fs_type);
if (err)
goto free_shrinker;
- err = f2fs_create_root_stats();
- if (err)
- goto free_filesystem;
+ f2fs_create_root_stats();
err = f2fs_init_post_read_processing();
if (err)
goto free_root_stats;
@@ -3563,7 +3605,6 @@ static int __init init_f2fs_fs(void)
free_root_stats:
f2fs_destroy_root_stats();
-free_filesystem:
unregister_filesystem(&f2fs_fs_type);
free_shrinker:
unregister_shrinker(&f2fs_shrinker_info);
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 948c1a2..d7b4766 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -222,6 +222,8 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
#ifdef CONFIG_F2FS_FAULT_INJECTION
if (a->struct_type == FAULT_INFO_TYPE && t >= (1 << FAULT_MAX))
return -EINVAL;
+ if (a->struct_type == FAULT_INFO_RATE && t >= UINT_MAX)
+ return -EINVAL;
#endif
if (a->struct_type == RESERVED_BLOCKS) {
spin_lock(&sbi->stat_lock);
@@ -278,10 +280,16 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
return count;
}
- *ui = t;
- if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
- f2fs_reset_iostat(sbi);
+ if (!strcmp(a->attr.name, "iostat_enable")) {
+ sbi->iostat_enable = !!t;
+ if (!sbi->iostat_enable)
+ f2fs_reset_iostat(sbi);
+ return count;
+ }
+
+ *ui = (unsigned int)t;
+
return count;
}
@@ -418,6 +426,8 @@ F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, idle_interval, interval_time[REQ_TIME]);
F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, discard_idle_interval,
interval_time[DISCARD_TIME]);
F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_idle_interval, interval_time[GC_TIME]);
+F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info,
+ umount_discard_timeout, interval_time[UMOUNT_DISCARD_TIMEOUT]);
F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, iostat_enable, iostat_enable);
F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, readdir_ra, readdir_ra);
F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_pin_file_thresh, gc_pin_file_threshold);
@@ -475,6 +485,7 @@ static struct attribute *f2fs_attrs[] = {
ATTR_LIST(idle_interval),
ATTR_LIST(discard_idle_interval),
ATTR_LIST(gc_idle_interval),
+ ATTR_LIST(umount_discard_timeout),
ATTR_LIST(iostat_enable),
ATTR_LIST(readdir_ra),
ATTR_LIST(gc_pin_file_thresh),
diff --git a/fs/f2fs/trace.c b/fs/f2fs/trace.c
index ce2a5eb..d0ab533 100644
--- a/fs/f2fs/trace.c
+++ b/fs/f2fs/trace.c
@@ -14,7 +14,7 @@
#include "trace.h"
static RADIX_TREE(pids, GFP_ATOMIC);
-static struct mutex pids_lock;
+static spinlock_t pids_lock;
static struct last_io_info last_io;
static inline void __print_last_io(void)
@@ -58,23 +58,29 @@ void f2fs_trace_pid(struct page *page)
set_page_private(page, (unsigned long)pid);
+retry:
if (radix_tree_preload(GFP_NOFS))
return;
- mutex_lock(&pids_lock);
+ spin_lock(&pids_lock);
p = radix_tree_lookup(&pids, pid);
if (p == current)
goto out;
if (p)
radix_tree_delete(&pids, pid);
- f2fs_radix_tree_insert(&pids, pid, current);
+ if (radix_tree_insert(&pids, pid, current)) {
+ spin_unlock(&pids_lock);
+ radix_tree_preload_end();
+ cond_resched();
+ goto retry;
+ }
trace_printk("%3x:%3x %4x %-16s\n",
MAJOR(inode->i_sb->s_dev), MINOR(inode->i_sb->s_dev),
pid, current->comm);
out:
- mutex_unlock(&pids_lock);
+ spin_unlock(&pids_lock);
radix_tree_preload_end();
}
@@ -119,7 +125,7 @@ void f2fs_trace_ios(struct f2fs_io_info *fio, int flush)
void f2fs_build_trace_ios(void)
{
- mutex_init(&pids_lock);
+ spin_lock_init(&pids_lock);
}
#define PIDVEC_SIZE 128
@@ -147,7 +153,7 @@ void f2fs_destroy_trace_ios(void)
pid_t next_pid = 0;
unsigned int found;
- mutex_lock(&pids_lock);
+ spin_lock(&pids_lock);
while ((found = gang_lookup_pids(pid, next_pid, PIDVEC_SIZE))) {
unsigned idx;
@@ -155,5 +161,5 @@ void f2fs_destroy_trace_ios(void)
for (idx = 0; idx < found; idx++)
radix_tree_delete(&pids, pid[idx]);
}
- mutex_unlock(&pids_lock);
+ spin_unlock(&pids_lock);
}
diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
index 18d5ffb..848a785 100644
--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -224,11 +224,11 @@ static struct f2fs_xattr_entry *__find_inline_xattr(struct inode *inode,
{
struct f2fs_xattr_entry *entry;
unsigned int inline_size = inline_xattr_size(inode);
+ void *max_addr = base_addr + inline_size;
list_for_each_xattr(entry, base_addr) {
- if ((void *)entry + sizeof(__u32) > base_addr + inline_size ||
- (void *)XATTR_NEXT_ENTRY(entry) + sizeof(__u32) >
- base_addr + inline_size) {
+ if ((void *)entry + sizeof(__u32) > max_addr ||
+ (void *)XATTR_NEXT_ENTRY(entry) > max_addr) {
*last_addr = entry;
return NULL;
}
@@ -239,6 +239,13 @@ static struct f2fs_xattr_entry *__find_inline_xattr(struct inode *inode,
if (!memcmp(entry->e_name, name, len))
break;
}
+
+ /* inline xattr header or entry across max inline xattr size */
+ if (IS_XATTR_LAST_ENTRY(entry) &&
+ (void *)entry + sizeof(__u32) > max_addr) {
+ *last_addr = entry;
+ return NULL;
+ }
return entry;
}
@@ -340,7 +347,7 @@ static int lookup_all_xattrs(struct inode *inode, struct page *ipage,
*base_addr = txattr_addr;
return 0;
out:
- kzfree(txattr_addr);
+ kvfree(txattr_addr);
return err;
}
@@ -383,7 +390,7 @@ static int read_all_xattrs(struct inode *inode, struct page *ipage,
*base_addr = txattr_addr;
return 0;
fail:
- kzfree(txattr_addr);
+ kvfree(txattr_addr);
return err;
}
@@ -510,7 +517,7 @@ int f2fs_getxattr(struct inode *inode, int index, const char *name,
}
error = size;
out:
- kzfree(base_addr);
+ kvfree(base_addr);
return error;
}
@@ -538,7 +545,7 @@ ssize_t f2fs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
if (!handler || (handler->list && !handler->list(dentry)))
continue;
- prefix = handler->prefix ?: handler->name;
+ prefix = xattr_prefix(handler);
prefix_len = strlen(prefix);
size = prefix_len + entry->e_name_len + 1;
if (buffer) {
@@ -556,7 +563,7 @@ ssize_t f2fs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
}
error = buffer_size - rest;
cleanup:
- kzfree(base_addr);
+ kvfree(base_addr);
return error;
}
@@ -687,7 +694,7 @@ static int __f2fs_setxattr(struct inode *inode, int index,
if (!error && S_ISDIR(inode->i_mode))
set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_CP);
exit:
- kzfree(base_addr);
+ kvfree(base_addr);
return error;
}
diff --git a/fs/f2fs/xattr.h b/fs/f2fs/xattr.h
index 67db134..9172ee0 100644
--- a/fs/f2fs/xattr.h
+++ b/fs/f2fs/xattr.h
@@ -78,6 +78,12 @@ struct f2fs_xattr_entry {
sizeof(struct f2fs_xattr_header) - \
sizeof(struct f2fs_xattr_entry))
+#define MAX_INLINE_XATTR_SIZE \
+ (DEF_ADDRS_PER_INODE - \
+ F2FS_TOTAL_EXTRA_ATTR_SIZE / sizeof(__le32) - \
+ DEF_INLINE_RESERVED_SIZE - \
+ MIN_INLINE_DENTRY_SIZE / sizeof(__le32))
+
/*
* On-disk structure of f2fs_xattr
* We use inline xattrs space + 1 block for xattr.
diff --git a/fs/file.c b/fs/file.c
index 69d6990..09aac4d 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -475,6 +475,7 @@ struct files_struct init_files = {
.full_fds_bits = init_files.full_fds_bits_init,
},
.file_lock = __SPIN_LOCK_UNLOCKED(init_files.file_lock),
+ .resize_wait = __WAIT_QUEUE_HEAD_INITIALIZER(init_files.resize_wait),
};
static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 31f8ca0..10ec276 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -700,9 +700,11 @@ void jbd2_journal_commit_transaction(journal_t *journal)
the last tag we set up. */
tag->t_flags |= cpu_to_be16(JBD2_FLAG_LAST_TAG);
-
- jbd2_descriptor_block_csum_set(journal, descriptor);
start_journal_io:
+ if (descriptor)
+ jbd2_descriptor_block_csum_set(journal,
+ descriptor);
+
for (i = 0; i < bufs; i++) {
struct buffer_head *bh = wbuf[i];
/*
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index b320c1b..799f96c 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1211,11 +1211,12 @@ int jbd2_journal_get_undo_access(handle_t *handle, struct buffer_head *bh)
struct journal_head *jh;
char *committed_data = NULL;
- JBUFFER_TRACE(jh, "entry");
if (jbd2_write_access_granted(handle, bh, true))
return 0;
jh = jbd2_journal_add_journal_head(bh);
+ JBUFFER_TRACE(jh, "entry");
+
/*
* Do this first --- it can drop the journal lock, so we want to
* make sure that obtaining the committed_data is done
@@ -1326,15 +1327,17 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
if (is_handle_aborted(handle))
return -EROFS;
- if (!buffer_jbd(bh)) {
- ret = -EUCLEAN;
- goto out;
- }
+ if (!buffer_jbd(bh))
+ return -EUCLEAN;
+
/*
* We don't grab jh reference here since the buffer must be part
* of the running transaction.
*/
jh = bh2jh(bh);
+ jbd_debug(5, "journal_head %p\n", jh);
+ JBUFFER_TRACE(jh, "entry");
+
/*
* This and the following assertions are unreliable since we may see jh
* in inconsistent state unless we grab bh_state lock. But this is
@@ -1368,9 +1371,6 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
}
journal = transaction->t_journal;
- jbd_debug(5, "journal_head %p\n", jh);
- JBUFFER_TRACE(jh, "entry");
-
jbd_lock_bh_state(bh);
if (jh->b_modified == 0) {
@@ -1568,14 +1568,21 @@ int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
/* However, if the buffer is still owned by a prior
* (committing) transaction, we can't drop it yet... */
JBUFFER_TRACE(jh, "belongs to older transaction");
- /* ... but we CAN drop it from the new transaction if we
- * have also modified it since the original commit. */
+ /* ... but we CAN drop it from the new transaction through
+ * marking the buffer as freed and set j_next_transaction to
+ * the new transaction, so that not only the commit code
+ * knows it should clear dirty bits when it is done with the
+ * buffer, but also the buffer can be checkpointed only
+ * after the new transaction commits. */
- if (jh->b_next_transaction) {
- J_ASSERT(jh->b_next_transaction == transaction);
+ set_buffer_freed(bh);
+
+ if (!jh->b_next_transaction) {
spin_lock(&journal->j_list_lock);
- jh->b_next_transaction = NULL;
+ jh->b_next_transaction = transaction;
spin_unlock(&journal->j_list_lock);
+ } else {
+ J_ASSERT(jh->b_next_transaction == transaction);
/*
* only drop a reference if this transaction modified
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index cf4c636..0b9d23b 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -468,7 +468,7 @@ static void kernfs_drain(struct kernfs_node *kn)
rwsem_release(&kn->dep_map, 1, _RET_IP_);
}
- kernfs_unmap_bin_file(kn);
+ kernfs_drain_open_files(kn);
mutex_lock(&kernfs_mutex);
}
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index d6512cd..27358c8 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -708,7 +708,8 @@ static int kernfs_fop_open(struct inode *inode, struct file *file)
if (error)
goto err_free;
- ((struct seq_file *)file->private_data)->private = of;
+ of->seq_file = file->private_data;
+ of->seq_file->private = of;
/* seq_file clears PWRITE unconditionally, restore it if WRITE */
if (file->f_mode & FMODE_WRITE)
@@ -717,13 +718,22 @@ static int kernfs_fop_open(struct inode *inode, struct file *file)
/* make sure we have open node struct */
error = kernfs_get_open_node(kn, of);
if (error)
- goto err_close;
+ goto err_seq_release;
+
+ if (ops->open) {
+ /* nobody has access to @of yet, skip @of->mutex */
+ error = ops->open(of);
+ if (error)
+ goto err_put_node;
+ }
/* open succeeded, put active references */
kernfs_put_active(kn);
return 0;
-err_close:
+err_put_node:
+ kernfs_put_open_node(kn, of);
+err_seq_release:
seq_release(inode, file);
err_free:
kfree(of->prealloc_buf);
@@ -733,11 +743,41 @@ static int kernfs_fop_open(struct inode *inode, struct file *file)
return error;
}
+/* used from release/drain to ensure that ->release() is called exactly once */
+static void kernfs_release_file(struct kernfs_node *kn,
+ struct kernfs_open_file *of)
+{
+ /*
+ * @of is guaranteed to have no other file operations in flight and
+ * we just want to synchronize release and drain paths.
+ * @kernfs_open_file_mutex is enough. @of->mutex can't be used
+ * here because drain path may be called from places which can
+ * cause circular dependency.
+ */
+ lockdep_assert_held(&kernfs_open_file_mutex);
+
+ if (!of->released) {
+ /*
+ * A file is never detached without being released and we
+ * need to be able to release files which are deactivated
+ * and being drained. Don't use kernfs_ops().
+ */
+ kn->attr.ops->release(of);
+ of->released = true;
+ }
+}
+
static int kernfs_fop_release(struct inode *inode, struct file *filp)
{
struct kernfs_node *kn = filp->f_path.dentry->d_fsdata;
struct kernfs_open_file *of = kernfs_of(filp);
+ if (kn->flags & KERNFS_HAS_RELEASE) {
+ mutex_lock(&kernfs_open_file_mutex);
+ kernfs_release_file(kn, of);
+ mutex_unlock(&kernfs_open_file_mutex);
+ }
+
kernfs_put_open_node(kn, of);
seq_release(inode, filp);
kfree(of->prealloc_buf);
@@ -746,12 +786,12 @@ static int kernfs_fop_release(struct inode *inode, struct file *filp)
return 0;
}
-void kernfs_unmap_bin_file(struct kernfs_node *kn)
+void kernfs_drain_open_files(struct kernfs_node *kn)
{
struct kernfs_open_node *on;
struct kernfs_open_file *of;
- if (!(kn->flags & KERNFS_HAS_MMAP))
+ if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
return;
spin_lock_irq(&kernfs_open_node_lock);
@@ -763,10 +803,17 @@ void kernfs_unmap_bin_file(struct kernfs_node *kn)
return;
mutex_lock(&kernfs_open_file_mutex);
+
list_for_each_entry(of, &on->files, list) {
struct inode *inode = file_inode(of->file);
- unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+
+ if (kn->flags & KERNFS_HAS_MMAP)
+ unmap_mapping_range(inode->i_mapping, 0, 0, 1);
+
+ if (kn->flags & KERNFS_HAS_RELEASE)
+ kernfs_release_file(kn, of);
}
+
mutex_unlock(&kernfs_open_file_mutex);
kernfs_put_open_node(kn, NULL);
@@ -786,26 +833,35 @@ void kernfs_unmap_bin_file(struct kernfs_node *kn)
* to see if it supports poll (Neither 'poll' nor 'select' return
* an appropriate error code). When in doubt, set a suitable timeout value.
*/
+unsigned int kernfs_generic_poll(struct kernfs_open_file *of, poll_table *wait)
+{
+ struct kernfs_node *kn = of->file->f_path.dentry->d_fsdata;
+ struct kernfs_open_node *on = kn->attr.open;
+
+ poll_wait(of->file, &on->poll, wait);
+
+ if (of->event != atomic_read(&on->event))
+ return DEFAULT_POLLMASK|POLLERR|POLLPRI;
+
+ return DEFAULT_POLLMASK;
+}
+
static unsigned int kernfs_fop_poll(struct file *filp, poll_table *wait)
{
struct kernfs_open_file *of = kernfs_of(filp);
struct kernfs_node *kn = filp->f_path.dentry->d_fsdata;
- struct kernfs_open_node *on = kn->attr.open;
+ unsigned int ret;
if (!kernfs_get_active(kn))
- goto trigger;
+ return DEFAULT_POLLMASK|POLLERR|POLLPRI;
- poll_wait(filp, &on->poll, wait);
+ if (kn->attr.ops->poll)
+ ret = kn->attr.ops->poll(of, wait);
+ else
+ ret = kernfs_generic_poll(of, wait);
kernfs_put_active(kn);
-
- if (of->event != atomic_read(&on->event))
- goto trigger;
-
- return DEFAULT_POLLMASK;
-
- trigger:
- return DEFAULT_POLLMASK|POLLERR|POLLPRI;
+ return ret;
}
static void kernfs_notify_workfn(struct work_struct *work)
@@ -965,6 +1021,8 @@ struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent,
kn->flags |= KERNFS_HAS_SEQ_SHOW;
if (ops->mmap)
kn->flags |= KERNFS_HAS_MMAP;
+ if (ops->release)
+ kn->flags |= KERNFS_HAS_RELEASE;
rc = kernfs_add_one(kn);
if (rc) {
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index bfd551b..3100987 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -104,7 +104,7 @@ struct kernfs_node *kernfs_new_node(struct kernfs_node *parent,
*/
extern const struct file_operations kernfs_file_fops;
-void kernfs_unmap_bin_file(struct kernfs_node *kn);
+void kernfs_drain_open_files(struct kernfs_node *kn);
/*
* symlink.c
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index eb55ab6..6d0d94f 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2748,7 +2748,8 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
nfs4_schedule_stateid_recovery(server, state);
}
out:
- nfs4_sequence_free_slot(&opendata->o_res.seq_res);
+ if (!opendata->cancelled)
+ nfs4_sequence_free_slot(&opendata->o_res.seq_res);
return ret;
}
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 892c885..fad4d51 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -975,6 +975,17 @@ static void nfs_pageio_doio(struct nfs_pageio_descriptor *desc)
}
}
+static void
+nfs_pageio_cleanup_request(struct nfs_pageio_descriptor *desc,
+ struct nfs_page *req)
+{
+ LIST_HEAD(head);
+
+ nfs_list_remove_request(req);
+ nfs_list_add_request(req, &head);
+ desc->pg_completion_ops->error_cleanup(&head);
+}
+
/**
* nfs_pageio_add_request - Attempt to coalesce a request into a page list.
* @desc: destination io descriptor
@@ -1012,10 +1023,8 @@ static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
nfs_page_group_unlock(req);
desc->pg_moreio = 1;
nfs_pageio_doio(desc);
- if (desc->pg_error < 0)
- return 0;
- if (mirror->pg_recoalesce)
- return 0;
+ if (desc->pg_error < 0 || mirror->pg_recoalesce)
+ goto out_cleanup_subreq;
/* retry add_request for this subreq */
nfs_page_group_lock(req, false);
continue;
@@ -1048,6 +1057,10 @@ static int __nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
desc->pg_error = PTR_ERR(subreq);
nfs_page_group_unlock(req);
return 0;
+out_cleanup_subreq:
+ if (req != subreq)
+ nfs_pageio_cleanup_request(desc, subreq);
+ return 0;
}
static int nfs_do_recoalesce(struct nfs_pageio_descriptor *desc)
@@ -1066,7 +1079,6 @@ static int nfs_do_recoalesce(struct nfs_pageio_descriptor *desc)
struct nfs_page *req;
req = list_first_entry(&head, struct nfs_page, wb_list);
- nfs_list_remove_request(req);
if (__nfs_pageio_add_request(desc, req))
continue;
if (desc->pg_error < 0) {
@@ -1141,11 +1153,14 @@ int nfs_pageio_add_request(struct nfs_pageio_descriptor *desc,
if (nfs_pgio_has_mirroring(desc))
desc->pg_mirror_idx = midx;
if (!nfs_pageio_add_request_mirror(desc, dupreq))
- goto out_failed;
+ goto out_cleanup_subreq;
}
return 1;
+out_cleanup_subreq:
+ if (req != dupreq)
+ nfs_pageio_cleanup_request(desc, dupreq);
out_failed:
/*
* We might have failed before sending any reqs over wire.
@@ -1185,7 +1200,7 @@ static void nfs_pageio_complete_mirror(struct nfs_pageio_descriptor *desc,
desc->pg_mirror_idx = mirror_idx;
for (;;) {
nfs_pageio_doio(desc);
- if (!mirror->pg_recoalesce)
+ if (desc->pg_error < 0 || !mirror->pg_recoalesce)
break;
if (!nfs_do_recoalesce(desc))
break;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 659ad12..42c3158 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2047,7 +2047,8 @@ static int nfs23_validate_mount_data(void *options,
memcpy(sap, &data->addr, sizeof(data->addr));
args->nfs_server.addrlen = sizeof(data->addr);
args->nfs_server.port = ntohs(data->addr.sin_port);
- if (!nfs_verify_server_address(sap))
+ if (sap->sa_family != AF_INET ||
+ !nfs_verify_server_address(sap))
goto out_no_address;
if (!(data->flags & NFS_MOUNT_TCP))
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index d818e4f..00b472f 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -431,8 +431,19 @@ nfsd3_proc_readdir(struct svc_rqst *rqstp, struct nfsd3_readdirargs *argp,
&resp->common, nfs3svc_encode_entry);
memcpy(resp->verf, argp->verf, 8);
resp->count = resp->buffer - argp->buffer;
- if (resp->offset)
- xdr_encode_hyper(resp->offset, argp->cookie);
+ if (resp->offset) {
+ loff_t offset = argp->cookie;
+
+ if (unlikely(resp->offset1)) {
+ /* we ended up with offset on a page boundary */
+ *resp->offset = htonl(offset >> 32);
+ *resp->offset1 = htonl(offset & 0xffffffff);
+ resp->offset1 = NULL;
+ } else {
+ xdr_encode_hyper(resp->offset, offset);
+ }
+ resp->offset = NULL;
+ }
RETURN_STATUS(nfserr);
}
@@ -500,6 +511,7 @@ nfsd3_proc_readdirplus(struct svc_rqst *rqstp, struct nfsd3_readdirargs *argp,
} else {
xdr_encode_hyper(resp->offset, offset);
}
+ resp->offset = NULL;
}
RETURN_STATUS(nfserr);
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 4523346..7e50248 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -899,6 +899,7 @@ encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
} else {
xdr_encode_hyper(cd->offset, offset64);
}
+ cd->offset = NULL;
}
/*
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 3069cd4..8d84228 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -934,8 +934,9 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
cb->cb_seq_status = 1;
cb->cb_status = 0;
if (minorversion) {
- if (!nfsd41_cb_get_slot(clp, task))
+ if (!cb->cb_holds_slot && !nfsd41_cb_get_slot(clp, task))
return;
+ cb->cb_holds_slot = true;
}
rpc_call_start(task);
}
@@ -962,6 +963,9 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
return true;
}
+ if (!cb->cb_holds_slot)
+ goto need_restart;
+
switch (cb->cb_seq_status) {
case 0:
/*
@@ -999,6 +1003,7 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
cb->cb_seq_status);
}
+ cb->cb_holds_slot = false;
clear_bit(0, &clp->cl_cb_slot_busy);
rpc_wake_up_next(&clp->cl_cb_waitq);
dprintk("%s: freed slot, new seqid=%d\n", __func__,
@@ -1206,6 +1211,7 @@ void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
cb->cb_seq_status = 1;
cb->cb_status = 0;
cb->cb_need_restart = false;
+ cb->cb_holds_slot = false;
}
void nfsd4_run_cb(struct nfsd4_callback *cb)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 797a155..f704f90 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1103,7 +1103,7 @@ static ssize_t write_v4_end_grace(struct file *file, char *buf, size_t size)
case 'Y':
case 'y':
case '1':
- if (nn->nfsd_serv)
+ if (!nn->nfsd_serv)
return -EBUSY;
nfsd4_end_grace(nn);
break;
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 86aa92d..133d8bf 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -69,6 +69,7 @@ struct nfsd4_callback {
int cb_seq_status;
int cb_status;
bool cb_need_restart;
+ bool cb_holds_slot;
};
struct nfsd4_callback_ops {
diff --git a/fs/ocfs2/cluster/nodemanager.c b/fs/ocfs2/cluster/nodemanager.c
index c204ac9b..81a0d5d 100644
--- a/fs/ocfs2/cluster/nodemanager.c
+++ b/fs/ocfs2/cluster/nodemanager.c
@@ -621,13 +621,15 @@ static void o2nm_node_group_drop_item(struct config_group *group,
struct o2nm_node *node = to_o2nm_node(item);
struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent);
- o2net_disconnect_node(node);
+ if (cluster->cl_nodes[node->nd_num] == node) {
+ o2net_disconnect_node(node);
- if (cluster->cl_has_local &&
- (cluster->cl_local_node == node->nd_num)) {
- cluster->cl_has_local = 0;
- cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
- o2net_stop_listening(node);
+ if (cluster->cl_has_local &&
+ (cluster->cl_local_node == node->nd_num)) {
+ cluster->cl_has_local = 0;
+ cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
+ o2net_stop_listening(node);
+ }
}
/* XXX call into net to stop this node from trading messages */
diff --git a/fs/open.c b/fs/open.c
index 73b7d19..e8818ea 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -730,6 +730,12 @@ static int do_dentry_open(struct file *f,
return 0;
}
+ /* Any file opened for execve()/uselib() has to be a regular file. */
+ if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) {
+ error = -EACCES;
+ goto cleanup_file;
+ }
+
if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
error = get_write_access(inode);
if (unlikely(error))
diff --git a/fs/pipe.c b/fs/pipe.c
index 3434553..388e09a 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -238,6 +238,14 @@ static const struct pipe_buf_operations anon_pipe_buf_ops = {
.get = generic_pipe_buf_get,
};
+static const struct pipe_buf_operations anon_pipe_buf_nomerge_ops = {
+ .can_merge = 0,
+ .confirm = generic_pipe_buf_confirm,
+ .release = anon_pipe_buf_release,
+ .steal = anon_pipe_buf_steal,
+ .get = generic_pipe_buf_get,
+};
+
static const struct pipe_buf_operations packet_pipe_buf_ops = {
.can_merge = 0,
.confirm = generic_pipe_buf_confirm,
@@ -246,6 +254,12 @@ static const struct pipe_buf_operations packet_pipe_buf_ops = {
.get = generic_pipe_buf_get,
};
+void pipe_buf_mark_unmergeable(struct pipe_buffer *buf)
+{
+ if (buf->ops == &anon_pipe_buf_ops)
+ buf->ops = &anon_pipe_buf_nomerge_ops;
+}
+
static ssize_t
pipe_read(struct kiocb *iocb, struct iov_iter *to)
{
diff --git a/fs/proc/loadavg.c b/fs/proc/loadavg.c
index aec66e6..e4b986e 100644
--- a/fs/proc/loadavg.c
+++ b/fs/proc/loadavg.c
@@ -3,13 +3,11 @@
#include <linux/pid_namespace.h>
#include <linux/proc_fs.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/seq_file.h>
#include <linux/seqlock.h>
#include <linux/time.h>
-#define LOAD_INT(x) ((x) >> FSHIFT)
-#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
-
static int loadavg_proc_show(struct seq_file *m, void *v)
{
unsigned long avnrun[3];
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 1999e85..5b32c05 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1604,8 +1604,11 @@ static void drop_sysctl_table(struct ctl_table_header *header)
if (--header->nreg)
return;
- put_links(header);
- start_unregistering(header);
+ if (parent) {
+ put_links(header);
+ start_unregistering(header);
+ }
+
if (!--header->count)
kfree_rcu(header, rcu);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e61acca..32ddba9 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -517,7 +517,7 @@ struct mem_size_stats {
};
static void smaps_account(struct mem_size_stats *mss, struct page *page,
- bool compound, bool young, bool dirty)
+ bool compound, bool young, bool dirty, bool locked)
{
int i, nr = compound ? 1 << compound_order(page) : 1;
unsigned long size = nr * PAGE_SIZE;
@@ -541,24 +541,31 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
else
mss->private_clean += size;
mss->pss += (u64)size << PSS_SHIFT;
+ if (locked)
+ mss->pss_locked += (u64)size << PSS_SHIFT;
return;
}
for (i = 0; i < nr; i++, page++) {
int mapcount = page_mapcount(page);
+ unsigned long pss = (PAGE_SIZE << PSS_SHIFT);
if (mapcount >= 2) {
if (dirty || PageDirty(page))
mss->shared_dirty += PAGE_SIZE;
else
mss->shared_clean += PAGE_SIZE;
- mss->pss += (PAGE_SIZE << PSS_SHIFT) / mapcount;
+ mss->pss += pss / mapcount;
+ if (locked)
+ mss->pss_locked += pss / mapcount;
} else {
if (dirty || PageDirty(page))
mss->private_dirty += PAGE_SIZE;
else
mss->private_clean += PAGE_SIZE;
- mss->pss += PAGE_SIZE << PSS_SHIFT;
+ mss->pss += pss;
+ if (locked)
+ mss->pss_locked += pss;
}
}
}
@@ -581,6 +588,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
{
struct mem_size_stats *mss = walk->private;
struct vm_area_struct *vma = walk->vma;
+ bool locked = !!(vma->vm_flags & VM_LOCKED);
struct page *page = NULL;
if (pte_present(*pte)) {
@@ -621,7 +629,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
if (!page)
return;
- smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte));
+ smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -630,6 +638,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
{
struct mem_size_stats *mss = walk->private;
struct vm_area_struct *vma = walk->vma;
+ bool locked = !!(vma->vm_flags & VM_LOCKED);
struct page *page;
/* FOLL_DUMP will return -EFAULT on huge zero page */
@@ -644,7 +653,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
/* pass */;
else
VM_BUG_ON_PAGE(1, page);
- smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd));
+ smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
}
#else
static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
@@ -840,11 +849,8 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
}
}
#endif
-
/* mmap_sem is held in m_start */
walk_page_vma(vma, &smaps_walk);
- if (vma->vm_flags & VM_LOCKED)
- mss->pss_locked += mss->pss;
if (!rollup_mode) {
show_map_vma(m, vma, is_pid);
diff --git a/fs/read_write.c b/fs/read_write.c
index c06d7f9..ce77e76 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1196,6 +1196,9 @@ COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
unsigned long, vlen, loff_t, pos, int, flags)
{
+ if (pos == -1)
+ return do_compat_readv(fd, vec, vlen, flags);
+
return do_compat_preadv64(fd, vec, vlen, pos, flags);
}
#endif
@@ -1302,6 +1305,9 @@ COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
unsigned long, vlen, loff_t, pos, int, flags)
{
+ if (pos == -1)
+ return do_compat_writev(fd, vec, vlen, flags);
+
return do_compat_pwritev64(fd, vec, vlen, pos, flags);
}
#endif
diff --git a/fs/splice.c b/fs/splice.c
index 8dd79ec..01983be 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1594,6 +1594,8 @@ static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
*/
obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
+ pipe_buf_mark_unmergeable(obuf);
+
obuf->len = len;
opipe->nrbufs++;
ibuf->offset += obuf->len;
@@ -1668,6 +1670,8 @@ static int link_pipe(struct pipe_inode_info *ipipe,
*/
obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
+ pipe_buf_mark_unmergeable(obuf);
+
if (obuf->len > len)
obuf->len = len;
diff --git a/fs/udf/truncate.c b/fs/udf/truncate.c
index 42b8c57..c6ce750 100644
--- a/fs/udf/truncate.c
+++ b/fs/udf/truncate.c
@@ -260,6 +260,9 @@ void udf_truncate_extents(struct inode *inode)
epos.block = eloc;
epos.bh = udf_tread(sb,
udf_get_lb_pblock(sb, &eloc, 0));
+ /* Error reading indirect block? */
+ if (!epos.bh)
+ return;
if (elen)
indirect_ext_len =
(elen + sb->s_blocksize - 1) >>
diff --git a/include/acpi/acconfig.h b/include/acpi/acconfig.h
index 12c2882..eef0696 100644
--- a/include/acpi/acconfig.h
+++ b/include/acpi/acconfig.h
@@ -122,7 +122,7 @@
/* Maximum object reference count (detects object deletion issues) */
-#define ACPI_MAX_REFERENCE_COUNT 0x1000
+#define ACPI_MAX_REFERENCE_COUNT 0x4000
/* Default page size for use in mapping memory for operation regions */
diff --git a/include/linux/atalk.h b/include/linux/atalk.h
index 73fd8b7..af43ed4 100644
--- a/include/linux/atalk.h
+++ b/include/linux/atalk.h
@@ -150,19 +150,29 @@ extern int sysctl_aarp_retransmit_limit;
extern int sysctl_aarp_resolve_time;
#ifdef CONFIG_SYSCTL
-extern void atalk_register_sysctl(void);
+extern int atalk_register_sysctl(void);
extern void atalk_unregister_sysctl(void);
#else
-#define atalk_register_sysctl() do { } while(0)
-#define atalk_unregister_sysctl() do { } while(0)
+static inline int atalk_register_sysctl(void)
+{
+ return 0;
+}
+static inline void atalk_unregister_sysctl(void)
+{
+}
#endif
#ifdef CONFIG_PROC_FS
extern int atalk_proc_init(void);
extern void atalk_proc_exit(void);
#else
-#define atalk_proc_init() ({ 0; })
-#define atalk_proc_exit() do { } while(0)
+static inline int atalk_proc_init(void)
+{
+ return 0;
+}
+static inline void atalk_proc_exit(void)
+{
+}
#endif /* CONFIG_PROC_FS */
#endif /* __LINUX_ATALK_H__ */
diff --git a/include/linux/bitrev.h b/include/linux/bitrev.h
index fb790b8..333e42c 100644
--- a/include/linux/bitrev.h
+++ b/include/linux/bitrev.h
@@ -31,32 +31,32 @@ static inline u32 __bitrev32(u32 x)
#define __constant_bitrev32(x) \
({ \
- u32 __x = x; \
- __x = (__x >> 16) | (__x << 16); \
- __x = ((__x & (u32)0xFF00FF00UL) >> 8) | ((__x & (u32)0x00FF00FFUL) << 8); \
- __x = ((__x & (u32)0xF0F0F0F0UL) >> 4) | ((__x & (u32)0x0F0F0F0FUL) << 4); \
- __x = ((__x & (u32)0xCCCCCCCCUL) >> 2) | ((__x & (u32)0x33333333UL) << 2); \
- __x = ((__x & (u32)0xAAAAAAAAUL) >> 1) | ((__x & (u32)0x55555555UL) << 1); \
- __x; \
+ u32 ___x = x; \
+ ___x = (___x >> 16) | (___x << 16); \
+ ___x = ((___x & (u32)0xFF00FF00UL) >> 8) | ((___x & (u32)0x00FF00FFUL) << 8); \
+ ___x = ((___x & (u32)0xF0F0F0F0UL) >> 4) | ((___x & (u32)0x0F0F0F0FUL) << 4); \
+ ___x = ((___x & (u32)0xCCCCCCCCUL) >> 2) | ((___x & (u32)0x33333333UL) << 2); \
+ ___x = ((___x & (u32)0xAAAAAAAAUL) >> 1) | ((___x & (u32)0x55555555UL) << 1); \
+ ___x; \
})
#define __constant_bitrev16(x) \
({ \
- u16 __x = x; \
- __x = (__x >> 8) | (__x << 8); \
- __x = ((__x & (u16)0xF0F0U) >> 4) | ((__x & (u16)0x0F0FU) << 4); \
- __x = ((__x & (u16)0xCCCCU) >> 2) | ((__x & (u16)0x3333U) << 2); \
- __x = ((__x & (u16)0xAAAAU) >> 1) | ((__x & (u16)0x5555U) << 1); \
- __x; \
+ u16 ___x = x; \
+ ___x = (___x >> 8) | (___x << 8); \
+ ___x = ((___x & (u16)0xF0F0U) >> 4) | ((___x & (u16)0x0F0FU) << 4); \
+ ___x = ((___x & (u16)0xCCCCU) >> 2) | ((___x & (u16)0x3333U) << 2); \
+ ___x = ((___x & (u16)0xAAAAU) >> 1) | ((___x & (u16)0x5555U) << 1); \
+ ___x; \
})
#define __constant_bitrev8(x) \
({ \
- u8 __x = x; \
- __x = (__x >> 4) | (__x << 4); \
- __x = ((__x & (u8)0xCCU) >> 2) | ((__x & (u8)0x33U) << 2); \
- __x = ((__x & (u8)0xAAU) >> 1) | ((__x & (u8)0x55U) << 1); \
- __x; \
+ u8 ___x = x; \
+ ___x = (___x >> 4) | (___x << 4); \
+ ___x = ((___x & (u8)0xCCU) >> 2) | ((___x & (u8)0x33U) << 2); \
+ ___x = ((___x & (u8)0xAAU) >> 1) | ((___x & (u8)0x55U) << 1); \
+ ___x; \
})
#define bitrev32(x) \
diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index a8a5748..fe75751 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -276,6 +276,8 @@ extern void ceph_destroy_client(struct ceph_client *client);
extern int __ceph_open_session(struct ceph_client *client,
unsigned long started);
extern int ceph_open_session(struct ceph_client *client);
+int ceph_wait_for_latest_osdmap(struct ceph_client *client,
+ unsigned long timeout);
/* pagevec.c */
extern void ceph_release_page_vector(struct page **pages, int num_pages);
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 34d3f27..ce0f32d 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -17,6 +17,7 @@
#include <linux/percpu-rwsem.h>
#include <linux/workqueue.h>
#include <linux/bpf-cgroup.h>
+#include <linux/psi_types.h>
#ifdef CONFIG_CGROUPS
@@ -28,6 +29,7 @@ struct kernfs_node;
struct kernfs_ops;
struct kernfs_open_file;
struct seq_file;
+struct poll_table_struct;
#define MAX_CGROUP_TYPE_NAMELEN 32
#define MAX_CGROUP_ROOT_NAMELEN 64
@@ -302,6 +304,9 @@ struct cgroup {
/* used to schedule release agent */
struct work_struct release_agent_work;
+ /* used to track pressure stalls */
+ struct psi_group psi;
+
/* used to store eBPF programs */
struct cgroup_bpf bpf;
@@ -389,6 +394,9 @@ struct cftype {
struct list_head node; /* anchored at ss->cfts */
struct kernfs_ops *kf_ops;
+ int (*open)(struct kernfs_open_file *of);
+ void (*release)(struct kernfs_open_file *of);
+
/*
* read_u64() is a shortcut for the common case of returning a
* single integer. Use it in place of read()
@@ -429,6 +437,9 @@ struct cftype {
ssize_t (*write)(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off);
+ unsigned int (*poll)(struct kernfs_open_file *of,
+ struct poll_table_struct *pt);
+
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lock_class_key lockdep_key;
#endif
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
old mode 100644
new mode 100755
index 3b242a3..5a232e8
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -388,6 +388,16 @@ static inline void css_put_many(struct cgroup_subsys_state *css, unsigned int n)
percpu_ref_put_many(&css->refcnt, n);
}
+static inline void cgroup_get(struct cgroup *cgrp)
+{
+ css_get(&cgrp->self);
+}
+
+static inline bool cgroup_tryget(struct cgroup *cgrp)
+{
+ return css_tryget(&cgrp->self);
+}
+
static inline void cgroup_put(struct cgroup *cgrp)
{
css_put(&cgrp->self);
@@ -500,6 +510,20 @@ static inline struct cgroup *task_cgroup(struct task_struct *task,
return task_css(task, subsys_id)->cgroup;
}
+static inline struct cgroup *task_dfl_cgroup(struct task_struct *task)
+{
+ return task_css_set(task)->dfl_cgrp;
+}
+
+static inline struct cgroup *cgroup_parent(struct cgroup *cgrp)
+{
+ struct cgroup_subsys_state *parent_css = cgrp->self.parent;
+
+ if (parent_css)
+ return container_of(parent_css, struct cgroup, self);
+ return NULL;
+}
+
/**
* cgroup_is_descendant - test ancestry
* @cgrp: the cgroup to be tested
@@ -599,6 +623,11 @@ static inline void pr_cont_cgroup_path(struct cgroup *cgrp)
*/
int subsys_cgroup_allow_attach(struct cgroup_taskset *tset);
+static inline struct psi_group *cgroup_psi(struct cgroup *cgrp)
+{
+ return &cgrp->psi;
+}
+
static inline void cgroup_init_kthreadd(void)
{
/*
@@ -641,6 +670,16 @@ static inline int cgroup_init(void) { return 0; }
static inline void cgroup_init_kthreadd(void) {}
static inline void cgroup_kthread_ready(void) {}
+static inline struct cgroup *cgroup_parent(struct cgroup *cgrp)
+{
+ return NULL;
+}
+
+static inline struct psi_group *cgroup_psi(struct cgroup *cgrp)
+{
+ return NULL;
+}
+
static inline bool task_under_cgroup_hierarchy(struct task_struct *task,
struct cgroup *ancestor)
{
diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h
index 6cee17c..84c2f68 100644
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -29,7 +29,48 @@
#define DELAYACCT_PF_BLKIO 0x00000002 /* I am waiting on IO */
#ifdef CONFIG_TASK_DELAY_ACCT
+struct task_delay_info {
+ spinlock_t lock;
+ unsigned int flags; /* Private per-task flags */
+ /* For each stat XXX, add following, aligned appropriately
+ *
+ * struct timespec XXX_start, XXX_end;
+ * u64 XXX_delay;
+ * u32 XXX_count;
+ *
+ * Atomicity of updates to XXX_delay, XXX_count protected by
+ * single lock above (split into XXX_lock if contention is an issue).
+ */
+
+ /*
+ * XXX_count is incremented on every XXX operation, the delay
+ * associated with the operation is added to XXX_delay.
+ * XXX_delay contains the accumulated delay time in nanoseconds.
+ */
+ u64 blkio_start; /* Shared by blkio, swapin */
+ u64 blkio_delay; /* wait for sync block io completion */
+ u64 swapin_delay; /* wait for swapin block io completion */
+ u32 blkio_count; /* total count of the number of sync block */
+ /* io operations performed */
+ u32 swapin_count; /* total count of the number of swapin block */
+ /* io operations performed */
+
+ u64 freepages_start;
+ u64 freepages_delay; /* wait for memory reclaim */
+
+ u64 thrashing_start;
+ u64 thrashing_delay; /* wait for thrashing page */
+
+ u32 freepages_count; /* total count of memory reclaim */
+ u32 thrashing_count; /* total count of thrash waits */
+};
+#endif
+
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+#ifdef CONFIG_TASK_DELAY_ACCT
extern int delayacct_on; /* Delay accounting turned on/off */
extern struct kmem_cache *delayacct_cache;
extern void delayacct_init(void);
@@ -41,6 +82,8 @@ extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
extern __u64 __delayacct_blkio_ticks(struct task_struct *);
extern void __delayacct_freepages_start(void);
extern void __delayacct_freepages_end(void);
+extern void __delayacct_thrashing_start(void);
+extern void __delayacct_thrashing_end(void);
static inline int delayacct_is_task_waiting_on_io(struct task_struct *p)
{
@@ -121,6 +164,18 @@ static inline void delayacct_freepages_end(void)
__delayacct_freepages_end();
}
+static inline void delayacct_thrashing_start(void)
+{
+ if (current->delays)
+ __delayacct_thrashing_start();
+}
+
+static inline void delayacct_thrashing_end(void)
+{
+ if (current->delays)
+ __delayacct_thrashing_end();
+}
+
#else
static inline void delayacct_set_flag(int flag)
{}
@@ -147,6 +202,10 @@ static inline void delayacct_freepages_start(void)
{}
static inline void delayacct_freepages_end(void)
{}
+static inline void delayacct_thrashing_start(void)
+{}
+static inline void delayacct_thrashing_end(void)
+{}
#endif /* CONFIG_TASK_DELAY_ACCT */
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 20e26d9..2b37e89 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -640,7 +640,7 @@ extern struct ratelimit_state dm_ratelimit_state;
*/
#define dm_target_offset(ti, sector) ((sector) - (ti)->begin)
-static inline sector_t to_sector(unsigned long n)
+static inline sector_t to_sector(unsigned long long n)
{
return (n >> SECTOR_SHIFT);
}
diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index d771104..f574042 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -116,6 +116,7 @@ struct f2fs_super_block {
/*
* For checkpoint
*/
+#define CP_DISABLED_QUICK_FLAG 0x00002000
#define CP_DISABLED_FLAG 0x00001000
#define CP_QUOTA_NEED_FSCK_FLAG 0x00000800
#define CP_LARGE_NAT_BITMAP_FLAG 0x00000400
@@ -186,7 +187,7 @@ struct f2fs_orphan_block {
struct f2fs_extent {
__le32 fofs; /* start file offset of the extent */
__le32 blk; /* start block address of the extent */
- __le32 len; /* lengh of the extent */
+ __le32 len; /* length of the extent */
} __packed;
#define F2FS_NAME_LEN 255
@@ -284,7 +285,7 @@ enum {
struct node_footer {
__le32 nid; /* node id */
- __le32 ino; /* inode nunmber */
+ __le32 ino; /* inode number */
__le32 flag; /* include cold/fsync/dentry marks and offset */
__le64 cp_ver; /* checkpoint version */
__le32 next_blkaddr; /* next node page block address */
@@ -489,12 +490,12 @@ typedef __le32 f2fs_hash_t;
/*
* space utilization of regular dentry and inline dentry (w/o extra reservation)
- * regular dentry inline dentry
- * bitmap 1 * 27 = 27 1 * 23 = 23
- * reserved 1 * 3 = 3 1 * 7 = 7
- * dentry 11 * 214 = 2354 11 * 182 = 2002
- * filename 8 * 214 = 1712 8 * 182 = 1456
- * total 4096 3488
+ * regular dentry inline dentry (def) inline dentry (min)
+ * bitmap 1 * 27 = 27 1 * 23 = 23 1 * 1 = 1
+ * reserved 1 * 3 = 3 1 * 7 = 7 1 * 1 = 1
+ * dentry 11 * 214 = 2354 11 * 182 = 2002 11 * 2 = 22
+ * filename 8 * 214 = 1712 8 * 182 = 1456 8 * 2 = 16
+ * total 4096 3488 40
*
* Note: there are more reserved space in inline dentry than in regular
* dentry, when converting inline dentry we should handle this carefully.
@@ -506,12 +507,13 @@ typedef __le32 f2fs_hash_t;
#define SIZE_OF_RESERVED (PAGE_SIZE - ((SIZE_OF_DIR_ENTRY + \
F2FS_SLOT_LEN) * \
NR_DENTRY_IN_BLOCK + SIZE_OF_DENTRY_BITMAP))
+#define MIN_INLINE_DENTRY_SIZE 40 /* just include '.' and '..' entries */
/* One directory entry slot representing F2FS_SLOT_LEN-sized file name */
struct f2fs_dir_entry {
__le32 hash_code; /* hash code of file name */
__le32 ino; /* inode number */
- __le16 name_len; /* lengh of file name */
+ __le16 name_len; /* length of file name */
__u8 file_type; /* file type */
} __packed;
diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index c9be579..bb5547a8 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -61,6 +61,7 @@ struct irq_desc {
unsigned int core_internal_state__do_not_mess_with_it;
unsigned int depth; /* nested irq disables */
unsigned int wake_depth; /* nested wake enables */
+ unsigned int tot_count;
unsigned int irq_count; /* For detecting broken IRQs */
unsigned long last_unhandled; /* Aging timer for unhandled count */
unsigned int irqs_unhandled;
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 7056238..44e5293 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -24,6 +24,7 @@ struct seq_file;
struct vm_area_struct;
struct super_block;
struct file_system_type;
+struct poll_table_struct;
struct kernfs_open_node;
struct kernfs_iattrs;
@@ -46,6 +47,7 @@ enum kernfs_node_flag {
KERNFS_SUICIDAL = 0x0400,
KERNFS_SUICIDED = 0x0800,
KERNFS_EMPTY_DIR = 0x1000,
+ KERNFS_HAS_RELEASE = 0x2000,
};
/* @flags for kernfs_create_root() */
@@ -175,6 +177,7 @@ struct kernfs_open_file {
/* published fields */
struct kernfs_node *kn;
struct file *file;
+ struct seq_file *seq_file;
void *priv;
/* private fields, do not use outside kernfs proper */
@@ -186,11 +189,19 @@ struct kernfs_open_file {
size_t atomic_write_len;
bool mmapped;
+ bool released:1;
const struct vm_operations_struct *vm_ops;
};
struct kernfs_ops {
/*
+ * Optional open/release methods. Both are called with
+ * @of->seq_file populated.
+ */
+ int (*open)(struct kernfs_open_file *of);
+ void (*release)(struct kernfs_open_file *of);
+
+ /*
* Read is handled by either seq_file or raw_read().
*
* If seq_show() is present, seq_file path is active. Other seq
@@ -228,6 +239,9 @@ struct kernfs_ops {
ssize_t (*write)(struct kernfs_open_file *of, char *buf, size_t bytes,
loff_t off);
+ unsigned int (*poll)(struct kernfs_open_file *of,
+ struct poll_table_struct *pt);
+
int (*mmap)(struct kernfs_open_file *of, struct vm_area_struct *vma);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -315,6 +329,8 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
const char *new_name, const void *new_ns);
int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
+unsigned int kernfs_generic_poll(struct kernfs_open_file *of,
+ struct poll_table_struct *pt);
void kernfs_notify(struct kernfs_node *kn);
const void *kernfs_super_ns(struct super_block *sb);
diff --git a/include/linux/keychord.h b/include/linux/keychord.h
deleted file mode 100644
index 08cf540..0000000
--- a/include/linux/keychord.h
+++ /dev/null
@@ -1,23 +0,0 @@
-/*
- * Key chord input driver
- *
- * Copyright (C) 2008 Google, Inc.
- * Author: Mike Lockwood <lockwood@android.com>
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
-*/
-
-#ifndef __LINUX_KEYCHORD_H_
-#define __LINUX_KEYCHORD_H_
-
-#include <uapi/linux/keychord.h>
-
-#endif /* __LINUX_KEYCHORD_H_ */
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index e233925..cb527c7 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -197,6 +197,7 @@ struct kretprobe_instance {
struct kretprobe *rp;
kprobe_opcode_t *ret_addr;
struct task_struct *task;
+ void *fp;
char data[0];
};
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 859fd20..509e990 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -578,6 +578,8 @@ enum mlx5_pci_status {
};
struct mlx5_td {
+ /* protects tirs list changes while tirs refresh */
+ struct mutex list_lock;
struct list_head tirs_list;
u32 tdn;
};
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1be499d..2091e1d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1872,6 +1872,8 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
return ptl;
}
+extern void __init pagecache_init(void);
+
extern void free_area_init(unsigned long * zones_size);
extern void free_area_init_node(int nid, unsigned long * zones_size,
unsigned long zone_start_pfn, unsigned long *zholes_size);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 57203c7..e717b8d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -161,6 +161,7 @@ enum node_stat_item {
NR_ISOLATED_FILE, /* Temporary isolated pages from file lru */
WORKINGSET_REFAULT,
WORKINGSET_ACTIVATE,
+ WORKINGSET_RESTORE,
WORKINGSET_NODERECLAIM,
NR_ANON_MAPPED, /* Mapped anonymous pages */
NR_FILE_MAPPED, /* pagecache pages mapped into pagetables.
diff --git a/include/linux/of.h b/include/linux/of.h
index 02f5a31..671c8f6 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -148,16 +148,20 @@ extern raw_spinlock_t devtree_lock;
#ifdef CONFIG_OF
void of_core_init(void);
-static inline bool is_of_node(struct fwnode_handle *fwnode)
+static inline bool is_of_node(const struct fwnode_handle *fwnode)
{
return !IS_ERR_OR_NULL(fwnode) && fwnode->type == FWNODE_OF;
}
-static inline struct device_node *to_of_node(struct fwnode_handle *fwnode)
-{
- return is_of_node(fwnode) ?
- container_of(fwnode, struct device_node, fwnode) : NULL;
-}
+#define to_of_node(__fwnode) \
+ ({ \
+ typeof(__fwnode) __to_of_node_fwnode = (__fwnode); \
+ \
+ is_of_node(__to_of_node_fwnode) ? \
+ container_of(__to_of_node_fwnode, \
+ struct device_node, fwnode) : \
+ NULL; \
+ })
static inline bool of_have_populated_dt(void)
{
@@ -529,12 +533,12 @@ static inline void of_core_init(void)
{
}
-static inline bool is_of_node(struct fwnode_handle *fwnode)
+static inline bool is_of_node(const struct fwnode_handle *fwnode)
{
return false;
}
-static inline struct device_node *to_of_node(struct fwnode_handle *fwnode)
+static inline struct device_node *to_of_node(const struct fwnode_handle *fwnode)
{
return NULL;
}
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 74e4dda..4077d60 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -73,12 +73,14 @@
*/
enum pageflags {
PG_locked, /* Page is locked. Don't touch. */
- PG_error,
+ PG_waiters, /* Page has waiters, check its waitqueue */
PG_referenced,
PG_uptodate,
PG_dirty,
PG_lru,
PG_active,
+ PG_workingset,
+ PG_error,
PG_slab,
PG_owner_priv_1, /* Owner use. If pagecache, fs may use*/
PG_arch_1,
@@ -167,6 +169,9 @@ static __always_inline int PageCompound(struct page *page)
* for compound page all operations related to the page flag applied to
* head page.
*
+ * PF_ONLY_HEAD:
+ * for compound page, callers only ever operate on the head page.
+ *
* PF_NO_TAIL:
* modifications of the page flag must be done on small or head pages,
* checks can be done on tail pages too.
@@ -176,6 +181,9 @@ static __always_inline int PageCompound(struct page *page)
*/
#define PF_ANY(page, enforce) page
#define PF_HEAD(page, enforce) compound_head(page)
+#define PF_ONLY_HEAD(page, enforce) ({ \
+ VM_BUG_ON_PGFLAGS(PageTail(page), page); \
+ page;})
#define PF_NO_TAIL(page, enforce) ({ \
VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page); \
compound_head(page);})
@@ -253,6 +261,7 @@ static inline int TestClearPage##uname(struct page *page) { return 0; }
TESTSETFLAG_FALSE(uname) TESTCLEARFLAG_FALSE(uname)
__PAGEFLAG(Locked, locked, PF_NO_TAIL)
+PAGEFLAG(Waiters, waiters, PF_ONLY_HEAD) __CLEARPAGEFLAG(Waiters, waiters, PF_ONLY_HEAD)
PAGEFLAG(Error, error, PF_NO_COMPOUND) TESTCLEARFLAG(Error, error, PF_NO_COMPOUND)
PAGEFLAG(Referenced, referenced, PF_HEAD)
TESTCLEARFLAG(Referenced, referenced, PF_HEAD)
@@ -262,6 +271,8 @@ PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
TESTCLEARFLAG(Active, active, PF_HEAD)
+PAGEFLAG(Workingset, workingset, PF_HEAD)
+ TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
__PAGEFLAG(Slab, slab, PF_NO_TAIL)
__PAGEFLAG(SlobFree, slob_free, PF_NO_TAIL)
PAGEFLAG(Checked, checked, PF_NO_COMPOUND) /* Used by some filesystems */
@@ -735,6 +746,7 @@ static inline int page_has_private(struct page *page)
#undef PF_ANY
#undef PF_HEAD
+#undef PF_ONLY_HEAD
#undef PF_NO_TAIL
#undef PF_NO_COMPOUND
#endif /* !__GENERATING_BOUNDS_H */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index c890724a..0dfc605 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -238,6 +238,7 @@ pgoff_t page_cache_prev_hole(struct address_space *mapping,
#define FGP_WRITE 0x00000008
#define FGP_NOFS 0x00000010
#define FGP_NOWAIT 0x00000020
+#define FGP_FOR_MMAP 0x00000040
struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
int fgp_flags, gfp_t cache_gfp_mask);
@@ -494,22 +495,14 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
* and for filesystems which need to wait on PG_private.
*/
extern void wait_on_page_bit(struct page *page, int bit_nr);
-
extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
-extern int wait_on_page_bit_killable_timeout(struct page *page,
- int bit_nr, unsigned long timeout);
+extern void wake_up_page_bit(struct page *page, int bit_nr);
-static inline int wait_on_page_locked_killable(struct page *page)
-{
- if (!PageLocked(page))
- return 0;
- return wait_on_page_bit_killable(compound_head(page), PG_locked);
-}
-
-extern wait_queue_head_t *page_waitqueue(struct page *page);
static inline void wake_up_page(struct page *page, int bit)
{
- __wake_up_bit(page_waitqueue(page), &page->flags, bit);
+ if (!PageWaiters(page))
+ return;
+ wake_up_page_bit(page, bit);
}
/*
@@ -525,6 +518,13 @@ static inline void wait_on_page_locked(struct page *page)
wait_on_page_bit(compound_head(page), PG_locked);
}
+static inline int wait_on_page_locked_killable(struct page *page)
+{
+ if (!PageLocked(page))
+ return 0;
+ return wait_on_page_bit_killable(compound_head(page), PG_locked);
+}
+
/*
* Wait for a page to complete writeback
*/
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index e7497c9..4f71293 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -182,6 +182,7 @@ void generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *);
int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *);
void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *);
+void pipe_buf_mark_unmergeable(struct pipe_buffer *buf);
extern const struct pipe_buf_operations nosteal_pipe_buf_ops;
diff --git a/include/linux/property.h b/include/linux/property.h
index 459337f..d5c7ebd 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -233,7 +233,7 @@ struct property_entry {
#define PROPERTY_ENTRY_STRING(_name_, _val_) \
(struct property_entry) { \
.name = _name_, \
- .length = sizeof(_val_), \
+ .length = sizeof(const char *), \
.is_string = true, \
{ .value = { .str = _val_ } }, \
}
diff --git a/include/linux/psi.h b/include/linux/psi.h
new file mode 100644
index 0000000..b825fa5
--- /dev/null
+++ b/include/linux/psi.h
@@ -0,0 +1,63 @@
+#ifndef _LINUX_PSI_H
+#define _LINUX_PSI_H
+
+#include <linux/jump_label.h>
+#include <linux/psi_types.h>
+#include <linux/sched.h>
+#include <linux/poll.h>
+#include <linux/cgroup-defs.h>
+
+struct seq_file;
+struct css_set;
+
+#ifdef CONFIG_PSI
+
+extern struct static_key_false psi_disabled;
+
+void psi_init(void);
+
+void psi_task_change(struct task_struct *task, int clear, int set);
+
+void psi_memstall_tick(struct task_struct *task, int cpu);
+void psi_memstall_enter(unsigned long *flags);
+void psi_memstall_leave(unsigned long *flags);
+
+int psi_show(struct seq_file *s, struct psi_group *group, enum psi_res res);
+
+#ifdef CONFIG_CGROUPS
+int psi_cgroup_alloc(struct cgroup *cgrp);
+void psi_cgroup_free(struct cgroup *cgrp);
+void cgroup_move_task(struct task_struct *p, struct css_set *to);
+
+struct psi_trigger *psi_trigger_create(struct psi_group *group,
+ char *buf, size_t nbytes, enum psi_res res);
+void psi_trigger_replace(void **trigger_ptr, struct psi_trigger *t);
+
+unsigned int psi_trigger_poll(void **trigger_ptr, struct file *file,
+ poll_table *wait);
+#endif
+
+#else /* CONFIG_PSI */
+
+static inline void psi_init(void) {}
+
+static inline void psi_memstall_enter(unsigned long *flags) {}
+static inline void psi_memstall_leave(unsigned long *flags) {}
+
+#ifdef CONFIG_CGROUPS
+static inline int psi_cgroup_alloc(struct cgroup *cgrp)
+{
+ return 0;
+}
+static inline void psi_cgroup_free(struct cgroup *cgrp)
+{
+}
+static inline void cgroup_move_task(struct task_struct *p, struct css_set *to)
+{
+ rcu_assign_pointer(p->cgroups, to);
+}
+#endif
+
+#endif /* CONFIG_PSI */
+
+#endif /* _LINUX_PSI_H */
diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
new file mode 100644
index 0000000..07aaf9b
--- /dev/null
+++ b/include/linux/psi_types.h
@@ -0,0 +1,173 @@
+#ifndef _LINUX_PSI_TYPES_H
+#define _LINUX_PSI_TYPES_H
+
+#include <linux/kthread.h>
+#include <linux/seqlock.h>
+#include <linux/types.h>
+#include <linux/kref.h>
+#include <linux/wait.h>
+
+#ifdef CONFIG_PSI
+
+/* Tracked task states */
+enum psi_task_count {
+ NR_IOWAIT,
+ NR_MEMSTALL,
+ NR_RUNNING,
+ NR_PSI_TASK_COUNTS = 3,
+};
+
+/* Task state bitmasks */
+#define TSK_IOWAIT (1 << NR_IOWAIT)
+#define TSK_MEMSTALL (1 << NR_MEMSTALL)
+#define TSK_RUNNING (1 << NR_RUNNING)
+
+/* Resources that workloads could be stalled on */
+enum psi_res {
+ PSI_IO,
+ PSI_MEM,
+ PSI_CPU,
+ NR_PSI_RESOURCES = 3,
+};
+
+/*
+ * Pressure states for each resource:
+ *
+ * SOME: Stalled tasks & working tasks
+ * FULL: Stalled tasks & no working tasks
+ */
+enum psi_states {
+ PSI_IO_SOME,
+ PSI_IO_FULL,
+ PSI_MEM_SOME,
+ PSI_MEM_FULL,
+ PSI_CPU_SOME,
+ /* Only per-CPU, to weigh the CPU in the global average: */
+ PSI_NONIDLE,
+ NR_PSI_STATES = 6,
+};
+
+enum psi_aggregators {
+ PSI_AVGS = 0,
+ PSI_POLL,
+ NR_PSI_AGGREGATORS,
+};
+
+struct psi_group_cpu {
+ /* 1st cacheline updated by the scheduler */
+
+ /* Aggregator needs to know of concurrent changes */
+ seqcount_t seq ____cacheline_aligned_in_smp;
+
+ /* States of the tasks belonging to this group */
+ unsigned int tasks[NR_PSI_TASK_COUNTS];
+
+ /* Aggregate pressure state derived from the tasks */
+ u32 state_mask;
+
+ /* Period time sampling buckets for each state of interest (ns) */
+ u32 times[NR_PSI_STATES];
+
+ /* Time of last task change in this group (rq_clock) */
+ u64 state_start;
+
+ /* 2nd cacheline updated by the aggregator */
+
+ /* Delta detection against the sampling buckets */
+ u32 times_prev[NR_PSI_AGGREGATORS][NR_PSI_STATES]
+ ____cacheline_aligned_in_smp;
+};
+
+/* PSI growth tracking window */
+struct psi_window {
+ /* Window size in ns */
+ u64 size;
+
+ /* Start time of the current window in ns */
+ u64 start_time;
+
+ /* Value at the start of the window */
+ u64 start_value;
+
+ /* Value growth in the previous window */
+ u64 prev_growth;
+};
+
+struct psi_trigger {
+ /* PSI state being monitored by the trigger */
+ enum psi_states state;
+
+ /* User-spacified threshold in ns */
+ u64 threshold;
+
+ /* List node inside triggers list */
+ struct list_head node;
+
+ /* Backpointer needed during trigger destruction */
+ struct psi_group *group;
+
+ /* Wait queue for polling */
+ wait_queue_head_t event_wait;
+
+ /* Pending event flag */
+ int event;
+
+ /* Tracking window */
+ struct psi_window win;
+
+ /*
+ * Time last event was generated. Used for rate-limiting
+ * events to one per window
+ */
+ u64 last_event_time;
+
+ /* Refcounting to prevent premature destruction */
+ struct kref refcount;
+};
+
+struct psi_group {
+ /* Protects data used by the aggregator */
+ struct mutex avgs_lock;
+
+ /* Per-cpu task state & time tracking */
+ struct psi_group_cpu __percpu *pcpu;
+
+ /* Running pressure averages */
+ u64 avg_total[NR_PSI_STATES - 1];
+ u64 avg_last_update;
+ u64 avg_next_update;
+
+ /* Aggregator work control */
+ struct delayed_work avgs_work;
+
+ /* Total stall times and sampled pressure averages */
+ u64 total[NR_PSI_AGGREGATORS][NR_PSI_STATES - 1];
+ unsigned long avg[NR_PSI_STATES - 1][3];
+
+ /* Monitor work control */
+ atomic_t poll_scheduled;
+ struct kthread_worker __rcu *poll_kworker;
+ struct kthread_delayed_work poll_work;
+
+ /* Protects data used by the monitor */
+ struct mutex trigger_lock;
+
+ /* Configured polling triggers */
+ struct list_head triggers;
+ u32 nr_triggers[NR_PSI_STATES - 1];
+ u32 poll_states;
+ u64 poll_min_period;
+
+ /* Total stall times at the start of monitor activation */
+ u64 polling_total[NR_PSI_STATES - 1];
+ u64 polling_next_update;
+ u64 polling_until;
+};
+
+#else /* CONFIG_PSI */
+
+struct psi_group { };
+
+#endif /* CONFIG_PSI */
+
+#endif /* _LINUX_PSI_TYPES_H */
diff --git a/include/linux/relay.h b/include/linux/relay.h
index 68c1448..2560f87 100644
--- a/include/linux/relay.h
+++ b/include/linux/relay.h
@@ -65,7 +65,7 @@ struct rchan
struct kref kref; /* channel refcount */
void *private_data; /* for user-defined data */
size_t last_toobig; /* tried to log event > subbuf size */
- struct rchan_buf ** __percpu buf; /* per-cpu channel buffers */
+ struct rchan_buf * __percpu *buf; /* per-cpu channel buffers */
int is_global; /* One global buffer ? */
struct list_head list; /* for channel list */
struct dentry *parent; /* parent dentry passed to open */
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 19d0778..121c8f9 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -125,7 +125,7 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts,
unsigned long *lost_events);
struct ring_buffer_iter *
-ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu);
+ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu, gfp_t flags);
void ring_buffer_read_prepare_sync(void);
void ring_buffer_read_start(struct ring_buffer_iter *iter);
void ring_buffer_read_finish(struct ring_buffer_iter *iter);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3b2f96a..4abc0b0 100755
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -58,7 +58,6 @@ struct sched_param {
#include <linux/uidgid.h>
#include <linux/gfp.h>
#include <linux/magic.h>
-#include <linux/cgroup-defs.h>
#include <asm/processor.h>
@@ -139,31 +138,6 @@ struct nameidata;
#define VMACACHE_SIZE (1U << VMACACHE_BITS)
#define VMACACHE_MASK (VMACACHE_SIZE - 1)
-/*
- * These are the constant used to fake the fixed-point load-average
- * counting. Some notes:
- * - 11 bit fractions expand to 22 bits by the multiplies: this gives
- * a load-average precision of 10 bits integer + 11 bits fractional
- * - if you want to count load-averages more often, you need more
- * precision, or rounding will get you. With 2-second counting freq,
- * the EXP_n values would be 1981, 2034 and 2043 if still using only
- * 11 bit fractions.
- */
-extern unsigned long avenrun[]; /* Load averages */
-extern void get_avenrun(unsigned long *loads, unsigned long offset, int shift);
-
-#define FSHIFT 11 /* nr of bits of precision */
-#define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
-#define LOAD_FREQ (5*HZ+1) /* 5 sec intervals */
-#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */
-#define EXP_5 2014 /* 1/exp(5sec/5min) */
-#define EXP_15 2037 /* 1/exp(5sec/15min) */
-
-#define CALC_LOAD(load,exp,n) \
- load *= exp; \
- load += n*(FIXED_1-exp); \
- load >>= FSHIFT;
-
extern unsigned long total_forks;
extern int nr_threads;
DECLARE_PER_CPU(unsigned long, process_counts);
@@ -1020,39 +994,7 @@ struct sched_info {
};
#endif /* CONFIG_SCHED_INFO */
-#ifdef CONFIG_TASK_DELAY_ACCT
-struct task_delay_info {
- spinlock_t lock;
- unsigned int flags; /* Private per-task flags */
-
- /* For each stat XXX, add following, aligned appropriately
- *
- * struct timespec XXX_start, XXX_end;
- * u64 XXX_delay;
- * u32 XXX_count;
- *
- * Atomicity of updates to XXX_delay, XXX_count protected by
- * single lock above (split into XXX_lock if contention is an issue).
- */
-
- /*
- * XXX_count is incremented on every XXX operation, the delay
- * associated with the operation is added to XXX_delay.
- * XXX_delay contains the accumulated delay time in nanoseconds.
- */
- u64 blkio_start; /* Shared by blkio, swapin */
- u64 blkio_delay; /* wait for sync block io completion */
- u64 swapin_delay; /* wait for swapin block io completion */
- u32 blkio_count; /* total count of the number of sync block */
- /* io operations performed */
- u32 swapin_count; /* total count of the number of swapin block */
- /* io operations performed */
-
- u64 freepages_start;
- u64 freepages_delay; /* wait for memory reclaim */
- u32 freepages_count; /* total count of memory reclaim */
-};
-#endif /* CONFIG_TASK_DELAY_ACCT */
+struct task_delay_info;
static inline int sched_info_on(void)
{
@@ -1850,6 +1792,10 @@ struct task_struct {
unsigned sched_contributes_to_load:1;
unsigned sched_migrated:1;
unsigned sched_remote_wakeup:1;
+#ifdef CONFIG_PSI
+ unsigned sched_psi_wake_requeue:1;
+#endif
+
unsigned :0; /* force alignment to the next boundary */
/* unserialized, strictly 'current' */
@@ -2067,6 +2013,10 @@ struct task_struct {
unsigned long ptrace_message;
siginfo_t *last_siginfo; /* For ptrace use. */
struct task_io_accounting ioac;
+#ifdef CONFIG_PSI
+ /* Pressure stall state */
+ unsigned int psi_flags;
+#endif
#if defined(CONFIG_TASK_XACCT)
u64 acct_rss_mem1; /* accumulated rss usage */
u64 acct_vm_mem1; /* accumulated virtual memory usage */
@@ -2160,9 +2110,10 @@ struct task_struct {
struct page_frag task_frag;
-#ifdef CONFIG_TASK_DELAY_ACCT
- struct task_delay_info *delays;
+#ifdef CONFIG_TASK_DELAY_ACCT
+ struct task_delay_info *delays;
#endif
+
#ifdef CONFIG_FAULT_INJECTION
int make_it_fail;
#endif
@@ -2573,6 +2524,7 @@ extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut,
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* randomize virtual address space */
#define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */
+#define PF_MEMSTALL 0x01000000 /* Stalled due to lack of memory */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
#define PF_MUTEX_TESTER 0x20000000 /* Thread belongs to the rt mutex tester */
@@ -3491,11 +3443,7 @@ static inline void unlock_task_sighand(struct task_struct *tsk,
* subsystems needing threadgroup stability can hook into for
* synchronization.
*/
-static inline void threadgroup_change_begin(struct task_struct *tsk)
-{
- might_sleep();
- cgroup_threadgroup_change_begin(tsk);
-}
+extern void threadgroup_change_begin(struct task_struct *tsk);
/**
* threadgroup_change_end - mark the end of changes to a threadgroup
@@ -3503,10 +3451,7 @@ static inline void threadgroup_change_begin(struct task_struct *tsk)
*
* See threadgroup_change_begin().
*/
-static inline void threadgroup_change_end(struct task_struct *tsk)
-{
- cgroup_threadgroup_change_end(tsk);
-}
+extern void threadgroup_change_end(struct task_struct *tsk);
#ifdef CONFIG_THREAD_INFO_IN_TASK
diff --git a/include/linux/sched/loadavg.h b/include/linux/sched/loadavg.h
new file mode 100644
index 0000000..208bbf2
--- /dev/null
+++ b/include/linux/sched/loadavg.h
@@ -0,0 +1,47 @@
+#ifndef _LINUX_SCHED_LOADAVG_H
+#define _LINUX_SCHED_LOADAVG_H
+
+/*
+ * These are the constant used to fake the fixed-point load-average
+ * counting. Some notes:
+ * - 11 bit fractions expand to 22 bits by the multiplies: this gives
+ * a load-average precision of 10 bits integer + 11 bits fractional
+ * - if you want to count load-averages more often, you need more
+ * precision, or rounding will get you. With 2-second counting freq,
+ * the EXP_n values would be 1981, 2034 and 2043 if still using only
+ * 11 bit fractions.
+ */
+extern unsigned long avenrun[]; /* Load averages */
+extern void get_avenrun(unsigned long *loads, unsigned long offset, int shift);
+
+#define FSHIFT 11 /* nr of bits of precision */
+#define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
+#define LOAD_FREQ (5*HZ+1) /* 5 sec intervals */
+#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */
+#define EXP_5 2014 /* 1/exp(5sec/5min) */
+#define EXP_15 2037 /* 1/exp(5sec/15min) */
+
+/*
+ * a1 = a0 * e + a * (1 - e)
+ */
+static inline unsigned long
+calc_load(unsigned long load, unsigned long exp, unsigned long active)
+{
+ unsigned long newload;
+
+ newload = load * exp + active * (FIXED_1 - exp);
+ if (active >= load)
+ newload += FIXED_1-1;
+
+ return newload / FIXED_1;
+}
+
+extern unsigned long calc_load_n(unsigned long load, unsigned long exp,
+ unsigned long active, unsigned int n);
+
+#define LOAD_INT(x) ((x) >> FSHIFT)
+#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
+
+extern void calc_global_load(unsigned long ticks);
+
+#endif /* _LINUX_SCHED_LOADAVG_H */
diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 0000000..fa7a6b9
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,140 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4 for a secure PRF
+ * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for
+ * hashtables.
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+#define SIPHASH_ALIGNMENT __alignof__(u64)
+typedef struct {
+ u64 key[2];
+} siphash_key_t;
+
+u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t *key);
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t *key);
+#endif
+
+u64 siphash_1u64(const u64 a, const siphash_key_t *key);
+u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t *key);
+u64 siphash_3u64(const u64 a, const u64 b, const u64 c,
+ const siphash_key_t *key);
+u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d,
+ const siphash_key_t *key);
+u64 siphash_1u32(const u32 a, const siphash_key_t *key);
+u64 siphash_3u32(const u32 a, const u32 b, const u32 c,
+ const siphash_key_t *key);
+
+static inline u64 siphash_2u32(const u32 a, const u32 b,
+ const siphash_key_t *key)
+{
+ return siphash_1u64((u64)b << 32 | a, key);
+}
+static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c,
+ const u32 d, const siphash_key_t *key)
+{
+ return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key);
+}
+
+
+static inline u64 ___siphash_aligned(const __le64 *data, size_t len,
+ const siphash_key_t *key)
+{
+ if (__builtin_constant_p(len) && len == 4)
+ return siphash_1u32(le32_to_cpup((const __le32 *)data), key);
+ if (__builtin_constant_p(len) && len == 8)
+ return siphash_1u64(le64_to_cpu(data[0]), key);
+ if (__builtin_constant_p(len) && len == 16)
+ return siphash_2u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]),
+ key);
+ if (__builtin_constant_p(len) && len == 24)
+ return siphash_3u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]),
+ le64_to_cpu(data[2]), key);
+ if (__builtin_constant_p(len) && len == 32)
+ return siphash_4u64(le64_to_cpu(data[0]), le64_to_cpu(data[1]),
+ le64_to_cpu(data[2]), le64_to_cpu(data[3]),
+ key);
+ return __siphash_aligned(data, len, key);
+}
+
+/**
+ * siphash - compute 64-bit siphash PRF value
+ * @data: buffer to hash
+ * @size: size of @data
+ * @key: the siphash key
+ */
+static inline u64 siphash(const void *data, size_t len,
+ const siphash_key_t *key)
+{
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (!IS_ALIGNED((unsigned long)data, SIPHASH_ALIGNMENT))
+ return __siphash_unaligned(data, len, key);
+#endif
+ return ___siphash_aligned(data, len, key);
+}
+
+#define HSIPHASH_ALIGNMENT __alignof__(unsigned long)
+typedef struct {
+ unsigned long key[2];
+} hsiphash_key_t;
+
+u32 __hsiphash_aligned(const void *data, size_t len,
+ const hsiphash_key_t *key);
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u32 __hsiphash_unaligned(const void *data, size_t len,
+ const hsiphash_key_t *key);
+#endif
+
+u32 hsiphash_1u32(const u32 a, const hsiphash_key_t *key);
+u32 hsiphash_2u32(const u32 a, const u32 b, const hsiphash_key_t *key);
+u32 hsiphash_3u32(const u32 a, const u32 b, const u32 c,
+ const hsiphash_key_t *key);
+u32 hsiphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d,
+ const hsiphash_key_t *key);
+
+static inline u32 ___hsiphash_aligned(const __le32 *data, size_t len,
+ const hsiphash_key_t *key)
+{
+ if (__builtin_constant_p(len) && len == 4)
+ return hsiphash_1u32(le32_to_cpu(data[0]), key);
+ if (__builtin_constant_p(len) && len == 8)
+ return hsiphash_2u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]),
+ key);
+ if (__builtin_constant_p(len) && len == 12)
+ return hsiphash_3u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]),
+ le32_to_cpu(data[2]), key);
+ if (__builtin_constant_p(len) && len == 16)
+ return hsiphash_4u32(le32_to_cpu(data[0]), le32_to_cpu(data[1]),
+ le32_to_cpu(data[2]), le32_to_cpu(data[3]),
+ key);
+ return __hsiphash_aligned(data, len, key);
+}
+
+/**
+ * hsiphash - compute 32-bit hsiphash PRF value
+ * @data: buffer to hash
+ * @size: size of @data
+ * @key: the hsiphash key
+ */
+static inline u32 hsiphash(const void *data, size_t len,
+ const hsiphash_key_t *key)
+{
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (!IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
+ return __hsiphash_unaligned(data, len, key);
+#endif
+ return ___hsiphash_aligned(data, len, key);
+}
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/include/linux/string.h b/include/linux/string.h
index d55f6ee..c890bdf 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -111,6 +111,9 @@ extern void * memscan(void *,int,__kernel_size_t);
#ifndef __HAVE_ARCH_MEMCMP
extern int memcmp(const void *,const void *,__kernel_size_t);
#endif
+#ifndef __HAVE_ARCH_BCMP
+extern int bcmp(const void *,const void *,__kernel_size_t);
+#endif
#ifndef __HAVE_ARCH_MEMCHR
extern void * memchr(const void *,int,__kernel_size_t);
#endif
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7b488ec..5c15a01 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -135,9 +135,9 @@ struct swap_extent {
/*
* Max bad pages in the new format..
*/
-#define __swapoffset(x) ((unsigned long)&((union swap_header *)0)->x)
#define MAX_SWAP_BADPAGES \
- ((__swapoffset(magic.magic) - __swapoffset(info.badpages)) / sizeof(int))
+ ((offsetof(union swap_header, magic.magic) - \
+ offsetof(union swap_header, info.badpages)) / sizeof(int))
enum {
SWP_USED = (1 << 0), /* is slot in swap_info[] used? */
@@ -249,7 +249,7 @@ struct swap_info_struct {
/* linux/mm/workingset.c */
void *workingset_eviction(struct address_space *mapping, struct page *page);
-bool workingset_refault(void *shadow);
+void workingset_refault(struct page *page, void *shadow);
void workingset_activation(struct page *page);
extern struct list_lru workingset_shadow_nodes;
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index e8d3693..b38c187 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -62,7 +62,7 @@ struct virtqueue;
/*
* Creates a virtqueue and allocates the descriptor ring. If
* may_reduce_num is set, then this may allocate a smaller ring than
- * expected. The caller should query virtqueue_get_ring_size to learn
+ * expected. The caller should query virtqueue_get_vring_size to learn
* the actual size of the ring.
*/
struct virtqueue *vring_create_virtqueue(unsigned int index,
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 8e880f7..b95c511 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -359,6 +359,8 @@ extern struct workqueue_struct *system_freezable_wq;
extern struct workqueue_struct *system_power_efficient_wq;
extern struct workqueue_struct *system_freezable_power_efficient_wq;
+extern bool wq_online;
+
extern struct workqueue_struct *
__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);
@@ -598,7 +600,7 @@ static inline bool schedule_delayed_work(struct delayed_work *dwork,
*/
static inline bool keventd_up(void)
{
- return system_wq != NULL;
+ return wq_online;
}
#ifndef CONFIG_SMP
@@ -635,4 +637,7 @@ int workqueue_online_cpu(unsigned int cpu);
int workqueue_offline_cpu(unsigned int cpu);
#endif
+int __init workqueue_init_early(void);
+int __init workqueue_init(void);
+
#endif
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 3eed4f1..6341e78 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -363,7 +363,6 @@ void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time);
-void page_writeback_init(void);
void balance_dirty_pages_ratelimited(struct address_space *mapping);
bool wb_over_bg_thresh(struct bdi_writeback *wb);
diff --git a/include/net/gro_cells.h b/include/net/gro_cells.h
index 95f33ee..6db0e85 100644
--- a/include/net/gro_cells.h
+++ b/include/net/gro_cells.h
@@ -18,22 +18,36 @@ static inline int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *sk
{
struct gro_cell *cell;
struct net_device *dev = skb->dev;
+ int res;
- if (!gcells->cells || skb_cloned(skb) || !(dev->features & NETIF_F_GRO))
- return netif_rx(skb);
+ rcu_read_lock();
+ if (unlikely(!(dev->flags & IFF_UP)))
+ goto drop;
+
+ if (!gcells->cells || skb_cloned(skb) || !(dev->features & NETIF_F_GRO)) {
+ res = netif_rx(skb);
+ goto unlock;
+ }
cell = this_cpu_ptr(gcells->cells);
if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
+drop:
atomic_long_inc(&dev->rx_dropped);
kfree_skb(skb);
- return NET_RX_DROP;
+ res = NET_RX_DROP;
+ goto unlock;
}
__skb_queue_tail(&cell->napi_skbs, skb);
if (skb_queue_len(&cell->napi_skbs) == 1)
napi_schedule(&cell->napi);
- return NET_RX_SUCCESS;
+
+ res = NET_RX_SUCCESS;
+
+unlock:
+ rcu_read_unlock();
+ return res;
}
/* called under BH context */
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 197a30d..146054c 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -289,11 +289,6 @@ static inline int inet_csk_reqsk_queue_len(const struct sock *sk)
return reqsk_queue_len(&inet_csk(sk)->icsk_accept_queue);
}
-static inline int inet_csk_reqsk_queue_young(const struct sock *sk)
-{
- return reqsk_queue_len_young(&inet_csk(sk)->icsk_accept_queue);
-}
-
static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk)
{
return inet_csk_reqsk_queue_len(sk) >= sk->sk_max_ack_backlog;
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index a3812e9..c2c724a 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -76,8 +76,8 @@ struct inet_frag_queue {
struct timer_list timer;
spinlock_t lock;
atomic_t refcnt;
- struct sk_buff *fragments; /* Used in IPv6. */
- struct rb_root rb_fragments; /* Used in IPv4. */
+ struct sk_buff *fragments; /* used in 6lopwpan IPv6. */
+ struct rb_root rb_fragments; /* Used in IPv4/IPv6. */
struct sk_buff *fragments_tail;
struct sk_buff *last_run_head;
ktime_t stamp;
@@ -152,4 +152,16 @@ static inline void add_frag_mem_limit(struct netns_frags *nf, long val)
extern const u8 ip_frag_ecn_table[16];
+/* Return values of inet_frag_queue_insert() */
+#define IPFRAG_OK 0
+#define IPFRAG_DUP 1
+#define IPFRAG_OVERLAP 2
+int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb,
+ int offset, int end);
+void *inet_frag_reasm_prepare(struct inet_frag_queue *q, struct sk_buff *skb,
+ struct sk_buff *parent);
+void inet_frag_reasm_finish(struct inet_frag_queue *q, struct sk_buff *head,
+ void *reasm_data);
+struct sk_buff *inet_frag_pull_head(struct inet_frag_queue *q);
+
#endif
diff --git a/include/net/ip.h b/include/net/ip.h
index 9c31d95..db6e110 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -583,7 +583,7 @@ int ip_options_get_from_user(struct net *net, struct ip_options_rcu **optp,
unsigned char __user *data, int optlen);
void ip_options_undo(struct ip_options *opt);
void ip_forward_options(struct sk_buff *skb);
-int ip_options_rcv_srr(struct sk_buff *skb);
+int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev);
/*
* Functions provided by ip_sockglue.c
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7cb100d..168009e 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -511,35 +511,6 @@ static inline bool ipv6_prefix_equal(const struct in6_addr *addr1,
}
#endif
-struct inet_frag_queue;
-
-enum ip6_defrag_users {
- IP6_DEFRAG_LOCAL_DELIVER,
- IP6_DEFRAG_CONNTRACK_IN,
- __IP6_DEFRAG_CONNTRACK_IN = IP6_DEFRAG_CONNTRACK_IN + USHRT_MAX,
- IP6_DEFRAG_CONNTRACK_OUT,
- __IP6_DEFRAG_CONNTRACK_OUT = IP6_DEFRAG_CONNTRACK_OUT + USHRT_MAX,
- IP6_DEFRAG_CONNTRACK_BRIDGE_IN,
- __IP6_DEFRAG_CONNTRACK_BRIDGE_IN = IP6_DEFRAG_CONNTRACK_BRIDGE_IN + USHRT_MAX,
-};
-
-void ip6_frag_init(struct inet_frag_queue *q, const void *a);
-extern const struct rhashtable_params ip6_rhash_params;
-
-/*
- * Equivalent of ipv4 struct ip
- */
-struct frag_queue {
- struct inet_frag_queue q;
-
- int iif;
- unsigned int csum;
- __u16 nhoffset;
- u8 ecn;
-};
-
-void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq);
-
static inline bool ipv6_addr_any(const struct in6_addr *a)
{
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
diff --git a/include/net/ipv6_frag.h b/include/net/ipv6_frag.h
new file mode 100644
index 0000000..28aa9b3
--- /dev/null
+++ b/include/net/ipv6_frag.h
@@ -0,0 +1,111 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _IPV6_FRAG_H
+#define _IPV6_FRAG_H
+#include <linux/kernel.h>
+#include <net/addrconf.h>
+#include <net/ipv6.h>
+#include <net/inet_frag.h>
+
+enum ip6_defrag_users {
+ IP6_DEFRAG_LOCAL_DELIVER,
+ IP6_DEFRAG_CONNTRACK_IN,
+ __IP6_DEFRAG_CONNTRACK_IN = IP6_DEFRAG_CONNTRACK_IN + USHRT_MAX,
+ IP6_DEFRAG_CONNTRACK_OUT,
+ __IP6_DEFRAG_CONNTRACK_OUT = IP6_DEFRAG_CONNTRACK_OUT + USHRT_MAX,
+ IP6_DEFRAG_CONNTRACK_BRIDGE_IN,
+ __IP6_DEFRAG_CONNTRACK_BRIDGE_IN = IP6_DEFRAG_CONNTRACK_BRIDGE_IN + USHRT_MAX,
+};
+
+/*
+ * Equivalent of ipv4 struct ip
+ */
+struct frag_queue {
+ struct inet_frag_queue q;
+
+ int iif;
+ __u16 nhoffset;
+ u8 ecn;
+};
+
+#if IS_ENABLED(CONFIG_IPV6)
+static inline void ip6frag_init(struct inet_frag_queue *q, const void *a)
+{
+ struct frag_queue *fq = container_of(q, struct frag_queue, q);
+ const struct frag_v6_compare_key *key = a;
+
+ q->key.v6 = *key;
+ fq->ecn = 0;
+}
+
+static inline u32 ip6frag_key_hashfn(const void *data, u32 len, u32 seed)
+{
+ return jhash2(data,
+ sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
+}
+
+static inline u32 ip6frag_obj_hashfn(const void *data, u32 len, u32 seed)
+{
+ const struct inet_frag_queue *fq = data;
+
+ return jhash2((const u32 *)&fq->key.v6,
+ sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
+}
+
+static inline int
+ip6frag_obj_cmpfn(struct rhashtable_compare_arg *arg, const void *ptr)
+{
+ const struct frag_v6_compare_key *key = arg->key;
+ const struct inet_frag_queue *fq = ptr;
+
+ return !!memcmp(&fq->key, key, sizeof(*key));
+}
+
+static inline void
+ip6frag_expire_frag_queue(struct net *net, struct frag_queue *fq)
+{
+ struct net_device *dev = NULL;
+ struct sk_buff *head;
+
+ rcu_read_lock();
+ spin_lock(&fq->q.lock);
+
+ if (fq->q.flags & INET_FRAG_COMPLETE)
+ goto out;
+
+ inet_frag_kill(&fq->q);
+
+ dev = dev_get_by_index_rcu(net, fq->iif);
+ if (!dev)
+ goto out;
+
+ __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS);
+ __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
+
+ /* Don't send error if the first segment did not arrive. */
+ if (!(fq->q.flags & INET_FRAG_FIRST_IN))
+ goto out;
+
+ /* sk_buff::dev and sk_buff::rbnode are unionized. So we
+ * pull the head out of the tree in order to be able to
+ * deal with head->dev.
+ */
+ head = inet_frag_pull_head(&fq->q);
+ if (!head)
+ goto out;
+
+ head->dev = dev;
+ skb_get(head);
+ spin_unlock(&fq->q.lock);
+
+ icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
+ kfree_skb(head);
+ goto out_rcu_unlock;
+
+out:
+ spin_unlock(&fq->q.lock);
+out_rcu_unlock:
+ rcu_read_unlock();
+ inet_frag_put(&fq->q);
+}
+#endif
+#endif
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index c05db6f..0cdafe3 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -53,6 +53,7 @@ struct net {
*/
spinlock_t rules_mod_lock;
+ u32 hash_mix;
atomic64_t cookie_gen;
struct list_head list; /* list of network namespaces */
diff --git a/include/net/netfilter/br_netfilter.h b/include/net/netfilter/br_netfilter.h
index 0b0c35c..238d1b8 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -48,7 +48,6 @@ static inline struct rtable *bridge_parent_rtable(const struct net_device *dev)
}
struct net_device *setup_pre_routing(struct sk_buff *skb);
-void br_netfilter_enable(void);
#if IS_ENABLED(CONFIG_IPV6)
int br_validate_ipv6(struct net *net, struct sk_buff *skb);
diff --git a/include/net/netns/hash.h b/include/net/netns/hash.h
index 69a6715..a347b2f 100644
--- a/include/net/netns/hash.h
+++ b/include/net/netns/hash.h
@@ -1,21 +1,10 @@
#ifndef __NET_NS_HASH_H__
#define __NET_NS_HASH_H__
-#include <asm/cache.h>
-
-struct net;
+#include <net/net_namespace.h>
static inline u32 net_hash_mix(const struct net *net)
{
-#ifdef CONFIG_NET_NS
- /*
- * shift this right to eliminate bits, that are
- * always zeroed
- */
-
- return (u32)(((unsigned long)net) >> L1_CACHE_SHIFT);
-#else
- return 0;
-#endif
+ return net->hash_mix;
}
#endif
diff --git a/include/net/phonet/pep.h b/include/net/phonet/pep.h
index b669fe6d..98f31c7 100644
--- a/include/net/phonet/pep.h
+++ b/include/net/phonet/pep.h
@@ -63,10 +63,11 @@ struct pnpipehdr {
u8 state_after_reset; /* reset request */
u8 error_code; /* any response */
u8 pep_type; /* status indication */
- u8 data[1];
+ u8 data0; /* anything else */
};
+ u8 data[];
};
-#define other_pep_type data[1]
+#define other_pep_type data[0]
static inline struct pnpipehdr *pnp_hdr(struct sk_buff *skb)
{
diff --git a/include/net/sctp/checksum.h b/include/net/sctp/checksum.h
index 4a5b9a3..803fc26 100644
--- a/include/net/sctp/checksum.h
+++ b/include/net/sctp/checksum.h
@@ -60,7 +60,7 @@ static inline __wsum sctp_csum_combine(__wsum csum, __wsum csum2,
static inline __le32 sctp_compute_cksum(const struct sk_buff *skb,
unsigned int offset)
{
- struct sctphdr *sh = sctp_hdr(skb);
+ struct sctphdr *sh = (struct sctphdr *)(skb->data + offset);
__le32 ret, old = sh->checksum;
const struct skb_checksum_ops ops = {
.update = sctp_csum_update,
diff --git a/include/net/sock.h b/include/net/sock.h
index e7756fe..5375467 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -651,6 +651,12 @@ static inline void sk_add_node_rcu(struct sock *sk, struct hlist_head *list)
hlist_add_head_rcu(&sk->sk_node, list);
}
+static inline void sk_add_node_tail_rcu(struct sock *sk, struct hlist_head *list)
+{
+ sock_hold(sk);
+ hlist_add_tail_rcu(&sk->sk_node, list);
+}
+
static inline void __sk_nulls_add_node_rcu(struct sock *sk, struct hlist_nulls_head *list)
{
hlist_nulls_add_head_rcu(&sk->sk_nulls_node, list);
diff --git a/include/soc/qcom/icnss.h b/include/soc/qcom/icnss.h
index 0717cd1..e55353a 100644
--- a/include/soc/qcom/icnss.h
+++ b/include/soc/qcom/icnss.h
@@ -50,6 +50,7 @@ struct icnss_driver_ops {
int (*suspend_noirq)(struct device *dev);
int (*resume_noirq)(struct device *dev);
int (*uevent)(struct device *dev, struct icnss_uevent_data *uevent);
+ int (*set_therm_state)(struct device *dev, unsigned long thermal_state);
};
@@ -146,4 +147,8 @@ extern bool icnss_is_rejuvenate(void);
extern int icnss_trigger_recovery(struct device *dev);
extern void icnss_block_shutdown(bool status);
extern bool icnss_is_pdr(void);
+extern int icnss_thermal_register(struct device *dev, unsigned long max_state);
+extern void icnss_thermal_unregister(struct device *dev);
+extern int icnss_get_curr_therm_state(struct device *dev,
+ unsigned long *thermal_state);
#endif /* _ICNSS_WLAN_H_ */
diff --git a/include/soc/qcom/socinfo.h b/include/soc/qcom/socinfo.h
index 862153e..f6fc1e0 100644
--- a/include/soc/qcom/socinfo.h
+++ b/include/soc/qcom/socinfo.h
@@ -138,6 +138,8 @@
of_flat_dt_is_compatible(of_get_flat_dt_root(), "qcom,sda439")
#define early_machine_is_sda429() \
of_flat_dt_is_compatible(of_get_flat_dt_root(), "qcom,sda429")
+#define early_machine_is_sdm429w() \
+ of_flat_dt_is_compatible(of_get_flat_dt_root(), "qcom,sdm429w")
#define early_machine_is_mdm9650() \
of_flat_dt_is_compatible(of_get_flat_dt_root(), "qcom,mdm9650")
#define early_machine_is_qm215() \
@@ -199,6 +201,7 @@
#define early_machine_is_sdm429() 0
#define early_machine_is_sda439() 0
#define early_machine_is_sda429() 0
+#define early_machine_is_sdm429w() 0
#define early_machine_is_mdm9650() 0
#define early_machine_is_qm215() 0
#define early_machine_is_sdm712() 0
@@ -283,6 +286,7 @@ enum msm_cpu {
MSM_CPU_SDM429,
MSM_CPU_SDA439,
MSM_CPU_SDA429,
+ MSM_CPU_SDM429W,
MSM_CPU_9650,
MSM_CPU_QM215,
};
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index e88465a..e238f54 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -142,6 +142,17 @@ TRACE_DEFINE_ENUM(CP_TRIMMED);
{ CP_SPEC_LOG_NUM, "log type is 2" }, \
{ CP_RECOVER_DIR, "dir needs recovery" })
+#define show_shutdown_mode(type) \
+ __print_symbolic(type, \
+ { F2FS_GOING_DOWN_FULLSYNC, "full sync" }, \
+ { F2FS_GOING_DOWN_METASYNC, "meta sync" }, \
+ { F2FS_GOING_DOWN_NOSYNC, "no sync" }, \
+ { F2FS_GOING_DOWN_METAFLUSH, "meta flush" }, \
+ { F2FS_GOING_DOWN_NEED_FSCK, "need fsck" })
+
+struct f2fs_sb_info;
+struct f2fs_io_info;
+struct extent_info;
struct victim_sel_policy;
struct f2fs_map_blocks;
@@ -526,6 +537,9 @@ TRACE_EVENT(f2fs_map_blocks,
__field(block_t, m_lblk)
__field(block_t, m_pblk)
__field(unsigned int, m_len)
+ __field(unsigned int, m_flags)
+ __field(int, m_seg_type)
+ __field(bool, m_may_create)
__field(int, ret)
),
@@ -535,15 +549,22 @@ TRACE_EVENT(f2fs_map_blocks,
__entry->m_lblk = map->m_lblk;
__entry->m_pblk = map->m_pblk;
__entry->m_len = map->m_len;
+ __entry->m_flags = map->m_flags;
+ __entry->m_seg_type = map->m_seg_type;
+ __entry->m_may_create = map->m_may_create;
__entry->ret = ret;
),
TP_printk("dev = (%d,%d), ino = %lu, file offset = %llu, "
- "start blkaddr = 0x%llx, len = 0x%llx, err = %d",
+ "start blkaddr = 0x%llx, len = 0x%llx, flags = %u,"
+ "seg_type = %d, may_create = %d, err = %d",
show_dev_ino(__entry),
(unsigned long long)__entry->m_lblk,
(unsigned long long)__entry->m_pblk,
(unsigned long long)__entry->m_len,
+ __entry->m_flags,
+ __entry->m_seg_type,
+ __entry->m_may_create,
__entry->ret)
);
@@ -1609,6 +1630,30 @@ DEFINE_EVENT(f2fs_sync_dirty_inodes, f2fs_sync_dirty_inodes_exit,
TP_ARGS(sb, type, count)
);
+TRACE_EVENT(f2fs_shutdown,
+
+ TP_PROTO(struct f2fs_sb_info *sbi, unsigned int mode, int ret),
+
+ TP_ARGS(sbi, mode, ret),
+
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(unsigned int, mode)
+ __field(int, ret)
+ ),
+
+ TP_fast_assign(
+ __entry->dev = sbi->sb->s_dev;
+ __entry->mode = mode;
+ __entry->ret = ret;
+ ),
+
+ TP_printk("dev = (%d,%d), mode: %s, ret:%d",
+ show_dev(__entry->dev),
+ show_shutdown_mode(__entry->mode),
+ __entry->ret)
+);
+
#endif /* _TRACE_F2FS_H */
/* This part must be outside protection */
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 5a81ab4..61d41f2 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -81,12 +81,14 @@
#define __def_pageflag_names \
{1UL << PG_locked, "locked" }, \
+ {1UL << PG_waiters, "waiters" }, \
{1UL << PG_error, "error" }, \
{1UL << PG_referenced, "referenced" }, \
{1UL << PG_uptodate, "uptodate" }, \
{1UL << PG_dirty, "dirty" }, \
{1UL << PG_lru, "lru" }, \
{1UL << PG_active, "active" }, \
+ {1UL << PG_workingset, "workingset" }, \
{1UL << PG_slab, "slab" }, \
{1UL << PG_owner_priv_1, "owner_priv_1" }, \
{1UL << PG_arch_1, "arch_1" }, \
diff --git a/include/uapi/linux/keychord.h b/include/uapi/linux/keychord.h
deleted file mode 100644
index ea7cf4d..0000000
--- a/include/uapi/linux/keychord.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- * Key chord input driver
- *
- * Copyright (C) 2008 Google, Inc.
- * Author: Mike Lockwood <lockwood@android.com>
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
-*/
-
-#ifndef _UAPI_LINUX_KEYCHORD_H_
-#define _UAPI_LINUX_KEYCHORD_H_
-
-#include <linux/input.h>
-
-#define KEYCHORD_VERSION 1
-
-/*
- * One or more input_keychord structs are written to /dev/keychord
- * at once to specify the list of keychords to monitor.
- * Reading /dev/keychord returns the id of a keychord when the
- * keychord combination is pressed. A keychord is signalled when
- * all of the keys in the keycode list are in the pressed state.
- * The order in which the keys are pressed does not matter.
- * The keychord will not be signalled if keys not in the keycode
- * list are pressed.
- * Keychords will not be signalled on key release events.
- */
-struct input_keychord {
- /* should be KEYCHORD_VERSION */
- __u16 version;
- /*
- * client specified ID, returned from read()
- * when this keychord is pressed.
- */
- __u16 id;
-
- /* number of keycodes in this keychord */
- __u16 count;
-
- /* variable length array of keycodes */
- __u16 keycodes[];
-};
-
-#endif /* _UAPI_LINUX_KEYCHORD_H_ */
diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h
index 2466e55..b48f747 100644
--- a/include/uapi/linux/taskstats.h
+++ b/include/uapi/linux/taskstats.h
@@ -33,7 +33,7 @@
*/
-#define TASKSTATS_VERSION 8
+#define TASKSTATS_VERSION 9
#define TS_COMM_LEN 32 /* should be >= TASK_COMM_LEN
* in linux/sched.h */
@@ -163,6 +163,10 @@ struct taskstats {
/* Delay waiting for memory reclaim */
__u64 freepages_count;
__u64 freepages_delay_total;
+
+ /* Delay waiting for thrashing page */
+ __u64 thrashing_count;
+ __u64 thrashing_delay_total;
};
diff --git a/init/Kconfig b/init/Kconfig
index c854ad5..158e2c1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -484,6 +484,45 @@
Say N if unsure.
+config PSI
+ bool "Pressure stall information tracking"
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+ and IO capacity are in the system.
+
+ If you say Y here, the kernel will create /proc/pressure/ with the
+ pressure statistics files cpu, memory, and io. These will indicate
+ the share of walltime in which some or all tasks in the system are
+ delayed due to contention of the respective resource.
+
+ In kernels with cgroup support, cgroups (cgroup2 only) will
+ have cpu.pressure, memory.pressure, and io.pressure files,
+ which aggregate pressure stalls for the grouped tasks only.
+
+ For more details see Documentation/accounting/psi.txt.
+
+ Say N if unsure.
+
+config PSI_DEFAULT_DISABLED
+ bool "Require boot parameter to enable pressure stall information tracking"
+ default n
+ depends on PSI
+ help
+ If set, pressure stall information tracking will be disabled
+ per default but can be enabled through passing psi=1 on the
+ kernel commandline during boot.
+
+ This feature adds some code to the task wakeup and sleep
+ paths of the scheduler. The overhead is too low to affect
+ common scheduling-intense workloads in practice (such as
+ webservers, memcache), but it does show up in artificial
+ scheduler stress tests, such as hackbench.
+
+ If you are paranoid and not sure what the kernel will be
+ used for, say Y.
+
+ Say N if unsure.
+
endmenu # "CPU/Task time and stats accounting"
menu "RCU Subsystem"
diff --git a/init/main.c b/init/main.c
index f5670ca..b3c0e4b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -552,6 +552,14 @@ asmlinkage __visible void __init start_kernel(void)
"Interrupts were enabled *very* early, fixing it\n"))
local_irq_disable();
idr_init_cache();
+
+ /*
+ * Allow workqueue creation and work item queueing/cancelling
+ * early. Work item execution depends on kthreads and starts after
+ * workqueue_init().
+ */
+ workqueue_init_early();
+
rcu_init();
/* trace_printk() and trace points may be used after this */
@@ -637,9 +645,8 @@ asmlinkage __visible void __init start_kernel(void)
security_init();
dbg_late_init();
vfs_caches_init();
+ pagecache_init();
signals_init();
- /* rootfs populating might need page-writeback */
- page_writeback_init();
proc_root_init();
nsfs_init();
cpuset_init();
@@ -1006,6 +1013,8 @@ static noinline void __init kernel_init_freeable(void)
smp_prepare_cpus(setup_max_cpus);
+ workqueue_init();
+
do_pre_smp_initcalls();
lockup_detector_init();
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a83771f..1953033 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -62,6 +62,7 @@
#include <linux/proc_ns.h>
#include <linux/nsproxy.h>
#include <linux/file.h>
+#include <linux/psi.h>
#include <net/sock.h>
#define CREATE_TRACE_POINTS
@@ -361,15 +362,6 @@ static void cgroup_idr_remove(struct idr *idr, int id)
spin_unlock_bh(&cgroup_idr_lock);
}
-static struct cgroup *cgroup_parent(struct cgroup *cgrp)
-{
- struct cgroup_subsys_state *parent_css = cgrp->self.parent;
-
- if (parent_css)
- return container_of(parent_css, struct cgroup, self);
- return NULL;
-}
-
/* subsystems visibly enabled on a cgroup */
static u16 cgroup_control(struct cgroup *cgrp)
{
@@ -487,17 +479,6 @@ static inline bool cgroup_is_dead(const struct cgroup *cgrp)
return !(cgrp->self.flags & CSS_ONLINE);
}
-static void cgroup_get(struct cgroup *cgrp)
-{
- WARN_ON_ONCE(cgroup_is_dead(cgrp));
- css_get(&cgrp->self);
-}
-
-static bool cgroup_tryget(struct cgroup *cgrp)
-{
- return css_tryget(&cgrp->self);
-}
-
struct cgroup_subsys_state *of_css(struct kernfs_open_file *of)
{
struct cgroup *cgrp = of->kn->parent->priv;
@@ -781,7 +762,7 @@ static void css_set_move_task(struct task_struct *task,
*/
WARN_ON_ONCE(task->flags & PF_EXITING);
- rcu_assign_pointer(task->cgroups, to_cset);
+ cgroup_move_task(task, to_cset);
list_add_tail(&task->cg_list, use_mg_tasks ? &to_cset->mg_tasks :
&to_cset->tasks);
}
@@ -3525,6 +3506,96 @@ static int cgroup_events_show(struct seq_file *seq, void *v)
return 0;
}
+#ifdef CONFIG_PSI
+static int cgroup_io_pressure_show(struct seq_file *seq, void *v)
+{
+ return psi_show(seq, &seq_css(seq)->cgroup->psi, PSI_IO);
+}
+static int cgroup_memory_pressure_show(struct seq_file *seq, void *v)
+{
+ return psi_show(seq, &seq_css(seq)->cgroup->psi, PSI_MEM);
+}
+static int cgroup_cpu_pressure_show(struct seq_file *seq, void *v)
+{
+ return psi_show(seq, &seq_css(seq)->cgroup->psi, PSI_CPU);
+}
+
+static ssize_t cgroup_pressure_write(struct kernfs_open_file *of, char *buf,
+ size_t nbytes, enum psi_res res)
+{
+ struct psi_trigger *new;
+ struct cgroup *cgrp;
+
+ cgrp = cgroup_kn_lock_live(of->kn, false);
+ if (!cgrp)
+ return -ENODEV;
+
+ cgroup_get(cgrp);
+ cgroup_kn_unlock(of->kn);
+
+ new = psi_trigger_create(&cgrp->psi, buf, nbytes, res);
+ if (IS_ERR(new)) {
+ cgroup_put(cgrp);
+ return PTR_ERR(new);
+ }
+
+ psi_trigger_replace(&of->priv, new);
+
+ cgroup_put(cgrp);
+
+ return nbytes;
+}
+
+static ssize_t cgroup_io_pressure_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ return cgroup_pressure_write(of, buf, nbytes, PSI_IO);
+}
+
+static ssize_t cgroup_memory_pressure_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ return cgroup_pressure_write(of, buf, nbytes, PSI_MEM);
+}
+
+static ssize_t cgroup_cpu_pressure_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes,
+ loff_t off)
+{
+ return cgroup_pressure_write(of, buf, nbytes, PSI_CPU);
+}
+
+static unsigned int cgroup_pressure_poll(struct kernfs_open_file *of,
+ poll_table *pt)
+{
+ return psi_trigger_poll(&of->priv, of->file, pt);
+}
+
+static void cgroup_pressure_release(struct kernfs_open_file *of)
+{
+ psi_trigger_replace(&of->priv, NULL);
+}
+#endif /* CONFIG_PSI */
+
+static int cgroup_file_open(struct kernfs_open_file *of)
+{
+ struct cftype *cft = of->kn->priv;
+
+ if (cft->open)
+ return cft->open(of);
+ return 0;
+}
+
+static void cgroup_file_release(struct kernfs_open_file *of)
+{
+ struct cftype *cft = of->kn->priv;
+
+ if (cft->release)
+ cft->release(of);
+}
+
static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
size_t nbytes, loff_t off)
{
@@ -3563,6 +3634,16 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
return ret ?: nbytes;
}
+static unsigned int cgroup_file_poll(struct kernfs_open_file *of, poll_table *pt)
+{
+ struct cftype *cft = of->kn->priv;
+
+ if (cft->poll)
+ return cft->poll(of, pt);
+
+ return kernfs_generic_poll(of, pt);
+}
+
static void *cgroup_seqfile_start(struct seq_file *seq, loff_t *ppos)
{
return seq_cft(seq)->seq_start(seq, ppos);
@@ -3575,7 +3656,8 @@ static void *cgroup_seqfile_next(struct seq_file *seq, void *v, loff_t *ppos)
static void cgroup_seqfile_stop(struct seq_file *seq, void *v)
{
- seq_cft(seq)->seq_stop(seq, v);
+ if (seq_cft(seq)->seq_stop)
+ seq_cft(seq)->seq_stop(seq, v);
}
static int cgroup_seqfile_show(struct seq_file *m, void *arg)
@@ -3597,13 +3679,19 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
static struct kernfs_ops cgroup_kf_single_ops = {
.atomic_write_len = PAGE_SIZE,
+ .open = cgroup_file_open,
+ .release = cgroup_file_release,
.write = cgroup_file_write,
+ .poll = cgroup_file_poll,
.seq_show = cgroup_seqfile_show,
};
static struct kernfs_ops cgroup_kf_ops = {
.atomic_write_len = PAGE_SIZE,
+ .open = cgroup_file_open,
+ .release = cgroup_file_release,
.write = cgroup_file_write,
+ .poll = cgroup_file_poll,
.seq_start = cgroup_seqfile_start,
.seq_next = cgroup_seqfile_next,
.seq_stop = cgroup_seqfile_stop,
@@ -4940,6 +5028,32 @@ static struct cftype cgroup_dfl_base_files[] = {
.file_offset = offsetof(struct cgroup, events_file),
.seq_show = cgroup_events_show,
},
+#ifdef CONFIG_PSI
+ {
+ .name = "io.pressure",
+ .flags = CFTYPE_NOT_ON_ROOT,
+ .seq_show = cgroup_io_pressure_show,
+ .write = cgroup_io_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+ {
+ .name = "memory.pressure",
+ .flags = CFTYPE_NOT_ON_ROOT,
+ .seq_show = cgroup_memory_pressure_show,
+ .write = cgroup_memory_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+ {
+ .name = "cpu.pressure",
+ .flags = CFTYPE_NOT_ON_ROOT,
+ .seq_show = cgroup_cpu_pressure_show,
+ .write = cgroup_cpu_pressure_write,
+ .poll = cgroup_pressure_poll,
+ .release = cgroup_pressure_release,
+ },
+#endif /* CONFIG_PSI */
{ } /* terminate */
};
@@ -5045,6 +5159,8 @@ static void css_free_work_fn(struct work_struct *work)
*/
cgroup_put(cgroup_parent(cgrp));
kernfs_put(cgrp->kn);
+ if (cgroup_on_dfl(cgrp))
+ psi_cgroup_free(cgrp);
kfree(cgrp);
} else {
/*
@@ -5314,6 +5430,12 @@ static struct cgroup *cgroup_create(struct cgroup *parent)
if (!cgroup_on_dfl(cgrp))
cgrp->subtree_control = cgroup_control(cgrp);
+ if (cgroup_on_dfl(cgrp)) {
+ ret = psi_cgroup_alloc(cgrp);
+ if (ret)
+ goto out_idr_free;
+ }
+
if (parent)
cgroup_bpf_inherit(cgrp, parent);
@@ -5321,6 +5443,8 @@ static struct cgroup *cgroup_create(struct cgroup *parent)
return cgrp;
+out_idr_free:
+ cgroup_idr_remove(&root->cgroup_idr, cgrp->id);
out_cancel_ref:
percpu_ref_exit(&cgrp->self.refcnt);
out_free_cgrp:
diff --git a/kernel/cpu.c b/kernel/cpu.c
index dd067c0..e5825d4 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -599,6 +599,20 @@ static void undo_cpu_up(unsigned int cpu, struct cpuhp_cpu_state *st)
}
}
+static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
+{
+ if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
+ return true;
+ /*
+ * When CPU hotplug is disabled, then taking the CPU down is not
+ * possible because takedown_cpu() and the architecture and
+ * subsystem specific mechanisms are not available. So the CPU
+ * which would be completely unplugged again needs to stay around
+ * in the current state.
+ */
+ return st->state <= CPUHP_BRINGUP_CPU;
+}
+
static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
@@ -609,8 +623,10 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
st->state++;
ret = cpuhp_invoke_callback(cpu, st->state, true, NULL);
if (ret) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
+ if (can_rollback_cpu(st)) {
+ st->target = prev_state;
+ undo_cpu_up(cpu, st);
+ }
break;
}
}
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 5a58421..12076ff 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -18,6 +18,7 @@
#include <linux/kmsg_dump.h>
#include <linux/reboot.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/sysrq.h>
#include <linux/smp.h>
#include <linux/utsname.h>
@@ -2584,16 +2585,11 @@ static int kdb_summary(int argc, const char **argv)
}
kdb_printf("%02ld:%02ld\n", val.uptime/(60*60), (val.uptime/60)%60);
- /* lifted from fs/proc/proc_misc.c::loadavg_read_proc() */
-
-#define LOAD_INT(x) ((x) >> FSHIFT)
-#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
kdb_printf("load avg %ld.%02ld %ld.%02ld %ld.%02ld\n",
LOAD_INT(val.loads[0]), LOAD_FRAC(val.loads[0]),
LOAD_INT(val.loads[1]), LOAD_FRAC(val.loads[1]),
LOAD_INT(val.loads[2]), LOAD_FRAC(val.loads[2]));
-#undef LOAD_INT
-#undef LOAD_FRAC
+
/* Display in kilobytes */
#define K(x) ((x) << (PAGE_SHIFT - 10))
kdb_printf("\nMemTotal: %8lu kB\nMemFree: %8lu kB\n"
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 435c14a..3fde7df 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -124,9 +124,12 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp;
tmp = d->freepages_delay_total + tsk->delays->freepages_delay;
d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp;
+ tmp = d->thrashing_delay_total + tsk->delays->thrashing_delay;
+ d->thrashing_delay_total = (tmp < d->thrashing_delay_total) ? 0 : tmp;
d->blkio_count += tsk->delays->blkio_count;
d->swapin_count += tsk->delays->swapin_count;
d->freepages_count += tsk->delays->freepages_count;
+ d->thrashing_count += tsk->delays->thrashing_count;
spin_unlock_irqrestore(&tsk->delays->lock, flags);
return 0;
@@ -156,3 +159,14 @@ void __delayacct_freepages_end(void)
¤t->delays->freepages_count);
}
+void __delayacct_thrashing_start(void)
+{
+ current->delays->thrashing_start = ktime_get_ns();
+}
+
+void __delayacct_thrashing_end(void)
+{
+ delayacct_end(¤t->delays->thrashing_start,
+ ¤t->delays->thrashing_delay,
+ ¤t->delays->thrashing_count);
+}
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e8f0970..1dbbc81 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6792,6 +6792,7 @@ static void perf_event_mmap_output(struct perf_event *event,
struct perf_output_handle handle;
struct perf_sample_data sample;
int size = mmap_event->event_id.header.size;
+ u32 type = mmap_event->event_id.header.type;
int ret;
if (!perf_event_mmap_match(event, data))
@@ -6835,6 +6836,7 @@ static void perf_event_mmap_output(struct perf_event *event,
perf_output_end(&handle);
out:
mmap_event->event_id.header.size = size;
+ mmap_event->event_id.header.type = type;
}
static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
diff --git a/kernel/fork.c b/kernel/fork.c
index f1cd160..b4b4108 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1625,6 +1625,10 @@ static __latent_entropy struct task_struct *copy_process(
p->default_timer_slack_ns = current->timer_slack_ns;
+#ifdef CONFIG_PSI
+ p->psi_flags = 0;
+#endif
+
task_io_accounting_init(&p->ioac);
acct_clear_integrals(p);
diff --git a/kernel/futex.c b/kernel/futex.c
index 30fe043..2e766ff 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3110,6 +3110,10 @@ int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, int pi)
{
u32 uval, uninitialized_var(nval), mval;
+ /* Futex address must be 32bit aligned */
+ if ((((unsigned long)uaddr) % sizeof(*uaddr)) != 0)
+ return -1;
+
retry:
if (get_user(uval, uaddr))
return -1;
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index c7e79d8..58125b7 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -15,6 +15,7 @@
#include <linux/lockdep.h>
#include <linux/export.h>
#include <linux/sysctl.h>
+#include <linux/suspend.h>
#include <linux/utsname.h>
#include <trace/events/sched.h>
#include <linux/sched/sysctl.h>
@@ -233,6 +234,28 @@ void reset_hung_task_detector(void)
}
EXPORT_SYMBOL_GPL(reset_hung_task_detector);
+static bool hung_detector_suspended;
+
+static int hungtask_pm_notify(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ switch (action) {
+ case PM_SUSPEND_PREPARE:
+ case PM_HIBERNATION_PREPARE:
+ case PM_RESTORE_PREPARE:
+ hung_detector_suspended = true;
+ break;
+ case PM_POST_SUSPEND:
+ case PM_POST_HIBERNATION:
+ case PM_POST_RESTORE:
+ hung_detector_suspended = false;
+ break;
+ default:
+ break;
+ }
+ return NOTIFY_OK;
+}
+
/*
* kthread which checks for tasks stuck in D state
*/
@@ -247,7 +270,8 @@ static int watchdog(void *dummy)
long t = hung_timeout_jiffies(hung_last_checked, timeout);
if (t <= 0) {
- if (!atomic_xchg(&reset_hung_task, 0))
+ if (!atomic_xchg(&reset_hung_task, 0) &&
+ !hung_detector_suspended)
check_hung_uninterruptible_tasks(timeout);
hung_last_checked = jiffies;
continue;
@@ -261,6 +285,10 @@ static int watchdog(void *dummy)
static int __init hung_task_init(void)
{
atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
+
+ /* Disable hung task detector on suspend */
+ pm_notifier(hungtask_pm_notify, 0);
+
watchdog_task = kthread_run(watchdog, NULL, "khungtaskd");
return 0;
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index f30110e..9f13667 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -729,7 +729,11 @@ void handle_percpu_irq(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
- kstat_incr_irqs_this_cpu(desc);
+ /*
+ * PER CPU interrupts are not serialized. Do not touch
+ * desc->tot_count.
+ */
+ __kstat_incr_irqs_this_cpu(desc);
if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);
@@ -758,7 +762,11 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
unsigned int irq = irq_desc_get_irq(desc);
irqreturn_t res;
- kstat_incr_irqs_this_cpu(desc);
+ /*
+ * PER CPU interrupts are not serialized. Do not touch
+ * desc->tot_count.
+ */
+ __kstat_incr_irqs_this_cpu(desc);
if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);
@@ -1134,6 +1142,10 @@ int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on)
{
data = data->parent_data;
+
+ if (data->chip->flags & IRQCHIP_SKIP_SET_WAKE)
+ return 0;
+
if (data->chip->irq_set_wake)
return data->chip->irq_set_wake(data, on);
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index bc226e7..22e3f29 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -199,12 +199,18 @@ static inline bool irqd_has_set(struct irq_data *d, unsigned int mask)
#undef __irqd_to_state
-static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
{
__this_cpu_inc(*desc->kstat_irqs);
__this_cpu_inc(kstat.irqs_sum);
}
+static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+{
+ __kstat_incr_irqs_this_cpu(desc);
+ desc->tot_count++;
+}
+
static inline int irq_desc_get_node(struct irq_desc *desc)
{
return irq_common_data_get_node(&desc->irq_common_data);
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 77977f55df..5e0ea17 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -109,6 +109,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
desc->depth = 1;
desc->irq_count = 0;
desc->irqs_unhandled = 0;
+ desc->tot_count = 0;
desc->name = NULL;
desc->owner = owner;
for_each_possible_cpu(cpu)
@@ -880,11 +881,15 @@ unsigned int kstat_irqs_cpu(unsigned int irq, int cpu)
unsigned int kstat_irqs(unsigned int irq)
{
struct irq_desc *desc = irq_to_desc(irq);
- int cpu;
unsigned int sum = 0;
+ int cpu;
if (!desc || !desc->kstat_irqs)
return 0;
+ if (!irq_settings_is_per_cpu_devid(desc) &&
+ !irq_settings_is_per_cpu(desc))
+ return desc->tot_count;
+
for_each_possible_cpu(cpu)
sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
return sum;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f580352..e2845dd 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -668,7 +668,6 @@ static void unoptimize_kprobe(struct kprobe *p, bool force)
static int reuse_unused_kprobe(struct kprobe *ap)
{
struct optimized_kprobe *op;
- int ret;
BUG_ON(!kprobe_unused(ap));
/*
@@ -682,9 +681,8 @@ static int reuse_unused_kprobe(struct kprobe *ap)
/* Enable the probe again */
ap->flags &= ~KPROBE_FLAG_DISABLED;
/* Optimize it again (remove from op->list) */
- ret = kprobe_optready(ap);
- if (ret)
- return ret;
+ if (!kprobe_optready(ap))
+ return -EINVAL;
optimize_kprobe(ap);
return 0;
diff --git a/kernel/panic.c b/kernel/panic.c
index 278d64b..bebc117 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -608,7 +608,7 @@ EXPORT_SYMBOL(warn_slowpath_null);
*/
__visible void __stack_chk_fail(void)
{
- panic("stack-protector: Kernel stack is corrupted in: %p\n",
+ panic("stack-protector: Kernel stack is corrupted in: %pB\n",
__builtin_return_address(0));
}
EXPORT_SYMBOL(__stack_chk_fail);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 084c41c..7fd74ef 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1718,15 +1718,23 @@ static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
}
/*
- * Awaken the grace-period kthread for the specified flavor of RCU.
- * Don't do a self-awaken, and don't bother awakening when there is
- * nothing for the grace-period kthread to do (as in several CPUs
- * raced to awaken, and we lost), and finally don't try to awaken
- * a kthread that has not yet been created.
+ * Awaken the grace-period kthread. Don't do a self-awaken (unless in
+ * an interrupt or softirq handler), and don't bother awakening when there
+ * is nothing for the grace-period kthread to do (as in several CPUs raced
+ * to awaken, and we lost), and finally don't try to awaken a kthread that
+ * has not yet been created. If all those checks are passed, track some
+ * debug information and awaken.
+ *
+ * So why do the self-wakeup when in an interrupt or softirq handler
+ * in the grace-period kthread's context? Because the kthread might have
+ * been interrupted just as it was going to sleep, and just after the final
+ * pre-sleep check of the awaken condition. In this case, a wakeup really
+ * is required, and is therefore supplied.
*/
static void rcu_gp_kthread_wake(struct rcu_state *rsp)
{
- if (current == rsp->gp_kthread ||
+ if ((current == rsp->gp_kthread &&
+ !in_interrupt() && !in_serving_softirq()) ||
!READ_ONCE(rsp->gp_flags) ||
!rsp->gp_kthread)
return;
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
old mode 100644
new mode 100755
index 38cbc0d..134b6f8
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -28,3 +28,4 @@
obj-$(CONFIG_CPU_FREQ) += cpufreq.o
obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
obj-$(CONFIG_SCHED_CORE_CTL) += core_ctl.o
+obj-$(CONFIG_PSI) += psi.o
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4910292..35371f2 100755
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -77,6 +77,8 @@
#include <linux/prefetch.h>
#include <linux/irq.h>
#include <linux/cpufreq_times.h>
+#include <linux/sched/loadavg.h>
+#include <linux/cgroup-defs.h>
#include <asm/switch_to.h>
#include <asm/tlb.h>
@@ -176,21 +178,6 @@ unlock_rq_of(struct rq *rq, struct task_struct *p, struct rq_flags *flags)
}
/*
- * this_rq_lock - lock this runqueue and disable interrupts.
- */
-static struct rq *this_rq_lock(void)
- __acquires(rq->lock)
-{
- struct rq *rq;
-
- local_irq_disable();
- rq = this_rq();
- raw_spin_lock(&rq->lock);
-
- return rq;
-}
-
-/*
* __task_rq_lock - lock the rq @p resides on.
*/
struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
@@ -204,7 +191,7 @@ struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
rq = task_rq(p);
raw_spin_lock(&rq->lock);
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
- rf->cookie = lockdep_pin_lock(&rq->lock);
+ rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(&rq->lock);
@@ -244,7 +231,7 @@ struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
* pair with the WMB to ensure we must then also see migrating.
*/
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
- rf->cookie = lockdep_pin_lock(&rq->lock);
+ rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(&rq->lock);
@@ -774,8 +761,10 @@ static void set_load_weight(struct task_struct *p)
static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
{
update_rq_clock(rq);
- if (!(flags & ENQUEUE_RESTORE))
+ if (!(flags & ENQUEUE_RESTORE)) {
sched_info_queued(rq, p);
+ psi_enqueue(p, flags & ENQUEUE_WAKEUP);
+ }
p->sched_class->enqueue_task(rq, p, flags);
walt_update_last_enqueue(p);
trace_sched_enq_deq_task(p, 1, cpumask_bits(&p->cpus_allowed)[0]);
@@ -784,8 +773,10 @@ static inline void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
static inline void dequeue_task(struct rq *rq, struct task_struct *p, int flags)
{
update_rq_clock(rq);
- if (!(flags & DEQUEUE_SAVE))
+ if (!(flags & DEQUEUE_SAVE)) {
sched_info_dequeued(rq, p);
+ psi_dequeue(p, flags & DEQUEUE_SLEEP);
+ }
p->sched_class->dequeue_task(rq, p, flags);
trace_sched_enq_deq_task(p, 0, cpumask_bits(&p->cpus_allowed)[0]);
}
@@ -1239,9 +1230,9 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
* OK, since we're going to drop the lock immediately
* afterwards anyway.
*/
- lockdep_unpin_lock(&rq->lock, rf.cookie);
+ rq_unpin_lock(rq, &rf);
rq = move_queued_task(rq, p, dest_cpu);
- lockdep_repin_lock(&rq->lock, rf.cookie);
+ rq_repin_lock(rq, &rf);
}
out:
task_rq_unlock(rq, p, &rf);
@@ -1757,7 +1748,7 @@ static inline void ttwu_activate(struct rq *rq, struct task_struct *p, int en_fl
* Mark the task runnable and perform wakeup-preemption.
*/
static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags,
- struct pin_cookie cookie)
+ struct rq_flags *rf)
{
check_preempt_curr(rq, p, wake_flags);
p->state = TASK_RUNNING;
@@ -1769,9 +1760,9 @@ static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags,
* Our task @p is fully woken up and running; so its safe to
* drop the rq->lock, hereafter rq is only used for statistics.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
p->sched_class->task_woken(rq, p);
- lockdep_repin_lock(&rq->lock, cookie);
+ rq_repin_lock(rq, rf);
}
if (rq->idle_stamp) {
@@ -1790,7 +1781,7 @@ static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags,
static void
ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
- struct pin_cookie cookie)
+ struct rq_flags *rf)
{
int en_flags = ENQUEUE_WAKEUP;
@@ -1805,7 +1796,7 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
#endif
ttwu_activate(rq, p, en_flags);
- ttwu_do_wakeup(rq, p, wake_flags, cookie);
+ ttwu_do_wakeup(rq, p, wake_flags, rf);
}
/*
@@ -1824,7 +1815,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
if (task_on_rq_queued(p)) {
/* check_preempt_curr() may use rq clock */
update_rq_clock(rq);
- ttwu_do_wakeup(rq, p, wake_flags, rf.cookie);
+ ttwu_do_wakeup(rq, p, wake_flags, &rf);
ret = 1;
}
__task_rq_unlock(rq, &rf);
@@ -1837,15 +1828,15 @@ void sched_ttwu_pending(void)
{
struct rq *rq = this_rq();
struct llist_node *llist = llist_del_all(&rq->wake_list);
- struct pin_cookie cookie;
struct task_struct *p;
unsigned long flags;
+ struct rq_flags rf;
if (!llist)
return;
raw_spin_lock_irqsave(&rq->lock, flags);
- cookie = lockdep_pin_lock(&rq->lock);
+ rq_pin_lock(rq, &rf);
while (llist) {
int wake_flags = 0;
@@ -1856,10 +1847,10 @@ void sched_ttwu_pending(void)
if (p->sched_remote_wakeup)
wake_flags = WF_MIGRATED;
- ttwu_do_activate(rq, p, wake_flags, cookie);
+ ttwu_do_activate(rq, p, wake_flags, &rf);
}
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, &rf);
raw_spin_unlock_irqrestore(&rq->lock, flags);
}
@@ -1959,7 +1950,7 @@ bool cpus_share_cache(int this_cpu, int that_cpu)
static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
{
struct rq *rq = cpu_rq(cpu);
- struct pin_cookie cookie;
+ struct rq_flags rf;
#if defined(CONFIG_SMP)
if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
@@ -1970,9 +1961,9 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
#endif
raw_spin_lock(&rq->lock);
- cookie = lockdep_pin_lock(&rq->lock);
- ttwu_do_activate(rq, p, wake_flags, cookie);
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_pin_lock(rq, &rf);
+ ttwu_do_activate(rq, p, wake_flags, &rf);
+ rq_unpin_lock(rq, &rf);
raw_spin_unlock(&rq->lock);
}
@@ -2191,6 +2182,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
src_cpu = task_cpu(p);
if (src_cpu != cpu) {
wake_flags |= WF_MIGRATED;
+ psi_ttwu_dequeue(p);
set_task_cpu(p, cpu);
notif_required = true;
}
@@ -2225,7 +2217,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
* ensure that this_rq() is locked, @p is bound to this_rq() and not
* the current task.
*/
-static void try_to_wake_up_local(struct task_struct *p, struct pin_cookie cookie)
+static void try_to_wake_up_local(struct task_struct *p, struct rq_flags *rf)
{
struct rq *rq = task_rq(p);
@@ -2246,11 +2238,11 @@ static void try_to_wake_up_local(struct task_struct *p, struct pin_cookie cookie
* disabled avoiding further scheduler activity on it and we've
* not yet picked a replacement task.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
raw_spin_unlock(&rq->lock);
raw_spin_lock(&p->pi_lock);
raw_spin_lock(&rq->lock);
- lockdep_repin_lock(&rq->lock, cookie);
+ rq_repin_lock(rq, rf);
}
if (!(p->state & TASK_NORMAL))
@@ -2267,7 +2259,7 @@ static void try_to_wake_up_local(struct task_struct *p, struct pin_cookie cookie
note_task_waking(p, wallclock);
}
- ttwu_do_wakeup(rq, p, 0, cookie);
+ ttwu_do_wakeup(rq, p, 0, rf);
ttwu_stat(p, smp_processor_id(), 0);
out:
raw_spin_unlock(&p->pi_lock);
@@ -2724,9 +2716,9 @@ void wake_up_new_task(struct task_struct *p)
* Nothing relies on rq->lock after this, so its fine to
* drop it.
*/
- lockdep_unpin_lock(&rq->lock, rf.cookie);
+ rq_unpin_lock(rq, &rf);
p->sched_class->task_woken(rq, p);
- lockdep_repin_lock(&rq->lock, rf.cookie);
+ rq_repin_lock(rq, &rf);
}
#endif
task_rq_unlock(rq, p, &rf);
@@ -2995,7 +2987,7 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
*/
static __always_inline struct rq *
context_switch(struct rq *rq, struct task_struct *prev,
- struct task_struct *next, struct pin_cookie cookie)
+ struct task_struct *next, struct rq_flags *rf)
{
struct mm_struct *mm, *oldmm;
@@ -3027,7 +3019,7 @@ context_switch(struct rq *rq, struct task_struct *prev,
* of the scheduler it's an obvious special-case), so we
* do an early lockdep release here:
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
/* Here we just switch the register state and the stack. */
@@ -3270,6 +3262,9 @@ void scheduler_tick(void)
flag = SCHED_CPUFREQ_WALT | SCHED_CPUFREQ_EARLY_DET;
cpufreq_update_util(rq, flag);
+
+ psi_task_tick(rq);
+
raw_spin_unlock(&rq->lock);
if (early_notif)
@@ -3491,7 +3486,7 @@ static inline void schedule_debug(struct task_struct *prev)
* Pick up the highest-prio task:
*/
static inline struct task_struct *
-pick_next_task(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
const struct sched_class *class = &fair_sched_class;
struct task_struct *p;
@@ -3502,20 +3497,20 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie
*/
if (likely(prev->sched_class == class &&
rq->nr_running == rq->cfs.h_nr_running)) {
- p = fair_sched_class.pick_next_task(rq, prev, cookie);
+ p = fair_sched_class.pick_next_task(rq, prev, rf);
if (unlikely(p == RETRY_TASK))
goto again;
/* assumes fair_sched_class->next == idle_sched_class */
if (unlikely(!p))
- p = idle_sched_class.pick_next_task(rq, prev, cookie);
+ p = idle_sched_class.pick_next_task(rq, prev, rf);
return p;
}
again:
for_each_class(class) {
- p = class->pick_next_task(rq, prev, cookie);
+ p = class->pick_next_task(rq, prev, rf);
if (p) {
if (unlikely(p == RETRY_TASK))
goto again;
@@ -3569,7 +3564,7 @@ static void __sched notrace __schedule(bool preempt)
{
struct task_struct *prev, *next;
unsigned long *switch_count;
- struct pin_cookie cookie;
+ struct rq_flags rf;
struct rq *rq;
int cpu;
u64 wallclock;
@@ -3593,7 +3588,7 @@ static void __sched notrace __schedule(bool preempt)
*/
smp_mb__before_spinlock();
raw_spin_lock(&rq->lock);
- cookie = lockdep_pin_lock(&rq->lock);
+ rq_pin_lock(rq, &rf);
rq->clock_skip_update <<= 1; /* promote REQ to ACT */
@@ -3615,7 +3610,7 @@ static void __sched notrace __schedule(bool preempt)
to_wakeup = wq_worker_sleeping(prev);
if (to_wakeup)
- try_to_wake_up_local(to_wakeup, cookie);
+ try_to_wake_up_local(to_wakeup, &rf);
}
}
switch_count = &prev->nvcsw;
@@ -3624,12 +3619,12 @@ static void __sched notrace __schedule(bool preempt)
if (task_on_rq_queued(prev))
update_rq_clock(rq);
- next = pick_next_task(rq, prev, cookie);
+ next = pick_next_task(rq, prev, &rf);
+ wallclock = sched_ktime_clock();
clear_tsk_need_resched(prev);
clear_preempt_need_resched();
rq->clock_skip_update = 0;
- wallclock = sched_ktime_clock();
if (likely(prev != next)) {
if (!prev->on_rq)
prev->last_sleep_ts = wallclock;
@@ -3641,10 +3636,10 @@ static void __sched notrace __schedule(bool preempt)
++*switch_count;
trace_sched_switch(preempt, prev, next);
- rq = context_switch(rq, prev, next, cookie); /* unlocks the rq */
+ rq = context_switch(rq, prev, next, &rf); /* unlocks the rq */
} else {
update_task_ravg(prev, rq, TASK_UPDATE, wallclock, 0);
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, &rf);
raw_spin_unlock_irq(&rq->lock);
}
@@ -5144,7 +5139,10 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
*/
SYSCALL_DEFINE0(sched_yield)
{
- struct rq *rq = this_rq_lock();
+ struct rq_flags rf;
+ struct rq *rq;
+
+ rq = this_rq_lock_irq(&rf);
schedstat_inc(rq->yld_count);
current->sched_class->yield_task(rq);
@@ -5153,9 +5151,8 @@ SYSCALL_DEFINE0(sched_yield)
* Since we are going to call schedule() anyway, there's
* no need to preempt or enable interrupts:
*/
- __release(rq->lock);
- spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
- do_raw_spin_unlock(&rq->lock);
+ preempt_disable();
+ rq_unlock(rq, &rf);
sched_preempt_enable_no_resched();
schedule();
@@ -5815,7 +5812,7 @@ static void migrate_tasks(struct rq *dead_rq, bool migrate_pinned_tasks)
{
struct rq *rq = dead_rq;
struct task_struct *next, *stop = rq->stop;
- struct pin_cookie cookie;
+ struct rq_flags rf;
int dest_cpu;
unsigned int num_pinned_kthreads = 1; /* this thread */
LIST_HEAD(tasks);
@@ -5852,8 +5849,8 @@ static void migrate_tasks(struct rq *dead_rq, bool migrate_pinned_tasks)
/*
* pick_next_task assumes pinned rq->lock.
*/
- cookie = lockdep_pin_lock(&rq->lock);
- next = pick_next_task(rq, &fake_task, cookie);
+ rq_pin_lock(rq, &rf);
+ next = pick_next_task(rq, &fake_task, &rf);
BUG_ON(!next);
next->sched_class->put_prev_task(rq, next);
@@ -5861,7 +5858,7 @@ static void migrate_tasks(struct rq *dead_rq, bool migrate_pinned_tasks)
!cpumask_intersects(&avail_cpus, &next->cpus_allowed)) {
detach_one_task(next, rq, &tasks);
num_pinned_kthreads += 1;
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, &rf);
continue;
}
@@ -5874,7 +5871,7 @@ static void migrate_tasks(struct rq *dead_rq, bool migrate_pinned_tasks)
* because !cpu_active at this point, which means load-balance
* will not interfere. Also, stop-machine.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, &rf);
raw_spin_unlock(&rq->lock);
raw_spin_lock(&next->pi_lock);
raw_spin_lock(&rq->lock);
@@ -8383,6 +8380,8 @@ void __init sched_init(void)
init_schedstats();
+ psi_init();
+
scheduler_running = 1;
}
@@ -9088,6 +9087,17 @@ int sched_updown_migrate_handler(struct ctl_table *table, int write,
}
#endif
+void threadgroup_change_begin(struct task_struct *tsk)
+{
+ might_sleep();
+ cgroup_threadgroup_change_begin(tsk);
+}
+
+void threadgroup_change_end(struct task_struct *tsk)
+{
+ cgroup_threadgroup_change_end(tsk);
+}
+
#ifdef CONFIG_CGROUP_SCHED
inline struct task_group *css_tg(struct cgroup_subsys_state *css)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d7c7dd6..ba9dce4 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -749,9 +749,9 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
* Nothing relies on rq->lock after this, so its safe to drop
* rq->lock.
*/
- lockdep_unpin_lock(&rq->lock, rf.cookie);
+ rq_unpin_lock(rq, &rf);
push_dl_task(rq);
- lockdep_repin_lock(&rq->lock, rf.cookie);
+ rq_repin_lock(rq, &rf);
}
#endif
@@ -1248,7 +1248,7 @@ static struct sched_dl_entity *pick_next_dl_entity(struct rq *rq,
}
struct task_struct *
-pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct sched_dl_entity *dl_se;
struct task_struct *p;
@@ -1263,9 +1263,9 @@ pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct pin_cookie coo
* disabled avoiding further scheduler activity on it and we're
* being very careful to re-start the picking loop.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
pull_dl_task(rq);
- lockdep_repin_lock(&rq->lock, cookie);
+ rq_repin_lock(rq, rf);
/*
* pull_rt_task() can drop (and re-acquire) rq->lock; this
* means a stop task can slip in, in which case we need to
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 08391ff..3f863a1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2012,6 +2012,10 @@ static u64 numa_get_avg_runtime(struct task_struct *p, u64 *period)
if (p->last_task_numa_placement) {
delta = runtime - p->last_sum_exec_runtime;
*period = now - p->last_task_numa_placement;
+
+ /* Avoid time going backwards, prevent potential divide error: */
+ if (unlikely((s64)*period < 0))
+ *period = 0;
} else {
delta = p->se.avg.load_sum / p->se.load.weight;
*period = LOAD_AVG_MAX;
@@ -4671,12 +4675,15 @@ static enum hrtimer_restart sched_cfs_slack_timer(struct hrtimer *timer)
return HRTIMER_NORESTART;
}
+extern const u64 max_cfs_quota_period;
+
static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
{
struct cfs_bandwidth *cfs_b =
container_of(timer, struct cfs_bandwidth, period_timer);
int overrun;
int idle = 0;
+ int count = 0;
raw_spin_lock(&cfs_b->lock);
for (;;) {
@@ -4684,6 +4691,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
if (!overrun)
break;
+ if (++count > 3) {
+ u64 new, old = ktime_to_ns(cfs_b->period);
+
+ new = (old * 147) / 128; /* ~115% */
+ new = min(new, max_cfs_quota_period);
+
+ cfs_b->period = ns_to_ktime(new);
+
+ /* since max is 1s, this is limited to 1e9^2, which fits in u64 */
+ cfs_b->quota *= new;
+ cfs_b->quota = div64_u64(cfs_b->quota, old);
+
+ pr_warn_ratelimited(
+ "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n",
+ smp_processor_id(),
+ div_u64(new, NSEC_PER_USEC),
+ div_u64(cfs_b->quota, NSEC_PER_USEC));
+
+ /* reset count so we don't come right back in here */
+ count = 0;
+ }
+
idle = do_sched_cfs_period_timer(cfs_b, overrun);
}
if (idle)
@@ -5927,7 +5956,8 @@ static int compute_energy(struct energy_env *eenv)
cpu_count--;
}
- if (cpumask_equal(sched_group_cpus(sg), sched_group_cpus(eenv->sg_top)))
+ if (cpumask_equal(sched_group_cpus(sg), sched_group_cpus(eenv->sg_top)) &&
+ sd->child)
goto next_cpu;
} while (sg = sg->next, sg != sd->groups);
@@ -7891,7 +7921,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
}
static struct task_struct *
-pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct cfs_rq *cfs_rq = &rq->cfs;
struct sched_entity *se;
@@ -8009,9 +8039,9 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct pin_cookie c
* further scheduler activity on it and we're being very careful to
* re-start the picking loop.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
new_tasks = idle_balance(rq);
- lockdep_repin_lock(&rq->lock, cookie);
+ rq_repin_lock(rq, rf);
/*
* Because idle_balance() releases (and re-acquires) rq->lock, it is
* possible for any higher priority task to appear. In that case we
@@ -8702,10 +8732,10 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
if (cfs_rq->last_h_load_update == now)
return;
- cfs_rq->h_load_next = NULL;
+ WRITE_ONCE(cfs_rq->h_load_next, NULL);
for_each_sched_entity(se) {
cfs_rq = cfs_rq_of(se);
- cfs_rq->h_load_next = se;
+ WRITE_ONCE(cfs_rq->h_load_next, se);
if (cfs_rq->last_h_load_update == now)
break;
}
@@ -8715,7 +8745,7 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
cfs_rq->last_h_load_update = now;
}
- while ((se = cfs_rq->h_load_next) != NULL) {
+ while ((se = READ_ONCE(cfs_rq->h_load_next)) != NULL) {
load = cfs_rq->h_load;
load = div64_ul(load * se->avg.load_avg,
cfs_rq_load_avg(cfs_rq) + 1);
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index 5405d3f..0c00172 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -24,7 +24,7 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl
}
static struct task_struct *
-pick_next_task_idle(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
put_prev_task(rq, prev);
update_idle_core(rq);
diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index ec91fcc..5f3812c 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -7,6 +7,7 @@
*/
#include <linux/export.h>
+#include <linux/sched/loadavg.h>
#include "sched.h"
@@ -93,19 +94,73 @@ long calc_load_fold_active(struct rq *this_rq, long adjust)
return delta;
}
-/*
- * a1 = a0 * e + a * (1 - e)
+/**
+ * fixed_power_int - compute: x^n, in O(log n) time
+ *
+ * @x: base of the power
+ * @frac_bits: fractional bits of @x
+ * @n: power to raise @x to.
+ *
+ * By exploiting the relation between the definition of the natural power
+ * function: x^n := x*x*...*x (x multiplied by itself for n times), and
+ * the binary encoding of numbers used by computers: n := \Sum n_i * 2^i,
+ * (where: n_i \elem {0, 1}, the binary vector representing n),
+ * we find: x^n := x^(\Sum n_i * 2^i) := \Prod x^(n_i * 2^i), which is
+ * of course trivially computable in O(log_2 n), the length of our binary
+ * vector.
*/
static unsigned long
-calc_load(unsigned long load, unsigned long exp, unsigned long active)
+fixed_power_int(unsigned long x, unsigned int frac_bits, unsigned int n)
{
- unsigned long newload;
+ unsigned long result = 1UL << frac_bits;
- newload = load * exp + active * (FIXED_1 - exp);
- if (active >= load)
- newload += FIXED_1-1;
+ if (n) {
+ for (;;) {
+ if (n & 1) {
+ result *= x;
+ result += 1UL << (frac_bits - 1);
+ result >>= frac_bits;
+ }
+ n >>= 1;
+ if (!n)
+ break;
+ x *= x;
+ x += 1UL << (frac_bits - 1);
+ x >>= frac_bits;
+ }
+ }
- return newload / FIXED_1;
+ return result;
+}
+
+/*
+ * a1 = a0 * e + a * (1 - e)
+ *
+ * a2 = a1 * e + a * (1 - e)
+ * = (a0 * e + a * (1 - e)) * e + a * (1 - e)
+ * = a0 * e^2 + a * (1 - e) * (1 + e)
+ *
+ * a3 = a2 * e + a * (1 - e)
+ * = (a0 * e^2 + a * (1 - e) * (1 + e)) * e + a * (1 - e)
+ * = a0 * e^3 + a * (1 - e) * (1 + e + e^2)
+ *
+ * ...
+ *
+ * an = a0 * e^n + a * (1 - e) * (1 + e + ... + e^n-1) [1]
+ * = a0 * e^n + a * (1 - e) * (1 - e^n)/(1 - e)
+ * = a0 * e^n + a * (1 - e^n)
+ *
+ * [1] application of the geometric series:
+ *
+ * n 1 - x^(n+1)
+ * S_n := \Sum x^i = -------------
+ * i=0 1 - x
+ */
+unsigned long
+calc_load_n(unsigned long load, unsigned long exp,
+ unsigned long active, unsigned int n)
+{
+ return calc_load(load, fixed_power_int(exp, FSHIFT, n), active);
}
#ifdef CONFIG_NO_HZ_COMMON
@@ -227,75 +282,6 @@ static long calc_load_fold_idle(void)
return delta;
}
-/**
- * fixed_power_int - compute: x^n, in O(log n) time
- *
- * @x: base of the power
- * @frac_bits: fractional bits of @x
- * @n: power to raise @x to.
- *
- * By exploiting the relation between the definition of the natural power
- * function: x^n := x*x*...*x (x multiplied by itself for n times), and
- * the binary encoding of numbers used by computers: n := \Sum n_i * 2^i,
- * (where: n_i \elem {0, 1}, the binary vector representing n),
- * we find: x^n := x^(\Sum n_i * 2^i) := \Prod x^(n_i * 2^i), which is
- * of course trivially computable in O(log_2 n), the length of our binary
- * vector.
- */
-static unsigned long
-fixed_power_int(unsigned long x, unsigned int frac_bits, unsigned int n)
-{
- unsigned long result = 1UL << frac_bits;
-
- if (n) {
- for (;;) {
- if (n & 1) {
- result *= x;
- result += 1UL << (frac_bits - 1);
- result >>= frac_bits;
- }
- n >>= 1;
- if (!n)
- break;
- x *= x;
- x += 1UL << (frac_bits - 1);
- x >>= frac_bits;
- }
- }
-
- return result;
-}
-
-/*
- * a1 = a0 * e + a * (1 - e)
- *
- * a2 = a1 * e + a * (1 - e)
- * = (a0 * e + a * (1 - e)) * e + a * (1 - e)
- * = a0 * e^2 + a * (1 - e) * (1 + e)
- *
- * a3 = a2 * e + a * (1 - e)
- * = (a0 * e^2 + a * (1 - e) * (1 + e)) * e + a * (1 - e)
- * = a0 * e^3 + a * (1 - e) * (1 + e + e^2)
- *
- * ...
- *
- * an = a0 * e^n + a * (1 - e) * (1 + e + ... + e^n-1) [1]
- * = a0 * e^n + a * (1 - e) * (1 - e^n)/(1 - e)
- * = a0 * e^n + a * (1 - e^n)
- *
- * [1] application of the geometric series:
- *
- * n 1 - x^(n+1)
- * S_n := \Sum x^i = -------------
- * i=0 1 - x
- */
-static unsigned long
-calc_load_n(unsigned long load, unsigned long exp,
- unsigned long active, unsigned int n)
-{
- return calc_load(load, fixed_power_int(exp, FSHIFT, n), active);
-}
-
/*
* NO_HZ can leave us missing all per-cpu ticks calling
* calc_load_account_active(), but since an idle CPU folds its delta into
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
new file mode 100644
index 0000000..ced7587
--- /dev/null
+++ b/kernel/sched/psi.c
@@ -0,0 +1,1280 @@
+/*
+ * Pressure stall information for CPU, memory and IO
+ *
+ * Copyright (c) 2018 Facebook, Inc.
+ * Author: Johannes Weiner <hannes@cmpxchg.org>
+ *
+ * Polling support by Suren Baghdasaryan <surenb@google.com>
+ * Copyright (c) 2018 Google, Inc.
+ *
+ * When CPU, memory and IO are contended, tasks experience delays that
+ * reduce throughput and introduce latencies into the workload. Memory
+ * and IO contention, in addition, can cause a full loss of forward
+ * progress in which the CPU goes idle.
+ *
+ * This code aggregates individual task delays into resource pressure
+ * metrics that indicate problems with both workload health and
+ * resource utilization.
+ *
+ * Model
+ *
+ * The time in which a task can execute on a CPU is our baseline for
+ * productivity. Pressure expresses the amount of time in which this
+ * potential cannot be realized due to resource contention.
+ *
+ * This concept of productivity has two components: the workload and
+ * the CPU. To measure the impact of pressure on both, we define two
+ * contention states for a resource: SOME and FULL.
+ *
+ * In the SOME state of a given resource, one or more tasks are
+ * delayed on that resource. This affects the workload's ability to
+ * perform work, but the CPU may still be executing other tasks.
+ *
+ * In the FULL state of a given resource, all non-idle tasks are
+ * delayed on that resource such that nobody is advancing and the CPU
+ * goes idle. This leaves both workload and CPU unproductive.
+ *
+ * (Naturally, the FULL state doesn't exist for the CPU resource.)
+ *
+ * SOME = nr_delayed_tasks != 0
+ * FULL = nr_delayed_tasks != 0 && nr_running_tasks == 0
+ *
+ * The percentage of wallclock time spent in those compound stall
+ * states gives pressure numbers between 0 and 100 for each resource,
+ * where the SOME percentage indicates workload slowdowns and the FULL
+ * percentage indicates reduced CPU utilization:
+ *
+ * %SOME = time(SOME) / period
+ * %FULL = time(FULL) / period
+ *
+ * Multiple CPUs
+ *
+ * The more tasks and available CPUs there are, the more work can be
+ * performed concurrently. This means that the potential that can go
+ * unrealized due to resource contention *also* scales with non-idle
+ * tasks and CPUs.
+ *
+ * Consider a scenario where 257 number crunching tasks are trying to
+ * run concurrently on 256 CPUs. If we simply aggregated the task
+ * states, we would have to conclude a CPU SOME pressure number of
+ * 100%, since *somebody* is waiting on a runqueue at all
+ * times. However, that is clearly not the amount of contention the
+ * workload is experiencing: only one out of 256 possible exceution
+ * threads will be contended at any given time, or about 0.4%.
+ *
+ * Conversely, consider a scenario of 4 tasks and 4 CPUs where at any
+ * given time *one* of the tasks is delayed due to a lack of memory.
+ * Again, looking purely at the task state would yield a memory FULL
+ * pressure number of 0%, since *somebody* is always making forward
+ * progress. But again this wouldn't capture the amount of execution
+ * potential lost, which is 1 out of 4 CPUs, or 25%.
+ *
+ * To calculate wasted potential (pressure) with multiple processors,
+ * we have to base our calculation on the number of non-idle tasks in
+ * conjunction with the number of available CPUs, which is the number
+ * of potential execution threads. SOME becomes then the proportion of
+ * delayed tasks to possibe threads, and FULL is the share of possible
+ * threads that are unproductive due to delays:
+ *
+ * threads = min(nr_nonidle_tasks, nr_cpus)
+ * SOME = min(nr_delayed_tasks / threads, 1)
+ * FULL = (threads - min(nr_running_tasks, threads)) / threads
+ *
+ * For the 257 number crunchers on 256 CPUs, this yields:
+ *
+ * threads = min(257, 256)
+ * SOME = min(1 / 256, 1) = 0.4%
+ * FULL = (256 - min(257, 256)) / 256 = 0%
+ *
+ * For the 1 out of 4 memory-delayed tasks, this yields:
+ *
+ * threads = min(4, 4)
+ * SOME = min(1 / 4, 1) = 25%
+ * FULL = (4 - min(3, 4)) / 4 = 25%
+ *
+ * [ Substitute nr_cpus with 1, and you can see that it's a natural
+ * extension of the single-CPU model. ]
+ *
+ * Implementation
+ *
+ * To assess the precise time spent in each such state, we would have
+ * to freeze the system on task changes and start/stop the state
+ * clocks accordingly. Obviously that doesn't scale in practice.
+ *
+ * Because the scheduler aims to distribute the compute load evenly
+ * among the available CPUs, we can track task state locally to each
+ * CPU and, at much lower frequency, extrapolate the global state for
+ * the cumulative stall times and the running averages.
+ *
+ * For each runqueue, we track:
+ *
+ * tSOME[cpu] = time(nr_delayed_tasks[cpu] != 0)
+ * tFULL[cpu] = time(nr_delayed_tasks[cpu] && !nr_running_tasks[cpu])
+ * tNONIDLE[cpu] = time(nr_nonidle_tasks[cpu] != 0)
+ *
+ * and then periodically aggregate:
+ *
+ * tNONIDLE = sum(tNONIDLE[i])
+ *
+ * tSOME = sum(tSOME[i] * tNONIDLE[i]) / tNONIDLE
+ * tFULL = sum(tFULL[i] * tNONIDLE[i]) / tNONIDLE
+ *
+ * %SOME = tSOME / period
+ * %FULL = tFULL / period
+ *
+ * This gives us an approximation of pressure that is practical
+ * cost-wise, yet way more sensitive and accurate than periodic
+ * sampling of the aggregate task states would be.
+ */
+
+#include "../workqueue_internal.h"
+#include <linux/sched/loadavg.h>
+#include <linux/seq_file.h>
+#include <linux/proc_fs.h>
+#include <linux/seqlock.h>
+#include <linux/uaccess.h>
+#include <linux/cgroup.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/ctype.h>
+#include <linux/file.h>
+#include <linux/poll.h>
+#include <linux/psi.h>
+#include "sched.h"
+
+static int psi_bug __read_mostly;
+
+DEFINE_STATIC_KEY_FALSE(psi_disabled);
+
+#ifdef CONFIG_PSI_DEFAULT_DISABLED
+static bool psi_enable;
+#else
+static bool psi_enable = true;
+#endif
+static int __init setup_psi(char *str)
+{
+ return kstrtobool(str, &psi_enable) == 0;
+}
+__setup("psi=", setup_psi);
+
+/* Running averages - we need to be higher-res than loadavg */
+#define PSI_FREQ (2*HZ+1) /* 2 sec intervals */
+#define EXP_10s 1677 /* 1/exp(2s/10s) as fixed-point */
+#define EXP_60s 1981 /* 1/exp(2s/60s) */
+#define EXP_300s 2034 /* 1/exp(2s/300s) */
+
+/* PSI trigger definitions */
+#define WINDOW_MIN_US 500000 /* Min window size is 500ms */
+#define WINDOW_MAX_US 10000000 /* Max window size is 10s */
+#define UPDATES_PER_WINDOW 10 /* 10 updates per window */
+
+/* Sampling frequency in nanoseconds */
+static u64 psi_period __read_mostly;
+
+/* System-level pressure and stall tracking */
+static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
+static struct psi_group psi_system = {
+ .pcpu = &system_group_pcpu,
+};
+
+static void psi_avgs_work(struct work_struct *work);
+
+static void group_init(struct psi_group *group)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu)
+ seqcount_init(&per_cpu_ptr(group->pcpu, cpu)->seq);
+ group->avg_next_update = sched_clock() + psi_period;
+ INIT_DELAYED_WORK(&group->avgs_work, psi_avgs_work);
+ mutex_init(&group->avgs_lock);
+ /* Init trigger-related members */
+ atomic_set(&group->poll_scheduled, 0);
+ mutex_init(&group->trigger_lock);
+ INIT_LIST_HEAD(&group->triggers);
+ memset(group->nr_triggers, 0, sizeof(group->nr_triggers));
+ group->poll_states = 0;
+ group->poll_min_period = U32_MAX;
+ memset(group->polling_total, 0, sizeof(group->polling_total));
+ group->polling_next_update = ULLONG_MAX;
+ group->polling_until = 0;
+ rcu_assign_pointer(group->poll_kworker, NULL);
+}
+
+void __init psi_init(void)
+{
+ if (!psi_enable) {
+ static_branch_enable(&psi_disabled);
+ return;
+ }
+
+ psi_period = jiffies_to_nsecs(PSI_FREQ);
+ group_init(&psi_system);
+}
+
+static bool test_state(unsigned int *tasks, enum psi_states state)
+{
+ switch (state) {
+ case PSI_IO_SOME:
+ return tasks[NR_IOWAIT];
+ case PSI_IO_FULL:
+ return tasks[NR_IOWAIT] && !tasks[NR_RUNNING];
+ case PSI_MEM_SOME:
+ return tasks[NR_MEMSTALL];
+ case PSI_MEM_FULL:
+ return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING];
+ case PSI_CPU_SOME:
+ return tasks[NR_RUNNING] > 1;
+ case PSI_NONIDLE:
+ return tasks[NR_IOWAIT] || tasks[NR_MEMSTALL] ||
+ tasks[NR_RUNNING];
+ default:
+ return false;
+ }
+}
+
+static void get_recent_times(struct psi_group *group, int cpu,
+ enum psi_aggregators aggregator, u32 *times,
+ u32 *pchanged_states)
+{
+ struct psi_group_cpu *groupc = per_cpu_ptr(group->pcpu, cpu);
+ u64 now, state_start;
+ enum psi_states s;
+ unsigned int seq;
+ u32 state_mask;
+
+ *pchanged_states = 0;
+
+ /* Snapshot a coherent view of the CPU state */
+ do {
+ seq = read_seqcount_begin(&groupc->seq);
+ now = cpu_clock(cpu);
+ memcpy(times, groupc->times, sizeof(groupc->times));
+ state_mask = groupc->state_mask;
+ state_start = groupc->state_start;
+ } while (read_seqcount_retry(&groupc->seq, seq));
+
+ /* Calculate state time deltas against the previous snapshot */
+ for (s = 0; s < NR_PSI_STATES; s++) {
+ u32 delta;
+ /*
+ * In addition to already concluded states, we also
+ * incorporate currently active states on the CPU,
+ * since states may last for many sampling periods.
+ *
+ * This way we keep our delta sampling buckets small
+ * (u32) and our reported pressure close to what's
+ * actually happening.
+ */
+ if (state_mask & (1 << s))
+ times[s] += now - state_start;
+
+ delta = times[s] - groupc->times_prev[aggregator][s];
+ groupc->times_prev[aggregator][s] = times[s];
+
+ times[s] = delta;
+ if (delta)
+ *pchanged_states |= (1 << s);
+ }
+}
+
+static void calc_avgs(unsigned long avg[3], int missed_periods,
+ u64 time, u64 period)
+{
+ unsigned long pct;
+
+ /* Fill in zeroes for periods of no activity */
+ if (missed_periods) {
+ avg[0] = calc_load_n(avg[0], EXP_10s, 0, missed_periods);
+ avg[1] = calc_load_n(avg[1], EXP_60s, 0, missed_periods);
+ avg[2] = calc_load_n(avg[2], EXP_300s, 0, missed_periods);
+ }
+
+ /* Sample the most recent active period */
+ pct = div_u64(time * 100, period);
+ pct *= FIXED_1;
+ avg[0] = calc_load(avg[0], EXP_10s, pct);
+ avg[1] = calc_load(avg[1], EXP_60s, pct);
+ avg[2] = calc_load(avg[2], EXP_300s, pct);
+}
+
+static void collect_percpu_times(struct psi_group *group,
+ enum psi_aggregators aggregator,
+ u32 *pchanged_states)
+{
+ u64 deltas[NR_PSI_STATES - 1] = { 0, };
+ unsigned long nonidle_total = 0;
+ u32 changed_states = 0;
+ int cpu;
+ int s;
+
+ /*
+ * Collect the per-cpu time buckets and average them into a
+ * single time sample that is normalized to wallclock time.
+ *
+ * For averaging, each CPU is weighted by its non-idle time in
+ * the sampling period. This eliminates artifacts from uneven
+ * loading, or even entirely idle CPUs.
+ */
+ for_each_possible_cpu(cpu) {
+ u32 times[NR_PSI_STATES];
+ u32 nonidle;
+ u32 cpu_changed_states;
+
+ get_recent_times(group, cpu, aggregator, times,
+ &cpu_changed_states);
+ changed_states |= cpu_changed_states;
+
+ nonidle = nsecs_to_jiffies(times[PSI_NONIDLE]);
+ nonidle_total += nonidle;
+
+ for (s = 0; s < PSI_NONIDLE; s++)
+ deltas[s] += (u64)times[s] * nonidle;
+ }
+
+ /*
+ * Integrate the sample into the running statistics that are
+ * reported to userspace: the cumulative stall times and the
+ * decaying averages.
+ *
+ * Pressure percentages are sampled at PSI_FREQ. We might be
+ * called more often when the user polls more frequently than
+ * that; we might be called less often when there is no task
+ * activity, thus no data, and clock ticks are sporadic. The
+ * below handles both.
+ */
+
+ /* total= */
+ for (s = 0; s < NR_PSI_STATES - 1; s++)
+ group->total[aggregator][s] +=
+ div_u64(deltas[s], max(nonidle_total, 1UL));
+
+ if (pchanged_states)
+ *pchanged_states = changed_states;
+}
+
+static u64 update_averages(struct psi_group *group, u64 now)
+{
+ unsigned long missed_periods = 0;
+ u64 expires, period;
+ u64 avg_next_update;
+ int s;
+
+ /* avgX= */
+ expires = group->avg_next_update;
+ if (now - expires >= psi_period)
+ missed_periods = div_u64(now - expires, psi_period);
+
+ /*
+ * The periodic clock tick can get delayed for various
+ * reasons, especially on loaded systems. To avoid clock
+ * drift, we schedule the clock in fixed psi_period intervals.
+ * But the deltas we sample out of the per-cpu buckets above
+ * are based on the actual time elapsing between clock ticks.
+ */
+ avg_next_update = expires + ((1 + missed_periods) * psi_period);
+ period = now - (group->avg_last_update + (missed_periods * psi_period));
+ group->avg_last_update = now;
+
+ for (s = 0; s < NR_PSI_STATES - 1; s++) {
+ u32 sample;
+
+ sample = group->total[PSI_AVGS][s] - group->avg_total[s];
+ /*
+ * Due to the lockless sampling of the time buckets,
+ * recorded time deltas can slip into the next period,
+ * which under full pressure can result in samples in
+ * excess of the period length.
+ *
+ * We don't want to report non-sensical pressures in
+ * excess of 100%, nor do we want to drop such events
+ * on the floor. Instead we punt any overage into the
+ * future until pressure subsides. By doing this we
+ * don't underreport the occurring pressure curve, we
+ * just report it delayed by one period length.
+ *
+ * The error isn't cumulative. As soon as another
+ * delta slips from a period P to P+1, by definition
+ * it frees up its time T in P.
+ */
+ if (sample > period)
+ sample = period;
+ group->avg_total[s] += sample;
+ calc_avgs(group->avg[s], missed_periods, sample, period);
+ }
+
+ return avg_next_update;
+}
+
+static void psi_avgs_work(struct work_struct *work)
+{
+ struct delayed_work *dwork;
+ struct psi_group *group;
+ u32 changed_states;
+ bool nonidle;
+ u64 now;
+
+ dwork = to_delayed_work(work);
+ group = container_of(dwork, struct psi_group, avgs_work);
+
+ mutex_lock(&group->avgs_lock);
+
+ now = sched_clock();
+
+ collect_percpu_times(group, PSI_AVGS, &changed_states);
+ nonidle = changed_states & (1 << PSI_NONIDLE);
+ /*
+ * If there is task activity, periodically fold the per-cpu
+ * times and feed samples into the running averages. If things
+ * are idle and there is no data to process, stop the clock.
+ * Once restarted, we'll catch up the running averages in one
+ * go - see calc_avgs() and missed_periods.
+ */
+ if (now >= group->avg_next_update)
+ group->avg_next_update = update_averages(group, now);
+
+ if (nonidle) {
+ schedule_delayed_work(dwork, nsecs_to_jiffies(
+ group->avg_next_update - now) + 1);
+ }
+
+ mutex_unlock(&group->avgs_lock);
+}
+
+/* Trigger tracking window manupulations */
+static void window_reset(struct psi_window *win, u64 now, u64 value,
+ u64 prev_growth)
+{
+ win->start_time = now;
+ win->start_value = value;
+ win->prev_growth = prev_growth;
+}
+
+/*
+ * PSI growth tracking window update and growth calculation routine.
+ *
+ * This approximates a sliding tracking window by interpolating
+ * partially elapsed windows using historical growth data from the
+ * previous intervals. This minimizes memory requirements (by not storing
+ * all the intermediate values in the previous window) and simplifies
+ * the calculations. It works well because PSI signal changes only in
+ * positive direction and over relatively small window sizes the growth
+ * is close to linear.
+ */
+static u64 window_update(struct psi_window *win, u64 now, u64 value)
+{
+ u64 elapsed;
+ u64 growth;
+
+ elapsed = now - win->start_time;
+ growth = value - win->start_value;
+ /*
+ * After each tracking window passes win->start_value and
+ * win->start_time get reset and win->prev_growth stores
+ * the average per-window growth of the previous window.
+ * win->prev_growth is then used to interpolate additional
+ * growth from the previous window assuming it was linear.
+ */
+ if (elapsed > win->size)
+ window_reset(win, now, value, growth);
+ else {
+ u32 remaining;
+
+ remaining = win->size - elapsed;
+ growth += div_u64(win->prev_growth * remaining, win->size);
+ }
+
+ return growth;
+}
+
+static void init_triggers(struct psi_group *group, u64 now)
+{
+ struct psi_trigger *t;
+
+ list_for_each_entry(t, &group->triggers, node)
+ window_reset(&t->win, now,
+ group->total[PSI_POLL][t->state], 0);
+ memcpy(group->polling_total, group->total[PSI_POLL],
+ sizeof(group->polling_total));
+ group->polling_next_update = now + group->poll_min_period;
+}
+
+static u64 update_triggers(struct psi_group *group, u64 now)
+{
+ struct psi_trigger *t;
+ bool new_stall = false;
+ u64 *total = group->total[PSI_POLL];
+
+ /*
+ * On subsequent updates, calculate growth deltas and let
+ * watchers know when their specified thresholds are exceeded.
+ */
+ list_for_each_entry(t, &group->triggers, node) {
+ u64 growth;
+
+ /* Check for stall activity */
+ if (group->polling_total[t->state] == total[t->state])
+ continue;
+
+ /*
+ * Multiple triggers might be looking at the same state,
+ * remember to update group->polling_total[] once we've
+ * been through all of them. Also remember to extend the
+ * polling time if we see new stall activity.
+ */
+ new_stall = true;
+
+ /* Calculate growth since last update */
+ growth = window_update(&t->win, now, total[t->state]);
+ if (growth < t->threshold)
+ continue;
+
+ /* Limit event signaling to once per window */
+ if (now < t->last_event_time + t->win.size)
+ continue;
+
+ /* Generate an event */
+ if (cmpxchg(&t->event, 0, 1) == 0)
+ wake_up_interruptible(&t->event_wait);
+ t->last_event_time = now;
+ }
+
+ if (new_stall)
+ memcpy(group->polling_total, total,
+ sizeof(group->polling_total));
+
+ return now + group->poll_min_period;
+}
+
+/*
+ * Schedule polling if it's not already scheduled. It's safe to call even from
+ * hotpath because even though kthread_queue_delayed_work takes worker->lock
+ * spinlock that spinlock is never contended due to poll_scheduled atomic
+ * preventing such competition.
+ */
+static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay)
+{
+ struct kthread_worker *kworker;
+
+ /* Do not reschedule if already scheduled */
+ if (atomic_cmpxchg(&group->poll_scheduled, 0, 1) != 0)
+ return;
+
+ rcu_read_lock();
+
+ kworker = rcu_dereference(group->poll_kworker);
+ /*
+ * kworker might be NULL in case psi_trigger_destroy races with
+ * psi_task_change (hotpath) which can't use locks
+ */
+ if (likely(kworker))
+ kthread_queue_delayed_work(kworker, &group->poll_work, delay);
+ else
+ atomic_set(&group->poll_scheduled, 0);
+
+ rcu_read_unlock();
+}
+
+static void psi_poll_work(struct kthread_work *work)
+{
+ struct kthread_delayed_work *dwork;
+ struct psi_group *group;
+ u32 changed_states;
+ u64 now;
+
+ dwork = container_of(work, struct kthread_delayed_work, work);
+ group = container_of(dwork, struct psi_group, poll_work);
+
+ atomic_set(&group->poll_scheduled, 0);
+
+ mutex_lock(&group->trigger_lock);
+
+ now = sched_clock();
+
+ collect_percpu_times(group, PSI_POLL, &changed_states);
+
+ if (changed_states & group->poll_states) {
+ /* Initialize trigger windows when entering polling mode */
+ if (now > group->polling_until)
+ init_triggers(group, now);
+
+ /*
+ * Keep the monitor active for at least the duration of the
+ * minimum tracking window as long as monitor states are
+ * changing.
+ */
+ group->polling_until = now +
+ group->poll_min_period * UPDATES_PER_WINDOW;
+ }
+
+ if (now > group->polling_until) {
+ group->polling_next_update = ULLONG_MAX;
+ goto out;
+ }
+
+ if (now >= group->polling_next_update)
+ group->polling_next_update = update_triggers(group, now);
+
+ psi_schedule_poll_work(group,
+ nsecs_to_jiffies(group->polling_next_update - now) + 1);
+
+out:
+ mutex_unlock(&group->trigger_lock);
+}
+
+static void record_times(struct psi_group_cpu *groupc, int cpu,
+ bool memstall_tick)
+{
+ u32 delta;
+ u64 now;
+
+ now = cpu_clock(cpu);
+ delta = now - groupc->state_start;
+ groupc->state_start = now;
+
+ if (groupc->state_mask & (1 << PSI_IO_SOME)) {
+ groupc->times[PSI_IO_SOME] += delta;
+ if (groupc->state_mask & (1 << PSI_IO_FULL))
+ groupc->times[PSI_IO_FULL] += delta;
+ }
+
+ if (groupc->state_mask & (1 << PSI_MEM_SOME)) {
+ groupc->times[PSI_MEM_SOME] += delta;
+ if (groupc->state_mask & (1 << PSI_MEM_FULL))
+ groupc->times[PSI_MEM_FULL] += delta;
+ else if (memstall_tick) {
+ u32 sample;
+ /*
+ * Since we care about lost potential, a
+ * memstall is FULL when there are no other
+ * working tasks, but also when the CPU is
+ * actively reclaiming and nothing productive
+ * could run even if it were runnable.
+ *
+ * When the timer tick sees a reclaiming CPU,
+ * regardless of runnable tasks, sample a FULL
+ * tick (or less if it hasn't been a full tick
+ * since the last state change).
+ */
+ sample = min(delta, (u32)jiffies_to_nsecs(1));
+ groupc->times[PSI_MEM_FULL] += sample;
+ }
+ }
+
+ if (groupc->state_mask & (1 << PSI_CPU_SOME))
+ groupc->times[PSI_CPU_SOME] += delta;
+
+ if (groupc->state_mask & (1 << PSI_NONIDLE))
+ groupc->times[PSI_NONIDLE] += delta;
+}
+
+static u32 psi_group_change(struct psi_group *group, int cpu,
+ unsigned int clear, unsigned int set)
+{
+ struct psi_group_cpu *groupc;
+ unsigned int t, m;
+ enum psi_states s;
+ u32 state_mask = 0;
+
+ groupc = per_cpu_ptr(group->pcpu, cpu);
+
+ /*
+ * First we assess the aggregate resource states this CPU's
+ * tasks have been in since the last change, and account any
+ * SOME and FULL time these may have resulted in.
+ *
+ * Then we update the task counts according to the state
+ * change requested through the @clear and @set bits.
+ */
+ write_seqcount_begin(&groupc->seq);
+
+ record_times(groupc, cpu, false);
+
+ for (t = 0, m = clear; m; m &= ~(1 << t), t++) {
+ if (!(m & (1 << t)))
+ continue;
+ if (groupc->tasks[t] == 0 && !psi_bug) {
+ printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u] clear=%x set=%x\n",
+ cpu, t, groupc->tasks[0],
+ groupc->tasks[1], groupc->tasks[2],
+ clear, set);
+ psi_bug = 1;
+ }
+ groupc->tasks[t]--;
+ }
+
+ for (t = 0; set; set &= ~(1 << t), t++)
+ if (set & (1 << t))
+ groupc->tasks[t]++;
+
+ /* Calculate state mask representing active states */
+ for (s = 0; s < NR_PSI_STATES; s++) {
+ if (test_state(groupc->tasks, s))
+ state_mask |= (1 << s);
+ }
+ groupc->state_mask = state_mask;
+
+ write_seqcount_end(&groupc->seq);
+
+ return state_mask;
+}
+
+static struct psi_group *iterate_groups(struct task_struct *task, void **iter)
+{
+#ifdef CONFIG_CGROUPS
+ struct cgroup *cgroup = NULL;
+
+ if (!*iter)
+ cgroup = task->cgroups->dfl_cgrp;
+ else if (*iter == &psi_system)
+ return NULL;
+ else
+ cgroup = cgroup_parent(*iter);
+
+ if (cgroup && cgroup_parent(cgroup)) {
+ *iter = cgroup;
+ return cgroup_psi(cgroup);
+ }
+#else
+ if (*iter)
+ return NULL;
+#endif
+ *iter = &psi_system;
+ return &psi_system;
+}
+
+void psi_task_change(struct task_struct *task, int clear, int set)
+{
+ int cpu = task_cpu(task);
+ struct psi_group *group;
+ bool wake_clock = true;
+ void *iter = NULL;
+
+ if (!task->pid)
+ return;
+
+ if (((task->psi_flags & set) ||
+ (task->psi_flags & clear) != clear) &&
+ !psi_bug) {
+ printk_deferred(KERN_ERR "psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n",
+ task->pid, task->comm, cpu,
+ task->psi_flags, clear, set);
+ psi_bug = 1;
+ }
+
+ task->psi_flags &= ~clear;
+ task->psi_flags |= set;
+
+ /*
+ * Periodic aggregation shuts off if there is a period of no
+ * task changes, so we wake it back up if necessary. However,
+ * don't do this if the task change is the aggregation worker
+ * itself going to sleep, or we'll ping-pong forever.
+ */
+ if (unlikely((clear & TSK_RUNNING) &&
+ (task->flags & PF_WQ_WORKER) &&
+ wq_worker_last_func(task) == psi_avgs_work))
+ wake_clock = false;
+
+ while ((group = iterate_groups(task, &iter))) {
+ u32 state_mask = psi_group_change(group, cpu, clear, set);
+
+ if (state_mask & group->poll_states)
+ psi_schedule_poll_work(group, 1);
+
+ if (wake_clock && !delayed_work_pending(&group->avgs_work))
+ schedule_delayed_work(&group->avgs_work, PSI_FREQ);
+ }
+}
+
+void psi_memstall_tick(struct task_struct *task, int cpu)
+{
+ struct psi_group *group;
+ void *iter = NULL;
+
+ while ((group = iterate_groups(task, &iter))) {
+ struct psi_group_cpu *groupc;
+
+ groupc = per_cpu_ptr(group->pcpu, cpu);
+ write_seqcount_begin(&groupc->seq);
+ record_times(groupc, cpu, true);
+ write_seqcount_end(&groupc->seq);
+ }
+}
+
+/**
+ * psi_memstall_enter - mark the beginning of a memory stall section
+ * @flags: flags to handle nested sections
+ *
+ * Marks the calling task as being stalled due to a lack of memory,
+ * such as waiting for a refault or performing reclaim.
+ */
+void psi_memstall_enter(unsigned long *flags)
+{
+ struct rq_flags rf;
+ struct rq *rq;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ *flags = current->flags & PF_MEMSTALL;
+ if (*flags)
+ return;
+ /*
+ * PF_MEMSTALL setting & accounting needs to be atomic wrt
+ * changes to the task's scheduling state, otherwise we can
+ * race with CPU migration.
+ */
+ rq = this_rq_lock_irq(&rf);
+
+ current->flags |= PF_MEMSTALL;
+ psi_task_change(current, 0, TSK_MEMSTALL);
+
+ rq_unlock_irq(rq, &rf);
+}
+
+/**
+ * psi_memstall_leave - mark the end of an memory stall section
+ * @flags: flags to handle nested memdelay sections
+ *
+ * Marks the calling task as no longer stalled due to lack of memory.
+ */
+void psi_memstall_leave(unsigned long *flags)
+{
+ struct rq_flags rf;
+ struct rq *rq;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ if (*flags)
+ return;
+ /*
+ * PF_MEMSTALL clearing & accounting needs to be atomic wrt
+ * changes to the task's scheduling state, otherwise we could
+ * race with CPU migration.
+ */
+ rq = this_rq_lock_irq(&rf);
+
+ current->flags &= ~PF_MEMSTALL;
+ psi_task_change(current, TSK_MEMSTALL, 0);
+
+ rq_unlock_irq(rq, &rf);
+}
+
+#ifdef CONFIG_CGROUPS
+int psi_cgroup_alloc(struct cgroup *cgroup)
+{
+ if (static_branch_likely(&psi_disabled))
+ return 0;
+
+ cgroup->psi.pcpu = alloc_percpu(struct psi_group_cpu);
+ if (!cgroup->psi.pcpu)
+ return -ENOMEM;
+ group_init(&cgroup->psi);
+ return 0;
+}
+
+void psi_cgroup_free(struct cgroup *cgroup)
+{
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ cancel_delayed_work_sync(&cgroup->psi.avgs_work);
+ free_percpu(cgroup->psi.pcpu);
+ /* All triggers must be removed by now */
+ WARN_ONCE(cgroup->psi.poll_states, "psi: trigger leak\n");
+}
+
+/**
+ * cgroup_move_task - move task to a different cgroup
+ * @task: the task
+ * @to: the target css_set
+ *
+ * Move task to a new cgroup and safely migrate its associated stall
+ * state between the different groups.
+ *
+ * This function acquires the task's rq lock to lock out concurrent
+ * changes to the task's scheduling state and - in case the task is
+ * running - concurrent changes to its stall state.
+ */
+void cgroup_move_task(struct task_struct *task, struct css_set *to)
+{
+ unsigned int task_flags = 0;
+ struct rq_flags rf;
+ struct rq *rq;
+
+ if (static_branch_likely(&psi_disabled)) {
+ /*
+ * Lame to do this here, but the scheduler cannot be locked
+ * from the outside, so we move cgroups from inside sched/.
+ */
+ rcu_assign_pointer(task->cgroups, to);
+ return;
+ }
+
+ rq = task_rq_lock(task, &rf);
+
+ if (task_on_rq_queued(task))
+ task_flags = TSK_RUNNING;
+ else if (task->in_iowait)
+ task_flags = TSK_IOWAIT;
+
+ if (task->flags & PF_MEMSTALL)
+ task_flags |= TSK_MEMSTALL;
+
+ if (task_flags)
+ psi_task_change(task, task_flags, 0);
+
+ /* See comment above */
+ rcu_assign_pointer(task->cgroups, to);
+
+ if (task_flags)
+ psi_task_change(task, 0, task_flags);
+
+ task_rq_unlock(rq, task, &rf);
+}
+#endif /* CONFIG_CGROUPS */
+
+int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
+{
+ int full;
+ u64 now;
+
+ if (static_branch_likely(&psi_disabled))
+ return -EOPNOTSUPP;
+
+ /* Update averages before reporting them */
+ mutex_lock(&group->avgs_lock);
+ now = sched_clock();
+ collect_percpu_times(group, PSI_AVGS, NULL);
+ if (now >= group->avg_next_update)
+ group->avg_next_update = update_averages(group, now);
+ mutex_unlock(&group->avgs_lock);
+
+ for (full = 0; full < 2 - (res == PSI_CPU); full++) {
+ unsigned long avg[3];
+ u64 total;
+ int w;
+
+ for (w = 0; w < 3; w++)
+ avg[w] = group->avg[res * 2 + full][w];
+ total = div_u64(group->total[PSI_AVGS][res * 2 + full],
+ NSEC_PER_USEC);
+
+ seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
+ full ? "full" : "some",
+ LOAD_INT(avg[0]), LOAD_FRAC(avg[0]),
+ LOAD_INT(avg[1]), LOAD_FRAC(avg[1]),
+ LOAD_INT(avg[2]), LOAD_FRAC(avg[2]),
+ total);
+ }
+
+ return 0;
+}
+
+static int psi_io_show(struct seq_file *m, void *v)
+{
+ return psi_show(m, &psi_system, PSI_IO);
+}
+
+static int psi_memory_show(struct seq_file *m, void *v)
+{
+ return psi_show(m, &psi_system, PSI_MEM);
+}
+
+static int psi_cpu_show(struct seq_file *m, void *v)
+{
+ return psi_show(m, &psi_system, PSI_CPU);
+}
+
+static int psi_io_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, psi_io_show, NULL);
+}
+
+static int psi_memory_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, psi_memory_show, NULL);
+}
+
+static int psi_cpu_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, psi_cpu_show, NULL);
+}
+
+struct psi_trigger *psi_trigger_create(struct psi_group *group,
+ char *buf, size_t nbytes, enum psi_res res)
+{
+ struct psi_trigger *t;
+ enum psi_states state;
+ u32 threshold_us;
+ u32 window_us;
+
+ if (static_branch_likely(&psi_disabled))
+ return ERR_PTR(-EOPNOTSUPP);
+
+ if (sscanf(buf, "some %u %u", &threshold_us, &window_us) == 2)
+ state = PSI_IO_SOME + res * 2;
+ else if (sscanf(buf, "full %u %u", &threshold_us, &window_us) == 2)
+ state = PSI_IO_FULL + res * 2;
+ else
+ return ERR_PTR(-EINVAL);
+
+ if (state >= PSI_NONIDLE)
+ return ERR_PTR(-EINVAL);
+
+ if (window_us < WINDOW_MIN_US ||
+ window_us > WINDOW_MAX_US)
+ return ERR_PTR(-EINVAL);
+
+ /* Check threshold */
+ if (threshold_us == 0 || threshold_us > window_us)
+ return ERR_PTR(-EINVAL);
+
+ t = kmalloc(sizeof(*t), GFP_KERNEL);
+ if (!t)
+ return ERR_PTR(-ENOMEM);
+
+ t->group = group;
+ t->state = state;
+ t->threshold = threshold_us * NSEC_PER_USEC;
+ t->win.size = window_us * NSEC_PER_USEC;
+ window_reset(&t->win, 0, 0, 0);
+
+ t->event = 0;
+ t->last_event_time = 0;
+ init_waitqueue_head(&t->event_wait);
+ kref_init(&t->refcount);
+
+ mutex_lock(&group->trigger_lock);
+
+ if (!rcu_access_pointer(group->poll_kworker)) {
+ struct sched_param param = {
+ .sched_priority = MAX_RT_PRIO - 1,
+ };
+ struct kthread_worker *kworker;
+
+ kworker = kthread_create_worker(0, "psimon");
+ if (IS_ERR(kworker)) {
+ kfree(t);
+ mutex_unlock(&group->trigger_lock);
+ return ERR_CAST(kworker);
+ }
+ sched_setscheduler(kworker->task, SCHED_FIFO, ¶m);
+ kthread_init_delayed_work(&group->poll_work,
+ psi_poll_work);
+ rcu_assign_pointer(group->poll_kworker, kworker);
+ }
+
+ list_add(&t->node, &group->triggers);
+ group->poll_min_period = min(group->poll_min_period,
+ div_u64(t->win.size, UPDATES_PER_WINDOW));
+ group->nr_triggers[t->state]++;
+ group->poll_states |= (1 << t->state);
+
+ mutex_unlock(&group->trigger_lock);
+
+ return t;
+}
+
+static void psi_trigger_destroy(struct kref *ref)
+{
+ struct psi_trigger *t = container_of(ref, struct psi_trigger, refcount);
+ struct psi_group *group = t->group;
+ struct kthread_worker *kworker_to_destroy = NULL;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ /*
+ * Wakeup waiters to stop polling. Can happen if cgroup is deleted
+ * from under a polling process.
+ */
+ wake_up_interruptible(&t->event_wait);
+
+ mutex_lock(&group->trigger_lock);
+
+ if (!list_empty(&t->node)) {
+ struct psi_trigger *tmp;
+ u64 period = ULLONG_MAX;
+
+ list_del(&t->node);
+ group->nr_triggers[t->state]--;
+ if (!group->nr_triggers[t->state])
+ group->poll_states &= ~(1 << t->state);
+ /* reset min update period for the remaining triggers */
+ list_for_each_entry(tmp, &group->triggers, node)
+ period = min(period, div_u64(tmp->win.size,
+ UPDATES_PER_WINDOW));
+ group->poll_min_period = period;
+ /* Destroy poll_kworker when the last trigger is destroyed */
+ if (group->poll_states == 0) {
+ group->polling_until = 0;
+ kworker_to_destroy = rcu_dereference_protected(
+ group->poll_kworker,
+ lockdep_is_held(&group->trigger_lock));
+ rcu_assign_pointer(group->poll_kworker, NULL);
+ }
+ }
+
+ mutex_unlock(&group->trigger_lock);
+
+ /*
+ * Wait for both *trigger_ptr from psi_trigger_replace and
+ * poll_kworker RCUs to complete their read-side critical sections
+ * before destroying the trigger and optionally the poll_kworker
+ */
+ synchronize_rcu();
+ /*
+ * Destroy the kworker after releasing trigger_lock to prevent a
+ * deadlock while waiting for psi_poll_work to acquire trigger_lock
+ */
+ if (kworker_to_destroy) {
+ kthread_cancel_delayed_work_sync(&group->poll_work);
+ kthread_destroy_worker(kworker_to_destroy);
+ }
+ kfree(t);
+}
+
+void psi_trigger_replace(void **trigger_ptr, struct psi_trigger *new)
+{
+ struct psi_trigger *old = *trigger_ptr;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ rcu_assign_pointer(*trigger_ptr, new);
+ if (old)
+ kref_put(&old->refcount, psi_trigger_destroy);
+}
+
+unsigned int psi_trigger_poll(void **trigger_ptr, struct file *file,
+ poll_table *wait)
+{
+ unsigned int ret = DEFAULT_POLLMASK;
+ struct psi_trigger *t;
+
+ if (static_branch_likely(&psi_disabled))
+ return DEFAULT_POLLMASK | POLLERR | POLLPRI;
+
+ rcu_read_lock();
+
+ t = rcu_dereference(*(void __rcu __force **)trigger_ptr);
+ if (!t) {
+ rcu_read_unlock();
+ return DEFAULT_POLLMASK | POLLERR | POLLPRI;
+ }
+ kref_get(&t->refcount);
+
+ rcu_read_unlock();
+
+ poll_wait(file, &t->event_wait, wait);
+
+ if (cmpxchg(&t->event, 1, 0) == 1)
+ ret |= POLLPRI;
+
+ kref_put(&t->refcount, psi_trigger_destroy);
+
+ return ret;
+}
+
+static ssize_t psi_write(struct file *file, const char __user *user_buf,
+ size_t nbytes, enum psi_res res)
+{
+ char buf[32];
+ size_t buf_size;
+ struct seq_file *seq;
+ struct psi_trigger *new;
+
+ if (static_branch_likely(&psi_disabled))
+ return -EOPNOTSUPP;
+
+ buf_size = min(nbytes, (sizeof(buf) - 1));
+ if (copy_from_user(buf, user_buf, buf_size))
+ return -EFAULT;
+
+ buf[buf_size - 1] = '\0';
+
+ new = psi_trigger_create(&psi_system, buf, nbytes, res);
+ if (IS_ERR(new))
+ return PTR_ERR(new);
+
+ seq = file->private_data;
+ /* Take seq->lock to protect seq->private from concurrent writes */
+ mutex_lock(&seq->lock);
+ psi_trigger_replace(&seq->private, new);
+ mutex_unlock(&seq->lock);
+
+ return nbytes;
+}
+
+static ssize_t psi_io_write(struct file *file, const char __user *user_buf,
+ size_t nbytes, loff_t *ppos)
+{
+ return psi_write(file, user_buf, nbytes, PSI_IO);
+}
+
+static ssize_t psi_memory_write(struct file *file, const char __user *user_buf,
+ size_t nbytes, loff_t *ppos)
+{
+ return psi_write(file, user_buf, nbytes, PSI_MEM);
+}
+
+static ssize_t psi_cpu_write(struct file *file, const char __user *user_buf,
+ size_t nbytes, loff_t *ppos)
+{
+ return psi_write(file, user_buf, nbytes, PSI_CPU);
+}
+
+static unsigned int psi_fop_poll(struct file *file, poll_table *wait)
+{
+ struct seq_file *seq = file->private_data;
+
+ return psi_trigger_poll(&seq->private, file, wait);
+}
+
+static int psi_fop_release(struct inode *inode, struct file *file)
+{
+ struct seq_file *seq = file->private_data;
+
+ psi_trigger_replace(&seq->private, NULL);
+ return single_release(inode, file);
+}
+
+static const struct file_operations psi_io_fops = {
+ .open = psi_io_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .write = psi_io_write,
+ .poll = psi_fop_poll,
+ .release = psi_fop_release,
+};
+
+static const struct file_operations psi_memory_fops = {
+ .open = psi_memory_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .write = psi_memory_write,
+ .poll = psi_fop_poll,
+ .release = psi_fop_release,
+};
+
+static const struct file_operations psi_cpu_fops = {
+ .open = psi_cpu_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .write = psi_cpu_write,
+ .poll = psi_fop_poll,
+ .release = psi_fop_release,
+};
+
+static int __init psi_proc_init(void)
+{
+ proc_mkdir("pressure", NULL);
+ proc_create("pressure/io", 0, NULL, &psi_io_fops);
+ proc_create("pressure/memory", 0, NULL, &psi_memory_fops);
+ proc_create("pressure/cpu", 0, NULL, &psi_cpu_fops);
+ return 0;
+}
+module_init(psi_proc_init);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 95b68bd..1d704a4 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1625,7 +1625,7 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
}
static struct task_struct *
-pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct task_struct *p;
struct rt_rq *rt_rq = &rq->rt;
@@ -1637,9 +1637,9 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct pin_cookie coo
* disabled avoiding further scheduler activity on it and we're
* being very careful to re-start the picking loop.
*/
- lockdep_unpin_lock(&rq->lock, cookie);
+ rq_unpin_lock(rq, rf);
pull_rt_task(rq);
- lockdep_repin_lock(&rq->lock, cookie);
+ rq_repin_lock(rq, rf);
/*
* pull_rt_task() can drop (and re-acquire) rq->lock; this
* means a dl or stop task can slip in, in which case we need
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9942e02..af68f7a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -7,6 +7,7 @@
#include <linux/kernel_stat.h>
#include <linux/binfmts.h>
#include <linux/mutex.h>
+#include <linux/psi.h>
#include <linux/spinlock.h>
#include <linux/stop_machine.h>
#include <linux/irq_work.h>
@@ -303,6 +304,7 @@ extern struct mutex sched_domains_mutex;
#ifdef CONFIG_CGROUP_SCHED
#include <linux/cgroup.h>
+#include <linux/psi.h>
struct cfs_rq;
struct rt_rq;
@@ -901,6 +903,8 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
#define raw_rq() raw_cpu_ptr(&runqueues)
+extern void update_rq_clock(struct rq *rq);
+
static inline u64 __rq_clock_broken(struct rq *rq)
{
return READ_ONCE(rq->clock);
@@ -930,6 +934,118 @@ static inline void rq_clock_skip_update(struct rq *rq, bool skip)
rq->clock_skip_update &= ~RQCF_REQ_SKIP;
}
+struct rq_flags {
+ unsigned long flags;
+ struct pin_cookie cookie;
+};
+
+static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
+{
+ rf->cookie = lockdep_pin_lock(&rq->lock);
+}
+
+static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf)
+{
+ lockdep_unpin_lock(&rq->lock, rf->cookie);
+}
+
+static inline void rq_repin_lock(struct rq *rq, struct rq_flags *rf)
+{
+ lockdep_repin_lock(&rq->lock, rf->cookie);
+}
+
+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
+ __acquires(rq->lock);
+
+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
+ __acquires(p->pi_lock)
+ __acquires(rq->lock);
+
+static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
+ __releases(rq->lock)
+{
+ rq_unpin_lock(rq, rf);
+ raw_spin_unlock(&rq->lock);
+}
+
+static inline void
+task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+ __releases(rq->lock)
+ __releases(p->pi_lock)
+{
+ rq_unpin_lock(rq, rf);
+ raw_spin_unlock(&rq->lock);
+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
+}
+
+static inline void
+rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
+ __acquires(rq->lock)
+{
+ raw_spin_lock_irqsave(&rq->lock, rf->flags);
+ rq_pin_lock(rq, rf);
+}
+
+static inline void
+rq_lock_irq(struct rq *rq, struct rq_flags *rf)
+ __acquires(rq->lock)
+{
+ raw_spin_lock_irq(&rq->lock);
+ rq_pin_lock(rq, rf);
+}
+
+static inline void
+rq_lock(struct rq *rq, struct rq_flags *rf)
+ __acquires(rq->lock)
+{
+ raw_spin_lock(&rq->lock);
+ rq_pin_lock(rq, rf);
+}
+
+static inline void
+rq_relock(struct rq *rq, struct rq_flags *rf)
+ __acquires(rq->lock)
+{
+ raw_spin_lock(&rq->lock);
+ rq_repin_lock(rq, rf);
+}
+
+static inline void
+rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
+ __releases(rq->lock)
+{
+ rq_unpin_lock(rq, rf);
+ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
+}
+
+static inline void
+rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
+ __releases(rq->lock)
+{
+ rq_unpin_lock(rq, rf);
+ raw_spin_unlock_irq(&rq->lock);
+}
+
+static inline void
+rq_unlock(struct rq *rq, struct rq_flags *rf)
+ __releases(rq->lock)
+{
+ rq_unpin_lock(rq, rf);
+ raw_spin_unlock(&rq->lock);
+}
+
+static inline struct rq *
+this_rq_lock_irq(struct rq_flags *rf)
+ __acquires(rq->lock)
+{
+ struct rq *rq;
+
+ local_irq_disable();
+ rq = this_rq();
+ rq_lock(rq, rf);
+ return rq;
+}
+
#ifdef CONFIG_NUMA
enum numa_topology_type {
NUMA_DIRECT,
@@ -1404,7 +1520,7 @@ struct sched_class {
*/
struct task_struct * (*pick_next_task) (struct rq *rq,
struct task_struct *prev,
- struct pin_cookie cookie);
+ struct rq_flags *rf);
void (*put_prev_task) (struct rq *rq, struct task_struct *p);
#ifdef CONFIG_SMP
@@ -1656,8 +1772,6 @@ static inline void rq_last_tick_reset(struct rq *rq)
#endif
}
-extern void update_rq_clock(struct rq *rq);
-
extern void activate_task(struct rq *rq, struct task_struct *p, int flags);
extern void deactivate_task(struct rq *rq, struct task_struct *p, int flags);
@@ -1953,34 +2067,6 @@ static inline void sched_rt_avg_update(struct rq *rq, u64 rt_delta) { }
static inline void sched_avg_update(struct rq *rq) { }
#endif
-struct rq_flags {
- unsigned long flags;
- struct pin_cookie cookie;
-};
-
-struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
- __acquires(rq->lock);
-struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
- __acquires(p->pi_lock)
- __acquires(rq->lock);
-
-static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
- __releases(rq->lock)
-{
- lockdep_unpin_lock(&rq->lock, rf->cookie);
- raw_spin_unlock(&rq->lock);
-}
-
-static inline void
-task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
- __releases(rq->lock)
- __releases(p->pi_lock)
-{
- lockdep_unpin_lock(&rq->lock, rf->cookie);
- raw_spin_unlock(&rq->lock);
- raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-}
-
extern struct rq *lock_rq_of(struct task_struct *p, struct rq_flags *flags);
extern void unlock_rq_of(struct rq *rq, struct task_struct *p, struct rq_flags *flags);
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 34659a8..3c0d8f1 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -54,6 +54,92 @@ rq_sched_info_depart(struct rq *rq, unsigned long long delta)
#define schedstat_val_or_zero(var) 0
#endif /* CONFIG_SCHEDSTATS */
+#ifdef CONFIG_PSI
+/*
+ * PSI tracks state that persists across sleeps, such as iowaits and
+ * memory stalls. As a result, it has to distinguish between sleeps,
+ * where a task's runnable state changes, and requeues, where a task
+ * and its state are being moved between CPUs and runqueues.
+ */
+static inline void psi_enqueue(struct task_struct *p, bool wakeup)
+{
+ int clear = 0, set = TSK_RUNNING;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ if (!wakeup || p->sched_psi_wake_requeue) {
+ if (p->flags & PF_MEMSTALL)
+ set |= TSK_MEMSTALL;
+ if (p->sched_psi_wake_requeue)
+ p->sched_psi_wake_requeue = 0;
+ } else {
+ if (p->in_iowait)
+ clear |= TSK_IOWAIT;
+ }
+
+ psi_task_change(p, clear, set);
+}
+
+static inline void psi_dequeue(struct task_struct *p, bool sleep)
+{
+ int clear = TSK_RUNNING, set = 0;
+
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ if (!sleep) {
+ if (p->flags & PF_MEMSTALL)
+ clear |= TSK_MEMSTALL;
+ } else {
+ if (p->in_iowait)
+ set |= TSK_IOWAIT;
+ }
+
+ psi_task_change(p, clear, set);
+}
+
+static inline void psi_ttwu_dequeue(struct task_struct *p)
+{
+ if (static_branch_likely(&psi_disabled))
+ return;
+ /*
+ * Is the task being migrated during a wakeup? Make sure to
+ * deregister its sleep-persistent psi states from the old
+ * queue, and let psi_enqueue() know it has to requeue.
+ */
+ if (unlikely(p->in_iowait || (p->flags & PF_MEMSTALL))) {
+ struct rq_flags rf;
+ struct rq *rq;
+ int clear = 0;
+
+ if (p->in_iowait)
+ clear |= TSK_IOWAIT;
+ if (p->flags & PF_MEMSTALL)
+ clear |= TSK_MEMSTALL;
+
+ rq = __task_rq_lock(p, &rf);
+ psi_task_change(p, clear, 0);
+ p->sched_psi_wake_requeue = 1;
+ __task_rq_unlock(rq, &rf);
+ }
+}
+
+static inline void psi_task_tick(struct rq *rq)
+{
+ if (static_branch_likely(&psi_disabled))
+ return;
+
+ if (unlikely(rq->curr->flags & PF_MEMSTALL))
+ psi_memstall_tick(rq->curr, cpu_of(rq));
+}
+#else /* CONFIG_PSI */
+static inline void psi_enqueue(struct task_struct *p, bool wakeup) {}
+static inline void psi_dequeue(struct task_struct *p, bool sleep) {}
+static inline void psi_ttwu_dequeue(struct task_struct *p) {}
+static inline void psi_task_tick(struct rq *rq) {}
+#endif /* CONFIG_PSI */
+
#ifdef CONFIG_SCHED_INFO
static inline void sched_info_reset_dequeued(struct task_struct *t)
{
diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c
index 11a1888..a6ce92c 100644
--- a/kernel/sched/stop_task.c
+++ b/kernel/sched/stop_task.c
@@ -25,7 +25,7 @@ check_preempt_curr_stop(struct rq *rq, struct task_struct *p, int flags)
}
static struct task_struct *
-pick_next_task_stop(struct rq *rq, struct task_struct *prev, struct pin_cookie cookie)
+pick_next_task_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct task_struct *stop = rq->stop;
diff --git a/kernel/sys.c b/kernel/sys.c
index 60e7ee0..181a7f0 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -51,6 +51,7 @@
#include <linux/binfmts.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/rcupdate.h>
#include <linux/uidgid.h>
#include <linux/cred.h>
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index b4f1889..a3c6446 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -125,7 +125,9 @@ static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused three = 3;
static int __maybe_unused four = 4;
+static unsigned long zero_ul;
static unsigned long one_ul = 1;
+static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
static int __maybe_unused one_thousand = 1000;
#ifdef CONFIG_SCHED_WALT
@@ -1881,6 +1883,8 @@ static struct ctl_table fs_table[] = {
.maxlen = sizeof(files_stat.max_files),
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
+ .extra1 = &zero_ul,
+ .extra2 = &long_max,
},
{
.procname = "nr_open",
@@ -2576,7 +2580,16 @@ static int do_proc_dointvec_minmax_conv(bool *negp, unsigned long *lvalp,
{
struct do_proc_dointvec_minmax_conv_param *param = data;
if (write) {
- int val = *negp ? -*lvalp : *lvalp;
+ int val;
+ if (*negp) {
+ if (*lvalp > (unsigned long) INT_MAX + 1)
+ return -EINVAL;
+ val = -*lvalp;
+ } else {
+ if (*lvalp > (unsigned long) INT_MAX)
+ return -EINVAL;
+ val = *lvalp;
+ }
if ((param->min && *param->min > val) ||
(param->max && *param->max < val))
return -EINVAL;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9d4c257..86e8969 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -15,6 +15,7 @@
#include <linux/init.h>
#include <linux/mm.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/syscore_ops.h>
#include <linux/clocksource.h>
#include <linux/jiffies.h>
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index ae6bdf3..618c7de 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -32,6 +32,7 @@
#include <linux/list.h>
#include <linux/hash.h>
#include <linux/rcupdate.h>
+#include <linux/kprobes.h>
#include <trace/events/sched.h>
@@ -5247,7 +5248,7 @@ void ftrace_reset_array_ops(struct trace_array *tr)
tr->ops->func = ftrace_stub;
}
-static inline void
+static nokprobe_inline void
__ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *ignored, struct pt_regs *regs)
{
@@ -5310,12 +5311,14 @@ static void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
{
__ftrace_ops_list_func(ip, parent_ip, NULL, regs);
}
+NOKPROBE_SYMBOL(ftrace_ops_list_func);
#else
static void ftrace_ops_no_ops(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct pt_regs *regs)
{
__ftrace_ops_list_func(ip, parent_ip, NULL, NULL);
}
+NOKPROBE_SYMBOL(ftrace_ops_no_ops);
#endif
/*
@@ -5345,6 +5348,7 @@ static void ftrace_ops_assist_func(unsigned long ip, unsigned long parent_ip,
preempt_enable_notrace();
trace_clear_recursion(bit);
}
+NOKPROBE_SYMBOL(ftrace_ops_assist_func);
/**
* ftrace_ops_get_func - get the function a trampoline should call
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index f316e90..2cfe11e 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -701,7 +701,7 @@ u64 ring_buffer_time_stamp(struct ring_buffer *buffer, int cpu)
preempt_disable_notrace();
time = rb_time_stamp(buffer);
- preempt_enable_no_resched_notrace();
+ preempt_enable_notrace();
return time;
}
@@ -4037,6 +4037,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_consume);
* ring_buffer_read_prepare - Prepare for a non consuming read of the buffer
* @buffer: The ring buffer to read from
* @cpu: The cpu buffer to iterate over
+ * @flags: gfp flags to use for memory allocation
*
* This performs the initial preparations necessary to iterate
* through the buffer. Memory is allocated, buffer recording
@@ -4054,7 +4055,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_consume);
* This overall must be paired with ring_buffer_read_finish.
*/
struct ring_buffer_iter *
-ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu)
+ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu, gfp_t flags)
{
struct ring_buffer_per_cpu *cpu_buffer;
struct ring_buffer_iter *iter;
@@ -4062,7 +4063,7 @@ ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu)
if (!cpumask_test_cpu(cpu, buffer->cpumask))
return NULL;
- iter = kmalloc(sizeof(*iter), GFP_KERNEL);
+ iter = kmalloc(sizeof(*iter), flags);
if (!iter)
return NULL;
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c13e83b..7dba55c 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -501,8 +501,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
* not modified.
*/
pid_list = kmalloc(sizeof(*pid_list), GFP_KERNEL);
- if (!pid_list)
+ if (!pid_list) {
+ trace_parser_put(&parser);
return -ENOMEM;
+ }
pid_list->pid_max = READ_ONCE(pid_max);
@@ -512,6 +514,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
pid_list->pids = vzalloc((pid_list->pid_max + 7) >> 3);
if (!pid_list->pids) {
+ trace_parser_put(&parser);
kfree(pid_list);
return -ENOMEM;
}
@@ -3522,7 +3525,8 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
if (iter->cpu_file == RING_BUFFER_ALL_CPUS) {
for_each_tracing_cpu(cpu) {
iter->buffer_iter[cpu] =
- ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
+ ring_buffer_read_prepare(iter->trace_buffer->buffer,
+ cpu, GFP_KERNEL);
}
ring_buffer_read_prepare_sync();
for_each_tracing_cpu(cpu) {
@@ -3532,7 +3536,8 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
} else {
cpu = iter->cpu_file;
iter->buffer_iter[cpu] =
- ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
+ ring_buffer_read_prepare(iter->trace_buffer->buffer,
+ cpu, GFP_KERNEL);
ring_buffer_read_prepare_sync();
ring_buffer_read_start(iter->buffer_iter[cpu]);
tracing_iter_reset(iter, cpu);
@@ -5210,7 +5215,6 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
return ret;
fail:
- kfree(iter->trace);
kfree(iter);
__trace_array_put(tr);
mutex_unlock(&trace_types_lock);
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 0664044..766e5cc 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -871,9 +871,10 @@ static inline void add_to_key(char *compound_key, void *key,
/* ensure NULL-termination */
if (size > key_field->size - 1)
size = key_field->size - 1;
- }
- memcpy(compound_key + key_field->offset, key, size);
+ strncpy(compound_key + key_field->offset, (char *)key, size);
+ } else
+ memcpy(compound_key + key_field->offset, key, size);
}
static void event_hist_trigger(struct event_trigger_data *data, void *rec)
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index 57149bc..8964582 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -50,14 +50,16 @@ static void ftrace_dump_buf(int skip_lines, long cpu_file)
if (cpu_file == RING_BUFFER_ALL_CPUS) {
for_each_tracing_cpu(cpu) {
iter.buffer_iter[cpu] =
- ring_buffer_read_prepare(iter.trace_buffer->buffer, cpu);
+ ring_buffer_read_prepare(iter.trace_buffer->buffer,
+ cpu, GFP_ATOMIC);
ring_buffer_read_start(iter.buffer_iter[cpu]);
tracing_iter_reset(&iter, cpu);
}
} else {
iter.cpu_file = cpu_file;
iter.buffer_iter[cpu_file] =
- ring_buffer_read_prepare(iter.trace_buffer->buffer, cpu_file);
+ ring_buffer_read_prepare(iter.trace_buffer->buffer,
+ cpu_file, GFP_ATOMIC);
ring_buffer_read_start(iter.buffer_iter[cpu_file]);
tracing_iter_reset(&iter, cpu_file);
}
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 20bee0c..255a824 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -614,7 +614,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
/* Don't print "0x (null)" when offset is 0 */
if (tu->offset) {
- seq_printf(m, "0x%p", (void *)tu->offset);
+ seq_printf(m, "0x%px", (void *)tu->offset);
} else {
switch (sizeof(void *)) {
case 4:
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3806a97..e14c99521 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -293,6 +293,8 @@ module_param_named(disable_numa, wq_disable_numa, bool, 0444);
static bool wq_power_efficient = IS_ENABLED(CONFIG_WQ_POWER_EFFICIENT_DEFAULT);
module_param_named(power_efficient, wq_power_efficient, bool, 0444);
+bool wq_online; /* can kworkers be created yet? */
+
static bool wq_numa_enabled; /* unbound NUMA affinity enabled */
/* buf for wq_update_unbound_numa_attrs(), protected by CPU hotplug exclusion */
@@ -912,6 +914,26 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task)
}
/**
+ * wq_worker_last_func - retrieve worker's last work function
+ *
+ * Determine the last function a worker executed. This is called from
+ * the scheduler to get a worker's last known identity.
+ *
+ * CONTEXT:
+ * spin_lock_irq(rq->lock)
+ *
+ * Return:
+ * The last work function %current executed as a worker, NULL if it
+ * hasn't executed any work yet.
+ */
+work_func_t wq_worker_last_func(struct task_struct *task)
+{
+ struct worker *worker = kthread_data(task);
+
+ return worker->last_func;
+}
+
+/**
* worker_set_flags - set worker flags and adjust nr_running accordingly
* @worker: self
* @flags: flags to set
@@ -2130,6 +2152,9 @@ __acquires(&pool->lock)
if (unlikely(cpu_intensive))
worker_clr_flags(worker, WORKER_CPU_INTENSIVE);
+ /* tag the worker for identification in schedule() */
+ worker->last_func = worker->current_func;
+
/* we're done with it, release */
hash_del(&worker->hentry);
worker->current_work = NULL;
@@ -2586,6 +2611,9 @@ void flush_workqueue(struct workqueue_struct *wq)
};
int next_color;
+ if (WARN_ON(!wq_online))
+ return;
+
lock_map_acquire(&wq->lockdep_map);
lock_map_release(&wq->lockdep_map);
@@ -2846,6 +2874,9 @@ bool flush_work(struct work_struct *work)
{
struct wq_barrier barr;
+ if (WARN_ON(!wq_online))
+ return false;
+
lock_map_acquire(&work->lockdep_map);
lock_map_release(&work->lockdep_map);
@@ -2916,7 +2947,13 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
mark_work_canceling(work);
local_irq_restore(flags);
- flush_work(work);
+ /*
+ * This allows canceling during early boot. We know that @work
+ * isn't executing.
+ */
+ if (wq_online)
+ flush_work(work);
+
clear_work_data(work);
/*
@@ -3366,7 +3403,7 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
goto fail;
/* create and start the initial worker */
- if (!create_worker(pool))
+ if (wq_online && !create_worker(pool))
goto fail;
/* install */
@@ -3431,6 +3468,7 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
{
struct workqueue_struct *wq = pwq->wq;
bool freezable = wq->flags & WQ_FREEZABLE;
+ unsigned long flags;
/* for @wq->saved_max_active */
lockdep_assert_held(&wq->mutex);
@@ -3439,7 +3477,8 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
if (!freezable && pwq->max_active == wq->saved_max_active)
return;
- spin_lock_irq(&pwq->pool->lock);
+ /* this function can be called during early boot w/ irq disabled */
+ spin_lock_irqsave(&pwq->pool->lock, flags);
/*
* During [un]freezing, the caller is responsible for ensuring that
@@ -3462,7 +3501,7 @@ static void pwq_adjust_max_active(struct pool_workqueue *pwq)
pwq->max_active = 0;
}
- spin_unlock_irq(&pwq->pool->lock);
+ spin_unlock_irqrestore(&pwq->pool->lock, flags);
}
/* initialize newly alloced @pwq which is associated with @wq and @pool */
@@ -5509,7 +5548,17 @@ static void __init wq_numa_init(void)
wq_numa_enabled = true;
}
-static int __init init_workqueues(void)
+/**
+ * workqueue_init_early - early init for workqueue subsystem
+ *
+ * This is the first half of two-staged workqueue subsystem initialization
+ * and invoked as soon as the bare basics - memory allocation, cpumasks and
+ * idr are up. It sets up all the data structures and system workqueues
+ * and allows early boot code to create workqueues and queue/cancel work
+ * items. Actual work item execution starts only after kthreads can be
+ * created and scheduled right before early initcalls.
+ */
+int __init workqueue_init_early(void)
{
int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
int i, cpu;
@@ -5542,16 +5591,6 @@ static int __init init_workqueues(void)
}
}
- /* create the initial worker */
- for_each_online_cpu(cpu) {
- struct worker_pool *pool;
-
- for_each_cpu_worker_pool(pool, cpu) {
- pool->flags &= ~POOL_DISASSOCIATED;
- BUG_ON(!create_worker(pool));
- }
- }
-
/* create default unbound and ordered wq attrs */
for (i = 0; i < NR_STD_WORKER_POOLS; i++) {
struct workqueue_attrs *attrs;
@@ -5588,8 +5627,36 @@ static int __init init_workqueues(void)
!system_power_efficient_wq ||
!system_freezable_power_efficient_wq);
+ return 0;
+}
+
+/**
+ * workqueue_init - bring workqueue subsystem fully online
+ *
+ * This is the latter half of two-staged workqueue subsystem initialization
+ * and invoked as soon as kthreads can be created and scheduled.
+ * Workqueues have been created and work items queued on them, but there
+ * are no kworkers executing the work items yet. Populate the worker pools
+ * with the initial workers and enable future kworker creations.
+ */
+int __init workqueue_init(void)
+{
+ struct worker_pool *pool;
+ int cpu, bkt;
+
+ /* create the initial workers */
+ for_each_online_cpu(cpu) {
+ for_each_cpu_worker_pool(pool, cpu) {
+ pool->flags &= ~POOL_DISASSOCIATED;
+ BUG_ON(!create_worker(pool));
+ }
+ }
+
+ hash_for_each(unbound_pool_hash, bkt, pool, hash_node)
+ BUG_ON(!create_worker(pool));
+
+ wq_online = true;
wq_watchdog_init();
return 0;
}
-early_initcall(init_workqueues);
diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
index 29fa81f..c7f7db1 100644
--- a/kernel/workqueue_internal.h
+++ b/kernel/workqueue_internal.h
@@ -53,6 +53,9 @@ struct worker {
/* used only by rescuers to point to the target workqueue */
struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */
+
+ /* used by the scheduler to determine a worker's last known identity */
+ work_func_t last_func;
};
/**
@@ -67,9 +70,10 @@ static inline struct worker *current_wq_worker(void)
/*
* Scheduler hooks for concurrency managed workqueue. Only to be used from
- * sched/core.c and workqueue.c.
+ * sched/ and workqueue.c.
*/
void wq_worker_waking_up(struct task_struct *task, int cpu);
struct task_struct *wq_worker_sleeping(struct task_struct *task);
+work_func_t wq_worker_last_func(struct task_struct *task);
#endif /* _KERNEL_WORKQUEUE_INTERNAL_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ad250d1..639f799 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1939,9 +1939,9 @@
tristate "Perform selftest on hash functions"
default n
help
- Enable this option to test the kernel's integer (<linux/hash,h>)
- and string (<linux/stringhash.h>) hash functions on boot
- (or module load).
+ Enable this option to test the kernel's integer (<linux/hash.h>),
+ string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+ hash functions on boot (or module load).
This is intended to help people writing architecture-specific
optimized versions. If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 78e7175..515c99a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@
sha1.o chacha.o md5.o irq_regs.o argv_split.o \
flex_proportions.o ratelimit.o show_mem.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
- earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+ earlycpio.o seq_buf.o siphash.o \
+ nmi_backtrace.o nodemask.o win_minmax.o
lib-$(CONFIG_MMU) += ioremap.o
lib-$(CONFIG_SMP) += cpumask.o
@@ -46,7 +47,7 @@
obj-y += kstrtox.o
obj-$(CONFIG_TEST_BPF) += test_bpf.o
obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
obj-$(CONFIG_TEST_KASAN) += test_kasan.o
obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/assoc_array.c b/lib/assoc_array.c
index 5cd0935..3b46c54 100644
--- a/lib/assoc_array.c
+++ b/lib/assoc_array.c
@@ -781,9 +781,11 @@ static bool assoc_array_insert_into_terminal_node(struct assoc_array_edit *edit,
new_s0->index_key[i] =
ops->get_key_chunk(index_key, i * ASSOC_ARRAY_KEY_CHUNK_SIZE);
- blank = ULONG_MAX << (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
- pr_devel("blank off [%zu] %d: %lx\n", keylen - 1, level, blank);
- new_s0->index_key[keylen - 1] &= ~blank;
+ if (level & ASSOC_ARRAY_KEY_CHUNK_MASK) {
+ blank = ULONG_MAX << (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
+ pr_devel("blank off [%zu] %d: %lx\n", keylen - 1, level, blank);
+ new_s0->index_key[keylen - 1] &= ~blank;
+ }
/* This now reduces to a node splitting exercise for which we'll need
* to regenerate the disparity table.
diff --git a/lib/bsearch.c b/lib/bsearch.c
index e33c179..d500484 100644
--- a/lib/bsearch.c
+++ b/lib/bsearch.c
@@ -11,6 +11,7 @@
#include <linux/export.h>
#include <linux/bsearch.h>
+#include <linux/kprobes.h>
/*
* bsearch - binary search an array of elements
@@ -51,3 +52,4 @@ void *bsearch(const void *key, const void *base, size_t num, size_t size,
return NULL;
}
EXPORT_SYMBOL(bsearch);
+NOKPROBE_SYMBOL(bsearch);
diff --git a/lib/bug.c b/lib/bug.c
index bc3656e..dc7aa6b 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -177,7 +177,7 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
if (file)
pr_crit("kernel BUG at %s:%u!\n", file, line);
else
- pr_crit("Kernel BUG at %p [verbose debug info unavailable]\n",
+ pr_crit("Kernel BUG at %pB [verbose debug info unavailable]\n",
(void *)bugaddr);
return BUG_TRAP_TYPE_BUG;
diff --git a/lib/div64.c b/lib/div64.c
index 7f34525..c1c1a4c 100644
--- a/lib/div64.c
+++ b/lib/div64.c
@@ -102,7 +102,7 @@ u64 div64_u64_rem(u64 dividend, u64 divisor, u64 *remainder)
quot = div_u64_rem(dividend, divisor, &rem32);
*remainder = rem32;
} else {
- int n = 1 + fls(high);
+ int n = fls(high);
quot = div_u64(dividend >> n, divisor >> n);
if (quot != 0)
@@ -140,7 +140,7 @@ u64 div64_u64(u64 dividend, u64 divisor)
if (high == 0) {
quot = div_u64(dividend, divisor);
} else {
- int n = 1 + fls(high);
+ int n = fls(high);
quot = div_u64(dividend >> n, divisor >> n);
if (quot != 0)
diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c
index 1ef4cc3..6d35274 100644
--- a/lib/int_sqrt.c
+++ b/lib/int_sqrt.c
@@ -7,6 +7,7 @@
#include <linux/kernel.h>
#include <linux/export.h>
+#include <linux/bitops.h>
/**
* int_sqrt - rough approximation to sqrt
@@ -21,7 +22,7 @@ unsigned long int_sqrt(unsigned long x)
if (x <= 1)
return x;
- m = 1UL << (BITS_PER_LONG - 2);
+ m = 1UL << (__fls(x) & ~1UL);
while (m != 0) {
b = y + m;
y >>= 1;
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 3057011..10ca694 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -24,7 +24,7 @@
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -ffreestanding
ifeq ($(ARCH),arm)
-NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
endif
ifeq ($(ARCH),arm64)
CFLAGS_REMOVE_neon1.o += -mgeneral-regs-only
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 0000000..3ae58b4
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,551 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4 for a secure PRF
+ * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for
+ * hashtables.
+ */
+
+#include <linux/siphash.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+ do { \
+ v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+ v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+ v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+ v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+ } while (0)
+
+#define PREAMBLE(len) \
+ u64 v0 = 0x736f6d6570736575ULL; \
+ u64 v1 = 0x646f72616e646f6dULL; \
+ u64 v2 = 0x6c7967656e657261ULL; \
+ u64 v3 = 0x7465646279746573ULL; \
+ u64 b = ((u64)(len)) << 56; \
+ v3 ^= key->key[1]; \
+ v2 ^= key->key[0]; \
+ v1 ^= key->key[1]; \
+ v0 ^= key->key[0];
+
+#define POSTAMBLE \
+ v3 ^= b; \
+ SIPROUND; \
+ SIPROUND; \
+ v0 ^= b; \
+ v2 ^= 0xff; \
+ SIPROUND; \
+ SIPROUND; \
+ SIPROUND; \
+ SIPROUND; \
+ return (v0 ^ v1) ^ (v2 ^ v3);
+
+u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u64));
+ const u8 left = len & (sizeof(u64) - 1);
+ u64 m;
+ PREAMBLE(len)
+ for (; data != end; data += sizeof(u64)) {
+ m = le64_to_cpup(data);
+ v3 ^= m;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= m;
+ }
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+ if (left)
+ b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+ bytemask_from_count(left)));
+#else
+ switch (left) {
+ case 7: b |= ((u64)end[6]) << 48;
+ case 6: b |= ((u64)end[5]) << 40;
+ case 5: b |= ((u64)end[4]) << 32;
+ case 4: b |= le32_to_cpup(data); break;
+ case 3: b |= ((u64)end[2]) << 16;
+ case 2: b |= le16_to_cpup(data); break;
+ case 1: b |= end[0];
+ }
+#endif
+ POSTAMBLE
+}
+EXPORT_SYMBOL(__siphash_aligned);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u64));
+ const u8 left = len & (sizeof(u64) - 1);
+ u64 m;
+ PREAMBLE(len)
+ for (; data != end; data += sizeof(u64)) {
+ m = get_unaligned_le64(data);
+ v3 ^= m;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= m;
+ }
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+ if (left)
+ b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+ bytemask_from_count(left)));
+#else
+ switch (left) {
+ case 7: b |= ((u64)end[6]) << 48;
+ case 6: b |= ((u64)end[5]) << 40;
+ case 5: b |= ((u64)end[4]) << 32;
+ case 4: b |= get_unaligned_le32(end); break;
+ case 3: b |= ((u64)end[2]) << 16;
+ case 2: b |= get_unaligned_le16(end); break;
+ case 1: b |= end[0];
+ }
+#endif
+ POSTAMBLE
+}
+EXPORT_SYMBOL(__siphash_unaligned);
+#endif
+
+/**
+ * siphash_1u64 - compute 64-bit siphash PRF value of a u64
+ * @first: first u64
+ * @key: the siphash key
+ */
+u64 siphash_1u64(const u64 first, const siphash_key_t *key)
+{
+ PREAMBLE(8)
+ v3 ^= first;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= first;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_1u64);
+
+/**
+ * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64
+ * @first: first u64
+ * @second: second u64
+ * @key: the siphash key
+ */
+u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t *key)
+{
+ PREAMBLE(16)
+ v3 ^= first;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= second;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_2u64);
+
+/**
+ * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @key: the siphash key
+ */
+u64 siphash_3u64(const u64 first, const u64 second, const u64 third,
+ const siphash_key_t *key)
+{
+ PREAMBLE(24)
+ v3 ^= first;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= second;
+ v3 ^= third;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= third;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_3u64);
+
+/**
+ * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @forth: forth u64
+ * @key: the siphash key
+ */
+u64 siphash_4u64(const u64 first, const u64 second, const u64 third,
+ const u64 forth, const siphash_key_t *key)
+{
+ PREAMBLE(32)
+ v3 ^= first;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= second;
+ v3 ^= third;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= third;
+ v3 ^= forth;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= forth;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_4u64);
+
+u64 siphash_1u32(const u32 first, const siphash_key_t *key)
+{
+ PREAMBLE(4)
+ b |= first;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_1u32);
+
+u64 siphash_3u32(const u32 first, const u32 second, const u32 third,
+ const siphash_key_t *key)
+{
+ u64 combined = (u64)second << 32 | first;
+ PREAMBLE(12)
+ v3 ^= combined;
+ SIPROUND;
+ SIPROUND;
+ v0 ^= combined;
+ b |= third;
+ POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_3u32);
+
+#if BITS_PER_LONG == 64
+/* Note that on 64-bit, we make HalfSipHash1-3 actually be SipHash1-3, for
+ * performance reasons. On 32-bit, below, we actually implement HalfSipHash1-3.
+ */
+
+#define HSIPROUND SIPROUND
+#define HPREAMBLE(len) PREAMBLE(len)
+#define HPOSTAMBLE \
+ v3 ^= b; \
+ HSIPROUND; \
+ v0 ^= b; \
+ v2 ^= 0xff; \
+ HSIPROUND; \
+ HSIPROUND; \
+ HSIPROUND; \
+ return (v0 ^ v1) ^ (v2 ^ v3);
+
+u32 __hsiphash_aligned(const void *data, size_t len, const hsiphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u64));
+ const u8 left = len & (sizeof(u64) - 1);
+ u64 m;
+ HPREAMBLE(len)
+ for (; data != end; data += sizeof(u64)) {
+ m = le64_to_cpup(data);
+ v3 ^= m;
+ HSIPROUND;
+ v0 ^= m;
+ }
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+ if (left)
+ b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+ bytemask_from_count(left)));
+#else
+ switch (left) {
+ case 7: b |= ((u64)end[6]) << 48;
+ case 6: b |= ((u64)end[5]) << 40;
+ case 5: b |= ((u64)end[4]) << 32;
+ case 4: b |= le32_to_cpup(data); break;
+ case 3: b |= ((u64)end[2]) << 16;
+ case 2: b |= le16_to_cpup(data); break;
+ case 1: b |= end[0];
+ }
+#endif
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(__hsiphash_aligned);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u32 __hsiphash_unaligned(const void *data, size_t len,
+ const hsiphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u64));
+ const u8 left = len & (sizeof(u64) - 1);
+ u64 m;
+ HPREAMBLE(len)
+ for (; data != end; data += sizeof(u64)) {
+ m = get_unaligned_le64(data);
+ v3 ^= m;
+ HSIPROUND;
+ v0 ^= m;
+ }
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+ if (left)
+ b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+ bytemask_from_count(left)));
+#else
+ switch (left) {
+ case 7: b |= ((u64)end[6]) << 48;
+ case 6: b |= ((u64)end[5]) << 40;
+ case 5: b |= ((u64)end[4]) << 32;
+ case 4: b |= get_unaligned_le32(end); break;
+ case 3: b |= ((u64)end[2]) << 16;
+ case 2: b |= get_unaligned_le16(end); break;
+ case 1: b |= end[0];
+ }
+#endif
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(__hsiphash_unaligned);
+#endif
+
+/**
+ * hsiphash_1u32 - compute 64-bit hsiphash PRF value of a u32
+ * @first: first u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_1u32(const u32 first, const hsiphash_key_t *key)
+{
+ HPREAMBLE(4)
+ b |= first;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_1u32);
+
+/**
+ * hsiphash_2u32 - compute 32-bit hsiphash PRF value of 2 u32
+ * @first: first u32
+ * @second: second u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_2u32(const u32 first, const u32 second, const hsiphash_key_t *key)
+{
+ u64 combined = (u64)second << 32 | first;
+ HPREAMBLE(8)
+ v3 ^= combined;
+ HSIPROUND;
+ v0 ^= combined;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_2u32);
+
+/**
+ * hsiphash_3u32 - compute 32-bit hsiphash PRF value of 3 u32
+ * @first: first u32
+ * @second: second u32
+ * @third: third u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_3u32(const u32 first, const u32 second, const u32 third,
+ const hsiphash_key_t *key)
+{
+ u64 combined = (u64)second << 32 | first;
+ HPREAMBLE(12)
+ v3 ^= combined;
+ HSIPROUND;
+ v0 ^= combined;
+ b |= third;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_3u32);
+
+/**
+ * hsiphash_4u32 - compute 32-bit hsiphash PRF value of 4 u32
+ * @first: first u32
+ * @second: second u32
+ * @third: third u32
+ * @forth: forth u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_4u32(const u32 first, const u32 second, const u32 third,
+ const u32 forth, const hsiphash_key_t *key)
+{
+ u64 combined = (u64)second << 32 | first;
+ HPREAMBLE(16)
+ v3 ^= combined;
+ HSIPROUND;
+ v0 ^= combined;
+ combined = (u64)forth << 32 | third;
+ v3 ^= combined;
+ HSIPROUND;
+ v0 ^= combined;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_4u32);
+#else
+#define HSIPROUND \
+ do { \
+ v0 += v1; v1 = rol32(v1, 5); v1 ^= v0; v0 = rol32(v0, 16); \
+ v2 += v3; v3 = rol32(v3, 8); v3 ^= v2; \
+ v0 += v3; v3 = rol32(v3, 7); v3 ^= v0; \
+ v2 += v1; v1 = rol32(v1, 13); v1 ^= v2; v2 = rol32(v2, 16); \
+ } while (0)
+
+#define HPREAMBLE(len) \
+ u32 v0 = 0; \
+ u32 v1 = 0; \
+ u32 v2 = 0x6c796765U; \
+ u32 v3 = 0x74656462U; \
+ u32 b = ((u32)(len)) << 24; \
+ v3 ^= key->key[1]; \
+ v2 ^= key->key[0]; \
+ v1 ^= key->key[1]; \
+ v0 ^= key->key[0];
+
+#define HPOSTAMBLE \
+ v3 ^= b; \
+ HSIPROUND; \
+ v0 ^= b; \
+ v2 ^= 0xff; \
+ HSIPROUND; \
+ HSIPROUND; \
+ HSIPROUND; \
+ return v1 ^ v3;
+
+u32 __hsiphash_aligned(const void *data, size_t len, const hsiphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u32));
+ const u8 left = len & (sizeof(u32) - 1);
+ u32 m;
+ HPREAMBLE(len)
+ for (; data != end; data += sizeof(u32)) {
+ m = le32_to_cpup(data);
+ v3 ^= m;
+ HSIPROUND;
+ v0 ^= m;
+ }
+ switch (left) {
+ case 3: b |= ((u32)end[2]) << 16;
+ case 2: b |= le16_to_cpup(data); break;
+ case 1: b |= end[0];
+ }
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(__hsiphash_aligned);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+u32 __hsiphash_unaligned(const void *data, size_t len,
+ const hsiphash_key_t *key)
+{
+ const u8 *end = data + len - (len % sizeof(u32));
+ const u8 left = len & (sizeof(u32) - 1);
+ u32 m;
+ HPREAMBLE(len)
+ for (; data != end; data += sizeof(u32)) {
+ m = get_unaligned_le32(data);
+ v3 ^= m;
+ HSIPROUND;
+ v0 ^= m;
+ }
+ switch (left) {
+ case 3: b |= ((u32)end[2]) << 16;
+ case 2: b |= get_unaligned_le16(end); break;
+ case 1: b |= end[0];
+ }
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(__hsiphash_unaligned);
+#endif
+
+/**
+ * hsiphash_1u32 - compute 32-bit hsiphash PRF value of a u32
+ * @first: first u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_1u32(const u32 first, const hsiphash_key_t *key)
+{
+ HPREAMBLE(4)
+ v3 ^= first;
+ HSIPROUND;
+ v0 ^= first;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_1u32);
+
+/**
+ * hsiphash_2u32 - compute 32-bit hsiphash PRF value of 2 u32
+ * @first: first u32
+ * @second: second u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_2u32(const u32 first, const u32 second, const hsiphash_key_t *key)
+{
+ HPREAMBLE(8)
+ v3 ^= first;
+ HSIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ HSIPROUND;
+ v0 ^= second;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_2u32);
+
+/**
+ * hsiphash_3u32 - compute 32-bit hsiphash PRF value of 3 u32
+ * @first: first u32
+ * @second: second u32
+ * @third: third u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_3u32(const u32 first, const u32 second, const u32 third,
+ const hsiphash_key_t *key)
+{
+ HPREAMBLE(12)
+ v3 ^= first;
+ HSIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ HSIPROUND;
+ v0 ^= second;
+ v3 ^= third;
+ HSIPROUND;
+ v0 ^= third;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_3u32);
+
+/**
+ * hsiphash_4u32 - compute 32-bit hsiphash PRF value of 4 u32
+ * @first: first u32
+ * @second: second u32
+ * @third: third u32
+ * @forth: forth u32
+ * @key: the hsiphash key
+ */
+u32 hsiphash_4u32(const u32 first, const u32 second, const u32 third,
+ const u32 forth, const hsiphash_key_t *key)
+{
+ HPREAMBLE(16)
+ v3 ^= first;
+ HSIPROUND;
+ v0 ^= first;
+ v3 ^= second;
+ HSIPROUND;
+ v0 ^= second;
+ v3 ^= third;
+ HSIPROUND;
+ v0 ^= third;
+ v3 ^= forth;
+ HSIPROUND;
+ v0 ^= forth;
+ HPOSTAMBLE
+}
+EXPORT_SYMBOL(hsiphash_4u32);
+#endif
diff --git a/lib/string.c b/lib/string.c
index ccabe16..ddd3907 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -772,6 +772,26 @@ __visible int memcmp(const void *cs, const void *ct, size_t count)
EXPORT_SYMBOL(memcmp);
#endif
+#ifndef __HAVE_ARCH_BCMP
+/**
+ * bcmp - returns 0 if and only if the buffers have identical contents.
+ * @a: pointer to first buffer.
+ * @b: pointer to second buffer.
+ * @len: size of buffers.
+ *
+ * The sign or magnitude of a non-zero return value has no particular
+ * meaning, and architectures may implement their own more efficient bcmp(). So
+ * while this particular implementation is a simple (tail) call to memcmp, do
+ * not rely on anything but whether the return value is zero or non-zero.
+ */
+#undef bcmp
+int bcmp(const void *a, const void *b, size_t len)
+{
+ return memcmp(a, b, len);
+}
+EXPORT_SYMBOL(bcmp);
+#endif
+
#ifndef __HAVE_ARCH_MEMSCAN
/**
* memscan - Find a character in an area of memory.
diff --git a/lib/test_printf.c b/lib/test_printf.c
index 563f10e..71ebfa4 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -24,24 +24,6 @@
#define PAD_SIZE 16
#define FILL_CHAR '$'
-#define PTR1 ((void*)0x01234567)
-#define PTR2 ((void*)(long)(int)0xfedcba98)
-
-#if BITS_PER_LONG == 64
-#define PTR1_ZEROES "000000000"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fffffffffedcba98"
-#define PTR_WIDTH 16
-#else
-#define PTR1_ZEROES "0"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fedcba98"
-#define PTR_WIDTH 8
-#endif
-#define PTR_WIDTH_STR stringify(PTR_WIDTH)
-
static unsigned total_tests __initdata;
static unsigned failed_tests __initdata;
static char *test_buffer __initdata;
@@ -217,30 +199,79 @@ test_string(void)
test("a | | ", "%-3.s|%-3.0s|%-3.*s", "a", "b", 0, "c");
}
+#define PLAIN_BUF_SIZE 64 /* leave some space so we don't oops */
+
+#if BITS_PER_LONG == 64
+
+#define PTR_WIDTH 16
+#define PTR ((void *)0xffff0123456789ab)
+#define PTR_STR "ffff0123456789ab"
+#define ZEROS "00000000" /* hex 32 zero bits */
+
+static int __init
+plain_format(void)
+{
+ char buf[PLAIN_BUF_SIZE];
+ int nchars;
+
+ nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+ if (nchars != PTR_WIDTH || strncmp(buf, ZEROS, strlen(ZEROS)) != 0)
+ return -1;
+
+ return 0;
+}
+
+#else
+
+#define PTR_WIDTH 8
+#define PTR ((void *)0x456789ab)
+#define PTR_STR "456789ab"
+
+static int __init
+plain_format(void)
+{
+ /* Format is implicitly tested for 32 bit machines by plain_hash() */
+ return 0;
+}
+
+#endif /* BITS_PER_LONG == 64 */
+
+static int __init
+plain_hash(void)
+{
+ char buf[PLAIN_BUF_SIZE];
+ int nchars;
+
+ nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+ if (nchars != PTR_WIDTH || strncmp(buf, PTR_STR, PTR_WIDTH) == 0)
+ return -1;
+
+ return 0;
+}
+
+/*
+ * We can't use test() to test %p because we don't know what output to expect
+ * after an address is hashed.
+ */
static void __init
plain(void)
{
- test(PTR1_ZEROES PTR1_STR " " PTR2_STR, "%p %p", PTR1, PTR2);
- /*
- * The field width is overloaded for some %p extensions to
- * pass another piece of information. For plain pointers, the
- * behaviour is slightly odd: One cannot pass either the 0
- * flag nor a precision to %p without gcc complaining, and if
- * one explicitly gives a field width, the number is no longer
- * zero-padded.
- */
- test("|" PTR1_STR PTR1_SPACES " | " PTR1_SPACES PTR1_STR "|",
- "|%-*p|%*p|", PTR_WIDTH+2, PTR1, PTR_WIDTH+2, PTR1);
- test("|" PTR2_STR " | " PTR2_STR "|",
- "|%-*p|%*p|", PTR_WIDTH+2, PTR2, PTR_WIDTH+2, PTR2);
+ int err;
- /*
- * Unrecognized %p extensions are treated as plain %p, but the
- * alphanumeric suffix is ignored (that is, does not occur in
- * the output.)
- */
- test("|"PTR1_ZEROES PTR1_STR"|", "|%p0y|", PTR1);
- test("|"PTR2_STR"|", "|%p0y|", PTR2);
+ err = plain_hash();
+ if (err) {
+ pr_warn("plain 'p' does not appear to be hashed\n");
+ failed_tests++;
+ return;
+ }
+
+ err = plain_format();
+ if (err) {
+ pr_warn("hashing plain 'p' has unexpected format\n");
+ failed_tests++;
+ }
}
static void __init
@@ -251,6 +282,7 @@ symbol_ptr(void)
static void __init
kernel_ptr(void)
{
+ /* We can't test this without access to kptr_restrict. */
}
static void __init
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 0000000..a6d854d
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,223 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4 for a secure PRF
+ * and HalfSipHash1-3/SipHash1-3 for an insecure PRF only suitable for
+ * hashtables.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from reference source available at:
+ * https://github.com/veorq/SipHash
+ */
+
+static const siphash_key_t test_key_siphash =
+ {{ 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL }};
+
+static const u64 test_vectors_siphash[64] = {
+ 0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+ 0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+ 0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+ 0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+ 0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+ 0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+ 0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+ 0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+ 0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+ 0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+ 0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+ 0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+ 0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+ 0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+ 0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+ 0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+ 0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+ 0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+ 0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+ 0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+ 0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+ 0x958a324ceb064572ULL
+};
+
+#if BITS_PER_LONG == 64
+static const hsiphash_key_t test_key_hsiphash =
+ {{ 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL }};
+
+static const u32 test_vectors_hsiphash[64] = {
+ 0x050fc4dcU, 0x7d57ca93U, 0x4dc7d44dU,
+ 0xe7ddf7fbU, 0x88d38328U, 0x49533b67U,
+ 0xc59f22a7U, 0x9bb11140U, 0x8d299a8eU,
+ 0x6c063de4U, 0x92ff097fU, 0xf94dc352U,
+ 0x57b4d9a2U, 0x1229ffa7U, 0xc0f95d34U,
+ 0x2a519956U, 0x7d908b66U, 0x63dbd80cU,
+ 0xb473e63eU, 0x8d297d1cU, 0xa6cce040U,
+ 0x2b45f844U, 0xa320872eU, 0xdae6c123U,
+ 0x67349c8cU, 0x705b0979U, 0xca9913a5U,
+ 0x4ade3b35U, 0xef6cd00dU, 0x4ab1e1f4U,
+ 0x43c5e663U, 0x8c21d1bcU, 0x16a7b60dU,
+ 0x7a8ff9bfU, 0x1f2a753eU, 0xbf186b91U,
+ 0xada26206U, 0xa3c33057U, 0xae3a36a1U,
+ 0x7b108392U, 0x99e41531U, 0x3f1ad944U,
+ 0xc8138825U, 0xc28949a6U, 0xfaf8876bU,
+ 0x9f042196U, 0x68b1d623U, 0x8b5114fdU,
+ 0xdf074c46U, 0x12cc86b3U, 0x0a52098fU,
+ 0x9d292f9aU, 0xa2f41f12U, 0x43a71ed0U,
+ 0x73f0bce6U, 0x70a7e980U, 0x243c6d75U,
+ 0xfdb71513U, 0xa67d8a08U, 0xb7e8f148U,
+ 0xf7a644eeU, 0x0f1837f2U, 0x4b6694e0U,
+ 0xb7bbb3a8U
+};
+#else
+static const hsiphash_key_t test_key_hsiphash =
+ {{ 0x03020100U, 0x07060504U }};
+
+static const u32 test_vectors_hsiphash[64] = {
+ 0x5814c896U, 0xe7e864caU, 0xbc4b0e30U,
+ 0x01539939U, 0x7e059ea6U, 0x88e3d89bU,
+ 0xa0080b65U, 0x9d38d9d6U, 0x577999b1U,
+ 0xc839caedU, 0xe4fa32cfU, 0x959246eeU,
+ 0x6b28096cU, 0x66dd9cd6U, 0x16658a7cU,
+ 0xd0257b04U, 0x8b31d501U, 0x2b1cd04bU,
+ 0x06712339U, 0x522aca67U, 0x911bb605U,
+ 0x90a65f0eU, 0xf826ef7bU, 0x62512debU,
+ 0x57150ad7U, 0x5d473507U, 0x1ec47442U,
+ 0xab64afd3U, 0x0a4100d0U, 0x6d2ce652U,
+ 0x2331b6a3U, 0x08d8791aU, 0xbc6dda8dU,
+ 0xe0f6c934U, 0xb0652033U, 0x9b9851ccU,
+ 0x7c46fb7fU, 0x732ba8cbU, 0xf142997aU,
+ 0xfcc9aa1bU, 0x05327eb2U, 0xe110131cU,
+ 0xf9e5e7c0U, 0xa7d708a6U, 0x11795ab1U,
+ 0x65671619U, 0x9f5fff91U, 0xd89c5267U,
+ 0x007783ebU, 0x95766243U, 0xab639262U,
+ 0x9c7e1390U, 0xc368dda6U, 0x38ddc455U,
+ 0xfa13d379U, 0x979ea4e8U, 0x53ecd77eU,
+ 0x2ee80657U, 0x33dbb66aU, 0xae3f0577U,
+ 0x88b4c4ccU, 0x3e7f480bU, 0x74c1ebf8U,
+ 0x87178304U
+};
+#endif
+
+static int __init siphash_test_init(void)
+{
+ u8 in[64] __aligned(SIPHASH_ALIGNMENT);
+ u8 in_unaligned[65] __aligned(SIPHASH_ALIGNMENT);
+ u8 i;
+ int ret = 0;
+
+ for (i = 0; i < 64; ++i) {
+ in[i] = i;
+ in_unaligned[i + 1] = i;
+ if (siphash(in, i, &test_key_siphash) !=
+ test_vectors_siphash[i]) {
+ pr_info("siphash self-test aligned %u: FAIL\n", i + 1);
+ ret = -EINVAL;
+ }
+ if (siphash(in_unaligned + 1, i, &test_key_siphash) !=
+ test_vectors_siphash[i]) {
+ pr_info("siphash self-test unaligned %u: FAIL\n", i + 1);
+ ret = -EINVAL;
+ }
+ if (hsiphash(in, i, &test_key_hsiphash) !=
+ test_vectors_hsiphash[i]) {
+ pr_info("hsiphash self-test aligned %u: FAIL\n", i + 1);
+ ret = -EINVAL;
+ }
+ if (hsiphash(in_unaligned + 1, i, &test_key_hsiphash) !=
+ test_vectors_hsiphash[i]) {
+ pr_info("hsiphash self-test unaligned %u: FAIL\n", i + 1);
+ ret = -EINVAL;
+ }
+ }
+ if (siphash_1u64(0x0706050403020100ULL, &test_key_siphash) !=
+ test_vectors_siphash[8]) {
+ pr_info("siphash self-test 1u64: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+ &test_key_siphash) != test_vectors_siphash[16]) {
+ pr_info("siphash self-test 2u64: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+ 0x1716151413121110ULL, &test_key_siphash) !=
+ test_vectors_siphash[24]) {
+ pr_info("siphash self-test 3u64: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+ 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL,
+ &test_key_siphash) != test_vectors_siphash[32]) {
+ pr_info("siphash self-test 4u64: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_1u32(0x03020100U, &test_key_siphash) !=
+ test_vectors_siphash[4]) {
+ pr_info("siphash self-test 1u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_2u32(0x03020100U, 0x07060504U, &test_key_siphash) !=
+ test_vectors_siphash[8]) {
+ pr_info("siphash self-test 2u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_3u32(0x03020100U, 0x07060504U,
+ 0x0b0a0908U, &test_key_siphash) !=
+ test_vectors_siphash[12]) {
+ pr_info("siphash self-test 3u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (siphash_4u32(0x03020100U, 0x07060504U,
+ 0x0b0a0908U, 0x0f0e0d0cU, &test_key_siphash) !=
+ test_vectors_siphash[16]) {
+ pr_info("siphash self-test 4u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (hsiphash_1u32(0x03020100U, &test_key_hsiphash) !=
+ test_vectors_hsiphash[4]) {
+ pr_info("hsiphash self-test 1u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (hsiphash_2u32(0x03020100U, 0x07060504U, &test_key_hsiphash) !=
+ test_vectors_hsiphash[8]) {
+ pr_info("hsiphash self-test 2u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (hsiphash_3u32(0x03020100U, 0x07060504U,
+ 0x0b0a0908U, &test_key_hsiphash) !=
+ test_vectors_hsiphash[12]) {
+ pr_info("hsiphash self-test 3u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (hsiphash_4u32(0x03020100U, 0x07060504U,
+ 0x0b0a0908U, 0x0f0e0d0cU, &test_key_hsiphash) !=
+ test_vectors_hsiphash[16]) {
+ pr_info("hsiphash self-test 4u32: FAIL\n");
+ ret = -EINVAL;
+ }
+ if (!ret)
+ pr_info("self-tests: pass\n");
+ return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 79ba3cc..43f9b90 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -32,6 +32,8 @@
#include <linux/cred.h>
#include <linux/uuid.h>
#include <net/addrconf.h>
+#include <linux/siphash.h>
+#include <linux/compiler.h>
#ifdef CONFIG_BLOCK
#include <linux/blkdev.h>
#endif
@@ -1342,6 +1344,59 @@ char *uuid_string(char *buf, char *end, const u8 *addr,
return string(buf, end, uuid, spec);
}
+int kptr_restrict __read_mostly;
+
+static noinline_for_stack
+char *restricted_pointer(char *buf, char *end, const void *ptr,
+ struct printf_spec spec)
+{
+ spec.base = 16;
+ spec.flags |= SMALL;
+ if (spec.field_width == -1) {
+ spec.field_width = 2 * sizeof(ptr);
+ spec.flags |= ZEROPAD;
+ }
+
+ switch (kptr_restrict) {
+ case 0:
+ /* Always print %pK values */
+ break;
+ case 1: {
+ const struct cred *cred;
+
+ /*
+ * kptr_restrict==1 cannot be used in IRQ context
+ * because its test for CAP_SYSLOG would be meaningless.
+ */
+ if (in_irq() || in_serving_softirq() || in_nmi())
+ return string(buf, end, "pK-error", spec);
+
+ /*
+ * Only print the real pointer value if the current
+ * process has CAP_SYSLOG and is running with the
+ * same credentials it started with. This is because
+ * access to files is checked at open() time, but %pK
+ * checks permission at read() time. We don't want to
+ * leak pointer values if a binary opens a file using
+ * %pK and then elevates privileges before reading it.
+ */
+ cred = current_cred();
+ if (!has_capability_noaudit(current, CAP_SYSLOG) ||
+ !uid_eq(cred->euid, cred->uid) ||
+ !gid_eq(cred->egid, cred->gid))
+ ptr = NULL;
+ break;
+ }
+ case 2:
+ default:
+ /* Always print 0's for %pK */
+ ptr = NULL;
+ break;
+ }
+
+ return number(buf, end, (unsigned long)ptr, spec);
+}
+
static noinline_for_stack
char *netdev_bits(char *buf, char *end, const void *addr, const char *fmt)
{
@@ -1467,7 +1522,86 @@ char *flags_string(char *buf, char *end, void *flags_ptr, const char *fmt)
return format_flags(buf, end, flags, names);
}
-int kptr_restrict __read_mostly;
+static noinline_for_stack
+char *pointer_string(char *buf, char *end, const void *ptr,
+ struct printf_spec spec)
+{
+ spec.base = 16;
+ spec.flags |= SMALL;
+ if (spec.field_width == -1) {
+ spec.field_width = 2 * sizeof(ptr);
+ spec.flags |= ZEROPAD;
+ }
+
+ return number(buf, end, (unsigned long int)ptr, spec);
+}
+
+static bool have_filled_random_ptr_key __read_mostly;
+static siphash_key_t ptr_key __read_mostly;
+
+static void fill_random_ptr_key(struct random_ready_callback *unused)
+{
+ get_random_bytes(&ptr_key, sizeof(ptr_key));
+ /*
+ * have_filled_random_ptr_key==true is dependent on get_random_bytes().
+ * ptr_to_id() needs to see have_filled_random_ptr_key==true
+ * after get_random_bytes() returns.
+ */
+ smp_mb();
+ WRITE_ONCE(have_filled_random_ptr_key, true);
+}
+
+static struct random_ready_callback random_ready = {
+ .func = fill_random_ptr_key
+};
+
+static int __init initialize_ptr_random(void)
+{
+ int ret = add_random_ready_callback(&random_ready);
+
+ if (!ret) {
+ return 0;
+ } else if (ret == -EALREADY) {
+ fill_random_ptr_key(&random_ready);
+ return 0;
+ }
+
+ return ret;
+}
+early_initcall(initialize_ptr_random);
+
+/* Maps a pointer to a 32 bit unique identifier. */
+static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec spec)
+{
+ unsigned long hashval;
+ const int default_width = 2 * sizeof(ptr);
+
+ if (unlikely(!have_filled_random_ptr_key)) {
+ spec.field_width = default_width;
+ /* string length must be less than default_width */
+ return string(buf, end, "(ptrval)", spec);
+ }
+
+#ifdef CONFIG_64BIT
+ hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key);
+ /*
+ * Mask off the first 32 bits, this makes explicit that we have
+ * modified the address (and 32 bits is plenty for a unique ID).
+ */
+ hashval = hashval & 0xffffffff;
+#else
+ hashval = (unsigned long)siphash_1u32((u32)ptr, &ptr_key);
+#endif
+
+ spec.flags |= SMALL;
+ if (spec.field_width == -1) {
+ spec.field_width = default_width;
+ spec.flags |= ZEROPAD;
+ }
+ spec.base = 16;
+
+ return number(buf, end, hashval, spec);
+}
/*
* Show a '%p' thing. A kernel extension is that the '%p' is followed
@@ -1561,11 +1695,16 @@ int kptr_restrict __read_mostly;
* g gfp flags (GFP_* and __GFP_*) given as pointer to gfp_t
* v vma flags (VM_*) given as pointer to unsigned long
*
+ * - 'x' For printing the address. Equivalent to "%lx".
+ *
* ** Please update also Documentation/printk-formats.txt when making changes **
*
* Note: The difference between 'S' and 'F' is that on ia64 and ppc64
* function pointers are really function descriptors, which contain a
* pointer to the real address.
+ *
+ * Note: The default behaviour (unadorned %p) is to hash the address,
+ * rendering it useful as a unique identifier.
*/
static noinline_for_stack
char *pointer(const char *fmt, char *buf, char *end, void *ptr,
@@ -1655,47 +1794,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
return buf;
}
case 'K':
- switch (kptr_restrict) {
- case 0:
- /* Always print %pK values */
- break;
- case 1: {
- const struct cred *cred;
-
- /*
- * kptr_restrict==1 cannot be used in IRQ context
- * because its test for CAP_SYSLOG would be meaningless.
- */
- if (in_irq() || in_serving_softirq() || in_nmi()) {
- if (spec.field_width == -1)
- spec.field_width = default_width;
- return string(buf, end, "pK-error", spec);
- }
-
- /*
- * Only print the real pointer value if the current
- * process has CAP_SYSLOG and is running with the
- * same credentials it started with. This is because
- * access to files is checked at open() time, but %pK
- * checks permission at read() time. We don't want to
- * leak pointer values if a binary opens a file using
- * %pK and then elevates privileges before reading it.
- */
- cred = current_cred();
- if (!has_capability_noaudit(current, CAP_SYSLOG) ||
- !uid_eq(cred->euid, cred->uid) ||
- !gid_eq(cred->egid, cred->gid))
- ptr = NULL;
- break;
- }
- case 2:
- default:
- /* Always print 0's for %pK */
- ptr = NULL;
- break;
- }
- break;
-
+ return restricted_pointer(buf, end, ptr, spec);
case 'N':
return netdev_bits(buf, end, ptr, fmt);
case 'a':
@@ -1715,15 +1814,12 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
case 'G':
return flags_string(buf, end, ptr, fmt);
+ case 'x':
+ return pointer_string(buf, end, ptr, spec);
}
- spec.flags |= SMALL;
- if (spec.field_width == -1) {
- spec.field_width = default_width;
- spec.flags |= ZEROPAD;
- }
- spec.base = 16;
- return number(buf, end, (unsigned long) ptr, spec);
+ /* default is to _not_ leak addresses, hash before printing */
+ return ptr_to_id(buf, end, ptr, spec);
}
/*
diff --git a/mm/cma.c b/mm/cma.c
index 1ccfaa1..677ed05 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -383,12 +383,14 @@ int __init cma_declare_contiguous(phys_addr_t base,
ret = cma_init_reserved_mem(base, size, order_per_bit, name, res_cma);
if (ret)
- goto err;
+ goto free_mem;
pr_info("Reserved %ld MiB at %pa\n", (unsigned long)size / SZ_1M,
&base);
return 0;
+free_mem:
+ memblock_free(base, size);
err:
pr_err("Failed to reserve %ld MiB\n", (unsigned long)size / SZ_1M);
return ret;
diff --git a/mm/compaction.c b/mm/compaction.c
index f002a7f..6a4c4eb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -20,6 +20,7 @@
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/page_owner.h>
+#include <linux/psi.h>
#include "internal.h"
#ifdef CONFIG_COMPACTION
@@ -2005,11 +2006,15 @@ static int kcompactd(void *p)
pgdat->kcompactd_classzone_idx = pgdat->nr_zones - 1;
while (!kthread_should_stop()) {
+ unsigned long pflags;
+
trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
wait_event_freezable(pgdat->kcompactd_wait,
kcompactd_work_requested(pgdat));
+ psi_memstall_enter(&pflags);
kcompactd_do_work(pgdat);
+ psi_memstall_leave(&pflags);
}
return 0;
diff --git a/mm/debug.c b/mm/debug.c
index bebe48a..dc7ad1f 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -49,7 +49,7 @@ void __dump_page(struct page *page, const char *reason)
*/
int mapcount = PageSlab(page) ? 0 : page_mapcount(page);
- pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx",
+ pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
page, page_ref_count(page), mapcount,
page->mapping, page_to_pgoff(page));
if (PageCompound(page))
@@ -64,7 +64,7 @@ void __dump_page(struct page *page, const char *reason)
#ifdef CONFIG_MEMCG
if (page->mem_cgroup)
- pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup);
+ pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup);
#endif
}
@@ -79,10 +79,10 @@ EXPORT_SYMBOL(dump_page);
void dump_vma(const struct vm_area_struct *vma)
{
- pr_emerg("vma %p start %p end %p\n"
- "next %p prev %p mm %p\n"
- "prot %lx anon_vma %p vm_ops %p\n"
- "pgoff %lx file %p private_data %p\n"
+ pr_emerg("vma %px start %px end %px\n"
+ "next %px prev %px mm %px\n"
+ "prot %lx anon_vma %px vm_ops %px\n"
+ "pgoff %lx file %px private_data %px\n"
"flags: %#lx(%pGv)\n",
vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_next,
vma->vm_prev, vma->vm_mm,
@@ -95,27 +95,27 @@ EXPORT_SYMBOL(dump_vma);
void dump_mm(const struct mm_struct *mm)
{
- pr_emerg("mm %p mmap %p seqnum %llu task_size %lu\n"
+ pr_emerg("mm %px mmap %px seqnum %llu task_size %lu\n"
#ifdef CONFIG_MMU
- "get_unmapped_area %p\n"
+ "get_unmapped_area %px\n"
#endif
"mmap_base %lu mmap_legacy_base %lu highest_vm_end %lu\n"
- "pgd %p mm_users %d mm_count %d nr_ptes %lu nr_pmds %lu map_count %d\n"
+ "pgd %px mm_users %d mm_count %d nr_ptes %lu nr_pmds %lu map_count %d\n"
"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
"pinned_vm %lx data_vm %lx exec_vm %lx stack_vm %lx\n"
"start_code %lx end_code %lx start_data %lx end_data %lx\n"
"start_brk %lx brk %lx start_stack %lx\n"
"arg_start %lx arg_end %lx env_start %lx env_end %lx\n"
- "binfmt %p flags %lx core_state %p\n"
+ "binfmt %px flags %lx core_state %px\n"
#ifdef CONFIG_AIO
- "ioctx_table %p\n"
+ "ioctx_table %px\n"
#endif
#ifdef CONFIG_MEMCG
- "owner %p "
+ "owner %px "
#endif
- "exe_file %p\n"
+ "exe_file %px\n"
#ifdef CONFIG_MMU_NOTIFIER
- "mmu_notifier_mm %p\n"
+ "mmu_notifier_mm %px\n"
#endif
#ifdef CONFIG_NUMA_BALANCING
"numa_next_scan %lu numa_scan_offset %lu numa_scan_seq %d\n"
diff --git a/mm/filemap.c b/mm/filemap.c
index 4fcd8ee..1a5780b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -35,6 +35,8 @@
#include <linux/memcontrol.h>
#include <linux/cleancache.h>
#include <linux/rmap.h>
+#include <linux/delayacct.h>
+#include <linux/psi.h>
#include "internal.h"
#define CREATE_TRACE_POINTS
@@ -747,12 +749,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
* data from the working set, only to cache data that will
* get overwritten with something else, is a waste of memory.
*/
- if (!(gfp_mask & __GFP_WRITE) &&
- shadow && workingset_refault(shadow)) {
- SetPageActive(page);
- workingset_activation(page);
- } else
- ClearPageActive(page);
+ WARN_ON_ONCE(PageActive(page));
+ if (!(gfp_mask & __GFP_WRITE) && shadow)
+ workingset_refault(page, shadow);
lru_cache_add(page);
}
return ret;
@@ -790,46 +789,177 @@ EXPORT_SYMBOL(__page_cache_alloc);
* at a cost of "thundering herd" phenomena during rare hash
* collisions.
*/
-wait_queue_head_t *page_waitqueue(struct page *page)
+#define PAGE_WAIT_TABLE_BITS 8
+#define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
+static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;
+
+static wait_queue_head_t *page_waitqueue(struct page *page)
{
- return bit_waitqueue(page, 0);
+ return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
}
-EXPORT_SYMBOL(page_waitqueue);
+
+void __init pagecache_init(void)
+{
+ int i;
+
+ for (i = 0; i < PAGE_WAIT_TABLE_SIZE; i++)
+ init_waitqueue_head(&page_wait_table[i]);
+
+ page_writeback_init();
+}
+
+struct wait_page_key {
+ struct page *page;
+ int bit_nr;
+ int page_match;
+};
+
+struct wait_page_queue {
+ struct page *page;
+ int bit_nr;
+ wait_queue_t wait;
+};
+
+static int wake_page_function(wait_queue_t *wait, unsigned mode, int sync, void *arg)
+{
+ struct wait_page_key *key = arg;
+ struct wait_page_queue *wait_page
+ = container_of(wait, struct wait_page_queue, wait);
+
+ if (wait_page->page != key->page)
+ return 0;
+ key->page_match = 1;
+
+ if (wait_page->bit_nr != key->bit_nr)
+ return 0;
+ if (test_bit(key->bit_nr, &key->page->flags))
+ return 0;
+
+ return autoremove_wake_function(wait, mode, sync, key);
+}
+
+void wake_up_page_bit(struct page *page, int bit_nr)
+{
+ wait_queue_head_t *q = page_waitqueue(page);
+ struct wait_page_key key;
+ unsigned long flags;
+
+ key.page = page;
+ key.bit_nr = bit_nr;
+ key.page_match = 0;
+
+ spin_lock_irqsave(&q->lock, flags);
+ __wake_up_locked_key(q, TASK_NORMAL, &key);
+ /*
+ * It is possible for other pages to have collided on the waitqueue
+ * hash, so in that case check for a page match. That prevents a long-
+ * term waiter
+ *
+ * It is still possible to miss a case here, when we woke page waiters
+ * and removed them from the waitqueue, but there are still other
+ * page waiters.
+ */
+ if (!waitqueue_active(q) || !key.page_match) {
+ ClearPageWaiters(page);
+ /*
+ * It's possible to miss clearing Waiters here, when we woke
+ * our page waiters, but the hashed waitqueue has waiters for
+ * other pages on it.
+ *
+ * That's okay, it's a rare case. The next waker will clear it.
+ */
+ }
+ spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(wake_up_page_bit);
+
+static inline int wait_on_page_bit_common(wait_queue_head_t *q,
+ struct page *page, int bit_nr, int state, bool lock)
+{
+ struct wait_page_queue wait_page;
+ wait_queue_t *wait = &wait_page.wait;
+ bool thrashing = false;
+ unsigned long pflags;
+ int ret = 0;
+
+ if (bit_nr == PG_locked &&
+ !PageUptodate(page) && PageWorkingset(page)) {
+ if (!PageSwapBacked(page))
+ delayacct_thrashing_start();
+ psi_memstall_enter(&pflags);
+ thrashing = true;
+ }
+
+ init_wait(wait);
+ wait->func = wake_page_function;
+ wait_page.page = page;
+ wait_page.bit_nr = bit_nr;
+
+ for (;;) {
+ spin_lock_irq(&q->lock);
+
+ if (likely(list_empty(&wait->task_list))) {
+ if (lock)
+ __add_wait_queue_tail_exclusive(q, wait);
+ else
+ __add_wait_queue(q, wait);
+ SetPageWaiters(page);
+ }
+
+ set_current_state(state);
+
+ spin_unlock_irq(&q->lock);
+
+ if (likely(test_bit(bit_nr, &page->flags))) {
+ io_schedule();
+ }
+
+ if (lock) {
+ if (!test_and_set_bit_lock(bit_nr, &page->flags))
+ break;
+ } else {
+ if (!test_bit(bit_nr, &page->flags))
+ break;
+ }
+
+ if (unlikely(signal_pending_state(state, current))) {
+ ret = -EINTR;
+ break;
+ }
+ }
+
+ finish_wait(q, wait);
+
+ if (thrashing) {
+ if (!PageSwapBacked(page))
+ delayacct_thrashing_end();
+ psi_memstall_leave(&pflags);
+ }
+
+ /*
+ * A signal could leave PageWaiters set. Clearing it here if
+ * !waitqueue_active would be possible (by open-coding finish_wait),
+ * but still fail to catch it in the case of wait hash collision. We
+ * already can fail to clear wait hash collision cases, so don't
+ * bother with signals either.
+ */
+
+ return ret;
+}
void wait_on_page_bit(struct page *page, int bit_nr)
{
- DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
-
- if (test_bit(bit_nr, &page->flags))
- __wait_on_bit(page_waitqueue(page), &wait, bit_wait_io,
- TASK_UNINTERRUPTIBLE);
+ wait_queue_head_t *q = page_waitqueue(page);
+ wait_on_page_bit_common(q, page, bit_nr, TASK_UNINTERRUPTIBLE, false);
}
EXPORT_SYMBOL(wait_on_page_bit);
int wait_on_page_bit_killable(struct page *page, int bit_nr)
{
- DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
-
- if (!test_bit(bit_nr, &page->flags))
- return 0;
-
- return __wait_on_bit(page_waitqueue(page), &wait,
- bit_wait_io, TASK_KILLABLE);
+ wait_queue_head_t *q = page_waitqueue(page);
+ return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, false);
}
-int wait_on_page_bit_killable_timeout(struct page *page,
- int bit_nr, unsigned long timeout)
-{
- DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
-
- wait.key.timeout = jiffies + timeout;
- if (!test_bit(bit_nr, &page->flags))
- return 0;
- return __wait_on_bit(page_waitqueue(page), &wait,
- bit_wait_io_timeout, TASK_KILLABLE);
-}
-EXPORT_SYMBOL_GPL(wait_on_page_bit_killable_timeout);
-
/**
* add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
* @page: Page defining the wait queue of interest
@@ -844,6 +974,7 @@ void add_page_wait_queue(struct page *page, wait_queue_t *waiter)
spin_lock_irqsave(&q->lock, flags);
__add_wait_queue(q, waiter);
+ SetPageWaiters(page);
spin_unlock_irqrestore(&q->lock, flags);
}
EXPORT_SYMBOL_GPL(add_page_wait_queue);
@@ -928,23 +1059,19 @@ EXPORT_SYMBOL_GPL(page_endio);
* __lock_page - get a lock on the page, assuming we need to sleep to get it
* @page: the page to lock
*/
-void __lock_page(struct page *page)
+void __lock_page(struct page *__page)
{
- struct page *page_head = compound_head(page);
- DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
-
- __wait_on_bit_lock(page_waitqueue(page_head), &wait, bit_wait_io,
- TASK_UNINTERRUPTIBLE);
+ struct page *page = compound_head(__page);
+ wait_queue_head_t *q = page_waitqueue(page);
+ wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE, true);
}
EXPORT_SYMBOL(__lock_page);
-int __lock_page_killable(struct page *page)
+int __lock_page_killable(struct page *__page)
{
- struct page *page_head = compound_head(page);
- DEFINE_WAIT_BIT(wait, &page_head->flags, PG_locked);
-
- return __wait_on_bit_lock(page_waitqueue(page_head), &wait,
- bit_wait_io, TASK_KILLABLE);
+ struct page *page = compound_head(__page);
+ wait_queue_head_t *q = page_waitqueue(page);
+ return wait_on_page_bit_common(q, page, PG_locked, TASK_KILLABLE, true);
}
EXPORT_SYMBOL_GPL(__lock_page_killable);
@@ -1190,6 +1317,9 @@ EXPORT_SYMBOL(find_lock_entry);
* @gfp_mask and added to the page cache and the VM's LRU
* list. The page is returned locked and with an increased
* refcount. Otherwise, %NULL is returned.
+ * FGP_FOR_MMAP: Similar to FGP_CREAT, only we want to allow the caller to do
+ * its own locking dance if the page is already in cache, or unlock the page
+ * before returning if we had to add the page to pagecache.
*
* If FGP_LOCK or FGP_CREAT are specified then the function may sleep even
* if the GFP flags specified for FGP_CREAT are atomic.
@@ -1242,7 +1372,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
if (!page)
return NULL;
- if (WARN_ON_ONCE(!(fgp_flags & FGP_LOCK)))
+ if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
fgp_flags |= FGP_LOCK;
/* Init accessed so avoid atomic mark_page_accessed later */
@@ -1256,6 +1386,14 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
if (err == -EEXIST)
goto repeat;
}
+
+ /*
+ * add_to_page_cache_lru lock's the page, and for mmap we expect
+ * a unlocked page.
+ */
+ if (page && (fgp_flags & FGP_FOR_MMAP))
+ unlock_page(page);
+
}
return page;
@@ -1992,62 +2130,96 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
EXPORT_SYMBOL(generic_file_read_iter);
#ifdef CONFIG_MMU
-/**
- * page_cache_read - adds requested page to the page cache if not already there
- * @file: file to read
- * @offset: page index
- * @gfp_mask: memory allocation flags
- *
- * This adds the requested page to the page cache if it isn't already there,
- * and schedules an I/O to read in its contents from disk.
- */
-static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask)
-{
- struct address_space *mapping = file->f_mapping;
- struct page *page;
- int ret;
-
- do {
- page = __page_cache_alloc(gfp_mask|__GFP_COLD);
- if (!page)
- return -ENOMEM;
-
- ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
- if (ret == 0)
- ret = mapping->a_ops->readpage(file, page);
- else if (ret == -EEXIST)
- ret = 0; /* losing race to add is OK */
-
- put_page(page);
-
- } while (ret == AOP_TRUNCATED_PAGE);
-
- return ret;
-}
-
#define MMAP_LOTSAMISS (100)
+static struct file *maybe_unlock_mmap_for_io(struct vm_area_struct *vma,
+ unsigned long flags, struct file *fpin)
+{
+ if (fpin)
+ return fpin;
+
+ /*
+ * FAULT_FLAG_RETRY_NOWAIT means we don't want to wait on page locks or
+ * anything, so we only pin the file and drop the mmap_sem if only
+ * FAULT_FLAG_ALLOW_RETRY is set.
+ */
+ if ((flags & (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT)) ==
+ FAULT_FLAG_ALLOW_RETRY) {
+ fpin = get_file(vma->vm_file);
+ up_read(&vma->vm_mm->mmap_sem);
+ }
+ return fpin;
+}
+
/*
- * Synchronous readahead happens when we don't even find
- * a page in the page cache at all.
+ * lock_page_maybe_drop_mmap - lock the page, possibly dropping the mmap_sem
+ * @vmf - the vm_fault for this fault.
+ * @page - the page to lock.
+ * @fpin - the pointer to the file we may pin (or is already pinned).
+ *
+ * This works similar to lock_page_or_retry in that it can drop the mmap_sem.
+ * It differs in that it actually returns the page locked if it returns 1 and 0
+ * if it couldn't lock the page. If we did have to drop the mmap_sem then fpin
+ * will point to the pinned file and needs to be fput()'ed at a later point.
*/
-static void do_sync_mmap_readahead(struct vm_area_struct *vma,
+static int lock_page_maybe_drop_mmap(struct vm_area_struct *vma,
+ unsigned long flags, struct page *page, struct file **fpin)
+{
+ if (trylock_page(page))
+ return 1;
+
+ /*
+ * NOTE! This will make us return with VM_FAULT_RETRY, but with
+ * the mmap_sem still held. That's how FAULT_FLAG_RETRY_NOWAIT
+ * is supposed to work. We have way too many special cases..
+ */
+ if (flags & FAULT_FLAG_RETRY_NOWAIT)
+ return 0;
+ *fpin = maybe_unlock_mmap_for_io(vma, flags, *fpin);
+ if (flags & FAULT_FLAG_KILLABLE) {
+ if (__lock_page_killable(page)) {
+ /*
+ * We didn't have the right flags to drop the mmap_sem,
+ * but all fault_handlers only check for fatal signals
+ * if we return VM_FAULT_RETRY, so we need to drop the
+ * mmap_sem here and return 0 if we don't have a fpin.
+ */
+ if (*fpin == NULL)
+ up_read(&vma->vm_mm->mmap_sem);
+ return 0;
+ }
+ } else
+ __lock_page(page);
+ return 1;
+}
+
+/*
+ * Synchronous readahead happens when we don't even find a page in the page
+ * cache at all. We don't want to perform IO under the mmap sem, so if we have
+ * to drop the mmap sem we return the file that was pinned in order for us to do
+ * that. If we didn't pin a file then we return NULL. The file that is
+ * returned needs to be fput()'ed when we're done with it.
+ */
+static struct file *do_sync_mmap_readahead(struct vm_area_struct *vma,
+ unsigned long flags,
struct file_ra_state *ra,
struct file *file,
pgoff_t offset)
{
+ struct file *fpin = NULL;
struct address_space *mapping = file->f_mapping;
/* If we don't want any read-ahead, don't bother */
if (vma->vm_flags & VM_RAND_READ)
- return;
+ return fpin;
if (!ra->ra_pages)
- return;
+ return fpin;
if (vma->vm_flags & VM_SEQ_READ) {
+ fpin = maybe_unlock_mmap_for_io(vma, flags, fpin);
page_cache_sync_readahead(mapping, ra, file, offset,
ra->ra_pages);
- return;
+ return fpin;
}
/* Avoid banging the cache line if not needed */
@@ -2059,37 +2231,45 @@ static void do_sync_mmap_readahead(struct vm_area_struct *vma,
* stop bothering with read-ahead. It will only hurt.
*/
if (ra->mmap_miss > MMAP_LOTSAMISS)
- return;
+ return fpin;
/*
* mmap read-around
*/
+ fpin = maybe_unlock_mmap_for_io(vma, flags, fpin);
ra->start = max_t(long, 0, offset - ra->ra_pages / 2);
ra->size = ra->ra_pages;
ra->async_size = ra->ra_pages / 4;
ra_submit(ra, mapping, file);
+ return fpin;
}
/*
* Asynchronous readahead happens when we find the page and PG_readahead,
- * so we want to possibly extend the readahead further..
+ * so we want to possibly extend the readahead further. We return the file that
+ * was pinned if we have to drop the mmap_sem in order to do IO.
*/
-static void do_async_mmap_readahead(struct vm_area_struct *vma,
+static struct file *do_async_mmap_readahead(struct vm_area_struct *vma,
+ unsigned long flags,
struct file_ra_state *ra,
struct file *file,
struct page *page,
pgoff_t offset)
{
struct address_space *mapping = file->f_mapping;
+ struct file *fpin = NULL;
/* If we don't want any read-ahead, don't bother */
if (vma->vm_flags & VM_RAND_READ)
- return;
+ return fpin;
if (ra->mmap_miss > 0)
ra->mmap_miss--;
- if (PageReadahead(page))
+ if (PageReadahead(page)) {
+ fpin = maybe_unlock_mmap_for_io(vma, flags, fpin);
page_cache_async_readahead(mapping, ra, file,
page, offset, ra->ra_pages);
+ }
+ return fpin;
}
/**
@@ -2120,6 +2300,7 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
int error;
struct file *file = vma->vm_file;
+ struct file *fpin = NULL;
struct address_space *mapping = file->f_mapping;
struct file_ra_state *ra = &file->f_ra;
struct inode *inode = mapping->host;
@@ -2141,23 +2322,27 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
* We found the page, so try async readahead before
* waiting for the lock.
*/
- do_async_mmap_readahead(vma, ra, file, page, offset);
+ fpin = do_async_mmap_readahead(vma, vmf->flags, ra,
+ file, page, offset);
} else if (!page) {
/* No page in the page cache at all */
- do_sync_mmap_readahead(vma, ra, file, offset);
count_vm_event(PGMAJFAULT);
mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
ret = VM_FAULT_MAJOR;
+ fpin = do_sync_mmap_readahead(vma, vmf->flags, ra,
+ file, offset);
retry_find:
- page = find_get_page(mapping, offset);
- if (!page)
- goto no_cached_page;
+ page = pagecache_get_page(mapping, offset,
+ FGP_CREAT|FGP_FOR_MMAP,
+ vmf->gfp_mask);
+ if (!page) {
+ if (fpin)
+ goto out_retry;
+ return VM_FAULT_OOM;
+ }
}
-
- if (!lock_page_or_retry(page, vma->vm_mm, vmf->flags)) {
- put_page(page);
- return ret | VM_FAULT_RETRY;
- }
+ if (!lock_page_maybe_drop_mmap(vma, vmf->flags, page, &fpin))
+ goto out_retry;
/* Did it get truncated? */
if (unlikely(page->mapping != mapping)) {
@@ -2175,6 +2360,16 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
goto page_not_uptodate;
/*
+ * We've made it this far and we had to drop our mmap_sem, now is the
+ * time to return to the upper layer and have it re-find the vma and
+ * redo the fault.
+ */
+ if (fpin) {
+ unlock_page(page);
+ goto out_retry;
+ }
+
+ /*
* Found the page and have a reference on it.
* We must recheck i_size under page lock.
*/
@@ -2188,30 +2383,6 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
vmf->page = page;
return ret | VM_FAULT_LOCKED;
-no_cached_page:
- /*
- * We're only likely to ever get here if MADV_RANDOM is in
- * effect.
- */
- error = page_cache_read(file, offset, vmf->gfp_mask);
-
- /*
- * The page we want has now been added to the page cache.
- * In the unlikely event that someone removed it in the
- * meantime, we'll just come back here and read it again.
- */
- if (error >= 0)
- goto retry_find;
-
- /*
- * An error return from page_cache_read can result if the
- * system is low on memory, or a problem occurs while trying
- * to schedule I/O.
- */
- if (error == -ENOMEM)
- return VM_FAULT_OOM;
- return VM_FAULT_SIGBUS;
-
page_not_uptodate:
/*
* Umm, take care of errors if the page isn't up-to-date.
@@ -2220,12 +2391,15 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
* and we need to check for errors.
*/
ClearPageError(page);
+ fpin = maybe_unlock_mmap_for_io(vma, vmf->flags, fpin);
error = mapping->a_ops->readpage(file, page);
if (!error) {
wait_on_page_locked(page);
if (!PageUptodate(page))
error = -EIO;
}
+ if (fpin)
+ goto out_retry;
put_page(page);
if (!error || error == AOP_TRUNCATED_PAGE)
@@ -2234,6 +2408,18 @@ int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
/* Things didn't work out. Return zero to tell the mm layer so. */
shrink_readahead_size_eio(file, ra);
return VM_FAULT_SIGBUS;
+
+out_retry:
+ /*
+ * We dropped the mmap_sem, we need to return to the fault handler to
+ * re-find the vma and come back and find our hopefully still populated
+ * page.
+ */
+ if (page)
+ put_page(page);
+ if (fpin)
+ fput(fpin);
+ return ret | VM_FAULT_RETRY;
}
EXPORT_SYMBOL(filemap_fault);
diff --git a/mm/gup.c b/mm/gup.c
index d71da72..99c2f10 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1423,7 +1423,8 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
if (pmd_none(pmd))
return 0;
- if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd))) {
+ if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
+ pmd_devmap(pmd))) {
/*
* NUMA hinting faults need to be handled in the GUP
* slowpath for accounting purposes and so that they
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2541f3a..c24ff26 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1890,6 +1890,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
(1L << PG_mlocked) |
(1L << PG_uptodate) |
(1L << PG_active) |
+ (1L << PG_workingset) |
(1L << PG_locked) |
(1L << PG_unevictable) |
(1L << PG_dirty)));
diff --git a/mm/internal.h b/mm/internal.h
index d9ac6de..1e93b2a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -36,6 +36,8 @@
/* Do not use these with a slab allocator */
#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
+void page_writeback_init(void);
+
int do_swap_page(struct fault_env *fe, pte_t orig_pte);
#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 5f2b9eb..3867cac 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -138,7 +138,7 @@ static void print_error_description(struct kasan_access_info *info)
pr_err("BUG: KASAN: %s in %pS\n",
bug_type, (void *)info->ip);
- pr_err("%s of size %zu at addr %p by task %s/%d\n",
+ pr_err("%s of size %zu at addr %px by task %s/%d\n",
info->is_write ? "Write" : "Read", info->access_size,
info->access_addr, current->comm, task_pid_nr(current));
}
@@ -210,7 +210,7 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
const char *rel_type;
int rel_bytes;
- pr_err("The buggy address belongs to the object at %p\n"
+ pr_err("The buggy address belongs to the object at %px\n"
" which belongs to the cache %s of size %d\n",
object, cache->name, cache->object_size);
@@ -229,7 +229,7 @@ static void describe_object_addr(struct kmem_cache *cache, void *object,
}
pr_err("The buggy address is located %d bytes %s of\n"
- " %d-byte region [%p, %p)\n",
+ " %d-byte region [%px, %px)\n",
rel_bytes, rel_type, cache->object_size, (void *)object_addr,
(void *)(object_addr + cache->object_size));
}
@@ -306,7 +306,7 @@ static void print_shadow_for_address(const void *addr)
char shadow_buf[SHADOW_BYTES_PER_ROW];
snprintf(buffer, sizeof(buffer),
- (i == 0) ? ">%p: " : " %p: ", kaddr);
+ (i == 0) ? ">%px: " : " %px: ", kaddr);
/*
* We should not pass a shadow pointer to generic
* function, because generic functions may try to
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 27e6d17..a17d9c0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1705,19 +1705,17 @@ static int soft_offline_in_use_page(struct page *page, int flags)
struct page *hpage = compound_head(page);
if (!PageHuge(page) && PageTransHuge(hpage)) {
- lock_page(hpage);
- if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) {
- unlock_page(hpage);
- if (!PageAnon(hpage))
+ lock_page(page);
+ if (!PageAnon(page) || unlikely(split_huge_page(page))) {
+ unlock_page(page);
+ if (!PageAnon(page))
pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
else
pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
- put_hwpoison_page(hpage);
+ put_hwpoison_page(page);
return -EBUSY;
}
- unlock_page(hpage);
- get_hwpoison_page(page);
- put_hwpoison_page(hpage);
+ unlock_page(page);
}
if (PageHuge(page))
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ac1dae6..beb7c33 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -550,11 +550,16 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
goto retry;
}
- migrate_page_add(page, qp->pagelist, flags);
+ if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
+ if (!vma_migratable(vma))
+ break;
+ migrate_page_add(page, qp->pagelist, flags);
+ } else
+ break;
}
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
- return 0;
+ return addr != end ? -EIO : 0;
}
static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
@@ -628,7 +633,12 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
unsigned long endvma = vma->vm_end;
unsigned long flags = qp->flags;
- if (!vma_migratable(vma))
+ /*
+ * Need check MPOL_MF_STRICT to return -EIO if possible
+ * regardless of vma_migratable
+ */
+ if (!vma_migratable(vma) &&
+ !(flags & MPOL_MF_STRICT))
return 1;
if (endvma > end)
@@ -655,7 +665,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
}
/* queue pages from current vma */
- if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
+ if (flags & MPOL_MF_VALID)
return 0;
return 1;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index d793289..2317b42 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -637,6 +637,8 @@ void migrate_page_copy(struct page *newpage, struct page *page)
SetPageActive(newpage);
} else if (TestClearPageUnevictable(page))
SetPageUnevictable(newpage);
+ if (PageWorkingset(page))
+ SetPageWorkingset(newpage);
if (PageChecked(page))
SetPageChecked(newpage);
if (PageMappedToDisk(page))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
old mode 100644
new mode 100755
index c6ab8e8..f77c4ce
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -65,6 +65,7 @@
#include <linux/kthread.h>
#include <linux/memcontrol.h>
#include <linux/show_mem_notifier.h>
+#include <linux/psi.h>
#include <asm/sections.h>
#include <asm/tlbflush.h>
@@ -3247,15 +3248,20 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
enum compact_priority prio, enum compact_result *compact_result)
{
struct page *page;
+ unsigned long pflags;
unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
if (!order)
return NULL;
+ psi_memstall_enter(&pflags);
current->flags |= PF_MEMALLOC;
+
*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
prio);
+
current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
+ psi_memstall_leave(&pflags);
if (*compact_result <= COMPACT_INACTIVE)
return NULL;
@@ -3435,11 +3441,13 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
{
struct reclaim_state reclaim_state;
int progress;
+ unsigned long pflags;
cond_resched();
/* We now go into synchronous reclaim */
cpuset_memory_pressure_bump();
+ psi_memstall_enter(&pflags);
current->flags |= PF_MEMALLOC;
lockdep_set_current_reclaim_state(gfp_mask);
reclaim_state.reclaimed_slab = 0;
@@ -3451,6 +3459,7 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
current->reclaim_state = NULL;
lockdep_clear_current_reclaim_state();
current->flags &= ~PF_MEMALLOC;
+ psi_memstall_leave(&pflags);
cond_resched();
@@ -4092,11 +4101,11 @@ void *__alloc_page_frag(struct page_frag_cache *nc,
/* Even if we own the page, we do not use atomic_set().
* This would break get_page_unless_zero() users.
*/
- page_ref_add(page, size - 1);
+ page_ref_add(page, size);
/* reset page count bias and offset to start of new frag */
nc->pfmemalloc = page_is_pfmemalloc(page);
- nc->pagecnt_bias = size;
+ nc->pagecnt_bias = size + 1;
nc->offset = size;
}
@@ -4112,10 +4121,10 @@ void *__alloc_page_frag(struct page_frag_cache *nc,
size = nc->size;
#endif
/* OK, page count is 0, we can safely set it */
- set_page_count(page, size);
+ set_page_count(page, size + 1);
/* reset page count bias and offset to start of new frag */
- nc->pagecnt_bias = size;
+ nc->pagecnt_bias = size + 1;
offset = size - fragsz;
}
diff --git a/mm/page_ext.c b/mm/page_ext.c
index fc3e7ff..561ed6e 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -283,6 +283,7 @@ static void free_page_ext(void *addr)
table_size = get_entry_size() * PAGES_PER_SECTION;
BUG_ON(PageReserved(page));
+ kmemleak_free(addr);
free_pages_exact(addr, table_size);
}
}
diff --git a/mm/percpu.c b/mm/percpu.c
index 3794cfc..0462a2a 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2048,8 +2048,8 @@ int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
ai->groups[group].base_offset = areas[group] - base;
}
- pr_info("Embedded %zu pages/cpu @%p s%zu r%zu d%zu u%zu\n",
- PFN_DOWN(size_sum), base, ai->static_size, ai->reserved_size,
+ pr_info("Embedded %zu pages/cpu s%zu r%zu d%zu u%zu\n",
+ PFN_DOWN(size_sum), ai->static_size, ai->reserved_size,
ai->dyn_size, ai->unit_size);
rc = pcpu_setup_first_chunk(ai, base);
@@ -2162,8 +2162,8 @@ int __init pcpu_page_first_chunk(size_t reserved_size,
}
/* we're ready, commit */
- pr_info("%d %s pages/cpu @%p s%zu r%zu d%zu\n",
- unit_pages, psize_str, vm.addr, ai->static_size,
+ pr_info("%d %s pages/cpu s%zu r%zu d%zu\n",
+ unit_pages, psize_str, ai->static_size,
ai->reserved_size, ai->dyn_size);
rc = pcpu_setup_first_chunk(ai, vm.addr);
diff --git a/mm/shmem.c b/mm/shmem.c
index d15851d..793a8cf 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2896,16 +2896,20 @@ static int shmem_create(struct inode *dir, struct dentry *dentry, umode_t mode,
static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
{
struct inode *inode = d_inode(old_dentry);
- int ret;
+ int ret = 0;
/*
* No ordinary (disk based) filesystem counts links as inodes;
* but each new link needs a new dentry, pinning lowmem, and
* tmpfs dentries cannot be pruned until they are unlinked.
+ * But if an O_TMPFILE file is linked into the tmpfs, the
+ * first link must skip that, to get the accounting right.
*/
- ret = shmem_reserve_inode(inode->i_sb);
- if (ret)
- goto out;
+ if (inode->i_nlink) {
+ ret = shmem_reserve_inode(inode->i_sb);
+ if (ret)
+ goto out;
+ }
dir->i_size += BOGO_DIRENT_SIZE;
inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
diff --git a/mm/slab.c b/mm/slab.c
index 354a09d..bd498ef 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -566,14 +566,6 @@ static void start_cpu_timer(int cpu)
static void init_arraycache(struct array_cache *ac, int limit, int batch)
{
- /*
- * The array_cache structures contain pointers to free object.
- * However, when such objects are allocated or transferred to another
- * cache the pointers are not cleared and they could be counted as
- * valid references during a kmemleak scan. Therefore, kmemleak must
- * not scan such objects.
- */
- kmemleak_no_scan(ac);
if (ac) {
ac->avail = 0;
ac->limit = limit;
@@ -589,6 +581,14 @@ static struct array_cache *alloc_arraycache(int node, int entries,
struct array_cache *ac = NULL;
ac = kmalloc_node(memsize, gfp, node);
+ /*
+ * The array_cache structures contain pointers to free object.
+ * However, when such objects are allocated or transferred to another
+ * cache the pointers are not cleared and they could be counted as
+ * valid references during a kmemleak scan. Therefore, kmemleak must
+ * not scan such objects.
+ */
+ kmemleak_no_scan(ac);
init_arraycache(ac, entries, batchcount);
return ac;
}
@@ -683,6 +683,7 @@ static struct alien_cache *__alloc_alien_cache(int node, int entries,
alc = kmalloc_node(memsize, gfp, node);
if (alc) {
+ kmemleak_no_scan(alc);
init_arraycache(&alc->ac, entries, batch);
spin_lock_init(&alc->lock);
}
@@ -1620,11 +1621,8 @@ static void print_objinfo(struct kmem_cache *cachep, void *objp, int lines)
*dbg_redzone2(cachep, objp));
}
- if (cachep->flags & SLAB_STORE_USER) {
- pr_err("Last user: [<%p>](%pSR)\n",
- *dbg_userword(cachep, objp),
- *dbg_userword(cachep, objp));
- }
+ if (cachep->flags & SLAB_STORE_USER)
+ pr_err("Last user: (%pSR)\n", *dbg_userword(cachep, objp));
realobj = (char *)objp + obj_offset(cachep);
size = cachep->object_size;
for (i = 0; i < size && lines; i += 16, lines--) {
@@ -1657,7 +1655,7 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
/* Mismatch ! */
/* Print header */
if (lines == 0) {
- pr_err("Slab corruption (%s): %s start=%p, len=%d\n",
+ pr_err("Slab corruption (%s): %s start=%px, len=%d\n",
print_tainted(), cachep->name,
realobj, size);
print_objinfo(cachep, objp, 0);
@@ -1686,13 +1684,13 @@ static void check_poison_obj(struct kmem_cache *cachep, void *objp)
if (objnr) {
objp = index_to_obj(cachep, page, objnr - 1);
realobj = (char *)objp + obj_offset(cachep);
- pr_err("Prev obj: start=%p, len=%d\n", realobj, size);
+ pr_err("Prev obj: start=%px, len=%d\n", realobj, size);
print_objinfo(cachep, objp, 2);
}
if (objnr + 1 < cachep->num) {
objp = index_to_obj(cachep, page, objnr + 1);
realobj = (char *)objp + obj_offset(cachep);
- pr_err("Next obj: start=%p, len=%d\n", realobj, size);
+ pr_err("Next obj: start=%px, len=%d\n", realobj, size);
print_objinfo(cachep, objp, 2);
}
}
@@ -2639,7 +2637,7 @@ static void slab_put_obj(struct kmem_cache *cachep,
/* Verify double free bug */
for (i = page->active; i < cachep->num; i++) {
if (get_free_obj(page, i) == objnr) {
- pr_err("slab: double free detected in cache '%s', objp %p\n",
+ pr_err("slab: double free detected in cache '%s', objp %px\n",
cachep->name, objp);
BUG();
}
@@ -2802,7 +2800,7 @@ static inline void verify_redzone_free(struct kmem_cache *cache, void *obj)
else
slab_error(cache, "memory outside object was overwritten");
- pr_err("%p: redzone 1:0x%llx, redzone 2:0x%llx\n",
+ pr_err("%px: redzone 1:0x%llx, redzone 2:0x%llx\n",
obj, redzone1, redzone2);
}
@@ -3102,7 +3100,7 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,
if (*dbg_redzone1(cachep, objp) != RED_INACTIVE ||
*dbg_redzone2(cachep, objp) != RED_INACTIVE) {
slab_error(cachep, "double free, or memory outside object was overwritten");
- pr_err("%p: redzone 1:0x%llx, redzone 2:0x%llx\n",
+ pr_err("%px: redzone 1:0x%llx, redzone 2:0x%llx\n",
objp, *dbg_redzone1(cachep, objp),
*dbg_redzone2(cachep, objp));
}
@@ -3115,7 +3113,7 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,
cachep->ctor(objp);
if (ARCH_SLAB_MINALIGN &&
((unsigned long)objp & (ARCH_SLAB_MINALIGN-1))) {
- pr_err("0x%p: not aligned to ARCH_SLAB_MINALIGN=%d\n",
+ pr_err("0x%px: not aligned to ARCH_SLAB_MINALIGN=%d\n",
objp, (int)ARCH_SLAB_MINALIGN);
}
return objp;
@@ -4339,7 +4337,7 @@ static void show_symbol(struct seq_file *m, unsigned long address)
return;
}
#endif
- seq_printf(m, "%p", (void *)address);
+ seq_printf(m, "%px", (void *)address);
}
static int leaks_show(struct seq_file *m, void *p)
diff --git a/mm/swap.c b/mm/swap.c
index 086d7fa..4cffcb4 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -69,6 +69,7 @@ static void __page_cache_release(struct page *page)
del_page_from_lru_list(page, lruvec, page_off_lru(page));
spin_unlock_irqrestore(zone_lru_lock(zone), flags);
}
+ __ClearPageWaiters(page);
mem_cgroup_uncharge(page);
}
@@ -785,6 +786,7 @@ void release_pages(struct page **pages, int nr, bool cold)
/* Clear Active bit in case of parallel mark_page_accessed */
__ClearPageActive(page);
+ __ClearPageWaiters(page);
list_add(&page->lru, &pages_to_free);
}
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 2fcc719..d4f3b2f 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -372,6 +372,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
/*
* Initiate read into locked page and return.
*/
+ SetPageWorkingset(new_page);
lru_cache_add_anon(new_page);
*new_page_allocated = true;
return new_page;
diff --git a/mm/usercopy.c b/mm/usercopy.c
index 7683c22..27ae932 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -61,12 +61,11 @@ static noinline int check_stack_object(const void *obj, unsigned long len)
return GOOD_STACK;
}
-static void report_usercopy(const void *ptr, unsigned long len,
- bool to_user, const char *type)
+static void report_usercopy(unsigned long len, bool to_user, const char *type)
{
- pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
+ pr_emerg("kernel memory %s attempt detected %s '%s' (%lu bytes)\n",
to_user ? "exposure" : "overwrite",
- to_user ? "from" : "to", ptr, type ? : "unknown", len);
+ to_user ? "from" : "to", type ? : "unknown", len);
/*
* For greater effect, it would be nice to do do_group_exit(),
* but BUG() actually hooks all the lock-breaking and per-arch
@@ -275,6 +274,6 @@ void __check_object_size(const void *ptr, unsigned long n, bool to_user)
return;
report:
- report_usercopy(ptr, n, to_user, err);
+ report_usercopy(n, to_user, err);
}
EXPORT_SYMBOL(__check_object_size);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0ffab05..add160c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -510,7 +510,11 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
}
found:
- if (addr + size > vend)
+ /*
+ * Check also calculated address against the vstart,
+ * because it can be 0 because of big align request.
+ */
+ if (addr + size > vend || addr < vstart)
goto overflow;
va->va_start = addr;
@@ -2296,7 +2300,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct *vma, unsigned long uaddr,
if (!(area->flags & VM_USERMAP))
return -EINVAL;
- if (kaddr + size > area->addr + area->size)
+ if (kaddr + size > area->addr + get_vm_area_size(area))
return -EINVAL;
do {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index abcc8be..b3c44fc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -47,6 +47,7 @@
#include <linux/prefetch.h>
#include <linux/printk.h>
#include <linux/dax.h>
+#include <linux/psi.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
@@ -2094,6 +2095,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
}
ClearPageActive(page); /* we are de-activating */
+ SetPageWorkingset(page);
list_add(&page->lru, &l_inactive);
}
@@ -3105,6 +3107,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
{
struct zonelist *zonelist;
unsigned long nr_reclaimed;
+ unsigned long pflags;
int nid;
struct scan_control sc = {
.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),
@@ -3132,9 +3135,13 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
sc.gfp_mask,
sc.reclaim_idx);
+ psi_memstall_enter(&pflags);
current->flags |= PF_MEMALLOC;
+
nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
+
current->flags &= ~PF_MEMALLOC;
+ psi_memstall_leave(&pflags);
trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed);
@@ -3299,6 +3306,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
int i;
unsigned long nr_soft_reclaimed;
unsigned long nr_soft_scanned;
+ unsigned long pflags;
struct zone *zone;
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
@@ -3308,6 +3316,8 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
.may_unmap = 1,
.may_swap = 1,
};
+
+ psi_memstall_enter(&pflags);
count_vm_event(PAGEOUTRUN);
do {
@@ -3401,6 +3411,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
pgdat->kswapd_failures++;
out:
+ psi_memstall_leave(&pflags);
/*
* Return the order kswapd stopped reclaiming at as
* prepare_kswapd_sleep() takes it into account. If another caller
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 727bc07..98811ce 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -957,6 +957,7 @@ const char * const vmstat_text[] = {
"nr_isolated_file",
"workingset_refault",
"workingset_activate",
+ "workingset_restore",
"workingset_nodereclaim",
"nr_anon_pages",
"nr_mapped",
@@ -1076,13 +1077,8 @@ const char * const vmstat_text[] = {
#endif
#endif /* CONFIG_MEMORY_BALLOON */
#ifdef CONFIG_DEBUG_TLBFLUSH
-#ifdef CONFIG_SMP
"nr_tlb_remote_flush",
"nr_tlb_remote_flush_received",
-#else
- "", /* nr_tlb_remote_flush */
- "", /* nr_tlb_remote_flush_received */
-#endif /* CONFIG_SMP */
"nr_tlb_local_flush_all",
"nr_tlb_local_flush_one",
#endif /* CONFIG_DEBUG_TLBFLUSH */
diff --git a/mm/workingset.c b/mm/workingset.c
index 4c4f056..a697611 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -118,7 +118,7 @@
* the only thing eating into inactive list space is active pages.
*
*
- * Activating refaulting pages
+ * Refaulting inactive pages
*
* All that is known about the active list is that the pages have been
* accessed more than once in the past. This means that at any given
@@ -131,6 +131,10 @@
* used less frequently than the refaulting page - or even not used at
* all anymore.
*
+ * That means if inactive cache is refaulting with a suitable refault
+ * distance, we assume the cache workingset is transitioning and put
+ * pressure on the current active list.
+ *
* If this is wrong and demotion kicks in, the pages which are truly
* used more frequently will be reactivated while the less frequently
* used once will be evicted from memory.
@@ -138,6 +142,14 @@
* But if this is right, the stale pages will be pushed out of memory
* and the used pages get to stay in cache.
*
+ * Refaulting active pages
+ *
+ * If on the other hand the refaulting pages have recently been
+ * deactivated, it means that the active list is no longer protecting
+ * actively used cache from reclaim. The cache is NOT transitioning to
+ * a different workingset; the existing workingset is thrashing in the
+ * space allocated to the page cache.
+ *
*
* Implementation
*
@@ -153,8 +165,7 @@
*/
#define EVICTION_SHIFT (RADIX_TREE_EXCEPTIONAL_ENTRY + \
- NODES_SHIFT + \
- MEM_CGROUP_ID_SHIFT)
+ 1 + NODES_SHIFT + MEM_CGROUP_ID_SHIFT)
#define EVICTION_MASK (~0UL >> EVICTION_SHIFT)
/*
@@ -167,23 +178,28 @@
*/
static unsigned int bucket_order __read_mostly;
-static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction)
+static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
+ bool workingset)
{
eviction >>= bucket_order;
eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
+ eviction = (eviction << 1) | workingset;
eviction = (eviction << RADIX_TREE_EXCEPTIONAL_SHIFT);
return (void *)(eviction | RADIX_TREE_EXCEPTIONAL_ENTRY);
}
static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
- unsigned long *evictionp)
+ unsigned long *evictionp, bool *workingsetp)
{
unsigned long entry = (unsigned long)shadow;
int memcgid, nid;
+ bool workingset;
entry >>= RADIX_TREE_EXCEPTIONAL_SHIFT;
+ workingset = entry & 1;
+ entry >>= 1;
nid = entry & ((1UL << NODES_SHIFT) - 1);
entry >>= NODES_SHIFT;
memcgid = entry & ((1UL << MEM_CGROUP_ID_SHIFT) - 1);
@@ -192,6 +208,7 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
*memcgidp = memcgid;
*pgdat = NODE_DATA(nid);
*evictionp = entry << bucket_order;
+ *workingsetp = workingset;
}
/**
@@ -204,8 +221,8 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
*/
void *workingset_eviction(struct address_space *mapping, struct page *page)
{
- struct mem_cgroup *memcg = page_memcg(page);
struct pglist_data *pgdat = page_pgdat(page);
+ struct mem_cgroup *memcg = page_memcg(page);
int memcgid = mem_cgroup_id(memcg);
unsigned long eviction;
struct lruvec *lruvec;
@@ -217,30 +234,30 @@ void *workingset_eviction(struct address_space *mapping, struct page *page)
lruvec = mem_cgroup_lruvec(pgdat, memcg);
eviction = atomic_long_inc_return(&lruvec->inactive_age);
- return pack_shadow(memcgid, pgdat, eviction);
+ return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
}
/**
* workingset_refault - evaluate the refault of a previously evicted page
+ * @page: the freshly allocated replacement page
* @shadow: shadow entry of the evicted page
*
* Calculates and evaluates the refault distance of the previously
* evicted page in the context of the node it was allocated in.
- *
- * Returns %true if the page should be activated, %false otherwise.
*/
-bool workingset_refault(void *shadow)
+void workingset_refault(struct page *page, void *shadow)
{
unsigned long refault_distance;
+ struct pglist_data *pgdat;
unsigned long active_file;
struct mem_cgroup *memcg;
unsigned long eviction;
struct lruvec *lruvec;
unsigned long refault;
- struct pglist_data *pgdat;
+ bool workingset;
int memcgid;
- unpack_shadow(shadow, &memcgid, &pgdat, &eviction);
+ unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
rcu_read_lock();
/*
@@ -260,40 +277,51 @@ bool workingset_refault(void *shadow)
* configurations instead.
*/
memcg = mem_cgroup_from_id(memcgid);
- if (!mem_cgroup_disabled() && !memcg) {
- rcu_read_unlock();
- return false;
- }
+ if (!mem_cgroup_disabled() && !memcg)
+ goto out;
lruvec = mem_cgroup_lruvec(pgdat, memcg);
refault = atomic_long_read(&lruvec->inactive_age);
active_file = lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_ZONES);
- rcu_read_unlock();
/*
- * The unsigned subtraction here gives an accurate distance
- * across inactive_age overflows in most cases.
+ * Calculate the refault distance
*
- * There is a special case: usually, shadow entries have a
- * short lifetime and are either refaulted or reclaimed along
- * with the inode before they get too old. But it is not
- * impossible for the inactive_age to lap a shadow entry in
- * the field, which can then can result in a false small
- * refault distance, leading to a false activation should this
- * old entry actually refault again. However, earlier kernels
- * used to deactivate unconditionally with *every* reclaim
- * invocation for the longest time, so the occasional
- * inappropriate activation leading to pressure on the active
- * list is not a problem.
+ * The unsigned subtraction here gives an accurate distance
+ * across inactive_age overflows in most cases. There is a
+ * special case: usually, shadow entries have a short lifetime
+ * and are either refaulted or reclaimed along with the inode
+ * before they get too old. But it is not impossible for the
+ * inactive_age to lap a shadow entry in the field, which can
+ * then result in a false small refault distance, leading to a
+ * false activation should this old entry actually refault
+ * again. However, earlier kernels used to deactivate
+ * unconditionally with *every* reclaim invocation for the
+ * longest time, so the occasional inappropriate activation
+ * leading to pressure on the active list is not a problem.
*/
refault_distance = (refault - eviction) & EVICTION_MASK;
inc_node_state(pgdat, WORKINGSET_REFAULT);
- if (refault_distance <= active_file) {
- inc_node_state(pgdat, WORKINGSET_ACTIVATE);
- return true;
+ /*
+ * Compare the distance to the existing workingset size. We
+ * don't act on pages that couldn't stay resident even if all
+ * the memory was available to the page cache.
+ */
+ if (refault_distance > active_file)
+ goto out;
+
+ SetPageActive(page);
+ atomic_long_inc(&lruvec->inactive_age);
+ inc_node_state(pgdat, WORKINGSET_ACTIVATE);
+
+ /* Page was active prior to eviction */
+ if (workingset) {
+ SetPageWorkingset(page);
+ inc_node_state(pgdat, WORKINGSET_RESTORE);
}
- return false;
+out:
+ rcu_read_unlock();
}
/**
diff --git a/net/9p/client.c b/net/9p/client.c
index 142afe7..f1517ca 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1058,7 +1058,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
p9_debug(P9_DEBUG_ERROR,
"Please specify a msize of at least 4k\n");
err = -EINVAL;
- goto free_client;
+ goto close_trans;
}
err = p9_client_version(clnt);
diff --git a/net/9p/protocol.c b/net/9p/protocol.c
index 145f805..7f1b45c 100644
--- a/net/9p/protocol.c
+++ b/net/9p/protocol.c
@@ -570,9 +570,10 @@ int p9stat_read(struct p9_client *clnt, char *buf, int len, struct p9_wstat *st)
if (ret) {
p9_debug(P9_DEBUG_9P, "<<< p9stat_read failed: %d\n", ret);
trace_9p_protocol_dump(clnt, &fake_pdu);
+ return ret;
}
- return ret;
+ return fake_pdu.offset;
}
EXPORT_SYMBOL(p9stat_read);
diff --git a/net/appletalk/atalk_proc.c b/net/appletalk/atalk_proc.c
index af46bc4..b5f84f4 100644
--- a/net/appletalk/atalk_proc.c
+++ b/net/appletalk/atalk_proc.c
@@ -293,7 +293,7 @@ int __init atalk_proc_init(void)
goto out;
}
-void __exit atalk_proc_exit(void)
+void atalk_proc_exit(void)
{
remove_proc_entry("interface", atalk_proc_dir);
remove_proc_entry("route", atalk_proc_dir);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 10d2bdc..e206d98 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1912,12 +1912,16 @@ static const char atalk_err_snap[] __initconst =
/* Called by proto.c on kernel start up */
static int __init atalk_init(void)
{
- int rc = proto_register(&ddp_proto, 0);
+ int rc;
- if (rc != 0)
+ rc = proto_register(&ddp_proto, 0);
+ if (rc)
goto out;
- (void)sock_register(&atalk_family_ops);
+ rc = sock_register(&atalk_family_ops);
+ if (rc)
+ goto out_proto;
+
ddp_dl = register_snap_client(ddp_snap_id, atalk_rcv);
if (!ddp_dl)
printk(atalk_err_snap);
@@ -1925,12 +1929,33 @@ static int __init atalk_init(void)
dev_add_pack(<alk_packet_type);
dev_add_pack(&ppptalk_packet_type);
- register_netdevice_notifier(&ddp_notifier);
+ rc = register_netdevice_notifier(&ddp_notifier);
+ if (rc)
+ goto out_sock;
+
aarp_proto_init();
- atalk_proc_init();
- atalk_register_sysctl();
+ rc = atalk_proc_init();
+ if (rc)
+ goto out_aarp;
+
+ rc = atalk_register_sysctl();
+ if (rc)
+ goto out_proc;
out:
return rc;
+out_proc:
+ atalk_proc_exit();
+out_aarp:
+ aarp_cleanup_module();
+ unregister_netdevice_notifier(&ddp_notifier);
+out_sock:
+ dev_remove_pack(&ppptalk_packet_type);
+ dev_remove_pack(<alk_packet_type);
+ unregister_snap_client(ddp_dl);
+ sock_unregister(PF_APPLETALK);
+out_proto:
+ proto_unregister(&ddp_proto);
+ goto out;
}
module_init(atalk_init);
diff --git a/net/appletalk/sysctl_net_atalk.c b/net/appletalk/sysctl_net_atalk.c
index ebb8643..4e6042e 100644
--- a/net/appletalk/sysctl_net_atalk.c
+++ b/net/appletalk/sysctl_net_atalk.c
@@ -44,9 +44,12 @@ static struct ctl_table atalk_table[] = {
static struct ctl_table_header *atalk_table_header;
-void atalk_register_sysctl(void)
+int __init atalk_register_sysctl(void)
{
atalk_table_header = register_net_sysctl(&init_net, "net/appletalk", atalk_table);
+ if (!atalk_table_header)
+ return -ENOMEM;
+ return 0;
}
void atalk_unregister_sysctl(void)
diff --git a/net/atm/lec.c b/net/atm/lec.c
index 1e84c52..704892d 100644
--- a/net/atm/lec.c
+++ b/net/atm/lec.c
@@ -721,7 +721,10 @@ static int lec_vcc_attach(struct atm_vcc *vcc, void __user *arg)
static int lec_mcast_attach(struct atm_vcc *vcc, int arg)
{
- if (arg < 0 || arg >= MAX_LEC_ITF || !dev_lec[arg])
+ if (arg < 0 || arg >= MAX_LEC_ITF)
+ return -EINVAL;
+ arg = array_index_nospec(arg, MAX_LEC_ITF);
+ if (!dev_lec[arg])
return -EINVAL;
vcc->proto_data = dev_lec[arg];
return lec_mcast_make(netdev_priv(dev_lec[arg]), vcc);
@@ -739,6 +742,7 @@ static int lecd_attach(struct atm_vcc *vcc, int arg)
i = arg;
if (arg >= MAX_LEC_ITF)
return -EINVAL;
+ i = array_index_nospec(arg, MAX_LEC_ITF);
if (!dev_lec[i]) {
int size;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index c88a600..ca18369 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -826,8 +826,6 @@ static int hci_sock_release(struct socket *sock)
if (!sk)
return 0;
- hdev = hci_pi(sk)->hdev;
-
switch (hci_pi(sk)->channel) {
case HCI_CHANNEL_MONITOR:
atomic_dec(&monitor_promisc);
@@ -849,6 +847,7 @@ static int hci_sock_release(struct socket *sock)
bt_sock_unlink(&hci_sk_list, sk);
+ hdev = hci_pi(sk)->hdev;
if (hdev) {
if (hci_pi(sk)->channel == HCI_CHANNEL_USER) {
/* When releasing an user channel exclusive access,
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 1fc23cb..d49aa4e 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -3326,16 +3326,22 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
while (len >= L2CAP_CONF_OPT_SIZE) {
len -= l2cap_get_conf_opt(&req, &type, &olen, &val);
+ if (len < 0)
+ break;
hint = type & L2CAP_CONF_HINT;
type &= L2CAP_CONF_MASK;
switch (type) {
case L2CAP_CONF_MTU:
+ if (olen != 2)
+ break;
mtu = val;
break;
case L2CAP_CONF_FLUSH_TO:
+ if (olen != 2)
+ break;
chan->flush_to = val;
break;
@@ -3343,26 +3349,30 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
break;
case L2CAP_CONF_RFC:
- if (olen == sizeof(rfc))
- memcpy(&rfc, (void *) val, olen);
+ if (olen != sizeof(rfc))
+ break;
+ memcpy(&rfc, (void *) val, olen);
break;
case L2CAP_CONF_FCS:
+ if (olen != 1)
+ break;
if (val == L2CAP_FCS_NONE)
set_bit(CONF_RECV_NO_FCS, &chan->conf_state);
break;
case L2CAP_CONF_EFS:
- if (olen == sizeof(efs)) {
- remote_efs = 1;
- memcpy(&efs, (void *) val, olen);
- }
+ if (olen != sizeof(efs))
+ break;
+ remote_efs = 1;
+ memcpy(&efs, (void *) val, olen);
break;
case L2CAP_CONF_EWS:
+ if (olen != 2)
+ break;
if (!(chan->conn->local_fixed_chan & L2CAP_FC_A2MP))
return -ECONNREFUSED;
-
set_bit(FLAG_EXT_CTRL, &chan->flags);
set_bit(CONF_EWS_RECV, &chan->conf_state);
chan->tx_win_max = L2CAP_DEFAULT_EXT_WINDOW;
@@ -3372,7 +3382,6 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
default:
if (hint)
break;
-
result = L2CAP_CONF_UNKNOWN;
*((u8 *) ptr++) = type;
break;
@@ -3537,58 +3546,65 @@ static int l2cap_parse_conf_rsp(struct l2cap_chan *chan, void *rsp, int len,
while (len >= L2CAP_CONF_OPT_SIZE) {
len -= l2cap_get_conf_opt(&rsp, &type, &olen, &val);
+ if (len < 0)
+ break;
switch (type) {
case L2CAP_CONF_MTU:
+ if (olen != 2)
+ break;
if (val < L2CAP_DEFAULT_MIN_MTU) {
*result = L2CAP_CONF_UNACCEPT;
chan->imtu = L2CAP_DEFAULT_MIN_MTU;
} else
chan->imtu = val;
- l2cap_add_conf_opt(&ptr, L2CAP_CONF_MTU, 2, chan->imtu, endptr - ptr);
+ l2cap_add_conf_opt(&ptr, L2CAP_CONF_MTU, 2, chan->imtu,
+ endptr - ptr);
break;
case L2CAP_CONF_FLUSH_TO:
+ if (olen != 2)
+ break;
chan->flush_to = val;
- l2cap_add_conf_opt(&ptr, L2CAP_CONF_FLUSH_TO,
- 2, chan->flush_to, endptr - ptr);
+ l2cap_add_conf_opt(&ptr, L2CAP_CONF_FLUSH_TO, 2,
+ chan->flush_to, endptr - ptr);
break;
case L2CAP_CONF_RFC:
- if (olen == sizeof(rfc))
- memcpy(&rfc, (void *)val, olen);
-
+ if (olen != sizeof(rfc))
+ break;
+ memcpy(&rfc, (void *)val, olen);
if (test_bit(CONF_STATE2_DEVICE, &chan->conf_state) &&
rfc.mode != chan->mode)
return -ECONNREFUSED;
-
chan->fcs = 0;
-
- l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC,
- sizeof(rfc), (unsigned long) &rfc, endptr - ptr);
+ l2cap_add_conf_opt(&ptr, L2CAP_CONF_RFC, sizeof(rfc),
+ (unsigned long) &rfc, endptr - ptr);
break;
case L2CAP_CONF_EWS:
+ if (olen != 2)
+ break;
chan->ack_win = min_t(u16, val, chan->ack_win);
l2cap_add_conf_opt(&ptr, L2CAP_CONF_EWS, 2,
chan->tx_win, endptr - ptr);
break;
case L2CAP_CONF_EFS:
- if (olen == sizeof(efs)) {
- memcpy(&efs, (void *)val, olen);
-
- if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
- efs.stype != L2CAP_SERV_NOTRAFIC &&
- efs.stype != chan->local_stype)
- return -ECONNREFUSED;
-
- l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
- (unsigned long) &efs, endptr - ptr);
- }
+ if (olen != sizeof(efs))
+ break;
+ memcpy(&efs, (void *)val, olen);
+ if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
+ efs.stype != L2CAP_SERV_NOTRAFIC &&
+ efs.stype != chan->local_stype)
+ return -ECONNREFUSED;
+ l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
+ (unsigned long) &efs, endptr - ptr);
break;
case L2CAP_CONF_FCS:
+ if (olen != 1)
+ break;
if (*result == L2CAP_CONF_PENDING)
if (val == L2CAP_FCS_NONE)
set_bit(CONF_RECV_NO_FCS,
@@ -3717,13 +3733,18 @@ static void l2cap_conf_rfc_get(struct l2cap_chan *chan, void *rsp, int len)
while (len >= L2CAP_CONF_OPT_SIZE) {
len -= l2cap_get_conf_opt(&rsp, &type, &olen, &val);
+ if (len < 0)
+ break;
switch (type) {
case L2CAP_CONF_RFC:
- if (olen == sizeof(rfc))
- memcpy(&rfc, (void *)val, olen);
+ if (olen != sizeof(rfc))
+ break;
+ memcpy(&rfc, (void *)val, olen);
break;
case L2CAP_CONF_EWS:
+ if (olen != 2)
+ break;
txwin_ext = val;
break;
}
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 267b46a..c615dff 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -231,13 +231,10 @@ static void __br_handle_local_finish(struct sk_buff *skb)
/* note: already called with rcu_read_lock */
static int br_handle_local_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
{
- struct net_bridge_port *p = br_port_get_rcu(skb->dev);
-
__br_handle_local_finish(skb);
- BR_INPUT_SKB_CB(skb)->brdev = p->br->dev;
- br_pass_frame_up(skb);
- return 0;
+ /* return 1 to signal the okfn() was called so it's ok to use the skb */
+ return 1;
}
/*
@@ -308,10 +305,18 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
goto forward;
}
- /* Deliver packet to local host only */
- NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN, dev_net(skb->dev),
- NULL, skb, skb->dev, NULL, br_handle_local_finish);
- return RX_HANDLER_CONSUMED;
+ /* The else clause should be hit when nf_hook():
+ * - returns < 0 (drop/error)
+ * - returns = 0 (stolen/nf_queue)
+ * Thus return 1 from the okfn() to signal the skb is ok to pass
+ */
+ if (NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN,
+ dev_net(skb->dev), NULL, skb, skb->dev, NULL,
+ br_handle_local_finish) == 1) {
+ return RX_HANDLER_PASS;
+ } else {
+ return RX_HANDLER_CONSUMED;
+ }
}
forward:
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2136e45..964ffff 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1983,7 +1983,8 @@ static void br_multicast_start_querier(struct net_bridge *br,
__br_multicast_open(br, query);
- list_for_each_entry(port, &br->port_list, list) {
+ rcu_read_lock();
+ list_for_each_entry_rcu(port, &br->port_list, list) {
if (port->state == BR_STATE_DISABLED ||
port->state == BR_STATE_BLOCKING)
continue;
@@ -1995,6 +1996,7 @@ static void br_multicast_start_querier(struct net_bridge *br,
br_multicast_enable(&port->ip6_own_query);
#endif
}
+ rcu_read_unlock();
}
int br_multicast_toggle(struct net_bridge *br, unsigned long val)
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 7e42c0d..38865de 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -878,11 +878,6 @@ static const struct nf_br_ops br_ops = {
.br_dev_xmit_hook = br_nf_dev_xmit,
};
-void br_netfilter_enable(void)
-{
-}
-EXPORT_SYMBOL_GPL(br_netfilter_enable);
-
/* For br_nf_post_routing, we need (prio = NF_BR_PRI_LAST), because
* br_dev_queue_push_xmit is called afterwards */
static struct nf_hook_ops br_nf_ops[] __read_mostly = {
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index c7e5aaf..142ccaa 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2056,7 +2056,8 @@ static int ebt_size_mwt(struct compat_ebt_entry_mwt *match32,
if (match_kern)
match_kern->match_size = ret;
- if (WARN_ON(type == EBT_COMPAT_TARGET && size_left))
+ /* rule should have no remaining data after target */
+ if (type == EBT_COMPAT_TARGET && size_left)
return -EINVAL;
match32 = (struct compat_ebt_entry_mwt *) buf;
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index 464e885..bf0294c 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -699,7 +699,6 @@ int __ceph_open_session(struct ceph_client *client, unsigned long started)
}
EXPORT_SYMBOL(__ceph_open_session);
-
int ceph_open_session(struct ceph_client *client)
{
int ret;
@@ -715,6 +714,23 @@ int ceph_open_session(struct ceph_client *client)
}
EXPORT_SYMBOL(ceph_open_session);
+int ceph_wait_for_latest_osdmap(struct ceph_client *client,
+ unsigned long timeout)
+{
+ u64 newest_epoch;
+ int ret;
+
+ ret = ceph_monc_get_version(&client->monc, "osdmap", &newest_epoch);
+ if (ret)
+ return ret;
+
+ if (client->osdc.osdmap->epoch >= newest_epoch)
+ return 0;
+
+ ceph_osdc_maybe_request_map(&client->osdc);
+ return ceph_monc_wait_osdmap(&client->monc, newest_epoch, timeout);
+}
+EXPORT_SYMBOL(ceph_wait_for_latest_osdmap);
static int __init init_ceph_lib(void)
{
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 5004810..288c1fc 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -914,6 +914,15 @@ int ceph_monc_blacklist_add(struct ceph_mon_client *monc,
mutex_unlock(&monc->mutex);
ret = wait_generic_request(req);
+ if (!ret)
+ /*
+ * Make sure we have the osdmap that includes the blacklist
+ * entry. This is needed to ensure that the OSDs pick up the
+ * new blacklist before processing any future requests from
+ * this client.
+ */
+ ret = ceph_wait_for_latest_osdmap(monc->client, 0);
+
out:
put_generic_request(req);
return ret;
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index a8a9938..20ae57f 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1801,17 +1801,22 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
gstrings.len = ret;
- data = kcalloc(gstrings.len, ETH_GSTRING_LEN, GFP_USER);
- if (!data)
- return -ENOMEM;
+ if (gstrings.len) {
+ data = kcalloc(gstrings.len, ETH_GSTRING_LEN, GFP_USER);
+ if (!data)
+ return -ENOMEM;
- __ethtool_get_strings(dev, gstrings.string_set, data);
+ __ethtool_get_strings(dev, gstrings.string_set, data);
+ } else {
+ data = NULL;
+ }
ret = -EFAULT;
if (copy_to_user(useraddr, &gstrings, sizeof(gstrings)))
goto out;
useraddr += sizeof(gstrings);
- if (copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
+ if (gstrings.len &&
+ copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
goto out;
ret = 0;
@@ -1899,17 +1904,21 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
return -EFAULT;
stats.n_stats = n_stats;
- data = kmalloc(n_stats * sizeof(u64), GFP_USER);
- if (!data)
- return -ENOMEM;
+ if (n_stats) {
+ data = kmalloc(n_stats * sizeof(u64), GFP_USER);
+ if (!data)
+ return -ENOMEM;
- ops->get_ethtool_stats(dev, &stats, data);
+ ops->get_ethtool_stats(dev, &stats, data);
+ } else {
+ data = NULL;
+ }
ret = -EFAULT;
if (copy_to_user(useraddr, &stats, sizeof(stats)))
goto out;
useraddr += sizeof(stats);
- if (copy_to_user(useraddr, data, stats.n_stats * sizeof(u64)))
+ if (n_stats && copy_to_user(useraddr, data, n_stats * sizeof(u64)))
goto out;
ret = 0;
@@ -1938,19 +1947,23 @@ static int ethtool_get_phy_stats(struct net_device *dev, void __user *useraddr)
return -EFAULT;
stats.n_stats = n_stats;
- data = kmalloc_array(n_stats, sizeof(u64), GFP_USER);
- if (!data)
- return -ENOMEM;
+ if (n_stats) {
+ data = kmalloc_array(n_stats, sizeof(u64), GFP_USER);
+ if (!data)
+ return -ENOMEM;
- mutex_lock(&phydev->lock);
- phydev->drv->get_stats(phydev, &stats, data);
- mutex_unlock(&phydev->lock);
+ mutex_lock(&phydev->lock);
+ phydev->drv->get_stats(phydev, &stats, data);
+ mutex_unlock(&phydev->lock);
+ } else {
+ data = NULL;
+ }
ret = -EFAULT;
if (copy_to_user(useraddr, &stats, sizeof(stats)))
goto out;
useraddr += sizeof(stats);
- if (copy_to_user(useraddr, data, stats.n_stats * sizeof(u64)))
+ if (n_stats && copy_to_user(useraddr, data, n_stats * sizeof(u64)))
goto out;
ret = 0;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 04fd04c..4509dec 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -282,6 +282,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
atomic_set(&net->count, 1);
atomic_set(&net->passive, 1);
+ get_random_bytes(&net->hash_mix, sizeof(u32));
net->dev_base_seq = 1;
net->user_ns = user_ns;
idr_init(&net->netns_ids);
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index 28ad6f1..1d6d3aa 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -596,13 +596,7 @@ int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
if (inet_csk_reqsk_queue_is_full(sk))
goto drop;
- /*
- * Accept backlog is full. If we have already queued enough
- * of warm entries in syn queue, drop request. It is better than
- * clogging syn queue with openreqs with exponentially increasing
- * timeout.
- */
- if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1)
+ if (sk_acceptq_is_full(sk))
goto drop;
req = inet_reqsk_alloc(&dccp_request_sock_ops, sk, true);
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 6cbcf39..87c513b 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -328,7 +328,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
if (inet_csk_reqsk_queue_is_full(sk))
goto drop;
- if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1)
+ if (sk_acceptq_is_full(sk))
goto drop;
req = inet_reqsk_alloc(&dccp6_request_sock_ops, sk, true);
@@ -431,8 +431,8 @@ static struct sock *dccp_v6_request_recv_sock(const struct sock *sk,
newnp->ipv6_mc_list = NULL;
newnp->ipv6_ac_list = NULL;
newnp->ipv6_fl_list = NULL;
- newnp->mcast_oif = inet6_iif(skb);
- newnp->mcast_hops = ipv6_hdr(skb)->hop_limit;
+ newnp->mcast_oif = inet_iif(skb);
+ newnp->mcast_hops = ip_hdr(skb)->ttl;
/*
* No need to charge this sock to the relevant IPv6 refcnt debug socks count
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c
index 16737cd..52694cb 100644
--- a/net/hsr/hsr_device.c
+++ b/net/hsr/hsr_device.c
@@ -94,9 +94,8 @@ static void hsr_check_announce(struct net_device *hsr_dev,
&& (old_operstate != IF_OPER_UP)) {
/* Went up */
hsr->announce_count = 0;
- hsr->announce_timer.expires = jiffies +
- msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
- add_timer(&hsr->announce_timer);
+ mod_timer(&hsr->announce_timer,
+ jiffies + msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL));
}
if ((hsr_dev->operstate != IF_OPER_UP) && (old_operstate == IF_OPER_UP))
@@ -331,6 +330,7 @@ static void hsr_announce(unsigned long data)
{
struct hsr_priv *hsr;
struct hsr_port *master;
+ unsigned long interval;
hsr = (struct hsr_priv *) data;
@@ -342,18 +342,16 @@ static void hsr_announce(unsigned long data)
hsr->protVersion);
hsr->announce_count++;
- hsr->announce_timer.expires = jiffies +
- msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
+ interval = msecs_to_jiffies(HSR_ANNOUNCE_INTERVAL);
} else {
send_hsr_supervision_frame(master, HSR_TLV_LIFE_CHECK,
hsr->protVersion);
- hsr->announce_timer.expires = jiffies +
- msecs_to_jiffies(HSR_LIFE_CHECK_INTERVAL);
+ interval = msecs_to_jiffies(HSR_LIFE_CHECK_INTERVAL);
}
if (is_admin_up(master->dev))
- add_timer(&hsr->announce_timer);
+ mod_timer(&hsr->announce_timer, jiffies + interval);
rcu_read_unlock();
}
@@ -485,7 +483,7 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2],
res = hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER);
if (res)
- return res;
+ goto err_add_port;
res = register_netdevice(hsr_dev);
if (res)
@@ -505,6 +503,8 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2],
fail:
hsr_for_each_port(hsr, port)
hsr_del_port(port);
+err_add_port:
+ hsr_del_node(&hsr->self_node_db);
return res;
}
diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index 284a9b8..6705420 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -124,6 +124,18 @@ int hsr_create_self_node(struct list_head *self_node_db,
return 0;
}
+void hsr_del_node(struct list_head *self_node_db)
+{
+ struct hsr_node *node;
+
+ rcu_read_lock();
+ node = list_first_or_null_rcu(self_node_db, struct hsr_node, mac_list);
+ rcu_read_unlock();
+ if (node) {
+ list_del_rcu(&node->mac_list);
+ kfree(node);
+ }
+}
/* Allocate an hsr_node and add it to node_db. 'addr' is the node's AddressA;
* seq_out is used to initialize filtering of outgoing duplicate frames
diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h
index 4e04f0e..43958a3 100644
--- a/net/hsr/hsr_framereg.h
+++ b/net/hsr/hsr_framereg.h
@@ -16,6 +16,7 @@
struct hsr_node;
+void hsr_del_node(struct list_head *self_node_db);
struct hsr_node *hsr_add_node(struct list_head *node_db, unsigned char addr[],
u16 seq_out);
struct hsr_node *hsr_get_node(struct hsr_port *port, struct sk_buff *skb,
diff --git a/net/ieee802154/6lowpan/reassembly.c b/net/ieee802154/6lowpan/reassembly.c
index aab1e2d..c01df34 100644
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -25,7 +25,7 @@
#include <net/ieee802154_netdev.h>
#include <net/6lowpan.h>
-#include <net/ipv6.h>
+#include <net/ipv6_frag.h>
#include <net/inet_frag.h>
#include "6lowpan_i.h"
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 030d153..17acc89 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -119,6 +119,7 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
struct guehdr *guehdr;
void *data;
u16 doffset = 0;
+ u8 proto_ctype;
if (!fou)
return 1;
@@ -210,13 +211,14 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
if (unlikely(guehdr->control))
return gue_control_message(skb, guehdr);
+ proto_ctype = guehdr->proto_ctype;
__skb_pull(skb, sizeof(struct udphdr) + hdrlen);
skb_reset_transport_header(skb);
if (iptunnel_pull_offloads(skb))
goto drop;
- return -guehdr->proto_ctype;
+ return -proto_ctype;
drop:
kfree_skb(skb);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 9453180..80a2d99 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -797,7 +797,6 @@ static void inet_child_forget(struct sock *sk, struct request_sock *req,
tcp_sk(child)->fastopen_rsk = NULL;
}
inet_csk_destroy_sock(child);
- reqsk_put(req);
}
struct sock *inet_csk_reqsk_queue_add(struct sock *sk,
@@ -868,6 +867,7 @@ void inet_csk_listen_stop(struct sock *sk)
sock_hold(child);
inet_child_forget(sk, req, child);
+ reqsk_put(req);
bh_unlock_sock(child);
local_bh_enable();
sock_put(child);
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 0fb49de..2325cd3 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -24,6 +24,62 @@
#include <net/sock.h>
#include <net/inet_frag.h>
#include <net/inet_ecn.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+
+/* Use skb->cb to track consecutive/adjacent fragments coming at
+ * the end of the queue. Nodes in the rb-tree queue will
+ * contain "runs" of one or more adjacent fragments.
+ *
+ * Invariants:
+ * - next_frag is NULL at the tail of a "run";
+ * - the head of a "run" has the sum of all fragment lengths in frag_run_len.
+ */
+struct ipfrag_skb_cb {
+ union {
+ struct inet_skb_parm h4;
+ struct inet6_skb_parm h6;
+ };
+ struct sk_buff *next_frag;
+ int frag_run_len;
+};
+
+#define FRAG_CB(skb) ((struct ipfrag_skb_cb *)((skb)->cb))
+
+static void fragcb_clear(struct sk_buff *skb)
+{
+ RB_CLEAR_NODE(&skb->rbnode);
+ FRAG_CB(skb)->next_frag = NULL;
+ FRAG_CB(skb)->frag_run_len = skb->len;
+}
+
+/* Append skb to the last "run". */
+static void fragrun_append_to_last(struct inet_frag_queue *q,
+ struct sk_buff *skb)
+{
+ fragcb_clear(skb);
+
+ FRAG_CB(q->last_run_head)->frag_run_len += skb->len;
+ FRAG_CB(q->fragments_tail)->next_frag = skb;
+ q->fragments_tail = skb;
+}
+
+/* Create a new "run" with the skb. */
+static void fragrun_create(struct inet_frag_queue *q, struct sk_buff *skb)
+{
+ BUILD_BUG_ON(sizeof(struct ipfrag_skb_cb) > sizeof(skb->cb));
+ fragcb_clear(skb);
+
+ if (q->last_run_head)
+ rb_link_node(&skb->rbnode, &q->last_run_head->rbnode,
+ &q->last_run_head->rbnode.rb_right);
+ else
+ rb_link_node(&skb->rbnode, NULL, &q->rb_fragments.rb_node);
+ rb_insert_color(&skb->rbnode, &q->rb_fragments);
+
+ q->fragments_tail = skb;
+ q->last_run_head = skb;
+}
/* Given the OR values of all fragments, apply RFC 3168 5.3 requirements
* Value : 0xff if frame should be dropped.
@@ -122,6 +178,28 @@ static void inet_frag_destroy_rcu(struct rcu_head *head)
kmem_cache_free(f->frags_cachep, q);
}
+unsigned int inet_frag_rbtree_purge(struct rb_root *root)
+{
+ struct rb_node *p = rb_first(root);
+ unsigned int sum = 0;
+
+ while (p) {
+ struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
+
+ p = rb_next(p);
+ rb_erase(&skb->rbnode, root);
+ while (skb) {
+ struct sk_buff *next = FRAG_CB(skb)->next_frag;
+
+ sum += skb->truesize;
+ kfree_skb(skb);
+ skb = next;
+ }
+ }
+ return sum;
+}
+EXPORT_SYMBOL(inet_frag_rbtree_purge);
+
void inet_frag_destroy(struct inet_frag_queue *q)
{
struct sk_buff *fp;
@@ -223,3 +301,218 @@ struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key)
return fq;
}
EXPORT_SYMBOL(inet_frag_find);
+
+int inet_frag_queue_insert(struct inet_frag_queue *q, struct sk_buff *skb,
+ int offset, int end)
+{
+ struct sk_buff *last = q->fragments_tail;
+
+ /* RFC5722, Section 4, amended by Errata ID : 3089
+ * When reassembling an IPv6 datagram, if
+ * one or more its constituent fragments is determined to be an
+ * overlapping fragment, the entire datagram (and any constituent
+ * fragments) MUST be silently discarded.
+ *
+ * Duplicates, however, should be ignored (i.e. skb dropped, but the
+ * queue/fragments kept for later reassembly).
+ */
+ if (!last)
+ fragrun_create(q, skb); /* First fragment. */
+ else if (last->ip_defrag_offset + last->len < end) {
+ /* This is the common case: skb goes to the end. */
+ /* Detect and discard overlaps. */
+ if (offset < last->ip_defrag_offset + last->len)
+ return IPFRAG_OVERLAP;
+ if (offset == last->ip_defrag_offset + last->len)
+ fragrun_append_to_last(q, skb);
+ else
+ fragrun_create(q, skb);
+ } else {
+ /* Binary search. Note that skb can become the first fragment,
+ * but not the last (covered above).
+ */
+ struct rb_node **rbn, *parent;
+
+ rbn = &q->rb_fragments.rb_node;
+ do {
+ struct sk_buff *curr;
+ int curr_run_end;
+
+ parent = *rbn;
+ curr = rb_to_skb(parent);
+ curr_run_end = curr->ip_defrag_offset +
+ FRAG_CB(curr)->frag_run_len;
+ if (end <= curr->ip_defrag_offset)
+ rbn = &parent->rb_left;
+ else if (offset >= curr_run_end)
+ rbn = &parent->rb_right;
+ else if (offset >= curr->ip_defrag_offset &&
+ end <= curr_run_end)
+ return IPFRAG_DUP;
+ else
+ return IPFRAG_OVERLAP;
+ } while (*rbn);
+ /* Here we have parent properly set, and rbn pointing to
+ * one of its NULL left/right children. Insert skb.
+ */
+ fragcb_clear(skb);
+ rb_link_node(&skb->rbnode, parent, rbn);
+ rb_insert_color(&skb->rbnode, &q->rb_fragments);
+ }
+
+ skb->ip_defrag_offset = offset;
+
+ return IPFRAG_OK;
+}
+EXPORT_SYMBOL(inet_frag_queue_insert);
+
+void *inet_frag_reasm_prepare(struct inet_frag_queue *q, struct sk_buff *skb,
+ struct sk_buff *parent)
+{
+ struct sk_buff *fp, *head = skb_rb_first(&q->rb_fragments);
+ struct sk_buff **nextp;
+ int delta;
+
+ if (head != skb) {
+ fp = skb_clone(skb, GFP_ATOMIC);
+ if (!fp)
+ return NULL;
+ FRAG_CB(fp)->next_frag = FRAG_CB(skb)->next_frag;
+ if (RB_EMPTY_NODE(&skb->rbnode))
+ FRAG_CB(parent)->next_frag = fp;
+ else
+ rb_replace_node(&skb->rbnode, &fp->rbnode,
+ &q->rb_fragments);
+ if (q->fragments_tail == skb)
+ q->fragments_tail = fp;
+ skb_morph(skb, head);
+ FRAG_CB(skb)->next_frag = FRAG_CB(head)->next_frag;
+ rb_replace_node(&head->rbnode, &skb->rbnode,
+ &q->rb_fragments);
+ consume_skb(head);
+ head = skb;
+ }
+ WARN_ON(head->ip_defrag_offset != 0);
+
+ delta = -head->truesize;
+
+ /* Head of list must not be cloned. */
+ if (skb_unclone(head, GFP_ATOMIC))
+ return NULL;
+
+ delta += head->truesize;
+ if (delta)
+ add_frag_mem_limit(q->net, delta);
+
+ /* If the first fragment is fragmented itself, we split
+ * it to two chunks: the first with data and paged part
+ * and the second, holding only fragments.
+ */
+ if (skb_has_frag_list(head)) {
+ struct sk_buff *clone;
+ int i, plen = 0;
+
+ clone = alloc_skb(0, GFP_ATOMIC);
+ if (!clone)
+ return NULL;
+ skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
+ skb_frag_list_init(head);
+ for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
+ plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
+ clone->data_len = head->data_len - plen;
+ clone->len = clone->data_len;
+ head->truesize += clone->truesize;
+ clone->csum = 0;
+ clone->ip_summed = head->ip_summed;
+ add_frag_mem_limit(q->net, clone->truesize);
+ skb_shinfo(head)->frag_list = clone;
+ nextp = &clone->next;
+ } else {
+ nextp = &skb_shinfo(head)->frag_list;
+ }
+
+ return nextp;
+}
+EXPORT_SYMBOL(inet_frag_reasm_prepare);
+
+void inet_frag_reasm_finish(struct inet_frag_queue *q, struct sk_buff *head,
+ void *reasm_data)
+{
+ struct sk_buff **nextp = (struct sk_buff **)reasm_data;
+ struct rb_node *rbn;
+ struct sk_buff *fp;
+
+ skb_push(head, head->data - skb_network_header(head));
+
+ /* Traverse the tree in order, to build frag_list. */
+ fp = FRAG_CB(head)->next_frag;
+ rbn = rb_next(&head->rbnode);
+ rb_erase(&head->rbnode, &q->rb_fragments);
+ while (rbn || fp) {
+ /* fp points to the next sk_buff in the current run;
+ * rbn points to the next run.
+ */
+ /* Go through the current run. */
+ while (fp) {
+ *nextp = fp;
+ nextp = &fp->next;
+ fp->prev = NULL;
+ memset(&fp->rbnode, 0, sizeof(fp->rbnode));
+ fp->sk = NULL;
+ head->data_len += fp->len;
+ head->len += fp->len;
+ if (head->ip_summed != fp->ip_summed)
+ head->ip_summed = CHECKSUM_NONE;
+ else if (head->ip_summed == CHECKSUM_COMPLETE)
+ head->csum = csum_add(head->csum, fp->csum);
+ head->truesize += fp->truesize;
+ fp = FRAG_CB(fp)->next_frag;
+ }
+ /* Move to the next run. */
+ if (rbn) {
+ struct rb_node *rbnext = rb_next(rbn);
+
+ fp = rb_to_skb(rbn);
+ rb_erase(rbn, &q->rb_fragments);
+ rbn = rbnext;
+ }
+ }
+ sub_frag_mem_limit(q->net, head->truesize);
+
+ *nextp = NULL;
+ head->next = NULL;
+ head->prev = NULL;
+ head->tstamp = q->stamp;
+}
+EXPORT_SYMBOL(inet_frag_reasm_finish);
+
+struct sk_buff *inet_frag_pull_head(struct inet_frag_queue *q)
+{
+ struct sk_buff *head;
+
+ if (q->fragments) {
+ head = q->fragments;
+ q->fragments = head->next;
+ } else {
+ struct sk_buff *skb;
+
+ head = skb_rb_first(&q->rb_fragments);
+ if (!head)
+ return NULL;
+ skb = FRAG_CB(head)->next_frag;
+ if (skb)
+ rb_replace_node(&head->rbnode, &skb->rbnode,
+ &q->rb_fragments);
+ else
+ rb_erase(&head->rbnode, &q->rb_fragments);
+ memset(&head->rbnode, 0, sizeof(head->rbnode));
+ barrier();
+ }
+ if (head == q->fragments_tail)
+ q->fragments_tail = NULL;
+
+ sub_frag_mem_limit(q->net, head->truesize);
+
+ return head;
+}
+EXPORT_SYMBOL(inet_frag_pull_head);
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index c7334d1..6e9ba9d 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -56,57 +56,6 @@
*/
static const char ip_frag_cache_name[] = "ip4-frags";
-/* Use skb->cb to track consecutive/adjacent fragments coming at
- * the end of the queue. Nodes in the rb-tree queue will
- * contain "runs" of one or more adjacent fragments.
- *
- * Invariants:
- * - next_frag is NULL at the tail of a "run";
- * - the head of a "run" has the sum of all fragment lengths in frag_run_len.
- */
-struct ipfrag_skb_cb {
- struct inet_skb_parm h;
- struct sk_buff *next_frag;
- int frag_run_len;
-};
-
-#define FRAG_CB(skb) ((struct ipfrag_skb_cb *)((skb)->cb))
-
-static void ip4_frag_init_run(struct sk_buff *skb)
-{
- BUILD_BUG_ON(sizeof(struct ipfrag_skb_cb) > sizeof(skb->cb));
-
- FRAG_CB(skb)->next_frag = NULL;
- FRAG_CB(skb)->frag_run_len = skb->len;
-}
-
-/* Append skb to the last "run". */
-static void ip4_frag_append_to_last_run(struct inet_frag_queue *q,
- struct sk_buff *skb)
-{
- RB_CLEAR_NODE(&skb->rbnode);
- FRAG_CB(skb)->next_frag = NULL;
-
- FRAG_CB(q->last_run_head)->frag_run_len += skb->len;
- FRAG_CB(q->fragments_tail)->next_frag = skb;
- q->fragments_tail = skb;
-}
-
-/* Create a new "run" with the skb. */
-static void ip4_frag_create_run(struct inet_frag_queue *q, struct sk_buff *skb)
-{
- if (q->last_run_head)
- rb_link_node(&skb->rbnode, &q->last_run_head->rbnode,
- &q->last_run_head->rbnode.rb_right);
- else
- rb_link_node(&skb->rbnode, NULL, &q->rb_fragments.rb_node);
- rb_insert_color(&skb->rbnode, &q->rb_fragments);
-
- ip4_frag_init_run(skb);
- q->fragments_tail = skb;
- q->last_run_head = skb;
-}
-
/* Describe an entry in the "incomplete datagrams" queue. */
struct ipq {
struct inet_frag_queue q;
@@ -210,27 +159,9 @@ static void ip_expire(unsigned long arg)
* pull the head out of the tree in order to be able to
* deal with head->dev.
*/
- if (qp->q.fragments) {
- head = qp->q.fragments;
- qp->q.fragments = head->next;
- } else {
- head = skb_rb_first(&qp->q.rb_fragments);
- if (!head)
- goto out;
- if (FRAG_CB(head)->next_frag)
- rb_replace_node(&head->rbnode,
- &FRAG_CB(head)->next_frag->rbnode,
- &qp->q.rb_fragments);
- else
- rb_erase(&head->rbnode, &qp->q.rb_fragments);
- memset(&head->rbnode, 0, sizeof(head->rbnode));
- barrier();
- }
- if (head == qp->q.fragments_tail)
- qp->q.fragments_tail = NULL;
-
- sub_frag_mem_limit(qp->q.net, head->truesize);
-
+ head = inet_frag_pull_head(&qp->q);
+ if (!head)
+ goto out;
head->dev = dev_get_by_index_rcu(net, qp->iif);
if (!head->dev)
goto out;
@@ -343,12 +274,10 @@ static int ip_frag_reinit(struct ipq *qp)
static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
{
struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
- struct rb_node **rbn, *parent;
- struct sk_buff *skb1, *prev_tail;
- int ihl, end, skb1_run_end;
+ int ihl, end, flags, offset;
+ struct sk_buff *prev_tail;
struct net_device *dev;
unsigned int fragsize;
- int flags, offset;
int err = -ENOENT;
u8 ecn;
@@ -380,7 +309,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
*/
if (end < qp->q.len ||
((qp->q.flags & INET_FRAG_LAST_IN) && end != qp->q.len))
- goto err;
+ goto discard_qp;
qp->q.flags |= INET_FRAG_LAST_IN;
qp->q.len = end;
} else {
@@ -392,82 +321,33 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
if (end > qp->q.len) {
/* Some bits beyond end -> corruption. */
if (qp->q.flags & INET_FRAG_LAST_IN)
- goto err;
+ goto discard_qp;
qp->q.len = end;
}
}
if (end == offset)
- goto err;
+ goto discard_qp;
err = -ENOMEM;
if (!pskb_pull(skb, skb_network_offset(skb) + ihl))
- goto err;
+ goto discard_qp;
err = pskb_trim_rcsum(skb, end - offset);
if (err)
- goto err;
+ goto discard_qp;
/* Note : skb->rbnode and skb->dev share the same location. */
dev = skb->dev;
/* Makes sure compiler wont do silly aliasing games */
barrier();
- /* RFC5722, Section 4, amended by Errata ID : 3089
- * When reassembling an IPv6 datagram, if
- * one or more its constituent fragments is determined to be an
- * overlapping fragment, the entire datagram (and any constituent
- * fragments) MUST be silently discarded.
- *
- * We do the same here for IPv4 (and increment an snmp counter) but
- * we do not want to drop the whole queue in response to a duplicate
- * fragment.
- */
-
- err = -EINVAL;
- /* Find out where to put this fragment. */
prev_tail = qp->q.fragments_tail;
- if (!prev_tail)
- ip4_frag_create_run(&qp->q, skb); /* First fragment. */
- else if (prev_tail->ip_defrag_offset + prev_tail->len < end) {
- /* This is the common case: skb goes to the end. */
- /* Detect and discard overlaps. */
- if (offset < prev_tail->ip_defrag_offset + prev_tail->len)
- goto discard_qp;
- if (offset == prev_tail->ip_defrag_offset + prev_tail->len)
- ip4_frag_append_to_last_run(&qp->q, skb);
- else
- ip4_frag_create_run(&qp->q, skb);
- } else {
- /* Binary search. Note that skb can become the first fragment,
- * but not the last (covered above).
- */
- rbn = &qp->q.rb_fragments.rb_node;
- do {
- parent = *rbn;
- skb1 = rb_to_skb(parent);
- skb1_run_end = skb1->ip_defrag_offset +
- FRAG_CB(skb1)->frag_run_len;
- if (end <= skb1->ip_defrag_offset)
- rbn = &parent->rb_left;
- else if (offset >= skb1_run_end)
- rbn = &parent->rb_right;
- else if (offset >= skb1->ip_defrag_offset &&
- end <= skb1_run_end)
- goto err; /* No new data, potential duplicate */
- else
- goto discard_qp; /* Found an overlap */
- } while (*rbn);
- /* Here we have parent properly set, and rbn pointing to
- * one of its NULL left/right children. Insert skb.
- */
- ip4_frag_init_run(skb);
- rb_link_node(&skb->rbnode, parent, rbn);
- rb_insert_color(&skb->rbnode, &qp->q.rb_fragments);
- }
+ err = inet_frag_queue_insert(&qp->q, skb, offset, end);
+ if (err)
+ goto insert_error;
if (dev)
qp->iif = dev->ifindex;
- skb->ip_defrag_offset = offset;
qp->q.stamp = skb->tstamp;
qp->q.meat += skb->len;
@@ -492,15 +372,24 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
skb->_skb_refdst = 0UL;
err = ip_frag_reasm(qp, skb, prev_tail, dev);
skb->_skb_refdst = orefdst;
+ if (err)
+ inet_frag_kill(&qp->q);
return err;
}
skb_dst_drop(skb);
return -EINPROGRESS;
+insert_error:
+ if (err == IPFRAG_DUP) {
+ kfree_skb(skb);
+ return -EINVAL;
+ }
+ err = -EINVAL;
+ __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
discard_qp:
inet_frag_kill(&qp->q);
- __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
+ __IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
err:
kfree_skb(skb);
return err;
@@ -512,12 +401,8 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
{
struct net *net = container_of(qp->q.net, struct net, ipv4.frags);
struct iphdr *iph;
- struct sk_buff *fp, *head = skb_rb_first(&qp->q.rb_fragments);
- struct sk_buff **nextp; /* To build frag_list. */
- struct rb_node *rbn;
- int len;
- int ihlen;
- int err;
+ void *reasm_data;
+ int len, err;
u8 ecn;
ipq_kill(qp);
@@ -527,111 +412,23 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
err = -EINVAL;
goto out_fail;
}
+
/* Make the one we just received the head. */
- if (head != skb) {
- fp = skb_clone(skb, GFP_ATOMIC);
- if (!fp)
- goto out_nomem;
- FRAG_CB(fp)->next_frag = FRAG_CB(skb)->next_frag;
- if (RB_EMPTY_NODE(&skb->rbnode))
- FRAG_CB(prev_tail)->next_frag = fp;
- else
- rb_replace_node(&skb->rbnode, &fp->rbnode,
- &qp->q.rb_fragments);
- if (qp->q.fragments_tail == skb)
- qp->q.fragments_tail = fp;
- skb_morph(skb, head);
- FRAG_CB(skb)->next_frag = FRAG_CB(head)->next_frag;
- rb_replace_node(&head->rbnode, &skb->rbnode,
- &qp->q.rb_fragments);
- consume_skb(head);
- head = skb;
- }
+ reasm_data = inet_frag_reasm_prepare(&qp->q, skb, prev_tail);
+ if (!reasm_data)
+ goto out_nomem;
- WARN_ON(head->ip_defrag_offset != 0);
-
- /* Allocate a new buffer for the datagram. */
- ihlen = ip_hdrlen(head);
- len = ihlen + qp->q.len;
-
+ len = ip_hdrlen(skb) + qp->q.len;
err = -E2BIG;
if (len > 65535)
goto out_oversize;
- /* Head of list must not be cloned. */
- if (skb_unclone(head, GFP_ATOMIC))
- goto out_nomem;
+ inet_frag_reasm_finish(&qp->q, skb, reasm_data);
- /* If the first fragment is fragmented itself, we split
- * it to two chunks: the first with data and paged part
- * and the second, holding only fragments. */
- if (skb_has_frag_list(head)) {
- struct sk_buff *clone;
- int i, plen = 0;
+ skb->dev = dev;
+ IPCB(skb)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
- clone = alloc_skb(0, GFP_ATOMIC);
- if (!clone)
- goto out_nomem;
- skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
- skb_frag_list_init(head);
- for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
- plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
- clone->len = clone->data_len = head->data_len - plen;
- head->truesize += clone->truesize;
- clone->csum = 0;
- clone->ip_summed = head->ip_summed;
- add_frag_mem_limit(qp->q.net, clone->truesize);
- skb_shinfo(head)->frag_list = clone;
- nextp = &clone->next;
- } else {
- nextp = &skb_shinfo(head)->frag_list;
- }
-
- skb_push(head, head->data - skb_network_header(head));
-
- /* Traverse the tree in order, to build frag_list. */
- fp = FRAG_CB(head)->next_frag;
- rbn = rb_next(&head->rbnode);
- rb_erase(&head->rbnode, &qp->q.rb_fragments);
- while (rbn || fp) {
- /* fp points to the next sk_buff in the current run;
- * rbn points to the next run.
- */
- /* Go through the current run. */
- while (fp) {
- *nextp = fp;
- nextp = &fp->next;
- fp->prev = NULL;
- memset(&fp->rbnode, 0, sizeof(fp->rbnode));
- fp->sk = NULL;
- head->data_len += fp->len;
- head->len += fp->len;
- if (head->ip_summed != fp->ip_summed)
- head->ip_summed = CHECKSUM_NONE;
- else if (head->ip_summed == CHECKSUM_COMPLETE)
- head->csum = csum_add(head->csum, fp->csum);
- head->truesize += fp->truesize;
- fp = FRAG_CB(fp)->next_frag;
- }
- /* Move to the next run. */
- if (rbn) {
- struct rb_node *rbnext = rb_next(rbn);
-
- fp = rb_to_skb(rbn);
- rb_erase(rbn, &qp->q.rb_fragments);
- rbn = rbnext;
- }
- }
- sub_frag_mem_limit(qp->q.net, head->truesize);
-
- *nextp = NULL;
- head->next = NULL;
- head->prev = NULL;
- head->dev = dev;
- head->tstamp = qp->q.stamp;
- IPCB(head)->frag_max_size = max(qp->max_df_size, qp->q.max_size);
-
- iph = ip_hdr(head);
+ iph = ip_hdr(skb);
iph->tot_len = htons(len);
iph->tos |= ecn;
@@ -644,7 +441,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
* from one very small df-fragment and one large non-df frag.
*/
if (qp->max_df_size == qp->q.max_size) {
- IPCB(head)->flags |= IPSKB_FRAG_PMTU;
+ IPCB(skb)->flags |= IPSKB_FRAG_PMTU;
iph->frag_off = htons(IP_DF);
} else {
iph->frag_off = 0;
@@ -742,28 +539,6 @@ struct sk_buff *ip_check_defrag(struct net *net, struct sk_buff *skb, u32 user)
}
EXPORT_SYMBOL(ip_check_defrag);
-unsigned int inet_frag_rbtree_purge(struct rb_root *root)
-{
- struct rb_node *p = rb_first(root);
- unsigned int sum = 0;
-
- while (p) {
- struct sk_buff *skb = rb_entry(p, struct sk_buff, rbnode);
-
- p = rb_next(p);
- rb_erase(&skb->rbnode, root);
- while (skb) {
- struct sk_buff *next = FRAG_CB(skb)->next_frag;
-
- sum += skb->truesize;
- kfree_skb(skb);
- skb = next;
- }
- }
- return sum;
-}
-EXPORT_SYMBOL(inet_frag_rbtree_purge);
-
#ifdef CONFIG_SYSCTL
static int dist_min;
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index bcadca2..ce0ce14 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -259,11 +259,10 @@ int ip_local_deliver(struct sk_buff *skb)
ip_local_deliver_finish);
}
-static inline bool ip_rcv_options(struct sk_buff *skb)
+static inline bool ip_rcv_options(struct sk_buff *skb, struct net_device *dev)
{
struct ip_options *opt;
const struct iphdr *iph;
- struct net_device *dev = skb->dev;
/* It looks as overkill, because not all
IP options require packet mangling.
@@ -299,7 +298,7 @@ static inline bool ip_rcv_options(struct sk_buff *skb)
}
}
- if (ip_options_rcv_srr(skb))
+ if (ip_options_rcv_srr(skb, dev))
goto drop;
}
@@ -361,7 +360,7 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
}
#endif
- if (iph->ihl > 5 && ip_rcv_options(skb))
+ if (iph->ihl > 5 && ip_rcv_options(skb, dev))
goto drop;
rt = skb_rtable(skb);
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 4cd3b5a..570cdb5 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -614,7 +614,7 @@ void ip_forward_options(struct sk_buff *skb)
}
}
-int ip_options_rcv_srr(struct sk_buff *skb)
+int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev)
{
struct ip_options *opt = &(IPCB(skb)->opt);
int srrspace, srrptr;
@@ -649,7 +649,7 @@ int ip_options_rcv_srr(struct sk_buff *skb)
orefdst = skb->_skb_refdst;
skb_dst_set(skb, NULL);
- err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, skb->dev);
+ err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, dev);
rt2 = skb_rtable(skb);
if (err || (rt2->rt_type != RTN_UNICAST && rt2->rt_type != RTN_LOCAL)) {
skb_dst_drop(skb);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 94419b8..7cad5aa 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1174,11 +1174,39 @@ static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie)
return dst;
}
+static void ipv4_send_dest_unreach(struct sk_buff *skb)
+{
+ struct ip_options opt;
+ int res;
+
+ /* Recompile ip options since IPCB may not be valid anymore.
+ * Also check we have a reasonable ipv4 header.
+ */
+ if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) ||
+ ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5)
+ return;
+
+ memset(&opt, 0, sizeof(opt));
+ if (ip_hdr(skb)->ihl > 5) {
+ if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4))
+ return;
+ opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
+
+ rcu_read_lock();
+ res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+ rcu_read_unlock();
+
+ if (res)
+ return;
+ }
+ __icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
+}
+
static void ipv4_link_failure(struct sk_buff *skb)
{
struct rtable *rt;
- icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
+ ipv4_send_dest_unreach(skb);
rt = skb_rtable(skb);
if (rt)
@@ -1619,6 +1647,10 @@ static void ip_del_fnhe(struct fib_nh *nh, __be32 daddr)
if (fnhe->fnhe_daddr == daddr) {
rcu_assign_pointer(*fnhe_p, rcu_dereference_protected(
fnhe->fnhe_next, lockdep_is_held(&fnhe_lock)));
+ /* set fnhe_daddr to 0 to ensure it won't bind with
+ * new dsts in rt_bind_exception().
+ */
+ fnhe->fnhe_daddr = 0;
fnhe_flush_routes(fnhe);
kfree_rcu(fnhe, rcu);
break;
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 4487c71..b59d9cd 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -225,7 +225,12 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
if (child) {
atomic_set(&req->rsk_refcnt, 1);
sock_rps_save_rxhash(child, skb);
- inet_csk_reqsk_queue_add(sk, req, child);
+ if (!inet_csk_reqsk_queue_add(sk, req, child)) {
+ bh_unlock_sock(child);
+ sock_put(child);
+ child = NULL;
+ reqsk_put(req);
+ }
} else {
reqsk_free(req);
}
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index adc9ccc..76da06e 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -45,6 +45,7 @@ static int tcp_delack_seg_min = TCP_DELACK_MIN;
static int tcp_delack_seg_max = 60;
static int tcp_use_userconfig_min;
static int tcp_use_userconfig_max = 1;
+static int one_day_secs = 24 * 3600;
/* Update system visible IP port range */
static void set_local_port_range(struct net *net, int range[2])
@@ -479,7 +480,9 @@ static struct ctl_table ipv4_table[] = {
.data = &sysctl_tcp_min_rtt_wlen,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &one_day_secs
},
{
.procname = "tcp_low_latency",
diff --git a/net/ipv4/tcp_dctcp.c b/net/ipv4/tcp_dctcp.c
index a08cedf..910ef01 100644
--- a/net/ipv4/tcp_dctcp.c
+++ b/net/ipv4/tcp_dctcp.c
@@ -66,11 +66,6 @@ static unsigned int dctcp_alpha_on_init __read_mostly = DCTCP_MAX_ALPHA;
module_param(dctcp_alpha_on_init, uint, 0644);
MODULE_PARM_DESC(dctcp_alpha_on_init, "parameter for initial alpha value");
-static unsigned int dctcp_clamp_alpha_on_loss __read_mostly;
-module_param(dctcp_clamp_alpha_on_loss, uint, 0644);
-MODULE_PARM_DESC(dctcp_clamp_alpha_on_loss,
- "parameter for clamping alpha on loss");
-
static struct tcp_congestion_ops dctcp_reno;
static void dctcp_reset(const struct tcp_sock *tp, struct dctcp *ca)
@@ -211,21 +206,23 @@ static void dctcp_update_alpha(struct sock *sk, u32 flags)
}
}
+static void dctcp_react_to_loss(struct sock *sk)
+{
+ struct dctcp *ca = inet_csk_ca(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
+
+ ca->loss_cwnd = tp->snd_cwnd;
+ tp->snd_ssthresh = max(tp->snd_cwnd >> 1U, 2U);
+}
+
static void dctcp_state(struct sock *sk, u8 new_state)
{
- if (dctcp_clamp_alpha_on_loss && new_state == TCP_CA_Loss) {
- struct dctcp *ca = inet_csk_ca(sk);
-
- /* If this extension is enabled, we clamp dctcp_alpha to
- * max on packet loss; the motivation is that dctcp_alpha
- * is an indicator to the extend of congestion and packet
- * loss is an indicator of extreme congestion; setting
- * this in practice turned out to be beneficial, and
- * effectively assumes total congestion which reduces the
- * window by half.
- */
- ca->dctcp_alpha = DCTCP_MAX_ALPHA;
- }
+ if (new_state == TCP_CA_Recovery &&
+ new_state != inet_csk(sk)->icsk_ca_state)
+ dctcp_react_to_loss(sk);
+ /* We handle RTO in dctcp_cwnd_event to ensure that we perform only
+ * one loss-adjustment per RTT.
+ */
}
static void dctcp_cwnd_event(struct sock *sk, enum tcp_ca_event ev)
@@ -237,6 +234,9 @@ static void dctcp_cwnd_event(struct sock *sk, enum tcp_ca_event ev)
case CA_EVENT_ECN_NO_CE:
dctcp_ce_state_1_to_0(sk);
break;
+ case CA_EVENT_LOSS:
+ dctcp_react_to_loss(sk);
+ break;
default:
/* Don't care for the rest. */
break;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 5f6dc5f..6c42ffc 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -390,11 +390,12 @@ static int __tcp_grow_window(const struct sock *sk, const struct sk_buff *skb)
static void tcp_grow_window(struct sock *sk, const struct sk_buff *skb)
{
struct tcp_sock *tp = tcp_sk(sk);
+ int room;
+
+ room = min_t(int, tp->window_clamp, tcp_space(sk)) - tp->rcv_ssthresh;
/* Check #1 */
- if (tp->rcv_ssthresh < tp->window_clamp &&
- (int)tp->rcv_ssthresh < tcp_space(sk) &&
- !tcp_under_memory_pressure(sk)) {
+ if (room > 0 && !tcp_under_memory_pressure(sk)) {
int incr;
/* Check #2. Increase window, if skb with such overhead
@@ -407,8 +408,7 @@ static void tcp_grow_window(struct sock *sk, const struct sk_buff *skb)
if (incr) {
incr = max_t(int, incr, 2 * skb->len);
- tp->rcv_ssthresh = min(tp->rcv_ssthresh + incr,
- tp->window_clamp);
+ tp->rcv_ssthresh += min(room, incr);
inet_csk(sk)->icsk_ack.quick |= 1;
}
}
@@ -6376,13 +6376,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
goto drop;
}
-
- /* Accept backlog is full. If we have already queued enough
- * of warm entries in syn queue, drop request. It is better than
- * clogging syn queue with openreqs with exponentially increasing
- * timeout.
- */
- if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) {
+ if (sk_acceptq_is_full(sk)) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
goto drop;
}
@@ -6481,7 +6475,13 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
af_ops->send_synack(fastopen_sk, dst, &fl, req,
&foc, TCP_SYNACK_FASTOPEN);
/* Add the child socket directly into the accept queue */
- inet_csk_reqsk_queue_add(sk, req, fastopen_sk);
+ if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) {
+ reqsk_fastopen_remove(fastopen_sk, req, false);
+ bh_unlock_sock(fastopen_sk);
+ sock_put(fastopen_sk);
+ reqsk_put(req);
+ goto drop;
+ }
sk->sk_data_ready(sk);
bh_unlock_sock(fastopen_sk);
sock_put(fastopen_sk);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 510a83f..8092480 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -601,7 +601,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
inet6_sk(skb->sk) : NULL;
struct ipv6hdr *tmp_hdr;
struct frag_hdr *fh;
- unsigned int mtu, hlen, left, len;
+ unsigned int mtu, hlen, left, len, nexthdr_offset;
int hroom, troom;
__be32 frag_id;
int ptr, offset = 0, err = 0;
@@ -612,6 +612,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
goto fail;
hlen = err;
nexthdr = *prevhdr;
+ nexthdr_offset = prevhdr - skb_network_header(skb);
mtu = ip6_skb_dst_mtu(skb);
@@ -646,6 +647,7 @@ int ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
(err = skb_checksum_help(skb)))
goto fail;
+ prevhdr = skb_network_header(skb) + nexthdr_offset;
hroom = LL_RESERVED_SPACE(rt->dst.dev);
if (skb_has_frag_list(skb)) {
int first_len = skb_pagelen(skb);
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 8855e89..bbc9748 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -634,7 +634,7 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
IPPROTO_IPIP,
RT_TOS(eiph->tos), 0);
if (IS_ERR(rt) ||
- rt->dst.dev->type != ARPHRD_TUNNEL) {
+ rt->dst.dev->type != ARPHRD_TUNNEL6) {
if (!IS_ERR(rt))
ip_rt_put(rt);
goto out;
@@ -644,7 +644,7 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
ip_rt_put(rt);
if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos,
skb2->dev) ||
- skb_dst(skb2)->dev->type != ARPHRD_TUNNEL)
+ skb_dst(skb2)->dev->type != ARPHRD_TUNNEL6)
goto out;
}
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index e461853..1e1fa99 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -33,9 +33,8 @@
#include <net/sock.h>
#include <net/snmp.h>
-#include <net/inet_frag.h>
+#include <net/ipv6_frag.h>
-#include <net/ipv6.h>
#include <net/protocol.h>
#include <net/transp_v6.h>
#include <net/rawv6.h>
@@ -52,14 +51,6 @@
static const char nf_frags_cache_name[] = "nf-frags";
-struct nf_ct_frag6_skb_cb
-{
- struct inet6_skb_parm h;
- int offset;
-};
-
-#define NFCT_FRAG6_CB(skb) ((struct nf_ct_frag6_skb_cb *)((skb)->cb))
-
static struct inet_frags nf_frags;
#ifdef CONFIG_SYSCTL
@@ -145,6 +136,9 @@ static void __net_exit nf_ct_frags6_sysctl_unregister(struct net *net)
}
#endif
+static int nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *skb,
+ struct sk_buff *prev_tail, struct net_device *dev);
+
static inline u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
{
return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK);
@@ -158,7 +152,7 @@ static void nf_ct_frag6_expire(unsigned long data)
fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
net = container_of(fq->q.net, struct net, nf_frag.frags);
- ip6_expire_frag_queue(net, fq);
+ ip6frag_expire_frag_queue(net, fq);
}
/* Creation primitives. */
@@ -185,9 +179,10 @@ static struct frag_queue *fq_find(struct net *net, __be32 id, u32 user,
static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
const struct frag_hdr *fhdr, int nhoff)
{
- struct sk_buff *prev, *next;
unsigned int payload_len;
- int offset, end;
+ struct net_device *dev;
+ struct sk_buff *prev;
+ int offset, end, err;
u8 ecn;
if (fq->q.flags & INET_FRAG_COMPLETE) {
@@ -262,55 +257,19 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
goto err;
}
- /* Find out which fragments are in front and at the back of us
- * in the chain of fragments so far. We must know where to put
- * this fragment, right?
- */
+ /* Note : skb->rbnode and skb->dev share the same location. */
+ dev = skb->dev;
+ /* Makes sure compiler wont do silly aliasing games */
+ barrier();
+
prev = fq->q.fragments_tail;
- if (!prev || NFCT_FRAG6_CB(prev)->offset < offset) {
- next = NULL;
- goto found;
- }
- prev = NULL;
- for (next = fq->q.fragments; next != NULL; next = next->next) {
- if (NFCT_FRAG6_CB(next)->offset >= offset)
- break; /* bingo! */
- prev = next;
- }
+ err = inet_frag_queue_insert(&fq->q, skb, offset, end);
+ if (err)
+ goto insert_error;
-found:
- /* RFC5722, Section 4:
- * When reassembling an IPv6 datagram, if
- * one or more its constituent fragments is determined to be an
- * overlapping fragment, the entire datagram (and any constituent
- * fragments, including those not yet received) MUST be silently
- * discarded.
- */
+ if (dev)
+ fq->iif = dev->ifindex;
- /* Check for overlap with preceding fragment. */
- if (prev &&
- (NFCT_FRAG6_CB(prev)->offset + prev->len) > offset)
- goto discard_fq;
-
- /* Look for overlap with succeeding segment. */
- if (next && NFCT_FRAG6_CB(next)->offset < end)
- goto discard_fq;
-
- NFCT_FRAG6_CB(skb)->offset = offset;
-
- /* Insert this fragment in the chain of fragments. */
- skb->next = next;
- if (!next)
- fq->q.fragments_tail = skb;
- if (prev)
- prev->next = skb;
- else
- fq->q.fragments = skb;
-
- if (skb->dev) {
- fq->iif = skb->dev->ifindex;
- skb->dev = NULL;
- }
fq->q.stamp = skb->tstamp;
fq->q.meat += skb->len;
fq->ecn |= ecn;
@@ -326,11 +285,25 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
fq->q.flags |= INET_FRAG_FIRST_IN;
}
- return 0;
+ if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
+ fq->q.meat == fq->q.len) {
+ unsigned long orefdst = skb->_skb_refdst;
-discard_fq:
+ skb->_skb_refdst = 0UL;
+ err = nf_ct_frag6_reasm(fq, skb, prev, dev);
+ skb->_skb_refdst = orefdst;
+ return err;
+ }
+
+ skb_dst_drop(skb);
+ return -EINPROGRESS;
+
+insert_error:
+ if (err == IPFRAG_DUP)
+ goto err;
inet_frag_kill(&fq->q);
err:
+ skb_dst_drop(skb);
return -EINVAL;
}
@@ -340,141 +313,67 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
* It is called with locked fq, and caller must check that
* queue is eligible for reassembly i.e. it is not COMPLETE,
* the last and the first frames arrived and all the bits are here.
- *
- * returns true if *prev skb has been transformed into the reassembled
- * skb, false otherwise.
*/
-static bool
-nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev)
+static int nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *skb,
+ struct sk_buff *prev_tail, struct net_device *dev)
{
- struct sk_buff *fp, *head = fq->q.fragments;
- int payload_len;
+ void *reasm_data;
+ int payload_len;
u8 ecn;
inet_frag_kill(&fq->q);
- WARN_ON(head == NULL);
- WARN_ON(NFCT_FRAG6_CB(head)->offset != 0);
-
ecn = ip_frag_ecn_table[fq->ecn];
if (unlikely(ecn == 0xff))
- return false;
+ goto err;
- /* Unfragmented part is taken from the first segment. */
- payload_len = ((head->data - skb_network_header(head)) -
+ reasm_data = inet_frag_reasm_prepare(&fq->q, skb, prev_tail);
+ if (!reasm_data)
+ goto err;
+
+ payload_len = ((skb->data - skb_network_header(skb)) -
sizeof(struct ipv6hdr) + fq->q.len -
sizeof(struct frag_hdr));
if (payload_len > IPV6_MAXPLEN) {
net_dbg_ratelimited("nf_ct_frag6_reasm: payload len = %d\n",
payload_len);
- return false;
- }
-
- /* Head of list must not be cloned. */
- if (skb_unclone(head, GFP_ATOMIC))
- return false;
-
- /* If the first fragment is fragmented itself, we split
- * it to two chunks: the first with data and paged part
- * and the second, holding only fragments. */
- if (skb_has_frag_list(head)) {
- struct sk_buff *clone;
- int i, plen = 0;
-
- clone = alloc_skb(0, GFP_ATOMIC);
- if (clone == NULL)
- return false;
-
- clone->next = head->next;
- head->next = clone;
- skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
- skb_frag_list_init(head);
- for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
- plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
- clone->len = clone->data_len = head->data_len - plen;
- head->data_len -= clone->len;
- head->len -= clone->len;
- clone->csum = 0;
- clone->ip_summed = head->ip_summed;
-
- add_frag_mem_limit(fq->q.net, clone->truesize);
- }
-
- /* morph head into last received skb: prev.
- *
- * This allows callers of ipv6 conntrack defrag to continue
- * to use the last skb(frag) passed into the reasm engine.
- * The last skb frag 'silently' turns into the full reassembled skb.
- *
- * Since prev is also part of q->fragments we have to clone it first.
- */
- if (head != prev) {
- struct sk_buff *iter;
-
- fp = skb_clone(prev, GFP_ATOMIC);
- if (!fp)
- return false;
-
- fp->next = prev->next;
-
- iter = head;
- while (iter) {
- if (iter->next == prev) {
- iter->next = fp;
- break;
- }
- iter = iter->next;
- }
-
- skb_morph(prev, head);
- prev->next = head->next;
- consume_skb(head);
- head = prev;
+ goto err;
}
/* We have to remove fragment header from datagram and to relocate
* header in order to calculate ICV correctly. */
- skb_network_header(head)[fq->nhoffset] = skb_transport_header(head)[0];
- memmove(head->head + sizeof(struct frag_hdr), head->head,
- (head->data - head->head) - sizeof(struct frag_hdr));
- head->mac_header += sizeof(struct frag_hdr);
- head->network_header += sizeof(struct frag_hdr);
+ skb_network_header(skb)[fq->nhoffset] = skb_transport_header(skb)[0];
+ memmove(skb->head + sizeof(struct frag_hdr), skb->head,
+ (skb->data - skb->head) - sizeof(struct frag_hdr));
+ skb->mac_header += sizeof(struct frag_hdr);
+ skb->network_header += sizeof(struct frag_hdr);
- skb_shinfo(head)->frag_list = head->next;
- skb_reset_transport_header(head);
- skb_push(head, head->data - skb_network_header(head));
+ skb_reset_transport_header(skb);
- for (fp = head->next; fp; fp = fp->next) {
- head->data_len += fp->len;
- head->len += fp->len;
- if (head->ip_summed != fp->ip_summed)
- head->ip_summed = CHECKSUM_NONE;
- else if (head->ip_summed == CHECKSUM_COMPLETE)
- head->csum = csum_add(head->csum, fp->csum);
- head->truesize += fp->truesize;
- fp->sk = NULL;
- }
- sub_frag_mem_limit(fq->q.net, head->truesize);
+ inet_frag_reasm_finish(&fq->q, skb, reasm_data);
- head->ignore_df = 1;
- head->next = NULL;
- head->dev = dev;
- head->tstamp = fq->q.stamp;
- ipv6_hdr(head)->payload_len = htons(payload_len);
- ipv6_change_dsfield(ipv6_hdr(head), 0xff, ecn);
- IP6CB(head)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
+ skb->ignore_df = 1;
+ skb->dev = dev;
+ ipv6_hdr(skb)->payload_len = htons(payload_len);
+ ipv6_change_dsfield(ipv6_hdr(skb), 0xff, ecn);
+ IP6CB(skb)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
/* Yes, and fold redundant checksum back. 8) */
- if (head->ip_summed == CHECKSUM_COMPLETE)
- head->csum = csum_partial(skb_network_header(head),
- skb_network_header_len(head),
- head->csum);
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_partial(skb_network_header(skb),
+ skb_network_header_len(skb),
+ skb->csum);
fq->q.fragments = NULL;
fq->q.rb_fragments = RB_ROOT;
fq->q.fragments_tail = NULL;
+ fq->q.last_run_head = NULL;
- return true;
+ return 0;
+
+err:
+ inet_frag_kill(&fq->q);
+ return -EINVAL;
}
/*
@@ -543,7 +442,6 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int *prevhoff, int *fhoff)
int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
{
u16 savethdr = skb->transport_header;
- struct net_device *dev = skb->dev;
int fhoff, nhoff, ret;
struct frag_hdr *fhdr;
struct frag_queue *fq;
@@ -566,10 +464,6 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
hdr = ipv6_hdr(skb);
fhdr = (struct frag_hdr *)skb_transport_header(skb);
- if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU &&
- fhdr->frag_off & htons(IP6_MF))
- return -EINVAL;
-
skb_orphan(skb);
fq = fq_find(net, fhdr->identification, user, hdr,
skb->dev ? skb->dev->ifindex : 0);
@@ -581,24 +475,17 @@ int nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user)
spin_lock_bh(&fq->q.lock);
ret = nf_ct_frag6_queue(fq, skb, fhdr, nhoff);
- if (ret < 0) {
- if (ret == -EPROTO) {
- skb->transport_header = savethdr;
- ret = 0;
- }
- goto out_unlock;
+ if (ret == -EPROTO) {
+ skb->transport_header = savethdr;
+ ret = 0;
}
/* after queue has assumed skb ownership, only 0 or -EINPROGRESS
* must be returned.
*/
- ret = -EINPROGRESS;
- if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
- fq->q.meat == fq->q.len &&
- nf_ct_frag6_reasm(fq, skb, dev))
- ret = 0;
+ if (ret)
+ ret = -EINPROGRESS;
-out_unlock:
spin_unlock_bh(&fq->q.lock);
inet_frag_put(&fq->q);
return ret;
@@ -634,16 +521,24 @@ static struct pernet_operations nf_ct_net_ops = {
.exit = nf_ct_net_exit,
};
+static const struct rhashtable_params nfct_rhash_params = {
+ .head_offset = offsetof(struct inet_frag_queue, node),
+ .hashfn = ip6frag_key_hashfn,
+ .obj_hashfn = ip6frag_obj_hashfn,
+ .obj_cmpfn = ip6frag_obj_cmpfn,
+ .automatic_shrinking = true,
+};
+
int nf_ct_frag6_init(void)
{
int ret = 0;
- nf_frags.constructor = ip6_frag_init;
+ nf_frags.constructor = ip6frag_init;
nf_frags.destructor = NULL;
nf_frags.qsize = sizeof(struct frag_queue);
nf_frags.frag_expire = nf_ct_frag6_expire;
nf_frags.frags_cache_name = nf_frags_cache_name;
- nf_frags.rhash_params = ip6_rhash_params;
+ nf_frags.rhash_params = nfct_rhash_params;
ret = inet_frags_init(&nf_frags);
if (ret)
goto out;
diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
index f06b047..c4070e9 100644
--- a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
+++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
@@ -14,8 +14,7 @@
#include <linux/skbuff.h>
#include <linux/icmp.h>
#include <linux/sysctl.h>
-#include <net/ipv6.h>
-#include <net/inet_frag.h>
+#include <net/ipv6_frag.h>
#include <linux/netfilter_ipv6.h>
#include <linux/netfilter_bridge.h>
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 74ffbcb..4aed9c45 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -57,18 +57,11 @@
#include <net/rawv6.h>
#include <net/ndisc.h>
#include <net/addrconf.h>
-#include <net/inet_frag.h>
+#include <net/ipv6_frag.h>
#include <net/inet_ecn.h>
static const char ip6_frag_cache_name[] = "ip6-frags";
-struct ip6frag_skb_cb {
- struct inet6_skb_parm h;
- int offset;
-};
-
-#define FRAG6_CB(skb) ((struct ip6frag_skb_cb *)((skb)->cb))
-
static u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
{
return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK);
@@ -76,63 +69,8 @@ static u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
static struct inet_frags ip6_frags;
-static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
- struct net_device *dev);
-
-void ip6_frag_init(struct inet_frag_queue *q, const void *a)
-{
- struct frag_queue *fq = container_of(q, struct frag_queue, q);
- const struct frag_v6_compare_key *key = a;
-
- q->key.v6 = *key;
- fq->ecn = 0;
-}
-EXPORT_SYMBOL(ip6_frag_init);
-
-void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq)
-{
- struct net_device *dev = NULL;
- struct sk_buff *head;
-
- rcu_read_lock();
- spin_lock(&fq->q.lock);
-
- if (fq->q.flags & INET_FRAG_COMPLETE)
- goto out;
-
- inet_frag_kill(&fq->q);
-
- dev = dev_get_by_index_rcu(net, fq->iif);
- if (!dev)
- goto out;
-
- __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS);
- __IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT);
-
- /* Don't send error if the first segment did not arrive. */
- head = fq->q.fragments;
- if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head)
- goto out;
-
- /* But use as source device on which LAST ARRIVED
- * segment was received. And do not use fq->dev
- * pointer directly, device might already disappeared.
- */
- head->dev = dev;
- skb_get(head);
- spin_unlock(&fq->q.lock);
-
- icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0);
- kfree_skb(head);
- goto out_rcu_unlock;
-
-out:
- spin_unlock(&fq->q.lock);
-out_rcu_unlock:
- rcu_read_unlock();
- inet_frag_put(&fq->q);
-}
-EXPORT_SYMBOL(ip6_expire_frag_queue);
+static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb,
+ struct sk_buff *prev_tail, struct net_device *dev);
static void ip6_frag_expire(unsigned long data)
{
@@ -142,7 +80,7 @@ static void ip6_frag_expire(unsigned long data)
fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
net = container_of(fq->q.net, struct net, ipv6.frags);
- ip6_expire_frag_queue(net, fq);
+ ip6frag_expire_frag_queue(net, fq);
}
static struct frag_queue *
@@ -169,27 +107,29 @@ fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif)
}
static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
- struct frag_hdr *fhdr, int nhoff)
+ struct frag_hdr *fhdr, int nhoff,
+ u32 *prob_offset)
{
- struct sk_buff *prev, *next;
- struct net_device *dev;
- int offset, end;
struct net *net = dev_net(skb_dst(skb)->dev);
+ int offset, end, fragsize;
+ struct sk_buff *prev_tail;
+ struct net_device *dev;
+ int err = -ENOENT;
u8 ecn;
if (fq->q.flags & INET_FRAG_COMPLETE)
goto err;
+ err = -EINVAL;
offset = ntohs(fhdr->frag_off) & ~0x7;
end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
if ((unsigned int)end > IPV6_MAXPLEN) {
- __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
- IPSTATS_MIB_INHDRERRORS);
- icmpv6_param_prob(skb, ICMPV6_HDR_FIELD,
- ((u8 *)&fhdr->frag_off -
- skb_network_header(skb)));
+ *prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
+ /* note that if prob_offset is set, the skb is freed elsewhere,
+ * we do not free it here.
+ */
return -1;
}
@@ -209,7 +149,7 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
*/
if (end < fq->q.len ||
((fq->q.flags & INET_FRAG_LAST_IN) && end != fq->q.len))
- goto err;
+ goto discard_fq;
fq->q.flags |= INET_FRAG_LAST_IN;
fq->q.len = end;
} else {
@@ -220,84 +160,51 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
/* RFC2460 says always send parameter problem in
* this case. -DaveM
*/
- __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
- IPSTATS_MIB_INHDRERRORS);
- icmpv6_param_prob(skb, ICMPV6_HDR_FIELD,
- offsetof(struct ipv6hdr, payload_len));
+ *prob_offset = offsetof(struct ipv6hdr, payload_len);
return -1;
}
if (end > fq->q.len) {
/* Some bits beyond end -> corruption. */
if (fq->q.flags & INET_FRAG_LAST_IN)
- goto err;
+ goto discard_fq;
fq->q.len = end;
}
}
if (end == offset)
- goto err;
+ goto discard_fq;
+ err = -ENOMEM;
/* Point into the IP datagram 'data' part. */
if (!pskb_pull(skb, (u8 *) (fhdr + 1) - skb->data))
- goto err;
-
- if (pskb_trim_rcsum(skb, end - offset))
- goto err;
-
- /* Find out which fragments are in front and at the back of us
- * in the chain of fragments so far. We must know where to put
- * this fragment, right?
- */
- prev = fq->q.fragments_tail;
- if (!prev || FRAG6_CB(prev)->offset < offset) {
- next = NULL;
- goto found;
- }
- prev = NULL;
- for (next = fq->q.fragments; next != NULL; next = next->next) {
- if (FRAG6_CB(next)->offset >= offset)
- break; /* bingo! */
- prev = next;
- }
-
-found:
- /* RFC5722, Section 4, amended by Errata ID : 3089
- * When reassembling an IPv6 datagram, if
- * one or more its constituent fragments is determined to be an
- * overlapping fragment, the entire datagram (and any constituent
- * fragments) MUST be silently discarded.
- */
-
- /* Check for overlap with preceding fragment. */
- if (prev &&
- (FRAG6_CB(prev)->offset + prev->len) > offset)
goto discard_fq;
- /* Look for overlap with succeeding segment. */
- if (next && FRAG6_CB(next)->offset < end)
+ err = pskb_trim_rcsum(skb, end - offset);
+ if (err)
goto discard_fq;
- FRAG6_CB(skb)->offset = offset;
-
- /* Insert this fragment in the chain of fragments. */
- skb->next = next;
- if (!next)
- fq->q.fragments_tail = skb;
- if (prev)
- prev->next = skb;
- else
- fq->q.fragments = skb;
-
+ /* Note : skb->rbnode and skb->dev share the same location. */
dev = skb->dev;
- if (dev) {
+ /* Makes sure compiler wont do silly aliasing games */
+ barrier();
+
+ prev_tail = fq->q.fragments_tail;
+ err = inet_frag_queue_insert(&fq->q, skb, offset, end);
+ if (err)
+ goto insert_error;
+
+ if (dev)
fq->iif = dev->ifindex;
- skb->dev = NULL;
- }
+
fq->q.stamp = skb->tstamp;
fq->q.meat += skb->len;
fq->ecn |= ecn;
add_frag_mem_limit(fq->q.net, skb->truesize);
+ fragsize = -skb_network_offset(skb) + skb->len;
+ if (fragsize > fq->q.max_size)
+ fq->q.max_size = fragsize;
+
/* The first fragment.
* nhoffset is obtained from the first fragment, of course.
*/
@@ -308,44 +215,48 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
if (fq->q.flags == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
fq->q.meat == fq->q.len) {
- int res;
unsigned long orefdst = skb->_skb_refdst;
skb->_skb_refdst = 0UL;
- res = ip6_frag_reasm(fq, prev, dev);
+ err = ip6_frag_reasm(fq, skb, prev_tail, dev);
skb->_skb_refdst = orefdst;
- return res;
+ return err;
}
skb_dst_drop(skb);
- return -1;
+ return -EINPROGRESS;
+insert_error:
+ if (err == IPFRAG_DUP) {
+ kfree_skb(skb);
+ return -EINVAL;
+ }
+ err = -EINVAL;
+ __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
+ IPSTATS_MIB_REASM_OVERLAPS);
discard_fq:
inet_frag_kill(&fq->q);
-err:
__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
IPSTATS_MIB_REASMFAILS);
+err:
kfree_skb(skb);
- return -1;
+ return err;
}
/*
* Check if this packet is complete.
- * Returns NULL on failure by any reason, and pointer
- * to current nexthdr field in reassembled frame.
*
* It is called with locked fq, and caller must check that
* queue is eligible for reassembly i.e. it is not COMPLETE,
* the last and the first frames arrived and all the bits are here.
*/
-static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
- struct net_device *dev)
+static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb,
+ struct sk_buff *prev_tail, struct net_device *dev)
{
struct net *net = container_of(fq->q.net, struct net, ipv6.frags);
- struct sk_buff *fp, *head = fq->q.fragments;
- int payload_len;
unsigned int nhoff;
- int sum_truesize;
+ void *reasm_data;
+ int payload_len;
u8 ecn;
inet_frag_kill(&fq->q);
@@ -354,113 +265,40 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
if (unlikely(ecn == 0xff))
goto out_fail;
- /* Make the one we just received the head. */
- if (prev) {
- head = prev->next;
- fp = skb_clone(head, GFP_ATOMIC);
+ reasm_data = inet_frag_reasm_prepare(&fq->q, skb, prev_tail);
+ if (!reasm_data)
+ goto out_oom;
- if (!fp)
- goto out_oom;
-
- fp->next = head->next;
- if (!fp->next)
- fq->q.fragments_tail = fp;
- prev->next = fp;
-
- skb_morph(head, fq->q.fragments);
- head->next = fq->q.fragments->next;
-
- consume_skb(fq->q.fragments);
- fq->q.fragments = head;
- }
-
- WARN_ON(head == NULL);
- WARN_ON(FRAG6_CB(head)->offset != 0);
-
- /* Unfragmented part is taken from the first segment. */
- payload_len = ((head->data - skb_network_header(head)) -
+ payload_len = ((skb->data - skb_network_header(skb)) -
sizeof(struct ipv6hdr) + fq->q.len -
sizeof(struct frag_hdr));
if (payload_len > IPV6_MAXPLEN)
goto out_oversize;
- /* Head of list must not be cloned. */
- if (skb_unclone(head, GFP_ATOMIC))
- goto out_oom;
-
- /* If the first fragment is fragmented itself, we split
- * it to two chunks: the first with data and paged part
- * and the second, holding only fragments. */
- if (skb_has_frag_list(head)) {
- struct sk_buff *clone;
- int i, plen = 0;
-
- clone = alloc_skb(0, GFP_ATOMIC);
- if (!clone)
- goto out_oom;
- clone->next = head->next;
- head->next = clone;
- skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
- skb_frag_list_init(head);
- for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
- plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
- clone->len = clone->data_len = head->data_len - plen;
- head->data_len -= clone->len;
- head->len -= clone->len;
- clone->csum = 0;
- clone->ip_summed = head->ip_summed;
- add_frag_mem_limit(fq->q.net, clone->truesize);
- }
-
/* We have to remove fragment header from datagram and to relocate
* header in order to calculate ICV correctly. */
nhoff = fq->nhoffset;
- skb_network_header(head)[nhoff] = skb_transport_header(head)[0];
- memmove(head->head + sizeof(struct frag_hdr), head->head,
- (head->data - head->head) - sizeof(struct frag_hdr));
- if (skb_mac_header_was_set(head))
- head->mac_header += sizeof(struct frag_hdr);
- head->network_header += sizeof(struct frag_hdr);
+ skb_network_header(skb)[nhoff] = skb_transport_header(skb)[0];
+ memmove(skb->head + sizeof(struct frag_hdr), skb->head,
+ (skb->data - skb->head) - sizeof(struct frag_hdr));
+ if (skb_mac_header_was_set(skb))
+ skb->mac_header += sizeof(struct frag_hdr);
+ skb->network_header += sizeof(struct frag_hdr);
- skb_reset_transport_header(head);
- skb_push(head, head->data - skb_network_header(head));
+ skb_reset_transport_header(skb);
- sum_truesize = head->truesize;
- for (fp = head->next; fp;) {
- bool headstolen;
- int delta;
- struct sk_buff *next = fp->next;
+ inet_frag_reasm_finish(&fq->q, skb, reasm_data);
- sum_truesize += fp->truesize;
- if (head->ip_summed != fp->ip_summed)
- head->ip_summed = CHECKSUM_NONE;
- else if (head->ip_summed == CHECKSUM_COMPLETE)
- head->csum = csum_add(head->csum, fp->csum);
-
- if (skb_try_coalesce(head, fp, &headstolen, &delta)) {
- kfree_skb_partial(fp, headstolen);
- } else {
- if (!skb_shinfo(head)->frag_list)
- skb_shinfo(head)->frag_list = fp;
- head->data_len += fp->len;
- head->len += fp->len;
- head->truesize += fp->truesize;
- }
- fp = next;
- }
- sub_frag_mem_limit(fq->q.net, sum_truesize);
-
- head->next = NULL;
- head->dev = dev;
- head->tstamp = fq->q.stamp;
- ipv6_hdr(head)->payload_len = htons(payload_len);
- ipv6_change_dsfield(ipv6_hdr(head), 0xff, ecn);
- IP6CB(head)->nhoff = nhoff;
- IP6CB(head)->flags |= IP6SKB_FRAGMENTED;
+ skb->dev = dev;
+ ipv6_hdr(skb)->payload_len = htons(payload_len);
+ ipv6_change_dsfield(ipv6_hdr(skb), 0xff, ecn);
+ IP6CB(skb)->nhoff = nhoff;
+ IP6CB(skb)->flags |= IP6SKB_FRAGMENTED;
+ IP6CB(skb)->frag_max_size = fq->q.max_size;
/* Yes, and fold redundant checksum back. 8) */
- skb_postpush_rcsum(head, skb_network_header(head),
- skb_network_header_len(head));
+ skb_postpush_rcsum(skb, skb_network_header(skb),
+ skb_network_header_len(skb));
rcu_read_lock();
__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMOKS);
@@ -468,6 +306,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
fq->q.fragments = NULL;
fq->q.rb_fragments = RB_ROOT;
fq->q.fragments_tail = NULL;
+ fq->q.last_run_head = NULL;
return 1;
out_oversize:
@@ -479,6 +318,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
rcu_read_lock();
__IP6_INC_STATS(net, __in6_dev_get(dev), IPSTATS_MIB_REASMFAILS);
rcu_read_unlock();
+ inet_frag_kill(&fq->q);
return -1;
}
@@ -517,22 +357,26 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
return 1;
}
- if (skb->len - skb_network_offset(skb) < IPV6_MIN_MTU &&
- fhdr->frag_off & htons(IP6_MF))
- goto fail_hdr;
-
iif = skb->dev ? skb->dev->ifindex : 0;
fq = fq_find(net, fhdr->identification, hdr, iif);
if (fq) {
+ u32 prob_offset = 0;
int ret;
spin_lock(&fq->q.lock);
fq->iif = iif;
- ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff);
+ ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff,
+ &prob_offset);
spin_unlock(&fq->q.lock);
inet_frag_put(&fq->q);
+ if (prob_offset) {
+ __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
+ IPSTATS_MIB_INHDRERRORS);
+ /* icmpv6_param_prob() calls kfree_skb(skb) */
+ icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, prob_offset);
+ }
return ret;
}
@@ -700,42 +544,19 @@ static struct pernet_operations ip6_frags_ops = {
.exit = ipv6_frags_exit_net,
};
-static u32 ip6_key_hashfn(const void *data, u32 len, u32 seed)
-{
- return jhash2(data,
- sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
-}
-
-static u32 ip6_obj_hashfn(const void *data, u32 len, u32 seed)
-{
- const struct inet_frag_queue *fq = data;
-
- return jhash2((const u32 *)&fq->key.v6,
- sizeof(struct frag_v6_compare_key) / sizeof(u32), seed);
-}
-
-static int ip6_obj_cmpfn(struct rhashtable_compare_arg *arg, const void *ptr)
-{
- const struct frag_v6_compare_key *key = arg->key;
- const struct inet_frag_queue *fq = ptr;
-
- return !!memcmp(&fq->key, key, sizeof(*key));
-}
-
-const struct rhashtable_params ip6_rhash_params = {
+static const struct rhashtable_params ip6_rhash_params = {
.head_offset = offsetof(struct inet_frag_queue, node),
- .hashfn = ip6_key_hashfn,
- .obj_hashfn = ip6_obj_hashfn,
- .obj_cmpfn = ip6_obj_cmpfn,
+ .hashfn = ip6frag_key_hashfn,
+ .obj_hashfn = ip6frag_obj_hashfn,
+ .obj_cmpfn = ip6frag_obj_cmpfn,
.automatic_shrinking = true,
};
-EXPORT_SYMBOL(ip6_rhash_params);
int __init ipv6_frag_init(void)
{
int ret;
- ip6_frags.constructor = ip6_frag_init;
+ ip6_frags.constructor = ip6frag_init;
ip6_frags.destructor = NULL;
ip6_frags.qsize = sizeof(struct frag_queue);
ip6_frags.frag_expire = ip6_frag_expire;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 68c860d..4f808aa 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3177,7 +3177,7 @@ static int rt6_fill_node(struct net *net,
table = rt->rt6i_table->tb6_id;
else
table = RT6_TABLE_UNSPEC;
- rtm->rtm_table = table;
+ rtm->rtm_table = table < 256 ? table : RT_TABLE_COMPAT;
if (nla_put_u32(skb, RTA_TABLE, table))
goto nla_put_failure;
if (rt->rt6i_flags & RTF_REJECT) {
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 75de3dd..be74eee 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -661,6 +661,10 @@ static int ipip6_rcv(struct sk_buff *skb)
!net_eq(tunnel->net, dev_net(tunnel->dev))))
goto out;
+ /* skb can be uncloned in iptunnel_pull_header, so
+ * old iph is no longer valid
+ */
+ iph = (const struct iphdr *)skb_mac_header(skb);
err = IP_ECN_decapsulate(iph, skb);
if (unlikely(err)) {
if (log_ecn_error)
@@ -767,8 +771,9 @@ static bool check_6rd(struct ip_tunnel *tunnel, const struct in6_addr *v6dst,
pbw0 = tunnel->ip6rd.prefixlen >> 5;
pbi0 = tunnel->ip6rd.prefixlen & 0x1f;
- d = (ntohl(v6dst->s6_addr32[pbw0]) << pbi0) >>
- tunnel->ip6rd.relay_prefixlen;
+ d = tunnel->ip6rd.relay_prefixlen < 32 ?
+ (ntohl(v6dst->s6_addr32[pbw0]) << pbi0) >>
+ tunnel->ip6rd.relay_prefixlen : 0;
pbi1 = pbi0 - tunnel->ip6rd.relay_prefixlen;
if (pbi1 > 0)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index ed5d635..3201aa2 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1062,11 +1062,11 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
newnp->ipv6_fl_list = NULL;
newnp->pktoptions = NULL;
newnp->opt = NULL;
- newnp->mcast_oif = tcp_v6_iif(skb);
- newnp->mcast_hops = ipv6_hdr(skb)->hop_limit;
- newnp->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(skb));
+ newnp->mcast_oif = inet_iif(skb);
+ newnp->mcast_hops = ip_hdr(skb)->ttl;
+ newnp->rcv_flowinfo = 0;
if (np->repflow)
- newnp->flow_label = ip6_flowlabel(ipv6_hdr(skb));
+ newnp->flow_label = 0;
/*
* No need to charge this sock to the relevant IPv6 refcnt debug socks count
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 553d0ad..2f3cd09 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -2058,14 +2058,14 @@ static int __init kcm_init(void)
if (err)
goto fail;
- err = sock_register(&kcm_family_ops);
- if (err)
- goto sock_register_fail;
-
err = register_pernet_device(&kcm_net_ops);
if (err)
goto net_ops_fail;
+ err = sock_register(&kcm_family_ops);
+ if (err)
+ goto sock_register_fail;
+
err = kcm_proc_init();
if (err)
goto proc_init_fail;
@@ -2073,12 +2073,12 @@ static int __init kcm_init(void)
return 0;
proc_init_fail:
- unregister_pernet_device(&kcm_net_ops);
-
-net_ops_fail:
sock_unregister(PF_KCM);
sock_register_fail:
+ unregister_pernet_device(&kcm_net_ops);
+
+net_ops_fail:
proto_unregister(&kcm_proto);
fail:
@@ -2094,8 +2094,8 @@ static int __init kcm_init(void)
static void __exit kcm_exit(void)
{
kcm_proc_exit();
- unregister_pernet_device(&kcm_net_ops);
sock_unregister(PF_KCM);
+ unregister_pernet_device(&kcm_net_ops);
proto_unregister(&kcm_proto);
destroy_workqueue(kcm_wq);
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index c4b03d7..a007f11 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -680,9 +680,6 @@ static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
if (flags & MSG_OOB)
goto out;
- if (addr_len)
- *addr_len = sizeof(*lsa);
-
if (flags & MSG_ERRQUEUE)
return ipv6_recv_error(sk, msg, len, addr_len);
@@ -712,6 +709,7 @@ static int l2tp_ip6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
lsa->l2tp_conn_id = 0;
if (ipv6_addr_type(&lsa->l2tp_addr) & IPV6_ADDR_LINKLOCAL)
lsa->l2tp_scope_id = inet6_iif(skb);
+ *addr_len = sizeof(*lsa);
}
if (np->rxopt.all)
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 49c8a9c..1a169de 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -1163,6 +1163,9 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local,
{
struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif);
+ if (local->in_reconfig)
+ return;
+
if (!check_sdata_in_driver(sdata))
return;
diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c
index bb33598..ec247d83 100644
--- a/net/netfilter/xt_physdev.c
+++ b/net/netfilter/xt_physdev.c
@@ -96,8 +96,7 @@ physdev_mt(const struct sk_buff *skb, struct xt_action_param *par)
static int physdev_mt_check(const struct xt_mtchk_param *par)
{
const struct xt_physdev_info *info = par->matchinfo;
-
- br_netfilter_enable();
+ static bool brnf_probed __read_mostly;
if (!(info->bitmask & XT_PHYSDEV_OP_MASK) ||
info->bitmask & ~XT_PHYSDEV_OP_MASK)
@@ -113,6 +112,12 @@ static int physdev_mt_check(const struct xt_mtchk_param *par)
if (par->hook_mask & (1 << NF_INET_LOCAL_OUT))
return -EINVAL;
}
+
+ if (!brnf_probed) {
+ brnf_probed = true;
+ request_module("br_netfilter");
+ }
+
return 0;
}
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index f135814..02d6f38 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -23,6 +23,7 @@
#include <net/netfilter/nf_conntrack_seqadj.h>
#include <net/netfilter/nf_conntrack_zones.h>
#include <net/netfilter/ipv6/nf_defrag_ipv6.h>
+#include <net/ipv6_frag.h>
#ifdef CONFIG_NF_NAT_NEEDED
#include <linux/netfilter/nf_nat.h>
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 3bd4d5d..50ea761 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -1853,14 +1853,14 @@ static struct nlattr *reserve_sfa_size(struct sw_flow_actions **sfa,
struct sw_flow_actions *acts;
int new_acts_size;
- int req_size = NLA_ALIGN(attr_len);
+ size_t req_size = NLA_ALIGN(attr_len);
int next_offset = offsetof(struct sw_flow_actions, actions) +
(*sfa)->actions_len;
if (req_size <= (ksize(*sfa) - next_offset))
goto out;
- new_acts_size = ksize(*sfa) * 2;
+ new_acts_size = max(next_offset + req_size, ksize(*sfa) * 2);
if (new_acts_size > MAX_ACTIONS_BUFSIZE) {
if ((MAX_ACTIONS_BUFSIZE - next_offset) < req_size) {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 6542eff..5903163 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3278,7 +3278,7 @@ static int packet_create(struct net *net, struct socket *sock, int protocol,
}
mutex_lock(&net->packet.sklist_lock);
- sk_add_node_rcu(sk, &net->packet.sklist);
+ sk_add_node_tail_rcu(sk, &net->packet.sklist);
mutex_unlock(&net->packet.sklist_lock);
preempt_disable();
@@ -4229,7 +4229,7 @@ static struct pgv *alloc_pg_vec(struct tpacket_req *req, int order)
struct pgv *pg_vec;
int i;
- pg_vec = kcalloc(block_nr, sizeof(struct pgv), GFP_KERNEL);
+ pg_vec = kcalloc(block_nr, sizeof(struct pgv), GFP_KERNEL | __GFP_NOWARN);
if (unlikely(!pg_vec))
goto out;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 850a86c..f6aa532 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -131,7 +131,7 @@ static int pep_indicate(struct sock *sk, u8 id, u8 code,
ph->utid = 0;
ph->message_id = id;
ph->pipe_handle = pn->pipe_handle;
- ph->data[0] = code;
+ ph->error_code = code;
return pn_skb_send(sk, skb, NULL);
}
@@ -152,7 +152,7 @@ static int pipe_handler_request(struct sock *sk, u8 id, u8 code,
ph->utid = id; /* whatever */
ph->message_id = id;
ph->pipe_handle = pn->pipe_handle;
- ph->data[0] = code;
+ ph->error_code = code;
return pn_skb_send(sk, skb, NULL);
}
@@ -207,7 +207,7 @@ static int pep_ctrlreq_error(struct sock *sk, struct sk_buff *oskb, u8 code,
struct pnpipehdr *ph;
struct sockaddr_pn dst;
u8 data[4] = {
- oph->data[0], /* PEP type */
+ oph->pep_type, /* PEP type */
code, /* error code, at an unusual offset */
PAD, PAD,
};
@@ -220,7 +220,7 @@ static int pep_ctrlreq_error(struct sock *sk, struct sk_buff *oskb, u8 code,
ph->utid = oph->utid;
ph->message_id = PNS_PEP_CTRL_RESP;
ph->pipe_handle = oph->pipe_handle;
- ph->data[0] = oph->data[1]; /* CTRL id */
+ ph->data0 = oph->data[0]; /* CTRL id */
pn_skb_get_src_sockaddr(oskb, &dst);
return pn_skb_send(sk, skb, &dst);
@@ -271,17 +271,17 @@ static int pipe_rcv_status(struct sock *sk, struct sk_buff *skb)
return -EINVAL;
hdr = pnp_hdr(skb);
- if (hdr->data[0] != PN_PEP_TYPE_COMMON) {
+ if (hdr->pep_type != PN_PEP_TYPE_COMMON) {
net_dbg_ratelimited("Phonet unknown PEP type: %u\n",
- (unsigned int)hdr->data[0]);
+ (unsigned int)hdr->pep_type);
return -EOPNOTSUPP;
}
- switch (hdr->data[1]) {
+ switch (hdr->data[0]) {
case PN_PEP_IND_FLOW_CONTROL:
switch (pn->tx_fc) {
case PN_LEGACY_FLOW_CONTROL:
- switch (hdr->data[4]) {
+ switch (hdr->data[3]) {
case PEP_IND_BUSY:
atomic_set(&pn->tx_credits, 0);
break;
@@ -291,7 +291,7 @@ static int pipe_rcv_status(struct sock *sk, struct sk_buff *skb)
}
break;
case PN_ONE_CREDIT_FLOW_CONTROL:
- if (hdr->data[4] == PEP_IND_READY)
+ if (hdr->data[3] == PEP_IND_READY)
atomic_set(&pn->tx_credits, wake = 1);
break;
}
@@ -300,12 +300,12 @@ static int pipe_rcv_status(struct sock *sk, struct sk_buff *skb)
case PN_PEP_IND_ID_MCFC_GRANT_CREDITS:
if (pn->tx_fc != PN_MULTI_CREDIT_FLOW_CONTROL)
break;
- atomic_add(wake = hdr->data[4], &pn->tx_credits);
+ atomic_add(wake = hdr->data[3], &pn->tx_credits);
break;
default:
net_dbg_ratelimited("Phonet unknown PEP indication: %u\n",
- (unsigned int)hdr->data[1]);
+ (unsigned int)hdr->data[0]);
return -EOPNOTSUPP;
}
if (wake)
@@ -317,7 +317,7 @@ static int pipe_rcv_created(struct sock *sk, struct sk_buff *skb)
{
struct pep_sock *pn = pep_sk(sk);
struct pnpipehdr *hdr = pnp_hdr(skb);
- u8 n_sb = hdr->data[0];
+ u8 n_sb = hdr->data0;
pn->rx_fc = pn->tx_fc = PN_LEGACY_FLOW_CONTROL;
__skb_pull(skb, sizeof(*hdr));
@@ -505,7 +505,7 @@ static int pep_connresp_rcv(struct sock *sk, struct sk_buff *skb)
return -ECONNREFUSED;
/* Parse sub-blocks */
- n_sb = hdr->data[4];
+ n_sb = hdr->data[3];
while (n_sb > 0) {
u8 type, buf[6], len = sizeof(buf);
const u8 *data = pep_get_sb(skb, &type, &len, buf);
@@ -738,7 +738,7 @@ static int pipe_do_remove(struct sock *sk)
ph->utid = 0;
ph->message_id = PNS_PIPE_REMOVE_REQ;
ph->pipe_handle = pn->pipe_handle;
- ph->data[0] = PAD;
+ ph->data0 = PAD;
return pn_skb_send(sk, skb, NULL);
}
@@ -815,7 +815,7 @@ static struct sock *pep_sock_accept(struct sock *sk, int flags, int *errp)
peer_type = hdr->other_pep_type << 8;
/* Parse sub-blocks (options) */
- n_sb = hdr->data[4];
+ n_sb = hdr->data[3];
while (n_sb > 0) {
u8 type, buf[1], len = sizeof(buf);
const u8 *data = pep_get_sb(skb, &type, &len, buf);
@@ -1106,7 +1106,7 @@ static int pipe_skb_send(struct sock *sk, struct sk_buff *skb)
ph->utid = 0;
if (pn->aligned) {
ph->message_id = PNS_PIPE_ALIGNED_DATA;
- ph->data[0] = 0; /* padding */
+ ph->data0 = 0; /* padding */
} else
ph->message_id = PNS_PIPE_DATA;
ph->pipe_handle = pn->pipe_handle;
diff --git a/net/rds/ib_fmr.c b/net/rds/ib_fmr.c
index 4fe8f4f..da84d6b 100644
--- a/net/rds/ib_fmr.c
+++ b/net/rds/ib_fmr.c
@@ -44,6 +44,17 @@ struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *rds_ibdev, int npages)
else
pool = rds_ibdev->mr_1m_pool;
+ if (atomic_read(&pool->dirty_count) >= pool->max_items / 10)
+ queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
+
+ /* Switch pools if one of the pool is reaching upper limit */
+ if (atomic_read(&pool->dirty_count) >= pool->max_items * 9 / 10) {
+ if (pool->pool_type == RDS_IB_MR_8K_POOL)
+ pool = rds_ibdev->mr_1m_pool;
+ else
+ pool = rds_ibdev->mr_8k_pool;
+ }
+
ibmr = rds_ib_try_reuse_ibmr(pool);
if (ibmr)
return ibmr;
diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index 977f698..91b53d4 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -442,9 +442,6 @@ struct rds_ib_mr *rds_ib_try_reuse_ibmr(struct rds_ib_mr_pool *pool)
struct rds_ib_mr *ibmr = NULL;
int iter = 0;
- if (atomic_read(&pool->dirty_count) >= pool->max_items_soft / 10)
- queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
-
while (1) {
ibmr = rds_ib_reuse_mr(pool);
if (ibmr)
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index d36effb..2daba53 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -527,7 +527,7 @@ static void rds_tcp_kill_sock(struct net *net)
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
- if (net != c_net || !tc->t_sock)
+ if (net != c_net)
continue;
if (!list_has_conn(&tmp_list, tc->t_cpath->cp_conn)) {
list_move_tail(&tc->t_tcp_node, &tmp_list);
diff --git a/net/rose/rose_subr.c b/net/rose/rose_subr.c
index 7ca5774..7849f28 100644
--- a/net/rose/rose_subr.c
+++ b/net/rose/rose_subr.c
@@ -105,16 +105,17 @@ void rose_write_internal(struct sock *sk, int frametype)
struct sk_buff *skb;
unsigned char *dptr;
unsigned char lci1, lci2;
- char buffer[100];
- int len, faclen = 0;
+ int maxfaclen = 0;
+ int len, faclen;
+ int reserve;
- len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 1;
+ reserve = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + 1;
+ len = ROSE_MIN_LEN;
switch (frametype) {
case ROSE_CALL_REQUEST:
len += 1 + ROSE_ADDR_LEN + ROSE_ADDR_LEN;
- faclen = rose_create_facilities(buffer, rose);
- len += faclen;
+ maxfaclen = 256;
break;
case ROSE_CALL_ACCEPTED:
case ROSE_CLEAR_REQUEST:
@@ -123,15 +124,16 @@ void rose_write_internal(struct sock *sk, int frametype)
break;
}
- if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL)
+ skb = alloc_skb(reserve + len + maxfaclen, GFP_ATOMIC);
+ if (!skb)
return;
/*
* Space for AX.25 header and PID.
*/
- skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + 1);
+ skb_reserve(skb, reserve);
- dptr = skb_put(skb, skb_tailroom(skb));
+ dptr = skb_put(skb, len);
lci1 = (rose->lci >> 8) & 0x0F;
lci2 = (rose->lci >> 0) & 0xFF;
@@ -146,7 +148,8 @@ void rose_write_internal(struct sock *sk, int frametype)
dptr += ROSE_ADDR_LEN;
memcpy(dptr, &rose->source_addr, ROSE_ADDR_LEN);
dptr += ROSE_ADDR_LEN;
- memcpy(dptr, buffer, faclen);
+ faclen = rose_create_facilities(dptr, rose);
+ skb_put(skb, faclen);
dptr += faclen;
break;
diff --git a/net/rxrpc/conn_client.c b/net/rxrpc/conn_client.c
index 60ef960..0fce919 100644
--- a/net/rxrpc/conn_client.c
+++ b/net/rxrpc/conn_client.c
@@ -355,7 +355,7 @@ static int rxrpc_get_client_conn(struct rxrpc_call *call,
* normally have to take channel_lock but we do this before anyone else
* can see the connection.
*/
- list_add_tail(&call->chan_wait_link, &candidate->waiting_calls);
+ list_add(&call->chan_wait_link, &candidate->waiting_calls);
if (cp->exclusive) {
call->conn = candidate;
@@ -430,7 +430,7 @@ static int rxrpc_get_client_conn(struct rxrpc_call *call,
spin_lock(&conn->channel_lock);
call->conn = conn;
call->security_ix = conn->security_ix;
- list_add(&call->chan_wait_link, &conn->waiting_calls);
+ list_add_tail(&call->chan_wait_link, &conn->waiting_calls);
spin_unlock(&conn->channel_lock);
_leave(" = 0 [extant %d]", conn->debug_id);
return 0;
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index a309a07..909848f 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -63,6 +63,7 @@
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/sched.h>
+#include <linux/sched/loadavg.h>
#include <linux/string.h>
#include <linux/skbuff.h>
#include <linux/random.h>
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 8ea8217..d6af93a 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -600,6 +600,7 @@ static struct sock *sctp_v4_create_accept_sk(struct sock *sk,
static int sctp_v4_addr_to_user(struct sctp_sock *sp, union sctp_addr *addr)
{
/* No address mapping for V4 sockets */
+ memset(addr->v4.sin_zero, 0, sizeof(addr->v4.sin_zero));
return sizeof(struct sockaddr_in);
}
diff --git a/net/socket.c b/net/socket.c
index a523ac3..cff34b8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -615,6 +615,7 @@ static void __sock_release(struct socket *sock, struct inode *inode)
if (inode)
inode_lock(inode);
sock->ops->release(sock);
+ sock->sk = NULL;
if (inode)
inode_unlock(inode);
sock->ops = NULL;
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index cab50ec..cdcc0fe 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -54,6 +54,7 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail)
h->last_refresh = now;
}
+static inline int cache_is_valid(struct cache_head *h);
static void cache_fresh_locked(struct cache_head *head, time_t expiry,
struct cache_detail *detail);
static void cache_fresh_unlocked(struct cache_head *head,
@@ -100,6 +101,8 @@ struct cache_head *sunrpc_cache_lookup(struct cache_detail *detail,
if (cache_is_expired(detail, tmp)) {
hlist_del_init(&tmp->cache_list);
detail->entries --;
+ if (cache_is_valid(tmp) == -EAGAIN)
+ set_bit(CACHE_NEGATIVE, &tmp->flags);
cache_fresh_locked(tmp, 0, detail);
freeme = tmp;
break;
diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index d947b82..0cf9403 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -262,8 +262,14 @@ static int tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd,
if (msg->rep_type)
tipc_tlv_init(msg->rep, msg->rep_type);
- if (cmd->header)
- (*cmd->header)(msg);
+ if (cmd->header) {
+ err = (*cmd->header)(msg);
+ if (err) {
+ kfree_skb(msg->rep);
+ msg->rep = NULL;
+ return err;
+ }
+ }
arg = nlmsg_new(0, GFP_KERNEL);
if (!arg) {
@@ -388,7 +394,12 @@ static int tipc_nl_compat_bearer_enable(struct tipc_nl_compat_cmd_doit *cmd,
if (!bearer)
return -EMSGSIZE;
- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_bearer_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_BEARER_NAME);
if (!string_is_valid(b->name, len))
return -EINVAL;
@@ -757,7 +768,12 @@ static int tipc_nl_compat_link_set(struct tipc_nl_compat_cmd_doit *cmd,
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_link_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_LINK_NAME);
if (!string_is_valid(lc->name, len))
return -EINVAL;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a3df5e1..f0c2197 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -891,7 +891,7 @@ static int unix_autobind(struct socket *sock)
addr->hash ^= sk->sk_type;
__unix_remove_socket(sk);
- u->addr = addr;
+ smp_store_release(&u->addr, addr);
__unix_insert_socket(&unix_socket_table[addr->hash], sk);
spin_unlock(&unix_table_lock);
err = 0;
@@ -1061,7 +1061,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
err = 0;
__unix_remove_socket(sk);
- u->addr = addr;
+ smp_store_release(&u->addr, addr);
__unix_insert_socket(list, sk);
out_unlock:
@@ -1332,15 +1332,29 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
RCU_INIT_POINTER(newsk->sk_wq, &newu->peer_wq);
otheru = unix_sk(other);
- /* copy address information from listening to new sock*/
- if (otheru->addr) {
- atomic_inc(&otheru->addr->refcnt);
- newu->addr = otheru->addr;
- }
+ /* copy address information from listening to new sock
+ *
+ * The contents of *(otheru->addr) and otheru->path
+ * are seen fully set up here, since we have found
+ * otheru in hash under unix_table_lock. Insertion
+ * into the hash chain we'd found it in had been done
+ * in an earlier critical area protected by unix_table_lock,
+ * the same one where we'd set *(otheru->addr) contents,
+ * as well as otheru->path and otheru->addr itself.
+ *
+ * Using smp_store_release() here to set newu->addr
+ * is enough to make those stores, as well as stores
+ * to newu->path visible to anyone who gets newu->addr
+ * by smp_load_acquire(). IOW, the same warranties
+ * as for unix_sock instances bound in unix_bind() or
+ * in unix_autobind().
+ */
if (otheru->path.dentry) {
path_get(&otheru->path);
newu->path = otheru->path;
}
+ atomic_inc(&otheru->addr->refcnt);
+ smp_store_release(&newu->addr, otheru->addr);
/* Set credentials */
copy_peercred(sk, other);
@@ -1453,7 +1467,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags)
static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int *uaddr_len, int peer)
{
struct sock *sk = sock->sk;
- struct unix_sock *u;
+ struct unix_address *addr;
DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, uaddr);
int err = 0;
@@ -1468,19 +1482,15 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int *uaddr_
sock_hold(sk);
}
- u = unix_sk(sk);
- unix_state_lock(sk);
- if (!u->addr) {
+ addr = smp_load_acquire(&unix_sk(sk)->addr);
+ if (!addr) {
sunaddr->sun_family = AF_UNIX;
sunaddr->sun_path[0] = 0;
*uaddr_len = sizeof(short);
} else {
- struct unix_address *addr = u->addr;
-
*uaddr_len = addr->len;
memcpy(sunaddr, addr->name, *uaddr_len);
}
- unix_state_unlock(sk);
sock_put(sk);
out:
return err;
@@ -2094,11 +2104,11 @@ static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
static void unix_copy_addr(struct msghdr *msg, struct sock *sk)
{
- struct unix_sock *u = unix_sk(sk);
+ struct unix_address *addr = smp_load_acquire(&unix_sk(sk)->addr);
- if (u->addr) {
- msg->msg_namelen = u->addr->len;
- memcpy(msg->msg_name, u->addr->name, u->addr->len);
+ if (addr) {
+ msg->msg_namelen = addr->len;
+ memcpy(msg->msg_name, addr->name, addr->len);
}
}
@@ -2814,7 +2824,7 @@ static int unix_seq_show(struct seq_file *seq, void *v)
(s->sk_state == TCP_ESTABLISHED ? SS_CONNECTING : SS_DISCONNECTING),
sock_i_ino(s));
- if (u->addr) {
+ if (u->addr) { // under unix_table_lock here
int i, len;
seq_putc(seq, ' ');
diff --git a/net/unix/diag.c b/net/unix/diag.c
index 384c84e..3183d9b 100644
--- a/net/unix/diag.c
+++ b/net/unix/diag.c
@@ -10,7 +10,8 @@
static int sk_diag_dump_name(struct sock *sk, struct sk_buff *nlskb)
{
- struct unix_address *addr = unix_sk(sk)->addr;
+ /* might or might not have unix_table_lock */
+ struct unix_address *addr = smp_load_acquire(&unix_sk(sk)->addr);
if (!addr)
return 0;
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 9c07c76..cc4b4ab 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -601,6 +601,8 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
*/
static int virtio_transport_reset_no_sock(struct virtio_vsock_pkt *pkt)
{
+ const struct virtio_transport *t;
+ struct virtio_vsock_pkt *reply;
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_RST,
.type = le16_to_cpu(pkt->hdr.type),
@@ -611,15 +613,21 @@ static int virtio_transport_reset_no_sock(struct virtio_vsock_pkt *pkt)
if (le16_to_cpu(pkt->hdr.op) == VIRTIO_VSOCK_OP_RST)
return 0;
- pkt = virtio_transport_alloc_pkt(&info, 0,
- le64_to_cpu(pkt->hdr.dst_cid),
- le32_to_cpu(pkt->hdr.dst_port),
- le64_to_cpu(pkt->hdr.src_cid),
- le32_to_cpu(pkt->hdr.src_port));
- if (!pkt)
+ reply = virtio_transport_alloc_pkt(&info, 0,
+ le64_to_cpu(pkt->hdr.dst_cid),
+ le32_to_cpu(pkt->hdr.dst_port),
+ le64_to_cpu(pkt->hdr.src_cid),
+ le32_to_cpu(pkt->hdr.src_port));
+ if (!reply)
return -ENOMEM;
- return virtio_transport_get_ops()->send_pkt(pkt);
+ t = virtio_transport_get_ops();
+ if (!t) {
+ virtio_transport_free_pkt(reply);
+ return -ENOTCONN;
+ }
+
+ return t->send_pkt(reply);
}
static void virtio_transport_wait_close(struct sock *sk, long timeout)
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
old mode 100644
new mode 100755
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 0a7e5d9..770abab 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -678,8 +678,7 @@ static int x25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
struct sockaddr_x25 *addr = (struct sockaddr_x25 *)uaddr;
int len, i, rc = 0;
- if (!sock_flag(sk, SOCK_ZAPPED) ||
- addr_len != sizeof(struct sockaddr_x25) ||
+ if (addr_len != sizeof(struct sockaddr_x25) ||
addr->sx25_family != AF_X25) {
rc = -EINVAL;
goto out;
@@ -694,9 +693,13 @@ static int x25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
}
lock_sock(sk);
- x25_sk(sk)->source_addr = addr->sx25_addr;
- x25_insert_socket(sk);
- sock_reset_flag(sk, SOCK_ZAPPED);
+ if (sock_flag(sk, SOCK_ZAPPED)) {
+ x25_sk(sk)->source_addr = addr->sx25_addr;
+ x25_insert_socket(sk);
+ sock_reset_flag(sk, SOCK_ZAPPED);
+ } else {
+ rc = -EINVAL;
+ }
release_sock(sk);
SOCK_DEBUG(sk, "x25_bind: socket is bound\n");
out:
@@ -812,8 +815,13 @@ static int x25_connect(struct socket *sock, struct sockaddr *uaddr,
sock->state = SS_CONNECTED;
rc = 0;
out_put_neigh:
- if (rc)
+ if (rc) {
+ read_lock_bh(&x25_list_lock);
x25_neigh_put(x25->neighbour);
+ x25->neighbour = NULL;
+ read_unlock_bh(&x25_list_lock);
+ x25->state = X25_STATE_0;
+ }
out_put_route:
x25_route_put(rt);
out:
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 512cec5..f40431f 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -197,9 +197,7 @@
# ld-option
# Usage: LDFLAGS += $(call ld-option, -X)
-ld-option = $(call try-run,\
- $(CC) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -x c /dev/null -c -o "$$TMPO"; \
- $(LD) $(LDFLAGS) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
+ld-option = $(call try-run, $(LD) $(LDFLAGS) $(1) -v,$(1),$(2))
# ar-option
# Usage: KBUILD_ARFLAGS := $(call ar-option,D)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 409662f..c141b05 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5874,6 +5874,32 @@
}
}
+ # check for vsprintf extension %p<foo> misuses
+ if ($^V && $^V ge 5.10.0 &&
+ defined $stat &&
+ $stat =~ /^\+(?![^\{]*\{\s*).*\b(\w+)\s*\(.*$String\s*,/s &&
+ $1 !~ /^_*volatile_*$/) {
+ my $bad_extension = "";
+ my $lc = $stat =~ tr@\n@@;
+ $lc = $lc + $linenr;
+ for (my $count = $linenr; $count <= $lc; $count++) {
+ my $fmt = get_quoted_string($lines[$count - 1], raw_line($count, 0));
+ $fmt =~ s/%%//g;
+ if ($fmt =~ /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGNOx]).)/) {
+ $bad_extension = $1;
+ last;
+ }
+ }
+ if ($bad_extension ne "") {
+ my $stat_real = raw_line($linenr, 0);
+ for (my $count = $linenr + 1; $count <= $lc; $count++) {
+ $stat_real = $stat_real . "\n" . raw_line($count, 0);
+ }
+ WARN("VSPRINTF_POINTER_EXTENSION",
+ "Invalid vsprintf pointer extension '$bad_extension'\n" . "$here\n$stat_real\n");
+ }
+ }
+
# Check for misused memsets
if ($^V && $^V ge 5.10.0 &&
defined $stat &&
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index 29d6699..55b4c0d 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -47,49 +47,9 @@ typedef struct {
struct devtable {
const char *device_id; /* name of table, __mod_<name>__*_device_table. */
unsigned long id_size;
- void *function;
+ int (*do_entry)(const char *filename, void *symval, char *alias);
};
-#define ___cat(a,b) a ## b
-#define __cat(a,b) ___cat(a,b)
-
-/* we need some special handling for this host tool running eventually on
- * Darwin. The Mach-O section handling is a bit different than ELF section
- * handling. The differnces in detail are:
- * a) we have segments which have sections
- * b) we need a API call to get the respective section symbols */
-#if defined(__MACH__)
-#include <mach-o/getsect.h>
-
-#define INIT_SECTION(name) do { \
- unsigned long name ## _len; \
- char *__cat(pstart_,name) = getsectdata("__TEXT", \
- #name, &__cat(name,_len)); \
- char *__cat(pstop_,name) = __cat(pstart_,name) + \
- __cat(name, _len); \
- __cat(__start_,name) = (void *)__cat(pstart_,name); \
- __cat(__stop_,name) = (void *)__cat(pstop_,name); \
- } while (0)
-#define SECTION(name) __attribute__((section("__TEXT, " #name)))
-
-struct devtable **__start___devtable, **__stop___devtable;
-#else
-#define INIT_SECTION(name) /* no-op for ELF */
-#define SECTION(name) __attribute__((section(#name)))
-
-/* We construct a table of pointers in an ELF section (pointers generally
- * go unpadded by gcc). ld creates boundary syms for us. */
-extern struct devtable *__start___devtable[], *__stop___devtable[];
-#endif /* __MACH__ */
-
-#if !defined(__used)
-# if __GNUC__ == 3 && __GNUC_MINOR__ < 3
-# define __used __attribute__((__unused__))
-# else
-# define __used __attribute__((__used__))
-# endif
-#endif
-
/* Define a variable f that holds the value of field f of struct devid
* based at address m.
*/
@@ -102,16 +62,6 @@ extern struct devtable *__start___devtable[], *__stop___devtable[];
#define DEF_FIELD_ADDR(m, devid, f) \
typeof(((struct devid *)0)->f) *f = ((m) + OFF_##devid##_##f)
-/* Add a table entry. We test function type matches while we're here. */
-#define ADD_TO_DEVTABLE(device_id, type, function) \
- static struct devtable __cat(devtable,__LINE__) = { \
- device_id + 0*sizeof((function)((const char *)NULL, \
- (void *)NULL, \
- (char *)NULL)), \
- SIZE_##type, (function) }; \
- static struct devtable *SECTION(__devtable) __used \
- __cat(devtable_ptr,__LINE__) = &__cat(devtable,__LINE__)
-
#define ADD(str, sep, cond, field) \
do { \
strcat(str, sep); \
@@ -431,7 +381,6 @@ static int do_hid_entry(const char *filename,
return 1;
}
-ADD_TO_DEVTABLE("hid", hid_device_id, do_hid_entry);
/* Looks like: ieee1394:venNmoNspNverN */
static int do_ieee1394_entry(const char *filename,
@@ -456,7 +405,6 @@ static int do_ieee1394_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("ieee1394", ieee1394_device_id, do_ieee1394_entry);
/* Looks like: pci:vNdNsvNsdNbcNscNiN. */
static int do_pci_entry(const char *filename,
@@ -500,7 +448,6 @@ static int do_pci_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("pci", pci_device_id, do_pci_entry);
/* looks like: "ccw:tNmNdtNdmN" */
static int do_ccw_entry(const char *filename,
@@ -524,7 +471,6 @@ static int do_ccw_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("ccw", ccw_device_id, do_ccw_entry);
/* looks like: "ap:tN" */
static int do_ap_entry(const char *filename,
@@ -535,7 +481,6 @@ static int do_ap_entry(const char *filename,
sprintf(alias, "ap:t%02X*", dev_type);
return 1;
}
-ADD_TO_DEVTABLE("ap", ap_device_id, do_ap_entry);
/* looks like: "css:tN" */
static int do_css_entry(const char *filename,
@@ -546,7 +491,6 @@ static int do_css_entry(const char *filename,
sprintf(alias, "css:t%01X", type);
return 1;
}
-ADD_TO_DEVTABLE("css", css_device_id, do_css_entry);
/* Looks like: "serio:tyNprNidNexN" */
static int do_serio_entry(const char *filename,
@@ -566,7 +510,6 @@ static int do_serio_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("serio", serio_device_id, do_serio_entry);
/* looks like: "acpi:ACPI0003" or "acpi:PNP0C0B" or "acpi:LNXVIDEO" or
* "acpi:bbsspp" (bb=base-class, ss=sub-class, pp=prog-if)
@@ -604,7 +547,6 @@ static int do_acpi_entry(const char *filename,
}
return 1;
}
-ADD_TO_DEVTABLE("acpi", acpi_device_id, do_acpi_entry);
/* looks like: "pnp:dD" */
static void do_pnp_device_entry(void *symval, unsigned long size,
@@ -725,7 +667,6 @@ static int do_pcmcia_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("pcmcia", pcmcia_device_id, do_pcmcia_entry);
static int do_vio_entry(const char *filename, void *symval,
char *alias)
@@ -745,7 +686,6 @@ static int do_vio_entry(const char *filename, void *symval,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("vio", vio_device_id, do_vio_entry);
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
@@ -818,7 +758,6 @@ static int do_input_entry(const char *filename, void *symval,
do_input(alias, *swbit, 0, INPUT_DEVICE_ID_SW_MAX);
return 1;
}
-ADD_TO_DEVTABLE("input", input_device_id, do_input_entry);
static int do_eisa_entry(const char *filename, void *symval,
char *alias)
@@ -830,7 +769,6 @@ static int do_eisa_entry(const char *filename, void *symval,
strcat(alias, "*");
return 1;
}
-ADD_TO_DEVTABLE("eisa", eisa_device_id, do_eisa_entry);
/* Looks like: parisc:tNhvNrevNsvN */
static int do_parisc_entry(const char *filename, void *symval,
@@ -850,7 +788,6 @@ static int do_parisc_entry(const char *filename, void *symval,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("parisc", parisc_device_id, do_parisc_entry);
/* Looks like: sdio:cNvNdN. */
static int do_sdio_entry(const char *filename,
@@ -867,7 +804,6 @@ static int do_sdio_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("sdio", sdio_device_id, do_sdio_entry);
/* Looks like: ssb:vNidNrevN. */
static int do_ssb_entry(const char *filename,
@@ -884,7 +820,6 @@ static int do_ssb_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("ssb", ssb_device_id, do_ssb_entry);
/* Looks like: bcma:mNidNrevNclN. */
static int do_bcma_entry(const char *filename,
@@ -903,7 +838,6 @@ static int do_bcma_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("bcma", bcma_device_id, do_bcma_entry);
/* Looks like: virtio:dNvN */
static int do_virtio_entry(const char *filename, void *symval,
@@ -919,7 +853,6 @@ static int do_virtio_entry(const char *filename, void *symval,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("virtio", virtio_device_id, do_virtio_entry);
/*
* Looks like: vmbus:guid
@@ -942,7 +875,6 @@ static int do_vmbus_entry(const char *filename, void *symval,
return 1;
}
-ADD_TO_DEVTABLE("vmbus", hv_vmbus_device_id, do_vmbus_entry);
/* Looks like: i2c:S */
static int do_i2c_entry(const char *filename, void *symval,
@@ -953,7 +885,6 @@ static int do_i2c_entry(const char *filename, void *symval,
return 1;
}
-ADD_TO_DEVTABLE("i2c", i2c_device_id, do_i2c_entry);
/* Looks like: spi:S */
static int do_spi_entry(const char *filename, void *symval,
@@ -964,7 +895,6 @@ static int do_spi_entry(const char *filename, void *symval,
return 1;
}
-ADD_TO_DEVTABLE("spi", spi_device_id, do_spi_entry);
static const struct dmifield {
const char *prefix;
@@ -1019,7 +949,6 @@ static int do_dmi_entry(const char *filename, void *symval,
strcat(alias, ":");
return 1;
}
-ADD_TO_DEVTABLE("dmi", dmi_system_id, do_dmi_entry);
static int do_platform_entry(const char *filename,
void *symval, char *alias)
@@ -1028,7 +957,6 @@ static int do_platform_entry(const char *filename,
sprintf(alias, PLATFORM_MODULE_PREFIX "%s", *name);
return 1;
}
-ADD_TO_DEVTABLE("platform", platform_device_id, do_platform_entry);
static int do_mdio_entry(const char *filename,
void *symval, char *alias)
@@ -1053,7 +981,6 @@ static int do_mdio_entry(const char *filename,
return 1;
}
-ADD_TO_DEVTABLE("mdio", mdio_device_id, do_mdio_entry);
/* Looks like: zorro:iN. */
static int do_zorro_entry(const char *filename, void *symval,
@@ -1064,7 +991,6 @@ static int do_zorro_entry(const char *filename, void *symval,
ADD(alias, "i", id != ZORRO_WILDCARD, id);
return 1;
}
-ADD_TO_DEVTABLE("zorro", zorro_device_id, do_zorro_entry);
/* looks like: "pnp:dD" */
static int do_isapnp_entry(const char *filename,
@@ -1080,7 +1006,6 @@ static int do_isapnp_entry(const char *filename,
(function >> 12) & 0x0f, (function >> 8) & 0x0f);
return 1;
}
-ADD_TO_DEVTABLE("isapnp", isapnp_device_id, do_isapnp_entry);
/* Looks like: "ipack:fNvNdN". */
static int do_ipack_entry(const char *filename,
@@ -1096,7 +1021,6 @@ static int do_ipack_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("ipack", ipack_device_id, do_ipack_entry);
/*
* Append a match expression for a single masked hex digit.
@@ -1167,7 +1091,6 @@ static int do_amba_entry(const char *filename,
return 1;
}
-ADD_TO_DEVTABLE("amba", amba_id, do_amba_entry);
/*
* looks like: "mipscdmm:tN"
@@ -1183,7 +1106,6 @@ static int do_mips_cdmm_entry(const char *filename,
sprintf(alias, "mipscdmm:t%02X*", type);
return 1;
}
-ADD_TO_DEVTABLE("mipscdmm", mips_cdmm_device_id, do_mips_cdmm_entry);
/* LOOKS like cpu:type:x86,venVVVVfamFFFFmodMMMM:feature:*,FEAT,*
* All fields are numbers. It would be nicer to use strings for vendor
@@ -1208,7 +1130,6 @@ static int do_x86cpu_entry(const char *filename, void *symval,
sprintf(alias + strlen(alias), "%04X*", feature);
return 1;
}
-ADD_TO_DEVTABLE("x86cpu", x86_cpu_id, do_x86cpu_entry);
/* LOOKS like cpu:type:*:feature:*FEAT* */
static int do_cpu_entry(const char *filename, void *symval, char *alias)
@@ -1218,7 +1139,6 @@ static int do_cpu_entry(const char *filename, void *symval, char *alias)
sprintf(alias, "cpu:type:*:feature:*%04X*", feature);
return 1;
}
-ADD_TO_DEVTABLE("cpu", cpu_feature, do_cpu_entry);
/* Looks like: mei:S:uuid:N:* */
static int do_mei_entry(const char *filename, void *symval,
@@ -1237,7 +1157,6 @@ static int do_mei_entry(const char *filename, void *symval,
return 1;
}
-ADD_TO_DEVTABLE("mei", mei_cl_device_id, do_mei_entry);
/* Looks like: rapidio:vNdNavNadN */
static int do_rio_entry(const char *filename,
@@ -1257,7 +1176,6 @@ static int do_rio_entry(const char *filename,
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("rapidio", rio_device_id, do_rio_entry);
/* Looks like: ulpi:vNpN */
static int do_ulpi_entry(const char *filename, void *symval,
@@ -1270,7 +1188,6 @@ static int do_ulpi_entry(const char *filename, void *symval,
return 1;
}
-ADD_TO_DEVTABLE("ulpi", ulpi_device_id, do_ulpi_entry);
/* Looks like: hdaudio:vNrNaN */
static int do_hda_entry(const char *filename, void *symval, char *alias)
@@ -1287,7 +1204,6 @@ static int do_hda_entry(const char *filename, void *symval, char *alias)
add_wildcard(alias);
return 1;
}
-ADD_TO_DEVTABLE("hdaudio", hda_device_id, do_hda_entry);
/* Looks like: fsl-mc:vNdN */
static int do_fsl_mc_entry(const char *filename, void *symval,
@@ -1299,7 +1215,6 @@ static int do_fsl_mc_entry(const char *filename, void *symval,
sprintf(alias, "fsl-mc:v%08Xd%s", vendor, *obj_type);
return 1;
}
-ADD_TO_DEVTABLE("fslmc", fsl_mc_device_id, do_fsl_mc_entry);
/* Does namelen bytes of name exactly match the symbol? */
static bool sym_is(const char *name, unsigned namelen, const char *symbol)
@@ -1313,12 +1228,11 @@ static bool sym_is(const char *name, unsigned namelen, const char *symbol)
static void do_table(void *symval, unsigned long size,
unsigned long id_size,
const char *device_id,
- void *function,
+ int (*do_entry)(const char *filename, void *symval, char *alias),
struct module *mod)
{
unsigned int i;
char alias[500];
- int (*do_entry)(const char *, void *entry, char *alias) = function;
device_id_check(mod->name, device_id, size, id_size, symval);
/* Leave last one: it's the terminator. */
@@ -1332,6 +1246,44 @@ static void do_table(void *symval, unsigned long size,
}
}
+static const struct devtable devtable[] = {
+ {"hid", SIZE_hid_device_id, do_hid_entry},
+ {"ieee1394", SIZE_ieee1394_device_id, do_ieee1394_entry},
+ {"pci", SIZE_pci_device_id, do_pci_entry},
+ {"ccw", SIZE_ccw_device_id, do_ccw_entry},
+ {"ap", SIZE_ap_device_id, do_ap_entry},
+ {"css", SIZE_css_device_id, do_css_entry},
+ {"serio", SIZE_serio_device_id, do_serio_entry},
+ {"acpi", SIZE_acpi_device_id, do_acpi_entry},
+ {"pcmcia", SIZE_pcmcia_device_id, do_pcmcia_entry},
+ {"vio", SIZE_vio_device_id, do_vio_entry},
+ {"input", SIZE_input_device_id, do_input_entry},
+ {"eisa", SIZE_eisa_device_id, do_eisa_entry},
+ {"parisc", SIZE_parisc_device_id, do_parisc_entry},
+ {"sdio", SIZE_sdio_device_id, do_sdio_entry},
+ {"ssb", SIZE_ssb_device_id, do_ssb_entry},
+ {"bcma", SIZE_bcma_device_id, do_bcma_entry},
+ {"virtio", SIZE_virtio_device_id, do_virtio_entry},
+ {"vmbus", SIZE_hv_vmbus_device_id, do_vmbus_entry},
+ {"i2c", SIZE_i2c_device_id, do_i2c_entry},
+ {"spi", SIZE_spi_device_id, do_spi_entry},
+ {"dmi", SIZE_dmi_system_id, do_dmi_entry},
+ {"platform", SIZE_platform_device_id, do_platform_entry},
+ {"mdio", SIZE_mdio_device_id, do_mdio_entry},
+ {"zorro", SIZE_zorro_device_id, do_zorro_entry},
+ {"isapnp", SIZE_isapnp_device_id, do_isapnp_entry},
+ {"ipack", SIZE_ipack_device_id, do_ipack_entry},
+ {"amba", SIZE_amba_id, do_amba_entry},
+ {"mipscdmm", SIZE_mips_cdmm_device_id, do_mips_cdmm_entry},
+ {"x86cpu", SIZE_x86_cpu_id, do_x86cpu_entry},
+ {"cpu", SIZE_cpu_feature, do_cpu_entry},
+ {"mei", SIZE_mei_cl_device_id, do_mei_entry},
+ {"rapidio", SIZE_rio_device_id, do_rio_entry},
+ {"ulpi", SIZE_ulpi_device_id, do_ulpi_entry},
+ {"hdaudio", SIZE_hda_device_id, do_hda_entry},
+ {"fslmc", SIZE_fsl_mc_device_id, do_fsl_mc_entry},
+};
+
/* Create MODULE_ALIAS() statements.
* At this time, we cannot write the actual output C source yet,
* so we write into the mod->dev_table_buf buffer. */
@@ -1386,13 +1338,14 @@ void handle_moddevtable(struct module *mod, struct elf_info *info,
else if (sym_is(name, namelen, "pnp_card"))
do_pnp_card_entries(symval, sym->st_size, mod);
else {
- struct devtable **p;
- INIT_SECTION(__devtable);
+ int i;
- for (p = __start___devtable; p < __stop___devtable; p++) {
- if (sym_is(name, namelen, (*p)->device_id)) {
- do_table(symval, sym->st_size, (*p)->id_size,
- (*p)->device_id, (*p)->function, mod);
+ for (i = 0; i < ARRAY_SIZE(devtable); i++) {
+ const struct devtable *p = &devtable[i];
+
+ if (sym_is(name, namelen, p->device_id)) {
+ do_table(symval, sym->st_size, p->id_size,
+ p->device_id, p->do_entry, mod);
break;
}
}
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 03c1652..db3bdc9 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -568,7 +568,7 @@ static int propagate_exception(struct dev_cgroup *devcg_root,
devcg->behavior == DEVCG_DEFAULT_ALLOW) {
rc = dev_exception_add(devcg, ex);
if (rc)
- break;
+ return rc;
} else {
/*
* in the other possible cases:
diff --git a/security/keys/proc.c b/security/keys/proc.c
index ec493dd..f2c7e09 100644
--- a/security/keys/proc.c
+++ b/security/keys/proc.c
@@ -187,7 +187,7 @@ static int proc_keys_show(struct seq_file *m, void *v)
struct keyring_search_context ctx = {
.index_key = key->index_key,
- .cred = current_cred(),
+ .cred = m->file->f_cred,
.match_data.cmp = lookup_user_key_possessed,
.match_data.raw_data = key,
.match_data.lookup_type = KEYRING_SEARCH_LOOKUP_DIRECT,
@@ -207,11 +207,7 @@ static int proc_keys_show(struct seq_file *m, void *v)
}
}
- /* check whether the current task is allowed to view the key (assuming
- * non-possession)
- * - the caller holds a spinlock, and thus the RCU read lock, making our
- * access to __current_cred() safe
- */
+ /* check whether the current task is allowed to view the key */
rc = key_task_permission(key_ref, ctx.cred, KEY_NEED_VIEW);
if (rc < 0)
return 0;
diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 37f04da..44a20c2 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -321,6 +321,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
if (a->u.net->sk) {
struct sock *sk = a->u.net->sk;
struct unix_sock *u;
+ struct unix_address *addr;
int len = 0;
char *p = NULL;
@@ -351,14 +352,15 @@ static void dump_common_audit_data(struct audit_buffer *ab,
#endif
case AF_UNIX:
u = unix_sk(sk);
+ addr = smp_load_acquire(&u->addr);
+ if (!addr)
+ break;
if (u->path.dentry) {
audit_log_d_path(ab, " path=", &u->path);
break;
}
- if (!u->addr)
- break;
- len = u->addr->len-sizeof(short);
- p = &u->addr->name->sun_path[0];
+ len = addr->len-sizeof(short);
+ p = &addr->name->sun_path[0];
audit_log_format(ab, " path=");
if (*p)
audit_log_untrustedstring(ab, p);
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 423a350..6171d6d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3284,12 +3284,16 @@ static int selinux_inode_setsecurity(struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
struct inode_security_struct *isec = inode_security_novalidate(inode);
+ struct superblock_security_struct *sbsec = inode->i_sb->s_security;
u32 newsid;
int rc;
if (strcmp(name, XATTR_SELINUX_SUFFIX))
return -EOPNOTSUPP;
+ if (!(sbsec->flags & SBLABEL_MNT))
+ return -EOPNOTSUPP;
+
if (!value || !size)
return -EACCES;
@@ -6003,7 +6007,10 @@ static void selinux_inode_invalidate_secctx(struct inode *inode)
*/
static int selinux_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen)
{
- return selinux_inode_setsecurity(inode, XATTR_SELINUX_SUFFIX, ctx, ctxlen, 0);
+ int rc = selinux_inode_setsecurity(inode, XATTR_SELINUX_SUFFIX,
+ ctx, ctxlen, 0);
+ /* Do not return error when suppressing label (SBLABEL_MNT not set). */
+ return rc == -EOPNOTSUPP ? 0 : rc;
}
/*
diff --git a/sound/core/info.c b/sound/core/info.c
index 08558ae..d092d84 100644
--- a/sound/core/info.c
+++ b/sound/core/info.c
@@ -823,7 +823,12 @@ void snd_info_free_entry(struct snd_info_entry * entry)
list_for_each_entry_safe(p, n, &entry->children, list)
snd_info_free_entry(p);
- list_del(&entry->list);
+ p = entry->parent;
+ if (p) {
+ mutex_lock(&p->access);
+ list_del(&entry->list);
+ mutex_unlock(&p->access);
+ }
kfree(entry->name);
if (entry->private_free)
entry->private_free(entry);
diff --git a/sound/core/init.c b/sound/core/init.c
index c816648..f49f507 100644
--- a/sound/core/init.c
+++ b/sound/core/init.c
@@ -452,14 +452,7 @@ int snd_card_disconnect(struct snd_card *card)
card->shutdown = 1;
spin_unlock(&card->files_lock);
- /* phase 1: disable fops (user space) operations for ALSA API */
- mutex_lock(&snd_card_mutex);
- snd_cards[card->number] = NULL;
- clear_bit(card->number, snd_cards_lock);
- mutex_unlock(&snd_card_mutex);
-
- /* phase 2: replace file->f_op with special dummy operations */
-
+ /* replace file->f_op with special dummy operations */
spin_lock(&card->files_lock);
list_for_each_entry(mfile, &card->files_list, list) {
/* it's critical part, use endless loop */
@@ -475,7 +468,7 @@ int snd_card_disconnect(struct snd_card *card)
}
spin_unlock(&card->files_lock);
- /* phase 3: notify all connected devices about disconnection */
+ /* notify all connected devices about disconnection */
/* at this point, they cannot respond to any calls except release() */
#if IS_ENABLED(CONFIG_SND_MIXER_OSS)
@@ -491,6 +484,13 @@ int snd_card_disconnect(struct snd_card *card)
device_del(&card->card_dev);
card->registered = false;
}
+
+ /* disable fops (user space) operations for ALSA API */
+ mutex_lock(&snd_card_mutex);
+ snd_cards[card->number] = NULL;
+ clear_bit(card->number, snd_cards_lock);
+ mutex_unlock(&snd_card_mutex);
+
#ifdef CONFIG_PM
wake_up(&card->power_sleep);
#endif
diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index cfb8f58..8240975 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -951,6 +951,28 @@ static int snd_pcm_oss_change_params_locked(struct snd_pcm_substream *substream)
oss_frame_size = snd_pcm_format_physical_width(params_format(params)) *
params_channels(params) / 8;
+ err = snd_pcm_oss_period_size(substream, params, sparams);
+ if (err < 0)
+ goto failure;
+
+ n = snd_pcm_plug_slave_size(substream, runtime->oss.period_bytes / oss_frame_size);
+ err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, NULL);
+ if (err < 0)
+ goto failure;
+
+ err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIODS,
+ runtime->oss.periods, NULL);
+ if (err < 0)
+ goto failure;
+
+ snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, NULL);
+
+ err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, sparams);
+ if (err < 0) {
+ pcm_dbg(substream->pcm, "HW_PARAMS failed: %i\n", err);
+ goto failure;
+ }
+
#ifdef CONFIG_SND_PCM_OSS_PLUGINS
snd_pcm_oss_plugin_clear(substream);
if (!direct) {
@@ -985,27 +1007,6 @@ static int snd_pcm_oss_change_params_locked(struct snd_pcm_substream *substream)
}
#endif
- err = snd_pcm_oss_period_size(substream, params, sparams);
- if (err < 0)
- goto failure;
-
- n = snd_pcm_plug_slave_size(substream, runtime->oss.period_bytes / oss_frame_size);
- err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIOD_SIZE, n, NULL);
- if (err < 0)
- goto failure;
-
- err = snd_pcm_hw_param_near(substream, sparams, SNDRV_PCM_HW_PARAM_PERIODS,
- runtime->oss.periods, NULL);
- if (err < 0)
- goto failure;
-
- snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_DROP, NULL);
-
- if ((err = snd_pcm_kernel_ioctl(substream, SNDRV_PCM_IOCTL_HW_PARAMS, sparams)) < 0) {
- pcm_dbg(substream->pcm, "HW_PARAMS failed: %i\n", err);
- goto failure;
- }
-
if (runtime->oss.trigger) {
sw_params->start_threshold = 1;
} else {
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index a6fdf3a..0a747df 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1282,8 +1282,15 @@ static int snd_pcm_pause(struct snd_pcm_substream *substream, int push)
static int snd_pcm_pre_suspend(struct snd_pcm_substream *substream, int state)
{
struct snd_pcm_runtime *runtime = substream->runtime;
- if (runtime->status->state == SNDRV_PCM_STATE_SUSPENDED)
+ switch (runtime->status->state) {
+ case SNDRV_PCM_STATE_SUSPENDED:
return -EBUSY;
+ /* unresumable PCM state; return -EBUSY for skipping suspend */
+ case SNDRV_PCM_STATE_OPEN:
+ case SNDRV_PCM_STATE_SETUP:
+ case SNDRV_PCM_STATE_DISCONNECTED:
+ return -EBUSY;
+ }
runtime->trigger_master = substream;
return 0;
}
@@ -1363,6 +1370,14 @@ int snd_pcm_suspend_all(struct snd_pcm *pcm)
/* FIXME: the open/close code should lock this as well */
if (substream->runtime == NULL)
continue;
+
+ /*
+ * Skip BE dai link PCM's that are internal and may
+ * not have their substream ops set.
+ */
+ if (!substream->ops)
+ continue;
+
err = snd_pcm_suspend(substream);
if (err < 0 && err != -EBUSY)
return err;
diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c
index f217a1d..73a07da 100644
--- a/sound/core/rawmidi.c
+++ b/sound/core/rawmidi.c
@@ -29,6 +29,7 @@
#include <linux/mutex.h>
#include <linux/module.h>
#include <linux/delay.h>
+#include <linux/nospec.h>
#include <sound/rawmidi.h>
#include <sound/info.h>
#include <sound/control.h>
@@ -592,6 +593,7 @@ static int __snd_rawmidi_info_select(struct snd_card *card,
return -ENXIO;
if (info->stream < 0 || info->stream > 1)
return -EINVAL;
+ info->stream = array_index_nospec(info->stream, 2);
pstr = &rmidi->streams[info->stream];
if (pstr->substream_count == 0)
return -ENOENT;
diff --git a/sound/core/seq/oss/seq_oss_synth.c b/sound/core/seq/oss/seq_oss_synth.c
index 278ebb9..c939459 100644
--- a/sound/core/seq/oss/seq_oss_synth.c
+++ b/sound/core/seq/oss/seq_oss_synth.c
@@ -617,13 +617,14 @@ int
snd_seq_oss_synth_make_info(struct seq_oss_devinfo *dp, int dev, struct synth_info *inf)
{
struct seq_oss_synth *rec;
+ struct seq_oss_synthinfo *info = get_synthinfo_nospec(dp, dev);
- if (dev < 0 || dev >= dp->max_synthdev)
+ if (!info)
return -ENXIO;
- if (dp->synths[dev].is_midi) {
+ if (info->is_midi) {
struct midi_info minf;
- snd_seq_oss_midi_make_info(dp, dp->synths[dev].midi_mapped, &minf);
+ snd_seq_oss_midi_make_info(dp, info->midi_mapped, &minf);
inf->synth_type = SYNTH_TYPE_MIDI;
inf->synth_subtype = 0;
inf->nr_voices = 16;
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 965473d..09491b2 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -1249,7 +1249,7 @@ static int snd_seq_ioctl_set_client_info(struct snd_seq_client *client,
/* fill the info fields */
if (client_info->name[0])
- strlcpy(client->name, client_info->name, sizeof(client->name));
+ strscpy(client->name, client_info->name, sizeof(client->name));
client->filter = client_info->filter;
client->event_lost = client_info->event_lost;
@@ -1527,7 +1527,7 @@ static int snd_seq_ioctl_create_queue(struct snd_seq_client *client, void *arg)
/* set queue name */
if (!info->name[0])
snprintf(info->name, sizeof(info->name), "Queue-%d", q->queue);
- strlcpy(q->name, info->name, sizeof(q->name));
+ strscpy(q->name, info->name, sizeof(q->name));
snd_use_lock_free(&q->use_lock);
return 0;
@@ -1589,7 +1589,7 @@ static int snd_seq_ioctl_set_queue_info(struct snd_seq_client *client,
queuefree(q);
return -EPERM;
}
- strlcpy(q->name, info->name, sizeof(q->name));
+ strscpy(q->name, info->name, sizeof(q->name));
queuefree(q);
return 0;
diff --git a/sound/drivers/opl3/opl3_voice.h b/sound/drivers/opl3/opl3_voice.h
index a371c07..e267025 100644
--- a/sound/drivers/opl3/opl3_voice.h
+++ b/sound/drivers/opl3/opl3_voice.h
@@ -41,7 +41,7 @@ void snd_opl3_timer_func(unsigned long data);
/* Prototypes for opl3_drums.c */
void snd_opl3_load_drums(struct snd_opl3 *opl3);
-void snd_opl3_drum_switch(struct snd_opl3 *opl3, int note, int on_off, int vel, struct snd_midi_channel *chan);
+void snd_opl3_drum_switch(struct snd_opl3 *opl3, int note, int vel, int on_off, struct snd_midi_channel *chan);
/* Prototypes for opl3_oss.c */
#ifdef CONFIG_SND_SEQUENCER_OSS
diff --git a/sound/firewire/bebob/bebob.c b/sound/firewire/bebob/bebob.c
index 3b4eaff..a205b93 100644
--- a/sound/firewire/bebob/bebob.c
+++ b/sound/firewire/bebob/bebob.c
@@ -474,7 +474,19 @@ static const struct ieee1394_device_id bebob_id_table[] = {
/* Focusrite, SaffirePro 26 I/O */
SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, 0x00000003, &saffirepro_26_spec),
/* Focusrite, SaffirePro 10 I/O */
- SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, 0x00000006, &saffirepro_10_spec),
+ {
+ // The combination of vendor_id and model_id is the same as the
+ // same as the one of Liquid Saffire 56.
+ .match_flags = IEEE1394_MATCH_VENDOR_ID |
+ IEEE1394_MATCH_MODEL_ID |
+ IEEE1394_MATCH_SPECIFIER_ID |
+ IEEE1394_MATCH_VERSION,
+ .vendor_id = VEN_FOCUSRITE,
+ .model_id = 0x000006,
+ .specifier_id = 0x00a02d,
+ .version = 0x010001,
+ .driver_data = (kernel_ulong_t)&saffirepro_10_spec,
+ },
/* Focusrite, Saffire(no label and LE) */
SND_BEBOB_DEV_ENTRY(VEN_FOCUSRITE, MODEL_FOCUSRITE_SAFFIRE_BOTH,
&saffire_spec),
diff --git a/sound/isa/sb/sb8.c b/sound/isa/sb/sb8.c
index ad42d23..e75bfc5 100644
--- a/sound/isa/sb/sb8.c
+++ b/sound/isa/sb/sb8.c
@@ -111,6 +111,10 @@ static int snd_sb8_probe(struct device *pdev, unsigned int dev)
/* block the 0x388 port to avoid PnP conflicts */
acard->fm_res = request_region(0x388, 4, "SoundBlaster FM");
+ if (!acard->fm_res) {
+ err = -EBUSY;
+ goto _err;
+ }
if (port[dev] != SNDRV_AUTO_PORT) {
if ((err = snd_sbdsp_create(card, port[dev], irq[dev],
diff --git a/sound/pci/echoaudio/echoaudio.c b/sound/pci/echoaudio/echoaudio.c
index 286f5e3..d73ee11a 100644
--- a/sound/pci/echoaudio/echoaudio.c
+++ b/sound/pci/echoaudio/echoaudio.c
@@ -1953,6 +1953,11 @@ static int snd_echo_create(struct snd_card *card,
}
chip->dsp_registers = (volatile u32 __iomem *)
ioremap_nocache(chip->dsp_registers_phys, sz);
+ if (!chip->dsp_registers) {
+ dev_err(chip->card->dev, "ioremap failed\n");
+ snd_echo_free(chip);
+ return -ENOMEM;
+ }
if (request_irq(pci->irq, snd_echo_interrupt, IRQF_SHARED,
KBUILD_MODNAME, chip)) {
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index c6b046d..1b5e217 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -3004,6 +3004,7 @@ static void hda_call_codec_resume(struct hda_codec *codec)
hda_jackpoll_work(&codec->jackpoll_work.work);
else
snd_hda_jack_report_sync(codec);
+ codec->core.dev.power.power_state = PMSG_ON;
atomic_dec(&codec->core.in_pm);
}
@@ -3036,10 +3037,62 @@ static int hda_codec_runtime_resume(struct device *dev)
}
#endif /* CONFIG_PM */
+#ifdef CONFIG_PM_SLEEP
+static int hda_codec_force_resume(struct device *dev)
+{
+ int ret;
+
+ /* The get/put pair below enforces the runtime resume even if the
+ * device hasn't been used at suspend time. This trick is needed to
+ * update the jack state change during the sleep.
+ */
+ pm_runtime_get_noresume(dev);
+ ret = pm_runtime_force_resume(dev);
+ pm_runtime_put(dev);
+ return ret;
+}
+
+static int hda_codec_pm_suspend(struct device *dev)
+{
+ dev->power.power_state = PMSG_SUSPEND;
+ return pm_runtime_force_suspend(dev);
+}
+
+static int hda_codec_pm_resume(struct device *dev)
+{
+ dev->power.power_state = PMSG_RESUME;
+ return hda_codec_force_resume(dev);
+}
+
+static int hda_codec_pm_freeze(struct device *dev)
+{
+ dev->power.power_state = PMSG_FREEZE;
+ return pm_runtime_force_suspend(dev);
+}
+
+static int hda_codec_pm_thaw(struct device *dev)
+{
+ dev->power.power_state = PMSG_THAW;
+ return hda_codec_force_resume(dev);
+}
+
+static int hda_codec_pm_restore(struct device *dev)
+{
+ dev->power.power_state = PMSG_RESTORE;
+ return hda_codec_force_resume(dev);
+}
+#endif /* CONFIG_PM_SLEEP */
+
/* referred in hda_bind.c */
const struct dev_pm_ops hda_codec_driver_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
- pm_runtime_force_resume)
+#ifdef CONFIG_PM_SLEEP
+ .suspend = hda_codec_pm_suspend,
+ .resume = hda_codec_pm_resume,
+ .freeze = hda_codec_pm_freeze,
+ .thaw = hda_codec_pm_thaw,
+ .poweroff = hda_codec_pm_suspend,
+ .restore = hda_codec_pm_restore,
+#endif /* CONFIG_PM_SLEEP */
SET_RUNTIME_PM_OPS(hda_codec_runtime_suspend, hda_codec_runtime_resume,
NULL)
};
diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
index dffd549..705d252 100644
--- a/sound/soc/fsl/fsl-asoc-card.c
+++ b/sound/soc/fsl/fsl-asoc-card.c
@@ -689,6 +689,7 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
asrc_fail:
of_node_put(asrc_np);
of_node_put(codec_np);
+ put_device(&cpu_pdev->dev);
fail:
of_node_put(cpu_np);
diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index 3ef1745..fa64cc2 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -59,6 +59,8 @@ struct fsl_esai {
u32 fifo_depth;
u32 slot_width;
u32 slots;
+ u32 tx_mask;
+ u32 rx_mask;
u32 hck_rate[2];
u32 sck_rate[2];
bool hck_dir[2];
@@ -359,21 +361,13 @@ static int fsl_esai_set_dai_tdm_slot(struct snd_soc_dai *dai, u32 tx_mask,
regmap_update_bits(esai_priv->regmap, REG_ESAI_TCCR,
ESAI_xCCR_xDC_MASK, ESAI_xCCR_xDC(slots));
- regmap_update_bits(esai_priv->regmap, REG_ESAI_TSMA,
- ESAI_xSMA_xS_MASK, ESAI_xSMA_xS(tx_mask));
- regmap_update_bits(esai_priv->regmap, REG_ESAI_TSMB,
- ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(tx_mask));
-
regmap_update_bits(esai_priv->regmap, REG_ESAI_RCCR,
ESAI_xCCR_xDC_MASK, ESAI_xCCR_xDC(slots));
- regmap_update_bits(esai_priv->regmap, REG_ESAI_RSMA,
- ESAI_xSMA_xS_MASK, ESAI_xSMA_xS(rx_mask));
- regmap_update_bits(esai_priv->regmap, REG_ESAI_RSMB,
- ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(rx_mask));
-
esai_priv->slot_width = slot_width;
esai_priv->slots = slots;
+ esai_priv->tx_mask = tx_mask;
+ esai_priv->rx_mask = rx_mask;
return 0;
}
@@ -396,7 +390,8 @@ static int fsl_esai_set_dai_fmt(struct snd_soc_dai *dai, unsigned int fmt)
break;
case SND_SOC_DAIFMT_RIGHT_J:
/* Data on rising edge of bclk, frame high, right aligned */
- xccr |= ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP | ESAI_xCR_xWA;
+ xccr |= ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP;
+ xcr |= ESAI_xCR_xWA;
break;
case SND_SOC_DAIFMT_DSP_A:
/* Data on rising edge of bclk, frame high, 1clk before data */
@@ -453,12 +448,12 @@ static int fsl_esai_set_dai_fmt(struct snd_soc_dai *dai, unsigned int fmt)
return -EINVAL;
}
- mask = ESAI_xCR_xFSL | ESAI_xCR_xFSR;
+ mask = ESAI_xCR_xFSL | ESAI_xCR_xFSR | ESAI_xCR_xWA;
regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, mask, xcr);
regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, mask, xcr);
mask = ESAI_xCCR_xCKP | ESAI_xCCR_xHCKP | ESAI_xCCR_xFSP |
- ESAI_xCCR_xFSD | ESAI_xCCR_xCKD | ESAI_xCR_xWA;
+ ESAI_xCCR_xFSD | ESAI_xCCR_xCKD;
regmap_update_bits(esai_priv->regmap, REG_ESAI_TCCR, mask, xccr);
regmap_update_bits(esai_priv->regmap, REG_ESAI_RCCR, mask, xccr);
@@ -593,6 +588,7 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
u8 i, channels = substream->runtime->channels;
u32 pins = DIV_ROUND_UP(channels, esai_priv->slots);
+ u32 mask;
switch (cmd) {
case SNDRV_PCM_TRIGGER_START:
@@ -605,15 +601,38 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
for (i = 0; tx && i < channels; i++)
regmap_write(esai_priv->regmap, REG_ESAI_ETDR, 0x0);
+ /*
+ * When set the TE/RE in the end of enablement flow, there
+ * will be channel swap issue for multi data line case.
+ * In order to workaround this issue, we switch the bit
+ * enablement sequence to below sequence
+ * 1) clear the xSMB & xSMA: which is done in probe and
+ * stop state.
+ * 2) set TE/RE
+ * 3) set xSMB
+ * 4) set xSMA: xSMA is the last one in this flow, which
+ * will trigger esai to start.
+ */
regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx),
tx ? ESAI_xCR_TE_MASK : ESAI_xCR_RE_MASK,
tx ? ESAI_xCR_TE(pins) : ESAI_xCR_RE(pins));
+ mask = tx ? esai_priv->tx_mask : esai_priv->rx_mask;
+
+ regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMB(tx),
+ ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(mask));
+ regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMA(tx),
+ ESAI_xSMA_xS_MASK, ESAI_xSMA_xS(mask));
+
break;
case SNDRV_PCM_TRIGGER_SUSPEND:
case SNDRV_PCM_TRIGGER_STOP:
case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx),
tx ? ESAI_xCR_TE_MASK : ESAI_xCR_RE_MASK, 0);
+ regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMA(tx),
+ ESAI_xSMA_xS_MASK, 0);
+ regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMB(tx),
+ ESAI_xSMB_xS_MASK, 0);
/* Disable and reset FIFO */
regmap_update_bits(esai_priv->regmap, REG_ESAI_xFCR(tx),
@@ -903,6 +922,15 @@ static int fsl_esai_probe(struct platform_device *pdev)
return ret;
}
+ esai_priv->tx_mask = 0xFFFFFFFF;
+ esai_priv->rx_mask = 0xFFFFFFFF;
+
+ /* Clear the TSMA, TSMB, RSMA, RSMB */
+ regmap_write(esai_priv->regmap, REG_ESAI_TSMA, 0);
+ regmap_write(esai_priv->regmap, REG_ESAI_TSMB, 0);
+ regmap_write(esai_priv->regmap, REG_ESAI_RSMA, 0);
+ regmap_write(esai_priv->regmap, REG_ESAI_RSMB, 0);
+
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_esai_component,
&fsl_esai_dai, 1);
if (ret) {
diff --git a/sound/soc/fsl/imx-sgtl5000.c b/sound/soc/fsl/imx-sgtl5000.c
index b99e0b5..8e525f7 100644
--- a/sound/soc/fsl/imx-sgtl5000.c
+++ b/sound/soc/fsl/imx-sgtl5000.c
@@ -115,6 +115,7 @@ static int imx_sgtl5000_probe(struct platform_device *pdev)
ret = -EPROBE_DEFER;
goto fail;
}
+ put_device(&ssi_pdev->dev);
codec_dev = of_find_i2c_device_by_node(codec_np);
if (!codec_dev) {
dev_err(&pdev->dev, "failed to find codec platform device\n");
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c
index d6b48c7..086fe4d 100644
--- a/sound/soc/soc-topology.c
+++ b/sound/soc/soc-topology.c
@@ -1989,6 +1989,7 @@ int snd_soc_tplg_component_load(struct snd_soc_component *comp,
struct snd_soc_tplg_ops *ops, const struct firmware *fw, u32 id)
{
struct soc_tplg tplg;
+ int ret;
/* setup parsing context */
memset(&tplg, 0, sizeof(tplg));
@@ -2002,7 +2003,12 @@ int snd_soc_tplg_component_load(struct snd_soc_component *comp,
tplg.bytes_ext_ops = ops->bytes_ext_ops;
tplg.bytes_ext_ops_count = ops->bytes_ext_ops_count;
- return soc_tplg_load(&tplg);
+ ret = soc_tplg_load(&tplg);
+ /* free the created components if fail to load topology */
+ if (ret)
+ snd_soc_tplg_component_remove(comp, SND_SOC_TPLG_INDEX_ALL);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(snd_soc_tplg_component_load);
diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
index b5ca536..961e473 100644
--- a/tools/accounting/getdelays.c
+++ b/tools/accounting/getdelays.c
@@ -202,6 +202,8 @@ static void print_delayacct(struct taskstats *t)
"SWAP %15s%15s%15s\n"
" %15llu%15llu%15llums\n"
"RECLAIM %12s%15s%15s\n"
+ " %15llu%15llu%15llums\n"
+ "THRASHING%12s%15s%15s\n"
" %15llu%15llu%15llums\n",
"count", "real total", "virtual total",
"delay total", "delay average",
@@ -221,7 +223,11 @@ static void print_delayacct(struct taskstats *t)
"count", "delay total", "delay average",
(unsigned long long)t->freepages_count,
(unsigned long long)t->freepages_delay_total,
- average_ms(t->freepages_delay_total, t->freepages_count));
+ average_ms(t->freepages_delay_total, t->freepages_count),
+ "count", "delay total", "delay average",
+ (unsigned long long)t->thrashing_count,
+ (unsigned long long)t->thrashing_delay_total,
+ average_ms(t->thrashing_delay_total, t->thrashing_count));
}
static void task_context_switch_counts(struct taskstats *t)
diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 6694753..700c74b 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -2428,7 +2428,7 @@ static int arg_num_eval(struct print_arg *arg, long long *val)
static char *arg_eval (struct print_arg *arg)
{
long long val;
- static char buf[20];
+ static char buf[24];
switch (arg->type) {
case PRINT_ATOM:
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e128d1c..3ff025b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2132,9 +2132,10 @@ static void cleanup(struct objtool_file *file)
elf_close(file->elf);
}
+static struct objtool_file file;
+
int check(const char *_objname, bool orc)
{
- struct objtool_file file;
int ret, warnings = 0;
objname = _objname;
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index cb081ac5..bd359a0 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -112,7 +112,7 @@
[report]
# Defaults
- sort-order = comm,dso,symbol
+ sort_order = comm,dso,symbol
percent-limit = 0
queue-size = 0
children = true
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index e68c866..cd2900ac 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1323,8 +1323,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
goto out_delete_evlist;
symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
- if (symbol__init(NULL) < 0)
- return -1;
+ status = symbol__init(NULL);
+ if (status < 0)
+ goto out_delete_evlist;
sort__setup_elide(stdout);
diff --git a/tools/perf/tests/evsel-tp-sched.c b/tools/perf/tests/evsel-tp-sched.c
index 66b53f1..b5d0be5 100644
--- a/tools/perf/tests/evsel-tp-sched.c
+++ b/tools/perf/tests/evsel-tp-sched.c
@@ -42,7 +42,7 @@ int test__perf_evsel__tp_sched_test(int subtest __maybe_unused)
return -1;
}
- if (perf_evsel__test_field(evsel, "prev_comm", 16, true))
+ if (perf_evsel__test_field(evsel, "prev_comm", 16, false))
ret = -1;
if (perf_evsel__test_field(evsel, "prev_pid", 4, true))
@@ -54,7 +54,7 @@ int test__perf_evsel__tp_sched_test(int subtest __maybe_unused)
if (perf_evsel__test_field(evsel, "prev_state", sizeof(long), true))
ret = -1;
- if (perf_evsel__test_field(evsel, "next_comm", 16, true))
+ if (perf_evsel__test_field(evsel, "next_comm", 16, false))
ret = -1;
if (perf_evsel__test_field(evsel, "next_pid", 4, true))
@@ -72,7 +72,7 @@ int test__perf_evsel__tp_sched_test(int subtest __maybe_unused)
return -1;
}
- if (perf_evsel__test_field(evsel, "comm", 16, true))
+ if (perf_evsel__test_field(evsel, "comm", 16, false))
ret = -1;
if (perf_evsel__test_field(evsel, "pid", 4, true))
@@ -84,5 +84,6 @@ int test__perf_evsel__tp_sched_test(int subtest __maybe_unused)
if (perf_evsel__test_field(evsel, "target_cpu", 4, true))
ret = -1;
+ perf_evsel__delete(evsel);
return ret;
}
diff --git a/tools/perf/tests/openat-syscall-all-cpus.c b/tools/perf/tests/openat-syscall-all-cpus.c
index c8d9592..75d504e 100644
--- a/tools/perf/tests/openat-syscall-all-cpus.c
+++ b/tools/perf/tests/openat-syscall-all-cpus.c
@@ -38,7 +38,7 @@ int test__openat_syscall_event_on_all_cpus(int subtest __maybe_unused)
if (IS_ERR(evsel)) {
tracing_path__strerror_open_tp(errno, errbuf, sizeof(errbuf), "syscalls", "sys_enter_openat");
pr_debug("%s\n", errbuf);
- goto out_thread_map_delete;
+ goto out_cpu_map_delete;
}
if (perf_evsel__open(evsel, cpus, threads) < 0) {
@@ -112,6 +112,8 @@ int test__openat_syscall_event_on_all_cpus(int subtest __maybe_unused)
perf_evsel__close_fd(evsel, 1, threads->nr);
out_evsel_delete:
perf_evsel__delete(evsel);
+out_cpu_map_delete:
+ cpu_map__put(cpus);
out_thread_map_delete:
thread_map__put(threads);
return err;
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 29d015e..b87221e 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1244,9 +1244,9 @@ static int __auxtrace_mmap__read(struct auxtrace_mmap *mm,
}
/* padding must be written by fn() e.g. record__process_auxtrace() */
- padding = size & 7;
+ padding = size & (PERF_AUXTRACE_RECORD_ALIGNMENT - 1);
if (padding)
- padding = 8 - padding;
+ padding = PERF_AUXTRACE_RECORD_ALIGNMENT - padding;
memset(&ev, 0, sizeof(ev));
ev.auxtrace.header.type = PERF_RECORD_AUXTRACE;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 26fb1ee..1b6963e 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -37,6 +37,9 @@ struct record_opts;
struct auxtrace_info_event;
struct events_stats;
+/* Auxtrace records must have the same alignment as perf event records */
+#define PERF_AUXTRACE_RECORD_ALIGNMENT 8
+
enum auxtrace_type {
PERF_AUXTRACE_UNKNOWN,
PERF_AUXTRACE_INTEL_PT,
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 993ef27..32aab95 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -176,6 +176,7 @@ char *build_id_cache__linkname(const char *sbuild_id, char *bf, size_t size)
return bf;
}
+/* The caller is responsible to free the returned buffer. */
char *build_id_cache__origname(const char *sbuild_id)
{
char *linkname;
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 18dae74..1d66f8e 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -595,11 +595,10 @@ static int collect_config(const char *var, const char *value,
}
ret = set_value(item, value);
- return ret;
out_free:
free(key);
- return -1;
+ return ret;
}
static int perf_config_set__init(struct perf_config_set *set)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f7128c2..a62f795 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1167,6 +1167,7 @@ void perf_evsel__exit(struct perf_evsel *evsel)
{
assert(list_empty(&evsel->node));
assert(evsel->evlist == NULL);
+ perf_evsel__free_counts(evsel);
perf_evsel__free_fd(evsel);
perf_evsel__free_id(evsel);
perf_evsel__free_config_terms(evsel);
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index ad613ea..82833ce 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1027,8 +1027,10 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
iter->evsel, al, max_stack_depth);
- if (err)
+ if (err) {
+ map__put(alm);
return err;
+ }
err = iter->ops->prepare_entry(iter, al);
if (err)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d27715f..3c13726 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -26,6 +26,7 @@
#include "../cache.h"
#include "../util.h"
+#include "../auxtrace.h"
#include "intel-pt-insn-decoder.h"
#include "intel-pt-pkt-decoder.h"
@@ -239,19 +240,15 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
if (!(decoder->tsc_ctc_ratio_n % decoder->tsc_ctc_ratio_d))
decoder->tsc_ctc_mult = decoder->tsc_ctc_ratio_n /
decoder->tsc_ctc_ratio_d;
-
- /*
- * Allow for timestamps appearing to backwards because a TSC
- * packet has slipped past a MTC packet, so allow 2 MTC ticks
- * or ...
- */
- decoder->tsc_slip = multdiv(2 << decoder->mtc_shift,
- decoder->tsc_ctc_ratio_n,
- decoder->tsc_ctc_ratio_d);
}
- /* ... or 0x100 paranoia */
- if (decoder->tsc_slip < 0x100)
- decoder->tsc_slip = 0x100;
+
+ /*
+ * A TSC packet can slip past MTC packets so that the timestamp appears
+ * to go backwards. One estimate is that can be up to about 40 CPU
+ * cycles, which is certainly less than 0x1000 TSC ticks, but accept
+ * slippage an order of magnitude more to be on the safe side.
+ */
+ decoder->tsc_slip = 0x10000;
intel_pt_log("timestamp: mtc_shift %u\n", decoder->mtc_shift);
intel_pt_log("timestamp: tsc_ctc_ratio_n %u\n", decoder->tsc_ctc_ratio_n);
@@ -1311,7 +1308,6 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
{
intel_pt_log("ERROR: Buffer overflow\n");
intel_pt_clear_tx_flags(decoder);
- decoder->cbr = 0;
decoder->timestamp_insn_cnt = 0;
decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC;
decoder->overflow = true;
@@ -2351,6 +2347,34 @@ static int intel_pt_tsc_cmp(uint64_t tsc1, uint64_t tsc2)
}
}
+#define MAX_PADDING (PERF_AUXTRACE_RECORD_ALIGNMENT - 1)
+
+/**
+ * adj_for_padding - adjust overlap to account for padding.
+ * @buf_b: second buffer
+ * @buf_a: first buffer
+ * @len_a: size of first buffer
+ *
+ * @buf_a might have up to 7 bytes of padding appended. Adjust the overlap
+ * accordingly.
+ *
+ * Return: A pointer into @buf_b from where non-overlapped data starts
+ */
+static unsigned char *adj_for_padding(unsigned char *buf_b,
+ unsigned char *buf_a, size_t len_a)
+{
+ unsigned char *p = buf_b - MAX_PADDING;
+ unsigned char *q = buf_a + len_a - MAX_PADDING;
+ int i;
+
+ for (i = MAX_PADDING; i; i--, p++, q++) {
+ if (*p != *q)
+ break;
+ }
+
+ return p;
+}
+
/**
* intel_pt_find_overlap_tsc - determine start of non-overlapped trace data
* using TSC.
@@ -2401,8 +2425,11 @@ static unsigned char *intel_pt_find_overlap_tsc(unsigned char *buf_a,
/* Same TSC, so buffers are consecutive */
if (!cmp && rem_b >= rem_a) {
+ unsigned char *start;
+
*consecutive = true;
- return buf_b + len_b - (rem_b - rem_a);
+ start = buf_b + len_b - (rem_b - rem_a);
+ return adj_for_padding(start, buf_a, len_a);
}
if (cmp < 0)
return buf_b; /* tsc_a < tsc_b => no overlap */
@@ -2465,7 +2492,7 @@ unsigned char *intel_pt_find_overlap(unsigned char *buf_a, size_t len_a,
found = memmem(buf_a, len_a, buf_b, len_a);
if (found) {
*consecutive = true;
- return buf_b + len_a;
+ return adj_for_padding(buf_b + len_a, buf_a, len_a);
}
/* Try again at next PSB in buffer 'a' */
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index d40ab4c..24c6621 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2259,6 +2259,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
}
pt->timeless_decoding = intel_pt_timeless_decoding(pt);
+ if (pt->timeless_decoding && !pt->tc.time_mult)
+ pt->tc.time_mult = 1;
pt->have_tsc = intel_pt_have_tsc(pt);
pt->sampling_mode = false;
pt->est_tsc = !pt->timeless_decoding;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 14f111a..6193be6 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2104,6 +2104,7 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,
printf(" %-50s [%s]\n", buf, "SDT event");
free(buf);
}
+ free(path);
} else
printf(" %-50s [%s]\n", nd->s, "SDT event");
if (nd2) {
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 5ec2de8..b4c5d96 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3691,6 +3691,9 @@ int fork_it(char **argv)
signal(SIGQUIT, SIG_IGN);
if (waitpid(child_pid, &status, 0) == -1)
err(status, "waitpid");
+
+ if (WIFEXITED(status))
+ status = WEXITSTATUS(status);
}
/*
* n.b. fork_it() does not check for errors from for_all_cpus()
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 60de4c3..c72586a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2793,6 +2793,9 @@ static long kvm_device_ioctl(struct file *filp, unsigned int ioctl,
{
struct kvm_device *dev = filp->private_data;
+ if (dev->kvm->mm != current->mm)
+ return -EIO;
+
switch (ioctl) {
case KVM_SET_DEVICE_ATTR:
return kvm_device_ioctl_attr(dev, dev->ops->set_attr, arg);