The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description
----------------------

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to three classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus).

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.


2. File descriptors
-------------------

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
actually running guest code.

In general, file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of unix domain sockets. These
kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.

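The descriptor hierarchy described in this section can be sketched in C
roughly as follows. This is a minimal illustration under stated
assumptions, not a reference implementation: struct kvm_fds and
kvm_fds_open() are hypothetical names introduced here, machine type 0 and
vcpu id 0 are arbitrary choices, and error handling is reduced to closing
whatever was opened.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Hypothetical container for the three descriptor levels. */
struct kvm_fds {
	int kvm_fd;	/* handle for system ioctls */
	int vm_fd;	/* handle for VM ioctls */
	int vcpu_fd;	/* handle for vcpu ioctls */
};

/*
 * Opens /dev/kvm, creates one VM (machine type 0) and one vcpu (id 0).
 * Returns 0 on success. On failure returns -1 and leaves all three
 * fields set to -1.
 */
int kvm_fds_open(struct kvm_fds *fds)
{
	fds->kvm_fd = fds->vm_fd = fds->vcpu_fd = -1;

	fds->kvm_fd = open("/dev/kvm", O_RDWR);
	if (fds->kvm_fd < 0)
		return -1;

	/* System ioctl: returns a VM fd. */
	fds->vm_fd = ioctl(fds->kvm_fd, KVM_CREATE_VM, 0UL);
	if (fds->vm_fd < 0)
		goto fail;

	/* VM ioctl: returns a vcpu fd. */
	fds->vcpu_fd = ioctl(fds->vm_fd, KVM_CREATE_VCPU, 0UL);
	if (fds->vcpu_fd < 0)
		goto fail;

	return 0;

fail:
	if (fds->vm_fd >= 0)
		close(fds->vm_fd);
	close(fds->kvm_fd);
	fds->kvm_fd = fds->vm_fd = fds->vcpu_fd = -1;
	return -1;
}
```

In line with the threading rules above, the thread that calls such a
helper would also be the one that later issues ioctls on vcpu_fd.
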

3. Extensions
-------------

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.


4. API description
------------------

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4).

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


4.1 KVM_GET_API_VERSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change. However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported. Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12. If this check passes, all ioctls
described as 'basic' will be available.

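The version check recommended above might look like the following
sketch. The helper name kvm_api_is_stable() is hypothetical; the only
kvm interfaces used are the KVM_GET_API_VERSION ioctl and the
KVM_API_VERSION constant from <linux/kvm.h>.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns 1 if kvm_fd reports the stable API version, 0 if it reports
 * any other version, and -1 if the ioctl itself fails.
 */
int kvm_api_is_stable(int kvm_fd)
{
	int version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0UL);

	if (version < 0)
		return -1;
	return version == KVM_API_VERSION;
}
```

An application would typically exit with an error unless this returns 1.
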

4.2 KVM_CREATE_VM

Capability: basic
Architectures: all
Type: system ioctl
Parameters: machine type identifier (KVM_VM_*)
Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory. An mmap() of a VM fd
will access the virtual machine's physical address space; offset zero
corresponds to guest physical address zero. Use of mmap() on a VM fd
is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
available.
In almost all cases, 0 should be used as the machine type.

In order to create user controlled virtual machines on S390, check
KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as a
privileged user (CAP_SYS_ADMIN).


4.3 KVM_GET_MSR_INDEX_LIST

Capability: basic
Architectures: x86
Type: system ioctl
Parameters: struct kvm_msr_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the msr index list is too big to fit in the array specified by
             the user.

struct kvm_msr_list {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 indices[0];
};

This ioctl returns the guest msrs that are supported. The list varies
by kvm version and host processor, but does not change otherwise. The
user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in
the indices array with their numbers.

Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.

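One common way to size the indices array is to call the ioctl twice:
once with nmsrs set to 0 to learn the count, and once with a buffer of
that size. The sketch below assumes that the kernel updates nmsrs even
when it fails with E2BIG, which the text above does not state explicitly
and should be verified against the kernel in use. The helper name
kvm_get_msr_index_list() is hypothetical, and the code is x86 only.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns a malloc()ed MSR index list for the system fd kvm_fd, or NULL
 * on failure. The caller frees the result.
 */
struct kvm_msr_list *kvm_get_msr_index_list(int kvm_fd)
{
	struct kvm_msr_list probe = { .nmsrs = 0 };
	struct kvm_msr_list *list;

	/* First call: a zero-sized array is expected to fail with E2BIG. */
	if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &probe) < 0 &&
	    errno != E2BIG)
		return NULL;

	/* Second call: allocate room for the reported number of indices. */
	list = malloc(sizeof(*list) +
		      probe.nmsrs * sizeof(list->indices[0]));
	if (!list)
		return NULL;

	list->nmsrs = probe.nmsrs;
	if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list) < 0) {
		free(list);
		return NULL;
	}
	return list;
}
```
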

4.4 KVM_CHECK_EXTENSION

Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
Architectures: all
Type: system ioctl, vm ioctl
Parameters: extension identifier (KVM_CAP_*)
Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API. Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.

Based on their initialization, different VMs may have different capabilities.
It is thus encouraged to use the vm ioctl to query for capabilities (available
with KVM_CAP_CHECK_EXTENSION_VM on the vm fd).

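The preference for the vm ioctl can be expressed as a small wrapper.
This is only a sketch: kvm_check_cap() is a hypothetical name, and
passing a negative vm_fd to mean "no VM created yet" is a convention of
this example rather than part of the API.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Queries capability cap, using the VM fd when the kernel advertises
 * KVM_CAP_CHECK_EXTENSION_VM and falling back to the system fd
 * otherwise. Returns the raw ioctl result: 0 if unsupported, a positive
 * value if supported, -1 on error.
 */
int kvm_check_cap(int kvm_fd, int vm_fd, long cap)
{
	if (vm_fd >= 0 &&
	    ioctl(kvm_fd, KVM_CHECK_EXTENSION,
		  (long)KVM_CAP_CHECK_EXTENSION_VM) > 0)
		return ioctl(vm_fd, KVM_CHECK_EXTENSION, cap);

	return ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);
}
```

Because some extensions report extra information in the return value,
callers should keep the integer rather than reduce it to a boolean.
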

4.5 KVM_GET_VCPU_MMAP_SIZE

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (see section 4.10) communicates with userspace via a
shared memory region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.


4.6 KVM_SET_MEMORY_REGION

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: struct kvm_memory_region (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.


4.7 KVM_CREATE_VCPU

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: vcpu id (apic id on x86)
Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. The vcpu id is a small integer
in the range [0, max_vcpus).

The recommended max_vcpus value can be retrieved at run-time by passing
KVM_CAP_NR_VCPUS to the KVM_CHECK_EXTENSION ioctl.
The maximum possible value for max_vcpus can be retrieved at run-time by
passing KVM_CAP_MAX_VCPUS to the KVM_CHECK_EXTENSION ioctl.

If KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4.
If KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
the same as the value returned from KVM_CAP_NR_VCPUS.

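The fallback rules for the two limits can be written down as a small
pure function. The names below are hypothetical, and treating a
KVM_CHECK_EXTENSION result of 0 or less as "capability does not exist"
is an assumption of this sketch.

```c
/* Hypothetical pair of limits derived from KVM_CHECK_EXTENSION results. */
struct vcpu_limits {
	int recommended;	/* from KVM_CAP_NR_VCPUS, else 4 */
	int maximum;		/* from KVM_CAP_MAX_VCPUS, else recommended */
};

/*
 * nr_vcpus and max_vcpus are the values returned by KVM_CHECK_EXTENSION
 * for KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS respectively.
 */
struct vcpu_limits vcpu_limits_from_caps(int nr_vcpus, int max_vcpus)
{
	struct vcpu_limits limits;

	limits.recommended = nr_vcpus > 0 ? nr_vcpus : 4;
	limits.maximum = max_vcpus > 0 ? max_vcpus : limits.recommended;
	return limits;
}

/* Returns 1 if id lies in [0, maximum), else 0. */
int vcpu_id_in_range(const struct vcpu_limits *limits, int id)
{
	return id >= 0 && id < limits->maximum;
}
```
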
On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores. (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.) The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore). The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.

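As a worked illustration of the id arithmetic, with hypothetical helper
names: if KVM_CAP_PPC_SMT reports 4 vcpus per vcore, vcpu id 5 belongs to
vcore 1, and single-threaded guest vcpus would use ids 0, 4, 8, 12 and
so on.

```c
/*
 * threads_per_vcore is the value reported for KVM_CAP_PPC_SMT and must
 * be non-zero. Returns the vcore that a given vcpu id falls into.
 */
unsigned int vcore_of_vcpu(unsigned int vcpu_id,
			   unsigned int threads_per_vcore)
{
	return vcpu_id / threads_per_vcore;
}

/*
 * Returns the vcpu id to use for the n-th single-threaded guest vcpu:
 * a multiple of threads_per_vcore, so that each vcpu gets its own vcore.
 */
unsigned int single_threaded_vcpu_id(unsigned int n,
				     unsigned int threads_per_vcore)
{
	return n * threads_per_vcore;
}
```
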
For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
cpu's hardware control block.


4.8 KVM_GET_DIRTY_LOG (vm ioctl)

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error

/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.

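Walking the returned bitmap could look like the sketch below. The helper
names are hypothetical. Two details are assumptions of this example and
not stated above: the buffer size is rounded up to a multiple of 8
bytes, and bits are numbered least-significant first within each byte.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes to allocate for a bitmap covering npages pages, rounded up to
 * a multiple of 8 bytes (assumed rounding, see the note above).
 */
size_t dirty_bitmap_bytes(size_t npages)
{
	return ((npages + 63) / 64) * 8;
}

/* Returns 1 if the page at index page (0 = first page in the slot) is dirty. */
int dirty_bitmap_test(const uint8_t *bitmap, size_t page)
{
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Counts the dirty pages among the first npages pages of the slot. */
size_t dirty_bitmap_count(const uint8_t *bitmap, size_t npages)
{
	size_t page, count = 0;

	for (page = 0; page < npages; page++)
		count += (size_t)dirty_bitmap_test(bitmap, page);
	return count;
}
```
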

4.9 KVM_SET_MEMORY_ALIAS

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_memory_alias (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.


4.10 KVM_RUN

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
  EINTR:     an unmasked signal is pending

This ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).

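Obtaining the implicit parameter block might be sketched as follows. The
helper name kvm_map_run() is hypothetical; the read/write protection and
MAP_SHARED flags are the choices made for this example, and the sanity
check against sizeof(struct kvm_run) is an extra precaution rather than
something the API requires.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/*
 * Maps the kvm_run block of vcpu_fd. kvm_fd is the system fd, which is
 * where KVM_GET_VCPU_MMAP_SIZE is issued. On success stores the mapping
 * length in *len and returns the mapping; returns NULL on failure.
 */
struct kvm_run *kvm_map_run(int kvm_fd, int vcpu_fd, size_t *len)
{
	int size = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0UL);
	void *run;

	if (size < 0 || (size_t)size < sizeof(struct kvm_run))
		return NULL;

	run = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   vcpu_fd, 0);
	if (run == MAP_FAILED)
		return NULL;

	*len = (size_t)size;
	return run;
}
```

The returned length is what a later munmap() call would need.
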

4.11 KVM_GET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (out)
Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

/* x86 */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 rax, rbx, rcx, rdx;
	__u64 rsi, rdi, rsp, rbp;
	__u64 r8,  r9,  r10, r11;
	__u64 r12, r13, r14, r15;
	__u64 rip, rflags;
};

/* mips */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 gpr[32];
	__u64 hi;
	__u64 lo;
	__u64 pc;
};


4.12 KVM_SET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (in)
Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.


4.13 KVM_GET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (out)
Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

/* x86 */
struct kvm_sregs {
	struct kvm_segment cs, ds, es, fs, gs, ss;
	struct kvm_segment tr, ldt;
	struct kvm_dtable gdt, idt;
	__u64 cr0, cr2, cr3, cr4, cr8;
	__u64 efer;
	__u64 apic_base;
	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};

/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */

interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.


4.14 KVM_SET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (in)
Returns: 0 on success, -1 on error

Writes special registers into the vcpu. See KVM_GET_SREGS for the
data structures.


4.15 KVM_TRANSLATE

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_translation (in/out)
Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

struct kvm_translation {
	/* in */
	__u64 linear_address;

	/* out */
	__u64 physical_address;
	__u8  valid;
	__u8  writeable;
	__u8  usermode;
	__u8  pad[5];
};


4.16 KVM_INTERRUPT

Capability: basic
Architectures: x86, ppc, mips
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error

Queues a hardware interrupt vector to be injected. This is only
useful if an in-kernel local APIC or equivalent is not used.

/* for KVM_INTERRUPT */
struct kvm_interrupt {
	/* in */
	__u32 irq;
};

X86:

Note 'irq' is an interrupt vector, not an interrupt pin or line.

PPC:

Queues an external interrupt to be injected. This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET

  This injects an edge type external interrupt into the guest once it's ready
  to receive interrupts. When injected, the interrupt is done.

b) KVM_INTERRUPT_UNSET

  This unsets any pending interrupt.

  Only available with KVM_CAP_PPC_UNSET_IRQ.

c) KVM_INTERRUPT_SET_LEVEL

  This injects a level type external interrupt into the guest context. The
  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
  is triggered.

  Only available with KVM_CAP_PPC_IRQ_LEVEL.

Note that any value for 'irq' other than the ones stated above is invalid
and results in unexpected behavior.

MIPS:

Queues an external interrupt to be injected into the virtual CPU. A negative
interrupt number dequeues the interrupt.


4.17 KVM_DEBUG_GUEST

Capability: basic
Architectures: none
Type: vcpu ioctl
Parameters: none
Returns: -1 on error

Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.


4.18 KVM_GET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in/out)
Returns: 0 on success, -1 on error

Reads model-specific registers from the vcpu. Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST.

struct kvm_msrs {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 pad;

	struct kvm_msr_entry entries[0];
};

struct kvm_msr_entry {
	__u32 index;
	__u32 reserved;
	__u64 data;
};

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.


4.19 KVM_SET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in)
Returns: 0 on success, -1 on error

Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.


4.20 KVM_SET_CPUID

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_cpuid (in)
Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction. Applications
should use the KVM_SET_CPUID2 ioctl if available.

struct kvm_cpuid_entry {
	__u32 function;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding;
};

/* for KVM_SET_CPUID */
struct kvm_cpuid {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry entries[0];
};


4.21 KVM_SET_SIGNAL_MASK

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note the signal will only be delivered if not blocked by the original
signal mask.

/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
	__u32 len;
	__u8  sigset[0];
};

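Because sigset is a variable-length trailing array, the structure has to
be allocated with room for the signal set. The sketch below only shows
that allocation pattern; kvm_signal_mask_new() is a hypothetical helper.
The value to pass as len is not specified above. It is assumed here to
be the kernel's own signal set size (believed to be 8 bytes on x86-64,
which is smaller than the C library's sigset_t), and this should be
checked for the target architecture.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <linux/kvm.h>

/*
 * Builds a kvm_signal_mask holding a copy of the len bytes at set.
 * Returns a malloc()ed structure (the caller frees it) or NULL.
 */
struct kvm_signal_mask *kvm_signal_mask_new(const void *set, uint32_t len)
{
	struct kvm_signal_mask *mask = malloc(sizeof(*mask) + len);

	if (!mask)
		return NULL;

	mask->len = len;
	memcpy(mask->sigset, set, len);
	return mask;
}
```

The result would then be passed to the KVM_SET_SIGNAL_MASK ioctl on the
vcpu fd and freed afterwards.
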

4.22 KVM_GET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (out)
Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8  fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8  ftwx;  /* in fxsave format */
	__u8  pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8  xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};


4.23 KVM_SET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (in)
Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8  fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8  ftwx;  /* in fxsave format */
	__u8  pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8  xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};


4.24 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
Architectures: x86, ia64, ARM, arm64, s390
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel. On x86, creates a virtual
ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSIs 16-23
go only to the IOAPIC. On ia64, an IOSAPIC is created. On ARM/arm64, a GIC is
created. On s390, a dummy irq routing table is created.

Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.


4.25 KVM_IRQ_LINE

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64, arm, arm64
Type: vm ioctl
Parameters: struct kvm_irq_level
Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
On some architectures it is required that an interrupt controller model has
been previously created with KVM_CREATE_IRQCHIP. Note that edge-triggered
interrupts require the level to be set to 1 and then back to 0.

On real hardware, interrupt pins can be active-low or active-high. This
does not matter for the level field of struct kvm_irq_level: 1 always
means active (asserted), 0 means inactive (deasserted).

x86 allows the operating system to program the interrupt polarity
(active-low/active-high) for level-triggered interrupts, and KVM used
to consider the polarity. However, due to bitrot in the handling of
active-low interrupts, the above convention is now valid on x86 too.
This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED. Userspace
should not present interrupts to the guest as active-low unless this
capability is present (or unless it is not using the in-kernel irqchip,
of course).

ARM/arm64 can signal an interrupt either at the CPU level, or at the
in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
use PPIs designated for specific cpus. The irq field is interpreted
like this:

  bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
  field: | irq_type  | vcpu_index |     irq_id     |

The irq_type field has the following values:
- irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
- irq_type[1]: in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
               (the vcpu_index field is ignored)
- irq_type[2]: in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)

(The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)

In both cases, level is used to assert/deassert the line.

struct kvm_irq_level {
	union {
		__u32 irq;     /* GSI */
		__s32 status;  /* not used for KVM_IRQ_LINE */
	};
	__u32 level;           /* 0 or 1 */
};

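The ARM/arm64 layout of the irq field described in this section can be
packed with a few shifts. The enum and function names below are
hypothetical and exist only for this sketch; the bit positions and the
irq_id ranges are the ones listed above.

```c
#include <stdint.h>

/* irq_type values as listed above; the enumerator names are made up. */
enum arm_irq_type {
	ARM_IRQ_TYPE_CPU = 0,	/* out-of-kernel GIC: irq_id 0 = IRQ, 1 = FIQ */
	ARM_IRQ_TYPE_SPI = 1,	/* in-kernel GIC, irq_id 32..1019 */
	ARM_IRQ_TYPE_PPI = 2,	/* in-kernel GIC, irq_id 16..31 */
};

/* Packs irq_type, vcpu_index and irq_id into the 32-bit irq value. */
uint32_t arm_irq_encode(enum arm_irq_type type, uint8_t vcpu_index,
			uint16_t irq_id)
{
	return ((uint32_t)type << 24) |
	       ((uint32_t)vcpu_index << 16) |
	       (uint32_t)irq_id;
}

/* Returns 1 if irq_id is in the documented range for type, else 0. */
int arm_irq_id_valid(enum arm_irq_type type, uint16_t irq_id)
{
	switch (type) {
	case ARM_IRQ_TYPE_CPU:
		return irq_id <= 1;
	case ARM_IRQ_TYPE_SPI:
		return irq_id >= 32 && irq_id <= 1019;
	case ARM_IRQ_TYPE_PPI:
		return irq_id >= 16 && irq_id <= 31;
	}
	return 0;
}
```

For example, PPI 27 on vcpu 3 encodes to 0x0203001b. The encoded value
goes into the irq member of struct kvm_irq_level.
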

4.26 KVM_GET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in/out)
Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512];  /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};


4.27 KVM_SET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in)
Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512];  /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};


4.28 KVM_XEN_HVM_CONFIG

Capability: KVM_CAP_XEN_HVM
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_xen_hvm_config (in)
Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace. When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

struct kvm_xen_hvm_config {
	__u32 flags;
	__u32 msr;
	__u64 blob_addr_32;
	__u64 blob_addr_64;
	__u8 blob_size_32;
	__u8 blob_size_64;
	__u8 pad2[30];
};


4.29 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock;  /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};


4.30 KVM_SET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock;  /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};

Jan Kiszka414fa982012-04-24 16:40:15 +0200782
Paul Bolle68ba6972011-02-15 00:05:59 +01007834.31 KVM_GET_VCPU_EVENTS
Jan Kiszka3cfc3092009-11-12 01:04:25 +0100784
785Capability: KVM_CAP_VCPU_EVENTS
Jan Kiszka48005f62010-02-19 19:38:07 +0100786Extended by: KVM_CAP_INTR_SHADOW
Jan Kiszka3cfc3092009-11-12 01:04:25 +0100787Architectures: x86
788Type: vm ioctl
789Parameters: struct kvm_vcpu_event (out)
790Returns: 0 on success, -1 on error
791
792Gets currently pending exceptions, interrupts, and NMIs as well as related
793states of the vcpu.
794
795struct kvm_vcpu_events {
796 struct {
797 __u8 injected;
798 __u8 nr;
799 __u8 has_error_code;
800 __u8 pad;
801 __u32 error_code;
802 } exception;
803 struct {
804 __u8 injected;
805 __u8 nr;
806 __u8 soft;
Jan Kiszka48005f62010-02-19 19:38:07 +0100807 __u8 shadow;
Jan Kiszka3cfc3092009-11-12 01:04:25 +0100808 } interrupt;
809 struct {
810 __u8 injected;
811 __u8 pending;
812 __u8 masked;
813 __u8 pad;
814 } nmi;
815 __u32 sipi_vector;
Jan Kiszkadab4b912009-12-06 18:24:15 +0100816 __u32 flags;
Jan Kiszka3cfc3092009-11-12 01:04:25 +0100817};
818
Jan Kiszka48005f62010-02-19 19:38:07 +0100819KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
820interrupt.shadow contains a valid state. Otherwise, this field is undefined.
821

4.32 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error

Sets pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending and sipi_vector. Keep the
corresponding bits in the flags field cleared to suppress overwriting the
current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

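The flag handling for KVM_SET_VCPU_EVENTS can be sketched as a small helper
that builds the flags word. The helper name is hypothetical, and the numeric
flag values are defined locally on the assumption that they match the x86
uapi header; prefer the header's definitions where they are available.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the x86 uapi header; defined only if not already visible. */
#ifndef KVM_VCPUEVENT_VALID_NMI_PENDING
#define KVM_VCPUEVENT_VALID_NMI_PENDING 0x00000001
#endif
#ifndef KVM_VCPUEVENT_VALID_SIPI_VECTOR
#define KVM_VCPUEVENT_VALID_SIPI_VECTOR 0x00000002
#endif
#ifndef KVM_VCPUEVENT_VALID_SHADOW
#define KVM_VCPUEVENT_VALID_SHADOW      0x00000004
#endif

/* Hypothetical helper: choose which optional fields the kernel should take
 * from the structure. A cleared bit leaves the in-kernel state untouched. */
static uint32_t vcpu_events_flags(bool set_nmi_pending, bool set_sipi_vector,
                                  bool set_intr_shadow)
{
    uint32_t flags = 0;

    if (set_nmi_pending)
        flags |= KVM_VCPUEVENT_VALID_NMI_PENDING;
    if (set_sipi_vector)
        flags |= KVM_VCPUEVENT_VALID_SIPI_VECTOR;
    if (set_intr_shadow)   /* only meaningful with KVM_CAP_INTR_SHADOW */
        flags |= KVM_VCPUEVENT_VALID_SHADOW;
    return flags;
}
```

A caller restoring state while other vcpus are running would typically pass
false for the first two arguments so that nmi.pending and sipi_vector are not
overwritten.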

4.33 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

struct kvm_debugregs {
        __u64 db[4];
        __u64 dr6;
        __u64 dr7;
        __u64 flags;
        __u64 reserved[9];
};


4.34 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.


4.35 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEM
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
Returns: 0 on success, -1 on error

struct kvm_userspace_memory_region {
        __u32 slot;
        __u32 flags;
        __u64 guest_phys_addr;
        __u64 memory_size; /* bytes */
        __u64 userspace_addr; /* start of the userspace allocated memory */
};

/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define KVM_MEM_READONLY        (1UL << 1)

This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.

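A minimal sketch of registering a slot and of checking the alignment
recommendation above. The helper names and LARGE_PAGE_MASK are hypothetical;
vm_fd is assumed to be a VM file descriptor and host a buffer that stays
mapped for the lifetime of the slot.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define LARGE_PAGE_MASK ((1ULL << 21) - 1)   /* the lower 21 bits */

/* Nonzero if both addresses have identical lower 21 bits. */
static int low21_match(uint64_t guest_phys_addr, uint64_t userspace_addr)
{
    return (guest_phys_addr & LARGE_PAGE_MASK) ==
           (userspace_addr & LARGE_PAGE_MASK);
}

/* Hypothetical helper: back guest physical range [gpa, gpa + size) with the
 * userspace buffer at 'host'. Returns the ioctl() result. */
static int set_user_memory(int vm_fd, uint32_t slot, uint64_t gpa,
                           void *host, uint64_t size, uint32_t flags)
{
    struct kvm_userspace_memory_region region;

    memset(&region, 0, sizeof(region));
    region.slot = slot;
    region.flags = flags;
    region.guest_phys_addr = gpa;
    region.memory_size = size;
    region.userspace_addr = (uint64_t)(uintptr_t)host;
    return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
```

A caller could test low21_match() before set_user_memory() and, if it fails,
allocate the backing buffer again at a suitably aligned address.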
The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
writes to memory within the slot. See the KVM_GET_DIRTY_LOG ioctl for how to
use it. The latter can be set, if the KVM_CAP_READONLY_MEM capability allows
it, to make a new slot read-only. In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DONTNEED).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
KVM_SET_MEMORY_REGION does not allow fine-grained control over memory
allocation and is deprecated.


4.36 KVM_SET_TSS_ADDR

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP, KVM_CAP_ENABLE_CAP_VM
Architectures: ppc, s390
Type: vcpu ioctl, vm ioctl (with KVM_CAP_ENABLE_CAP_VM)
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails. On systems that
do support it, it only works for extensions that are supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

struct kvm_enable_cap {
       /* in */
       __u32 cap;

The capability that is supposed to get enabled.

       __u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

       __u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

       __u8  pad[64];
};

The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.

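A sketch of the check-then-enable sequence described for KVM_ENABLE_CAP. The
helper name is hypothetical; kvm_fd is assumed to be the /dev/kvm descriptor
and target_fd the vcpu or VM descriptor the capability applies to. Only
args[0] is filled in here, which is an arbitrary choice for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: enable 'cap' on target_fd if KVM reports it.
 * Returns 0 on success, -1 if the capability is absent or the ioctl fails. */
static int try_enable_cap(int kvm_fd, int target_fd, uint32_t cap,
                          uint64_t arg0)
{
    struct kvm_enable_cap req;

    if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, (unsigned long)cap) <= 0)
        return -1;

    memset(&req, 0, sizeof(req));   /* flags has to be 0 for now */
    req.cap = cap;
    req.args[0] = arg0;
    return ioctl(target_fd, KVM_ENABLE_CAP, &req);
}
```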
4.38 KVM_GET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64, s390
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error

struct kvm_mp_state {
        __u32 mp_state;
};

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running [x86, ia64]
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal [x86,
                                 ia64]
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI [x86, ia64]
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt [x86, ia64]
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS) [x86, ia64]
 - KVM_MP_STATE_STOPPED:         the vcpu is stopped [s390]
 - KVM_MP_STATE_CHECK_STOP:      the vcpu is in a special error state [s390]
 - KVM_MP_STATE_OPERATING:       the vcpu is operating (running or halted)
                                 [s390]
 - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                 [s390]

On x86 and ia64, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.

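For debugging output it can help to translate the mp_state value into text.
The following sketch assumes the installed <linux/kvm.h> defines all of the
constants listed for KVM_GET_MP_STATE; the function names are hypothetical.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Map an mp_state value to a short printable name. */
static const char *mp_state_name(__u32 mp_state)
{
    switch (mp_state) {
    case KVM_MP_STATE_RUNNABLE:      return "RUNNABLE";
    case KVM_MP_STATE_UNINITIALIZED: return "UNINITIALIZED";
    case KVM_MP_STATE_INIT_RECEIVED: return "INIT_RECEIVED";
    case KVM_MP_STATE_HALTED:        return "HALTED";
    case KVM_MP_STATE_SIPI_RECEIVED: return "SIPI_RECEIVED";
    case KVM_MP_STATE_STOPPED:       return "STOPPED";
    case KVM_MP_STATE_CHECK_STOP:    return "CHECK_STOP";
    case KVM_MP_STATE_OPERATING:     return "OPERATING";
    case KVM_MP_STATE_LOAD:          return "LOAD";
    default:                         return "UNKNOWN";
    }
}

/* Hypothetical helper: query a vcpu and return the state name,
 * or NULL if the ioctl fails. */
static const char *query_mp_state(int vcpu_fd)
{
    struct kvm_mp_state st = { 0 };

    if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &st) != 0)
        return NULL;
    return mp_state_name(st.mp_state);
}
```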

4.39 KVM_SET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64, s390
Type: vcpu ioctl
Parameters: struct kvm_mp_state (in)
Returns: 0 on success; -1 on error

Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.

On x86 and ia64, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
these architectures.


4.40 KVM_SET_IDENTITY_MAP_ADDR

Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long identity (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a one-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.41 KVM_SET_BOOT_CPU_ID

Capability: KVM_CAP_SET_BOOT_CPU_ID
Architectures: x86, ia64
Type: vm ioctl
Parameters: unsigned long vcpu_id
Returns: 0 on success, -1 on error

Defines which vcpu is the Bootstrap Processor (BSP). Values are the same
as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
is vcpu 0.


4.42 KVM_GET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (out)
Returns: 0 on success, -1 on error

struct kvm_xsave {
        __u32 region[1024];
};

This ioctl copies the current vcpu's xsave struct to userspace.


4.43 KVM_SET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (in)
Returns: 0 on success, -1 on error

struct kvm_xsave {
        __u32 region[1024];
};

This ioctl copies userspace's xsave struct to the kernel.


4.44 KVM_GET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (out)
Returns: 0 on success, -1 on error

struct kvm_xcr {
        __u32 xcr;
        __u32 reserved;
        __u64 value;
};

struct kvm_xcrs {
        __u32 nr_xcrs;
        __u32 flags;
        struct kvm_xcr xcrs[KVM_MAX_XCRS];
        __u64 padding[16];
};

This ioctl copies the current vcpu's xcrs to userspace.


4.45 KVM_SET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (in)
Returns: 0 on success, -1 on error

struct kvm_xcr {
        __u32 xcr;
        __u32 reserved;
        __u64 value;
};

struct kvm_xcrs {
        __u32 nr_xcrs;
        __u32 flags;
        struct kvm_xcr xcrs[KVM_MAX_XCRS];
        __u64 padding[16];
};

This ioctl sets the vcpu's xcrs to the values specified by userspace.


4.46 KVM_GET_SUPPORTED_CPUID

Capability: KVM_CAP_EXT_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
        __u32 nent;
        __u32 padding;
        struct kvm_cpuid_entry2 entries[0];
};

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX         BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC            BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT          BIT(2)

struct kvm_cpuid_entry2 {
        __u32 function;
        __u32 index;
        __u32 flags;
        __u32 eax;
        __u32 ebx;
        __u32 ecx;
        __u32 edx;
        __u32 padding[3];
};

This ioctl returns x86 cpuid features which are supported by both the hardware
and kvm. Userspace can use the information returned by this ioctl to
construct cpuid information (for KVM_SET_CPUID2) that is consistent with
hardware, kernel, and userspace capabilities, and with user requirements (for
example, the user may wish to constrain cpuid to emulate older hardware,
or for feature consistency across a cluster).

Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe the cpu
capabilities, an error (E2BIG) is returned. If the number is too high,
the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.

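Because the required number of entries is not known in advance, callers
commonly retry with a larger table. The following is only a sketch of that
pattern for x86 hosts: the helper name, the starting size of 16 and the upper
bound of 1024 are arbitrary assumptions, and every error other than E2BIG is
simply reported as failure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: returns a malloc()ed table that the caller must
 * free(), or NULL on error. kvm_fd is the /dev/kvm descriptor. */
static struct kvm_cpuid2 *get_supported_cpuid(int kvm_fd)
{
    __u32 nent;

    for (nent = 16; nent <= 1024; nent *= 2) {
        size_t size = sizeof(struct kvm_cpuid2) +
                      nent * sizeof(struct kvm_cpuid_entry2);
        struct kvm_cpuid2 *cpuid = calloc(1, size);
        int saved_errno;

        if (cpuid == NULL)
            return NULL;
        cpuid->nent = nent;
        if (ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, cpuid) == 0)
            return cpuid;   /* cpuid->nent now counts the valid entries */

        saved_errno = errno;   /* free() may overwrite errno */
        free(cpuid);
        if (saved_errno != E2BIG)
            return NULL;       /* not a "table too small" failure */
    }
    return NULL;
}
```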
The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. Some features (for example,
x2apic) may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently. The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
   eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
support. Instead it is reported via

  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)

If that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.


4.47 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error

struct kvm_ppc_pvinfo {
        __u32 flags;
        __u32 hcall[4];
        __u8  pad[108];
};

This ioctl fetches PV specific information that needs to be passed to the guest
using the device tree or other means from vm context.

The hcall array defines 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

The flags bitmap is defined as:

   /* the host supports the ePAPR idle hcall */
   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)

4.48 KVM_ASSIGN_PCI_DEVICE

Capability: KVM_CAP_DEVICE_ASSIGNMENT
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Assigns a host PCI device to the VM.

struct kvm_assigned_pci_dev {
        __u32 assigned_dev_id;
        __u32 busnr;
        __u32 devfn;
        __u32 flags;
        __u32 segnr;
        union {
                __u32 reserved[11];
        };
};

The PCI device is specified by the triple segnr, busnr, and devfn.
Identification in succeeding service requests is done via assigned_dev_id. The
following flags are specified:

/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU     (1 << 0)
/* The following two depend on KVM_CAP_PCI_2_3 */
#define KVM_DEV_ASSIGN_PCI_2_3          (1 << 1)
#define KVM_DEV_ASSIGN_MASK_INTX        (1 << 2)

If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
via the PCI-2.3-compliant device-level mask, thus enabling IRQ sharing with
other assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
guest's view on the INTx mask; see KVM_ASSIGN_SET_INTX_MASK for details.

The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
isolation of the device. Usages not specifying this flag are deprecated.

Only PCI header type 0 devices with PCI BAR resources are supported by
device assignment. The user requesting this ioctl must have read/write
access to the PCI sysfs resource files associated with the device.


4.49 KVM_DEASSIGN_PCI_DEVICE

Capability: KVM_CAP_DEVICE_DEASSIGNMENT
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Ends PCI device assignment, releasing all associated resources.

See KVM_ASSIGN_PCI_DEVICE for the data structure. Only assigned_dev_id is
used in kvm_assigned_pci_dev to identify the device.


4.50 KVM_ASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Assigns an IRQ to a passed-through device.

struct kvm_assigned_irq {
        __u32 assigned_dev_id;
        __u32 host_irq; /* ignored (legacy field) */
        __u32 guest_irq;
        __u32 flags;
        union {
                __u32 reserved[12];
        };
};

The following flags are defined:

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

It is not valid to specify multiple types per host or guest IRQ. However, the
IRQ type of host and guest can differ or can even be null.

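The rule that at most one IRQ type may be given per side can be checked
before issuing KVM_ASSIGN_DEV_IRQ. This sketch repeats the flag definitions
so that it compiles on its own; the mask names and the helper are
hypothetical.

```c
#include <stdint.h>

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

/* Hypothetical masks covering the host-side and guest-side type bits. */
#define HOST_TYPE_BITS  (KVM_DEV_IRQ_HOST_INTX | KVM_DEV_IRQ_HOST_MSI | \
                         KVM_DEV_IRQ_HOST_MSIX)
#define GUEST_TYPE_BITS (KVM_DEV_IRQ_GUEST_INTX | KVM_DEV_IRQ_GUEST_MSI | \
                         KVM_DEV_IRQ_GUEST_MSIX)

/* Nonzero if 'bits' has zero or one bit set. */
static int at_most_one_bit(uint32_t bits)
{
    return (bits & (bits - 1)) == 0;
}

/* Nonzero if flags name at most one host type and at most one guest type. */
static int assigned_irq_flags_valid(uint32_t flags)
{
    return at_most_one_bit(flags & HOST_TYPE_BITS) &&
           at_most_one_bit(flags & GUEST_TYPE_BITS);
}
```

For example, host MSI combined with guest INTx passes the check, while host
INTx combined with host MSI does not.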

4.51 KVM_DEASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Ends an IRQ assignment to a passed-through device.

See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
by assigned_dev_id; flags must correspond to the IRQ type specified on
KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.


4.52 KVM_SET_GSI_ROUTING

Capability: KVM_CAP_IRQ_ROUTING
Architectures: x86 ia64 s390
Type: vm ioctl
Parameters: struct kvm_irq_routing (in)
Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

struct kvm_irq_routing {
        __u32 nr;
        __u32 flags;
        struct kvm_irq_routing_entry entries[0];
};

No flags are specified so far; the corresponding field must be set to zero.

struct kvm_irq_routing_entry {
        __u32 gsi;
        __u32 type;
        __u32 flags;
        __u32 pad;
        union {
                struct kvm_irq_routing_irqchip irqchip;
                struct kvm_irq_routing_msi msi;
                struct kvm_irq_routing_s390_adapter adapter;
                __u32 pad[8];
        } u;
};

/* gsi routing entry types */
#define KVM_IRQ_ROUTING_IRQCHIP 1
#define KVM_IRQ_ROUTING_MSI 2
#define KVM_IRQ_ROUTING_S390_ADAPTER 3

No flags are specified so far; the corresponding field must be set to zero.

struct kvm_irq_routing_irqchip {
        __u32 irqchip;
        __u32 pin;
};

struct kvm_irq_routing_msi {
        __u32 address_lo;
        __u32 address_hi;
        __u32 data;
        __u32 pad;
};

struct kvm_irq_routing_s390_adapter {
        __u64 ind_addr;
        __u64 summary_addr;
        __u64 ind_offset;
        __u32 summary_offset;
        __u32 adapter_id;
};

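Since kvm_irq_routing ends in a variable-size array, the table has to be
allocated with room for its entries. The following sketch shows one way to
build such a table; the helper names are hypothetical, and it assumes the
installed <linux/kvm.h> provides the routing structures shown above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Allocate a zeroed routing table with room for 'nr' entries. */
static struct kvm_irq_routing *alloc_routing(__u32 nr)
{
    struct kvm_irq_routing *table;

    table = calloc(1, sizeof(*table) +
                      nr * sizeof(struct kvm_irq_routing_entry));
    if (table != NULL)
        table->nr = nr;   /* flags stays zero */
    return table;
}

/* Route 'gsi' to a pin of an in-kernel interrupt controller. */
static void route_irqchip(struct kvm_irq_routing_entry *e, __u32 gsi,
                          __u32 irqchip, __u32 pin)
{
    e->gsi = gsi;
    e->type = KVM_IRQ_ROUTING_IRQCHIP;
    e->flags = 0;
    e->u.irqchip.irqchip = irqchip;
    e->u.irqchip.pin = pin;
}

/* Route 'gsi' to an MSI message (64-bit address split into two halves). */
static void route_msi(struct kvm_irq_routing_entry *e, __u32 gsi,
                      __u64 address, __u32 data)
{
    e->gsi = gsi;
    e->type = KVM_IRQ_ROUTING_MSI;
    e->flags = 0;
    e->u.msi.address_lo = (__u32)address;
    e->u.msi.address_hi = (__u32)(address >> 32);
    e->u.msi.data = data;
}

/* Install the table on a VM; returns the ioctl() result. */
static int set_gsi_routing(int vm_fd, struct kvm_irq_routing *table)
{
    return ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
}
```

Because the ioctl overwrites any previously set entries, a caller that wants
to add one route has to resubmit the complete table.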

4.53 KVM_ASSIGN_SET_MSIX_NR

Capability: KVM_CAP_DEVICE_MSIX
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_msix_nr (in)
Returns: 0 on success, -1 on error

Set the number of MSI-X interrupts for an assigned device. The number is
reset again by terminating the MSI-X assignment of the device via
KVM_DEASSIGN_DEV_IRQ. Calling this service more than once at any earlier
point will fail.

struct kvm_assigned_msix_nr {
        __u32 assigned_dev_id;
        __u16 entry_nr;
        __u16 padding;
};

#define KVM_MAX_MSIX_PER_DEV            256


4.54 KVM_ASSIGN_SET_MSIX_ENTRY

Capability: KVM_CAP_DEVICE_MSIX
Architectures: x86 ia64
Type: vm ioctl
Parameters: struct kvm_assigned_msix_entry (in)
Returns: 0 on success, -1 on error

Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
the GSI vector to zero means disabling the interrupt.

struct kvm_assigned_msix_entry {
        __u32 assigned_dev_id;
        __u32 gsi;
        __u16 entry; /* The index of entry in the MSI-X table */
        __u16 padding[3];
};


4.55 KVM_SET_TSC_KHZ

Capability: KVM_CAP_TSC_CONTROL
Architectures: x86
Type: vcpu ioctl
Parameters: virtual tsc_khz
Returns: 0 on success, -1 on error

Specifies the tsc frequency for the virtual machine. The unit of the
frequency is KHz.


4.56 KVM_GET_TSC_KHZ

Capability: KVM_CAP_GET_TSC_KHZ
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: virtual tsc-khz on success, negative value on error

Returns the tsc frequency of the guest. The unit of the return value is
KHz. If the host has an unstable tsc, this ioctl returns -EIO instead as an
error.

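Unlike most ioctls in this document, KVM_SET_TSC_KHZ takes its parameter by
value and KVM_GET_TSC_KHZ reports its result through the return value. A
minimal sketch, with a hypothetical helper name and vcpu_fd assumed to be a
vcpu file descriptor:

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: request a guest tsc frequency, then read back what
 * KVM reports. Returns the frequency in KHz, or -1 on error. */
static long set_tsc_khz_checked(int vcpu_fd, unsigned long khz)
{
    int got;

    /* The frequency is passed directly as the ioctl argument. */
    if (ioctl(vcpu_fd, KVM_SET_TSC_KHZ, khz) != 0)
        return -1;

    got = ioctl(vcpu_fd, KVM_GET_TSC_KHZ, 0);
    return got < 0 ? -1 : got;
}
```

Through the C library's ioctl() wrapper a failure appears as a return value
of -1 with errno set, so an unstable host tsc would be seen as errno == EIO.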

4.57 KVM_GET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (out)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
};

Reads the Local APIC registers and copies them into the supplied
kvm_lapic_state structure. The data format and layout are the same as
documented in the architecture manual.


4.58 KVM_SET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (in)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
};

Copies the input argument into the Local APIC registers. The data format
and layout are the same as documented in the architecture manual.


4.59 KVM_IOEVENTFD

Capability: KVM_CAP_IOEVENTFD
Architectures: all
Type: vm ioctl
Parameters: struct kvm_ioeventfd (in)
Returns: 0 on success, !0 on error

This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
within the guest. A guest write to the registered address will signal the
provided event instead of triggering an exit.

struct kvm_ioeventfd {
        __u64 datamatch;
        __u64 addr;        /* legal pio/mmio address */
        __u32 len;         /* 1, 2, 4, or 8 bytes    */
        __s32 fd;
        __u32 flags;
        __u8  pad[36];
};

For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.

The following flags are defined:

#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
        (1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)

If the datamatch flag is set, the event will be signaled only if the value
written to the registered address is equal to datamatch in struct kvm_ioeventfd.

For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.

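A sketch of attaching an eventfd to a guest address with data matching. The
helper names are hypothetical; the event descriptor is created with the
Linux eventfd() call, and vm_fd is assumed to be a VM file descriptor.

```c
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Fill a request that fires when 'value' is written as a 'len'-byte access
 * to 'addr'. A nonzero is_pio selects the pio address space. */
static void ioeventfd_fill(struct kvm_ioeventfd *req, int efd, __u64 addr,
                           __u32 len, __u64 value, int is_pio)
{
    memset(req, 0, sizeof(*req));
    req->datamatch = value;
    req->addr = addr;
    req->len = len;
    req->fd = efd;
    req->flags = KVM_IOEVENTFD_FLAG_DATAMATCH;
    if (is_pio)
        req->flags |= KVM_IOEVENTFD_FLAG_PIO;
}

/* Hypothetical helper: create an eventfd and attach it. Returns the new
 * eventfd on success, -1 on error. */
static int ioeventfd_attach(int vm_fd, __u64 addr, __u32 len, __u64 value,
                            int is_pio)
{
    struct kvm_ioeventfd req;
    int efd = eventfd(0, 0);

    if (efd < 0)
        return -1;
    ioeventfd_fill(&req, efd, addr, len, value, is_pio);
    if (ioctl(vm_fd, KVM_IOEVENTFD, &req) != 0) {
        close(efd);
        return -1;
    }
    return efd;
}
```

Detaching would use the same request with KVM_IOEVENTFD_FLAG_DEASSIGN added
to the flags.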

4.60 KVM_DIRTY_TLB

Capability: KVM_CAP_SW_TLB
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_dirty_tlb (in)
Returns: 0 on success, -1 on error

struct kvm_dirty_tlb {
        __u64 bitmap;
        __u32 num_dirty;
};

This must be called whenever userspace has changed an entry in the shared
TLB, prior to calling KVM_RUN on the associated vcpu.

The "bitmap" field is the userspace address of an array. This array
consists of a number of bits, equal to the total number of TLB entries as
determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
nearest multiple of 64.

Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
array.

The array is little-endian: bit 0 is the least significant bit of the
first byte, bit 8 is the least significant bit of the second byte, etc.
This avoids any complications with differing word sizes.

The "num_dirty" field is a performance hint for KVM to determine whether it
should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.

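The bitmap layout described for KVM_DIRTY_TLB can be sketched with two small
helpers. Their names are hypothetical; the caller is assumed to keep a
running count of newly set bits to use as num_dirty.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for 'entries' bits, rounded up to a multiple of 64 bits. */
static size_t tlb_bitmap_bytes(uint32_t entries)
{
    return ((size_t)entries + 63) / 64 * 8;
}

/* Mark TLB entry 'idx' dirty. Bit N lives in byte N / 8 at position N % 8,
 * so bit 8 is the least significant bit of the second byte.
 * Returns 1 if the bit was newly set, 0 if it was already set. */
static int tlb_mark_dirty(uint8_t *bitmap, uint32_t idx)
{
    uint8_t mask = (uint8_t)(1u << (idx % 8));

    if (bitmap[idx / 8] & mask)
        return 0;
    bitmap[idx / 8] |= mask;
    return 1;
}
```

Adding the return values of tlb_mark_dirty() yields the number of set bits,
provided the bitmap started out cleared.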
Jan Kiszka414fa982012-04-24 16:40:15 +02001595
15964.61 KVM_ASSIGN_SET_INTX_MASK
Jan Kiszka07700a92012-02-28 14:19:54 +01001597
1598Capability: KVM_CAP_PCI_2_3
1599Architectures: x86
1600Type: vm ioctl
1601Parameters: struct kvm_assigned_pci_dev (in)
1602Returns: 0 on success, -1 on error
1603
1604Allows userspace to mask PCI INTx interrupts from the assigned device. The
1605kernel will not deliver INTx interrupts to the guest between setting and
1606clearing of KVM_ASSIGN_SET_INTX_MASK via this interface. This enables use of
1607and emulation of PCI 2.3 INTx disable command register behavior.
1608
1609This may be used for both PCI 2.3 devices supporting INTx disable natively and
1610older devices lacking this support. Userspace is responsible for emulating the
1611read value of the INTx disable bit in the guest visible PCI command register.
1612When modifying the INTx disable state, userspace should precede updating the
1613physical device command register by calling this ioctl to inform the kernel of
1614the new intended INTx mask state.
1615
1616Note that the kernel uses the device INTx disable bit to internally manage the
1617device interrupt state for PCI 2.3 devices. Reads of this register may
1618therefore not match the expected value. Writes should always use the guest
1619intended INTx disable value rather than attempting to read-copy-update the
1620current physical device state. Races between user and kernel updates to the
1621INTx disable bit are handled lazily in the kernel. It's possible the device
1622may generate unintended interrupts, but they will not be injected into the
1623guest.
1624
1625See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
1626by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is
1627evaluated.
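
A minimal sketch of the userspace side of this emulation follows. It assumes
a hypothetical per-device shadow structure; only the position of the INTx
disable bit (bit 10 of the PCI command register) comes from the PCI
specification. The ioctl call itself is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the PCI command register is "Interrupt Disable" (PCI 2.3). */
#define PCI_COMMAND_INTX_DISABLE 0x0400u

/* Hypothetical per-device state kept by userspace. */
struct intx_shadow {
	bool guest_intx_disabled;	/* value the guest last wrote */
};

/* Record a guest write to the command register.  Returns true when the
 * INTx mask state changed, i.e. when KVM_ASSIGN_SET_INTX_MASK would have
 * to be issued before the physical register is updated. */
bool intx_shadow_write(struct intx_shadow *s, uint16_t guest_cmd)
{
	bool disabled = (guest_cmd & PCI_COMMAND_INTX_DISABLE) != 0;
	bool changed = disabled != s->guest_intx_disabled;

	s->guest_intx_disabled = disabled;
	return changed;
}

/* Compose the guest-visible command value: every bit comes from the
 * physical register except INTx disable, which comes from the shadow. */
uint16_t intx_shadow_read(const struct intx_shadow *s, uint16_t phys_cmd)
{
	uint16_t v = phys_cmd & (uint16_t)~PCI_COMMAND_INTX_DISABLE;

	if (s->guest_intx_disabled)
		v |= PCI_COMMAND_INTX_DISABLE;
	return v;
}
```

The read path never trusts the physical INTx disable bit, because the
kernel may have changed it for its own interrupt management.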


4.62 KVM_CREATE_SPAPR_TCE

Capability: KVM_CAP_SPAPR_TCE
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce (in)
Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O. It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

/* for KVM_CAP_SPAPR_TCE */
struct kvm_create_spapr_tce {
	__u64 liobn;
	__u32 window_size;
};

The liobn field gives the logical IO bus number for which to create a
TCE table. The window_size field specifies the size of the DMA window
which this TCE table will translate - the table will contain one 64-bit
TCE entry for every 4 KiB of the DMA window.

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table. H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace. This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.
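
The relationship between window_size and the table size can be sketched as
below. The function names are hypothetical. Any rounding of the mmap(2)
length to the host page size is not covered here and would be an additional
assumption.

```c
#include <stdint.h>

#define TCE_PAGE_SHIFT 12	/* one TCE per 4 KiB of DMA window */

/* Number of 64-bit TCE entries backing a window of window_size bytes. */
uint64_t tce_entries(uint32_t window_size)
{
	return (uint64_t)window_size >> TCE_PAGE_SHIFT;
}

/* Size in bytes of the table itself: 8 bytes per entry. */
uint64_t tce_table_bytes(uint32_t window_size)
{
	return tce_entries(window_size) * sizeof(uint64_t);
}

/* Index of the entry that translates I/O bus address ioba. */
uint64_t tce_index(uint64_t ioba)
{
	return ioba >> TCE_PAGE_SHIFT;
}
```

For example, a 1 GiB window needs 262144 entries, which occupy 2 MiB.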


4.63 KVM_ALLOCATE_RMA

Capability: KVM_CAP_PPC_RMA
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_allocate_rma (out)
Returns: file descriptor for mapping the allocated RMA

This allocates a Real Mode Area (RMA) from the pool allocated at boot
time by the kernel. An RMA is a physically-contiguous, aligned region
of memory used on older POWER processors to provide the memory which
will be accessed by real-mode (MMU off) accesses in a KVM guest.
POWER processors support a set of sizes for the RMA that usually
includes 64MB, 128MB, 256MB and some larger powers of two.

/* for KVM_ALLOCATE_RMA */
struct kvm_allocate_rma {
	__u64 rma_size;
};

The return value is a file descriptor which can be passed to mmap(2)
to map the allocated RMA into userspace. The mapped area can then be
passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
RMA for a virtual machine. The size of the RMA in bytes (which is
fixed at host kernel boot time) is returned in the rma_size field of
the argument structure.

The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
is supported; 2 if the processor requires all virtual machines to have
an RMA, or 1 if the processor can use an RMA but doesn't require it,
because it supports the Virtual RMA (VRMA) facility.


4.64 KVM_NMI

Capability: KVM_CAP_USER_NMI
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error

Queues an NMI on the thread's vcpu. Note this is well defined only
when KVM_CREATE_IRQCHIP has not been called, since this is an interface
between the virtual cpu core and virtual local APIC. After KVM_CREATE_IRQCHIP
has been called, this interface is completely emulated within the kernel.

To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
following algorithm:

  - pause the vcpu
  - read the local APIC's state (KVM_GET_LAPIC)
  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
  - if so, issue KVM_NMI
  - resume the vcpu

Some guests configure the LINT1 NMI input to cause a panic, aiding in
debugging.
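
The "check whether changing LINT1 will queue an NMI" step might look like
the sketch below. It assumes regs points at the 1024-byte register image
returned by KVM_GET_LAPIC and uses the architectural APIC layout (LVT LINT1
at offset 0x360, delivery mode in bits 10:8, mask in bit 16). The function
names are hypothetical, and checks such as the APIC software-enable state
are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_LVT_LINT1	0x360		/* register offset in the APIC page */
#define APIC_LVT_MASKED	(1u << 16)
#define APIC_DM_MASK	(7u << 8)
#define APIC_DM_NMI	(4u << 8)

/* Read a 32-bit little-endian register from the saved APIC page. */
uint32_t lapic_read32(const uint8_t *regs, unsigned int offset)
{
	return (uint32_t)regs[offset] |
	       ((uint32_t)regs[offset + 1] << 8) |
	       ((uint32_t)regs[offset + 2] << 16) |
	       ((uint32_t)regs[offset + 3] << 24);
}

/* True when the LINT1 entry is unmasked and programmed for NMI delivery. */
bool lint1_raises_nmi(const uint8_t *regs)
{
	uint32_t lvt = lapic_read32(regs, APIC_LVT_LINT1);

	if (lvt & APIC_LVT_MASKED)
		return false;
	return (lvt & APIC_DM_MASK) == APIC_DM_NMI;
}
```

A caller would issue KVM_NMI on the paused vcpu only when
lint1_raises_nmi() returns true.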


4.65 KVM_S390_UCAS_MAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl maps the memory at "user_addr" with the length "length" to
the vcpu's address space starting at "vcpu_addr". All parameters need to
be aligned by 1 megabyte.
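
The alignment rule can be checked in userspace before the ioctl is issued.
The sketch below uses a local structure with the same layout as
kvm_s390_ucas_mapping; the structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define UCAS_ALIGN (1ULL << 20)		/* 1 megabyte */

struct ucas_mapping {			/* same layout as kvm_s390_ucas_mapping */
	uint64_t user_addr;
	uint64_t vcpu_addr;
	uint64_t length;
};

/* True when all three fields are multiples of 1 megabyte. */
bool ucas_mapping_aligned(const struct ucas_mapping *m)
{
	uint64_t all = m->user_addr | m->vcpu_addr | m->length;

	return (all & (UCAS_ALIGN - 1)) == 0;
}
```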


4.66 KVM_S390_UCAS_UNMAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl unmaps the memory in the vcpu's address space starting at
"vcpu_addr" with the length "length". The field "user_addr" is ignored.
All parameters need to be aligned by 1 megabyte.


4.67 KVM_S390_VCPU_FAULT

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: vcpu absolute address (in)
Returns: 0 in case of success

This call creates a page table entry on the virtual cpu's address space
(for user controlled virtual machines) or the virtual machine's address
space (for regular virtual machines). This only works for minor faults,
so it is recommended to first access the memory page in question via the
user page table. This is useful to handle validity intercepts for user
controlled virtual machines to fault in the virtual cpu's lowcore pages
prior to calling the KVM_RUN ioctl.


4.68 KVM_SET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in)
Returns: 0 on success, negative value on failure

struct kvm_one_reg {
	__u64 id;
	__u64 addr;
};

Using this ioctl, a single vcpu register can be set to a specific value
defined by user space with the passed in struct kvm_one_reg, where id
refers to the register identifier as described below and addr is a pointer
to a variable with the respective size. There can be architecture agnostic
and architecture specific registers. Each has its own range of operation
and its own constants and width. The implemented registers are listed
below:

  Arch  | Register                   | Width (bits)
        |                            |
  PPC   | KVM_REG_PPC_HIOR           | 64
  PPC   | KVM_REG_PPC_IAC1           | 64
  PPC   | KVM_REG_PPC_IAC2           | 64
  PPC   | KVM_REG_PPC_IAC3           | 64
  PPC   | KVM_REG_PPC_IAC4           | 64
  PPC   | KVM_REG_PPC_DAC1           | 64
  PPC   | KVM_REG_PPC_DAC2           | 64
  PPC   | KVM_REG_PPC_DABR           | 64
  PPC   | KVM_REG_PPC_DSCR           | 64
  PPC   | KVM_REG_PPC_PURR           | 64
  PPC   | KVM_REG_PPC_SPURR          | 64
  PPC   | KVM_REG_PPC_DAR            | 64
  PPC   | KVM_REG_PPC_DSISR          | 32
  PPC   | KVM_REG_PPC_AMR            | 64
  PPC   | KVM_REG_PPC_UAMOR          | 64
  PPC   | KVM_REG_PPC_MMCR0          | 64
  PPC   | KVM_REG_PPC_MMCR1          | 64
  PPC   | KVM_REG_PPC_MMCRA          | 64
  PPC   | KVM_REG_PPC_MMCR2          | 64
  PPC   | KVM_REG_PPC_MMCRS          | 64
  PPC   | KVM_REG_PPC_SIAR           | 64
  PPC   | KVM_REG_PPC_SDAR           | 64
  PPC   | KVM_REG_PPC_SIER           | 64
  PPC   | KVM_REG_PPC_PMC1           | 32
  PPC   | KVM_REG_PPC_PMC2           | 32
  PPC   | KVM_REG_PPC_PMC3           | 32
  PPC   | KVM_REG_PPC_PMC4           | 32
  PPC   | KVM_REG_PPC_PMC5           | 32
  PPC   | KVM_REG_PPC_PMC6           | 32
  PPC   | KVM_REG_PPC_PMC7           | 32
  PPC   | KVM_REG_PPC_PMC8           | 32
  PPC   | KVM_REG_PPC_FPR0           | 64
          ...
  PPC   | KVM_REG_PPC_FPR31          | 64
  PPC   | KVM_REG_PPC_VR0            | 128
          ...
  PPC   | KVM_REG_PPC_VR31           | 128
  PPC   | KVM_REG_PPC_VSR0           | 128
          ...
  PPC   | KVM_REG_PPC_VSR31          | 128
  PPC   | KVM_REG_PPC_FPSCR          | 64
  PPC   | KVM_REG_PPC_VSCR           | 32
  PPC   | KVM_REG_PPC_VPA_ADDR       | 64
  PPC   | KVM_REG_PPC_VPA_SLB        | 128
  PPC   | KVM_REG_PPC_VPA_DTL        | 128
  PPC   | KVM_REG_PPC_EPCR           | 32
  PPC   | KVM_REG_PPC_EPR            | 32
  PPC   | KVM_REG_PPC_TCR            | 32
  PPC   | KVM_REG_PPC_TSR            | 32
  PPC   | KVM_REG_PPC_OR_TSR         | 32
  PPC   | KVM_REG_PPC_CLEAR_TSR      | 32
  PPC   | KVM_REG_PPC_MAS0           | 32
  PPC   | KVM_REG_PPC_MAS1           | 32
  PPC   | KVM_REG_PPC_MAS2           | 64
  PPC   | KVM_REG_PPC_MAS7_3         | 64
  PPC   | KVM_REG_PPC_MAS4           | 32
  PPC   | KVM_REG_PPC_MAS6           | 32
  PPC   | KVM_REG_PPC_MMUCFG         | 32
  PPC   | KVM_REG_PPC_TLB0CFG        | 32
  PPC   | KVM_REG_PPC_TLB1CFG        | 32
  PPC   | KVM_REG_PPC_TLB2CFG        | 32
  PPC   | KVM_REG_PPC_TLB3CFG        | 32
  PPC   | KVM_REG_PPC_TLB0PS         | 32
  PPC   | KVM_REG_PPC_TLB1PS         | 32
  PPC   | KVM_REG_PPC_TLB2PS         | 32
  PPC   | KVM_REG_PPC_TLB3PS         | 32
  PPC   | KVM_REG_PPC_EPTCFG         | 32
  PPC   | KVM_REG_PPC_ICP_STATE      | 64
  PPC   | KVM_REG_PPC_TB_OFFSET      | 64
  PPC   | KVM_REG_PPC_SPMC1          | 32
  PPC   | KVM_REG_PPC_SPMC2          | 32
  PPC   | KVM_REG_PPC_IAMR           | 64
  PPC   | KVM_REG_PPC_TFHAR          | 64
  PPC   | KVM_REG_PPC_TFIAR          | 64
  PPC   | KVM_REG_PPC_TEXASR         | 64
  PPC   | KVM_REG_PPC_FSCR           | 64
  PPC   | KVM_REG_PPC_PSPB           | 32
  PPC   | KVM_REG_PPC_EBBHR          | 64
  PPC   | KVM_REG_PPC_EBBRR          | 64
  PPC   | KVM_REG_PPC_BESCR          | 64
  PPC   | KVM_REG_PPC_TAR            | 64
  PPC   | KVM_REG_PPC_DPDES          | 64
  PPC   | KVM_REG_PPC_DAWR           | 64
  PPC   | KVM_REG_PPC_DAWRX          | 64
  PPC   | KVM_REG_PPC_CIABR          | 64
  PPC   | KVM_REG_PPC_IC             | 64
  PPC   | KVM_REG_PPC_VTB            | 64
  PPC   | KVM_REG_PPC_CSIGR          | 64
  PPC   | KVM_REG_PPC_TACR           | 64
  PPC   | KVM_REG_PPC_TCSCR          | 64
  PPC   | KVM_REG_PPC_PID            | 64
  PPC   | KVM_REG_PPC_ACOP           | 64
  PPC   | KVM_REG_PPC_VRSAVE         | 32
  PPC   | KVM_REG_PPC_LPCR           | 32
  PPC   | KVM_REG_PPC_LPCR_64        | 64
  PPC   | KVM_REG_PPC_PPR            | 64
  PPC   | KVM_REG_PPC_ARCH_COMPAT    | 32
  PPC   | KVM_REG_PPC_DABRX          | 32
  PPC   | KVM_REG_PPC_WORT           | 64
  PPC   | KVM_REG_PPC_TM_GPR0        | 64
          ...
  PPC   | KVM_REG_PPC_TM_GPR31       | 64
  PPC   | KVM_REG_PPC_TM_VSR0        | 128
          ...
  PPC   | KVM_REG_PPC_TM_VSR63       | 128
  PPC   | KVM_REG_PPC_TM_CR          | 64
  PPC   | KVM_REG_PPC_TM_LR          | 64
  PPC   | KVM_REG_PPC_TM_CTR         | 64
  PPC   | KVM_REG_PPC_TM_FPSCR       | 64
  PPC   | KVM_REG_PPC_TM_AMR         | 64
  PPC   | KVM_REG_PPC_TM_PPR         | 64
  PPC   | KVM_REG_PPC_TM_VRSAVE      | 64
  PPC   | KVM_REG_PPC_TM_VSCR        | 32
  PPC   | KVM_REG_PPC_TM_DSCR        | 64
  PPC   | KVM_REG_PPC_TM_TAR         | 64
        |                            |
  MIPS  | KVM_REG_MIPS_R0            | 64
          ...
  MIPS  | KVM_REG_MIPS_R31           | 64
  MIPS  | KVM_REG_MIPS_HI            | 64
  MIPS  | KVM_REG_MIPS_LO            | 64
  MIPS  | KVM_REG_MIPS_PC            | 64
  MIPS  | KVM_REG_MIPS_CP0_INDEX     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONTEXT   | 64
  MIPS  | KVM_REG_MIPS_CP0_USERLOCAL | 64
  MIPS  | KVM_REG_MIPS_CP0_PAGEMASK  | 32
  MIPS  | KVM_REG_MIPS_CP0_WIRED     | 32
  MIPS  | KVM_REG_MIPS_CP0_HWRENA    | 32
  MIPS  | KVM_REG_MIPS_CP0_BADVADDR  | 64
  MIPS  | KVM_REG_MIPS_CP0_COUNT     | 32
  MIPS  | KVM_REG_MIPS_CP0_ENTRYHI   | 64
  MIPS  | KVM_REG_MIPS_CP0_COMPARE   | 32
  MIPS  | KVM_REG_MIPS_CP0_STATUS    | 32
  MIPS  | KVM_REG_MIPS_CP0_CAUSE     | 32
  MIPS  | KVM_REG_MIPS_CP0_EPC       | 64
  MIPS  | KVM_REG_MIPS_CP0_CONFIG    | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG1   | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG2   | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG3   | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG7   | 32
  MIPS  | KVM_REG_MIPS_CP0_ERROREPC  | 64
  MIPS  | KVM_REG_MIPS_COUNT_CTL     | 64
  MIPS  | KVM_REG_MIPS_COUNT_RESUME  | 64
  MIPS  | KVM_REG_MIPS_COUNT_HZ      | 64

ARM registers are mapped using the lower 32 bits. The upper 16 bits of
those are the register group type, or coprocessor number:

ARM core registers have the following id bit patterns:
  0x4020 0000 0010 <index into the kvm_regs struct:16>

ARM 32-bit CP15 registers have the following id bit patterns:
  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>

ARM 64-bit CP15 registers have the following id bit patterns:
  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>

ARM CCSIDR registers are demultiplexed by CSSELR value:
  0x4020 0000 0011 00 <csselr:8>

ARM 32-bit VFP control registers have the following id bit patterns:
  0x4020 0000 0012 1 <regno:12>

ARM 64-bit FP registers have the following id bit patterns:
  0x4030 0000 0012 0 <regno:12>
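
As an illustration, the core and CP15 patterns above can be composed with
shifts and masks. The macro and function names in this sketch are
hypothetical stand-ins chosen for the example; real code would normally use
the definitions from the uapi headers.

```c
#include <stdint.h>

#define REG_ARM		0x4000000000000000ULL	/* architecture: ARM */
#define REG_SIZE_U32	0x0020000000000000ULL	/* 32-bit register */
#define REG_SIZE_U64	0x0030000000000000ULL	/* 64-bit register */
#define REG_ARM_CORE	(0x0010ULL << 16)	/* group: core registers */
#define REG_ARM_CP15	(0x000FULL << 16)	/* group: coprocessor 15 */

/* 0x4020 0000 0010 <index:16> */
uint64_t arm_core_reg_id(uint16_t index)
{
	return REG_ARM | REG_SIZE_U32 | REG_ARM_CORE | index;
}

/* 0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3> */
uint64_t arm_cp15_32_id(unsigned int crn, unsigned int crm,
			unsigned int opc1, unsigned int opc2)
{
	return REG_ARM | REG_SIZE_U32 | REG_ARM_CP15 |
	       ((uint64_t)(crn & 0xf) << 11) |
	       ((uint64_t)(crm & 0xf) << 7) |
	       ((uint64_t)(opc1 & 0xf) << 3) |
	       (uint64_t)(opc2 & 0x7);
}

/* 0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3> */
uint64_t arm_cp15_64_id(unsigned int crm, unsigned int opc1)
{
	return REG_ARM | REG_SIZE_U64 | REG_ARM_CP15 |
	       ((uint64_t)(crm & 0xf) << 7) |
	       ((uint64_t)(opc1 & 0xf) << 3);
}
```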


arm64 registers are mapped using the lower 32 bits. The upper 16 of
that is the register group type, or coprocessor number:

arm64 core/FP-SIMD registers have the following id bit patterns. Note
that the size of the access is variable, as the kvm_regs structure
contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array.
  0x60x0 0000 0010 <index into the kvm_regs struct:16>

arm64 CCSIDR registers are demultiplexed by CSSELR value:
  0x6020 0000 0011 00 <csselr:8>

arm64 system registers have the following id bit patterns:
  0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
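
The system register pattern can be built the same way. This is a sketch
with a hypothetical function name; the field positions follow directly from
the bit widths listed in the pattern above.

```c
#include <stdint.h>

/* 0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3> */
uint64_t arm64_sys_reg_id(unsigned int op0, unsigned int op1,
			  unsigned int crn, unsigned int crm,
			  unsigned int op2)
{
	return 0x6030000000130000ULL |
	       ((uint64_t)(op0 & 0x3) << 14) |
	       ((uint64_t)(op1 & 0x7) << 11) |
	       ((uint64_t)(crn & 0xf) << 7) |
	       ((uint64_t)(crm & 0xf) << 3) |
	       (uint64_t)(op2 & 0x7);
}
```

For instance, the encoding op0=3, op1=0, crn=0, crm=0, op2=0 yields the id
0x603000000013c000.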


MIPS registers are mapped using the lower 32 bits. The upper 16 of that is
the register group type:

MIPS core registers (see above) have the following id bit patterns:
  0x7030 0000 0000 <reg:16>

MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers:
  0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
  0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)

MIPS KVM control registers (see above) have the following id bit patterns:
  0x7030 0000 0002 <reg:16>
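
A CP0 register id can be derived from the register number and select value
as sketched below. The function name is hypothetical; whether a given CP0
register is 32-bit or 64-bit has to be taken from the table above.

```c
#include <stdint.h>

/* 0x7020/0x7030 0000 0001 00 <reg:5> <sel:3> */
uint64_t mips_cp0_reg_id(unsigned int reg, unsigned int sel, int is_64bit)
{
	uint64_t base = is_64bit ? 0x7030000000010000ULL
				 : 0x7020000000010000ULL;

	return base | ((uint64_t)(reg & 0x1f) << 3) | (uint64_t)(sel & 0x7);
}
```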


4.69 KVM_GET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in and out)
Returns: 0 on success, negative value on failure

This ioctl reads the value of a single register implemented in a vcpu.
The register to read is indicated by the "id" field of the kvm_one_reg
struct passed in. On success, the register value can be found at the
memory location pointed to by "addr".

The list of registers accessible using this interface is identical to the
list in 4.68.
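
Since "addr" must point at a buffer of the register's width, it can help to
derive that width from the id. The sketch below assumes the size is encoded
as a power-of-two exponent in bits 52-55 of the id, which matches the
0x..20 (32-bit) and 0x..30 (64-bit) prefixes in 4.68; the structure and
function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct one_reg {		/* same layout as struct kvm_one_reg */
	uint64_t id;
	uint64_t addr;
};

#define REG_SIZE_SHIFT	52
#define REG_SIZE_MASK	0x00f0000000000000ULL

/* Width in bytes of the register named by id (assumed encoding). */
size_t one_reg_size(uint64_t id)
{
	return (size_t)1 << ((id & REG_SIZE_MASK) >> REG_SIZE_SHIFT);
}

/* Fill in a request whose addr field points at buf. */
struct one_reg one_reg_for(uint64_t id, void *buf)
{
	struct one_reg r = { .id = id, .addr = (uint64_t)(uintptr_t)buf };

	return r;
}
```

The resulting structure would be passed to KVM_GET_ONE_REG or
KVM_SET_ONE_REG on a vcpu fd, with buf at least one_reg_size(id) bytes long.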


4.70 KVM_KVMCLOCK_CTRL

Capability: KVM_CAP_KVMCLOCK_CTRL
Architectures: Any that implement pvclocks (currently x86 only)
Type: vcpu ioctl
Parameters: None
Returns: 0 on success, -1 on error

This signals to the host kernel that the specified guest is being paused by
userspace. The host will set a flag in the pvclock structure that is checked
from the soft lockup watchdog. The flag is part of the pvclock structure that
is shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure. It will be set exclusively by
the host and read/cleared exclusively by the guest. The guest operation of
checking and clearing the flag must be an atomic operation, so
load-link/store-conditional, or equivalent, must be used. There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected. This ioctl can be called any time
after pausing the vcpu, but before it is resumed.
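
The guest-side check-and-clear could be expressed with C11 atomics as in
the sketch below. The names are hypothetical and the flags byte is modelled
as a standalone atomic variable; a real guest would operate on the flags
field of the shared pvclock_vcpu_time_info structure with whatever atomic
primitives its kernel provides.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define GUEST_STOPPED_FLAG (1u << 1)	/* second bit of the flags field */

/* Atomically clear the flag and report whether it had been set. */
bool pvclock_check_and_clear_guest_stopped(_Atomic uint8_t *flags)
{
	uint8_t old = atomic_fetch_and(flags, (uint8_t)~GUEST_STOPPED_FLAG);

	return (old & GUEST_STOPPED_FLAG) != 0;
}
```

Doing the test and the clear in one atomic read-modify-write ensures that a
flag set by the host between the two steps cannot be lost.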


4.71 KVM_SIGNAL_MSI

Capability: KVM_CAP_SIGNAL_MSI
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_msi (in)
Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

Directly inject a MSI message. Only valid with in-kernel irqchip that handles
MSI messages.

struct kvm_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 flags;
	__u8  pad[16];
};

No flags are defined so far. The corresponding field must be 0.
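
As an example of filling in the structure, the sketch below builds a
fixed-delivery, edge-triggered x86 MSI message for a given destination APIC
ID and vector. The structure is a local mirror of kvm_msi and the function
name is hypothetical; the address and data layout follow the x86 MSI format
(base 0xfee00000, destination ID in bits 19:12, vector in data bits 7:0).

```c
#include <stdint.h>
#include <string.h>

struct msi_msg {		/* same layout as struct kvm_msi */
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t flags;
	uint8_t  pad[16];
};

/* Build a message for one destination APIC and vector. */
struct msi_msg msi_fixed(uint8_t dest_apic_id, uint8_t vector)
{
	struct msi_msg m;

	memset(&m, 0, sizeof(m));	/* flags and pad must be 0 */
	m.address_lo = 0xfee00000u | ((uint32_t)dest_apic_id << 12);
	m.data = vector;		/* delivery mode 000b, edge */
	return m;
}
```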


4.71 KVM_CREATE_PIT2

Capability: KVM_CAP_PIT2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_config (in)
Returns: 0 on success, -1 on error

Creates an in-kernel device model for the i8254 PIT. This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
parameters have to be passed:

struct kvm_pit_config {
	__u32 flags;
	__u32 pad[15];
};

Valid flags are:

#define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */

PIT timer interrupts may use a per-VM kernel thread for injection. If it
exists, this thread will have a name of the following pattern:

kvm-pit/<owner-process-pid>

When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.

This IOCTL replaces the obsolete KVM_CREATE_PIT.


4.72 KVM_GET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (out)
Returns: 0 on success, -1 on error

Retrieves the state of the in-kernel PIT model. Only valid after
KVM_CREATE_PIT2. The state is returned in the following structure:

struct kvm_pit_state2 {
	struct kvm_pit_channel_state channels[3];
	__u32 flags;
	__u32 reserved[9];
};

Valid flags are:

/* disable PIT in HPET legacy mode */
#define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001

This IOCTL replaces the obsolete KVM_GET_PIT.


4.73 KVM_SET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (in)
Returns: 0 on success, -1 on error

Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.

This IOCTL replaces the obsolete KVM_SET_PIT.


4.74 KVM_PPC_GET_SMMU_INFO

Capability: KVM_CAP_PPC_GET_SMMU_INFO
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_ppc_smmu_info (out)
Returns: 0 on success, -1 on error

This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.

The structure contains some global information, followed by an
array of supported segment page sizes:

	struct kvm_ppc_smmu_info {
		__u64 flags;
		__u32 slb_size;
		__u32 pad;
		struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
	};

The supported flags are:

    - KVM_PPC_PAGE_SIZES_REAL:
        When that flag is set, guest page sizes must "fit" the backing
        store page sizes. When not set, any page size in the list can
        be used regardless of how they are backed by userspace.

    - KVM_PPC_1T_SEGMENTS
        The emulated MMU supports 1T segments in addition to the
        standard 256M ones.

The "slb_size" field indicates how many SLB entries are supported.

The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order. Each entry is defined
as follows:

	struct kvm_ppc_one_seg_page_size {
		__u32 page_shift;	/* Base page shift of segment (or 0) */
		__u32 slb_enc;		/* SLB encoding for BookS */
		struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
	};

An entry with a "page_shift" of 0 is unused. Because the array is
organized in increasing order, a lookup can stop when encountering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
page size. The bits are in positions such that the value can directly
be OR'ed into the "vsid" argument of the slbmte instruction.

The "enc" array is a list which, for each of those segment base page
sizes, provides the list of supported actual page sizes (which can
only be larger than or equal to the base page size), along with the
corresponding encoding in the hash PTE. Similarly, the array has
8 entries sorted by increasing size, and an entry with a "0" shift
is an empty entry and a terminator:

	struct kvm_ppc_one_page_size {
		__u32 page_shift;	/* Page shift (or 0) */
		__u32 pte_enc;		/* Encoding in the HPTE (>>12) */
	};

The "pte_enc" field provides a value that can be OR'ed into the hash
PTE's RPN field (i.e., it needs to be shifted left by 12 to OR it
into the hash PTE second double word).
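
A lookup over the two nested arrays, stopping at the first entry whose
shift is 0, might be written as follows. The structures are local mirrors
of the ones above, PAGE_SIZES_MAX_SZ stands in for
KVM_PPC_PAGE_SIZES_MAX_SZ (assumed to be 8, matching the entry count given
in the text), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZES_MAX_SZ 8	/* stand-in for KVM_PPC_PAGE_SIZES_MAX_SZ */

struct one_page_size {
	uint32_t page_shift;	/* Page shift (or 0) */
	uint32_t pte_enc;	/* Encoding in the HPTE (>>12) */
};

struct one_seg_page_size {
	uint32_t page_shift;	/* Base page shift of segment (or 0) */
	uint32_t slb_enc;	/* SLB encoding */
	struct one_page_size enc[PAGE_SIZES_MAX_SZ];
};

/* Find the HPTE encoding for an actual page size within a segment of the
 * given base page size.  Returns false if the combination is not listed. */
bool smmu_find_pte_enc(const struct one_seg_page_size *sps,
		       uint32_t base_shift, uint32_t actual_shift,
		       uint32_t *pte_enc)
{
	for (int i = 0; i < PAGE_SIZES_MAX_SZ && sps[i].page_shift; i++) {
		if (sps[i].page_shift != base_shift)
			continue;
		for (int j = 0; j < PAGE_SIZES_MAX_SZ &&
				sps[i].enc[j].page_shift; j++) {
			if (sps[i].enc[j].page_shift == actual_shift) {
				*pte_enc = sps[i].enc[j].pte_enc;
				return true;
			}
		}
		return false;
	}
	return false;
}
```

The value stored in *pte_enc would then be shifted left by 12 before being
OR'ed into the second doubleword of the hash PTE.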

4.75 KVM_IRQFD

Capability: KVM_CAP_IRQFD
Architectures: x86 s390
Type: vm ioctl
Parameters: struct kvm_irqfd (in)
Returns: 0 on success, -1 on error

Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event. When
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.

With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field. When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip. When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
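
The assign and deassign requests for a resampling irqfd differ only in
their flags. The sketch below uses a local mirror of struct kvm_irqfd; the
structure layout and the flag values (bit 0 for DEASSIGN, bit 1 for
RESAMPLE) are assumptions taken from the uapi header rather than from this
document, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define IRQFD_FLAG_DEASSIGN (1u << 0)	/* assumed KVM_IRQFD_FLAG_DEASSIGN */
#define IRQFD_FLAG_RESAMPLE (1u << 1)	/* assumed KVM_IRQFD_FLAG_RESAMPLE */

struct irqfd_req {		/* assumed layout of struct kvm_irqfd */
	uint32_t fd;
	uint32_t gsi;
	uint32_t flags;
	uint32_t resamplefd;
	uint8_t  pad[16];
};

/* Request that attaches fd to gsi in resample (level-triggered) mode. */
struct irqfd_req irqfd_assign_resample(int fd, uint32_t gsi, int resamplefd)
{
	struct irqfd_req r;

	memset(&r, 0, sizeof(r));
	r.fd = (uint32_t)fd;
	r.gsi = gsi;
	r.flags = IRQFD_FLAG_RESAMPLE;
	r.resamplefd = (uint32_t)resamplefd;
	return r;
}

/* Matching removal request: fd and gsi only, no RESAMPLE flag needed. */
struct irqfd_req irqfd_deassign(int fd, uint32_t gsi)
{
	struct irqfd_req r;

	memset(&r, 0, sizeof(r));
	r.fd = (uint32_t)fd;
	r.gsi = gsi;
	r.flags = IRQFD_FLAG_DEASSIGN;
	return r;
}
```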

4.76 KVM_PPC_ALLOCATE_HTAB

Capability: KVM_CAP_PPC_ALLOC_HTAB
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to u32 containing hash table order (in/out)
Returns: 0 on success, -1 on error

This requests the host kernel to allocate an MMU hash table for a
guest using the PAPR paravirtualization interface. This only does
anything if the kernel is configured to use the Book 3S HV style of
virtualization. Otherwise the capability doesn't exist and the ioctl
returns an ENOTTY error. The rest of this description assumes Book 3S
HV.

There must be no vcpus running when this ioctl is called; if there
are, it will do nothing and return an EBUSY error.

The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46. On successful return from the
ioctl, it will have been updated with the order of the hash table that
was allocated.

If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB).

If this ioctl is called when a hash table has already been allocated,
the kernel will clear out the existing hash table (zero all HPTEs) and
return the hash table order in the parameter. (If the guest is using
the virtualized real-mode area (VRMA) facility, the kernel will
re-create the VRMA HPTEs on the next KVM_RUN of any vcpu.)
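
Because the parameter is an order rather than a byte count, callers usually
need a small conversion. The helpers below are a sketch with hypothetical
names; the limits 18 and 46 are the ones stated above, and an order of 24
corresponds to the 16 MB default.

```c
#include <stdbool.h>
#include <stdint.h>

#define HTAB_MIN_ORDER 18
#define HTAB_MAX_ORDER 46

/* True when order lies in the range accepted by the ioctl. */
bool htab_order_valid(uint32_t order)
{
	return order >= HTAB_MIN_ORDER && order <= HTAB_MAX_ORDER;
}

/* Size in bytes of a hash table of the given order. */
uint64_t htab_bytes(uint32_t order)
{
	return 1ULL << order;
}

/* Smallest valid order whose table is at least bytes large, capped at
 * the maximum order. */
uint32_t htab_order_for(uint64_t bytes)
{
	uint32_t order = HTAB_MIN_ORDER;

	while (order < HTAB_MAX_ORDER && htab_bytes(order) < bytes)
		order++;
	return order;
}
```

After the ioctl returns, the caller should read back the variable, since
the kernel stores the order it actually allocated.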

4.77 KVM_S390_INTERRUPT

Capability: basic
Architectures: s390
Type: vm ioctl, vcpu ioctl
Parameters: struct kvm_s390_interrupt (in)
Returns: 0 on success, -1 on error

Injects an interrupt into the guest. Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.

Interrupt parameters are passed via kvm_s390_interrupt:

struct kvm_s390_interrupt {
	__u32 type;
	__u32 parm;
	__u64 parm64;
};

type can be one of the following:

KVM_S390_SIGP_STOP (vcpu) - sigp stop
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
KVM_S390_INT_CLOCK_COMP (vcpu) - clock comparator interrupt
KVM_S390_INT_CPU_TIMER (vcpu) - CPU timer interrupt
KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
                           parameters in parm and parm64
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
    interruption subclass)
KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
                           machine check interrupt code in parm64 (note that
                           machine checks needing further payload are not
                           supported by this ioctl)

Note that the vcpu ioctl is asynchronous to vcpu execution.

4.78 KVM_PPC_GET_HTAB_FD

Capability: KVM_CAP_PPC_HTAB_FD
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to struct kvm_get_htab_fd (in)
Returns: file descriptor number (>= 0) on success, -1 on error

This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT. The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear. The argument struct looks like
this:

/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
	__u64	flags;
	__u64	start_index;
	__u64	reserved[2];
};

/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
#define KVM_GET_HTAB_WRITE		((__u64)0x2)

The `start_index' field gives the index in the HPT of the entry at
which to start reading. It is ignored when writing.

Reads on the fd will initially supply information about all
"interesting" HPT entries. Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries. When the end of the HPT is reached, the read() will
return. If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.

Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes each). The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries. The invalid entries are not represented explicitly
in the stream. The header format is:

struct kvm_get_htab_header {
	__u32	index;
	__u16	n_valid;
	__u16	n_invalid;
};

Writes to the fd create HPT entries starting at the index given in the
header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.
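
A reader of this stream has to step over each header and the n_valid
entries that follow it. The sketch below walks a buffer already obtained
from read() and totals the entry counts. The names are hypothetical, and it
assumes the header is in host byte order, as it would be when the buffer
was produced on the same host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HPTE_SIZE 16		/* bytes per valid HPT entry in the stream */

struct htab_header {		/* same layout as kvm_get_htab_header */
	uint32_t index;
	uint16_t n_valid;
	uint16_t n_invalid;
};

/* Walk the stream in buf and total the valid and invalid entry counts.
 * Returns 0 on success, -1 if the buffer ends in the middle of a record. */
int htab_stream_count(const uint8_t *buf, size_t len,
		      uint64_t *valid, uint64_t *invalid)
{
	size_t off = 0;

	*valid = 0;
	*invalid = 0;
	while (off < len) {
		struct htab_header h;

		if (len - off < sizeof(h))
			return -1;
		memcpy(&h, buf + off, sizeof(h));
		off += sizeof(h);
		if ((len - off) / HPTE_SIZE < h.n_valid)
			return -1;
		/* Only valid entries carry data; invalid ones are implied. */
		off += (size_t)h.n_valid * HPTE_SIZE;
		*valid += h.n_valid;
		*invalid += h.n_invalid;
	}
	return 0;
}
```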

4.79 KVM_CREATE_DEVICE

Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device type is unknown or unsupported
  EEXIST: Device already created, and this type of device may not
          be instantiated multiple times

  Other error conditions may be defined by individual device types or
  have their standard meanings.

Creates an emulated device in the kernel. The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.

If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).

Individual devices should not define flags. Attributes should be used
for specifying any behavior that is not implied by the device type
number.

struct kvm_create_device {
	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
	__u32	fd;	/* out: device handle */
	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
};
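
As an illustration of the KVM_CREATE_DEVICE_TEST flag, a probe for a
device type might look like the sketch below. The helper name
`device_type_supported' and its return convention are assumptions made
for this example; vm_fd is assumed to be a descriptor obtained from
KVM_CREATE_VM, and the code assumes <linux/kvm.h> is available.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns 1 if the device type is supported, 0 if the kernel reports
 * ENODEV, or a negative errno value for any other failure.
 */
static int device_type_supported(int vm_fd, uint32_t type)
{
	struct kvm_create_device cd;

	memset(&cd, 0, sizeof(cd));
	cd.type = type;
	cd.flags = KVM_CREATE_DEVICE_TEST;	/* probe only */

	if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) == 0)
		return 1;
	return errno == ENODEV ? 0 : -errno;
}
```

Since the test flag only checks the device type, a positive result does
not guarantee that a later creation in the same vm will succeed.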

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device
  EPERM:  The attribute cannot (currently) be accessed this way
          (e.g. read-only attribute, or attribute that only makes
          sense when the device is in a different state)

  Other error conditions may be defined by individual device types.

Gets/sets a specified piece of device configuration and/or state. The
semantics are device-specific. See individual device documentation in
the "devices" directory. As with ONE_REG, the size of the data
transferred is defined by the particular attribute.

struct kvm_device_attr {
	__u32	flags;		/* no flags currently defined */
	__u32	group;		/* device-defined */
	__u64	attr;		/* group-defined */
	__u64	addr;		/* userspace address of attr data */
};

4.81 KVM_HAS_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device

Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
indicate that the attribute can be read or written in the device's
current state. "addr" is ignored.

4.82 KVM_ARM_VCPU_INIT

Capability: basic
Architectures: arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_vcpu_init (in)
Returns: 0 on success; -1 on error
Errors:
  EINVAL:    the target is unknown, or the combination of features is invalid.
  ENOENT:    a features bit specified is unknown.

This tells KVM what type of CPU to present to the guest, and what
optional features it should have. This will cause a reset of the cpu
registers to their initial values. If this is not called, KVM_RUN will
return ENOEXEC for that vcpu.

Note that because some registers reflect machine topology, all vcpus
should be created before this ioctl is invoked.

Possible features:
	- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
	  Depends on KVM_CAP_ARM_PSCI.
	- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
	  Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
	- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
	  Depends on KVM_CAP_ARM_PSCI_0_2.


4.83 KVM_ARM_PREFERRED_TARGET

Capability: basic
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_vcpu_init (out)
Returns: 0 on success; -1 on error
Errors:
  ENODEV:    no preferred target available for the host

This queries KVM for the preferred CPU target type which can be emulated
by KVM on the underlying host.

The ioctl returns a struct kvm_vcpu_init instance containing information
about the preferred CPU target type and recommended features for it. The
kvm_vcpu_init->features bitmap returned will have feature bits set if
the preferred target recommends setting these features, but this is
not mandatory.

The information returned by this ioctl can be used to prepare an instance
of struct kvm_vcpu_init for the KVM_ARM_VCPU_INIT ioctl, which will result
in a VCPU matching the underlying host.


4.84 KVM_GET_REG_LIST

Capability: basic
Architectures: arm, arm64, mips
Type: vcpu ioctl
Parameters: struct kvm_reg_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the reg index list is too big to fit in the array specified by
             the user (the number required will be written into n).

struct kvm_reg_list {
	__u64 n; /* number of registers in reg[] */
	__u64 reg[0];
};

This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
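
Because E2BIG reports the required count in n, a common way to size
the array is to call the ioctl twice. The sketch below assumes
<linux/kvm.h> is available and that vcpu_fd is a valid vcpu descriptor;
the helper name `fetch_reg_list' is illustrative only.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns a malloc()ed list that the caller must free(), or NULL. */
static struct kvm_reg_list *fetch_reg_list(int vcpu_fd)
{
	struct kvm_reg_list probe;
	struct kvm_reg_list *list;

	/*
	 * First call with n = 0: unless there are no registers at all,
	 * this fails with E2BIG and the kernel writes the required
	 * count into n.
	 */
	probe.n = 0;
	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, &probe) < 0 && errno != E2BIG)
		return NULL;

	list = malloc(sizeof(*list) + probe.n * sizeof(__u64));
	if (!list)
		return NULL;
	list->n = probe.n;

	/* Second call: the array is now large enough to be filled. */
	if (ioctl(vcpu_fd, KVM_GET_REG_LIST, list) < 0) {
		free(list);
		return NULL;
	}
	return list;
}
```

Each reg[] value returned can then be used as the id for
KVM_GET_ONE_REG/KVM_SET_ONE_REG.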


4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)

Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_arm_device_addr (in)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device id is unknown
  ENXIO:  Device not supported on current system
  EEXIST: Address already set
  E2BIG:  Address outside guest physical address space
  EBUSY:  Address overlaps with other device range

struct kvm_arm_device_addr {
	__u64 id;
	__u64 addr;
};

Specify a device address in the guest's physical address space where guests
can access emulated or directly exposed devices, which the host kernel needs
to know about. The id field is an architecture specific identifier for a
specific device.

ARM/arm64 divides the id field into two parts, a device id and an
address type id specific to the individual device.

  bits:  | 63        ...       32 | 31    ...    16 | 15    ...    0 |
  field: |        0x00000000      |     device id   |  addr type id  |
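
The bit layout above can be expressed as a pair of shifts and masks.
This is a minimal sketch; the macro and function names are made up for
the example and the real constants should be taken from the kernel
headers.

```c
#include <stdint.h>

/* Illustrative constants derived from the bit layout above. */
#define DEV_ID_SHIFT	16
#define FIELD_MASK	0xffffULL

/* Compose the 64-bit id: bits 63..32 zero, 31..16 device, 15..0 type. */
static uint64_t arm_device_addr_id(uint16_t device_id, uint16_t addr_type)
{
	return ((uint64_t)device_id << DEV_ID_SHIFT) | addr_type;
}

/* Extract the device id (bits 31..16). */
static uint16_t arm_device_id(uint64_t id)
{
	return (uint16_t)((id >> DEV_ID_SHIFT) & FIELD_MASK);
}

/* Extract the address type id (bits 15..0). */
static uint16_t arm_addr_type(uint64_t id)
{
	return (uint16_t)(id & FIELD_MASK);
}
```

For example, device id 2 with address type 3 yields the id 0x20003.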

ARM/arm64 currently only require this when using the in-kernel GIC
support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
as the device id. When setting the base address for the guest's
mapping of the VGIC virtual CPU and distributor interface, the ioctl
must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs. Calling this ioctl twice for any of the
base addresses will return -EEXIST.

Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.


4.86 KVM_PPC_RTAS_DEFINE_TOKEN

Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error

Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel. The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation. If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel. If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.


5. The kvm_run structure
------------------------

Application code obtains a pointer to the kvm_run structure by
mmap()ing a vcpu fd. From that point, application code can control
execution by changing fields in kvm_run prior to calling the KVM_RUN
ioctl, and obtain information about the reason KVM_RUN returned by
looking up structure members.

struct kvm_run {
	/* in */
	__u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.

	__u8 padding1[7];

	/* out */
	__u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned. Allowable values for this
field are detailed below.

	__u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
an interrupt can be injected now with KVM_INTERRUPT.

	__u8 if_flag;

The value of the current interrupt flag. Only valid if in-kernel
local APIC is not used.

	__u8 padding2[2];

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;

The value of the cr8 register. Only valid if in-kernel local APIC is
not used. Both input and output.

	__u64 apic_base;

The value of the APIC BASE msr. Only valid if in-kernel local
APIC is not used. Both input and output.

	union {
		/* KVM_EXIT_UNKNOWN */
		struct {
			__u64 hardware_exit_reason;
		} hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons. Further architecture-specific information is available in
hardware_exit_reason.

		/* KVM_EXIT_FAIL_ENTRY */
		struct {
			__u64 hardware_entry_failure_reason;
		} fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons. Further architecture-specific information is
available in hardware_entry_failure_reason.

		/* KVM_EXIT_EXCEPTION */
		struct {
			__u32 exception;
			__u32 error_code;
		} ex;

Unused.

		/* KVM_EXIT_IO */
		struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
			__u8 direction;
			__u8 size; /* bytes */
			__u16 port;
			__u32 count;
			__u64 data_offset; /* relative to kvm_run start */
		} io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array.
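
As a sketch of how data_offset is meant to be used, the helpers below
locate the packed data array inside the mmap()ed kvm_run area. The
struct `io_exit' is a local mirror of the 'io' member above, and the
function names are illustrative; real code would read the fields from
struct kvm_run directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the 'io' member of struct kvm_run (illustrative). */
struct io_exit {
	uint8_t  direction;	/* 0 = KVM_EXIT_IO_IN, 1 = KVM_EXIT_IO_OUT */
	uint8_t  size;		/* bytes per element */
	uint16_t port;
	uint32_t count;		/* number of elements */
	uint64_t data_offset;	/* relative to kvm_run start */
};

/* Start of the packed data array inside the kvm_run mapping. */
static uint8_t *io_data(void *run_base, const struct io_exit *io)
{
	return (uint8_t *)run_base + io->data_offset;
}

/* Total data length: 'count' elements of 'size' bytes each. */
static size_t io_data_len(const struct io_exit *io)
{
	return (size_t)io->size * io->count;
}
```

For KVM_EXIT_IO_OUT the bytes at io_data() are read by userspace; for
KVM_EXIT_IO_IN userspace fills them in before the next KVM_RUN.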

		struct {
			struct kvm_debug_exit_arch arch;
		} debug;

Unused.

		/* KVM_EXIT_MMIO */
		struct {
			__u64 phys_addr;
			__u8 data[8];
			__u32 len;
			__u8 is_write;
		} mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

The 'data' member contains, in its first 'len' bytes, the value as it would
appear if the VCPU performed a load or store of the appropriate width directly
to the byte array.
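
A minimal sketch of servicing such an exit is shown below. It assumes a
hypothetical device whose registers are backed by a plain byte array
mapped at guest physical address 'base'; `mmio_exit' mirrors the 'mmio'
member above and `mmio_service' is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the 'mmio' member of struct kvm_run (illustrative). */
struct mmio_exit {
	uint64_t phys_addr;
	uint8_t  data[8];
	uint32_t len;
	uint8_t  is_write;
};

/*
 * Copy between the exit's data[] and the device register array.
 * Returns 0 on success, -1 if the access falls outside the device.
 */
static int mmio_service(struct mmio_exit *m, uint64_t base,
			uint8_t *regs, size_t regs_len)
{
	uint64_t off;

	if (m->len > sizeof(m->data) || m->phys_addr < base)
		return -1;
	off = m->phys_addr - base;
	if (off > regs_len || m->len > regs_len - off)
		return -1;

	if (m->is_write)
		memcpy(regs + off, m->data, m->len);	/* guest store */
	else
		memcpy(m->data, regs + off, m->len);	/* guest load */
	return 0;
}
```

Only the first 'len' bytes of data[] are touched, matching the
description above.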

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR_HCALL and
KVM_EXIT_EPR the corresponding operations are complete (and guest state is
consistent) only after userspace has re-entered the kernel with KVM_RUN. The
kernel side will first finish incomplete operations and then check for pending
signals. Userspace can re-enter the guest with an unmasked signal pending to
complete pending operations.

		/* KVM_EXIT_HYPERCALL */
		struct {
			__u64 nr;
			__u64 args[6];
			__u64 ret;
			__u32 longmode;
			__u32 pad;
		} hypercall;

Unused. This was once used for 'hypercall to userspace'. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

		/* KVM_EXIT_TPR_ACCESS */
		struct {
			__u64 rip;
			__u32 is_write;
			__u32 pad;
		} tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

		/* KVM_EXIT_S390_SIEIC */
		struct {
			__u8 icptcode;
			__u64 mask; /* psw upper half */
			__u64 addr; /* psw lower half */
			__u16 ipa;
			__u32 ipb;
		} s390_sieic;

s390 specific.

		/* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
		__u64 s390_reset_flags;

s390 specific.

		/* KVM_EXIT_S390_UCONTROL */
		struct {
			__u64 trans_exc_code;
			__u32 pgm_code;
		} s390_ucontrol;

s390 specific. A page fault has occurred for a user controlled virtual
machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
resolved by the kernel.
The program code and the translation exception code that were placed
in the cpu's lowcore are presented here as defined by the z/Architecture
Principles of Operation, in the chapter on Dynamic Address Translation
(DAT).

		/* KVM_EXIT_DCR */
		struct {
			__u32 dcrn;
			__u32 data;
			__u8  is_write;
		} dcr;

Deprecated - was used for 440 KVM.

		/* KVM_EXIT_OSI */
		struct {
			__u64 gprs[32];
		} osi;

MOL (Mac-on-Linux) uses a special hypercall interface it calls 'OSI'. To
enable it, we catch hypercalls and exit with this exit struct that contains
all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and when it's done modify the gprs as
necessary. Upon guest entry all guest GPRs will then be replaced by the values
in this struct.

		/* KVM_EXIT_PAPR_HCALL */
		struct {
			__u64 nr;
			__u64 ret;
			__u64 args[9];
		} papr_hcall;

This is used on 64-bit PowerPC when emulating a pSeries partition,
e.g. with the 'pseries' machine type in qemu. It occurs when the
guest does a hypercall using the 'sc 1' instruction. The 'nr' field
contains the hypercall number (from the guest R3), and 'args' contains
the arguments (from the guest R4 - R12). Userspace should put the
return code in 'ret' and any extra returned values in args[].
The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).

		/* KVM_EXIT_S390_TSCH */
		struct {
			__u16 subchannel_id;
			__u16 subchannel_nr;
			__u32 io_int_parm;
			__u32 io_int_word;
			__u32 ipb;
			__u8 dequeued;
		} s390_tsch;

s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
interrupt for the target subchannel has been dequeued and subchannel_id,
subchannel_nr, io_int_parm and io_int_word contain the parameters for that
interrupt. ipb is needed for instruction parameter decoding.

		/* KVM_EXIT_EPR */
		struct {
			__u32 epr;
		} epr;

On FSL BookE PowerPC chips, the interrupt controller has a fast
interrupt acknowledge path to the core. When the core successfully
delivers an interrupt, it automatically populates the EPR register with
the interrupt vector number and acknowledges the interrupt inside
the interrupt controller.

In case the interrupt controller lives in user space, we need to do
the interrupt acknowledge cycle through it to fetch the next interrupt
vector to be delivered, using this exit.

It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.

		/* KVM_EXIT_SYSTEM_EVENT */
		struct {
#define KVM_SYSTEM_EVENT_SHUTDOWN       1
#define KVM_SYSTEM_EVENT_RESET          2
			__u32 type;
			__u64 flags;
		} system_event;

If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
a system-level event using some architecture specific mechanism (hypercall
or some special instruction). In case of ARM/ARM64, this is triggered using
HVC instruction based PSCI call from the vcpu. The 'type' field describes
the system-level event type. The 'flags' field describes architecture
specific flags for the system-level event.

		/* Fix the size of the union. */
		char padding[256];
	};

	/*
	 * shared registers between kvm and userspace.
	 * kvm_valid_regs specifies the register classes set by the host
	 * kvm_dirty_regs specifies the register classes dirtied by userspace
	 * struct kvm_sync_regs is architecture specific, as well as the
	 * bits for kvm_valid_regs and kvm_dirty_regs
	 */
	__u64 kvm_valid_regs;
	__u64 kvm_dirty_regs;
	union {
		struct kvm_sync_regs regs;
		char padding[1024];
	} s;

If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
certain guest registers without having to call SET/GET_*REGS. Thus we can
avoid some system call overhead if userspace has to handle the exit.
Userspace can query the validity of the structure by checking
kvm_valid_regs for specific bits. These bits are architecture specific
and usually define the validity of a group of registers (e.g. one bit
for general purpose registers).

};


4.81 KVM_GET_EMULATED_CPUID

Capability: KVM_CAP_EXT_EMUL_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 flags;
	struct kvm_cpuid_entry2 entries[0];
};

The member 'flags' is used for passing flags from userspace.

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX		BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC		BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT		BIT(2)

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are emulated by
kvm. Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.

Userspace invokes KVM_GET_EMULATED_CPUID by passing a kvm_cpuid2
structure with the 'nent' field indicating the number of entries in
the variable-size array 'entries'. If the number of entries is too low
to describe the cpu capabilities, an error (E2BIG) is returned. If the
number is too high, the 'nent' field is adjusted and an error (ENOMEM)
is returned. If the number is just right, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then
filled.

The entries returned are the set CPUID bits of the respective features
which kvm emulates, as returned by the CPUID instruction, with unknown
or unsupported feature bits cleared.

Features like x2apic, for example, may not be present in the host cpu
but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
emulated efficiently, and are thus not included here.

The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
  eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination


6. Capabilities that can be enabled on vCPUs
--------------------------------------------

There are certain capabilities that change the behavior of the virtual CPU or
the virtual machine when enabled. To enable them, please see section 4.37.
Below you can find a list of capabilities and what their effect on the vCPU or
the virtual machine is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Target: whether this is a per-vcpu or per-vm capability.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


6.1 KVM_CAP_PPC_OSI

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of OSI hypercalls that otherwise would
be treated as normal system calls to be injected into the guest. OSI hypercalls
were invented by Mac-on-Linux to have a standardized communication mechanism
between the guest and the host.

When this capability is enabled, KVM_EXIT_OSI can occur.


6.2 KVM_CAP_PPC_PAPR

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of PAPR hypercalls. PAPR hypercalls are
done using the hypercall instruction "sc 1".

It also sets the guest privilege level to "supervisor" mode. Usually the guest
runs in "hypervisor" privilege mode with a few missing features.

In addition to the above, it changes the semantics of SDR1. In this mode, the
HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
HTAB invisible to the guest.

When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.


6.3 KVM_CAP_SW_TLB

Architectures: ppc
Target: vcpu
Parameters: args[0] is the address of a struct kvm_config_tlb
Returns: 0 on success; -1 on error

struct kvm_config_tlb {
	__u64 params;
	__u64 array;
	__u32 mmu_type;
	__u32 array_len;
};

Configures the virtual CPU's TLB array, establishing a shared memory area
between userspace and KVM. The "params" and "array" fields are userspace
addresses of mmu-type-specific data structures. The "array_len" field is a
safety mechanism, and should be set to the size in bytes of the memory that
userspace has reserved for the array. It must be at least the size dictated
by "mmu_type" and "params".

While KVM_RUN is active, the shared region is under control of KVM. Its
contents are undefined, and any modification by userspace results in
boundedly undefined behavior.

On return from KVM_RUN, the shared region will reflect the current state of
the guest's TLB. If userspace makes any changes, it must call KVM_DIRTY_TLB
to tell KVM which entries have been changed, prior to calling KVM_RUN again
on this vcpu.

For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
 - The "array" field points to an array of type "struct
   kvm_book3e_206_tlb_entry".
 - The array consists of all entries in the first TLB, followed by all
   entries in the second TLB.
 - Within a TLB, entries are ordered first by increasing set number. Within a
   set, entries are ordered by way (increasing ESEL).
 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
   where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
   hardware ignores this value for TLB0.
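
The ordering rules above imply a simple index calculation for TLB0
entries in the shared array. The sketch below uses illustrative
function names and assumes that num_sets is a power of two and that
the number of ways is nonzero; tlb0_size and tlb0_ways stand for the
tlb_sizes[] and tlb_ways[] values of the first TLB.

```c
#include <stdint.h>

/* Set number in TLB0, using the hash given above. */
static uint32_t tlb0_set(uint64_t mas2, uint32_t num_sets)
{
	return (uint32_t)(mas2 >> 12) & (num_sets - 1);
}

/*
 * Index into the shared array of the TLB0 entry for the given MAS2
 * value and way. TLB0 comes first in the array, and its entries are
 * ordered by set and then by way.
 */
static uint32_t tlb0_index(uint64_t mas2, uint32_t tlb0_size,
			   uint32_t tlb0_ways, uint32_t way)
{
	uint32_t num_sets = tlb0_size / tlb0_ways;

	return tlb0_set(mas2, num_sets) * tlb0_ways + way;
}
```

Entries of the second TLB would start at offset tlb0_size in the same
array.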
3035 hardware ignores this value for TLB0.
Cornelia Huckfa6b7fe2012-12-20 15:32:12 +01003036
30376.4 KVM_CAP_S390_CSS_SUPPORT
3038
3039Architectures: s390
Cornelia Huck0907c852014-06-27 09:29:26 +02003040Target: vcpu
Cornelia Huckfa6b7fe2012-12-20 15:32:12 +01003041Parameters: none
3042Returns: 0 on success; -1 on error
3043
3044This capability enables support for handling of channel I/O instructions.
3045
3046TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
3047handled in-kernel, while the other I/O instructions are passed to userspace.
3048
3049When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
3050SUBCHANNEL intercepts.
Alexander Graf1c810632013-01-04 18:12:48 +01003051
Cornelia Huck0907c852014-06-27 09:29:26 +02003052Note that even though this capability is enabled per-vcpu, the complete
3053virtual machine is affected.
3054
Alexander Graf1c810632013-01-04 18:12:48 +010030556.5 KVM_CAP_PPC_EPR
3056
3057Architectures: ppc
Cornelia Huck0907c852014-06-27 09:29:26 +02003058Target: vcpu
Alexander Graf1c810632013-01-04 18:12:48 +01003059Parameters: args[0] defines whether the proxy facility is active
3060Returns: 0 on success; -1 on error
3061
3062This capability enables or disables the delivery of interrupts through the
3063external proxy facility.
3064
3065When enabled (args[0] != 0), every time the guest gets an external interrupt
3066delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
3067to receive the topmost interrupt vector.
3068
3069When disabled (args[0] == 0), behavior is as if this facility is unsupported.
3070
3071When this capability is enabled, KVM_EXIT_EPR can occur.
Scott Woodeb1e4f42013-04-12 14:08:47 +00003072
30736.6 KVM_CAP_IRQ_MPIC
3074
3075Architectures: ppc
3076Parameters: args[0] is the MPIC device fd
3077 args[1] is the MPIC CPU number for this vcpu
3078
3079This capability connects the vcpu to an in-kernel MPIC device.
Paul Mackerras5975a2e2013-04-27 00:28:37 +00003080
30816.7 KVM_CAP_IRQ_XICS
3082
3083Architectures: ppc
Cornelia Huck0907c852014-06-27 09:29:26 +02003084Target: vcpu
Paul Mackerras5975a2e2013-04-27 00:28:37 +00003085Parameters: args[0] is the XICS device fd
3086 args[1] is the XICS CPU number (server ID) for this vcpu
3087
3088This capability connects the vcpu to an in-kernel XICS device.
Cornelia Huck8a366a42014-06-27 11:06:25 +02003089
30906.8 KVM_CAP_S390_IRQCHIP
3091
3092Architectures: s390
3093Target: vm
3094Parameters: none
3095
3096This capability enables the in-kernel irqchip for s390. Please refer to
3097"4.24 KVM_CREATE_IRQCHIP" for details.

7. Capabilities that can be enabled on VMs
------------------------------------------

There are certain capabilities that change the behavior of the virtual
machine when enabled. To enable them, please see section 4.37. Below
you can find a list of capabilities and what their effect on the VM
is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


7.1 KVM_CAP_PPC_ENABLE_HCALL

Architectures: ppc
Parameters: args[0] is the sPAPR hcall number
            args[1] is 0 to disable, 1 to enable in-kernel handling

This capability controls whether individual sPAPR hypercalls (hcalls)
get handled by the kernel or not. Enabling or disabling in-kernel
handling of an hcall is effective across the VM. On creation, an
initial set of hcalls is enabled for in-kernel handling, which
consists of those hcalls for which in-kernel handlers were implemented
before this capability was implemented. If disabled, the kernel will
not attempt to handle the hcall, but will always exit to userspace
to handle it. Note that it may not make sense to enable some and
disable others of a group of related hcalls, but KVM does not prevent
userspace from doing that.

If the hcall number specified is not one that has an in-kernel
implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
error.
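
As a sketch of how the two arguments are passed, the helper below fills
in an enable-capability request. `enable_cap' is a local mirror of
struct kvm_enable_cap as described in section 4.37 and should be
checked against the kernel headers; `fill_enable_hcall' is an
illustrative name, and the capability number is taken as a parameter
rather than assumed.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_enable_cap (see section 4.37). */
struct enable_cap {
	uint32_t cap;
	uint32_t flags;
	uint64_t args[4];
	uint8_t  pad[64];
};

/*
 * Prepare a request that enables or disables in-kernel handling of
 * one sPAPR hcall. cap_nr is expected to be KVM_CAP_PPC_ENABLE_HCALL.
 */
static void fill_enable_hcall(struct enable_cap *ec, uint32_t cap_nr,
			      uint64_t hcall, int enable)
{
	memset(ec, 0, sizeof(*ec));
	ec->cap = cap_nr;
	ec->args[0] = hcall;		/* sPAPR hcall number */
	ec->args[1] = enable ? 1 : 0;	/* 1: in-kernel, 0: userspace */
}
```

The filled structure would then be passed to the KVM_ENABLE_CAP ioctl
on the VM file descriptor, once per hcall to be changed.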