The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to three classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus).

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.

2. File descriptors

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
actually running guest code.

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of Unix domain sockets. These
kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.

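The descriptor hierarchy described in this section can be sketched in C
roughly as follows. This is an illustration only: the struct kvm_fds type
and the helper names are hypothetical, the device path is passed in as a
parameter so the failure path can be exercised, and error handling is
reduced to the minimum.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical holder for the three descriptor levels. */
struct kvm_fds {
    int kvm;   /* system fd, from open("/dev/kvm") */
    int vm;    /* VM fd, from KVM_CREATE_VM */
    int vcpu;  /* vcpu fd, from KVM_CREATE_VCPU */
};

/* Close whichever descriptors were opened and reset them to -1. */
void kvm_fds_close(struct kvm_fds *fds)
{
    if (fds->vcpu >= 0)
        close(fds->vcpu);
    if (fds->vm >= 0)
        close(fds->vm);
    if (fds->kvm >= 0)
        close(fds->kvm);
    fds->kvm = fds->vm = fds->vcpu = -1;
}

/*
 * Open the device node at 'path' (normally "/dev/kvm"), then create one VM
 * and one vcpu with id 0.  Returns 0 on success, -1 on failure with errno
 * describing the first call that failed.
 */
int kvm_fds_open(const char *path, struct kvm_fds *fds)
{
    int saved_errno;

    fds->kvm = fds->vm = fds->vcpu = -1;

    fds->kvm = open(path, O_RDWR);
    if (fds->kvm < 0)
        return -1;

    fds->vm = ioctl(fds->kvm, KVM_CREATE_VM, 0UL);
    if (fds->vm < 0)
        goto fail;

    fds->vcpu = ioctl(fds->vm, KVM_CREATE_VCPU, 0UL);
    if (fds->vcpu < 0)
        goto fail;

    return 0;

fail:
    saved_errno = errno;   /* close() may overwrite errno */
    kvm_fds_close(fds);
    errno = saved_errno;
    return -1;
}
```

A caller would typically invoke kvm_fds_open("/dev/kvm", &fds) once, in the
process and thread that will go on to issue the VM and vcpu ioctls.
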
3. Extensions

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.

4. API description

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4).

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.

4.1 KVM_GET_API_VERSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change. However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported. Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12. If this check passes, all ioctls
described as 'basic' will be available.

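The version check recommended above might look like the following sketch.
The helper names and the EXPECTED_KVM_API_VERSION macro are hypothetical;
the comparison is split into its own function only so that it can be
exercised without access to /dev/kvm.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* The only API version this document describes. */
#define EXPECTED_KVM_API_VERSION 12

/* Pure comparison, usable without a kvm descriptor. */
int kvm_api_version_ok(int version)
{
    return version == EXPECTED_KVM_API_VERSION;
}

/*
 * Query the version on an open /dev/kvm descriptor.  Returns 0 if the
 * stable API is present, -1 if the ioctl failed or reported any other
 * version.
 */
int kvm_require_stable_api(int kvm_fd)
{
    int version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0UL);

    if (version < 0)
        return -1;
    return kvm_api_version_ok(version) ? 0 : -1;
}
```

An application would call kvm_require_stable_api() right after opening
/dev/kvm and refuse to continue if it returns -1.
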
4.2 KVM_CREATE_VM

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory. An mmap() of a VM fd
will access the virtual machine's physical address space; offset zero
corresponds to guest physical address zero. Use of mmap() on a VM fd
is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
available.

4.3 KVM_GET_MSR_INDEX_LIST

Capability: basic
Architectures: x86
Type: system
Parameters: struct kvm_msr_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG: the msr index list is too big to fit in the array specified by
         the user.

struct kvm_msr_list {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 indices[0];
};

This ioctl returns the guest msrs that are supported. The list varies
by kvm version and host processor, but does not change otherwise. The
user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in
the indices array with their numbers.

Note: if kvm indicates it supports MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.

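A common way to use KVM_GET_MSR_INDEX_LIST is a two-call pattern: probe with
an empty array to learn the count, then allocate and call again. The sketch
below assumes an x86 build (struct kvm_msr_list is only defined there) and
relies on the behavior described above, namely that kvm adjusts nmsrs even
when it fails with E2BIG. The helper name is hypothetical.

```c
#include <errno.h>
#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/*
 * Fetch the supported MSR indices from an open /dev/kvm descriptor.
 * Returns a malloc()ed list that the caller must free(), or NULL on
 * failure.
 */
struct kvm_msr_list *kvm_get_msr_index_list(int kvm_fd)
{
    struct kvm_msr_list probe = { .nmsrs = 0 };
    struct kvm_msr_list *list;

    /*
     * First call: the empty array is too small, so the expected outcome
     * is E2BIG with probe.nmsrs adjusted to the real count.
     */
    if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &probe) < 0 && errno != E2BIG)
        return NULL;

    list = calloc(1, sizeof(*list) + probe.nmsrs * sizeof(list->indices[0]));
    if (!list)
        return NULL;
    list->nmsrs = probe.nmsrs;

    /* Second call: the array is now large enough to be filled in. */
    if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list) < 0) {
        free(list);
        return NULL;
    }
    return list;
}
```

On success, list->indices[0] through list->indices[list->nmsrs - 1] hold
the MSR numbers.
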
4.4 KVM_CHECK_EXTENSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: extension identifier (KVM_CAP_*)
Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API. Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.

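As an illustration, an extension query could be wrapped as shown below. Both
helper names are hypothetical; the first returns the raw integer so that
extensions which encode extra information in it remain usable, the second
collapses it to a yes/no answer.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Raw KVM_CHECK_EXTENSION result: 0 if unsupported, a positive value if
 * supported, -1 if the ioctl itself failed.
 */
int kvm_check_extension(int kvm_fd, long cap)
{
    return ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);
}

/* Yes/no form; an ioctl failure counts as "not available". */
int kvm_has_extension(int kvm_fd, long cap)
{
    return kvm_check_extension(kvm_fd, cap) > 0;
}
```

For example, kvm_has_extension(kvm_fd, KVM_CAP_USER_MEMORY) would be checked
before relying on KVM_SET_USER_MEMORY_REGION.
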
4.5 KVM_GET_VCPU_MMAP_SIZE

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl communicates with userspace via a shared memory
region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.

4.6 KVM_SET_MEMORY_REGION

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: struct kvm_memory_region (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.

4.6 KVM_CREATE_VCPU

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: vcpu id (apic id on x86)
Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. The vcpu id is a small integer
in the range [0, max_vcpus).

4.7 KVM_GET_DIRTY_LOG (vm ioctl)

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error

/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.

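The following sketch shows one way to size, fetch and inspect a dirty
bitmap. It assumes a 64-bit little-endian host, so that the bitmap can be
read as an array of 64-bit words, and it rounds the bitmap size up to a
whole number of such words. All helper names are hypothetical; the slot and
page sizes are supplied by the caller.

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

/* Bytes needed for a one-bit-per-page bitmap, rounded up to 64-bit words. */
size_t dirty_bitmap_bytes(uint64_t slot_bytes, uint64_t page_size)
{
    uint64_t pages = (slot_bytes + page_size - 1) / page_size;

    return (size_t)((pages + 63) / 64 * 8);
}

/* Test the bit for 'page'; bit 0 is the first page in the memory slot. */
int page_is_dirty(const uint64_t *bitmap, uint64_t page)
{
    return (int)((bitmap[page / 64] >> (page % 64)) & 1);
}

/*
 * Fetch the dirty bitmap for 'slot'.  Returns a malloc()ed bitmap that the
 * caller must free(), or NULL on failure.
 */
uint64_t *fetch_dirty_log(int vm_fd, uint32_t slot, uint64_t slot_bytes,
                          uint64_t page_size)
{
    struct kvm_dirty_log log;
    uint64_t *bitmap = calloc(1, dirty_bitmap_bytes(slot_bytes, page_size));

    if (!bitmap)
        return NULL;

    memset(&log, 0, sizeof(log));   /* clear the padding as advised above */
    log.slot = slot;
    log.dirty_bitmap = bitmap;

    if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0) {
        free(bitmap);
        return NULL;
    }
    return bitmap;
}
```

A migration loop would call fetch_dirty_log() repeatedly and resend every
page for which page_is_dirty() returns nonzero.
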
4.8 KVM_SET_MEMORY_ALIAS

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_memory_alias (in)
Returns: 0 (success), -1 (error)

This ioctl is obsolete and has been removed.

4.9 KVM_RUN

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
  EINTR: an unmasked signal is pending

This ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).

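Obtaining the implicit parameter block could be sketched as follows. The
helper names are hypothetical. Note that KVM_GET_VCPU_MMAP_SIZE is a system
ioctl and is therefore issued on the /dev/kvm descriptor, while the mmap()
itself is performed on the vcpu fd.

```c
#include <linux/kvm.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/*
 * Map the kvm_run block of a vcpu.  On success returns the mapping and
 * stores its length in *len_out (needed later for munmap()); returns NULL
 * on failure.
 */
struct kvm_run *map_kvm_run(int kvm_fd, int vcpu_fd, size_t *len_out)
{
    int len = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0UL);
    void *p;

    if (len < 0 || (size_t)len < sizeof(struct kvm_run))
        return NULL;

    p = mmap(NULL, (size_t)len, PROT_READ | PROT_WRITE, MAP_SHARED,
             vcpu_fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)len;
    return p;
}

/*
 * Enter the guest once.  Returns 0 on success and -1 on error, for example
 * with errno set to EINTR when an unmasked signal is pending.
 */
int run_vcpu_once(int vcpu_fd)
{
    return ioctl(vcpu_fd, KVM_RUN, 0UL);
}
```

After run_vcpu_once() returns 0, the mapped structure describes why the
guest stopped.
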
4.10 KVM_GET_REGS

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_regs (out)
Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

/* x86 */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 rax, rbx, rcx, rdx;
	__u64 rsi, rdi, rsp, rbp;
	__u64 r8, r9, r10, r11;
	__u64 r12, r13, r14, r15;
	__u64 rip, rflags;
};

4.11 KVM_SET_REGS

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_regs (in)
Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.

4.12 KVM_GET_SREGS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_sregs (out)
Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

/* x86 */
struct kvm_sregs {
	struct kvm_segment cs, ds, es, fs, gs, ss;
	struct kvm_segment tr, ldt;
	struct kvm_dtable gdt, idt;
	__u64 cr0, cr2, cr3, cr4, cr8;
	__u64 efer;
	__u64 apic_base;
	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};

interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.

4.13 KVM_SET_SREGS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_sregs (in)
Returns: 0 on success, -1 on error

Writes special registers into the vcpu. See KVM_GET_SREGS for the
data structures.

4.14 KVM_TRANSLATE

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_translation (in/out)
Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

struct kvm_translation {
	/* in */
	__u64 linear_address;

	/* out */
	__u64 physical_address;
	__u8 valid;
	__u8 writeable;
	__u8 usermode;
	__u8 pad[5];
};

4.15 KVM_INTERRUPT

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error

Queues a hardware interrupt vector to be injected. This is only
useful if in-kernel local APIC is not used.

/* for KVM_INTERRUPT */
struct kvm_interrupt {
	/* in */
	__u32 irq;
};

Note 'irq' is an interrupt vector, not an interrupt pin or line.

4.16 KVM_DEBUG_GUEST

Capability: basic
Architectures: none
Type: vcpu ioctl
Parameters: none
Returns: -1 on error

Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.

4.17 KVM_GET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in/out)
Returns: 0 on success, -1 on error

Reads model-specific registers from the vcpu. Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST.

struct kvm_msrs {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 pad;

	struct kvm_msr_entry entries[0];
};

struct kvm_msr_entry {
	__u32 index;
	__u32 reserved;
	__u64 data;
};

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.

4.18 KVM_SET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in)
Returns: 0 on success, -1 on error

Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.

4.19 KVM_SET_CPUID

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_cpuid (in)
Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction. Applications
should use the KVM_SET_CPUID2 ioctl if available.

struct kvm_cpuid_entry {
	__u32 function;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding;
};

/* for KVM_SET_CPUID */
struct kvm_cpuid {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry entries[0];
};

4.20 KVM_SET_SIGNAL_MASK

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note the signal will only be delivered if not blocked by the original
signal mask.

/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
	__u32 len;
	__u8 sigset[0];
};

4.21 KVM_GET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (out)
Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8 fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8 ftwx; /* in fxsave format */
	__u8 pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8 xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};

4.22 KVM_SET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (in)
Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8 fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8 ftwx; /* in fxsave format */
	__u8 pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8 xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};

4.23 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel. On x86, creates a virtual
ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSIs 16-23
only go to the IOAPIC. On ia64, an IOSAPIC is created.

4.24 KVM_IRQ_LINE

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irq_level
Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
Requires that an interrupt controller model has been previously created with
KVM_CREATE_IRQCHIP. Note that edge-triggered interrupts require the level
to be set to 1 and then back to 0.

struct kvm_irq_level {
	union {
		__u32 irq; /* GSI */
		__s32 status; /* not used for KVM_IRQ_LINE */
	};
	__u32 level; /* 0 or 1 */
};

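The 1-then-0 sequence required for edge-triggered interrupts can be sketched
as follows. The helper names are hypothetical, and the VM fd is assumed to
have had KVM_CREATE_IRQCHIP issued on it beforehand.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Drive GSI 'gsi' to 'level' (0 or 1).  Returns 0 on success, -1 on error. */
int kvm_set_irq_level(int vm_fd, unsigned int gsi, unsigned int level)
{
    struct kvm_irq_level irq;

    memset(&irq, 0, sizeof(irq));
    irq.irq = gsi;
    irq.level = level ? 1 : 0;
    return ioctl(vm_fd, KVM_IRQ_LINE, &irq);
}

/* Edge-triggered interrupt: raise the line, then lower it again. */
int kvm_pulse_irq(int vm_fd, unsigned int gsi)
{
    if (kvm_set_irq_level(vm_fd, gsi, 1) < 0)
        return -1;
    return kvm_set_irq_level(vm_fd, gsi, 0);
}
```

A level-triggered source would instead call kvm_set_irq_level() with 1 when
the device asserts its line and with 0 when the guest has serviced it.
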
4.25 KVM_GET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in/out)
Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512]; /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};

4.26 KVM_SET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: struct kvm_irqchip (in)
Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512]; /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};

4.27 KVM_XEN_HVM_CONFIG

Capability: KVM_CAP_XEN_HVM
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_xen_hvm_config (in)
Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace. When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

struct kvm_xen_hvm_config {
	__u32 flags;
	__u32 msr;
	__u64 blob_addr_32;
	__u64 blob_addr_64;
	__u8 blob_size_32;
	__u8 blob_size_64;
	__u8 pad2[30];
};

4.27 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock; /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};

4.28 KVM_SET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in
scenarios such as migration.

struct kvm_clock_data {
	__u64 clock; /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};

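The migration use of KVM_GET_CLOCK and KVM_SET_CLOCK could look roughly like
this: the source reads the clock once the guest is stopped, the value
travels with the rest of the VM state, and the destination writes it back
before resuming. The helper names are hypothetical, and how the value is
transported between hosts is outside the scope of the sketch.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Source side: read the guest's kvmclock value into *out. */
int kvm_save_clock(int vm_fd, struct kvm_clock_data *out)
{
    memset(out, 0, sizeof(*out));
    return ioctl(vm_fd, KVM_GET_CLOCK, out);
}

/* Destination side: restore a previously saved kvmclock value. */
int kvm_restore_clock(int vm_fd, __u64 saved_clock)
{
    struct kvm_clock_data data;

    memset(&data, 0, sizeof(data));   /* flags and padding stay zero */
    data.clock = saved_clock;
    return ioctl(vm_fd, KVM_SET_CLOCK, &data);
}
```

Because the destination starts from the saved value, the guest never sees
kvmclock jump backwards across the move.
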
4.29 KVM_GET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (out)
Returns: 0 on success, -1 on error

Gets currently pending exceptions, interrupts, and NMIs as well as related
states of the vcpu.

struct kvm_vcpu_events {
	struct {
		__u8 injected;
		__u8 nr;
		__u8 has_error_code;
		__u8 pad;
		__u32 error_code;
	} exception;
	struct {
		__u8 injected;
		__u8 nr;
		__u8 soft;
		__u8 shadow;
	} interrupt;
	struct {
		__u8 injected;
		__u8 pending;
		__u8 masked;
		__u8 pad;
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
};

KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state. Otherwise, this field is undefined.

4.30 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error

Sets pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending and sipi_vector. Keep the
corresponding bits in the flags field cleared to suppress overwriting the
current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

4.32 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

struct kvm_debugregs {
	__u64 db[4];
	__u64 dr6;
	__u64 dr7;
	__u64 flags;
	__u64 reserved[9];
};

4.33 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.

4.34 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEMORY
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
Returns: 0 on success, -1 on error

struct kvm_userspace_memory_region {
	__u32 slot;
	__u32 flags;
	__u64 guest_phys_addr;
	__u64 memory_size; /* bytes */
	__u64 userspace_addr; /* start of the userspace allocated memory */
};

/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES 1UL

This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.

The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
instructs kvm to keep track of writes to memory within the slot. See
the KVM_GET_DIRTY_LOG ioctl.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.

4.35 KVM_SET_TSS_ADDR

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).

4.36 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails. On systems that
do support it, it only works for extensions that are supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

struct kvm_enable_cap {
	/* in */
	__u32 cap;

The capability that is supposed to get enabled.

	__u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

	__u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

	__u8 pad[64];
};

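Enabling an extension that takes no arguments might be sketched as follows.
The helper name is hypothetical. It first asks KVM_CHECK_EXTENSION on the
/dev/kvm descriptor, as recommended above, and only then issues
KVM_ENABLE_CAP on the vcpu fd with flags and args left at zero.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Enable capability 'cap' on a vcpu if kvm reports it as available.
 * Returns 0 on success, -1 if the capability is unavailable or the
 * ioctl failed.
 */
int kvm_try_enable_cap(int kvm_fd, int vcpu_fd, __u32 cap)
{
    struct kvm_enable_cap req;

    if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, (long)cap) <= 0)
        return -1;

    memset(&req, 0, sizeof(req));   /* flags must be 0; args unused here */
    req.cap = cap;
    return ioctl(vcpu_fd, KVM_ENABLE_CAP, &req);
}
```

A feature that needs initial values would additionally fill in req.args
before the ioctl.
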
4.37 KVM_GET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, ia64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error

struct kvm_mp_state {
	__u32 mp_state;
};

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS)

This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
irqchip, the multiprocessing state must be maintained by userspace.

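For debugging output it can be handy to turn the mp_state value into a
name. The sketch below does that for the five values listed above; both
function names are hypothetical and the returned strings are arbitrary
labels chosen for this example.

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Map an mp_state value to a short label. */
const char *mp_state_name(__u32 mp_state)
{
    switch (mp_state) {
    case KVM_MP_STATE_RUNNABLE:      return "RUNNABLE";
    case KVM_MP_STATE_UNINITIALIZED: return "UNINITIALIZED";
    case KVM_MP_STATE_INIT_RECEIVED: return "INIT_RECEIVED";
    case KVM_MP_STATE_HALTED:        return "HALTED";
    case KVM_MP_STATE_SIPI_RECEIVED: return "SIPI_RECEIVED";
    default:                         return "unknown";
    }
}

/* Query a vcpu and return the label, or NULL if the ioctl failed. */
const char *vcpu_mp_state_name(int vcpu_fd)
{
    struct kvm_mp_state st;

    memset(&st, 0, sizeof(st));
    if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &st) < 0)
        return NULL;
    return mp_state_name(st.mp_state);
}
```
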
8534.38 KVM_SET_MP_STATE
854
855Capability: KVM_CAP_MP_STATE
856Architectures: x86, ia64
857Type: vcpu ioctl
858Parameters: struct kvm_mp_state (in)
859Returns: 0 on success; -1 on error
860
861Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
862arguments.
863
864This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
865irqchip, the multiprocessing state must be maintained by userspace.
866
Avi Kivity47dbb842010-04-29 12:08:56 +03008674.39 KVM_SET_IDENTITY_MAP_ADDR
868
869Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
870Architectures: x86
871Type: vm ioctl
872Parameters: unsigned long identity (in)
873Returns: 0 on success, -1 on error
874
875This ioctl defines the physical address of a one-page region in the guest
876physical address space. The region must be within the first 4GB of the
877guest physical address space and must not conflict with any memory slot
878or any mmio address. The guest may malfunction if it accesses this memory
879region.
880
881This ioctl is required on Intel-based hosts. This is needed on Intel hardware
882because of a quirk in the virtualization implementation (see the internals
883documentation when it pops into existence).
884
Avi Kivity57bc24c2010-04-29 12:12:57 +03008854.40 KVM_SET_BOOT_CPU_ID
886
887Capability: KVM_CAP_SET_BOOT_CPU_ID
888Architectures: x86, ia64
889Type: vm ioctl
890Parameters: unsigned long vcpu_id
891Returns: 0 on success, -1 on error
892
893Define which vcpu is the Bootstrap Processor (BSP). Values are the same
894as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
895is vcpu 0.
896
Sheng Yang2d5b5a62010-06-13 17:29:39 +08008974.41 KVM_GET_XSAVE
898
899Capability: KVM_CAP_XSAVE
900Architectures: x86
901Type: vcpu ioctl
902Parameters: struct kvm_xsave (out)
903Returns: 0 on success, -1 on error
904
905struct kvm_xsave {
906 __u32 region[1024];
907};
908
909This ioctl would copy current vcpu's xsave struct to the userspace.
910
9114.42 KVM_SET_XSAVE
912
913Capability: KVM_CAP_XSAVE
914Architectures: x86
915Type: vcpu ioctl
916Parameters: struct kvm_xsave (in)
917Returns: 0 on success, -1 on error
918
919struct kvm_xsave {
920 __u32 region[1024];
921};
922
923This ioctl would copy userspace's xsave struct to the kernel.
924
9254.43 KVM_GET_XCRS
926
927Capability: KVM_CAP_XCRS
928Architectures: x86
929Type: vcpu ioctl
930Parameters: struct kvm_xcrs (out)
931Returns: 0 on success, -1 on error
932
933struct kvm_xcr {
934 __u32 xcr;
935 __u32 reserved;
936 __u64 value;
937};
938
939struct kvm_xcrs {
940 __u32 nr_xcrs;
941 __u32 flags;
942 struct kvm_xcr xcrs[KVM_MAX_XCRS];
943 __u64 padding[16];
944};
945
946This ioctl would copy current vcpu's xcrs to the userspace.
947
9484.44 KVM_SET_XCRS
949
950Capability: KVM_CAP_XCRS
951Architectures: x86
952Type: vcpu ioctl
953Parameters: struct kvm_xcrs (in)
954Returns: 0 on success, -1 on error
955
956struct kvm_xcr {
957 __u32 xcr;
958 __u32 reserved;
959 __u64 value;
960};
961
962struct kvm_xcrs {
963 __u32 nr_xcrs;
964 __u32 flags;
965 struct kvm_xcr xcrs[KVM_MAX_XCRS];
966 __u64 padding[16];
967};
968
969This ioctl would set vcpu's xcr to the value userspace specified.
970
Avi Kivityd1535132010-07-14 09:45:21 +03009714.45 KVM_GET_SUPPORTED_CPUID
972
973Capability: KVM_CAP_EXT_CPUID
974Architectures: x86
975Type: system ioctl
976Parameters: struct kvm_cpuid2 (in/out)
977Returns: 0 on success, -1 on error
978
979struct kvm_cpuid2 {
980 __u32 nent;
981 __u32 padding;
982 struct kvm_cpuid_entry2 entries[0];
983};
984
985#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1
986#define KVM_CPUID_FLAG_STATEFUL_FUNC 2
987#define KVM_CPUID_FLAG_STATE_READ_NEXT 4
988
989struct kvm_cpuid_entry2 {
990 __u32 function;
991 __u32 index;
992 __u32 flags;
993 __u32 eax;
994 __u32 ebx;
995 __u32 ecx;
996 __u32 edx;
997 __u32 padding[3];
998};
999
1000This ioctl returns x86 cpuid features which are supported by both the hardware
1001and kvm. Userspace can use the information returned by this ioctl to
1002construct cpuid information (for KVM_SET_CPUID2) that is consistent with
1003hardware, kernel, and userspace capabilities, and with user requirements (for
1004example, the user may wish to constrain cpuid to emulate older hardware,
1005or for feature consistency across a cluster).
1006
1007Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
1008with the 'nent' field indicating the number of entries in the variable-size
1009array 'entries'. If the number of entries is too low to describe the cpu
1010capabilities, an error (E2BIG) is returned. If the number is too high,
1011the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
1012number is just right, the 'nent' field is adjusted to the number of valid
1013entries in the 'entries' array, which is then filled.
1014
The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out.  The fields in each entry
are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
  eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

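As an illustration of how the function, index and flags fields fit together,
the sketch below looks up the entry for a given function/index pair and only
compares the index when KVM_CPUID_FLAG_SIGNIFCANT_INDEX is set.  It is a
hedged example, not kvm code: struct cpuid_entry2 and
CPUID_FLAG_SIGNIFCANT_INDEX are local mirrors of the definitions above,
find_cpuid_entry is a hypothetical helper name, and stateful functions
(KVM_CPUID_FLAG_STATEFUL_FUNC) are deliberately not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of KVM_CPUID_FLAG_SIGNIFCANT_INDEX (illustration only). */
#define CPUID_FLAG_SIGNIFCANT_INDEX 1

/* Local mirror of struct kvm_cpuid_entry2 (illustration only). */
struct cpuid_entry2 {
	uint32_t function, index, flags;
	uint32_t eax, ebx, ecx, edx;
	uint32_t padding[3];
};

/*
 * Return the first entry matching 'function'.  The 'index' argument is
 * compared only for entries whose index field is flagged as valid.
 * Returns NULL if no entry matches.
 */
const struct cpuid_entry2 *find_cpuid_entry(const struct cpuid_entry2 *entries,
					    uint32_t nent, uint32_t function,
					    uint32_t index)
{
	uint32_t i;

	for (i = 0; i < nent; i++) {
		const struct cpuid_entry2 *e = &entries[i];

		if (e->function != function)
			continue;
		if ((e->flags & CPUID_FLAG_SIGNIFCANT_INDEX) && e->index != index)
			continue;
		return e;
	}
	return NULL;
}
```

A caller that wants to constrain the guest's cpuid could use such a lookup to
find an entry and clear bits in its eax/ebx/ecx/edx values before passing the
table to KVM_SET_CPUID2.
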
4.46 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error

struct kvm_ppc_pvinfo {
	__u32 flags;
	__u32 hcall[4];
	__u8  pad[108];
};

This ioctl fetches PV specific information that needs to be passed to the guest
using the device tree or other means from vm context.

For now the only implemented piece of information distributed here is an array
of 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

5. The kvm_run structure

Application code obtains a pointer to the kvm_run structure by
mmap()ing a vcpu fd.  From that point, application code can control
execution by changing fields in kvm_run prior to calling the KVM_RUN
ioctl, and obtain information about the reason KVM_RUN returned by
looking up structure members.

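The mapping step can be sketched as follows.  This is a minimal example under
stated assumptions rather than a reference implementation: map_kvm_run is a
hypothetical helper name, kvm_fd is assumed to be the /dev/kvm handle and
vcpu_fd a descriptor returned by KVM_CREATE_VCPU, and the size of the mapping
is taken from the KVM_GET_VCPU_MMAP_SIZE system ioctl.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/*
 * Map the kvm_run area of a vcpu.  Returns NULL on any failure.
 * The mapping may be larger than sizeof(struct kvm_run); the extra
 * space is used, for example, for port I/O data (see io.data_offset).
 */
struct kvm_run *map_kvm_run(int kvm_fd, int vcpu_fd)
{
	int size = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
	void *p;

	if (size < 0 || (size_t)size < sizeof(struct kvm_run))
		return NULL;

	p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 vcpu_fd, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

The mapping is shared with the kernel, so fields written before KVM_RUN are
seen by kvm and fields written by kvm are visible once KVM_RUN returns.
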
struct kvm_run {
	/* in */
	__u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest.  Useful in conjunction with KVM_INTERRUPT.

	__u8 padding1[7];

	/* out */
	__u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned.  Allowable values for this
field are detailed below.

	__u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
an interrupt can be injected now with KVM_INTERRUPT.

	__u8 if_flag;

The value of the current interrupt flag.  Only valid if in-kernel
local APIC is not used.

	__u8 padding2[2];

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;

The value of the cr8 register.  Only valid if in-kernel local APIC is
not used.  Both input and output.

	__u64 apic_base;

The value of the APIC BASE msr.  Only valid if in-kernel local
APIC is not used.  Both input and output.

	union {
		/* KVM_EXIT_UNKNOWN */
		struct {
			__u64 hardware_exit_reason;
		} hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons.  Further architecture-specific information is available in
hardware_exit_reason.

		/* KVM_EXIT_FAIL_ENTRY */
		struct {
			__u64 hardware_entry_failure_reason;
		} fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons.  Further architecture-specific information is
available in hardware_entry_failure_reason.

		/* KVM_EXIT_EXCEPTION */
		struct {
			__u32 exception;
			__u32 error_code;
		} ex;

Unused.

		/* KVM_EXIT_IO */
		struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
			__u8 direction;
			__u8 size; /* bytes */
			__u16 port;
			__u32 count;
			__u64 data_offset; /* relative to kvm_run start */
		} io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN).  Data format is a packed array.

		struct {
			struct kvm_debug_exit_arch arch;
		} debug;

Unused.

		/* KVM_EXIT_MMIO */
		struct {
			__u64 phys_addr;
			__u8  data[8];
			__u32 len;
			__u8  is_write;
		} mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm.  The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN.  The kernel side will first finish
incomplete operations and then check for pending signals.  Userspace
can re-enter the guest with an unmasked signal pending to complete
pending operations.

		/* KVM_EXIT_HYPERCALL */
		struct {
			__u64 nr;
			__u64 args[6];
			__u64 ret;
			__u32 longmode;
			__u32 pad;
		} hypercall;

Unused.  This was once used for 'hypercall to userspace'.  To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

		/* KVM_EXIT_TPR_ACCESS */
		struct {
			__u64 rip;
			__u32 is_write;
			__u32 pad;
		} tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

		/* KVM_EXIT_S390_SIEIC */
		struct {
			__u8 icptcode;
			__u64 mask; /* psw upper half */
			__u64 addr; /* psw lower half */
			__u16 ipa;
			__u32 ipb;
		} s390_sieic;

s390 specific.

		/* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
		__u64 s390_reset_flags;

s390 specific.

		/* KVM_EXIT_DCR */
		struct {
			__u32 dcrn;
			__u32 data;
			__u8  is_write;
		} dcr;

powerpc specific.

		/* KVM_EXIT_OSI */
		struct {
			__u64 gprs[32];
		} osi;

MOL (Mac-on-Linux) uses a special hypercall interface it calls 'OSI'.  To
enable it, we catch hypercalls and exit with this exit struct, which contains
all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and, when it is done, modify the gprs
as necessary.  Upon guest entry all guest GPRs will then be replaced by the
values in this struct.

		/* Fix the size of the union. */
		char padding[256];
	};
};
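
To tie the KVM_EXIT_IO and KVM_EXIT_MMIO descriptions together, here is a
hedged sketch of how application code might dispatch on exit_reason.  It is
an illustration under stated assumptions, not the way any particular
userspace does it: the toy device state (toy_port_last, toy_mmio_reg) and the
helper names handle_io, handle_mmio, dispatch_exit and run_vcpu are all
hypothetical, and 'run' is assumed to point at the mmap()ed kvm_run area of
the vcpu.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical toy device state, for illustration only. */
uint8_t toy_port_last;		/* last byte of the most recent port write */
uint8_t toy_mmio_reg[8];	/* backing store for one MMIO register */

/* Port I/O: the packed data array lives at io.data_offset inside the mapping. */
void handle_io(struct kvm_run *run)
{
	uint8_t *data = (uint8_t *)run + run->io.data_offset;
	size_t total = (size_t)run->io.size * run->io.count;

	if (run->io.direction == KVM_EXIT_IO_OUT) {
		if (total > 0)
			toy_port_last = data[total - 1];
	} else {
		/* KVM_EXIT_IO_IN: provide the data consumed by the next KVM_RUN. */
		memset(data, 0xff, total);
	}
}

/* MMIO: 'data' carries the written bytes, or must be filled for a read. */
void handle_mmio(struct kvm_run *run)
{
	size_t len = run->mmio.len;

	if (len > sizeof(run->mmio.data))
		len = sizeof(run->mmio.data);
	if (run->mmio.is_write)
		memcpy(toy_mmio_reg, run->mmio.data, len);
	else
		memcpy(run->mmio.data, toy_mmio_reg, len);
}

/* Returns 0 if the exit was handled, -1 if this sketch does not handle it. */
int dispatch_exit(struct kvm_run *run)
{
	switch (run->exit_reason) {
	case KVM_EXIT_IO:
		handle_io(run);
		return 0;
	case KVM_EXIT_MMIO:
		handle_mmio(run);
		return 0;
	default:
		return -1;
	}
}

/*
 * Run the vcpu until KVM_RUN fails (returns -1) or an exit this sketch does
 * not handle occurs (returns the exit reason).  Looping back into KVM_RUN
 * after a handled exit is what completes the pending I/O operation.
 */
int run_vcpu(int vcpu_fd, struct kvm_run *run)
{
	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
			return -1;
		if (dispatch_exit(run) < 0)
			return (int)run->exit_reason;
	}
}
```

A real loop would also need to treat an EINTR failure of KVM_RUN as a
request to handle the pending signal and re-enter, rather than as a fatal
error.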