diff options
Diffstat (limited to 'doc/KVM-Test-API.asciidoc')
-rw-r--r-- | doc/KVM-Test-API.asciidoc | 487 |
1 files changed, 487 insertions, 0 deletions
diff --git a/doc/KVM-Test-API.asciidoc b/doc/KVM-Test-API.asciidoc new file mode 100644 index 000000000..7e537afcb --- /dev/null +++ b/doc/KVM-Test-API.asciidoc @@ -0,0 +1,487 @@ +LTP KVM Test API +================ + +Testing KVM is more complex than other Linux features. Some KVM bugs allow +userspace code running inside the virtual machine to bypass (emulated) hardware +access restrictions and elevate its privileges inside the guest operating +system. The worst types of KVM bugs may even allow the guest code to crash or +compromise the physical host. KVM tests therefore need to be split into two +components – a KVM controller program running on the physical host and a guest +payload program running inside the VM. The cooperation of these two components +allows testing even of bugs that somehow cross the virtualization boundary. + +NOTE: See also + https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines[Test Writing Guidelines], + https://github.com/linux-test-project/ltp/wiki/C-Test-Case-Tutorial[C Test Case Tutorial], + https://github.com/linux-test-project/ltp/wiki/C-Test-API[C Test API]. + +1. Basic KVM test structure +--------------------------- + +KVM tests are simple C source files containing both the KVM controller code +and the guest payload code separated by `#ifdef COMPILE_PAYLOAD` preprocessor +condition. The file will be compiled twice. Once to compile the payload part, +once to compile the KVM controller part and embed the payload binary inside. +The result is a single self-contained binary that'll execute the embedded +payload inside a KVM virtual machine and print results in the same format as +a normal LTP test. + +A KVM test source should start with `#include "kvm_test.h"` instead of the +usual `tst_test.h`. The `kvm_test.h` header file will include the other basic +headers appropriate for the current compilation pass. Everything else in the +source file should be enclosed in `#ifdef COMPILE_PAYLOAD ... #else ... #endif` +condition, including any other header file includes. Note that the standard +LTP headers are not available in the payload compilation pass, only the KVM +guest library headers can be included. + +.Example KVM test +[source,c] +------------------------------------------------------------------------------- +#include "kvm_test.h" + +#ifdef COMPILE_PAYLOAD + +/* Guest payload code */ + +void main(void) +{ + tst_res(TPASS, "Hello, world!"); +} + +#else /* COMPILE_PAYLOAD */ + +/* KVM controller code */ + +static struct tst_test test = { + .test_all = tst_kvm_run, + .setup = tst_kvm_setup, + .cleanup = tst_kvm_cleanup, +}; + +#endif /* COMPILE_PAYLOAD */ +------------------------------------------------------------------------------- + +The KVM controller code is a normal LTP test and needs to define an instance +of `struct tst_test` with metadata and the usual setup, cleanup, and test +functions. Basic implementation of all three functions is provided by the KVM +host library. + +On the other hand, the payload is essentially a tiny kernel that'll run +on bare virtual hardware. It cannot access any files, Linux syscalls, standard +library functions, etc. except for the small subset provided by the KVM guest +library. The payload code must define a `void main(void)` function which will +be the VM entry point of the test. + +2. KVM host library +------------------- + +The KVM host library provides helper functions for creating and running +a minimal KVM virtual machine. + +2.1 Data structures +~~~~~~~~~~~~~~~~~~~ + +[source,c] +------------------------------------------------------------------------------- +struct tst_kvm_instance { + int vm_fd, vcpu_fd; + struct kvm_run *vcpu_info; + size_t vcpu_info_size; + struct kvm_userspace_memory_region ram[MAX_KVM_MEMSLOTS]; + struct tst_kvm_result *result; +}; +------------------------------------------------------------------------------- + +`struct tst_kvm_instance` holds the file descriptors and memory buffers +of a single KVM virtual machine: + +- `int vm_fd` is the main VM file descriptor created by `ioctl(KVM_CREATE_VM)` +- `int vcpu_fd` is the virtual CPU filedescriptor created by + `ioctl(KVM_CREATE_VCPU)` +- `struct kvm_run *vcpu_info` is the VCPU state structure created by + `mmap(vcpu_fd)` +- `size_t vcpu_info_size` is the size of `vcpu_info` buffer +- `struct kvm_userspace_memory_region ram[MAX_KVM_MEMSLOTS]` is the list + of memory slots defined in this VM. Unused memory slots have zero + in the `userspace_addr` field. +- `struct tst_kvm_result *result` is a buffer for passing test result data + from the VM to the controller program, mainly `tst_res()`/`tst_brk()` flags + and messages. + +[source,c] +------------------------------------------------------------------------------- +struct tst_kvm_result { + int32_t result; + int32_t lineno; + uint64_t file_addr; + char message[0]; +}; +------------------------------------------------------------------------------- + +`struct tst_kvm_result` is used to pass test results and synchronization data +between the KVM guest and the controller program. Most often, it is used +to pass `tst_res()` and `tst_brk()` messages from the VM, but special values +can also be used to send control flow requests both ways. + +- `int32_t result` is the message type, either one of the `TPASS`, `TFAIL`, + `TWARN`, `TBROK`, `TINFO` flags or a special control flow value. Errno flags + are not supported. +- `int32_t lineno` is the line number for `tst_res()`/`tst_brk()` messages. +- `uint64_t file_addr` is the VM address of the filename string for + `tst_res()`/`tst_brk()` messages. +- `char message[0]` is the buffer for arbitrary message data, most often used + to pass `tst_res()`/`tst_brk()` message strings. + +2.2 Working with virtual machines +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The KVM host library provides default implementation of the setup, cleanup +and test functions for `struct tst_test` in cases where you do not need +to customize the VM configuration. You can either assign these functions +to the `struct tst_test` instance directly or call them from your own function +that does some additional steps. All three function must be used together. + +- `void tst_kvm_setup(void)` +- `void tst_kvm_run(void)` +- `void tst_kvm_cleanup(void)` + +Note: `tst_kvm_run()` calls `tst_free_all()`. Calling it will free all +previously allocated guarded buffers. + +- `void tst_kvm_validate_result(int value)` – Validate whether the value + returned in `struct tst_kvm_result.result` can be safely passed + to `tst_res()` or `tst_brk()`. If the value is not valid, the controller + program will be terminated with error. + +- `uint64_t tst_kvm_get_phys_address(const struct tst_kvm_instance *inst, + uint64_t addr)` – Converts pointer value (virtual address) from KVM virtual + machine `inst` to the corresponding physical address. Returns 0 if + the virtual address is not mapped to any physical address. If virtual memory + mapping is not enabled in the VM or not available on the arch at all, this + function simply returns `addr` as is. + +- `int tst_kvm_find_phys_memslot(const struct tst_kvm_instance *inst, + uint64_t paddr)` – Returns index of the memory slot in KVM virtual machine + `inst` which contains the physical address `paddr`. If the address is not + backed by a memory buffer, returns -1. + +- `int tst_kvm_find_memslot(const struct tst_kvm_instance *inst, + uint64_t addr)` – Returns index of the memory slot in KVM virtual machine + `inst` which contains the virtual address `addr`. If the virtual address + is not mapped to a valid physical address backed by a memory buffer, + returns -1. + +- `void *tst_kvm_get_memptr(const struct tst_kvm_instance *inst, + uint64_t addr)` – Converts pointer value (virtual address) from KVM virtual + machine `inst` to host-side pointer. + +- `void *tst_kvm_alloc_memory(struct tst_kvm_instance *inst, unsigned int slot, + uint64_t baseaddr, size_t size, unsigned int flags)` – Allocates a guarded + buffer of given `size` in bytes and installs it into specified memory `slot` + of the KVM virtual machine `inst` at base address `baseaddr`. The buffer + will be automatically page aligned at both ends. See the kernel + documentation of `KVM_SET_USER_MEMORY_REGION` ioctl for list of valid + `flags`. Returns pointer to page-aligned beginning of the allocated buffer. + The actual requested `baseaddr` will be located at + `ret + baseaddr % pagesize`. + +- `struct kvm_cpuid2 *tst_kvm_get_cpuid(int sysfd)` – Get a list of supported + virtual CPU features returned by `ioctl(KVM_GET_SUPPORTED_CPUID)`. + The argument must be an open file descriptor returned by `open("/dev/kvm")`. + +- `void tst_kvm_create_instance(struct tst_kvm_instance *inst, + size_t ram_size)` – Creates and fully initializes a new KVM virtual machine + with at least `ram_size` bytes of memory. The VM instance info will be + stored in `inst`. + +- `int tst_kvm_run_instance(struct tst_kvm_instance *inst, int exp_errno)` – + Executes the program installed in KVM virtual machine `inst`. Any result + messages returned by the VM will be automatically printed to controller + program output. Returns zero. If `exp_errno` is non-zero, the VM execution + syscall is allowed to fail with the `exp_errno` error code and + `tst_kvm_run_instance()` will return -1 instead of terminating the test. + +- `void tst_kvm_destroy_instance(struct tst_kvm_instance *inst)` – Deletes + the KVM virtual machine `inst`. Note that the guarded buffers assigned + to the VM by `tst_kvm_create_instance()` or `tst_kvm_alloc_memory()` will + not be freed. + +The KVM host library does not provide any way to reset a VM instance back +to initial state. Running multiple iterations of the test requires destroying +the old VM instance and creating a new one. Otherwise the VM will exit +without reporting any results on the second iteration and the test will fail. +The `tst_kvm_run()` function handles this issue correctly. + +3. KVM guest library +-------------------- + +The KVM guest library provides a minimal implementation of both the LTP +test library and the standard C library functions. Do not try to include +the usual LTP or C headers in guest payload code, it will not work. + +3.1 Standard C functions +~~~~~~~~~~~~~~~~~~~~~~~~ + +`#include "kvm_test.h"` + +The functions listed below are implemented according to the C standard: + +- `void *memset(void *dest, int val, size_t size)` +- `void *memzero(void *dest, size_t size)` +- `void *memcpy(void *dest, const void *src, size_t size)` +- `char *strcpy(char *dest, const char *src)` +- `char *strcat(char *dest, const char *src)` +- `size_t strlen(const char *str)` + +3.2 LTP library functions +~~~~~~~~~~~~~~~~~~~~~~~~~ + +`#include "kvm_test.h"` + +The KVM guest library currently provides the LTP functions for reporting test +results. All standard result flags except for `T*ERRNO` are supported +with the same rules as usual. However, the printf-like formatting is not +implemented yet. + +- `void tst_res(int result, const char *message)` +- `void tst_brk(int result, const char *message) __attribute__((noreturn))` + +A handful of useful macros is also available: + +- `TST_TEST_TCONF(message)` – Generates a test program that will simply print + a `TCONF` message and exit. This is useful when the real test cannot be + built due to missing dependencies or arch limitations. + +- `ARRAY_SIZE(arr)` – Returns the number of items in statically allocated + array `arr`. + +- `LTP_ALIGN(x, a)` – Aligns integer `x` to be a multiple of `a`, which + must be a power of 2. + +3.3 Arch independent functions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +`#include "kvm_test.h"` + +Memory management in KVM guest library currently uses only primitive linear +buffer for memory allocation. There are no checks whether the VM can allocate +more memory and the already allocated memory cannot be freed. + +- `void *tst_heap_alloc(size_t size)` – Allocates a block of memory on the heap. + +- `void *tst_heap_alloc_aligned(size_t size, size_t align)` – Allocates + a block of memory on the heap with the starting address aligned to given + value. + +3.4 x86 specific functions +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +`#include "kvm_test.h"` + +`#include "kvm_x86.h"` + +- `struct kvm_interrupt_frame` – Opaque arch-dependent structure which holds + interrupt frame information. Use the functions below to get individual values: + +- `uintptr_t kvm_get_interrupt_ip(const struct kvm_interrupt_frame *ifrm)` – + Get instruction pointer value from interrupt frame structure. This may be + the instruction which caused an interrupt or the one immediately after, + depending on the interrupt vector semantics. + +- `int (*tst_interrupt_callback)(void *userdata, + struct kvm_interrupt_frame *ifrm, unsigned long errcode)` – Interrupt handler + callback prototype. When an interrupt occurs, the assigned callback function + will be passed the `userdata` pointer that was given + to `tst_set_interrupt_callback()`, interrupt frame `ifrm` and the error + code `errcode` defined by the interrupt vector semantics. If the interrupt + vector does not generate an error code, `errcode` will be set to zero. + The callback function must return 0 if the interrupt was successfully + handled and test execution should resume. Non-zero return value means that + the interrupt could not be handled and the test will terminate with error. + +- `void tst_set_interrupt_callback(unsigned int vector, + tst_interrupt_callback func, void *userdata)` – Register new interrupt + handler callback function `func` for interrupt `vector`. The `userdata` + argument is an arbitrary pointer that will be passed to `func()` every time + it gets called. The previous interrupt handler callback will be removed. + Setting `func` to `NULL` will remove any existing interrupt handler + from `vector` and the interrupt will become fatal error. + +[source,c] +------------------------------------------------------------------------------- +struct page_table_entry_pae { + unsigned int present: 1; + unsigned int writable: 1; + unsigned int user_access: 1; + unsigned int write_through: 1; + unsigned int disable_cache: 1; + unsigned int accessed: 1; + unsigned int dirty: 1; + unsigned int page_type: 1; + unsigned int global: 1; + unsigned int padding: 3; + uint64_t address: 40; + unsigned int padding2: 7; + unsigned int prot_key: 4; + unsigned int noexec: 1; +} __attribute__((__packed__)); + +struct kvm_cpuid { + unsigned int eax, ebx, ecx, edx; +}; + +struct kvm_cregs { + unsigned long cr0, cr2, cr3, cr4; +}; + +struct kvm_sregs { + uint16_t cs, ds, es, fs, gs, ss; +}; +------------------------------------------------------------------------------- + +`struct page_table_entry_pae` is the page table entry structure for PAE and +64bit paging modes. See Intel(R) 64 and IA-32 Architectures Software +Developer's Manual, Volume 3, Chapter 4 for explanation of the fields. + +- `uintptr_t kvm_get_page_address_pae(const struct page_table_entry_pae *entry)` + – Returns the physical address of the memory page referenced by the given + page table `entry`. Depending on memory mapping changes done by the test, + the physical address may not be a valid pointer. The caller must determine + whether the address points to another page table entry or a data page, using + the known position in page table hierarchy and `entry->page_type`. Returns + zero if the `entry` does not reference any memory page. + +- `void kvm_set_segment_descriptor(struct segment_descriptor *dst, uint64_t baseaddr, uint32_t limit, unsigned int flags)` - + Fill the `dst` segment descriptor with given values. The maximum value + of `limit` is `0xfffff` (inclusive) regardless of `flags`. + +- `void kvm_parse_segment_descriptor(struct segment_descriptor *src, uint64_t *baseaddr, uint32_t *limit, unsigned int *flags)` - + Parse data in the `src` segment descriptor and copy them to variables + pointed to by the other arguments. Any parameter except the first one can + be `NULL`. + +- `int kvm_find_free_descriptor(const struct segment_descriptor *table, size_t size)` - + Find the first segment descriptor in `table` which does not have + the `SEGFLAG_PRESENT` bit set. The function handles double-size descriptors + correctly. Returns index of the first available descriptor or -1 if all + `size` descriptors are taken. + +- `unsigned int kvm_create_stack_descriptor(struct segment_descriptor *table, size_t tabsize, void *stack_base)` - + Convenience function for registering a stack segment descriptor. It'll + automatically find a free slot in `table` and fill the necessary flags. + The `stack_base` pointer must point to the bottom of the stack. + +- `void kvm_get_cpuid(unsigned int eax, unsigned int ecx, + struct kvm_cpuid *buf)` – Executes the CPUID instruction with the given + `eax` and `ecx` arguments and stores the results in `buf`. + +- `void kvm_read_cregs(struct kvm_cregs *buf)` – Copies the current values + of control registers to `buf`. + +- `void kvm_read_sregs(struct kvm_sregs *buf)` - Copies the current values + of segment registers to `buf`. + +- `uint64_t kvm_rdmsr(unsigned int msr)` – Returns the current value + of model-specific register `msr`. + +- `void kvm_wrmsr(unsigned int msr, uint64_t value)` – Stores `value` + into model-specific register `msr`. + +- `void kvm_exit(void) __attribute__((noreturn))` – Terminate the test. + Similar to calling `exit(0)` in a regular LTP test, although `kvm_exit()` + will terminate only one iteration of the test, not the whole host process. + +See Intel(R) 64 and IA-32 Architectures Software Developer's Manual +for documentation of standard and model-specific x86 registers. + +3.5 AMD SVM helper functions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +`#include "kvm_test.h"` + +`#include "kvm_x86.h"` + +`#include "kvm_x86_svm.h"` + +The KVM guest library provides basic helper functions for creating and running +nested virtual machines using the AMD SVM technology. + +.Example code to execute nested VM +[source,c] +------------------------------------------------------------------------------- +int guest_main(void) +{ + ... + return 0; +} + +void main(void) +{ + struct kvm_svm_vcpu *vm; + + kvm_init_svm(); + vm = kvm_create_svm_vcpu(guest_main, 1); + kvm_svm_vmrun(vm); +} +------------------------------------------------------------------------------- + +- `int kvm_is_svm_supported(void)` - Returns non-zero value if the CPU + supports AMD SVM, otherwise returns 0. + +- `int kvm_get_svm_state(void)` - Returns non-zero value if SVM is currently + enabled, otherwise returns 0. + +- `void kvm_set_svm_state(int enabled)` - Enable or disable SVM according + to argument. If SVM is disabled by host or not supported, the test will exit + with `TCONF`. + +- `void kvm_init_svm(void)` - Enable and fully initialize SVM, including + allocating and setting up host save area VMCB. If SVM is disabled by host or + not supported, the test will exit with `TCONF`. + +- `struct kvm_vmcb *kvm_alloc_vmcb(void)` - Allocate new VMCB structure + with correct memory alignment and fill it with zeroes. + +- `void kvm_vmcb_set_intercept(struct kvm_vmcb *vmcb, unsigned int id, unsigned int state)` - + Set SVM intercept bit `id` to given `state`. + +- `void kvm_init_guest_vmcb(struct kvm_vmcb *vmcb, uint32_t asid, uint16_t ss, void *rsp, int (*guest_main)(void))` - + Initialize new SVM virtual machine. The `asid` parameter is the nested + page table ID. The `ss` and `rsp` parameters set the stack segment and stack + pointer values, respectively. The `guest_main` parameter sets the code entry + point of the virtual machine. All control registers, segment registers + (except stack segment register), GDTR and IDTR will be copied + from the current CPU state. + +- `struct kvm_svm_vcpu *kvm_create_svm_vcpu(int (*guest_main)(void), int alloc_stack)` - + Convenience function for allocating and initializing new SVM virtual CPU. + The `guest_main` parameter is passed to `kvm_init_guest_vmcb()`, + the `alloc_stack` parameter controls whether a new 8KB stack will be + allocated and registered in GDT. Interception will be enabled for `VMSAVE` + and `HLT` instructions. If you set `alloc_stack` to zero, you must configure + the stack segment register and stack pointer manually. + +- `void kvm_svm_vmrun(struct kvm_svm_vcpu *cpu)` - Start or continue execution + of a nested virtual machine. Beware that FPU state is not saved. Do not use + floating point types or values in nested guest code. Also do not use + `tst_res()` or `tst_brk()` functions in nested guest code. + +See AMD64 Architecture Programmer's Manual Volume 2 for documentation +of the Secure Virtual Machine (SVM) technology. + +4. KVM guest environment +------------------------ + +KVM guest payload execution begins with bootstrap code which will perform +the minimal guest environment setup required for running C code: + +- Activate the appropriate CPU execution mode (IA-32 protected mode + on 32-bit x86 or the 64-bit mode on x86_64). +- Create indentity mapping (virtual address = physical address) of the lower + 2GB memory region, even if parts of the region are not backed by any host + memory buffers. The memory region above 2GB threshold is left unmapped + except for one memory page reserved for the `struct tst_kvm_result` buffer. +- Initialize 8KB stack. +- Install default interrupt handlers for standard CPU exception vectors. + +When the environment setup is complete, bootstrap will call `void main(void)` +function implemented by the test program. To finish execution of guest payload, +the test can either return from the `main()` function or call `kvm_exit()` +at any point. |