diff -urN linux-2.4.19.orig/CREDITS linux-2.4.19/CREDITS --- linux-2.4.19.orig/CREDITS Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/CREDITS Sat Nov 16 02:10:51 2002 @@ -1905,6 +1905,10 @@ S: Halifax, Nova Scotia S: Canada B3J 3C8 +N: Toshiyuki Maeda +E: tosh@is.s.u-tokyo.ac.jp +D: Kernel Mode Linux + N: Kai Mäkisara E: Kai.Makisara@metla.fi D: SCSI Tape Driver diff -urN linux-2.4.19.orig/Documentation/00-INDEX linux-2.4.19/Documentation/00-INDEX --- linux-2.4.19.orig/Documentation/00-INDEX Mon Aug 27 23:44:15 2001 +++ linux-2.4.19/Documentation/00-INDEX Sat Nov 16 02:10:51 2002 @@ -108,6 +108,8 @@ - listing of various WWW + books that document kernel internals. kernel-parameters.txt - summary listing of command line / boot prompt args for the kernel. +kml.txt + - info on Kernel Mode Linux. kmod.txt - info on the kernel module loader/unloader (kerneld replacement). locks.txt diff -urN linux-2.4.19.orig/Documentation/Configure.help linux-2.4.19/Documentation/Configure.help --- linux-2.4.19.orig/Documentation/Configure.help Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/Documentation/Configure.help Sat Nov 16 02:10:51 2002 @@ -140,6 +140,18 @@ If you don't know what to do here, say N. +Kernel Mode Linux support +CONFIG_KERNEL_MODE_LINUX + This enables Kernel Mode Linux. In Kernel Mode Linux, user programs + can be executed safely in a kernel mode and access a kernel address space + directly. Thus, for example, costly mode switching between a user and a kernel + can be eliminated. If you say Y here, the kernel enables Kernel Mode Linux. + + More information about Kernel Mode Linux can be found in the + + + If you don't know what to do here, say N. + Intel or compatible 80x86 processor CONFIG_X86 This is Linux's home port. 
Linux was originally native to the Intel
diff -urN linux-2.4.19.orig/Documentation/kml.txt linux-2.4.19/Documentation/kml.txt
--- linux-2.4.19.orig/Documentation/kml.txt Thu Jan 1 09:00:00 1970
+++ linux-2.4.19/Documentation/kml.txt Sat Nov 16 02:10:51 2002
@@ -0,0 +1,95 @@
+Kernel Mode Linux (http://web.yl.is.s.u-tokyo.ac.jp/~tosh/kml)
+Toshiyuki Maeda
+
+
+Introduction:
+
+Kernel Mode Linux is a technology which enables us to execute user programs
+in kernel mode. In Kernel Mode Linux, user programs can be executed as
+user processes that have the privilege level of kernel mode.
+The benefit of executing user programs in kernel mode
+is that the user programs can access the kernel address space directly.
+So, for example, user programs can invoke
+system calls very fast because it is unnecessary to switch between kernel
+mode and user mode by using costly software interrupts or context switches.
+Unlike kernel modules, user programs are executed
+as ordinary processes (except for their privilege level),
+so scheduling and paging are performed as usual.
+
+Although it seems dangerous to let user programs access the kernel directly,
+the safety of the kernel can be ensured, for example, by static type checking,
+software fault isolation, and so forth.
+As a proof of concept, we are developing a system which is based on the combination
+of Kernel Mode Linux and Typed Assembly Language (TAL).
+(TAL can ensure the safety of programs through its type checking, and
+the type checking can be done at the machine binary level.
+For more information about TAL, see http://www.cs.cornell.edu/talc)
+
+
+Note:
+
+Currently, only IA-32 is supported.
+Programs executed in kernel mode must not modify their CS, DS, and SS registers.
+If they are modified, the system will be in an undefined state.
+
+
+Instruction:
+
+To enable Kernel Mode Linux, say Y to the Kernel Mode Linux option in the
+kernel configuration, build and install the kernel, and reboot your machine.
+Then, in the current Kernel Mode Linux implementation, all executables under
+the directory /trusted are executed in kernel mode. For example, to execute a
+program named "cat" in kernel mode, copy the program to the directory /trusted
+and execute it as follows:
+
+% /trusted/cat
+
+
+Implementation for IA-32:
+
+To execute user programs in kernel mode, Kernel Mode Linux has
+a special start_thread routine (start_kernel_thread),
+which is called in execve(2) and sets the registers
+of a user process to specified initial values. The original start_thread
+routine sets the CS segment register to USER_CS. The start_kernel_thread routine
+sets the CS register to KERNEL_CS (and likewise for DS, SS, and so on).
+Thus, a user program is started as a user process executed in kernel mode.
+
+The biggest problem in implementing Kernel Mode Linux is
+the stack starvation problem. Assume that a user program is executed
+in kernel mode and causes a page fault on its user stack.
+To raise the page fault exception, the IA-32 CPU tries to push several
+registers (EIP, CS, and so on) onto the same user stack, because the program
+is executed in kernel mode and the IA-32 CPU doesn't switch its stack
+to a kernel stack. Therefore, the IA-32 CPU cannot push the registers;
+it raises a double fault exception, which fails in the same way.
+Finally, the IA-32 CPU gives up and resets itself.
+This is the stack starvation problem.
+
+To solve the stack starvation problem, we use the IA-32 hardware task mechanism
+to handle exceptions. With an IA-32 task, the IA-32 CPU doesn't push the registers
+onto its stack but switches the execution context to a special context.
+Therefore, the stack starvation problem doesn't occur.
+However, it is costly to handle all exceptions with IA-32 tasks.
+So, in the current Kernel Mode Linux implementation,
+only the double fault exception is handled by an IA-32 task.
+
+The other problem is the manual stack switching problem.
+In the normal Linux kernel, the IA-32 CPU switches from a user stack
+to a kernel stack at exceptions and interrupts.
+However, in Kernel Mode Linux, a user program may be executed in kernel mode,
+in which case the IA-32 CPU does not switch stacks. Therefore,
+in the current Kernel Mode Linux implementation, the kernel switches stacks
+manually at exceptions and interrupts. To switch stacks,
+the kernel must know the location of the kernel stack in the address space.
+However, at exceptions and interrupts, the general registers (EAX, EBX,
+and so on) still hold the interrupted program's values and cannot be used
+freely. Therefore, it is very difficult to get the location of the kernel stack.
+
+To solve the above problem, the current Kernel Mode Linux implementation
+exploits the per-CPU GDT from Ingo Molnar's TLS patch. In Kernel Mode Linux,
+one segment descriptor in each per-CPU GDT points directly to the
+location of the pointer to the kernel stack in a TSS. Thus, by using the
+segment descriptor, the address of the kernel stack can be obtained with
+only one general register.
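The kernel-stack lookup described in the last paragraph of kml.txt can be modeled in user space. The following is a minimal sketch, not code from this patch: `seg_desc`, `make_data_desc`, `seg_load32`, `setup_ksl`, `load_kernel_stack`, the fake memory at `MEM_BASE`, the TSS address, and the segment limit of 3 are all assumptions made for illustration. It packs an IA-32 data-segment descriptor whose base is the address of the `esp0` field of a TSS, then reads offset 0 through that segment, which mirrors the effect of `movl (0x0), %esp` with `%ds` set to `__KERNELSTACK_DS` in the entry.S part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of an 8-byte IA-32 segment descriptor. */
struct seg_desc {
    uint32_t a; /* low dword:  limit[15:0] | base[15:0] << 16 */
    uint32_t b; /* high dword: base[23:16], access byte, limit[19:16], flags, base[31:24] */
};

/* Build a present, ring-0, writable data segment with byte granularity. */
static struct seg_desc make_data_desc(uint32_t base, uint32_t limit)
{
    struct seg_desc d;
    d.a = (limit & 0x0000ffffu) | ((base & 0x0000ffffu) << 16);
    d.b = ((base >> 16) & 0xffu)   /* base[23:16] */
        | (0x92u << 8)             /* access: present, DPL 0, data, read/write */
        | (limit & 0x000f0000u)    /* limit[19:16] */
        | (0x4u << 20)             /* flags: 32-bit segment, byte granularity */
        | (base & 0xff000000u);    /* base[31:24] */
    return d;
}

static uint32_t desc_base(struct seg_desc d)
{
    return (d.a >> 16) | ((d.b & 0xffu) << 16) | (d.b & 0xff000000u);
}

static uint32_t desc_limit(struct seg_desc d)
{
    return (d.a & 0xffffu) | (d.b & 0x000f0000u);
}

/* Fake linear memory: MEM_WORDS 32-bit words starting at linear address MEM_BASE. */
#define MEM_BASE  0xc0100000u
#define MEM_WORDS 64u
static uint32_t mem[MEM_WORDS];

/* Model of a 32-bit load through a segment: check the limit, then add the base. */
static uint32_t seg_load32(struct seg_desc seg, uint32_t offset)
{
    uint32_t linear;

    /* A real CPU would raise a fault if the 4-byte access exceeded the limit. */
    assert(offset <= desc_limit(seg) && desc_limit(seg) - offset >= 3u);
    linear = desc_base(seg) + offset;
    assert(linear >= MEM_BASE && linear - MEM_BASE < MEM_WORDS * 4u);
    assert((linear & 3u) == 0u);
    return mem[(linear - MEM_BASE) / 4u];
}

#define TSS_ADDR (MEM_BASE + 0x40u) /* assumed location of this CPU's TSS */
#define TSS_ESP0 4u                 /* esp0 is the second dword of an IA-32 TSS */

/* Store a kernel stack pointer in the modeled TSS and build the KSL segment. */
static struct seg_desc setup_ksl(uint32_t kernel_stack_top)
{
    mem[(TSS_ADDR + TSS_ESP0 - MEM_BASE) / 4u] = kernel_stack_top;
    /* Assumed limit: the segment covers exactly the 4-byte esp0 field. */
    return make_data_desc(TSS_ADDR + TSS_ESP0, 3u);
}

/* Counterpart of "movl (0x0), %esp" with DS = KSL: the address comes from
 * the descriptor, so no general register has to hold it beforehand. */
static uint32_t load_kernel_stack(struct seg_desc ksl)
{
    return seg_load32(ksl, 0u);
}
```

In this model, calling `setup_ksl()` once per CPU corresponds to the `set_ksl_desc(nr, &t->esp0)` call that the patch adds to setup.c, and `load_kernel_stack()` stands in for the segment-relative load in `SWITCH_STACK_TO_KK`. Because the descriptor base points at `esp0` itself rather than at a copy, the lookup keeps returning the current value when the kernel updates `esp0` on a context switch.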
+ diff -urN linux-2.4.19.orig/MAINTAINERS linux-2.4.19/MAINTAINERS --- linux-2.4.19.orig/MAINTAINERS Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/MAINTAINERS Sat Nov 16 02:10:51 2002 @@ -880,6 +880,12 @@ W: http://kbuild.sourceforge.net S: Maintained +KERNEL MODE LINUX +P: Toshiyuki Maeda +M: tosh@is.s.u-tokyo.ac.jp +W: http://www.yl.is.s.u-tokyo.ac.jp/~tosh/kml/ +S: Maintained + KERNEL NFSD P: Neil Brown M: neilb@cse.unsw.edu.au diff -urN linux-2.4.19.orig/Makefile linux-2.4.19/Makefile --- linux-2.4.19.orig/Makefile Sat Aug 3 09:39:46 2002 +++ linux-2.4.19/Makefile Sat Nov 16 02:10:51 2002 @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 4 SUBLEVEL = 19 -EXTRAVERSION = +EXTRAVERSION = -kml KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION) diff -urN linux-2.4.19.orig/arch/i386/config.in linux-2.4.19/arch/i386/config.in --- linux-2.4.19.orig/arch/i386/config.in Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/config.in Sat Nov 16 02:12:15 2002 @@ -412,6 +412,18 @@ source net/bluetooth/Config.in +# "$CONFIG_X86_WP_WORKS_OK" != "n" doesn't work! why? +if [ "$CONFIG_M386" != "y" ]; then + mainmenu_option next_comment + comment 'Kernel Mode Linux' + bool 'Kernel Mode Linux' CONFIG_KERNEL_MODE_LINUX + if [ "$CONFIG_KERNEL_MODE_LINUX" != "n" ]; then + comment ' Safety check have not been implemented' + define_bool CONFIG_KML_CHECK_SAFETY n + fi + endmenu +fi + mainmenu_option next_comment comment 'Kernel hacking' diff -urN linux-2.4.19.orig/arch/i386/kernel/apm.c linux-2.4.19/arch/i386/kernel/apm.c --- linux-2.4.19.orig/arch/i386/kernel/apm.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/apm.c Sat Nov 16 02:10:51 2002 @@ -1909,35 +1909,38 @@ * that extends up to the end of page zero (that we have reserved). * This is for buggy BIOS's that refer to (real mode) segment 0x40 * even though they are called in protected mode. + * + * NOTE: on SMP we call into the APM BIOS only on CPU#0, so it's + * enough to modify CPU#0's GDT. 
*/ - set_base(gdt[APM_40 >> 3], + set_base(cpu_gdt_table[0][APM_40 >> 3], __va((unsigned long)0x40 << 4)); - _set_limit((char *)&gdt[APM_40 >> 3], 4095 - (0x40 << 4)); + _set_limit((char *)&cpu_gdt_table[0][APM_40 >> 3], 4095 - (0x40 << 4)); apm_bios_entry.offset = apm_info.bios.offset; apm_bios_entry.segment = APM_CS; - set_base(gdt[APM_CS >> 3], + set_base(cpu_gdt_table[0][APM_CS >> 3], __va((unsigned long)apm_info.bios.cseg << 4)); - set_base(gdt[APM_CS_16 >> 3], + set_base(cpu_gdt_table[0][APM_CS_16 >> 3], __va((unsigned long)apm_info.bios.cseg_16 << 4)); - set_base(gdt[APM_DS >> 3], + set_base(cpu_gdt_table[0][APM_DS >> 3], __va((unsigned long)apm_info.bios.dseg << 4)); #ifndef APM_RELAX_SEGMENTS if (apm_info.bios.version == 0x100) { #endif /* For ASUS motherboard, Award BIOS rev 110 (and others?) */ - _set_limit((char *)&gdt[APM_CS >> 3], 64 * 1024 - 1); + _set_limit((char *)&cpu_gdt_table[0][APM_CS >> 3], 64 * 1024 - 1); /* For some unknown machine. */ - _set_limit((char *)&gdt[APM_CS_16 >> 3], 64 * 1024 - 1); + _set_limit((char *)&cpu_gdt_table[0][APM_CS_16 >> 3], 64 * 1024 - 1); /* For the DEC Hinote Ultra CT475 (and others?) 
*/ - _set_limit((char *)&gdt[APM_DS >> 3], 64 * 1024 - 1); + _set_limit((char *)&cpu_gdt_table[0][APM_DS >> 3], 64 * 1024 - 1); #ifndef APM_RELAX_SEGMENTS } else { - _set_limit((char *)&gdt[APM_CS >> 3], + _set_limit((char *)&cpu_gdt_table[0][APM_CS >> 3], (apm_info.bios.cseg_len - 1) & 0xffff); - _set_limit((char *)&gdt[APM_CS_16 >> 3], + _set_limit((char *)&cpu_gdt_table[0][APM_CS_16 >> 3], (apm_info.bios.cseg_16_len - 1) & 0xffff); - _set_limit((char *)&gdt[APM_DS >> 3], + _set_limit((char *)&cpu_gdt_table[0][APM_DS >> 3], (apm_info.bios.dseg_len - 1) & 0xffff); } #endif diff -urN linux-2.4.19.orig/arch/i386/kernel/entry.S linux-2.4.19/arch/i386/kernel/entry.S --- linux-2.4.19.orig/arch/i386/kernel/entry.S Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/entry.S Sat Nov 16 02:10:51 2002 @@ -45,6 +45,7 @@ #include #include #include +#include EBX = 0x00 ECX = 0x04 @@ -58,6 +59,14 @@ ORIG_EAX = 0x24 EIP = 0x28 CS = 0x2C +#ifdef CONFIG_KERNEL_MODE_LINUX +/* + * CS_HW is used as stack switch indicator. + * If CS_HW is non-zero, stack switch occured. + * That is, we were in Kernel-User mode before interruption. + */ +CS_HW = 0x2E +#endif EFLAGS = 0x30 OLDESP = 0x34 OLDSS = 0x38 @@ -97,6 +106,88 @@ movl %edx,%ds; \ movl %edx,%es; +#ifndef CONFIG_KERNEL_MODE_LINUX +#define SWITCH_STACK_TO_KK +#define SWITCH_STACK_TO_KK_WITH_ERROR_CODE +#else + +#define TASK_SIZE (__PAGE_OFFSET) +#define __SW_KERNEL_CS (0xffff0000 | __KERNEL_CS) + +/* + * This is a macro for stack switching. + */ +#define SWITCH_STACK_TO_KK \ + /* Check whether if we were in Kernel-User mode or not. */ \ + cmpl $(TASK_SIZE), %esp; \ + /* For anceint processors, clear stack switch in XCS */ \ + /* because they doesn't clear High 16 bits of XCS. */ \ + movw $0x0, 6(%esp); \ + ja 3f; \ + /* \ + * We were in Kernel-User mode, \ + * therefore, XCS == __KERNEL_CS. 
\ + * Thus, we can safely overwrite XCS \ + */ \ + movl %ebp, 4(%esp); /* save %ebp to XCS */ \ + movl %ds, %ebp; \ + cmpw $(__KERNELSTACK_DS), %bp; \ + je 1f; \ + movl $(__KERNELSTACK_DS), %ebp; \ + movl %ebp, %ds; \ +1: \ + movl %esp, %ebp; \ + movl (0x0), %esp; \ + je 2f; \ + pushl $(__KERNEL_DS); \ + popl %ds; \ +2: \ + addl $12, %ebp; \ + addl $-4, %esp; /* XSS */ \ + pushl %ebp; /* ESP */ \ + pushl -4(%ebp); /* EFLAGS */ \ + pushl $(__SW_KERNEL_CS); /* XCS */ \ + pushl -12(%ebp); /* EIP */ \ + movl -8(%ebp), %ebp; /* restore %ebp from XCS */ \ +3: + +/* + * This is as same as the SWITCH_STACK_TO_KK + * but handles an error code on a stack + */ +#define SWITCH_STACK_TO_KK_WITH_ERROR_CODE \ + cmpl $(TASK_SIZE), %esp; \ + movw $0x0, 10(%esp); /* clear stack switch in XCS, sigh... */ \ + ja 3f; \ + /* \ + * We are in Kernel-User mode, \ + * therefore, XCS == __KERNEL_CS. \ + */ \ + movl %ebp, 8(%esp); /* save %ebp to XCS */ \ + movl %ds, %ebp; \ + cmpw $(__KERNELSTACK_DS), %bp; \ + je 1f; \ + movl $(__KERNELSTACK_DS), %ebp; \ + movl %ebp, %ds; \ +1: \ + movl %esp, %ebp; \ + movl (0x0), %esp; \ + je 2f; \ + pushl $(__KERNEL_DS); \ + popl %ds; \ +2: \ + addl $16, %ebp; \ + addl $-4, %esp; /* XSS */ \ + pushl %ebp; /* ESP */ \ + pushl -4(%ebp); /* EFLAGS */ \ + pushl $(__SW_KERNEL_CS); /* XCS */ \ + pushl -12(%ebp); /* EIP */ \ + pushl -16(%ebp); /* error_code */ \ + movl -8(%ebp), %ebp; /* restore %ebp from XCS */ \ + 3: +#endif + +#ifndef CONFIG_KERNEL_MODE_LINUX #define RESTORE_ALL \ popl %ebx; \ popl %ecx; \ @@ -127,6 +218,58 @@ .long 2b,5b; \ .long 3b,6b; \ .previous +#else +#define RESTORE_ALL \ + popl %ebx; \ + popl %ecx; \ + popl %edx; \ + popl %esi; \ + popl %edi; \ + popl %ebp; \ + popl %eax; \ +1: popl %ds; \ +2: popl %es; \ + addl $4,%esp; \ +/* Switch stack KK -> KU. 
*/ \ + /* check whether if stack switch occured or not */ \ + cmpw $0x0, 6(%esp); \ + je 8f; \ + /* clear stack switch record in XCS */ \ + movw $0x0, 6(%esp); \ + pushl %ebp; \ + movl 16(%esp), %ebp; \ + addl $-16, %ebp; \ +3: popl (%ebp); \ +4: popl 4(%ebp); \ +5: popl 8(%ebp); \ +6: popl 12(%ebp); \ + movl %ebp, %esp; \ +7: popl %ebp; \ +8: iret; \ +.section __ex_table,"a";\ + .align 4; \ + .long 1b,3f; \ + .long 2b,4f; \ + .long 3b,5f; \ + .long 4b,5f; \ + .long 5b,5f; \ + .long 6b,5f; \ + .long 7b,5f; \ + .long 8b,5f; \ +.previous; \ +.section .fixup,"ax"; \ +3: movl $0,(%esp); \ + jmp 1b; \ +4: movl $0,(%esp); \ + jmp 2b; \ +5: pushl %ss; \ + popl %ds; \ + pushl %ss; \ + popl %es; \ + pushl $11; \ + call do_exit; \ +.previous +#endif #define GET_CURRENT(reg) \ movl $-8192, reg; \ @@ -192,6 +335,7 @@ */ ENTRY(system_call) + SWITCH_STACK_TO_KK pushl %eax # save orig_eax SAVE_ALL GET_CURRENT(%ebx) @@ -252,6 +396,10 @@ movb CS(%esp),%al testl $(VM_MASK | 3),%eax # return to VM86 mode or non-supervisor? jne ret_from_sys_call +#ifdef CONFIG_KERNEL_MODE_LINUX + cmpw $0x0, CS_HW(%esp) # return to Kernel-User mode? 
+ jne ret_from_sys_call +#endif jmp restore_all ALIGN @@ -260,6 +408,7 @@ jmp ret_from_sys_call ENTRY(divide_error) + SWITCH_STACK_TO_KK pushl $0 # no error code pushl $ SYMBOL_NAME(do_divide_error) ALIGN @@ -292,16 +441,19 @@ jmp ret_from_exception ENTRY(coprocessor_error) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_coprocessor_error) jmp error_code ENTRY(simd_coprocessor_error) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_simd_coprocessor_error) jmp error_code ENTRY(device_not_available) + SWITCH_STACK_TO_KK pushl $-1 # mark this as an int SAVE_ALL GET_CURRENT(%ebx) @@ -317,11 +469,13 @@ jmp ret_from_exception ENTRY(debug) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_debug) jmp error_code ENTRY(nmi) + SWITCH_STACK_TO_KK pushl %eax SAVE_ALL movl %esp,%edx @@ -332,67 +486,215 @@ RESTORE_ALL ENTRY(int3) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_int3) jmp error_code ENTRY(overflow) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_overflow) jmp error_code ENTRY(bounds) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_bounds) jmp error_code ENTRY(invalid_op) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_invalid_op) jmp error_code ENTRY(coprocessor_segment_overrun) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_coprocessor_segment_overrun) jmp error_code ENTRY(double_fault) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE + pushl $ SYMBOL_NAME(do_double_fault) + jmp error_code + +#ifdef CONFIG_KERNEL_MODE_LINUX +ENTRY(double_fault_no_stack_switch) pushl $ SYMBOL_NAME(do_double_fault) jmp error_code +#endif + +#ifdef CONFIG_KERNEL_MODE_LINUX + +PAGE_FAULT_ERROR_CODE = 0x2 +TSS_CR3 = 28 +TSS_EIP = 32 +TSS_EFLAGS = 36 +TSS_CS = 76 +TSS_ESP = 56 +TSS_SS = 80 + +/* + * This is a task-handler for double fault. + * In Kernel Mode Linux, user programs may be executed in ring 0 (kernel mode). + * Therefore, normal interruption handling mechanism doesn't work. 
+ * For example, if a page fault occurs in a stack, + * CPU cannot generate a page fault exception because there is no stack + * to save the CPU context. We call this problem "stack starvation". + * To solve the stack starvation, we handle double fault with task-handler. + */ +ENTRY(double_fault_task) + movl 4(%esp), %edi # get current TSS. +/* %edi = current_tss */ + movl 8(%esp), %ebx # get previous TSS. +/* %ebx = prev_tss */ + + # get kernel stack. + cmpw $__KERNEL_CS, TSS_CS(%ebx) + jne 1f + movl TSS_ESP(%ebx), %esi + cmpl $TASK_SIZE, %esi + ja 2f +1: + movl $(__KERNELSTACK_DS), %eax + movl %eax, %ds + movl (0x0), %esi + movl $(__KERNEL_DS), %eax + movl %eax, %ds +2: + movl %esi, %esp +/* From now on, we can use stack. */ + + # recreate stack layout as if normal interruption occurs. + cmpw $__KERNEL_CS, TSS_CS(%ebx) + jne 3f + movl TSS_ESP(%ebx), %esi + cmpl $TASK_SIZE, %esi + ja 4f +3: + pushl TSS_SS(%ebx) + pushl TSS_ESP(%ebx) + + movl TSS_ESP(%ebx), %esi +4: + pushl TSS_EFLAGS(%ebx) + pushl TSS_CS(%ebx) + pushl TSS_EIP(%ebx) + + movw $0x0, 6(%esp) + cmpw $__KERNEL_CS, TSS_CS(%ebx) + jne 5f + cmpl $TASK_SIZE, %esi + ja 5f + /* record stack switch in XCS */ + movw $0xffff, 6(%esp) +5: + + # check whether if stack starvation occured or not. 
+/* %esi = prev_tss->esp */ + # calling address_presents_and_writable + addl $-4, %esi /* %esi = prev_tss->esp - 4 */ + addl $-12, %esp + pushl %esi + call address_presents_and_writable + addl $16, %esp + + testl %eax, %eax + jne 7f +6: + pushl $PAGE_FAULT_ERROR_CODE + movl $page_fault_no_stack_switch, TSS_EIP(%ebx) + andb $253, 37(%ebx) /* == andl $~IF_MASK, TSS_EFLAGS(%ebx) */ + movl %esi, %eax + movl %eax, %cr2 + jmp 9f +7: + addl $-12, %esi /* %esi = prev_tss->esp - 16 */ + addl $-12, %esp + pushl %esi + call address_presents_and_writable + addl $16, %esp + + testl %eax, %eax + jne 8f + jmp 6b +8: + pushl $0 + movl $double_fault_no_stack_switch, TSS_EIP(%ebx) +9: + andb $254, 37(%ebx) /* == andl $~TF_MASK, TSS_EFLAGS(%ebx) */ + movw $__KERNEL_CS, TSS_CS(%ebx) + movl %esp, TSS_ESP(%ebx) + movw $__KERNEL_DS, TSS_SS(%ebx) + + movl TSS_CR3(%edi), %eax + movl %eax, TSS_CR3(%ebx) + + movl TSS_ESP(%edi), %esp + + iret + jmp double_fault_task +#endif ENTRY(invalid_TSS) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE pushl $ SYMBOL_NAME(do_invalid_TSS) jmp error_code ENTRY(segment_not_present) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE pushl $ SYMBOL_NAME(do_segment_not_present) jmp error_code ENTRY(stack_segment) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE pushl $ SYMBOL_NAME(do_stack_segment) jmp error_code ENTRY(general_protection) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE pushl $ SYMBOL_NAME(do_general_protection) jmp error_code ENTRY(alignment_check) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE pushl $ SYMBOL_NAME(do_alignment_check) jmp error_code ENTRY(page_fault) + SWITCH_STACK_TO_KK_WITH_ERROR_CODE + pushl $ SYMBOL_NAME(do_page_fault) + jmp error_code + +#ifdef CONFIG_KERNEL_MODE_LINUX +ENTRY(page_fault_no_stack_switch) pushl $ SYMBOL_NAME(do_page_fault) jmp error_code +#endif ENTRY(machine_check) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_machine_check) jmp error_code ENTRY(spurious_interrupt_bug) + SWITCH_STACK_TO_KK pushl $0 pushl $ SYMBOL_NAME(do_spurious_interrupt_bug) jmp 
error_code + +#ifdef CONFIG_KERNEL_MODE_LINUX +ENTRY(get_kernelstack_address) + pushl %fs + pushl $(__KERNELSTACK_DS) + popl %fs + movl %fs:0x0, %eax + popl %fs + ret +#endif .data ENTRY(sys_call_table) diff -urN linux-2.4.19.orig/arch/i386/kernel/head.S linux-2.4.19/arch/i386/kernel/head.S --- linux-2.4.19.orig/arch/i386/kernel/head.S Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/head.S Sat Nov 16 02:10:51 2002 @@ -241,7 +241,7 @@ 2: movl %eax,%cr0 call check_x87 incb ready - lgdt gdt_descr + lgdt cpu_gdt_descr lidt idt_descr ljmp $(__KERNEL_CS),$1f 1: movl $(__KERNEL_DS),%eax # reload all the segment registers @@ -347,30 +347,30 @@ popl %eax iret + /* - * The interrupt descriptor table has room for 256 idt's, - * the global descriptor table is dependent on the number - * of tasks we can have.. + * The IDT and GDT 'descriptors' are a strange 48-bit object + * only used by the lidt and lgdt instructions. They are not + * like usual segment descriptors - they consist of a 16-bit + * segment size, and 32-bit linear address value: */ -#define IDT_ENTRIES 256 -#define GDT_ENTRIES (__TSS(NR_CPUS)) - -.globl SYMBOL_NAME(idt) -.globl SYMBOL_NAME(gdt) +.globl SYMBOL_NAME(idt_descr) +.globl SYMBOL_NAME(cpu_gdt_descr) ALIGN - .word 0 -idt_descr: + .word 0 # 32-bit align idt_desc.address + +SYMBOL_NAME(idt_descr): .word IDT_ENTRIES*8-1 # idt contains 256 entries -SYMBOL_NAME(idt): .long SYMBOL_NAME(idt_table) - .word 0 -gdt_descr: +SYMBOL_NAME(cpu_gdt_descr): .word GDT_ENTRIES*8-1 -SYMBOL_NAME(gdt): - .long SYMBOL_NAME(gdt_table) + .long SYMBOL_NAME(cpu_gdt_table) + + .fill NR_CPUS-1,6,0 # space for the other GDT descriptors + /* * This is initialized to create an identity-mapping at 0-8M (for bootup @@ -428,15 +428,15 @@ * NOTE! Make sure the gdt descriptor in head.S matches this if you * change anything. 
*/ -ENTRY(gdt_table) +ENTRY(cpu_gdt_table) .quad 0x0000000000000000 /* NULL descriptor */ - .quad 0x0000000000000000 /* not used */ + .quad 0x0000000000000000 /* TLS descriptor */ .quad 0x00cf9a000000ffff /* 0x10 kernel 4GB code at 0x00000000 */ .quad 0x00cf92000000ffff /* 0x18 kernel 4GB data at 0x00000000 */ .quad 0x00cffa000000ffff /* 0x23 user 4GB code at 0x00000000 */ .quad 0x00cff2000000ffff /* 0x2b user 4GB data at 0x00000000 */ - .quad 0x0000000000000000 /* not used */ - .quad 0x0000000000000000 /* not used */ + .quad 0x0000000000000000 /* TSS descriptor */ + .quad 0x0000000000000000 /* LDT descriptor */ /* * The APM segments have byte granularity and their bases * and limits are set at run time. @@ -445,4 +445,22 @@ .quad 0x00409a0000000000 /* 0x48 APM CS code */ .quad 0x00009a0000000000 /* 0x50 APM CS 16 code (16 bit) */ .quad 0x0040920000000000 /* 0x58 APM DS data */ - .fill NR_CPUS*4,8,0 /* space for TSS's and LDT's */ + /* Segments used for calling PnP BIOS */ + .quad 0x00c09a0000000000 /* 0x60 32-bit code */ + .quad 0x00809a0000000000 /* 0x68 16-bit code */ + .quad 0x0080920000000000 /* 0x70 16-bit data */ + .quad 0x0080920000000000 /* 0x78 16-bit data */ + .quad 0x0080920000000000 /* 0x80 16-bit data */ + .quad 0x0000000000000000 /* 0x88 not used */ +#ifndef CONFIG_KERNEL_MODE_LINUX + .quad 0x0000000000000000 /* 0x90 not used */ + .quad 0x0000000000000000 /* 0x98 not used */ +#else + .quad 0x0000000000000000 /* 0x90 Kernel Stack Location segment (KSL) set at runtime */ + .quad 0x0000000000000000 /* 0x98 Double Fault Task (DFT) set at runtime */ +#endif + +#if CONFIG_SMP + .fill (NR_CPUS-1)*GDT_ENTRIES,8,0 /* other CPU's GDT */ +#endif + diff -urN linux-2.4.19.orig/arch/i386/kernel/i386_ksyms.c linux-2.4.19/arch/i386/kernel/i386_ksyms.c --- linux-2.4.19.orig/arch/i386/kernel/i386_ksyms.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/i386_ksyms.c Sat Nov 16 02:10:51 2002 @@ -72,7 +72,6 @@ EXPORT_SYMBOL(pm_power_off); 
EXPORT_SYMBOL(get_cmos_time); EXPORT_SYMBOL(apm_info); -EXPORT_SYMBOL(gdt); EXPORT_SYMBOL(empty_zero_page); #ifdef CONFIG_DEBUG_IOVIRT diff -urN linux-2.4.19.orig/arch/i386/kernel/init_task.c linux-2.4.19/arch/i386/kernel/init_task.c --- linux-2.4.19.orig/arch/i386/kernel/init_task.c Tue Sep 18 07:29:09 2001 +++ linux-2.4.19/arch/i386/kernel/init_task.c Sat Nov 16 02:10:51 2002 @@ -31,3 +31,11 @@ */ struct tss_struct init_tss[NR_CPUS] __cacheline_aligned = { [0 ... NR_CPUS-1] = INIT_TSS }; +#ifdef CONFIG_KERNEL_MODE_LINUX +/* + * We need per cpu TSS of double fault task-handler + * because task-handler cannot be executed cocurrently. + */ +struct tss_struct init_dft[NR_CPUS] __cacheline_aligned = { [0 ... NR_CPUS-1] = INIT_DFT }; +struct dft_stack_struct dft_stack[NR_CPUS] __cacheline_aligned; +#endif diff -urN linux-2.4.19.orig/arch/i386/kernel/setup.c linux-2.4.19/arch/i386/kernel/setup.c --- linux-2.4.19.orig/arch/i386/kernel/setup.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/setup.c Sat Nov 16 02:10:51 2002 @@ -3090,6 +3090,10 @@ { int nr = smp_processor_id(); struct tss_struct * t = &init_tss[nr]; +#ifdef CONFIG_KERNEL_MODE_LINUX + struct tss_struct* d = &init_dft[nr]; + struct dft_stack_struct* ds = &dft_stack[nr]; +#endif if (test_and_set_bit(nr, &cpu_initialized)) { printk(KERN_WARNING "CPU#%d already initialized!\n", nr); @@ -3108,7 +3112,16 @@ } #endif - __asm__ __volatile__("lgdt %0": "=m" (gdt_descr)); + /* + * Initialize the per-CPU GDT with the boot GDT, + * and set up the GDT descriptor: + */ + if (nr) { + memcpy(cpu_gdt_table[nr], cpu_gdt_table[0], GDT_SIZE); + cpu_gdt_descr[nr].size = GDT_SIZE; + cpu_gdt_descr[nr].address = (unsigned long)cpu_gdt_table[nr]; + } + __asm__ __volatile__("lgdt %0": "=m" (cpu_gdt_descr[nr])); __asm__ __volatile__("lidt %0": "=m" (idt_descr)); /* @@ -3127,9 +3140,21 @@ t->esp0 = current->thread.esp0; set_tss_desc(nr,t); - gdt_table[__TSS(nr)].b &= 0xfffffdff; - load_TR(nr); + cpu_gdt_table[nr][TSS_ENTRY].b 
&= 0xfffffdff; + load_TR_desc(); load_LDT(&init_mm); + +#ifdef CONFIG_KERNEL_MODE_LINUX + set_ksl_desc(nr, &t->esp0); + __asm__ ("pushl $0x00004002; popl %0\n\t" : "=m" (d->eflags)); + d->esp = (unsigned long)(&(ds->error_code) + 1); + ds->current_tss = d; + ds->previous_tss = t; + set_dft_desc(nr, d); + + t->ldt = LDT_ENTRY << 3; + d->ldt = LDT_ENTRY << 3; +#endif /* * Clear all 6 debug registers: diff -urN linux-2.4.19.orig/arch/i386/kernel/signal.c linux-2.4.19/arch/i386/kernel/signal.c --- linux-2.4.19.orig/arch/i386/kernel/signal.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/signal.c Sat Nov 16 02:10:51 2002 @@ -197,10 +197,22 @@ err |= __get_user(tmp, &sc->seg); \ regs->x##seg = tmp; } +#ifndef CONFIG_KERNEL_MODE_LINUX #define COPY_SEG_STRICT(seg) \ { unsigned short tmp; \ err |= __get_user(tmp, &sc->seg); \ regs->x##seg = tmp|3; } +#else +#define COPY_SEG_STRICT(seg) \ + { unsigned short tmp; \ + err |= __get_user(tmp, &sc->seg); \ + regs->x##seg = tmp|(regs->x##seg & 3); } + +#define COPY_CS_STRICT \ + { unsigned long tmp; \ + err |= __get_user(tmp, &sc->xcs); \ + regs->xcs = tmp|(regs->xcs & 3); } +#endif #define GET_SEG(seg) \ { unsigned short tmp; \ @@ -219,7 +231,11 @@ COPY(edx); COPY(ecx); COPY(eip); +#ifndef CONFIG_KERNEL_MODE_LINUX COPY_SEG_STRICT(cs); +#else + COPY_CS_STRICT; +#endif COPY_SEG_STRICT(ss); { @@ -340,7 +356,11 @@ err |= __put_user(current->thread.trap_no, &sc->trapno); err |= __put_user(current->thread.error_code, &sc->err); err |= __put_user(regs->eip, &sc->eip); +#ifndef CONFIG_KERNEL_MODE_LINUX err |= __put_user(regs->xcs, (unsigned int *)&sc->cs); +#else + err |= __put_user(regs->xcs, &sc->xcs); +#endif err |= __put_user(regs->eflags, &sc->eflags); err |= __put_user(regs->esp, &sc->esp_at_signal); err |= __put_user(regs->xss, (unsigned int *)&sc->ss); @@ -376,11 +396,20 @@ } /* This is the legacy signal stack switching. 
*/ +#ifndef CONFIG_KERNEL_MODE_LINUX else if ((regs->xss & 0xffff) != __USER_DS && !(ka->sa.sa_flags & SA_RESTORER) && ka->sa.sa_restorer) { esp = (unsigned long) ka->sa.sa_restorer; } +#else + else if ((regs->xss & 0xffff) != __USER_DS && + (regs->esp > TASK_SIZE) && + !(ka->sa.sa_flags & SA_RESTORER) && + ka->sa.sa_restorer) { + esp = (unsigned long) ka->sa.sa_restorer; + } +#endif return (void *)((esp - frame_size) & -8ul); } @@ -435,11 +464,13 @@ regs->esp = (unsigned long) frame; regs->eip = (unsigned long) ka->sa.sa_handler; +#ifndef CONFIG_KERNEL_MODE_LINUX set_fs(USER_DS); regs->xds = __USER_DS; regs->xes = __USER_DS; regs->xss = __USER_DS; regs->xcs = __USER_CS; +#endif regs->eflags &= ~TF_MASK; #if DEBUG_SIG @@ -510,11 +541,13 @@ regs->esp = (unsigned long) frame; regs->eip = (unsigned long) ka->sa.sa_handler; +#ifndef CONFIG_KERNEL_MODE_LINUX set_fs(USER_DS); regs->xds = __USER_DS; regs->xes = __USER_DS; regs->xss = __USER_DS; regs->xcs = __USER_CS; +#endif regs->eflags &= ~TF_MASK; #if DEBUG_SIG @@ -592,8 +625,13 @@ * kernel mode. Just return without doing anything * if so. */ +#ifndef CONFIG_KERNEL_MODE_LINUX if ((regs->xcs & 3) != 3) return 1; +#else + if ((regs->xcs & 3) != 3 && (regs->xcs & 0xffff0000) == 0) + return 1; +#endif if (!oldset) oldset = &current->blocked; diff -urN linux-2.4.19.orig/arch/i386/kernel/trampoline.S linux-2.4.19/arch/i386/kernel/trampoline.S --- linux-2.4.19.orig/arch/i386/kernel/trampoline.S Fri Oct 5 10:42:54 2001 +++ linux-2.4.19/arch/i386/kernel/trampoline.S Sat Nov 16 02:10:51 2002 @@ -63,9 +63,14 @@ .word 0 # idt limit = 0 .word 0, 0 # idt base = 0L +# +# NOTE: here we actually use CPU#0's GDT - but that is OK, we reload +# the proper GDT shortly after booting up the secondary CPUs. 
+# + gdt_48: .word 0x0800 # gdt limit = 2048, 256 GDT entries - .long gdt_table-__PAGE_OFFSET # gdt base = gdt (first SMP CPU) + .long cpu_gdt_table-__PAGE_OFFSET # gdt base = gdt (first SMP CPU) .globl SYMBOL_NAME(trampoline_end) SYMBOL_NAME_LABEL(trampoline_end) diff -urN linux-2.4.19.orig/arch/i386/kernel/traps.c linux-2.4.19/arch/i386/kernel/traps.c --- linux-2.4.19.orig/arch/i386/kernel/traps.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/kernel/traps.c Sat Nov 16 02:10:51 2002 @@ -186,6 +186,18 @@ show_trace(esp); } +#ifndef CONFIG_KERNEL_MODE_LINUX +static inline int in_user_mode(struct pt_regs* regs) +{ + return (regs->xcs & 3); +} +#else +static inline int in_user_mode(struct pt_regs* regs) +{ + return (regs->xcs & 0xffff0003); +} +#endif + void show_registers(struct pt_regs *regs) { int i; @@ -195,7 +207,7 @@ esp = (unsigned long) (&regs->esp); ss = __KERNEL_DS; - if (regs->xcs & 3) { + if (in_user_mode(regs)) { in_kernel = 0; esp = regs->esp; ss = regs->xss & 0xffff; @@ -289,7 +301,7 @@ static inline void die_if_kernel(const char * str, struct pt_regs * regs, long err) { - if (!(regs->eflags & VM_MASK) && !(3 & regs->xcs)) + if (!(regs->eflags & VM_MASK) && !in_user_mode(regs)) die(str, regs, err); } @@ -307,7 +319,7 @@ { if (vm86 && regs->eflags & VM_MASK) goto vm86_trap; - if (!(regs->xcs & 3)) + if (!in_user_mode(regs)) goto kernel_trap; trap_signal: { @@ -389,7 +401,7 @@ if (regs->eflags & VM_MASK) goto gp_in_vm86; - if (!(regs->xcs & 3)) + if (!in_user_mode(regs)) goto gp_in_kernel; current->thread.error_code = error_code; @@ -557,8 +569,8 @@ /* If this is a kernel mode trap, save the user PC on entry to * the kernel, that's what the debugger can make sense of. */ - info.si_addr = ((regs->xcs & 3) == 0) ? (void *)tsk->thread.eip : - (void *)regs->eip; + info.si_addr = (!in_user_mode(regs)) ? (void *)tsk->thread.eip : + (void *)regs->eip; force_sig_info(SIGTRAP, &info, tsk); /* Disable additional traps. 
They'll be re-enabled when @@ -782,11 +794,10 @@ __flush_tlb_all(); /* - * "idt" is magic - it overlaps the idt_descr - * variable so that updating idt will automatically - * update the idt descriptor.. - */ - idt = (struct desc_struct *)page; + * Update the IDT descriptor and reload the IDT so that + * it uses the read-only mapped virtual address. + */ + idt_descr.address = page; __asm__ __volatile__("lidt %0": "=m" (idt_descr)); } #endif @@ -804,6 +815,21 @@ "3" ((char *) (addr)),"2" (__KERNEL_CS << 16)); \ } while (0) +#ifdef CONFIG_KERNEL_MODE_LINUX +#define _set_task_gate(gate_addr,dpl,tss_sel) \ +do { \ + int __d0, __d1; \ + __asm__ __volatile__ ("movw %%dx,%%ax\n\t" \ + "movw %4,%%dx\n\t" \ + "movl %%eax,%0\n\t" \ + "movl %%edx,%1" \ + :"=m" (*((long *) (gate_addr))), \ + "=m" (*(1+(long *) (gate_addr))), "=&a" (__d0), "=&d" (__d1) \ + :"i" ((short) (0x8000+(dpl<<13)+(5<<8))), \ + "3" (0),"2" (tss_sel << 16)); \ +} while (0) + +#endif /* * This needs to use 'idt_table' rather than 'idt', and @@ -831,36 +857,12 @@ _set_gate(a,12,3,addr); } -#define _set_seg_desc(gate_addr,type,dpl,base,limit) {\ - *((gate_addr)+1) = ((base) & 0xff000000) | \ - (((base) & 0x00ff0000)>>16) | \ - ((limit) & 0xf0000) | \ - ((dpl)<<13) | \ - (0x00408000) | \ - ((type)<<8); \ - *(gate_addr) = (((base) & 0x0000ffff)<<16) | \ - ((limit) & 0x0ffff); } - -#define _set_tssldt_desc(n,addr,limit,type) \ -__asm__ __volatile__ ("movw %w3,0(%2)\n\t" \ - "movw %%ax,2(%2)\n\t" \ - "rorl $16,%%eax\n\t" \ - "movb %%al,4(%2)\n\t" \ - "movb %4,5(%2)\n\t" \ - "movb $0,6(%2)\n\t" \ - "movb %%ah,7(%2)\n\t" \ - "rorl $16,%%eax" \ - : "=m"(*(n)) : "a" (addr), "r"(n), "ir"(limit), "i"(type)) - -void set_tss_desc(unsigned int n, void *addr) -{ - _set_tssldt_desc(gdt_table+__TSS(n), (int)addr, 235, 0x89); -} - -void set_ldt_desc(unsigned int n, void *addr, unsigned int size) +#ifdef CONFIG_KERNEL_MODE_LINUX +static void __init set_task_gate(unsigned int n, unsigned int tss_sel) { - 
_set_tssldt_desc(gdt_table+__LDT(n), (int)addr, ((size << 3)-1), 0x82); + _set_task_gate(idt_table+n,0,tss_sel); } +#endif #ifdef CONFIG_X86_VISWS_APIC @@ -971,7 +973,11 @@ set_system_gate(5,&bounds); set_trap_gate(6,&invalid_op); set_trap_gate(7,&device_not_available); +#ifndef CONFIG_KERNEL_MODE_LINUX set_trap_gate(8,&double_fault); +#else + set_task_gate(8,(DFT_ENTRY << 3)); +#endif set_trap_gate(9,&coprocessor_segment_overrun); set_trap_gate(10,&invalid_TSS); set_trap_gate(11,&segment_not_present); diff -urN linux-2.4.19.orig/arch/i386/mm/fault.c linux-2.4.19/arch/i386/mm/fault.c --- linux-2.4.19.orig/arch/i386/mm/fault.c Sat Aug 3 09:39:42 2002 +++ linux-2.4.19/arch/i386/mm/fault.c Sat Nov 16 02:10:51 2002 @@ -24,6 +24,7 @@ #include #include #include +#include extern void die(const char *,struct pt_regs *,long); @@ -126,8 +127,18 @@ } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); -extern unsigned long idt; +#ifndef CONFIG_KERNEL_MODE_LINUX +static inline int user_mode_access(unsigned long error_code, struct pt_regs* regs) +{ + return (error_code & 4); +} +#else +static inline int user_mode_access(unsigned long error_code, struct pt_regs* regs) +{ + return (error_code & 4) || (regs->xcs & 0xffff0000); +} +#endif /* * This routine handles page faults. It determines the address, * and the problem, and then passes it off to one of the appropriate @@ -171,7 +182,7 @@ * (error_code & 4) == 0, and that the fault was not a * protection error (error_code & 1) == 0. */ - if (address >= TASK_SIZE && !(error_code & 5)) + if (address >= TASK_SIZE && !(error_code & 1) && !user_mode_access(error_code, regs)) goto vmalloc_fault; mm = tsk->mm; @@ -193,7 +204,7 @@ goto good_area; if (!(vma->vm_flags & VM_GROWSDOWN)) goto bad_area; - if (error_code & 4) { + if (user_mode_access(error_code, regs)) { /* * accessing the stack below %esp is always a bug. 
* The "+ 32" is there due to some instructions (like @@ -215,7 +226,11 @@ switch (error_code & 3) { default: /* 3: write, present */ #ifdef TEST_VERIFY_AREA - if (regs->cs == KERNEL_CS) +#ifndef CONFIG_KERNEL_MODE_LINUX + if (regs->xcs == KERNEL_CS) +#else + if (regs->xcs == KERNEL_CS && !(regs->xcs & 0xffff0000)) +#endif printk("WP fault at %08lx\n", regs->eip); #endif /* fall through */ @@ -269,7 +284,7 @@ up_read(&mm->mmap_sem); /* User mode accesses just cause a SIGSEGV */ - if (error_code & 4) { + if (user_mode_access(error_code, regs)) { tsk->thread.cr2 = address; tsk->thread.error_code = error_code; tsk->thread.trap_no = 14; @@ -287,7 +302,7 @@ if (boot_cpu_data.f00f_bug) { unsigned long nr; - nr = (address - idt) >> 3; + nr = (address - idt_descr.address) >> 3; if (nr == 6) { do_invalid_op(regs, 0); @@ -342,7 +357,7 @@ goto survive; } printk("VM: killing process %s\n", tsk->comm); - if (error_code & 4) + if (user_mode_access(error_code, regs)) do_exit(SIGKILL); goto no_context; @@ -363,7 +378,7 @@ force_sig_info(SIGBUS, &info, tsk); /* Kernel mode? Handle exceptions or die */ - if (!(error_code & 4)) + if (!user_mode_access(error_code, regs)) goto no_context; return; diff -urN linux-2.4.19.orig/fs/binfmt_elf.c linux-2.4.19/fs/binfmt_elf.c --- linux-2.4.19.orig/fs/binfmt_elf.c Sat Aug 3 09:39:45 2002 +++ linux-2.4.19/fs/binfmt_elf.c Sat Nov 16 02:10:51 2002 @@ -423,6 +423,42 @@ #define INTERPRETER_AOUT 1 #define INTERPRETER_ELF 2 +#ifdef CONFIG_KERNEL_MODE_LINUX +/* + * XXX : we haven't implemented safety check of user programs. 
+ */ +#define TRUSTED_DIR_STR "/trusted/" +#define TRUSTED_DIR_STR_LEN 9 + +static inline int is_safe(struct file* file) +{ + int ret; + char* path; + char* tmp; + struct fs_struct* cur_fs; + + tmp = (char*)__get_free_page(GFP_KERNEL); + + if (!tmp) { + return 0; + } + + path = d_path(file->f_dentry, file->f_vfsmnt, tmp, PAGE_SIZE); + ret = (0 == strncmp(TRUSTED_DIR_STR, path, TRUSTED_DIR_STR_LEN)); + if (ret) { + /* Check whether we are chrooted */ + /* XXX : I don't know how to check whether we are chrooted. Is this code correct? */ + cur_fs = current->fs; + read_lock(&cur_fs->lock); + spin_lock(&dcache_lock); + ret = IS_ROOT(cur_fs->root); + spin_unlock(&dcache_lock); + read_unlock(&cur_fs->lock); + } + free_page((unsigned long)tmp); + return ret; +} +#endif static int load_elf_binary(struct linux_binprm * bprm, struct pt_regs * regs) { @@ -781,7 +817,15 @@ ELF_PLAT_INIT(regs); #endif +#if !defined(CONFIG_KERNEL_MODE_LINUX) || defined(CONFIG_KML_CHECK_SAFETY) start_thread(regs, elf_entry, bprm->p); +#else + if (is_safe(bprm->file)) { + start_kernel_thread(regs, elf_entry, bprm->p); + } else { + start_thread(regs, elf_entry, bprm->p); + } +#endif if (current->ptrace & PT_PTRACED) send_sig(SIGTRAP, current, 0); retval = 0; diff -urN linux-2.4.19.orig/include/asm-i386/desc.h linux-2.4.19/include/asm-i386/desc.h --- linux-2.4.19.orig/include/asm-i386/desc.h Fri Jul 27 05:40:32 2001 +++ linux-2.4.19/include/asm-i386/desc.h Sat Nov 16 02:10:51 2002 @@ -4,61 +4,67 @@ #include /* - * The layout of the GDT under Linux: + * The layout of the per-CPU GDT under Linux: * * 0 - null - * 1 - not used + * 1 - Reserved for Thread-Local Storage (TLS) segment * 2 - kernel code segment * 3 - kernel data segment - * 4 - user code segment <-- new cacheline + * 4 - user code segment <==== new cacheline * 5 - user data segment - * 6 - not used - * 7 - not used - * 8 - APM BIOS support <-- new cacheline + * 6 - TSS + * 7 - LDT + * 8 - APM BIOS support <==== new cacheline * 9 -
APM BIOS support * 10 - APM BIOS support * 11 - APM BIOS support + * 12 - PNPBIOS support <==== new cacheline + * 13 - PNPBIOS support + * 14 - PNPBIOS support + * 15 - PNPBIOS support + * 16 - PNPBIOS support <==== new cacheline + * 17 - not used +#ifndef CONFIG_KERNEL_MODE_LINUX + * 18 - not used + * 19 - not used +#else + * 18 - Kernel Stack Location segment (KSL) + * 19 - Double Fault Handling Task (DFT) +#endif + */ +#define TSS_ENTRY 6 +#define LDT_ENTRY 7 +#ifdef CONFIG_KERNEL_MODE_LINUX +#define KSL_ENTRY 18 +#define DFT_ENTRY 19 +#endif +/* + * The interrupt descriptor table has room for 256 idt's, + * the global descriptor table is dependent on the number + * of tasks we can have.. * - * The TSS+LDT descriptors are spread out a bit so that every CPU - * has an exclusive cacheline for the per-CPU TSS and LDT: - * - * 12 - CPU#0 TSS <-- new cacheline - * 13 - CPU#0 LDT - * 14 - not used - * 15 - not used - * 16 - CPU#1 TSS <-- new cacheline - * 17 - CPU#1 LDT - * 18 - not used - * 19 - not used - * ... NR_CPUS per-CPU TSS+LDT's if on SMP - * - * Entry into gdt where to find first TSS. + * We pad the GDT to cacheline boundary. 
*/ -#define __FIRST_TSS_ENTRY 12 -#define __FIRST_LDT_ENTRY (__FIRST_TSS_ENTRY+1) - -#define __TSS(n) (((n)<<2) + __FIRST_TSS_ENTRY) -#define __LDT(n) (((n)<<2) + __FIRST_LDT_ENTRY) +#define IDT_ENTRIES 256 +#define GDT_ENTRIES 20 #ifndef __ASSEMBLY__ -struct desc_struct { - unsigned long a,b; -}; -extern struct desc_struct gdt_table[]; -extern struct desc_struct *idt, *gdt; +#include + +#define GDT_SIZE (GDT_ENTRIES*sizeof(struct desc_struct)) + +extern struct desc_struct cpu_gdt_table[NR_CPUS][GDT_ENTRIES]; struct Xgt_desc_struct { unsigned short size; unsigned long address __attribute__((packed)); -}; - -#define idt_descr (*(struct Xgt_desc_struct *)((char *)&idt - 2)) -#define gdt_descr (*(struct Xgt_desc_struct *)((char *)&gdt - 2)) +} __attribute__ ((packed)); -#define load_TR(n) __asm__ __volatile__("ltr %%ax"::"a" (__TSS(n)<<3)) +extern struct Xgt_desc_struct idt_descr, cpu_gdt_descr[NR_CPUS]; -#define __load_LDT(n) __asm__ __volatile__("lldt %%ax"::"a" (__LDT(n)<<3)) +#define load_TR_desc() __asm__ __volatile__("ltr %%ax"::"a" (TSS_ENTRY<<3)) +#define load_LDT_desc() __asm__ __volatile__("lldt %%ax"::"a" (LDT_ENTRY<<3)) /* * This is the ldt that every process will get unless we need @@ -66,32 +72,78 @@ */ extern struct desc_struct default_ldt[]; extern void set_intr_gate(unsigned int irq, void * addr); -extern void set_ldt_desc(unsigned int n, void *addr, unsigned int size); -extern void set_tss_desc(unsigned int n, void *addr); + +#define _set_tssldt_desc(n,addr,limit,type) \ +__asm__ __volatile__ ("movw %w3,0(%2)\n\t" \ + "movw %%ax,2(%2)\n\t" \ + "rorl $16,%%eax\n\t" \ + "movb %%al,4(%2)\n\t" \ + "movb %4,5(%2)\n\t" \ + "movb $0,6(%2)\n\t" \ + "movb %%ah,7(%2)\n\t" \ + "rorl $16,%%eax" \ + : "=m"(*(n)) : "a" (addr), "r"(n), "ir"(limit), "i"(type)) + +#ifdef CONFIG_KERNEL_MODE_LINUX +#define _set_codedata_seg_desc(n,addr,type) \ +__asm__ __volatile__ ("movw $0xffff,0(%2)\n\t" \ + "movw %%ax,2(%2)\n\t" \ + "rorl $16,%%eax\n\t" \ + "movb %%al,4(%2)\n\t" \ 
+ "movb %3,5(%2)\n\t" \ + "movb $0xcf,6(%2)\n\t" \ + "movb %%ah,7(%2)\n\t" \ + "rorl $16,%%eax" \ + : "=m"(*(n)) : "a" (addr), "r"(n), "i"(type)) +#endif + +static inline void set_tss_desc(unsigned int cpu, void *addr) +{ + _set_tssldt_desc(&cpu_gdt_table[cpu][TSS_ENTRY], (int)addr, 235, 0x89); +} + +static inline void set_ldt_desc(unsigned int cpu, void *addr, unsigned int size) +{ + _set_tssldt_desc(&cpu_gdt_table[cpu][LDT_ENTRY], (int)addr, ((size << 3)-1), 0x82); +} + +#ifdef CONFIG_KERNEL_MODE_LINUX + +static inline void set_ksl_desc(unsigned int cpu, void* addr) +{ + _set_codedata_seg_desc(&cpu_gdt_table[cpu][KSL_ENTRY], (int)addr, 0x92); +} + +static inline void set_dft_desc(unsigned int cpu, void *addr) +{ + _set_tssldt_desc(&cpu_gdt_table[cpu][DFT_ENTRY], (int)addr, 235, 0x89); +} + +#endif static inline void clear_LDT(void) { - int cpu = smp_processor_id(); - set_ldt_desc(cpu, &default_ldt[0], 5); - __load_LDT(cpu); + set_ldt_desc(smp_processor_id(), &default_ldt[0], 5); + load_LDT_desc(); } /* * load one particular LDT into the current CPU */ +#include + static inline void load_LDT (struct mm_struct *mm) { - int cpu = smp_processor_id(); void *segments = mm->context.segments; int count = LDT_ENTRIES; - if (!segments) { + if (likely(!count)) { segments = &default_ldt[0]; count = 5; } - set_ldt_desc(cpu, segments, count); - __load_LDT(cpu); + set_ldt_desc(smp_processor_id(), segments, count); + load_LDT_desc(); } #endif /* !__ASSEMBLY__ */ diff -urN linux-2.4.19.orig/include/asm-i386/hw_irq.h linux-2.4.19/include/asm-i386/hw_irq.h --- linux-2.4.19.orig/include/asm-i386/hw_irq.h Fri Nov 23 04:46:18 2001 +++ linux-2.4.19/include/asm-i386/hw_irq.h Sat Nov 16 02:10:51 2002 @@ -97,21 +97,59 @@ #define SAVE_ALL \ "cld\n\t" \ - "pushl %es\n\t" \ - "pushl %ds\n\t" \ - "pushl %eax\n\t" \ - "pushl %ebp\n\t" \ - "pushl %edi\n\t" \ - "pushl %esi\n\t" \ - "pushl %edx\n\t" \ - "pushl %ecx\n\t" \ - "pushl %ebx\n\t" \ - "movl $" STR(__KERNEL_DS) ",%edx\n\t" \ - "movl 
%edx,%ds\n\t" \ - "movl %edx,%es\n\t" + "pushl %%es\n\t" \ + "pushl %%ds\n\t" \ + "pushl %%eax\n\t" \ + "pushl %%ebp\n\t" \ + "pushl %%edi\n\t" \ + "pushl %%esi\n\t" \ + "pushl %%edx\n\t" \ + "pushl %%ecx\n\t" \ + "pushl %%ebx\n\t" \ + "movl $" STR(__KERNEL_DS) ",%%edx\n\t" \ + "movl %%edx,%%ds\n\t" \ + "movl %%edx,%%es\n\t" + +#ifndef CONFIG_KERNEL_MODE_LINUX +#define SWITCH_STACK_TO_KK +#define SWITCH_STACK_TO_KK_CONSTRAINTS : : +#else +/* Same as a macro in arch/i386/kernel/entry.S */ +#define SWITCH_STACK_TO_KK \ + "cmpl %0, %%esp\n\t" \ + "movw $0x0, 6(%%esp)\n\t" \ + "ja 3f\n\t" \ + "movl %%ebp, 4(%%esp)\n\t" \ + "movl %%ds, %%ebp\n\t" \ + "cmpw %1, %%bp\n\t" \ + "je 1f\n\t" \ + "movl %1, %%ebp\n\t" \ + "movl %%ebp, %%ds\n\t" \ + "1:\n\t" \ + "movl %%esp, %%ebp\n\t" \ + "movl (0x0), %%esp\n\t" \ + "je 2f\n\t" \ + "pushl %2\n\t" \ + "popl %%ds\n\t" \ + "2:\n\t" \ + "addl $12, %%ebp\n\t" \ + "addl $-4, %%esp\n\t" \ + "pushl %%ebp\n\t" \ + "pushl -4(%%ebp)\n\t" \ + "pushl %3\n\t" \ + "pushl -12(%%ebp)\n\t" \ + "movl -8(%%ebp), %%ebp\n\t" \ + "3:\n\t" +#define SWITCH_STACK_TO_KK_CONSTRAINTS \ + : : "i" (TASK_SIZE), \ + "i" (__KERNELSTACK_DS), \ + "i" (__KERNEL_DS), \ + "i" (__SW_KERNEL_CS) +#endif #define IRQ_NAME2(nr) nr##_interrupt(void) #define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr) +#define DUMMY_IRQ_NAME(nr) IRQ_NAME(_dummy_##nr) #define GET_CURRENT \ "movl %esp, %ebx\n\t" \ @@ -127,40 +165,53 @@ #define XBUILD_SMP_INTERRUPT(x,v)\ asmlinkage void x(void); \ asmlinkage void call_##x(void); \ +static void dummy_##x(void) __attribute__ ((unused)); \ +static void dummy_##x(void) { \ __asm__( \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(x) ":\n\t" \ + SWITCH_STACK_TO_KK \ "pushl $"#v"-256\n\t" \ SAVE_ALL \ SYMBOL_NAME_STR(call_##x)":\n\t" \ "call "SYMBOL_NAME_STR(smp_##x)"\n\t" \ - "jmp ret_from_intr\n"); + "jmp ret_from_intr\n" \ + SWITCH_STACK_TO_KK_CONSTRAINTS); \ +} #define BUILD_SMP_TIMER_INTERRUPT(x,v) XBUILD_SMP_TIMER_INTERRUPT(x,v) #define 
XBUILD_SMP_TIMER_INTERRUPT(x,v) \ asmlinkage void x(struct pt_regs * regs); \ asmlinkage void call_##x(void); \ +static void dummy_##x(void) __attribute__ ((unused)); \ +static void dummy_##x(void) { \ __asm__( \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(x) ":\n\t" \ + SWITCH_STACK_TO_KK \ "pushl $"#v"-256\n\t" \ SAVE_ALL \ - "movl %esp,%eax\n\t" \ - "pushl %eax\n\t" \ + "movl %%esp,%%eax\n\t" \ + "pushl %%eax\n\t" \ SYMBOL_NAME_STR(call_##x)":\n\t" \ "call "SYMBOL_NAME_STR(smp_##x)"\n\t" \ - "addl $4,%esp\n\t" \ - "jmp ret_from_intr\n"); + "addl $4,%%esp\n\t" \ + "jmp ret_from_intr\n" \ + SWITCH_STACK_TO_KK_CONSTRAINTS); \ +} #define BUILD_COMMON_IRQ() \ asmlinkage void call_do_IRQ(void); \ +static void dummy_call_do_IRQ(void) __attribute__ ((unused)); \ +static void dummy_call_do_IRQ(void) { \ __asm__( \ "\n" __ALIGN_STR"\n" \ "common_interrupt:\n\t" \ SAVE_ALL \ SYMBOL_NAME_STR(call_do_IRQ)":\n\t" \ "call " SYMBOL_NAME_STR(do_IRQ) "\n\t" \ - "jmp ret_from_intr\n"); + "jmp ret_from_intr\n" : :); \ +} /* * subtle. orig_eax is used by the signal code to distinct between @@ -171,14 +222,18 @@ * * Subtle as a pigs ear. 
VY */ - #define BUILD_IRQ(nr) \ asmlinkage void IRQ_NAME(nr); \ +static void DUMMY_IRQ_NAME(nr) __attribute__ ((unused)); \ +static void DUMMY_IRQ_NAME(nr) { \ __asm__( \ "\n"__ALIGN_STR"\n" \ SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \ + SWITCH_STACK_TO_KK \ "pushl $"#nr"-256\n\t" \ - "jmp common_interrupt"); + "jmp common_interrupt" \ + SWITCH_STACK_TO_KK_CONSTRAINTS); \ +} extern unsigned long prof_cpu_mask; extern unsigned int * prof_buffer; diff -urN linux-2.4.19.orig/include/asm-i386/mmu_context.h linux-2.4.19/include/asm-i386/mmu_context.h --- linux-2.4.19.orig/include/asm-i386/mmu_context.h Sat Aug 3 09:39:45 2002 +++ linux-2.4.19/include/asm-i386/mmu_context.h Sat Nov 16 02:10:51 2002 @@ -16,7 +16,7 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk, unsigned cpu) { - if(cpu_tlbstate[cpu].state == TLBSTATE_OK) + if (cpu_tlbstate[cpu].state == TLBSTATE_OK) cpu_tlbstate[cpu].state = TLBSTATE_LAZY; } #else @@ -41,6 +41,9 @@ #endif set_bit(cpu, &next->cpu_vm_mask); set_bit(cpu, &next->context.cpuvalid); +#ifdef CONFIG_KERNEL_MODE_LINUX + init_dft[cpu].__cr3 = __pa(next->pgd); +#endif /* Re-load page tables */ load_cr3(next->pgd); } diff -urN linux-2.4.19.orig/include/asm-i386/processor.h linux-2.4.19/include/asm-i386/processor.h --- linux-2.4.19.orig/include/asm-i386/processor.h Sat Aug 3 09:39:45 2002 +++ linux-2.4.19/include/asm-i386/processor.h Sat Nov 16 02:10:51 2002 @@ -18,6 +18,10 @@ #include #include +struct desc_struct { + unsigned long a,b; +}; + /* * Default implementation of macro that returns current * instruction pointer ("program counter"). 
@@ -72,6 +76,10 @@ extern struct cpuinfo_x86 boot_cpu_data; extern struct tss_struct init_tss[NR_CPUS]; +#ifdef CONFIG_KERNEL_MODE_LINUX +extern struct tss_struct init_dft[NR_CPUS]; +extern struct dft_stack_struct dft_stack[NR_CPUS]; +#endif #ifdef CONFIG_SMP extern struct cpuinfo_x86 cpu_data[]; @@ -387,6 +395,14 @@ unsigned long io_bitmap[IO_BITMAP_SIZE+1]; }; +#ifdef CONFIG_KERNEL_MODE_LINUX +struct dft_stack_struct { + unsigned long error_code; + struct tss_struct* current_tss; + struct tss_struct* previous_tss; +}; +#endif + #define INIT_THREAD { \ 0, \ 0, 0, 0, 0, \ @@ -408,11 +424,38 @@ 0,0,0,0, /* esp,ebp,esi,edi */ \ 0,0,0,0,0,0, /* es,cs,ss */ \ 0,0,0,0,0,0, /* ds,fs,gs */ \ - __LDT(0),0, /* ldt */ \ + LDT_ENTRY,0, /* ldt */ \ 0, INVALID_IO_BITMAP_OFFSET, /* tace, bitmap */ \ {~0, } /* ioperm */ \ } +#ifdef CONFIG_KERNEL_MODE_LINUX +extern void double_fault_task(void); + +#define INIT_DFT { \ + 0,0, /* back_link, __blh */ \ + 0, /* esp0 */ \ + __KERNEL_DS, 0, /* ss0 */ \ + 0,0,0,0,0,0, /* stack1, stack2 */ \ + 0, /* cr3 */ \ + (unsigned long)double_fault_task, /* eip */ \ + 0, /* eflags */ \ + 0,0,0,0, /* eax,ecx,edx,ebx */ \ + 0, /* esp : lazy initializing */ \ + 0,0,0, /* ebp,esi,edi */ \ + __KERNEL_DS,0, /* es */ \ + __KERNEL_CS,0, /* cs */ \ + __KERNEL_DS,0, /* ss */ \ + __KERNEL_DS,0, /* ds */ \ + __KERNEL_DS,0, /* fs */ \ + __KERNEL_DS,0, /* gs */ \ + LDT_ENTRY,0, /* ldt */ \ + 0, INVALID_IO_BITMAP_OFFSET, /* tace, bitmap */ \ + {~0, } /* ioperm */ \ +} +#endif + +#ifndef CONFIG_KERNEL_MODE_LINUX #define start_thread(regs, new_eip, new_esp) do { \ __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0)); \ set_fs(USER_DS); \ @@ -423,6 +466,31 @@ regs->eip = new_eip; \ regs->esp = new_esp; \ } while (0) +#else +#define start_thread(regs, new_eip, new_esp) do { \ + __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0)); \ + set_fs(USER_DS); \ + regs->xds = __USER_DS; \ + regs->xes = __USER_DS; \ + regs->xss = __USER_DS; \ + regs->xcs = __USER_CS; \ + regs->eip = 
new_eip; \ + regs->esp = new_esp; \ + regs->xcs &= 0x0000ffff; \ +} while (0) + +#define start_kernel_thread(regs, new_eip, new_esp) do { \ + __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0)); \ + set_fs(KERNEL_DS); \ + regs->xds = __KERNEL_DS; \ + regs->xes = __KERNEL_DS; \ + regs->xss = __KERNEL_DS; \ + regs->xcs = __KERNEL_CS; \ + regs->eip = new_eip; \ + regs->esp = new_esp; \ + regs->xcs |= 0xffff0000; \ +} while (0) +#endif /* Forward declaration, a strange C thing */ struct task_struct; diff -urN linux-2.4.19.orig/include/asm-i386/ptrace.h linux-2.4.19/include/asm-i386/ptrace.h --- linux-2.4.19.orig/include/asm-i386/ptrace.h Sat Sep 15 06:04:08 2001 +++ linux-2.4.19/include/asm-i386/ptrace.h Sat Nov 16 02:10:51 2002 @@ -55,7 +55,11 @@ #define PTRACE_O_TRACESYSGOOD 0x00000001 #ifdef __KERNEL__ +#ifndef CONFIG_KERNEL_MODE_LINUX #define user_mode(regs) ((VM_MASK & (regs)->eflags) || (3 & (regs)->xcs)) +#else +#define user_mode(regs) ((VM_MASK & (regs)->eflags) || (0xffff0003 & (regs)->xcs)) +#endif #define instruction_pointer(regs) ((regs)->eip) extern void show_regs(struct pt_regs *); #endif diff -urN linux-2.4.19.orig/include/asm-i386/segment.h linux-2.4.19/include/asm-i386/segment.h --- linux-2.4.19.orig/include/asm-i386/segment.h Tue Dec 2 03:34:12 1997 +++ linux-2.4.19/include/asm-i386/segment.h Sat Nov 16 02:10:51 2002 @@ -7,4 +7,9 @@ #define __USER_CS 0x23 #define __USER_DS 0x2B +#ifdef CONFIG_KERNEL_MODE_LINUX +#define __SW_KERNEL_CS (0xffff0000 | __KERNEL_CS) +#define __KERNELSTACK_DS 0x90 +#endif + #endif diff -urN linux-2.4.19.orig/include/asm-i386/sigcontext.h linux-2.4.19/include/asm-i386/sigcontext.h --- linux-2.4.19.orig/include/asm-i386/sigcontext.h Thu Jun 22 12:59:38 2000 +++ linux-2.4.19/include/asm-i386/sigcontext.h Sat Nov 16 02:10:51 2002 @@ -70,7 +70,11 @@ unsigned long trapno; unsigned long err; unsigned long eip; +#ifndef CONFIG_KERNEL_MODE_LINUX unsigned short cs, __csh; +#else + unsigned long xcs; +#endif unsigned long eflags; unsigned 
long esp_at_signal; unsigned short ss, __ssh; diff -urN linux-2.4.19.orig/include/linux/mm.h linux-2.4.19/include/linux/mm.h --- linux-2.4.19.orig/include/linux/mm.h Sat Aug 3 09:39:45 2002 +++ linux-2.4.19/include/linux/mm.h Sat Nov 16 02:10:51 2002 @@ -672,6 +672,10 @@ extern struct page * vmalloc_to_page(void *addr); +#ifdef CONFIG_KERNEL_MODE_LINUX +extern asmlinkage int address_exists(unsigned long address); +#endif + #endif /* __KERNEL__ */ #endif diff -urN linux-2.4.19.orig/mm/memory.c linux-2.4.19/mm/memory.c --- linux-2.4.19.orig/mm/memory.c Sat Aug 3 09:39:46 2002 +++ linux-2.4.19/mm/memory.c Sat Nov 16 02:10:51 2002 @@ -1495,3 +1495,60 @@ } return page; } + +#ifdef CONFIG_KERNEL_MODE_LINUX +static inline int address_presents_and_writable_in_pmd(pmd_t* pmd, unsigned long address) +{ + pte_t* pte; + + if (pmd_none(*pmd)) + return 0; + if (pmd_bad(*pmd)) { + pmd_ERROR(*pmd); + return 0; + } + if (!pmd_present(*pmd)) + return 0; + pte = pte_offset(pmd, address); + return (pte_present(*pte) && pte_write(*pte)); +} + +static inline int address_presents_and_writable_in_pgd(pgd_t* pgd, unsigned long address) +{ + pmd_t* pmd; + + if (pgd_none(*pgd)) + return 0; + if (pgd_bad(*pgd)) { + pgd_ERROR(*pgd); + return 0; + } + if (!pgd_present(*pgd)) + return 0; + pmd = pmd_offset(pgd, address); + return address_presents_and_writable_in_pmd(pmd, address); +} + +static inline int address_presents_and_writable_in_mm(struct mm_struct* mm, unsigned long address) +{ + pgd_t* pgd; + + pgd = pgd_offset(mm, address); + return address_presents_and_writable_in_pgd(pgd, address); +} + +asmlinkage int address_presents_and_writable(unsigned long address) +{ + struct mm_struct* mm; + int result; + + mm = current->mm; + if (!mm) + return 0; + spin_lock(&mm->page_table_lock); + result = address_presents_and_writable_in_mm(mm, address); + spin_unlock(&mm->page_table_lock); + return result; +} +#endif +
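Note on the saved-CS tagging used throughout this patch: start_kernel_thread() sets the upper 16 bits of regs->xcs, and in_user_mode(), user_mode() and user_mode_access() test those bits so that a process running at ring 0 is still treated as a user process. The sketch below is a standalone userspace illustration of that check, not part of the patch. The names KML_TAG and tag_kernel_mode_user() are hypothetical, and the 0x10 value for the kernel code selector is an assumption taken from the stock 2.4 i386 segment.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selector values: USER_CS matches the patch; KERNEL_CS is assumed (stock 2.4 i386). */
#define KERNEL_CS 0x10u
#define USER_CS   0x23u

/* Hypothetical name for the upper-half marker that start_kernel_thread() ORs in. */
#define KML_TAG   0xffff0000u

/* Mark a saved CS as belonging to a user process that runs in kernel mode. */
static inline uint32_t tag_kernel_mode_user(uint32_t xcs)
{
    return xcs | KML_TAG;
}

/* Mirrors in_user_mode(): true for ring 3, or for ring 0 with the tag set. */
static inline bool in_user_mode(uint32_t xcs)
{
    return (xcs & (KML_TAG | 3u)) != 0;
}

/* Mirrors user_mode_access(): the page-fault U/S bit, or the tag on the saved CS. */
static inline bool user_mode_access(uint32_t error_code, uint32_t xcs)
{
    return (error_code & 4u) != 0 || (xcs & KML_TAG) != 0;
}
```

With this encoding a plain kernel CS is the only value classified as kernel context. A tagged kernel CS takes the user-process paths (signals, SIGSEGV on bad access) even though the CPU reports a supervisor-mode fault.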