Hooking Interrupt and Exception Handlers in Linux
by mammon_

While interrupt hooking is quite common on DOS and Win32 platforms, there is
little documentation for similar operations on Linux. What follows is an
attempt to address the method of hooking Linux interrupt and exception
handlers, what problems may be expected in doing so, and how to resolve them.


I. Intel exceptions and the IDT
_______________________________________________________________________________
On Intel processors, exceptions and interrupts can be generated as a direct
or indirect [e.g. Page Fault] result of program code, or they can be generated
by conditions external to the program -- for example, a hardware interrupt.

Intel hardware allows 256 exceptions:

   Number   Name                               Priority*
   0        Divide Error                       8
   1        Debug                              4 [3 if TF on Task Switch]
   2        NMI                                5
   3        Breakpoint                         6
   4        Overflow                           8
   5        BOUND                              8
   6        Invalid Opcode                     7
   7        x87 Not Available                  7
   8        Double Fault                       ?
   9        Coprocessor Segment Overrun        8
   10       Invalid TSS                        8
   11       Segment Not Present                8
   12       Stack Segment Fault                8
   13       General Protection                 8
   14       Page Fault                         6 for code, 8 for data pages
   15       Reserved                           ?
   16       Floating-Point Error               8
   17       Alignment check                    8
   18       Machine Check                      1
   19       SIMD                               8
   20-31    Reserved                           ?
   32-255   User-defined (INT x or INTR pin)   5
   *Intel's numbering scheme, not mine

An exception handler must be provided by the OS for each exception; the
handler is a standard kernel-mode routine which responds by servicing the
interrupt or recovering from the exception. Note that many of these exceptions
occur quite frequently in normal OS operation [e.g., Page Fault], while others
[Breakpoint, General Protection Fault] indicate a system or user error.

Each exception handler is represented by an 8-byte descriptor which contains
information such as the address and permissions of the exception handler.
The descriptors for Interrupt and Trap gates -- Task gates are not used in
Linux, and therefore will not be covered -- differ only in the hard-coded
fields [bits 40-44], as shown below; the only practical difference between
Interrupt and Trap gates is that the Interrupt Enable Flag [IF in EFLAGS] is
cleared before calling an Interrupt exception handler. Note that these
structures are little endian, and when diagrammed in a real reference will
show bit 0 starting at the right side of the page.

   Interrupt Gate Descriptor
   bit    0 - 15     Exception Handler offset bits 0-15
         16 - 31     Segment Selector
         32 - 36     Reserved
         37 - 39     0x00
         40 - 44     0x0E
         45 - 46     Descriptor Privilege Level [0-3]
         47          Segment Present Flag
         48 - 63     Exception Handler offset bits 16-31

   Trap Gate Descriptor
   bit    0 - 15     Exception Handler offset bits 0-15
         16 - 31     Segment Selector
         32 - 36     Reserved
         37 - 39     0x00
         40 - 44     0x0F
         45 - 46     Descriptor Privilege Level [0-3]
         47          Segment Present Flag
         48 - 63     Exception Handler offset bits 16-31

Here you can see a fine example of why Intel hardware is referred to in the
literature as 'brain-damaged': the 4 bytes of the exception handler address
offset are split into two 2-byte words [the 2 bytes being an Intel Word,
supposedly, since they definitely aren't a Machine Word, which would be 4
bytes on 32-bit hardware] that are separated by a full 4 bytes. Yes, the same
genius that brought the world 20-bit addressing using two 16-bit registers is
alive and well and designing Intel's descriptors.

The exception handler descriptors are called Interrupt Descriptors, and they
are stored in the kernel address space in a table called the Interrupt
Descriptor Table, or IDT. The address and size of the IDT are encoded in the
IDT Register, or 'idtr', as follows:

   bit    0 - 15     Size [limit] of IDT in bytes
         16 - 47     32-bit address of IDT

Note that 'size' is actually a limit: the offset of the last valid byte in
the table. Since the descriptors are 8 bytes in size, 'size' would be
( 8 * n ) - 1, where 'n' is the number of entries in the table [usually 256,
giving a limit of 0x7FF].
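The layouts above can be exercised outside the kernel. What follows is a minimal userspace sketch, not kernel code: it decodes a raw 8-byte gate and an IDTR limit. The names gate_desc, gate_from_bytes, gate_offset, gate_type, gate_dpl, gate_present, and idt_entries are hypothetical helpers invented for this illustration, and the limit is assumed to hold the offset of the last valid byte [8N - 1], as the Intel manuals describe it.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte IDT gate, field by field (hypothetical helper type). */
struct gate_desc {
    uint16_t off_lo;    /* bits  0-15: handler offset bits 0-15    */
    uint16_t seg_sel;   /* bits 16-31: segment selector            */
    uint8_t  reserved;  /* bits 32-39: reserved / zero             */
    uint8_t  flags;     /* bits 40-47: type, DPL, present flag     */
    uint16_t off_hi;    /* bits 48-63: handler offset bits 16-31   */
};

/* Build a gate from 8 raw little-endian bytes, independent of host layout. */
static inline struct gate_desc gate_from_bytes(const uint8_t raw[8])
{
    struct gate_desc g;
    g.off_lo   = (uint16_t)(raw[0] | (raw[1] << 8));
    g.seg_sel  = (uint16_t)(raw[2] | (raw[3] << 8));
    g.reserved = raw[4];
    g.flags    = raw[5];
    g.off_hi   = (uint16_t)(raw[6] | (raw[7] << 8));
    return g;
}

/* Rejoin the two halves of the handler address. */
static inline uint32_t gate_offset(const struct gate_desc *g)
{
    return ((uint32_t)g->off_hi << 16) | g->off_lo;
}

/* Bits 40-44: 0x0E = interrupt gate, 0x0F = trap gate. */
static inline unsigned gate_type(const struct gate_desc *g)
{
    return g->flags & 0x1Fu;
}

/* Bits 45-46: descriptor privilege level. */
static inline unsigned gate_dpl(const struct gate_desc *g)
{
    return (g->flags >> 5) & 0x3u;
}

/* Bit 47: segment present flag. */
static inline unsigned gate_present(const struct gate_desc *g)
{
    return (g->flags >> 7) & 0x1u;
}

/* Number of 8-byte entries described by an IDTR limit of 8N - 1. */
static inline unsigned idt_entries(uint16_t limit)
{
    assert((limit + 1u) % 8u == 0);   /* a well-formed limit is 8N - 1 */
    return (limit + 1u) / 8u;
}
```

Fed the bytes 1C 93 10 00 00 8F 10 C0, this yields a present trap gate with DPL 0 whose handler is at 0xC010931C; a limit of 0x7FF corresponds to 256 entries.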
The IDTR may be loaded and stored with the instructions lidt and sidt:

   lidt m16&32     ; Load IDTR from m16&m32 (address of 6-byte buffer)
   sidt m          ; Write IDTR to location m (address of 6-byte buffer)

The lidt instruction is allowed only from kernel mode [ring 0]; the sidt
instruction can be used from kernel or user mode.

When an exception handler is called, CS:EIP and EFLAGS are saved on the stack,
an error code is pushed, and the appropriate handler is called. The operation
is functionally equivalent to the following pseudo-assembly [without register
side effects]:

   handle_exception:                    ; EAX contains exception #
        pushl %eflags
        pushl %cs
        pushl %eip
        pushl error_code
        movl  $idt_table, %edx
        movw  6(%edx,%eax,8), %bx       ; idt_table[ EAX * 8 ] + 6 bytes
        shll  $16, %ebx                 ; move BX into high word
        movw  (%edx,%eax,8), %bx        ; idt_table[ EAX * 8 ]
        movl  %ebx, %eip                ; set new instruction pointer

Naturally things are more complicated than this: the segment selector has to
be set, the stack must be changed if the exception handler is more privileged
than the current task, and so on; however, the above should provide the basic
outline of the action.

The error code that is pushed on the stack has the following format:

   bit    0          1 = External event, e.g. hardware interrupt
          1 - 2      00 = GDT entry
                     10 = LDT entry
                     01 or 11 = IDT entry
          3 - 15     Segment Selector
         16 - 31     Reserved

Most exceptions set error_code to 0; some do not push an error code at all --
the Linux handlers for these push 0 to mimic an error code -- and the page
fault exceptions use a special error_code format documented in Vol 3, Section
5.12 of the Intel Manual.

On return from an exception handler, the Intel hardware restores the saved
values of CS:EIP and EFLAGS, and changes the stack and Interrupt Enable Flag
if necessary. The values come off the stack in the reverse of the order in
which they were pushed, after the handler has removed the error code:

   return_from_handler:                 ; once again, very simplified
        popl %eip
        popl %cs
        popl %eflags

All of this is transparent to the programmer; by the time an exception handler
has been called, the switch has been made and the stack is prepared for the
handler.


II. Exception handling in Linux
_______________________________________________________________________________
In Linux, the IDT is implemented as an array of 'struct desc_struct' items
stored in the kernel variable 'idt_table'. The descriptor structure is defined
in include/asm/desc.h :

   struct desc_struct {
        unsigned long a,b;
   };

...however this is useless for all but the most mundane access to the IDT.
It is important to not include desc.h, and instead to use a structure such as

   struct desc_struct {
        unsigned short off_lo, seg_sel;
        unsigned char reserved,flag;
        unsigned short off_hi;
   };

which will give easy access to the address offsets and segment selector for
the exception handler. Given this structure, any Interrupt or Trap Descriptor
can be modified without regard to its type -- the DPL should not be changed,
and in fact only the address offset will need to be modified.

It thus becomes possible to hook an exception handler by overwriting the
address offset at that table index [ptr_idt_table holds the address of the
IDT; obtaining it is covered in later sections]:

   void grab_excep( int n, void *new_fn, unsigned long *old_fn){
        unsigned long new_addr = (unsigned long)new_fn;
        struct desc_struct *idt = (struct desc_struct *)ptr_idt_table;

        /* save address of old handler */
        if ( old_fn )
             *old_fn = (idt[n].off_hi << 16) + idt[n].off_lo;

        /* insert new exception handler */
        idt[n].off_hi = (unsigned short)(new_addr >> 16);
        idt[n].off_lo = (unsigned short)(new_addr & 0x0000FFFF);
        return;
   }

During system init, the IDT is filled by the following sequence of calls in
arch/i386/kernel/traps.c:

   set_trap_gate(0,&divide_error);
   set_trap_gate(1,&debug);
   set_intr_gate(2,&nmi);
   set_system_gate(3,&int3);       /* int3-5 can be called from all */
   set_system_gate(4,&overflow);
   set_system_gate(5,&bounds);
   set_trap_gate(6,&invalid_op);
   set_trap_gate(7,&device_not_available);
   set_trap_gate(8,&double_fault);
   set_trap_gate(9,&coprocessor_segment_overrun);
   set_trap_gate(10,&invalid_TSS);
   set_trap_gate(11,&segment_not_present);
   set_trap_gate(12,&stack_segment);
   set_trap_gate(13,&general_protection);
   set_trap_gate(14,&page_fault);  /* this is a set_intr_gate in 2.4.10+ */
   set_trap_gate(15,&spurious_interrupt_bug);
   set_trap_gate(16,&coprocessor_error);
   set_trap_gate(17,&alignment_check);
   set_trap_gate(18,&machine_check);
   set_trap_gate(19,&simd_coprocessor_error);
   set_system_gate(SYSCALL_VECTOR,&system_call);

The set_system_gate() and set_trap_gate() routines both generate a Trap
Descriptor, with the difference being in the DPL [privilege level]: trap gates
are 0, while system gates are 3 -- meaning any userspace process can access
them. Note that the syscall vector [INT 0x80] is set with DPL 3; any interrupt
or exception can be made accessible to userspace processes by setting its DPL
to 3.

Notice that the handlers specified in the above set_*_gate calls are not the
actual handlers included in traps.c; for example, 'debug' is used instead of
'do_debug', 'int3' instead of 'do_int3', and so on. This is because the IDT
handlers are implemented in assembly language in arch/i386/kernel/entry.S;
these assembly handlers call the 'real' handlers in traps.c. Thus, when an
exception occurs, the handler or 'stub' in entry.S is called, and control is
eventually transferred to the real handler.
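To make the difference between the gate-setting routines concrete, here is a small userspace sketch of how the two 32-bit words of a descriptor [the 'a' and 'b' of the stock desc_struct] could be composed from a handler address, a gate type, and a DPL. It follows the bit layout given in section I rather than quoting the kernel source; make_gate, the sketch_* wrappers, and KERNEL_CS_SKETCH are names made up for this example, and the selector value 0x10 is an assumption about __KERNEL_CS on a 2.4 i386 kernel.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_CS_SKETCH 0x10u   /* assumed __KERNEL_CS; check segment.h */

enum { GATE_INTR = 0x0E, GATE_TRAP = 0x0F };

/* Same shape as the kernel's two-long desc_struct (hypothetical name). */
struct gate_words { uint32_t a, b; };

/* Compose a present gate for 'handler' with the given type and DPL. */
static inline struct gate_words make_gate(uint32_t handler,
                                          unsigned type, unsigned dpl)
{
    struct gate_words w;
    assert(dpl <= 3);
    /* low word: selector in bits 16-31, offset bits 0-15 in bits 0-15 */
    w.a = (KERNEL_CS_SKETCH << 16) | (handler & 0x0000FFFFu);
    /* high word: offset bits 16-31, present flag, DPL, gate type */
    w.b = (handler & 0xFFFF0000u)
        | 0x8000u                        /* bit 47: present     */
        | ((uint32_t)dpl << 13)          /* bits 45-46: DPL     */
        | ((uint32_t)type << 8);         /* bits 40-44: type    */
    return w;
}

/* Trap gate, reachable from ring 0 only. */
static inline struct gate_words sketch_trap_gate(uint32_t handler)
{
    return make_gate(handler, GATE_TRAP, 0);
}

/* Trap gate with DPL 3: userspace may raise it with INT n. */
static inline struct gate_words sketch_system_gate(uint32_t handler)
{
    return make_gate(handler, GATE_TRAP, 3);
}

/* Interrupt gate: IF is cleared on entry. */
static inline struct gate_words sketch_intr_gate(uint32_t handler)
{
    return make_gate(handler, GATE_INTR, 0);
}
```

For a handler at 0xC010931C, the only difference between the trap-gate and system-gate results is the DPL field in the high word [0xC0108F00 versus 0xC010EF00].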
The code in entry.S is as follows [edited for brevity]:

   int3:
        pushl $0                          # error code
        pushl $ SYMBOL_NAME(do_int3)      # address of handler
        jmp error_code

   divide_error:
        pushl $0                          # error code
        pushl $ SYMBOL_NAME(do_divide_error)
                                          # falls through to error_code
   error_code:
        [push pt_regs context]
        movl %es,%ecx
        movl ORIG_EAX(%esp), %esi         # get the error code
        movl ES(%esp), %edi               # get the function address
        movl %eax, ORIG_EAX(%esp)         # move eax into pt_regs.orig_eax
        movl %ecx, ES(%esp)               # move es into pt_regs.es
        movl %esp,%edx
        pushl %esi                        # push the error code
        pushl %edx                        # push the pt_regs pointer
        movl $(__KERNEL_DS),%edx
        movl %edx,%ds                     # set ds, es to __KERNEL_DS (0x18)
        movl %edx,%es
        GET_CURRENT(%ebx)                 # set ebx to current task desc
        call *%edi                        # call the 'real' handler
        addl $8,%esp
        jmp ret_from_exception            # do exit stuff

   ret_from_exception:
        movl SYMBOL_NAME(irq_stat),%ecx   # softirq_active
        testl SYMBOL_NAME(irq_stat)+4,%ecx # softirq_mask
        jne   handle_softirq

   ret_from_intr:
        GET_CURRENT(%ebx)                 # get current task desc
        movl EFLAGS(%esp),%eax            # mix EFLAGS and CS
        movb CS(%esp),%al
        testl $(VM_MASK | 3),%eax         # returning to VM86 mode or ring 3?
        jne ret_with_reschedule           # yup, do ring3 prep
        jmp restore_all                   # nah, back to the kernel: just leave

   handle_softirq:
        call SYMBOL_NAME(do_softirq)      # do software irq
        jmp ret_from_intr

   ret_from_sys_call:
        movl SYMBOL_NAME(irq_stat),%ecx
        testl SYMBOL_NAME(irq_stat)+4,%ecx # is softirq_active?
        jne   handle_softirq

   ret_with_reschedule:
        cmpl $0,need_resched(%ebx)        # do we need to call schedule()?
        jne reschedule                    # yup, go do it...
        cmpl $0,sigpending(%ebx)          # do we have any signals pending?
        jne signal_return                 # yup, go do it...
   restore_all:
        [pop register context]
        iret                              # finally! return from exception...

   signal_return:
        sti
        ...
        call SYMBOL_NAME(do_signal)       # do_signal()
        jmp restore_all                   # bail

   reschedule:
        call SYMBOL_NAME(schedule)        # schedule()
        jmp ret_from_sys_call             # loop back to ret_with_reschedule

The majority of the stubs simply push the address of the exception handler and
jump to error_code, which sets up the register context [in pt_regs, discussed
in the next section] and calls the 'real' handler; the exceptions to this are
the following stubs:

   ENTRY(device_not_available)
        [ save register context ]
        GET_CURRENT(%ebx)
        pushl $ret_from_exception         # return to ret_from_exception
        movl %cr0,%eax
        testl $0x4,%eax                   # EM (math emulation bit)
        je SYMBOL_NAME(math_state_restore)
        pushl $0                          # temporary storage for ORIG_EIP
        call  SYMBOL_NAME(math_emulate)
        addl $4,%esp
        ret

   ENTRY(nmi)
        [ save register context ]
        movl %esp,%edx
        pushl $0
        pushl %edx                        # save stack pointer
        call SYMBOL_NAME(do_nmi)
        addl $8,%esp
        [pop register context]

   ENTRY(system_call)
        [ save register context ]
        GET_CURRENT(%ebx)                 # get current task desc
        cmpl $(NR_syscalls),%eax
        jae badsys                        # bad system call
        testb $0x02,tsk_ptrace(%ebx)      # 0x02 = PT_TRACESYS
        jne tracesys                      # do ptrace stuff
        call *SYMBOL_NAME(sys_call_table)(,%eax,4)
        movl %eax,EAX(%esp)               # save the return value
        [ ret_from_sys_call starts here]

There is usually no need to hook these, though the system call handler is of
interest simply because it allows the ptrace check to be augmented or even
replaced.


III. Writing an exception handler
_______________________________________________________________________________
As seen in the previous section, the exception handler in the IDT is not
really a handler at all; it is actually a stub which performs operations
common to all handlers -- such as setting up the stack -- before calling the
real, "high-level" handler. Any code can be used for this stub, as long as
the stack and registers are left intact, and it jumps to error_code with the
address of the exception handler and either an error code or a zero
placeholder pushed on the stack.
The basic algorithm for a replacement exception handler stub is very simple:

        save flags register
        save general purpose registers
        check if the new or old handler should be used
        pop general purpose registers
        pop flags register
        push 0 if the CPU did not push an error code
        if use_new :
             push address of new high-level handler
        else :
             push address of old high-level handler
        jump to error_code

For the INT3 exception, this would take the form of assembly language code
such as the following, assuming the code will be patched to entry.S directly:

   our_int3_stub:
        pushf
        pusha
        call SYMBOL_NAME(check_int3)      # should we handle this exception?
        testl %eax, %eax
        popa
        jz default_handler                # nah, call the old handler...
        popf                              # yeah, demmit!
        pushl $0                          # no error code was provided
        pushl $ SYMBOL_NAME(our_do_int3)  # push our interrupt handler
        jmp go_error_code
   default_handler:
        popf
        pushl $0                          # no error code was provided
        pushl $ SYMBOL_NAME(do_int3)
   go_error_code:
        jmp error_code

Of course patching the kernel source directly is inelegant and possibly
non-portable; it is better to write a kernel module, and approaches for coding
such modules will be covered in the next two sections.

When error_code calls the high-level handler, it has stored error_code and a
pt_regs structure of the process registers on the stack; exception handlers
have void return type and therefore use the following prototype:

   asmlinkage void our_do_int3(struct pt_regs * regs, long error_code);

The 'error_code' parameter has been introduced above; the pt_regs structure is
defined in include/asm/ptrace.h as follows:

   struct pt_regs {
        long ebx, ecx, edx, esi, edi, ebp, eax;
        int  xds, xes;
        long orig_eax, eip;
        int  xcs;
        long eflags, esp;
        int  xss;
   };

At this point any reasonable code may be used by the programmer, with the
obvious caveats that one is in the kernel and in an exception handler; normal
task switching is disabled until the handler returns, so the code must be as
fast and robust as possible.
The following routine prints a debug message to the console before calling
the old exception handler:

   asmlinkage void our_do_int3( struct pt_regs * regs, long err_code ) {
        void (*old_fn)( struct pt_regs *,long ) = (void *) old_int3_handler;

        printk( "<7> Local INT3 Handler Called from PID %d EIP: %08lx\n",
                current->pid, regs->eip );
        (*old_fn)(regs, err_code);
        return;
   }

Exception handlers themselves are not very complex; however it takes careful
assembler code to write the stub, and some respect for the state of the system
when leaving the exception handler.


IV. Hardcoded kernel symbol addresses
_______________________________________________________________________________
By now, all of the brave assembly coders who have rushed the code from the
last section through GAS will have found that it will not link into the
kernel. The problem is simple, yet frustrating: only a small subset of
functions in the kernel have their symbols exported; the rest are not intended
for use by general modules and are considered unavailable.

Of course, 'unavailable' and 'not intended for use' are ancient ASM coder
synonyms for 'interesting' and 'must-have'; therefore we will be disregarding
this small bit of programming etiquette by touching the kernel's private
parts.
All global and local kernel symbols are defined in the System.map file, which
is generated using 'make linux' or using nm:

   nm vmlinux |\
   grep -v '\(compiled\)\|\(\.o$\)\|\( [aUw] \)\|\(\.\.ng$\)\|\(LASH[RL]DI\)'|\
   sort > System.map

This file is in the usual nm output format of

   ADDRESS T NAME

where 'T' is the type of symbol; this is easily read using an fscanf() format
string of "%08X %c %s", and of course symbols can be easily found with grep:

   root@localhost> for i in error_code idt_table do_int3
   > do
   >  grep " $i$" System.map | sed -ne \
   >  's/^\([A-Fa-f0-9]\{4,8\}\) [TtDd] \([A-z0-9_-]\{1,\}\)$/#define \2 0x\1/p' \
   >  >> symbols.h
   > done

This will create a symbols.h file with the following contents [addresses will
vary with your kernel]:

   #define error_code 0xc0108e54
   #define idt_table 0xc0272000
   #define do_int3 0xc010931c

Needless to say, this is best implemented as a shell script that is called
during the make process.

When coding a kernel module that hooks an exception handler, it is assumed
that the bulk of the module will be coded in C, with only the handler stub
coded in assembler. It is pointless to include a .S file for a twenty-line
routine; for the rest of the discussion, examples will be given using GCC
inline assembler.

The GCC __asm__ statement requires some explanation to make the code readable;
the syntax of the statement is

   __asm__ ( assembler_code : outputs : inputs );

...where assembler_code is a string that is passed directly to as, and thus
must have all of the linebreaks, tabs, and commenting conventions of as
assembly language. Details on __asm__ can be found in the GCC manual using
`info -f gcc.info -n "Extended Asm"` to avoid navigating with the clunky info
interface.

The 'outputs' and 'inputs' require some explanation; these allow C variables
to be passed into the asm block. In the assembly language, such variables will
be referred to using %0, %1, etc; the numbering scheme counts from 0 to 9
starting from the first colon, so that in

   __asm__ ( assembler_code : a, b : x, y );

a would be referred to as %0, b as %1, x as %2, and y as %3. Statements with
no inputs or no outputs simply use the colons, for example

   __asm__ ( assembler_code : x : );
   __asm__ ( assembler_code : : x );

in both of these x would be referred to in the assembler as %0. Note that this
use of '%' interferes with the as syntax which uses '%' to denote a register
name; all register names must be escaped, so that '%eax' in as becomes '%%eax'
in the __asm__ block.

Think that's as complicated as inline assembler gets? Wrong! This is GNU code
you're dealing with here; expect it to be more than a little
counter-intuitive. Each __asm__ input or output has the following syntax:

   "constraint" (symbol)

where 'symbol' is the C name or value [e.g. 'x', '&x', '0xFF', etc] and
'constraint' is one of the following characters [outputs additionally carry
a '=' prefix, as in "=m"]:

   m    memory address
   p    a memory address for push/load address instructions
   r    general register
   d    machine-dependent register class [EDX on i386]
   a    machine-dependent register class [EAX on i386]
   f    floating-point register
   i    immediate integer value
   F    immediate floating-point value
   g    general register, immediate integer, or memory address
   X    any operand whatsoever [no constraint]

These constraints refer to how the symbol or value in question will be handled
in the generated assembler code; for example

   __asm__ ( "jmp *%0 \n" : : "r" (ptr_error_code) );

will generate code such as

   movl ptr_error_code,%eax
   jmp *%eax

...which is not what you want in a handler stub, since EAX contains a value
that error_code relies on. It is best to compile the code with -S and review
the resulting assembly language file in order to ensure that the constraints
are treating your code correctly. In case this explanation of constraints has
not been confusing enough, refer to `info -f gcc.info -n "Constraints"`.
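A small, self-contained illustration of the operand numbering and the '%%' register escaping may help here. This is only a sketch: add_with_asm is a made-up function, the asm path is compiled only by GCC-compatible compilers targeting x86, and other targets fall back to plain C. It also uses a third colon-separated section, the clobber list, which is a standard part of GCC extended asm that the discussion above does not cover; it tells the compiler that EAX is overwritten so that none of the operands are placed there.

```c
#include <stdint.h>

/* Adds x and y, routing the work through EAX to show operand syntax. */
static inline uint32_t add_with_asm(uint32_t x, uint32_t y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t sum;

    __asm__ ( "movl %1, %%eax \n\t"    /* %1 = first input  (x)        */
              "addl %2, %%eax \n\t"    /* %2 = second input (y)        */
              "movl %%eax, %0 \n\t"    /* %0 = the only output (sum)   */
              : "=r" (sum)             /* outputs: numbered from %0    */
              : "r" (x), "r" (y)       /* inputs: numbering continues  */
              : "eax" );               /* clobbers: EAX is scribbled on */
    return sum;
#else
    /* Not an x86 GCC-style target: plain C stand-in for the asm above. */
    return x + y;
#endif
}
```

Compiling a file containing this with `gcc -S` shows which registers the "r" constraints were assigned, which is the same review step suggested above for handler stubs.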
Hopefully the above paragraphs will make this code more rather than less
readable; what follows is an inline assembler stub for the INT3 descriptor:

   __asm__ (
        ".globl our_int3_stub        \n"   /* export 'our_int3_stub' label */
        ".align 4,0x90               \n"
        "our_int3_stub:              \n\t" /* start of exception handler */
        "pushf                       \n\t" /* save all flags and registers */
        "pusha                       \n\t"
        "call check_int3             \n\t" /* handle this exception? */
        "testl %%eax, %%eax          \n\t"
        "popa                        \n\t"
        "jz default_handler          \n\t" /* no, push original handler */
        "popf                        \n\t" /* yes, push our handler */
        "pushl $0                    \n\t"
        "pushl our_int3_handler(,1)\n\t"   /* our handler */
        "jmp go_go                   \n"
        "default_handler:            \n\t"
        "popf                        \n\t"
        "pushl $0                    \n\t"
        "pushl %0                    \n"   /* original handler */
        "go_go:                      \n\t"
        "ljmp %1,%2                  \n\t"
        : : "p" (do_int3), "p" (__KERNEL_CS), "p" (error_code) );

The first alarming bit here is the __KERNEL_CS; when passing immediate values
to the ljmp, Intel uses the "ptr 16:32" format -- meaning that a 6-byte
address is required, with 2 bytes for the selector and 4 bytes for the actual
address. __KERNEL_CS is defined in include/asm/segment.h.

Another surprise is the use of an effective address in the "pushl
our_int3_handler" statement. GAS uses the syntax

   section:disp32( Base, Index, Scale )

to specify effective addresses; what we want in this case is to use the symbol
"our_int3_handler" as a displacement into the current section, so we set Base
and Scale to 0 and use an Index of 1. As usual, the GNU info files provide
more information; GAS syntax for Intel is in 'info -f as.info -n
i386-Dependent'.

The rest of the inline assembler is fairly straightforward; the entry label
for the stub "our_int3_stub" is defined globally so our exception grabber can
reference it, and the constants for error_code and do_int3 are passed in as
immediate pointer values.
Modifying grab_excep() to use the constant for idt_table is much simpler:

   void grab_excep( int n, void *new_fn, unsigned long *old_fn){
        unsigned long new_addr = (unsigned long)new_fn;
        struct desc_struct *idte, *idt = (struct desc_struct *) idt_table;
        ...
        return;
   }

Needless to say, the constants may be used indiscriminately as function
pointers as long as they are valid for the currently running kernel.


V. Dynamic kernel symbol addresses
_______________________________________________________________________________
Hard-coded constants are all well and good for proving a theory, but they are
embarrassing to leave lying around in your code like this. At the very least,
the user should be able to specify the symbol addresses on the command line.

The Linux kernel module facility provides a handy macro for passing parameters
in via insmod:

   MODULE_PARM(variable, "type");

...where 'type' is "i" for integer values and "s" for string values, and
'variable' is the name of the variable where this parameter will be stored.
A module which will be loaded with

   root@local_host>insmod test.o name=my_mod do_int3=0xc010931c

would have the following macros invoked outside of any function bodies:

   MODULE_PARM(name, "s");
   MODULE_PARM(do_int3, "i");

So we have a way to pass addresses into the kernel module; the fun part now is
recoding the exception handler to deal with them.
To begin with, we will need to add a few global variables as well as some
code:

   unsigned long sym_idt_table, sym_error_code, sym_do_int3;
   MODULE_PARM(sym_idt_table, "i");
   MODULE_PARM(sym_error_code, "i");
   MODULE_PARM(sym_do_int3, "i");

   /* initialize these symbols to the constants generated from mapfile */
   unsigned long ptr_error_code = error_code;
   unsigned long ptr_idt_table = idt_table;

   /* These are pointers to the old and new exception handlers */
   unsigned long old_int3_stub = 0;
   unsigned long old_int3_handler = do_int3;
   unsigned long our_int3_handler = (unsigned long)&our_do_int3;

   int __init init_dbg_mod(void){
        EXPORT_NO_SYMBOLS;

        /* check command line parameters */
        if ( sym_idt_table )
             ptr_idt_table = (unsigned long) sym_idt_table;
        if ( sym_error_code )
             ptr_error_code = (unsigned long) sym_error_code;
        if ( sym_do_int3 )
             old_int3_handler = (unsigned long) sym_do_int3;

        /* hook exception */
        grab_excep(3, &our_int3_stub, &old_int3_stub);
        return(0);
   }

The use of variables in the exception handler also allows us to use our stored
address for the old exception handler, rather than hard-coding the address of
do_int3(); this is better all around, and requires one less kernel symbol.

   __asm__ (
        ".globl our_int3_stub        \n"   /* export 'our_int3_stub' label */
        ".align 4,0x90               \n"
        "our_int3_stub:              \n\t" /* start of exception handler */
        "pushf                       \n\t" /* save all flags and registers */
        "pusha                       \n\t"
        "call check_int3             \n\t" /* handle this exception? */
        "testl %%eax, %%eax          \n\t"
        "popa                        \n\t"
        "jz default_handler          \n\t" /* no, push original handler */
        "popf                        \n\t" /* yes, push our handler */
        "pushl $0                    \n\t"
        "pushl our_int3_handler(,1)\n\t"   /* our handler */
        "jmp go_go                   \n"
        "default_handler:            \n\t"
        "popf                        \n\t"
        "pushl $0                    \n\t"
        "pushl old_int3_handler(,1)\n"     /* original handler */
        "go_go:                      \n\t"
        "jmp *ptr_error_code         \n"   /* jump to error_code dispatcher */
        :: );

Once again, an effective address is used to push a code reference, in this
case 'old_int3_handler'.
In addition, the GAS address indirection operator is used in the jmp; this
deviates from the assembler in the kernel source in that a long jump is not
used -- thus allowing the use of '*' applied to a function pointer -- however
empirical tests have determined that a long jump is not strictly necessary,
since all exception handlers run in the kernel CS.

As before, the grab_excep() routine needs to be modified to handle this new
scheme:

   void grab_excep( int n, void *new_fn, unsigned long *old_fn){
        unsigned long new_addr = (unsigned long)new_fn;
        struct desc_struct *idte,
                           *idt = (struct desc_struct *) ptr_idt_table;
        ...
        return;
   }

While this dynamic method is a step above the hard-coded method, it still
requires the user to override any incorrect symbol values, and thus is less
than ideal.


VI. Signatures and verifying addresses
_______________________________________________________________________________
Naturally, user-supplied addresses cannot be used without verifying their
validity; in addition, it would be good to check the compiled-in values and to
exit early if they are found to be invalid.

Determining if idt_table is valid simply requires an sidt statement:

   int validate_idt( unsigned long addr ) {
        unsigned char idtr[6];

        __asm__ ("sidt %0": "=m" (idtr));
        if ( addr != *((unsigned long *)&idtr[2]) )
             return(0);
        return(1);
   }

Verifying that error_code is correct requires comparison with a known-good
signature; this signature can be obtained using a kernel module such as the
following:

   #define __KERNEL__
   #define MODULE
   #define LINUX
   #include
   #include
   #include

   unsigned long sym_addr;
   MODULE_PARM(sym_addr, "i");     /* address to check */

   int __init init_dbg_mod(void){
        int x;

        if ( sym_addr ) {
             for ( x = 0; x < 32; x++ ) {
                  printk("%02X ", ((unsigned char *)sym_addr)[x] );
             }
             printk("\n");
        } else
             printk("You forgot the sym_addr=0x???????? param!\n");
        return(1);      /* return without loading the module */
   }
   module_init(init_dbg_mod);

Loading this with `insmod gen_sig.o sym_addr=0xc0108e54` will print a 32-byte
signature to the console. A verification routine is simple:

   int verify_addr_sig( char *addr, char *sig ) {
        if ( memcmp( addr, sig, 32 ) )
             return(0);         /* nope, we didn't match */
        return(1);
   }

...and can be used with code like the following:

   unsigned char error_code_sig[32] = {
        0x1E, 0x50, 0x31, 0xC0, 0x55, 0x57, 0x56, 0x52,
        0x48, 0x51, 0x53, 0xFC, 0x8C, 0xC1, 0x8B, 0x74,
        0x24, 0x24, 0x8B, 0x7C, 0x24, 0x20, 0x89, 0x44,
        0x24, 0x24, 0x89, 0x4C, 0x24, 0x20, 0x89, 0xE2 };

   if (! verify_addr_sig( (char *)ptr_error_code, error_code_sig ) ) {
        printk("Address %08lX for symbol error_code is invalid!\n",
               ptr_error_code );
        return(1);
   }

This same technique can be used to verify that do_int3 is correct.


VII. Determining symbol addresses on-the-fly
_______________________________________________________________________________
Suppose, for the sake of argument, that the module is being compiled for a
kernel with no mapfile, and that you do not wish the user to be bothered with,
say, knowledge of the module's existence. In this case it would be helpful to
be able to determine the addresses of required symbols when loading the
module.

Once again, idt_table is simple, since its address is held in the IDTR:

   unsigned long get_idt( void ) {
        unsigned char idtr[6];
        unsigned long idt;

        __asm__ __volatile__("sidt %0": "=m" (idtr));
        idt = *((unsigned long *)&idtr[2]);
        return(idt);
   }

Now, we know from entry.S that 'error_code' is a few bytes after the stub for
INT0, or 'divide_error'.
This means that the address of the INT0 stub can be obtained from the IDT,
and used as a base from which to search for the signature of error_code:

   /* returns the address of the stub for INT n */
   void * get_from_idt( int n ) {
        struct desc_struct *idte =
                  &((struct desc_struct *)ptr_idt_table)[n];

        return( (void *) ((idte->off_hi << 16) + idte->off_lo) );
   }

   /* Searches for 32-byte 'sig' in 'range' bytes starting at 'base_addr' */
   void * find_addr_by_sig( char *base_addr, char *sig, int range ) {
        int x;

        for ( x = 0; x < range; x++ ) {
             if (! memcmp(sig, &base_addr[x], 32 ) )
                  return( &base_addr[x] );
        }
        return(NULL);
   }

   int __init init_dbg_mod(void){
        ...
        ptr_error_code = (unsigned long) find_addr_by_sig( get_from_idt(0),
                                                  error_code_sig, 256 );
        if (! ptr_error_code ) {
             printk("Unable to resolve error_code signature!\n");
             return(1);
        }
        ...
   }

Determining the location of do_int3 is a little tricky, but not very complex;
since the address of the handler stub can be obtained from the IDT, and the
stub itself contains only a few simple instructions, the address of the
handler can be obtained by reading a set offset from the start of the stub:

   int3:
        6A 00             pushl $0
        68 1C 93 10 C0    pushl do_int3       ;do_int3 = 0xC010931C
        E9 28 FF FF FF    jmp error_code

Since only 3 bytes exist between the start of the stub and the address of the
handler, the address of the handler can be obtained with

   old_int3_stub = (unsigned long) get_from_idt(3);
   old_int3_handler = *(unsigned long *)&((unsigned char *)old_int3_stub)[3];

With these modifications, the kernel module is able to be loaded on any
binary-compatible system with a suitable [2.2 and 2.4 should both work]
kernel. Additional symbols can be resolved using signatures of sufficient
length; any symbol in /proc/ksyms that is in the same source file as the
symbol in question would be usable as a 'base address' from which to start
searching.
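The two lookups described in this section can be tried safely in userspace against a copy of the stub bytes. The following is a sketch under that assumption, not the module code: find_sig and handler_from_stub are hypothetical names, the signature length is passed explicitly instead of being fixed at 32, and the scan stops before running past the end of the buffer, which the kernel-side routine above does not attempt.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the first position of 'sig' inside base[0..range), or NULL. */
static inline const uint8_t *find_sig(const uint8_t *base, size_t range,
                                      const uint8_t *sig, size_t sig_len)
{
    size_t x;

    if (sig_len == 0 || range < sig_len)
        return NULL;
    /* never compare past the end of the buffer */
    for (x = 0; x + sig_len <= range; x++) {
        if (memcmp(base + x, sig, sig_len) == 0)
            return base + x;
    }
    return NULL;
}

/*
 * Expect the stub layout shown above:
 *     6A 00            pushl $0
 *     68 xx xx xx xx   pushl $handler
 * On a match, store the little-endian handler address and return 1;
 * return 0 if the opcodes differ, so a changed stub is not misread.
 */
static inline int handler_from_stub(const uint8_t *stub, uint32_t *handler)
{
    if (stub[0] != 0x6A || stub[1] != 0x00 || stub[2] != 0x68)
        return 0;
    *handler = (uint32_t)stub[3]
             | ((uint32_t)stub[4] << 8)
             | ((uint32_t)stub[5] << 16)
             | ((uint32_t)stub[6] << 24);
    return 1;
}
```

Given the twelve stub bytes listed above, handler_from_stub() reports 0xC010931C. Checking the opcodes before trusting the offset costs three comparisons and guards against a kernel whose int3 stub was built differently.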
APPENDIX A: Resources
_______________________________________________________________________________
'(nearly) Complete Linux Loadable Kernel Modules', pragmatic, THC 1998
'Understanding the Linux Kernel', Daniel Bovet & Marco Cesati, O'Reilly 2001
'Linux Kernel Module Programming Guide', Ori Pomerantz, FSF 1999
'Intel Software Developer's Manual Volume 1: Basic Architecture',
     Intel Corporation, 1999
'Intel Software Developer's Manual Volume 3: System Programming',
     Intel Corporation, 1999
'Using and Porting GCC', Richard Stallman, FSF 1998
/usr/src/linux-2.4.2/include/asm/desc.h
/usr/src/linux-2.4.2/include/asm/ptrace.h
/usr/src/linux-2.4.2/arch/i386/kernel/traps.c
/usr/src/linux-2.4.2/arch/i386/kernel/entry.S


APPENDIX B: C Implementation
_______________________________________________________________________________
/* INT3_Hook Kernel Module :: code 2001 per mammon_
 *     -- To compile:
 *        gcc -I/usr/src/linux/include -Wall -c int3_hook.c
 */
#define __KERNEL__
#define MODULE
#define LINUX

#if CONFIG_MODVERSIONS==1
 #define MODVERSIONS
 #include
#endif
#include
#include
#include
#include
#include

struct desc_struct {
     unsigned short off_lo, seg_sel;
     unsigned char reserved,flag;
     unsigned short off_hi;
};

/* forward decl's */
asmlinkage void our_do_int3(struct pt_regs * regs, long err_code);
extern asmlinkage void our_int3_stub();

/* ================================================== Globals */
unsigned long ptr_error_code;
unsigned long ptr_idt_table;
unsigned long old_int3_stub;
unsigned long old_int3_handler;
unsigned long our_int3_handler = (unsigned long)&our_do_int3;

unsigned char error_code_sig[32] = {
     0x1E, 0x50, 0x31, 0xC0, 0x55, 0x57, 0x56, 0x52,
     0x48, 0x51, 0x53, 0xFC, 0x8C, 0xC1, 0x8B, 0x74,
     0x24, 0x24, 0x8B, 0x7C, 0x24, 0x20, 0x89, 0x44,
     0x24, 0x24, 0x89, 0x4C, 0x24, 0x20, 0x89, 0xE2 };

/* ================================================== Utility Routines */
unsigned long get_idt( void ) {
     unsigned char idtr[6];
     unsigned long idt;

     __asm__ __volatile__("sidt %0": "=m" (idtr));
     idt = *((unsigned long *)&idtr[2]);
     return(idt);
}

void * get_from_idt( int n ) {
     struct desc_struct *idte = &((struct desc_struct *)ptr_idt_table)[n];

     return( (void *) ((idte->off_hi << 16) + idte->off_lo) );
}

/* Searches for 32-byte 'sig' in 'range' bytes starting at 'base_addr' */
void * find_addr_by_sig( char *base_addr, char *sig, int range ) {
     int x;

     for ( x = 0; x < range; x++ ) {
          if (! memcmp(sig, &base_addr[x], 32 ) )
               return( &base_addr[x] );
     }
     return(NULL);
}

/* hook exception # 'n' to call 'new_fn'; store old handler in 'old_fn' */
void grab_excep( int n, void *new_fn, unsigned long *old_fn){
     unsigned long new_addr = (unsigned long)new_fn;
     struct desc_struct *idt = (struct desc_struct *)ptr_idt_table;

     if ( old_fn )                 /* save old exception handler */
          *old_fn = (idt[n].off_hi << 16) + idt[n].off_lo;
     idt[n].off_hi = (unsigned short)(new_addr >> 16);
     idt[n].off_lo = (unsigned short)(new_addr & 0x0000FFFF);
     return;
}

/* =================================================== INT3 Handler */
asmlinkage void our_do_int3(struct pt_regs * regs, long err_code) {
     void (*old_fn)(struct pt_regs *,long) = (void *)old_int3_handler;

     printk( "<7> Local INT3 Handler Called from PID %d EIP: %08lx\n",
             current->pid, regs->eip );
     (*old_fn)(regs, err_code);
     return;
}

/* Short demonstration of how to selectively control trapping of processes */
/* called by INT handler stub [ our_int3_stub ] */
int check_int3( void ) {
     if ( current->pid > 1 )
          return(1);
     return(0);
}

/* =================================================== Kernel Module Stuff */
int __init init_dbg_mod(void){
     EXPORT_NO_SYMBOLS;

     ptr_idt_table = get_idt();
     ptr_error_code = (unsigned long) find_addr_by_sig( get_from_idt(0),
                                                  error_code_sig, 256 );
     if (! ptr_error_code ) {
          printk("Unable to resolve error_code signature!\n");
          return(1);
     }

     /* resolve the old stub and handler before hooking, so that      */
     /* old_int3_handler is valid by the time our stub can be entered */
     old_int3_stub = (unsigned long) get_from_idt(3);
     old_int3_handler = *(unsigned long *)&((unsigned char *)old_int3_stub)[3];
     grab_excep(3, &our_int3_stub, NULL);
     return(0);
}

void __exit exit_dbg_mod(void){
     /* unhook exception */
     grab_excep(3, (char *)old_int3_stub, NULL);
     return;
}

module_init(init_dbg_mod);
module_exit(exit_dbg_mod);

/* =================================================== INT3 Handler Stub */
/* this is a bogus routine that contains the code for our_int3_stub */
void int3_crapola(void){
__asm__ (
     ".globl our_int3_stub        \n"   /* export 'our_int3_stub' label */
     ".align 4,0x90               \n"
     "our_int3_stub:              \n\t" /* start of exception handler */
     "pushf                       \n\t" /* save all flags and registers */
     "pusha                       \n\t"
     "call check_int3             \n\t" /* handle this exception? */
     "testl %%eax, %%eax          \n\t"
     "popa                        \n\t"
     "jz default_handler          \n\t" /* no, push original handler */
     "popf                        \n\t" /* yes, push our handler */
     "pushl $0                    \n\t"
     "pushl our_int3_handler(,1)\n\t"   /* our handler */
     "jmp go_go                   \n"
     "default_handler:            \n\t"
     "popf                        \n\t"
     "pushl $0                    \n\t"
     "pushl old_int3_handler(,1)\n"     /* original handler */
     "go_go:                      \n\t"
     "jmp *ptr_error_code         \n"   /* jump to error_code dispatcher */
     :: );
}
/* ================================================================ EOF */