Emulating the U-dot zone
gdb was able to load, present its prompt to the user, and display the online help, but it was not possible to do anything actually useful with it. It was even impossible to launch a Linux program. This was not surprising because the Linux
ptrace() system call emulation was not yet implemented for the PowerPC.
In Unix systems, the
ptrace() system call is used almost exclusively by debuggers such as
gdb. It provides facilities for reading or writing the CPU register values during the program execution, stepping the program, and reading or writing the memory allocated to the traced program. All these operations are requested through
ptrace() commands such as
SETREGS, and so on. Have a look at the
ptrace (2) man page if you want more information. The
ptrace() emulation is split into two parts. On one hand, a machine-independent part, located in
sys/compat/linux/common/linux_misc.c, and on the other hand, a machine-dependent part,
linux_sys_ptrace_arch(), activated through the
LINUX_SYS_PTRACE_ARCH macro. The
linux_sys_ptrace_arch() function is located in
sys/compat/linux/arch/powerpc/linux_ptrace.c for the PowerPC. The machine-independent part can handle some commands, such as reading or writing to the traced process memory, by calling the NetBSD native ptrace().
The machine-independent part of
linux_sys_ptrace_arch() should ideally implement the
SETFPREGS. The easiest way to write the
linux_sys_ptrace_arch() function for the PowerPC was to pick up the i386 version and change what was really machine dependent. This includes all references to CPU registers and all references to data structures that do not exist on the PowerPC, for instance, the
u_debugreg field of
Operations on registers are quite straightforward to implement. Linux binaries expect reading and writing through a
pt_regs structure, defined in Linux's header
linux/include/asm-ppc/ptrace.h. The job is to get the registers and rearrange them appropriately.
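To illustrate what "get the registers and rearrange them" means, here is a minimal sketch. Both structures are simplified stand-ins invented for this example, not the real NetBSD register structure or the real Linux pt_regs layout, and the function name native_to_linux_regs() is hypothetical. The sketch only shows the idea of copying each register from the native layout into the slot where the Linux binary expects it.

```c
#include <stddef.h>
#include <stdint.h>

#define NGPR 32 /* the PowerPC has 32 general-purpose registers */

/* Hypothetical stand-in for the native (NetBSD-side) register set. */
struct native_regs {
    uint32_t fixreg[NGPR]; /* general-purpose registers */
    uint32_t lr;           /* link register */
    uint32_t cr;           /* condition register */
    uint32_t xer;          /* fixed-point exception register */
    uint32_t ctr;          /* count register */
    uint32_t pc;           /* program counter */
};

/* Hypothetical stand-in for the layout a Linux binary expects. */
struct linux_regs {
    uint32_t gpr[NGPR];
    uint32_t nip;  /* next instruction pointer */
    uint32_t ctr;
    uint32_t link;
    uint32_t xer;
    uint32_t ccr;
};

/* Copy every native register into its slot in the Linux-style layout. */
void native_to_linux_regs(const struct native_regs *n, struct linux_regs *l)
{
    for (size_t i = 0; i < NGPR; i++)
        l->gpr[i] = n->fixreg[i];
    l->nip = n->pc;
    l->ctr = n->ctr;
    l->link = n->lr;
    l->xer = n->xer;
    l->ccr = n->cr;
}
```

Writing registers would be the same copy in the opposite direction, from the Linux-style layout back into the native one.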
The two tricky operations, which read and write the user structure, are PEEKUSER and
POKEUSER. Before explaining how we emulate these two commands, let us first introduce the user structure, also known as the U-dot zone.
When running several processes at once, the Unix kernel needs to maintain some information for each process. This process information is split into kernel-memory-based and user-process-memory-based parts. The kernel part of the information is stored in the
struct proc, which is defined in
sys/sys/proc.h on NetBSD. This structure contains data that must remain in main memory at all times (kernel memory is never swapped out). Kernel-based process information includes, for instance, the user owning the process. That information must always remain resident in main memory because we do not want a
ps -aux to cause some pages of each swapped-out process to be reloaded into main memory.
The user-based, or "userland", process information is called the user structure. The information contained in the user structure is only needed when the process is running. On NetBSD, the user structure is defined as
struct user, in
sys/sys/user.h. On Linux, this is the
struct user, defined in
linux/include/asm-ppc/user.h. In kernel code, user structures used to be named "u", and therefore accessed in C through the
u.<field> syntax. Hence the "U-dot" name.
The NetBSD U-dot zone is rather small, because most of the fields in this structure were moved to other locations, including the kernel stack or
struct proc. On the other hand, Linux stores lots of information in the U-dot zone, such as text, data, and stack location and sizes. It also uses the U-dot zone to save user values of CPU registers of the traced process when entering kernel space. Linux's
gdb reads the U-dot zone to get and set the register values of the traced processes. For reading, this works because the traced process is stopped when
gdb does the operation.
gdb reads the values that the registers of the traced process held just before it was stopped: when the CPU entered kernel space, they were saved in the U-dot zone. For writing, it works because when the kernel runs the traced process again, it restores the modified registers from the U-dot zone.
Now, let us examine how PEEKUSER and
POKEUSER are emulated in NetBSD. These
ptrace() commands take three other arguments: the PID of the traced process, the address of the target field relative to the beginning of the U-dot zone, and a data field, used for write operations. As you can imagine, it is not trivial to emulate operations on the U-dot zone, because they involve fields that do not exist in NetBSD's U-dot zone: registers, stack location and size, and so on. We therefore have to check the target address and, depending on it, return a value taken from another place in the kernel.
The LUSR_OFF macro helps. It returns the offset of a given field relative to the beginning of the U-dot zone. Here is the definition of LUSR_OFF, from
sys/compat/linux/arch/powerpc/linux_ptrace.c:

    #define LUSR_OFF(member) offsetof(struct linux_user, member)
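To make the offset arithmetic concrete, here is a small sketch of how a word can be read from a user structure given only an offset, which is roughly what a debugger asks for when it peeks into the U-dot zone. The structure demo_user, the macro DEMO_OFF, and the function demo_peek() are all made up for this example; only the offsetof() technique is taken from the macro above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much reduced user structure with fixed-size fields. */
struct demo_user {
    uint32_t regs[4];
    uint32_t u_tsize;
    uint32_t u_dsize;
    uint32_t u_ssize;
    uint32_t start_code;
    uint32_t start_stack;
};

/* Same technique as LUSR_OFF: offset of a member from the structure start. */
#define DEMO_OFF(member) offsetof(struct demo_user, member)

/*
 * Read the 32-bit word located at offset addr inside the structure.
 * Returns 0 on success, -1 if addr is misaligned or out of bounds.
 */
int demo_peek(const struct demo_user *u, size_t addr, uint32_t *out)
{
    if (addr % sizeof(uint32_t) != 0)
        return -1;
    if (addr > sizeof(*u) - sizeof(uint32_t))
        return -1;
    memcpy(out, (const unsigned char *)u + addr, sizeof(*out));
    return 0;
}
```

With this layout, DEMO_OFF(u_ssize) is 24, so calling demo_peek() with an offset of 24 returns the stack size field. An emulation layer cannot read memory this way when the field does not exist natively, which is why the offset is compared against known fields instead.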
And here is some code that emulates reading the stack size, code location, and stack location from Linux's U-dot zone. As you can see, we grab the relevant information from locations in the
struct proc (
p is a pointer to the
struct proc of the current process):
    if (addr == LUSR_OFF(u_ssize))
        *retval = p->p_vmspace->vm_ssize;
    else if (addr == LUSR_OFF(start_code))
        *retval = (register_t) p->p_vmspace->vm_taddr;
    else if (addr == LUSR_OFF(start_stack))
        *retval = (register_t) p->p_vmspace->vm_minsaddr;
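The same dispatch can be tried outside the kernel with mock structures. In the sketch below, mock_vmspace, mock_proc, mock_linux_user, MOCK_OFF, and mock_peekuser() are hypothetical names invented for the example, and returning -1 for an unknown offset is this sketch's own choice, not necessarily what the NetBSD code does.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_vmspace {
    long vm_ssize;             /* stack size */
    unsigned long vm_taddr;    /* code location */
    unsigned long vm_minsaddr; /* stack location */
};

struct mock_proc {
    struct mock_vmspace *p_vmspace;
};

/* Hypothetical subset of the Linux-side user structure. */
struct mock_linux_user {
    long regs[48];
    unsigned long u_tsize, u_dsize, u_ssize;
    unsigned long start_code;
    unsigned long start_stack;
};

#define MOCK_OFF(member) offsetof(struct mock_linux_user, member)

/*
 * Emulate a read at offset addr: the field is not stored in a real
 * user structure, so the value is fetched from wherever it lives.
 * Returns 0 on success, -1 for an offset this sketch does not handle.
 */
int mock_peekuser(const struct mock_proc *p, size_t addr, long *retval)
{
    if (addr == MOCK_OFF(u_ssize))
        *retval = p->p_vmspace->vm_ssize;
    else if (addr == MOCK_OFF(start_code))
        *retval = (long)p->p_vmspace->vm_taddr;
    else if (addr == MOCK_OFF(start_stack))
        *retval = (long)p->p_vmspace->vm_minsaddr;
    else
        return -1;
    return 0;
}
```

The caller never learns that the structure it addresses does not exist: it passes an offset computed from the Linux layout and gets back the value it would have found there.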
And here is a code snippet that emulates reading traced process registers from the U-dot zone:
    error = process_read_regs(t, regs);
    /* (snip) */
    if (addr == LUSR_REG_OFF(lnip))
        *retval = regs->pc;
    else if (addr == LUSR_REG_OFF(lctr))
        *retval = regs->ctr;
    else if (addr == LUSR_REG_OFF(llink))
        *retval = regs->lr;
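The write direction can be sketched in the same spirit: fetch the saved registers, change the one selected by the offset, and store the set back so that the modified value is restored when the traced process runs again. Everything in this example is hypothetical (mock_regs, mock_thread, mock_linux_regs, MOCK_REG_OFF, and the mock_* functions); it is a read-modify-write illustration under those assumptions, not the NetBSD implementation.

```c
#include <stddef.h>

/* Hypothetical native register subset. */
struct mock_regs {
    unsigned long pc, ctr, lr;
};

/* Hypothetical traced thread: holds the registers saved on kernel entry. */
struct mock_thread {
    struct mock_regs saved;
};

/* Hypothetical Linux-side register layout, used only for its offsets. */
struct mock_linux_regs {
    unsigned long lnip, lctr, llink;
};

#define MOCK_REG_OFF(member) offsetof(struct mock_linux_regs, member)

static void mock_read_regs(const struct mock_thread *t, struct mock_regs *r)
{
    *r = t->saved;
}

static void mock_write_regs(struct mock_thread *t, const struct mock_regs *r)
{
    t->saved = *r;
}

/*
 * Emulate a register write at offset addr.
 * Returns 0 on success, -1 if addr does not name a known register.
 */
int mock_pokeuser_reg(struct mock_thread *t, size_t addr, unsigned long data)
{
    struct mock_regs regs;

    mock_read_regs(t, &regs);
    if (addr == MOCK_REG_OFF(lnip))
        regs.pc = data;
    else if (addr == MOCK_REG_OFF(lctr))
        regs.ctr = data;
    else if (addr == MOCK_REG_OFF(llink))
        regs.lr = data;
    else
        return -1;
    mock_write_regs(t, &regs);
    return 0;
}
```

An unknown offset leaves the saved registers untouched, because the write-back only happens after a register has been matched.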
gdb was able to start the traced program, but a remaining bug made it unable to get a backtrace or to trace the program. We will examine the problem in the next section.