Linux® emulation in FreeBSD
This master's thesis deals with updating the Linux(R) emulation layer (the so-called _Linuxulator_). The task was to update the layer to match the functionality of Linux(R) 2.6. The Linux(R) 2.6.16 kernel was chosen as the reference implementation. The concept is loosely based on the NetBSD implementation. Most of the work was done in the summer of 2006 as part of the Google Summer of Code student program. The focus was on bringing _NPTL_ (Native POSIX(R) Thread Library) support into the emulation layer, including _TLS_ (thread local storage), _futexes_ (fast user space mutexes), _PID mangling_, and some other minor things. Many small problems were identified and fixed in the process. My work was integrated into the main FreeBSD source repository and will be shipped in the upcoming 7.0R release. We, the emulation development team, are working on making the Linux(R) 2.6 emulation the default emulation layer in FreeBSD.
In the last few years, open source UNIX(R)-based operating systems have become widely deployed on server and client machines. Among these operating systems I would like to point out two: FreeBSD, for its BSD heritage, time-proven code base, and many interesting features, and Linux(R), for its wide user base, enthusiastic open developer community, and support from large companies. FreeBSD tends to be used on server-class machines handling heavy-duty networking tasks, with less usage on desktop-class machines for ordinary users. Linux(R) sees the same usage on servers, but it is used much more by home users. This leads to a situation where many binary-only programs are available for Linux(R) but lack support for FreeBSD.
Naturally, a need arises to run Linux(R) binaries on a FreeBSD system, and this is what this thesis deals with: the emulation of the Linux(R) kernel in the FreeBSD operating system.
During the summer of 2006, Google Inc. sponsored a project that focused on extending the Linux(R) emulation layer (the so-called Linuxulator) in FreeBSD to include Linux(R) 2.6 facilities. This thesis was written as a part of that project.
A look inside...
In this section we describe each operating system in question: how it deals with syscalls, trapframes, and other low-level details. We also describe how each system understands common UNIX(R) primitives, such as what a PID is and what a thread is. In the third subsection we discuss how UNIX(R) on UNIX(R) emulation could be done in general.
What is UNIX(R)
UNIX(R) is an operating system with a long history that has influenced almost every other operating system currently in use. Starting in the 1960s, its development continues to this day (although in different projects). UNIX(R) development soon forked into two main branches: the BSD family and the System III/V family. The two influenced each other while growing toward a common UNIX(R) standard. Among the contributions originating in BSD we can name virtual memory, TCP/IP networking, FFS, and many others. The System V branch contributed SysV interprocess communication primitives, copy-on-write, etc. UNIX(R) itself does not exist any more, but its ideas have been used by many other operating systems worldwide, forming the so-called UNIX(R)-like operating systems. These days the most influential ones are Linux(R), Solaris, and (to some extent) FreeBSD. There are in-company UNIX(R) derivatives (AIX, HP-UX etc.), but their users have increasingly migrated to the aforementioned systems. Let us summarize typical UNIX(R) characteristics.
Technical details
Every running program constitutes a process that represents a state of the computation. A running process is divided between kernel space and user space. Some operations can be done only from kernel space (dealing with hardware etc.), but a process should spend most of its lifetime in user space. The kernel is where the management of processes, hardware, and low-level details takes place. The kernel provides a standard unified UNIX(R) API to user space. The most important parts of this API are covered below.
Communication between kernel and user space process
The common UNIX(R) API defines a syscall as a way to issue commands from a user space process to the kernel. The most common implementation uses either an interrupt or a specialized instruction (think of the `SYSENTER`/`SYSCALL` instructions for ia32). Each syscall is identified by a number. For example, in FreeBSD syscall number 85 is the man:swapon[2] syscall and syscall number 132 is man:mkfifo[2]. Some syscalls need parameters, which are passed from user space to kernel space in various ways (implementation dependent). Syscalls are synchronous.
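The idea of identifying syscalls by number can be sketched with a toy dispatch table. This is only an illustrative model, not FreeBSD kernel code: the handler functions `sys_swapon` and `sys_mkfifo` and the `dispatch` helper are hypothetical stand-ins, and only the numbers 85 and 132 come from the text above.

```python
import errno

# Hypothetical stand-in handlers; a real kernel would do the actual work here.
def sys_swapon(path):
    return ("swapon", path)

def sys_mkfifo(path, mode):
    return ("mkfifo", path, mode)

# Toy syscall table: number -> handler (FreeBSD numbers quoted in the text).
SYSCALL_TABLE = {
    85: sys_swapon,
    132: sys_mkfifo,
}

def dispatch(number, *args):
    """Look up the handler by number and run it synchronously."""
    handler = SYSCALL_TABLE.get(number)
    if handler is None:
        # Unknown numbers are rejected; ENOSYS means "function not implemented".
        raise OSError(errno.ENOSYS, "unknown syscall number %d" % number)
    return handler(*args)
```

The caller blocks until `dispatch` returns, which mirrors the synchronous nature of syscalls; how the arguments actually travel from user space to kernel space is left out because it is implementation dependent.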
Another possible way to communicate is by using a _trap_. Traps occur asynchronously in response to some event (division by zero, page fault etc.). A trap can be transparent to a process (page fault) or can result in a reaction such as sending a _signal_ (division by zero).
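A trap that ends in a signal can be observed from user space. The following is a minimal sketch, assuming a POSIX system with a standard CPython: a child process reads from address 0, the resulting fault cannot be resolved by the kernel, and the child is killed by a signal (typically `SIGSEGV`). The helper name `fault_signal` is made up for this illustration, and an invalid memory access is used instead of the division by zero mentioned above because Python intercepts integer division by zero itself.

```python
import signal
import subprocess
import sys

def fault_signal():
    """Run a child that reads address 0; return the signal that killed it."""
    child_code = "import ctypes; ctypes.string_at(0)"
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a negative return code -N means the child died from signal N.
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode)
    return None
```

An ordinary page fault on a valid but not yet resident page would go unnoticed by the process; only the unresolvable case is turned into a signal.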
Communication between processes
There are other APIs (System V IPC, shared memory etc.), but the single most important one is signals. Signals are sent by processes or by the kernel and are received by processes. Some signals can be ignored or handled by a user-supplied routine; others result in a predefined action that cannot be altered or ignored.
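The three cases (handled, ignored, and unalterable) can be sketched with Python's standard `signal` module. This is a small illustration under the assumption of a POSIX system and a call from the main thread; the function name `demo_signals` is made up for the example.

```python
import os
import signal
import time

def demo_signals():
    """Return (signals seen by our handler, whether SIGKILL can be caught)."""
    received = []

    def on_usr1(signum, frame):
        received.append(signum)

    signal.signal(signal.SIGUSR1, on_usr1)         # handled by a user routine
    signal.signal(signal.SIGUSR2, signal.SIG_IGN)  # explicitly ignored
    try:
        os.kill(os.getpid(), signal.SIGUSR1)
        os.kill(os.getpid(), signal.SIGUSR2)  # default action would terminate us
        # Python runs handlers at a safe point; wait briefly just in case.
        deadline = time.monotonic() + 1.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
        try:
            signal.signal(signal.SIGKILL, on_usr1)
            kill_catchable = True
        except (OSError, ValueError):
            # The kernel refuses to change the action for SIGKILL.
            kill_catchable = False
    finally:
        # Restore the default actions.
        signal.signal(signal.SIGUSR1, signal.SIG_DFL)
        signal.signal(signal.SIGUSR2, signal.SIG_DFL)
    return received, kill_catchable
```

The process survives `SIGUSR2` only because it is ignored, while the attempt to install a handler for `SIGKILL` is rejected, matching the "cannot be altered or ignored" case.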
Process management