English Chinese (Simplified) (zh_CN)
The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of <filename>/usr/src/sys/boot</filename> reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. In the x86-specific <filename>i386</filename> directory, there are subdirectories for different boot standards like <filename>mbr</filename> (Master Boot Record), <filename>gpt</filename> (<acronym>GUID</acronym> Partition Table), and <filename>efi</filename> (Extensible Firmware Interface). Each boot standard has its own conventions and data structures. The example that follows shows booting an x86 computer from an <acronym>MBR</acronym> hard drive with the FreeBSD <filename>boot0</filename> multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process. 启动过程与计算机架构息息相关。不仅必须为每个计算机体系结构编写代码,而且在同一体系结构上也可能有多个不同类型的引导。例如,<filename>/usr/src/sys/boot</filename>的目录列表显示了大量依赖于体系结构的代码。每个受支持的体系结构都有一个目录。在特定于 x86 的 <filename>i386</filename> 目录中,有不同引导标准的子目录,如<filename>mbr</filename>(主引导记录)、<filename>gpt</filename>(<acronym>GUID</acronym>分区表)和<filename>efi</filename>(可扩展固件接口)。每个引导标准都有自己的约定和数据结构。以下示例显示,使用存储在第一扇区的 FreeBSD <filename>boot0</filename> 多引导加载程序,从<acronym>MBR</acronym> 硬盘驱动器启动 x86 计算机。该引导代码启动 FreeBSD 三阶段引导过程。
When the computer powers on, the processor's registers are set to some predefined values. One of the registers is the <emphasis>instruction pointer</emphasis> register, and its value after a power on is well defined: it is a 32-bit value of <literal>0xfffffff0</literal>. The instruction pointer register (also known as the Program Counter) points to code to be executed by the processor. Another important register is the <literal>cr0</literal> 32-bit control register, and its value just after a reboot is <literal>0</literal>. One of <literal>cr0</literal>'s bits, the PE (Protection Enabled) bit, indicates whether the processor is running in 32-bit protected mode or 16-bit real mode. Since this bit is cleared at boot time, the processor boots in 16-bit real mode. Real mode means, among other things, that linear and physical addresses are identical. The reason for the processor not to start immediately in 32-bit protected mode is backwards compatibility. In particular, the boot process relies on the services provided by the <acronym>BIOS</acronym>, and the <acronym>BIOS</acronym> itself works in legacy, 16-bit code. 当PC加电后,处理器的寄存器被设为某些特定值。在这些寄存器中, <emphasis>指令指针</emphasis>寄存器被设为32位值0xfffffff0。 指令指针寄存器指向处理器将要执行的指令代码。<literal>cr0</literal>, 一个32位控制寄存器,在刚启动时值被设为0。cr0的PE(Protection Enabled, 保护模式使能)位用来指示处理器是处于保护模式还是实地址模式。 由于启动时该位被清位,处理器在实地址模式中引导。在实地址模式中, 线性地址与物理地址是等同的。
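The role of the PE bit can be illustrated with a small C sketch. It only tests a <literal>cr0</literal> value passed in as a plain integer; the names <literal>CR0_PE</literal>, <literal>cr0_protected_mode()</literal>, and <literal>check_reset_state()</literal> are hypothetical helpers chosen for this illustration and are not taken from the FreeBSD sources.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of cr0 is the PE (Protection Enabled) bit. */
#define CR0_PE	0x00000001u

/* Returns 1 when PE is set (protected mode), 0 when it is clear (real mode). */
static int
cr0_protected_mode(uint32_t cr0)
{
	return ((cr0 & CR0_PE) != 0);
}

/*
 * Hypothetical self-check: with the post-reset cr0 value given in the
 * text (0), the PE bit is clear, so the processor is in real mode.
 */
static void
check_reset_state(void)
{
	assert(!cr0_protected_mode(0x00000000u));
}
```

Only bit 0 matters for this test, so any <literal>cr0</literal> value with that bit clear is reported as real mode.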
The <acronym>BIOS</acronym> (Basic Input Output System) is a chip on the motherboard that has a relatively small amount of read-only memory (<acronym>ROM</acronym>). This memory contains various low-level routines that are specific to the hardware supplied with the motherboard. The processor will first jump to the address 0xfffffff0, which really resides in the <acronym>BIOS</acronym>'s memory. Usually this address contains a jump instruction to the <acronym>BIOS</acronym>'s POST routines. BIOS表示<emphasis>Basic Input Output System</emphasis> (基本输入输出系统)。在主板上,它被固化在一个相对容量较小的 只读存储器(Read-Only Memory, ROM)。BIOS包含各种各样为主板硬件 定制的底层例程。就这样,处理器首先指向常驻BIOS存储器的地址 0xfffffff0。通常这个位置包含一条跳转指令,指向BIOS的POST例程。
The <acronym>POST</acronym> (Power On Self Test) is a set of routines including the memory check, system bus check, and other low-level initialization so the <acronym>CPU</acronym> can set up the computer properly. The important step of this stage is determining the boot device. Modern <acronym>BIOS</acronym> implementations permit the selection of a boot device, allowing booting from a floppy, <acronym>CD-ROM</acronym>, hard disk, or other devices. POST表示<emphasis>Power On Self Test</emphasis>(加电自检)。 这套程序包括内存检查,系统总线检查和其它底层工具, 从而使得CPU能够初始化整台计算机。这一阶段中有一个重要步骤, 就是确定引导设备。现在所有的BIOS都允许手工选择引导设备。 你可以从软盘、光盘驱动器、硬盘等设备引导。
The very last thing in the <acronym>POST</acronym> is the <literal>INT 0x19</literal> instruction. The <literal>INT 0x19</literal> handler reads 512 bytes from the first sector of the boot device into memory at address <literal>0x7c00</literal>. The term <emphasis>first sector</emphasis> originates from hard drive architecture, where the magnetic plate is divided into a number of cylindrical tracks. Tracks are numbered, and every track is divided into a number (usually 63) of sectors. Track numbers start at 0, but sector numbers start from 1. Track 0 is the outermost on the magnetic plate, and sector 1, the first sector, has a special purpose. It is also called the <acronym>MBR</acronym>, or Master Boot Record. The remaining sectors on the first track are usually left unused. <acronym>POST</acronym>的最后一步是执行<literal>INT 0x19</literal>指令。 这个指令从引导设备第一个扇区读取512字节,装入地址<literal>0x7c00</literal>。 <emphasis>第一个扇区</emphasis>的说法最早起源于硬盘的结构,硬盘面被分为若干圆柱形轨道。给轨道编号,同时又将轨道分为 一定数目(通常是63)的扇形。0号轨道是硬盘的最外圈,1号扇区, 第一个扇区(轨道、柱面都从0开始编号,而扇区从1开始编号) 有着特殊的作用,它又被称为主引导记录(Master Boot Record, <acronym>MBR</acronym>)。 第一轨剩余的扇区常常不使用。
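The numbering rule above (tracks from 0, sectors from 1) is why sector 1 of track 0 is the very first block on the disk. As a rough sketch, the conventional cylinder/head/sector to linear block conversion is shown below. The formula is the commonly used one and is not taken from this text; <literal>chs_to_lba()</literal> and its geometry parameters are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a cylinder/head/sector address to a zero-based linear block
 * number. Cylinders and heads count from 0, sectors count from 1, so
 * the sector number is decremented before it is added.
 */
static uint32_t
chs_to_lba(uint32_t cyl, uint32_t head, uint32_t sector,
    uint32_t heads_per_cyl, uint32_t sectors_per_track)
{
	assert(sector >= 1 && sector <= sectors_per_track);
	assert(head < heads_per_cyl);
	return ((cyl * heads_per_cyl + head) * sectors_per_track +
	    (sector - 1));
}
```

With any geometry, cylinder 0, head 0, sector 1 maps to block 0, which is where the <acronym>MBR</acronym> lives.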
As mentioned previously, the <literal>INT 0x19</literal> instruction causes the <literal>INT 0x19</literal> handler to load an <acronym>MBR</acronym> (<filename>boot0</filename>) into memory at address <literal>0x7c00</literal>. The source file for <filename>boot0</filename> can be found in <filename>sys/boot/i386/boot0/boot0.S</filename> - which is an awesome piece of code written by Robert Nordier. 如前所述, <literal>INT 0x19</literal> 指令装载 MBR, 也就是 <filename>boot0</filename> 的内容至内存地址 0x7c00。 再看文件 <filename>sys/boot/i386/boot0/boot0.S</filename>, 可以猜想这里面发生了什么 - 这是引导管理器, 一段由 Robert Nordier书写的令人起敬的程序片段。
A special structure starting from offset <literal>0x1be</literal> in the <acronym>MBR</acronym> is called the <emphasis>partition table</emphasis>. It has four records of 16 bytes each, called <emphasis>partition records</emphasis>, which represent how the hard disk is partitioned, or, in FreeBSD's terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise <filename>boot0</filename>'s code will refuse to proceed. MBR里,也就是<filename>boot0</filename>里, 从偏移量0x1be开始有一个特殊的结构,称为 <emphasis>分区表</emphasis>。其中有4条记录 (称为<emphasis>分区记录</emphasis>),每条记录16字节。 分区记录表示硬盘如何被划分,在FreeBSD的术语中, 这被称为slice(d)。16字节中有一个标志字节决定这个分区是否可引导。 有仅只能有一个分区可设定这一标志。否则, <filename>boot0</filename>的代码将拒绝继续执行。
A partition record descriptor contains information about where exactly the partition resides on the drive. Both descriptors, <acronym>LBA</acronym> and <acronym>CHS</acronym>, describe the same information, but in different ways: <acronym>LBA</acronym> (Logical Block Addressing) has the starting sector for the partition and the partition's length, while <acronym>CHS</acronym> (Cylinder Head Sector) has coordinates for the first and last sectors of the partition. The partition table ends with the special signature <literal>0xaa55</literal>. 一个分区记录描述符包含某一分区在硬盘上的确切位置信息。 LBA和CHS两种描述符指示相同的信息,但是指示方式有所不同:LBA (逻辑块寻址,Logical Block Addressing)指示分区的起始扇区和分区长度, 而CHS(柱面 磁头 扇区)指示首扇区和末扇区。
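A minimal C sketch of reading this structure from a 512-byte sector buffer follows. The offsets <literal>0x1be</literal>, the four 16-byte records, and the <literal>0xaa55</literal> signature come from the text above. The byte positions inside a record (flag at byte 0, type at byte 4, <acronym>LBA</acronym> start at bytes 8 to 11, length at bytes 12 to 15, with the <acronym>CHS</acronym> coordinates in bytes 1 to 3 and 5 to 7) and the active flag value <literal>0x80</literal> follow the conventional <acronym>MBR</acronym> layout and are assumptions here, as are all identifier names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_PART_OFFSET	0x1be	/* start of the partition table */
#define MBR_PART_COUNT	4	/* four partition records */
#define MBR_PART_SIZE	16	/* bytes per record */
#define MBR_SIG_OFFSET	(MBR_PART_OFFSET + MBR_PART_COUNT * MBR_PART_SIZE)
#define MBR_SIGNATURE	0xaa55
#define MBR_FLAG_ACTIVE	0x80	/* assumed value of the bootable flag */

/* Decoded subset of one partition record (CHS fields are skipped). */
struct mbr_part {
	uint8_t		flag;		/* bootable flag */
	uint8_t		type;		/* partition (slice) type */
	uint32_t	lba_start;	/* first sector of the partition */
	uint32_t	lba_count;	/* length in sectors */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t
le32(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Returns 1 if the sector ends with the 0xaa55 signature. */
static int
mbr_signature_ok(const uint8_t *sector)
{
	assert(sector != NULL);
	return ((sector[MBR_SIG_OFFSET] |
	    (sector[MBR_SIG_OFFSET + 1] << 8)) == MBR_SIGNATURE);
}

/* Decode partition record idx (0..3) into out. */
static void
mbr_read_part(const uint8_t *sector, int idx, struct mbr_part *out)
{
	const uint8_t *rec;

	assert(sector != NULL && out != NULL);
	assert(idx >= 0 && idx < MBR_PART_COUNT);
	rec = sector + MBR_PART_OFFSET + idx * MBR_PART_SIZE;
	out->flag = rec[0];
	out->type = rec[4];
	out->lba_start = le32(rec + 8);
	out->lba_count = le32(rec + 12);
}

/* Count the records whose bootable flag is set. */
static int
mbr_count_active(const uint8_t *sector)
{
	struct mbr_part p;
	int i, n = 0;

	for (i = 0; i < MBR_PART_COUNT; i++) {
		mbr_read_part(sector, i, &p);
		if (p.flag == MBR_FLAG_ACTIVE)
			n++;
	}
	return (n);
}
```

A caller could check <literal>mbr_signature_ok()</literal> first and then use <literal>mbr_count_active()</literal> to confirm that exactly one record is flagged as bootable.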
<literal>boot2</literal> defines an important structure, <literal>struct bootinfo</literal>. This structure is initialized by <literal>boot2</literal> and passed to the loader, and then further to the kernel. Some fields of this structure are set by <literal>boot2</literal>, the rest by the loader. This structure, among other information, contains the kernel filename, <acronym>BIOS</acronym> hard disk geometry, <acronym>BIOS</acronym> drive number for the boot device, physical memory available, <literal>envp</literal> pointer, etc. The definition for it is: boot2 定义了很重要的引导信息数据结构。此结构由 boot2 初始化,然后传递到加载程序,再传到内核。
The <literal>STATICMETHOD</literal> keyword is used like the <literal>METHOD</literal> keyword except the kobj data is not at the head of the object structure so casting to kobj_t would be incorrect. Instead <literal>STATICMETHOD</literal> relies on the Kobj data being referenced as 'ops'. This is also useful for calling methods directly out of a class's method table. 关键词<literal>STATICMETHOD</literal>类似关键词<literal>METHOD</literal>。对于每个Kobj对象,一般其头部都有一些Kobj专有的数据。<literal>METHOD</literal>定义的方法就假设这些专有数据位于对象头部;假如对象头部没有这些专有数据,这些方法对这个对象的访问就可能出错。而<literal>STATICMETHOD</literal>定义的对象可以不受这个限制:这样描述出的方法,其操作的数据不由这个类的某个对象实例给出,而是全都由调用这个方法时的操作数(译者注:即参数)给出。这也对于在某个类的方法表之外调用这个方法有用。
The class must be <quote>compiled</quote>. Depending on the state of the system at the time the class is to be initialized, a statically allocated cache, the <quote>ops table</quote>, may have to be used. This can be accomplished by declaring a <varname remap="structname">struct kobj_ops</varname> and using <function>kobj_class_compile_static()</function>; otherwise, <function>kobj_class_compile()</function> should be used. 类须被<quote>编译</quote>。根据该类被初始化时系统的状态,将要用到一个静态分配的缓存和<quote>操作数表</quote>(ops table,译者注:即<quote>参数表</quote>)。这些操作可通过声明一个结构体<varname remap="structname">struct kobj_ops</varname>并使用<function>kobj_class_compile_static()</function>,或是只使用<function>kobj_class_compile()</function>来完成。
On most <trademark class="registered">UNIX</trademark> systems, <literal>root</literal> has omnipotent power. This promotes insecurity. If an attacker gained <literal>root</literal> on a system, he would have every function at his fingertips. In FreeBSD there are sysctls which dilute the power of <literal>root</literal>, in order to minimize the damage caused by an attacker. Specifically, one of these functions is called <literal>secure levels</literal>. Similarly, another function which is present from FreeBSD 4.0 and onward, is a utility called <citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry>. <application>Jail</application> chroots an environment and sets certain restrictions on processes which are forked within the <application>jail</application>. For example, a jailed process cannot affect processes outside the <application>jail</application>, utilize certain system calls, or inflict any damage on the host environment. 在大多数<trademark class="registered">UNIX</trademark> 系统中,用户<literal>root</literal>是万能的。这也就增加了许多危险。如果一个攻击者获得了一个系统中的<literal>root</literal>,就可以在他的指尖掌握系统中所有的功能。在FreeBSD里,有一些sysctl项削弱了<literal>root</literal>的权限,这样就可以将攻击者造成的损害减小到最低限度。这些安全功能中,有一种叫安全级别。另一种在FreeBSD 4.0及以后版本中提供的安全功能,就是 <citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry>。<application>Jail</application>将一个运行环境的文件树根切换到某一特定位置,并且对这样环境中叉分生成的进程做出限制。例如,一个被监禁的进程不能影响这个<application>jail</application>之外的进程、不能使用一些特定的系统调用,也就不能对主计算机造成破坏。
As you can see, there is an entry for each of the arguments passed to the <citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry> program, and indeed, they are set during its execution. 正如你所见,传送给命令<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry>的每个参数都在这里有对应的一项。事实上,当命令<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry>被执行时,这些参数才由命令行真正传入。
The <citerefentry><refentrytitle>inet_aton</refentrytitle><manvolnum>3</manvolnum></citerefentry> function "interprets the specified character string as an Internet address, placing the address into the structure provided." The <literal>ip_number</literal> member in the <literal>jail</literal> structure is set only when the IP address placed onto the <literal>in</literal> structure by <citerefentry><refentrytitle>inet_aton</refentrytitle><manvolnum>3</manvolnum></citerefentry> is translated into host byte order by <citerefentry><refentrytitle>ntohl</refentrytitle><manvolnum>3</manvolnum></citerefentry>. 函数<citerefentry><refentrytitle>inet_aton</refentrytitle><manvolnum>3</manvolnum></citerefentry>“将指定的字符串解释为一个Internet地址,并将其转存到指定的结构体中”。<citerefentry><refentrytitle>inet_aton</refentrytitle><manvolnum>3</manvolnum></citerefentry>设定了结构体in,之后in中的内容再用<citerefentry><refentrytitle>ntohl</refentrytitle><manvolnum>3</manvolnum></citerefentry>转换成主机字节顺序,并置入<literal>jail</literal>结构体的<literal>ip_number</literal>成员。
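The two conversion steps described above can be sketched in a few lines of C. This is not the code from <filename>jail.c</filename>; <literal>parse_ip_host_order()</literal> is a hypothetical helper, and a plain <literal>uint32_t</literal> stands in for the <literal>ip_number</literal> member.

```c
#define _DEFAULT_SOURCE	/* expose inet_aton() on glibc under strict -std= modes */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Parse a dotted-quad string and store the address in host byte order.
 * Returns 0 on success, -1 if the string is not a valid address.
 */
static int
parse_ip_host_order(const char *text, uint32_t *ip_number)
{
	struct in_addr in;

	assert(text != NULL && ip_number != NULL);
	/* inet_aton() fills 'in' in network byte order; 0 means failure. */
	if (inet_aton(text, &in) == 0)
		return (-1);
	/* ntohl() converts the address to host byte order. */
	*ip_number = ntohl(in.s_addr);
	return (0);
}
```

Checking the return value of <literal>inet_aton()</literal> before the conversion keeps an invalid address string from producing a meaningless number.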
Finally, the userland program jails the process. <application>Jail</application> now becomes an imprisoned process itself and then executes the command given using <citerefentry><refentrytitle>execv</refentrytitle><manvolnum>3</manvolnum></citerefentry>. 最后,用户级程序囚禁进程。现在Jail自身变成了一个被囚禁的进程,并使用<citerefentry><refentrytitle>execv</refentrytitle><manvolnum>3</manvolnum></citerefentry>执行用户指定的命令。
As you can see, the <literal>jail()</literal> function is called, and its argument is the <literal>jail</literal> structure which has been filled with the arguments given to the program. Finally, the program you specify is executed. I will now discuss how <application>jail</application> is implemented within the kernel. 正如你所见,函数<literal>jail()</literal>被调用,参数是结构体<literal>jail</literal>中被填入数据项,而如前所述,这些数据项又来自<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>8</manvolnum></citerefentry>的命令行参数。最后,执行了用户指定的命令。下面我将开始讨论<literal>jail</literal>在内核中的实现。
Like all system calls, the <citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry> system call takes two arguments, <literal>struct thread *td</literal> and <literal>struct jail_args *uap</literal>. <literal>td</literal> is a pointer to the <literal>thread</literal> structure which describes the calling thread. In this context, <literal>uap</literal> is a pointer to the structure in which a pointer to the <literal>jail</literal> structure passed by the userland <filename>jail.c</filename> is contained. When I described the userland program before, you saw that the <citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry> system call was given a <literal>jail</literal> structure as its own argument. 像所有的系统调用一样,系统调用<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry>带有两个参数,<literal>struct thread *td</literal>和<literal>struct jail_args *uap</literal>。<literal>td</literal>是一个指向<literal>thread</literal>结构体的指针,该指针用于描述调用<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry>的线程。在这个上下文中,<literal>uap</literal>指向一个结构体,这个结构体中包含了一个指向从用户级<filename>jail.c</filename>传送过来的<literal>jail</literal>结构体的指针。在前面我讲述用户级程序时,你已经看到过一个<literal>jail</literal>结构体被作为参数传送给系统调用<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry>。
Therefore, <literal>uap-&gt;jail</literal> can be used to access the <literal>jail</literal> structure which was passed to the system call. Next, the system call copies the <literal>jail</literal> structure into kernel space using the <citerefentry><refentrytitle>copyin</refentrytitle><manvolnum>9</manvolnum></citerefentry> function. <citerefentry><refentrytitle>copyin</refentrytitle><manvolnum>9</manvolnum></citerefentry> takes three arguments: the address of the data which is to be copied into kernel space, <literal>uap-&gt;jail</literal>; where to store it, <literal>j</literal>; and the size of the storage. The <literal>jail</literal> structure pointed to by <literal>uap-&gt;jail</literal> is copied into kernel space and is stored in another <literal>jail</literal> structure, <literal>j</literal>. 于是<literal>uap-&gt;jail</literal>可以用于访问被传递给<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry>的<literal>jail</literal>结构体。然后,<citerefentry><refentrytitle>jail</refentrytitle><manvolnum>2</manvolnum></citerefentry>使用<citerefentry><refentrytitle>copyin</refentrytitle><manvolnum>9</manvolnum></citerefentry>将<literal>jail</literal>结构体复制到内核内存空间中。<citerefentry><refentrytitle>copyin</refentrytitle><manvolnum>9</manvolnum></citerefentry>需要三个参数:要复制进内核内存空间的数据的地址<literal>uap-&gt;jail</literal>,在内核内存空间存放数据的<literal>j</literal>,以及数据的大小。<literal>uap-&gt;jail</literal>指向的Jail结构体被复制进内核内存空间,并被存放在另一个<literal>jail</literal>结构体<literal>j</literal>里。
There is another important structure defined in <filename>jail.h</filename>. It is the <literal>prison</literal> structure. The <literal>prison</literal> structure is used exclusively within kernel space. Here is the definition of the <literal>prison</literal> structure. 在jail.h中定义了另一个重要的结构体型prison。结构体<literal>prison</literal>只被用在内核空间中。下面是<literal>prison</literal>结构体的定义。
Next, we will discuss another important system call <citerefentry><refentrytitle>jail_attach</refentrytitle><manvolnum>2</manvolnum></citerefentry>, which implements the function to put a process into the <application>jail</application>. 下面,我们将讨论另外一个重要的系统调用<citerefentry><refentrytitle>jail_attach</refentrytitle><manvolnum>2</manvolnum></citerefentry>,它实现了将进程监禁的功能。