This is the approach we will use in our program because it is very easy to do in assembly language, and very fast, too. We will simply keep the address of the right procedure in <varname role="register">EBX</varname>, and then just issue:
call ebx
This is possibly faster than hardcoding the address in the code because the microprocessor does not have to fetch the address from the memory—it is already stored in one of its registers. I said <emphasis>possibly</emphasis> because with the caching modern microprocessors do, either way may be equally fast.
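For readers who think in C, keeping a procedure address in <varname role="register">EBX</varname> and issuing <function role="opcode">call</function> through it corresponds roughly to calling through a function pointer. The following is only a sketch of that idea: the handler names and the helper functions are hypothetical and are not part of the tutorial's program.

```c
#include <stddef.h>

/* Hypothetical handlers, invented for this illustration only. */
static int to_upper_ascii(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

static int identity(int c)
{
    return c;
}

typedef int (*handler_fn)(int);

/* Decide once which procedure to use (the "load EBX" step). */
handler_fn choose_handler(int want_upper)
{
    return want_upper ? to_upper_ascii : identity;
}

/* Call through the stored address for every byte (the "call ebx" step). */
void apply_handler(handler_fn fn, char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)fn((unsigned char)buf[i]);
}
```

The decision about which procedure to run is made once, outside the loop; inside the loop there is only an indirect call.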
Memory Mapped Files
Because our program works on a single file, we cannot use the approach that worked for us before, i.e., to read from an input file and to write to an output file.
<trademark class="registered">UNIX</trademark> allows us to map a file, or a section of a file, into memory. To do that, we first need to open the file with the appropriate read/write flags. Then we use the <function role="syscall">mmap</function> system call to map it into the memory. One nice thing about <function role="syscall">mmap</function> is that it automatically works with virtual memory: We can map more of the file into the memory than we have physical memory available, yet still access it through regular memory op codes, such as <function role="opcode">mov</function>, <function role="opcode">lods</function>, and <function role="opcode">stos</function>. Whatever changes we make to the memory image of the file will be written to the file by the system. We do not even have to keep the file open: As long as it stays mapped, we can read from it and write to it.
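The sequence described above (open, map, optionally close, modify, unmap) can be sketched in C through the <citerefentry><refentrytitle>mmap</refentrytitle><manvolnum>2</manvolnum></citerefentry> interface. This is a minimal illustration, not the code our program will use: the helper name <function>poke_first_byte</function> is hypothetical, and it assumes the file already holds at least <varname>len</varname> bytes with <varname>len</varname> greater than zero.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite the first byte of the file at `path`
 * through a shared mapping of its first `len` bytes.
 * Returns 0 on success, -1 on failure.
 */
int poke_first_byte(const char *path, size_t len, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The descriptor is no longer needed once the mapping exists. */
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    p[0] = value;           /* an ordinary memory store changes the file */

    return munmap(p, len);  /* unmap when finished */
}
```

Because the mapping is <varname>MAP_SHARED</varname>, the store through <varname>p</varname> ends up in the file itself rather than in a private copy.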
The 32-bit Intel microprocessors can access up to four gigabytes of memory, physical or virtual. The FreeBSD system allows us to use up to half of it for file mapping.
For simplicity's sake, in this tutorial we will only convert files that can be mapped into the memory in their entirety. There are probably not too many text files that exceed two gigabytes in size. If our program encounters one, it will simply display a message suggesting we use the original <application>tuc</application> instead.
If you examine your copy of <filename>syscalls.master</filename>, you will find two separate syscalls named <function role="syscall">mmap</function>. This is because of the evolution of <trademark class="registered">UNIX</trademark>: There was the traditional <acronym>BSD</acronym> <function role="syscall">mmap</function>, syscall 71. That one was superseded by the <acronym><trademark class="registered">POSIX</trademark></acronym> <function role="syscall">mmap</function>, syscall 197. The FreeBSD system supports both because older programs were written using the original <acronym>BSD</acronym> version. But new software uses the <acronym><trademark class="registered">POSIX</trademark></acronym> version, which is what we will use.
The <filename>syscalls.master</filename> lists the <acronym><trademark class="registered">POSIX</trademark></acronym> version like this:
197 STD BSD { caddr_t mmap(caddr_t addr, size_t len, int prot, \
int flags, int fd, long pad, off_t pos); }
This differs slightly from what <citerefentry><refentrytitle>mmap</refentrytitle><manvolnum>2</manvolnum></citerefentry> says. That is because <citerefentry><refentrytitle>mmap</refentrytitle><manvolnum>2</manvolnum></citerefentry> describes the C version.
The difference is in the <varname>long pad</varname> argument, which is not present in the C version. However, the FreeBSD syscalls add a 32-bit pad after <function role="opcode">push</function>ing a 64-bit argument. In this case, <varname>off_t</varname> is a 64-bit value.
When we are finished working with a memory-mapped file, we unmap it with the <function role="syscall">munmap</function> syscall.
For an in-depth treatment of <function role="syscall">mmap</function>, see W. Richard Stevens' <link xlink:href="">Unix Network Programming, Volume 2, Chapter 12</link>.
Determining File Size
Because we need to tell <function role="syscall">mmap</function> how many bytes of the file to map into the memory, and because we want to map the entire file, we need to determine the size of the file.
We can use the <function role="syscall">fstat</function> syscall to get all the information about an open file that the system can give us. That includes the file size.
Again, <filename>syscalls.master</filename> lists two versions of <function role="syscall">fstat</function>, a traditional one (syscall 62), and a <acronym><trademark class="registered">POSIX</trademark></acronym> one (syscall 189). Naturally, we will use the <acronym><trademark class="registered">POSIX</trademark></acronym> version:
189 STD POSIX { int fstat(int fd, struct stat *sb); }
This is a very straightforward call: We pass to it the address of a <varname remap="structname">stat</varname> structure and the descriptor of an open file. It will fill out the contents of the <varname remap="structname">stat</varname> structure.
I do, however, have to say that I tried to declare the <varname remap="structname">stat</varname> structure in the <varname>.bss</varname> section, and <function role="syscall">fstat</function> did not like it: It set the carry flag indicating an error. After I changed the code to allocate the structure on the stack, everything was working fine.
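Seen from C, the <function role="syscall">fstat</function> call described above takes only a few lines. The sketch below uses a hypothetical helper name, <function>file_size</function>, and keeps the <varname remap="structname">stat</varname> structure on the stack, which is also the arrangement that worked in the assembly version.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the size in bytes of the open file `fd`,
 * or -1 if fstat reports an error.
 */
off_t file_size(int fd)
{
    struct stat sb;     /* allocated on the stack */

    if (fstat(fd, &sb) != 0)
        return -1;

    return sb.st_size;  /* a 64-bit off_t on FreeBSD */
}
```

The value returned here is what we would then hand to <function role="syscall">mmap</function> as the length to map.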
Changing the File Size
Because our program may combine carriage return / line feed sequences into straight line feeds, our output may be smaller than our input. And since we are placing our output into the same file we read the input from, we may have to change the size of the file.
The <function role="syscall">ftruncate</function> system call allows us to do just that. Despite its somewhat misleading name, the <function role="syscall">ftruncate</function> system call can be used both to truncate the file (make it smaller) and to grow it.
And yes, we will find two versions of <function role="syscall">ftruncate</function> in <filename>syscalls.master</filename>, an older one (130), and a newer one (201). We will use the newer one:
201 STD BSD { int ftruncate(int fd, int pad, off_t length); }
Please note that this one contains an <varname>int pad</varname> again.
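To see both directions of <function role="syscall">ftruncate</function> at work, here is a small C sketch. The helper name <function>resize_file</function> is hypothetical. Note that the C prototype has no <varname>pad</varname> argument: the padding only shows up when we make the raw system call ourselves, as we do in assembly language.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set the size of the open file `fd` to `new_size`
 * and return the size the system reports afterwards, or -1 on error.
 * Works whether `new_size` is smaller or larger than the current size.
 */
off_t resize_file(int fd, off_t new_size)
{
    struct stat sb;

    if (ftruncate(fd, new_size) != 0)
        return -1;

    if (fstat(fd, &sb) != 0)
        return -1;

    return sb.st_size;
}
```

For example, a file holding the seven bytes of "hello" followed by a carriage return and a line feed could be shrunk to six bytes once the carriage return has been squeezed out.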
We now know everything we need to write <application>ftuc</application>. We start by adding some new lines in <filename></filename>. First, we define some constants and structures, somewhere at or near the beginning of the file:
;;;;;;; open flags
%define O_RDONLY 0
%define O_WRONLY 1
%define O_RDWR 2

;;;;;;; mmap flags
%define PROT_NONE 0
%define PROT_READ 1
%define PROT_WRITE 2
%define PROT_EXEC 4
%define MAP_SHARED 0001h
%define MAP_PRIVATE 0002h

;;;;;;; stat structure
struc stat
st_dev resd 1 ; = 0
st_ino resd 1 ; = 4
st_mode resw 1 ; = 8, size is 16 bits
st_nlink resw 1 ; = 10, ditto
st_uid resd 1 ; = 12
st_gid resd 1 ; = 16
st_rdev resd 1 ; = 20
st_atime resd 1 ; = 24
st_atimensec resd 1 ; = 28
st_mtime resd 1 ; = 32
st_mtimensec resd 1 ; = 36
st_ctime resd 1 ; = 40
st_ctimensec resd 1 ; = 44
st_size resd 2 ; = 48, size is 64 bits
st_blocks resd 2 ; = 56, ditto
st_blksize resd 1 ; = 64
st_flags resd 1 ; = 68
st_gen resd 1 ; = 72
st_lspare resd 1 ; = 76
st_qspare resd 4 ; = 80
endstruc

