Vanishing Features of the 2.6 Kernel

by Jerry Cooperstein

Many developers are eagerly awaiting the 2.6 Linux kernel. The feature freeze has passed, with a code freeze planned for January and final release slated for the second quarter of 2003. There is considerable excitement about anticipated enhancements, especially regarding scalability and performance.

However, some developers may first notice what doesn't work anymore. Some techniques and APIs have been removed, and existing device drivers and modular plugins may no longer work. Meanwhile, it will take time to take advantage of new features and to find replacements for old ones.

Some deprecated techniques, such as task queues, have finally been eliminated. Other facilities, including in-kernel Web acceleration, have been supplanted by newer advances. Other changes, notably banishing the system call table from the list of exported symbols available to modules, have flowed more from philosophical and licensing issues than from technical considerations.

Export of the System Call Table

The Linux kernel has a monolithic architecture; it is one big program. All parts of the kernel are visible to each other unless their scope has been explicitly limited. Arguments are passed on the stack, as in any other C program. At the same time, Linux makes extensive use of modules: facilities that may be loaded and unloaded dynamically. (These are often, but not always, device drivers.) Modules can see only explicitly exported symbols (functions, variables, etc.). Unless the kernel or a previously loaded module includes the statement EXPORT_SYMBOL(foobar);, a module cannot refer to foobar().

Extensive modularization does not render the kernel any less monolithic. The critical difference between monolithic kernels and microkernels stems from how components communicate with each other. As long as the Linux kernel prefers function calls to message passing, its basic structure will remain monolithic.

The system call table is a vector containing the addresses of the functions executed whenever a system call is made from user space. When a system call is invoked, the kernel receives the number of the call and the arguments themselves. The arguments arrive in registers; they're not passed on the stack. The kernel uses the call number as an index into the table and jumps to the address stored there to execute the system call.
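The dispatch step can be mimicked in ordinary user-space C. The following is only a sketch of the idea, not kernel code: the call numbers, the handler names, and the fixed three-argument signature are all invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical call numbers for this sketch; a real kernel uses __NR_* constants. */
enum { DEMO_NR_ADD = 0, DEMO_NR_NEGATE = 1, DEMO_NR_CALLS = 2 };

/* Every handler receives the same fixed set of "register" arguments. */
typedef long (*demo_call_fn)(long a0, long a1, long a2);

static long demo_add(long a0, long a1, long a2)
{
    (void)a2;                 /* unused by this call */
    return a0 + a1;
}

static long demo_negate(long a0, long a1, long a2)
{
    (void)a1;
    (void)a2;
    return -a0;
}

/* The table: one handler address per call number. */
static demo_call_fn demo_call_table[DEMO_NR_CALLS] = {
    [DEMO_NR_ADD]    = demo_add,
    [DEMO_NR_NEGATE] = demo_negate,
};

/* Dispatch: validate the number, index the table, call through the pointer. */
long demo_dispatch(unsigned long nr, long a0, long a1, long a2)
{
    if (nr >= DEMO_NR_CALLS || demo_call_table[nr] == NULL)
        return -1;            /* stand-in for an "unknown call" error */
    return demo_call_table[nr](a0, a1, a2);
}
```

Calling demo_dispatch(DEMO_NR_ADD, 2, 3, 0) looks up slot 0 and runs demo_add(). An out-of-range number never reaches the table.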

Exporting the system call table allows modules to substitute system calls with replacements of their own devising. Replacing the kernel's basic read() system call requires only a simple code fragment:

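extern void *sys_call_table[];            /* the table exported by the kernel */

read_save = sys_call_table[__NR_read];    /* save the original handler */
sys_call_table[__NR_read] = read_sub;     /* install the replacement */

where read_sub() has been defined somewhere in the module and read_save is a module variable holding the pointer to the original system call, so that it can be restored upon module unloading:

sys_call_table[__NR_read] = read_save;    /* restore the original on unload */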
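The whole save, replace, and restore cycle can be tried out safely in user space. The sketch below reuses the names read_save and read_sub from the fragment above; everything else (the one-entry table, demo_module_init(), demo_module_exit(), and the call counter) is hypothetical scaffolding, and no locking is attempted.

```c
#include <stddef.h>

typedef long (*demo_call_fn)(long n);

enum { DEMO_NR_READ = 0, DEMO_NR_CALLS = 1 };

/* Stand-in for the kernel's own handler: it just echoes its argument. */
static long demo_read(long n)
{
    return n;
}

/* Stand-in for the exported table. */
static demo_call_fn demo_call_table[DEMO_NR_CALLS] = { demo_read };

static demo_call_fn read_save;   /* pointer to the original handler */
static long read_sub_calls;      /* how many times the replacement ran */

/* The replacement: count the call, then delegate to the original. */
static long read_sub(long n)
{
    read_sub_calls++;
    return read_save(n);
}

/* What a module would do when loaded: save the old entry, install the new one. */
int demo_module_init(void)
{
    read_save = demo_call_table[DEMO_NR_READ];
    demo_call_table[DEMO_NR_READ] = read_sub;
    return 0;
}

/* What a module would do when unloaded: put the original entry back. */
void demo_module_exit(void)
{
    demo_call_table[DEMO_NR_READ] = read_save;
}

/* Helpers so a caller can exercise the table and inspect the counter. */
long demo_invoke_read(long n)
{
    return demo_call_table[DEMO_NR_READ](n);
}

long demo_wrapper_calls(void)
{
    return read_sub_calls;
}
```

Between demo_module_init() and demo_module_exit(), every demo_invoke_read() passes through read_sub(). Afterward the table again points at the original handler.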

So what is wrong with this technique?

On the practical side, it is easy to incur race conditions, especially on multi-processor systems where the replacement happens while an application is using the system call. Various locking techniques can offer some protection, but the details are non-trivial. However, the abolition of this method is not primarily due to practical difficulties.

Some system calls penetrate deep into the kernel's heart. Binary-only modules, whose source is not available under a GPL-compatible license, have enjoyed the use of this technique, because exported symbols have been visible to all modules.

The rules governing binary modules and GPL violations have always been fuzzy. Some argue that it is permissible for any such module to restrict itself to exported symbols. Others maintain it depends on whether or not the module fiddles with core kernel facilities. The line between central and peripheral matters has always been very gray.

To sharpen this delineation, the 2.4.10 version of modutils, the package that handles loading and unloading of modules, introduced module licenses. In addition, the EXPORT_SYMBOL_GPL macro, introduced in the 2.4.11 kernel, created two classes of exported symbols. Only modules with an acceptable open-source license can access symbols exported under the GPL. All previously exported symbols were grandfathered in.
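The two classes of exported symbols can be pictured as a lookup that consults the requesting module's license. This is a user-space sketch of the policy only, not the kernel's or modutils' actual implementation: the struct layout, the function names, the symbol internal_helper, and the short list of accepted license strings are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* One entry in a hypothetical export table. */
struct demo_symbol {
    const char *name;
    void       *addr;
    int         gpl_only;   /* 1 = exported in the GPL-only class */
};

static int foobar_value;
static int internal_value;

static const struct demo_symbol demo_exports[] = {
    { "foobar",          &foobar_value,   0 },  /* ordinary export */
    { "internal_helper", &internal_value, 1 },  /* GPL-only export */
};

/* Illustrative check; the set of accepted strings here is an assumption. */
int demo_license_is_gpl_compatible(const char *license)
{
    return strcmp(license, "GPL") == 0 ||
           strcmp(license, "Dual BSD/GPL") == 0;
}

/* Resolve a symbol on behalf of a module carrying the given license string. */
void *demo_resolve(const char *module_license, const char *name)
{
    size_t i;

    for (i = 0; i < sizeof demo_exports / sizeof demo_exports[0]; i++) {
        const struct demo_symbol *s = &demo_exports[i];

        if (strcmp(s->name, name) != 0)
            continue;
        if (s->gpl_only && !demo_license_is_gpl_compatible(module_license))
            return NULL;    /* symbol exists, but not for this module */
        return s->addr;
    }
    return NULL;            /* not exported at all */
}
```

A module declaring a proprietary license can still resolve foobar but not internal_helper. A symbol missing from the table cannot be resolved by anyone, whatever the license.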

This led to some loud arguments. Perhaps if the macro had been called EXPORT_SYMBOL_INTERNAL, it would have shown an intent of differentiating between modules implementing central and peripheral kernel facilities, rather than making a choice based on the kernel programmer's licensing philosophy.

Exporting the table with EXPORT_SYMBOL_GPL(sys_call_table) would have satisfied many objections. Instead, the more draconian choice was made: the system call table is no longer exported at all. Red Hat did this in the patched 2.4.18 kernel shipped with Red Hat Linux 8.0, and Linus Torvalds did the same in the 2.5.41 development kernel. As a result, a module can no longer replace a system call through the simple code above. In its place comes support for registering new system calls dynamically, a feature that may continue to grow.

Most observers foresee a tightening of the limits on binary modules. This may very well break some rather expensive commercial Linux products, but that doesn't seem to bother most kernel developers. Reminding the purveyors of binary modules that they continue to operate at the pleasure of the Linux kernel developers and their open-source licenses is seen to be a necessary (even enjoyable) task. It has probably always been true that the only way to protect investment in Linux deployment of drivers and other kernel facilities (not applications) is to go open source, even if that is difficult for commercial enterprises to absorb. Recent developments seem to re-emphasize this.
