Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
T. E. Anderson et al.

This paper argues that user-level threads inherently have better performance and more flexibility than kernel threads, and it proposes scheduler activations as a kernel mechanism for supporting user-level threads.

According to the paper, there are problems with current implementations of both kernel and user-level threads. Kernel threads are too heavyweight, causing performance problems. User-level threads are inexpensive, but can cause unfairness in scheduling when, for instance, they block on I/O or take a page fault. In systems where user-level threads are built on top of kernel threads, the kernel has to choose which kernel threads to schedule or preempt without knowing anything about the state of the user-level threads running on them.

To avoid these problems, the paper proposes a system in which the user-level thread system controls scheduling but also communicates with the kernel about changes in the system. Communication between the kernel and the user-level thread system is handled through scheduler activations. A scheduler activation has one execution stack in the kernel and one in the application address space. When the kernel needs to notify the user-level thread system that a change has occurred, such as a change in the number of processors allocated to the address space, it creates an activation and upcalls into the application address space. When a user-level thread needs to call into the kernel (to request more processors or to relinquish a processor, for instance), it uses the activation's kernel stack. The kernel creates a new activation each time it needs to make an upcall.

The measurements compare three versions of the system: the original FastThreads on top of Topaz kernel threads, the new FastThreads on top of scheduler activations, and unmodified Topaz. The new FastThreads has somewhat higher thread operation latencies; the paper attributes this to the Topaz kernel thread code being written in optimized assembler. All of the other measurements, of the average speedup of applications, show that FastThreads with activations performs better than the other two systems, with efficiency increasing as the number of processors grows.

Overall, this paper is a convincing argument for the use of user-level threads. It clearly identifies the problems with existing systems and proposes a relatively elegant way to solve them. Scheduler activations enable the necessary communication between the kernel and applications while avoiding two main pitfalls. They are not bound to a particular thread, giving the kernel the flexibility to create or free them as needed. In addition, they provide a low-cost way for the kernel to communicate with user-level threads in the infrequent cases when this is necessary.


Scheduler Activations
Jonathan Ledlie
CS 736
February 25, 2000

In theory, threads allow users to achieve the parallelism of multiple processes without the overhead of heavyweight context switches or the difficulty of cross-process memory sharing. In practice, a user's thread implementation is often layered on kernel threads that do not share enough information with the user level to preserve the user's scheduling policies or speed. Some number of kernel threads are usually tied to a particular process, which is left uninformed when they block on I/O, for example. Thus there are conditions where the user process has threads in its ready queue, yet none of them run because their kernel counterparts are sleeping.
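To make that failure mode concrete, here is a minimal sketch (hypothetical code, not taken from the paper): two user-level threads built with ucontext(3) share a single kernel thread, so a ready thread is starved whenever its sibling blocks in the kernel.

    /* Two user-level "threads" multiplexed on one kernel thread.  When A blocks
     * in the kernel, the user-level scheduler is never told, so B sits ready
     * but idle until A's kernel thread wakes up again. */
    #include <stdio.h>
    #include <unistd.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, a_ctx, b_ctx;
    static char a_stack[65536], b_stack[65536];

    static void thread_a(void) {
        printf("A: blocking in the kernel (simulated I/O wait)\n");
        sleep(2);                      /* the whole kernel thread sleeps here */
        printf("A: back from the kernel\n");
    }

    static void thread_b(void) {
        printf("B: finally running; I was ready the entire time A was blocked\n");
    }

    static void make_thread(ucontext_t *ctx, char *stack, size_t size, void (*fn)(void)) {
        getcontext(ctx);
        ctx->uc_stack.ss_sp = stack;
        ctx->uc_stack.ss_size = size;
        ctx->uc_link = &main_ctx;      /* return to the "scheduler" when done */
        makecontext(ctx, fn, 0);
    }

    int main(void) {
        make_thread(&a_ctx, a_stack, sizeof a_stack, thread_a);
        make_thread(&b_ctx, b_stack, sizeof b_stack, thread_b);
        swapcontext(&main_ctx, &a_ctx);   /* A runs and blocks; B is starved   */
        swapcontext(&main_ctx, &b_ctx);   /* B runs only after A's call returns */
        return 0;
    }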
To solve this problem, the University of Washington group proposes eliminating kernel threads entirely and replacing them with what amounts to an array of virtual CPUs. The process can then decide what to do with the CPUs it has available. This abstraction, provided in the form of scheduler activations, supplies the mechanism and leaves the actual scheduling policy up to the process. Beyond their basic idea, which is a good one, the Washington group provides facilities for processes to ask for more virtual processors and to relinquish them; these requests are hints which the OS can ignore. A process can also inform the OS of what kind of work it plans to do: I/O-bound or CPU-intensive. As mentioned above, their idea allows distinct scheduling per process, which is a good thing. Negative points include:
1) the kernel does not appear to take this information into account in its global scheduling policy (which might give even better overall performance);
2) multiple cooperating processes (as in a resource container) are looked at separately (but, then again, that paper is more recent);
3) their tests seemed to need by-hand programming for critical sections, although this was unclear.
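For reference, the kernel/user communication described above reduces to a handful of upcalls into the address space and hint calls back into the kernel. The following is a rough, compilable sketch; the function names loosely paraphrase the interface in the paper, while the signatures and the simulated "kernel" in main() are illustrative assumptions, not an actual system API.

    #include <stdio.h>

    typedef struct { long pc, sp; } machine_state_t;   /* placeholder for saved registers */

    /* Upcalls: entry points in the user-level thread system.  The kernel creates
     * a fresh scheduler activation and vectors into one of these whenever
     * something the user-level scheduler needs to know about happens. */
    void add_this_processor(int cpu) {
        printf("upcall: new processor %d; run a ready user-level thread on it\n", cpu);
    }
    void activation_has_blocked(int act) {
        printf("upcall: activation %d blocked in the kernel; schedule another thread\n", act);
    }
    void activation_has_unblocked(int act, machine_state_t st) {
        printf("upcall: activation %d unblocked (pc=0x%lx); its thread is runnable again\n", act, st.pc);
    }
    void processor_has_been_preempted(int act) {
        printf("upcall: activation %d preempted; put its thread back on the ready list\n", act);
    }

    /* Downcalls: calls from the address space to the kernel, treated as hints
     * the kernel is free to ignore. */
    void add_more_processors(int n)   { printf("downcall: please allocate %d more processors\n", n); }
    void this_processor_is_idle(void) { printf("downcall: this processor has no ready work\n"); }

    int main(void) {
        /* One plausible sequence of events for a single address space. */
        add_more_processors(1);                        /* process asks for a virtual CPU   */
        add_this_processor(0);                         /* kernel grants it via an upcall   */
        activation_has_blocked(1);                     /* a thread blocks on I/O           */
        activation_has_unblocked(1, (machine_state_t){ 0x4000, 0x7fff0000 });
        this_processor_is_idle();                      /* nothing left to run; hand it back */
        return 0;
    }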