*********************************************************************** Cluster-based scalable network services *********************************************************************** This paper describes architecture for a cluster of workstations and describes the implementation of this architecture for two services. The main ideas of the architecture presented are layered design, centralized load balancing and distributed request handling. Layered design: In this design, parsing of requests is separated from load distribution, data transformation services, and data retrieval services. This allows specialization of nodes for a particular type of operation and makes implementation of each layer simple. Specialization also helps to improve performance of instruction and data caches, thereby increasing the throughput. Centralized load balancing: The load balancing is performed by a centralized manager that has global information about the state of the cluster. The experimental results showed that one manager can support a cluster of reasonable size. Distributed request handling: The incoming requests are handled by a number of front-ends that are equipped with manager stubs. Manager stubs contact the centralized manager, who makes load distribution decisions based on the load information that it has and based on use profile data that can be looked up in a database. The manager chooses a particular worker to handle the request. There is also a number of nodes in the cluster that serve as caches. I found this paper difficult to evaluate for the following reasons: 1. It described a cluster-based solution for one specific service - TranSend. Their solution is also used by HotBot, but in a very limited way - it does not use some of the main features, such as automatic load distribution (the database is partitioned statically). 2. The description of the system was very high-level. The authors talk about automatically choosing workers for requests, but they do not go into enough detail of how particularly the workers are chosen. 3. The authors do not do any experimental comparison of their work with any other cluster-based solution. This is in part understandable, because it would be very difficult to port TranSend onto another cluster architecture. But they could at least run some synthetic benchmarks to present rough comparison numbers. There are a few ideas in the paper that I found interesting. 1. This system has a robust way of handling fault-tolerance: worker units, distillers and managers cooperate to detect failures and restart each other in the event of a failure. 2. The idea of using idle user workstations as a pool of overflow machines is interesting - these machines are used only when the load is critically high, so the users are not disturbed all the time. But this overflow pool can serve as a nice cushion, which can make transition to a more powerful cluster very smooth. Even though the description of the system together with the presented results did not make me confident that this would actually work, the fact that it is actually used by real companies was convincing. Also, I think that there is some value in sharing experience with a complex system like this one. However, this is not a research paper as I see it, because it does not provide a well-studied solution to a more-or-less general problem, but describes a specific implementation of a particular system. Cluster-Based Scalable Network Services (Berkely, 1997) Jonathan Ledlie CS 736 March 29, 2000 Clusters are to workstations as RAID is to JBOD. Clusters and RAID offer more power and scalability at less cost than their more hardware-advanced counterparts, namely SMP and expensive, large disks. Both, however, are hard to manage effectively. This paper offers one solution to managing and building for clusters. The authors' solution provides the user (a programmer) with a series of reusable building blocks to harness the advantages of clusters and ameliorate the difficulties in shared state, load balancing, and monitoring. They concentrate on the read-dominated, user-oriented side of the Internet, which allows them to reduce the concurrency problem inherient in distributed design; if a worker temporarily uses stale data, it is ok in their semantics. They call this reduction BASE (basically available, soft state, eventual consistency). In addition to this simplification, they provide a nice, hierarchical API, which allows users to use parts at the top (Service) level for most projects, but then delve into the TACC layer if they need more customization. Finally, they concentrated on the burstiness of most Internet activity, by allowing an administrator to designate overflow machines to temporarily handle high workloads. The lowest layer, SNS (Scalable Network Service), spreads on to these extra machines when needed. I find three major flaws in their work. First, there is no mention of security, particularly in the overflow situations which appear to run on other people's workstations. Second "the front end spends more than 70% of its time in the kernel" -- not a good sign. Third, and most significantly, more and more Internet operations iarei requiring scalable ACID semantics, B2B virtual markets for example, where having stale data does not work. We would need a new set of building blocks for this growing number of situations. NOTES FROM CS261 What do you need in an Internet cluster? 1) incremental scalability 2) fault-tolerant (24x7) 3) cheap Terminology: - rathole - embarrisingly parallel - forklift upgrade