
7. Application Scenarios

The work on this thesis was initially motivated by our research on the 2K operating system [KCM+00]. During the early phases of that research, we identified proper dependence management as a key factor in the development of robust operating system and middleware architectures. We begin this chapter by describing dynamicTAO, a fundamental part of 2K, and explaining how its internal architecture applies the ideas defended in this thesis.

Section 7.2 focuses on a Multimedia Distribution System we developed some years ago. It shows how we recently enhanced the system, using the infrastructure for automatic configuration and component configurators to enable dynamic reconfiguration in the presence of failures.

In Section 7.3, we refer to systems being developed by other researchers that are applying some of the ideas presented in this thesis.


7.1 dynamicTAO

One of the major constituent elements of the 2K distributed operating system [KSC+98,CNM98,KCM+00] is a reflective middleware infrastructure based on CORBA. After carefully studying existing CORBA Object Request Brokers, we came to the conclusion that the TAO ORB [SC99] would be the best starting point for developing this infrastructure.

TAO is a portable, flexible, extensible, and configurable ORB based on design patterns. It uses the Strategy design pattern [GHJV95] to encapsulate different aspects of the ORB internal engine. A configuration file is used to specify the strategies the ORB uses to implement aspects such as concurrency, request demultiplexing, scheduling, and connection management. At ORB startup time, the configuration file is parsed and the selected strategies are loaded.

TAO is primarily targeted at static, hard real-time applications such as avionics systems [HLS97a]. Thus, it assumes that, once the ORB is initially configured, its strategies will remain in place until it completes its execution. There is no support for on-the-fly reconfiguration.

On-the-fly adaptation is extremely important for a wide range of applications, including those dealing with multimedia, mobile computers, and dynamically changing environments. To support this kind of dynamic adaptation, we developed dynamicTAO [RKC99,KRL+00], an extension of TAO that enables on-the-fly reconfiguration of its strategies. dynamicTAO exports an interface for loading and unloading modules into the ORB runtime and for inspecting the ORB configuration state. The architecture can also be used for dynamic reconfiguration of user applications running on top of the ORB and even for reconfiguring non-CORBA applications.


7.1.1 A Reflective ORB

dynamicTAO is our first complete implementation of a CORBA reflective ORB. As pointed out in [SSC97,SSC98b], a reflective system is a system that gives a program access to its definition and evaluation rules, and defines an interface for altering them. In an ORB, client requests represent the ``program'' to be evaluated by the system, the ORB implementation represents the ``evaluator'', and ``evaluation'' is simply the execution of the remote method invocation. A reflective ORB makes it possible to redefine its evaluation semantics.

dynamicTAO allows inspection and reconfiguration of its internal engine and allows ORB and application developers to specify reconfiguration policies inside customized subclasses of the ComponentConfigurator class. It exports an interface for (1) transferring components across the distributed system, (2) loading and unloading modules into the ORB runtime, and (3) inspecting and modifying the ORB configuration state.

The reification of dynamicTAO's internal structure is achieved through a collection of component configurators. Each process running the dynamicTAO ORB contains a component configurator instance called DomainConfigurator. It is responsible for maintaining references to instances of the ORB and to servants running in that process. In addition, each instance of the ORB contains a customized component configurator called TAOConfigurator.

Figure 7.1: Reifying the dynamicTAO Structure

TAOConfigurator contains hooks to which implementations of dynamicTAO strategies are attached. Hooks work as ``mounting points'' where specific strategy implementations are made available to the ORB. We currently support hooks for different kinds of strategies such as Concurrency, Security, and Monitoring. The association between hooks and component implementations can be changed at any time, subject to safety constraints.
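
To make the hook mechanism concrete, the following minimal C++ sketch shows one way a configurator could map hook names to attached components. All class and method names below are illustrative assumptions, not dynamicTAO's actual API.

    #include <cstdio>
    #include <map>
    #include <string>

    // Illustrative stand-in for a dynamically loaded strategy implementation.
    struct Component {
        std::string name;
    };

    // A configurator keeps named "mounting points" (hooks); attaching a
    // component to a hook makes it the implementation the ORB will use.
    class Configurator {
        std::map<std::string, Component*> hooks_;
    public:
        void registerHook(const std::string& hook) { hooks_[hook] = 0; }
        bool attach(const std::string& hook, Component* impl) {
            std::map<std::string, Component*>::iterator it = hooks_.find(hook);
            if (it == hooks_.end()) return false;  // unknown hook
            it->second = impl;                     // swap in the new strategy
            return true;
        }
        Component* current(const std::string& hook) const {
            std::map<std::string, Component*>::const_iterator it = hooks_.find(hook);
            return it == hooks_.end() ? 0 : it->second;
        }
    };

    int main() {
        Configurator tao;
        tao.registerHook("Concurrency");
        Component pool = { "Thread_Pool_Strategy" };
        tao.attach("Concurrency", &pool);  // the association can change at any time
        std::printf("Concurrency -> %s\n", tao.current("Concurrency")->name.c_str());
        return 0;
    }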

Figure 7.1 illustrates this reification mechanism in a process containing a single instance of the ORB. If necessary, individual strategies can use component configurators to reify their dependencies upon ORB instances and other strategies. These configurators may also store references to client connections that depend on the strategies. With this information, it is possible to manage strategy reconfiguration consistently, as we explain in Section 7.1.3.

The dynamicTAO architectural framework is depicted in Figure 7.2. The Persistent Repository stores category implementations in the local file system. It offers methods for manipulating (e.g., browsing, creating, deleting) categories and the implementations of each category. Once a component implementation is stored in the local repository, it can be dynamically loaded into the process runtime.
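
The repository's interface can be pictured with the following hedged C++ sketch; only the operations themselves (browsing, creating, and deleting categories and their implementations) come from the text, while the signatures are our assumptions.

    #include <string>
    #include <vector>

    // Hypothetical view of the Persistent Repository's interface.
    class PersistentRepository {
    public:
        virtual ~PersistentRepository() {}
        virtual std::vector<std::string> listCategories() = 0;
        virtual void createCategory(const std::string& category) = 0;
        virtual void deleteCategory(const std::string& category) = 0;
        virtual std::vector<std::string>
            listImplementations(const std::string& category) = 0;
        // Implementations stored here can later be dynamically loaded
        // into the process runtime.
    };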

Figure 7.2: dynamicTAO Components

A Network Broker receives reconfiguration requests from the network and forwards them to the Dynamic Service Configurator. The latter contains the DomainConfigurator (shown in Figure 7.1) and supplies common operations for dynamic configuration of components at runtime. It delegates some of its functions to specific component configurators (e.g., TAOConfigurator or a certain ServantConfigurator).

We minimized the changes to the standard ACE/TAO distribution by delegating some of the basic configuration tasks to components of the ACE framework such as the ACE_Service_Config (used to process startup configuration files and manage dynamic linking) and the ACE_Service_Repository (to manage loaded implementations) [JS97].

This architectural framework enables the development of different kinds of persistent repositories and network brokers to interact with the Dynamic Service Configurator. Thus, it is possible to use different naming schemes when storing category implementations and different communication protocols for remote configuration as described below.

We built the dynamicTAO components using the ACE wrappers [Sch93] for operating system services. Thus, dynamicTAO runs on the many different platforms to which ACE has been ported.


7.1.2 Reconfiguration Interfaces

dynamicTAO supports three distinct reconfiguration interfaces. In general terms, they all provide the same functionality, but each has characteristics that make it more or less appropriate for certain situations. A description of the interfaces follows.

  1. The DCP Broker is a customized subclass of the Network Broker shown in Figure 7.2. It listens on a TCP port, waiting for connection requests from remote clients. Once a connection is established, a client can send inspection and reconfiguration commands using DCP, our Distributed Configuration Protocol [Kon98]. This interface is particularly good for debugging and for fast interaction with an ORB, since the user can access the configuration interface simply by establishing a telnet connection to the DCP Broker.
  2. The Reconfiguration Agent Broker is also a customized subclass of the Network Broker. It is useful for configuring a distributed collection of ORBs as we described in Chapter 6.
  3. The DynamicConfigurator is a CORBA object that exports an IDL interface with operations equivalent to those offered by the DCP protocol. It is the most convenient of the three interfaces for programmatic interactions, since all the communication aspects are hidden by the CORBA middleware.

We now use the DynamicConfigurator IDL specification presented in Figure 7.3 to explain the functionality of the dynamicTAO reconfiguration interfaces7.1.

Figure 7.3: The DynamicConfigurator Interface

The DynamicConfigurator interface specifies the operations that can be performed on dynamicTAO abstractions, namely categories, implementations, hooks, and configurable components. The first nine operations in the interface are used to inspect the dynamic structure of the ORB domain and retrieve information about the different abstractions. A category represents the type of a component; each category typically contains different implementations, i.e., dynamically loadable code stored in the Persistent Implementation Repository. For example, a category called Concurrency contains the three threading models that dynamicTAO currently supports: Reactive_Strategy, Thread_Strategy, and Thread_Pool_Strategy.

Once an implementation is loaded into the system runtime, it becomes a loaded implementation and can be associated with a logical component in the ORB domain. Finally, components have hooks that are used to represent inter-component dependence; if a component $A$ depends upon component $B$, then this dependence is represented by attaching $B$ to a hook in $A$.

load_implementation dynamically loads and starts an implementation from the persistent repository. hook_implementation attaches it to a hook in one of the components in the domain.

The next four methods allow operations on loaded implementations. It is possible to suspend and resume their main threads, remove them from the process, and send them component-specific reconfiguration messages.

upload_implementation allows an external entity to send an implementation to be stored in the local Persistent Repository, so that it can be linked to a running process and attached to a hook. Conversely, download_implementation allows a remote entity to retrieve an implementation from the local Persistent Repository. Finally, delete_implementation is used to delete implementations stored at the ORB Persistent Repository.

Consider now the scenario in which a user wants to change the threading model at runtime by using an implementation of the Concurrency strategy called Thread_Pool_Strategy. Assuming that the user wants to start with a thread pool of size 20, the required configuration steps are the following.

  1. Load the implementation into memory:
    version = load_implementation("Concurrency","Thread_Pool_Strategy","20", 0, cc)
  2. Attach the implementation to the Concurrency hook in TAO:
    hook_implementation("Concurrency"+version,"TAO","Concurrency_Strategy")
After the new implementation is attached, the ORB starts using it. In Section 7.1.3, we discuss what happens if a different concurrency strategy is in use.

Figure 7.4 shows C++ code that uses the Dynamic Configurator to retrieve and print some information about the ORB internal configuration. The code obtains a reference to the DynamicConfigurator object through the ORB's resolve_initial_references() method.

Figure 7.4: Inspecting the ORB Internal State
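
The following is a hedged C++ sketch of the kind of client code Figure 7.4 refers to. The CORBA calls (ORB_init, resolve_initial_references, _narrow) follow the standard C++ mapping; the stub header name and the inspection operation get_hooked_comp are assumptions, and exception handling is omitted.

    #include <cstdio>
    #include "DynamicConfiguratorC.h"  // assumed name of the IDL-generated stubs

    int main(int argc, char* argv[]) {
        CORBA::ORB_var orb = CORBA::ORB_init(argc, argv);

        // dynamicTAO makes its DynamicConfigurator available as an
        // initial reference.
        CORBA::Object_var obj =
            orb->resolve_initial_references("DynamicConfigurator");
        DynamicConfigurator_var dynConf =
            DynamicConfigurator::_narrow(obj.in());

        // Ask which implementation is attached to TAO's Concurrency hook;
        // get_hooked_comp is a guess at one of the nine inspection operations.
        CORBA::String_var ret =
            dynConf->get_hooked_comp("TAO", "Concurrency_Strategy");
        std::printf("Running the <%s> concurrency strategy.\n", ret.in());
        return 0;
    }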


7.1.3 Consistency

Reconfiguring a running ORB while it is servicing client requests is a difficult task that requires careful consideration. There are two major problems.

Consider the case in which dynamicTAO receives a request to replace one of its strategies ($S_{old}$) with a new strategy ($S_{new}$). TAO strategies are implemented as C++ objects that communicate through method invocations. The first problem is that, before unloading $S_{old}$, the system must guarantee that no one is running $S_{old}$ code and that no one will run $S_{old}$ code in the future. Otherwise, the system could crash. Thus, it is important to ensure that $S_{old}$ is only unloaded after the system can guarantee that its code will not be called.

The second problem is that some strategies need to keep state information. When a strategy $S_{old}$ is being replaced by $S_{new}$, part of $S_{old}$'s internal state may need to be transferred to $S_{new}$. Both problems can be addressed with the help of the TAOConfigurator.

Consider, for example, the three concurrency strategies supported by dynamicTAO: single-threaded reactive, thread-per-connection, and thread-pool. If the user switches from the reactive or thread-per-connection strategies to any other concurrency strategy, nothing special needs to be done. dynamicTAO may simply load the new strategy, update the proper TAOConfigurator hook, unload the old strategy, and continue. Old client connections will complete with the concurrency policy dictated by the old strategy. New connections will utilize the new policy.

However, if one switches from the thread-pool strategy to another strategy, special care must be taken. The thread-pool strategy we developed maintains a pool of threads that is created when the strategy is initialized. The threads are shared by all incoming connections to achieve a good level of concurrency without the runtime overhead of creating new threads. A problem arises when one switches from this strategy to another: the code of the strategy being replaced cannot be immediately unloaded. Since the threads are reused, they return to the thread-pool strategy code each time a connection finishes. This problem can be solved by having a ThreadPoolConfigurator keep track of which threads are handling client connections and destroy them as the connections are closed. When the last thread is destroyed, the thread-pool strategy signals that it can be unloaded.
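
The following C++ sketch illustrates this bookkeeping under assumed names: the configurator counts the pool threads still serving connections and announces when the old strategy's code is no longer reachable.

    // Illustrative sketch, not dynamicTAO's actual code. A real
    // implementation would protect these fields with a lock.
    class ThreadPoolConfigurator {
        int  activeThreads_;   // pool threads still serving connections
        bool retiring_;        // true once a replacement was requested
    public:
        ThreadPoolConfigurator() : activeThreads_(0), retiring_(false) {}
        void threadStarted() { ++activeThreads_; }
        void beginRetirement() { retiring_ = true; }
        // Called when a connection closes; returns true if the calling
        // thread should exit instead of returning to the pool.
        bool threadFinished() {
            --activeThreads_;
            if (retiring_ && activeThreads_ == 0)
                signalSafeToUnload();  // last thread gone: code unreachable
            return retiring_;
        }
    private:
        void signalSafeToUnload() { /* notify the TAOConfigurator */ }
    };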

Another problem occurs when one replaces the thread-pool strategy with a new one: there may be several incoming connections queued in the strategy, waiting for a thread to execute them. The solution is to use the Memento pattern [GHJV95] to encapsulate the old strategy's state, in this case the queue of waiting connections, in an object. The system simply passes this object to the new strategy, which then takes care of the queued connections.
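
A C++ sketch of the Memento hand-off follows; the type and method names are assumptions, with Connection standing in for a queued client connection.

    #include <queue>

    class Connection;  // opaque here

    // The Memento: it encapsulates the state that survives the swap,
    // namely the queue of connections still waiting for a thread.
    struct ConcurrencyMemento {
        std::queue<Connection*> pending;
    };

    class ConcurrencyStrategy {
    public:
        virtual ~ConcurrencyStrategy() {}
        virtual ConcurrencyMemento exportState() = 0;       // old strategy packages...
        virtual void importState(ConcurrencyMemento m) = 0; // ...and the new one adopts
    };

    // During replacement, the glue code would simply do:
    //   replacement->importState(old->exportState());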

7.1.4 dynamicTAO and this Thesis

Although the implementation of a reflective ORB is not the main topic of this thesis, dynamicTAO is connected to the ideas presented here in several distinct ways.

The dynamicTAO internal structure is reified and managed with the C++ implementation of the component configurator framework. Our implementation of the CORBA version of the component configurator was developed using dynamicTAO as the underlying ORB. Finally, we developed the Component Repository and the Automatic Configuration Service using dynamicTAO's DynamicConfigurator IDL interface depicted in Figure 7.3.

dynamicTAO is a good example of how the component configurator framework can be used to represent inter-component dependence and how software developers can use it to specify policies for safe dynamic configuration.


7.2 Scalable Multimedia Distribution

In [KCT+98], we described the design and implementation of a flexible multimedia distribution system and demonstrated that it is possible to use the existing Internet to distribute low- and medium-bandwidth multimedia to thousands of simultaneous users. Our early experiments, however, pointed to difficulties in managing such a large-scale system and keeping it available with an acceptable quality of service. They showed the need for better support for dynamic reconfiguration, distribution of code updates, and fault-tolerance in QoS-sensitive applications.

We addressed these problems in a new version of the multimedia distribution system [KCN00] built on top of the architecture presented in this thesis.

7.2.1 The Reflector

The multimedia distribution system's key element is the Reflector. It acts as a relay, receiving input data packets from a list of trusted sources and forwarding these packets to other Reflectors or to programs executed by end-users (also called end-user clients). The distribution system is composed of a network of Reflectors that collaborate with each other to distribute the multimedia data over local-area, metropolitan-area, and wide-area networks.

The Reflector is a user-level program that, at the application level, performs activities similar to those performed by network routers at the hardware level (which is where protocols such as IP-Multicast [Dee89] are implemented). Since it is implemented in software rather than hard-wired into the router, the Reflector is more flexible, easier to deploy and evolve, and can be dynamically customized to users, applications, and environments.

Reflector data packets are encoded with RTP, a user-level protocol for real-time applications [SCFJ00] defined by the Internet Engineering Task Force (IETF). RTP packets can be transmitted over different kinds of low-level transport protocols such as TCP/IP, UDP/IP, and IP-Multicast.

The Reflector network topology is determined by each Reflector's configuration. This information specifies input and output connections, access privileges, the maximum allowed number of users, etc. It is stored in a database controlled by the Reflector administrator. Figure 7.5 depicts a generic Reflector network that distributes two video streams in different channels. In this figure, two capture stations send their video streams to ``master'' Reflectors; the streams may traverse several ``intermediate'' Reflectors until they reach ``public'' Reflectors, to which end-user clients can connect to receive the video streams7.2. All the Reflectors in the figure are started with exactly the same code. Using the Automatic Configuration Service described in Chapter 4, the system then customizes each Reflector according to its individual requirements.

Figure 7.5: A Reflector Network Distributing Two Video Streams


7.2.2 Data Distribution Protocols

To support different types of inter-Reflector communication protocols transparently, the Reflector framework encapsulates the concept of a network connection in an abstract C++ class named Connection, which defines the basic interface for all types of network connections used by the Reflector. This abstract class implements some of its own methods, but the majority of the connection-related code is implemented by its subclasses: TCPConnection, UDPConnection, MulticastConnection, and the like.
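
A minimal C++ sketch of this interface follows; the method names come from the framework, but the parameter and return types are guesses.

    #include <cstddef>

    class Connection {
    public:
        virtual ~Connection() {}
        virtual int Open()    = 0;
        virtual int Close()   = 0;
        virtual int Bind()    = 0;
        virtual int Connect() = 0;
        virtual int Send(const char* buf, std::size_t len)  = 0;
        virtual int Receive(char* buf, std::size_t maxLen)  = 0;
    };

    // Concrete transports override the virtuals; the rest of the
    // Reflector manipulates Connection pointers only.
    class TCPConnection       : public Connection { /* ... */ };
    class UDPConnection       : public Connection { /* ... */ };
    class MulticastConnection : public Connection { /* ... */ };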

Figure 7.6 depicts a concrete example of a highly heterogeneous Reflector network. In this example, the network distributes two audio-visual streams. The first comes from a mobile camera, mounted on a helicopter, that sends its stream to a ``master'' Reflector over a wireless link. This kind of link presents a high rate of packet loss unrelated to congestion, which makes protocols like TCP perform poorly [BPSK96]. Thus, it is desirable to use a protocol optimized for this situation, such as WTCP [SVSB99]. The second stream is sent to its ``master'' Reflector through a dedicated ISDN line. To optimize bandwidth, one can use UDP as the communication protocol, since its overhead is lower than TCP's and the link offers a low loss rate.

Figure 7.6: A Heterogeneous Reflector Network

Continuing with the example, Reflector C sends its streams to Reflector D over the public Internet through a transatlantic satellite link. Even though this is a high-bandwidth link, its loss rate may be high, so it is more appropriate to use TCP. Reflectors B and E offer the multimedia streams to dial-up clients over conventional phone lines; Reflector B uses TCP while Reflector E, on the other side of the Atlantic, uses the VDP adaptive algorithm. Finally, Reflectors A and D introduce the video streams into two distant points of the global MBone and Reflector F uses multicast to distribute the streams in a local Ethernet network.

By selecting the appropriate protocol for each situation, Reflector administrators improve the quality of service offered to end-users, optimizing the utilization of the available network bandwidth and minimizing packet losses.

Most of the Reflector's code deals with objects of type Connection and is not aware of the Connection's underlying implementation. The actual connection type is specified when the connection is created and does not need to be known after that.

This approach allows programmers to plug in customized Connection subclasses by providing their own implementation of the Open, Close, Bind, Connect, Send, and Receive methods, adding specialized functionality.

In this manner, it is possible to incorporate into the Reflector Connection subclasses that implement different transport protocols (such as the VDP QoS-aware adaptive protocol for the Internet [CTCL95] and the xbind QoS-aware reservation protocol for ATM networks [CL99]). Developers also use this mechanism to implement Connection subclasses that perform various operations on the data, such as encryption, transcoding, mixing, and downsampling. Finally, one can create composite Connection types by combining existing ones. For example, one can create a CryptoMulticast connection type, which encrypts the data and sends it out using multicast, by combining a Crypto connection with a Multicast connection.
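
Reusing the Connection sketch from above, a composite type can be realized as a wrapper that processes the data and delegates the transport to an inner connection. Everything below except the method names is an assumption.

    // A Crypto connection decorating any other Connection; combined
    // with a MulticastConnection it yields a CryptoMulticast type.
    class CryptoConnection : public Connection {
        Connection* inner_;  // e.g., a MulticastConnection
    public:
        explicit CryptoConnection(Connection* inner) : inner_(inner) {}
        int Open()    { return inner_->Open(); }
        int Close()   { return inner_->Close(); }
        int Bind()    { return inner_->Bind(); }
        int Connect() { return inner_->Connect(); }
        int Send(const char* buf, std::size_t len) {
            // an encrypt(buf, len) step would run here before forwarding
            return inner_->Send(buf, len);
        }
        int Receive(char* buf, std::size_t maxLen) {
            int n = inner_->Receive(buf, maxLen);
            // a decrypt(buf, n) step would run here after receiving
            return n;
        }
    };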


7.2.3 Experience and Lessons Learned

This technology was utilized in the live broadcast of NASA's JPL Pathfinder mission [GCE+97]. During this broadcast, which lasted for several months, more than one million live video sessions were delivered to dozens of different countries across the globe by a network of more than 30 Reflectors spread across five continents. The Reflectors ran on five different operating systems (Solaris, Linux, Irix, FreeBSD, and Windows) and transmitted their streams over different kinds of network links. End-users could watch the video stream simply by pointing their web browsers to selected locations, causing their browsers to download a Java applet containing the video client. The applet connected automatically to the Reflector, received the video stream, decoded it, and displayed it to the user in real time.

During this broadcast, we experienced three major problems:

  1. As the code had not been tested on such a large scale or on so many different platforms, we found many programming errors both in the Reflector code and in the client applet. Surprisingly, fixing an error was sometimes easier than updating the code on the dozens of machines that formed the distributed system. The same problem occurred when a new version of the Reflector, with added functionality, was released: system administrators had to manually connect to dozens of machines, upload the new code, shut down the old version, and start the new one.

  2. In many cases, we had to reconfigure the Reflector network by dynamically changing the distribution topology or by setting new values for the Reflector configuration parameters (e.g., maximum number of users, number of multimedia channels). The configuration information for the Reflectors was stored in a centralized location. After updating this centralized database, we had to connect to each of the Reflectors and instruct them to download their updated configuration. This process was tiresome and error-prone.

  3. The only mechanism the Reflector provided to support fault-tolerance was to send redundant streams from different sources to the same Reflector (see [KCT+98] for details). This mechanism leads to a large waste of bandwidth: the redundant streams are always transmitted even though they are seldom used.

With this experience, we learned that a user-level Reflector system is, indeed, a powerful tool for managing large-scale multimedia distribution. It gives Reflector administrators tight control over the distribution, allowing for better control of the quality of service. It achieves that through the definition of the distribution topology, the selection of appropriate communication protocols, and the possibility of limiting the number of clients according to the available resources.

We also learned, however, that it is important to provide better mechanisms for distributed code updates, dynamic reconfiguration, and fault-tolerance, which motivated us to develop the architecture described in this thesis.


7.2.4 Dynamic Configuration of QoS-Sensitive Systems

The synergistic relationships between dynamic configuration and QoS are clear. Dynamic configuration allows the use of the best policies for each situation. For example, a mobile computer displaying a video clip to its user could use a protocol optimized for wireless connections (e.g., WTCP [SVSB99]) when the computer is using a wireless link, but dynamically reconfigure itself to use a TCP connection when the computer is hooked to a wired Ethernet connection.

However, if the reconfiguration process itself affects the quality of service negatively, it may not be worthwhile to do any reconfiguration at all. Going back to the example, if the dynamic reconfiguration to the TCP connection is so expensive that the video is interrupted for several seconds, it is better to keep using the wireless link, even when the wired link becomes available.

Therefore, while developing the new version of the Reflector system, our major goal was to deploy our architecture for dependence management to support dynamic configuration and fault-tolerance without affecting quality of service negatively.


7.2.4.1 Automatic Configuration

To solve the problem of maintaining the Reflector instances up-to-date as the code of the Reflector program evolves, and to customize each Reflector according to its role, we used the Automatic Configuration Service described in Chapter 4.

The first major change we had to make in the Reflector implementation to accommodate the new design was the adoption of a component-oriented model. We reorganized the implementation of the Reflector program, breaking it into dynamically loadable components. This proved less difficult than expected, thanks to the original object-oriented design of the Reflector, which was based on loosely coupled objects interacting via well-defined interfaces. The component-based model facilitates the customization of the Reflector program at startup, allowing the Automatic Configuration Service to select the components that are best suited for executing the Reflector on a given node. It also facilitates the dynamic reconfiguration of running Reflectors to adapt to changes in the environment and to install new versions of components on-the-fly.

Figure 7.7 presents a schematic overview of the Reflector bootstrapping process in which the Reflector configures itself with the help of our infrastructure.

Figure 7.7: Bootstrapping a Reflector

At startup time, each Reflector contacts the CORBA Name Service to locate the Component Repository that is compatible with the operating system and hardware platform on which the Reflector is running (steps 1 and 2 in Figure 7.7). From the Component Repository, it retrieves its specific prerequisite specification file, which may contain information about its resource requirements and the list of components that must be dynamically loaded before the Reflector starts its activities.

The Automatic Configuration Service library (denoted as ACS in Figure 7.7) is linked to the Reflector process. Using the prerequisite file fetched from the Component Repository, the ACS requests the appropriate components from the remote repository (step 5), stores them on the local disk (step 6)7.3, and dynamically loads them into memory using the dynamicTAO facilities (step 7).

Once all components are loaded, the Reflector uses the resource requirement information in the prerequisite file to request the reservation of CPU and memory from an underlying QoS-aware resource management service. This communication with the resource management service (step 8) is the only part of the bootstrapping scheme that was not implemented in our prototype. A QoS-aware resource management service can be provided by systems like SMART [NL97] or the Dynamic Soft Real-Time Scheduler (DSRT) [NhCN98]. In the latter case, DSRT can use the prerequisite specification to perform QoS-aware admission control, negotiation, reservation, and scheduling.

Next, the Reflector registers itself with the Name Service, so that it can be easily located by other system entities, and opens all the input and output connections using the specified protocols (steps 9 and 10). Although our prototype reads the information about the input and output connections from a local configuration file, it could be easily modified to retrieve this information from a file stored in the Component Repository.
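
The sequence can be condensed into the following C++ sketch, in which every type and helper name is hypothetical; only the ordering of the steps comes from Figure 7.7.

    #include <cstddef>
    #include <string>
    #include <vector>

    struct Component { std::string name; };
    struct Prerequisites {
        std::vector<Component> components;  // what to load before starting
        std::string requirements;           // CPU/memory needs
    };
    struct Repository {
        Prerequisites fetchPrerequisiteFile() { return Prerequisites(); }      // steps 3-4
        std::string fetchComponent(const Component&) { return std::string(); } // step 5
    };
    struct ACS {
        Repository locateRepository() { return Repository(); }  // steps 1-2
        void cacheLocally(const std::string&) {}                // step 6
        void loadIntoProcess(const std::string&) {}             // step 7 (dynamicTAO)
        void reserveResources(const std::string&) {}            // step 8 (not in prototype)
        void registerWithNameService() {}                       // step 9
        void openConnections() {}                               // step 10
    };

    int main() {
        ACS acs;
        Repository repo = acs.locateRepository();
        Prerequisites pre = repo.fetchPrerequisiteFile();
        for (std::size_t i = 0; i < pre.components.size(); ++i) {
            std::string code = repo.fetchComponent(pre.components[i]);
            acs.cacheLocally(code);
            acs.loadIntoProcess(code);
        }
        acs.reserveResources(pre.requirements);
        acs.registerWithNameService();
        acs.openConnections();
        return 0;
    }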

Administrators of the Reflector system can view the available broadcast and videoconference sessions using a graphical user interface that interacts with the CORBA Name Service. For example, Figure 7.8 is a screen shot that illustrates the use of this interface. It shows three independent Reflector networks: the first, called VirtualMeetingRoom, could be used for audio and videoconferencing, possibly divided according to interest groups; the second, called News, could contain several news channels; and the third, called OlympicGames, could contain audio and video broadcast channels related to Olympic events.

Figure 7.8: A Sample Screen Shot of the Name Service GUI

The administrator can also see that there are three available component repositories for three different kinds of architectures and can use the GUI to upload, download, and remove Reflector components from the repositories. Finally, the administrator can see that the OlympicGames network is composed of the five reflectors shown in the upper right-hand window. The CORBA IOR (Interoperable Object Reference) of the Reflector at delirius.cs.uiuc.edu is shown in the bottom right-hand window.


7.2.4.2 Fault-Tolerance

An important aspect of quality of service is reliability. It is difficult to ensure that the system will work properly in the presence of network failures, node failures, software failures, and temporary system shutdowns for maintenance. Keeping the desired level of quality of service irrespective of all these disruptive events is a major challenge for modern QoS-sensitive systems. The Reflector architecture addresses this problem by using architecturally-aware dynamic reconfiguration based on the component configurator framework described in Chapter 5.

In multimedia distribution, the goal with respect to fault-tolerance is to maximize the system's availability without relying on redundant data transmission, which leads to waste of bandwidth. To achieve that, our prototype supports the dynamic reconfiguration of the Reflector inter-connections when failures occur. The distributed system has knowledge of its own structure and is able to establish alternative routes to deliver the multimedia streams to its users. Furthermore, whenever possible, the system performs these reconfigurations without affecting the quality of service perceived by its users.

7.2.4.2.1 Fault-Recovery Models

In order to build an alternate distribution topology, the system must store enough information so that alternate routes can be found when failures occur. The question is: where to maintain this information?

We initially considered a solution in which all the information regarding fault recovery would be placed in a centralized database accessible by every Reflector. When a Reflector $R_1$ detects that one of its inputs has failed or has been silent for too long, it would contact a configuration server with access to the centralized database and request the address of a Reflector $R_2$ from which it could receive the missing streams. The configuration server would return this information and contact $R_2$, reconfiguring it to send data to $R_1$. The advantage of this approach is that very little information is stored on the Reflectors; all the fault-recovery information and policies are centralized in a single location, facilitating their manipulation by a single entity: the configuration server.

The second solution is to store fault-recovery information on the Reflectors themselves. That is, each Reflector would store, for each set of multimedia channels, a list of alternate Reflectors that could be used as backups. The advantage of this approach is that it neither creates a single point of failure nor imposes an extra load on a possibly already overloaded configuration server. This solution may be more difficult to implement, but it tends to be more scalable.

We believe that the optimal solution to this problem is one that encompasses both models. On the one hand, each Reflector should have some knowledge about its local surroundings and should be able to perform reconfigurations by communicating with its immediate neighbors, without being a burden to the centralized configuration server. On the other hand, the configuration server should maintain global knowledge about the system topology. This centralized, global knowledge should be used not only as backup, in case the Reflector's localized information is not enough to keep the system functioning, but also to perform optimizations, such as dynamic changes in the network topology to improve the quality of service and promote load balancing.

Our architecture adopts the hybrid model described above. It distributes the knowledge throughout the Reflector network and makes each Reflector aware of its dependence relationships with other Reflectors. Thus, the Reflectors are able to make reconfiguration decisions by themselves, without relying on a centralized entity. In addition to this, the global system topology is maintained in the configuration service so that a Reflector administrator or an ``intelligent'' software module can perform global optimizations in the distribution network.

The prototype supports fault-tolerance by using a subclass of ComponentConfigurator to represent the dependencies between Reflectors. When failures occur, the system uses the dependence information to locate alternate routes and keep the system functioning. The ComponentConfigurator implementation stores the dependencies as a list of CORBA IORs, which allows for prompt communication no matter where the objects are located.

The subclass of ComponentConfigurator we use, called ReflectorConfigurator, contains the policies for reshaping the network topology in case of failures and encapsulates all the code that deals with these reconfigurations. This approach proved very effective in keeping a clear separation of concerns in the Reflector code: the classes that deal with the Reflector's normal operation are totally unaware of the ReflectorConfigurator and of any code that deals with reconfiguration. This clean separation also makes it easy to plug in different kinds of ReflectorConfigurators to support different reconfiguration policies.

7.2.4.2.2 Triggering Reconfigurations

In this implementation, four kinds of events can trigger dynamic reconfiguration:

  1. A Reflector shutdown message sent by the Reflector administrator or a kill command executed by the local system administrator.
  2. Errors (or ``bugs'') in the Reflector or library code that lead to a segmentation fault or bus error.
  3. A reconfiguration order sent by the Reflector administrator.
  4. Sudden machine crashes or network disconnections.

In the first two cases, the Reflector captures these events using signal handlers installed with the UNIX signal function or the Windows SetConsoleCtrlHandler function. In the UNIX implementation, for example, the administrator can kill a Reflector by pressing Ctrl-C on the terminal executing the Reflector, by sending a shutdown message to the Reflector over a telnet connection, or by using the kill command. The Reflector captures the events generated by Ctrl-C, kill, segmentation faults, and bus errors by implementing signal handlers for the SIGINT, SIGTERM, SIGSEGV, and SIGBUS signals, respectively.
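
A minimal UNIX sketch of this installation follows; abandonReflectorNetwork() is the handler named in Section 7.2.4.2.3, stubbed out here.

    #include <csignal>
    #include <cstdlib>

    void abandonReflectorNetwork() { /* notify sources and clients */ }

    extern "C" void onReflectorSignal(int /* signo */) {
        abandonReflectorNetwork();
        std::exit(EXIT_FAILURE);
    }

    void installSignalHandlers() {
        std::signal(SIGINT,  onReflectorSignal);  // Ctrl-C
        std::signal(SIGTERM, onReflectorSignal);  // kill / shutdown message
        std::signal(SIGSEGV, onReflectorSignal);  // segmentation fault
        std::signal(SIGBUS,  onReflectorSignal);  // bus error (POSIX, not ISO C)
    }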

In the third case, the Reflector contacts the configuration service to retrieve its new configuration information and reprocesses it, reconfiguring its input and output connections.

Finally, the fourth case is the only one in which it is not possible to keep the client multimedia stream uninterrupted without relying on redundant streams to the same Reflector. The solution in this case is to detect when the input for a given channel has failed or is silent for too long and then locate an alternative input either by using the local list of alternatives or by contacting the configuration server.

Our current implementation focuses on supporting dynamic reconfiguration in the presence of the first two kinds of events. We now describe this process in more detail.

7.2.4.2.3 The Reconfiguration Process

When an administrator (or another system entity) requests that a Reflector be killed, the system executes a special event handler called abandonReflectorNetwork(). This handler takes the following three actions.

  1. Unregisters the Reflector from the Name Service.
  2. Using CORBA, sends a FINISHED event to the ReflectorConfigurators of all the sources (inputs) of this Reflector; the event carries a list of the clients of the finishing reflector.
  3. Using CORBA, sends a FINISHED event to the ReflectorConfigurators of all the clients (outputs) of this Reflector; the event carries a list of the sources of the finishing reflector.

When a ReflectorConfigurator receives a FINISHED event from a source Reflector, it adds all the Reflectors in the list of sources of the finishing Reflector to its list of inputs7.4.

Conversely, when a ReflectorConfigurator receives a FINISHED event from a client Reflector, it adds all the Reflectors in the list of clients of the finishing Reflector to its output list.
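
A C++ sketch of this bookkeeping, with assumed names, follows; as in the prototype, the dependencies are stored as stringified CORBA IORs. The duplicate check is one possible mitigation of the redundant-input issue noted in footnote 7.4.

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    class ReflectorConfigurator {
        std::vector<std::string> inputs_;   // IORs of source Reflectors
        std::vector<std::string> outputs_;  // IORs of client Reflectors

        static void addUnique(std::vector<std::string>& v, const std::string& ior) {
            if (std::find(v.begin(), v.end(), ior) == v.end()) v.push_back(ior);
        }
        static void removeIOR(std::vector<std::string>& v, const std::string& ior) {
            v.erase(std::remove(v.begin(), v.end(), ior), v.end());
        }
    public:
        // FINISHED from a source: drop it and adopt its own sources as inputs.
        void onSourceFinished(const std::string& finished,
                              const std::vector<std::string>& itsSources) {
            removeIOR(inputs_, finished);
            for (std::size_t i = 0; i < itsSources.size(); ++i)
                addUnique(inputs_, itsSources[i]);
        }
        // FINISHED from a client: drop it and adopt its own clients as outputs.
        void onClientFinished(const std::string& finished,
                              const std::vector<std::string>& itsClients) {
            removeIOR(outputs_, finished);
            for (std::size_t i = 0; i < itsClients.size(); ++i)
                addUnique(outputs_, itsClients[i]);
        }
    };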

Figure 7.9 shows a sample Reflector network where Reflector C has two inputs and two outputs. When C is killed and the reconfiguration process described above completes, the new configuration becomes the one in Figure 7.10.

Figure 7.9: A Distribution Network with Five Reflectors

Figure 7.10: The Distribution Network After Reflector C Abandons the Network

To carry out the reconfiguration without any glitches in the multimedia streams and without affecting the system's quality of service, we had to adopt the multithreaded solution described in Section 8.2.

7.2.4.3 The Recovery Process

When a Reflector starts its execution for the first time, or when it is restarted after being shut down for some reason, it executes an initialization process. In this process, in addition to performing the actions described in Section 7.2.4.1, it performs the following three actions.

  1. Registers the Reflector with the Name Service.
  2. Using CORBA, sends a STARTED event to the ReflectorConfigurators of all the clients (outputs) of this Reflector; the event carries a list of the sources of the new reflector.
  3. Using CORBA, sends a STARTED event to the ReflectorConfigurators of all the sources (inputs) of this Reflector; the event carries a list of the clients of the new reflector.

Upon receiving a STARTED event from a new source Reflector, the client Reflector opens a new input connection to the new Reflector. If it is also receiving input from one of the sources of the new Reflector, it closes that input connection as soon as the data from the new source is available. An analogous process happens upon receiving a STARTED event from a new client Reflector.

These mechanisms allow the distribution system to recover its original topology after a faulty Reflector restarts. Therefore, if the system configuration is the one in Figure 7.10 and Reflector C recovers, the configuration switches back to the one in Figure 7.9.

Note that we do not have an automatic mechanism for restarting faulty Reflectors. This requires the administrator's intervention, which seems natural, since a Reflector goes out of service either because of an administrator's command or because of a failure in the system. In both cases, the administrator's attention is advisable. Alternatively, if desired, the Reflector can be added to the list of daemons executed by the operating system when a machine boots, eliminating the need for manual intervention when a machine crashes and restarts.

In Section 8.2, we present the results of several experiments with the Reflector prototype. The performance evaluation shows that, with the help of our infrastructure for dependence management, the Reflector system is able to carry out dynamic reconfiguration of the distribution topology without any negative impact on the quality of service perceived by end-users. This dynamic reconfiguration provides a solid basis for fault-tolerance.


7.3 Other Applications

In addition to the use cases described in Section 5.4, three ongoing projects carried out by other researchers in our group are using the products of this thesis.

  1. Gaia [RC00], a middleware infrastructure for the management of active spaces, will use both the Automatic Configuration Service and component configurators to manage dynamic, heterogeneous devices in ubiquitous computing scenarios. Just as a traditional operating system manages the resources of a machine, Gaia manages the resources in an active space. Examples of active spaces include active offices and active lecture rooms enhanced with various computing and multimedia devices. Interactions between the user and the devices in the space are managed by a framework inspired by the Model-View-Controller paradigm [KP88]. Gaia uses component configurators to manage the dependencies among the Model, the Controller, and the multiple Views.

  2. 2KFS, the 2K distributed file service [HBC00], will use the Automatic Configuration Service to dynamically load components implementing different algorithms for caching and data transcoding. The component configurator framework will be used to manage dependencies among the distributed 2KFS components.

  3. $2K^Q$ is a reconfigurable, component-based QoS framework that partitions the end-to-end QoS setup process into a distributed QoS compilation phase and a runtime QoS instantiation phase [NWX00]. Dynamic instantiation is based on dynamicTAO and the Automatic Configuration Service.



Footnotes

... interfaces7.1
To make Figure 7.3 more clear, we omitted the exceptions that each operation can raise.
... streams7.2
The classification of Reflectors as ``master'', ``intermediate'', and ``public'' is merely pedagogical; they all execute exactly the same program.
... 6)7.3
To minimize startup time and network load, the components fetched from the Component Repository may be cached locally.
... inputs7.4
This action may lead to redundant inputs for the same channel. This can be avoided by extending the code of the ReflectorConfigurator so that it adds all the new sources to its list of input alternatives and chooses only one of them as its new input.
