An Analysis of Aircraft Assignment in CTAS

6.894, Fall 1998
Team A: Daniel Jackson, Jay Corbett and John Chapin
October 6, 1998
DRAFT VERSION
 

Overview

This document describes the results of an analysis of the CTAS code to determine how aircraft are assigned to Route Analyzers (RAs). We chose to investigate this aspect because much of the code is devoted to it, and because we have been told by NASA that it has been a source of difficulty, in both performance and code structuring, and is therefore a good candidate for redesign.

We confined our analysis to the following aspects of the Communications Manager (CM):

Our concern has been the primary modes of operation, so we have provided little or no detail regarding: We considered only aircraft proper, and not blocked slots (which are essentially reservations made in anticipation of the arrival of aircraft).

Our aim was to produce two kinds of documentation. First, we wanted to understand how the existing code is organized; we documented this with traditional call graph representations. Second, we wanted to represent more fundamental properties of the problem domain and the system, at a more abstract level, that might serve as a specification for our subsequent design work. We documented these properties with object models and entity life histories.

Approach

We analyzed the code with a variety of tools:

Informal Description

Basic Flows

Aircraft data enters the CM primarily by messages from the Input Source Manager (ISM). Trial plans may be entered at the user interface, but these are not our concern. Two categories of data about aircraft are received: track data and flight plan information. Track data gives basic details of physical state (position, velocity, etc). Flight plan information is given in three kinds of message: The information from these messages is stored in the CM's aircraft database (the "aircraft tree"). The information associated with some messages is not saved. A flight plan addition for an aircraft that is deemed to be too far from the center, for example, is not recorded. An amendment for an aircraft that is not known by the system is therefore not an extraordinary occurrence. Relevant functions: ism_socket:ism_socket_read, update_aircraft:add_flight_plan, amend_flight_plan, remove_flight_plan, etc.

If the CM is in 'playback mode', received flight information from a prerecorded file, the aircraft tree is updated instead from the file. Relevant functions: update_aircraft:update_tree_with_active_aircraft.

Flight plan amendments and deletions are forwarded immediately to RAs. Relevant functions: ism_socket:ism_handle_fp_amd, cm_send:forward_flight_plan_amend, send_category_amendment, ism_socket:ism_handle_fp_del, update_aircraft:remove_flight_plan, cm_send:forward_flight_plan_delete.

The aircraft tree is then walked to find new additions; these are forwarded to RAs in batches whose size is determined by the constant UAC_CM2RA_FPS_COUNT_MAX. This strategy is designed to avoid overload that would otherwise occur when a large of new flight plans are received in a short period of time: for example, when opening the connection to the ISM for the first time, or when switching from a mode in which departures are not tracked to one in which they are. Relevant functions: update_aircraft:find_and_send_fps, cm_send:forward_flight_plan_add.

The tree is then walked to find aircraft that are active (ie, for which tracks have been received) but which have not been assigned to RAs. Assignment is done to spread load evenly: each such aircraft is assigned to the RA with the fewest aircraft already assigned to it. A message is sent to the chosen RA giving it the identifier of the aircraft, and the tree record is updated accordingly. It should be noted that information for an aircraft is broadcast to all RAs, and all RAs essentially hold a copy of the aircraft tree; this assignment determines 'ownership' of an aircraft by an RA. Relevant functions: distribute_ac:distribute_aircraft_to_ras, cm_send:handoff_aircraft_to_ra.

Track information is passed to the RAs, in contrast, for the entire aircraft tree. Relevant functions: update_aircraft:load_output_buffer, cm_send:send_aircraft.

The tree is walked to find aircraft that have either landed or become inactive. Landed aircraft are deleted from the tree. Inactive aircraft are marked as such. Whether an aircraft is inactive appears to be determined by how recently track data was received (but we have not looked into this properly). Relevant functions: update_aircraft:weed_out_old_aircraft, mark_deletable, delete_aircraft, set_inactive.

Periodically, a check is performed to determine whether aircraft are uniformly distributed amongst RAs. Unevenness arises because RA processes may fail, and have their load passed on to other RAs; new RAs may be added at any time. If the spread between the most and least loaded RAs exceeds some constant, the aircraft are redistributed in round-robin fashion amongst RAs. Relevant functions: distribute_ac:check_and_redistribute_aircraft, cm_send:handoff_aircraft_to_ra.

A periodic check for landed aircraft is also made. An aircraft that has been within the landing zone for some airport-specific amount of time is deemed to have landed; a message is broadcast to all RAs, and the aircraft tree record is updated accordingly. Relevant functions: update_aircraft:check_for_landed_aircraft, cm_send:send_aircraft_landed_to_all.

We have not yet analyzed the handling of RA sockets. Roughly, information about the state of RA's is maintained in a global array. In each execution of the main loop, the status of all socket connections is examined; if a socket appears to be malfunctioning, the connection is closed and the state is recorded in the array. When a new RA process is added, it replaces an array entry for a closed socket. There appears to be no code that marks the connection to an RA as temporarily degraded. Sockets seem to be associated with sectors: why? (see cm_struct:Socket_st). Relevant functions: cm_sock_util:check_sockets, check_socket, read_ra_socket, close_socket, connect_to_ra.

Distribution of Aircraft to RAs

Each active aircraft is assigned to a route analyzer (RA). This assignment is stored in the ra_index field of the aircraft record in the communications manager (CM) and set as follows: Whenever the assignment of an aircraft to an RA is modified, messages are sent to the RAs informing them of this change. If the assignment is changed from one RA to another, both of these RAs are notified. If the assignment is changed from unassigned to some RA, then all RAs are notified.

Call Graph

We have constructed a call graph that shows the calling relationships amongst the principal functions responsible for the behaviour described above. Postscript, GIF.

Analysis

In this section, we summarize our understanding of this aspect of the CM's behaviour in a pair of models: an object model and an entity life history model.

The entity life history model designates some significant events in the life of aircraft, and specifies constraints on their ordering.

The object model describes the state of an aircraft (at the grossest level) and the relationships between aircraft and RAs, and how these change when events occur.

The models are given here in textual form. Diagrammatic versions will be added later.
 

Entity Life History Model

The following events in the life of an aircraft are significant as far as allocation of aircraft to RA's is concerned: The following entity life histories each constitute a role of the aircraft entity. We have not modelled the RA entity; if we did, we would show, for example, that an unassign event always follows an assign event to the same RA.

fp_add (fp_amend | become_active | become_inactive | enter_landing_zone | presume_land)* [fp_delete]
// most events must follow fp_add and cannot follow fp_delete

[(become_active become_inactive)* [become_active]]
// might never become active, or might become active once, or might have periods of activeness

[become_active assign (unassign assign)* [unassign]]
// cannot assign until active

((unassign assign) | fp_amend | become_inactive)*
// no fp_amend or become_inactive between unassign and assign

[enter_landing_zone presume_landed]
// only presume_landed after enter_landing_zone

[become_active enter_landing_zone]
// only enter_landing_zone if once active

Questions to resolve:
Can fp_amend occur after enter_landing_zone?
Can become_inactive occur after enter_landing_zone but before presume_landed?
 
 

Object Model

There are two domains: AIRCRAFT are divided amongst the following subsets: The formal object model (below) makes the following assertions: We also specify, for each event, the effect it has on the abstract state. The specifications are unremarkable, except insofar as they specify event ordering constraints in their (implicit) preconditions. For example, the specification of presume_landed asserts that the aircraft must already have entered the landing zone.
 

model AircraftAssignment {
// Uses Alloy textual notation
// Diagram is equivalent to domain and state parts
// Invariants were given as additional annotations to diagrams in lecture
// Operations have not been discussed yet

    domain {AIRCRAFT, RA}

    state {
      partition ACTIVE_AC, INACTIVE_AC : AIRCRAFT
      ENTERED_LANDING_ZONE_AC : AIRCRAFT
      LANDED_AC : ENTERED_LANDING_ZONE_AC
      ASSIGNED_AC : AIRCRAFT
      assigned_to : ASSIGNED_AC -> RA !
      }

    def INACTIVE_AC {
      INACTIVE_AC = AIRCRAFT - ACTIVE_AC
      }

    inv {
      LANDED_AC in INACTIVE_AC ???
      }

    op fp_add (ac: AIRCRAFT' !) {
      INACTIVE_AC := INACTIVE_AC + ac
      }

    op fp_delete (ac: AIRCRAFT !) {
      AIRCRAFT := AIRCRAFT - ac
      }

    op fp_amend (ac: AIRCRAFT !) {
      }

    op assign (ac: AIRCRAFT !) {
      ac in ACTIVE_AC - ASSIGNED_AC
      ASSIGNED_AC := ASSIGNED_AC + ac
      }

    op unassign (ac: AIRCRAFT !) {
      ac in ASSIGNED_AC
      ASSIGNED_AC := ASSIGNED_AC - ac
      }

    op enter_landing_zone (ac: AIRCRAFT !) {
      ac not in ASSIGNED_AC
      ASSIGNED_AC := ASSIGNED_AC + ac
      }

    op presume_landed (ac: ENTERED_LANDING_ZONE !) {
      LANDED := LANDED + ac
      }

    op become_active (ac: INACTIVE_AC !) {
      ACTIVE_AC := ACTIVE_AC + ac
      }

    op become_inactive (ac: ACTIVE_AC !) {
      INACTIVE_AC := INACTIVE_AC + ac
      }

    }

Questions: can become_active occur for an active aircraft?
Distinguish abstract operation from execution of function?
 

Abstraction Functions

The abstract state given by the object model is related to the values of aircraft record fields as follows:
 
Abstract state Concrete state
aircraft ac is active, inactive otherwise ac->active is TRUE
aircraft ac has landed ac->landed is TRUE
aircraft ac is assigned ac->ra_index is not NOT_SET
aircraft ac is assigned to RA ra Ra [ac->ra_index] = ra

A rough assignment of operations to functions in the code can be made. The difficulty is that we would like to associate operations with leaves of the call graph, but in most cases, the leaf function performs only one concrete aspect of the operation (eg, setting the aircraft record field or sending a message). It is therefore necessary to find a function higher in the call graph whose execution maintains all code invariants (see below). There are no doubt invariants we have missed; the functions listed here are those that appear to maintain the invariants discussed below.
 
 
Function Operations performed
update_aircraft:add_flight_plan* fp_add
ism_socket:ism_handle_fp_amd fp_amend
ism_socket:ism_handle_fp_del fp_delete
update_aircraft:set_inactive become_inactive, unassign
update_aircraft:update_inactive_ac_specific_data become_active, unassign (for which RA, ac?)
update_aircraft:update_active_ac_specific_data become_inactive ?
cm_send:handoff_aircraft_to_ra assign, unassign
cm_send:new_category_set_ra_owner assign ?
update_aircraft:check_entry_to_landed_zone enter_landing_zone
update_aircraft:check_for_landed_aircraft presume_landed

*Broadcast of flight plan addition and updating of aircraft tree and performed in distinct phases.

Code Invariants

A number of invariants are maintained by the code. Most significantly, the state of the aircraft tree is synchronized with the state of the RAs: for example, an RA should believe that it has ownership of an aircraft when that aircraft's entry in the tree indicates ownership by that RA. We assumed that the messages received by the RA update its internal state appropriately. The relevant messages are: It seems that the updating of the aircraft tree and the sending of messages is related as expected, with the following exceptions: The balancing of aircraft load amongst RAs is aided by a global array distribute_ac:number_of_ac_in_each_ra. The expected invariant that this does indeed track assignments correctly appears to be maintained, although the function
update_aircraft:load_and_send_one_aircraft sends ownership messages but does not update the array. This is perhaps because the function is only called for trial flight plans entered at the user interface, and maybe there are few enough of these for them to be safely ignored.

Anomalies

Functionality is not systematically allocated amongst the files ism_socket, update_aircraft and cm_send. For example, cm_send:forward_flight_plan_amend is called directly by ism_socket:ism_handle_fp_amend, but cm_send:forward_flight_plan_delete is called by update_aircraft:remove_flight_plan, which in turn is called by ism_socket:ism_handle_fp_del. Flight plan addition is by necessity treated quite differently, since broadcast and tree updating are in separate phases.

The code associated with category amendments appears to be a late addition. The function cm_send:send_category_amendment is documented incorrectly: its header claims that it is only called by read_ra and read_pgui, but it is in fact called by ism_socket:ism_handle_fp_amd. Moreover, it sets ac->ra_index to zero, thus assigning the aircraft to the first RA, without apparent justification.

The array number_of_ac_in_each_ra is not updated by update_aircraft: load_and_send_one_aircraft, violating the invariant that the array gives the number of aircraft assigned to RAs. Is this because the aircraft represents a trial plan with only a transient assignment?

The function cm_send:handoff_aircraft_to_ra appears to unnecessarily broadcast an UNSET_RA_OWNERSHIP message to RAs when the ra_index of the aircraft is not set. This presumes that an RA might believe it has ownership of an aircraft even though the aircraft tree record does not indicate it. Why? Can the CM die and be restarted while RAs remain running? If this were so, we would have expected more fault tolerance code of this sort throughout the CM.

The function update_aircraft:update_active_ac_specific_data broadcasts an AIRCRAFT_INACTIVE message but does not update the aircraft tree record (by setting ac->active to false). This seems to arise in the case in which an aircraft has passed its freeze horizon.

Followup Issues

Look into RA socket handling: include dead RAs in object model?
Categories of flight plans: code for changing category of desired flights.
What exactly does active mean? Look at RA code?
Look into unbuffered/buffered ism track input.
Lackwit search on ac->landed to check operation table.
Freeze horizon and active aircraft? See update_aircraft:update_inactive_ac_specific_data. Add a freeze event to model?
Aircraft ignored if outside center: subsequent handling.
Flight plan amendments after entering landing zone?