6898: Advanced Topics in Software Design: Tasks

Tasks are to be completed before the given lecture.

Reading Questions

The purpose of the reading questions is to prod you to think as you read, and to encourage you to do the reading before the relevant class. The questions are not intended to be hard or take much time to answer; 5 lines or so should be enough to answer each. For each class for which answers are due, please send me (at the address dnj+6898 at mit.edu) a plain text email message with your answers. I will read your answers, but you shouldn't expect comments in response. I won't grade them, but I'll expect students taking the class for credit to make a good effort.

Monday May 13, 2002

Final project presentations today and Wednesday. Your write-up is due in Dan Wilson's office (NE43-511) no later than noon on Thursday May 16. Final presentation schedule is here.

As explained in class, your write-up should contain the following sections, and should be 2-4 pages in length. You can attach any relevant artifacts as appendices (models, code, etc) along with a commentary explaining them.

Abstract: a pithy summary of the problem you addressed, and what you achieved. No more than 100 words.
Background and Motivation: an explanation of the problem in its context.
Summary and Evaluation: a summary of what you did and an evaluation of how well it met your initial goals.
Lessons Learnt: what you learnt from the entire experience (whether the problem was hard or easy, nice/nasty features of the language or tool, surprises, etc)

Wednesday May 8, 2002

Richard Tibbetts will be leading a discussion about type classes. Thanks to Richard for also providing today's questions.

Read about Haskell's type classes in the original paper and in the Haskell manual:

Philip Wadler and Stephen Blott. How to make adhoc polymorphism less ad hoc. Proc. 16th ACM Symposium on Principles of Programming Languages, pp. 60--76, 1989. Available here
Paul Hudak, John Peterson, Joseph Fasel. A Gentle Introduction to Haskell. Revised June, 2000 by Reuben Thomas. See section: Type Classes and Overloading. http://www.haskell.org/tutorial/classes.html. Available here.

Then answer these questions:

How do type classes compare to Java Interfaces? Mixins? Aspects? Features?
What exactly is ad hoc polymorphism? Do other programming languages have it? How is it implemented? What kind of language features besides type classes can support it?
What kinds of decompositions are type classes good for? Do you have an example of a system where they would be useful?

Monday May 6, 2002

From Martin Fowler's refactoring book

Refactoring: Improving the Design of Existing Code. Martin Fowler with contributions by Kent Beck, John Brant, William Opdyke and Don Roberts. Addison-Wesley, 1999

read the following excerpts: a chapter explaining the key idea; a chapter about spotting problems; and a few sample refactorings from the catalog:

Preface - "What is Refactoring? ", pp. xv - xvii
Chapter 3 - "Bad Smells in Code" (by Kent Beck and Martin Fowler), pp. 75 - 88
Chapter 5 - "Toward a Catalogue of Refactorings", pp. 103 - 107;
Duplicate Observed Data, pp. 189 - 196;
Change Unidirectional Association to Bidirectional, pp. 197 - 199;
Encapsulate Collection, pp. 208 - 216;
Separate Domain from Presentation, pp. 208 - 216

Read this paper too:

Yoshio Kataoka, Michael D. Ernst, William G. Griswold, and David Notkin. Automated Support for Program Refactoring using Invariants. Proceedings of the International Conference on Software Maintenance (ICSM'01), (Florence, Italy), November 6-10, 2001, pp. 736-743. Available here.

Then answer these questions:

Do you buy the argument that up-front design is often premature? Or would good design obviate the need for some of these refactorings?
What kind of tool support would help with refactoring? What might go wrong if you refactor badly or clumsily?
Do the techniques described in the paper apply to all refactorings? What other kinds of analysis might be used to find opportunities for refactoring?

Wednesday May 1, 2002

I'm giving a talk about research strategy. No tasks in advance for you.

Monday April 29, 2002

Read the introductory chapter of Nam Suh's recent Axiomatic Design book:

Nam Pyo Suh Axiomatic Design: Advances and Applications. Oxford University Press, 2001.

This book is pretty hard going, but contains some interesting ideas, so here are some hints for how to go about reading it. Skim the whole chapter first, focusing on the case studies in the boxes. You can omit the material on the Information Axiom (from page 39 to page 53); our focus is on the Independence Axiom. Read carefully the paragraph beginning "Furthermore" on page 21; this is where the motivation for the notion of coupling is explained. Then read the questions, and reread the chapter with the questions in mind.

The model of design here is reminiscent, in the software realm, of top-down design and the distinction between "what" and "how". Zig-zagging is much like the idea of stepwise refinement from the 1970's. Do you think this is a good approach for software development? You might want to look back at Michael Jackson's paper on Problem Frames (available here) which argues for a completely different notion of the distinction between domains.
What are FR's and DP's? Do they have analogies in software? Does the process domain exist in software?
An ideal design has the same number of FR's and DP's. Does this make sense for software?
What do the elements of the design matrix that relates FR's to DP's mean? What would they mean in the context of software?
Roughly speaking, an uncoupled design allows a single DP to be changed to respond to a change in an FR. A decoupled design allows a set of DP's to be adjusted without iteration. Why are these notions important in mechanical engineering? Do they have a significance in software engineering? Is there a relationship to notions of requirements traceability, or to Parnas's notion of avoiding duplication in which a single function is implemented in more than one module?
Roughly speaking, an uncoupled design allows a single DP to be changed to respond to a change in an FR. A decoupled design allows a set of DP's to be adjusted without iteration. Why are these notions important in mechanical engineering? Do they have a significance in software engineering?
In axiomatic design, a system-wide matrix must be constructed for the entire decomposition to ensure that a lower-level decomposition into DP's does not introduce coupling between FR's at a much higher level. Is this a desirable feature of the design method?

Wednesday April 24, 2002

Read two papers about the Design Structure Matrix (DSM). The first paper (in HBR) is an easy-to-read overview of the DSM and its use in organizing product development tasks. The second is about using coupling between components to develop an architecture that chunks them into clusters appropriately.

Steven D. Eppinger. Innovation at the Speed of Innovation. Harvard Business Review, January 2001.
Thomas U. Pimmler and Steven D. Eppinger. Integration Analysis of Product Decompositions. DE-Col. 68, Design Theory and Methodology (DTM 1994), American Society of Mechanical Engineers.

Then answer these questions:

How do dependences in the DSM differ from the kind of dependences we've been studying?
Where role do specs play in the first paper?
How might the DSM apply to software? Consider, for example: whether a software system's architecture is more of less fluid than a mechanical system's; whether specs play a greater or lesser role; whether software iterations can be planned in advance; etc.
The second paper considers various axes on which components might interact. What analogous axes are there for software? Is there a relationship to aspects?

Monday April 22, 2002

In class, we'll continue our discussion of dependences by considering some common idioms that arise in object-oriented programs, and seeing how we can capture their dependence structures. We'll compare the new dependence diagram that I described last time with Parnas's uses relation.

There's no additional reading for this class. Just make sure you've read and understood the papers from last time, especially the Parnas paper. You will probably also want to look over the slides from the last class to make sure you understand the elements of the new dependence diagram.

Your task is to construct some dependence diagrams in advance of class so that you have thought about the issues and are ready to discuss them. Please bring your answers to these questions on paper to class on Monday.

Consider a phone book program that provides storage and retrieval of phone numbers and addresses. For each of these fragments of the program, draw two diagrams: a Parnas uses diagram, and a new dependence diagram as described in my last lecture.

Suppose only lookup on names is provided. Draw dependences for a configuration of a main class P, a map class M, a string class S for the keys, and a class T that implements immutable telephone numbers.
Suppose lookups on addresses are possible, and that addresses are mutable. Draw dependences for a configuration of a main class P, a map class M used to map addresses to phone numbers, a string class S for the phone numbers, and a class A that implements mutable addresses.
Suppose we want to support lookup of names that don't match exactly, so we change our map abstraction so that its constructor method takes a comparator object that is used later to determine whether two keys are equivalent. Draw dependences for a configuration of a main class P, a map class M used to map names to phone numbers, a class N for names, a comparator class C, and a class T for telephone numbers.

Wednesday April 17, 2002

Exercise 2 is due at the start of class today.

Read Parnas's seminal paper on the 'uses' relation, and the excerpts on dependences and coupling gathered from some recent textbooks. Copies of these excerpts are available in Dan Wilson's office (NE43-511).

David L. Parnas. Designing software for ease of extension and contraction. IEEE Trans. Software Eng. SE-5, 2 (1979). Available here.
Program Development in Java: Abstraction, Specification and Object-Oriented Design. Barbara Liskov with John Guttag, Addison-Wesley Pearson Education, 2001. Chapter 14 - Between Design and Implementation pp.356-360, 1.2 "Structure".
Design Patterns: Elements of Reusable Object-Oriented Software Erich Gamma, Richard Helm, Ralph Johnson & John Vlissides; Foreword by Grady Booch, Addison-Wesley Professional Computing Series, 1995. Chapter 1.7 - How to Select a Design Pattern pp.23-25, "Designing for Change"; Observer, pp. 293-303.
Software Engineering - Theory and Practice. Shari Lawrence Pfleeger, Prentice-Hall, 1998. Chapter 5 - Designing the System, Section 5.5 (pp.216-228), "Characteristics of Good Design".
The Unified Modeling Language Reference Manual. James Rumbaugh, Ivar Jacobson & Grady Booch, Addison-Wesley Object Technology Series, 1999. Encyclopedia of Terms - pp.250-252, "dependency".
UML Distilled, Second Edition - A Brief Guide to the Standard Object Modeling Language. Martin Fowler with Kendall Scott; Foreword by Grady Booch, Ivar Jacobson & James Rumbaugh, Addison-Wesley Object Technology Series, 2000. Chapter 7 - Packages and Collaborations pp.108-113, "Packages"

Then answer these questions:

There are several notions of dependence here: Parnas's uses relation; Liskov's module dependences; coupling from structured analysis; package dependences in UML. Are they the same? Are some more appealing than others? How clearly do the various authors define these notions?
How much of software design is about eliminating or reducing dependence? Can one have two few dependences?

Wednesday April 10, 2002

Greg Sullivan will be our guest speaker. His topic will be "Design Patterns in a Dynamic Language". Read the paper in advance of class.

Monday April 1, 2002

The second exercise is available here. It's due on Wednesday April 17.

Read the 1997 ECOOP papers on feature-oriented and aspect-oriented programming. You may also want to read the more recent paper on Aspect-J, which applies aspects to Java. More information on aspects is available at www.aosd.net, and on AspectJ at www.aspectj.org.

Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Videira Lopes, Jean-Marc Loingtier, and John Irwin.Aspect-Oriented Programming. Proceedings of the European Conference on Object-Oriented Programming (ECOOP), Finland. Springer-Verlag LNCS 1241. June 1997. Available here.
G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. G. Griswold. An overview of aspectj. In ECOOP, pages 327-353, 2001. Available here.
Christian Prehofer. Feature-Oriented Programming: A Fresh Look at Objects. Proceedings of the European Conference on Object-Oriented Programming (ECOOP), Finland. Springer-Verlag LNCS 1241. June 1997. Available here.

Then answer these questions:

What's the difference between an aspect and a feature?
Could aspects be the fundamental structuring mechanism of a program, or are they always an add-on, something considered afterwards for incremental development?
Are aspects and features compatible with the paradigm of object-oriented programming? Would they be more or less easily implemented in Objective CAML?

Wednesday March 20, 2002

In class, we'll discuss the module system of Objective Caml, which is based on that of Standard ML, but with better support for separate compilation. To prepare for this, read Leroy's paper:

Xavier Leroy. Manifest Types, Modules and Separate Compilation. Proc. 21st Symposium on Principles of Programming Languages (POPL), 1194, pp. 109--122. Available here.

For background on what structures, signatures and functors are all about, read the relevant parts of Bob Harper's book on programming with Standard ML, which is full of insights and wonderful examples, and is very clearly explained. Part 3 explains the module system. If you plan to use OCaml in your programming exercise, you should also read the relevant parts of the OCaml manual. Part 1 is a tutorial; Chapter 2 is about modules. It doesn't explain much, but it does give some nice examples.

here

http://www.cs.cmu.edu/~rwh/

Xavier Leroy, with Damien Doligez, Jacques Garrigue, Didier Remy and Jerome Vouillon. The Objective Caml System. Release 3.04. Documentation and User's Manual. December 10, 2001. Institut National de Recherche en Informatique et en Automatique (INRIA). Available here, or at http://caml.inria.fr/ocaml/htmlman.

To really get to grips with Caml, download the distribution and play with it. You'll find distributions and much more at http://www.ocaml.org.

Then answer these questions:

What practical advantages does ML's module system offer over Java's? Some issues to consider: separate compilation; interfaces vs. classes; exporting types, exceptions and values together; whether packages have specifications; making dependences explicit.
ML uses sharing constraints to solve the 'diamond import problem'. Where do sharing constraints crop up in Units? Why are they not needed in Java?

Wednesday March 13, 2002

Read these papers:

R. Findler and M. Flatt. Modular Object-Oriented Programming with Units and Mixins. International Conference on Functional Programming, 1998. Available here.
M. Flatt, and M. Felleisen. Units: Cool modules for HOT languages. In Proc. Sigplan 1998 Conference on Programming Language Design and Implementation, 236--248. Available here.
S. McDirmid, M. Flatt, and W. Hsieh. Jiazzi: New Age Components for Old Fashioned Java. In Proc. of the ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) 2001. Available here.

For an interesting application of units, optionally read this one too:

Reid, Flatt, Stoller, Lepreau, and Eide. Knit: Component Composition for Systems Software. OSDI 2000. Available here.

Then answer these questions:

Compare the linking model in units to Cardelli's theoretical model. What are some of the key differences? Consider things like ability to hide fragments, packaging exports into groups, subtyping, mutual references, etc.
In the mixins paper, Findler and Flatt say that "... support [for] external connections [is] crucial to software engineering" (Section 7, Page 8). What does this mean? Does Java provide it? And do you believe this claim?
The units framework allows linking to be done at runtime, with the links selected dynamically. What useful things could be done with this feature?

Monday March 11, 2002

Read Cardelli's paper:

Luca Cardelli. Program Fragments, Linking, and Modularization. POPL 97. Available here.

Don't be intimidated by the volume of mathematical material; the paper's actually very straightforward. It gives a formal model of linking that treats the linking step as substitution of an expression for a variable; this allows the standard technique for describing type checking in lambda calculus to be applied to type checks within and across modules. (If you're not familiar with basic notions of lambda calculus and type inference, though, you may find it a bit of a struggle.) Answer these questions:

Cardelli asks "When does a module system really support modularization?". What does he mean by "modularization", and what else would a module system offer?
Give an example of a linkset with two different linking paths (that is, two different sequences of steps that resolve the cross references between modules).
What does the restriction in Definition 6.1 -- that one of the environments involved must be empty -- mean in practice?
Suppose linking were not confluent. What problems would this cause in practice?
How well does this theory fit the linking model of Java?

Wednesday March 6, 2002

Read the paper about problem frames:

Michael Jackson. Problem Analysis and Structure. In Proceedings of NATO Summer School, Marktoberdorf, August 2000. Available here.

Then answer these questions:

Jackson says: "Even methods and approaches that claim the title of 'problem analysis' usually prove, on closer inspection, to deal entirely with putative or outline solutions." Does this criticism apply to Fowler's Analysis Patterns?
What are the domains and shared phenomena in your lift description? Which problem frames fit it?

Monday March 4, 2002

Read the chapters from Fowler's book, and the new chapter on Accountability patterns (but feel free to skip the implementation discussions):

Martin Fowler. Analysis Patterns: Reusable Object Models. Addison Wesley Longman, 1997. Chapters 1 and 5.
Martin Fowler. Accountability Patterns. New version of Chapter 2, of Analysis Patterns. Available at www.martinfowler.com, or here.

Then answer these questions:

What exactly are these patterns of? Who would use them?
Give an example of an instance of one of the accountability patterns in the organization of MIT.
Make precise some of the "uniqueness constraints" discussed in the Identification Scheme pattern of Chapter 5 using Alloy.

Wednesday February 27, 2002

Read the following:

J. Michael Spivey. An Introduction to Z and Formal Specifications. Software Engineering Journal IEE/BCS, 1989. Vol 4, No 1, pp. 40-50. Available here.
J. Michael Spivey. Specifying a Real-Time Kernel. IEEE Software, September, 1990, pp. 21-28. Available here.
J.C.P. Woodcock. Structuring Specifications in Z. Software Engineering Journal IEE/BCS, 1989. Vol 4, No 1, pp. 51-66. Available here.

Read enough of the following to get some idea of what ASMs are about:

[G93] Yuri Gurevich. Abstract State Machines: The ASM Tutorial. Or: "Evolving Algebras: An Attempt to Discover Semantics". Current Trends in Theoretical Computer Science, eds. G. Rozenberg and A. Salomaa, World Scientific, 1993, 266-292. Available here. See also the ASM website: http://www.eecs.umich.edu/gasm/.

Look into OCL, the constraint language of UML, by reading some of the stuff on IBM's web site http://www-3.ibm.com/software/ad/library/standards/ocl.html.

Also look at the support for OCL offered by the University of Bremen's USE tool: http://www.db.informatik.uni-bremen.de/projects/USE/.

Answer the following questions:

How do signatures in Alloy compare to schemas in Z? Are there things you can do with schemas but not with signatures, and vice versa?
Do you accept the argument that OCL is simpler and easier to use than Z because it's less mathematical?
What does the USE tool do? How does this compare to the Alloy Analyzer?
Which of these languages (OCL, Z, ASM, Alloy) is appropriate for modelling object-oriented systems?

Monday February 25, 2002

Exercise 1 is due at the start of class today.

Wednesday February 20, 2002

The first exercise is now available here. It's due on Monday February 25th, but I strongly encourage you to start it early. First, it's quite challenging, and if you leave it until the last minute, you may not get it done. Second, it should be fun, and you won't enjoy it if you're under pressure. Third, I find that elapsed time helps: if you get stuck, a day away from it will clear your mind and your subconscious may well get you unstuck. Fourth, you'll benefit from the peer review meeting on February 20th if you've already thought about the problem. You should aim to have at least started the main problem (the Elevator System design) by that date.

Wednesday February 13, 2002

Read the following:

[JS00] Daniel Jackson and Kevin Sullivan. COM Revisited: Tool-Assisted Modelling and Analysis of Complex Software Structures. ACM SIGSOFT Conference on Foundations of Software Engineering, San Diego, CA, November 2000. Available at: http://theory.lcs.mit.edu/~dnj/publications.html#com.
[KJ00] Sarfraz Khurshid and Daniel Jackson. Exploring the Design of an Intentional Naming Scheme with an Automatic Constraint Analyzer. Proc. 15th IEEE International Conference on Automated Software Engineering, Grenoble, France, September 2000. Available at: http://theory.lcs.mit.edu/~dnj/publications.html#ase-00.
A model of the leader election phase of the Firewire protocol. Daniel Jackson, January 2002. Available at http://theory.lcs.mit.edu/~dnj/6898/models/firewire.als. This model uses orderings defined in another module, available at http://sdg.lcs.mit.edu/alloy/models/ord.als, but you should be able to understand what's going on without reading this other module. The basic idea is very simple: the model defines states, and uses orderings to define a sequence of states, with a first state, a last state, and relation that maps each state except the last to the one that follows it.
A model of a simple distributed spanning tree algorithm. Ilya Shlyakhter and Manu Sridharan, January 2002. Available at http://theory.lcs.mit.edu/~dnj/6898/models/opt_spantree.als. This model also uses orderings.

Answer the following questions:

What's one key difference between the style of the Firewire and spanning tree models on the one hand, and the style of the INS model on the other? What are the pragmatic consequences of this difference?
You could imagine analyzing INS, Firewire or the spanning tree, using a simulator: write the algorithms in code, and explore executions by some kind of backtracking search. Why is the Alloy Analyzer's analysis more powerful than this? And why couldn't you do this kind of simulation for the COM example?
Are there parts of these models that you found obscure or hard to understand? If so, which?

Monday February 11, 2002

Read the following items:

[J02] Daniel Jackson. Micromodels of Software: Lightweight Modelling and Analysis with Alloy. February 2002. Available at: http://sdg.lcs.mit.edu/alloy/book.pdf. This is a draft of a book. Read Chapters 1, 2 and 3.
[JSS02] Daniel Jackson, Ilya Shlyakhter and Manu Sridharan. A Micromodularity Mechanism. ACM SIGSOFT Conference on Foundations of Software Engineering / European Software Engineering Conference, Vienna, September 2001. Available at: http://theory.lcs.mit.edu/~dnj/publications.html#micromodularity. This paper describes the key structuring mechanism in Alloy -- the signature. It's much shorter than the relevant book chapters (2 and 3), but people say they find it hard going, perhaps because it's too succinct.
[J00] Daniel Jackson. Automating Relational Logic. ACM SIGSOFT Conference on Foundations of Software Engineering, San Diego, CA, November 2000. Available at: http://theory.lcs.mit.edu/~dnj/publications.html#alcoa-algorithm.

This paper explains the technology that underlies the Alloy Analyzer. Since the analyzer compiles Alloy into an intermediate language that's essentially the relational logic I presented in the first lecture, yuo don't need to understand the full language to make sense of it. If you really want to understand how the analysis works, read this carefully (although you can skip some of the complicated stuff about representing environments as trees). For this class, you don't need to understand how the analyzer works, but you might want to skim this paper anyway.

Answer the following questions:

In Alloy, an expression always denotes a relation. Suppose Alloy had a built-in notion of scalars as well, so the meaning of an expression could be a single atom. What effect would this have on the syntax of the language? Would it make the language more expressive?
I chose to use a dot for relational join in Alloy, so that navigation expressions would look familiar to programmers. How do the dot in Alloy and the field dereferencing dot in Java differ?
To check that you really understand the relational operators, try and express the following two constraints without using quantifiers:
1. Every Course 6 student takes some (that is, at least one) Course 6 class.
2. Every geek takes every Course 6 class.
You can assume that we have basic types STUDENT, CLASS and COURSE, and the following relations: _takes_, a binary relation from STUDENT to CLASS that maps s to c when s takes class c; _course_, a binary relation from CLASS to COURSE that maps cl to co when cl is a class in MIT course co; _Six_, a unary relation on COURSE that contains a single tuple, and which represents Course 6; _SixStudents_, a unary relation on STUDENT which represents the set of students enrolled in Course 6; and _Geek_, a unary relation on STUDENT which represents the set of students who are geeks.