
Re: One Man's Search for Smaller Codebases

To: "David McClain" <address@hidden>
Subject: Re: One Man's Search for Smaller Codebases
From: "Christopher R Vincent" <address@hidden>
Date: Thu, 6 Dec 2001 08:14:41 -0500
Cc: address@hidden
Importance: Normal
Sender: address@hidden

It's always nice to see thoughtful comparisons of C/C++ (especially VC++/COM) with high-level languages.
As someone who has written a fair bit of both Common Lisp and C++/ATL/COM, I suggest that the real difference
in codebase size/maintainability comes with larger systems. In C++, I find myself copying and modifying code because
it is too difficult to abstract common operations properly. These are cases where I use Lisp macros extensively, especially in
network apps where you repeatedly do things like set up and process requests. Combine this with the expressiveness of
CLOS, and I think you really have a chance of limiting redundancy.

Unfortunately not many software engineers, myself included, take the time to do the recoding experiment you describe.
I know there have been some industrial-level comparisons of C++ and Lisp on large projects; can anyone speak to those?
Certainly some of the Symbolics guys on this list could speak to pros and cons of maintaining large Lisp systems.

What does this have to do with lightweight languages? Good Lisp programs have varying levels of abstraction, where
high-level logic can be implemented with Scheme-like simplicity, while lower-level code is highly optimized with type
declarations and more obscure syntax. I tend to carry this principle over to "mainstream languages" by implementing
high-level logic in socially acceptable scripting languages, and the optimized bits packaged as C++ components. I feel like
macros and a flexible object system in my lightweight language would help me develop my core system logic as a concise,
maintainable code base.

Thanks again for sharing your experience!

-Christopher

Sent by: owner-ll1-discuss@ai.mit.edu

To: <LL1-Discuss@ai.mit.edu>
cc:
Subject: One Man's Search for Smaller Codebases

Hi,
Someone just wrote and told me to post this note on your thread, saying it would be appropriate. He also said I would find interesting company here... so far I do see some interesting names, Steele, McKay, Felleisen, and several others... So here goes...
I posted this on the OCaml thread last night, and on the comp.lang.lisp thread too... Understand that part of the motivation for this study is that I have 300 KLOC of working codebase to maintain against bit-rot, and induced bit-rot via OS changes beneath me... I'm eager to find a way to minimize this maintenance activity, do it correctly, and robustly.
-------------------------------------------------
I just finished my experiment to reduce the size of a fielded application by recoding it in either Lisp or OCaml. I had early indications that, aside from pure ease of programming in these HLL's, the overall code base would be drastically reduced (5x to 6x). That is certainly true if you count all the source code needed to produce the application, but an honest, impartial comparison of the lines I actually had to write, of non-reusable, application-specific code, shows somewhat disappointing results on this basis alone.
The application is a system network server that performs recursive prefix mappings of file pathnames, including environment variable substitutions. This is a variation on the system provided by the Sprite experimental OS developed at UCB by John Ousterhout et al. in the late 1980's and early 1990's.
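To make "recursive prefix mapping with environment variable substitution" concrete, here is a minimal C++ sketch of the general idea. It is not the code of the application described here: the `Table` type, the function names `expand_vars` and `map_path`, the `${NAME}` variable syntax, the longest-prefix rule, and the depth limit are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical table type, used both for prefix -> target and for the environment.
using Table = std::map<std::string, std::string>;

// Replace every ${NAME} with its value from env; unknown names expand to "".
// (The ${NAME} syntax is an assumption, not taken from the original server.)
std::string expand_vars(const std::string& s, const Table& env) {
    std::string out;
    std::size_t i = 0;
    while (i < s.size()) {
        if (s.compare(i, 2, "${") == 0) {
            std::size_t close = s.find('}', i + 2);
            if (close != std::string::npos) {
                auto it = env.find(s.substr(i + 2, close - i - 2));
                if (it != env.end()) out += it->second;
                i = close + 1;
                continue;
            }
        }
        out += s[i++];
    }
    return out;
}

// Rewrite the longest prefix that matches on a '/' boundary, then map the
// result again, until nothing matches. max_depth guards against cyclic tables.
std::string map_path(const std::string& path, const Table& prefixes,
                     const Table& env, int max_depth = 16) {
    std::string current = expand_vars(path, env);
    for (int depth = 0; depth < max_depth; ++depth) {
        const std::string* best_prefix = nullptr;
        const std::string* best_target = nullptr;
        for (const auto& entry : prefixes) {
            const std::string& prefix = entry.first;
            bool matches =
                current.compare(0, prefix.size(), prefix) == 0 &&
                (current.size() == prefix.size() || current[prefix.size()] == '/');
            if (matches && (!best_prefix || prefix.size() > best_prefix->size())) {
                best_prefix = &prefix;
                best_target = &entry.second;
            }
        }
        if (!best_prefix) return current;  // no prefix applies: mapping is done
        current = expand_vars(*best_target, env) + current.substr(best_prefix->size());
    }
    throw std::runtime_error("prefix mapping did not terminate");
}
```

With a table that maps `/proj` to `/net/${HOST}/proj` and `/net/alpha` to `/mnt/alpha`, and an environment where `HOST` is `alpha`, `map_path("/proj/src/main.c", ...)` is rewritten twice and ends up as `/mnt/alpha/proj/src/main.c`.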
The existing version was coded in M$ VC++ making heavy use of STL. It is a COM/OLE process server based on M$ ATL. All three versions retain the machine-generated ATL wrapper code for this COM/OLE behavior -- I only needed to write a few lines of IDL to produce the basic skeleton, and all three versions use identical stuff here...
For the application specific coding, the scores are:
Existing App: C/C++ = 1106 LOC
Lisp Version: C/C++ = 284 LOC, Lisp = 798 LOC --> Total = 1082 LOC
OCaml Version: C/C++ = 284 LOC, Lisp = 58 LOC, OCaml = 453 LOC --> Total = 888
These LOC counts do *not* include blank lines and comment only lines.
On the basis of code-base size reduction, these results are nearly a tie.
But on the basis of ease of programming, I have to award Lisp first, followed by OCaml, and distantly trailed by C/C++. The reasons for this are:
1. Lisp is a huge language with nearly everything you need already built in. But it produces very bulky DLL's -- on the order of 15 MBytes.
2. OCaml is as terse as Lisp, or even slightly better, but needs a fair number of additional support routines written to cover the application's needs. Some of this is in C/C++ (very little) but most has to do with providing things like unwind-protect, generalized string handling, and generalized list operations. It produces very fast runtime code (not needed here) and quite reasonably sized DLL's -- about 300 KBytes (50x smaller than Lisp!!)
3. C/C++, making heavy use of classes and STL, is nearly unreadable, took a long time to program, and is frightening to revisit after some time away from it (1 year or more since original writing). C/C++ retains the capability to utilize Unicode (FWIW -- I don't really need it), but it was written with some embedded bugs that I found only when I was able to remain at the abstract levels permitted by HOL's.
Both the Lisp and OCaml versions were written in the course of 2-3 hours. Writing the C/C++ version took the better part of 1 week. Prior to that I had written experimental versions in Lisp and had more than a year of playing with the system to get an understanding of the needed algorithms.
I will say that both Lisp and OCaml allowed me to spot some errors in the C/C++ implementation, fix those errors, and add some extra capability (about 20 LOC in both Lisp and OCaml for the extra stuff). I estimate the time needed to go back and refamiliarize myself with STL and the internal architecture of the existing application -- in order to fix the bugs I discovered and add the additional capabilities -- would be several days.
I find it remarkable that OCaml has a slight edge on Lisp for terseness of expression. OCaml is a highly expressive syntax and you can say quite a lot in a few keystrokes. Lisp tends to be more wordy, use longer identifiers, and the code is quite a bit sparser for semantic content over a given number of LOC.
This is as close as I can come to providing an honest, impartial, comparison of these languages for the purpose of rewriting existing code to be more maintainable, robust, and correct. I definitely think the effort is worthwhile, but not entirely for the reasons I had originally anticipated.
Cheers,
- David McClain, Sr. Scientist, Raytheon Systems Co., Tucson, AZ
