© http://people.csail.mit.edu/jaffer/Docupage/copyrights.html

copyrights

Current Version   Released     Terms
1.7               2003-02-01   GPL

The copyrights program searches through the given files looking for "copyright", "Copr.", "©", or "&copy;" (followed by years and name of holder in either order). It also recognizes the phrase "public domain". For each file it reports each distinct copyright once.
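
The recognition rule described above can be sketched with a regular expression. This is only an illustration in Python, not the program's actual source: the names MARKER, YEARS, NOTICE and distinct_notices are hypothetical, and the pattern is an assumption about what "years and name of holder in either order" might look like.

```python
import re

# Hypothetical pattern, not taken from the copyrights source:
# a marker, then years and holder name in either order.
MARKER = r"(?:copyright|copr\.|©)"
YEARS = r"\d{4}(?:\s*[-,]\s*\d{4})*"
NOTICE = re.compile(
    rf"{MARKER}\s*(?:{YEARS}\s+\S.*|\S.*?\s+{YEARS})|public domain",
    re.IGNORECASE,
)

def distinct_notices(lines):
    """Return (line_number, line) for the first occurrence of each distinct notice."""
    seen = set()
    found = []
    for number, line in enumerate(lines, start=1):
        match = NOTICE.search(line)
        if match is None:
            continue
        # Assumption: two notices are "the same" if they agree after
        # lower-casing and collapsing whitespace.
        key = " ".join(match.group(0).lower().split())
        if key not in seen:
            seen.add(key)
            found.append((number, line.strip()))
    return found
```

A line that mentions "copyright" without any year is ignored by this sketch, which keeps ordinary prose about copyright law from being reported.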


Quick Start

Usage

Usage: copyrights [-m NUM] [-n CNT] FILE1 FILE2 ...

  Displays at most NUM (default 10) of each FILE's copyrights occurring
  within the first CNT (default 1000) lines or groups of 100 chars.
  Returns 0 if a copyright is found; otherwise returns 1.

Usage: copyrights [-m NUM] [-n CNT] -
  As above, but reads filenames from standard input.
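
The NUM and CNT limits and the return value might behave roughly as in the following Python sketch. Everything here is an assumption made for illustration: the function names units, scan and exit_status are hypothetical, the substring test stands in for the real recognizer, and treating a long line as several groups of 100 characters is only one possible reading of the usage text.

```python
def units(text, width=100):
    """Split text into lines, breaking lines longer than `width` into groups."""
    for line in text.splitlines():
        if len(line) <= width:
            yield line
        else:
            # Assumption: a long line counts as several groups of `width` chars.
            for start in range(0, len(line), width):
                yield line[start:start + width]

def scan(text, max_shown=10, max_units=1000):
    """Return up to max_shown (unit_number, unit) hits within the first max_units units."""
    hits = []
    for number, unit in enumerate(units(text), start=1):
        if number > max_units or len(hits) >= max_shown:
            break
        if "copyright" in unit.lower():  # placeholder for the real recognizer
            hits.append((number, unit.strip()))
    return hits

def exit_status(texts, **limits):
    """Return 0 if any text yields a hit, otherwise 1, as the usage text describes."""
    return 0 if any(scan(text, **limits) for text in texts) else 1
```

One consequence of this reading is that a notice straddling a 100-character boundary would be missed; a real implementation would need to decide how to handle that case.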

Examples

$ copyrights [oO]vertones.*
Overtones.html:59:Copyright 2003 Aubrey Jaffer</ADDRESS>
overtones.png:30:Copyright 2003 Aubrey Jaffer/z
overtones.ps:184:Copyright 2003 Aubrey Jaffer) show

$ copyrights AnaLugojana.*
AnaLugojana.abc:29:Copyright 1999 Voluntocracy.
AnaLugojana.mid:1:Copyright 1999 Voluntocracy.
AnaLugojana.pdf:

$ copyrights sharpbang.c
sharpbang.c:2:   This program is in the public domain.

Further Development

This first release of copyrights is a simple proof of concept. The list of desired improvements is so extensive that the next program need not reuse anything in the current version.

Nearly every aspect of a future program incorporating these improvements will rely on approximate matching. Steady progress in genomic sequence-matching algorithms makes it possible to find close matches in linear time and space (some paper titles even tease sub-linear times).

The SLIB function diff:edit-length returns the edit distance between tokenized word sequences and should be usable for license matching. Dynamic programming applied to the search for copyright instances can likely do the rest (or see Curriculum Vitae: Gene Myers for approximate pattern-matching algorithms).
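
To make the idea concrete, here is a minimal Python sketch of word-level edit distance and of a dynamic-programming search for the closest occurrence of a phrase inside a longer text. It does not call or reproduce SLIB: the functions tokenize, edit_distance and best_match are hypothetical names, and the metric used is plain Levenshtein distance (insert, delete, substitute), which may count edits differently from diff:edit-length.

```python
def tokenize(text):
    """Lower-case a text and split it into word tokens."""
    return text.lower().split()

def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    previous = list(range(len(b) + 1))
    for i, token_a in enumerate(a, start=1):
        current = [i]
        for j, token_b in enumerate(b, start=1):
            cost = 0 if token_a == token_b else 1
            current.append(min(previous[j] + 1,          # delete token_a
                               current[j - 1] + 1,       # insert token_b
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]

def best_match(pattern, text):
    """Approximate search: smallest distance between `pattern` and any
    contiguous run of tokens in `text`, with the index where that run ends."""
    # Row 0 is all zeros, so a match may start at any position in the text.
    previous = [0] * (len(text) + 1)
    for i, token_p in enumerate(pattern, start=1):
        current = [i]
        for j, token_t in enumerate(text, start=1):
            cost = 0 if token_p == token_t else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    end = min(range(len(previous)), key=previous.__getitem__)
    return previous[end], end
```

With license texts tokenized this way, edit_distance gives a rough similarity score between a file's license and a known license, while best_match locates a near-copy of a phrase such as "in the public domain" inside a file. Both run in time proportional to the product of the two lengths, so they are slower than the linear-time methods mentioned above.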

Histocomputability is computation modeled on biological immune function. In this paradigm copyright is the Major Histocompatibility Complex presented by nearly every cell (data file) type in an individual organism (software package). The copyrights program assumes the function of CD4 and CD8 T-cells in recognizing self and non-self copyrights.

Copyright 2002, 2003 Aubrey Jaffer

I am a guest and not a member of the MIT Computer Science and Artificial Intelligence Laboratory.  My actions and comments do not reflect in any way on MIT.
agj @ alum.mit.edu