2004-05-17  Jason Rennie

	* release 1.3.8
	* debian: delete directory; don't want to step on Jens Peter
	Secher's toes (official Debian ifile maintainer); wait for guidance
	* opts.c (parse_opt): allow use of concise option with loocv_folder

2006-05-17  Paolo

	* util.c, ifile.c: fix LOOCV query (as I got it): if the folder is
	not already there, don't actually create it, not even temporarily
	for the next query.
	* Makefile.in: add STRIP

2006-05-16  Paolo

	* release 1.3.7
	* Makefile.in: add DESTDIR
	* argp/Makefile.in: add DESTDIR
	* debian: new directory w/ Debian-related files

2006-05-14  Paolo

	* release 1.3.6
	* util.c (ifile_print_ratings, ifile_concise_ratings): fix segv if
	no folders in DB or #folders less than 2 and -T is set

2004-12-12  Jason Rennie

	* release 1.3.5

2004-12-11  Derek Peschel

	* configure.ac: Check type of ssize_t which database.c uses.

2004-05-01  Jason Rennie

	* update 1.3.4

2004-05-01  Paolo

	* ifile.1: update for threshold option
	* opts.c (-T): update --help documentation

2004-04-30  Jason Rennie

	* release 1.3.4

2004-04-30  Paolo

	* ifile.c: add args.thresh to ratings calls
	* include/ifile.h: add int thresh to arguments
	* opts.c (options, parse_opt): add threshold option
	* util.c (ifile_print_ratings, ifile_concise_ratings): print
	threshold info

2003-07-28  Jason Rennie

	* release 1.3.3

2003-07-25  Jason Rennie

	* BUGS: removed
	* configure: recompile (Re: Lubomir's patch)
	* argp/configure: recompile (Re: Lubomir's patch)

2003-07-25  Lubomir Sedlacik

	* configure.ac: don't use __attributes__ for non-GCC compilers
	* argp/configure.in: don't use __attributes__ for non-GCC compilers

2003-06-10  Jason Rennie

	* release 1.3.2

2003-05-26  Michael Hohmuth

	* ifile.c: add setlocale command; fixes bug 1500.

2003-05-22  Jason Rennie

	* opts.c: Change // comment to /**/ (use C style comments)

2003-04-16  Jason Rennie

	* release 1.3.1
	* opts.c: update mailing list address

2003-04-11  Jason Rennie

	* database.c: make error messages refer to temp_data_file; free
	temp_data_file

2003-03-30  Jason Rennie

	* database.c (ifile_write_db): check for errors while writing
	database.  Fixes bug 2955.

2003-02-13  Dave Marquardt

	* Fixed problems in configure and autoconf scripts for the case of
	selecting non-gcc C compiler.  Fixes bug 2535.

2003-02-07  Jason Rennie

	* release 1.3.0
	* configure.ac: new file (rename from "configure.in")
	* Makefile.in (DIST_FILES): remove configure.in

2003-02-11  Dave Marquardt

	* include/extendable_array.h: Fixed bug in EXT_ARRAY_INIT_N_SET()
	macro.  Fixes bug 2516, where new folders caused the database to be
	garbled.

2003-01-29  Dave Marquardt

	* Performance improvements: On a Sun SPARC-based system running
	Solaris 10, with a database of about 16500 words, got a 40%
	improvement in words processed per second when reading the
	database, using the changes listed here.
	* include/ifile.h: Changed prototypes for readline(),
	ifile_read_header() and ifile_read_word_frequencies() to reflect
	new calling conventions.
	* include/extendable_array.h: New macro EXT_ARRAY_INIT_N_SET()
	combines the effects of EXT_ARRAY_INIT() and multiple calls to
	EXT_ARRAY_SET() in a smarter way, saving many realloc() calls and
	many manipulations of the array metadata.
	* utils.c: Rewrote readline to take a char** bufp and use the data
	within *bufp to parse a line, and update *bufp to point beyond the
	first line.  This avoids at least one copy of the data.
	* primes.c: Cast values returned by ifile_realloc() correctly.
	* int4str.c: Cast argument to free() to void * in
	ifile_int4str_free_contents(), to fix compiler complaints.
	* hash.c: Added an include of to fix compiler complaints.
	* database.c: Made ifile_read_db() read the whole database in one
	fell swoop and modified callers of readline() to just pass in a
	pointer to the buffered database.  Also made
	ifile_read_word_entry() call a new macro EXT_ARRAY_INIT_N_SET() in
	place of EXT_ARRAY_INIT() and multiple calls to EXT_ARRAY_SET(),
	saving many calls to realloc() and many manipulations of the
	extendable array metadata.

2002-11-25  Jason Rennie

	* release 1.2.1
	* ifile.c: don't dump core if database doesn't exist and user gives
	one of these options: -q, -Q, -l, -u
	* Makefile.in: make "dist" upload to Savannah file area

2002-10-31  Jason Rennie

	* release 1.2.0
	* ifile.tcl, user.tcl, ifilter.mh.pl, irefile.mh.pl,
	knowledge_base.mh.pl, news2mail.pl: removed (added to new mh-ifile
	package)
	* FAQ: removed (converted to html, posted on web page)
	* various: update mailing list web page & e-mail address
	* Makefile.in: remove all references to removed files
	* INSTALL: delete mh-specific instructions (moved to mh-ifile)
	* experiment.mh.pl: removed (unnecessary)

2002-10-26  Jason Rennie

	* ifile.c: error if -u folder doesn't exist

2002-10-16  Jason Rennie

	* release 1.1.5

2002-09-27  Jason Rennie

	* release 1.1.4
	* ifile.1: add reference to FAQ file for database format

2002-09-26  Andreas Piesk

	* ifile.c (main): check that message isn't NULL before freeing it

2002-09-19  Camillo Särs

	* Makefile.in (install): mode 0755 for MAIN_EXECUTABLES and
	PERL_RUNNABLE_FILES
	* Makefile.in (include/ifile.h): add "-f" to mv; make target
	$(srcdir)/include/ifile.h (instead of $@)

2002-09-11  Jason Rennie

	* release 1.1.3
	* INSTALL: update
	* argp: add ldmalloc stuff

2002-09-10  Jason Rennie

	* release 1.1.2
	* opts.c: remove -h from --concise option
	* ifile.c (main): move ifile_age_words out of message reading
	loops; make individual message print ifile_verbose; time reading
	all messages rather than individual messages
	* Makefile.in (DIST_FILES): add configure.in

2002-09-10  Aaron M. Ucko

	* configure.in: add --with-dmalloc option

2002-09-10  Nathan

	* Makefile.in: use $(srcdir) to specify location of man pages

2002-09-10  Jason Rennie

	* release 1.1.1
	* opts.c: remove -h from --concise option

2002-09-09  Jeremy Brown

	* release 1.1.0
	* util.c: htable_put calls should no longer use strdup
	(ifile_free): does this function do anything useful?
	* stoplist.c (ifile_stoplist_free): free the stoplist data
	structure on exit (used only when debugging memory leaks.)
	* int4str.c (ifile_int4str_free_contents): free contained strings
	too.
	* ifile.c: Completely restructured main loop: read one message,
	process it, free the memory it consumed; iterate.  Save database
	only after all messages processed.  If using dmalloc debugging
	malloc library, free main datastructures at the end, otherwise
	just exit fast and dirty.
	* hash_table.c: hash tables create their own internal copies of key
	strings (except when resizing, of course); htable_put calls should
	never bother to strdup the key.  TODO: hashtables are currently
	hardwired to use strings as keys/indices, but there are partial
	facilities for generalization; make it all general.
	(htable_free_guts): free key, value, etc. of a hashtable with
	user-provided functions.
	(htable_free): use htable_free_guts, then free the actual structure
	too.
	* include/hash_table.h: added/modified prototypes
	* database.c: copy a string for each data structure it's stored in
	to avoid having to reference-count for strings.  Don't use strdup
	when calling htable_put.
	(wentry_free, ifile_db_free, ifile_free_categories): new functions
	that free major data-structures.  TODO: name these things
	consistently.
	* include/ifile.h: added/modified prototypes.  Also added dmalloc.h
	hook for use with the dmalloc debugging malloc library.

2002-09-03  Jeremy Brown

	* release 1.0.11
	* istext.c: make NUM_TEST_CHARS more constant

2002-09-03  Steve Price

	* ifile.c: in loocv, skip message if NULL

2002-08-28  Matt Kraai

	* release 1.0.10
	* database.c (ifile_read_header, ifile_read_word_frequencies): make
	ifile die more reasonably when given a bad .idata file to read
	* util.c (readline): return NULL if read fails
	* release 1.0.9
	* util.c (readline): make work properly if fgets returns an error

2002-08-28  Jason Rennie

	* release 1.0.8
	* Makefile.in: clean up, install man page; make install directories
	work like they should
	* ifile.1: new file
	* test.sh: new file

2002-08-19  Jason Rennie

	* release 1.0.7
	* opts.c (argp_program_bug_address): Change to mailing list
	* opts.c (parse_opt): eliminate query flags from case 'c'; error if
	-c option not used with -q or -Q

2002-07-24  Karl Vogel

	* release 1.0.6
	* ifile.c, include/ifile.h, util.c: print file names for new '-c'
	option

2002-07-15  Jason Rennie

	* configure.in: new file
	* configure: generate with autoconf

2002-07-15  Jason Rennie

	* release 1.0.5
	* Makefile.in: Don't install libifile.a

2002-07-15  Karl Vogel

	* ifile.c (main): use args.concise, ifile_concise_ratings
	* opts.c: new option, "concise"
	* util.c: abort() if malloc returns null
	* util.c (ifile_concise_ratings): new function

2002-01-20  Jason Rennie

	* ifile v1.0.4 released
	* ifile.c (main): check for MSG != NULL (fixes segfault)

2001-11-22  Jason Rennie

	* ifile v1.0.3 released
	* database.c (ifile_add_db): add 'create' parameter; use it
	* ifile.c (main): update uses of ifile_add_db
	* opts.c (update): add new option
	* opts.c (ifile_init_args): set default value of create_folder
	* ifile.h (arguments): add create_folder
	* ifile.h (ifile_add_db): add 'create' parameter

2001-02-09  Caleb Crome

	* news2mail.pl: Change #! line to /usr/bin/perl

2000-03-14  William O. Ferry

	* irefile.mh.pl: use `pick' to expand message wildcards (e.g.
	100-200, cur)

2000-01-18  Jason Rennie

	* COPYING: Latest GPL formatting
	(http://www.gnu.org/copyleft/gpl.txt)
	* README, INSTALL: minor changes
	* Makefile.in (DIST_FILES): add COPYING

1999-10-27  Jason Rennie

	* NOTES: update for 1.0.0
	* INSTALL: updates for ifile.tcl->user.tcl/ifile.tcl code split;
	other minor changes
	* ifile.tcl: move User_* functions to user.tcl
	* README: removed text that was copied in other text files; updated
	version
	* FAQ: move "does ifile actually work" to FAQ
	* FAQ: update FAQ to reflect CMU->MIT/AI Lab move

Fri Apr 23 17:55:47 1999

	* INSTALL: update EXMH install to reflect ifile.tcl fix; mention
	EXMH "custom" inc method
	* ifile.tcl: Recent fix to EXMH makes it possible to redefine
	Mh_Refile without any tricks--any function that runs via the
	background process must be defined in User_Layout

Thu Apr 22 22:48:16 1999

	* INSTALL: automatic filter when EXMH incorporates mail is no
	longer an option (without an external filter such as slocal);
	update text to indicate so
	* ifile.tcl: remove most incorporate code; keep Inc_Init, but make
	it the same as EXMH 2.0.2

Thu Feb 19 12:07:38 EST 1998

	consider making optional argument to --log-file
	consider having ~/.ifile directory store idata, accuracy, log files

Thu Dec 3 12:55:51 EST 1998

	* ifilter.mh.pl: Add X-filter: header at end of header section;
	rename old x-filter headers

ifile-981203 released

Wed Dec 2 18:20:36 EST 1998

	* README, Version: bump version number to v0.7.3
	* database.c (ifile_write_db): clean up tmp file name code
	* ifile.c, opts.c: add print-tokens option
	* lex-simple.c, opts.c: add max-length option
	* util.c (ifile_print_message): fix
	* include/ifile.h: add max_length and print_tokens to argument type
	* Makefile.in: make snapshot work

ifile-981202 released

Tue Apr 7 20:43:11 EDT 1998

	* database.c: ifile_write_db(): Fix for a pointer bug which would
	cause a segfault on many occasions.

Mon Apr 6 17:18:00 EDT 1998

	ifile v0.7.1 released

Mon Apr 6 12:57:53 EDT 1998

	* database.c: Modified ifile_write_db() so that database is first
	written to a file which is user/host specific.  Once database
	writing is done, file is renamed to real file name.

Fri Apr 3 12:36:02 EST 1998

	* ifilter.mh.pl: changed removal of temporary file to use unlink()
	(previously used "system \"rm $tmp_file\"").  Made temporary file
	readable/writable only by the user.

Tue Mar 31 13:38:56 EST 1998

	* Makefile.in: added $(srcdir) in copy of argp.h to include dir.

Tue Mar 31 02:21:00 EST 1998

	ifile v0.7.0 released

Tue Mar 31 00:15:17 EST 1998

	* error.c, opts.c: debugging/log information now stored in
	~/.ifile.log (previously /tmp/ifile.log.[userid])
	* ifilter.mh.pl: debugging/log info now stored in ~/.ifilter.log
	* irefile.mh.pl: debugging/log info now stored in ~/.irefile.log

Mon Mar 30 03:16:29 EST 1998

	* ifilter.mh.pl: check to make sure we have write permissions
	before deciding on the folder to filter to.
	* experiment.mh.pl/knowledge_base.mh.pl: only accumulate info on
	mailboxes which we have write permissions to.

Mon Mar 30 01:33:22 EST 1998

	* ifile.c: semaphore code to keep two ifile processes from messing
	with the data file at the same time.

Thu Mar 5 03:15:00 EST 1998

	* ifile.c: changed cmp() function so that it returns 0 when given
	two floats which are equal.  Previously, would return 1 - this
	seems to have caused mass hysteria when passed to qsort.

Thu Mar 5 01:13:35 EST 1998

	* knowledge_base.mh.pl: No longer uses wildcard when choosing files
	from a directory.  Now only takes files with names completely
	composed of digits.  Changed file names which are passed to ifile
	binary - now uses relative (instead of absolute) path names.
	* experiment.mh.pl: Changed file names which are passed to ifile
	binary - now uses relative (instead of absolute) path names.  This
	reduces the length of the command line call.
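
The cmp() fix in the Thu Mar 5 03:15:00 EST 1998 entry reflects a general qsort() requirement: the comparison function has to return zero for equal elements, otherwise the ordering it describes is inconsistent and the sort's behavior is undefined. A minimal sketch of a well-behaved comparator follows. The names `cmp_float_desc` and `sort_ratings` are hypothetical, not ifile's actual functions, and the descending order is an assumption made for illustration.

```c
#include <stdlib.h>

/* Hypothetical comparator (not ifile's actual cmp()): orders floats
   from largest to smallest and returns 0 when the two are equal. */
int cmp_float_desc(const void *a, const void *b)
{
    float x = *(const float *)a;
    float y = *(const float *)b;

    if (x > y) return -1;   /* x sorts before y */
    if (x < y) return 1;    /* x sorts after y */
    return 0;               /* equal: must not claim an order */
}

/* Sort an array of n ratings in place, highest rating first. */
void sort_ratings(float *ratings, size_t n)
{
    qsort(ratings, n, sizeof ratings[0], cmp_float_desc);
}
```

Calling `sort_ratings(r, n)` on an array that contains duplicate values is the case the old behavior (returning 1 for equal inputs) would have mishandled.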

Wed Feb 25 13:05:40 EST 1998

	* Makefile.in: added $(srcdir)/ to ifile.h locations in
	include/ifile.h make

Sun Feb 22 11:25:07 EST 1998

	* experiment.mh.pl: changed $data_file to absolute path

Thu Feb 19 01:40:37 EST 1998

	ifile v0.6.6 released

Thu Feb 19 00:46:09 EST 1998

	* hash_table.c, scan.c, istext.c: accounted for minor C library
	inconsistencies on SunOS 4.1.3_U1

Wed Feb 18 17:10:27 EST 1998

	* Makefile.in: removed "-include Makefile" from bottom of file.
	* istext.c: AIX compiler doesn't like const static int =
	declaration.  Inserted #defines instead.
	* lex-email.c, primes.c: AIX needs "#pragma alloca" to be able to
	properly use alloca()

Wed Feb 18 01:50:51 EST 1998

	* Makefile.in: removed dependencies on configure (in both main and
	argp directories), removed configure from maintainer-clean build
	* configure.in: removed file from distribution
	* ifilter.mh.pl: added '-g' option to output /tmp/.log. with 0600
	permissions.
	* irefile.mh.pl: added '-g' option to output /tmp/.log. with 0600
	permissions.
	* irefile.mh.pl: do not pass '-g' option to refile executable
	* ifile.c, error.c: added '-g' option to output /tmp/.log. file.
	Added chmod() in error.c to set permissions of file to 0600.
	* include/ifile.h, opts.c: added -g option, added tmp_file to
	arguments struct.
	* error.c: fixed ifile_strip_path() - previously was putting extra
	'/' at beginning of returned string.
	* opts.c: added cases to parse_opt() so that it will only return
	error when it is supposed to.
	* FAQ: added info about configure --srcdir option

Thu Feb 12 23:24:39 EST 1998

	* Makefile: removed experiment.mh.pl from list of PERL_FILES

ifile v0.6.5 released

Thu Feb 12 04:02:30 EST 1998

	ifile v0.6.4 released

Thu Feb 12 03:32:21 EST 1998

	* ifile.c: previously checked for word aging only in the case of
	--insert without --delete.  With new --query-insert option, this
	old style of --insert w/o --delete is never used.  Changed
	condition so that it checks for --query-insert option.

Tue Feb 10 10:53:53 EST 1998

	* opts.c: Added db-file command line option.
	* ifile.c: Changed all instances of IFILE_DATA so that it would use
	arg.db_file for location of db storage

Mon Feb 2 22:35:00 EST 1998

	ifile v0.6.3 released

Mon Feb 2 21:32:45 EST 1998

	* error.c: changed ifile_open_log so that it will overwrite the old
	log file.  Changed ifile_strip_path so that it doesn't skip first
	character in executable name.

Mon Feb 2 00:28:06 EST 1998

	* ifilter.mh, irefile.mh: added checks for executability of MH
	programs

Fri Jan 30 03:07:19 EST 1998

	* UPGRADE, INSTALL: Added information about the necessity to have
	MH binaries in a directory which is part of the user's path.
	* ifilter.mh, irefile.mh: Added code which will cause each to die
	miserably if MH binaries are not runnable (either because full
	path is not specified or because PATH does not include MH dirs)

Thu Jan 29 03:04:00 EST 1998

	ifile v0.6.2 released

Thu Jan 29 00:47:50 EST 1998

	* database.c, hash_table.c: changed a number of (int) casts to
	(long int) casts.  On Alphas and other 64-bit machines, casting
	from a pointer to int would cause a compiler warning.  Casting to
	(long int) eliminates those warnings.

Thu Jan 22 01:32:29 EST 1998

	* ifile.tcl: Added.  Created a new Mh_Refile function to cause EXMH
	to call irefile.mh when it would normally call MH refile.
	* irefile.mh.pl: Fixed some bugs, caused it to output debugging
	information to /tmp/irefile.info.  Bugs included .mh_profile
	parsing and variable name mistypings.

Thu Jan 15 00:08:22 EST 1998

	* changed all instances of "ifile.h" to
	* Makefile.in: modified installation section so that files are
	passed to installation program one at a time (some installation
	programs cannot handle multiple files).

Wed Dec 24 23:23:00 EST 1997

	* irefile.mh.pl: various minor changes to eliminate bugs & to get
	it to work properly

Wed Dec 24 13:19:41 EST 1997

	* ifile.c, ifile.h, opts.c: minor modifications to the way
	query-insert option is dealt with.

Wed Dec 24 04:46:46 EST 1997

	* database.c, ifile.c: modified aging calling so that it is called
	from ifile.c and only called if there is insertion without
	deletion.

Wed Dec 24 03:49:28 EST 1997

	* ifilter.mh.pl, irefile.mh.pl: added usage information
	* irefile.mh.pl: added code to find out current folder if no source
	folder is given.  Also, if irefile.mh is given absolute folder
	names, it will attempt to extract the folder names by searching
	for and removing the mail path ($mail_path).

Tue Dec 23 20:10:51 EST 1997

	* irefile.mh.pl: modified building of ifile command line to check
	for .skip_me files in source and destination folders
	* opts.c, ifile.c: added "--query-insert" option

Tue Dec 23 16:56:48 EST 1997

	* ifile.c: any newly created .idata file will now have 0600
	permissions

Tue Dec 23 16:36:56 EST 1997

	* lex-email.c: ifile_lexer_email_prelex_header(): modified code so
	that it would not seg fault if a given e-mail message does not
	contain a body.

Tue Dec 23 15:57:22 EST 1997

	* Makefile.in: Minor modifications to file names
	* news2mail.pl: Minor modifications
	* extendable_array.h: EXT_ARRAY_SET() now sets all new values of
	array to 0.

Mon Dec 22 20:29:41 EST 1997

	* database.c: ifile_write_db(): fixed process by which infrequent
	words are tossed.  Folder frequencies are now properly updated.

Mon Dec 22 17:23:19 EST 1997

	* extendable_array.h: EXT_ARRAY_GET() now returns 0 when asked for
	a value which is out of the range of the currently allocated
	array.

Mon Dec 22 14:36:21 EST 1997

	* database.c: ifile_del_db() was broken.  Now it's fixed.  Need to
	create a function for removing infrequent words.  Currently, the
	folder frequencies are not being updated (not a good thing).

Mon Dec 22 14:15:06 EST 1997

	* ifile.c, util.c: fixed message reading timing output.
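
The two extendable_array.h entries above (Tue Dec 23 15:57:22 and Mon Dec 22 17:23:19 EST 1997) describe a common contract for growable arrays: reads past the end yield 0, and a write that grows the array zero-fills the new slots. The sketch below shows that contract with plain functions. The `int_array` type and both function names are hypothetical and do not reproduce the real EXT_ARRAY_* macros or their data layout. The doubling growth policy is likewise an assumption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable int array; not the layout used by
   extendable_array.h. */
typedef struct {
    int *data;
    size_t size;   /* number of allocated elements */
} int_array;

/* Return the element at index i, or 0 if i is beyond the allocation. */
int int_array_get(const int_array *a, size_t i)
{
    return (i < a->size) ? a->data[i] : 0;
}

/* Store v at index i, growing the array if needed.  Newly allocated
   elements are zeroed.  Returns 0 on success, -1 on failure. */
int int_array_set(int_array *a, size_t i, int v)
{
    if (i >= a->size) {
        size_t new_size;
        int *p;

        /* Refuse indices whose doubled size could overflow size_t. */
        if (i >= (SIZE_MAX / sizeof(int)) / 2)
            return -1;

        new_size = (a->size == 0) ? 8 : a->size;
        while (new_size <= i)
            new_size *= 2;

        p = realloc(a->data, new_size * sizeof *p);
        if (p == NULL)
            return -1;

        /* Zero only the freshly added tail. */
        memset(p + a->size, 0, (new_size - a->size) * sizeof *p);
        a->data = p;
        a->size = new_size;
    }
    a->data[i] = v;
    return 0;
}
```

An array starts out as `int_array a = {NULL, 0};` and its storage is released with `free(a.data)`.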

Mon Dec 22 05:54:34 EST 1997

	* util.c: added an fclose() call to ifile_read_message()
	* knowledge_base.pl: completely rewritten in order to make use of
	ifile
	* opts.c, ifile.h, ifile.c: added a new command line option, one to
	reset the database (currently implemented by removing .idata)
	* opts.c, ifile.h, ifile.c: added state variables to indicate
	whether messages should be read, database should be read/written

Mon Dec 22 04:49:06 EST 1997

	* util.c: modified ifile_read_message to check the return value of
	the opening of the message.  Opening an empty message results in a
	(ifile_lex *) NULL return value.

Mon Dec 22 02:55:17 EST 1997

	* lex-email.c: apparently works, does a nice job of dicing up the
	headers when requested to do so.

Sun Dec 21 17:26:19 EST 1997

	* lex-define.c, ifile.h: integrated the gram lexer into the email
	lexer so that ifile_email_lexer is the only non-simple lexer.
	* lex-email.c: rewritten using lex-gram.c as a base.
	* ifile.h, lex-simple.c, lex-email.c: to this point,
	ifile_lexer.sizeof_lex has been used as though it should be the
	size of the lexer.  After some inspection, it appears that it
	should be the size of the lex.  Changes have been made
	accordingly.
	* lex-gram.c: removed from the distribution.

Thu Dec 18 19:44:56 EST 1997

	* lex-define.c, ifile.h: added a layer between the e-mail lexer and
	an indirect lexer.  Also created a specialized LEX for email.
	* lex-email.c: very broken at this point.  Needs to be fixed.

Thu Dec 18 16:53:20 EST 1997

	* lex-define.c: fixed lexer initialization code.
	* lex-email.c: fixed header checking code so that it mostly works.
	It still needs fixing, though.  Currently, it dumps core on an
	empty input file.

Tue Dec 9 08:03:35 EST 1997

	* ifile crashes on ifile_read_message.  My guess is that the lexer
	code is broken.

Mon Dec 8 23:40:16 EST 1997

	* Added hack to irefile.mh.pl to allow refiles of the form
	"irefile # +src +dest" (sometimes used by xmh)
	* Finished writing irefile.mh.pl

Sat Nov 1 02:24:16 EST 1997

	* Began writing irefile.mh

Sat Oct 18 01:54:39 EDT 1997

	* Makefile: modified snapshot, diff to use "cp" instead of "ln".
	Consumes more diskspace, but is more likely to work on AFS.

Sat Oct 18 01:43:47 EDT 1997

	* opts.c: fixed argument parsing so that it properly parses file
	names
	* database.c: fixed writing so that it would only write folder-word
	frequency entries which are non-zero

Tue Oct 14 16:50:20 EDT 1997

	* ifilter.mh: changed to use a temp file so that it doesn't have to
	load libraries to do 2-way piping.  Also, changed to print folder
	to which message is filtered (this should NOT be printed to
	STDERR)

Sun Oct 12 09:22:13 EDT 1997

	* temporary happiness achieved :) it looks like the C code for the
	72nd "complete & total rewrite of ifile" is complete.

Sun Oct 12 09:19:27 EDT 1997

	* database.c: changed "args.folder_calcs" to "folder", fixing a
	segmentation violation bug

Sun Oct 12 04:52:25 EDT 1997

	* database.c: fixed some technical problems with the ifile_add_db
	function.  Made minor revisions to reading/writing functions

Fri Oct 10 22:25:48 EDT 1997

	* opts.c: added all sorts of neat command line options.
	* lex-default.c: set up lexing options so that effects take place
	in the correct order
	* ifile.c: added capabilities for updating DB, writing it to disk.
	* database.c: wrote functions for updating DB, writing it to disk.

Fri Oct 10 05:44:47 EDT 1997

	* ifile.c: converted querying, verbosity to work with new style of
	command line parsing

Fri Oct 10 04:47:10 EDT 1997

	* Makefile.in: configured to work with ifile
	* ifile.c: now the main executable file
	* util.c: new file - some utility functions from ifile.c
	* database.c: new file - database operation functions

Fri Oct 10 02:37:02 EDT 1997

	* ifile_query.c: fixed minor bugs, added GNU command line argument
	system
	* argp/*: GNU command-line argument handling system - added to main
	executable(s)
	* hash_table.c: fixed final bugs (hopefully :)
	* ifile.c: fixed final bugs in db reading functions, rating
	functions.
	* Makefile.in: new file
	* configure.in: new file
	* configure: new file
	* Version: new file
	* opts.c: new file - configuration of argp

Thu Oct 9 02:06:49 EDT 1997

	* ifile_query.c: complete rewrite.  Stripped out all unnecessary
	functions - make code totally reliant on new data structures.
	* hash_table.c: eliminated syntax errors, minor bugs.  Improved
	prime-finding algorithm
	* ifile.c: eliminated syntax errors, minor bugs

Wed Oct 8 10:17:54 EDT 1997

	* ifile.c: ripped out any functions which were MH-specific.
	Modified remaining functions to use new ifile_db and hash_table
	types.  Still have to fix write_db functions.
	* ifile.h: added ifile_db, db_word_entry, removed anything
	MH-specific.
	* ifile_db.c: began to remove MH-specific functions

Wed Oct 8 07:16:01 EDT 1997

	* hash_table.c: new file - implements general array-based (no
	linked lists) autoextendable hash table
	* hash_table.h: new file

Wed Oct 8 04:05:44 EDT 1997

	* ifile.c: fixed minor bug in idata writing code.  Previously,
	blank lines would be added to .idata file.

Sat Oct 4 02:48:43 EDT 1997

	* ifile.c: rewrote ifile_write_word_frequencies() in order to speed
	things up a bit.
	* assoc_array_int: rewrote some of the structures, wrote a slicker
	hash() function, changed hash table sizes

Sat Sep 13 03:35:17 EDT 1997

	* ifilter.c: renamed to ifile_query.c
	* irefile.c: renamed to ifile_db.c
	* ifilter.c: made code MH-independent.  No longer filters mail to a
	folder, but rather prints out folder names and corresponding
	metric values.
	* ifile.c: added ifile_print_ratings, modified
	ifile_rate_categories so that it would return an array of category
	names and ratings

Sat Sep 13 02:27:54 EDT 1997

	* Makefile: rewritten to allow more compact representation
	* lex-define.c: new file - defines and initializes default lexers
	* deflexer.c: removed (replaced with lex-define.c)
	* ifilter.c: added lexer initialization call
	* irefile.c: added lexer initialization call

Sat Sep 13 00:32:35 EDT 1997

	ifile 0.4.5 released

Mon Sep 8 00:46:40 EDT 1997

	* ifile_read_profile: fixed possibility of blank line in
	.mh_profile causing executables to dump core.  Changed tokenizing
	to allow for ':' as a token separator.

Sun Aug 31 22:50:51 EDT 1997

	ifile v0.4.4 released

Tue Aug 26 00:02:25 EDT 1997

	* irefile.c/main: fixed the location to check for .skip_me file.
	irefile previously checked in improper location.

Wed Aug 13 01:15:35 EDT 1997

	* ifile_verbosify: removed commands which allowed indenting of
	messages according to priority level.  ifile_free() call was
	somehow being called ad infinitum and causing a seg fault.
	* ifile.h: "#define ifile_free(x) free(x)" statement was causing
	neverending recursion of free() calls.  #define removed.

*** these changes not included in 0.4.x releases

Wed Aug 13 00:44:44 EDT 1997

	* ifile_rate_categories: no longer use document frequencies in best
	category calculations
	* deflexer.c: new file - defines and sets the default lexer
	* ifile.h: now contains all header file information for all ifile C
	files (except for assoc_array_int.c and extendable_array.c)
	* int4str.c: implementation of 1-1 string->int and int->string
	mapping
	* istext.c: new file - determines whether a file is text or binary
	* lex-email.c: new file - lexes e-mail, removing certain headers
	* lex-gram.c: new file - lexes words into groups (multi-word
	tokens)
	* lex-html.c: new file - html lexer
	* lex-indirect.c: new file - lexer stuff
	* lex-simple.c: new file - simple lexers
	* primes.c: new file - prime number generation functions
	* scan.c: new file - file scanning functions
	* stem.c: new file - implementation of Porter stemming algorithm
	* stoplist.c: new file - implementation of stoplist function
	* stopwords.c: new file - default stoplist
	* stuff.c: new file - file for testing new lexing code

*** these changes not included in 0.4.x releases

NOTE: deflexer.c, int4str.c, istext.c, lex-*.c, scan.c, stoplist.c and
stopwords.c were originally part of the libbow package and were
originally written by Andrew McCallum.  stem.c was originally part of
the libbow package and was written by various authors (see code for
credits).

Fri Aug 8 22:30:18 EDT 1997

	ifile v0.4.3 released

Tue Jul 22 00:28:13 EDT 1997

	* ifile_rate_categories: if list of categories is empty, sets first
	category as "inbox" and selects inbox (1) as best category.
	Eliminates seg faults in ifile_verbosify() call.

v0.4.3
- knowledge_base.perl: keep separate nuking counts for entire run and
  each aging call
- .idata file is now explicitly closed (fclose() call did not exist in
  earlier versions)
- minor changes to extendable array code
- ifile_rate_categories now adds "inbox" to the mailbox listing if the
  mailbox listing is empty (leaving it empty would cause a seg fault in
  some cases)
- Makefile now contains explicit linking commands (some 'make' programs
  will not link executables without explicit commands)
- added to distribution descriptions of how to better integrate ifile
  into EXMH (see exmh_integration, exmh_integration2 and filter_button)

v0.4.2
- minor changes to install.perl script
- news2mail.perl added to distribution (search for 'news2mail' in
  README file for some explanation)
- .idata file locking introduced into system.  ifile programs are
  synched via lock files so that none will ever read or write a
  corrupted .idata file (not implemented in all executables yet)
- new version of "knowledge_base.perl".  Does age updates periodically
  to lessen processor overhead.  Includes commented out lines for
  attaching associative arrays to disk (primarily useful if you have a
  very large corpus of e-mail and a large number of mailbox folders).
- irefile no longer learns on messages which are refiled to folders
  which contain a .skip_me file (synchronizes behavior with
  knowledge_base.perl)
- primitive code aiming toward alternate style refile included in
  distribution.  Search for 'irefile queue' in README file for more
  info.
- cutoff frequency for eliminating words from .idata now done using a
  log scale (previously used a constant).
[- modularized idata writing]

v0.4.1
- moved opening of log file to top of ifilter.c to prevent seg fault
  when argc > 1
- previously, while reading a message, ifile would reset the as_nk
  value of any word/category it came across (essentially drilling holes
  in the .idata file).  This has been fixed.
- to allow a smoother transition between lexing styles, each Subject:,
  From: word is now lexed into two words, one with the prefix and one
  without.  (the duplication will be eliminated in the near future)

v0.4.0
- removed 'make clean' command from install.perl so that the user can
  make his/her own executables and then use install.perl to install
  them
- fixed problem with opening of log file.  Program previously tried to
  use entire argv[0] as part of the log file path.  Now uses only
  executable name.
- because a portion of argv[0] is used to name the .info files, the
  irefile .info file will most likely appear as "/tmp/refile.info"
- renaming of some variables
- main ifilter program modularized.
- corresponding parts of irefile modularized.
- ifilter now only reads sections of .idata which it needs to classify
  the message it is given (greatly increases efficiency)
  ^^^ Good stuff - ifilter is quite a bit faster and less of a memory
  hog :)
- invariants removed from "category rating" loop to promote efficiency
- the UPGRADE file has been removed from distribution (due to lack of
  utility)
- the convert.perl file has been removed from the distribution (you
  should instead use knowledge_base.perl to recreate your .idata file)
- message headers, "From:" and "Subject:", are now treated specially
  during lexing.
[- machine names and e-mail addresses are now lexed as one word and are
  no longer broken to pieces.]

v0.3.3
- loop variable initialization bugfixes
- rewrite of mh_refile function (hopefully eliminating strange string
  bugs/random seg faults)
  --> thanks to Colin McCormack for the suggestion!
- fixed logical AND (&&) to bitwise AND (&) in refile_exe function
  (doh!)
- some clean-up of code
- code is more modular
- added ifile_sprintf() and ifile_cats() to aid in string handling.
  Any programmers who have dealt with strings in C should love these :)
- added error.c for message printing and error handling
- modularized printing of progress and error messages
- added ifile.c to hold library of ifile functions
- converted readline() to a function (from a #define) and in doing so
  introduced (but shortly thereafter fixed) a bug
- ifile now adds a header to filtered messages "X-filter: ifile v0.3.3"

v0.3.2
- included errno.h in irefile.c (required by SunOS v5.5.1/gcc-2.7.2)
- eliminated strerror() function from code.  Function is unnecessary
  and SunOS 4.1.3/gcc-2.7.2 produces undefined references to
  'strerror' during compilation.  This seemed to be the only easy fix.
- eliminated problem with 'make' putting -lm (math library) option at
  the beginning of the line

v0.3.1
- install.perl links to 'irefile' instead of 'refile', like it should

v0.3.0
- C implementation of ifilter.  Incredible speed improvements (see
  chart below)
- C implementation of irefile.  Significant increase in speed (see
  chart below).  Not quite as extreme as ifilter.  irefile heavily
  depends on (slow) associative arrays for writing the .idata file
- Both programs are much more memory conscious (see chart below).
- Full support for all MH refile options.  I think I finally have full
  support for all the options.  Please tell me if I don't!
- irefile will use 'cur' message and current folder if no
  message/source folder is given
- message headers (subject, from) NOT given extra weight (this will
  probably be reversed in the future)
- install program does not affect location of ifilter entry in
  .maildelivery filter, simply changes ifilter path

  My own experiences on my i586 100MHz Linux box (48 megs ram)
  202k .idata file, 39 folders, 5061 words
  time/memory consumption to filter/refile a single 2k message

                    processor time    memory usage
    perl irefile    20 sec            5.1 Meg
    C irefile       7.5 sec           2.2 Meg
    perl ifilter    10 sec            4.7 Meg
    C ifilter       1.5 sec           2.0 Meg

v0.2.5
- ifile now properly addresses the '-file' option [mostly]
- knowledge_base.perl is more memory conscious
- resets .idata_accuracy file if accuracy goes negative
- irefile would previously refile all messages after it read in the
  first of any batch.  It now waits to refile after all messages are
  read in.
- uses context file to get the current-folder when no source file is
  indicated in the calling of irefile [this is actually incorrect]
- ifilter will not filter messages into a folder with a .skip_me file
- finding PWD was made slightly more robust

v0.2.4
- irefile no longer mistakes MH refile options for file names
- allows for more flexibility in reading of .mh_profile
- passing '-help' to ifilter or irefile will print usage

v0.2.3
- script included to create knowledge database according to where mail
  is currently located (knowledge_base.perl).
- message headers (subject, from) given extra weight

v0.2.2
- if original MH refile program is available as refile.bak, ifile will
  use MH refile program to move the message (increases compatibility)
- keeps track of number of filters and number of refilings for
  approximate
- /tmp/ifilter.info now contains lots of info about ifile's filtering
  decisions

v0.2.1
- handling of multi-level folders (sub-folders)
- refiled messages no longer appear as new messages
- refiling to folder which message comes from does not destroy message
- protections set up to restore original message if refile is not
  performed correctly

v0.2.0
- fixes problems concerning strange folder names
- more concise data file - provides quicker filtering and refiling
- keeps track of number of messages filtered and number of messages
  refiled
- looks at .folders to determine folder names, rather than directory
  structure
- keeps data file streamlined by eliminating infrequent words
- records 'words' of length 3 (previously required length of 4 or
  greater)

v0.1.2
- Fixed problem of not allowing user to indicate directory of the rmm
  binary
- Installation program changed to search for binaries in common
  directories before asking user (provides easier installation)

v0.1.1
- Coordinated naming of main data file - '.idata'
- Installation program allows customization of binary directories
  (previously, program made assumptions as to locations of mh binaries)

$Id: ChangeLog,v 1.24 2004/12/12 19:10:46 jrennie Exp $