hginsmail - a Perl script for archiving e-mail messages on a Hyper-G server
hginsmail [parameters] [file(s)]
For a short description of the parameters, try hginsmail -h; for more details, see below.
hginsmail takes e-mail messages from a list of files, matches them against user-defined rules and inserts the messages into collections associated with the rules.
Depending on the user's parameters, the script inserts each message into one collection only (the one corresponding to the first matching rule), or it evaluates the whole list of rules, which may yield several destination collections, and inserts the message into each of these.
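The difference between the two modes can be sketched as follows. This is a minimal illustration in Python, not hginsmail's own Perl code; the function name destinations, the first_only flag, and the collection name "unsorted" are assumptions made up for the example.

```python
import re
from typing import Callable, Dict, List, Tuple

# A rule pairs a predicate over the message headers with a destination
# collection. This representation is illustrative, not hginsmail's.
Rule = Tuple[Callable[[Dict[str, str]], bool], str]


def destinations(headers: Dict[str, str], rules: List[Rule],
                 first_only: bool, default: str) -> List[str]:
    """Return the collections a message should be inserted into."""
    matched: List[str] = []
    for predicate, collection in rules:
        if predicate(headers):
            if collection not in matched:
                matched.append(collection)
            if first_only:
                break  # first-match mode: stop at the first matching rule
    # A message that matches no rule goes to the default collection.
    return matched or [default]


rules: List[Rule] = [
    (lambda h: re.search(r"Hyper-G", h.get("Subject", "")) is not None,
     "hgstuff_asj"),
    (lambda h: re.search(r"aschmid", h.get("To", "")) is not None,
     "to_asj_pers"),
]
msg = {"Subject": "Hyper-G release", "To": "aschmid@iicm.edu"}
print(destinations(msg, rules, first_only=True, default="unsorted"))
print(destinations(msg, rules, first_only=False, default="unsorted"))
```

With these two sample rules, the first call yields only hgstuff_asj, while the second yields both hgstuff_asj and to_asj_pers.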
Before actually inserting a message into a collection, hginsmail tests whether it is already there (the message IDs and authors are stored as keywords). Thus, it does not insert a message into a collection twice, and even if the script processes a file of e-mail messages more than once, it won't produce duplicate messages in the archive (unless the user overrides the script's suggestions).
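The duplicate check can be pictured like this. The sketch is in Python and only models the idea: an in-memory dictionary stands in for the keywords stored on the Hyper-G server, and the names Archive and insert_once are hypothetical.

```python
from typing import Dict, Set, Tuple

# Per-collection index of (message id, author) pairs already archived.
# hginsmail keeps this information as keywords on the server; a plain
# dictionary stands in for it here.
Archive = Dict[str, Set[Tuple[str, str]]]


def insert_once(archive: Archive, collection: str,
                message_id: str, author: str) -> bool:
    """Insert a message unless the collection already holds it.

    Returns True if the message was inserted, False for a duplicate.
    """
    key = (message_id, author)
    seen = archive.setdefault(collection, set())
    if key in seen:
        return False
    seen.add(key)
    return True


archive: Archive = {}
first = insert_once(archive, "hgstuff_asj", "<1@iicm.edu>", "aschmid")
again = insert_once(archive, "hgstuff_asj", "<1@iicm.edu>", "aschmid")
print(first, again)
```

Because the index is kept per collection, the same message can still be inserted into a second collection when several rules match.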
Upon insertion, the script also tests whether a message is a reply to some other message. If it is, a link to the referenced document is generated. (Thanks to Hyper-G's link management, this works even when the referenced object is inserted at some later time.)
Thus, the thread of a discussion can easily be reconstructed, and, moreover, it may be graphically represented with Harmony's Local Map!
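One plausible way to detect a reply is to look at the In-Reply-To header field of the message. The text above does not say which header hginsmail actually inspects, so the Python sketch below is an assumption, and the function name replied_to is made up for the example.

```python
import re
from email import message_from_string
from typing import Optional


def replied_to(raw_message: str) -> Optional[str]:
    """Return the message id this message replies to, if any.

    Assumption: the reply relation is taken from the In-Reply-To
    header; hginsmail itself may use a different field.
    """
    msg = message_from_string(raw_message)
    value = msg.get("In-Reply-To", "")
    match = re.search(r"<[^<>]+>", value)
    return match.group(0) if match else None


raw = (
    "Message-ID: <2@iicm.edu>\n"
    "In-Reply-To: <1@iicm.edu>\n"
    "Subject: Re: Hyper-G\n"
    "\n"
    "Some reply text.\n"
)
print(replied_to(raw))
```

The returned id is what a link to the referenced document would be keyed on; a message without the header simply yields no link.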
In order to get the full thread of a discussion, however, the messages a user sends should appear in the archive, too. This may easily be accomplished by adding a 'Bcc: $USER' header field to each message sent, which copies the message into the user's mailbox.
The input files may contain e-mail messages either in the standard e-mail format (as used by the standard mail-file, /usr/spool/mail/$USER) or in emacs' RMAIL format. (Implicitly, newsgroup articles may be archived with hginsmail, too.)
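To give an idea of what reading these two formats involves, here is a small Python sketch based on the standard mailbox module, whose mbox class reads the standard mail-file format and whose Babyl class reads the format used by emacs' Rmail. It is not how hginsmail (a Perl script) reads its input; the function name read_subjects and the rmail flag are hypothetical.

```python
import mailbox
from typing import List


def read_subjects(path: str, rmail: bool = False) -> List[str]:
    """Return the Subject line of every message in a mail file."""
    if rmail:
        box = mailbox.Babyl(path, create=False)   # emacs RMAIL (Babyl)
    else:
        box = mailbox.mbox(path, create=False)    # standard mail-file
    try:
        # Iterating over a mailbox yields one message object per mail.
        return [msg.get("Subject", "") for msg in box]
    finally:
        box.close()
```

A call such as read_subjects("/usr/spool/mail/aschmid") would list the subjects in a standard mail-file, provided the file exists and is readable.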
The rules for categorization along with some general information should be stored in a file. More details about that follow below.
hginsmail does not insert the messages into the database directly; instead, it writes them to a HIF-file (which is stored in $HOME/.hgmail), and when all messages have been read, it hands that file over to hifimport, a tool which then inserts all messages into the database in one go. The HIF-file is deleted if hifimport terminates successfully.
Sometimes, though, something goes wrong in the course of insertion: a collection may not exist, the HG host may be down, and so on. In this case, the user is notified of the failure and the HIF-file is left in the directory.
Before terminating, hginsmail always checks $HOME/.hgmail for the presence of any HIF-files and if there are any, it tries to insert them again. In the case of success, the files are deleted, otherwise the user is notified. (The user could also insert the HIF-files manually by calling hifimport.)
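The retry step described above can be sketched as follows in Python. The sketch makes two assumptions that are not stated in the text: that leftover files can be found with the pattern "*.hif", and that the import is represented by a callable returning True on success. How hifimport is really invoked is deliberately left out.

```python
from pathlib import Path
from typing import Callable, List


def retry_pending(spool: Path, import_file: Callable[[Path], bool],
                  pattern: str = "*.hif") -> List[Path]:
    """Try to import every leftover file; return those that failed again.

    spool       -- directory holding leftover files ($HOME/.hgmail)
    import_file -- stand-in for handing one file to hifimport
    pattern     -- assumed naming scheme for HIF-files
    """
    failed: List[Path] = []
    for hif in sorted(spool.glob(pattern)):
        if import_file(hif):
            hif.unlink()        # success: the file is no longer needed
        else:
            failed.append(hif)  # failure: keep the file and report it
    return failed
```

The returned list corresponds to the files the user would be notified about and could later insert manually.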
The rc-file holds all information that is specific to each user and/or each execution of the script. To be exact, it defines the default Hyper-G host, the default collection for messages that cannot be categorized, the name of the 'incoming mail' collection, the default access rights for the documents in the database, the directory of the standard mail-file, and, of course, the rules for categorizing the messages.
Since this information may differ from case to case, the command-line parameter -rc allows the user to pass the name of the rc-file to use. If this argument is missing, the script first looks for a file named hginsmail.rc in the current directory, and if it cannot find it there, it checks the directory ~/.hgmail/.
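The lookup order can be summarized in a few lines of Python. This is only a sketch of the order described above, not the script's code; the function name find_rc_file is hypothetical, and it assumes that the file in ~/.hgmail/ is also called hginsmail.rc.

```python
from pathlib import Path
from typing import Optional


def find_rc_file(rc_arg: Optional[str], cwd: Path,
                 home: Path) -> Optional[Path]:
    """Return the rc-file to use, or None if none can be found.

    rc_arg -- value passed with -rc, or None if the parameter is absent
    """
    if rc_arg is not None:
        return Path(rc_arg)            # an explicit -rc always wins
    candidates = (
        cwd / "hginsmail.rc",          # 1. current directory
        home / ".hgmail" / "hginsmail.rc",  # 2. ~/.hgmail/ (assumed name)
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

For example, find_rc_file(None, Path.cwd(), Path.home()) follows the default search when no -rc parameter is given.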
An example rc-file can be found at the same location as hginsmail. It might be best to just copy it and adapt it for one's own needs.
Rules
While the definitions of the host, the default collection, etc. do not need much explanation, the definition of the rules probably does.
Each rule consists of two parts: a description and the name of a destination collection (which may optionally be followed by the access rights that a document matching this rule should be assigned). The two parts are separated by '->'.
Examples of descriptions:
($Subject =~ /Landscape/)
($From =~ /will|summer|joe@somehost/i)
($To =~ /aschmid/)
A comment in the example rc-file lists which variables may legally be used within the expressions.
($Subject =~ /Hyper-G/) -> hgstuff_asj, hgteam, hgintern
($Subject =~ /\[holiday\]/) -> _SKIP_
($To =~ /aschmid/) -> to_asj_pers, aschmid
(($Cc =~ /hgteam|hgintern/)) -> hgstuff_asj
($Gnus =~ /alt.prose/) -> gnus_alt_prose_asj
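To make the structure of a rule concrete, the following Python sketch splits one rule into its description, its destination collection and the optional access rights. It is an illustration only: the function name parse_rule is hypothetical, and it assumes, based on the example rules, that the rights follow the collection name as a comma-separated list. The description itself is a Perl expression and is returned unevaluated.

```python
from typing import List, Tuple


def parse_rule(line: str) -> Tuple[str, str, List[str]]:
    """Split a rule into (description, collection, access rights)."""
    # Split at the last '->' so that the description may contain one.
    description, arrow, target = line.rpartition("->")
    parts = [p.strip() for p in target.split(",") if p.strip()]
    if not arrow or not description.strip() or not parts:
        raise ValueError(f"not a valid rule: {line!r}")
    # First item is the collection; anything after it is taken as rights.
    return description.strip(), parts[0], parts[1:]


rule = parse_rule(r"($Subject =~ /Hyper-G/) -> hgstuff_asj, hgteam, hgintern")
print(rule)
```

A rule without access rights, such as the one for $Gnus, simply yields an empty list as its third element.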
Thanks to Michael Klemme <michael-k@cs.auckland.ac.nz> for the ideas behind the following changes:
Thanks to Tasos Koutoumanos <tkout@softlab.ece.ntua.gr> for the ideas behind the following changes:
Some of you have probably wondered about Harmony's Mail functions, which usually do not produce any visible effect when selected. However, when hgsendmail is installed on your system and the file hgsendmail.rc is in $HOME/.hgmail, they will work just as you'd expect them to.
For more information please see the documentation of hgsendmail.
Alfons Schmid (aschmid@iicm.edu) - July 22, 1996