diff options
Diffstat (limited to 'doc/texinfo/programs/decodemail.texi')
-rw-r--r-- | doc/texinfo/programs/decodemail.texi | 227 |
1 files changed, 227 insertions, 0 deletions
diff --git a/doc/texinfo/programs/decodemail.texi b/doc/texinfo/programs/decodemail.texi new file mode 100644 index 000000000..ffddef4da --- /dev/null +++ b/doc/texinfo/programs/decodemail.texi @@ -0,0 +1,227 @@ +@c This is part of the GNU Mailutils manual. +@c Copyright (C) 2020--2024 Free Software Foundation, Inc. +@c See file mailutils.texi for copying conditions. +@comment ******************************************************************* +@pindex decodemail + +The @command{decodemail} utility is a filter program that reads +messages from the input mailbox, decodes ``textual'' parts of each +multipart message from a base64- or quoted-printable encoding to an +8-bit or 7-bit transfer encoding, and stores the processed messages in +the output mailbox. All messages from the input mailbox are stored in +the output, regardless of whether a change was made. + +The message parts deemed to be textual are those whose +@samp{Content-Type} header matches a predefined, or user-defined, +mime type pattern. In addition, encoded pieces of the @samp{From:}, +@samp{To:}, @samp{Subject:}, etc., headers are decoded. + +For example, @command{decodemail} makes this transformation: + +@example +Subject: =?utf-8?Q?The=20Baroque=20Enquirer=20|=20July=202020?= +@result{} Subject: The Baroque Enquirer | July 2020 +@end example + +The built-in list of textual content type patterns is: + +@example +text/* +application/*shell +application/shellscript +*/x-csrc +*/x-csource +*/x-diff +*/x-patch +*/x-perl +*/x-php +*/x-python +*/x-sh +@end example + +These strings are matched as shell globbing patterns +(@pxref{glob,,,glob(7), glob(7) manual page}). + +More patterns can be added to this list using the +@code{mime.text-type} configuration statement. +@xref{mime statement}, for a detailed discussion, and the +configuration section below for a simple example. + +When processing old mesages you may encounter @samp{Content-Type} +headers whose value contains only type, but no subtype. To match +such headers, use the pattern without @samp{/whatever} part. E.g. +@samp{text/*} matches @samp{text/plain} and @samp{text/html}, but +does not match @samp{text}. On the other hand, @samp{t*xt} does +not match @samp{text/plain}, but does match @samp{text}. + +Optionally, the decoded parts can be converted to another character +set. By default, the character set is not changed. + +@menu +* Opt-decodemail:: Invocation of @command{decodemail}. +* Conf-decodemail:: Configuration of @command{decodemail}. +* Using-decodemail:: Purpose and caveats of @command{decodemail}. +@end menu + +@node Opt-decodemail +@subsection Invocation of @command{decodemail}. + +Usually, the utility is invoked as: + +@example +decodemail @var{inbox} @var{outbox} +@end example + +@noindent +where @var{inbox} and @var{outbox} are file names or URLs of the input +and output mailboxes, correspondingly. The input mailbox is opened +read-only and will not be modified in any way. In particular, the +status of the processed messages will not change. If the output +mailbox does not exist, it will be created. If it exists, the +messages will be appended to it, preserving any original messages that +are already in it. This behavior can be changed using the @option{-t} +(@option{--truncate}) option, described below. + +The two mailboxes can be of different types. For example you can read +input from an imap server and store it in local @samp{maildir} box +using the following command: + +@example +decodemail imap://user@@example.com maildir:///var/mail/user +@end example + +Both arguments can be omitted. If @var{outbox} is not supplied, the +resulting mailbox will be printed on the standard output in Unix +@samp{mbox} format. If @var{inbox} is not supplied, the utility will +open the system inbox for the current user and use it for input. + +A consequence of these rules is that there is no simple way to read +the input mailbox from standard input (the input must be seekable). +If you need to do this, the normal procedure would be to save what +would be standard input in a temporary file and then give that file as +@command{decodemail}'s input. + +The following command line options modify the @command{decodemail} +behavior: + +@table @option +@item -c, --charset=@var{charset} +Convert all textual parts from their original character set to the +specified @var{charset}. + +@item -R, --recode +Convert all textual parts from their original character set to the +current character set, as specified by the @env{LC_ALL} or @env{LANG} +environment variable. + +@item --no-recode +Do not convert character sets. This is the default. + +@item -t, --truncate +If the output mailbox exists, truncate it before appending new +messages. + +@item --no-truncate +Keep the existing messages in the output mailbox intact. This is the +default. +@end table + +Additionally, the @ref{Common Options} are also understood. + +@node Conf-decodemail +@subsection Configuration of @command{decodemail}. + +The following common configuration statements affect the behavior of +@command{decodemail}: + +@multitable @columnfractions 0.3 0.6 +@headitem Statement @tab Reference +@item mime @tab @xref{mime statement}. +@item debug @tab @xref{Debug Statement}. +@item mailbox @tab @xref{Mailbox Statement}. +@item locking @tab @xref{Locking Statement}. +@end multitable + +Notably, the @code{mime} statement can be used to extend the list of +types which are decoded. For example, in the file @file{~/.decodemail} +(other locations are possible, @pxref{configuration}), you could have: + +@example +# base64/qp decode these mime types also: +mime @{ + text-type "application/x-bibtex"; + text-type "application/x-tex"; +@} +@end example + +Since the list of textual mime types is open-ended, with new types being +used at any time, we do not attempt to make the built-in list +comprehensive. + +@node Using-decodemail +@subsection Purpose and caveats of @command{decodemail}. + +The principal use envisioned for this program is to decode messages in +batch, after they are received. + +Unfortunately, some mailers prefer to encode messages in their +entirety in base64 (or quoted-printable), even when the content is +entirely human-readable text. This makes straightforward use of +@command{grep} or other standard commands impossible. The idea is for +@command{decodemail} to rectify that, by making the message text +readable again. + +Besides personal mail, mailing list archives are another place where +such decoding can be useful, as they are often searched with standard +tools. + +It is generally not recommended to run @command{decodemail} within a +mail reader (which should be able to do the decoding itself), or +directly in a terminal (since quite possibly there will be 8-bit +output not in the current character set). + +Although the output message from @command{decodemail} should be +entirely equivalent to the input message, apart from the decoding, it +is generally not identical. Because @command{decodemail} parses the +input message and reconstructs it for output, there are usually small +differences: + +@itemize +@item In the envelope @samp{From } line, multiple spaces are collapsed +to one. + +@item A @samp{Content-Transfer-Encoding:} header may be added where +not previously present, or its value changed from @samp{8bit} to +@samp{7bit}, or vice versa. This may happen both for the message as a +whole, and for a given mime part. @command{decodemail} looks at the +actual content of the text and outputs +@samp{Content-Transfer-Encoding:} accordingly. + +@item A trailing space is inserted when a long header line is broken +to occupy several lines (@dfn{header wrapping}). + +@example +SomeHeader: + someextremelylongvaluethatcannotbebroken +@end example + +@item The non-tracing headers may be reordered, notably those that are +mime-related. + +@item Any material before the first mime part of a mime multipart +message is lost. By the standards, nothing should appear +there. Typically if it does appear, it is a string such as @samp{This +is a multi-part message in MIME format.}. + +@item In mime parts, the charset specifications may no longer be +quoted (if quoting is not necessary). For example, +@samp{charset="utf-8"} becomes @samp{charset=utf-8}. + +@item The mime boundary strings will be changed. + +@end itemize + +If a discrepancy is created which actually affects message parsing or +reading, that's most likely a bug, and please report it. Naturally, +please send an exact input message to reproduce the problem. + |