Diffstat (limited to 'doc/texinfo/programs/decodemail.texi')
-rw-r--r-- doc/texinfo/programs/decodemail.texi 227
1 files changed, 227 insertions, 0 deletions
diff --git a/doc/texinfo/programs/decodemail.texi b/doc/texinfo/programs/decodemail.texi
new file mode 100644
index 000000000..ffddef4da
--- /dev/null
+++ b/doc/texinfo/programs/decodemail.texi
@@ -0,0 +1,227 @@
+@c This is part of the GNU Mailutils manual.
+@c Copyright (C) 2020--2024 Free Software Foundation, Inc.
+@c See file mailutils.texi for copying conditions.
+@comment *******************************************************************
+@pindex decodemail
+
+The @command{decodemail} utility is a filter program that reads
+messages from the input mailbox, decodes ``textual'' parts of each
+multipart message from a base64- or quoted-printable encoding to an
+8-bit or 7-bit transfer encoding, and stores the processed messages in
+the output mailbox. All messages from the input mailbox are stored in
+the output, regardless of whether a change was made.
+
+The message parts deemed to be textual are those whose
+@samp{Content-Type} header matches a predefined, or user-defined,
+mime type pattern. In addition, encoded pieces of the @samp{From:},
+@samp{To:}, @samp{Subject:}, etc., headers are decoded.
+
+For example, @command{decodemail} makes this transformation:
+
+@example
+Subject: =?utf-8?Q?The=20Baroque=20Enquirer=20|=20July=202020?=
+@result{} Subject: The Baroque Enquirer | July 2020
+@end example
+
+The built-in list of textual content type patterns is:
+
+@example
+text/*
+application/*shell
+application/shellscript
+*/x-csrc
+*/x-csource
+*/x-diff
+*/x-patch
+*/x-perl
+*/x-php
+*/x-python
+*/x-sh
+@end example
+
+These strings are matched as shell globbing patterns
+(@pxref{glob,,,glob(7), glob(7) manual page}).
+
+More patterns can be added to this list using the
+@code{mime.text-type} configuration statement.
+@xref{mime statement}, for a detailed discussion, and the
+configuration section below for a simple example.
+
+When processing old messages you may encounter @samp{Content-Type}
+headers whose value contains only a type, but no subtype. To match
+such headers, use a pattern without the @samp{/whatever} part. E.g.,
+@samp{text/*} matches @samp{text/plain} and @samp{text/html}, but
+does not match @samp{text}. On the other hand, @samp{t*xt} does
+not match @samp{text/plain}, but does match @samp{text}.
+
+Optionally, the decoded parts can be converted to another character
+set. By default, the character set is not changed.
+
+@menu
+* Opt-decodemail:: Invocation of @command{decodemail}.
+* Conf-decodemail:: Configuration of @command{decodemail}.
+* Using-decodemail:: Purpose and caveats of @command{decodemail}.
+@end menu
+
+@node Opt-decodemail
+@subsection Invocation of @command{decodemail}
+
+Usually, the utility is invoked as:
+
+@example
+decodemail @var{inbox} @var{outbox}
+@end example
+
+@noindent
+where @var{inbox} and @var{outbox} are file names or URLs of the input
+and output mailboxes, respectively. The input mailbox is opened
+read-only and will not be modified in any way. In particular, the
+status of the processed messages will not change. If the output
+mailbox does not exist, it will be created. If it exists, the
+messages will be appended to it, preserving any original messages that
+are already in it. This behavior can be changed using the @option{-t}
+(@option{--truncate}) option, described below.
+
+The two mailboxes can be of different types. For example, you can read
+input from an IMAP server and store it in a local @samp{maildir} mailbox
+using the following command:
+
+@example
+decodemail imap://user@@example.com maildir:///var/mail/user
+@end example
+
+Both arguments can be omitted. If @var{outbox} is not supplied, the
+resulting mailbox will be printed on the standard output in Unix
+@samp{mbox} format. If @var{inbox} is not supplied, the utility will
+open the system inbox for the current user and use it for input.
+
+A consequence of these rules is that there is no simple way to read
+the input mailbox from standard input (the input must be seekable).
+If you need to do this, the normal procedure would be to save what
+would be standard input in a temporary file and then give that file as
+@command{decodemail}'s input.
+
+The following command line options modify the behavior of
+@command{decodemail}:
+
+@table @option
+@item -c, --charset=@var{charset}
+Convert all textual parts from their original character set to the
+specified @var{charset}.
+
+@item -R, --recode
+Convert all textual parts from their original character set to the
+current character set, as specified by the @env{LC_ALL} or @env{LANG}
+environment variable.
+
+@item --no-recode
+Do not convert character sets. This is the default.
+
+@item -t, --truncate
+If the output mailbox exists, truncate it before appending new
+messages.
+
+@item --no-truncate
+Keep the existing messages in the output mailbox intact. This is the
+default.
+@end table
+
+Additionally, the @ref{Common Options} are understood.
+
+@node Conf-decodemail
+@subsection Configuration of @command{decodemail}
+
+The following common configuration statements affect the behavior of
+@command{decodemail}:
+
+@multitable @columnfractions 0.3 0.6
+@headitem Statement @tab Reference
+@item mime @tab @xref{mime statement}.
+@item debug @tab @xref{Debug Statement}.
+@item mailbox @tab @xref{Mailbox Statement}.
+@item locking @tab @xref{Locking Statement}.
+@end multitable
+
+Notably, the @code{mime} statement can be used to extend the list of
+types which are decoded. For example, in the file @file{~/.decodemail}
+(other locations are possible, @pxref{configuration}), you could have:
+
+@example
+# base64/qp decode these mime types also:
+mime @{
+ text-type "application/x-bibtex";
+ text-type "application/x-tex";
+@}
+@end example
+
+Since the list of textual mime types is open-ended, with new types being
+used at any time, we do not attempt to make the built-in list
+comprehensive.
+
+@node Using-decodemail
+@subsection Purpose and caveats of @command{decodemail}
+
+The principal use envisioned for this program is to decode messages in
+batch, after they are received.
+
+Unfortunately, some mailers prefer to encode messages in their
+entirety in base64 (or quoted-printable), even when the content is
+entirely human-readable text. This makes straightforward use of
+@command{grep} or other standard commands impossible. The idea is for
+@command{decodemail} to rectify that, by making the message text
+readable again.
+
+Besides personal mail, mailing list archives are another place where
+such decoding can be useful, as they are often searched with standard
+tools.
+
+It is generally not recommended to run @command{decodemail} within a
+mail reader (which should be able to do the decoding itself), or
+directly in a terminal (since quite possibly there will be 8-bit
+output not in the current character set).
+
+Although the output message from @command{decodemail} should be
+entirely equivalent to the input message, apart from the decoding, it
+is generally not identical. Because @command{decodemail} parses the
+input message and reconstructs it for output, there are usually small
+differences:
+
+@itemize
+@item In the envelope @samp{From } line, multiple spaces are collapsed
+to one.
+
+@item A @samp{Content-Transfer-Encoding:} header may be added where
+not previously present, or its value changed from @samp{8bit} to
+@samp{7bit}, or vice versa. This may happen both for the message as a
+whole, and for a given mime part. @command{decodemail} looks at the
+actual content of the text and outputs
+@samp{Content-Transfer-Encoding:} accordingly.
+
+@item A trailing space is inserted when a long header line is broken
+to occupy several lines (@dfn{header wrapping}).
+
+@example
+SomeHeader:
+ someextremelylongvaluethatcannotbebroken
+@end example
+
+@item The non-tracing headers may be reordered, notably those that are
+mime-related.
+
+@item Any material before the first mime part of a mime multipart
+message is lost. By the standards, nothing should appear
+there. Typically if it does appear, it is a string such as @samp{This
+is a multi-part message in MIME format.}.
+
+@item In mime parts, the charset specifications may no longer be
+quoted (if quoting is not necessary). For example,
+@samp{charset="utf-8"} becomes @samp{charset=utf-8}.
+
+@item The mime boundary strings will be changed.
+
+@end itemize
+
+If a discrepancy arises that actually affects message parsing or
+reading, it is most likely a bug; please report it, and include
+the exact input message needed to reproduce the problem.
+
