diff options
author | Sergey Poznyakoff <gray@gnu.org.ua> | 2003-03-19 12:51:31 +0000 |
---|---|---|
committer | Sergey Poznyakoff <gray@gnu.org.ua> | 2003-03-19 12:51:31 +0000 |
commit | df1140b1035ef784b35421b307b869d73329030d (patch) | |
tree | 802b747a4c818d40eeee5030ce63ca225a8a8a72 | |
parent | 285f13318db4f08e91b48344bb1d88e44a0dca9e (diff) | |
download | mailutils-df1140b1035ef784b35421b307b869d73329030d.tar.gz mailutils-df1140b1035ef784b35421b307b869d73329030d.tar.bz2 |
Added to the repository
-rw-r--r-- | doc/rfc/rfc1521.txt | 4539 |
1 files changed, 4539 insertions, 0 deletions
diff --git a/doc/rfc/rfc1521.txt b/doc/rfc/rfc1521.txt new file mode 100644 index 000000000..eae6dc736 --- /dev/null +++ b/doc/rfc/rfc1521.txt @@ -0,0 +1,4539 @@ + + + + + + +Network Working Group N. Borenstein +Request for Comments: 1521 Bellcore +Obsoletes: 1341 N. Freed +Category: Standards Track Innosoft + September 1993 + + + MIME (Multipurpose Internet Mail Extensions) Part One: + Mechanisms for Specifying and Describing + the Format of Internet Message Bodies + +Status of this Memo + + This RFC specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" for the standardization state and status + of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822 defines a message representation protocol which + specifies considerable detail about message headers, but which leaves + the message content, or message body, as flat ASCII text. This + document redefines the format of message bodies to allow multi-part + textual and non-textual message bodies to be represented and + exchanged without loss of information. This is based on earlier work + documented in RFC 934 and STD 11, RFC 1049, but extends and revises + that work. Because RFC 822 said so little about message bodies, this + document is largely orthogonal to (rather than a revision of) RFC + 822. + + In particular, this document is designed to provide facilities to + include multiple objects in a single message, to represent body text + in character sets other than US-ASCII, to represent formatted multi- + font text messages, to represent non-textual material such as images + and audio fragments, and generally to facilitate later extensions + defining new types of Internet mail for use by cooperating mail + agents. + + This document does NOT extend Internet mail header fields to permit + anything other than US-ASCII text data. Such extensions are the + subject of a companion document [RFC-1522]. + + This document is a revision of RFC 1341. Significant differences + from RFC 1341 are summarized in Appendix H. + + + + + +Borenstein & Freed [Page 1] + +RFC 1521 MIME September 1993 + + +Table of Contents + + 1. Introduction....................................... 3 + 2. Notations, Conventions, and Generic BNF Grammar.... 6 + 3. The MIME-Version Header Field...................... 7 + 4. The Content-Type Header Field...................... 9 + 5. The Content-Transfer-Encoding Header Field......... 13 + 5.1. Quoted-Printable Content-Transfer-Encoding......... 18 + 5.2. Base64 Content-Transfer-Encoding................... 21 + 6. Additional Content-Header Fields................... 23 + 6.1. Optional Content-ID Header Field................... 23 + 6.2. Optional Content-Description Header Field.......... 24 + 7. The Predefined Content-Type Values................. 24 + 7.1. The Text Content-Type.............................. 24 + 7.1.1. The charset parameter.............................. 25 + 7.1.2. The Text/plain subtype............................. 28 + 7.2. The Multipart Content-Type......................... 28 + 7.2.1. Multipart: The common syntax...................... 29 + 7.2.2. The Multipart/mixed (primary) subtype.............. 34 + 7.2.3. The Multipart/alternative subtype.................. 34 + 7.2.4. The Multipart/digest subtype....................... 36 + 7.2.5. The Multipart/parallel subtype..................... 37 + 7.2.6. Other Multipart subtypes........................... 37 + 7.3. The Message Content-Type........................... 38 + 7.3.1. The Message/rfc822 (primary) subtype............... 38 + 7.3.2. The Message/Partial subtype........................ 39 + 7.3.3. The Message/External-Body subtype.................. 42 + 7.3.3.1. The "ftp" and "tftp" access-types............... 44 + 7.3.3.2. The "anon-ftp" access-type...................... 45 + 7.3.3.3. The "local-file" and "afs" access-types......... 45 + 7.3.3.4. The "mail-server" access-type................... 45 + 7.3.3.5. Examples and Further Explanations............... 46 + 7.4. The Application Content-Type....................... 49 + 7.4.1. The Application/Octet-Stream (primary) subtype..... 50 + 7.4.2. The Application/PostScript subtype................. 50 + 7.4.3. Other Application subtypes......................... 53 + 7.5. The Image Content-Type............................. 53 + 7.6. The Audio Content-Type............................. 54 + 7.7. The Video Content-Type............................. 54 + 7.8. Experimental Content-Type Values................... 54 + 8. Summary............................................ 56 + 9. Security Considerations............................ 56 + 10. Authors' Addresses................................. 57 + 11. Acknowledgements................................... 58 + Appendix A -- Minimal MIME-Conformance.................... 60 + Appendix B -- General Guidelines For Sending Email Data... 63 + Appendix C -- A Complex Multipart Example................. 66 + Appendix D -- Collected Grammar........................... 68 + + + +Borenstein & Freed [Page 2] + +RFC 1521 MIME September 1993 + + + Appendix E -- IANA Registration Procedures................ 72 + E.1 Registration of New Content-type/subtype Values...... 72 + E.2 Registration of New Access-type Values + for Message/external-body............................ 73 + Appendix F -- Summary of the Seven Content-types.......... 74 + Appendix G -- Canonical Encoding Model.................... 76 + Appendix H -- Changes from RFC 1341....................... 78 + References................................................ 80 + +1. Introduction + + Since its publication in 1982, STD 11, RFC 822 [RFC-822] has defined + the standard format of textual mail messages on the Internet. Its + success has been such that the RFC 822 format has been adopted, + wholly or partially, well beyond the confines of the Internet and the + Internet SMTP transport defined by STD 10, RFC 821 [RFC-821]. As the + format has seen wider use, a number of limitations have proven + increasingly restrictive for the user community. + + RFC 822 was intended to specify a format for text messages. As such, + non-text messages, such as multimedia messages that might include + audio or images, are simply not mentioned. Even in the case of text, + however, RFC 822 is inadequate for the needs of mail users whose + languages require the use of character sets richer than US ASCII + [US-ASCII]. Since RFC 822 does not specify mechanisms for mail + containing audio, video, Asian language text, or even text in most + European languages, additional specifications are needed. + + One of the notable limitations of RFC 821/822 based mail systems is + the fact that they limit the contents of electronic mail messages to + relatively short lines of seven-bit ASCII. This forces users to + convert any non-textual data that they may wish to send into seven- + bit bytes representable as printable ASCII characters before invoking + a local mail UA (User Agent, a program with which human users send + and receive mail). Examples of such encodings currently used in the + Internet include pure hexadecimal, uuencode, the 3-in-4 base 64 + scheme specified in RFC 1421, the Andrew Toolkit Representation + [ATK], and many others. + + The limitations of RFC 822 mail become even more apparent as gateways + are designed to allow for the exchange of mail messages between RFC + 822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the + inclusion of non-textual body parts within electronic mail messages. + The current standards for the mapping of X.400 messages to RFC 822 + messages specify either that X.400 non-textual body parts must be + converted to (not encoded in) an ASCII format, or that they must be + discarded, notifying the RFC 822 user that discarding has occurred. + This is clearly undesirable, as information that a user may wish to + + + +Borenstein & Freed [Page 3] + +RFC 1521 MIME September 1993 + + + receive is lost. Even though a user's UA may not have the capability + of dealing with the non-textual body part, the user might have some + mechanism external to the UA that can extract useful information from + the body part. Moreover, it does not allow for the fact that the + message may eventually be gatewayed back into an X.400 message + handling system (i.e., the X.400 message is "tunneled" through + Internet mail), where the non-textual information would definitely + become useful again. + + This document describes several mechanisms that combine to solve most + of these problems without introducing any serious incompatibilities + with the existing world of RFC 822 mail. In particular, it + describes: + + 1. A MIME-Version header field, which uses a version number to + declare a message to be conformant with this specification and + allows mail processing agents to distinguish between such + messages and those generated by older or non-conformant software, + which is presumed to lack such a field. + + 2. A Content-Type header field, generalized from RFC 1049 [RFC-1049], + which can be used to specify the type and subtype of data in the + body of a message and to fully specify the native representation + (encoding) of such data. + + 2.a. A "text" Content-Type value, which can be used to represent + textual information in a number of character sets and + formatted text description languages in a standardized + manner. + + 2.b. A "multipart" Content-Type value, which can be used to + combine several body parts, possibly of differing types of + data, into a single message. + + 2.c. An "application" Content-Type value, which can be used to + transmit application data or binary data, and hence, among + other uses, to implement an electronic mail file transfer + service. + + 2.d. A "message" Content-Type value, for encapsulating another + mail message. + + 2.e An "image" Content-Type value, for transmitting still image + (picture) data. + + 2.f. An "audio" Content-Type value, for transmitting audio or + voice data. + + + + +Borenstein & Freed [Page 4] + +RFC 1521 MIME September 1993 + + + 2.g. A "video" Content-Type value, for transmitting video or + moving image data, possibly with audio as part of the + composite video data format. + + 3. A Content-Transfer-Encoding header field, which can be used to + specify an auxiliary encoding that was applied to the data in + order to allow it to pass through mail transport mechanisms which + may have data or character set limitations. + + 4. Two additional header fields that can be used to further describe + the data in a message body, the Content-ID and Content- + Description header fields. + + MIME has been carefully designed as an extensible mechanism, and it + is expected that the set of content-type/subtype pairs and their + associated parameters will grow significantly with time. Several + other MIME fields, notably including character set names, are likely + to have new values defined over time. In order to ensure that the + set of such values is developed in an orderly, well-specified, and + public manner, MIME defines a registration process which uses the + Internet Assigned Numbers Authority (IANA) as a central registry for + such values. Appendix E provides details about how IANA registration + is accomplished. + + Finally, to specify and promote interoperability, Appendix A of this + document provides a basic applicability statement for a subset of the + above mechanisms that defines a minimal level of "conformance" with + this document. + + HISTORICAL NOTE: Several of the mechanisms described in this + document may seem somewhat strange or even baroque at first + reading. It is important to note that compatibility with existing + standards AND robustness across existing practice were two of the + highest priorities of the working group that developed this + document. In particular, compatibility was always favored over + elegance. + + MIME was first defined and published as RFCs 1341 and 1342 [RFC-1341] + [RFC-1342]. This document is a relatively minor updating of RFC + 1341, and is intended to supersede it. The differences between this + document and RFC 1341 are summarized in Appendix H. Please refer to + the current edition of the "IAB Official Protocol Standards" for the + standardization state and status of this protocol. Several other RFC + documents will be of interest to the MIME implementor, in particular + [RFC 1343], [RFC-1344], and [RFC-1345]. + + + + + + +Borenstein & Freed [Page 5] + +RFC 1521 MIME September 1993 + + +2. Notations, Conventions, and Generic BNF Grammar + + This document is being published in two versions, one as plain ASCII + text and one as PostScript (PostScript is a trademark of Adobe + Systems Incorporated.). While the text version is the official + specification, some will find the PostScript version easier to read. + The textual contents are identical. An Andrew-format copy of this + document is also available from the first author (Borenstein). + + Although the mechanisms specified in this document are all described + in prose, most are also described formally in the modified BNF + notation of RFC 822. Implementors will need to be familiar with this + notation in order to understand this specification, and are referred + to RFC 822 for a complete explanation of the modified BNF notation. + + Some of the modified BNF in this document makes reference to + syntactic entities that are defined in RFC 822 and not in this + document. A complete formal grammar, then, is obtained by combining + the collected grammar appendix of this document with that of RFC 822 + plus the modifications to RFC 822 defined in RFC 1123, which + specifically changes the syntax for `return', `date' and `mailbox'. + + The term CRLF, in this document, refers to the sequence of the two + ASCII characters CR (13) and LF (10) which, taken together, in this + order, denote a line break in RFC 822 mail. + + The term "character set" is used in this document to refer to a + method used with one or more tables to convert encoded text to a + series of octets. This definition is intended to allow various kinds + of text encodings, from simple single-table mappings such as ASCII to + complex table switching methods such as those that use ISO 2022's + techniques. However, a MIME character set name must fully specify + the mapping to be performed. + + The term "message", when not further qualified, means either the + (complete or "top-level") message being transferred on a network, or + a message encapsulated in a body of type "message". + + The term "body part", in this document, means one of the parts of the + body of a multipart entity. A body part has a header and a body, so + it makes sense to speak about the body of a body part. + + The term "entity", in this document, means either a message or a body + part. All kinds of entities share the property that they have a + header and a body. + + The term "body", when not further qualified, means the body of an + entity, that is the body of either a message or of a body part. + + + +Borenstein & Freed [Page 6] + +RFC 1521 MIME September 1993 + + + NOTE: The previous four definitions are clearly circular. This is + unavoidable, since the overall structure of a MIME message is + indeed recursive. + + In this document, all numeric and octet values are given in decimal + notation. + + It must be noted that Content-Type values, subtypes, and parameter + names as defined in this document are case-insensitive. However, + parameter values are case-sensitive unless otherwise specified for + the specific parameter. + + FORMATTING NOTE: This document has been carefully formatted for + ease of reading. The PostScript version of this document, in + particular, places notes like this one, which may be skipped by + the reader, in a smaller, italicized, font, and indents it as + well. In the text version, only the indentation is preserved, so + if you are reading the text version of this you might consider + using the PostScript version instead. However, all such notes will + be indented and preceded by "NOTE:" or some similar introduction, + even in the text version. + + The primary purpose of these non-essential notes is to convey + information about the rationale of this document, or to place this + document in the proper historical or evolutionary context. Such + information may be skipped by those who are focused entirely on + building a conformant implementation, but may be of use to those + who wish to understand why this document is written as it is. + + For ease of recognition, all BNF definitions have been placed in a + fixed-width font in the PostScript version of this document. + +3. The MIME-Version Header Field + + Since RFC 822 was published in 1982, there has really been only one + format standard for Internet messages, and there has been little + perceived need to declare the format standard in use. This document + is an independent document that complements RFC 822. Although the + extensions in this document have been defined in such a way as to be + compatible with RFC 822, there are still circumstances in which it + might be desirable for a mail-processing agent to know whether a + message was composed with the new standard in mind. + + Therefore, this document defines a new header field, "MIME-Version", + which is to be used to declare the version of the Internet message + body format standard in use. + + Messages composed in accordance with this document MUST include such + + + +Borenstein & Freed [Page 7] + +RFC 1521 MIME September 1993 + + + a header field, with the following verbatim text: + + MIME-Version: 1.0 + + The presence of this header field is an assertion that the message + has been composed in compliance with this document. + + Since it is possible that a future document might extend the message + format standard again, a formal BNF is given for the content of the + MIME-Version field: + + version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT + + Thus, future format specifiers, which might replace or extend "1.0", + are constrained to be two integer fields, separated by a period. If + a message is received with a MIME-version value other than "1.0", it + cannot be assumed to conform with this specification. + + Note that the MIME-Version header field is required at the top level + of a message. It is not required for each body part of a multipart + entity. It is required for the embedded headers of a body of type + "message" if and only if the embedded message is itself claimed to be + MIME-conformant. + + It is not possible to fully specify how a mail reader that conforms + with MIME as defined in this document should treat a message that + might arrive in the future with some value of MIME-Version other than + "1.0". However, conformant software is encouraged to check the + version number and at least warn the user if an unrecognized MIME- + version is encountered. + + It is also worth noting that version control for specific content- + types is not accomplished using the MIME-Version mechanism. In + particular, some formats (such as application/postscript) have + version numbering conventions that are internal to the document + format. Where such conventions exist, MIME does nothing to supersede + them. Where no such conventions exist, a MIME type might use a + "version" parameter in the content-type field if necessary. + + NOTE TO IMPLEMENTORS: All header fields defined in this document, + including MIME-Version, Content-type, etc., are subject to the + general syntactic rules for header fields specified in RFC 822. In + particular, all can include comments, which means that the following + two MIME-Version fields are equivalent: + + MIME-Version: 1.0 + MIME-Version: 1.0 (Generated by GBD-killer 3.7) + + + + +Borenstein & Freed [Page 8] + +RFC 1521 MIME September 1993 + + +4. The Content-Type Header Field + + The purpose of the Content-Type field is to describe the data + contained in the body fully enough that the receiving user agent can + pick an appropriate agent or mechanism to present the data to the + user, or otherwise deal with the data in an appropriate manner. + + HISTORICAL NOTE: The Content-Type header field was first defined in + RFC 1049. RFC 1049 Content-types used a simpler and less powerful + syntax, but one that is largely compatible with the mechanism given + here. + + The Content-Type header field is used to specify the nature of the + data in the body of an entity, by giving type and subtype + identifiers, and by providing auxiliary information that may be + required for certain types. After the type and subtype names, the + remainder of the header field is simply a set of parameters, + specified in an attribute/value notation. The set of meaningful + parameters differs for the different types. In particular, there are + NO globally-meaningful parameters that apply to all content-types. + Global mechanisms are best addressed, in the MIME model, by the + definition of additional Content-* header fields. The ordering of + parameters is not significant. Among the defined parameters is a + "charset" parameter by which the character set used in the body may + be declared. Comments are allowed in accordance with RFC 822 rules + for structured header fields. + + In general, the top-level Content-Type is used to declare the general + type of data, while the subtype specifies a specific format for that + type of data. Thus, a Content-Type of "image/xyz" is enough to tell + a user agent that the data is an image, even if the user agent has no + knowledge of the specific image format "xyz". Such information can + be used, for example, to decide whether or not to show a user the raw + data from an unrecognized subtype -- such an action might be + reasonable for unrecognized subtypes of text, but not for + unrecognized subtypes of image or audio. For this reason, registered + subtypes of audio, image, text, and video, should not contain + embedded information that is really of a different type. Such + compound types should be represented using the "multipart" or + "application" types. + + Parameters are modifiers of the content-subtype, and do not + fundamentally affect the requirements of the host system. Although + most parameters make sense only with certain content-types, others + are "global" in the sense that they might apply to any subtype. For + example, the "boundary" parameter makes sense only for the + "multipart" content-type, but the "charset" parameter might make + sense with several content-types. + + + +Borenstein & Freed [Page 9] + +RFC 1521 MIME September 1993 + + + An initial set of seven Content-Types is defined by this document. + This set of top-level names is intended to be substantially complete. + It is expected that additions to the larger set of supported types + can generally be accomplished by the creation of new subtypes of + these initial types. In the future, more top-level types may be + defined only by an extension to this standard. If another primary + type is to be used for any reason, it must be given a name starting + with "X-" to indicate its non-standard status and to avoid a + potential conflict with a future official name. + + In the Augmented BNF notation of RFC 822, a Content-Type header field + value is defined as follows: + + content := "Content-Type" ":" type "/" subtype *(";" + parameter) + ; case-insensitive matching of type and subtype + + type := "application" / "audio" + / "image" / "message" + / "multipart" / "text" + / "video" / extension-token + ; All values case-insensitive + + extension-token := x-token / iana-token + + iana-token := <a publicly-defined extension token, + registered with IANA, as specified in + appendix E> + + x-token := <The two characters "X-" or "x-" followed, with + no intervening white space, by any token> + + subtype := token ; case-insensitive + + parameter := attribute "=" value + + attribute := token ; case-insensitive + + value := token / quoted-string + + token := 1*<any (ASCII) CHAR except SPACE, CTLs, + or tspecials> + + tspecials := "(" / ")" / "<" / ">" / "@" + / "," / ";" / ":" / "\" / <"> + / "/" / "[" / "]" / "?" / "=" + ; Must be in quoted-string, + ; to use within parameter values + + + +Borenstein & Freed [Page 10] + +RFC 1521 MIME September 1993 + + + Note that the definition of "tspecials" is the same as the RFC 822 + definition of "specials" with the addition of the three characters + "/", "?", and "=", and the removal of ".". + + Note also that a subtype specification is MANDATORY. There are no + default subtypes. + + The type, subtype, and parameter names are not case sensitive. For + example, TEXT, Text, and TeXt are all equivalent. Parameter values + are normally case sensitive, but certain parameters are interpreted + to be case-insensitive, depending on the intended use. (For example, + multipart boundaries are case-sensitive, but the "access-type" for + message/External-body is not case-sensitive.) + + Beyond this syntax, the only constraint on the definition of subtype + names is the desire that their uses must not conflict. That is, it + would be undesirable to have two different communities using + "Content-Type: application/foobar" to mean two different things. The + process of defining new content-subtypes, then, is not intended to be + a mechanism for imposing restrictions, but simply a mechanism for + publicizing the usages. There are, therefore, two acceptable + mechanisms for defining new Content-Type subtypes: + + 1. Private values (starting with "X-") may be + defined bilaterally between two cooperating + agents without outside registration or + standardization. + + 2. New standard values must be documented, + registered with, and approved by IANA, as + described in Appendix E. Where intended for + public use, the formats they refer to must + also be defined by a published specification, + and possibly offered for standardization. + + The seven standard initial predefined Content-Types are detailed in + the bulk of this document. They are: + + text -- textual information. The primary subtype, + "plain", indicates plain (unformatted) text. No + special software is required to get the full + meaning of the text, aside from support for the + indicated character set. Subtypes are to be used + for enriched text in forms where application + software may enhance the appearance of the text, + but such software must not be required in order to + get the general idea of the content. Possible + subtypes thus include any readable word processor + + + +Borenstein & Freed [Page 11] + +RFC 1521 MIME September 1993 + + + format. A very simple and portable subtype, + richtext, was defined in RFC 1341, with a future + revision expected. + + multipart -- data consisting of multiple parts of + independent data types. Four initial subtypes + are defined, including the primary "mixed" + subtype, "alternative" for representing the same + data in multiple formats, "parallel" for parts + intended to be viewed simultaneously, and "digest" + for multipart entities in which each part is of + type "message". + + message -- an encapsulated message. A body of + Content-Type "message" is itself all or part of a + fully formatted RFC 822 conformant message which + may contain its own different Content-Type header + field. The primary subtype is "rfc822". The + "partial" subtype is defined for partial messages, + to permit the fragmented transmission of bodies + that are thought to be too large to be passed + through mail transport facilities. Another + subtype, "External-body", is defined for + specifying large bodies by reference to an + external data source. + + image -- image data. Image requires a display device + (such as a graphical display, a printer, or a FAX + machine) to view the information. Initial + subtypes are defined for two widely-used image + formats, jpeg and gif. + + audio -- audio data, with initial subtype "basic". + Audio requires an audio output device (such as a + speaker or a telephone) to "display" the contents. + + video -- video data. Video requires the capability to + display moving images, typically including + specialized hardware and software. The initial + subtype is "mpeg". + + application -- some other kind of data, typically + either uninterpreted binary data or information to + be processed by a mail-based application. The + primary subtype, "octet-stream", is to be used in + the case of uninterpreted binary data, in which + case the simplest recommended action is to offer + to write the information into a file for the user. + + + +Borenstein & Freed [Page 12] + +RFC 1521 MIME September 1993 + + + An additional subtype, "PostScript", is defined + for transporting PostScript documents in bodies. + Other expected uses for "application" include + spreadsheets, data for mail-based scheduling + systems, and languages for "active" + (computational) email. (Note that active email + and other application data may entail several + security considerations, which are discussed later + in this memo, particularly in the context of + application/PostScript.) + + Default RFC 822 messages are typed by this protocol as plain text in + the US-ASCII character set, which can be explicitly specified as + "Content-type: text/plain; charset=us-ascii". If no Content-Type is + specified, this default is assumed. In the presence of a MIME- + Version header field, a receiving User Agent can also assume that + plain US-ASCII text was the sender's intent. In the absence of a + MIME-Version specification, plain US-ASCII text must still be + assumed, but the sender's intent might have been otherwise. + + RATIONALE: In the absence of any Content-Type header field or + MIME-Version header field, it is impossible to be certain that a + message is actually text in the US-ASCII character set, since it + might well be a message that, using the conventions that predate + this document, includes text in another character set or non- + textual data in a manner that cannot be automatically recognized + (e.g., a uuencoded compressed UNIX tar file). Although there is + no fully acceptable alternative to treating such untyped messages + as "text/plain; charset=us-ascii", implementors should remain + aware that if a message lacks both the MIME-Version and the + Content-Type header fields, it may in practice contain almost + anything. + + It should be noted that the list of Content-Type values given here + may be augmented in time, via the mechanisms described above, and + that the set of subtypes is expected to grow substantially. + + When a mail reader encounters mail with an unknown Content-type + value, it should generally treat it as equivalent to + "application/octet-stream", as described later in this document. + +5. The Content-Transfer-Encoding Header Field + + Many Content-Types which could usefully be transported via email are + represented, in their "natural" format, as 8-bit character or binary + data. Such data cannot be transmitted over some transport protocols. + For example, RFC 821 restricts mail messages to 7-bit US-ASCII data + with lines no longer than 1000 characters. + + + +Borenstein & Freed [Page 13] + +RFC 1521 MIME September 1993 + + + It is necessary, therefore, to define a standard mechanism for re- + encoding such data into a 7-bit short-line format. This document + specifies that such encodings will be indicated by a new "Content- + Transfer-Encoding" header field. The Content-Transfer-Encoding field + is used to indicate the type of transformation that has been used in + order to represent the body in an acceptable manner for transport. + + Unlike Content-Types, a proliferation of Content-Transfer-Encoding + values is undesirable and unnecessary. However, establishing only a + single Content-Transfer-Encoding mechanism does not seem possible. + There is a tradeoff between the desire for a compact and efficient + encoding of largely-binary data and the desire for a readable + encoding of data that is mostly, but not entirely, 7-bit data. For + this reason, at least two encoding mechanisms are necessary: a + "readable" encoding and a "dense" encoding. + + The Content-Transfer-Encoding field is designed to specify an + invertible mapping between the "native" representation of a type of + data and a representation that can be readily exchanged using 7 bit + mail transport protocols, such as those defined by RFC 821 (SMTP). + This field has not been defined by any previous standard. The field's + value is a single token specifying the type of encoding, as + enumerated below. Formally: + + encoding := "Content-Transfer-Encoding" ":" mechanism + + mechanism := "7bit" ; case-insensitive + / "quoted-printable" + / "base64" + / "8bit" + / "binary" + / x-token + + These values are not case sensitive. That is, Base64 and BASE64 and + bAsE64 are all equivalent. An encoding type of 7BIT requires that + the body is already in a seven-bit mail-ready representation. This + is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is + assumed if the Content-Transfer-Encoding header field is not present. + + The values "8bit", "7bit", and "binary" all mean that NO encoding has + been performed. However, they are potentially useful as indications + of the kind of data contained in the object, and therefore of the + kind of encoding that might need to be performed for transmission in + a given transport system. In particular: + + "7bit" means that the data is all represented as short + lines of US-ASCII data. + + + + +Borenstein & Freed [Page 14] + +RFC 1521 MIME September 1993 + + + "8bit" means that the lines are short, but there may be + non-ASCII characters (octets with the high-order + bit set). + + "Binary" means that not only may non-ASCII characters + be present, but also that the lines are not + necessarily short enough for SMTP transport. + + The difference between "8bit" (or any other conceivable bit-width + token) and the "binary" token is that "binary" does not require + adherence to any limits on line length or to the SMTP CRLF semantics, + while the bit-width tokens do require such adherence. If the body + contains data in any bit-width other than 7-bit, the appropriate + bit-width Content-Transfer-Encoding token must be used (e.g., "8bit" + for unencoded 8 bit wide data). If the body contains binary data, + the "binary" Content-Transfer-Encoding token must be used. + + NOTE: The distinction between the Content-Transfer-Encoding values + of "binary", "8bit", etc. may seem unimportant, in that all of + them really mean "none" -- that is, there has been no encoding of + the data for transport. However, clear labeling will be of + enormous value to gateways between future mail transport systems + with differing capabilities in transporting data that do not meet + the restrictions of RFC 821 transport. + + Mail transport for unencoded 8-bit data is defined in RFC-1426 + [RFC-1426]. As of the publication of this document, there are no + standardized Internet mail transports for which it is legitimate + to include unencoded binary data in mail bodies. Thus there are + no circumstances in which the "binary" Content-Transfer-Encoding + is actually legal on the Internet. However, in the event that + binary mail transport becomes a reality in Internet mail, or when + this document is used in conjunction with any other binary-capable + transport mechanism, binary bodies should be labeled as such using + this mechanism. + + NOTE: The five values defined for the Content-Transfer-Encoding + field imply nothing about the Content-Type other than the + algorithm by which it was encoded or the transport system + requirements if unencoded. + |