E-mail in the Near Future

Abstract

As electronic mail becomes more and more important to business and to consumers, in industry and in private life, for financial and for personal purposes, it becomes more and more important that we understand the issues facing e-mail users and software developers over the next few years.

In this paper we'll look at the state of the art of electronic mail and we'll look at where the art is going in the coming two or three years. We'll focus on three major areas: mail formatting, mail access, and mail security.

Definitions: What is E-mail?

In contrasting electronic mail with some related technologies, we can use the following definitions to show the similarities and differences:

E-mail: The asynchronous electronic transmission of information over a computer network.

Fax: The synchronous electronic transmission of information over a dedicated communication line.

Worldwide Web: The synchronous electronic retrieval of information over a computer network.

A Brief Look at the Evolution of E-mail

1977 to 1997 and beyond

-20 years	Experimental; RFC-733 introduced in 1977
-10 years	Geeks had e-mail; RFC-822 introduced in 1982
-5 years	WWW experimental (MIME - 1992 / HTML - 1993)
-2½ years	WWW/e-mail explosion
Today	"Everyone" has e-mail & WWW access
+2½ years	Rich formatting through MIME/HTML; flexible access through advanced client/server protocols; enhanced security, delivery receipts, signatures
+5 years	Much higher security and reliability; widespread deployment of today's emerging standards; who can really guess?

The Future of E-mail

Formatting

RFC-822 allowed for the transmission of electronic mail as simple messages in plain text, using the 7-bit US-ASCII character set. European character sets, not to mention the more complex Asian character sets, were not supported. E-mail was very simple: the headers described the message (who it came from, where it was going, and so on) and the body contained the content. Most mail-readers displayed all the headers to you, though some of the more sophisticated ones displayed a "memo style", showing only those (From, To, CC, Date, Subject) that might actually interest the reader.
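The header/body split and the "memo style" display can be sketched in a few lines of Python using the standard `email` package. The message, the addresses, and the `memo_view` helper are invented for illustration; this is a minimal sketch, not a description of any particular mail-reader.

```python
from email import message_from_string

# Hypothetical RFC-822 style message: headers, a blank line, then the body.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Date: Mon, 6 Jan 1997 09:30:00 -0500\n"
    "Subject: Lunch\n"
    "Received: from relay.example.net by mail.example.com\n"
    "\n"
    "Shall we meet at noon?\n"
)

# The headers a "memo style" display would show; everything else is hidden.
MEMO_HEADERS = ("From", "To", "Cc", "Date", "Subject")


def memo_view(raw: str) -> str:
    """Return only the interesting headers that are present, then the body."""
    msg = message_from_string(raw)
    shown = [f"{name}: {msg[name]}" for name in MEMO_HEADERS if msg[name] is not None]
    return "\n".join(shown) + "\n\n" + msg.get_payload()


view = memo_view(RAW)
print(view)
```

The "Received" line is part of the message but never reaches the reader, which is all that the memo style amounts to.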

RFC-822 became a standard in 1982, and it served well for several years. It soon became clear, though, that the limitations imposed by US-ASCII, by the lack of document structure, and by the inability to convey descriptive (meta-)information about the message content were restricting the usefulness of e-mail. Several software vendors began to provide more function in their proprietary e-mail products, but such function was limited to the users of that vendor's products, since every vendor was handling it in a different way.

Ten years after RFC-822 was accepted as a standard, a new standard extended it and removed all of the above restrictions. RFC-1341, the Multipurpose Internet Mail Extensions (MIME), was born out of the need for expanded character set support, but also provided (and has become much better known for) the ability to send messages with attachments. The message itself or any attachment could be, in addition to plain text, one of a number of data types, such as image or audio. MIME not only brought Internet standard e-mail to Europeans and Asians, with its character-set support; it also brought Internet e-mail into the multimedia world.

Of course, people had been sending images, programs, and "zip files" around by e-mail for years, but they had done it in an ad hoc way, and one which required a lot of manual work. To send an image, one had to encode the image file with a program that would translate all the bytes into US-ASCII characters, and one had to embed the encoded file into one's message in a way that made it clear to the recipient what it was, where it started and ended, and how to decode it. On the other end, the recipient had to save the mail in a file, edit the file to isolate the encoded part of the body, run the appropriate program to decode it (so s/he had to have the right program to decode it), and then find a program that would display the image. MIME standardized this so that the users' mail programs would take care of the encoding and decoding, the attaching and extracting, and the locating of the right program to display this or that type of data on the receiving end.
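To show what the mail program now does on the user's behalf, here is a minimal sketch using Python's standard `email` package. The addresses, the file name, and the stand-in image bytes are assumptions made up for the example; the library chooses the encoding (base64) and writes the labels that tell the receiving program what the part is.

```python
from email.message import EmailMessage


def build_message_with_attachment(image_bytes: bytes) -> EmailMessage:
    """Build a plain-text message with one image attached."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"      # hypothetical addresses
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Photo"
    msg.set_content("The picture is attached.")
    # The library encodes the bytes as 7-bit-safe text and labels the part.
    msg.add_attachment(image_bytes, maintype="image", subtype="gif",
                       filename="photo.gif")
    return msg


fake_image = b"GIF89a" + bytes(range(16))   # stand-in bytes, not a real image
msg = build_message_with_attachment(fake_image)
wire_text = msg.as_string()                 # what actually travels as mail

# On the receiving end, the mail program finds and decodes the part itself.
attachment = next(msg.iter_attachments())
recovered = attachment.get_content()
```

Neither the sender nor the recipient runs an encoder or edits a file by hand; the type label ("image/gif") is what lets the receiving program pick a viewer.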

MIME has itself evolved over the past few years, and RFC-1341 was replaced by RFC-1521 and then by RFC-2045 through 2049 (the current MIME definition is a set of five RFCs). The new standards continue the attachment model, which still provides limited information about the structure of a mail document. But in 1993 the introduction of HTML, the Hypertext Markup Language used to create pages on the Worldwide Web, defined a standard way to provide much more information about the format and structure of a message using a compound document format. HTML specifies not just the document parts, but also the relationship among the parts. A proposed standard, introduced in 1996 and called MHTML (or MIME/HTML), defines the mechanism for packaging a set of files (text, image, and so on) that are related and are to be considered as one document. This standard uses a new "multipart/related" MIME type for the packaging, and "text/html" for the formatting, and specifies standard ways to connect one part to another to describe a complete, formatted, multimedia document in e-mail.
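The packaging described here can be sketched with the same Python `email` package: an HTML part and an image part are wrapped in one "multipart/related" message, and the HTML refers to the image by its Content-ID. The subject, the domain, and the stand-in image bytes are assumptions for illustration only.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_html_message(image_bytes: bytes):
    """Package an HTML page and the image it uses as one related document."""
    msg = EmailMessage()
    msg["Subject"] = "Newsletter"                 # hypothetical subject
    cid = make_msgid(domain="example.org")        # looks like "<...@example.org>"
    # The HTML points at the image part with a "cid:" reference (no brackets).
    html = f'<html><body><p>Hello</p><img src="cid:{cid[1:-1]}"></body></html>'
    msg.set_content(html, subtype="html")
    # Adding a related part turns the message into multipart/related.
    msg.add_related(image_bytes, maintype="image", subtype="gif", cid=cid)
    return msg, cid


msg, cid = build_html_message(b"GIF89a" + bytes(8))
parts = list(msg.iter_parts())
```

The first part is the formatted text and the second is the image it refers to; the Content-ID is the link that makes the two parts one document rather than a message with an unrelated attachment.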

Over the next couple of years, as mail clients are enhanced to take advantage of these new standards, we will see the next stage in the development of e-mail. Once used for sending plain text, and now used to send plain text with a photo of the new baby attached, it will very soon be used to send complex documents with all the capabilities of the Worldwide Web.

Access

When e-mail started, each user's workstation had an e-mail address. If you wanted to send e-mail to John Smith, and John was at Princeton University and had a workstation called "enchanted-forest", then perhaps you sent e-mail to "[email protected]". If John's workstation was shut down, then the mail waited along the way, and it would eventually be delivered when John brought his machine back up. If John was away on vacation, then maybe the mail delivery service would give up after a while and would send the message back to you, telling you that it was unable to deliver it for five days or so.

But sometimes John would want to read his mail from another building on campus, or perhaps from another city, while he was travelling. Without access to his workstation, he couldn't do that directly, but it was possible to read the files on his hard disk remotely, over the network. If this was set up, John could mount the file system from enchanted-forest, through the network, and get to his mail from someplace else. Of course, that assumed that the mail-reader he had available to him could read the files stored by his own mail-reader.

To solve some of these problems, particularly the one about mail delivery, mail servers were developed, and the Post Office Protocol (POP) was defined. POP defines a protocol through which a mail client can retrieve mail from a mail server (the post office). With POP, I send mail to John by addressing it to something like "[email protected]". I don't have to know which workstation or workstations John uses, and I don't have to worry about whether John's workstation will be running and receiving mail. The post office server will be maintained by the computing center and will be reliable and available at all times. This eliminates the non-delivery problem -- the mail is delivered, and when the user retrieves it is irrelevant to the delivery. It also eliminates the need to go to your e-mail machine in order to check your mail.

The POP model, though, doesn't solve the problem of mail access entirely. When you retrieve new mail from the server, it now resides on whatever machine you retrieved it on. This works well if you have a laptop computer that you always have with you, but if you use different machines for e-mail at different times, it causes pieces of your mail to be in several different places, which is not easy to deal with. Some people solve that by configuring their POP clients to retrieve mail and still leave a copy on the server, but that doesn't fit the POP model and ultimately doesn't work well for a variety of reasons. To address this issue, the IMAP protocol was devised.

The IMAP model has all mail residing on the server, with the client used to read and manipulate the mail, but not for mail storage. The IMAP protocol allows for the creation of mail folders and the moving of mail from folder to folder, giving a truly flexible way to store mail remotely and access it from any client anywhere. If you have a laptop computer and you frequently want to read and answer your mail offline, IMAP gives the capability for making local (cache) copies of messages or folders and for synchronizing changes with the server at appropriate times. The future of mail access will include a combination of POP and IMAP (RFC-2060).
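The difference between the two models is about where the mail lives, and a toy in-memory model makes it concrete. The classes below are hypothetical and do not speak either wire protocol; they only mimic the storage behaviour described above.

```python
class PopStyleServer:
    """Toy post office: retrieved mail leaves the server."""

    def __init__(self):
        self.inbox = []

    def deliver(self, message):
        self.inbox.append(message)

    def retrieve_all(self):
        retrieved, self.inbox = self.inbox, []
        return retrieved


class ImapStyleServer:
    """Toy IMAP-like store: mail and folders stay on the server."""

    def __init__(self):
        self.folders = {"INBOX": []}

    def deliver(self, message):
        self.folders["INBOX"].append(message)

    def create_folder(self, name):
        self.folders.setdefault(name, [])

    def move(self, message, source, destination):
        self.folders[source].remove(message)
        self.folders[destination].append(message)

    def list_folder(self, name):
        return list(self.folders[name])


pop = PopStyleServer()
pop.deliver("Budget report")
laptop_mail = pop.retrieve_all()     # the message now lives on the laptop only
desktop_mail = pop.retrieve_all()    # nothing is left for the second machine

imap = ImapStyleServer()
imap.deliver("Budget report")
imap.create_folder("Work")
imap.move("Budget report", "INBOX", "Work")   # filed from the laptop
laptop_view = imap.list_folder("Work")
desktop_view = imap.list_folder("Work")       # same mail, same folder
```

In the POP-style run the second machine finds an empty mailbox; in the IMAP-style run both machines see the same message in the same folder.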

All major e-mail vendors are working on IMAP clients and servers at this writing. There are several commercial clients and a few servers available now, and within a few months there will be many more, including entries from IBM/Lotus, Netscape, and Microsoft. Within a couple of years, these IMAP servers will replace the POP servers currently in use by Internet Service Providers (ISPs) -- some of the servers support both IMAP and POP in the same server, allowing an ISP to make a gradual transition, and to continue supporting clients with older, POP interfaces. It will also allow the ISPs to better manage their resources: they may provide POP service, which requires only limited management of storage space, to all their customers, while they may make IMAP service available only to those customers who need it (and, perhaps, who will pay extra for it), because of the higher level of administration needed to manage IMAP users' storage needs.

Once it becomes typical to use different client machines at different times, we start noticing problems associated with that, though we thought we'd solved all of our problems. Since your address book is stored on your client, how do you send mail to your friends when you're at your parents' house using their computer? And what about configuration information? What's the Internet address of your mail server? What do you want your signature line to be when you send mail? When people reply to your mail, won't the replies go to your parents, and not to you? How do you use your private encryption keys? It seems that we need a way to retrieve this sort of information from a server somehow, so it can be set for any client, anywhere.

Enter: ACAP -- the Application Configuration Access Protocol. ACAP is an experimental protocol, currently under development, that defines a mechanism for retrieving configuration information, address book data, and the like from a server. It is currently a draft on track to become a proposed standard, and will probably become an RFC within the next year. There are no commercial implementations of it yet, but there is interest in the protocol from several major vendors, including Microsoft and IBM/Lotus. When ACAP, along with IMAP, is widely deployed, one will be able to use any client to read, manipulate, and answer one's mail, simply by remembering the address of one's ACAP server.

Security

Internet electronic mail has always been a medium of questionable reliability. There is no guarantee of delivery, no guarantee that the information has not been changed (accidentally or intentionally) along the way, and, since the Simple Mail Transfer Protocol (SMTP) does not provide for authentication, no guarantee that the mail really came from the person it seems to have come from. The store-and-forward relay system used for Internet e-mail is very flexible, but very insecure. In practice there are few problems, the actual reliability is very high, and the chance of your mail being observed or tampered with is very low (owing to, if nothing else, the sheer volume of e-mail that's moving around). Nevertheless, as we become increasingly dependent upon e-mail for such things as business transactions, we must be more concerned about its security.

There are four principal issues in electronic mail security:

assurance that the message was delivered,
assurance that the content was not changed along the way (integrity),
assurance that the sender is who it seems to be (signature), and
assurance that no one else has read the message (privacy).

Now, electronic mail has been in productive use for quite a few years without any of these features, so we must put the matter in perspective: just as one does not send every letter by registered mail with a return receipt requested, one need not send every piece of electronic mail that way. But if you're ordering goods by e-mail, you probably want to know that your order will arrive unchanged, that no one can peek at your credit card number while the order is in transit, and that no one else can send an order that the vendor will think came from you. With paper mail, we trust the postal service to provide this security. With electronic mail, we are less sure of the paths and relays through which the mail will travel, and we want independent assurance that it can't be poked or prodded. So we will choose to apply security techniques only when necessary, and not to every piece of mail sent on the Internet.

Delivery receipts have been a missing element of e-mail for years, and for years some mail clients have used informal, nonstandard header entries such as "return-receipt-to" to indicate the address to which to return a delivery receipt. Unfortunately, this has been plagued with problems, precisely because it's not standard. Should the receipt be sent when the message is delivered to its final destination, or should it wait until the end-user actually opens it? What should we do if the user deletes it without opening it? In a client/server environment, is the receipt sent by the server or by the client? And, not surprisingly, most clients (and servers) do not support these nonstandard headers.

There is a current draft that attempts to standardize the sending of receipts. The draft defines MDNs (Message Disposition Notifications), which are both human-readable and machine-parsable and which are clearly associated with the message to which they refer. The draft standardizes a set of "disposition-notification" header entries and suggests when servers and clients should send disposition reports back to the requester. We anticipate that the MDN specification will become a standard within the next year, and that clients and servers will begin to support it over the next two or three years.
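From the sender's side, requesting a receipt comes down to adding one header entry. The sketch below uses the "Disposition-Notification-To" header name from the MDN specification; the addresses and the `request_receipt` helper are hypothetical, and whether any receipt comes back depends entirely on the recipient's software.

```python
from email.message import EmailMessage


def request_receipt(msg: EmailMessage, receipt_address: str) -> EmailMessage:
    """Ask the recipient's mail software to report what happened to msg."""
    msg["Disposition-Notification-To"] = receipt_address
    return msg


order = EmailMessage()
order["From"] = "alice@example.org"       # hypothetical addresses
order["To"] = "orders@example.com"
order["Subject"] = "Order"
order.set_content("Please ship 10 widgets.")
request_receipt(order, "alice@example.org")
```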

To assure content integrity, we can apply a simple algorithm to the message content and derive a checksum, which is simply a number that identifies the current content of the message. We send the checksum along with the message. On the receiving end, we recompute the checksum and compare it with the one that was sent, and if they match we have a reasonable assurance that the message wasn't changed. It's possible that it has been changed and that the changed message happens to have the same checksum, but it's unlikely that accidental changes will result in the same checksum. Of course, this method doesn't protect against intentional tampering, since the tamperer would simply change the checksum as well. We'll come back to this issue later.
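A minimal sketch of the checksum exchange, using the CRC-32 function in Python's standard `zlib` module as the "simple algorithm" (our choice for illustration; the text does not name one). The message text is invented.

```python
import zlib


def checksum(content: bytes) -> int:
    """A simple checksum: one number that summarizes the content."""
    return zlib.crc32(content)


def arrived_intact(content: bytes, sent_checksum: int) -> bool:
    """Recompute on the receiving end and compare with what was sent."""
    return checksum(content) == sent_checksum


original = b"Please ship 10 widgets."
sent = checksum(original)

# An accidental change in transit is caught.
garbled = b"Please ship 19 widgets."
accident_detected = not arrived_intact(garbled, sent)

# A deliberate tamperer simply replaces the checksum too, and is not caught.
forged_checksum = checksum(garbled)
tampering_detected = not arrived_intact(garbled, forged_checksum)
```

Here `accident_detected` is true and `tampering_detected` is false, which is exactly the weakness noted above: a bare checksum guards against accidents, not against intent.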

Because it's very easy to masquerade as someone else when one is sending e-mail, we need a way to electronically sign our messages. We can accomplish this by encrypting a known string (say, the sender's name) using an encryption key known only to the sender and the recipient. The recipient then decrypts the string, and if that works we have the assurance we need. And we get it at fairly low cost, since encrypting a short signature string is much less work than encrypting an entire, possibly long, message.

We can combine the checksum and signature techniques and add the checksum to the signature before we encrypt it. That will prevent someone from tampering with the checksum and make it more difficult to modify the message without detection.
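The combined technique can be sketched as follows. Python's standard library has no cipher, so this sketch substitutes a keyed hash (HMAC with SHA-256) for the encrypted signature string: the tag depends on both the message content and a key shared by sender and recipient, so it plays the role of the protected checksum and of the signature at once. The key and message are made up for the example.

```python
import hashlib
import hmac


def sign(content: bytes, shared_key: bytes) -> str:
    """Keyed checksum: only a holder of shared_key can produce this tag."""
    return hmac.new(shared_key, content, hashlib.sha256).hexdigest()


def verify(content: bytes, tag: str, shared_key: bytes) -> bool:
    """Recompute the tag on the receiving end and compare in constant time."""
    return hmac.compare_digest(sign(content, shared_key), tag)


shared_key = b"known only to sender and recipient"   # hypothetical key
order = b"Please ship 10 widgets."
tag = sign(order, shared_key)

genuine = verify(order, tag, shared_key)
altered = verify(b"Please ship 99 widgets.", tag, shared_key)
impostor = verify(order, sign(order, b"a guessed key"), shared_key)
```

The genuine message verifies; the altered message and the message tagged with the wrong key do not, because a tamperer without the key cannot recompute a matching tag.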

If we truly need privacy -- the assurance that no one but the intended recipient can read the message -- we can encrypt the entire content of the message, or, in the case of multipart MIME messages, the entire part that must be private (for instance, in our example of ordering goods, we might leave the order unencrypted but put the credit card information in a separate attachment that is encrypted). This is more expensive than just encrypting the signature, but such a level of security is sometimes warranted.

There are a few proposed standards vying for acceptance in the areas of encryption algorithms and packaging mechanisms. You'll see terms such as PEM, PGP, SSL, TLS, S/MIME, Secure Multiparts, and others; a lot of work is being done to iron out these standards, decide which ones will predominate, and settle on a clean method of sending signed and encrypted data in e-mail. There are technical and practical advantages and disadvantages to each scheme, and as the dust (and politics) settles we will soon see one set of security mechanisms come out as the winner, and we'll see most e-mail software supporting signatures and privacy within the next couple of years.

So far we have left the question of the encryption keys aside, but it's time to address that now. Using a key known only to the sender and the recipient is practical in limited cases, but with many senders and many recipients, all sending mail to each other, it quickly gets out of hand. With a symmetric key system, one must have a separate key shared with each person with whom one communicates, and one must distribute those keys such that no one else can find them out. An easier system to use is a public key encryption system. In such a system there are two keys, and a message encrypted by one can only be decrypted with the other. One keeps one of the keys private (the private key) and distributes the other one (the public key) freely.
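The arithmetic behind "quickly gets out of hand" is simple to work out: with a symmetric system every pair of correspondents needs its own key, while with a public key system each person needs only one key pair. A short sketch, with hypothetical function names:

```python
def symmetric_keys_needed(people: int) -> int:
    """One shared secret key for every pair of correspondents."""
    return people * (people - 1) // 2


def public_key_pairs_needed(people: int) -> int:
    """One public/private key pair per person, whatever the group size."""
    return people


for group in (10, 1000):
    print(group, symmetric_keys_needed(group), public_key_pairs_needed(group))
```

For ten people that is 45 shared keys against 10 key pairs; for a thousand people it is 499,500 shared keys against 1,000 key pairs.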

With a public key system, a sender can produce a signature encrypted with that sender's private key, and anyone receiving the message can validate the signature using the sender's public key. There's no longer a need for the sender to have a separate key to use with each recipient, because the recipient knows that only the sender has the private key. Similarly, if someone wants to send something that can only be decoded by one particular recipient, the message is encrypted with the recipient's public key, and only that recipient, who holds the corresponding private key, can decode it. If we need both assurances at the same time (signature and privacy) we do a double-encryption, with the sender's private key (for signature) and the recipient's public key (for privacy). (In practice, encryption using such a two-key system is much more expensive than symmetric-key encryption, so we actually generate a symmetric key and use it to encrypt the message, and then use the public/private key to encrypt only the symmetric key and append that to the message. Such mechanics, though, are implementation details that do not affect the concepts in this discussion.)
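The mechanics can be illustrated with a toy version of the RSA public key algorithm (our choice for illustration; the text does not name one). Everything here is an assumption made for the sake of a runnable sketch: the primes are well-known Mersenne primes, there is no padding, and the symmetric cipher is an improvised XOR keystream, so this is hopelessly insecure and shows only the shape of the scheme. It needs Python 3.8 or later for the modular inverse.

```python
import hashlib
import secrets


def make_keypair(p, q, e=65537):
    """Textbook RSA from two primes: returns (public_key, private_key)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))     # modular inverse of e
    return (e, n), (d, n)


def digest_as_int(content, n):
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n


def sign(content, private_key):
    """Signature: the message digest transformed with the PRIVATE key."""
    d, n = private_key
    return pow(digest_as_int(content, n), d, n)


def verify(content, signature, public_key):
    """Anyone holding the PUBLIC key can check the signature."""
    e, n = public_key
    return pow(signature, e, n) == digest_as_int(content, n)


def keystream_xor(data, key):
    """Toy symmetric cipher; applying it twice restores the data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


def seal(content, recipient_public):
    """Encrypt content with a fresh symmetric key; wrap that key for the recipient."""
    e, n = recipient_public
    session_key = secrets.token_bytes(16)
    wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
    return wrapped_key, keystream_xor(content, session_key)


def unseal(wrapped_key, ciphertext, recipient_private):
    """Only the holder of the private key can recover the symmetric key."""
    d, n = recipient_private
    session_key = pow(wrapped_key, d, n).to_bytes(16, "big")
    return keystream_xor(ciphertext, session_key)


# Publicly known primes: fine for a demonstration, useless for real security.
sender_public, sender_private = make_keypair(2**89 - 1, 2**107 - 1)
recipient_public, recipient_private = make_keypair(2**61 - 1, 2**127 - 1)

order = b"Card number 0000 0000 0000 0000"            # made-up content
signature = sign(order, sender_private)               # signature
wrapped_key, ciphertext = seal(order, recipient_public)   # privacy

received = unseal(wrapped_key, ciphertext, recipient_private)
signature_ok = verify(received, signature, sender_public)
```

Note that the sender needs only its own private key and the recipient's public key, and the expensive two-key operation is applied to a short digest and a 16-byte symmetric key, never to the message itself.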

We've now eliminated one source of problems in the encryption system: since the public keys may be freely distributed and the private keys are not distributed, we no longer have to worry about distributing keys privately. I can give everyone my public key and not be concerned about who sees it, and people can store the public keys of the people they correspond with right in their address books alongside the e-mail addresses. So what problems remain? We still must distribute the public keys in a manner that assures the integrity of the key. That is, if you have my public key, you don't care who else has it, but you must be assured that it is, indeed, my public key. If I can pass my public key off as someone else's, then I can masquerade as that person by sending signatures that you think are his.

The distribution is easy when you're exchanging keys with someone you know. I can hand you my public key on a diskette, or put it on my business card and give it to you, or tell it to you on the telephone. But what if we don't know each other? Suppose you're a company I want to do business with, and I want to send you my credit card number, and I want to be sure it can't be intercepted: I need your company's public key. I could get it from your web page, perhaps; lots of people are distributing their public keys on their web pages. But anyone can put up a web page that claims to be your web page -- how can I tell it's really yours? We can set up a trusted key server. But how do we get the keys to that server in a secure way in the first place? And how does that scale up, when we're talking about distributing millions of different keys? A network of key servers perhaps? Run by whom? These are the problems that still need to be solved, beyond the simple definition of encryption and packaging standards that are underway. The standards will be clearly defined soon. The distribution network will take much longer to set up, and the system will be of limited use until that happens.

Summary and Conclusions

Mail formatting, mail access, and mail security are three areas that are particularly important to the expansion of electronic mail, and are areas where significant changes are expected over the next few years. Mail formatting will become much richer with the acceptance of the MHTML standard and with the inclusion of HTML formatting support in most e-mail clients. Mail access will become extremely flexible as IMAP and ACAP servers become widely used and IMAP and ACAP support is added to mail clients. And electronic mail will become much more secure as encryption and signature standards develop and as it becomes possible to distribute public encryption keys in a secure and efficient manner. These changes will increase the utility of e-mail for business, as well as personal use, and we will see a more widespread use of Internet e-mail for business negotiations and buying and selling of goods and services, areas formerly left to paper mail, the telephone, and in-person contact.