Before email (short for electronic mail, or
e-mail), when dinosaurs walked the Earth,
it was difficult to avoid phone tag.
To leave someone a message you needed to physically travel to
their office and leave a slip of paper.
(There were no cell phones in those days, and even
answering machines were rare.)
At some point it was realized that (at some organizations anyway)
people had computer terminals in their offices, and that it was
possible to leave a message in a file at a given location.
When a user returned to their office they could check if that file
existed and contained any new messages.
To make this easier a simple program was used.
To send email to someone the original mail program
could be used this way: mail user.
Any text you typed thereafter would be appended to that user's
mailbox file, sometimes called the inbox.
(The standard location was
/usr/spool/mail/username; today
/var/mail/username is more common.)
When done entering the message you would indicate EOF
by hitting control+d.
(You could also pipe into this command.)
To read your mail (if any) a user would just type mail.
This would dump the contents of that user's mailbox file to
the screen.
Over time this command became more sophisticated: it
automatically added a header with the sender's username and the
time the message was sent.
This scheme allowed the mailbox to hold multiple messages since
the header separated one from the next.
That file is called a mail folder
(or mailbox), but it is a single text file containing one
message after another.
The newer mail command also displayed one message at
a time and allowed options for the user to save the message,
delete the message, and even to reply to it (that is,
send a message back to the person who sent you a message).
The reply message usually has the same subject
as the original,
with Re: prepended.
Sometimes the original message is quoted in the new message
body as well.
An email message contains three parts: the envelope which identifies the sender and recipients, the headers, and the message body. The headers and body together are referred to as the message.
It is important to understand that only the recipients listed on the envelope will receive an email message; mail servers never look at any headers to determine this! Also the envelope is not part of the message that gets saved in a user's mailbox. So once some email message is delivered to you, you can't tell who else was listed on the envelope.
The special header mentioned above that identifies the start of an email message in a mailbox looks like this:
From sender date
(Note the space and no colon after
From.)
This special From
header is part of the
mailbox format common to most systems, known as
MBOX or Berkeley mailbox format,
and is not a standard email header at all.
(See RFC-4155 and
the mbox(5) man page for details.)
This From
header is generated automatically from the
envelope from address.
Even if a From
header was provided it is
over-written by most mail servers with the real sender.
This header is always the first one of an email message.
When a mail program is reading a mailbox, an email message begins
with this header and continues until the next
From
header, or until the end of the file.
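As a rough sketch of how a mail program might find message boundaries in such a file, the following Python splits mailbox text at every line that begins with From followed by a space. The function name split_mbox and the sample senders and dates are made up for illustration; real mailbox readers (Python's standard mailbox module, for instance) handle more cases than this.

```python
def split_mbox(text: str) -> list[str]:
    """Split mbox-style text into messages at each line starting with 'From '."""
    messages: list[list[str]] = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            messages.append([line])       # a separator line opens a new message
        elif messages:
            messages[-1].append(line)     # any other line belongs to the current message
    return ["".join(parts) for parts in messages]


# Hypothetical two-message mailbox used only for this illustration.
SAMPLE = (
    "From alice Mon Jan  1 10:00:00 2024\n"
    "Subject: hello\n"
    "\n"
    "First message.\n"
    "From bob Mon Jan  1 11:00:00 2024\n"
    "Subject: re: hello\n"
    "\n"
    "Second message.\n"
)

parts = split_mbox(SAMPLE)
print(len(parts), "messages found")
```

Note that the header line From: (with a colon) does not match, because the test looks for the word From followed by a space.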
What happens if the message body contains a line starting with
From?
The mail program will typically insert an ASCII space
in front of that line, a technique called
space-stuffing.
When displaying messages the mail program removes the extra
spaces.
(Many mailbox writers instead prepend a > character, so the
line is stored as >From.)
Eventually other headers were allowed as well such as
Subject:.
A problem is how to tell the difference between the headers
added by the mail program and the message entered by the
sender.
The answer is to have all the headers at the beginning
(hence the name), followed by one blank line, and that
followed by the message entered by the user.
That part is known as the message body.
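The headers, blank line, body layout can be seen with Python's standard email package. This is only an illustration: the addresses and text are invented, and the second Subject: line is deliberately placed after the blank line to show that it is treated as body text.

```python
from email import message_from_string

# Hypothetical raw message: headers, one blank line, then the body.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Lunch?\n"
    "\n"
    "Subject: this line is body text, not a header\n"
)

msg = message_from_string(RAW)
subject = msg["Subject"]      # taken from the header section only
body = msg.get_payload()      # everything after the first blank line

print("Subject header:", subject)
print("Body:", body, end="")
```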
Recall the To:
header does not determine
to whom the email gets delivered.
The addresses passed to the mail server (the envelope
addresses) and not the ones listed in any mail headers
determine who receives the email.
Since the various headers in the message do not determine who
receives the message, they may easily be faked.
There are many standard headers that can be used, such as
Subject:, To:, From:,
Cc: (carbon copy), Date:,
etc.
The carbon copy list works the same as the To:
list; it is just
more recipients to add to the envelope.
The only difference is that somehow being listed on the
To: list confers more status than being listed on the
Cc: list.
Note there is no such thing as a Bcc: header;
a blind carbon copy
is a recipient listed on the
envelope but not in the headers.
(See RFC-2076
and RFC-4021
for a description of standard email headers.)
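To make the envelope/header distinction concrete, here is a small sketch of how an MUA might build the envelope recipient list: everyone in To: and Cc:, plus blind carbon copy recipients that never appear in the message itself. The helper name envelope_recipients and all addresses are assumptions made for this example, not part of any particular mail program.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def envelope_recipients(msg: EmailMessage, bcc: list[str]) -> list[str]:
    """Collect envelope addresses: To: and Cc: headers plus the blind copies."""
    visible = [str(value) for value in msg.get_all("To", []) + msg.get_all("Cc", [])]
    addresses = [addr for _name, addr in getaddresses(visible) if addr]
    return addresses + list(bcc)   # bcc recipients exist only on the envelope


msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Cc"] = "carol@example.com"
msg["Subject"] = "Status report"
msg.set_content("All systems normal.\n")

recipients = envelope_recipients(msg, bcc=["dave@example.com"])
print(recipients)
```

The list in recipients is what would be handed to the mail server; the text of the message gives no hint that dave@example.com also received a copy.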
It is often the case where you need to add your name, title, contact information, and sometimes a legal notice, to some or to every email message you send. This information is known as a signature block, signature line, sig block, or just a signature. (This should not be confused with digital signatures, discussed below.) That can get tedious to type in for each message! Many mail programs (MUAs) and mail servers (MTAs) have a feature where you can set a signature to be automatically appended to the body of all email messages.
There are rules of netiquette
(network etiquette)
for email signatures.
They should always begin with a line only containing two dashes
and a space.
This signature separator is called sig dashes,
signature cut line, or sig-marker.
(It is recognized automatically by most
email programs, which can treat signatures specially when replying
to a message and quoting the message body.)
The other rules are that the signature should be plain text,
with no more than 4 lines;
each line should be at most 80 columns long.
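These signature rules are simple enough to check mechanically. The sketch below appends a signature after a sig dashes line and refuses signatures that break the 4 line or 80 column limits; the function names and the sample text are invented for illustration.

```python
SIG_SEPARATOR = "-- "   # two dashes and a space, alone on a line


def append_signature(body: str, signature: str) -> str:
    """Append a signature block to a message body, enforcing the usual limits."""
    lines = signature.splitlines()
    if len(lines) > 4:
        raise ValueError("signature is longer than 4 lines")
    if any(len(line) > 80 for line in lines):
        raise ValueError("signature line is longer than 80 columns")
    return body.rstrip("\n") + "\n" + SIG_SEPARATOR + "\n" + "\n".join(lines) + "\n"


def strip_signature(text: str) -> str:
    """Return the text above the last sig dashes line (as done when quoting a reply)."""
    lines = text.splitlines()
    for index in range(len(lines) - 1, -1, -1):
        if lines[index] == SIG_SEPARATOR:
            return "\n".join(lines[:index]) + "\n"
    return text   # no signature found


full = append_signature("Hi Bob,\nSee you at noon.\n", "Alice Example\nalice@example.com")
print(full, end="")
```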
There are many rules of netiquette
for the body of email.
When using traditional, plain text email there are rules for
formatting signatures, quoting material, line length, line wrapping,
and so on.
These are defined in
RFC-3676 (Text/Plain Format).
This also defines when to use space stuffing
(adding
a space to lines that start with
>,
a space, or From).
One useful convention you should follow is to quote URLs
with angle-brackets (< and >).
This allows an MUA to recognize a URL
even when wrapped over multiple lines.
Modern email addresses look like this:
username@hostname
The hostname is a host or computer on a
network.
The @hostname
part is optional;
if missing the localhost
(that is, the current system) is assumed.
The hostname should be a valid
DNS
host name or an IP
address.
(It can also be another name defined in the DNS system,
in an MX record.
A common example is to use an organization's domain name only
and not the name of any particular host; for example
user@example.com
and not
user@mailserver.example.com.)
The username should be a valid account on that system
(or a defined alias such as
webmaster).
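The optional @hostname rule can be sketched in a few lines. The function name split_address is hypothetical, and the code does no validation of the user or host names; it only shows the localhost default.

```python
def split_address(address: str, default_host: str = "localhost") -> tuple[str, str]:
    """Split username@hostname; a missing @hostname means the local system."""
    username, separator, hostname = address.rpartition("@")
    if not separator:                 # no "@" present at all
        return address, default_host
    return username, hostname


print(split_address("user@example.com"))
print(split_address("webmaster"))
```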
There are a number of programs that play a role in composing, reading, and delivering email:
MTA (Mail Transport Agent) — Examples include Sendmail, Postfix, Exim, and Exchange. The MTA is the software that accepts email from an MUA and then routes and forwards the email (several hops if necessary) to the destination MTA. The destination MTA must also handle security issues such as rejecting email or sending a redirection message back, alias expansion, forwarding, relaying, etc. An MTA that accepts mail destined for other MTAs is relaying email. An MTA that does this without requiring sender authentication is called an open mail relay. Spammers love these!
MDA
(Mail Delivery Agent) —
An example is procmail.
The MDA handles final delivery issues such as virus
scanning, spam filtering, return-receipt handling, automatic mail
processing (by piping into some program), forwarding email
to users and groups, sorting email into different mailboxes,
etc.
The most common action is to simply append to the user's inbox.
(Note some software such as Exchange or Sendmail is both an
MTA and MDA.)
It is the MDA that must be configured with the location
of a user's inbox.
The standard location is /var/mail/username
but this can be changed (how depends on which MDA
you use).
Of course the MUA must also know that pathname;
the MAIL environment variable is often
used to tell an MUA this location.
MAA (Mail Access Agent) — Examples include Courier and Cyrus. When email is sent to a user the MDA stores it on the server's hard disks. Users rarely have login access to that mail server! (YborStudent is an exception.) Consider AOL mail, Yahoo mail, Gmail, or Hotmail. Your mailboxes are stored on those remote servers. Somehow you need to access those remote mailboxes with your local mail reading software (your MUA). An MAA server is used in addition to the MTA to provide this remote access. The user's MUA authenticates the user to the MAA which then downloads a copy of the user's mailbox (or selected messages only) to the MUA, where the user can then read, save, copy, print, or delete messages. (Remember on some systems, notably Exchange server, the MTA, MDA, and even MAA are a part of a single program.)
MUA
(Mail User Agent) —
Examples include
alpine (was pine),
mutt, Eudora,
Outlook, Thunderbird, HotMail.com, mail, and
mailx (or nail).
The MUA (also called an email client)
is the software that allows you to compose,
send, and read your email.
An MUA must be configured with the MTA
to use to send mail and the MAA
to use to fetch the mail.
(Older MUAs such as mailx can't use an
MAA; they just read the mailbox file directly.)
In addition some MTAs and MAAs require
usernames and passwords for authentication and may also require
additional security configuration.
Some MUAs such as mail.yahoo.com,
hotmail.com, and mail.google.com
(or Gmail) are not installed on your local computer,
but on a web server that users access with a web browser.
Even though they are installed on a server they are still just
MUAs or email clients.
MSA (Mail Submission Agent) — With about 90% of email currently rejected as spam or as containing viruses, a commonly used mail architecture today is to use an MSA to screen out and drop such email before the MTA and MDA must process it. In this case the MDA won't also need to scan for viruses or spam (although it can, perhaps using a more sophisticated filter to catch spam that makes it through the bulk filter used in the MSA).
First you start up your MUA, compose a message, add
some headers (such as Subject:), and state to whom
the email should be sent.
When you are finished the MUA (usually) adds some
additional standard headers and then sends the mail message to your
MTA.
The mail gets routed from MTA to
MTA (nowadays very few hops are needed), with the
MTAs in the middle relaying the mail
to the next MTA along the path from the source to
the destination.
(Each MTA that receives an email will add a Received:
header to it.
By looking at these headers you can see the path a message took.)
The mail arrives and is accepted by the MTA at the
destination.
That MTA gives the mail to the MDA, which
may filter the mail, forward it, sort it to
different mail folders, ..., and finally deliver the mail to
the user's inbox.
Note when sending email to many users on the same system,
all the users are listed on RCPT TO: lines (envelope
addresses, not a header) but the actual email is
sent only once to that (remote) system.
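The path a message took can be read back from its Received: headers. The sketch below assumes each MTA adds its header at the front of the message, so the headers are reversed to get travel order. The host names and the simplified header text are hypothetical; real Received: headers carry more detail.

```python
from email import message_from_string

# Hypothetical message that passed through two MTAs.
RAW = (
    "Received: from mx.example.org by mail.example.com; Mon, 1 Jan 2024 10:00:03 +0000\n"
    "Received: from laptop.example.org by mx.example.org; Mon, 1 Jan 2024 10:00:01 +0000\n"
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Subject: test\n"
    "\n"
    "Hello.\n"
)


def relay_path(raw: str) -> list[str]:
    """List the hops in travel order, dropping the timestamp after the semicolon."""
    msg = message_from_string(raw)
    hops = msg.get_all("Received", [])
    # The newest header is first in the file, so reverse to follow the message.
    return [hop.split(";")[0].strip() for hop in reversed(hops)]


path = relay_path(RAW)
for hop in path:
    print(hop)
```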
biff is a mail notification program named after a
BSD
developer's dog.
(It's not true that Biff used to bark at the mailman;
that's just a myth.)
Use the arguments y
or n
to enable or disable these notifications.
biff is annoying but if you have a
GUI then
xbiff is useful.
For TUI
(or CLI)
use the mail notification feature of the shell (most shells
have such a feature), which only notifies
you between running commands and not in the middle of some task
the way biff does.
The MTAs work in a store and forward manner. An MTA receives an email message and stores it on a (local) disk. Then the message is relayed to another MTA or handed to the local MDA for delivery. Various problems can cause messages to not get delivered. A message returned to the sender for such a reason is usually known as a bounced message.
Mailer-daemon is the usual name of an MTA or MDA when it generates error email messages to return to the sender. Common causes include: a bad user-name (the destination MTA will bounce the email, which means return it to the sender with an explanation as to what happened); a bad hostname (the sender's MTA will bounce it); and the destination MTA being down (the sender's MTA, or actually the MTA immediately upstream of the destination, will try for a while and then send a warning; if the destination server never responds then the MTA will eventually give up and bounce the email).
The different components in the mail system must communicate with each other, passing the mail messages and other information. The rules of communication and the definition of message formats are called protocols. There are a number of standard protocols defined so that different vendors' software can interoperate easily (as well as the different components of an email system). The most important protocols are:
SMTP
(Simple Mail Transfer Protocol) —
ESMTP
is the modern enhanced version.
(A good mnemonic for SMTP is Send Mail To People.)
MUAs use this protocol to talk with MTAs;
MTAs use it to talk to each other.
Interestingly this protocol is designed to be used interactively
by humans!
ESMTP is defined by
RFC-5321
and RFC-5322.
(It was originally defined by RFC-821 and
RFC-822, which were replaced with RFC-2821
and RFC-2822.
It is still common to talk about 822
email.)
POP (Post Office Protocol) — Sometimes called POP3 (since POP is such a popular acronym adding the version number makes the name stand out), this protocol is used between an MUA and an MAA. POP is popular with ISPs because it is simple and cheap to implement. It allows you to send a username and password, then your entire mailbox is downloaded to your MUA. The only option is whether to delete the mailbox contents from the server after downloading, or not. POP is defined by RFC-1939.
POP3 assigns each mail message a unique ID
called the UIDL,
so it can tell which messages have been downloaded already.
In addition some vendors have implemented an extension to
POP3 called
XTND XMIT,
that allows clients to transmit outbound mail.
(Normally SMTP is used for that.)
IMAP (Internet Message Access Protocol) — Like POP, IMAP is a protocol used between an MUA and an MAA. IMAP is more powerful and flexible than POP but takes more resources so ISPs rarely offer it. IMAP allows for selective message downloading and deleting, downloading of headers only, multiple mailboxes, and more. IMAP is defined by RFC-3501.
Some vendors use proprietary protocols to talk between their
proprietary mail servers and MUAs.
Examples include Novell's GroupWise,
IBM's
Lotus Notes,
and Microsoft's Exchange.
Some companies have reverse-engineered these protocols and
claim to have compatible products but it is not a good idea
to rely on those.
An email gateway
is an MTA that translates
between standard email protocols and some
proprietary system's protocols.
One point to note about all these protocols is that the usernames and passwords are sent in plain text. Variations of all three (ESMTP, POPS and IMAPS) allow for encryption to protect usernames and passwords (and the contents of the email messages). However few ISPs support these protocols since they require more resources (and hence are more expensive to implement).
In the old days email was plain ASCII text, which takes only seven bits of each byte. Much of the early Internet dropped the 8th bit of every byte to gain a 12.5% speedup. Naturally this won't work with binary files such as GIFs, binary data files, or programs. So these needed to be encoded (in essence adding a zero bit after every seventh bit) when sent, and decoded by the recipient. Here's how this was done:
uuencode filename filename > file.uu
mail recipient < file.uu
The encoded file is copied into the body of the outgoing email. Once delivered the recipient would have to save the body, edit it to remove all but the encoded file, and decode the file:
mail
... save received email body in file.uu ...
uudecode file.uu
What a pain! Besides this problem, plain text email is... plain-looking. Early business adopters of email wanted better looking email with features such as bold, italics, underline, justified text, and color.
MIME (Multipurpose Internet Mail Extensions) is a protocol that MUAs use to provide the styled text demanded by business users of early email. Today that isn't important (as we now use HTML for email with styles, graphics, and fancy formatting). But most importantly MIME supports multi-part email messages. This is when the body of the message is split into several parts, separated by a MIME separator string, where each part contains its own headers and is automatically encoded and decoded. Each of these parts is known as an attachment. Note that MIME is invisible to MTAs, MDAs, and MAAs, which only see a single message body with some weird stuff in it. (Virus scanners do know about attachments, of course.)
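A multi-part message can be built with Python's standard email package, which gives a feel for what an MUA does behind the scenes. The addresses, the file name tiny.png, and the few bytes standing in for an image are all made up for this sketch.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Holiday picture"
msg.set_content("The picture is attached.\n")

# Placeholder bytes, not a real image; any binary data is handled the same way.
fake_image = b"\x89PNG\r\n\x1a\n"
msg.add_attachment(fake_image, maintype="image", subtype="png", filename="tiny.png")

part_types = [part.get_content_type() for part in msg.iter_parts()]
attachment = next(msg.iter_attachments())

print(msg.get_content_type())                   # the message as a whole
print(part_types)                               # one entry per part
print(attachment["Content-Transfer-Encoding"])  # how the binary part is encoded
```

Printing msg.as_string() shows the separator string between the parts and the per-part headers described above.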
Today much of the Internet uses all eight bits of a byte, but not all of it so encoding is still used. MIME uses a technique known as Base-64 encoding (RFC-4648). MIME is defined by RFC-2045.
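As a small illustration of why Base-64 survives a seven-bit channel, the sketch below encodes a few bytes that have the high bit set and checks that every byte of the encoded form fits in seven bits. The sample bytes are arbitrary.

```python
import base64

raw = bytes([0xFF, 0x00, 0x80, 0x7F])    # arbitrary binary data, some bytes >= 128
encoded = base64.b64encode(raw)          # every 3 input bytes become 4 ASCII characters
decoded = base64.b64decode(encoded)

seven_bit_safe = all(byte < 0x80 for byte in encoded)
print(encoded.decode("ascii"), seven_bit_safe, decoded == raw)
```

The cost is size: the encoded form is about one third larger than the original data.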
View a sample email message that uses MIME attachments.
Many old command line MUAs exist, the most popular of
those is mailx.
However old mailx doesn't know about MAAs
or MIME.
An updated compatible MUA nail is
available (nail = new mail?).
nail is often installed under the name mailx
on Linux and some Unix systems.
These older MUAs are still valuable because some of
them (mailx but not alpine) can be used
non-interactively to send mail from a shell script, and because
system administrators often use command line access to
Unix/Linux servers via
SSH
and may need to read or send mail directly from that server.
The use of an MUA such as alpine
should be easy to learn since it is menu-driven.
alpine
is the current version of the
pine
MUA.
(The developers wanted to change the license before continuing
development, and they couldn't with the old name.)
In alpine, at any point you can examine the menus
to see what you can do.
These are context-sensitive menus.
For instance hitting ^J (control+J) when
a header field is highlighted means to add an attachment;
if the message body is highlighted this means to
justify the text.
In the message body use ^R to read a file
and paste its contents into the mail body.
Some spam is easy to spot ("Buy cheap Real Rolex watches!" or
"Hot! Hot! Hot! pix!") while other spam is harder to spot
("Meeting time changed" or
"I have a question about your site"). Non-spam email is called ham.
One type of nasty spam that is popular is a message that appears to be from someone or some organization you know. The attacker tries to trick you into clicking a link (in the email body) that will either run nasty software on your computer or take you to a fake website that looks real, to collect your personal information. This is known as phishing and can be very hard to detect. (Avoid clicking on links in email; use a browser's bookmark/favorite instead to be sure you're going to the correct site.)
In the U.S. and a few other countries it is legal
for users to encrypt their emails to protect their
privacy.
(See below for details on how to do this.)
However recent legislation (such as the Sarbanes-Oxley Act,
or SOX) may require that any organization that
allows encrypted email must be able to recover
the private key to be able to turn over emails when legally
requested to do so by the proper authorities.
In other parts of the world it is illegal to encrypt email.
A related issue is censorship of emails by employers,
ISPs, or governments.
Today there is wide-spread censoring or delaying of some
Internet traffic by many ISPs.
Some ISPs have a premium
level of service where they promise not to delay or block
your Internet traffic if you pay extra.
(To me this seems a kind of protection racket:
"Nice email you got there buddy!
If you pay I can make sure nothing bad will happen to it.")
Another meaning for mail-bomb is an email that, just by reading it, will install malicious software on your computer or do other terrible things. This is a half-truth. Since plain email is just text, reading it with a basic MUA can never hurt your computer. Unfortunately some advanced MUAs (including reading email with a web browser) will accept instructions embedded in the email (perhaps using MIME) which can be abused. Personally I turn off HTML and JavaScript in my MUA, as well as any auto-loading of documents in attachments. Then, unless I download and run something that I found in an email, I am fairly safe from this threat.
A list of e-mail addresses identified by a single name
such as students@hccfl.edu is called
a mailing list.
When an e-mail message is sent to the mailing list name
it gets sent to all the addresses in the list.
Each list also has a special email address for configuring
your use of the list (for example, add or remove yourself
from the list) and another address for the (human) owner of
the list.
A good resource for learning about mailing lists is
Understanding Mailing Lists
by Harley Hahn
(the author of our textbook).
Users can configure the procmail MDA
to manage mail logs, to spam check, to filter email, to
automatically reply to some mail, etc.
See the man pages for procmailrc and
procmailex.
The file
~wpollock/.procmailrc
contains examples of MDA return-receipts and spam
filtering using regular expressions and
spamassassin
(www.SpamAssassin.org).
While you have no guarantee of privacy with your email, you are allowed (in the U.S. anyway) to protect your email by encrypting the message. Such a message can't be read or tampered with by unauthorized parties.
The older technology for encryption is called symmetric (or shared) key: you and I share a key (password). Qu: how do we do that securely? Qu: what about doing business say with Amazon.com using this?
The old method is efficient but the problems are too difficult to make this technology useful on a wide scale. A newer technology is called public key encryption. With this method a pair of keys is made for each party. One is kept secret (the private key) and one is published (in email messages, in flyers, on web sites, on key servers, etc.) called the public key. To send a message to you I encrypt the message with your public key. Only you can decrypt it since this requires your private key and only you have it.
You reply to me by encrypting your reply with my public key (which only I can decrypt, using my private key). As you can see, four keys are used altogether. Public key encryption is the technology behind secure web sites that we all rely on (the web sites using the HTTPS protocol).
Public key encryption is much, much slower than symmetric key encryption. To make this technology practical, rather than encrypt a lengthy email message (body) a very large random number is generated by the sender. This number is used as a symmetric key and the message is encrypted with it. Only this session key gets encrypted using the public key method.
A digital signature is created for a message by encrypting it with the sender's private key. This doesn't protect the confidentiality of the message since anyone with the sender's public key can decode the message. (If privacy is also desired the sender encrypts this encrypted message with the recipient's public key.) If the sender's public key decodes the message correctly, it is strong proof that their private key was used to encrypt it in the first place. Since only the sender has their private key, only the sender could have sent the message. Note this encryption with the sender's private key is a digital signature; a GIF graphic of a hand-written signature is not!
The U.S. federal government now treats digital signatures just as binding as a hand-written (or holographic) signature. See the Electronic Signatures in Global and National Commerce (ESIGN) Act passed in June of 2000, for details. Also many state governments have passed laws treating digitally signed emails as equivalent to documents signed holographically (by a person).
In practice this takes too long (even on modern computers) so a checksum (or digest or hash) of the message is encrypted with the private key instead, and this is appended to the (unencrypted) message. The recipient also computes a message digest of the email, then decrypts the message digest sent with the message body and compares the two. If the two digests differ the message was altered or forged.
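The digest-then-sign idea can be sketched with a toy RSA key pair made from two tiny primes. This is for illustration only and offers no security whatsoever: the primes, the exponent, and the function names are assumptions chosen to keep the arithmetic visible, and the digest is reduced modulo the tiny key so that it fits. Real tools such as gpg use very large keys and proper padding.

```python
import hashlib

# Toy key pair (NOT secure): two small primes and a public exponent.
P, Q = 61, 53
N = P * Q                              # modulus, part of both keys
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message, then shrink the hash to fit under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Encrypt the digest with the private key; the result is the signature."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare the two digests."""
    return pow(signature, E, N) == digest(message)


message = b"Meet me at noon."
signature = sign(message)
print(verify(message, signature))
```

The message itself travels unencrypted; only the short digest is processed with the slow public key arithmetic.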
Because the private keys used to encrypt email must often be made available to organizations (to comply with laws), separate sets of keys are often used for digital signatures and for email encryption.
Issue of Trust: A public key can be digitally signed by a trusted third party (such as VeriSign). This third party has a well-known public key. Most web browsers and email clients come with a built-in list of such well-known public keys.
People can use
PGP/GPG
(Pretty Good Privacy, originally written by Phil Zimmermann, and
GNU Privacy Guard, a free implementation of the same OpenPGP
standard) to encrypt, decrypt, compute
message digests, and to digitally sign messages.
PGP was written first; GPG came later as a
compatible replacement.
GPG is sometimes called
GnuPG.
(See the man page for gpg for details.)
GPG provides easy email integration with modern MUAs. But to use this technology people must generate a pair of keys and publish their public key. (You also need a third party to sign your key so others will trust it.) These complexities have hampered the widespread adoption of encryption and digital signatures.
The secure web sites with URLs such as
HTTPS://www.example.com/
work by exchanging public keys between
server and browser, which then verifies these by using the
trusted third party's public key to validate the key.
Next the browser and web server exchange a session key
(a big random number encrypted with the public keys)
to use for symmetric key encryption for the rest of that session.
(This is an over-simplification of what really happens but
should provide some idea of the process.)
The original mail program was mail.
The modern version of this is called mailx.
These tools are still useful to allow mail to be sent
from a shell script.
The standard mailbox location is /var/mail/username.
This can be changed by the MDA.
The MUA must also know this location;
the MAIL environment variable is often used for this.
Each message in a mailbox begins with From user date.
Standard headers include From:,
To:, Cc:, Subject:,
etc.
(However there is no Bcc: header.)
Email addresses look like username@hostname. The
@hostname part is optional; localhost is assumed.
Each MTA adds a Received: header to the front of
the email.
Mail notification is provided by biff, xbiff
(or similar GUI program), or by the shell's
email notification feature (which relies on the MAIL
environment variable being set correctly).
To send binary files the uuencode and uudecode utilities
were common; now MIME's Base-64 encoding
is used.
alpine has an easy to use menu-driven interface.
Encryption and digital signatures can be done with gpg.