Internet Draft
Category-to-be: Informational
Matt Curtin
The Ohio State University
Jamie Zawinski
Netscape Communications

June 1998
Expires: December 1998

Recommendations for generating Message IDs

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow Directories on (Africa), (Europe), (Pacific Rim), (US East Coast), or (US West Coast).


Abstract

This draft provides recommendations on how to generate globally unique Message IDs in client software.

1. Introduction

Message-ID headers uniquely identify Internet messages. A unique identifier for each message has many benefits, including easier following of threads and intelligent scoring of messages based on the threads to which they belong.

It has been suggested that client software cannot generate globally unique Message-IDs. We believe this to be incorrect, and herein offer suggestions for generating unique Message-IDs.

2. Message-ID formatting

As defined in [NEWS], a message ID consists of two parts, a local part and a domain, separated by an at-sign and enclosed in angle brackets:

    <local-part@domain>

Practically, news message IDs are a restricted subset of mail message IDs. In particular, no existing news software copes properly with mail quoting conventions within the local part, so software generating a Message-ID would be well-advised to avoid this pitfall.

Note also that some buggy software treats message IDs as completely case-insensitive, in violation of the standards. Generators are therefore advised never to produce two IDs that differ only in character case.
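
The two cautions above can be checked mechanically. The following Python sketch is illustrative only: the pattern is deliberately stricter than the standards require, the permitted character set is our assumption, and the function names is_conservative_message_id and case_collisions are hypothetical.

```python
import re

# Assumption: a conservative local-part alphabet with no quoting,
# whitespace, or backslashes. This is stricter than the standards.
_LOCAL = r"[A-Za-z0-9][A-Za-z0-9.$_+-]*"
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_DOMAIN = rf"{_LABEL}(?:\.{_LABEL})+"
_MESSAGE_ID = re.compile(rf"<({_LOCAL})@({_DOMAIN})>")


def is_conservative_message_id(message_id: str) -> bool:
    """True if the ID is <local-part@domain> using only the safe alphabet."""
    return _MESSAGE_ID.fullmatch(message_id) is not None


def case_collisions(message_ids):
    """Return groups of distinct IDs that differ only in character case."""
    groups = {}
    for mid in message_ids:
        groups.setdefault(mid.lower(), set()).add(mid)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

A generator that emits only one letter case in the local part can never trip the second check.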

3. Message-ID generation

The most popular method of generating local parts is to use the date and time, plus some way of distinguishing between simultaneous postings on the same host (e.g. a process number), and encode them in a suitably restricted alphabet. An older but now less popular alternative is to use a sequence number, incremented each time the host generates a new message ID; this is workable, but requires careful design to cope properly with simultaneous posting attempts, and is not as robust in the presence of crashes and other malfunctions.
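
As a minimal sketch of the date-and-time method, the Python below joins a millisecond timestamp and the process number, each encoded in base 36. The choice of base 36, the "." separator, the millisecond resolution, and the names to_base36 and time_pid_local_part are our assumptions, not requirements.

```python
import os
import time

# Digits and lowercase letters only, so no two outputs differ only in case.
_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n: int) -> str:
    """Encode a non-negative integer using the restricted alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(_ALPHABET[remainder])
    return "".join(reversed(digits))


def time_pid_local_part(now_ms=None, pid=None) -> str:
    """Build a local part from the wall-clock time and the process number."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # milliseconds since the epoch
    if pid is None:
        pid = os.getpid()
    return f"{to_base36(now_ms)}.{to_base36(pid)}"
```

Note that a single process generating two IDs within the same millisecond would repeat itself under this sketch; a per-process counter or a random component would be needed to distinguish them.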

On many client systems, it is not always possible to get the fully-qualified domain name (FQDN) of the local host. In that situation, a reasonable fallback is to use the domain part of the user's return address. Doing so makes the "distinguishing number" more important, because many hosts may then share the same domain; in particular, a process ID is probably not sufficient.

An alternative for generating the distinguishing number, on systems where the process ID isn't available, or in the case where the local host's FQDN isn't known, is to generate a large random number from a high-quality, well-seeded pseudorandom number generator. (Note that the RNGs shipped by many vendors are not high quality.)
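
The fallbacks described above might be sketched in Python as follows. We assume Python's standard secrets module (which draws from the operating system's random source) as the generator, 64 bits as a "large" number, and a return address supplied as a bare user@domain string; the function names are hypothetical.

```python
import secrets


def random_distinguisher(bits: int = 64) -> int:
    """Return a random non-negative integer below 2**bits.

    Assumption: 64 bits is large enough; adjust to taste.
    """
    return secrets.randbits(bits)


def domain_from_return_address(address: str) -> str:
    """Return the domain part of a bare user@domain return address."""
    local, separator, domain = address.strip().rpartition("@")
    if not separator or not local or not domain:
        raise ValueError("expected an address of the form user@domain")
    # Lowercase so that IDs built from it never differ only in case.
    return domain.lower()
```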

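In summary, one possible approach to generating a Message-ID would be:

   o  Get the current wall-clock time at the highest resolution available.

   o  Generate a large random number from a good, well-seeded random number generator.

   o  Encode both numbers in a suitably restricted alphabet and join them with a separator to form the local part.

   o  Append an at-sign, followed by the FQDN of the local host or the domain part of the user's return address.

   o  Enclose the result in angle brackets.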

If the random number generator is good, this will reduce the odds of a collision of message IDs to well below the odds that a cosmic ray will cause the computer to miscompute a result. That means that it's good enough.
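
To give a rough sense of scale, the standard birthday-bound approximation can be evaluated directly. The Python sketch below assumes the random component has 64 uniformly distributed bits and that all of the IDs being compared share the same timestamp and domain, which is the worst case; both figures are our assumptions.

```python
import math


def collision_probability(n_ids: int, random_bits: int = 64) -> float:
    """Approximate chance that any two of n_ids random values coincide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** random_bits
    exponent = n_ids * (n_ids - 1) / (2 * space)
    return -math.expm1(-exponent)  # accurate even for tiny exponents
```

Under these assumptions, a thousand IDs generated in the same clock tick for the same domain collide with probability on the order of 1e-14.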

There are many other approaches. This is provided only as an example.
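
One way to sketch the approach described in this section is shown below in Python: a timestamp and a large random number, encoded in a restricted alphabet and followed by a domain. The base-36 encoding, the "." separator, the 64-bit random component, and the name make_message_id are illustrative assumptions.

```python
import secrets
import time

# Digits and lowercase letters only, so IDs never differ only in case.
_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def _base36(n: int) -> str:
    """Encode a non-negative integer using the restricted alphabet."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(_ALPHABET[remainder])
    return "".join(reversed(digits))


def make_message_id(domain: str, now_ms=None, random_value=None) -> str:
    """Return <time.random@domain> for the given host or return-address domain."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # milliseconds since the epoch
    if random_value is None:
        random_value = secrets.randbits(64)  # assumption: 64 random bits
    return f"<{_base36(now_ms)}.{_base36(random_value)}@{domain}>"
```

For instance, make_message_id("example.com") yields an ID whose local part changes on every call, while the optional arguments allow fixed values to be supplied for testing.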

4. Acknowledgments

This document is partially derived from an earlier, unrelated draft by Henry Spencer.

5. References

[NEWS] M.R. Horton, R. Adams, "Standard for interchange of USENET messages", RFC 1036, December 1987. IETF status (June 1998): non-standard, but still widely used as a de facto standard.

6. Authors' addresses

Matt Curtin
The Ohio State University
791 Dreese Laboratories
2015 Neil Ave
Columbus OH 43210
+1 614 292 7352

Jamie Zawinski
Netscape Communications Corporation
501 East Middlefield Road
Mountain View, CA 94043
(650) 937-2620
