DraftRepositoryDirectories

Robert Sparks edited this page Jun 8, 2023 · 12 revisions

This page documents implementation details that are fluid. Treat its contents with suspicion: they may have fallen out of sync with reality.

Internet-Drafts are stored on the IETF server in several directories.

The Internet-Draft Repository

/a/ietfdata/doc/draft/repository/:: This is the active ID repository. The ID submission tool places drafts in this directory, and a cron job moves drafts out of it when they expire.

The Internet-Draft Archive

Files that have always been controlled through official processes

/a/ietfdata/doc/draft/archive/:: This is the ID archive.

In normal circumstances, this directory is the only archive directory that regularly receives new files. The one exception, unknown-ids, is documented below. Files are moved into this directory by four different mechanisms:

  • As drafts expire, a cron job running the expire-ids script moves them into this directory from the repository above.
  • When an active draft is marked as replaced by another draft by the secretariat toolset, it is moved into this directory as part of the replacement update.
  • When a new revision of an active draft is added with the secretariat toolset, the previous version is moved into this directory.
  • When a draft is marked as published as RFC by the secretariat toolset, the corresponding draft is moved into this directory.
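All four mechanisms above reduce to the same file operation: moving the files for one draft revision from the repository into the archive. The following is a minimal sketch of that operation, not the datatracker's or the secretariat toolset's actual code. The function name `move_to_archive`, the `name-rev.ext` filename layout, and the refusal to overwrite an existing archive file are assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def move_to_archive(draft_name: str, rev: str, repository: Path, archive: Path) -> List[Path]:
    """Move every file of one draft revision (any extension) into the archive.

    Hypothetical helper: assumes files are named "<draft_name>-<rev>.<ext>".
    """
    moved = []
    for src in sorted(repository.glob(f"{draft_name}-{rev}.*")):
        dest = archive / src.name
        if dest.exists():
            # Assumption: never silently replace a file already in the archive.
            raise FileExistsError(f"{dest} already exists in the archive")
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

A caller would pass the two directories explicitly, for example `move_to_archive("draft-example-foo", "00", Path("/a/ietfdata/doc/draft/repository"), Path("/a/ietfdata/doc/draft/archive"))`. Other revisions of the same draft stay in the repository.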

Additionally, there are other mechanisms that should be in place but aren't (TODO: verify; these may actually be performed now):

  • When a new revision of an active draft is submitted, the previous version should be moved into this directory as part of the submission process.
  • When a draft is marked as replaced by another draft through the draft's document view, in doc.views_draft.replaces(), the replaced draft should be moved to this directory.

There are several subdirectories in this directory. They each have a fixed set of files, described below, that augment the archive. It is expected that these subdirectories would never have new files added to them.

Files in this directory and its subdirectories have always been controlled through our official processes.

/a/ietfdata/doc/draft/collection/internet-drafts/:: This is a snapshot of a previous toolchain's copy of the active ID repository. It contains 5544 files with dates ranging from Apr 26, 2000 to Oct 28, 2005.

/a/ietfdata/doc/draft/collection/unknown-ids/:: This directory contains files that were found in the repository but for which the processing toolchain had no record of the associated draft. It is primarily a historic snapshot, but it will still receive new files if a file somehow ends up in the repository with an out-of-sync version or with a name that does not match any existing Document object.
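The rule for what lands in unknown-ids can be sketched as a filename check. This is an illustration only, not the grooming code itself. It assumes filenames of the form `draft-<name>-<NN>.<ext>`, and it reads "out-of-sync version" as "the revision in the filename differs from the recorded current revision". The names `parse_draft_filename`, `is_unknown`, and the `known_revs` mapping are hypothetical stand-ins for a lookup against Document objects.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed filename layout: draft-<name>-<two-digit revision>.<extension>
DRAFT_FILE_RE = re.compile(r"^(?P<name>draft-[a-z0-9-]+)-(?P<rev>\d{2})\.(?P<ext>[a-z0-9]+)$")


def parse_draft_filename(filename: str) -> Optional[Tuple[str, str]]:
    """Split a filename into (draft name, revision), or return None if it does not parse."""
    match = DRAFT_FILE_RE.match(filename)
    if match is None:
        return None
    return match.group("name"), match.group("rev")


def is_unknown(filename: str, known_revs: Dict[str, str]) -> bool:
    """Return True if the file would belong in unknown-ids under the assumptions above.

    known_revs maps each known draft name to its current revision; it is a
    hypothetical stand-in for querying Document objects.
    """
    parsed = parse_draft_filename(filename)
    if parsed is None:
        return True  # name does not look like a draft at all
    name, rev = parsed
    # Unknown name, or a revision that disagrees with the recorded one.
    return known_revs.get(name) != rev
```

For example, with `known_revs = {"draft-example-foo": "02"}`, the file `draft-example-foo-02.txt` is recognized, while `draft-example-foo-03.txt` and `draft-example-bar-00.txt` would both be treated as unknown.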

Bugs in the code that grooms the repository have been regularly placing files (particularly xml files) in this directory; those bugs are now being removed. As of this writing, it contains 43288 files. The oldest is from Mar 1, 2000. The most recent .txt file is from Feb 5, 2014. There are many draft xml files here, dated up to Jun 12, 2014. As the bug-fix work progresses, many of these will be moved to the ID archive proper, where they should have been placed, and these stats will be updated. See, for example, changeset:7922.

/a/ietfdata/doc/draft/collection/expired_without_tombstone:: This is a snapshot of a previous toolchain's directory of drafts that expired without being replaced by a tombstone. It contains 4158 drafts with dates ranging from Oct 29, 2002 to Aug 30, 2013.

Files that have been out of our official process control

/a/ietfdata/doc/draft/collection/tool-id-archive:: This is a static snapshot of the archive from the volunteer-maintained tools server (primarily curated by Henrik Levkowetz) with files ranging from Sep 10, 1992 to Jan 9, 2012. It contains more than drafts; files like all_id2.txt appear as well.

/a/ietfdata/doc/draft/collection/lou-berger-archive:: A collection from Lou Berger accepted by the IESG with files ranging from May 5, 1997 to Aug 4, 2002.

While confidence in the validity of these files is very high, files in these directories have NOT always been controlled through our official processes.

New subdirectories may be added in the future if new fixed sets of files are found to augment the archive further.

Accessing the Archive

Tools placing drafts in the archive should only add files to /a/ietfdata/doc/draft/collection/internet-drafts.

Tools reading from the archive should never access the directories in /collection/ directly. Instead, they should use the combined view at /a/ietfdata/doc/draft/archive. This view is maintained by ghostlinkd (see /usr/local/sbin/ghostlinkd, /etc/ghostlinkd.conf, and https://github.com/ietf-tools/ghostlinkd) and is exposed over HTTPS at https://www.ietf.org/archive/id
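A reading tool can keep to this rule by deriving every location from the archive view rather than from a /collection/ path. The sketch below is illustrative: the helper name `archive_locations` is hypothetical, and it assumes that files sit directly under the view directory and directly under the HTTPS URL, with no per-draft subdirectories.

```python
from pathlib import PurePosixPath
from typing import Tuple

ARCHIVE_VIEW = PurePosixPath("/a/ietfdata/doc/draft/archive")
ARCHIVE_URL = "https://www.ietf.org/archive/id"


def archive_locations(filename: str) -> Tuple[str, str]:
    """Return (filesystem path, URL) for a file in the archive view.

    Assumption: the view is flat, so a bare filename is all that is needed.
    """
    if "/" in filename or filename in ("", ".", ".."):
        # Reject anything that could escape the view or point into /collection/.
        raise ValueError(f"not a bare filename: {filename!r}")
    return str(ARCHIVE_VIEW / filename), f"{ARCHIVE_URL}/{filename}"
```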

NOTE: To allow the archives and applications using them to scale, these will be moving to a blob store with tooling providing access, at which point ghostlinkd (in its current form) won't be the mechanism that constructs the archive from the collections.

A file may appear in more than one of the /collection/ directories above, and files with the same name in different directories may even have different contents. For a given filename, the combined view presents the copy from the earliest directory, in the order listed above, that contains it.
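The precedence rule amounts to a first-match search over an ordered list of directories. The sketch below illustrates that rule only; it is not how ghostlinkd is implemented, and the function name `resolve_in_collections` is hypothetical. The caller supplies the directories in the order they are listed on this page.

```python
from pathlib import Path
from typing import Optional, Sequence


def resolve_in_collections(filename: str, collections: Sequence[Path]) -> Optional[Path]:
    """Return the copy of filename from the earliest directory that has one.

    collections must already be in precedence order; later directories are
    consulted only if no earlier directory contains the file.
    """
    for directory in collections:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Because the search stops at the first hit, a copy with different contents in a later directory is never presented, which matches the behavior described above.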