FutureKeep

Extensible Universal Data Media Archiving Standard

Draft outline notes

FutureKeep (tentative name) is a standard being devised to provide a mechanism for imaging any form of media used to store data. The media in mind include (but are not limited to) punch cards, paper tape, magnetic tape, magnetic disk, and ROM. The images are stored as plaintext structured using XML. Each image will be self-contained, including not only the data being archived but also its meta data (i.e. identifying information). The standard will allow for any level of imaging of the original media, from the highest level (a single stream of data) down to the lowest levels of the original media, including the ability to represent physical encoding features of the source media. See the specification notes below for more information.
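As a rough illustration of what "self-contained" could mean in practice, the Python sketch below builds a single plaintext XML image that carries both identifying information and the archived data, then reads the data back out. None of this comes from the specification, which has not yet been written: the tag names (IMAGE, META, TITLE, MEDIA, DATA), the ENCODING attribute, and the choice of hexadecimal for the payload are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_image(title: str, media_type: str, payload: bytes) -> str:
    """Return one XML document holding both meta data and archived data."""
    root = ET.Element("IMAGE")                 # hypothetical root tag
    meta = ET.SubElement(root, "META")         # hypothetical: identifying information
    ET.SubElement(meta, "TITLE").text = title
    ET.SubElement(meta, "MEDIA").text = media_type
    # Assumption: the payload is stored as uppercase hex so the image stays plaintext.
    data = ET.SubElement(root, "DATA", {"ENCODING": "HEX"})
    data.text = payload.hex().upper()
    return ET.tostring(root, encoding="unicode")


def read_payload(image_text: str) -> bytes:
    """Recover the archived bytes from an image produced by build_image."""
    root = ET.fromstring(image_text)
    hex_text = root.find("DATA").text or ""    # an empty payload parses as None
    return bytes.fromhex(hex_text)


if __name__ == "__main__":
    image = build_image("Demo tape", "paper tape", b"\x01\xff")
    print(image)
    print(read_payload(image))
```

This shows only the highest level of imaging (a single stream of data). Lower levels, such as tracks, sectors, or physical encoding features, would need additional structure that the outline has not yet defined.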

This webpage is just a temporary place to publish these draft outline notes until a more formal website running a content management system can be set up.

Recent Updates:

The End of Software: Contemplating a Standardized Software Preservation Methodology (white paper by Sellam Ismail of VintageTech)

Basic features:

  1. Well Documented
  2. Universal (not constrained to any particular hardware)
  3. All inclusive - inclusive of all manner of physical recording media
  4. Ease of Implementation - implementable on even the simplest architectures
  5. Unencumbered by license - Open source, public domain, etc.
  6. Extensible - Adaptable, expandable, revisable (for future extensions)
  7. Character-based - Text-based and stored in a commonly accessible character set
  8. Multi-level - Allow for the representation of media in logical or physical form

Notes for Basic Features:

  1. Should also allow for logical representation of data on a physical medium
  2. The original media source will in many cases have to be read on the original hardware, which will be older and therefore possibly more difficult to operate or maintain.
  3. A copyright on the format may be held to prevent unauthorized extension or pollution/adulteration of the standard. In addition, the standard should not incorporate any copyrighted or patented schemes or algorithms.
  4. Adaptable, expandable, revisable (for future extensions)
  5. A suitable subset of Unicode, e.g. ASCII or UTF-8, should be specified for universality. Tag characters should be limited to a defined subset, e.g. A-Z, 0-9, - (hyphen), = (equals), the period, the comma, and the space character (subject to study).
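The restricted tag character set in the last note is easy to check mechanically. The Python sketch below assumes the candidate subset is exactly A-Z, 0-9, hyphen, equals, period, comma, and space; since that list is still subject to study, the pattern is a placeholder rather than a rule of the standard, and the function name is_valid_tag is made up for the example.

```python
import re

# Assumed subset from the note above: A-Z, 0-9, hyphen, equals, period, comma, space.
# The final set is still "subject to study", so treat this pattern as a placeholder.
TAG_PATTERN = re.compile(r"[A-Z0-9\-=., ]+")


def is_valid_tag(tag: str) -> bool:
    """Return True if every character of a non-empty tag is in the assumed subset."""
    return TAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ("TRACK-0 SIDE=1", "track", "A<B"):
        print(repr(candidate), is_valid_tag(candidate))
```

Keeping the set this small means a validator like this can be written on very simple hardware, which fits the Ease of Implementation goal.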

Documents Required:

RFP
Request For Proposal document to introduce the specification to the world.
Specification
This will be the actual specification definition.
Best Practices
A "Best Practices" document needs to be written that explains the best way to create an archive.

Meta Data

Media Scope

Data Scope

Archive

Transcoding

Markup Tags

Data Encoding

Interface

Miscellaneous