QuickTime as a Tape Archival Format

On the SIMH group, Al Kossow and others have been discussing how .tap is a terrible archival container format that also has a bunch of problems for use in emulation and simulation of systems. This is a problem I’ve been thinking about for a while since I hired Miëtek to implement SCSI tape support in MAME including the .tap format, and I had a sudden realization: There’s already a great format for representing sequential media, QuickTime!

A lot of people think QuickTime is a “video format,” but that’s not really accurate. Video and audio playback are applications atop the QuickTime container format; the container format itself is a means of representing multiple typed tracks of time-based media, each of which may have their own representation in the form of samples interpreted according to their own CODECs.

QuickTime Media Structure at a High Level

As an example, a QuickTime file containing a video with associated stereo audio and subtitles may have three tracks, each with their own timebase:

  1. The video track, whose timebase is the number of frames per second, and whose track media is the CODEC metadata needed to decode its samples.
  2. The audio track for the two audio channels, whose timebase is the number of samples per second. Its track media will be similar to that of the video, specifying the CODEC to use for the audio samples to decode.
  3. The text track for the subtitles, whose timebase is probably derived from the video timebase, whose track media will specify things like the language and font of the subtitles, and whose samples consist of the text to present and the size, location, duration, and styling for that presentation.

All of these are represented within a file as atoms which represent well-identified bags of data with arbitrary size and content, making it very easy to write general-purpose tooling and also to extend over time. (The last major extension to the low-level design was in the 1990s, to support 64-bit atom sizes, so it’s quite a stable format already.)

Mapping QuickTime to Data Tape

Once you realize that the tracks themselves can be arbitrary, it starts to become clear how this format maps nicely to tape content: Since tapes themselves are linear, they’re fundamentally time-based.

The actual content of a tape isn’t a pure stream of raw data, it’s a set of blocks of raw data between magnetic flux marks, with some gaps between — and thanks to media decay, those blocks can be good or bad. Usually these marks are used to organize tapes into files, but that’s not a guarantee; for both archival and emulation, it’s best to stick to the low-level representation and let applications impose the higher-level semantics.

In this case, you’d have a “tape data” track whose track media describes the original medium (7-track, 9-track, etc.) and the interpretation of its samples. The samples themselves would be the marks and data blocks. And there’s even a native representation of tape gaps, in the form of non-contiguous samples.

The format can also be leveraged to support random access including writes, since the intelligence for that can be in the “CODEC” for the “tape” track media, combined with the QuickTime format’s existing support for non-destructive edits. New data can be overlaid based on its “temporal” position, which should more or less accurately simulate how a rewritten tape would actually work, while still preserving the data that was just overwritten.

Finally, QuickTime has a concept of “references” that can be used to implement things like tape files independent of (rather than inline with) the tape data itself. A catalog of block references, for example, could also be stored with the tape data’s track media to indicate the block extents for individual files on tape, thus allowing direct access by tooling without having to stream through the entire file.


Since QuickTime movie files are a moderately complex structure atop a simple base, it’s important to have a reasonable API to work with both the low-level atom structures as well as the higher-level constructs like tracks, track media, sample chunks and samples. Fortunately there already exists at least one Open Source library allowing this, QTFileLib from the Darwin Streaming Server that Apple made Open Source in 1999.

Darwin Streaming Server as a whole and its QTFileLib component are written in quite straightforward “C with Classes”-style C++, and QTFileLib has an API surface representing all of the major low-level and application-level concepts of the file format. As a side effect of the implementation of its read support, it also has a lot of the API necessary for creating and wiring together QuickTime data structures for creating files, just not support for writing it all out. Structurally that should be straightforward to add. It even looks straightforward to port to plain C, if that’s desired.


  1. Karlie Birmingham
    January 23, 2024

    While using QuickTime for this purpose is ingenious, is it safe to be using C and C++ code in 2024? We’re living in the Rust era now, after all. We can now use Rust instead of C and C++ if we want blazingly fast software without garbage collection. Your idea is a great one, but I think it must be done using Rust instead of C and C++. There’s just no reason to risk it these days. All new software should be written in Rust, and all existing software should be converted to Rust as soon as possible.

    1. admin
      January 26, 2024

      Maybe because Rust isn’t something everyone wants to switch all code everywhere to when there are already perfectly reasonable implementations, despite the best effort of the Rust Evangelism Strike Force. Please keep it to HN.

    2. professor-order
      January 30, 2024

      You mean brainless hipster C-hater era where Rust Garbage Spreader Freaks flood news columns and Gir repositories with useless code? Then, yes. Absolutely.

      1. admin
        January 30, 2024

        Nah, Rust isn’t *bad*, in fact it’s pretty good for writing new security-critical code. But it’s not the only way to achieve those aims, and not the best solution for every problem by far. After all, *that* was developed in the late 1950s by John McCarthy… 😉


Leave a Reply

Your email address will not be published. Required fields are marked *