February 2023 – Eschatology

Here’s another interesting thing I’ve learned about Clascal and Object Pascal: The language went through exactly the same evolution, from combining object allocation and initialization to separating them, that Objective-C did a decade later!

In early 1983 Clascal, classes were expected to implement a New method: a function taking zero or more parameters that returned an instance of the class by assigning to SELF. Sound familiar? This was always implemented as a “standard” method (one without dynamic dispatch) so you couldn’t call the wrong one. A cited advantage of this is that it would prevent use of the standard Pascal built-in New() within methods, which I suspect turned out not to be what people wanted, since it would prevent interoperability.

A class could also choose to implement an OVERRIDE of the also-included Free method to release any resources it had acquired, like file handles or other object instances. And each overridden Free method had to include SUPERSELF.Free; after it did so in order to ensure that its superclass would also release any resources it had acquired.

INTERFACE

  TYPE

    Object = SUBCLASS OF NIL
      FUNCTION New: Object; STANDARD;
      PROCEDURE Free; DEFAULT;
    END;

    Person = SUBCLASS OF Object
      name: String;
      FUNCTION New(theName: String): Person; STANDARD;
      PROCEDURE Free; OVERRIDE;
    END;

  VAR

    gHeap: Heap;

IMPLEMENTATION

  METHODS OF Object;

    FUNCTION New{: Object};
    BEGIN
      SELF := Object(HAllocate(gHeap, Size(THISCLASS)));
    END;

    PROCEDURE Free;
    BEGIN
      HFree(Handle(SELF));
    END;

  END;

  METHODS OF Person;

    FUNCTION New{(theName: String): Person};
    BEGIN
      SELF := Person(HAllocate(gHeap, Size(THISCLASS)));
      IF SELF <> NIL THEN
        name := theName.Clone;
    END;

    PROCEDURE Free;
    BEGIN
      name.Free;
      SUPERSELF.Free;
    END;

  END;

By mid-1984, Clascal changed this to the CREATE method, which was declared as ABSTRACT in the base class. Note that it still doesn’t use the standard Pascal built-in New() to create object instances. However, it takes a potentially-already-allocated object so that it’s easier for a subclass to call through to its superclass for initialization, since CREATE is still not a dynamically-dispatched method. Also, instead of referencing a global variable for a heap zone in which to perform allocation, it takes the heap zone as a parameter, providing some amount of locality-of-reference that may be helpful to the VM system.

There was also a change in style to prefix class names with T.

INTERFACE

  TYPE

    TObject = SUBCLASS OF NIL
      FUNCTION CREATE(object: TObject; heap: THeap): TObject; ABSTRACT;
      PROCEDURE Free; DEFAULT;
    END;

    TPerson = SUBCLASS OF TObject
      name: TString;
      FUNCTION CREATE(theName: TString; object: TObject; heap: THeap): TPerson; STANDARD;
      PROCEDURE Free; OVERRIDE;
    END;

IMPLEMENTATION

  METHODS OF TObject;

    PROCEDURE Free;
    BEGIN
      FreeObject(SELF);
    END;

  END;

  METHODS OF TPerson;

    FUNCTION CREATE{(theName: TString; object: TObject; heap: THeap): TPerson};
    BEGIN
      IF object = NIL THEN
        object := NewObject(heap, THISCLASS);
      SELF := TPerson(object);
      WITH SELF DO
        name := theName.Clone(heap);
    END;

    PROCEDURE Free;
    BEGIN
      name.Free;
      SUPERSELF.Free;
    END;

  END;

This is starting to look even more familiar to Objective-C developers, isn’t it?
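
The shape of that CREATE convention can be sketched in C, which unlike Clascal is easy to compile and run today. Each creation function accepts an object that may already be allocated, allocates only when given NULL, and a subclass passes its storage up to the superclass. Everything here (object_create, person_create, the allocation-counting Heap) is a hypothetical illustration under those assumptions, not the ToolKit’s API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a heap zone; it only counts allocations. */
typedef struct { int allocations; } Heap;

typedef struct { int initialized; } Object;
typedef struct { Object base; char name[32]; } Person; /* "subclass": base comes first */

void *heap_allocate(Heap *heap, size_t size) {
    void *block = calloc(1, size);
    if (block != NULL) heap->allocations++;
    return block;
}

/* Superclass creator: allocate only if the caller did not already do so. */
Object *object_create(Object *object, Heap *heap) {
    if (object == NULL) object = heap_allocate(heap, sizeof(Object));
    if (object != NULL) object->initialized = 1;
    return object;
}

/* Subclass creator: allocate storage of the subclass size, then call through. */
Person *person_create(const char *name, Object *object, Heap *heap) {
    if (object == NULL) {
        object = heap_allocate(heap, sizeof(Person));
        if (object == NULL) return NULL;
    }
    Person *self = (Person *)object_create(object, heap);
    strncpy(self->name, name, sizeof(self->name) - 1);
    self->name[sizeof(self->name) - 1] = '\0';
    return self;
}

/* Returns 1 if creating one Person performed exactly one allocation. */
int demo_create_allocates_once(void) {
    Heap heap = { 0 };
    Person *p = person_create("Ada", NULL, &heap);
    if (p == NULL) return 0;
    int ok = heap.allocations == 1 && p->base.initialized == 1
             && strcmp(p->name, "Ada") == 0;
    free(p);
    return ok;
}
```

Because person_create hands non-NULL storage to object_create, the superclass skips its own allocation and only initializes, which is the point of threading the object parameter through.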

The final form of the language, Object Pascal, actually backed off on the Smalltalk terminology a little bit and renamed “classes” to “objects” and went so far as to introduce an OBJECT keyword used for defining a class. It also changed SUPERSELF. to INHERITED—yes, with whitespace instead of a dot!—as, again, developers new to OOP found “superclass” confusing.

Object Pascal also, at long last, adopted the standard Pascal built-in New() to perform object allocation (along with its counterpart Free() for deallocation) directly instead of introducing a separate function for it, since the intent can be inferred by the compiler from the type system. It also removed the need to use the METHODS OF construct to add methods, instead just prefixing the method with the class name and a period.

The final major change from Clascal to Object Pascal is that, with New() used for object allocation, the CREATE methods were changed into initialization methods instead since they just initialize the object after its allocation. They were also made procedures rather than functions returning values, and since the standard Pascal built-in New() is being used they no longer take a potentially-already-allocated object nor do they take a heap zone in which to perform the allocation. The convention is that for a class TFoo the initialization method has the form IFoo.

There was also another stylistic change, prepending field names with f to make them easy to distinguish from zero-argument function methods at a glance.

Finally, method definitions in the IMPLEMENTATION section now include the parameter list directly, rather than leaving it out or relegating it to a comment.

Here’s what that looks like:

INTERFACE

  TYPE

    TObject = OBJECT
      PROCEDURE IObject; ABSTRACT;
      PROCEDURE Free; DEFAULT;
    END;

    TPerson = OBJECT(TObject)
      fName: TString;
      PROCEDURE IPerson(theName: TString); STANDARD;
      PROCEDURE Free; OVERRIDE;
    END;

IMPLEMENTATION

    PROCEDURE TObject.Free;
    BEGIN
      Free(SELF);
    END;

    PROCEDURE TPerson.IPerson(theName: TString);
    BEGIN
      fName := theName.Clone;
    END;

    PROCEDURE TPerson.Free;
    BEGIN
      fName.Free;
      INHERITED Free;
    END;

Based on the documentation I’ve read, it wouldn’t surprise me if the only reason initialization methods aren’t consistently named Initialize is that the language design didn’t support an OVERRIDE of a method using a different parameter list.

On January 19, Apple and the Computer History Museum released the source code to the Lisa Office System 7/7 version 3.1, including both the complete Office System application suite and the Lisa operating system. (The main components not released were the Workshop environment and its tooling, including the Edit application and the Pascal, COBOL, BASIC, and C compilers and the assembler.) Curious people have started to dig into what’s needed to understand and build it, and I thought I’d share some of what I’ve learned over the past few decades as a Lisa owner and enthusiast.

While Lisa appears to have an underlying procedural API similar to that of the Macintosh Toolbox, the Office System applications were primarily written in the Clascal language—an object-oriented dialect of Pascal designed by Apple with Niklaus Wirth—using the Lisa Application ToolKit so they could share as much code as possible between all of them. This framework is the forerunner of most modern frameworks, including MacApp and the NeXT frameworks, which in turn were huge influences on the Java and .NET frameworks.

One of the interesting things about Clascal is that it doesn’t add much to the Pascal dialect Apple was using at the time: Pascal was originally designed by Wirth to be a teaching language and several constructs useful for systems programming were left out, but soon added back by people who saw Pascal as a nice, compact language with simple semantics that’s straightforward to compile. While in the 1980s there was a bitter war fought between the Pascal and C communities for microcomputer development, practically speaking the popular Pascal dialects and C are almost entirely isomorphic; there’s almost nothing in C that’s not similarly simple to express in Pascal, and vice versa.

So beyond standard Pascal, Apple Pascal had a concept of “units” for promoting code modularity: Instead of having to cram an entire program in one file, you could break it up into composable units that specify their “interface” separately from their “implementation.” Sound familiar?

When creating a unit under this model, both the interface and the implementation can go in a single file, but in separate sections. So let’s say you want to create a unit that makes some simple types available along with procedures and functions to operate on them. (In code examples, I’m putting keywords in uppercase since Pascal was historically case-insensitive and it helps to make clear the distinction between language constructs and developer code.)

UNIT Geometry;

INTERFACE

  TYPE
    Point  = RECORD
               h, v: INTEGER;
             END;

  VAR
    ZeroPoint: Point;

  PROCEDURE InitGeometry;
  PROCEDURE SetPoint(VAR p: Point; h, v: INTEGER);
  FUNCTION EqualPoints(a, b: Point): BOOLEAN;

IMPLEMENTATION

  PROCEDURE InitGeometry;
  BEGIN
    SetPoint(ZeroPoint, 0, 0);
  END;

  PROCEDURE SetPoint;
  BEGIN
    p.h := h;
    p.v := v;
  END;

  FUNCTION EqualPoints;
  BEGIN
    IF (a.h = b.h) AND (a.v = b.v) THEN BEGIN
      EqualPoints := TRUE;
    END ELSE BEGIN
      EqualPoints := FALSE;
    END
  END;

END.

Reading through this code, what’s the first thing you notice? InitGeometry is written without parentheses, as is normal for a zero-argument procedure or function in Pascal. But in the IMPLEMENTATION section, procedures and functions that do take arguments and return values are also written without their parameter lists.

This is why, in a lot of the Lisa codebase, they would actually be written like this:

  FUNCTION EqualPoints{(a, b: Point): BOOLEAN};
  BEGIN
    IF (a.h = b.h) AND (a.v = b.v) THEN BEGIN
      EqualPoints := TRUE;
    END ELSE BEGIN
      EqualPoints := FALSE;
    END
  END;

This is because, despite being “wordy,” Pascal also typically tries to minimize repetition and risk of error. So since you’ve already specified the INTERFACE why specify it again, and potentially get it wrong?

What’s interesting about Clascal is that it does the same thing! You define a class and its methods as an interface, and then its implementation doesn’t require repetition. This may sound convenient but in the end it means you don’t see the argument lists and return types at definition sites, so everyone wound up just copying & pasting them into comments next to the definition!

Another interesting thing about Clascal is that it sticks closer to Smalltalk terminology than most modern systems other than Objective-C (and, marginally, Swift): Instead of this it has SELF and instead of “member functions” it has “methods,” as PARC intended. This makes perfect sense as a bunch of the people who created and used Clascal came from PARC.

So to define a class, you simply use SUBCLASS OF SuperclassName in a TYPE definition section, provide your instance variables as if they were part of a RECORD, and declare its methods using almost-normal PROCEDURE and FUNCTION declarations (not definitions!) that require an OVERRIDE keyword to indicate a subclass override of a superclass method.

So the above code would look like this adapted to Clascal style:

UNIT Geometry;

INTERFACE

  TYPE
    TPoint = SUBCLASS OF TObject
               h, v: INTEGER;
               FUNCTION CREATE(object: TObject; heap: THeap): TPoint;
               PROCEDURE Set(h, v: INTEGER);
               FUNCTION Equals(point: TPoint): BOOLEAN;
             END;

IMPLEMENTATION

  METHODS OF TPoint;

    FUNCTION TPoint.CREATE{(object: TObject; heap: THeap): TPoint};
    BEGIN
      { Create a new object in the heap of this class, if not
        initializing an instance of a subclass. }
      IF object = NIL THEN
        object := NewObject(heap, THISCLASS);
      SELF := TPoint(TObject.CREATE(object, heap));
    END;

    PROCEDURE TPoint.Set{(h, v: INTEGER)};
    BEGIN
      SELF.h := h;
      SELF.v := v;
    END;

    FUNCTION TPoint.Equals{(point: TPoint): BOOLEAN};
    BEGIN
      { Compare this point's fields against the argument's. }
      Equals := (SELF.h = point.h) AND (SELF.v = point.v);
    END;
  END;

END.

In addition to SELF there’s of course SUPERSELF to send messages to your superclass instead. And messages are sent via dot notation, e.g. myPoint.Set(10,20); to send Set to an instance of TPoint. It’s just about the most minimal possible object-oriented addition to Pascal, with one exception: It takes advantage of Lisa’s heap.

Just like Macintosh, Lisa has a Memory Manager whose heap is largely organized in terms of relocatable blocks referenced by handles rather than fixed blocks referenced by pointers. Thus normally in Pascal one would write SELF^^.h := h; to dereference the SELF handle and pointer when accessing the object. However, since Clascal knows SELF and myPoint and so on are objects, it just assumes the dereference—making it hard to get wrong. What I find interesting is that, unlike the Memory Manager on Macintosh, I’ve not seen any references to locking handles so they don’t move during operations. However, since there isn’t any saving and passing around of partially dereferenced handles most of the time, I suspect it isn’t actually necessary!
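
To make the double dereference concrete, here is a minimal sketch in C, chosen because it can be compiled and run today. It models a handle as a pointer to a master pointer that the memory manager updates when a block moves. The names (Handle, MasterPointer, toy_new_handle, toy_relocate) are hypothetical and this is a toy model, not the Lisa or Macintosh Memory Manager API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model, not the Lisa or Macintosh Memory Manager API. */
typedef struct { int h, v; } Point;

typedef void *MasterPointer;   /* holds the block's current address */
typedef MasterPointer *Handle; /* address of the master pointer, which never moves */

/* Allocate a zeroed block of `size` bytes plus a master pointer referring to it. */
Handle toy_new_handle(size_t size) {
    Handle handle = malloc(sizeof(MasterPointer));
    if (handle == NULL) return NULL;
    *handle = calloc(1, size);
    if (*handle == NULL) { free(handle); return NULL; }
    return handle;
}

/* Simulate heap compaction: copy the block elsewhere and update the master pointer. */
int toy_relocate(Handle handle, size_t size) {
    void *moved = malloc(size);
    if (moved == NULL) return 0;
    memcpy(moved, *handle, size);
    free(*handle);
    *handle = moved;
    return 1;
}

void toy_dispose_handle(Handle handle) {
    free(*handle);
    free(handle);
}

/* Returns 1 if the fields are still readable through the handle after a move. */
int demo_handle_survives_relocation(void) {
    Handle p = toy_new_handle(sizeof(Point));
    if (p == NULL) return 0;

    /* The C spelling of Pascal's p^^.h := 10 */
    ((Point *)*p)->h = 10;
    ((Point *)*p)->v = 20;

    if (!toy_relocate(p, sizeof(Point))) { toy_dispose_handle(p); return 0; }

    /* Dereferencing the handle again picks up the block's new address. */
    int ok = ((Point *)*p)->h == 10 && ((Point *)*p)->v == 20;
    toy_dispose_handle(p);
    return ok;
}
```

Saving `(Point *)*p` in a local variable before toy_relocate and using it afterwards would read freed memory. That is the hazard handle locking guards against, and dereferencing the handle afresh on every access, as Clascal does implicitly, avoids it in this model.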

Honestly, as early-1980s languages go, it isn’t so bad at all. It wouldn’t even be all that difficult for an editor to show the argument lists and return types inline at definition sites; that’s the sort of thing that can be done fairly easily even in static language development environments from the 1970s.

Lisa Source Code: Clascal Evolution

Lisa Source Code: Understanding Clascal