“Modernize”

In a vintage computing group, someone posted a picture of a terminal in use at a present-day bookstore that’s still running the same infrastructure it has used for decades, and someone replied that while it was cool from a retrocomputing perspective, as a business they need to “modernize!” This was my reply…

It’s my understanding that a major US tire and oil change chain used HP 3000—Hewlett-Packard’s minicomputer and mainframe platform—for decades, right up until HP cancelled it out from under them, and they only switched away due to the promised end of support. That is to say, they’d be using it now if HPE still supported it today.

My understanding is that their systems were built using native technologies on MPE, the HP mini/mainframe OS, like the IMAGE database, COBOL for business logic, and MPE’s native forms package. They went through a number of transitions over the years: from HP’s 16-bit mainframe architecture to 32-bit and then 64-bit PA-RISC; from terminal concentrators in stores connected to a district mini over packet data, to a small mini at each store with store-and-forward via modem to the regional mini (and on up), and finally to live connections over VPN via local ISPs; and from no direct customer access except by calling someone at a specific store, to customer access via the corporate web site.

So tell me, why should they have switched away if their hand wasn’t forced by HP? Keep in mind that they maintained and enhanced the same applications for decades to accommodate changes in technology, regulations, and expectations, and by all accounts everything was straightforward to use, fast, and worked well. What would be in it for the company and the people working in the shops to rewrite everything regularly for the platform du jour? I’ll grant that their development staff wasn’t padding their résumés with the latest webshit, but why would that actually matter?

The “Promise” of “Easier” Programming

So yesterday, Thomas Fuchs said on Mastodon:

> The LLM thing and people believing it will replace people reminds me so much of the “visual programming” hype in the 80s/90s, when companies promised that everyone could write complex applications with a few clicks and drawing on the screen.
>
> Turns out, no, you can’t.

I had to respond, and he encouraged me to turn my response into a blog post. Thanks!

In essence, he’s both incorrect and quite correct, in ways that correlate directly to the current enthusiasm among the less technically savvy for LLMs.

Back in the late 1980s to mid-1990s, there were large numbers of complex business applications built in Prograph and Sirius Developer and other “4GLs.” These were generally client-server line-of-business applications that were front-ends to databases implementing business processes, prior to the migration of such applications to the web in the late 1990s. In addition, there was LabVIEW, a graphical programming system by National Instruments for instrument control and factory automation, which has largely dominated that industry since not long after its release in 1986.

This was all accompanied by breathless PR, regurgitated by an entirely coöpted technology press, about how graphical programming was going to save businesses money: once software could be developed by drawing lines between boxes and one no longer had to deal with textual syntax, anyone who needed software could write it themselves instead of hiring “expensive” programmers to do it.

The problem with this is that it’s solving the wrong problem: The complexity of textual programming. Yes, people have varying levels of difficulty when it comes to using text to write programs, and some of that is caused by needless complexity in both programming language and development environment design. However, a complex application is still a complex application regardless of whether it’s written in Prograph or Swift.

For example, LabVIEW isn’t necessarily an advancement over the systems it replaced. An enormous amount of factory automation and instrumentation tooling was created in the 1980s around the IEEE 488 General Purpose Interface Bus—originally HP-IB—using Hewlett-Packard’s “Rocky Mountain BASIC” running on its 9000-series instrumentation controllers. (HP’s 68000-based HP 9000-200/300/400 systems running HP-UX were the fanciest versions of these controllers; a significant use of those larger systems was to act as development systems, with deployment on lower-cost fixed-purpose controllers.)

All of that was a lot more maintainable and discoverable than a modern rats’ nest of LabVIEW diagrams, in part because Rocky Mountain BASIC was a good structured BASIC with flexible I/O facilities, not a toy BASIC like the ones on the microcomputers of the time. If you needed to add a feature or fix a bug, you had lots of tools with which to pinpoint the problem and address it, deploy updated code to your test and then production environments, and manage changes like that over time. LabVIEW didn’t win the market because it was easier to use or better; it won because it ran on ubiquitous PC hardware while still being able to fully interoperate with already-deployed GPIB systems.

This is one of the things that also doomed “environment-oriented” programming systems like Smalltalk; plain text has some very important evolutionary benefits when used for programming, particularly when it comes to long-term maintainability, interoperability, and portability. Maintaining any sort of environment-oriented system over time is much more difficult simply because making comparisons between variants can become incredibly complex, and often isn’t possible except from within the system itself. (For example, people working in Smalltalk environments often used to work by passing around FileOuts, and Smalltalk revision control systems often just codified that.)

And of these sorts of systems, LabVIEW is the only one still in wide use; all of that 4GL code has been replaced more than once over time with more traditionally developed software, because it turns out that there are good reasons that software that needs to live a long time tends to be created textually.

What does this have to do with using LLMs for programming? All of the same people—people who appear to resent having to give money to professional software developers to practice their trade—think that this time it’ll be different, that they’ll finally be able to just describe what they want a computer to do for their enterprise and have a program created to do it. They continue to not realize that the actual complexity is in the processes themselves that they’re describing, and in breaking these down in sufficient detail to create computer systems to implement them.

So yeah, Thomas is absolutely correct here: they’re going to fail especially spectacularly again this time, since LLMs are just fancy autocomplete and have zero actual intelligence. It’s like saying we won’t need writing and literature classes any more because spellcheck exists; it’s a category error.

Raspberry Pi vs SPARCstation 20: Fight!

A couple weeks back, I tweeted the following:

> Turns out a Raspberry Pi now is about 6 times as fast as a SPARCstation 20 was 20 years ago. And a Pi 2 is more like 15 times as fast.

I was a little low in my numbers, too — they’re more like 7 times and 16 times to 41 times as fast — since I was going from memory!

Here’s how I came up with that.

The BYTE UNIX Benchmark

The standard benchmark for UNIX systems back in the day was the BYTE UNIX Benchmark, a set of benchmarks originally developed at a university and fleshed out substantially by BYTE Magazine so they could evaluate the new servers and workstations that were coming to market.

Even though BYTE itself is no more (RIP), the benchmark lives on: A later version was posted on Google Code and had some additional portability and enhancement work done, and these days the most up-to-date version is on GitHub.

What’s handy about this benchmark is that it’s indexed against a reference system, and UNIX itself hasn’t changed all that much, so it’s still moderately useful as a way to compare systems with each other.

I’d recently made an offhand comment that the Raspberry Pi, despite feeling “underpowered” by today’s standards, was actually extremely powerful — and that it put a decent workstation from the mid-1990s to shame, the kind of system we tended to be jealous of as college students.

What was a SPARCstation 20?

Back then, Sun was the biggest UNIX workstation vendor, primarily because both their hardware and baseline operating system were good and they offered a ton of flexibility in their product line.

In 1994, Sun introduced a new lineup of SPARCstation systems that had dramatically improved performance compared to their previous models — the original of which was so iconic that it defined the “pizza box” form factor for desktop workstations — and the SPARCstation 20 was one of their flagships.

Here are some specs for the Sun SPARCstation 20 model 61, which shipped in June 1994:

  • One 60 MHz SuperSPARC CPU
  • 1 MB of cache
  • 32 MB RAM (expandable to 512 MB)
  • 20 MB/second SCSI-2
  • 1152×900 8-bit graphics

In 1994, this was quite a substantial system, and it cost $16,195 in its minimum configuration. (That’s $25,580 today!) And if you used one, it felt like it: This thing was wicked fast.

This was also the last system for which the BYTE benchmark was re-indexed, defining this SPARCstation 20 to have a score of 10.0.

The Benchmarks

Actually running the benchmarks under Raspbian Jessie on my Raspberry Pi and Raspberry Pi 2 was trivial, literally just a matter of cloning the git repository and running the script.

Here are the results. Note that the Raspberry Pi 2 has two sets of results, because the BYTE UNIX Benchmark runs once to get “single-CPU” performance numbers and another time to get “multi-CPU” numbers. Its single-CPU numbers are really more like “single process” numbers, however, since the other three cores aren’t actually disabled while the benchmark is run.

| System Benchmarks Index Values | SS20-61 Result | SS20-61 Index | RPi Result | RPi Index | RPi2x1 Result | RPi2x1 Index | RPi2x4 Result | RPi2x4 Index |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dhrystone 2 using register variables | 16700.0 | 10.0 | 1647374.0 | 141.2 | 3000237.2 | 257.1 | 1948737.7 | 1023.9 |
| Double-Precision Whetstone | 55.0 | 10.0 | 239.6 | 43.6 | 435.3 | 79.1 | 1729.8 | 314.5 |
| Execl Throughput | 43.0 | 10.0 | 167.7 | 39.0 | 321.5 | 74.8 | 1210.6 | 281.5 |
| File Copy 1024 bufsize 2000 maxblocks | 3960.0 | 10.0 | 30363.8 | 76.7 | 70026.8 | 176.8 | 110940.6 | 280.2 |
| File Copy 256 bufsize 500 maxblocks | 1655.0 | 10.0 | 9473.6 | 57.2 | 20353.5 | 123.0 | 31384.0 | 189.6 |
| File Copy 4096 bufsize 8000 maxblocks | 5800.0 | 10.0 | 76219.4 | 131.4 | 186926.9 | 322.3 | 296346.9 | 510.9 |
| Pipe Throughput | 12440.0 | 10.0 | 118393.6 | 95.2 | 181562.5 | 146.0 | 713070.2 | 573.2 |
| Pipe-based Context Switching | 4000.0 | 10.0 | 14539.1 | 36.3 | 33809.8 | 84.5 | 126241.1 | 315.6 |
| Process Creation | 126.0 | 10.0 | 434.6 | 34.5 | 1190.8 | 94.5 | 2572.9 | 204.2 |
| Shell Scripts (1 concurrent) | 42.4 | 10.0 | 354.5 | 83.6 | 1087.0 | 256.4 | 2395.0 | 564.9 |
| Shell Scripts (8 concurrent) | 6.0 | 10.0 | 44.9 | 74.8 | 301.0 | 501.7 | 317.0 | 528.3 |
| System Call Overhead | 15000.0 | 10.0 | 276169.1 | 184.1 | 399939.7 | 266.6 | 1545514.4 | 1030.3 |
| System Benchmarks Index Score | | 10.0 | | 71.9 | | 165.6 | | 417.4 |

What does this tell us?

A lot can happen in 20 years. Even when it comes to things like I/O throughput, where the Raspberry Pi really falls down compared to other systems — because it attaches to everything via USB — it’s still way faster than a mid-1990s Sun that we all thought was extremely fast.

In particular, according to the indexes, a Raspberry Pi is about seven times as fast as a baseline SPARCstation 20 model 61 — and has substantially more RAM and storage, too. And the Raspberry Pi 2 is sixteen times as fast at single-threaded tasks, and forty-one times as fast on tasks where all of its cores can be put to use.

Ideally, this would also mean that even a Raspberry Pi Zero should feel exceptionally fast. However, our appetite for heavier software has grown even faster than our appetite for fast hardware, and comparing how systems like these feel in use demonstrates that well.

What’s next?

Well, I just got a DragonBoard 410c, which is a quad-core 64-bit ARM board using a Qualcomm CPU, and which doesn’t have any of the major design issues of the Raspberry Pi…

SBCL test failures on ARM

For hacking/prototyping/fun purposes I have a few embedded systems lying around. For example, I have a couple of Raspberry Pi systems: one of the original Raspberry Pi Model B boards and one of the newer Raspberry Pi 2 Model B boards.

And on all of them, I have the latest Steel Bank Common Lisp (SBCL) building.

On my Raspberry Pi, which is an armv6 device, I see the following failures in SBCL’s unit tests:

 Failure:            debug.impure.lisp / (TRACE ENCAPSULATE NIL)
 Failure:            debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL)
 Expected failure:   packages.impure.lisp / USE-PACKAGE-CONFLICT-SET
 Expected failure:   packages.impure.lisp / IMPORT-SINGLE-CONFLICT
 (62 tests skipped for this combination of platform and features)

On my Raspberry Pi 2, which is an armv7 device, I see the same failures plus some additional ones:

 Failure:            float.pure.lisp / (SCALE-FLOAT-OVERFLOW BUG-372)
 Failure:            float.pure.lisp / (ADDITION-OVERFLOW BUG-372)
 Failure:            float.pure.lisp / (ADDITION-OVERFLOW BUG-372 TAKE-2)
 Failure:            debug.impure.lisp / (TRACE ENCAPSULATE NIL)
 Failure:            debug.impure.lisp / (TRACE-RECURSIVE ENCAPSULATE NIL)
 Expected failure:   packages.impure.lisp / USE-PACKAGE-CONFLICT-SET
 Expected failure:   packages.impure.lisp / IMPORT-SINGLE-CONFLICT
 (62 tests skipped for this combination of platform and features)

This says to me that, contrary to what some have told me, SBCL probably does need to distinguish the various ARM instruction set variants.

Is anyone actually working on SBCL on ARM?

I also have a DragonBoard 410c on the way, and it might be nice to have a fast Lisp on ARM64, though I suspect that’s a bit further out…

When to use NSOperation vs. GCD

Mac OS X has a number of concurrency mechanisms, and that number increases with Snow Leopard. In addition to run loops, threads (both Cocoa and POSIX), and operations, Snow Leopard adds **Grand Central Dispatch** (GCD), a very lightweight way to represent units of work and the style of concurrency they need, and to have the system figure out how to schedule them.
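
At its simplest, handing a unit of work to GCD looks something like this (a minimal sketch; the choice of queues and the work itself are just placeholders):

#import <dispatch/dispatch.h>

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // Do some work off the main thread...
    dispatch_async(dispatch_get_main_queue(), ^{
        // ...then hop back to the main queue, e.g. to update the UI.
    });
});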

But wait, don’t we have that already in NSOperation? It shouldn’t surprise you in the least to learn that NSOperation, on Snow Leopard, is built atop GCD. However, there are a number of differences between the two, and for that reason people have started to ask “How should I decide which to use when?”

The straightforward answer is a general guideline for all application development:

> **Always use the highest-level abstraction available to you, and drop down to lower-level abstractions when measurement shows that they are needed.**

In this particular case, it means that when writing Cocoa applications, you should generally be using NSOperation rather than using GCD directly. Not because of a difference in efficiency, but because NSOperation provides a *higher-level abstraction* atop the mechanisms of GCD.

For example, you can set up a dependency between two NSOperations such that the second will only be run after the first is complete — even if they’re run on different queues. You can use KVO to observe the completion (or cancellation) of different operations — and you can create operations that support being cancelled in the first place. You can set a completion block to run after an operation has finished. And you can, of course, create operations from blocks using *NSBlockOperation*.
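
Here’s a minimal sketch of composing operations that way (the queues, the operations, and the work inside them are just placeholders, and memory management is elided):

#import <Foundation/Foundation.h>

NSOperationQueue *networkQueue = [[NSOperationQueue alloc] init];
NSOperationQueue *parsingQueue = [[NSOperationQueue alloc] init];

NSBlockOperation *download = [NSBlockOperation blockOperationWithBlock:^{
    // Fetch some data here.
}];
NSBlockOperation *parse = [NSBlockOperation blockOperationWithBlock:^{
    // Parse whatever was downloaded.
}];

// parse won't start until download has finished, even though they're on different queues.
[parse addDependency:download];

// Runs once parse finishes (or is cancelled).
[parse setCompletionBlock:^{
    NSLog(@"Parsing finished.");
}];

[networkQueue addOperation:download];
[parsingQueue addOperation:parse];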

You’ll also fit in better with Cocoa by using NSOperation in your high-level code. If you take a look at new Snow Leopard API on NSNotificationCenter, you’ll see one where you specify the NSOperationQueue on which you wish a notification to run a block.
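
The API in question is -addObserverForName:object:queue:usingBlock:. A minimal sketch of using it looks like this (the notification name here is a made-up placeholder):

#import <Foundation/Foundation.h>

// Hold on to the returned observer so you can pass it to -removeObserver: later.
id observer = [[NSNotificationCenter defaultCenter]
    addObserverForName:@"MyModelDidChangeNotification"
                object:nil
                 queue:[NSOperationQueue mainQueue]
            usingBlock:^(NSNotification *note) {
                // Runs on the main queue whenever the notification is posted.
            }];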

Ultimately, you spend a little bit of overhead to use NSOperation instead of GCD, but you gain significant additional functionality that will be useful as you start to compose operations together. And that’s the biggest benefit of NSOperation: You can break up your application in terms of units of work that can not only be run on a queue, but also canceled, observed, and depended upon. This lets you easily define your data dependencies and ensure that you aren’t simply running code serially as a side-effect of locking.

Rebutting Big Nerd Ranch on Objective-C 2.0 dot notation

The Big Nerd Ranch weblog has a new post about Objective-C 2.0 dot notation. They advocate never using it and they’re completely wrong.

Given my reaction on Twitter, several people have asked me to write a more in-depth rebuttal.

I’ve already addressed when and why you should use Objective-C 2.0 properties and dot notation in an earlier post, so I won’t go into that here. I’ll just repeat my response to their weblog.

Here’s what I wrote in response:

> I disagree most emphatically. The whole point of dot notation is that, when combined with properties, it’s not *just* an alternative syntax for invoking methods. In fact, if that’s how you think about dot syntax, STOP. That’s not what it’s for at all.
>
> What dot syntax and property declarations are for is separating object *state* from object *behavior*. Classical OOP only really defines objects as exposing behavior, but the past 30+ years have demonstrated rather aptly that objects consist of both. C# was actually pioneering in this; its concept of properties is rather similar to what the combination of property declarations and dot syntax enable in Objective-C.
>
> To write idiomatic Objective-C 2.0 you should use `@property` to declare properties, and use dot syntax to access them. Period. Doing otherwise is a bad idea because it will create code that isn’t intention-revealing to other experienced Objective-C 2.0 developers. Teaching students to do otherwise is doing them a disservice, because you’re directly contradicting those responsible for the language and its evolution.

In short, Objective-C 2.0 has properties and dot notation as another way of expressing intent in your code. Use them for that, don’t refuse to use them just because they weren’t in earlier versions of the language, or because they require teaching another concept.

When to use properties & dot notation

I listened to a recent episode of the [cocoaFusion:][1] podcast about properties and dot notation today. There were a few interesting points brought up, but I felt a couple of the most important reasons to use `@property` declarations and dot notation weren’t addressed.

The biggest reason I see to use a different notation for both property declaration and property access than for method declaration and sending messages — even if property access ultimately results in a message send, as it does in Objective-C 2.0 — is **separation of state and behavior**.

In the ur-OOP days of Smalltalk, state was supposed to be *encapsulated* within objects *entirely*. This became Smalltalk dogma, and some developers have tried to propagate it to Objective-C: The idea that objects should *never* expose their state to each other, but instead only vend *behaviors*. It makes a certain amount of sense if you see objects purely as actors.

However, we now understand that objects aren’t *just* actors. Objects both “do things” *and* “represent things”; they’re nouns. And they have both *internal* state that they use in managing their behavior and *external* state they expose to the world.

You can use the same mechanism to do both, as Smalltalk does and as Objective-C did before 2007. However, it turns out that it can make a lot more sense to use different syntax to represent the distinct concepts, even if they’re handled by the same underlying mechanism.

For example, I might have a Person class and an NSViewController subclass that presents a Person instance in a view. The API to my Person class might look like this:

@interface Person : NSObject
@property (copy) NSString *name;
@property (copy) NSImage *picture;
@end

This sends a strong signal to the users of the class that `name` and `picture` are *external* state for a Person instance, leading me to write code like this:

Person *person = [[Person alloc] init];
person.name = @"Chris";
person.picture = pictureOfChris;

The intent of this code is clear: I’m not asking `person` to do anything, I’m just manipulating its external state. That may cause it to do things as a side-effect, but at least in the code snippet I’m looking at, I’m just interested in changing some state.

Now let’s see what the NSViewController might look like:

@interface PersonViewController : NSViewController
@property (retain) Person *person;
- (BOOL)updatePicture;
@end

And let’s see how I’d use one, assuming it’s been instantiated and wired up in my view hierarchy already:

selectedPersonViewController.person = person;

if ([selectedPersonViewController updatePicture]) {
    // note elsewhere that the person's picture is updated
}

Even though the `-updatePicture` method has no arguments and returns a `BOOL`, I (1) don’t make it a property and (2) don’t use dot notation to invoke it. Why not, since it fits the *form* of a property? Because it doesn’t fit the *intent* of a property. I’m actually telling the view controller to perform some action, not just changing some state, and I want that intent to be clear to a reader of the above code.

That brings me to the other major reason to both declare properties using `@property` and use dot notation to access them: Xcode. The code completion technology in Xcode tries hard to provide you with a good default completion and good choices in the completion list. But this can be very difficult, especially with the combination of Objective-C’s dynamic dispatch and the sheer size of the frameworks: There are hundreds (!) of methods declared in categories on NSObject as a result of categories describing informal protocols, and any of those methods could be valid on an arbitrary instance.

Dot notation, on the other hand, is **not** valid on arbitrary objects. Even though it compiles to exactly the same sort of `objc_msgSend` dynamic dispatch that bracket notation does, from a type system perspective it actually **requires** a declaration of the property being accessed to work correctly. The reason for this is that the name of a property is not necessarily the name of the message selector to use in dispatch; think of a property `hidden` whose underlying getter method is named `-isHidden`.
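
For instance, a declaration along these lines (`Widget` is a made-up class, not one from a framework) is what gives the compiler and Xcode the mapping between the property name and its getter:

@interface Widget : NSObject
@property (getter=isHidden) BOOL hidden;
@end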

This means that tools like Xcode can provide a much more streamlined experience for properties using dot notation. If I write the above code in Xcode, rather than MarsEdit, Xcode will not offer completions like `-autorelease` and `-retainCount` when I type `person.` — it will only offer me the properties it knows are on Person.

Now, you *can* invoke arbitrary getters and setters using dot notation (so long as a setter has a corresponding getter declared). The compiler doesn’t require an `@property` declaration to use dot notation, just declarations of the accessor methods. I could have declared the `Person.name` property like this:

@interface Person : NSObject
- (NSString *)name;
- (void)setName:(NSString *)value;
@end

The compiler will compile `person.name = @"Chris";` just fine with this declaration, to an invocation of `-[Person setName:]` — but Xcode won’t offer you code completion for it if you don’t use `@property` to declare it. By not offering to complete non-`@property` properties for dot notation, Xcode avoids offering you *bad* completions like `autorelease` and `retain` that are technically allowed but are abominable style.

Of course, one complication is that many frameworks on Mac OS X predate `@property` declarations and thus don’t expose their state this way. That means Xcode won’t offer you completions for their state with dot notation until those frameworks do. Fortunately for iPhone developers, UIKit uses `@property` declarations pervasively. You should too.

To sum up:

* Use `@property` to declare the state exposed by your objects.
* Use dot notation to get and set objects’ state.
* Use method declarations to declare the behavior exposed by your objects.
* Use bracket notation to invoke objects’ behavior.

This results in intention-revealing code that clearly separates state and behavior both in declaration and use.

[1]: http://www.cocoafusion.net/ “cocoaFusion: podcast”

Not it!

I didn’t write [Carrie’s Dots][1] — but I did download it!

It was written by [Dr. Chris Hanson][2], a Chris Hanson who’s evidently still in the mid-South. Maybe the next time I get a chance to visit Mississippi, we’ll get to meet up!

[1]: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewSoftware?id=286047207&mt=8
[2]: http://dr-chris.org/