LLVM Developers’ Meeting 2007-05

The LLVM Compiler Infrastructure is a great technology that came out of the computer science research community and can be used to develop extensible compiler platforms. Among other things, it provides a platform-independent assembly and object code (the “low level virtual machine” from which it takes its name) and a great object-oriented compilation, linking, optimization, and code-generation infrastructure that you can use to target real hardware efficiently. The main idea is that LLVM provides a comprehensive back-end that you can easily build a front-end to target.

There’s a huge amount of material available on the LLVM web site, including the LLVM Assembly Language Reference Manual and LLVM Programmer’s Manual, a wide variety of papers on LLVM, and a great walkthrough of the creation of Stacker, a Forth front-end that targets LLVM. It shows how the budding language creator might leverage the tools available as part of the LLVM infrastructure. I fully expect that in time, “little languages” will have no excuse not to be JIT compiled simply because targeting LLVM is actually easier than writing your own interpreter or bytecode engine! Just walk your AST and generate naïve LLVM code for what you encounter, and let the infrastructure handle the rest. (For those who aren’t developer tools weenies, an Abstract Syntax Tree is the internal representation of a program’s structure before it’s turned into instructions to execute.)

A couple months back, the May 2007 LLVM Developers’ Meeting was held at Apple. The proceedings from this meeting — the actual session content, both in slides and in video form — are available online, and I’ve even created an LLVM Developers’ Meeting podcast (including a subscribe-directly-in-iTunes version) for easy viewing. The video may be low bit rate, but it has a 16:9 aspect ratio so you can even pretend it’s HD. (I put together the podcast primarily so I could watch the sessions on my Apple TV, since I couldn’t attend the meeting.)

So if you’re at all interested in compilers, language design or development, optimization, or development platforms in general, you’ll be very well-served by checking out LLVM. It is a seriously cool enabling technology.

Easily speed up your Xcode builds

A lot of developers don’t heavily modify the projects they instantiate from Xcode’s templates. This is a shame: digging into your projects is not only a great way to learn Xcode in depth, it’s also a great way to ensure they build as fast as possible!

To that end, then, here are two simple tips that you can apply to your projects right now that should get them building faster.

Normalize your build settings!

Projects and targets in Xcode have configurations — collections of build settings that influence how a target is built. The Debug configuration, for example, will typically specify as low a value as possible for the compiler optimization build setting. The Release configuration, on the other hand, will typically specify a relatively high value for it. This raises the question: the Debug and Release configurations of what?

When you create a new project from one of Xcode’s templates, the principal target in that project will typically have a number of build settings customized in its configurations. The project itself, on the other hand, won’t have very many customized build settings. For the one-target-per-project case this doesn’t matter much. However, if you create a project with multiple targets, you can wind up with a number of targets that specify the same (or even subtly different!) information.

Instead of leaving your project this way, you can normalize your build settings such that build settings you want to specify for every target in the project — for example, that compiler optimization should be off for the Debug configuration — are specified at the project level rather than at the target level. This takes advantage of the fact that build settings are inherited in Xcode; if a build setting isn’t customized in a target, the value specified in the project is used.
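For example, a project-level Debug configuration might specify something like the following sketch. These are standard Xcode build setting names, but exactly which settings you promote to the project will depend on what your targets share:

```xcconfig
// Sketch of project-level Debug settings, inherited by every target
// that doesn't explicitly override them.
GCC_OPTIMIZATION_LEVEL = 0      // no optimization while debugging
GCC_ENABLE_CPP_RTTI = YES       // keep RTTI consistent across all targets
```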

What does this buy you? It ensures that you have a single, consistent set of settings that are passed to the compilers, linkers, and other tools that are used to build your software. That way you won’t wind up with subtle bugs like code built with C++ RTTI turned on calling a plug-in built with C++ RTTI turned off. But just as importantly, it enables the next trick, which can have a significant impact on the speed of large, multi-target builds.

Share your precompiled prefix!

Xcode, like many other IDEs, supports prefix files — header files that are included implicitly by every one of your source files when they’re built. Normally, as I described above, these are specified in target configurations. The text in the prefix file that is copied out of Xcode’s templates even mentions that it’s for a specific target.

Don’t be fooled!

Prefix files get precompiled for performance, so they can simply be loaded by the compiler rather than recomputed for every file you build. Precompiling a prefix file takes quite a bit of time, though, especially if you use multiple languages in your project (such as C, Objective-C, and C++), because each language needs a separate precompiled prefix due to differences in how each treats “the same” code.

However, just because precompiled prefix files can’t be shared between languages doesn’t mean they can’t be shared between targets. In fact, for performance, Xcode automates this sharing for you — if you meet it halfway. The critical thing that you need to do is to ensure that your prefix files are:

  1. Named the same; and,
  2. Built using the same compiler-related build settings.

That’s where the normalization I talked about above comes in. You can even promote your prefix-file-related build settings from your targets to your project, so you can be certain that they’re shared.
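As a concrete sketch, a shared prefix file might look something like the following. The name SharedPrefix.pch is hypothetical; you’d point every target’s prefix header build setting at it, or better yet, set it once at the project level:

```objc
// SharedPrefix.pch: used by every target with identical compiler-related
// build settings, so Xcode can precompile it once and share the result.
#ifdef __OBJC__
    #import <Cocoa/Cocoa.h>                   // Objective-C sources get Cocoa
#endif
#include <CoreFoundation/CoreFoundation.h>    // C and C++ sources get CoreFoundation
```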

In fact, if they meet the criteria that Xcode uses to determine whether precompiled prefix files should be shared, even multiple projects will wind up sharing them!

The pause to generate a new precompiled prefix file between building a target and building the targets that depend on it is like a pipeline stall: an unwelcome hiccup that holds everything else up until it’s over. If Xcode can precompile a single set of prefix files at the start of your build and then re-use them for the entire rest of your build, it will stream past as quickly as it possibly can, with only the occasional pause for linking. For large projects with a lot of dependent targets, this can make a big difference.

Separate your preprocessor macros!

“But Chris,” you say, “I have target-specific preprocessor macros that I can’t get rid of!” That’s OK. As long as you don’t need them in your prefix files, you can set them in the special Preprocessor Macros Not Used in Precompiled Headers build setting. These will be passed to the compiler when your sources are compiled, just like the regular Preprocessor Macros will, but they won’t be passed when precompiling a prefix file. So you can have your cake and eat it, too.

Of course, there are macros that you do want to set for your precompiled prefix headers, because they change the behavior of the code being precompiled. Macros like NDEBUG (which turns off C’s assert) or NS_BLOCK_ASSERTIONS (which turns off Foundation’s NSAssert) are important to specify for your precompiled prefix files. Fortunately, these types of macros typically differ only by configuration and remain consistent across targets and projects, allowing you to specify them at the project level rather than the target level.
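Here’s a minimal sketch of why those two macros matter to precompilation (the SetVolume function is hypothetical). Because the assertion macros expand to nothing when NDEBUG or NS_BLOCK_ASSERTIONS is defined, the prefix has to be precompiled with the same definitions your sources are compiled with:

```objc
#import <Foundation/Foundation.h>
#include <assert.h>

// Hypothetical function: both assertions compile away in Release builds.
static void SetVolume(float volume)
{
    assert(volume >= 0.0f && volume <= 1.0f);     // removed when NDEBUG is defined
    NSCAssert(volume >= 0.0f && volume <= 1.0f,
              @"volume must be between 0 and 1"); // removed under NS_BLOCK_ASSERTIONS
    // ... actually set the volume ...
}
```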

Just these three small changes have the potential to make a dramatic difference in how quickly Xcode builds your project:

  1. Normalizing your build settings, so common settings across all your targets are specified at the project level;
  2. Increasing sharing of your precompiled prefix files by naming them consistently and using consistent compiler-related build settings; and
  3. Specifying preprocessor macros that aren’t needed by your prefix files separately from those that are.

Try it out and see for yourself!

Designing for Core Data performance

On the comp.sys.mac.programmer.help newsgroup, Florian Zschocke asked about improving the performance of his Core Data application. Here’s an adapted version of my reply to his post.

Core Data applications should scale quite well to large data sets when using an SQLite persistent store. That said, there are a couple implementation tactics that are critical to performance for pretty much any application using a technology like Core Data:

  1. Maintain a well-normalized data model.
  2. Don’t fetch or keep around more data than you need to.

Implementing these tactics will make it much easier both to create well-performing Core Data applications in the first place and to optimize the performance of applications already in progress.

Maintaining a normalized data model is critical for not fetching more data than you need from a persistent store, because for data consistency Core Data will fetch all of the attributes of an instance at once. For example, consider a Person entity that can have a binary data attribute containing a picture. Even if you’re just displaying a table of Person instances by name, Core Data will still fetch the picture because it’s an attribute of Person. Thus for performance in a situation like this, you’d normalize your data so that you have a separate entity, Picture, to represent the picture for a Person on the other side of a relationship. That way the image data will only be retrieved from the persistent store if the relationship is actually traversed; until it’s traversed, it will just be represented by a fault.
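Here’s a minimal sketch of the difference, assuming a managed object context in context and a Person entity with name and picture properties (all of the names here are hypothetical):

```objc
NSFetchRequest *request = [[[NSFetchRequest alloc] init] autorelease];
[request setEntity:[NSEntityDescription entityForName:@"Person"
                               inManagedObjectContext:context]];

NSError *error = nil;
NSArray *people = [context executeFetchRequest:request error:&error];

// Each person's name is already in memory; each person's picture is
// still a fault, so no image data has been read from the store.
unsigned i, count = [people count];
for (i = 0; i < count; i++)
    NSLog(@"%@", [[people objectAtIndex:i] valueForKey:@"name"]);

// Only traversing the relationship fires the fault and loads the bytes:
NSData *imageData = [[[people lastObject] valueForKey:@"picture"]
                                          valueForKey:@"data"];
```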

Similarly, if you have lots of to-many relationships and need to display summary information about them, de-normalizing your data model slightly and caching the summary information in the main entity can help.

For example, say your app works with Authors and Books. Author.books is a to-many relationship to Book instances and Book.authors is a to-many relationship to Author instances. You may want to show a table of Authors that includes the number of Books related to the Author. However, binding to books.@count for that column value will cause the relationship fault to fire for every Author displayed, which can generate a lot more traffic to the persistent store than you want.

One strategy would be to de-normalize your data model slightly, so Author also contains a booksCount attribute, and maintain it whenever the Author.books relationship changes. This way you can avoid firing the Author.books relationship fault just because you want to display the number of Books an Author is related to, by binding the column value to booksCount instead of books.@count.
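One simple way to do that maintenance is to route relationship changes through a helper method on the Author class, as in this sketch (addBook: is a hypothetical helper, deliberately not named like a KVC accessor; a removeBook: helper would mirror it):

```objc
// Hypothetical helper on Author: relate a Book and keep booksCount in sync.
- (void)addBook:(NSManagedObject *)book
{
    NSMutableSet *books = [self mutableSetValueForKey:@"books"];
    [books addObject:book];
    [self setValue:[NSNumber numberWithUnsignedInteger:[books count]]
            forKey:@"booksCount"];
}
```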

Another thing to be careful of is entity inheritance. It’s an implementation detail, but inheritance in Core Data is single-table. Thus if you have every entity in your application inheriting from one abstract entity, they’ll all wind up in a single table, potentially increasing the time fetches take because they have to scan more data.

Retaining or copying the arrays containing fetch results will keep those results (and their associated row cache entries) in memory for as long as you retain the arrays or copies of them, because the arrays and any copies will be retaining the result objects from the fetch. And as long as the result objects are in memory, they’ll also be registered with a managed object context.

If you want to prune your in-memory object graph, you can use -[NSManagedObjectContext refreshObject:mergeChanges:] to effectively turn an object back into a fault, which can also prune its relationship faults. A more extreme measure would be to use -[NSManagedObjectContext reset] to return a context to a clean state with no changes or registered objects. Finally, you can of course just ensure that any managed objects that don’t have changes are properly released, following normal Cocoa memory management rules: So long as your managed object context isn’t set to retain registered objects, and you aren’t retaining objects that you’ve fetched, they’ll be released normally like any other autoreleased objects.
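In code, those pruning options look something like this (context and person are stand-ins for your own objects):

```objc
// Turn a fully materialized object back into a fault, discarding any
// unsaved changes in it; pass YES to merge changes into the fault instead.
[context refreshObject:person mergeChanges:NO];

// The extreme version: forget every registered object and pending change.
[context reset];
```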

“Enterprise” thought leadership?

David Heinemeier Hansson, creator of Rails at 37signals, takes James McGovern — some Java/J2EE author — to task for his über-lame rant against Ruby in the Enterprise in a great post titled Boy, is James McGovern enterprise or what!

> So by Enterprise, Architect, and Enterprise Architect standards, this gent must be the top of the pop. Thus, allow me to make this perfectly clear: I would be as happy as a clam never to write a single line of software that guys like James McGovern found worthy of The Enterprise.

> If Ruby, Rails, and the rest of the dynamic gang we’re lumped together to represent, is not now, nor ever, McGovern Enterprise Ready™, I say hallelujah! Heck, I’ll repeat that in slow motion just to underscore my excitement: HAL-LE-LU-JAH!

> With that out of the way, we’re faced with a more serious problem. How do we fork the word enterprise? The capitalized version has obviously been hijacked by McGovern and his like-minded to mean something that is synonymous with hurt and pain and torment.

Indeed, McGovern’s rant reads more like a parody of a rant than the real thing:

> 13\. Lets say there is a sixteen week project and the productivity stuff was true and Ruby could save me an entire three weeks which would be significant. Since Ruby is a new vendor and not represented by existing vendors I already do business with, do you think that I will spend more than three weeks in just negotiating the contract?

Yes, because there is some vendor out there named “Ruby” that you need to sign a contract with before you can begin a project.

Despite his claims to be agile, McGovern obviously doesn’t know the first thing about agile development. People come first, sure, but agile development doesn’t say that tools aren’t important. Not using good tools makes it harder for good people to do good work.

That’s why I love developing software for Mac OS X and why I love helping people develop software on Mac OS X: We have great tools like Cocoa, Core Data, Interface Builder, OCUnit, WebObjects, and Xcode, and these can be used by great developers to do great things.

Cooperative User Threads vs. Preemptive Kernel Threads

James Robertson, Cooperative Threading:

> Well, in Cincom Smalltalk, this model gives you predictability –
> you know exactly what a thread is going to do. The issue with runaway
> threads rarely comes up for a simple reason – most processes end up
> pausing for I/O (user input, db access, file access, sockets –
> what have you). That wait for I/O state is what prevents a problem
> from arising.

This is a classic problem and I’m honestly surprised to find out that Cincom Smalltalk implements cooperative user-level threads rather than supporting preemptive kernel threads.

Here’s what I posted in response to James, unattributed thanks to the torturous comment interface on his blog:

> One issue with cooperative threads relative to preemptive
> OS-supplied threads is that you get far less opportunity
> for true concurrency within an application. In an era when
> multi-core processors are becoming significantly more common,
> this is becoming exceptionally important to application
> developers. It’s not just about doing I/O concurrently with
> other operations or allowing an application to perform
> multiple tasks at once; it’s about allowing a task to be
> completed faster because more efficient use is being made
> of machine resources. This is why I take an extremely
> skeptical view of user-level threading packages, especially
> in software built on platforms that have reasonable
> kernel-level threading.

You’ll note that the various threading APIs in Mac OS X are all built on kernel threads.
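For instance, here’s a minimal sketch using pthreads, one of those kernel-thread-backed APIs. Each pthread_create produces a kernel thread that the scheduler can run preemptively on its own core, with no explicit yielding required:

```objc
#include <pthread.h>
#include <stdio.h>

static void *worker(void *name)
{
    // Runs concurrently with main and with the other worker.
    printf("%s running\n", (const char *)name);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, "task one");
    pthread_create(&b, NULL, worker, "task two");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```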

Furthermore, the Mach microkernel schedules exclusively in terms of threads. The microkernel doesn’t even have a conception of processes! It only knows about collections of resources — *tasks* — such as address spaces and IPC ports, and flows of control — *threads* — that it can schedule on processors.

Joel still doesn’t get it

A couple days ago in [Joel on Software][1], Joel claimed that in order for it to make economic sense to develop a Macintosh product, you had to be able to sell *25 times as many* copies as you would a Windows product.

**Bullshit.**

First of all, you can’t just assume that the relative market sizes between the Macintosh and Windows are accurately represented by their market shares. This is partly because market share is a measurement of new computer sales rather than installed base, and partly because there are broad swaths of each market that *aren’t* in the market for your application.

Secondly, it presumes that it costs the same to develop and bring to market a Macintosh product as it does to develop a Windows product. It doesn’t. It costs substantially less. The development tools on Mac OS X are the best on any platform, and speed development significantly; very small teams can create high-end applications in very short timeframes. There is a far smaller test matrix when you’re dealing with Macintosh software, and within that matrix there are far fewer bizarre interactions. There is significantly less competition in the Macintosh market, so you don’t have to spend as much on marketing and promotion of your product. Consumers also don’t have to wade through nearly as much complete garbage to discover the good applications.

Finally, you have to consider post-sales support. The support load presented by Macintosh software is also far lower than for the typical Windows product. This means lower post-sales costs, which means you get to keep more of the revenue generated by the product.

All this adds up to an excellent ROI story for Mac OS X development. You may still have the potential for a higher return on a Windows product, but you’ll also have substantially higher costs, a longer development timeline, and correspondingly greater project risk. All sides need to be weighed before deciding whether it’s worth pursuing one platform or another – you can’t just do a couple bogus back-of-the-envelope calculations and decide you need to sell 25 times as many units to make Macintosh development worthwhile.

[1]: http://www.joelonsoftware.com/news/20020910.html