Objective-C 2.0 properties and to-many relationships

I’ve occasionally been asked about the appropriate form for properties representing to-many relationships in Objective-C 2.0.

Let’s start with the example of a recipe and its ingredients, represented by instances of the Recipe and Ingredient classes.

@interface Recipe : NSObject {
@private
NSMutableSet *_ingredients;
}

@property (copy) NSSet *ingredients;

@end

This is a pretty straightforward interface for the Recipe class, but how should we actually implement it? You might first think of writing something like this:

@implementation Recipe

- (id)init {
    if (self = [super init]) {
        _ingredients = [[NSMutableSet alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_ingredients release];
    [super dealloc];
}

@synthesize ingredients = _ingredients;

@end

However, this **will not** do what you expect. In particular, whenever you set the `ingredients` property, it will **replace the value** of the `_ingredients` instance variable used for its storage with a new, immutable NSSet!

What’s wrong with this? For one thing, you won’t be able to make any finer-grained changes to the `ingredients` property, so your code may wind up doing a lot of work unnecessarily. You’ll only ever post Key-Value Observing changes for the entire property, as well, not for individual manipulations; anything observing those changes will probably wind up doing extra work, too.

Why not instead change the type of the property itself to `NSMutableSet *` then? That way, your code could just manipulate the ingredients of a recipe directly, right? You *could* do that, but then you wouldn’t get **any** Key-Value Observing notifications for changes to the property. Why not? Because KVO is all about notification of **property** changes, and changing the object that stores a property’s data isn’t the same thing as changing the property itself.

What should you do then? Here’s how I would implement the `Recipe.ingredients` property instead of using the above `@synthesize` directive:

@implementation Recipe (IngredientsProperty)

- (void)setIngredients:(NSSet *)value {
    [_ingredients setSet:value];
}

- (NSSet *)ingredients {
    return [NSSet setWithSet:_ingredients];
}

@end

What’s different here is that I’m taking advantage of the fact that the instance variable backing the property is mutable. For just a getter and a setter, that isn’t a big deal. However, since I’m dealing with a to-many relationship, I wouldn’t just write a getter and a setter. I’d also write some of the additional relationship-KVC methods for the property, so I can manipulate the property more efficiently, and get finer-grained KVO notifications:

@interface Recipe (IngredientsProperty)

- (void)addIngredientsObject:(Ingredient *)ingredient;
- (void)removeIngredientsObject:(Ingredient *)ingredient;

- (void)addIngredients:(NSSet *)ingredients;
- (void)removeIngredients:(NSSet *)ingredients;

@end

@implementation Recipe (IngredientsProperty)

- (void)addIngredientsObject:(Ingredient *)ingredient {
    [_ingredients addObject:ingredient];
}

- (void)removeIngredientsObject:(Ingredient *)ingredient {
    [_ingredients removeObject:ingredient];
}

- (void)addIngredients:(NSSet *)ingredients {
    [_ingredients unionSet:ingredients];
}

- (void)removeIngredients:(NSSet *)ingredients {
    [_ingredients minusSet:ingredients];
}

@end

By doing this, when I need to manipulate a Recipe’s `ingredients` property I can use `-mutableSetValueForKey:` to do so and any changes I make will be efficient. For example, if I’m creating a Recipe to represent Meghan’s Butternut Squash Panang Curry, I might write some code like this:

NSMutableSet *ingredients = [panangCurryRecipe mutableSetValueForKey:@"ingredients"];

[ingredients addObject:[Ingredient ingredientWithName:@"Butternut Squash" quantity:1]];
[ingredients addObject:[Ingredient ingredientWithName:@"Panang Curry Paste" quantity:1]];
[ingredients addObject:[Ingredient ingredientWithName:@"Coconut Milk" quantity:1]];

Instead of making several copies of the set as I make changes, the underlying mutable set is changed in as efficient a way as possible given the accessors I’ve implemented. I don’t have to do any extra work to make that happen.

I also get efficient KVO change notifications for the property, so if I have any user interface bound to it — whether through Cocoa bindings or, if I’m using Cocoa Touch, a “bindings lite” implemented atop KVO — the change notifications it receives will reflect exactly the changes made, instead of wholesale replacement of the set.
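To make the difference in notifications concrete, here is a minimal sketch rather than a definitive implementation. It assumes manual retain/release like the rest of this post, uses plain strings as stand-ins for Ingredient instances, and introduces a hypothetical `ChangeRecorder` observer and `RecordedChangeKinds` function that are not part of the Recipe class above.

```objc
#import <Foundation/Foundation.h>

// Trimmed-down Recipe: just enough of the accessors above for this sketch.
@interface Recipe : NSObject {
@private
    NSMutableSet *_ingredients;
}
@property (copy) NSSet *ingredients;
- (void)addIngredientsObject:(id)ingredient;
- (void)removeIngredientsObject:(id)ingredient;
@end

@implementation Recipe

- (id)init {
    if ((self = [super init])) {
        _ingredients = [[NSMutableSet alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_ingredients release];
    [super dealloc];
}

- (void)setIngredients:(NSSet *)value { [_ingredients setSet:value]; }
- (NSSet *)ingredients { return [NSSet setWithSet:_ingredients]; }
- (void)addIngredientsObject:(id)ingredient { [_ingredients addObject:ingredient]; }
- (void)removeIngredientsObject:(id)ingredient { [_ingredients removeObject:ingredient]; }

@end

// Hypothetical observer that records the kind of every change it is told about.
@interface ChangeRecorder : NSObject {
@private
    NSMutableArray *_kinds;
}
- (NSArray *)kinds;
@end

@implementation ChangeRecorder

- (id)init {
    if ((self = [super init])) {
        _kinds = [[NSMutableArray alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_kinds release];
    [super dealloc];
}

- (NSArray *)kinds { return [NSArray arrayWithArray:_kinds]; }

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
    // The change kind is an NSNumber wrapping an NSKeyValueChange value.
    [_kinds addObject:[change objectForKey:NSKeyValueChangeKindKey]];
}

@end

// Makes one change through the proxy and one through the setter, and returns
// the change kinds the observer saw. Call with an autorelease pool in place.
static NSArray *RecordedChangeKinds(void) {
    Recipe *recipe = [[[Recipe alloc] init] autorelease];
    ChangeRecorder *recorder = [[[ChangeRecorder alloc] init] autorelease];

    [recipe addObserver:recorder
             forKeyPath:@"ingredients"
                options:NSKeyValueObservingOptionNew
                context:NULL];

    // Goes through -addIngredientsObject: and is reported as an insertion.
    [[recipe mutableSetValueForKey:@"ingredients"] addObject:@"Coconut Milk"];

    // Goes through -setIngredients: and is reported as a whole-property setting.
    recipe.ingredients = [NSSet setWithObject:@"Butternut Squash"];

    [recipe removeObserver:recorder forKeyPath:@"ingredients"];
    return [recorder kinds];
}
```

An observer that only cares about what was added can read `NSKeyValueChangeNewKey` from the same dictionary when the kind is `NSKeyValueChangeInsertion`, instead of diffing two whole sets.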

I could still improve the code above. I’m using `-[NSObject(NSKeyValueCoding) mutableSetValueForKey:]` to manipulate the `Recipe.ingredients` property. That means I don’t get nice Code Sense completion from Xcode, and have to remember the property’s name *and* spell it correctly when I use it in a string. So I’ll add the following property declaration and implementation:

@interface Recipe (IngredientsProperty)
@property (readonly, copy) NSMutableSet *mutableIngredients;
@end

@implementation Recipe (IngredientsProperty)

- (NSMutableSet *)mutableIngredients {
    return [self mutableSetValueForKey:@"ingredients"];
}

@end

You’re probably thinking something like “Wait a minute, `readonly` and `NSMutableSet`?!” That’s exactly what I mean to say, though: You can mutate the *collection* you get back (“read”) from the property, but not the *property itself*.

> **Update:** On Twitter, a couple of people asked why I didn’t just use `-[Recipe addIngredientsObject:]` directly, since I have that available. I certainly could have done that, and it’d have all of the advantages I cite, and it wouldn’t require the creation of the proxy mutable set either. However, if I wanted to do something more complex than just an addition, using the proxy mutable set is a significant advantage.
>
> This is because the proxy mutable set (or array, if you’re using an ordered relationship and `-mutableArrayValueForKey:`) will do the heavy lifting of figuring out the right combination of your implemented accessors to perform an operation most efficiently. Also, technologies like Cocoa bindings will always use the proxy.

With this additional property in place, the entire Recipe class will look something like this:

@interface Recipe : NSObject {
@private
NSMutableSet *_ingredients;
}

@property (copy) NSSet *ingredients;
@property (readonly, copy) NSMutableSet *mutableIngredients;

- (void)addIngredientsObject:(Ingredient *)ingredient;
- (void)removeIngredientsObject:(Ingredient *)ingredient;

- (void)addIngredients:(NSSet *)ingredients;
- (void)removeIngredients:(NSSet *)ingredients;

@end

@implementation Recipe

- (id)init {
    if (self = [super init]) {
        _ingredients = [[NSMutableSet alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_ingredients release];
    [super dealloc];
}

- (void)setIngredients:(NSSet *)value {
    [_ingredients setSet:value];
}

- (NSSet *)ingredients {
    return [NSSet setWithSet:_ingredients];
}

- (NSMutableSet *)mutableIngredients {
    return [self mutableSetValueForKey:@"ingredients"];
}

- (void)addIngredientsObject:(Ingredient *)ingredient {
    [_ingredients addObject:ingredient];
}

- (void)removeIngredientsObject:(Ingredient *)ingredient {
    [_ingredients removeObject:ingredient];
}

- (void)addIngredients:(NSSet *)ingredients {
    [_ingredients unionSet:ingredients];
}

- (void)removeIngredients:(NSSet *)ingredients {
    [_ingredients minusSet:ingredients];
}

@end

And the code for creating the curry recipe becomes this, for which Xcode will give helpful Code Sense completion suggestions:

NSMutableSet *ingredients = panangCurryRecipe.mutableIngredients;

[ingredients addObject:[Ingredient ingredientWithName:@"Butternut Squash" quantity:1]];
[ingredients addObject:[Ingredient ingredientWithName:@"Panang Curry Paste" quantity:1]];
[ingredients addObject:[Ingredient ingredientWithName:@"Coconut Milk" quantity:1]];

As well, it continues to avoid making copies of the underlying collection representing the relationship, and it also continues to post fine-grained KVO change notifications rather than whole-property notifications, ensuring bound controls are updated efficiently.

So when you’re creating properties for to-many relationships, whether they’re unordered (NSSet) or ordered (NSArray), consider using this approach to implementing them. It’ll take a little more code, but it’ll be a lot more efficient and more correct.
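For the ordered case, the same idea might look like the sketch below. The `OrderedRecipe` class, its `steps` property, and the `StepsAfterEditing` function are hypothetical names invented for illustration; the accessor names follow the standard KVC indexed-accessor patterns (`-insertObject:in<Key>AtIndex:` and `-removeObjectFrom<Key>AtIndex:`), and the code assumes manual retain/release as elsewhere in this post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical ordered to-many relationship: a recipe and its steps.
@interface OrderedRecipe : NSObject {
@private
    NSMutableArray *_steps;
}
@property (copy) NSArray *steps;
- (void)insertObject:(NSString *)step inStepsAtIndex:(NSUInteger)index;
- (void)removeObjectFromStepsAtIndex:(NSUInteger)index;
@end

@implementation OrderedRecipe

- (id)init {
    if ((self = [super init])) {
        _steps = [[NSMutableArray alloc] init];
    }
    return self;
}

- (void)dealloc {
    [_steps release];
    [super dealloc];
}

// Whole-property accessors: mutate the backing array in place on set,
// hand out an immutable snapshot on get.
- (void)setSteps:(NSArray *)value { [_steps setArray:value]; }
- (NSArray *)steps { return [NSArray arrayWithArray:_steps]; }

// Indexed accessors used by the proxy from -mutableArrayValueForKey:.
- (void)insertObject:(NSString *)step inStepsAtIndex:(NSUInteger)index {
    [_steps insertObject:step atIndex:index];
}

- (void)removeObjectFromStepsAtIndex:(NSUInteger)index {
    [_steps removeObjectAtIndex:index];
}

@end

// Edits the relationship through the mutable array proxy and returns the
// resulting steps. Call with an autorelease pool in place.
static NSArray *StepsAfterEditing(void) {
    OrderedRecipe *recipe = [[[OrderedRecipe alloc] init] autorelease];
    NSMutableArray *steps = [recipe mutableArrayValueForKey:@"steps"];

    [steps addObject:@"Roast the squash"];
    [steps addObject:@"Simmer the curry"];
    [steps insertObject:@"Preheat the oven" atIndex:0];

    return recipe.steps;
}
```

As with the set version, a `mutableSteps` convenience property could wrap the `-mutableArrayValueForKey:` call so the key string lives in one place.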

#### Bonus Round: Core Data

What about Core Data? Now that iPhone OS 3.0 has Core Data, in addition to Mac OS X, there’s **really** no excuse not to use it. But would you do anything differently above?

Of course. But since we’re talking about Core Data, it turns out that what you do differently is actually *write a whole lot less code*. Here’s what the declaration of the Recipe class will look like if it corresponds to a Core Data entity:

@interface Recipe : NSManagedObject

@property (copy) NSSet *ingredients;
@property (readonly, copy) NSMutableSet *mutableIngredients;

@end

@interface Recipe (CoreDataGeneratedAccessors)

- (void)addIngredientsObject:(Ingredient *)ingredient;
- (void)removeIngredientsObject:(Ingredient *)ingredient;

- (void)addIngredients:(NSSet *)ingredients;
- (void)removeIngredients:(NSSet *)ingredients;

@end

Notice that I’ve gotten rid of the instance variables section entirely. This is because Core Data manages the storage for your modeled attributes and relationships for you; you don’t need (and *really* don’t want) instance variables for them.

You’ll also notice that I put the additional to-many relationship accessor methods in their own category. To see why, take a look at the implementation of the class:

@implementation Recipe

@dynamic ingredients;

- (NSMutableSet *)mutableIngredients {
    return [self mutableSetValueForKey:@"ingredients"];
}

@end

Notice anything missing? *All of the methods related to the modeled `ingredients` property!* Core Data will not only generate an efficient setter and getter for the `ingredients` property automatically at run time, but will *also* generate implementations for the other `ingredients` to-many relationship accessor methods as well!

Core Data will generate the methods (at the latest) when you try to use them; it’s not dependent on having the category declaration available. That’s just for the compiler and IDE’s benefit when you’re writing code that *uses* those methods, so they can be completed by Code Sense and the compiler knows not to generate unknown-method warnings.

#### Changes

I added a bit after the first use of `-mutableSetValueForKey:` to address why one might want to use the mutable set proxy rather than just using the finer-grained KVC accessor methods directly.

Go ahead and use Core Data

In a few weeks, it will be **four years** since Mac OS X 10.4 Tiger was first released. That was the first release to include Core Data. It will also be about **one and a half years** since Mac OS X 10.5 Leopard was released, with significant enhancements to the Core Data API.

It’s pretty safe to start using Core Data in your applications now. You certainly don’t need to write directly to the low-level SQLite API any more.

Let’s merge managed object models!

There was a question recently on Stack Overflow asking how to handle cross-model relationships in managed object models. Now, the poster wasn’t asking about how to handle relationships across persistent stores — he was asking how to handle splitting a model up into pieces such that the pieces could be recombined.

It turns out that this is somewhat straightforward to do using Core Data. Let’s say you have a simple model with Song and Artist entities. I’ll write it out here in a pseudo-modeling language for ease of reading:

MusicModel = {
    Song = {
        attribute title : string;
        attribute duration : float;
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { };
    };

    Artist = {
        attribute name : string;
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { };
    };
};

Now let’s say you want to split this up into two models, where Song is in one and Artist is in the other. You could just try and create two xcdatamodel files in Xcode, one with each entity, and wire the relationships together after loading them and merging them with +[NSManagedObjectModel modelByMergingModels:]. Except that won’t work: Relationships with no destination entity won’t be compiled by the model compiler.

What else might you try? You could try just putting dummy entities in for relationships to point to. However, merging models will fail then, because NSManagedObjectModel won’t merge models that have entity name collisions.

It turns out, though, that you can merge models very easily by hand, by taking advantage of the way Core Data’s model-description objects handle the NSCopying protocol. All you have to do is create your destination model, loop through every entity in each of your source models, and copy every entity that you haven’t tagged as a stand-in using a special key in their userInfo dictionary.

Why does this work? The trick is that before you tell a persistent store coordinator to use a model, that model is mutable and references relationship destination entities and inverse relationships by name. So you can have only a minimal representation of Artist in one model, and a minimal representation of Song in another model:

SongModel = {
    Song = {
        attribute title : string;
        attribute duration : float;
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { };
    };

    Artist = {
        /* Note no attributes. */
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { IsPlaceholder = YES; };
    };
};

ArtistModel = {
    Song = {
        /* Note no attributes. */
        to-one-relationship artist : Artist,
            inverse : songs,
            delete-rule : nullify;
        userInfo = { IsPlaceholder = YES; };
    };

    Artist = {
        attribute name : string;
        to-many-relationship songs : Song,
            inverse : artist,
            delete-rule : cascade;
        userInfo = { };
    };
};

Then, when you write some code to combine them, the merged model will wind up with the full definition of Song and the full definition of Artist. Here’s an example of the code you might write to do this:

- (NSManagedObjectModel *)mergeModelsReplacingDuplicates:(NSArray *)models {
    NSManagedObjectModel *mergedModel = [[[NSManagedObjectModel alloc] init] autorelease];

    // General strategy:  For each model, copy its non-placeholder entities
    // and add them to the merged model. Placeholder entities are identified
    // by a true IsPlaceholder value in their userInfo; their real definition
    // comes from one of the other models being merged.

    NSMutableArray *mergedModelEntities = [NSMutableArray arrayWithCapacity:0];

    for (NSManagedObjectModel *model in models) {
        for (NSEntityDescription *entity in [model entities]) {
            if ([[[entity userInfo] objectForKey:@"IsPlaceholder"] boolValue]) {
                // Ignore placeholder.
            } else {
                NSEntityDescription *newEntity = [entity copy];
                [mergedModelEntities addObject:newEntity];
                [newEntity release];
            }
        }
    }

    [mergedModel setEntities:mergedModelEntities];

    return mergedModel;
}

This may seem like a bit of overhead for this simple example. The critical thing to see above is that only that which is necessary for model consistency is in the placeholder entities. Thus you only need the inverse relationship from Song to Artist in ArtistModel. Say you wanted to add a Picture entity related to the Artist entity — you don’t have to add that to both models, only to ArtistModel. The benefit of this method for merging models should then be pretty apparent: It gives you the ability to make your model separable, just like your code.

Designing for Core Data performance

On the comp.sys.mac.programmer.help newsgroup, Florian Zschocke asked about improving the performance of his Core Data application. Here’s an adapted version of my reply to his post.

Core Data applications should scale quite well to large data sets when using an SQLite persistent store. That said, there are a couple implementation tactics that are critical to performance for pretty much any application using a technology like Core Data:

  1. Maintain a well-normalized data model.
  2. Don’t fetch or keep around more data than you need to.

Implementing these tactics will make it much easier both to create well-performing Core Data applications in the first place and to optimize the performance of applications already in progress.

Maintaining a normalized data model is critical for not fetching more data than you need from a persistent store, because for data consistency Core Data will fetch all of the attributes of an instance at once. For example, consider a Person entity that can have a binary data attribute containing a picture. Even if you’re just displaying a table of Person instances by name, Core Data will still fetch the picture because it’s an attribute of Person. Thus for performance in a situation like this, you’d normalize your data so that you have a separate entity, Picture, to represent the picture for a Person on the other side of a relationship. That way the image data will only be retrieved from the persistent store if the relationship is actually traversed; until it’s traversed, it will just be represented by a fault.

Similarly, if you have lots of to-many relationships and need to display summary information about them, de-normalizing your data model slightly and caching the summary information in the main entity can help.

For example, say your app works with Authors and Books. Author.books is a to-many relationship to Book instances and Book.authors is a to-many relationship to Author instances. You may want to show a table of Authors that includes the number of Books related to the Author. However, binding to books.@count for that column value will cause the relationship fault to fire for every Author displayed, which can generate a lot more traffic to the persistent store than you want.

One strategy would be to de-normalize your data model slightly so Author also contains a booksCount attribute, and maintains that whenever the Author.books relationship is maintained. This way you can avoid firing the Author.books relationship fault just because you want to display the number of Books an Author is related to, by binding the column value to booksCount instead of books.@count.

Another thing to be careful of is entity inheritance. It’s an implementation detail, but inheritance in Core Data is single-table. Thus if you have every entity in your application inheriting from one abstract entity, it’ll all wind up in a single table, potentially increasing the amount of time fetches take because they require scanning more data.

Retaining or copying the arrays containing fetch results will keep those results (and their associated row cache entries) in memory for as long as you retain the arrays or copies of them, because the arrays and any copies will be retaining the result objects from the fetch. And as long as the result objects are in memory, they’ll also be registered with a managed object context.

If you want to prune your in-memory object graph, you can use -[NSManagedObjectContext refreshObject:mergeChanges:] to effectively turn an object back into a fault, which can also prune its relationship faults. A more extreme measure would be to use -[NSManagedObjectContext reset] to return a context to a clean state with no changes or registered objects. Finally, you can of course just ensure that any managed objects that don’t have changes are properly released, following normal Cocoa memory management rules: So long as your managed object context isn’t set to retain registered objects, and you aren’t retaining objects that you’ve fetched, they’ll be released normally like any other autoreleased objects.
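As a rough sketch of the first option, a helper along these lines could refault objects you no longer need in memory. `RefaultUnchangedObjects` is a hypothetical name, and the code assumes every object in the array is registered with the given context.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: turns every unchanged object in `objects` back into a
// fault and returns how many objects were refaulted.
static NSUInteger RefaultUnchangedObjects(NSManagedObjectContext *context,
                                          NSArray *objects) {
    NSUInteger refaultedCount = 0;

    for (NSManagedObject *object in objects) {
        // Leave objects with pending changes alone: refreshing them with
        // mergeChanges:NO would discard those changes.
        if ([object isInserted] || [object isUpdated] || [object isDeleted]) {
            continue;
        }

        [context refreshObject:object mergeChanges:NO];
        refaultedCount++;
    }

    return refaultedCount;
}
```

Passing `mergeChanges:NO` is what turns the object into a fault; passing `YES` instead would reload its values from the store while keeping any local changes.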

“Enterprise” thought leadership?

David Heinemeier Hansson, creator of Rails at 37signals, takes James McGovern — some Java/J2EE author — to task for his über-lame rant against Ruby in the Enterprise in a great post titled Boy, is James McGovern enterprise or what!

> So by Enterprise, Architect, and Enterprise Architect standards, this gent must be the top of the pop. Thus, allow me to make this perfectly clear: I would be as happy as a clam never to write a single line of software that guys like James McGovern found worthy of The Enterprise.

> If Ruby, Rails, and the rest of the dynamic gang we’re lumped together to represent, is not now, nor ever, McGovern Enterprise Ready™, I say hallelujah! Heck, I’ll repeat that in slow motion just to underscore my excitement: HAL-LE-LU-JAH!

> With that out of the way, we’re faced with a more serious problem. How do we fork the word enterprise? The capitalized version has obviously been hijacked by McGovern and his like-minded to mean something that is synonymous with hurt and pain and torment.

Indeed, McGovern’s rant reads more like a parody of a rant than the real thing:

> 13\. Lets say there is a sixteen week project and the productivity stuff was true and Ruby could save me an entire three weeks which would be significant. Since Ruby is a new vendor and not represented by existing vendors I already do business with, do you think that I will spend more than three weeks in just negotiating the contract?

Yes, because there is some vendor out there named “Ruby” that you need to sign a contract with before you can begin a project.

Despite his claims to be agile, McGovern obviously doesn’t know the first thing about agile development. People come first, sure, but agile development doesn’t say that tools aren’t important. Not using good tools makes it harder for good people to do good work.

That’s why I love developing software for Mac OS X and why I love helping people develop software on Mac OS X: We have great tools like Cocoa, Core Data, Interface Builder, OCUnit, WebObjects, and Xcode, and these can be used by great developers to do great things.

WWDC 2005 Wrap-Up

WWDC 2005 is over, and *damn* was it a great week! Apple made some incredible announcements and shipped some incredible software, I got to see lots of old friends and make a lot of new ones, and I got to talk to lots of developers about things that I’m passionate about: Core Data, unit testing, setting up and streamlining your build process, and creating insanely great software to make users’ lives better.

It was a wonderful, wonderful time. Thanks to everyone!

Unit testing and Core Data

Mike Zornek asks about unit testing and Core Data. I’ve been meaning to write about this, so this is the perfect opportunity to do so.

Writing unit tests against your model and code that uses Core Data is easy. For example, it’s trivial to load your compiled model in a unit test:

NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];

Not only that, but you can introspect it:

NSArray *entities = [model entities];

And you can do this all the way down to the property level. This means that it’s possible to assert that your entire model is set up the way you expect it to be. For example, you can make sure that your Employee entity has a mandatory salary attribute with a minimum value of 1 and a type of NSDecimalAttributeType, and descends from a Person entity that has a mandatory name attribute with a minimum length of 1 and a default value of “name.”
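A check like that might be sketched as follows. To keep the example self-contained it builds a small stand-in model in code instead of loading a compiled one; `MakeExampleModel` and `ModelLooksAsExpected` are hypothetical names, the minimum-value and minimum-length validation checks are left out for brevity, and manual retain/release is assumed.

```objc
#import <CoreData/CoreData.h>

// Stand-in for a compiled model: Person (name) with an Employee subentity
// (salary). A real test would load the model from the bundle instead.
static NSManagedObjectModel *MakeExampleModel(void) {
    NSAttributeDescription *name = [[[NSAttributeDescription alloc] init] autorelease];
    [name setName:@"name"];
    [name setAttributeType:NSStringAttributeType];
    [name setOptional:NO];
    [name setDefaultValue:@"name"];

    NSAttributeDescription *salary = [[[NSAttributeDescription alloc] init] autorelease];
    [salary setName:@"salary"];
    [salary setAttributeType:NSDecimalAttributeType];
    [salary setOptional:NO];

    NSEntityDescription *person = [[[NSEntityDescription alloc] init] autorelease];
    [person setName:@"Person"];
    [person setProperties:[NSArray arrayWithObject:name]];

    NSEntityDescription *employee = [[[NSEntityDescription alloc] init] autorelease];
    [employee setName:@"Employee"];
    [employee setProperties:[NSArray arrayWithObject:salary]];

    [person setSubentities:[NSArray arrayWithObject:employee]];

    NSManagedObjectModel *model = [[[NSManagedObjectModel alloc] init] autorelease];
    [model setEntities:[NSArray arrayWithObjects:person, employee, nil]];
    return model;
}

// Returns YES if the model has the Employee and Person definitions described
// in the text (minus the validation rules, which this sketch does not check).
static BOOL ModelLooksAsExpected(NSManagedObjectModel *model) {
    NSEntityDescription *employee = [[model entitiesByName] objectForKey:@"Employee"];
    NSAttributeDescription *salary = [[employee attributesByName] objectForKey:@"salary"];

    NSEntityDescription *person = [employee superentity];
    NSAttributeDescription *name = [[person attributesByName] objectForKey:@"name"];

    return salary != nil
        && ![salary isOptional]
        && [salary attributeType] == NSDecimalAttributeType
        && [[person name] isEqualToString:@"Person"]
        && name != nil
        && ![name isOptional]
        && [[name defaultValue] isEqual:@"name"];
}
```

In an OCUnit test case, each clause would more naturally be its own assertion so a failure points at the exact property that drifted.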

But how do you test your use of Core Data? You just use Core Data in your tests as you would in your project. For example, to instantiate a complete Core Data “stack” (as it’s sometimes referred to):

NSManagedObjectModel *model;
NSPersistentStoreCoordinator *coordinator;
NSManagedObjectContext *context;

model = [[NSManagedObjectModel alloc] initWithContentsOfURL:urlToModel];
coordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
context = [[NSManagedObjectContext alloc] init];
[context setPersistentStoreCoordinator:coordinator];

To instantiate managed objects associated with that context from entities in your model:

NSManagedObject *employee;

employee = [NSEntityDescription insertNewObjectForEntityForName:@"Employee"
                                         inManagedObjectContext:context];

This gives you an autoreleased Employee (assuming your context’s coordinator’s model has an Employee entity, of course). It’s that easy.

You can then do things like check that this object was created with the correct defaults (e.g. the ones specified in your model), that it posts KVO notifications properly for properties where you care about such things, and so on. You can even add a persistent store to the coordinator and test that saving and loading work (and don’t work when they’re supposed to fail, of course) just by using normal Core Data and Foundation APIs.
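Here is one way such a test might be sketched, using an in-memory store so nothing touches the disk. The one-entity model built in `MakeEmployeeModel` and the `InsertAndSaveEmployee` function are hypothetical stand-ins for your own model and test method, and the code assumes manual retain/release.

```objc
#import <CoreData/CoreData.h>

// Hypothetical stand-in model: one Employee entity with a mandatory name
// attribute whose default value is "name".
static NSManagedObjectModel *MakeEmployeeModel(void) {
    NSAttributeDescription *name = [[[NSAttributeDescription alloc] init] autorelease];
    [name setName:@"name"];
    [name setAttributeType:NSStringAttributeType];
    [name setOptional:NO];
    [name setDefaultValue:@"name"];

    NSEntityDescription *employee = [[[NSEntityDescription alloc] init] autorelease];
    [employee setName:@"Employee"];
    [employee setProperties:[NSArray arrayWithObject:name]];

    NSManagedObjectModel *model = [[[NSManagedObjectModel alloc] init] autorelease];
    [model setEntities:[NSArray arrayWithObject:employee]];
    return model;
}

// Builds a stack on an in-memory store, inserts an Employee, checks that it
// picked up the model's default, and saves. Returns NO on any failure.
static BOOL InsertAndSaveEmployee(NSError **error) {
    NSManagedObjectModel *model = MakeEmployeeModel();

    NSPersistentStoreCoordinator *coordinator =
        [[[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model] autorelease];
    if (![coordinator addPersistentStoreWithType:NSInMemoryStoreType
                                   configuration:nil
                                             URL:nil
                                         options:nil
                                           error:error]) {
        return NO;
    }

    NSManagedObjectContext *context = [[[NSManagedObjectContext alloc] init] autorelease];
    [context setPersistentStoreCoordinator:coordinator];

    NSManagedObject *employee =
        [NSEntityDescription insertNewObjectForEntityForName:@"Employee"
                                      inManagedObjectContext:context];

    // The new object should start out with the default from the model.
    if (![[employee valueForKey:@"name"] isEqual:@"name"]) {
        return NO;
    }

    return [context save:error];
}
```

To test the failure path, set a mandatory attribute to nil before saving and assert that `-save:` returns NO with a populated error.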

You can multiply the full power of data modeling with the full power of unit testing and test driven development. It kicks ass.

Core Data: Generating an interface for an entity

The new Core Data framework and Xcode 2 modeling tools in Tiger are an extremely powerful way to develop great end-user applications quickly. You can even easily generate a human interface for your application that will let you work with its data model with little to no code.

To generate an interface, create an empty window in Interface Builder and make sure you’ll be able to see it with Xcode in front. Switch to your model in Xcode. Then just option-drag the entity you want an interface for into your window. Interface Builder will ask you whether you want an interface for one or many instances of that entity, and then generate a basic form-style human interface for all of the attributes and to-one relationships in that entity.

This generated interface isn’t a special monolithic “NSCoreDataControl” or anything of the sort. It’s composed of standard Cocoa controls that are wired to standard Cocoa controllers via bindings. If your nib file’s owner is set to be an instance of NSPersistentDocument or a subclass, Interface Builder will even bind the controllers’ managed object contexts to the document’s.

If you just want to create controllers rather than full interfaces, or if you want to update the controllers in your nib file with the latest definition of your entity, drag the entity from your model straight to your nib’s document window. (That’s the one with the tabs for classes and instances etc.)

Note that none of this, *none of this*, requires generating or writing code. You can create a new Core Data Document-based Application from the project template, create a data model for it in Xcode, create an interface for it in Interface Builder, and then build and run it. You can create, save, load, and manipulate documents, undo and redo changes, and avoid saving invalid data, all with no code.