July 20, 2005

self = [stupid init];

OK, as these things tend to, my last post caused a flurry of controversy amongst a select few, who felt that the more traditional method of writing -init methods, e.g.:

Traditional -init
- (id)init;
{
    if ((self = [super init]) == nil)
        return nil;

    [...initialize my stuff...]
    return self;
}

Is somehow better than my recommended:

Wil's -init
- (id)init;
{
    if (![super init])
        return nil;

    [...initialize my stuff...]
    return self;
}

I kept posting reasons why my way was valid, and posters kept arguing, so I threw down the gauntlet and offered $20 to anyone who could come up with one Cocoa class that you can subclass and which won't be initialized correctly unless you use "self = [super init]" in your subclass's -init method.

And, this week, we have a winner! Well, sort of. We have someone who did such a good job researching this that I agreed to give him the money, even though he didn't disprove my point, although he did point out some interesting gotchas in Cocoa.

Ken Ferry was intrepid enough to write a program that dynamically runs through every class and subclasses them, instantiates the subclasses, and sees if [super init] doesn't return self.

Which, in fact, several classes don't always do. Mr. Ferry mentions that NSColorPanel, NSFontManager, and NSDocumentController all only have a single instance, so the second time you call -init on them, you get the first object again.

However, this wasn't actually the question. Because if you allocate a second NSColorPanel, you've done wrong. Your code is not correct. FURTHER, if you just blithely go on with your init method after the superclass has returned a different object, you are likely to leak objects at best and crash at worst. That is, imagine I have the following code:

Subclassing NSColorPanel the Wrong Way
- (id)init;
{
    if ((self = [super init]) == nil)
        return nil;

    myColorList = [[isa _generateColorList] retain];
    return self;
}

Well, the second time you init your subclass of NSColorPanel, you're going to overwrite the original value for myColorList with a new value, and leak the original value. You can probably imagine how to have even worse effects.
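
To make the leak concrete, here's a minimal sketch, assuming a made-up single-instance class called UniquePanel standing in for NSColorPanel (I'm not claiming this is how NSColorPanel is actually implemented). BlitheSubpanel, its setupCount counter, and the ExtraSetupRunsOnSecondInit() helper are all hypothetical names invented for the illustration, and the code assumes manual retain/release, like everything else in this post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for a single-instance class such as NSColorPanel.
// Manual reference counting: compile without ARC.
@interface UniquePanel : NSObject
@end

static UniquePanel *sharedPanel = nil;

@implementation UniquePanel
- (id)init
{
    if (sharedPanel != nil) {
        // Someone allocated a second instance: discard it, return the first.
        if (self != sharedPanel)
            [self release];
        return [sharedPanel retain];
    }
    if ((self = [super init]) == nil)
        return nil;
    sharedPanel = [self retain];   // keep the unique instance alive
    return self;
}
@end

// Hypothetical subclass that blithely reassigns self and carries on.
@interface BlitheSubpanel : UniquePanel
{
    NSMutableArray *colorList;
    unsigned setupCount;
}
- (unsigned)setupCount;
@end

@implementation BlitheSubpanel
- (id)init
{
    if ((self = [super init]) == nil)
        return nil;
    colorList = [[NSMutableArray alloc] init];   // second time: the old list leaks
    setupCount++;                                // counts how often setup ran
    return self;
}
- (unsigned)setupCount { return setupCount; }
@end

// Returns how many extra times setup ran on the shared object
// when a second "instance" was requested.
static unsigned ExtraSetupRunsOnSecondInit(void)
{
    BlitheSubpanel *first = [[BlitheSubpanel alloc] init];
    unsigned before = [first setupCount];
    BlitheSubpanel *second = [[BlitheSubpanel alloc] init];
    unsigned extra = (first == second) ? [second setupCount] - before : 0;
    [first release];
    [second release];
    return extra;
}
```

In this sketch both pointers come back identical, and ExtraSetupRunsOnSecondInit() reports one extra setup pass on the same object, which is exactly where the previous colorList gets orphaned.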

What you need to do with these classes is detect that you've had a different value returned, and then handle that case. You can't just blithely reset 'self.' I would recommend raising an exception if you've attempted to create two of a unique class, so you can catch this and fix your dang code, but if you're subclassing a class where Apple's machinery requires their (dubious) silently-replace-with-unique-instance behavior, at least you should do something like:

Subclassing NSColorPanel the Right Way
- (id)init;
{
    id superInitReturn = [super init];
    if (!superInitReturn || self != superInitReturn)
        return nil;

    myColorList = [[isa _generateColorList] retain];
    return self;
}

Mr. Ferry also pointed out that NSIndexPath uniques paths, so in this example:

NSIndexPath uniquing
unsigned indexes[3] = {1,2,3};
NSIndexPath *indexPath1 = [[NSIndexPath alloc] initWithIndexes:indexes length:3];
NSIndexPath *indexPath2 = [[NSIndexPath alloc] initWithIndexes:indexes length:3];

indexPath1 and indexPath2 are the exact same pointer. However, this is essentially the same case as above: you do not necessarily want to re-run your -init method on the same object twice, so you have to detect if you've been uniqued after a [super init], not simply re-assign self.

So, my original statement was, "If you write code that says, 'self = [super init]', YOU DONE WRONG." This still stands. Mr. Ferry showed us some cases where you have to be aware of what your superclass is doing, and you have to handle some classes in a special way. And these cases underscore the point I made in the discussion after my last blog post: don't write your subclasses without knowing how your superclass works. Don't pretend you can write some platonic ideal class whose superclass doesn't matter. It does matter. It effects what valid variables can be named. It effects what method names you can use. And it effects how (and even if) you'll write your - init method. Heck, sometimes you don't even want to subclass - init; you'll want to call the designated initializer for the superclass, which by convention is usually the longest - init... method, and - init is, by definition, always the shortest.
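
Here's a small sketch of that last point about designated initializers. Shelf and SortedShelf are hypothetical classes made up for the illustration, and the code assumes manual retain/release. The subclass overrides the superclass's designated initializer, -initWithName:, and never touches -init at all; the inherited -init still reaches the subclass code because it funnels through the designated initializer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical classes for illustration; manual reference counting.
@interface Shelf : NSObject
{
    NSString *name;
}
- (id)initWithName:(NSString *)aName;   // designated initializer
- (NSString *)name;
@end

@implementation Shelf
- (id)init
{
    // The short initializer funnels into the designated one.
    return [self initWithName:@"Untitled"];
}
- (id)initWithName:(NSString *)aName
{
    if (!(self = [super init]))
        return nil;
    name = [aName copy];
    return self;
}
- (void)dealloc
{
    [name release];
    [super dealloc];
}
- (NSString *)name { return name; }
@end

@interface SortedShelf : Shelf
{
    BOOL ascending;
}
- (BOOL)isAscending;
@end

@implementation SortedShelf
// Override the superclass's designated initializer, not -init.
- (id)initWithName:(NSString *)aName
{
    if (!(self = [super initWithName:aName]))
        return nil;
    ascending = YES;
    return self;
}
- (BOOL)isAscending { return ascending; }
@end
```

With these definitions, [[SortedShelf alloc] init] runs Shelf's -init, which dispatches to SortedShelf's -initWithName:, so both the name and the ascending flag get set without SortedShelf ever implementing -init.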

Note: Mr. Ferry asked if he could exchange his cash prize (which totalled $100, since I said I'd give $20 for each instance found) for a job interview at Delicious Monster, which I will happily give him.

Update April, 2009:

There have been hints from Apple that they might modify the standard -[NSObject init] method to try to re-use old objects' memory, since it turns out that a very common usage pattern is for programs to keep creating and deallocating, say, 12 objects of the same class, over and over. Re-using the exact same memory ends up being a big win (and this is a trick the iPhone already does with its UITableViewCell class, and that is a HUGE win if you do it yourself on the iPhone).

So, from now on, I recommend everyone use:

Subclassing NSColorPanel the Right Way
- (id)init;
{
    if (!(self = [super init]))
        return nil;

    // other stuff
    return self;
}

I do.
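
For the curious, here's a rough sketch of why the assignment matters once -init is allowed to hand back recycled memory. Everything in it is an assumption for illustration: the Recycled base class, its -retire method, the one-slot pool, and the Cell subclass are invented names, and none of this is Apple's actual implementation. It assumes manual retain/release.

```objc
#import <Foundation/Foundation.h>

// Hypothetical base class whose -init may return a recycled instance.
// Manual reference counting; not how NSObject actually behaves.
@interface Recycled : NSObject
- (void)retire;   // hypothetical: park this instance for later reuse
@end

static Recycled *spare = nil;   // one-slot "pool", enough for a sketch

@implementation Recycled
- (id)init
{
    if (spare != nil && [spare isMemberOfClass:[self class]]) {
        Recycled *reused = spare;   // the pool already owns it (+1)
        spare = nil;
        [self release];             // throw away the fresh allocation
        return reused;              // ownership passes to the caller
    }
    return [super init];
}
- (void)retire
{
    if (spare == nil)
        spare = [self retain];
}
@end

@interface Cell : Recycled
{
    int generation;
}
- (int)generation;
@end

static int nextGeneration = 0;

@implementation Cell
- (id)init
{
    if (!(self = [super init]))
        return nil;
    generation = ++nextGeneration;   // written to whichever object super returned
    return self;
}
- (int)generation { return generation; }
@end

// Creates a cell, retires it, then creates another and reports whether
// the second init returned the recycled object, freshly re-initialized.
static BOOL SecondInitReusedFirstObject(void)
{
    Cell *first = [[Cell alloc] init];
    [first retire];
    [first release];
    Cell *second = [[Cell alloc] init];
    BOOL reused = (second == first) && ([second generation] == nextGeneration);
    [second release];
    return reused;
}
```

In this sketch, if Cell's -init used the plain "if (![super init])" check instead, self would still point at the fresh allocation that Recycled just released, so the write to generation would land in freed memory while the recycled object went back to the caller with its old value.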

36 Comments:

Anonymous Anonymous said...

Being that you've been in the Cocoa game so long, could you elaborate as to how then the self= [super init] myth got started and how it managed to get itself into very high profile books, confusing the rest of us?

July 20, 2005 2:18 AM

 
Blogger Francisco said...

This comment has been removed by a blog administrator.

July 20, 2005 2:19 AM

 
Blogger Francisco said...

Sorry about that, double post.

July 20, 2005 2:20 AM

 
Anonymous Anonymous said...

Also, something else that all the "Learn Yourself Cocoa" books seem to teach is


@interface .... : NSObject
....
@end


@implementation ....

- (id) init {
    [super init];
    ....
    return self;
}

...
@end

So the message to super is -[NSObject init], which is (very reasonably) documented "the version of the init method defined in the NSObject class does no initialization; it simply returns self."

I've heard ObjC optimisation freaks say that, under these circumstances, omitting the message to super is a good idea.
I'm sure the freakiest of those freaks will go so far as to insist it's essential.
What do we think?

July 20, 2005 4:06 AM

 
Anonymous Damien said...

On a different note here. Why does everyone use:
id anObject = [[SomeClass alloc] init];
Instead of
id anObject = [SomeClass new];

Less code is better code right?

July 20, 2005 4:10 AM

 
Anonymous Chris said...

Congratulations!
Ken, for your new job and
Wil, for a new employee!

July 20, 2005 4:26 AM

 
Anonymous cesar said...

Cool! an interview with delicious Monster!! what a cool prize!!! :)

will you blog about the interview?

July 20, 2005 6:48 AM

 
Anonymous Anonymous said...

Wil:

About super. I think you're right and you're wrong.

You're right:

You don't have to subclass without knowledge about your superclass.

You're wrong:

But, you forgot one key thing... Mainly, that Apple has changed (and will continue to change) Cocoa behind your back. You don't know what the superclass will look like in the future. You don't know if or when Apple will change your super, so it is best to write the code for the general case when you don't know when it will change.

July 20, 2005 8:02 AM

 
Anonymous cjwl said...

anonymous said:

Being that you've been in the Cocoa game so long, could you elaborate as to how then the self= [super init] myth got started and how it managed to get itself into very high profile books, confusing the rest of us?

I have always blamed this post, which is burned into my brain as something wrong, for it:

http://groups-beta.google.com/group/comp.sys.next.programmer/msg/f20e6973710c7450?dmode=source

Scott Hess was a respected NEXTSTEP programmer, and his template got a following. I've always thought it was wrong, never used it, never had a problem with not using it.

I completely agree with Wil, it's stupid, it's even worse that Apple promotes its use in the NSObject docs, it's just out of control.

I think it came from use of the early Foundation/AppKit implementations, the behaviour wasn't set in stone and people were trying to switch from the old NX ways of doing things, so there was some misuse, for some reason self=[super init] was concocted and people started using it blindly because they didn't really understand what was going on, so it made them feel a little safer by using it.

Wil, thanks for pushing the topic. It's an Objective-C eyesore that needs to be removed from common use. Hopefully Apple is watching.

July 20, 2005 9:15 AM

 
Anonymous initgraf said...

Wil,

Did you say which classes don't return the expected self pointer? You mention NSColorPanel, NSFontManager, and NSDocumentController, but there should be 5 classes (if you were giving $20/class and Ken won $100).

July 20, 2005 10:52 AM

 
Anonymous Anonymous said...

From the blog: Don't pretend you can write some platonic ideal class whose superclass doesn't matter.

By that account, why do you write if (![super init]) and not just [super init] - or even nothing, when you subclass NSObject ? In the vast majority of cases that would be sufficient by the Foundation documentation.

Probably the best reason for you to keep on coding the Apple standard is to ensure yourself from lapses in your own reasoning.

July 20, 2005 12:16 PM

 
Anonymous Topher said...

Are we going to see Ken's code? It sounds pretty damned cool.

July 20, 2005 1:55 PM

 
Blogger Wil Shipley said...

Anonymous wrote: You don't know what the superclass will look like in the future. You don't know if or when Apple will change your super, so it is best to write the code for the general case when you don't know when it will change.

Argh! My point was and is that there is no logical situation in which self = [super init] helps you. Not now, not in the future. Not the way the Obj-C runtime works.

Because, if a different object is returned, you need to handle that case specially. I never said "Apple won't ever return a different object." I just said, "If they do, you need to do something different than blithely assign that to self."

July 20, 2005 2:44 PM

 
Blogger Wil Shipley said...

Probably the best reason for you to keep on coding the Apple standard is to ensure yourself from lapses in your own reasoning.

You're right that I leave that stub code in so I can easily move my classes around and they'll work in the common case without my having to think about it. I'm a bad person! I apologize!

July 20, 2005 2:47 PM

 
Anonymous Anonymous said...

This example ignores class clusters which pretty well always return a different pointer from init than they originally returned from alloc.

Also, a small style issue: functions with multiple returns are a little hard to work with. It is usually safer (and easier to understand the flow when you return to the code) to just re-use self:
- (id)init
{
    self = [super init];
    if (nil != self)
    {
        //initialize
    }
    return self;
}

But that isn't exactly the issue at hand.

Of course, an OO purist would argue that sub-classing any base class that doesn't already come with documentation describing how you are supposed to do it is already a recipe for disaster. That is why containment and delegation are usually preferred over inheritance.

A question for you regarding your approach: how does not assigning self get around the problem of potentially crashing or (even worse) over-writing memory re-allocated for someone else if you are in init with a stale self pointer?

July 20, 2005 9:17 PM

 
Anonymous ken said...

Quoth initgraf:

Did you say which classes don't return the expected self pointer? You mention NSColorPanel, NSFontManager, and NSDocumentController, but there should be 5 classes.

NSColorPanel, NSFontManager, NSDocumentController, NSIndexPath and NSJavaVirtualMachine are 5 classes where [super init] doesn't always return the receiver. (The last is in the JavaVM framework.)

Quoth topher:

Are we going to see Ken's code? It sounds pretty damned cool.

Sure! It requires some explanation, so I'll post an email I sent Wil (slightly revised). Grab it while it's hot, because I'm not going to leave this online for eternity.

July 20, 2005 10:13 PM

 
Anonymous ken said...

Quoth anonymous:

This example ignores class clusters which pretty well always return a different pointer from init than they originally returned from alloc.

Careful - it's true that if instanceAfterAlloc = [NSNumber alloc] and instanceAfterInit = [instanceAfterAlloc init], then instanceAfterAlloc != instanceAfterInit.

However, if MyNumber is a custom subclass of NSNumber, then [MyNumber alloc] is an instance of MyNumber, and if you call [super init] in -[MyNumber init], then you will get back the receiver, self.

July 20, 2005 10:25 PM

 
Anonymous ken said...

..okay, NSNumber is a confusing example because [[NSNumber alloc] init] is nil. Substitute NSSet for NSNumber in the comment above.

July 20, 2005 10:47 PM

 
Anonymous Jeff said...

As others have said, class clusters are a case in which [super init] returns something other than self. (For typical class clusters, +alloc returns a shared singleton, and the real allocation happens in -init. These gymnastics are a consequence of the style of separating allocation and initialization, rather than a +new type of approach.) In a situation such as this, if you don't return what was returned by -init, then you're of course in trouble. The reason to actually assign to self is because in the case of very vanilla code such as this:

- (id)init
{
    if(( self = [super init] ))
    {
        _myInstanceVariable = 7;
    }
    return self;
}

If you didn't assign to self, then you'd be initializing the instance variables of the wrong instance. That is, code such as this:

- (id)init
{
    id superReturn = [super init];
    if( superReturn )
    {
        _myInstanceVariable = 7;
    }
    return superReturn;
}

is actually initializing a different instance than the one returned, of course.

But the real origins are actually older. In the early days of ObjC, +new style object creation was in vogue, and your instance initialization happened in class methods rather than in instance methods. Here's an example of how this was used, from Timothy Budd's "An Intro. to OO Programming" book:

@implementation Card

+ suit:(int)s rank:(int)r
{
    self = [Card new];
    suit = s;
    rank = r;
    return self;
}

@end

Here, assigning to self allowed you to subsequently initialize instance variables of your new instance via straightforward assignment. It seems peculiar, but I expect that was the origin of the current style. I'm not sure if this code will work today.

But it's important to keep in mind that the "self" variable isn't magic--it's merely a local variable, and assigning to it doesn't have non-local consequences. The main consequence it does have is on subsequent ivar-access, which is relative to the value of self at the time of the ivar assignment. So "self = [super init]" allows you to proceed with your instance variable initialization in the same way whether or not the call to super returned a different instance. (I'm referring to -init methods here, not +new methods.)

Of course you can't hope to subclass without having some documentation as to whether a class was designed to be subclassed (and which methods must be overridden and which ones shouldn't be). But, I'd consider "super init returns a different instance" to be an implementation detail--not the sort of thing that a subclasser should need to know. Assigning to self guards against implementation details such as this, and in a case where a class was not meant to be subclassed (such as NSColorPanel) you're in trouble no matter what.

July 20, 2005 11:14 PM

 
Anonymous ken said...

quoth Jeff:

As others have said, class clusters are a case in which [super init] returns something other than self.

Class clusters are not an example of what we're talking about. If MySet is a custom subclass of NSSet, then calling [super init] in -[MySet init] will return self.

July 20, 2005 11:32 PM

 
Anonymous Dominik Wagner said...

here you go

- (id)initWithCapacity:(unsigned)capacity

Initializes a newly allocated NSMutableString. The capacity argument represents characters and is used as a hint for how much memory to allocate. Returns an initialized object, which might be different than the original receiver.

July 21, 2005 3:23 AM

 
Anonymous Jeff said...

Ah yes, to Ken's point: Class clusters are a bad example--they represent a case in which -init returns something other than what +alloc returned (and demonstrate why it's necessary to do "[[SomeClass alloc] init]" rather than "foo = [SomeClass alloc]; [foo init];"), but that's not exactly the same as the case in which it's a call to [super init] which is returning something different. That's because subclassing a class cluster usually ends up bypassing the special allocation logic in the abstract base class. (For example, [NSString alloc] returns a special singleton object, and you don't get your real string instance until the call to an init method. But that special logic kicks in only if you are calling +alloc directly on the NSString class--it's bypassed for subclasses, so that in those cases +alloc does the "normal" thing and returns a real instance of the subclass.)

I suppose a more direct argument for "self = [super init]" is that -init is defined as returning the initialized instance (not necessarily the same instance on which it was called), so if you ignore the value returned by -init, there's an opportunity for trouble. You don't necessarily have to assign that value to self, but doing so keeps the subsequent code "normal" looking. Ignoring the returned value is similar to doing "foo = [SomeClass alloc]; [foo init];", though in practice causes problems less often. Assigning to self is just a bit of defensive programming--it doesn't always make a difference.

July 21, 2005 8:46 AM

 
Anonymous cjwl said...

I suppose a more direct argument for "self = [super init]" is that -init is defined as returning the initialized instance (not necessarily the same instance on which it was called), so if you ignore the value returned by -init, there's an opportunity for trouble.

There is a big difference between [someObject init] returning a different object than someObject and [super init] returning a different object than self. Just because one does, doesn't mean you should code for it in the other case.

You don't necessarily have to assign that value to self, but doing so keeps the subsequent code "normal" looking.

By that argument, every method should start with:

_cmd=@selector(methodName);

Because it is entirely possible for _cmd to be NULL if someone calls your code using the implementation functions directly.

Not having _cmd assigned will screw up the NSAssert*() macros.

Of course that seems pretty stupid. So is assigning self=[super init].

July 21, 2005 10:39 AM

 
Anonymous Anonymous said...

You're right that I leave that stub code in so I can easily move my classes around and they'll work in the common case without my having to think about it.

So now you argue that you don't want to think about it. A while ago you said

Don't pretend you can write some platonic ideal class whose superclass doesn't matter..

That made more sense.

July 21, 2005 12:29 PM

 
Blogger Mike Lee said...

This comment has been removed by a blog administrator.

July 21, 2005 2:19 PM

 
Blogger Mike Lee said...

So, moving on... why don't you show us how to handle clicks and drags without ever leaving mousedown?

July 21, 2005 2:22 PM

 
Anonymous Curious Monkey said...

I heard that "it's the responsibility of the init method to release the object if it's going to return nil." Is there really a need for this or is that just crazy talk?

They provided the example code:
- (id) init
{
    if (nil != (self = [super init]))
    {
        if (...whatever you check...)
        { ...init variables & stuff... }
        else
        {
            [self release];
            self = nil;
        }
    }
    return self;
}

July 21, 2005 10:31 PM

 
Anonymous Anonymous said...

I like the blog, I like the software. I write way too much Cocoa for my liking. But if you are arguing over points that just don't seem to ever matter... find something better to do.

June 16, 2006 8:46 PM

 
Anonymous Anonymous said...

A tad late, but this came to me while reading the argument:

if ([super init] != self) return nil;

This will catch the case of [super init] returning nil – unless someone’s been Very Naughty with IMPs and called init with a nil self. It will also catch the case where [super init] returns a different object – although in that case it will leak both objects.

July 26, 2006 5:44 PM

 
Anonymous Jon Hess said...

I read the post, but not the pages of comments.

Here's a case I've used where I realloc self in init.

I sometimes setup a view in IB that will be MyView. I made a custom view add some pop ups and check boxes and what not. Set the custom view's class to "MyView" then I wire up the outlets. In my view's initWithFrame:, I release self, load the nib, and return the view from the nib.

This way, I don't have to write all of the code to layout/setup the subviews.

You could subclass such a class, and it returns a different object from initXXX:...

Jon Hess

September 22, 2006 6:14 PM

 
Anonymous Anonymous said...

Every time you used the word "effects" in your final paragraph (above the Note), you really meant to say "affects."

December 19, 2006 8:54 PM

 
OpenID dwt said...

I wonder why this reusing objects happens in -init as opposed to +alloc - as that means the instance has been allocated anyway, so where is the gain?

Regards,
Martin

May 19, 2009 4:50 AM

 
Blogger Wil Shipley said...

> I wonder why this reusing objects happens in -init as opposed to +alloc - as that means the instance has been allocated anyway, so where is the gain?

+alloc would be cleaner but there may be implementation problems, I don't know.

But you should be able to prove to yourself that re-using in -init would only require allocating n + 1 objects (where 'n' is the maximum you use at any one time), NOT allocating 'm' objects (where 'm' is the total number of objects allocated ever).

-W

May 19, 2009 2:43 PM

 
OpenID dwt said...

> But you should be able to prove to yourself that re-using in -init would only require allocating n + 1 objects (where 'n' is the maximum you use at any one time), NOT allocating 'm' objects (where 'm' is the total number of objects allocated ever).

Yes of course, one needs only n + constant objects - that's the point of caching.

However if after +alloc an instance was already created, and the state would have to be reset anyway (so all the instance variables go back to their default values), then the re-using didn't actually give any benefits, right?

Let's compare the two scenarios:
* +alloc creates a new instance (malloc and everything - possibly very fast because of garbage collection - but still)
* It gets either appended to some internal cache or just released again
* one object gets removed from the class specific cache (thread local and everything probably)
* that is cleaned so its instance variables are all zero again

On the other hand it's just:
* malloc object (possibly very fast with garbage collection)
* return

So I wonder how that should ever be faster - if it always, even in the best of cases has to do way more?

Or else I'm not getting something very basic about this.

May 22, 2009 4:21 PM

 
Blogger Wil Shipley said...

We agree that it'd be faster in +alloc - I don't know why they didn't do it, I assume it occurred to them, and there's a technical reason.

However, there is still SOME win to re-using in -init, which is that it keeps your peak size smaller, and there'd still be a win to allocating and deallocating the same, say, 60 bytes 100x in a row, instead of allocating 100 * 60 and then deallocating 100 * 60 bytes.

-Wil

May 22, 2009 6:29 PM

 
