July 6, 2005

Code Insults, Mark I

Our first "lucky" contestant is up! He's sent me a class whose instances apparently represent files on the disk, possibly to be put into a tree to mirror parts of the filesystem. Then, STUFF can be done to them. None of this really matters for our porpoises (Eeee! Eee!), we're just looking at the way he wrote the code.

We otter dive right in (as a porpoise would):

Before
@interface FileSystemNode (PrivateMethods)
- (BDAlias *) nodeAlias;
- (void) setNodeAlias:(BDAlias *)newVal;

// Keep track of the node mod date
- (void) setNodeModifiedDate:(NSDate*)newDate;
- (NSDate*) nodeModifiedDate;

// Keep track of where the alias last pointed to...
- (BOOL) setNodePath:(NSString*)newVal;

@end;

After
@interface FileSystemNode (Private)
- (BDAlias *)_nodeAlias;
- (void)_setNodeAlias:(BDAlias *)newNodeAlias;
- (NSDate *)_nodeModifiedDate;
- (void)_setNodeModifiedDate:(NSDate *)newModifiedDate;
- (BOOL)_setNodePath:(NSString *)newNodePath;
@end;

Note first that "PrivateMethods" category name becomes "Private". Obviously, these are "methods" — what else would they be? Every word counts.

All private methods gained underscore prefixes. This convention is really handy for telling, at a glance, whether you're calling a public method or not on your code. It aids in deleting methods in a timely manner — whenever you delete a place where you call a method that starts with an underscore, you search for that method in the same file, and if you don't invoke it then you can delete the whole method, as well. Yay! Less code.

The accessors were rearranged so that they are in a consistent order: access, set, access, set, etc. This is important; the brain loves patterns. Redundant comments were removed. I know what -_setNodeModifiedDate: is going to do just by reading the (well-written) method name. Extra text in front of it is HARDER for me to read. Assume I speak code better than I speak English. Assume that if I'm thinking in code terms I don't want to have to keep switching my English parser on and off. Assume I'm familiar with the concept of "accessor" methods if I've programmed Cocoa at all, ever.

If there are side-effects in your set/get accessor methods, document them in the method. If the side-effects are really severe, the accessor method should be retitled as well. Almost never should you have to comment your method definitions; if the titles alone aren't clear enough, YOU ARE ALREADY WRONG. (In this case, they are.) For example, if the method were to, say, set the node path AND remove the old file, then it should be called something like -_removeOldFileAndSetNodePath:.

Finally, spacing was changed to be consistent with my exacting standards (no spaces after braces except in an "if ()" statement) and method parameter names we changed to be more descriptive.

Before
+ (id)nodeWithPath:(NSString *)aPath
{
FileSystemNode* newNode = [[[FileSystemNode alloc] initWithPath:aPath] autorelease];

return newNode;
}

After
+ (id)nodeWithPath:(NSString *)aPath
{
return [[[FileSystemNode alloc] initWithPath:aPath] autorelease];
}

Don't allocate a variable you only use once — every time I see a variable, I expect it to, you know... vary. More lines is worse, not better. I want to be able to glance at the routine and know what it does.

Before
- (id)initWithPath:(NSString *)aPath
{
// We don't support symbolic links because aliases made to them get pointed at the original, not the
// point of the symbolic... this messes up all our hierarchy management with alias resolution, etc.
if ([[NSFileManager defaultManager] pathContentOfSymbolicLinkAtPath:aPath] == nil)
{
if (self = [super init])
{
// Set the initial path (this creates an FSRef, records mod date, etc)
if ([self setNodePath:aPath] == NO)
{
// NSLog(@"FileSystemNode creation failed with path %@.", aPath);
[self release];
self = nil;
}
}
}
else
{
[self release];
self = nil;
}
return self;
}

After
- (id)initWithPath:(NSString *)aPath
{
if (![super init])
return nil;

// We don't support symbolic links because aliases made to them get pointed at the original, not the
point of the symbolic... this messes up all our hierarchy management with alias resolution, etc.
if ([[NSFileManager defaultManager] pathContentOfSymbolicLinkAtPath:aPath] != nil) {
[self release];
return nil;
}
// Set the initial path (this creates an FSRef, records mod date, etc)
if ([self _setNodePath:aPath] == NO) {
[self release];
return nil;
}

return self;
}

Multi-line comments should be collapsed to just one line; let your editor wrap lines for you, it's good at it. There is no magical 80-column limit! If my editor window is narrower than yours, I'll see double-wrapping (as in the left column, above) if you try to manually wrap lines. [Set Xcode to wrap and indent lines in its preferences.]

Don't leave dead logs in your code. Rewrite 'em if you need them again. They junk things up and make code hard to read. Also, they make me think you're not really sure if your code works or not.

Don't indent and indent and indent for the main flow of the method. This is huge. Most people learn the exact opposite way from what's really proper — they test for a correct condition, and if it's true, they continue with the real code inside the "if".

What you should really do is write "if" statements that check for improper conditions, and if you find them, bail. This cleans your code immensely, in two important ways: (a) the main, normal execution path is all at the top level, so if the programmer is just trying to get a feel for the routine, all she needs to read is the top level statements, instead of trying to trace through indention levels figuring out what the "normal" case is, and (b) it puts the "bail" code right next to the correctness check, which is good because the "bail" code is usually very short and belongs with the correctness check.

When you plan out a method in your head, you're thinking, "I should do blank, and if blank fails I bail, but if not I go on to do foo, and if foo fails I should bail, but if not i should do bar, and if that fails I should bail, otherwise I succeed," but the way most people write it is, "I should do blank, and if that's good I should do foo, and if that's good I should do do bar, but if blank was bad I should bail, and if foo was bad I should bail, and if bar was bad I should bail, otherwise I succeed." You've spread your thinking out: why are we mentioning blank again after we went on to foo and bar? We're SO DONE with blank. It's SO two statements ago.

Don't assign "self = [super init]" in your init statements, ever. If you ever type "self =" YOU'VE DONE WRONG. I know there's some book out there that says to do it. It's wrong as well, so there. 'self' is assigned by the Obj-C machinery when -init is called. Re-assigning it should really be a compiler error, but it wasn't made so because back in 1989, when we started all this, we were using +new instead of +alloc, -init, and +new both allocated the memory AND initialized it, and so we assigned 'self' by hand in our +new methods. Nowadays, 'self' is set for you, and it's ugly and potentially hazardous and redundant to set it again.'

Don't do anything before calling "[super init]" unless you have a really, REALLY good reason. In this case, the tiny optimization of the unlikely case of the file being a symlink is not a good enough reason, because [super init] is SO cheap compared to everything else that it really doesn't matter. I expect the first two lines and the last line of every -init... method to be the same unless something REALLY different is going on. Don't disappoint me. It makes kittens cry.

I do like the way the coder calls his own -_setNodePath: to set the path instead of writing the code AGAIN in -init. Kudos.

Before
- (void) setNodeAlias:(BDAlias *)newVal
{
[newVal retain];
[_nodeAlias autorelease];
_nodeAlias = newVal;
}

[... other methods were here ...]

- (BDAlias *) nodeAlias
{
return _nodeAlias;
}

After
@implementation FileSystemNode (Private)

- (BDAlias *)_nodeAlias;
{
return _nodeAlias;
}

- (void)_setNodeAlias:(BDAlias *)newNodeAlias;
{
if (_nodeAlias == newNodeAlias)
return;
[_nodeAlias release];
_nodeAlias = [newNodeAlias retain];
}

@end

If you declare your "Private" methods to be in a category, put 'em in a category! In that way, if you miss some, you'll get a nice compiler warning. Don't mix 'em in with your public methods! That's just ugly.

End the definition lines on your method implementations with a semicolon, so you can copy-n-paste them to or from your header (or the "Private" category at the top of your file) as needed when they change. Semicolons are required in the "interface" section, but don't hurt anything in the "implementation" section.

Group accessors together. -_nodeAlias should be right next to -_setNodeAlias:. Purr.

Ok, now some people are going to call me a hypocrite here, because my _setNodeAlias: method is ONE LINE LONGER than his. Well, it's the exception that makes the rule, smart guy. In this case, I happen to know three things:

(1) These -_set...: methods tend to be the primitives upon which everything else is based, and so they can get called a LOT. It's not uncommon to have tens of thousands called for a single user action in real applications. Now, if you use the old (and once-recommended) -autorelease, -retain technique for -set...: methods, you will end up with TENS OF THOUSANDS of dead objects sitting in your autorelease pool. Remember that all these objects are using up memory, and it won't be freed until the autorelease pool is popped, which is at the end of the current event (unless you've made your own, inner pool, which is often a good idea inside big operations). Remember also that allocating and releasing memory is expensive.

Now, if you do it my way, these objects never hit the autorelease pool. They're deallocated immediately, so their memory can be re-used immediately. This keeps your stack size nice and small, and you expand your VM like a blowfish with a water allergy.

(2) These -_set...: methods often register their undo contrapositives, as they otter. Your typical -set...: method should look like this:

Typical -set with undo
- (void)_setNodeAlias:(BDAlias *)newNodeAlias;
{
if (_nodeAlias == newNodeAlias)
return;
[[[self undoManager] prepareWithInvocationTarget:self] _setNodeAlias:_nodeAlias];
[_nodeAlias release];
_nodeAlias = [newNodeAlias retain];
}

Now, it doesn't take a genius to see that it's going to be a TON more efficient to only register undo events when the 'newNodeAlias' is actually different.

(3) -set...: methods often have side-effects, and those can be expensive. See point #2.

Ok, you may be saying, "How often am I really going to call -set...: with the value it already has?" The answer is: a lot. It happens a lot. There are lots of reasons for this, but it boils down to this simple rule: it happens a whole lot in code. This is a good place to nip off this redundancy — don't try to handle it at a higher or lower level. Handle it right here.

Note that the -autorelease, -retain way of doing -set...: methods was once recommended by Apple (or NeXT — I can't remember), and may still be recommended by books. Ignore that crap.

Before
- (NSComparisonResult) sortByUserVisibleName:(id)otherObject
{
// Just compare the user visible names
return [[self userVisibleName] caseInsensitiveCompare:[otherObject userVisibleName]];
}

After
- (NSComparisonResult)compareUserVisibleNameToObject:(FileSystemNode *)otherFileSystemNode;
{
NSString *ourName = [self userVisibleName];
NSString *otherName = [otherObject userVisibleName];
if (ourName != nil && otherName != nil)
return [ourName caseInsensitiveCompare:otherName];
else if (ourName == nil && otherName == nil)
return NSOrderedSame;
else if (ourName == nil)
return NSOrderedAscending;
else
return NSOrderedDescending;
}

Note the method name change. This method doesn't sort! It compares. Name it to match other -compare...: methods, like, say, NSString's -compare:.

I deleted the comment. Seriously, I can figure out this method compares the visible names. Especially now that I named the method "compareVisibleNameToObject:". My mom could figure that one out.

I handle the case for -userVisibleName being 'nil' explicitly, because it's much safer — -caseInsensitiveCompare: would otherwise raise an exception if 'otherName' is nil, which is essentially like crashing, and if 'ourName' were nil it would return NSOrderedSame no matter what 'otherName' was. Note that if you know that -userVisibleName can NEVER return 'nil' (like, it'd be a program error if it did so) then his method was really written correctly; I just wrote mine to show what you'd do if there's a possibility you'll get a 'nil'. Note that if you can't ever have a nil value for the userVisibleName and it doesn't change, I urge you to not have a -setUserVisibleName: method, and instead pass it as an argument to your -init...: method, so there's no way to create one of these objects without a name.

I do like that the programmer called his own accessor here instead of accessing his ivar directly. My rule is, if you have an accessor method, it's good to use it internally in the class, because there may be side effects in the accessor method. (If you don't have an accessor method, I recommend against using it, because your app will not compile or run.)

--

Now, as an exercise to the reader, I'll paste in another method from this file, which you should now have the power to write correctly. Go for it! Make it beautiful! You can do it!

Before, do-it-yourself
- (NSString*) nodePathResolvingAliases
{
// Assume we will get the path of the base FSRef
NSString* finalPath = nil;

// If it's not a dir, it might be an alias
if (_isDir == NO)
{
BOOL isAlias;
BOOL isDir;

if ((FSIsAliasFile(&_nodeFSRef, (Boolean *)&isAlias, (Boolean *)&isDir) == noErr)
&& (isAlias == YES))
{
BOOL isAliasToDir;
FSRef thisFileRef = _nodeFSRef;

// ¥[Initials]¥ should cache alias resolution?
if (FSResolveAliasFileWithMountFlags(&thisFileRef, YES, (Boolean *)&isAliasToDir,
(Boolean *)&isAlias, kResolveAliasFileNoUI) == noErr)
{
UInt8 pathChars[PATH_MAX];

if (FSRefMakePath(&thisFileRef, pathChars, PATH_MAX) == noErr)
{
finalPath = [[NSString stringWithCString:(const char *)pathChars]
stringByStandardizingPath];
}
}
}
}

// If we didn't get an alias resolution, just use the base path
if (finalPath == nil)
{
finalPath = [self nodePath];
}

return finalPath;
}

Special thanks to my guinea pig this time, who was first to send me code. Thanks for being a good sport and not coming after me with a hunting rifle. I thought there was a lot to like in this code; in particular the author clearly was following some style guidelines, they just weren't mine. Still, big ups for trying to be clean.

Labels:

95 Comments:

Anonymous WombatPredator said...

This is really great and informative, Wil. I especially like the part about retain, release and autorelease. There are so many schools of thought on that topic that it's nice to see how you approach the problem, since you've written and sold applications as opposed to books. Street cred is a good thing.

I hope you continue this series, it's really fun.

July 07, 2005 3:02 AM

 
Anonymous Anonymous said...

Make it beautiful!

OK, here is my go at it.

//ps

July 07, 2005 3:30 AM

 
Anonymous Anonymous said...

I used to use underscores in my private methods until I learned that Apple uses this convention themselves and thus suggests you not do so to avoid clashes with private frameworks.

July 07, 2005 3:52 AM

 
Anonymous Anonymous said...

Wil,

I seem to recall the rational behind setting self in init was that alloc might have returned a placeholder object that got reassigned in the case of the super's init (such as when using class clusters.)

Any comments?

July 07, 2005 5:00 AM

 
Anonymous Anonymous said...

If you don't have an accessor method, I recommend against using it, because your app will not compile or run.

Not to be a nit-picker, but it will compile (with a warning) and it will run. It'll throw an exception when the class doesn't respond to the selector in question but that's not the same as not compiling and not running.

July 07, 2005 5:44 AM

 
Anonymous Anonymous said...

a) self = [super init] serves an important role in class clusters, as well as other cases. It's the way to go, especially for designated initializers. Ssay you have an initializer for a fixed-size array: initWithNumberOfSlots:(int). This initializer could deallocated the alloc'ed (placeholder)-object, then return an object of a suitable size. Now you subclass the object: without the assignment to self, you throw away the init'ed object, and you return the placeholder. Bad.
b) autorelease in setBlah:(id)blah:
This doesn't concern the _stack_ at all. Autorelease pools live in dynamically allocated memory, as do all Obj-C-Objects. That's one main difference from C++. Plus, the autorelease pool gets cleared at every runloop tick anyway. You shouldn't care about stuff at this level. I even do my getter methods like -(id)blah { return [[instance retain] autorelease]; }. Defensive programming!

July 07, 2005 5:56 AM

 
Anonymous macFanDave said...

In the improved method,
compareUserVisibleNameToObject:, shouldn't the second line be:

NSString *otherName = [otherFileSystemNode userVisibleName];

instead of:

NSString *otherName = [otherObject userVisibleName]; ?

July 07, 2005 6:01 AM

 
Anonymous Aaron Jacobs said...

I have a couple of questions/comments:

1) About the underscore-prefixed private methods: as anonymous said, I think this is against Apple's guidelines (and for a good reason). It's been awhile since I read them, but I believe they say you should name your private methods like ASD_setNodeAlias:, where ASD is a prefix you choose for your program/company (NS being Apple's prefix).

2) About autoreleasing in accessors - doesn't this serve an important purpose in memory management conventions in Cocoa? Correct me if I'm wrong, but isn't the following in agreement with Cocoa conventions?

NSString *oldName = [object name];
[object setName:newName];
NSLog(@"The old name was: %@, the new name is %@", oldName, newName);

That code may cause the program to crash with your accessor methods. I thought that the convention was that accessor methods return an autoreleased object which is guaranteed to be valid for the rest of the run-loop (or basically the context in which the accessor is called).

3) I also thought that the self = [super init] thing was essential to handle class clusters, singleton objects, and the like. How do you respond to that?

July 07, 2005 6:04 AM

 
Anonymous Anonymous said...

About the nested ifs...

That form can be useful if there is a lot of cleanup code associated with some calls that must be called before returning no matter what.

Otherwise, with multiple failed return points, you often have to reproduce that cleanup code or put it in a different method which you call, and that is not really great.

It shouldn't go to many levels deep. And, of course, it isn't the case with the method given in example so there is no reason for using nested ifs.

July 07, 2005 6:17 AM

 
Anonymous Chris Cowan said...

Excellent website. I'm a PHP developer and a lot of what you said about the comments and writing clear code makes perfect sense. It seems that all the PHP developers out there over comment their code. The point about switching my English parser on and off while reading code is right on! I can read code better then english and I hate reading comments that repeat what the code is doing. I praise you, Syntax Samurai!

July 07, 2005 8:20 AM

 
Anonymous Lon Varscsak said...

I hate to be a fanboi, but I likes what I sees. :D

I regularly correct my developers to do pretty much all of the items listed in this example.

Clean pretty coooddeee<drools>

July 07, 2005 9:13 AM

 
Anonymous Anonymous said...

That self is automatically assigned is not the point; the point is that any 'init' method is permitted to return different object than the receiver.

self = [super init]

accounts for the superclass performing a substitution.

July 07, 2005 9:21 AM

 
Anonymous Anonymous said...

If you prefix your private methods with _, you can accidentally override Apple's private methods. This is Not Fun.

July 07, 2005 9:22 AM

 
Anonymous Anonymous said...

I would agree that unnecessarily adding objects to the autorelease pool is bad. But I thought for a setter the reasoning for retaining the old variable, setting the new value, then releasing the old value is to aid in multithreaded access. Unless I'm wrong about what release really does, your code will result in the ivar being nil for a finite (albeit small) amount of time.

July 07, 2005 11:04 AM

 
Anonymous Anonymous said...

To the previous anonymous:

If you want to be thread safe with accessors, you pretty much have to use locks. It's hard to predict (for me anyway) when a line of C code will break into several, divisible machine instructions and not be thread safe.

See Jon Rentzsch's article at http://www-128.ibm.com/developerworks/library/pa-atom/.

July 07, 2005 11:15 AM

 
Anonymous Anonymous said...

Oh, also, setting and retaining the new value before releasing the old value is _really_ protection against accidentally deallocating the value in question when the new value and the old value are the same. Wil doesn't have to worry about it because he knows, through his test, that the new and old values are not the same object.

Work through this case:

// assume that retain count of [object foo] is 1
[object setFoo:[object foo]];

with accessor

- (void)setFoo:(id)newFoo
{
[foo release];
foo = [newFoo retain];
}

July 07, 2005 11:22 AM

 
Anonymous Daniel Jalkut said...

Wow, thanks Wil!

I am the "original coder" who was lucky enough to be ripped apart here :) I guess since I haven't been too publicly humiliated, I am now willing to fess up.

I sent the class in question to Wil thinking it really wasn't much, and half expected him to just say "there's not enough there to criticize." It just goes to show how much you can observe from even a small sample.

I really appreciate all the comments you made and the time you took to make them. I learned some pretty significant things, both stylistic and technical. I never *knew* I could leave the semicolons on the end of function definitions! Say what?! It will be weird to get into that habit, but I think I might just take you up on that.

I also appreciate the dialog from the comments above. I share some of the skepticism e.g. about the usefulness of assigning to self explicitly, and the debate about the pros and cons of the various accessor idioms. Wil is undeniably wise in these areas but I think it's revealing that so many people have thought hard about these issues and thus have slightly different opinions. The resulting dialog can be nothing but illuminating.

Thanks again!

July 07, 2005 11:31 AM

 
Anonymous Aaron Jacobs said...

So I re-skimmed Apple's "The Objective-C Programming Language", and I guess I was wrong about the memory management convention I mentioned in #2 in my previous comment. Here's what they say:

"Similarly, if you ask for the main sprocket and then send setMainSprocket: you can’t assume that the sprocket you received remains valid."

It seems as if the anonymous poster above was right about the reason for autoreleasing - so you don't accidentally release the object you are about to retain.

July 07, 2005 12:34 PM

 
Anonymous Marcus S. Zarra said...

Wil,

Just wanted to pop in and say thanks for spending the effort to write blog entries like this. I am a recent convert from Java and seeing real world examples of how things should work is more valuable than the books I am currently working through.

I hope to see lots more of these!

July 07, 2005 1:13 PM

 
Anonymous Mont Rothstein said...

On autorelease vs. release in accessor methods.

I have to agree with Will here. The amount of calls to accessor methods that can go on between hitting the autorelease pool is *huge*, particularly in enterprise apps.

My rule of thumb is to use release everywhere, unless autorelease is absolutely required. Going back and changing autorelease to release to reduce your memory footprint is a disaster waiting to happen.

To make matters more confusing I think I saw Apple recommending using copy instead of retain in accessors :-( To which all I can say is, stop hiring Java programmers :-P

-Mont

July 07, 2005 1:50 PM

 
Blogger Wil Shipley said...

I admit that the 'self =' construct may be valuable for some types of class clusters, but I don't think so for Apple's. I don't know that I've ever subclassed one, honestly. So, that tells you something about how often that happens.

At any rate, for the built-in Apple class clusters, I'm *pretty sure* that the superclass you asked for is the one you're going to get back when you call [super init]. If it weren't, you'd be kind of upset when you tried to access *your* instance variables, wouldn't you?

That is to say, you've declared your MyArray class to have an ivar called "foo", and so the runtime machinery has made a subclass of NSArray with "foo" as an ivar. Now, if in your -init method for MyArray, you call NSArray's init and it returns and object of type NSSuperSecretSquirrel instead, how are you going to access "foo"?

Conversely, if you call [NSString stringWithBlah:], you don't know what type you're going to get back. But that's not a subclassing case, that's a simple use case.

When subclassing class clusters Apple specifically tells you which primitive methods you have to implement to actually have them function at all. Apple's clusters typically leave the low-level storage up to you, and just have all their convenience methods implemented in terms of those methods in the abstract superclass. In their concrete, hidden classes, they have much more efficient implementations of a lot of the convenience methods (eg, they don't have to call -characterAtIndex: EVERY TIME), but the beauty of the class cluster is you never have to worry about any of that.

July 07, 2005 2:15 PM

 
Blogger Wil Shipley said...

> it will compile (with a warning) and it will run.

Warnings are usually errors in my build systems. Had to turn it off at various points, but it's a good way to go.

I consider "not running" to mean "it'll raise an exception," without really worrying about the mechanics of the default way that exception is caught. There are flags you can set so it'll hang or crash or core dump.

July 07, 2005 2:17 PM

 
Blogger Wil Shipley said...

> Otherwise, with multiple failed return points, you often have to reproduce that cleanup code or put it in a different method which you call, and that is not really great.

I'll often use an inline function inside the method, or sometimes I'll even use a goto. I really hate the nested way.

July 07, 2005 2:19 PM

 
Blogger Wil Shipley said...

> To make matters more confusing I think I saw Apple recommending using copy instead of retain in accessors :-( To which all I can say is, stop hiring Java programmers :-P

-copy is actually more valid. The Apple immutable classes correctly just increment their retain count, and the mutable ones... well, you REALLY want to do a copy to get an immutable object.

I didn't suggest using copy because a lot of custom, user classes don't implement it at all, or correctly.

July 07, 2005 2:21 PM

 
Blogger Wil Shipley said...

> This doesn't concern the _stack_ at all. Autorelease pools live in dynamically allocated memory, as do all Obj-C-Objects.

Yup. The point is, there's a 'stack' of objects in the autorelease pool, and if that stack (not the main, low-level stack, but a DIFFERENT stack) gets really huge each event, then you will have an uglier memory profile and your app will be unnecessarily slow.

I know that the autorelease pool gets cleaned every event. My point is, it's really easy in a real app to have your autorelease pool blow up between events, so you end up allocating a TON of memory and then freeing it, which is one of the slower operations. (Hey, mach pre-allocates swap space on disk for allocated memory!)

If you release instead of autorelease, the objects' memory joins the free pool immediately.

Imagine a program that called, say, "-setFoo:" 100,000x between events. In the autorelease pool, every object you called -setFoo: with is still retained until the END of the event, so you'll need to allocate space for 100,000 Foos, and then you'll need to wait while they're all deallocated at the end of the event.

Using my setFoo: method, you'll only need to allocate memory for 2(!) Foo objects the whole time, instead of 100,000.

July 07, 2005 2:26 PM

 
Anonymous Anonymous said...

NSObject docs say this:

"In some cases, an init method might release the new object and return a substitute. Programs should therefore always use the object returned by init, and not necessarily the one returned by alloc or allocWithZone:, in subsequent code."

Yeah, assigning to self is distasteful, but it's (a) easy to do, (b) more correct. It's a known idiom, so it isn't going to make the code hard to read.

July 07, 2005 2:37 PM

 
Anonymous Daniel Eggert said...

I really enjoyed reading your page until this...

You certainly missed some important points yourself. Are you doing this to test us?

Certainly the leading underscore is bad. Read this: Naming Methods.

Also removing the "FileSystemNode* newNode = ..." and just returning the value, will remove some type checking. Hey, type checking is your friend. Use it when it doesn't clutter your code.

Then the self = [super init]. Leaving out the "self =" is like shooting at your foot and saying: "Hey, I usually miss anyway." Read more here: Memory Management (The Returned Object).

Except for those, many good ideas on your side. Keep up the good work.

/Daniel

July 07, 2005 2:47 PM

 
Blogger Wil Shipley said...

> Certainly the leading underscore is bad. Read this: Naming Methods.

I respectfully disagree. There is some potential for danger. On the other hand, if you give your methods very descriptive names, what are the actual chances? Sure, if it's:

- (int)_index;

You might get screwed. But if it's:

- (unsigned int)_indexOfDraggingStoreItem;

Then, seriously!

There's an argument to be made that if you've added your own private category method and it overrides one of the Apple ones, it sure BETTER do the same thing, or you haven't given it a very good name.

But, hey, if you're worried, I'd use:

_XXdoSomeDangThing;

--

> Also removing the "FileSystemNode* newNode = ..." and just returning the value, will remove some type checking. Hey, type checking is your friend. Use it when it doesn't clutter your code.

Nope. There was no typechecking in the original statement, because -autorelease returns an (id).

It's one of those arguments where you say, "If I have to type in the exact same word twice (in this case the class name), I think there's less chance for mistake."

I think there's more, because if you change it one place (in this case) and don't change it in the other, like such:

Foo *myThing = [[[Bar alloc] init] autorelease];

You would NOT get a warning, and you'd glance at the code and say, "Yup, it's a Foo, not a Bar any more!"

--

> Then the self = [super init]. Leaving out the "self =" is like shooting at your foot and saying: "Hey, I usually miss anyway."

I don't know of any objects that replace themselves, for the reasons I specified above, regarding new variables. Please tell me how I could add instance variables if they're returning a different class than the one I allocated?

Note that if you're not SUBCLASSING a class cluster, it is totally true that the object you allocated may not be the one you get back. So, if you call:

NSString *myString = [[NSString alloc] init];

You may get back a different string. That's different from the 'self =' hack.

July 07, 2005 5:57 PM

 
Anonymous Anonymous said...

I have to say, missing the boat on the "self = [super init]" thing (and sounding so confident in the advice about it) has shaken my faith in the almighty Delicious Wil...

July 07, 2005 5:59 PM

 
Anonymous Anonymous said...

So the books say that not assigning a new object to itself will cause the earth to stop spinning, sores to appear on your naughty bits, and a plague of locusts o'er the land, yet I use Wil's selfless software and my naughty bits are fine, so... maybe the books ARE overstating the danger a little bit.

Also, people who post anonymous smacktalk have dirty, dirty sex with sea turtles. This this green-eyed beauty here.

C'mere you!

July 07, 2005 6:45 PM

 
Anonymous Anonymous said...

Wil,

Leading underscores on method and ivar names are an Apple internal coding convention, and it is not recommended for outside developers.

-jcr (No longer Apple's official voice on this matter, but still...)

July 07, 2005 6:57 PM

 
Anonymous Anonymous said...

I never wanted to use underscores on my privates before, but now that you say I can't, I really want to. Oh yeah... that's nice.

July 07, 2005 9:07 PM

 
Anonymous Anonymous said...

Seems Wil just got bitch slapped by jcr!

July 07, 2005 9:09 PM

 
Anonymous Lon Varscsak said...

Apple invented the underscore! They're so cool. ;)

July 07, 2005 9:23 PM

 
Anonymous Zach Wily said...

I've used underscores for ivars and private methods for awhile, and I've always felt slightly bad about it, knowing that I was blatantly defying Apple. However, it's so convenient, it just doesn't seem fair for Apple to have the leading underscores all for themselves!

I'll worry about it even less now.

July 07, 2005 9:39 PM

 
Anonymous Anonymous said...

Wil,

Your site is very difficult to read at 1600x1200 (and upping the font size destroys the layout). Please consider a better font/table format for code snippets.

Grouping accessor methods in pairs greatly aids reading them. Add an empty line between each group of two.

Assigning 'self = [super init]' is always correct where as a lone '[super init]' is not. Your arguments to the contrary are specious.

Naming private methods with an unadorned underscore is against company dogma. Your arguments that it (probably) shouldn't matter are specious.

Convenience constructors should always call '[self class]' instead of hard-coding the class in which they're defined. This greatly facilitates subclassing.

The method name 'compareUserVisibleNameToObject:' is unnecessarily verbose and somewhat misleading since you are not passing in a 'user-visible name' as a parameter. It should be named 'compareUserVisibleName:'.

Making a blanket statement about the evils of 'autorelease' without mentioning that it is required for thread-safe accessors is unconscionable.

Consider recommending a set of macros for setter/getter methods which remove the tedium of writing them and which make it clear which are meant for simple retains, copies, thread-safe accessors, explicit autoreleases, etc.

July 07, 2005 10:07 PM

 
Blogger Wil Shipley said...

Guys, you aren't freaking listening. I'm right about the damn self = [super init] thing being wrong.

Read the dang docs that Daniel Eggert quoted: http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/RuntimeOverview/chapter_4_section_2.html

This is clearly talking about creating objects that are NOT your superclass, when you've ALLOCATED and INITIALIZED the object yourself.

You do NOT allocate the object inside -init! You can replace the object in -init if you want, but then you cannot be reasonably subclassed.

PERIOD! I AM RIGHT! IF YOU ARE GOING TO ASSERT I AM NOT, GIVE SOME FREAKING FACTS AND/OR ADDRESS WHAT I HAVE SAID THREE TIMES NOW: THAT YOU CAN'T SUBCLASS A CLASS THAT REPLACES INSTANCES IN ITS INIT METHOD.

July 07, 2005 11:27 PM

 
Anonymous Anonymous said...

You can replace the object in -init if you want, but then you cannot be reasonably subclassed.

Well, not true. I use object replacement in -init for cacheing (making objects unique based on an initial parameter) objects of the same class. E.g. initWithPath:X might return the same object that was allocated for path X before, and release the newly allocated one.

Such class sure can be subclassed, since you always return objects of the same class.

It is better to be safe, and it doesn't cost much, and it is simple to always have default first line: if( (self = [super init]) == nil ) return nil.

BTW, I hate it when people apply logical operators to things that are not boolean values, such as integers or pointers. It conveys no information about the type of expression. Code should be as self-documenting as possible.

July 08, 2005 12:20 AM

 
Anonymous Ken said...

Private methods are the only place where I really feel the hurt from ObjC's lack of namespaces. Prefixed method names (with anything other than a '_') are distractingly ugly, and there's no compiler warning for an 'accidental' override (damn you computer, read my mind!).

What would really be nice is if we could optionally declare a method to live in a namespace. Or maybe an entire category if that seemed simpler, I don't know.

Non-sucky calling syntax is not totally obvious. Perhaps

@using_namespace FileSystemNodePrivate

could signal that methods from the FileSystemNodePrivate namespace are fair game. Scope could be to the end of the current block, if used in a block, or to the end of the file if not.

I don't know enough about the ObjC runtime to judge whether this'd be reasonable to implement. It might be pretty rough on method lookup caching.

July 08, 2005 2:34 AM

 
Blogger Wil Shipley said...

> Well, not true. I use object replacement in -init for cacheing (making objects unique based on an initial parameter) objects of the same class. E.g. initWithPath:X might return the same object that was allocated for path X before, and release the newly allocated one.

Yes, if you are dealing with an exceptional class, there's a WAY to make it so it makes sense to use "self =".

I continue to maintain that it doesn't make sense in the general case. Even your use case seems pretty contrived -- why would you have the user allocate an object just to throw it away? Why not have a class method that does the uniquing, like:

+ (id)gimmeAUniquifiedThingyWithPath:(NSString *)path;

Certainly it's a lot clearer to the caller what's going on. My basic point is, this kind of swizzling is the exception, not the rule. You should be aware it COULD happen, yes. And if it DOES happen, you should account for it. But it's not GOING to happen very often, so don't write all your code for the 0.01% case.

If I did happen to write a subclass of your stringy-cachy-swappy class, and I didn't call "self =" in my init (which I would not), and the app immediately crashed, I'd figure out in two seconds what was happening and fix it. That'd be that. Should I really modify my coding behavior from now until that fateful day, just to avoid a completely easy-to-fix potential bug if I happen to not read the documentation on a class BEFORE I SUBCLASS IT?

You DO NOT HAVE TO WRITE SUBCLASSES AS IF YOU DON'T KNOW WHAT THE PARENT CLASS IS. In fact, if you do write a subclass this way, IT IS STUPID. I know everyone dreams of the day when we get complete abstract and polymorphic programming languages that are still somehow strongly typed and fast and magic. That day isn't here. It's why we're programming in Objective-C instead of Java; because every time someone tries to abstract too far from the machine, we discover that real-world programs can't be written very well. Objective-C is C. C has survived for thirty-something years now. Every FREAKING year somebody declares that it's dead and something more advanced is taking its place. Hell, I remember when it was Ada. I remember when it was Modula-2. I remember Pascal. I remember 4GLs. C sticks around because it turns out doing things abstractly doesn't work. It just isn't the right metaphor.

Seriously, this convention something I've done for years. It's in so many shipping Cocoa apps it's not even funny. I mean, it's funny like beer, but not like speech.

--

You can make a reasonable argument that you shouldn't use ! on pointers. But I'll mildly disagree -- I think it's an established practice and I honestly think it's a lot easier to read.

There was a period at Omni where we insisted people compare pointers to nil or NULL, but I'm not there any more and I've changed my mind.

C specifically allows arithmetic on pointers. It's valid.

Remember: "NO means NULL".

July 08, 2005 2:51 AM

 
Blogger Wil Shipley said...

JCR:

You know I've been using underscores longer than everyone at Apple except for Ali Ozer and Bertrand Serlet. Those homies tell me to stop, I'll stop. Until then, you N00Bs can step off my namespace, yo!

-W

July 08, 2005 2:54 AM

 
Blogger Wil Shipley said...

Ken:

It'd actually be pretty easy to do some preprocessing on a .m so, when you declare a (Private) category, the compiler first mangles all the names in the category to make sure they are darn unique. Like, it could rename:

@interface MyClass (Private)
- (void)_index;
@end;

to

@interface MyClass (Private)
- (void) _MyClassPrivateStuffSoYouBettaStepBeyotchIndex;
@end

Since private methods are, by definition, only scoped to the current file, the mangling wouldn't affect other parts of the program.

July 08, 2005 2:57 AM

 
Blogger Wil Shipley said...

To PS on his code beautification:

The second poster took up my challenge. I finally got a chance to read his code. It looks clean, although I must point out something that wasn't in my tirade:

Using -stringWithCString: isn't necessarily correct for paths. Use CFStringCreateWithFileSystemRepresentation() or [NSFileManager stringWithFileSystemRepresentation:length:].

That's going to be safer as it will work no matter what the filesystem's directory encoding is.

July 08, 2005 3:07 AM

 
Anonymous Anonymous said...

Wil wrote
> Using -stringWithCString: isn't necessarily correct for paths

Ah, bummer :-) Well, I've not really used CoreFoundation or the non-NS* stuff much, so that probably explains it.

I'll stay out of the accessor and self assignment debates, but further up, Wil wrote
> I'll often use an inline function inside the method, or sometimes I'll even use a goto. I really hate the nested way.

A little tip for those who fear goto:s (or who have been told to fear them), you could do something like this:

BOOL success = NO;
do
{
if (!doSetupStuff())
break;
if (!doSomeMoreSetupStuff())
break;
...
success = YES;
}
while (NO);

if (!success)
tearDown();

Anyway, really interesting blog entry and followup thread... Kudos to Wil for doing this. I'm looking forward to the next installment ;-)

//ps

July 08, 2005 4:44 AM

 
Anonymous Anonymous said...

Oops, I missed the line where Wil wrote [NSFileManager stringWithFileSystemRepresentation:length:]

OK, so I didn't know NSString as well as I thought.

//ps

July 08, 2005 4:48 AM

 
Anonymous Tiny Clanger said...

Seconded on the readability thang - white on black tends to need a heavier font, at least for the Courier, as it has quite thin characters. I'd say setting font-weight: bold; font-size: 120%; ought to do it.

Apart from that, I'm just setting out to learn Cocoa and Objective-C at the moment, so this is stopping me from falling into bad habits. Thanks :)

July 08, 2005 5:31 AM

 
Anonymous nanouk said...

My two cents to contextualize things:

On self = [super init]: This only matters if you are doing weird shyte like adding a concrete subclass to one of Apple's class clusters (e.g., NSArray). Otherwise, just doing [super init] is FINE. Really.

On using underscore for private methods: Again, you only care if you are SUBCLASSING an Apple provided class (and I don't mean NSObject, for any would-be smartypants) Otherwise, your method name is only valid in the scope of your class, i.e., [MySwooptyClass _blah] isn't going to stomp on anything else, unless _blah is something amazingly generic (duh)

On autorelease: Only if really necessary (read: unavoidable) Autoreleasing is EXPENSIVE. Put in the frabjous one-line comparison (if (newFoo != foo)) already and use release. It's good for your karma. Added bonus: you will quickly discover extra-release problems with a usable stack trace that would otherwise be hidden by the autorelease pool.

yours truly,
- another old fart who has being doing all of the above for donkeys years now and never broken nuthin' (yet), ye maggots

July 08, 2005 6:35 AM

 
Anonymous Anonymous said...

Can anyone recommend a good beginners book for programming Cocoa apps? I have little experience in programming, but I would love to get started and making programs for the Mac would be a lot of fun.

Thanks for any advice.

July 08, 2005 9:17 AM

 
Anonymous Lon Varscsak said...

I think the nanouk (and others) hit it on the head: Wil and others (including myself) have been doing it for a very long time, long before Cocoa was a glint in Apple's eye. I think Omni's and now Delicious' legacy of applications is strong statement about how software development should be done and the problems you might or might not encounter.

On self = [super init]: On this one, I honestly don't know, I have never once done it (always use [super init]), and have never once had a problem with it.

On using underscore for private methods: I think that Ken nailed it. Apple has used underscore as "their" namespace, but ultimately they should use a really complex namespace so that I don't have to. :)

On release vs. autorelease: Just this year I jumped on this band wagon. For years I just did autorelease and felt the performance cost of it. I do use a nifty macro that does it for me, so I just do ASSIGN(iVar, transientVar)...works like a champ. You could probably have the macro do locking too for threadsaftey (but threads and I aren't speaking to eachother right now, so I'm not sure). :D

July 08, 2005 9:20 AM

 
Blogger Wil Shipley said...

Anonymous wrote: Making a blanket statement about the evils of 'autorelease' without mentioning that it is required for thread-safe accessors is unconscionable.

Yes, thousands have died, due to my reckless behavior.

I would find it MORE unconscionable if I implied that access methods of the form:

- (id)foo;
{ return [[foo retain] autorelease]; }

Were thread-safe. They are not. You don't know when foo is going to be changed by another thread -- it could get reassigned right after the retain and before the autorelease, and then you'll leak the old foo and over-release the new foo, causing a crash.

I've done tons of thread programming; I was one of the architects of the OmniWeb threaded architecture; and it's still the only threaded web browser I know of. Threading is really hard, and requires tons of locks and tons of design and even then it's really, really dangerous.

July 08, 2005 7:02 PM

 
Anonymous cjwl said...

nanouk wrote:

On self = [super init]: This only matters if you are doing weird shyte like adding a concrete subclass to one of Apple's class clusters (e.g., NSArray). Otherwise, just doing [super init] is FINE. Really.

If you subclass an abstract Apple class, and then do [MyClass alloc], you will get an instance of your class, not a placeholder. The -init* methods in the abstract classes do not return placeholders. If you follow the rules of designated initializers everything will work.

self=[super init] is nonsense in all but rare cases.

July 08, 2005 7:45 PM

 
Anonymous John B said...

Wil, nice post. I see that you pretty much ignore having single return statements. Where I work (we do C++ work), our standard is to always use only one return statement. Same thing with breaks, except when you're using switch ( ).

I think the argument is that for something like a non-returning function, you shouldn't be using return to break out of your routine. If you are, you wrote a poorly designed function. If you're writing a function that returns something, you should have only one statement that does this.

Doing this makes you need to check against a variable you maintain in your function, to see if you can continue executing any following routines. Its also harder to read because you see all these conditional statements, which may make it look complex (the second code snippet in this post is a good example).

The counter argument is that you have code thats easier to read, but it may be harder to debug. You have different exit points.

I think you can apply this argument for loops too.

What's you point view on this. Do you have a guideline you work with? Trying to maintain a singular point of exit in a function can be a pain, and if it really comes down to designing, I think it can be ignored with these really fast cpu's. Is this just an issue for critical apps, or should less critical apps be subject to it as well?

I've tried reading a lot about this, and basically it seems like you can design your functions or methods to encapsulate routines that do "work," and ones that control which branches your application take. However, you run into function/method explosion.

Any ideas?

July 08, 2005 8:45 PM

 
Anonymous perl freak said...

being able to read your own code? pansies. vive obfuscacion!

July 08, 2005 9:57 PM

 
Blogger Wil Shipley said...

John B:

I can't imagine what justification anyone would have for only wanting a single point of return from a method/function/whatever. There may be some pure math or logic or whatever theory behind this, but I think basically it's a completely bogus notion.

It's a zillion times easier to understand a method (function, whatever) in terms of "Do something. Am I still on the right course? No? Then bail. Do something else. Am I still on the right course? No? The bail... [etc]" than it is to do all that nesting crap. I've never heard anyone espouse that single-exit is better thing, except possibly back in the days of Modula-2. Remember Nikolas Wirth anyone? No? You know why?

BECAUSE HE WAS WRONG. He was a total "purity of code" guy, on of the people behind the no-goto movement, one of the architects of the strongly-typed language movement, etc. For a couple years we all bought into it. Then we noticed that, still, all the interesting software was being written in C.

The notion that a function in a programming language should be like a function in math has been disproven by the history of real programming. We've got flow control. We should use it.

One return point... yeesh. Can anyone top that for bizarre coding edicts? Anyone not allowed to use capital letters? Anyone not allowed to type the 'e' key in their programs?

July 09, 2005 5:22 AM

 
Anonymous Anonymous said...

wil wrote:
I can't imagine what justification anyone would have for only wanting a single point of return from a method/function/whatever.

Because then you can set a breakpoint on it.
And, if you declare a variable 'returnValue' which you assign, then return ... you can inspect it. [Mind you, I think it's going to have to be called '_returnValue' from now on.]

There may be some pure math or logic or whatever theory behind this, but I think basically it's a completely bogus notion.

We all have to debug production code sometimes.

July 09, 2005 9:01 AM

 
Blogger paul said...

This comment has been removed by a blog administrator.

July 09, 2005 4:45 PM

 
Blogger paul said...

Someone mentioned [NSString stringWithCString:x]. Don't do that. And don't use [x cString] either. These methods are deprecated.

Use [x UTF8String] and [NSString stringWithUTF8String:x] instead.

July 09, 2005 4:46 PM

 
Blogger Troy Phillips said...

Wow - I know C and I thought "how much different could Objective C be". I think I have a lot to learn!

July 09, 2005 6:45 PM

 
Anonymous Greg Titus said...

Because then you can set a breakpoint on it.
And, if you declare a variable 'returnValue' which you assign, then return ... you can inspect it.


This might make some sense if your debugger is incredibly dumb, which debuggers tended to be in the distant past. :-)

However, with gdb, there is no need. You can type "finish" in the Console, or hit the "Step Out" button in the toolbar to complete the current function/method, and gdb saves the return value on its result stack. Type "p $" (for print last result) on the debugging console and you'll see the return value.

July 10, 2005 12:37 PM

 
Anonymous ken said...

Wil said:
I can't imagine what justification anyone would have for only wanting a single point of return from a method/function/whatever.

You know, when you say it like that, it's like a challenge.. < fires up imagination >

Suppose you're writing C function, and you want to write such that a reader could start reading in the middle of the function and understand what's going on (go with it for a minute..).

It's a help, in that case, if the reader can assume that the flow of the function is from top to bottom. Indentation then communicates that a line might not apply in all cases. It also simplifies resource allocation and destruction, because if you allocate at the start of a block, you can deallocate at the end of the block without leaking.

Some arguments against the above:

(1) It's all about the short functions. If a function is only 8 lines long, you'd have to have some serious tunnel vision to avoid noticing an "if (case) then bail" structure.

(2) ObjC has less manual resource allocation than C.

(3) Even in a long function, an "if (case) then bail" at the top is pretty easy to grok.

Troy Phillips said:
Wow - I know C and I thought "how much different could Objective C be". I think I have a lot to learn!

Naw, Obj-C is a tiny extension of C! The heavy traffic is probably because Wil makes a couple of judgment calls, and the issues affecting the calls are easy to understand - see bike shed painting.

It's kind of lame that there _are_ judgement calls in the absolutely most common bits of ObjC code. Next we can discuss [arrayOfStrings valueForKeyPath:@"lowercaseString"] vs. explicit iteration. :-)

July 10, 2005 1:14 PM

 
Anonymous Anonymous said...

What's the point exactly for the underscore convention? Is it only aesthetics, helping reading the code? In that case, why not let _myVar for Apple and use __myVar?

July 10, 2005 3:04 PM

 
Anonymous Anonymous said...

The point for underscore conventions? It's not just "an Apple internal convention", check out the C standard (and the Rationale):

"All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.

All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces."

These names are reserved for the implementor.

"Apple has used underscore as "their" namespace"
"why not let _myVar for Apple and use __myVar?"

Apple used underscore as their namespace because it's reserved precisely for that use. C'mon people, learn at least a little about the language you're using!

July 10, 2005 8:13 PM

 
Anonymous Anonymous said...

Wil,

Regarding underscores: the amount of time that you've been doing something and getting away with it has no bearing on whether it's correct to do so. I used to use leading underscores on my ivars too, until about three years ago, when Ali advised me that it wasn't a good idea.

Regarding assignment of self: This is another case of you getting away with a mistake. -init is not guaranteed to return the same object that is sent the -init message, so self = [super init] is correct, and [super init] by itself is something that works *most* of the time, but will trip you up eventually.

-jcr

July 11, 2005 5:54 PM

 
Anonymous Rosyna said...

Not using self=[super init...] and just using [super init...] is making some gross assumptions about the runtime. The runtime has changed in the past and it will change in the future. What is a real class now may become a cluster in the future and vice versa. That's one of the benefits of Objective-C, the runtime and internals can change and the code will continue working. As long as you don't make assumptions about the runtime, your code will go with the flow.

Remember, you said, "Don't assign "self = [super init]" in your init statements, ever. If you ever type "self =" YOU'VE DONE WRONG."

Which is quite wrong.

July 11, 2005 7:12 PM

 
Anonymous Anonymous said...

If you are so good at insulting other people's code, why can't you get that pice of garbage you call delicious library to work? Why can't you find time to answer anyones tech support email? Why does everyone at versiontracker say your software is garbage?

July 11, 2005 7:59 PM

 
Anonymous JC said...

>Wil: I'll often use an inline function inside the method, or sometimes I'll even use a goto. I really hate the nested way.

I really like the nested way and I really hate the goto way. So there.

I really hate more than one "return" statement in any method. ..OK maybe two in a very short one (ie an if statement).

I really hate the "break" statement. Bastard. (ok needed for switch.) And "continue". Fecker.

> John B: Where I work (we do C++ work), our standard is to always use only one return statement.

I agree with this standard.

> Wil: One return point... yeesh

Wil obviously disagrees. I support the arguments for it though. It's clearly wrong to duplicate cleanup code and using goto's will catch you out more than you think.


> Objective-C is C
Lets just be done with this; C does suck for most things.

I really hate prefixes on classes and methods.
I really hate the lack of packages or namespaces.
I really hate the lack of proper exception handling and exception specifications (and nice tools that fill these in for you automatically! ..OK I'm spoilt). These make proper cleanup really hard though without some form of automatic garbage collection.

Objective-C is useless without the Cocoa frameworks.

The Objective-C runtime is nice and dynamic though.


> Anonymous: I hate it when people apply logical operators to things that are not boolean values, such as integers or pointers.

I think it's fine (and preferred by me) to use it for testing pointers for null/nil (but not ints, etc), eg
if (ptr) ... or if (! ptr) ...
is fine by me


> Troy Phillips: Wow - I know C and I thought "how much different could Objective C be". I think I have a lot to learn!

Not much in Objective-C, but lots in the Cocoa frameworks; behaviour, conventions, and standards.

Good luck and have fun.

I like the Cocoa frameworks and WebObjects. C++ was ok for it's time, ..well it's a lot better than C. We need an implementation of Java-Cocoa with the performance and minimal runtime of Objective-C++. But this is not going to happen. So, hands up who demands decent refactoring support for Xcode! (And no more bloody prefixes please)

JC

July 11, 2005 9:14 PM

 
Anonymous Darth Sidious said...

Some good modifications and some that can could be discussed.

One modification I would not consider that great is the one ending up with 3 returns when there was only 1 before.

If you've been working with restricted amount of space for your binary (such as in embedded platforms), you may know that the less return, you have, the less space it eats.

More returns is not that great if you really care about code size.

My useless $0.02

July 11, 2005 11:24 PM

 
Anonymous Anonymous said...

I suppose we do a lot of things because we've "always" done it that way... I prefer to do "if (ptr == nil)" etc, partly because I used to lint my code heavily while programming straight C, partly because the expression doesn't make any sense. Basically, it translates to "is this not a pointer?", which it clearly is. :-)

I don't agree with the sentiment that a function must have only one return point. I'm not an embedded programmer; I'm a Cocoa programmer. If it looks cleaner, you end up doing less mistakes, and you don't have to worry as much about debugging.

//ps

July 12, 2005 12:38 AM

 
Anonymous Anonymous said...

The init method has one big error in it (I won't get into self = being good or bad). The method is called initWithPath there still might or might not be an init method so you shouldn't call [super init] but [self init] instead. Even though init might not currently exist in the class, it might in the future and its easier just to assume that it could.

Now if you think less code is better (which I almost always do) why bother comparing what the method returns with NO or nil? I'd tend to do a if (!this) or if (that).

Another nit somewhat related to the above. Doing variable == constant is less "safe" then doing constant == variable. The reasoning behind this is of course you can typo == to =. It does make things a bit harder to read which is why I tend to not compare to nil YES/NO and just let the if statement do its thing.

Why bother defining the method with (id) that's the default for methods.

I'd probably do


- initWithPath:(NSString *)aPath
{
if (self = [self init]) {
if ([[NSFileManager defaultManager] pathContentOfSymbolicLinkAtPath:aPath] ||
![self _setNodePath:aPath]) {
[self release];
return nil;
}
}

return self;
}


Actually I'd probably figure out some way to dump the entire [self release] stuff since that always gives me a shiver when I see it. IMO returning nil from init should be an exceptional circumstance, but again that's just IMO.

///
PTH

July 12, 2005 6:25 AM

 
Anonymous cjwl said...

darthsidious wrote:

If you've been working with restricted amount of space for your binary (such as in embedded platforms), you may know that the less return, you have, the less space it eats.

Perhaps you should look at the assembler output more often.

gcc will typically generate a return someValue; as:
- assign someValue to register
- jump to very end of function
- very end of function contains the generic return code, move stack pointer, return instruction, what have you depending on processor it can be a few instructions.

So in most cases doing

if(foo)
bar=blug;
else
bar=blah;

return bar;

generates the (sometimes exactly) same code as:

if(foo)
return blug;
else
return blah;

Don't think about what the compiler is doing when you're coding.

July 12, 2005 6:34 AM

 
Blogger Mike Lee said...

This comment has been removed by a blog administrator.

July 12, 2005 2:30 PM

 
Blogger Mike Lee said...

Let me try this a bit less crankily:

I don't think there's any such thing as one true style. I mean, there's good style, and there's bad style, but in the end, good style gets divided into Apple Official Style, Aaron Hillegass Style, James Duncan Davidson Style, and, yes, Wil Shipley style.

We're here for Wil Shipley style, and I think some readers need to remember that when he says "this is wrong," what he means is "this is not Wil Shipley Style," and you can debate the relative merits of Wil Shipley Style, but in the end, you can't really prove or disprove it.

It is what it is, and like Wil said in the beginning of this little experiment, take it or leave it.

July 13, 2005 2:58 AM

 
Blogger Wil Shipley said...

self = [super init]

I must stress this again: this is not right based on anything said here. There may be some reason it's not right, but nobody has stated a reason that's convinced me.

Class-clusters: if you subclass a class cluster, the init method you call _WILL NOT_ return a different object than what you asked for. Period. It won't. If you want to argue this, let's do it, but you'll lose.

This is NOT to say that if you call an init... method on a class cluster when you ARE NOT subclassing it, that you'll get the same class of object that you originally allocated. You may not. That's what the documentation says.

Why?

Look, say I've got a class cluster, and the public class is Foo. Foo contains two ivars, so its structure looks like this:

Foo {
int badname;
int worsename;
}

Now, I define a subclass Bar of the class cluster Foo. Bar adds a new instance variable, "horriblename". This creates a C structure that looks like this:

Bar {
Foo {
int badname;
int worsename;
}
int horriblename;
}

Now, if I call:

Bar *thing = [[Bar alloc] init];

And Bar's init looks like:

- (id)init;
{
[super init];
horriblename=2.0;
return self;
}

I ABSOUTELY promise you that the Foo class cluster is NOT going to return me some other, private subclass of Foo. Because, if it did, the next line of code would crash.

In fact, "Foo" the abstract superclass exists for the _sole purpose_ of being subclassed. Its methods are defined inefficiently in terms of a couple base methods, so you can simply override those in your subclass and get a fully functional Foo-like object. If "Foo" returned any other KIND of object, your new code wouldn't be there.

Ok, so what if "Foo" correctly returned an object of type "Bar", but it wasn't the object you had just allocated? Well, then you'd have me. Except, WHY THE HECK WOULD IT DO THIS? How can an abstract superclass know of an instance in which it's more efficient to use some other object than the one you just gave it? I mean, it is, by definition, an ABSTRACT class. It really doesn't know a lot -- all its functionality is wrapped up in your implementation of two or three methods.

But, I tell you what. I like being right, it's true, but I like learning new things even more. So, I'll send $20, cash, to anyone who can find an instance in which they subclass a non-deprecated, Apple-supplied, 10.4 Cocoa class, call [super init] and DO NOT get back 'self'.

Limit one $20 prize per class, limit 5 classes total then I call it quits; first entry wins for any given class. Go for it. Prove me wrong. I'm excited!

Please note clearly that you have to be sending the init message to an instance of your subclass of a built-in Cocoa class.

I'll post the winners in a full blog entry, in which I will also apologize for my gross recklessness.

July 13, 2005 3:56 AM

 
Anonymous Anonymous said...

With all the self = stuff you are still missing that the original code is wrong and should be calling [self init] instead of [super init]. That IMO is far more problematic then whether or not self = is right...

///
PTH

July 13, 2005 5:17 AM

 
Anonymous Lon Varscsak said...

lol, Wil, I love it. You and I are on the same page here, so no $20 for me, but this is fun!

I think we need a Wil Shipley Code forum now. :D

July 13, 2005 12:44 PM

 
Anonymous Anonymous said...

quoth PTH:
With all the self = stuff you are still missing that the original code is wrong and should be calling [self init] instead of [super init].

Unlikely. Usually the init method with the most arguments is the designated initializer. If anything, there should be an implementation of the init method that calls [self initWithPath:nil].

July 13, 2005 6:10 PM

 
Anonymous Anonymous said...

PTH,

You got your "self" and "super" reversed. If you send [self init] in your -init method, you hit the stack limit and crash.

-jcr

July 14, 2005 2:43 AM

 
Anonymous Anonymous said...

jcr,

He doesn't suggest calling [self init] in -[FileSystemNode init], he suggests calling [self init] in -[FileSystemNode initWithPath:].

This is probably an attempt to better support subclassing of FileSystemNode. It makes [FileSystemNode init] the designated initializer of the FileSystemNode class. However, that's unusual. It'd be more normal to make -[FileSystemNode initWithPath:] the designated initializer, which can be done by adding

- (id)init
{
return [self initWithPath:nil];
}

to the implementation of FileSystemNode.

July 14, 2005 10:32 AM

 
Anonymous codecrafter said...

Maybe I am missing the boat on this self = [super init] stuff--I admit that I am new to Cocoa and obj-c. From what I have read, self is just a local variable in the currently executing method. Setting it to some value does not have any side effects. You could have just as easily coded your init.. method as:

foobar = [super init]
// init some instance variables
return foobar

the above is the same as

self = [super init]
// init some instance variables
return self

I guess from Apple's perspective, self = [super init] is preferred because it is a common idom to send messages to self in initializers, e.g. to call the designated initializer:

self = [super init]
[self initDesgInitWith: foo and:foo2]
return self

Since you are using self as the target of the initDesgInitWith:and: method you want to make sure that self is set to the actual return value of [super init] just in case [super init] returned a different object. The following is equivalent:

foobar = [super init]
[foobar initDesgInitWith: foo and:foo2]
return foobar

July 14, 2005 9:47 PM

 
Anonymous codecrafter said...

Oops, one correction regarding this example:

self = [super init]
[self initDesgInitWith: foo and:foo2]
return self


By convention only the designated initializer should call [super init]

July 15, 2005 8:34 AM

 
Anonymous codecrafter said...

I guess I did miss the boat on this. :-) After further reading, setting 'self' does appear to have side affects. It must be so because via self is the only way methods can access their instance variables. When encountering an instance variable, I assume the compiler emits code that always addresses them via an offset to self, which is passed as an argument to the method via the runtime.

So my previous examples were wrong. However, if true, the above would indicate that you should always set self to the result of the super call in your designated initializer:

Example from Apples docs:

@implementation ValidatingArray
- init
{
self = [super init];
if (self) {
embeddedArray = [[NSMutableArray allocWithZone:[self zone]] init];
}
return self;
}

If the result of [super init] returned a different object and we didn't set self to this new object, then the initialization of the instance variable, embeddedArray, would probably crash (assuming the original object was released by super's init.

Excuse my ignorance if this has all been covered in previous comments--I'm learnin'.

July 15, 2005 10:16 AM

 
Anonymous Marcus S. Zarra said...

[super init] sets self for you. This can be tested by sub-classing NSObject and not setting self yourself then confirming that it is not nil.

The discussion is not about whether or not "self" gets set for you automatically but whether the "self" that is set for you matches the pointer that is returned from [super init].

July 15, 2005 7:17 PM

 
Anonymous Richard Albury said...

Don't allocate a variable you only use once

In general, I'd agree, but in this case, I wouldn't have changed the code because I like to see what the method is returning because I step through *every* line of code I write just like Steve Maguire recommends.

Other than that, great stuff. I've finally weaned myself of the 80-column limit and turned off the page guide and turned on line wrap.

July 18, 2005 5:08 AM

 
Anonymous codecrafter said...

[super init] sets self for you. This can be tested by sub-classing NSObject and not setting self yourself then confirming that it is not nil.

[super init] does not set self automatically. self is a hidden argument passed to the method via the runtime. You must set self to the return value of [super init] just in case [super init] returns a different object. If you don't instance variable access could blow up or cause heap corruption. For example:

// value of self = X before call to [super init]
[super init] // assume super class released X and returned Y
// self still equals X here
ivar = 5; // blows up

ivar = 5 blows up because this is assignment is actually:

self->ivar = 5

self was never set to the new value Y returned by [super init]--the old object was most likely released.

I have confirmed this in Xcode.

July 18, 2005 3:07 PM

 
Anonymous Marcus S. Zarra said...

I would be interested in seeing an example of this as I have not run into this ever.

Every object I have ever created just calls [super init] without setting the result back to self with no issues at all. It really makes no sense for [super init] to return a different object.

If you have an example of this you should post it here as I am sure quite a few people would love to see it.

July 18, 2005 3:40 PM

 
Anonymous codecrafter said...

My main point is that self is nothing more than a hidden argument passed to a method, and that all instance variables are addressed via offsets to the CURRENT value of self. I am not arguing whether it is common or good practice for an initializer to return a new object, just that it is possible. For rationale of when this might occur, see the Memory Management section of The Objective-C Programming Language from Apple's developer web site. Below is a contrived example of how this might occur:

@interface Foo : NSObject {
int fooInstanceVar;
}
- initFooWithInteger:(int)i;
@end

@interface Bar : Foo {
int barInstanceVar;
}
- initFooBarWithInteger:(int)i andInteger:(int)j;
@endif

@implementation Foo
- initFooWithInteger:(int)i
{
id newObj;
self = [super init];
if (self == nil)
return nil;
// assume, for some reason, we need a diff. obj.
newObj = [[self class] alloc];
[self release]
// must set self so subsequent ivar access is ok
self = newObj;
if (self == nil)
return nil;
fooInstanceVar = i;
return self;
}
@end

@implementation Bar
- initFooBarWithInteger:(int)i andInteger:(int)j
{
self = [super initFooWithInteger:i]
if (self == nil)
return nil;
barInstanceVar = j;
return self;
}
@end

July 19, 2005 8:10 AM

 
Anonymous Marcus S. Zarra said...

codecrafter,

I apologize, I had assumed you were referring to a real world example of this situation and not something contrived.

July 19, 2005 11:47 AM

 
Anonymous Lon Varscsak said...

Hey Wil, any $20 paid yet?

July 19, 2005 2:42 PM

 
Anonymous Anonymous said...

Hi Wil,

I'd suggest renaming the comparison method. In place of

- compareUserVisibleNameToObject:

consider

- userVisibleNameComparisonWithFileSystemNode:

The method is a pure query -- it returns a result and has no side-effects. I tend to prefer noun phrases (eg "comparison") for naming queries, _especially_ pure queries. I try to avoid using verb phrases (eg "compare") for anything besides commands.

YMMV. I know there's a lot of code out there that's written in a different style. That's why I'm interested in hearing from people who do things differently. So if you're unhappy with the suggestion, I'm offering a proverbial, but very shiny, penny for your thoughts.

Best wishes on all your projects,

jd

July 26, 2005 6:29 PM

 
Anonymous Ian Wilkinson said...

Hi Will,

Love the blog. Sorry I'm late in posting to this one, but I would really appreciate an answer.

Consider this:

- (NSString*) name
{
return _name;
}

- (void)setName:(NSString*)name
{
if (![_name isEqualToString:name])
{
[_name release];
_name = [name copy];
}
}

.. which I believe is the style that you are advocating (more or less). Now consider this piece of code that uses these accessors:

NSString* old_val = [myObj name];
[myObj setName:@"Fred"];

NSLog(@"Old name is %@, new name is %@", old_val, [myObj name]);// BANG!

I assumed that 'return [[_name retain] autorelease]' was the correct way to implement the name accessor, (the Cocoa Development II Student Guide that I'm reading advises this too - in most cases).

Another way to avoid this crash is to retain (and later release) the value returned from [myObj name] in the calling code. Is this what you suggest one should do here?

Feel free to pimp my code. I'm still a newbie at Obj-C.

September 08, 2006 9:59 AM

 
Anonymous Anonymous said...

Just came across this and my only problem maybe the lack of using {} on if statements that are not complete on one line. 

IE:
   if (true)
       do_something;
should be:
  if (true) do_something; 
or
  if (true) {
    do_something;
  }

You may not like what it looks like but it save a lot of trouble...

September 14, 2006 2:46 PM

 
Anonymous Anonymous said...

One reason to only have one return statement (besides the pure mathematical correctness provability cases) is that not all code is read in order. If there is only one return statement, it is easy to glance at a routine and find the return and see what it is doing. I find multiple returns much harder to figure out as I might miss one. YMODV.
Chad

October 25, 2006 1:18 AM

 
Blogger Jens Jakob Jensen said...

Sorry to post this two years late :-)

A really nice post and discussion, but I just have to comment wrt. thread-safety:

(id)foo;
{ return [[foo retain] autorelease]; }

is in fact thread-safe if foo is guaranteed to always contain a valid object ref or nil, and pointer assignments are atomic.

You don't know when foo is going to be changed by another thread -- it could get reassigned right after the retain and before the autorelease, and then you'll leak the old foo and over-release the new foo, causing a crash.

Of course, -retain returns the original object, so reassigning 'foo' won't change what's autoreleased. 'foo' isn't read twice.


As to "self = [super init]":
Write every line to be bulletproof. Write every method as if every other method was out to get your code and try to make it crash, and your job was to make sure it wasn't going to happen in YOUR code.

;-)

July 15, 2007 3:09 AM

 
Blogger bbum said...

If you aren't using...

if (self = [super init]) { ... }

... then your code is wrong.

See the documentation.

It may still work and, because of binary compatibility requirements, it will likely continue to work.

But it still isn't correct. And "works" doesn't mean "fast and correct".

April 19, 2009 9:14 AM

 
Blogger Jeff Barbose said...

yeah, the self = [super init] thing must have touched you in a secret no-no place way back when.

simple logic: you agree there might be *some* case when the planets are aligned and you're writing code while standing on your left foot while wearing Mormon underwear and chanting in Engrish when self=[super init] might save your ass, but there's never a time when it's wrong to use, just that you don't like its aesthetics.

Is there a case where self=[super init] causes HARM?

If the choice here is (harm > 0) vs (harm == 0), what's the big with you on this point?

By the way, the docs you pointed to back in that 2005 post? They recommend using, yes, self=[super init].

May 06, 2010 3:57 PM

 

Post a Comment

<< Home