Infrequent Influxes of Inspiration

On Information Hiding written on

Caution: Starcraft 2 is addictive and may hinder blog post writing. End of public service announcement; returning to the scheduled programme in 3, 2, 1…

Recently, I've had two notable encounters with a specific topic that I simply need to address in the form of this blog post. Today's subject: Information hiding, i.e. making fields or methods something other than public.

After I read Code Complete a few years back, it simply dawned upon me why information hiding is totally awesome. But it seems that all too many programmers haven't read it, and so I keep meeting people who are confused by the idea. Not confused in the "what does it mean?" way, but rather in that they can't see the point.

The main comments I've heard against information hiding can be summed up in "It won't make your software any more secure, since I have reflection anyways" and "Why can't I access this private field I absolutely need?"

Hiding in Security

Let me be absolutely clear about this: Information hiding and software security should have nothing to do with each other. If you ever notice those two concepts mentioned between a set of covers, and it's not a CS basics book, you should burn it and demand your money back. Yes, in that order.

If you used information hiding as a software security measure, it would be nothing more than security through obscurity, and that's bad (usually regarded as one of the seven deadly sins, it being the eighth). So just don't do it.

Being Just a Nuisance?

The other argument is regarding the case where you have decided that in order to make your code work, you need to access something inaccessible. You have read all the code, and are sure that it won't break anything. It's almost as if it has been put out of reach, just out of spite.

I can almost buy this argument. Almost. But this will violate a few things, and here are just a few of them:

Optimization

Whenever I write code, I try to find the shortest path between the current state of the code and the required functionality of the application, with additional business requirements taken into account. This is optimization of development time: There are a lot of things that might need to be taken into account, but I choose not to. Instead of constantly verifying all variables in my code in case a stray cosmic ray decides to flip a bit in the computer, I choose to spend my time making the log-in screen to the database manager work.

My point is that I write the parts I need to make stuff work my way as fast as possible, making as many assumptions and quirks as I possibly can get away with. (Again, I mean this in the most literal way possible: I do make assumptions and take shortcuts, but I choose them carefully, to avoid bugs and unnecessary spaghetti code.) Should I unnecessarily leave stuff public, those who code against my code might, in good faith and by honest accident, change the internal behavior of the code, making everything fail in a glorious ball of fire. If I proactively disallow access to the internals of my code, I proactively make it harder for others to make mistakes and introduce weird bugs.

Case in point: Take java.lang.Integer, having a private int value field that stores the primitive integer value of the Integer object. Imagine for a second that you could do the following:

Integer i = Integer.valueOf(1);
i.value = 3;
System.out.println(i); // Output: "3"

Innocuous enough, right? And it might even work pretty well with JDK versions 1.4 or lower. But as soon as you upgrade to 1.5, that would probably bring down the whole JVM. This is because IntegerCache got introduced in 1.5.

Between lines 559 and 586 you can see that the JVM has a cache of integer objects of -128, 127 and everything in between. Since the Integer.valueOf(1) returns a cached instance, and we've just modified that instance's internal value, we have essentially asserted to the JVM that 1 = 3. Oops!
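To see the effect without touching the real Integer, here's a minimal sketch built around a hypothetical Box class. It isn't part of the JDK; the class, its cache and all the names are my own assumptions for illustration. It hands out cached instances, roughly in the spirit described above, but carelessly leaves its value field public.

```java
import java.util.HashMap;
import java.util.Map;

class CachedBoxDemo {

    /** Hypothetical stand-in for Integer whose internal field was left public. */
    static final class Box {
        // Shared instances, loosely imitating the idea of a value cache.
        private static final Map<Integer, Box> CACHE = new HashMap<>();

        public int value; // should have been private and final

        private Box(int value) {
            this.value = value;
        }

        static Box valueOf(int value) {
            // Hand out the same instance for the same number every time.
            return CACHE.computeIfAbsent(value, key -> new Box(key));
        }
    }

    public static void main(String[] args) {
        Box one = Box.valueOf(1);
        one.value = 3; // looks harmless: "my" object, "my" value

        // Everyone else asking for 1 now gets the same, modified instance.
        System.out.println(Box.valueOf(1).value); // prints 3
    }
}
```

Making the field private and final would turn that assignment into a compile error, which is exactly the kind of protection the real class gives you.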

As a casual programmer, there's no way you could've known about this change. And that's the whole point. As a casual programmer, you shouldn't need to know about these kinds of changes. It's internal optimization, made possible by information hiding.

Refactoring

If everything you write in a class were public, there'd be nothing stopping the client code from using everything in that class, intended to be touched or not. This means that if you change even the slightest thing in your class, there's a high possibility that the client code will break. This will effectively kill off any safe way of refactoring your code.

I like to break down my code into the smallest chunks possible. I usually have a handful of private methods for each public method. This helps me reduce the cognitive load of any piece of code, since I'm looking at less code on a higher level of abstraction. Instead of an in-lined for-loop, I might have a single-purpose method that describes what the for-loop does. It's not that I'd strictly need a method, but it's a way of both commenting what the code does, and simultaneously reducing the lines of code per eyeful. This way, I can get away with not writing actual comments for my code.
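As a rough illustration of what I mean, here's a small sketch. The NameFormatter class and its methods are made up for this example: the public method reads like a summary, and the for-loop lives in a private method whose name says what the loop is for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NameFormatter {

    /** The only method client code is meant to call; it reads as a summary. */
    public static String formatGuestList(List<String> rawNames) {
        List<String> names = withoutBlankEntries(rawNames);
        return joinWithCommas(names);
    }

    // A plain for-loop that could have been in-lined above; the method
    // name documents what the loop is for.
    private static List<String> withoutBlankEntries(List<String> rawNames) {
        List<String> result = new ArrayList<>();
        for (String name : rawNames) {
            if (name != null && !name.trim().isEmpty()) {
                result.add(name.trim());
            }
        }
        return result;
    }

    private static String joinWithCommas(List<String> names) {
        return String.join(", ", names);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(" Jim ", "", "Bob");
        System.out.println(formatGuestList(raw)); // prints "Jim, Bob"
    }
}
```

Since the two helpers are private, I'm free to rename, merge or delete them later without anyone outside the class noticing.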

Now, if all those private single-purpose utility methods were public, all kinds of horrible things might happen. Someone might've found a particular algorithm that they find useful, and they start using it directly from my code. This would mean that either I keep the method around forever, whether it becomes obsolete or not, or I break the API. Even worse is if I had a bug somewhere in there, and some 3rd party code actually relied on that bug: I'd be royally screwed, since I can't make my code work correctly without making someone else's work incorrectly.

This problem is especially prominent in general purpose libraries intended for public distribution. But controlled and contained projects aren't exempt from this problem either: Even if you could communicate to everyone what you are doing, and tell everyone to take this into account, you're adding one more rule to remember into everyone's (already excessive) cognitive load. That's not exactly fair, if you ask me.

Being Lazy is Being Effective

Which brings me to the following point: The most awesome side-effect of generous information hiding is allowing yourself to forget as much as possible. My brain's working memory is very small, and I imagine other people suffer from this too. Merely trying to remember three arbitrary numbers all while writing code for them is a guaranteed failure for me. So trying to remember the minute implementation details of a certain class I'm using is simply not going to work for me. If I need to know how to jiggle fields, call various magic methods in a specific order, and that sort of thing, I'm guaranteed to write a huge load of bugs.

That's why I keep as much as I can get away with out of the public scope. I name my methods descriptively (long names, if needed), and try to hide as many of the details as possible in the private scope. A few months later, when I have happily forgotten the implementation details of a particular class, I can simply use my IDE's auto-complete feature to tell me what it can do, and I can be damn sure that there's no clutter, and no pitfalls to watch out for. There are only the methods I need to be bothered with, and that's all I need to remember and take into account.

There's This One Case, However…

There's one case where information hiding is actually a nuisance: A pseudo-extendable class.

What if there's a class that is not explicitly made non-extendable? You try to extend it, but there's some little piece of information that you can't reach, yet would need in order to make the extension useful in any way.

There are usually two things that might've happened here: 1) the developer has forgotten to disallow class extension (which, in my mind, should be the default in Java), or 2) the developer has encountered the brutal fact that allowing extension of a class is damn hard to do properly.

Making an extensible class requires you to know pretty much all the ways the class is going to be extended. It's a lot easier to go about the breadth-first approach, by first programming the code, and then extracting a common, extensible class, out of what can be generalized. Sometimes, however, you are forced to make a class extensible for extendability's sake, and this is where things are so easy to break.

So, what could you do about it? File a bug report, and hope real hard they open up the API instead of realizing that the class should've been final in the first place. Or you could be content with reflecting whatever you need. In the latter case, however, if anything fails, it's your own fault. In any case, I wouldn't count this against information hiding per se, but a mere design error.
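For the record, "reflecting whatever you need" looks roughly like the sketch below. The Connection class and its timeoutMillis field are hypothetical stand-ins for some library class; only the java.lang.reflect calls are real API.

```java
import java.lang.reflect.Field;

class ReflectionWorkaround {

    /** Hypothetical library class that is extendable but hides a detail. */
    static class Connection {
        private int timeoutMillis = 5000;
    }

    /** Reads the hidden field; if the library's internals change, this breaks. */
    static int readTimeout(Connection connection) {
        try {
            Field field = Connection.class.getDeclaredField("timeoutMillis");
            field.setAccessible(true); // bypass the private modifier
            return field.getInt(connection);
        } catch (ReflectiveOperationException e) {
            // A renamed or removed field ends up here: our fault, not the library's.
            throw new IllegalStateException("Library internals changed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readTimeout(new Connection())); // prints 5000
    }
}
```

Note that the field name is just a string here, so the compiler can't warn you when the library author renames it. That's the price of going around the API.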

Now, Where Was I?

By now, it should've become apparent that I like information hiding. I hope it's now apparent why I like information hiding, too; it helps me a lot with my programming. Information hiding might be unnecessary in small and trivial applications, but since I haven't done much trivial work in a good while, I've since denounced that. In non-trivial projects, skipping it is a shortcut to poor software design and spaghetti code.

If you still think information hiding is stupid and evil, feel free to think that. That's fine. I don't judge you … much.


Unconventional Typing written on

A couple of months ago, my brain got stuck on an idea. My brain often does that, and I can't really be at ease until I at least try to realize that idea (this blog is definitely one of them). So, I went about uneasy for a good couple of weeks until I gave in: I decided to get myself a netbook. For coding.

Selection Criteria

When researching which netbook I'd be interested in, I had a list of criteria that it had to fulfill:

  • Low price
  • Comfortable keyboard
  • Decent resolution
  • Sufficient crunching power
  • Excellent portability
  • Long battery life

The price and a good keyboard were by far the most important criteria. Since no netbook manufacturer had the courtesy of answering my queries on review units, the price needed to be in the "justifiable to the wife" category. Also, as the computer was intended for coding, and thus heavy typing, the keyboard had to be comfortable to type on.

I narrowed my list down to a couple of makes and models, and ultimately ended up choosing the Samsung N220.

Specs

1.66GHz Atom N450 Pine Trail, 10.1" 1024×600px, 1GB DDR2, 250GB 5400RPM, 3×USB, D-SUB, 1.31kg, 256×189×36mm, Win7 Starter. Blah blah blah. Are we done here? Good. Moving on…

Before I Start

The web is full of reviews of this computer, so there's no need for me to go through the unboxing-speclisting-walkthroughing routine. The point of this review is to test whether it's suitable for writing code.

But there's something I need to tell you before I go there: I love the computer's size and weight. Some reviewers say it's both heavy and bulky because it has a 6-cell battery, which bulges out a bit at the underside. Well, that's one of the things I totally love about this little thing.

The added weight of a 6-cell battery, compared to the common 3-cell battery is not a problem for me. It's still lighter to carry around than my DSLR. Besides, the bulge makes the computer stand on a slight edge, making typing more comfortable. Most importantly, though, the battery-bulge is a perfect grip for the computer; if I grip the computer's hinged edge so that the top of the computer rests on my forearm's underside, and my fingers wrap around to the underside, the grasp is firm, secure, and perfectly balanced. Not to mention that 6 cells is, like, 100% more than 3. Totally awesome!

The Four Kings

After fully committing myself to writing code (for Bob, no less) solely with what the netbook came with (excepting listening to music via headphones), I found four main things that probably matter the most to a programmer:

Screen

It's 10.1" and 1024×600 pixels. That's not terribly much to work with, compared to the 20" and 1680×1050 Samsung I keep attached to my 15" MacBook Pro, itself being 1440×900.

So, I had my hesitations. In fact, this was the biggest thing that scared me about a netbook; while Eclipse doesn't have much chrome in its UI, you really can't have too much screen real estate when programming.

The workspace configuration for Eclipse at the netbook's full resolution.

But I was pleasantly surprised with how well it sufficed. As seen in the screenshot above, side-by-side comparison can be kissed goodbye and the method overview is a bit cramped. Also, anything other than full-screen mode is out of the question. But otherwise, for me, it's totally usable and didn't really even require getting used to, as long as line length is kept in the 80-characters' vicinity.

Keyboard

Since the screen is so small, the keyboard must be as small. If I put my palms on each edge of the keyboard and extend my thumbs, they overlap about half the thumbs' lengths. So, it's nothing compared to an ergonomic split layout keyboard, but it could be considerably worse.

The keys are flat rounded squares protruding about a millimeter from the keyboard base, just like the modern Apple keyboards. Their touch and feel is excellent. Although the keys are somewhat crammed together, once my hands have settled into their native positions, I haven't missed any keys I shouldn't miss.

I say "shouldn't miss" with great intent, since there are some totally wonky key placements. Mind you, I have a Finnish keyboard layout, so your mileage will vary on the following.

For a coder that likes his generics, having the '<' and '>' key moved from the left side of the keyboard to the right is very jarring, and hard to remember (not to mention hard to type). Also, the delete key is above backspace, in the upper righthand corner, not to the right of enter, as in full-size keyboards. But this I've already grown used to.

But perhaps the most disruptive key (mis)placements on this computer are the page up/page down and home/end keys. The cursor keys have their usual 1+3 matrix, but pgup/pgdn have been placed on either side of the up-key. Additionally, as home/end are on these same buttons, activated by a Fn-key, this leads to a huge amount of mis-scrolls.

It's not uncommon for me to want to go up just one line, but accidentally press the page-up button, losing my orientation completely. The keys are exactly the size of my index finger's tip (about 1cm wide, two thirds tall), so it's very easy to double-press when intending to move the caret left or right.

Using the right hand's pinky finger might be the fastest way to use the cursor keys, but since pressing them is such a precision art, I'm always forced to lift and move my right hand for that activity, making it slow and cumbersome.

But, I want to emphasize, that these are the only negative things I have to say about the keyboard. For such a form factor, I'm amazed.

Trackpad

Being used to my MacBook Pro's ample trackpad, this thing is crap.

If you are a person who avoids trackpads, this will be a waste of space since you will avoid it like the plague. It's small and it's inaccurate. Merely keeping your fingertip on the trackpad, and shifting the finger's weight without even tilting the finger, will make the cursor move a lot. There are also some two-finger gestures, and even a three-finger gesture, but they're very clunky, unresponsive and unreliable.

I, however, have no problems with trackpads, so choosing between carrying around an external mouse or using the trackpad, I'll choose this trackpad. Even though it's inaccurate, it's still faster and more convenient to use than lifting the whole arm when switching between keyboard and mouse. It's just the lesser of two evils.

Even though I really dislike those red dotty things ThinkPads love, those might thrive in this form factor, because they require so little physical space. The trackpad's mouse buttons are currently ridiculously thin, so getting rid of the trackpad would allow for those buttons being considerably larger and click-friendlier.

All-in-all, the trackpad is the weakest spot in this computer, by far.

Memory

The laptop comes with 1GB of RAM installed. Since Windows 7 is pretty decent in its memory usage, and there wasn't terribly much bundleware upon startup, it was enough for doing the netbook-y stuff; having a browser and a few other windows open was no problem.

But as soon as Eclipse entered the game, everything went sluggish. Keeping Eclipse in the background for a while and then swapping back took a good minute to get it going again.

Luckily for me, and decent of Samsung, the memory is very easily replaced with a max of one 2GB SODIMM chip. Unfortunately, this added 50€ to the price of the laptop, making the total cost of ownership 400€. After I installed the 2GB memory, I've been able to use the computer without inhibition. Eclipse now runs as smoothly as anything.

Bonus: Everything Else

So, I just explained in detail what I think of the screen, keyboard, trackpad and memory for programming purposes. I'll now give a quick skim over the rest that might matter.

I've always thought that the Atoms are just slow as molasses, and should be avoided at all costs. Lo and behold, it turns out that I was very wrong. Aside from having trouble playing high-res flash video (i.e. a 480p YouTube video at full screen), I've seen no unexpected choppiness or slowness in my use. It's actually very good, even for programming.

A laptop should be portable, otherwise it's just a computer with a battery. This, indeed, is portable. It's even more portable than my MacBook Pro is – all 5 inches more portable. Size and weight indeed do matter in this case. Taking the netbook anywhere and everywhere with you is a no-brainer. The battery life doesn't make me even consider taking the charger with me, and it's just so light and small that it's not really something I even consider "carrying around"; it just comes right along. It takes very little tabletop space, so pretty much any area that is enough to be called "a surface" is enough. Also, lifting it effortlessly with one hand from the coffee table to the sofa, wherein I lie, makes it so much more attractive for casual use.

Glad I Tried It

Another great worry I had was whether or not I'd still use the netbook after this little experiment. While 400€ isn't terribly much, it's not so little that I'd spend it willy-nilly.

I'm glad I took the chance, since I now consider it my primary laptop, and the Mac laptop on my desk (incidentally, currently underneath the netbook, as I write this post) merely an underventilated computer with a battery.

The ultimate answer is a resounding "yes", a netbook can be used for programming. I say this without any kind of hesitation. I wouldn't use it as the primary coding computer, but certainly as a mobile back-up.

Whenever I expect being bored and stationary for a longer period of time, I'll surely take with me the little Sammy that could.


Breadth-First Programming written on

A cornerstone of the test-driven development movement is to write the absolute minimum amount of code needed to get a certain unit test to pass. I like that idea. I like it so much, that I want it to be extracted from the TDD world for others to enjoy – no unit tests required.

After looking at an endless stream of God of War how-it-was-made videos and learning the gaming industry's definition of alpha*, and seeing a certain XKCD-strip, I got the idea for this post.

An XKCD comic strip on the hazards of depth-first-searching on Wikipedia

My idea, in addition to being testing-agnostic, is also both a bit more generic and a bit more specific than the TDD commandment.

My intended goal is getting modules completed quicker. Faster coding means momentum, which keeps the developers happy. Faster coding also makes deadlines easier to keep, which keeps the managers happy.

Caution: Design Patterns Ahead

My idea is pretty simple: Don't over-indulge yourself in design patterns.

Design patterns aren't a goal in themselves. They take time to implement, and often unnecessarily so. You don't always need a complete Memento pattern to add an undo feature; sometimes a simple previousValue field is enough. There are also easier ways to send messages from one class to another than a fully-fledged Observer pattern; a lone observer can be given to the notifier via a setter without any special interfaces.
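Both shortcuts fit in one small sketch. The Thermostat class below is a made-up example, not code from any real project: a previousTemperature field gives a single-step undo, and the lone observer is a plain java.util.function.IntConsumer handed in through a setter.

```java
import java.util.function.IntConsumer;

class Thermostat {

    private int temperature;
    private int previousTemperature; // single-step undo instead of a Memento
    private IntConsumer listener;    // the lone observer; may be null

    public void setListener(IntConsumer listener) {
        this.listener = listener;
    }

    public void setTemperature(int temperature) {
        previousTemperature = this.temperature;
        this.temperature = temperature;
        notifyListener();
    }

    /** Restores the value that was in place before the latest change. */
    public void undo() {
        temperature = previousTemperature;
        notifyListener();
    }

    public int getTemperature() {
        return temperature;
    }

    private void notifyListener() {
        if (listener != null) {
            listener.accept(temperature);
        }
    }

    public static void main(String[] args) {
        Thermostat thermostat = new Thermostat();
        thermostat.setListener(value -> System.out.println("now " + value));
        thermostat.setTemperature(21); // prints "now 21"
        thermostat.setTemperature(25); // prints "now 25"
        thermostat.undo();             // prints "now 21"
    }
}
```

The obvious limits are one level of undo and one listener. The day either of those stops being enough is the day to reach for the real pattern.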

Consider the following piece of code:

public enum Command { CHECK_MAIL, BREW_COFFEE }

public void execute(Command command) {
  /*
   * In case this switch/case block grows over four cases long, switch 
   * to Command pattern (http://en.wikipedia.org/wiki/Command_pattern)
   */
  switch (command) {
  
    case CHECK_MAIL:
      checkMail();
      break;
      
    case BREW_COFFEE:
      brewCoffee();
      break;
  }
}

private void checkMail()  { /* ... */ }
private void brewCoffee() { /* ... */ }

I can't remember how many times I've heard people say that switch/case blocks do not belong in OOP. But what if, once you've completed your application, you notice that the only things your application will ever do are check the mail and brew a cup o' joe? Wasting your time creating interfaces, classes and generic code for that is downright embarrassing to a sensible programmer.

In a worst-case scenario, it might even make the application's abilities less obvious, just by reading the code. Imagine the aforementioned code replaced by a generic Command interface. You'd know that the application can execute commands, and supposedly it has some, but you'd have to use an IDE to know that there are only two measly commands it ever executes!

The most important piece of the code, however, is the comment. It states clearly that, in case more than four commands are listed, it's time to switch over to a proper pattern. In your respective comments, figure out one number representing the amount of acceptable "jerry-rigging". If you would just write "...when these get too many...", you wouldn't be exactly clear just what too many is, possibly allowing the list to grow into a hairy ball of spaghetti.

But I'm not advising against using design patterns. Patterns are excellent, when used with cause. I'm just saying that certain shortcuts are ok, just as long as you prepare the code in such a way that it's near-trivial to yank the code out, and convert it into a shiny, polished pattern du jour.
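For comparison, here's roughly what the same two commands might look like once converted to the pattern. This is only a sketch under my own assumptions: the class names are invented, and the returned strings are placeholders for whatever checkMail() and brewCoffee() really do.

```java
class CommandPatternSketch {

    /** What the enum would turn into: one interface, one class per command. */
    interface Command {
        String execute();
    }

    static final class CheckMail implements Command {
        @Override
        public String execute() {
            return "checking mail"; // placeholder for the real work
        }
    }

    static final class BrewCoffee implements Command {
        @Override
        public String execute() {
            return "brewing coffee"; // placeholder for the real work
        }
    }

    /** The former switch/case block shrinks to a single call. */
    static String execute(Command command) {
        return command.execute();
    }

    public static void main(String[] args) {
        System.out.println(execute(new CheckMail()));  // prints "checking mail"
        System.out.println(execute(new BrewCoffee())); // prints "brewing coffee"
    }
}
```

Notice how the list of available commands is no longer visible in one place. That's the readability cost mentioned earlier, and it only pays off once there are enough commands to justify it.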

Why Should I Do This, Again?

Developers love to polish their code into perfection. The trouble is, polishing kills momentum. Also, if it turns out that something you have polished for days on end is unnecessary after all, it's very hard to throw it away. It has become your baby, and you're attached to it. You would do your best to try to keep it around: The baby of polished, useless code.

So, Breadth-First Programming is about completing your project as fast as possible. This way, you know what code you need to have to make it work. Unpolished code is easy to throw away, leading to an optimal codebase.

Think of it, instead of "writing the absolute minimum amount of code needed to get a certain unit test to pass", as "writing the absolute minimum amount of code needed to complete your application". Once you're done with the breadth of the code, you're free to add depth to it. Since you're now ahead of your schedule, it's refactoring time!

* The way the gaming industry defines an alpha release is when all the gaming elements are in place. Levels can be navigated as they should, the power ups and items are in their place, etc. The graphics might still look like crap, textures are missing, and the final polish isn't there. But the feel and the gameplay mechanics are in place, from start to finish.


Method Hierarchies written on

Jim walks past my desk, on his way to the coffee machines. He gives a purposeful, not-very-subtle sigh, so I reply to him with a "so, what's up now?"-look. He says he's hunting for a bug in a class that's over 1500 lines long, split up into a huge number of private methods. He can reproduce it at will, so he knows pretty exactly where the bug is within the application, but since he hasn't written the code, it's a jungle. So he's fetching his motivational cup of coffee, to get him started on finding where the code containing the bug is.

Since I'm somewhat familiar with his project, I could've asked for more details and then hazarded a guess about what's wrong and where to start. Instead, my mouth goes off: "Yeah, but imagine what it would look like, if all those methods were in-lined?" I ask rhetorically. I then stubbornly continue, before Jim has the chance to respond to my unhelpful and unsympathetic comment: "Instead of you having a couple of methods of 500 lines each, you now have a handful of methods that are significantly shorter, each with a descriptive name about what it does."

"But it's just silly," Jim replies defeatedly. "If a method is private, and it's called from exactly one place for exactly one purpose, why can't it just be in-lined? This way I wouldn't have to jump and skip between methods to know what it does, and how it mangles the data? How else am I supposed to find the bug?"

And I realize that Jim has a point: Reading highly fragmented code to find a particular bug is tedious. But I refuse to say that my point is any less valid: Any one method is easier to read and understand the less code you have to decipher.

Unfortunately, it seems like we can't have it both ways.

Exhibit “A”

Take a look at Robert C. Martin's (aka. Uncle Bob) Bad Code talk – the bit between 00:14:25 and 00:21:35. Do that right now. Watch the snippet, and then come back here.

I don't agree with his enthusiastic four-ish lines per method vision, but his idea of small, single-purpose methods is intriguing. What he's really saying, I believe, is that, on the job, you don't write code for yourself, but for posterity. You might understand what you just wrote (for now, at least), but the future maintainers usually don't have a clue. Splitting code up into fine-grained methods and labeling them with meaningful names would guide the reader about what's going on.

Alas, Jim's problem still exists: A class that has 1500 lines of unfamiliar code, containing 100 unfamiliar methods, and you know that there's one bug lurking somewhere between the button press event and the render phase. Where to start?

Exhibit “B”

Take a look at Code Bubbles's demo video. You need to watch only the first three minutes to get a good grasp of the idea.

It's an idea of a complete IDE that allows the developer to see the code by placing individual functions into small bubbles, as they call them. The developer can then sort and organize them visually, seeing how code relates. All this is placed into an impossibly large, scrollable, virtual desktop.

While it's a refreshing point of view, and a decent attempt to bring something new to the code-writing world, I think it's a bad idea. I don't want to write my code that way.

But What If…

Imagine having such a view at your disposal in your current favourite IDE. Pick a class, switch into this view mode and – boom – you're presented with a detailed code call hierarchy, in a nice graphical tree structure, code properly highlighted, blocks being foldable, and so on. This mode would not be intended to write code in, but to navigate code. To read code. To understand code.

You'd get best of both worlds: Those who already know a bit about the code can find methods even faster, and total newcomers to the code can navigate their way through the code faster and with ease.

The IDEs don't currently help us enough with this task. Method overviews do give a list of method signatures, but that's merely a list, arbitrarily sorted. The Call Hierarchy feature found in Eclipse does do something like that, but it's not nearly powerful enough. We need an "I have found a bug on unfamiliar territory, help me find it"-feature.

Both Jim and I do.


Bob's Gone Open Problem written on

Bob is already open source. But that's not enough. It's now open problem as well!

I'm going to write down the design problems I've encountered with Bob in a public place. These design problems don't fit in a bug report system, since they're fluffy and abstract. A mere patch of code won't be enough to fix them. The idea is that once I've made a problem go away, I'll add a summary about what I did, or am going to do, about it.

"Why?" you might ask. Well…

First Reason

I'm very forgetful, so it's just a place to remember stuff. It's also helpful to get larger concepts down on paper to look at, as a whole. Additionally, the process of writing them down, forcing myself to articulate the problem, makes the problem as defined as possible. This makes the weak points easier to find, and lets me concentrate my attack on those. This is, by far, the most important reason.

Second Reason

I'm not the last person to write code, and once infinite monkeys start writing, they'll very soon encounter the same problems I'm dealing with. A few infinite monkeys might then find the wiki entries. With any luck, their problems are already solved, so they can just copy the solution off of that, and be done a bit faster. This is, by far, the most interesting reason.

Third Reason

As with open source, there's always the hope of contributions. I'm not the first person to write code, and the chances are that someone else has already encountered and solved the same problems I'm dealing with. With any luck, this good samaritan will edit the wiki page, and help me along the way. This is, by far, the most utopian reason.

So, go ahead, laugh at my problems. Or help a guy out. Feel free to edit the pages as you wish.