Test your migrations

May 29, 2013

An evolving project that changes its persistent data structure can require a transformation of already existing content into the new form. To achieve this goal in our grails projects we use a grails database migration plugin. This plugin allows us to apply changesets to the database and keep track of its current state.

The syntax of the DSL for groovy database migrations is easy to read. This can trick you into the assumption that everything that looks good, compiles and runs without errors is OK. Of course it is not. Here is an example:

changeSet(author: 'vasili', id: 'copies messages to archive') {
  grailsChange {
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2011-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                  + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2012-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
  }
}

Here you see two change closures that differ only by the year in the SQL where clause. What do you think will happen with your database when this migration is applied? The answer is: only changes from the year 2012 will be found in the destination table. The assumption that when there is one change closure in the grailsChange block there can also be two changes in it is, while compilable and runnable, wrong. Loking at the documentation you will see that it shows only one change block in the example code. When you divide the migration into multiple parts, each of them working on their own change, everything will work as expected.

Currently there is no safety net like unit tests for database migrations. Every assumption you make must be tested manually with some dummy test data.


TDD myths: the problems

March 11, 2013

100% code coverage is enough

Code coverage seems to be a bad indicator for the quality of the tests. Take the following code as an example:

public void testEmptySum() {
  assertEquals(0, sum());
}

public void testSumOfMultipleNumbers() {
  assertEquals(5, sum(2, 3));
}

Now take a look at the implementation:

public int sum(int...numbers) {
  if (numbers.length == 0) {
    return 0;
  }
  return 5;
}

Baby steps in TDD could lead you to this implementation. It has 100% code coverage and all tests are green. But the implementation isn’t finished at all. Our experiment where we investigated how much tests communicate the intend of the code showed flaws in metrics like code coverage.

Debugging is not needed

One promise of TDD or tests in general is that you can neglect debugging. Even abandon it. In my experience when a test goes red (especially an integration test) you sometimes need to fire up the debugger. The debugger helps you to step through code and see the actual state of the system at that time. Tests treat code as a black box, an input results in an output. But what happens in between? How much do you want to couple your tests to your actual implementation steps? Do we need the tests to cover this aspect of software development? Maybe something along the lines as shown in Inventing on principle where the computer shows you the immediate steps your code takes could replace debugging but tests alone cannot do it.

Design for testability

A noble goal. But are tests your primary client? No. Other code is. Design for maintainability would be better. You will need to change your code, fix it, introduce new features, etc. Don’t get me wrong: You need tests and you need testability. But how much code do you write specifically for your tests? How much flexibility do you introduce because of your tests? What patterns do you use just because your tests need them? It’s like YAGNI for code exposure for tests. Code specifically written only for tests couples your code to your tests. Only things that need to be coupled should be. Is the choice of the underlying data structure important? Couple it, test it. If it isn’t, don’t expose it, don’t write a getter. Don’t break the information hiding principle if you don’t need to. If you couple your tests too much to your code every little change breaks your tests. This hinders maintenance. The important and difficult design question is: what is important. Test this.

You are faster than without tests

Some TDD practitioners claim that they are faster with TDD than without tests because the bugs and problems in your code will overwhelm you after a certain time. So with a certain level of complexity you are going faster with TDD. But where is this level? In my experience writing code without tests is 3x-4x faster than with TDD. For small applications. There are entire communities where many applications are written without or with only a few tests. But I wouldn’t write a large application without tests but at least my feeling is that in many cases I go much slower. Cases where I feel faster are specification heavy. Like parsing or writing formats, designing an algorithm or implementing a scientific formula. So the call is open on this one. What are your experiences? Do you feel slowed down by TDD?


Thoughts about TDD

December 17, 2012

First a disclaimer: I think tests are a hallmark for professional software development, I like to write tests before the implementation but that’s not always easy or simple (for the difference please refer to Simple made easy). I find it hard to grasp test driven development (TDD) though. The difference between test first and test driven lies in the intention: in both cases tests are written before any implementation code but in TDD the tests drive the design of your implementation.

The problem with opinions of TDD is there are mostly extreme positions: some think “TDD is the (next) holy grail” or the ones which dismissed it. Though reading between the lines there are great discussions about how to do it and what problems arise. Many people (me included) are really trying to get value from TDD. Testing should be fun.
One way in letting the tests drive the way you develop is proposed by Uncle Bob: transformation priority premise. He proposes a list of transformations which introduce new or replace existing constructs like replacing a constant by a variable or adding more logic and gives them a priority. Only if you cannot use a high priority transformation to get the test to pass you look at a transformation with a lower priority.
But how do you determine what you should test next or even which is the first test?
Taking the typical Conway’s game of life kata as an example one thing struck me: I could only get the TDD to work smoothly when I started with the data structure. But why that? Naturally I start with the algorithm (in this case the rules) and write the first test for it. But upon further inspection of the problem and deeper (domain) knowledge it seems the data structure is way more important for solving this kata. So you need to know where the journey goes along beforehand, not every step you will take but the big picture: first the data structure, then the rules in this example. Maybe you should start with the integrations or the functional tests and break them down into units.
What are your experiences using TDD? Do you use or want to use TDD?


Clean Code Developer at your fingertips

March 19, 2012

We are participants in the Clean Code Developer (CCD) movement. This initiative provides a way to perpetually learn, train, reflect and act on the most important topics of today’s software development by formulating a value system and a learning path. The learning path is subdivided in different grades, associated with colors. Every Clean Code Developer progresses continually through the grades, focussing on the principles and practices of the current grade.

If you want a tongue-in-cheek explanation of what the Clean Code Developer is in one sentence: It’s a sight-seeing tour to the most prominent topics every professional software developer should know. But other than your usual tourist rip-off, you can just stay seated and enjoy another round without ever paying anything except attention.

Visualize it

An important aspect of learning and deliberate practice is proper visualization. We invest a lot of work at our workplace, our software and the interaction with our customers to make things visible. When we reflected on our Clean Code Developer practice, we knew that it lacked visualization.

The proposed equipment for a Clean Code Developer is a desktop background picture, a mousepad with an image of all grades at once and some rubber wristbands in the colors of the grades. The wristbands serve as a reminder and a self-assessment tool. The desktop background picture is nice, but only visible if we don’t perform actual work. This let us concentrate on the mousepad.

Duplicate if necessary

The mousepad is the most prominent “advertising” space on the typical work desktop. We want to advertise the content of our current Clean Code Developer grade to ourself. The combination of these two thoughts is not one mousepad, but one for every grade. Imagine six mousepads in the colors of the grades, displaying your currently most important topics right under your fingertips.

We liked the idea so much that we worked on it. The result is a collection of mousepads for every Clean Code Developer to enjoy.

Iterative design

It took us several full cycles of planning, design, layout and proof-reading to have the first version of mousepads produced. It took only a few hours of real-world testing to start the second iteration to further improve the design. Right now, we are on the third iteration. The first iteration had the five colored CCD grades printed on real mousepads. The second iteration added the mousepad for the white grade and a little stand-up display for the initial black grade. The third iteration incorporated the official Clean Code Developer logo, the website URL and improved some details.

Here are some promotional photos of the five first-iteration mousepads:

This slideshow requires JavaScript.

As you can see, we chose to print ultra-slim mousepads to test if it’s feasible to use them stacked all at once (it isn’t, your mileage may vary) or use them even if you aren’t used to mousepads at all (it depends, really). You might want to print the images onto the mousepad you prefer best.

Do it yourself

Yes, you’ve read it right. We are donating the mousepad images back to the community. You can download everything right here:

All documents are bare of any company logo or other advertising and free for your constructive usage. There is only one catch really: the documents are in german language. This might not be apparent at first because we really like the original english technical terms, but some content might need translation for non-german speakers. If you are interested to produce an all-english version, drop us a line.

Acknowledgements

These mousepads wouldn’t exist without the help and inspiration of many co-workers. First of all, the founders of the Clean Code Developer movement, Ralf Westphal and Stefan Lieser, provided all the content of the mousepads. Without their groundbreaking work, we probably wouldn’t have thought of this. The design and production is owed to Hannegret Lindner from the Hannafaktur, a small graphic design agency. We admire her endurance with our iterative approach. And finally, the initial inspiration sparked in a creative discussion with Eric Wolf and Benjamin Trautwein from ABAS Software AG.

It’s your turn now

We are very curious about your story, photo or action still with the mousepads (or the little stand-up display). You can also just share your thoughts about the whole idea or submit an improvement. We’d love to hear from you.


Separate Master Data and Variable Data

January 10, 2012

In the design of data structures or objects, there are two different kinds of data, namely “Master Data” (german: Stammdaten) and “Variable Data” (german: Bewegungsdaten). The first kind, master data, are data fields that will change seldom over time and can sometimes be used to “identify” an object. The second kind are data fields that capture the current value of an object’s aspect, but are expected to change in the future. If you can categorize your data fields in this manner, think about separating them into different objects.

Let me make an actual example. An application we develop has a central instance (the “center”) that distributes situational data to several operation desks, powered by client applications, named the “clients”. Each client instance is registered in the center to enable the supervision and administration of clients. The data for each client is stored in a ClientInformation object that is mapped to a database relation. Let’s have a look at some of the data fields of ClientInformation:

  • int internalIdentifier – the database primary key for the record
  • String type – some type of the client application
  • String instanceName – the given readable denotation of the operation desk
  • String version – the currently installed version of the client application
  • Date connectionDate – the last time this client application established a connection
  • Date lastActionDate – the last time this client application issued an action command (“was active”)

We can start all kinds of (justified) discussion about primitive obsession, too much information at one place and so on, but for this blog entry, only the categorization in master data and variable data is of interest. My opinion on the example is that the first three data fields (internalIdentifier, type and instanceName) are definitely in the master data category. The last two data fields are clearly variable data, while the version field is something in between. My guts tell me to categorize the version as master data, because it won’t change on a daily schedule.

When separating the two categories of data, the ClientInformation object may turn into a reference holder object only. In this case, the ClientInformation holds two references, one to a new ClientMasterData object (holding internalIdentifier, type, instanceName and version) and another one to a new ClientVariableData object (holding connectionDate and lastActionDate).

A less radical modification would be to let the master data remain in the ClientInformation object and only extract the variable data into a new ClientConnectionData object. If a client connects, only the referenced ClientConnectionData object has to change.

If you separate your master data from the variable data, you can very easily concentrate on the variable data for performance optimizations. This is where the data changes will happen and a tuned storage strategy will pay off. The master data should be designed more carefully concerning the type information, so if we really start the discussion about primitive obsession, I would first tend to the master data fields and argue that the type shouldn’t be a String but an Enum and the version should be a more sophisticated Version type. This could be modelled even with a slow object/relational mapper because the data is only written/read once.

The next time you come across one of your data model objects that contain more than two data fields, have a look at their categorization in master and variable data. Perhaps you can see a good reason to split the object.


Deployment with the Play! framework

December 5, 2011

Play! is a great framework for java-base development of modern web applications. Unfortunately, the documentation about deployment options is not really that extensive in certain details. I want to describe a way to automatically build a self-contained zip archive without the source code. The documentation does state that using the standalone web server is preferred so we will use that option.

Our goal is:

  • an artifact with the executable application
  • no sources in the artifact
  • startup script for different platform and environments
  • CI integration with execution of the tests

Fortunately, the play framework makes most of this quite easy if you know some small tricks.

The first very important step towards our goal is embedding the whole Play! framework somewhere in your project directory. I like to put it into lib/play-x.y.z (x.y.z being the framework version). That way you can do perform all neccessary calls to play scripts using relative paths and provide a self-contained artifact which developers or clients may download and execute on their machine. You can also be sure everyone is using the correct (read “same”) framework version.

The next important thing is to write some small start-scripts so you can demo the software easily on any machine with Java installed. Your clients may try it out theirselves if the project policy is open enough. Here are small examples for linux

#!/bin/sh
python lib/play-1.2.3/play run --%demo -Dprecompiled=true

and windows

REM start our app in the "demo" environment
lib\play-1.2.3\play run --%%demo -Dprecompiled=true

The last ingredient to a great deployment and demoing experience is the build script which builds, tests and packages the software together. We do not want to include the sources in the artifact, so there is a bit of work to do. We perform following steps in the script:

  1. delete old artifacts to ensure a clean build
  2. call play to precompile our application
  3. call play to execute all our automatic tests
  4. copy all needed files into our distribution directory ready to be packed together
  5. pack the artifacts into a zip archive

Our sample build script is for the linux shell but you can easily translate it to the scripting environment of your choice, be it apache ant, gradle, windows batch depending on your needs and preference:

#!/bin/sh

rm -r dist
rm -r test-result
rm -r precompiled
python lib/play-1.2.3/play precompile
python lib/play-1.2.3/play auto-test
TARGET=dist/my_project
mkdir -p $TARGET/app
cp -r app/views $TARGET/app
cp -r conf lib modules precompiled public $TARGET
cp programs/my_project* $TARGET
cd dist && zip -r my_project.zip my_project

Now we can hook the project into a continuous integration server like Jenkins and let it archive the build artifact containing an executable installation of our web application. You could grant your client direct access to the artifact, use it for demos and further deployment steps like triggered upload to a staging server or the like.


Separate your code domains

September 12, 2011

When you develop software, you most likely have to think in two target domains at the same time. One domain will be the world of your stakeholder. He might talk about business rules and business processes and business everything, so lets call it the business domain. The other domain is the world you own exclusively with your colleagues, it’s the world of computers, programming languages and coding standards. Lets call it the technical domain. It’s the world where your stakeholders will never follow you.

Mixing the domains

Whenever you create source code, you probably try to solve problems in the business domain with the means of your technical domain – e.g. the programming language you’ve chosen on the hardware platform you anticipate the software to run on. Inevitably, you’ll mix parts of the business domain with parts of the technical domain. The main question is – will it blend? Most of the time, the answer is yes. Like milk in coffee, the parts of two domains will blend into an inseparable mixture. Which isn’t necessarily a bad thing – your solution works just fine.

The hard part comes when you want to reuse your code. It’s like reusing the milk in your coffee, but without the coffee. You’ve probably done it, too (reusing domain-blended code, not extracting the milk from your coffee) and it wasn’t the easy “just copy it over here and everything’s fine” reusability you’ve dreamt of.

Separating the domains

One solution for this task begins by realizing which code belongs to which domain. There isn’t a clear set of rules that you can just check and be sure, but we’ve found two rules of thumb helpful for this decision:

  • If you have a strong business domain data type model in your code (that is, you’ve modelled many classes to directly represent concepts and items from your stakeholder’s world), you can look at a line of code and scan for words from the business domain. If there aren’t any, chances are that you’ve found a line belonging to the technical domain. If you prefer to model your data structures with lists and hashmaps containing strings and integers, you’re mostly out of luck here. Hopefully, you’ve chosen explicit names for your variables, so you don’t end with a line stating map.get(key), when in fact, you’re looking up orders.getFor(orderNumber).
  • For every line of code, you can ask yourself “do I want to write it or do I have to write it?”. This question assumes that you really want to solve the problems of the business domain. Every line of code you just have to write because otherwise, the compiler, the QA department, your colleagues or, ultimately, your coder idol of choice would be disappointed is a line from the technical domain. Every line of code that would only disappoint your stakeholder if it would be missing is a line from the business domain. Most likely, everything that your business-driven tests assert is code from the business domain.

Once you categorized your lines of code into their associated domain, you can increase the reusability of your code by separating these lines of code. Effectively, you try to avoid the blending of the parts, much like in a good latte macchiato. If you achieve a clear separation of the different code parts, chances are that you have come a long way to the anticipated “copy and paste” reusability.

Example one: Local separation

Well, all theory is nice and shiny, but what about the real (coding) life? Here are two examples that show the mechanics of the separation process.

In the first example, we’re given a compressed zip file archive as an InputStream. Our task is to write the archive entries to disk, given that certain rules apply:

public void extractEntriesFrom(InputStream in) {
    ZipInputStream zipStream = new ZipInputStream(in);
    try {
         ZipEntry entry = null;
         while ((entry = zipStream.getNextEntry()) != null) {
             if (rulesApplyFor(entry)) {
                 File newFile = new File(entry.getName());
                 writeEntry(zipStream,
                      getOutputStream(basePath(), newFile));
             }
             zipStream.closeEntry();
         }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOHandler.close(zipStream);
    }
}

This is fairly common code, nothing to be proud of (we can argue that the method signature isn’t as explicit as it should be, the exceptions are poorly handled, etc.), but that’s not the point of this example. Try to focus your attention to the domain of each code line. Is it from the business or the technical domain? Let me refactor the example to a form where the code from both domains is separated, without changing the additional flaws of the code:

public void extractEntriesFrom(InputStream in) {
    ZipInputStream zipStream = new ZipInputStream(in);
    try {
         ZipEntry entry = null;
         while ((entry = zipStream.getNextEntry()) != null) {
             handleEntry(entry, zipStream);
             zipStream.closeEntry();
         }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOHandler.close(zipStream);
    }
}

protected void handleEntry(ZipEntry entry,
        ZipInputStream zipStream) throws IOException {
    if (rulesApplyFor(entry)) {
        File newFile = new File(entry.getName());
        writeEntry(zipStream,
            getOutputStream(basePath(), newFile));
    }
}

In this version of the same code, the method extractEntriesFrom(…) doesn’t know anything about rules or how to write an entry to the disk. Everything that’s left in the method is part of the technical domain – code you have to write in order to perform something useful within the business domain. The new method handleEntry(…) is nearly free of technical domain stuff. Every line in this method depends on the specific use case, given by your business domain.

Example two: Full separation

Technically, the first example only consisted of a simple refactoring (Extract Method). But by separating the code domains, we’ve done the first step of a journey towards code reusability. It begins with a simple refactoring and ends with separated classes in separated packages from two separated project parts, named something like “application” and “framework”. Even if you only find a class named “Tools” or “Utils” in your project, you’ve done intermediate steps towards the goal: Separating your technical domain code from your business domain code in order to reuse the former (because no two businesses are alike).

The next example shows a full separation in action:

WriteTo.file(target).using(new Writing() {
    @Override
    public void writeTo(PrintWriter writer) {
        writer.println("Hello world!");
        writer.println("Hello second line.");
        // more business domain code here
    }
});

Everything other than the first line (and the necessary java boilerplate) is business domain code. In the first line, only the specified target file isn’t technical. Everything related to opening the file output stream, handling exceptions, closing all resources and all the other fancy stuff you want to do when writing to a file is encapsulated in the WriteTo class. The equivalent to the handleEntry(…) method from the first example is the writeTo(…) method of the Writing interface. Everything within this method is purely business domain related code. The best thing is: you can nearly forget about the technical domain when filling out the method, as it is embedded in a reusable “code clamp” providing the proper context.

Conclusion

If you want to write reusable code, consider separating your two major code domains: the technical domain and the business domain. The first step is to be aware of the domains and distinguish between them. Your separation process then can start with simple extractions and finally lead to a purely technical framework where you “only” have to fill in the business domain code. By the way, it’s a variation of the classic “separation of concerns” principle, if you want to read more.


Prepare for the unexpected

August 29, 2011

In most larger projects there are many details which cannot be foreseen by the development team. Specifications turn out to be wrong, incomplete or not precise enough for your implementation to work without further adjustments. New features have to work with production data that may not be available in your development or testing environment.

The result I often observed is that everything works fine in your environment including great automated tests but fails nevertheless when deployed to production systems. Sometimes it is minor differences in the operating system version or configuration, the locale for example, may cause your software to fail. Another common problem is  real production data containing unexpected characters, inconsistencies in the data (sometimes due to bugs) or its sheer size.

What can we do to better prepare for unexpected issues after deployment?

The thing is to expect such issues and to implement certain countermeasures to better cope with them. This may conflict with the KISS principle but usually is worth a bit of added complexity. I want to provide some advice which proved useful for us in the past and may help you in the future too:

  1. Provide good, detailed and persistent debug output for certain features: Once we added a complex rule system which operated on existing domain objects. To check every possible combination of domain object states would have been a ton of work, so we wrote tests for the common cases and difficult cases we could think of. Since the correctness of the functionality was not critical we decided to rather display slightly incorrect information instead of failing and thus breaking the feature for the user. We did however provide extensive and detailed logs whenever our rule system detected a problem.
  2. Make certain parts of your communication interface to third party systems configurable: Often your system communicates to different kinds of users and other systems. Common examples are import/export functionality, web service APIs or text protocols. Even if most of the time details like date and number formats, data separators, line endings, character encoding and so forth are specified it often proves valuable to make them configurable. Many times the specification changes or is incorrect, some communication partner implements the protocol slightly different or a format deviates from your assumption breaking your application. It is great if you can change that with a smile in front of your client and make the whole thing work in minutes instead of walking home frustrated to fix the issues.

The above does not mean building applications with ultimate flexibility and configurability and ignoring automated tests or realistic test environments. It just means that there are typical aspects of an application where you can prepare for otherwise unexpected deviations of theory and praxis.


An advent of unconditional quality code

November 8, 2010

This blog entry invites you to an experiment in code. It’s an experiment that runs four weeks and can be performed secretly even at your workplace. It might improve the way you think about conditional statements in an object oriented programming language. You don’t need any special hardware or setup, just the will to change your coding style a bit each week.

The experiment

Beginning with this year’s advent (a season of the christian religion), you are asked to omit one type of conditional statement each week while programming your regular code. The omitted statements add up, so that you have to spare four different statements in the week before christmas. There is no relation to christmas (or religion) other than it’s a four week period at the end of the year, which is the perfect timeframe for the experiment. And you might buy yourself a little present for christmas if you succeeded at the experiment (idea: a new programming book).

The four stages

For every stage, you are asked to write your normal code without a specific statement. It is perfectly valid to use semantically equivalent code constructs to achieve the same goal. This experiment is even more successful if you are creative and diversified in your variations of the original statement. Remember that the stages add up. On the fourth stage, you are asked to use none of the statements mentioned below.

  • Stage 1 (first week): Don’t use “else”
  • Stage 2 (second week): Don’t use the conditional operator “?:”
  • Stage 3 (third week): Don’t use “switch”
  • Stage 4 (fourth week): Don’t use “if”

You are not asked to change existing code to conform to these restrictions, except you need to work on the lines that contain the prohibited statements. You should apply the rules to your new code rigorously, though.

Explanation of stage 1 (Don’t use “else”)

This rule bans all the different occurrences of the else-branch to your if-statements. It includes every “else if” or “elsif” your programming language might provide. The rationale behind the rule can be found in the Object Calisthenics, rule #2 by Jeff Bay. Here is an explanation of it by Being Cellfish.

Explanation of stage 2 (Don’t use the conditional operator “?:”)

Elvis is dead. Let this resemblance to his hairdo rest for a week, too. It contains a hidden else statement that is restricted since stage 1. Another rationale is that the conditional operator isn’t very easy to read/grasp if stretched out a long line.

Explanation of stage 3 (Don’t use “switch”)

A switch (or case, or select) statement is nothing but a big if-else cascade. It’s handy sometimes, but can be replaced by a lookup table (like a hashmap) virtually everytime . In Martin Fowler’s book “Refactoring”, the switch statement counts as its own code smell category. You should try to live without it for a week. If you need inspiration, try this article on how to avoid it.

Explanation of stage 4 (Don’t use “if”)

Yes, you didn’t misread. There is a whole campaign that tries to avoid the if-statement altogether. Read their website for inspiration on how to survive this week. Maybe you might make new friends with polymorphism and some other implicit conditional structures. Remember, this is a short week just before christmas. Try it, you might be surprised how easy it looks with hindsight.

Ready, steady, go!

This experiment starts with the first advent at Sunday, 28.11.2010. Every stage lasts for one week and adds up to the previous stages. The experiment ends at christmas.

Good luck! And if you’re done with it, drop us a comment with your experiences.


Are programming books overrated?

July 19, 2010

In the last few weeks, we had an internship of a student that just finished academic high school (“Gymnasium”) and is looking forward to take up studies in computer science. He wanted to get in touch with the practical aspects of the career he is about to choose. The programming courses in school merely covered the basics of a programming language (Java) and some UML.

We prepared the student for the internship by feeding him several books we thought were appropriate for his level of knowledge. The books were a beginner’s book about Java (Head First Java), an introduction to unit testing (Pragmatic Unit Testing) and a foundation on clean code programming (Refactoring). Our student read them thoroughly and could make references to the chapters during pair programming sessions.

Retrospective on the books

But one feedback we got from him was that the books alone were nearly useless for his case. If there wouldn’t have been tutorial style pair programming coding sessions and several short lectures , he couldn’t grasp the deeper meaning of the book chapters he read (he suffered from the “blank slate blockade” several times). This came a bit as a surprise for us, as the student was very clever and really into it. It wasn’t the student, it was the books.

But you can’t blame it on “Refactoring”, for example, as this book is an all-time classic filled with really important knowledge. It has to be the medium itself, books are not the ideal source to learn about programming and software development.

Books are part of the academics

There is an old question in our profession. It revolves around if we are more like engineers or artists, craftsmen or scientists. In the core of this question is a uncertainty about the right model of education. Artists and craftsmen prefer more practical training, with apprentice/master relations and personal knowledge transfer. For engineers and scientists, literature and more standardized lectures are better suited. Academic knowledge is transferred during debate, not during exercises.

The duality of our profession

Projecting the feedback of our student onto this question, there seems to be a duality in our profession: Both (or all four if you want) approaches are needed to form a whole. You can’t learn the theory and expect to excel on the job. But pratical experience alone will not suffice to keep up with the pace of our profession. Good books are like afterburners here, you’ll be hurled forward by every page.

Conclusion

If it’s really true that we need to learn our profession both ways at once, pair programming (in the tour guide or backseat driver style) is an essential part of our qualification. And our current university curriculum fails to deliver this part. Students nowadays can team up to program together on an assignment, but that’s not learning from a master (unless one in the team has distinctly more experience than everybody else and is able to transfer it). So I vote to bring more craftsmanship to the academic education, as the books alone won’t cut it.

Your opinion?

What’s your opinion on this topic? Drop us a line about your thoughts.


Follow

Get every new post delivered to your Inbox.

Join 45 other followers