How do I start a project

On my quest to build better software for people and their needs, I am trying to move my current agile project approach toward a more user-centered and outcome-oriented one.

This starts right at the beginning of a project. After getting the go-ahead from the client, I start by meeting the project leads on the client side, the ones who will make decisions and steer the project.
I like to take an assumption-driven, learning-focused approach: I ask questions and clear up my assumptions along the way.
The first questions I have are:

  • who will use the software
  • who will be affected by the software/project
  • what are their goals/expected outcomes, what if they could choose only one
  • what do they expect from the software
  • what will happen if the project stalls or even fails

The people using the software, aka the users, are one of the main focuses of the project. But the people who benefit from the software without using it directly are really important too and should not be neglected. These can be the people responsible for operating the software or managers getting reports from actual users. I keep them in mind so that aspects which are often missed in a user-centered approach are considered.
All these people have expectations of how the software will affect them; some even have goals or need something to come out of the project. These outcomes cover a wide range: from measurable business goals like increasing revenue or retention rate to personal benefits like visibility. It is important to get a rough priority, so I use a narrowing question like ‘what if you could choose only one’.
Besides goals and outcomes, people also have ideas about how they will use the software, in which context and how often.
These are the positive effects of the project and the software, but not everything is sunshine, so I also look at what will happen if the project is delayed, stops or even *shudder* fails. These are the risks that I need to consider and maybe even plan for.
All these questions help me frame the project from the end. I know what goals to aim for and in which direction the journey goes.
This is my first step to build a shared understanding among the project participants, a step to learn what picture they have in mind. My questions and their answers help me to clarify the direction. After that I need to plan the first phase. For this I have to clear my mind and start with a beginner’s mind to find my hidden assumptions. Every assumption that I or others have needs to be called out explicitly. I have to capture it and formulate a corresponding learning step.

But this is a topic for another post…

On the usefulness of volatile memory

I have a strange hobby: raising ants as pets. Not literally pets, because I don’t give them names and they probably don’t even know I exist, but as tended animals in a controlled and restricted environment, the formicarium. This doesn’t sound too strange if you know three additional things about me: I have been fascinated by ants since kindergarten, I play Dwarf Fortress for fun and I’m interested in computers.

My main colony is a Lasius niger nest with about 1500 individuals. Lasius niger, the black garden ant, is a very common and easy-to-raise ant that is small enough not to require too much space but big enough to be observed and tracked.

I could tell you ant facts for hours and hours, but let’s concentrate on the topic of this blog post: An ant colony can be perceived as one big organism. Each individual ant has a clear role in the hierarchy:

  • The queen ant is the sole egg layer. In an established colony, she won’t do anything else. She will be fed, cleaned and protected by her workers. She was born a queen and cannot be replaced. If she dies, the whole colony will slowly fade away because nobody produces new ants anymore. Worker ants don’t know if they still have a queen and don’t care anyway. If a queen dies, she will be fed to the remaining larvae as soon as her scent disappears.
  • The male ants are born, fed and kicked out of the nest, preferably when the young queens fly out to find a mate. They won’t do anything else and have a lifespan of days.
  • The worker ants are all sterile females and do all the work. Literally all the work. They are born workers and spend every moment of their lives contributing to the hive or just sitting around scrounging food. Other workers are too busy to judge them.

But keep in mind that ants only have short-range means of communication (scent, antennae drumming and ground vibrations), so no single ant has a complete overview of the situation. It wouldn’t have the brains to process that information anyway. Ants have a limited capability to remember things but no long-term memory. It isn’t necessary for the single worker ant to store any information. The information is stored in the hive – literally.

If you abstract some details away, you can also perceive and describe an ant colony as a computer or at least as a massively parallel problem solver. The algorithms are ingrained in the worker ants and are executed ruthlessly. The problems are food, water, enemies, cleaning and nurture. If the environment (the problem space) is suitable for the algorithms, the colony will thrive. Otherwise the colony will ultimately fail. An ant colony at the side of a busy road will lose many workers in sudden “enemy” attacks, while a colony in the middle of a meadow might find fewer insects killed by cars. Not a single ant in either colony is aware of these relations. But they have found a way to share information: They mark their position (and therefore their way) with a special scent fluid. They use their scents to label the environment for other ants.

If you have ever encountered an avid user of printed sticker labels, this is what ants do, too: They put a sticker label on everything they come in contact with. If you seem to be food, they label you as food and rush back to the hive, laying out a “food in this direction” lane. If you seem dead and inedible, they mark you as garbage and they or some other ant will pick you up and follow the garbage line (finding the garbage line often requires them to return near their hive, so it seems they mistook the garbage for food). The garbage area is marked with its own scent, preferably at a cliff. My ants love to throw things down a cliff! If you didn’t get the relationship to Dwarf Fortress yet, it should be clear by now. There are countless more examples, but you get the idea. The environment is not only an area, it is the long-term memory of the colony. The colony’s “memory brain” is scent on dirt, stones and sticks. The ant colony has outsourced its complete knowledge about the surroundings into the surroundings themselves.

The ants have invented something I would call an “inverse cloud”. The cloud in IT is a concept of data storage that exists mostly independent of physical location and provides access to that data from virtually anywhere. The ants’ inverse cloud is a concept of data storage that is tightly coupled to a physical location and provides access to that data only if you are in the immediate vicinity. If you remove a physical data storage part in the IT cloud, it gets replaced by other parts that contain mirrored copies of the data. The cloud never forgets. If you remove a physical data storage part in the ants’ inverse cloud, it is forgotten immediately. Ants accept their environment as it is right now and never look back.

Now think about what happens when it rains and all the scents are washed away.

After every rain, the ant colony enters a “new level”, a fresh environment to be discovered and labeled. They probably never grow tired of rediscovering the same cliff again and again. This is what happens to a computer when the power is lost. It loses its working memory. But keep in mind that the colony retains some memories: the hive is underground and often made rain-proof by overlying stones or plants. So the colony remembers that it is currently in the hive, because it always smells like hive. The colony remembers the queen chamber because it always smells like queen. The colony remembers the brood chambers because they always smell like teen spirit (SCNR).

In my formicarium, it never rains. The ants get enough water from drinking troughs, but the marked lanes are never erased. This leads to all sorts of silly situations that the ants don’t even recognize as such. For example, a strong “food here” scent lane exists long after the food is gone. So a lot of enthusiastic workers run around searching in the target area. And remember that the hive smell never fades? My ants have assimilated area after area as “hive” after enough ants have marked it. So now they react excessively to disturbances because they think they are defending the hive – and therefore the queen! – but they are ant miles away from the colony. Even better, a lot of young ants that usually never leave the hive until they are older wander out into the open (“it’s still the hive, just less dark”) and panic as soon as they encounter something that shouldn’t be in a hive. The panic spreads by, you’ve guessed it already, alarm scent, and soon hundreds of battle-ready ants are running around frantically without any one of them knowing why.

So, to speak in computer terms, the main memory never gets erased and the caches are never flushed. A little error can spread like wildfire (like the panic example above) and cause disadvantages like energy consumption without any real gain (no enemy to defend against) or a lot of delicate ants sitting around in plain sight of their predators. The whole system is fragile and erratic and probably wouldn’t survive in the wild. A good measure of rain would remove the odd memories and probably calm the ants, because their hive, the area that needs to be defended at all costs, would shrink again.

I’ve raised neurotic ants in need of a cold shower.

How does this give us any insight into modern computing? Well, my train of thought is this: If modern systems are unable to forget, because memory is cheap and permanent, we might be prone to design software that acts neurotic and hyped up. The ability to forget, to really not remember at all, might be crucial in designing resilient parallel systems. There is the cost of losing valuable information, but the benefit of losing all results of errors seems to match it. So volatile memory might be a nuisance for us programmers, but it also provides a “blank slate” every time the system starts and is the reason for the most important question in IT: “Have you tried turning it off and on again?”
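To make the analogy a bit more concrete, here is a toy sketch in Java of a label store that is able to forget. It is not a model of real ant behaviour; the class ScentMap and all of its method names are made up for illustration. The outdoor labels play the role of volatile memory, the hive labels the part that survives the rain.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the colony's scent labels; every name here is hypothetical. */
final class ScentMap {
    private final Map<String, String> outdoorLabels = new HashMap<>();
    private final Map<String, String> hiveLabels = new HashMap<>();

    /** An ant labels a place outside the nest, e.g. as "food" or "garbage". */
    void markOutdoors(String place, String label) {
        outdoorLabels.put(place, label);
    }

    /** Labels inside the nest are sheltered from the rain. */
    void markInHive(String place, String label) {
        hiveLabels.put(place, label);
    }

    /** Returns the label an ant would smell at this place, or "unknown". */
    String smell(String place) {
        String label = hiveLabels.get(place);
        if (label == null) {
            label = outdoorLabels.get(place);
        }
        return label == null ? "unknown" : label;
    }

    /** Rain wipes the volatile outdoor memory; the hive labels survive. */
    void rain() {
        outdoorLabels.clear();
    }
}
```

If a corner is marked as food with markOutdoors, smell keeps answering "food" long after the food is gone. Only a call to rain() brings it back to "unknown", while anything set with markInHive stays. In the formicarium, rain() is simply never called.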

Systems that rely on “place oriented programming” seem to need regular reset phases where the working memory is cleared and the system goes into the next cycle fresh and rested. We might even call it sleep. And in case you wonder: The sleep of ants is an ongoing topic of research.

Disclaimer: I know that not all ants are as dumb as Lasius niger. Some ants even teach each other facts about their environment. The Wikipedia article mentions some wonderful examples. I had ant colonies with more complex ants and they were wonderful. But right now, as I’m typing this, there is a Lasius niger worker that heaves a piece of wasp husk to the top corner of the formicarium, throws it down (did I mention they love throwing things?) and runs back down to heave it up again, probably because the corner is somehow marked as a garbage dump zone. It has repeated this process at least half a dozen times now. This is what a biological infinite loop looks like. Some ants even parallelize such a loop to exhaustion:

The definition of done

From large to small, from projects to issues, a team needs to define when its work is considered done.
This decision differs from team to team: some have several steps to done, others just one state. Even the words used in your issue tracker reflect your choices: what does ‘fixed’ mean, what is ‘closed’ used for…
Even some practices like test driven development define a state of done: the code is done if all tests are green and it is refactored.

What’s your definition of done?

Let’s take a look at some examples:

  • tests are green and code is refactored
  • QA says ok
  • customer/stakeholder/product owner accepts the issue
  • developer thinks the code reflects the description in the issue
  • a predefined spec, maybe even with an acceptance test, is fulfilled
  • no bugs were found while clicking through
  • the code is merged with the master branch
  • the continuous integration tool has found no errors

The problem with these definitions of done is that they either rely on an external person to accept the work according to their opinion or guideline, or concentrate on some output. But the people needing the software do not want the software for its own sake. They want to reach a goal through the software. The software is a means to an end: their goals. Without defining the goals and needs beforehand, you are either doomed to guess them and are at the mercy of arbitrariness (from your point of view), or you concentrate on some measurable output like code, tests or a completed feature.

Defining what the user wants to do with this new feature or project should be the first thing in a project right after the initial introductions. Who will use the app or the feature? (the intended audience, the users) What do they expect from it? (the benefits) What goal do they want to reach?
With these questions and answers you have a target. After completing the issues or the project you can see if the target has been reached, if the goals are met. It might feel the same as an acceptance process by a stakeholder, but here you know the target beforehand, not afterwards.

Kotlin and null-safety

This week I installed the Android Studio 3.0 preview in preparation for the development of an Android tablet app. Android Studio is based on JetBrains’ IntelliJ IDE. Google recently announced that Android Studio will support Kotlin as an official programming language for Android starting with version 3.0. The language has been designed and developed by JetBrains since 2010.

Kotlin is a language for the Java ecosystem like Scala, Groovy or Clojure that targets both the JVM and Android’s runtime (originally the Dalvik VM, since replaced by ART). It’s a statically typed language and its feature set is similar to that of Scala and C#. Compared to Java it adds things like operator overloading, short syntax for properties, type inference, extension functions and string templates, and it supported lambda expressions before Java 8 did. But it also fixes some of Java’s inconsistencies. For example, it provides a unified type system with the Any type at the top of the type hierarchy and without special raw types. Arrays in Kotlin are invariant, and it uses declaration-site variance instead of use-site variance (see my other blog post for an explanation of these terms: Declaration-site and use-site variance explained). However, in my opinion the most interesting of Kotlin’s features is null-safety.

Null-safety

In Kotlin all types are non-nullable by default. You can’t assign null to a variable declared as

var s: String = "hi"

If you really want to be able to assign null to a variable you have to declare it with a question mark after the type, for example

var s: String? = null

However, if you want to access a member of a nullable reference or call a method on it, you have to perform a null check before doing so. This is enforced by the compiler.

if (s != null) {
    return s.toUpperCase()
}

The compiler keeps track of the null checks before accessing a member of a nullable reference. Without the check the code wouldn’t compile. Kotlin offers some additional operators to simplify these null checks, like the safe navigation operator ?. (also known from Groovy and C# 6.0) or the “Elvis” operator ?:

person?.address?.country?.name
s?.toUpperCase() ?: ""

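With Kotlin’s null-safety feature, NullPointerExceptions largely become a thing of the past. They can still occur, for example when you use the not-null assertion operator !! or call into Java code that returns null unexpectedly.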
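For comparison, here is a rough sketch of how the same chain can be written in plain Java with java.util.Optional. The classes Person, Address and Country are hypothetical and only exist to mirror the person?.address?.country?.name example above; the trailing orElse("") plays the role of the Elvis operator.

```java
import java.util.Optional;

// Hypothetical model classes, invented for this comparison.
final class Country {
    final String name;
    Country(String name) { this.name = name; }
}

final class Address {
    final Country country; // may be null
    Address(Country country) { this.country = country; }
}

final class Person {
    final Address address; // may be null
    Person(Address address) { this.address = address; }
}

final class NullSafeNavigation {
    /** Rough Java counterpart of: person?.address?.country?.name ?: "" */
    static String countryName(Person person) {
        return Optional.ofNullable(person)
                .map(p -> p.address)   // an empty Optional results if address is null
                .map(a -> a.country)
                .map(c -> c.name)
                .orElse("");           // fallback, like the Elvis operator
    }
}
```

The difference is that nothing forces a Java caller to go through Optional: accessing person.address.country.name directly still compiles, whereas the Kotlin compiler rejects the unchecked access.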

Kotlin has a lot more to offer and we haven’t decided yet if we will use it for our new Android app project, but it’s definitely an option to consider.

Do most languages make false promises?

Some years ago I stumbled over this interesting article about C being the most effective programming language and the one making the fewest false promises. Essentially, Damien Katz argues that the simplicity of C and its flaws lead to simple, fast and easy-to-reason-about code.

C is the total package. It is the only language that’s highly productive, extremely fast, has great tooling everywhere, a large community, a highly professional culture, and is truly honest about its tradeoffs.

-Damien Katz about the C Programming language

I am a Java developer most of the time, but I also have reasonable experience in C, C++, C#, Groovy and Python and, to a lesser extent, some other languages. Damien’s article really made me think for quite some time about the languages I have been using. I think he is right in many aspects and makes really good points about the tools and communities around the languages.

After quite some thought I do not completely agree with him.

My take on C

At one time I really liked the simplicity of C. I wrote gtk2hack in my spare time as an exercise and definitely see interoperability and a quick “build, run, debug” cycle as big wins for C. On the other hand, I think that while it has a place in hardware and systems programming, many other applications have completely different requirements.

  • A standardized ABI means nothing to me if I am writing a service with a REST/JSON interface or a standalone GUI application.
  • Portability means nothing to me if the target system(s) are well defined and/or covered by the runtime of choice.
  • Startup times mean nothing to me if the system is only started once every few months and development is still fast because of hot-code replacement or other means.
  • etc.

But I really miss more powerful abstractions and better error handling or resource management features. Data structures and memory management are a lot more painful than in other languages. And this is not (only) about garbage collection!

C++ in particular has been making big steps in the right direction over the last few years. Each new standard release provides additional features that make code more readable and less error-prone. With zero-cost abstractions at the core of language evolution and ease of use as a secondary aim, I really like what will come to C++ in the future. And it has a very professional community, too.

Aims for the C++11 effort:

  • Make C++ a better language for systems programming and library building
  • Make C++ easier to teach and learn

-Bjarne Stroustrup, A Tour of C++

What we can learn from C

Instead of looking down at C and pointing at its flaws we should look at its strengths and our own weaknesses/flaws. All languages and environments I have used to date have their own set of annoyances and gotchas.

Java people should try building simple things and keep a keen eye on dependencies, especially because the ecosystem is so rich and crowded. Also take care of resource management – the garbage collector is only half the deal.

Scala and C++ people should take a look at ABI stability and interoperability in general. Their compile times and “build, run, debug” cycles have much room for improvement, to say the least.

C# people may look at simplicity instead of wildly adding new features, which creates a language without opinion and a plethora of ways to implement the same thing. Either you ban features or you have to know them all to understand the code in a larger project.

Conclusion

My personal answer to the title of this blog: Yes, they make false promises. But they have a lot to offer, too.

So do not settle for the status quo of your language environment or code style of choice. Try to maintain an objective perspective and be aware of the weaknesses of the tools you are using. Most platforms improve over time and sometimes you have to re-evaluate your opinion regarding some technology.

I have preferred C++ to C for some time now and have not looked back yet. But I also constantly try different languages, platforms and frameworks and try to maintain a balanced view. There are often good reasons to choose one over the other for a particular project.

 

C++ modules example

Two weeks back, I blogged about C++ modules, and why we so desperately need them. Now I have actually played with the implementation in Visual Studio 2017, and I want to share my findings.

The Files

My example consists of four files in two “components”, i.e. one library and one executable. The executable only has one file, main.cpp:

import pets;
import std.core;
import std.memory;

int main()
{
  std::unique_ptr<Pet> pet = std::make_unique<Dog>();
  std::cout << "Pet says: "
    << pet->pet() << std::endl;
}

The library consists of three files. First is pet.cpp, which contains the abstract base class for all pets:

import std.core;
module pets.pet;

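export class Pet
{
public:
  // Virtual destructor so that deleting a Dog through a Pet pointer is well-defined.
  virtual ~Pet() = default;
  virtual char const* pet() = 0;
};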

Then there is dog.cpp – our only concrete implementation of that base class (yes, I’m not a cat person).

module pets.dog;
import std.core;
import pets.pet;

export class Dog : public Pet
{
public:
  char const* pet() override;
};

char const* Dog::pet()
{
  return "Woof!";
}

Notice they each define their own submodule. Finally, there is interface.cpp, which just cobbles those submodules together into one single “parent” module:

module pets;

export module pets.pet;
export module pets.dog;

You can get the full source code including the CMake setup at our GitHub repository. I was not able to automate the setup of the standard library path so far, so you will probably want to adjust that.

Discussion

There are no headers at all, which was one of my goals of laying it out like this. I think that alone means an enormous increase in productivity.

The information that was previously in the header files is now in .ifc files that the Microsoft compiler generates automatically from the module definitions.
When trying this out, a couple of things stood out to me:

  • Intellisense does not work with the new keywords yet.
  • The way I used it, interface.cpp needs to be compiled after pet.cpp and dog.cpp, so that the appropriate .ifc file exists. Having an order dependency like that within a single library is a new challenge.
  • I was not able to use the standard lib in the library. That would compile, but not link.
  • Not having to duplicate the function declaration feels very strange in C++.
  • There are a lot of paradigm changes required. For example, include paths are a thing of the past – you will need to configure correct module search paths in the future.
  • We will need to get the naming straight: right now, “modules” is already used in the sense of a “distinct software component”. The new meaning is similar, but still competes with it, since the granularity is no longer so flexible. I have already started using “components” as a new word for the former.

What are your experiences with modules so far? Do you have another way of composing modules? I would really like to hear about it! I think the biggest challenge right now is how to use these new possibilities to improve the design of bigger C++ projects.

Learning about Class Literals after twenty years of Java

I’ve programmed in Java nearly every day for twenty years now. At the beginning of my computer science studies, I was introduced to Java 1.0.x and have since accompanied every version of Java. Our professor made us buy the Java Language Specification on paper (it was quite a large book even back then) and I occasionally read it like you would read an encyclopedia – wading through a lot of already known facts just to discover something surprising and interesting, no matter how small.

With the end of my studies came the end of random research in old books – new books had to be read and understood. It was no longer efficient enough to randomly spend time with something, everything needed to have a clear goal, an outcome that improved my current position. This made me very efficient and quite effective, but I only uncover surprising facts and finds now if work forces me to go there.

An odd customer request

Recently, my work required me to revisit an old acquaintance in the Java API that I never grew fond of: the Runtime.exec() method. One of my customers had a recurring hardware problem that could only be solved by rebooting the machine. My software could already detect the symptoms of the problem and notify the operator, but the next logical step was to enable the software to perform the reboot on its own. The customer was very aware of the risks of such functionality – I consider it a “sabotage feature” – but asked for it anyway. Because the software is written in Java, the reboot should be written in Java, too. And because the target machines are exclusively running on Windows, it was a viable option to implement the feature for that specific platform. Which brings me to Runtime.exec().

A simple solution for the reboot functionality in Java on Windows looks like this:


Runtime.getRuntime().exec("shutdown /r");

With this solution, the user is informed of the imminent reboot and has some time to make a decision. In my case, the reboot needed to be performed as fast as possible to minimize the loss of sensor data. So the reboot command needs to be expanded by a parameter:


Runtime.getRuntime().exec("shutdown /r /t 0");

And this is when the command stops working and politely tells you that you messed up the command line by printing the usage information. Which, of course, you can only see if you drain the output stream of the Process instance that performs the command in the background:


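// requires: import java.util.Scanner;
final Process process = Runtime.getRuntime().exec("shutdown /r /t 0");
// Drain the standard output of the child process and echo it line by line.
try (final Scanner output = new Scanner(process.getInputStream())) {
    while (output.hasNextLine()) {
        System.out.println(output.nextLine());
    }
}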

Draining the output is good practice anyway, because the Process will just block once the buffer is filled up. You will never see this in development, but definitely in production – in the middle of the night on a weekend when you are on vacation.
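A slightly more reusable variant of the draining could look like the following sketch. The class OutputDrain and its method names are made up for illustration. It uses ProcessBuilder.redirectErrorStream(true) so that only one stream needs to be drained, and it assumes that the child process writes text in the platform default charset.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of the original program. */
final class OutputDrain {
    /** Reads the stream to its end and returns all lines. */
    static List<String> drain(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, Charset.defaultCharset()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Starts the command, collects its merged output into sink, returns the exit code. */
    static int runAndCollect(List<String> command, List<String> sink)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // stderr goes into the same stream as stdout
        Process process = builder.start();
        sink.addAll(drain(process.getInputStream()));
        return process.waitFor();
    }
}
```

Because drain() reads until the stream ends, it returns only once the child process has closed its output. That is fine for a short-lived command, but a long-running process would need the draining to happen in a separate thread.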

Modern thinking

With the introduction of the ProcessBuilder in Java 5, which was further improved in Java 7, the Runtime.exec() method became less attractive. ProcessBuilder is a class that improves the experience of creating a correct command line and a lot of other things. So let’s switch to the ProcessBuilder:


final ProcessBuilder builder = new ProcessBuilder(
        "shutdown",
        "/r",
        "/t 0");
final Process process = builder.start();

Didn’t change a thing. The shutdown command still informs us that we don’t have our command line under control. And that’s true: The whole API is notorious for not telling me what is really going on in the background. The ProcessBuilder could be nice and offer a method that returns the String as it is issued to the operating system, but all we get is the ProcessBuilder.command() method that gives us the same command line parts we gave it. The mystery begins with our call of ProcessBuilder.start(), because it delegates to a class called ProcessImpl, and more specifically to the static method ProcessImpl.start().

In this method, Java calls the private constructor of ProcessImpl, which performs a lot of black magic on our command line parts and ultimately disappears into a native method called create() with the actual command line (called cmdstr) as the first parameter. That’s the information I was looking for! In newer Java versions (starting with Java 7), the cmdstr is built in a private static method of ProcessImpl: ProcessImpl.createCommandLine(). If I could write a test program that calls this method directly, I would be able to see the actual command line for myself.

Disclaimer: I’m not an advocate of light-hearted use of the reflection API of Java. But for one-off programs, it’s a very powerful tool that gets the job done.

So let’s write the code to retrieve our actual command line directly from the ProcessImpl.createCommandLine() method:


public static void main(final String[] args) throws Exception {
    final String[] cmd = {
            "shutdown.exe",
            "/r",
            "/t 0",
    };
    final String executablePath = new File(cmd[0]).getPath();

    final Class<?> impl = ClassLoader.getSystemClassLoader().loadClass("java.lang.ProcessImpl");
    final Method myMethod = impl.getDeclaredMethod(
            "createCommandLine",
            new Class[] {
                    ????, // <-- Damn, I don't have any clue what should go here.
                    String.class,
                    String[].class
            });
    myMethod.setAccessible(true);

    final Object result = myMethod.invoke(
            null,
            2,
            executablePath,
            cmd);
    System.out.println(result);
}

The discovery

You probably noticed the “????” entry in the code above. That’s the discovery part I want to tell you about. This is when I met Class Literals in the Java Language Specification in chapter 15.8.2 (go and look it up!). The signature of the createCommandLine method is:


private static String createCommandLine(
        int verificationType,
        final String executablePath,
        final String cmd[])

Note: I didn’t remove the final keyword of verificationType, it isn’t there in the original code for unknown reasons.
When I wrote the reflection code above, it occurred to me that I had never attempted to look up a method that contains a primitive parameter – the int in this case. I didn’t think much about it and went with Integer.class, but that didn’t work. And then, my discovery started:


final Method myMethod = impl.getDeclaredMethod(
        "createCommandLine",
        new Class[] {
                int.class, // <-- Look what I can do!
                String.class,
                String[].class
        });

As stated in the Java Language Specification, every primitive type of Java conceptually “has” a public static field named “class” that contains the Class object for this primitive. We can even type void.class and gain access to the Class object of void. This is clearly written in the language specification and required knowledge for every earnest usage of Java’s reflection capabilities, but I somehow evaded it for twenty years.
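The behaviour can be tried out without touching ProcessImpl at all. The following sketch uses a made-up class ClassLiteralDemo with a private static method describe(int, String) that stands in for createCommandLine; neither name exists in the JDK.

```java
import java.lang.reflect.Method;

/** Hypothetical demo class; describe() stands in for a method with a primitive parameter. */
final class ClassLiteralDemo {
    private static String describe(int code, String text) {
        return code + ":" + text;
    }

    /** Looks the method up with int.class and invokes it reflectively. */
    static Object callDescribe() {
        try {
            Method method = ClassLiteralDemo.class.getDeclaredMethod(
                    "describe", int.class, String.class);
            method.setAccessible(true);
            // The int argument is passed boxed; reflection unboxes it for the call.
            return method.invoke(null, 2, "hello");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Integer.class names the wrapper type, so this lookup finds nothing. */
    static boolean wrapperLookupFails() {
        try {
            ClassLiteralDemo.class.getDeclaredMethod(
                    "describe", Integer.class, String.class);
            return false;
        } catch (NoSuchMethodException e) {
            return true;
        }
    }
}
```

Here callDescribe() returns "2:hello" and wrapperLookupFails() returns true. The class literal int.class is the same object as Integer.TYPE, and void.class is the same object as Void.TYPE, which is how the primitive Class objects were reachable before class literals could be written in source code.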

I love when moments like this happen. I always feel dumb and enlightened at the same time and assume that everybody around me knew this fact for years, it is just me that didn’t get the memo.

The solution

Oh, and before I forget it, the reason the reboot command did not work is the odd way in which Java adds quote characters to the command line. The output above is:


shutdown /r "/t 0"

The extra quotes around /t 0 make the shutdown command reject all parameters and print the usage text instead. A working, if not necessarily intuitive solution is to separate the /t parameter and its value in order to never have spaces in the parameters – this is what provokes Java to try to help you by quoting the whole parameter (and is considered a feature rather than a bug):


final String[] cmd = {
        "shutdown",
        "/r",
        "/t",
        "0",
};

This results in the command line I wanted from the start:


shutdown /r /t 0

And it reboots the computer instantaneously. Try it!
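If the command parts come from somewhere else, the separation can be automated with a small helper. This is only a naive sketch under the assumption that no argument legitimately contains a space (a path like C:\Program Files would be torn apart); the class CommandParts and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits command parts so that none contains whitespace. */
final class CommandParts {
    static List<String> withoutSpaces(String... parts) {
        List<String> result = new ArrayList<>();
        for (String part : parts) {
            // Split each part on runs of whitespace and drop empty tokens.
            for (String token : part.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    result.add(token);
                }
            }
        }
        return result;
    }
}
```

Calling CommandParts.withoutSpaces("shutdown", "/r", "/t 0") yields the four parts shutdown, /r, /t and 0, and the resulting list can be handed to the ProcessBuilder constructor that accepts a List of Strings.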

Your story?

What’s your “damn, I must’ve missed the memo” moment in the programming language you know best?