How I find the source of bugs

December 15, 2014

You know the situation: a user calls or emails you to tell you your program has a problem. When you are lucky he lists some steps he believe he did to reproduce the behaviour. When you are really lucky those steps are the right ones.
In some cases you even got a stacktrace on the logs. High fives all around. You follow the steps the problem shows and you get the exact position in the code where things get wrong. Now is a great time to write a test which executes the steps and shows the bug. If the bug isn’t data dependent you normally can nail it with your test. If it is dependent on the data in the production system you have to find the minimal set of data constraints which causes the problem. The test fails, fixing it should make it green and fix the problem. Done.
But there are cases where the problem is caused not in the last action but sometime before. If the data does not reflect that the problem is buried in layers between like caches, in memory structures or the particular state the system is in.
Here knowledge of the frameworks used or the system in question helps you to trace back the flow of the state and data coming to the position of the stack trace.
Sometimes the steps do not reproduce the behaviour. But most of the time the steps are an indicator for how to reproduce the problem. The stack trace should give you enough information. If it doesn’t, take a look in your log. If this also does not show enough info to find the steps you should improve your log around the position of the strack trace. Wait for the next instance or try your own luck and then you should have enough information to find the real problem.
But what if you have no stack trace? No position to start your hunt? No steps to reproduce? Just a message like: after some days I got an empty transmission happening every minute. After some days. No stack trace. No user actions. Just a log message that says: starting transmission. No error. No further info.
You got nothing. Almost. You got a message with three bits of info: transmission, every minute and empty.
I start with transmission. Where does the system transmit data. Good for the architecture but bad for tracing the transmission is decoupled from the rest of the system by using a message bus. Next.
Every minute. How does the system normally start recurring processes? By quartz, a scheduler for Java. Looking at the configuration of the live system no process is nearly triggered every minute. There must be another place. Searching the code and the log another message indicates a running process: a watchdog. This watchdog puts a message on the bus if it detects a problem. Which is then send by the transmission process. Bingo. But why is it empty?
Now the knowledge about the facilities the system uses comes into play: UMTS. Sometimes the transmission rate is so low that the connection does not transfer any packages. The receiving side records a transmission but gets no data.
Most of the time the problem can be found in your own code but all code has bugs. If you assume after looking at your code that the frameworks you use have a bug. Search the bug database of the framework hopefully it is found there and already fixed.


The four rules of data safety

December 8, 2014

firefly-gunOne of the most dangerous objects to handle is guns. No wonder there are strict and understandable rules how to handle them safely. The Canadians have The Four Firearm ACTS, but for this blog entry, I will cite the Four Rules stated by Captain Ira L. Reeves right before the first world war and restated by Colonel Jeff Cooper:

  1. All guns are always loaded
  2. Never let the muzzle (the business end of a gun) cover anything you are not willing to destroy
  3. Keep you finger off the trigger until your sights are on the target
  4. Be sure of your target and what is beyond it

Even if you accidentally break one rule (for example, rule 3 is often blatantly disobeyed on television), there are still enough precautions in place to keep you (and everybody around you) relatively safe. The rules are meant to instill a certain amount of respect for the gun into the owner so that offloading of responsibility isn’t possible any more, as in the line “I know this gun is unloaded, so it’s probably mighty fun to point it at somebody”.

The guns of software development

In software development, the most dangerous objects we can handle is user-created data or inputs. To mitigate the risks we take when we accept inputs from our users (and most software would be pretty useless otherwise), we have the concept of validation: Before anything other may happen with the data, it needs to be validated, meaning “proved to be free of danger”. Improper input validation is so prevalent in software development that it has its own CWE number (CWE-20) and ranked number 1 on the Top 25 list of “most dangerous programming errors”.

There are some concepts ready to help us tackle this task. The most promising is the Taint checking that treats all input as dangerous and therefore unworthy of further usage unless proven otherwise. Taint checking reminds you of validation, but not how to validate and isn’t available in most programming languages, unfortunately. What we need is a language agnostic set of rules that shape our behaviour in a way that we can’t make the most common mistakes of validation. It seems that gun owners have tried the same and succeeded. So Let’s formulate our Four Rules of data safety, inspired by the gun rules.

Our four rules

  1. All data always contains malicious aspects
  2. Never accept input for modules you cannot afford to have hacked
  3. Leave input data alone until you actually want to use it
  4. Be sure what aspects to validate and how to do it properly

This is just a starting ground for discussion, let’s call it the first version of the Four Rules. Here is my motivation for each rule:

All data always contains malicious aspects

Most users of most systems are in no way harmful. But if they attempt to harm a system, it better stands prepared. Problem is, even with a thorough validation in your current context, there is always the possibility that your attacker plays a rail shot, entering the system here, but causing damage somewhere else. A good example of this practice were images with Javascript code in their metadata. An adequate validation of uploaded images would check for a valid image format, but don’t mind the “dead content” in the meta tags. A browser would later discover the Javascript and execute it – a classic cross-site scripting attack. Never treat any data as fully validated. If you know that your particular code is vulnerable to a specific threat, let’s say a zero value in a variable used as a divisor, validate once more against this threat. This practice is also contained in the idea of Defensive programming.

Never accept input for modules you cannot afford to have hacked

Behind this rule lies a simple truth: Everything that can be hacked will be hacked, given enough time. The only protection against any hack is no access at all (like in “some air between network cable and network card”). If for example you run a certificate authority and absolutely cannot risk losing your secret private key, the machine using this key must not be connected to any network. If your database contains data much too valuable to be “stolen”, the database shouldn’t be accessible directly – and all access need to be validated beforehand. You need to think about a pragmatic compromise for your scenario when following this rule, but you’ve always been warned.

Leave input data alone until you actually want to use it

This was the most difficult rule for me to decide on. The rationale is that even the slightest bit of validation is actually usage of the input. Given enough knowledge about the validation, an attacker could possibly attack the system by abusing weaknesses in the validation itself (see rule 1 for inspiration). Any contact with input data is dangerous, even when it happens with the best intentions. The downside is that you won’t have a stronghold security architecture, where a mighty wall separates the danger zone from friendly territory (or tainted from cleaned data). Remember that even persisting the input data is using it in some form.

Be sure what aspects to validate and how to do it properly

If the time has come to use the input and to validate it right before, you need to think deep about the threats you want to eliminate. Just like with guns, where real bullets (as opposed by “television bullets”) won’t stop at the shooter’s convenience, your validation has consequences beyond an immediate gain of security. A common error is the rushed countermeasure, when you think of a specific threat and immediately try to abolish it. Take your time and think deep! For example, if your users can enter way too high values, it’s of no use to constrain the input field length, because direct web requests and notations like “1E9″ are still possible. But converting an input string to a number to check its value might not be the smartest idea, too. Not long ago, you could crash nearly every application by entering a certain “number of death”. Following this rule requires experience and lots of reading, learning and thinking. And even then, there’s always somebody smarter than you, so ultimately, you should plan your system under the impression of rule 2.

As stated, this is just a starting point to try to formulate rules for data validation that provide a behaviour framework that avoids the most common mistakes and pitfalls. I’m highly interested to hear your thoughts about this topic. Please leave a comment below – but be gentle with the comment validation algorithm.


TANGO device server architecture

December 1, 2014

In my previous post I explained the basics of TANGO and why you probably want to use TANGO for development of a distributed system. Now I would like to explain how to build and design a TANGO device server. There are several best practices and even a comprehensive and ever evolving guide you should definately have a look at.

General Approach

I like to think about TANGO as a thin wrapper around some software object. That means almost all logic and hardware/platform dependent stuff is implemented in the software object which should provide all services the TANGO wrapper needs. Usually you will design an opinionated library supporting your use cases and encapsulating platform, hardware and driver issues and leaves out the stuff you do not need.

TANGO Server - ArchitectureThe opinionated library has no dependencies on TANGO and can be use in different clients independently of TANGO. The TANGO device classes mostly delegate to the library and manage just the TANGO specific things like device state, synchronisation, allowed methods and so on.

TANGO Server Architecture

As said before the TANGO device that makes use of the software component developed with TANGO in mind contains only short methods doing parameter conversion and some TANGO book keeping and life-cycle-management. The design of the server itself is an interesting part in itself though. Often it pays off to implement several devices in one (or more) TANGO servers that perform different tasks and provide special interfaces to their clients.

For example, a multi-axis motor controller could export one device per axis, so clients can move the axes independently in a natural fashion by denoting the respective axis by its device name. Alongside there may be some controller device that provides access to controller functionality not specific to a single axis like a stop all axes command. Sometimes it is helpful to let the axis devices talk to the controller and not directly to the component you are trying to expose via TANGO. That way you can for example synchronise access to the component with TANGO framework functionality on the controller device.

For imaging systems like CCD cameras or other detectors additional devices for image transformations, persisting the images or additional buffering may be a good decision. Such devices can be made largely independent of the actual hardware or imaging system which makes for nice reuse and plug-able functionality.

So it is good to think about the different tasks and aspects your TANGO server should perform and separate them into specialised devices. That should make each device itself clearer and enables specialised service interfaces for different clients. Your devices become easier to use and many parts may be even reusable. We try to standardise on device interfaces every time we identify general abstractions. That makes it much easier for the clients to work with your exposed TANGO devices.


Successful patterns for software developers

November 23, 2014

Be good at everything but better at something

Diversify. Knowing and working in different domains, using different programming languages and tools and handling diverse tasks enriches your creativity as a developer. It keeps your mind flexible and inspires you. Many can agree on that. But on the other hand you should find your niche, your personal joy, your home ground. Different developers have different personalities, different talents and different preferences. People shine when they work with something they like. They are more happy, more productive and more creative. But not all developers in a team or company have the same favorites. Your company should encourage you getting better at your specialty. Work on your strengths.
I like dynamic languages, others prefer static languages. Some developers like to craft desktop software, some web applications. Others concentrate on the UI or controlling robots or sensors. Some ponder over algorithms while others are happy with designing visualizations.
When your team has a common ground but everybody has strengths in different fields everybody can learn from and supplement each other. Synergy is created.

Be support, project manager, admin, …

Developers in our company wear many hats. Sometimes even literally (but that’s another story). If a developer is responsible or takes part in other roles of the project his view is widened. When he talks with customers he not only understands their needs better but also can suggest different ways or solutions. His domain knowledge grows and he can identify pain points. Working with the platforms where the application is deployed and the systems involved also strengthens his grasp of the environment the application lives in. As a project manager he learns to juggle time and scope. He learns to work with constraints and different forces pulling not only the code but the whole project.

Optimize for forgiveness

A user deleted a post accidentally. What do we normally do? We introduce a security question to establish a barrier for deletion. What’s this? We make the UX worse because we think it is the user’s fault and we need to protect him from himself. A better way would be to don’t delete the post but make it invisible. This way he can undo if he wrongly deleted the post. But what about updates? Updating a post overwrites the old content. What if this happened not on purpose? A better way would be to record the last state of the post and undo the update if necessary. Todays computers have so much memory and are so powerful that an application can be allowed to be merciful. It can forgive its users when they did something wrong and want to revert it.

It is not about you

This is one of the hardest lessons to learn. Your work is not about you. Yes, it reflects you in a way. But it does not define you, you define the work. If your code has bugs or you make a mistake and accidentally delete important data, take a deep breath and think. Get help. Plan your steps, discuss with others what to do. Tell them you made a mistake. Everybody does. Do not try to fix it yourself. Do not hide it. Focus on the problem at hand. Not your shame or feeling of guilt. (This goes hand in hand with optimizing for forgiveness)

Talk and listen

Feedback. One of the most important things in software development. Talk to your users. Listen. Again and again. Often talking 10 minutes about a task or problem can result in hours of work saved. The bug you couldn’t reproduce? It was in a different area. The feature you thought was nearly impossible? With a slight change it is much easier. This also goes for everything else. Deployments and commits. Design decisions. You are the expert in your domain, the customer in his. Act accordingly. Tell him the options he has and what will be the consequences. Don’t let him guess. He shouldn’t do your work.


Programming mistakes of my past self – Part I

November 16, 2014

One thing that fascinates me about software development is the fact that we aren’t done yet as a profession, we just barely started. New paradigms, programming languages and concepts, even new technologies are invented, discovered and refined at every moment. Add a personal journey of skill acquisition and improvement, and it’s enough for a fulfilled professional life. But as a Clean Code Developer, I often pause and reflect – on me, my work and why I do it in this particular way. I’m aware that I’m on a perpetuating process of self-improvement, always better than yesterday (hopefully), but never as good as I want to be. Reflecting the changes and transformations I made in the past helps me to understand changes in the present or even in the future. So this is a blog entry about mistakes, probably embarrassing ones, that I really made and didn’t think anything was wrong at some point in my professional career.

But before I make my confessions, please keep this disclaimer in mind: Most of these mistakes, I made in the ancient days of my schooling and early steps. I’ve come a long way since, read a ton of books, wrote several big software systems and switched programming languages several times. I didn’t write this to make fun of my past self, but to gather (and provide) insight into the mind of an apprentice and how he rationalizes aspects of software development that seem out of place or even funny to more experienced developers. The purpose is to be more aware of more recent sketchy rationalizations, not to laugh about how stupid I was – even if I’ve probably been stupid.

No indentation

Origin:
Yes, really. I started my professional/academic career with strictly left-aligned code and no sense of the value of indentation. It just seemed meaningless “additional effort” to me. Let me explain why while you laugh. I started my career with BASIC, and after years of tinkering around and finally reading books about it (this was long before the world wide web, mind you!), discovered that I could circumvent the limitations of the runtime by directly PEEKing and POKEing to the memory. Essentially, I began to write machine code in BASIC. As soon as I had this figured out, my language of choice was now assembler, because why drill holes into BASIC every time I wanted to do something meaningful (like changing the VGA palette mid-frame to have more than 256 colours available). Years of assembler programming followed. Assembler isn’t like any other programming language, it’s more of a halfway de-scrambled machine code and as such has no higher concepts like loops or if-else statements. This is more or less like every program in assembler looks like:

push    20h
call    401010
add     esp,4
xor     eax,eax
ret

You’ve probably already guessed where this leads to: In assembler, all scoping/blocking of code has to be done by the programmer in his head. There was no value in indentation because there was no hierarchy of statements and everything was on the same level of (nearly non-existent) abstraction. I got used to the level of attention you have to maintain to keep track of your code. So when I started programming in Java during my study, the hard nut to crack was object orientation, not the simple task of understanding code without indentation.

Mistake:
It didn’t occur to me that my code was hard to understand for other readers (e.g. my tutor) without proper formatting. Code was cryptic and hard to understand, so what? I didn’t regard obfuscation as a problem, but was proud to be “one of the few” who could actually understand what was going on.

Remedy:
I’ve come a long way since. Nearly two decades in application development taught me to write, structure and format my code as clearly as I can – and always add some extra effort into clarity. Good code is readable, and readable code is understandable by virtually everybody, not only a chosen few. Indentation is a very important tool to lead the reader (and yourself) through your program. It’s no coincidence that the first rule of the Object Calisthenics deals with indentation.

Single return functions

Origin:
This one also roots in my first years of programming BASIC and assembler. In assembler, you never think about anything other than one clear exit from a subroutine, because you need to restore all register context before the jump back by hand. In BASIC, there was that lingering danger that you couldn’t break free from a loop or a routine too early because the interpreter would mess up its internal context. If you were inside a loop and left the subroutine by “Exit Sub” command, the loop context was still present and ready to bite you.
In short, everything else but a clearly cut exit strategy from a function was dangerous and error prone. The additional code infrastructure needed to maintain such a programming style, e.g. additional local variables and blown-up conditionals were necessary costs in my book. To be honest, I didn’t even think about any alternative, because in my reality, you needed to care about your stack content even in BASIC.

Mistake:
I didn’t think about ways to minimize my effort in micromanaging the computer. In my defense, this would have totally alienated assembler programming for me. Assembler is all about micromanagement and CPU nursery. It didn’t occur to me that my value system (stack handling is coder’s work) limited my ability to express the goals of a function (instead of its minutiae).

Remedy:
Great recapulations of most arguments against single return functions can be found in the C2 wiki and various other internet sources like this great question on stackexchange.com
I dropped this style quickly when finally wrapping my head around the fact that the Java VM handles all memory including the stack for me and doesn’t want me to interfere (or “optimize”). Once freed from micromanagement issues, you can adapt your stylistic choice to the matter at hand and write code that supports your problem domain instead of adhering to limitations from the technical domain.

Special naming conventions for interfaces

Origin:
One of the hardest topics in object-oriented programming for me was the concept of “abstract” classes or even those mysterious interfaces. What’s the use of an interface anyway when it doesn’t even contain code? It seemed like additional work without benefit for me. And with a programming style that stores everything in primitive data types (where else?), interfaces just don’t cut it. So I adopted a style that marks everything dubious with extra prefixes to move it out of the way when it comes to naming. Let’s say I want to program a class that represents a user (class User), but are somehow forced or tempted to create an interface for it? Just name it IUser! It’s such a no-brainer that interfaces didn’t require any effort in their creation. And while we are at it, let’s name all abstract classes AbstractXYZ, because that’s much better than the alternative – to name the concrete class XYZImpl (disclaimer: both options are flawed). Cool, a new concept in Java 5 were Enums, let’s prefix them with “big E” so we can always tell them apart. And while we are at it, every exception should end with… well, I think you can guess.

Mistake:
I’m happy to announce that I never fell in the Hungarian notation trap. But that doesn’t serve as an excuse for the type name prefix mess I maintained longer than I’m willing to admit. The mistake was to overburden type names with implementation details and let the technical domain leak into my type system.

Remedy:
One day, I decided to cut it out and began to eliminate prefixes and suffixes in type names. It started a process of discoveries, insights and new possibilities much like in the case of single return functions. And the process isn’t even finished yet. Just recently, Kevlin Henney came along and gave me another push forward on my journey to really good type names (Seven ineffective coding habits of many programmers). As a reminder: The compiler doesn’t care about your names. Most readers don’t care about the actual technical realization of a type as long as they know what the type is for in the problem domain. Even you yourself don’t care about prefixes in the name once the name-finding phase is past. Let me phrase this facetious: “Equal naming rules for all types of types!”

Only the beginning

These three examples are only the beginning of a whole list of mistakes, misconceptions and plain falsities of mine. I hope you’ll see the intention behind the confession, not only the amusing part of self-revelation. Try it on yourself! Think back to your early days as a software developer and write down the funny things you worked with and were proud of. Then try to fit them into the scheme: How did you start doing it? Why exactly was it a mistake (in the long run)? And what was the aspect that drove you away from it? How did you fix your mistake?

I would love to hear and learn from your mistakes, too.


MSBuild Basics

November 11, 2014

MSBuild is Microsoft’s build system for Visual Studio. Visual Studio project files (*.csproj, *.vbproj) do not only describe the project structure, but are also build scripts for MSBuild. They’re executed when you click the run button in the IDE, but they can also be called via the MSBuild command line utility.

> MSBuild.exe Project.csproj

These project files / build scripts are in XML format, comparable to Ant scripts in the Java land.

Edit project files

You can edit these files in any text editor, of course. But if you want to edit them within Visual Studio, you have to unload the project first:

  • Right click on the project in the Solution Explorer -> Unload Project
  • Right click on the project in the Solution Explorer -> Edit MyProject.csproj

After you’re done editing you can reload the project again via the context menu.

Targets and tasks

The concepts of MSBuild are comparable to many other build systems: a build script contains a set of named targets, and each target consists of a sequence of task calls.

A project can have one or more default targets, referenced by the DefaultTargets attribute of the Project root element:

<Project DefaultTargets="Build" ...>

Multiple targets can be separated by semicolons.

Targets are declared via Target tags containig the task calls:

  <Target Name="Clean">
    <Delete Files="xyz.tmp" />
    ...
  </Target>

MSBuild comes with a set of common tasks, such as Message, Copy, Delete, Exec, …

If you need more tasks you should have a look at these community provided task collections:

Both are available as NuGet packages and can be checked into your code repository alongside the project for self-containment. For the Extension Pack you have to set the ExtensionTasksPath property correctly before importing the tasks, for example:

<PropertyGroup>
  <ExtensionTasksPath Condition="'$(ExtensionTasksPath)' == ''">$(MSBuildProjectDirectory)\packages\MSBuild.Extension.Pack.1.5.0\tools\net40</ExtensionTasksPath>
</PropertyGroup>

<Import Project="$(ExtensionTasksPath)MSBuild.ExtensionPack.tasks">

Properties

Properties are defined within PropertyGroup tags, containing one or many property tags. The names of these tags are the property names and the tag contents are the property values. Properties are referenced via $(PropertyName). A property definition can have an optional Condition attribute, which determines whether a property should be set or not. The condition ‘$(PropertyName)’ == ”, for example, checks if a property is not yet set.

Here’s an example build target that uses the ZIP compression task from the Extension Pack and some properties to create a ZIP file artifact from the build results:

<Target Name="AfterBuild">
  <MSBuild.ExtensionPack.Compression.Zip TaskAction="Create" CompressPath="$(OutputPath)" ZipFileName="bin\$(ProjectName)-$(BuildNumber).zip" />
</Target>

You can also set property values from the outside via the MSBuild call:

> MSBuild.exe /t:Build /p:Configuration=Release;BuildNumber=1234 Project.csproj

  • The /t switch determines which targets to run. Multiple targets can be separated by semicolons.
  • The /p switch sets properties in the form of PropertyName=value, also separated by semicolons.

This way you can pass environment variables like $BUILD_NUMBER from your Continuous Integration system (e.g. Jenkins) to your build script:

> MSBuild.exe /t:Build /p:Configuration=Release;BuildNumber=$BUILD_NUMBER Project.csproj

Now you could use the MSBuild.ExtensionPack.Framework.AssemblyInfo task to write the $(BuildNumber) property into your AssemblyInfo file.


TANGO – Making equipment remotely controllable

November 3, 2014

Usually hardwareTango_logo vendors ship some end user application for Microsoft Windows and drivers for their hardware. Sometimes there are generic application like coriander for firewire cameras. While this is often enough most of these solutions are not remotely controllable. Some of our clients use multiple devices and equipment to conduct their experiments which must be orchestrated to achieve the desired results. This is where TANGO – an open source software (OSS) control system framework – comes into play.

Most of the time hardware also can be controlled using a standardized or proprietary protocol and/or a vendor library. TANGO makes it easy to expose the desired functionality of the hardware through a well-defined and explorable interface consisting of attributes and commands. Such an interface to hardware –  or a logical piece of equipment completely realised in software – is called a device in TANGO terms.

Devices are available over the (intra)net and can be controlled manually or using various scripting systems. Integrating your hardware as TANGO devices into the control system opens up a lot of possibilites in using and monitoring your equipment efficiently and comfortably using TANGO clients. There are a lot of bindings for TANGO devices if you do not want to program your own TANGO client in C++, Java or Python, for example LabVIEW, Matlab, IGOR pro, Panorama and WinCC OA.

So if you have the need to control several pieces of hardware at once have a look at the TANGO framework. It features

  • network transparency
  • platform-indepence (Windows, Linux, Mac OS X etc.) and -interoperability
  • cross-language support(C++, Java and Python)
  • a rich set of tools and frameworks

There is a vivid community around TANGO and many drivers for different types of equipment already exist as open source projects for different types of cameras, a plethora of motion controllers and so on. I will provide a deeper look at the concepts with code examples and guidelines building for TANGO devices in future posts.


Follow

Get every new post delivered to your Inbox.

Join 92 other followers