Build your own performance and log monitoring solution

September 29, 2014

Tips for a better performance like using views or reducing the complexity of your algorithms only help when you know where you can improve performance. So to find the places where speed is needed you can build extensive load and performance tests or even better you can monitor the performance of your app in production. Many solutions exists which give you varying levels of detail (the costs of them also varies). But often a simple solution is enough. We start with a typical (CRUD) web app and build a monitor and a tool to analyse response times. The goal is to see what are the ten worst performing queries of the last week. We want an answer to this question continuously and maybe ask additional questions like what were the details of the responses like user/role, URL, max/min, views or persistence. A web app built on Rails gives us a lot of measurements we need out of the box but how do we extract this information from the logs?
Typical log entries from an app running on JRuby on Rails 4 using Tomcat looks like this:

Sep 11, 2014 7:05:29 AM org.apache.catalina.core.ApplicationContext log
Information: I, [2014-09-11T07:05:29.455000 #1234]  INFO -- : [19e15e24-a023-4a33-9a60-8474b61c95fb] Started GET "/my-app/" for 127.0.0.1 at 2014-09-11 07:05:29 +0200

...

Sep 11, 2014 7:05:29 AM org.apache.catalina.core.ApplicationContext log
Information: I, [2014-09-11T07:05:29.501000 #1234]  INFO -- : [19e15e24-a023-4a33-9a60-8474b61c95fb] Completed 200 OK in 46ms (Views: 15.0ms | ActiveRecord: 0.0ms)

Important to identify log entries for the same request is the request identifier, in our case 19e15e24-a023-4a33-9a60-8474b61c95fb. To see this in the log you need to add the following line to your config/environments/production.rb:

config.log_tags = [ :uuid ]

Now we could parse the logs manually and store them in a database. That’s what we do but we use some tools from the open source community to help us. Logstash is a tool to collect, parse and store logs and events. It reads the logs via so called inputs, parses, aggregates and filters with the help of filters and stores by outputs. Since logstash is by Elasticsearch – the company – we use elasticsearch – the product – as our database. Elasticsearch is a powerful search and analytcs platform. Think: a REST frontend to Lucene – only much better.

So first we need a way to read in our log files. Logstash stores its config in logstash.conf and reads file with the file input:

input {
  file {
    path => "/path/to/logs/localhost.2014-09-*.log"
    # uncomment these lines if you want to reread the logs
    # start_position => "beginning"
    # sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^%{MONTH}"
      what => "previous"
      negate => true
    }
  }
}

There are some interesting things to note here. We use wildcards to match the desired input files. If we want to reread one or more of the log files we need to tell logstash to start from the beginning of the file and forget that the file was already read. Logstash remembers the position and the last time the file was read in a sincedb_path to ignore that we just specify /dev/null as a path. Inputs (and outputs) can have codecs. Here we join the lines in the log which do not start with a month. This helps us to record stack traces or multiline log entries as one event.
Add an output to stdout to the config file:

output {
  stdout {
    codec => rubydebug{}
  }
}

Start logstash with

logstash -f logstash.conf --verbose

and you should see your log entries as json output with the line in the field message.
To analyse the events we need to categorise them or tag them for this we use the grep filter:

filter {
  grep {
    add_tag => ["request_started"]
    match => ["message", "Information: .* Started .*"]
    drop => false
  }
}

Grep normally drops all non matching events, so we need to pass drop => false. This filter adds a tag to all events with the message field matching our regexp. We can add filters for matching the completed and error events accordingly:

  grep {
    add_tag => ["request_completed"]
    match => ["message", "Information: .* Completed .*"]
    drop => false
  }
  grep {
    add_tag => ["error"]
    match => ["message", "\:\:Error"]
    drop => false
  }

Now we know which event starts and which ends a request but how do we extract the duration and the request id? For this logstash has a filter named grok. One of the more powerful filters it can extract information and store them into fields via regexps. Furthermore it comes with predefined expressions for common things like timestamps, ip addresses, numbers, log levels and much more. Take a look at the source to see a full list. Since these patterns can be complex there’s a handy little tool with which you can test your patterns called grok debug.
If we want to extract the URL from the started event we could use:

grok {
    match => ["message", ".* \[%{TIMESTAMP_ISO8601:timestamp} \#%{NUMBER:}\].*%{LOGLEVEL:level} .* \[%{UUID:request_id}\] Started %{WORD:method} \"%{URIPATHPARAM:uri}\" for %{IP:} at %{GREEDYDATA:}"]
 }

For the duration of the completed event it looks like:

grok {
    match => ["message", ".* \[%{TIMESTAMP_ISO8601:timestamp} \#%{NUMBER:}\].*%{LOGLEVEL:level} .* \[%{UUID:request_id}\] Completed %{NUMBER:http_code} %{GREEDYDATA:http_code_verbose} in %{NUMBER:duration:float}ms (\((Views: %{NUMBER:duration_views:float}ms \| )?ActiveRecord: %{NUMBER:duration_active_record:float}ms\))?"]
 }

Grok patterns are inside of %{} like %{NUMBER:duration:float} where NUMBER is the name of the pattern, duration is the optional field and float the data type. As of this writing grok only supports floats or integers as data types, everything else is stored as string.
Storing the events in elasticsearch is straightforward replace or add to your stdout output an elasticsearch output:

output {
  elasticsearch {
    protocol => "http"
    host => localhost
    index => myindex
  }
}

Looking at the events you see that start events contain the URL and the completed events the duration. For analysing it would be easier to have them in one place. But the default filters and codecs do not support this. Fortunately it is easy to develop your own custom filter. Since logstash is written in JRuby, all you need to do is write a Ruby class that implements register and filter:

require "logstash/filters/base"
require "logstash/namespace"

class LogStash::Filters::Transport < LogStash::Filters::Base

  # Setting the config_name here is required. This is how you
  # configure this filter from your logstash config.
  #
  # filter {
  #   transport { ... }
  # }
  config_name "transport"
  milestone 1

  def initialize(config = {})
    super
    @threadsafe = false
    @running = Hash.new
  end 

  def register
    # nothing to do
  end

  def filter(event)
    if event["tags"].include? 'request_started'
      @running[event["request_id"]] = event["uri"]
    end
    if event["tags"].include? 'request_completed'
      event["uri"] = @running.delete event["request_id"]
    end
  end
end

We name the class and config name ‘transport’ and declare it as milestone 1 (since it is a new plugin). In the filter method we remember the URL for each request and store it in the completed event. Insert this into a file named transport.rb in logstash/filters and call logstash with the path to the parent of the logstash dir.

logstash --pluginpath . -f logstash.conf --verbose

All our events are now in elasticsearch point your browser to http://localhost:9200/_search?pretty or where your elasticsearch is running and it should return the first few events. You can test some queries like _search?q=tags:request_completed (to see the completed events) or _search?q=duration:[1000 TO *] to get the events with a duration of 1000 ms and more. Now to the questions we want to be answered: what are the worst top ten response times by URL? For this we need to group the events by URL (field uri) and calculate the average duration:

curl -XPOST 'http://localhost:9200/_search?pretty' -d '
{
  "size":0,
  "query": {
    "query_string": {
      "query": "tags:request_completed AND timestamp:[7d/d TO *]"
     }
  },
  "aggs": {
    "group_by_uri": {
      "terms": {
        "field": "uri.raw",
        "min_doc_count": 1,
        "size":10,
        "order": {
          "avg_per_uri": "desc"
        }
      },
      "aggs": {
        "avg_per_uri": {
          "avg": {"field": "duration"}
        }
      }
    }
  }
}'

See that we use uri.raw to get the whole URL. Elasticsearch separates the URL by the /, so grouping by uri would mean grouping by every part of the path. now-7d/d means 7 days ago. All groups of events are included but if we want to limit our aggregation to groups with a minimum size we need to alter min_doc_count. Now we have an answer but it is pretty unreadable. Why not have a website with a list?
Since we don’t need a whole web app we could just use Angular and the elasticsearch JavaScript API to write a small page. This page displays the top ten list and when you click on one it lists all events for the corresponding URL.

<!DOCTYPE html>
<html>
  <head>
    <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/angularjs/1.2.11/angular.min.js"></script>
    <script type="text/javascript" src="elasticsearch-js/elasticsearch.angular.js"></script>
    <script type="text/javascript" src="monitoring.js"></script>

    <link rel="stylesheet" href="http://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap.min.css">
    <link rel="stylesheet" href="http://netdna.bootstrapcdn.com/bootstrap/3.1.0/css/bootstrap-theme.min.css">
  </head>
  <body>
    <div ng-app="Monitoring" class="container" ng-controller="Monitoring">
      <div class="col-md-6">
        <h2>Last week's top ten slowest requests</h2>
        <table class="table">
          <thead>
            <tr>
              <th>URL</th>
              <th>Average Response Time (ms)</th>
            </tr>
          </thead>
          <tbody>
            <tr ng-repeat="request in top_slow track by $index">
              <td><a href ng-click="details_of(request.key)">{{request.key}}</a></td>
              <td>{{request.avg_per_uri.value}}</td>
            </tr>
          </tbody>
        </table>
      </div>
      <div class="col-md-6">
        <h3>Details</h3>
        <table class="table">
          <thead>
            <tr>
              <th>Logs</th>
            </tr>
          </thead>
          <tbody>
            <tr ng-repeat="line in details track by $index">
              <td>{{line._source.message}}</td>
            </tr>
          </tbody>
        </table>
      </div>
    </div>
  </body>
</html>

And the corresponding Angular app:

var module = angular.module("Monitoring", ['elasticsearch']);

module.service('client', function (esFactory) {
  return esFactory({
    host: 'localhost:9200'
  });
});

module.controller('Monitoring', ['$scope', 'client', function ($scope, client) {
var indexName = 'myindex';
client.search({
index: indexName,
body: {
  "size":0,
  "query": {
    "query_string": {
      "query": "tags:request_completed AND timestamp:[now-1d/d TO *]"
     }
  },
  "aggs": {
    "group_by_uri": {
      "terms": {
        "field": "uri.raw",
        "min_doc_count": 1,
        "size":10,
        "order": {
          "avg_per_uri": "desc"
        }
      },
      "aggs": {
        "avg_per_uri": {
          "avg": {"field": "duration"}
        }
      }
    }
  }
}
}, function(error, response) {
  $scope.top_slow = response.aggregations.group_by_uri.buckets;
});

$scope.details_of = function(url) {
client.search({
index: indexName,
body: {
  "size": 100,
  "sort": [
    { "timestamp": "asc" }
  ],
  "query": {
    "query_string": {
      "query": 'timestamp:[now-1d/d TO *] AND uri:"' + url + '"'
     }
  },
}
}, function(error, response) {
  $scope.details = response.hits.hits;
});
};
}]);

This is just a start. Now we could filter out the errors, combine logs from different sources or write visualisations with d3. At least we see where performance problems lie and take further steps at the right places.


Configuring your Java webapp

September 8, 2014

There are several ways to configure your Java Servlet-based webapp with values for deployment-specific things like the database connection or directories for data and logs. Let us take a look at the alternatives and their benefits and drawbacks.

web.xml

The deployment descriptor (web.xml) resides inside your WAR file. You can specify init parameters available using the ServletContext.
web.xml

<context-param>
    <param-name>LogDirectory</param-name>
    <param-value>/myapp/logs</param-value>
</context-param>

Accessing the parameter in your Servlet:

String logDirectory = getServletContext().getInitParameter("LogDirectory");
// do something with it

The nice thing about this solution is the self-containment of your packaged application. The price is building a customized web.xml/WAR for each deployment instance.

Environment variables

Another possibility is to pass environment variables to your servlet container at startup, e.g. using JAVA_OPTS in the case of Apache Tomcat.
tomcat.conf

...
JAVA_OPTS="-DLogDirectory=/myapp/logs"
...

They can be easily accessed using

    System.getProperty("LogDirectory");

This is very easy to employ but has several drawbacks:

  • you have to mess with the configuration of your servlet container/host to set the variables
  • they are valid for the whole servlet container, possibly interferring with other webapps or the container itself
  • the settings are harder to find than in one file that you deliver with your webapp
  • need of server restart to change the values

context.xml

Using context.xml and JNDI is our preferred way of configuring our webapps. You can ship a default context.xml in the META-INF directory of your WAR and easily configure resources and beans:

<Context>
    <Environment name="LogDirectory" value="/myapp/logs" type="java.lang.String" />
    <!-- Development DB -->
    <Resource name="jdbc/devdb" auth="Container" type="javax.sql.DataSource"
               maxActive="100" maxIdle="30" maxWait="-1"
               username="sa" password="" driverClassName="org.h2.Driver"
               url="jdbc:h2:mem:devDB;mode=Oracle"/>
</Context>

A context.xml outside of your WAR has to be copied in the context configuration directory of your servlet container, e.g.:

cp context.xml /etc/tomcat7/Catalina/localhost/myapp.xml

You can then access the configuration items using JNDI:

Context ctx = (Context) new InitialContext().lookup("java:comp/env");
String logDirectory = (String) ctx.lookup("LogDirectory");
// do something

You can of course use context-params and the ServletContext to retrieve simple String parameters stored in the context.xml instead of web.xml, too.

The name of the context file must match the name of the deployed application. That way we can deploy the same WAR on several target machines and configure the applications separately. The context.xml not only contains the JNDI datasources (which is very common) but also configuration parameters that may change for each target system.


Bit-fiddling is possible in Java

August 26, 2014

We have a service interface for Modbus devices that we can use remotely from Java. Modbus supports only very basic data types like single bits and 16-bit words. Our service interface provides the contents of a 16-bit input or holding register as a Java integer.

Often one 16-bit register is not enough to solve a problem in your domain, like representing a temperature as floating point number. A common solution is to combine two 16-bit registers and interpret their contents as a floating point number.

So the question is how to combine the bits of the two int values and convert them to a float in Java. There are at least two possiblities I want to show you.

The “old-fashioned” way includes bit-shifting and bit-wise operators which you actually can use in Java despite of the major flaw regarding primitive types: there are no unsigned data types; even byte is signed!

public static float floatFrom(int mostSignificat16Bits, int leastSignificant16Bits) {
    int bits = (mostSignificat16Bits << 16) | leastSignificant16Bits;
    return Float.intBitsToFloat(bits);
}

As seemingly more modern way is using java.nio.ByteBuffer:

public static float nioFloatFrom(int mostSignificat16Bits, int leastSignificant16Bits) {
    final ByteBuffer buf = ByteBuffer.allocate(4).putShort((short) mostSignificat16Bits).putShort((short) leastSignificant16Bits);
    buf.rewind(); // rewind to the beginning of the buffer, so it can be read!
    return buf.getFloat();
}

The second method seems superior when working with many values because you can fill the buffer conveniently in one phase and extract the float values conveniently by subsequent calls to getFloat().

Conclusion

Even if it is not one of Java’s strengths you actually can work on the bit- and byte-level and perform low level tasks. Do not let the lack of unsigned data types scare you; signedness is irrelevant for the bits in memory.


Dynamic addition and removal of collection-bound items in an HTML form with Angular.js and Rails

August 19, 2014

A common pattern in one of our web applications is the management of a list of items in a web form where the user can add, remove and edit multiple items and finally submit the data:

form-fields

The basic skeleton for this type of functionality is very simple with Angular.js. We have an Angular controller with an “items” array:

angular.module('example', [])
  .controller('ItemController', ['$scope', function($scope) {
    $scope.items = [];
  }]);

And we have an HTML form bound to our Angular controller:

<form ... ng-app="example" ng-controller="ItemController"> 
  <table>
    <tr>
      <th></th>
      <th>Name</th>
      <th>Value</th>
    </tr>
    <tr ng-repeat="item in items track by $index">
      <td><span class="remove-button" ng-click="items.splice($index, 1)"></span></td>
      <td><input type="text" ng-model="item.name"></td>
      <td><input type="text" ng-model="item.value"></td>
    </tr>
    <tr>
      <td colspan="3">
        <span class="add-button" ng-click="items.push({})"></span>
      </td>
    </tr>
  </table>
  <!-- ... submit button etc. -->
</form>

The input fields for each item are placed in a table row, together with a remove button per row. At the end of the table there is an add button.

How do we connect this with a Rails model, so that existing items are filled into the form, and items are created, updated and deleted on submit?

First you have to transform the existing Ruby objects of your has-many association (in this example @foo.items) into JavaScript objects by converting them to JSON and assigning them to a variable:

<%= javascript_tag do %>
  var items = <%= escape_javascript @foo.items.to_json.html_safe %>;
<% end %>

Bring this data into your Angular controller scope by assigning it to a property of $scope:

.controller('ItemController', ['$scope', function($scope) {
  $scope.items = items;
}]);

Name the input fields according to Rails conventions and use the $index variable from the “ng-repeat” directive to provide the correct index value. You also need a hidden input field for the id, if the item already has one:

  <td>
    <input name="foo[items_attributes][$index][id]" type="hidden" ng-value="item.id" ng-if="item.id">
    <input name="foo[items_attributes][$index][name]" type="text" ng-model="item.name">
  </td>
  <td>
    <input name="foo[items_attributes][$index][value]" type="text" ng-model="item.value">
  </td>

In order for Rails to remove existing elements from a has-many association via submitted form data, a special attribute named “_destroy” must be set for each item to be removed. This only works if

accepts_nested_attributes_for :items, allow_destroy: true

is set in the Rails model class, which contains the has-many association.

We modify the click handler of the remove button to set a flag on the JavaScript object instead of removing it from the JavaScript items array:

<span class="remove-button" ng-click="item.removed = true"></span>

And we render an item only if the flag is not set by adding an “ng-if” directive:

<tr ng-repeat="item in items track by $index" ng-if="!item.removed">

At the end of the form we render hidden input fields for those items, which are flagged as removed and which already have an id:

<div style="display: none" ng-repeat="item in items track by $index"
ng-if="item.removed && item.id">
  <input type="hidden" name="foo[items_attributes][$index][id]" ng-value="item.id">
  <input type="hidden" name="foo[items_attributes][$index][_destroy]" value="1">
</div>

On submit Rails will delete those elements of the has-many association with the “_destroy” attribute set to “1”. The other elements will be either updated (if they have an id attribute set) or created (if they have no id attribute set).


How to speed up your ORM queries of n x m associations

August 11, 2014

What causes a speedup like this? (all numbers are in ms)

Disclaimer: the absolute benchmark numbers are for illustration purposes, the relationship and the speedup between the different approaches are important (just for the curious: I measured 500 entries per table in a PostgreSQL database with both Rails 4.1.0 and Grails 2.3.8 running on Java 7 on a recent MBP running OSX 10.8)

Say you have the model classes Book and (Book)Writer which are connected via a n x m table named Authorship:

A typical query would be to list all books with its authors like:

Fowler, Martin: Refactoring

A straight forward way is to query all authorships:

In Rails:

# 1500 ms
Authorship.all.map {|authorship| "#{authorship.writer.lastname}, #{authorship.writer.firstname}: #{authorship.book.title}"}

In Grails:

// 585 ms
Authorship.list().collect {"${it.writer.lastname}, ${it.writer.firstname}: ${it.book.title}"}

This is unsurprisingly not very fast. The problem with this approach is that it causes the famous n+1 select problem. The first option we have is to use eager fetching. In Rails we can use ‘includes’ or ‘joins’. ‘Includes’ loads the associated objects via additional queries, one for authorship, one for writer and one for book.

# 2300 ms
Authorship.includes(:book, :writer).all

‘Joins’ uses SQL inner joins to load the associated objects.

# 1000 ms
Authorship.joins(:book, :writer).all
# returns only the first element
Authorship.joins(:book, :writer).includes(:book, :writer).all

Additional queries with ‘includes’ in our case slows down the whole request but with joins we can more than halve our time. The combination of both directives causes Rails to return just one record and is therefore ruled out.

In Grails using ‘belongsTo’ on the associations speeds up the request considerably.

class Authorship {
    static belongsTo = [book:Book, writer:BookWriter]

    Book book
    BookWriter writer
}

// 430 ms
Authorship.list()

Also we can implement eager loading with specifying ‘lazy: false’ in our mapping which boosts a mild performance increase.

class Authorship {
    static mapping = {
        book lazy: false
        writer lazy: false
    }
}

// 416 ms
Authorship.list()

Can we do better? The normal approach is to use ‘has_many’ associations and query from one side of the n x m association. Since we use more properties from the writer we start from here.

class Writer < ActiveRecord::Base
  has_many :authors
  has_many :books, through: :authors
end

Testing the different combinations of ‘includes’ and ‘joins’ yields interesting results.

# 1525 ms
Writer.all.joins(:books)

# 2300 ms
Writer.all.includes(:books)

# 196 ms
Writer.all.joins(:books).includes(:books)

With both options our request is now faster than ever (196 ms), a speedup of 7.
What about Grails? Adding ‘hasMany’ and the authorship table as a join table is easy.

class BookWriter {
    static mapping = {
        books joinTable:[name: 'authorships', key: 'writer_id']
    }

    static hasMany = [books:Book]
}
// 313 ms, adding lazy: false results in 295 ms
BookWriter.list().collect {"${it.lastname}, ${it.firstname}: ${it.books*.title}"}

The result is rather disappointing. Only a mild speedup (2x) and even slower than Rails.

Is this the most we can get out of our queries?
Looking at the benchmark results and the detailed numbers Rails shows us hints that the query per se is not the problem anymore but the deserialization. What if we try to limit our created object graph and use a model class backed by a database view? We can create a view containing all the attributes we need even with associations to the books and writers.

create view author_views as (SELECT "authorships"."writer_id" AS writer_id, "authorships"."book_id" AS book_id, "books"."title" AS book_title, "writers"."firstname" AS writer_firstname, "writers"."lastname" AS writer_lastname FROM "authorships" INNER JOIN "books" ON "books"."id" = "authorships"."book_id" INNER JOIN "writers" ON "writers"."id" = "authorships"."writer_id")

Let’s take a look at our request time:

# 15 ms
 AuthorView.select(:writer_lastname, :writer_firstname, :book_title).all.map { |author| "#{author.writer_lastname}, #{author.writer_firstname}: #{author.book_title}" }
// 13 ms
AuthorView.list().collect {"${it.writerLastname}, ${it.writerFirstname}: ${it.bookTitle}"}

13 ms and 15 ms. This surprised me a lot. Seeing this in comparison shows how much this impacts performance of our request.

The lesson here is that sometimes the performance can be improved outside of our code and that mapping database results to objects is a costly operation.


When UTF8 != UTF8

July 21, 2014

Problem

Recently I encountered a problem with umlauts in file names. I had to read names from a directory and find and update the appropriate entry in the database. So if I had a file named hund.pdf (Hund is German for dog) I had to find the corresponding record in the database and attach the file. Almost all files went smooth but the ones with umlauts failed all.

Certainly an encoding problem I thought. So I converted the string to UTF-8 before querying. Again the query returned an empty result set. So I read up on the various configuration options for JDBC, Oracle and Active Record (it is a JRuby on Rails based web app). I tried them all starting with nls_language and ending with temporary setting the locale. No luck.

Querying the database with a hard coded string containing umlauts worked. Both strings even printed on the console looked identically.

So last but not least I compared the string from the file name with a hard coded one: they weren’t equal. Looking at the bytes a strange character combination was revealed \204\136. What’s that? UTF8 calls this a combining diaeresis. What’s that? In UTF8 you can encode umlauts with their corresponding characters or use a combination of the character without an umlaut and the combining diaeresis. So ‘ä’ becomes ‘a\204\136′.

Solution

The solution is to normalize the string. In (J)Ruby you can achieve this in the following way:

string = string.mb_chars.normalize.to_s

And in Java this would be:

string = Normalizer.normalize(string, Normalizer.Form.NFKC)

Ruby uses NFKC (or kc for short) as a default and suggests this for databases and validations.

Lesson learned: So the next time you encounter encoding problems look twice it might be in the right encoding but with the wrong bytes.


Finally: return considered harmful

July 7, 2014

Ok, calm down already – I’m not going to take your return statement away. It’s a nice little statement that gets rarer and rarer,  mostly because many modern languages allow for implicit return values, e.g. in Scala, everything is an expression and the last expression of a method is its return value – except when it isn’t. You dont need to write an explicit return statement anymore, so our laziness takes over and we omit it.

And I’m not argueing that a finally block is a bad thing – quite the contrary. Even in the age of try-with-resources and autoclose closures, a well-crafted finally block is a beautiful piece of code at its right place.

But put the two together and you’ve got a recipe for desaster, at least in Java (try it in your language of choice with care). Let’s have a look at code:

public void doSomething() {
    try {
        throw new Error("drama!");
    } finally {
        // dispose resources here
    }
}

Ok, I could have spared you the drama of an Error and just thrown a RuntimeException (or a declared and checked Exception), but it’ll soon illustrate the effect all the better.

First, the code does exactly what it should: if you call the method, you’ll get to catch an Error and all resources that were used inside the method are cleaned up right before you catch it. And since the method has no return value, you don’t miss out there.

But what if I want to give one of those endangered return statements a new home and insert it into the finally block?

public void doSomething() {
    try {
        throw new Error("drama!");
    } finally {
        // dispose resources here
        return;
    }
}

Not much has changed. I’m still not missing any return value and the Error still gets thrown. The resources are cleaned up quite as thorough as before, but something important has changed: You won’t ever see the Error. You won’t catch it because it gets eaten by that return statement at the end of the finally block.

So in short: return statements are hungry little beasts that will eat your Throwables if you corral them inside a finally block.

No wonder nobody wants to have them in their code if they behave like that.

Oh, and rest assured that you will be warned: All modern IDEs will point you to the inevitable desaster waiting to happen, if only with a compiler warning.

But where do Throwables go when they got eaten? Good question. They don’t show up on any UncaughtExceptionHandler, they don’t even stay in memory. Probably, they just are digested, never to be seen again.

If you don’t regard return statements as a little more dangerous now, you probably also raise switch-statements for fun.


Follow

Get every new post delivered to your Inbox.

Join 82 other followers