Java Pub Quiz

Everybody loves a pub quiz. So I collected some Java trivia questions, ranging from syntax to frameworks. Have fun!

  1. What are the names of the following syntax elements? (1 point each)
    <>
    ?:
  2. Is this valid Java code and why or why not?
    http://www.google.de
  3. When does a != a result in true for the same a?
  4. Which non-internal package cannot be imported?
  5. What’s the result of “Hello” == “Hello” and why?
  6. What is this code construct called?
    new ArrayList() {{
      add("Hello");
    }};
    
  7. For which value does Math.abs(int) return a negative number?
  8. What does file.delete() do when the file in question cannot be deleted?
  9. What is the result of Arrays.asList(1, 2, 3).add(2) ?

Solutions

  1. <> is the diamond operator; ?: is the ternary (conditional) operator, also known as the Elvis operator in Groovy
  2. Yes, http: is a label and // starts a comment
  3. When a is Double.NaN (or Float.NaN), because NaN never compares equal to itself
  4. The default package
  5. True. All String literals are interned.
  6. Double brace initialisation
  7. Integer.MIN_VALUE
  8. It returns false
  9. java.lang.UnsupportedOperationException

Framework

Here we name three classes from a JDK package or an open source framework. Can you guess which package or framework it is?

  1. Closeable, Console, Serializable
  2. Objects, Properties, Random
  3. Callable, Future, Phaser
  4. Point, Robot, Toolkit
  5. AutoCloseable, Iterable, Process
  6. Mapping, PersistentCollection, Session
  7. EqualsBuilder, Mutable, StringUtils
  8. ApplicationContext, DataBinder, JdbcTemplate
  9. Frequency, Length, Volume
  10. Minutes, Weeks, Years

Solutions

  1. java.io
  2. java.util
  3. java.util.concurrent
  4. java.awt
  5. java.lang
  6. Hibernate
  7. Apache Commons (Lang)
  8. Spring
  9. JScience
  10. Joda Time

Grails / GORM performance tuning tips

Every situation and every codebase is different, but here are some pitfalls that can cost performance and some tips you can try to improve performance in Grails / GORM.

First things first: never optimize without measuring. This holds even more for Grails, where many layers are involved in running your code: the code itself, the compiler-optimized version, the Grails library stack, HotSpot, the Java VM, the operating system, the C libraries, the CPU… With this many layers and even more possibilities you shouldn’t guess where you can improve the performance.

Measuring the performance

So how do you measure code? If you have a profiler like JProfiler you can use it to measure different aspects of your code like CPU utilization, hotspots, JDBC query performance, Hibernate, etc. But even without a decent profiler some custom code snippets can go a long way. Sometimes we use dedicated methods for measuring the runtime:

class Measurement {
  public static void runs(String operationName, Closure toMeasure) {
    long start = System.nanoTime()
    toMeasure.call()
    long end = System.nanoTime()
    println("Operation ${operationName} took ${(end - start) / 1E6} ms")
  }
}

or you can even show the growth in the Hibernate persistence context:

class Measurement {
  public static void grown(String operationName, Closure toMeasure) {
    // sessionFactory is assumed to be available (e.g. injected by Grails);
    // numberOfInstancesPerClass and differenceOf are small helpers omitted here
    PersistenceContext pc = sessionFactory.currentSession.persistenceContext
    Map before = numberOfInstancesPerClass(pc)
    toMeasure.call()
    Map after = numberOfInstancesPerClass(pc)
    println "The operation ${operationName} has grown the persistence context: ${differenceOf(after, before)}"
  }
}

Improving the performance

So when you have found your badly performing code, what can you do about it? Every situation and every codebase is different, but here are some pitfalls that can cost performance and some tips you can try to improve it:

GORM hotspots

Performance problems with GORM can lie in different areas. A good rule of thumb is to reduce the number of queries hitting the database. This can be achieved by combining results with outer joins, eagerly fetching associations or improving caching. Another hotspot can be long-running queries, which you can often improve by creating indices on the database; but first analyze the query with database-specific tools like EXPLAIN/ANALYZE.
Another typical problem is a large persistence context. Why is this a problem? The default flush mode in Hibernate, and hence in GORM, is auto, which means the persistence context is flushed before any query. Flushing means Hibernate checks every property of every instance to see whether it has changed. The larger the persistence context, the more work this is. One option would be to clear the session periodically after a flush, but this can decrease performance because instances that were already loaded (and therefore cached) need to be reloaded from the database.
Another option is to identify the parts of your code which only need read access to the instances. Here you can use a stateless session, or in Grails you can use the Spring annotation @Transactional(readOnly = true). It can be beneficial for performance to separate read-only and write access to the database. You could also experiment with the flush mode, but beware that this can lead to wrong query results.
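
As an illustration, here is a hedged Java/Spring-style sketch of both ideas, a read-only transaction combined with eager fetching via join fetch. The service and entity names (ReportService, User, roles) are made up for this sketch; in Grails you would write the equivalent in a Groovy service.

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ReportService {

    @PersistenceContext
    private EntityManager entityManager;

    // read-only: Hibernate can skip dirty checking when the persistence context is flushed
    @Transactional(readOnly = true)
    public List<User> usersWithRoles() {
        // "join fetch" loads the association in the same query instead of one query per user
        return entityManager
                .createQuery("select distinct u from User u join fetch u.roles", User.class)
                .getResultList();
    }
}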

The thin line: where to stop?

If you measure and improve you will get big and small improvements. The problem is to decide which of the small ones change the code in a good or at least minimal way. It is a trade-off between performance and code design, as some performance improvements can worsen the code quality. Another topic left untouched in this discussion is scalability. Whereas performance concentrates on the actual data and the current situation, scalability looks at how the system performs when the data grows. Some performance improvements can even worsen scalability. As with performance: measure, measure, measure.

Your own perfection hinders you

Remember when you first started programming? A post against your (exaggerated) perfection.

Remember when you first started programming? Did you think about tests? Did you plan an architecture before coding? Did you look at your results and think: what crap? No, you were happy just to see something, anything, done. Imperfect, but in a sense beautiful. You had a feeling of accomplishment. That little pixel that responded to you pressing keys, the little web page that just saved some data. You did something.
Now fast forward to today. You now write software professionally. With tests, with architecture, well thought out. You practice an agile methodology (whatever that means). Don’t get me wrong, these are all important points and can help you get to a solid implementation. But what if you need to implement a prototype? Just to try something? Fire and forget. Quick and dirty. Can you do it? Do you start by writing tests? Planning the architecture? Writing a spec? And afterwards: what do you think of the result? Is it ugly? Is it not done “professionally”?
What if you start in another field of your profession? Maybe you made websites your whole career and now start with desktop or mobile apps. Or you implemented back end code and now start writing code for the front end. Do you feel insecure? Do you think you just write crap? You shouldn’t. Remember your beginnings. Yes, you have matured, you know more, you write better. But you should celebrate getting something done. Shipping something. Seeing something. That feeling of accomplishment. Don’t criticize too hard, don’t be too harsh on yourself and your code. Get something done and then improve along the way. Just like when you started. Just keep shipping.
And for you young software engineers who are just starting out: we all went through this phase. Don’t look at others’ work and think: wow, this looks so good and mine doesn’t. Think: they also went through this phase, and now they can do this wonderful work; I will, too.
Ira Glass, the American radio host, said it best.

TDD: avoid getting stuck or what’s the next test?

One central point of practicing TDD is determining what the next test is. Choosing the wrong path can lead you into the infamous impasse.

One central point of practicing TDD is determining what the next test is. Choosing the wrong path can lead you into the infamous impasse: to make the next test pass you need to take not baby steps but giant steps. Some time ago Uncle Bob introduced a principle called the transformation priority premise. To make a test pass you need to change the implementation. These changes are transformations. There are at least the following transformations (taken from his blog post):

  • ({} -> nil) no code at all -> code that employs nil
  • (nil -> constant)
  • (constant -> constant+) a simple constant to a more complex constant
  • (constant -> scalar) replacing a constant with a variable or an argument
  • (statement -> statements) adding more unconditional statements
  • (unconditional -> if) splitting the execution path
  • (scalar -> array)
  • (array -> container)
  • (statement -> recursion)
  • (if -> while)
  • (expression -> function) replacing an expression with a function or algorithm
  • (variable -> assignment) replacing the value of a variable

To determine what the next test should be, you look at the possible next tests and the changes in the implementation necessary to make each of them pass. The required transformations should be as high in the list as possible. If you always choose the test that needs the transformations highest in the list, you avoid getting stuck in the impasse.
This seems to work but I think this is pretty complicated and expensive. Shouldn’t there be an easier way?
Let’s take a look at his case study: the word wrap kata. Word wrap is a function which takes two parameters: a string, and a column number. It returns the string, but with line breaks inserted at just the right places to make sure that no line is longer than the column number. You try to break lines at word boundaries.
The first three tests (nil, empty string and one word which is shorter than the wrap position) are obvious and easy but the next test can lead to an impasse:

@Test
public void twoWordsLongerThanLimitShouldWrap() throws Exception {
  assertThat(wrap("word word", 6), is("word\nword"));
}

With the transformation priority premise you can “calculate” that this is the wrong test and that another one is simpler, meaning it needs transformations higher in the list. But let me introduce another concept: the facets or dimensions of tests.
Each test in a TDD session tests another facet of your problem. And only one more. What a facet is is determined by the problem domain, so you need some domain knowledge, but you usually need that to solve the problem anyway. Back to the word wrap example: what is a facet? The first test tests the nil input, it changes one facet. The empty input test changes another facet. Then comes one word shorter than the wrap position (one facet changed again), and the fourth test uses two words longer than the wrap position. See it? The fourth test changes two facets at once: one word to two words, and shorter to longer. So what can you do instead? Change just one facet. According to this, the next test would use one word longer than the wrap position (facet: longer), which is the proposed solution. Or you could use two words shorter than the wrap position (facet: word count), but this test would just pass without any modification of the implementation code. So facets of the word wrap kata could be: word count, shorter/longer, number of breaks, break position.
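
As a sketch, the test for that single changed facet could look like this (wrap is the function under test from the kata; the single word is broken because it is longer than the limit):

@Test
public void oneWordLongerThanLimitShouldWrap() throws Exception {
  assertThat(wrap("longword", 4), is("long\nword"));
}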
I know this is a very informal way of finding the next tests. It leans on your experience and domain knowledge. But I think it is less expensive than the transformations. And even better it can be combined with the transformation priority premise to check and verify your decisions.
What are your experiences with getting stuck in TDD? Do you think the proposed test facets could be of help? Is this too informal? Too vague?

TDD myths: the problems

I take a look at some (in my experience) problems/misconceptions with TDD:
100% code coverage is enough, Debugging is not needed, Design for testability, You are faster than without tests

100% code coverage is enough

Code coverage seems to be a bad indicator for the quality of the tests. Take the following code as an example:

public void testEmptySum() {
  assertEquals(0, sum());
}

public void testSumOfMultipleNumbers() {
  assertEquals(5, sum(2, 3));
}

Now take a look at the implementation:

public int sum(int...numbers) {
  if (numbers.length == 0) {
    return 0;
  }
  return 5;
}

Baby steps in TDD could lead you to this implementation. It has 100% code coverage and all tests are green. But the implementation isn’t finished at all. Our experiment, in which we investigated how much tests communicate the intent of the code, showed flaws in metrics like code coverage.
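
A third test, written in the style of the ones above, would expose the gap immediately (a sketch):

public void testSumOfOtherNumbers() {
  // fails against the fake implementation although coverage was already at 100%
  assertEquals(7, sum(3, 4));
}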

Debugging is not needed

One promise of TDD, or of tests in general, is that you can neglect debugging, even abandon it. In my experience, when a test goes red (especially an integration test) you sometimes need to fire up the debugger. The debugger helps you step through the code and see the actual state of the system at that point in time. Tests treat code as a black box: an input results in an output. But what happens in between? How much do you want to couple your tests to your actual implementation steps? Do we need the tests to cover this aspect of software development? Maybe something along the lines of Inventing on Principle, where the computer shows you the intermediate steps your code takes, could replace debugging, but tests alone cannot do it.

Design for testability

A noble goal. But are tests your primary client? No. Other code is. Design for maintainability would be better. You will need to change your code, fix it, introduce new features, etc. Don’t get me wrong: you need tests and you need testability. But how much code do you write specifically for your tests? How much flexibility do you introduce because of your tests? Which patterns do you use just because your tests need them? It’s like YAGNI for exposing code to tests. Code written specifically for tests couples your code to your tests. Only things that need to be coupled should be. Is the choice of the underlying data structure important? Couple it, test it. If it isn’t, don’t expose it, don’t write a getter. Don’t break the information hiding principle if you don’t need to. If you couple your tests too tightly to your code, every little change breaks your tests. This hinders maintenance. The important and difficult design question is: what is important? Test that.

You are faster than without tests

Some TDD practitioners claim that they are faster with TDD than without tests, because after a certain time the bugs and problems in untested code will overwhelm you. So above a certain level of complexity you go faster with TDD. But where is this level? In my experience, writing code without tests is 3x-4x faster than with TDD. For small applications. There are entire communities where many applications are written without tests or with only a few. I wouldn’t write a large application without tests, but at least my feeling is that in many cases I go much slower. Cases where I feel faster are specification heavy: parsing or writing formats, designing an algorithm or implementing a scientific formula. So the jury is still out on this one. What are your experiences? Do you feel slowed down by TDD?

TDD myths

Some TDD myths and what is true about them.

TDD, Test first or test immediately after are all the same

No. All three approaches result in having tests in the end. But especially in the TDD case your mindset is completely different. First and foremost, the tests drive the design of your code. You construct your system piece by piece, unit test by unit test. All code you write must have a test first, and you use the tests to describe and reason about the external interface of your units. In TDD the tests represent the future clients of your code. In practice this leads to small(er) units.

In TDD the tests’ (main) objective is to prevent regression

No. Tests help immensely when you would otherwise break the same code twice. But even more so, tests help to structure your code and keep it maintainable. When using TDD you tend to reduce your code and its flexibility, because you need to write a test for every piece of functionality first. So over-designing or implementing things you don’t need (breaking YAGNI or KISS) bites you doubly: in the code and in the tests. Wrong design decisions, like choosing an inappropriate data structure or representation, also hit you twice as hard. TDD amplifies bad design decisions.

Test code is the same as production code

No. Test code should adhere to a similar quality level as production code. But you won’t write tests for your tests. Also, conditionals and loops are a very bad idea in tests and should be avoided. Take the following example:

public void testSomething() {
  for (MyEnum value : MyEnum.values()) {
    assertEquals(expected, doSomething(value));
  }
}

If you forgot an enum value, the test just passes. Even if your enum has no values at all, it still passes. Conditionals have the same problem: you introduce another path through your test which may never be taken. You could secure the other path with an assert, but in some cases this is a hint that you broke another principle: the single responsibility of tests.
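
As a sketch, the same checks without the loop could look like this, assuming the hypothetical MyEnum from above has the values FIRST and SECOND; every value gets its own explicit assertion, so a missing or empty enum can no longer slip through:

public void testDoSomethingForFirstValue() {
  assertEquals(expected, doSomething(MyEnum.FIRST));
}

public void testDoSomethingForSecondValue() {
  assertEquals(expected, doSomething(MyEnum.SECOND));
}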

DRY is harmful in tests

No. DRY (don’t repeat yourself) aims to reduce or eliminate duplication in logic. But often DRY is understood as removing code duplication. This is not the same! Code duplication can be essential in tests. You need all of the essential information in the test itself; it should not be extracted or abstracted away. Lines of code that may look similar are not necessarily coupled logically: when you change one test, the other test is not affected.
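
A small made-up illustration with hypothetical helpers (priceFor, newCustomer, loyalCustomer): the two tests look similar, but extracting their shared structure into a common helper would only couple two rules that have nothing to do with each other.

public void testDiscountForNewCustomers() {
  assertEquals(90, priceFor(newCustomer(), 100));
}

public void testDiscountForLoyalCustomers() {
  assertEquals(80, priceFor(loyalCustomer(), 100));
}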

TDD is hard

No and yes. For me, learning TDD is like learning a new language. It certainly takes time. But if you do it often and repeatedly you learn more every time you use it. It’s a way of reasoning about a system, a way of thinking, a paradigm. When I started with TDD I thought it was impossible or unreasonable to use in cases other than those with strong specs, like parsing a format. But over time I have come to value the driving part of TDD more and more. You can get into a TDD flow. TDD gives you a very good feeling of security when you refactor. It forces you to think beforehand about the intended use of your code. Which is good. It changes the way I see my code, one step at a time. Some things are still hard: acceptance tests are unreasonably expensive. Testing just one thing takes discipline, and so does not jumping ahead of the tests and implementing too much code. Finding the next unit of testing can be difficult, and getting stuck can be frustrating. But just like learning a new language, I think it is worth it.

Aspects done right: Concerns

With aspects you cannot see (without sophisticated IDE support) which class has which aspects and which aspects are woven into the class when looking at its source. Here concerns (also called mixins or traits) come to the rescue.

The idea of encapsulating cross-cutting concerns appealed to me from the beginning, but the implementation, namely aspects, lacked clarity in my opinion. With aspects you cannot see (without sophisticated IDE support) which class has which aspects and which aspects are woven into the class when looking at its source. Here concerns (also called mixins or traits) come to the rescue. I know that aspects were invented to hide away the details of which code is included where, but I find it confusing and hard to trace without tool support.

Take a look at an example in Ruby:

module Versionable
  extend ActiveSupport::Concern

  included do
    attr_accessor :version
  end
end

class Document
  include Versionable
end

Now Document has a field version and is_a?(Versionable) returns true. For clients it looks like the field version is in Document itself. So for clients of this class it is the same as:

class Document
  attr_accessor :version
end

Furthermore you can easily reuse the versionable concern in another class. This sounds like a great implementation of the separation of concerns principle, so why isn’t everyone using it (besides it becoming a standard part of the upcoming Rails 4)? Well, some people are concerned with concerns (excuse the pun). As with every powerful feature, you can shoot yourself in the foot. Let’s take a look at each problem.

  • Diamond problem aka multiple inheritance
    Ruby has no multiple inheritance. Even when you include more than one module, the modules act like superclasses in the method resolution order. Every include creates a new “superclass” above the including class, so the last include takes precedence.

  • Dependencies between concerns
    You can have dependencies between different concerns, as in “this concern needs another concern”. ActiveSupport::Concern handles these dependencies automatically.

  • Unforeseeable results
    One last big problem with concerns is side effects from combining two of them. Take, for example, two concerns which each add a method with the same name. Including both of them renders one concern unusable. This cannot be solved technically, but I also think this problem points to an underlying, more important cause: it could be poor naming, or you did not separate the two concerns cleanly enough. As always, tests can help to isolate and spot the problem. Concerns should be tested both in isolation and in integration.

Thoughts about TDD

Thoughts and links about test driven development

First a disclaimer: I think tests are a hallmark of professional software development. I like to write tests before the implementation, but that’s not always easy or simple (for the difference please refer to Simple Made Easy). I find it hard to grasp test driven development (TDD) though. The difference between test first and test driven lies in the intention: in both cases tests are written before any implementation code, but in TDD the tests drive the design of your implementation.

The problem with opinions about TDD is that they are mostly extreme positions: some think TDD is the (next) holy grail, others dismiss it entirely. Reading between the lines, though, there are great discussions about how to do it and what problems arise. Many people (me included) are really trying to get value from TDD. Testing should be fun.
One way of letting the tests drive the way you develop is proposed by Uncle Bob: the transformation priority premise. He proposes a list of transformations which introduce new constructs or replace existing ones, like replacing a constant with a variable or adding more logic, and gives them a priority. Only if you cannot use a high-priority transformation to get the test to pass do you look at a transformation with a lower priority.
But how do you determine what you should test next or even which is the first test?
Taking the typical Conway’s Game of Life kata as an example, one thing struck me: I could only get TDD to work smoothly when I started with the data structure. But why is that? Naturally I would start with the algorithm (in this case the rules) and write the first test for it. But upon further inspection of the problem, and with deeper (domain) knowledge, it seems the data structure is far more important for solving this kata. So you need to know beforehand roughly where the journey goes, not every step you will take, but the big picture: first the data structure, then the rules in this example. Maybe you should start with the integration or functional tests and break them down into units.
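
For illustration, here is a hedged Java sketch of what “starting with the data structure” could mean for this kata: the board represented as a sparse set of live cell coordinates (java.awt.Point is used here only as a convenient value type).

import java.awt.Point;
import java.util.HashSet;
import java.util.Set;

public class Generation {

    // only the live cells are stored, so the board is sparse and unbounded
    private final Set<Point> liveCells = new HashSet<Point>();

    public void setAlive(int x, int y) {
        liveCells.add(new Point(x, y));
    }

    public boolean isAlive(int x, int y) {
        return liveCells.contains(new Point(x, y));
    }
}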
What are your experiences using TDD? Do you use or want to use TDD?

Web apps: Security is more than you think

Security in web apps is an ever increasing important topic: in this post we take a look at injection attacks especially SQL injection, the number one OWASP security problem.

Security in web apps is an increasingly important topic. Besides securing the machine and the web/application containers on which your apps run, you need to deal with some security-related issues in your own apps. In this article we take a look at the number one risk in web apps (according to OWASP):

Injection attacks

Every web app takes some kind of user input (usually through web forms) and works with it. If the web app does not properly handle the user input, malicious entries can lead to severe problems like stolen or lost data. But how do you identify problems in your code? Take a look at a naive but not uncommon implementation of a SQL query:

query("select * from user_data where username='" + username + "'")

Using the user’s input directly in a query like this is devastating; examples include dropping tables or changing data. Even if your library prevents you from using more than one statement in a query, such a query can be manipulated to return other users’ data.
Blacklisting special characters is not a solution, since you need some of them in your input and there are ways to circumvent your blacklists.
The solution here is to properly escape your input using your library’s mechanisms (e.g. with Groovy and Spring JDBC):

query("select * from user_data where username=:username", [username: username])

But even when you escape everything, you need to take care what you put into your query. In the following example all data is stored with a key of the form username.data.

query("select * from user_data where key like :username '.%' ", [username: username])

In this case everything will be escaped correctly, but what happens when a user names himself “%”? He gets the data of all users.
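
One hedged way to handle this (the exact escaping rules depend on your database) is to escape the LIKE wildcards in the user input before binding it, and to declare the escape character in the query, e.g. with an ESCAPE '\' clause:

// escape the escape character first, then the LIKE wildcards;
// the query must then declare the escape character, e.g. "... like :pattern escape '\'"
static String escapeLikeWildcards(String input) {
  return input.replace("\\", "\\\\")
              .replace("%", "\\%")
              .replace("_", "\\_");
}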

Is SQL the only vulnerable part of your app? No, every part which interprets your input and executes it is vulnerable. Examples include shell commands or JavaScript which we will look at in a future blog post.

As the last query showed: besides proper escaping, developing a mindset for security problems is the first and foremost step towards a secure app.

Antipatterns: Convenience Constructors

Lately I have been stumbling a lot upon code I wrote four or more years ago. In the light of introducing new features, the quality of that code gets put to the test. One antipattern I found, which I used in the past but which is really hard to extend, is convenience constructors.

Lately I have been stumbling a lot upon code I wrote four or more years ago. In the light of introducing new features, the quality of that code gets put to the test. One antipattern I found, which I used in the past but which is really hard to extend, is convenience constructors. Take a constructor for a command object as an example:

    public SetProperty(String filename, String key, String value) {
        this(filename, key, value, null);
    }

    public SetProperty(String filename,
            String key, String value, String comment) {
        this(filename, ReferenceTo.key(key), value, comment);
    }

    public SetProperty(String filename,
            String sectionType, String sectionName,
            String key, String value) {
        this(filename, sectionType, sectionName, key, value, null);
    }

    public SetProperty(String filename,
            String sectionType, String sectionName,
            String key, String value, String comment) {
        this(filename, ReferenceTo.sectionAndKey(sectionType, sectionName, key), value, comment);
    }

    public SetProperty(String filename,
            AdvancedPropertyReference propertyReference,
            String value, String comment) {
        super(filename);
        this.propertyReference = propertyReference;
        this.value = value;
        this.comment = comment;
    }

We need to add a new feature which enables us to append to properties, not just set and replace them. One way would be to extend the class, but that is overkill. Just adding a new parameter flag should suffice. However, this would blow up the number of constructors, because you would need a version with and without the new parameter for each (used) constructor. Here an old friend comes to the rescue: design patterns. A look into the GoF book shows a good solution to the problem: the builder pattern.

public class SetPropertyBuilder {
    private final String filename;
    private String sectionType;
    private String sectionName;
    private String referenceKey;
    private String value;
    private String comment;
    private boolean append;

    public SetPropertyBuilder(String filename) {
        super();
        this.filename = filename;
    }

    public SetPropertyBuilder set(String key, String newValue) {
        this.referenceKey = key;
        this.value = newValue;
        return this;
    }

    public SetPropertyBuilder append(String key, String additionalValue) {
        set(key, additionalValue);
        this.append = true;
        return this;
    }

    public SetPropertyBuilder inSection(String type, String name) {
        this.sectionType = type;
        this.sectionName = name;
        return this;
    }

    public SetProperty build() {
        AdvancedPropertyReference reference = ReferenceTo.key(this.referenceKey);
        if (this.sectionType != null && this.sectionName != null) {
            reference = ReferenceTo.sectionAndKey(this.sectionType, this.sectionName, this.referenceKey);
        }
        return new SetProperty(this.filename, reference, this.value, this.comment, this.append);
    }
}

Now we can eliminate all but one constructor from the SetProperty command. Adding a new parameter now costs just one new method in the builder.
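
Client code can then read like this (a sketch with a made-up file name, section and keys):

SetProperty command = new SetPropertyBuilder("settings.ini")
        .inSection("database", "primary")
        .append("hosts", "db2.example.com")
        .build();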