Monday, May 16, 2016

Of give and take ...

When it comes to judging Java performance, sooner or later you will encounter arguments like "... all this bloated code: dumb getters and setters that have to be coded and executed waste processor time ...".
Usually this argument comes from C/C++ programmers with assembler experience who count each operation and its cycles.
It seems obvious that the fastest code is the code that does not have to be executed. So why should you want to write these stupid getters and setters that only increase the line count and make classes larger?

Well, I assume I do not have to tell you anything about the benefits of object-oriented programming in general and encapsulation in particular, but the positive aspect of getters/setters can be shown with a simple example:
Imagine you have a class that holds a single field. Whenever the value of the field changes, a counter is increased that counts the changes of the field. At the same time, whenever the value is retrieved, a log entry is written.

While this seems to be a nonsense example, it is actually used when you create a class for observing changes in a document and building an undo history (you should use the strategy or memento pattern for a full implementation ;) ). If you implemented this without getters and setters, you would have to write the needed code at every place where the variable is either changed or read. This is very error-prone, as you might overlook some occurrences, or some other developer might not know about the requirement for this code at all. Finding these errors can be a very frustrating and time-consuming task...
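To make the example concrete, here is a minimal sketch of such a class. The names `TrackedValue`, `getChangeCount` and `TrackedValueDemo` are made up for illustration, and counting only real changes (setting the same value twice counts once) is my assumption, not a requirement. The point is that the counting and logging live in exactly one place:

```java
import java.util.logging.Logger;

// Hypothetical demo class: shows how callers use the tracked field.
class TrackedValueDemo {
    public static void main(String[] args) {
        TrackedValue tv = new TrackedValue();
        tv.setValue(17);
        tv.setValue(17); // same value again: not counted as a change (assumption)
        tv.setValue(42);
        // The read below also writes a log entry (to stderr by default).
        System.out.println(tv.getValue() + " after " + tv.getChangeCount() + " changes");
    }
}

// Hypothetical class holding a single field with a change counter and read logging.
class TrackedValue {
    private static final Logger LOG = Logger.getLogger(TrackedValue.class.getName());

    private int value;
    private int changeCount;

    // Every read goes through here, so the log entry cannot be forgotten.
    int getValue() {
        LOG.info("value read: " + value);
        return value;
    }

    // Every write goes through here, so the counter stays consistent.
    void setValue(int newValue) {
        if (newValue != value) {
            value = newValue;
            changeCount++;
        }
    }

    int getChangeCount() {
        return changeCount;
    }
}
```

With direct field access, every caller would have to repeat the counter update and the log call by hand.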

These getters/setters still have to be coded, but fortunately modern IDEs like Eclipse or NetBeans generate them automatically, so the effort of creating them tends to zero, especially if you consider the negative effects of not using them.

From the view of a software architect, getters and setters are positive. The programmer will not have much work coding these, so let's say they are neutral.
But wait, more code means more execution time, so the user might encounter the negative effect. Since all of you performance junkies out there probably expect exactly this, it is time to do some benchmarking.

Based on the concepts described above, I designed a mini benchmark whose test object is a class holding a single field of type int.


class TestObject {
    int value = 17;

    int getValue() {
        return value;
    }
}

To benchmark the getter behavior compared to direct access, I created two methods that perform the access and log the time needed to stdout.

void runGetter() {
    int result = 0;
    long start = System.nanoTime();
    for (int i = 0; i < 10000; i++) {
        result += to.getValue();
    }
    long end = System.nanoTime();
    System.out.println(
        String.format("Getter:\tresult\t%d\ttime\t%d",
                      result, end - start)); 
}

void runDirect() {
    int result = 0;
    long start = System.nanoTime();
    for (int i = 0; i < 10000; i++) {
        result += to.value;
    }
    long end = System.nanoTime();
    System.out.println(
        String.format("Direct:\tresult\t%d\ttime\t%d",
                      result, end - start));
}

To glue it all together, the static main method creates an instance of the benchmark and repeats the whole benchmark 100 times.

public class GetterTest {
    TestObject to = new TestObject();

    public static void main(String[] args) {
        GetterTest ga = new GetterTest();
        for (int i = 0; i < 100; i++) {
            ga.runGetter();
            ga.runDirect();
            System.out.println();
        }
    }
    ... runXXX ...
}


The first result looks like:
Testrun: 1
Getter: result 170000 time 728720
Direct: result 170000 time 131059

No surprise, the getter takes about 5.5 times as long as the direct access.

The 9th run:
Testrun: 9
Getter: result 170000 time 49739
Direct: result 170000 time 120401

What's wrong? Well, this result is reproducible, so it cannot be a random effect. Because of the benchmark's structure, the JIT optimizes parts of the getter path earlier, as a method call is more likely to be optimized early than a direct field access.

11th run:
Testrun: 11
Getter: result 170000 time 32765
Direct: result 170000 time 30002

The getter is now optimized and takes AS LONG as the direct access.


Now both methods have been optimized (first level) and there is no significant difference in runtime between getter and direct access!

Recent versions of the JVM and its compiler perform further optimization based on a cost and effort estimation. The later results show a reduction to roughly 1% of the runtime of the already optimized code:
Testrun: 35
Getter: result 170000 time 394
Direct: result 170000 time 63556

Testrun: 36
Getter: result 170000 time 395
Direct: result 170000 time 77372

Testrun: 37
Getter: result 170000 time 395
Direct: result 170000 time 72240

Testrun: 38
Getter: result 170000 time 395
Direct: result 170000 time 395

Although it seems irrational, the results show that in the end there is no difference between direct access and using getters.
Conventional programs are compiled as a whole, line by line, method by method. Every effective line of code is represented by code in the executable. Every test, every variable access will be part of the executable (not necessarily in the same order, but in principle).

The Java HotSpot JIT proceeds slightly differently. It tries to find code blocks (methods or other blocks) that are executed often (the "hot spots"). These are compiled to native code, so that the execution is a mixture of interpreted bytecode and compiled native code.
Besides this, the JIT traces the execution flow and tries to optimize the native code where possible (this effect will be described in detail in another post). The compiled fragments are rearranged and, if necessary, recompiled to reach optimal performance. The getter is optimized down to a direct access of the field, so both ways reach equal performance.

Well, not really what you might have expected, but if you consider that Java was developed as an object-oriented language and the JIT is tuned to optimize object-oriented code, it can do its best on small, encapsulated code elements that can easily be identified and rearranged.
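If you only care about the steady state and not about the warm-up behavior shown above, one common trick is to run the code a number of times before measuring. The following is just a sketch under assumptions: the names `WarmupSketch`, `Holder`, `sumViaGetter` and `sumDirect` are made up for illustration, and the 200 warm-up rounds are an arbitrary guess, not a value that guarantees compilation on every JVM.

```java
// Hypothetical sketch: warm up both access paths before taking a single measurement.
class WarmupSketch {
    static final int ITERATIONS = 10_000;
    static final int WARMUP_ROUNDS = 200; // arbitrary assumption, tune for your JVM

    public static void main(String[] args) {
        Holder h = new Holder();

        // Warm-up phase: give the JIT a chance to find and compile the hot spots.
        int sink = 0;
        for (int i = 0; i < WARMUP_ROUNDS; i++) {
            sink += sumViaGetter(h) + sumDirect(h);
        }

        // Measurement phase: one timed run per access path.
        long t0 = System.nanoTime();
        int viaGetter = sumViaGetter(h);
        long t1 = System.nanoTime();
        int direct = sumDirect(h);
        long t2 = System.nanoTime();

        System.out.printf("Getter:\tresult\t%d\ttime\t%d%n", viaGetter, t1 - t0);
        System.out.printf("Direct:\tresult\t%d\ttime\t%d%n", direct, t2 - t1);
        // Print the warm-up sum so the warm-up work is not obviously unused.
        System.out.println("warm-up sink: " + sink);
    }

    // Sums the field through the getter.
    static int sumViaGetter(Holder h) {
        int result = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            result += h.getValue();
        }
        return result;
    }

    // Sums the field through direct access.
    static int sumDirect(Holder h) {
        int result = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            result += h.value;
        }
        return result;
    }
}

// Same shape as the TestObject used in the benchmark above.
class Holder {
    int value = 17;

    int getValue() {
        return value;
    }
}
```

The timings will differ from machine to machine and from JVM to JVM, so treat them as a rough indication only. To watch the JIT at work, you can start the JVM with the HotSpot flag -XX:+PrintCompilation, and for serious measurements a harness like JMH is the safer choice.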

... to be continued
