Saturday, March 7, 2015

Profiling Java

Java is slow!?! Is it?

One common prejudice about Java applications is lack of speed compared to languages like C or C++. Benchmarking Java requires a some knowledge about the way Java is executing code to assure, the benchmark is benchmarking the the code, and not the coding style of the developer.
While this post concentrates more on the HOW to measure, the following posts will concentrate on common (anti-)patterns and their impact on performance, readability and development speed.

Measuring the execution time of a program or method usually looks like this:

public class PerformanceAnalysis {
    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        // something to be measured
        int result = 0;
        for (int i = 0; i < 10000000; i++) {
            result += i;
        }
        System.out.println(
            System.currentTimeMillis() - startTime);
    }
}

On my system, this simply writes a "30" to the console. On your system this value might differ. Please not that executing this this multiple times will lead to a spread which is created by system load and other processes. To get a "real" result, execute a few times - you will see a tendency. Please note that in this simple tests, the changes in executing time are not significant and it should stay almost constant.

While this approach seams to be reasonable, it has a few flaws which affect the results:
  1. A granulation of milliseconds might not lead to senseful results.
  2. This test is not object oriented - Java programs are.
  3. The result is not used after the loop.
  4. The output of the test might affect the measurement more than necessary.
  5. The test is interpreted without the JIT-compiler.
To see the changes of each point, you should edit the source code as specified, recompile, start and compare the results.

    Problem 1:

    Java has the possibility to use a granulation of nano seconds for differential measurements. While this can not be used for absolute times, it is a perfect means to measure execution times.

    Solution:

    Replace calls to System.currentTimeMillis() by System.nanoTime().
    >> execution time: 28921134

    Problem 2:

    The test code is completely static and does not use any class or object references. This is not typical for Java code, so the JIT behavior is different.

    Solution:

    Pack the code in an object and excute it there.

    public class PerformanceAnalysis {
        public static void main(String[] args) {
            PerformanceAnalysis analysis
                = new PerformanceAnalysis();
            analysis.perform();
        }

        public void perform() {
            ... test code as before ...
        }
    }

    >> execution time: 28440079

    Problem 3:
    Optimization might remove the loop as it has no side affects. Therefore you might not get the expected results.

    Solution:

    Use the result after creation. This can be achieved by simply writing it to the console after the loop.
    >> execution time: 29159851    49999995000000 

    Problem 4:

    The class containing the System.out.println(...) method  might not yet have been loaded. This code will include the time needed for class loading and thereby lead to wrong results.

    Solution:

    Replace the printout of the result by:


        long endTime = System.nanoTime();
        System.out.println(endTime - startTime + "\t" + result);

    This will decouple class loading from the measurement.
    >> execution time: 26237273    49999995000000

    Problem 5:

    Java code is not compiled to executables as for instance C/C++ code. A special format (Java byte code) is created which itself is interpreted on the target system at runtime. While this leads to high portability (in principle - another topic), the drawback is a significant loss in performance. To solve this problem, Sun developed the JIT-compiler (JIT: just in time) which itself analyses the byte code and will create native code on the target system when it detects, that the code is needed more often. To be more precise, The code is compiled when the code is used the third time. Our code is excecuted exactly one time, so the JIT will not be used at all. Performance critical code is usually executed more than once, so in real life you do not have to care about this. In synthetic benchmarks (like this is) this has to be considered.

    Solution:

    Put the whole code in a loop and execute it multiple times. To make sure all paths are executed more than three times, the number of executions is difficult to guess. I usually use 10 times.

    In this simple example the result after Solution 5 will not very much differ from the previous ones, as the time consumption is produced in a loop which itself will be optimized on the first run already. If you have large, costly operations the result will be markable.

    public class PerformanceAnalysis {
        public static void main(String[] args) {
            PerformanceAnalysis analysis 
                = new PerformanceAnalysis();
            for (int j = 0; j < 10; j++) {
                analysis.perform();
            }
        }

        public void perform() {
            long startTime = System.nanoTime();
            // something to be measured
            int result = 0;
            for (int i = 0; i < 10000000; i++) {
                result += i;
            }
            long endTime = System.nanoTime();
            System.out.println(
                (endTime - startTime) + "\t" + result);
        }
    }
     

    Conclusion

    - Try to create object based test code - the compiler is optimized...
    - Ensure relevance of the code - the optimizer is more clever as you might expect!
    - Reduce side effects - profile the function, not the measurement.
    - Profiling takes time -  think about required cycles.


    All further "mini-benchmarks" will follow these guidelines.

        ... to be continued

    Hello World!

    I am a software engineer working living and working in Germany who is quite disappointed with the quality and processes found in the current way of software development.

    Working with Java for more than 15 years, I have seen many frameworks come and go, many "enhancement"-libraries to the Java framework disappear as fast as they became hyped.

    At the same time, most of the "experienced" Java developers - originally C or C++ developers - used Java the way they used to do with other languages, which not necessarily  belonged to the object oriented languages at all.

    The result is a strange mix of multiple technologies, programming paradigms and code patterns - not all of them fitting the Java-world at all.


    But why is this important for you?

    If you are developing software you will encounter all of these problems (maybe even more...):
    • incomplete/outdated documentation
    • erroneous examples
    • rumors and assumptions instead of facts
    • dirty hacks
    • reinventing the wheel
    • wasting time on not using the capabilities of languages/frameworks
    • using ALL available frameworks and design patterns at once

    Why should you trust my statements?

    I am using Java since 1998 continously, SQL since 2003 and other things like C/C++, Python, HTML, JSF when needed. Writing software using Linux, Windows, different languages/technologies, being involved in different external projects I got a wide overview over the ups and downs of software development.

    At the moment I am living near Nuremberg and work for an engineering service provider with the focus on Java SE applications.


    Comments are enabled, but be aware that all comments offending or anyhow not meeting my standards for communication will be removed. If you find any comment offending I did not notice, please let me know.

    Thank you in advance for all useful comments and aids.

    Disclaimer:
    I am not responsible for any content on external sites linked by this blog. If any link leads to illegal or offensive websites please inform me and the link will be removed