Java Runtime Performance Vs Native C / C++ Code?

I’ve become more comfortable programming in Java than in C++ or C. I am hoping to get a sense of the performance hit incurred using a JVM interpreter, as opposed to executing the same “project” natively. I realize that there is some level of subjectivity here; performance will depend heavily on the quality of the implementation. I’m interested in the following aspects in a general sense:

  • There must be some baseline for overhead when using an interpreter. Is there some general rule of thumb to remember? 10%? 15%? (I pulled these numbers out of thin air.) I have read the occasional blog stating that Java code is nearly as fast as native code, but I think that may have been biased.

  • Does the JVM garbage collector add significant overhead to runtime performance? I know Cocoa applications have begun to use a garbage collection model, and I agree that it makes programming a lot simpler, but at what cost?

  • What is the overhead of making system calls from Java? For example creating a Socket object as opposed to the C socket API.

  • Finally, I recall reading somewhere that the JVM implementation is single-threaded. If this is true (which I am skeptical about), does that mean that Java threads really aren’t true threads? Does a Java thread, in general, correspond to an underlying kernel-provided thread? Does a Java application benefit in the same way a native application would from multiple cores / multiple CPUs?

Any advice from a developer who understands the intricacies of the JVM and Java program performance would be much appreciated. Thanks.

Answer

Java isn’t an interpreted language, and hasn’t been for several versions. The Java bytecode is JIT’ed on the fly. (Technically it still interprets some of the code, but anything that matters performance-wise gets JIT’ed)
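One rough way to see the JIT at work is to run a small hot method many times and then ask the standard CompilationMXBean how much time the JVM has spent compiling. The sketch below assumes a HotSpot-style JVM; the class name, method names and workload sizes are made up for illustration, and the reported time will vary from run to run.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; names and workload sizes are illustrative only.
final class JitWarmupDemo {

    // A small, hot method that the JIT is likely (but not guaranteed) to compile.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Calls the hot method repeatedly and returns the last result.
    static long runHot(int rounds, int n) {
        long last = 0;
        for (int r = 0; r < rounds; r++) {
            last = sumOfSquares(n);
        }
        return last;
    }

    public static void main(String[] args) {
        // May be null on a JVM without a compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        long result = runHot(20_000, 1_000);
        System.out.println("result = " + result);
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " has spent roughly "
                    + jit.getTotalCompilationTime() + " ms compiling");
        } else {
            System.out.println("No JIT compilation statistics available on this JVM");
        }
    }
}
```

Running the same class with the JVM flag -Xint (interpreter only) is a simple way to compare against purely interpreted execution.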

As for performance, what on Earth gives you the crazy idea that “there is a baseline for overhead”? There isn’t. There never was and never will be. Not between C++ and Java, and not between Python and JavaScript, or any other two languages. There are things that your specific version of the JVM will do faster than your specific C++ compiler, and things that your specific C++ compiler will do better than your specific JVM.

So the “overhead” of your choice of language depends entirely on 1) what you want your code to do, and 2) how you write your code.

If you take a Java program and translate it to C++, the result will almost certainly run slower.

If you take a C++ program and translate it to Java, that too will run slower.

Not because one language is “faster” than the other, but because the original program was written for one language, and was tailored to work well in that language. And any attempt to translate it to another language will lose this advantage. You end up with a C++-style Java program, which won’t run efficiently on the JVM, or a Java-style C++ program, which will run terribly as well.

Neither language specification contains a clause that “and the result must be at least x% slower than language y”. Both your C++ compiler and the JVM do their very best to make things go fast.

And the performance characteristics you’re seeing today may change tomorrow. Languages don’t have a speed.

But to answer your specific questions:

There must be some baseline for overhead when using an interpreter. Is there some general rule of thumb to remember? 10%? 15%? I have read the occasional blog stating that Java code is nearly as fast as native code, but I think that may have been biased.

As said above, it depends. For many common tasks, you typically won’t see more than a few percent difference either way. For some use cases, you’ll see a larger difference, going either way. Both languages have advantages when it comes to performance: there is some overhead associated with the JVM, but there are also huge optimization opportunities, not least the garbage collector.

Does the JVM garbage collector add significant overhead to runtime performance? I know Cocoa applications have begun to use a garbage collection model, and I agree that it makes programming a lot simpler, but at what cost?

Basically none. On average, a garbage collector is far faster than manual memory management, for many reasons:

  • on a managed heap, dynamic allocations can be done much faster
  • shared ownership can be handled with negligible amortized cost, whereas in a native language you’d have to use reference counting, which is awfully expensive
  • in some cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed, and nearly every object has one)
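To get a feel for what the collector actually costs in a given program, the standard GarbageCollectorMXBean interface reports how often each collector ran and how long it took. The following is a minimal sketch, not a benchmark: the class name, method names and allocation pattern are invented for illustration, and the JIT may optimize some of these allocations away entirely.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; the allocation pattern is illustrative only.
final class GcStatsDemo {

    // Allocates many short-lived arrays; each one is garbage right after use.
    // Returns a checksum so the work cannot be discarded as dead code.
    static long churn(int objects) {
        long checksum = 0;
        for (int i = 0; i < objects; i++) {
            int[] scratch = new int[16];
            scratch[i % 16] = i;
            checksum += scratch[i % 16];
        }
        return checksum;
    }

    // Sums the approximate collection time (ms) over all collectors.
    // A collector reports -1 when the value is undefined, so those are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long checksum = churn(50_000_000);
        System.out.println("checksum = " + checksum);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, about " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total GC time: about " + totalGcMillis() + " ms");
    }
}
```

Comparing the reported GC time with the wall-clock time of the whole run gives a first impression of the overhead for that particular workload.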

The main problem with a GC is that while on average a garbage collector performs better, you lose some control over when to take the performance cost. Manual memory management ensures your thread won’t ever be halted while waiting for memory to be cleaned up. A garbage collector can, at almost any time, decide to pause the process and clean up memory. In almost all cases, this is fast enough to be no problem, but for vital real-time stuff, it is a problem.

(An additional problem is that you lose a bit of expressiveness. In C++, RAII is used to manage all sorts of resources. In Java, you can’t use RAII. Instead the GC handles memory for you, and for all other resources, you’re screwed, and have to do it yourself with lots of try/finally blocks. There is no reason why RAII couldn’t be implemented in a GC’ed language, but neither Java nor C# offers it in full. The closest equivalents are Java’s try-with-resources statement (Java 7 and later) and C#’s using statement, which cover only the scoped case.)
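For the scoped case, Java 7 and later provide try-with-resources, which calls close() automatically on anything that implements AutoCloseable. It is not full RAII, since cleanup is tied to the try block rather than to the object's lifetime. A minimal sketch, with a made-up TracedResource class that only records when it is opened and closed:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; TracedResource stands in for a file, socket, lock, etc.
final class ScopedResourceDemo {

    // Records "open" on construction and "close" when the try block exits.
    static final class TracedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        TracedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Opens two resources, does some work, and returns the recorded events.
    static List<String> useTwoResources() {
        List<String> log = new ArrayList<>();
        try (TracedResource a = new TracedResource("a", log);
             TracedResource b = new TracedResource("b", log)) {
            log.add("work");
        } // both resources are closed here, in reverse order of declaration
        return log;
    }

    public static void main(String[] args) {
        // prints [open a, open b, work, close b, close a]
        System.out.println(useTwoResources());
    }
}
```

The resources are closed even if the body throws, which is what the hand-written try/finally blocks were doing before.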

What is the overhead of making system calls from Java? For example creating a Socket object as opposed to the C socket API.

Roughly the same. Why would it be different? Of course, Java has to invoke the relevant OS services and APIs, so there is a tiny bit of overhead, but it is really nothing you’re likely to notice.
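As an illustration of how thin the Java layer is, the sketch below opens a listening socket on the loopback interface, connects to it, and sends one line across. Each step corresponds to the familiar calls of the C socket API (bind/listen, connect, accept, send, recv). The class and method names are made up for this example, and it assumes loopback networking is permitted on the machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes loopback connections are allowed.
final class LoopbackEchoDemo {

    // Sends one line from a client socket to an accepted peer socket
    // and returns what the peer read.
    static String sendOverLoopback(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {

            // Client side: write one line and flush it to the OS.
            PrintWriter out = new PrintWriter(
                    new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true);
            out.println(message);

            // Server side: read that line back from the accepted connection.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("received: " + sendOverLoopback("ping"));
    }
}
```

Once the connection is set up, the time is dominated by the operating system and the network, not by the Java wrapper objects.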

Finally, I recall reading somewhere that the JVM implementation is single-threaded. If this is true (which I am skeptical about), does that mean that Java threads really aren’t true threads? Does a Java thread, in general, correspond to an underlying kernel-provided thread? Does a Java application benefit in the same way a native application would from multiple cores / multiple CPUs?

Java can use multiple threads, yes. The JVM itself isn’t single-threaded in practice either: mainstream implementations such as HotSpot run the JIT compiler and the garbage collector on their own background threads. Your Java application can use as many threads as it likes, and they are mapped to OS threads and will use multiple cores.
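A small sketch of what that looks like in code: the work is split across a fixed pool of threads sized to the number of cores the JVM reports. The class name, method name and chunking scheme are illustrative assumptions, not a tuned implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; the chunking scheme is deliberately simple.
final class ParallelSumDemo {

    // Sums 1..n by giving each worker thread one contiguous range.
    static long parallelSum(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = n / workers;
            for (int w = 0; w < workers; w++) {
                final long from = w * chunk + 1;
                // The last worker also takes whatever remains after even division.
                final long to = (w == workers - 1) ? n : (w + 1) * chunk;
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = from; i <= to; i++) {
                        sum += i;
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that worker is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores reported by the JVM: " + cores);
        System.out.println("sum = " + parallelSum(1_000_000L, cores));
    }
}
```

Watching the process in the operating system's task monitor while a larger workload runs should show several cores busy at once.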