Question

A recurring theme I've noticed in many questions on SE is the ongoing argument that C++ is faster and/or more efficient than higher-level languages like Java. The counter-argument is that, for a growing number of tasks, a modern JVM or CLR can be just as efficient thanks to JIT compilation and similar techniques, and that C++ is only more efficient if you know what you're doing and why doing things a certain way yields performance gains. That's obvious and makes perfect sense.

I'd like a basic explanation (if there is such a thing...), with practical examples, of why and how certain tasks are faster in C++ than on the JVM or CLR. Is it simply because C++ is compiled into machine code, whereas the JVM or CLR still have the processing overhead of JIT compilation at run time?

When I try to research the topic, all I find is the same arguments I've outlined above, without any detailed information on exactly how C++ can be used for high-performance computing.

Explanation / Answer

It's all about the memory (not the JIT). The JIT "advantage over C" is mostly limited to optimizing out virtual or non-virtual calls through inlining, something that the CPU's branch target buffer (BTB) is already working hard to do.

In modern machines, accessing RAM is really slow (compared to anything the CPU does), which means applications that use the caches as much as possible (which is easier when less memory is used) can be up to a hundred times faster than those that don't. And there are many ways in which Java uses more memory than C++ and makes it harder to write applications that fully exploit the cache:

There is a memory overhead of at least 8 bytes for each object, and the use of objects instead of primitives is required or preferred in many places (namely the standard collections).
Strings consist of two objects and have an overhead of 38 bytes.
UTF-16 is used internally, which means that each ASCII character requires two bytes instead of one (the Oracle JVM recently introduced an optimization to avoid this for pure ASCII strings).
There is no aggregate value type (i.e. structs), and in turn, there are no arrays of aggregate value types. A Java object, or array of Java objects, has very poor L1/L2 cache locality compared to C structs and arrays of structs.
Java generics use type erasure, so generic containers hold references to separately allocated objects, which gives poor cache locality compared to type instantiation, where elements can be stored inline.
Object allocation is opaque and has to be done separately for each object, so it is impossible for an application to deliberately lay out its data in a cache-friendly way and still treat it as structured data.
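
To make the layout points above concrete, here is a minimal C++ sketch. The Particle type and both function names are hypothetical, invented only for this illustration. The first function walks a std::vector<Particle>, where the elements sit back to back in one allocation with no per-object header. The second walks a vector of pointers to individually allocated objects, which is roughly the shape an array of Java object references takes. No timings are claimed here; how large the difference is depends on the machine, the allocator, and the access pattern.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Particle {
    float x, y, z;
    float mass;
};

// On mainstream compilers the struct is exactly four floats: no hidden header.
static_assert(sizeof(Particle) == 4 * sizeof(float),
              "expected no per-object overhead");

// Contiguous layout: std::vector<Particle> is a type-instantiated container,
// so the elements are stored inline, one after another, in a single block.
float total_mass_contiguous(const std::vector<Particle>& particles) {
    float total = 0.0f;
    for (const Particle& p : particles) {
        total += p.mass;  // sequential memory access
    }
    return total;
}

// Indirect layout: each element is a separate heap allocation reached through
// a pointer, so consecutive elements may be far apart in memory.
float total_mass_indirect(
        const std::vector<std::unique_ptr<Particle>>& particles) {
    float total = 0.0f;
    for (const auto& p : particles) {
        total += p->mass;  // one pointer dereference per element
    }
    return total;
}
```

Both functions compute the same result; only the memory layout they traverse differs, and that layout is something the C++ program chooses explicitly.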

Some other memory-related, but not cache-related, factors:

There is no stack allocation, so all non-primitive data you work with has to be on the heap and go through garbage collection (some recent JITs do stack allocation behind the scenes in certain cases).
Because there are no aggregate value types, there is no stack passing of aggregate value types. (Think efficient passing of Vector arguments.)
Garbage collection can hurt L1/L2 cache contents, and GC stop-the-world pauses hurt interactivity.
Converting between data types always requires copying; you cannot take a pointer to a bunch of bytes you got from a socket and interpret them as a float.
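
Two of the points above can be sketched in a few lines of C++. Vec3, add, and read_float are hypothetical names chosen for this illustration. The first function passes and returns a small aggregate by value, so nothing is allocated on the heap. The second reads a float out of raw bytes, such as a buffer filled from a socket. Note that this sketch uses std::memcpy rather than a pointer cast, because that is the well-defined way to do it in C++; compilers typically reduce it to a plain load. It also assumes the bytes are already in the host's byte order.

```cpp
#include <cstring>

// Hypothetical small aggregate; it can live on the stack and be passed by value.
struct Vec3 {
    float x, y, z;
};

// Takes and returns Vec3 by value: no heap allocation, nothing for a
// garbage collector to track.
Vec3 add(Vec3 a, Vec3 b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// Reinterprets four raw bytes as a float.
// Assumptions: float is 32 bits wide and the bytes are in host byte order.
float read_float(const unsigned char* bytes) {
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    float value;
    std::memcpy(&value, bytes, sizeof value);  // no intermediate object needed
    return value;
}
```

A caller would write something like add(Vec3{1, 2, 3}, Vec3{4, 5, 6}) or read_float(buffer + offset), where buffer and offset stand for whatever the application received.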

Some of these things are tradeoffs (for most people, not having to do manual memory management is worth giving up a lot of performance), some are probably the result of trying to keep Java simple, and some are design mistakes, though possibly only in hindsight. For example, the 16-bit Unicode encoding Java adopted was a fixed-length encoding when Java was created, which makes the decision to choose it a lot more understandable.

It's worth noting that many of these tradeoffs are very different for Java/JVM than they are for C#/CIL. The .NET CIL has value-type structs, stack allocation/passing, packed arrays of structs, and type-instantiated generics.
