CritLoops
© 2012 by Paul Hsieh
The following are the results of the CritLoops benchmark.
It compares the speed of multiple programming languages and compilers;
however, it can also be used to benchmark different platforms. It is open
source, covered by the three-clause BSD license.
Purpose
Compare programming languages, compilers and systems for performance in a
fair, objective and open way. It should be noted that this test differs
from other efforts in that it concentrates on:
- Coverage of all standard platform single CPU performance features
- Using very easy to understand and well known algorithms
- Essentially equivalent implementations between languages, where possible
Scoring
The CritLoops benchmark uses weighted geometric means of subtests within
categories. In this way, if a language is unable to implement a given
subtest, it can still produce representative numbers so long as it implements
at least one subtest in each category. This is an issue for Python, which
does not make a translation-time distinction between floating-point and
integer values in the way that the other languages do, and thus cannot
implement separate integer and floating-point heapsorts, for example.
The test allows for two modes of interpreting the results. One is
Standard Performance, which reflects the performance of the most
obvious rendition of each algorithm. The other, Optimized Performance,
reflects the performance that is possible with additional effort, which may
lead to less intuitive source code.
Results
The following are sample results for an Athlon XP2000 @ 1.784 GHz with
100 MHz DDRAM. It should be noted that newer versions of most of these
compilers/interpreters exist, and therefore the results are not necessarily
reflective of the state of the art.
Standard Performance
| Language (Compiler vendor) | Relative Rate of Performance |
Optimized Performance
| Language (Compiler vendor) | Relative Rate of Performance |
Interpretation of Results
With the important proviso that these are not the most up-to-date compilers,
we can nevertheless get a rough picture of what is going on with these
results.
- As expected, C and C++ score the highest results, but there is some
variation even amongst different compilers, and between the C and C++
languages. The difference between C and C++ in the standard results comes
mainly from the fact that C is limited to '\0'-terminated strings, which are
significantly slower than C++'s std::string. However, in the optimized
results C is able to recover the lost performance by using a length-delimited
string implementation. MSVS.NET C++ does particularly well in the standard
mode because of its vastly improved STL std::string versus MSVC 6.0, which
the Intel compiler uses.
- While C# and Java are slower, they shed only about 50% of their
potential performance (versus C++), which is the equivalent of 18 months of
hardware improvements in Moore's Law terms. One real curiosity, however, is
that the performance of C# can be improved by really barbaric optimization
methods. The improvements seen in the optimized results come from simulating
a two-dimensional array with a one-dimensional array, and from premultiplying
induction variables. The first may be due to an inescapable weakness of the
C# language (all multidimensional arrays are "ragged" arrays, rather than
being rectilinear), but the second is almost assuredly due to a weaker than
necessary compiler. Java did not appear to be affected by similar changes.
- Python turned in a surprisingly bad score. Other benchmarks led
me to believe that Python would end up around 10 times slower than C/C++;
however, 100 times slower is truly stunning. This is the equivalent of 10
years of hardware improvements in Moore's Law terms. It really puts into
perspective IronPython's claim of being able to improve Python's
performance by up to 3 times -- the performance problems with Python are
clearly more fundamental. (That is not to detract from the other good
features of the language.)
Download
You can download the source and some sample Win32 executables here.