This benchmark seems irrelevant

The 1B-row benchmark in this blog post seems irrelevant for comparing the performance of different programming languages.

It seems like a program's execution time would be dominated by loading data from the file. A lot of people posted solutions with the specs of their CPU but not of their disk (HDD, SSD, RAID), although the disk seems more relevant.

Why would they compare languages and solutions in this way?
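One way to check the "dominated by loading" guess on a given machine is to time a plain read of the file against a read that also parses and aggregates. The following is only a rough sketch: the class name, the helper `writeSample`, and the `name;value` line format are assumptions made for illustration, not taken from the challenge repo.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ReadVsParse {
    // I/O only: reads the whole file and returns the number of bytes read.
    static long readOnly(Path file) throws IOException {
        return Files.readAllBytes(file).length;
    }

    // I/O plus work: reads the file, parses each line, and sums the values per key.
    static Map<String, Double> readAndAggregate(Path file) throws IOException {
        Map<String, Double> sums = new HashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int sep = line.indexOf(';');
            String key = line.substring(0, sep);
            double value = Double.parseDouble(line.substring(sep + 1));
            sums.merge(key, value, Double::sum);
        }
        return sums;
    }

    // Hypothetical helper: writes a sample file in an assumed "name;value" format.
    static Path writeSample(int rows) throws IOException {
        Path file = Files.createTempFile("sample", ".txt");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            sb.append("station").append(i % 10).append(';').append(i % 50).append(".5\n");
        }
        Files.writeString(file, sb.toString());
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeSample(1_000_000);
        long t0 = System.nanoTime();
        readOnly(file);
        long t1 = System.nanoTime();
        readAndAggregate(file);
        long t2 = System.nanoTime();
        System.out.printf("read only:        %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("read + aggregate: %d ms%n", (t2 - t1) / 1_000_000);
        Files.delete(file);
    }
}
```

Keep in mind that a file that was just written or read is likely served from the operating system's page cache, so this compares parsing cost against cached reads rather than against raw disk speed.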

https://devclass.com/2024/01/04/how-fast-is-your-programming-language-new-contest-and-benchmarks-spark-debate/

Comments (5)

aubeynarf

lemmynsfw.com

aluminium reply

lemmy.world

Also, big "enterprise" software usually becomes slow because of fundamental issues or issues in the architecture, not because of the language.

For example, I worked on maintaining an old Java EE project, and people there constantly made multiple sequential HTTP requests even though the requests did not depend on one another.
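To illustrate the difference, here is a minimal sketch rather than anything from that project. `fetch` is a hypothetical stand-in that just sleeps to simulate the latency of an HTTP call, and the class and method names are made up for the example. Independent calls are started together with `CompletableFuture` and then joined, so the total wait is roughly one call's latency instead of the sum.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class IndependentRequests {
    // Hypothetical stand-in for an HTTP request: sleeps to simulate network latency.
    static String fetch(String name) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response:" + name;
    }

    // One request after another: total time is about the sum of all latencies.
    static List<String> fetchSequentially(List<String> names) {
        return names.stream().map(IndependentRequests::fetch).collect(Collectors.toList());
    }

    // All requests started up front, then joined: total time is about the slowest request.
    static List<String> fetchConcurrently(List<String> names) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.size()));
        try {
            // Collect the futures first so every request is submitted before any join blocks.
            List<CompletableFuture<String>> futures = names.stream()
                .map(n -> CompletableFuture.supplyAsync(() -> fetch(n), pool))
                .collect(Collectors.toList());
            return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> names = List.of("users", "orders", "prices");

        long t0 = System.nanoTime();
        fetchSequentially(names);
        long t1 = System.nanoTime();
        fetchConcurrently(names);
        long t2 = System.nanoTime();

        System.out.printf("sequential: %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("concurrent: %d ms%n", (t2 - t1) / 1_000_000);
    }
}
```

Both variants return the responses in the same order as the input list; only the waiting overlaps.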

killeronthecorner

lemmy.world

To answer your question about environmental and hardware factors - from the repo:

Results are determined by running the program on a Hetzner Cloud CCX33 instance (8 dedicated vCPU, 32 GB RAM). The time program is used for measuring execution times, i.e. end-to-end times are measured.

AlphaAutist reply

lemmy.world

That seems to only be for the Java code

How fast though is Java versus other languages? A show and tell page has submissions in Rust, C#, Go, Python, PostgreSQL, Python, C, C++, and more. These are hard to compare with one another since they have been run on different hardware, but there are some impressive results, including one under 5 seconds done with C on an AMD laptop, and a C# solution that runs in 5.3 seconds on a Core i5-12500 with 6 cores.

killeronthecorner reply

lemmy.world

The show and tell page is exactly that, show and tell; not a scientific or balanced comparison.

The original challenge only compared JDK solutions in this way. Further down there is a link to another repo that does the same across many languages and uses the same M1 MacBook Pro to run all the tests.

orhtej2

eviltoast.org

I would assume they want to factor in startup time as well as I/O handling overhead. Raw disk I/O should be the same, given that the programs are run in the same environment.