tag:blogger.com,1999:blog-8405107760807432973.post5459635207870933531..comments2024-01-04T09:55:32.459-08:00Comments on Java Concurrency (&c): Scala / Java ShootoutJeremy Mansonhttp://www.blogger.com/profile/04241094734813086257noreply@blogger.comBlogger36125tag:blogger.com,1999:blog-8405107760807432973.post-29676017571666279482011-06-15T11:35:33.666-07:002011-06-15T11:35:33.666-07:00@Rex - Robert was just having a bit of fun. I don...@Rex - Robert was just having a bit of fun. I don't think he meant for any of this to be taken as seriously as it was.<br /><br />But in direct answer to your question: he never mentioned that site when discussing this paper with me.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-27947398100499173592011-06-15T10:24:06.056-07:002011-06-15T10:24:06.056-07:00The benchmark results are somewhat interesting, bu...The benchmark results are somewhat interesting, but I do wonder how they are significantly more illuminating than the various benchmarks on the Computer Languages Benchmark Game. Do you know if Robert was simply unaware of the site, or if there was a particular question that could not be answered by referring to the results there? I couldn't find a clear answer to this in the paper.Rexhttps://www.blogger.com/profile/07181243882004133987noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-86864694595687685122011-06-12T14:32:59.761-07:002011-06-12T14:32:59.761-07:00What I find quite amazing about this sort of compa...What I find quite amazing about this sort of comparison is the failure to recognise the benefits of the classloading approach.<br /><br />Because I can write an aspect do some load-time weaving on Java, I can apply caching on class and/or method named pattern matching across a whole application. I can also do that on the basis of picking up certain annotations at runtime.<br /><br />For testing Nealehttps://www.blogger.com/profile/07606902594236608044noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-54585037800512442442011-06-09T08:36:26.586-07:002011-06-09T08:36:26.586-07:00@Charles - Sometimes it should, sometimes it shoul...@Charles - Sometimes it should, sometimes it shouldn't. It depends on the benchmark (or, rather, it depends on what you are benchmarking). See my non-existent blog post for details. :)<br /><br />In this case, it probably should include that overhead, but clearly GC hadn't been taken into account when writing the code.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-30755722231452547562011-06-09T01:15:16.123-07:002011-06-09T01:15:16.123-07:00"The benchmark did not follow the rules of Ja..."The benchmark did not follow the rules of Java benchmarking. At some point, I should write a post about the rules of Java benchmarking (maybe I have? I don't know). Basically, it timed the full execution of the program, including startup time, JIT compilation time, and GC time."<br /><br /> Why it shouldn´t include GC time? As with C++, memory (and other resources) allocation and Charlesnoreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-56773167096894338052011-06-08T23:57:09.665-07:002011-06-08T23:57:09.665-07:00For what it is worth, Google didn't say it. R...For what it is worth, Google didn't say it. Robert said it in a paper he recognizes isn't exactly comprehensive. It was just a bit of fun that went too far. Google has no official opinion about which programming language you should use. That's why NativeClient exists. :)<br /><br />My recollection was that on my machine, the optimized version I first gave Robert ran in about 20 Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-4593905513743363622011-06-08T15:24:03.480-07:002011-06-08T15:24:03.480-07:00BTW, I forgot (you know, it's almost 2000 clas...BTW, I forgot (you know, it's almost 2000 classes :) that fastutil has array-based sets, too. I replaced IntOpenHashSet with IntArraySet in the same positions in which you use your IntegerSet, and I got a 20% improvement (now I'm about 10% slower than your version). The number of line changed of course is still the same.Sebastiano Vignahttp://vigna.dsi.unimi.it/noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-33003270245647994492011-06-08T14:52:02.320-07:002011-06-08T14:52:02.320-07:00Unfortunately I discovered your version only after...Unfortunately I discovered your version only after having made the post. :)<br /><br />Yes, obviously fastutil is slower, as it still provides a hash table. The set you implemented is an array-based set, well-suited for this case, in which sets are ridicolously small. But your IntegerSet class has linear insertion time and query time, and it would never work as a generic data structure as soon asSebastiano Vignahttp://vigna.dsi.unimi.it/noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-14921304538330570562011-06-08T09:56:58.614-07:002011-06-08T09:56:58.614-07:00@Sebastiano - Are you looking at the final optimiz...@Sebastiano - Are you looking at the final optimized version of C++? When I tried fastutil, it was:<br /><br />- Faster than C++'s original version, <br />- Slower than the handmade data structures I created, and<br />- Slower than C++ with all of the optimizations.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-10037214730040661082011-06-08T09:44:04.521-07:002011-06-08T09:44:04.521-07:00A final comment. After having implemented findSet(...A final comment. After having implemented findSet() recursively, and having replaced the creation of >200000 object at each call to findLoops() with object reuse, the wall clock of Java is *smaller* than that of C++. User time is marginally higher due to the parallel garbage collector.<br /><br />Java is faster than C++ on this benchmark.Sebastiano Vignahttp://vigna.dsi.unimi.it/noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-25809685205823747142011-06-08T09:09:16.714-07:002011-06-08T09:09:16.714-07:00I don't suppose Robert Hundt realized that he ...I don't suppose Robert Hundt realized that he was providing the world with Google's official assessment of programming languages :-)<br /><br /><a href="http://www.itproportal.com/2011/06/07/googles-rates-c-most-complex-highest-performing-language/" rel="nofollow">Google Rates C++ As The Most Complex, Highest Performing Language</a><br /><br /><a href="http://www.computing.co.uk/ctg/news/Isaac Gouyhttps://www.blogger.com/profile/02902123247585964087noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-85912709971167790482011-06-08T08:45:12.407-07:002011-06-08T08:45:12.407-07:00As soon as I read the paper, I knew there was some...As soon as I read the paper, I knew there was something horribly wrong. I just changed the collections used in the code to fastutil's (overall 20 changed lines!) and now C++ is not even three times faster. <br /><br />I could play a bit with the profiler to eliminate obvious bottleneck, but the point here is that Java built-in collections are deadly slow on primitive types, and everybody Sebastiano Vignahttp://vigna.dsi.unimi.it/noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-16698008256950065522011-06-07T15:39:42.430-07:002011-06-07T15:39:42.430-07:00I doubt it. It requires some assumptions about th...I doubt it. It requires some assumptions about the operating environment that shouldn't get made if you don't control said operating environment. IIRC, Cliff has a good bit in the talk about what it would take to make it widely practicable.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-30849710916166553932011-06-07T15:31:47.895-07:002011-06-07T15:31:47.895-07:00Thanks so much for the info! That's what I tho...Thanks so much for the info! That's what I thought of when trying to think about "CycleTime"; a quicker currentTimeMillis based on TSC is definitely a big win. Do you think this could land in the 'official' Mercurial?Pierre Queinnecnoreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-27219595763686472222011-06-07T14:22:48.027-07:002011-06-07T14:22:48.027-07:00@Pierre - it uses a System.currentTimeMillis/nanoT...@Pierre - it uses a System.currentTimeMillis/nanoTime mechanism based on the processor timestamp counter and some magic kernel juju, rather than the overweight mechanism that is the current implementation.<br /><br />Cliff Click talks in general about the problem that this fixes at 19:45 in this video:<br /><br />http://www.youtube.com/watch?v=uL2D3qzHtqYJeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-82649914724287566312011-06-07T14:16:54.256-07:002011-06-07T14:16:54.256-07:00Hi Jeremy,
Thanks for your post. I tried to find ...Hi Jeremy,<br /><br />Thanks for your post. I tried to find CycleTime in arguments.cpp but couldn't, obviously as per your last comment. Would you care to explain what it does?<br /><br />Thanks a lot!Pierre Queinnecnoreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-85029826339774091482011-06-07T11:36:41.630-07:002011-06-07T11:36:41.630-07:00I *think* Figure 8 reflects the original code. He...I *think* Figure 8 reflects the original code. He also mentions a -XX:+CycleTime flag, which I don't see him mentioning is a Google-specific flag I added.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-54370574956986534262011-06-07T11:34:02.843-07:002011-06-07T11:34:02.843-07:00@Isaac - I'm not sure what the status of the c...@Isaac - I'm not sure what the status of the code is, either. There are at least three possible versions:<br /><br />1) Robert's initial pass<br />2) My initial revisions<br />3) My subsequent revisions<br /><br />My initial revisions have hand-build versions of HashMap that permit scalars instead of boxed Integers. My subsequent revisions use ArrayDeque, as well. The paper uses (2) asJeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-8161916063262582712011-06-07T09:05:24.515-07:002011-06-07T09:05:24.515-07:00OK.
I'll just finish with the suggestion tha...OK. <br /><br />I'll just finish with the suggestion that your "didn't realize anyone would actually read the paper" quip is probably quite close to what's happened.<br /><br />Seems like few of us read the paper well, we kind of glanced at it - I needed half-a-dozen attempts before I succeeded in finding Figure 3.<br /><br />And I'm still not convinced that I understandIsaac Gouyhttps://www.blogger.com/profile/02902123247585964087noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-31385715653618472582011-06-06T10:07:35.430-07:002011-06-06T10:07:35.430-07:00I'm not going to dive into this too deeply. I...I'm not going to dive into this too deeply. I will reiterate that I include time spent interpreting and doing JIT compilation in JVM startup, and that I retracted that statement (as above) when you pointed out that the benchmark does a warmup phase.<br /><br />I will also point out that my statement that programmers are more productive in Java than in C++ is simply borne from my experiences Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-71582176101148841532011-06-06T09:08:03.247-07:002011-06-06T09:08:03.247-07:001) I'm always entertained by - The benchmark d...1) I'm always entertained by - <i>The benchmark did not follow the rules of Java benchmarking</i> excuse.<br /><br />I'm amused that we should just not count GC time instead of reducing GC, and that JVM startup is so horrendous on your machine that it's a significant portion of even your 9 second measurement :-)<br /><br />Are you trying to send the message that JVM startup is <i>thatIsaac Gouyhttps://www.blogger.com/profile/02902123247585964087noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-53379880240165965982011-06-05T17:10:36.379-07:002011-06-05T17:10:36.379-07:00In the end: thanks for posting, it made for a funn...<i>In the end: thanks for posting, it made for a funny Saturday night.</i><br /><br />Our efforts, as always, are purely to entertain.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-73467741484256520672011-06-05T11:44:36.814-07:002011-06-05T11:44:36.814-07:00Jeremy, perhaps you can blog some on proper use of...Jeremy, perhaps you can blog some on proper use of datastructures in Java. I believe even "Effective Java" doesn't address the matter properly. Using the right structure is vital for effectiveness. I know it's not related to concurrency but you do write very well.<br /><br />The benchmark had it all wrong. HashSet w/o hashcode/equals (although the instances are unique and 'Stanimir Simeonoffhttps://www.blogger.com/profile/15526543718385237177noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-86181843763037349272011-06-05T10:58:20.990-07:002011-06-05T10:58:20.990-07:00@Stanimir Simeonoff
I'm not sure why you fee...@Stanimir Simeonoff <br /><br />I'm not sure why you feel the need to teach me Java basics :-)<br /><br />'A new feature in <a href="http://www.oracle.com/technetwork/java/javase/6u25releasenotes-356444.html" rel="nofollow">this version of HotSpot</a> is "tiered" compilation in the Server VM that enables it to start quickly as does the Client VM, while achieving superior peak Isaac Gouyhttps://www.blogger.com/profile/02902123247585964087noreply@blogger.comtag:blogger.com,1999:blog-8405107760807432973.post-34823878853854291222011-06-05T10:26:10.127-07:002011-06-05T10:26:10.127-07:00Actually, the static collections were something I ...Actually, the static collections were something I put in there, as the original was using HashMap, and thereby creating millions of Integer objects a second. This was the quickest way I could think of to resolve that problem.Jeremy Mansonhttps://www.blogger.com/profile/04241094734813086257noreply@blogger.com