Delegates and inlining
by Leandro Lucarella on 2010-06-28 12:30 (updated on 2010-06-28 12:30)- with 3 comment(s)
Sometimes performance issues matter more than you might think for a language. In this case I'm talking about the D programming language.
I'm trying to improve the GC, and I want to improve it not only in terms of performance, but in terms of code quality too. But I'm hitting some performance issues that prevent me from making the code better.
D supports high-level constructs, like delegates (aka closures). For example, to do a simple linear search I wanted to use this code:
T* find_if(bool delegate(ref T) predicate)
{
    for (size_t i = 0; i < this._size; i++)
        if (predicate(this._data[i]))
            return this._data + i;
    return null;
}
...
auto p = find_if((ref T t) { return t > 5; });
But in DMD, you don't get that predicate inlined (nor the find_if() call, for that matter), so you're basically screwed: suddenly your code is ~4x slower. Seriously, I'm not joking. Using callgrind to profile the program (DMD's profiler doesn't work for me; I get a stack overflow for a recursive call when I try to use it), doing the call takes 4x more instructions, and in a real-life example, using Dil to generate the Tango documentation, I get a 3.3x performance penalty for using this high-level construct.
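To make the cost concrete, here is a minimal sketch of what inlining would amount to for the find_if() example above. It is specialised for int, and the names findGreaterThanFive and findIf, as well as the int[] parameter, are assumptions made for illustration; they are not part of the GC code.

```d
// Hypothetical hand-inlined version of
//     find_if((ref T t) { return t > 5; })
// specialised for int. This is roughly the loop one would hope an
// inlining compiler produces: no indirect call per element.
int* findGreaterThanFive(int[] data)
{
    for (size_t i = 0; i < data.length; i++)
        if (data[i] > 5)            // predicate body pasted in place of the call
            return data.ptr + i;
    return null;
}

// The same search going through a delegate, mirroring find_if().
// Without inlining, every element costs one indirect call.
int* findIf(int[] data, bool delegate(ref int) predicate)
{
    for (size_t i = 0; i < data.length; i++)
        if (predicate(data[i]))
            return data.ptr + i;
    return null;
}

void main()
{
    auto data = [1, 3, 7, 9];
    auto a = findGreaterThanFive(data);
    auto b = findIf(data, (ref int t) { return t > 5; });
    // Both versions find the same element; only the generated code differs.
    assert(a is b && *a == 7);
}
```

Both functions return the same pointer; the difference only shows up in the instruction count when the compiler leaves the delegate call in place.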
I guess this is why D2's sort uses string mixins instead of delegates for this kind of thing. The only reading I can make of this is that delegates are failing in D, either because they have a bad syntax (compare sort(x, (ref X a, ref X b) { return a > b; }) with sort!"a < b"(x)) or because their performance sucks (mixins are inlined by definition, think of C macros). The language designer is telling you "don't use that feature".
Fortunately the latter is only a DMD issue; LDC is able to inline those predicates (they have to inhibit the DMD front-end inlining to let LLVM do the dirty work, and it definitely does it better).
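As a rough illustration of that style, the search can be written in D2 with the predicate passed as a compile-time parameter. This is only a sketch: findIf is a hypothetical free function, not code from the GC, and it relies on std.functional's unaryFun to turn the string into a callable, in the same spirit as the sort!"a < b"(x) call.

```d
import std.functional : unaryFun;

// Hypothetical free-function variant of find_if(): the predicate is a
// template (compile-time) parameter instead of a run-time delegate, so
// each instantiation sees the predicate body directly.
T* findIf(alias pred, T)(T[] data)
{
    foreach (ref item; data)
        if (unaryFun!pred(item))    // "a" in a string predicate names the element
            return &item;
    return null;
}

void main()
{
    auto data = [1, 3, 7, 9];
    auto p = findIf!"a > 5"(data);
    assert(p !is null && *p == 7);
    assert(findIf!"a > 100"(data) is null);
}
```

The trade-off is that every distinct predicate produces a separate instantiation of the function, and the predicate can no longer be chosen at run time.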
The problem is I can't use LDC because for some unknown reason it produces a non-working Dil executable, and Dil is the only real-life program I have to test and benchmark the GC.
I think this issue really hurts D, because if you can't write performance-critical code using higher-level D constructs, you can't showcase your own language in the important parts.
Comment #1
by Michal Minich on 2010-06-29 18:00
What is the performance when function is used instead of delegate? Is the inlining done in this case?
Comment #2
by Luca on 2010-06-29 19:14
by Rob on 2010-06-29 13:14
D delegates also support the syntax (a,b){ return a > b; } whenever there's enough information for type deduction to work.
Are you sure? I can't see (in the specs) any mention of that syntax in the grammar. What I just found out is that using a function body directly is supposed to be valid, but I can't see how that is not ambiguous with a plain new scope. Maybe there's an error in the specs (or in my interpretation of them).
Also, have you filed a bugzilla report on the performance issues?
Yes, I used a really old bug report, the link is (a little hidden) in the article: http://d.puremagic.com/issues/show_bug.cgi?id=859
by Michal Minich on 2010-06-29 18:00
What is the performance when function is used instead of delegate? Is the inlining done in this case?
I didn't try, since I needed a delegate, but be my guest if you want =)
Comment #0
by Rob on 2010-06-29 13:14
D delegates also support the syntax (a,b){ return a > b; } whenever there's enough information for type deduction to work. Also, have you filed a bugzilla report on the performance issues?