Optimize multicast stubs #130207
@EgorBot -arm -amd --envvars DOTNET_JitDisasm:IL_STUB_MulticastDelegate_Invoke

using BenchmarkDotNet.Attributes;

public class MyBenchmarks
{
    public Action a;

    [GlobalSetup]
    public void Setup()
    {
        for (int i = 0; i < 10000; i++)
            a += new Action(() => {});
    }

    [Benchmark]
    public void Bench() => a();
}

Hmm, the asm seems better but the arm perf is much worse. I assume it's because the JIT is ordering blocks wrong, which causes branch mispredictions. EDIT: the branch order is also different from what I get locally; is the bot using any weird settings? @EgorBo
These sort of
We have seen a number of cases where unsafe code "optimizations" resulted in worse performance. It sounds like this is another one of those.
What are the covariant helpers that this is eliminating?
The typical multicast delegate has very few targets, and the targets are typically different. This is not a very representative microbenchmark.
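
For illustration, a benchmark closer to that description might use a handful of distinct targets instead of 10000 copies of one lambda. This is only a sketch: the class name `FewTargetsBenchmarks`, the `T0`..`T7` methods, and the `[Params]` values are hypothetical choices, not something taken from this PR.

```csharp
using System;
using BenchmarkDotNet.Attributes;

public class FewTargetsBenchmarks
{
    // Hypothetical target counts; small values approximate typical event usage.
    [Params(2, 4, 8)]
    public int TargetCount;

    // Incremented by every target so the calls have an observable effect.
    public int Counter;

    private Action _multicast;

    // Distinct methods, so the invocation list does not repeat one target.
    private void T0() => Counter++;
    private void T1() => Counter++;
    private void T2() => Counter++;
    private void T3() => Counter++;
    private void T4() => Counter++;
    private void T5() => Counter++;
    private void T6() => Counter++;
    private void T7() => Counter++;

    [GlobalSetup]
    public void Setup()
    {
        Action[] targets = { T0, T1, T2, T3, T4, T5, T6, T7 };
        _multicast = null;
        // Combine only the first TargetCount delegates.
        for (int i = 0; i < TargetCount; i++)
            _multicast += targets[i];
    }

    [Benchmark]
    public void Invoke() => _multicast();
}
```

The targets here are still trivial instance methods on one object, so this only addresses the "few, different targets" point; it does not model realistic callee work.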

@EgorBot -arm -amd --envvars DOTNET_JitDisasm:IL_STUB_MulticastDelegate_Invoke

using BenchmarkDotNet.Attributes;

public class MyBenchmarks
{
    public Action a;

    [GlobalSetup]
    public void Setup()
    {
        for (int i = 0; i < 10000; i++)
            a += new Action(() => {});
    }

    [Benchmark]
    public void Bench() => a();
}

#endif // DEBUGGING_SUPPORTED

ILCodeLabel *realLoopStart = pCode->NewCodeLabel();
pCode->EmitBEQ(realLoopStart);
This doesn't look optimal for branch prediction (a forward conditional jump is the hot path). The original logic was structured to optimize for this.

Rewrites multicast stubs to byref loops to remove bounds checks and covariance helpers from every iteration.
This assumes the wrapper struct is the same size as a ref; is that assumption fine for the VM? @jkotas @MichalStrehovsky
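
A rough C# analogue of the byref-loop idea, for readers who do not want to read the IL stub. It is not the stub itself: `Wrapper`, `ByRefLoopSketch`, and `InvokeByRef` are hypothetical names, and the layout of the real invocation list is an assumption here. The sketch walks an array through a `ref` that is advanced with `Unsafe.Add`, so the loop body has no per-element bounds check, and it spells out the size assumption from the description as an explicit check.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical wrapper: a struct holding exactly one object reference.
public struct Wrapper
{
    public Action Value;
}

public static class ByRefLoopSketch
{
    // The size assumption from the description, expressed as a runtime check.
    public static bool WrapperIsPointerSized() =>
        Unsafe.SizeOf<Wrapper>() == IntPtr.Size;

    public static void InvokeByRef(Wrapper[] targets, int count)
    {
        // Validate once up front; the loop itself does no bounds checks.
        if ((uint)count > (uint)targets.Length)
            throw new ArgumentOutOfRangeException(nameof(count));

        ref Wrapper current = ref MemoryMarshal.GetArrayDataReference(targets);
        ref Wrapper end = ref Unsafe.Add(ref current, count);

        // Advance the byref until it reaches the one-past-the-end position.
        while (Unsafe.IsAddressLessThan(ref current, ref end))
        {
            current.Value();
            current = ref Unsafe.Add(ref current, 1);
        }
    }
}
```

Whether the JIT lays out the resulting loop well (for example, keeping the hot path as the fall-through or a backward branch) is a separate question that this sketch does not answer.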