Iteration techniques put on a bench(mark)
A couple of weeks ago I covered some of the additions Microsoft has made to native C++ for VC7-8. One of these additions is the "for each" loop, which can e.g. be used for STL container iteration.
The loop certainly looks nice, but how well does it optimize? Or concretely, how fast is it? To shed some light on this, I ran a series of tests on three different iteration techniques. All code was generated using VS8 with SP1 and a plain Release build.
First off is a plain for loop, which iterates by a subscript index:
int sum = 0;
vector<int> v;
// ...
size_t l = v.size();
for(size_t i = 0; i < l; ++i)
{
    sum = v[i];
}
The second variation is also a plain for loop, but this time iterators are used:
for(vector<int>::iterator i = v.begin(); i != v.end(); ++i)
{
    sum = *i;
}
And finally, the for each loop:
for each (int i in v)
{
    sum = i;
}
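For readers without the Microsoft compiler at hand, the three techniques can be sketched in portable C++ as below. This is only an approximation under stated assumptions: the function names are made up for illustration, the range-based for loop (C++11) stands in for the Microsoft-specific "for each", and the loops accumulate with += so the result can be checked, whereas the snippets above assign with =.

```cpp
#include <cstddef>
#include <vector>

// Index-based iteration; the size is read once before the loop.
int sum_by_index(const std::vector<int>& v)
{
    int sum = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}

// Iterator-based iteration; v.end() is evaluated on every pass.
int sum_by_iterator(const std::vector<int>& v)
{
    int sum = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        sum += *it;
    return sum;
}

// Range-based for (C++11), used here as a portable stand-in for
// the Microsoft-specific "for each (int i in v)" (an assumption).
int sum_by_range_for(const std::vector<int>& v)
{
    int sum = 0;
    for (int x : v)
        sum += x;
    return sum;
}
```

All three functions should return the same value for the same vector; only the generated code differs.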
Two main concerns have to be covered when comparing these iterations:
- How well does the technique handle a high number of elements to iterate?
- How well does the technique handle being called often?
In addition, one analysis will be done with _SECURE_SCL switched on, and one when it's switched off.
With _SECURE_SCL
Listing A: Index iteration
Listing B: Iterators
Listing C: for each
What's worth noticing here is mainly the calls to _invalid_parameter_noinfo, which are made in situations such as the following:
- the v.begin() iterator points to null.
- v.end() - v.begin() is suddenly less than i mid-loop, when the prerequisite for the iteration was that i < the vector length determined prior to entering the loop.
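Conceptually, the checks amount to validating each access before it is made. The sketch below is not the actual library code: checked_at and report_invalid_parameter are hypothetical names, and the handler throws so that the behaviour can be observed, whereas the real _invalid_parameter_noinfo ends the program, as noted further down.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the CRT's invalid parameter handler.
// It throws here purely so the failure is observable in a test.
inline void report_invalid_parameter()
{
    throw std::out_of_range("invalid parameter");
}

// Checked element access: the index is validated against the
// vector's current size on every call, before the element is read.
int checked_at(const std::vector<int>& v, std::size_t i)
{
    if (i >= v.size())
        report_invalid_parameter();
    return v[i];
}
```

The extra compare and branch on every access is the kind of per-iteration overhead that shows up in the instruction counts.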
| Technique | _SECURE_SCL: first run | _SECURE_SCL: each iteration | No _SECURE_SCL: first run | No _SECURE_SCL: each iteration |
| Index | 24 | 13 | 15 | 6 |
| Iterator | 20 | 14 | 9 | 5 |
| for each | 28 | 16 | 10 | 5 |
In the above table, the "first run" columns indicate how many instructions are required to set up the iteration sequence and complete the first run of the iteration body. The "each iteration" columns indicate how many instructions are used for each following iteration. The instruction count doesn't directly relate to the execution speed, but there it is. As seen from Listing C above, the for each loop has more verification code than the iterator technique. I actually find the generated code somewhat odd:
mov eax,dword ptr [v+8]
mov ecx,dword ptr [v+4]
cmp ecx,eax
push ebx
push esi
push edi
mov ebx,eax
jbe noproblem
call _invalid_parameter_noinfo
mov eax,dword ptr [v+8]
mov ecx,dword ptr [v+4]
noproblem:
cmp ecx,eax
mov esi,ecx
Notice the second initialization of eax and ecx. That code is never reached, as the application would terminate given that _invalid_parameter_noinfo is called. Now this particular construct may be there for the sake of optimizations, but I can't see how it would help.
Further down, there's another peculiar thing going on:
test edi,edi
je problem
cmp edi,offset v
je noproblem
problem:
call _invalid_parameter_noinfo
noproblem:
cmp esi,ebx
je endofloop
test edi,edi
...
The test at the top of this source block is the first instruction for each repeated iteration. The edi register will simply hold a pointer to the vector. Granted that no problems are detected, a typical flow (not listing the jumps) will look like:
test edi,edi
cmp edi,offset v
cmp esi,ebx
test edi,edi
...
So edi is being tested twice, and there's no way it can be changed in between. That really seems redundant to me.
Those remarks having been made, none of it really matters if you compile your application with _SECURE_SCL switched off.
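As a minimal sketch of switching it off in source: define the macro as 0 before any standard library header is included. Passing /D_SECURE_SCL=0 on the compiler command line has the same effect. The function name below is only for illustration, and I'm assuming the setting is kept the same in every translation unit, since mixing values is generally advised against.

```cpp
// Must appear before the first standard library include.
#define _SECURE_SCL 0

#include <cstddef>
#include <vector>

// Plain index loop; with _SECURE_SCL set to 0 under VS8, operator[]
// is compiled without the checked-access code discussed above.
int sum_items(const std::vector<int>& v)
{
    int sum = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}
```

On other compilers the macro is simply ignored, so the code still builds and behaves the same.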
Without _SECURE_SCL
Listing D: Index iteration
Listing E: Iterators
Listing F: for each
As the table shown earlier indicates, the instruction counts drop considerably for all three techniques, and there's really not much to say about any of the listings at this point. What remains is to see an indication of how fast they all run:
| Technique | _SECURE_SCL: 100M items, 1 run | _SECURE_SCL: 10 items, 10M runs | No _SECURE_SCL: 100M items, 1 run | No _SECURE_SCL: 10 items, 10M runs |
| Index | 350 | 410 | 315 | 280 |
| Iterator | 490 | 695 | 350 | 385 |
| for each | 585 | 800 | 340 | 385 |
All results in the above table are in milliseconds. 'M' indicates millions, 'k' thousands.
This has not been a scientific test, merely my findings on a standard desktop computer, but it does give an indication of how the iterations may perform compared to each other. The index iteration is the quickest, regardless of the _SECURE_SCL setting, how many items are iterated and how frequently the iteration sequence is run. What's interesting to see is that the slowdown of "for each" relative to the iterator technique goes away entirely when _SECURE_SCL is disabled. This may strengthen my notion of the "for each" not being optimally generated, but jumping to that conclusion from these (highly) non-scientific benchmarks is a long stretch. It does make you wonder, though. In either case, the speed difference isn't horrific (except perhaps the one between the index iteration and "for each" seen in the second column).
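For anyone wanting to repeat the measurement, a timing harness could look roughly like the sketch below. It is not the harness used for the numbers above: std::chrono (C++11) and the names sum_for_timing and time_runs_ms are assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// The loop under test: an index-based sum over the vector.
long long sum_for_timing(const std::vector<int>& v)
{
    long long sum = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}

// Runs the loop `runs` times and returns the elapsed time in
// milliseconds. The totals are added to `sink` so that the loop
// has an observable result the caller can inspect.
double time_runs_ms(const std::vector<int>& v, std::size_t runs, long long& sink)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < runs; ++r)
        sink += sum_for_timing(v);
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

The two scenarios in the table would then correspond to a vector of 100M items with runs set to 1, and a vector of 10 items with runs set to 10M.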

11 comments:
yep, no surprise there. It seems like the farther you get from the metal (i.e., using STL), the harder it is for the compiler writers to create something efficient.
Thanks again for your suggestion on the MSDN site (signal support in the CRT). It's refreshing to see somebody analyzing compiler output in assembler...
Thanks for the nice post!
Nice article. I had no idea SECURE_SCL was on by default in RELEASE. Turning it off made an actual difference in our execution time. Why wouldn't MS just leave that on for DEBUG?