For Programmers: Free Programming Magazines  


Home > Archive > VC Language > June 2005 > Performance: Iterating a vector in a loop..









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Performance: Iterating a vector in a loop..
Mark Ingram

2005-05-26, 4:02 pm

Hi, im having a discussion with a colleague about iterating through a
vector in a loop.

my current way is:

vector<MyType>::iterator itLoop;
for (itLoop = vec.begin(); itLoop != vec.end(); itLoop++)
{
itLoop->DoSomething();
}

But he was saying its slow because .begin() and .end() are called every
loop.

I know that the function is called every iteration, but was under the
impression that STL is highly optimised in dealing with things like
this, so it shouldnt make too much difference.

Can anyone shed any light on this and explain whether or not it really
has any speed hits when doing that, or, if it would be better to
re-write the code like this (which i feel is less legible).

vector<MyType>::iterator itLoop, itBegin = vec.begin(), itEnd= vec.end();

for (itLoop = itBegin ; itLoop != itEnd; itLoop++)
{
itLoop->DoSomething();
}

Thanks for any info.
Hendrik Schober

2005-05-26, 4:02 pm

Mark Ingram <mark.ingram@removeme.softease.com> wrote:
> Hi, im having a discussion with a colleague about iterating through a
> vector in a loop.
>
> my current way is:
>
> vector<MyType>::iterator itLoop;
> for (itLoop = vec.begin(); itLoop != vec.end(); itLoop++)
> {
> itLoop->DoSomething();
> }
>
> But he was saying its slow because .begin() and .end() are called every
> loop.


That's wrong. Only 'end()' is called each time.
'begin()' is called once.

> I know that the function is called every iteration, but was under the
> impression that STL is highly optimised in dealing with things like
> this, so it shouldnt make too much difference.


It can only do this when it can proof that
the container isn't changed within the loop.
AFAIK that's tricky for compilers more often
than we think.

> Can anyone shed any light on this and explain whether or not it really
> has any speed hits when doing that, or, if it would be better to
> re-write the code like this (which i feel is less legible).
>
> vector<MyType>::iterator itLoop, itBegin = vec.begin(), itEnd= vec.end();
>
> for (itLoop = itBegin ; itLoop != itEnd; itLoop++)
> {
> itLoop->DoSomething();
> }


What I often do is this:

for( container::iterator it=c.begin(), end=c.end(); it!=end; ++it ) {
// ...
}

If ocurse, that makes for long lines, so I break
this, which messes up automatic formatting...

> Thanks for any info.



HTH,

Schobi

--
SpamTrap@gmx.de is never read
I'm Schobi at suespammers dot org

"Coming back to where you started is not the same as never leaving"
Terry Pratchett


Duane Hebert

2005-05-26, 4:02 pm


"Hendrik Schober" <SpamTrap@gmx.de> wrote in message news:evHr$oeYFHA.4032@tk2msftngp13.phx.gbl...[color=darkred]
> Mark Ingram <mark.ingram@removeme.softease.com> wrote:

In addition to Shoby's advice, you should
consider a pre-increment. Depending on
what your MyType class is, ++itLoop may
be a noticeable improvement.


Tom Widmer

2005-05-26, 4:02 pm

Duane Hebert wrote:
> "Hendrik Schober" <SpamTrap@gmx.de> wrote in message news:evHr$oeYFHA.4032@tk2msftngp13.phx.gbl...
>
>
>
> In addition to Shoby's advice, you should
> consider a pre-increment. Depending on
> what your MyType class is, ++itLoop may
> be a noticeable improvement.


The type of MyType won't affect the performance of ++itLoop vs itLoop++,
rather the type of container may, depending on how good the optimizer is
and how complex the container::iterator type is.

Tom
Bo Persson

2005-05-26, 4:02 pm


"Hendrik Schober" <SpamTrap@gmx.de> skrev i meddelandet
news:evHr$oeYFHA.4032@tk2msftngp13.phx.gbl...
> Mark Ingram <mark.ingram@removeme.softease.com> wrote:
>
> That's wrong. Only 'end()' is called each time.
> 'begin()' is called once.


Additionally, the vec.end() is a very small function that inlines
nicely. Unless DoSomething() somehow affects the private members of
vector, the value of end() will not change thru the loop.

>
>
> It can only do this when it can proof that
> the container isn't changed within the loop.
> AFAIK that's tricky for compilers more often
> than we think.


But calling DoSomething() for a vector element, is highly unlikely to
affect the vector itself. The optimizer may very well understand this.


Bo Persson


Duane Hebert

2005-05-26, 9:02 pm

> The type of MyType won't affect the performance of ++itLoop vs itLoop++,
> rather the type of container may, depending on how good the optimizer is
> and how complex the container::iterator type is.


I'm not sure exactly what the side affects of the post increment are going
to be. It may well depend on the optimizer or the complexity of the
iterator
but since it's not using the side affects, why do it this way? My point was
that if the OP is concerned enough about overhead (as it appears since
he asked about calling .end() repeatedly and .end() is most likely not
very expensive) that he may also want to consider the pre-increment.


andré m.a

2005-05-26, 9:02 pm


> But he was saying its slow because .begin() and .end() are called every
> loop.
>
> I know that the function is called every iteration, but was under the
> impression that STL is highly optimised in dealing with things like this,
> so it shouldnt make too much difference.
>
> Can anyone shed any light on this and explain whether or not it really has
> any speed hits when doing that, or, if it would be better to re-write the
> code like this (which i feel is less legible).
>
> vector<MyType>::iterator itLoop, itBegin = vec.begin(), itEnd= vec.end();
>
> for (itLoop = itBegin ; itLoop != itEnd; itLoop++)
> {
> itLoop->DoSomething();
> }
>
> Thanks for any info.


I dont think that begin() is required more then once.
But the fact that you only call end() once is logicaly an improvement.

and im not sure if

while( itLoop != itEnd )
{
(++itLoop)->DoSomething();
}

would be faster either. At this point if the vector is very large,
you may want to set up a test to verify these things yourself.





Simon Trew

2005-05-27, 4:04 am

"andré m.a" <a.m.a@videotron.ca> wrote in message
news:YArle.5230$8j3.220242@weber.videotron.net...
>
> and im not sure if
>
> while( itLoop != itEnd )
> {
> (++itLoop)->DoSomething();
> }
>
> would be faster either.


since 'for' is more or less a bit of syntactic sugar for 'while', it would
more than likely be faster, since here you are missing off the first element
of the collection, so you're doing one iteration. of course, if itLoop ==
end() on entry to the loop, you have undefined behavior.

You'd need to use a post-increment or put the pre-increment after the call
to DoSomething.

S.


Simon Trew

2005-05-27, 4:04 am

"Simon Trew" <ten.enagro@werts> wrote in message
news:ucBc9$kYFHA.580@TK2MSFTNGP15.phx.gbl...
> so you're doing one iteration


I meant one iteration fewer.

Obviously time for bed.

S.


Simon Trew

2005-05-27, 4:04 am

"Simon Trew" <ten.enagro@werts> wrote in message
news:ucBc9$kYFHA.580@TK2MSFTNGP15.phx.gbl...
> "andré m.a" <a.m.a@videotron.ca> wrote in message
> news:YArle.5230$8j3.220242@weber.videotron.net...
> Of course, if itLoop == end() on entry to the loop


....and, ++itLoop == end().

Bedtime.

S.


Tim Roberts

2005-05-27, 4:04 am

Mark Ingram <mark.ingram@removeme.softease.com> wrote:
>
>Hi, im having a discussion with a colleague about iterating through a
>vector in a loop.
>
>my current way is:
>
>vector<MyType>::iterator itLoop;
>for (itLoop = vec.begin(); itLoop != vec.end(); itLoop++)
>{
> itLoop->DoSomething();
>}
>
>But he was saying its slow because .begin() and .end() are called every
>loop.


None of the other posters have mentioned a potentially even better (and
certainly sexier) alternative. You can replace that entire snippet with
this:

for_each( vec.begin(), vec.end(), mem_fun_ref(&MyType::DoSomething) );
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc
Tom Widmer

2005-05-27, 8:59 am

Duane Hebert wrote:
>
>
> I'm not sure exactly what the side affects of the post increment are going
> to be. It may well depend on the optimizer or the complexity of the
> iterator
> but since it's not using the side affects, why do it this way?


No reason - if you aren't using the value of the expression,
pre-increment is always preferred.

My point was
> that if the OP is concerned enough about overhead (as it appears since
> he asked about calling .end() repeatedly and .end() is most likely not
> very expensive) that he may also want to consider the pre-increment.


That part of your point I agreed with, just not the bit about MyType.

Tom
Hendrik Schober

2005-05-27, 8:59 am

Bo Persson <bop@gmb.dk> wrote:
> [...]
>
> But calling DoSomething() for a vector element, is highly unlikely to
> affect the vector itself.


I agree with this.

> The optimizer may very well understand this.


And then it might not.
When 'DoSomething()' is implemented in another
TU, the compiler might have a hard time to
proof that it doesn't have access to some
reference/pointer to the vector.
I know that VC optimizes across TUs, but I
don't know the specifications of what it is
able to do and what not. I wouldn't want to
depend on this. (Also, all code I write is
ported to a few other platforms. And some
might get compiled with VC6.)

> Bo Persson


Schobi

--
SpamTrap@gmx.de is never read
I'm Schobi at suespammers dot org

"Coming back to where you started is not the same as never leaving"
Terry Pratchett


Mark Ingram

2005-05-27, 8:59 am

Duane Hebert wrote:
> "Hendrik Schober" <SpamTrap@gmx.de> wrote in message news:evHr$oeYFHA.4032@tk2msftngp13.phx.gbl...
>
>
>
> In addition to Shoby's advice, you should
> consider a pre-increment. Depending on
> what your MyType class is, ++itLoop may
> be a noticeable improvement.
>
>


Why does the pre-increment make a difference?

if itLoop wasnt an iterator and was an int for example, wouldnt it mean
the loop starts at 1 instead of 0?

for (int i = 0; i < 5; ++i)
{
// ??
}


Thanks for all the replies!
Mark Ingram

2005-05-27, 8:59 am

Tim Roberts wrote:

> None of the other posters have mentioned a potentially even better (and
> certainly sexier) alternative. You can replace that entire snippet with
> this:
>
> for_each( vec.begin(), vec.end(), mem_fun_ref(&MyType::DoSomething) );



How does the function call work here? How do you pass in a parameter?
Is this quicker than the for loop?
Duane Hebert

2005-05-27, 8:59 am


> That part of your point I agreed with, just not the bit about MyType.


Yep.


Duane Hebert

2005-05-27, 8:59 am


> Why does the pre-increment make a difference?

What happens with the last iteration? How does
it determine that it's at end()?



Hendrik Schober

2005-05-27, 8:59 am

Mark Ingram <mark.ingram@removeme.softease.com> wrote:
> [...]
> Why does the pre-increment make a difference?


A somewhat canonical implementation of post-
increment for a type 'T' would be

T operator++(int)
{
T tmp(*this);
inc();
return tmp;
}

wheras pre-increment looks luike this


T& operator++(int)
{
inc();
return *this;
}

If you look close, you see that the former
makes at least two copies of the object which
the latter doesn't make. If you are not going
to use the result of post-increment this is
a waste that might be avoided by the optimizer
or it might not.

> [...]



Schobi

--
SpamTrap@gmx.de is never read
I'm Schobi at suespammers dot org

"Coming back to where you started is not the same as never leaving"
Terry Pratchett


Tom Widmer

2005-05-27, 8:59 am

Mark Ingram wrote:
> Tim Roberts wrote:
>
>
>
>
> How does the function call work here?


mem_fun_ref is a function that takes a member function pointer and
returns a functor. The returned functor basically wraps the member
pointer and adapts it so that it can be called with the this pointer as
the first parameter. e.g. the functor has an operator() that looks a bit
like this:

Ret (T::*memptr)(); //member variable holding member pointer.
public:
Ret operator()(T& obj)
{
return (obj.*memptr)(); //use passed param to call member func.
}

> How do you pass in a parameter?


The easiest way is to use boost::bind, which will shortly be part of the
standard library extension "Technical Report 1" (tr1). Then you can do:

void MyType::DoSomething(int i)
//...
using std::tr1::bind; //or using boost::bind
for_each(vec.begin(), vec.end(), bind(&MyType::DoSomething, _1, 10));

> Is this quicker than the for loop?


Not unless the optimizer is being very clever. Calling a function
through a member function pointer is generally slower than calling it
directly. for_each is a pretty useless algorithm in any case, but the
above is more useful for more complex algorithms like sort and remove_if.

Tom
Mark Ingram

2005-05-27, 4:03 pm

Hendrik Schober wrote:
> Mark Ingram <mark.ingram@removeme.softease.com> wrote:
>
>
>
> A somewhat canonical implementation of post-
> increment for a type 'T' would be
>
> T operator++(int)
> {
> T tmp(*this);
> inc();
> return tmp;
> }
>
> wheras pre-increment looks luike this
>
>
> T& operator++(int)
> {
> inc();
> return *this;
> }
>
> If you look close, you see that the former
> makes at least two copies of the object which
> the latter doesn't make. If you are not going
> to use the result of post-increment this is
> a waste that might be avoided by the optimizer
> or it might not.
>
>
>
>
>
> Schobi
>


ah right, cheers, thanks for the info!
Doug Harrison [MVP]

2005-05-27, 4:03 pm

On Thu, 26 May 2005 20:38:01 -0700, Tim Roberts wrote:

> None of the other posters have mentioned a potentially even better (and
> certainly sexier) alternative. You can replace that entire snippet with
> this:
>
> for_each( vec.begin(), vec.end(), mem_fun_ref(&MyType::DoSomething) );


Interesting choice of words... probably shouldn't go there. :) The problem
is, for_each typically requires you to tie yourself into various knots in
order to use it, and the algorithm it implements is so trivial that it
isn't even close to worth the effort. Often, you have to move your code
from the body of the loop into little helper classes, which only obfuscates
your code, and your loops are replaced with these one-line monstrosities
involving binders and whatnot that you can't understand six months later.
Libraries such as Boost can help to a degree, but the real solution is
better language support for generic programming. In the meantime, I'd
recommend forgetting for_each exists.

--
Doug Harrison
Microsoft MVP - Visual C++
andré m.a

2005-05-27, 4:03 pm


"Simon Trew" <ten.enagro@werts> wrote in message
news:ucBc9$kYFHA.580@TK2MSFTNGP15.phx.gbl...
> "andré m.a" <a.m.a@videotron.ca> wrote in message
> news:YArle.5230$8j3.220242@weber.videotron.net...
>
> since 'for' is more or less a bit of syntactic sugar for 'while', it would
> more than likely be faster, since here you are missing off the first
> element of the collection, so you're doing one iteration. of course, if
> itLoop == end() on entry to the loop, you have undefined behavior.
>
> You'd need to use a post-increment or put the pre-increment after the call
> to DoSomething.
>
> S.
>


oops sorry, you're right, (itLoop++)->DoSomething();
or

itLoop->DoSomething();
itLoops++;

bedtime indeed =/





Bo Persson

2005-05-27, 8:59 pm


"Hendrik Schober" <SpamTrap@gmx.de> skrev i meddelandet
news:4296dce0$0$79454$14726298@news.sunsite.dk...
> Bo Persson <bop@gmb.dk> wrote:
>
> I agree with this.
>
>
> And then it might not.
> When 'DoSomething()' is implemented in another
> TU, the compiler might have a hard time to
> proof that it doesn't have access to some
> reference/pointer to the vector.
> I know that VC optimizes across TUs, but I
> don't know the specifications of what it is
> able to do and what not.


It does a lot more than most people expect.

Basically, it links the modules first, and then compiles the total code
afterwards. At that time it can use global knowledge and, for example,
inline functions from one .cpp-file to another, even if they are
initially compiled in the wrong order.

At that point, it should be pretty obvious whether the element type
calls any mutating vector functions, escpecially if the vector object
isn't global or anything.


> I wouldn't want to
> depend on this. (Also, all code I write is
> ported to a few other platforms. And some
> might get compiled with VC6.)
>


Ok, so now we are takning about "optimizing for broken compilers". A
totally different business.

The orignal question was more like if some C++ constructs are inherently
faster than others. They might be for stupid compilers, but not for the
language as such.


Bo Persson


Tim Roberts

2005-05-28, 8:58 pm

Mark Ingram <mark.ingram@removeme.softease.com> wrote:
>
>Why does the pre-increment make a difference?
>
>if itLoop wasnt an iterator and was an int for example, wouldnt it mean
>the loop starts at 1 instead of 0?
>
>for (int i = 0; i < 5; ++i)
>{
> // ??
>}


No. That loop is exactly equivalent to this:

{
int i = 0;
while( i < 5 )
{
// ??
++i;
}
}

The "pre-increment" does not mean that the increment happens before the
test.

Now, if you had written it THIS way, as some people like to do:

for( int i = 0; i++ < 5; )

THEN pre-increment would change the meaning of the code.
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc
Tim Roberts

2005-05-28, 8:58 pm

"Doug Harrison [MVP]" <dsh@mvps.org> wrote:
>
>In the meantime, I'd recommend forgetting for_each exists.


Oh, but it's SO COOL.

I agree with your sentiment that you sometimes have to twist yourself in
knots. That's really too bad, because a construct like for_each can make
the code READ much more naturally.

I write a lot of Python code. I find that container and looping constructs
are expressed very naturally and readably in Python, and I have tried to
map some of those constructs back into my C++ code.

What you'd really like, I think, is to write something like this:

for_each( MyType item, vec.begin(), vec.end() )
{
// work with "item" here
}

I suppose something like that could actually be written with the help of
the pre-processor. And I'm sure someone will tell me that it's already
written and debugged in Boost.
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc
Norman Bullen

2005-05-29, 8:58 pm

Tim Roberts wrote:

> Mark Ingram <mark.ingram@removeme.softease.com> wrote:
>
>
>
> No. That loop is exactly equivalent to this:
>
> {
> int i = 0;
> while( i < 5 )
> {
> // ??
> ++i;
> }
> }
>


<pedantic mode on>
It's not quite the same...

If the contents of the loop contain a "continue" statement, the "++i"
will not be executed.
<\pedantic mode off>

Norm


--
--
To reply, change domain to an adult feline.

Doug Harrison [MVP]

2005-05-29, 8:58 pm

On Sat, 28 May 2005 14:12:22 -0700, Tim Roberts wrote:

> What you'd really like, I think, is to write something like this:
>
> for_each( MyType item, vec.begin(), vec.end() )
> {
> // work with "item" here
> }
>
> I suppose something like that could actually be written with the help of
> the pre-processor. And I'm sure someone will tell me that it's already
> written and debugged in Boost.


What I'd really like, Whidbey is already doing:

http://msdn2.microsoft.com/library/ms177202(en-us,vs.80).aspx

Even that doesn't go far enough. From the first subsection:

for each( pair<const char*, int> c in months )

There should be a way to avoid having to spell out the type of c.

--
Doug Harrison
Microsoft MVP - Visual C++
Simon Trew

2005-06-01, 4:00 am

"Bo Persson" <bop@gmb.dk> wrote in message
news:umLGs6uYFHA.1224@TK2MSFTNGP10.phx.gbl...
>
> Basically, it links the modules first, and then compiles the total code
> afterwards. At that time it can use global knowledge and, for example,
> inline functions from one .cpp-file to another, even if they are initially
> compiled in the wrong order.


.... which rather removes the proclaimed benefits of independent linking.
Intel compilers do the same, though they push the work onto the linker which
is actually, er, the compiler. So do MIPSpro compilers. I don't see anything
particularly wrong with this, save to say that presumably with C++ at least,
independent linking is a historical curiosity. It's a peculiar thing that in
our aim for global optimization we've taken techniques (name mangling,
inlining, template definitions in header files, precompiled headers &c.)
that have essentially all been there to put more onus on the 'compiler' than
the 'linker'. And we had "cat *.h *.cpp > compile_this.cpp" all the time...

S.


Gabor

2005-06-10, 4:03 pm

On ther performance front, I have written a simple program that iterates over
10000000 items using Microsoft Visual C++ .NET.

For each was done in a classic way:
for_each( v.begin(), v.end(), mem_fun_ref<blah,
someclass>(&someclass::dosomething));

The loop approach was implemented this way:
VectorTypeIterator iCurrent = v.begin(), iEnd = v.end;
for (;iCurrent != iEnd; ++iCurrent) iCurrent->dosomething();

Each approach was tested five time with the same vector. Here are my results:

for each:
0.326406(s)
0.306472(s)
0.302574(s)
0.305426(s)
0.307759(s)

Loop:
0.215136(s)
0.222612(s)
0.219414(s)
0.216155(s)
0.219778(s)

We see that the loop is a little bit faster but the difference is pretty
slim considering that we have 10000000 iterations.

I guess, it's up to personal taste than speed.


"Doug Harrison [MVP]" wrote:

> On Sat, 28 May 2005 14:12:22 -0700, Tim Roberts wrote:
>
>
> What I'd really like, Whidbey is already doing:
>
> http://msdn2.microsoft.com/library/ms177202(en-us,vs.80).aspx
>
> Even that doesn't go far enough. From the first subsection:
>
> for each( pair<const char*, int> c in months )
>
> There should be a way to avoid having to spell out the type of c.
>
> --
> Doug Harrison
> Microsoft MVP - Visual C++
>

Jeff F

2005-06-10, 8:59 pm

Tim Roberts wrote:
....

> What you'd really like, I think, is to write something like this:
>
> for_each( MyType item, vec.begin(), vec.end() )
> {
> // work with "item" here
> }
>
> I suppose something like that could actually be written with the help
> of the pre-processor. And I'm sure someone will tell me that it's
> already written and debugged in Boost.


Yep, by Eric Niebler, IIRC.

Jeff Flinn


Tim Roberts

2005-06-11, 3:59 am

"Gabor" <Gabor@discussions.microsoft.com> wrote:
>
>On ther performance front, I have written a simple program that iterates over
>10000000 items using Microsoft Visual C++ .NET.
>
>For each was done in a classic way:
>for_each( v.begin(), v.end(), mem_fun_ref<blah,
>someclass>(&someclass::dosomething));
>
>The loop approach was implemented this way:
>VectorTypeIterator iCurrent = v.begin(), iEnd = v.end;
>for (;iCurrent != iEnd; ++iCurrent) iCurrent->dosomething();
>
>Each approach was tested five time with the same vector. Here are my results:
>
>for each:
>0.326406(s)
>0.306472(s)
>0.302574(s)
>0.305426(s)
>0.307759(s)
>
>Loop:
>0.215136(s)
>0.222612(s)
>0.219414(s)
>0.216155(s)
>0.219778(s)
>
>We see that the loop is a little bit faster but the difference is pretty
>slim considering that we have 10000000 iterations.


Pretty slim? I find a 30% difference to be depressingly significant,
although that depends on how much work "dosomething()" is doing.

The for_each solution is certainly more aesthetically pleasing.
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com