'npos' in string operations
| Ian Semmel 2005-04-27, 9:00 am |
| The MS documentation defines npos (eg from find_first_of) as
static const basic_string <char>::size_type npos = -1;
When compiling under gcc, this gives a signed/unsigned mismatch warning.
What is the accepted definition for npos ?
| |
| Igor Tandetnik 2005-04-27, 4:02 pm |
| "Ian Semmel" <isemmel@removejunkmailrocketcomp.com.au> wrote in message
news:ebr%23bywSFHA.980@TK2MSFTNGP14.phx.gbl
> The MS documentation defines npos (eg from find_first_of) as
>
> static const basic_string <char>::size_type npos = -1;
>
> When compiling under gcc, this gives a signed/unsigned mismatch
> warning.
> What is the accepted definition for npos ?
It's either that, or
static const basic_string <char>::size_type npos =
static_cast<basic_string <char>::size_type>(-1);
The latter is equivalent to the former, but is accepted by most
compilers without warning.
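
As a quick check of that equivalence, here is a minimal sketch. It assumes std::string (so size_type comes from the default allocator), and the helper names npos_via_cast and cast_form_matches_npos are made up for illustration rather than taken from any library.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helper: the value produced by the explicit-cast form.
inline std::string::size_type npos_via_cast() {
    return static_cast<std::string::size_type>(-1);
}

// True when the cast form, the library's own npos, and the largest
// representable size_type value all agree.
inline bool cast_form_matches_npos() {
    const std::string::size_type n = npos_via_cast();
    return n == std::string::npos
        && n == std::numeric_limits<std::string::size_type>::max();
}
```

Calling cast_form_matches_npos() should return true on a conforming library, since the conversion of -1 to an unsigned type is defined to wrap to that type's maximum.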
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925
| |
| Pete Becker 2005-04-27, 4:02 pm |
| Igor Tandetnik wrote:
>
> The latter is equivalent to the former, but is accepted by most
> compilers without warning.
And makes code that uses it harder to read and slower to compile.
--
Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
| |
| Gene Bushuyev 2005-04-27, 9:01 pm |
| "Ian Semmel" <isemmel@removejunkmailrocketcomp.com.au> wrote in message
news:ebr%23bywSFHA.980@TK2MSFTNGP14.phx.gbl...
> The MS documentation defines npos (eg from find_first_of) as
>
> static const basic_string <char>::size_type npos = -1;
>
> When compiling under gcc, this gives a signed/unsigned mismatch warning.
>
> What is the accepted definition for npos ?
The above definition is correct, according to the standard 21.3/6.
But I wonder why a more palatable form wasn't used:
static const size_type npos = -1u;
unary "-" applied to unsigned type results in unsigned type, so no compiler
should complain about signed/unsigned mismatch. The only thing I can think
about is the fact that basic_string<>::size_type is a typedef for
Allocator::size_type, and a user-defined allocator may use any type. Which
leads me to the conclusion that npos should have been defined in the Allocator
class, and not in basic_string. Or better yet, npos shouldn't be there in
the first place, as it violates the "avoid magic numbers" design rule.
Gene
| |
| Doug Harrison [MVP] 2005-04-27, 9:01 pm |
| On Wed, 27 Apr 2005 21:44:22 GMT, Gene Bushuyev wrote:
> The above definition is correct, according to the standard 21.3/6.
> But I wonder why wasn't a more palatable form used:
>
> static const size_type npos = -1u;
>
> unary "-" applied to unsigned type results in unsigned type, so no compiler
> should complain about signed/unsigned mismatch.
Consider what happens if sizeof(unsigned) < sizeof(size_type), and contrast
with sizeof(int) < sizeof(size_type).
> The only thing I can think
> about is the fact that basic_string<>::size_type is a typedef for
> Allocator::size_type, and user defined allocator may have any type. Which
> leads me to the conclusion that npos should have been defined Allocator
> class, and not in basic_string.
But npos is a string concept completely unrelated to allocator.
> Or better yet, npos shouldn't be there in
> the first place, as it violates "avoid magic numbers" design rule.
But npos is not a magic number. It's a manifest constant whose value has
some useful properties when dealing with string function return values. The
term "magic number" refers to a literal constant, not a named one. If some
code used -1 instead of npos, then you could rightfully accuse it of using
a magic number.
--
Doug Harrison
Microsoft MVP - Visual C++
| |
| Bo Persson 2005-04-28, 9:00 am |
|
"Gene Bushuyev" <gb@127.0.0.1> wrote in message
news:WITbe.8685$J12.1893@newssvr14.news.prodigy.com...
> "Ian Semmel" <isemmel@removejunkmailrocketcomp.com.au> wrote in
> message
> news:ebr%23bywSFHA.980@TK2MSFTNGP14.phx.gbl...
>
> The above definition is correct, according to the standard 21.3/6.
> But I wonder why wasn't a more palatable form used:
>
> static const size_type npos = -1u;
>
> unary "-" applied to unsigned type results in unsigned type, so no
> compiler
> should complain about signed/unsigned mismatch.
No, but the other compiler will now complain that "unary minus applied
to an unsigned type is still unsigned".
Initializing an unsigned type with -1 is well defined by the standard.
It results in the largest possible value of the unsigned type.
Bo Persson
| |
| Gene Bushuyev 2005-04-28, 4:06 pm |
| "Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
news:1i2co1s64urs6$.1oyztc8z2lpwn.dlg@40tude.net...
> On Wed, 27 Apr 2005 21:44:22 GMT, Gene Bushuyev wrote:
>
> Consider what happens if sizeof(unsigned) < sizeof(size_type), and contrast
> with sizeof(int) < sizeof(size_type).
Why? I already mentioned that user defined allocator types can still pose
the conversion warning problem. But "-1" is no better in this respect than
"-1u"
>
> But npos is a string concept completely unrelated to allocator.
It's related by its type to the allocator, not to the string class.
>
> But npos is not a magic number. It's a manifest constant whose value has
> some useful properties when dealing with string function return values. The
> term "magic number" refers to a literal constant, not a named one. If some
> code used -1 instead of npos, then you could rightfully accuse it of using
> a magic number.
All "magic numbers" are (generally numeric) constants with special
properties. What's your point? Is choosing the word (special) "constant"
instead of "magic number" somehow supposed to change the meaning?
npos is a magic number, because a single number from the size_type set is chosen
to convey a completely different meaning than all the other numbers from the
same set. Moreover, it's completely unnecessary for a string class to have
such a number; no string functionality would become impossible or even
inconvenient if npos disappeared altogether. If you think otherwise, then
provide an example where it's necessary.
Let's face it, std::basic_string is screwed up big time.
Gene
| |
| Igor Tandetnik 2005-04-28, 9:03 pm |
| "Gene Bushuyev" <gb@127.0.0.1> wrote in message
news:BW9ce.2566$Gd7.2450@newssvr21.news.prodigy.com
> "Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
> news:1i2co1s64urs6$.1oyztc8z2lpwn.dlg@40tude.net...
>
> Why? I already mentioned that user defined allocator types can still
> pose the conversion warning problem. But "-1" is no better in this
> respect than "-1u"
Is too. size_type does not have to be unsigned int. Typically, size_type
is the same as size_t. E.g. under Win64, size_t is 64 bit while unsigned
int is 32 bit.
Let's assume that size_t is 64 bit and unsigned is 32 bit. Consider:
size_t n = -1;
Here, 1 is a literal of type int. A unary minus operator is applied to
it, resulting in an expression of type int with the value of -1. An
assignment operator then converts this expression to size_t following
(modulo 2^64) arithmetic as defined by the standard. The result is the
largest value of type size_t, that is, the largest 64-bit number.
Now consider
size_t n = -1u;
You start with a literal of type unsigned. You apply a unary minus to
it, which produces a value of type unsigned following (modulo 2^32)
arithmetic. The result is the largest 32-bit value. This is then
converted to type size_t, but since you are converting an unsigned value
to a larger unsigned type, it simply gets padded with zeros on the left.
So you end up with a value of type size_t which still is a largest
32-bit integer, not the largest 64-bit integer.
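
The two cases can be reproduced on any platform with fixed-width types. This is only a sketch of the scenario described above: std::uint64_t stands in for a 64-bit size_t and std::uint32_t for a 32-bit unsigned, and both function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Signed -1 converted straight to the 64-bit type: the value is reduced
// modulo 2^64, giving the largest 64-bit number.
inline std::uint64_t from_signed_minus_one() {
    return static_cast<std::uint64_t>(-1);
}

// The negation happens in 32 bits first (result reduced modulo 2^32),
// and only then is the value widened, which pads it with zeros.
inline std::uint64_t from_unsigned32_minus_one() {
    const std::uint32_t one = 1u;
    const std::uint32_t negated = static_cast<std::uint32_t>(0u - one);
    return negated;  // zero-extended to 64 bits
}
```

Under these assumptions the first function yields 0xFFFFFFFFFFFFFFFF and the second 0x00000000FFFFFFFF, which is the difference between "= -1" and "= -1u" when size_type is wider than unsigned.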
--
With best wishes,
Igor Tandetnik
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925
| |
| Doug Harrison [MVP] 2005-04-28, 9:03 pm |
| On Thu, 28 Apr 2005 18:27:45 GMT, Gene Bushuyev wrote:
>
> Why? I already mentioned that user defined allocator types can still pose
> the conversion warning problem. But "-1" is no better in this respect than
> "-1u"
Igor explained the difference very nicely. To summarize it, -1 of any
signed integer type will always convert to the largest value of any
unsigned integer type, while -1u will not.
> It's related by its type to the allocator, not to the string class.
Suppose there are classes X1, X2, X3, ... XN that define their own npos
analogues. Need I say more?
> All "magic numbers" are (generally numeric) constants with special
> properties. What's your point?
My point is you're using the term in such a way that it loses its commonly
accepted meaning:
http://www.clueless.com/jargon3.0.0/magic_number.html
> Choosing the word (special) "constant"
> instead of "magic number" somehow supposed to change the meaning?
Someone who sees "npos" in source code can easily look it up if he doesn't
know what it means. Someone who finds 284924073 has to pray there's a
comment nearby to explain it. There's a difference.
> npos is a magic number, because a single number from size_type set is chosen
> to convey a completely different meaning than all the other numbers from the
> same set. Moreover, it's completely unnecessary for a string class to have
> such a number, no string functionality becomes impossible or even
> inconvenient if npos disappeared altogether. If you think otherwise, then
> provide an example where it's necessary.
Various string member functions accept and/or return positions. Designating
a singular value for position enables:
1. Ctors and other functions to accept npos as the value for a length
parameter, meaning "the rest of the string" starting from some position pos.
2. Functions such as find, which return size_type, to encode "not
found".
Choosing the largest value of size_type for npos makes sense because
size_type is typically defined such that it's impossible for a string to
ever contain npos characters, because there isn't enough address space to
hold it. Therefore, unlike some poorly chosen sentinel values, npos doesn't
intrude into the range of valid values. Since size_type is unsigned, and
npos is defined as it is, we know that npos+1 is zero, and that's useful in
cases such as:
s.resize(s.find_last_not_of(_T("\r\n. "))+1);
This can be used to truncate (say) an Indexing Service COM error string
prior to formatting it. If the string contains a character not in the set,
adding one to its position performs the desired truncation. If the string
consists entirely of characters in the supplied set, find_last_not_of
returns npos, adding one sums to zero, and you end up with resize(0), which
truncates the string to nothing, which is the desired effect.
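
A self-contained sketch of that idiom follows. It uses a plain std::string and a narrow literal instead of _T(...), and the function name trim_trailing is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: removes trailing characters that appear in `junk`.
inline std::string trim_trailing(std::string s,
                                 const std::string& junk = "\r\n. ") {
    // find_last_not_of returns npos when every character is in `junk`
    // (or when s is empty). For std::string, size_type is at least as wide
    // as unsigned int, so npos + 1 wraps to 0 and the string is emptied.
    s.resize(s.find_last_not_of(junk) + 1);
    return s;
}
```

For example, trim_trailing("Access denied. \r\n") gives "Access denied", and a string made up only of characters from the set comes back empty, with no separate test for the not-found case.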
If you want an example of a sentinel value that can actually be a problem,
look at EOF, and try to implement the standard library when sizeof(char) ==
sizeof(int). (Actually, it's unsigned char that's relevant, but
sizeof(unsigned char) == sizeof(char) by definition.)
> Let's face it, std::basic_string is screwed up big time.
Agreed, but I have my own reasons. The real problems with std::string are
due to its weak specification. Off the top of my head:
1. It can be reference-counted or not, which has implications for the
exception safety and speed of copying.
2. If it's reference-counted, it cannot be truly copy-on-write, but instead
can only be what I call "copy-just-in-case", which leads to various bizarre
invalidation rules.
3. It can be contiguous or not.
4. The data() and c_str() members may or may not throw.
5. You can say s[s.size()] on const s, but it's undefined for non-const s.
--
Doug Harrison
Microsoft MVP - Visual C++
| |
| Jason Winnebeck 2005-04-28, 9:03 pm |
| Doug Harrison [MVP] wrote:
>
> intrude into the range of valid values. Since size_type is unsigned, and
> npos is defined as it is, we know that npos+1 is zero, and that's useful in
You said that npos is not a "magical value" because of its clear definition
and good choice as a sentinel value (I agree with this, that npos is good).
However, is it safe (or "smart") to assume that npos+1 is zero? Wouldn't
that basically turn it into a magical value with an intelligent name, since
for npos+1 to work you have to know that the magical number is the largest
unsigned number?
Jason
| |
| Doug Harrison [MVP] 2005-04-29, 4:04 am |
| On Thu, 28 Apr 2005 17:46:13 -0400, Jason Winnebeck wrote:
> You said that npos is not a "magical value" because of its clear definition
> and good choice as a sentinel value (I agree with this, that npos is good).
> However, is it safe (or "smart") to assume that npos+1 is zero?
True, if size_type were smaller than int, you'd have a problem in some
circumstances, though not in the example I presented. For example:
s.resize(npos+1);
Suppose size_type were unsigned short, and unsigned short were smaller than
int. Then npos would get promoted to int. Thus, a 16-bit unsigned short
npos (equal to 65535 by definition) would become int(65535), and npos+1
would equal int(65536). However, this result gets converted back to
size_type when passed to resize, so resize observes the result to be zero.
So that's OK, but this would be a problem under this hypothetical
implementation:
assert(npos+1 == 0);
In five minutes, I couldn't find any specific requirement for size_type or
size_t for that matter, but there may be an indirect chain of reasoning
that implies one. (Note that containers can assume allocator::size_type is
the same as size_t, which is a pretty strong hint as to what it should be
or at a minimum be like.) FWIW, I've never known a size_t or size_type
smaller than unsigned int (I'd be interested to hear of one), and if
size_type is unsigned int or larger, it's guaranteed that npos+1 is zero.
So in every implementation I know about, the assert mentioned above never
fires, where npos is size_t or size_type.
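
The hypothetical narrow size_type can be simulated to see both outcomes. In this sketch small_size_type and small_npos are invented stand-ins, and it assumes int is wider than 16 bits, as in the scenario above.

```cpp
#include <cassert>
#include <cstdint>

// Invented 16-bit stand-in for a size_type smaller than int.
typedef std::uint16_t small_size_type;
const small_size_type small_npos = static_cast<small_size_type>(-1);  // 65535

// The sum is computed in int after integral promotion, so it is 65536
// and the comparison with zero fails.
inline bool sum_compares_equal_to_zero() {
    return small_npos + 1 == 0;
}

// Converting the sum back to the narrow type (as happens when it is passed
// to a function taking that type) wraps it to zero.
inline small_size_type sum_converted_back() {
    return static_cast<small_size_type>(small_npos + 1);
}
```

With these assumptions sum_compares_equal_to_zero() is false while sum_converted_back() is 0, matching the distinction drawn between the resize call and the assert.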
> Wouldn't
> that basically turn it into a magical value with an intelligent name, since
> for npos+1 to work you have to know that the magical number is the largest
> unsigned number?
It is what it is, and it can't be anything else. If that were not the case
(modulo the size issue, if it's real), I'd agree that taking advantage of
npos+1 is an abuse. Whether or not to take advantage of this boils down to
a matter of taste, which is the case for all idioms.
--
Doug Harrison
Microsoft MVP - Visual C++
| |
| Gene Bushuyev 2005-04-29, 4:04 am |
| "Igor Tandetnik" <itandetnik@mvps.org> wrote in message
news:udidu3CTFHA.3620@TK2MSFTNGP09.phx.gbl...
....
> You start with a literal of type unsigned. You apply a unary minus to
> it, which produces a value of type unsigned following (modulo 2^32)
> arithmetic. The result is the largest 32-bit value. This is then
> converted to type size_t, but since you are converting an unsigned value
> to a larger unsigned type, it simply gets padded with zeros on the left.
> So you end up with a value of type size_t which still is a largest
> 32-bit integer, not the largest 64-bit integer.
Criticism accepted. The general solution may be this:
static const size_type npos = -(size_type)1;
and if some compilers complain about unary "-" applied to an unsigned type, then
this solution should be free from any warnings:
static const size_type npos = ~(size_type)0;
But this is a minor problem. The bigger problem is that npos shouldn't be there
in the first place.
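
Both cast-based forms in this post can be checked for several unsigned types at once. The following is a sketch only: the template names are invented, and the outer static_cast is an addition to cover types narrower than int, where the operand is promoted before "-" or "~" is applied.

```cpp
#include <cassert>
#include <limits>

// Hypothetical templates for the two forms, for any unsigned integer type T.
template <typename T>
T all_ones_by_negation() {
    return static_cast<T>(-static_cast<T>(1));
}

template <typename T>
T all_ones_by_complement() {
    return static_cast<T>(~static_cast<T>(0));
}

// True when both forms produce the largest value of T.
template <typename T>
bool both_forms_give_max() {
    return all_ones_by_negation<T>() == std::numeric_limits<T>::max()
        && all_ones_by_complement<T>() == std::numeric_limits<T>::max();
}
```

Because the cast to size_type happens before the operator is applied, neither form depends on how wide int or unsigned happens to be.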
| |
| Gene Bushuyev 2005-04-29, 4:04 am |
| "Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
news:122d7mceraa0i$.dxac13mngnv8$.dlg@40tude.net...
....
>
> Suppose there are classes X1, X2, X3, ... XN that define their own npos
> analogues. Need I say more?
Huh? I guess you need.
....
>
> My point is you're using the term in such a way that it loses its commonly
> accepted meaning:
>
> http://www.clueless.com/jargon3.0.0/magic_number.html
>
Quote:
"A number that encodes critical information used in an algorithm in some
opaque way"
- so what's wrong with that definition? But I'm not going to argue about
words. If you like calling it a special case constant instead of magic
number, - it's fine with me.
>
> Someone who sees "npos" in source code can easily look it up if he doesn't
> know what it means. Someone who finds 284924073 has to pray there's a
> comment nearby to explain it. There's a difference.
That's not the point. The evil of npos is that it's just a single value from
size_type set that has a *special meaning* Good design shouldn't contain any
special constants that are treated differently than the other by the same
functions.
....
>
> Various string member functions accept and/or return positions. Designating
> a singular value for position enables:
>
> 1. ctors and other functions specify npos as the value for a length
> parameter to mean "the rest of the string" starting from some position pos.
which is 1) unnecessary, and 2) better done through overloading rather than by
defining default parameters
>
> 2. Functions such as find return size_type and need a way to encode "not
> found".
That's because of basic_string's terrible design. Have you seen any of the
standard algorithms or other containers taking this approach? I can't find
any, because it's 1) unnecessary and 2) bad design. All of string's *find*()
functions should have returned iterators, which would be consistent and work
well with the rest of the standard library containers and algorithms. In
fact, those *find*() functions shouldn't have been members in the first
place, but rather more general (and more powerful) non-member algorithms;
most of the std::*find* algorithms doing that work already exist. Now, if you get rid
of the *find* duplicates and make the rest of the string::*find* functions
non-member functions returning iterators, nobody would even think about
needing something as odd as the basic_string::npos constant.
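
For comparison, here is a small sketch of the two styles being argued over: the position-based member find, and the existing non-member std::search working on iterators. The two wrapper function names are invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Position-based: the member function reports failure as std::string::npos.
inline bool contains_by_position(const std::string& s,
                                 const std::string& needle) {
    return s.find(needle) != std::string::npos;
}

// Iterator-based: std::search reports failure as the end of the range
// that was searched.
inline bool contains_by_iterator(const std::string& s,
                                 const std::string& needle) {
    return std::search(s.begin(), s.end(),
                       needle.begin(), needle.end()) != s.end();
}
```

Both report the same answer for a non-empty needle; what differs is the "not found" value, npos in one case and the range's end iterator in the other.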
| |
| Doug Harrison [MVP] 2005-04-29, 4:04 am |
| On Fri, 29 Apr 2005 01:46:03 GMT, Gene Bushuyev wrote:
> "Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
> news:122d7mceraa0i$.dxac13mngnv8$.dlg@40tude.net...
> ...
>
> Huh? I guess you need.
> ...
You stated that npos should be defined by the allocator class, despite the
fact it's a string concept that has nothing to do with allocators. There
are an infinite number of possible classes that might want to define
something similar to npos, which I captured with my X1, X2, X3, ... XN
example. By your reasoning, they should dump their npos analogues into
allocator. That does not compute.
>
> Quote:
> "A number that encodes critical information used in an algorithm in some
> opaque way"
> - so what's wrong with that definition?
I guess something went wrong with your copy and paste job, so allow me to
restore the relevant parts that you omitted:
http://www.clueless.com/jargon3.0.0/magic_number.html
<q>
1. In source code, some non-obvious constant whose value is significant to
the operation of a program and that is inserted inconspicuously in-line
(hardcoded), rather than expanded in by a symbol set by a commented
`#define'. Magic numbers in this sense are bad style.
2. A number that encodes critical information used in an algorithm in some
opaque way. The classic examples of these are the numbers used in hash or
CRC functions, or the coefficients in a linear congruential generator for
pseudo-random numbers. This sense actually predates and was ancestral to
the more common sense 1.
</q>
You earlier said npos violates a "design rule":
<q>
Or better yet, npos shouldn't be there in
the first place, as it violates "avoid magic numbers" design rule.
</q>
The "design rule" you invoked refers to (1), and I explained why it doesn't
apply to npos in my first message. The "magic numbers" in (2) are essential
to the operation of the algorithms mentioned there, and for someone who
doesn't understand the algorithm that uses them, they certainly can seem
like magic. It doesn't make them any less necessary to the algorithm,
though, and they can't very well be "avoided". So again, there's a
difference.
> But I'm not going to argue about
> words. If you like calling it a special case constant instead of magic
> number, - it's fine with me.
It's just a plain ol' constant to me.
>
> That's not the point. The evil of npos is that it's just a single value from
> size_type set that has a *special meaning* Good design shouldn't contain any
> special constants that are treated differently than the other by the same
> functions.
> ...
As I also said, npos does not correspond to a valid position. In that
respect, it is like a past-the-end iterator or a NULL pointer. How is npos
"evil" while the latter two are (I guess) "good"?
> which is 1) unnecessary, and 2) better done through overloading rather than by
> defining default parameters
1. It can save you from determining string length yourself.
2. Where did I say anything about default parameters? Even if you overload
(pos) and (pos, length), the latter should still accept npos for length
because of (1). This can come up when you want just a fragment of the
string or the whole thing, so you use the (pos, length) overload, where
length is specified by (say) using the conditional operator.
>
> That's because of basic_string's terrible design. Have you seen any of the
> standard algorithms or other containers taking this approach. I can't find
> any. Because it's 1) unnecessary and 2) bad design. All string's *find*()
> functions should have returned iterators, which would be consistent and work
> well with the rest of the standard library containers and algorithms. In
> fact those *find*() functions shouldn't have been members in the first
> place, but rather more general (and more powerful) non-member algorithms, in
> fact most of std::*find* doing that work already exist. Now, if you get rid
> of the *find* duplicates and make the rest of string::*find* functions
> non-member functions returning iterators, nobody would even think about
> needing something as odd as basic_string::npos constant.
I'd agree the string interface is fat. I disagree that iterators are the
answer to everything. Often integer positions are more convenient to deal
with. Except for the find/compare member functions, which are defined only
on positions, the std::string class pretty much lets you choose the model
that suits the task at hand. And if you want to use string with generic
algorithms based on iterators, there's nothing stopping you, though the
cumbersome iterator-based find/compare functions in <algorithm> likely
aren't optimized for strings in the way the position-based find/compare
members likely are.
I have to wonder, given your disdain for the "magic" fixed value npos,
which has the type of a position but does not denote a legal position, how
do you feel about the "magic" non-fixed iterator returned by end(), which
has the type of an iterator but does not denote a legal position?
(That's a fair question. While end() has something to do with length, and
npos does not, iterator versions of the functions we're discussing would
return end(), or more generally y for the iterator range [x,y), to indicate
not-found in the same way the existing ones return npos. So in that
respect, there's no difference.)
--
Doug Harrison
Microsoft MVP - Visual C++
| |
| Gene Bushuyev 2005-04-29, 9:02 pm |
| "Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
news:16sn9upwirv3q$.1hu2emhl9vpdu$.dlg@40tude.net...
> On Fri, 29 Apr 2005 01:46:03 GMT, Gene Bushuyev wrote:
....
> I have to wonder, given your disdain for the "magic" fixed value npos,
> which has the type of a position but does not denote a legal position, how
> do you feel about the "magic" non-fixed iterator returned by end(), which
> has the type of an iterator but does not denote a legal position?
>
> (That's a fair question. While end() has something to do with length, and
> npos does not, iterator versions of the functions we're discussing would
> return end(), or more generally y for the iterator range [x,y), to indicate
> not-found in the same way the existing ones return npos. So in that
> respect, there's no difference.)
Ok, it's a fair question. And I grant you that there are similarities, such
as you can't dereference end() iterator and you can't access string at npos
value. But that's where the similarities end. Not much if you ask me. The
first corresponds to a well defined range concept, used in mathematics for
ages, the second is an artificial size constant used to indicate unrelated
situations. As a result every algorithm and member function works seamlessly
with end() iterator with no surprises, while npos has inconsistencies all
over the place.
Every iterator in the range [begin(), end()) can be dereferenced. Can you
access any value in the [0, npos) range? No, because npos is just a kludge
to indicate the special cases. Can you traverse a reverse range like
[rbegin(), rend()) using npos? No, because npos is just a kludge to indicate
the special cases. Can you insert to a string before npos, like you do it
before end()? No, because npos is just a kludge to indicate the special
cases.
npos is used for two different purposes, both of which have nothing to do
with its numeric value. First, it's used to indicate "take the whole
string" in a number of superfluous functions (violating the minimal interface
concept); second, to indicate the "not found" situation in the *find*() member
functions, which of course should have returned iterators.
Almost every function that accepts npos as "take the whole string" has it as
a default parameter; npos cannot be used in any of the other parameters
without throwing an exception. (Some functions don't take npos as a default
parameter, because they already have too many unnecessary overloads.) If
those functions are to exist at all, they should either have been overloaded to
"take the whole string" with no default parameters, or simply any value
bigger than size() should serve that purpose. There is no need for npos
here.
Likewise, if find() cannot find anything, it would have been better for it to return
string::size(), which is more logical and better suited for other algorithms
and functions. So there is no need for npos here either.
Case closed.
| |
| Stephen Howe 2005-04-29, 9:02 pm |
| > The first corresponds to a well defined range concept, used in mathematics
> for ages, the second is an artificial size constant used to indicate unrelated
> situations.
Maybe. If you check Knuth, you will find that "sentinel values" have an honored
history in programming, used to denote some special condition. I see npos as very
similar to a sentinel value.
Stephen Howe
| |
| Doug Harrison [MVP] 2005-04-30, 9:01 pm |
| On Fri, 29 Apr 2005 18:52:15 GMT, Gene Bushuyev wrote:
> Ok, it's a fair question. And I grant you that there are similarities, such
> as you can't dereference end() iterator and you can't access string at npos
> value. But that's where the similarities end. Not much if you ask me. The
> first corresponds to a well defined range concept, used in mathematics for
> ages; the second is an artificial size constant used to indicate unrelated
> situations. As a result every algorithm and member function works seamlessly
> with end() iterator with no surprises, while npos has inconsistencies all
> over the place.
>
> Every iterator in the range [begin(), end()) can be dereferenced. Can you
> access any value in the [0, npos) range? No, because npos is just a kludge
> to indicate the special cases. Can you traverse a reverse range like
> [rbegin(), rend()) using npos? No, because npos is just a kludge to indicate
> the special cases. Can you insert to a string before npos, like you do it
> before end()? No, because npos is just a kludge to indicate the special
> cases.
So, npos is not equivalent to end(). I don't have a problem with that.
> npos is used for two different purposes, both of which have nothing to do
> with it's numeric value. First, it's used to indicate "take the whole
> string" in a number of superfluous functions (violating minimal interface
> concept)
I don't view string functions that take position and length as superfluous.
I view them as frequently convenient and preferable to a pure iterator
interface. And the numeric value of npos is in fact relevant to indicating
"take the whole string", or more generally, the remainder following some
position, because npos is larger than any length.
> second to indicate "not found" situation in *find*() member
> functions, which of course should have returned iterators.
Which of course could complicate the code you're not considering at the
moment, which would be clumsier to implement in terms of iterators.
> Almost every function that accepts npos as "take the whole string" has it as
> a default parameter, npos cannot be used in any of the other parameters
> without throwing an exception. (Some functions don't take npos as a default
> parameter, because they already have too many unnecessary overloads.) If
> those functions to exist at all they should have been either overloaded to
> "take the whole string" with no default parameters or simply any value
> bigger than size() should serve that function. There is no need for npos
> here.
Any value bigger than size() (or more generally size()-pos) does serve that
purpose. Like I said earlier, we know npos is larger than any string
length, so it's possible to say things like:
// Please excuse the magic numbers 2 and 4.
string s(t, 2, (cond) ? 4 : string::npos);
I prefer that to:
string s(t, 2, (cond) ? 4 : t.size());
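
A compilable version of the first form might look like the sketch below. The function name fragment_or_rest and its bool parameter are invented for the example, and it assumes t has at least two characters, since the (str, pos, len) constructor throws std::out_of_range when pos is greater than t.size().

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies either a 4-character fragment starting at
// position 2, or everything from position 2 to the end of `t`.
inline std::string fragment_or_rest(const std::string& t, bool short_fragment) {
    // npos is larger than any possible length, so it means "the rest".
    const std::string::size_type len = short_fragment ? 4 : std::string::npos;
    return std::string(t, 2, len);
}
```

A length that runs past the end is clamped to what is available, which is why npos (or any sufficiently large value) works as "the remainder of the string".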
> Likewise, if find() cannot find, it should have better returned
> string::size() - it's more logical and better suited for other algorithms
> and functions. So there is no need in npos here as well.
One thing that would suck about using size() as an ersatz end() to indicate
"not found" or "the remainder of the string" is that like end() and
iterators in general, it has meaning only for one container and then only
at a given point in time.
Also, string iterators are invalidated at the drop of a pin, and using them
correctly in one's own mutating algorithms can require more care than using
integer positions. For example, I can insert or delete from a string or
vector and adjust a position simply by adding to it, whereas with
iterators, I'd have to do something like this in some cases:
// These are random access iterators; the generic version using distance
// and advance is even more verbose.
std::string::difference_type x = i - s.begin() + offset;
// ... modify s, invalidating the iterator i ...
i = s.begin() + x;
Sometimes it's more convenient to work with position and length than
iterators.
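
To make the contrast concrete, here is a minimal sketch with one mutating operation, an insertion at the front of the string. Both function names are hypothetical, and the choice of insert(0, prefix) as the modification is just an example.

```cpp
#include <cassert>
#include <string>

// Iterator version: save the iterator as an offset, modify the string,
// then rebuild the iterator from the adjusted offset.
inline std::string::iterator insert_front_keep_iterator(
        std::string& s, std::string::iterator it, const std::string& prefix) {
    const std::string::difference_type x =
        (it - s.begin())
        + static_cast<std::string::difference_type>(prefix.size());
    s.insert(0, prefix);   // may reallocate, invalidating `it`
    return s.begin() + x;  // fresh iterator to the same character
}

// Position version: the adjustment is a single addition.
inline std::string::size_type insert_front_keep_position(
        std::string& s, std::string::size_type pos, const std::string& prefix) {
    s.insert(0, prefix);
    return pos + prefix.size();
}
```

In both cases the result still refers to the character it referred to before the insertion; the position version simply never held anything that could be invalidated.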
> Case closed.
Famous last words, more like it. ;)
--
Doug Harrison
Microsoft MVP - Visual C++