[string totitle "ab cd ef"] does not output "Ab Cd Ef" but instea
| Gerry Snyder 2004-08-27, 3:57 am |
| This is not a major issue, but I am curious. I may have missed some
simple way of doing what I want. The command [string totitle
"whatever"] does not work as I would expect. What it outputs is more
appropriate for a sentence than for a title.
It makes the first character upper case and the rest lower case,
something that could be done with two [string to...] statements, which
hardly seems worth implementing.
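
For reference, a minimal sketch of the behavior described above, assuming any Tcl 8.x interpreter. The helper name sentenceCase is made up for illustration and is not part of Tcl; it shows one way to get the same result from toupper and tolower.

```tcl
# [string totitle] upper-cases the first character and lower-cases the rest.
puts [string totitle "ab cd ef"]      ;# prints: Ab cd ef

# Hypothetical helper (not a Tcl built-in) built from toupper and tolower.
proc sentenceCase {s} {
    return [string toupper [string index $s 0]][string tolower [string range $s 1 end]]
}
puts [sentenceCase "ab CD ef"]        ;# prints: Ab cd ef
```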
Doing a real title requires looping through the string and for each
character make it upper case if the preceding character was a
space/tab/newline, and make it lower case otherwise. This is complicated
enough to warrant having in Tcl (or would be if enough people wanted it).
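
The loop described above might look roughly like this. It is only a sketch: the proc name titleByLoop is hypothetical, and it assumes that "space/tab/newline" can be approximated by whatever [string is space] accepts, and that the start of the string counts as the start of a word.

```tcl
# Hypothetical helper: upper-case a character when it follows whitespace
# (or starts the string), lower-case it otherwise.
proc titleByLoop {str} {
    set out ""
    set atWordStart 1
    foreach ch [split $str ""] {
        if {[string is space -strict $ch]} {
            # Whitespace is copied unchanged and marks the next char as a word start.
            append out $ch
            set atWordStart 1
        } elseif {$atWordStart} {
            append out [string toupper $ch]
            set atWordStart 0
        } else {
            append out [string tolower $ch]
        }
    }
    return $out
}

puts [titleByLoop "ab cd ef"]         ;# prints: Ab Cd Ef
```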
If I were implementing from scratch, I would have [string tosentence
....] have the current function and [string totitle ...] do what I want.
Since this would break existing scripts, I guess adding [string totitle
-all $string] as a new capability would be much better.
So, am I way off base? Is there some history to the current behavior? I
suspect that if there were much demand something like this would be
there already, so I certainly don't expect anyone to create it just for
me, but I am curious.
TIA,
Gerry
| |
| Bruce Hartweg 2004-08-27, 3:57 am |
| Gerry Snyder wrote:
> This is not a major issue, but I am curious. I may have missed some
> simple way of doing what I want. The command [string totitle "whatever"]
> does not work as I would expect. What it outputs is more appropriate for
> a sentence than for a title.
>
> It makes the first character upper case and the rest lower case,
> something that could be done with two [string to...] statements, which
> hardly seems worth implementing.
>
> Doing a real title requires looping through the string and for each
> character make it upper case if the preceding character was a
> space/tab/newline, and make it lower case otherwise. This is complicated
> enough to warrant having in Tcl (or would be if enough people wanted it).
>
> If I were implementing from scratch, I would have [string tosentence
> ...] have the current function and [string totitle ...] do what I want.
> Since this would break existing scripts, I guess adding [string totitle
> -all $string] as a new capability would be much better.
>
> So, am I way off base? Is there some history to the current behavior? I
> suspect that if there were much demand something like this would be
> there already, so I certainly don't expect anyone to create it just for
> me, but I am curious.
>
You are treating the string as a series of words, but the string command
treats its argument as a single string, a series of characters.
tolower sets all chars to lower case
toupper sets all chars to upper case
totitle sets the first char to title/upper case and the remaining chars to lower case
it is very clearly documented at
http://www.tcl.tk/man/tcl8.5/TclCmd/string.htm#M44
you need/want something that is parsing the string into words and operating
on them individually. You could do this with a regsub/subst pair ...
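proc multiWordTitleCase {str} {
    # Wrap every \w+ match in a [string totitle "..."] call, then let subst run the calls.
    # Note that subst also evaluates any [ ], $ or backslash already present in $str.
    subst [regsub -all {\w+} $str {[string totitle "&"]}]
}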
Bruce
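
One caveat with the regsub/subst pair: subst evaluates everything in the string, so input that already contains brackets, dollar signs or backslashes can trigger unintended substitutions. A possible workaround, sketched here under the assumption that the input may come from an untrusted source, is to backslash-escape those characters first. The proc name titleWordsEscaped is hypothetical.

```tcl
# Hypothetical variant of the regsub/subst approach for arbitrary input.
proc titleWordsEscaped {str} {
    # Step 1: protect characters that subst would otherwise act on.
    set escaped [regsub -all {[][$\\]} $str {\\&}]
    # Step 2: wrap each word in a totitle call, as in the original proc.
    set script [regsub -all {\w+} $escaped {[string totitle "&"]}]
    # Step 3: subst runs the totitle calls and removes the protective backslashes.
    return [subst $script]
}

puts [titleWordsEscaped {price is $x for [exit] today}]
```

With the escaping in place, the text inside the brackets is title-cased like any other word instead of being run as a command.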
| |
| Michael Schlenker 2004-08-27, 8:58 am |
| Jules wrote:
> Bruce Hartweg <bruce-news@hartweg.us> wrote in message news:<%gyXc.322037$%_6.194545@attbi_s01>...
>
> Isn't this code faster and more accurate:
>
> proc wordsToTitle {str} {
> foreach word $str {lappend output [string totitle $word]}
> return $output
> }
>
It's not. Do not use foreach on strings, only on lists, otherwise you may
see unwanted effects. As a second note, this deletes multiple spaces
between words. So it may be faster but is in no way more accurate...
Michael
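
A short illustration of both points, as a sketch with made-up sample strings: a string is not necessarily a well-formed list, and list parsing collapses runs of whitespace, whereas [split] keeps an empty element for each extra space.

```tcl
# A lone "{" makes this string an invalid list, so foreach raises an error.
set s "two  spaces and {an open brace"
if {[catch {foreach word $s {lappend words $word}} msg]} {
    puts "foreach failed: $msg"
}

# Even a well-formed string loses its repeated spaces when parsed as a list.
set t "ab  cd"                     ;# two spaces between the words
puts [llength $t]                  ;# prints: 2
puts [llength [split $t]]          ;# prints: 3 (the middle element is empty)
puts [join [split $t]]             ;# both spaces survive the split/join round trip
```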
| |
| Bruce Hartweg 2004-08-27, 4:00 pm |
|
Jules wrote:
> Ok, no more confusing strings with lists...
> Would there be any disadvantage if you turn the string into a list
> first and join the result into a string?
>
> proc wordsToTitle {string} {
> foreach item [split $string] {lappend output [string totitle $item]}
> return [join $output]
> }
>
> It doesn't remove spaces and it's still faster than subst/regsub.
>
that is good as well, but note that you get different answers.
proc t1 {str} {
    subst [regsub -all {\w+} $str {[string totitle "&"]}]
}
proc t2 {string} {
    foreach item [split $string] {lappend output [string totitle $item]}
    return [join $output]
}
set s1 "a bunch of words, some.are.weird, some are not"
set s2 "another one, isn't this fun?"
(bin) 53 % t1 $s1
A Bunch Of Words, Some.Are.Weird, Some Are Not
(bin) 54 % t2 $s1
A Bunch Of Words, Some.are.weird, Some Are Not
(bin) 55 % t1 $s2
Another One, Isn'T This Fun?
(bin) 56 % t2 $s2
Another One, Isn't This Fun?
So it depends on how you want to interpret your string as
words. The RE currently uses alphanumerics or _, and the
split one uses non-whitespace. (If you change the RE to use
\S+ you get the same answers as the split for these strings.)
So the "right" answer really depends on the requirements of
what you do/don't want to consider a word. In the case
of the split/join you would have to specify the split string
as all the characters you consider NOT part of a word; in the case
of the RE solution you adjust the RE to be the expression that
defines everything you consider part of a word. Depending on
what your definition is, the RE could be easier to specify.
And unless you are doing this many, many times, the performance
is less of a concern.
Bruce
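
To make the "what counts as a word" choice explicit, one option is to pass the word pattern in as a parameter. The following is only a sketch: the proc name titleMatching is hypothetical, and it relies on the optional first/last index arguments of [string totitle] instead of subst. It assumes the pattern contains no capturing parentheses, since those would add extra index pairs to the match list.

```tcl
# Hypothetical helper: title-case every substring of str that matches pattern.
proc titleMatching {pattern str} {
    # Each element of the result is a {first last} index pair for one match.
    foreach range [regexp -all -inline -indices -- $pattern $str] {
        foreach {first last} $range break   ;# unpack the pair (works before Tcl 8.5 too)
        # totitle with indices changes only that range; the string length stays the same.
        set str [string totitle $str $first $last]
    }
    return $str
}

set s "some.are.weird, isn't this fun?"
puts [titleMatching {\w+} $s]        ;# alphanumerics or _ form a word
puts [titleMatching {\S+} $s]        ;# any run of non-whitespace forms a word
puts [titleMatching {[\w']+} $s]     ;# like \w+, but apostrophes stay inside the word
```

The third pattern title-cases each dotted part separately while leaving "isn't" as a single word, which neither of the first two does.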
| |
| Gerry Snyder 2004-08-28, 3:57 am |
| Bruce Hartweg wrote:
>
>
> you need/want something that is parsing the string into words and operating
> on them individually. You could do this with a regsub/subst pair ...
>
> proc multiWordTitleCase {str} {
> subst [regsub -all {\w+} $str {[string totitle "&"]}]
Much thanks to all who replied, and especially to Bruce for the above
code. First, it does what I need neatly, proving that I was right in
thinking my way of doing it was far from optimal (more briefly, my
original way stinks). Second, it opened my eyes to the power of the
regsub/subst pair in a much more meaningful way than seeing it applied
to someone else's problem would have.
Thanks again,
Gerry