string bytelength -> character length?
| George Petasis 2006-11-24, 7:02 pm |
| Hi all,
I have a Tcl string and a *byte* offset into that string.
How can I convert this *byte* offset to a character offset?
If I had the character offset, I could get the byte offset with string
bytelength. But how do I achieve the opposite?
I suppose that I have to "reparse" the string into a binary object
(having each byte as a character), then cut this string with string
range, and then "reparse" this string as utf-8, to use string length on
this. But how to accomplish this?
The solution in C would be extremely easy:
Tcl_NumUtfChars(Tcl_GetString(string_obj), bytelength);
But how can the same be done in Tcl?
George
PS: It's a pity that TkHtml 3.0 returns *byte* offsets in the parse
handlers, and not *character* offsets!
| Darren New 2006-11-24, 7:02 pm |
| George Petasis wrote:
> range, and then "reparse" this string as utf-8, to use string length on
> this. But how to accomplish this?
Be aware of [encoding convertto utf-8 $yourstring]. I spent some time
finding this myself. :-)
--
Darren New / San Diego, CA, USA (PST)
Scruffitarianism - Where T-shirt, jeans,
and a three-day beard are "Sunday Best."
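The approach discussed above (encode to UTF-8 bytes, cut at the byte offset, decode again, take the length) could be sketched roughly as follows. The proc name byteToCharOffset is made up for illustration and is not part of Tcl. The sketch assumes the byte offset counts standard UTF-8 bytes and falls on a character boundary; Tcl's internal representation stores NUL as two bytes, so offsets into strings containing NULs may not match.

```tcl
# Hypothetical helper: convert a UTF-8 byte offset into a character offset.
# Assumes byteOffset lies on a character boundary of the UTF-8 encoding.
proc byteToCharOffset {str byteOffset} {
    # Encode to UTF-8; the result is a string with one "character" per byte.
    set bytes [encoding convertto utf-8 $str]
    # Keep only the bytes that precede the offset (empty when offset is 0).
    set prefix [string range $bytes 0 [expr {$byteOffset - 1}]]
    # Decode the prefix again; its length is the character offset.
    return [string length [encoding convertfrom utf-8 $prefix]]
}
```

For example, [byteToCharOffset "a\u03b1b" 3] returns 2, because "a" takes one byte and the Greek alpha takes two.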