
Author string bytelength -> character length?
George Petasis

2006-11-24, 7:02 pm

Hi all,

I have a Tcl string, and somehow I have a *byte* offset into this string.
How can I convert this *byte* offset to a character offset?
If I had the character offset, I could get the byte offset with string
bytelength. But how do I achieve the opposite?
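
For reference, the forward direction mentioned above (character offset to byte offset) might look like the following minimal sketch. The proc name charOffsetToByteOffset is made up for illustration, and the sketch assumes a Tcl 8.x interpreter where string bytelength is available.

```tcl
# Hypothetical helper (name not from the thread): byte offset of the
# character at index $charOffset, measured in Tcl's UTF-8 form of $str.
# Assumes a Tcl 8.x interpreter that provides [string bytelength].
proc charOffsetToByteOffset {str charOffset} {
    # Take the characters before the offset; an offset of 0 yields "".
    set prefix [string range $str 0 [expr {$charOffset - 1}]]
    # Count how many bytes those characters occupy.
    return [string bytelength $prefix]
}

set s "a\u00e9b"   ;# 'a', e-acute (2 bytes in UTF-8), 'b'
puts [charOffsetToByteOffset $s 2]
```

Here the two characters before index 2 occupy 1 + 2 bytes, so this prints 3.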

I suppose that I have to "reparse" the string into a binary object
(having each byte as a character), then cut this string with string
range, and then "reparse" this string as utf-8, to use string length on
this. But how to accomplish this?

The solution in C would have been extremely easy:

Tcl_NumUtfChars(Tcl_GetString(string_obj), bytelength);

But how can the same be done in Tcl?

George

PS: It's a pity that TkHtml 3.0 returns *byte* offsets in the parse
handlers, and not *character* offsets!

Darren New

2006-11-24, 7:02 pm

George Petasis wrote:
> range, and then "reparse" this string as utf-8, to use string length on
> this. But how to accomplish this?


Be aware of [encoding convertto utf-8 $yourstring]. I spent some time
finding this myself. :-)
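
A minimal sketch of how that could be put together, following the "reparse" idea from the question: encode to UTF-8 bytes, cut at the byte offset, decode again, and count characters. The proc name byteOffsetToCharOffset is made up for illustration. The sketch assumes the byte offset falls on a character boundary, and that the offsets refer to plain UTF-8; Tcl's internal representation can differ in corner cases (NUL, for instance, is stored as two bytes internally), so treat this as an approximation of the Tcl_NumUtfChars call rather than an exact equivalent.

```tcl
# Hypothetical helper (name not from the thread): map a byte offset in
# the UTF-8 encoding of $str to a character offset.
# Assumes $byteOffset lies on a character boundary.
proc byteOffsetToCharOffset {str byteOffset} {
    # Each "character" of $bytes is one byte of the UTF-8 encoding.
    set bytes [encoding convertto utf-8 $str]
    # Keep the first $byteOffset bytes; an offset of 0 yields "".
    set prefix [string range $bytes 0 [expr {$byteOffset - 1}]]
    # Decode those bytes back into characters and count them.
    return [string length [encoding convertfrom utf-8 $prefix]]
}

set s "a\u00e9b"   ;# 'a', e-acute (2 bytes in UTF-8), 'b'
puts [byteOffsetToCharOffset $s 3]
```

The first three bytes cover 'a' and the e-acute, so this prints 2.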

--
Darren New / San Diego, CA, USA (PST)
Scruffitarianism - Where T-shirt, jeans,
and a three-day beard are "Sunday Best."