
Re: [PHP-DOC] Starting on Unicode docs
Gabor Hojtsy

2006-07-19, 6:58 pm

Our policy was always to document functionality when people have time
and willingness to do so, even for functionality that will only become
available in stable versions in the future. The new Unicode stuff needs to
be marked as such, but IMHO has its place in the documentation. People need
to see that there is a light regarding Unicode!

Gabor

On Wed, 19 Jul 2006, Andrei Zmievski wrote:

> I think this information could be useful since people will most likely want
> to use the online manual even before PHP 6.0 final is out. Making them go to
> a separate site to look up which function is Unicode-safe or not is not very
> user-friendly.
>
> -Andrei
>
>
>

Nuno Lopes

2006-07-19, 6:58 pm

Hi Goba!

I have already started documenting the Unicode stuff about one year ago
(http://php.net/unicode). Most of it is already out of date, but it is a
starting point.
What we were discussing is whether we should change every page to mention
that it is unicode-safe/aware/compatible/whatever. We were all against that
because changing every single file would be a pain for us and mainly for the
translators. All functions will be converted by the time PHP 6 is released,
so there is no real interest in marking every single function as such.

What we can do (and should) is add some information to the reference.xml
file about how the extension handles Unicode data (for example, the xml
extensions use UTF-8 internally because of libxml, etc.).


Nuno

> Our policy was always to document functionality when people have time and
> willingness to do so, even for functionality that will only become
> available in stable versions in the future. The new Unicode stuff needs to
> be marked as such, but IMHO has its place in the documentation. People need
> to see that there is a light regarding Unicode!
>
> Gabor
>
> On Wed, 19 Jul 2006, Andrei Zmievski wrote:
>

Andrei Zmievski

2006-07-19, 6:58 pm

Fine, but what I was saying is that the behavior of a number of functions
will change, slightly or more than slightly, depending on whether Unicode
semantics mode is on or on the type of data passed to them. Those
changes need to be documented.

-Andrei

On Jul 19, 2006, at 2:17 PM, Nuno Lopes wrote:

> I have already started documenting the Unicode stuff about one year
> ago (http://php.net/unicode). Most of it is already out of date, but it
> is a starting point.
> What we were discussing is whether we should change every page to mention
> that it is unicode-safe/aware/compatible/whatever. We were all against
> that because changing every single file would be a pain for us and
> mainly for the translators. All functions will be converted by the
> time PHP 6 is released, so there is no real interest in marking
> every single function as such.
>
> What we can do (and should) is add some information to the
> reference.xml file about how the extension handles Unicode data (for
> example, the xml extensions use UTF-8 internally because of libxml,
> etc.).

Nuno Lopes

2006-07-19, 6:58 pm

> As far as I have seen, adding an entity was the suggested method, which
> should not be a problem for translators to adapt to IMHO. In case this
> entity is just a "this has unicode support" text, and nothing more, and if
> all text handling functions will indeed have Unicode support, then adding
> these entities does not really seem worth it.
>
> Gabor


The problem is that it's not only the text handling functions. We are talking
about all functions! I don't remember how many there are, but there are many.


> Fine, but what I was saying is that the behaviour of a number of functions
> will change, slightly or more than slightly, depending on whether Unicode
> semantics mode is on or on the type of data passed to them. Those
> changes need to be documented.
>
> -Andrei


Ah sure! Some protos may need changes, and some functions may need tweaking,
of course. But all this will need to be done manually and thus will take a
long time.
I think we can start by describing the general behaviour of the extension on
the reference.xml page and then, as time permits, touch the function pages.

Nuno

Philip Olson

2006-07-19, 6:58 pm


On Jul 19, 2006, at 2:34 PM, Gabor Hojtsy wrote:

> Philip, I assumed what is supposed to be documented is not just a
> boolean switch: "this is unicode compatible" / "this is not yet
> unicode compatible", but there are behaviour changes, special
> options, or anything (I am not actually following PHP 6
> development). In case it is just a boolean switch, then indeed
> there is no point in documenting it. It seems Andrei tries to push
> the point across that we are not talking about boolean switches.


Ah yes, this makes sense and his last post did clarify that
(hindsight is 20/20!). Sounds like a good idea to document all these
types of changes so to the original question on when: ASAP! :-)

Regards,
Philip

Philip Olson

2006-07-19, 6:58 pm


I'm unsure if I follow what you mean here Gabor, please clarify.

If we do this soon then later we'll undo it all because when PHP
6.0.0 is released every [appropriate] function will be unicode
compatible so repeating this thousands of times throughout the
manual won't be useful. The appropriate place to document "the
progress" of unicode support for pre PHP 6 should not be in all these
manual pages as we do not document alpha/beta releases. If for some
reason a function is missed and support is given in, for example, PHP
6.0.1 then we will mention this in the CHANGELOG for said function.
But as it stands, everyone is expected to assume unicode support
everywhere as that's what PHP 6 is all about.

So, it feels right to document Unicode compatibility in a few places
like the language reference, features, migration sections, related
extension docs (mbstring/iconv/...)... but not per function. In the
end people will see the light but not be blinded by it ;-) Also it
might be appropriate to document it in each extension's reference.xml
as it will be informative and also help promote this new feature.

Regards,
Philip


On Jul 19, 2006, at 1:03 PM, Gabor Hojtsy wrote:
> Our policy was always to document functionality when people have
> time and willingness to do so, even for functionality that will only
> become available in stable versions in the future. The new Unicode
> stuff needs to be marked as such, but IMHO has its place in the
> documentation. People need to see that there is a light regarding
> Unicode!
>
> Gabor
>
> On Wed, 19 Jul 2006, Andrei Zmievski wrote:
>

Andrei Zmievski

2006-07-19, 9:57 pm

Yes, that's a good idea. :) Most of the functionality in the core
language will not be changing so the documentation effort can start
immediately. I would offer to document some things myself, but
honestly, I think that my time is best spent finishing up core stuff,
doing Unicode upgrades of core functions, and nagging other
developers to upgrade their extensions. But I'm perfectly willing to
provide answers about Unicode features to those who wish to start this
documentation effort.

-Andrei


On Jul 19, 2006, at 2:52 PM, Philip Olson wrote:
> Ah yes, this makes sense and his last post did clarify that
> (hindsight is 20/20!). Sounds like a good idea to document all
> these types of changes so to the original question on when: ASAP! :-)
>
> Regards,
> Philip

Andrei Zmievski

2006-07-19, 9:57 pm

>
> Ah sure! Some protos may need changes, and some functions may need
> tweaking, of course. But all this will need to be made manually and
> thus will take a long time..
> I think we can start by describing the general behaviour of the
> extension on the reference.xml page and then, as time permits,
> touch the function pages.


It's not just the unicode extension that needs to be documented.
There are plenty of changes affecting the whole language.

-Andrei

Philip Olson

2006-07-20, 3:58 am


> Yes, that's a good idea. :) Most of the functionality in the core
> language will not be changing so the documentation effort can start
> immediately. I would offer to document some things myself, but
> honestly, I think that my time is best spent finishing up core
> stuff, doing Unicode upgrades of core functions, and nagging other
> developers to upgrade their extensions. But I'm perfectly willing
> to provide answers about Unicode features to those who wish to start
> this documentation effort.


Perhaps you could provide an example or two of "how to document a
Unicode change", as currently I for one would not know where to begin:
for instance, the best way to go about finding [1] and documenting a
Unicode-related change. An estimate of how many functions will
eventually require documentation changes for this would also help. 50,
200, 500, 1000, ...?

One note: The PHP documentation is in the middle of being converted
to a new doc style [2] that includes individual Changelogs so ideally
the document would first be converted to this new style (about 30% of
the manual is currently), and then this change would be done. Ideally...

Also, the doc team is sort of on vacation currently but once Livedocs
[3] is online I have a feeling we'll make a concerted effort to
attract additional help on all fronts.

Regards,
Philip

[1] http://www.php.net/~scoates/unicode/
[2] http://wiki.phpdoc.info/DocSkel
[3] http://livedocs.roshambo.org/

Andrei Zmievski

2006-07-20, 6:57 pm

> Perhaps you could provide an example or two of "how to document a
> Unicode change", as currently I for one would not know where to begin:
> for instance, the best way to go about finding [1] and documenting a
> Unicode-related change. An estimate of how many functions will
> eventually require documentation changes for this would also help. 50,
> 200, 500, 1000, ...?


The only reliable way is to read the source code, of course. :) But
seriously, it might be tough for someone to pinpoint what
Unicode-related changes have been done to a function. As an example,
the notes for strtoupper() and strtolower() could say something like:

These functions perform full case-mapping according to the Unicode
standard [1]. Consequently, the length of the string may not be the
same after the operation: it may become shorter or longer.

[1] http://www.unicode.org/versions/Uni...ch05.pdf#G21180
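
To make the length change concrete, here is a minimal sketch of full case-mapping. It uses Python rather than PHP, on the assumption that Python's str.upper() is a fair stand-in for the Unicode behaviour described in the note above; it does not show PHP 6's strtoupper() itself. The helper name upper_with_lengths is hypothetical and exists only for this illustration.

```python
from typing import Tuple


def upper_with_lengths(text: str) -> Tuple[str, int, int]:
    """Upper-case text and report its length in code points before and after.

    Assumption: Python's str.upper() stands in for the full case-mapping
    described in the note; this is not PHP's implementation.
    """
    upper = text.upper()
    return upper, len(text), len(upper)


if __name__ == "__main__":
    # "ß" (U+00DF) and the "fi" ligature (U+FB01) each map to two
    # upper-case letters, so the result is longer than the input.
    for sample in ("straße", "\ufb01le", "plain"):
        upper, before, after = upper_with_lengths(sample)
        print(f"{sample!r} -> {upper!r} (length {before} -> {after})")
```

Code that sizes a buffer or a database column from the input length would need to account for this kind of growth, which is the sort of detail a per-function note could record.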

As for how many functions will require documentation changes, it's hard
to say, as we've only upgraded about 200 right now. I would say that
about half of the string funcs may require some notes.

> Also, the doc team is sort of on vacation currently but once Livedocs
> [3] is online I have a feeling we'll make a concerted effort to
> attract additional help on all fronts.


Makes sense.

-Andrei

Philip Olson

2006-07-25, 6:57 pm


Hello everyone,

An RFC has been created on this topic so please review and comment:

RFC :: Details :: Documenting Unicode support
http://doc.php.net/php/rfc/rfc-proposal-show.php?id=6

Summary of RFC:
Add a new role named unicode that goes above the changelog, add a new
changelog entry recording this change, add features/unicode.xml, update
related extensions with Unicode information (mbstring/iconv...), track the
progress, and be sure the new doc style is implemented first.

Note: This is the first real use of this RFC system and generated
emails currently only go to the doc-web mailing list. So, user
beware :-)

Regards,
Philip