Home > Archive > APL > October 2007 > APL Idiom?
|
|
|
| Anyone have an APL Idiom to do the following, given a binary bit
vector
0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
Produce the following:
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
?
| |
|
| APL wrote:
> Anyone have an APL Idiom to do the following, given a binary bit
> vector
> 0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> Produce the following:
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
>
> ?
>
B{times}+\B>{neg}1{drop}0,B
p
--
Posted via a free Usenet account from http://www.teranews.com
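For readers without an APL session to hand, here is a minimal Python sketch of the same idea. It is an illustration only, not part of the original post: the function name number_runs is made up for the example, and the comments map each step to the matching piece of the expression above.

```python
from itertools import accumulate


def number_runs(bits):
    """Label each run of 1s with its ordinal position; 0s stay 0."""
    # Shift right by one and pad with 0: the {neg}1{drop}0,B part
    previous = [0] + list(bits[:-1])
    # A run starts wherever an item is greater than its predecessor: B>...
    starts = [int(b > p) for b, p in zip(bits, previous)]
    # Running count of run starts: the +\ part
    run_index = list(accumulate(starts))
    # Zero out everything that is not inside a run: the B{times} part
    return [b * r for b, r in zip(bits, run_index)]


print(number_runs([0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1]))
```

For the vector in the question this prints [0, 0, 0, 1, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0, 3, 0, 4, 4].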
| |
|
| On Oct 22, 4:43 pm, APL <apl....@gmail.com> wrote:
> Anyone have an APL Idiom to do the following, given a binary bit
> vector
> 0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> Produce the following:
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
>
> ?
{ ×+\ }0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
| |
|
| On Oct 22, 5:27 pm, Rav <Pa...@cais.com> wrote:
> APL wrote:
>
>
> B{times}+\B>{neg}1{drop}0,B
>
> p
>
Or, in case UTF8 is not working for you, or you are not using Dyalog:
bool<-0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
boolx+\bool
| |
|
| On Oct 22, 4:43 pm, APL <apl....@gmail.com> wrote:
> Anyone have an APL Idiom to do the following, given a binary bit
> vector
> 0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> Produce the following:
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
>
> ?
Hmmm... Should read more carefully - sorry
| |
|
| On Oct 22, 9:27 am, Rav <Pa...@cais.com> wrote:
> APL wrote:
>
>
> B{times}+\B>{neg}1{drop}0,B
>
Perfect. Thanks.
| |
| Phil Last 2007-10-22, 6:58 pm |
| On Oct 22, 7:14 pm, APL <apl....@gmail.com> wrote:
> On Oct 22, 9:27 am, Rav <Pa...@cais.com> wrote:
>
>
>
>
> Perfect. Thanks.
Kai was trying to say (but maybe this won't work either if google
doesn't have it right. Looks ok as I type using the IME.) ...
{ ×+\ }
{omega times plus scan omega}
Here is the derived function equivalent
× (+\)
times jot (plus scan) commute
Both of these avoid the necessity for the intermediate assignment of B
or whatever.
Phil
| |
|
| On Oct 22, 12:40 pm, Phil Last <phil.l...@ntlworld.com> wrote:
> On Oct 22, 7:14 pm, APL <apl....@gmail.com> wrote:
>
>
>
>
>
>
> Kai was trying to say (but maybe this won't work either if google
> doesn't have it right. Looks ok as I type using the IME.) ...
> { ×+\ }
> {omega times plus scan omega}
> Here is the derived function equivalent
> × (+\)
> times jot (plus scan) commute
> Both of these avoid the necessity for the intermediate assignment of B
> or whatever.
> Phil
I don't do defined-functions, a la Dyalog. Straight APL works fine and
the solution above is perfect. Some days, the simple seems hard...
| |
|
| Another solution:
B\{member}(+\({shape}B{enclose}B)/B){times}B{enclose}B
| |
| Phil Last 2007-10-23, 3:57 am |
| On Oct 22, 8:40 pm, Phil Last <phil.l...@ntlworld.com> wrote:
> On Oct 22, 7:14 pm, APL <apl....@gmail.com> wrote:
>
>
>
>
>
>
> Kai was trying to say (but maybe this won't work either if google
> doesn't have it right. Looks ok as I type using the IME.) ...
> { ×+\ }
> {omega times plus scan omega}
> Here is the derived function equivalent
> × (+\)
> times jot (plus scan) commute
> Both of these avoid the necessity for the intermediate assignment of B
> or whatever.
> Phil
OK, I know I was making the same mistake as Kai in not reading the
problem properly, also not reading the rest of the thread, but in
terms of what gets transmitted and received it seems it's google
that's throwing away the apl characters. As I type this the IME is
letting me enter and displaying " ~ * ~ " (CTRL+"all
these apl characters"). I'll bet google chucks most of them away.
Phil
| |
| stevemansour@yahoo.com 2007-10-23, 6:58 pm |
| On Oct 22, 2:14 pm, APL <apl....@gmail.com> wrote:
> On Oct 22, 9:27 am, Rav <Pa...@cais.com> wrote:
>
>
>
>
> Perfect. Thanks.
A slightly more compact expression:
Bx+\1,2>/B
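A hedged Python sketch of what this pairwise version appears to do, for comparison with the shifted-compare approach. The name number_runs_pairwise is invented for the example; the idea is that 2>/B flags the END of each run, so the running total (seeded with 1) counts the runs already finished plus one.

```python
from itertools import accumulate


def number_runs_pairwise(bits):
    """Number runs of 1s by counting how many runs have already ended."""
    # 2>/B: 1 wherever an item is greater than its successor (a run just ended)
    ends = [int(a > b) for a, b in zip(bits, bits[1:])]
    # 1,... restores the original length and starts the count at 1;
    # +\ then gives 1 + the number of runs finished before each position
    counter = list(accumulate([1] + ends))
    # B{times}... keeps the counter only inside runs
    return [b * c for b, c in zip(bits, counter)]


print(number_runs_pairwise([0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1]))
```

Unlike the APL expression, this sketch quietly returns an empty list for an empty argument.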
| |
|
| > I don't do defined-functions, a la Dyalog. Straight APL works fine and
> the solution above is perfect. Some days, the simple seems hard...
???
There is hardly anything available in APL simpler than dynamic
functions.
I used them because one often can avoid creating temporary variables,
as in this case.
Anyway, here is Dyalog []AV without the control chars for test
purposes:
          %'
_abcdefghijklmno
pqrstuvwxyz  ¯.
0123456789 ¤¥$£¢
 ABCDEFGHIJKLMNO
PQRSTUVWXYZ  ý·
 ÁÂÃÇÈÊËÌÍÎÏÐÒÓÔ
ÕÙÚÛÝþãìðòõ{€}
¨ÀÄÅÆ ÉÑÖØÜßàáâä
åæçèéêëíîïñ[/ \
< = >   -+÷×?  ~
    *    (
|;,         !
   óôöø"# &'
       @ùúû^ü' ¶
: ¿¡    )]  §
In the browser I am editing this message I see all the APL characters.
However, I am prepared for the worst...
| |
|
| On Oct 23, 8:49 am, folic <AA2e...@lycos.co.uk> wrote:
> Another solution:
>
> B\{member}(+\({shape}B{enclose}B)/B){times}B{enclose}B
Hardly.
Rav's first solution, which is perfect, takes 0.40-something on my PC
when executed 100 times on a vector of booleans with 1000000 elements.
This is 11% faster than the slightly shorter solution from Steve.
A solution with nested arrays can be expected to be __much__ slower,
and to occupy much more memory.
| |
|
| > Anyway, here is Dyalog []AV without the control chars for test
> purposes:
>
>           %'
> _abcdefghijklmno
> pqrstuvwxyz  ¯.
> 0123456789 ¤¥$£¢
>  ABCDEFGHIJKLMNO
> PQRSTUVWXYZ  ý·
>  ÁÂÃÇÈÊËÌÍÎÏÐÒÓÔ
> ÕÙÚÛÝþãìðòõ{€}
> ¨ÀÄÅÆ ÉÑÖØÜßàáâä
> åæçèéêëíîïñ[/ \
> < = >   -+÷×?  ~
>     *    (
> |;,         !
>    óôöø"# &'
>        @ùúû^ü' ¶
> : ¿¡    )]  §
>
> In the browser I am editing this message I see all the APL characters.
> However, I am prepared for the worst...
Oh dear....
| |
|
| stevemansour@yahoo.com wrote:
> On Oct 22, 2:14 pm, APL <apl....@gmail.com> wrote:
>
> A slightly more compact expression:
>
> Bx+\1,2>/B
>
Very nice! I didn't think about using N-wise reduction. While it is a
LITTLE slower (a little more than 1.5 times on my system), since either
way is exceedingly fast in absolute terms, the speed difference is
negligible. On the one hand, the first algorithm works on ANY APL (i.e.
earlier ones like APL*PLUS PC), not just more modern ones that implement
N-wise reduction. On the other hand, the second algorithm is easier to
"read" in that it's more obvious that it's doing N-wise reduction 2
elements at a time.
| |
| Ted Edwards 2007-10-23, 6:58 pm |
| APL wrote:
> Anyone have an APL Idiom to do the following, given a binary bit
> vector
> 0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> Produce the following:
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
What comes to mind immediately is:
x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
x\{eps}({iota}{rho}y){times}y{<-}x{partition}x
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
(using APL2)
Ted
| |
| Ted Edwards 2007-10-23, 6:58 pm |
| kai wrote:
> bool<-0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> boolx+\bool
doesn't work in APL2 and I hope it doesn't in Dyalog:
bool{times}+\bool
0 0 0 1 0 0 2 3 4 5 0 0 0 0 6 0 7 8
Ted
| |
| Bakul Shah 2007-10-23, 6:58 pm |
| Ted Edwards wrote:
> APL wrote:
>
> What comes to mind immediately is:
> x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> x\{eps}({iota}{rho}y){times}y{<-}x{partition}x
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
>
> (using APL2)
>
> Ted
Here is a K3 solution (closer in spirit to the original
solution in this thread):
{ x*+\0,>':x}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
| |
| Bakul Shah 2007-10-23, 6:58 pm |
| Bakul Shah wrote:
> Ted Edwards wrote:
>
> Here is a K3 solution (closer in spirit to the original
> solution in this thread):
>
> { x*+\0,>':x}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
Oops. One must always test boundary conditions!
{x*+\(1#x),>':x}1 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
1 0 0 2 0 0 3 3 3 3 0 0 0 0 4 0 5 5
| |
| Ted Edwards 2007-10-23, 6:58 pm |
| The three I tested that worked:
x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
x{times}+\x>{neg}1{drop}0,x
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
x{times}+\1,2>/x
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
TIMER executes R L times and divides the resultant time by L (default 1)
1E4 TIMER'x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x'
0.0000273
1E4 TIMER'x{times}+\x>{neg}1{drop}0,x'
0.000016
1E4 TIMER'x{times}+\1,2>/x'
0.0000215
B{<-}1^{neg}1+?1E6{rho}2
TIMER 'B\{eps}({iota}{rho}y){times}y{<-}B{enclose}B'
1.028
TIMER'B{times}+\B>{neg}1{drop}0,B'
0.17
TIMER'B{times}+\1,2>/B'
1.457
Tests run on a T-Pad T60 running APL2 for OS/2 under eCS. Clearly
"Rav"'s is the fastest.
Ted
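The TIMER function itself is not shown in the thread. As a rough analogue only, a Python helper with the behaviour described (run something L times, divide the elapsed time by L) might look like the sketch below; the name timer and its signature are assumptions made for the example.

```python
import timeit


def timer(func, repeats=1):
    """Call func `repeats` times and return the mean seconds per call."""
    total = timeit.timeit(func, number=repeats)
    return total / repeats


# Example: average cost of summing a million-element list, over 10 calls
data = [1, 0] * 500_000
print(timer(lambda: sum(data), repeats=10))
```

Absolute figures from such a helper depend on the machine and interpreter, so only the ratios between candidate solutions are really comparable.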
| |
| Tracker 2007-10-23, 6:58 pm |
| kai wrote:
>
> ???
>
> There is hardly anything available in APL simpler than dynamic
> functions.
> I used them because one often can avoid creating temporary variables,
> as in this case.
The only problem is that dynamic functions work only in Dyalog. Not APLX, not APL+Win, not APL2. But please correct me if I'm wrong. Perhaps you are saying, the only APL to use is Dyalog? :-)
| |
| Tracker 2007-10-23, 9:58 pm |
| Rav wrote:
> stevemansour@yahoo.com wrote:
>
> Very nice! I didn't think about using N-wise reduction. While it is a
> LITTLE slower (a little more than 1.5 times on my system), since either
> way is exceedingly fast in absolute terms, the speed difference is
> negligible. On the one hand, the first algorithm works on ANY APL (i.e.
> earlier ones like APL*PLUS PC), not just more modern ones that implement
> N-wise reduction. On the other hand, the second algorithm is easier to
> "read" in that it's more obvious that it's doing N-wise reduction 2
> elements at a time.
The N-wise reduction is 4 times slower than Rav's solution on my system.
| |
| Tracker 2007-10-24, 3:57 am |
| Ted Edwards wrote:
> APL wrote:
>
> What comes to mind immediately is:
> x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> x\{eps}({iota}{rho}y){times}y{<-}x{partition}x
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
>
> (using APL2)
>
> Ted
Ted - your solution fails if the first element is a 1 - I think - provided I translated it correctly. Rav's solution still wins, and works on all APL systems.
| |
| Phil Last 2007-10-24, 6:58 pm |
| On Oct 24, 5:09 am, Tracker <n...@email.net> wrote:
> Ted Edwards wrote:
>
>
>
>
> Ted - your solution fails if the first element is a 1 - I think - provided I translated it correctly. Rav's solution still wins, and works on all APL systems.
And looking at it again it's also the closest to a description of the
problem.
w>(neg)1(drop)0,w means "each item greater than its predecessor".
The frustration with dyadic (n-wise) reduction is that it never seems
to do exactly what you want.
Firstly the length is always ((abs)n)-1 less than that of the
argument.
The temptation is to correct this AFTER the fact because we want to
operate the reduction on the argument as supplied. This is
conceptually wrong for the same reason that 0,1(drop)w is wrong ... it
gives wrong length on empty w, ok n-wise reduction gives error on
empty anyway so not such a big problem.
But to go back to "each item greater than its predecessor". This is
not something n-wise reduction can actually do because whether n is
positive or negative the result always corresponds positionally with
the FIRST of each n items. We want the second. The closest to this is
to catenate the identity to the front before the reduction ...
(neg)2>/0,w
but then all the items have moved and 'though it does work it no
longer looks like the same problem we're answering.
Incidentally, for those who haven't used it the negative n-wise
reduction reverses the n-item sub-vectors before the reduction. Only
really useful with neg 2, I suspect.
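To make the points about length and about negative n concrete, here is a small Python model of n-wise reduction. It is a sketch under stated assumptions, not a definition of the APL primitive: the name nwise_reduce is invented, n is assumed to be non-zero, and each window is reduced right to left as APL reduction does.

```python
from functools import reduce
import operator


def nwise_reduce(op, n, vec):
    """Reduce each window of abs(n) consecutive items of vec with op.

    A negative n reverses every window before it is reduced.
    """
    width = abs(n)
    windows = [vec[i:i + width] for i in range(len(vec) - width + 1)]
    if n < 0:
        windows = [w[::-1] for w in windows]
    # Right-to-left fold: a op (b op (c ...)), matching APL reduction order
    return [reduce(lambda acc, x: op(x, acc), reversed(w)) for w in windows]


w = [0, 0, 0, 1, 0, 0, 1, 1]
# The result is abs(n)-1 items shorter than the argument
print(len(w), len(nwise_reduce(operator.gt, 2, w)))
# "Each item greater than its predecessor": catenate 0 in front, use n = -2
print([int(v) for v in nwise_reduce(operator.gt, -2, [0] + w)])
```

With the 0 catenated in front, the second result has the same length as w and flags positions 3 and 6 (counting from 0), the starts of the two runs.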
| |
| phil chastney 2007-10-25, 3:57 am |
| Phil Last wrote:
> On Oct 22, 8:40 pm, Phil Last <phil.l...@ntlworld.com> wrote:
>
> OK, I know I was making the same mistake as Kai in not reading the
> problem properly, also not reading the rest of the thread, but in
> terms of what gets transmitted and received it seems it's google
> that's throwing away the apl characters. As I type this the IME is
> letting me enter and displaying " ~ * ~ " (CTRL+"all
> these apl characters"). I'll bet google chucks most of them away.
this reminds me of the occasion when a group of APL users complained
their printer wasn't working, and the IS dept (who had openly declared
their intention of stamping out APL) said the problem was that they (the
APLers) were trying to print APL
the point, as here, is that neither Google nor the printer knows it's
transmitting or receiving APL -- they only recognise a stream of bytes
what prompts you to say "google chucks most of them away"? are there
signs that google is discarding bytes from the stream, or are you
receiving the right number of characters, but not getting the display
you expect?
the problem with the printer was that the connector was falling apart
-- I fixed it with my trusty Swiss Army knife, albeit a bit clumsily,
and thereafter always carried a proper set of screwdrivers, just so IS
couldn't mislead their users a second time
I doubt if that solution will help with the Google problem, though
all the best . . . /phil
[P.S: that bit about only recognising a stream of bytes is not quite
right, what with language tags and automatic language recognition, but I
doubt if these efforts will ever extend to include APL]
| |
| David Liebtag 2007-10-25, 6:58 pm |
| > [P.S: that bit about only recognising a stream of bytes is not quite right,
> what with language tags and automatic language recognition, but I doubt if
> these efforts will ever extend to include APL]
I believe this is wrong. If you encode your posts using Unicode, and if the
readers have fonts which include glyphs for the APL characters, this problem
is already solved.
David Liebtag
| |
|
| On Oct 25, 4:16 pm, "David Liebtag" <DavidLieb...@vermontel.net>
wrote:
>
> I believe this is wrong. If you encode your posts using Unicode, and if the
> readers have fonts which include glyphs for the APL characters, this problem
> is already solved.
>
> David Liebtag
In theory: yes. In practice: often.
In theory, there is no difference between theory and practice. In
practice, there is.
I have configured my Firefox so that APL385 Unicode is used to display
text, regardless what the website is telling the browser.
I did a test with groups hosted by Google in the past, and that test
was successful then. In this thread I sent an email where most of the
[]AV was included. Unfortunately I cannot see a single APL character,
and that is because the bytes representing these characters have
disappeared.
I do not understand this because I've sent the email via my Google
mail account (which is configured as UTF8) and the page is displayed
using UTF8. So the bytes must have been lost on their way.
As I already said: oh dear...
| |
| Phil Last 2007-10-25, 6:58 pm |
| On Oct 25, 9:53 am, phil chastney
<phil.hates.s...@amadeus.munged.eclipse.co.uk> wrote:
> Phil Last wrote:
>
>
> this reminds me of the occasion when a group of APL users complained
> their printer wasn't working, and the IS dept (who had openly declared
> their intention of stamping out APL) said the problem was that they (the
> APLers) were trying to print APL
>
> the point, as here, is that neither Google nor the printer knows it's
> transmitting or receiving APL -- they only recognise a stream of bytes
>
> what prompts you to say "google chucks most of them away"? are there
> signs that google is discarding bytes from the stream, or are you
> receiving the right number of characters, but not getting the display
> you expect?
>
> the problem with the printer was that the connector was falling apart
> -- I fixed it with my trusty Swiss Army knife, albeit a bit clumsily,
> and thereafter always carried a proper set of screwdrivers, just so IS
> couldn't mislead their users a second time
>
> I doubt if that solution will help with the Google problem, though
>
> all the best . . . /phil
>
> [P.S: that bit about only recognising a stream of bytes is not quite
> right, what with language tags and automatic language recognition, but I
> doubt if these efforts will ever extend to include APL]
To put it more accurately I am transmitting unicode and somewhere
along the line some of the characters are being replaced by blanks. I
certainly don't imagine the good folks at google have anything
particular against apl. Just that google groups or maybe just usenet
groups don't yet seem to benefit from a proper handling of unicode
encoding.
| |
| Ted Edwards 2007-10-25, 6:58 pm |
| Tracker wrote:
> Ted - your solution fails if the first element is a 1 - I think -
> provided I translated it correctly. Rav's solution still wins, and
> works on all APL systems.
While I agree that Rav's is the best, I don't think mine fails:
x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
x{<-}1 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
1 0 0 2 0 0 3 3 3 3 0 0 0 0 4 0 5 5
x{<-}1 1 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
1 1 0 2 0 0 3 3 3 3 0 0 0 0 4 0 5 5
Ted
| |
| Tracker 2007-10-26, 3:58 am |
| Ted Edwards wrote:
> Tracker wrote:
>
> While I agree that Rav's is the best, I don't think mine fails:
>
> x{<-}0 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
> 0 0 0 1 0 0 2 2 2 2 0 0 0 0 3 0 4 4
> x{<-}1 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
> 1 0 0 2 0 0 3 3 3 3 0 0 0 0 4 0 5 5
> x{<-}1 1 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 1
> x\{eps}({iota}{rho}y){times}y{<-}x{enclose}x
> 1 1 0 2 0 0 3 3 3 3 0 0 0 0 4 0 5 5
>
> Ted
As I said - provided I translated it correctly. I re-entered it today, and it appears to work fine. Nicely done.
| |
| phil chastney 2007-10-26, 3:58 am |
| kai wrote:
> On Oct 25, 4:16 pm, "David Liebtag" <DavidLieb...@vermontel.net>
> wrote:
>
> In theory: yes. In practice: often.
>
> In theory, there is no difference between theory and practice. In
> practice, there is.
>
> I have configured my Firefox so that APL385 Unicode is used to display
> text, regardless what the website is telling the browser.
>
> I did a test with groups hosted by Google in the past, and that test
> was successful then. In this thread I sent an email where most of the
> []AV was included. Unfortunately I cannot see a single APL character,
> and that is because the bytes representing these characters have
> disappeared.
>
> I do not understand this because I've sent the email via my Google
> mail account (which is configured as UTF8) and the page is displayed
> using UTF8. So the bytes must have been lost on their way.
>
> As I already said: oh dear...
when you reply to a message, the mailer software will usually offer you
the option of replying using the encoding of the original message, or an
over-riding encoding of your choice
secondly, displaying a page using UTF-8 isn't a lot of use unless the
original page was sent using UTF-8 -- it is sometimes necessary to
scroll through all the likely encodings, until you find one that works
until MIME headers and UTF-8 are more widely used, this group will
encounter problems with APL code in text msgs -- particularly since the
only standard APL encoding is Unicode
regards . . . /phil
| |
| phil chastney 2007-10-26, 3:58 am |
| David Liebtag wrote:
>
> I believe this is wrong. If you encode your posts using Unicode, and if the
> readers have fonts which include glyphs for the APL characters, this problem
> is already solved.
I'm not sure which problem you think is "solved" here
the only way to transmit APL code in text form unambiguously is via
Unicode, and I think we agree on that
actually, that's not quite right: APL text can be converted to
hexadecimal values using Unicode, but how those hexadecimal values get
transmitted is a separate issue -- we have UTF-7, UTF-8, UTF-16 (little
end and big end), plus EBCDIC variants, for a start
so let's be precise, and rephrase that to "the only way to encode APL in
text form unambiguously is via Unicode, and the most convenient way of
transmitting encoded APL is via UTF-8" -- can we agree on that?
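As a small illustration of the distinction between the Unicode code point and the bytes actually transmitted, the following Python sketch encodes one APL character three ways. The helper name encodings is made up for the example; the codec names are the ones Python ships with.

```python
def encodings(text):
    """Return the bytes produced by three common Unicode encoding forms."""
    codecs_to_try = ("utf-8", "utf-16-le", "utf-16-be")
    return {codec: text.encode(codec) for codec in codecs_to_try}


omega = "\u2375"  # APL FUNCTIONAL SYMBOL OMEGA, code point U+2375
for codec, data in encodings(omega).items():
    print(f"{codec:10} {data.hex(' ')}")
# utf-8      e2 8d b5
# utf-16-le  75 23
# utf-16-be  23 75
```

One code point, three different byte streams: the receiver has to be told, or has to guess, which one was used.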
the net result is still a byte stream -- to human observers, the
individual characters may be recognisable as part of the APL character
set, but none of the communications machinery knows nor cares
and, to muddy the waters further, a UTF-8 stream, being 8-bit, does not
need a BOM
-- some people would insist it should not have one, the resultant
ambiguity being no worse than that arising from the current plethora of
codepages, and dangers arising from BOMs being accidentally embedded are
thus avoided
-- personally, I'm happy with the common practice of prefixing a UTF-8
encoded BOM to a UTF-8 encoded stream, especially where there is no
other out-of-band means (such as a MIME header) of identifying the
encoding in use
-- either way, once again, the communications machinery neither knows
nor cares
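For anyone who wants to see the effect of a UTF-8 BOM at the byte level, here is a minimal Python sketch. It relies only on the standard "utf-8" and "utf-8-sig" codecs; the sample character is an arbitrary choice.

```python
import codecs

text = "\u00d7"  # MULTIPLICATION SIGN, used for APL times

plain = text.encode("utf-8")       # c3 97
signed = text.encode("utf-8-sig")  # ef bb bf c3 97 (BOM prefixed)

print(plain.hex(" "), "|", signed.hex(" "))
print(signed == codecs.BOM_UTF8 + plain)

# A BOM-aware decoder strips the prefix and also accepts input without one
print(signed.decode("utf-8-sig") == text, plain.decode("utf-8-sig") == text)

# A plain UTF-8 decoder keeps the BOM as a leading U+FEFF character
print(repr(signed.decode("utf-8")))
```

The last line shows how a BOM can end up embedded in the text when the reader is not expecting one.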
my PS referred to natural languages -- there are moves afoot to include
language tags in plain text -- such tags are not part of the Unicode
standard, though the bodies concerned will work closely with Unicode
org, and may even work under the aegis of Unicode org -- I doubt if
these efforts will ever extend to include mathematical notation or
formal languages in general
regards . . . /phil
| |
| phil chastney 2007-10-26, 3:58 am |
| Phil Last wrote:
>
> To put it more accurately I am transmitting unicode and somewhere
> along the line some of the characters are being replaced by blanks. I
> certainly don't imagine the good folks at google have anything
> particular against apl. Just that google groups or maybe just usenet
> groups don't yet seem to benefit from a proper handling of unicode
> encoding.
OK, time to get picky
you encoded your text using Unicode, and transmitted it, presumably,
using UTF-8
what you see, after this stream has made the trip to Google and back,
occupies as many print positions as the original text, if I understand
you correctly
this strongly suggests that the same flavour of UTF was used to encode
the text for transmission, and then to decode the incoming byte stream
for display --> unless, of course, it wasn't encoded using UTF at all,
but just sent as a common or garden 8-bit encoding
in either case, the hexadecimal values received should match those sent
-- I rather doubt if any of the sent values were replaced by 0x0020, so
the problem may be the display at your end
so, in your position, I would
(i) ascertain that the original text was encoded using Unicode
(cut and paste into Wordpad is quick, but not foolproof)
(ii) check your email settings, to ensure that you send and receive
using UTF-8
(iii) view the received message using different encodings, to ensure
that this particular msg was sent using UTF-8 (i.e, didn't
inherit an 8-bit encoding from a previous msg)
(iv) check the font you're using for display -- does it have APL
symbols in it?
HTH, and all the best . . . /phil
| |
| Ted Edwards 2007-10-26, 9:59 pm |
| phil chastney wrote:
> until MIME headers and UTF-8 are more widely used, this group will
> encounter problems with APL code in text msgs -- particularly since the
> only standard APL encoding is Unicode
FYI, I have never (yet) had a problem sending APL in a *.ZIP file. This
doesn't solve the NG problem but ...
Ted
| |
|
| On Oct 23, 4:43 pm, kai <kaithomas...@googlemail.com> wrote:
>
> ???
>
> There is hardly anything available in APL simpler than dynamic
> functions.
> I used them because one often can avoid creating temporary variables,
> as in this case.
>
> Anyway, here is Dyalog []AV without the control chars for test
> purposes:
>
>           %'
> _abcdefghijklmno
> pqrstuvwxyz  ¯.
> 0123456789 ¤¥$£¢
>  ABCDEFGHIJKLMNO
> PQRSTUVWXYZ  ý·
>  ÁÂÃÇÈÊËÌÍÎÏÐÒÓÔ
> ÕÙÚÛÝþãìðòõ{€}
> ¨ÀÄÅÆ ÉÑÖØÜßàáâä
> åæçèéêëíîïñ[/ \
> < = >   -+÷×?  ~
>     *    (
> |;,         !
>    óôöø"# &'
>        @ùúû^ü' ¶
> : ¿¡    )]  §
>
> In the browser I am editing this message I see all the APL characters.
> However, I am prepared for the worst...
These are not Unicode APL chars and so they do not show up unless you
happen to use the correct code page for APL chars.
For me most of them show up as Íslandish chars.
a
%'
_abcdefghijklmno
pqrstuvwxyz  ¯.
0123456789 ¤¥$£¢
 ABCDEFGHIJKLMNO
PQRSTUVWXYZ  ý·
 ÁÂÃÇÈÊËÌÍÎÏÐÒÓÔ
ÕÙÚÛÝþãìðòõ{€}
¨ÀÄÅÆ ÉÑÖØÜßàáâä
åæçèéêëíîïñ[/ \
< = >   -+÷×?  ~
    *    (
|;,         !
   óôöø"# &'
       @ùúû^ü' ¶
: ¿¡    )]  §
33 10$ a.i.a,' '
37 39 32 32 10 95 97 98 99 100
101 102 103 104 105 106 107 108 109 110
111 10 112 113 114 115 116 117 118 119
120 121 122 32 32 194 175 46 10 48
49 50 51 52 53 54 55 56 57 32
194 164 194 165 36 194 163 194 162 10
32 65 66 67 68 69 70 71 72 73
74 75 76 77 78 79 10 80 81 82
83 84 85 86 87 88 89 90 32 32
195 189 194 183 10 32 195 129 195 130
195 131 195 135 195 136 195 138 195 139
195 140 195 141 195 142 195 143 195 144
195 146 195 147 195 148 10 195 149 195
153 195 154 195 155 195 157 195 190 195
163 195 172 195 176 195 178 195 181 123
226 130 172 125 32 32 10 194 168 195
128 195 132 195 133 195 134 32 195 137
195 145 195 150 195 152 195 156 195 159
195 160 195 161 195 162 195 164 10 195
165 195 166 195 167 195 168 195 169 195
170 195 171 195 173 195 174 195 175 195
177 91 47 32 92 10 60 32 61 32
62 32 32 32 45 43 195 183 195 151
63 32 32 126 10 32 32 32 32 42
32 32 32 32 40 32 32 32 32 32
32 10 124 59 44 32 32 32 32 32
32 32 32 32 33 32 32 10 32 32
32 195 179 195 180 195 182 195 184 34
35 32 38 39 32 32 32 32 10 32
32 32 32 32 32 32 64 195 185 195
186 195 187 94 195 188 39 32 194 182
10 58 32 194 191 194 161 32 32 32
32 41 93 32 32 194 167 32 10 32
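The two-byte pairs starting with 194 or 195 in this table are what one would expect if single 8-bit character codes had been read as Latin-1 and then re-encoded as UTF-8. A minimal Python sketch of that round trip follows; the helper name utf8_values and the choice of code pages are assumptions made for illustration.

```python
def utf8_values(byte_value, codepage="latin-1"):
    """Decode one 8-bit code in the given code page, return its UTF-8 bytes."""
    character = bytes([byte_value]).decode(codepage)
    return list(character.encode("utf-8"))


print(utf8_values(0xAF))            # macron          -> [194, 175]
print(utf8_values(0xF7))            # division sign   -> [195, 183]
print(utf8_values(0xD7))            # multiplication  -> [195, 151]
print(utf8_values(0x80, "cp1252"))  # euro sign       -> [226, 130, 172]
```

The three-byte sequence 226 130 172 in the table suggests the 0x80 code was interpreted as Windows-1252 rather than strict Latin-1, since only the former maps it to the euro sign.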
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 26, 8:24 am, phil chastney
<phil.hates.s...@amadeus.munged.eclipse.co.uk> wrote:
> Phil Last wrote:
>
>
> OK, time to get picky
>
> you encoded your text using Unicode, and transmitted it, presumably,
> using UTF-8
>
> what you see, after this stream has made the trip to Google and back,
> occupies as many print positions as the original text, if I understand
> you correctly
>
> this strongly suggests that the same flavour of UTF was used to encode
> the text for transmission, and then to decode the incoming byte stream
> for display --> unless, of course, it wasn't encoded using UTF at all,
> but just sent as a common or garden 8-bit encoding
>
> in either case, the hexadecimal values received should match those sent
> -- I rather doubt if any of the sent values were replaced by 0x0020, so
> the problem may be the display at your end
>
> so, in your position, I would
> (i) ascertain that the original text was encoded using Unicode
> (cut and paste into Wordpad is quick, but not foolproof)
> (ii) check your email settings, to ensure that you send and receive
> using UTF-8
> (iii) view the received message using different encodings, to ensure
> that this particular msg was sent using UTF-8 (i.e, didn't
> inherit an 8-bit encoding from a previous msg)
> (iv) check the font you're using for display -- does it have APL
> symbols in it?
>
> HTH, and all the best . . . /phil
The full picture. I am replying through the supplied google groups
interface under Firefox, not through email, which would be via
Thunderbird if I did. In any case character encoding is set to UTF-8
and my default font is set as "apl385 unicode" in both clients and for
all circumstances. I am using the Dyalog supplied IME and can switch
between apl and non-apl at the press of a button. With apl on I can
type any apl character that the IME implements and see it displayed
correctly on the screen. This works in firefox, thunderbird, open
office, notepad .... I can write apl in word docs, spreadsheets, text
files, emails ... . I can save them, send them, receive them. I can
create events in google calendar in apl characters and save them and
go back later and there they are.
I try it in comp-lang-apl. I see them. I send. I look again. They're
gone.
Phil
| |
| Markus Triska 2007-10-27, 6:57 pm |
| Phil Last <phil.last@ntlworld.com> writes:
> I try it in comp-lang-apl. I see them. I send. I look again. They're
> gone.
The headers of your messages consistently say:
Content-Type: ... charset="us-ascii"
If you want to send Unicode APL programs, you must get your
browser/newsreader to use UTF-8 or a similar encoding for sending.
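A short Python sketch of why a us-ascii declaration is fatal for APL text. The helper name to_ascii_lossy is invented for the example, and the '?' substitute is simply what Python's "replace" error handler emits; a real gateway may substitute blanks or drop the characters instead.

```python
def to_ascii_lossy(text):
    """Force text into ASCII, substituting '?' for anything unencodable."""
    return text.encode("ascii", errors="replace").decode("ascii")


expression = "{\u2375\u00d7+\\\u2375}"  # {omega times plus-scan omega}

try:
    expression.encode("ascii")
except UnicodeEncodeError as err:
    print("strict ASCII encoding fails:", err.reason)

print(to_ascii_lossy(expression))  # {??+\?}
```

Only the braces, the plus and the backslash survive, which is much the same picture as the mangled dynamic functions earlier in this thread.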
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 27, 3:16 pm, Markus Triska <e0225...@stud4.tuwien.ac.at> wrote:
> Phil Last <phil.l...@ntlworld.com> writes:
>
> The headers of your messages consistently say:
>
> Content-Type: ... charset="us-ascii"
>
> If you want to send Unicode APL programs, you must get your
> browser/newsreader to use UTF-8 or a similar encoding for sending.
Markus,
Can I ask how you know this? I look at the source of the page and
"charset=" occurs twice, once at the beginning where it sets the
charset for the page to UTF-8 and once in the text of your message
where it says "charset=(ampersand)quot;us-ascii(ampersand)quot;" (my
transliteration).
Phil
| |
|
| From: Markus Triska <e0225855@stud4.tuwien.ac.at>
Date: Sat, 27 Oct 2007 18:39:05 +0200
Message-ID: <m1640swlzq.fsf@gmx.at>
Cancel-Lock: sha1:xacdt5GYABu43EzDtnY1jSt8iCE=
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Lines: 27
NNTP-Posting-Host: news-access-from.tuwien.ac.at
X-Trace: 1193503511 tunews.univie.ac.at 11868 192.35.241.118
X-Complaints-To: abuse@tuwien.ac.at
Bytes: 2266
Xref: number1.nntp.dca.giganews.com comp.lang.apl:22684
Phil Last <phil.last@ntlworld.com> writes:
> how you know this.
In Gnus, press "t" to toggle visibility of headers. With Google
groups, click on "More options" -> "Show original"; your message:
http://groups.google.com/group/comp...ec?dmode=source
Compare that with the UTF-8 encoded message I sent previously:
http://groups.google.com/group/comp...fbb86d832a73819
whose headers you see here:
http://groups.google.com/group/comp...19?dmode=source
The important difference lies in the Content-Type/charset field.
> I look at the source of the page and "charset=" occurs twice, once
> at the beginning where it sets the charset for the page to UTF-8 and
That's right, the page Google sends us to show all messages sent to
this thread is (correctly) marked and sent as UTF-8; however, at issue
is your individual message, from whose header you see that it is
attributed as ASCII, which cannot generally encode Unicode APL chars.
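For those who prefer to check the declared charset programmatically rather than by eye, here is a hedged Python sketch using the standard email package. The helper name declared_charset and the sample message text are made up for the example; only the Content-Type line mirrors the header quoted above.

```python
from email import message_from_string


def declared_charset(raw_message):
    """Return the charset named in the message's Content-Type header."""
    message = message_from_string(raw_message)
    return message.get_content_charset()


sample = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "body text goes here\n"
)
print(declared_charset(sample))  # us-ascii
```

The value comes from the individual message's header, not from the web page that later displays it.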
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 27, 5:22 pm, Phil Last <phil.l...@ntlworld.com> wrote:
> On Oct 27, 3:16 pm, Markus Triska <e0225...@stud4.tuwien.ac.at> wrote:
>
>
>
>
>
> Markus,
> Can I ask how you know this? I look at the source of the page and
> "charset=" occurs twice, once at the beginning where it sets the
> charset for the page to UTF-8 and once in the text of your message
> where it says "charset=(ampersand)quot;us-ascii(ampersand)quot;" (my
> transliteration).
> Phil
Further to this, I guess you must send and receive emails to and from
this group. As I said in my last but one message I do not but my
browser is set to send, receive and display UTF-8.
If you are receiving an email from me it must be constructed by google
and the 'charset="us-ascii"' added by that source.
Again, as stated above, it works for every other application/page I
have tried including, I have just discovered, dyalogusers on yahoo.
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 27, 5:39 pm, Markus Triska <e0225...@stud4.tuwien.ac.at> wrote:
> Phil Last <phil.l...@ntlworld.com> writes:
>
> In Gnus, press "t" to toggle visibility of headers. With Google
> groups, click on "More options" -> "Show original"; your message:
>
> http://groups.google.com/group/comp...e6dfddec?dmo...
>
> Compare that with the UTF-8 encoded message I sent previously:
>
> http://groups.google.com/group/comp...fbb86d832a73819
>
> whose headers you see here:
>
> http://groups.google.com/group/comp...32a73819?dmo...
>
> The important difference lies in the Content-Type/charset field.
>
>
> That's right, the page Google sends us to show all messages sent to
> this thread is (correctly) marked and sent as UTF-8; however, at issue
> is your individual message, from whose header you see that it is
> attributed as ASCII, which cannot generally encode Unicode APL chars.
What header? What individual message? All I see is a page of messages!
There are no individual headers!
| |
| Markus Triska 2007-10-27, 6:57 pm |
| Phil Last <phil.last@ntlworld.com> writes:
> What header? What individual message? All I see is a page of messages!
> There are no individual headers!
In Google groups, at the top right of each message, there is a link
with the caption "More options"; click it to reveal other links for
each message, and click "Show original" to see the message as it
arrived at and is propagated by the news server. If a message contains
the wrong encoding type in its header fields, there is little hope
that it can show up correctly in the page showing all messages.
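
A small Python sketch of why a wrong charset label matters, assuming the body bytes are UTF-8 while the header claims us-ascii (the variable names are only illustrative, and real news software may behave differently):

```python
# Sketch: UTF-8 bytes read under two different charset labels.
text = "⍵×+\\⍵"
raw = text.encode("utf-8")  # the bytes as they would travel over the wire

# A reader that trusts a charset="us-ascii" label cannot decode the
# non-ASCII bytes; with errors="replace" they become U+FFFD placeholders.
as_labelled = raw.decode("us-ascii", errors="replace")

# A reader told the real encoding recovers the original text.
as_sent = raw.decode("utf-8")

print(as_labelled == text, as_sent == text)
```

The ASCII part of the expression (`+\`) survives either way, which is why plain-English text in the same message still looks fine.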
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 27, 6:10 pm, Markus Triska <e0225...@stud4.tuwien.ac.at> wrote:
> Phil Last <phil.l...@ntlworld.com> writes:
>
> In Google groups, at the top right of each message, there is a link
> with the caption "More options"; click it to reveal other links for
> each message, and click "Show original" to see the message as it
> arrived at and is propagated by the news server. If a message contains
> the wrong encoding type in its header fields, there is little hope
> that it can show up correctly in the page showing all messages.
Got you.
Thanks.
OK, the message shown by "Show original" comes neither from me nor via
my email server.
It is being constructed by Google Groups from the text received from
my browser, presumably for the purposes of forwarding it to members who
receive emails from the group.
I am writing my message in UTF-8.
Firefox is transmitting the page in UTF-8.
GOOGLE is changing it to us-ascii!
| |
| Markus Triska 2007-10-27, 6:57 pm |
| Phil Last <phil.last@ntlworld.com> writes:
> As I said in my last but one message I do not but my browser is set
> to send, receive and display UTF-8.
Still, all your messages contain the attribute charset="us-ascii" in
their "Content-Type" header field. If your message is UTF-8 encoded
and uses characters not in UTF-8's ASCII subset, the header fields
should say so! Consider trying a different newsreader - for Gnus, add:
(setq mm-coding-system-priorities '(utf-8))
to your .emacs to use UTF-8 as encoding for outgoing messages whenever
ASCII cannot be used; Gnus will add the right header as well.
> Again, as stated above, it works for every other application/page I
> have tried including, I have just discovered, dyalogusers on yahoo.
Probably unrelated; they could be smart enough to correct wrong
headers by looking at the message content, seeing it's likely UTF-8.
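
One way such a correction could work, sketched in Python. This is a guess at a plausible heuristic, not a description of what any particular site actually does; `decode_body` is a hypothetical name.

```python
def decode_body(raw: bytes, declared: str) -> str:
    """Decode message bytes, preferring UTF-8 whenever the bytes are valid UTF-8.

    Assumption: text in a legacy 8-bit charset that contains non-ASCII
    characters is rarely also valid UTF-8, so a strict UTF-8 decode that
    succeeds is taken as evidence that the declared charset is wrong.
    """
    try:
        return raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        return raw.decode(declared, errors="replace")

# UTF-8 bytes mislabelled as us-ascii are still recovered.
recovered = decode_body("⍵×+\\⍵".encode("utf-8"), "us-ascii")
# A lone Latin-1 byte is not valid UTF-8, so the declared charset is used.
fallback = decode_body(b"\xd7", "latin-1")
```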
| |
| Phil Last 2007-10-27, 6:57 pm |
| On Oct 27, 6:28 pm, Markus Triska <e0225...@stud4.tuwien.ac.at> wrote:
> Phil Last <phil.l...@ntlworld.com> writes:
>
> Still, all your messages contain the attribute charset="us-ascii" in
> their "Content-Type" header field. If your message is UTF-8 encoded
> and uses characters not in UTF-8's ASCII subset, the header fields
> should say so! Consider trying a different newsreader - for Gnus, add:
>
> (setq mm-coding-system-priorities '(utf-8))
>
> to your .emacs to use UTF-8 as encoding for outgoing messages whenever
> ASCII cannot be used; Gnus will add the right header as well.
>
>
> Probably unrelated; they could be smart enough to correct wrong
> headers by looking at the message content, seeing it's likely UTF-8.
Please see previous message. I am not using either a newsreader or an
email client. I am using Firefox which works fine. The headers you are
seeing are put there by GOOGLE. It is GOOGLE that gets it wrong!
| |
| Markus Triska 2007-10-27, 6:57 pm |
| Phil Last <phil.last@ntlworld.com> writes:
> I am using Firefox which works fine. The headers you are seeing are
> put there by GOOGLE. It is GOOGLE that gets it wrong!
It is possible; however, I've seen Google groups working reliably for
many different encodings. In my opinion, you are overly sure that
Firefox gets it right, which was not shown satisfactorily (as I said,
its working in other cases doesn't prove it, as the host application
can always guess the right encoding even with wrong headers). I've
never seen Google groups changing a message's encoding if it was
(correctly) supplied by the client application, whether browser or
Usenet reader; to make sure Firefox gets it right, you have to apply a
different test. You could also try a different browser.
| |
| Markus Triska 2007-10-27, 6:57 pm |
| Phil Last <phil.last@ntlworld.com> writes:
> Firefox is transmitting the page in UTF-8.
It could be that Firefox transmits your message in UTF-8 but attaches
the wrong encoding header; how did you test that this is not the case?
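
One possible test, sketched in Python with the standard `email` package: take the raw text from "Show original" and compare the declared charset with the bytes of the body. The raw message below is a made-up stand-in, not a real message from this thread.

```python
from email import message_from_bytes

# Hypothetical raw message: the header says us-ascii, the body is UTF-8.
raw = (
    b'Content-Type: text/plain; charset="us-ascii"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
) + "{⍵×+\\⍵}\r\n".encode("utf-8")

msg = message_from_bytes(raw)
declared = msg.get_content_charset()  # charset named in Content-Type
body = msg.get_payload(decode=True)   # body as raw bytes

# The label is inconsistent if it says ASCII but the body has bytes >= 0x80.
mismatch = declared == "us-ascii" and not body.isascii()
print(declared, mismatch)
```

This only shows that header and body disagree; it does not say which program along the way attached the wrong label.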