Non-printing Characters
| Ryan Frantz 2006-02-23, 9:55 pm |
| I've got a few reports that are generated by a third-party app that we
use and the raw report files include incomprehensible strings at the
beginning of each page like so:
=1B&k2S=1B&l6D
Sometimes there are control characters (shown below as normal text, i.e.
^D~Q) throughout the file:
1=FA=F9^E~I/~X/~C=F9;SELF FUNDING
ADMINISTRATORS=F97PAGE:=F9^E1=FA=F9^ELXC
VELEX=F9?ELIGIBILITY EXTRACT=FA
=FA GROUP: =C2~@=FA=F9^HANNE ARUNDEL MEDICAL CENTER=FA =FA =FA GROUP
NAME=F9^H: ANNE ARUNDEL MEDICAL CENTER=F9^FPLAN: =C2~@ RX CARD=FA AS OF
~H/~A/~V~C=FA =FA TOTAL PARTICIPANTS :=F9^D~Q=FA TOTAL
DEPENDENTS=F9^C:=F9^D~M=FA =FA DOS FILENAME :
l:/work/ex~C~I~X=EE=B3=FA=FB
What kind of text/coding/whatever is this and what tools are available
that I can use to search for this content (to remove it)? I've never
really had to deal with anything other than alphanumeric characters. Do
I need to search for certain octal/hex codes to find these?
Ryan
| |
| Tom Phoenix 2006-02-23, 9:55 pm |
| On 2/23/06, Ryan Frantz <ryanfrantz@informed-llc.com> wrote:
> I've got a few reports that are generated by a third-party app that we use
> and the raw report files include incomprehensible strings at the beginning
> of each page like so:
>
> =1B&k2S=1B&l6D
I wonder whether those are supposed to be escape sequences, as one
might send to a video terminal. Those typically are signalled by a
non-printing character (such as "\x1b", esc) followed by... well,
anything goes, really. But you can figure out the rules for your data.
Or, maybe you can tell your third-party app that you're not using a
fancy video terminal. Check its docs to see whether it respects the
TERM environment variable.
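
If the strings do turn out to be escape sequences, a substitution along these lines might remove them. This is only a sketch: it assumes every sequence has the same shape as the two in the sample (ESC, which the archive shows as =1B, then one punctuation character, one lowercase letter, an optional number, and a closing uppercase letter). The sub name strip_escapes and the sample string are made up for illustration, and real data may well follow other rules.

```perl
use strict;
use warnings;

# Assumption: every sequence looks like the two in the sample report:
# ESC, one punctuation character (! through /), one lowercase letter,
# an optional number, and a terminating uppercase letter.
sub strip_escapes {
    my ($text) = @_;
    $text =~ s{\x1b[!-/][a-z][-+.\d]*[A-Z]}{}g;
    return $text;
}

# Hypothetical page header built from the two sequences in the post.
my $page = "\x1b&k2S\x1b&l6DELIGIBILITY EXTRACT\n";
print strip_escapes($page);
```

Anything that does not fit the assumed shape is left alone, so a dump of the output will show which sequences the pattern still misses.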
> What kind of text/coding/whatever is this and what tools are available that
> I can use to search for this content (to remove it)?
Tools? You've got Perl! :-)
The trick is to know how to tell Perl how to tell the wheat from the
chaff. If you can dump the file contents, you might be able to figure
out the encoding. In these cases I use the Unix command 'od -xc
somefile | less' to see what I can see of a file's format.
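
Where 'od' is not at hand, a similar dump can be sketched in Perl itself. The sub name hex_dump and the 16-bytes-per-row layout are choices made for this example, not anything 'od' prescribes; the sample string is the pair of sequences from the original post.

```perl
use strict;
use warnings;

# Format a string as rows of 16 bytes: offset, hex values, printable view.
sub hex_dump {
    my ($data) = @_;
    my $out = '';
    for (my $pos = 0; $pos < length $data; $pos += 16) {
        my $chunk = substr $data, $pos, 16;
        my $hex   = join ' ', unpack '(H2)*', $chunk;   # two hex digits per byte
        (my $text = $chunk) =~ tr/\x20-\x7e/./c;        # dots for non-printing bytes
        $out .= sprintf "%08x  %-47s  %s\n", $pos, $hex, $text;
    }
    return $out;
}

# Sample bytes taken from the sequences shown in the original post.
print hex_dump("\x1b&k2S\x1b&l6D");
```

For a real report, the data would come from a filehandle opened on the file with binmode set, so that no byte is translated on the way in.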
Another way to see what's inside would be to step through a simple
program in Perl's debugger, using the 'x' command periodically to
examine a variable's contents. (In this case, $_.)
perl -ndebug somefile
Does this help you to make any progress? If you can see what's going
on in the file, but you can't turn that knowledge into Perl code, let
us know. Good luck with it!
--Tom Phoenix
Stonehenge Perl Training
| |
| Ryan Frantz 2006-02-23, 9:55 pm |
|
> -----Original Message-----
> From: tom.phoenix@gmail.com [mailto:tom.phoenix@gmail.com] On Behalf Of
> Tom Phoenix
> Sent: Thursday, February 23, 2006 2:43 PM
> To: Ryan Frantz
> Cc: beginners@perl.org
> Subject: Re: Non-printing Characters
>
> On 2/23/06, Ryan Frantz <ryanfrantz@informed-llc.com> wrote:
>
> I wonder whether those are supposed to be escape sequences, as one
> might send to a video terminal. Those typically are signalled by a
> non-printing character (such as "\x1b", esc) followed by... well,
> anything goes, really. But you can figure out the rules for your data.
Typically, the reports are generated and sent directly to the printer.
I do have the ability to save the report in a spool so that I can print
it at my leisure and even print it to a text file (which I've done).
However, those stray strings are present. I believe that they are
commands (or remnants of commands) sent to the printer. I (sort of)
know this because I used to work for the vendor that wrote the
(cr)application (as a tech, not a developer).
>
> Or, maybe you can tell your third-party app that you're not using a
> fancy video terminal. Check its docs to see whether it respects the
> TERM environment variable.
>
> Tools? You've got Perl! :-)
Obviously. ;) Since I'm not really sure what I'm looking at, I don't
know what questions to ask.
>
> The trick is to know how to tell Perl how to tell the wheat from the
> chaff. If you can dump the file contents, you might be able to figure
> out the encoding. In these cases I use the Unix command 'od -xc
> somefile | less' to see what I can see of a file's format.
This is helpful. I've never used 'od' but I'm aware of it. I can start
using/learning it.
>
> Another way to see what's inside would be to step through a simple
> program in Perl's debugger, using the 'x' command periodically to
> examine a variable's contents. (In this case, $_.)
>
> perl -ndebug somefile
>
> Does this help you to make any progress? If you can see what's going
> on in the file, but you can't turn that knowledge into Perl code, let
> us know. Good luck with it!
Yes. This gives me a place to start looking and learning more about
this type of content so that I can ask better questions next go 'round.
Obliged.
>
> --Tom Phoenix
> Stonehenge Perl Training
| |
| usenet@DavidFilmer.com 2006-02-23, 9:55 pm |
| Ryan Frantz wrote:
> Yes. This gives me a place to start looking and learning more about
> this type of content so that I can ask better questions next go 'round.
You might also try looking at the file in a hex editor. Some
characters don't display on terminals (or cause goofy behavior, like
backspaces), but if you view it in a hex editor, you may see patterns
or indicators/flags that aren't apparent when viewed on an ordinary
terminal.
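
One way to hunt for such patterns without a hex editor is to tally the bytes that fall outside printable ASCII. The sketch below is an illustration only: the sub name count_nonprinting is invented here, and the sample string is a guess at how a fragment of the report might look as raw bytes (0xF9 and 0xFA as shown in the post, with ^H and ^F read as the control characters 0x08 and 0x06).

```perl
use strict;
use warnings;

# Count each byte outside printable ASCII, ignoring tab, LF and CR.
sub count_nonprinting {
    my ($data) = @_;
    my %count;
    while ($data =~ /([^\x20-\x7e\t\n\r])/g) {
        my $label = sprintf '0x%02X', ord $1;
        $count{$label}++;
    }
    return \%count;
}

# Hypothetical fragment; the byte values are an assumption about the raw file.
my $sample = "GROUP NAME\xf9\x08: PLAN\xf9\x06\xfa";
my $tally  = count_nonprinting($sample);
printf "%s  %d\n", $_, $tally->{$_} for sort keys %$tally;
```

A byte that turns up far more often than the rest is a good candidate for a field separator or a command prefix.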
--
http://DavidFilmer.com