Cobol Copybook Parsing
| Hi,
I am a Java programmer trying to parse COBOL copybooks and convert
them to XSD.
I was wondering if there are any sample copybooks and associated data
available anywhere on the net? The problem is that I am still wading
through COBOL, so I thought those samples would help clear up some doubts.
Either way, if anyone has an idea, please answer/confirm the
following:
1. Firstly, I have concluded that all fields in COBOL are fixed
length (except for some DEPENDING ON stuff that I still have to read). What
would be the size of a field which has a V (decimal point)? Would the
data in the file have a "." or only the digits?
2. Similarly, for edited numerics and alphanumerics, would the data file
contain the edits like $ , . etc.? I.e., are they in the data or are they
inserted?
3. PIC S9(5)V9(2) ... Would the size of this be 7 (5+2)?
4. Is data arranged sequentially in these files, one record after
another with no delimiters?
Thanks,
Ed
| |
| Chuck Stevens 2004-07-01, 3:55 pm |
|
"Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407010650.45e1c30d@posting.google.com...
> Hi,
> I am a java programmer trying to parse cobol copybooks and convert
> them to xsd.
It'd be a good idea to scrounge up a COBOL reference manual or textbook
somewhere. I suspect it'd help a lot.
COBOL files that are the targets of the COPY verb can basically contain
anything, up to and including the entirety of a program. The use of COPY is
not limited to data descriptions.
> I was wondering if there were any sample copybooks and associated data
> available anywhere on the net ? The problem is that I am still wading
> through cobol, so I thought those samples would help clear some doubts.
> Either ways if anyone else has an idea then please answer/confirm the
> following:
>
> 1. Firstly I have concluded that all fields in cobol are fixed
> length (except for some DEPENDING ON stuff that I have to read). What
> would be the size of a field which has a V (decimal point). Would the
> data in the file have a "." or only the digits.
"V" can only appear in an *unedited* numeric PICTURE; it tells you where the
decimal point is when the data item is manipulated. It doesn't appear in
the data.
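For a parser, this means the decimal point has to be supplied from the copybook, not read from the file. A minimal Java sketch, assuming the field is unsigned USAGE DISPLAY and has already been decoded to a string of digit characters (the class and method names are illustrative, not from any library):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class ImpliedDecimal {
    /**
     * Interpret the stored digits of an unedited numeric field.
     * scale is the number of 9s after the V in the PICTURE.
     */
    static BigDecimal fromDigits(String digits, int scale) {
        return new BigDecimal(new BigInteger(digits), scale);
    }

    public static void main(String[] args) {
        // PIC 9(5)V9(2): seven digit characters, two after the implied point.
        System.out.println(fromDigits("1234567", 2)); // prints 12345.67
    }
}
```

BigDecimal is used rather than double so that no digits are lost to binary rounding.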
> 2. Similarly for edited numerics and alphanumerics would the data file
> contain the edits like $,. etc. i.e are they in the data or are they
> inserted.
$ is one of a few special cases in that if there's more than one of them in
the PICTURE character-string it indicates that it can float, but yes, they
are in the data:
01 A PIC S9(5)V99 value -12345.67.
01 B PIC $$$,$$$9.99-.
MOVE A TO B.
B will contain "b$12,345.67-" (where the character "b" represents a blank space).
MOVE 37.2 TO B.
B will contain "bbbbb$37.20b".
> 3. PIC S9(5)V9(2) ... Would the size of this be 7(5+2) ?
Presuming it's USAGE DISPLAY, and there's no applicable SIGN clause on the
item that states otherwise, yes, the size of the item would be seven
characters. Were it USAGE PACKED-DECIMAL, it's most likely the size would
be eight four-bit digits (one of which is the sign).
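The two sizes mentioned here can be computed from the digit count. The sketch below is only an illustration: it assumes the common layout in which a DISPLAY digit takes one character (plus one more if SIGN IS SEPARATE) and a packed-decimal item always carries a sign nibble and is rounded up to whole bytes. As noted elsewhere in this thread, not every implementation does that. The class and method names are made up for the example.

```java
final class FieldSize {
    /** USAGE DISPLAY: one character per digit, plus one if SIGN IS SEPARATE. */
    static int displayBytes(int digits, boolean signSeparate) {
        return digits + (signSeparate ? 1 : 0);
    }

    /**
     * Packed decimal under the assumed layout: one nibble per digit plus
     * one sign nibble, rounded up to a whole number of bytes.
     */
    static int packedBytes(int digits) {
        return digits / 2 + 1;
    }

    public static void main(String[] args) {
        // PIC S9(5)V9(2) has seven digits.
        System.out.println(displayBytes(7, false)); // prints 7
        System.out.println(packedBytes(7));         // prints 4
    }
}
```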
> 4. Is data arranged sequentially in these files, one record after
> another with no delimiters?
Fields are arranged in records in memory without delimiters (though there
may be "slack bits" in some implementations and in some circumstances
between two data items). How records are arranged in files is
implementor-dependent and hardware-dependent (where are the files?).
-Chuck Stevens
| |
| Robert Wagner 2004-07-01, 3:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote:
>4. Is data arranged sequentially in these files, one record after
>another with no delimiters?
Usually. If the file was declared ASCII text (LINE SEQUENTIAL), trailing spaces
will have been removed and the usual line terminator added. Some Cobols use a
proprietary file system that adds a 2-4 byte header to each record.
Just about all Cobols come with a utility program that can copy files to ASCII
text format -- provided there are no BINARY, PACKED DECIMAL or COMP-n fields. If
the file was intended for interchange, it won't use those forms and will
probably be ASCII text. If the file is internal to the Cobol system, there is a
good chance it will use those forms.
BINARY are words, half-words or long words.
PACKED DECIMAL uses four bits (nybble) per digit plus one nybble for the sign,
rounded up to integral bytes.
COMP-n can be a variety of things ranging from float to big-endian binary. COMP
wasn't in the Cobol language standard, so each compiler defines it differently.
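The PACKED DECIMAL layout described in this post (one nibble per digit, sign in the last nibble) could be decoded along these lines. This is a sketch under assumptions, not a definitive decoder: it assumes the IBM sign convention in which a final nibble of 0xD or 0xB means negative, and the class name is invented for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class PackedDecimal {
    /**
     * Decode a packed-decimal field: two digits per byte, except the last
     * byte, whose low nibble is the sign (assumed: 0xD or 0xB is negative,
     * anything else is treated as positive or unsigned).
     */
    static BigDecimal decode(byte[] field, int scale) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < field.length; i++) {
            appendDigit(digits, (field[i] >> 4) & 0x0F);
            if (i < field.length - 1) {
                appendDigit(digits, field[i] & 0x0F);
            }
        }
        int sign = field[field.length - 1] & 0x0F;
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0D || sign == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    private static void appendDigit(StringBuilder digits, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("not a decimal digit: " + nibble);
        }
        digits.append((char) ('0' + nibble));
    }

    public static void main(String[] args) {
        // PIC S9(5)V9(2) packed into four bytes, value -12345.67.
        byte[] field = {0x12, 0x34, 0x56, 0x7D};
        System.out.println(decode(field, 2)); // prints -12345.67
    }
}
```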
| |
| Michael Mattias 2004-07-01, 3:55 pm |
| Well, if you can find access to a PC which can run MS-DOS programs...
.. and can get the copybook onto that PC....
How about a piece of FREE software which will parse that copybook and give
you a nice formatted report showing the offset and size of every elementary
data item?
Can't complain about the price (although the software itself looks its age...
which is ten years...)
Get it at:
http://www.flexus.com/ftp/cobfd.zip
--
Michael Mattias
Tal Systems, Inc.
Racine WI
mmattias@talsystems.com
"Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407010650.45e1c30d@posting.google.com...
> Hi,
> I am a java programmer trying to parse cobol copybooks and convert
> them to xsd.
> I was wondering if there were any sample copybooks and associated data
> available anywhere on the net ? The problem is that I am still wading
> through cobol, so I thought those samples would help clear some doubts.
| |
| Chuck Stevens 2004-07-01, 3:55 pm |
| "Robert Wagner" <robert.deletethis@wagner.net> wrote in message
news:40e43993.526145272@news.optonline.net...
> Usually. If the file was declared ASCII text (LINE SEQUENTIAL), trailing spaces
> will have been removed and the usual line terminator added.
LINE SEQUENTIAL is a (common) vendor extension, not standard COBOL.
> Just about all Cobols come with a utility program that can copy files to ASCII
> text format -- provided there are no BINARY, PACKED DECIMAL or COMP-n fields.
I'd expect that with COBOLs from twenty years ago, but compilers written
according to the 1985 standard include INSPECT ... CONVERTING, which can do
the translation within the program. The CODE-SET clause appears to have
been in standard COBOL for 36 years now.
> If the file was intended for interchange, it won't use those forms and will
> probably be ASCII text.
Unless it's intended for interchange among machines that use EBCDIC. Or,
for that matter, a specific national character set representation --
Unicode's becoming rather popular these days, I hear.
> BINARY are words, half-words or long words.
BINARY items are whatever the implementor decides them to be, starting with
the '85 standard. BINARY was always an implementor extension to COBOL prior
to that.
BINARY-CHAR UNSIGNED must handle the range 0 thru (2**8) - 1; its format is
otherwise unspecified.
BINARY-SHORT UNSIGNED must handle the range 0 thru (2**16) - 1; ditto.
BINARY-LONG UNSIGNED must handle the range 0 thru (2**32) - 1; ditto.
BINARY-DOUBLE UNSIGNED must handle the range 0 thru (2**64) - 1; ditto.
Signed versions are similar (-127 to +127 instead of 0 to 255, for example).
Note that each of these USAGEs may contain values beyond these ranges at the
implementor's discretion; these are minimum ranges.
> PACKED DECIMAL uses four bits (nybble) per digit plus one nybble for the sign,
> rounded up to integral bytes.
On some implementations. The standard requires only that a radix of 10 is
used to represent the item, and that each digit position shall occupy the
minimum possible configuration in the computer's memory, however that might be
configured.
Not all implementations round up to the nearest byte, and not all
implementations allocate memory for a sign if the programmer has not
specified a sign. (Personally, I find it difficult to believe that there
are still implementations that require packed-decimal items to be aligned on
a byte boundary, and that require that they allow space for a sign even when
the programmer has requested otherwise. "Quaint" is way too weak a term;
perhaps "neolithic"?)
> COMP-n can be a variety of things ranging from float to big-endian binary. COMP
> wasn't in the Cobol language standard, so each compiler defines it differently.
None of the COMP-n USAGEs have *ever* been in the COBOL language standard.
They weren't in '68, or '74, or '85, and they *aren't* in the current 2002
standard or in the proposed draft for 2008. While they "weren't" in the
standard, it's important to point out that not only has that situation *not
changed*, it is almost certain *not ever to change*.
And not all compilers define the COMP-n USAGEs *differently*; some compilers
don't define them *at all*. And even for those that do, there's no guarantee
that for any given "n" any two implementations of COBOL that DO support
"COMP-n" will bear any similarity to each other.
Is COMP-1 a 48-bit INTEGER item with a sign in the leftmost bit but one,
zeroes in the next seven bits, and an integer value in the low-order 39 bits
(Unisys MCP COBOL [68]), or is it a 32-bit FLOATING-POINT item with the sign
in the left-most bit, an exponent in the next seven bits, and the remaining
24 bits devoted to the mantissa (IBM S/370 DOS/VS ANSI COBOL [68])? Is
COMP-2 a form of PACKED-DECIMAL whose length is variable (MCP COBOL[68]), or
is it a double-precision floating-point item occupying eight bytes (DOS/VS
COBOL[68])?
-Chuck Stevens
| |
| docdwarf@panix.com 2004-07-01, 3:55 pm |
| In article <cc1hbo$1oml$1@si05.rsvl.unisys.com>,
Chuck Stevens <charles.stevens@unisys.com> wrote:
[snip]
>(Personally, I find it difficult to believe that there
>are still implementations that require packed-decimal items to be aligned on
>a byte boundary, and that require that they allow space for a sign even when
>the programmer has requested otherwise. "Quaint" is way too weak a term;
>perhaps "neolithic"?)
Back to the library of Esher in Sumer for me, I guess.
DD
| |
| Richard 2004-07-01, 3:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote
> 2. Similarly for edited numerics and alphanumerics would the data file
> contain the edits like $,. etc. i.e are they in the data or are they
> inserted.
If the field has insertion characters specified in the picture then
some of these will be in the data record field.
> 3. PIC S9(5)V9(2) ... Would the size of this be 7(5+2) ?
It is 7 numeric digits, but the number of bytes may be 7, or 8 if it
has sign separate, or 4 if it or the group is specified as some sort
of COMP~ field.
> 4. Is data arranged sequentially in these files, one record after
> another with no delimiters?
That doesn't seem likely. First it will depend on the file
organization and then it may depend on the brand of the compiler used
and also on several options. Some data files may well be fixed length
records sequentially ordered with no delimiters, others may be packed
or compressed records with headers organized in some way that makes
random access efficient but non-cobol access difficult.
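For the simplest case named here, fixed-length records stored back to back with no delimiters, splitting the file is just arithmetic on offsets. A minimal Java sketch, assuming the whole file fits in memory (for example after reading it with java.nio.file.Files.readAllBytes) and that the record length has been computed from the copybook; the class name is hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FixedLengthRecords {
    /** Cut the file content into consecutive records of recordLength bytes. */
    static List<byte[]> split(byte[] content, int recordLength) {
        if (recordLength <= 0 || content.length % recordLength != 0) {
            // A remainder suggests headers, delimiters or variable-length records.
            throw new IllegalArgumentException(
                "file size " + content.length + " is not a multiple of " + recordLength);
        }
        List<byte[]> records = new ArrayList<>();
        for (int offset = 0; offset < content.length; offset += recordLength) {
            records.add(Arrays.copyOfRange(content, offset, offset + recordLength));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] content = {'A', 'A', 'B', 'B', 'C', 'C'};
        System.out.println(split(content, 2).size()); // prints 3
    }
}
```

The other organizations mentioned (indexed, compressed, vendor-specific headers) need format-specific readers and are not covered by this sketch.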
| |
| Warren Simmons 2004-07-01, 3:55 pm |
| And, I think, if you talk nice to the people at Tiny COBOL, they might
have something that already does this. Their compiler is written in C,
and produces C to compile the program as far as I know.
Warren Simmons
Michael Mattias wrote:
> Well, if you can find access to a PC which can run MS-DOS programs...
> . and can get the copybook onto that PC....
>
> How about a piece of FREE software which will parse that copybook and give
> you a nice formatted report showing the offset and size of every elementary
> data item?
>
> Can't complain about the price (although the software itself looks its age...
> which is ten years...)
>
> Get it at:
> http://www.flexus.com/ftp/cobfd.zip
>
> --
> Michael Mattias
> Tal Systems, Inc.
> Racine WI
> mmattias@talsystems.com
> "Ed" <ed_narayanan@yahoo.com> wrote in message
> news:64c2ece4.0407010650.45e1c30d@posting.google.com...
>
>
>
>
>
| |
| JerryMouse 2004-07-01, 8:55 pm |
| Ed wrote:
> Hi,
> I am a java programmer trying to parse cobol copybooks and convert
> them to xsd.
> I was wondering if there were any sample copybooks and associated data
> available anywhere on the net ? The problem is that I am still wading
> through cobol, so I thought those samples would help clear some doubts.
> Either ways if anyone else has an idea then please answer/confirm the
> following:
I can help you with that. Stay tuned - it will become clear.
>
> 1. Firstly I have concluded that all fields in cobol are fixed
> length (except for some DEPENDING ON stuff that I have to read). What
> would be the size of a field which has a V (decimal point). Would the
> data in the file have a "." or only the digits.
Well, it wouldn't have a decimal point. But the field may contain more than
digits. Or not.
>
> 2. Similarly for edited numerics and alphanumerics would the data file
> contain the edits like $,. etc. i.e are they in the data or are they
> inserted.
Sometimes inserted, or part of the data, or a mixture.
>
> 3. PIC S9(5)V9(2) ... Would the size of this be 7(5+2) ?
Sometimes. It could be 7 (as above), 8 (with sign or binary - at least two
flavors, not counting floating point), or 4 bytes (packed).
>
> 4. Is data arranged sequentially in these files, one record after
> another with no delimiters?
Sometimes. There could be delimiters (of several different kinds). There
could be record descriptors on the front of the file and/or each record. The
file could also be in a proprietary format.
>
> Thanks,
> Ed
You're welcome. I'm glad I was able to help.
Oh, you forgot to ask about other stuff, like scaling ("123" = 123,000 or
0.00123), but you probably don't have any of that.
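Scaling is written with P in the PICTURE: each P is a digit position that takes no storage. A copybook parser can fold V and P into a single scale value. The Java sketch below is an illustration under assumptions: it handles only simple unedited numeric pictures made of S, 9, V, P and repeat counts, and the class and method names are invented for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class PictureScale {
    /** Expand repeat counts, e.g. S9(5)V9(2) becomes S99999V99. */
    static String expand(String picture) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < picture.length(); i++) {
            char c = picture.charAt(i);
            if (c == '(') {
                int close = picture.indexOf(')', i);
                int count = Integer.parseInt(picture.substring(i + 1, close));
                char previous = out.charAt(out.length() - 1);
                for (int k = 1; k < count; k++) {
                    out.append(previous);
                }
                i = close; // skip past the closing parenthesis
            } else {
                out.append(Character.toUpperCase(c));
            }
        }
        return out.toString();
    }

    /** Decimal places of the stored digits; negative when trailing Ps scale the value up. */
    static int scaleOf(String picture) {
        String p = expand(picture).replace("S", "");
        int v = p.indexOf('V');
        String body = p.replace("V", "");
        int pointIndex;
        if (v >= 0) {
            pointIndex = v;             // explicit implied point
        } else if (body.startsWith("P")) {
            pointIndex = 0;             // leading Ps: point is at the far left
        } else {
            pointIndex = body.length(); // otherwise the point is at the far right
        }
        int trailingP = 0;
        for (int i = body.length() - 1; i >= 0 && body.charAt(i) == 'P'; i--) {
            trailingP++;
        }
        return (body.length() - pointIndex) - trailingP;
    }

    static BigDecimal decode(String storedDigits, String picture) {
        return new BigDecimal(new BigInteger(storedDigits), scaleOf(picture));
    }

    public static void main(String[] args) {
        System.out.println(decode("123", "999PPP").toPlainString()); // prints 123000
        System.out.println(decode("123", "VPP999").toPlainString()); // prints 0.00123
    }
}
```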
| |
| Robert Jones 2004-07-01, 8:55 pm |
| Hello Ed,
posting interspersed and at the end
ed_narayanan@yahoo.com (Ed) wrote in message news:<64c2ece4.0407010650.45e1c30d@posting.google.com>...
> Hi,
> I am a java programmer trying to parse cobol copybooks and convert
> them to xsd.
> I was wondering if there were any sample copybooks and associated data
> available anywhere on the net ? The problem is that I am still wading
> through cobol, so I thought those samples would help clear some doubts.
> Either ways if anyone else has an idea then please answer/confirm the
> following:
>
> 1. Firstly I have concluded that all fields in cobol are fixed
> length (except for some DEPENDING ON stuff that I have to read). What
> would be the size of a field which has a V (decimal point). Would the
> data in the file have a "." or only the digits.
Yes, all fields in COBOL are fixed length according to their
definitions, with the exception of data items containing the DEPENDING
ON phrase of the OCCURS clause in their own or their subordinate
entries.
V for a decimal point indicates an implied decimal point, it
occupies no storage. If a "." actual decimal point is used instead,
then that does occupy a byte and can only be used in edited numeric
data items, which are always of usage display or national.
>
> 2. Similarly for edited numerics and alphanumerics would the data file
> contain the edits like $,. etc. i.e are they in the data or are they
> inserted.
>
Yes, the data file would contain these characters, as with the
actual decimal point ".", and each occupies a byte.
> 3. PIC S9(5)V9(2) ... Would the size of this be 7(5+2) ?
>
The space occupied will depend upon the USAGE clause of the
data item. For usage DISPLAY, which is the default and is probably
what you are currently thinking of, each digit occupies one character
position, while the S (for the sign) and the V (for the implied decimal
point) occupy none. The sign of the field is usually held as a
"tweak" to the first or last decimal digit; the exception is where the
SEPARATE phrase is used in the SIGN clause of the data item, in which
case the sign is stored in a separate character. The default,
whether or not a SIGN clause is actually written in the
data definition, is that the sign is not separate.
Yes, the size of your example would be 7 characters/bytes.
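How the sign "tweak" looks on disk depends on the platform. The Java sketch below assumes one specific case, EBCDIC zoned decimal as used on IBM mainframes with the sign on the last digit: every byte carries its digit in the low nibble, and the high nibble of the last byte is 0xD (or 0xB) for negative values. ASCII compilers use other encodings, so treat this as an illustration only; the class name is made up.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class ZonedDecimal {
    /**
     * Decode a signed DISPLAY field, assuming EBCDIC zoned decimal with a
     * trailing embedded sign (zone nibble 0xD or 0xB of the last byte = negative).
     */
    static BigDecimal decode(byte[] field, int scale) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        for (byte b : field) {
            int digit = b & 0x0F;
            if (digit > 9) {
                throw new IllegalArgumentException("not a decimal digit: " + digit);
            }
            digits.append((char) ('0' + digit));
        }
        int zone = (field[field.length - 1] >> 4) & 0x0F;
        BigInteger unscaled = new BigInteger(digits.toString());
        if (zone == 0x0D || zone == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    public static void main(String[] args) {
        // PIC S9(5)V9(2), value -12345.67: seven bytes, sign folded into the last one.
        byte[] field = {(byte) 0xF1, (byte) 0xF2, (byte) 0xF3, (byte) 0xF4,
                        (byte) 0xF5, (byte) 0xF6, (byte) 0xD7};
        System.out.println(decode(field, 2)); // prints -12345.67
    }
}
```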
> 4. Is data arranged sequentially in these files, one record after
> another with no delimiters?
>
Yes, normally, but there is an option in many compilers to
use a file definition that includes the term "line sequential", in
which case each record is terminated by a line feed or carriage return,
depending on the compiler in use.
> Thanks,
> Ed
More generally, I advise you to obtain the manuals for the compiler
with which the programs are used. Two types of manual are of
particular interest, the language reference manual, which gives the
syntax of the COBOL for the specific compiler, and the programmer's
guide, which shows how to put the statements together, compile, link
and execute, etc., with examples. Most vendors have their manuals
online; in particular I know that IBM and Micro Focus definitely do, and
the manuals are free, unlike their compilers.
In the numeric data definitions, you may come across the words COMP,
COMP-3, PACKED-DECIMAL, COMP-n where n is a number. These specify the
usage of the numeric data items and allow the storage to be compressed
and make numeric operations much more efficient, you will definitely
want to read the manual if you come across them.
good luck,
Robert
| |
| Robert Wagner 2004-07-02, 8:55 am |
| "Chuck Stevens" <charles.stevens@unisys.com> wrote:
>"Robert Wagner" <robert.deletethis@wagner.net> wrote in message
>news:40e43993.526145272@news.optonline.net...
>
>I'd expect that with COBOLs from twenty years ago, but compilers written
>according to the 1985 standard include INSPECT ... CONVERTING, which can do
>the translation within the program. The CODE-SET clause appears to have
>been in standard COBOL for 36 years now.
The question was 'How can I read a Cobol file?' It was not 'How can I write a
Cobol program to translate code sets?'
>Is COMP-1 a 48-bit INTEGER item with a sign in the leftmost bit but one,
>zeroes in the next seven bits, and an integer value in the low-order 39 bits
>(Unisys MCP COBOL [68]), or is it a 32-bit FLOATING-POINT item with the sign
>in the left-most bit, an exponent in the next seven bits, and the remaining
>24 bits devoted to the mantissa (IBM S/370 DOS/VS ANSI COBOL [68])? Is
>COMP-2 a form of PACKED-DECIMAL whose length is variable (MCP COBOL[68]), or
>is it a double-precision floating-point item occupying eight bytes (DOS/VS
>COBOL[68])?
I doubt his Java program will encounter a file created by Unisys MCP COBOL [68].
Step back and look at this thread through the eyes of an outsider. An innocent
asks how to read a Cobol file. My response tried to helpfully answer the
question. Your response, to an outsider, appears to revel in the lack of
standardization between platforms and compilers. An insider recognizes yet
another attempt to make RW look stupid.
The outsider would have been better served if you'd answered his question rather
than using it as a launching pad to play CLC politics.
| |
| docdwarf@panix.com 2004-07-02, 8:55 am |
| In article <40e5172f.582885893@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
[snip]
>An insider recognizes yet
>another attempt to make RW look stupid.
Mr Wagner, in another posting you noticed - when it was pointed out in
what appeared to be a rather unsubtle manner - that you bandied about some
rather hostile words. Might it just happen to be possible that just as
you did not see hostility in your own writing you are seeing hostility
where none exists?
Address the facts, Mr Wagner, in as dispassionate a manner as can be
mustered, and the discussion might go differently... on *both* sides.
DD
| |
|
| Hey,
Chuck, Robert, Richard... Thanks for the explanations. The thing you
all mentioned is the system-dependent data file format. Now this is a
problem, because if you have just a data description copybook and its
associated data, you have to be able to parse the data somehow.
I installed WebLogic Workshop, which has a copybook importer.
http://e-docs.bea.com/wlintegration...guid/import.htm
This takes in a copybook and produces a BEA proprietary file called
MFL (Message Format Language). The only information it asks for is the
endianness (big/little) and the encoding (ASCII/EBCDIC/Other(?)). So I would
assume that this is all you need, along with a copybook, to parse a data
file.
So now comes the part where all of you have talked about implementor
dependent/proprietary file system/file org./compiler :
> How records are arranged in files is
> implementor-dependent and hardware-dependent (where are the files?).
> -Chuck Stevens
>First it will depend on the file
>organization and then it may depend on the brand of the compiler used
>and also on several options.
>Some Cobols use a
>proprietary file system that adds a 2-4 byte header to each record.
So I am a bit puzzled about this, as the BEA importer doesn't ask for any
more info. I wonder how they handle the different formats then?
Regards,
Ed
| |
|
| Thanks Michael, this looks really promising. I'll be using this in the
coming days to try and figure out lengths and stuff.
"Michael Mattias" <michael.mattias@gte.net> wrote in message news:<AeXEc.45$AV1.11@newssvr16.news.prodigy.com>...
> Well, if you can find access to a PC which can run MS-DOS programs...
> . and can get the copybook onto that PC....
>
> How about a piece of FREE software which will parse that copybook and give
> you a nice formatted report showing the offset and size of every elementary
> data item?
>
> Can't complain about the price (although the software itself looks its age...
> which is ten years...)
>
> Get it at:
> http://www.flexus.com/ftp/cobfd.zip
>
> --
> Michael Mattias
> Tal Systems, Inc.
> Racine WI
> mmattias@talsystems.com
> "Ed" <ed_narayanan@yahoo.com> wrote in message
> news:64c2ece4.0407010650.45e1c30d@posting.google.com...
| |
| Robert Wagner 2004-07-02, 3:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote:
>Hey,
>Chuck, Robert, Richard... Thanks for the explanations. The thing you
>all mentioned is the system dependent data file format. Now this is a
>problem cause if you have just a data description copybook and its
>associated data you have to be able to parse the data somehow.
If I were writing the parser, I would have it also look at the data file .. to
determine whether records have a header and whether they are ASCII terminated.
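That kind of inspection could start with a crude guess like the one below. It is only a heuristic sketch, not how any real tool works: it assumes ASCII line feeds as terminators (EBCDIC files would use a different byte), treats any file whose lines all fit within the copybook's record length as line sequential, and does not try to recognize vendor-specific record headers. All names are invented for the example.

```java
final class RecordFormatGuess {
    enum Format { LINE_SEQUENTIAL, FIXED_LENGTH, UNKNOWN }

    /** Guess the layout of a data file, given the record length from the copybook. */
    static Format guess(byte[] content, int recordLength) {
        if (content.length == 0 || recordLength <= 0) {
            return Format.UNKNOWN;
        }
        int lineStart = 0;
        int lines = 0;
        boolean allLinesFit = true;
        for (int i = 0; i < content.length; i++) {
            if (content[i] == '\n') {
                // Ignore a carriage return directly before the line feed.
                int end = (i > lineStart && content[i - 1] == '\r') ? i - 1 : i;
                if (end - lineStart > recordLength) {
                    allLinesFit = false;
                }
                lines++;
                lineStart = i + 1;
            }
        }
        boolean endsWithTerminator = lineStart == content.length;
        if (lines > 0 && allLinesFit && endsWithTerminator) {
            // Lines may be shorter than the record: trailing spaces are often trimmed.
            return Format.LINE_SEQUENTIAL;
        }
        if (content.length % recordLength == 0) {
            return Format.FIXED_LENGTH;
        }
        return Format.UNKNOWN; // possibly record headers or variable-length records
    }

    public static void main(String[] args) {
        byte[] text = {'A', 'B', '\n', 'A', 'B', 'C', 'D', '\n'};
        System.out.println(guess(text, 4)); // prints LINE_SEQUENTIAL
    }
}
```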
>I installed weblogic workshop which has a copybook importer.
>http://e-docs.bea.com/wlintegration...guid/import.htm
>This takes in a copybook and produces a BEA proprietary file called
>MFL (Message Format Language). The only information it asks for is the
>endianness (big/little) and the encoding (ASCII/EBCDIC/Other(?)). So I would
>assume that this is all you need, along with a copybook, to parse a data
>file.
>
>So now comes the part where all of you have talked about implementor
>dependent/proprietary file system/file org./compiler :
>
>
>
>
> So I am a bit puzzled about this, as the BEA importer doesn't ask for any
>more info. I wonder how they handle the different formats then?
My guess would be it assumes IBM mainframe rules for the grey areas. Some major
PC Cobol compilers do that so programs can be developed and tested on a PC, then
uploaded to a mainframe.
The most commonly used grey areas on IBM are:
COMP-5 is the same as BINARY. They may be little-endian when the compiler runs
on a little-endian machine such as Intel.
Digits in PIC    Bytes allocated
1-4              2
5-9              4
10 or more       8
COMP-3 is the same as PACKED DECIMAL. IBM mainframe rules require that a sign be
allocated and the size rounded up to integral bytes (because that's what IBM
hardware requires).
Try giving it an uncommon USAGE such as COMP-1. If that translates to
single-precision float, the translator is following IBM rules. If it chokes on
COMP-X, a Micro Focus feature, or PIC 1, that would be further evidence the
author didn't consider other Cobol compilers.
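The size table for BINARY items given in this post translates directly into code, and java.nio.ByteBuffer can handle either byte order. The sketch below assumes those IBM-style sizes and a two's-complement signed value; other compilers may allocate differently, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BinaryField {
    /** Bytes allocated for a BINARY item with the given number of PIC digits (assumed IBM rules). */
    static int bytesFor(int digits) {
        if (digits < 1 || digits > 18) {
            throw new IllegalArgumentException("unsupported digit count: " + digits);
        }
        if (digits <= 4) {
            return 2;
        }
        if (digits <= 9) {
            return 4;
        }
        return 8;
    }

    /** Read a signed binary item at the given offset of a record. */
    static long read(byte[] record, int offset, int digits, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.wrap(record).order(order);
        switch (bytesFor(digits)) {
            case 2:
                return buffer.getShort(offset);
            case 4:
                return buffer.getInt(offset);
            default:
                return buffer.getLong(offset);
        }
    }

    public static void main(String[] args) {
        // 1234567 is 0x0012D687; seven digits occupy a four-byte word.
        byte[] record = {0x00, 0x12, (byte) 0xD6, (byte) 0x87};
        System.out.println(read(record, 0, 7, ByteOrder.BIG_ENDIAN)); // prints 1234567
    }
}
```

The returned long is the unscaled integer; any V in the PICTURE still has to be applied as a scale afterwards.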
| |
| Peter Lacey 2004-07-02, 3:55 pm |
| Ed wrote:
>
> So now comes the part where all of you have talked about implementor
> dependent/proprietary file system/file org./compiler :
>
>
Do you not have access to someone cognizant of COBOL who can write a quick-and-dirty
program to convert the files to ASCII format? Seems to me you must have
the files as well as the copybooks: it would not be difficult to read
them and write them out as fixed-length fields or as comma-separated
fields (with all numeric fields unpacked). Then Bob's your uncle!
PL
| |
| Tom Morrison 2004-07-02, 3:55 pm |
| "Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407010650.45e1c30d@posting.google.com...
> Hi,
> I am a java programmer trying to parse cobol copybooks and convert
> them to xsd.
Ed, perhaps you would be kind enough to mention the "why" behind this.
Perhaps there is an easier way to do all this without engaging in the
exercise you describe.
If you think this is really the way to go, perhaps you should think in terms
of using an allocation map output from a COBOL compiler (which one?) on your
system (which one?). There you might be able to find very detailed
information about what the compiler thinks the data layout is, which might
(probably will) be different than many of your early attempts to parse COBOL
data/record descriptions.
Best regards,
Tom Morrison
Liant Software Corporation
| |
| docdwarf@panix.com 2004-07-02, 3:55 pm |
| In article <W7hFc.10775$dK.5510@newssvr24.news.prodigy.com>,
Tom Morrison <t.morrison@liant.com> wrote:
>"Ed" <ed_narayanan@yahoo.com> wrote in message
>news:64c2ece4.0407010650.45e1c30d@posting.google.com...
>
>Ed, perhaps you would be kind enough to mention the "why" behind this.
It might be something as simple as 'COBOL bad, anything else good.'
DD
| |
| Richard 2004-07-02, 3:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote
>
> So I am a bit puzzled about this, as the BEA importer doesn't ask for any
> more info. I wonder how they handle the different formats then?
They could identify the structure by checking particular codes in the
file headers and then trying to decipher the file with appropriate
routines.
| |
| JerryMouse 2004-07-02, 8:55 pm |
| docdwarf@panix.com wrote:
> In article <W7hFc.10775$dK.5510@newssvr24.news.prodigy.com>,
> Tom Morrison <t.morrison@liant.com> wrote:
>
> It might be something as simple as 'COBOL bad, anything else good.'
Or "I have this one hammer..."
| |
| Robert Wagner 2004-07-02, 8:55 pm |
| docdwarf@panix.com wrote:
>In article <40e5172f.582885893@news.optonline.net>,
>Robert Wagner <robert.deletethis@wagner.net> wrote:
>
>[snip]
>
>
>Mr Wagner, in another posting you noticed - when it was pointed out in
>what appeared to be a rather unsubtle manner - that you bandied about some
>rather hostile words. Might it just happen to be possible that just as
>you did not see hostility in your own writing you are seeing hostility
>where none exists?
No. Mr. Stevens was definitely hostile. I responded in like kind.
>Address the facts, Mr Wagner, in as dispassionate a manner as can be
>mustered, and the discussion might go differently... on *both* sides.
I've really tried. A good example was the thread about 'how can my Java thing
read a Cobol file.' I answered with neutrality. Mr. Stevens used it as an excuse
to launch an attack.
| |
| docdwarf@panix.com 2004-07-03, 8:55 am |
| In article <40e5fdc0.12228346@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>
>No.
So much for that possibility, then.
>Mr. Stevens was definitely hostile. I responded in like kind.
Assuming that the first sentence is true... Mr Stevens jumps off the
Brooklyn Bridge, you will move 'in like kind', too?
>
>
>I've really tried.
Mr Wagner, you've mentioned how you managed to maintain composure during a
military 'mental endurance exam' for longer than anyone had managed
previously; is this forum so much more trying that it is victorious where
the other failed?
DD
| |
| docdwarf@panix.com 2004-07-03, 8:55 am |
| In article <0o-dndt5b_0gV3jd4p2dnA@giganews.com>,
JerryMouse <nospam@bisusa.com> wrote:
>docdwarf@panix.com wrote:
>
>Or "I have this one hammer..."
.... and I hammer in th' moooo-ooooorrr-nin', all over this land... I
hammer out xsd, an' I hammer out xsd, an' I hammer out some more an' more,
x-s-d, aaaaaa-aaaaaallllll over this land.
DD
| |
| Robert Wagner 2004-07-03, 8:55 am |
| docdwarf@panix.com wrote:
>Assuming that the first sentence is true... Mr Stevens jumps off the
>Brooklyn Bridge, you will move 'in like kind', too?
By coincidence, I'll spend the evening of July 4 ON the Brooklyn Bridge, watching
fireworks over the East River. If people are jumping, lemming instincts will
terminate reportage here.
| |
| docdwarf@panix.com 2004-07-03, 3:55 pm |
| In article <40e66d48.40784586@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>
>By coincidence, I'll spend the evening of July 4 ON the Brooklyn Bridge, watching
>fireworks over the East River. If people are jumping, lemming instincts will
>terminate reportage here.
One might hope to find a better reason to pull a Brody than that, Mr
Wagner... best wishes for you 'n your'n to have a safe, insane Fourth.
DD
| |
| Joe Zitzelberger 2004-07-03, 3:55 pm |
| In article <cc47tn$2v0$1@panix5.panix.com>, docdwarf@panix.com wrote:
> In article <W7hFc.10775$dK.5510@newssvr24.news.prodigy.com>,
> Tom Morrison <t.morrison@liant.com> wrote:
>
> It might be something as simple as 'COBOL bad, anything else good.'
>
> DD
>
If you can get an xsd that documents your copybook, you can feed it to
automatic tools (.NET, AXIS, etc) that will create objects that can
serialize and deserialize that copybook.
Add a bit of transport information to your schema and call it a WSDL and
the objects will then know how to communicate to the server that offers
that copybook.
Ideally, you would get a tool that would take a copybook, generate out
some Java classes that could then be used. No networking headache, no
fixed-position field parsing, just generate and use...it really is ...
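As a rough idea of what the copybook-to-XSD step might emit for a single elementary item, here is a Java sketch. It is an assumption-laden illustration, not the output of any existing tool: it maps a numeric PICTURE to xs:decimal with totalDigits and fractionDigits facets and an alphanumeric one to xs:string with maxLength, assumes the xs prefix is bound to the XML Schema namespace, and assumes the COBOL data name is already a valid XML name (names that start with a digit would need adjusting). The class and method names are made up.

```java
final class XsdField {
    /** Element for a numeric item such as PIC S9(5)V9(2): 7 total digits, 2 fraction digits. */
    static String decimalElement(String name, int totalDigits, int fractionDigits) {
        return "<xs:element name=\"" + name + "\">\n"
             + "  <xs:simpleType>\n"
             + "    <xs:restriction base=\"xs:decimal\">\n"
             + "      <xs:totalDigits value=\"" + totalDigits + "\"/>\n"
             + "      <xs:fractionDigits value=\"" + fractionDigits + "\"/>\n"
             + "    </xs:restriction>\n"
             + "  </xs:simpleType>\n"
             + "</xs:element>";
    }

    /** Element for an alphanumeric item such as PIC X(20). */
    static String textElement(String name, int length) {
        return "<xs:element name=\"" + name + "\">\n"
             + "  <xs:simpleType>\n"
             + "    <xs:restriction base=\"xs:string\">\n"
             + "      <xs:maxLength value=\"" + length + "\"/>\n"
             + "    </xs:restriction>\n"
             + "  </xs:simpleType>\n"
             + "</xs:element>";
    }

    public static void main(String[] args) {
        // CUST-BALANCE and CUST-NAME are hypothetical data names.
        System.out.println(decimalElement("CUST-BALANCE", 7, 2));
        System.out.println(textElement("CUST-NAME", 20));
    }
}
```

The schema describes only the logical values; the byte-level layout (DISPLAY, packed, binary) still has to be handled by the code that reads the file.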
| |
| docdwarf@panix.com 2004-07-03, 3:55 pm |
| In article <joe_zitzelberger-567325.09564503072004@corp.supernews.com>,
Joe Zitzelberger <joe_zitzelberger@nospam.com> wrote:
>In article <cc47tn$2v0$1@panix5.panix.com>, docdwarf@panix.com wrote:
>
>
>If you can get an xsd that documents your copybook, you can feed it to
>automatic tools (.NET, AXIS, etc) that will create objects that can
>serialize and deserialize that copybook.
>
>Add a bit of transport information to your schema and call it a WSDL and
>the objects will then know how to communicate to the server that offers
>that copybook.
>
>Ideally, you would get a tool that would take a copybook, generate out
>some Java classes that could then be used. No networking headache, no
>fixed-position field parsing, just generate and use...it really is ...
Ummmmmm... so it might be one thing, it might be... something else.
DD
| |
| Lueko Willms 2004-07-04, 8:55 pm |
| It seems, my NetNews provider is again sloppy on forwarding my
messages to comp.lang.cobol; none of my contributions to this thread
appears in groups.google.com
That's why I am resubmitting them. If this is a duplicate for any of you readers, please let me know via email, including, if possible, the original message with all its headers.
------------------ snip --------------------------------
.. On 02.07.04
ed_narayanan@yahoo.com (Ed) wrote
in /COMP/LANG/COBOL
in 64c2ece4.0407020432.5b123882@posting.google.com
about Re: Cobol Copybook Parsing
en> So now comes the part where all of you have talked about implementor
en> dependent/proprietary file system/file org./compiler :
Take into account that those who talk COBOL here have worked most
of their time on big iron, where there are quite important differences
in machine architecture and where each and every hardware vendor
produced his own operating system and his own compilers, including a
COBOL compiler.
en> So am a bit puzzled about this as the bea importer doesnt ask for
en> any more info. Wonder how they handle the different formats then ?
I don't know, but this piece of software might be somewhat
adventurous in this regard.
But it may also be that in the PC world the COBOL compilers are not
so diverse, and that a certain industry-standard has developed among
the different software houses offering COBOL compilers for the
derivatives of MS-DOS and UNIX.
And I guess - otherwise you would have mentioned it -- that your
work is based on this Intel/Microsoft or Intel/Linux world.
For example, the standard (85) knows as USAGE only COMP or
COMPUTATIONAL, DISPLAY, and INDEX.
Implementors have implemented a number of extensions with more
computational formats like COMP-1, COMP-2, COMP-3 etc. Now, several
vendors have accepted IBM's definition of COMP-3 as meaning packed-
decimal.
So, I think it is possible, but I don't know, that all those Micro
Focus, Fujitsu, Realia etc COBOLs for PCs treat those COMP-n USAGEs in
an identical fashion.
If this translator bases itself on such knowledge, OK.
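For readers following along, the packed-decimal layout mentioned above can be decoded with a few lines of code. This is a minimal sketch in Python, assuming the IBM-style convention of two digits per byte with the sign in the low nibble of the last byte (C, A, E or F for positive, D or B for negative). The function name `unpack_comp3` and the `scale` parameter are illustrative only, and whether a particular PC compiler stores COMP-3 this way should be checked against that compiler's documentation.

```python
from decimal import Decimal

def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode IBM-style packed decimal; `scale` is the digit count after the implied V."""
    if not raw:
        raise ValueError("empty field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    sign = nibbles.pop()               # low nibble of the last byte is the sign
    if sign < 0x0A:
        raise ValueError("invalid sign nibble")
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble")
    value = int("".join(str(n) for n in nibbles))
    if sign in (0x0B, 0x0D):           # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)

# PIC S9(3)V99 COMP-3 holding +123.45 occupies three bytes: 12 34 5C
print(unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2))
```

The same routine covers unsigned fields, which on IBM systems typically carry an F in the sign position.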
Yours,
Lüko Willms http://www.mlwerke.de
/--------- L.WILLMS@jpberlin.de -- All rights reserved --
"It is not the generals and kings who make history,
but the broad masses of the people" - Nelson Mandela
| |
| Warren Simmons 2004-07-04, 8:55 pm |
It would certainly be of common interest to learn if this compatibility
is true or not. I believe that, over time, this would tend to make the
differences go away.
Warren Simmons
| |
|
| Tom,
The "why" is a requirement. The system that processes the data can
accept only XML, and for it to make sense of the data we have to create an
XSD to define the XML instance.
Regards,
Ed
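To make the goal concrete, here is a rough sketch of what mapping one elementary item to an XSD declaration could look like. It is written in Python for brevity and handles only `PIC X(n)` and `PIC [S]9(n)[V9(m)]`. The choice of `xs:string` with `xs:maxLength` and `xs:decimal` with `xs:totalDigits`/`xs:fractionDigits` is an assumption about a reasonable target, not something prescribed by any COBOL vendor, and `pic_to_xsd` is a hypothetical name.

```python
import re

def pic_to_xsd(name: str, pic: str) -> str:
    """Return an xs:element for PIC X(n) or PIC [S]9(n)[V9(m)]; anything else is rejected."""
    pic = pic.upper().replace(" ", "")
    text = re.fullmatch(r"X\((\d+)\)", pic)
    if text:
        base = "xs:string"
        facets = [("xs:maxLength", text.group(1))]
    else:
        number = re.fullmatch(r"S?9\((\d+)\)(?:V9\((\d+)\))?", pic)
        if not number:
            raise ValueError(f"unsupported PICTURE in this sketch: {pic}")
        integer_digits = int(number.group(1))
        fraction_digits = int(number.group(2) or 0)
        base = "xs:decimal"
        facets = [("xs:totalDigits", str(integer_digits + fraction_digits)),
                  ("xs:fractionDigits", str(fraction_digits))]
    lines = [f'<xs:element name="{name}">',
             "  <xs:simpleType>",
             f'    <xs:restriction base="{base}">']
    lines += [f'      <{tag} value="{value}"/>' for tag, value in facets]
    lines += ["    </xs:restriction>", "  </xs:simpleType>", "</xs:element>"]
    return "\n".join(lines)

print(pic_to_xsd("AMOUNT", "S9(5)V9(2)"))
```

A real converter would also have to deal with group items, OCCURS, REDEFINES and USAGE, which is where most of the difficulty discussed in this thread lies.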
"Tom Morrison" <t.morrison@liant.com> wrote in message news:<W7hFc.10775$dK.5510@newssvr24.news.prodigy.com>...
> "Ed" <ed_narayanan@yahoo.com> wrote in message
> news:64c2ece4.0407010650.45e1c30d@posting.google.com...
>
> Ed, perhaps you would be kind enough to mention the "why" behind this.
> Perhaps there is an easier way to do all this without engaging in the
> exercise you describe.
>
> If you think this is really the way to go, perhaps you should think in terms
> of using an allocation map output from a COBOL compiler (which one?) on your
> system (which one?). There you might be able to find very detailed
> information about what the compiler thinks the data layout is, which might
> (probably will) be different than many of your early attempts to parse COBOL
> data/record descriptions.
>
> Best regards,
> Tom Morrison
> Liant Software Corporation
| |
|
| Peter Lacey <lacey@mb.sympatico.ca> wrote in message
> Do you not have access to someone cognizant of COBOL who can write a q&d
> program to convert the files to ASCII format? Seems to me you must have
> the files as well as the copy books: it would not be difficult to read
> them and write them out as fixed-length fields or as comma-separated
> fields ( with all numeric fields unpacked) . Then Bob's your uncle!
The way I understand it so far is that the COBOL data files can
contain both text (EBCDIC/ASCII) AND binary (numerics with COMP). So I
assume a simple EBCDIC-to-ASCII conversion would not suffice. Usually data
files are either binary or text! This one has a mixture of both :( So what
do we call such a file?
| |
|
| > Do you not have access to someone cognizant of COBOL who can write a q&d
> program to convert the files to ASCII format? Seems to me you must have
> the files as well as the copy books: it would not be difficult to read
> them and write them out as fixed-length fields or as comma-separated
> fields ( with all numeric fields unpacked) . Then Bob's your uncle!
Excellent article by Michael which explains why it is not a simple "to
ascii" conversion..
http://www.flexus.com/ebd2asc.html
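The point can be illustrated with a small sketch: text fields are translated from EBCDIC, while binary fields are read as numbers and never passed through a character translation. The 12-byte record layout below (`NAME PIC X(10)` followed by a two-byte big-endian binary `QTY`) is made up for the example, and Python's built-in `cp037` codec is assumed to be the right EBCDIC code page, which will not be true for every site.

```python
def decode_record(record: bytes) -> dict:
    """Decode a hypothetical 12-byte record: NAME PIC X(10), QTY PIC S9(4) binary."""
    name = record[0:10].decode("cp037").rstrip()              # text: EBCDIC -> str
    qty = int.from_bytes(record[10:12], "big", signed=True)   # binary: not translated
    return {"name": name, "qty": qty}

# Build a sample record: "WIDGET" padded to 10 characters in EBCDIC, then 300 as 0x01 0x2C.
record = "WIDGET".ljust(10).encode("cp037") + (300).to_bytes(2, "big")
print(decode_record(record))

# A blanket EBCDIC-to-ASCII pass over the whole record would also "translate"
# the bytes 0x01 0x2C as if they were characters, destroying the quantity.
```

This is why the conversion has to be driven by the record layout, field by field.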
| |
| Peter Lacey 2004-07-05, 8:55 pm |
| Ed wrote:
>
> Peter Lacey <lacey@mb.sympatico.ca> wrote in message
>
> The way I understand it so far is that the COBOL data files can
> contain both text (EBCDIC/ASCII) AND binary (numerics with COMP). So I
> assume a simple EBCDIC-to-ASCII conversion would not suffice. Usually data
> files are either binary or text! This one has a mixture of both :( So what
> do we call such a file?
Piece of cake. Cobol converts all its numeric formats to ordinary
numbers with decimal point and sign as required, using a MOVE
statement. If your files actually contain EBCDIC then the compiler in
question will support EBCDIC -> ASCII conversions. (Even IBM mainframe
environments, n'est-ce pas?)
At least TRY finding a programmer. There must be zillions of us.
PL
| |
| William M. Klein 2004-07-06, 3:55 pm |
|
"Robert Wagner" <robert.deletethis@wagner.net> wrote in message
news:40e43993.526145272@news.optonline.net...
> ed_narayanan@yahoo.com (Ed) wrote:
<snip>
> BINARY are words, half-words or long words.
VERY implementor dependent (even on PCs); check out the Micro Focus "IBM-COMP"
compiler directive (for example) which controls whether or not you can have
one-byte binary fields. Also "big-endian" vs "little-endian" is a MAJOR issue
with PC compilers.
Finally, many (almost all? all?) compiler vendors provide an "option" to ignore
the picture (in some cases) and use the full storage - while the ANSI/ISO
Standards require "honoring" the picture (decimal, not binary) definition.
All of this is ALSO true for many/most COMP/COMPUTATIONAL fields.
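To see why byte order matters, consider the same two bytes read both ways. This is only an illustration in Python; `read_binary` is a hypothetical helper, and which byte order a given compiler actually writes has to come from its documentation or directives.

```python
def read_binary(raw: bytes, byteorder: str, signed: bool = True) -> int:
    """Interpret a binary field using an explicit byte order ("big" or "little")."""
    return int.from_bytes(raw, byteorder, signed=signed)

raw = bytes([0x01, 0x2C])
print(read_binary(raw, "big"))     # 300
print(read_binary(raw, "little"))  # 11265
```

Nothing in the data itself says which reading is right, so a converter has to be told.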
P.S. I haven't yet caught up on the entire thread so, this may already have
been mentioned by others.
--
Bill Klein
wmklein <at> ix.netcom.com
| |
| William M. Klein 2004-07-06, 3:55 pm |
| If you are trying to get XML data out of a COBOL file, you might be interested
in the "built-in" facilities of at least one COBOL compiler. See the XML
GENERATE statement (and descriptions) at:
http://publibz.boulder.ibm.com/cgi-...igy3lr20/6.2.41
and
http://publibz.boulder.ibm.com/cgi-.../igy3pg20/5.2.1
That seems (to me) to be a "better solution" than "re-inventing the wheel". If
you are working in an environment with a COBOL compiler (as you must be if you
have COBOL copybooks), then check with that compiler vendor to see what tools
they ALREADY have for doing this.
--
Bill Klein
wmklein <at> ix.netcom.com
| |
| Tom Morrison 2004-07-06, 3:55 pm |
| "Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407042214.4382809d@posting.google.com...
> The "why" is a requirement.The system that processes the data can
> accept only xml and for it make sense of the data we have to create an
> XSD to define the XML instance.
Ed, the Liant XML Toolkit will generate an XSD from a COBOL data structure,
but it is an implement in the RM/COBOL Developer System, which includes a
compiler, etc.
You didn't mention which system (compiler, operating system, etc.) will
be sourcing the data. This makes a difference.
Tom Morrison
Liant Software Corporation
www.liant.com
| |
| Richard 2004-07-06, 8:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote
> Usually data files
> are either binary or text ! This one has a mixture of both :( So what
> do we call such a file ?
Usually, 'binary' files also include text. For example if you look at
an executable file it will be mostly binary but there will usually be
text strings for messages, etc.
There may well be some files that are only numbers stored as binary,
but data files could be a mixture of text and binary or all text.
| |
| Bernard Giroud 2004-07-07, 3:55 am |
| Warren Simmons wrote:
> And, I think, if you talk nice to the people at Tiny COBOL, they might
> have something that already does this. Their compiler is written in C,
> and produces C to compile the program as far as I know.
>
> Warren Simmons
>
Little correction: TinyCOBOL produces gas assembler and is thus tied
to its target platform (IA32), in contrast to OpenCOBOL, which
actually produces C and should be platform agnostic.
--
Bernard Giroud
Open Source COBOL Tools Developer
| |
|
| Hi,
I am following the IBM
http://www-306.ibm.com/software/awdtools/cobol/zos/ documentation. A
few lingering doubts about parsing:
a) Would the scaling letter P be present in the data ? i.e. PIC 999PP
would be 5 bytes or 3 in data.
b) REDEFINES redeclares the same memory so would it contribute to the
data file or can it be ignored ?
c) RENAME provides an alternate name for data items. Again is it
just to "rename" or do fields exist corresponding to this.
d) Are insertion chars present in data for edited pics. The answer I
got on this forum was that they are part of the data. But thinking
about the name "insertion" it seems as if they are inserted into the
actual data for display purposes so ideally they shouldnt be in the
data ? something like the assumed decimal V.
Regards
| |
| Binyamin Dissen 2004-07-12, 8:55 am |
| On 12 Jul 2004 05:10:55 -0700 ed_narayanan@yahoo.com (Ed) wrote:
:>I am following the IBM
:>http://www-306.ibm.com/software/awdtools/cobol/zos/ documentation. A
:>few lingering doubts about parsing:
:>a) Would the scaling letter P be present in the data ? i.e. PIC 999PP
:>would be 5 bytes or 3 in data.
The P positions are implied.
The amount of bytes depends on the USAGE clause.
:>b) REDEFINES redeclares the same memory so would it contribute to the
:>data file or can it be ignored ?
Why are you parsing the copybook?
REDEFINEd fields are fields as well.
:>c) RENAME provides an alternate name for data items. Again is it
:>just to "rename" or do fields exist corresponding to this.
:>d) Are insertion chars present in data for edited pics. The answer I
:>got on this forum was that they are part of the data. But thinking
:>about the name "insertion" it seems as if they are inserted into the
:>actual data for display purposes so ideally they shouldnt be in the
:>data ? something like the assumed decimal V.
V, like P, is implied.
--
Binyamin Dissen <bdissen@dissensoftware.com>
http://www.dissensoftware.com
Director, Dissen Software, Bar & Grill - Israel
| |
| Michael Mattias 2004-07-12, 8:55 am |
|
"Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407120410.6ec77d5d@posting.google.com...
> Hi,
> I am following the IBM
> http://www-306.ibm.com/software/awdtools/cobol/zos/ documentation. A
> few lingering doubts about parsing:
>
> a) Would the scaling letter P be present in the data ? i.e. PIC 999PP
> would be 5 bytes or 3 in data.
a P (decimal scaling) in a PICTURE clause has no effect on the size of a
data element.
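As a quick sketch of that rule for a parser: only the `9` positions take up storage, while `S`, `V` and `P` do not. The Python helpers below are illustrative names, and they assume there is no SIGN IS SEPARATE clause (which would add a character for the sign).

```python
import re

def expand(pic: str) -> str:
    """Expand repetition factors, e.g. 9(3)V9(2) -> 999V99."""
    return re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic.upper())

def stored_digits(pic: str) -> int:
    """Count the digit positions that occupy storage in a numeric PICTURE."""
    return expand(pic).count("9")

print(stored_digits("999PP"))        # 3
print(stored_digits("S9(5)V9(2)"))   # 7
```

How many bytes those digits occupy is a separate question that depends on USAGE.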
> b) REDEFINES redeclares the same memory so would it contribute to the
> data file or can it be ignored ?
A REDEFINES is the same as a 'union' (C, BASIC) or 'org' (assembly).
"contribute' is kind of a funny word here. If you mean "does the presence of
a REDEFINES change the length of a record" the answer is no.
> c) RENAME provides an alternate name for data items. Again is it
> just to "rename" or do fields exist corresponding to this.
RENAMES simply is a REDEFINE of a range of variable names; a REDEFINES may
only reference a single dataname.
Renames and Redefines examples which get you the same results:
Using Renames
    05 MM    PIC 9(02).
    05 DD    PIC 9(02).
    05 CCYY  PIC 9(04).
    66 MMDDCCYY RENAMES MM THRU CCYY.
Using Redefines
    05 MMDDCCYY PIC 9(08).
    05 FILLER REDEFINES MMDDCCYY.
       10 MM    PIC 9(02).
       10 DD    PIC 9(02).
       10 CCYY  PIC 9(04).
> d) Are insertion chars present in data for edited pics.
In an edited alpha, alphanumeric or number PICTURE, any formatting
characters are included in the data. (A "V" in a picture clause is NOT an
editing character).
FWIW, "edited <anything>" is RARELY stored in a data file.
Download this for more (text and graphics tutorial on this subject):
http://www.flexus.com/ftp/cobdata.zip
Rules in action (create a report about all data items appearing in an FD):
http://www.flexus.com/ftp/cobfd.zip
--
Michael Mattias
Tal Systems, Inc.
Racine WI
mmattias@talsystems.com
Author/Contributor of said tutorial and utility
| |
| Donald Tees 2004-07-12, 3:56 pm |
| Ed wrote:
> Hi,
> I am following the IBM
> http://www-306.ibm.com/software/awdtools/cobol/zos/ documentation. A
> few lingering doubts about parsing:
>
> a) Would the scaling letter P be present in the data ? i.e. PIC 999PP
> would be 5 bytes or 3 in data.
three
>
> b) REDEFINES redeclares the same memory so would it contribute to the
> data file or can it be ignored ?
it is a second way of interpreting the same data, so the answer is it
cannot be ignored, but it *is* the same data, not a second set.
example:
02 data-name-one pic 99v99 value 12.34.
02 data-name-two pic 9999v redefines data-name-one.
...
02 data-name-three picture 9999v9999.
move data-name-one to data-name-three.
makes data-name-three equal to 0012.3400.
move data-name-two to data-name-three.
makes data-name-three equal to 1234.0000.
No extra fields exist ... the information is there for procedural
interpretation. The above originating data is four bytes, interpreted two
different ways, then moved into 8 bytes.
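The same idea can be sketched outside COBOL: one four-byte area, read under two pictures. The Python below assumes ASCII DISPLAY digits and uses a hypothetical helper `read_display`, where `scale` is the number of digits after the implied V.

```python
from decimal import Decimal

def read_display(raw: bytes, scale: int) -> Decimal:
    """Interpret unsigned DISPLAY digits with `scale` digits after the implied V."""
    return Decimal(int(raw.decode("ascii"))).scaleb(-scale)

storage = b"1234"                       # the single four-byte area
print(read_display(storage, scale=2))   # read as PIC 99V99
print(read_display(storage, scale=0))   # read as PIC 9999V
```

The bytes never change; only the description applied to them does.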
>
> c) RENAME provides an alternate name for data items. Again is it
> just to "rename" or do fields exist corresponding to this.
Just a rename.
>
> d) Are insertion chars present in data for edited pics. The answer I
> got on this forum was that they are part of the data. But thinking
> about the name "insertion" it seems as if they are inserted into the
> actual data for display purposes so ideally they shouldnt be in the
> data ? something like the assumed decimal V.
>
> Regards
When you move data to an edited picture, it is converted to that form,
normally for display purposes. The characters, in this case, are
actually present in the data. It would be unusual to have edited picture
clauses in data files. The norm is for them to be used in printable
files, or files designed for viewing on the screen. Moving data *out* of
edited fields can often be problematic, particularly with older COBOLs.
You should treat them as "typed" input, though they are normally a bit
cleaner than that.
Donald
| |
| Tom Morrison 2004-07-12, 3:56 pm |
| "Ed" <ed_narayanan@yahoo.com> wrote in message
news:64c2ece4.0407120410.6ec77d5d@posting.google.com...
> Hi,
> I am following the IBM
> http://www-306.ibm.com/software/awdtools/cobol/zos/ documentation. A
> few lingering doubts about parsing:
Does this mean that this describes the system that is sourcing the data?
>
> a) Would the scaling letter P be present in the data ? i.e. PIC 999PP
> would be 5 bytes or 3 in data.
It would be 3 *digits* but the number of bytes depends on (1) the data
representation of the digits (see USAGE clause) and (2) the size of a *byte*
on the system sourcing the data. This example might reasonably be contained
in 1 through 3 octets (bytes).
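As a rough illustration of how USAGE changes the byte count for the same number of digits, here is a Python sketch. The sizes are assumptions based on common IBM-style conventions (one byte per DISPLAY digit, two packed digits per byte plus a sign nibble, and 2/4/8-byte binary fields for 1-4, 5-9 and 10-18 digits). Other compilers and compiler options can and do differ, so `storage_bytes` is a starting point rather than a rule.

```python
def storage_bytes(digits: int, usage: str = "DISPLAY") -> int:
    """Estimate field size in 8-bit bytes under assumed IBM-style conventions."""
    usage = usage.upper()
    if usage == "DISPLAY":
        return digits                       # one character per digit
    if usage in ("COMP-3", "PACKED-DECIMAL"):
        return digits // 2 + 1              # two digits per byte plus a sign nibble
    if usage in ("COMP", "BINARY"):
        if digits <= 4:
            return 2
        if digits <= 9:
            return 4
        if digits <= 18:
            return 8
        raise ValueError("too many digits for a binary item in this sketch")
    raise ValueError(f"usage not covered by this sketch: {usage}")

# PIC 999PP has three stored digits
for usage in ("DISPLAY", "COMP-3", "COMP"):
    print(usage, storage_bytes(3, usage))
```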
>
> b) REDEFINES redeclares the same memory so would it contribute to the
> data file or can it be ignored ?
It most definitely cannot be ignored. Actually, the presence of REDEFINES
in a file's record description is usually an indicator that the file
actually contains multiple record types, and you must be able to distinguish
between those record types before you can correctly interpret the data
contained in the record.
>
> c) RENAME provides an alternate name for data items. Again is it
> just to "rename" or do fields exist corresponding to this.
RENAMES almost certainly will not be encountered but its syntax and semantic
rules are clearly defined.
>
> d) Are insertion chars present in data for edited pics. The answer I
> got on this forum was that they are part of the data. But thinking
> about the name "insertion" it seems as if they are inserted into the
> actual data for display purposes so ideally they shouldnt be in the
> data ? something like the assumed decimal V.
As another post has stated, you likely will not see edited PICTUREs in a
data file (unless you intend to represent print spool files in XML). If you
do see such pictures, you may be certain that the insertion characters
represent character positions you must consider when interpreting the
contents of the record; how you interpret type data in the fields may, or
may not, be influenced by the insertion characters (consider insertion
zeroes, for example).
Ed, you are engaged in the *very* nontrivial exercise of extracting a
correct physical schema of a COBOL data store and applying that schema to a
similar, but different, type of data representation. This is similar to the
claims in US Patent 5,826,076, which you might want to review for other
pitfalls.
Good luck!
| |
| Richard 2004-07-12, 3:56 pm |
| ed_narayanan@yahoo.com (Ed) wrote
> b) REDEFINES redeclares the same memory so would it contribute to the
> data file or can it be ignored ?
You cannot just ignore redefines. They are telling you that there are
alternate layouts for the record. In some records an area may be a
block of text. In another it may be a handful of numbers. This would
be determined by a 'record type' or other signal.
For example an Order file may contain a mixture of records for each
order. Some will be product sales lines, others may be comments or
instructions. A REDEFINES or implicit redefine may specify the various
sets of fields. The actual data in the file may be:
order#  type  line  key
123456  H     000   CUSTNO  date     reference
123456  D     001   PRODNO  ordered  sent  backord  price  discount
123456  D     002   PROD2   ordered  sent  backord  price  discount
123456  C     005   Comment-text
The Comment text redefines the quantity and price fields.
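A converter has to do the same thing a COBOL program would: look at the record type first, then pick the layout. The following Python sketch shows the shape of that logic. The column positions, field names and the `LAYOUTS` table are invented for the example and are not taken from any real copybook.

```python
# Hypothetical layouts keyed by record type: (field name, start, end) per field.
LAYOUTS = {
    "H": [("custno", 10, 16), ("date", 16, 24), ("reference", 24, 34)],
    "D": [("prodno", 10, 16), ("ordered", 16, 20), ("sent", 20, 24)],
    "C": [("comment", 10, 34)],
}

def parse(record: str) -> dict:
    """Split a fixed-width order record using the layout chosen by its type byte."""
    fields = {"order": record[0:6], "type": record[6], "line": record[7:10]}
    for name, start, end in LAYOUTS[fields["type"]]:
        fields[name] = record[start:end].rstrip()
    return fields

print(parse("123456C005Leave at back door      "))
print(parse("123456D001PROD0100120010"))
```

The copybook supplies the alternative layouts, but the rule that maps a type code to a layout usually has to come from the program or its documentation.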
> c) RENAME provides an alternate name for data items. Again is it
> just to "rename" or do fields exist corresponding to this.
Similar to redefines it may be that the comment text just renames
'ordered THRU discount' so it can do a group move or use the area as a
text field.
Did you think it would be easy ?
| |
| Chuck Stevens 2004-07-12, 3:56 pm |
| "Tom Morrison" <t.morrison@liant.com> wrote in message
news:2RAIc.14784$Mu.2651@newssvr24.news.prodigy.com...
>
> It would be 3 *digits* but the number of bytes depends on (1) the data
> representation of the digits (see USAGE clause) and (2) the size of a
> *byte* on the system sourcing the data. This example might reasonably be
> contained in 1 through 3 octets (bytes).
Two octets I'd buy for accurate representation of such an item.
One-and-a-half octets I'd buy. Even ten bits I'd buy. But how so one 8-bit
field?
The reasonable maximum value per COBOL rules in an item declared PIC 999PP
is 99,900, and ISTM the internal representation of it has to be some form of
999. I don't know of a way to cram an integer value any bigger than 255
into an eight-bit item without resorting to some bizarre sort of
floating-point (which itself puts us into approximation land).
I don't see an exact representation of the integer 999 fitting into a single
octet (eight bits); can you clarify how?
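The arithmetic behind this objection is easy to check. The Python sketch below counts the bits needed for a pure binary representation and for packed digits at four bits each; the helper names are illustrative.

```python
def binary_bits(max_value: int) -> int:
    """Bits needed to hold max_value as an unsigned binary integer."""
    return max_value.bit_length()

def packed_bits(digits: int) -> int:
    """Bits needed for unsigned packed digits at four bits per digit."""
    return digits * 4

print(binary_bits(999))   # 10
print(binary_bits(255))   # 8
print(packed_bits(3))     # 12
```

Ten bits for binary and twelve for packed digits both exceed a single 8-bit octet, which tops out at 255.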
-Chuck Stevens
| |
| Michael Mattias 2004-07-12, 8:55 pm |
| "Richard" <riplin@Azonic.co.nz> wrote in message
news:217e491a.0407121120.104eeea1@posting.google.com...
> ed_narayanan@yahoo.com (Ed) wrote
>
> text field.
>
> Did you think it [Parsing a copybook to get a record layout] would be easy?
As the author of the free utility I posted earlier, I will guarantee you it's
not.
I spent hours and hours getting the logic right when I had REDEFINES+OCCURS
(which is one of the most common combinations), with OCCURS within OCCURS
(another fun one), and with multiple REDEFINES with or without subordinate
OCCURS.
For that matter, I never bothered with "P" (I have never seen this 'for
real') or RENAMES (I used it once and a client told me "we don't do that
here!").
Of course, I wrote that about ten years ago.. betcha I could do it a lot
better today. (That's also written under MS-DOS, and I had to do some stuff
to avoid using up all my 640K of memory.... IIRC, that was tricky on those
4000-elementary-items-per-group-item copylibs)
(Someone sent me that and had the stones to tell me 4000 datanames in one
group item was "typical". Yeah, sure.)
MCM
| |
|
| Hey thanks...
So the answers to my questions as far as I could understand are:
1) P scaling is rarely used and if it does exist then it is ignored
while calculating the size in the data file.
2) In REDEFINES the data size is fixed as the size of the original
data.
i.e. 02 data-one pic 99v99 value 12.34.
02 data-two pic 9999v redefines data-one.
The size of data-one is 4 bytes (assuming DISPLAY). Anything
redefining this will also have to be of the same size. Now how would
it be possible to figure out what the data file contains ? i.e. in
some records it could be data-one and in another it could be data-two.
How does the parser figure out which data description to use ?
Richard said "This would be determined by a 'record type' or other
signal.".. where would this flag be ? in the copybook or in the data ?
are there any standards for this (oops sorry wrong language ;-) to be
talking abt standards)?
3) RENAME would also have the same problem as above. i.e how to know
which description is being used in the data , the original or the
renamed one.
4) Edited chars (if they appear) will be present in the data.
| |
| Donald Tees 2004-07-13, 8:55 am |
| Ed wrote:
> Hey thanks...
> So the answers to my questions as far as I could understand are:
>
> 1) P scaling is rarely used and if it does exist then it is ignored
> while calculating the size in the data file.
>
yep ... typical use: 02 parts-per-million pic VPPPPP9999, four digits
accuracy. Note scaling can be used before or after the decimal.
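To show what P does to the value (as opposed to the storage), here is a small Python sketch. Each P is treated as an implied zero between the stored digits and the implied decimal point. `scaled_value` is a hypothetical helper that only handles pictures with leading or trailing P positions.

```python
from decimal import Decimal
import re

def scaled_value(pic: str, stored: str) -> Decimal:
    """Return the numeric value of `stored` digits under a PICTURE that uses P scaling."""
    pic = re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic.upper())
    pic = pic.replace("S", "")
    digits = Decimal(int(stored))
    p_count = pic.count("P")
    if pic.endswith("P") or pic.endswith("PV"):       # trailing P, e.g. 999PP
        return digits.scaleb(p_count)
    if pic.startswith("P") or pic.startswith("VP"):   # leading P, e.g. VPPPPP9999
        return digits.scaleb(-(p_count + pic.count("9")))
    raise ValueError("no P scaling found in this picture")

print(f"{scaled_value('999PP', '123'):f}")
print(f"{scaled_value('VPPPPP9999', '1234'):f}")
```

In both cases the field holds only the digits shown in `stored`; the zeros exist only in the interpretation.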
> 2) In REDEFINES the data size is fixed as the size of the original
> data.
> i.e. 02 data-one pic 99v99 value 12.34.
> 02 data-two pic 9999v redefines data-one.
> The size of data-one is 4 bytes (assuming DISPLAY). Anything
> redefining this will also have to be of the same size. Now how would
> it be possible to figure out what the data file contains ? i.e. in
> some records it could be data-one and in another it could be data-two.
> How does the parser figure out which data description to use ?
>
You cannot by parsing it. That is where a program comes into it. It may
be a whole record that is redefined, and it may be a single field. It
could be something as obscure as a country code from a setup file used to
interpret a currency, or even the date used to interpret historical data
(say pre and post Y2K dates).
The copy books are just that ... copies of descriptions that get copied
into programs for further use. As stand alone entities, they only
contain a specific amount of info, and their use can only be inferred.
Donald
| |
| Richard 2004-07-13, 3:55 pm |
| ed_narayanan@yahoo.com (Ed) wrote
> 1) P scaling is rarely used and if it does exist then it is ignored
> while calculating the size in the data file.
But it is not ignored when calculating the size of the _number_.
> 2) In REDEFINES the data size is fixed as the size of the original
> data.
Not necessarily. The size of the complete area is the largest of any
of the REDEFINES. The standard requires that the largest is specified
first but extensions may allow any to be the largest.
In FD record areas every 01 level has an implicit redefine.
> i.e. 02 data-one pic 99v99 value 12.34.
> 02 data-two pic 9999v redefines data-one.
> The size of data-one is 4 bytes (assuming DISPLAY). Anything
> redefining this will also have to be of the same size. Now how would
> it be possible to figure out what the data file contains ? i.e. in
> some records it could be data-one and in another it could be data-two.
Because there will be some other data item in the record which
indicates to the programmer, implicitly or explicitly, which to use.
For example if this was a Payroll file there may be a flag in the
record which indicates 'H' or 'S' when it is 'H' (for hourly rate)
then the data record holds a field giving hourly rate as 999v999.
When it is 'S' the value is an annual Salary as 999999.
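In code, that payroll case comes down to a branch on the flag before the digits are interpreted. The Python sketch below assumes six ASCII DISPLAY digits; `pay_amount` and its return shape are invented for illustration.

```python
from decimal import Decimal

def pay_amount(flag: str, digits: str) -> tuple:
    """Interpret the same six digits as an hourly rate or a salary, depending on the flag."""
    number = Decimal(int(digits))
    if flag == "H":
        return ("hourly", number.scaleb(-3))   # PIC 999V999
    if flag == "S":
        return ("salary", number)              # PIC 999999
    raise ValueError(f"unknown pay flag: {flag!r}")

print(pay_amount("H", "012500"))
print(pay_amount("S", "045000"))
```

Nothing in the copybook says that 'H' goes with the first picture and 'S' with the second; that knowledge lives in the program or its documentation.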
> How does the parser figure out which data description to use ?
Did you think it was going to be easy ?
> Richard said "This would be determined by a 'record type' or other
> signal.".. where would this flag be ? in the copybook or in the data ?
You seem to have a conceptual problem. Of course the flag is in each
data record as a data item. Of course the flag definition is in the
copybook.
> are there any standards for this (oops sorry wrong language ;-) to be
> talking abt standards)?
No.
> 3) RENAME would also have the same problem as above. i.e how to know
> which description is being used in the data , the original or the
> renamed one.
Yea, it's a problem isn't it. The copybook is not enough, you also
need the documentation, or the program code.
> 4) Edited chars (if they appear) will be present in the data.
| |
| Richard 2004-07-13, 8:55 pm |
| Donald Tees <donald_tees@sympatico.ca> wrote
> The copy books are just that ... copies of descriptions that get copied
> into programs for further use. As stand alone entities, they only
> contain a specific amount of info, and their use can only be inferred.
It is also not impossible that several copybooks may be used to
completely define all the records that may exist in a file:
FD Mixed-File.
COPY "MixHeader".
COPY "MixLineItem".
COPY "MixOther".
each having a different 01 level and all implicit redefines when in
FILE SECTION.
Or a CopyBook may define several different files where these are
always used together.
FILE SECTION.
COPY "OrderFiles.FD".
OrderFiles.FD:
FD OrderHeader.
01 Header-Record.
03 OH-Key.
....
FD OrderItems.
01 Item-Record.
03 OI-Key.
....
| |
| Chuck Stevens 2004-07-13, 8:55 pm |
| Not just that.
I see absolutely nothing in the '74 or subsequent standards that precludes a
program from consisting SOLELY of a COPY statement, so long as the text
being incorporated via COPY is a syntactically-valid program.
Given these two files:
File COPYIT:
000100 COPY COPYBOOK.
FILE COPYBOOK:
000100 IDENTIFICATION DIVISION.
000200 ENVIRONMENT DIVISION.
000300 DATA DIVISION.
000400 PROCEDURE DIVISION.
000500 ONLY-PARAGRAPH.
000600 DISPLAY "IN THE COPYBOOK".
000700 STOP RUN.
Both our COBOL74 and COBOL85 compilers will compile "COPYIT" without murmur,
and execution of the resultant object code from either results in the
dutiful DISPLAY of "IN THE COPYBOOK" on the ODT.
Limitations of context for COPY were indeed the case in '68-compliant
compilers, but those limitations were lifted in the '74 standard. "Copy
books" are by NO MEANS limited to the DATA DIVISION. A COPY statement can
appear anywhere in a program that a character string or a separator can
appear.
-Chuck Stevens
| |
| Donald Tees 2004-07-14, 3:55 am |
| Chuck Stevens wrote:
> Not just that.
>
> I see absolutely nothing in the '74 or subsequent standards that precludes a
> program from consisting SOLELY of a COPY statement, so long as the text
> being incorporated via COPY is a syntactically-valid program.
>
> Given these two files:
> File COPYIT:
> 000100 COPY COPYBOOK.
> FILE COPYBOOK:
> 000100 IDENTIFICATION DIVISION.
> 000200 ENVIRONMENT DIVISION.
> 000300 DATA DIVISION.
> 000400 PROCEDURE DIVISION.
> 000500 ONLY-PARAGRAPH.
> 000600 DISPLAY "IN THE COPYBOOK".
> 000700 STOP RUN.
>
> Both our COBOL74 and COBOL85 compilers will compile "COPYIT" without murmur,
> and execution of the resultant object code from either results in the
> dutiful DISPLAY of "IN THE COPYBOOK" on the ODT.
>
> Limitations of context for COPY were indeed the case in '68-compliant
> compilers, but those limitations were lifted in the '74 standard. "Copy
> books" are by NO MEANS limited to the DATA DIVISION. A COPY statement can
> appear anywhere in a program that a character string or a separator can
> appear.
>
> -Chuck Stevens
>
I believe that Bill stated (a year or two back) that the new standard
even allows them to be nested up to 16 levels deep. In theory, one could
create an IDE structure that did nothing but create copies containing
copies, and submit them to the compiler in that form.
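For anyone writing a tool that has to follow such nesting, here is a rough sketch in Python, for illustration only. It is not how any compiler does it. The `library` dict stands in for the copy library, `expand_copies` is a made-up name, the default cap of 16 simply echoes the recollection above (not checked against the standard), and the sketch assumes each COPY statement sits alone on its line with no REPLACING phrase, which real source is not obliged to do.

```python
import re

# Matches a line holding only a COPY statement, with an optional
# six-digit sequence number in front and optional quotes round the name.
COPY_LINE = re.compile(
    r'^\s*(?:\d{6}\s+)?COPY\s+"?([\w.\-]+?)"?\s*\.\s*$', re.IGNORECASE
)


def expand_copies(text, library, max_depth=16, _depth=0):
    """Return the source lines of `text` with COPY statements expanded.

    `library` maps a text-name to the copybook's contents (hypothetical
    stand-in for a real copy library). Nesting beyond `max_depth`
    raises ValueError instead of recursing forever.
    """
    out = []
    for line in text.splitlines():
        match = COPY_LINE.match(line)
        if match is None:
            out.append(line)  # ordinary source line, keep as-is
            continue
        if _depth >= max_depth:
            raise ValueError("COPY nesting deeper than %d levels" % max_depth)
        nested_text = library[match.group(1)]
        out.extend(expand_copies(nested_text, library, max_depth, _depth + 1))
    return out


# The two files from Chuck's example: a program that is nothing but a COPY.
library = {
    "COPYBOOK": "\n".join([
        "000100 IDENTIFICATION DIVISION.",
        "000200 ENVIRONMENT DIVISION.",
        "000300 DATA DIVISION.",
        "000400 PROCEDURE DIVISION.",
        "000500 ONLY-PARAGRAPH.",
        '000600 DISPLAY "IN THE COPYBOOK".',
        "000700 STOP RUN.",
    ])
}
program_lines = expand_copies("000100 COPY COPYBOOK.", library)
```

A copybook that copies itself would hit the cap and raise rather than loop, which is about all the depth limit buys you in a sketch like this.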
Donald
| |
| Donald Tees 2004-07-14, 3:55 am |
| Richard wrote:
>
>
> Not necessarily. The size of the complete area is the largest of any
> of the REDEFINES. The standard requires that the largest is specified
> first but extensions may allow any to be the largest.
>
> In FD record areas every 01 level has an implicit redefine.
>
Yes, I forgot those wrinkles.
>
> Because there will be some other data item in the record which
> indicates to the programmer, implicitly or explicitly, which to use.
>
In most cases ;< )
Consider:
02 percentage-of-total picture 99v9.
02 fraction-of-total picture V999 redefines
percentage-of-total.
02 parts-per-thousand picture 999V redefines
percentage-of-total.
02 parts-per-ten-thousand picture 999PV ...
02 parts-per-100-thousand picture 999PPV ...
02 parts-per-million picture 999PPPV ...
divide total-amount by part-of-amount
giving fraction-of-total rounded.
or
move 100 to parts-per-thousand.
I wonder how many redundant multiply/divides by 100 exist in Cobol programs?
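To make the point concrete for someone decoding such fields outside COBOL, here is a small sketch in Python showing how the same three stored digits take on different values under each PICTURE. It assumes unsigned DISPLAY digits, handles only the symbols 9, V and P, and treats each P as a scaling position that occupies no storage and counts as a zero. The function names are made up for the illustration.

```python
import re
from decimal import Decimal


def expand_picture(pic):
    """Expand repeat counts, e.g. 9(3) becomes 999."""
    return re.sub(
        r"([9PX])\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        pic.upper(),
    )


def decode_unsigned_display(pic, digits):
    """Value of `digits` (the stored characters) under PICTURE `pic`."""
    pic = expand_picture(pic)
    if set(pic) - set("9PV"):
        raise ValueError("only 9, V and P are handled in this sketch")
    if not digits.isdigit() or len(digits) != pic.count("9"):
        raise ValueError("stored digits do not match the PICTURE")
    if "V" in pic:
        int_part, frac_part = pic.split("V", 1)
    elif pic.startswith("P"):
        int_part, frac_part = "", pic   # leading P implies a V before it
    else:
        int_part, frac_part = pic, ""   # otherwise the point is at the right
    stored = iter(digits)
    # Each 9 consumes one stored digit; each P contributes an implied zero.
    full = "".join(next(stored) if ch == "9" else "0"
                   for ch in int_part + frac_part)
    return Decimal(int(full)).scaleb(-len(frac_part))


for pic in ("99V9", "V999", "999V", "999PV", "999PPV", "999PPPV"):
    print(pic, decode_unsigned_display(pic, "123"))
```

The storage is three characters in every case; only the assumed position of the decimal point moves.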
Donald
| |
| Robert Wagner 2004-07-14, 8:55 am |
| Donald Tees <donald_tees@sympatico.ca> wrote:
>I wonder how many redundant multiply/divides by 100 exist in Cobol programs?
Sounds like a job for the optimizer. Given this typical code ..
02 a comp-3 pic s9(09)v99.
02 b comp-3 pic s9(09)v99.
compute a-percentage rounded = a * 100 / b
... the optimizer could pretend the code read
02 a-generated redefines a comp-3 pic s9(11).
compute a-percentage rounded = a-generated / b
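Why the redefinition works can be checked outside COBOL. The sketch below, in Python, assumes the widespread packed-decimal layout for comp-3 (two decimal digits per byte, last half-byte a sign of C or F for non-negative and D for negative); the actual encoding is up to the implementation, and `unpack_comp3` and `as_decimal` are names invented here.

```python
from decimal import Decimal


def unpack_comp3(raw):
    """Decode packed-decimal bytes into a plain integer (no scaling)."""
    nibbles = []
    for byte in raw:
        nibbles.extend((byte >> 4, byte & 0x0F))
    sign = nibbles.pop()  # final half-byte carries the sign
    if sign not in (0x0C, 0x0D, 0x0F) or any(n > 9 for n in nibbles):
        raise ValueError("not a valid packed-decimal field")
    magnitude = int("".join(str(n) for n in nibbles))
    return -magnitude if sign == 0x0D else magnitude


def as_decimal(raw, scale):
    """Apply the number of digits after the V implied by the PICTURE."""
    return Decimal(unpack_comp3(raw)).scaleb(-scale)


# Six bytes holding 11 digits plus sign, i.e. s9(09)v99 or s9(11).
raw_a = bytes.fromhex("00000012345C")
a = as_decimal(raw_a, 2)            # read as pic s9(09)v99
a_generated = as_decimal(raw_a, 0)  # same bytes read as pic s9(11)
print(a, a_generated, a_generated == a * 100)
```

The bytes never change; reading them with no implied decimal places is the multiplication by 100.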
| |
| Tom Morrison 2004-07-14, 3:55 pm |
| "Chuck Stevens" <charles.stevens@unisys.com> wrote in message
news:ccuolj$4j8$1@si05.rsvl.unisys.com...
> "Tom Morrison" <t.morrison@liant.com> wrote in message
> news:2RAIc.14784$Mu.2651@newssvr24.news.prodigy.com...
>
> *byte*
> contained
[snip]
> I don't see an exact representation of the integer 999 fitting into a single
> octet (eight bits); can you clarify how?
Chuck, I believe that certain compilers can achieve this using truncation
options on binary representations. It takes three decimal digits to
represent 255, e.g., but only one octet under certain coercive conditions.
I was trying to show that the task Ed has undertaken is, contrary to his
expectations, *not* very easy, and that there might be a better solution if
he and his team could remember what problem they are really trying to solve
(which I bet is *not* parsing COBOL copybooks).
Tom Morrison
| |
| Howard Brazee 2004-07-14, 3:55 pm |
|
On 13-Jul-2004, Donald Tees <donald_tees@sympatico.ca> wrote:
> Consider:
>
> 02 percentage-of-total picture 99v9.
> 02 fraction-of-total picture V999 redefines
> percentage-of-total.
> 02 parts-per-thousand picture 999V redefines
> percentage-of-total.
> 02 parts-per-ten-thousand picture 999PV ...
> 02 parts-per-100-thousand picture 999PPV ...
> 02 parts-per-million picture 999PPPV ...
>
> divide total-amount by part-of-amount
> giving fraction-of-total rounded.
> or
> move 100 to parts-per-thousand.
>
> I wonder how many redundant multiply/divides by 100 exist in Cobol programs?
When I was a kid, it took me forever to understand the "multiply by 100" part of
calculating percent. That was because I already implicitly moved the decimal
point over with a redefines in my head. I think that way with CoBOL today.
But the optimizer should be able to think the same way.
| |
| Chuck Stevens 2004-07-14, 3:55 pm |
| My point here is that a PIC 999 item of *any* USAGE is required to handle
any value in the range 0 - 999, not just some values that might have as many
as three digits. I recognize that it's possible to fit the value 255 into
an 8-bit item, but the rules for BINARY (as distinct from, say, BINARY-CHAR
UNSIGNED, new in 2002) require that "sufficient computer storage must be
allocated by the implementor to contain the maximum range of values implied
by the associated decimal PICTURE character-string", and I would contend
that 8 bits is not sufficient to contain a value in the range 0 through 999
inclusive.
Note that an item declared USAGE BINARY-CHAR UNSIGNED *must* allow a range
from 0 through 255 inclusive but may at the implementor's discretion allow a
larger upper limit. Note also that such an item can't have a PICTURE
clause. "Plain" USAGE BINARY items have PICTURE clauses, and enough storage
*must* be allocated for them to hold the entire range of values denoted
thereby.
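The range argument can be put in numbers. This Python sketch computes only the lower bound that follows from the PICTURE; how much storage is really allocated is the implementor's choice (many round up to 2, 4 or 8 bytes), and two's complement is assumed for the signed case. The function names are invented for the illustration.

```python
def min_bits_for_picture(digits, signed=False):
    """Fewest bits able to hold every value of PIC 9(digits) in binary."""
    max_value = 10 ** digits - 1
    bits = max_value.bit_length()
    # Two's complement (an assumption) needs one extra bit for the sign.
    return bits + 1 if signed else bits


def min_bytes_for_picture(digits, signed=False):
    """The same bound rounded up to whole 8-bit bytes."""
    return (min_bits_for_picture(digits, signed) + 7) // 8


print(min_bits_for_picture(3))    # PIC 999
print(min_bytes_for_picture(3))
print(min_bytes_for_picture(2))   # PIC 99
```

One octet tops out at 255, so PIC 999 cannot fit in it under the rule quoted above, whereas PIC 99 could.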
-Chuck Stevens
"Tom Morrison" <t.morrison@liant.com> wrote in message
news:TTaJc.15163$BE3.2315@newssvr24.news.prodigy.com...
> "Chuck Stevens" <charles.stevens@unisys.com> wrote in message
> news:ccuolj$4j8$1@si05.rsvl.unisys.com...
> [snip]
>
> Chuck, I believe that certain compilers can achieve this using truncation
> options on binary representations. It takes three decimal digits to
> represent 255, e.g., but only one octet under certain coercive conditions.
>
> I was trying to show that the task Ed has undertaken is, contrary to his
> expectations, *not* very easy, and that there might be a better solution if
> he and his team could remember what problem they are really trying to solve
> (which I bet is *not* parsing COBOL copybooks).
>
> Tom Morrison
>
>
| |
| Tom Morrison 2004-07-15, 3:55 pm |
| Chuck,
Sorry, I really didn't want to set off a discussion of this sort.
Relativity (http://www.liant.com/products/relativity/) does what Ed is
seeking to do (plus a *whole lot more*), and I was drawing upon my
experiences from its development. Several COBOL implementations have
*nonstandard* means to coerce binary representations to a single octet, in
spite of the item's picture, spawned by the need for inter-language calling
capabilities (something not obviously addressed by the COBOL standards).
If Ed stated which compiler/system was sourcing the data, I cannot recall
the fact. If he is using a system that allowed such nonstandard excursions,
then he'd better plan for them, without regard to the current or any
previous standard. And this whole subthread may be irrelevant since the
concern would be more of an issue if the XML were the source of data being
brought into the COBOL application.
Tom Morrison
"Chuck Stevens" <charles.stevens@unisys.com> wrote in message
news:cd3k7v$cek$1@si05.rsvl.unisys.com...
> My point here is that a PIC 999 item of *any* USAGE is required to handle
> any value in the range 0 - 999, not just some values that might have as many
> as three digits. I recognize that it's possible to fit the value 255 into
> an 8-bit item, but the rules for BINARY (as distinct from, say, BINARY-CHAR
> UNSIGNED, new in 2002) require that "sufficient computer storage must be
> allocated by the implementor to contain the maximum range of values implied
> by the associated decimal PICTURE character-string", and I would contend
> that 8 bits is not sufficient to contain a value in the range 0 through 999
> inclusive.
>
> Note that an item declared USAGE BINARY-CHAR UNSIGNED *must* allow a range
> from 0 through 255 inclusive but may at the implementor's discretion allow a
> larger upper limit. Note also that such an item can't have a PICTURE
> clause. "Plain" USAGE BINARY items have PICTURE clauses, and enough storage
> *must* be allocated for them to hold the entire range of values denoted
> thereby.
>
> -Chuck Stevens
[snip]
| |
| Lueko Willms 2004-07-15, 3:55 pm |
| On 15.07.04,
t.morrison@liant.com (Tom Morrison) wrote
in /COMP/LANG/COBOL,
message 6bwJc.15359$T%1.6284@newssvr24.news.prodigy.com,
regarding Re: Cobol Copybook Parsing:
TM> If Ed stated which compiler/system was sourcing the data, I cannot
TM> recall the fact.
He didn't, and I can only repeat that he would do better to take the
allocation map produced by the compiler as input instead of recreating
a COBOL compiler with its many implementation variants.
Unfortunately, my NetNews-Provider loses at least one out of two
messages to COMP.* groups, so many may not have read my contributions
to this thread.
Yours,
Lüko Willms http://www.mlwerke.de
/--------- L.WILLMS@jpberlin.de -- All rights reserved --
"In my view, the press has the _right_ to _insult_ writers,
politicians, actors and other public figures. Whenever I deemed
[such an attack on me] worth noticing, my motto in such cases was:
à corsaire, corsaire et demi [for a rogue, a rogue and a half]."
- Karl Marx, 17 Nov 1860 (Herr Vogt, Chapter XI)