Author Layout Hell-o
Carol

2004-07-27, 8:55 pm

Ok, for my 21 different layouts for 21 different files from the same
company:

In all the layouts, wherever I see PIC S9(9) USAGE COMP,

The translation (whatever it may be) will be the same?

The same is true for all field definitions?

thanks


William M. Klein

2004-07-27, 8:55 pm

Carol,
You have a 99.99 % expectation that "COMP" will be the same for all the file
layouts that were created on a single machine (with a single compiler). It is
POSSIBLE (but relatively unlikely) that the different file layouts were
brought into different programs that were compiled with different compiler
options/directives.

For example, if the files were created with a Micro Focus compiler, the
IBM-COMP
COMP-5
TRUNC

compiler options all impact exactly what "USAGE COMP" means. However, it is a
relatively safe assumption that if these 21 different files come from the same
company (creating them on the same machine with the same compiler) that PROBABLY
they used the same options and COMP in one means the same thing as COMP in
another.
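
To illustrate why those options matter when the files are read from outside COBOL, here is a minimal Python sketch. It assumes that PIC S9(9) COMP occupies four bytes holding a two's-complement integer, which is common but is not guaranteed by any standard. Big-endian byte order is the IBM mainframe convention; a native-binary setting on an Intel machine would typically give little-endian. The function name and the sample bytes are made up for illustration.

```python
import struct

def decode_s9_9_comp(raw: bytes, big_endian: bool = True) -> int:
    """Decode 4 bytes as a signed two's-complement integer."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # ">i" is a big-endian signed 32-bit integer, "<i" is little-endian.
    fmt = ">i" if big_endian else "<i"
    return struct.unpack(fmt, raw)[0]

# The same four bytes give different values under each assumption.
sample = bytes([0x00, 0x00, 0x01, 0x00])
as_big_endian = decode_s9_9_comp(sample, big_endian=True)      # 256
as_little_endian = decode_s9_9_comp(sample, big_endian=False)  # 65536
```

If a field with a known value decodes sensibly under one byte order in one file, the same choice will most likely hold for the other files from the same source.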

P.S. "COMP" in a *single* file layout *must* mean the same thing for every
layout within that same file. Even I (who can find loop-holes in almost
anything <G> ) can't think of any way for COMP to have multiple meanings within a
single file layout. Of course, COMP-3, COMP-4, COMP-5, etc *will*
possibly/probably mean different things.
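
Since COMP-3 comes up here, a minimal Python sketch of decoding it may help. It assumes the IBM mainframe packed-decimal convention: two decimal digits per byte, with the final half-byte holding the sign (hex D or B for negative; C, F, A or E for positive). Other platforms may pack differently, so the function name and the sample bytes are illustrative only.

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode an IBM-style packed-decimal field into an integer."""
    if not raw:
        raise ValueError("empty field")
    # Split every byte into its high and low half-bytes (nibbles).
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    # The last nibble is the sign, everything before it is a digit.
    sign = nibbles.pop()
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid digit nibble")
    value = 0
    for digit in nibbles:
        value = value * 10 + digit
    return -value if sign in (0x0B, 0x0D) else value

positive = unpack_comp3(bytes([0x12, 0x34, 0x5C]))  # 12345
negative = unpack_comp3(bytes([0x12, 0x3D]))        # -123
```

Any implied decimal point (the V in the PICTURE) is not stored in the data, so scaling has to be applied separately from the layout.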

P.P.S. In all these threads, have we talked about both implicit and explicit
"SYNC" and slack bytes? If you see binary zeroes or other unexpected "junk"
between fields, it may be because the compiler and operating system (following
specific rules) "insert" bytes (or even bits) between two defined fields.

--
Bill Klein
wmklein <at> ix.netcom.com



Carol

2004-07-28, 3:55 pm

I have seen that junk! I have cheated by fudging the starting points of
some of the translated fields.

I see that I may have been on to something. Thank you for telling me why
that is.



JerryMouse

2004-07-28, 3:55 pm

Carol wrote:
> I have seen that junk! I have cheated by fudging the starting
> points of some of the translated fields.
>
> I see that I may have been on to something. Thank you for telling me
> why that is!


He told you "what." Here's the "why" of "slack" bytes.

Early on, and continuing to this day, there were machines that had a marked
decrease in efficiency if some data fields were not on the proper
(half-word, full-word, double-word) boundary. The program must move these
mis-aligned fields to a proper alignment before they could be operated upon.

For example, assume the computer, in order to do binary arithmetic, could
load the register (where the binary arithmetic was to be done) only from a
memory location on a full-word boundary.

Extra code was generated by the compiler to move the binary-word argument to
a temporary full-word boundary in memory before the register-load
instruction. After the binary computation in the register, additional code
would have to be generated to take the result from the register and store it
in a temporary area then move it from the temporary area to its rightful
place in the program.

In days of yore (and sometimes today), COBOL programmers kept in the front
of their mind these hardware oddities. The compiler, itself, helped by
forcing all "01" level data names to full-word boundaries.

77-level data items in COBOL told the compiler to ignore this default
alignment. You'll see programs with tons of 77-level items in
Working-Storage instead of 01-level items for this original reason. In other
words, originally:

01 Data-A Pic X.
01 Data-B Pic X.
01 Data-C Pic X.

took up 12 bytes of storage. Change the "01" in each to "77" and the same
set takes up only three bytes - because the 01-level forces the compiler to
begin the data element on a full-word boundary in memory.
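
The arithmetic in that example can be sketched as follows. This is a minimal Python illustration, assuming every item is padded out to a whole number of boundary units. The function name and the boundary values are hypothetical, and real compilers decide this per implementation.

```python
def storage_used(item_sizes, boundary):
    """Total bytes when every item starts on a multiple of `boundary`."""
    total = 0
    for size in item_sizes:
        # Round each item up to a whole number of boundary units.
        units = -(-size // boundary)
        total += units * boundary
    return total

three_pic_x = [1, 1, 1]                         # three one-byte items
aligned_to_word = storage_used(three_pic_x, 4)  # 12
byte_aligned = storage_used(three_pic_x, 1)     # 3
```

With a different boundary the totals change accordingly, which is why the same three declarations can occupy different amounts of storage on different systems.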

Many of today's machines don't really care about byte alignment. They can do
binary arithmetic in situ, that is, memory-to-memory, without having to load
a register. There are other machines with hardware instructions to load
accumulators/registers without regard for byte alignment of the parameters.

Still, "slack bytes" will be with us forever.

"Slack bytes" are not to be confused with "expansion bytes" (areas stuck in
by the programmer for future use).


Carol

2004-07-28, 3:55 pm

why - thank you!

Cobol is an interesting place to visit.



Lueko Willms

2004-07-28, 3:55 pm

On 28.07.04,
nospam@bisusa.com (JerryMouse) wrote
in /COMP/LANG/COBOL,
in message 2KadnTEXxpAcP5rc4p2dnA@giganews.com,
regarding Re: Layout Hell-o:

n> took up 12 bytes of storage. Change the "01" in each to "77" and the
n> same set takes up only three bytes

unless the SYNCHRONIZED clause is specified in the data description

n> - because the 01-level forces the
n> compiler to begin the data element on a full-word boundary in memory.


Yours,
Lüko Willms http://www.mlwerke.de
/--------- L.WILLMS@jpberlin.de -- All rights reserved --

"In my view the press has the _right_ to _insult_ writers,
politicians, comedians and other public figures. If I deemed
[such an attack on me] worth noticing, my motto in such cases was:
à corsaire, corsaire et demi [for one rogue, a rogue and a half]."
- Karl Marx, 17.11.1860 (Herr Vogt, Chapter XI)


Chuck Stevens

2004-07-28, 3:55 pm


"JerryMouse" <nospam@bisusa.com> wrote in message
news:2KadnTEXxpAcP5rc4p2dnA@giganews.com...

> Early on, and continuing to this day, there were machines that had a
> marked decrease in efficiency if some data fields were not on the proper
> (half-word, full-word, double-word) boundary. The program must move these
> mis-aligned fields to a proper alignment before they could be operated
> upon.

This rationale may be accurately stated for some machines, but I don't think
it's necessarily accurate for all. It is not now, nor has it ever (so far
as I know) been, necessary to move a non-sync word-oriented item to a
temporary array in order to retrieve it into top-of-stack on a Burroughs
B5000 or any of its successors. The sequence of operations will be less
efficient, but it's doable without using a temporary.

> The compiler, itself, helped by
> forcing all "01" level data names to full-word boundaries.


That depends on the implementation, but I'd say yes, probably 01-level items
are generally aligned at the largest applicable boundary irrespective of
content. But any relationship *between* 01-level items is an
implementor-specified detail; for example, in Unisys MCP COBOL74 (and its
COBOL(68) predecessor), every 01-level item occupies a *separate* memory
space. The 01-level item is thus word-aligned not because 01-level items
are word-aligned but because memory is allocated in word-aligned chunks.

> 77-level data items in COBOL told the compiler to ignore this default
> alignment.


Not exactly, as I understand it. Level 01 was originally intended to apply
to records to which elementary items were subordinate and to inform the
compiler to expect (but not require) such subordinate elementary items.
Level 77 was intended to inform the compiler that no such subordinate
entries existed and that the data items so described were independent of any
other data items. The standard has never mandated a difference in
functionality between an 01-level elementary item and a 77-level item. The
big difference is that the implementor can do what he wants with the
information that the item's description isn't dependent on its location or
its relationship to any other data item. What the implementor does with
that information is up to the implementor. And the reasons a user in one
implementation might want to use 77's instead of 01's might indeed be in
*direct conflict* with different implementations. If a 77-level numeric
item is generally presumed to be "on the stack" in a stack-oriented machine,
it's likely to be retrievable *much* more quickly than an 01-level
elementary item if the compiler must generate code to retrieve the latter
through a descriptor to memory!

> You'll see programs with tons of 77-level items in
> Working-Storage instead of 01-level items for this original reason.


You may see programs with lots of 77-level *numeric* items in
Working-Storage instead of 01-level items because the compiler recognizes
that, since these are 77-level items, it's often free to optimize the usage.
It's not clear to me that "this original reason" is the only possible
explanation. I know programmers who use 77-level items to describe
independent data items and reserve 01 for records precisely because they
want to make it absolutely and unambiguously clear that their 77-level items
are independent data items and their 01-level items are, or are to be
treated as, records.

> In other words, originally:
>
> 01 Data-A Pic X.
> 01 Data-B Pic X.
> 01 Data-C Pic X.
>
> took up 12 bytes of storage. Change the "01" in each to "77" and the same
> set takes up only three bytes - because the 01-level forces the compiler
> to begin the data element on a full-word boundary in memory.


On some systems, not all!

It's not clear to me that the "original reason" is that 77-level items butt
up against one another at byte boundaries thus saving memory, and 01-level
ones require a minimum of 32 bits -- in fact, I think there's as much
evidence to suggest that the *original* reason is that 77-level numerics end
up on the stack and were accessible by a simple VALC, while 01-level
elementary numerics were in memory and required sequences like LT, NAMC,
NXLV (at best) to retrieve them! ;-)

> Still, "slack bytes" will be with us forever.


I agree. But I don't think of "slack bytes" as the space between 01's, or
for that matter, between 77's, but rather empty spaces between items *in a
record*.

For example, the Unisys MCP implementation requires group items to begin and
end on a byte boundary (for purposes of MOVE, they are, after all,
alphanumeric). However, packed-decimal items, which occupy data in four-bit
increments, may begin or end on a four-bit boundary. If for example the
last elementary item in a group begins on a byte boundary but is unsigned
and contains an odd number of digits, or is signed and contains an even
number of digits, the group that immediately follows will be preceded by
four slack bits.

E.g.,
 01 A-RECORD.
     03 GROUP-1.
         05 ITEM-1 PIC S9(4) PACKED-DECIMAL.
* four slack bits go here
     03 GROUP-2.
         05 ITEM-2 PIC 9(5) PACKED-DECIMAL.
* four more slack bits go here
     03 ITEM-3 PIC X.
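
The rule described above can be checked with a little arithmetic. The following is a minimal Python sketch, assuming (as the description implies) one four-bit unit per digit plus one more for the sign when the item is signed, and an item that starts on a byte boundary. The function names are hypothetical and the rule is specific to the implementation described, not to COBOL in general.

```python
def packed_nibbles(digits: int, signed: bool) -> int:
    """Four-bit units used: one per digit, plus one for a sign."""
    return digits + (1 if signed else 0)

def slack_bits_after(digits: int, signed: bool) -> int:
    """Bits needed to reach the next byte boundary after the item."""
    # An odd count of four-bit units leaves half a byte unused.
    return 4 if packed_nibbles(digits, signed) % 2 else 0

after_item_1 = slack_bits_after(4, signed=True)   # PIC S9(4): 4
after_item_2 = slack_bits_after(5, signed=False)  # PIC 9(5): 4
```

An unsigned item with an even number of digits, or a signed one with an odd number, ends exactly on a byte boundary and needs no slack.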

-Chuck Stevens


Warren Simmons

2004-07-28, 3:55 pm

Jerry, I nominate you to do definitions. That was a very clear
description of the problem from most perspectives. The Engineers
would wonder why <G>
Thanks for that effort.

Warren Simmons


Richard

2004-07-28, 8:55 pm

"JerryMouse" <nospam@bisusa.com> wrote

> 77-level data items in COBOL told the compiler to ignore this default
> alignment. You'll see programs with tons of 77-level items in
> Working-Storage instead of 01-level items for this original reason. In other
> words, originally:
>
> 01 Data-A Pic X.
> 01 Data-B Pic X.
> 01 Data-C Pic X
>
> took up 12 bytes of storage. Change the "01" in each to "77" and the same
> set takes up only three bytes - because the 01-level forces the compiler to
> begin the data element on a full-word boundary in memory.


I don't think that is true, or is certainly not true in some
implementations.

77 levels are _non-contiguous_ items. This means that they are _not_
contiguous, or at least do not need to be contiguous, whereas you have
it that the 77 level _makes_ them contiguous.

In all implementations that I have used, a 77 level is treated
identically to an 01 with no subordinate items. In other
words your example using 01s would take 12 (or 24) bytes and as 77s
would take exactly the same amount of space.

In fact with Micro Focus there is an 'ALIGN' directive which specifies
the alignment boundary to be used for both 01s and 77s. The default
is 8 bytes, hence 24 bytes for the three 01s or 77s.

The way to defeat slack bytes is to use subordinate items in a
grouping 01.

01 Data-Group.
03 Data-A Pic X.
03 Data-B Pic X.
03 Data-C Pic X.

This takes 3 bytes and will be followed by one or more slack bytes
before the next 01 or 77.

The point about 77s is that they are non-contiguous, the compiler is
allowed to (but need not) insert slack bytes, or move some of these
items elsewhere in memory, or shuffle them in any way it feels like.
01 levels are record items and are expected to have subordinate items.
Each subordinate item _is_ contiguous at the byte level.

Just a historical note: I was given a client support problem back in
the late 60s when a customer couldn't fit his program into the 16Kword
1901A machine. He had 'saved space' by having a series of single
character flags as 03 levels in a record. It turned out that each
reference to one of these flags required 11 words of instructions to
extract and align the character, it being a word machine and rather
clumsy with characters. Changing all those flags to use a word each
saved several Kwords and made it fit.


William M. Klein

2004-07-28, 8:55 pm

I thought that Chuck had already explained this.

There is nothing in the '85 (or 2002 - or I think 74) Standards that in ANY way
distinguishes between

01 Field1 Pic X.
01 Field2 Pic X.

and

77 Field1 Pic X.
77 Field2 Pic X.

The implementor may (or may not)

A) put the 01-levels on specific types of boundaries
B) put the 77-levels on the same or different types of boundaries
C) place field1 and field2 (of either type) in any particular relationship to
each other in real or virtual storage.

There is one and ONLY one difference between 01-levels and 77-levels:

01-levels MAY (but need not) have subordinate items;
77-levels may NOT (ever) have subordinate items defined.

Slack bytes (or bits) in COBOL terminology refer only to those bytes within the
same "record" description, e.g.

01 aRecord.
    05 Fielda Pic X.
*> there might be a slack byte, bit or other here
    05 Fieldb Pic 9 Binary Sync.
*> there might or might not be slack stuff here
    05 Where-does-the-group-start
          Occurs 5 times.
        10 Fieldc Usage Index.
*> what about this?
        10 Fieldd Pic S9(2) Usage Packed-Decimal.
*> what about here?
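
For anyone decoding such a record from outside COBOL, the first question above can be worked through mechanically once a size and alignment are assumed for each field. The following is a minimal Python sketch. The sizes and alignments are assumptions chosen for illustration (for example, Fieldb as a two-byte binary item aligned on a two-byte boundary), not values any particular compiler guarantees, and the function name is hypothetical.

```python
def layout(fields):
    """Compute (name, offset, slack_before) for each field.

    `fields` is a list of (name, size_in_bytes, alignment) tuples.
    """
    offset = 0
    result = []
    for name, size, alignment in fields:
        # Slack is whatever it takes to reach the next aligned offset.
        slack = (-offset) % alignment
        offset += slack
        result.append((name, offset, slack))
        offset += size
    return result

# Assumed sizes: Fielda is 1 byte; Fieldb is a 2-byte binary, 2-aligned.
offsets = layout([("Fielda", 1, 1), ("Fieldb", 2, 2)])
# offsets == [("Fielda", 0, 0), ("Fieldb", 2, 1)]
```

Under these assumptions there is one slack byte between Fielda and Fieldb. Comparing computed offsets like these against where the data actually sits is a more systematic alternative to adjusting start positions by hand.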

--
Bill Klein
wmklein <at> ix.netcom.com



Warren Simmons

2004-07-29, 3:55 am

Chuck,

You have added lots of good information on the current status of levels.
Thanks.
Regarding the original intent, while I cannot say I could read the
minds of those involved, I do remember that at that time "we were trying
to solve what we considered hardware problems." Our members were, in my
view, not as knowledgeable about their hardware, nor about how the
compiler would handle things.

Things have changed in all areas of computer interface. That's why I believe
it is reasonable to make the standard say that all data is defined in STAND
NOTATION, without reference to sync, or bytes, or bits. Then the compiler
writer will optimize to fit the hardware, and it would be reasonable to ask
them to create a label with the hardware record description for each file as
it is created. If key words are necessary to specify the exact method of
recording a record, a field, a byte, etc., they are probably well covered
now, and few would be needed as extensions. Yet, this is only 2004! What is
coming? Will source programs begin to answer the phone, and operate the
computer without human intervention? I have no idea. But the current state
of data definition in source programs is a laugh.

Warren Simmons


Tukla Ratte

2004-07-29, 3:55 pm

Carol wrote:

> Cobol is an interesting place to visit.


But you wouldn't want to live there! <rimshot>

--
Tukla, Squeaker of Chew Toys
Official Mascot of Alt.Atheism


Carol

2004-07-29, 3:55 pm

exactly



Richard

2004-07-29, 8:55 pm

"William M. Klein" <wmklein@nospam.netcom.com> wrote

> I thought that Chuck had already explained this.


That depends on the definition of 'already'. No doubt you received
mine after Chuck's, but Google may take several hours to post
messages.

> A) put the 01-levels on specific types of boundaries
> B) put the 77-levels on the same or different types of boundaries
> C) place field1 and field2 (of either type) in any particular relationship to
> each other in real or virtual storage.


I thought that I had already covered that ;-)

> Slack bytes (or bits) in COBOL terminology refer only to those bytes within
> the same "record" description, e.g.


That may be true, but implementations may align 77s and 01s to
boundaries which _also_ require slack bytes outside of those data
items. Or did you have another term which is more appropriate?


William M. Klein

2004-07-29, 8:55 pm


"Richard" <riplin@Azonic.co.nz> wrote in message
news:217e491a.0407291302.2898209a@posting.google.com...
> "William M. Klein" <wmklein@nospam.netcom.com> wrote

<snip>
>
>
> That may be true, but implementations may align 77s and 01s to
> boundaries which _also_ require slack bytes outside of those data
> items. Or did you have another term which is more appropriate ?


I don't know what I would call those things. As Chuck (or someone) pointed out,
any two 01-levels (or 77-levels or a combination of the two) don't even need to
be in the same address space (and aren't in some implementations). Often (with
the '85 Standard) adding the EXTERNAL attribute to one of 2 77-level/01-levels
will place them in TOTALLY different "thingies" (operating system/compiler
specific).

Bottom-line:
Yes there may be bytes (or bits) between two 01-levels or 77-levels - but it
is also possible that there may be NO known (to the programmer) relationship
between the two. If you want to EXTEND the definition of "slack bytes" to
include such storage (when an implementor does this), that is OK with me - but
it is certainly not what I meant in my original post - nor what I think is the
normal COBOL usage of the term.


--
Bill Klein
wmklein <at> ix.netcom.com


Robert Wagner

2004-07-29, 8:55 pm

"Carol" <kgdg@helkusa.com> wrote:

>Cobol is an interesting place to visit.


Packed and binary types, as well as possible slack bytes, were not a Cobol
invention; they were a mainframe invention. Other common mainframe languages
write files with the same types.


Chuck Stevens

2004-07-29, 8:55 pm

"Warren Simmons" <wsimmons5@optonline.net> wrote in message
news:41085D70.3030507@optonline.net...

> ... Things have changed in all areas of computer interface. That's why I
> believe it is reasonable to make the standard say, all data is defined in
> STAND NOTATION, without reference to sync, or bytes, or bits. Then the
> compiler writer will optimize to fit the hardware, and it would be
> reasonable to ask them to create a label with the hardware record
> description for each file as it is created. If key words are necessary to
> specify the exact method of recording a record, a field, a byte, etc. they
> are probably well covered now, and few would be needed as extensions.


Let's presume for the moment that it was indeed practical to describe the
contents of a given record in an implementation-independent fashion, in such
a way as to allow any arbitrary user to decode the data in such a way as to
make it useful on another system. I will return to this later.

Things may have changed more than you realize, and in a way you didn't
anticipate!! ;-)

First: In 2002 COBOL, the record description that used to be mandatory with
every FD or SD is now OPTIONAL. "READ <filename> INTO <record-name>" has
been there all along; what allows this to work is "WRITE FILE <filename>
FROM <record-name>". This is but a short step from "WRITE
<dummy-record-name> FROM <actual-working-storage-record-name>", which has
been permissible all along. Because using "dummy record descriptions" in
the FILE section has been permissible all along, it has always been
impractical to associate *meaningful* record descriptions with the file
itself, and I believe requiring *all* record descriptions that might end up
describing a file's contents to be associated with that file to be a
restriction that would bring howls of protest from many programmers. Quite
simply, I don't think the requirement could be met in the general case, and
the limitations that it would place on most application designers -- that
they *could not* use WRITE FROM without "breaking" this mechanism, for
example -- would represent a *retrograde* step.

Second: the identification of which of multiple record descriptions applies
to a particular record has historically been an execution-time
determination. The new-for-2002 SELECT WHEN clause provides some
capabilities here, but its use is STRICTLY up to the user; it's never
mandatory. Absent the availability and use of that clause, no matter how
implementation-independent the record descriptions are, which description
applies to a particular record in the file cannot be determined
automatically.

Third: If a given record has within it a REDEFINES that is used to
distinguish between "subtypes" of a given record, I don't see a practical
means of recording, in an implementation-independent way, the mechanism
whereby the choice is made, other than associating the choice with the
physical record.
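
A minimal Python sketch of the run-time choice described in the second and third points: a reader has to inspect each record to decide which description applies. The one-byte type code, both layouts, and every field name here are hypothetical, invented for illustration and not taken from any real file.

```python
import struct

# Hypothetical layouts keyed by a one-character record-type code.
# ">" means big-endian with no padding; both layouts are 13 bytes long,
# the way two REDEFINES views of one record area would be.
LAYOUTS = {
    b"H": ("header", struct.Struct(">c8sI"), ("type", "batch_id", "record_count")),
    b"D": ("detail", struct.Struct(">cIq"), ("type", "account", "amount_cents")),
}

def decode(record: bytes) -> dict:
    """Pick the layout from the record's own first byte, then unpack it."""
    name, layout, fields = LAYOUTS[record[:1]]
    values = layout.unpack(record)
    return {"layout": name, **dict(zip(fields, values))}

header = struct.pack(">c8sI", b"H", b"BATCH001", 2)
detail = struct.pack(">cIq", b"D", 4711, -1250)

print(decode(header))
print(decode(detail))
```

Nothing in the bytes themselves says that the first byte is the discriminator; that knowledge lives in the program, which is the difficulty being described.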

Fourth: While OCCURS ... DEPENDING creates a whole set of problems in
defining the layout of a record, particularly when the ODO item is not
within (nor can it be derived from) the record itself, two major new
wrinkles will be present in the post-2002 standard. The details for these
two new wrinkles -- any-length elementary items and dynamic-capacity
tables -- are still being worked out, but one of the major goals is that
items of this sort will be allowed within records, with other items
(including variable-length ones) immediately following them in the record
description. If such a record "acts as if" the variable-length items are of
a fixed length at any given instant, but as the lengths vary it "acts as if"
the data items following the item whose lengths are varying appear in
different locations relative to the start of the record, the task of
describing the record in an *implementation-independent* fashion becomes
rather more complex.
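
A small Python sketch of why a variable-length table complicates a layout description: the field that follows the table has no fixed offset. The layout assumed here (a 2-byte big-endian count, that many 4-byte signed entries, then a 3-byte trailer) is hypothetical and only stands in for an OCCURS ... DEPENDING ON item whose count is stored in the record.

```python
import struct

def parse(record: bytes) -> dict:
    """Parse: 2-byte count, `count` 4-byte signed entries, 3-byte trailer."""
    (count,) = struct.unpack_from(">H", record, 0)
    entries = list(struct.unpack_from(f">{count}i", record, 2))
    trailer_offset = 2 + 4 * count          # moves whenever count changes
    trailer = record[trailer_offset:trailer_offset + 3]
    return {"entries": entries, "trailer_offset": trailer_offset, "trailer": trailer}

short_record = struct.pack(">H1i3s", 1, 10, b"END")
long_record = struct.pack(">H3i3s", 3, 10, 20, 30, b"END")

print(parse(short_record))
print(parse(long_record))
```

When the count is held outside the record, as the paragraph above notes can happen, even this much cannot be recovered from the record alone.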

Fifth: I believe the only solution to the above four issues is a mechanism
whereby, on any given WRITE statement, a description of the record is
incorporated into the physical data written on the medium, in an
implementation-independent "standard notation". I do not believe it is
either appropriate or possible for the compiler to ensure that every record
written to the file has an accurate standard-notation record description for
the record in any other way. A mechanism analogous to the
obsolete-since-1985 LABEL RECORDS clause can function meaningfully only for
a subset of programs that follow restrictions that have been lifted
elsewhere in COBOL, and is prone to serious error.
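
To make the fifth point concrete, here is a hedged Python sketch of a record that carries its own description. The descriptor format (a length-prefixed JSON list of field names and `struct` format codes) is invented for this sketch; it is not a proposal from the thread or any existing standard.

```python
import json
import struct

def write_record(fields: list) -> bytes:
    """Prefix the packed payload with a JSON description of its fields.

    Each field is a (name, struct format code, value) tuple.
    """
    descriptor = json.dumps([[name, code] for name, code, _ in fields]).encode("ascii")
    fmt = ">" + "".join(code for _, code, _ in fields)
    payload = struct.pack(fmt, *(value for _, _, value in fields))
    return struct.pack(">H", len(descriptor)) + descriptor + payload

def read_record(blob: bytes) -> dict:
    """Recover field names and values using only the embedded descriptor."""
    (dlen,) = struct.unpack_from(">H", blob, 0)
    descriptor = json.loads(blob[2:2 + dlen])
    fmt = ">" + "".join(code for _, code in descriptor)
    values = struct.unpack_from(fmt, blob, 2 + dlen)
    return dict(zip((name for name, _ in descriptor), values))

blob = write_record([("account", "I", 4711), ("amount_cents", "q", -1250)])
print(read_record(blob))
print(len(blob), "bytes written for a 12-byte payload")
```

Two things stand out even in a toy: the description is larger than the data it describes, and the reader still has to know what the codes "I" and "q" mean, so the notation itself needs a specification.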

Sixth: Data description notation.

I took, as an example, a PICTURE S9(18) USAGE BINARY (legal in '85-standard
COBOL) on Unisys MCP systems and came up with some of its characteristics.
NOTE the discussion assumes the leftmost bit is numbered 95, the rightmost
bit 0.
1) It occupies twelve eight-bit bytes (a total of 96 bits).
2) It must be aligned on an 8-bit boundary in its memory space.
3) Bit 95 may be used by software, so it needs to be preserved somehow,
   but is not used by hardware.
4) Bit 94 contains the sign of the mantissa, 0 = positive, 1 = negative.
5) Bit 93 contains the sign of the exponent.
6) The six bits downward from bit 92 through bit 87 contain the LOW-ORDER
   six bits of the exponent magnitude (which is the power of eight by which
   the mantissa -- described later -- is multiplied). It's likely this will
   contain 13 (see later).
7) The thirty-nine bits downward from bit 86 through bit 48 comprise the
   integer part of the mantissa.
8) The nine bits downward from bit 47 through bit 39 contain the HIGH-ORDER
   nine bits of the exponent magnitude.
9) The thirty-nine bits downward from bit 38 through bit 0 contain the
   fractional part of the mantissa.

[Note that there are three common accurate representations for the value
(-1) in that description -- @661000000000000000000000@, normalized double,
which is the same as a normalized single that's had either XTND or the
sequence ZERO, JOIN applied against it; @400000000001000000000000@, which is
a single-precision integer that's had either XTND or the sequence ZERO,
JOIN applied against it; and @4C8000000000000000000001@, canonic
double-precision integer, which has an exponent of thirteen with the
mantissa right-justified -- but there are actually 26 valid representations
of that value in this form. I don't see a way for such a data item, with
26 possible options, to be described *economically* in an
implementation-independent fashion.]
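
A Python sketch of a decoder that follows the bit layout in the sixth point literally (sign in bit 94, exponent sign in bit 93, exponent split across bits 92-87 and 47-39, mantissa split across bits 86-48 and 38-0). It is only a reading of the description given in this post, not Unisys documentation, and the function name is made up. It is applied to the first two of the representations quoted in the note.

```python
from fractions import Fraction

def decode_mcp_double(hex24: str) -> Fraction:
    """Decode a 96-bit value given as 24 hex digits, per the layout above."""
    n = int(hex24, 16)
    negative = (n >> 94) & 1               # bit 94: sign of mantissa
    exp_negative = (n >> 93) & 1           # bit 93: sign of exponent
    exp_low = (n >> 87) & 0x3F             # bits 92-87: low 6 exponent bits
    int_part = (n >> 48) & (2**39 - 1)     # bits 86-48: integer part
    exp_high = (n >> 39) & 0x1FF           # bits 47-39: high 9 exponent bits
    frac_part = n & (2**39 - 1)            # bits 38-0: fractional part
    exponent = (exp_high << 6) | exp_low
    if exp_negative:
        exponent = -exponent
    mantissa = int_part + Fraction(frac_part, 2**39)
    value = mantissa * Fraction(8) ** exponent   # exact, no rounding
    return -value if negative else value

print(decode_mcp_double("661000000000000000000000"))   # normalized double
print(decode_mcp_double("400000000001000000000000"))   # extended single integer
```

Both calls print -1. Exact rational arithmetic is used so that "same value" can be checked with `==` rather than with a floating-point tolerance.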

Seventh: Going back to the issue of "standard notation", even presuming the
ability to make such a description *economical* enough to be associated with
the file on its medium or the individual records in that file, a means of
describing arbitrary fields of arbitrary record descriptions as implemented
on arbitrary hardware in a manner that would allow any user or application
program to decode every record accurately and completely would itself
require the specification of a complete "standard notation" that would cover
all eventualities. That "standard notation" would in effect be a language
of its own, and would need to be completely described -- so that any
implementation could build software to read, understand and translate a
record description from any other implementation.

It's my opinion that this isn't a single proposal, it's two proposals. The
first is devising a way to describe the records in an
implementation-independent way; the second is to attach these descriptions
in some fashion to the file or to the records in that file.

Devising an industry-standard mechanism whereby any record description from
any particular implementation could be described in such a way that any
implementation could understand not only its data alignment but its contents
is, I would suggest, *way* beyond the purview of COBOL, or the COBOL
standards committee. Certainly I don't have time to write a proposal
detailing the specifications for the description of fields of arbitrary
usage and alignment and content in records as produced by arbitrary system
in a way that any system could read and understand it.

If somebody wants to write such a proposal, I'd certainly be willing to
look at it, but I wouldn't consider supporting the attachment of
standard-notation record descriptions to files or records at run time as
part of the COBOL specification without it.

As a practical matter, I think that standard-notation specification is too
complex to include directly into the COBOL standard. If it's worth doing,
it's worth doing as a separate, language-independent ISO standard, and once
that's adopted, the COBOL standard can add whatever "hooks" are needed in
the syntax to invoke its provisions.

Write a formal, detailed proposal for the record description
language/notation, and we can certainly see; the availability of such a
specification is prerequisite to any change to standard COBOL. J4 might
even champion such a specification (and they might not). But don't expect
to see anything in the next version of the COBOL standard; WG4 informed J4
in no uncertain terms some thirteen months ago what the list of candidate
features was for the next standard (due out in 2008, and to meet that date
the initial draft for international review needs to be finished Real Soon
Now), and that none more was to be considered or proposed. At this point I
think WG4 is more likely to decide to *delete* items from that list (or to
delete features suggested for items on that list) than they are to *add* new
ones at the October 2004 meeting in The Hague. J4 has certainly been
enjoined by WG4 from deciding to add new widgets to WG4's list on its own!

-Chuck Stevens


Chuck Stevens

2004-07-29, 8:55 pm

I wrote:

> [Note that there are three common accurate representations for the value
> (-1) in that description -- ... and @4C8000000000000000000001@, canonic
> double-precision integer, ...


Wrong. My mistake. That's -68,719,476,736. The value for (-1) is
@468000000000000000000001@.

For grins, while I'm at it, for those who care, it turns out there seem to
be 27, not 26, exact representations of integers < 8 in double precision on
Unisys MCP systems. Using (-7) as an example, they are:
@667000000000000000000000@, = normalized USAGE DOUBLE
@658E00000000000000000000@
@6501C0000000000000000000@
@648038000000000000000000@
@640007000000000000000000@
@638000E00000000000000000@
@6300001C0000000000000000@
@628000038000000000000000@
@620000007000000000000000@
@618000000E00000000000000@
@6100000001C0000000000000@
@608000000003800000000000@
@400000000007000000000000@, extended single-integer
@600000000007000000000000@ (forgot this one: ditto, with (-0) in the
exponent).
@408000000000007000000000@
@410000000000000E00000000@
@4180000000000001C0000000@
@420000000000000038000000@
@428000000000000007000000@
@430000000000000000E00000@
@4380000000000000001C0000@
@440000000000000000038000@
@448000000000000000007000@
@450000000000000000000E00@
@4580000000000000000001C0@
@460000000000000000000038@
@468000000000000000000007@, canonic double-integer.

The corresponding single-precision representations are given by lopping off
the rightmost 48 bits of zeroes from the first fourteen of these. Thus:
@667000000000@, = normalized USAGE REAL
...
@400000000007@, canonic single-integer
@600000000007@, ditto with (-0) in the exponent field


And, back to the topic of a standard-notation representation of records, any
implementation-independent description of information in these formats had
better be prepared to understand all of them, and why they all represent the
same value, to say nothing of what that value is ...
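
For anyone who wants to cross-check a list like this, here is a hedged Python sketch that generates the encodings instead of typing them. It assumes the field positions described earlier in the thread (mantissa sign in bit 94, exponent sign in bit 93, low six exponent bits in 92-87, integer part in 86-48, high nine exponent bits in 47-39, fraction in 38-0); the function names are made up. Hand-written hex of this kind is easy to get wrong by one shift, so generating it is a useful check.

```python
def encode(exponent: int, exp_negative: bool, int_part: int, frac_part: int) -> str:
    """Assemble the 96 bits of a negative value as 24 hex digits."""
    n = 1 << 94                                  # mantissa sign: negative
    n |= int(exp_negative) << 93                 # exponent sign
    n |= (exponent & 0x3F) << 87                 # low 6 exponent bits
    n |= int_part << 48                          # 39-bit integer part
    n |= (exponent >> 6) << 39                   # high 9 exponent bits
    n |= frac_part                               # 39-bit fractional part
    return f"{n:024X}"

def representations_of_minus(k: int) -> list:
    """All exact encodings of -k for an integer 0 < k < 8."""
    reps = []
    for e in range(12, 0, -1):                   # exponents -12 .. -1
        reps.append(encode(e, True, k * 8**e, 0))
    reps.append(encode(0, False, k, 0))          # exponent +0
    reps.append(encode(0, True, k, 0))           # exponent -0
    for e in range(1, 14):                       # exponents +1 .. +13
        reps.append(encode(e, False, 0, k * 8**(13 - e)))
    return reps

reps = representations_of_minus(7)
print(len(reps))
for rep in reps:
    print("@" + rep + "@")
```

The count comes out at 27: twelve negative exponents, two zero exponents (+0 and -0), and thirteen positive ones. Each step moves the three significant bits one octal digit to the right.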

-Chuck Stevens


Warren Simmons

2004-07-30, 3:55 am

Chuck Stevens wrote:

>"Warren Simmons" <wsimmons5@optonline.net> wrote in message
>news:41085D70.3030507@optonline.net...
>
>
>

Chuck,

Forgive me for excluding your details in my reply. Too few are really
interested, and I am grateful that you have given me the time and detail
supporting your position. I wish I could present a case for anything half
as well.

I realize I am not current on COBOL. It's just too much work to catch up
with the likes of some of you who work in the effort today. Memory becomes
fragile and I wonder if I have read or been told things that keep reminding
me that there is a lot to do in the area of a standard language that has
not been addressed yet. Knowing full well how long it takes to arrive at
any standard (the Data Base Task Group took 7 years and didn't really
produce anything because among other things, the methods of Mr. Date had
been sold already). Remember, this is all my view.

The world went from 80 col. cards, to tape, and other means of recording,
and the good things about the card system were lost. A hole was a hole. I
generally feel that our system of government is as even as it can be. Yet,
for some things we allow a great deal of leeway and in others we define it
down to the nanosecond.

I feel that at some point, there needs to be a reckoning to protect what we
have and need for the future. When I moved in with my daughter and gave her
my 99 Buick, I thought it was a fine car for what I needed. Just this past
week she traded it in on a new Acura (her favorite brand), and it is a
great car to ride in. However, other than seats, trunk, mileage, it's a
mode of transportation. I recall the B3500s we had used a memory address
system that ignored the binary system. I remember when RCA tried to build
an IBM system, their designers did a "better" job with the instruction set,
and the RCA could not run the IBM software. I remember that the Honeywell
200 failed its open house in WDC that I attended, because, at first,
Honeywell converted unpunched columns to spaces instead of zeros. I
remember when the B6500 line was designed, a look at the customer base
showed them that the major applications on B5000s were data processing.
So, they designed one instruction to replace a whole lot of instructions
(I was told by the project leader it was 99).

In the various conversations in this newsgroup over the last several months
and well before that, problems of files not readable because the format was
not known, special vendor tape drives to convert data from other vendors ...
have been covered.

While I feel the 2002 standard is a big plus because of its International
Status, I'm not too sure that everything now included is as important
as better exchange of data without the tedium that exists. I would
not care how that was fixed. I think it's more important than OO, but
I believe OO is good. Perhaps I'm alone on an island regarding the
implications to our business and other needs for the safety of our way of
life, but I believe the continuation of internal vs. external computer
operation is well beyond its lifetime. And I don't believe that WG4, etc.
have the best interests of our needs in mind except in the growth of the
language which I like.

Again, I appreciate the effort you have made to put out my fire. I shall
not be discussing this in this forum again. After all it is a COBOL forum.

Warren Simmons



Richard

2004-07-30, 3:55 am

"William M. Klein" <wmklein@nospam.netcom.com> wrote

>
> I don't know what I would call those things. As Chuck (or someone) pointed out,
> any two 01-levels (or 77-levels or a combination of the two) don't even need to
> be in the same address space (and aren't in some implementations). Often (with
> the '85 Standard) adding the EXTERNAL attribute to one of 2 77-level/01-levels
> will place them in TOTALLY different "thingies" (operating system/compiler
> specific).
>
> Bottom-line:
> Yes there may be bytes (or bits) between two 01-levels or 77-levels - but it
> is also possible that there may be NO known (to the programmer) relationship
> between the two. If you want to EXTEND the definition of "slack bytes" to
> include such storage (when an implementor does this), that is OK with me - but
> it is certainly not what I meant in my original post - nor what I think is the
> normal COBOL usage of the term.


In fact the term 'slack bytes' is a generic term that does not apply
just to Cobol or to data layouts. There is nothing that I am doing
that could be construed as _extending_ its meaning, but you seem to
only want to use it in a very restricted sense.

For example the term is commonly used in connection with file
allocation on a disk. Disk files have a particular size but the space
on the disk is given to a file as a number of fixed size allocation
units or clusters. The difference between the actual file size and
the allocated space are 'slack bytes'.

In the cases where data items are put into different address spaces
(which happened with ICL 1900 Cobol because there was a 4096 byte
'lower' memory, once this was used everything went into 'upper' at
some other place) such as separate segments, or a heap, or EXTERNAL,
then if these areas start on some alignment boundary there may still
be internal slack bytes where the segment size is rounded up to a
boundary or external slack bytes where the segment is set to the exact
size, but the next segment on disk/memory/virtual memory must start on
a boundary.

If you don't want to use the common term for these and you don't have
another term then you may be reduced to saying 'thingies with no name'
when referring to them.

I just did a Google for 'slack bytes' and on the IBM site I found:

"""" --------------------------------------
# COBOL for AIX (V2.0)
# Language Reference

Slack bytes
There are two types of slack bytes:

* Slack bytes within records: unused character positions that
precede each synchronized item in the record
* Slack bytes between records: unused character positions added
between blocked logical records

""""

These are, of course, only the two relevant types, there may be slack
bytes at the end of the block or at the end of the file.
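
A rough Python sketch of the two kinds of slack bytes in the IBM definition quoted above. The field names, sizes and alignment requirements are hypothetical; real compilers decide alignment from USAGE, SYNC and their own options.

```python
def place_fields(fields, record_alignment):
    """Lay out (name, size, alignment) fields in order and count the padding.

    Returns the offset of each field, the slack bytes inserted within the
    record, and the slack bytes needed before the next blocked record.
    """
    offset = 0
    offsets = {}
    slack_within = 0
    for name, size, alignment in fields:
        pad = -offset % alignment        # bytes needed to reach the boundary
        slack_within += pad
        offset += pad
        offsets[name] = offset
        offset += size
    slack_between = -offset % record_alignment
    return offsets, slack_within, slack_between

# Hypothetical record: a 1-byte code, a 4-byte SYNC binary item, a 1-byte flag.
fields = [("CODE", 1, 1), ("COUNT", 4, 4), ("FLAG", 1, 1)]
offsets, slack_within, slack_between = place_fields(fields, record_alignment=4)
print(offsets, slack_within, slack_between)
```

Here COUNT lands at offset 4 rather than 1, which is the "junk between fields" a reader of the raw file would see, and three more bytes separate one record from the next.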
Lueko Willms

2004-07-30, 8:55 am

.. On 29.07.04
charles.stevens@unisys.com (Chuck Stevens) wrote
in /COMP/LANG/COBOL
in cec34l$2h5t$1@si05.rsvl.unisys.com
about Re: Layout Hell-o

CS> And, back to the topic of a standard-notation representation of
CS> records, any implementation-independent description of information in
CS> these formats had better be prepared to understand all of them, and
CS> why they all represent the same value, to say nothing of what that
CS> value is ...

I don't think there is any sense in implementation-independent
descriptions of very implementation-specific representations of data.

What is necessary, is to have generic formats and their description
for information INTERCHANGE as the ANSI standards for interchange
magnetic TAPES, extended to current media for information exchange -
CD-ROMs, ZIP-Disks, network.

I think e.g. of the floating point format as specified by IEEE; on
the other hand, I think that machine-specific formats as the 'little-
endian' binaries as used internally by Intel-CPUs should have no place
in files for information interchange.
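
A short Python illustration of the byte-order point, assuming a nine-digit signed value stored as a 4-byte two's-complement integer (one common reading of PIC S9(9) COMP, but an assumption here, since the meaning of COMP depends on the compiler and its options).

```python
import struct

value = 123456789                      # fits PIC S9(9)

big = struct.pack(">i", value)         # big-endian, as on IBM mainframes
little = struct.pack("<i", value)      # little-endian, as on Intel CPUs

print(big.hex())                       # 075bcd15
print(little.hex())                    # 15cd5b07

# Reading with the wrong byte order yields a different number, not an error.
(misread,) = struct.unpack("<i", big)
print(misread == value)                # False
```

The silent misread is the practical danger: nothing in the four bytes says which order they are in.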


Yours,
Lüko Willms http://www.mlwerke.de
/--------- L.WILLMS@jpberlin.de -- All rights reserved --

"It is not the generals and kings who make history,
but the broad masses of the people" - Nelson Mandela
JerryMouse

2004-07-30, 8:55 am

Carol wrote:
> exactly
>
> "Tukla Ratte" <tukla_ratte@tukla.net> wrote in message
> news:2mt0eiFqtgfeU1@uni-berlin.de...

Oh, I don't know.

Try floating a dollar-sign (to the immediate left of the amount,
irrespective of the magnitude of the amount) in C or VB. Almost all
paychecks are generated by a COBOL program.

Likewise, computing the hyperbolic arc tangent in COBOL is not trivial.

It's a shame you're bogged down in data conversion - if you were, say,
building a program to prorate oil and gas revenues to 50,000 lease owners
you'd get a sense of COBOL. Imagine a program to create and print 50,000,000
Social Security checks. This is like ten lines of code in COBOL (I
exaggerate).
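
For comparison, a Python sketch of the floating dollar sign mentioned above, roughly what an edited picture such as $$$,$$$,$$9.99 gives in COBOL. The function name and the default field width are arbitrary choices for this example.

```python
from decimal import Decimal

def float_dollar(amount: Decimal, width: int = 14) -> str:
    """Right-justify the amount with '$' immediately left of the first digit."""
    return ("$" + format(amount, ",.2f")).rjust(width)

for amount in (Decimal("5"), Decimal("1234.5"), Decimal("1234567.89")):
    print("[" + float_dollar(amount) + "]")
```

`Decimal` is used rather than a binary float so that amounts such as 0.10 are held exactly, which is closer to how COBOL numeric items behave.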


docdwarf@panix.com

2004-07-30, 3:55 pm

In article <tNudnRF0EL6vo5fcRVn-gw@giganews.com>,
JerryMouse <nospam@bisusa.com> wrote:

[snip]

>Imagine a program to create and print 50,000,000
>Social Security checks. This is like ten lines of code in COBOL (I
>exaggerate).


Quite right... let's see, assuming a Well-Designed System, where the
check-printing program does nothing but read the input of the
create-check-print-file program and write seven lines of output to the
check form... and that the program was originally written in 1965 but
re-written in 1973 with some of those newfangled 'structured
techniques'... you might get something closer to 20 lines:

PROCEDURE DIVISION.
    OPEN INPUT SITE-INFO-FILE CHECK-DATA-FILE.
    READ SITE-INFO-FILE INTO WS-SITE-DATA-REC.
    MOVE CORR WS-SITE-DATA TO WS-OUTPUT-SITE-DATA.
    PERFORM 219750-PROCESS-CHECK-DATA THRU 219750-EX UNTIL EOF.
    CLOSE SITE-INFO-FILE CHECK-DATA-FILE.
    STOP RUN.
219750-PROCESS-CHECK-DATA.
    READ CHECK-DATA-FILE INTO WS-INFILE-REC AT END GO TO 219750-EX.
    MOVE CORR WS-INFILE TO WS-OUT-DATA.
    WRITE PRTREC FROM WS-OUT-DATA-1 AFTER POSITIONING CC-7.
    WRITE PRTREC FROM WS-OUT-DATA-2 AFTER POSITIONING CC-2.
    WRITE PRTREC FROM WS-OUT-DATA-3 AFTER POSITIONING CC-2.
    WRITE PRTREC FROM WS-OUT-DATA-4 AFTER POSITIONING CC-4.
    WRITE PRTREC FROM WS-OUT-DATA-5 AFTER POSITIONING CC-1.
    WRITE PRTREC FROM WS-OUT-DATA-6 AFTER POSITIONING CC-1.
    WRITE PRTREC FROM WS-OUT-DATA-7 AFTER POSITIONING CC-1.
219750-EX.
    EXIT.

(yes, I know that I open myself up to castigations by posting code that
does not make use of later techniques such as inline Performs, mixed-case
text and Set (condition) to True... and that AFTER POSITIONING was dropped
from the Standard almost two decades back... but, like it or not, this
kind of code is often found in Large Installations and but for Y2K
remediation would, quite possibly, still be running.)

DD

Robert Wagner

2004-07-30, 8:55 pm

docdwarf@panix.com wrote:

>In article <tNudnRF0EL6vo5fcRVn-gw@giganews.com>,
>JerryMouse <nospam@bisusa.com> wrote:
>
>[snip]
>

Imagine a program to Direct Deposit 50M Social Security disbursements into
retirees' bank accounts. How do you do that in Cobol without calling services
written in other languages? It can be done 100% in Cobol but, in reality, it is
not.
>Quite right... let's see, assuming a Well-Designed System


Well-Designed = six digit paragraph numbers.

>(yes, I know that I open myself up to castigations by posting code that
>does not make use of later techniques such as inline Performs, mixed-case
>text and Set (condition) to True... and that AFTER POSITIONING was dropped
>from the Standard almost two decades back... but, like it or not, this
>kind of code is often found in Large Installations and but for Y2K
>remediation would, quite possibly, still be running.)


Because I work in Large Installations, I can say from Experience they do NOT use
or emulate '70s technology. They send the check to a 'printer thing' written in
a non-Cobol language causing a graphic check to pop out of the printer. That's
how we did it at Sears to print 400K paychecks per week.

For traditional printed reports, they send it to a COM thing that produces a
Word document, distributed via email, or a Cognos thing that produces XML
appearing on a Web page. Worst case, they send it to a line sequential file that
gets translated to PostScript and sent to a network printer.
Robert Wagner

2004-07-30, 8:55 pm

l.willms@jpberlin.de (Lueko Willms) wrote:

>. On 29.07.04
>charles.stevens@unisys.com (Chuck Stevens) wrote
> in /COMP/LANG/COBOL
> in cec34l$2h5t$1@si05.rsvl.unisys.com
> about Re: Layout Hell-o
>
>CS> And, back to the topic of a standard-notation representation of
>CS> records, any implementation-independent description of information in
>CS> these formats had better be prepared to understand all of them, and
>CS> why they all represent the same value, to say nothing of what that
>CS> value is ...
>
> I don't think there is any sense in implementation-independent
>descriptions of very implementation-specific representations of data.


In the last 30 years, I haven't described a file with COMP-? types. COMP is for
internal efficiency; files should be 100% DISPLAY.

> What is necessary, is to have generic formats and their description
>for information INTERCHANGE as the ANSI standards for interchange
>magnetic TAPES, extended to current media for information exchange -
>CD-ROMs, ZIP-Disks, network.


Forget INTERCHANGE; think XML. Even for 'internal' files. You never know when
management will offer to exchange the files with outsiders, as happened to
Carol.

Companies in Financial Services Data Warehousing -- Reuters, Bloomberg, Thomson
and Morningstar -- are rapidly moving to making XML a standard for internal use
as well as interchange. I worked at one of them last year. We exchanged
multi-megabytes daily, and none of it came via tape.
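
As an illustration of the all-text approach being advocated here, a minimal Python sketch that writes one record as XML and reads it back. The element names and the sample record are made up; a real interchange format would be governed by an agreed schema.

```python
import xml.etree.ElementTree as ET

def record_to_xml(record: dict) -> str:
    """Serialize a flat record as <record><field>value</field>...</record>."""
    root = ET.Element("record")
    for name, value in record.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_record(xml_text: str) -> dict:
    """Read the fields back; every value comes back as text."""
    return {child.tag: child.text for child in ET.fromstring(xml_text)}

xml_text = record_to_xml({"account": 4711, "amount": "-12.50"})
print(xml_text)
print(xml_to_record(xml_text))
```

Every value is character data, so no binary, packed or byte-order question arises; the cost is size and a conversion step before any arithmetic.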

> I think e.g. of the floating point format as specified by IEEE; on
>the other hand, I think that machine-specific formats as the 'little-
>endian' binaries as used internally by Intel-CPUs should have no place
>in files for information interchange.


They have no place in ANY files.
docdwarf@panix.com

2004-07-30, 8:55 pm

In article <410ac8ab.21550590@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>
>Imagine a program to Direct Deposit 50M Social Security disbursements into
>retirees' bank accounts.


Perhaps one might do just that in another thread, Mr Wagner.

[snip]

>
>Well-Designed = six digit paragraph numbers.


Well-designed = not interrupting in mid-sentence in order to remove any
trace of context without indicating that such editing has occurred.

>
>
>Because I work in Large Installations, I can say from Experience they do NOT use
>or emulate '70s technology.


*Almost* right, Mr Wagner... it is nice to see you attempting but you're
still *not* quite there. Consider this response to your assertion about
your experiences:

Because I work in Large Installations I can say from Experience that they
DO things like I posted all the time... what does this say about our
experiences, Mr Wagner?

>They send the check to a 'printer thing' written in
>a non-Cobol language causing a graphic check to pop out of the printer. That's
>how we did it at Sears to print 400K paychecks per week.


Mr Wagner, it just might be possible that That Which Was Done at Sears was
not That Which Was Done at Other Places... but *do* keep trying, really!
LX-i

2004-07-31, 3:55 am

JerryMouse wrote:

> Carol wrote:
>
>
>
> Oh, I don't know.
>
> Try floating a dollar-sign (to the immediate left of the amount,
> irrespective of the magnitude of the amount) in C or VB. Almost all
> paychecks are generated by a COBOL program.


VB6

Dim myPay as Double
myPay = 5643.234325
MsgBox "Your pay is " & FormatCurrency(myPay, , , True, True) & "!"

(I don't have a VB6 compiler on this machine, so the above may contain
syntax errors - I can't remember the exact options that go on that
function, but I think the first "true" tells it to put negative numbers
in parentheses, and the second "true" tells it to group the digits. The
two parameters I left out are digits after the decimal point, and
leading zeros which, if I remember correctly, use the "locale" default
if unspecified.)


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~
~ / \ / ~ Live from Montgomery, AL! ~
~ / \/ o ~ ~
~ / /\ - | ~ LXi0007@Netscape.net ~
~ _____ / \ | ~ http://www.knology.net/~mopsmom/daniel ~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~ I do not read e-mail at the above address ~
~ Please see website if you wish to contact me privately ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~

Robert Wagner

2004-07-31, 3:55 am

docdwarf@panix.com wrote:

>In article <410ac8ab.21550590@news.optonline.net>,
>Robert Wagner <robert.deletethis@wagner.net> wrote:


use
>
>*Almost* right, Mr Wagner... it is nice to see you attempting but you're
>still *not* quite there. Consider this response to your assertion about
>your experiences:
>
>Because I work in Large Installations I can say from Experience that they
>DO things like I posted all the time... what does this say about our
>experiences, Mr Wagner?


It says our experiences are disparate. To sort it out, we'd have to compare your
Legacy clients with mine.

Mine include US Fed, a State (NY), Sears, a Financial Services Giant, a brokerage
giant (Merrill Lynch) and a pharmaceutical giant.

What do you have?

Lueko Willms

2004-07-31, 8:55 am

.. On 30.07.04
robert.deletethis@wagner.net (Robert Wagner) wrote
in /COMP/LANG/COBOL
in 410ad0b4.23607858@news.optonline.net
about Re: Layout Hell-o

RW>
RW> In the last 30 years, I haven't described a file with COMP-? types.
RW> COMP is for internal efficiency; files should be 100% DISPLAY.

 That depends ... on whether the element stored is mostly just presented
to a human viewer, or is used in computations.

And then, for space efficiency, one would compress the file, and
uncompress it again for usage -- that may be more computational effort
than converting a binary number for display. Also think of binary
storage as a kind of data compression ...
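
To put numbers on the "binary storage as compression" remark, a Python sketch comparing three ways of storing the same nine-digit value. It makes simplifying assumptions: DISPLAY is shown as one ASCII byte per digit with no sign handling (a mainframe file would use EBCDIC and an overpunched sign), packed decimal follows the usual COMP-3 convention of a trailing sign nibble (C positive, D negative), and binary is a 4-byte big-endian integer.

```python
import struct

def to_display(value: int, digits: int) -> bytes:
    """Unsigned DISPLAY-style digits as ASCII text, one byte per digit."""
    return str(value).rjust(digits, "0").encode("ascii")

def to_packed(value: int, digits: int) -> bytes:
    """Packed decimal (COMP-3 style): two digits per byte, sign in last nibble."""
    sign = 0xD if value < 0 else 0xC
    text = str(abs(value)).rjust(digits, "0")
    if len(text) % 2 == 0:          # digits plus sign nibble must fill whole bytes
        text = "0" + text
    nibbles = [int(ch) for ch in text] + [sign]
    return bytes(hi << 4 | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

value = 123456789
display = to_display(value, 9)
packed = to_packed(value, 9)
binary = struct.pack(">i", value)

print(len(display), display)        # 9 bytes
print(len(packed), packed.hex())    # 5 bytes
print(len(binary), binary.hex())    # 4 bytes
```

The binary form is less than half the size of the DISPLAY form before any general-purpose compression is applied, which is the trade-off being weighed against readability.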

 What do the database tables being used look like?

RW> XML a standard for internal as well as interchange.

I had XML in mind ...

RW> They have no place in ANY files.

Well, look at the various UTF-16 representations of Unicode -- I
still remember how Internet-oriented people were opposed to the
established multinational character sets like Teletex (ISO 6937 or
CCITT T.61) or the code switching techniques of ISO 2022, which are
widely being used in Japan.

And now look at what comes out of Unicode: three variants of UTF-
16: UTF-16LE with little-endian representation of the character
numbers, UTF-16BE with big-endian, and a third format, where the
'endian-ness' is being described in the first 16 bits of the file.

You will have that in XML, too.
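
The three UTF-16 variants can be seen directly with Python's standard codecs; the sample text is arbitrary.

```python
import codecs

text = "COBOL"

le = text.encode("utf-16-le")              # low-order byte first
be = text.encode("utf-16-be")              # high-order byte first
with_bom = codecs.BOM_UTF16_BE + be        # byte order announced in the first 16 bits

print(le.hex())
print(be.hex())
print(with_bom.hex())

# The plain "utf-16" decoder reads the byte-order mark to pick the byte order.
print(with_bom.decode("utf-16") == text)   # True
# Decoding with the wrong explicit order gives different characters, silently.
print(be.decode("utf-16-le") == text)      # False
```

So the endianness question does not disappear with a text format; it moves into the character encoding, which an XML document has to declare or signal with a byte-order mark.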


Yours,
Lüko Willms http://www.mlwerke.de
/--------- L.WILLMS@jpberlin.de -- All rights reserved --

"The interests of the nation cannot be formulated otherwise than from
the standpoint of the ruling class or of the class that aspires to
rule." - Leon Trotsky (27 January 1932)
docdwarf@panix.com

2004-07-31, 8:55 am

In article <410af886.33803113@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>
>use
>
>It says our experiences are disparate.


*Exactly*, Mr Wagner... our experiences are different. Your word on what
Large Installations do is suspect... but your word on what *you* have seen
in the departments of the Large Installations you have worked in might be
trusted.

See the difference?

>To sort it out, we'd have to compare your
>Legacy clients with mine.


It *is* already sorted out, Mr Wagner... it falls into the realm of 'our
experiences are different'. Watch what happens, now, when a foray is made
back into the schoolyard-beloved Land of Whose is Bigger.

>
>Mine include US Fed, a State (NY), Sears, a Financial Services Giant, a brokerage
>giant (Merrill Lynch) and a pharmaceutical giant.
>
>What do you have?


What a coincidence, Mr Wagner... mine, too, include the Fed, a locality
(DC), several Financial Services Giants (insurance and banking, national
and international), a brokerage giant, accounting (Big 8/5/3), health
services, communications, food services and a few others.

How about that, Mr Wagner... it seems as though our experiences are nigh
equally broad and yet we have come to different conclusions; we're right
back at the same place we were before you whipped out your brag-sheet and
I exposed mine. How can that be possible?

If the world is a wide, wonderful, mysterious place where different things
can happen at different times then... our experiences are different and we
might have something to learn from each other.

If the world is a one-size-MUST-fit-all,
my-experience-measures-everything, I-have-seen-this-so-it-MUST-be-true
place then... oh, I *cannot* resist, then *you* are obviously mistaken and
have much to learn from me.

Since you are not yet enlightened then it might be wise to start with Wu
Li's basics of chopping wood and carrying water.

DD
Richard

2004-07-31, 8:55 pm

robert.deletethis@wagner.net (Robert Wagner) wrote


>
> They have no place in ANY files.


How strange then that database (and after all isam is just a Cobol
database) designers and implementors disagree with you.

The point being that one can't just 'send a database' and nor can one
(generally) just 'send an isam'.

I know that you were in the habit of sending C-ISAM datafiles, but
that was before you knew that you were sending crap along with the
data. Any reasonable data interchange will be done with clean data
which means extracting it from the actual data file into a suitable
format.
Robert Wagner

2004-07-31, 8:55 pm

l.willms@jpberlin.de (Lueko Willms) wrote:

>RW>
>RW> In the last 30 years, I haven't described a file with COMP-? types.
>RW> COMP is for internal efficiency; files should be 100% DISPLAY.
>
> That depends ... if the element stored is mostly just presented to
>a human viewer, or if it is used in computations.


If it is used in computations, move it to working-storage.

> And then, for space efficiency, one would compress the file, and
>uncompress it again for usage -- that may be more computational effort
>than converting a binary number for display. Also think of binary
>storage as a kind of data compression ...


Ironically, people who claim to conserve disk space are the same ones who pad
the record with 100 bytes 'for future expansion' .. and put arrays in records,
thereby guaranteeing every record is worst-case. Why? Because someone told them
variable-length records are Bad (based on 30 year-old experience) and someone
else told them arrays are better than lists (because he once had troubles with
pointers, and doesn't know about normalized database design).

Application programmers are amateurs out of their depth when it comes to
managing data files. The job is better done by professionals who write
databases.

Some file systems do compress and expand individual records as they are
accessed. It is almost free because they do it in parallel threads. The main
process doesn't have to wait for an output record to be compressed before
proceeding.

> What do the database tables look like which are being used?


Do you mean how do they look on disk? Most databases store numbers as
large-precision floating-point .. large enough to avoid rounding errors.

> And now look at what comes out of Unicode: three variants of UTF-
>16: UTF-16LE with little-endian representation of the character
>numbers, UTF-16BE with big-endian, and a third format, where the
>'endian-ness' is being described in the first 16 bits of the file.


Those are religious wars. Standards are supposed to eliminate, not legitimize
them.
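
For readers who have not met the three UTF-16 variants mentioned above, here is a minimal sketch. It relies only on Python's standard codecs; the sample character is arbitrary.

```python
import codecs

text = "A"

# Explicit byte order, no marker in the data.
little_endian = text.encode("utf-16-le")   # low byte first
big_endian = text.encode("utf-16-be")      # high byte first

# Third variant: a byte order mark (BOM) in the first 16 bits tells the
# reader which order follows.
marked_le = codecs.BOM_UTF16_LE + little_endian
marked_be = codecs.BOM_UTF16_BE + big_endian

# The generic "utf-16" decoder reads the BOM and picks the order itself.
decoded_le = marked_le.decode("utf-16")
decoded_be = marked_be.decode("utf-16")
```

Both marked byte strings decode to the same text, which is the point of the third format: the data carries its own endian-ness.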

> You will have that in XML, too.


Nope. XML does not allow raw binary; it must be encoded MIME/Base64.
Robert Wagner

2004-07-31, 8:55 pm

riplin@Azonic.co.nz (Richard) wrote:

>robert.deletethis@wagner.net (Robert Wagner) wrote
>
>
>
>How strange then that database (and after all isam is just a Cobol
>database) designers and implementors disagree with you.
>
>The point being that one can't just 'send a database' and nor can one
>(generally) just 'send an isam'.
>
>I know that you were in the habit of sending C-ISAM datafiles, but
>that was before you knew that you were sending crap along with the
>data. Any reasonable data interchange will be done with clean data
>which means extracting it from the actual data file into a suitable
>format.


The crap is deleted records. If the C-ISAM file was just reorganized, it is an
ascii text file (assuming no comp/packed). The weakness of this approach is the
requirement for an external definition such as a copybook. It is difficult for
ISAM to produce a self-defining file because the file system knows only about
keys, and nothing about dependencies. However my demo could have done it,
because it had turned the copybook into a schema.

There are many tools that export tables or entire databases to XML, including
schema information and constraints. This produces self-defining files that can
be imported without manual intervention. People who exchange a lot of data have
a method 'Just Send Table' that combines Export with FTP. Granted, they are
using a suitable format, as you said, but the process is automated.

Alternatively, one can write a do-it-yourself that reads the data dictionary and
constructs dynamic SQL to extract and format the data. That gives one more
control. For instance, different formats depending on recipient .. some want
csv, others want XML. I just wrote such a tool in Cobol.
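
A rough sketch of that do-it-yourself idea, in Python rather than Cobol and against SQLite rather than any particular production database. The table name, column names and the export() helper are all made up for illustration; this is not the tool described above.

```python
import csv
import io
import sqlite3
from xml.sax.saxutils import escape

def column_names(conn, table):
    # PRAGMA table_info is SQLite's way of asking the data dictionary
    # for a table's columns; field 1 of each row is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def export(conn, table, fmt):
    cols = column_names(conn, table)
    # Build the SELECT dynamically from the dictionary, not by hand.
    sql = "SELECT {} FROM {}".format(", ".join(cols), table)
    rows = conn.execute(sql).fetchall()
    if fmt == "csv":
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(cols)
        writer.writerows(rows)
        return out.getvalue()
    if fmt == "xml":
        lines = [f"<{table}>"]
        for row in rows:
            fields = "".join(
                f"<{c}>{escape(str(v))}</{c}>" for c, v in zip(cols, row)
            )
            lines.append(f"  <row>{fields}</row>")
        lines.append(f"</{table}>")
        return "\n".join(lines) + "\n"
    raise ValueError(f"unknown format: {fmt}")

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (part_number TEXT, upc TEXT, qty INTEGER)")
conn.execute("INSERT INTO product VALUES ('A-100', '012345678905', 7)")
```

Calling export(conn, "product", "csv") or export(conn, "product", "xml") gives the two recipient formats from the same dictionary lookup.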



Robert Wagner

2004-07-31, 8:55 pm

docdwarf@panix.com wrote:


>How about that, Mr Wagner... it seems as though our experiences are nigh
>equally broad and yet we have come to different conclusions; we're right
>back at the same place we were before you whipped out your brag-sheet and
>I exposed mine. How can that be possible?
>
>If the world is a wide, wonderful, mysterious place where different things
>can happen at different times then... our experiences are different and we
>might have something to learn from each other.
>
>If the world is a one-size-MUST-fit-all,
>my-experience-measures-everything, I-have-seen-this-so-it-MUST-be-true
>place then... oh, I *cannot* resist, then *you* are obviously mistaken and
>have much to learn from me.
>
>Since you are not yet enlightened then it might be wise to start with Wu
>Li's basics of chopping wood and carrying water.


You are saying there is no single Reality .. that every observer sees a
different reality filtered through his prejudices.

That's not an insight, that's a rejection of reason. Reality is the only thing
that keeps us honest .. and sane. Without the grounding of Reality, ANYthing can
be 'true' and nothing can be 'proven'. Flim-flam men and politicians embrace the
idea; scientists abhor it.

I believe there is one Reality. It may be difficult to discern through a glass
darkly. We may trip over uncertainty in Quantum Theory and Chaos but, at least
in the macro world, there is a single reality .. also known as Facts.

Wu Li et al. touched something in human DNA developed over a million years of
caveman experience .. most of which no longer applies. That doesn't make it
Universal Wisdom; that makes it 'comfort food'.
docdwarf@panix.com

2004-08-01, 3:55 am

In article <410c2275.110085913@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>
>
>You are saying there is no single Reality .. that every observer sees a
>different reality filtered through his prejudices.


No, Mr Wagner, I am saying nothing about 'reality' or 'Reality' or
'realities' or any other word with a 'real-' root; if you read my words
carefully you will see that I spoke *only* of our experiences and that our
experiences are, by our stating of them, different.

>
>That's not an insight, that's a rejection of reason.


It is also something I didn't say.

[snip]

>I believe there is one Reality.


It seems that all have their beliefs and what they think of as Good
Reasons for them, as well; I believe that you are avoiding what seems to
be an inescapable conclusion about varieties of experience with a sidestep
into the metaphysical occurrences of Reality.

>It may be difficult to discern through a glass
>darkly. We may trip over uncertainty in Quantum Theory and Chaos but, at least
>in the macro world, there is a single reality .. also known as Facts.


The facts are, Mr Wagner, that I pointed out how our experiences were
different... and you seemed to think this was a statement about something
you believe in. It might help if you were able to remove the apparent 'I,
Robert Wagner' and look through the glass of 'it has been seen'.

One person (you) has seen one thing, another person (I) has seen others.
Two people of apparently equally broad experience have seen different
things. What might be concluded from this?

>
>Wu Li et al. touched something in human DNA developed over a million years of
>caveman experience .. most of which no longer applies. That doesn't make it
>Universal Wisdom; that makes it 'comfort food'.


It was not claimed to be either by me, Mr Wagner... but how interesting
that you see the need to dismiss it as such.

DD

Richard

2004-08-01, 8:55 am

robert.deletethis@wagner.net (Robert Wagner) wrote

>
> The crap is deleted records. If the C-ISAM file was just reorganized, it is an
> ascii text file (assuming no comp/packed).


No, there still may be crap at the end of the file, yet within the
file size, where space has been allocated for records which remain
unused.

There is also the issue of negative numbers, unless you specify sign
separate.
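
To illustrate the sign problem: without SIGN SEPARATE, a signed DISPLAY field usually carries its sign "overpunched" on the last digit, so the last byte is not a plain digit. The sketch below assumes the common convention seen after EBCDIC zoned decimal has been translated to ASCII ('{' and A-I for positive, '}' and J-R for negative). Other compilers use other encodings, so treat the table as an assumption and check it against the real data.

```python
# Assumed overpunch table: position in the string is the digit value.
POSITIVE = "{ABCDEFGHI"
NEGATIVE = "}JKLMNOPQR"

def decode_zoned(field: str) -> int:
    """Decode a signed DISPLAY field whose last character holds the sign."""
    body, last = field[:-1], field[-1]
    if last in POSITIVE:
        return int(body + str(POSITIVE.index(last)))
    if last in NEGATIVE:
        return -int(body + str(NEGATIVE.index(last)))
    # Plain digits: unsigned, or sign stored elsewhere.
    return int(field)
```

For example, decode_zoned("0012L") yields -123, while a naive reader would see four digits followed by a stray letter.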

> [...]
> .. some want
> csv, others want XML. I just wrote such a tool in Cobol.


Exactly. You claimed that COMP and other binary formats should not be
used in files because it made them unsuitable for interchange, but
application files (ISAM and databases) are completely unsuitable to
use for interchange. When an extract is used it matters not whether
binary data items are used in the application data.
Richard

2004-08-01, 8:55 am

robert.deletethis@wagner.net (Robert Wagner) wrote

> Ironically, people who claim to conserve disk space are the same ones who pad
> the record with 100 bytes 'for future expansion'


Which specific people did both of these at the same time ?

In fact when compression is used, 'padding' takes no disk space at all
(it just makes a bigger number in the RLL code).
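
A small sketch of the padding point. zlib (deflate) stands in here for whatever scheme a given file system really uses, which may well be simple run-length coding, and the record contents are invented.

```python
import zlib

# Hypothetical 40-byte record, then the same record with 100 bytes of
# blank filler 'for future expansion'.
record = b"000123WIDGET, LEFT-HANDED     0000019950"
padded = record + b" " * 100

plain_size = len(zlib.compress(record))
padded_size = len(zlib.compress(padded))

# Cost of the filler after compression, in bytes.
padding_cost = padded_size - plain_size
```

The 100 filler bytes should cost only a handful of bytes once compressed, because a run of identical bytes collapses to a single length code.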

> and someone
> else told them arrays are better than lists (because he once had troubles with
> pointers, and doesn't know about normalized database design).


Someone attempted to tell us here that lists were 'better' than
pointers, but, in fact, it turned out they were slower and less
flexible.

> Application programmers are amateurs out of their depth when it comes to
> managing data files. The job is better done by professionals who write
> databases.


Is that _all_ application programmers ? or just the ones that you have
met ?

> Some file systems do compress and expand individual records as they are
> accessed. It is almost free because they do it in parallel threads. The main
> process doesn't have to wait for an output record to be compressed before
> proceeding.


It may, or may not, use 'separate threads' depending on the actual
system. The _actual_ advantage of compression (apart from space
saving) is that fewer disk transfers are required, this being usually
a much larger saving than the cost of compressing and decompressing.

> Do you mean how do they look on disk? Most databases store numbers as
> large-precision floating-point .. large enough to avoid rounding errors.


Databases also store fixed point numerics as scaled integers.
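
A minimal sketch of the scaled-integer idea, assuming two implied decimal places (much like PIC S9(7)V99). The helper names are illustrative only and do not describe any particular database's storage format.

```python
from decimal import Decimal

SCALE = 2  # number of implied decimal places (an assumption for this sketch)

def to_scaled(value: str, scale: int = SCALE) -> int:
    """Store a fixed-point value as an integer count of 10**-scale units."""
    return int(Decimal(value) * 10 ** scale)

def from_scaled(stored: int, scale: int = SCALE) -> Decimal:
    """Recover the fixed-point value by shifting the decimal point back."""
    return Decimal(stored).scaleb(-scale)
```

Arithmetic on the stored integers is exact: to_scaled("0.10") + to_scaled("0.20") equals to_scaled("0.30"), which is not true of the same sum in binary floating point.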

> Nope. XML does not allow raw binary; it must be encoded MIME/Base64.


It allows binary represented as 6bit characters scaled to the base64
set.
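
As a sketch of what that looks like in practice: a 4-byte binary integer packed, Base64-encoded and carried as element text. The element name "qty" and the "encoding" attribute are made up for illustration; real schemas typically declare the type as xs:base64Binary.

```python
import base64
import struct
import xml.etree.ElementTree as ET

def to_xml(value: int) -> str:
    raw = struct.pack(">i", value)              # 4-byte big-endian integer
    elem = ET.Element("qty", encoding="base64")  # hypothetical element/attribute
    elem.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(elem, encoding="unicode")

def from_xml(doc: str) -> int:
    elem = ET.fromstring(doc)
    raw = base64.b64decode(elem.text)
    return struct.unpack(">i", raw)[0]
```

The binary value survives the round trip, but only because both sides agree out of band on the width and byte order of the packed integer.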
Michael Mattias

2004-08-01, 8:55 am

"Robert Wagner" <robert.deletethis@wagner.net> wrote in message
news:410ad0b4.23607858@news.optonline.net...
>
> They have no place in ANY files.


How about in systems where other software is in use; software which cannot
'natively' read/write COBOL-format numerics? e.g., Intel-based systems
using programs written in "C" or BASIC or FORTRAN to which COBOL USAGE
BINARY or PACKED-DECIMAL data are alien?
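
For what it is worth, reading those formats from another language takes only a few lines once the layout is known. The sketch below assumes a 4-byte big-endian two's-complement BINARY item and the usual packed-decimal layout (two digits per byte, sign in the last nibble, B or D meaning negative). Byte order and field width vary by platform and compiler options, so both are assumptions.

```python
import struct

def read_binary(raw: bytes) -> int:
    """Decode a 4-byte big-endian signed integer (assumed layout)."""
    (value,) = struct.unpack(">i", raw)
    return value

def read_packed(raw: bytes) -> int:
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()                      # last nibble is the sign
    number = int("".join(str(d) for d in nibbles))
    return -number if sign in (0x0B, 0x0D) else number
```

For example, the three bytes 12 34 5D decode to -12345. Implied decimal places (the V in the picture) are not stored and still have to come from the copybook.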

Sure, you could define all numeric data with USAGE DISPLAY, which can be
used by any program in the system, but that could create a terrible waste of
storage space, and require all software (including the COBOL-based software)
to effect a character-string-to-true-numeric conversion whenever arithmetic
is required.

Better I would think that such systems use centrally controlled copy
libraries, one for each such shared file for each language product in use.
(Not that centrally-controlled copy libraries shouldn't be in use anyway for
files accessed by many programs within a system).


MCM



Robert Wagner

2004-08-01, 3:55 pm

docdwarf@panix.com wrote:

>In article <410c2275.110085913@news.optonline.net>,
>Robert Wagner <robert.deletethis@wagner.net> wrote:
[on the topic of emulating '70s printer technology]
>
>No, Mr Wagner, I am saying nothing about 'reality' or 'Reality' or
>'realities' or any other word with a 'real-' root; if you read my words
>carefully you will see that I spoke *only* of our experiences and that our
>experiences are, by our stating of them, different.


Second attempt. A possible explanation is that you work in mainframe shops where
the past is preserved. I work primarily in Unix shops having little history.

>One person (you) has seen one thing, another person (I) has seen others.
>Two people of apparently equally broad experience have seen different
>things. What might be concluded from this?


There is a bias caused by job selection and/or worker selection.

Robert Wagner

2004-08-01, 3:55 pm

riplin@Azonic.co.nz (Richard) wrote:

>You claimed that COMP and other binary formats should not be
>used in files because it made them unsuitable for interchange, but
>application files (ISAM and databases) are completely unsuitable to
>use for interchange. When an extract is used it matters not whether
>binary data items are used in the application data.


You're right, but raw data files ARE exchanged. Carol received 21 of them.

Databases put a stop to that.

Boss: Send the product file to X.
RW: Ok, I'll work up an extract. What fields do they want?
Boss: We don't have time for that. Just send 'em the file.
RW: They won't know how to read it.
Boss: Send a copybook. Let them figure it out.
Alistair Maclean

2004-08-01, 3:55 pm

In message <410c13bd.106316921@news.optonline.net>, Robert Wagner
<robert.deletethis@wagner.net> writes
>l.willms@jpberlin.de (Lueko Willms) wrote:
>
>Ironically, people who claim to conserve disk space are the same ones who pad
>the record with 100 bytes 'for future expansion' .. and put arrays in records,
>thereby guaranteeing every record is worst-case. Why? Because someone told them
>variable-length records are Bad (based on 30 year-old experience) and someone
>else told them arrays are better than lists (because he once had troubles with
>pointers, and doesn't know about normalized database design).
>
>Application programmers are amateurs out of their depth when it comes to
>managing data files. The job is better done by professionals who write
>databases.


Are these the same professionals who put simple standalone files into
databases because that is all that they have ever done and they haven't
heard of VSAM?


Damn! And now I've gotten myself involved in a pointless argument.

--
Alistair Maclean


Notice at an Australian wildlife park: "These animals are dangerous. Do not
leave your vehicle. Entrance $5. Poms on bicycles - free".

docdwarf@panix.com

2004-08-01, 3:55 pm

In article <410ced29.41421787@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:
>docdwarf@panix.com wrote:
>
>[on the topic of emulating '70s printer technology]


Is that how you see it, Mr Wagner? How curious... what you quote below is
more about how the experiences of people can be different, technology of
any sort receives but passing mention.

>
>
>Second attempt. A possible explanation is that you work in mainframe shops where
>the past is preserved. I work primarily in Unix shops having little history.


That might be a bit closer, Mr Wagner... I work in places where things
which have been demonstrated as 'working in a bulletproof manner' for
decades are allowed to continue to work, you have worked in... other
places.

>
>
>There is a bias caused by job selection and/or worker selection.


Exactly! One kind of shop is biased towards workers and procedures which
have demonstrated the ability to function under adverse conditions for
decades and produce the kinds of results which satisfy thousands of people
on a regular basis, the other kind of shop does... other stuff with other
folks.

Of course if such conclusions dissatisfy you then you might wish to look
for others.

DD

Robert Wagner

2004-08-01, 3:55 pm

riplin@Azonic.co.nz (Richard) wrote:

> The _actual_ advantage of compression (apart from space
>saving) is that fewer disk transfers are required, this being usually
>a much larger saving than the cost of compressing and decompressing.


The spread between computer and disk speed will continue to increase. Computers
double in speed every 18 months. Disks advance more slowly, and may already have
hit the physical limit set by strength of materials.

Processors now use trace width of .1 microns = 100 angstroms. The smallest
possible conductor measures 4 angstroms. That means we have four 18-month
generations, six years, to go before hitting the wall. What then?

>
>It allows binary represented as 6bit characters scaled to the base64
>set.


That's intended for non-text objects such as jpgs, not for numbers represented
in binary.

Robert Wagner

2004-08-01, 3:55 pm

"Michael Mattias" <michael.mattias@gte.net> wrote:

>"Robert Wagner" <robert.deletethis@wagner.net> wrote in message
>news:410ad0b4.23607858@news.optonline.net...
>
>How about in systems where other software is in use; software which cannot
>'natively' read/write COBOL-format numerics? e.g., Intel-based systems
>using programs written in "C" or BASIC or FORTRAN to which COBOL USAGE
>BINARY or PACKED-DECIMAL data are alien?


All three languages can natively handle type 'int', which is Cobol's BINARY.

BINARY should be avoided because it gums up data transmission and EBCDIC-ASCII
conversion.
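
A sketch of how that goes wrong. A blanket EBCDIC-to-ASCII translation treats every byte as a character, which is right for DISPLAY fields and wrong for binary ones. cp037 is used here as one common EBCDIC code page; the actual code page in any given shop is an assumption.

```python
import struct

def ebcdic_to_ascii(raw: bytes) -> bytes:
    """Translate every byte as if it were an EBCDIC (cp037) character."""
    return raw.decode("cp037").encode("latin-1")

# The same number stored two ways on the EBCDIC side.
display_field = "61938".encode("cp037")      # one character per digit
binary_field = struct.pack(">i", 61938)      # bytes 00 00 F1 F2

# After a blanket translation of the whole record:
display_after = ebcdic_to_ascii(display_field).decode("ascii")
binary_after = struct.unpack(">i", ebcdic_to_ascii(binary_field))[0]
```

The DISPLAY field arrives as "61938". The binary field arrives as 12594, because its bytes F1 F2 happened to look like the EBCDIC characters '1' and '2' and were translated to 31 32. A transfer has to know the layout and skip the binary fields, or avoid them.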

>Sure, you could define all numeric data with USAGE DISPLAY, which can be
>used by any program in the system, but that could create a terrible waste of
>storage space, and require all software (including the COBOL-based software)
>to effect a character-string-to-true-numeric conversion whenever arithmetic
>is required.


Disk space and bandwidth are cheap. Computers are fast, and no longer live in a
glass room 'island'. Being able to communicate with other computers is more
important than saving a few milliseconds. You are applying yesteryear's
solutions to today's problems.

>Better I would think that such systems use centrally controlled copy
>libraries, one for each such shared file for each language product in use.
>(Not that centrally-controlled copy libraries shouldn't be in use anyway for
>files accessed by many programs within a system).


We have that. After wrenching control of data from applications programmers, we
put it in a database, where the centrally controlled description is called a
schema. Every language can read the data. So can non-language tools such as
spreadsheets, report generators and data warehousing products. Best of all,
they don't have to be running on the same machine. A program running on a
desktop can directly access databases all over the world .. without caring what
brand they are.

Want a record layout? Just type 'DESCribe tablename'.
Richard

2004-08-01, 8:55 pm

robert.deletethis@wagner.net (Robert Wagner) wrote

>
> The spread between computer and disk speed will continue to increase. Computers
> double in speed every 18 months. Disks advance more slowly, and may already have
> hit the physical limit set by strength of materials.


Your argument doesn't seem to be related at all.

CPU speed increases faster than disk speed so a valid conclusion is
that compression, which 'wastes' CPU but saves disk transfers is a
good thing.

In any case your 'hit the physical limit' is what was being said by
shortsighted pundits before transistors were invented.

>
> That's intended for non-text objects such as jpgs, not for numbers represented
> in binary.


Binary numbers are 'non-text objects'. The point is that, in spite of
your assertions, XML _does_ allow binary objects.
Richard

2004-08-03, 8:55 am

robert.deletethis@wagner.net (Robert Wagner) wrote

> You're right, but raw data files ARE exchanged. Carol received 21 of them.


It wouldn't have made much difference if they were EDIFACT or XML or
CSV, raw data doesn't _mean_ anything unless you know what it means.

> Databases put a stop to that.
>
> Boss: Send the product file to X.
> RW: Ok, I'll work up an extract. What fields do they want?
> Boss: We don't have time for that. Just send 'em the file.
> RW: They won't know how to read it.
> Boss: Send a copybook. Let them figure it out.


While you may have sent junk files out in that way, it certainly isn't
a common practice for the very good reason that data files contain
information that _you_don't_want_others_to_get_.

In the stock inventory files is the cost price. Would you send cost
prices to your customers ? Would you send your customer list to your
competitors ?

Any _sensible_ site would filter the data and only send what was
required in an extract.

In any case databases are proprietary format, but so are most Cobol
file systems. It happens that the default for MF Cobol on Unix is
C-ISAM, but this does not make it suitable as an interchange file
(even if you don't understand this yet), nor does it make this the
only format that is or has been used.

'sending a file plus a copybook' does _NOT_ make it usable, even if it
doesn't contain data that is confidential.

Perhaps this is just another reason why everywhere you worked went
bankrupt.
docdwarf@panix.com

2004-08-03, 8:55 am

In article <410cf434.43224866@news.optonline.net>,
Robert Wagner <robert.deletethis@wagner.net> wrote:

[snip]

>Boss: Send the product file to X.
>RW: Ok, I'll work up an extract. What fields do they want?
>Boss: We don't have time for that. Just send 'em the file.
>RW: They won't know how to read it.
>Boss: Send a copybook. Let them figure it out.


Mr Wagner, how that you find yourself working in a shop like that...
now, I'm probably not as good with sports metaphors as you but just let me
say that you shouldn't worry, all you gotta do is step up to the plate a
few times, throw that shuttlecock for some touchdowns, knock that 7-10
split out of the sandtrap and soon you'll find that the football's in your
court... 'the choice is theirs, not the employer's', no?

DD

Robert Wagner

2004-08-03, 8:55 am

riplin@Azonic.co.nz (Richard) wrote:

>robert.deletethis@wagner.net (Robert Wagner) wrote
>
>
>Your argument doesn't seem to be related at all.
>
>CPU speed increases faster than disk speed so a valid conclusion is
>that compression, which 'wastes' CPU but saves disk transfers is a
>good thing.


That's what I said. My argument was directly related.

>In any case your 'hit the physical limit' is what was being said by
>shortsighted pundits before transistors were invented.


CDs and magneto optical disks are made of polycarbonate, which has a tensile
strength of 60 MPa (million Pascals = million Newtons per square meter). They
spin at up to 20,000 rpm, which is the physical limit. Any faster would make
them shatter.

Hard disks are made of aluminum, which has a tensile strength of 250 MPa (same
as steel) . They now spin at up to 10K rpm. They cannot go four times faster
than polycarbonate because they're heavier. The theoretical max is around 20K
rpm.

Kevlar has tensile strength of 1250 MPa and is light. A Kevlar disk could
theoretically go above 100K rpm. Problem is, it's expensive. Bigger problem is,
the outside surface would be traveling at 1900 km/hr = 1200 mi/hr .. about one and a half times
the speed of sound. It would have to be housed in a vacuum to avoid a sonic boom
and would have to be heavily shielded for safety in case of failure.
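
The rim-speed figure can be checked with a few lines of arithmetic. The platter diameter is not stated in the post; the sketch assumes a 95 mm platter (typical of a 3.5-inch drive) and a sea-level speed of sound of about 1235 km/hr.

```python
import math

def rim_speed_kmh(diameter_m: float, rpm: float) -> float:
    """Speed of the outer edge: circumference times revolutions per hour."""
    metres_per_minute = math.pi * diameter_m * rpm
    return metres_per_minute * 60 / 1000

SPEED_OF_SOUND_KMH = 1235      # approximate, sea level at 20 C
PLATTER_DIAMETER_M = 0.095     # assumed; not given in the post

speed = rim_speed_kmh(PLATTER_DIAMETER_M, 100_000)
mach = speed / SPEED_OF_SOUND_KMH
```

Under those assumptions the edge moves at roughly 1800 km/hr, comfortably supersonic, so the order of magnitude in the post holds.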

For comparison, a dentist's drill goes 100K rpm. It doesn't break the sound
barrier because the bit has a small circumference.

Carbon nanotubes have a tensile strength of 65-200 GPa. In theory, they could
spin at an incredible 10M rpm. The limiting factor would be the strength of
surface coating -- the metal or polymer holding the information.

Decades ago, non-mechanical storage was thought to be the answer. Remember
Bubble Memory? It turned out to be impractical because bubbles collided at the
figure-eight intersection .. like an electronic demolition derby.

On the solid-state front, Compact Flash, Secure Digital (SD) and SONY offer
inexpensive cards holding 512M. Smart Media stopped at 128M. Why? Because Smart
Media hit the physical wall. Short-sighted companies (in your judgment) who
hitched their wagon to SM have been forced to change direction e.g. Olympus.

I doubt solid-state storage will ever be cheaper than mechanical (disk). Pundits
have predicted it for decades, but it hasn't happened.

Richard

2004-08-03, 8:55 am

robert.deletethis@wagner.net (Robert Wagner) wrote

> That's what I said. My argument was directly related.


No. You didn't relate it at all. You just said CPUs are getting
faster relative to disks - you didn't draw how this related to disk
compression of file space at all.

> CDs and magneto optical disks [..]
> Hard disks are made of aluminum, [..]
> Kevlar has tensile strength of [..]


That is exactly the same sort of irrelevant arguments that were used
to prove that computers were limited by the physics of how valves
operated.
Robert Wagner

2004-08-03, 8:55 am

riplin@Azonic.co.nz (Richard) wrote:

>robert.deletethis@wagner.net (Robert Wagner) wrote
>
>
>It wouldn't have made much difference if they were EDIFACT or XML or
>CSV, raw data doesn't _mean_ anything unless you know what it means.


XML files can be self-defining. When imported to a database or spreadsheet,
columns will have names like 'Part_Number' and 'UPC'.

If it came from a mainframe shop, the name will likely be self-qualifying and
missing vowels -- 'TPrdMst_PrtNmbr' -- but still decipherable.

>
>While you may have sent junk files out in that way, it certainly isn't
>a common practice for the very good reason that data files contain
>information that _you_don't_want_others_to_get_.
>
>In the stock inventory files is the cost price. Would you send cost
>prices to your customers ? Would you send your customer list to your
>competitors ?


Only the most rudimentary system puts Cost in the Product master. Typically cost
and selling prices are in another table keyed by date. A product does not have a
single cost; it has historic and future costs.

Same for Quantity On Hand. When processing an order, you don't care what the
quantity is that instant, you want to know how many will be on the shelf when
the picker arrives 12 hours hence. Typical resolution for that table is one
hour.

Both are examples of Lists. They are not Arrays in the Product master, they are
separate tables .. when designed well, per Codd et al.

>Any _sensible_ site would filter the data and only send what was
>required in an extract.


I'll let you tell managers who send raw files that they are not sensible.

>In any case databases are proprietary format, but so are most Cobol
>file systems. It happens that the default for MF Cobol on Unix is
>C-ISAM, but this does not make it suitable as an interchange file
>(even if you don't understand this yet), nor does it make this the
>only format that is or has been used.


Nearly all ISAMs come with a tool that turns the indexed file into a flat file
-- IDCAMS, Realcopy, etc. Carol's tapes were probably created with IDCAMS.