Code Comments
Programming Forum and web based access to our favorite programming groups.

As promised, I dug around and found some documents that could be useful for people considering migration. The one on Normalization is a revamp of an original paper from Microsoft. I don't know who wrote it but it is crisp and succinct. I revised it minimally. The other two describe some tools I wrote, and in the course of doing so, provide some useful background and examples.

Here is a snippet from another thread here...

> A very important observation, Jimmy.
>
> I have some stuff on this somewhere... I'll see if I can post it to a web
> server so people can access it.
>
> Pete.

This has now been posted... Accessing the following link will reveal 3 documents that are worth reading if you are considering migrating ISAM to RDB....

http://homepages.ihug.co.nz/~dashwo...hwood/RDBStuff/

Any or all feedback appreciated.

Pete.

"Pete Dashwood" <dashwood@removethis.enternet.co.nz> wrote in message
news:589a52F2g4c1cU1@mid.individual.net...
[snip]
> This has now been posted... Accessing the following link will reveal 3
> documents that are worth reading if you are considering migrating ISAM to
> RDB....
>
> http://homepages.ihug.co.nz/~dashwo...hwood/RDBStuff/
>
> Any or all feedback appreciated.

In 4.ISAM2RDB.doc,

1. Page 3, Dealing with OCCURS (Repeating Groups), items 1 and 3. You seem to disregard the space savings that ODO and RECORD VARYING provide.

2. Page 4, ultimate paragraph. You write "COBOL supports 3 levels of indexing". That was true of COBOL '74. COBOL '85 allowed 7 levels.

3. Page 8, Currency data type (in table). You write "This data TYPE is held as 64 bit floating point ...". MS Access help states, "Currency variables are stored as 64-bit (8-byte) numbers in an integer format, scaled by 10,000 to give a fixed-point number with 15 digits to the left of the decimal point and 4 digits to the right."
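
The third point can be illustrated with a short sketch. The Python below is only an illustration and is not taken from Access or from ISAM2RDB: it models a Currency-style value as a signed 64-bit integer scaled by 10,000, as the quoted help text describes, and contrasts it with binary floating point. The helper names `to_scaled` and `from_scaled` are hypothetical.

```python
from decimal import Decimal

SCALE = 10_000  # four implied decimal places, per the help text quoted above


def to_scaled(text: str) -> int:
    """Convert a decimal string to a scaled integer (hypothetical helper)."""
    scaled = Decimal(text) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError("more than 4 decimal places")
    value = int(scaled)
    if not -(2**63) <= value < 2**63:
        raise OverflowError("does not fit in a signed 64-bit integer")
    return value


def from_scaled(value: int) -> Decimal:
    """Convert a scaled integer back to a decimal value."""
    return Decimal(value) / SCALE


# Fixed-point addition is plain integer addition, so it is exact.
total = to_scaled("0.10") + to_scaled("0.20")
print(from_scaled(total))            # 0.3
print(0.1 + 0.2 == 0.3)              # False: binary floating point is inexact here
print(from_scaled(2**63 - 1))        # largest value: 15 digits before the point, 4 after
```

The distinction matters for migration because a fixed-point column can hold sums of money exactly, while a floating-point column of the same width generally cannot.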

"Rick Smith" <ricksmith@mfi.net> wrote in message
news:1320k5l1kvroc05@corp.supernews.com...
>
> "Pete Dashwood" <dashwood@removethis.enternet.co.nz> wrote in message
> news:589a52F2g4c1cU1@mid.individual.net...
> [snip]
>
> In 4.ISAM2RDB.doc,
>
> 1. Page 3, Dealing with OCCURS (Repeating Groups),
> items 1 and 3. You seem to disregard the space savings
> that ODO and RECORD VARYING provide.

Yes, that's probably true, although I would have done so unconsciously.

My personal opinion (and it is ONLY that :-)) is that these constructs are just pointless and useless. Unless COBOL dynamically allocates space (and it doesn't) the only "saving" that is made with ODO is on external media. Internally, an ODO definition always takes the maximum space that it could. The compiler has to allocate the maximum because it can't dynamically allocate space at run time.

I don't use this construct, and I discourage others from doing so too. A relational DB allows "tables" with "infinite" (limited only by available disk space, and that gets cheaper every year) dimension, so the external saving is just unnecessary if you use RDB, anyway.

Never needed it; don't use it. :-)

RECORD VARYING... may have some marginal use and is certainly important when processing legacy files.

I honestly don't know what ISAM2RDB does with these constructs. I think it would be an easy matter to change the ISAM source so that it reflected the maximums before presenting it to ISAM2RDB. If I was still interested in marketing this stuff I'd fix it to accommodate these constructs.

> 2. Page 4, ultimate paragraph. You write "COBOL
> supports 3 levels of indexing". That was true of COBOL
> '74. COBOL '85 allowed 7 levels.

Thanks. I didn't know that. I could easily modify ISAM2RDB to take this into account. I have never seen a live COBOL program with more than 3 dimensions (and even 3 is pretty rare), so I don't see this as a major problem. I guess people don't always do what the standard may permit them to.

> 3. Page 8, Currency data type (in table). You write "This
> data TYPE is held as 64 bit floating point ...". MS Access
> help states, "Currency variables are stored as 64-bit (8-byte)
> numbers in an integer format, scaled by 10,000 to give a
> fixed-point number with 15 digits to the left of the decimal
> point and 4 digits to the right."

Hardly worth arguing (floating point is just one interpretation of scaled decimal) but I stand corrected. :-)

Thanks for the comments, Rick. I wish I had been able to get someone whose opinions I value as much as yours to review these documents before they were published. I did the best I could under the circumstances :-).

Pete.
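
The point about where an ODO table saves space comes down to simple arithmetic, sketched below in Python. Everything here is an assumption for illustration: the `RecordLayout` class, the 40-byte fixed part, the 20-byte entries, the maximum of 50 occurrences and the occurrence counts are all made up, and any per-record length prefix a real file system adds is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordLayout:
    """Hypothetical layout: a fixed part followed by one repeating group."""
    fixed_bytes: int   # bytes before the table
    entry_bytes: int   # bytes per table entry
    max_entries: int   # declared upper bound of the table

    def in_memory_bytes(self) -> int:
        # Storage reserved in the program: always sized for the maximum.
        return self.fixed_bytes + self.entry_bytes * self.max_entries

    def on_disk_bytes(self, entries: int) -> int:
        # Variable-length record: only the occurrences actually present.
        if not 0 <= entries <= self.max_entries:
            raise ValueError("entry count outside the declared range")
        return self.fixed_bytes + self.entry_bytes * entries


layout = RecordLayout(fixed_bytes=40, entry_bytes=20, max_entries=50)
counts = [3, 0, 12, 50, 7]  # made-up occurrence counts for five records

fixed_total = layout.in_memory_bytes() * len(counts)
variable_total = sum(layout.on_disk_bytes(n) for n in counts)
print(fixed_total, variable_total)  # 5200 1640
```

With these made-up figures the file written at maximum length is roughly three times the size of the variable-length one, while the in-program record area is the same in both cases. How large the gap is in practice depends entirely on how far typical occurrence counts fall below the maximum.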

(I found the .DOC file, but don't remember if Pete posted links to the programs themselves.) However, a couple of comments on Rick's ideas (and Pete's replies).

1) On the mainframe 4-7 level tables do occur. Not super often, but they do exist.

2) The RECORD VARYING IN SIZE phrase (in an FD) is (IMHO) one of the more useful additions of the '85 Standard. This allowed one to "get" the size of a variable length record easily and via a "supported" interface. It also makes "setting" the size on output easier as well.

3) Whether or not using ODO would have been a "good idea" would depend on where it is used. In Standard COBOL, one may ONLY specify an ODO as the last "group" at the end of a record. (One may not have data following an ODO at the same level, nor may one nest an ODO under another OCCURS - either with or without the DEPENDING ON phrase). Now there is a relatively common extension that allows "nested" ODO's and data after an ODO. HOWEVER, what this actually means (semantics) does vary from implementor to implementor. Check out the Micro Focus "ODOSLIDE" directive to see two of the major implementation differences. In fact, this SORT-OF answers Pete's "dynamic allocation" issue. When one uses the ODOSLIDE (on) directive, MF doesn't actually do dynamic allocation but DOES change the amount of storage "currently in use" (available to the application).

Again, I don't know if Pete was talking about places that would or would not be conforming for ODOs, but (in general) I would agree with him that "avoiding" its use unless there is a REALLY good reason to use them, is probably a good idea.

--
Bill Klein
wmklein <at> ix.netcom.com

"Pete Dashwood" <dashwood@removethis.enternet.co.nz> wrote in message
news:58bp6cF2h1f0qU1@mid.individual.net...
> [snip]

Pete Dashwood wrote:
> "Rick Smith" <ricksmith@mfi.net> wrote in message
> news:1320k5l1kvroc05@corp.supernews.com...
>
> Yes, that's probably true, although I would have done so unconsciously.
>
> My personal opinion (and it is ONLY that :-)) is that these constructs are
> just pointless and useless. Unless COBOL dynamically allocates space (and it
> doesn't) the only "saving" that is made with ODO is on external media.
> Internally, an ODO definition always takes the maximum space that it
> could. The compiler has to allocate the maximum because it can't dynamically
> allocate space at run time.
>
> I don't use this construct, and I discourage others from doing so too. A
> relational DB allows "tables" with "infinite" (limited only by available
> disk space, and that gets cheaper every year) dimension, so the external
> saving is just unnecessary if you use RDB, anyway.
>
> Never needed it; don't use it. :-)
>
> RECORD VARYING... may have some marginal use and is certainly important when
> processing legacy files.

Like so much in this business, it depends.

If ODO saves a significant amount of raw file space to store the data on external media this can have a number of beneficial effects that go much beyond the mere cost of your DASD media: (1) savings in processor time, I/O activity, media, and real time to backup the external data for Disaster Recovery; (2) savings in cost of disk media at a DR recovery site (which may be expensive or difficult to increase depending on your contract); (3) savings in processor time, I/O activity, and real time to reorganize or rebuild the database; (4) savings in processor time, I/O, and elapsed time to sequentially access a significant percent of the database, because more used bytes are transferred with each physical block read; (5) savings in the number of buffers required (affecting size of working set and real storage requirements) for caching the database in order to contain the same number of records in cache and get acceptable response time for random access.

If you are in an environment where you are never constrained by processor time, real memory, I/O response times, daily batch windows, DASD availability, or DR costs, then by all means ODO is irrelevant. In all other cases, one looks for the major resource hogs, or "loved ones" with poor response times, and does whatever it takes to address the problem, including use of ODO where appropriate.

We too have had COBOL programmers who hated to deal with variable length records. But, the marginal extra cost to manage variable length records within a COBOL program can easily be insignificant when compared with what it costs to pump unused bytes through the I/O subsystem over and over.

COBOL does not bother to dynamically allocate storage to ODO items at run time, because with virtual storage there is no significant savings in allocating COBOL ODO data items at anything less than the max required. Unused portions of a large array do not contribute to the working set of the program or the real storage required to execute. In the z/OS environment, real 4KiB pages wouldn't even be assigned to portions of a large array until the first reference required it. So long as you don't do something silly, like initializing the entire array in advance just in case you might need all of it, then the cost of unused portions is essentially zero in that environment.

Although it's probable your remarks on ODO were only intended to apply to record formats used in I/O, I want others reading this to be clear that there are other cases in COBOL where ODO is the only reasonable way to go. One case where ODO should ALWAYS be used is for a sorted data item array with a variable number of items that will be used repeatedly with a SEARCH ALL. Not only does proper setting of the "depending on" variable eliminate the need to initialize unused trailing items in the array, but it guarantees the resulting binary search uses the minimal number of compares for the search. For arrays whose max size is much greater than their average usage, failure to use ODO here can have a significant negative impact on performance.
...

--
Joel C. Ewing, Fort Smith, AR
jREMOVEcCAPSewing@acm.org
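
The SEARCH ALL claim can be made concrete with a small sketch. The Python below is not COBOL and does not reproduce any compiler's SEARCH ALL implementation; it is a textbook binary search that counts probes. It compares a table limited to its in-use entries (the effect of a correctly set "depending on" item) with the same table padded to its maximum size by a high sentinel standing in for HIGH-VALUES. The sizes 100 and 10,000 are made-up figures.

```python
def count_probes(table, key):
    """Textbook binary search; returns (found, number of entries probed)."""
    lo, hi, probes = 0, len(table) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if table[mid] == key:
            return True, probes
        if table[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes


IN_USE, MAX_SIZE = 100, 10_000         # hypothetical sizes
HIGH = float("inf")                    # stand-in for HIGH-VALUES

used = list(range(0, IN_USE * 2, 2))   # 100 sorted keys: 0, 2, ..., 198
padded = used + [HIGH] * (MAX_SIZE - IN_USE)

# Worst-case probes when looking up each key that is actually present.
worst_used = max(count_probes(used, k)[1] for k in used)
worst_padded = max(count_probes(padded, k)[1] for k in used)
print(worst_used, worst_padded)        # 7 14
```

Under these assumptions the padded table costs about log2(MAX_SIZE / IN_USE) extra probes per lookup rather than one per unused entry. Whether seven extra probes per search is significant depends on how often the search runs and on the cost of each compare.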

On Apr 14, 4:29 pm, "Joel C. Ewing" <jcREMOVEew...@CAPS.acm.org> wrote:
> Like so much in this business, it depends.
> [snip]

Joel

I agree with you, though I think that these days file compression on the fly is also available in some environments and is probably much more effective at reducing file sizes.

I quite agree with your comments on minimising sort requirements, though am not so sure about only initializing the parts of a table in use. While I see the benefits of this there, it also sets a trap for the unwary maintenance programmer at 3.00 a.m. on a call out, though I can also see that clear and/or appropriately documented code would minimise the risk.

Have you seen John Piggott's proposal for taking the topic quite a bit further and now incorporated in the draft standard for the next revision? It is very similar to the technique used by the Pick O/S, though its use for files was left as a possible future enhancement. Then people would be able to truly talk about COBOL files, as this format would then only be able to be read by COBOL programs in non-Pick operating systems, though I suppose suppliers might also write some utilities for them. It would make reading dumps harder and interpretive debuggers harder to implement and follow.

It will be of great benefit to programs using massive data structures, though for general use it would probably add unnecessary complexity.

Robert

Joel,

I assume that it would be obvious to some - but not all - other readers from terms such as "DASD" "Virtual Storage" and your eventual mention of "z/OS" that you are probably MOST familiar with COBOL on a mainframe (and probably an IBM MVS-OS/390-z/OS mainframe). How ODO's are handled in that COBOL environment (both object code AND storage vs performance considerations) is not universal across COBOL implementations. I am not necessarily disagreeing with you (and certainly NOT for z/OS) but I did want to point this out - just in case others have not.

P.S. In fact, under z/OS, the CICS documentation specifically says to avoid ODO's as much as possible. See:

http://publibz.boulder.ibm.com/cgi-...fhp3b01/1.4.1.1

which says (in part),

"Statements that produce variable-length areas, such as OCCURS DEPENDING ON, should be used with caution within the WORKING-STORAGE SECTION."

Obviously (and for those not familiar with CICS) there is NO "File Section" under CICS.

--
Bill Klein
wmklein <at> ix.netcom.com

"Joel C. Ewing" <jcREMOVEewing@CAPS.acm.org> wrote in message
news:Cr6Uh.21322$PL.11781@newsread4.news.pas.earthlink.net...
> Pete Dashwood wrote:
> Like so much in this business, it depends.
> [snip]

"Robert Jones" <rjones0@hotmail.com> wrote in message
news:1176585614.919567.229980@e65g2000hsc.googlegroups.com...
> On Apr 14, 4:29 pm, "Joel C. Ewing" <jcREMOVEew...@CAPS.acm.org>
<much snippage> (and bottom-posted for a change <G> )
>
> Have you seen John Piggott's proposal for taking the topic quite a bit
> further and now incorporated in the draft standard for the next
> revision? It is very similar to the technique used by the Pick O/S,
> though its use for files was left as a possible future enhancement.
> Then people would be able to truly talk about COBOL files, as this
> format would then only be able to be read by COBOL programs in non-
> Pick operating systems, though I suppose suppliers might also write
> some utilities for them. It would make reading dumps harder and
> interpretive debuggers harder to implement and follow.
>
> It will be of great benefit to programs using massive data structures,
> though for general use it would probably add unnecessary complexity.
>
> Robert

And what has gotten into the draft (WD 1.7) is a MESS (for ANY LENGTH items especially, but many of the same problems also exist for dynamic tables). Specifically for the ANYLENGTH items, see:

http://www.cobolstandard.info/j4/files/07-0060.doc

Dynamic tables are at least "better" (cleaner?) as they don't have the "direct" option; storage is always "wherever the implementor puts it".

--
Bill Klein
wmklein <at> ix.netcom.com

"Pete Dashwood" <dashwood@removethis.enternet.co.nz> wrote in message
news:589a52F2g4c1cU1@mid.individual.net...
[snip]
> This has now been posted... Accessing the following link will reveal 3
> documents that are worth reading if you are considering migrating ISAM to
> RDB....
>
> http://homepages.ihug.co.nz/~dashwo...hwood/RDBStuff/
>
> Any or all feedback appreciated.

In DBnormalization.doc, page 2, under "Other Normalization Forms", the statement "Fourth normal form, also called Boyce Codd Normal Form (BCNF), ...." is incorrect. BCNF and fourth normal form are different forms.

< http://en.wikipedia.org/wiki/Boyce-Codd_normal_form >
"Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly-stronger version of third normal form (3NF)."

< http://en.wikipedia.org/wiki/Fourth_normal_form >
"4NF is the next level of normalization after Boyce-Codd normal form (BCNF)."

"Robert Jones" <rjones0@hotmail.com> wrote in message
news:1176585614.919567.229980@e65g2000hsc.googlegroups.com...
> On Apr 14, 4:29 pm, "Joel C. Ewing" <jcREMOVEew...@CAPS.acm.org>
> wrote:

Yes. But for me at least, the disadvantages far outweigh the advantages.

IF it saves space and SIGNIFICANT space. A RDB in normal form will provide a better storage solution, in my opinion. It isn't just about space; it is about overall availability.

Huh?!! The processor time to deal with ODO is outrageous...compared to fixed length.

Awww... I was gonna respond to all of these points below, and then I realised it just isn't worth the time. Use ISAM with or without ODO if you really believe what you have written; I'll stick to RDB.

> (2)savings in cost of disk media at a DR recovery

Sure, whatever... I just don't see ODO as a solution; I see it as a problem. :-)

Not me. I don't get emotional about software. I don't use ODO, not because I hate variable records, but because it introduces unjustifiable complexity and hassle. Besides, you don't need ODO to create varying length records.

Yes, my remarks were intended to apply to record formats used in I/O. But I also contest your assertion that it is "the only reasonable way to go"... :-)

Get a life!

"Can have..." ? "Significant" ? Sure, and you COULD get hit by a meteorite with SIGNIFICANT impact, while walking down the road. Real life example...? Nope. In 1960 this might have mattered; in 2007.... I think not.

SEARCH ALL will take into account the trailing entries anyway (provided you have them set to high values). You say using ODO will "eliminate the need to initialize unused trailing items in the array" but this can be done at compile time and the table can be loaded initialized. You then say "it guarantees the resulting binary search uses the minimal number of compares for the search." but the difference in number of compares is binary; it isn't one extra compare for every "trailing" entry...

Sorry, I'm not persuaded.