Code Comments
All righty... there's a flat file that's used to load a VSAM KSDS. Originally the file was de-duped with a SORT on the key field and the SUM FIELDS=NONE control statement, eg:

  SORT FIELDS=(5,9,CH,A)
  SUM FIELDS=NONE

Now an additional field has been added to the sort which extends it to something like:

  SORT FIELDS=(5,9,CH,A,115,8,CH,D)

... and the de-duping no longer occurs. When the result of the sort is REPRO'd there are records with duplicate key fields present and IDCAMS starts throwing sequence errors.

At the moment there are 86 dupe-key recs in a file of about 87K recs; the quick fix is to up the IDCAMS tolerance for errors by specifying:

  REPRO INFILE(INPUT1) OUTFILE(OUTPUT1) ERRORLIMIT(500)

... and this should keep things happy until the limit is reached.

The question becomes... where would I find out how many errors IDCAMS has encountered in a given build of the file? I can send the SYSPRINT to a file and run a SORT to COUNT the IDC3314I messages... but I'd rather add another step to the job as a last resort; it would be more elegant to find a parameter to query.

DD
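
The fallback mentioned in the question, counting IDC3314I messages in a saved SYSPRINT, comes down to a simple line filter. Below is a minimal sketch in Python, used here only to model the logic; on the mainframe this would be a SORT or ICETOOL step. The function name and the sample listing lines are hypothetical and are not real IDCAMS output.

```python
def count_sequence_errors(sysprint_lines):
    """Count the lines that carry the IDC3314I message id."""
    return sum(1 for line in sysprint_lines if "IDC3314I" in line)


# Hypothetical stand-in for a SYSPRINT that was written to a file.
sample_sysprint = [
    "REPRO INFILE(INPUT1) OUTFILE(OUTPUT1) ERRORLIMIT(500)",
    "IDC3314I **RECORD OUT OF SEQUENCE - KEY FOLLOWS:",
    "(key dump line, made up for the example)",
    "IDC3314I **RECORD OUT OF SEQUENCE - KEY FOLLOWS:",
    "(key dump line, made up for the example)",
]

print(count_sequence_errors(sample_sysprint))
```

This counts messages after the fact; it does not answer whether IDCAMS exposes the count as a queryable parameter.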

docdwarf@panix.com wrote:
> All righty... there's a flat file that's used to load a VSAM KSDS.
> Originally the file was de-duped with a SORT on the key field and the SUM
> FIELDS=NONE control statement, eg:
>
> SORT FIELDS=(5,9,CH,A)
> SUM FIELDS=NONE

Doc, if I understand you correctly, the key for this VSAM file begins in column 5, for a length of 9? Perhaps my experience is limited, but I find it is extremely rare to have a KSDS file with any key displacement other than zero. I do know a couple of examples in my shop and they tend to confuse the average two-year programmer.

>
> Now an additional field has been added to the sort which extends it to
> something like:
>
> SORT FIELDS=(5,9,CH,A,115,8,CH,D)
>
> ... and the de-duping no longer occurs. When the result of the sort is
> REPRO'd there are records with duplicate key fields present and IDCAMS
> starts throwing sequence errors.

Now you've really got me. You're sorting on column 5 for 9 ascending, and also sorting on column 115 for 8 descending? Well, Syncsort or DFSORT should have no problem with that, but correct me if I am wrong. Isn't it the case that a primary key in a KSDS file consists of a single field? I don't believe VSAM supports a primary key consisting of two fields separated by non-key data, but I'm ready to be proved wrong.

>
> At the moment there are 86 dupe-key recs in a file of about 87K recs; the
> quick fix is to up the IDCAMS tolerance for errors by specifying:
>
> REPRO INFILE(INPUT1) OUTFILE(OUTPUT1) ERRORLIMIT(500)
>
> ... and this should keep things happy until the limit is reached.
>
> The question becomes... where would I find out how many errors IDCAMS has
> encountered in a given build of the file? I can send the SYSPRINT to a
> file and run a SORT to COUNT the IDC3314I messages... but I'd rather add
> another step to the job as a last resort; it would be more elegant to find
> a parameter to query.
>
> DD

I don't think showing the errors is really the solution you want. Frank Yaeger suggests elsewhere that your aim is to save the highest value of field 115-for-8 associated with the VSAM key of 5-for-9 (hmm, Social Security Number?).

I am not familiar with ICETOOL because we have Syncsort, but you might consider two SORT steps. Keep your first sort as-is, but without SUM FIELDS=NONE. In the second step, sort on the 5-for-9, use the EQUALS option to preserve original sequence, and use SUM FIELDS=NONE to remove duplicates on the 5-for-9. You might need to test ascending versus descending on the 115-for-8 field to get the correct result.

EQUALS is the default on Syncsort. I have no experience with DFSORT for comparison. For other readers, the EQUALS option guarantees that records with the same sort key will have their original input sequence preserved in the output file. Some sort algorithms will randomize the sequence of records with duplicate sort keys.

I think that might work, although sorting the file twice will make the job run longer. Of course, I don't know how much heartache and pain it takes to insert an additional job step at your shop.

With kindest regards,
--
http://arnold.trembley.home.att.net/
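
The two-step approach described in this post can be modelled as follows. This is a minimal sketch in Python, chosen only to illustrate the logic, not mainframe code. Records are simplified to hypothetical (key, date, payload) tuples standing in for the 5-for-9 and 115-for-8 fields. Python's sort is stable, which plays the role of EQUALS here. The sketch assumes that de-duping with equal-key order preserved keeps the first record of each key.

```python
def first_pass(records):
    """Step 1: key ascending, date (YYYYMMDD digits) descending, no de-dupe."""
    return sorted(records, key=lambda r: (r[0], -int(r[1])))


def second_pass(records):
    """Step 2: stable sort on the key alone, then keep the first record per key."""
    stable = sorted(records, key=lambda r: r[0])  # equal keys keep input order
    kept = []
    for rec in stable:
        if not kept or kept[-1][0] != rec[0]:
            kept.append(rec)
    return kept


# Hypothetical input: two customers, two purchase records each.
purchases = [
    ("000000002", "20050101", "b-old"),
    ("000000001", "20050310", "a-new"),
    ("000000001", "20041231", "a-old"),
    ("000000002", "20050912", "b-new"),
]

for rec in second_pass(first_pass(purchases)):
    print(rec)
```

With the date descending in the first pass, the record that survives for each key is the one with the highest date.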

"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message news:%UMVe.242763$5N3.21978@bgtnsc05-news.ops.worldnet.att.net...
> docdwarf@panix.com wrote:
>
> Doc, if I understand you correctly, the key for this VSAM file begins in
> column 5, for a length of 9? Perhaps my experience is limited, but I find
> it is extremely rare to have a KSDS file with any key displacement other
> than zero. I do know a couple of examples in my shop and they tend to
> confuse the average two-year programmer.

Last one I created keyed off column 9 for a length of 8. I doubt it was unusual - though if doing a sort I could definitely consider changing the OUTREC, which I was just too lazy to do.

>
> Now you've really got me. You're sorting on column 5 for 9
> ascending, and also sorting on column 115 for 8 descending? Well,
> Syncsort or DFSORT should have no problem with that, but correct me if I
> am wrong. Isn't it the case that a primary key in a KSDS file consists of
> a single field? I don't believe VSAM supports a primary key consisting of
> two fields separated by non-key data, but I'm ready to be proved wrong.

I don't believe that he is saying that the new field is part of the key in the VSAM file. The sort option of SUM FIELDS=NONE will just skip second instances....so for example if the two fields were:

  AAAAAAAAAA......BBBBBBBB
  AAAAAAAAAA......CCCCCCC
  AAAAAAAAAA......BBBBBBBB

The sort originally yielded:

  AAAAAAAAAA......BBBBBBBB

But now with the extra field in the sort it yields:

  AAAAAAAAAA......BBBBBBBB
  AAAAAAAAAA......CCCCCCC

I, like you, am not sure why you would add a field to the sort and not the key - unless the second key is for an alternate index, which appears not to be the case given the intent is to have "NODUPS".

I thought IDCAMS threw dupe errors not sequence errors in this instance....tells you what I know.

>
> I don't think showing the errors is really the solution you want. Frank
> Yaeger suggests elsewhere that your aim is to save the highest value of
> field 115-for-8 associated with the VSAM key of 5-for-9 (hmm, Social
> Security Number?).

Isn't this the age old...requirements trauma..."why do you ask for "A" DD, when you obviously want "B"?"....or rephrased so as not to answer with a question....."I understand that you asked for "A" but I believe you really want "B" "

Obviously the define of the KSDS has "NODUPS" so all he is asking is how can he make sure that the whole thing doesn't fall on its face in production (I think...) in other words...let IDCAMS throw out the dupes.

I don't have a manual in front of me, but I would have "hoped" that if you have "TWO KEYS" trying to be loaded into a "NODUP" file that it would throw out "BOTH", not knowing which one is the appropriate record. This is typical in DBMS's but maybe VSAM is the stupid version of data management :-)

> I am not familiar with ICETOOL because we have syncsort, but you might
> consider two SORT steps. Keep your first sort as-is, but without SUM
> FIELDS=NONE. In the second step, sort on the 5-for-9, use the EQUALS
> option to preserve original sequence, and use SUM FIELDS=NONE to remove
> duplicates on the 5-for-9. You might need to test ascending versus
> descending on the 115-for-8 field to get the correct result.

He could get the same results as that by putting the sort back the way it was before.... I was taught that the sort is done *prior* to removing the duplicate records - perhaps the fact that Frank used ICETOOL means that the sort is not guaranteed to happen first (he's a kind of expert on DFSORT / ICETOOL, I feel)....

> EQUALS is the default on Syncsort. I have no experience with DFSORT for
> comparison. For other readers, the EQUALS option guarantees that records
> with the same sort key will have their original input sequence preserved
> in the output file. Some sort algorithms will randomize the sequence of
> records with duplicate sort keys.

SYNCSORT is supposedly a transparent replacement for DFSORT so don't worry, be happy.

> I think that might work, although sorting the file twice will make the job
> run longer. Of course, I don't know how much heartache and pain it takes
> to insert an additional job step at your shop.

Mainframes and COBOL aren't dead...it's apparently their operating environments....

> With kindest regards,
>
> --
> http://arnold.trembley.home.att.net/

JCE
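
The AAAA/BBBB example in this post can be checked with a small model. This is a sketch in Python, used only to illustrate the behaviour, not the sort product itself. It assumes that SUM FIELDS=NONE treats two records as duplicates only when every sort control field matches. The function name and the (index, descending) field notation are hypothetical.

```python
def sort_sum_none(records, control_fields):
    """Sort on the control fields, then drop records whose control fields all
    equal those of the previously kept record.

    control_fields is a list of (tuple_index, descending) pairs, major first.
    """
    out = list(records)
    # Stable sorts applied from the minor field to the major field.
    for idx, descending in reversed(control_fields):
        out.sort(key=lambda r, i=idx: r[i], reverse=descending)
    kept = []
    for rec in out:
        ctl = tuple(rec[i] for i, _ in control_fields)
        if not kept or ctl != tuple(kept[-1][i] for i, _ in control_fields):
            kept.append(rec)
    return kept


rows = [
    ("AAAAAAAAAA", "BBBBBBBB"),
    ("AAAAAAAAAA", "CCCCCCC"),
    ("AAAAAAAAAA", "BBBBBBBB"),
]

print(sort_sum_none(rows, [(0, False)]))             # key field only
print(sort_sum_none(rows, [(0, False), (1, True)]))  # key plus second field
```

On the key alone, one record survives. With the second field added, two survive, as the post describes. In this model the CCCCCCC record sorts ahead of BBBBBBBB because the second field is descending.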

In article <dg6la6$ab$1@reader1.panix.com>, docdwarf@panix.com () wrote:
> All righty... there's a flat file that's used to load a VSAM KSDS.
> Originally the file was de-duped with a SORT on the key field and the SUM
> FIELDS=NONE control statement, eg:
>
> SORT FIELDS=(5,9,CH,A)
> SUM FIELDS=NONE
>
> Now an additional field has been added to the sort which extends it to
> something like:
>
> SORT FIELDS=(5,9,CH,A,115,8,CH,D)
>
> ... and the de-duping no longer occurs. When the result of the sort is
> REPRO'd there are records with duplicate key fields present and IDCAMS
> starts throwing sequence errors.
>
> At the moment there are 86 dupe-key recs in a file of about 87K recs; the
> quick fix is to up the IDCAMS tolerance for errors by specifying:
>
> REPRO INFILE(INPUT1) OUTFILE(OUTPUT1) ERRORLIMIT(500)
>
> ... and this should keep things happy until the limit is reached.
>
> The question becomes... where would I find out how many errors IDCAMS has
> encountered in a given build of the file? I can send the SYSPRINT to a
> file and run a SORT to COUNT the IDC3314I messages... but I'd rather add
> another step to the job as a last resort; it would be more elegant to find
> a parameter to query.
>
> DD

It isn't exactly IDCAMS, but you could run an additional step on the file with sort, using the original 5,9 fields with SUM FIELDS=NONE and the XSUM DD.

The count of records in the XSUM will be the number of errors that IDCAMS is going to throw.
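
The idea in this reply is to route the discarded duplicates to a separate data set and count them. It can be modelled as below in Python, which is used only to illustrate the logic and does not represent how any sort product implements XSUM. Records are hypothetical (key, date) tuples and the function name is made up for the example.

```python
def split_duplicates(records):
    """Return (kept, discarded): the first record seen per key goes to kept,
    every later record with the same key goes to discarded."""
    seen = set()
    kept, discarded = [], []
    for rec in sorted(records, key=lambda r: r[0]):  # stable sort on the key
        if rec[0] in seen:
            discarded.append(rec)
        else:
            seen.add(rec[0])
            kept.append(rec)
    return kept, discarded


load_file = [
    ("000000001", "20050310"),
    ("000000001", "20041231"),
    ("000000002", "20050912"),
    ("000000003", "20050701"),
    ("000000003", "20050615"),
]

kept, discarded = split_duplicates(load_file)
print(len(kept), len(discarded))
```

The length of the discarded list is the number of records that share a key with an earlier record. Those are the records the load would be expected to reject.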

In article <%UMVe.242763$5N3.21978@bgtnsc05-news.ops.worldnet.att.net>, Arnold Trembley <arnold.trembley@worldnet.att.net> wrote:
>
>docdwarf@panix.com wrote:
>
>Doc, if I understand you correctly, the key for this VSAM file begins
>in column 5, for a length of 9? Perhaps my experience is limited, but
>I find it is extremely rare to have a KSDS file with any key
>displacement other than zero. I do know a couple of examples in my
>shop and they tend to confuse the average two-year programmer.

Our experiences agree, Mr Trembley; I was taught that keeping a primary key close to the start of the record decreases access-time. When the file was originally designed the key was (17 0)... but then something came up where somebody needed something else and this became the altkey; the primary was shifted to (9 17). (I used (5 9) just as a 'fer example'.)

>
>Now you've really got me. You're sorting on column 5 for 9
>ascending, and also sorting on column 115 for 8 descending?

Correct.

>Well,
>Syncsort or DFSORT should have no problem with that, but correct me if
>I am wrong. Isn't it the case that a primary key in a KSDS file
>consists of a single field? I don't believe VSAM supports a primary
>key consisting of two fields separated by non-key data, but I'm ready
>to be proved wrong.

The primary key of the VSAM remains where it is; what is needed is a different record from the input file to be loaded into the VSAM.

Without getting into the monkey-dicked-the-dog of my current process... imagine an input file containing purchase records. For Whatever Reason someone desires a VSAM KSDS which contains a record of the customer's most recent purchase. Sort on customer number (5,9) ascending and purchase date/time descending; the first record will be the most recent.

[snip]

>
>I don't think showing the errors is really the solution you want.
>Frank Yaeger suggests elsewhere that your aim is to save the highest
>value of field 115-for-8 associated with the VSAM key of 5-for-9 (hmm,
>Social Security Number?).

Aye.

>
>I am not familiar with ICETOOL because we have syncsort, but you might
>consider two SORT steps. Keep your first sort as-is, but without SUM
>FIELDS=NONE. In the second step, sort on the 5-for-9, use the EQUALS
>option to preserve original sequence, and use SUM FIELDS=NONE to
>remove duplicates on the 5-for-9. You might need to test ascending
>versus descending on the 115-for-8 field to get the correct result.

115-for-8 is a date field (YYYYMMDD) so (in absence of duplicate dates) descending will give the most recent record.

[snip]

>I think that might work, although sorting the file twice will make the
>job run longer. Of course, I don't know how much heartache and pain
>it takes to insert an additional job step at your shop.

Both solutions - making a second sort-pass to de-dupe or upping the IDCAMS ERRORLIMIT - are 'inelegant'... perhaps the discomfort I feel indicates my tendencies towards being an Artist. Ah, the torment of my sensitive soul may yet produce a Great Work!

Thanks much.

DD

In article <_ZOVe.72274$xl6.45974@tornado.tampabay.rr.com>, jce <defaultuser@hotmail.com> wrote:
>"Arnold Trembley" <arnold.trembley@worldnet.att.net> wrote in message
>news:%UMVe.242763$5N3.21978@bgtnsc05-news.ops.worldnet.att.net...

[snip]

>
>Last one I created keyed off column 9 for a length of 8. I doubt it was
>unusual - though if doing a sort I could definitely consider changing the
>outrec which I was just too lazy to do.

Maintaining consistency of format is what is wanted here, as well... the same copybook can be used for the flat file and the KSDS. I just checked the Prod control-member library and most KEYS are defined at displacement zero or one... but there are a few 'fliers', as well, like (14 8) and (6 28). The file I'm making is not (yet) used by many applications so efficiency is not (yet) paramount.

[snip]

>
>I don't believe that he is saying that the new field is part of the key in
>the VSAM file.

Correct.

[snip]

>I, like you, am not sure why you would add a field to the sort and not the
>key - unless the second key is for an alternate index which appears not to
>be the case given the intent is to have "NODUPS"

Again, I'll avoid the tawdry specifics about my client... but see the example given in another posting to get the most recent purchase for a given customer.

>
>I thought IDCAMS threw dupe errors not sequence errors in this
>instance....tells you what I know.

Learn something new every day, or a reasonable facsimile thereof. The IDCAMS SYSPRINT shows:

  IDC3314I **RECORD OUT OF SEQUENCE - KEY FOLLOWS:

... which, in a way, makes sense... if one considers that for a primary key field any key which is not greater than the previous key is 'out of sequence'.

[snip]

>
>Isn't this the age old...requirements trauma..."why do you ask for "A" DD,
>when you obviously want "B"?"....or rephrased so as not to answer with a
>question....."I understand that you asked for "A" but I believe you really
>want "B" "

... or ... 'We told you that we wanted A but now, after running for a couple-three months we're finding errors... so now we want B... for at least the next couple-three months.'

[snip]

>
>SYNCSORT is supposedly a transparently replacement for DFSORT so don't
>worry, be happy.

One might want to submit a job with an EXEC PGM=ICETOOL and see if it 806s... if it doesn't then referring to Mr Yaeger's most excellent Mini-User guide (even if one is large enough not to be considered a mini user) at

<http://publibz.boulder.ibm.com/cgi-... 20031205121435>

... or ...

<http://publibz.boulder.ibm.com/cgi-.../> 31205121435>

... might allow for some useful experimentation.

DD
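
The 'out of sequence' reading of IDC3314I given in this post can be modelled as a load loop. A key that is not greater than the last key loaded is rejected, and the load gives up once the error count reaches a limit. This is a minimal sketch in Python, used purely as an illustration. The function name is hypothetical, and the exact point at which IDCAMS stops relative to ERRORLIMIT is an assumption here, not something confirmed in the thread.

```python
def model_repro(keys, error_limit):
    """Return (loaded_keys, error_count) for an ascending-key load.

    Assumption: processing stops as soon as error_count reaches error_limit.
    """
    loaded, errors = [], 0
    for key in keys:
        if loaded and key <= loaded[-1]:
            errors += 1  # duplicate or lower key: 'out of sequence'
            if errors >= error_limit:
                break
            continue
        loaded.append(key)
    return loaded, errors


sorted_keys = ["001", "002", "002", "003", "003", "003", "004"]

print(model_repro(sorted_keys, 500))  # generous limit: load completes
print(model_repro(sorted_keys, 2))    # tight limit: load stops early
```

In this model a duplicate key is just a special case of a key that fails to increase, which is consistent with the message text quoted above.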

In article <joe_zitzelberger-2C9111.02070214092005@ispnews.usenetserver.com>, Joe Zitzelberger <joe_zitzelberger@nospam.com> wrote:
>In article <dg6la6$ab$1@reader1.panix.com>, docdwarf@panix.com ()
>wrote:

[snip]

>
>It isn't exactly IDCAMS, but you could run an additional step on the
>file with sort using the original 5,9 fields with sum=none and the XSUM
>dd.
>
>The count of records in the XSUM will be the number of errors that
>IDCAMS is going to throw.

Leaving aside the explicit statement of:

(from <http://publibz.boulder.ibm.com/cgi-...124143823&CASE=>)

--begin quoted text:

DFSORT does not support the XSUM parameter provided by a competitive sort product to write records deleted by SUM processing to a SORTXSUM DD data set. However, ICETOOL's SELECT operator can perform the same function as XSUM with FIELDS=NONE.

--end quoted text

... this adds another SORT step. Granted, nothing exceptional given a flat file input of 505 characters and 87K recs... but were I to do this I'd let the second SORT de-dupe the file and have done with it.

Thanks much!

DD

On Wed, 14 Sep 2005 03:36:27 GMT, Arnold Trembley <arnold.trembley@worldnet.att.net> enlightened us:
>
>docdwarf@panix.com wrote:
>
>Doc, if I understand you correctly, the key for this VSAM file begins
>in column 5, for a length of 9? Perhaps my experience is limited, but
>I find it is extremely rare to have a KSDS file with any key
>displacement other than zero. I do know a couple of examples in my
>shop and they tend to confuse the average two-year programmer.
>

Very common for variable length VSAM files where the key is the first field. Hence it would start in position 5.

>
>Now you've really got me. You're sorting on column 5 for 9
>ascending, and also sorting on column 115 for 8 descending? Well,
>Syncsort or DFSORT should have no problem with that, but correct me if
>I am wrong. Isn't it the case that a primary key in a KSDS file
>consists of a single field? I don't believe VSAM supports a primary
>key consisting of two fields separated by non-key data, but I'm ready
>to be proved wrong.
>

No, it does not, but it does support a primary key and an alternate or secondary key elsewhere in the file.

>
>I don't think showing the errors is really the solution you want.
>Frank Yaeger suggests elsewhere that your aim is to save the highest
>value of field 115-for-8 associated with the VSAM key of 5-for-9 (hmm,
>Social Security Number?).
>

If you want to know how many duplicate records there were and maybe what the keys were, then what else are you going to do?

>I am not familiar with ICETOOL because we have syncsort, but you might
>consider two SORT steps. Keep your first sort as-is, but without SUM
>FIELDS=NONE. In the second step, sort on the 5-for-9, use the EQUALS
>option to preserve original sequence, and use SUM FIELDS=NONE to
>remove duplicates on the 5-for-9. You might need to test ascending
>versus descending on the 115-for-8 field to get the correct result.
>
>EQUALS is the default on Syncsort. I have no experience with DFSORT
>for comparison. For other readers, the EQUALS option guarantees that
>records with the same sort key will have their original input sequence
>preserved in the output file. Some sort algorithms will randomize the
>sequence of records with duplicate sort keys.
>
>I think that might work, although sorting the file twice will make the
>job run longer. Of course, I don't know how much heartache and pain
>it takes to insert an additional job step at your shop.
>
>With kindest regards,

The Sort is not the problem. The IDCAMS REPRO of the sorted file to the KSDS file is the problem. It will not load duplicate keys and will stop when its error limit has been reached.

Regards,

      ////
     (o o)
-oOO--(_)--OOo-

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rich Cook

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Remove nospam to email me.

Steve