Performance problem with a VSAM file
|
|
| JRDuval 2005-04-26, 3:55 am |
| Hi.
Since last week we have been experiencing a performance problem when
writing to a VSAM file.
Context:
We have a big compressed master VSAM file (about 30 million records). We
read it, decompress it, and write the records to 50 sequential files,
depending on which record we read. For one of them we decided to change it
into a VSAM file for convenience. It is the biggest one, with more than
1 million records.
Problem:
When we finished testing we were amazed to see that it took 10 times more
CPU time to run than our other program, and about that much more I/O. The
only thing I could think of was to put back our older program and run
IDCAMS to convert it into a VSAM file. Is there a problem with COBOL when
writing big VSAM files, or is it just some kind of limitation when using
VSAM files in a COBOL program?
Is there another way to do it instead of using IDCAMS?
Thank you for your input and good night.
J R Duval
| |
| William M. Klein 2005-04-26, 3:55 am |
| You don't say what else (if anything) has changed, but there WAS a *significant*
change to CISIZE "stuff" with z/OS V1R3.
See:
http://www-03.ibm.com/support/techd...ndex/FLASH10206
and have your "performance" or "sysadmin" people see if this is relevant.
--
Bill Klein
wmklein <at> ix.netcom.com
| |
| Colin Campbell 2005-04-26, 3:55 am |
| I tested using VSAM for an application that we were going to write years
ago. It was compared against an IMS data base, a DB2 data base, and
good old sequential data sets (VB format).
Nothing came close to the QSAM data set for performance.
I asked a contractor who had worked at another place that used VSAM
extensively, and she said that it was her impression that the VSAM I/O
was generally done using an Assembler program. I seemed to find that
adding a record to a control interval (I hope I'm remembering the
terminology correctly) was inefficient in COBOL, but could be controlled
in Assembler. I expected to be able to add ten records to the data set
for one EXCP (physical I/O), but instead, I was getting two EXCPs for
each record, and one more when a "block" was actually written to the
VSAM data set.
So, if you need that big data set to be VSAM, you are probably on track
to create it as a QSAM data set, then use a utility to load the data
into a VSAM data set.
If it is just for some "convenience", I would forget using VSAM, as it
sounds as if you already have a working application.
| |
| Kelly Bert Manning 2005-04-26, 8:55 am |
|
"JRDuval" (jrduval@videotron.ca) writes:
>
> Since last week we have been experiencing a performance problem when
> writing to a VSAM file.
> Context:
> We have a big compressed master VSAM file (about 30 million records). We
> read it, decompress it, and write the records to 50 sequential files,
> depending on which record we read. For one of them we decided to change it
> into a VSAM file for convenience. It is the biggest one, with more than
> 1 million records.
ESDS, KSDS or RRDS? Were you using ACCESS IS SEQUENTIAL and OPEN OUTPUT?
>
> Problem:
> When we finished testing we were amazed to see that it took 10 times more
> CPU time to run than our other program, and about that much more I/O. The
> only thing I could think of was to put back our older program and run
> IDCAMS to convert it into a VSAM file. Is there a problem with COBOL when
> writing big VSAM files, or is it just some kind of limitation when using
> VSAM files in a COBOL program?
>
> Is there another way to do it instead of using IDCAMS?
Syncsort can write sorted output records to a VSAM dataset instead of a QSAM
dataset. I'd be surprised if DFSORT couldn't do the same.
Syncsort can also be used to split files. Again, I'd be surprised if DFSORT
did not also support this. You don't actually have to sort in order to split.
| |
| Kelly Bert Manning 2005-04-26, 3:55 pm |
|
Colin Campbell (cmcampb@adelphia.net) writes:
> I tested using VSAM for an application that we were going to write years
> ago. It was compared against an IMS data base, a DB2 data base, and
> good old sequential data sets (VB format).
>
> Nothing came close to the QSAM data set for performance.
IMS and DB2 use VSAM and OSAM, BSAM or QSAM, so there is no reason to
expect you could do any better than something that doesn't involve the
overhead of a DBMS. Only use a DB if you need to. I recall one FOCUS
application I converted to IMS where they were using a full function
hierarchical FOCUS DB for something that seemed like it could be done
by sorting a transaction file and merging it with an accum file. When
I tracked down the developer he agreed, but felt that doing random access
to a FOCUS DB would be an interesting way to do this. Apparently he had
no idea of the I/O and CPU cost of doing random update.
That said, you should have been able to get close to QSAM SAM-E performance
with IMS or DB2, at least for sequential processing.
Native VSAM sequential access should be similar to QSAM if you specify
the appropriate buffering, that is, a minimum of 2 tracks + 1 data CI
and enough index CIs to hold a CI from every level of your index.
DB2 has a sequential prefetch of 10 CIs at a time, which speeds things up.
IMS doesn't have anything comparable for online VSAM access, but HSSR
can be used to speed up offline sequential access to VSAM or OSAM.
IMS uses QSAM sequential access for online processing of OSAM DB
datasets. Online OSAM Sequential Buffering can usually give you average
IWait times of a fraction of a millisecond for most overlapped sequential
reads, reading 10 blocks with each read. The default is 4 sets of 10 blocks
in page-fixed storage.
> I asked a contractor who had worked at another place that used VSAM
> extensively, and she said that it was her impression that the VSAM I/O
> was generally done using an Assembler program. I seemed to find that
> adding a record to a control interval (I hope I'm remembering the
> terminology correctly) was inefficient in COBOL, but could be controlled
> in Assembler. I expected to be able to add ten records to the data set
> for one EXCP (physical I/O), but instead, I was getting two EXCP's for
> each record, and one more when a "block" was actually written to the
> VSAM data set.
>
You didn't try hard enough then. Were you doing skip sequential processing,
random processing, or a full sequential load or read?
> So, if you need that big data set to be VSAM, you are probably on track
> to create it as a QSAM data set, then use a utility to load the data
> into a VSAM data set.
>
> If it is just for some "convenience", I would forget using VSAM, as it
> sounds as if you already have a working application.
I've often found that sorting and converting to skip sequential processing
can tame IMS, DB2 and VSAM performance problems.
We had an IMS job that used to run past the end of the nightly batch
window. Requesting IMS Sequential Buffering for all IMS DBs got it to
report the total Random, Synchronous Sequential and Overlapped Sequential
reads for each IMS OSAM DB.
One was being processed sequentially and IMS activated OSAM SB for it.
A logically related IMS OSAM DB was being processed randomly, in the order
that the logical relationships were stored in the DB being processed
sequentially. An average IWait delay of 5 milliseconds multiplied by
millions of reads adds up to hours of IWait delay.
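To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. The read counts are made-up round figures, since the post only says "millions"; the 5 millisecond average is the one quoted above.

```python
def total_wait_hours(reads: int, avg_wait_ms: float) -> float:
    """Total time spent waiting, in hours, for `reads` synchronous reads."""
    seconds = reads * avg_wait_ms / 1000.0
    return seconds / 3600.0


# Hypothetical read counts, chosen only to show the scale of the delay.
for reads in (1_000_000, 5_000_000):
    hours = total_wait_hours(reads, 5.0)
    print(f"{reads:>9,} random reads x 5 ms = {hours:.2f} hours of wait")
```

Even one million random reads at that average cost well over an hour of pure wait time, which is why removing the random access pays off so heavily.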
I had the maintenance programmer write a program to extract all of the
possible segments that might be retrieved and sort them into physical
sequence for the other DB, using Syncsort with a VSAM KSDS as the SORTOUT.
This logical relationship used symbolic pointers, that is, IMS segment key
values, rather than IMS RBA pointers, so I used them as the final part of
the VSAM key, after the RBA of the root in the DB being processed
sequentially.
Doing that for a 5 million segment subset in a 2 Gbyte OSAM DB took 20 minutes.
Skip sequential processing it as VSAM while processing the sequential DB
took almost zero IWait time. A job which regularly ran from 10 pm to past 7 am
consistently ended before midnight after the change.
A job which had a similar random logical relationship headache that took
all weekend also sped up with this approach. It started out as a once a
year job, which was a nuisance, but not worth changing. When it was changed
to monthly we applied the same technique and got it running overnight
during the week instead of from Friday evening to Sunday afternoon.
| |
|
|
| Kelly Bert Manning 2005-04-26, 8:55 pm |
|
(yaeger@us.ibm.com) writes:
> Yes, DFSORT can write sorted output records to a VSAM data set.
>
> Yes, DFSORT can be used to split files in a variety of ways as discussed
> on the DFSORT website:
>
> http://www.ibm.com/servers/storage/...tmst01.html#t01
>
> Frank Yaeger - DFSORT Team (IBM) - yaeger@us.ibm.com
> Specialties: ICETOOL, IFTHEN, OVERLAY, Symbols, Migration
> => DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/
That is what I expected, but the site where I worked until the end of March
never had DFSORT during my 28 years working there.
This was a real pain when Syncsort was part of the picture for performance
problems with IMS utilities such as Prefix Resolution and IMS Change
Accumulation.
| |
| Lawrence Greenwald 2005-04-27, 3:55 am |
| Have you tried the 'AMP' JCL parameter? AMP goes on the DD statement
pointing to the VSAM file and is used for buffering, akin to the DCB BUFNO
parameter for sequential files. Using the right AMP parameters can speed up
operations tenfold! No program changes are required. Check out a VSAM
performance guide for more info; I know there's at least one IBM manual
that talks about it.
--LG
| |
| JRDuval 2005-04-27, 3:55 am |
| Thank you for the link.
I know for a fact that CISIZE is not explicitly coded when we create
our VSAM files.
I'm going to take a look at this. It can't hurt.
Thank you for your help.
J R Duval
| |
| JRDuval 2005-04-27, 3:55 am |
|
"Lawrence Greenwald" <lgreenwa@cts.com> wrote in message
news:lgreenwa-38B79E.17265226042005@chiapp18.algx.net...
>
> Have you tried the 'AMP' JCL parameter (AMP goes on the DD statement
> pointing to the VSAM file - AMP is used for buffering - - akin to the
> DCB BUFNO parameter for sequential files) - using the right AMP
> parameters can speed up operations 10 fold! No program changes are
> required. Check out a VSAM performance guide for more info - I know
> there's at least one IBM manual that talks about it.
>
> --LG
Yes, I've tried many combinations.
So far the best we could do is about twice as fast as it is now.
Maybe the settings are not right. I'll check out the manual if I can get one
at my site.
Thank you for your help.
J R Duval
| |
| JRDuval 2005-04-27, 3:55 am |
|
"Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
news:d4kthk$2q8$1@theodyn.ncf.ca...
>
> "JRDuval" (jrduval@videotron.ca) writes:
>
> ESDS, KSDS or RRDS? Were you using ACCESS IS SEQUENTIAL and OPEN OUTPUT?
I'm using a KSDS with ACCESS DYNAMIC and an OPEN OUTPUT statement.
>
> Syncsort can write sorted output records to a VSAM dataset instead of a
> QSAM dataset. I'd be surprised if DFSORT couldn't do the same.
Do you mean doing a sort inside the COBOL program, or using Syncsort outside
the program to sort the data and write it into the VSAM file?
>
> Syncsort can also be used to split files. Again, I'd be surprised if
> DFSORT did not also support this. You don't actually have to sort in order
> to split.
Thank you for your help
JR Duval
| |
| Kelly Bert Manning 2005-04-27, 8:55 am |
|
"JRDuval" (jrduval@videotron.ca) writes:
>
> I'm using a KSDS with ACCESS DYNAMIC and an OPEN OUTPUT statement.
Why are you using access dynamic to load this? For a load you should
be writing the records sequentially, in ascending order by key value.
I've never seen Random I/O be a way of avoiding the cost of sorting for
files of this size. Random processing is usually a way of doing even more
I/O and using more CPU.
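As a small illustration of "ascending order by key value", the Python sketch below checks a stream of keys before a load and reports the first one that is out of sequence. It is only a model: the key values are invented, and Python compares strings by code point, which is not the EBCDIC collating sequence a mainframe sort would use.

```python
from typing import Iterable, Optional


def first_out_of_sequence(keys: Iterable[str]) -> Optional[int]:
    """Return the 0-based position of the first key that is not strictly
    greater than the key before it, or None if every key is in sequence."""
    previous = None
    for position, key in enumerate(keys):
        if previous is not None and key <= previous:
            return position
        previous = key
    return None


# Hypothetical keys, for illustration only.
print(first_out_of_sequence(["A001", "A002", "B010"]))  # None
print(first_out_of_sequence(["A001", "B010", "A002"]))  # 2
```

A duplicate key counts as out of sequence here, because a sequential load needs strictly ascending keys.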
Have you read
1.10.3.2.1 Opening an empty file
in the Enterprise COBOL Programmer's Guide?
http://publibz.boulder.ibm.com/cgi-...=20040220035836
>
> Do you mean to do a sort into the Cobol program or, use syncsort outside
> the program sort it and put it back into the VSAM file.
I meant Sort (using whichever sort is available at your site) outside the
program and have the VSAM KSDS as the output from the sort.
Check the references to the sort wizardry earlier in this thread.
I would never recommend imbedding a sort inside a program except as a last
hope attempt to make a long running process fit into a processing window.
| |
| JRDuval 2005-05-01, 3:55 pm |
|
"Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
news:d4nhqf$j3i$1@theodyn.ncf.ca...
>
> "JRDuval" (jrduval@videotron.ca) writes:
>
> Why are you using access dynamic to load this? For a load you should
> be writing the records sequentially, in ascending order by key value.
>
> I've never seen Random I/O be a way of avoiding the cost of sorting for
> files of this size. Random processing is usually a way of doing even more
> I/O and using more CPU.
The thing with this program is that it hasn't been maintained for a while
(about 5 years). The other VSAM files it created were not that big, and
performance for that job was acceptable. So to add the new VSAM file we just
cut and pasted an old file definition into the existing program and changed
the file name. If it works, why bother? That was our mistake. We'll be more
careful next time.
>
> Have you read
> 1.10.3.2.1 Opening an empty file
> in the Enterprise COBOL Programmer's Guide?
>
>
http://publibz.boulder.ibm.com/cgi-...=20040220035836
>
> I meant Sort (using whichever sort is available at your site) outside the
> program and have the VSAM KSDS as the output from the sort.
>
> Check the references to the sort wizardry earlier in this thread.
>
> I would never recommend imbedding a sort inside a program except as a last
> hope attempt to make a long running process fit into a processing window.
At our site we have many programs that use an internal sort, which I do not
approve of. But if we took the internal sort out of those programs, we would
have to redo almost the entire programs. The original logic was not designed
around an external sort.
Thank you for your help. It was very much appreciated.
J.R Duval
| |
| Kelly Bert Manning 2005-05-01, 3:55 pm |
|
"JRDuval" (jrduval@videotron.ca) writes:
>
> The thing with this program is that it hasn't been maintained for a while
> (about 5 years). The other VSAM files it created were not that big, and
> performance for that job was acceptable. So to add the new VSAM file we
> just cut and pasted an old file definition into the existing program and
> changed the file name. If it works, why bother? That was our mistake.
> We'll be more careful next time.
>
....
>
> At our site we have many programs that use an internal sort, which I do
> not approve of. But if we took the internal sort out of those programs, we
> would have to redo almost the entire programs. The original logic was not
> designed around an external sort.
>
> Thank you for your help. It was very much appreciated.
So did the performance of the original program improve with coding for a
sequential load?
I didn't mean to suggest that anyone should start a project to find and
replace all internal sorts.
"if it isn't broken don't fix it"
It is more a case of something to avoid in new programs, or to consider
changing in a program which fails often, or which is being opened up for
major revision.
I find having stand alone sort utilities on IBM mainframes very convenient.
When I worked on Honeywell GCOS applications in the late 1970s the only
simple way to do a sort was to invoke it from a COBOL-74 program. If I recall
correctly the only examples in their sort utility were invoking it from
COBOL-74 and from GMACS assembler. Ironically we were never able to get
Honeywell PL/I programs to compile, even though their COBOL-74 compiler was
apparently written in their PL/I and would do PUT SKIPs when it aborted.
| |
|
|
| Andreas Lerch 2005-05-02, 3:55 pm |
|
Hello,
In MVS there is a subsystem named BLSR (Batch Local Shared Resources)
that can handle VSAM reads better than any application program. It can
be activated through JCL alone, so it is a simple change.
The other point is how the output file is selected: can you determine it
with a binary-style decision? That is better than asking an average of 25
questions to fill 50 files.
Evaluate True
    when file equal 01  write file01
    when file equal 02  write file02
    when file equal 11  write file11
    when file equal 21  write file21
    ...... and so on
End-Evaluate
can be replaced by
Evaluate True
    when filebyte1 equal 0
        evaluate true
            when filebyte2 equal 1  write file01
            when filebyte2 equal 2  write file02
            ...... and so on
        end-evaluate
    when filebyte1 equal 1
        evaluate true
            when filebyte2 equal 1  write file11
            ...... and so on
        end-evaluate
    ...... and so on
End-Evaluate
That is 2.5 + 5 questions on average if the file number runs from 0 to 49.
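The question counts can be checked with a small model. The sketch below is Python rather than COBOL and assumes two things the post does not state: every file number is equally likely, and the k-th WHEN in a chain costs k comparisons. Under those assumptions the exact averages come out slightly above the rounded figures of 25 and 2.5 + 5.

```python
def avg_flat(branches: int) -> float:
    """Average comparisons for one chain of `branches` WHEN clauses,
    assuming each branch is equally likely and the k-th costs k tests."""
    return sum(range(1, branches + 1)) / branches


def avg_nested(groups: int, per_group: int) -> float:
    """Average comparisons when the first digit is tested in an outer
    chain and the second digit in an inner chain."""
    return avg_flat(groups) + avg_flat(per_group)


print(avg_flat(50))       # one chain over file numbers 0..49
print(avg_nested(5, 10))  # first digit 0..4, then second digit 0..9
```

With 50 files this gives 25.5 comparisons for the single chain and 8.5 for the two-level version, so the nested form does roughly a third of the work per record.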
Another point is the sequence READ INTO, WRITE FROM. If you can do this:
READ FD-File - WRITE FROM input-definition, you will save one move.
READ INTO --> takes the next buffer offset and makes a move into the
storage area
WRITE FROM --> takes the next buffer address and makes a move from
storage to the buffer
Maybe there are a lot of solutions!
Have a nice day
Andreas Lerch
| |
| Clark F. Morris, Jr. 2005-05-02, 8:55 pm |
| JRDuval wrote:
> "Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
> news:d4nhqf$j3i$1@theodyn.ncf.ca...
>
> The thing with this program is that it hasn't been maintained for a while
> (about 5 years). The other VSAM files it created were not that big, and
> performance for that job was acceptable. So to add the new VSAM file we
> just cut and pasted an old file definition into the existing program and
> changed the file name. If it works, why bother? That was our mistake.
> We'll be more careful next time.
>
> At our site we have many programs that use an internal sort, which I do
> not approve of. But if we took the internal sort out of those programs, we
> would have to redo almost the entire programs. The original logic was not
> designed around an external sort.
>
> Thank you for your help. It was very much appreciated.
>
> J.R Duval
>
Internal sorts are fine. As someone else pointed out, on modern
hardware for most circumstances, memory is not a problem (unless your
application is running 24 bit for some peculiar reason). Internal sorts
at least have to go through the installation's change control for
programs and on the IBM z series are clearer than most of the utility
sorts. SORT FIELDS=(12,5,PD,A,24,32,CH,A) doesn't really tell me what
fields are being sorted. If you have SYNCSORT on HP-UX and use the
COBOL copybook option for the record description, that is a different
story.
In regard to the problem of the load, the best way is to describe the
file as ORGANIZATION INDEXED ACCESS SEQUENTIAL and open it OUTPUT. As
someone else posted, the DD statement (I am assuming MVS, OS/390 or z/OS)
should have an AMP parameter with BUFND=2*(data-CIs-per-track+1) and
BUFNI=(1+index-set) where the index set is the number of index CIs
needed to address each of the control areas. If you have 8192 byte data
CIs you get 6 per 3390 track. Do not have a data CI (control interval
equivalent of a block) less than 4096 bytes. Make certain the size of
your index CIs is large enough to contain the keys for all of the data
CIs in the control area (1 cylinder) and the larger the key size, the
more attention must be paid to the CI size since keys may not compress
as well as the IBM assumption says they will. Also watch the number of
index levels. Allocate in cylinders or make certain the secondary
quantity for number of records will cause a cylinder to be allocated.
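As a rough illustration of the buffer arithmetic above, the Python sketch below works the two formulas through for the 8192-byte case. The index-set count of 3 is a made-up value; the real number depends on how the cluster was defined and would normally be read from a LISTCAT. The AMP string it prints is only meant to show where the two numbers end up on the DD statement.

```python
def bufnd(data_cis_per_track: int) -> int:
    """Data buffers: 2 * (data CIs per track + 1)."""
    return 2 * (data_cis_per_track + 1)


def bufni(index_set_cis: int) -> int:
    """Index buffers: 1 + the number of index-set CIs."""
    return 1 + index_set_cis


def amp_parameter(data_cis_per_track: int, index_set_cis: int) -> str:
    """Format the two buffer counts as a JCL AMP parameter."""
    return (f"AMP=('BUFND={bufnd(data_cis_per_track)},"
            f"BUFNI={bufni(index_set_cis)}')")


# 8192-byte data CIs give 6 per 3390 track; 3 index-set CIs is hypothetical.
print(amp_parameter(6, 3))
```

For 6 data CIs per track the formula gives BUFND=14, so a sequential load can keep about two tracks of data CIs in buffers at a time.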
Use the AMP parameter on all DD statements for the VSAM files with the
following exceptions: 1) Files that are input to or output from SYNCSORT
which left to its own devices will allocate an optimal amount (the same
may be true of the other SORT products), 2) Files that are defined to
Systems managed storage for system determined buffering or otherwise
fully controlled by a storage management product and 3) possibly files
that are in USING and GIVING statements of the SORT verb since they may
be taking advantage of the SORT optimizations.
I realize that the above is arcana that must boggle the minds of those
using non z series platforms, and I believe that the system should be
handling much of this. However, with tuning like this, a system designed
in a world of vastly different constraints can perform quite well in
today's world.
Also those on other platforms note that the INDEXED file structure -
VSAM on the IBM z series (and predecessors) is an operating system
component for which COBOL only uses a subset of the facilities
available. VSAM is accessible by most of the report programs (DYL280
now Vision something or another, Easytrieve, etc.) and by appropriate
system utilities. PL/1 definitely has VSAM capability, I assume that
C/C++ has extensions for VSAM and Fortran may know what to do with VSAM.
Indeed one of the interesting decisions when going to a Unix or
Windows platform is whether to convert all of the VSAM to the relational
database being used or get the add on INDEXED file system for the
platform. The conversion to the relational database is the better long
term strategy and probably saves software license cost but getting the
add-on INDEXED file system will cut the amount of change.
| |
| Kelly Bert Manning 2005-05-03, 3:55 pm |
|
"Clark F. Morris, Jr." (cfmtech@istar.ca) writes:
> Internal sorts are fine. As someone else pointed out, on modern
> hardware for most circumstances, memory is not a problem (unless your
> application is running 24 bit for some peculiar reason). Internal sorts
Or the sort product itself, as I mentioned earlier. Deciding not to use
any XA memory or hiperspace for Gbyte sorts is not an ideal way for a sort
product to establish a reputation for consistent, reliable, performance.
If I recall correctly the BMSG output showed Syncsort was trying to sort
1.5 Gbytes with less than 250K of 24 bit memory and nothing else. That was
all that IMS Prefix Resolution left for it. Going back through our late
completion reports for a while and checking step EXCP reports on the late
jobs I noticed that Syncsort was occasionally doing this for standalone sorts,
they just didn't slow down quite as much because they had more 24 bit
memory to work with.
I'm a fan of Keep it Simple. Simple processes are usually easier to debug
and get working reliably. If they are handling huge volumes of data then
passing records directly to the sort or receiving them directly from the
sort can be a way of speeding things up, but for small time sorts it makes
things more complex to restart if they fail.
A program with an imbedded sort is at least 2 separate processes, sometimes
more. I've seen them go even farther, concatenating several different
processing phases into a single program with sorts or merges between the
phases.
When you are processing hundreds of millions of records taking up gigabytes
of space that may be worth the extra complexity, but for smaller processes
it probably isn't worth it.
If they need to be combined to reduce I/O and elapsed time then that
should be done, but look at the numbers first, consider the failure modes
and rerun and restart processing if it fails.
| |
| William M. Klein 2005-05-03, 3:55 pm |
| I am more familiar with DFSort than SyncSort - but how recent was your "bad"
SyncSort internal sort problem? Were you using the FASTSRT compiler option?
I would CERTAINLY not recommend using an old compiler (pre-FASTSRT) or an old
release of any SORT product.
Most (not all) of the "internal sort" problems that I have heard of are "well,
we had this problem 10 years ago, so we haven't used it since".
As I indicated in another reply, there certainly ARE cases where an external
sort better "fits" an application design, but the same can be said (IMHO) of
internal sorts. If you have a "3 step job" that simply creates a temporary
dataset, SORTs it, and then processes the output of the SORT, then, I think,
this belongs as an internal SORT.
--
Bill Klein
wmklein <at> ix.netcom.com
| |
| Clark F. Morris, Jr. 2005-05-04, 8:55 pm |
| JRDuval wrote:
> "Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
> news:d4nhqf$j3i$1@theodyn.ncf.ca...
>
> The thing with this program is that it hasn't been maintained for a while
> (about 5 years). The other VSAM files it created were not that big, and
> performance for that job was acceptable. So to add the new VSAM file we
> just cut and pasted an old file definition into the existing program and
> changed the file name. If it works, why bother? That was our mistake.
> We'll be more careful next time.
>
> At our site we have many programs that use an internal sort, which I do
> not approve of. But if we took the internal sort out of those programs, we
> would have to redo almost the entire programs. The original logic was not
> designed around an external sort.
>
> Thank you for your help. It was very much appreciated.
>
> J.R Duval
>
Internal sorts are fine. As someone else pointed out, on modern
hardware for most circumstances, memory is not a problem (unless your
application is running 24 bit for some peculiar reason). Internal sorts
at least have to go through the installation's change control for
programs and on the IBM z series are clearer than most of the utility
sorts. SORT FIELDS=(12,5,PD,A,24,32,CH,A) doesn't really tell me what
fields are being sorted. If you have SYNCSORT on HP-UX and use the
COBOL copybook option for the record description, that is a different
story.
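To make the point about readability concrete, here is a minimal sketch in Python (not part of any sort product) that expands the control statement above into its position, length, format and order groups. The function name `describe_sort_fields` and the two lookup tables are hypothetical, and only the two format codes that appear in the example are spelled out.

```python
import re

# Only the codes used in the example statement are expanded here;
# sort products accept many more format codes than these two.
FORMATS = {"PD": "packed decimal", "CH": "character"}
ORDERS = {"A": "ascending", "D": "descending"}


def describe_sort_fields(statement):
    """Split a SORT FIELDS=(...) statement into one dict per sort key."""
    match = re.search(r"FIELDS=\(([^)]*)\)", statement)
    if match is None:
        raise ValueError("no FIELDS=(...) operand found")
    parts = [p.strip() for p in match.group(1).split(",")]
    if len(parts) % 4 != 0:
        raise ValueError("expected groups of start,length,format,order")
    keys = []
    for i in range(0, len(parts), 4):
        start, length, fmt, order = parts[i:i + 4]
        start, length = int(start), int(length)
        keys.append({
            "start": start,                 # 1-based byte position in the record
            "end": start + length - 1,      # last byte covered by the key
            "length": length,
            "format": FORMATS.get(fmt, fmt),
            "order": ORDERS.get(order, order),
        })
    return keys


keys = describe_sort_fields("SORT FIELDS=(12,5,PD,A,24,32,CH,A)")
for key in keys:
    print(key)
```

Even decoded, the result only says "bytes 12 to 16, packed decimal" and "bytes 24 to 55, character"; it still takes the record layout to know which fields those are, which is where a data name in a COBOL SORT statement has the advantage.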
In regard to the problem of the load, the best way is to describe the
file as ORGANIZATION INDEXED ACCESS SEQUENTIAL and open it OUTPUT. As
someone else posted, the DD statement (I am assuming MVS, OS/390 or z/OS)
should have an AMP parameter with BUFND=2*(data-CIs-per-track+1) and
BUFNI=(1+index-set), where the index set is the number of index CIs
needed to address each of the control areas. If you have 8192-byte data
CIs you get 6 per 3390 track. Do not have a data CI (control interval, the
equivalent of a block) of less than 4096 bytes. Make certain the size of
your index CIs is large enough to contain the keys for all of the data
CIs in the control area (1 cylinder); the larger the key size, the
more attention must be paid to the CI size, since keys may not compress
as well as the IBM assumption says they will. Also watch the number of
index levels. Allocate in cylinders, or make certain the secondary
quantity for number of records will cause a cylinder to be allocated.
Use the AMP parameter on all DD statements for the VSAM files, with the
following exceptions: 1) files that are input to or output from SYNCSORT,
which left to its own devices will allocate an optimal amount (the same
may be true of the other SORT products); 2) files that are defined to
system-managed storage for system-determined buffering or otherwise
fully controlled by a storage management product; and 3) possibly files
that are in USING and GIVING statements of the SORT verb, since they may
be taking advantage of the SORT optimizations.
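To put numbers on the two buffer formulas, here is a minimal sketch in Python. The function names are hypothetical, the 6 data CIs per track is the 8192-byte figure quoted above, and the index-set count of 2 is purely an illustrative assumption; the real value depends on how the cluster is defined.

```python
def load_buffers(data_cis_per_track, index_set_cis):
    """Apply the formulas from the post: BUFND=2*(CIs-per-track+1), BUFNI=1+index-set."""
    bufnd = 2 * (data_cis_per_track + 1)
    bufni = 1 + index_set_cis
    return bufnd, bufni


def amp_parameter(bufnd, bufni):
    """Format the values as an AMP operand for a DD statement."""
    return "AMP=('BUFND={},BUFNI={}')".format(bufnd, bufni)


# 8192-byte data CIs on a 3390: 6 per track (from the post).
# index_set_cis=2 is an assumed value for illustration only.
bufnd, bufni = load_buffers(data_cis_per_track=6, index_set_cis=2)
amp = amp_parameter(bufnd, bufni)
data_buffer_bytes = bufnd * 8192

print(amp)
print("data buffer storage:", data_buffer_bytes, "bytes")
```

Under those assumptions the DD statement would carry BUFND=14 and BUFNI=3, and the 14 data buffers cost 114,688 bytes of storage, which is small next to the I/O they can save on a load of a million records.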
I realize that the above is arcana that must boggle the minds of those
using non-z-series platforms, and I believe that the system should be
handling much of this. However, with tuning like this a system designed
in a world of vastly different constraints can perform quite well in
today's world.
Also, those on other platforms should note that the INDEXED file structure
(VSAM) on the IBM z series (and predecessors) is an operating system
component, of which COBOL only uses a subset of the facilities
available. VSAM is accessible by most of the report programs (DYL280,
now Vision something or other, Easytrieve, etc.) and by appropriate
system utilities. PL/I definitely has VSAM capability, I assume that
C/C++ has extensions for VSAM, and Fortran may know what to do with VSAM.
Indeed, one of the interesting decisions when going to a Unix or
Windows platform is whether to convert all of the VSAM to the relational
database being used or to get the add-on INDEXED file system for the
platform. The conversion to the relational database is the better long-term
strategy and probably saves software license cost, but getting the
add-on INDEXED file system will cut the amount of change.
| |
| William M. Klein 2005-05-05, 3:55 pm |
| I am more familiar with DFSort than SyncSort - but how recent was your "bad"
SyncSort internal sort problem? Were you using the FASTSRT compiler option?
I would CERTAINLY not recommend using an old compiler (pre-FASTSRT) or an old
release of any SORT product.
Most (not all) of the "internal sort" problems that I have heard of are "well,
we had this problem 10 years ago, so we haven't used it since".
As I indicated in another reply, there certainly ARE cases where an external
sort better "fits" an application design - but the same can be said (IMHO) of
internal sorts. If you have a "3 step job" - one that simply creates a temporary
dataset, SORTs it, and then processes the output of the SORT - then, I think, this
belongs as an internal SORT.
--
Bill Klein
wmklein <at> ix.netcom.com
"Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
news:d584vc$hdg$1@theodyn.ncf.ca...
>
> "Clark F. Morris, Jr." (cfmtech@istar.ca) writes:
>
>
> Or the sort product itself, as I mentioned earlier. Deciding not to use
> any XA memory or hiperspace for Gbyte sorts is not an ideal way for a sort
> product to establish a reputation for consistent, reliable, performance.
> If I recall correctly the BMSG output showed Syncsort was trying to sort
> 1.5 Gbytes with less than 250K of 24 bit memory and nothing else. That was
> all that IMS Prefix Resolution left for it. Going back through our late
> completion reports for a while and checking step EXCP reports on the late
> jobs I noticed that Syncsort was occasionally doing this for standalone sorts;
> they just didn't slow down quite as much because they had more 24 bit
> memory to work with.
>
> I'm a fan of Keep it Simple. Simple processes are usually easier to debug
> and get working reliably. If they are handling huge volumes of data then
> passing records directly to the sort or receiving them directly from the
> sort can be a way of speeding things up, but for small-time sorts it makes
> things more complex to restart if they fail.
>
> A program with an imbedded sort is at least 2 separate processes, sometimes
> more. I've seen them go even farther, concatenating several different
> processing phases into a single program with sorts or merges between the
> phases.
>
> When you are processing hundreds of millions of records taking up gigabytes
> of space that may be worth the extra complexity, but for smaller processes
> it probably isn't worth it.
>
> If they need to be combined to reduce I/O and elapsed time then that
> should be done, but look at the numbers first, consider the failure modes
> and rerun and restart processing if it fails.
| |
| JRDuval 2005-05-06, 3:55 am |
|
"Kelly Bert Manning" <bo774@FreeNet.Carleton.CA> wrote in message
news:d4nhqf$j3i$1@theodyn.ncf.ca...
>
> "JRDuval" (jrduval@videotron.ca) writes:
>
> Why are you using access dynamic to load this? For a load you should
> be writing the records sequentially, in ascending order by key value.
>
> I've never seen Random I/O be a way of avoiding the cost of sorting for
> files of this size. Random processing is usually a way of doing even more
> I/O and using more CPU.
The thing with this program is that it hasn't been maintained for a while
(about 5 years). The other VSAM files created were not that big and performance for
that job was acceptable. So to add the new VSAM file we just cut and pasted
an old file definition into the existing program and changed the file name.
If it works, why bother? That was our mistake. We'll be more careful next
time.
>
> Have you read
> 1.10.3.2.1 Opening an empty file
> in the Enterprise COBOL Programmer's Guide?
>
>
http://publibz.boulder.ibm.com/cgi-...=20040220035836
outside
>
> I meant Sort (using whichever sort is available at your site) outside the
> program and have the VSAM KSDS as the output from the sort.
>
> Check the references to the sort wizardry earlier in this thread.
>
> I would never recommend imbedding a sort inside a program except as a last
> hope attempt to make a long running process fit into a processing window.
At our site we have many programs that use internal sorts, which I do not
approve of. But if we took the internal sorts out of those programs, we would have
to redo almost the entire programs. The original logic was not designed for
an external sort.
Thank you for your help. It was very much appreciated.
J.R Duval
| |
| Andreas Lerch 2005-05-06, 3:55 am |
|
On 26.04.05, 02:17:13, "JRDuval" <jrduval@videotron.ca> wrote on the
topic "Performance problem with a VSAM files":
> Hi.
> Since last week we're experiencing a performance problem when writing to a
> VSAM file.
> Context:
> We have a big compressed master VSAM file (about 30 million records) that we
> read and decompress, writing it into 50 sequential files depending on
> which record we read. For one of them we decided to change it into a VSAM
> file for convenience. This is the biggest one, with more than 1 million
> records.
> Problem:
> For some reason, when we finished testing we were amazed to see that it took
> 10 times more CPU time to run than our other program, and that much more I/O. The
> only thing that I could think of was to put back our older program and run
> an IDCAMS to convert it into a VSAM file. Is there a problem with Cobol when
> writing a big VSAM file, or is it just some kind of limitation when using VSAM
> files in a Cobol program?
> Is there another way to do it instead of using IDCAMS?
> Thank you for your input and good night..
Hello
In MVS there is a subsystem named BLSR (Batch Local Shared Resources)
which can handle VSAM reads better than any application program. It can
be activated in the JCL alone; it is a simple change.
The other point is how the output file is selected: can you determine it
in a binary fashion? That is better than asking an average of 25 questions
to choose among 50 files.
Evaluate True
  when file equal 01 write file01
  when file equal 02 write file02
  when file equal 11 write file11
  when file equal 21 write file21
  ...... and so on
End-Evaluate
can be replaced by
Evaluate True
  when filebyte1 equal 0
    evaluate true
      when filebyte2 equal 1 write file01
      when filebyte2 equal 2 write file02
    end-evaluate
  when filebyte1 equal 1
    evaluate true
      when filebyte2 equal 1 write file11
    end-evaluate
  ...... and so on
End-Evaluate
That is an average of about 2.5 + 5 questions if the file number runs from 0 to 49.
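The arithmetic behind that estimate can be checked with a minimal sketch in Python. The function names are hypothetical, and the model assumes that every file number is equally likely and that WHEN clauses are tested top to bottom, with the matching test counted as well.

```python
def average_comparisons_flat(n_choices):
    """Average number of WHEN tests when choice i (1-based) costs i tests."""
    return sum(range(1, n_choices + 1)) / n_choices


def average_comparisons_nested(n_first, n_second):
    """Two-level EVALUATE: test the first digit, then the second digit."""
    return average_comparisons_flat(n_first) + average_comparisons_flat(n_second)


flat = average_comparisons_flat(50)          # one EVALUATE with 50 WHEN clauses
nested = average_comparisons_nested(5, 10)   # first digit 0-4, second digit 0-9

print("flat:", flat)
print("nested:", nested)
```

With these assumptions the flat form averages 25.5 tests per record and the nested form 8.5, close to the rough 2.5 + 5 figure. If the records are not evenly spread over the files, putting the most frequent file numbers first in a flat EVALUATE changes the picture, so the real distribution is worth measuring first.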
Another point is the READ INTO / WRITE FROM sequence. If you can do a plain
READ of the FD file followed by WRITE FROM the input record definition, you will save one move.
READ INTO --> takes the next buffer offset and moves the record into the
storage area
WRITE FROM --> takes the next buffer address and moves the record from
storage into the buffer
Maybe; there are a lot of possible solutions!
Have a nice day
Andreas Lerch
| |
| Kelly Bert Manning 2005-05-06, 3:55 pm |
|
"Clark F. Morris, Jr." (cfmtech@istar.ca) writes:
> Internal sorts are fine. As someone else pointed out, on modern
> hardware for most circumstances, memory is not a problem (unless your
> application is running 24 bit for some peculiar reason). Internal sorts
Or the sort product itself, as I mentioned earlier. Deciding not to use
any XA memory or hiperspace for Gbyte sorts is not an ideal way for a sort
product to establish a reputation for consistent, reliable, performance.
If I recall correctly the BMSG output showed Syncsort was trying to sort
1.5 Gbytes with less than 250K of 24 bit memory and nothing else. That was
all that IMS Prefix Resolution left for it. Going back through our late
completion reports for a while and checking step EXCP reports on the late
jobs I noticed that Syncsort was occasionally doing this for standalone sorts;
they just didn't slow down quite as much because they had more 24 bit
memory to work with.
I'm a fan of Keep it Simple. Simple processes are usually easier to debug
and get working reliably. If they are handling huge volumes of data then
passing records directly to the sort or receiving them directly from the
sort can be a way of speeding things up, but for small-time sorts it makes
things more complex to restart if they fail.
A program with an imbedded sort is at least 2 separate processes, sometimes
more. I've seen them go even farther, concatenating several different
processing phases into a single program with sorts or merges between the
phases.
When you are processing hundreds of millions of records taking up gigabytes
of space that may be worth the extra complexity, but for smaller processes
it probably isn't worth it.
If they need to be combined to reduce I/O and elapsed time then that
should be done, but look at the numbers first, consider the failure modes
and rerun and restart processing if it fails.
|
|
|
|
|