Code Comments

Need help on cobol

I have two physical sequential files, both sorted by CUST ID.

File1 contains 10,000 records.
File2 contains 22 million records. (For the same CUST ID, there are
multiple records.)

I need to compare the File1 CUST ID with the File2 CUST ID and write
the matching rows to an output file.

I am considering two possible solutions:

1. Fetch each record from File1 and compare it with File2
sequentially until the CUST ID in File1 is greater than the File2
CUST ID. When a match is found, write an output record.

2. Put all 22 million rows in a table and use SEARCH ALL for
each record in File1.

I want to know which method is preferable.


Questions:

1. If I store 22 million records in a table declaration, how much
storage is needed? Is it OK to use this method?

2. Sequential processing is taking a very, very long time.

If there are any other methods, let me know.

Please suggest your opinions.

Your help is appreciated.

Thanks,


florence
08-05-06 11:55 PM


Re: Need help on cobol
In my opinion, this kind of problem is better solved by a sequential
match-merge process.  This is a well-known, reliable, and efficient
batch processing technique.

You don't say which COBOL compiler you are using, what operating
environment this will run in, or whether either file is on tape
rather than disk. It would also help to know the record length of
each file.

Few COBOL environments will be able to support a working-storage table
containing 22 million records.  If we assume each record is 80 bytes
long, the working-storage table would occupy 80 * 22 million bytes or
about 1.76 gigabytes of memory.  And loading an in-memory table still
requires you to read every record in the larger file.

Some database products may allow you to allocate a database table and
load it, but not in my limited experience with DB2.

With kindest regards,




--
http://arnold.trembley.home.att.net/


Arnold Trembley
08-05-06 11:55 PM


Re: Need help on cobol
Thanks Arnold,

I am working on IBM mainframes under z/OS. Merging is not possible
because for each matching record I need to do some calculations and
write output. Both files are on 3390 disk.

Thanks in advance,





florence
08-05-06 11:55 PM




Re: Need help on cobol
florence wrote:

> I am working on IBM mainframes under z/OS. Merging is not possible
> because for each matching record I need to do some calculations and
> write output. Both files are on 3390 disk.

file1 LRECL = 20
file2 LRECL = 120

Output record length = 200.
I hope this helps.

florence
08-05-06 11:55 PM




Re: Need help on cobol
florence wrote:

> I am working on IBM mainframes under z/OS. Merging is not possible
> because for each matching record I need to do some calculations and
> write output. Both files are on 3390 disk.

What part of "some calculations and write output" prevents standard
two-file merge logic from being used?

The whole point of a merge from two files is that you wind up at some
point in the program with each record matching. That is where you would
then calculate and output.


Richard
08-05-06 11:55 PM


Re: Need help on cobol
florence wrote:
> I have two physical sequential files, both sorted by CUST ID.
>
>     File1 contains 10,000 records.
>     File2 contains 22 million records. (For the same CUST ID, there are
> multiple records.)
>
> I need to compare the File1 CUST ID with the File2 CUST ID and write
> the matching rows to an output file.

You haven't specified whether it is the file1 matching records or the
file2 matching records that are written to output, or both.

Given custids of:

File1:  B D E E F ...

File2: A(1) A(2) B(1) C(1) D(1) D(2) E(1) E(2) E(3) F(1) ...

where (n) numbers the records for a custid, so there are 2 A records
and 3 E records.

Which records will be output? Those from file1? Those from file2?
Both? If there are two records with the same custid in file1 (is this
possible?), do you need to output all the matching file2 records for
each, duplicating the output?

> I am considering two possible solutions:
>
> 1. Fetch each record from File1 and compare it with File2
> sequentially until the CUST ID in File1 is greater than the File2
> CUST ID. When a match is found, write an output record.

You imply that it would be necessary to start again at the beginning of
file2 for each record in file1. You said that the files are sorted by
CustId. For each record in file1 it is only necessary to read forward
in file2, because all the records already read from file2 must have a
lower CustId than the current file1 CustId. That is the nature of their
being sorted.

> 2. Put all 22 million rows in a table and use SEARCH ALL for
> each record in File1.

SEARCH ALL does not give you 'all' the records that match. It returns
only one; it may use a binary chop search (or any other method), and
the one that it finds need not be the first with that key. That is, it
might 'search all' of the table when searching.
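To illustrate the point about duplicates, here is a minimal, self-contained sketch. It is not from the thread: the table reuses the sample file2 keys given above, the data and paragraph names are hypothetical, and widening the hit to the first and last duplicate is only one possible workaround, assuming the table fits in storage at all.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Ten sample keys in ascending order, with duplicates.
       01  KEY-DATA            PIC X(10) VALUE "AABCDDEEEF".
       01  KEY-TABLE REDEFINES KEY-DATA.
           05  TBL-ENTRY       OCCURS 10 TIMES
                               ASCENDING KEY IS TBL-KEY
                               INDEXED BY TBL-IDX.
               10  TBL-KEY     PIC X.
       01  WANTED-KEY          PIC X     VALUE "E".
       01  FIRST-POS           PIC 9(4)  VALUE 0.
       01  LAST-POS            PIC 9(4)  VALUE 0.
       01  DONE-FLAG           PIC X     VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           SEARCH ALL TBL-ENTRY
               AT END
                   DISPLAY "KEY NOT FOUND"
               WHEN TBL-KEY (TBL-IDX) = WANTED-KEY
      *            TBL-IDX now points at one matching entry, but
      *            not necessarily the first one with this key.
                   SET FIRST-POS TO TBL-IDX
                   SET LAST-POS  TO TBL-IDX
                   PERFORM FIND-FIRST
                   PERFORM FIND-LAST
                   DISPLAY "ENTRIES " FIRST-POS " THRU " LAST-POS
           END-SEARCH
           STOP RUN.
      * Step backward while the previous entry has the same key.
       FIND-FIRST.
           MOVE "N" TO DONE-FLAG
           PERFORM UNTIL DONE-FLAG = "Y"
               IF FIRST-POS = 1
                   MOVE "Y" TO DONE-FLAG
               ELSE
                   IF TBL-KEY (FIRST-POS - 1) = WANTED-KEY
                       SUBTRACT 1 FROM FIRST-POS
                   ELSE
                       MOVE "Y" TO DONE-FLAG
                   END-IF
               END-IF
           END-PERFORM.
      * Step forward while the next entry has the same key.
       FIND-LAST.
           MOVE "N" TO DONE-FLAG
           PERFORM UNTIL DONE-FLAG = "Y"
               IF LAST-POS = 10
                   MOVE "Y" TO DONE-FLAG
               ELSE
                   IF TBL-KEY (LAST-POS + 1) = WANTED-KEY
                       ADD 1 TO LAST-POS
                   ELSE
                       MOVE "Y" TO DONE-FLAG
                   END-IF
               END-IF
           END-PERFORM.
```

Which of the three E entries SEARCH ALL lands on is up to the implementation, but the range displayed should be entries 7 through 9 either way. The extra stepping is exactly the work a plain sequential match avoids.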

> I want to know which method is preferable.

Neither, probably.

> Questions:
>
> 1. If I store 22 million records in a table declaration, how much
> storage is needed? Is it OK to use this method?

Simple: 22,000,000 x table item size. Are you allowed to use a gigabyte
of RAM or so? Note that a SEARCH ALL (which is unlikely to be what you
want anyway) will potentially access all parts of the table for each
SEARCH, and so will hammer the virtual memory mercilessly and will
thrash. The operators will kill your program.
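As a worked example of that multiplication, here is a small sketch using the 120-byte file2 record length stated earlier in the thread. It assumes each table item holds a whole file2 record; the data names are made up for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Record count and record length as stated in the thread.
       01  WS-RECORD-COUNT     PIC 9(8)  VALUE 22000000.
       01  WS-RECORD-LENGTH    PIC 9(3)  VALUE 120.
       01  WS-TABLE-BYTES      PIC 9(12).
       01  WS-TABLE-BYTES-ED   PIC ZZZ,ZZZ,ZZZ,ZZ9.
       PROCEDURE DIVISION.
      * Table size = number of items times the size of one item.
           COMPUTE WS-TABLE-BYTES =
               WS-RECORD-COUNT * WS-RECORD-LENGTH
           MOVE WS-TABLE-BYTES TO WS-TABLE-BYTES-ED
           DISPLAY "TABLE SIZE IN BYTES: " WS-TABLE-BYTES-ED
           STOP RUN.
```

That comes to 2,640,000,000 bytes, roughly 2.6 GB, for the table alone.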

> 2. Sequential processing is taking a very, very long time.

Are you reading the whole of file2 for each file1 record? Why?


Richard
08-05-06 11:55 PM


Re: Need help on cobol
Thank you very much for your analysis. That's excellent.

I need to match file1 (no duplicate CUST IDs in this file)
with file2 (duplicate CUST IDs in this file).

Once a CUST ID matches, then based on the status category field in
file2 I need to move one field's data to one of seven output fields
in the output record. All other output fields in the output record
would be populated from file2 fields only, and the output record
would be pipe-delimited.

File1 LRECL 20 (it has 10,000 records)
File2 LRECL 120 (it has 22 million records)

I hope this helps.

What is pipe-delimited? (Is it just putting "|" after each field in the
output record?)
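Assuming "pipe-delimited" carries its usual meaning here, that is, fields separated by the "|" character, a record could be built with STRING roughly as follows. The field names and values are hypothetical, since the thread does not give the real layout; only the 200-byte output length comes from the posts above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PIPEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fields; the thread gives no real layout.
       01  WS-CUST-ID          PIC X(6)   VALUE "C00042".
       01  WS-STATUS           PIC X(2)   VALUE "A1".
       01  WS-AMOUNT           PIC 9(5)   VALUE 1250.
       01  WS-OUT-REC          PIC X(200) VALUE SPACES.
       01  WS-OUT-PTR          PIC 9(4)   VALUE 1.
       PROCEDURE DIVISION.
      * Append each field, with "|" between neighbouring fields.
           STRING WS-CUST-ID  DELIMITED BY SIZE
                  "|"         DELIMITED BY SIZE
                  WS-STATUS   DELIMITED BY SIZE
                  "|"         DELIMITED BY SIZE
                  WS-AMOUNT   DELIMITED BY SIZE
               INTO WS-OUT-REC
               WITH POINTER WS-OUT-PTR
               ON OVERFLOW
                   DISPLAY "OUTPUT RECORD TOO SHORT"
           END-STRING
      * WS-OUT-PTR is one past the last character written.
           DISPLAY WS-OUT-REC (1:WS-OUT-PTR - 1)
           STOP RUN.
```

This displays C00042|A1|01250. Whether a trailing "|" is also wanted after the last field depends on whoever consumes the file, which the thread does not say.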

Once again, I appreciate your help.

Thanks






florence
08-05-06 11:55 PM


Re: Need help on cobol
florence wrote:
> Thank you very much for your analysis. That's excellent.
>
> I need to match file1 (no duplicate CUST IDs in this file)
> with file2 (duplicate CUST IDs in this file).
>
> Once a CUST ID matches, then based on the status category field in
> file2 I need to move one field's data to one of seven output fields
> in the output record. All other output fields in the output record
> would be populated from file2 fields only, and the output record
> would be pipe-delimited.
>
> File1 LRECL 20 (it has 10,000 records)
> File2 LRECL 120 (it has 22 million records)

Then you probably need:

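* Assumes Read-File2 moves HIGH-VALUES to File2-CustId at end of
* file2, and that File2-CustId holds LOW-VALUES before the first read.
    PERFORM UNTIL File1-CustId = HIGH-VALUES
        READ file1
            AT END MOVE HIGH-VALUES TO File1-CustId
        END-READ
* Skip file2 records with a key below the current file1 key.
        PERFORM Read-File2
            UNTIL File2-CustId >= File1-CustId
* Process each file2 record that carries the current file1 key.
        PERFORM
            UNTIL File1-CustId = HIGH-VALUES
               OR File2-CustId > File1-CustId

* deal with matching file2 record here

            PERFORM Read-File2
        END-PERFORM
    END-PERFORM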
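For anyone who wants to try the loop without setting up files, here is a self-contained sketch of the same logic. It is an illustration under assumptions, not Richard's program: the two files are replaced by small in-memory tables holding the sample keys from earlier in the thread (with file1 reduced to B D E F, since file1 has no duplicates), and READ-FILE1 and READ-FILE2 are hypothetical paragraphs that return HIGH-VALUES once their data is exhausted.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MATCHDEM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * In-memory stand-ins for the two sorted files.
       01  FILE1-DATA          PIC X(4)  VALUE "BDEF".
       01  FILE1-TABLE REDEFINES FILE1-DATA.
           05  F1-KEY          PIC X     OCCURS 4 TIMES.
       01  FILE2-DATA          PIC X(10) VALUE "AABCDDEEEF".
       01  FILE2-TABLE REDEFINES FILE2-DATA.
           05  F2-KEY          PIC X     OCCURS 10 TIMES.
       01  F1-IDX              PIC 9(4)  VALUE 0.
       01  F2-IDX              PIC 9(4)  VALUE 0.
      * Current keys; LOW-VALUES means "nothing read yet".
       01  FILE1-CUSTID        PIC X     VALUE LOW-VALUES.
       01  FILE2-CUSTID        PIC X     VALUE LOW-VALUES.
       01  MATCH-COUNT         PIC 9(4)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM UNTIL FILE1-CUSTID = HIGH-VALUES
               PERFORM READ-FILE1
      *        Skip file2 keys lower than the current file1 key.
               PERFORM READ-FILE2
                   UNTIL FILE2-CUSTID >= FILE1-CUSTID
      *        Handle every file2 record carrying the current key.
               PERFORM UNTIL FILE1-CUSTID = HIGH-VALUES
                          OR FILE2-CUSTID > FILE1-CUSTID
                   ADD 1 TO MATCH-COUNT
                   DISPLAY "MATCH " FILE1-CUSTID " AT FILE2 RECORD "
                       F2-IDX
                   PERFORM READ-FILE2
               END-PERFORM
           END-PERFORM
           DISPLAY "TOTAL MATCHES: " MATCH-COUNT
           STOP RUN.
      * Simulated sequential read of file1; HIGH-VALUES marks the end.
       READ-FILE1.
           ADD 1 TO F1-IDX
           IF F1-IDX > 4
               MOVE HIGH-VALUES TO FILE1-CUSTID
           ELSE
               MOVE F1-KEY (F1-IDX) TO FILE1-CUSTID
           END-IF.
      * Simulated sequential read of file2; HIGH-VALUES marks the end.
       READ-FILE2.
           ADD 1 TO F2-IDX
           IF F2-IDX > 10
               MOVE HIGH-VALUES TO FILE2-CUSTID
           ELSE
               MOVE F2-KEY (F2-IDX) TO FILE2-CUSTID
           END-IF.
```

With this data the run reports 7 matches: one B, two D, three E, and one F. Each table is walked exactly once from front to back, which is why the approach scales to the 22-million-record file.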

Richard
08-05-06 11:55 PM

