
Need to import non-relational data into a database, and produce reports based on it.
Pierre.Canuck@gmail.com

2007-02-15, 9:55 pm

A very simple example I've got is as follows:

line 1 AC 00012345 John Q Public
line 2 PA 00012345 Balance
line 3 PE 00012345 No funds
line 4 PD 00012345 20 20
line 5 AT 00012345 USD ...
line 6 DA 00012345
line 7 DE 00012345 No transactions
line 8 PA 00012345 Balance
line 9 PE 00012345 No funds
line 10 PD 00012345 25 25
line 11 AC 00012345 David Q Public
....

a) Lines 1-10 are related--they belong to one account -- record type
AC to the next record type AC
b) Lines 2-4 are related--they belong to one category--record types
PA, PE, PD (until the next PA)
c) Lines 8-10 are related--they belong to one category--record types
PA, PE, PD (until the next PA)

So, I need to tie the records in (a) together, likewise for (b) and
(c), so that I can write an appropriate SQL join.
I've got a modest amount of experience importing data and a lot of SQL
experience. I'm very weak with respect to COBOL data.

Where I really need some help is creating a good methodology for doing
this. Murphy's law says that I'll create a methodology which will
fall apart on the third COBOL data structure I need to migrate. I was
hoping that those with a lot of COBOL and/or data migration experience
can steer me in the right direction.

Yes, I completely agree that the table structure is terrible--it
should have a complete key. However, the customer's program is
probably almost as old as I am, so there's no changing that now. My
goal is to import it into a SQL database, and deal with it there.

Michael Mattias

2007-02-16, 6:55 pm

<Pierre.Canuck@gmail.com> wrote in message
news:1171591553.003341.39890@l53g2000cwa.googlegroups.com...
>A very simple example I've got is as follows:
>
> line 1 AC 00012345 John Q Public
> line 2 PA 00012345 Balance
> line 3 PE 00012345 No funds
> line 4 PD 00012345 20 20
> line 5 AT 00012345 USD ...
> line 6 DA 00012345
> line 7 DE 00012345 No transactions
> line 8 PA 00012345 Balance
> line 9 PE 00012345 No funds
> line 10 PD 00012345 25 25
> line 11 AC 00012345 David Q Public
> ...
>
> a) Lines 1-10 are related--they belong to one account -- record type
> AC to the next record type AC
> b) Lines 2-4 are related--they belong to one category--record types
> PA, PE, PD (until the next PA)
> c) Lines 8-10 are related--they belong to one category--record types
> PA, PE, PD (until the next PA)
>
> So, I need to tie the records a together, likewise for b and c, so
> that I can write appropriate SQL join.
> I've got modest amount experience importing data and a lot of SQL
> experience. I'm very weak wrt/COBOL data.


What operating environment?

If Windows/NT, why use COBOL at all? The above file is easily defined as
an ODBC data source using the ODBC text driver, and you can INSERT INTO ...
(SELECT ...), or use some form of PL/SQL if your target database supports it.
That would seem perfect for someone weak in COBOL and strong in SQL.
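
The INSERT INTO ... SELECT idea can be sketched roughly as below. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for the ODBC text data source and the target database, the records are assumed to be whitespace-delimited (type code, account number, remainder), and the `staging` and `account` tables and their columns are hypothetical names, not anything from the actual system.

```python
import sqlite3

# A few records from the sample file; the layout (type, account, rest) is assumed.
raw = [
    "AC 00012345 John Q Public",
    "PA 00012345 Balance",
    "DA 00012345",
    "AC 00012345 David Q Public",
]

conn = sqlite3.connect(":memory:")
# Hypothetical staging table: one row per input line, in file order.
conn.execute(
    "CREATE TABLE staging (line_no INTEGER PRIMARY KEY, "
    "rec_type TEXT, acct TEXT, payload TEXT)"
)
# Hypothetical destination table for the AC (account) records.
conn.execute(
    "CREATE TABLE account (line_no INTEGER PRIMARY KEY, acct TEXT, holder TEXT)"
)

for n, line in enumerate(raw, start=1):
    rec_type, acct, *rest = line.split(None, 2)
    conn.execute(
        "INSERT INTO staging VALUES (?, ?, ?, ?)",
        (n, rec_type, acct, rest[0] if rest else ""),
    )

# The set-based step: copy one record type from staging into its own table.
conn.execute(
    "INSERT INTO account (line_no, acct, holder) "
    "SELECT line_no, acct, payload FROM staging WHERE rec_type = 'AC'"
)

accounts = conn.execute(
    "SELECT line_no, acct, holder FROM account ORDER BY line_no"
).fetchall()
```

The line number is carried along because, in this file, position is the only thing that relates one record to another.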

For that matter, you could create your reports directly from this
datasource.

If this were say, IBM mainframe or a 'nix system, well, I'd probably use
COBOL, too.

Regardless, your first step is to define the destination database. You can't
do anything at all until you do that, and short of designing it for you, I'm
not too sure anyone here could offer any advice of value as to your next
step.

MCM

Pierre.Canuck@gmail.com

2007-02-16, 6:55 pm

environment:
It's a Windows 2003 environment. I don't plan on using COBOL; I'm
going to use a modern language, e.g. C#, to parse the various record
types/fields and insert them into a SQL database. Once it's in SQL, I'll
run the reports. The target system is completely SQL based, and all
programmers have SQL experience. We're trying to move away from COBOL
code.

size:
The dataset is millions of records, so the ODBC text driver won't
offer the performance I need. In addition, I've got the hurdle of
relating records a, b and c together, which I need an algorithm for.

news group choice:
I'm posting this in the COBOL group, since this is COBOL data, and
many participants have SQL experience--so it's the perfect blend of
skills to propose a methodology.

Once again, if anyone has some advice on a methodology for relating the
records, that would be greatly appreciated. I'm not looking for anyone
to write code.

One approach is:
i) for the SQL tables PA, PD, PE, add an integer column "PKEY"
ii) when importing, upon parsing a PA record, increment the PKEY value
iii) when parsing PA, PD, PE records, store the current PKEY value in
the PKEY column, thus:
lines 2-4 would have PKEY=1
lines 8-10 would have PKEY=2
iv) the SQL query would use an INNER JOIN ON PA.PKEY=PD.PKEY AND
PA.PKEY=PE.PKEY

Is this a good approach? My COBOL experience is weak, so are there
scenarios of other COBOL data where this approach is weak or
falls apart completely? Can anyone offer a better approach which will
work in most or all scenarios? Thanks in advance.
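
A minimal sketch of steps i) to iv), with several assumptions: the records are taken to be whitespace-delimited (type code, account number, remainder), Python's built-in sqlite3 stands in for the real target database, and a single hypothetical table `rec` with a type column replaces the separate PA, PD and PE tables to keep the example short. The function name `assign_keys` and the extra `akey` column are also illustrative. The account-level counter is added because the sample shows the same account number on two different AC records (lines 1 and 11), so the account number alone may not identify a group.

```python
import sqlite3

SAMPLE = """\
AC 00012345 John Q Public
PA 00012345 Balance
PE 00012345 No funds
PD 00012345 20 20
AT 00012345 USD ...
DA 00012345
DE 00012345 No transactions
PA 00012345 Balance
PE 00012345 No funds
PD 00012345 25 25
AC 00012345 David Q Public
"""

def assign_keys(lines):
    """Yield (akey, pkey, rec_type, acct, payload) for each record in file order."""
    akey = 0          # surrogate key for the current AC group
    pkey = 0          # surrogate key for the current PA/PE/PD group
    in_group = False  # True while inside a PA..PD run
    for raw in lines:
        if not raw.strip():
            continue
        parts = raw.split(None, 2)
        rec_type = parts[0]
        acct = parts[1] if len(parts) > 1 else ""
        payload = parts[2] if len(parts) > 2 else ""
        if rec_type == "AC":
            akey += 1            # a new account starts here
            in_group = False
        elif rec_type == "PA":
            pkey += 1            # step ii: a new category starts here
            in_group = True
        elif rec_type not in ("PE", "PD"):
            in_group = False     # any other type ends the PA group
        yield (akey, pkey if in_group else None, rec_type, acct, payload)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rec (akey INTEGER, pkey INTEGER, "
    "rec_type TEXT, acct TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO rec VALUES (?, ?, ?, ?, ?)", assign_keys(SAMPLE.splitlines())
)

# Step iv: join the PA, PE and PD rows that share a PKEY.
rows = conn.execute(
    "SELECT pa.pkey, pa.payload, pe.payload, pd.payload "
    "FROM rec pa "
    "JOIN rec pe ON pe.pkey = pa.pkey AND pe.rec_type = 'PE' "
    "JOIN rec pd ON pd.pkey = pa.pkey AND pd.rec_type = 'PD' "
    "WHERE pa.rec_type = 'PA' ORDER BY pa.pkey"
).fetchall()
```

With the sample data, lines 2-4 receive PKEY 1 and lines 8-10 receive PKEY 2. One place where this could break down is a file in which a PE or PD record appears without a preceding PA; the sketch leaves PKEY empty in that case rather than guessing.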


Richard

2007-02-16, 6:55 pm

On Feb 17, 4:00 am, Pierre.Can...@gmail.com wrote:

> news group choice:
> I'm posting this in the COBOL group, since this is COBOL data,


There is no such thing as 'COBOL data'. It is just data; that it was
written by a COBOL program is irrelevant. It could have been written
by C# or Java or assembler.

It is application data. The way the data is related is built into the
application program, so you need to understand the way that the program
has been written to tie the data together.


Howard Brazee

2007-02-16, 6:55 pm

On 15 Feb 2007 18:05:53 -0800, Pierre.Canuck@gmail.com wrote:

>I've got modest amount experience importing data and a lot of SQL
>experience. I'm very weak wrt/COBOL data.


I've only been programming in CoBOL since 1969, so maybe you can
educate me - what are "COBOL data"?

You seemed to describe a file format - if you know what the data look
like, then you can access the data with the language of your choice.

Pierre.Canuck@gmail.com

2007-02-16, 9:55 pm

On Feb 16, 2:00 pm, Howard Brazee <how...@brazee.net> wrote:
> On 15 Feb 2007 18:05:53 -0800, Pierre.Can...@gmail.com wrote:
>
>
> I've only been programming in CoBOL since 1969, so maybe you can
> educate me - what are "COBOL data"?
>
> You seemed to describe a file format - if you know what the data look
> like, then you can access the data with the language of your choice.


Yes, I stand corrected, it's non-relational data. In this case it's
generated by a COBOL program, but it could have been generated by
any language.

Also I stand corrected, I could use (almost) any language to import
this data--COBOL, VB6, C#, VB.NET, Java, Perl, in fact I've even used
SQL (wasn't too pretty)...

I guess you could call me a newbie--I'm used to relational data--
xBase, Access, Sybase, SQL Server, Oracle... so, having to deal with
non-relational data is 'different' for me. I suppose I'll just have
to deal with this one case at a time--I'll look at the copybook and
hopefully it'll conform to the methodology I've laid out. If not,
I'll be back to the drawing board.

I do have one remaining question: in your experience, how often do you
encounter record definitions having incomplete keys--i.e. you need to
look at the previous record(s) to relate them together? Is this
something rare?

HeyBub

2007-02-17, 6:55 pm

Pierre.Canuck@gmail.com wrote:
> On Feb 16, 2:00 pm, Howard Brazee <how...@brazee.net> wrote:
>
> Yes, I stand corrected, it's non-relational data. In this case it's
> generated by a COBOL program, but it could have been any generated by
> any language.
>
> Also I stand corrected, I could use (almost) any language to import
> this data--COBOL, VB6, C#, VB.NET, Java, Perl, in fact i've even used
> SQL (wasn't too pretty)...
>
> I guess you could call me a newbie--I'm used to relational data--
> xBase, Access, Sybase, SQL Server, Oracle... so, having to deal with
> non-relational data is 'different' for me. I suppose i'll just have
> to deal with this one case at a time--i'll look at the copybook and
> hopefully it'll conform to the methodology i've laid-out. If not,
> i'll be back to the drawing board.
>
> I do have one remaining question, in your experience, how often do you
> encounter record definitions having incomplete keys--i.e. you need to
> look at the previous record(s) to relate them together? Is this
> something rare?


No, it is quite common. I remember one file. The file consisted of a header
record (the well descriptor record) followed by an indeterminate number
(from zero to several thousand) of test reports.

Since this condition has already been worked out by previous programmers,
and is buttressed by decades of real-world debugging and experience,
wouldn't the whole project be easier, quicker, and cheaper if you simply
hired a COBOL programmer?

You may have to build wheel-chair ramps, but that shouldn't cost much.


Alistair

2007-02-17, 6:55 pm

On 17 Feb, 14:18, "HeyBub" <heybubNOS...@gmail.com> wrote:
> Pierre.Can...@gmail.com wrote:
>
> No, it is quite common. I remember one file. The file consisted of a header
> record (the well descriptor record) followed by an indeterminate number
> (from zero to several thousand) test reports.
>
> Since this condition has already been worked out by previous programmers,
> and is buttressed by decades of real-world debugging and experience,
> wouldn't the whole project be easier, quicker, and cheaper if you simply
> hired a COBOL programmer?
>
> You may have to build wheel-chair ramps, but that shouldn't cost much.


I'm a sprightly 49 years young and still able to climb stairs unaided.
I'm also potty trained and reasonably well domesticated AND available.
Rates are not as high as Doc's.....

Howard Brazee

2007-02-19, 9:55 pm

On 16 Feb 2007 19:47:16 -0800, Pierre.Canuck@gmail.com wrote:

>I do have one remaining question, in your experience, how often do you
>encounter record definitions having incomplete keys--i.e. you need to
>look at the previous record(s) to relate them together? Is this
>something rare?


I've seen various stages of how variable data are handled:

1. Variable length records predominate.
2. Header and detail records predominate.
3. Hierarchical (or Network) databases have owner and child
records.
4. We don't care what happens behind the scenes, the database will
translate the SQL to what we ask for.

In mainframe shops, I don't see many variable length records anymore,
but still see quite a bit of data with header and detail records.


We will always need to process data that are not organized in a
database. It might not be positionally consistent like old style
files. But even a word processor creates documents with some
formatted header information.

Michael Mattias

2007-02-19, 9:55 pm

On 16 Feb 2007 19:47:16 -0800, Pierre.Canuck@gmail.com wrote:
>I do have one remaining question, in your experience, how often do you
>encounter record definitions having incomplete keys--i.e. you need to
>look at the previous record(s) to relate them together? Is this
>something rare?


It is not rare at all; it is in fact quite common.

If you think about it a little, you'll realize this is what sequential
access is all about. (If "Y" follows "X" then "Y" logically belongs with
"X").
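
As a rough illustration of that idea, a single sequential pass can attach each detail record to the most recent header. The function name `group_by_header` and the shortened sample records are hypothetical; the only assumption about the data is that some test can tell a header from a detail record.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

def group_by_header(
    records: Iterable[str], is_header: Callable[[str], bool]
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header, details) pairs; details follow their header in file order."""
    header = None
    details: List[str] = []
    for rec in records:
        if is_header(rec):
            if header is not None:
                yield header, details   # the previous group is complete
            header, details = rec, []
        elif header is not None:
            details.append(rec)         # "Y" follows "X", so "Y" belongs with "X"
        else:
            raise ValueError(f"detail record before any header: {rec!r}")
    if header is not None:
        yield header, details           # flush the final group

records = [
    "AC 00012345 John Q Public",
    "PA 00012345 Balance",
    "PD 00012345 20 20",
    "AC 00012345 David Q Public",
]
groups = list(group_by_header(records, lambda r: r.startswith("AC ")))
```

A header with no detail records simply yields an empty list, which covers the "zero to several thousand" case mentioned earlier in the thread.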


MCM


Copyright 2008 codecomments.com