Data validation (NUMERIC checking follow-on)
| William M. Klein 2007-10-21, 6:55 pm |
| I just thought that I would comment on changes (potential more than real) in the
area of data validation. In many ways, these seem (to me) to indicate a desire
(from a Standards point of view) to move MORE closely to the approaches that
Pete has advocated in the original thread.
Before the '02 Standard, Standard COBOL had VERY little in the way of built-in
data-validation. It was really easy to get "unpredictable results" with
"incompatible data". In the '02 Standard a NUMBER of features were added to
"solve" this problem. It is unfortunate (IMHO) that they came so late (and
some - possibly most - will never get implemented).
Consider that with an '85 Standard compiler, it was INCREDIBLY difficult to
check if the content of a data item actually matched its data definition
(Picture, Usage, other clauses). To do so, a program would have to virtually
check byte by byte (even nibble by nibble). This became particularly odious
with the addition of Intrinsic Functions such as the date and NumVal functions
where passing a "bad" argument led to totally unpredictable (and non-portable)
results.
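To make the "nibble by nibble" point concrete, here is a minimal sketch of what a hand-rolled check of a packed-decimal field involves. It is written in Python rather than COBOL, purely as an illustration, and it is not taken from any compiler. It assumes the common convention that every nibble except the last is a digit 0-9 and that the last nibble is a sign of C, D or F. The function name and the accepted sign set are assumptions, and some implementations accept other sign nibbles.

```python
def is_valid_packed(field: bytes, signs=frozenset({0xC, 0xD, 0xF})) -> bool:
    """Return True if every nibble of a packed-decimal field is plausible.

    Assumption: all nibbles but the last must be digits 0-9, and the last
    nibble must be one of the sign codes in `signs`.
    """
    if not field:
        return False
    nibbles = []
    for byte in field:
        nibbles.append(byte >> 4)      # high-order nibble
        nibbles.append(byte & 0x0F)    # low-order nibble
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign in signs


print(is_valid_packed(bytes([0x12, 0x34, 0x5C])))  # a well-formed +12345
print(is_valid_packed(bytes([0x40, 0x40, 0x40])))  # three EBCDIC spaces
```

The second call is the classic case of a field that was never initialised: its digit nibbles look fine, but the final nibble is not a sign, so the check rejects it.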
With the '02 Standard, a number of TEST-xxxx intrinsic functions (TEST-NUMVAL,
TEST-DATE-YYYYMMDD and so on) were added.
These allowed for "pre-testing" of the contents of fields to be used for NumVal
and Date functions. They didn't, however, actually test fields for conformance
to their data descriptions. My best guess is that these functions will
(eventually) get implemented by some - possibly most - vendors.
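The "pre-test, then convert" idea can be sketched as follows. This is Python, not COBOL, and the pattern below is only a simplified approximation of what a NumVal argument may look like (a number with an optional leading sign, or with an optional trailing sign, CR or DB). The names are hypothetical, and a real implementation of the Standard's functions may accept or reject slightly different strings.

```python
import re

# Simplified approximation of a NUMVAL-style argument: a number with an
# optional leading sign, or a number with an optional trailing sign
# (+, -, CR or DB), with spaces allowed around each part.
# Real implementations may differ in detail.
NUMBER = r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)"
LEADING_SIGN = re.compile(rf" *[+-]? *{NUMBER} *")
TRAILING_SIGN = re.compile(rf" *{NUMBER} *(?:[+-]|CR|DB)? *", re.IGNORECASE)


def looks_like_numval_argument(text: str) -> bool:
    """Pre-test a string before handing it to a NUMVAL-style conversion."""
    return bool(LEADING_SIGN.fullmatch(text) or TRAILING_SIGN.fullmatch(text))


for candidate in (" -12.50 ", "12.5CR", "12A4", "    "):
    verdict = "ok to convert" if looks_like_numval_argument(candidate) else "reject"
    print(repr(candidate), verdict)
```

Note that this only says whether the string is safe to convert. As with the functions described above, it says nothing about whether a field matches its own data description.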
The '02 Standard also expanded the use of IF NUMERIC to non-Display/National
usages. This made it more useful for testing fields that were defined as
Numeric. However, there never was (and still isn't) a "comparable" IF
NUMERIC-EDITED test to see if the content of numeric-edited (or
alphanumeric-edited) fields conforms to their data description. The expanded
IF NUMERIC test is in moderately common use today.
Certainly the VALIDATE facility allows for all (????) the types of validation
that a program might want. It even has ways of testing content of REDEFINES
based on what is in other fields. It allows for range checks and specific value
checks. Unfortunately, besides being relatively complex, I will guess that it
will rarely if ever be implemented. (The final Standard definition was
similar - but not identical to - a preprocessor product that had limited
availability.) My guess is that if this feature were in general use (and had
been since the '70s) COBOL data validation would be a "non-issue" for
programmers - and this forum.
Another enhancement in the '02 Standard that is relevant is the
EC-Data-Incompatible exception condition. I think that in some ways, this MOST
CLOSELY resembles what Pete was talking about. When turned on (and the default
is OFF - but it can be turned on for a few lines or an entire program), an
exception will be raised (and can be handled within a declarative) whenever a
numeric or numeric-edited field is used as a "sending field" (in the COBOL
sense) and the content does not match the data definition. As I recall, it does
NOT check alphanumeric-edited fields, so it isn't a "complete" validation
facility, but it certainly does treat "incompatible data" as an exception that
requires "handling" (or abnormal termination of the program).
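As a rough analogy only (Python, not COBOL, with hypothetical names throughout), the behaviour described above looks something like this: checking is off unless asked for, and when it is on, using a badly formed field as a sending field raises an exception that the caller must handle. The "checking off" branch uses an arbitrary stand-in for "unpredictable results". It is not what any particular compiler does.

```python
class DataIncompatible(Exception):
    """Illustrative stand-in for the EC-DATA-INCOMPATIBLE exception condition."""


def use_as_sending_field(content: str, checking: bool = False) -> int:
    """Fetch the value of an unsigned numeric DISPLAY field (hypothetical helper).

    checking=False mirrors the default of OFF: bad content is used anyway.
    """
    if checking and not (content.isascii() and content.isdigit()):
        raise DataIncompatible(f"{content!r} does not match a numeric definition")
    # Arbitrary stand-in for "unpredictable results": keep the low nibble of
    # each character, folded into 0-9. Valid digits pass through unchanged.
    digits = "".join(str((ord(ch) & 0x0F) % 10) for ch in content)
    return int(digits or "0")


print(use_as_sending_field("12 45"))        # checking off: silently yields a number
try:
    use_as_sending_field("12 45", checking=True)
except DataIncompatible as exc:             # plays the role of the declarative
    print("handled:", exc)
```

The point of the sketch is the contrast between the two calls: the same polluted content either flows on quietly or is stopped at the moment it is used.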
A final issue to mention is TYPE and TYPEDEF. Although I am SLIGHTLY familiar
with what is in the '02 Standard, I don't really "internalize" it as I haven't
used it and I don't know what other languages do in this area. It is my
understanding/memory that "traditional" COBOL is considered a "weakly typed"
language. This is (I think) at the root of having non-numeric data within
numerically defined fields. With the '02 Standard both weak and strong user
defined types are introduced (to Standard COBOL). It is my understanding that
strongly typed fields do everything that Pete has asked for - for numeric
fields. They *must* have good data and you can only do "good" things with them.
Weakly typed data has SOME of the "safety" features but still may end up with
"bad" data within the fields (I think - but am not certain about this).
* * * *
Bottom-Line:
In a theoretical world where the '02 Standard were implemented and all
applications "validated" data once when it was first "introduced", there would
be lots of "good" things that programs could do to validate the data initially
and to protect from "problems" with unexpectedly polluted fields. I am NOT
holding my breath for such a day - but did think it was worth commenting on all
the "post" IF NUMERIC solutions that have been (theoretically) introduced into
COBOL.
--
Bill Klein
wmklein <at> ix.netcom.com
| |
| Pete Dashwood 2007-10-21, 6:55 pm |
|
I read this with interest, Bill. It is very impressive. I really hope it
gets implemented, but I understand the odds are stacked against it. Still,
definitely a step in the right direction.
I think you put your finger on the problem with VALIDATE; it is
(necessarily?) too complex and many people won't look at it for that reason.
I'm pretty busy at the moment, but I have decided that, as soon as I can, I
will make my validation component a web service, which I will offer free to
the community.
I know some people are starting to experiment with SOA and web services and
it could be a good test and "experiment" exercise.
Obviously, people would have concerns about adding a remote web service
(running on a server they have no control over, with no guarantee that the
service will continue) to production code. However, I am happy to make a
downloadable version of the component available, which can be installed on a
corporate web server, once people have tried connecting their application to
the service.
Complete documentation on the component and how to interface to it will be
available at the same time.
There would be no licensing or fees of any kind on the service, but if
people want the source code (I understand COBOL mentality... :-)) I would
make a small charge for that and it would be subject to a standard EULA. The
fees from this would be used to help offset the cost of running my web
server.
Pete.
--
"I used to write COBOL...now I can do anything."
| |
| Alistair 2007-10-29, 6:55 pm |
|
Thanks Bill.
I have worked with a language (Natural - remember it? I keep banging
on about it!) which validated batch records against the definition at
the time of reading them in to memory. Unfortunately it took me 8
hours to suss why the program fell over upon reading in the first data
record of a file. It would have taken me less time with a nice old
fashioned S0C7.
| |
| Howard Brazee 2007-10-29, 6:55 pm |
| On Mon, 29 Oct 2007 13:42:13 -0700, Alistair
<alistair@ld50macca.demon.co.uk> wrote:
>I have worked with a language (Natural - remember it? I keep banging
>on about it!) which validated batch records against the definition at
>the time of reading them in to memory. Unfortunately it took me 8
>hours to suss why the program fell over upon reading in the first data
>record of a file. It would have taken me less time with a nice old
>fashioned s0C7.
I'm curious - does Natural have an equivalent of Redefines - if so, do
the file definitions include rules to determine which definition is
applicable?
| |
| Alistair 2007-10-29, 6:55 pm |
| On 29 Oct, 21:09, Howard Brazee <how...@brazee.net> wrote:
> On Mon, 29 Oct 2007 13:42:13 -0700, Alistair
>
> <alist...@ld50macca.demon.co.uk> wrote:
>
> I'm curious - does Natural have an equivalent of Redefines - if so, do
> the file definitions include rules to determine which definition is
> applicable?
Yes, Natural does use REDEFINES. I think that (I'm not a 100% expert on
this) it uses the record definition specified on the READ statement
but when you expect data to be kosher it can be off-putting to find
that a file won't open. One of the problems is that the error message
given may not be correct and may point to the wrong statement!