Code Comments

Programming Forum and web based access to our favorite programming groups.
For Programmers: Free Programming Magazines | New: Database administration forum
Registration is free! Edit your profileCalendarFind other membersFrequently Asked QuestionsSearch -> 
Post New Thread











Thread
Author

Demo: Sorts (revised)
This is a re=posting of the sorts demo with one added test: inserting record
s
into an indexed file and reading them back in sequence. Jerry Mouse said thi
s
technique is "efficient". In FACT, it ran 15 times slower than a Cobol sort 
and
more than 100 times slower than a fast sort.

*             I'll take Sorts And Trees for a thousand, Alex
*
* These two programs -- Sorts and Trees -- demonstrate Cobol Features
* overlooked by most programmers, while running timing tests on
* sort algorithms.
*
* Glossary.
* We will see three data structures:
*
* Table   An array defined with Occurs. Same as a sequential file.
*
* List    Each entry has a pointer or index containing the address
*         of the next entry, and optionally the previous. Same as
*         a 'random' file.
*
* Tree    Like a family tree, where each node contains one data
*         value and pointers to up to two children, named Left,
*         containing entries less than the node's data, and Right.
*         Nodes with fewer than two children are called Leaves.
*         Same as an indexed file.
*
* Sorts program
*
* Feature: Nested Programs
*
* Cobol is often criticized for having 'an ocean of data' in
* working-storage that is globally accessible to all parts of the
* program. It is also criticized for having overly large monolithic
* programs. Nested programs, seen below, are one answer to that
* objection; we will see another in Trees. They are written like
* external called programs, but they are in the same source file
* with the caller. If they were separately compiled, there would
* be additional administrative work to list them on program
* inventories, promote them through Change Management, perhaps
* list them as dependents in make files.
*
* Nested programs have their own private working-storage and lack
* access to their parent(s)' working-storage unless passed as
* a parameter or declared Global. They have been in the ANSI
* Cobol Standard since 1985.
*
* Test 0. Indexed file sort
*
* Inserts records into an indexed file, then reads them back in sequence.
* This is included as a bad example, the wrong way to sort data.
* It runs 15 times slower than a Cobol sort.
*
* Test 1. Cobol file sort
*
* This is a familiar Cobol sort with input and output procedures,
* release and return records. No Features here, it is shown to provide a
* timing baseline.
*
* Test 2. Micro Focus table sort
*
* Feature: Table sort
*
* Micro Focus provides an extension allowing you sort a table in memory.
* When we time it, we find its speed is almost identical to the file
* sort in test 1, and half the speed it 'should' be as shown in test 3.
* Apparently, under the covers it is releasing records to and returning
* records from the file sort. This was an extension to the 1985 Standard,
* is included in the 2002 Standard.
*
* Test 3. Quick sort
*
* Feature: Calling C language functions from Cobol
*
* This test calls qsort() in the Standard C Library, and directs it to
* call memcmp().
*
* How do you tell Cobol to use the C calling convention? Not necessary,
* Micro Focus uses the C convention by default. How do you tell the
* linker (loader) to look in C libraries? Not necessary. We stopped
* doing static binding (link edit) many years ago. Binding is now done
* at execution time by a program named "ld", which follows
* LD_LIBRARY_PATH. The C Standard Library is in that path by default. In
* short, you don't have to do anything special. You can see libraries
* that will be loaded at execution time by typing "ldd sorts".
*
* C libraries contain a plethora of useful functions. See
* http://www.tcnj.edu/~cs/doc/CStdLib.html
* also
* http://www.bebits.com/app/2917
* also
* a hundred others in /usr/lib, many defined in Standards
*
* The speed of quick sort is said to be proportional to N log2(N), where
* N is the number of entries and log2 is the base 2 logarithm. Most
* high-speed algorithms run at this speed, for example Shell, Heap and
* Comb sorts. Low-speed sorts such as Bubble run at N*N, are suitable
* only for small values of N, say less than 100.
*
* Tests 4 & 5. Radix sort followed by Insertion sort
*
* Features: The radix algorithm
*           Linked lists using indexes
*
* When we had to sort a large number of punched cards, we put them into
* approximate sequence by sorting on the first few columns with a card
* sorter. Next we put them through a machine called a collator, which
* pulled cards out of order, typically 5-10%. We sorted those cards on
* the full key, then used a collator to merge them into the big deck.
*
* Would you believe the card sorting algorithm, written in Cobol, is
* five times faster than the best high-speed algorithms written in C,
* and about ten times faster than the sorts delivered by Micro Focus?
*
* The algorithm inspects one 'column' at a time, right to left, tossing
* 'cards' into 'pockets' by linking the 'card' to what had previously
* been the last in the 'pocket'. At the end of each pass, it 'gathers'
* the pocket strings into a single string ('deck') by linking the last
* 'card' in a 'pocket' to the first 'card' in the next 'pocket'.
*
* A 'column' can be any convenient size -- one bit, one byte or one
* word. The demo program defines a column as two bytes, 16 bits. This
* requires 65,536 'pockets', of which only a few thousand are actually
* used.
*
* When cards are approximately in order, the program 'merges' the ones
* out of order into place with an Insertion Sort. As a primary
* algorithm, Insertion is slow, close to N*N. But when the entries are
* almost in sequence, it is quite fast. It almost always puts them in
* order with one pass, then makes a second pass to verify that the job
* is complete. It takes a third pass only when there are two consecutive
* steps down. In the 50,000 item test, that doesn't happen, it runs in
* two passes. If radix had sorted on seven bytes rather than six, they
* would have been in perfect order. Insertion would have taken only one
* pass.
*
* The Radix-Insertion algorithm speed is proportional to kN + 2N+, where
* k is the number of columns sorted by Radix (in this case 3) and the +
* following 2N is close-range rearrangements by Insertion. In THEORY,
* the relative speeds of test 3 vs. test 5 should be 50,000 *
* log2(50,000) = 785,000 vs. 3*50,000 + 2*50,000+ =~ 300,000. The ratio
* is 2.6. In PRACTICE, it is 5.0 times as fast. As usual, the devil is
* in the details. If the test program had used subscripts rather than
* indexes, its speed would have been cut in half. Not because indexes
* are that much faster, because every subscript operation generates a
* call to the runtime whereas index operations produce inline code (seen
* via the now commented ASM options).
*
* [If your calculator doesn't have log2, take ln (base e) and divide by .69,
* which is ln(2).]
*
* To compile this "cob -x sorts.cbl". -x produces a Unix executable.
* To execute "sorts".
*
* Continued in trees.cbl
*
* ---------------------------------------------------------------------

$SET SOURCEFORMAT"FREE"
$SET NOBOUND OPT"2" NOTRUNC IBMCOMP NOCHECK
*     $SET ASM ASMLIST SOURCEASM
identification division.
program-id. Sorts.
author. Robert Wagner.
* Compare speed of four sorts.

* Results: Sun SPARC, 1.8 GHz, 50,000 30-byte entries, time in
* seconds for 10 iterations.
*                                  --- ratio ---
*                                   2.   3.   4.   5.
* 0. Indexed file sort      55.4  14.6 22.2  110  139
* 1. Cobol file sort         3.7   1.0  1.5  7.4  9.3
* 2. Micro Focus table sort  3.8        1.5  7.6  9.5
* 3. qsort()                 2.5             5.0  6.3
* 4. radix-insertion          .5                  1.3
* 5. radix to linked list     .4
*
* Trees.cbl contains an expanded version of these test results.
working-storage section.
01  test-data GLOBAL.
05  radix-columns               comp  pic s9(02).
05  test-number                 comp  pic s9(02).
05  card      occurs 50001 indexed                *> the data
x-n first-n last-n previous-n-0 previous-n-1 previous-n-2 x-last.
10  sort-key.
15  key-column occurs 15 indexed x-c comp-x pic xx.
10  next-n                             index.
10  next-n-1                           index.

01  init-data.
05  init-columns value zero     comp  pic s9(02).
05  init-number  value zero     comp  pic s9(02).
05  init-card occurs 50001 indexed i-n i-next i-last.
10  init-key-1                    pic v9(18).
10  init-key-2                    pic v9(12).
10  init-next                     index.
10  init-next-1                   index.

01  timer-variables.
05  repeat-factor value 10      comp  pic s9(09).
05  previous-key                      pic  x(30).
05  card-count                  comp  pic s9(09).
05  start-time.
10  start-hours                   pic  9(02).
10  start-minutes                 pic  9(02).
10  start-seconds                 pic  9(02).
10  start-hundredths              pic  9(02).
05  end-time.
10  end-hours                     pic  9(02).
10  end-minutes                   pic  9(02).
10  end-seconds                   pic  9(02).
10  end-hundredths                pic  9(02).
05  elapsed-time                      pic  z(04).9.

procedure division.
perform construct-data

display '0. Indexed file sort' with no advancing
move 0 to init-number
move 2 to repeat-factor
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'indexed-file-sort'
end-perform
perform timer-off
perform verify-sort

display '1. Cobol file sort'   with no advancing
move 1 to init-number
move 10 to repeat-factor
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'cobol-sort'
end-perform
perform timer-off
perform verify-sort

display '2. Micro Focus sort'  with no advancing
move 2 to init-number
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'mf-sort'
end-perform
perform timer-off
perform verify-sort

display '3. quick sort N log2(N)' with no advancing
move 3 to init-number
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'quick-sort'
end-perform
perform timer-off
perform verify-sort

display '4. radix-insertion kN + 2N+' with no advancing
move 4 to init-number
move 3 to init-columns
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'radix-sort'
end-perform
perform timer-off
move zero to init-columns
perform verify-sort

display '5. radix-insertion to linked list' with no advancing
move 5 to init-number
move 3 to init-columns
perform timer-on
perform repeat-factor times
perform initialize-test-data
call 'radix-sort'
end-perform
perform timer-off

goback

. construct-data.
set x-last, i-last to 50001
perform varying i-n from 1 by 1 until i-n = i-last
compute init-key-1 (i-n) = function random
compute init-key-2 (i-n) = function random
set i-next to i-n
set i-next up by 1
set init-next (i-n) to i-next
set init-next-1 (i-n) to i-last
end-perform
move high-values to init-card (i-last)

. initialize-test-data.
move init-data to test-data
set first-n to 1

. verify-sort.
move zero to previous-key, card-count
set x-n to first-n
if  x-n equal to x-last
display 'test failed, no first'
end-if
perform until x-n equal to x-last
if  sort-key (x-n) less than previous-key
display 'test failed '
previous-key space sort-key (x-n)
set x-n to x-last
else
move sort-key (x-n) to previous-key
if  init-columns not equal to zero
set x-n to next-n (x-n)
else
set x-n up by 1
end-if
add 1 to card-count
end-if
end-perform
if  card-count not equal to 50000
display 'test failed, count is ' card-count
end-if

. timer-on.
accept start-time from time
. timer-off.
accept end-time from time
compute elapsed-time rounded =
((((((((end-hours * 60) +
end-minutes) * 60) +
end-seconds) * 100) +
end-hundredths) -
((((((start-hours * 60) +
start-minutes) * 60) +
start-seconds) * 100) +
start-hundredths)) / 100)
* 10 / repeat-factor
display  elapsed-time ' seconds'

Report this thread to moderator Post Follow-up to this message
Old Post
Robert Wagner
06-25-04 11:50 PM


Re: Demo: Sorts (revised)
quote:
Originally posted by Robert Wagner This is a re=posting of the sorts demo with one added test: inserting record s into an indexed file and reading them back in sequence. Jerry Mouse said thi s technique is "efficient". In FACT, it ran 15 times slower than a Cobol sort and more than 100 times slower than a fast sort. * I'll take Sorts And Trees for a thousand, Alex * * These two programs -- Sorts and Trees -- demonstrate Cobol Features * overlooked by most programmers, while running timing tests on * sort algorithms. * * Glossary. * We will see three data structures: * * Table An array defined with Occurs. Same as a sequential file. * * List Each entry has a pointer or index containing the address * of the next entry, and optionally the previous. Same as * a 'random' file. * * Tree Like a family tree, where each node contains one data * value and pointers to up to two children, named Left, * containing entries less than the node's data, and Right. * Nodes with fewer than two children are called Leaves. * Same as an indexed file. * * Sorts program * * Feature: Nested Programs * * Cobol is often criticized for having 'an ocean of data' in * working-storage that is globally accessible to all parts of the * program. It is also criticized for having overly large monolithic * programs. Nested programs, seen below, are one answer to that * objection; we will see another in Trees. They are written like * external called programs, but they are in the same source file * with the caller. If they were separately compiled, there would * be additional administrative work to list them on program * inventories, promote them through Change Management, perhaps * list them as dependents in make files. * * Nested programs have their own private working-storage and lack * access to their parent(s)' working-storage unless passed as * a parameter or declared Global. They have been in the ANSI * Cobol Standard since 1985. * * Test 0. Indexed file sort * * Inserts records into an indexed file, then reads them back in sequence. * This is included as a bad example, the wrong way to sort data. * It runs 15 times slower than a Cobol sort. * * Test 1. Cobol file sort * * This is a familiar Cobol sort with input and output procedures, * release and return records. No Features here, it is shown to provide a * timing baseline. * * Test 2. Micro Focus table sort * * Feature: Table sort * * Micro Focus provides an extension allowing you sort a table in memory. * When we time it, we find its speed is almost identical to the file * sort in test 1, and half the speed it 'should' be as shown in test 3. * Apparently, under the covers it is releasing records to and returning * records from the file sort. This was an extension to the 1985 Standard, * is included in the 2002 Standard. * * Test 3. Quick sort * * Feature: Calling C language functions from Cobol * * This test calls qsort() in the Standard C Library, and directs it to * call memcmp(). * * How do you tell Cobol to use the C calling convention? Not necessary, * Micro Focus uses the C convention by default. How do you tell the * linker (loader) to look in C libraries? Not necessary. We stopped * doing static binding (link edit) many years ago. Binding is now done * at execution time by a program named "ld", which follows * LD_LIBRARY_PATH. The C Standard Library is in that path by default. In * short, you don't have to do anything special. You can see libraries * that will be loaded at execution time by typing "ldd sorts". * * C libraries contain a plethora of useful functions. See * http://www.tcnj.edu/~cs/doc/CStdLib.html * also * http://www.bebits.com/app/2917 * also * a hundred others in /usr/lib, many defined in Standards * * The speed of quick sort is said to be proportional to N log2(N), where * N is the number of entries and log2 is the base 2 logarithm. Most * high-speed algorithms run at this speed, for example Shell, Heap and * Comb sorts. Low-speed sorts such as Bubble run at N*N, are suitable * only for small values of N, say less than 100. * * Tests 4 & 5. Radix sort followed by Insertion sort * * Features: The radix algorithm * Linked lists using indexes * * When we had to sort a large number of punched cards, we put them into * approximate sequence by sorting on the first few columns with a card * sorter. Next we put them through a machine called a collator, which * pulled cards out of order, typically 5-10%. We sorted those cards on * the full key, then used a collator to merge them into the big deck. * * Would you believe the card sorting algorithm, written in Cobol, is * five times faster than the best high-speed algorithms written in C, * and about ten times faster than the sorts delivered by Micro Focus? * * The algorithm inspects one 'column' at a time, right to left, tossing * 'cards' into 'pockets' by linking the 'card' to what had previously * been the last in the 'pocket'. At the end of each pass, it 'gathers' * the pocket strings into a single string ('deck') by linking the last * 'card' in a 'pocket' to the first 'card' in the next 'pocket'. * * A 'column' can be any convenient size -- one bit, one byte or one * word. The demo program defines a column as two bytes, 16 bits. This * requires 65,536 'pockets', of which only a few thousand are actually * used. * * When cards are approximately in order, the program 'merges' the ones * out of order into place with an Insertion Sort. As a primary * algorithm, Insertion is slow, close to N*N. But when the entries are * almost in sequence, it is quite fast. It almost always puts them in * order with one pass, then makes a second pass to verify that the job * is complete. It takes a third pass only when there are two consecutive * steps down. In the 50,000 item test, that doesn't happen, it runs in * two passes. If radix had sorted on seven bytes rather than six, they * would have been in perfect order. Insertion would have taken only one * pass. * * The Radix-Insertion algorithm speed is proportional to kN + 2N+, where * k is the number of columns sorted by Radix (in this case 3) and the + * following 2N is close-range rearrangements by Insertion. In THEORY, * the relative speeds of test 3 vs. test 5 should be 50,000 * * log2(50,000) = 785,000 vs. 3*50,000 + 2*50,000+ =~ 300,000. The ratio * is 2.6. In PRACTICE, it is 5.0 times as fast. As usual, the devil is * in the details. If the test program had used subscripts rather than * indexes, its speed would have been cut in half. Not because indexes * are that much faster, because every subscript operation generates a * call to the runtime whereas index operations produce inline code (seen * via the now commented ASM options). * * [If your calculator doesn't have log2, take ln (base e) and divide by .69, * which is ln(2).] * * To compile this "cob -x sorts.cbl". -x produces a Unix executable. * To execute "sorts". * * Continued in trees.cbl * * --------------------------------------------------------------------- $SET SOURCEFORMAT"FREE" $SET NOBOUND OPT"2" NOTRUNC IBMCOMP NOCHECK * $SET ASM ASMLIST SOURCEASM identification division. program-id. Sorts. author. Robert Wagner. * Compare speed of four sorts. * Results: Sun SPARC, 1.8 GHz, 50,000 30-byte entries, time in * seconds for 10 iterations. * --- ratio --- * 2. 3. 4. 5. * 0. Indexed file sort 55.4 14.6 22.2 110 139 * 1. Cobol file sort 3.7 1.0 1.5 7.4 9.3 * 2. Micro Focus table sort 3.8 1.5 7.6 9.5 * 3. qsort() 2.5 5.0 6.3 * 4. radix-insertion .5 1.3 * 5. radix to linked list .4 * * Trees.cbl contains an expanded version of these test results. working-storage section. 01 test-data GLOBAL. 05 radix-columns comp pic s9(02). 05 test-number comp pic s9(02). 05 card occurs 50001 indexed *> the data x-n first-n last-n previous-n-0 previous-n-1 previous-n-2 x-last. 10 sort-key. 15 key-column occurs 15 indexed x-c comp-x pic xx. 10 next-n index. 10 next-n-1 index. 01 init-data. 05 init-columns value zero comp pic s9(02). 05 init-number value zero comp pic s9(02). 05 init-card occurs 50001 indexed i-n i-next i-last. 10 init-key-1 pic v9(18). 10 init-key-2 pic v9(12). 10 init-next index. 10 init-next-1 index. 01 timer-variables. 05 repeat-factor value 10 comp pic s9(09). 05 previous-key pic x(30). 05 card-count comp pic s9(09). 05 start-time. 10 start-hours pic 9(02). 10 start-minutes pic 9(02). 10 start-seconds pic 9(02). 10 start-hundredths pic 9(02). 05 end-time. 10 end-hours pic 9(02). 10 end-minutes pic 9(02). 10 end-seconds pic 9(02). 10 end-hundredths pic 9(02). 05 elapsed-time pic z(04).9. procedure division. perform construct-data display '0. Indexed file sort' with no advancing move 0 to init-number move 2 to repeat-factor perform timer-on perform repeat-factor times perform initialize-test-data call 'indexed-file-sort' end-perform perform timer-off perform verify-sort display '1. Cobol file sort' with no advancing move 1 to init-number move 10 to repeat-factor perform timer-on perform repeat-factor times perform initialize-test-data call 'cobol-sort' end-perform perform timer-off perform verify-sort display '2. Micro Focus sort' with no advancing move 2 to init-number perform timer-on perform repeat-factor times perform initialize-test-data call 'mf-sort' end-perform perform timer-off perform verify-sort display '3. quick sort N log2(N)' with no advancing move 3 to init-number perform timer-on perform repeat-factor times perform initialize-test-data call 'quick-sort' end-perform perform timer-off perform verify-sort display '4. radix-insertion kN + 2N+' with no advancing move 4 to init-number move 3 to init-columns perform timer-on perform repeat-factor times perform initialize-test-data call 'radix-sort' end-perform perform timer-off move zero to init-columns perform verify-sort display '5. radix-insertion to linked list' with no advancing move 5 to init-number move 3 to init-columns perform timer-on perform repeat-factor times perform initialize-test-data call 'radix-sort' end-perform perform timer-off goback . construct-data. set x-last, i-last to 50001 perform varying i-n from 1 by 1 until i-n = i-last compute init-key-1 (i-n) = function random compute init-key-2 (i-n) = function random set i-next to i-n set i-next up by 1 set init-next (i-n) to i-next set init-next-1 (i-n) to i-last end-perform move high-values to init-card (i-last) . initialize-test-data. move init-data to test-data set first-n to 1 . verify-sort. move zero to previous-key, card-count set x-n to first-n if x-n equal to x-last display 'test failed, no first' end-if perform until x-n equal to x-last if sort-key (x-n) less than previous-key display 'test failed ' previous-key space sort-key (x-n) set x-n to x-last else move sort-key (x-n) to previous-key if init-columns not equal to zero set x-n to next-n (x-n) else set x-n up by 1 end-if add 1 to card-count end-if end-perform if card-count not equal to 50000 display 'test failed, count is ' card-count end-if . timer-on. accept start-time from time . timer-off. accept end-time from time compute elapsed-time rounded = ((((((((end-hours * 60) + end-minutes) * 60) + end-seconds) * 100) + end-hundredths) - ((((((start-hours * 60) + start-minutes) * 60) + start-seconds) * 100) + start-hundredths)) / 100) * 10 / repeat-factor display elapsed-time ' seconds'

Report this thread to moderator Post Follow-up to this message
Old Post
PaulWDent
07-31-04 03:49 PM


Sponsored Links




Last Thread Next Thread Next
Search this forum -> 
Post New Thread

Cobol archive

Show a Printable Version Send to friend Email This Page to Someone! subscribe to this thread Receive updates to this thread
Computer Consultants
Programming Jobs
Visual Basic Controls
SQL Server Programming
Webservices
Java Security
Visual Studio
C# Programming
Visual J++
Software engineering
Open source Software
Perl Programming
PHP Programming
ASP Programming
ASP .NET Programming
Visual Basic Programming
Windows Scripting Host
Java Programming
Java Help
Java Beans
VBScript
Cobol
MAC Applications
Unix Programming
Forum Jump:
All times are GMT. The time now is 03:22 AM.

 
Free MCSE Braindumps | Real Estate Topics

Programming forum archive

Copyrights CodeComments.com 2004 - 2006

Powered by vBulletin Copyright 2000-2006 Jelsoft Enterprises Limited.