Code Comments
Programming Forum and web based access to our favorite programming groups.I am trying to understand the mindset prevalent during the advent of high-level programming languages, especially regarding compilers and run-time evaluation. I hope someone can shed some historical insight. Firstly, back when everything was done in pure machine code or assembly, how common was the use of self-modifying code? Was it only used for things like inlined loop counters, or was there anything like run-time generation of non-trivial code? In a word, was it seen as a useful thing that a von Neumann architecture general-purpose microprocessor really was a universal Turing machine that could be reprogrammed on the fly? When batch compilers became popular, such flexibility was obviously lost: you couldn't generate new Fortran code on the fly from your original Fortran program. Was this ever seen as a problem? Interpreters, of course, easily support run-time code generation simply by allowing the interpreter to be called from within the program. Lisp certainly supported eval from day one. But did McCarthy's team _invent_ the idea of reading in high-level code at run-time and then interpreting it (instead of compiling it to machine code beforehand), or was interpretation an older idea? Of course UTMs and universal functions had been known, but when was it realized that the same idea could be applied in practice? I'm sorry about the vagueness of these questions. Any remarks or references relating to the subject are appreciated. Lauri Alanko la@iki.fi [Back in the 1950s, people used every coding trick they could to squeeze programs into tiny memories, including all sorts of self-modifying code. For a famous, probably apochryphal, story that captures the way that people programmed then, see http://catb.org/~esr/jargon/html/story-of-mel.html There were lots of interpreters in the 1950s, typically running a machine-like code that was higher level than the real machine code, e.g., with floating point and index registers that the underlying machine didn't have. Those survived into the 1960s. The standard float package on the PDP-8 was an interpreter that had instructions with the same address formats as the real instructions so you could use the regular assembler to code them by renaming the opcodes. I think that the idea of having an external representation that mapped to and from an internal representation that could be interpreted was new to Lisp. For a long time it was seen as an interim hack until Lisp 2 with a nicer syntax was ready. -John]
Post Follow-up to this messageLauri Alanko wrote: > Firstly, back when everything was done in pure machine code or > assembly, how common was the use of self-modifying code? Very common. For example, the IBM 7094 had a whole class of instructions explicitly for writing self-modifying code -- instructions to load and store the opcode portion of words (instructions), and the register field, and the address field. There is a very nice paper by Ken Thompson on regular expression matching that generates a state machine on the fly on the 7094. Each chunk of code is a state. (It was granted US Patent 3,568,156, entitled "Text Matching Algorithm," filed 9 Aug 1967, issued 2 Mar 1971, inventor Ken Thompson.) -- ------------------------------------------------------------------------- = Jeff Kenton Consulting and software development = = http://home.comcast.net/~jeffrey.kenton =
Post Follow-up to this messageLauri Alanko wrote:
>
> I am trying to understand the mindset prevalent during the advent of
> high-level programming languages, especially regarding compilers and
> run-time evaluation. I hope someone can shed some historical insight.
Some mindsets that come to mind (reading over this list, I can't really
distinguish between mindset and environment):
1. Compilers were thought of as "Automatic Programming". An early
report on Fortran, 1957, was "The Fortran Automatic Coding System"
2. There was only one criteria for good vs. bad compilers: efficiency of
the generated code.
3. A mindset of limited resources; useful, economic, results were
produced on machines with memories so small that multiplier acronyms
("K" for example) were not needed in size specifications. And schedules
were in months, not years.
4. A computer could be a human being (who operated a Frieden or other
calculator). You can find books of mathematical tables where the
computers - the names of people - are listed on the title page.
5. Users with problems requiring computer solutions took their problem
to a programmer. Only after high-level languages "advented" (in
particular, FORTRAN) did it become COMMON for users to write their own
programs.
6. Cascading compilers. For example, FORTRANSIT (IBM 650) input was
FORTRAN, the compiler output was an IT program. The IT compiler
produced SOAP (the 650s assembly language) output. The SOAP processor
produced your object deck.
7. Languages were sometimes designed/implemented by people trying to
solve their own problems. IPL, LISP being examples.
> Firstly, back when everything was done in pure machine code or
> assembly, how common was the use of self-modifying code? ...
See "The Preparation of Programs for an Electronic Digital Computer",
Wilkes, Wheeler, & Gill, 1951, page 8, "Modification of orders by the
program".
> Was it only used for things like inlined loop counters, or was there
> anything like run-time generation of non-trivial code? In a word,
> was it seen as a useful thing that a von Neumann architecture
> general-purpose microprocessor really was a universal Turing machine
> that could be reprogrammed on the fly?
> When batch compilers became popular, such flexibility was obviously
> lost: you couldn't generate new Fortran code on the fly from your
> original Fortran program. Was this ever seen as a problem?
Well, it wasn't seen as a problem because you could do it if you wanted
to (I assume that few people wanted to! - another mindset thing). Not
in complete generality, but you are asking about the "advent" - when
things got started. Look up FORMAC (Sammet). I think you do things
like (where "set" assigns the expression, not the value of the
expression. Set is not the correct FORMAC keyword, but it will do for
this example)
set v1 = a + b
set v2 = c + d
v3 = v1 * v2
v3 would then have the value ac + ad + bc + bd
There must have been an evaluate function.
> Interpreters, of course, easily support run-time code generation
> simply by allowing the interpreter to be called from within the
> program. Lisp certainly supported eval from day one. But did
> McCarthy's team _invent_ the idea of reading in high-level code at
> run-time and then interpreting it (instead of compiling it to machine
> code beforehand), or was interpretation an older idea? Of course UTMs
> and universal functions had been known, but when was it realized that
> the same idea could be applied in practice?
Interpreters have "always" been here. Well, if not from day 1,
certainly from day 2.
Lookup "Speedcoding" for the IBM 701, about 1953. Or "Short Code" for
the UNIVAC. Or Laning/Zierler.
Fortran IV could "read in high-level code at run-time and then interpret
it". Only FORMAT specifications, but exactly the function you asked
about.
I don't know if this functionality was in any Fortran II implementations
or not, nor do I have dates. Comparable to LISP? Of course not, but you
asked about beginnings.
There was a language TRAC (Deutsch?) for the PDP-1 where the a program
could modify itself (I think).
Post Follow-up to this messageLauri Alanko wrote: (snip) > Firstly, back when everything was done in pure machine code or > assembly, how common was the use of self-modifying code? (snip) The IBM OS/360 Fortran library does it. I don't believe I know of any compilers that generate self modifying code, though. The IBM 360/91, one of the earlier machines doing out of order execution, had special logic to detect modified instructions that had been previously fetched. IBM, as of OS/360, has three attributes they apply to load modules (executable code files). Serially reusable, reentrant, and refreshable. Reentrant and refreshable modules usually should not be self modifying, and later OS would store then in read only page frames. Fortran library routines were serially reusable, but not reentrant or refreshable. Also, OS/360 channel programs were often self modifying. -- glen [The IBM 360 architecture specifically permits instruction modification without extra serialization. I gather it's still a pain for hardware implementers, although it's now much more comment to do an EXecute instruction to run one instruction created at runtime. -John]
Post Follow-up to this messageLauri Alanko wrote: > > Firstly, back when everything was done in pure machine code or > assembly, how common was the use of self-modifying code? The first couple of machines I programmed (IBM 1620, IBM 1401) had no address or index registers of any kind. So self-modifying code was the only possible way to write things like loops going through arrays. You could avoid getting hopelessly bogged down in tar, by working out a couple of what would today be called design patterns and then following them downright slavishly. Later, the university paid extra for a 1620 "special feature" of indirect addressing, which made it immeasurably easier. By the time I discovered linked data structure, It was on an IBM 1410, which, as I recall, had 3 index registers.
Post Follow-up to this messageOn 4 Aug 2004 02:44:48 -0400, Jeff Kenton <Jeffrey.Kenton@comcast.net> wrote: > Lauri Alanko wrote: > > > Very common. ... Indeed, I think in the /very/ early days of computing, when no machine had more than one or two KB of memory*, it was generally necessary for any practical program to do a lot of self-modification, as well as a great deal of the loading of segments of code from the drum*. It became a considerable art, squeezing functionality out of code that could fit into such a tiny amount of memory. Unfortunately, it also caused an assumption in the industry, one that lasted far too long, that such self-modification was an acceptable technique. Originally, the COBOL language had an ALTER x TO PROCEED TO y statement, which modified a GOTO statement (itself labelled x), changing its destination from whatever it was before to y. As you can imagine, programmers could -- and generally did -- achieve the most spectacularly obfuscated code with this statement. It became quite notorious, and quickly became deprecated and then removed from compilers altogether. Unfortunately, I think things have swung too far the other way, in that even the humble GOTO is now considered verboten. Silly. -- Nick Roberts *Memory those days was 'core' memory, a three-dimensional wire grid with a magnetic 'core' (a ring made from ferrite) around each triple intersection. Each core stored one bit. It was a truly daft piece of technology (by today's standards), but a fascinating work of art to look at.** *In those days, the fastest secondary storage device was a 'drum', which was a rotating drum covered in magnetic material (like a disk), with a read/write head that scanned from end to end. The last generations of drums had multiple heads, and were very fast.*** [** - The more modern ones had core. Some of the earlier ones used Williams tubes, CRTs that stored data (very unreliably) by the presence or absence of spots on the CRT, or mercury tank or magnetorestrictive delay lines which worked better but were hard to program since you wanted to minimize the amount of time the comptuter spent waiting for words to come around. Some computers like the Electrodata 205, LGP 30, and IBM 650 ran straight off the drum. Core was by comparison pretty simple, although you had do do strange things to make big fast core banks, like sitting them in tanks ofing oil in the IBM 7094. Typical core banks were 2 1/2 D (X, Y, and sense wire) rather than 3D. *** - Yup. The 650's drum in the mid 1950s spun at about 12K rpm, a speed that disks have only recently matched. -John]
Post Follow-up to this messageDick Weaver <rweaver@ix.netcom.com> wrote: >Some mindsets that come to mind (reading over this list, I can't really >distinguish between mindset and environment): > >1. Compilers were thought of as "Automatic Programming". An early >report on Fortran, 1957, was "The Fortran Automatic Coding System" Only in the sense of "automatic macro expansion". Remember that was in the days when analysts wrote flow-charts and programmers coded them into assembler. >2. There was only one criterion for good vs. bad compilers: efficiency of >the generated code. Absolutely NOT! Certainly by the early 1960s, the concept of debugging compilers existed, which were expected to have thorough diagnostics, insert good checking and compile very fast. What happened to that concept, I wonder? :-( Even in the early 1950s, the memory and time requirements of a compiler were an issue. I should have to ask whether anyone can remember the relevant priorities. > 5. Users with problems requiring computer solutions took their >problem to a programmer. Only after high-level languages "advented" >(in particular, FORTRAN) did it become COMMON for users to write their own >programs. Not in Cambridge! Here, users programmed in machine code and autocode before Fortran was invented. EDSAC I was the first practical general purpose computer, and EDSAC II was used by a wide spectrum of the university (including Arts and Humanities). Of course, I am relying on my seniors' claims, as as a mere stripling of 56 :-) >6. Cascading compilers. For example, FORTRANSIT (IBM 650) input was >FORTRAN, the compiler output was an IT program. The IT compiler >produced SOAP (the 650s assembly language) output. The SOAP processor >produced your object deck. Yup. IAL on Titan - not a bad assembler. Typical optimising compilers had half a dozen passes, often implemented as separate programs, and some of them had external specifications. >7. Languages were sometimes designed/implemented by people trying to >solve their own problems. IPL, LISP being examples. Aren't they still? Perl, Python, Java .... Regards, Nick Maclaren. [The original Fortran computer emitted extremely good code because the conventional wisdom was that compiled scientific code would be too slow to be useful. I gather that not long after it was shipped, they added an option to compile faster by skipping some of the optimization and generate worse code, which everyone always used. Code speed wasn't a big issue for Cobol, since commercial programs are invariably I/O bound, but I don't know what the adoption issues were. -John]
Post Follow-up to this messageDick Weaver wrote:
> There was a language TRAC (Deutsch?) for the PDP-1 where the a program
> could modify itself (I think).
This isn't about TRAC, but (1967 or so) I spent a month trying to
understand the core of the Thor timesharing system for the PDP-1
written in PDP-1 assembler. I understood it all except for one place
where, in the code, in one location either of two data words would be
stored. One word was a "SZA" ("skip on zero accumulator") and the
other was a "SNA" ("skip on non-zero accumulator"). Why would anyone
modify a skip instruction? It was explained to me later that the code
had to be self-modifying. I swore that I would never write
self-modifying code, and I have kept my promise for 37 years. (And I
remember quite clearly who wrote that code, but I bet he doesn't
remember doing it.)
Klaus Wirth used to go off on self-modifying code; he
considered it a terrible evil.
john slimick
slimick@pitt.edu
[I missed the reference to Trac, a language in which I've probably
written more code than anyone. It's a strictly interpretive macro
expansion language somewhat similar to Unix m4, so just about anything
you write is by some standard self-modifying code. The code I wrote
didn't do much self-modifying beyond putting commas into arguments to
recursive routines to implement variable length argument lists and
poor man's structures. -John]
Post Follow-up to this messageOn Thursday 05 Aug 2004 7:27 pm, you wrote: > [The IBM 360 architecture specifically permits instruction modification > without extra serialization. I gather it's still a pain for hardware > implementers, although it's now much more common to do an EXecute > instruction to run one instruction created at runtime. -John] Self-modifying code is common in IBM assembler code in daily use today. The most common form is a NOP modified to a branch instruction, for example: LABEL1 NOP LABEL2 OI LABEL1+1,X'FO' The Or-Immediate changes the second byte of the NOP: converting the instruction from a "branch never" to a "branch always". You save one bit of memory and a whole compare instruction! The EXecute instruction loads the target instruction into memory, overwrites part of it with the contents of a register, and executes the result. It is most often used to produce a variable length move. But I have seen EXecute used as a sort of one-instruction subroutine call: EX is only four bytes, so if you use EX to EXecute a six byte instruction, then you are saving two bytes! -- Martin Martin.Ward@durham.ac.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4 [Eeeww. I didn't even do that back in 1970. -John]
Post Follow-up to this messageNick Maclaren wrote: > Absolutely NOT! Certainly by the early 1960s, the concept of > debugging compilers existed, which were expected to have thorough > diagnostics, insert good checking and compile very fast. What > happened to that concept, I wonder? :-( In the 1960s, few compiled languages had features such as open ended arrays, ability to create a pointer to any variable and similar features. It was certainly possible to create a compiler that checked for almost all access violations and range violations. This is no longer the case. In fact, the trend is away from such checkable code. Fortran 90 includes the ability to point to arbitrary variables. With all of its other faults, at least Fortran was checkable. -- Samiam is Scott A. Moore Personal web site: http:/www.moorecad.com/scott ISO 7185 Standard Pascal web site: http://www.moorecad.com/standardpascal [Where's PL/I when we need it? -John]
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.