Code Comments
Programming Forum and web based access to our favorite programming groups.Hi, Maybe you have some ideas how to cope with this problem: I'm trying to optimize assembler code for a complex DSP which is an in-order superscalar processor (integer and load/store pipeline). Before it decodes instructions (which might be 16- or 32-bit), they are fetched into a 64bit fetch buffer. Thus, the fetch is aligned to 8byte addresses and in case an instruction at a branch target goes beyond the 8byte-border (misalignment), the processor stalls for additional cycles (extra transfer of control penalties). The DSP supports also an instruction cache which makes things even more complicated since multiple instructions are read from the cache and again they might span over multiple lines leading to extra cycles. My optimizations deal with moving basic blocks (determined by some cost functions) from the slow main memory to a small but fast memory thus allowing a fast access to these particular blocks. However, I have large problems with the "optimized" code. The moved blocks benefit from the faster memory but due to the moving the addresses of the subsequent instructions obviously change. Sometimes it's even sufficient to add one instruction which modifies the address of the following code to get significant runtime changes. The reason are new misaligned jump targets, differently loaded fetch buffers and thus different filling of the superscalar pipeline which might have a positive or negative effect on the total program runtime. Thus, my problem is that I can achieve a local optimization for the moved blocks but the resulting global influence is not predictable and might even undo the benefits and even result in a degraded runtime of the program. How do compiler developers cope with this problem? Are there any approaches which allow to predict the influence of a local code optimization on the global code performance for complex processors? Regards, Tim
Post Follow-up to this messageTim Frink wrote: (snip) > My optimizations deal with moving basic blocks (determined by some > cost functions) from the slow main memory to a small but fast memory > thus allowing a fast access to these particular blocks. However, I > have large problems with the "optimized" code. The moved blocks > benefit from the faster memory but due to the moving the addresses of > the subsequent instructions obviously change. (snip) > Thus, my problem is that I can achieve a local optimization for the > moved blocks but the resulting global influence is not predictable > and might even undo the benefits and even result in a degraded > runtime of the program. > How do compiler developers cope with this problem? Are there any > approaches which allow to predict the influence of a local code > optimization on the global code performance for complex processors? If I understand your question right, the answer is 'dynamic programming'. Dynamic programming, which is neither dynamic nor programming, allows one to do global optimization given weights (cost functions) of the various choices. http://en.wikipedia.org/wiki/Dynamic_programming http://www.ics.uci.edu/~dan/class/1.../6/Dynamic.html First popularized by biologists comparing protein sequences, it was then used by the unix 'diff' program for finding a minimal set of differences between two files. (Again, with the appropriate weighting.) In general, dynamic programming allows one to do in polynomial time what would otherwise seem to take exponential time. The use of dynamic programming for code generation in compilers is described in the book about LCC, http://drhanson.s3.amazonaws.com/st...ents/linux.html and likely in other compiler books. -- glen
Post Follow-up to this messageOn Mar 28, 3:44 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote: > Dynamic programming, which is neither dynamic nor programming, allows > one to do global optimization given weights (cost functions) of the > various choices. >[...] > > First popularized by biologists comparing protein sequences, it was > then used by the unix 'diff' program Biologists first? Naah. Preston
Post Follow-up to this message"preston.briggs@gmail.com" <preston.briggs@gmail.com> writes: > On Mar 28, 3:44 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote: > > Biologists first? Naah. I'll second that, but with a bit more by way of a reference. The basic ideas of dynamic programming can actually be traced far back, long before the name "dynamic programming" or the start of molecular biology. But for the present purpose, it suffices to ask who gave it recogized its importance sufficiently to give it the name "dynamic programming," and when and why. For that, see Richard Bellman on the Birth of Dynamic Programming, by Stuart Dreyfus, http://www.eng.tau.ac.il/~ami/cd/or...-50-01-0048.pdf The short version is that Richard Bellman named the technique at RAND (a military think tank) in fall of 1950, choosing a name that "not even a Congressman could object to." So far as I know, RAND was not studying biology in 1950.
Post Follow-up to this messageMax Hailperin wrote: > "preston.briggs@gmail.com" <preston.briggs@gmail.com> writes: (snip) > I'll second that, but with a bit more by way of a reference. (snip) > The short version is that Richard Bellman named the technique at RAND > (a military think tank) in fall of 1950, choosing a name that "not > even a Congressman could object to." So far as I know, RAND was not > studying biology in 1950. In my post I had a reference to the wikipedia article referencing Richard Bellman and his development of dynamic programming. But how many people read Bellman's paper and started implementing such algorithms in the 1950's? It was Needleman-Wunsch that introduced it to biology, and Smith-Waterman that made it even more popular. The unix 'diff' program dates from the early 1970's: http://en.wikipedia.org/wiki/Diff While Needleman-Wunsch, J Mol Biol. 48(3), was 1970. It is my understanding, though I don't have a direct reference right now, that diff, around 1974, referenced Needleman-Wunsch for the algorithm. Now, how many people have read the Needleman-Wunsch and Smith-Waterman papers compared to Bellman's? Computer science conferences on pattern matching algorithms are dominated by papers on biological applications, and rarely do you see applications to compilers. -- glen [For diff history, see http://www.cs.dartmouth.edu/~doug/diff.ps which does indeed reference Needleman and Wunsch. -John]
Post Follow-up to this messageOn Mar 29, 12:33 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote: > In my post I had a reference to the wikipedia article referencing > Richard Bellman and his development of dynamic programming. But how > many people read Bellman's paper and started implementing such > algorithms in the 1950's? A few. My recollection of Bellman was from Knuth, Art of Programming, Volume 1, Notes on the Exercises, page xvii. The copyright is 1968. Another example of dynamic programming, perhaps more directly relevant to compilers, is the CYK parsing algorithm, from 1965, see http://en.wikipedia.org/wiki/CYK_algorithm > Computer science conferences on pattern matching algorithms are > dominated by papers on biological applications, and rarely do > you see applications to compilers. The days I work for Google, and we do a little pattern matching too. But I guess I don't think of dynamic programming as restricted to pattern matching. The examples from my grad school algorithms book (Aho, Hopcroft, and Ullman, 1974) focus on finding the product of many matrices. And there's Aho and Johnson's paper on optimal evaluation of expressions, from 1975. (The work you cited from LCC descends from Aho and Johnson). And there's the BURS theory, a big favorite of mine. Plenty of good compiler stuff! Preston
Post Follow-up to this messageHi, > > If I understand your question right, the answer is 'dynamic > programming'. Thank you for your answer. I'm not sure if dynamic programming is an approach that I can apply to my problem. When I understand the idea of dynamic programming correctly, it exploits the idea of "overlapping subproblems" and "memoization", i.e. it is assumed that the problem can be divided into independent subproblems which can be solved separately and then their optimal solution can be used to construct the global optimal solution. For my problem with the alignment I could divide the code into smaller subproblems where I could try to find an optimal local solution. However, these subproblems are not independent. When I move some code locally in one place (let's say that's the region of the first subproblem), then this might possibly also influence some following code in another region that I consider as a further subproblem. Thus, calculating separate optimal local solutions and them combine them will not work for me. Do you see another way to cope with my optimization problem? Regards, Tim
Post Follow-up to this message> I'm not sure if dynamic programming is an approach that I can apply to > my problem. When I understand the idea of dynamic programming > correctly, it exploits the idea of "overlapping subproblems" and > "memoization", i.e. it is assumed that the problem can be divided into > independent subproblems which can be solved separately and then their > optimal solution can be used to construct the global optimal solution. > > For my problem with the alignment I could divide the code into smaller > subproblems where I could try to find an optimal local > solution. However, these subproblems are not independent. When I move > some code locally in one place (let's say that's the region of the > first subproblem), then this might possibly also influence some > following code in another region that I consider as a further > subproblem. Thus, calculating separate optimal local solutions and > them combine them will not work for me. Despite our disagreement about the history, I think Glen's intuition is right on. You don't compute just the locally optimal solutions for the subproblems, you compute all the solutions for each subproblem and the cost for each, then consider the all the combinations, picking the one that yields the globally optimal result. It's not always easy to come up with such an approach; that's why the people who do it publish and become famous! Preston
Post Follow-up to this messageTim Frink <plfriko@yahoo.de> writes: >... > I'm not sure if dynamic programming is an approach that I can apply to > my problem [of deciding which blocks to move to faster memory].... From your original posting, it is not clear to me which of the following problems you want to solve (in each case, optimizing the total speed of the program, which is dependent on the addresses of the blocks): (1) For each block, make a yes/no decision on whether to move it to the faster memory. The address of each block within its memory (the slow memory or the fast memory) is equal to the sum of the size of the preceding blocks that were assigned to the same memory. That is, the order of the blocks within each memory remains the same as in the original layout and no padding bytes are introduced. (2) Make the same decisions as problem (1), but if an N-byte block is moved to the fast memory, up to N padding bytes can be left in its place in the slow memory. The actual number of padding bytes is an additional output of the decision process. (3) Solve some other even more general placement problem, where the number of padding bytes is not limited to N, or padding bytes are also allowed in the fast memory, or the blocks can be permuted into a different order. For the moment, I will assume you want to solve problem (1), both because this seems closest to the way you initially stated the problem, and because it is the simplest. For this problem, dynamic programming is indeed a useful technique. Consider working sequentially through the list of blocks, deciding for each one whether to move it to fast memory or not. The optimal decision on whether to move block k does not depend on the specific choices you made regarding each of blocks 1 through k-1. Instead, it just depends on the total size in bytes of the blocks in the range from 1 through k-1 that were selected for fast memory. This total size is what determines all three relevant facts: (1) how many bytes of the fast memory's capacity are still available for blocks k and onward, (2) what the address of block k would be if left in the slow memory, (3) what the address of block k would be if moved to the fast memory. This would lead to a dynamic programming solution using a two-dimensional table, with one dimension indexed by block number and the other dimension indexed by number of bytes. The other piece of guidance I would offer is to look specifically at the literature on dynamic programming solutions to the knapsack problem, rather than just dynamic programming in general. Your problem (or at least what I am taking to be your problem -- number 1 in my list) is a generalization of the 0-1 knapsack problem. Unlike the standard problem, the items are in a sequence rather than a set and the value of each item changes depending on how much of the capacity of the knapsack is filled. However, the standard dynamic programming approach to the knapsack problem still works.
Post Follow-up to this messageTim Frink <plfriko@yahoo.de> writes: > I'm not sure if dynamic programming is an approach that I can apply to > my problem. When I understand the idea of dynamic programming > correctly, it exploits the idea of "overlapping subproblems" and > "memoization", i.e. it is assumed that the problem can be divided into > independent subproblems which can be solved separately and then their > optimal solution can be used to construct the global optimal solution. > > For my problem with the alignment I could divide the code into smaller > subproblems where I could try to find an optimal local > solution. However, these subproblems are not independent. When I move > some code locally in one place (let's say that's the region of the > first subproblem), then this might possibly also influence some > following code in another region that I consider as a further > subproblem. Thus, calculating separate optimal local solutions and > them combine them will not work for me. Actually, that is exactly the kind of problem dynamic programming is designed to solve. The criteria for dynamic programming is only that if you have already picked all the other problems with the value that leads to the optimal solution, then you can pick the value to the last problem that leads to the optimal solution (and so on working backwards). Dynamic programming does not require the choices to be independent (and in fact is designed to solve the case where the choices are interdependent). If the choices are independent, simpler solutions can be used. Thus, dynamic programming is often a backtracking approach, you pick a solution for each of the problems in some order (often a greedy algorithm is used to come up with the initial solution) and compute the final score. Then, you revisit the items in the opposite order you originally picked them and try a different value for each one, until you have the best result. It is useful to think of the choices as a tree, where each choice made constrains the other subsequent choices (i.e. when you put one block in a regions of memory, you are then constrained not to put another block in that same memory), and you are visiting the tree of all possible choices in an organized fashion. Now, as I recall (and I haven't done any significant dynamic programming recently), there usually is a method to prune unfruitful subtree explorations, i.e. a way to determine if changing the value of some value cannot possibly lead to a better solution than the one that has already been calculated, but that may or may not be a required feature of the method. It is worth noting, that because of the required backtracking, dynamic programming solutions usually grow exponentially with input problem size. That is, if your problem adds one more binary decision to the previous problem, the new problem takes roughly twice as long to solve, because you have doubled the size of tree you have to explore to find the correct solution. Hope this helps, -Chris **************************************** ************************************ ** Chris Clark Internet: christopher.f.clark@compiler-resources.co m Compiler Resources, Inc. or: compres@world.std.com 23 Bailey Rd Web Site: http://world.std.com/~compres Berlin, MA 01503 voice: (508) 435-5016 USA fax: (978) 838-0263 (24 hours) ---------------------------------------------------------------------------- --
Post Follow-up to this message
Show a Printable Version
Email This Page to Someone!
Receive updates to this thread
Powered by vBulletin
Copyright 2000-2006 Jelsoft Enterprises Limited.