
Home > Archive > A86 Assembler > January 2005 > Worth using registers instead of memory locations?










Worth using registers instead of memory locations?
W Marsh

2005-01-08, 3:55 pm

Is there a worthwhile speed boost in using registers to hold variables
instead of memory, when the variables will be accessed multiple times
anyway? I imagine that since the variables I am talking about are
static and frequently accessed, they will remain in cache anyway, and
make swapping them into registers a mostly pointless exercise. Am I
right?

Matt

2005-01-09, 3:55 am

"W Marsh" <spamtrap@crayne.org> wrote in message
news:1105190558.444899.54290@f14g2000cwb.googlegroups.com...
> Is there a worthwhile speed boost in using registers to hold variables
> instead of memory, when the variables will be accessed multiple times
> anyway? I imagine that since the variables I am talking about are
> static and frequently accessed, they will remain in cache anyway, and
> make swapping them into registers a mostly pointless exercise. Am I
> right?


Reading from the L1 cache carries a latency penalty of roughly 2-4 cycles,
depending on which x86 processor you're running on. That penalty is paid by
every instruction that reads the variable from memory. If the variable is
accessed only once, then there is no difference in cycles, e.g.:

; Increment some random counter
mov eax, [counter]    ; 3 cycles
inc eax               ; 1 cycle
mov [counter], eax    ; effectively 0 cycles
; Total 4 cycles

inc [counter]         ; 4 cycles
; Total 4 cycles

The latter is more compact and will benefit slightly from that. If you
access the variable twice, things change:

; Increment some random counter twice
mov eax, [counter]    ; 3 cycles
inc eax               ; 1 cycle

; Other stuff goes here; assume the counter stays cached in eax

inc eax               ; 1 cycle
mov [counter], eax    ; effectively 0 cycles
; Total 5 cycles

inc [counter]         ; 4 cycles
inc [counter]         ; 4 cycles
; Total 8 cycles

The latency stacks up very quickly. It will definitely be felt if it falls
on the critical path, that is, when later instructions have to wait for the
result. If not, the latency will be at least partially hidden, and it may
make no difference. The only way to know is to analyze the code in question.
A good heuristic is to ask yourself: how often do I use the variable? If
often, then you probably want to cache it in a register. If rarely, then the
difference in speed won't be large enough to warrant the change.
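
For anyone who would rather measure this than count cycles by hand, here is a minimal C sketch of the same two patterns. The names `counter_mem`, `bump_in_memory` and `bump_in_register` are made up for illustration, and `volatile` is used only as an assumption-laden way to force a real load and store on every access. An optimizing compiler will likely collapse the register loop into a single add, which is really the same point taken to its limit.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter; volatile forces a real load and store on every access. */
static volatile unsigned counter_mem = 0;

/* Memory pattern: every pass is a read-modify-write on counter_mem,
   like repeating "inc [counter]". */
static unsigned bump_in_memory(unsigned passes)
{
    for (unsigned i = 0; i < passes; i++)
        counter_mem = counter_mem + 1;
    return counter_mem;
}

/* Register pattern: load once, work on a local copy, store once,
   like "mov eax, [counter]" ... "mov [counter], eax". */
static unsigned bump_in_register(unsigned passes)
{
    unsigned local = counter_mem;   /* one load from the counter */
    for (unsigned i = 0; i < passes; i++)
        local++;                    /* can stay in a register when optimized */
    counter_mem = local;            /* one store back to the counter */
    return local;
}

/* Both patterns must leave the same value behind; only the cost differs. */
static void check_patterns_agree(void)
{
    counter_mem = 0;
    unsigned a = bump_in_memory(1000);
    counter_mem = 0;
    unsigned b = bump_in_register(1000);
    assert(a == b);
    printf("both patterns give %u\n", a);
}
```

Timing each function over a large `passes` value on the target machine is the only reliable way to see how much of the difference survives in practice.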

-Matt

randyhyde@earthlink.net

2005-01-09, 3:55 am

Keep in mind that instructions that use registers are shorter than
those that access static memory locations, so they take up less
memory and less of your cache.
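
To put rough numbers on this, here is a small C sketch that tallies the code size of the counter example from earlier in the thread. It assumes 32-bit protected-mode encodings with a 32-bit absolute address: `inc eax` is 1 byte (40h), `inc dword [addr]` is 6 bytes (FF 05 plus a 4-byte address), and `mov eax, [addr]` / `mov [addr], eax` are 5 bytes each (A1 / A3 plus the address). The function names are invented for the example, and other modes or addressing forms will give different sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed instruction lengths: 32-bit mode, absolute 32-bit address. */
enum {
    INC_REG_BYTES   = 1,  /* inc eax          : 40           */
    INC_MEM_BYTES   = 6,  /* inc dword [addr] : FF 05 addr32 */
    MOV_LOAD_BYTES  = 5,  /* mov eax, [addr]  : A1 addr32    */
    MOV_STORE_BYTES = 5   /* mov [addr], eax  : A3 addr32    */
};

/* Load once, increment in eax the given number of times, store once. */
static size_t register_variant_bytes(size_t increments)
{
    return MOV_LOAD_BYTES + increments * INC_REG_BYTES + MOV_STORE_BYTES;
}

/* The same number of separate "inc [counter]" instructions. */
static size_t memory_variant_bytes(size_t increments)
{
    return increments * INC_MEM_BYTES;
}

/* Sanity check of the break-even point under the assumptions above. */
static void check_break_even(void)
{
    assert(memory_variant_bytes(1) <  register_variant_bytes(1));
    assert(memory_variant_bytes(2) == register_variant_bytes(2));
    assert(memory_variant_bytes(3) >  register_variant_bytes(3));
}
```

Under these assumptions the memory form is smaller for a single increment (6 bytes against 11), the two tie at 12 bytes for two increments, and from three on the register form wins by 5 more bytes per extra access.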

Also, in practice, you don't get "1 cycle per instruction" when using
memory. Always remember, Intel's timing claims are (overly) optimistic
concerning memory access. They are fairly good when considering
register access.

Cheers,
Randy Hyde

Mark Gibson

2005-01-10, 3:55 am

W Marsh <spamtrap@crayne.org> wrote:
>Is there a worthwhile speed boost in using registers to hold variables
>instead of memory, when the variables will be accessed multiple times
>anyway? I imagine that since the variables I am talking about are
>static and frequently accessed, they will remain in cache anyway, and
>make swapping them into registers a mostly pointless exercise. Am I
>right?


Wrong. On 32- or 64-bit x86 processors, keeping variables that are used and
changed on every pass of an inner loop in registers can beat the hell out of
leaving them in memory and depending on the cache. The problem is that the
x86 architecture does not provide all that many general-purpose registers.
A value in a register can be manipulated directly, while something in
memory (even cache memory) has to be fetched into a register, or written
back from one, before anything can be done with it. That is one reason
pipelining has become so important, but nothing involving memory can be
faster than the equivalent operation involving registers only.

Best regards,
Mark



--
"Most Americans think their leaders should be held to a higher standard,
at least the penitentiary level."


Copyright 2009 codecomments.com