For Programmers: Free Programming Magazines  


Home > Archive > C > June 2006 > Re: A debugging implementation of malloc









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Re: A debugging implementation of malloc
jacob navia

2006-06-25, 3:56 am

Thanks for the answers.
1) True, I do not consider alignment problems, and that could
be a problem in machines where integers are aligned at some
power of 2 boundary. A work-around would be to just
memcpy (or equivalent) and memcmp (or an equivalent loop)
instead of accessing the signature at the end of the
block as an integer.

2) This code is very old, and I have forgotten that now
free accepts NULL. It was written originally under
windows 3.0 (16 bit OS), and I have used it ever
since.

3) A problem arises when size_t is much bigger than
sizeof int. In 64 bit windows, for instance, this
code would truncate a size_t when the user attempts
to allocate more than 2GB of memory in a single block.

A problem arises then here. What should this routine do with
an allocation request of that size? It could be a legal
request, but it would also be the result of a negative
number being passed to malloc...
Microsoft proposed in the TR 24731 to coin a new type
rsize_t that would be the maximum value of a legal request
for an object, and limited to a signed int, in 32 bit systems
2GB. I think the solution would be along those lines here.

Again, thanks to all people that answered.

jacob
Ian Collins

2006-06-25, 3:56 am

jacob navia wrote:

>
> 3) A problem arises when size_t is much bigger than
> sizeof int. In 64 bit windows, for instance, this
> code would truncate a size_t when the user attempts
> to allocate more than 2GB of memory in a single block.
>
> A problem arises then here. What should this routine do with
> an allocation request of that size? It could be a legal
> request, but it would also be the result of a negative
> number being passed to malloc...


I had to make the same call for the debug allocator I use. I settled
for an assert on the size being <= INT_MAX, assuming a request bigger
than that to more likely be a negative number than a real request.

--
Ian Collins.
Barry Schwarz

2006-06-25, 6:56 pm

On Sun, 25 Jun 2006 10:17:39 +0200, jacob navia
<jacob@jacob.remcomp.fr> wrote:

>Thanks for the answers.
>1) True, I do not consider alignment problems, and that could
> be a problem in machines where integers are aligned at some
> power of 2 boundary. A work-around would be to just
> memcpy (or equivalent) and memcmp (or an equivalent loop)
> instead of accessing the signature at the end of the
> block as an integer.


This doesn't solve the problem that your are returning an address
*only* two int beyond the allocated address.

Two possibilities come to mind:

On the very first call to your function, allocate a small
block (sizeof(long long)+sizeof(long double)). Convert the address to
unsigned long. Determine N, the number of low order zeros in the
result. Instead of using sizeof(int) as your delta, use pow(2,N).
Free the memory and get on with what the user requested.

On the very first call, compute max(sizeof(short),
sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
sizeof(long double), ...), betting on the assumption that the longest
object imposes the strictest alignment.

<snip>


Remove del for email
Eric Sosman

2006-06-25, 6:56 pm

Barry Schwarz wrote:
> [...]
> On the very first call, compute max(sizeof(short),
> sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
> sizeof(long double), ...), betting on the assumption that the longest
> object imposes the strictest alignment.


K&R (now retronymically called "K&R I") showed an easy
way to do this without a run-time computation by using the
size of a union whose elements are all the primitive types
likely to need special alignment. Paraphrasing (because
some of today's types didn't exist then):

union all_kinds {
char c;
short s;
int i;
long l;
long long ll;
intmax_t im;
float f;
double d;
long double ld;
void *vp;
void (*fp)(void);
/* others as seem needful */
};
#define ALIGNSIZE sizeof(union all_kinds)

K&R did not, of course, claim that this was perfect.
Still, it's as effective as Barry Schwartz' suggestion and
doesn't require any special first-call computation.

A refinement is possible, since the sizeof a type may
be greater than the type's required alignment:

struct aligned {
char pad;
union all_kinds u;
};
#define ALIGNSIZE offsetof(struct aligned, u)

This still isn't perfect, but may sometimes turn out to
be more economical.

--
Eric Sosman
esosman@acm-dot-org.invalid
Malcolm

2006-06-25, 6:56 pm




"jacob navia" <jacob@jacob.remcomp.fr> wrote
>
> Microsoft proposed in the TR 24731 to coin a new type
> rsize_t that would be the maximum value of a legal request
> for an object, and limited to a signed int, in 32 bit systems
> 2GB. I think the solution would be along those lines here.
>

And so gibberish types multiply.
Why not just go back to malloc() taking an int?
--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $7.20 paper, available www.lulu.com/bgy1mm


Ian Collins

2006-06-25, 6:56 pm

Malcolm wrote:
> "jacob navia" <jacob@jacob.remcomp.fr> wrote
>
>
> And so gibberish types multiply.
> Why not just go back to malloc() taking an int?


Wouldn't be ideal in a 64 bit system, long would be a better bet. But
there still isn't a guarantee it would be the same size as size_t.

--
Ian Collins.
Keith Thompson

2006-06-25, 6:56 pm

jacob navia <jacob@jacob.remcomp.fr> writes:
> Thanks for the answers.
> 1) True, I do not consider alignment problems, and that could
> be a problem in machines where integers are aligned at some
> power of 2 boundary. A work-around would be to just
> memcpy (or equivalent) and memcmp (or an equivalent loop)
> instead of accessing the signature at the end of the
> block as an integer.


If I remember and understand the code correctly, you allocate two
extra ints at the start of the block, and one at the end, and return
(malloc() + 2 * sizeof(int)). (That's not a valid expression, but you
get the idea.)

If the two ints cause the rest of the block to be misaligned, using
memcpy() to access the int you put at the end of the block will help
only for that int; the user's data will still be potentially
misaligned (depending on what's stored in it).

[...]

> 3) A problem arises when size_t is much bigger than
> sizeof int. In 64 bit windows, for instance, this
> code would truncate a size_t when the user attempts
> to allocate more than 2GB of memory in a single block.


So don't truncate it. Store it as a size_t.

> A problem arises then here. What should this routine do with
> an allocation request of that size? It could be a legal
> request, but it would also be the result of a negative
> number being passed to malloc...
> Microsoft proposed in the TR 24731 to coin a new type
> rsize_t that would be the maximum value of a legal request
> for an object, and limited to a signed int, in 32 bit systems
> 2GB. I think the solution would be along those lines here.


Perhaps, but since rsize_t isn't in the language, there's no portable
way to define it or determine its upper bound.

My advice:

Store the size as a size_t. Don't even think about truncating it. If
the code is intended to be portable, you can't assume that
malloc((size_t)INT_MAX + 1)
or
malloc((size_t)INT_MAX * 2)
is invalid.

Define a constant for the maximum alignment in bytes, and use it
when you decide how much to allocate. For example:
#define MAX_ALIGN (sizeof(long long))

The offset of the user's data in the malloc()ed block would then be
max(MAX_ALIGN, 2 * sizeof(size_t)).

If you want to check for accidental negative values, define a constant
and do an explicit check in your code; treat any attempt to allocate
more than that as an error. For example:
#define MAX_ALLOC (SIZE_MAX / 2)

For systems where some type requires stricter alignment than long
long, or where the maximum sensible allocation is something other than
SIZE_MAX / 2, the constants are easily configured by editing a line or
two in the source (or using whatever configuration system you might
use).

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith Thompson

2006-06-25, 6:56 pm

Barry Schwarz <schwarzb@doezl.net> writes:
> On Sun, 25 Jun 2006 10:17:39 +0200, jacob navia
> <jacob@jacob.remcomp.fr> wrote:
>
>
> This doesn't solve the problem that your are returning an address
> *only* two int beyond the allocated address.
>
> Two possibilities come to mind:
>
> On the very first call to your function, allocate a small
> block (sizeof(long long)+sizeof(long double)). Convert the address to
> unsigned long. Determine N, the number of low order zeros in the
> result. Instead of using sizeof(int) as your delta, use pow(2,N).
> Free the memory and get on with what the user requested.


I've worked on systems where the number of low-order zeros in the
representation of an address did *not* correspond directly to the
alignment in bytes of the address. (Cray vector systems, where
hardware addresses denote 64-bit words, and byte pointers
(CHAR_BIT==8) are implemented in software by storing an offset in the
high-order 3 bits of the pointer.)

> On the very first call, compute max(sizeof(short),
> sizeof(int), sizeof(size_t), sizeof(ptrdiff_t), sizeof(long long),
> sizeof(long double), ...), betting on the assumption that the longest
> object imposes the strictest alignment.


You can also create a union of all those types, then wrap it in a
struct:

struct foo {
char c;
union everthing u;
};

If you've included enough stuff in "union everything", then
offsetof(struct foo, u) will probably be the strictest alignment,
suitable for use by malloc(). If long long is 8 bytes and requires
only 4-byte alignment, this method will give you 4 rather than 8,
which could save some space.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith Thompson

2006-06-25, 6:56 pm

"Malcolm" <regniztar@btinternet.com> writes:
> "jacob navia" <jacob@jacob.remcomp.fr> wrote
> And so gibberish types multiply.
> Why not just go back to malloc() taking an int?


Because it won't always work. There are good reasons for int to be 32
bits even on 64-bit systems. (with 8-bit char, making int 64 bits
would leave a hole in the type system; short would be either 16 or 32
bits). But there's no reason to forbid malloc()ing more than INT_MAX
bytes.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Skarmander

2006-06-26, 3:57 am

Keith Thompson wrote:
> "Malcolm" <regniztar@btinternet.com> writes:
>
> Because it won't always work. There are good reasons for int to be 32
> bits even on 64-bit systems. (with 8-bit char, making int 64 bits
> would leave a hole in the type system; short would be either 16 or 32
> bits). But there's no reason to forbid malloc()ing more than INT_MAX
> bytes.
>

Isn't "int" supposed to be the "natural" integer type for a platform? The
"hole" argument is compelling up to a point, but an implementation can
always provide int_least32_t and int_fast32_t for code that needs such types
(which, for ease of portability, will typically hide these behind typedefs
anyway).

Today's systems properly make the tradeoff, but I can imagine 64-bit
architectures where having a 32-bit int would be silly. ('short' would
indeed be likely to be 32 bits there, and there might be a separate 16-bit
type not covered by the regular types).

S.
pete

2006-06-26, 3:57 am

Skarmander wrote:

> Isn't "int" supposed to be the "natural" integer type for a platform?


By tradition, but not by force of law.

--
pete
Keith Thompson

2006-06-26, 7:56 am

Skarmander <invalid@dontmailme.com> writes:
> Keith Thompson wrote:
> Isn't "int" supposed to be the "natural" integer type for a platform?


Sure, but "natural" is a loose enough concept that the requirement is
unenforceable.

> The "hole" argument is compelling up to a point, but an implementation
> can always provide int_least32_t and int_fast32_t for code that needs
> such types (which, for ease of portability, will typically hide these
> behind typedefs anyway).


Sure, for C99. I don't think C99 has yet influenced the choice
of sizes of predefined types. A lot of C code is written
to be portable to C90 implementations; such code cannot use
<stdint.h>. (Unless it uses Doug Gwyn's q8 implementation,
<http://www.lysator.liu.se/c/q8/index.html> (link currently down).)

> Today's systems properly make the tradeoff, but I can imagine 64-bit
> architectures where having a 32-bit int would be silly. ('short' would
> indeed be likely to be 32 bits there, and there might be a separate
> 16-bit type not covered by the regular types).


I don't have to imagine them. I've used systems with:

char 8 bits
short 32 bits
int 64 bits

and others with

char 8 bits
short 64 bits
int 64 bits

The lack of a 32-bit integer type on the latter was inconvenient at
times. (The implementation was C90 only.)

(If you're curious, the systems were a Cray T3E (based on the Alpha
processor) and a Cray T90 (vector system), respectively.)

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Frederick Gotham

2006-06-26, 7:56 am

Keith Thompson posted:


> and others with
>
> char 8 bits
> short 64 bits
> int 64 bits
>
> The lack of a 32-bit integer type on the latter was inconvenient at
> times. (The implementation was C90 only.)



I'm curious, why is the lack of a 32-Bit integer type inconvenient? It
would seem that the 64-Bit integer types would provide all the
"functionality" you need... unless you're depending on 0 rolling backwards
into 4294967295?


--

Frederick Gotham
Skarmander

2006-06-26, 6:56 pm

Keith Thompson wrote:
> Skarmander <invalid@dontmailme.com> writes:
>
> Sure, but "natural" is a loose enough concept that the requirement is
> unenforceable.
>
>
> Sure, for C99. I don't think C99 has yet influenced the choice
> of sizes of predefined types. A lot of C code is written
> to be portable to C90 implementations; such code cannot use
> <stdint.h>. (Unless it uses Doug Gwyn's q8 implementation,
> <http://www.lysator.liu.se/c/q8/index.html> (link currently down).)
>

Clarify: I meant to imply that implementations are free to provide
nonstandard integer types as an extension. (Like, say, __int32.) Portable
code that needs exactly-sized integer types will usually use its own typedef
sequestered in a header; customizing this is easy.

If implementations offer additional integer types through C99 standard types
that's even better; I was just using the C99 types to unambiguously describe
the kind of types I'm talking about.

>
> I don't have to imagine them. I've used systems with:
>
> char 8 bits
> short 32 bits
> int 64 bits
>
> and others with
>
> char 8 bits
> short 64 bits
> int 64 bits
>
> The lack of a 32-bit integer type on the latter was inconvenient at
> times. (The implementation was C90 only.)
>
> (If you're curious, the systems were a Cray T3E (based on the Alpha
> processor) and a Cray T90 (vector system), respectively.)
>

Thanks for the information. I've only been familiar with the recent x86-64
platforms, which for obvious reasons make very sure to keep 32-bit types around.

S.
Keith Thompson

2006-06-26, 6:56 pm

Frederick Gotham <fgothamNO@SPAM.com> writes:
> Keith Thompson posted:
>
>
>
>
> I'm curious, why is the lack of a 32-Bit integer type inconvenient? It
> would seem that the 64-Bit integer types would provide all the
> "functionality" you need... unless you're depending on 0 rolling backwards
> into 4294967295?


The particular case I'm thinking of involved an externally defined
data format. The code, which was in a system header file, used a
32-bit bit field where most other implementations used an ordinary
struct member. This caused problems for code that took that member's
address.

A workaround was possible, but it was, as I said, inconvenient.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2009 codecomments.com