Code Comments
Apparently, C# does not allow you to return a reference to a value
type. In practical terms this means that you cannot modify the fields
of a struct-based property. For example:
public class Thing
{
    private Point _pos; // Point is a struct, a value type
    public Point Position
    {
        get { return _pos; } // returns a _copy_ of _pos. Not what I had in mind.
        set { _pos = value; }
    }
}
Later...
Thing t = new Thing();
t.Position.X = 100; // compiler error CS1612: cannot modify return value
To add insult to injury, the program also incurs a slight performance
hit for allocating a temporary copy of the structure each time the
Position property is invoked. Am I right about this?
Maybe I'm missing something, but I just cannot see the purpose of such
a restriction. Where is the benefit of not allowing references to value
types?
Thanks,
Aleko
Because they're value types? :)

You're meant to think about value types as though they were ints or doubles. You don't go into an int and fiddle with the third bit of _that particular int_. In fact, the concept of "that particular int" doesn't even make any sense. ints don't have identities: there is no "this copy of the value 5" versus "that copy of the value 5". There's just 5.

Of course, there are _variables_ that store ints, and you can modify them. That is, in a nutshell, the point: you modify variables containing values; you don't modify the values themselves.

And yes, whenever you return an int from a property, you get a copy, not the original int. Allowing you to get a reference leads to C / C++, meaning unsafe code:

    myClassObject.IntProperty = 0;
    int *x = &myClassObject.IntProperty;
    *x = 5;

I don't think I need to go into the problems that this causes: that *x = 5 bypasses any code in set_IntProperty designed to safeguard encapsulation, etc. etc.

So, what does this have to do with Point? Everything! As I said, Point is a value type, and you're meant to think of it in the same way as you think of an int or a double. You don't change ints: you calculate new ones and set variables / properties to new values. Similarly, you don't change Points: you calculate new ones and set variables / properties to new values.

Yes, Points get copied onto the stack when returned from a property, just as ints and doubles do. Think about what would happen if it were not so:

    myClassObject.PointProperty.X = 0;

bypasses all of the code in set_PointProperty designed to encapsulate the state of myClass. What if set_PointProperty takes great pains to ensure that the point never has an X of 0? Too bad: it was set through the "back door".

The way you solve this problem with reference types (classes) is with events: if you had a Point class, it would have to raise XChanged and YChanged events so that myClass could intercept changes and throw exceptions as appropriate. Ouch! Talk about overhead!

No, when dealing with value types, you have to set properties like this:

    myClass.PointProperty = new Point(100, myClass.PointProperty.Y);

Points are really small (64 bits), so pushing one on the stack is no big deal. In fact, that's one of the recommendations for making structs: keep them small, because they're copied all over the place, just as ints and doubles are.

The only issue I have with the way that structs are implemented in .NET is that many common structs, like Point, have set accessors on some of their properties. Personally, I think that this is utterly ridiculous. It really doesn't buy you much to be able to say:

    Point p = new Point(0, 0);
    p.X = 50;

instead of:

    Point p = new Point(0, 0);
    p = new Point(50, p.Y);

and the former syntax only works if p is a variable, not, as you pointed out, if it's a property. Neither does it work if the Point is in an aggregate structure:

    Hashtable h = new Hashtable();
    h["Point"] = new Point(0, 0);
    ((Point)h["Point"]).X = 50;

does not update the entry, because the cast unboxes a copy of the Point stored in the hash table, and the assignment targets that copy rather than the entry itself (the compiler rejects the last line for exactly that reason).

So, as I said, I think that set accessors on struct properties are stupid. They just lead people to the false assumption that they can treat structs as they treat classes, which isn't true. When I make my own structs, I always make them immutable: if you want to change one of my value types, you either have to "new" a new one, or call some method on one that returns a new value. A silly example for Point would be:

    Point p = new Point(0, 0);
    p = p.ChangeX(50);

This would always work, no matter what the situation. Sadly, they didn't implement Point this way. I think they should have.
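The immutable-struct idea in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ImmutablePoint`, `Widget`, and `ImmutablePointDemo` are hypothetical names (the real `System.Drawing.Point` has no `ChangeX` method), and the `readonly struct` syntax is from C# versions later than the ones discussed in this thread.

```csharp
using System;
using System.Collections;

// Hypothetical immutable stand-in for System.Drawing.Point (not a real library type).
public readonly struct ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Returns a new value instead of mutating this one.
    public ImmutablePoint ChangeX(int newX)
    {
        return new ImmutablePoint(newX, Y);
    }
}

// Hypothetical class exposing the struct through a property.
public class Widget
{
    public ImmutablePoint Position { get; set; }
}

public static class ImmutablePointDemo
{
    public static int Run()
    {
        // Through a property: read a copy, compute a new value, assign it back.
        var w = new Widget();
        w.Position = w.Position.ChangeX(100);

        // Through a boxed entry in a Hashtable: same pattern, plus a cast.
        var h = new Hashtable();
        h["Point"] = new ImmutablePoint(0, 0);
        h["Point"] = ((ImmutablePoint)h["Point"]).ChangeX(50);

        return w.Position.X + ((ImmutablePoint)h["Point"]).X;
    }
}
```

Because every "change" is an assignment of a whole new value, the same line works for a local variable, a property, and a collection entry, and any validation in the property's set accessor still runs.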
>> Where is the benefit of not allowing references to value types?
> Because they're value types? :)

I know they are, but what's wrong with returning a reference to a value type? You can /pass in/ a value type by reference to a function (by using the ref keyword), so why shouldn't you be able to pass it out?

> It really doesn't buy you much to be able to say:
> Point p = new Point(0, 0);
> p.X = 50;
> instead of:
> Point p = new Point(0, 0);
> p = new Point(50, p.Y);

Hmm, I must disagree with this one. Assigning a value to a field is many times cheaper than dynamically allocating a new copy of the struct, copying _all the fields_, and finally having the garbage collector clean up the copy. That's like buying a new walkman every time the battery runs out, instead of just replacing the battery. :)

> So, as I said, I think that set accessors on struct properties are
> stupid. They just lead people to the false assumption that they can
> treat structs as they treat classes, which isn't true.

I agree, which is why I think struct fields should be public. Structs are typically used when performance is an issue, so information hiding in this case is not really a concern. By extension, struct-based class properties may also be publicly exposed without guilt, because they are, well, value types. You can't do something dumb, like forget to allocate them, or assign null to them. The worst you can do is assign some nonsensical value to a field. (If that's an issue, hide the fields behind accessors.)

I suppose OO purists will stone me for suggesting violating the golden rules, but the alternative is worse. I tried re-writing the Size struct as a class (with private fields, and accessors) so I could return it by value, and then change its properties. It worked fine, but I had to write an extra class, and I don't think I gained much by doing that. Purity comes at a cost, and sometimes it's not worth it.

Regards,
Aleko
> Structs are typically used when performance is an issue

No! No! I blame Microsoft for this misconception. They have examples of doing this on MSDN and all. It's a terrible design decision. Yes, in a few obscure cases you can gain something by using structs instead of classes. However, the cases are, as I said, obscure... and rare. As you have pointed out, using structs instead of classes causes lots of copying, which can lead to performance _degradation_... unless you pay very, very close attention to what you're doing.

The source of all of the confusion is the idea that structs are somehow "classes lite", and that you should press them into service when you need screaming performance. Of course, every newbie out there immediately starts making everything a "struct" so that their app will "run faster", then wonders why everything is so screwy. Aaargh. (Sorry... this is a pet peeve of mine, doubly so because Microsoft seems to endorse this sort of silliness. :)

Those few obscure and rare cases aside, structs should be employed in only one situation: when you want _value semantics_. That is, when you _want_ something that is copied rather than handled by reference. When you want it to act like a _value_.

Here are a couple of examples from my coding: a Fraction, and a Measure. A Fraction is just what it looks like: it's a type capable of holding a whole part, a numerator, and a denominator. It acts like any other number, and that's why it's a struct. I absolutely _don't_ want to assign a Fraction to some variable, have that variable say fracVar.Numerator = 3, and have the fraction change in that variable _and_ the variable it was assigned from. That would be horrific. I want the thing to act like a _value_: like an int or a double or a decimal. (This is, incidentally, precisely why my Fraction has no set accessors, so the snippet of code above isn't even legal in my world.)

Similarly, a Measure is a quantity that has a unit of measure (a reference type, in my world) tagging along with it. Again, I want it to act like an int or a decimal. I _absolutely don't_ want reference semantics here. I want value semantics.

IMHO there is a huge problem in the .NET community, wherein programmers try to press structs into service as a sort of "class lite" to increase performance, without really understanding what they're doing and thus creating unnecessary headaches for themselves. structs are a valuable part of the language. As I said, I've used them in a few places _where they make sense_ and they're wonderful just the way they are. It's when you try to use them inappropriately that they start acting up. (The only exception being what you ran into: putting Points and Sizes into an aggregate structure is a perfectly reasonable thing to do. Then wondering why you can't work with them as though they were reference types is also perfectly reasonable. The problem is not that they don't act like reference types. The problem is that somebody thought it was clever to make Point and Size _mutable_, which they shouldn't be.)

> finally having the garbage collector clean up the copy

This is inaccurate. The garbage collector does not have to clean up the copy: the copy must either live on the stack (no garbage collection there), or within a reference type which would have had to have been garbage collected anyway. The only time the garbage collector gets involved with value types is if you box them, which effectively creates a reference type to hold the value.

This is probably why Microsoft chose to make Point and Size into value types: precisely to avoid garbage collection of a bazillion little Points and Sizes when code starts doing mathematics with them. _This_ is the sense in which structs are "faster": they require no GC, and they require no heap allocation.

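A rough sketch of the value semantics and boxing behaviour described above. The `Fraction` here is a simplified, hypothetical stand-in (numerator and denominator only, no whole part, no reduction to lowest terms), not the poster's actual type, and `ValueSemanticsDemo` is likewise an illustrative name.

```csharp
using System;

// Hypothetical, simplified Fraction: no set accessors, so it behaves like a number.
public readonly struct Fraction
{
    public long Numerator { get; }
    public long Denominator { get; }

    public Fraction(long numerator, long denominator)
    {
        if (denominator == 0)
            throw new DivideByZeroException("Denominator must be non-zero.");
        Numerator = numerator;
        Denominator = denominator;
    }

    // Arithmetic produces a new value; the result is not reduced in this sketch.
    public Fraction Add(Fraction other)
    {
        return new Fraction(
            Numerator * other.Denominator + other.Numerator * Denominator,
            Denominator * other.Denominator);
    }
}

public static class ValueSemanticsDemo
{
    // True if assigning a struct copies it, leaving the original untouched.
    public static bool CopyIsIndependent()
    {
        Fraction a = new Fraction(1, 2);
        Fraction b = a;                   // b is a copy of a
        b = b.Add(new Fraction(1, 4));    // b becomes 6/8; a is still 1/2
        return a.Numerator == 1 && a.Denominator == 2
            && b.Numerator == 6 && b.Denominator == 8;
    }

    // True if each boxing operation produces a distinct heap object.
    public static bool BoxingAllocatesSeparately()
    {
        Fraction a = new Fraction(1, 2);
        object first = a;   // boxed copy on the heap
        object second = a;  // a second, separate boxed copy
        return !ReferenceEquals(first, second);
    }
}
```

The unboxed `a` and `b` never involve the garbage collector; only the two boxed copies in `BoxingAllocatesSeparately` are heap objects that it eventually has to reclaim.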
Constructing a new Point is, at run time, no more expensive than constructing a new double, and just like a double there is no garbage collection required. It's a questionable design: I'm not sure that the savings in GC and allocation time for Points and Sizes is worth the confusion it's caused amongst programmers. The fact that MSDN contains an example of using a struct for a _customer record_ doesn't help matters either (grrr...).

> ...which is why I think struct fields should be public.

... which is useful only if you're using them for what I would claim is the "wrong purpose". If you use structs to implement new kinds of _values_, then there's no need to go breaking encapsulation in an attempt to make them act like reference types. Again, you could argue that Points and Sizes as structs is a dodgy implementation, and I would agree. However, if you're making your own structs, and you make all of the fields public and start trying to use them as though they were reference types, I would claim that you're pushing on a rope. :)

As well, even doing this won't help your hash table situation: even if all of Point's fields were public, you still couldn't do the assignment directly into the value that is boxed and stored in the hash table. Value semantics would still bite you back. :)

> Purity comes at a cost, and sometimes it's not worth it.

Absolutely true, and sometimes I decide to abandon clean code for other benefits. Just be sure, however, that you're not abandoning what you call "purity" because you're using the wrong construct for what you want to do.

The only "bad" thing about structs is that in aggregates (pre v2.0) you have to say:

    Point oldPoint = (Point)myHash["Key"];
    myHash["Key"] = new Point(oldPoint.X, newY);

instead of

    ((Point)myHash["Key"]).Y = newY;

It has nothing to do with performance... it's all about how much typing is required. I agree: the source code is ugly, and v2.0 won't be much better:

    myHash["Key"] = new Point(myHash["Key"].X, newY);

But IMHO it's not worth abandoning or warping the feature because in one situation it doesn't "play nice", when it's so useful in other contexts.
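For the v2.0 case mentioned above, a minimal sketch using the generic `Dictionary<TKey, TValue>` might look like this. `Pt` is a hypothetical mutable struct standing in for `System.Drawing.Point` (so the snippet has no dependency on System.Drawing), and `DictionaryDemo` is an illustrative name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable struct with public fields, standing in for Point.
public struct Pt
{
    public int X;
    public int Y;

    public Pt(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class DictionaryDemo
{
    public static Pt Run()
    {
        var points = new Dictionary<string, Pt>();
        points["Key"] = new Pt(1, 2);

        // points["Key"].Y = 7;
        // The line above would not compile (CS1612): the indexer returns a copy,
        // even though Pt's fields are public and no boxing is involved.

        // Read a copy, build the new value, store it back.
        Pt old = points["Key"];
        points["Key"] = new Pt(old.X, 7);

        return points["Key"];
    }
}
```

Generics remove the cast and the boxing, but the read-modify-write pattern stays the same, which matches the point that public fields alone do not make a stored struct editable in place.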