Monday, November 9, 2015

Reference Types, Value Types, and Equality

In many places, we don't need to worry about the difference between value types and reference types. But one place we do see a difference is how equality is handled by default. I've had a couple of questions and comments about this recently, so let's take a closer look.

The Question
The questions came from my video series on Lambdas and LINQ (video playlist). There's also a print article available for download here: Learn to Love Lambdas (and LINQ, Too!).

Here's the code we're looking at:


This code refreshes the data in a list box by calling into a library that uses the Event Asynchronous Pattern (EAP). There's a particular piece of functionality we care about:
  1. Before refreshing the list box, we save off a copy of the currently selected item.
  2. After refreshing the list box, we try to set the selected item based on our saved value.
At the top of the method, we save off the selected item:


And then inside our lambda expression, we try to set the selected item in the list box:

Set Selected Item in List Box
The object that we have here (the "Person" class) does not have a primary key, so we compare the first name and last name properties -- this definitely isn't ideal, but it works for our small data set.
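The original listing isn't reproduced here, so here's a minimal sketch of the idea. It assumes a simplified "Person" with "FirstName" and "LastName" properties and uses a plain list in place of the list box; "FindByName" and "SelectionDemo" are hypothetical names for illustration, not the code from the video series.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's Person class.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class SelectionDemo
{
    // Returns the item in the refreshed list whose name matches the saved
    // person, or null when nothing was selected or no match exists.
    public static Person FindByName(IEnumerable<Person> refreshed, Person saved)
    {
        if (saved == null) return null;
        return refreshed.FirstOrDefault(p =>
            p.FirstName == saved.FirstName && p.LastName == saved.LastName);
    }

    public static void Main()
    {
        // The item we saved off before the refresh.
        var saved = new Person { FirstName = "Isaac", LastName = "Gampu" };

        // A refresh produces brand-new Person instances.
        var refreshed = new List<Person>
        {
            new Person { FirstName = "John", LastName = "Sheridan" },
            new Person { FirstName = "Isaac", LastName = "Gampu" },
        };

        Person match = FindByName(refreshed, saved);
        Console.WriteLine(match != null);                        // True
        Console.WriteLine(ReferenceEquals(match, refreshed[1])); // True
    }
}
```

In the real application, the matching item would be assigned to the list box's "SelectedItem" property inside the lambda expression.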

With this code in place, the selection will be saved between refreshes. So, if we run the application and select an item (Isaac Gampu), our screen looks like this:

Initial Selection - Isaac Gampu
(Notice the selection in the top left corner). Then when we change the sorting and click the "Refresh" button, we see that Isaac Gampu is still selected:

Isaac Gampu still selected after refresh
For more details on how the lambda expressions and LINQ methods work, be sure to check out the associated resources: Learn to Love Lambdas (and LINQ, Too!).

And this brings us to the question:
Why can't we just set the selected item directly based on our saved person?
The proposal is to replace the code above ("Set selected item in list box") with the following:


This makes our method much simpler:


This looks great (and sounds a lot easier). But unfortunately, it doesn't work. Here's what happens. First, we make an initial selection (John Sheridan):

Initial Selection - John Sheridan
Then when we click "Refresh", we lose our selection:

No selection after Refresh
Why does this happen? For this we'll need to take a closer look at the difference between reference types and value types.

Reference Types, Value Types, and Equality
A key difference between reference types and value types is how they are stored in memory. There are 2 memory locations we need to worry about: the stack and the heap.

I won't go into a full explanation of the stack and heap. You can get a good overview here: C# Heap(ing) and Stack(ing) in C#.

What's important to understand here is how equality is handled differently with reference types and value types.

Value Type Equality
Value types hold their data directly (a local variable of a value type typically lives on the stack), so equality is determined by comparing the values. This makes it easy to compare ints, bools, and chars. A struct is also a value type. Since a struct generally has multiple sub-values, the default "Equals()" compares the value of each field (including the backing fields of auto-implemented properties).
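Here's a small sketch of value equality. The "Point" struct is a made-up type used only for illustration.

```csharp
using System;

// Hypothetical struct used only for illustration.
public struct Point
{
    public int X;
    public int Y;
}

public static class ValueEqualityDemo
{
    public static void Main()
    {
        int a = 5, b = 5;
        Console.WriteLine(a == b);        // True

        var p1 = new Point { X = 1, Y = 2 };
        var p2 = new Point { X = 1, Y = 2 };
        var p3 = new Point { X = 1, Y = 3 };

        // The default Equals for a struct compares field by field.
        Console.WriteLine(p1.Equals(p2)); // True
        Console.WriteLine(p1.Equals(p3)); // False
    }
}
```

One thing to note: a struct does not get the "==" operator for free, so the sketch uses "Equals()" for the struct comparison.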

Reference Type Equality
Reference types are stored on the heap, and a variable holds a reference to the object rather than the object itself. By default, equality is determined by comparing those references. If 2 variables refer to the same object in memory, then they are considered to be equal. In other words, they are only equal if they refer to the same instance.

But if we have 2 separate instances, then they will not be equal (at least with the default implementation; we'll look at another solution in just a bit).

Classes are reference types, so we end up running into this behavior quite a bit. Our "Person" is a class:
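The original listing isn't reproduced here. As a rough sketch, assuming the four properties mentioned later in this article (the property names and types are my guesses), the class and its default equality behavior look something like this:

```csharp
using System;

// Sketch of the Person class; property names and types are assumptions.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime StartDate { get; set; }
    public int Rating { get; set; }
}

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        var first = new Person { FirstName = "John", LastName = "Sheridan" };
        var second = new Person { FirstName = "John", LastName = "Sheridan" };
        var alias = first; // same instance, second variable

        Console.WriteLine(first.Equals(alias));  // True
        Console.WriteLine(first.Equals(second)); // False
        Console.WriteLine(first == second);      // False
    }
}
```

Even though "first" and "second" hold the same values, they are separate instances, so the default comparison says they are not equal.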


Observed Behavior
So why do we have a problem with our proposed code?


When we assign a value to the "SelectedItem" property of our list box, it looks through the items in the list to try to find one that is equal.

When we save off our "selectedPerson" object, we have an instance of the Person class (that holds "John Sheridan" in our example). When we reload our list box, we have a new collection of Person objects. One of those items has a value of "John Sheridan", but this is a different instance from the one that we saved off.

Because we have 2 different instances (the saved one and the one in the list), these are not considered to be equal. And that's why we get the observed behavior (no selection) instead of the expected behavior.
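We can reproduce this without any UI. In this sketch, "List<T>.Contains" stands in for the list box's lookup (both rely on "Equals"), and "LoadPeople" is a hypothetical method that simulates reloading the data.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the article's Person class.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class LostSelectionDemo
{
    // Hypothetical reload: builds new Person instances on every call.
    public static List<Person> LoadPeople()
    {
        return new List<Person>
        {
            new Person { FirstName = "John", LastName = "Sheridan" },
            new Person { FirstName = "Dave", LastName = "Lister" },
        };
    }

    public static void Main()
    {
        List<Person> before = LoadPeople();
        Person selectedPerson = before[0]; // save off the selection

        List<Person> after = LoadPeople(); // the "Refresh"

        Console.WriteLine(before.Contains(selectedPerson)); // True
        Console.WriteLine(after.Contains(selectedPerson));  // False
    }
}
```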

Now let's look at two solutions that will get our proposed code to work.

Option 1: Change Person to a Value Type
One option is to change the "Person" object from a reference type to a value type. In our case, this would involve changing it from a "class" to a "struct". The "Person" declaration is close to the same:


But because we just changed this to a value type, some of our other code breaks. Here's the updated code where we save off a copy of the selected item:


Since our "Person" is now a non-nullable value type, we need to treat it a little differently. The "as" operator only works with reference types since it can result in a null. So instead, we initialize "selectedPerson" to an empty "Person" object, then we make sure that the list box actually has a selected item. If it does, then we cast it to a "Person" and assign it to our variable.
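Here's a sketch of that pattern, assuming a trimmed-down "Person" struct. The "selectedItem" parameter stands in for the list box's "SelectedItem" property (which is typed as object), and "SaveSelection" is a hypothetical helper name.

```csharp
using System;

// Person as a value type; trimmed to two properties for brevity.
public struct Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class StructSelectionDemo
{
    public static Person SaveSelection(object selectedItem)
    {
        // "selectedItem as Person" would not compile here because
        // Person is a non-nullable value type.
        Person selectedPerson = new Person(); // empty Person
        if (selectedItem != null)
            selectedPerson = (Person)selectedItem; // unbox with a cast
        return selectedPerson;
    }

    public static void Main()
    {
        object boxed = new Person { FirstName = "Dave", LastName = "Lister" };
        Console.WriteLine(SaveSelection(boxed).FirstName);        // Dave
        Console.WriteLine(SaveSelection(null).FirstName == null); // True
    }
}
```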

The rest of the code stays the same. Here's our updated method:


And our behavior is as expected. If we run our application and select an item (Dave Lister):

Initial Selection - Dave Lister
And then click "Refresh", we see that our selected item is set appropriately:

Dave Lister still selected after Refresh
Because we're dealing with value types, the comparison is done by looking at the values of the properties: first name, last name, start date, and rating. If these values all match, then the items are considered to be equal.
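As a quick check of that behavior, here's a sketch with a plain list standing in for the list box. The property types, the sample dates and ratings, and the "LoadPeople" method are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Person as a value type; property names and types are assumptions.
public struct Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime StartDate { get; set; }
    public int Rating { get; set; }
}

public static class StructRefreshDemo
{
    // Hypothetical reload with made-up sample data.
    public static List<Person> LoadPeople()
    {
        return new List<Person>
        {
            new Person { FirstName = "Dave", LastName = "Lister",
                         StartDate = new DateTime(2000, 1, 1), Rating = 6 },
            new Person { FirstName = "John", LastName = "Sheridan",
                         StartDate = new DateTime(2001, 1, 1), Rating = 8 },
        };
    }

    public static void Main()
    {
        Person selectedPerson = LoadPeople()[0]; // a copy of the values
        List<Person> refreshed = LoadPeople();

        // All four values match an item in the new list.
        Console.WriteLine(refreshed.Contains(selectedPerson)); // True

        // Change any one value and the match is lost.
        selectedPerson.Rating = 7;
        Console.WriteLine(refreshed.Contains(selectedPerson)); // False
    }
}
```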

So we've seen how we can change our data object to a value type, and our proposed code works as expected. But what if we want to keep our "Person" as a class? There's an option for that, too.

Option 2: Override Equals()
Our second option is to keep "Person" as a class (a reference type). The default behavior of "Equals" is what we saw above -- it only returns true if we're comparing the same instance. But we can override the default behavior with something that is more appropriate to our situation. Here's the code for that:


Notice that we have a "class" here. Then we override the "Equals()" method. First we cast our parameter to a "Person". When we use the "as" operator, we will get a null if the cast fails. Next we check for a "null". This could be because another object type was passed in or a null parameter was passed in. If either of these is true, then we return false (meaning, not equal).

Finally, we set up our own equality comparison based on the key properties: first name, last name, and start date. (I didn't include the rating since that could be changeable.)
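The original listing isn't reproduced here, so here's a sketch that follows the steps just described. The property names and types are assumptions.

```csharp
using System;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime StartDate { get; set; }
    public int Rating { get; set; }

    // Note: the compiler warns that GetHashCode() is not overridden.
    public override bool Equals(object obj)
    {
        // "as" gives us null if obj is null or is not a Person.
        var other = obj as Person;
        if (other == null)
            return false;

        // Compare the key properties; Rating is left out on purpose.
        return FirstName == other.FirstName
            && LastName == other.LastName
            && StartDate == other.StartDate;
    }
}

public static class EqualsDemo
{
    public static void Main()
    {
        var start = new DateTime(2000, 1, 1); // made-up sample date
        var a = new Person { FirstName = "John", LastName = "Crichton",
                             StartDate = start, Rating = 7 };
        var b = new Person { FirstName = "John", LastName = "Crichton",
                             StartDate = start, Rating = 9 };

        Console.WriteLine(a.Equals(b));          // True
        Console.WriteLine(a.Equals("Crichton")); // False
        Console.WriteLine(a.Equals(null));       // False
    }
}
```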

Then we can go back to our original proposed method:


If we run this code, we see that it works as expected. We can select an item (John Crichton):

Initial Selection - John Crichton

And then after "Refresh", the item is still selected:

John Crichton still selected after Refresh
It would be nice if we could stop here, but things aren't quite so simple. If you notice in our "Person" class, we have a green squiggly. This tells us that there's a warning. We can see this in our build results:


This tells us that whenever we override "Equals()", we should also override "GetHashCode()". The hash code is used when we use this class as a key in a Dictionary. There are rules about creating hash codes that become important depending on how we're using the object. You can get more information on this here: MSDN Object.GetHashCode() Method.
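We're not adding this to the sample project, but for reference, here's one common way to write a matching "GetHashCode()". This is only a sketch: the multiply-and-add pattern is a widely used convention rather than the only correct approach, and the property names are assumptions. The key rule is that two objects that are equal must return the same hash code, so we build the hash from the same properties that "Equals()" compares.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime StartDate { get; set; }
    public int Rating { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as Person;
        if (other == null)
            return false;
        return FirstName == other.FirstName
            && LastName == other.LastName
            && StartDate == other.StartDate;
    }

    // Uses the same properties as Equals(), so equal objects hash alike.
    public override int GetHashCode()
    {
        unchecked // let integer overflow wrap around
        {
            int hash = 17;
            hash = hash * 23 + (FirstName != null ? FirstName.GetHashCode() : 0);
            hash = hash * 23 + (LastName != null ? LastName.GetHashCode() : 0);
            hash = hash * 23 + StartDate.GetHashCode();
            return hash;
        }
    }
}

public static class HashCodeDemo
{
    public static void Main()
    {
        var start = new DateTime(2000, 1, 1); // made-up sample date
        var a = new Person { FirstName = "John", LastName = "Crichton",
                             StartDate = start, Rating = 7 };
        var b = new Person { FirstName = "John", LastName = "Crichton",
                             StartDate = start, Rating = 9 };

        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True

        var set = new HashSet<Person> { a };
        Console.WriteLine(set.Contains(b)); // True
    }
}
```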

Another thing that we should do when we override "Equals()" is to provide a strongly typed version for our specific type by implementing "IEquatable<Person>", which adds an "Equals(Person other)" method. But we won't do that today either.
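For anyone curious about what the strongly typed version looks like, here's a minimal sketch that implements "IEquatable<Person>". It isn't part of the sample project, and the property names are assumptions.

```csharp
using System;
using System.Collections.Generic;

public class Person : IEquatable<Person>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime StartDate { get; set; }

    // Strongly typed comparison: no cast needed.
    public bool Equals(Person other)
    {
        if (ReferenceEquals(other, null))
            return false;
        return FirstName == other.FirstName
            && LastName == other.LastName
            && StartDate == other.StartDate;
    }

    // The object-based version forwards to the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + (FirstName != null ? FirstName.GetHashCode() : 0);
            hash = hash * 23 + (LastName != null ? LastName.GetHashCode() : 0);
            hash = hash * 23 + StartDate.GetHashCode();
            return hash;
        }
    }
}

public static class EquatableDemo
{
    public static void Main()
    {
        var start = new DateTime(2000, 1, 1); // made-up sample date
        var saved = new Person { FirstName = "John", LastName = "Crichton",
                                 StartDate = start };
        var refreshed = new List<Person>
        {
            new Person { FirstName = "John", LastName = "Crichton",
                         StartDate = start },
        };

        // List<T>.Contains picks up the IEquatable<Person> implementation.
        Console.WriteLine(refreshed.Contains(saved)); // True
    }
}
```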

As we can see, once we start going down the path of changing how equality works with an object, we have quite a bit of work to do.

Original Code
One of the questions that I received on the original code:
Why didn't you just override the Equals method?
There are a couple of reasons for this. The first reason is that we lose the lambda expressions and LINQ methods that I wanted to demonstrate here.

But the bigger reason is that we don't always have the option of changing our data objects. Even when we're constrained by our data, we can use LINQ to query it, filter it, sort it, and even easily locate items in a list.

Wrap Up
We don't usually have to worry about the difference between reference types and value types. But as we've seen, there is a big difference between them when we start to talk about equality comparisons. As usual, there are several different ways that we can approach the problem. And that's good. When we have options, we are free to select whichever one works best for our particular situation.

Happy Coding!
