Lesson: Data Structures Review

(Lessons) (Previous: Lesson: Comparing Data Structures) (Next: Lesson: RPG Introduction)

Table of Contents

1. Designing a data structure

1.1. How should the data be broken down?

What makes a data structure better than a single object is that information that somehow "belongs" together is grouped together in one object, rather than having all information as instance variables of one huge object. This gives me two advantages:
  1. I can think about the data structure and its code in manageable pieces, rather than having to deal with the whole thing all at once
  2. I can reuse a class or collection of classes in many places in the data structure, or in another data structure.
You have seen examples of the first advantage whenever you turn a data structure into code - it is a lot easier to write first the simple classes at the "bottom" of the object model hierarchy, and then gradually build them up into more complex pieces, than to try to deal with the whole system all at once. We have also seen examples of the second advantage: Copier used two Paper objects, one for each tray, just as Toaster used a ToasterSlot object for each slot. The Point object is even more versatile: we have used it throughout Turtles and then continued to use it in the current project.

In general, anything that you recognize as representing a complete object that you could pick up and move to another place, should be its own class.

1.2. What variables should have setters?

Usually the point of a data structure is to be able to access all the information it contains. So, I will almost always have getter methods for all the instance variables within the structure, so that I can follow any path through it.

On the other hand, it is not always advisable to have setters for everything. One common plague of programmers is code that accidentally changes something inside a data structure in a way that breaks the structure. For example, if it is possible to set a variable to null from the outside, I can break off a whole piece of the structure in this way. The error will not be discovered until I later try to access the missing branch of the data structure, at which point there is no way to tell what piece of code is responsible for doing the damage.

In general, if there is no good reason for someone outside your data structure to have to modify a variable, there should not be a setter for it. Take our Toaster example: there is no way to change the ToasterSlot objects installed in the Toaster, because under normal operation a Toaster's slots are not replaced. As a result, the code in Toaster can confidently send messages to the ToasterSlots without needing to worry about whether they might be null. In contrast, it is possible to change what type of object is in each slot, so I need to be more careful in ToasterSlot about how I deal with my object.

An extreme case of protecting from change is to make an object class completely immutable, meaning that all the information in its variables is set by its constructor and there is no way to change the object once it is constructed. String works this way. I can't change a String once it is created; all that I can do is to replace it with an entirely new String.

1.3. Writing constructors, getters, and setters

For every variable that I create, there are three things that I need to do with it in the class:
  1. Give it some initial value in the constructor
  2. Provide a getter for it
  3. Provide a setter for it (if appropriate)
The last two are merely a matter of following out a set pattern that you should have memorized by now:

private Type name;

public Type getName() {
   return name;
}

public void setName(Type a) {
   name = a;
}

Constructors are a bit more interesting, because you have to decide for which variables a constructor will take values as parameters, and for which variables it will just provide a default value. Often, there will be many different constructors with different combinations of parameters. Remember that you can use this() to have one constructor call another.

How does this LineSegment class work?

public class LineSegment {
   private static Point last = new Point(0, 0);
   private Point start;
   private Point end;
   
   public LineSegment(Point end) {
      this(last, end);
   }
   
   public LineSegment(Point start, Point end) {
      super();
      this.start = start;
      this.end = end;
      last = end;
   }
   
   public LineSegment(double x1, double y1, double x2, double y2) {
      this(new Point(x1, y1), new Point(x2, y2));
   }
}

2. Things to do with a data structure

2.1. Accessing or changing an element

If I want to retrieve or change the value of any variable in a data structure, there are two steps involved:
  1. Get a reference to the object that owns that variable
  2. Tell it to get or set the variable's value
Of course, whether I can actually change a value depends on whether it has a setter provided. On the other hand, I can almost always expect that there will be a getter available for a value, since the whole point of the data structure is to make all that information accessible.

So, for example, suppose that you want to find out how much money is in my wallet. Since this is a property of the Wallet, you first need to get a reference to my Wallet before you can ask how much money is in it. How do you get that reference? Well, it's an instance variable belonging to me, so you need to ask me for it:

Wallet w = mrZ.getWallet();
double money = w.getCash();

You could also do this all in one line:

double money = mrZ.getWallet().getMoney();

When I change the value of a variable that was a reference, I am cutting off an entire piece of the data structure and replacing it with something new. For example, suppose that a LineSegment is composed of two Points, its start and end.

LineSegment line = new LineSegment(new Point(1, 2), new Point(3, 4));

Suppose now that I want to set the y coordinate of the end point to 1, instead of 4. What is the difference between doing this:

line.setEnd(new Point(3, 1));

... versus doing this:

line.getEnd().setY(1);

This is the sort of problem for which it is useful to have an object model to work from. In the first case, I am creating an entirely new Point object, disconnecting the Point that end used to refer to, and putting the new Point in its place. In the second example, I am going into the Point that is referenced by end and just changing its y coordinate.

Which one of these I would normally use depends on what getters and setters the data structure provides. A Point would be a good candidate for being immutable, since it is so simple; that would mean that if I want to change the Point, I have to completely replace it, rather than just changing it.

2.2. Getting it to describe itself

There are many situations where I want a readable description of the contents of a data structure. In other words, I want it to put together a String that describes in some human-readable format the data that it is representing.

Every object already knows how to provide a description of itself - there is a method called toString(), inherited from the Object class, that is supposed to do this job. Unfortunately, the information it outputs is not very informative: If I have a Point object and I ask it for toString(), the default thing for it to return is something like "Point@1ac4be62" (meaning, "a point object at the address 1ac4be62. Addresses are typically written in hexadecimal, base 16, where letters are used to stand for the digits above the regular decimal ones - A is 10, b is 11, and so on)

So, to get an informative description of an object, you want to override the toString() method. A typical example is shown below:

public class Person {
   private String name;
   private int age;
   private double height;
   
   public String toString() {
      String str = "Person";
      str += "\nName: " + name;
      str += "\nAge: " + age;
      str += "\nHeight: " + height + " ft";
      return str;
   }
}

The "\n" at the start of each new line mean the same as hitting the "Enter" key on your keyboard. So, for an example person, the toString() would look like this:

Person mrZ = new Person("Russell Zahniser", 24, 6.33);
System.out.print(mrZ);

Output:

Person
Name: Russell Zahniser
Age: 24
Height: 6.33 ft

The big idea about toString() is that toString() will be called whenever I need to convert an Object into a String. So, for example, when I add an object to a string, or when I try to System.out.print() it, its toString() method will first be called to turn it into a String.

2.3. Checking if its contents match another version of the same data structure

One of the things that makes objects more difficult to work with than numerical values is that, since an object is linked to by reference, asking the question object1 == object2 compares only the references in object1 and object2, rather than the contents of the objects.

So, for example, it will never be true that new Integer(4) == new Integer(4), since I have created here two different objects, at different addresses. These to objects happen to have identical contents, but since == only compares the address, it won't tell me anything about that.

So, if all I can do with == is to compare references, the solution must be to ask one object to compare itself to the other. After all, that object knows its own structure better than anyone, so if I give it another member of its class and ask it to make the comparison, it will know exactly what needs to be compared. The method that asks an object if it is equal to another is called equals():

public boolean equals(Object other)

This method, like ==, will return true if the object being asked to do the comparison is equal to the object you give it as a parameter. So, for example, this would be true:

if(new Integer(4).equals(new Integer(4))) {

You have seen this before in comparing strings. If I want to know, in Asteroids, if my frame is "done", I am comparing two String objects, getFrame() and "done", and thus I use equals() instead of ==.

if(getFrame().equals("done")) {

The equals() method usually does its job in two steps. First, it checks to see if the object passed to it is, in fact, of its own class; if it isn't, there is no way they can be equal. Then, once it is established that the class is the same, it checks if this and other match in all their instance variables.

Remember, use == when comparing number values, and equals() when comparing objects.

public class Person {
   private int age;
   private String Name;
   
   public boolean equals(Object other) {
      if(!other instanceof Person) {
         return false;
      } else {
         Person p = (Person)other;
         if(p.age == this.age && p.name.equals(this.name)) {
            return true;
         } else {
            return false;
         }
      }
   }

Actually, returning true or false explicitly above is unnecessary; I could just return the value of the if condition. In fact, I could have done the whole thing in one line:

return other instanceof Person &&
      ((Person)other).age == this.age &&
      ((Person)other).name.equals(this.name);

One last note - if you want to compare objects in a "!=" sort of way instead of "==", remember that there is an operator ! ("Not") that you can use to reverse any boolean value. So, in other words, I can do this:

if(!getFrame.equals("done")) {

2.4. Ordering it in relation to another version of the same data structure

A problem closely related to that of checking if two objects are equal is the problem of ordering two objects according to some criteria.

For example, suppose I have a Book object that has a title and an author, both Strings. A Library might deal with these Books in large quantity and want to compare them. The librarian can identify when two objects represent the same book, because they have the same author and title. But a more common task would be trying to figure out where the book belongs on the shelf; this involves looking at another book and figuring out if the book you have comes before or after that book.

Not every type of object has this sort of ordering. For example, there is no "natural" way that I can think of to order Turtles. So, the objects that do provide some way to order themselves have to announce that they have that capability. They do that by implementing an interface called Comparable:

public class Integer implements Comparable {

An interface is a list of methods that an object needs to implement in order to be treated in a certain way. You could think of it as an adjective describing a capability of the object. For example, an object that is Weighable would be required to have a getWeight() method.

We will learn more about interfaces at the end of the year. For now, what you need to know is that if I declare a class to implement Comparable, it has to have a method with this signature:

public int compareTo(Object other)

This mehtod returns an int because there are three possible outcomes of a comparison: I am either less than the object I was asked to compare myself to (which I signal by returning a negative number), or I am greater than it (which I signal by returning a positive number), or I am equal to it, which I signal by returning 0.

So, you can use compareTo() like this:

if(name1.compareTo(name2) < 0) {
   System.out.println(name1 + " comes before " + name2);
} else if(name1.compareTo(name2) > 0) {
   System.out.println(name1 + " comes after " + name2);
} else {
   System.out.println(name1 + " and " + name2 + " are equal");
}

Just as writing an equals() method involves using == and equals() to compare instance variables, compareTo() will do its work by comparing number variables with >, <, <=, and >=, and object variables with compareTo(). It does this somewhat differently than equals(), however: you will pick one criteria to use first, then compare the next variable only if that is equals. So, for example, in ranking two people in alphabetical order, I compare their last names first, and only if they have different last names do I go on to compare first names. If the first and last names are both equal, I go on to compare something else, like age:

public class Person {
   private String firstName, lastName;
   private int age;
   ...
   public int compareTo(Object other) {
      Person p = (Person)other;
      if(lastName.compareTo(p.lastName) < 0) {
         return -1;
      } else if(lastName.compareTo(p.lastName) > 0) {
         return 1;
      } else if(firstName.compareTo(p.firstName) < 0) {
         return -1;
      } else if(firstName.compareTo(p.firstName) > 0) {
         return 1;
      } else if(age < p.age) {
         return -1;
      } else if(age > p.age) {
         return 1;
      } else {
         return 0;
      }
   }

Of course, I can do that much more efficiently if I just think about how to get the right sort of int returned, instead of listing out all the possibilities:

public int compareTo(Object other) {
   Person p = (Person)other;
   if(!lastName.equals(p.lastName)) {
      return lastName.compareTo(p.lastName);
   } else if(!firstName.equals(p.firstName)) {
      return firstName.compareTo(p.firstName);
   } else {
      return age - p.age;
   }
}

3. Homework

Data Structures Review (pdf) (Due: 12/20/06)
Code for Data Structures Test (pdf)

Data Structures Exercise (pdf) (10 pts, 12/21/06)

(Lessons) (Previous: Lesson: Comparing Data Structures) (Next: Lesson: RPG Introduction)