Saturday, August 29, 2015

hashCode, equals methods and Set collection in Java

In this post, let's talk about a basic but important part of programming.  We all know that an object's hash value matters, yet a bug caused by hashCode can still be hard to understand.

Let's look at the following code.  Pojo.java is a simple class with two member variables, and its hashCode() and equals() methods were auto-generated by Eclipse. Looks good, doesn't it?

public class Pojo {
   private int number = 0;
   private String str = "";
   public int getNumber() {
      return number;
   }
   public void setNumber(int number) {
      this.number = number;
   }
   public String getStr() {
      return str;
   }
   public void setStr(String str) {
      this.str = str;
   }
   @Override
   public int hashCode() {
      final int prime = 31;
      int result = 1;
      result = prime * result + number;
      result = prime * result + ((str == null) ? 0 : str.hashCode());
      return result;
   }
   @Override
   public boolean equals(Object obj) {
      if (this == obj)
         return true;
      if (obj == null)
         return false;
      if (getClass() != obj.getClass())
         return false;
      
      Pojo other = (Pojo) obj;
      if (number != other.number)
         return false;
      if (str == null) {
         if (other.str != null)
            return false;
      } else if (!str.equals(other.str))
         return false;
      return true;
   } 
}


Now let's look at some test code. Can you say what it prints?

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Test {
   public static void main(String[] args) {
      Set<Pojo> pojoSet = new HashSet<Pojo>();
      for (int i = 1; i < 10 ; i++){
         Pojo pojo = new Pojo();
         pojo.setNumber(i);
         pojoSet.add(pojo);
  
         if(i == 5){
            pojo.setStr("String Value");
         }
      }
  
      Pojo[] data = pojoSet.toArray(new Pojo[pojoSet.size()]);
      Set<Pojo> anotherSet = new HashSet<Pojo>(Arrays.asList(data));

      for(Pojo pojo : pojoSet){
         if(pojo.getNumber() == 5){
            boolean isRemoved = anotherSet.remove(pojo);
            System.out.println("From anotherSet: " + isRemoved);
            isRemoved = pojoSet.remove(pojo);
            System.out.println("From pojoSet: " + isRemoved);
         }
      }
   }
}

The output is:
   From anotherSet: true
   From pojoSet: false

When an object is added to a HashSet, its hashCode() method is called, and the resulting hash value determines where the object is stored inside the set.  If the value of str is changed later, the HashSet is not notified: the object stays filed under the old hash value.
When the object is later passed to a method of the HashSet (in our case, remove), its hash value is calculated again from its current state, and the lookup uses that new value rather than the one the object was stored under.

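In our example, the object with number 5 is added to pojoSet while its str is still empty, so it is stored under the hash value calculated from the empty string.  After that, str is changed.  When that object is passed to pojoSet.remove in the second for loop, its hash value is calculated with the non-empty str, and the lookup finds no matching entry.  Therefore, the second output line shows 'false'.  anotherSet, on the other hand, was built after the change, so it holds the object under its current hash value and the first remove succeeds.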
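The effect can be reproduced in isolation. The sketch below is only an illustration: MutableKey is a hypothetical stand-in for Pojo (it uses Objects.hash rather than the Eclipse-generated code), and StaleHashDemo is a made-up name. It checks contains before and after the change, then shows two ways the element becomes findable again: copying the set, which is what happens to anotherSet above, and removing the element before changing it.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for Pojo: same two fields, same equality rules.
class MutableKey {
    int number;
    String str = "";

    MutableKey(int number) {
        this.number = number;
    }

    @Override
    public int hashCode() {
        return Objects.hash(number, str);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MutableKey)) return false;
        MutableKey other = (MutableKey) obj;
        return number == other.number && Objects.equals(str, other.str);
    }
}

class StaleHashDemo {
    // Returns { found before the change, found after the change,
    //           found in a rebuilt copy, found after remove-change-add }.
    static boolean[] run() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(5);
        set.add(key);
        boolean foundBefore = set.contains(key);

        key.str = "String Value";                // hashCode() now returns a different value
        boolean foundAfter = set.contains(key);  // the lookup uses the new hash value

        // Copying re-adds every element, so each hash value is computed afresh.
        Set<MutableKey> rebuilt = new HashSet<>(set);
        boolean foundInRebuilt = rebuilt.contains(key);

        // Safer ordering: take the element out, change it, put it back.
        Set<MutableKey> careful = new HashSet<>();
        MutableKey other = new MutableKey(5);
        careful.add(other);
        careful.remove(other);
        other.str = "String Value";
        careful.add(other);
        boolean foundCareful = careful.contains(other);

        return new boolean[] { foundBefore, foundAfter, foundInRebuilt, foundCareful };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("Before change:          " + r[0]);
        System.out.println("After change:           " + r[1]);
        System.out.println("In rebuilt copy:        " + r[2]);
        System.out.println("Remove, change, re-add: " + r[3]);
    }
}
```

Only the second check comes back false. The element is still in the set at that point (its size is 1); it just cannot be reached through a hash lookup.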

In production code, we often have an Entity object with a Set type property. During a business process, we sometimes create a new entity that is a property of another. In this case, we add the newly created Java object to the Set and persist the parent entity, which also inserts the new object.  During the insertion, the DB returns an id for the object and the object is updated with it.  If the id (or any other updated field) is used in the hashCode method of the entity, we can no longer find the newly created Java object in the Set (until the Set is reloaded).
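Here is a rough sketch of that situation in plain Java, with no persistence framework involved. Every name in it is hypothetical: IdHashedChild hashes on a database id that is null until the insert, and assigning the field by hand stands in for the id coming back from the DB. KeyHashedChild shows one possible way around the problem, assuming the entity can carry a key that is fixed at construction (a random UUID here) and never changes afterwards. Whether that fits a given entity model is a design decision, not something the example settles.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.UUID;

// Hypothetical entity whose hashCode depends on a database-generated id.
class IdHashedChild {
    Long id; // null until the (simulated) insert assigns it

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof IdHashedChild
                && Objects.equals(id, ((IdHashedChild) obj).id);
    }
}

// Hypothetical alternative: equality is based on a key that never changes.
class KeyHashedChild {
    Long id; // still assigned later, but no longer part of hashCode/equals
    private final UUID key = UUID.randomUUID();

    @Override
    public int hashCode() {
        return key.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof KeyHashedChild
                && key.equals(((KeyHashedChild) obj).key);
    }
}

class EntitySetDemo {
    static boolean idHashedFoundAfterInsert() {
        Set<IdHashedChild> children = new HashSet<>();
        IdHashedChild child = new IdHashedChild();
        children.add(child);   // stored under the hash value of a null id
        child.id = 42L;        // stands in for the id returned by the DB
        return children.contains(child);
    }

    static boolean keyHashedFoundAfterInsert() {
        Set<KeyHashedChild> children = new HashSet<>();
        KeyHashedChild child = new KeyHashedChild();
        children.add(child);   // stored under the hash value of the fixed key
        child.id = 42L;        // the id changes, the hash value does not
        return children.contains(child);
    }

    public static void main(String[] args) {
        System.out.println("id-based hashCode:  " + idHashedFoundAfterInsert());
        System.out.println("key-based hashCode: " + keyHashedFoundAfterInsert());
    }
}
```

The id-based variant reports false and the key-based variant reports true, because only in the first does the hash value change while the object sits in the Set.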
