
Why HashMap keys should be immutable in Java

HashMap stores data as key-value pairs, where each key is unique and a value can be stored or retrieved using its key.
Any class can be a candidate for a map key if it follows the rules below.

1. It overrides the hashCode() and equals() methods.

  The map stores data using the key's hashCode() and equals() methods. To store a value against a given key, the map first calls the key's hashCode() and applies a hashing function to the result to calculate an index position in its backing array. Each index position holds a bucket: a linked list of entries (renamed from Entry to Node in Java 8, which also converts long buckets to balanced trees). The map then iterates through the bucket's elements and checks each one for equality with the key by calling its equals() method; if a match is found, it updates that entry's value, otherwise it adds a new entry with the given key and value.
In the same way, it checks for an existing key when get() is called. If it finds a match for the given key in the bucket selected by its hashCode(), it returns the value; otherwise it returns null.
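As a minimal sketch of the index calculation described above (assuming a Java 8+ HashMap with the default power-of-two table size of 16; spread() mirrors HashMap's internal hash-spreading step):

```java
public class HashIndexDemo {

    // Mirrors HashMap's internal hash() (Java 8+): XOR the high 16 bits
    // into the low 16 bits so that small tables still see the whole hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int capacity = 16;                 // default table size, always a power of two
        int hash = spread("name1".hashCode());
        int index = hash & (capacity - 1); // selects the low bits; a modulo for power-of-two sizes
        System.out.println("bucket index for \"name1\" = " + index);
    }
}
```

The bitwise AND works only because the table size is always a power of two; that is why HashMap never uses an arbitrary capacity internally.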
The picture below shows the internals of HashMap.

[Image: Internals of HashMap]

2. Key class should be immutable.

    If we want to use a class as a key in a HashMap, it should be immutable. If that is not possible, we need to make sure that, once an instance is created, the fields that participate in the hashCode() calculation and the equality check are never modified.
Let's understand this with an example.
There is a mutable Employee class that we want to use as a key in a HashMap.
Employee.java
import java.util.Objects;

public class Employee {

    private String name;
    private String id;

    public Employee(String name, String id) {
        this.name = name;
        this.id = id;
    }

    // public getters and setters
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    // equals using name & id
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false;
        Employee other = (Employee) o;
        return Objects.equals(name, other.name) && Objects.equals(id, other.id);
    }

    // hashCode using name & id
    @Override
    public int hashCode() {
        return Objects.hash(name, id);
    }
}
With this mutable class we create an instance of Employee and put it in a HashMap, for example:
Employee key = new Employee("name1", "id1");

hashmap.put(key, "some value");
Let's assume that hashCode() returned something like 123456 for the given name and id, that the map derived index 1 from it, and that it stored the value at the corresponding location.

Now we call a setter for name or id on the object "key", which changes its hash code; let's assume the hash code is now 234567 and the index position 4. For example:
key.setName("other name");
If we try to search the map for this key, it will not find the value, because the map computes a different index position from the changed hash code.
If we then call put() on the map with this key, it stores the object at a different index and bucket. This causes an effective memory leak: the same Employee object is now also present as a key under the previous hash code 123456, in an entry that normal map operations can no longer reach.
There are now two entries in the HashMap for the same Employee instance, with the same name and id, but in different buckets. The first entry becomes unreachable: even a fresh Employee with the original name and id hashes to the correct bucket but fails the equals() comparison against the mutated stored key, so get() and remove() return null for it. The only ways to reach it again are mutating the same key object back to its original state or iterating over the entry set. The picture below describes this scenario step by step and clarifies why map keys should be immutable.
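The step-by-step scenario above can be reproduced with a small self-contained sketch (it embeds a condensed copy of the Employee class, with hashCode() and equals() over name and id, as assumed above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {

    static class Employee {
        private String name;
        private String id;

        Employee(String name, String id) { this.name = name; this.id = id; }
        void setName(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Employee)) return false;
            Employee other = (Employee) o;
            return Objects.equals(name, other.name) && Objects.equals(id, other.id);
        }

        @Override
        public int hashCode() { return Objects.hash(name, id); }
    }

    public static void main(String[] args) {
        Map<Employee, String> map = new HashMap<>();
        Employee key = new Employee("name1", "id1");
        map.put(key, "some value");

        key.setName("other name");  // mutates a field used by hashCode()

        // The lookup now probes the wrong bucket, so the entry is lost:
        System.out.println(map.get(key));                          // null
        // Even a fresh key with the original values fails, because the
        // equals() check runs against the mutated stored key:
        System.out.println(map.get(new Employee("name1", "id1"))); // null
        // A second put() stores the same object again in another bucket:
        map.put(key, "other value");
        System.out.println(map.size());                            // 2
    }
}
```

Running this shows both lookups returning null and the map growing to two entries, matching the diagram's final state.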

[Image: HashMap state after updates]
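As a sketch of the fix this article argues for, here is an immutable variant of the Employee example (final class, final fields, no setters), so the hash code can never change after insertion. The class name is chosen here for illustration:

```java
import java.util.Objects;

// Immutable: final class, final fields, no setters.
public final class ImmutableEmployee {

    private final String name;
    private final String id;

    public ImmutableEmployee(String name, String id) {
        this.name = name;
        this.id = id;
    }

    public String getName() { return name; }
    public String getId() { return id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ImmutableEmployee)) return false;
        ImmutableEmployee other = (ImmutableEmployee) o;
        return Objects.equals(name, other.name) && Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, id);
    }
}
```

From Java 16 onward, a record such as `record Employee(String name, String id) {}` gives the same immutability plus generated equals() and hashCode() in a single line.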


