Saturday, August 16, 2014

Saving Memory with Primitive ArrayLists, HashMaps and BitSets in Java


Primitive type Arraylists


This summer, when I was trying to develop a small scale database, I faced a huge memory problem. I tried to keep the columns in an ArrayList, which was the obvious choice as they can increase dynamically in size. But, on the downside, ArrayList can only work with objects. So, int would have to be used as Integer, double as Double and so on.

The problem with this approach was, as soon as the primitive types are wrapped by objects, their sizes triples. For instance, here is an illustration to show the difference between double and Double. (Source: University of Virginia)



Now, this representation ofcourse varies from JVM to JVM, but this is the idea. The overheads are mostly caused by pointers and object metadata. This restricted me from storing a lot of these values. (oh! It was a Main Memory database, so forget about writing the values to disk). Also, with more space consumption, my GC overhead increased considerably. Yeah! I know!! I need a powerful machine. :(

Anyway, this got me looking up a bit till I stumbled upon the Apache library who created ArrayList versions for primitive files. Here is a link to it -

Primitive Lists

Now, this class contains primitive lists for almost all primitives like ArrayIntListArrayDoubleListArrayCharList etc. It even contains ArrayBooleanList. It has the same methods and functions exactly like the arraylist.

Bitset


The success with integers made me wonder, can I do better with the boolean? You see, boolean primitives take up size between 1 byte to 4 byte depending upon the JVM and the way it is implemented, but it actually needs only 1 bit to store the information. So, to store a list of booleans, we can actually use a BitSet. This grossly reduces the memory footprint thereby solving my memory problem upto a great deal.

I have a small example of how bit sets can be used as boolean lists.



import java.util.BitSet;

public class testBitSet {
    public static void main(String[] args) {
        BitSet bs = new BitSet();
        
        for(int i=0;i<20;i+=2){         //Sets alternate bits to true
            bs.set(i);
        }
        
        //Randomly Printing bits
        System.out.println("Element 11: " + bs.get(11));    //false
        System.out.println("Element 12: " + bs.get(12));    //true
        
        //Changing value of element at position 12 to false
        bs.set(12, false);
        System.out.println("Element 12: " + bs.get(12));    //false
        
        //printing all set element. Prints [0 2 4 6 8 10 14 16 18 ]
        int cursor=bs.nextSetBit(0);
        System.out.print("All set elements : [");
        while(cursor!=-1){
            System.out.print(cursor+" ");
            cursor=bs.nextSetBit(++cursor);
        }
        System.out.println("]");
    }
}



Primitive type Hashmaps


Another interesting set of collections are the Maps or the HashMaps which, again, take objects as references. If only they could take primitive types :(. Well luckily, they can be modified to do this too :). The classes from the Trove package does exactly this. They contain different classes like TIntIntMap,TDoubleIntMap or TObjectIntMap (Which can take strings/objects as keys). Needless to say they work in the same way as regular util maps. Here is a link to it -

Gnu Trove Package

Hope this helps people out there trying to save memory.



Friday, August 15, 2014

Difference between .equals and == in java


In Java, == checks for reference equality while .equals() checks for content equality, which is true in every case. But, java refers to the memory in some inconsistent way which brings up all the confusion.
Check the following code -


class you{
    public static void main(String[] args){
        
        Integer a = 1;
        Integer b = 1;   // a & b are pooled from the same memory location
        Integer i = 128;
        Integer j = 128;  /* i and j are not pooled. Hence stored in different memory location */
        
        System.out.println("a == b: "+(a == b));  //true
        System.out.println("a.equals(b): "+(a.equals(b)));  //true
        System.out.println("i == j: "+(i == j));  //false
        System.out.println("i.equals(j): "+(i.equals(j)));    //true
        String x="abc";
        /* Since strings are immutable, x and y are stored in the same location in the heap memory for same value*/
        String y="abc";
        /*Creating a new reference. z has a  different memory location but same value*/
        String z=new String("abc");
        System.out.println("x == y: "+(x == y));  //true
        System.out.println("x.equals(y): "+(x.equals(y)));   //true
        System.out.println("x == z: "+(x == z));  //false
        System.out.println("x.equals(z): "+(x.equals(z)));   //true
    }
}

       For Integer objects, when the value is below 8-bits(-128 to 127), java stores it ,for the same values, in the same location in the memory and pools it from there. Hence all objects point to the same memory location for these values. But when the value is out of this range, dedicated memory locations are created and hence at those times == returns false.
       For Strings, since they are immutable, java stores them in the same memory location (irrelevant of size) when declared directly. However, if u create a string type object and then assign the same value, the memory location is diff.
So its always safer to use .equals() when u want to do content checking for strings.