Programming is a wonderful mix of art and science; source code is both a poem and a math problem. It should be as simple and elegant as it is functional and fast. This blog is about that (along with whatever else I feel like writing about).

Monday, August 22, 2005

Associative Array Doubly Linked List

This isn't going to be a demonstration of an application of any kind, but rather I'm going to show a data structure I developed the other day to help out with my web applications. I implemented it in Javascript, but it should be trivial to port it to other languages. (If the language doesn't support associative arrays on its own, then you'll have to implement that first.)

The idea behind an Associative Array Doubly Linked List (AADLList) starts with an associative array and adds a doubly linked list to it. In an associative array, you have a bunch of key/value pairs, where the value is accessed by its key (normal arrays can be viewed as an associative array where the keys are integers starting with 0). In a linked list, you have to define an object that has not only its data, but also a pointer to the next object in the list. A linked list will also have a pointer to the front of the list. A doubly linked list takes this a step further, by giving each object a pointer to the object before it AND after it, and the list must have pointers to the first and last items in the list.

I wanted to make this data structure independent of the application you're going to use it in, so instead of defining information fields in the objects as well as the pointers, the nodes in this list will have another variable inside that you can fill with whatever object you want.

First, we have to define the list objects, AADLListNode and AADLList:

function AADLListNode() {
    this.node = null;
    this.next = null;
    this.prev = null;
}

function AADLList() {
    this.list = new Array();
    this.first = null;
    this.last = null;


Note that I omitted a trailing bracket at the end of the AADLList definition. That is because the rest of the code goes inside the AADLList class definition.

As you can see, the ListNode class has a node object and its two pointers. The List object has a list (which is an array) and its two pointers. At first, the first and last pointers are null, which means the array is empty.

The first thing we need to do is add nodes to the list. When we're adding the first item in the list, we have to take more things into consideration. When the first node is inserted, the list's first and last pointers have to change from null to the new first node. So we write our first function:

/*
    add a node that is the first and therefore only node in the list
    */
    this.addFirst = function(key, val) {
        var node = new AADLListNode();
        node.node = val;
        node.next = null;
        node.prev = null;
        
        this.list[key] = node;
        
        this.last = node;
        this.first = node;
    }


After our first node has been added, we can start adding more nodes. Since it's a doubly linked list, we have the desirable ability to add nodes to either the front or the back of the list. (Actually, we will be able to insert a node anywhere in the list.) So we write two functions, to add a node to the front or back of the list.

/*
    add the given key/val pair as the first element in the list
    */
    this.addToFront = function(key, val) {
        var temp = this.first;
        
        var node = new AADLListNode();
        node.node = val;
        node.next = this.first;
        node.prev = null;
        
        if (this.isEmpty()) {
            this.addFirst(key, val);
        }
        
        this.list[key] = node;
        
        if (this.first) {
            this.first.prev = node;
        }
        this.first = node;
    }
    
    /*
    add the given key/val pair as the last element in the list
    */
    this.addToBack = function(key, val) {
        var node = new AADLListNode();
        node.node = val;
        node.next = null;
        node.prev = this.last;
        
        if (this.isEmpty()) {
            this.addFirst(key, val);
        }
        
        this.list[key] = node;
        
        if (this.last) {
            this.last.next = node;
        }
        this.last = node;
    }


The code in these functions is pretty simple. We just need to set the prev or next pointers of the newly created node, and update the first or last pointers of the list. As you can see, these functions use the isEmpty() function to determine whether they should call addFirst() or not. We have to create isEmpty() for ourselves, like so:

/*
    returns true if the list is empty
    */
    this.isEmpty = function() {
        if (!this.first && !this.last) {
            return true;
        } else {
            return false;
        }
    }


Simple enough. If the first and last pointers are both null, then the list is empty. It should, in theory, be possible to also just access the this.list.length value and check that it is zero, but unfortunately, when you remove an entry in a Javascript array, it does not disappear. The key continues to exist, but points to null. So the this.list.length field will be incorrect if a node is ever removed from the list.

I often don't care whether my nodes are inserted at the front or the back, so I threw a convenience function in:

/*
    simple convenience function
    just calls addToBack
    */
    this.add = function(key, val) {
        this.addToBack(key, val);
    }


Now we can add as many nodes as we want, either at the front or the back of the list. But how do we access the nodes once they're there? One way is to simply go through the object's members and use them, but this is generally not an awesome way to go (if you really wanted to do it, you'd have to do something like this: myList.list[myKey].node.myField ... etc. Clearly not what you want to be doing every time you want to retrieve something from the list). So we can write a function that will retrieve a given node for us based on its key.

/*
    try to get the node for a given key
    return the node if it exists, null if not
    */
    this.getNode = function(key) {
        for (var id in this.list) {
            if (id == key) {
                return this.list[id];
            }
        }
        
        return null;
    }


But that isn't always what we want, because it will return an AADLListNode object, which you probably won't want to use in your application. So we write another, similar function, to just return the object inside the node (without the pointers).

/*
    try to get the node value for a given key
    return the value of the node or null, if there is no node
    */
    this.getNodeValue = function(key) {
        var node = this.getNode(key);
        if (node) {
            return node.node;
        } else {
            return null;
        }
    }


Note that if we were simply doing a linked list (or even a doubly linked list), this operation would have an O(n) efficiency (worst case), because we would have to traverse the list and check each node to see if it was the one we're looking for. But since it's also an associative array, we can access it in O(1) time, greatly speeding up our data access. This is good.

I'm sure you've noticed that what this is missing is a way to remove a node from the list. So we're going to add that function now. We need to make sure we check that the node actually exists, then change the next object's prev pointer to point to the removed object's prev, and vice versa. (Don't know how to explain that very well, except to use the term "switcharoo," which I think should cover it pretty well.)

/*
    remove a node from the list
    return true on success, false on failure (if the given key isn't in the list)
    */
    this.removeNode = function(key) {
        var node = this.getNode(key);
        
        //if the node doesn't exist, you can't remove it
        if (!node) {
            return false;
        }
        
        var tempNext = node.next;
        var tempPrev = node.prev;
        
        //if the node doesn't have next/prev, you can't change them
        if (tempNext)
            tempNext.prev = tempPrev;
        if (tempPrev)
            tempPrev.next = tempNext;
        
        //if it's the first node, set the next node to be the first
        if (key == this.first.node.id) {
            if (tempNext) {
                this.first = tempNext;
            } else if (tempPrev) {
                this.first = tempPrev;
            } else {
                this.first = null;
            }
        }
        
        //if it's the last node, set the prev node to be the last
        if (node == this.last) {
            if (tempPrev) {
                this.last = tempPrev;
            } else if (tempNext) {
                this.last = tempNext;
            } else {
                this.last = null;
            }
        }
        
        //null out the entry in the array
        this.list[key] = null;
        
        return true;
    }


As you can see, this is by far the most complicated function in the class (and it's not even very complicated!).

And don't forget that trailing bracket!

}


Now we have a working AADLList, which gives us all the advantages of a doubly linked list and an associative array in one easy to use package. You have O(1) access to any node in the list, but you can also traverse from the back or the front and add nodes in whatever order you want (although that isn't implemented in this version).

So what would you want to add to the AADLList to make it a little better?
  • How about an insert() function of some kind, so you can decide where in the list you want a node to be added?
  • Build on your insert() function and make the AADLList a sorted list.
  • Give it some queue-like or stack-like functionality, with push() and pop() operations. (Who wouldn't want the ultimate data structure?)

Feel free to use it, add to it, whatever. Let me know if you have any issues with it, or problems using it or implementing new features for it.

No comments: