Suppose you are given a programming task that involves storing a list of integer x–coordinates to be used in drawing a picture. "Aha!" you might think, "I'll use an array to store all those values." And indeed, an array is one solution to the task. Arrays store lists of data, so you might just go ahead and declare your data structure as :
const int size = 128;
int xcoords[size];
Now, all would be well and good. But what if I came along in your program and tried to add 129 points to your data structure? You might have built-in dumb-user-devices to prevent me from doing such a diabolical deed, or your program might overwrite some of the data that's already in the array, or it might do something altogether different (and completely undesirable). You would start getting bug reports, and then nasty emails would come your way complaining that users want to be able to add 129 points. You really need to face it : you have to be able to dynamically add data to the list. You could simply reserve enough space for 100000 coordinates, but that will eat up memory quickly, and you know that a user will eventually find a way to break a program with such a hard-coded restriction. Even by declaring your array (in C++) as :
int size = 128;
// possibly change the value of size here ...
int *xcoords = new int[size];
or as :
int getNumXCoords(void);
// ... later, inside a function call somewhere ...
int *xcoords = new int[getNumXCoords()];
The user will have to know ahead of time how many points she is planning on using. There must be a better way.
Thankfully, there are several better ways, and one of them is a linked list.
A linked list is a very useful data structure ; it is very much like an array, but it can dynamically grow and shrink to fit the number of data points that are entered in a program. Linked lists can be implemented in any computer language that is capable of using pointers (or any other analogous data type), but the code here uses C and sometimes C++.
Basically, a non-null list consists of several nodes arranged in a particular linear order, just like an array. Unlike an array, though, each node consists of, in its simplest form, a data structure that has both a data element and a pointer.
The data item holds the node's data element (like the x-coordinate), and the pointer (often called the next pointer) points to, or holds the address of, the next node in the list. Sometimes, as shown in the diagram, the data item is itself a pointer, but I'll stick with the example case of the data being integers. To make a list of arrays of floats, for example, just change the type of the data element in the linked list nodes.
There is one convention I'd like to clear up before continuing, and that is the null pointer convention. Pointers can either point to something by holding some useful data value, or they can point to nothing by holding a special value, often called NULL or NUL or just plain 0. Different folks have different ways of representing null pointers in diagrams. (Usually, it's a slash, or ground, or something else null-looking.)
I prefer the electrical ground symbol (three parallel lines) ; therefore, that is the notation that I will use here. Before going any further, draw a few list nodes just to get a feel for using your pencil or crayons or whatever you've got near your computer.
Here's the conceptual diagram of a small linked list that holds integers as data :
Notice three things about the list :
Thus a linked list is a linear data structure that is a lot like an array. The tricky part is that linked lists are a little bit more complex to implement than arrays. I'll give a small start, and then I will provide some programming ideas to help the linked list concept along.
To write linked list code, you first need to create the nodes. This is an easy enough process using the class concept in C++. A simple example might be :
class IntList {
public:
IntList *next; // the pointer to the next node
int data; // the node's data element
};
In the C programming language, which does not use objects per se, we might write the following instead (note the striking similarity ...) :
typedef struct _IntListNode IntList;
struct _IntListNode {
IntList *next;
int data;
};
So, using our new class (or struct), we could build the list above by writing in C, for example :
IntList *one = malloc(sizeof(IntList));
IntList *two = malloc(sizeof(IntList));
IntList *three = malloc(sizeof(IntList));
IntList *four = malloc(sizeof(IntList));
IntList *top = one;
one->data = 128; one->next = two;
two->data = 5; two->next = three;
three->data = 1979; three->next = four;
four->data = -7; four->next = NULL;
Now, you might think this is a little bit repetitive, and it certainly is. There are much better ways to handle list formation. My current favorite is this C++ code :
class IntList {
public:
// constructor initializes both data members
IntList(int d = 0, IntList *n = 0)
: data(d), next(n) { }
// destructor recursively deletes whole list
~IntList() { if(next) delete next; }
private:
IntList *next; // pointer to next node
int data; // node's data element
};
The constructor basically initializes the class instance's data elements. The code after the colon on the constructor's line assigns d to the class instance's data element, and n to the class instance's next element.
If you're not sure how the destructor destroys the entire list, draw out a list with your crayon, and then pretend that you called the first node's destructor. Notice that when you delete the first node by calling its destructor, the next pointer gets deleted as well. See what happens to the node that follows the first one (and the node that follows the second one, and so on).
To illustrate how this has changed things, let's build the list again using the new class constructor, this time in C++.
// default argument here makes next null
IntList *top = new
IntList(-7);
top = new IntList(1979, top);
top = new IntList(5, top);
top = new IntList(128, top);
What a difference ! Of course, you can use this in a loop to automate the process even further.
There are two basic ways to traverse a list : recursively and iteratively. The iterative way makes use of a loop that has a fixed termination point. Recursion relies on nested function calls to traverse the list. Here is an example of using iteration to traverse the list and print out the data elements :
// suppose buildList() returns the list we made above.
IntList *top = buildList();
// do other things before you need the list again ...
for (IntList *t = top; t; t = t->next)
cout << "element data is " << t->data << endl;
Note that top could even be null when the for loop starts ; the code would still work.
For the curious, here is an example of a function in C that would traverse a linked list recursively :
void
printListRecursive(IntList *l)
{
if (! l) return;
printf("element data is %d\n", l->data);
printListRecursive(l->next);
}
Traversing the list recursively, like doing anything else recursively, may seem a bit obfuscated until you become familiar with the process of recursion. It is very important to remember to put a terminating condition in a recursive function : we don't want ten million cats in hats running around here !
Once you've seen the code for doing one thing to each list element (like printing them, for example), it's pretty easy to change it to do something else to each list element. For instance, you could count the elements in the list by doing the following :
IntList *top = buildList();
// do other things before you need the list again ...
int numElem = 0;
for (IntList *t = top; t; t = t->next) numElem++;
That was iterative ; the recursive function might look like this :
int
countListRecursive(IntList *l)
{
if (! l) return 0;
return 1 + countListRecursive(l->next);
}
Now that you know how to do things to each item in the list sequentially, it might seem like a logical next step to use this same process to do nothing to each element in the list sequentially until you find a particular element that you need. This is just the way that one goes about, for example, finding an element in the list or counting only specific elements in the list. Of course, there are many other tasks that you might do to elements in a list that fall into this category, but searching is one of the most common.
So here's an example of how you might traverse the list iteratively (in C) to find a specific element. The function returns NULL if the number we're looking for isn't in the list.
IntList *
find(int needle, IntList *haystack)
{
IntList *t;
for(t = haystack; t; t = t->next)
if (t->data == needle) return t;
return NULL;
}
Another example of how you might use this functionality would be to extract only those elements from a list that satisfy certain criteria, building up a list of such elements that you find. You might want to find all the prime numbers in a given list of integers, for example. Here's some code in C++ that uses recursion to do that.
IntList *
primesFromListRecursive(IntList *start, IntList *filtered)
{
if (! start) return filtered;
if (isPrime(start->data)) {
IntList *newF = new IntList(start->data, filtered);
return primesFromListRecursive(start->next, newF);
} else {
return primesFromListRecursive(start->next, filtered);
}
}
Another step you might want to take with this neato list concept is to change the list once you've built it up. There are several ways to change a list, but they are on different levels of complexity. So far, we've addressed the simplest levels of complexity, with building an unsorted list and traversing the list to do something to each element. Now let's take a look at how you might delete an element.
Whenever you are dealing with linked lists, it's important to address three types of elements in the list :
Sounds pretty easy, and the above three cases are enough to keep yourself out of hot water when programming a linked list. If you've covered the case of the beginning of the list, covered the case of the end of the list, and covered an arbitrary case in the middle of the list, you're set !
So how does this apply to deleting ? Well, let's take a look at the list from above and think of how to delete the beginning, the end, and the middle of the list.
The beginning is pretty easy : we would need to keep a temp pointer to the beginning (so we don't lose track of it), reroute the top pointer to the value of the first element's next pointer, and then set the first element's next pointer to NULL. Then we'll be safe to delete that node without destroying the rest of the list, and thanks to the rerouted top pointer, we know where the new start of the list will be.
The end is even easier, once you see how the beginning works : once we find the end of the list, we keep a temp pointer to that element, set the previous node's next pointer to NULL (because it will become the end of the list), and delete the last node.
Finally, deleting a node from the middle : this is a little more complex, but it uses the same process as before. We again need to keep a temp pointer to the node we're trying to get rid of. Then, like the case for the beginning of the list, we reroute the pointer that points to the current node to the value of the current node's next pointer. After we set the current node's next to NULL, we'll be set to delete the node.
That covers our three cases. Now the question is how to unify the three into one compact function ... it becomes more clear when you notice that in all three cases, we take the pointer that points to the node we want to delete and then reroute it to the node after the current node. There are two somewhat special cases: in the beginning of the list, the previous pointer is the pointer that keeps track of the start of the list, and for the end of the list, the node after the one we want to delete is effectively NULL. So here is one way to delete a node from a linked list, iteratively in C :
IntList *
delete(IntList *l, integer value)
{
IntList *ret = l, *cur = l, *prev = l;
/* advance the current node pointer, trailing behind
the previous node pointer, until we find the end
of the list or the value we want. */
while (cur && (cur->data != value)) {
prev = cur;
cur = cur->next;
}
/* if we found a node with the value, delete it. */
if (cur && (cur->data == value)) {
if (cur == prev) ret = cur->next;
else prev->next = cur->next;
cur->next = NULL; free(cur);
}
return ret;
}
There are some things to keep in mind about this particular implementation. First, it's not pretty. Next, this function deletes the first node it encounters in the list that has a data value of value. There are ways to delete nodes based on a pointer to a particular node. (Say, for instance, you found a pointer to the node using a search function, then passed the pointer to the delete function.) In addition, this function deletes a maximum of one node from the list. There are ways to delete all nodes with a specified data element.
A great intro page is The Code Project linked list tutorial. It discusses use of the CList class included with the Microsoft Foundation Classes (MFC).
For those of you looking for a quick code fix, check the 10 minute question about linked lists from http://devx.com.
Created and published