AVL / Balance Tree with Ordinal Numbering of Elements in C

Aaron Logue 2010
$ ./btest

   1          (2) D  value fSeE
   2             (1) E  value Rwgm
   3       (4) F  value sm4Y
   4          (1) G  value yOb6
   5    (10) H  value cG9l
   6             (1) I  value uVKS
   7          (2) J  value XEfc
   8       (5) J  value zFnH
   9          (2) L  value QU6S
  10             (1) L  value VKAH
  11 (27) M  value 7N1l
  12             (1) M  value 3Idc
  13          (3) N  value hplK
  14             (1) O  value gluL
  15       (8) O  value lxXQ
  16             (2) O  value EqnK
  17                (1) O  value 4TuE
  18          (4) P  value jwMS
  19             (1) Q  value yL85
  20    (16) R  value zFgX
  21             (1) S  value 0HuQ
  22          (4) V  value fARW
  23             (2) V  value VJgN
  24                (1) V  value cgkP
  25       (7) W  value NZZA
  26             (1) X  value 5J8z
  27          (2) Y  value 1AIM

Key A not found in tree.
Key B not found in tree.
Key C not found in tree.
Key D found at row 1, value fSeE
Key E found at row 2, value Rwgm
Key F found at row 3, value sm4Y
Key G found at row 4, value yOb6
Key H found at row 5, value cG9l
Key I found at row 6, value uVKS
Key J found at row 7, value XEfc
Key K not found in tree.
Key L found at row 9, value QU6S
Key M found at row 11, value 7N1l
Key N found at row 13, value hplK
Key O found at row 14, value gluL
Key P found at row 18, value jwMS
Key Q found at row 19, value yL85
Key R found at row 20, value zFgX
Key S found at row 21, value 0HuQ
Key T not found in tree.
Key U not found in tree.
Key V found at row 22, value fARW
Key W found at row 25, value NZZA
Key X found at row 26, value 5J8z
Key Y found at row 27, value 1AIM
Key Z not found in tree.
The balance tree was invented in 1962 by Georgy Adelson-Velsky and Evgenii Mikhailovich Landis. Landis, born in 1921, passed away in Moscow in 1995. Adelson-Velsky , born in 1922, is alive and well at last count. May he enjoy a great many more sunny days.

Here is a snippet of his page from Google's Russian->English translator, humbly mentioning this computational cornerstone: "Finally, it should be noted the current reference design developed by EM Landis and myself, for which the number of elementary operations, both for the inclusion of a new piece of information, and for the exclusion of the old proportional to the logarithm of reference."

On the left is the output of a C program that creates a small AVL tree using keys picked randomly from letters of the alphabet, and randomly generated values. It then searches for each possible key and if found, displays the value associated with it and which row the key would be in if they were listed in order.

To facilitate this, the insert and delete functions maintain a count in each node of the number of elements in that node's subtree. As the nodes are descended and their keys compared to the key being searched for, the size of the left subtree plus 1 is added to a total whenever a right branch is followed or the sought-after key is found in a node. That total is the node's ordinal value; the position of the found key within an ordered list of keys.

avlc_t * tree = avlc_init(allow_dupes);

This AVL implementation allows duplicate keys on a per-tree basis. However, for applications that have no need for duplicate keys or row numbering, the following four functions should do the job.

intavlc_add(avlc_t *tree, char *key, void *value);
void * avlc_get(avlc_t *tree, char *key);
void * avlc_del(avlc_t *tree, char *key);
intavlc_free(avlc_t *tree);

In the example on the left, note that the searches for J and O find the first ordered instance of the key, as opposed to the first occurrence encountered while descending the tree. Searching in this manner requires additional processing time, but may be necessary to achieve the desired outcome for trees that allow duplicate keys or that can perform searches on partial matches of keys. Several different search functions are included.

Download the tarball here: avl_2010.tar.gz
Included are a couple test files and a memory leak tester.

Functions

avlc_t *  avlc_init(int allow_dupes); Create an instance of a tree. Set allow_dupes to 0 to cause avlc_add() to reject duplicate keys, or 1 to accept and store them.
int
avlc_add(avlc_t *tree, char *key,
void *value);
Adds a key/value pair to the specified tree. Key must be a null-terminated string. A copy of the key is made, so the caller's copy of the key can be freed after calling this function. Value can be anything that fits into sizeof(void *) bytes. Returns 1 to indicate success, or 0 to indicate failure due to attempted duplicate key insertion on a !allow_dupes tree.
void *  avlc_get(avlc_t *tree, char *key); Search the tree for an exact match of key and return the associated value. If multiple possible matches exist, the value returned is the one associated with the first match encountered as the tree is descended. Returns NULL if not found.
void * avlc_del(avlc_t *tree, char *key); Search the tree for an exact match of key and delete the key/value pair if found. The first match encountered as the tree is descended will be deleted. The copy of the key string made earlier by avlc_add() is freed, but it is the responsibility of the caller to free any data structure pointed at by the void * value. The value is returned to indicate key found and node deleted, or NULL if key was not found.
int avlc_free(avlc_t *tree); Free all of the key/value pairs in the tree. Before using this function, the caller may wish to use avlc_walk() and pass a function pointer to a routine to free data structures pointed at by value.
void * 
avlc_get_len(avlc_t *tree, char *key,
int len);
This form of avlc_get() is intended for cases where key is not null-terminated. If multiple possible matches exist, the key returned is the first match encountered as the tree is descended.
void * 
avlc_search(avlc_t *tree, char *key,
char *next, int maxlen);
This function performs partial matching and seeks the lowest ordered match in the list rather than the first match encountered. If a tree consists of keys "xy" and "xz" and an avlc_search for the key "x" is performed, the value associated with key "xy" will be returned, and "xz" will be copied into the memory pointed at by next. Set maxlen as you would for strncpy(next, source, maxlen).
int avlc_walk(avlc_t *tree, avlc_func_t *func); Traverse the tree in order, passing the key and value to the caller's function for each node, and halting if the caller's function returns nonzero. This function is typically used prior to calling avlc_free() to free memory associated with data structures pointed at by void *value.
int
avlc_walk_parm(avlc_t *tree,
avlc_func_parm_t *func,
void *parm);
This function works just like avlc_walk(), but allows a parameter to be passed to the caller's function.
The following five functions require #define TRACK_SUBTREE_SIZE in avlc.h. If indexing is not needed, a small performance gain can be had by #undef-ing the TRACK_SUBTREE_SIZE code out, which will remove the node subtree size counts and the code that maintains them when adding to, deleting from, and rebalancing the tree.
char *
avlc_get_key_by_index(avlc_t *tree,
int index);
Return a pointer to the key at the position requested by the index if the keys were in sorted order. Index is 1-based, so a tree with 3 elements would return keys for index values 1, 2, or 3. Returns NULL if index is out of range.
void *
avlc_get_value_by_index(avlc_t *tree,
int index);
Return the value associated with the key at the position requested by the index. Returns NULL if index is out of range.
void *
avlc_get_value_and_index(avlc_t *tree,
char *key,
int *index);
Search the tree for an exact match of key and return the associated value along with index, its ordinal position in a sorted list of all keys in the tree. If multiple possible matches exist, the returned is the one with the lowest ordinal value, as opposed to the first match encountered as the tree is descended. If the key is not found, the value is NULL and the index is negative or zero.
void *
avlc_del_index(avlc_t *tree,
int *index);
Deletes the key/value pair at the specified 1-based index. Returns NULL if the index was out of range, or the value stored in the node that was deleted. The caller can then free data structures pointed at by the value, if any.
int
avlc_walk_range(avlc_t *tree,
avlc_func_parm_t *func,
void *parm,
int first, int last);
Works like avlc_walk_parm, but only walks the tree from the specified first index to the last. The caller's function can halt the walk by returning nonzero.
Typedefs for the caller-provided functions that are called by the avlc_walk functions:
typedef int avlc_func_t(char *key, void *value);
typedef int avlc_func_parm_t(char *key, void *value, void *parm);