Data structure cheat sheet
A quick cheat sheet on common algorithms and data structures:
Linear Structures
Linked Lists
Basic Implementation
String
Related Algorithms
Arrays
Sorting and Ordering
Simple Sort
Insertion sort.
Quick Sort
The core of the quicksort is dividing the sorting array into half and repeat this operation. The dividing procedure is called partition, and there are two typical implementations of it: Lomuto and Hoare partition. The Lomuto partition process the elements one by one, and swap the elements smaller than the pivot to the left side. Hoare partition goes through the arrays on both ends, swap pairs on the wrong side, and end the progress when the two scan meets.
Quicksort on average takes time. Worse case it can cause . There are a number of algorithms based on the idea of doing partitions and processing either or both sides of it. If only one side of it is processed, it uses O(n) time.
Merge Sort
Median and order statistics
Selecting min/max from the array takes linear time, and selecting the general element takes a expected time using a method similar to quicksort.
Tree-based Structures
Binary Tree
Property
- A full n-level binary tree have elements.
- A n-node binary tree have levels.
Storage
Traversal
General traversal of binary trees all takes time, except for the root node, exactly two times of the search routine was called for each node.
For Binary Search Trees, Min/Max is easy - just follow the left/right link. Getting successor/predecessor is tricker: the next element of x is either a) the minimum element of the right subtree, or b) the lowest ancestor of x whose left child is also the ancestor of x.
Insertion and Deletion
Insertion is relatively simple, just follows the search procedure until you find an empty node.
Deletion is more complicated: if the target has two children, we find the successor of , if it is the right child of , it can be replaced by y, otherwise, we first replace y by its right child, then replace z using y.
Heap
A (binary) heap is a complete binary tree that keeps a sepcific condition of its nodes: For a max-heap, , for a min-heap, . A heap can be used to maintain a priority queue. A heap is often stored as a continous vector.
Heapify
Heapify is a fundamental operation to keep the heap property when there is a new node inserted into the root.
Complexity: . You can prove that using recusion master theroem.
Build a heap
To build a heap, call heapify() n/2 times for all non-leaf nodes.
Complexity:
Heapsort
To sort an array in increasing order, build a max heap, put the first element to the final position of the array, and maintain the heap property by calling heapify with the remaining element.
Complexity:
Priority Queue
A (max) priority queue support four operations: insert, max, extract-max, increase-key. With a heap, these four operations can be done using , , , time.
Codes
void max_heapify(Node* A, int size, int start) {
auto largest = start;
if(start * 2 < size && A[largest] < A[start * 2])
largest = start * 2;
if(start * 2 + 1 < size && A[largest] < A[start * 2 + 1])
largest = start * 2 + 1;
if(largest != start) {
swap(A[start], A[largest]);
maxheapify(A, size, largest);
}
}
void build_max_heap(Node* A, int size) {
for(int i = size / 2; i >= 0; --i) {
max_heapify(A, size, i);
}
}
n-ary Tree
Union Find
Hashing Table
Graphs
Storage
Fundamentally, the representation of a graph is always equivalent to an adjacent matrix. The most usual methods are uncompressed matrix, adjacent list, and compressed matrix.
Traverse
Properties
Algorithms
Dynamic Programming
Recursion
Divide and conquer
Searching
Mathematical Problems
Permutations and Combinations
Generate next lexicographic permutation
- find the longest non-increasing suffix:
67[1]5432
- find the last element that is larger than the previous one:
67[1]543[2]
- swap
67[2]543[1]
- reverse the tail:
672[1345]
Generate the k-th permutation of N
The idea is to generate the permutation one element by one element: we can determine the first element of the permutation by comparing k with N, then repeat this procedure.
Generate permutations with the duplications
This can be done with a simple backtracking search.