What is a Skip List in Data Structure?

A skip list is a data structure that is similar to a linked list, but allows for efficient search, insertion, and deletion of elements. It achieves this by using a hierarchical structure of linked lists, where each level of the list represents a different level of granularity in the data. The elements in the highest level list are the most granular, while the elements in the lowest level list are the least granular. This allows for the search to start at the highest level and "skip" over elements that are not likely to be the target, thereby reducing the number of elements that need to be checked. The skip list was first proposed by William Pugh in a 1989 paper.

How Skip List Works in Data Structure?

A skip list works by using a hierarchy of linked lists, where each level of the list represents a different level of granularity in the data. Each element in the list is represented by a node, which contains the value of the element and links to the next element at the same level and the next level down. The highest level of the skip list is the most granular, meaning it contains the most elements, and the lowest level is the least granular, meaning it contains the fewest elements. The elements at the higher level are a subset of the elements at the lower level.

When inserting a new element, a random level is chosen for it, with the level being higher with lower probability. Then the element is inserted into each level where it belongs, following a similar process to a regular linked list insertion.

When searching for an element, the search starts at the highest level and follows the links until it finds an element that is equal to or greater than the target. If it encounters a link to a lower level, it will follow that link and continue the search at the next level. As it moves down the levels, the granularity of the data increases, so the search becomes more precise. This allows the search to skip over elements that are not likely to be the target, thereby reducing the number of elements that need to be checked.

The skip list can also be used for deletion of elements by starting from the highest level and moving down the levels until the element is found, and then removing it from each level it is present in.

Types of Skip List

There are two main types of skip lists:

Simple Skip List: In this type, each node contains only one element of the list. The node has a forward pointer for each level in the list. Each pointer points to the next element at the same level. Simple skip lists are easy to implement and understand, but they can use more memory than other types of skip lists as the number of elements in the list increases.
Multi-list Skip List: In this type, each node contains multiple elements of the list. The idea behind this type of skip list is to reduce the number of nodes needed to represent a list. The node contains an array of elements and an array of forward pointers, one for each level. Each pointer points to the next node at the same level that contains an element greater than or equal to the element in the current node. Multi-list skip lists can use less memory than simple skip lists, but they can be more complex to implement and understand.

Another variation of Skip List is the Weighted Skip List, in this type, each node has a weight associated with it. This weight controls the probability of the node appearing in higher levels of the list. With this feature, it allows the list to adapt to the distribution of elements in the list, and make operations more efficient.

Example of Skip List

An example of a simple skip list implementation in C++ using the STL's list class to represent the linked lists at each level, and a std::uniform_int_distribution to generate the random level for each insertion:

C++

Python

Java

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <list>
#include <unordered_map>
#include <random>

class SkipList {
    public:
        SkipList() {
            // initialize the seed for random number generation
            std::srand(std::time(0));
        }

        void insert(int value) {
            std::list<int>::iterator it;
            std::unordered_map<int, std::list<int>::iterator>::iterator mapIt;

            // choose a random level for the new element
            int level = 0;
            while (std::rand() % 2 == 0) {
                level++;
            }

            // insert the element at each level
            for (int i = 0; i <= level; i++) {
                if (i == lists.size()) {
                    // create a new level if it doesn't exist
                    lists.push_back(std::list<int>());
                }

                it = lists[i].begin();
                mapIt = map.find(i);
                if (mapIt != map.end()) {
                    it = mapIt->second;
                }

                // find the position to insert the element
                while (it != lists[i].end() && *it < value) {
                    map[i] = it;
                    it++;
                }

                // insert the element
                lists[i].insert(it, value);
            }
        }

        bool search(int value) {
            std::list<int>::iterator it;
            std::unordered_map<int, std::list<int>::iterator>::iterator mapIt;

            // start at the highest level
            for (int i = lists.size() - 1; i >= 0; i--) {
                it = lists[i].begin();
                mapIt = map.find(i);
                if (mapIt != map.end()) {
                    it = mapIt->second;
                }

                // follow the links to the next element
                while (it != lists[i].end() && *it < value) {
                    map[i] = it;
                    it++;
                }

                // check if the element is found
                if (it != lists[i].end() && *it == value) {
                    return true;
                }
            }

            return false;
        }

    private:
        std::vector<std::list<int>> lists;
        std::unordered_map<int, std::list<int>::iterator> map;
};

Applications of Skip List

Skip lists are a relatively simple data structure and are used in a variety of applications because of their efficient search, insertion, and deletion performance. Some common applications include:

Database indexing: Skip lists can be used to index data in a database, allowing for efficient search, insertion, and deletion of records.
Sorting algorithms: Skip lists can be used as the underlying data structure in sorting algorithms to sort large datasets.
Geographic information systems: Skip lists can be used to index and search large datasets of geographic data, such as maps and satellite images.
Network routing: Skip lists can be used to efficiently store and search routing tables in network routers.
Graph algorithms: Skip lists can be used to implement data structures for graph algorithms, such as shortest path algorithms.
Text search engines: Skip lists can be used to index and search large datasets of text, such as in search engines and text editors.
In-memory databases: Skip lists are used as an indexing structure in some in-memory databases to provide efficient data access.
Multi-dimensional data: Skip Lists are used in multi-dimensional data to provide efficient search, insertion and deletion operations.

Overall, Skip Lists are a powerful data structure that can be used to implement efficient algorithms for a wide range of applications.

Time and Space Complexity of Skip List

The time complexity of operations on a skip list is as follows:

Search: O(log n) average and O(n) worst case. The average case time complexity is O(log n) because the search starts at the highest level and skips over elements that are not likely to be the target, reducing the number of elements that need to be checked. The worst case time complexity is O(n) if the elements are not evenly distributed across the levels, and the search has to check all elements.
Insertion: O(log n) average and O(n) worst case. The average case time complexity is O(log n) because the insertion is done at each level where it belongs. The worst case time complexity is O(n) if the elements are not evenly distributed across the levels, and the insertion has to check all elements.
Deletion: O(log n) average and O(n) worst case. The average case time complexity is O(log n) because the deletion is done at each level where the element is present. The worst case time complexity is O(n) if the elements are not evenly distributed across the levels, and the deletion has to check all elements.

Regarding space complexity, it depends on the implementation of the skip list, but generally speaking:

Simple Skip List: For a simple skip list, the space complexity is O(n) as each element of the list takes one node, and each node takes a constant amount of space.
Multi-list Skip List: For a multi-list skip list, the space complexity is O(m*n), where m is the average number of elements in each node, and n is the number of elements in the list.
Weighted Skip List: For a weighted skip list, the space complexity is O(n), as each element takes one node, and each node takes a constant amount of space.

Keep in mind, that the above complexities are average case complexities, and the actual time and space complexity of a skip list can vary depending on the specific implementation and the distribution of elements in the list.

Advantages and Disadvantages of Skip List

Advantages of skip lists:

Efficient search, insertion, and deletion: Skip lists have a time complexity of O(log n) on average for search, insertion, and deletion operations, making them more efficient than other data structures such as unordered arrays or linked lists.
Simple to implement: Skip lists are relatively simple to implement, and the basic structure is similar to a linked list, making it easy to understand and implement.
Dynamic: Skip lists can adjust to changes in the number of elements in the list, making them useful for applications where the number of elements may change over time.
Space-efficient: Skip lists can be more space-efficient than other data structures such as balanced trees, as they only use a small amount of extra space for the additional pointers at each level.
Handling large datasets: Skip lists can handle large datasets efficiently as it uses a hierarchical structure of linked lists that allow for efficient search, insertion and deletion of elements.

Disadvantages of skip lists:

Randomness: The performance of skip lists can be affected by the randomness of the levels chosen for each element, which can lead to unbalanced lists and poor performance in certain cases.
Complexity: Skip lists can be more complex than other data structures, and the implementation