The B-TREE data structure

Today it's most common to just leverage APIs and frameworks. That's cool because nobody has time to go in and implement HTTP server or TCP communication.
That's why the web grows so rapidly, because the infrastructure is already in place.
Now, what happens when we just use something out of the box and don't know anything about how it functions? Let's take the B-Tree data structure as an instance.
Which is my favourite data structure.

Why is it important to go deep into details?

When we do that we become an informed creator instead of just going with the flow and not understanding stuff.
For example: when we understand the B-Tree structure we find where it's best use cases lie

  • Is it just in the DB index?
  • What happens when the page splits?
  • Why is the time complexity log(n)
So as a summary: The depth of knowledge allows for more scalable solutions, as you can anticipate issues before they arise.

Binary tree or Balanced tree

When the term B-Tree is mentioned we first associate it with the binary tree, most often we learn about this data structure much earlier simpliy because its not as complex as the Balanced Tree
Although both are trees there is a fundamental difference between the binary and balanced Tree

The binary tree can easily become a linked list if the data is skewed in a particular manner. For example, if we add numbers in ascending or descending order to a binary search tree, each new number will be added to the right or left of the previous node, respectively. Which resembles a linked list, leading to inefficient operations since the tree's height is no longer balanced and can reach O(n) time complexity for insertion, deletion, and search operations.

The balanced tree doesn't have this problem as a self-balancing algorithm keeps the tree height minimal. This ensures that operations like insertion, deletion, and search maintain their efficiency, typically at O(log n) time complexity.

Implementation and how to get your head around this

Now enough rambling, let's see the implementation and how to grasp the structure

Link to repo
The link above is to my github repo where I have a C++ implementation of a B-Tree with insert and search
I went into much detail with comments and the cleanest code I could write because at the end of the day it all comes down to pointer manipulation

I don't think there is a shortcut to grasp the subject, the structure requires knowledge in how memory works and how pointers work, so after grapsing that, it's just a process of understanding the page split which will help you understand the algorithm


Important: There is a logic that generates a DOT file which later generates an image using the command dot -Tpng bplustree.dot -o bplustree.png

I think this way you would be able to play around with the structure and see how it's represented in memory



Email me or message me on linkedin if you need any help