It’s always interesting to learn something new, especially if it’s fundamental knowledge.
Recently I found a very useful course about algorithms.
The lectures are meaningful and practical. I definitely recommend them to everyone who wants to understand basic algorithms.

All algorithms in the course are implemented in Java, but I decided to write them in Ruby too. I understand that it may not be the best idea (Java is more suitable
for this), but I’m doing it for myself. In my view, it doesn’t matter which language you choose in this case. The main thing is understanding the algorithms.

So, in this post I want to tell you about Union-Find algorithms.

## Dynamic connectivity

First, we should understand why this type of algorithm exists. Just have a look at the picture:

As you can see, there are two points: `p` and `q`. Can you answer the question: is there a path connecting `p` and `q`?

This type of problem is called a `connectivity` problem. Wikipedia describes it like this:

> In computing and graph theory, a dynamic connectivity structure is a data structure that dynamically maintains information about the connected components of a graph.

So, `dynamic connectivity` is a data structure which can be used to describe a graph.
But what is it actually? How can we use it to answer the question?

It means that we can describe the path from `p` to `q` through a graph of elements. Each element is a point which can be connected to another one.
For example, let’s say we have 10 objects which are initially separate (they are independent components).
We can connect them together and check whether some of them are already connected. The connection action is called `union`. `Find` is a query which checks whether two objects are in
the same component. So, we can `union` 4 with 3:

There are more connectivity examples:

Now we know what `dynamic connectivity` is, but how can we implement it?
Actually, there are several algorithms for the Union-Find operations.

## Quick-Find

Quick-Find is the easiest UF algorithm.

```
class QuickFind
  def initialize(size)
    @ids = (0..size - 1).to_a
  end

  def find(index)
    @ids[index]
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    old, new = @ids[p], @ids[q]
    @ids.map! { |val| val == old ? new : val }
  end

  def count
    @ids.uniq.size
  end
end
```

The `find` query just checks whether `p` and `q` are in the same component (have the same value).
The `union` query changes all entries equal to `@ids[p]` to `@ids[q]`.
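
To make the behavior concrete, here is a small usage sketch of the `QuickFind` class above (restated so the snippet runs on its own); the object count 10 and the pairs are arbitrary:

```ruby
class QuickFind
  def initialize(size)
    @ids = (0..size - 1).to_a
  end

  def find(index)
    @ids[index]
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    old, new = @ids[p], @ids[q]
    # Every entry equal to @ids[p] is rewritten to @ids[q]
    @ids.map! { |val| val == old ? new : val }
  end

  def count
    @ids.uniq.size
  end
end

uf = QuickFind.new(10)
uf.union(4, 3)
uf.union(3, 8)
uf.connected?(4, 8)  # => true
uf.count             # => 8
```

Note that every `union` scans the whole array, which is exactly why Quick-Find becomes slow on big inputs.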

## Quick-Union

`Quick-Union` is a more complex algorithm than `Quick-Find`.

```
class QuickUnion
  attr_reader :count

  def initialize(size)
    @ids = (0..size - 1).to_a
    @count = size
  end

  def find(index)
    @ids[index] == index ? index : find(@ids[index])
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    @ids[find(p)] = find(q)
    @count -= 1
  end
end
```

The `find` query checks whether `p` and `q` have the same root.
`Union` changes the id of `p`’s root to the id of `q`’s root.

In this case the `find` operation is more complicated than `union`. Finding the root means traversing the whole path through the graph
from an element to its root, and a tree of components can be very deep.
In other words, we can `union` a very tall tree, which makes `find` operations expensive.
That is why we can improve this algorithm by taking the sizes of the joined components into account.
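
To see why tall trees hurt, consider unions that form a chain. This sketch uses the `QuickUnion` class from above (restated so it runs standalone):

```ruby
class QuickUnion
  attr_reader :count

  def initialize(size)
    @ids = (0..size - 1).to_a
    @count = size
  end

  def find(index)
    @ids[index] == index ? index : find(@ids[index])
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    @ids[find(p)] = find(q)
    @count -= 1
  end
end

uf = QuickUnion.new(5)
uf.union(0, 1)
uf.union(1, 2)
uf.union(2, 3)
uf.union(3, 4)

# The internal array is now [1, 2, 3, 4, 4]: a chain 0 -> 1 -> 2 -> 3 -> 4,
# so find(0) has to walk through every element to reach the root.
uf.find(0)  # => 4
uf.count    # => 1
```

With n elements, such a chain makes `find` take time proportional to n.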

## Weighted Quick-Union

It’s an improved version of the `Quick-Union` algorithm.

```
class WeightedQuickUnion
  attr_reader :count

  def initialize(size)
    @ids = (0..size - 1).to_a
    @size = Array.new(size, 1)
    @count = size
  end

  def find(index)
    @ids[index] == index ? index : find(@ids[index])
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    root_p, root_q = find(p), find(q)
    bigger?(root_p, root_q) ? join(root_q, root_p) : join(root_p, root_q)
    @count -= 1
  end

  private

  def bigger?(root_1, root_2)
    @size[root_1] > @size[root_2]
  end

  def join(root_1, root_2)
    @ids[root_1] = root_2
    @size[root_2] += @size[root_1]
  end
end
```

As you can see, we have an array `@size`, which stores the sizes of all components.
The `union` query compares the sizes of the components and always joins the smaller tree to the bigger one.
This keeps the trees shallow, so `find` needs fewer steps.
So, the order of joining is important.
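
Running the same chain of unions through `WeightedQuickUnion` shows the difference: because the smaller tree is always attached to the bigger one, every element ends up pointing directly at the root (class restated so the sketch runs standalone):

```ruby
class WeightedQuickUnion
  attr_reader :count

  def initialize(size)
    @ids = (0..size - 1).to_a
    @size = Array.new(size, 1)
    @count = size
  end

  def find(index)
    @ids[index] == index ? index : find(@ids[index])
  end

  def connected?(p, q)
    find(p) == find(q)
  end

  def union(p, q)
    return if connected?(p, q)
    root_p, root_q = find(p), find(q)
    bigger?(root_p, root_q) ? join(root_q, root_p) : join(root_p, root_q)
    @count -= 1
  end

  private

  def bigger?(root_1, root_2)
    @size[root_1] > @size[root_2]
  end

  def join(root_1, root_2)
    @ids[root_1] = root_2
    @size[root_2] += @size[root_1]
  end
end

uf = WeightedQuickUnion.new(5)
uf.union(0, 1)
uf.union(1, 2)
uf.union(2, 3)
uf.union(3, 4)

# Every element now points directly at root 1, so the tree has depth 1
# (peeking at the internal array for illustration only):
uf.find(0)                       # => 1
uf.instance_variable_get(:@ids)  # => [1, 1, 1, 1, 1]
```

The same unions that produced a chain of depth 4 in plain Quick-Union now give a flat tree.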

## Weighted Quick-Union With Path Compression

Nonetheless, we can make it even better!
Every time we pass through the tree (while computing the root of `p`), we can set
the id of each examined node to point further up, closer to that root.
For example, suppose we have this tree:

After `find` it will be:

All we need is to change a few lines of our `find` method.

```
def find(index)
  return index if @ids[index] == index
  @ids[index] = @ids[@ids[index]]
  find(@ids[index])
end
```
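
Here is a small sketch of what this does to a degenerate chain; `PathCompressionDemo` is a made-up helper that exposes the parent array so we can watch it change. Note that this one-pass variant is technically known as path halving: each examined node is pointed at its grandparent, which is why the chain below is halved rather than fully flattened in a single pass.

```ruby
class PathCompressionDemo
  attr_accessor :ids

  def initialize(ids)
    @ids = ids
  end

  # Same find as above: point each examined node at its grandparent
  def find(index)
    return index if @ids[index] == index
    @ids[index] = @ids[@ids[index]]
    find(@ids[index])
  end
end

demo = PathCompressionDemo.new([1, 2, 3, 4, 4]) # chain 0 -> 1 -> 2 -> 3 -> 4
demo.find(0)  # => 4 (the root)
demo.ids      # => [2, 2, 4, 4, 4] -- the path is halved
```

Repeated `find` calls keep flattening the tree, so the paths stay short in practice.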

## Tests

Let’s test it!

I found some data sets.
One of them includes 11 union operations,
the other one 900.

```
require 'benchmark'

class Benchmarks
  def tiny
    test File.read('./tiny_UF.txt')
  end

  def medium
    test File.read('./medium_UF.txt')
  end

  private

  def test(str)
    data = str.split("\n")
    size, pairs = data.shift, data
    Benchmark.bm(20) do |bm|
      %w(QuickFind QuickUnion WeightedQuickUnion WeightedQuickUnionPC).each do |uf_class|
        bm.report(uf_class) do
          uf = Object.const_get(uf_class).new size.to_i
          union(uf, pairs)
        end
      end
    end
  end

  def union(uf, pairs)
    pairs.each do |pair|
      p, q = pair.split(' ')
      uf.union(p.to_i, q.to_i)
    end
  end
end
```

```
# tiny data set
                           user     system      total        real
QuickFind              0.000000   0.000000   0.000000 (  0.002838)
QuickUnion             0.000000   0.000000   0.000000 (  0.000091)
WeightedQuickUnion     0.000000   0.000000   0.000000 (  0.000078)
WeightedQuickUnionPC   0.000000   0.000000   0.000000 (  0.000056)

# medium data set
                           user     system      total        real
QuickFind              0.060000   0.010000   0.070000 (  0.051383)
QuickUnion             0.010000   0.000000   0.010000 (  0.002796)
WeightedQuickUnion     0.000000   0.000000   0.000000 (  0.001658)
WeightedQuickUnionPC   0.010000   0.000000   0.010000 (  0.001589)
```

So, everything works as we expected, and `Weighted Quick-Union With Path Compression` is
the quickest algorithm.

Union-Find algorithms performance:

## Conclusion

Union-Find algorithms are widely used. We can see them in different applications:

- Dynamic connectivity
- Games (Go)
- Percolation
- Least common ancestor
- Equivalence of finite state automata
- etc.

You can read more about UF here.

Thanks Coursera for this amazing course.

In the next publication I will tell you about stacks and queues.

Thanks for reading and stay tuned!