# mot.la

## Calculating Euclidean Distance in Ruby

Euclidean distance is often used in conjunction with the k-means algorithm.
Remember the **Pythagorean theorem** from high school geometry?
**Euclidean distance is the same theorem, extended to N dimensions.**

For example, you can calculate the distance between two points in 2D, 3D, or 4D space, and beyond. Let's start with a 2D example.

### 2D Example

In the line chart above, you can see we have two data points. The first has a value of 1 for X and 3 for Y. The second has a value of 4 for X and 1 for Y.

If we were in high school geometry, we would draw lines to turn this into a right triangle. Then we'd label the sides A, B, and C.

Once the two legs are labeled A and B, we can give them lengths. In the chart above, A has a length of 2 (the difference in Y, 3 - 1) and B has a length of 3 (the difference in X, 4 - 1).

With that information, we can calculate the length of line C (the connecting line) with the Pythagorean formula `sqrt(A^2 + B^2) = C`. Plugging in the numbers, we get `sqrt(2^2 + 3^2) = C`, which becomes `sqrt(4 + 9) = C`, which in turn becomes approximately `3.606 = C`.
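As a quick sanity check on the arithmetic, the same calculation can be run in Ruby. This is a minimal sketch: the variable names `a`, `b`, `c`, and `c_hypot` are just illustrative labels for the triangle's sides. `Math.hypot` is part of Ruby's core `Math` module and returns the hypotenuse for two given legs.

```ruby
a = 2.0 # length of side A
b = 3.0 # length of side B

# Pythagorean theorem written out by hand
c = Math.sqrt(a**2 + b**2)

# Math.hypot computes the same hypotenuse directly
c_hypot = Math.hypot(a, b)

puts c.round(3)       # prints 3.606
puts c_hypot.round(3) # prints 3.606
```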

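You could also write the formula directly in terms of the two points' coordinates. In Ruby, that looks like the following.

    point1 = [1, 3]
    point2 = [4, 1]
    # Ruby uses ** for exponentiation; ^ is bitwise XOR
    distance = Math.sqrt((1 - 4)**2 + (3 - 1)**2) # => about 3.606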

Making sense? You just used Euclidean distance to calculate the distance between two points in 2D space. Let's try it in 3 dimensions next.

### 3D Example

In 3 dimensions, we can model our vectors as arrays of 3 numbers rather than 2. `[X, Y]` becomes `[X, Y, Z]`. Let's build on our last example and add a Z value to each point.

    point1 = [1, 3, 2]
    point2 = [4, 1, 4]
    # Same formula with one more squared difference for Z
    distance = Math.sqrt((1 - 4)**2 + (3 - 1)**2 + (2 - 4)**2) # => about 4.123
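To see the intermediate values, the 3D calculation can be broken into steps in Ruby. This is only a sketch: the names `differences` and `squares` are chosen here for illustration and are not from any library.

```ruby
point1 = [1, 3, 2]
point2 = [4, 1, 4]

# Pair up matching coordinates and subtract them
differences = point1.zip(point2).map { |a, b| a - b } # => [-3, 2, -2]

# Square each difference so negative values become positive
squares = differences.map { |d| d**2 } # => [9, 4, 4]

# Sum the squares and take the square root
distance = Math.sqrt(squares.sum) # => about 4.123
```

Because each step works on the whole array, nothing here depends on the points having exactly 3 coordinates.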

### Conclusion

That's it! All we had to do was add the 3rd dimension to the distance calculation (`(2-4)^2`). Adding a 4th dimension is the same.

    point1 = [1, 3, 2, 8]
    point2 = [4, 1, 4, 9]
    # The 4th dimension adds one more squared difference
    distance = Math.sqrt((1 - 4)**2 + (3 - 1)**2 + (2 - 4)**2 + (8 - 9)**2) # => about 4.243

From here, you can follow that pattern to add N dimensions as necessary for your k-means algorithm.
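One way to capture that pattern is a small Ruby method that works for any number of dimensions. This is a minimal sketch, not a definitive implementation: `euclidean_distance` is a hypothetical helper name, the centroid values are made up for illustration, and the block form of `sum` assumes Ruby 2.4 or newer.

```ruby
# Hypothetical helper: Euclidean distance between two points given as
# equal-length arrays of numbers.
def euclidean_distance(point1, point2)
  unless point1.length == point2.length
    raise ArgumentError, "points must have the same number of dimensions"
  end

  # Add up the squared difference along every dimension
  sum_of_squares = point1.zip(point2).sum { |a, b| (a - b)**2 }
  Math.sqrt(sum_of_squares)
end

puts euclidean_distance([1, 3], [4, 1]).round(3)             # prints 3.606
puts euclidean_distance([1, 3, 2], [4, 1, 4]).round(3)       # prints 4.123
puts euclidean_distance([1, 3, 2, 8], [4, 1, 4, 9]).round(3) # prints 4.243

# Example use in a k-means-style assignment step (centroid values are made up)
centroids = [[0, 0], [5, 5]]
nearest = centroids.min_by { |centroid| euclidean_distance([1, 3], centroid) }
p nearest # prints [0, 0]
```

Raising an `ArgumentError` on mismatched lengths is a design choice made here, since `zip` would otherwise pad the shorter point with `nil` and fail with a less helpful error.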