Vectors are multi dimensional numbers

tl;dr

Vectors are multi dimensional numbers written as ordered lists, so [1, 2] is not the same as [2, 1]. You add them component by component, the same way you add regular numbers. A scalar is a constant that multiplies every component, stretching or shrinking the vector, or reversing its direction.

As an engineer, I have always understood matrices and vectors simply as arrays, possibly with multiple dimensions. Understanding some basic maths and vector multiplication logic helps me visualise and rationalise some of my decisions. However, it does not click as well as it should. The purpose of this post is to go over the absolute fundamentals of vectors and figure out how to understand them intuitively.

The reason I'm approaching this from a maths perspective rather than a CS one is that mathematical modeling appears to be fundamental, and abstractions in maths can be modeled by a computer a lot more easily than physical or computer science concepts can be fitted into maths.

Vectors as multi dimensional numbers

A vector is an ordered list of numbers, and the way I find clearest to think about it is as a multi dimensional number.

A one dimensional number, like $2$ or $3$, lives on the number line. We start at the origin, $0$, and walk $2$ or $3$ steps to the right, in the positive direction (along the usual $x$ axis). That number is just a point in space measured from the origin. Negative numbers are the same idea: we just walk backwards (to the left).

[Figure: number line from -3 to 3, with a walk of +2 to the right of the origin and a walk of -2 to the left]

Let's say that now we want to add another dimension, the usual $y$ axis. This means a number now needs two coordinates, one for $x$ and one for $y$.

We can represent this multi dimensional number as a vector:

$$\vec{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

We start at the origin $(0, 0)$, walk $1$ step along the $x$ axis and $2$ steps along the $y$ axis, and land on a point. That point is the tip of the vector.

[Figure: the vector v on the x-y plane, reached by walking 1 right and 2 up]

What's important here is that $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is a distinct vector on its own. One vector corresponds to one ordered combination of numbers, and one ordered combination of numbers corresponds to one vector.

Which means $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is not the same as $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$. Once visualised, this becomes obvious.

[Figure: the vectors [1, 2] and [2, 1] on the x-y plane]
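The same point can be checked in code. This is only a minimal Python sketch, assuming we model a vector as a plain tuple of numbers; the names `v` and `u` are made up for illustration. Tuples compare position by position, so the order carries meaning, whereas a set throws the order away.

```python
# Model a 2D vector as an ordered tuple: (x, y).
v = (1, 2)  # 1 step along x, 2 steps along y
u = (2, 1)  # 2 steps along x, 1 step along y

# Same numbers, different order, so these are different vectors.
same_vector = (v == u)             # False

# A set ignores order, which is exactly what a vector must not do.
same_numbers = (set(v) == set(u))  # True

# The dimension is simply the number of components.
dimension = len(v)                 # 2
```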

The same logic applies to vectors with more dimensions. A 3D vector is three numbers, a 100D vector is 100 numbers, and so on. Past three dimensions we lose the ability to visualise it properly, but the underlying concept remains the same.

[Figure: a 3D vector on x, y, z axes: 2 along x, 3 along y, 1 along z]

Vectors are multi dimensional numbers, written as an ordered list of components.

What vectors can represent

Once you treat a vector as a multi dimensional number, a lot of things turn out to be vectors. A few that come up:

Whatever the components mean, the geometric intuition still holds. If it's an ordered tuple of numbers, it's a vector.

A representation is only useful if you can do things to it. For vectors, the two basic operations are adding and scaling, and everything more interesting (dot products, matrix transformations, etc.) is built out of those.

Adding vectors

Adding vectors lets us combine two displacements, two forces, or any two quantities with direction and size.

The intuition is easiest to see on the one dimensional number line.

Take the addition of $2$ and $3$. The result is $5$, but on the number line what we actually do is walk $2$ steps from the origin, then walk $3$ more, ending at $5$. This is how we learned to add numbers in primary school.

[Figure: number line from 0 to 6, with a walk of +2 followed by a walk of +3]

The same idea works in two dimensions, with:

$$\vec{v} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad \vec{w} = \begin{bmatrix} 3 \\ 5 \end{bmatrix}$$

The geometric idea is the same as on the number line. We walk $\vec{v}$ from the origin, then walk $\vec{w}$ starting from where $\vec{v}$ ended. The tip of the second walk is our result.

[Figure: the vectors v, w, and v + w on the x-y plane]

Vectors here are really arrows with a direction and a length. We draw the first vector from the origin, use the tip of that vector as the origin for the second one, and the result is the tip of the second vector measured from the original origin.

This gives the same answer as adding the components directly:

$$\vec{v} + \vec{w} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 2 + 3 \\ 1 + 5 \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$$

Or generally, you add component by component:

$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 \\ y_1 + y_2 \end{bmatrix}$$

The geometric walk and the component-wise sum are two ways of looking at the same operation.
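To see both views side by side, here is a small Python sketch. It assumes vectors are modelled as `(x, y)` tuples, and `add` is a hypothetical helper written for this post, not a library function. The walk and the direct sum land on the same point.

```python
def add(v, w):
    """Add two 2D vectors component by component."""
    (x1, y1), (x2, y2) = v, w
    return (x1 + x2, y1 + y2)

v = (2, 1)
w = (3, 5)

# Geometric walk: start at the origin, walk v, then walk w from there.
position = (0, 0)
position = add(position, v)  # now at the tip of v
position = add(position, w)  # now at the tip of v + w

# Component-wise sum in one step.
total = add(v, w)            # (5, 6)
```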

It's also worth noting that $\vec{v} + \vec{w}$ gives the same result as $\vec{w} + \vec{v}$. The walks can be done in either order. On the number line this is obvious because $2 + 3 = 3 + 2$. On the 2D plane it's a bit more interesting, because the two walks trace out a parallelogram and end up at the same corner either way. Worth remembering, because plenty of operations on vectors and matrices don't behave this nicely.
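The parallelogram can be traced step by step. This is a rough sketch under the same assumption as before (vectors as tuples, with a hypothetical `add` helper): the two paths pass through different intermediate corners but finish at the same point.

```python
def add(v, w):
    """Component-wise sum of two vectors of equal length."""
    return tuple(a + b for a, b in zip(v, w))

origin = (0, 0)
v = (2, 1)
w = (3, 5)

# Path 1: walk v first, then w.
corner_via_v = add(origin, v)     # (2, 1)
end_via_v = add(corner_via_v, w)  # (5, 6)

# Path 2: walk w first, then v.
corner_via_w = add(origin, w)     # (3, 5)
end_via_w = add(corner_via_w, v)  # (5, 6)
```

The origin, the two intermediate corners, and the shared end point are the four corners of the parallelogram.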

The same intuition holds in 3D and beyond.
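In code, nothing about the rule depends on there being exactly two components. A possible sketch, again assuming tuples as vectors and a hypothetical `add` helper, with the example values chosen only for illustration:

```python
def add(v, w):
    """Add two vectors of any (equal) dimension, component by component."""
    if len(v) != len(w):
        # Adding a 2D vector to a 3D vector has no meaning here.
        raise ValueError("vectors must have the same number of dimensions")
    return tuple(a + b for a, b in zip(v, w))

# 3D: three components each.
sum_3d = add((2, 3, 1), (1, 0, 4))  # (3, 3, 5)

# 100D: impossible to draw, but the arithmetic is identical.
ones = tuple(1 for _ in range(100))
twos = add(ones, ones)              # every component is 2
```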

Just like numbers, just in multiple dimensions!

Scaling vectors

We refer to multiplying a vector by some constant as scaling the vector. When we scale a vector, we really just stretch, squish, or flip it.

Take $\vec{v} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$.

Multiplying by a constant $3$:

$$3\vec{v} = 3 \cdot \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 3 \end{bmatrix}$$

This points in the same direction as $\vec{v}$ but is three times as long.

Multiplying by a constant $\frac{1}{2}$:

$$\frac{1}{2}\vec{v} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}$$

Same direction, half the length.

Multiplying by a constant $-1$:

$$-\vec{v} = \begin{bmatrix} -2 \\ -1 \end{bmatrix}$$

Same length, but pointing the opposite way.

[Figure: the vectors 3v, v, ½v, and -v on the x-y plane]

Or generally, you multiply each component by the constant:

$$c \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} cx \\ cy \end{bmatrix}$$

This process is called scaling, and the number doing the scaling is called a scalar. The name is exactly as literal as it sounds. It scales the vector.
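The three examples above can be reproduced with a short Python sketch. As before, this assumes vectors are tuples; `scale` and `length` are hypothetical helpers for illustration, with `length` using the standard library's `math.hypot` to measure the arrow.

```python
import math

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * x for x in v)

def length(v):
    """Euclidean length of a 2D vector."""
    x, y = v
    return math.hypot(x, y)

v = (2, 1)

tripled = scale(3, v)    # (6, 3): same direction, three times as long
halved = scale(0.5, v)   # (1.0, 0.5): same direction, half the length
flipped = scale(-1, v)   # (-2, -1): same length, opposite direction

# The length changes by the absolute value of the scalar.
ratio = length(tripled) / length(v)  # approximately 3
```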

Final notes

Treating a vector as a multi dimensional number, rather than a list with operations bolted on, is what makes the rest of linear algebra start to feel connected for me, instead of a stack of unrelated rules.

Multiplying vectors together is a separate topic, and once you start thinking of vectors as numbers in their own right, things like dot products and cross products stop feeling arbitrary. They describe geometric relationships between two multi dimensional numbers, not just arithmetic on lists.

The same goes for $\hat{\imath}$ and $\hat{\jmath}$ notation, where every 2D vector is written as a combination of two unit vectors. It's the same idea as decimal place value. Once a vector is a multi dimensional number, asking what the "$1$" of each axis looks like, and how every other vector is built out of those ones, is a natural question rather than a memorised rule. Spans, bases, and linear combinations all fall out of that.
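As a rough illustration of that idea, the sketch below builds a vector out of the two unit vectors using nothing but adding and scaling. It assumes vectors are tuples; `add`, `scale`, `from_basis`, `i_hat`, and `j_hat` are names invented for this example. It mirrors how the number 21 is 2 tens plus 1 one.

```python
def add(v, w):
    """Component-wise sum of two vectors of equal length."""
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * x for x in v)

# The "1" of each axis: one step along x, and one step along y.
i_hat = (1, 0)
j_hat = (0, 1)

def from_basis(x, y):
    """Build the vector (x, y) as x copies of i_hat plus y copies of j_hat."""
    return add(scale(x, i_hat), scale(y, j_hat))

v = from_basis(2, 1)  # (2, 1)
```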

And matrices stop being grids of numbers. A matrix is really a transformation that takes a multi dimensional number and produces another multi dimensional number. That intuition only really lands once vectors stop being lists and start being objects with their own arithmetic.