People must get taught math terribly if they think "I don't need to worry about piles of abstract math to understand a rotation, all I have to do is think about what happens to the XYZ axes under the matrix rotation". That is what you should learn in the math class!
Anyone who has taken linear algebra should know that (1) a rotation is a linear operation, (2) the result of a linear operation is calculated with matrix multiplication, (3) the result of a matrix multiplication is determined by what it does to the standard basis vectors, the results of which form the columns of the matrix.
This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
I took a linear algebra class, as well as many others. It didn't work.
Most math classes I've taken granted me some kind of intuition for the subject material. Like I could understand the concept independent from the name of the thing.
In linear algebra, it was all a series of arbitrary facts without reason for existing. I memorized them for the final exam, and probably forgot them all the next day, as they weren't attached to anything in my mind.
"The inverse of the eigen-something is the determinant of the abelian".
It was just a list of facts like this to memorize by rote.
I passed the class with a decent grade I think. But I really understood nothing. At this point, I can't remember how to multiply matrices. Specifically do the rows go with the columns or do the columns go with the rows?
I don't know if there's something about linear algebra or I just didn't connect with the instructor. But I've taken a lot of other math classes, and usually been able to understand the subject material readily. Maybe linear algebra is different. It was completely impenetrable for me.
You might want to try Linear Algebra Done Right by Sheldon Axler. It's a short book, succinct but extremely clear and approachable. It explains Linear Algebra without using determinants, which are relegated to the end, and emphasises understanding the powerful ideas underpinning the subject rather than learning seemingly arbitrary manipulations of lists and tables of numbers.
Those manipulations are of course extremely useful and worth learning, but the reasons why, and where they come from, will be a lot clearer after reading Axler.
As someone pointed out elsewhere in this thread, the book is available free at https://linear.axler.net/
To remind oneself how to multiply matrices together, it suffices to remember how to apply a matrix to a column vector, and that ((A B) v) = (A (B v)).
For each 1-hot vector e_i (i.e. the column vector that has a 1 in the i-th position and 0s elsewhere), compute B e_i to get the i-th column of the matrix B. Then apply the matrix A to the result to obtain A (B e_i), which equals (A B) e_i. This is then the i-th column of the matrix A B. And when applying the matrix A to some column vector v, each entry/row of the resulting vector is obtained by combining the corresponding row of A with the column vector v.
So, to get the entry at the j-th row of the i-th column of (A B), one therefore combines the i-th column of B with the j-th row of A. Or, alternatively/equivalently, you can just compute the matrix (A B) column by column, by, for each e_i , computing that the i-th column of (A B) is (A (B e_i)) (which is how I usually think of it).
To be clear, I don't have the process totally memorized; I actually use the above reasoning to remind myself of the computation process a fair portion of the time that I need to compute actual products of matrices, which is surprisingly often given that I don't have it totally memorized.
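Here's a rough numpy sketch of that column-by-column view (the matrices are made up, purely for illustration):

    import numpy as np

    A = np.array([[1., 2.], [3., 4.]])
    B = np.array([[5., 6.], [7., 8.]])

    # Build A @ B one column at a time: the i-th column is A @ (B @ e_i).
    columns = []
    for i in range(B.shape[1]):
        e_i = np.zeros(B.shape[1])
        e_i[i] = 1.0                    # the i-th standard basis (1-hot) vector
        columns.append(A @ (B @ e_i))   # (A B) e_i = A (B e_i)

    product = np.column_stack(columns)
    assert np.allclose(product, A @ B)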
When I took linear algebra, the professor emphasized the linear maps, and somewhat de-emphasized the matrices that are used to notate them. I think this made understanding what is going on easier, but made the computations less familiar. I very much enjoyed the class.
Here's a recipe for matrix multiplication that you can't forget: choose bases b_i/c_j for your domain/codomain. Then all a matrix is, is a listing of the outputs of a function for your basis: if you have a linear function f, then the i-th column of its matrix A is just f(b_i). If you have another function g from f's codomain, then it's the same thing: its matrix B is just the list of outputs g(c_j). Then the i-th column of BA is just g(f(b_i)). If you write these things down on paper and expand out what I wrote, you'll see the usual row and column thing pop out. The point is that f(b_i) is a weighted sum of the c_j (since the c_j form a basis for the target of f), but you can pull the weighted sums through the definition of g because it's linear. A basis gives you a minimal description/set of points where you need to define a function, and the definition for all other points follows from linearity.
The point of the eigen-stuff is that along some directions, linear functions are just scalar multiplication: f(v) = av. If the action in a direction is multiplication by a, then it can't also be multiplication by b. So unequal eigenvalues must mean different directions/linearly independent subspaces. So e.g. if you can find n different eigenvalues/eigenvectors, you've found a simple basis where each direction is just multiplication. You also know that it's invertible if the eigenvalues are nonzero since all you did was multiply by a_i along each direction, so you can invert it by multiplying by 1/a_i on each direction.
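A rough numpy sketch of that last point, using a made-up symmetric matrix (so the eigenvectors form an orthonormal basis and the reconstruction below is valid):

    import numpy as np

    M = np.array([[2., 1.], [1., 3.]])      # made-up symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(M)    # columns of eigvecs are eigenvectors

    # Along each eigenvector, M is just multiplication by the eigenvalue...
    for a, v in zip(eigvals, eigvecs.T):
        assert np.allclose(M @ v, a * v)

    # ...so the inverse multiplies by 1/a along each eigenvector.
    M_inv = eigvecs @ np.diag(1.0 / eigvals) @ eigvecs.T
    assert np.allclose(M_inv, np.linalg.inv(M))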
Taught properly it's all very straightforward, though determinants require some more buildup with a detour through things like quotienting and wedge products if you really want it to be straightforward IMO. You start by saying you want to look at oriented areas/volumes, and look at the properties you need. Then quotienting gives you a standard tool to say "I want exactly the thing that has those properties" (wedge products). Then the action on wedges gives you what your map does to volumes, with the determinant as the action on the full space. You basically define it to be what you want, and then you can calculate it by linearity/functoriality just like you expand out the definition of a linear map from a basis.
IDK why but the replies to your comment crack me up because they ended up confusing me rather than helping. It's the same for me. Impenetrable.
I'm an applied math PhD who thinks linear algebra is the best thing ever, and it's the nuts and bolts of modern AI, so for fun and profit I'll attempt a quick cheat sheet.
To manage expectations, this won't be very satisfying by itself. You have to do a lot of exercises for this stuff to become second nature. But hopefully it at least imparts a sense that the topic is conceptually meaningful and not just a profusion of interacting symbols. For brevity, we'll pretend real numbers are the only numbers that exist; assume basic knowledge of vectors; and, I won't say anything about eigenvalues.
1. The most important thing to know about matrices is that they are linear maps. Specifically, an m x n matrix is a map from n-dimensional space (R^n) to m-dimensional space (R^m). That means that you can use the matrix as a function, one which takes as input a vector with n entries and outputs a vector with m entries.
2. The columns of a matrix are vectors. They tell you what outputs are generated when you take the standard basis vectors and feed them as inputs to the associated linear map. The standard basis vectors of R^n are the n vectors of length 1 that point along the n coordinate axes of the space (the x-axis, y-axis, z-axis, and beyond for higher-dimensional spaces). Conversely, a vector with n entries is also an n x 1 column matrix.
3. Every vector can be expressed uniquely as a linear combination (weighted sum) of standard basis vectors, and linear maps work nicely with linear combinations. Specifically, F(ax + by) = aF(x) + bF(y) for any real-valued "weights" a,b and vectors x,y. From this, you can show that a linear map is uniquely determined by what it maps the standard basis vectors to. This + #2 explains why linear maps and matrices are equivalent concepts.
4a. The way you apply the linear map to an arbitrary vector is by matrix-vector multiplication. If you write out (for example) a 3 x 2 matrix and a 2 x 1 vector, you will see that there is only one reasonable way to do this: each 1 x 2 row of the matrix must combine with the 2 x 1 input vector to produce an entry of the 3 x 1 output vector. The combination operation is, you flip the row from horizontal to vertical so it's a vector, then you dot-product it with the input vector.
4b. Notice how when you multiply 3x2 matrix with 2x1 vector, you get a 3x1 vector. In the "size math" of matrix multiplication, (3x2) x (2x1) = (3x1); the inner 2's go away, leaving only the outer numbers. This "contraction" of the inner dimensions, which happens via the dot product of matching vectors, is a general feature of matrix multiplication. Contraction is also the defining feature of how we multiply tensors, the 3D and higher-dimensional analogues of matrices.
5. Matrix-matrix multiplication is just a bunch of matrix-vector multiplications put side-by-side into a single matrix. That is to say, if you multiply two matrices A and B, the columns of the resulting matrix C are just the individual matrix-vector multiplications of A with the columns of B. (There's a short numpy sketch of points 2, 3, and 5 after this list.)
6. Many basic geometric operations, such as rotation, shearing, and scaling, are linear operations, so long as you use a version of them that keeps the origin fixed (maps the zero vector to zero vector). This is why they can be represented by matrices and implemented in computers with matrix multiplication.
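Here's the short numpy sketch mentioned in point 5, tying points 2, 3, and 5 together (the matrices are made up, purely for illustration):

    import numpy as np

    # A 3x2 matrix: a linear map from R^2 to R^3 (point 1).
    A = np.array([[1., 0.],
                  [2., 1.],
                  [0., 3.]])

    # Point 2: the columns are the images of the standard basis vectors.
    e0, e1 = np.array([1., 0.]), np.array([0., 1.])
    assert np.allclose(A @ e0, A[:, 0])
    assert np.allclose(A @ e1, A[:, 1])

    # Point 3: linearity, F(ax + by) = aF(x) + bF(y).
    x, y, a, b = np.array([1., 2.]), np.array([3., -1.]), 2.0, -0.5
    assert np.allclose(A @ (a * x + b * y), a * (A @ x) + b * (A @ y))

    # Point 5: the columns of A @ B are A applied to the columns of B.
    B = np.array([[1., 4.],
                  [2., 5.]])
    C = A @ B                               # (3x2) @ (2x2) -> (3x2)
    for i in range(B.shape[1]):
        assert np.allclose(C[:, i], A @ B[:, i])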
I think this is pretty instructor-dependent. I had two LinAlg courses, and in the first, I felt like I was building a great intuition. In the second, the instructor seemed to make even the stuff I previously learned seem obtuse and like "facts to memorize."
Maybe linear algebra is more instructor-dependent, since we have fewer preexisting concepts to build on?
A lot of people who find themselves having to deal with matrices when programming have never taken that class or learned those things (or did so such a long time ago that they've completely forgotten). I assume this is aimed at such people, and he's just reassuring them that he's not going to talk about the abstract aspects of linear algebra, which certainly exist.
I'd take issue with his "most programmers are visual thinkers", though. Maybe most graphics programmers are, but I doubt it's an overwhelming majority even there.
> most programmers are visual thinkers
I remember reading that there's a link between aphantasia (inability to visualize) and being on the spectrum.
Being an armchair psychologist expert with decades of experience, I can say with absolute certainty that a lot of programmers are NOT visual thinkers.
Do you have anything I can read about that? I'm definitely on the spectrum and have whatever the opposite of aphantasia is, I can see things very clearly in my head
"In Experiment 2 we have shown that people with aphantasia report higher AQ scores (more traits associated with autism than controls), and fall more often within the range suggestive of autism (≥32)."
https://www.sciencedirect.com/science/article/abs/pii/S10538...
This is interesting because, to me, programming is a deeply visual activity. It feels like wandering around in a world of forms until I find the structures I need, and actually writing out the code is mostly a formality.
I would describe my experience of it similarly, but wouldn't call it "visual thinking" in the sense meant in the article, where one uses actual imagery and visual-spatial reasoning. Indeed, I almost completely lack the ability to conjure mental imagery (aphantasia) and I've speculated it might be because a part of my visual cortex is given over to the pseudo-visual activity that seems to take place when I program.
I'm especially sure my sort of pseudo-visual thinking isn't what the article means by "visual thinking" because I also use it when working through "piles of abstract math", which I take to very kindly indeed.
Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn? Very intriguing if the latter, and I'd be curious to know what they look like.
> Is your "wandering" of this sort of pseudo-visual nature, or do you see actual visual images that could be drawn?
They're like if the abstract machines you talk about in CS theory classes were physical objects.
For example, thinking about a data processing pipeline, I might see the different components performing transformations on messages flowing through it. I can focus on one component and think about how it takes apart the message to extract the structure it's trying to manipulate, interacts with its local state, etc. If something is active and stateful it feels different than if it's just a plain piece of data. I run the machine through its motions to understand where the complexity is and where things could break, comparing different designs against each other.
If I'm thinking about a data format, I think about the relationships between containers, headers, offsets between structures, etc., like pieces that I can move around to see how their relationships change and understand how it would work in practice.
It's more than an image that can be drawn because the pieces are in motion as they operate. It's the same kind of "material" that mathematical objects are made out of when I'm thinking about abstract math. It's an immensely useful skill for doing my job, in designing systems.
I actually struggle a lot with translating the systems in my head into prose. To me, certain design decisions are completely obvious and wouldn't need to be stated, so when we all understand the product goals I often neglect to explain why a certain thing works the way it does, because to me it's completely obvious how it's useful towards achieving the product goals. So that's something I have to actively put more effort into.
I also really struggled when I took a real linear algebra class, since it was taught in a very blackboardy "tabular" style which was harder for me to visualize. I was unfamiliar with it due to being used to thinking about matrices in the context of computer graphics and game engines.
Math achievement correlates strongly with visuospatial reasoning. Programmers may not be as proficient in math as economists, but they are better at it than biologists or lawyers.
I would distinguish between visual imagination and visuospatial reasoning.
For people like myself with aphantasia, there are often problem-solving strategies that can help you when you can't visualize. Like draw a picture.
And lots of problems don’t really require as much visual imagination as you would think. I’m pretty good at math, programming, and economics. Not top tier, but pretty good.
If there are problems out there that you struggle with compared to others, then that’s the universe telling you that you don’t have a comparative advantage in it. Do something else and hire the people who can more easily solve them if you need it.
And since the economist's main skill at math is fitting a very short ruler to a very large curve... I wouldn't put them ahead of lawyers...
> Anyone who has taken linear algebra should know that [...]
My university level linear algebra class didn't touch practical applications at all, which was frustrating to me because I knew from some background doing hobbyist game dev that it could be very useful. I still wish I had a better understanding of the use cases for things like eigenvectors/values.
Here are some applications of eigenvectors and eigenvalues:
1) If you have a set of states and a stochastic transition function, which gives for each starting state the probability distribution over what the state will be at the next time step, you can describe this as a matrix. The long-term behavior of applying this can be described using the eigenvectors and eigenvalues of this matrix. Any stable distribution will be an eigenvector with eigenvalue 1. If there is periodicity to the behavior, where for some initial distributions the distribution changes over time in a periodic way, endlessly cycling through a finite set of distributions, then the matrix will have eigenvalues that are roots of unity (i.e. a complex number s such that s^n = 1 for some positive integer n). Eigenvalues with absolute value less than 1 correspond to transient contributions to the distribution which will decay (the closer to 0, the quicker the decay). When there are finitely many states, there will always be at least one eigenvector with eigenvalue 1. (There's a short numpy sketch of this after the list.)
2) Related to (1), there is the PageRank algorithm, where one takes a graph where each node has links to other nodes, and one models a random walk on these nodes, and one uses the eigenvector one (approximately) finds in order to find the relative importance of the different nodes.
3) Rotations generally have eigenvalues that are complex numbers with length 1. As mentioned in (1), eigenvalues that are complex numbers with length 1 are associated with periodic/oscillating behavior. Well, I guess it sorta depends how you are using the matrix. If you have a matrix M with all of its eigenvalues purely imaginary, then exp(t M) (with t representing time) will describe oscillations with rates given by those eigenvalues. exp(t M) itself will have eigenvalues that are complex numbers of length 1. This is very relevant in solutions to higher order differential equations or differential equations where the quantity changing over time is a vector quantity.
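Here's the short numpy sketch of (1) mentioned above, using a made-up 3-state transition matrix (column j is the distribution over next states when starting from state j, so each column sums to 1):

    import numpy as np

    P = np.array([[0.90, 0.20, 0.10],
                  [0.05, 0.70, 0.30],
                  [0.05, 0.10, 0.60]])

    eigvals, eigvecs = np.linalg.eig(P)
    # Pick the eigenvector whose eigenvalue is (numerically) 1...
    k = np.argmin(np.abs(eigvals - 1.0))
    stationary = np.real(eigvecs[:, k])
    stationary /= stationary.sum()          # ...and normalise it into a distribution.

    # Applying the transition matrix leaves the stationary distribution unchanged.
    assert np.allclose(P @ stationary, stationary)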
____
But, for purposes of gamedev, I think the eigenvalues/eigenvectors are probably the less relevant things. Probably instead, at least for rendering and such, you want stuff like, "you can use homogeneous coordinates in order to incorporate translations and rotations into a single 4x4 matrix (and also for other things relating the 3d scene to the 2d screen)", and stuff about like... well, quaternions can be helpful.
Of course, it all depends what you are trying to do...
I have taken several linear algebra courses, one from my high school and two from universities. The thing is, not all courses of linear algebra will discuss rotations the way you discuss them. One reason is that sometimes a high school linear algebra course cannot assume students have learned trigonometry. I've seen teachers teach it just to solve larger linear systems of equations. Another reason is that sometimes a course will focus just on properties of vector spaces without relating them to geometry; after all, who can visualize things when the course routinely deals with 10-dimensional vectors or N-dimensional ones where N isn't a constant.
I think teaching beginner linear algebra using matrices representing systems of equations is a pedagogical mistake. It gives the wrong impression that matrices are linear algebra and makes it difficult for students to think about it in an abstract way. A better way is to start by discussing abstract linear combinations and then illustrating what can be done with this using visualizations in various coordinate systems. Once the student understands this intuitively, systems of equations and matrices can be brought up as equivalent ways to represent linear transformations on paper. It’s important to emphasize that matrices are convenient but not the only way to write the language of linear algebra.
Usually you just draw a 2D or 3D picture and say "n" while pointing to it. e.g. I had a professor that drew a 2D picture on a board where he labeled one axis R^m and the other R^n and then drew a "graph" when discussing something like the implicit function theorem. One takeaway of a lot of linear algebra is that doing this is more-or-less correct (and then functional analysis tells you this is still kind-of correct-ish even in infinite dimensions). Actually SVD tells you in some sense that if you look at things in the right way, the "heart" of a linear map is just 1D multiplication acting independently along different axes, so you don't need to consider all n dimensions at once.
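A rough numpy sketch of that SVD point (the matrix is made up, purely for illustration):

    import numpy as np

    A = np.array([[3., 1.],
                  [1., 2.]])              # made-up 2x2 linear map
    U, s, Vt = np.linalg.svd(A)

    # A factors as: rotate/reflect (Vt), scale each axis independently (s),
    # then rotate/reflect again (U). The "heart" of the map is the 1D scaling.
    assert np.allclose(A, U @ np.diag(s) @ Vt)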
> This guy makes it sound like he had to come up with these concepts from scratch, and it's some sort of pure visual genius rather than math. But... it's just math.
The problem with this kind of thinking is that it encourages the exact kind of teaching you disparage. It's very easy to get so lost in the sauce of arbitrary notational choices that the underlying concepts end up completely obscured. Before you know it, matrices are opaque self-justified atoms in an ecosystem rather than an arbitrary tabular shorthand. Mathematics is not a singular immutable dogma that dropped out of the sky as natural law, and rediscovery is the most powerful tool for understanding.
When I was studying and made the mistake of choosing 3D computer graphics as a course, I remember some 4x4 matrix that was used for rotation, with all kinds of weird terms in it, derived only once, in a way I was not able to understand and that didn't relate to any visual idea, which made it extra hard for me, because I rely a lot on visualizing everything. So basically, there was a "magical formula" to rotate things and I didn't memorize it. The exam came and demanded having memorized this shitty rotation matrix. Failed the exam, changed courses. High quality lecturing.
Later, in another lecture at another university, I had to rotate points around a center point again. This time I found three 3x3 matrices on Wikipedia, one for each axis. They seemed to make at least a little more sense, but I don't think I ever got to the basics of that stuff. I've never seen a good visual explanation of it. I ended up implementing the three matrix multiplications and checking the 3D coordinates coming out of them in my head, by visualizing and thinking hard about whether the coordinates could be correct.
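For reference, here's a rough numpy sketch of two of those per-axis matrices and how composing them works (the angles and the point are made up, purely for illustration):

    import numpy as np

    def rot_x(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[1., 0., 0.],
                         [0.,  c, -s],
                         [0.,  s,  c]])

    def rot_z(t):
        c, s = np.cos(t), np.sin(t)
        return np.array([[ c, -s, 0.],
                         [ s,  c, 0.],
                         [0., 0., 1.]])

    # Rotate a point 90 degrees about z, then 90 degrees about x.
    p = np.array([1., 0., 0.])
    print(rot_x(np.pi / 2) @ rot_z(np.pi / 2) @ p)   # ~[0, 0, 1]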
I think visualization is the least of my problems. Most math teaching sucks though, and sometimes it is just the wrong format or not visualized at all, which makes it very hard to understand.
You can do rotation with a 3x3 matrix.
The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
Since you can combine a series of matrix multiplications by just pre-multiplying the matrix, this sets you up for doing a very efficient "move, scale, rotate" of an object using a single matrix multiplication of that pre-calculated 4x4 matrix.
If you just want to, e.g., scale and rotate the object, a 3x3 matrix suffices. Sounds like your first lecture jumped way too fast to the "here's the fully general version of this", which is much harder to build intuition for.
Sorry you had a bad intro to this stuff. It's actually kinda cool when explained well. I think they probably should have started by showing how you can use a matrix for scaling:

    [ 2    0    0 ]
    [ 0   1.5   0 ]
    [ 0    0    1 ]

for example, will grow an object by 2x in the x dimension, 1.5x in the y dimension, and keep it unchanged in the z dimension. (You'll note that it follows the pattern of the identity matrix.) The rotation matrix is probably best first derived in 2D; the wikipedia article has a decentish explanation: https://en.wikipedia.org/wiki/Rotation_matrix
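Here's a rough numpy sketch of that 2D rotation matrix, just to show that its columns are where the x and y axes end up (the 90-degree angle is made up for illustration):

    import numpy as np

    theta = np.deg2rad(90.0)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    # Rotating the x axis by 90 degrees gives the y axis, and the y axis
    # ends up pointing along -x; those rotated axes are exactly the columns.
    assert np.allclose(R @ np.array([1., 0.]), np.array([0., 1.]))
    assert np.allclose(R @ np.array([0., 1.]), np.array([-1., 0.]))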
The first time I learned it was from a book by LaMothe in the 90s and it starts with your demonstration of 3D matrix transforms, then goes "ha! gimbal lock" then shows 4D transforms and the extension to projection transforms, and from there you just have an abstraction of your world coordinate transform and your camera transform(s) and most everything else becomes vectors. I think it's probably the best way to teach it, with some 2D work leading into it as you suggest. It also sets up well for how most modern game dev platforms deal with coordinates.
Think OpenGL used all those 2,3,4D critters at API level. It must be very hardware friendly to reduce your pipeline to matrix products. Also your scene graph (tree) is just this: you attach relative rotations and translations to graph nodes. You push your mesh (stream of triangles) at tree nodes, and the composition of relative transforms up to the root is a matrix product (or was it the inverse?) that transforms the meshes that go into the pipeline. For instance character skeletons are scene subgraphs, bones have translations, articulations have rotations. That's why it is so convenient to have rotations and translations in a common representation, and a linear one (4D matrix) is super. All this excluding materials, textures, and so on, I mean.
Tricks of the * Game Programming Gurus :)
> The first lecture was using a 4x4 matrix because you can use it for a more general set of transformations, including affine transforms (think: translating an object by moving it in a particular direction).
I think this is mixing up concepts that are orthogonal to linear spaces, linear transformations, and even specific operations such as rotations.
The way you mention "more general set of transformations" suggests you're actually referring to homogeneous coordinates, which is a trick that allows a subset of matrix-vector multiplication and vector addition in 3D spaces to be expressed as a single matrix-vector multiplication in 4D space.
This is fine and dandy if your goal is to take a headstart to 3D programming, where APIs are already designed around this. This is however a constrained level of abstraction above actual linear algebra, which may be and often is more confusing.
> You can do rotation with a 3x3 matrix.
You can do a rotation or some rotations but SO(3) is not simply connected.
It mostly works for rigid bodies centered on the origin, but gimbal lock or Dirac's Plate Trick are good counterexample lenses. Twirling a baton or a lasso will show that 720 degrees is the invariant rotation in SO(3).
The point at infinity with a 4x4 matrix is one solution; SU(2), quaternions, or, more recently, the geometric product are other options with benefits at the cost of complexity.
I think you are confused about what 'simply connected' means. A 3x3 matrix can represent any rotation. Also from a given rotation there is a path through the space of rotations to any other rotation. It's just that some paths can't be smoothly mapped to some other paths.
SO(3) contains all of the orthogonal 3x3 matrices of determinant 1.
If you are dealing with rigid bodies rotated about the origin, like with the product of linear translations, you can avoid the problem. At least with an orthonormal basis of R^3 and a real-valued orthogonal 3x3 matrix (one whose product with its transpose produces the identity matrix) with determinant 1.
But as soon as you are dealing with balls, where the magnitude can range from the origin out to the radius, you run into the issue that the antipodes are actually the same point (consider the north and south poles being the same point). That is what I am saying when I say the topology is not simply connected.
The rigid body rotation about the origin is just a special case.
Twist a belt twice and tape one end to the table, and you can untwist it with just horizontal translation; twist it once (360 degrees) and you cannot.
In computer graphics, 4x4 matrices let you do a rotation and a translation together (among other things). There's the 3x3 rotation block you found later as well as a translation vector embedded in it. Multiplying a sequence of 4x4 matrices together accumulates the rotations and translations appropriately as if they were just a bunch of function applications. i.e. rotate(translate(point)) is just rotation_matrix * translation_matrix * point_vector if you construct your matrices properly. Multiplying a 4x4 matrix with another 4x4 matrix yields a 4x4 matrix result, which means that you can store an arbitrary chain of rotations and translations accumulated together into a single matrix...
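A rough numpy sketch of that accumulation, with made-up translation and rotation helpers (this assumes column vectors and homogeneous coordinates with w = 1; conventions differ between libraries):

    import numpy as np

    def translation(tx, ty, tz):
        T = np.eye(4)
        T[:3, 3] = [tx, ty, tz]
        return T

    def rotation_z(theta):
        R = np.eye(4)
        c, s = np.cos(theta), np.sin(theta)
        R[:2, :2] = [[c, -s], [s, c]]
        return R

    point = np.array([1., 0., 0., 1.])            # homogeneous coordinates, w = 1

    # rotate(translate(point)): the rightmost matrix is applied first.
    M = rotation_z(np.pi / 2) @ translation(2., 0., 0.)
    assert np.allclose(M @ point, [0., 3., 0., 1.])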
Yeah you need to build up the understanding so that you can re-derive those matrices as needed (it's mostly just basic trigonometry). If you can't, that means a failure of your lecturer or a failure in your studying.
The mathematical term for the four by four matrices you were looking at is "quaternion" (I.e. you were looking at a set of four by four matrices isomorphic to the unit quaternions).
Why use quaternions at all, when three by three matrices can also represent rotations? Three by three matrices contain lots of redundant information beyond rotation, and multiplying quaternions requires fewer scalar additions and multiplications than multiplying three by three matrices. So it is cheaper to compose rotations. It also avoids singularities (gimbal lock).
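For illustration, here's a rough sketch of composing rotations with quaternions in plain Python, assuming the (w, x, y, z) convention (the example angles are made up):

    import math

    def quat_mul(q, r):
        # Hamilton product; quaternions are (w, x, y, z) tuples.
        w1, x1, y1, z1 = q
        w2, x2, y2, z2 = r
        return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
                w1*x2 + x1*w2 + y1*z2 - z1*y2,
                w1*y2 - x1*z2 + y1*w2 + z1*x2,
                w1*z2 + x1*y2 - y1*x2 + z1*w2)

    # A rotation by theta about a unit axis u is (cos(theta/2), sin(theta/2) * u),
    # so composing two 90-degree rotations about z gives a 180-degree rotation.
    h = math.sqrt(0.5)
    rot90_z = (h, 0.0, 0.0, h)
    print(quat_mul(rot90_z, rot90_z))   # ~(0, 0, 0, 1), i.e. 180 degrees about z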
This was part of Steve Baker's ("Omniverous Hexapod", sic) extensions to a long-standing Usenet FAQ about graphics programming, put out by "Carniverous Hexapod" (sic). It's at least two decades old, and the FAQ he built it on may be from the 1990s? I have the niggling recollection that the Carniverous name may have been based on Vernor Vinge's _Fire upon the deep_ aliens.
He did not invent it, but he probably had to deal with aspiring graphics programmers who were not very math-savvy.
Honestly, many math teachers are kinda bad at conveying all that.
When everything clicked a few years down the line it all became so simple.
Like you mention "linear operation": the word linear doesn't always make intuitive sense in terms of rotations or scaling if you have only encountered simple 1- or 2-dimensional linear transformations when doing more basic graphics programming.
As a teacher, I think the biggest lesson I had to learn was to always have at least 3 different ways of explaining everything to give different kinds of people different entrypoints into understanding concepts.
For someone uninitiated, a term like "basis vector" can be pure gibberish if it doesn't follow an example of a transform as a viewport change, and it needs to be repeated alongside your other explanations (for example, how vector components in the source view are just scalars on the basis vectors when multiplied with a matrix, rather than some heavy, unintuitive concept).
Math is just a standardized way to communicate those concepts though, it's a model of the world like any other. I get what you mean, but these intuitive or visualising approaches help many people with different thinking processes.
Just imagine that everyone has equal math ability, except that the standard representations of mathematical concepts and notation are made more for a certain type of brain than others. These kinds of explanations allow bringing those people in as well.