Why do people still believe that the laws of physics should be elegant?
Often I read articles about how physicists think that there should be some awesome, really elegant principles that govern our world. But is there a reason why people believe that? I mean, I understand that's certainly desirable, but can't the cosmos be governed by chaotic, complicated laws, instead of elegant ones?
If it's too much of a leap of faith to assume that the laws must take an elegant form, maybe you should consider the following statement instead: The universe is highly symmetrical. Observation has consistently shown this, and the more we probe, the more symmetry we discover. That symmetry mathematically constrains the forms that the laws of physics may take.
Here is an example of how this works. Let's assume (in classical physics) that the gravitational force (a vector) of body 1 upon body 2 is a function of their two masses, six spatial coordinates, and the current time coordinate, that is, \(f : \mathbb{R}^9 \to \mathbb{R}^3\).
There are a whole lot of different functions from nine reals to three reals. You can make one as complicated as you want. But Newton's law of gravitation is, in contrast, extremely simple: \(\mathbf{F} = -G m_1 m_2 \frac{\hat{\mathbf{r}}}{r^2}\). (Note that the \(\hat{\mathbf{r}}\) unit vector and the squared distance \(r^2\) are functions of the six coordinates.) Why is this?
We observe the following:
- The correct equation can't contain time at all, since the universe appears to be invariant under time-translation.
- The force has to be proportional to \(m_1\). You can see this because if you combine two copies of body 1 then their effect should be double the effect of one copy, and there really is no difference between two bodies of mass \(m_1\) and one body of mass \(2m_1\), because all matter is composed of small particles anyway. A similar argument shows that it also has to be proportional to \(m_2\).
- By this point we already know the force law has to take the form \(\mathbf{F} = m_1 m_2 \mathbf{f}(x_1, y_1, z_1, x_2, y_2, z_2)\). Now we invoke translational invariance: the force has to stay the same if all we do is change our frame of reference by moving without changing the direction we're looking in. That gives us \(\mathbf{f}(\mathbf{x}_1, \mathbf{x}_2) = \mathbf{f}(0, \mathbf{x}_2 - \mathbf{x}_1)\), where \(\mathbf{x}_1\) is shorthand for \((x_1, y_1, z_1)\), and so on. So the correct form for \(\mathbf{f}\) must be a function of the displacement alone.
- Now we already know the force law is of the form \(\mathbf{F} = m_1 m_2 \mathbf{g}(d_x, d_y, d_z)\), where \(d_x = x_2 - x_1\) and so on. At this point we apply rotational invariance. The magnitude of the force can't change if we just rotate our frame of reference, which means we can rotate it into a position where the displacement from body 1 to body 2 lies along the positive x-axis. We thus see that \(g\), the magnitude of \(\mathbf{g}\), must be a function of distance alone. (This is clearer if you transform into spherical coordinates; rotation allows you to obtain \(d_\theta = d_\phi = 0\) while leaving \(r\) or \(\rho\) unchanged.)
- Having obtained that the magnitude of the force is \(F = m_1 m_2 h(r)\) for some function \(h : \mathbb{R} \to \mathbb{R}\), we can ask about the direction. When our frame of reference is rotated in such a way that the displacement lies along the x-axis, we can again apply rotational invariance to conclude that the force must, in fact, be along the x-axis too! If it were in any other direction, then rotating the system around the x-axis would leave the system itself unchanged while changing the direction of the force by rotating the force vector around the x-axis, which breaks rotational invariance. So we conclude that the force has to always be parallel to the displacement vector.
- Having gotten this far, we have \(\mathbf{F} = m_1 m_2 \hat{\mathbf{r}} h(r)\). Unfortunately, you cannot deduce from this simplified analysis that \(h(r) = kr^{-2}\). I can't give a convincing argument that it shouldn't be, say, \(k/(r^2 + r)\) or \(k\exp(1/r)\). Even so, look how far we've gotten! We've managed to take some unknown function from nine reals to three reals and reduce it to an unknown function from one real to one real, times some known part. (Furthermore, that function is probably meromorphic.) So even if we don't know that \(h(r) = -G r^{-2}\), we do have a certain guarantee of simplicity that is far, far better than anything we could've gotten without taking into consideration the symmetries of the universe.
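
If you'd like to poke at this numerically, here is a minimal sketch in Python with NumPy. It is only an illustration, not a proof: it spot-checks each symmetry at a single test configuration. The helper names (`central_force`, `rotation_matrix`, `check_symmetries`), the test positions, and the choice of units with \(G = 1\) are all my own assumptions for the example. It shows that any law of the form \(m_1 m_2 \hat{\mathbf{r}} h(r)\) passes the checks whatever \(h\) is, while a law that depends on absolute position does not.

```python
import numpy as np

def central_force(h):
    """Build F = m1 * m2 * r_hat * h(r), the form the symmetry argument arrives at."""
    def force(m1, m2, x1, x2):
        # Displacement from body 1 to body 2 and its length.
        d = np.asarray(x2, dtype=float) - np.asarray(x1, dtype=float)
        r = np.linalg.norm(d)
        return m1 * m2 * (d / r) * h(r)
    return force

def rotation_matrix(axis, angle):
    """Rotation by `angle` radians about `axis` (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def check_symmetries(force):
    """Spot-check the symmetries from the list above at one arbitrary configuration."""
    m1, m2 = 3.0, 5.0
    x1 = np.array([1.0, 2.0, 3.0])
    x2 = np.array([4.0, 6.0, 3.0])
    shift = np.array([-7.0, 0.5, 2.0])          # arbitrary translation
    R = rotation_matrix([1.0, 1.0, 0.0], 0.7)   # arbitrary rotation
    F = force(m1, m2, x1, x2)
    return {
        # Moving both bodies by the same amount must not change the force.
        "translation": bool(np.allclose(force(m1, m2, x1 + shift, x2 + shift), F)),
        # Rotating both bodies must rotate the force vector along with them.
        "rotation": bool(np.allclose(force(m1, m2, R @ x1, R @ x2), R @ F)),
        # Doubling m1 must double the force.
        "linear_in_m1": bool(np.allclose(force(2 * m1, m2, x1, x2), 2 * F)),
        # The force must be parallel to the displacement (zero cross product).
        "parallel_to_displacement": bool(np.allclose(np.cross(F, x2 - x1), 0.0)),
    }

# Newton's law in units where G = 1 (an assumption made for readability).
newton = central_force(lambda r: -1.0 / r**2)
# A different radial profile; the symmetries alone cannot rule it out.
exotic = central_force(lambda r: -np.exp(1.0 / r))
# A made-up law that depends on the absolute position of body 2.
position_dependent = lambda m1, m2, x1, x2: m1 * m2 * np.asarray(x2, dtype=float)

newton_checks = check_symmetries(newton)
exotic_checks = check_symmetries(exotic)
bad_checks = check_symmetries(position_dependent)
print(newton_checks)
print(exotic_checks)
print(bad_checks)
```

Both `newton` and `exotic` pass every check, which matches the last bullet: symmetry pins down everything except \(h(r)\). The position-dependent law fails the translation check, so it is excluded before we ever look at experimental data.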