Brian Bi

What is Noether's theorem?

(Explain with mathematical notation, but with accompanying explanation in English)
OK. There are already a lot of questions about Noether's (first) theorem, so first make sure you're not looking for the answer to one of them.
Second, I should point out that Noether's paper itself, "Invariant Variation Problems", qualifies as an answer to this question: an exposition of Noether's theorem in mathematical notation with accompanying explanation in English. You can find the paper on arXiv.

Still want to read my answer? Okay. I'll assume you can handle the math, which will be physics-flavoured, i.e., extremely sloppy. (I apologize in advance for this.) [8] Let's go over some background.

Configurations of physical systems

The history of a physical system is represented by a configuration, which is a function from a background manifold [4], \(U\), to \(\mathbb{R}^n\). [1][2] For example, for N classical particles, \(U\) may be \([a, b]\), representing time, and a configuration maps time to the 3N coordinates of the particles at a given time. For the electromagnetic field, we may take \(U\) to be \(K \subseteq \mathbb{R}^4\), a compact subset of physical space-time, where a configuration maps a space-time point to the scalar potential and three components of the vector potential at the given point. The set of all such functions between the given two manifolds is the configuration space, \(C\), in which the kind of system under study lives. [3]

On-shell and off-shell configurations

Some configurations are unphysical: they cannot occur because they do not obey the laws of physics. Configurations that obey the laws of physics are said to be on-shell; those that do not are said to be off-shell. The identification of on-shell and off-shell configurations is achieved through an action functional, \(\mathcal S : C \to \mathbb{R}\). A configuration \(c_0 \in C\) is on-shell if and only if the action is stationary with respect to variations that preserve the boundary conditions, that is, the functional derivative vanishes identically on the interior of \(U\): \[\frac{\delta \mathcal S}{\delta c}\bigg|_{c=c_0} = 0\] Almost all actions of physical interest are local, that is, they are given by the integral over the background of a local Lagrangian density function. "Local" means the Lagrangian density is defined on a jet bundle (or so I'm told), i.e., at a given point it depends on the value of a function and its derivatives up to a finite order, typically order 1. We'll just consider the first-order case here; derivatives of order 2 and higher are left as an exercise for the reader. \[\mathcal S[c] = \int_U \mathcal{L}(x, c(x), \nabla c(x)) \, \mathrm{d}x\] For a local action, requiring that the action be stationary yields the Euler–Lagrange equation, which is satisfied by on-shell configurations everywhere on the interior of \(U\):\[\frac{\partial\mathcal L}{\partial c_i} = \operatorname{div} \frac{\partial \mathcal L}{\partial \nabla c_i}\] where the equality holds for all components of the configuration, \(i = 1, \ldots, n\). [11]
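To make "stationary with respect to variations that preserve the boundary conditions" concrete, here is a minimal numerical sketch. Everything specific in it is my own assumption for illustration: a one-dimensional harmonic oscillator with \(m = k = 1\), the interval \([0, 1]\), a midpoint-rule quadrature, and the bump \(\eta(t) = \sin \pi t\) as the variation. It compares the first-order change of the action around an on-shell path with that around an off-shell path having the same endpoints.

```python
import math

def action(path, velocity, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of S = integral of L dt for the
    harmonic-oscillator Lagrangian L = x'^2/2 - x^2/2 (m = k = 1, assumed)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += (0.5 * velocity(t) ** 2 - 0.5 * path(t) ** 2) * h
    return total

def first_order_change(path, velocity, eps=1e-4):
    """(S[c + eps*eta] - S[c]) / eps for a bump eta vanishing at t = 0 and t = 1,
    so the boundary conditions are preserved."""
    eta = lambda t: math.sin(math.pi * t)
    deta = lambda t: math.pi * math.cos(math.pi * t)
    varied = lambda t: path(t) + eps * eta(t)
    dvaried = lambda t: velocity(t) + eps * deta(t)
    return (action(varied, dvaried) - action(path, velocity)) / eps

# On-shell: x(t) = sin t solves the Euler-Lagrange equation x'' = -x.
on_shell = first_order_change(math.sin, math.cos)

# Off-shell: a straight line with the same endpoints x(0) = 0, x(1) = sin 1.
off_shell = first_order_change(lambda t: t * math.sin(1.0),
                               lambda t: math.sin(1.0))

print(on_shell, off_shell)
```

The on-shell number shrinks in proportion to `eps` (only the second-order term survives), while the off-shell number stays of order one however small `eps` is made.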

Symmetries of the action

We define a transformation to be an invertible (one-to-one and onto) mapping from the configuration space to itself. We will focus our attention on groups of transformations (with function composition as the group operation) acting on the configuration space [6].

Let \(G\) be such a group of transformations. Suppose that for all configurations \(c_0 \in C\) and all transformations \(g \in G\), \(\mathcal S[c_0] = \mathcal S[g(c_0)]\). That is, the transformations preserve the action. Then we say that the action has a symmetry under \(G\), or that \(G\) is a symmetry of the action.

When the group is smooth, i.e., a Lie group, we say that the symmetry is a continuous symmetry (but recall that it really means the symmetry group is differentiable, and in fact smooth—not merely continuous). The elements of the Lie algebra associated to the symmetry group are usually called generators of the symmetry.

There are two important consequences of the action having a continuous symmetry:
  1. The system will have a physical symmetry in the sense that the symmetry group necessarily maps on-shell configurations to on-shell configurations. [5] For example, when the action is invariant under the group of spatial translations, then the system it describes will be translationally invariant too: its behaviour will be the same no matter where in the universe it is located.
  2. The system will have a conserved quantity on-shell. In fact, a stronger statement can be made: it will have a conserved current and the conservation law for the conserved quantity can be described by a local continuity equation.
The first statement is easy to see and intuitive. The proof is left as an exercise to the reader. [7] The second statement is the content of Noether's theorem.

In fact, the condition we need is a bit weaker than this: it suffices that the change in the action under a transformation depends only on the boundary conditions, that is, the value the configuration takes at the boundary. [12] It is not hard to see that this still yields a physical symmetry, as a configuration, in order to be on-shell, need only have stationary action with respect to nearby configurations with the boundary conditions fixed.

We assume that an on-shell configuration is also on-shell when restricted to a subset manifold, implying that the action changes only by a boundary term on all domains of integration. In turn, this implies that the Lagrangian density changes by a divergence. (Discussion: Does the action and Lagrangian have identical symmetries and conserved quantities?)

\(\delta c \approx \epsilon X(c)\) (\(X\) is a generator of the symmetry)
\(\delta \mathcal{L} \approx \epsilon \operatorname{div} f\) (\(f\) is independent of the interior configuration)

Conserved currents

Noether's theorem gives a formula for a locally conserved current corresponding to a physical symmetry. A conserved current is a vector field whose divergence vanishes. In the case where \(U\) represents time, a conserved current is trivially just a conserved quantity such as energy or momentum. If \(U\) represents space-time, a conserved current can be interpreted as a combination of a conserved charge density and a current describing the flow of that charge, together satisfying a continuity equation. For example, the electric charge density and three components of the electric current density together form such a conserved current,\[\partial_\mu j^\mu = 0\] where \(j^\mu = (c\rho, J_x, J_y, J_z)\), and this can be written as a continuity equation,\[\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0\] Identifying the time-like direction on our space-time manifold \(U\), we can integrate the conserved charge density over the space-like slice at a given time, to obtain a total conserved charge, which may represent, for example, the total electric charge present in a system at a given time. The conserved charge is then conserved in the sense that its value is constant in time.
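As a quick sanity check of what the continuity equation says, here is a small sketch in one spatial dimension. The Gaussian charge blob, its drift speed `V`, and the finite-difference step are all assumptions of mine, not anything from the theory above. It checks that \(\partial\rho/\partial t + \partial J/\partial x\) vanishes pointwise and that the total charge on a spatial slice does not change with time.

```python
import math

V = 0.7  # drift speed of the charge blob (assumed for illustration)

def rho(x, t):
    """Charge density: a Gaussian blob moving rigidly at speed V."""
    return math.exp(-(x - V * t) ** 2)

def J(x, t):
    """Current density carried by the moving blob."""
    return V * rho(x, t)

def continuity_residual(x, t, h=1e-5):
    """Central-difference estimate of d(rho)/dt + dJ/dx at the point (x, t)."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    dJ_dx = (J(x + h, t) - J(x - h, t)) / (2 * h)
    return drho_dt + dJ_dx

def total_charge(t, lo=-20.0, hi=20.0, n=4000):
    """Midpoint-rule integral of rho over the spatial slice at time t."""
    dx = (hi - lo) / n
    return sum(rho(lo + (i + 0.5) * dx, t) for i in range(n)) * dx

residual = continuity_residual(0.3, 1.2)
q_early, q_late = total_charge(0.0), total_charge(5.0)
print(residual, q_early, q_late)
```

The charge moves around, but what leaves one region enters the neighbouring one, so the integral over the whole slice stays put.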

The theorem and proof

The theorem itself has already been stated: to a continuous symmetry of the action there corresponds a conserved current, that is, a vector field whose divergence vanishes on-shell. [13]

We now give the explicit formula for the conserved current [14]:\[j = X(c_0) \cdot \frac{\partial \mathcal L}{\partial \nabla c} - f\] where \(X\) is a generator of a continuous symmetry of the action. (A physicist would write that \(1 + \epsilon X\) is an infinitesimal transformation.) A symmetry transformation generated by \(X\), \(\exp(\lambda X)\), is assumed to change the Lagrangian density by a total divergence, \(\operatorname{div} f\), that doesn't depend on interior values of the configuration.

To prove this, take the divergence of both sides. On the right hand side we obtain, applying the appropriate product rule [15]:\begin{align*}\operatorname{div} j &= \operatorname{div}\left[X(c_0) \cdot \frac{\partial\mathcal L}{\partial \nabla c} - f\right] \\ & = \nabla X(c_0) \cdot \frac{\partial \mathcal L}{\partial \nabla c} + X(c_0) \cdot \operatorname{div} \frac{\partial \mathcal L}{\partial \nabla c} - \operatorname{div} f \\ &= \nabla X(c_0) \cdot \frac{\partial \mathcal L}{\partial \nabla c} + X(c_0) \cdot \frac{\partial \mathcal L}{\partial c} - \operatorname{div} f\end{align*} where the final equality is obtained by applying the Euler–Lagrange equations. But the first two terms are now the total derivative of \(\mathcal{L}\) with respect to \(\lambda\) at \(\lambda = 0\) as the system transforms, \(c_0 \mapsto \exp(\lambda X)(c_0)\); this is clearly recognizable from the chain rule. We recall that this change is a divergence, \(\operatorname{div} f\). Therefore this entire expression vanishes, and so does \(\operatorname{div} j\), which therefore gives \(j\) as a conserved current.


The examples will be done in the physicist's usual style, which involves infinitesimals. The reader is advised to bear in mind that infinitesimals can be eliminated by rewriting expressions in terms of derivatives.

For a single classical particle in three dimensions moving in a time-independent potential, the Lagrangian is \[L = \frac{1}{2} m|\dot x|^2 - V(x)\] The action thus obtained is \[\mathcal{S} = \int_a^b L \, \mathrm{d}t\] Consider the following transformation, which shifts a path in time [9][10] by the infinitesimal increment \(\epsilon\):\[x' \leftarrow x + \epsilon \dot x\] Then, plugging this into the expression for \(L\) and using \(V(x + \delta x) \approx V(x) + \delta x \cdot \nabla V(x)\),\[L' - L = m \epsilon \dot x \cdot \ddot x + \frac{1}{2} m \epsilon^2 |\ddot x|^2 - \epsilon \dot x \cdot \nabla V(x)\] Discard the \(\epsilon^2\) term. What remains can be written as the total derivative \[L' - L = \epsilon \frac{\mathrm{d}}{\mathrm{d}t} \left[\frac{1}{2} m |\dot x|^2 - V(x)\right]\] The quantity in the brackets is \(f\). Note that the generator itself is \(X = \dot x\), and the term \(\frac{\partial \mathcal L}{\partial \nabla c}\) is \(\nabla_{\dot x} L = m\dot x\). We obtain for our conserved current \[j = m|\dot x|^2 - \left[\frac{1}{2}m|\dot x|^2 - V(x)\right] = \frac{1}{2}m|\dot x|^2 + V(x)\] which is the energy of the particle.

Energy is therefore the Noether charge corresponding to the time-translation symmetry of the system.
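If you'd like to see this conservation law hold numerically, here is a minimal sketch. To keep it short I assume one spatial dimension instead of three, an arbitrary anharmonic potential, unit mass, and a hand-written fourth-order Runge–Kutta integrator; none of these choices come from the derivation above. The function `noether_charge` evaluates \(j = X \cdot \partial L/\partial \dot x - f\) literally, with \(X = \dot x\) and \(f = L\), along a numerically integrated on-shell trajectory.

```python
import math

M = 1.0  # particle mass (assumed)

def V(x):
    """Time-independent potential; an arbitrary anharmonic choice."""
    return 0.5 * x ** 2 + 0.25 * x ** 4

def dV(x):
    """Derivative of the potential above."""
    return x + x ** 3

def noether_charge(x, v):
    """j = X * dL/d(xdot) - f with X = xdot, dL/d(xdot) = M * xdot, f = L."""
    lagrangian = 0.5 * M * v ** 2 - V(x)
    return v * (M * v) - lagrangian

def rk4_step(x, v, dt):
    """One fourth-order Runge-Kutta step of the equation M * x'' = -V'(x)."""
    acc = lambda pos: -dV(pos) / M
    k1x, k1v = v, acc(x)
    k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x)
    k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x)
    k4x, k4v = v + dt * k3v, acc(x + dt * k3x)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x_new, v_new

# Start at rest at x = 1 and follow the on-shell trajectory for 5 time units.
x, v = 1.0, 0.0
charges = [noether_charge(x, v)]
for _ in range(5000):
    x, v = rk4_step(x, v, 1e-3)
    charges.append(noether_charge(x, v))

drift = max(charges) - min(charges)
print(charges[0], drift)
```

The position and velocity change substantially over the run, but `drift` stays at the level of the integrator's error, as expected for a conserved quantity.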

For a more complicated example, consider a complex Klein–Gordon field, that is, a complex field with the Lagrangian density\[\mathcal L = \frac{1}{2} \eta^{\mu\nu} \partial_\mu \phi^* \partial_\nu \phi - \frac{1}{2} m^2 |\phi|^2\] One can verify that the Lagrangian density is invariant under a global change in phase of the complex field \(\phi\). The action is therefore also invariant, and \(f = 0\). We can represent an infinitesimal phase shift by\[\phi' \leftarrow \phi + i \epsilon \phi\] so \(X\phi = i\phi\). It only remains to work out the term \(\frac{\partial \mathcal L}{\partial \nabla \phi}\). Recall that a complex field is really represented as a pair of real fields, which we'll call \(\phi_x\) and \(\phi_y\), with \(\phi = \phi_x + i\phi_y\). In terms of these, the Lagrangian density is\[\mathcal{L} = \frac{1}{2} \eta^{\mu\nu} (\partial_\mu \phi_x \partial_\nu \phi_x + \partial_\mu \phi_y \partial_\nu \phi_y) - \frac{1}{2} m^2 (\phi_x^2 + \phi_y^2)\] Only the first term depends on the derivatives of the fields. The differentiation is now a straightforward application of the product rule,\[\frac{\partial\mathcal L}{\partial(\partial_\lambda \phi_x)} = \frac{1}{2} \partial^\lambda \phi_x + \frac{1}{2} \partial^\lambda \phi_x = \partial^\lambda \phi_x\] and likewise \[\frac{\partial\mathcal L}{\partial(\partial_\lambda \phi_y)} = \partial^\lambda \phi_y\] and \(X(\phi_x)\) is the real part of \(i\phi\), that is, \(-\phi_y\), and likewise \(X(\phi_y)\) is \(\phi_x\) so our conserved current is\[j^\lambda = -\phi_y \partial^\lambda \phi_x + \phi_x \partial^\lambda \phi_y\] which can alternatively be written as \[j^\lambda = \operatorname{Im}(\phi^* \partial^\lambda \phi)\] This can be thought of as analogous to the electric current density for this field, and \(\operatorname{Im}(\phi^* \, \partial \phi/\partial t)\) can be identified with the electric charge density (up to a factor of c).
The current \(j^\lambda\) is the Noether current corresponding to this symmetry under global phase shift (which is called a "global U(1) symmetry") and the value\[Q = \int j^0 \, \mathrm{d}x \, \mathrm{d}y \, \mathrm{d}z\] at fixed \(t\) is the Noether charge, with \[\frac{dQ}{dt} = 0\] that is, \(Q\) is a conserved quantity.
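Here is a rough numerical check that \(Q\) really is constant in time. The setup is entirely my own assumption for illustration: one spatial dimension with a periodic box of length \(2\pi\), units with \(c = 1\) and a mostly-minus metric (so that \(\partial^0 = \partial/\partial t\)), \(m = 1\), and an arbitrary handful of plane waves. Each plane wave \(e^{i(kx \mp \omega t)}\) with \(\omega = \sqrt{k^2 + m^2}\) is on-shell, so any superposition of them is too.

```python
import cmath
import math

MASS = 1.0          # field mass m (assumed)
BOX = 2 * math.pi   # length of the periodic spatial box (assumed)
N = 64              # number of grid points used for the spatial integral

# (wavenumber k, complex amplitude, sign s of the frequency) for each plane wave.
MODES = [(1, 0.8 + 0.1j, +1), (2, 0.3 - 0.5j, -1), (3, 0.2j, +1)]

def field_and_time_derivative(x, t):
    """phi and d(phi)/dt for a sum of plane waves amp * exp(i(k x - s omega t))."""
    phi, dphi = 0j, 0j
    for k, amp, s in MODES:
        omega = math.sqrt(k ** 2 + MASS ** 2)  # Klein-Gordon dispersion relation
        wave = amp * cmath.exp(1j * (k * x - s * omega * t))
        phi += wave
        dphi += -1j * s * omega * wave
    return phi, dphi

def charge(t):
    """Q = integral over the box of j^0 = Im(phi^* d(phi)/dt) at fixed t."""
    dx = BOX / N
    total = 0.0
    for i in range(N):
        phi, dphi = field_and_time_derivative(i * dx, t)
        total += (phi.conjugate() * dphi).imag * dx
    return total

q0, q1 = charge(0.0), charge(1.7)
print(q0, q1)
```

The charge density \(j^0\) itself sloshes around in space and time because of the interference between the modes; only its integral over the whole slice is constant.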


[1] Possible generalizations are left as an exercise for the reader.
[2] A complex field can be represented by a pair of real fields.
[3] Configurations may or may not be redundant; that is, it is possible that two different configurations are physically indistinguishable. For example, performing a gauge transformation on the electromagnetic potentials gives a different configuration that yields the same E and B fields.
[4] The manifold should be equipped with a measure so that it is possible to integrate a scalar function over it. Generally it will be an orientable Riemannian manifold.
[5] This terminology is probably not standard.
[6] In the usual sense (Group action)
[7] Hint: \[\frac{\delta (\mathcal S \circ g)}{\delta c}\bigg|_{c=c_0} = \nabla g(c_0) \cdot \frac{\delta \mathcal S}{\delta c}\bigg|_{c=g(c_0)}\]
[8] Making it mathematically rigorous is left as an exercise for the reader.
[9] This is just the Taylor expansion up to order one of \(x(t + \epsilon)\).
[10] The way this transformation is written, it actually changes the particle's position to its position infinitesimally in the future—so the particle arrives earlier, which means we are shifting backward in time. We do this to get the correct sign for the conserved quantity.
[11] For example, in the case of the classical electromagnetic field in vacuum, this gives four equations, one for each component of the four-potential, equivalent to Gauss's law in vacuum and the Ampère–Maxwell law in vacuum, \(\nabla \times \mathbf{B} = \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}\) (which is three scalar equations). More examples may be found in standard references, such as L&L vols. 1 and 2.
[12] This is trivially satisfied when the action really is invariant.
[13] A subtle question: must the action exhibit a symmetry, or can we derive a conservation law from the equations of motion alone, that is, simply from the knowledge of a physical symmetry? To the best of my knowledge, you really do need to have a symmetry of the action. See, e.g., Do an action and its Euler-Lagrange equations have the same symmetries? Noether also originally stated her theorem in terms of symmetries of the action.
[14] Because Quora's LaTeX support is shitty, some explicit dependences have been suppressed. To be clear, \(j\) is a vector field that depends on the position \(x_0 \in U\) as well as the configuration near \(x_0\); \(f\) depends on position but is assumed only to depend on boundary values of \(c_0\); \(X(c_0)\) is evaluated at \(x_0\); and \(\partial \mathcal L / \partial \nabla c\) is evaluated with the arguments \(x_0\), \(c_0(x_0)\), and \(\nabla c_0(x_0)\).
[15] This should be interpreted as follows: \(\nabla X(c_0) \cdot \partial \mathcal L / \partial \nabla c\) is the sum of \(\nabla X(c_0^i) \cdot \partial \mathcal L / \partial \nabla c^i\) where the superscript \(i\) denotes the ith component of the configuration; \(\operatorname{div} \partial \mathcal L / \partial \nabla c\) contracts over the index created by the gradient on \(c\), not the components of \(c\) (not that that would make any sense, anyway).


I would like to thank Qmechanic from Physics Stack Exchange, who helped to clear up much of the confusion I experienced while writing this answer.