The other day I was thinking about the Klein-Gordon equation, otherwise known as the Klein–Fock–Gordon equation. I had to use it for something, and afterwards I found myself thinking of its derivation. So, for fun, let’s derive it!

[Figure: Klein-Gordon equation traveling wave plot]

[For the reader not familiar with this equation, it is a relativistic wave equation related to the Schrödinger equation. You can find a very general entry into its use and importance here. I also found some lecture slides that go over its derivation in more detail than my own treatment here].


Since we are dealing here with relativistic quantum mechanics, the following derivation of the Klein–Fock–Gordon equation employs the Einstein energy-momentum relation. Also, to simplify things, let’s invoke the standard convention of natural units, in which \hbar = c = 1. This means we don’t have to keep track of the \hbar and c factors that would otherwise appear throughout. It’s not so much lazy, just practical.

We start with the Schrödinger equation in natural units,

    \[ i \frac{\partial \psi}{\partial t} + \frac{1}{2m} \nabla^2 \psi - V\psi = 0 \]

Let’s now remind ourselves of the equation for the nonrelativistic energy of a free particle,

    \[ E = \frac{\vec{p}^{\,2}}{2m} \]

(Recall: the inner product of the four-momentum with itself is Lorentz invariant and gives p^{\mu}p_{\mu} = \frac{E^2}{c^2} - \mid\vec{p}\mid^2 = m^2c^2).

Quantising this nonrelativistic equation we get \hat{H} \psi = E \psi, where \hat{H} = T + V =  \frac{p^2}{2m} + V.

Taking the quantum mechanical operators, we get:

    \[ E \rightarrow \ i \hbar \frac{\partial}{\partial t} \]

    \[ \vec{p} \rightarrow \ -i \hbar \vec{\nabla} \]
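
As a quick sanity check on these substitutions, here is a minimal Python sketch (standard library only) that applies both operators to a one-dimensional plane wave \phi = e^{i(px - Et)} using central finite differences, in units where \hbar = 1. The function names, the step size h, and the sample values of E and p are my own illustrative choices, not part of the derivation.

```python
import cmath

def plane_wave(x, t, p, E):
    """One-dimensional plane wave exp(i(p x - E t)), with hbar = 1."""
    return cmath.exp(1j * (p * x - E * t))

def energy_operator(f, x, t, h=1e-5):
    """Apply i d/dt to f at (x, t) using a central difference."""
    return 1j * (f(x, t + h) - f(x, t - h)) / (2 * h)

def momentum_operator(f, x, t, h=1e-5):
    """Apply -i d/dx to f at (x, t) using a central difference."""
    return -1j * (f(x + h, t) - f(x - h, t)) / (2 * h)

# Illustrative values (assumed, not taken from the text)
p, E = 1.5, 2.0

def phi(x, t):
    return plane_wave(x, t, p, E)

x0, t0 = 0.3, 0.7
energy_eigenvalue = energy_operator(phi, x0, t0) / phi(x0, t0)
momentum_eigenvalue = momentum_operator(phi, x0, t0) / phi(x0, t0)

print(energy_eigenvalue)    # close to E
print(momentum_eigenvalue)  # close to p
```

Up to finite-difference error, the plane wave is an eigenfunction of both operators, with eigenvalues E and p. That is the sense in which we are allowed to swap E and \vec{p} for derivatives.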

With these operators defined, a natural next step is to apply them to the Einstein energy-momentum relation. Let’s now turn our attention to this relation, which, as one may already know, takes the form (with c = 1):

    \[ E^2 = \vec{p}^{\,2} + m^2 \]

Substituting for our quantum operators, we get something immediately looking like this:

    \[ \left(i \hbar \frac{\partial}{\partial t}\right)^2 \phi = \left(-i \hbar \vec{\nabla}\right)^2 \phi + m^2 \phi \]

Or, one will sometimes see this written as \left(\frac{E}{c}\right)^2 = p^2 + m^2c^2; substituting for E and \vec{p}, you will find \left(\frac{\hbar}{c} \frac{\partial}{\partial t}\right)^2 \phi = (\hbar^2 \nabla^2 - m^2 c^2) \phi. But because we’ve set \hbar = c = 1, our expression is simpler. In our case, when we expand the brackets, use i^2 = -1, and simplify,

    \[ - \frac{\partial^2 \phi}{\partial t^2} = m^2 \phi - \nabla^2 \phi \ \ \ (*) \]

This is essentially the Klein-Gordon equation. However, we have some conflicts in notation. Really, we want our equation above to be in four-vector notation. So, what do we do? I’ve already given some hints in the graphic I posted above (a sketch from my notebook).

In four-vector notation, one will likely already be familiar with the idea of having one time component and three space components. We can write this as X^{\mu} = (X^0, \bar{X}), where \bar{X} is just short for our x, y, and z components.

Now, with that sorted, we need to think about our four-gradient, which we can write as \partial_{\mu} = (\frac{\partial}{\partial t}, \vec{\nabla}). Taking the dot product, or, in other words, applying the Einstein summation convention, we have \partial_{\mu}\partial^{\mu} = \partial_{\mu} g^{\mu \nu} \partial_{\nu}, where one might recognise g^{\mu \nu} as our metric. With the metric signature (+, -, -, -), we come to the following:

    \[ \partial_{\mu}\partial^{\mu} = \partial_{\mu} g^{\mu \nu} \partial_{\nu} \implies \begin{pmatrix} \frac{\partial}{\partial t} & \bar{\nabla} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -\mathbb{1} \end{pmatrix} \begin{pmatrix} \frac{\partial}{\partial t} \\ \bar{\nabla} \end{pmatrix} \]

which, after performing standard matrix multiplication, comes out to

    \[ \begin{pmatrix} \frac{\partial}{\partial t} & -\bar{\nabla} \end{pmatrix} \begin{pmatrix} \frac{\partial}{\partial t} \\ \bar{\nabla} \end{pmatrix} = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \]

So, we have (invoking the d’Alembertian at the end)

    \[ \partial^{\mu} \partial_{\mu} = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \implies \Box = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \]
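
The same time-minus-space contraction can be sketched numerically. The snippet below is a minimal illustration, not a general tensor library: it stores the (+, -, -, -) metric as its diagonal and contracts an on-shell four-momentum with itself, which should return m^2 (with c = 1). The helper names `contract` and `on_shell_momentum` and the sample numbers are hypothetical choices of mine.

```python
# Minkowski metric with signature (+, -, -, -), stored as its diagonal
METRIC_DIAGONAL = (1.0, -1.0, -1.0, -1.0)

def contract(a, b):
    """Return a_mu g^{mu nu} b_nu for two four-component sequences."""
    return sum(g * a_i * b_i for g, a_i, b_i in zip(METRIC_DIAGONAL, a, b))

def on_shell_momentum(mass, px, py, pz):
    """Four-momentum (E, px, py, pz) with E fixed by E^2 = |p|^2 + m^2."""
    energy = (mass**2 + px**2 + py**2 + pz**2) ** 0.5
    return (energy, px, py, pz)

# Illustrative values (assumed, not taken from the text)
mass = 2.0
p = on_shell_momentum(mass, 0.3, -1.2, 0.5)

invariant = contract(p, p)
print(invariant)  # close to mass**2
```

The derivative version works the same way: the time component enters with a plus sign and each space component with a minus sign, which is exactly the structure of the d’Alembertian.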

As we approach our final result, recognise that what we have now looks very much like (*). That’s because we have an explicit representation for our d’Alembertian. All that’s left is some algebra and we’re done!

    \[ - \frac{\partial^2 \phi}{\partial t^2} = m^2 \phi - \nabla^2 \phi \]

    \[ - \frac{\partial^2 \phi}{\partial t^2} + \nabla^2 \phi - m^2 \phi = 0 \]

    \[ \frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi + m^2 \phi = 0 \]

Substituting the d’Alembert operator \Box = \frac{\partial^2}{\partial t^2} - \nabla^2,

    \[ \Box \phi + m^2 \phi = 0 \]

    \[ \implies (\Box + m^2) \phi = 0 \]

And here is the version of the Klein-Gordon equation you will see in many texts, except that here it does not include \hbar and c. With those restored, one will often see it written as \left(\Box + \frac{m^2c^2}{\hbar^2}\right)\phi = 0 where now \Box = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2.

What is really cool about this equation is that you can find plane wave solutions to it relatively easily. The caveat is that a plane wave solves the Klein-Gordon equation only so long as its energy and momentum obey Einstein’s relation. I’ll also save some discussion of its problems, or limitations, for another time. For now it is just nice to appreciate the result: an attempt at relativistic quantum mechanics!
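
To make the plane wave caveat concrete, here is a small Python sketch (standard library only) in one space dimension with \hbar = c = 1. It evaluates \frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} + m^2 \phi for \phi = e^{i(kx - \omega t)} by finite differences, once with \omega^2 = k^2 + m^2 and once with \omega deliberately shifted off that relation. The function names, sample point, step size, and numbers are my own assumptions for illustration.

```python
import cmath

def plane_wave(x, t, k, omega):
    """Plane wave exp(i(k x - omega t)) in one space dimension."""
    return cmath.exp(1j * (k * x - omega * t))

def second_derivative(f, s, h):
    """Central-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2 * f(s) + f(s - h)) / h**2

def klein_gordon_residual(k, omega, mass, x=0.3, t=0.7, h=1e-3):
    """Return (phi_tt - phi_xx + m^2 phi) / phi for the plane wave at (x, t)."""
    phi = plane_wave(x, t, k, omega)
    d2_dt2 = second_derivative(lambda s: plane_wave(x, s, k, omega), t, h)
    d2_dx2 = second_derivative(lambda s: plane_wave(s, t, k, omega), x, h)
    return (d2_dt2 - d2_dx2 + mass**2 * phi) / phi

# Illustrative values (assumed, not taken from the text)
mass, k = 1.0, 2.0
omega_on_shell = (k**2 + mass**2) ** 0.5

on_shell = abs(klein_gordon_residual(k, omega_on_shell, mass))
off_shell = abs(klein_gordon_residual(k, omega_on_shell + 0.5, mass))

print(on_shell)   # small, limited by the finite-difference step
print(off_shell)  # of order one: this wave does not satisfy the equation
```

The on-shell residual vanishes up to discretisation error, while the shifted frequency leaves a residual of order one. This is the numerical version of saying the plane wave only works when energy and momentum follow Einstein’s relation.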