The other day I was thinking about the Klein-Gordon equation, otherwise known as the Klein–Fock–Gordon equation. I had to use it for something, and afterwards I found myself thinking of its derivation. So, for fun, let’s derive it!

[Figure: Klein-Gordon equation traveling wave plot]

[For the reader not familiar with this equation, it is a relativistic wave equation related to the Schrödinger equation. You can find a very general entry into its use and importance here. I also found some lecture slides that go over its derivation in more detail than my own treatment here].


Since we are dealing here with relativistic quantum mechanics, the following derivation of the Klein–Fock–Gordon equation will employ the Einstein energy-momentum relation. Also, to simplify things, let’s invoke the standard convention of natural units, in which \hbar = c = 1. This will allow us to not have to focus on the \hbar and c terms that would otherwise appear throughout. It’s not so much lazy, just practical.

We start with the Schrödinger equation in natural units,

    \[ i \frac{\partial \psi}{\partial t} + \frac{1}{2m} \nabla^2 \psi - V\psi = 0 \]

As we continue to set the stage, let’s now also remind ourselves of the equation for the nonrelativistic energy of a free particle (this will be important for reasons that will become clear in a moment),

    \[ E = \frac{\vec{p}^2}{2m} \]

Now that we have our equation written, we can proceed. The key here is that we’re going to want to quantise this nonrelativistic equation. The result we get is \hat{H} \psi = E \psi, where \hat{H} = T + V =  \frac{p^2}{2m} + V.

(Recall: the inner product of the four-momentum with itself is Lorentz invariant and gives p^{\mu}p_{\mu} = \frac{\epsilon^2}{c^2} - \mid\vec{p}\mid^2 = m^2c^2, where \epsilon is the energy.)

Taking the quantum mechanical operators, we get:

    \[ E \rightarrow \ i \hbar \frac{\partial}{\partial t} \]

    \[ \vec{p} \rightarrow \ -i \hbar \vec{\nabla} \]
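
As a quick sanity check, assuming a free particle (V = 0), applying these two substitutions to the nonrelativistic energy E = \frac{\vec{p}^2}{2m} and letting both sides act on \psi gives back the free Schrödinger equation:

    \[ i \hbar \frac{\partial \psi}{\partial t} = \frac{(-i \hbar \vec{\nabla})^2}{2m} \psi = -\frac{\hbar^2}{2m} \nabla^2 \psi \]

Setting \hbar = 1, this reads i \frac{\partial \psi}{\partial t} + \frac{1}{2m} \nabla^2 \psi = 0.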

With these operators defined, the natural next step is to apply them to the Einstein energy-momentum relation instead of the nonrelativistic one. We can now turn our attention to this relation, which one may already know takes the form:

    \[ E^2 = \vec{p}^2 + m^2 \]

Substituting for our quantum operators, we get something immediately looking like this:

    \[ (i \hbar \frac{\partial}{\partial t})^2 = (-i \hbar \vec{\nabla})^2 + m^2 \]

Or, one will sometimes see this written as (\frac{E}{c})^2 = p^2 + m^2c^2; then, substituting for E and \vec{p}, you will find (\frac{\hbar}{c} \frac{\partial}{\partial t})^2 \phi = (\hbar^2 \nabla^2 - m^2 c^2) \phi. But because we’ve set \hbar = c = 1, our approach is slightly different. In our case, we simply expand the brackets, let both sides act on a field \phi, and simplify,

    \[ - \frac{\partial^2 \phi}{\partial t^2} = m^2 \phi - \nabla^2 \phi \ \ \ (*) \]
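
To spell out the expansion (assuming, as above, \hbar = 1 and that both sides act on a scalar field \phi), the only thing to track is i^2 = (-i)^2 = -1:

    \[ \left( i \frac{\partial}{\partial t} \right)^2 \phi = - \frac{\partial^2 \phi}{\partial t^2}, \qquad (-i \vec{\nabla})^2 \phi = - \nabla^2 \phi \]

so the energy-momentum relation becomes - \frac{\partial^2 \phi}{\partial t^2} = - \nabla^2 \phi + m^2 \phi, which is (*).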

This is essentially the Klein-Gordon equation. However, we have some conflicts in notation. Really, we want our equation above to be in four-vector notation. So, what do we do? I’ve already given some hints in the graphic I posted above (a sketch from my notebook).

With four-vectors, one will already likely be familiar with the idea of having one time component and three space components. We can write this as X^{\mu} = (X^0, \bar{X}), where \bar{X} is just short for our x, y, and z components.

Now, with that sorted, we need to think about our four-gradient, which we can write as \partial_{\mu} = (\frac{\partial}{\partial t}, \vec{\nabla}). Taking the dot product, or, in other words, applying the Einstein summation convention, \partial_{\mu}\partial^{\mu} = \partial_{\mu} g^{\mu \nu} \partial_{\nu}, where one might recognise g^{\mu \nu} as our metric. Taking the metric to have the signature (+ - - -), we come to the following:

    \[ \partial_{\mu}\partial^{\mu} = \partial_{\mu} g^{\mu \nu} \partial_{\nu} \implies \begin{pmatrix} \frac{\partial}{\partial t} & \bar{\nabla} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -\mathbf{1} \end{pmatrix} \begin{pmatrix} \frac{\partial}{\partial t} \\ \bar{\nabla} \end{pmatrix} \]

which, after performing standard matrix multiplication, comes out to

    \[ \begin{pmatrix} \frac{\partial}{\partial t} & \bar{\nabla} \end{pmatrix} \begin{pmatrix} \frac{\partial}{\partial t} \\ -\bar{\nabla} \end{pmatrix} = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \]

So, we have (invoking the d’Alembertian at the end)

    \[ \partial^{\mu} \partial_{\mu} = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \implies \Box = \frac{\partial^2}{\partial t^2} - \bar{\nabla}^2 \]
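
As a cross-check, the same contraction can be written out index by index instead of in block-matrix form. Assuming the diagonal metric g^{\mu \nu} = \mathrm{diag}(1, -1, -1, -1) and labelling the coordinates (t, x, y, z), only the diagonal terms of the sum survive:

    \[ \partial_{\mu} g^{\mu \nu} \partial_{\nu} = g^{00} \partial_0 \partial_0 + g^{11} \partial_1 \partial_1 + g^{22} \partial_2 \partial_2 + g^{33} \partial_3 \partial_3 = \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2} \]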

As we approach our final result, recognise that what we have now looks very much like (*). That’s because we have arrived at a representation for our d’Alembertian, and we need to perform a substitution. All that is then left is some algebra and we’re done! Here is what I mean: let us now return to our previous equation, (*), and rearrange it so that we can substitute directly for our d’Alembertian:

    \[ - \frac{\partial^2 \phi}{\partial t^2} = m^2 \phi - \nabla^2 \phi \]

    \[ - \frac{\partial^2 \phi}{\partial t^2} + \nabla^2 \phi - m^2 \phi = 0 \]

    \[ \frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi + m^2 \phi = 0 \]

Sub for the d’Alembert operator,

    \[ \Box \phi + m^2 \phi = 0 \]

    \[ \implies (\Box + m^2) \phi = 0 \]

And here is the version of the Klein-Gordon equation you will see in many texts, except that ours does not include \hbar and c. With those factors restored, one will often see it written as (\Box + \frac{m^2c^2}{\hbar^2})\phi = 0 where, again with the factors restored, \Box = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2.
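
For the reader who wants to see where the factors of \hbar and c go, here is a short sketch starting from the form (\frac{E}{c})^2 = p^2 + m^2c^2 with E \rightarrow i \hbar \frac{\partial}{\partial t} and \vec{p} \rightarrow -i \hbar \vec{\nabla}, both sides acting on \phi:

    \[ - \frac{\hbar^2}{c^2} \frac{\partial^2 \phi}{\partial t^2} = - \hbar^2 \nabla^2 \phi + m^2 c^2 \phi \]

Dividing through by \hbar^2 and collecting everything on one side,

    \[ \left( \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 + \frac{m^2 c^2}{\hbar^2} \right) \phi = 0 \]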

Concluding remarks

What is really cool about this equation is that you can find plane wave solutions to it relatively easily. The caveat is that the plane wave is a solution to the Klein-Gordon equation only so long as its energy and momentum follow Einstein’s relation. This last comment provides a hint for further study, should the inquisitive reader immediately think of connections with GR and RQM.
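
To make the plane-wave statement concrete, here is a minimal numerical sketch in Python (standard library only). It assumes one space dimension, natural units, and a plane wave \phi = e^{i(px - Et)}. The function names, the sample values of m and p, the sample point, and the finite-difference step are all illustrative choices, nothing canonical. Analytically, plugging the plane wave into the equation gives (-E^2 + p^2 + m^2)\phi, so the residual should vanish only when E^2 = p^2 + m^2.

```python
import cmath
import math


def plane_wave(t, x, energy, momentum):
    """Plane wave exp(i(p x - E t)) in 1+1 dimensions, natural units."""
    return cmath.exp(1j * (momentum * x - energy * t))


def kg_residual(t, x, energy, momentum, mass, h=1e-3):
    """Central-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x)."""
    def phi(tt, xx):
        return plane_wave(tt, xx, energy, momentum)

    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return d2t - d2x + mass**2 * phi(t, x)


# Illustrative values (assumptions): m = 1, p = 0.75, so E = sqrt(p^2 + m^2).
mass, momentum = 1.0, 0.75
on_shell_energy = math.sqrt(momentum**2 + mass**2)

# Residual when E obeys Einstein's relation, and when it does not (E = 2).
on_shell = abs(kg_residual(0.3, 0.7, on_shell_energy, momentum, mass))
off_shell = abs(kg_residual(0.3, 0.7, 2.0, momentum, mass))
print(on_shell, off_shell)
```

With the on-shell energy the residual is zero up to finite-difference error, while the off-shell choice E = 2 leaves a residual close to |-E^2 + p^2 + m^2|.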

I’ll also save some discussion on some of the problems, or limitations, pertaining to the KG equation for another time. For now it is just nice to appreciate the result – an attempt at relativistic quantum mechanics!

In a following post, I will show an alternative way (there are a few) to derive the KG equation, one which is much more terse or abrupt. I think it is useful to know.