Mathematics for the Stern-Gerlach Circuit
Revisiting an Error
Recently we discussed the Stern-Gerlach experiment in the context of Quantum Information. From this perspective, the silver atoms play the role of quantum bits or qubits. In particular, we discussed what happens when we chain multiple Stern-Gerlach devices together in series, serving as a toy model for a quantum circuit.
We dove a little deeper with our examples on our YouTube channel. If you missed our prior discussions, don't worry too much; the logic in the next sections of this essay should be fairly self-contained.
In our previous discussion, we spent a lot of time on a particular three-machine circuit: the one in which the Stern-Gerlach magnets point in the $\widehat{z}$, $\widehat{x}$, and $\widehat{z}$ directions, respectively.
In our original setup, the “spin up” output terminal of each device connects directly to the input terminal of the next device. For the $\widehat{z}-\widehat{x}-\widehat{z}$ circuit, we demonstrated that the final output of the circuit was an evenly split beam.
We then turned the middle device off, and found that only a spin up beam remained.
We then tried to set up a similar problem where the second device was still on, but we didn’t actually observe which way the beam was pointing. In this case, we said, the system should behave just like turning the second device off.
We were completely wrong.
The opposite is what actually happens; the beam shoots downwards. Today, I'll explain why. The reasons behind the error are subtle, but offer an excellent introduction to the mathematics involved.
Vectors and Matrices
You might remember that the most general state of a silver atom - our qubit - is a linear combination of up and down states,
$$|\mathrm{Ag} \rangle = \alpha\, |\uparrow\; \rangle + \beta\, |\downarrow\; \rangle.$$
You might also remember that \emph{up} and \emph{down} are defined relative to the detecting device. For us this means the direction in which the magnet in the Stern-Gerlach device is pointing. $|\uparrow\;\rangle$ means parallel to that magnetic field, and $|\downarrow\;\rangle$ means antiparallel to it.
As before we can elect to encode this information as a column vector:
$$|\mathrm{Ag}\rangle \rightsquigarrow \left( \begin{array}{c} \alpha \\ \beta \end{array}\right),$$
where
\begin{equation}\label{zbasis}|\uparrow\;\rangle \rightsquigarrow \left( \begin{array}{c} 1 \\ 0 \end{array}\right),\quad |\downarrow\;\rangle \rightsquigarrow \left( \begin{array}{c} 0 \\ 1 \end{array}\right).\end{equation}
Today we will make extensive use of this kind of notation. One other important reminder is that we can take the inner product of two such vectors:
$$\langle \varphi | \psi \rangle = \left(\begin{array}{cc}\varphi_{\uparrow}^{\star} & \varphi_{\downarrow}^{\star} \end{array}\right) \left( \begin{array}{c}\psi_{\uparrow} \\ \psi_{\downarrow}\end{array}\right) = \varphi^{\star}_{\uparrow}\psi_{\uparrow} + \varphi^{\star}_{\downarrow}\psi_{\downarrow}.$$
In particular, the inner product of a vector with itself gives us its squared magnitude:
$$| |\mathrm{Ag}\rangle |^{2} = \langle \mathrm{Ag} | \mathrm{Ag} \rangle = |\alpha|^{2} + |\beta|^{2}.$$
The probabilistic interpretation of the state vector $|\mathrm{Ag}\rangle$ requires that this magnitude be equal to 1.
In this interpretation, $|\alpha|^{2}$ and $|\beta|^{2}$ represent the probability of finding the silver atom in the spin up or spin down state, respectively.
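To make the normalization condition concrete, here is a minimal numerical sketch. The amplitudes $\alpha = 3/5$ and $\beta = 4i/5$ are hypothetical values picked only for illustration; they do not come from the experiment. The only library call that matters is NumPy's `vdot`, which conjugates its first argument just as the bra $\langle \varphi |$ does.

```python
import numpy as np

# Hypothetical amplitudes, chosen for illustration only.
alpha, beta = 3 / 5, 4j / 5
ag = np.array([alpha, beta])          # column vector for |Ag>

# np.vdot conjugates its first argument, matching <Ag|Ag>.
norm_sq = np.vdot(ag, ag).real

# Probabilities of finding spin up or spin down.
p_up = abs(alpha) ** 2
p_down = abs(beta) ** 2

print("squared magnitude:", norm_sq)
print("P(up), P(down):", p_up, p_down)
```

The squared magnitude and the sum of the two probabilities are the same number, which is why a normalized state gives probabilities that add up to one.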
The kick experienced by the silver atom, as you might recall, comes from a brief exposure to a force applied by the strongly varying magnetic field inside the Stern-Gerlach detector. The force itself is also proportional to the magnetic moment of the silver atom - and therefore its spin angular momentum. Because the spin angular momentum of a silver atom is always observed as either up or down, there can be only two possible impulses delivered by such a kick: of equal magnitude but opposite in direction.
Let's call the magnitude of the impulse delivered by this kick $\sigma$. Thus, a generic silver atom state $|\mathrm{Ag}\rangle$ receives an impulse of either $+\sigma$ or $-\sigma$, depending upon its state.
Mathematically, a convenient way to represent that fact is with a matrix. Let's call it $\Sigma_{z}$.
$$\Sigma_{z} = \left(\begin{array}{cc} \sigma & 0 \\ 0 & -\sigma \end{array}\right).$$
Notice that when acting on $|\uparrow\;\rangle$, we get $\sigma$ out,
$$\Sigma_{z}\, |\uparrow\;\rangle = \left(\begin{array}{cc} \sigma & 0 \\ 0 & -\sigma \end{array}\right) \left(\begin{array}{c} 1 \\ 0 \end{array}\right) = \sigma \left(\begin{array}{c} 1 \\ 0 \end{array}\right) = \sigma \,|\uparrow\;\rangle.$$
And when acting on $|\downarrow\;\rangle$, we get $-\sigma$:
$$\Sigma_{z}\, |\downarrow\;\rangle = \left(\begin{array}{cc} \sigma & 0 \\ 0 & -\sigma \end{array}\right) \left(\begin{array}{c} 0 \\ 1 \end{array}\right) = -\sigma \left(\begin{array}{c} 0 \\ 1 \end{array}\right) = -\sigma \,|\downarrow\;\rangle.$$
In mathematical terms, the detector in the $\widehat{z}$-direction is \emph{defined} by the matrix $\Sigma_{z}$. Consequently, when we say that the quantum state vectors - as in \eqref{zbasis} - are defined in terms of the detecting devices, this is the precise definition of what we mean. In everything we have to say today, they define a basis of vectors \emph{with respect to the $\widehat{z}$-direction}.
As it turns out, there is also a matrix - or operator - that can be used to represent the detector placed in the $\widehat{x}$ configuration:
$$\Sigma_{x} = \left(\begin{array}{cc} 0 & \sigma \\ \sigma & 0 \end{array}\right).$$
Not being diagonal, it's a little more complicated. But that's the price we pay for using a vector basis in the $\widehat{z}$-direction while asking questions about the $\widehat{x}$-direction. Nevertheless, we can compute:
$$\Sigma_{x} \,|\uparrow\; \rangle = \left(\begin{array}{cc} 0 & \sigma \\ \sigma & 0 \end{array}\right) \left(\begin{array}{c} 1 \\ 0 \end{array}\right) = \sigma \left(\begin{array}{c} 0 \\ 1 \end{array}\right) = \sigma\, |\downarrow\;\rangle. $$
Similarly,
$$\Sigma_{x} \,|\downarrow\; \rangle = \sigma\, |\uparrow\;\rangle.$$
Up and down states flip! So perhaps you can begin to see where we made our error in the last video. But not so fast! The basis vectors $|\uparrow\;\rangle$ and $|\downarrow\;\rangle$ are defined relative to $\Sigma_{z}$, not $\Sigma_{x}$. Quantum mechanics says that we can only physically observe the effect of $\Sigma_{x}$ in the $\widehat{x}$-direction. So what does $\Sigma_{x}\,|\uparrow\;\rangle$ even mean?
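The matrix actions computed in this section are easy to check numerically. The sketch below assumes $\sigma = 1$ in arbitrary units (the text leaves $\sigma$ unspecified), and the variable names are our own.

```python
import numpy as np

sigma = 1.0  # assumed impulse magnitude, arbitrary units

# Basis vectors defined with respect to the z-direction.
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

sigma_z = sigma * np.array([[1.0, 0.0], [0.0, -1.0]])
sigma_x = sigma * np.array([[0.0, 1.0], [1.0, 0.0]])

# Sigma_z only rescales the z-basis vectors by +sigma or -sigma.
print("Sigma_z |up>   =", sigma_z @ up)
print("Sigma_z |down> =", sigma_z @ down)

# Sigma_x swaps the z-basis vectors instead of rescaling them.
print("Sigma_x |up>   =", sigma_x @ up)
print("Sigma_x |down> =", sigma_x @ down)
```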
Observations and Eigenvectors
Let's go back to $\Sigma_{z}$, the matrix that represented the Stern-Gerlach device in the $\widehat{z}$-direction. It had two diagonal elements, one for each impulse given to spin up and down states:
\begin{equation}\label{eigens}\Sigma_{z}\,|\uparrow\;\rangle = \sigma \,|\uparrow\;\rangle,\quad \Sigma_{z}\,|\downarrow\;\rangle = -\sigma \,|\downarrow\;\rangle.\end{equation}
Contrast this with the effect of $\Sigma_{x}$:
$$\Sigma_{x}\,|\uparrow\;\rangle = \sigma \,|\downarrow\;\rangle.$$
The effect on the vectors in \eqref{eigens} is just scalar multiplication. $\Sigma_{x}$, on the other hand, actively changes the state. It may look like a subtle point, but the form that $\Sigma_{z}$ takes in \eqref{eigens} is special. It's specific to the $\widehat{z}$-basis of vectors that we've called $|\uparrow\;\rangle$ and $|\downarrow\;\rangle$. It's the basis in which $\Sigma_{z}$ is diagonal:
$$\Sigma_{z} = \left(\begin{array}{rr} \sigma & 0 \\ 0 & -\sigma \end{array}\right).$$
In general, when a matrix acts on a vector and returns a scaled version of that same vector back - as in \eqref{eigens} - we call that vector an eigenvector. The scalar it returns is the associated eigenvalue.
Given any matrix $M$, the search for those eigenvectors $v$ that behave like:
$$M\cdot v = \lambda v,$$
for some scalar $\lambda$ is called the eigenvalue problem for $M$. For $n\times n$ matrices, there will be at most $n$ linearly independent eigenvectors, each with its own eigenvalue. Some of the eigenvalues might coincide. We will learn how to find the eigenvalues and eigenvectors of a matrix in due course.
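Until we get to solving the eigenvalue problem by hand, a numerical library can do it for us. The sketch below uses NumPy's `numpy.linalg.eigh`, which is meant for symmetric (or Hermitian) matrices. The matrix `M` is a hypothetical example chosen for illustration; it is not one of the Stern-Gerlach operators.

```python
import numpy as np

# Hypothetical symmetric matrix, chosen only for illustration.
M = np.array([[1.0, 2.0],
              [2.0, -2.0]])

# eigh returns the eigenvalues in ascending order; the matching
# eigenvectors are the columns of `vectors`.
values, vectors = np.linalg.eigh(M)

for lam, v in zip(values, vectors.T):
    # Each pair should satisfy M v = lambda v up to rounding error.
    print("eigenvalue:", lam, "eigenvector:", v,
          "check:", np.allclose(M @ v, lam * v))
```

Note that a numerical routine returns normalized eigenvectors, and the overall sign of each one is arbitrary.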
Exercise 1 : Write down any diagonal matrix $M$ in any number of dimensions you like. You should be able to immediately write down every eigenvector for $M$, and know each associated eigenvalue.
Eigenvalues are extremely important in Quantum Mechanics. Here are some fun mathematical facts about Quantum Mechanics.
Fun Math Fact One
Every possible measurement that can be made in Quantum Mechanics is defined to be some operator that acts on the vector space of physical states. If the state vector is finite-dimensional, like our two-dimensional silver atoms $|\mathrm{Ag}\rangle$, then these operators are matrices. To reiterate: observables are operators.
Fun Math Fact Two
For any observable operator $M$, every observation ever made must be an eigenvalue of $M$. To reiterate: eigenvalues are the only possible results of observation.
Fun Math Fact Three
When any measurement is made of an observable operator $M$, the eigenvector associated to the observed eigenvalue is the new, resulting quantum state vector. Sometimes we call such possibilities the eigenstates. To reiterate: Quantum Mechanics changes the physical state to the eigenvector associated with the observed eigenvalue.
Fun Math Fact Four
As far as we can tell, nature picks which eigenvalue is observed at random. The likelihood of a given observation is given by the quantum state vector.
These might seem like obscure or complicated mathematical facts. Perhaps. But you've already studied them at length! If you've been following along, you should have no trouble with the following exercise:
Exercise 2 : Reframe all these fun facts in terms of the quantum state vector $|\mathrm{Ag}\rangle$ and the Stern-Gerlach device set to the $\widehat{z}$-direction, that is, the matrix operator $\Sigma_{z}$.
Measurement is still the weird part about Quantum Mechanics. How a generic state vector $|\mathrm{Ag}\rangle$ converts to either $|\uparrow\;\rangle$ or $|\downarrow\;\rangle$ is not known. That's just how nature works. Why nature works that way is probably obvious$^{1}$: a measurement must observe something, and observing a linear superposition of states simultaneously just doesn't make any sense. Nature picks one so that observations make sense.
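The measurement rules above can be mimicked with a short simulation. This is only a sketch under stated assumptions: $\sigma = 1$, the amplitudes $3/5$ and $4i/5$ are hypothetical, the function name `measure_z` is our own, and a seeded pseudo-random generator stands in for nature's choice.

```python
import numpy as np

SIGMA = 1.0  # assumed impulse magnitude, arbitrary units
UP = np.array([1.0, 0.0], dtype=complex)
DOWN = np.array([0.0, 1.0], dtype=complex)

def measure_z(state, rng):
    """Simulate one Sigma_z measurement.

    Returns the observed impulse (an eigenvalue of Sigma_z) and the
    state after measurement (the matching eigenvector).
    """
    state = state / np.linalg.norm(state)   # enforce <Ag|Ag> = 1
    probs = np.abs(state) ** 2              # [P(up), P(down)]
    outcome = rng.choice(2, p=probs)        # random pick weighted by the state
    if outcome == 0:
        return +SIGMA, UP
    return -SIGMA, DOWN

rng = np.random.default_rng(0)              # seeded for repeatability
ag = np.array([3 / 5, 4j / 5])              # hypothetical amplitudes

impulses = np.array([measure_z(ag, rng)[0] for _ in range(10_000)])
frac_up = np.mean(impulses > 0)
print("fraction deflected up:", frac_up)
```

With many simulated atoms, the fraction deflected upward should land close to $|\alpha|^{2}$, while each individual atom only ever shows $+\sigma$ or $-\sigma$.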
This brings us back to our trouble with $\Sigma_{x}|\uparrow\;\rangle$. The vector is defined with respect to the $\widehat{z}$-direction. It's an eigenvector of $\Sigma_{z}$. You'll never measure a silver atom using $\Sigma_{x}$ and find it in the state $|\uparrow\;\rangle$.
Exercise 3 : Show that the vector, $|\,\sigma_{x} \,\rangle$ defined as
$$|\,\sigma_{x} \,\rangle \propto |\uparrow\;\rangle + |\downarrow\;\rangle,$$
is an eigenvector of $\Sigma_{x}$. What is its eigenvalue? Do the same for
$$|\,\sigma_{-x} \,\rangle \propto |\uparrow\;\rangle - |\downarrow\;\rangle.$$
Normalize both vectors by finding the appropriate constant of proportionality so that
$$\langle \,\sigma_{\pm x} \,| \,\sigma_{\pm x} \,\rangle = 1.$$
Finally, to make sure you've understood everything, let's go back to the two-machine circuit.
Exercise 4 : Start with a silver atom in the quantum state $|\uparrow\;\rangle$, as you would find it just exiting the $\Sigma_{z}$ device. What are the possible measurement values and states you'd find after exposing those atoms to the second device, $\Sigma_{x}$?
If you've got all this, then you are well on your way to understanding Quantum Mechanics. There are still new ideas to learn, but much of the rest of our discussions will involve exploring these same ideas in different contexts.
Revisiting the Three-Machine Circuit
We are now in position to explain the error referenced in the beginning. To that end, we return to the three-machine circuit.
The Stern-Gerlach device measures the spin of a silver atom by projecting it on a screen. Its position on that screen is determined by which trajectory it took: spin up, or spin down.
In our purposefully convoluted example we recombine the up and down beams of silver atoms exiting the second device. We do so in two ways, one with and one without ``clickers''. That is, with and without measuring which way the silver atoms were deflected. More to the point, without the clicks, we are assuming that the second device made no measurement, although the silver atoms were still subjected to that sharply pointed magnetic field.
Let's review the “clickers off” situation using the precise mathematics.
The randomly distributed silver atoms emerge from the oven with arbitrary values of $\alpha$ and $\beta$. The first Stern-Gerlach device measures spin in the $\widehat{z}$-direction. We block the $\widehat{z}$-spin down terminal, so all silver atoms entering the second device are known to be in the state $|\uparrow\;\rangle$. This is our starting point.
The third and final device also measures spin in the $\widehat{z}$-direction. Therefore, the final states must be either $|\uparrow\;\rangle$ or $|\downarrow\;\rangle$. These are the only two possibilities for our final state.
Because we don't measure the output of the second device, the physical state does not need to be in an eigenstate of $\Sigma_{x}$. Our third fun fact applies only when a measurement is made. So the physical state that actually enters our third device is
$$| \,\psi \,\rangle \propto \Sigma_{x}\,|\uparrow\;\rangle.$$
In matrix terms,
$$| \,\psi\rangle \rightsquigarrow \left(\begin{array}{cc} 0 &\sigma \\ \sigma & 0 \end{array}\right)\left(\begin{array}{c}1\\0\end{array}\right) = \left(\begin{array}{c}0 \\ \sigma \end{array}\right).$$
Let's normalize this state so that the sum of probabilities add up to one:
$$\langle \,\psi\, | \,\psi \,\rangle = 1 \Rightarrow |\,\psi \, \rangle \rightsquigarrow \left(\begin{array}{c}0 \\ 1 \end{array}\right).$$
In other words, it just so happens that $|\,\psi\,\rangle$ is given by the spin down state, $|\downarrow\;\rangle$.
Therefore, the state that entered the third device, which is modeled by $\Sigma_{z}$, just so happened to be an eigenvector of $\Sigma_{z}$. So the final outcome was a pure beam of spin down silver atoms. This is contrary to what we had published in our last video. We regret the error.
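The calculation above can be replayed numerically. The sketch follows the prescription used in this essay, namely that the unobserved second device acts on the state as $\Sigma_{x}$, and it assumes $\sigma = 1$ in arbitrary units.

```python
import numpy as np

sigma = 1.0  # assumed impulse magnitude, arbitrary units

sigma_x = sigma * np.array([[0.0, 1.0], [1.0, 0.0]])
up = np.array([1.0, 0.0])            # state leaving the first device

# The second device acts on the state without measuring it.
psi = sigma_x @ up

# Normalize so that the probabilities add up to one.
psi = psi / np.linalg.norm(psi)

# Probabilities seen by the third device, which measures along z.
p_up, p_down = np.abs(psi) ** 2
print("state entering third device:", psi)
print("P(up), P(down):", p_up, p_down)
```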
We reiterate this observation with one final remark. You might perhaps wonder why we didn't look for eigenvectors of the combined operator $\Sigma_{x}\Sigma_{z}$. We cannot simultaneously measure the spin in both the $\widehat{x}$ and $\widehat{z}$ directions. One way to see this is that the two operators do not commute:
$$\Sigma_{x} \Sigma_{z} \neq \Sigma_{z} \Sigma_{x}.$$
This tells us that the order in which the devices are arranged matters.
Exercise 5 : Verify this fact by explicitly computing the matrix $\Sigma_{x}\Sigma_{z} - \Sigma_{z}\Sigma_{x}$.
In this case the second device merely acted on the quantum state, but it didn't observe it, so no eigenstate was selected. Nevertheless, the operator can still change the physical state. The eigenvector ``selection'' rule only applies during measurement.
Here are a couple of final exercises to check our understanding.
Exercise 6 : Rotate the second device in the above setup so it is at an angle $\theta$ with respect to the $\widehat{z}$-direction:
$$\Sigma_{\theta} = \cos\theta \, \Sigma_{z} + \sin\theta\, \Sigma_{x} = \sigma \left(\begin{array}{rr} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{array}\right).$$
Assuming the “clickers” are turned off, what is the output on the final screen? What happens when we turn the “clickers” on? Note that this second part requires a bit of calculation.
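If you would like to check your work on Exercise 6 numerically, the rotated operator is simple to build. This sketch assumes $\sigma = 1$; the helper name `sigma_theta` is our own, and the code only constructs the matrix and confirms its two limiting cases, leaving the exercise itself to you.

```python
import numpy as np

sigma = 1.0  # assumed impulse magnitude, arbitrary units

sigma_z = sigma * np.array([[1.0, 0.0], [0.0, -1.0]])
sigma_x = sigma * np.array([[0.0, 1.0], [1.0, 0.0]])

def sigma_theta(theta):
    """Operator for a device tilted by angle theta away from the z-direction."""
    return np.cos(theta) * sigma_z + np.sin(theta) * sigma_x

# Limiting cases: theta = 0 is the z device, theta = pi/2 is the x device.
print(np.allclose(sigma_theta(0.0), sigma_z))
print(np.allclose(sigma_theta(np.pi / 2), sigma_x))
```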
Exercise 7 : Add a fourth device to this circuit, in between the $\widehat{x}$ and $\widehat{z}$ devices. Let's suppose this device also has a magnet pointing in the $\widehat{z}$ direction. Let's suppose that its beams are also recombined, so that it also doesn't measure which direction the silver atoms were deflected. What is the final output on the detector screen? What happens when we exchange the order of those two, nonmeasuring, middle devices?
$^{1}$: This tone and temper is often associated with the ``shut up and calculate'' school of thought. It can sometimes start an inadvertent barroom brawl if used with folks who have other philosophies, however untested. It is mathematically equivalent to other interpretations, such as the ``Many-Worlds'' interpretation associated with physicist Hugh Everett. There are other ideas that differ mathematically from these. The theoretical space for such novel but fringe ideas continues to shrink as experiments are made. There is no experimental evidence to consider them further, so we shall not.