Hamiltonian Dynamics in Haskell
by Justin Le ♦
As promised in my hamilton introduction post (published almost exactly one year ago!), I’m going to go over implementing the hamilton library using:
- DataKinds (with TypeLits) to enforce sizes of vectors and matrices and to help guide us as we write our code
- Statically-sized linear algebra with hmatrix
- Automatic differentiation with ad
- Statically-sized vectors with vector-sized
This post will be a bit heavy in some mathematics and Haskell concepts. The expected audience is intermediate Haskell programmers. Note that this is not a post on dependent types, because dependent types (types that depend on runtime values) are not explicitly used.
The mathematics and physics are “extra” flavor text and could potentially be skipped, but you’ll get the most out of this article if you have basic familiarity with:
- Basic concepts of multivariable calculus (like partial and total derivatives).
- Concepts of linear algebra (like dot products, matrix multiplication, and matrix inverses)
No physics knowledge is assumed, but knowing a little bit of first semester physics would help you gain a bit more of an appreciation for the end result!
The hamilton library introduction should be considered a “soft prerequisite” for this post, as it presents motivations, visual demonstrations, and general overviews of the methods presented here!
The Goal
At the end of this, we should be able to have Haskell automatically generate equations of motion for any arbitrary system described in arbitrary coordinate systems, and simulate that system.
Normally, we’d describe a system using particles’ x and y coordinates, but our goal is to be able to describe our particles’ positions using any coordinate system we want (polar, distance-along-a-curved-rail, pendulum-angles, etc.) and have Haskell automatically generate equations of motion and time progressions of those coordinates.
Read my hamilton library introduction for more information and examples!
Hamiltonian Mechanics
As mentioned in the previous post, Hamiltonian mechanics is a re-imagining of dynamics and mechanics (think “the world post-$F = m a$”).
Hamiltonian mechanics lets you parameterize your system’s “position” in arbitrary ways (like the angle of rotation, for pendulum problems) and then posits that the full state of the system exists in something called phase space, and that the system’s dynamics is its motion through phase space that is dictated by the geometry of the Hamiltonian of that phase space.
The system’s Hamiltonian is a $\mathbb{R}^{2n} \rightarrow \mathbb{R}$ function on phase space (taking a point in phase space and returning a scalar), and the dynamics are determined by its geometry: for a time-independent Hamiltonian, the system simply moves along the contour lines of the Hamiltonian – the lines of equal “height”.

In the example above, if we imagine that phase space is the 2D location, then the Hamiltonian is the mountain. And for a system dropped anywhere on the mountain, its motion would be along the contour lines. For example, if a system started somewhere along the 10 contour line, it would travel through phase space along that 10 contour line forever.1
Every sufficiently smooth Hamiltonian on a phase space gives rise to a set of dynamics on that phase space in this way.
The trick, then, to using Hamiltonian dynamics to model your system, is:
1. Finding the phase space to describe your system. This can be done based on any continuous parameterization of your system (“generalized coordinates”), like angles of pendulums and so on.
2. Finding the Hamiltonian on that phase space to describe your system.
And then Hamilton’s dynamics will give you the rest! All you do is “follow the contour lines” on that Hamiltonian!
Phase Space
Hamiltonian dynamics are about systems moving around in phase space. It seems that phase space is the “room where it happens”, so to speak, so let’s dig deeper into what it is. Phase space is a $2n$-dimensional space parameterized by:
- All of the current values of the $n$ parameters (“generalized coordinates”)
- All of the current “generalized momenta” of those $n$ parameters
So if you were parameterizing your pendulum system by, say, the angle of the pendulum, then a point in phase space would be the current angle of the pendulum along with the current “generalized momentum” associated with the angle of the pendulum. What exactly is generalized momentum? We’ll go over calculating it eventually, but what does it represent…physically?
The deeper answer involves the underlying Lie algebra of the Lie group associated with the generalized coordinates, but going into that would make this a completely different post. What I can say is that the generalized momenta associated with (“conjugate to”) certain sets of familiar coordinates yield things that we typically call “momenta”:
- The momentum conjugate to normal Cartesian coordinates is just our normal run-of-the-mill linear momentum (in the $\mathbf{p} = m \mathbf{v}$ sense) from first semester physics.
- The momentum conjugate to the angle $\theta$ in polar coordinates is angular momentum ($m r^2 \dot{\theta}$) from first semester physics.
- The momentum conjugate to the radial coordinate $r$ in polar coordinates is also just boring old linear momentum $m \dot{r}$, which makes sense because purely radial motion is just linear motion.
So, it’s our normal momentum (for linear and polar coordinates) generalized to arbitrary coordinates.
Hamiltonian Dynamics
I’ve explained Hamiltonian dynamics for time-independent Hamiltonians as “follow the contour lines”. If you remember your basic multi-variable calculus course, you’ll know that the direction of “steepest ascent” is the gradient. If we call the Hamiltonian $\mathcal{H}(\mathbf{q}, \mathbf{p})$, then the direction of steepest ascent is:

$$\left\langle \frac{\partial \mathcal{H}}{\partial \mathbf{q}}, \frac{\partial \mathcal{H}}{\partial \mathbf{p}} \right\rangle$$

But we want to move along the contour lines…and these are the lines perpendicular to the direction of steepest ascent. The vector perpendicular to $\langle x, y \rangle$ is $\langle y, -x \rangle$,2 so the direction of motion along the contour lines is:

$$\left\langle \frac{\partial \mathcal{H}}{\partial \mathbf{p}}, -\frac{\partial \mathcal{H}}{\partial \mathbf{q}} \right\rangle$$

This is a conclusion for a system with one generalized coordinate $q$ and one conjugate momentum $p$; in general, for a system with $n$ generalized coordinates, we get Hamilton’s equations of motion, one pair per coordinate:

$$\dot{q}_i = \frac{\partial \mathcal{H}}{\partial p_i} \qquad \dot{p}_i = -\frac{\partial \mathcal{H}}{\partial q_i}$$

Essentially, these give you “updating functions” for $\mathbf{q}$ and $\mathbf{p}$: given the current state of the system (the current $\mathbf{q}$ and $\mathbf{p}$), they tell you the instantaneous rates of change $\dot{\mathbf{q}}$ and $\dot{\mathbf{p}}$, and so how the positions and momenta will evolve in the next moment.
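As a quick concrete example (a standard textbook one, not from the original post): a one-dimensional mass on a spring has the Hamiltonian

$$\mathcal{H}(q, p) = \frac{p^2}{2 m} + \frac{1}{2} k q^2$$

and Hamilton’s equations read

$$\dot{q} = \frac{\partial \mathcal{H}}{\partial p} = \frac{p}{m} \qquad \dot{p} = -\frac{\partial \mathcal{H}}{\partial q} = -k q$$

which are exactly “velocity is momentum divided by mass” and “force is $-k q$”; the resulting trajectories are ellipses in phase space, which are precisely the contour lines of $\mathcal{H}$.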
This picture is appealing to me in a visceral way because it sort of seems like the system is “surfing” along the Hamiltonian’s contour lines. It’s being “pushed” faster when the Hamiltonian is steeper, and slower when it’s more shallow. I can apply all my intuition as a surfer3 to Hamiltonian mechanics!
Hamiltonian Dynamics and Physical Systems
Earlier I mentioned that the two steps for applying Hamiltonian mechanics to your system were figuring out your system’s conjugate momenta and the appropriate Hamiltonian. To explain this, I’m going to make a couple of simplifying assumptions that make the job easier for the purposes of this article:
- Your coordinates and potential energy are time-independent.
- Your potential energy function only depends on positions, and not velocities. (So nothing like friction or wind resistance or magnetic field vector potentials)
With these assumptions, I’m going to skip over discussing the Lagrangian of the system, which is the traditional way to do this. You can think of this section as me presenting derived conclusions and skipping the derivations.
Conjugate Momenta
For systems with velocity-independent potential energies, it can be shown that the momentum conjugate to coordinate $q_i$ is:

$$p_i = \frac{\partial}{\partial \dot{q}_i} KE(\dot{q}_1, \dot{q}_2, \ldots)$$

Where $KE$ is the system’s total kinetic energy, written in terms of the rates of change of the coordinates. For example, for a single particle of mass $m$ moving in one dimension with Cartesian coordinate $x$, the kinetic energy is $\frac{1}{2} m \dot{x}^2$, so the momentum conjugate to $x$ is:

$$p_x = \frac{\partial}{\partial \dot{x}} \left[ \frac{1}{2} m \dot{x}^2 \right] = m \dot{x}$$
Just linear momentum, like I claimed before.
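We can run the same computation in polar coordinates to double-check the claims from earlier (again, standard first-semester results, included here purely as a sanity check): for a single particle of mass $m$, the kinetic energy in polar coordinates is $KE = \frac{1}{2} m (\dot{r}^2 + r^2 \dot{\theta}^2)$, so

$$p_\theta = \frac{\partial KE}{\partial \dot{\theta}} = m r^2 \dot{\theta} \qquad p_r = \frac{\partial KE}{\partial \dot{r}} = m \dot{r}$$

which are exactly angular momentum and plain old radial linear momentum.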
Let’s generalize this to arbitrary coordinates. In general, for Cartesian coordinates, the kinetic energy will always be:

$$KE(\dot{x}_1, \dot{x}_2, \ldots, \dot{x}_m) = \frac{1}{2} \left( m_1 \dot{x}_1^2 + m_2 \dot{x}_2^2 + \ldots + m_m \dot{x}_m^2 \right)$$

Where $m_i$ is the inertia (mass) associated with the $i$-th Cartesian coordinate.

To give us nice notation and make things more convenient, we’ll write this as a quadratic form over an inertia matrix:

$$KE(\dot{\mathbf{x}}) = \frac{1}{2} \dot{\mathbf{x}}^T \hat{M} \dot{\mathbf{x}}$$

Where $\dot{\mathbf{x}}$ is the vector of all of the Cartesian velocities, and $\hat{M}$ is the diagonal matrix whose diagonal entries are the masses $m_1, m_2, \ldots, m_m$.
Now! How to generalize this to arbitrary coordinates? Well, if we have a function $f : \mathbb{R}^n \rightarrow \mathbb{R}^m$ converting our $n$ generalized coordinates $\mathbf{q}$ into the $m$ underlying Cartesian coordinates (so $\mathbf{x} = f(\mathbf{q})$), then the multivariable chain rule tells us how each Cartesian velocity depends on the generalized velocities.

So we can get each $\dot{x}_i$ as a combination of the $\dot{q}_j$:

$$\dot{x}_i = \frac{\partial f_i}{\partial q_1} \dot{q}_1 + \frac{\partial f_i}{\partial q_2} \dot{q}_2 + \ldots + \frac{\partial f_i}{\partial q_n} \dot{q}_n$$

Or, in short:

$$\dot{x}_i = \sum_{j = 1}^n \frac{\partial f_i}{\partial q_j} \dot{q}_j$$

But, hey, this looks a lot like a matrix-vector multiplication! If we make $J_f$, the Jacobian matrix of $f$ (the $m \times n$ matrix of partial derivatives, $\left[ J_f \right]_{ij} = \frac{\partial f_i}{\partial q_j}$), then the whole thing is just:

$$\dot{\mathbf{x}} = J_f \dot{\mathbf{q}}$$

And we can plug it in (remembering that $(A \mathbf{v})^T = \mathbf{v}^T A^T$) to our kinetic energy equation, to get the kinetic energy in terms of the generalized coordinates and velocities:

$$KE(\mathbf{q}, \dot{\mathbf{q}}) = \frac{1}{2} \dot{\mathbf{q}}^T J_f^T \hat{M} J_f \dot{\mathbf{q}}$$

And for the final step, we differentiate with respect to the $\dot{q}_i$ to get $\mathbf{p}$, the vector of conjugate momenta:

$$\mathbf{p}(\mathbf{q}, \dot{\mathbf{q}}) = J_f^T \hat{M} J_f \dot{\mathbf{q}}$$
Now, we’re going to be using the matrix $J_f^T \hat{M} J_f$ a lot, so let’s give it a name, $\hat{K}$ (keeping in mind that it is really a function of $\mathbf{q}$, since $J_f$ depends on $\mathbf{q}$):

$$\mathbf{p} = \hat{K} \dot{\mathbf{q}}, \qquad \hat{K} = J_f^T \hat{M} J_f$$

It’s going to be important for us to also be able to go backwards (to get $\dot{\mathbf{q}}$ from $\mathbf{p}$). Because we wrote the relationship as a matrix-vector multiplication, going backwards is just a matrix inversion:4

$$\dot{\mathbf{q}}(\mathbf{q}, \mathbf{p}) = \hat{K}^{-1} \mathbf{p}$$
The power of linear algebra!
Hamiltonians of Physical Systems
Ok, that’s step one. How about step two – finding the Hamiltonian for your system?
The real Hamiltonian is actually defined in terms of the Legendre transform of the system’s Lagrangian,5 but I did some of the work for you for the case of time-independent coordinates where the potential energy depends only on positions (so, no friction, wind resistance, time, etc.). In such a case, the Hamiltonian of a system is precisely the system’s total mechanical energy, or its kinetic energy plus the potential energy:

$$\mathcal{H}(\mathbf{q}, \mathbf{p}) = KE(\mathbf{q}, \mathbf{p}) + U(\mathbf{q})$$
Which makes a lot of intuitive sense, because you might recall that total mechanical energy is always conserved for certain types of systems. Incidentally, Hamiltonian dynamics makes sure that the value of the system’s Hamiltonian stays the same (because it moves along contour lines). So, the system’s Hamiltonian always stays the same, and so its total mechanical energy stays the same, as well! Energy is conserved because the Hamiltonian stays the same!
Anyway, we want to build our system’s Hamiltonian from properties of the coordinate system, so plugging in our expression for $KE$ in terms of the generalized velocities:

$$\mathcal{H}(\mathbf{q}, \dot{\mathbf{q}}) = \frac{1}{2} \dot{\mathbf{q}}^T \hat{K} \dot{\mathbf{q}} + U(\mathbf{q})$$

Oh, but oops, the Hamiltonian has to be a function of the conjugate momenta $\mathbf{p}$, not of the velocities $\dot{\mathbf{q}}$. Luckily, we just saw how to go backwards: substituting $\dot{\mathbf{q}} = \hat{K}^{-1} \mathbf{p}$ (and remembering that $\hat{K}^{-1}$ is symmetric) gives us our final Hamiltonian:

$$\mathcal{H}(\mathbf{q}, \mathbf{p}) = \frac{1}{2} \mathbf{p}^T \hat{K}^{-1} \mathbf{p} + U(\mathbf{q})$$
Hamiltonian Equations
We got our Hamiltonian! Now just to find our updating functions (the partial derivatives of the Hamiltonian), and we’re done with the math.
Because we are assuming the case (with loss of generality) where the potential energy depends only on positions, the first updating function is the easy one – differentiating $\mathcal{H}$ with respect to $p_i$ just gives back the velocities we already know how to compute:

$$\frac{\partial \mathcal{H}}{\partial p_i} = \left[ \hat{K}^{-1} \mathbf{p} \right]_i$$

We already can calculate everything in that expression ($\hat{K}^{-1}$ comes from the masses and the Jacobian of $f$). The derivative with respect to $q_i$ is trickier, because $\hat{K}^{-1}$ is itself a function of $\mathbf{q}$, so differentiating it requires knowing how the Jacobian $J_f$ changes with each coordinate.

The collection of “second-order derivatives of $f$” that we need is the Hessian tensor of $f$: for each coordinate $q_i$, the matrix $\frac{\partial J_f}{\partial q_i}$ describing how the Jacobian changes as $q_i$ changes.

If we use the matrix identity $\frac{\partial}{\partial q_i} \hat{K}^{-1} = -\hat{K}^{-1} \frac{\partial \hat{K}}{\partial q_i} \hat{K}^{-1}$ (and the symmetry of $\hat{K}^{-1}$), the kinetic term works out to:

$$\frac{\partial}{\partial q_i} \left[ \frac{1}{2} \mathbf{p}^T \hat{K}^{-1} \mathbf{p} \right] = -\mathbf{p}^T \hat{K}^{-1} J_f^T \hat{M} \frac{\partial J_f}{\partial q_i} \hat{K}^{-1} \mathbf{p}$$

And with that, we have our final expression for the derivative of the Hamiltonian with respect to the positions:

$$\frac{\partial \mathcal{H}}{\partial q_i} = -\mathbf{p}^T \left[ J_f^T \hat{M} J_f \right]^{-1} J_f^T \hat{M} \frac{\partial J_f}{\partial q_i} \left[ J_f^T \hat{M} J_f \right]^{-1} \mathbf{p} + \left[ \nabla_{\mathbf{q}} U(\mathbf{q}) \right]_i$$

Or, to use our abuse of notation:

$$\frac{\partial \mathcal{H}}{\partial q_i} = -\mathbf{p}^T \hat{K}^{-1} J_f^T \hat{M} \frac{\partial J_f}{\partial q_i} \hat{K}^{-1} \mathbf{p} + \left[ \nabla_{\mathbf{q}} U(\mathbf{q}) \right]_i$$
And, finally, we have everything we need – we can now construct our equations of motion! To progress through phase space ($\mathbf{q}(t)$ and $\mathbf{p}(t)$), we just use Hamilton’s equations with these two partial derivatives: $\dot{q}_i = \frac{\partial \mathcal{H}}{\partial p_i}$ and $\dot{p}_i = -\frac{\partial \mathcal{H}}{\partial q_i}$.
That’s it. We’re done. Have a nice day, thanks for reading!
The Haskell
Just kidding, now it’s time for the fun stuff :)
Our final goal is to be able to simulate a system of discrete particles through arbitrary generalized coordinates.
To simplify the math, we always assume that, whatever generalized coordinates you are using (in $\mathbb{R}^n$), your system “actually” exists in some underlying Cartesian coordinate system (in $\mathbb{R}^m$), and that there is some function $f : \mathbb{R}^n \rightarrow \mathbb{R}^m$ converting the generalized coordinates into the underlying Cartesian ones.
So, in order to fully describe the system, we need:
- Each of the particles’ masses (or inertias) in their underlying Cartesian coordinates, which we’ll call $\mathbf{m}$.
- A function $f : \mathbb{R}^n \rightarrow \mathbb{R}^m$ to convert the generalized coordinates ($\mathbf{q}$) to Cartesian coordinates ($\mathbf{x}$)
- The potential energy function $U(\mathbf{q})$ in the generalized coordinates ($\mathbf{q}$)
From these alone, we can derive the equations of motion for the particles in phase space as a system of first-order ODEs using the process described above. Then, given an initial phase space position, we can do numeric integration to simulate our system’s motion through phase space. To “surf the Hamiltonian waves in phase space”, so to speak.
But, to be explicit, we also are going to need some derivatives of these functions/vectors, too. If you’ve been following along, the full enumeration of functions and vectors we need is:
- $\mathbf{m}$, the vector of masses/inertias
- $f$, the coordinate conversion function
- $J_f$, the Jacobian of $f$
- $\frac{\partial J_f}{\partial q_i}$ for each coordinate (the “Hessian tensor” of $f$)
- $U$, the potential energy function
- $\nabla_{\mathbf{q}} U$, the gradient of the potential energy
But, as we’ll see, with libraries like ad in Haskell, we can really just ask the user for $\mathbf{m}$, $f$, and $U$, and derive all of the rest automatically!
Our Data Structures
We can couple together all of these functions in a data type that fully describes the physics of our systems (the “shape” of the Hamiltonian):
data System m n = System
    { sysInertia       :: R m                          -- ^ 'm' vector
    , sysCoords        :: R n -> R m                   -- ^ f
    , sysJacobian      :: R n -> L m n                 -- ^ J_f
    , sysHessian       :: R n -> V.Vector n (L m n)    -- ^ H_f
    , sysPotential     :: R n -> Double                -- ^ U
    , sysPotentialGrad :: R n -> R n                   -- ^ grad U
    }
R n and L m n are from the hmatrix library; an R n represents an n-vector (for example, an R 4 is a 4-vector), and an L m n represents an m x n matrix (for example, an L 5 3 is a 5x3 matrix).

A System m n will describe a system parameterized by n generalized coordinates, taking place in an underlying m-dimensional Cartesian space.
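Here’s a tiny sketch (my own aside, assuming the usual Numeric.LinearAlgebra.Static API of vec2, matrix, and #>) of what working with these statically sized types feels like:

```haskell
{-# LANGUAGE DataKinds #-}

import Numeric.LinearAlgebra.Static

v :: R 2
v = vec2 1 3                -- a 2-vector

a :: L 3 2
a = matrix [ 1, 0
           , 0, 1
           , 2, 5 ]         -- a 3x2 matrix, filled row-major

w :: R 3
w = a #> v                  -- matrix-vector product: a 3x2 matrix applied to a 2-vector gives a 3-vector
```

Something like a #> w, on the other hand, simply does not compile: a 3x2 matrix can’t be applied to a 3-vector, and the dimension mismatch is caught before the program ever runs.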
It’ll also be convenient to have a data type to describe the state of our system in terms of its generalized positions ($\mathbf{q}$) and generalized velocities (the rates of change $\dot{\mathbf{q}}$), which together are sometimes called the system’s “configuration”:
data Config n = Config
    { confPositions  :: R n
    , confVelocities :: R n
    }
  deriving Show
And, more importantly, remember that Hamiltonian dynamics is all about surfing around on that phase space (generalized positions $\mathbf{q}$ and conjugate momenta $\mathbf{p}$), so let’s have a data type for points in phase space, too:
data Phase n = Phase
    { phasePositions :: R n
    , phaseMomenta   :: R n
    }
  deriving Show
Getting comfortable with our data types
First of all, assuming we can construct a System
in a sound way, let’s imagine some useful functions.
We can write a function underlyingPosition
, which allows you to give a position in generalized coordinates, and returns the position in the “underlying coordinate system”:
underlyingPosition :: System m n
                   -> R n
                   -> R m
underlyingPosition = sysCoords
Note that the types in our function help us know exactly what the function is doing — and also help us implement it correctly. If we have a System
in n
dimensions, over an underlying m
-dimensional Cartesian space, then we would need to convert an R n
(an n-dimensional vector of all of the positions) into an R m
(a vector in the underlying Cartesian space).
Simple enough, but let’s maybe try to calculate something more complicated: the momenta of a system, given its positions and velocities (configuration).
We remember that we have a nice formula for that, up above:

$$\mathbf{p}(\mathbf{q}, \dot{\mathbf{q}}) = J_f^T \hat{M} J_f \dot{\mathbf{q}}$$
We can translate that directly into Haskell code:
momenta :: (KnownNat n, KnownNat m)
        => System m n
        -> Config n
        -> R n
momenta s (Config q v) = tr j #> mHat #> j #> v
  where
    j    = sysJacobian s q
    mHat = diag (sysInertia s)
Note that, because our vectors have their size indexed in their type, this is pretty simple to write and ensure that the shapes “line up”. In fact, GHC can even help you write this function by telling you what values can go in what locations. Being able to get rid of a large class of bugs and clean up your implementation space is nice, too!
(Note that hmatrix requires a KnownNat
constraint on the size parameters of our vectors for some functions, so we add this as a constraint on our end.)
With this, we can write a function to convert any state in configuration space to its coordinates in phase space:
toPhase :: (KnownNat n, KnownNat m)
        => System m n
        -> Config n
        -> Phase n
toPhase s c = Phase (confPositions c) (momenta s c)
This function is important, because “configuration space” is how we actually directly observe our system – in terms of positions and velocities, and not in terms of positions and momenta (and sometimes conjugate momenta might not even have meaningful physical interpretations). So, having toPhase
lets us “initialize” our system in terms of direct observables, and then convert it to its phase space representation, which is something that Hamiltonian mechanics can work with.
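Going the other way, from phase space back to configuration space, works too, if we remember that $\dot{\mathbf{q}} = \hat{K}^{-1} \mathbf{p}$. Here’s a sketch of what that could look like (this particular helper isn’t part of the post’s code, just an illustration built from the same pieces):

```haskell
velocities :: (KnownNat m, KnownNat n)
           => System m n
           -> Phase n
           -> R n
velocities s (Phase q p) = inv kHat #> p      -- q' = K^-1 p
  where
    j    = sysJacobian s q
    mHat = diag (sysInertia s)
    kHat = tr j `mul` mHat `mul` j            -- K = J_f^T M J_f

fromPhase :: (KnownNat m, KnownNat n)
          => System m n
          -> Phase n
          -> Config n
fromPhase s ph = Config (phasePositions ph) (velocities s ph)
```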
Automatic Differentiation
Now, creating a System
“from scratch” is not much fun, because you would have to manually differentiate your coordinate systems and potentials to generate your Jacobians and gradients.
Here’s where the magic comes in – we can have Haskell generate our Jacobians and gradients automatically, using the amazing ad library! We can just use the appropriately named grad
, jacobian
, and hessianF
functions.
Quick Intro to AD
At the simplest level, if we have a function from some number to some other number, we can use diff
to get its derivative:
myFunc      :: RealFloat a => a -> a
diff myFunc :: RealFloat a => a -> a
If we have a function from a sized vector to a scalar, we can use grad
to get its gradient:
-- import qualified Data.Vector.Sized as V
myFunc      :: RealFloat a => V.Vector n a -> a
grad myFunc :: RealFloat a => V.Vector n a -> V.Vector n a
Where each of the components in the resulting vector corresponds to the rate of change of the output with respect to variations in that component of the input.
We’re using the statically sized vector type from the vector-sized package (in the Data.Vector.Sized module), where V.Vector n a is an n-vector of as – for example, a V.Vector 3 Double is a vector of 3 Doubles.

We have to use Vector (instead of R, from hmatrix) because automatic differentiation for gradients requires some Functor to work. An R 5 is essentially a V.Vector 5 Double, except the latter can contain other, non-Double things – and therefore can be used by ad to do its magic.
If we have a function from a sized vector to a (differently) sized vector, we can use the jacobian
function to get its jacobian!
myFunc          :: RealFloat a => V.Vector n a -> V.Vector m a
jacobian myFunc :: RealFloat a => V.Vector n a -> V.Vector m (V.Vector n a)
Again note the usage of sized vector types, and the fact that the resulting Jacobian comes out as an m-vector of n-vectors: one gradient (one row of the $m \times n$ Jacobian matrix) for each of the m components of the output.
Finally, we can get our Hessian Tensor by using hessianF
:6
myFunc :: RealFloat a => V.Vector n a -> V.Vector m a
hessianF myFunc :: RealFloat a => V.Vector n a -> V.Vector m (V.Vector n (V.Vector n a))
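To make that concrete, here’s a small stand-alone example of using grad and jacobian (my own illustration; sqNorm and pairFunc are made-up example functions, not anything from the post):

```haskell
{-# LANGUAGE DataKinds #-}

import           Numeric.AD (grad, jacobian)
import qualified Data.Vector.Sized as V

-- f(x, y, z) = x^2 + y^2 + z^2
sqNorm :: RealFloat a => V.Vector 3 a -> a
sqNorm = sum . fmap (^ (2 :: Int))

-- its gradient is <2x, 2y, 2z>; for example,
-- sqNormGrad (V.fromTuple (1, 2, 3)) should give <2, 4, 6>
sqNormGrad :: V.Vector 3 Double -> V.Vector 3 Double
sqNormGrad = grad sqNorm

-- g(x, y) = <x * y, x + y>, a vector-valued function
pairFunc :: RealFloat a => V.Vector 2 a -> V.Vector 2 a
pairFunc v = V.fromTuple (x * y, x + y)
  where
    x = v `V.index` 0
    y = v `V.index` 1

-- its Jacobian, as an (output) 2-vector of (input) 2-vectors:
-- at <x, y> it should be <<y, x>, <1, 1>>
pairFuncJac :: V.Vector 2 Double -> V.Vector 2 (V.Vector 2 Double)
pairFuncJac = jacobian pairFunc
```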
Conversion between vector-sized and hmatrix
Just a small hiccup — the ad library requires our vectors to be Functors, but R
and L
from hmatrix are not your typical capital-F Functor
instances in Haskell. We just need to do some manual conversion using the hmatrix-vector-sized library.
This gives functions like:
-- import qualified Data.Vector.Storable.Sized as VS
gvecR :: V.Vector n Double -> R n
grVec :: R n -> V.Vector n Double
rowsL :: V.Vector m (R n) -> L m n
lRows :: L m n -> V.Vector m (R n)
to allow us to convert back and forth.
Also, even though ad gives our Hessian as an m-vector of n x n matrices (the Hessian of each of the m output coordinates), what our System wants for sysHessian is an n-vector of m x n matrices: the derivative of the Jacobian, $\frac{\partial J_f}{\partial q_i}$, for each of the n coordinates. The two contain exactly the same numbers, just sliced up differently, so we need a function to shuffle the indices around:
tr2 :: (KnownNat m, KnownNat n)
    => V.Vector m (L n n)
    -> V.Vector n (L m n)
tr2 = fmap rowsL . traverse lRows
{-# INLINE tr2 #-}
We’ll also need a function converting a vector of vectors into a matrix:
vec2l :: KnownNat n
      => V.Vector m (V.Vector n Double)
      -> L m n
vec2l = rowsL . fmap gvecR
{-# INLINE vec2l #-}
Using AD to Auto-Derive Systems
Now to make a System
using just the mass vector, the coordinate conversion function, and the potential energy function:
mkSystem :: (KnownNat m, KnownNat n)
         => R m
         -> (forall a. RealFloat a => V.Vector n a -> V.Vector m a)
         -> (forall a. RealFloat a => V.Vector n a -> a)
         -> System m n
mkSystem m f u = System
                     -- < convert from   | actual thing | convert to >
    { sysInertia       =                   m
    , sysCoords        = gvecR            . f            . grVec
    , sysJacobian      = tr . vec2l       . jacobianT f  . grVec
    , sysHessian       = tr2 . fmap vec2l . hessianF f   . grVec
    , sysPotential     =                   u             . grVec
    , sysPotentialGrad = gvecR            . grad u       . grVec
    }
Now, I hesitate to call this “trivial”…but, I think it really is a straightforward direct translation of the definitions, minus some boilerplate conversions back and forth between vector types using gvecR, grVec, etc.!
- The vector of masses is just m
- The coordinate function is just f
- The Jacobian of the coordinate function is just jacobian f
- The Hessian Tensor of the coordinate function is just hessianF f
- The potential energy function is just u
- The gradient of the potential energy function is just grad u
The ad library automatically generated all of these for us and created a perfectly well-formed System
with all of its gradients and Jacobians and Hessians, given only the coordinate function and the potential energy function – and in such a clean and concise way!
Equations of Motion
At this point, we’re ready to write our final equations of motion, which we found to be given by:

$$\dot{q}_i = \left[ \hat{K}^{-1} \mathbf{p} \right]_i$$

$$\dot{p}_i = \mathbf{p}^T \hat{K}^{-1} J_f^T \hat{M} \frac{\partial J_f}{\partial q_i} \hat{K}^{-1} \mathbf{p} - \left[ \nabla_{\mathbf{q}} U(\mathbf{q}) \right]_i$$

These equations aren’t particularly beautiful, but it’s straightforward to translate them into Haskell (using tr for transposes, mul for matrix-matrix products, #> for matrix-vector products, <.> for dot products, diag for building diagonal matrices, and inv for the matrix inverse, all from hmatrix):
hamilEqns :: (KnownNat n, KnownNat m)
          => System m n
          -> Phase n
          -> (R n, R n)    -- dq/dt and dp/dt
hamilEqns s (Phase q p) = (dqdt, dpdt)
  where
    j       = sysJacobian s q
    trj     = tr j
    mHat    = diag (sysInertia s)
    kHat    = trj `mul` mHat `mul` j
    kHatInv = inv kHat
    dqdt    = kHatInv #> p
    dpdt    = gvecR bigUglyThing - sysPotentialGrad s q
      where
        bigUglyThing =
          fmap (\j2 -> p <.> kHatInv #> trj #> mHat #> j2 #> kHatInv #> p)
               (sysHessian s q)
Of course, there is no way to get around the big ugly math term in $\dot{p}_i$; it shows up in our Haskell as the aptly named bigUglyThing.
But!! I’d much rather write this scary Haskell than that scary math, because GHC typechecks our math! When writing out those equations, we really had no idea if we were writing them correctly, or whether the matrix and vector and tensor dimensions lined up, or if it even made sense to multiply and transpose the quantities we had.
However, when writing hamilEqns
, we let GHC hold our hand. If any of our math is wrong, GHC will catch it for us: if any dimensions don’t match up, or any transpositions don’t make sense, we’ll know immediately. And if we’re ever lost, we can leave a typed hole – then GHC will tell you all of the values in scope that can fit in that hole! Even if you don’t completely understand the math, this helps you implement it in a somewhat confident way.
It’s admittedly difficult to convey how helpful these sized vector types are without working through trying to implement them yourself, so feel free to give it a try when you get the chance! :D
Numerical Integration
The result of hamilEqns
gives the rate of change of the components of our Phase n. The rest of the process, then, is just to “step” Phase n: gradually update it, following these rates of change!
This process is known as numerical integration. Choosing the “best” way to do it is a big field of study in its own right, so for this article we’re going to be using the extremely extremely simple Euler method to progress our system through time.
Disclaimer – The Euler method is typically a very very bad choice for numerical integration (even though, as popularized in the movie Hidden Figures, it was good enough to send humans to space?). We are just choosing it for this article because it’s very simple, conceptually!
The basic idea is that you pick a small time-step, $\Delta t$, and update your state by following its rate of change for that long:

$$x(t + \Delta t) = x(t) + \Delta t \, \dot{x}(t)$$

Which makes sense visually if we imagine $\dot{x}$ as the “slope” of $x$ at time $t$: we just follow the tangent line for a short step.

You can understand this symbolically, as well, by remembering that the derivative can be approximated by $\dot{x}(t) \approx \frac{x(t + \Delta t) - x(t)}{\Delta t}$ for small $\Delta t$, and then rearranging that approximation.
We can directly translate this into Haskell: (using konst :: KnownNat n => Double -> R n
, making a constant vector, and *
, the component-wise product of two vectors)
stepEuler :: (KnownNat n, KnownNat m)
          => System m n    -- ^ the system
          -> Double        -- ^ dt
          -> Phase n       -- ^ q(t) and p(t)
          -> Phase n       -- ^ q(t + dt) and p(t + dt)
stepEuler s dt ph@(Phase q p) = Phase (q + konst dt * dq) (p + konst dt * dp)
  where
    (dq, dp) = hamilEqns s ph
And repeatedly evolve this system as a lazy list:
runSystem :: (KnownNat n, KnownNat m)
          => System m n    -- ^ the system
          -> Double        -- ^ dt
          -> Phase n       -- ^ initial phase
          -> [Phase n]     -- ^ progression of the system using Euler integration
runSystem s dt = go
  where
    go p0 = p0 : go (stepEuler s dt p0)
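(Incidentally, that go helper is just Prelude’s iterate in disguise; we could equally well have written it as:)

```haskell
runSystem' :: (KnownNat n, KnownNat m)
           => System m n -> Double -> Phase n -> [Phase n]
runSystem' s dt = iterate (stepEuler s dt)
```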
Running with it
And…that’s it! Granted, in real life, we would be using a less naive integration method, but this is essentially the entire process!
Let’s generate the boring system, a 5kg particle in 2D Cartesian Coordinates under gravity –
simpleSystem :: System 2 2
simpleSystem = mkSystem (vec2 5 5) id pot
  where
    -- potential energy of a gravity field
    -- U(x,y) = 9.8 * y
    pot :: RealFloat a => V.Vector 2 a -> a
    pot xy = 9.8 * (xy `V.index` 1)
If we initialize the particle at position $\langle 0, 0 \rangle$ with velocity $\langle 1, 3 \rangle$ (up and to the right), we expect it to arc like a thrown ball.
We can make our initial configuration:
simpleConfig0 :: Config 2
simpleConfig0 = Config
    { confPositions  = vec2 0 0
    , confVelocities = vec2 1 3
    }
And then…let it run!
simpleMain :: IO ()
simpleMain =
    mapM_ (disp 2 . phasePositions)   -- position with 2 digits of precision
  . take 25                           -- 25 steps
  $ runSystem simpleSystem 0.1 (toPhase simpleSystem simpleConfig0)
We get:
ghci> :l Hamilton.hs
ghci> simpleMain
0 0
0.10 0.30
0.20 0.58
0.30 0.84
0.40 1.08
0.50 1.30
0.60 1.51
0.70 1.69
0.80 1.85
0.90 1.99
1.00 2.12
1.10 2.22
1.20 2.31
1.30 2.37
1.40 2.42
1.50 2.44
1.60 2.45
1.70 2.43
1.80 2.40
1.90 2.35
2.00 2.28
2.10 2.18
2.20 2.07
2.30 1.94
2.40 1.79
Exactly what we’d expect! The x
positions increase steadily, and the y
positions increase, slow down, and start decreasing.
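As an extra sanity check (my own aside, not from the original post): this particular system is simple enough to solve by hand. The potential $U(x, y) = 9.8 y$ exerts a constant force of $-9.8$ in the $y$ direction on our 5 kg particle, so $a_y = -9.8 / 5 = -1.96$, and the exact trajectory from our initial conditions is

$$x(t) = t \qquad y(t) = 3 t - 0.98\, t^2$$

At $t = 1.0$ this gives $(1.00, 2.02)$, versus the Euler estimate of roughly $(1.00, 2.12)$ above; the small overshoot is exactly the kind of drift we signed up for by using the Euler method.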
We can try a slightly more complicated example that validates (and justifies) all of the work we’ve done – let’s simulate a simple pendulum. The state of a pendulum is characterized by one coordinate $\theta$, the angle the pendulum makes with the vertical; for a pendulum of length $l$, the bob’s underlying Cartesian coordinates are $\langle x, y \rangle = \langle -l \sin \theta, -l \cos \theta \rangle$.
Let’s set up that system! We’ll put it under the normal gravity potential, again ($U(x, y) = 9.8 y$), and use $\theta$ as our single generalized coordinate:
pendulum :: System 2 1
pendulum = mkSystem (vec2 5 5) coords pot        -- 5kg particle
  where
    -- <x,y> = <-0.25 sin(theta), -0.25 cos(theta)>
    -- pendulum of length 0.25
    coords :: RealFloat a => V.Vector 1 a -> V.Vector 2 a
    coords (V.head->theta) = V.fromTuple (- 0.25 * sin theta, - 0.25 * cos theta)
    -- potential energy of gravity field
    -- U(x,y) = 9.8 * y
    pot :: RealFloat a => V.Vector 1 a -> a
    pot q = 9.8 * (coords q `V.index` 1)
pendulumConfig0 :: Config 1
pendulumConfig0 = Config
    { confPositions  = 0
    , confVelocities = 0.1
    }
pendulumMain :: IO ()
pendulumMain =
    mapM_ (disp 3 . phasePositions)   -- position with 3 digits of precision
  . take 25                           -- 25 steps
  $ runSystem pendulum 0.1 (toPhase pendulum pendulumConfig0)
This pendulum should wobble back and forth, ever so slightly, around equilibrium.
ghci> :l Hamilton.hs
ghci> pendulumMain
0
0.010
0.020
0.029
0.037
0.042
0.045
0.044
0.040
0.032
0.021
0.007
-0.008
-0.023
-0.038
-0.051
-0.061
-0.068
-0.069
-0.065
-0.056
-0.041
-0.022
-0.000
0.023
We see our $\theta$ coordinate swing up to one side, come back down through zero, and swing out to the other side, just like a real pendulum.7
We automatically generated equations of motion for a pendulum. Sweet!
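One more sanity check of my own (not from the original post): for small angles, this pendulum is approximately a harmonic oscillator. Linearizing the restoring torque from $U(\theta) = -9.8 \cdot 0.25 \cos \theta$ against the rotational inertia $m l^2 = 5 \cdot 0.25^2$ gives

$$\omega = \sqrt{\frac{9.8 \cdot 0.25}{5 \cdot 0.25^2}} = 2.8 \text{ rad/s} \qquad T = \frac{2 \pi}{\omega} \approx 2.2 \text{ s}$$

and, indeed, the simulated angle takes roughly 23 steps of $0.1$ s to come back around to zero. (The slowly growing amplitude is, again, the Euler method at work.)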
Wrap-Up
We traveled through the world of physics, math, Haskell, and back again to achieve something that would have initially seemed like a crazy thought experiment. But, utilizing Hamiltonian mechanics, we have a system that can automatically generate equations of motion given your coordinate system and a potential energy function. We also learned how to leverage typed vectors for more correct code and a smoother development process.
See my previous post for even crazier examples – involving multiple objects, double pendulums, and more. And check out my hamilton library on hackage, which includes demos for exotic interesting systems, rendered graphically on your terminal.
I realize that this was a lot, so if you have any questions or suggestions for clarifications, feel free to leave a comment, drop me a tweet, or find me on the freenode #haskell channel (where I usually idle as jle`!)
1. The picture with a time-dependent Hamiltonian is different, but only slightly. In the time-dependent case, the system still tries to move along contour lines at every point in time, but the mountain is constantly changing underneath it and the contour lines keep on shifting underneath it. Sounds like life!
2. There’s also another perpendicular vector, $\langle -y, x \rangle$, which actually gives motion backwards in time.
3. Disclaimer: I am not a surfer.
4. $J_f^T \hat{M} J_f$ is full-rank (meaning it is invertible) if the columns of $J_f$ are linearly independent. This should be the case, as you don’t have any redundant or duplicate coordinates in your generalized coordinate system.
5. Thanks to Edward Kmett for pointing this out!
6. hessian computes the Hessian Matrix for a scalar-valued function, but here, we have a vector-valued function, so we need hessianF, the Hessian Tensor.
7. Clearly our system is gaining some sort of phantom energy, since it rises up to 0.045 on the left, and then all the way up to 0.069 on the right. Rest assured that this is simply from the inaccuracies in Euler’s Method.