This is a followup to my previous post on the stack monoid, but is intended to be self-contained.

Motivation: tree structured data

GPUs are well known for being efficient on array-structured data, where it is possible to operate on the elements of the array in parallel. That last restriction doesn’t mean that the operations have to be completely independent; it’s also well known that GPUs are good