The code is written in C++11 and inline x86_64 assembly code, with a Python script for code generation. As such, it can be included in C++ programs running on machines with x86_64 CPUs, and is compatible with Linux, Mac OS X, and Windows (via Cygwin).
The main iterator implemented in lifelib is ulife. This uses a backend, which is a function to operate on a 32-by-32 square of cells and return the 16-by-16 centre after applying a cellular automaton, and a container, which dictates how the backend should be invoked in order to simulate an arbitrary pattern. Backends are inherently rule-specific, whereas containers are rule-agnostic. There are two containers which are compatible with ulife, namely the HashLife-based pattern and the more conventional upattern.
The vlife iterator is also supported for the upattern, facilitating a slight performance improvement over ulife for chaotic patterns in 2-state range-1 cellular automata.
lifelib introduces three data structures for storing patterns, namely pattern, upattern, and bitworld. The first two of these are ulife-compatible containers capable of simulation of multi-state patterns; the bitworld can only be used for storage and editing of 2-state patterns.
Patterns and lifetrees
The pattern is the most powerful and useful container, representing a finite set of cells on an infinite plane. Instead of storing its contents within the object, the object merely refers to a node in an external structure known as a lifetree. This means that regular patterns are highly compressed, copying a pattern is a fast O(1) operation, and similar patterns share much of the same space in memory. Patterns can be moved (or coerced) between lifetrees, but this takes time O(L), where L is the length of the macrocell encoding of the pattern.
Patterns are implemented in such a way that they can be used as immutable data structures; running a pattern returns an advanced copy of the object, rather than modifying the original object. They support editing operations such as shifting, rotation/reflection, boolean operations, and Kronecker products. Advancing a pattern uses a HashLife-inspired algorithm, falling back on the ulife iterator for the base case of returning the 16-by-16 centre of a 32-by-32 quartet of leaves. There is a further optimisation for oscillators and spaceships of a known period p: instead of advancing by N generations, it advances the pattern by N modulo p, translating by the appropriate amount if the pattern is a spaceship. This means that a periodic pattern can sensibly be advanced by a negative number of generations, which is particularly useful for building glider streams.
A lifetree is a garbage-collected container which represents a collection of unique 2n-by-2n square tiles ('nodes') as pointers to their children, with a base case of 16-by-16 'leaves' stored directly. If a node is not part of a pattern, and the memory of the lifetree steps over a user-defined threshold, the node will be deleted in a garbage collection step to save memory. Lifetrees do not support concurrency, so taking advantage of multi-core architectures (as in apgsearch) requires using one lifetree per CPU core.
The 'HashLife memory' is stored in the lifetree, so multiple patterns sharing a lifetree will naturally take advantage of each others' memoized computations. This means that after running a pi-heptomino, it will take less time to subsequently run a house.
The upattern is implemented as a collection of overlapping 32-by-32 square tiles, as in Life128 and its relatives. The tiles overlap in such a way that the disjoint centres are either 28-by-28 (if the upattern is designed to run vlife), or 16-by-16 (if the upattern is designed to run ulife). It is slightly faster than the original implementation of vlife, owing to using an std::unordered_map instead of an std::map as its underlying data structure. Unlike the pattern container, a upattern stores its data internally and does not perform any compression. However, it keeps track of which tiles need to be updated, taking advantage of stable and period-2 tiles. Although it is chiefly designed for an infinite plane, it can run a toroidal universe equally quickly.
The bitworld is simply a container for a finitely-supported black-and-white image, with support for basic editing operations and the retrieval of connected clusters of cells. Vectors of bitworlds are also used for converting between the pattern and upattern datastructures, conversion from RLE, and the determination of apgcodes.
Lifelib contains a variant of a lifetree, called a streamtree, which implements a new algorithm called Streamlife. This is based on Hashlife, but is tailored to handle antiparallel streams of gliders and standard spaceships. As such, it can gain a significant speed boost on certain patterns such as the Demonoid and Orthogonoid. The additional calculations mean that Streamlife tends to be slower on other patterns, so the regular lifetree is almost always preferable. Streamlife is only available for Life-like cellular automata.
Comparison of data structures
The salient differences between the data structures are summarised below:
|Data representation||Hashed quadtree||Unordered map||Ordered map|
|Data location||External (in a lifetree)||Internal||Internal|
|Number of layers||Arbitrary||Arbitrary||1|
|Compatible iterators||ulife||ulife, vlife||None|
|Algorithm optimisations||Regular patterns||stable/P2 ash||None|
|Supported topologies||Infinite plane||Torus, infinite plane||Infinite plane|
|Maximum radius||24 000 000 000||32 000 000 000||16 000 000 000|
Backends are created at compile-time by Python modules called genera. These generate C++ and x86_64 assembly code for a given rule, using specialised SIMD instructions to gain a performance advantage. A Python script (called rule3asm.py) decides which of the genera are appropriate for the given rules, calls them, and builds linking code.
There are 17 genera as of version ll2.18. These include the following six 2-state families:
- lifelike: An implementation of arbitrary Life-like cellular automata using SSE2/AVX/AVX2 instruction sets, choosing a Boolean circuit from a precomputed table;
- isotropic: A lookup-table approach for simulating non-B0 isotropic non-totalistic cellular automata;
- ltl: A vectorised implementation of Larger than Life, supporting a maximum range of 7;
- hrot: An implementation of higher range outer-totalistic rules, supporting a maximum range of 5;
- isotropic-hex-ot: An implementation of outer-totalistic hexagonal neighbourhood rules;
- isotropic-hex: An implementation of isotropic non-totalistic hexagonal rules.
Next, there are Generations analogues of the above, with little overhead over the 2-state counterpart, giving another six genera:
- generations (currently does not support B0 rules, unlike the 2-state counterpart);
- isogeny (isotropic generations);
- gltl (Larger than Life + Generations).
- ghrot (Higher-range outer-totalistic + Generations).
- isogeny-hex-ot (hexagonal Generations).
- isogeny-hex (isotropic hexagonal Generations).
Additionally, there are four genera with more than 2 states which do not fit into the Generations category:
- bsfkl: An implementation of Brian Prentice's BSFKL rules.
- deficient: An implementation of 83bismuth38's Deficient rules.
- genext: An implementation of M. I. Wright's Extended Generations rules.
- eightbit: An implementation of custom rule files using Golly formats.
Finally, Tom Rokicki mentioned that Conway's Game of Life can be computed in 19 Boolean operations (amortized), whereas the generic lifelike approach requires 23 logic gates. Rokicki's improvement was subsequently implemented as a separate genus, placed at the top of the list to give priority over the generic lifelike implementation:
- b3s23life (19-operation implementation of Conway's Game of Life).
The algorithms in lifelib are optimised for speed. Comparing like-with-like, it tends to outperform Golly's algorithms in the majority of cases, with a few exceptions. Tom Rokicki compiled a very comprehensive and accurate performance comparison here.
In terms of memory, each lifetree non-leaf node occupies 32 bytes and each leaf node occupies 16 + 32N bytes, where N is the number of bit-layers in the pattern. This is intermediate between 32- and 64-bit Golly in terms of both memory usage and maximum pattern size. 64-bit Golly can theoretically support larger patterns than lifelib, but you would need more than 256 GB of memory in order to take advantage of this.
In addition to running and editing patterns, lifelib features other functionality:
- Fast periodicity detection, which can detect oscillator and spaceship periods below 224 and certain higher periods;
- Determination of apgcodes of patterns in both 2-state and multi-state rules;
- Separation of objects and components of pseudo-objects in non-B0 Life-like cellular automata;
- Subroutines less directly associated with cellular automata, such as calculation of minimum spanning trees, primality testing, and integer factorisation.
- lifelib on Gitlab