apgcode

From LifeWiki

apgsearch and its search results database, Catagolue, use apgcodes to classify and denote patterns.

Encoding objects

apgcodes consist of a prefix and a suffix, separated by an underscore; both the prefix and the suffix are alphanumeric strings. The prefix further consists of a two-character type and a number. The types are:

  • xs for a still life;
  • xp for an oscillator;
  • xq for a spaceship;
  • yl for a linear-growth pattern.

The number following the type represents the population for a still life, or otherwise the period of the object.

In codes for xs, xp, and xq, the suffix following the underscore is a representation of the object in Extended Wechsler Format (see below); in codes for yl, the suffix consists of additional information encoding higher-order periods of the pattern.
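The prefix/suffix structure described above can be illustrated with a short Python sketch. The function name split_apgcode is hypothetical, and the choice to split at the first underscore is an assumption made for illustration; it is not taken from apgsearch or Catagolue.

```python
import re

# Hypothetical helper for illustration only; not part of apgsearch or Catagolue.
_PREFIX_RE = re.compile(r"(xs|xp|xq|yl)([0-9]+)")

def split_apgcode(code):
    """Split an apgcode into (type, number, suffix).

    Assumption: the prefix ends at the first underscore, and everything
    after that underscore is treated as the suffix.
    """
    prefix, separator, suffix = code.partition("_")
    match = _PREFIX_RE.fullmatch(prefix)
    if not separator or match is None:
        raise ValueError(f"not an xs/xp/xq/yl apgcode: {code!r}")
    return match.group(1), int(match.group(2)), suffix

print(split_apgcode("xq4_27deee6"))
```

Codes of other kinds, such as zz_LINEAR, are deliberately rejected by this sketch, since their prefix carries no number.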

Unencodable objects

In their original form, apgcodes are not guaranteed to encode still lifes, oscillators and spaceships exceeding a 40×40 bounding box.[note 1] These are instead reported by apgsearch and censused by Catagolue as oversized patterns, using the following types:

  • ov_s for an oversized still life;
  • ov_p for an oversized oscillator;
  • ov_q for an oversized spaceship.

These types are then followed by a number representing the population for a still life, or otherwise the period of the object, as in the non-oversized case. apgcodes of this form do not uniquely identify a specific object, but rather represent classes of objects.

An extension to apgcodes, dubbed "greedy apgcodes" (see below), was later used to relax the 40×40 bounding box constraint, although some sufficiently large patterns remain canonically unencodable.

Unclassifiable objects

Objects which apgsearch cannot classify as above are denoted by one of:

  • zz_EXPLOSIVE
  • zz_LINEAR
  • zz_QUADRATIC
  • zz_REPLICATOR
  • PATHOLOGICAL (e.g. high-period oscillators)

Chaotic-growth (zz_*) patterns are classified according to certain heuristics; the labels chosen do not necessarily reflect the nature of the object encountered, so an object classified as e.g. zz_REPLICATOR need not be an actual replicator.

Like oversized apgcodes, these apgcodes do not uniquely identify specific objects.

Extended Wechsler Format

Non-oversized still lifes, oscillators and spaceships are encoded in extended Wechsler format, an extension of a pattern notation developed by Allan Wechsler in 1992.

  • The pattern is separated into horizontal strips of five rows.
    • Each strip, n columns wide, is encoded as a string of n characters in the set {0, 1, 2, ..., 8, 9, a, b, ..., v} denoting the five cells in a vertical column corresponding to the bitstrings {'00000', '00001', '00010', ..., '01000', '01001', '01010', '01011', ..., '11111'}.
    • The characters 'w' and 'x' are used to abbreviate '00' and '000', respectively, and the symbols {'y0', 'y1', 'y2', ..., 'yx', 'yy', 'yz'} are used to encode runs of between 4 and 39 consecutive '0's.
    • Extraneous '0's at the ends of strips are not included in the encoding. (Note that in particular, blank five-row strips are represented by the empty string.)
  • The character 'z' separates contiguous five-row strips.
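The rules above can be sketched as a small Python decoder. This is a minimal illustration, not the decoder used by apgsearch: the name decode_wechsler is hypothetical, and the code assumes that the least significant bit of each character is the topmost cell of its five-cell column, and that the character after 'y' is read as a base-36 digit d denoting a run of 4 + d zeroes.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def decode_wechsler(suffix):
    """Return the live cells of a pattern as a set of (row, column) pairs.

    Assumptions (not stated explicitly in the text above): bit 0 of each
    character is the top row of its strip, and 'y' followed by a base-36
    digit d stands for 4 + d empty columns.
    """
    cells = set()
    for strip_index, strip in enumerate(suffix.split("z")):
        column = 0
        i = 0
        while i < len(strip):
            ch = strip[i]
            if ch == "w":            # two empty columns
                column += 2
            elif ch == "x":          # three empty columns
                column += 3
            elif ch == "y":          # run of 4..39 empty columns
                i += 1
                if i >= len(strip):
                    raise ValueError("'y' must be followed by a character")
                column += 4 + DIGITS.index(strip[i])
            else:                    # one column of five cells
                value = DIGITS.index(ch)
                for bit in range(5):
                    if (value >> bit) & 1:
                        cells.add((5 * strip_index + bit, column))
                column += 1
            i += 1
    return cells

print(sorted(decode_wechsler("7")))
```

Under these assumptions, the single character 7 decodes to three vertically stacked cells, and an empty string between two 'z' separators contributes a blank five-row strip.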

Examples

xq4_27deee6 encodes a HWSS:

(Figure: annotated encoding of xq4_27deee6.)

xs31_0ca178b96z69d1d96 encodes a 31-bit still life; note how the 'z' separates the two five-row strips:

(Figure: annotated encoding of xs31_0ca178b96z69d1d96, first strip.)
(Figure: annotated encoding of xs31_0ca178b96z69d1d96, second strip.)

xp30_w33z8kqrqk8zzzx33 encodes a trans-queen-bee-shuttle; note how blank five-row strips are represented by the empty string:

(Figure: annotated encoding of xp30_w33z8kqrqk8zzzx33, first strip.)
(Figure: annotated encoding of xp30_w33z8kqrqk8zzzx33, second strip.)
(10 blank rows omitted)
(Figure: annotated encoding of xp30_w33z8kqrqk8zzzx33, last strip.)

xp2_31a08zy0123cko is a quadpole on ship; note how trailing zeros in the first five-row strip are not encoded:

(Figure: annotated encoding of xp2_31a08zy0123cko, first strip.)
(Figure: annotated encoding of xp2_31a08zy0123cko, second strip.)

Canonical form

In order to enforce a canonical form, there are further rules regarding encoding:

  • The leftmost column and uppermost row must each contain at least one live cell. (This gives a canonical position.)
  • A canonical orientation and phase must be determined. For example, with the caterer (p3 oscillator with no symmetry), there are three phases and eight orientations, so we have 24 possible encodings. A total order on these encodings is defined as follows:
    • Shorter representations are preferred to longer representations;
    • For representations of the same length, lexicographical ASCII ordering is applied, and preference given to earlier strings.

This gives, for any still-life, oscillator or spaceship, an unambiguous canonical code to represent the pattern. It has several desirable properties:

  • Compression: it is much more compact than RLE or SOF for storing very small patterns, and often even beats the common name ('xp15_4r4z4r4' is shorter than 'pentadecathlon')!
  • Character set: it only uses digits, lowercase letters and the underscore, so can be safely used in filenames and URLs.
  • Human-readability: the prefix means that we can instantly see whether a particular object is a still-life (and if so, what size), oscillator (and if so, what period) or spaceship (and if so, what period). It also means that the string is instantly recognised as being an encoding of an object ('xp2_7' is obviously a blinker, whereas the digit 7 on its own with no extra context is ambiguous).
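The total order used to pick the canonical encoding (shorter first, then ASCII order) can be sketched in Python as follows. The function name canonical_encoding is hypothetical, and the sketch assumes the candidate encodings for every phase and orientation have already been generated elsewhere.

```python
def canonical_encoding(candidates):
    """Pick the preferred encoding from an iterable of candidate strings.

    Shorter strings win; ties are broken by ASCII order, which is what
    Python's default string comparison gives for ASCII-only strings.
    """
    return min(candidates, key=lambda s: (len(s), s))

# Made-up candidate strings, used only to show the ordering.
print(canonical_encoding(["b", "aa", "a"]))
```

Because digits precede lowercase letters in ASCII, a candidate beginning with a digit is preferred over an equally long one beginning with a letter.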

Adapting apgcodes to larger patterns

As apgcodes were originally limited to patterns fitting into a 40×40 bounding box, different ways of extending them to allow unambiguously encoding larger patterns were proposed.[1] In their original form, apgcodes are not guaranteed to be able to encode larger patterns, due to a lack of well-definedness for runs of 40 or more consecutive zeroes, and apgsearch versions up to 3.x would report such patterns as oversized instead of attempting to encode them.

Extended ("greedy") apgcodes were eventually adopted to allow larger patterns to be encoded. These avoid the problem by stipulating that a run of n zeroes be encoded as follows:

  1. If n is less than 40, the run is encoded as in regular apgcodes, using the characters w, x, and y0 .. yz as appropriate.
  2. If n is equal to or greater than 40, the run is encoded as yz (representing the first 39 zeroes), followed by the encoding for a run of n - 39 zeroes according to this definition.

That is to say:

  enc(n) = the regular encoding of n zeroes ('0', 'w', 'x' or 'y0' .. 'yz'), if n < 40;
  enc(n) = 'yz' followed by enc(n - 39), if n ≥ 40.

This extension, which is used on the LifeWiki for larger patterns, retains compatibility with existing apgcode decoders. Adapting encoders to produce extended apgcodes is similarly straightforward.
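As a rough illustration of the greedy rule for runs of zeroes, the following Python sketch encodes a run of n zeroes. The name encode_zero_run is hypothetical, and the code is a sketch of the two-case definition above rather than the encoder used by apgsearch.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def encode_zero_run(n):
    """Encode a run of n >= 1 zeroes using the greedy rule."""
    if n < 1:
        raise ValueError("run length must be at least 1")
    pieces = []
    # Case 2: peel off 39 zeroes at a time as 'yz' while n >= 40.
    while n >= 40:
        pieces.append("yz")
        n -= 39
    # Case 1: the remaining 1..39 zeroes use the regular encoding.
    if n >= 4:
        pieces.append("y" + DIGITS[n - 4])
    else:
        pieces.append(("0", "w", "x")[n - 1])
    return "".join(pieces)

for n in (3, 39, 40, 100):
    print(n, encode_zero_run(n))
```

The loop is an iterative form of the recursive definition; a run of exactly 40 zeroes comes out as yz followed by a single 0, which a decoder for regular apgcodes reads as 39 + 1 zeroes.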

Greedy apgcodes were adopted for apgsearch 4.x[2], subject to the following constraints:

  • (width + 2) * (height + 2) <= 10000;
  • the resulting apgcode, without the prefix, cannot exceed 1280 bytes.
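The two constraints can be expressed as a simple check. This is a sketch only: the function name fits_greedy_limits is hypothetical, and it assumes that "without the prefix" means everything after the first underscore and that the code is pure ASCII, so one character is one byte.

```python
def fits_greedy_limits(width, height, apgcode):
    """Check the bounding-box and length constraints listed above.

    Assumption: the length limit applies to the part of the code after
    the first underscore.
    """
    _prefix, _separator, suffix = apgcode.partition("_")
    box_ok = (width + 2) * (height + 2) <= 10000
    length_ok = len(suffix.encode("ascii")) <= 1280
    return box_ok and length_ok

print(fits_greedy_limits(98, 98, "xs4_33"))
```

A 98×98 bounding box sits exactly on the first limit, since (98 + 2) * (98 + 2) = 10000.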

Adapting apgcodes to multistate rules

Work is ongoing on extending apgcodes to multistate (Generations) rules.[3] For this, apgcodes encoding still lifes, oscillators and spaceships will contain multiple suffixes, separated by underscores.[note 2]

Each suffix will encode one 'layer' of the pattern, with patterns in an n-state rule having 2 + ⌈log₂(n − 2)⌉ layers, where layer 0 is 'live', layer 1 is 'refractory', and the remaining layers implement a binary counter.

For example, the following spaceship in Brian's Brain (Generations rule /2/3, aka B2/S/G3) is represented by the apgcode xq4_3482h8a_03482lx8:

x = 11, y = 5, rule = /2/3
AB2.AB$AB.AB$.AB.A4.B$2.AB4.A.B$4.AB2.B!
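The layer count for an n-state rule, 2 plus the ceiling of log2(n - 2), can be computed with integer arithmetic as in the sketch below. The function name layer_count is hypothetical, and the requirement n ≥ 3 is an assumption made so that the logarithm is defined.

```python
def layer_count(n_states):
    """Number of layers for an n-state rule: 2 + ceil(log2(n - 2)).

    Assumption: n_states >= 3, so that n - 2 >= 1.
    """
    if n_states < 3:
        raise ValueError("expected a rule with at least 3 states")
    # For m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return 2 + (n_states - 3).bit_length()

for n in (3, 4, 6, 10):
    print(n, layer_count(n))
```

For the three-state rule Brian's Brain this gives two layers, which agrees with the two suffixes in the example code xq4_3482h8a_03482lx8.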

Limitations

apgcodes can be ambiguous if a rule is neither specified nor implicitly assumed, with the same code being assigned to different patterns evolving differently. For example, the code xp2_2a54 describes at least three different period-2 oscillators in outer-totalistic rules; further variants exist in non-totalistic rules:

Evolution of xp2_2a54 across different rules
  • B3/S3 to B35678/S02345678 (animation: Xp2 2a54 b3s23.gif)
  • B34/S to B345678/S0245678 (animation: Xp2 2a54 b34s.gif)
  • B4/S1 to B45678/S01245678 (animation: Xp2 2a54 b4s1.gif)

Notes

  1. Patterns exceeding a 40×40 bounding box can still be encoded using "classic" apgcodes if they either fit within such a box in at least one phase or, alternatively, do not contain runs of more than 39 zeroes in their extended Wechsler representation.
  2. That is to say, these apgcodes will match the following regular expression: x[spq][0-9]+(_[0-9a-z]+)*

References

  1. Extending apgcodes to larger patterns (discussion thread) at the ConwayLife.com forums
  2. Re: apgsearch v3.1 (discussion thread) at the ConwayLife.com forums
  3. Re: apgsearch v3.1 (discussion thread) at the ConwayLife.com forums