Enumerating Three-Glider Collisions

For scripts to aid with computation or simulation in cellular automata.
User avatar
confocaloid
Posts: 6697
Joined: February 8th, 2022, 3:15 pm
Location: learn to protect yourself against stray gliders and sparks and self-destruct mechanisms

Re: Enumerating Three-Glider Collisions

Post by confocaloid » November 17th, 2024, 8:29 pm

dvgrn wrote:
November 14th, 2024, 8:59 am
[...] Once that's done, the only remaining data cleanup for three-glider collisions will be to figure out which 3G collisions are missing. [...]
Since there are infinitely many 3G collisions, it may be more sensible to provide an iterator over an infinite sequence. Desirable properties of the sequence would include lack of duplicates, and having the generated collisions roughly ordered by increasing size (in every metric that is considered useful) so that for every useful upper bound on the size of a 3G collision one can give an upper bound on the number of terms of the sequence before every "small enough" collision appears. Then it would be up to the application to decide how many terms of the sequence it needs and when to stop iterating.
confocaloid wrote:
November 17th, 2024, 11:49 am
[...] I'm currently running a rewritten script, which should perform additional checks, and output an exact copy of (one of duplicates of) the pattern taken from the input file. [...]
I did run the rewritten script (more details in the edited post), and got the count 453040 twice and the count 453038 once. As far as I can tell, the two missing results from the run over "3gdata.txt" must be due to the 23 patterns with touching gliders (shown in the same post).
dvgrn wrote:
November 17th, 2024, 7:12 pm
[...]
The easiest way to immediately detect a hash collision is to run an additional quick test: whenever a match is found, the next hash recorded for the matching pattern should be the same as the hash of the search pattern evolved by one tick. The confidence level goes up way past the "not worth worrying about" point if there's a match on even one additional generation -- and that data will be readily available for all but the less-than-.1% of the hashes at the end of each recorded series.
Especially if the database is actually meant to be extensible in future with more kinds of enumerations of predecessors (other sets of stationary constellations, collisions involving other spaceships, etc.) having the additional checks against hash clashes in place will become necessary, so it would be helpful to design the system from the start so that the additional checks would be performed in a way that does not make things too slow/costly.

I would not be too surprised if simply merging the existing octo... databases with a single replacement 64-bit hash function happened to produce a hash clash. I would expect a clash once there are xWSSes with up to 1024 generations of each initial pattern.
127:1 B3/S234c User:Confocal/R (isotropic CA, incomplete)
Unlikely events happen.
My silence does not imply agreement, nor indifference. If I disagreed with something in the past, then please do not construe my silence as something that could change that.

User avatar
dvgrn
Moderator
Posts: 11979
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Enumerating Three-Glider Collisions

Post by dvgrn » November 18th, 2024, 11:23 am

confocaloid wrote:
November 17th, 2024, 8:29 pm
Since there are infinitely many 3G collisions, it may be more sensible to provide an iterator over an infinite sequence. Desirable properties of the sequence would include lack of duplicates, and having the generated collisions roughly ordered by increasing size (in every metric that is considered useful) so that for every useful upper bound on the size of a 3G collision one can give an upper bound on the number of terms of the sequence before every "small enough" collision appears. Then it would be up to the application to decide how many terms of the sequence it needs and when to stop iterating.
Anyone is certainly welcome to design and implement something along those lines. Seems like it's definitely a good idea to do an enumeration like that, up to the 500,000th collision (or so), just to make sure that the 3G collision database is complete -- up to some specific value, for whatever metric is being used to limit the enumeration.

I'm not clear what "the application" would be, though. The point of the Online Octohash database is that a whole lot of "fingerprinting" hash calculations can be done in advance, and the results can be indexed such that matches can be returned near-instantly even for a much larger dataset than the current experimental octohash/octo3obj/octo3g ones.

That idea of large volumes of pre-calculated hashes doesn't mix well with an application that iterates through an unbounded set of 3G collisions.

Beyond the 500,000th collision (or so) we get really quickly diminishing returns on the effort of enumerating further collisions and hashing every generation. How often is someone really actually in practice going to want to be able to look up, say, the 999th generation of collisions like the one shown here?

So ... given limited resources and time, I'll be continuing to focus on a finite region of the search space -- the region where interesting and useful things are most likely to be found.

For 3G collisions, that region roughly matches the 453040 collisions in the current list. However, because we don't know exactly what parameters 2718281828 used to generate the original set of collisions, we don't really know how many "potentially useful" 3G collisions might be missing from the list. It just seems like there probably are some missing entries (as the above link mentions).
confocaloid wrote:
November 17th, 2024, 8:29 pm
Especially if the database is actually meant to be extensible in future with more kinds of enumerations of predecessors (other sets of stationary constellations, collisions involving other spaceships, etc.) having the additional checks against hash clashes in place will become necessary, so it would be helpful to design the system from the start so that the additional checks would be performed in a way that does not make things too slow/costly.
Sure. I've already done that design work to my own satisfaction. What would you say is wrong with the design I've outlined?
confocaloid wrote:
November 17th, 2024, 8:29 pm
I would not be too surprised if simply merging the existing octo... databases with a single replacement 64-bit hash function happened to produce a hash clash. I would expect a clash once there are xWSSes with up to 1024 generations of each initial pattern.
That seems about right.

On the other hand, a single hash clash, or even a few dozen of them, really doesn't do anything to damage the usefulness of the project. It just means that out of the billions of patterns represented by the hashes in the database, you'll get a few false positives along with your good results in just those few cases.

The valid results will still show up just fine, and the invalid results can be disqualified with a very simple test (e.g., evolving the search pattern by one tick, taking a second hash, and checking that it's the same as the hash of the successor of each matched pattern.)

User avatar
confocaloid
Posts: 6697
Joined: February 8th, 2022, 3:15 pm
Location: learn to protect yourself against stray gliders and sparks and self-destruct mechanisms

Re: Enumerating Three-Glider Collisions

Post by confocaloid » November 18th, 2024, 12:24 pm

dvgrn wrote:
November 18th, 2024, 11:23 am
confocaloid wrote:
November 17th, 2024, 8:29 pm
dvgrn wrote:
November 14th, 2024, 8:59 am
[...] Once that's done, the only remaining data cleanup for three-glider collisions will be to figure out which 3G collisions are missing. [...]
Since there are infinitely many 3G collisions, it may be more sensible to provide an iterator over an infinite sequence. [...]
[...] it would be up to the application to decide how many terms of the sequence it needs and when to stop iterating.
Anyone is certainly welcome to design and implement something along those lines. [...]
I'm not clear what "the application" would be, though. [...]
Those are two different, overlapping problems, and neither of those problems would be fully covered by a solution to the other one.
Creating a database of "useful" collisions would not by itself solve classification of all three-glider collisions, and vice versa.

I think your earlier point about "the only remaining data cleanup for three-glider collisions..." belongs more to the problem of classification of all three-glider collisions, and less to the problem of designing and implementing a database of "useful" (for some definition of useful) collisions.

Assuming extensibility, at any point in future it should be possible to add any "new" "unexpectedly useful" collisions noted later, and the system should continue to work more or less as if those collisions were present in the database from the beginning. (Except probably for showing metadata about when such-and-such collision was added to the database, if discovery information/metadata is part of the project.)
That means that, if your goal is a database, then it is unnecessary to attempt to "figure out which three-glider collisions are missing" right now. The system should be designed to allow adding missing collisions later.

On the other hand, when the goal is classification, one would have to classify all three-glider collisions, regardless of their (perceived or actual) "usefulness", and there are infinitely many.
dvgrn wrote:
November 18th, 2024, 11:23 am
[...] I'm not clear what "the application" would be, though. [...]
"The application" is any application (e.g. a script) written by someone else, using the iterator as a library, and looking through three-glider collisions to solve some task which is not covered by any available database or software tool, and not predictable in advance. It is impossible to predict all kinds of things people will want to do with three-glider collisions.
127:1 B3/S234c User:Confocal/R (isotropic CA, incomplete)
Unlikely events happen.
My silence does not imply agreement, nor indifference. If I disagreed with something in the past, then please do not construe my silence as something that could change that.

Post Reply