dvgrn wrote: ↑November 14th, 2024, 8:59 am
[...] Once that's done, the only remaining data cleanup for three-glider collisions will be to figure out which 3G collisions are missing. [...]

Since there are infinitely many 3G collisions, it may be more sensible to provide an iterator over an infinite sequence. Desirable properties of the sequence would include lack of duplicates, and having the generated collisions roughly ordered by increasing size (in every metric that is considered useful) so that for every useful upper bound on the size of a 3G collision one can give an upper bound on the number of terms of the sequence before every "small enough" collision appears. Then it would be up to the application to decide how many terms of the sequence it needs and when to stop iterating.
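As a rough illustration of the iterator idea, here is a minimal Python sketch. It does not enumerate real 3G collisions: the "collisions" are toy integer offsets, and `canonical`, `candidates_within` and `enumerate_by_size` are hypothetical names invented for this example. The only point is the shape of the loop: walk through increasing size bounds, enumerate the finite set of candidates under each bound, and skip anything whose canonical form has already been produced.

```python
from itertools import count, islice, product

def canonical(offset):
    """Hypothetical canonical form: identify toy offsets that differ by a symmetry."""
    dx, dy = abs(offset[0]), abs(offset[1])
    return (min(dx, dy), max(dx, dy))

def candidates_within(bound):
    """All toy 'collisions' whose size metric is at most `bound` (a finite set)."""
    rng = range(-bound, bound + 1)
    return product(rng, rng)

def enumerate_by_size():
    """Infinite, duplicate-free iterator ordered by non-decreasing size bound."""
    seen = set()  # grows without limit; a real implementation would need a plan for this
    for bound in count(0):
        for cand in candidates_within(bound):
            key = canonical(cand)
            if key not in seen:
                seen.add(key)
                yield key

# The application decides how many terms it wants and when to stop.
first = list(islice(enumerate_by_size(), 6))
```

In this toy setting there are exactly (B+1)(B+2)/2 canonical items of size at most B, so every item of size at most B is guaranteed to appear within that many terms. That is the kind of bound asked for above; for actual 3G collisions the count per bound would have to be worked out separately for each metric.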
confocaloid wrote: ↑November 17th, 2024, 11:49 am
[...] I'm currently running a rewritten script, which should perform additional checks, and output an exact copy of (one of duplicates of) the pattern taken from the input file. [...]

I did run the rewritten script (more details in the edited post), and got the count 453040 twice and the count 453038 once. As far as I can tell, the two missing results from the run over "3gdata.txt" must be due to the 23 patterns with touching gliders (shown in the same post).
dvgrn wrote: ↑November 17th, 2024, 7:12 pm
[...]
The easiest way to immediately detect a hash collision is to run an additional quick test: whenever a match is found, the next hash recorded for the matching pattern should be the same as the hash of the search pattern evolved by one tick. The confidence level goes up way past the "not worth worrying about" point if there's a match on even one additional generation -- and that data will be readily available for all but the less-than-.1% of the hashes at the end of each recorded series.

Especially if the database is actually meant to be extensible in future with more kinds of enumerations of predecessors (other sets of stationary constellations, collisions involving other spaceships, etc.), having the additional checks against hash clashes in place will become necessary. So it would be helpful to design the system from the start in such a way that the additional checks do not make things too slow or costly.
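The one-additional-generation check described in the quoted post could be sketched roughly as follows. This is not the actual database code: the hash function (`pattern_hash`, a 64-bit BLAKE2b digest of the translation-normalised cell list), the in-memory index layout, and all function names are assumptions made for illustration only.

```python
import hashlib
from collections import Counter

def step(cells):
    """Advance a set of live (x, y) cells by one generation of Life (B3/S23)."""
    counts = Counter((x + dx, y + dy)
                     for x, y in cells
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in cells)}

def pattern_hash(cells):
    """Hypothetical 64-bit hash, normalised for translation only."""
    if not cells:
        return 0
    x0 = min(x for x, _ in cells)
    y0 = min(y for _, y in cells)
    text = repr(sorted((x - x0, y - y0) for x, y in cells)).encode()
    return int.from_bytes(hashlib.blake2b(text, digest_size=8).digest(), "big")

def record_series(cells, generations):
    """Hashes of `generations` consecutive generations, starting with `cells`."""
    hashes = []
    for _ in range(generations):
        hashes.append(pattern_hash(cells))
        cells = step(cells)
    return hashes

def build_index(series_by_name):
    """Map each hash to the (name, generation) positions where it was recorded."""
    index = {}
    for name, hashes in series_by_name.items():
        for i, h in enumerate(hashes):
            index.setdefault(h, []).append((name, i))
    return index

def confirmed_matches(cells, index, series_by_name):
    """Keep only the hits whose next recorded hash matches the query one tick later."""
    h0 = pattern_hash(cells)
    h1 = pattern_hash(step(cells))
    hits = []
    for name, i in index.get(h0, []):
        hashes = series_by_name[name]
        # A hit on the last hash of a series has no successor to compare against;
        # this sketch treats such hits as unconfirmed and skips them.
        if i + 1 < len(hashes) and hashes[i + 1] == h1:
            hits.append((name, i))
    return hits

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
series = {"glider": record_series(glider, 6)}
index = build_index(series)

# Query: the same glider two ticks later, shifted elsewhere on the grid.
query = {(x + 10, y + 10) for x, y in step(step(glider))}
hits = confirmed_matches(query, index, series)
```

The extra cost per lookup is one pattern step and one more hash, and it is only paid when the first hash already matches. How hits at the very end of a recorded series should be treated (rejected, accepted, or re-simulated) is a design decision this sketch does not settle.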
I would not be too surprised if simply merging the existing octo... databases with a single replacement 64-bit hash function happened to produce a hash clash. I would expect a clash once there are xWSSes with up to 1024 generations of each initial pattern.
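For a rough sense of scale, the standard birthday approximation can be evaluated directly. The sketch below assumes an ideal, uniformly distributed 64-bit hash, and it also assumes (this is not stated in the thread) that each of the 453040 3G collisions contributes 1024 recorded hashes. The function names are made up for this example.

```python
import math

def clash_probability(n_hashes, bits=64):
    """Birthday approximation: P(at least one clash) ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_hashes * (n_hashes - 1) / 2.0 ** (bits + 1))

def hashes_for_probability(p, bits=64):
    """Approximate number of hashes at which the clash probability reaches p."""
    return math.sqrt(2.0 ** (bits + 1) * math.log(1.0 / (1.0 - p)))

# Assumed size of a 3G-only database: 453040 collisions x 1024 generations each.
n_3g = 453040 * 1024

print(f"3G only: n = {n_3g}, P(clash) ~ {clash_probability(n_3g):.4f}")
print(f"50% clash point: n ~ {hashes_for_probability(0.5):.3e}")
```

Under these assumptions a clash becomes more likely than not at roughly five billion stored hashes, which is only about ten times the assumed 3G total. Real hash functions and real pattern counts may behave differently, so this is an order-of-magnitude estimate at best.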