Parallelizable tile-based backtracking

For general discussion about Conway's Game of Life.
Post Reply
User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Parallelizable tile-based backtracking

Post by dvgrn » July 23rd, 2015, 7:47 pm

From these increasingly irrelevant posts in another thread:
simsim314 wrote:We solve the problem in two steps:

1. Regular tree depth search but forcing the worst score predecessor to be the Kth worst predecessor of some tile.
2. Running K from 1 to max depth.

If we manage to separate nicely all the cases in #1 we could run #2 in parallel.
A clean parallelization option would definitely be a good thing. Seems like K will have to get pretty big before solutions start appearing, though. As you move from tile T1 to tile T2, the T2 predecessor will have an 80% overlap with the T1 predecessor, leaving something less than 32 possibilities that could actually work -- i.e., each combination of ON and OFF cells along one edge gets a score.

(Or is it possible to do this trick without overlapping tiles so much? I don't think so -- you have to check every 3x3's predecessor.)

If T3 and T4 tiles complete a 6x6 square, then T3 will also have no more than 2^5 workable predecessors in its sorted list, and T4 can have at most two. Right?

I remember I had some ideas about storing a compact indexed lookup table for the 80%-overlap cases. Let's see, five bits for each 4x5 area to store the number of predecessors that match, plus five bits to specify each final row or column, plus a score for each predecessor... seems as if that could be made to fit in a gigabyte or two of RAM.
simsim314 wrote:Other than this option that sorts each tile equally, we can do similar thing using some ranges of "worst score". Considering that some tiles have worse predecessors than other tiles, instead of limiting the worst predecessor depth in the list, we can limit the worst predecessor by score. This will require some minor modifications (like skipping some tiles, and using some tiles twice with several "worst scores"), but the essential idea should work as well.
Yes, a separate search could be done for predecessors whose scores add up to each possible integer S, and clearly there won't be any overlaps. Maybe the score should start at zero, and lower scores are better than high ones.

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 24th, 2015, 6:16 am

Few points about the hash tables.

For clarity, I'm working with 3x3 tiles and 5x5 predecessors.

First lets look at all possible overlaps:

1. None - for the first tile.
2. 10 cells overlap for top and bottom (T2, T3)
3. 16 cells overlap for most other cases (T4)

Considering it's all symmetric and that the None case is very rare, all we need to store is the following:

Datatype of: Predecessor and its score and depth.
Hash table: tile 3x3 + 2x5 right edge predecessor -> Predecessor list
Hash table: tile 3x3 + 2x5 right edge predecessor -> Hash Table right bottom 2x3 -> Predecessor list

This will require to rotate the the input for bottom 10 cells, or upper 10 cells so that the right Hash table will work. But I really don't see it as an issue, actually the fact it's symmetrical is very simplifying out lives.
----

If K needs to be high it's actually not a big issue, what's important is that the search will work fast enough and contradictions will be reached in reasonable time. If the fitting options are very limited it's actually good because we could cut off the low Ks very fast, which will leave us with the non trivial cases for deeper Ks. Actually what it means that the search time will be balanced by the "hard cases" that could bring up a solution, and the trivially not sufficient depth cases will be filtered out very quickly.

----

Another interesting point is how to conduct stats for 5x5. Assuming we start from glider collisions, which is good because we will give extra weight to glider constellations. But then if we let it run for too much time, the soup will be purely random soup without any relation to the initial gliders at all, thus giving weight to natural distribution in GOL, but lowering the "target function" which is convert the interaction back into gliders.
Attachments
T4.png
T4.png (4.85 KiB) Viewed 18692 times
T3.png
T3.png (4.79 KiB) Viewed 18692 times
T2.png
T2.png (3.5 KiB) Viewed 18692 times

User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » July 24th, 2015, 11:43 am

simsim314 wrote:Few points about the hash tables.

For clarity, I'm working with 3x3 tiles and 5x5 predecessors.

First lets look at all possible overlaps:

1. None - for the first tile.
2. 10 cells overlap for top and bottom (T2, T3)
3. 16 cells overlap for most other cases (T4)
Ah, got it. I was taking the much more conservative approach of maximally overlapping the 3x3 tiles. That's not necessary to guarantee a valid predecessor -- the width-2 overlap is enough.

I wonder about the 2 or 15 (!) additional 5x5 predecessors that are automatically generated between the overlapping tiles. Does it make sense to include their scores in the complete calculation also?

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 24th, 2015, 1:13 pm

dvgrn wrote:I wonder about the 2 or 15 (!) additional 5x5 predecessors that are automatically generated between the overlapping tiles. Does it make sense to include their scores in the complete calculation also?
Don't think so for two reasons:

1. This approach is heuristic anyway, we just need to make sure we "go in the right direction".
2. It cost extra performance, so there is a good reason to avoid extra calculation without very good basis.

---

Another options I was thinking about is to try to make the tile much smarter (for starters to try 9 options to divide the universe). The reason is that 3x3 is pretty small area, so the information is pretty scarce, lets say we have block which is 2x2 that falls exactly on the border of 4 tiles, finding predecessor to it will be much more complex task than just having the block inside single tile, the results could vary as well (for better or worse). So this approach could give few different interesting results.

---

Other than that the TODO list is very clear:

1. Generate statistics for 5x5 tiles and give them score.
2. Prepare the data structure as mentioned above. (Hash 3x3 + 2x5 -> 5x5 sorted by score etc.)
3. Write parallel search algorithm, for Depth = K.

---

Other than that I see another problem coming when the predecessors after few generations will start to "spread out", we will probably need to separate them manually into few "search areas". But I think it's a bit early to guess, hopefully after 10-15 generations we will start to see something pretty natural appearing, if the algorithm will not stuck.

Another long range shot is to use bigger tiles - and much more RAM. But hopefully the 3x3 and 5x5 tiles will work as well, just slower and with lower quality.

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 24th, 2015, 8:21 pm

I've made small search utility to collect statistics. It's currently hardcoded to work for 5x5, but most of it is generic.

I've also committed statistics for all the 33M 5x5 tiles, out of 700K glider collision soups, which were running for 450 iterations. Still running it hoping to reach few million soups and at least one occurrence for each 5x5 tile.

It's all part of Predecy branch in the GitHub repository, that you can find here.

stats.txt (zipped in the repository) is list of total counts per each "key". The key is calculated by building one long 25 bit integer out of the lines of 5x5 tile. While the last bit (less significant) is left upper corner, going to the right and down (line by line left to right), to the most significant bit in the bottom right.

What's interesting is that the distribution is very wide. Some 5x5 tiles are extremely common while some never appeared in any soup out of 700K in any of the 450 generations. This is good sign that 5x5 carries very good information about the probability of the tiles, and that natural distribution has very unique traits that can be captured just by statistics, so that few back tracing iteration will bring predecessors "up" to be much more probable to occur naturally.

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 25th, 2015, 4:32 pm

I've added data structure that needed for the search in the same repository.

I've made sure some basic sanity tests passed. I took a glider and looked for 10 most common 5x5 predecessors to the 3x3 glider (obviously expecting glider to be the first, and orientation independent as the stats collector didn't use symmetry in any stage). The results are not surprising, two gliders rotated by 180 degree have the same most common 5x5 predecessors while glider parent is the most common option:

Code: Select all

x = 35, y = 76, rule = B3/S23
2bo28b3o$3bo27bo$b3o28bo8$bo30b2o$2b2o27b2o$b2o30bo4$o$2o29b2o$b2o29b
2o$2b2o29b2o$34bo3$bo$3o28b3o$o2bo27bo2bo$b3o28b3o$33bo3$2bo$2bo29b2o$
ob2o27b2obo$b2o29bo$32bo3$bo30bo$3o28b2o$o2bo27bo2bo$2b2o28b3o$2bo30bo
3$3o28bobo$4bo$obobo25bobobo$30bo$bobo28b3o4$3o28b2o$2obo27bob2o$2b2o
28b3o4$3b2o27b3o$3b2o$3b2o25b2o$30b2o$3o27b2o3$o$2o30bo$2ob2o25b2ob2o$
2bo30b2o$34bo4$bo29b2obo$2b2o27b2o$ob2o29bo!
To use the stats from 6M soups, download stats.7z. Compile PredecessorsSearch.cpp as usual it currently doesn't use any special abilities (not C++11 nor OpenMP). This will obviously change pretty soon, as I'm intending to add parallel backtracking search.

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 29th, 2015, 4:08 pm

I've committed to the repository a working prototype for the search utility (no user interface - currently only code modifications).

PredecyInput.py will helps to create a valid input, to the search class.

It was shown to be working nicely on some non trivial cases like this:

Code: Select all

x = 15, y = 24, rule = B3/S23
3$6bo$2bobobo$6bo5bo$5bobo4bo$5bo4b2o$7b6o$6bobob2o2$7b2o$6bo$6bo2bo$
7b2o$6b2o$7bo$7bo$6b2o$7b2o$7b2o$6b2o$6b2o!
I've very hard time to find how to work with it properly. Usually scoring systems will bring a lot of similar solutions, especially predecessor variations on empty tiles. But even skipping this issue, many times scoring system tend to cluster around some specific solutions, while we do want to get nice diversity instead. Minimal "depth" sum, brought intuitively the single most elegant predecessor , but it doesn't give nice diversity.

Another issue is with non trivial shapes (like T or J shapes that pretty common). The search is currently using square tile setup, going in "quarter of spiral" (spiraling from top left) , but the code is very flexible, and the prepared data structure allows any sort of spirals.

I can't even start to think how to automated it for few iterations of depth.

User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » July 30th, 2015, 5:56 pm

simsim314 wrote:I've very hard time to find how to work with it properly. Usually scoring systems will bring a lot of similar solutions, especially predecessor variations on empty tiles.
Maybe this is where a bias toward lower population should come in? Seems as if empty tiles around the edges should stay empty as much as possible. Of course a lot of empty 3x3 tiles will have to have nonempty 5x5 predecessors, but if the neighbors aren't forcing that to happen, it seems as if there should be a strong preference for minimal population...?
simsim314 wrote:But even skipping this issue, many times scoring system tend to cluster around some specific solutions, while we do want to get nice diversity instead. Minimal "depth" sum, brought intuitively the single most elegant predecessor , but it doesn't give nice diversity.
How long does it usually take to find a solution for, say, a 20- to 25-cell 9x9 pattern? I'd expect that you could get a fair amount of diversity by arbitrarily restart the search every now and then, with a different initial tile and/or a different spiral search pattern.

As soon as we start looking more than two or three ticks back in time, we're automatically giving up on enumerating all possible predecessors. For difficult cases, like Gardens of Eden, grandfatherless patterns and so forth, it would still be nice if all single-tick predecessors could be enumerated. But for longer backtracking searches, there doesn't seem to be any harm in chopping off branches of the search tree as soon as they start looking repetitive. (?)
simsim314 wrote:The search is currently using square tile setup, going in "quarter of spiral" (spiraling from top left) , but the code is very flexible, and the prepared data structure allows any sort of spirals.

I can't even start to think how to automated it for few iterations of depth.
I wondered if it might be useful to use a loafer as a sample target for this new algorithm. We can already build a loafer with eight gliders, but somewhere out there there's probably a recipe in the five- to seven-glider range. I don't really expect this search to turn up a new-record recipe any time soon, but it would be quite a triumph.

If nothing else, I bet it's possible to generate a nice 16x16 space-dust loafer predecessor that looks like it came out of Catagolue -- suitable for April Fool's Day next year...!

User avatar
calcyman
Moderator
Posts: 2936
Joined: June 1st, 2009, 4:32 pm

Re: Parallelizable tile-based backtracking

Post by calcyman » July 30th, 2015, 6:20 pm

dvgrn wrote:If nothing else, I bet it's possible to generate a nice 16x16 space-dust loafer predecessor that looks like it came out of Catagolue -- suitable for April Fool's Day next year...!
But can you find a SHA-256 preimage thereof? Then you actually could get it into Catagolue...
What do you do with ill crystallographers? Take them to the mono-clinic!

User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » July 31st, 2015, 1:57 am

calcyman wrote:
dvgrn wrote:If nothing else, I bet it's possible to generate a nice 16x16 space-dust loafer predecessor that looks like it came out of Catagolue -- suitable for April Fool's Day next year...!
But can you find a SHA-256 preimage thereof? Then you actually could get it into Catagolue...
Well, it might be possible, next time I get access to a good high-performance computing cluster. Won't know unless I try. To maximize my chances I suppose it might be a good idea to recompile apgsearch 2.2 code so that it only looks for loafers...!

It might not be too difficult to write a set of multistate rules that would annihilate any pattern that isn't led by a loaf traveling at c/7, come to think of it. Boy, that would be a very boring search looking at a lot of nothing for a long time.

-- But does apgsearch 2.2 even use that kind of Golly-rule-based trick for counting and eliminating objects? I see some familiar function names in the 2.2 codebase, but all the rule-related stuff seems to be missing (or else hidden in the painful parts that I don't want to read too closely).

One more question while I'm off on this irrelevant tangent: have apgsearch 1.x and 2.x been shown to produce identical results on a few randomly-chosen multi-million-soup runs? I suppose B3S23/C1 doesn't have a lot of weird cases to test, the way some other rules do -- the "ov", "zz", and "PATHOLOGICAL" soups, for example. But now I'm wondering: might 2.2 be counting something just slightly differently than the old 1.x code?

[Horrible thought -- indistinguishable statistics, inextricably mixed.]

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 31st, 2015, 5:41 am

dvgrn wrote:To maximize my chances I suppose it might be a good idea to recompile apgsearch 2.2 code so that it only looks for loafers...!
This would be a real waste... maybe it will work X1.5 faster but this version will not report all the nice findings, to the online Census.

-----

And back to the topic:

Minimal population is not such a great idea. First of all some smoke is added automatically because patterns with it are actually more common than without it. Other than that we don't want one single "very good" solution, we want a bunch of different unique and probable solutions, because we use heuristics it's very improbable the best solution we come up with is the most natural one, we can say that we have nice estimation but it's totally not perfect, so we want few different solutions. Now even when you try to "relax" the population parameter, the fist thing that pops up is some extra cell from empty predecessor, when you let it be relaxed more again most of the options from "almost empty" and "empty" tiles start to popup and combinatorically explode.

The problem with enumerating all the predecessors is that there are billions of them. Even when I restrict the MaxDepth to be somewhat low (say 350 - that means take only 350 most common predecessors of each 3x3 tile and search with them only), I get more than million solutions.

I think we need something like "category", and we need to take the best solution from each category. Or hopefully some function will popup that will "normalize" the combinatorical explosions.
dvgrn wrote:How long does it usually take to find a solution for, say, a 20- to 25-cell 9x9 pattern? I'd expect that you could get a fair amount of diversity by arbitrarily restart the search every now and then, with a different initial tile and/or a different spiral search pattern.
Well I limit the MaxDepth parameter so I can get thousands or ten of thousands pretty nicely diverse solutions with good probability in less than a second (I've tested on 21x9 tile and it worked blazingly fast). But when I need to take 5 best solutions out of thousands they tend to come up very similar. I used different estimation functions and they weren't working that well. But maybe I can take 5 solutions from 5 different estimation functions.

On the other hand if I have any glider salvo, I want single good predecessor to popup which is the "natural" predecessor, so any estimation function should pass this "sanity test" which is pretty tricky on it's own.

I actually think it's a good idea to have few estimation functions, thus we will not get stuck with single "good" solution, but they should always "converge" on trivial cases, thus the more natural the solution is, the less diversity we need.

---

I want to remind that although we can use this tool to find all the possible predecessors, the strength of it is that we can find X most "natural" predecessors. So soup or garden of Eden (which are extremely not natural), is not one of its strongest points. We can try to "reverse" the search, searching for the most unnatural solutions there are, but I think this will miss the point, and make the search harder from iteration to iteration, while the point was to use the most common 5x5 tiles to "predict" the 3x3 tile, thus getting much more natural predecessor than the child, hopefully getting to simple SLs (blocks, blinkers, beehives) and gliders after few dozens of ticks.

----

I hope to write some simple interface now, to get rle as an input and get out one or few "best" predecessors.

User avatar
gmc_nxtman
Posts: 1150
Joined: May 26th, 2015, 7:20 pm

Re: Parallelizable tile-based backtracking

Post by gmc_nxtman » July 31st, 2015, 10:51 am

Same error that happened with apgnano, "omp.h" not found.

Code: Select all

Last login: Fri Jul 31 07:49:41 on ttys005
GalensSlverBook:Predecy-master galenmcholbi$ cc PredecessorsSearch.cpp
PredecessorsSearch.cpp:7:10: fatal error: 'omp.h' file not found
#include <omp.h>
         ^
1 error generated.
GalensSlverBook:Predecy-master galenmcholbi$ 


User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » July 31st, 2015, 11:57 am

simsim314 wrote:
dvgrn wrote:To maximize my chances I suppose it might be a good idea to recompile apgsearch 2.2 code so that it only looks for loafers...!
This would be a real waste... maybe it will work X1.5 faster but this version will not report all the nice findings, to the online Census.
Yes, it's a terrible idea. I should have used <terrible></idea> tags to make it clear that I knew that... I would think it would be possible to process soups at least a couple of orders of magnitude faster -- but the odds would be high that only the least interesting kind of loafer predecessor would be found, where the soup converges in a few ticks to a loafer and not much else. We know there are a gazillion such soups out there, but they're no help in finding glider constructions.
simsim314 wrote:Minimal population is not such a great idea...
Yes, I suppose if you force the population to stay low, you're likely to end up in very unnatural territory. Small areas of Life patterns tend to oscillate between higher and lower populations -- check the population plot of a honeyfarm explosion for example.

But probably it won't do any good at all to bias a search to try to alternate between higher and lower populations. That would likely produce just a pointless simulation of "more natural" without improving the odds of getting anything useful.

Glider constructions are fairly unnatural, anyway -- usually lots of orchestrated sparks simultaneously dying off at the very last moment. There's also probably no point in trying to emulate that kind of pattern in any way. If this new approach finds anything, it will be reactions that don't look much like engineered constructions.

I did wonder briefly about forcing predecessors to be significantly different by scattering a little junk around the target object -- behind or next to a spaceship, I suppose, or anywhere around an oscillator. A nearby block or blinker could be easily shot down later, but each possible position would push the search into a different part of the space... maybe. I suppose that deliberately adding different random dying sparks next to the object would have the same effect -- and maybe that's pretty much what the search is already doing.
simsim314 wrote:I think we need something like "category", and we need to take the best solution from each category. Or hopefully some function will popup that will "normalize" the combinatorical explosions.
...
... if I have any glider salvo, I want single good predecessor to popup which is the "natural" predecessor, so any estimation function should pass this "sanity test" which is pretty tricky on it's own.
Even trickier, if half the pattern has turned into gliders and the other half still needs work, the gliders should always get glider predecessors (low diversity) but the remaining spark still needs lots of different predecessors (high diversity).

If the estimation function prioritizes gliders too highly, the algorithm will probably very happily generate huge numbers of gliders out of thin air:

Code: Select all

#C very bad sample loafer recipe
x = 78, y = 94, rule = B3/S23
2bo$obo$b2o3$3bo$bobo$2b2o3$76bo$75bo$3bo71b3o$4b2o$3b2o31$25bobo$21b
3o2b2o$12b2o9bo2bo$11bobo8bo$13bo27b2o$20bo20bo$14bo4bo19bobo$13b2o4b
3o2b2o13b2o$13bobo7b2o$25bo26bo$51bobo$52bobo$53b2o$40b2o$40bobo$28bo
12b2o$24bo2bo$23b2o2b3o$23bobo$34b2o$34bobo$35bo9bo$39bo4bobo$38bobo3b
2o$39bobo$40b2o2$10bo$10b3o16b2o$13bo15b2o3b2o$12b2o19bo2bo$33bo2bo17b
o$34b2o17bobo$53b2o4$33b2o$26b2o4bobo$25bobo5bo$25b2o$21b2o$21b2o3$32b
2o$31bo2bo$31bo2bo$32b2o!

User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » July 31st, 2015, 12:20 pm

dvgrn wrote:If the estimation function prioritizes gliders too highly, the algorithm will probably very happily generate huge numbers of gliders out of thin air...
Maybe a different angle to pursue would be to try to find predecessors that travel as much as possible, and not worry so much about gliders. In a sense, the big bottleneck in a complex synthesis is lack of space to make adjustments.

A construction recipe is made out of some number of small sparky objects. If the objects are very well separated, it doesn't matter so much how many of them there are, because there's enough room to place each piece separately.

No point in having the backtracking search experiment with sparks outside the lightspeed cone of the target object, of course, except maybe to clean up troublesome junk, and that seems like a much lower priority. But information often travels surprisingly close to lightspeed in the middle of an active reaction.

So if I don't think too hard about how exactly to write the estimator code, I could imagine four fast-moving sparks converging at c/2 or faster on a central location where the construction occurs. The farther the sparks travel, the more room there is to construct them.

Is there any way to score predecessors higher if they're stable in the middle (empty, or small still lifes or p2 oscillators, doesn't matter) and only active in a ring around the edges?

-- I'll stop posting wild vague ideas now, I promise.

User avatar
calcyman
Moderator
Posts: 2936
Joined: June 1st, 2009, 4:32 pm

Re: Parallelizable tile-based backtracking

Post by calcyman » July 31st, 2015, 12:56 pm

dvgrn wrote:-- But does apgsearch 2.2 even use that kind of Golly-rule-based trick for counting and eliminating objects? I see some familiar function names in the 2.2 codebase, but all the rule-related stuff seems to be missing (or else hidden in the painful parts that I don't want to read too closely).
No, the only advantage of using rules in 1.x is that they're faster than Python. When you're writing directly in C++, it's naturally quicker to just manually detect objects (and indeed use bitwise tricks for blocks, blinkers, beehives and gliders to avoid unnecessary std::vector construction).
dvgrn wrote:One more question while I'm off on this irrelevant tangent: have apgsearch 1.x and 2.x been shown to produce identical results on a few randomly-chosen multi-million-soup runs?
They don't, because they run infinite-growth patterns for different durations. But they're both equally correct.
What do you do with ill crystallographers? Take them to the mono-clinic!

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » July 31st, 2015, 2:25 pm

gmc_nxtman wrote:"omp.h" not found.
I don't use omp so you can comment/delete this line.

-----

@dvgrn I think you're missing some major points of this approach. When I conducted the 5x5 soups statistics, I started from gliders and collided them, collecting stats about any 5x5 tile in 450 ticks range (and used ~6M such soups). This means few things:

1. Gliders are prioritized just because any soup had a lot of them, this was not a coincidental statistical feature, but intended "unnatural" bias toward gliders.

2. Small SLs are prioritized from the same reason. Natural soups has them a lot and 450 ticks is enough to get many of them converged into small SLs (blinkers, blocks, and beehives are highly scored).

3. The 5x5 tile statistics is EXTREMELY informative. Some tiles scored in trillions or hundred of millions, others thousands and some bellow hundred. 2 millions 5x5 tiles are scored with 0 - they are very uncommon.

4. The search is limited to use only the most highly scored tile predecessors of 3x3. Some 3x3 tiles have ~100K predecessors, and some only 5K, but for any 3x3 tile I limit the number of 5x5 predecessors to 350 most common ones, and got a lot of results for pretty unnatural c/3. This means that any "proposed" solution is pretty much natural (each 5x5 tile has probably appeared in random soups more than 1K times).

5. Even with this limitation I get hundreds of thousands and millions of solutions, they are pretty much diverse and with high quality. We just need to pick the best of them without taking too much similar solutions. Conducting another search instead of using millions of nice solutions that we got, because we don't know how to estimate and pick and chose the most diverse solutions is kinda silly. More than that there is natural tendency that comes with the statistics to give pretty good results with sparks if the sparks are common, so I'm guessing the features you talk about that has any natural soup, will happen by their own in this approach.

Instead of adding random sparks and conduct the search again, I would say we can "force" X cells to be ON and OFF and take the best solution with this constrain. I think it's unnatural approach. More than that I would argue that we want to get single solution if we have glider salvo, I'm certain the glider salvo predecessor will be scored very high, it's only matter of choosing the estimation function correctly.

----

Actually prioritizing "O" shape or "island group isolation" is a nice idea. Meaning that solutions that can be divided into several separate groups, and each such group will also be encouraged to travel away from the center. I think this pretty unnatural idea could work, but using just the "best" and "most common" natural predecessors will work even better.

I even think that if we take only one good predecessor that passed glider sanity test, and recurse it alone, this will work as well, after few dozens of ticks - maybe it won't bring the best solution possible using this approach, but I do think it end up with gliders and SLs just due to "statistics magic".

User avatar
dvgrn
Moderator
Posts: 10670
Joined: May 17th, 2009, 11:00 pm
Location: Madison, WI
Contact:

Re: Parallelizable tile-based backtracking

Post by dvgrn » August 1st, 2015, 1:56 am

simsim314 wrote:1. Gliders are prioritized just because any soup had a lot of them, this was not a coincidental statistical feature, but intended "unnatural" bias toward gliders.
It will be interesting to see if there's too much bias toward gliders. Let's say there's a common way to make a required spark with two gliders, but you could also use a dozen gliders and end up with the same spark, along the lines of my example above.

It would be nice if the estimator function could reliably prefer the two-glider predecessor instead of the 12-glider one... but that might be a lot of hope to hang on an estimator function, which already seems tricky enough.
simsim314 wrote:I even think that if we take only one good predecessor that passed glider sanity test, and recurse it alone, this will work as well, after few dozens of ticks - maybe it won't bring the best solution possible using this approach, but I do think it end up with gliders and SLs just due to "statistics magic".
Well, I'll certainly keep my fingers crossed and see what the first results from the RLE-in, predecessor-out GUI look like. I'm not expecting that "statistics magic" will turn out to be quite that powerful in practice, when you get back to T minus 20 or 30 ticks -- at least not without an additional trick like the O-shape or island-isolation ideas.

The reason for this is the Garden of Eden theorem, which came up briefly in a previous conversation on the forums.

On average, predecessors are going to be slightly bigger than their descendants. That's just the way it works -- you provably can't stay inside any given bounding box indefinitely as you backtrack. In practice the bounding box's average delta X and Y per backtracking step is maybe around (+1.8, +1.8 ). [Definitely just a guess, but I bet it's well over (+1, +1 ), and probably not much less than (+2, +2 ).]

So if we start with a loafer-sized object, its non-loafer T-minus-20 predecessors will generally be bigger than 25x25. If those predecessors aren't successfully split into separate islands somehow, I suspect that as the size goes up, the backtracking is going to gradually start to be more and more constrained.

There aren't any Gardens of Eden in 5x5 patterns at all -- but there are an awful lot of 25x25 GoEs. The great majority of 25x25 patterns might even be GoEs. Well before T minus 20 there might stop being any workable choices at all for most branches of the search tree.

-- I'd be really curious to find out what proportion of randomly-chosen NxN patterns can be shown to be GoE's, as N increases. I'm certainly hopeful that choosing only the most common 5x5 tiles will keep the search away from the GoE-rich part of the space, at least for a while. But that won't be a proven method until it actually works...!

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » August 1st, 2015, 3:40 am

I've committed a version that accepts rle - and returns iterative depth search. Notice the application is eating RAM so having less than 8GB of RAM will simply fail, and with 8GB RAM will stack everything (but will work).

I've few new issues now:

1. When I just backtrack even the simplest pattern for 3-4 ticks, the "complexity" starts to increase dramatically. This means the maxDepth starting point increases (the pattern is less natural), and the search takes longer (the pattern is just bigger).

2. I've added padding of size = 2 around the input pattern and for simple stuff it showed to be working fine (distant gliders or glider + block), but for more complex stuff it wasn't sufficient, and leftover junk started to mess things up.

NOTE I think it's safe to conclude after few tests, that using single "best" predecessor based on statistics alone (like sum 1/log(score), min population, min total depth etc.), is not working. The complexity tends to increase.

-----

To solve these issues:

1. I think there is no escape other than to try few solutions (sorted by category) and not single solution. Hopefully one of the solutions will actually get predecessors much quicker, and this will be a sign that this is a more natural predecessor. So instead of searching single solution and increasing the maxDepth, I intend to search few hundred solutions, and the first solution to work with lowest maxDepth (when the search is still pretty fast), will be considered the best solution, and it parents will be iterated next.

This is similar to depth search in chess. You get few moves, then you try to search each one of the moves etc. The proposed solution is depth = 1, but we could go deeper, getting more versatile solutions and trying more options for few generations until we decide who is the best predecessor.

This limitation is due to the fact that although statistical limitation is necessary it's not sufficient (natural patterns will score high but also some unnatural patterns will). To fix that, we need "second derivative" estimation, that means we need to check how our solution is actually "natural", by searching predecessor to it.

2. To fix the boundary size problem I think the best is to iterate few ticks and check that we get the initial rle pattern. Otherwise too many times the solution will not bring the wanted result due to junk around the predecessor.

User avatar
gmc_nxtman
Posts: 1150
Joined: May 26th, 2015, 7:20 pm

Re: Parallelizable tile-based backtracking

Post by gmc_nxtman » September 28th, 2015, 5:46 pm

Am I invoking it incorrectly or something?

[redacted image]
Last edited by gmc_nxtman on June 13th, 2016, 1:07 pm, edited 1 time in total.

User avatar
simsim314
Posts: 1823
Joined: February 10th, 2014, 1:27 pm

Re: Parallelizable tile-based backtracking

Post by simsim314 » September 30th, 2015, 2:33 pm

gmc_nxtman wrote:Am I invoking it incorrectly or something?
Yes... I didn't write tutorial and the current state is not really working.

In few words you need to start from generating the statistics - and then use it. The current state is useless for the end user, it's available only for people who wish to change and play with the code. I hope to get to it at some point to make it workable alpha version - now it's just more like a sketch and not a release or anything.

Post Reply