Here's something else that you can do with this data that's ultimately useless but nonetheless interesting.
I was wondering about "missing" rules on Catagolue, candidates for soup-searching. In particular, I was wondering: what totalistic rules are there that fit "in between" rules already investigated? (I'm not interested in non-totalistic rules here; there'd be far too many, differing only very subtly.)
To illustrate what I mean, consider the following rules graph (not based on actual Catagolue data, obviously):

[Image: 867z5KF.png]
Are there any (totalistic) rules that are subrules of b38s238 and superrules of b3s23 that aren't yet in the graph? Yes, b38s23 and b3s238. It's possible to compute these by taking the "difference" between the rules, as it were (b8s8), and adding each possible subset of the conditions in this difference to the "lower" rule (b3s23); the empty and the full subset just give back the two rules we started with.
We therefore define the totalistic difference of two totalistic rules A and B as all the B/S conditions that appear in A but not in B. (We only care about the case where A is a superrule of B, however.) If A is a superrule of B we further define the totalistic distance of A and B as the number of conditions in the totalistic difference. (So for instance, for b38s238 and b3s23, the totalistic difference is b8s8, and the totalistic distance is 2.)
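To make the definition concrete, here is a minimal Python sketch (not taken from the attached Perl scripts) of the totalistic difference, the totalistic distance, and the "in between" rules for two purely totalistic rules. The function names are made up for illustration, and the rule strings are assumed to be in lowercase b/s form such as b38s238.

```python
from itertools import combinations

def parse_totalistic(rule):
    """Turn 'b38s238' into a set of conditions such as {'b3', 'b8', 's2', 's3', 's8'}."""
    birth, sep, survival = rule.lower().partition("s")
    digits = birth[1:] + survival
    if not (birth.startswith("b") and sep) or any(c not in "012345678" for c in digits):
        raise ValueError(f"not a totalistic rule string: {rule!r}")
    return frozenset({"b" + d for d in birth[1:]} | {"s" + d for d in survival})

def format_rule(conds):
    """Inverse of parse_totalistic: a set of conditions back to a 'b...s...' string."""
    b = "".join(sorted(c[1:] for c in conds if c[0] == "b"))
    s = "".join(sorted(c[1:] for c in conds if c[0] == "s"))
    return f"b{b}s{s}"

def totalistic_difference(a, b):
    """Conditions that appear in A but not in B; A must be a superrule of B."""
    conds_a, conds_b = parse_totalistic(a), parse_totalistic(b)
    if not conds_b <= conds_a:
        raise ValueError(f"{a} is not a superrule of {b}")
    return conds_a - conds_b

def totalistic_distance(a, b):
    return len(totalistic_difference(a, b))

def in_between(a, b):
    """Rules strictly between B and A: B plus each proper, non-empty subset of the difference."""
    diff = sorted(totalistic_difference(a, b))
    base = parse_totalistic(b)
    return [format_rule(base | set(extra))
            for k in range(1, len(diff))
            for extra in combinations(diff, k)]

print(format_rule(totalistic_difference("b38s238", "b3s23")))  # b8s8
print(totalistic_distance("b38s238", "b3s23"))                 # 2
print(in_between("b38s238", "b3s23"))                          # ['b38s23', 'b3s238']
```

Representing a rule as a set of b/s conditions turns the superrule test into a subset test and the difference into a plain set difference.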
Now, Catagolue also allows non-totalistic rules to be investigated. Since our aim is to find out which totalistic rules can be wedged in between rules that are adjacent in the rules graph, we define:
- the totalistic completion of a rule A to be the minimal totalistic superrule of A;
- the totalistic reduction of a rule A to be the maximal totalistic subrule of A; and
- the totalistic difference/distance of two (possibly non-totalistic) rules A and B as the totalistic difference/distance in the above sense of the totalistic reduction of A and the totalistic completion of B.
Note that:
- totalistic completion and reduction are well-defined: for each rule there is precisely one totalistic completion/reduction (exercise left for the reader); and
- totalistic reduction and completion are no-ops for totalistic rules, so the second definition of totalistic difference/distance properly extends the first.
With me so far? Some examples might be useful. Take the rule b34-as236i. Its totalistic reduction is b3s23 (all partial B/S conditions have been removed); its totalistic completion is b34s236 (all partial B/S conditions have been filled in).
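For those who prefer code, the reduction and completion might be sketched in Python roughly as follows. This is an illustration only, not the code in the attached scripts; the function names are invented, and it makes the simplifying assumption that any B/S digit carrying Hensel letters (with or without a leading minus sign) counts as a partial condition.

```python
import re

# One B/S condition: a digit, optionally followed by "-" and/or Hensel letters.
_ITEM = re.compile(r"(\d)(-?[cekainyqjrtwz]+)?")
_RULE = re.compile(r"b((?:\d(?:-?[cekainyqjrtwz]+)?)*)s((?:\d(?:-?[cekainyqjrtwz]+)?)*)")

def _split(rule):
    """Return two lists of (digit, letters) pairs, one for births and one for survivals."""
    m = _RULE.fullmatch(rule.lower())
    if m is None:
        raise ValueError(f"unrecognised rule string: {rule!r}")
    return [_ITEM.findall(part) for part in m.groups()]

def totalistic_reduction(rule):
    """Maximal totalistic subrule: drop every partial condition."""
    birth, survival = _split(rule)
    return ("b" + "".join(d for d, letters in birth if not letters)
            + "s" + "".join(d for d, letters in survival if not letters))

def totalistic_completion(rule):
    """Minimal totalistic superrule: fill in every partial condition."""
    birth, survival = _split(rule)
    return ("b" + "".join(d for d, _ in birth)
            + "s" + "".join(d for d, _ in survival))

print(totalistic_reduction("b34-as236i"))   # b3s23
print(totalistic_completion("b34-as236i"))  # b34s236
print(totalistic_reduction("b3s23"))        # b3s23 (no-op for totalistic rules)
```

A stricter version would also recognise letter lists that happen to cover every neighbourhood of a given count (and so are totalistic in disguise); the sketch above ignores that case.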
Consider the rules b368is23 and b3-as23:

[Image: n1loFpl.png]
The totalistic reduction of the former is b36s23; the totalistic completion of the latter is b3s23. Therefore the totalistic difference is b6s, and the totalistic distance is 1. (This is perhaps not ideal, insofar as b3s23 itself also lies in between these rules, but it's good enough for our purposes, and when computing the "in between" rules in the script below we'd still catch b3s23 in this case.)
We may occasionally run into problems: if A is a superrule of B, the totalistic reduction of A is not necessarily a superrule of the totalistic completion of B. Consider the rules b35aes23 and b35as23:

[Image: hKtpGbe.png]
The totalistic reduction of the former is b3s23, while the totalistic completion of the latter is b35s23. We therefore go on to define the strict totalistic difference/distance as equal to the totalistic difference/distance iff the superrule relationship is sustained, undefined otherwise. (This makes sense, because in the above example there IS no totalistic rule that can be wedged in between those two.)
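The strict variant can be sketched in Python like this. Again this is a self-contained illustration rather than the code in the attached scripts: the names are hypothetical, None stands in for "undefined", and any digit with Hensel letters attached is treated as a partial condition.

```python
import re

# One B/S condition: a digit, optionally followed by "-" and/or Hensel letters.
_ITEM = re.compile(r"(\d)(-?[cekainyqjrtwz]+)?")

def conditions(rule, keep_partial):
    """Totalistic conditions of a rule as a set like {'b3', 's2', 's3'}.

    keep_partial=False yields the totalistic reduction,
    keep_partial=True yields the totalistic completion.
    """
    # "s" is not a Hensel letter, so the first "s" separates births from survivals.
    birth, _, survival = rule.lower().partition("s")
    result = set()
    for prefix, part in (("b", birth[1:]), ("s", survival)):
        for digit, letters in _ITEM.findall(part):
            if keep_partial or not letters:
                result.add(prefix + digit)
    return result

def strict_totalistic_difference(a, b):
    """Difference between the reduction of A and the completion of B, or None if undefined."""
    upper = conditions(a, keep_partial=False)  # totalistic reduction of A
    lower = conditions(b, keep_partial=True)   # totalistic completion of B
    if not lower <= upper:
        return None  # the superrule relationship is not sustained
    return upper - lower

def strict_totalistic_distance(a, b):
    diff = strict_totalistic_difference(a, b)
    return None if diff is None else len(diff)

print(strict_totalistic_difference("b368is23", "b3-as23"))  # {'b6'}
print(strict_totalistic_distance("b368is23", "b3-as23"))    # 1
print(strict_totalistic_difference("b35aes23", "b35as23"))  # None
```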
Got all that? Good.
We can now instrument the rulesgraph.pl script from the first post above to compute these properties for edges in the (reduced) graph and save them as edge attributes.
And we can write a new script to look at each edge A->B, apply each possible combination of B/S conditions from the strict totalistic difference to the totalistic completion of B, and remember each rule created that way that is not yet in the rules graph (i.e. is neither A nor B, by construction of the graph). In fact we can also count how often each rule was "wanted", and what edges it was attached to.
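In outline, the counting step could look like the Python sketch below. It is not mostwantedrules.pl itself. It assumes each edge already carries its attributes as a tuple (A, B, conditions of the totalistic completion of B, strict totalistic difference or None), and the toy edges simply reuse the three example pairs from this post rather than real Catagolue data.

```python
from collections import defaultdict
from itertools import combinations

def format_rule(conds):
    """Turn a set like {'b3', 's2', 's3'} into the rule string 'b3s23'."""
    b = "".join(sorted(c[1:] for c in conds if c[0] == "b"))
    s = "".join(sorted(c[1:] for c in conds if c[0] == "s"))
    return f"b{b}s{s}"

def most_wanted(edges, known):
    """Map each missing in-between rule to the edges that want it, most wanted first."""
    wanted = defaultdict(list)
    for a, b, base, diff in edges:
        if diff is None:          # strict totalistic difference undefined: nothing fits
            continue
        ordered = sorted(diff)
        for k in range(len(ordered) + 1):
            for extra in combinations(ordered, k):
                rule = format_rule(set(base) | set(extra))
                if rule not in known:   # skip anything already in the rules graph
                    wanted[rule].append((a, b))
    return sorted(wanted.items(), key=lambda kv: (-len(kv[1]), kv[0]))

# Toy input (hypothetical graph built from the examples above).
b3s23 = {"b3", "s2", "s3"}
edges = [
    ("b38s238", "b3s23", b3s23, {"b8", "s8"}),
    ("b368is23", "b3-as23", b3s23, {"b6"}),
    ("b35aes23", "b35as23", {"b3", "b5", "s2", "s3"}, None),
]
known = {"b38s238", "b3s23", "b368is23", "b3-as23", "b35aes23", "b35as23"}

result = most_wanted(edges, known)
for rule, wanting in result:
    print(f"{rule}: wanted {len(wanting)} time(s)")
```

Including the empty subset is deliberate: when B is non-totalistic, its totalistic completion is itself a candidate, which is how b3s23 would be caught for the edge b368is23 -> b3-as23 if it were not already in the graph.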
I've done this. The updated scripts are attached (along with an updated rules.list file and the files generated by rulesgraph.pl); here's the output of mostwantedrules.pl:
$ perl mostwantedrules.pl
Read 928 vertices and 2070 edges.
Wanted 22 times:
b45678s0345678:
(b3-a45678s01-c2ai345678) - (b457s047)
(b3-i45678s01-c345678) - (b45678s4678)
(b3-a45678s01-c2ai345678) - (b678s034678)
(b3-a45678s01-c2ai345678) - (b578s0568)
(b3-i45678s01-c345678) - (b78s345678)
(b3-i45678s01-c345678) - (b45s34)
(b3-i45678s01-c345678) - (b5678s3578)
(b3-a45678s01-c2ai345678) - (b45678s0)
(b3-i45678s01-c345678) - (b578s0568)
(b3-a45678s01-c2ai345678) - (b5678s45678)
(b3-a45678s01-c2ai345678) - (b45678s4678)
(b3-a45678s01-c2ai345678) - (b45s035678)
(b3-a45678s01-c2ai345678) - (b78s345678)
(b3-a45678s01-c2ai345678) - (b4678s35678)
(b3-i45678s01-c345678) - (b4678s35678)
(b3-i45678s01-c345678) - (b457s047)
(b3-i45678s01-c345678) - (b45678s0)
(b3-i45678s01-c345678) - (b5678s45678)
(b3-i45678s01-c345678) - (b678s034678)
(b3-a45678s01-c2ai345678) - (b45s34)
(b3-a45678s01-c2ai345678) - (b5678s3578)
(b3-i45678s01-c345678) - (b45s035678)
Wanted 16 times:
b35s2:
(b35s12) - (b35s)
(b35s24) - (b35s)
(b35s12) - (b5s2)
(b345s2) - (b35s)
(b345s2) - (b5s2)
(b35s025) - (b5s2)
(b35s27) - (b35s)
(b356s02) - (b5s2)
(b35s27) - (b5s2)
(b2i35s23-i4ij5c) - (b35s)
(b34-i5s23a) - (b5s2)
(b356s02) - (b35s)
(b34-i5s23a) - (b35s)
(b35s24) - (b5s2)
(b2i35s23-i4ij5c) - (b5s2)
(b34-i5s23a) - (b3s2)
Wanted 15 times:
b5s13:
(b57s134) - (bs13)
(b57s134) - (b5s3)
(b57s134) - (b5s1)
(b457s13) - (b5s1)
(b2-a5s135678) - (b5s3)
(b457s13) - (bs13)
(b45s13467) - (b5s1)
(b2-a5s135678) - (bs13)
(b45s13467) - (bs13)
(b457s13) - (b5s3)
(b35s13) - (b5s1)
(b45s135) - (bs13)
(b45s135) - (b5s1)
(b2-a5s135678) - (b5s1)
(b45s135) - (b5s3)
b3568s256:
(b35678s256) - (b56s25)
(b3568s02568) - (b38s2)
(b3568s02568) - (b36s25)
(b35678s256) - (b36s26)
(b3568s02568) - (b56s26)
(b35678s256) - (b56s26)
(b3568s02568) - (b568s2)
(b3568s2567) - (b36s25)
(b3568s02568) - (b36s26)
(b3568s02568) - (b56s25)
(b35678s256) - (b568s2)
(b3568s2567) - (b38s2)
(b3568s2567) - (b56s25)
(b3568s2567) - (b568s2)
(b35678s256) - (b36s25)
b36s06:
(b3468s0568) - (b3s06)
(b3678s0456) - (b3s06)
(b356s0356) - (b36s0)
(b35678s01567) - (b3s06)
(b3678s0346) - (b3s06)
(b36s0136) - (b36s0)
(b367s03567) - (b36s0)
(b35678s01567) - (b36s0)
(b3678s0456) - (b36s0)
(b3678s015678) - (b3s06)
(b35678s02678) - (b3s06)
(b367s03467) - (b3s06)
(b36s035678) - (b3s06)
(b3468s0568) - (b36s0)
(b36s0136) - (b3s06)
Wanted 13 times:
bs3478:
(b3s03478) - (bs348)
(b2i3-e678s34678) - (bs348)
(b2e3-a678s34678) - (bs348)
(b678s13478) - (bs348)
(b3-e678s2i34678) - (bs348)
(b78s345678) - (bs348)
(b478s234678) - (bs348)
(b3-j678s34678) - (bs348)
(b36s34678) - (bs348)
(b678s034678) - (bs348)
(b35s3478) - (bs348)
(b378s3478) - (bs348)
(b368s3478) - (bs348)
b3568s25:
(b35678s256) - (b56s25)
(b3568s02568) - (b38s2)
(b3568s02568) - (b36s25)
(b3568s02568) - (b568s2)
(b3568s2567) - (b36s25)
(b35678s0258) - (b56s25)
(b3568s02568) - (b56s25)
(b35678s256) - (b568s2)
(b3568s2567) - (b38s2)
(b35678s0258) - (b36s25)
(b3568s2567) - (b56s25)
(b3568s2567) - (b568s2)
(b35678s256) - (b36s25)
b8s348:
(b2i3-e678s34678) - (bs348)
(b2e3-a678s34678) - (bs348)
(b368s3468) - (bs348)
(b678s13478) - (bs348)
(b3-e678s2i34678) - (bs348)
(b78s345678) - (bs348)
(b34akt68s348) - (bs348)
(b478s234678) - (bs348)
(b5678s12348) - (bs348)
(b3-j678s34678) - (bs348)
(b678s034678) - (bs348)
(b378s3478) - (bs348)
(b368s3478) - (bs348)
Wanted 12 times:
b3s036:
(b3s01367) - (b3s36)
(b37s03567) - (b3s06)
(b3s01356) - (b3s06)
(b3678s0346) - (b3s06)
(b35s036) - (b3s06)
(b35s036) - (b3s03)
(b35s036) - (b3s36)
(b367s03467) - (b3s06)
(b36s0136) - (b3s36)
(b3s01367) - (b3s06)
(b36s035678) - (b3s06)
(b36s0136) - (b3s06)
b3678s0178:
(b3678s015678) - (b38s078)
(b35678s0178) - (b68s178)
(b3678s015678) - (b36s178)
(b35678s0178) - (b3s018)
(b35678s0178) - (b67s01)
(b35678s0178) - (b38s078)
(b3678s015678) - (b68s178)
(b3678s015678) - (b67s01)
(b3678s015678) - (b36s078)
(b35678s0178) - (b36s078)
(b35678s0178) - (b36s178)
(b3678s015678) - (b3s018)
b3s046:
(b348s046) - (b3s06)
(b3678s0456) - (b3s06)
(b37s01456) - (b3s06)
(b3678s0346) - (b3s06)
(b3s0246) - (b3s04)
(b3s014567) - (b3s06)
(b348s046) - (b3s46)
(b3678s0456) - (b3s04)
(b367s03467) - (b3s06)
(b3s0468) - (b3s06)
(b3s0246) - (b3s06)
(b3s0468) - (b3s04)
b45678s345678:
(b3-i45678s01-c345678) - (b45678s4678)
(b3-i45678s01-c345678) - (b78s345678)
(b3-i45678s01-c345678) - (b45s34)
(b3-i45678s01-c345678) - (b5678s3578)
(b3-a45678s01-c2ai345678) - (b5678s45678)
(b3-a45678s01-c2ai345678) - (b45678s4678)
(b3-a45678s01-c2ai345678) - (b78s345678)
(b3-a45678s01-c2ai345678) - (b4678s35678)
(b3-i45678s01-c345678) - (b4678s35678)
(b3-i45678s01-c345678) - (b5678s45678)
(b3-a45678s01-c2ai345678) - (b45s34)
(b3-a45678s01-c2ai345678) - (b5678s3578)
b3568s0258:
(b3568s02568) - (b356s02)
(b3568s02568) - (b38s2)
(b3568s02568) - (b36s25)
(b3568s02568) - (b3s08)
(b3568s02568) - (b568s2)
(b35678s0258) - (b35s025)
(b35678s0258) - (b356s02)
(b35678s0258) - (b3s258)
(b35678s0258) - (b56s25)
(b3568s02568) - (b56s25)
(b35678s0258) - (b36s25)
(b35678s0258) - (b3s08)
Wanted 11 times:
b7s348:
(b2i3-e678s34678) - (bs348)
(b2e3-a678s34678) - (bs348)
(b678s13478) - (bs348)
(b3-e678s2i34678) - (bs348)
(b78s345678) - (bs348)
(b478s234678) - (bs348)
(b5678s12348) - (bs348)
(b7s013468) - (bs348)
(b3-j678s34678) - (bs348)
(b678s034678) - (bs348)
(b378s3478) - (bs348)
b3s178:
(b36s178) - (b3s17)
(b36s178) - (b3s18)
(b35s12678) - (b3s17)
(b37s12578) - (b3s17)
(b3478s1478) - (b3s17)
(b35s12678) - (b3s18)
(b38s12578) - (b3s17)
(b38s12578) - (b3s18)
(b3s1378) - (b3s18)
(b3s1378) - (b3s17)
(b3478s1478) - (b3s18)
b45s13:
(b457s13) - (b5s1)
(b45s135) - (b4s1)
(b457s13) - (b4s1)
(b457s13) - (bs13)
(b45s13467) - (b5s1)
(b45s13467) - (bs13)
(b457s13) - (b5s3)
(b45s135) - (bs13)
(b45s13467) - (b4s1)
(b45s135) - (b5s1)
(b45s135) - (b5s3)
b6s348:
(b2i3-e678s34678) - (bs348)
(b2e3-a678s34678) - (bs348)
(b368s3468) - (bs348)
(b678s13478) - (bs348)
(b3-e678s2i34678) - (bs348)
(b34akt68s348) - (bs348)
(b5678s12348) - (bs348)
(b3-j678s34678) - (bs348)
(b36s34678) - (bs348)
(b678s034678) - (bs348)
(b368s3478) - (bs348)
Wanted 10 times:
(22 rules)
Wanted 9 times:
(19 rules)
Wanted 8 times:
(46 rules)
Wanted 7 times:
(70 rules)
Wanted 6 times:
(107 rules)
Wanted 5 times:
(172 rules)
Wanted 4 times:
(385 rules)
Wanted 3 times:
(550 rules)
Wanted 2 times:
(1323 rules)
Wanted 1 time:
(1776 rules)
$
So these might be good candidates for investigation. In fact this list has revealed two B3 rules that have already been investigated in the past but that weren't on my list, namely b35s2 and b3568s0258.
Side note - the scripts may well be buggy, and/or the above definitions may not make sense. (It all made sense to me last night way past midnight, though.)