set up their scheme; presumably each hierarchy can only access the one directly below it. the lowest hidden is implied to be the original token sequence, unchanging. lots of inspiration from the old DEQ paper
-        reasoning_steps = 2, # N in the paper - the number of forward evals for the last network (highest hierarchy) above
-        relative_period: int | tuple[int, ...] = 2 # the relative period for each network evaluation call to the one just previous - in the paper, they do 2 networks with a period of 2
+        reasoning_steps = 2, # N in the paper - the number of forward evals for the last network (highest hierarchy) above
+        relative_period: int | tuple[int, ...] = 2, # the relative period for each network evaluation call to the one just previous - in the paper, they do 2 networks with a period of 2
+        ignore_index = -1
     ):
         super().__init__()
 
@@ -57,34 +64,39 @@ def __init__(
 
             self.networks.append(network)
 
-        assert len(self.networks) > 0
+        self.num_networks = len(self.networks)
+        assert self.num_networks > 0
 
         # setup how frequent each network is called
         # the first network (lowest in the hierarchy) should be called every iteration
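
A minimal sketch of the calling schedule those comments describe, written as standalone Python rather than the repository's actual code. It assumes that each network's absolute period is the product of the relative periods below it, that a network is evaluated on every 1-indexed iteration divisible by its period, and that the total iteration count is `reasoning_steps` times the highest network's period. The helper names `evaluation_periods`, `networks_called`, and `schedule` are hypothetical; only `reasoning_steps`, `relative_period`, and `num_networks` come from the diff.

```python
from __future__ import annotations

from itertools import accumulate
from operator import mul


def evaluation_periods(
    num_networks: int,
    relative_period: int | tuple[int, ...] = 2,
) -> list[int]:
    """Absolute period, in iterations, of each network (lowest first)."""
    assert num_networks > 0
    num_higher = num_networks - 1

    # a single int is broadcast to every network above the lowest one
    if isinstance(relative_period, int):
        relative_period = (relative_period,) * num_higher

    assert len(relative_period) == num_higher, \
        'need one relative period per network above the lowest'

    # the lowest network has period 1 (called every iteration); each higher
    # network's period is the period below it times its relative period
    return list(accumulate(relative_period, mul, initial=1))


def networks_called(step: int, periods: list[int]) -> list[int]:
    """Indices of the networks evaluated at 1-indexed iteration `step`."""
    # assumption: a network fires when the step is divisible by its period
    return [i for i, period in enumerate(periods) if step % period == 0]


def schedule(
    num_networks: int,
    reasoning_steps: int = 2,
    relative_period: int | tuple[int, ...] = 2,
) -> list[list[int]]:
    """Networks evaluated at each iteration of one forward pass."""
    periods = evaluation_periods(num_networks, relative_period)

    # assumption: run long enough for the highest network to be
    # evaluated exactly `reasoning_steps` times
    total_steps = reasoning_steps * periods[-1]
    return [networks_called(step, periods) for step in range(1, total_steps + 1)]


if __name__ == '__main__':
    # two networks, relative period 2, N = 2 (the defaults in the diff)
    for step, called in enumerate(schedule(2), start=1):
        print(step, called)
```

Under these assumptions, the defaults give four iterations in which network 0 runs every time and network 1 runs on iterations 2 and 4, so the highest network is evaluated `reasoning_steps` times.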