It is often believed that Sisyphus was a king who was forced to forever roll a stone up a mountain slope as a punishment by Zeus for cheating death.
The truth, as is often the case with the Greek myths, is different. In fact, Sisyphus was not a king at all. He was an investor in, and operator of, an early manual gravity energy storage device.
Off-topic: I presume you've seen the new "Grokking" paper by a small OpenAI team at the "The Role of Mathematical Reasoning in General Artificial Intelligence" ICLR workshop :-)
1) The effect in the left image of Figure 1 is quite striking. Figure 4 is also quite remarkable.
2) It might be the case that for precisely defined synthetic tasks the effects tend to be "more pronounced" (and tend to lead to the ability to solve the task exactly). It's premature to make this kind of general pronouncement, especially about the ability to solve the task exactly, but this paper seems to push us to at least consider this kind of conjecture.
3) If 2 is actually true, then one notices that these are the conditions of a typical program synthesis problem: a modestly sized problem precisely defined by a few constraints (the expected test results). So it might be that modestly sized models (like the small transformer used in this paper) will actually be able to solve these tasks, because the data size is really small, so the desirable "superoverfitting area" is not too far...
1) Agreed, both are pretty striking, although still in line with what we have seen all along.
2) It is an interesting conjecture, but I would like to see more evidence.
3) I suspect that it is exactly the case. For small problems the highly over-parameterized solution is still within reach. For large problems of this type it may be impossible to get to it at all.