so we can think of the varying synaptic strengths of interconnected biological neurons as evolutionary pre-pretraining for language (with school as pretraining and vocational training as training), and we should translate the learnings of that pre-pretraining into a geometric understanding for computational linguistics, so that we can apply them as inductive biases to network architecture and weight initialization
craziest shit i've ever read
hopefully crazy in an interesting/good way!