Discussion about this post

Mike X Cohen

Nice write-up, Sean. Interesting to see how different people from different backgrounds learn about LLM mechanisms. If I may be so audacious as to humbly suggest my 90+ hour course on LLM architecture, training, and mechanistic interpretability, using ML methods to investigate internal activations during inference: https://github.com/mikexcohen/LLM_course

Christopher Riesbeck

+1 for abolishing entries in a mental lexicon, Sean. When we were developing language understanding systems for episodic, knowledge-based reasoning at Yale in the late 1980s, it became clear that language comprehension needed all knowledge, not just some small bits crammed into a lexicon. For you, Elman was the inspiration. For me, it was Quillian's Teachable Language Comprehender (https://dl.acm.org/doi/10.1145/363196.363214). TLC understood phrases like "the lawyer's client" or "the doctor's patient" by finding the connecting paths in a semantic network. TLC was a model with no lexicon! Our application of that idea to our episodic knowledge networks was Direct Memory Access Parsing, a model of language understanding as lexically-cued memory recognition. Will Fitzgerald and I wrote a non-technical introduction to the idea in a response to Gernsbacher's Language Comprehension as Structure Building (https://www.cogsci.ecs.soton.ac.uk/cgi/psyc/newpsy?5.38). More technical points are in https://users.cs.northwestern.edu/~livingston/papers/others/From_CA_to_DMAP.pdf.
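For readers who haven't met the connecting-path idea before, here is a minimal sketch of it, assuming a toy semantic network; the node names, relations, and the breadth-first search are illustrative stand-ins, not Quillian's actual knowledge base or algorithm.

```python
# A minimal sketch of the "connecting paths" idea behind TLC-style comprehension.
# The toy network and BFS below are illustrative, not TLC's actual representation.
from collections import deque

# Each concept maps to a list of (relation, neighbor) links.
SEMANTIC_NET = {
    "lawyer":       [("isa", "professional"), ("serves", "client")],
    "doctor":       [("isa", "professional"), ("treats", "patient")],
    "client":       [("isa", "person"), ("employs", "lawyer")],
    "patient":      [("isa", "person"), ("treated-by", "doctor")],
    "professional": [("isa", "person")],
    "person":       [],
}

def connecting_path(start, goal, net=SEMANTIC_NET):
    """Breadth-first search for a chain of relations linking two concepts."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in net.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None  # no connection found

# "the lawyer's client": the possessive is interpreted by whatever path connects the two.
print(connecting_path("lawyer", "client"))
# [('lawyer', 'serves', 'client')]
```

The point of the toy example is only that the interpretation falls out of the knowledge network itself, with no per-word lexical entry doing the work.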

Quillian 1969, myself 1986, Elman 2009 -- we're due for another attempt to dump the mental lexicon. It all depends -- as it should -- on the quality of the knowledge base.

