4 Comments

# attention sub-layer: pre-normalization, then a residual connection back onto x
h = x + self.attention.forward(
    self.attention_norm(x), start_pos, freqs_cis, mask
)
# feed-forward sub-layer: also pre-normalized, with its own residual connection
out = h + self.feed_forward.forward(self.ffn_norm(h))

This is the transformer block code from LLaMA. I really don't understand which part of this code has anything to do with neurons.

If you can help me answer this, I would be very grateful.
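
The closest analogue of "neurons" in that snippet is hidden inside self.feed_forward: it is a small multilayer perceptron, and each of its hidden units (a weighted sum of the inputs followed by a nonlinearity) plays the role of one neuron, while the attention block mainly moves information between positions. A simplified, single-device sketch of what a LLaMA-style feed-forward module looks like, based on the public LLaMA code but with the parallel linear layers replaced by plain nn.Linear for clarity (sizes in the usage example are made up):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FeedForward(nn.Module):
        def __init__(self, dim: int, hidden_dim: int):
            super().__init__()
            # each of the hidden_dim units acts like a "neuron":
            # a weighted sum of the input vector passed through a nonlinearity
            self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
            self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # projection back to the model dimension
            self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # SwiGLU-style activation: SiLU-gated hidden units, then project back down
            return self.w2(F.silu(self.w1(x)) * self.w3(x))

    # tiny usage example with illustrative sizes
    ffn = FeedForward(dim=8, hidden_dim=32)
    out = ffn(torch.randn(2, 5, 8))  # (batch, sequence, dim) -> same shape

Each position's vector is pushed through these hidden units independently; that per-position bank of weighted sums plus nonlinearities is what the article's talk of "neurons" refers to in code like this.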


Conversely, as someone who is new to AI and LLMs, I found this article an absolutely mesmerizing and relatable demystification. Thank you for such a clear exposition!


Hey Sean and Timothy! As someone deeply immersed in the world of AI and language models, I must say your article on large language models is a fantastic primer! You've managed to explain the intricacies without getting lost in jargon, making it accessible to all. Kudos for shedding light on the magic behind these powerful models while keeping it approachable. Looking forward to more enlightening reads from you two! Keep up the great work!

Author's reply: Thanks Adam!
