Tim directed me here - subscribed! Can I cheat and suggest a (probably too long-term) project early?
Build an MMO or sandbox world game and set up LLMs as players to see what happens. Alternatively, equip an LLM to participate in an existing game like this, and see how long it takes before it does something game-breaking or other players notice.
I really like this idea! Not sure if you’ve seen this, but I was actually already planning on writing about some recent research doing something like that.
This one used LLMs in a Sims-like environment and looked at what behavior emerged: https://hai.stanford.edu/news/computational-agents-exhibit-believable-humanlike-behavior
And this one uses LLMs as agents in a software engineering environment (so you've got a CEO LLM, a developer LLM, etc.): https://github.com/OpenBMB/ChatDev
I think there’s a lot of room for interesting studies looking at different kinds of social processes using what’s essentially prompt engineering and hooking up a bunch of LLMs together to see what happens.
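To make "hooking up a bunch of LLMs together" concrete, here's a minimal sketch of two role-prompted agents passing messages back and forth. The `call_llm` function is a stand-in stub (you'd replace it with a real chat-completion API call), and the roles and prompts are purely illustrative, not taken from either of the projects linked above.

```python
# Minimal sketch: wiring two role-prompted LLM "agents" into a dialogue loop.
# call_llm is a stub; swap in a real chat-completion call in practice.

def call_llm(system_prompt: str, message: str) -> str:
    """Stub LLM: returns a canned reply tagged with its role prompt.
    Replace with a real API call to get actual model behavior."""
    return f"[{system_prompt}] responding to: {message}"

def run_dialogue(role_a: str, role_b: str, opening: str, turns: int = 2) -> list:
    """Alternate messages between two agents, each seeded with its own
    system prompt, and return the full transcript."""
    transcript = [opening]
    roles = [role_a, role_b]
    for i in range(turns):
        reply = call_llm(roles[i % 2], transcript[-1])
        transcript.append(reply)
    return transcript

log = run_dialogue("You are the CEO.", "You are the developer.",
                   "Build a to-do app.", turns=2)
for line in log:
    print(line)
```

The "social process" part comes entirely from the system prompts and the message-routing topology; everything else is just a loop, which is what makes these setups cheap to experiment with.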
I had seen the first, but not the second. I certainly like the idea; I just think it's perhaps not open-ended enough. So much debate about AI is "what if it does X?", and the programmer in me just wants to see some testing done to find out!