Researchers are starting to unravel one of the biggest mysteries behind the AI language models that power text and image generation tools like DALL-E and ChatGPT.
For a while now, machine learning researchers have noticed something strange about large language models (LLMs) like OpenAI’s GPT-3 and Google’s LaMDA: they are inexplicably good at carrying out tasks that they haven’t been specifically trained to perform. It’s a perplexing phenomenon, and just one example of how difficult, if not impossible, it can be to explain in fine-grained detail how an AI model arrives at its outputs.
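To make the phenomenon concrete, here is a minimal sketch of what researchers call few-shot "in-context learning," the clearest example of this behavior. The prompt format mirrors the demonstrations in OpenAI's GPT-3 paper; the expected completion is illustrative, not output captured from any specific model.

```python
# A minimal sketch of in-context learning: the model is shown a few
# examples of a task inside the prompt and completes the pattern,
# without any weight updates or task-specific fine-tuning.

few_shot_prompt = """Translate English to French.

sea otter => loutre de mer
cheese => fromage
plush giraffe => girafe en peluche
mint =>"""

# Sent to a model like GPT-3, the expected completion is "menthe".
# The model infers the task purely from the handful of examples in
# its context window -- which is exactly the behavior that puzzles
# researchers, since nothing in training told it to "learn" this way.
print(few_shot_prompt)
```

The model was trained only to predict the next word in text, yet from three example pairs it behaves as if it had been trained as a translator.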