Solving A Machine-Learning Mystery
MIT researchers found that massive neural network models similar to large language models can contain smaller linear models inside their hidden layers, which the larger models can train to complete a new task using simple learning algorithms.

Image by Jose-Luis Olivares, MIT

Large language models like OpenAI’s GPT-3 are massive neural networks that can generate human-like text, from poetry to programming code. Trained on troves of internet data, these machine-learning models take a small bit of input text and then predict the text that is likely to come next. But that’s not all these models can do. Researchers…
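To make the finding concrete, here is a minimal sketch of what "a smaller linear model trained with a simple learning algorithm" could look like in isolation. This is an illustrative toy example, not the researchers' code: it fits a linear model to a handful of in-context (input, output) examples of an unseen linear task using plain gradient descent, then predicts a new query point, which is the kind of behavior the study suggests a large model can carry out implicitly inside its layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# An unseen "task": y = w_true . x, revealed only through a few examples,
# analogous to the examples given in a model's prompt.
dim, n_examples = 4, 16
w_true = rng.normal(size=dim)
X = rng.normal(size=(n_examples, dim))
y = X @ w_true

# Simple learning algorithm: a few steps of gradient descent on squared error.
w = np.zeros(dim)
lr = 0.05
for _ in range(200):
    grad = X.T @ (X @ w - y) / n_examples
    w -= lr * grad

# Use the fitted linear model to answer a new query from the same task.
x_query = rng.normal(size=dim)
print("prediction:", x_query @ w, "target:", x_query @ w_true)
```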