Extending GPT-3’s context window infinitely by storing context in GPT-3 itself or in secondary layers

Let’s give GPT-3 what it needs to power an AGI. Feedback is welcome.


OpenAI’s GPT-3 is one the best (and most underrated) things that happened to mankind in 2020. It proved that it is possible for an AI to surpass the zero-shot and few-shot learning abilities of most humans on a huge range of general tasks, both logical and creative, and pass the duck test for human intelligence (aka Turing test) with flying colors.


What about training a meta model that would learn where and how to best adjust GPT-3 parameters so they can “learn” new knowledge, preserved, contextualized and linked to the huge swath of existing knowledge already in the network?


What I propose here is the simplest mechanism allowing for an AI to find its way to modify only a small part of its existing model parameters to learn a specific input, without the need to retrain the model or change its architecture. I believe this is a key step in the right direction towards a more general AI able to learn, improve and contextualize its knowledge, and leverage it in a robust and scalable way.

Infinity skyscrapers
Infinity skyscrapers
Source: Image by joelfilip on Unsplash

Humble problem solver. AGI enthusiast. Software developer. Investor. Growth hacker. Human.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store