I don't really understand your post (and I don't know much about GPT-3). Are you suggesting that the model is stateful, in that it can continue learning from successive prompts?
Maybe you can elaborate on this:
>instead, you interact with it, expressing any task in terms of natural language descriptions, requests, and examples, tweaking the prompt until it “understands” & it meta-learns the new task based on the high-level abstractions it learned from the pretraining.
Or are you suggesting that it has such a deep network of abstractions that, once a user starts to map it out, the mileage they can extract from the model via prompts is very exciting?
The model is not stateful, but you can emulate state (certainly with GPT-3, but also with other language models) by simply feeding back earlier output.
For example, to simulate a chatbot, you start with a prompt. You then feed successively longer chunks of the full chat back to the model, taking each newly generated line as the AI's next reply.
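To make that concrete, here is a minimal sketch of the feed-back loop. `echo_model` is a made-up stand-in for a real, stateless model call (it is not any library's API), and the `Human:`/`AI:` labels are just an assumed prompt convention. The point is that any "memory" lives entirely in the transcript string that gets resent each turn.

```python
from typing import Callable


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a stateless language model call.

    It reports how many human turns are visible in the prompt, which shows
    that everything it "remembers" comes from the prompt text alone.
    """
    turns = prompt.count("Human:")
    return f" I can see {turns} human turn(s) in my prompt.\n"


def chat_turn(transcript: str, user_line: str,
              generate: Callable[[str], str]) -> str:
    """Append the user's line, call the model on the whole chat so far,
    and append the first generated line as the AI's reply."""
    transcript += f"Human: {user_line}\nAI:"
    completion = generate(transcript)
    reply = completion.split("\n")[0]  # keep only the first generated line
    return transcript + reply + "\n"


# The starting prompt; every later call sees it plus all earlier turns.
transcript = "The following is a conversation with an AI assistant.\n"
transcript = chat_turn(transcript, "Hello", echo_model)
transcript = chat_turn(transcript, "Do you remember me?", echo_model)
print(transcript)
```

Swapping `echo_model` for a real completion call would leave `chat_turn` unchanged, since it only assumes a function from prompt string to generated text.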
This is essentially how some of the 'use GPT-2 as a chatbot' front ends work. The same idea is extended to make things like AI Dungeon work: you can force the model to keep context within its attention window by providing a good summary in the prompt.
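A rough sketch of the summary trick, under some loud assumptions: I don't know how AI Dungeon actually does it, `build_prompt` is a name I made up, and a character budget stands in for the model's real token limit. The summary stays pinned at the top, and the oldest turns are dropped first when the prompt would get too long.

```python
from typing import List


def build_prompt(summary: str, turns: List[str], max_chars: int) -> str:
    """Pin the summary at the top and keep only the most recent turns
    that fit within max_chars (an assumed proxy for a token limit)."""
    kept: List[str] = []
    used = len(summary) + 1  # +1 for the newline after the summary
    for turn in reversed(turns):  # walk from newest to oldest
        cost = len(turn) + 1      # +1 for the newline after the turn
        if used + cost > max_chars:
            break                 # everything older is dropped too
        kept.append(turn)
        used += cost
    kept.reverse()                # restore chronological order
    return "\n".join([summary] + kept) + "\n"


summary = "Summary: You are a knight hunting a dragon."
turns = ["> go north", "You enter a cave.", "> light torch", "The cave glows."]
max_chars = len(summary) + 35  # deliberately too small for all four turns
prompt = build_prompt(summary, turns, max_chars)
print(prompt)
```

With this budget only the two newest turns survive, but the summary still tells the model what the story is about.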
To speculate a bit on why this seems to work: these models are massive and have read millions of texts in their corpus. Instead of 'retraining' on text which the model has probably already seen, the prompt nudges the model to locate where in its own weights that knowledge is already encoded.
I don't think the claim is that the model is "stateful" in that it continues to learn from prompts. I think it's that the model no longer requires retraining for different situations; instead it has "learned" a set of lower (higher?) level abstractions from which those same (and possibly new) situations can be constructed dynamically from the input prompt.