I don't really understand your post (and don't know much about GPT3) are you sug...

smallnamespace · on July 19, 2020

The model is not stateful, but you can emulate state (certainly with GPT-3, but also with other language models) by simply feeding back earlier output.

For example, to simulate a chatbot, you start with a prompt. You then successively feed longer and longer chunks of the full chat back to the model, taking each newly generated line as the AI's next reply.
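A minimal sketch of that feedback loop, assuming only that the model is some stateless function from prompt text to completion text. The names `chat_turn` and `echo_model` and the "Human:"/"AI:" labels are made up for illustration; `echo_model` is a stand-in, not a real language model API.

```python
from typing import Callable, List


def chat_turn(generate: Callable[[str], str], transcript: List[str], user_line: str) -> str:
    """Append the user's line, feed the whole transcript back, and return the AI's reply."""
    transcript.append(f"Human: {user_line}")
    # The model keeps no state, so the entire chat so far goes into the prompt.
    prompt = "\n".join(transcript) + "\nAI:"
    completion = generate(prompt).strip()
    # Keep only the first generated line; the model may run on and write further turns.
    reply = completion.splitlines()[0].strip() if completion else ""
    transcript.append(f"AI: {reply}")
    return reply


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model: reports how much context it was given."""
    return f" I can see {len(prompt.splitlines())} lines of context.\nHuman: (ignored)"


transcript = ["The following is a conversation with a helpful AI."]
first = chat_turn(echo_model, transcript, "Hello")
second = chat_turn(echo_model, transcript, "Still there?")
print(first)
print(second)
```

All of the "state" lives in `transcript` on the caller's side; the prompt simply grows with each turn.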

This is essentially how some of the 'use GPT-2 as a chatbot' front ends work. The same idea is extended to make things like AI Dungeon work: you can keep the important context within the model's attention window by providing a good summary in the prompt.
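One way the summary trick could look, as a rough sketch rather than a description of how AI Dungeon is actually implemented. The function `build_prompt` is hypothetical, and a character budget stands in for the model's real token limit: the summary is always kept, and only as many of the most recent turns as fit are appended.

```python
from typing import List


def build_prompt(summary: str, turns: List[str], max_chars: int) -> str:
    """Keep the summary, then add the most recent turns that fit in the budget."""
    header = f"Summary so far: {summary}"
    budget = max_chars - len(header)
    kept: List[str] = []
    # Walk backwards from the newest turn so the oldest ones are dropped first.
    for turn in reversed(turns):
        cost = len(turn) + 1  # +1 for the joining newline
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    # Restore chronological order before joining.
    return "\n".join([header] + kept[::-1])


prompt = build_prompt("s", ["aaaa", "bbbb", "cccc"], max_chars=27)
print(prompt)
```

Here the oldest turn, "aaaa", falls out of the prompt, and whatever mattered about it has to survive in the summary line instead.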

To speculate a bit on why this seems to work: these models are massive and have read millions of texts in their corpus. Instead of 'retraining' on text the model has probably already seen, the prompt nudges the model to locate where in its own weights it has already encoded that knowledge.

walleeee · on July 19, 2020

I don't think the claim is that the model is "stateful" in that it continues to learn from prompts. I think it's that the model no longer requires retraining for different situations; instead it has "learned" a set of lower (higher?) level abstractions from which those same (and possibly new) situations can be constructed dynamically from the input prompt.