The newfangled ‘text-to-anything’ generative models that are taking the world by storm have given birth to something hot: ‘prompt engineering’. This skill ensures that the machine interprets human language exactly in the way the user intended. A sudden surge in courses, resource materials, and job listings on prompt engineering is a testament to its importance.
But with AI models getting better, the word on the AI streets is that prompt engineering as a skill may already be getting obsolete, which means it will not require behind-the-scenes engineering to understand the user. So, is this new job role – deemed as ‘the career for the future’ – only a passing fad?
Decoding Prompt Engineering
Simon Willison, the creator of the Datasette project, discusses in his blog that there are two different meanings of the term:
(i) Prompting, which involves having an expert-level knowledge of writing prompts for language models: Now that there is a copilot for everything ranging from law to marketing to accounting, learning what prompt to give will help elicit a favourable and contextual response. This prompting must be based on an understanding of how the language model works.
(ii) Prompting, which involves writing software on top of language models by carefully constructing prompts to send to them: Companies utilise prompt engineering to describe how the bot should work, and prescribe responses to certain questions.
Download our Mobile App
An example for the second case is the speculation that Microsoft’s Bing Chat, unlike OpenAI’s ChatGPT, is not trained on the Reinforcement Learning from Human Feedback (RLHF) technique, but on simple prompt engineering. This may be behind the weird responses the chatbot gives to user queries.
Bing Chat gave provocative responses in some conversations.
My new favorite thing – Bing’s new ChatGPT bot argues with a user, gaslights them about the current year being 2022, says their phone might have a virus, and says “You have not been a good user”
Why? Because the person asked where Avatar 2 is showing nearby pic.twitter.com/X32vopXxQG
— Jon Uleis (@MovingToTheSun) February 13, 2023
Microsoft acknowledged this to be a “non-trivial scenario” and said that correcting this would require a lot of prompting, which they’re getting to. Hence, AI companies highly leverage prompting to deploy language models at scale.
A Bug, Really?
However, it seems like prompt engineering only had a short stint since experts opine that “prompt engineering is a bug, rather than a feature of language models”.
Melanie Mitchell, a professor at Santa Fe Institute, mentions that the need for prompt engineering is a sign of the lack of robust language understanding, and scaling LLMs would not necessarily reduce its need. This was also seconded by MetaAI’s Yann LeCun.
A Twitter user also discussed the possibility of scaling RLHF – a reward-based learning system for LLMs – to gigantic data to improve interaction to the point where prompt engineering becomes less relevant. However, to this, LeCun said that while it will certainly mitigate the problem by improving its reliability in common questions, it will not fix it since the distribution of questions given to LLMs are very, very large.
Willison provides a rebuttal to this notion arguing that language models are effectively the world’s most complicated black box system, and therefore understanding the wider context, such that the AI understands the prompt in the right way is important. He describes that the job of a prompt engineer is to constantly run experiments to document findings on the effectiveness of prompts, iterate on prompts, and figure out exactly which components of the prompt are essential, and which are just a waste of tokens.
Similarly, AI researcher John Nay argued that prompt engineering (PE) will be needed for all models in the future because “PE is whatever most robustly specifies humans’ inherently vague and underspecified goals. We can never fully list what we intend LLMs to do in all future scenarios.”
Where will Prompting Fit in?
The question of whether AI can understand natural language – with all its ambiguity and complexity – has been of stark disagreement in the AI research community. The contention is that although contemporary language models can establish correlations between words within a sentence, they don’t have a true understanding of the language they are speaking. Therefore, while they can predict the next word in a sentence, they often fall short in providing an appropriate contextual response.
Meanwhile, there are various alternative approaches to address this lack of understanding that are currently in the works. For example, Dave Ferrucci, founder of Elemental Cognition, is taking a “hybrid” approach that uses language models to generate hypotheses as an “output” and then performing reasoning on top using “causal models”.
Yoshua Bengio, one of the pioneers in deep learning, had earlier told AIM that he was currently working on notions of causality in neural networks, adding that the networks that exist today are trained on massive amounts of datasets without being able to reason with that knowledge as consistently as humans do.
Recently, neuro-symbolic AI, which, along with neural networks, also utilises symbolic AI that can grasp compositional and causal knowledge to be able to generalise AI models, have gained momentum.
what is at issue is how intensely we should study neurosymbolic AI as opposed eg to scaling LLMs. that’s a huge research choice, with (i suspect) huge consequence
— Gary Marcus (@GaryMarcus) September 27, 2022
These models are all working towards one cause – to improve natural language understanding in AI. Therefore, AI development is destined to reduce the need for prompting. In response to Mitchell’s tweet, Dipam Chakraborty, a machine learning engineer at AICrowd, argues that some prompt engineering will always be needed despite the progress made in language models.
While I agree to some extent that the current level of prompt engineering indicates lack of robust language understanding. However some prompt engineering will always be needed because a prompt is inherently underspecified all but the simplest of cases.
— Dipam Chakraborty (@__dipam__) February 28, 2023
Therefore, while the claim that prompt engineering is dead may seem like an overstatement, it is likely that its popularity might wane.