Since its launch, ChatGPT has been making waves in the AI sphere, attracting over 100 million users in record time. The secret sauce to ChatGPT’s impressive performance and versatility lies in an art subtly nestled within its programming – prompt engineering.
The launches of DALL-E, Midjourney, and Stable Diffusion underscored the disruptive potential of generative AI. But it was OpenAI’s ChatGPT that truly took center stage in late 2022. And this momentum showed no signs of slowing down.
Google’s Bard and Meta’s Llama 2, announced in response to OpenAI’s ChatGPT, have significantly amplified the momentum of the AI race. By providing these models with inputs, we’re guiding their behavior and responses. This makes us all prompt engineers to a certain degree. The tech industry has taken notice. Venture capitalists are pouring funds into startups focused on prompt engineering, like Vellum AI. Moreover, Forbes reports that prompt engineers command salaries exceeding $300,000, indicative of a thriving and valuable job market.
In this article, we will demystify the world of prompt engineering in the context of ChatGPT. We’ll explore the core principles, delve into the techniques, and examine their commercial applications.
Before we delve into prompt engineering, let’s first understand how ChatGPT functions and the pivotal role prompt engineering plays in this process. In upcoming articles, we will discuss some of these advanced techniques in depth, so stay tuned!
The Mechanics Behind ChatGPT Prompt Engineering
The secret sauce behind ChatGPT’s success is its ability to understand and mimic the nuances of human conversation. The model is trained on a diverse range of internet text, but crucially, it does not know specific documents or sources in its training set, ensuring generalization over specificity. This training approach allows ChatGPT to generate creative responses, navigate complex dialogues, and even exhibit a sense of humor. However, it’s important to remember that ChatGPT doesn’t truly understand or have beliefs; it generates responses based on patterns it learned during training.
ChatGPT’s ability to understand and respond to human language in a nuanced and context-aware manner makes it an incredibly versatile tool.
The mechanics of these models rest on the concept of ‘tokens’—discrete chunks of language that can range from a single character to a whole word. These models work with a specific number of tokens at a time (4,096 for GPT-3.5-Turbo; 8,192 or 32,768 for GPT-4), predicting the next sequence of likely tokens. The models process the tokens using complex linear algebra, predicting the most probable subsequent token.
For instance, a simple nine-word sentence could turn into ten tokens. Even a single complex word could turn into multiple tokens, which helps the model better understand and generate language. Even languages with complex word structures, like German, are managed through tokenization.
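Exact token counts come from the model’s own tokenizer (OpenAI publishes one as the tiktoken library), but for quick estimates a common rule of thumb is roughly four characters of English text per token. A minimal sketch of that heuristic:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token rule of thumb
    for English text. Exact counts require the model's own tokenizer
    (e.g. OpenAI's tiktoken library); real tokenizations vary by language
    and vocabulary."""
    return max(1, len(text) // 4)

# By this heuristic, a 4,096-token context window holds very roughly
# 16,000 characters of English text.
print(estimate_tokens("Even a single complex word could turn into multiple tokens."))
```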
GPT (Generative Pre-trained Transformer) models work by generating one token at a time, starting from the first token and continuing up to the last. After generating each token, the model revisits the entire sequence and processes it again to generate the next token. This iterative process continues until the final token is generated, making the generation of text similar to a big for-loop.
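The ‘big for-loop’ above can be sketched with a toy example. The lookup table stands in for a real model’s learned weights (that substitution is the only liberty taken); the loop structure, generate a token, append it, re-read the whole sequence, mirrors the real process:

```python
import random

# Toy table of possible continuations for the last token. A real model
# scores every token in its vocabulary using learned weights instead of
# consulting a lookup table.
NEXT = {
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["ran"],
    "sat": ["<end>"],
    "ran": ["<end>"],
}

def generate(prompt_token: str, max_tokens: int = 10) -> list:
    """Generate one token at a time, revisiting the sequence each step:
    the 'big for-loop' described above."""
    tokens = [prompt_token]
    for _ in range(max_tokens):
        candidates = NEXT.get(tokens[-1])
        if not candidates:
            break
        nxt = random.choice(candidates)
        if nxt == "<end>":  # the model emitted its stop token
            break
        tokens.append(nxt)
    return tokens

print(generate("the"))  # e.g. ['the', 'cat', 'sat']
```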
But here’s the catch – the quality of these responses largely depends on the prompts the model receives. Just like steering a conversation with a human, guiding a dialogue with ChatGPT requires a certain level of skill and understanding. That’s where prompt engineering comes into play.
What is Prompt Engineering?
Prompt Engineering is the art of crafting precise, effective prompts/input to guide AI (NLP/Vision) models like ChatGPT toward generating the most cost-effective, accurate, useful, and safe outputs.
Prompt engineering is not just confined to text generation but has wide-ranging applications across the AI domain. It is increasingly being employed in areas like robotic process automation bots, 3D assets, scripts, robot instructions, and other types of content and digital media. It provides a unique blend of logic, coding, art, and in certain cases, special modifiers.
While a prompt can include natural language text, images, or other types of input data, the output can vary significantly across AI services and tools. Every tool has its special modifiers that describe the weight of words, styles, perspectives, layout, or other properties of the desired response.
This field is essential for creating better AI-powered services and obtaining superior results from existing generative AI tools. Enterprise developers, for instance, often utilize prompt engineering to tailor Large Language Models (LLMs) like GPT-3 to power a customer-facing chatbot or handle tasks like creating industry-specific contracts.
This iterative process of prompt refinement and measuring AI performance is a key element in enabling AI models to generate highly targeted, useful responses in various contexts.
Becoming a Prompt Engineer
The GPT-4 model’s prowess in comprehending complex instructions and solving intricate problems accurately makes it an invaluable resource. However, there are different methods to access this model’s capabilities, and understanding these can be crucial to a prompt engineer’s role in optimizing both efficiency and cost-effectiveness.
There are essentially two ways to engage with GPT models like GPT-4. One way is through OpenAI’s API, where costs are calculated per input and output tokens. In this scenario, the costs per 1K tokens can fluctuate based on the context size. For example, in a larger 32K context, the cost could go up to $0.06 for input tokens and $0.12 for output tokens. Thus, when handling a high volume of requests, the cost of usage can quickly accumulate.
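At those rates, cost estimation is simple arithmetic. The helper below hard-codes the 32K-context prices quoted above; actual prices vary by model and change over time, so treat the figures as placeholders:

```python
def api_cost(input_tokens: int, output_tokens: int,
             in_price_per_1k: float = 0.06,
             out_price_per_1k: float = 0.12) -> float:
    """Estimate a request's cost from its token counts, using the
    32K-context GPT-4 prices quoted above ($0.06 / $0.12 per 1K tokens).
    Check OpenAI's pricing page for current figures."""
    return (input_tokens / 1000 * in_price_per_1k
            + output_tokens / 1000 * out_price_per_1k)

# 10,000 requests of ~1K input and ~1K output tokens each:
print(f"${api_cost(1000, 1000) * 10_000:,.2f}")  # → $1,800.00
```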
Alternatively, ChatGPT, which leverages the GPT-4 model, operates on a subscription-based model.
Another key task in prompt engineering involves model parameter tuning. This process involves adjusting the variables that the model uses to make predictions. By fine-tuning these parameters, prompt engineers can improve the quality and accuracy of the model’s responses, making them more contextually relevant and helpful.
Although many prompt engineers come from a tech background, the field’s multidisciplinary nature makes it accessible to individuals from diverse professional backgrounds. Increasingly, writers, researchers, and even artists are leveraging their unique skills to enhance the efficacy of AI models. This shift is reflected in the job market as well, with a growing number of companies seeking prompt engineers with a diverse range of skills and backgrounds.
Prompt Design and Engineering in ChatGPT
Prompt Design and Engineering play a critical role in optimizing the performance of language models and involve more than just crafting questions; they necessitate a deep understanding of the AI model and a highly iterative and refined approach.
Prompt Design
Prompt design, at its core, is the art and science of creating the perfect prompt for a given large language model (LLM), like ChatGPT, to achieve a clearly stated goal. It’s a blend of:
- Understanding of the LLM: Different language models may respond variably to the same prompt. Moreover, certain language models may have unique keywords or triggers, which they interpret in a specific way.
- Domain Knowledge: Expertise in the relevant field is crucial when designing prompts. For instance, generating a prompt to infer a medical diagnosis would require medical knowledge.
- Iterative Approach and Quality Measurement: The process of creating the ideal prompt often involves trial and error. Therefore, having a way to evaluate the output quality beyond just subjective judgment is vital, particularly when using the prompt at a larger scale.
Prompt Engineering
Prompt engineering is an extended realm of prompt design that includes several critical processes:
- Design of Prompts at Scale: This process includes the design of meta prompts (prompts that generate other prompts) and prompt templates, which are parameterized prompts that can be instantiated at runtime.
- Tool Design and Integration: Prompts can sometimes include results from external tools, and integrating these tools seamlessly into the system is crucial.
- Workflow, Planning, and Prompt Management: Deploying an LLM application, like a chatbot, often requires managing libraries of prompts, planning and choosing the right prompts, and efficiently integrating various tools.
- Evaluation and Quality Assurance of Prompts: This aspect includes defining metrics and processes to evaluate the prompts both automatically and with human involvement.
- Prompt Optimization: The cost and latency of the AI model can depend on the choice of the model and the length of the prompt (number of tokens).
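As a concrete illustration of prompt templates, here is a minimal parameterized prompt built with Python’s standard library; the field names are invented for the example, not taken from any framework:

```python
from string import Template

# A parameterized prompt template, instantiated at runtime: one of the
# "design of prompts at scale" building blocks described above.
SUMMARY_TEMPLATE = Template(
    "Summarize the following $doc_type in at most $max_words words, "
    "for an audience of $audience.\n\nText: \"\"\"\n$text\n\"\"\""
)

prompt = SUMMARY_TEMPLATE.substitute(
    doc_type="earnings report",
    max_words=150,
    audience="retail investors",
    text="(report text here)",
)
print(prompt)
```

The same template can then be filled in at runtime with whatever document and audience the application is handling.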
Many automated prompt design approaches, tools, and frameworks have been developed to manage prompts at scale. However, it’s important to understand that none of these tools can replace the nuanced understanding, judgment, and experience of a seasoned prompt engineer.
Prompt Engineering: Techniques and Best Practices
1) Prompt Placement and Description
Placing instructions at the outset of the prompt can significantly impact the AI’s understanding and response. Consider the task of summarizing a text in a bullet point format. A less effective approach would be:
User: Summarize the text below as a bullet point list of the most important points.
{text input here}
On the other hand, a more effective technique could be:
User: Summarize the text below as a bullet point list of the most important points.
Text: “””
{text input here}
“””
By clearly separating the instructions and the context with symbols such as ### or “””, the AI model can better delineate its task and the text it needs to process, thereby giving more accurate and relevant responses.
Secondly, being specific, descriptive, and detailed about the context, expected outcome, length, format, style, and so forth can significantly enhance the precision of AI responses. Suppose you need to generate a brief overview of a scientific paper in an engaging, layperson-friendly style. A vague instruction like “Summarize the scientific paper” would yield less effective results compared to a specific, detailed prompt:
User: Generate a brief (approx. 200 words), engaging summary of the following scientific paper. The summary should be understandable to someone with no scientific background, while accurately reflecting the paper’s key findings.
Paper: “””
{scientific paper text here}
“””
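This instruction-first, fenced-text structure is easy to enforce programmatically. A small helper (the function name is illustrative, not from any library):

```python
def build_prompt(instruction: str, text: str, delimiter: str = '"""') -> str:
    """Place the instruction first, then clearly fence the text to be
    processed, mirroring the structure recommended above."""
    return f"{instruction}\n\nText: {delimiter}\n{text}\n{delimiter}"

prompt = build_prompt(
    "Summarize the text below as a bullet point list of the most important points.",
    "(text input here)",
)
print(prompt)
```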
2) Persona Pattern
The Persona Pattern is a strategy that sets the AI to function from a certain perspective, such as a professional role or fictional character. This tactic enables the model to provide outputs related to the specific roles the persona would handle. For instance, one could instruct the AI to act as a cybersecurity expert during a code review. This pattern is particularly useful when users need assistance but are unsure about the exact details required in the output.
Take the following as an example:
User: “Assume the persona of a senior historian specializing in the French Revolution. Given that persona, explain the key events and reasons leading to the downfall of the French monarchy.”
In this scenario, the AI is expected to respond with an in-depth analysis of the French Revolution, focusing on the fall of the monarchy, as a historian would.
Introducing New Information
ChatGPT doesn’t possess knowledge beyond its training cutoff, so it can’t provide information on events or updates that occurred after its training. However, you can introduce new information to the model via your prompts.
User: “What is cryptocurrency?”
ChatGPT: “Cryptocurrency is a type of digital or virtual currency that uses cryptography for security. The most well-known cryptocurrency is Bitcoin.”
Now, if we add a bit more context to the prompt, we can make the explanation even more intuitive:
User: “Explain the concept of cryptocurrency to a high school student studying digital finance”
ChatGPT: “Imagine you’re playing an online video game and there’s game money you can earn to buy upgrades or new characters. Cryptocurrency is kind of like that, but instead of being limited to the game, you can actually use it to buy real-world goods and services. It’s like digital money that no bank or government controls.”
3) Prompt Size Limitations
Understanding the size limitation of ChatGPT is essential as it directly impacts the volume and type of information we can input. Language models are not built to process unlimited data at once. They have an inherent constraint on the size of the prompt we can create and input. This limitation has profound implications for the design and execution of the prompts.
ChatGPT has a token limit (4,096 tokens for GPT-3.5-Turbo), which includes both the prompt and the generated response. This means long prompts can limit the length of the response. Therefore, it’s important to keep prompts concise yet informative.
In practical usage, we must act as editors, selectively choosing the most relevant information for the task at hand. Imagine writing a paper or an article with a word or page limit – you can’t just dump random facts, but carefully select and structure information relevant to the subject.
User: “Given the token limitations of the model, summarize the key events of World War II in less than 1000 words.”
By acknowledging the model’s token limitations, this prompt directs the AI to provide a concise yet comprehensive summary of World War II.
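When the input text itself exceeds the budget, it has to be trimmed before it is sent. A crude sketch using the rough four-characters-per-token heuristic; a production system would count tokens with the model’s actual tokenizer:

```python
def fit_to_budget(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Trim text to an approximate token budget using the ~4
    characters-per-token heuristic. Prefers cutting at a sentence
    boundary so the model receives complete sentences."""
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    # Cut at the last sentence boundary inside the budget when possible.
    cut = text.rfind(". ", 0, max_chars)
    return text[: cut + 1] if cut != -1 else text[:max_chars]
```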
Prompts for Repeated Use: In a conversational scenario, you can re-use prompts or refine them based on previous responses. This gives a conversational flow and maintains context in the interaction.
4) Question Refinement Pattern
ChatGPT can also help refine your questions. For example, if you ask a question, it can suggest a better-formulated question for more accurate results.
The Question Refinement Pattern involves the LLM refining the questions asked by the user. It’s particularly useful when users are not experts in a domain or are unsure of how to phrase their question.
An example prompt may look like this:
User: “Whenever I ask a question about data science, suggest a more refined question considering statistical analysis specifics and ask if I want to proceed with the refined question.”
Here, the AI is expected to refine the user’s general data science questions to more detailed questions considering statistical analysis aspects.
5) Cognitive Verifier Pattern
The Cognitive Verifier Pattern breaks down complex questions into smaller, manageable sub-questions. The answers to these sub-questions are then combined to provide a comprehensive answer to the main question. This approach is useful when initial questions are too high-level or when additional clarity is needed to provide a complete answer.
For instance:
User: “If I ask a question about climate change, break it down into three smaller questions that would help you provide a more accurate answer. Combine the answers to these sub-questions to give the final answer.”
In this scenario, the AI will break down the complex question about climate change into three manageable sub-questions. After receiving answers to these sub-questions, the AI will then combine these to provide a more comprehensive and accurate response.
This pattern involves generating multiple questions to obtain a more accurate final answer. It’s a process of verification and consolidation of information for a robust final output.
6) Few-Shot Prompting Approach
Few-shot prompting is a powerful approach for teaching AI models to follow specific patterns or carry out tasks. The idea is to feed the model with a number of examples before asking the desired question. This helps to set the desired behavior and guide the model’s responses.
Consider a scenario where a user is attempting to draft a concise, compelling marketing copy for a new tech gadget:
User: “Here are some examples of tech marketing copies: ‘Experience music like never before with our cutting-edge wireless headphones.’ ‘Capture your world in stunning 4K with our ultra-sleek action camera.’ Given these examples, draft a marketing copy for our new AI-powered smartwatch.”
ChatGPT: “Empower your lifestyle with our AI-powered smartwatch, your companion for health, connectivity, and productivity.”

The remarkable feature of few-shot learning is that it doesn’t necessitate a detailed description of how to perform the task. Instead, it requires demonstrations of what’s being done, which can make it a good fit for many applications.
Furthermore, this technique extends beyond simple input-output pairing. The model can also be taught to handle intermediate steps to achieve a goal.
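Few-shot prompts like the one above are easy to assemble programmatically once you have a pool of examples. A minimal, illustrative builder:

```python
def few_shot_prompt(task: str, examples: list, request: str) -> str:
    """Assemble a few-shot prompt: task framing, demonstration examples,
    then the actual request, following the structure shown above."""
    shots = "\n".join(f"- {ex}" for ex in examples)
    return f"{task}\n{shots}\n\n{request}"

prompt = few_shot_prompt(
    "Here are some examples of tech marketing copy:",
    ["Experience music like never before with our cutting-edge wireless headphones.",
     "Capture your world in stunning 4K with our ultra-sleek action camera."],
    "Given these examples, draft a marketing copy for our new AI-powered smartwatch.",
)
print(prompt)
```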
7) Chain-of-Thought Prompting
In their research paper, “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” Jason Wei and team aim to stimulate reasoning within large language models. Rather than using traditional, direct prompts, the team makes use of a series of interconnected prompts, forming a ‘chain-of-thought.’
For instance, a user might be interested in the potential impact of artificial intelligence on job markets:
User: “Could you briefly explain the concept of artificial intelligence?”
ChatGPT: “Artificial Intelligence, or AI, refers to the capability of a machine to imitate intelligent human behavior. It’s about designing machines that can think and learn like humans, enabling them to perform tasks such as language translation, decision-making, and problem-solving, among others.”
User: “How is AI influencing the current job market?”
ChatGPT: “AI is both creating new jobs and changing the nature of existing ones. On one hand, it automates routine tasks, leading to job displacement in certain sectors. On the other hand, it’s creating demand for roles that didn’t previously exist, such as AI ethics officers, data analysts, and machine learning engineers. Additionally, AI can augment human roles, allowing people to accomplish tasks more efficiently and accurately.”
Chain-of-thought prompting follows a progression of related prompts, maintaining a ‘conversation’ with the model that guides it towards more precise and contextually appropriate responses.
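In practice, maintaining such a conversation means resending the accumulated message history with each request. A sketch using the message format of OpenAI’s Chat Completions API; the truncated assistant reply is a placeholder:

```python
# Chain-of-thought as a running message list: each turn is appended so the
# model sees the full progression of related prompts. The role/content
# message shape follows OpenAI's Chat Completions format.
history = [
    {"role": "user",
     "content": "Could you briefly explain the concept of artificial intelligence?"},
    {"role": "assistant",
     "content": "Artificial Intelligence, or AI, refers to ..."},
]

def ask(history: list, question: str) -> list:
    """Append the next question in the chain; the full history would be
    sent with the API request so context carries forward."""
    return history + [{"role": "user", "content": question}]

history = ask(history, "How is AI influencing the current job market?")
```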
Understanding ChatGPT API and Prompt Engineering from a Developer’s Perspective
From a developer or researcher’s perspective, prompt engineering is a more complex process. It involves the design of algorithms and techniques to help the AI model understand and respond to prompts more effectively. This could include strategies for handling ambiguous prompts, identifying the context of prompts, or chaining prompts together to stimulate more complex thought processes in the AI.
Consider a typical architecture: a user interacts with a chat interface powered by GPT-4, and their input is enhanced for clarity and contextual consistency by a specialized module before being fed to the AI model. The role of this module is to refine the user’s input for better understanding while also maintaining the context of the conversation. The AI’s response, crafted from the refined prompt, is returned to the user through the chat interface, and the interaction history is updated at each turn to preserve conversational context. Overall, this is a dynamic user-AI conversation flow enabled by prompt engineering techniques.
The ChatGPT API’s interface is engineered with various hyperparameters that enable users to refine the AI’s responses to prompts, making them more effective and versatile. These include the Temperature, Maximum Length, and Stop Sequences parameters. Let’s explore how they function and can be utilized in different contexts.
1) Temperature
The temperature parameter influences the randomness of the AI’s responses. A higher temperature, such as 1.0, encourages more random output, resulting in diverse but potentially off-topic responses. In contrast, a lower temperature, like 0.2, prompts the AI to select more deterministic responses, which can be beneficial for focused and specific outputs but may lack variety.
Example: If you’re writing a creative story and want unpredictable, imaginative elements, set the temperature to a higher value. A prompt could be: “Write a short story about a time-traveling cat.” With a high temperature, you might get a wild, fantastical tale with unpredictable twists and turns.
2) Maximum Length
This parameter controls the maximum token length of the model’s output, which includes both the tokens in the message input and message output. Setting a higher limit allows for more extended responses, while a lower limit keeps the output short and concise.
Example: For brief answers to trivia questions like, “Who won the Nobel Prize in Literature in 2020?” you might want to set the maximum length to a low value, ensuring the response is concise and direct.
3) Stop Sequences
Stop sequences are specific strings of text where, when the model encounters them, it ceases generating further output. This feature can be useful for controlling the length of the output or instructing the model to stop at logical endpoints.
Example: For a formal letter, you could use “Yours sincerely,” as a stop sequence to ensure the AI does not generate any additional text beyond the proper ending of the letter.
4) Top P
The ‘Top P’ parameter, also known as nucleus sampling, is a method that provides a dynamic selection of the number of words considered at each step of the model’s predictions. A lower value, like 0.5, leads to safer, more focused outputs. A higher value, like 0.9, includes a broader selection of words, leading to more diverse outputs.
Example: If you’re creating an AI to write poems and want it to use a wide array of vocabulary and stylistic phrases, you might set ‘Top P’ to a higher value. A prompt could be: “Compose a poem about autumn’s beauty.”
5) Frequency Penalty
Frequency Penalty controls how strongly the model is discouraged from repeating tokens it has already used frequently. A higher penalty (the API accepts values up to 2) pushes the model toward less common, more varied wording, while a lower or negative value (down to -2) lets the model reuse common words freely.
Example: If you’re trying to generate a business proposal and want to avoid jargon, you might set the frequency penalty to a lower value to keep the language clear and accessible.
6) Presence Penalty
The Presence Penalty parameter penalizes tokens that have already appeared in the conversation so far. Higher values push the model to introduce new topics and concepts, while lower values make it more likely to stay on the topics already mentioned.
Example: For brainstorming sessions, you might set the presence penalty to a higher value to encourage a diverse array of ideas. You could use a prompt like, “Generate innovative marketing strategies for an online educational platform.”
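Taken together, these hyperparameters travel as fields of the request payload. A sketch using the parameter names exposed by the OpenAI Chat Completions API; the values are illustrative, not recommendations:

```python
# Sketch of a Chat Completions request payload combining the parameters
# discussed above. Field names follow the OpenAI API; values are examples.
payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "user",
         "content": "Write a short story about a time-traveling cat."},
    ],
    "temperature": 1.0,        # high: diverse, imaginative output
    "max_tokens": 500,         # cap on generated tokens
    "stop": ["THE END"],       # generation halts at this string
    "top_p": 0.9,              # nucleus sampling: broad word selection
    "frequency_penalty": 0.5,  # discourage repetition of frequent tokens
    "presence_penalty": 0.7,   # nudge the model toward new topics
}
```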
Industry Application of Prompt Engineering
Customer Support and FAQ Generation
In the world of customer support, the utility of prompt engineering can’t be overstated. One of the most groundbreaking applications of AI in this sector is the advent of AI-powered chatbots. For instance, consider an e-commerce platform like eBay. They utilize an advanced chatbot system that has been fine-tuned with prompt engineering to handle customer inquiries. The chatbot is capable of handling a variety of issues including, but not limited to, providing delivery updates, processing refund requests, and answering queries about product specifications.
When a customer asks, “Where is my order?”, the AI chatbot doesn’t merely respond with a generic message; instead, it taps into the order management system, fetches real-time data, and provides a personalized response like, “Your order #1234 is currently in transit and is expected to be delivered by tomorrow evening.” Such personalized and relevant responses are the direct result of well-engineered prompts, which form the core of these intelligent chatbot systems.
Content Creation and Marketing
In the domain of content creation and marketing, prompt engineering serves as the cornerstone of AI-driven innovations. Companies like Grammarly use AI-powered engines to aid users in creating engaging and grammatically correct content. These platforms work on the basis of prompts, guiding the AI model to generate suggestions or corrections that can enhance the overall quality of the content.
Consider a user aiming to create engaging content about AI advancements. They could employ a prompt like: “Explain the latest advancements in AI in a simple, engaging manner for beginners.” The AI, harnessing this prompt, can draft a concise, reader-friendly piece on complex AI concepts, thus bridging the knowledge gap and stimulating reader interest. This is just one of many ways prompt engineering can enhance content creation.
Coding Assistance and Code Interpretation
Prompt engineering, in this context, is like a two-way conversation between the developer and the AI. The developer provides the AI with a prompt – a description of the problem or functionality they need to implement, and the AI responds with a relevant piece of code or suggestion, aiding the software development process.
Additionally, AI-enhanced coding tools, like GitHub Copilot, have been developed with the expertise of OpenAI. These advanced tools have undergone extensive training to perform complex tasks such as filling out incomplete code files. They do so despite the distinct differences between the process of writing code and the task of completing it.
Education and Personalized Learning
The integration of AI into the education sector has seen a notable surge in recent years, with prompt engineering playing a pivotal role. AI tutors have been developed to provide personalized learning experiences, adapting to each student’s learning pace and style.
Take, for example, an AI-based learning platform like Duolingo. It uses well-engineered prompts to assess a student’s proficiency in a new language and provide suitable exercises. If a student is learning French and struggles with conjugation, the AI tutor can introduce tailored exercises like, “Conjugate the verb ‘manger’ (to eat) in the present tense.” The student’s performance and progress are continually assessed, and the prompts are refined and adapted accordingly, ensuring a personalized and efficient learning experience.
The Unstoppable Rise of Prompt Engineering
As we approach the conclusion of our deep dive into prompt engineering, it’s crucial to underscore how truly nascent this field is. We are at the cusp of an era where artificial intelligence goes beyond responding to pre-programmed commands, evolving to process and execute carefully engineered prompts that yield highly specific results.
The numerous applications we’ve explored, from customer support to content creation, data analysis, and personalized learning, are just the tip of the iceberg. As research in this arena intensifies, we can look forward to even more sophisticated and nuanced uses of prompt engineering. The convergence of human creativity and AI ingenuity is propelling us towards a future where artificial intelligence will not just assist but transform various aspects of our lives.