Prompt engineering is an emerging discipline at the intersection of AI and process engineering that will undoubtedly contribute to the hydrocarbon and green molecules processing industries. Parts 1 and 2 (October 2024) of this article provide a glimpse of how this technique can help process engineers and operators harness the power of advanced AI models to optimize processes, help solve complex problems and improve operational efficiency in refineries and petrochemical plants.
In the artificial intelligence (AI) segment, large language models (LLMs) like ChatGPT, Gemini, Copilot, Perplexity and Claude have emerged as powerful tools for processing and generating human-quality text. These sophisticated algorithms, trained on massive datasets of text and code, can comprehend and produce language in a wide range of contexts. However, despite their capabilities, LLMs also present a unique challenge: how to effectively access and harness the vast knowledge they hold.
Jorge Luis Borges’ short story, “The Library of Babel,” describes an infinite library containing every possible book ever written (or to be written). While this repository of knowledge might seem like a utopia for information seekers, it also presents a paradox—the sheer volume of information makes it virtually impossible to find anything specific. LLMs, in many ways, mirror the paradox of Borges’ library. They contain an immense store of information, but extracting and utilizing this knowledge can be a complex task. The challenge lies in effectively communicating with language models, guiding them towards the desired information and prompting them to generate meaningful outputs.
This is where prompt engineering comes into play, as shown in FIG. 1. Prompt engineering involves carefully crafting instructions that guide LLMs towards specific tasks or desired outcomes. By employing prompts, users can transform LLMs from repositories of raw data into versatile tools for generating creative text formats, translating languages, writing different kinds of creative content, and answering questions in an informative way.
FIG. 1. Prompt engineering components: the relevance of instructions and context on LLM output performance.1
The information included in this article is theoretical and does not correspond to any real process or situation. It is shown only to illustrate the potential capabilities of LLMs and should not be used for any other purpose.
LLMS
LLMs are advanced AI systems that can understand and generate human-like text across a wide range of topics and tasks. These models are trained on massive datasets of text and code, enabling them to process and generate human-quality language with remarkable fluency and accuracy.
The architecture of LLMs. To illustrate the process and explain how LLMs work, consider the flowchart in FIG. 2.
FIG. 2. Simplified LLM flowchart.
LLMs operate on the principle of predicting the next word in a sequence based on the context of previous words.
A simplified breakdown of LLM technical functioning is listed here:
- Tokenization: The input text is broken down into smaller units called tokens. These can be words, parts of words or even individual characters.
- Embedding: Each token is converted into a numerical vector representation that the model can process.
- Neural network processing: The core of an LLM is a large neural network, typically based on a transformer architecture. This network consists of attention mechanisms, which allow the model to focus on relevant parts of the input when making predictions, and feed-forward layers, which process the information further.
- Context window: The model considers a certain number of previous tokens (the context window) to predict the next token.
- Output generation: The model produces a probability distribution for the next token. It then selects a token based on this distribution, often using techniques like temperature sampling to control randomness.
- Iteration: The context-window and output-generation steps are repeated to generate each subsequent token until the response is complete.
- Fine-tuning and prompt engineering: Models can be further trained on specific datasets or guided by carefully crafted prompts to improve performance on particular tasks.
This process allows LLMs to generate coherent and contextually relevant text based on the input they receive. The immense scale of these models—often containing billions of parameters—enables them to capture complex patterns and relationships in language. The performance of the LLM can be adjusted by tuning some parameters, as shown in TABLE 1.
In essence, LLMs work by recognizing patterns in language, using these patterns to understand input, and then generating appropriate responses based on statistical probabilities learned from their training data.
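To make the token-by-token generation and the role of the temperature parameter (TABLE 1) more concrete, the short Python sketch below shows how a next-token probability distribution is reshaped by temperature before one token is drawn. The candidate tokens and their probabilities are invented for illustration and are not taken from any real model.

```python
import math
import random

def sample_next_token(token_probs, temperature=1.0):
    """Re-weight a next-token distribution by temperature and draw one token.

    Lower temperatures sharpen the distribution (more deterministic output);
    higher temperatures flatten it (more varied output).
    """
    tokens = list(token_probs)
    logits = [math.log(p) / temperature for p in token_probs.values()]
    max_logit = max(logits)                                  # for numerical stability
    weights = [math.exp(logit - max_logit) for logit in logits]
    total = sum(weights)
    return random.choices(tokens, weights=[w / total for w in weights], k=1)[0]

# Hypothetical next-token distribution after the fragment "Increase the reflux ..."
candidates = {"ratio": 0.62, "rate": 0.25, "temperature": 0.10, "valve": 0.03}
print(sample_next_token(candidates, temperature=0.2))   # almost always "ratio"
print(sample_next_token(candidates, temperature=1.5))   # more varied choices
```

With a low temperature, the most probable token is selected almost every time; with a high temperature, less likely tokens appear more often, which is why temperature is usually kept low for technical queries.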
Prompt engineering: Guiding LLMs to reveal their knowledge. Prompt engineering has emerged as an essential discipline in LLMs, serving as a bridge between the knowledge held within LLMs and the specific needs of users, enabling effective communication and unlocking the full potential of these language models. The prompts should meet the following characteristics to be effective:
- Clear and specific: Prompts should be clear, concise and specific in conveying the desired task or outcome. Ambiguous or vague prompts can lead to unpredictable or irrelevant outputs.
- Informative: Prompts should provide sufficient context and information to guide the LLM towards the intended goal. This includes providing relevant examples, background information or specific instructions.
- Creative and open-ended: Prompts should encourage creativity and open-endedness to allow the LLM to explore different approaches and generate diverse outputs. This is particularly important for tasks like creative writing or brainstorming.
- Adaptable and refinable: Prompt engineering is an iterative process. As outputs are received from the LLM, they should be analyzed to understand how well the model is aligning with objectives. Refine prompts based on the model’s responses to improve performance and tailor the output to the desired outcome.
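As a simple illustration of the "clear and specific" and "informative" guidelines above, the sketch below contrasts a vague prompt with a structured one; the unit data in the example are invented.

```python
# Two versions of the same request; only the second follows the guidelines above.
# The process data are invented for illustration.
vague_prompt = "How can we improve the distillation column?"

specific_prompt = (
    "You are assisting a refinery process engineer.\n"
    "Unit: crude distillation column, 120,000-bpd throughput.\n"
    "Symptom: naphtha cut point has drifted 8 degrees C above target over 48 hours.\n"
    "Task: list the three most likely causes and one diagnostic check for each.\n"
    "Format: numbered list with one sentence of reasoning per item."
)
```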
Key techniques in prompt engineering. Prompt engineering techniques like one-shot, few-shot, chain-of-thought and tree-of-thoughts prompting (detailed below) are different approaches to optimizing interaction with LLMs and achieving more accurate and relevant results. These techniques are based on the idea of providing the LLM with additional and structured information to guide its reasoning and improve its performance on specific tasks.
One-shot learning: One-shot learning focuses on training the LLM to perform a task with only one example provided. This approach is useful when training data is scarce or expensive to obtain. To implement one-shot learning, the prompt should be clear, concise and provide enough information for the LLM to understand the task and generate the desired outcome.
Few-shot learning: Few-shot learning expands the one-shot concept by providing the LLM with several examples for a specific task. With more examples, the LLM can learn more robust patterns and generalizations, improving its performance on the task. Prompts in few-shot learning should be carefully selected to cover a variety of cases and styles relevant to the task.
Chain of thought: The chain-of-thought technique asks the LLM to break a problem into intermediate reasoning steps instead of jumping directly to an answer. The user can also interact with the LLM conversationally, asking questions and providing feedback, which allows the model to refine its understanding of the task and the goal iteratively.
Tree of thoughts: The tree of thoughts technique expands the chain of thoughts approach by organizing the interactions between the user and the LLM into a hierarchical tree-like structure. This allows for exploring different branches of the conversation and considering multiple perspectives, enhancing the depth and coherence of the final outcome.
Many other prompt engineering techniques exist, such as role prompting (the LLM is asked to adopt a specific role relevant to the task to be solved) or autotuned prompting (the model is asked to construct the optimal prompts based on several requirements and objectives). The methods described above are the most common starting points and should be enriched to obtain successful results.
Prompt engineering in the context of processing hydrocarbons and green molecules involves the careful formulation of instructions or questions (prompts) for AI models. These prompts must incorporate industry-specific knowledge, relevant operational data and clear objectives to generate useful and applicable insights. In general, the key components of an effective prompt for the processing industry should be:
- Contextualization: Provide relevant information about the specific process, such as temperatures, pressures and flows.
- Clear objectives: Explicitly define what is sought to be optimized or resolved.
- Constraints: Specify operational and safety limits that must be respected.
- Response format: Indicate how the AI should present its recommendations or analysis.
Additionally, the following aspects should be considered:
- Specificity: Use precise industry terminology and provide quantitative data, when possible.
- Structuring: Organize prompts into logical sections (e.g., data, objectives, constraints, response format).
- Iteration: Refine prompts based on received responses to improve the quality and relevance of insights.
- Validation: Cross-check AI recommendations with domain expert knowledge.
The risks associated with these processes should never be forgotten, and the prompts must include specific considerations about:
- Safety: Prompts must include critical safety considerations in the chemical processing industry.
- Process complexity: Capturing the complexity of chemical processes in prompts requires deep technical knowledge.
- Energy transition: Sustainability considerations should be incorporated into the prompts, guiding the LLM to check reliable references.
- Result interpretation: AI responses must be carefully evaluated before implementation.
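A minimal Python sketch of how these components (context, objectives, constraints, safety and response format) can be assembled into a reusable prompt template is shown below. The helper function and the hydrotreater data are hypothetical and included only for illustration.

```python
def build_process_prompt(context, objective, constraints, safety_notes, response_format):
    """Assemble a structured prompt with the sections discussed above:
    context, clear objective, constraints, safety considerations and
    response format. The field contents are supplied by the engineer."""
    return (
        f"CONTEXT:\n{context}\n\n"
        f"OBJECTIVE:\n{objective}\n\n"
        f"CONSTRAINTS:\n{constraints}\n\n"
        f"SAFETY:\n{safety_notes}\n\n"
        f"RESPONSE FORMAT:\n{response_format}"
    )

# Invented example values, for illustration only
prompt = build_process_prompt(
    context="Diesel hydrotreater, 35,000 bpd, reactor inlet 345 C, 55 bar hydrogen partial pressure.",
    objective="Suggest adjustments to reduce product sulfur from 12 ppm to below 8 ppm.",
    constraints="Reactor outlet temperature must stay below 400 C; feed rate cannot be increased.",
    safety_notes="Flag any recommendation that approaches the unit's design limits.",
    response_format="Ranked list of actions with the expected effect and main risk of each.",
)
print(prompt)
```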
PROMPT ENGINEERING EXAMPLES
The following section shows some preliminary examples of how to apply one-shot, few-shot, chain-of-thought and tree-of-thoughts prompt engineering techniques to solve common problems in a petroleum refinery.
One-shot prompting. This approach provides a single instruction or example to guide the AI’s response.
- Example: “Provide three methods to reduce coke formation in a catalytic reforming catalyst.”
LLM answer:
- Increase H2 partial pressure: Operating the reactor with a high partial pressure of H2 helps promote the reverse reactions that remove coke. This method is effective because H2 can react with coke precursors, preventing their accumulation.
- Optimize reaction temperature: Maintaining an optimal reaction temperature (typically between 500°C and 550°C) can minimize the rate of cracking reactions that lead to coke formation. Operating within this temperature range helps balance the reforming reactions while reducing the formation of coke.
- Use coke-resistant catalysts: Developing and using catalysts that are resistant to coke formation, such as those with specific structural properties or those that incorporate bimetallic components, can significantly reduce coke deposition. These catalysts are designed to minimize the sites where coke can form and to enhance the removal of coke precursors.
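A one-shot prompt can also carry a single worked example inside the prompt itself so the model mirrors its structure. The sketch below is a hypothetical illustration of that pattern applied to the coke-formation question; the embedded example pair is invented.

```python
# One-shot prompting: a single worked example is embedded in the prompt so the
# model can mirror its structure. The example pair is invented for illustration.
one_shot_prompt = (
    "Example question: Give two methods to reduce fouling in a crude preheat exchanger.\n"
    "Example answer:\n"
    "1. Increase tube-side velocity to limit particulate deposition.\n"
    "2. Apply a scheduled chemical cleaning program.\n"
    "\n"
    "Now answer the following question in the same style.\n"
    "Question: Provide three methods to reduce coke formation in a catalytic "
    "reforming catalyst.\n"
    "Answer:"
)
print(one_shot_prompt)
```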
Few-shot prompting. Here, multiple examples are offered to help the AI better understand the desired response pattern.
Example: “Given the following problem-solution format for process optimization, provide a similar solution for the third case”:
- Problem 1: Low yield in the alkylation unit.
- Solution 1: Increase the isobutane/olefin ratio to 10:1 and reduce the reaction temperature to 5°C.
- Problem 2: High gas generation in the fluid catalytic cracking unit (FCCU).
- Solution 2: Reduce the reactor temperature by 10°C and increase the catalyst circulation rate by 15%.
- Problem 3: Low conversion in the hydrocracking unit.
- Solution 3: “Generate a solution following the above pattern.”
LLM answer: Increase the reactor temperature by 15°C and adjust the hydrogen partial pressure to 150 bar.
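The few-shot pattern above can be assembled programmatically by concatenating the worked problem/solution pairs and appending the new problem for the model to complete, as in the minimal Python sketch below (the prompt text is taken from the example; the helper structure is only illustrative).

```python
# Few-shot prompting: the worked problem/solution pairs from the example above
# are concatenated, and the new problem is appended for the model to complete.
examples = [
    ("Low yield in the alkylation unit.",
     "Increase the isobutane/olefin ratio to 10:1 and reduce the reaction temperature to 5 C."),
    ("High gas generation in the fluid catalytic cracking unit.",
     "Reduce the reactor temperature by 10 C and increase the catalyst circulation rate by 15%."),
]
new_problem = "Low conversion in the hydrocracking unit."

few_shot_prompt = "\n\n".join(
    f"Problem: {problem}\nSolution: {solution}" for problem, solution in examples
)
few_shot_prompt += f"\n\nProblem: {new_problem}\nSolution:"
print(few_shot_prompt)
```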
Chain of thought prompting. This method asks the AI to show its reasoning step by step.
- Example: “A crude preheat train shows a 15% efficiency drop. Analyze possible causes and solutions, showing your reasoning at each step”:
- Identify key components of the preheat train
- List possible causes of efficiency loss for each component
- Suggest methods to diagnose each potential cause
- Propose solutions for the most likely causes
- Prioritize solutions based on their impact and ease of implementation
LLM answers:
- Identify key components of the preheat train
The crude preheat train typically includes:
- Heat exchangers: Shell-and-tube or plate heat exchangers
- Pumps: To circulate the crude oil
- Filters: To remove particulates from the crude
- Desalters: To remove salts and other impurities
- Piping and valves: To control the flow and pressure
- List possible causes of efficiency loss for each component:
- Heat exchangers: Fouling due to deposition of asphaltenes, salts and other impurities; mechanical damage or wear; incorrect operation or control settings
- Pumps: Mechanical wear or damage; incorrect flowrates or pressure settings; blockages or leaks in the system
- Filters: Clogging due to excessive particulates; ineffective filtration leading to fouling in downstream equipment
- Desalters: Inefficient removal of salts and impurities; fouling or scaling within the desalter unit
- Piping and valves: Corrosion or scaling; leaks or blockages; incorrect valve settings
- Suggest methods to diagnose each potential cause:
- Heat exchangers: Perform thermal performance analysis to detect fouling; inspect for mechanical damage or wear; check operational logs for deviations in temperature and pressure
- Pumps: Conduct vibration analysis and performance testing; inspect for leaks and blockages; review flowrate and pressure data
- Filters: Monitor differential pressure across filters; inspect and replace clogged filters regularly
- Desalters: Analyze the quality of the desalted crude; inspect for fouling or scaling; review operational efficiency and maintenance records
- Piping and valves: Conduct ultrasonic testing for corrosion and scaling; inspect for leaks and blockages; verify valve settings and operation
- Propose solutions for the most likely causes:
- Heat exchangers: Implement regular cleaning schedules to remove fouling; apply anti-fouling coatings to heat exchanger surfaces; optimize operational parameters to reduce fouling rates
- Pumps: Perform regular maintenance and replace worn components; adjust flowrates and pressure settings as needed; clear any blockages and repair leaks
- Filters: Increase the frequency of filter inspections and replacements; upgrade to more efficient filtration systems if necessary
- Desalters: Enhance desalter efficiency through chemical additives or process optimization; regularly clean and maintain desalter units
- Piping and valves: Implement corrosion inhibitors and regular cleaning; repair or replace damaged sections of piping; ensure proper valve operation and settings
- Prioritize solutions based on their impact and ease of implementation:
- Regular cleaning and maintenance: High impact, relatively easy to implement
- Optimize operational parameters: High impact, requires some analysis and adjustments
- Apply anti-fouling coatings: Medium-to-high impact, moderate implementation effort
- Upgrade filtration systems: Medium impact, moderate-to-high implementation effort
- Enhance desalter efficiency: Medium impact, requires process optimization and potential chemical use
- Repair/replace damaged components: High impact, but may require significant downtime and cost.
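A minimal sketch of how the chain-of-thought prompt above can be built is shown below; the wording mirrors the example, and the helper structure is only illustrative.

```python
# Chain-of-thought prompting: the instruction asks the model to expose its
# reasoning step by step before giving recommendations, as in the example above.
steps = [
    "Identify the key components of the preheat train.",
    "List possible causes of efficiency loss for each component.",
    "Suggest methods to diagnose each potential cause.",
    "Propose solutions for the most likely causes.",
    "Prioritize the solutions by impact and ease of implementation.",
]
cot_prompt = (
    "A crude preheat train shows a 15% efficiency drop. "
    "Work through the following steps in order, showing your reasoning at each step:\n"
    + "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
)
print(cot_prompt)
```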
Tree of thoughts prompting. This method breaks down complex problems into sub-problems and explores multiple solution paths.
Example: Select the best technology for producing sustainable aviation fuel (SAF) using the tree of thoughts strategy. Consider economic criteria in your evaluation:
- Step 1: Define the problem and objectives—Clearly state the goal of selecting the best SAF technology; and identify the key objectives, such as reducing greenhouse gas (GHG) emissions, achieving cost-effectiveness and ensuring scalability.
- Step 2: Generate initial thoughts—list all potential SAF technologies [e.g., Fischer-Tropsch (FT), hydroprocessed esters and fatty acids (HEFA) and alcohol-to-jet (AtJ)]; for each technology, note initial thoughts on their advantages and disadvantages.
- Step 3: Expand thoughts into branches—For each technology, expand on the initial thoughts by considering: technical feasibility (assess the maturity of the technology and its readiness for large-scale deployment); economic viability (evaluate the cost of production, potential for economies of scale and market competitiveness); environmental impact (analyze the lifecycle GHG emissions and potential environmental benefits); and feedstock availability (consider the availability and sustainability of feedstocks required for each technology).
- Step 4: Evaluate economic criteria: Cost of production: Compare the production costs of each technology, including capital and operational expenses. Market competitiveness: Assess the potential for each technology to compete with conventional jet fuel prices. Economies of scale: Determine the scalability of each technology and its ability to reduce costs with increased production. Investment requirements: Evaluate the initial investment needed for infrastructure and technology development.
- Step 5: Synthesize and compare branches—Create a comparison matrix (TABLE 2) to evaluate each technology against the defined criteria. Assign weights to each criterion based on its importance to the overall objective. Score each technology based on how well it meets each criterion.
- Step 6: Select the optimal technology—Analyze the scores and identify the technology that best meets the objectives. Consider any trade-offs and make a final decision based on the comprehensive evaluation.
- Step 7: Justify the selection—Provide a detailed justification for the selected technology, highlighting its strengths and how it meets the economic and environmental goals. Discuss any potential challenges and how they can be mitigated.
LLM answers:
- Step 1: Define the problem and objectives: Goal: Select the best technology for producing SAF. Key objectives: Reduce GHG emissions, achieve cost-effectiveness and ensure scalability.
- Step 2: Generate initial thoughts: Potential SAF technologies:
- FT: Advantages: high energy density, can use various feedstocks. Disadvantages: high capital costs, complex process.
- HEFA: Advantages: mature technology, compatible with existing infrastructure. Disadvantages: limited feedstock availability, moderate costs.
- AtJ: Advantages: uses abundant feedstocks like ethanol, scalable. Disadvantages: higher production costs, less mature technology.
- Step 3: Expand thoughts into branches:
- FT:
- Technical feasibility: Mature technology but complex
- Economic viability: High CAPEX and OPEX costs
- Environmental impact: Significant reduction in GHG emissions
- Feedstock availability: Various feedstocks can be used, including waste
- HEFA:
- Technical feasibility: Well-established, ready for large-scale deployment
- Economic viability: Moderate costs, potential for cost reduction with scale
- Environmental impact: Up to 80% reduction in lifecycle GHG emissions
- Feedstock availability: Limited to fats, oils and greases
- AtJ:
- Technical feasibility: Emerging technology, promising but less mature
- Economic viability: Higher production costs, potential for cost reduction
- Environmental impact: Significant GHG reduction potential
- Feedstock availability: Abundant feedstocks like ethanol
- Step 4: Evaluate economic criteria: Cost of production: FT: high; HEFA: moderate; AtJ: high
- Market competitiveness: FT: competitive with policy support; HEFA: currently most competitive; AtJ: needs further development
- Economies of scale: FT: potential for significant cost reduction; HEFA: already benefiting from economies of scale; AtJ: potential not yet realized
- Investment requirements: FT: high initial investment; HEFA: moderate investment; AtJ: high investment needed
- Step 5: Synthesize and compare branches: A comparison matrix is shown in TABLE 2.
- Step 6: Select the optimal technology: In this case, HEFA emerges as the optimal technology due to its balance of technical feasibility, economic viability and environmental impact. It is currently the most competitive and benefits from economies of scale.
- Step 7: Justify the selection:
- Strengths: HEFA is a mature technology with moderate costs and significant environmental benefits. It is compatible with existing infrastructure and has the potential for further cost reductions.
- Challenges: Limited feedstock availability can be addressed by diversifying feedstock sources and improving supply chains.
- Conclusion: HEFA meets the economic and environmental goals effectively, making it the best choice for SAF production at present.
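The weighted comparison of Step 5 can be reproduced with a few lines of Python; the criterion weights and 1-5 scores below are invented placeholders (in practice they would come from TABLE 2 and from domain experts), so the resulting ranking is only illustrative.

```python
# Weighted comparison matrix for Step 5 of the tree-of-thoughts example.
# Weights and 1-5 scores are invented placeholders; real values would come
# from TABLE 2 and from domain experts.
criteria_weights = {
    "technical_feasibility": 0.25,
    "economic_viability": 0.30,
    "environmental_impact": 0.25,
    "feedstock_availability": 0.20,
}
scores = {
    "FT":   {"technical_feasibility": 3, "economic_viability": 2,
             "environmental_impact": 4, "feedstock_availability": 4},
    "HEFA": {"technical_feasibility": 5, "economic_viability": 4,
             "environmental_impact": 4, "feedstock_availability": 2},
    "AtJ":  {"technical_feasibility": 3, "economic_viability": 2,
             "environmental_impact": 4, "feedstock_availability": 5},
}
weighted_totals = {
    tech: round(sum(criteria_weights[c] * s for c, s in crit.items()), 2)
    for tech, crit in scores.items()
}
# With these illustrative numbers, HEFA ranks first, matching the selection above.
print(sorted(weighted_totals.items(), key=lambda kv: kv[1], reverse=True))
```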
Takeaways. While the answers provided in this article are debatable, they do provide a clear image of the capabilities inherent to LLMs and the role of prompt engineering as a relevant skill for hydrocarbon and green molecules professionals. The ability to formulate effective prompts will become an important tool and should be guided by experts’ technical knowledge. The synergy between human expertise and AI can help the industry address exciting challenges like the configuration of the refinery of the future.
Part 2 of this article (October 2024) will further discuss prompt engineering and extracting operational excellence knowledge from AI.
LITERATURE CITED
1 Sahoo, P., A. K. Singh, S. Saha, V. Jain, S. Mondal and A. Chadha, “A systematic survey of prompt engineering in large language models: Techniques and applications,” Department of Computer Science and Engineering, Indian Institute of Technology Patna, Stanford University and Amazon AI, February 5, 2024.
The Author
Larraz, R. – CEPSA, Spain
Rafael Larraz is the Director of CEPSA Corporate Engineering and the Coordinator of the Oil Section at the Spanish Institute of Energy. Dr. Larraz holds an MS degree in environmental engineering from the University of Las Palmas de Gran Canaria in Spain, an Executive MBA from the IESE Business School of the University of Navarra, and a PhD in chemical engineering from the University of La Laguna in Spain.