LLM Prompt Engineering: Techniques & Best Practices
In this tutorial, you'll learn about LLM Prompt Engineering: Techniques & Best Practices. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.
Prompt engineering is the practice of designing input prompts so that large language models produce accurate and useful outputs. As LLMs become more capable, the quality of the prompt increasingly determines the quality of the output. A well-crafted prompt can mean the difference between a hallucinated answer and a factual, well-reasoned response. Prompt engineering is not just about asking questions; it also involves structuring information, providing context, setting constraints, and guiding the model's reasoning.
What You'll Learn
In this tutorial, you'll learn prompt engineering techniques including zero-shot and few-shot prompting, chain-of-thought reasoning, structured output formatting, system prompts, and advanced strategies like tree-of-thought using Python with the OpenAI API and LangChain.
Why It Matters
Prompt engineering is one of the most practical skills for working with LLMs, because the prompt often determines whether the output is usable at all. Well-engineered prompts reduce errors, improve consistency, and enable complex reasoning tasks. Prompts are also the primary way to control and direct a model's behavior without retraining it. Python with the OpenAI API enables programmatic prompt management at scale.
Real-World Use
Durga Antivirus Pro uses prompt engineering to analyze suspicious code snippets. A carefully engineered prompt instructs the LLM to classify code as "malicious," "suspicious," or "benign," explain its reasoning step by step, and output results in a structured JSON format for automated processing.
Zero-Shot Prompting
Zero-shot prompting asks the model to perform a task without examples. The model relies entirely on its training data to understand what you want. This works well for common tasks like sentiment analysis, translation, and summarization. The key to effective zero-shot prompting is being specific about the task and the expected output format. Instead of asking "Is this email suspicious?", say "Classify this email as phishing, spam, or legitimate. Respond with only one word." The more specific your instructions, the better the results.
from openai import OpenAI

# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": "You are a cybersecurity analyst classifying email threats."
        },
        {
            "role": "user",
            "content": (
                # No examples are given: the instruction alone defines the task.
                "Classify this email as phishing, spam, or legitimate. "
                "Respond with only one word.\n\n"
                "Dear user, your account has been compromised. "
                "Click here to reset your password immediately."
            )
        }
    ],
    temperature=0  # low randomness for a classification task
)

classification = response.choices[0].message.content
print(f"Classification: {classification}")
Expected output:
Classification: phishing
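Even with a "respond with only one word" instruction, a reply can come back with different casing or trailing punctuation. The following is a minimal sketch, not part of the original tutorial code, of building the zero-shot messages and checking the reply against the allowed labels. The names ALLOWED_LABELS, build_zero_shot_messages, and normalize_label are hypothetical helpers introduced here for illustration, and no API call is made.

```python
from typing import Dict, List, Optional

# Hypothetical label set, taken from the classification prompt above.
ALLOWED_LABELS = ("phishing", "spam", "legitimate")


def build_zero_shot_messages(email_text: str) -> List[Dict[str, str]]:
    """Build chat messages for a zero-shot request: instructions only, no examples."""
    labels = ", ".join(ALLOWED_LABELS[:-1]) + f", or {ALLOWED_LABELS[-1]}"
    return [
        {
            "role": "system",
            "content": "You are a cybersecurity analyst classifying email threats.",
        },
        {
            "role": "user",
            "content": (
                f"Classify this email as {labels}. "
                f"Respond with only one word.\n\n{email_text}"
            ),
        },
    ]


def normalize_label(raw_reply: str) -> Optional[str]:
    """Map a raw model reply to an allowed label, or None if it does not match."""
    cleaned = raw_reply.strip().strip(".!\"'").lower()
    return cleaned if cleaned in ALLOWED_LABELS else None


messages = build_zero_shot_messages("Click here to reset your password immediately.")
print(messages[1]["content"])
print(normalize_label("Phishing."))
```

Returning None for an unexpected reply lets the calling code decide whether to retry or flag the item for review, instead of passing free text downstream.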
Few-Shot Prompting
Few-shot prompting provides examples to guide the model's response format and reasoning.
# Reuses the client created in the zero-shot example.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Classify the sentiment of each review as positive or negative.\n\n"
                # Two worked examples show the expected input/output format.
                "Review: This product is amazing and works perfectly.\n"
                "Sentiment: positive\n\n"
                "Review: Terrible quality, broke after one week.\n"
                "Sentiment: negative\n\n"
                # The final review is left open for the model to complete.
                "Review: It's okay for the price but nothing special.\n"
                "Sentiment:"
            )
        }
    ],
    temperature=0
)

sentiment = response.choices[0].message.content.strip()
print(f"Few-shot result: {sentiment}")
Expected output:
Few-shot result: neutral
Note that the prompt contains two examples, covering only positive and negative, yet the model can still infer "neutral" from context, as in the output above. This demonstrates why your instructions and examples must cover all expected categories: a label you did not plan for can otherwise reach downstream code.
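One way to enforce the point about covering all expected categories is to build the few-shot prompt from data and refuse to build it when a label has no example. This is a minimal sketch under that assumption. The function build_few_shot_prompt and the third (neutral) example are hypothetical additions for illustration, not part of the original code.

```python
from typing import List, Sequence, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    labels: Sequence[str],
    query: str,
) -> str:
    """Assemble a few-shot prompt, requiring at least one example per label."""
    covered = {label for _, label in examples}
    missing = [label for label in labels if label not in covered]
    if missing:
        raise ValueError(f"No example for label(s): {', '.join(missing)}")

    lines: List[str] = [instruction, ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    # The query uses the same layout, with the label left for the model.
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive, negative, or neutral.",
    [
        ("This product is amazing and works perfectly.", "positive"),
        ("Terrible quality, broke after one week.", "negative"),
        ("It does the job, nothing more.", "neutral"),  # illustrative example
    ],
    ["positive", "negative", "neutral"],
    "It's okay for the price but nothing special.",
)
print(prompt)
```

The resulting string can be sent as the content of a single user message, in the same way as the hand-written prompt.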
Chain-of-Thought Prompting
Chain-of-thought (CoT) asks the model to show its reasoning step by step. This technique can markedly improve performance on arithmetic, logic, and multi-step reasoning tasks. Because the model explains its reasoning before giving the final answer, it is less likely to jump to an incorrect conclusion. The reasoning steps act as a scratchpad, allowing the model to keep intermediate results in the generated text. CoT is particularly effective for problems involving calculations, symbolic reasoning, and multi-step decision making.
# Reuses the client created in the zero-shot example.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Solve this step by step:\n\n"
                "A log file has 15,000 entries. Each entry is 2.5 KB. "
                "The server can process 500 KB per second. "
                "How many seconds will it take to process all entries?\n\n"
                # This trailing cue asks for intermediate reasoning.
                "Let's think step by step."
            )
        }
    ],
    temperature=0
)

answer = response.choices[0].message.content
print(f"CoT Answer:\n{answer}")
Expected output:
CoT Answer:
1. Total size of all entries: 15,000 entries * 2.5 KB = 37,500 KB
2. Processing speed: 500 KB per second
3. Time needed: 37,500 KB / 500 KB per second = 75 seconds
Therefore, it will take 75 seconds to process all entries.
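A chain-of-thought reply is still generated text, so it is worth checking the final figure independently when the arithmetic is simple. The sketch below recomputes the answer in Python and compares it with the last number on the last line of a reply. The helper extract_final_number is a hypothetical name, and the assumption that the final answer sits on the last non-empty line will not hold for every reply.

```python
import re
from typing import Optional

# Matches numbers such as 75, 2.5, or 37,500.
NUMBER_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_final_number(cot_text: str) -> Optional[float]:
    """Return the last number on the last non-empty line, or None if absent."""
    lines = [line for line in cot_text.splitlines() if line.strip()]
    if not lines:
        return None
    matches = NUMBER_PATTERN.findall(lines[-1])
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


# Sample reply copied from the expected output above.
sample_reply = (
    "1. Total size of all entries: 15,000 entries * 2.5 KB = 37,500 KB\n"
    "2. Processing speed: 500 KB per second\n"
    "3. Time needed: 37,500 KB / 500 KB per second = 75 seconds\n"
    "Therefore, it will take 75 seconds to process all entries."
)

# Independent calculation: total size divided by throughput.
expected_seconds = 15_000 * 2.5 / 500
model_seconds = extract_final_number(sample_reply)
print(model_seconds == expected_seconds)  # True
```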
Structured Output with JSON Mode
Structured output gives you consistent, parseable responses for programmatic use. JSON mode constrains the model to emit syntactically valid JSON, so you can parse responses directly instead of relying on error-prone string manipulation. The response_format parameter tells the API to apply this constraint. JSON mode guarantees syntax, not schema: the keys and value types still come from your prompt, so the prompt must describe the structure you expect. This is essential for building automated pipelines where LLM outputs are consumed by downstream systems such as API endpoints, databases, or analysis tools.
import json

# Reuses the client created in the zero-shot example.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Extract the following information from the email and return as JSON:\n\n"
                "From: support@bank.com\n"
                "Subject: Urgent: Account Verification Required\n"
                "Body: Dear John, please verify your account at http://fake-bank.com "
                "within 24 hours or your account will be suspended.\n\n"
                # The expected keys are spelled out because JSON mode does not enforce a schema.
                "Return JSON with keys: sender, subject, urgency_level, contains_link, is_phishing"
            )
        }
    ],
    response_format={"type": "json_object"},
    temperature=0
)

parsed = json.loads(response.choices[0].message.content)
print(json.dumps(parsed, indent=2))
Expected output:
{
"sender": "support@bank.com",
"subject": "Urgent: Account Verification Required",
"urgency_level": "high",
"contains_link": true,
"is_phishing": true
}
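Valid JSON is not necessarily the JSON you asked for, so a pipeline usually checks the keys and value types before using the result. The following is a minimal sketch of such a check using only the standard library. EXPECTED_FIELDS and validate_report are hypothetical names, and the field types are inferred from the expected output above.

```python
import json

# Field names come from the prompt; the types are an assumption based on
# the expected output shown above.
EXPECTED_FIELDS = {
    "sender": str,
    "subject": str,
    "urgency_level": str,
    "contains_link": bool,
    "is_phishing": bool,
}


def validate_report(raw: str) -> dict:
    """Parse a JSON reply and check that every expected key has the expected type."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) if malformed
    if not isinstance(data, dict):
        raise ValueError("reply is valid JSON but not an object")

    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    if problems:
        raise ValueError("; ".join(problems))
    return data


sample_reply = """{
  "sender": "support@bank.com",
  "subject": "Urgent: Account Verification Required",
  "urgency_level": "high",
  "contains_link": true,
  "is_phishing": true
}"""
report = validate_report(sample_reply)
print(report["is_phishing"])  # True
```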
Prompt Engineering Techniques
| Technique | Description | Best For |
|---|---|---|
| Zero-Shot | No examples, direct instruction | Simple, well-known tasks |
| Few-Shot | 2-5 examples in the prompt | Format control, new tasks |
| Chain-of-Thought | Show reasoning step by step | Math, logic, complex analysis |
| Tree-of-Thought | Explore multiple reasoning paths | Complex problem solving |
| System Prompt | Set model persona and constraints | Consistent behavior across sessions |
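Tree-of-thought is the one technique in the table without an example in this tutorial. The sketch below illustrates the idea as a small beam search: propose several next thoughts for each partial reasoning path, score the resulting paths, and keep only the best few. It is a simplified illustration, not a full implementation of the technique. In practice both propose and score would be LLM calls; here they are hypothetical stand-in functions so the control flow can run without an API key.

```python
from typing import Callable, List

Path = List[str]


def tree_of_thought(
    problem: str,
    propose: Callable[[str, Path], List[str]],
    score: Callable[[str, Path], float],
    depth: int = 2,
    beam_width: int = 2,
) -> Path:
    """Explore reasoning paths level by level, keeping the top-scoring ones."""
    frontier: List[Path] = [[]]  # start from a single empty path
    for _ in range(depth):
        candidates: List[Path] = []
        for path in frontier:
            for thought in propose(problem, path):
                candidates.append(path + [thought])
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda p: score(problem, p), reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]


# Stand-ins for LLM calls (assumptions for illustration only).
def toy_propose(problem: str, path: Path) -> List[str]:
    return ["1", "2", "3"]


def toy_score(problem: str, path: Path) -> float:
    return float(sum(int(step) for step in path))


best_path = tree_of_thought("maximize the sum", toy_propose, toy_score)
print(best_path)  # ['3', '3']
```

Compared with chain-of-thought, which follows a single path, this approach spends more model calls in exchange for the chance to discard weak partial solutions early.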
Common Errors and Mistakes
| Mistake | Why It Happens | How to Fix |
|---|---|---|
| Vague instructions | Model fills ambiguity incorrectly | Be specific about format, length, and content |
| Too many examples | Model confuses pattern with rule | Use 2-5 diverse examples, not 20 |
| No system prompt | No context for model behavior | Always set a clear system message |
| Temperature too high | Inconsistent, creative outputs | Use 0 for factual tasks, 0.7 for creative |
| Not validating output | JSON parse errors in production | Use response_format={"type": "json_object"} and validate the parsed result |
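For the output-validation row in the table above, one common pattern is to retry with a corrective note when a reply cannot be parsed. This is a minimal sketch of that pattern. The function get_json_with_retry and the ask callable are hypothetical: ask stands in for whatever function sends a prompt to the model and returns its text, and a fake version is used here so the example runs offline.

```python
import json
from typing import Callable


def get_json_with_retry(
    ask: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call ask() until it returns a JSON object or the attempts run out."""
    last_error = "no attempts made"
    current_prompt = prompt
    for _ in range(max_attempts):
        reply = ask(current_prompt)
        try:
            data = json.loads(reply)
            if isinstance(data, dict):
                return data
            last_error = "reply was valid JSON but not an object"
        except json.JSONDecodeError as exc:
            last_error = str(exc)
        # Tell the model what went wrong before trying again.
        current_prompt = (
            f"{prompt}\n\nYour previous reply could not be used ({last_error}). "
            "Return only a valid JSON object."
        )
    raise ValueError(f"No valid JSON object after {max_attempts} attempts: {last_error}")


# Fake model for demonstration: fails once, then returns valid JSON.
fake_replies = iter(["Sure! Here is the result:", '{"label": "suspicious"}'])


def fake_ask(prompt: str) -> str:
    return next(fake_replies)


result = get_json_with_retry(fake_ask, "Classify this log entry and return JSON.")
print(result)  # {'label': 'suspicious'}
```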
Practice Questions
- What is the difference between zero-shot and few-shot prompting?
Answer: Zero-shot prompting gives the model a task with no examples. Few-shot prompting provides 2-5 examples demonstrating the desired input-output format, which helps the model understand the expected response structure.
- Why does chain-of-thought prompting improve reasoning tasks?
Answer: CoT encourages the model to break down multi-step problems into intermediate steps. This reduces errors by making the reasoning process explicit and giving the model a chance to correct its own mistakes as it works through each step.
- What is the purpose of a system prompt?
Answer: The system prompt sets the model's behavior, persona, and constraints for the entire conversation. It establishes rules that persist across multiple user messages, such as "Always respond in JSON" or "You are a helpful tutor."
- How does temperature affect LLM outputs?
Answer: Temperature controls randomness in token selection. Lower temperature (0-0.3) produces more deterministic, focused outputs ideal for factual tasks, although identical outputs across runs are not guaranteed. Higher temperature (0.7-1.0) produces more creative, varied outputs for brainstorming or creative writing.
- Why use structured output formats like JSON?
Answer: Structured outputs enable programmatic processing of model responses. JSON outputs can be parsed, validated, and used in automated pipelines without manual extraction, reducing errors and enabling integration with other systems.
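The answer about temperature above can be made concrete with a small calculation. The sketch below applies the standard softmax-with-temperature formula to a made-up set of token scores. The logits are illustrative values, not taken from any real model, and softmax_with_temperature is a hypothetical helper name.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities, dividing by the temperature first."""
    if temperature <= 0:
        # Temperature 0 is usually treated as "always pick the top token".
        raise ValueError("temperature must be positive for this formula")
    scaled = [value / temperature for value in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.0)

print([round(p, 3) for p in low])   # top token takes almost all the probability
print([round(p, 3) for p in high])  # probability is spread more evenly
```

With these values the top token's probability is roughly 0.99 at temperature 0.2 and roughly 0.63 at temperature 1.0, which is why low temperatures give more repeatable output.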
Challenge
Build a prompt engineering system that analyzes network security logs. The system should take raw log entries, classify each as "normal" or "suspicious," explain the reasoning for suspicious entries, and output a structured JSON report. Use few-shot prompting with at least 3 examples covering different attack types (port scan, brute force, DDoS).
Real-World Task
Design a prompt engineering system for automated code review in a security context. The system receives code snippets and must identify potential vulnerabilities (SQL Injection, XSS, buffer overflow). Use a system prompt to set the security expert persona, chain-of-thought for vulnerability analysis, and JSON mode for structured output with fields: vulnerability_type, line_number, severity, and fix_suggestion.
Next Steps
Next, learn about RAG systems to combine LLMs with your own data using LangChain, and explore LLMs with Hugging Face Transformers. Python and the OpenAI API are common choices for production LLM applications.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.