LLM Prompt Engineering: Techniques & Best Practices

DodaTech Updated 2026-06-22 7 min read

In this tutorial, you'll learn about LLM Prompt Engineering: Techniques & Best Practices. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

Prompt engineering is the practice of designing input prompts so that large language models produce accurate and useful outputs. As LLMs become more capable, the quality of the prompt increasingly determines the quality of the output. A well-crafted prompt can mean the difference between a hallucinated answer and a factual, well-reasoned response. Prompt engineering is not just about asking questions: it is about structuring information, providing context, setting constraints, and guiding reasoning processes.

What You'll Learn

In this tutorial, you'll learn prompt engineering techniques including zero-shot and few-shot prompting, chain-of-thought reasoning, structured output formatting, system prompts, and advanced strategies like tree-of-thought using Python with the OpenAI API and LangChain.

Why It Matters

Prompt engineering is one of the most practical skills for working with LLMs, because the prompt strongly influences whether you get useful results or garbage. Well-engineered prompts reduce errors, improve consistency, and enable complex reasoning tasks. For most applications, the prompt is also the primary way to control and direct model behavior. Python with the OpenAI API enables programmatic prompt management at scale.

Real-World Use

Durga Antivirus Pro uses prompt engineering to analyze suspicious code snippets. A carefully engineered prompt instructs the LLM to classify code as "malicious," "suspicious," or "benign," explain its reasoning step by step, and output results in a structured JSON format for automated processing.

Zero-Shot Prompting

Zero-shot prompting asks the model to perform a task without examples. The model relies entirely on its training data to understand what you want. This works well for common tasks like sentiment analysis, translation, and summarization. The key to effective zero-shot prompting is being specific about the task and the expected output format. Instead of asking "Is this email suspicious?", say "Classify this email as phishing, spam, or legitimate. Respond with only one word." The more specific your instructions, the better the results.

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": "You are a cybersecurity analyst classifying email threats."
        },
        {
            "role": "user",
            "content": (
                "Classify this email as phishing, spam, or legitimate:\n\n"
                "Dear user, your account has been compromised. "
                "Click here to reset your password immediately."
            )
        }
    ],
    temperature=0
)

classification = response.choices[0].message.content
print(f"Classification: {classification}")

Expected output:

Classification: phishing
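
The advice above (state the task and the exact output format) can be sketched as two small helpers: one that builds the messages, and one that checks the model's one-word reply against the allowed labels. This is a minimal sketch, not part of the OpenAI API; the names `ALLOWED_LABELS`, `build_zero_shot_messages`, and `normalize_label` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Labels taken from the classification prompt above.
ALLOWED_LABELS = ("phishing", "spam", "legitimate")


def build_zero_shot_messages(email_text: str) -> List[Dict[str, str]]:
    """Build chat messages that name the task and the exact output format."""
    instruction = (
        "Classify this email as phishing, spam, or legitimate. "
        "Respond with only one word.\n\n"
    )
    return [
        {
            "role": "system",
            "content": "You are a cybersecurity analyst classifying email threats.",
        },
        {"role": "user", "content": instruction + email_text},
    ]


def normalize_label(raw_reply: str) -> Optional[str]:
    """Return the reply as an allowed label, or None if it is anything else."""
    # Drop surrounding whitespace, then stray punctuation or quotes, then lowercase.
    cleaned = raw_reply.strip().strip(".!\"'").lower()
    return cleaned if cleaned in ALLOWED_LABELS else None
```

The list returned by `build_zero_shot_messages` can be passed as the `messages` argument of the call shown earlier. Treating any reply outside the allowed labels as `None` lets the caller retry or flag the item instead of silently storing free-form text.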

Few-Shot Prompting

Few-shot prompting provides examples to guide the model's response format and reasoning.

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Classify the sentiment of each review as positive or negative.\n\n"
                "Review: This product is amazing and works perfectly.\n"
                "Sentiment: positive\n\n"
                "Review: Terrible quality, broke after one week.\n"
                "Sentiment: negative\n\n"
                "Review: It's okay for the price but nothing special.\n"
                "Sentiment:"
            )
        }
    ],
    temperature=0
)

sentiment = response.choices[0].message.content.strip()
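print(f"Few-shot result: {sentiment}")

Expected output:

Few-shot result: neutral

Note that the two examples only show positive and negative, and the instruction names only those two labels, yet the model can still answer "neutral" for a mixed review. This demonstrates why your instructions and examples must cover every category you expect, or explicitly restrict the output to the allowed labels.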
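
One way to catch this kind of gap before sending a request is to build the prompt from a list of labeled examples and check which expected labels have no example. The sketch below assumes the same "Review:/Sentiment:" layout used above; the function names `build_few_shot_prompt` and `missing_labels` are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_few_shot_prompt(
    instruction: str, examples: Sequence[Tuple[str, str]], query: str
) -> str:
    """Join the instruction, the labeled examples, and the unlabeled query."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The final block leaves the label empty for the model to complete.
    blocks.append(f"Review: {query}\nSentiment:")
    return instruction + "\n\n" + "\n\n".join(blocks)


def missing_labels(
    examples: Sequence[Tuple[str, str]], expected_labels: Sequence[str]
) -> List[str]:
    """Return the expected labels that no example demonstrates."""
    seen = {label for _, label in examples}
    return [label for label in expected_labels if label not in seen]
```

With the two examples from the request above, `missing_labels(examples, ["positive", "negative", "neutral"])` reports that "neutral" has no example, which is the signal to add one or to tighten the instruction.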

Chain-of-Thought Prompting

Chain-of-thought (CoT) asks the model to show its reasoning step by step. This technique dramatically improves performance on arithmetic, logic, and multi-step reasoning tasks. By forcing the model to explain its reasoning before giving the final answer, CoT reduces the chance of the model jumping to incorrect conclusions. The reasoning steps act as a scratchpad, allowing the model to keep intermediate results in the generated text. CoT is particularly effective for problems involving calculations, symbolic reasoning, and multi-step decision making.

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Solve this step by step:\n\n"
                "A log file has 15,000 entries. Each entry is 2.5 KB. "
                "The server can process 500 KB per second. "
                "How many seconds will it take to process all entries?\n\n"
                "Let's think step by step."
            )
        }
    ],
    temperature=0
)

answer = response.choices[0].message.content
print(f"CoT Answer:\n{answer}")

Expected output:

CoT Answer:
1. Total size of all entries: 15,000 entries * 2.5 KB = 37,500 KB
2. Processing speed: 500 KB per second
3. Time needed: 37,500 KB / 500 KB per second = 75 seconds

Therefore, it will take 75 seconds to process all entries.
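
Because the reasoning arrives as free text, it helps to pull out the final figure and compare it with a value computed directly. The following is a minimal sketch under the assumption that the answer is stated as a number followed by the word "seconds"; the helper name `extract_final_seconds` is hypothetical, and the regular expression would need adjusting for other phrasings.

```python
import re
from typing import Optional


def extract_final_seconds(cot_text: str) -> Optional[float]:
    """Return the last number that is followed by 'seconds', or None."""
    matches = re.findall(r"(\d[\d,]*(?:\.\d+)?)\s*seconds\b", cot_text)
    if not matches:
        return None
    # Remove thousands separators before converting.
    return float(matches[-1].replace(",", ""))


# Recompute the answer from the numbers in the prompt.
entries = 15_000
kb_per_entry = 2.5
kb_per_second = 500
expected_seconds = entries * kb_per_entry / kb_per_second  # 37,500 KB / 500 KB per second

sample_reply = (
    "3. Time needed: 37,500 KB / 500 KB per second = 75 seconds\n\n"
    "Therefore, it will take 75 seconds to process all entries."
)
model_seconds = extract_final_seconds(sample_reply)
print(model_seconds == expected_seconds)
```

Taking the last match rather than the first reflects the chain-of-thought layout, where intermediate values come before the final answer.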

Structured Output with JSON Mode

Structured output gives you consistent, parseable responses for programmatic use. JSON mode constrains the model to emit syntactically valid JSON, so you can parse responses directly instead of relying on error-prone string manipulation. You enable it with the response_format parameter and describe the expected structure in the prompt; the model then formats its response accordingly. JSON mode does not by itself guarantee that the keys or value types match what you asked for, so validate the parsed result. This is essential for automated pipelines where LLM outputs are consumed by downstream systems such as API endpoints, databases, or analysis tools.

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": (
                "Extract the following information from the email and return as JSON:\n\n"
                "From: support@bank.com\n"
                "Subject: Urgent: Account Verification Required\n"
                "Body: Dear John, please verify your account at http://fake-bank.com "
                "within 24 hours or your account will be suspended.\n\n"
                'Return JSON with keys: sender, subject, urgency_level, contains_link, is_phishing'
            )
        }
    ],
    response_format={"type": "json_object"},
    temperature=0
)

import json
parsed = json.loads(response.choices[0].message.content)
print(json.dumps(parsed, indent=2))

Expected output:

{
  "sender": "support@bank.com",
  "subject": "Urgent: Account Verification Required",
  "urgency_level": "high",
  "contains_link": true,
  "is_phishing": true
}
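
Valid JSON is not necessarily the JSON you asked for, so a downstream pipeline would typically check keys and types before using the result. The sketch below assumes the five keys requested in the prompt and infers their types from the expected output shown above; `EXPECTED_FIELDS` and `validate_report` are hypothetical names, not part of any library.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# Key names come from the prompt; the types are an assumption based on the sample output.
EXPECTED_FIELDS = {
    "sender": str,
    "subject": str,
    "urgency_level": str,
    "contains_link": bool,
    "is_phishing": bool,
}


def validate_report(raw_json: str) -> Tuple[Optional[Dict[str, Any]], List[str]]:
    """Parse raw_json and return (data, problems); problems is empty when all checks pass."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value is not an object"]

    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            problems.append(f"wrong type for {key}")
    return data, problems
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a response, or to feed the list back to the model in a retry prompt.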

Prompt Engineering Techniques

Technique	Description	Best For
Zero-Shot	No examples, direct instruction	Simple, well-known tasks
Few-Shot	2-5 examples in the prompt	Format control, new tasks
Chain-of-Thought	Show reasoning step by step	Math, logic, complex analysis
Tree-of-Thought	Explore multiple reasoning paths	Complex problem solving
System Prompt	Set model persona and constraints	Consistent behavior across sessions
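
Tree-of-thought is listed in the table but not shown elsewhere in this tutorial, so here is a rough sketch of the idea. It uses a simple beam search, which is only one possible way to explore multiple reasoning paths. The function `tree_of_thought` and its `propose` and `score` parameters are hypothetical; in practice each would wrap an LLM call (one asking for candidate next steps, one asking for a rating of a partial path).

```python
from typing import Callable, List


def tree_of_thought(
    problem: str,
    propose: Callable[[str, List[str]], List[str]],
    score: Callable[[str, List[str]], float],
    depth: int = 3,
    beam_width: int = 2,
) -> List[str]:
    """Grow reasoning paths step by step, keeping the best `beam_width` at each depth."""
    beam: List[List[str]] = [[]]  # start from a single empty path
    for _ in range(depth):
        # Extend every surviving path with every proposed next step.
        candidates = [
            path + [step] for path in beam for step in propose(problem, path)
        ]
        if not candidates:
            break
        # Keep only the highest-scoring partial paths.
        candidates.sort(key=lambda path: score(problem, path), reverse=True)
        beam = candidates[:beam_width]
    return beam[0]
```

Passing `propose` and `score` in as functions keeps the search logic separate from the prompts, so the same loop can be exercised with simple stand-in functions before any API calls are made.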

Common Errors and Mistakes

Mistake	Why It Happens	How to Fix
Vague instructions	Model fills ambiguity incorrectly	Be specific about format, length, and content
Too many examples	Model confuses pattern with rule	Use 2-5 diverse examples, not 20
No system prompt	No context for model behavior	Always set a clear system message
Temperature too high	Inconsistent, creative outputs	Use 0 for factual tasks, 0.7 for creative
Not validating output	JSON parse errors in production	Use response_format={"type": "json_object"} and validate the parsed result
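
The "always set a clear system message" fix from the table can be sketched as a small helper that puts the system prompt first and keeps it there as the conversation grows. The class name `Conversation` and its methods are hypothetical; the message dictionaries follow the role/content shape used in the requests earlier in this tutorial.

```python
from typing import Dict, List


class Conversation:
    """Hold a chat history whose first message is always the system prompt."""

    def __init__(self, system_prompt: str) -> None:
        self.messages: List[Dict[str, str]] = [
            {"role": "system", "content": system_prompt}
        ]

    def add_user(self, text: str) -> List[Dict[str, str]]:
        """Append a user turn and return a copy of the full history to send."""
        self.messages.append({"role": "user", "content": text})
        return list(self.messages)

    def add_assistant(self, text: str) -> None:
        """Record the model's reply so later turns keep the context."""
        self.messages.append({"role": "assistant", "content": text})
```

The list returned by `add_user` is what you would pass as `messages` in each request, so the persona and constraints travel with every turn rather than only the first.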

Practice Questions

What is the difference between zero-shot and few-shot prompting?

Answer: Zero-shot prompting gives the model a task with no examples. Few-shot prompting provides 2-5 examples demonstrating the desired input-output format, which helps the model understand the expected response structure.

Why does chain-of-thought prompting improve reasoning tasks?

Answer: CoT encourages the model to break down multi-step problems into intermediate steps. This reduces errors by making the reasoning process explicit and giving the model a chance to correct its own mistakes as it works through each step.

What is the purpose of a system prompt?

Answer: The system prompt sets the model's behavior, persona, and constraints for the entire conversation. It establishes rules that persist across multiple user messages, such as "Always respond in JSON" or "You are a helpful tutor."

How does temperature affect LLM outputs?

Answer: Temperature controls randomness in token selection. Lower temperature (0-0.3) produces more deterministic, focused outputs that suit factual tasks. Higher temperature (0.7-1.0) produces more creative, varied outputs for brainstorming or creative writing.

Why use structured output formats like JSON?

Answer: Structured outputs enable programmatic processing of model responses. JSON outputs can be parsed, validated, and used in automated pipelines without manual extraction, reducing errors and enabling integration with other systems.
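
The temperature answer above can be made concrete with a worked example. A common way to describe temperature is as a divisor applied to the model's raw token scores (logits) before they are turned into probabilities. The sketch below illustrates that idea with made-up logits; it is not the internals of any particular API, and `softmax_with_temperature` is a hypothetical helper name.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # A temperature of exactly 0 is usually treated as "always pick the top token".
        raise ValueError("temperature must be positive in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
focused = softmax_with_temperature(toy_logits, 0.2)
varied = softmax_with_temperature(toy_logits, 1.0)
print([round(p, 3) for p in focused])
print([round(p, 3) for p in varied])
```

At the lower temperature almost all of the probability lands on the top-scoring token, while at the higher temperature the alternatives keep a meaningful share, which is why sampled outputs become more varied.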

Challenge

Build a prompt engineering system that analyzes network security logs. The system should take raw log entries, classify each as "normal" or "suspicious," explain the reasoning for suspicious entries, and output a structured JSON report. Use few-shot prompting with at least 3 examples covering different attack types (port scan, brute force, DDoS).

Real-World Task

Design a prompt engineering system for automated code review in a security context. The system receives code snippets and must identify potential vulnerabilities (SQL injection, XSS, buffer overflow). Use a system prompt to set the security expert persona, chain-of-thought for vulnerability analysis, and JSON mode for structured output with the fields vulnerability_type, line_number, severity, and fix_suggestion.

Next Steps

Now learn RAG Systems to combine LLMs with your data using LangChain, and explore LLMs with Hugging Face Transformers. Python and OpenAI APIs power most production LLM applications.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
