sed and awk: Text Processing Power Tools

DodaTech Updated 2026-06-22 7 min read

In this tutorial, you'll learn sed and awk for stream editing and text processing including search and replace, field extraction, report generation, and combining both tools for complex text transformations.

Why sed and awk Matter

sed and awk are two of the most widely used text-processing tools on Unix, and they ship with virtually every Linux and macOS system. sed excels at stream editing: find and replace, line deletion, and text transformations. awk adds column-based processing, arithmetic, and report generation. Together, they handle most text manipulation tasks without writing a full program.

By the end of this guide, you will use sed for search/replace and filtering, awk for field extraction and reporting, and combine both for complex pipelines.

What are sed and awk?

sed (stream editor) reads input line by line and applies editing commands. It is ideal for simple transformations. awk is a pattern-scanning and processing language. It splits each line into fields and supports variables, conditionals, and loops.

flowchart LR
  A[Input Text] --> B[sed]
  A --> C[awk]
  B --> D[Search & Replace]
  B --> E[Line Deletion]
  B --> F[Text Insertion]
  C --> G[Field Extraction]
  C --> H[Report Generation]
  C --> I[Data Aggregation]
  D --> J[Output Text]
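
To make the split in the diagram concrete, here is a minimal sketch that runs both tools on the same input. The sample data and the variable names (`sample`, `sed_result`, `awk_result`) are made up for illustration.

```shell
# Hypothetical two-line input; names and numbers are illustrative only.
sample='alice 30
bob 25'

# sed edits the text of each line: replace the first space with a colon.
sed_result=$(printf '%s\n' "$sample" | sed 's/ /:/')

# awk treats each line as a list of fields: swap the two columns.
awk_result=$(printf '%s\n' "$sample" | awk '{print $2, $1}')

printf '%s\n' "$sed_result" "$awk_result"
# alice:30
# bob:25
# 30 alice
# 25 bob
```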

sed Fundamentals

Basic Syntax

sed 'command' file.txt
sed -i 'command' file.txt  # In-place edit (GNU sed; BSD/macOS sed needs -i '')

Search and Replace

# Replace first occurrence on each line
sed 's/old/new/' file.txt

# Replace all occurrences (global)
sed 's/old/new/g' file.txt

# Replace only on lines matching a pattern
sed '/pattern/s/old/new/' file.txt

# Case-insensitive replace (the i/I flag is a GNU sed extension)
sed 's/old/new/gi' file.txt

Expected Output

$ echo "foo bar foo baz" | sed 's/foo/FIXED/'
FIXED bar foo baz

$ echo "foo bar foo baz" | sed 's/foo/FIXED/g'
FIXED bar FIXED baz
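
Two more variations can be checked the same way. This is a small sketch on made-up input: a numeric flag selects which occurrence to replace, and an address pattern limits the substitution to matching lines. The variable names are illustrative.

```shell
line='foo bar foo baz'

# A numeric flag replaces only that occurrence (here, the second one).
second=$(printf '%s\n' "$line" | sed 's/foo/FIXED/2')

# An address pattern restricts the substitution to matching lines.
addressed=$(printf 'keep foo\nedit foo\n' | sed '/^edit/s/foo/FIXED/')

printf '%s\n' "$second" "$addressed"
# foo bar FIXED baz
# keep foo
# edit FIXED
```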

Line Operations

# Delete lines containing a pattern
sed '/pattern/d' file.txt

# Delete empty lines
sed '/^$/d' file.txt

# Delete line 5
sed '5d' file.txt

# Delete lines 10-20
sed '10,20d' file.txt

# Print specific lines
sed -n '5,10p' file.txt  # Print lines 5-10

# Insert before a line (this one-line form is GNU sed syntax)
sed '5i\This is inserted before line 5' file.txt

# Append after a line (this one-line form is GNU sed syntax)
sed '5a\This is appended after line 5' file.txt
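
The line operations above can be tried without touching any file. The sketch below generates a small made-up input inline. The last command uses the insert form with a newline after the backslash, which should also work on sed implementations that do not accept the one-line form.

```shell
# Hypothetical five-line input, generated inline so nothing on disk is touched.
input=$(printf 'line %s\n' 1 2 3 4 5)

deleted=$(printf '%s\n' "$input" | sed '2,4d')     # keeps lines 1 and 5
printed=$(printf '%s\n' "$input" | sed -n '2,3p')  # keeps lines 2 and 3

# Insert with a newline after the backslash (the portable form).
inserted=$(printf 'first\nsecond\n' | sed '2i\
inserted line')

printf '%s\n' "$deleted" "$printed" "$inserted"
```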

Advanced sed

# Multiple commands
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file.txt

# Write matches to another file
sed -n '/ERROR/w errors.txt' log.txt

# In-place edit with backup
sed -i.bak 's/foo/bar/g' config.conf

# Replace only on lines 3-8
sed '3,8s/old/new/g' file.txt

# Use different delimiter
sed 's|/path/to/old|/path/to/new|g' paths.txt
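
Here is a short sketch of a few of these commands on throwaway data. The temporary directory and the `config.conf` contents are hypothetical. The `-i.bak` form with the suffix attached is used because both GNU and BSD sed are generally understood to accept it.

```shell
# Multiple -e expressions run in order on each line.
multi=$(printf 'foo baz\n' | sed -e 's/foo/bar/g' -e 's/baz/qux/g')

# An alternate delimiter avoids escaping the slashes in paths.
path=$(printf '/path/to/old/file\n' | sed 's|/path/to/old|/path/to/new|g')

# In-place edit with a backup, done in a scratch directory.
workdir=$(mktemp -d)
printf 'foo=1\n' > "$workdir/config.conf"
sed -i.bak 's/foo/bar/g' "$workdir/config.conf"
edited=$(cat "$workdir/config.conf")
backup=$(cat "$workdir/config.conf.bak")
rm -rf "$workdir"

printf '%s\n' "$multi" "$path" "$edited" "$backup"
# bar qux
# /path/to/new/file
# bar=1
# foo=1
```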

Practical sed Examples

# Remove trailing whitespace
sed -i 's/[[:space:]]*$//' file.txt

# Convert tabs to spaces (\t in the pattern is a GNU sed extension)
sed -i 's/\t/    /g' file.txt

# Uppercase first letter of each word (\b and \u are GNU sed extensions)
sed 's/\b\(.\)/\u\1/g' file.txt

# Comment out lines with "DEBUG"
sed '/DEBUG/s/^/# /' config.conf
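
The sketch below runs three of these on inline sample text. For the tab conversion it passes a literal tab through a shell variable (`tab`, an illustrative name), which is one way to avoid relying on `\t` support in sed.

```shell
# Hold a literal tab in a variable so the sed pattern does not need \t.
tab=$(printf '\t')
converted=$(printf 'a\tb\n' | sed "s/$tab/    /g")

# Strip trailing whitespace.
trimmed=$(printf 'text   \n' | sed 's/[[:space:]]*$//')

# Comment out only the lines that mention DEBUG.
commented=$(printf 'DEBUG=true\nPORT=80\n' | sed '/DEBUG/s/^/# /')

printf '%s\n' "$converted" "$trimmed" "$commented"
# a    b
# text
# # DEBUG=true
# PORT=80
```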

awk Fundamentals

By default, awk splits each input line into fields separated by runs of whitespace. $1 is the first field, $2 the second, and $0 is the entire line.
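
A quick sketch of how field splitting behaves, using made-up input. It shows that default splitting ignores leading and repeated whitespace, while a single-character separator set with `-F` treats every occurrence as a boundary, so empty fields are counted.

```shell
record='Alice 30 Developer'

first=$(printf '%s\n' "$record" | awk '{print $1}')   # first field
last=$(printf '%s\n' "$record" | awk '{print $NF}')   # last field

# Default splitting: leading and repeated blanks do not create empty fields.
padded=$(printf '   a    b  \n' | awk '{print NF}')

# With -F: every colon separates fields, so "a::b" has an empty middle field.
colon=$(printf 'a::b\n' | awk -F: '{print NF}')

printf '%s\n' "$first" "$last" "$padded" "$colon"
# Alice
# Developer
# 2
# 3
```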

Field Extraction

# Print first field of each line
awk '{print $1}' file.txt

# Print first and third fields
awk '{print $1, $3}' file.txt

# Print with custom separator (CSV)
awk -F, '{print $1, $2}' data.csv

# Print formatted output
awk '{printf "%-10s %s\n", $1, $3}' file.txt

Expected Output

$ cat data.txt
Alice 30 Developer
Bob 25 Designer
Charlie 35 Manager

$ awk '{print $1, $3}' data.txt
Alice Developer
Bob Designer
Charlie Manager

Patterns in awk

# Print lines matching a pattern
awk '/ERROR/ {print}' log.txt

# Print lines where field matches
awk '$1 == "Alice" {print}' data.txt

# Numeric comparison
awk '$2 > 30 {print $1, "is over 30"}' data.txt

# Range pattern: print from a line matching BEGIN through the next line matching END
awk '/BEGIN/,/END/ {print}' file.txt
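
The patterns can be tried against the three-line data.txt sample shown earlier. In this sketch the data is held in a shell variable instead of a file, and the range example uses a small made-up input.

```shell
data='Alice 30 Developer
Bob 25 Designer
Charlie 35 Manager'

# Numeric comparison: only Charlie is strictly over 30.
over30=$(printf '%s\n' "$data" | awk '$2 > 30 {print $1, "is over 30"}')

# String comparison on a field; the default action prints the whole line.
exact=$(printf '%s\n' "$data" | awk '$1 == "Alice"')

# Range pattern: prints the matching start line, the end line, and everything between.
range=$(printf 'a\nBEGIN\nb\nEND\nc\n' | awk '/BEGIN/,/END/')

printf '%s\n' "$over30" "$exact" "$range"
# Charlie is over 30
# Alice 30 Developer
# BEGIN
# b
# END
```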

awk Variables

# Built-in variables
awk '{print NR, NF, $0}' file.txt
# NR = record (line) number, NF = number of fields in the current record

# Field separator as variable
awk -v FS=',' '{print $1}' data.csv

# Custom variables
awk -v min=30 '$2 > min {print $1}' data.txt
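
A small sketch of these variables in action on inline sample data. It also sets OFS, the built-in output field separator, which is not covered above but pairs naturally with FS. The threshold name `min` follows the example above.

```shell
# NR and NF change per record.
numbered=$(printf 'a b\nc d e\n' | awk '{print NR, NF, $0}')

# A value passed with -v is compared numerically when it looks like a number.
data='Alice 30 Developer
Bob 25 Designer
Charlie 35 Manager'
filtered=$(printf '%s\n' "$data" | awk -v min=30 '$2 >= min {print $1}')

# OFS controls what the commas in print expand to.
joined=$(printf 'a,b\n' | awk -F, -v OFS=';' '{print $1, $2}')

printf '%s\n' "$numbered" "$filtered" "$joined"
# 1 2 a b
# 2 3 c d e
# Alice
# Charlie
# a;b
```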

awk Calculations

# Sum a column
awk '{sum += $2} END {print "Total:", sum}' sales.txt

# Average (guarded against empty input)
awk '{sum += $2; count++} END {if (count > 0) print "Average:", sum/count}' data.txt

# Count matches (count+0 prints 0 when nothing matched)
awk '/ERROR/ {count++} END {print count+0, "errors found"}' log.txt

# Min and max
awk 'NR == 1 {min = $2; max = $2} $2 < min {min = $2} $2 > max {max = $2} END {print "Min:", min, "Max:", max}' data.txt
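
The calculations can be verified on a tiny made-up dataset. This sketch also shows one way to guard the average against empty input, where the record count stays at zero and an unguarded division would fail.

```shell
# Hypothetical sales data: item name and amount.
sales='widget 10
gadget 25
gizmo 7'

total=$(printf '%s\n' "$sales" | awk '{sum += $2} END {print "Total:", sum}')

average=$(printf '%s\n' "$sales" \
  | awk '{sum += $2; n++} END { if (n > 0) { print "Average:", sum / n } }')

# With no input records, n is never incremented, so the guard takes the else branch.
empty=$(printf '' \
  | awk '{sum += $2; n++} END { if (n > 0) { print sum / n } else { print "no data" } }')

minmax=$(printf '%s\n' "$sales" \
  | awk 'NR == 1 {min = $2; max = $2} $2 < min {min = $2} $2 > max {max = $2}
         END {print "Min:", min, "Max:", max}')

printf '%s\n' "$total" "$average" "$empty" "$minmax"
# Total: 42
# Average: 14
# no data
# Min: 7 Max: 25
```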

awk Functions

# String functions
awk '{print toupper($1), length($1)}' names.txt

# Substring
awk '{print substr($1, 1, 3)}' names.txt

# Split
awk '{split($0, arr, ","); print arr[1]}' data.csv

# Math functions
awk '{print sqrt($2), int($2)}' numbers.txt
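
Here is a brief sketch of the same functions on inline input, with the results they should produce. The inputs are made up. Note that split returns the number of pieces, which is handy for reaching the last element.

```shell
upper=$(printf 'alice\n' | awk '{print toupper($1), length($1)}')

prefix=$(printf 'charlie\n' | awk '{print substr($1, 1, 3)}')

# split returns the element count; awk arrays are indexed from 1.
parts=$(printf 'a,b,c\n' | awk '{n = split($0, arr, ","); print n, arr[1], arr[n]}')

# int truncates toward zero.
math=$(printf 'x 16.7\n' | awk '{print sqrt(16), int($2)}')

printf '%s\n' "$upper" "$prefix" "$parts" "$math"
# ALICE 5
# cha
# 3 a c
# 4 16
```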

Combining sed and awk

# Pipe: sed to clean, awk to process
sed 's/[^a-zA-Z0-9 ]//g' dirty.txt | awk '{print $1, $NF}'

# awk to filter, sed to modify
awk '$2 > 100' data.txt | sed 's/^/HIGH: /'

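# Complex pipeline (assumes the combined log format)
# Stripping the [timestamp] removes two fields, so the request path moves from $7 to $5
sed 's/\[.*\]//' access.log \
  | awk '{print $1, $5}' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -10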
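
When sed removes text before awk splits the line, field numbers shift. The sketch below runs a similar pipeline on three made-up log lines in the common combined format (an assumption about the log layout). Once the bracketed timestamp is stripped, the request path is field 5 in this sample. The final awk only normalizes the padding that uniq -c adds.

```shell
# Hypothetical access log entries in combined log format.
log='10.0.0.1 - - [01/Jan/2025:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2025:00:00:02 +0000] "GET /about.html HTTP/1.1" 200 256
10.0.0.1 - - [01/Jan/2025:00:00:03 +0000] "GET /index.html HTTP/1.1" 200 512'

top=$(printf '%s\n' "$log" \
  | sed 's/\[.*\]//' \
  | awk '{print $1, $5}' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -10 \
  | awk '{print $1, $2, $3}')

printf '%s\n' "$top"
# 2 10.0.0.1 /index.html
# 1 10.0.0.2 /about.html
```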

Real-World Examples

Log File Analysis

# Count error types in a log file
awk '/ERROR/ {errors[$NF]++} END {for (e in errors) print e, errors[e]}' application.log

# Find top 10 IP addresses accessing a server
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

# Average response time per endpoint (assumes the last field is the response time)
awk '{endpoints[$7] += $NF; counts[$7]++} END {for (e in endpoints) print e, endpoints[e]/counts[e]}' access.log
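
awk does not guarantee any order when looping over an array with for (e in ...), so piping the result through sort keeps reports stable. The sketch below uses made-up log lines. The access log layout, with a response time in milliseconds as the last field, is an assumption for illustration.

```shell
# Hypothetical application log: the last field is the error type.
applog='2025-01-01 ERROR db timeout
2025-01-01 INFO web started
2025-01-01 ERROR db timeout
2025-01-01 ERROR web refused'

counts=$(printf '%s\n' "$applog" \
  | awk '/ERROR/ {errors[$NF]++} END {for (e in errors) print e, errors[e]}' \
  | sort)

# Hypothetical access log: field 7 is the path, the last field is time in ms.
access='10.0.0.1 - - [01/Jan/2025:00:00:01 +0000] "GET /a HTTP/1.1" 200 512 100
10.0.0.2 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 512 300
10.0.0.3 - - [01/Jan/2025:00:00:03 +0000] "GET /b HTTP/1.1" 200 128 50'

averages=$(printf '%s\n' "$access" \
  | awk '{total[$7] += $NF; hits[$7]++} END {for (e in total) print e, total[e] / hits[e]}' \
  | sort)

printf '%s\n' "$counts" "$averages"
# refused 1
# timeout 2
# /a 200
# /b 50
```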

CSV Processing

# Convert CSV to tab-separated (GNU sed; does not handle quoted fields that contain commas)
sed 's/,/\t/g' data.csv

# Sum a column in CSV (with header)
awk -F, 'NR > 1 {sum += $3} END {print sum}' data.csv

# Extract specific columns from CSV
awk -F, '{print $1, $4}' data.csv | column -t
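
Below is a sketch of the same CSV tasks on a small made-up table. The conversion to tabs uses awk with OFS instead of sed, which avoids depending on `\t` support in sed. Like the commands above, it splits on every comma and so does not handle quoted fields that contain commas.

```shell
# Hypothetical CSV with a header row.
csv='name,dept,salary,city
Ann,Eng,100,Oslo
Ben,Ops,80,Rome'

# Sum the third column, skipping the header.
total=$(printf '%s\n' "$csv" | awk -F, 'NR > 1 {sum += $3} END {print sum}')

# Pick columns 1 and 4, tab-separated.
cols=$(printf '%s\n' "$csv" | awk -F, -v OFS='\t' '{print $1, $4}')

# Assigning $1 to itself makes awk rebuild the record using OFS.
tsv=$(printf '%s\n' "$csv" | awk -F, -v OFS='\t' '{$1 = $1; print}')

printf '%s\n' "$total" "$cols" "$tsv"
```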

Configuration File Editing

# Uncomment a line in a config file
sed -i '/^# server_host/s/^# //' config.conf

# Update a configuration value
sed -i 's/^max_connections = .*/max_connections = 200/' config.conf

# Add a line after a match
sed -i '/^\[database\]/a host = localhost' config.conf
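
Because the -i option differs between GNU and BSD sed, one portable alternative is to write to a temporary file and move it into place. This is a minimal sketch on a scratch file. The file contents and the variable name `conf` are hypothetical.

```shell
# Scratch config file with hypothetical settings.
conf=$(mktemp)
printf '# server_host = example\nmax_connections = 50\n' > "$conf"

# Apply both edits, write to a sibling file, then replace the original.
sed -e '/^# server_host/s/^# //' \
    -e 's/^max_connections = .*/max_connections = 200/' \
    "$conf" > "$conf.new" && mv "$conf.new" "$conf"

result=$(cat "$conf")
rm -f "$conf"

printf '%s\n' "$result"
# server_host = example
# max_connections = 200
```

The `&&` ensures the original is only replaced when sed exits successfully.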

Common Errors

| Problem | Cause | Fix |
|---|---|---|
| sed: -e expression #1, char 1: unknown command | The script does not start with a valid command | sed commands start with a letter: s for substitute, d for delete |
| sed: extra characters after command | Stray characters follow a complete command | Remove the trailing characters, or separate commands with ; or -e |
| awk: division by zero | The divisor is zero, for example an empty field or a count that was never incremented | Check before dividing: if ($2 != 0) ... |
| awk: fatal: cannot open file | File not found or not readable | Verify file path and permissions |
| sed in-place not working on macOS | BSD sed requires an argument after -i | Use sed -i '' 's/old/new/g' file.txt (empty string for no backup) |

Practice Questions

1. How do you replace all occurrences of "foo" with "bar" using sed?

sed 's/foo/bar/g' file.txt

2. What does $1 represent in awk?

The first field of the current record (line).

3. How do you print lines 10-20 of a file with sed?

sed -n '10,20p' file.txt

4. What is the difference between NR and NF in awk?

NR is the current record (line) number. NF is the number of fields in the current record.

5. How do you sum a column of numbers in awk?

awk '{sum += $1} END {print sum}' file.txt

Challenge

Write a sed+awk pipeline that processes a web server access log and generates a report showing: the top 10 IP addresses by request count, the number of 404 errors by URL, and the average response time per HTTP method (GET, POST, etc.).

Real-World Task

Use awk to analyze a system log file. Extract all ERROR and WARNING messages, group them by component name, count occurrences of each, and generate a summary report. Use sed to sanitize the input (remove timestamps, anonymize IP addresses). The final output should be a formatted table showing component, error count, and severity level.

Should I use sed or awk for a given task?

Use sed for simple substitutions, deletions, and line-based operations. Use awk for field extraction, calculations, and report generation. For complex workflows, combine both with pipes.

Are there modern alternatives to sed and awk?

Yes. sd is a find-and-replace tool with a simpler syntax than sed, and xsv is a CSV toolkit. But sed and awk are available on nearly every Unix-like system without extra installation.

Can sed and awk handle large files?

Yes. Both process input line by line, so they can handle files larger than available memory, as long as the script itself does not accumulate data (for example, in a large awk array). They are well suited to log files and data dumps.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
