sed and awk are two of the most powerful text-processing tools in Linux. They allow you to manipulate text efficiently from the command line, making them essential for log parsing, configuration file editing, and data transformation. Here's how to use them effectively.

1. Using sed for Stream Editing
sed (stream editor) is ideal for performing basic text transformations on an input stream (a file or input from a pipeline).
Common sed Operations
- Substitute text
Replace the first occurrence of a pattern on each line:
sed 's/old/new/' file.txt
Replace all occurrences:
sed 's/old/new/g' file.txt
- Replace on specific lines
Only replace on line 3:
sed '3s/old/new/' file.txt
Replace in a range (lines 2 to 5):
sed '2,5s/old/new/g' file.txt
- Delete lines
Delete blank lines:
sed '/^$/d' file.txt
Delete lines containing a pattern:
sed '/error/d' file.log
- Insert or append text
Insert a line before a match:
sed '/start/i\New line before' file.txt
Append a line after a match:
sed '/end/a\New line after' file.txt
- Edit files in place
Use -i to save changes directly:
sed -i 's/foo/bar/g' config.txt
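A portability note: GNU sed (standard on most Linux distributions) accepts a bare -i, but BSD/macOS sed requires a backup-suffix argument, which may be empty:
sed -i 's/foo/bar/g' config.txt      # GNU sed
sed -i '' 's/foo/bar/g' config.txt   # BSD/macOS sed (empty backup suffix)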
Tip: Use a different delimiter (like |) to avoid "slash hell" when working with paths:
sed 's|/home/user|/tmp|g' file.txt
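You can also chain several edits in a single invocation with -e (or separate them with semicolons); for instance, substituting and then dropping comment lines from the same hypothetical file:
sed -e 's/foo/bar/g' -e '/^#/d' file.txt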
2. Using awk for Field-Based Text Processing
awk excels at processing structured text (like CSV or log files), where data is organized in fields.
Basic awk Syntax
awk 'pattern { action }' file.txt
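Both parts are optional: a pattern alone prints the lines it matches, and an action alone runs on every line. BEGIN and END blocks run before the first line and after the last one; a minimal sketch against a hypothetical data.txt:
awk 'BEGIN {print "Report:"} {print $1} END {print "Done."}' data.txt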
Print specific fields
By default, fields are separated by whitespace. Print the first and third fields:
awk '{print $1, $3}' data.txt
Use a custom delimiter
For comma-separated values:
awk -F',' '{print $2}' users.csv
Or with a colon (e.g., /etc/passwd):
awk -F':' '{print $1, $6}' /etc/passwd
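The output field separator is controlled separately through OFS; for example, re-joining the same passwd fields with an arrow:
awk 'BEGIN {FS=":"; OFS=" -> "} {print $1, $6}' /etc/passwd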
Filter lines with patterns
Print lines where the second field equals "active":
awk '$2 == "active" {print $0}' status.txt
Print lines containing the word "error":
awk '/error/ {print NR, $0}' app.log
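A regex can also be scoped to a single field with the ~ operator; a sketch assuming the third field of app.log holds an HTTP status code:
awk '$3 ~ /^5[0-9][0-9]$/ {print NR, $0}' app.log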
Perform calculations
Sum values in the second column (note the += accumulator; a plain = would only keep the last value):
awk '{sum += $2} END {print "Total:", sum}' numbers.txt
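The same accumulator pattern extends to other aggregates, such as an average over the same hypothetical numbers.txt:
awk '{sum += $2; n++} END {if (n) print "Average:", sum / n}' numbers.txt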
Add conditions and formatting
awk '$3 > 100 {print $1 " is over budget"}' expenses.txt
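For column-aligned output, printf offers finer control than print; a sketch against the same hypothetical expenses.txt:
awk '$3 > 100 {printf "%-12s %10.2f\n", $1, $3}' expenses.txt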
Built-in variables
- NR – Current record (line) number
- NF – Number of fields in the current line
- FILENAME – Name of the input file
Example:
awk '{print NR ". " $0 " (fields: " NF ")"}' data.txt
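One related variable worth knowing: when awk reads multiple files, FNR resets to 1 for each file while NR keeps counting, which makes per-file headers easy (a.txt and b.txt are hypothetical):
awk 'FNR == 1 {print "==>", FILENAME, "<=="} {print}' a.txt b.txt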
3. Combining sed and awk
You can pipe sed and awk together for advanced processing:
Clean up text with sed, then process with awk:
sed 's/^[ \t]*//; s/[ \t]*$//' file.txt | awk '$2 > 50 {print $1}'
(Removes leading/trailing whitespace, then prints first field if second field > 50)
Extract and reformat data:
awk -F'|' '{print $2, $4}' data.txt | sed 's/$/ added/'
(Prints fields 2 and 4, then appends " added" to each line)
4. Practical Examples
Extract IPs from a log and count occurrences:
grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' access.log | sort | uniq -c
Then use awk to filter suspicious ones:
awk '$1 > 100 {print $2}' # IPs accessed more than 100 times
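Putting both steps together and sorting by hit count (same hypothetical access.log):
grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' access.log | sort | uniq -c | sort -rn | awk '$1 > 100 {print $2}'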
Modify configuration files safely:
sed -i.bak 's/^#\(Port 22\)/\1/' /etc/ssh/sshd_config
(Uncomments "Port 22" and creates a backup)
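Since -i.bak keeps the original as sshd_config.bak, you can review exactly what changed before reloading the service:
diff /etc/ssh/sshd_config.bak /etc/ssh/sshd_config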
Format output from ps:
ps aux | awk '$3 > 5.0 {print $1, $2, $3, $11}' | head -10
(Lists up to 10 processes using more than 5% CPU)
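Note that ps aux output isn't sorted by CPU usage, so for a true top-10 list, sort on the CPU column first (NR > 1 skips the header line):
ps aux | awk 'NR > 1 && $3 > 5.0 {print $1, $2, $3, $11}' | sort -k3 -rn | head -10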
Both tools are scriptable and can handle complex logic, but for quick command-line text manipulation, even simple one-liners save a lot of time. Start with basic substitutions and field printing, then build up as needed.
Basically, if you're editing text streams, use sed. If you're analyzing or reporting on structured data, reach for awk.