


Unpacking Binary Data: A Practical Guide to PHP's `pack()` and `unpack()`
Jul 25, 2025 pm 05:59 PMPHP's pack() and unpack() functions are used to convert between PHP variables and binary data. 1. pack() packages variables such as integers and strings into binary data, and unpack() unpacks the binary data into PHP variables. Both rely on format strings to specify conversion rules. 2. Common format codes include C/c (8-bit with/unsigned characters), S/s (16-bit short integer), L/l/V/N (32-bit long integer, corresponding to different endianness), f/d (floating point/double precision), a/A (fill string), x (null byte), etc. 3. Endite order is crucial: V represents little-endian (Intel), N represents big-endian (network standard). V or N should be used first when communicating across platforms to ensure consistency. 4. In practical applications, unpack() can parse BMP file headers, such as unpack('C2signature/Vsize/Vreserved/Voffset', $binary) to read 14-byte BMP headers and name fields; pack() can be used to build custom binary protocols, such as pack('Cv', $type, $length) to generate message headers. 5. Notes include: confirming the byte order of the data, distinguishing whether there are signs, correctly handling string filling methods (a is filled with NULL, A is filled with spaces), and naming the fields through /name syntax to improve readability. 6. These functions are suitable for parsing binary file formats such as ZIP and PNG, communicating with hardware, implementing network protocols or processing C structure data, and providing fine control of the underlying data without expansion. As long as the format string accurately matches the data structure, binary data operations can be reliably performed.
Working with binary data in PHP might sound like something only low-level systems programs deal with, but it's actually useful in many practical scenarios—like parsing file formats (PNG, ZIP), handling network protocols, or communicating with hardware. PHP's pack()
and unpack()
functions are powerful tools for converting between data types and their binary representations.

Let's break down how they work and when to use them—without the jargon overload.
What Do pack()
and unpack()
Actually Do?
At their core:

-
pack()
converts PHP variables (like integers, strings) into binary data based on a format string. -
unpack()
reads binary data and converts it back into PHP variables, again using a format string.
They're like translators between PHP's high-level world and the raw bytes that computers and networks speak.
$data = pack('S', 258); // Convert integer 258 to 2-byte unsigned short $unpacked = unpack('S', $data); // Convert back var_dump($unpacked); // [1] => 258
Note: unpack()
returns an array, even for a single value.

Understanding the Format Codes
The magic (and confusion) lies in the format string —a sequence of letters and numbers telling PHP how to interpret the data.
Here are the most commonly used codes:
Code | Meaning | Size (bytes) |
---|---|---|
C | Unsigned char (8-bit) | 1 |
c | Signed char | 1 |
S | Unsigned short (16-bit, LE) | 2 |
s | Signed short | 2 |
L | Unsigned long (32-bit, LE) | 4 |
l | Signed long | 4 |
N | Unsigned long (32-bit, BE) | 4 |
V | Unsigned long (32-bit, LE) | 4 |
f | Float (32-bit) | 4 |
d | Double (64-bit) | 8 |
a | NUL-padded string | configurable |
A | Space-padded string | configurable |
x | NUL byte (skip) | 1 |
Note on endianness :
-
V
= Little-endian (Intel, most common) -
N
= Big-endian (network byte order) -
L
= Machine-dependent (usually little-endian)
If you're dealing with cross-platform or network data, prefer N
and V
for predictable behavior.
Practical Example: Reading a BMP File Header
Let's say you want to read basic info from a BMP file. The first 14 bytes are the file header.
$file = fopen('image.bmp', 'rb'); $binary = fread($file, 14); fclose($file); // Unpack BMP header $header = unpack('C2signature/Vsize/Vreserved/Voffset', $binary); // Results: // $header['signature1'] = 66 ('B') // $header['signature2'] = 77 ('M') // $header['size'] = file size in bytes // $header['offset'] = where pixel data starts
Here:
-
C2signature
reads two bytes as unsigned chars and names themsignature1
,signature2
-
Vsize
reads a 32-bit unsigned long in little-endian (BMP uses little-endian) - You can name fields by adding
/name
after the format code
This makes it easy to parse structured binary formats.
Packing Data: Building a Binary Message
Imagine you're creating a simple binary protocol where each message contains:
- Message type (1 byte)
- Length (2 bytes, little-endian)
- Payload (string)
$messageType = 1; $payload = 'Hello'; $length = strlen($payload); $binary = pack('Cv', $messageType, $length) . $payload;
Here:
-
C
= message type (unsigned char) -
v
= length as unsigned short, little-endian (lowercase = little-endian)
Now $binary
can be sent over a socket or saved to a file.
To read it back:
$data = unpack('Ctype/vlength', $binary); $payload = substr($binary, 3, $data['length']);
Tips and Gotchas
- Endianness matters : When working with hardware or network protocols, always confirm byte order. Use
N
(big-endian) orV
(little-endian) for consistency. - Signed vs. unsigned :
c
vsC
,s
vsS
—mixing them up leads to negative numbers or overflow. - String padding :
a
pads with NULLs,A
with spaces. Usea
for binary-safe strings. - Field naming : Use
/name
syntax to make results readable:unpack('Cversion/Ctype/a10data', $bin); // returns ['version'=>1, 'type'=>2, 'data'=>"payload..."]
-
unpack()
returns indexed arrays by default unless you name the fields. - Parsing binary file formats: ZIP, PNG, WAV, etc.
- Communicating with embedded devices or sensors
- Creating or reading custom binary protocols
- Handling packed data from C structures or databases
When Should You Use These Functions?
They're not everyday tools, but when you need them, there's no real alternative in pure PHP.
Basically, pack()
and unpack()
give you fine-grained control over binary data without needing extensions or external libraries. It's not flashy, but once you understand the format codes, you can interface with low-level data structures directly in PHP.
Just remember: match your format string to the data's structure and byte order—get that right, and the rest follows.
The above is the detailed content of Unpacking Binary Data: A Practical Guide to PHP's `pack()` and `unpack()`. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undress AI Tool
Undress images for free

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Nullbytes(\0)cancauseunexpectedbehaviorinPHPwheninterfacingwithCextensionsorsystemcallsbecauseCtreats\0asastringterminator,eventhoughPHPstringsarebinary-safeandpreservefulllength.2.Infileoperations,filenamescontainingnullbyteslike"config.txt\0.p

sprintf and vsprintf provide advanced string formatting functions in PHP. The answers are: 1. The floating point accuracy and %d can be controlled through %.2f, and the integer type can be ensured with d, and zero padding can be achieved with d; 2. The variable position can be fixed using positional placeholders such as %1$s and %2$d, which is convenient for internationalization; 3. The left alignment and ] right alignment can be achieved through %-10s, which is suitable for table or log output; 4. vsprintf supports array parameters to facilitate dynamic generation of SQL or message templates; 5. Although there is no original name placeholder, {name} syntax can be simulated through regular callback functions, or the associative array can be used in combination with extract(); 6. Substr_co

TodefendagainstXSSandinjectioninPHP:1.Alwaysescapeoutputusinghtmlspecialchars()forHTML,json_encode()forJavaScript,andurlencode()forURLs,dependingoncontext.2.Validateandsanitizeinputearlyusingfilter_var()withappropriatefilters,applywhitelistvalidation

PHP's PCRE function supports advanced regular functions, 1. Use capture group() and non-capture group (?:) to separate matching content and improve performance; 2. Use positive/negative preemptive assertions (?=) and (?!)) and post-issue assertions (???)) and post-issue assertions (??

UTF-8 processing needs to be managed manually in PHP, because PHP does not support Unicode by default; 1. Use the mbstring extension to provide multi-byte security functions such as mb_strlen, mb_substr and explicitly specify UTF-8 encoding; 2. Ensure that database connection uses utf8mb4 character set; 3. Declare UTF-8 through HTTP headers and HTML meta tags; 4. Verify and convert encoding during file reading and writing; 5. Ensure that the data is UTF-8 before JSON processing; 6. Use mb_detect_encoding and iconv for encoding detection and conversion; 7. Preventing data corruption is better than post-repair, and UTF-8 must be used at all levels to avoid garbled code problems.

Rawstringsindomain-drivenapplicationsshouldbereplacedwithvalueobjectstopreventbugsandimprovetypesafety;1.Usingrawstringsleadstoprimitiveobsession,whereinterchangeablestringtypescancausesubtlebugslikeargumentswapping;2.ValueobjectssuchasEmailAddressen

PHP's native serialization is more suitable for PHP's internal data storage and transmission than JSON, 1. Because it can retain complete data types (such as int, float, bool, etc.); 2. Support private and protected object properties; 3. Can handle recursive references safely; 4. There is no need for manual type conversion during deserialization; 5. It is usually better than JSON in performance; but it should not be used in cross-language scenarios, and unserialize() should never be called for untrusted inputs to avoid triggering remote code execution attacks. It is recommended to use it when it is limited to PHP environment and requires high-fidelity data.

Character-levelstringmanipulationcanseverelyimpactperformanceinimmutable-stringlanguagesduetorepeatedallocationsandcopying;1)avoidrepeatedconcatenationusing =inloops,insteadusemutablebufferslikelist ''.join()inPythonorStringBuilderinJava;2)minimizein
