InnoDB is a storage engine that can store data in tables on disk, so even if the server is shut down after a restart, our data can still be retained. The actual process of processing data occurs in memory, so the data in the disk needs to be loaded into the memory. If it is processing a write or modification request, the contents in the memory also need to be refreshed to the disk. And we know that the speed of reading and writing to disk is very slow, which is several orders of magnitude different from reading and writing in memory. So when we want to get certain records from the table, does the InnoDB storage engine need to read the records from the disk one by one?
The method adopted by InnoDB is to divide the data into several pages, and use pages as the basic unit of interaction between disk and memory. The size of a page in InnoDB is generally 16KB. That is to say, under normal circumstances, at least 16KB of content is read from the disk to the memory at a time, and at least 16KB of the content in the memory is refreshed to the disk at a time.
mysql> show variables like '%innodb_page_size%'; +------------------+-------+ | Variable_name | Value | +------------------+-------+ | innodb_page_size | 16384 | +------------------+-------+ 1 row in set (0.00 sec)
We usually insert data into tables in units of records. The way these records are stored on disk is also called row format or record format. The InnoDB storage engine has designed four different types of row formats, namely Compact, Redundant, Dynamic and Compressed row formats.
Classification and introduction of row record formats
In early InnoDB versions, since there was only one file format, there was no need to name this file format. In order to support new features and incompatibility with earlier versions, the InnoDB engine has developed a new file format. To help manage system compatibility in upgrade and downgrade situations, as well as running different MySQL versions, InnoDB started using a named file format.
In msyql 5.7.9 and later versions, the default row format is determined by the innodb_default_row_format variable, and its default value is dynamic:
mysql> show variables like "innodb_file_format"; +--------------------+-----------+ | Variable_name | Value | +--------------------+-----------+ | innodb_file_format | Barracuda | +--------------------+-----------+ 1 row in set (0.01 sec) mysql> show variables like "innodb_default_row_format"; +---------------------------+---------+ | Variable_name | Value | +---------------------------+---------+ | innodb_default_row_format | dynamic | +---------------------------+---------+ 1 row in set (0.00 sec)
View the current table using Row format:
mysql> show table status like 'dept_emp'\G*************************** 1. row *************************** Name: dept_emp Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 331570 Avg_row_length: 36 Data_length: 12075008Max_data_length: 0 Index_length: 5783552 Data_free: 0 Auto_increment: NULL Create_time: 2021-08-11 09:04:36 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment:1 row in set (0.00 sec)
Specify the row format of the table:
CREATE TABLE 表名(列的信息) ROW_FORMAT=行格式名稱(chēng)ALTER TABLE 表名 ROW_FORMAT=行格式名稱(chēng);
If you want to modify the row mode of an existing table to compressed or dynamic, you must first set the file format to Barracuda: set global innodb_file_format=Barracuda;, and then use ALTER TABLE tablename ROW_FORMAT=COMPRESSED; to modify it to take effect.
Row format
COMPACT
Variable length field list
MySQL supports some variable length data Types, such as VARCHAR(M), VARBINARY(M), various TEXT types, and various BLOB types. We can also call columns with these data types variable-length fields. How many bytes of data are stored in variable-length fields? It is not fixed, so when we store real data, we need to store the number of bytes occupied by these data. If the maximum number of bytes allowed to be stored in the variable field (M × W) exceeds 255 bytes and the actual number of bytes stored (L) exceeds 127 bytes, use 2 bytes to record, otherwise use 1 byte Record.
Question 1: So why use 128 as the dividing line? One byte can represent up to 255, but when MySQL designs the length representation, in order to distinguish whether it is a byte that represents the length, it is stipulated that if the highest bit is 1, then two bytes represent the length, otherwise it is one byte. For example, 01111111, this means the length is 127, and if the length is 128, two bytes are needed, which is 10000000 10000000. The highest bit of the first byte is 1, then this is the beginning of the two bytes representing the length. The second byte can use all bits to represent the length, and it should be noted that MySQL adopts Little Endian counting method, with the low bit first and the high bit last, so 129 is 10000001 10000000. The maximum length of this identification method is 32767, which is 32KB.
Question 2: What should I do if two bytes are not enough to represent the length? The default page size of innoDB is 16KB. For some fields that occupy a lot of bytes, for example, the length of a field is greater than 16KB. If the record cannot be stored in a single page, InnoDB will store part of the data in the so-called overflow. In the page, only the length left in this page is stored in the variable-length field length list, so it can be stored using two bytes. This overflow page mechanism refers to the data overflow later.
NULL value list
Some columns in the table may store NULL values. If these NULL values ??are stored in the real data of the record, it will take up a lot of space, so the Compact row format These columns with NULL values ??are managed uniformly and stored in a NULL value list. For each column that can store NULL, there will be a corresponding binary bit. When the value of the binary bit is 1, it means that the value of the column is NULL. When the binary bit value is 0, it means that the value of the column is not NULL.
Record header information
The record header information used to describe the record, which is composed of fixed 5 bytes. 5 bytes are 40 binary bits, and different bits represent different meanings.
字段 | 長(zhǎng)度(bit) | 說(shuō)明 |
---|---|---|
預(yù)留位1 | 1 | 沒(méi)有使用 |
預(yù)留位2 | 1 | 沒(méi)有使用 |
delete_mask | 1 | 標(biāo)記該記錄是否被刪除 |
min_rec_mask | 1 | B+樹(shù)的每層非葉子節(jié)點(diǎn)中的最小記錄都會(huì)添加該標(biāo)記 |
n_owned | 4 | 表示當(dāng)前記錄擁有的記錄數(shù) |
heap_no | 13 | 表示當(dāng)前記錄在頁(yè)的位置信息 |
record_type | 3 | 表示當(dāng)前記錄的類(lèi)型,0 表示普通記錄,1 表示B+樹(shù)非葉子節(jié)點(diǎn)記錄,2 表示最小記錄,3 表示最大記錄 |
next_record | 16 | 表示下一條記錄的相對(duì)位置 |
隱藏列
記錄的真實(shí)數(shù)據(jù)除了我們自己定義的列的數(shù)據(jù)以外,MySQL會(huì)為每個(gè)記錄默認(rèn)的添加一些列(也稱(chēng)為隱藏列),包括:
DB_ROW_ID(row_id):非必須,6字節(jié),表示行ID,唯一標(biāo)識(shí)一條記錄
DB_TRX_ID:必須,6字節(jié),表示事務(wù)ID
DB_ROLL_PTR:必須,7字節(jié),表示回滾指針
InnoDB表對(duì)主鍵的生成策略是:優(yōu)先使用用戶(hù)自定義主鍵作為主鍵,如果用戶(hù)沒(méi)有定義主鍵,則選取一個(gè)Unique鍵作為主鍵,如果表中連Unique 鍵都沒(méi)有定義的話(huà),則InnoDB會(huì)為表默認(rèn)添加一個(gè)名為row_id的隱藏列作為主鍵。
DB_TRX_ID(也可以稱(chēng)為trx_id) 和DB_ROLL_PTR(也可以稱(chēng)為roll_ptr) 這兩個(gè)列是必有的,但是row_id是可選的(在沒(méi)有自定義主鍵以及Unique 鍵的情況下才會(huì)添加該列)。
其他的行格式和Compact行格式差別不大。
Redundant行格式
Redundant行格式是MySQL5.0之前用的一種行格式,不予深究。
Dynamic行格式
MySQL5.7的默認(rèn)行格式就是Dynamic,Dynamic行格式和Compact行格式挺像,只不過(guò)在處理行溢出數(shù)據(jù)時(shí)有所不同。
Compressed行格式
Compressed行格式在Dynamic行格式的基礎(chǔ)上會(huì)采用壓縮算法對(duì)頁(yè)面進(jìn)行壓縮,以節(jié)省空間。以zlib的算法進(jìn)行壓縮,因此對(duì)于BLOB、TEXT、VARCHAR這類(lèi)大長(zhǎng)度數(shù)據(jù)能夠進(jìn)行有效的存儲(chǔ)(減少40%,但對(duì)CPU要求更高)。
數(shù)據(jù)溢出
如果我們定義一個(gè)表,表中只有一個(gè)VARCHAR字段,如下:
CREATE TABLE test_varchar( c VARCHAR(60000))
然后往這個(gè)字段插入60000個(gè)字符,會(huì)發(fā)生什么?前邊說(shuō)過(guò),MySQL中磁盤(pán)和內(nèi)存交互的基本單位是頁(yè),也就是說(shuō)MySQL是以頁(yè)為基本單位來(lái)管理存儲(chǔ)空間的,我們的記錄都會(huì)被分配到某個(gè)頁(yè)中存儲(chǔ)。而一個(gè)頁(yè)的大小一般是16KB,也就是16384字節(jié),而一個(gè)VARCHAR(M)類(lèi)型的列就最多可以存儲(chǔ)65532個(gè)字節(jié),這樣就可能造成一個(gè)頁(yè)存放不了一條記錄的情況。
在Compact和Redundant行格式中,對(duì)于占用存儲(chǔ)空間非常大的列,在記錄的真實(shí)數(shù)據(jù)處只會(huì)存儲(chǔ)該列的該列的前768個(gè)字節(jié)的數(shù)據(jù),然后把剩余的數(shù)據(jù)分散存儲(chǔ)在幾個(gè)其他的頁(yè)中,記錄的真實(shí)數(shù)據(jù)處用20個(gè)字節(jié)(768字節(jié)后20個(gè)字節(jié))存儲(chǔ)指向這些頁(yè)的地址。這個(gè)過(guò)程也叫做行溢出,存儲(chǔ)超出768字節(jié)的那些頁(yè)面也被稱(chēng)為溢出頁(yè)。
Dynamic和Compressed行格式,不會(huì)在記錄的真實(shí)數(shù)據(jù)處存儲(chǔ)字段真實(shí)數(shù)據(jù)的前768個(gè)字節(jié),而是把所有的字節(jié)都存儲(chǔ)到其他頁(yè)面中,只在記錄的真實(shí)數(shù)據(jù)處存儲(chǔ)其他頁(yè)面的地址。
實(shí)戰(zhàn)分析行格式
準(zhǔn)備表及數(shù)據(jù):
create table row_test ( t1 varchar(10), t2 varchar(10), t3 char(10), t4 varchar(10) ) engine=innodb charset=latin1 row_format=compact; insert into row_test values('a','bb','bb','ccc'); insert into row_test values('d','ee','ee','fff'); insert into row_test values('d',NULL,NULL,'fff');
在Linux環(huán)境下,使用hexdump -C -v mytest.ibd>mytest.txt,打開(kāi)mytest.txt文件,找到如下內(nèi)容:
0000c070 73 75 70 72 65 6d 75 6d 03 02 01 00 00 00 10 00 |supremum........| 0000c080 2c 00 00 00 00 02 00 00 00 00 00 0f 61 c8 00 00 |,...........a...| 0000c090 01 d4 01 10 61 62 62 62 62 20 20 20 20 20 20 20 |....abbbb | 0000c0a0 20 63 63 63 03 02 01 00 00 00 18 00 2b 00 00 00 | ccc........+...| 0000c0b0 00 02 01 00 00 00 00 0f 62 c9 00 00 01 b2 01 10 |........b.......| 0000c0c0 64 65 65 65 65 20 20 20 20 20 20 20 20 66 66 66 |deeee fff| 0000c0d0 03 01 06 00 00 20 ff 98 00 00 00 00 02 02 00 00 |..... ..........| 0000c0e0 00 00 0f 67 cc 00 00 01 b6 01 10 64 66 66 66 00 |...g.......dfff.|
該行記錄從0000c078開(kāi)始,第一行整理如下:
03 02 01 // 變長(zhǎng)字段長(zhǎng)度列表,逆序,t4列長(zhǎng)度為3,t2列長(zhǎng)度為2,t1列長(zhǎng)度為1 00 // NULL標(biāo)志位,第一行沒(méi)有NULL值 00 00 10 00 2c // 記錄頭信息,固定5字節(jié)長(zhǎng)度 00 00 00 2b 68 00 // RowID我們建的表沒(méi)有主鍵,因此會(huì)有RowID,固定6字節(jié)長(zhǎng)度 00 00 00 00 06 05 // 事務(wù)ID,固定6個(gè)字節(jié)80 00 00 00 32 01 10 // 回滾指針,固定7個(gè)字節(jié)61 // t1數(shù)據(jù)'a'62 62 // t2'bb'62 62 20 20 20 20 20 20 20 20 // t3數(shù)據(jù)'bb'63 63 63 // t4數(shù)據(jù)'ccc'
第二行整理如下:
03 02 01 // 變長(zhǎng)字段長(zhǎng)度列表,逆序,t4列長(zhǎng)度為3,t2列長(zhǎng)度為2,t1列長(zhǎng)度為1 00 // NULL標(biāo)志位,第二行沒(méi)有NULL值 00 00 18 00 2b // 記錄頭信息,固定5字節(jié)長(zhǎng)度 00 00 00 00 02 01 // RowID我們建的表沒(méi)有主鍵,因此會(huì)有RowID,固定6字節(jié)長(zhǎng)度 00 00 00 00 0f 62 // 事務(wù)ID,固定6個(gè)字節(jié) c9 00 00 01 b2 01 10 // 回滾指針,固定7個(gè)字節(jié)64 // t1數(shù)據(jù)'d'65 65 // t2數(shù)據(jù)'ee'65 65 20 20 20 20 20 20 20 20 // t3數(shù)據(jù)'ee'66 66 66 // t4數(shù)據(jù)'fff'
第三行整理如下:
03 01 // 變長(zhǎng)字段長(zhǎng)度列表,逆序,t4列長(zhǎng)度為3,t1列長(zhǎng)度為1 06 // 00000110 NULL標(biāo)志位,t2和t3列為空 00 00 20 ff 98 // 記錄頭信息,固定5字節(jié)長(zhǎng)度 00 00 00 00 02 02 // RowID我們建的表沒(méi)有主鍵,因此會(huì)有RowID,固定6字節(jié)長(zhǎng)度 00 00 00 00 0f 67 // 事務(wù)ID,固定6個(gè)字節(jié) cc 00 00 01 b6 01 10 // 回滾指針,固定7個(gè)字節(jié)64 // t1數(shù)據(jù)'d'66 66 66 // t4數(shù)據(jù)'fff'
接下來(lái)更新下數(shù)據(jù):
mysql> update row_test set t2=null where t1='a'; Query OK, 1 row affected (0.02 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> delete from row_test where t2='ee'; Query OK, 1 row affected (0.01 sec)
查看二進(jìn)制內(nèi)容(需要等一會(huì),有可能只寫(xiě)入了緩存,磁盤(pán)上的文件并沒(méi)有更新):
0000c070 73 75 70 72 65 6d 75 6d 03 01 02 00 00 10 00 58 |supremum.......X| 0000c080 00 00 00 00 02 00 00 00 00 00 0f 68 4d 00 00 01 |...........hM...| 0000c090 9e 04 a9 61 62 62 20 20 20 20 20 20 20 20 63 63 |...abb cc| 0000c0a0 63 63 63 63 03 02 01 00 20 00 18 00 00 00 00 00 |cccc.... .......| 0000c0b0 00 02 01 00 00 00 00 0f 6a 4e 00 00 01 9f 10 c0 |........jN......| 0000c0c0 64 65 65 65 65 20 20 20 20 20 20 20 20 66 66 66 |deeee fff| 0000c0d0 03 01 06 00 00 20 ff 98 00 00 00 00 02 02 00 00 |..... ..........| 0000c0e0 00 00 0f 67 cc 00 00 01 b6 01 10 64 66 66 66 00 |...g.......dfff.|
該行記錄從0000c078開(kāi)始,第一行整理如下:
03 01 // 變長(zhǎng)字段長(zhǎng)度列表,逆序,t4列長(zhǎng)度為3,t1列長(zhǎng)度為1 02 // 0000 0010 NULL標(biāo)志位,表示t2為null 00 00 10 00 58 // 記錄頭信息,固定5字節(jié)長(zhǎng)度 00 00 00 00 02 00 // RowID我們建的表沒(méi)有主鍵,因此會(huì)有RowID,固定6字節(jié)長(zhǎng)度 00 00 00 00 0f 68 // 事務(wù)ID,固定6個(gè)字節(jié) 4d 00 00 01 9e 04 a9 // 回滾指針,固定7個(gè)字節(jié)61 // t1數(shù)據(jù)'a'62 62 20 20 20 20 20 20 20 20 // t3數(shù)據(jù)'bb'63 63 63 // t4數(shù)據(jù)'ccc'
第二行整理如下:
03 02 01 // 變長(zhǎng)字段長(zhǎng)度列表,逆序,t4列長(zhǎng)度為3,t2列長(zhǎng)度為2,t1列長(zhǎng)度為1 00 // NULL標(biāo)志位,第二行沒(méi)有NULL值20 00 18 00 00 // 0010 delete_mask=1 標(biāo)記該記錄是否被刪除 記錄頭信息,固定5字節(jié)長(zhǎng)度 00 00 00 00 02 01 // RowID我們建的表沒(méi)有主鍵,因此會(huì)有RowID,固定6字節(jié)長(zhǎng)度 00 00 00 00 0f 6a // 事務(wù)ID,固定6個(gè)字節(jié) 4e 00 00 01 9f 10 c0 // 回滾指針,固定7個(gè)字節(jié)64 // t1數(shù)據(jù)'d'65 65 // t2數(shù)據(jù)'ee'65 65 20 20 20 20 20 20 20 20 // t3數(shù)據(jù)'ee'66 66 66 // t4數(shù)據(jù)'fff'
第三行數(shù)據(jù)未發(fā)生變化。
The above is the detailed content of How MySQL sees InnoDB row format from binary content. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undress AI Tool
Undress images for free

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

1. The first choice for the Laravel MySQL Vue/React combination in the PHP development question and answer community is the first choice for Laravel MySQL Vue/React combination, due to its maturity in the ecosystem and high development efficiency; 2. High performance requires dependence on cache (Redis), database optimization, CDN and asynchronous queues; 3. Security must be done with input filtering, CSRF protection, HTTPS, password encryption and permission control; 4. Money optional advertising, member subscription, rewards, commissions, knowledge payment and other models, the core is to match community tone and user needs.

There are three main ways to set environment variables in PHP: 1. Global configuration through php.ini; 2. Passed through a web server (such as SetEnv of Apache or fastcgi_param of Nginx); 3. Use putenv() function in PHP scripts. Among them, php.ini is suitable for global and infrequently changing configurations, web server configuration is suitable for scenarios that need to be isolated, and putenv() is suitable for temporary variables. Persistence policies include configuration files (such as php.ini or web server configuration), .env files are loaded with dotenv library, and dynamic injection of variables in CI/CD processes. Security management sensitive information should be avoided hard-coded, and it is recommended to use.en

To achieve MySQL deployment automation, the key is to use Terraform to define resources, Ansible management configuration, Git for version control, and strengthen security and permission management. 1. Use Terraform to define MySQL instances, such as the version, type, access control and other resource attributes of AWSRDS; 2. Use AnsiblePlaybook to realize detailed configurations such as database user creation, permission settings, etc.; 3. All configuration files are included in Git management, support change tracking and collaborative development; 4. Avoid hard-coded sensitive information, use Vault or AnsibleVault to manage passwords, and set access control and minimum permission principles.

To collect user behavior data, you need to record browsing, search, purchase and other information into the database through PHP, and clean and analyze it to explore interest preferences; 2. The selection of recommendation algorithms should be determined based on data characteristics: based on content, collaborative filtering, rules or mixed recommendations; 3. Collaborative filtering can be implemented in PHP to calculate user cosine similarity, select K nearest neighbors, weighted prediction scores and recommend high-scoring products; 4. Performance evaluation uses accuracy, recall, F1 value and CTR, conversion rate and verify the effect through A/B tests; 5. Cold start problems can be alleviated through product attributes, user registration information, popular recommendations and expert evaluations; 6. Performance optimization methods include cached recommendation results, asynchronous processing, distributed computing and SQL query optimization, thereby improving recommendation efficiency and user experience.

PHP plays the role of connector and brain center in intelligent customer service, responsible for connecting front-end input, database storage and external AI services; 2. When implementing it, it is necessary to build a multi-layer architecture: the front-end receives user messages, the PHP back-end preprocesses and routes requests, first matches the local knowledge base, and misses, call external AI services such as OpenAI or Dialogflow to obtain intelligent reply; 3. Session management is written to MySQL and other databases by PHP to ensure context continuity; 4. Integrated AI services need to use Guzzle to send HTTP requests, safely store APIKeys, and do a good job of error handling and response analysis; 5. Database design must include sessions, messages, knowledge bases, and user tables, reasonably build indexes, ensure security and performance, and support robot memory

To recycle MySQL user permissions using REVOKE, you need to specify the permission type, database, and user by format. 1. Use REVOKEALLPRIVILEGES, GRANTOPTIONFROM'username'@'hostname'; 2. Use REVOKEALLPRIVILEGESONmydb.FROM'username'@'hostname'; 3. Use REVOKEALLPRIVILEGESONmydb.FROM'username'@'hostname'; 3. Use REVOKE permission type ON.*FROM'username'@'hostname'; Note that after execution, it is recommended to refresh the permissions. The scope of the permissions must be consistent with the authorization time, and non-existent permissions cannot be recycled.

When choosing a suitable PHP framework, you need to consider comprehensively according to project needs: Laravel is suitable for rapid development and provides EloquentORM and Blade template engines, which are convenient for database operation and dynamic form rendering; Symfony is more flexible and suitable for complex systems; CodeIgniter is lightweight and suitable for simple applications with high performance requirements. 2. To ensure the accuracy of AI models, we need to start with high-quality data training, reasonable selection of evaluation indicators (such as accuracy, recall, F1 value), regular performance evaluation and model tuning, and ensure code quality through unit testing and integration testing, while continuously monitoring the input data to prevent data drift. 3. Many measures are required to protect user privacy: encrypt and store sensitive data (such as AES

Why do I need SSL/TLS encryption MySQL connection? Because unencrypted connections may cause sensitive data to be intercepted, enabling SSL/TLS can prevent man-in-the-middle attacks and meet compliance requirements; 2. How to configure SSL/TLS for MySQL? You need to generate a certificate and a private key, modify the configuration file to specify the ssl-ca, ssl-cert and ssl-key paths and restart the service; 3. How to force SSL when the client connects? Implemented by specifying REQUIRESSL or REQUIREX509 when creating a user; 4. Details that are easily overlooked in SSL configuration include certificate path permissions, certificate expiration issues, and client configuration requirements.
