PostgreSQL FDE Encryption

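How FDE works:

FDE stands for Full Database Encryption. It is completely transparent to clients: when a page is written from shared buffers to disk it is encrypted first and then written, and when a page is read from disk it is decrypted after the read. The unit of encryption and decryption is one page.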

The rest of this post reads through the source of the PostgreSQL build open-sourced by Cybertec to see how this is implemented (source: the PostgreSQL FDE release).

XTS encryption mode

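The PostgreSQL-fde build encrypts with the industry-standard 128-bit XTS-AES block cipher.
XTS is a mode of operation for the AES cipher. It has the following properties:
1. The plaintext is split into fixed-length cipher blocks (AES always uses 128-bit blocks).
FDE encrypts page by page, so each page is split into many 128-bit cipher blocks.
2. Cipher blocks are independent of one another.
3. Encryption is done per cipher block, and the block's index information (the tweak, usually its position) is an input to the encryption. The same plaintext under the same algorithm and key therefore produces different ciphertext at different positions, which makes the encryption stronger.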

In FDE the tweak is the page's location, computed from the file the page belongs to and the page's position within that file.
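As a rough illustration of how a page's location can be turned into a 16-byte tweak, here is a minimal C sketch. The struct `PageLocation`, the function `make_page_tweak`, and the field layout are assumptions made for this example only; the `mdtweak` function in the FDE source may pack its inputs differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TWEAK_SIZE 16

/* Hypothetical description of where a page lives (not the FDE struct). */
typedef struct PageLocation
{
    uint32_t database_oid;   /* database the relation belongs to */
    uint32_t relfile_oid;    /* relation file, e.g. 16388 */
    uint32_t fork_number;    /* main fork, FSM, visibility map, ... */
    uint32_t block_number;   /* page index inside the relation */
} PageLocation;

/* Store a 32-bit value in little-endian byte order. */
void
put_u32_le(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t) (v & 0xff);
    dst[1] = (uint8_t) ((v >> 8) & 0xff);
    dst[2] = (uint8_t) ((v >> 16) & 0xff);
    dst[3] = (uint8_t) ((v >> 24) & 0xff);
}

/* Pack the location into 16 bytes; distinct locations give distinct tweaks. */
void
make_page_tweak(uint8_t tweak[SKETCH_TWEAK_SIZE], const PageLocation *loc)
{
    assert(tweak != NULL && loc != NULL);
    put_u32_le(tweak + 0, loc->database_oid);
    put_u32_le(tweak + 4, loc->relfile_oid);
    put_u32_le(tweak + 8, loc->fork_number);
    put_u32_le(tweak + 12, loc->block_number);
}
```

Because the block number is part of the tweak, two pages with identical contents in the same file still encrypt to different ciphertext.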

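XTS encryption works as follows, and every cipher block is processed the same way:

The passphrase is hashed with SHA-256 and the 256-bit result is split into key1 and key2, which encrypt the tweak and the page data respectively. (IEEE Std 1619 labels them the other way round: there Key1 encrypts the data and Key2 encrypts the tweak.)
The tweak encrypted with key1 (128 bits) is XORed in twice for each cipher block: plaintext → XOR with the encrypted tweak → encrypt with key2 → XOR with the encrypted tweak again → ciphertext. In standard XTS the encrypted tweak is also multiplied by α^j in GF(2^128) for the j-th cipher block, so every block within a page is masked with a different value.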
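The XOR, encrypt, XOR structure can be sketched in C as below. This is only an outline of the mode, not the code used by PostgreSQL-fde: `xts_encrypt_sketch`, `xts_mul_alpha` and `toy_add_cipher` are names invented for this example, and `toy_add_cipher` is an insecure placeholder standing in for a real AES-128 block encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XTS_BLOCK 16

/* Signature of a 128-bit block cipher primitive (e.g. one AES call). */
typedef void (*block_cipher_fn) (const uint8_t in[XTS_BLOCK],
                                 uint8_t out[XTS_BLOCK],
                                 const void *key);

/* Placeholder "cipher" for the demo: byte-wise addition of the key. NOT secure. */
void
toy_add_cipher(const uint8_t in[XTS_BLOCK], uint8_t out[XTS_BLOCK], const void *key)
{
    const uint8_t *k = key;

    for (int i = 0; i < XTS_BLOCK; i++)
        out[i] = (uint8_t) (in[i] + k[i]);
}

/*
 * Multiply a 128-bit value by alpha in GF(2^128), little-endian byte order
 * as in IEEE 1619: shift left by one bit; if a bit falls off the top,
 * XOR 0x87 into the lowest byte.
 */
void
xts_mul_alpha(uint8_t t[XTS_BLOCK])
{
    uint8_t carry = 0;

    for (int i = 0; i < XTS_BLOCK; i++)
    {
        uint8_t next_carry = (uint8_t) (t[i] >> 7);

        t[i] = (uint8_t) ((t[i] << 1) | carry);
        carry = next_carry;
    }
    if (carry)
        t[0] ^= 0x87;
}

/* Encrypt nblocks consecutive 16-byte blocks of data in place. */
void
xts_encrypt_sketch(uint8_t *data, size_t nblocks, const uint8_t tweak[XTS_BLOCK],
                   block_cipher_fn cipher, const void *data_key, const void *tweak_key)
{
    uint8_t t[XTS_BLOCK];
    uint8_t masked[XTS_BLOCK];
    uint8_t enc[XTS_BLOCK];

    assert(data != NULL && tweak != NULL && cipher != NULL);
    cipher(tweak, t, tweak_key);            /* T = E(tweak key, tweak) */

    for (size_t b = 0; b < nblocks; b++)
    {
        uint8_t *p = data + b * XTS_BLOCK;

        for (int i = 0; i < XTS_BLOCK; i++)
            masked[i] = p[i] ^ t[i];        /* first XOR with T */
        cipher(masked, enc, data_key);      /* block cipher with the data key */
        for (int i = 0; i < XTS_BLOCK; i++)
            p[i] = enc[i] ^ t[i];           /* second XOR with T */
        xts_mul_alpha(t);                   /* mask for the next block */
    }
}
```

For an 8 KB page the loop runs 512 times. Replacing `toy_add_cipher` with a real AES-128 encryption function gives the full-block part of XTS (ciphertext stealing for partial blocks is not needed when the page size is a multiple of 16 bytes).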

Deployment and testing:

Download

$ wget http://www.cybertec-postgresql.com/wp-content/uploads/2017/11/postgresql-9.6.0-fde.tar.zip

$ unzip postgresql-9.6.0-fde.tar.zip

$ tar -jxvf postgresql-9.6.0-fde.tar.bz2

$ cd postgresql-9.6.0-fde

$ ll
total 688
-rw-r--r--. 1 appusr appusr 384 Sep 27 2016 aclocal.m4
-rw-rw-r--. 1 appusr appusr 5357 Oct 24 2016 brg_endian.h
-rw-rw-r--. 1 appusr appusr 7855 Oct 24 2016 brg_types.h
drwxrwxr-x. 2 appusr appusr 4096 Sep 27 2016 config
-rwxr-xr-x. 1 appusr appusr 471157 Sep 27 2016 configure
-rw-r--r--. 1 appusr appusr 75195 Sep 27 2016 configure.in
drwxrwxr-x. 55 appusr appusr 4096 Sep 27 2016 contrib
-rw-r--r--. 1 appusr appusr 1192 Sep 27 2016 COPYRIGHT
drwxrwxr-x. 3 appusr appusr 107 Sep 27 2016 doc
-rw-r--r--. 1 appusr appusr 3638 Sep 27 2016 GNUmakefile.in
-rw-r--r--. 1 appusr appusr 283 Sep 27 2016 HISTORY
-rw-r--r--. 1 appusr appusr 75065 Sep 27 2016 INSTALL
-rw-rw-r--. 1 appusr appusr 676 Oct 24 2016 Makefile.rej
-rw-rw-r--. 1 appusr appusr 11026 Oct 24 2016 mode_hdr.h
-rw-r--r--. 1 appusr appusr 1209 Sep 27 2016 README
-rw-rw-r--. 1 appusr appusr 4455 Oct 24 2016 README.encryption
drwxrwxr-x. 16 appusr appusr 4096 Sep 27 2016 src

Build and install

$ ./configure --prefix=/appdb/pgfde --with-readline

$ make world -j 32

$ make install-world

Set the encryption key and initialize the database cluster

$ read -sp "Postgres passphrase: " PGENCRYPTIONKEY
$ export PGENCRYPTIONKEY
$ echo $PGENCRYPTIONKEY
$ initdb --data-encryption pgcrypto --data-checksums -D /data/fde -E UTF8 --locale=C -U xdb

Start the database

$ /appdb/pgfde/bin/pg_ctl -D /data/fde -l /data/fde/pg-fde.log start

The PGENCRYPTIONKEY environment variable must be set, otherwise the server will not start:

$ tail -f /data/fde/pg-fde.log

...
LOG: encryption key not provided
DETAIL: The database cluster was initialized with encryption but the server was started without an encryption key.
HINT: Set the key using PGENCRYPTIONKEY environment variable.
FATAL: data encryption could not be initialized
LOG: database system is shut down

Normal startup

$ /appdb/pgfde/bin/pg_ctl -D /data/fde -l /data/fde/serverlog status
pg_ctl: server is running (PID: 1471)
/appdb/pgfde/bin/postgres "-D" "/data/fde"

Test

$ psql -d testdb -p 5433
psql (9.6.9, server 9.6.0)
Type "help" for help.

testdb=# create table test(id int, c1 char(8),c2 varchar(16));
CREATE TABLE
testdb=# select pg_relation_filepath('test');
pg_relation_filepath
----------------------
base/16384/16388
(1 row)

testdb=# -- insert 3 rows
testdb=# insert into test values (1,'1','1');
INSERT 0 1
testdb=# insert into test values (2,'2','2');
INSERT 0 1
testdb=# insert into test values (3,'c','c');
INSERT 0 1
testdb=# -- run a checkpoint so the shared buffers are flushed to disk
testdb=# checkpoint ;
CHECKPOINT

Inspect the data file

$ hexdump -C /data/fde/base/16384/16388
00000000 6d dd 05 a6 b0 98 59 73 30 63 86 cc 8d 8c 7e b5 |m.....Ys0c....~.|
00000010 e8 cc 7b 9c 92 d8 53 8f be cc c2 d9 19 db c7 d8 |..{...S.........|
00000020 da d6 b4 30 fe a3 24 4d aa 5b 6e ee 61 f4 99 a8 |...0..$M.[n.a...|
00000030 a1 f3 1e 40 2c fd e5 6e 4d 77 41 be 15 b3 0e e9 |...@,..nMwA.....|
00000040 45 98 44 ba ec 19 2f ac fd 0c db 67 02 69 31 73 |E.D.../....g.i1s|
00000050 18 b7 74 fe 58 26 d4 6c 1f 94 02 db 48 2d 03 34 |..t.X&.l....H-.4|
00000060 17 07 90 9b db 43 12 3f 18 5a ae b8 1f 69 0f b7 |.....C.?.Z...i..|
00000070 e5 cd dd 07 4f 0b 39 01 16 29 92 5a 04 66 e6 ef |....O.9..).Z.f..|
00000080 ff 68 5c de aa 76 76 5f b7 29 cc cd a1 d5 5e 7e |.h\..vv_.)....^~|
00000090 a5 e2 22 0f 6c 8f da c6 1f 58 f5 52 e6 5a 97 a8 |..".l....X.R.Z..|
000000a0 5b c2 df ad 7f 5e bf 9c 2e 99 e1 34 29 ad 47 59 |[....^.....4).GY|
000000b0 53 7f 71 e5 a5 ee bc ae df 2b bc 27 11 fa 9d 16 |S.q......+.'....|
000000c0 17 a6 97 e0 58 43 44 f5 08 03 bc 03 05 6d c1 aa |....XCD......m..|
000000d0 13 c4 f2 28 45 d5 e8 bc 09 70 17 d0 b3 16 ad 8f |...(E....p......|
000000e0 9c c6 44 0b 42 ef 69 72 78 6b e2 86 c2 12 06 ec |..D.B.irxk......|
000000f0 a4 44 4c 3b c5 d3 51 85 b6 27 74 8a 6c ad 35 b8 |.DL;..Q..'t.l.5.|
# the middle is too long to paste in full
00001f10 3b 73 89 f3 a4 5a b7 e1 be b2 f0 40 88 76 ad cd |;s...Z.....@.v..|
00001f20 9a 62 13 83 31 2b 77 bf 8f bf 06 82 04 e0 ae 60 |.b..1+w........`|
00001f30 4d cd f2 9c 55 ca ff 41 83 e1 7b cd c9 5b c1 91 |M...U..A..{..[..|
00001f40 9c 68 3e 3f d1 2b 00 13 f5 86 9f f4 09 ba 4c 49 |.h>?.+........LI|
00001f50 9c 26 42 ba 08 59 0b 5a 85 48 9c 73 d1 d2 3e 43 |.&B..Y.Z.H.s..>C|
00001f60 30 51 47 96 5e 99 fe 60 4e e3 db 8d 2e 5a 2e 77 |0QG.^..`N....Z.w|
00001f70 72 45 8f 70 5f 04 ce 67 35 dc 33 7c 84 46 45 8d |rE.p_..g5.3|.FE.|
00001f80 c2 f0 9c 6b 77 a4 e9 e4 e4 c7 73 91 ec 27 aa 4c |...kw.....s..'.L|
00001f90 b4 68 0e d9 d8 e4 8a 3b 79 5d 19 17 48 32 fa 7d |.h.....;y]..H2.}|
00001fa0 7d d0 0f fa d2 c6 65 1a b2 17 34 0a 4c 6d 86 aa |}.....e...4.Lm..|
00001fb0 ae f5 47 6c fc 5a 6b ab 68 cd 52 2c 61 a8 74 94 |..Gl.Zk.h.R,a.t.|
00001fc0 f4 e1 c5 ae 6a f4 e3 b7 a8 a3 b4 60 e0 ce 53 da |....j......`..S.|
00001fd0 70 d2 41 f6 3a 4d c4 7b f1 30 4c 17 ee fa 71 8f |p.A.:M.{.0L...q.|
00001fe0 38 ff 6e a7 d3 ad a3 d4 4a 3a eb e1 23 73 a4 7b |8.n.....J:..#s.{|
00001ff0 0f c5 eb 5b f5 4f 5a 20 bd 62 34 df 22 6c 0e 66 |...[.OZ .b4."l.f|
00002000

The same test on an unencrypted PostgreSQL build

$ psql -d testdb -p 5432
psql (9.6.9)
Type "help" for help.

testdb=# create table test(id int, c1 char(8),c2 varchar(16));
CREATE TABLE
testdb=# select pg_relation_filepath('test');
pg_relation_filepath
----------------------
base/16384/16391
(1 row)

testdb=# insert into test values (1,'1','1');
INSERT 0 1
testdb=# insert into test values (2,'2','2');
INSERT 0 1
testdb=# insert into test values (3,'c','c');
INSERT 0 1
testdb=# checkpoint ;
CHECKPOINT

Inspect the data file

$ hexdump -C /data/utf8db/base/16384/16391
00000000 00 00 00 00 30 82 5a 01 00 00 00 00 24 00 88 1f |....0.Z.....$...|
00000010 00 20 04 20 00 00 00 00 d8 9f 4e 00 b0 9f 4e 00 |. . ......N...N.|
00000020 88 9f 4e 00 00 00 00 00 00 00 00 00 00 00 00 00 |..N.............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001f80 00 00 00 00 00 00 00 00 eb 06 00 00 00 00 00 00 |................|
00001f90 00 00 00 00 00 00 00 00 03 00 03 00 02 08 18 00 |................|
00001fa0 03 00 00 00 13 63 20 20 20 20 20 20 20 05 63 00 |.....c .c.|
00001fb0 ea 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001fc0 02 00 03 00 02 09 18 00 02 00 00 00 13 32 20 20 |.............2 |
00001fd0 20 20 20 20 20 05 32 00 e9 06 00 00 00 00 00 00 | .2.........|
00001fe0 00 00 00 00 00 00 00 00 01 00 03 00 02 09 18 00 |................|
00001ff0 01 00 00 00 13 31 20 20 20 20 20 20 20 05 31 00 |.....1 .1.|
00002000

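As the dumps show, the data file of the unencrypted PostgreSQL is plaintext and the inserted values are clearly visible in it, while the encrypted data file contains nothing but seemingly random bytes.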
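The same comparison can be done programmatically. The sketch below relies on a simple observation from the two dumps: a freshly written plaintext heap page is mostly zero bytes, while an encrypted page has almost none. The function names and the 5% threshold are assumptions chosen for illustration; this is a heuristic and not a reliable encryption detector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fraction of bytes in buf that are zero, between 0.0 and 1.0. */
double
zero_byte_fraction(const uint8_t *buf, size_t len)
{
    size_t zeros = 0;

    assert(buf != NULL && len > 0);
    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == 0)
            zeros++;
    }
    return (double) zeros / (double) len;
}

/*
 * Heuristic: uniformly random bytes contain about 1/256 zeros, so a page
 * with fewer than 5% zero bytes is treated as "probably encrypted".
 * Returns 1 for probably encrypted, 0 otherwise.
 */
int
page_looks_encrypted(const uint8_t *page, size_t len)
{
    return zero_byte_fraction(page, len) < 0.05;
}
```

To try it, read the first 8192 bytes of each relation file with `fread` and pass the buffer to `page_looks_encrypted`.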

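Reading the source

To add encryption, PostgreSQL-fde modifies pgcrypto, the encryption extension that ships with PostgreSQL, as well as the PostgreSQL code that writes shared buffers to disk.

The most important source file is:
src/backend/storage/smgr/encryption.c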

/*-------------------------------------------------------------------------
 *
 * encryption.c
 *    This code handles encryption and decryption of data.
 *
 * Encryption is done by extension modules loaded by encryption_library GUC.
 * The extension module must register itself and provide a cryptography
 * implementation. Key setup is left to the extension module.
 *
 *
 * Copyright (c) 2016, PostgreSQL Global Development Group
 *
 *
 * IDENTIFICATION
 *    src/backend/storage/smgr/encryption.c
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

#include "catalog/pg_control.h"
#include "storage/bufpage.h"
#include "storage/encryption.h"
#include "miscadmin.h"
#include "fmgr.h"
#include "port.h"

bool encryption_enabled = false;
bool have_encryption_provider = false;
EncryptionRoutines encryption_hooks;

/*
 * Hook function for encryption providers. The first library to call this
 * function gets to provide encryption capability.
 * (Called from _PG_init in pgcrypto.c to install the encryption and
 * decryption function hooks.)
 */
void
register_encryption_module(char *name, EncryptionRoutines *enc)
{
    if (!have_encryption_provider)
    {
        elog(DEBUG1, "Registering encryption module %s", name);
        encryption_hooks = *enc;
        have_encryption_provider = true;
    }
}

/*
 * Encrypts a fixed value into *buf to verify that encryption key is correct.
 * Caller provided buf needs to be able to hold at least ENCRYPTION_SAMPLE_SIZE
 * bytes.
 */
void
sample_encryption(char *buf)
{
    char tweak[TWEAK_SIZE];
    int i;
    for (i = 0; i < TWEAK_SIZE; i++)
        tweak[i] = i;

    encrypt_block("postgresqlcrypt", buf, ENCRYPTION_SAMPLE_SIZE, tweak);
}

/*
 * Encrypts one block of data with a specified tweak value. Input and output
 * buffer may point to the same location. Size of input must be at least
 * ENCRYPTION_BLOCK bytes. Tweak value must be TWEAK_SIZE bytes.
 *
 * All zero blocks are not encrypted or decrypted to correctly handle relation
 * extension.
 *
 * Must only be called when encryption_enabled is true.
 * (Encryption entry point; this calls the encryption routine in pgcrypto.)
 */
void
encrypt_block(const char *input, char *output, Size size, const char *tweak)
{
    Assert(size >= ENCRYPTION_BLOCK);
    Assert(encryption_enabled);

    if (IsAllZero(input, size))
    {
        if (input != output)
            memset(output, 0, size);
    }
    else
        encryption_hooks.EncryptBlock(input, output, size, tweak);
}

/*
 * Decrypts one block of data with a specified tweak value. Input and output
 * buffer may point to the same location. Tweak value must match the one used
 * when encrypting.
 *
 * Must only be called when encryption_enabled is true.
 * (Decryption entry point for data read from files; this calls the
 * decryption routine in pgcrypto.)
 */
void
decrypt_block(const char *input, char *output, Size size, const char *tweak)
{
    Assert(size >= ENCRYPTION_BLOCK);
    Assert(encryption_enabled);

    if (IsAllZero(input, size))
    {
        if (input != output)
            memset(output, 0, size);
    }
    else
        encryption_hooks.DecryptBlock(input, output, size, tweak);
}

/*
 * Initialize encryption subsystem for use. Must be called before any
 * encryptable data is read from or written to data directory.
 */
void
setup_encryption()
{
    char *filename;

    if (encryption_library_string == NULL || encryption_library_string[0] == '\0')
        return;

    /* Try to load encryption library */
    filename = pstrdup(encryption_library_string);

    canonicalize_path(filename);

    /* Make encryption libraries loading behave as if loaded via s_p_l */
    process_shared_preload_libraries_in_progress = true;
    load_file(filename, false);
    process_shared_preload_libraries_in_progress = false;

    ereport(DEBUG1,
            (errmsg("loaded library \"%s\" for encryption", filename)));
    pfree(filename);

    if (have_encryption_provider)
    {
        encryption_enabled = encryption_hooks.SetupEncryption();
        if (encryption_enabled)
        {
            if (!IsBootstrapProcessingMode())
                elog(LOG, "data encryption performed by %s", encryption_library_string);
        }
        else
            elog(FATAL, "data encryption could not be initialized");
    }
    else
        elog(ERROR, "Specified encryption library %s did not provide encryption hooks.", encryption_library_string);
}
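The all-zero special case in `encrypt_block` and `decrypt_block` is easy to miss. A newly extended relation contains zero-filled pages that were never encrypted, so both functions pass such pages through unchanged. The listing does not show `IsAllZero`; the standalone sketch below shows one plausible shape for it and for the surrounding control flow. All names prefixed with `sketch_` are invented for this example, and `sketch_invert_provider` is a stand-in that flips bits rather than encrypting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature of a provider callback, modelled on EncryptBlock in the listing. */
typedef void (*sketch_crypt_fn) (const char *in, char *out, size_t size,
                                 const char *tweak);

/* Return 1 if every byte of buf is zero, 0 otherwise. */
int
sketch_is_all_zero(const char *buf, size_t size)
{
    for (size_t i = 0; i < size; i++)
    {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}

/* Stand-in provider for the demo: flips every bit. NOT real encryption. */
void
sketch_invert_provider(const char *in, char *out, size_t size, const char *tweak)
{
    (void) tweak;               /* unused in this stand-in */
    for (size_t i = 0; i < size; i++)
        out[i] = (char) ~in[i];
}

/* Same control flow as encrypt_block: zero pages bypass the provider. */
void
sketch_encrypt_block(const char *input, char *output, size_t size,
                     const char *tweak, sketch_crypt_fn provider)
{
    assert(input != NULL && output != NULL && provider != NULL);

    if (sketch_is_all_zero(input, size))
    {
        if (input != output)
            memset(output, 0, size);
    }
    else
        provider(input, output, size, tweak);
}
```

One consequence of this design is that an observer of the data files can tell which pages are still empty, because they remain all zeros on disk.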

pgcrypto.c

...
// Only the newly added code is shown here

const char* encryptionkey_prefix = "encryptionkey=";
const int encryption_key_length = 32;

/*
 * Supports obtaining the key by running an external command
 */
static bool pgcrypto_run_keysetup_command(uint8 *key)
{
    FILE *fp;
    char buf[encryption_key_length*2+1];
    int bytes_read;
    int i;

    if (pgcrypto_keysetup_command == NULL)
        return false;

    if (!strlen(pgcrypto_keysetup_command))
        return false;

    elog(INFO, "Executing \"%s\" to set up encryption key", pgcrypto_keysetup_command);

    fp = popen(pgcrypto_keysetup_command, "r");
    if (fp == NULL)
        elog(ERROR, "Failed to execute pgcrypto.keysetup_command \"%s\"",
             pgcrypto_keysetup_command);

    if (fread(buf, 1, strlen(encryptionkey_prefix), fp) != strlen(encryptionkey_prefix))
        elog(ERROR, "Not enough data received from pgcrypto.keysetup_command");

    if (strncmp(buf, encryptionkey_prefix, strlen(encryptionkey_prefix)) != 0)
        elog(ERROR, "Unknown data received from pgcrypto.keysetup_command");

    bytes_read = fread(buf, 1, encryption_key_length*2 + 1, fp);
    if (bytes_read < encryption_key_length*2)
    {
        if (feof(fp))
            elog(ERROR, "Encryption key provided by pgcrypto.keysetup_command too short");
        else
            elog(ERROR, "pgcrypto.keysetup_command returned error code %d", ferror(fp));
    }

    for (i = 0; i < encryption_key_length; i++)
    {
        if (sscanf(buf+2*i, "%2hhx", key + i) == 0)
            elog(ERROR, "Invalid character in encryption key at position %d", 2*i);
    }
    if (bytes_read > encryption_key_length*2)
    {
        if (buf[encryption_key_length*2] != '\n')
            elog(ERROR, "Encryption key too long '%s' %d.", buf, buf[encryption_key_length*2]);
    }

    while (fread(buf, 1, sizeof(buf), fp) != 0)
    {
        /* Discard rest of the output */
    }

    pclose(fp);

    return true;
}

/*
 * Pgcrypto module does AES-128-XTS encryption.
 * (Initializes the AES-128-XTS encryption and decryption contexts.)
 */
static bool
pgcrypto_encryption_setup()
{
    uint8 key[encryption_key_length];

    if (!pgcrypto_run_keysetup_command(key))
    {
        char *passphrase = getenv("PGENCRYPTIONKEY");

        /* Empty or missing passphrase means that encryption is not configured */
        if (passphrase == NULL || passphrase[0] == '\0')
        {
            ereport(LOG,
                    (errmsg("encryption key not provided"),
                     errdetail("The database cluster was initialized with encryption"
                               " but the server was started without an encryption key."),
                     errhint("Set the key using PGENCRYPTIONKEY environment variable.")));
            return false;
        }

        /* TODO: replace with PBKDF2 or scrypt */
        {
            SHA256_CTX sha_ctx;
            SHA256_Init(&sha_ctx);
            SHA256_Update(&sha_ctx, (uint8*) passphrase, strlen(passphrase));
            SHA256_Final(key, &sha_ctx);
        }
    }

    if (xts_encrypt_key(key, encryption_key_length, db_key.enc_ctx) != EXIT_SUCCESS ||
        xts_decrypt_key(key, encryption_key_length, db_key.dec_ctx) != EXIT_SUCCESS)
    {
        elog(ERROR, "Encryption key setup failed.");
        return false;
    }

    return true;
}

/*
 * Encryption function using AES-128-XTS
 */
static void
pgcrypto_encrypt_block(const char *input, char *output, Size size,
                       const char *tweak)
{
    if (input != output)
        memcpy(output, input, size);

    xts_encrypt_block((uint8*) output, (const uint8*) tweak, size, db_key.enc_ctx);
}

/*
 * Decryption function using AES-128-XTS
 */
static void
pgcrypto_decrypt_block(const char *input, char *output, Size size,
                       const char *tweak)
{
    if (input != output)
        memcpy(output, input, size);

    xts_decrypt_block((uint8*) output, (const uint8*) tweak, size, db_key.dec_ctx);
}

void
_PG_init(void)
{
    EncryptionRoutines routines;
    routines.SetupEncryption = &pgcrypto_encryption_setup;
    routines.EncryptBlock = &pgcrypto_encrypt_block;
    routines.DecryptBlock = &pgcrypto_decrypt_block;

    register_encryption_module("pgcrypto", &routines);

    DefineCustomStringVariable("pgcrypto.keysetup_command",
                               "Command to fetch database encryption key",
                               "This command will be run at database startup to set up database"
                               " encryption key.",
                               &pgcrypto_keysetup_command,
                               "",
                               PGC_POSTMASTER,
                               0,
                               NULL,
                               NULL,
                               NULL);

    EmitWarningsOnPlaceholders("pgcrypto");
}
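According to the listing, `pgcrypto_run_keysetup_command` expects the external command to print `encryptionkey=` followed by 64 hexadecimal characters, and decodes them two characters at a time with `sscanf`. For comparison, here is a small standalone decoder that rejects anything other than exactly two hex digits per byte. The names `hex_digit_value` and `parse_hex_key` are invented for this sketch and do not appear in the FDE source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
int
hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Decode exactly 2 * key_len hex characters from the NUL-terminated string
 * hex into key. Returns 0 on success, or -1 if the input is too short or
 * contains a character that is not a hex digit.
 */
int
parse_hex_key(const char *hex, uint8_t *key, size_t key_len)
{
    assert(hex != NULL && key != NULL);

    for (size_t i = 0; i < key_len; i++)
    {
        int hi = hex_digit_value(hex[2 * i]);
        /* Only look at the second digit if the first one was valid,
         * so we never read past the terminating NUL. */
        int lo = (hi < 0) ? -1 : hex_digit_value(hex[2 * i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        key[i] = (uint8_t) ((hi << 4) | lo);
    }
    return 0;
}
```

For the 32-byte key used above, the call would be `parse_hex_key(text, key, 32)` on the text that follows the `encryptionkey=` prefix.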

Starting from the functions that encryption.c provides, let's see where PostgreSQL calls them.

First, setup_encryption.
In postmaster.c:

/*
 * Postmaster main entry point
 */
void
PostmasterMain(int argc, char *argv[])
{
    ...
    CreateDataDirLockFile(true);

    /*
     * Initialize SSL library, if specified.
     */
#ifdef USE_SSL
    if (EnableSSL)
        secure_initialize();
#endif

    setup_encryption();

    /*
     * process any libraries that should be preloaded at postmaster start
     */
    process_shared_preload_libraries();
    ...
}

In bootstrap.c:

/*
 * AuxiliaryProcessMain
 *
 * The main entry point for auxiliary processes, such as the bgwriter,
 * walwriter, walreceiver, bootstrapper and the shared memory checker code.
 *
 * This code is here just because of historical reasons.
 */
void
AuxiliaryProcessMain(int argc, char *argv[])
{
    ...
    /* Initialize MaxBackends (if under postmaster, was done already) */
    if (!IsUnderPostmaster)
        InitializeMaxBackends();

    if (!IsUnderPostmaster)
        setup_encryption();

    if (encryption_enabled)
    {
        bootstrap_data_encrypted = true;
        bootstrap_encryption_sample = palloc0(ENCRYPTION_SAMPLE_SIZE);
        sample_encryption(bootstrap_encryption_sample);
    }
    ...
}

In postgres.c:

/* ----------------------------------------------------------------
 * PostgresMain
 *    postgres main loop -- all backends, interactive or otherwise start here
 *
 * argc/argv are the command line arguments to be used. (When being forked
 * by the postmaster, these are not the original argv array of the process.)
 * dbname is the name of the database to connect to, or NULL if the database
 * name should be extracted from the command line arguments or defaulted.
 * username is the PostgreSQL user name to be used for the session.
 * ----------------------------------------------------------------
 */
void
PostgresMain(int argc, char *argv[],
             const char *dbname,
             const char *username)
{
    ...
    if (!IsUnderPostmaster)
    {
        /*
         * Validate we have been given a reasonable-looking DataDir (if under
         * postmaster, assume postmaster did this already).
         */
        Assert(DataDir);
        ValidatePgVersion(DataDir);

        /* Change into DataDir (if under postmaster, was done already) */
        ChangeToDataDir();

        setup_encryption();

        /*
         * Create lockfile for data directory.
         */
        CreateDataDirLockFile(false);

        /* Initialize MaxBackends (if under postmaster, was done already) */
        InitializeMaxBackends();
    }
    ...
}

setup_encryption is called as soon as each of these key processes starts.

Next, encrypt_block.
In slru.c:

/*
 * Physical write of a page from a buffer slot
 *
 * On failure, we cannot just ereport(ERROR) since caller has put state in
 * shared memory that must be undone. So, we return FALSE and save enough
 * info in static variables to let SlruReportIOError make the report.
 *
 * For now, assume it's not worth keeping a file pointer open across
 * independent read/write operations. We do batch operations during
 * SimpleLruFlush, though.
 *
 * fdata is NULL for a standalone write, pointer to open-file info during
 * SimpleLruFlush.
 */
static bool
SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
{
    ...
    wbuf = shared->page_buffer[slotno];
    if (encryption_enabled)
    {
        SlruEncryptionTweak(slru_encryption_tweak, pageno);
        encrypt_block(wbuf, slru_encryption_buf, BLCKSZ, slru_encryption_tweak);
        wbuf = slru_encryption_buf;
    }

    errno = 0;
    if (write(fd, wbuf, BLCKSZ) != BLCKSZ)
    {
        /* if write didn't set errno, assume problem is no disk space */
        if (errno == 0)
            errno = ENOSPC;
        slru_errcause = SLRU_WRITE_FAILED;
        slru_errno = errno;
        if (!fdata)
            CloseTransientFile(fd);
        return false;
    }
    ...
}

In xlog.c:

/*
 * Write and/or fsync the log at least as far as WriteRqst indicates.
 *
 * If flexible == TRUE, we don't have to write as far as WriteRqst, but
 * may stop at any convenient boundary (such as a cache or logfile boundary).
 * This option allows us to avoid uselessly issuing multiple writes when a
 * single one would do.
 *
 * Must be called with WALWriteLock held. WaitXLogInsertionsToFinish(WriteRqst)
 * must be called before grabbing the lock, to make sure the data is ready to
 * write.
 */
static void
XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
{
    ...
    /* OK to write the page(s) */
    from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
    nbytes = npages * (Size) XLOG_BLCKSZ;
    if (encryption_enabled) {
        int i;
        /*
         * XXX: use larger encryption buffer to enable larger writes
         * and reduce number of syscalls?
         */
        for (i = 0; i < npages; i++) {
            char buf[XLOG_BLCKSZ];
            char tweak[TWEAK_SIZE];
            XLogEncryptionTweak(tweak, ThisTimeLineID, openLogSegNo, openLogOff);
            encrypt_block(from, buf, XLOG_BLCKSZ, tweak);

            XLogWritePages(buf, 1);

            from += XLOG_BLCKSZ;
            openLogOff += XLOG_BLCKSZ;
        }
    } else {
        XLogWritePages(from, npages);
        openLogOff += nbytes;
    }
    ...
}

...
/*
 * This func must be called ONCE on system install. It creates pg_control
 * and the initial XLOG segment.
 */
void
BootStrapXLOG(void)
{
    ...
    if (encryption_enabled)
    {
        char tweak[TWEAK_SIZE];
        XLogEncryptionTweak(tweak, ThisTimeLineID, 1, 0);
        encrypt_block((char*)page, (char*)page, XLOG_BLCKSZ, tweak);
    }

    /* Write the first page with the initial record */
    errno = 0;
    if (write(openLogFile, page, XLOG_BLCKSZ) != XLOG_BLCKSZ)
    {
    ...
}

When WAL is written, each page is encrypted first and then written to disk.

In buffile.c:

/*
 * BufFileDumpBuffer
 *
 * Dump buffer contents starting at curOffset.
 * At call, should have dirty = true, nbytes > 0.
 * On exit, dirty is cleared if successful write, and curOffset is advanced.
 */
static void
BufFileDumpBuffer(BufFile *file)
{
    ...
    if (encryption_enabled)
    {
        char tweak[TWEAK_SIZE];
        /*
         * FIXME: figure out how to handle nbytes < smallest encryption block
         * size
         **/
        BufFileTweak(tweak, file, file->curFile, file->curOffset);
        encrypt_block(file->buffer, writeBuffer, file->nbytes, tweak);
        writePtr = writeBuffer;
    } else
        writePtr = file->buffer;
    ...
}

In md.c:

static void
mddecrypt(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, char *dest)
{
    mdtweak(md_encryption_tweak, &(reln->smgr_rnode.node), forknum, blocknum);
    decrypt_block(md_encryption_buffer, dest, BLCKSZ, md_encryption_tweak);
}

BlockNumber
ReencryptBlock(char *buffer, int blocks,
               RelFileNode *srcNode, RelFileNode *dstNode,
               ForkNumber srcForkNum, ForkNumber dstForkNum,
               BlockNumber blockNum)
{
    char *cur;
    char srcTweak[16];
    char dstTweak[16];
    for (cur = buffer; cur < buffer + blocks * BLCKSZ; cur += BLCKSZ)
    {
        mdtweak(srcTweak, srcNode, srcForkNum, blockNum);
        mdtweak(dstTweak, dstNode, dstForkNum, blockNum);
        decrypt_block(cur, cur, BLCKSZ, srcTweak);
        encrypt_block(cur, cur, BLCKSZ, dstTweak);
        blockNum++;
    }
    return blockNum;
}

/*
 * mdextend() -- Add a block to the specified relation.
 *
 * The semantics are nearly the same as mdwrite(): write at the
 * specified position. However, this is to be used for the case of
 * extending a relation (i.e., blocknum is at or beyond the current
 * EOF). Note that we assume writing a block beyond current EOF
 * causes intervening file space to become filled with zeroes.
 */
void
mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
         char *buffer, bool skipFsync)
{
    ...
    if (encryption_enabled)
        mdencrypt(reln, forknum, blocknum, buffer);

    if ((nbytes = FileWrite(v->mdfd_vfd, encryption_enabled ? md_encryption_buffer : buffer, BLCKSZ)) != BLCKSZ)
    {
    ...
}

/*
 * mdwrite() -- Write the supplied block at the appropriate location.
 *
 * This is to be used only for updating already-existing blocks of a
 * relation (ie, those before the current EOF). To extend a relation,
 * use mdextend().
 */
void
mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
        char *buffer, bool skipFsync)
{
    ...
    if (encryption_enabled)
        mdencrypt(reln, forknum, blocknum, buffer);
    nbytes = FileWrite(v->mdfd_vfd, encryption_enabled ? md_encryption_buffer : buffer, BLCKSZ);
    ...
}

Now the callers of decrypt_block.
In slru.c:

/*
 * Physical read of a (previously existing) page into a buffer slot
 *
 * On failure, we cannot just ereport(ERROR) since caller has put state in
 * shared memory that must be undone. So, we return FALSE and save enough
 * info in static variables to let SlruReportIOError make the report.
 *
 * For now, assume it's not worth keeping a file pointer open across
 * read/write operations. We could cache one virtual file pointer ...
 */
static bool
SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
{
    ...
    if (encryption_enabled)
    {
        SlruEncryptionTweak(slru_encryption_tweak, pageno);
        decrypt_block(slru_encryption_buf, shared->page_buffer[slotno],
                      BLCKSZ, slru_encryption_tweak);
    }
    ...
}

In xlog.c:

/*
 * Read the XLOG page containing RecPtr into readBuf (if not read already).
 * Returns number of bytes read, if the page is read successfully, or -1
 * in case of errors. When errors occur, they are ereport'ed, but only
 * if they have not been previously reported.
 *
 * This is responsible for restoring files from archive as needed, as well
 * as for waiting for the requested WAL record to arrive in standby mode.
 *
 * 'emode' specifies the log level used for reporting "file not found" or
 * "end of WAL" situations in archive recovery, or in standby mode when a
 * trigger file is found. If set to WARNING or below, XLogPageRead() returns
 * false in those situations, on higher log levels the ereport() won't
 * return.
 *
 * In standby mode, if after a successful return of XLogPageRead() the
 * caller finds the record it's interested in to be broken, it should
 * ereport the error with the level determined by
 * emode_for_corrupt_record(), and then set lastSourceFailed
 * and call XLogPageRead() again with the same arguments. This lets
 * XLogPageRead() to try fetching the record from another source, or to
 * sleep and retry.
 */
static int
XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
             XLogRecPtr targetRecPtr, char *readBuf, TimeLineID *readTLI)
{
    ...
    if (encryption_enabled)
    {
        char tweak[TWEAK_SIZE];
        XLogEncryptionTweak(tweak, curFileTLI, readSegNo, readOff);
        decrypt_block(readBuf, readBuf, XLOG_BLCKSZ, tweak);
    }
    ...
}

In xlogutils.c:

/*
 * Read 'count' bytes from WAL into 'buf', starting at location 'startptr'
 * in timeline 'tli'.
 *
 * Will open, and keep open, one WAL segment stored in the static file
 * descriptor 'sendFile'. This means if XLogRead is used once, there will
 * always be one descriptor left open until the process ends, but never
 * more than one.
 *
 * XXX This is very similar to pg_xlogdump's XLogDumpXLogRead and to XLogRead
 * in walsender.c but for small differences (such as lack of elog() in
 * frontend). Probably these should be merged at some point.
 */
static void
XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
{
    ...
    /* Decrypt completed blocks */
    if (encryption_enabled)
    {
        while (decrypt_p + XLOG_BLCKSZ <= p)
        {
            char tweak[TWEAK_SIZE];
            XLogEncryptionTweak(tweak, tli, sendSegNo, decryptOff);
            decrypt_block(decrypt_p, decrypt_p, XLOG_BLCKSZ, tweak);

            decrypt_p += XLOG_BLCKSZ;
            decryptOff += XLOG_BLCKSZ;
        }
    }
    ...
}

In buffile.c:

/*
 * BufFileLoadBuffer
 *
 * Load some data into buffer, if possible, starting from curOffset.
 * At call, must have dirty = false, nbytes = 0.
 * On exit, nbytes is number of bytes loaded.
 */
static void
BufFileLoadBuffer(BufFile *file)
{
    ...
    if (encryption_enabled)
    {
        char tweak[TWEAK_SIZE];
        BufFileTweak(tweak, file, file->curFile, file->curOffset);
        decrypt_block(file->buffer, file->buffer, file->nbytes, tweak);
    }
    ...
}

In md.c:

static void
mddecrypt(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, char *dest)
{
    mdtweak(md_encryption_tweak, &(reln->smgr_rnode.node), forknum, blocknum);
    decrypt_block(md_encryption_buffer, dest, BLCKSZ, md_encryption_tweak);
}

/*
 * mdread() -- Read the specified block from a relation.
 */
void
mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
       char *buffer)
{
    ...
    if (nbytes != BLCKSZ)
    {
        if (nbytes < 0)
            ereport(ERROR,
                    (errcode_for_file_access(),
                     errmsg("could not read block %u in file \"%s\": %m",
                            blocknum, FilePathName(v->mdfd_vfd))));

        if (zero_damaged_pages || InRecovery)
            MemSet(buffer, 0, BLCKSZ);
        else
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg("could not read block %u in file \"%s\": read only %d of %d bytes",
                            blocknum, FilePathName(v->mdfd_vfd),
                            nbytes, BLCKSZ)));
    }
    else if (encryption_enabled)
        mddecrypt(reln, forknum, blocknum, buffer);
}

ReencryptBlock, shown in the md.c listing above, is called from copydir.c:

/*
 * copy one file. If decryption and reencryption is needed specify
 * relfilenodes for source and target.
 */
void
copy_file(char *fromfile, char *tofile, RelFileNode *fromNode,
          RelFileNode *toNode, ForkNumber fromForkNum, ForkNumber toForkNum,
          int segment)
{
    ...
    /*
     * If the database is encrypted we need to decrypt the data here
     * and reencrypt it to adjust the tweak values of blocks.
     */
    if (fromNode != NULL)
    {
        Assert(toNode != NULL);
        blockNum = ReencryptBlock(buffer, nbytes/BLCKSZ,
                                  fromNode, toNode, fromForkNum, toForkNum, blockNum);
    }
    ...
}
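Re-encryption on copy is needed because the tweak depends on which relation file a page belongs to: a page copied byte for byte into another file would be decrypted there with the wrong tweak. The toy program below demonstrates the effect with a deliberately trivial tweakable "cipher" (XOR with the tweak bytes). All names starting with `demo_` and the tweak layout are hypothetical and exist only for this illustration; they are not the FDE implementation and offer no security.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TWEAK_SIZE 16

/* Toy tweakable "cipher": XOR with the tweak. Applying it twice undoes it. */
void
demo_crypt(uint8_t *buf, size_t size, const uint8_t tweak[DEMO_TWEAK_SIZE])
{
    for (size_t i = 0; i < size; i++)
        buf[i] ^= tweak[i % DEMO_TWEAK_SIZE];
}

/* Derive a demo tweak from a relation id and a block number. */
void
demo_tweak(uint8_t tweak[DEMO_TWEAK_SIZE], uint32_t rel_id, uint32_t block)
{
    for (uint32_t i = 0; i < DEMO_TWEAK_SIZE; i++)
        tweak[i] = (uint8_t) (0xA5u ^ (rel_id >> (8 * (i % 4))) ^ (block + i));
}

/*
 * Re-key `blocks` pages of page_size bytes from src_rel to dst_rel,
 * starting at block number `block`. Returns the next block number,
 * following the shape of ReencryptBlock in the listing.
 */
uint32_t
demo_reencrypt(uint8_t *buffer, size_t blocks, size_t page_size,
               uint32_t src_rel, uint32_t dst_rel, uint32_t block)
{
    uint8_t src_tweak[DEMO_TWEAK_SIZE];
    uint8_t dst_tweak[DEMO_TWEAK_SIZE];

    assert(buffer != NULL && page_size > 0);
    for (size_t b = 0; b < blocks; b++, block++)
    {
        uint8_t *page = buffer + b * page_size;

        demo_tweak(src_tweak, src_rel, block);
        demo_tweak(dst_tweak, dst_rel, block);
        demo_crypt(page, page_size, src_tweak);   /* decrypt under the old location */
        demo_crypt(page, page_size, dst_tweak);   /* encrypt under the new location */
    }
    return block;
}
```

After `demo_reencrypt`, the page decrypts correctly only with the tweak of the destination relation, which is the property `copy_file` has to preserve.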

That covers the basic process by which PostgreSQL-FDE encrypts data files. Some of the details still need further study.
