博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
PostgreSQL Daily Maintenance - vacuum
阅读量:5749 次
发布时间:2019-06-18

本文共 13298 字,大约阅读时间需要 44 分钟。

PostgreSQL数据库日常维护需要维护哪些东西, 和数据库中的业务类型有莫大的关系.
PostgreSQL的并发控制简单来说是通过多tuple版本, tuple infomask信息, 事务提交状态以及事务snapshot来实现的.
当删除一条记录时, 并不是马上回收被删除的空间, 因为有可能其他事务还会用到它, 当更新一条记录是, 老的记录会保留, 然后插入新的记录.
例如 :
digoal=# create table tbl(id int, info text);CREATE TABLEdigoal=# insert into tbl values (1, 'test');INSERT 0 1digoal=# delete from tbl;DELETE 1digoal=# insert into tbl values (1, 'test');INSERT 0 1digoal=# delete from tbl;DELETE 1digoal=# insert into tbl values (1, 'test');INSERT 0 1digoal=# select ctid,* from tbl; ctid  | id | info -------+----+------ (0,3) |  1 | test(1 row)
多次删除插入后, ctid以及变成3了, 因为前面的两条并为删除. 
update也是如此 :
digoal=# update tbl set info='new';UPDATE 1digoal=# select ctid,* from tbl; ctid  | id | info -------+----+------ (0,4) |  1 | new(1 row)
老的tuple在0号block的itemid=3的位置, 新的tuple是后面插入的在0号block的4号槽.
那么这些垃圾数据是怎么回收的呢, PostgreSQL的vacuum进程就是干这个事情的.
1. vacuum 数据清理.
以上测试表在执行vacuum后的输出如下 : 
移除了3个版本.
digoal=# vacuum verbose tbl;INFO:  vacuuming "public.tbl"INFO:  "tbl": removed 3 row versions in 1 pagesINFO:  "tbl": found 3 removable, 1 nonremovable row versions in 1 out of 1 pagesDETAIL:  0 dead row versions cannot be removed yet.There were 0 unused item pointers.0 pages are entirely empty.CPU 0.00s/0.00u sec elapsed 0.00 sec.INFO:  vacuuming "pg_toast.pg_toast_32771"INFO:  index "pg_toast_32771_index" now contains 0 row versions in 1 pagesDETAIL:  0 index row versions were removed.0 index pages have been deleted, 0 are currently reusable.CPU 0.00s/0.00u sec elapsed 0.00 sec.INFO:  "pg_toast_32771": found 0 removable, 0 nonremovable row versions in 0 out of 0 pagesDETAIL:  0 dead row versions cannot be removed yet.There were 0 unused item pointers.0 pages are entirely empty.CPU 0.00s/0.00u sec elapsed 0.00 sec.VACUUM
重新插入数据, 此时那些被垃圾占用的槽位就可以被利用了.
digoal=# insert into tbl values (1, 'test');INSERT 0 1digoal=# select ctid,* from tbl; ctid  | id | info -------+----+------ (0,1) |  1 | test (0,4) |  1 | new(2 rows)
一个表有多少条垃圾数据, 多少条活跃数据在系统表
pg_stat_all_tables
中可以查询.
digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;-[ RECORD 1 ]-----+------------------------------relid             | 32771schemaname        | publicrelname           | tblseq_scan          | 6seq_tup_read      | 7idx_scan          | idx_tup_fetch     | n_tup_ins         | 4n_tup_upd         | 1n_tup_del         | 2n_tup_hot_upd     | 1n_live_tup        | 2n_dead_tup        | 0last_vacuum       | 2013-05-27 17:00:17.094391+08last_autovacuum   | last_analyze      | last_autoanalyze  | vacuum_count      | 1autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 0
n_live_tup        | 2表示有2条活跃数据, 
n_dead_tup        | 0表示有0条垃圾数据.
执行以下删除后, 会发生变化 :
digoal=# delete from tbl;;DELETE 2digoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;-[ RECORD 1 ]-----+------------------------------relid             | 32771schemaname        | publicrelname           | tblseq_scan          | 7seq_tup_read      | 9idx_scan          | idx_tup_fetch     | n_tup_ins         | 4n_tup_upd         | 1n_tup_del         | 4n_tup_hot_upd     | 1n_live_tup        | 0n_dead_tup        | 2last_vacuum       | 2013-05-27 17:00:17.094391+08last_autovacuum   | last_analyze      | last_autoanalyze  | vacuum_count      | 1autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 0
n_live_tup        | 0表示有0条活跃数据, 
n_dead_tup        | 2表示有2条垃圾数据.
vacuum 后活跃数据和垃圾数据都会变成0
digoal=# vacuum tbl;VACUUMdigoal=# select * from pg_stat_all_tables where relid='tbl'::regclass;-[ RECORD 1 ]-----+------------------------------relid             | 32771schemaname        | publicrelname           | tblseq_scan          | 7seq_tup_read      | 9idx_scan          | idx_tup_fetch     | n_tup_ins         | 4n_tup_upd         | 1n_tup_del         | 4n_tup_hot_upd     | 1n_live_tup        | 0n_dead_tup        | 0last_vacuum       | 2013-05-27 17:05:17.664564+08last_autovacuum   | last_analyze      | last_autoanalyze  | vacuum_count      | 2autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 0
2. 自动垃圾回收的配置.
对于一个DML频繁的数据库, 如果靠手动来回收垃圾是不太靠谱的事情, PostgreSQL提供了自动的垃圾回收配置.
相关参数如下 :
#------------------------------------------------------------------------------# AUTOVACUUM PARAMETERS#------------------------------------------------------------------------------#autovacuum = on                        # Enable autovacuum subprocess?  'on'                                        # requires track_counts to also be on.#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and                                        # their durations, > 0 logs only                                        # actions running at least this number                                        # of milliseconds.#autovacuum_max_workers = 3             # max number of autovacuum subprocesses                                        # (change requires restart)#autovacuum_naptime = 1min              # time between autovacuum runs#autovacuum_vacuum_threshold = 50       # min number of row updates before                                        # vacuum#autovacuum_analyze_threshold = 50      # min number of row updates before                                        # analyze#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum                                        # (change requires restart)#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for                                        # autovacuum, in milliseconds;                                        # -1 means use vacuum_cost_delay#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for                                        # autovacuum, -1 means use                                        # vacuum_cost_limit# - Cost-Based Vacuum Delay -#vacuum_cost_delay = 0                  # 0-100 milliseconds#vacuum_cost_page_hit = 1               # 0-10000 credits#vacuum_cost_page_miss = 10             # 0-10000 credits#vacuum_cost_page_dirty = 20            # 0-10000 credits#vacuum_cost_limit = 200                # 1-10000 credits
简单介绍一下参数的含义 : 
autovacuum, 自动垃圾回收的开关
log_autovacuum_min_duration, 在什么情况下记录autovacuum日志输出. 0表示记录所有的autovacuum, -1表示不记录, 其他为时间阈值, 大于或等于这个时长的autovacuum才记录.
autovacuum_max_workers, 指最大允许多少个autovacuum子进程同时工作. 因为vacuum会带来IO上的开销, 还会消耗内存. 这个就不要配太大了. 
autovacuum_vacuum_threshold表示autovacuum的vacuum操作所需的最小变更数, 如果这个表的update/delete的tuple总数小于这个数字则不会触发autovacuum的vacuum操作.
和autovacuum_analyze_threshold
表示autovacuum的analyze操作所需的最小变更数, 如果这个表的insert/update/delete的tuple总数小于这个数字则不会触发autovacuum的analyze操作.
autovacuum_vacuum_scale_factor, 
表示autovacuum的vacuum操作所需的变更量阈值,当
这个表的update/delete的tuple总数
大于(pg_class.reltuples*
autovacuum_vacuum_scale_factor+
autovacuum_vacuum_threshold)时, 触发vacuum操作.
autovacuum_analyze_scale_factor, 
表示autovacuum的analyze操作所需的变更量阈值,当
这个表的INSERT/update/delete的tuple总数
大于(pg_class.reltuples*
autovacuum_analyze_scale_factor+
autovacuum_analyze_threshold)时, 触发analyze操作.
autovacuum_freeze_max_age, 即使autovacuum未开启, 为了防止wrapped xid导致数据不可见, 也会自动触发的vacuum操作. 表示一个表中存在的最早的事务信息到现在为止经历的事务数. 超出则强制vacuum. 防止xid wrapped.
autovacuum_vacuum_cost_delay, 因为vacuum会带来一定的IO开销, 所以PostgreSQL允许管理员指定当vacuum达到一定的阈值后进入随眠状态, 然后再唤醒继续vacuum. 具体的计算需要配置项Cost-Based Vacuum Delay决定.
接下来主要举例说明几个
threshold参数的作用 : 
查看当前的阈值 :
digoal=# show autovacuum_analyze_scale_factor; autovacuum_analyze_scale_factor --------------------------------- 0.1(1 row)digoal=# show autovacuum_vacuum_scale_factor; autovacuum_vacuum_scale_factor -------------------------------- 0.2(1 row)digoal=# show autovacuum_analyze_threshold; autovacuum_analyze_threshold ------------------------------ 50(1 row)digoal=# show autovacuum_vacuum_threshold; autovacuum_vacuum_threshold ----------------------------- 50(1 row)
修改naptime, 以及log_autovacuum_min_duration 便于从日志中或者统计表中观察结果 : 
pg93@db-172-16-3-33-> cd $PGDATApg93@db-172-16-3-33-> vi postgresql.conf autovacuum_naptime = 1slog_autovacuum_min_duration = 0pg93@db-172-16-3-33-> pg_ctl reloadserver signaled
创建测试表 :
digoal=# create table tbl(id int, info text);CREATE TABLE
计算插入多少条数据后会触发analyze : 
digoal=# select reltuples from pg_class where relname='tbl'; reltuples -----------         0(1 row)autovacuum_analyze_scale_factor*0+autovacuum_analyze_threshold=50;
因此插入51条数据后会发生analyze.
记录pg_stat_all_tables的tbl信息, 注意
last_autovacuum , 和
last_autoanalyze 的值.
digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+-------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 0seq_tup_read      | 0idx_scan          | idx_tup_fetch     | n_tup_ins         | 0n_tup_upd         | 0n_tup_del         | 0n_tup_hot_upd     | 0n_live_tup        | 0n_dead_tup        | 0last_vacuum       | last_autovacuum   | last_analyze      | last_autoanalyze  | vacuum_count      | 0autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 0
插入50条测试数据 :
digoal=# insert into tbl select generate_series(1,50),'test';INSERT 0 50
stat信息, 未触发analyze.
digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+-------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 0seq_tup_read      | 0idx_scan          | idx_tup_fetch     | n_tup_ins         | 50n_tup_upd         | 0n_tup_del         | 0n_tup_hot_upd     | 0n_live_tup        | 50n_dead_tup        | 0last_vacuum       | last_autovacuum   | last_analyze      | last_autoanalyze  | vacuum_count      | 0autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 0
再插入1条记录.
digoal=# insert into tbl select 51,'test';INSERT 0 1
触发analyze :
digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+------------------------------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 0seq_tup_read      | 0idx_scan          | idx_tup_fetch     | n_tup_ins         | 51n_tup_upd         | 0n_tup_del         | 0n_tup_hot_upd     | 0n_live_tup        | 51n_dead_tup        | 0last_vacuum       | last_autovacuum   | last_analyze      | last_autoanalyze  | 2013-05-27 21:23:49.144829+08vacuum_count      | 0autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 1
计算插入多少条数据后会触发analyze : 
digoal=# select reltuples from pg_class where relname='tbl';-[ RECORD 1 ]-reltuples | 51autovacuum_analyze_scale_factor*51+autovacuum_analyze_threshold=55.1;
因此插入56条数据后会触发analyze.
digoal=# insert into tbl select generate_series(1,55),'test';INSERT 0 55digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+------------------------------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 0seq_tup_read      | 0idx_scan          | idx_tup_fetch     | n_tup_ins         | 106n_tup_upd         | 0n_tup_del         | 0n_tup_hot_upd     | 0n_live_tup        | 106n_dead_tup        | 0last_vacuum       | last_autovacuum   | last_analyze      | last_autoanalyze  | 2013-05-27 21:23:49.144829+08vacuum_count      | 0autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 1digoal=# select reltuples from pg_class where relname='tbl';-[ RECORD 1 ]-reltuples | 51
再插入1条即可触发analyze.
digoal=# insert into tbl select 1,'test';INSERT 0 1digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+-----------------------------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 0seq_tup_read      | 0idx_scan          | idx_tup_fetch     | n_tup_ins         | 107n_tup_upd         | 0n_tup_del         | 0n_tup_hot_upd     | 0n_live_tup        | 107n_dead_tup        | 0last_vacuum       | last_autovacuum   | last_analyze      | last_autoanalyze  | 2013-05-27 21:26:25.57402+08vacuum_count      | 0autovacuum_count  | 0analyze_count     | 0autoanalyze_count | 2digoal=# select reltuples from pg_class where relname='tbl';-[ RECORD 1 ]--reltuples | 107
计算UPDATE/DELETE多少条数据后会触发vacuum :
autovacuum_vacuum_scale_factor*107+autovacuum_vacuum_threshold=71.4;
因此更新或删除共计72条数据后会触发vacuum, 如果中间发生了analyze, 导致pg_class.reltuples发生变化, 这个值也会变化.
发生62次insert,update,delete后会触发analyze.
digoal=# update tbl set info='new' where id<18;UPDATE 35digoal=# update tbl set info='new' where id<10;UPDATE 19digoal=# delete from tbl where id<9;DELETE 17
总共发生了35+19+17=71超出62次dml, 发生analyze, 
analyze后, pg_class.reltuples变成90,
digoal=# select reltuples from pg_class where relname='tbl';-[ RECORD 1 ]-reltuples | 90
所以触发vacuum的值变成了多少呢?
autovacuum_vacuum_scale_factor*90+autovacuum_vacuum_threshold=68;
因此只需要69次update/delete即可触发vacuum, 而上一次vacuum到现在已经发生了71次update/delete, 因此会触发vacuum.
digoal=# select * from pg_stat_all_tables where relname ='tbl';-[ RECORD 1 ]-----+------------------------------relid             | 32798schemaname        | publicrelname           | tblseq_scan          | 8seq_tup_read      | 856idx_scan          | idx_tup_fetch     | n_tup_ins         | 107n_tup_upd         | 54n_tup_del         | 17n_tup_hot_upd     | 54n_live_tup        | 90n_dead_tup        | 0last_vacuum       | last_autovacuum   | 2013-05-27 21:31:23.560703+08last_analyze      | last_autoanalyze  | 2013-05-27 21:31:22.474655+08vacuum_count      | 0autovacuum_count  | 1analyze_count     | 0autoanalyze_count | 3

转载地址:http://jfrzx.baihongyu.com/

你可能感兴趣的文章
多流聚合与JSON全文检索的功能应用
查看>>
关于电梯阻塞原因的小思考
查看>>
多维透视表 - 矩表实现商品销售对比统计
查看>>
INNODB 页节点数据的存储方式、数据链、删除链的学习和实验总结
查看>>
使用 readfile() 下载文件
查看>>
[20150619]undo文件损坏或者丢失的恢复2
查看>>
MYSQL 导入出错从指定行号截取文件(C语言写的)及注意事项
查看>>
JAVA深入研究——Method的Invoke方法(转)
查看>>
js中的各种宽高以及位置总结
查看>>
[老老实实学WCF] 第一篇 Hello WCF
查看>>
Git常用命令(转)
查看>>
使用 InstallShield limited edition 打包部署Outlook 2013 Office add-in插件
查看>>
十天学Linux内核之第二天---进程
查看>>
一个疑难故障,坑了我半年青春-----知识就是生产力
查看>>
ArcGIS API for Silverlight之配准JPG图片地图文字倾斜解决方案
查看>>
微信智能硬件——微信相框
查看>>
我的Android进阶之旅------&gt;Android Activity的singleTask加载模式和onActivityResult方法之间的冲突...
查看>>
Kernel 中的 GPIO 定义和控制
查看>>
sqlserver性能调优方法论
查看>>
Sql Server服务远程过程调用失败
查看>>