Report for Bloated Tables in Postgres and manual mitigation steps
Manually running vacuum jobs in Postgres
Determining what needs to be vacuumed
This report organizes tables by schema in order to plan a manual run of vacuum jobs. It only includes tables whose bloat is greater than 25%, and total storage reclamation is rolled up per schema so you can see where the most bloat lives. Actual table sizing information is provided to help estimate how long the vacuum jobs may take. Note that VACUUM FULL rewrites each table under an ACCESS EXCLUSIVE lock, so these jobs should be scheduled for windows when the tables can be unavailable.
select
	'<database name>' as db,
	coalesce(nspname, 'Grand Totals') as "schema",
	coalesce(nspname || '.' || relname,
		'----------------------------------------') as table_name,
	pg_size_pretty(sum(pg_total_relation_size(c.oid))) as actual_size,
	pg_size_pretty(sum(pg_total_relation_size(c.oid) - pg_relation_size(c.oid))) as reclaim_size_pretty,
	case
		when sum(pg_total_relation_size(c.oid)) > 0
			then (sum(pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) * 100 / sum(pg_total_relation_size(c.oid)))::integer
		else 0
	end as bloat_percentage,
	sum(pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) as reclaim_size,
	'VACUUM FULL VERBOSE ANALYZE ' || quote_ident(nspname) || '.' || quote_ident(relname) || ';' as "sql"
from
	pg_class c
	join pg_namespace n on
		c.relnamespace = n.oid
where
	relkind = 'r'
	and nspname not like 'pg_%'
	and nspname != 'information_schema'
	and reltuples > 0
	and (pg_total_relation_size(c.oid) - pg_relation_size(c.oid)) *
		100 / pg_total_relation_size(c.oid) > 25
	-- and pg_relation_size(c.oid) > 10033438720 -- specific size filter
group by
	rollup(nspname,
		relname)
having
	grouping(nspname) = 0
	or grouping(relname) = 1
order by
	nspname,
	case
		when grouping(relname) = 0 then 0
		else 1
	end,
	sum(pg_total_relation_size(c.oid)) asc;
Creating SQL scripts to perform vacuum runs
Using the report above, organize your VACUUM statements into a script per schema; the smaller and shorter schemas' statements can be combined into a single script file to reduce your amount of keyboard time.
- vacuum_<environment>_<schema>_<devops ticket #>.sql
- vacuum_other_schemas_<devops ticket #>.sql
File name examples:
vacuum_prod_other_stage_omop_49924.sql
vacuum_source_epic_clarity_49924.sql
vacuum_prod_omop_prod_49924.sql
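As a sketch of the convention above (the environment, schema, ticket number, and table names here are all illustrative), the statements taken from the report's "sql" column can be dropped into a per-schema script like so:

```shell
# Illustrative only: env/schema/ticket and the table names are placeholders;
# the real statements come from the "sql" column of the bloat report.
env=prod; schema=omop_prod; ticket=49924
script="vacuum_${env}_${schema}_${ticket}.sql"
{
  echo "VACUUM FULL VERBOSE ANALYZE omop_prod.person;"
  echo "VACUUM FULL VERBOSE ANALYZE omop_prod.measurement;"
} > "$script"
echo "wrote $script"
```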
Preparing the devops task
Attaching and placing the SQL scripts
Once the scripts are created, attach them to your devops task, then use your desktop SMB client to create a working directory on storage1 and place the scripts there.
Create the working directory via SMB:
Using your SMB explorer or client, create a directory named after your ticket number in the devops folder for the environment you're working in:
\\storage1.ris.wustl.edu\i2\Active\rdc2_etl\<environment>\devops\<devops ticket #>
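If you are already on a host with storage1 mounted, the same directory can be created from the shell instead of over SMB. A minimal sketch, assuming ${RDC_DEVOPS} points at the environment's devops folder (the ticket number and the fallback path are illustrative):

```shell
# Sketch: create the working directory on the Linux-side mount of storage1.
# RDC_DEVOPS is assumed to point at the environment's devops folder; the
# /tmp fallback below exists purely so the example runs anywhere.
ticket=49924
mkdir -p "${RDC_DEVOPS:-/tmp/rdc_devops_demo}/${ticket}"
echo "created ${RDC_DEVOPS:-/tmp/rdc_devops_demo}/${ticket}"
```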
Creating Execution Plans for Vacuuming
By logging into the RIS compute cluster and using the Lsub script for Postgres, the run_job command can quickly generate the commands needed to execute the SQL scripts created above.
Using the Lsub job for Postgres, log into the desired environment and run run_job for each vacuum script.
NOTE:
run_job does not actually execute anything; its output is a display of what should be executed in order to run the given SQL script against the Postgres environment.
cd ${RDC_DEVOPS}/<devops ticket>
# or, if this is a version release
cd ${RDC_REPOS}/<specific git repo>
run_job <connection index> <script file> <devops ticket #>
The output of the command (the execution plan) can be copied into your devops ticket's description. Do not execute anything output by run_job without the appropriate approvals.
Example run_job command run:
cd /storage1/fs1/I2/Active/rdc2_etl/prod/devops/49924
run_job 3 ./vacuum_prod_other_stage_omop.sql 49924
${PG_STRING} = -abe postgresql://rdc_etl_user@postgresql.prod.i2rdc3.wustl.edu:5432/omop_dw?sslmode=require&keepalives=1&keepalives_idle=600 --set=env=prod --set=ON_ERROR_STOP=1 --set=serv=postgresql.prod.i2rdc3.wustl.edu
${RDC_LOGS} = /storage1/fs1/I2/Active/rdc2_etl/prod/logs
execute:
set_connection 3
time psql ${PG_STRING} -f ./vacuum_prod_other_stage_omop.sql > $RDC_LOGS/vacuum_prod_other_stage_omop_49924_20241214-225216.log 2>&1
log_test ${RDC_LOGS}/vacuum_prod_other_stage_omop_49924_20241214-225216.log
Additional Notes:
If the task you are running targets the QA or Prod environments, approvals are required and must be requested before execution.
Related devops tickets
- https://dev.azure.com/wustl-i2/Planning and Administration/_workitems/edit/49924
- https://dev.azure.com/wustl-i2/Planning and Administration/_workitems/edit/49923
- https://dev.azure.com/wustl-i2/Planning and Administration/_workitems/edit/47874
- https://dev.azure.com/wustl-i2/Planning and Administration/_workitems/edit/47875