Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

25-Apr-2005 Issue: 107

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Interpreting pg_stat Views
Understanding pg_stat Views	21-Apr-2005

Most people are familiar with the view pg_stat_activity which shows who is connected to the database and what they are doing. There are more pg_stat views available that are useful in tuning and understanding the rhythm of your database activity. We will review a few of these here.

Statistics collections must be turned on in the postgresql.conf file in order for statistics to be collected. This variable is stats_start_collector and is on by default. Other settings in the postgresql.conf file affect how much information the statistics collector gathers. These settings are: stats_command_string, stats_block_level, and stats_row_level and for information for their settings see Tidbits/Performance

The sample data below comes from several different sources and should not necessarily be read as a whole. Some names have changed and the same data sets are not used for each example. However, when you look at this information for your own database, you should be able to see correlation between the statistics on each view.

The sample data here also does not include all of the views available. The full list of all of the statistics views is here. Those highlighted are discussed in this issue. The contents of the other views can be discerned by understanding the contents of the four highlighted views.

pg_stat_activity	pg_stat_database
pg_stat_all_tables	pg_stat_sys_tables	pg_stat_user_tables
pg_stat_all_indexes	pg_stat_sys_indexes	pg_stat_user_indexes
pg_statio_all_tables	pg_statio_sys_tables	pg_statio_user_tables
pg_statio_all_indexes	pg_statio_sys_indexes	pg_statio_user_indexes
pg_statio_all_sequences	pg_statio_sys_sequences	pg_statio_user_sequences

pg_stat_database

This table shows two rows of data selected from pg_stat_database. You will have one row per database in your installation. It shows us the transaction statistics of each database and how many blocks were read from disk (blks_read) and how many blocks where read from the cache (blks_hit). High numbers of blocks read from the cache is a good thing.

datid	datname	numbackends	xact_commit	xact_rollback	blks_read	blks_hit
17142	goldrock	0	41481	5668965	1152300	9613617
17143	datanews	27	94011034	0	863238029	14997016697

So what does this tell us? It tells us that goldrock might be a development database since there are so many transactions rolled back and there are no active connections. Either that or there are a lot of errors in the code for goldrock. Both databases get a good number of cache hits. If the cache hits were low and the reads were high, we may want to consider increasing the cache sizes; but that does not seem to be the case here. Depending on the volatility of the data, we may want to consider the number of transactions committed between vacuums. Is it too high? Then we may want to vacuum more often. Too low? Maybe not as often. Some of the other views will tell us more about this.

pg_stat_user_tables

The first two columns of this table are the relation id of the table and the schema name of the table. These have been removed in order to narrow the table for publication. There are also pg_stat_sys_tables and pg_stat_all_tables to look at. This table shows the relation name and the sequential scan statistics, the index scan statistics and the insert, update and delete information for a specific relation or table.

relname seq_scan seq_tup_read idx_scan idx_tup_fetch n_tup_ins n_tup_upd n_tup_del
blog_stats 336 48541598 32 82358 521554 0 1200
data_contacts 15355 1499879 0 0 2 6 0
contact_areas 1761 2465791 40556 54654 57 5 33
role_users 1493 7465 0 0 0 0 0
areas 17605 16898222 4227149 1745533 40 228 25
adminusers 206971 2483546 0 0 0 5 0
data_stats 222 747400 0 0 8490 0 0
groups 7203 21606 0 0 0 0 0
userstats 1506 12840322 219517910 172630886 7098 2192804 6936
breadth 7461 9023948 630 608 11 0 0
provider_traffic 1504 5816918 119 27874 1543 0 0
group_renames 1498 10836 0 0 2 0 0
user_session_ips 4483 232274266 2202891 2254480 2192743 0 2190299
breadth_attributes 1493 380715 0 0 0 0 0
contacts 1561 4096736 204198 40477 65 112 70
client_data 5879361 11758722 0 0 0 0 0
incoming 4334 638388 131 134 6 0 2
userstats_history 1493 23150333 0 0 6936 0 0
init_info 1493 5216605 73662 2199 228 3 75
chargehistory 14943 2468386808 16307 8977 7330 7687 75
acct_session_attrs 1494 49628668 8771255 70162054 35080249 0 35081608
data_clusters 7069 41215 0 0 1 0 0
dataers 7066 41202 0 0 1 0 0
accounttypes 239756 2637316 0 0 0 11 0
acct_sessions 2986 21319301 39465796 39465796 4385560 0 4385708
data_errors 216 172456 0 0 1566 0 0
users 50677 976541801 274746 177126 352 32225 25
data_contacts 1727 178240 0 0 2 0 0
groupstats 16843 10293192 2297646 2297318 645 2297501 658
ipranges 9830 85654695 126 155 99 10 65
user_accesses 2984 46244951 1179679 0 1176140
newsdata 9073 866327 473 452 4 23 1
group_traffic 1666 327369154 3200 1803857 112814 935 0
path_aliases 7212 1166897 4 0 4 0 1
The number of sequential scans should be lower than the index scans on all but small tables or queries that hit the majority of rows in the table. We can assume that the relations in this table that have 0 index scans are relatively small. We can also assume that the relations that have no value for index scans have no indexes. If you think there should be more index scans, you may want to take a look at the queries accessing the table and determine if they are using the indexes you expect them to use and try to understand why they are not.
The number of rows fetched using either the sequential scans or the index scans tells you how active and large the table is for selections. The number of inserts, updates and deletes can help you in the art of timing your vacuums. Tables with many updates and deletes should be vacuumed fairly often. These numbers also give you an idea of the type of operation your application does most often--and it should match with what you think it does most often.
pg_stat_user_indexes
For more details in specific index scans, you can use pg_stat_user_indexes. There are also pg_stat_sys_indexes and pg_stat_all_indexes to look at. This table includes both the table OID and the index OID and the schema name. These have been removed to narrow the table for readability here.

relname indexrelname idx_scan idx_tup_read idx_tup_fetch
team_rank team_rank_pkey 3222201 1862876012 1862876012
team_rank team_joins__overall_rank 321211 40837096 40837096
team_rank team_joins__day_rank 10551 35091783 35091783
team_rank_last_update team_rank_last_update_pkey 416
stats_team stats_team__team_listmode 0 0 0
stats_team stats_team_pkey 10221917 146956388 146956388
email_rank email_rank_pkey 90242022 153430377 153430377
email_rank email_rank__day_rank 10445 9218946 9218946
email_rank email_rank__overall_rank 417093 4293395 4293395
email_rank_last_update email_rank_last_update_pkey 500501 500426 500426
In this table we can see the details of the index scans. The number of indexes read and the number of indexed values read are usually the same. If they are different, then it is possible that indexes are bloated and need to be rebuilt. You can also see which indexes are used most often and which are not used at all.
pg_statio_user_tables
This view summarizes the different kinds of I/O done for each table. The different kinds of I/O are for simple (heap) rows, indexes, TOAST items and access to TOAST indexes. TOAST, you recall, is when a column grows to a large size and is pushed outside the primary row storage to TOAST storage, and, of course, it is indexed. Blocks read are read from disk and Blocks Hit are cache hits. Cache hits are better than disk reads.
The table below has had two columns removed. The first two columns are the OID and the schema name of the table.

relname heap
blks_read heap
blks_hit idx
blks_read idx
blks_hit toast
blks_read toast
blks_hit tidx
blks_read tidx
blks_hit
groups 3095 4858 1492 746 0 0 746 0
area_aliases 12372 19476 1500 754 0 0 746 0
user_session_ips 12513150 92230884 5696858 38071201 0 0 748 2
contacts 2503 13651 1502 752 0 0 746 0
contact_area 35185 65693 45566 129047 0 0 746 0
role_users 1493 746 1492 746 0 0 746 0
acct_sessions 680475 98820645 2551772 173060962 0 0 748 2
areas 356336 2164618 26922 5272862 0 0 746 0
accounttypes 11849 228754 1494 766 0 0 746 0
contacts 64378 65253 75512 578659 0 0 746 0
adminusers 15958 191807 1500 748 0 0 746 0
transmission 6870441 23239223 2110938 2492543 0 0 746 1
group_renames 1496 752 1494 748 0 0 746 0
breadth_attributes 2986 1492 6714 5222 0 0 746 0
group_owners 16792 11028 4774 3494 0 0 746 0
signupinfo 63172 1290637 94846 199373 0 0 746 0
breadth 133575 14810 11562 10503 0 0 746 0
dataers 3058 4757 1494 746 0 0 746 0
feed_errors 2524 4287 2197 4786 0 0 101 1
new_data 14124 22261 1628 1610 2419 1676 4838 747
data_stats 13735 256637 10760 28902 0 0 101 1

As we expect most of the data is accessed directly from the heap or via indexes. The high rate of cache hits indicates that the cache sizes specified in the postgresql.conf are pretty good. This database does not seem to have a lot of TOASTED columns, but it does have some and a few of those are accessed via the TOAST index cache.

Summary
Review your pg_stat views and/or collect them over time before VACUUM ANALYZE. They will tell you about the activities of your tables and indexes. You will have better information for setting your postgresql.conf variables and you will have a good idea of which indexes work well and which do not. You will have an idea about the ratio of INSERT, DELETE and UPDATES which will help you tune your VACUUM maintenance.
Contributors: elein at varlena.com, andrewsn at freenode.net, decibull at freenode.net

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009

Top