Interpreting pg_stat Views
|
Understanding pg_stat Views
|
21-Apr-2005
| |
Most people are familiar with the view pg_stat_activity which shows
who is connected to the database and what they are doing. There are more
pg_stat views available that are useful in tuning and understanding the
rhythm of your database activity. We will review a few of these here.
Statistics collections must be turned on in the postgresql.conf file in order
for statistics to be collected. This variable is stats_start_collector
and is on by default. Other settings in the postgresql.conf file affect how
much information the statistics collector gathers. These settings are:
stats_command_string, stats_block_level, and stats_row_level
and for information for their settings see
Tidbits/Performance
The sample data below comes from several different sources and should not necessarily
be read as a whole. Some names have changed and the same data sets are not used for
each example. However, when you look at this information for your own database, you
should be able to see correlation between the statistics on each view.
The sample data here also does not include all of the views available. The full list of
all of the statistics views is here. Those highlighted are discussed in this issue.
The contents of the other views can be discerned by understanding the contents of the
four highlighted views.
pg_stat_activity |
pg_stat_database |
|
pg_stat_all_tables |
pg_stat_sys_tables |
pg_stat_user_tables |
pg_stat_all_indexes |
pg_stat_sys_indexes |
pg_stat_user_indexes |
pg_statio_all_tables |
pg_statio_sys_tables |
pg_statio_user_tables |
pg_statio_all_indexes |
pg_statio_sys_indexes |
pg_statio_user_indexes |
pg_statio_all_sequences |
pg_statio_sys_sequences |
pg_statio_user_sequences |
pg_stat_database
This table shows two rows of data selected from pg_stat_database. You will have one row per database in your installation. It shows us the transaction statistics of each database and how many blocks were read from disk (blks_read) and how many blocks where read from the cache (blks_hit). High numbers of blocks
read from the cache is a good thing.
datid |
datname |
numbackends |
xact_commit |
xact_rollback |
blks_read |
blks_hit |
17142 |
goldrock |
0 |
41481 |
5668965 |
1152300 |
9613617 |
17143 |
datanews |
27 |
94011034 |
0 |
863238029 |
14997016697 |
So what does this tell us? It tells us that goldrock might be a development database
since there are so many transactions rolled back and there are no active connections.
Either that or there are a lot of errors in the code for goldrock. Both databases get a
good number of cache hits. If the cache hits were low and the reads were high, we may want
to consider increasing the cache sizes; but that does not seem to be the case here.
Depending on the volatility of the data, we may want to consider the number of transactions
committed between vacuums. Is it too high? Then we may want to vacuum more often. Too low?
Maybe not as often. Some of the other views will tell us more about this.
pg_stat_user_tables
The first two columns of this table are the relation id of the table and the schema name of
the table. These have been removed in order to narrow the table for publication.
There are also pg_stat_sys_tables and pg_stat_all_tables to look at.
This table shows the relation name and the sequential scan statistics, the index scan statistics
and the insert, update and delete information for a specific relation or table.
relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del |
blog_stats | 336 | 48541598 | 32 | 82358 | 521554 | 0 | 1200 |
data_contacts | 15355 | 1499879 | 0 | 0 | 2 | 6 | 0 |
contact_areas | 1761 | 2465791 | 40556 | 54654 | 57 | 5 | 33 |
role_users | 1493 | 7465 | 0 | 0 | 0 | 0 | 0 |
areas | 17605 | 16898222 | 4227149 | 1745533 | 40 | 228 | 25 |
adminusers | 206971 | 2483546 | 0 | 0 | 0 | 5 | 0 |
data_stats | 222 | 747400 | 0 | 0 | 8490 | 0 | 0 |
groups | 7203 | 21606 | 0 | 0 | 0 | 0 | 0 |
userstats | 1506 | 12840322 | 219517910 | 172630886 | 7098 | 2192804 | 6936 |
breadth | 7461 | 9023948 | 630 | 608 | 11 | 0 | 0 |
provider_traffic | 1504 | 5816918 | 119 | 27874 | 1543 | 0 | 0 |
group_renames | 1498 | 10836 | 0 | 0 | 2 | 0 | 0 |
user_session_ips | 4483 | 232274266 | 2202891 | 2254480 | 2192743 | 0 | 2190299 |
breadth_attributes | 1493 | 380715 | 0 | 0 | 0 | 0 | 0 |
contacts | 1561 | 4096736 | 204198 | 40477 | 65 | 112 | 70 |
client_data | 5879361 | 11758722 | 0 | 0 | 0 | 0 | 0 |
incoming | 4334 | 638388 | 131 | 134 | 6 | 0 | 2 |
userstats_history | 1493 | 23150333 | 0 | 0 | 6936 | 0 | 0 |
init_info | 1493 | 5216605 | 73662 | 2199 | 228 | 3 | 75 |
chargehistory | 14943 | 2468386808 | 16307 | 8977 | 7330 | 7687 | 75 |
acct_session_attrs | 1494 | 49628668 | 8771255 | 70162054 | 35080249 | 0 | 35081608 |
data_clusters | 7069 | 41215 | 0 | 0 | 1 | 0 | 0 |
dataers | 7066 | 41202 | 0 | 0 | 1 | 0 | 0 |
accounttypes | 239756 | 2637316 | 0 | 0 | 0 | 11 | 0 |
acct_sessions | 2986 | 21319301 | 39465796 | 39465796 | 4385560 | 0 | 4385708 |
data_errors | 216 | 172456 | 0 | 0 | 1566 | 0 | 0 |
users | 50677 | 976541801 | 274746 | 177126 | 352 | 32225 | 25 |
data_contacts | 1727 | 178240 | 0 | 0 | 2 | 0 | 0 |
groupstats | 16843 | 10293192 | 2297646 | 2297318 | 645 | 2297501 | 658 |
ipranges | 9830 | 85654695 | 126 | 155 | 99 | 10 | 65 |
user_accesses | 2984 | 46244951 | | | 1179679 | 0 | 1176140 |
newsdata | 9073 | 866327 | 473 | 452 | 4 | 23 | 1 |
group_traffic | 1666 | 327369154 | 3200 | 1803857 | 112814 | 935 | 0 |
path_aliases | 7212 | 1166897 | 4 | 0 | 4 | 0 | 1 |
The number of sequential scans should be lower than the index scans on all but small tables or queries
that hit the majority of rows in the table. We can assume that the relations in this table that have 0 index
scans are relatively small. We can also assume that the relations that have no value for index scans
have no indexes. If you think there should be more index scans, you may want to take a look
at the queries accessing the table and determine if they are using the indexes you expect them to use
and try to understand why they are not.
The number of rows fetched using either the sequential scans
or the index scans tells you how active and large the table is for selections. The number of inserts,
updates and deletes can help you in the art of timing your vacuums. Tables with many updates and
deletes should be vacuumed fairly often. These numbers also give you an idea of the type of operation
your application does most often--and it should match with what you think it does most often.
pg_stat_user_indexes
For more details in specific index scans, you can use pg_stat_user_indexes.
There are also pg_stat_sys_indexes and pg_stat_all_indexes to look at.
This table includes both the table OID and the index OID and the schema name. These
have been removed to narrow the table for readability here.
relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch |
team_rank | team_rank_pkey | 3222201 | 1862876012 | 1862876012 |
team_rank | team_joins__overall_rank | 321211 | 40837096 | 40837096 |
team_rank | team_joins__day_rank | 10551 | 35091783 | 35091783 |
team_rank_last_update | team_rank_last_update_pkey | 416 | |
stats_team | stats_team__team_listmode | 0 | 0 | 0 |
stats_team | stats_team_pkey | 10221917 | 146956388 | 146956388 |
email_rank | email_rank_pkey | 90242022 | 153430377 | 153430377 |
email_rank | email_rank__day_rank | 10445 | 9218946 | 9218946 |
email_rank | email_rank__overall_rank | 417093 | 4293395 | 4293395 |
email_rank_last_update | email_rank_last_update_pkey | 500501 | 500426 | 500426 |
In this table we can see the details of the index scans. The number of indexes read and the number of
indexed values read are usually the same. If they are different, then it is possible that indexes
are bloated and need to be rebuilt. You can also see which indexes are used most often and which
are not used at all.
pg_statio_user_tables
This view summarizes the different kinds of I/O done for each table.
The different kinds of I/O are for simple (heap) rows, indexes,
TOAST items and access to TOAST indexes. TOAST, you recall, is when a column grows to a large
size and is pushed outside the primary row storage to TOAST storage, and, of course, it is indexed.
Blocks read are read from disk and Blocks Hit are cache hits. Cache hits are better than disk reads.
The table below has had two columns removed. The first two columns are the OID and the schema name of the
table.
relname | heap blks_read | heap blks_hit | idx blks_read | idx blks_hit | toast blks_read | toast blks_hit | tidx blks_read | tidx blks_hit |
groups | 3095 | 4858 | 1492 | 746 | 0 | 0 | 746 | 0 |
area_aliases | 12372 | 19476 | 1500 | 754 | 0 | 0 | 746 | 0 |
user_session_ips | 12513150 | 92230884 | 5696858 | 38071201 | 0 | 0 | 748 | 2 |
contacts | 2503 | 13651 | 1502 | 752 | 0 | 0 | 746 | 0 |
contact_area | 35185 | 65693 | 45566 | 129047 | 0 | 0 | 746 | 0 |
role_users | 1493 | 746 | 1492 | 746 | 0 | 0 | 746 | 0 |
acct_sessions | 680475 | 98820645 | 2551772 | 173060962 | 0 | 0 | 748 | 2 |
areas | 356336 | 2164618 | 26922 | 5272862 | 0 | 0 | 746 | 0 |
accounttypes | 11849 | 228754 | 1494 | 766 | 0 | 0 | 746 | 0 |
contacts | 64378 | 65253 | 75512 | 578659 | 0 | 0 | 746 | 0 |
adminusers | 15958 | 191807 | 1500 | 748 | 0 | 0 | 746 | 0 |
transmission | 6870441 | 23239223 | 2110938 | 2492543 | 0 | 0 | 746 | 1 |
group_renames | 1496 | 752 | 1494 | 748 | 0 | 0 | 746 | 0 |
breadth_attributes | 2986 | 1492 | 6714 | 5222 | 0 | 0 | 746 | 0 |
group_owners | 16792 | 11028 | 4774 | 3494 | 0 | 0 | 746 | 0 |
signupinfo | 63172 | 1290637 | 94846 | 199373 | 0 | 0 | 746 | 0 |
breadth | 133575 | 14810 | 11562 | 10503 | 0 | 0 | 746 | 0 |
dataers | 3058 | 4757 | 1494 | 746 | 0 | 0 | 746 | 0 |
feed_errors | 2524 | 4287 | 2197 | 4786 | 0 | 0 | 101 | 1 |
new_data | 14124 | 22261 | 1628 | 1610 | 2419 | 1676 | 4838 | 747 |
data_stats | 13735 | 256637 | 10760 | 28902 | 0 | 0 | 101 | 1 |
As we expect most of the data is accessed directly from the heap or via indexes. The high rate of cache hits
indicates that the cache sizes specified in the postgresql.conf are pretty good.
This database does not seem to have a lot of TOASTED columns, but it does have some and a few of those
are accessed via the TOAST index cache.
Summary
Review your pg_stat views and/or collect them over time before VACUUM ANALYZE. They will tell
you about the activities of your tables and indexes. You will have better information for setting your
postgresql.conf variables and you will have a good idea of which indexes work well and which
do not. You will have an idea about the ratio of INSERT, DELETE and UPDATES which will help
you tune your VACUUM maintenance.
Contributors:
elein at varlena.com,
andrewsn at freenode.net,
decibull at freenode.net
Comments and Corrections are welcome. Suggestions and
contributions of items are also welcome.
Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009
|
Top
|