Skip to content

Monitoring Long-Running Operations

WarehousePG can report the progress of ANALYZE, CLUSTER, COPY, CREATE INDEX, REINDEX, and VACUUM commands during command execution. WarehousePG can also report the progress of a running base backup (initiated during gprecoverseg -F) command invocation, allowing you to monitor the progress of these possibly long-running operations.

WarehousePG reports the command progress via ephemeral system views, which return data only while the operations are running. Two sets of progress reporting views are provided:

  • gp_stat_progress_<command> - displays the progress of running <command> invocations on the coordinator and all segments, with a row per segment instance
  • gp_stat_progress_<command>_summary - aggregates <command> progress on the coordinator and all segments, and displays one row per running <command> invocation

WarehousePG reports progress in phases, where the phases are specific to the command. For example, acquiring sample rows is an analyze progress phase, while building index is an index creation progress phase. WarehousePG reports the progress for both heap and AO/CO tables. For most commands, heap and AO/CO table share the same phases. For vacuum and cluster operations, WarehousePG reports heap and AO/CO table progress in separate phases.

ANALYZE Progress Reporting

The gp_stat_progress_analyze system view reports the progress of running ANALYZE and analyzedb operations. The view displays a row per segment instance that is currently servicing an analyze operation.

For each active analyze operation, the gp_stat_progress_analyze_summary view aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_analyze.

The table below describes how to interpret the phase-specific information reported in the views:

PhaseDescription
initializingThe command is preparing to begin scanning the heap. This phase is expected to be very brief.
acquiring sample rowsThe command is currently scanning the table given by relid to obtain sample rows.
acquiring inherited sample rowsThe command is currently scanning child tables to obtain sample rows. Columns child_tables_total, child_tables_done, and current_child_table_relid contain the progress information for this phase.
computing statisticsThe command is computing statistics from the sample rows obtained during the table scan.
computing extended statisticsThe command is computing extended statistics from the sample rows obtained during the table scan.
finalizing analyzeThe command is updating pg_class. When this phase is completed, ANALYZE ends.

CLUSTER and VACUUM FULL Progress Reporting

The gp_stat_progress_cluster system view reports the progress of running CLUSTER, clusterdb, and VACUUM FULL (on a heap table) operations. (VACUUM FULL on a heap table is similar to CLUSTER in that WarehousePG performs a re-write of the table.) The view displays a row per segment instance that is currently servicing any of the mentioned commands.

For each active cluster or vacuum full operation, the gp_stat_progress_cluster_summary view aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_cluster.

Heap Table Cluster Phases

The table below describes how to interpret the heap table phase-specific information reported in the views:

PhaseDescription
initializingThe command is preparing to begin scanning the heap. This phase is expected to be very brief.
seq scanning heapThe command is currently scanning the table using a sequential scan.
index scanning heapCLUSTER is currently scanning the table using an index scan.
sorting tuplesCLUSTER is currently sorting tuples.
writing new heapCLUSTER is currently writing the new heap.
swapping relation filesThe command is currently swapping newly-built files into place.
rebuilding indexThe command is currently rebuilding an index.
performing final cleanupThe command is performing final cleanup. When this phase is completed, CLUSTER or VACUUM FULL ends.

AO/CO Table Cluster Phases

The table below describes how to interpret the AO/CO table phase-specific information reported in the views:

PhaseDescription
initializingThe command is preparing to begin scanning. This phase is expected to be very brief.
seq scanning append-optimizedThe command is currently scanning the AO/CO table using a sequential scan.
index scanning heapCLUSTER is currently scanning the table using an index scan.
sorting tuplesCLUSTER is currently sorting tuples.
writing new append-optimizedCLUSTER is currently writing the new AO/CO table.
swapping relation filesThe command is currently swapping newly-built files into place.
performing final cleanupThe command is performing final cleanup. When this phase is completed, CLUSTER or VACUUM FULL ends.

COPY Progress Reporting

The gp_stat_progress_copy system view reports the progress of running COPY operations. The view displays a row per segment instance that is currently servicing a copy operation.

For each active copy operation, the gp_stat_progress_copy_summary aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_copy.

You can also use these views to monitor the data movement progress of utilities that use COPY under the hood, such as gpbackup/gprestore.

WarehousePG calculates the bytes_processed, bytes_total, tuples_processed, and tuples_excluded column values differently for the gp_stat_progress_copy_summary view depending on the type of table and type of COPY operation. The table below identifies the table types, the types of COPY operations, and which of sum() or average() WarehousePG uses to calculate the final value:

Table typeCOPY TOCOPY FROMCOPY TO/FROM ON SEGMENT
Distributed tablesum()sum()sum()
Replicated tablesum()average()sum()

WarehousePG uses sum() for COPY ... ON SEGMENT as the command explicitly operates on each segment irrespective of the table type.

WarehousePG uses sum() for COPY TO with replicated tables, as the actual copy originates from only a single segment.

CREATE INDEX Progress Reporting

The gp_stat_progress_create_index system view reports the progress of running CREATE INDEX and REINDEX operations. The view displays a row per segment instance that is currently servicing either command.

For each active index operation, the gp_stat_progress_create_index_summary view aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_create_index.

The table below describes how to interpret the phase-specific information reported in the views:

PhaseDescription
initializingCREATE INDEX or REINDEX is preparing to create the index. This phase is expected to be very brief.
waiting for writers before buildThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
building indexThe index is being built by the access method-specific code. In this phase, access methods that support progress reporting fill in their own progress data, and the subphase is indicated in this column. Typically, blocks_total and blocks_done will contain progress data, as well as potentially tuples_total and tuples_done.
waiting for writers before validationThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
index validation: scanning indexThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
index validation: sorting tuplesThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
index validation: scanning tableThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
waiting for old snapshotsThis phase is not applicable, WarehousePG does not support CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY.
waiting for readers before marking deadThis phase is not applicable, WarehousePG does not support REINDEX CONCURRENTLY.
waiting for readers before droppingThis phase is not applicable, WarehousePG does not support REINDEX CONCURRENTLY.

WarehousePG skips several phases because it does not support concurrent index creation or concurrent reindexing.

VACUUM Progress Reporting

The gp_stat_progress_vacuum system view reports the progress of running VACUUM and vacuumdb operations on AO/CO and heap tables, and VACUUM FULL operations on AO/CO tables. The view displays a row per segment instance that is currently servicing a vacuum operation.

For each active vacuum operation, the gp_stat_progress_vacuum_summary view aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_vacuum.

Regular VACUUM modifies a heap table in place. VACUUM FULL on a heap table rewrites the table as does a CLUSTER operation. For information about progress reporting for VACUUM FULL on a heap table, see CLUSTER and VACUUM FULL Progress Reporting.

In WarehousePG, an AO/CO table vacuum behaves differently than a heap table vacuum. Because WarehousePG stores the logical EOF for each segment file, it does not need to scan physical blocks after the logical EOF, so WarehousePG can truncate them. WarehousePG always rewrites data into new segment files to get rid of dead tuples for AO/CO tables, and performs the same operations for both VACUUM FULL and VACUUM. Due to these difference, WarehousePG reports vacuum progress on heap and AO/CO tables using different phases.

Heap Table Vacuum Phases

The table below describes how to interpret the heap table phase-specific information reported in the views:

Heap Table PhaseDescription
initializingVACUUM is preparing to begin scanning the heap. This phase is expected to be very brief.
scanning heapVACUUM is currently scanning the heap. It prunes and defragment seach page if required, and possibly performs freezing activity. Use the heap_blks_scanned column to monitor the progress of the scan.
vacuuming indexesVACUUM is currently vacuuming the indexes. If a table has any indexes, this will happen at least once per vacuum, after the heap has been completely scanned. It may happen multiple times per vacuum if maintenance_work_mem (or, in the case of autovacuum, autovacuum_work_mem if set) is insufficient to store the number of dead tuples found.
vacuuming heapVACUUM is currently vacuuming the heap. Vacuuming the heap is distinct from scanning the heap, and occurs after each instance of vacuuming indexes. If heap_blks_scanned is less than heap_blks_total, the system returns to scanning the heap after this phase is completed; otherwise, it begins cleaning up indexes after this phase is completed.
cleaning up indexesVACUUM is currently cleaning up indexes. This occurs after the heap has been completely scanned and all vacuuming of the indexes and the heap has been completed.
truncating heapVACUUM is currently truncating the heap so as to return empty pages at the end of the relation to the operating system. This occurs after cleaning up indexes.
performing final cleanupVACUUM is performing final cleanup. During this phase, VACUUM will vacuum the free space map, update statistics in pg_class, and report statistics to the statistics collector. When this phase is completed, the VACUUM operation ends.

AO/CO Table Vacuum Phases

The table below describes how to interpret the AO/CO table phase-specific information reported in the views:

AO/CO Table PhaseDescription
initializingVACUUM is preparing to begin scanning AO/CO tables. This phase is expected to be very brief.
vacuuming indexesFor AO/CO tables, the vacuuming indexes phase is a sub-phase that may occur during both append-optimized pre-cleanup and append-optimized post-cleanup phases if the relation has an index and there are invisible awaiting-drop segment files.
append-optimized pre-cleanupVACUUM is performing pre-cleanup of the AO/CO table; this includes recycling AWAITING_DROP segments that are no longer visible to anyone from previous VACUUM operations, and truncating all live segment files to their logcial EOFs. Use heap_blks_vacuumed and index_vacuum_count to monitor the progress.
append-optimized compactVACUUM is vacuuming (or "compacting") the AO/CO table by scanning the segment files and rewriting only the surviving tuples into new segment files. Use heap_blks_scanned and num_dead_tuples to monitor the progress.
append-optimized post-cleanupVACUUM is performing post-cleanup of the AO/CO table; this includes recycling old segments that are no longer visible after the "compact" phase, and truncating all live segment files to their logcial EOFs. Use heap_blks_vacuumed and index_vacuum_count to monitor the progress.

Base Backup Progress Reporting

The gp_stat_progress_basebackup system view reports the progress of running base backup operations, as is performed by gprecoverseg -F (full recovery). The view displays a row per segment instance that is currently servicing replication commands.

For each active base backup operation, the gp_stat_progress_basebackup_summary view aggregates across the WarehousePG cluster the metrics reported by gp_stat_progress_basebackup.

The table below describes how to interpret the phase-specific information reported in the views:

PhaseDescription
initializingThe WAL sender process is preparing to begin the backup. This phase is expected to be very brief.
waiting for checkpoint to finishThe WAL sender process is currently performing pg_backup_start to prepare to take a base backup, and waiting for the start-of-backup checkpoint to finish.
estimating backup sizeThe WAL sender process is currently estimating the total amount of database files that will be streamed as a base backup.
streaming database filesThe WAL sender process is currently streaming database files as a base backup.
waiting for wal archiving to finishThe WAL sender process is currently performing pg_backup_stop to finish the backup, and waiting for all the WAL files required for the base backup to be successfully archived. The backup ends when this phase is completed.
transferring wal filesThe WAL sender process is currently transferring all WAL logs generated during the backup. This phase may occur after waiting for wal archiving to finish phase. The backup ends when this phase is completed.

Example: Viewing Real-Time Command Progress

The following example commands display the real-time progress of all running VACUUM commands every 0.5 seconds:

host$ cat > viewer.sql << EOF
SELECT * FROM gp_stat_progress_vacuum ORDER BY gp_segment_id;
SELECT * FROM gp_stat_progress_vacuum_summary;
EOF

host$ watch -n 0.5 "psql -af viewer.sql"

Additional Considerations

Take note of these additional considerations related to progress reporting in WarehousePG:

  • Progress phases and their interpretation of attributes are slightly different for AO/AOCO tables and heap tables for the CREATE INDEX/REINDEX, VACUUM, and CLUSTER operations. Refer to the individual command topics for more information.
  • The progress views for an ANALYZE operation on a partitioned table display the progress of each individual leaf partition. WarehousePG does not currently provide a measure of the combined progress of all partitions.
  • WarehousePG does not support the CLUSTER operation on a partitioned table.