GBase 8c Performance Tuning: A Systematic Approach from Statistics and Execution Plans to Resource Pools

#gbase #database #数据库

GBase 8c, the China‑domestically developed multi‑model database from GBASE, supports row‑store, column‑store, and distributed deployment. When a query slows down, the cause often lies deeper than SQL syntax — outdated statistics, a shifted execution plan, or resource contention. This article walks through a layered tuning methodology: verify statistics, inspect the execution plan, align storage and distribution with workload, and finally manage sessions and resources.

1. A Layered Perspective on Tuning

Performance issues in a gbase database generally fall into three layers:

Model layer: Performance is unstable from the start, and scaling doesn't help. Check storage mode, distribution strategy, and index design.
Optimizer layer: The same SQL suddenly shows a different plan with volatile execution times. Check statistics, EXPLAIN output, and misplaced hints.
Resource layer: Everything slows down during peak hours, even if no single query is terrible. Check work_mem, shared_buffers, resource pools, and Cgroups.

2. Statistics: The Foundation of the Execution Plan

The optimizer relies on statistics collected by ANALYZE and stored in pg_class, pg_statistic, etc. Stale statistics lead to inaccurate row estimates and poor plan choices.

Always update statistics after bulk loads, deletes, archiving, partition switches, or when data distribution changes on hot columns.

-- Single table
ANALYZE sales_order;

-- Entire database
ANALYZE;

-- Specific columns
ANALYZE sales_order (customer_id, order_date);

-- Verify with EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT customer_id, SUM(pay_amount)
FROM sales_order
WHERE order_date >= date '2026-03-01'
GROUP BY customer_id;

For partitioned tables, ANALYZE updates both the parent and all child partitions — essential for accurate partition pruning.

3. Reading Execution Plans: Focus on Row Estimates and Operator Choice

Use EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING) to get detailed runtime information. Key indicators:

Row estimate vs. actual: Large discrepancies lead to poor JOIN or scan choices.
Scan type: A Seq Scan on a large, frequently filtered column suggests missing indexes or stale statistics.
Join type: Hash Join spilling to disk usually means work_mem is too low or the input set is too large. Nested Loop driven by a large result set often points to wrong row estimates.
Sort and aggregation: High cost on Sort/GroupAggregate may be reduced by slimming the column list or pre‑aggregating.
Buffer hit ratio: A low shared hit ratio suggests the buffer cache may be undersized.

Example:

EXPLAIN (ANALYZE, VERBOSE, COSTS, BUFFERS, TIMING)
SELECT o.customer_id, SUM(o.pay_amount)
FROM sales_order o
JOIN dim_customer c ON o.customer_id = c.customer_id
WHERE o.order_date >= date '2026-03-01'
  AND c.customer_level = 'VIP'
GROUP BY o.customer_id;

Common plan signals and actions:

Signal	Likely Cause	Action
Seq Scan on large table	Missing index or bad row estimate	Verify statistics first, then index
Hash Join with heavy spill	work_mem too small or large input	Reduce input, increase session memory
Nested Loop with large driver	Severely inaccurate row estimate	Fix statistics, then consider hint
Heavy Sort / GroupAggregate	Bloated column set	Slim SQL, pre‑aggregate

4. Hints: Emergency Intervention Only

Plan hints (/*+ ... */) such as Leading, HashJoin, NestLoop, IndexScan, SeqScan, and Rows allow you to override the optimizer. Use them only for short‑term fixes or when the optimizer consistently chooses the wrong plan despite accurate statistics and proper indexes.

SELECT /*+ Leading((c o)) HashJoin(c o) */
       o.customer_id, SUM(o.pay_amount)
FROM dim_customer c
JOIN sales_order o ON c.customer_id = o.customer_id
WHERE c.customer_level = 'VIP'
  AND o.order_date >= date '2026-03-01'
GROUP BY o.customer_id;

Always follow up a hint with model and parameter improvements; don't let it become a permanent crutch.

5. Key Parameters and Slow Query Tracking

work_mem: Controls memory for sorts and hash joins. Set it per session based on concurrency — too high risks memory exhaustion.
shared_buffers: Database shared buffer size, critical for read‑heavy workloads.
Statement tracking: Configure track_stmt_stat_level (full/slow), log_min_duration_statement (threshold), and enable_stmt_track. Retrieve slow queries with:

SELECT *
FROM dbe_perf.get_global_slow_sql_by_timestamp(
  '2026-03-24 09:00:00',
  '2026-03-24 09:10:00'
);

6. Resource Management with Cgroups and Resource Pools

GBase 8c's resource management is built on Linux Cgroups, configured via gs_cgroup. Resource pools isolate CPU, memory, and I/O for different workloads — online transactions, reports, ETL — preventing a single heavy query from starving the entire cluster.

7. Choosing the Right Storage and Distribution

Row store (orientation=row): Best for frequent point queries, updates, and short transactions.
Column store (orientation=column): Ideal for analytical scans and aggregations.
Replicated tables (DISTRIBUTE BY replication): Small dimension tables that are joined frequently — eliminates cross‑node data movement.
Hash distribution (DISTRIBUTE BY hash): Large fact tables, distributed on the most common JOIN key or high‑frequency access column.

-- Transaction detail: row store, hash distributed by order_id
CREATE TABLE txn_order (
    order_id      bigint,
    customer_id   bigint,
    order_time    timestamp,
    order_status  varchar(20),
    pay_amount    numeric(18,2)
) WITH (orientation=row)
DISTRIBUTE BY hash(order_id);

-- Analytical summary: column store, hash distributed by customer_id
CREATE TABLE rpt_order_day (
    stat_date      date,
    customer_id    bigint,
    city_id        int,
    order_cnt      bigint,
    pay_amount_sum numeric(18,2)
) WITH (orientation=column)
DISTRIBUTE BY hash(customer_id);

-- Small dimension: replicated
CREATE TABLE dim_city (
    city_id    int,
    city_name  varchar(64),
    region_id  int
) DISTRIBUTE BY replication;

8. A Systematic Tuning Workflow

Confirm the problem is reproducible and capture the business time window.
Verify statement tracking settings and collect slow queries.
Analyze the execution plan with EXPLAIN ANALYZE — focus on row estimates and operator choices.
Update statistics to give the optimizer accurate data.
Tune SQL, add indexes, or apply hints as a short‑term measure.
For peak‑time issues, examine resource pools, Cgroups, memory, and buffer cache as a whole.

Building a reliable gbase database performance baseline means keeping statistics fresh, understanding how the optimizer thinks, aligning storage models with actual workloads, and establishing clear resource boundaries. This layered approach prevents the common cycle of reactive, single‑query patches and delivers consistent performance at scale.