PoWA 3

June, 28 2016 - 5432... Meet us!

Authors

 

  • Ronan Dunklau

    • DBA @ Dalibo

    • Open-Source: Multicorn...

    • Some PostgreSQL contributions (IMPORT FOREIGN SCHEMA...)

  • Julien Rouhaud

    • DBA @ Dalibo

    • Open-Source: HypoPG, OPM...

    • Some PostgreSQL contributions

  • But also...

    • Marc Cousin

    • Thomas Reiss

PoWA ?

What is PoWA

[t]

2cm

  image

5cm

  image

  • Workload analysis tool

  • Suggests opitmizations

  • Live!

Application stack

Presentation

pg_stat_statements

Presentation

  • Official PostgreSQL contrib

  • Normalized queries

  • Cumulative counters (buffers, execution time...), by

    • user

    • database

    • query

pg_stat_statements

Useful indicators

  • Number of execution per normalized query

  • Average execution time

  • Temporary file creation

  • Blocks access from or outside PostgreSQL’s cache

pg_stat_statements

In action 1

image
image

pg_stat_statements

In action 2

image
image

pg_stat_kcache

Presentation

  • Collects system metrics, by normalized queries

    • Physical disk access

    • CPU usage

pg_stat_kcache

Meaning...

  • “real” hit-ratio (PostgreSQL cache Vs system cache)

  • Identify CPU bound queries

pg_stat_kcache

In action 1

image
image

pg_stat_kcache

In action 2

image
image

pg_qualstats

Presentation

  • Predicate analysis

    • WHERE clauses

    • JOIN clauses

  • Collects various metrics

    • Selectivity

    • Constants sampling (most executed, most filtering...)

    • Execution count

    • Evalutation type (Index clause or post-scan filtering)

pg_qualstats

In action 1

image
image

pg_qualstats

In action 2

image
image

pg_qualstats

In action 3

image
image

pg_qualstats

In action 4

image
image

powa-archivist

Presentation

  • Archive those data sources

  • Configurable (retention, frequency...)

  • Extensible to other datasources

powa-archivist

What to get

  • Where / when are the bottlenecks

  • For what reason

  • How to fix

  • Live!

Compatibility

 

  • PostgreSQL 9.4 et later

  • PoWA 1 compatible with 9.3, but much more limited

powa-web

Presentation

  • Web interface for PoWA

  • Manage one or more PoWA instance

  • Drill-down analysis

powa-web

Usage example

  • problem: bad performance on parts of an application

  • Select an analysis period

  • Identify the database

powa-web

cluster view - 1

image
image

powa-web

cluster view - 2

image
image

powa-web

Database view

  • Problematic database has been identified...

  • let’s drill down to the query level!

powa-web

Database view - 1

image
image

powa-web

Database view - 2

image
image

powa-web

Database view - 3

image
image

powa-web

Query view

  • 2 problematic queries

  • Drill down on each of them

[fragile]

powa-web

First query - SQL

[mathescape, numbersep=5pt, gobble=2, frame=lines, framesep=2mm]sql SELECT com.id, sum(cl.pric) AS totalprice FROM command com JOIN commandline cl ON com.id = cl.idcommand JOIN client cli ON cli.id = com.idclient WHERE cli.id = ? GROUP BY com.id

powa-web

First query - cache

image
image

powa-web

First query - CPU

image
image

powa-web

First query - predicates

image
image

powa-web

First query - index

image
image

[fragile]powa-web

Second query - SQL

[mathescape, numbersep=5pt, gobble=2, frame=lines, framesep=2mm]sql SELECT id, dt FROM command WHERE state = ?

powa-web

Second query - EXPLAIN

image
image

powa-web

Second query - distribution

image
image

powa-web

Video

powa-web

What’s new in version 3

HypoPG

Presentation

  • Allow for hypothetical indexes creation

  • Instant creation, no impact on resources and no lock

  • Only used in EXPLAIN statements

[fragile]HypoPG

Example

[mathescape, numbersep=5pt, gobble=2, frame=lines, framesep=2mm]sql rjuju=# EXPLAIN SELECT * FROM t1 WHERE id = 3 ; QUERY PLAN ————————————————— Seq Scan on t1 (cost=0.00..1693.00 rows=1 width=4) Filter: (id = 3) (2 rows)

[fragile]HypoPG

Example

[mathescape, numbersep=5pt, gobble=2, frame=lines, framesep=2mm]sql # SELECT hypopgcreateindex(’CREATE INDEX ON t1(id)’) ; hypopgcreateindex ————————– (77523,<77523>btreet1id) (1 row)

rjuju=# EXPLAIN SELECT * FROM t1 WHERE id = 3 ; QUERY PLAN ————————————————————————– Index Only Scan using <77523>btreet1id on t1 (0.04..8.06 rows=1 width=4) Index Cond: (id = 3) (2 rows)

HypoPG

What is it useful for

  • Will PostgreSQL use such an index

  • What size can I expect it to be

  • How useful can it be

HypoPG

In action

image
image

Global optimization

Presentation

  • Find the optimal set of index to add

    • Helping every queries

    • Minimum set of indexes

    • Privileging multi-column indexes

Global optimization

Algorithm - 1

  • Fetch the predicates that need optimization (pg_qualstats)

    • Predicates filtering more than X lines out

    • Predicates filtering more than X% of lines out

    • Predicates used as part of a Seq Scan

Global optimization

Algorithm - 2

  • Group predicates by supported access methods

    • Hint: Think about btree_gist and btree_gin
  • Build a list of predicates “contained” by each predicates

    • WHERE id = ? AND label = ?

    • WHERE id = ?

    • WHERE label = ?

  • For each node, attribute a “score” to it (currently, number of columns)

Global optimization

Algorithm - 3

  • For each node, compute a path containing all included node

  • Score it (sum of individual nodes scores)

  • Starting with the highest scoring path, generate the index definition for it

  • Delete any other path made obsolete by this one

  • Loop until no path is left unoptimized

Global optimization

Validation

  • Goal: estimate if the indexes will be used, and how much improvement they bring

  • If HypoPG is available on the target database:

    • Create hypothetical index for each suggestion

    • EXPLAIN every query with and without the hypothetical indexes

    • Based on the difference cost, estimate the gain

    • If HypoPG is available on the target database:

      • By query

      • Globally

    • Estimate the size of the indexes

Global optimization

In action

image
image

Global optimization

In action

image
image

Global optimization

In action

image
image

Global optimization

In action

image
image

Global optimization

In action

image
image

Global optimization

In action

  • vidéo

What’s next

Future enhancements

  • Find correlations, and suggest them once correlated statistics are available

    • WHERE cityname = ? AND zipcode = ? (10 rows avg)

    • WHERE cityname = ? (10 rows avg)

    • WHERE zipcode = ? (10 rows avg)

    • It means that cityname and zipcode are probably correlated

  • Collect statistics on table to take DML workload into account

  • Suggest partial indexes based on most-often used values

Useful links

 

Questions ?