|
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>61.6. Index Cost Estimation Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="index-unique-checks.html" title="61.5. Index Uniqueness Checks" /><link rel="next" href="generic-wal.html" title="Chapter 62. Generic WAL Records" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">61.6. Index Cost Estimation Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="index-unique-checks.html" title="61.5. Index Uniqueness Checks">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="indexam.html" title="Chapter 61. Index Access Method Interface Definition">Up</a></td><th width="60%" align="center">Chapter 61. Index Access Method Interface Definition</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="generic-wal.html" title="Chapter 62. Generic WAL Records">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="INDEX-COST-ESTIMATION"><div class="titlepage"><div><div><h2 class="title" style="clear: both">61.6. Index Cost Estimation Functions</h2></div></div></div><p>
- The <code class="function">amcostestimate</code> function is given information describing
- a possible index scan, including lists of WHERE and ORDER BY clauses that
- have been determined to be usable with the index. It must return estimates
- of the cost of accessing the index and the selectivity of the WHERE
- clauses (that is, the fraction of parent-table rows that will be
- retrieved during the index scan). For simple cases, nearly all the
- work of the cost estimator can be done by calling standard routines
- in the optimizer; the point of having an <code class="function">amcostestimate</code> function is
- to allow index access methods to provide index-type-specific knowledge,
- in case it is possible to improve on the standard estimates.
- </p><p>
- Each <code class="function">amcostestimate</code> function must have the signature:
-
- </p><pre class="programlisting">
- void
- amcostestimate (PlannerInfo *root,
- IndexPath *path,
- double loop_count,
- Cost *indexStartupCost,
- Cost *indexTotalCost,
- Selectivity *indexSelectivity,
- double *indexCorrelation,
- double *indexPages);
- </pre><p>
-
- The first three parameters are inputs:
-
- </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><em class="parameter"><code>root</code></em></span></dt><dd><p>
- The planner's information about the query being processed.
- </p></dd><dt><span class="term"><em class="parameter"><code>path</code></em></span></dt><dd><p>
- The index access path being considered. All fields except cost and
- selectivity values are valid.
- </p></dd><dt><span class="term"><em class="parameter"><code>loop_count</code></em></span></dt><dd><p>
- The number of repetitions of the index scan that should be factored
- into the cost estimates. This will typically be greater than one when
- considering a parameterized scan for use in the inside of a nestloop
- join. Note that the cost estimates should still be for just one scan;
- a larger <em class="parameter"><code>loop_count</code></em> means that it may be appropriate
- to allow for some caching effects across multiple scans.
- </p></dd></dl></div><p>
- </p><p>
- The last five parameters are pass-by-reference outputs:
-
- </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><em class="parameter"><code>*indexStartupCost</code></em></span></dt><dd><p>
- Set to cost of index start-up processing
- </p></dd><dt><span class="term"><em class="parameter"><code>*indexTotalCost</code></em></span></dt><dd><p>
- Set to total cost of index processing
- </p></dd><dt><span class="term"><em class="parameter"><code>*indexSelectivity</code></em></span></dt><dd><p>
- Set to index selectivity
- </p></dd><dt><span class="term"><em class="parameter"><code>*indexCorrelation</code></em></span></dt><dd><p>
- Set to correlation coefficient between index scan order and
- underlying table's order
- </p></dd><dt><span class="term"><em class="parameter"><code>*indexPages</code></em></span></dt><dd><p>
- Set to number of index leaf pages
- </p></dd></dl></div><p>
- </p><p>
- Note that cost estimate functions must be written in C, not in SQL or
- any available procedural language, because they must access internal
- data structures of the planner/optimizer.
- </p><p>
- The index access costs should be computed using the parameters used by
- <code class="filename">src/backend/optimizer/path/costsize.c</code>: a sequential
- disk block fetch has cost <code class="varname">seq_page_cost</code>, a nonsequential fetch
- has cost <code class="varname">random_page_cost</code>, and the cost of processing one index
- row should usually be taken as <code class="varname">cpu_index_tuple_cost</code>. In
- addition, an appropriate multiple of <code class="varname">cpu_operator_cost</code> should
- be charged for any comparison operators invoked during index processing
- (especially evaluation of the indexquals themselves).
- </p><p>
- The access costs should include all disk and CPU costs associated with
- scanning the index itself, but <span class="emphasis"><em>not</em></span> the costs of retrieving or
- processing the parent-table rows that are identified by the index.
- </p><p>
- The <span class="quote">“<span class="quote">start-up cost</span>”</span> is the part of the total scan cost that
- must be expended before we can begin to fetch the first row. For most
- indexes this can be taken as zero, but an index type with a high start-up
- cost might want to set it nonzero.
- </p><p>
- The <em class="parameter"><code>indexSelectivity</code></em> should be set to the estimated fraction of the parent
- table rows that will be retrieved during the index scan. In the case
- of a lossy query, this will typically be higher than the fraction of
- rows that actually pass the given qual conditions.
- </p><p>
- The <em class="parameter"><code>indexCorrelation</code></em> should be set to the correlation (ranging between
- -1.0 and 1.0) between the index order and the table order. This is used
- to adjust the estimate for the cost of fetching rows from the parent
- table.
- </p><p>
- The <em class="parameter"><code>indexPages</code></em> should be set to the number of leaf pages.
- This is used to estimate the number of workers for parallel index scan.
- </p><p>
- When <em class="parameter"><code>loop_count</code></em> is greater than one, the returned numbers
- should be averages expected for any one scan of the index.
- </p><div class="procedure" id="id-1.10.14.12.13"><p class="title"><strong>Cost Estimation</strong></p><p>
- A typical cost estimator will proceed as follows:
- </p><ol class="procedure" type="1"><li class="step"><p>
- Estimate and return the fraction of parent-table rows that will be visited
- based on the given qual conditions. In the absence of any index-type-specific
- knowledge, use the standard optimizer function <code class="function">clauselist_selectivity()</code>:
-
- </p><pre class="programlisting">
- *indexSelectivity = clauselist_selectivity(root, path->indexquals,
- path->indexinfo->rel->relid,
- JOIN_INNER, NULL);
- </pre><p>
- </p></li><li class="step"><p>
- Estimate the number of index rows that will be visited during the
- scan. For many index types this is the same as <em class="parameter"><code>indexSelectivity</code></em> times
- the number of rows in the index, but it might be more. (Note that the
- index's size in pages and rows is available from the
- <code class="literal">path->indexinfo</code> struct.)
- </p></li><li class="step"><p>
- Estimate the number of index pages that will be retrieved during the scan.
- This might be just <em class="parameter"><code>indexSelectivity</code></em> times the index's size in pages.
- </p></li><li class="step"><p>
- Compute the index access cost. A generic estimator might do this:
-
- </p><pre class="programlisting">
- /*
- * Our generic assumption is that the index pages will be read
- * sequentially, so they cost seq_page_cost each, not random_page_cost.
- * Also, we charge for evaluation of the indexquals at each index row.
- * All the costs are assumed to be paid incrementally during the scan.
- */
- cost_qual_eval(&index_qual_cost, path->indexquals, root);
- *indexStartupCost = index_qual_cost.startup;
- *indexTotalCost = seq_page_cost * numIndexPages +
- (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;
- </pre><p>
-
- However, the above does not account for amortization of index reads
- across repeated index scans.
- </p></li><li class="step"><p>
- Estimate the index correlation. For a simple ordered index on a single
- field, this can be retrieved from pg_statistic. If the correlation
- is not known, the conservative estimate is zero (no correlation).
- </p></li></ol></div><p>
- Examples of cost estimator functions can be found in
- <code class="filename">src/backend/utils/adt/selfuncs.c</code>.
- </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="index-unique-checks.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="indexam.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="generic-wal.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">61.5. Index Uniqueness Checks </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Chapter 62. Generic WAL Records</td></tr></table></div></body></html>
|