<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>14.3. Controlling the Planner with Explicit JOIN Clauses</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="planner-stats.html" title="14.2. Statistics Used by the Planner" /><link rel="next" href="populate.html" title="14.4. Populating a Database" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">14.3. Controlling the Planner with Explicit <code xmlns="http://www.w3.org/1999/xhtml" class="literal">JOIN</code> Clauses</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="planner-stats.html" title="14.2. Statistics Used by the Planner">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="performance-tips.html" title="Chapter 14. Performance Tips">Up</a></td><th width="60%" align="center">Chapter 14. Performance Tips</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="populate.html" title="14.4. Populating a Database">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="EXPLICIT-JOINS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">14.3. Controlling the Planner with Explicit <code class="literal">JOIN</code> Clauses</h2></div></div></div><a id="id-1.5.13.6.2" class="indexterm"></a><p>
It is possible
to control the query planner to some extent by using the explicit <code class="literal">JOIN</code>
syntax. To see why this matters, we first need some background.
</p><p>
In a simple join query, such as:
</p><pre class="programlisting">
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
</pre><p>
the planner is free to join the given tables in any order. For
example, it could generate a query plan that joins A to B, using
the <code class="literal">WHERE</code> condition <code class="literal">a.id = b.id</code>, and then
joins C to this joined table, using the other <code class="literal">WHERE</code>
condition. Or it could join B to C and then join A to that result.
Or it could join A to C and then join them with B, but that
would be inefficient, since the full Cartesian product of A and C
would have to be formed, there being no applicable condition in the
<code class="literal">WHERE</code> clause to allow optimization of the join. (All
joins in the <span class="productname">PostgreSQL</span> executor happen
between two input tables, so it's necessary to build up the result
in one or another of these fashions.) The important point is that
these different join possibilities give semantically equivalent
results but might have hugely different execution costs. Therefore,
the planner will explore all of them to try to find the most
efficient query plan.
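</p><p>
As a minimal sketch of how to see which join order the planner picks, the
following creates small placeholder tables and asks for the plan. The column
definitions are assumptions, chosen only to match the column names used in
this section's examples; the plan actually chosen depends on table sizes and
statistics.
</p><pre class="programlisting">
-- Hypothetical table definitions matching the columns used in the examples.
CREATE TABLE a (id integer, bid integer, cid integer);
CREATE TABLE b (id integer, ref integer);
CREATE TABLE c (id integer);

-- Show the chosen plan, including the join order, without running the query.
EXPLAIN SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
</pre><p>
In the output, the nesting of the join nodes shows which pair of tables is
joined first: the innermost join node is the one formed first.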
</p><p>
When a query only involves two or three tables, there aren't many join
orders to worry about. But the number of possible join orders grows
exponentially as the number of tables expands. Beyond ten or so input
tables it's no longer practical to do an exhaustive search of all the
possibilities, and even for six or seven tables planning might take an
annoyingly long time. When there are too many input tables, the
<span class="productname">PostgreSQL</span> planner will switch from exhaustive
search to a <em class="firstterm">genetic</em> probabilistic search
through a limited number of possibilities. (The switch-over threshold is
set by the <a class="xref" href="runtime-config-query.html#GUC-GEQO-THRESHOLD">geqo_threshold</a> run-time
parameter.)
The genetic search takes less time, but it won't
necessarily find the best possible plan.
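</p><p>
To check where this switch-over happens in a given session, the relevant
settings can be inspected and changed with ordinary <code class="literal">SHOW</code>
and <code class="literal">SET</code> commands. A brief sketch; the value 14 is an
arbitrary illustration, not a recommendation:
</p><pre class="programlisting">
SHOW geqo;            -- whether the genetic optimizer is enabled at all
SHOW geqo_threshold;  -- number of FROM items at which it takes over

-- Keep exhaustive search for somewhat larger joins, in this session only.
SET geqo_threshold = 14;

-- Return to the configured value.
RESET geqo_threshold;
</pre><p>
Raising the threshold trades longer planning time for a more thorough search.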
</p><p>
When the query involves outer joins, the planner has less freedom
than it does for plain (inner) joins. For example, consider:
</p><pre class="programlisting">
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</pre><p>
Although this query's restrictions are superficially similar to the
previous example, the semantics are different because a row must be
emitted for each row of A that has no matching row in the join of B and C.
Therefore the planner has no choice of join order here: it must join
B to C and then join A to that result. Accordingly, this query takes
less time to plan than the previous query. In other cases, the planner
might be able to determine that more than one join order is safe.
For example, given:
</p><pre class="programlisting">
SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
</pre><p>
it is valid to join A to either B or C first. Currently, only
<code class="literal">FULL JOIN</code> completely constrains the join order. Most
practical cases involving <code class="literal">LEFT JOIN</code> or <code class="literal">RIGHT JOIN</code>
can be rearranged to some extent.
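</p><p>
To illustrate the kind of rearrangement meant here, the last query above can
be written with its two outer joins swapped. This is a sketch assuming the
same tables; it returns the same rows, although with
<code class="literal">SELECT *</code> the output columns appear in a different order:
</p><pre class="programlisting">
-- Same result set as joining B first: each ON condition refers only to A
-- and to the table being joined.
SELECT * FROM a LEFT JOIN c ON (a.cid = c.id) LEFT JOIN b ON (a.bid = b.id);
</pre><p>
The swap is safe because neither join condition depends on the output of the
other join.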
</p><p>
Explicit inner join syntax (<code class="literal">INNER JOIN</code>, <code class="literal">CROSS
JOIN</code>, or unadorned <code class="literal">JOIN</code>) is semantically the same as
listing the input relations in <code class="literal">FROM</code>, so it does not
constrain the join order.
</p><p>
Even though most kinds of <code class="literal">JOIN</code> don't completely constrain
the join order, it is possible to instruct the
<span class="productname">PostgreSQL</span> query planner to treat all
<code class="literal">JOIN</code> clauses as constraining the join order anyway.
For example, these three queries are logically equivalent:
</p><pre class="programlisting">
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</pre><p>
But if we tell the planner to honor the <code class="literal">JOIN</code> order,
the second and third take less time to plan than the first. This effect
is not worth worrying about for only three tables, but it can be a
lifesaver with many tables.
</p><p>
To force the planner to follow the join order laid out by explicit
<code class="literal">JOIN</code>s,
set the <a class="xref" href="runtime-config-query.html#GUC-JOIN-COLLAPSE-LIMIT">join_collapse_limit</a> run-time parameter to 1.
(Other possible values are discussed below.)
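</p><p>
A minimal sketch of applying this setting to a single transaction, assuming
tables a, b and c with the columns used in the examples above.
<code class="literal">SET LOCAL</code> limits the change to the current transaction:
</p><pre class="programlisting">
BEGIN;
SET LOCAL join_collapse_limit = 1;

-- With the limit at 1, the plan should follow the order as written:
-- B is joined to C, and A is then joined to that result.
EXPLAIN SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
COMMIT;
</pre><p>
A plain <code class="literal">SET</code> would instead keep the setting for the rest
of the session.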
</p><p>
You do not need to constrain the join order completely in order to
cut search time, because it's OK to use <code class="literal">JOIN</code> operators
within items of a plain <code class="literal">FROM</code> list. For example, consider:
</p><pre class="programlisting">
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</pre><p>
With <code class="varname">join_collapse_limit</code> = 1, this
forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
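</p><p>
One rough way to arrive at that factor, under the simplifying assumption that
a join order is just a sequence in which the inputs are added one at a time:
five separate inputs give 5! = 120 sequences, while treating A and B as a
single pre-joined input leaves four inputs and 4! = 24 sequences. This is an
illustrative count only, not a description of how the planner enumerates plans.
</p><pre class="programlisting">
-- Illustrative arithmetic only; the column aliases are made up for this sketch.
SELECT factorial(5) AS unconstrained_orders,
       factorial(4) AS orders_with_a_b_fixed,
       factorial(5) / factorial(4) AS reduction_factor;
</pre><p>
The ratio of the two counts is 5, which matches the figure quoted above.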
</p><p>
Constraining the planner's search in this way is a useful technique
both for reducing planning time and for directing the planner to a
good query plan. If the planner chooses a bad join order by default,
you can force it to choose a better order via <code class="literal">JOIN</code> syntax,
assuming that you know of a better order, that is. Experimentation
is recommended.
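</p><p>
One way to experiment is to compare the plan and the reported planning time
with and without the constraint. A sketch, assuming tables a, b and c with the
columns used in the examples above; the <code class="literal">SUMMARY</code> option
of <code class="literal">EXPLAIN</code> reports planning time without executing the query:
</p><pre class="programlisting">
-- Planner free to reorder the joins.
EXPLAIN (SUMMARY)
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

-- Planner held to the join order as written.
SET join_collapse_limit = 1;
EXPLAIN (SUMMARY)
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
RESET join_collapse_limit;
</pre><p>
To compare actual run times as well, <code class="literal">EXPLAIN (ANALYZE)</code> can
be used instead, keeping in mind that it executes the query.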
</p><p>
A closely related issue that affects planning time is collapsing of
subqueries into their parent query. For example, consider:
</p><pre class="programlisting">
SELECT *
FROM x, y,
    (SELECT * FROM a, b, c WHERE something) AS ss
WHERE somethingelse;
</pre><p>
This situation might arise from use of a view that contains a join;
the view's <code class="literal">SELECT</code> rule will be inserted in place of the view
reference, yielding a query much like the above. Normally, the planner
will try to collapse the subquery into the parent, yielding:
</p><pre class="programlisting">
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</pre><p>
This usually results in a better plan than planning the subquery
separately. (For example, the outer <code class="literal">WHERE</code> conditions might be such that
joining X to A first eliminates many rows of A, thus avoiding the need to
form the full logical output of the subquery.) But at the same time,
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
difference. The planner tries to avoid getting stuck in huge join search
problems by not collapsing a subquery if more than <code class="varname">from_collapse_limit</code>
<code class="literal">FROM</code> items would result in the parent
query. You can trade off planning time against quality of plan by
adjusting this run-time parameter up or down.
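</p><p>
As a sketch of this limit in action, the following lowers
<code class="varname">from_collapse_limit</code> to 4, so that collapsing the subquery
(which would produce five <code class="literal">FROM</code> items) is not attempted.
Tables x and y, their <code class="literal">id</code> columns, and the concrete
<code class="literal">WHERE</code> conditions are hypothetical stand-ins for the
placeholders in the example above:
</p><pre class="programlisting">
SET from_collapse_limit = 4;

-- Collapsing would yield x, y, a, b, c: five items, more than the limit,
-- so the subquery is planned separately.
EXPLAIN
SELECT *
FROM x, y,
    (SELECT a.id AS a_id, c.id AS c_id
     FROM a, b, c
     WHERE a.id = b.id AND b.ref = c.id) AS ss
WHERE x.id = ss.a_id AND y.id = ss.c_id;

RESET from_collapse_limit;
</pre><p>
Running the same <code class="literal">EXPLAIN</code> after the
<code class="literal">RESET</code> shows the plan produced when collapsing is allowed
again.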
</p><p>
<a class="xref" href="runtime-config-query.html#GUC-FROM-COLLAPSE-LIMIT">from_collapse_limit</a> and <a class="xref" href="runtime-config-query.html#GUC-JOIN-COLLAPSE-LIMIT">join_collapse_limit</a>
are similarly named because they do almost the same thing: one controls
when the planner will <span class="quote">“<span class="quote">flatten out</span>”</span> subqueries, and the
other controls when it will flatten out explicit joins. Typically
you would either set <code class="varname">join_collapse_limit</code> equal to
<code class="varname">from_collapse_limit</code> (so that explicit joins and subqueries
act similarly) or set <code class="varname">join_collapse_limit</code> to 1 (if you want
to control join order with explicit joins). But you might set them
differently if you are trying to fine-tune the trade-off between planning
time and run time.
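</p><p>
The two typical configurations just described might look like this in a
session. This is only a sketch; the value 8 is used because it is the default
for both parameters:
</p><pre class="programlisting">
-- Option 1: explicit joins and subqueries are flattened under the same limit.
SET from_collapse_limit = 8;
SET join_collapse_limit = 8;

-- Option 2: explicit JOIN syntax dictates the join order.
SET join_collapse_limit = 1;
</pre><p>
The same parameters can also be set in <code class="literal">postgresql.conf</code> to
apply server-wide.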
</p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="planner-stats.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="performance-tips.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="populate.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">14.2. Statistics Used by the Planner </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> 14.4. Populating a Database</td></tr></table></div></body></html>