|
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>12.2. Tables and Indexes</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="textsearch-intro.html" title="12.1. Introduction" /><link rel="next" href="textsearch-controls.html" title="12.3. Controlling Text Search" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">12.2. Tables and Indexes</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="textsearch-intro.html" title="12.1. Introduction">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="textsearch.html" title="Chapter 12. Full Text Search">Up</a></td><th width="60%" align="center">Chapter 12. Full Text Search</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="textsearch-controls.html" title="12.3. Controlling Text Search">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="TEXTSEARCH-TABLES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">12.2. Tables and Indexes</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="textsearch-tables.html#TEXTSEARCH-TABLES-SEARCH">12.2.1. Searching a Table</a></span></dt><dt><span class="sect2"><a href="textsearch-tables.html#TEXTSEARCH-TABLES-INDEX">12.2.2. Creating Indexes</a></span></dt></dl></div><p>
- The examples in the previous section illustrated full text matching using
- simple constant strings. This section shows how to search table data,
- optionally using indexes.
- </p><div class="sect2" id="TEXTSEARCH-TABLES-SEARCH"><div class="titlepage"><div><div><h3 class="title">12.2.1. Searching a Table</h3></div></div></div><p>
- It is possible to do a full text search without an index. A simple query
- to print the <code class="structname">title</code> of each row that contains the word
- <code class="literal">friend</code> in its <code class="structfield">body</code> field is:
-
- </p><pre class="programlisting">
- SELECT title
- FROM pgweb
- WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
- </pre><p>
-
- This will also find related words such as <code class="literal">friends</code>
- and <code class="literal">friendly</code>, since all these are reduced to the same
- normalized lexeme.
- </p><p>
- The query above specifies that the <code class="literal">english</code> configuration
- is to be used to parse and normalize the strings. Alternatively we
- could omit the configuration parameters:
-
- </p><pre class="programlisting">
- SELECT title
- FROM pgweb
- WHERE to_tsvector(body) @@ to_tsquery('friend');
- </pre><p>
-
- This query will use the configuration set by <a class="xref" href="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG">default_text_search_config</a>.
- </p><p>
- A more complex example is to
- select the ten most recent documents that contain <code class="literal">create</code> and
- <code class="literal">table</code> in the <code class="structname">title</code> or <code class="structname">body</code>:
-
- </p><pre class="programlisting">
- SELECT title
- FROM pgweb
- WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
- ORDER BY last_mod_date DESC
- LIMIT 10;
- </pre><p>
-
- For clarity we omitted the <code class="function">coalesce</code> function calls
- which would be needed to find rows that contain <code class="literal">NULL</code>
- in one of the two fields.
- </p><p>
- Although these queries will work without an index, most applications
- will find this approach too slow, except perhaps for occasional ad-hoc
- searches. Practical use of text searching usually requires creating
- an index.
- </p></div><div class="sect2" id="TEXTSEARCH-TABLES-INDEX"><div class="titlepage"><div><div><h3 class="title">12.2.2. Creating Indexes</h3></div></div></div><p>
- We can create a <acronym class="acronym">GIN</acronym> index (<a class="xref" href="textsearch-indexes.html" title="12.9. GIN and GiST Index Types">Section 12.9</a>) to speed up text searches:
-
- </p><pre class="programlisting">
- CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', body));
- </pre><p>
-
- Notice that the 2-argument version of <code class="function">to_tsvector</code> is
- used. Only text search functions that specify a configuration name can
- be used in expression indexes (<a class="xref" href="indexes-expressional.html" title="11.7. Indexes on Expressions">Section 11.7</a>).
- This is because the index contents must be unaffected by <a class="xref" href="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG">default_text_search_config</a>. If they were affected, the
- index contents might be inconsistent because different entries could
- contain <code class="type">tsvector</code>s that were created with different text search
- configurations, and there would be no way to guess which was which. It
- would be impossible to dump and restore such an index correctly.
- </p><p>
- Because the two-argument version of <code class="function">to_tsvector</code> was
- used in the index above, only a query reference that uses the 2-argument
- version of <code class="function">to_tsvector</code> with the same configuration
- name will use that index. That is, <code class="literal">WHERE
- to_tsvector('english', body) @@ 'a & b'</code> can use the index,
- but <code class="literal">WHERE to_tsvector(body) @@ 'a & b'</code> cannot.
- This ensures that an index will be used only with the same configuration
- used to create the index entries.
- </p><p>
- It is possible to set up more complex expression indexes wherein the
- configuration name is specified by another column, e.g.:
-
- </p><pre class="programlisting">
- CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));
- </pre><p>
-
- where <code class="literal">config_name</code> is a column in the <code class="literal">pgweb</code>
- table. This allows mixed configurations in the same index while
- recording which configuration was used for each index entry. This
- would be useful, for example, if the document collection contained
- documents in different languages. Again,
- queries that are meant to use the index must be phrased to match, e.g.,
- <code class="literal">WHERE to_tsvector(config_name, body) @@ 'a & b'</code>.
- </p><p>
- Indexes can even concatenate columns:
-
- </p><pre class="programlisting">
- CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));
- </pre><p>
- </p><p>
- Another approach is to create a separate <code class="type">tsvector</code> column
- to hold the output of <code class="function">to_tsvector</code>. To keep this
- column automatically up to date with its source data, use a stored
- generated column. This example is a
- concatenation of <code class="literal">title</code> and <code class="literal">body</code>,
- using <code class="function">coalesce</code> to ensure that one field will still be
- indexed when the other is <code class="literal">NULL</code>:
-
- </p><pre class="programlisting">
- ALTER TABLE pgweb
- ADD COLUMN textsearchable_index_col tsvector
- GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;
- </pre><p>
-
- Then we create a <acronym class="acronym">GIN</acronym> index to speed up the search:
-
- </p><pre class="programlisting">
- CREATE INDEX textsearch_idx ON pgweb USING GIN (textsearchable_index_col);
- </pre><p>
-
- Now we are ready to perform a fast full text search:
-
- </p><pre class="programlisting">
- SELECT title
- FROM pgweb
- WHERE textsearchable_index_col @@ to_tsquery('create & table')
- ORDER BY last_mod_date DESC
- LIMIT 10;
- </pre><p>
- </p><p>
- One advantage of the separate-column approach over an expression index
- is that it is not necessary to explicitly specify the text search
- configuration in queries in order to make use of the index. As shown
- in the example above, the query can depend on
- <code class="varname">default_text_search_config</code>. Another advantage is that
- searches will be faster, since it will not be necessary to redo the
- <code class="function">to_tsvector</code> calls to verify index matches. (This is more
- important when using a GiST index than a GIN index; see <a class="xref" href="textsearch-indexes.html" title="12.9. GIN and GiST Index Types">Section 12.9</a>.) The expression-index approach is
- simpler to set up, however, and it requires less disk space since the
- <code class="type">tsvector</code> representation is not stored explicitly.
- </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="textsearch-intro.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="textsearch.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="textsearch-controls.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">12.1. Introduction </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> 12.3. Controlling Text Search</td></tr></table></div></body></html>
|