<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>F.43. unaccent</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="tsm-system-time.html" title="F.42. tsm_system_time" /><link rel="next" href="uuid-ossp.html" title="F.44. uuid-ossp" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">F.43. unaccent</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="tsm-system-time.html" title="F.42. tsm_system_time">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="contrib.html" title="Appendix F. Additional Supplied Modules">Up</a></td><th width="60%" align="center">Appendix F. Additional Supplied Modules</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="uuid-ossp.html" title="F.44. uuid-ossp">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="UNACCENT"><div class="titlepage"><div><div><h2 class="title" style="clear: both">F.43. unaccent</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="unaccent.html#id-1.11.7.52.5">F.43.1. Configuration</a></span></dt><dt><span class="sect2"><a href="unaccent.html#id-1.11.7.52.6">F.43.2. Usage</a></span></dt><dt><span class="sect2"><a href="unaccent.html#id-1.11.7.52.7">F.43.3. Functions</a></span></dt></dl></div><a id="id-1.11.7.52.2" class="indexterm"></a><p>
<code class="filename">unaccent</code> is a text search dictionary that removes accents
(diacritic signs) from lexemes.
It's a filtering dictionary, which means its output is
always passed to the next dictionary (if any), unlike the normal
behavior of dictionaries. This allows accent-insensitive processing
for full text search.
</p><p>
The current implementation of <code class="filename">unaccent</code> cannot be used as a
normalizing dictionary for the <code class="filename">thesaurus</code> dictionary.
</p><div class="sect2" id="id-1.11.7.52.5"><div class="titlepage"><div><div><h3 class="title">F.43.1. Configuration</h3></div></div></div><p>
An <code class="literal">unaccent</code> dictionary accepts the following options:
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
<code class="literal">RULES</code> is the base name of the file containing the list of
translation rules. This file must be stored in
<code class="filename">$SHAREDIR/tsearch_data/</code> (where <code class="literal">$SHAREDIR</code> means
the <span class="productname">PostgreSQL</span> installation's shared-data directory).
Its name must end in <code class="literal">.rules</code> (which is not to be included in
the <code class="literal">RULES</code> parameter).
</p></li></ul></div><p>
The rules file has the following format:
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
Each line represents one translation rule, consisting of a character with
accent followed by a character without accent. The first is translated
into the second. For example,
</p><pre class="programlisting">
À A
Á A
Â A
Ã A
Ä A
Å A
Æ AE
</pre><p>
The two characters must be separated by whitespace, and any leading or
trailing whitespace on a line is ignored.
</p></li><li class="listitem"><p>
Alternatively, if only one character is given on a line, instances of
that character are deleted; this is useful in languages where accents
are represented by separate characters.
</p></li><li class="listitem"><p>
Actually, each <span class="quote">“<span class="quote">character</span>”</span> can be any string not containing
whitespace, so <code class="filename">unaccent</code> dictionaries could be used for
other sorts of substring substitutions besides diacritic removal.
</p></li><li class="listitem"><p>
As with other <span class="productname">PostgreSQL</span> text search configuration files,
the rules file must be stored in UTF-8 encoding. The data is
automatically translated into the current database's encoding when
loaded. Any lines containing untranslatable characters are silently
ignored, so that rules files can contain rules that are not applicable in
the current encoding.
</p></li></ul></div><p>
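As a rough illustration of the rule forms described above, a hypothetical
<code class="filename">my_rules.rules</code> file might contain the lines below. The file
name and the particular rules are assumptions chosen for illustration; they are not
taken from the distributed module.
</p><pre class="programlisting">
é e
ß ss
´
№ No
</pre><p>
Under these assumed rules, é would be translated to e, ß would expand to ss, each
spacing acute accent ´ would be deleted because it stands alone on its line, and
the string № would be replaced by No.
</p><p>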
A more complete example, which is directly useful for most European
languages, can be found in <code class="filename">unaccent.rules</code>, which is installed
in <code class="filename">$SHAREDIR/tsearch_data/</code> when the <code class="filename">unaccent</code>
module is installed. This rules file translates characters with accents
to the same characters without accents, and it also expands ligatures
into the equivalent series of simple characters (for example, Æ to
AE).
</p></div><div class="sect2" id="id-1.11.7.52.6"><div class="titlepage"><div><div><h3 class="title">F.43.2. Usage</h3></div></div></div><p>
Installing the <code class="literal">unaccent</code> extension creates a text
search template <code class="literal">unaccent</code> and a dictionary <code class="literal">unaccent</code>
based on it. The <code class="literal">unaccent</code> dictionary has the default
parameter setting <code class="literal">RULES='unaccent'</code>, which makes it immediately
usable with the standard <code class="filename">unaccent.rules</code> file.
If you wish, you can alter the parameter, for example
</p><pre class="programlisting">
mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
</pre><p>
or create new dictionaries based on the template.
</p><p>
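As a minimal sketch of the second option, the following statement creates a separate
dictionary from the <code class="literal">unaccent</code> template instead of altering the
default one. The dictionary name <code class="literal">my_unaccent</code> is hypothetical, and
the example assumes that a file <code class="filename">my_rules.rules</code> exists in
<code class="filename">$SHAREDIR/tsearch_data/</code>:
</p><pre class="programlisting">
-- Assumption: my_rules.rules exists in $SHAREDIR/tsearch_data/;
-- the dictionary name my_unaccent is illustrative only.
CREATE TEXT SEARCH DICTIONARY my_unaccent (
    TEMPLATE = unaccent,
    RULES = 'my_rules'
);
</pre><p>
A dictionary created this way can then be named wherever
<code class="literal">unaccent</code> would be, which leaves the default dictionary and its
standard rules untouched.
</p><p>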
To test the dictionary, you can try:
</p><pre class="programlisting">
mydb=# select ts_lexize('unaccent','Hôtel');
ts_lexize
-----------
{Hotel}
(1 row)
</pre><p>
</p><p>
Here is an example showing how to insert the
<code class="filename">unaccent</code> dictionary into a text search configuration:
</p><pre class="programlisting">
mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
mydb=# ALTER TEXT SEARCH CONFIGURATION fr
        ALTER MAPPING FOR hword, hword_part, word
        WITH unaccent, french_stem;
mydb=# select to_tsvector('fr','Hôtels de la Mer');
to_tsvector
-------------------
'hotel':1 'mer':4
(1 row)
mydb=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
?column?
----------
t
(1 row)
mydb=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
ts_headline
------------------------
&lt;b&gt;Hôtel&lt;/b&gt; de la Mer
(1 row)
</pre><p>
</p></div><div class="sect2" id="id-1.11.7.52.7"><div class="titlepage"><div><div><h3 class="title">F.43.3. Functions</h3></div></div></div><p>
The <code class="function">unaccent()</code> function removes accents (diacritic signs) from
a given string. Basically, it's a wrapper around
<code class="filename">unaccent</code>-type dictionaries, but it can be used outside normal
text search contexts.
</p><a id="id-1.11.7.52.7.3" class="indexterm"></a><pre class="synopsis">
unaccent([<span class="optional"><em class="replaceable"><code>dictionary</code></em> <code class="type">regdictionary</code>, </span>] <em class="replaceable"><code>string</code></em> <code class="type">text</code>) returns <code class="type">text</code>
</pre><p>
If the <em class="replaceable"><code>dictionary</code></em> argument is
omitted, the text search dictionary named <code class="literal">unaccent</code> and
appearing in the same schema as the <code class="function">unaccent()</code>
function itself is used.
</p><p>
For example:
</p><pre class="programlisting">
SELECT unaccent('unaccent', 'Hôtel');
SELECT unaccent('Hôtel');
</pre><p>
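Because <code class="function">unaccent()</code> operates on plain <code class="type">text</code>, it
can also be applied to both sides of an ordinary comparison. The following is a minimal
sketch of an accent-insensitive match. The inline <code class="literal">places</code> row set
and its values are made up for illustration, and the standard
<code class="filename">unaccent.rules</code> file is assumed:
</p><pre class="programlisting">
-- places(name) is a hypothetical inline data set, not part of the module
SELECT name
FROM (VALUES ('Hôtel de la Mer'), ('Motel')) AS places(name)
WHERE unaccent(name) ILIKE unaccent('%hôtel%');
</pre><p>
With the standard rules, only the first row is expected to match, since accents are
removed from both sides before <code class="literal">ILIKE</code> compares them.
</p><p>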
</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="tsm-system-time.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="contrib.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="uuid-ossp.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">F.42. tsm_system_time </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> F.44. uuid-ossp</td></tr></table></div></body></html>