|
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>64.3. Extensibility</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="gist-builtin-opclasses.html" title="64.2. Built-in Operator Classes" /><link rel="next" href="gist-implementation.html" title="64.4. Implementation" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">64.3. Extensibility</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="gist-builtin-opclasses.html" title="64.2. Built-in Operator Classes">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="gist.html" title="Chapter 64. GiST Indexes">Up</a></td><th width="60%" align="center">Chapter 64. GiST Indexes</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="gist-implementation.html" title="64.4. Implementation">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="GIST-EXTENSIBILITY"><div class="titlepage"><div><div><h2 class="title" style="clear: both">64.3. Extensibility</h2></div></div></div><p>
- Traditionally, implementing a new index access method meant a lot of
- difficult work. It was necessary to understand the inner workings of the
- database, such as the lock manager and Write-Ahead Log. The
- <acronym class="acronym">GiST</acronym> interface has a high level of abstraction,
- requiring the access method implementer only to implement the semantics of
- the data type being accessed. The <acronym class="acronym">GiST</acronym> layer itself
- takes care of concurrency, logging and searching the tree structure.
- </p><p>
- This extensibility should not be confused with the extensibility of the
- other standard search trees in terms of the data they can handle. For
- example, <span class="productname">PostgreSQL</span> supports extensible B-trees
- and hash indexes. That means that you can use
- <span class="productname">PostgreSQL</span> to build a B-tree or hash over any
- data type you want. But B-trees only support range predicates
- (<code class="literal"><</code>, <code class="literal">=</code>, <code class="literal">></code>),
- and hash indexes only support equality queries.
- </p><p>
- So if you index, say, an image collection with a
- <span class="productname">PostgreSQL</span> B-tree, you can only issue queries
- such as <span class="quote">“<span class="quote">is imagex equal to imagey</span>”</span>, <span class="quote">“<span class="quote">is imagex less
- than imagey</span>”</span> and <span class="quote">“<span class="quote">is imagex greater than imagey</span>”</span>.
- Depending on how you define <span class="quote">“<span class="quote">equals</span>”</span>, <span class="quote">“<span class="quote">less than</span>”</span>
- and <span class="quote">“<span class="quote">greater than</span>”</span> in this context, this could be useful.
- However, by using a <acronym class="acronym">GiST</acronym> based index, you could create
- ways to ask domain-specific questions, perhaps <span class="quote">“<span class="quote">find all images of
- horses</span>”</span> or <span class="quote">“<span class="quote">find all over-exposed images</span>”</span>.
- </p><p>
- All it takes to get a <acronym class="acronym">GiST</acronym> access method up and running
- is to implement several user-defined methods, which define the behavior of
- keys in the tree. These methods must be fairly sophisticated to
- support sophisticated queries, but for all the standard queries (B-trees,
- R-trees, etc.) they're relatively straightforward. In short,
- <acronym class="acronym">GiST</acronym> combines extensibility with generality, code
- reuse, and a clean interface.
- </p><p>
- There are five methods that an index operator class for
- <acronym class="acronym">GiST</acronym> must provide, and four that are optional.
- Correctness of the index is ensured
- by proper implementation of the <code class="function">same</code>, <code class="function">consistent</code>
- and <code class="function">union</code> methods, while efficiency (size and speed) of the
- index will depend on the <code class="function">penalty</code> and <code class="function">picksplit</code>
- methods.
- Two optional methods are <code class="function">compress</code> and
- <code class="function">decompress</code>, which allow an index to have internal tree data of
- a different type than the data it indexes. The leaves are to be of the
- indexed data type, while the other tree nodes can be of any C struct (but
- you still have to follow <span class="productname">PostgreSQL</span> data type rules here;
- see the <code class="literal">varlena</code> rules for variable-sized data). If the tree's
- internal data type exists at the SQL level, the <code class="literal">STORAGE</code> option
- of the <code class="command">CREATE OPERATOR CLASS</code> command can be used.
- The optional eighth method is <code class="function">distance</code>, which is needed
- if the operator class wishes to support ordered scans (nearest-neighbor
- searches). The optional ninth method <code class="function">fetch</code> is needed if the
- operator class wishes to support index-only scans, except when the
- <code class="function">compress</code> method is omitted.
- </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="function">consistent</code></span></dt><dd><p>
- Given an index entry <code class="literal">p</code> and a query value <code class="literal">q</code>,
- this function determines whether the index entry is
- <span class="quote">“<span class="quote">consistent</span>”</span> with the query; that is, could the predicate
- <span class="quote">“<span class="quote"><em class="replaceable"><code>indexed_column</code></em>
- <em class="replaceable"><code>indexable_operator</code></em> <code class="literal">q</code></span>”</span> be true for
- any row represented by the index entry? For a leaf index entry this is
- equivalent to testing the indexable condition, while for an internal
- tree node this determines whether it is necessary to scan the subtree
- of the index represented by the tree node. When the result is
- <code class="literal">true</code>, a <code class="literal">recheck</code> flag must also be returned.
- This indicates whether the predicate is certainly true or only possibly
- true. If <code class="literal">recheck</code> = <code class="literal">false</code> then the index has
- tested the predicate condition exactly, whereas if <code class="literal">recheck</code>
- = <code class="literal">true</code> the row is only a candidate match. In that case the
- system will automatically evaluate the
- <em class="replaceable"><code>indexable_operator</code></em> against the actual row value to see
- if it is really a match. This convention allows
- <acronym class="acronym">GiST</acronym> to support both lossless and lossy index
- structures.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_consistent(internal, data_type, smallint, oid, internal)
- RETURNS bool
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_consistent);
-
- Datum
- my_consistent(PG_FUNCTION_ARGS)
- {
- GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- data_type *query = PG_GETARG_DATA_TYPE_P(1);
- StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
- /* Oid subtype = PG_GETARG_OID(3); */
- bool *recheck = (bool *) PG_GETARG_POINTER(4);
- data_type *key = DatumGetDataType(entry->key);
- bool retval;
-
- /*
- * determine return value as a function of strategy, key and query.
- *
- * Use GIST_LEAF(entry) to know where you're called in the index tree,
- * which comes in handy when supporting the = operator for example (you
- * could check for non-empty union() in non-leaf nodes and equality in
- * leaf nodes).
- */
-
- *recheck = true; /* or false if check is exact */
-
- PG_RETURN_BOOL(retval);
- }
- </pre><p>
-
- Here, <code class="varname">key</code> is an element in the index and <code class="varname">query</code>
- is the value being looked up in the index. The <code class="literal">StrategyNumber</code>
- parameter indicates which operator of your operator class is being
- applied; it matches one of the operator numbers in the
- <code class="command">CREATE OPERATOR CLASS</code> command.
- </p><p>
- Depending on which operators you have included in the class, the data
- type of <code class="varname">query</code> could vary with the operator, since it will
- be whatever type is on the righthand side of the operator, which might
- be different from the indexed data type appearing on the lefthand side.
- (The above code skeleton assumes that only one type is possible; if
- not, fetching the <code class="varname">query</code> argument value would have to depend
- on the operator.) It is recommended that the SQL declaration of
- the <code class="function">consistent</code> function use the opclass's indexed data
- type for the <code class="varname">query</code> argument, even though the actual type
- might be something else depending on the operator.
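- </p><p>
- As a rough illustration of the leaf and internal cases described above,
- the following standalone C sketch handles an equality operator over
- closed integer intervals. It does not use the
- <span class="productname">PostgreSQL</span> headers: <code class="literal">Interval</code>,
- <code class="literal">sketch_consistent_eq</code> and the <code class="literal">is_leaf</code> flag
- (standing in for <code class="literal">GIST_LEAF(entry)</code>) are hypothetical names
- made up for this example, and leaf keys are assumed to be stored exactly.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/*
 * Sketch of "consistent" for an equality operator on intervals.
 *
 * For a leaf entry, key is a stored value and is compared exactly.
 * For an internal entry, key is the union of everything below it, so the
 * subtree can hold a match only if key covers the whole query.
 */
bool
sketch_consistent_eq(Interval key, Interval query, bool is_leaf, bool *recheck)
{
    assert(recheck != NULL);
    assert(key.lo <= key.hi && query.lo <= query.hi);

    /* Assumption: leaf keys are stored losslessly, so no recheck is needed. */
    *recheck = false;

    if (is_leaf)
        return key.lo == query.lo && key.hi == query.hi;

    return key.lo <= query.lo && query.hi <= key.hi;
}
```
- </pre><p>
- With a lossy storage format the leaf test could only report candidate
- matches, and the sketch would have to set <code class="literal">*recheck</code> to
- true instead.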
- </p></dd><dt><span class="term"><code class="function">union</code></span></dt><dd><p>
- This method consolidates information in the tree. Given a set of
- entries, this function generates a new index entry that represents
- all the given entries.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_union(internal, internal)
- RETURNS storage_type
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_union);
-
- Datum
- my_union(PG_FUNCTION_ARGS)
- {
- GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
- GISTENTRY *ent = entryvec->vector;
- data_type *out,
- *tmp;
- int numranges,
- i;
-
- numranges = entryvec->n;
- tmp = DatumGetDataType(ent[0].key);
-
- if (numranges == 1)
- {
- /* copy the single input: the result must be newly palloc'd memory */
- out = data_type_deep_copy(tmp);
-
- PG_RETURN_DATA_TYPE_P(out);
- }
-
- out = tmp;
- for (i = 1; i < numranges; i++)
- {
- tmp = DatumGetDataType(ent[i].key);
- /* my_union_implementation is expected to return a newly palloc'd value */
- out = my_union_implementation(out, tmp);
- }
-
- PG_RETURN_DATA_TYPE_P(out);
- }
- </pre><p>
- </p><p>
- As you can see, in this skeleton we're dealing with a data type
- where <code class="literal">union(X, Y, Z) = union(union(X, Y), Z)</code>. It's easy
- enough to support data types where this is not the case, by
- implementing the proper union algorithm in this
- <acronym class="acronym">GiST</acronym> support method.
- </p><p>
- The result of the <code class="function">union</code> function must be a value of the
- index's storage type, whatever that is (it might or might not be
- different from the indexed column's type). The <code class="function">union</code>
- function should return a pointer to newly <code class="function">palloc()</code>ed
- memory. You can't just return the input value as-is, even if there is
- no type change.
- </p><p>
- As shown above, the <code class="function">union</code> function's
- first <code class="type">internal</code> argument is actually
- a <code class="structname">GistEntryVector</code> pointer. The second argument is a
- pointer to an integer variable, which can be ignored. (It used to be
- required that the <code class="function">union</code> function store the size of its
- result value into that variable, but this is no longer necessary.)
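- </p><p>
- To make the requirements above concrete, here is a standalone C sketch of
- a union over closed integer intervals. It is only an illustration:
- <code class="literal">Interval</code> and <code class="literal">sketch_union</code> are
- hypothetical names, and <code class="function">malloc()</code> stands in for
- <code class="function">palloc()</code> so that the example compiles without the
- <span class="productname">PostgreSQL</span> headers.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical key type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/*
 * Return a newly allocated interval covering all n entries, or NULL if the
 * allocation fails. The result never aliases the input, even when n is 1.
 * The caller releases it with free().
 */
Interval *
sketch_union(const Interval *entries, int n)
{
    Interval *out;
    int       i;

    assert(entries != NULL && n >= 1);

    out = malloc(sizeof(Interval));
    if (out == NULL)
        return NULL;

    *out = entries[0];          /* copy rather than return the input */
    for (i = 1; i < n; i++)
    {
        if (entries[i].lo < out->lo)
            out->lo = entries[i].lo;
        if (entries[i].hi > out->hi)
            out->hi = entries[i].hi;
    }
    return out;
}
```
- </pre><p>
- Because widening an interval one entry at a time gives the same result in
- any order, this key type satisfies the
- <code class="literal">union(X, Y, Z) = union(union(X, Y), Z)</code> property assumed
- by the skeleton.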
- </p></dd><dt><span class="term"><code class="function">compress</code></span></dt><dd><p>
- Converts a data item into a format suitable for physical storage in
- an index page.
- If the <code class="function">compress</code> method is omitted, data items are stored
- in the index without modification.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_compress(internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_compress);
-
- Datum
- my_compress(PG_FUNCTION_ARGS)
- {
- GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- GISTENTRY *retval;
-
- if (entry->leafkey)
- {
- /* replace entry->key with a compressed version */
- compressed_data_type *compressed_data = palloc(sizeof(compressed_data_type));
-
- /* fill *compressed_data from entry->key ... */
-
- retval = palloc(sizeof(GISTENTRY));
- gistentryinit(*retval, PointerGetDatum(compressed_data),
- entry->rel, entry->page, entry->offset, false);
- }
- else
- {
- /* typically we needn't do anything with non-leaf entries */
- retval = entry;
- }
-
- PG_RETURN_POINTER(retval);
- }
- </pre><p>
- </p><p>
- You have to adapt <em class="replaceable"><code>compressed_data_type</code></em> to the specific
- type you're converting to in order to compress your leaf nodes, of
- course.
- </p></dd><dt><span class="term"><code class="function">decompress</code></span></dt><dd><p>
- Converts the stored representation of a data item into a format that
- can be manipulated by the other GiST methods in the operator class.
- If the <code class="function">decompress</code> method is omitted, it is assumed that
- the other GiST methods can work directly on the stored data format.
- (<code class="function">decompress</code> is not necessarily the reverse of
- the <code class="function">compress</code> method; in particular,
- if <code class="function">compress</code> is lossy then it's impossible
- for <code class="function">decompress</code> to exactly reconstruct the original
- data. <code class="function">decompress</code> is not necessarily equivalent
- to <code class="function">fetch</code>, either, since the other GiST methods might not
- require full reconstruction of the data.)
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_decompress(internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_decompress);
-
- Datum
- my_decompress(PG_FUNCTION_ARGS)
- {
- PG_RETURN_POINTER(PG_GETARG_POINTER(0));
- }
- </pre><p>
-
- The above skeleton is suitable for the case where no decompression
- is needed. (But, of course, omitting the method altogether is even
- easier, and is recommended in such cases.)
- </p></dd><dt><span class="term"><code class="function">penalty</code></span></dt><dd><p>
- Returns a value indicating the <span class="quote">“<span class="quote">cost</span>”</span> of inserting the new
- entry into a particular branch of the tree. Items will be inserted
- down the path of least <code class="function">penalty</code> in the tree.
- Values returned by <code class="function">penalty</code> should be non-negative.
- If a negative value is returned, it will be treated as zero.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_penalty(internal, internal, internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT; -- in some cases penalty functions need not be strict
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_penalty);
-
- Datum
- my_penalty(PG_FUNCTION_ARGS)
- {
- GISTENTRY *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
- GISTENTRY *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
- float *penalty = (float *) PG_GETARG_POINTER(2);
- data_type *orig = DatumGetDataType(origentry->key);
- data_type *new = DatumGetDataType(newentry->key);
-
- *penalty = my_penalty_implementation(orig, new);
- PG_RETURN_POINTER(penalty);
- }
- </pre><p>
-
- For historical reasons, the <code class="function">penalty</code> function doesn't
- just return a <code class="type">float</code> result; instead it has to store the value
- at the location indicated by the third argument. The return
- value per se is ignored, though it's conventional to pass back the
- address of that argument.
- </p><p>
- The <code class="function">penalty</code> function is crucial to good performance of
- the index. It'll get used at insertion time to determine which branch
- to follow when choosing where to add the new entry in the tree. At
- query time, the more balanced the index, the quicker the lookup.
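- </p><p>
- One common way to define the cost is the amount by which a branch's key
- would have to grow to cover the new entry. The standalone C sketch below
- shows that idea for closed integer intervals; it is an assumption made
- for illustration, not the rule used by any particular operator class, and
- <code class="literal">Interval</code>, <code class="literal">sketch_penalty</code> and
- <code class="literal">sketch_choose_branch</code> are hypothetical names.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/* Growth in length needed for orig to also cover added; 0 if it already does. */
float
sketch_penalty(Interval orig, Interval added)
{
    int lo = added.lo < orig.lo ? added.lo : orig.lo;
    int hi = added.hi > orig.hi ? added.hi : orig.hi;

    return (float) ((hi - lo) - (orig.hi - orig.lo));
}

/* Index of the branch with the least penalty; the first one wins ties. */
int
sketch_choose_branch(const Interval *branches, int n, Interval added)
{
    int best = 0;
    int i;

    assert(branches != NULL && n >= 1);

    for (i = 1; i < n; i++)
    {
        if (sketch_penalty(branches[i], added) <
            sketch_penalty(branches[best], added))
            best = i;
    }
    return best;
}
```
- </pre><p>
- The result of this sketch is never negative, because the enlarged interval
- always contains the original one.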
- </p></dd><dt><span class="term"><code class="function">picksplit</code></span></dt><dd><p>
- When an index page split is necessary, this function decides which
- entries on the page are to stay on the old page, and which are to move
- to the new page.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_picksplit(internal, internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_picksplit);
-
- Datum
- my_picksplit(PG_FUNCTION_ARGS)
- {
- GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
- GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
- OffsetNumber maxoff = entryvec->n - 1;
- GISTENTRY *ent = entryvec->vector;
- int i,
- nbytes;
- OffsetNumber *left,
- *right;
- data_type *tmp_union;
- data_type *unionL;
- data_type *unionR;
- GISTENTRY **raw_entryvec;
-
- nbytes = (maxoff + 1) * sizeof(OffsetNumber);
-
- v->spl_left = (OffsetNumber *) palloc(nbytes);
- left = v->spl_left;
- v->spl_nleft = 0;
-
- v->spl_right = (OffsetNumber *) palloc(nbytes);
- right = v->spl_right;
- v->spl_nright = 0;
-
- unionL = NULL;
- unionR = NULL;
-
- /* Initialize the raw entry vector; palloc'd memory is released with the memory context. */
- raw_entryvec = (GISTENTRY **) palloc(entryvec->n * sizeof(GISTENTRY *));
- for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
- raw_entryvec[i] = &(entryvec->vector[i]);
-
- for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
- {
- int real_index = raw_entryvec[i] - entryvec->vector;
-
- tmp_union = DatumGetDataType(entryvec->vector[real_index].key);
- Assert(tmp_union != NULL);
-
- /*
- * Choose where to put the index entries and update unionL and unionR
- * accordingly. Append the entries to either v->spl_left or
- * v->spl_right, and keep the counters up to date.
- *
- * my_choice_is_left, curl and curr are placeholders for your own
- * selection logic and its inputs.
- */
-
- if (my_choice_is_left(unionL, curl, unionR, curr))
- {
- if (unionL == NULL)
- unionL = tmp_union;
- else
- unionL = my_union_implementation(unionL, tmp_union);
-
- *left = real_index;
- ++left;
- ++(v->spl_nleft);
- }
- else
- {
- /*
- * Same on the right
- */
- }
- }
-
- v->spl_ldatum = DataTypeGetDatum(unionL);
- v->spl_rdatum = DataTypeGetDatum(unionR);
- PG_RETURN_POINTER(v);
- }
- </pre><p>
-
- Notice that the <code class="function">picksplit</code> function's result is delivered
- by modifying the passed-in <code class="structname">v</code> structure. The return
- value per se is ignored, though it's conventional to pass back the
- address of <code class="structname">v</code>.
- </p><p>
- Like <code class="function">penalty</code>, the <code class="function">picksplit</code> function
- is crucial to good performance of the index. Designing suitable
- <code class="function">penalty</code> and <code class="function">picksplit</code> implementations
- is where the challenge of implementing well-performing
- <acronym class="acronym">GiST</acronym> indexes lies.
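- </p><p>
- As a deliberately simple illustration of the bookkeeping involved, the
- standalone C sketch below splits closed integer intervals by comparing
- each entry's center with the center of the whole page, and falls back to
- halving by position when that does not separate the entries. The strategy
- and all names (<code class="literal">Interval</code>, <code class="literal">Split</code>,
- <code class="literal">sketch_picksplit</code> and its helpers) are assumptions made for
- this example; a production split would usually weigh overlap and balance
- more carefully.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/* Hypothetical split result, loosely mirroring the fields of GIST_SPLITVEC. */
typedef struct Split
{
    int      *left;         /* caller-provided, room for n entry indexes */
    int       nleft;
    Interval  left_union;
    int      *right;        /* caller-provided, room for n entry indexes */
    int       nright;
    Interval  right_union;
} Split;

/* Append entry number "index" to one side and widen that side's union. */
void
sketch_add(int *side, int *count, Interval *side_union, int index, Interval e)
{
    if (*count == 0)
        *side_union = e;
    else
    {
        if (e.lo < side_union->lo)
            side_union->lo = e.lo;
        if (e.hi > side_union->hi)
            side_union->hi = e.hi;
    }
    side[(*count)++] = index;
}

/* True if the center of e lies below the center of [lo, hi]. */
bool
sketch_is_low(Interval e, long long lo, long long hi)
{
    return (long long) e.lo + e.hi < lo + hi;
}

void
sketch_picksplit(const Interval *entries, int n, Split *s)
{
    long long lo, hi;
    int       i, nlow = 0;

    assert(entries != NULL && s != NULL && n >= 2);

    /* Bounds of the whole page. */
    lo = entries[0].lo;
    hi = entries[0].hi;
    for (i = 1; i < n; i++)
    {
        if (entries[i].lo < lo)
            lo = entries[i].lo;
        if (entries[i].hi > hi)
            hi = entries[i].hi;
    }

    for (i = 0; i < n; i++)
        if (sketch_is_low(entries[i], lo, hi))
            nlow++;

    s->nleft = 0;
    s->nright = 0;
    for (i = 0; i < n; i++)
    {
        /* If the centers do not separate the entries, halve by position. */
        bool go_left = (nlow > 0 && nlow < n)
            ? sketch_is_low(entries[i], lo, hi)
            : i < n / 2;

        if (go_left)
            sketch_add(s->left, &s->nleft, &s->left_union, i, entries[i]);
        else
            sketch_add(s->right, &s->nright, &s->right_union, i, entries[i]);
    }
}
```
- </pre><p>
- As in the skeleton, the result is delivered through the passed-in
- structure, and each side's union is maintained while entries are assigned.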
- </p></dd><dt><span class="term"><code class="function">same</code></span></dt><dd><p>
- Returns true if two index entries are identical, false otherwise.
- (An <span class="quote">“<span class="quote">index entry</span>”</span> is a value of the index's storage type,
- not necessarily the original indexed column's type.)
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_same(storage_type, storage_type, internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_same);
-
- Datum
- my_same(PG_FUNCTION_ARGS)
- {
- prefix_range *v1 = PG_GETARG_PREFIX_RANGE_P(0);
- prefix_range *v2 = PG_GETARG_PREFIX_RANGE_P(1);
- bool *result = (bool *) PG_GETARG_POINTER(2);
-
- *result = my_eq(v1, v2);
- PG_RETURN_POINTER(result);
- }
- </pre><p>
-
- For historical reasons, the <code class="function">same</code> function doesn't
- just return a Boolean result; instead it has to store the flag
- at the location indicated by the third argument. The return
- value per se is ignored, though it's conventional to pass back the
- address of that argument.
- </p></dd><dt><span class="term"><code class="function">distance</code></span></dt><dd><p>
- Given an index entry <code class="literal">p</code> and a query value <code class="literal">q</code>,
- this function determines the index entry's
- <span class="quote">“<span class="quote">distance</span>”</span> from the query value. This function must be
- supplied if the operator class contains any ordering operators.
- A query using the ordering operator will be implemented by returning
- index entries with the smallest <span class="quote">“<span class="quote">distance</span>”</span> values first,
- so the results must be consistent with the operator's semantics.
- For a leaf index entry the result just represents the distance to
- the index entry; for an internal tree node, the result must be the
- smallest distance that any child entry could have.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
- RETURNS float8
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- And the matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_distance);
-
- Datum
- my_distance(PG_FUNCTION_ARGS)
- {
- GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- data_type *query = PG_GETARG_DATA_TYPE_P(1);
- StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
- /* Oid subtype = PG_GETARG_OID(3); */
- /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
- data_type *key = DatumGetDataType(entry->key);
- double retval;
-
- /*
- * determine return value as a function of strategy, key and query.
- */
-
- PG_RETURN_FLOAT8(retval);
- }
- </pre><p>
-
- The arguments to the <code class="function">distance</code> function are identical to
- the arguments of the <code class="function">consistent</code> function.
- </p><p>
- Some approximation is allowed when determining the distance, so long
- as the result is never greater than the entry's actual distance. Thus,
- for example, distance to a bounding box is usually sufficient in
- geometric applications. For an internal tree node, the distance
- returned must not be greater than the distance to any of the child
- nodes. If the returned distance is not exact, the function must set
- <code class="literal">*recheck</code> to true. (This is not necessary for internal tree
- nodes; for them, the calculation is always assumed to be inexact.) In
- this case the executor will calculate the accurate distance after
- fetching the tuple from the heap, and reorder the tuples if necessary.
- </p><p>
- If the distance function returns <code class="literal">*recheck = true</code> for any
- leaf node, the original ordering operator's return type must
- be <code class="type">float8</code> or <code class="type">float4</code>, and the distance function's
- result values must be comparable to those of the original ordering
- operator, since the executor will sort using both distance function
- results and recalculated ordering-operator results. Otherwise, the
- distance function's result values can be any finite <code class="type">float8</code>
- values, so long as the relative order of the result values matches the
- order returned by the ordering operator. (Infinity and minus infinity
- are used internally to handle cases such as nulls, so it is not
- recommended that <code class="function">distance</code> functions return these values.)
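- </p><p>
- The lower-bound rule can be illustrated with a standalone C sketch that
- measures the distance from a query point to a closed integer interval.
- The names <code class="literal">Interval</code> and
- <code class="literal">sketch_distance</code> are hypothetical, and the one-dimensional
- setting is an assumption chosen to keep the example short.
- </p><pre class="programlisting">
```c
#include <assert.h>

/* Hypothetical key type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/*
 * Distance from a query point to the nearest point of key, or 0 if the
 * query lies inside key. When key is the union stored in an internal
 * entry, this is a lower bound on the distance to anything below it.
 */
double
sketch_distance(Interval key, int query)
{
    assert(key.lo <= key.hi);

    if (query < key.lo)
        return (double) key.lo - (double) query;
    if (query > key.hi)
        return (double) query - (double) key.hi;
    return 0.0;
}
```
- </pre><p>
- For instance, a query at 4 is 6 away from the bounding interval [10, 20]
- and 8 away from a child interval [12, 14] inside it, so the value computed
- for the internal entry does not exceed the child's distance.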
- </p></dd><dt><span class="term"><code class="function">fetch</code></span></dt><dd><p>
- Converts the compressed index representation of a data item into the
- original data type, for index-only scans. The returned data must be an
- exact, non-lossy copy of the originally indexed value.
- </p><p>
- The <acronym class="acronym">SQL</acronym> declaration of the function must look like this:
-
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION my_fetch(internal)
- RETURNS internal
- AS 'MODULE_PATHNAME'
- LANGUAGE C STRICT;
- </pre><p>
-
- The argument is a pointer to a <code class="structname">GISTENTRY</code> struct. On
- entry, its <code class="structfield">key</code> field contains a non-NULL leaf datum in
- compressed form. The return value is another <code class="structname">GISTENTRY</code>
- struct, whose <code class="structfield">key</code> field contains the same datum in its
- original, uncompressed form. If the opclass's compress function does
- nothing for leaf entries, the <code class="function">fetch</code> method can return the
- argument as-is. Or, if the opclass does not have a compress function,
- the <code class="function">fetch</code> method can be omitted as well, since it would
- necessarily be a no-op.
- </p><p>
- The matching code in the C module could then follow this skeleton:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(my_fetch);
-
- Datum
- my_fetch(PG_FUNCTION_ARGS)
- {
- GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
- input_data_type *in = DatumGetPointer(entry->key);
- fetched_data_type *fetched_data;
- GISTENTRY *retval;
-
- retval = palloc(sizeof(GISTENTRY));
- fetched_data = palloc(sizeof(fetched_data_type));
-
- /*
- * Convert 'in' into 'fetched_data', a value of the original data type.
- */
-
- /* fill *retval from fetched_data. */
- gistentryinit(*retval, PointerGetDatum(fetched_data),
- entry->rel, entry->page, entry->offset, false);
-
- PG_RETURN_POINTER(retval);
- }
- </pre><p>
- </p><p>
- If the compress method is lossy for leaf entries, the operator class
- cannot support index-only scans, and must not define
- a <code class="function">fetch</code> function.
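- </p><p>
- The difference between a lossless and a lossy leaf representation can be
- seen in the following standalone C sketch. All names in it
- (<code class="literal">Interval</code>, <code class="literal">sketch_compress</code>,
- <code class="literal">sketch_fetch</code>, <code class="literal">sketch_compress_lossy</code>)
- are hypothetical, and the bucket width of 10 is an arbitrary choice made
- for illustration.
- </p><pre class="programlisting">
```c
#include <assert.h>

/* Hypothetical storage type: a closed integer interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/* Lossless leaf "compress": store a value v as the interval [v, v]. */
Interval
sketch_compress(int value)
{
    Interval key = {value, value};

    return key;
}

/* "fetch": recover the original value from a losslessly stored leaf key. */
int
sketch_fetch(Interval leaf_key)
{
    assert(leaf_key.lo == leaf_key.hi);
    return leaf_key.lo;
}

/*
 * Lossy leaf "compress": store only the bucket of width 10 that contains
 * the value. Distinct values share a key, so no fetch function can exist.
 */
Interval
sketch_compress_lossy(int value)
{
    Interval key;

    assert(value >= 0);
    key.lo = value - value % 10;
    key.hi = key.lo + 9;
    return key;
}
```
- </pre><p>
- In the lossy variant the values 42 and 47 both map to [40, 49], which is
- why the original value cannot be reconstructed from the index alone.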
- </p></dd></dl></div><p>
- All the GiST support methods are normally called in short-lived memory
- contexts; that is, <code class="varname">CurrentMemoryContext</code> will get reset after
- each tuple is processed. It is therefore not essential to
- pfree everything you palloc. However, in some cases it's useful for a
- support method to cache data across repeated calls. To do that, allocate
- the longer-lived data in <code class="literal">fcinfo->flinfo->fn_mcxt</code>, and
- keep a pointer to it in <code class="literal">fcinfo->flinfo->fn_extra</code>. Such
- data will survive for the life of the index operation (e.g., a single GiST
- index scan, index build, or index tuple insertion). Be careful to pfree
- the previous value when replacing a <code class="literal">fn_extra</code> value, or the leak
- will accumulate for the duration of the operation.
- </p></div></body></html>
|