|
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>37.10. C-Language Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="xfunc-internal.html" title="37.9. Internal Functions" /><link rel="next" href="xfunc-optimization.html" title="37.11. Function Optimization Information" /></head><body><div xmlns="http://www.w3.org/TR/xhtml1/transitional" class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">37.10. C-Language Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="xfunc-internal.html" title="37.9. Internal Functions">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="extend.html" title="Chapter 37. Extending SQL">Up</a></td><th width="60%" align="center">Chapter 37. Extending <acronym xmlns="http://www.w3.org/1999/xhtml" class="acronym">SQL</acronym></th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 12.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="xfunc-optimization.html" title="37.11. Function Optimization Information">Next</a></td></tr></table><hr></hr></div><div class="sect1" id="XFUNC-C"><div class="titlepage"><div><div><h2 class="title" style="clear: both">37.10. C-Language Functions</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-DYNLOAD">37.10.1. Dynamic Loading</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-BASETYPE">37.10.2. Base Types in C-Language Functions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.7">37.10.3. 
Version 1 Calling Conventions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.8">37.10.4. Writing Code</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#DFUNC">37.10.5. Compiling and Linking Dynamically-Loaded Functions</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.10">37.10.6. Composite-Type Arguments</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.11">37.10.7. Returning Rows (Composite Types)</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#XFUNC-C-RETURN-SET">37.10.8. Returning Sets</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.13">37.10.9. Polymorphic Arguments and Return Types</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#id-1.8.3.13.14">37.10.10. Shared Memory and LWLocks</a></span></dt><dt><span class="sect2"><a href="xfunc-c.html#EXTEND-CPP">37.10.11. Using C++ for Extensibility</a></span></dt></dl></div><a id="id-1.8.3.13.2" class="indexterm"></a><p>
- User-defined functions can be written in C (or a language that can
- be made compatible with C, such as C++). Such functions are
- compiled into dynamically loadable objects (also called shared
- libraries) and are loaded by the server on demand. The dynamic
- loading feature is what distinguishes <span class="quote">“<span class="quote">C language</span>”</span> functions
- from <span class="quote">“<span class="quote">internal</span>”</span> functions — the actual coding conventions
- are essentially the same for both. (Hence, the standard internal
- function library is a rich source of coding examples for user-defined
- C functions.)
- </p><p>
- Currently only one calling convention is used for C functions
- (<span class="quote">“<span class="quote">version 1</span>”</span>). Support for that calling convention is
- indicated by writing a <code class="literal">PG_FUNCTION_INFO_V1()</code> macro
- call for the function, as illustrated below.
- </p><div class="sect2" id="XFUNC-C-DYNLOAD"><div class="titlepage"><div><div><h3 class="title">37.10.1. Dynamic Loading</h3></div></div></div><a id="id-1.8.3.13.5.2" class="indexterm"></a><p>
- The first time a user-defined function in a particular
- loadable object file is called in a session,
- the dynamic loader loads that object file into memory so that the
- function can be called. The <code class="command">CREATE FUNCTION</code>
- command for a user-defined C function must therefore specify two pieces of
- information for the function: the name of the loadable
- object file, and the C name (link symbol) of the specific function to call
- within that object file. If the C name is not explicitly specified then
- it is assumed to be the same as the SQL function name.
- </p><p>
- The following algorithm is used to locate the shared object file
- based on the name given in the <code class="command">CREATE FUNCTION</code>
- command:
-
- </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
- If the name is an absolute path, the given file is loaded.
- </p></li><li class="listitem"><p>
- If the name starts with the string <code class="literal">$libdir</code>,
- that part is replaced by the <span class="productname">PostgreSQL</span> package
- library directory
- name, which is determined at build time.<a id="id-1.8.3.13.5.4.2.2.1.3" class="indexterm"></a>
- </p></li><li class="listitem"><p>
- If the name does not contain a directory part, the file is
- searched for in the path specified by the configuration variable
- <a class="xref" href="runtime-config-client.html#GUC-DYNAMIC-LIBRARY-PATH">dynamic_library_path</a>.<a id="id-1.8.3.13.5.4.2.3.1.2" class="indexterm"></a>
- </p></li><li class="listitem"><p>
- Otherwise (the file was not found in the path, or it contains a
- non-absolute directory part), the dynamic loader will try to
- take the name as given, which will most likely fail. (It is
- unreliable to depend on the current working directory.)
- </p></li></ol></div><p>
-
- If this sequence does not work, the platform-specific shared
- library file name extension (often <code class="filename">.so</code>) is
- appended to the given name and this sequence is tried again. If
- that fails as well, the load will fail.
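- </p><p>
- The lookup rules above can be summarized in a few lines of plain C. This is
- only an illustrative sketch, not the server's actual implementation: the
- <code class="literal">LookupRule</code> enum and the function
- <code class="function">classify_library_name</code> are hypothetical names, and
- the sketch assumes Unix-style paths in which an absolute path starts with a slash.
- It classifies a name by its spelling only and does not check whether the file exists.
- </p><pre class="programlisting">
- #include &lt;string.h&gt;
-
- /* Hypothetical labels for the four lookup rules listed above. */
- typedef enum
- {
-     LOOKUP_ABSOLUTE,        /* rule 1: load the given file */
-     LOOKUP_LIBDIR,          /* rule 2: expand the leading $libdir */
-     LOOKUP_SEARCH_PATH,     /* rule 3: search dynamic_library_path */
-     LOOKUP_AS_GIVEN         /* rule 4: hand the name over unchanged */
- } LookupRule;
-
- /* Decide which rule applies to a name given in CREATE FUNCTION. */
- static LookupRule
- classify_library_name(const char *name)
- {
-     if (name[0] == '/')
-         return LOOKUP_ABSOLUTE;
-     if (strncmp(name, "$libdir", strlen("$libdir")) == 0)
-         return LOOKUP_LIBDIR;
-     if (strchr(name, '/') == NULL)
-         return LOOKUP_SEARCH_PATH;
-     return LOOKUP_AS_GIVEN;     /* relative name with a directory part */
- }
- </pre><p>
- Under these assumptions, <code class="literal">funcs</code> falls under rule 3,
- <code class="literal">$libdir/funcs</code> under rule 2, and
- <code class="literal">subdir/funcs</code> under rule 4, which is the case the text
- above warns is likely to fail.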
- </p><p>
- It is recommended to locate shared libraries either relative to
- <code class="literal">$libdir</code> or through the dynamic library path.
- This simplifies version upgrades if the new installation is at a
- different location. The actual directory that
- <code class="literal">$libdir</code> stands for can be found with the
- command <code class="literal">pg_config --pkglibdir</code>.
- </p><p>
- The user ID that the <span class="productname">PostgreSQL</span> server runs
- as must be able to traverse the path to the file you intend to
- load. A common mistake is to make the file, or a directory above it,
- unreadable or non-executable by the <span class="systemitem">postgres</span>
- user.
- </p><p>
- In any case, the file name that is given in the
- <code class="command">CREATE FUNCTION</code> command is recorded literally
- in the system catalogs, so if the file needs to be loaded again
- the same procedure is applied.
- </p><div class="note"><h3 class="title">Note</h3><p>
- <span class="productname">PostgreSQL</span> will not compile a C function
- automatically. The object file must be compiled before it is referenced
- in a <code class="command">CREATE
- FUNCTION</code> command. See <a class="xref" href="xfunc-c.html#DFUNC" title="37.10.5. Compiling and Linking Dynamically-Loaded Functions">Section 37.10.5</a> for additional
- information.
- </p></div><a id="id-1.8.3.13.5.9" class="indexterm"></a><p>
- To ensure that a dynamically loaded object file is not loaded into an
- incompatible server, <span class="productname">PostgreSQL</span> checks that the
- file contains a <span class="quote">“<span class="quote">magic block</span>”</span> with the appropriate contents.
- This allows the server to detect obvious incompatibilities, such as code
- compiled for a different major version of
- <span class="productname">PostgreSQL</span>. To include a magic block,
- write this in one (and only one) of the module source files, after having
- included the header <code class="filename">fmgr.h</code>:
-
- </p><pre class="programlisting">
- PG_MODULE_MAGIC;
- </pre><p>
- </p><p>
- After it is used for the first time, a dynamically loaded object
- file is retained in memory. Future calls in the same session to
- the function(s) in that file will only incur the small overhead of
- a symbol table lookup. If you need to force a reload of an object
- file, for example after recompiling it, begin a fresh session.
- </p><a id="id-1.8.3.13.5.12" class="indexterm"></a><a id="id-1.8.3.13.5.13" class="indexterm"></a><a id="id-1.8.3.13.5.14" class="indexterm"></a><a id="id-1.8.3.13.5.15" class="indexterm"></a><p>
- Optionally, a dynamically loaded file can contain initialization and
- finalization functions. If the file includes a function named
- <code class="function">_PG_init</code>, that function will be called immediately after
- loading the file. The function receives no parameters and should
- return void. If the file includes a function named
- <code class="function">_PG_fini</code>, that function will be called immediately before
- unloading the file. Likewise, the function receives no parameters and
- should return void. Note that <code class="function">_PG_fini</code> will only be called
- during an unload of the file, not during process termination.
- (Presently, unloads are disabled and will never occur, but this may
- change in the future.)
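- </p><p>
- A minimal sketch of a module that defines both hooks might look as follows.
- The log messages are placeholders chosen for illustration, and the explicit
- prototypes are an assumption about what the compiler needs in order to avoid
- missing-prototype warnings.
- </p><pre class="programlisting">
- #include "postgres.h"
- #include "fmgr.h"
-
- PG_MODULE_MAGIC;
-
- /* Prototypes for the optional load and unload hooks. */
- void        _PG_init(void);
- void        _PG_fini(void);
-
- /* Called once, immediately after the file is loaded. */
- void
- _PG_init(void)
- {
-     elog(LOG, "example module loaded");
- }
-
- /* Called immediately before an unload, never at process exit. */
- void
- _PG_fini(void)
- {
-     elog(LOG, "example module unloaded");
- }
- </pre><p>
- Because unloads are presently disabled, cleanup that must always happen
- should not be placed in <code class="function">_PG_fini</code> alone.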
- </p></div><div class="sect2" id="XFUNC-C-BASETYPE"><div class="titlepage"><div><div><h3 class="title">37.10.2. Base Types in C-Language Functions</h3></div></div></div><a id="id-1.8.3.13.6.2" class="indexterm"></a><p>
- To know how to write C-language functions, you need to know how
- <span class="productname">PostgreSQL</span> internally represents base
- data types and how they can be passed to and from functions.
- Internally, <span class="productname">PostgreSQL</span> regards a base
- type as a <span class="quote">“<span class="quote">blob of memory</span>”</span>. The user-defined
- functions that you define over a type in turn define the way that
- <span class="productname">PostgreSQL</span> can operate on it. That
- is, <span class="productname">PostgreSQL</span> will only store and
- retrieve the data from disk and use your user-defined functions
- to input, process, and output the data.
- </p><p>
- Base types can have one of three internal formats:
-
- </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
- pass by value, fixed-length
- </p></li><li class="listitem"><p>
- pass by reference, fixed-length
- </p></li><li class="listitem"><p>
- pass by reference, variable-length
- </p></li></ul></div><p>
- </p><p>
- By-value types can only be 1, 2, or 4 bytes in length
- (also 8 bytes, if <code class="literal">sizeof(Datum)</code> is 8 on your machine).
- You should be careful to define your types such that they will be the
- same size (in bytes) on all architectures. For example, the
- <code class="literal">long</code> type is dangerous because it is 4 bytes on some
- machines and 8 bytes on others, whereas the <code class="type">int</code> type is 4 bytes
- on most Unix machines. A reasonable implementation of the
- <code class="type">int4</code> type on Unix machines might be:
-
- </p><pre class="programlisting">
- /* 4-byte integer, passed by value */
- typedef int int4;
- </pre><p>
-
- (The actual PostgreSQL C code calls this type <code class="type">int32</code>, because
- it is a convention in C that <code class="type">int<em class="replaceable"><code>XX</code></em></code>
- means <em class="replaceable"><code>XX</code></em> <span class="emphasis"><em>bits</em></span>. Note
- therefore also that the C type <code class="type">int8</code> is 1 byte in size. The
- SQL type <code class="type">int8</code> is called <code class="type">int64</code> in C. See also
- <a class="xref" href="xfunc-c.html#XFUNC-C-TYPE-TABLE" title="Table 37.1. Equivalent C Types for Built-in SQL Types">Table 37.1</a>.)
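- </p><p>
- One way to guard against size differences between architectures is to let
- the compiler verify the assumption. The sketch below is plain C11 and uses the
- standard <code class="filename">stdint.h</code> types rather than the server's own
- <code class="type">int32</code> and <code class="type">int64</code> typedefs; the names
- <code class="type">my_int4</code> and <code class="type">my_int8</code> are hypothetical.
- </p><pre class="programlisting">
- #include &lt;stdint.h&gt;
-
- /* Hypothetical by-value types with a fixed width on every platform. */
- typedef int32_t my_int4;
- typedef int64_t my_int8;
-
- /* Compilation fails if either assumption about the size is wrong. */
- _Static_assert(sizeof(my_int4) == 4, "my_int4 must be 4 bytes");
- _Static_assert(sizeof(my_int8) == 8, "my_int8 must be 8 bytes");
- </pre><p>
- A typedef based on <code class="literal">long</code> would pass such a check on
- some machines and fail on others, which is exactly the hazard described above.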
- </p><p>
- On the other hand, fixed-length types of any size can
- be passed by-reference. For example, here is a sample
- implementation of a <span class="productname">PostgreSQL</span> type:
-
- </p><pre class="programlisting">
- /* 16-byte structure, passed by reference */
- typedef struct
- {
-     double x, y;
- } Point;
- </pre><p>
-
- Only pointers to such types can be used when passing
- them in and out of <span class="productname">PostgreSQL</span> functions.
- To return a value of such a type, allocate the right amount of
- memory with <code class="literal">palloc</code>, fill in the allocated memory,
- and return a pointer to it. (Also, if you just want to return the
- same value as one of your input arguments that's of the same data type,
- you can skip the extra <code class="literal">palloc</code> and just return the
- pointer to the input value.)
- </p><p>
- Finally, all variable-length types must also be passed
- by reference. All variable-length types must begin
- with an opaque length field of exactly 4 bytes, which will be set
- by <code class="symbol">SET_VARSIZE</code>; never set this field directly! All data to
- be stored within that type must be located in the memory
- immediately following that length field. The
- length field contains the total length of the structure,
- that is, it includes the size of the length field
- itself.
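- </p><p>
- As a minimal sketch of this layout, the fragment below declares a hypothetical
- variable-length type and allocates a value of it. The type name
- <code class="type">FloatList</code>, its fields, and the helper
- <code class="function">make_float_list</code> are invented for illustration, and
- the helper assumes a non-negative <code class="literal">count</code>.
- </p><pre class="programlisting">
- #include "postgres.h"
-
- /* Hypothetical variable-length type: a counted list of float8 values. */
- typedef struct
- {
-     int32       vl_len_;    /* length field; set only via SET_VARSIZE */
-     int32       count;      /* number of entries in values[] */
-     float8      values[FLEXIBLE_ARRAY_MEMBER];
- } FloatList;
-
- /* Allocate a zero-filled FloatList with room for count entries. */
- static FloatList *
- make_float_list(int32 count)
- {
-     Size        size = offsetof(FloatList, values) + count * sizeof(float8);
-     FloatList  *result = (FloatList *) palloc0(size);
-
-     /* Total length in bytes, including the length field itself. */
-     SET_VARSIZE(result, size);
-     result->count = count;
-     return result;
- }
- </pre><p>
- All payload fields follow the 4-byte length field directly, and the length is
- written with <code class="symbol">SET_VARSIZE</code> rather than by assignment.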
- </p><p>
- Another important point is to avoid leaving any uninitialized bits
- within data type values; for example, take care to zero out any
- alignment padding bytes that might be present in structs. Without
- this, logically-equivalent constants of your data type might be
- seen as unequal by the planner, leading to inefficient (though not
- incorrect) plans.
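- </p><p>
- The effect of padding bytes can be illustrated in plain C, outside the server.
- In this sketch the struct <code class="type">PaddedPair</code> and both helper
- functions are hypothetical. On typical platforms the compiler inserts padding
- after the <code class="literal">tag</code> member, so a bytewise comparison is
- only reliable when the whole struct was zeroed first.
- </p><pre class="programlisting">
- #include &lt;string.h&gt;
-
- /* Hypothetical struct; most compilers pad after 'tag' to align 'value'. */
- typedef struct
- {
-     char        tag;
-     double      value;
- } PaddedPair;
-
- /* Zero the whole struct, padding included, before assigning the fields. */
- static void
- padded_pair_init(PaddedPair *p, char tag, double value)
- {
-     memset(p, 0, sizeof(*p));
-     p->tag = tag;
-     p->value = value;
- }
-
- /* Bytewise comparison of two values, padding bytes included. */
- static int
- padded_pair_same_bytes(const PaddedPair *a, const PaddedPair *b)
- {
-     return memcmp(a, b, sizeof(*a)) == 0;
- }
- </pre><p>
- Without the <code class="function">memset</code> call, two values with equal
- fields could still differ in their padding bytes and compare as unequal.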
- </p><div class="warning"><h3 class="title">Warning</h3><p>
- <span class="emphasis"><em>Never</em></span> modify the contents of a pass-by-reference input
- value. If you do so you are likely to corrupt on-disk data, since
- the pointer you are given might point directly into a disk buffer.
- The sole exception to this rule is explained in
- <a class="xref" href="xaggr.html" title="37.12. User-Defined Aggregates">Section 37.12</a>.
- </p></div><p>
- As an example, we can define the type <code class="type">text</code> as
- follows:
-
- </p><pre class="programlisting">
- typedef struct {
-     int32 length;
-     char data[FLEXIBLE_ARRAY_MEMBER];
- } text;
- </pre><p>
-
- The <code class="literal">[FLEXIBLE_ARRAY_MEMBER]</code> notation means that the actual
- length of the data part is not specified by this declaration.
- </p><p>
- When manipulating
- variable-length types, we must be careful to allocate
- the correct amount of memory and set the length field correctly.
- For example, if we wanted to store 40 bytes in a <code class="structname">text</code>
- structure, we might use a code fragment like this:
-
- </p><pre class="programlisting">
- #include "postgres.h"
- ...
- char buffer[40]; /* our source data */
- ...
- text *destination = (text *) palloc(VARHDRSZ + 40);
- SET_VARSIZE(destination, VARHDRSZ + 40);
- memcpy(destination->data, buffer, 40);
- ...
-
- </pre><p>
-
- <code class="literal">VARHDRSZ</code> is the same as <code class="literal">sizeof(int32)</code>, but
- it's considered good style to use the macro <code class="literal">VARHDRSZ</code>
- to refer to the size of the overhead for a variable-length type.
- Also, the length field <span class="emphasis"><em>must</em></span> be set using the
- <code class="literal">SET_VARSIZE</code> macro, not by simple assignment.
- </p><p>
- <a class="xref" href="xfunc-c.html#XFUNC-C-TYPE-TABLE" title="Table 37.1. Equivalent C Types for Built-in SQL Types">Table 37.1</a> specifies which C type
- corresponds to which SQL type when writing a C-language function
- that uses a built-in type of <span class="productname">PostgreSQL</span>.
- The <span class="quote">“<span class="quote">Defined In</span>”</span> column gives the header file that
- needs to be included to get the type definition. (The actual
- definition might be in a different file that is included by the
- listed file. It is recommended that users stick to the defined
- interface.) Note that you should always include
- <code class="filename">postgres.h</code> first in any source file, because
- it declares a number of things that you will need anyway.
- </p><div class="table" id="XFUNC-C-TYPE-TABLE"><p class="title"><strong>Table 37.1. Equivalent C Types for Built-in SQL Types</strong></p><div class="table-contents"><table class="table" summary="Equivalent C Types for Built-in SQL Types" border="1"><colgroup><col /><col /><col /></colgroup><thead><tr><th>
- SQL Type
- </th><th>
- C Type
- </th><th>
- Defined In
- </th></tr></thead><tbody><tr><td><code class="type">boolean</code></td><td><code class="type">bool</code></td><td><code class="filename">postgres.h</code> (maybe compiler built-in)</td></tr><tr><td><code class="type">box</code></td><td><code class="type">BOX*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">bytea</code></td><td><code class="type">bytea*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">"char"</code></td><td><code class="type">char</code></td><td>(compiler built-in)</td></tr><tr><td><code class="type">character</code></td><td><code class="type">BpChar*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">cid</code></td><td><code class="type">CommandId</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">date</code></td><td><code class="type">DateADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">smallint</code> (<code class="type">int2</code>)</td><td><code class="type">int16</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">int2vector</code></td><td><code class="type">int2vector*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">integer</code> (<code class="type">int4</code>)</td><td><code class="type">int32</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">real</code> (<code class="type">float4</code>)</td><td><code class="type">float4*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">double precision</code> (<code class="type">float8</code>)</td><td><code class="type">float8*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">interval</code></td><td><code class="type">Interval*</code></td><td><code 
class="filename">datatype/timestamp.h</code></td></tr><tr><td><code class="type">lseg</code></td><td><code class="type">LSEG*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">name</code></td><td><code class="type">Name</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">oid</code></td><td><code class="type">Oid</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">oidvector</code></td><td><code class="type">oidvector*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">path</code></td><td><code class="type">PATH*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">point</code></td><td><code class="type">POINT*</code></td><td><code class="filename">utils/geo_decls.h</code></td></tr><tr><td><code class="type">regproc</code></td><td><code class="type">regproc</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">text</code></td><td><code class="type">text*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">tid</code></td><td><code class="type">ItemPointer</code></td><td><code class="filename">storage/itemptr.h</code></td></tr><tr><td><code class="type">time</code></td><td><code class="type">TimeADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">time with time zone</code></td><td><code class="type">TimeTzADT</code></td><td><code class="filename">utils/date.h</code></td></tr><tr><td><code class="type">timestamp</code></td><td><code class="type">Timestamp*</code></td><td><code class="filename">datatype/timestamp.h</code></td></tr><tr><td><code class="type">varchar</code></td><td><code class="type">VarChar*</code></td><td><code class="filename">postgres.h</code></td></tr><tr><td><code class="type">xid</code></td><td><code 
class="type">TransactionId</code></td><td><code class="filename">postgres.h</code></td></tr></tbody></table></div></div><br class="table-break" /><p>
- Now that we've gone over all of the possible structures
- for base types, we can show some examples of real functions.
- </p></div><div class="sect2" id="id-1.8.3.13.7"><div class="titlepage"><div><div><h3 class="title">37.10.3. Version 1 Calling Conventions</h3></div></div></div><p>
- The version-1 calling convention relies on macros to suppress most
- of the complexity of passing arguments and results. The C declaration
- of a version-1 function is always:
- </p><pre class="programlisting">
- Datum funcname(PG_FUNCTION_ARGS)
- </pre><p>
- In addition, the macro call:
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(funcname);
- </pre><p>
- must appear in the same source file. (Conventionally, it's
- written just before the function itself.) This macro call is not
- needed for <code class="literal">internal</code>-language functions, since
- <span class="productname">PostgreSQL</span> assumes that all internal functions
- use the version-1 convention. It is, however, required for
- dynamically-loaded functions.
- </p><p>
- In a version-1 function, each actual argument is fetched using a
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code>
- macro that corresponds to the argument's data type. (In non-strict
- functions, first check whether the argument is null
- using <code class="function">PG_ARGISNULL()</code>; see below.)
- The result is returned using a
- <code class="function">PG_RETURN_<em class="replaceable"><code>xxx</code></em>()</code>
- macro for the return type.
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code>
- takes as its argument the number of the function argument to
- fetch, where the count starts at 0.
- <code class="function">PG_RETURN_<em class="replaceable"><code>xxx</code></em>()</code>
- takes as its argument the actual value to return.
- </p><p>
- Here are some examples using the version-1 calling convention:
- </p><pre class="programlisting">
- #include "postgres.h"
- #include &lt;string.h&gt;
- #include "fmgr.h"
- #include "utils/geo_decls.h"
-
- PG_MODULE_MAGIC;
-
- /* by value */
-
- PG_FUNCTION_INFO_V1(add_one);
-
- Datum
- add_one(PG_FUNCTION_ARGS)
- {
-     int32 arg = PG_GETARG_INT32(0);
-
-     PG_RETURN_INT32(arg + 1);
- }
-
- /* by reference, fixed length */
-
- PG_FUNCTION_INFO_V1(add_one_float8);
-
- Datum
- add_one_float8(PG_FUNCTION_ARGS)
- {
-     /* The macros for FLOAT8 hide its pass-by-reference nature. */
-     float8 arg = PG_GETARG_FLOAT8(0);
-
-     PG_RETURN_FLOAT8(arg + 1.0);
- }
-
- PG_FUNCTION_INFO_V1(makepoint);
-
- Datum
- makepoint(PG_FUNCTION_ARGS)
- {
-     /* Here, the pass-by-reference nature of Point is not hidden. */
-     Point *pointx = PG_GETARG_POINT_P(0);
-     Point *pointy = PG_GETARG_POINT_P(1);
-     Point *new_point = (Point *) palloc(sizeof(Point));
-
-     new_point->x = pointx->x;
-     new_point->y = pointy->y;
-
-     PG_RETURN_POINT_P(new_point);
- }
-
- /* by reference, variable length */
-
- PG_FUNCTION_INFO_V1(copytext);
-
- Datum
- copytext(PG_FUNCTION_ARGS)
- {
-     text *t = PG_GETARG_TEXT_PP(0);
-
-     /*
-      * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
-      * VARHDRSZ or VARHDRSZ_SHORT of its header. Construct the copy with a
-      * full-length header.
-      */
-     text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
-     SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
-
-     /*
-      * VARDATA is a pointer to the data region of the new struct. The source
-      * could be a short datum, so retrieve its data through VARDATA_ANY.
-      */
-     memcpy((void *) VARDATA(new_t), /* destination */
-            (void *) VARDATA_ANY(t), /* source */
-            VARSIZE_ANY_EXHDR(t));   /* how many bytes */
-     PG_RETURN_TEXT_P(new_t);
- }
-
- PG_FUNCTION_INFO_V1(concat_text);
-
- Datum
- concat_text(PG_FUNCTION_ARGS)
- {
-     text *arg1 = PG_GETARG_TEXT_PP(0);
-     text *arg2 = PG_GETARG_TEXT_PP(1);
-     int32 arg1_size = VARSIZE_ANY_EXHDR(arg1);
-     int32 arg2_size = VARSIZE_ANY_EXHDR(arg2);
-     int32 new_text_size = arg1_size + arg2_size + VARHDRSZ;
-     text *new_text = (text *) palloc(new_text_size);
-
-     SET_VARSIZE(new_text, new_text_size);
-     memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
-     memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
-     PG_RETURN_TEXT_P(new_text);
- }
-
- </pre><p>
- Supposing that the above code has been prepared in file
- <code class="filename">funcs.c</code> and compiled into a shared object,
- we could define the functions to <span class="productname">PostgreSQL</span>
- with commands like this:
- </p><pre class="programlisting">
- CREATE FUNCTION add_one(integer) RETURNS integer
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'add_one'
- LANGUAGE C STRICT;
-
- -- note overloading of SQL function name "add_one"
- CREATE FUNCTION add_one(double precision) RETURNS double precision
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'add_one_float8'
- LANGUAGE C STRICT;
-
- CREATE FUNCTION makepoint(point, point) RETURNS point
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'makepoint'
- LANGUAGE C STRICT;
-
- CREATE FUNCTION copytext(text) RETURNS text
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'copytext'
- LANGUAGE C STRICT;
-
- CREATE FUNCTION concat_text(text, text) RETURNS text
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'concat_text'
- LANGUAGE C STRICT;
- </pre><p>
- Here, <em class="replaceable"><code>DIRECTORY</code></em> stands for the
- directory of the shared library file (for instance the
- <span class="productname">PostgreSQL</span> tutorial directory, which
- contains the code for the examples used in this section).
- (Better style would be to use just <code class="literal">'funcs'</code> in the
- <code class="literal">AS</code> clause, after having added
- <em class="replaceable"><code>DIRECTORY</code></em> to the search path. In any
- case, we can omit the system-specific extension for a shared
- library, commonly <code class="literal">.so</code>.)
- </p><p>
- Notice that we have specified the functions as <span class="quote">“<span class="quote">strict</span>”</span>,
- meaning that
- the system should automatically assume a null result if any input
- value is null. By doing this, we avoid having to check for null inputs
- in the function code. Without this, we'd have to check for null values
- explicitly, using <code class="function">PG_ARGISNULL()</code>.
- </p><p>
- The macro <code class="function">PG_ARGISNULL(<em class="replaceable"><code>n</code></em>)</code>
- allows a function to test whether each input is null. (Of course, doing
- this is only necessary in functions not declared <span class="quote">“<span class="quote">strict</span>”</span>.)
- As with the
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> macros,
- the input arguments are counted beginning at zero. Note that one
- should refrain from executing
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code> until
- one has verified that the argument isn't null.
- To return a null result, execute <code class="function">PG_RETURN_NULL()</code>;
- this works in both strict and nonstrict functions.
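- </p><p>
- A non-strict function that follows these rules could be sketched as below.
- The function name <code class="function">add_ignoring_nulls</code> and its
- behavior (a null input counts as zero, and the result is null only when both
- inputs are null) are hypothetical; integer overflow is ignored for brevity, and
- the fragment assumes it is added to a module that already contains
- <code class="literal">PG_MODULE_MAGIC</code>.
- </p><pre class="programlisting">
- #include "postgres.h"
- #include "fmgr.h"
-
- PG_FUNCTION_INFO_V1(add_ignoring_nulls);
-
- Datum
- add_ignoring_nulls(PG_FUNCTION_ARGS)
- {
-     int32       sum = 0;
-
-     /* Both inputs are null: return an SQL null. */
-     if (PG_ARGISNULL(0) &amp;&amp; PG_ARGISNULL(1))
-         PG_RETURN_NULL();
-
-     /* Fetch each argument only after confirming that it is not null. */
-     if (!PG_ARGISNULL(0))
-         sum += PG_GETARG_INT32(0);
-     if (!PG_ARGISNULL(1))
-         sum += PG_GETARG_INT32(1);
-
-     PG_RETURN_INT32(sum);
- }
- </pre><p>
- Such a function must be created without the <code class="literal">STRICT</code>
- keyword; otherwise the system returns null before the C code is ever called.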
- </p><p>
- At first glance, the version-1 coding conventions might appear
- to be just pointless obscurantism, compared to using
- plain <code class="literal">C</code> calling conventions. They do, however, allow
- us to deal with <code class="literal">NULL</code>able arguments/return values,
- and <span class="quote">“<span class="quote">toasted</span>”</span> (compressed or out-of-line) values.
- </p><p>
- Other options provided by the version-1 interface are two
- variants of the
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>()</code>
- macros. The first of these,
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_COPY()</code>,
- guarantees to return a copy of the specified argument that is
- safe for writing into. (The normal macros will sometimes return a
- pointer to a value that is physically stored in a table, which
- must not be written to. Using the
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_COPY()</code>
- macros guarantees a writable result.)
- The second variant consists of the
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em>_SLICE()</code>
- macros which take three arguments. The first is the number of the
- function argument (as above). The second and third are the offset and
- length of the segment to be returned. Offsets are counted from
- zero, and a negative length requests that the remainder of the
- value be returned. These macros provide more efficient access to
- parts of large values in the case where they have storage type
- <span class="quote">“<span class="quote">external</span>”</span>. (The storage type of a column can be specified using
- <code class="literal">ALTER TABLE <em class="replaceable"><code>tablename</code></em> ALTER
- COLUMN <em class="replaceable"><code>colname</code></em> SET STORAGE
- <em class="replaceable"><code>storagetype</code></em></code>. <em class="replaceable"><code>storagetype</code></em> is one of
- <code class="literal">plain</code>, <code class="literal">external</code>, <code class="literal">extended</code>,
- or <code class="literal">main</code>.)
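- </p><p>
- Both variants can be sketched with the <code class="type">bytea</code> type, for
- which offsets are plain byte positions. The function names
- <code class="function">first_ten_bytes</code> and
- <code class="function">reverse_bytes</code> are hypothetical, and the fragment
- assumes it is added to a module that already contains
- <code class="literal">PG_MODULE_MAGIC</code>.
- </p><pre class="programlisting">
- #include "postgres.h"
- #include "fmgr.h"
-
- PG_FUNCTION_INFO_V1(first_ten_bytes);
-
- /* Return at most the first 10 bytes of the argument. */
- Datum
- first_ten_bytes(PG_FUNCTION_ARGS)
- {
-     /* argument 0, starting at offset 0, length 10 */
-     bytea      *prefix = PG_GETARG_BYTEA_P_SLICE(0, 0, 10);
-
-     PG_RETURN_BYTEA_P(prefix);
- }
-
- PG_FUNCTION_INFO_V1(reverse_bytes);
-
- /* Reverse the bytes of the argument, working on a private copy. */
- Datum
- reverse_bytes(PG_FUNCTION_ARGS)
- {
-     /* The _COPY variant returns a copy that is safe to write into. */
-     bytea      *b = PG_GETARG_BYTEA_P_COPY(0);
-     char       *data = VARDATA(b);
-     int32       len = VARSIZE(b) - VARHDRSZ;
-     int32       i;
-
-     for (i = 0; i &lt; len / 2; i++)
-     {
-         char        tmp = data[i];
-
-         data[i] = data[len - 1 - i];
-         data[len - 1 - i] = tmp;
-     }
-
-     PG_RETURN_BYTEA_P(b);
- }
- </pre><p>
- Writing into the value returned by the plain
- <code class="function">PG_GETARG_BYTEA_P()</code> macro instead would risk
- modifying data that is physically stored in a table.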
- </p><p>
- Finally, the version-1 function call conventions make it possible
- to return set results (<a class="xref" href="xfunc-c.html#XFUNC-C-RETURN-SET" title="37.10.8. Returning Sets">Section 37.10.8</a>) and
- implement trigger functions (<a class="xref" href="triggers.html" title="Chapter 38. Triggers">Chapter 38</a>) and
- procedural-language call handlers (<a class="xref" href="plhandler.html" title="Chapter 55. Writing a Procedural Language Handler">Chapter 55</a>). For more details
- see <code class="filename">src/backend/utils/fmgr/README</code> in the
- source distribution.
- </p></div><div class="sect2" id="id-1.8.3.13.8"><div class="titlepage"><div><div><h3 class="title">37.10.4. Writing Code</h3></div></div></div><p>
- Before we turn to the more advanced topics, we should discuss
- some coding rules for <span class="productname">PostgreSQL</span>
- C-language functions. While it might be possible to load functions
- written in languages other than C into
- <span class="productname">PostgreSQL</span>, this is usually difficult
- (when it is possible at all) because other languages, such as
- C++, FORTRAN, or Pascal, often do not follow the same calling
- convention as C. That is, other languages do not pass argument
- and return values between functions in the same way. For this
- reason, we will assume that your C-language functions are
- actually written in C.
- </p><p>
- The basic rules for writing and building C functions are as follows:
-
- </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
- Use <code class="literal">pg_config
- --includedir-server</code><a id="id-1.8.3.13.8.3.1.1.1.2" class="indexterm"></a>
- to find out where the <span class="productname">PostgreSQL</span> server header
- files are installed on your system (or the system that your
- users will be running on).
- </p></li><li class="listitem"><p>
- Compiling and linking your code so that it can be dynamically
- loaded into <span class="productname">PostgreSQL</span> always
- requires special flags. See <a class="xref" href="xfunc-c.html#DFUNC" title="37.10.5. Compiling and Linking Dynamically-Loaded Functions">Section 37.10.5</a> for a
- detailed explanation of how to do it for your particular
- operating system.
- </p></li><li class="listitem"><p>
- Remember to define a <span class="quote">“<span class="quote">magic block</span>”</span> for your shared library,
- as described in <a class="xref" href="xfunc-c.html#XFUNC-C-DYNLOAD" title="37.10.1. Dynamic Loading">Section 37.10.1</a>.
- </p></li><li class="listitem"><p>
- When allocating memory, use the
- <span class="productname">PostgreSQL</span> functions
- <code class="function">palloc</code><a id="id-1.8.3.13.8.3.1.4.1.3" class="indexterm"></a> and <code class="function">pfree</code><a id="id-1.8.3.13.8.3.1.4.1.5" class="indexterm"></a>
- instead of the corresponding C library functions
- <code class="function">malloc</code> and <code class="function">free</code>.
- The memory allocated by <code class="function">palloc</code> will be
- freed automatically at the end of each transaction, preventing
- memory leaks.
- </p></li><li class="listitem"><p>
- Always zero the bytes of your structures using <code class="function">memset</code>
- (or allocate them with <code class="function">palloc0</code> in the first place).
- Even if you assign to each field of your structure, there might be
- alignment padding (holes in the structure) that contains
- garbage values. Without zeroing, it is difficult to
- support hash indexes or hash joins, because you would have to pick out only
- the significant bits of your data structure to compute a hash.
- The planner also sometimes relies on comparing constants via
- bitwise equality, so you can get undesirable planning results if
- logically-equivalent values aren't bitwise equal.
- </p></li><li class="listitem"><p>
- Most of the internal <span class="productname">PostgreSQL</span>
- types are declared in <code class="filename">postgres.h</code>, while
- the function manager interfaces
- (<code class="symbol">PG_FUNCTION_ARGS</code>, etc.) are in
- <code class="filename">fmgr.h</code>, so you will need to include at
- least these two files. For portability reasons it's best to
- include <code class="filename">postgres.h</code> <span class="emphasis"><em>first</em></span>,
- before any other system or user header files. Including
- <code class="filename">postgres.h</code> will also include
- <code class="filename">elog.h</code> and <code class="filename">palloc.h</code>
- for you.
- </p></li><li class="listitem"><p>
- Symbol names defined within object files must not conflict
- with each other or with symbols defined in the
- <span class="productname">PostgreSQL</span> server executable. You
- will have to rename your functions or variables if you get
- error messages to this effect.
- </p></li></ul></div><p>
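The zeroing rule in the list above can be illustrated outside the server. The following is a minimal standalone sketch, not PostgreSQL code: the DemoKey structure and both helper functions are hypothetical names invented for this illustration, and plain memset stands in for palloc0. It shows two logically equal structures comparing bitwise equal when every byte, padding included, starts out as zero. It relies on the common compiler behavior that assigning individual fields leaves the padding bytes untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type: most ABIs insert padding bytes after "flag". */
typedef struct DemoKey
{
    char    flag;
    int32_t id;
} DemoKey;

/* Zero the whole structure first, then assign the individual fields. */
void
demo_key_init(DemoKey *key, char flag, int32_t id)
{
    assert(key != NULL);
    memset(key, 0, sizeof(DemoKey));
    key->flag = flag;
    key->id = id;
}

/* Bitwise comparison over the full structure, padding bytes included. */
bool
demo_key_bitwise_equal(const DemoKey *a, const DemoKey *b)
{
    return memcmp(a, b, sizeof(DemoKey)) == 0;
}
```

Without the memset call, the padding bytes would hold whatever was previously in memory, so the same comparison could fail for keys whose fields are equal.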
- </p></div><div class="sect2" id="DFUNC"><div class="titlepage"><div><div><h3 class="title">37.10.5. Compiling and Linking Dynamically-Loaded Functions</h3></div></div></div><p>
- Before you are able to use your
- <span class="productname">PostgreSQL</span> extension functions written in
- C, they must be compiled and linked in a special way to produce a
- file that can be dynamically loaded by the server. To be precise, a
- <em class="firstterm">shared library</em> needs to be
- created.<a id="id-1.8.3.13.9.2.3" class="indexterm"></a>
-
- </p><p>
- For information beyond what is contained in this section
- you should read the documentation of your
- operating system, in particular the manual pages for the C compiler,
- <code class="command">cc</code>, and the link editor, <code class="command">ld</code>.
- In addition, the <span class="productname">PostgreSQL</span> source code
- contains several working examples in the
- <code class="filename">contrib</code> directory. If you rely on these
- examples you will make your modules dependent on the availability
- of the <span class="productname">PostgreSQL</span> source code, however.
- </p><p>
- Creating shared libraries is generally analogous to linking
- executables: first the source files are compiled into object files,
- then the object files are linked together. The object files need to
- be created as <em class="firstterm">position-independent code</em>
- (<acronym class="acronym">PIC</acronym>),<a id="id-1.8.3.13.9.4.3" class="indexterm"></a> which
- conceptually means that they can be placed at an arbitrary location
- in memory when they are loaded by the executable. (Object files
- intended for executables are usually not compiled that way.) The
- command to link a shared library contains special flags to
- distinguish it from linking an executable (at least in theory
- — on some systems the practice is much uglier).
- </p><p>
- In the following examples we assume that your source code is in a
- file <code class="filename">foo.c</code> and we will create a shared library
- <code class="filename">foo.so</code>. The intermediate object file will be
- called <code class="filename">foo.o</code> unless otherwise noted. A shared
- library can contain more than one object file, but we only use one
- here.
- </p><div class="variablelist"><dl class="variablelist"><dt><span class="term">
- <span class="systemitem">FreeBSD</span>
- <a id="id-1.8.3.13.9.6.1.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag to create <acronym class="acronym">PIC</acronym> is
- <code class="option">-fPIC</code>. To create shared libraries the compiler
- flag is <code class="option">-shared</code>.
- </p><pre class="programlisting">
- gcc -fPIC -c foo.c
- gcc -shared -o foo.so foo.o
- </pre><p>
- This is applicable as of version 3.0 of
- <span class="systemitem">FreeBSD</span>.
- </p></dd><dt><span class="term">
- <span class="systemitem">HP-UX</span>
- <a id="id-1.8.3.13.9.6.2.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag of the system compiler to create
- <acronym class="acronym">PIC</acronym> is <code class="option">+z</code>. When using
- <span class="application">GCC</span> it's <code class="option">-fPIC</code>. The
- linker flag for shared libraries is <code class="option">-b</code>. So:
- </p><pre class="programlisting">
- cc +z -c foo.c
- </pre><p>
- or:
- </p><pre class="programlisting">
- gcc -fPIC -c foo.c
- </pre><p>
- and then:
- </p><pre class="programlisting">
- ld -b -o foo.sl foo.o
- </pre><p>
- <span class="systemitem">HP-UX</span> uses the extension
- <code class="filename">.sl</code> for shared libraries, unlike most other
- systems.
- </p></dd><dt><span class="term">
- <span class="systemitem">Linux</span>
- <a id="id-1.8.3.13.9.6.3.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag to create <acronym class="acronym">PIC</acronym> is
- <code class="option">-fPIC</code>.
- The compiler flag to create a shared library is
- <code class="option">-shared</code>. A complete example looks like this:
- </p><pre class="programlisting">
- cc -fPIC -c foo.c
- cc -shared -o foo.so foo.o
- </pre><p>
- </p></dd><dt><span class="term">
- <span class="systemitem">macOS</span>
- <a id="id-1.8.3.13.9.6.4.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- Here is an example. It assumes the developer tools are installed.
- </p><pre class="programlisting">
- cc -c foo.c
- cc -bundle -flat_namespace -undefined suppress -o foo.so foo.o
- </pre><p>
- </p></dd><dt><span class="term">
- <span class="systemitem">NetBSD</span>
- <a id="id-1.8.3.13.9.6.5.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag to create <acronym class="acronym">PIC</acronym> is
- <code class="option">-fPIC</code>. For <acronym class="acronym">ELF</acronym> systems, the
- compiler with the flag <code class="option">-shared</code> is used to link
- shared libraries. On the older non-ELF systems, <code class="literal">ld
- -Bshareable</code> is used.
- </p><pre class="programlisting">
- gcc -fPIC -c foo.c
- gcc -shared -o foo.so foo.o
- </pre><p>
- </p></dd><dt><span class="term">
- <span class="systemitem">OpenBSD</span>
- <a id="id-1.8.3.13.9.6.6.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag to create <acronym class="acronym">PIC</acronym> is
- <code class="option">-fPIC</code>. <code class="literal">ld -Bshareable</code> is
- used to link shared libraries.
- </p><pre class="programlisting">
- gcc -fPIC -c foo.c
- ld -Bshareable -o foo.so foo.o
- </pre><p>
- </p></dd><dt><span class="term">
- <span class="systemitem">Solaris</span>
- <a id="id-1.8.3.13.9.6.7.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The compiler flag to create <acronym class="acronym">PIC</acronym> is
- <code class="option">-KPIC</code> with the Sun compiler and
- <code class="option">-fPIC</code> with <span class="application">GCC</span>. To
- link shared libraries, the compiler option is
- <code class="option">-G</code> with either compiler or alternatively
- <code class="option">-shared</code> with <span class="application">GCC</span>.
- </p><pre class="programlisting">
- cc -KPIC -c foo.c
- cc -G -o foo.so foo.o
- </pre><p>
- or
- </p><pre class="programlisting">
- gcc -fPIC -c foo.c
- gcc -G -o foo.so foo.o
- </pre><p>
- </p></dd></dl></div><div class="tip"><h3 class="title">Tip</h3><p>
- If these platform-specific steps are too cumbersome, consider using
- <a class="ulink" href="https://www.gnu.org/software/libtool/" target="_top">
- <span class="productname">GNU Libtool</span></a>,
- which hides the platform differences behind a uniform interface.
- </p></div><p>
- The resulting shared library file can then be loaded into
- <span class="productname">PostgreSQL</span>. When specifying the file name
- to the <code class="command">CREATE FUNCTION</code> command, one must give it
- the name of the shared library file, not the intermediate object file.
- Note that the system's standard shared-library extension (usually
- <code class="literal">.so</code> or <code class="literal">.sl</code>) can be omitted from
- the <code class="command">CREATE FUNCTION</code> command, and normally should
- be omitted for best portability.
- </p><p>
- Refer back to <a class="xref" href="xfunc-c.html#XFUNC-C-DYNLOAD" title="37.10.1. Dynamic Loading">Section 37.10.1</a> about where the
- server expects to find the shared library files.
- </p></div><div class="sect2" id="id-1.8.3.13.10"><div class="titlepage"><div><div><h3 class="title">37.10.6. Composite-Type Arguments</h3></div></div></div><p>
- Composite types do not have a fixed layout like C structures.
- Instances of a composite type can contain null fields. In
- addition, composite types that are part of an inheritance
- hierarchy can have different fields than other members of the
- same inheritance hierarchy. Therefore,
- <span class="productname">PostgreSQL</span> provides a function
- interface for accessing fields of composite types from C.
- </p><p>
- Suppose we want to write a function to answer the query:
-
- </p><pre class="programlisting">
- SELECT name, c_overpaid(emp, 1500) AS overpaid
-     FROM emp
-     WHERE name = 'Bill' OR name = 'Sam';
- </pre><p>
-
- Using the version-1 calling conventions, we can define
- <code class="function">c_overpaid</code> as:
-
- </p><pre class="programlisting">
- #include "postgres.h"
- #include "executor/executor.h"  /* for GetAttributeByName() */
-
- PG_MODULE_MAGIC;
-
- PG_FUNCTION_INFO_V1(c_overpaid);
-
- Datum
- c_overpaid(PG_FUNCTION_ARGS)
- {
-     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
-     int32            limit = PG_GETARG_INT32(1);
-     bool             isnull;
-     Datum            salary;
-
-     salary = GetAttributeByName(t, "salary", &isnull);
-     if (isnull)
-         PG_RETURN_BOOL(false);
-     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
-
-     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
- }
-
- </pre><p>
- </p><p>
- <code class="function">GetAttributeByName</code> is the
- <span class="productname">PostgreSQL</span> system function that
- returns attributes out of the specified row. It has
- three arguments: the argument of type <code class="type">HeapTupleHeader</code> passed
- into
- the function, the name of the desired attribute, and a
- return parameter that tells whether the attribute
- is null. <code class="function">GetAttributeByName</code> returns a <code class="type">Datum</code>
- value that you can convert to the proper data type by using the
- appropriate <code class="function">DatumGet<em class="replaceable"><code>XXX</code></em>()</code>
- macro. Note that the return value is meaningless if the null flag is
- set; always check the null flag before trying to do anything with the
- result.
- </p><p>
- There is also <code class="function">GetAttributeByNum</code>, which selects
- the target attribute by column number instead of name.
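As a minimal sketch of the by-number variant, the function below mirrors c_overpaid but calls GetAttributeByNum. The function name c_overpaid_by_num and the SALARY_ATTNUM constant are hypothetical, and the sketch assumes that salary is the second column of emp and is stored as a 4-byte integer; adjust both to the actual table definition. It is meant to be compiled into the same library as c_overpaid, so it does not repeat the magic block.

```c
#include "postgres.h"
#include "fmgr.h"
#include "executor/executor.h"  /* for GetAttributeByNum() */

/* ASSUMPTION: "salary" is column 2 of emp; column numbers start at 1. */
#define SALARY_ATTNUM 2

PG_FUNCTION_INFO_V1(c_overpaid_by_num);

Datum
c_overpaid_by_num(PG_FUNCTION_ARGS)
{
    HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32            limit = PG_GETARG_INT32(1);
    bool             isnull;
    Datum            salary;

    /* Fetch the attribute by position instead of by name. */
    salary = GetAttributeByNum(t, SALARY_ATTNUM, &isnull);

    /* The returned Datum is meaningless when the null flag is set. */
    if (isnull)
        PG_RETURN_BOOL(false);

    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
```

Looking up by number avoids the name search but ties the function to the column order of the table, so the name-based call is usually the more robust choice.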
- </p><p>
- The following command declares the function
- <code class="function">c_overpaid</code> in SQL:
-
- </p><pre class="programlisting">
- CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
-     AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'c_overpaid'
-     LANGUAGE C STRICT;
- </pre><p>
-
- Notice we have used <code class="literal">STRICT</code> so that we did not have to
- check whether the input arguments were NULL.
- </p></div><div class="sect2" id="id-1.8.3.13.11"><div class="titlepage"><div><div><h3 class="title">37.10.7. Returning Rows (Composite Types)</h3></div></div></div><p>
- To return a row or composite-type value from a C-language
- function, you can use a special API that provides macros and
- functions to hide most of the complexity of building composite
- data types. To use this API, the source file must include:
- </p><pre class="programlisting">
- #include "funcapi.h"
- </pre><p>
- </p><p>
- There are two ways you can build a composite data value (henceforth
- a <span class="quote">“<span class="quote">tuple</span>”</span>): you can build it from an array of Datum values,
- or from an array of C strings that can be passed to the input
- conversion functions of the tuple's column data types. In either
- case, you first need to obtain or construct a <code class="structname">TupleDesc</code>
- descriptor for the tuple structure. When working with Datums, you
- pass the <code class="structname">TupleDesc</code> to <code class="function">BlessTupleDesc</code>,
- and then call <code class="function">heap_form_tuple</code> for each row. When working
- with C strings, you pass the <code class="structname">TupleDesc</code> to
- <code class="function">TupleDescGetAttInMetadata</code>, and then call
- <code class="function">BuildTupleFromCStrings</code> for each row. In the case of a
- function returning a set of tuples, the setup steps can all be done
- once during the first call of the function.
- </p><p>
- Several helper functions are available for setting up the needed
- <code class="structname">TupleDesc</code>. The recommended way to do this in most
- functions returning composite values is to call:
- </p><pre class="programlisting">
- TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
-                                    Oid *resultTypeId,
-                                    TupleDesc *resultTupleDesc)
- </pre><p>
- passing the same <code class="literal">fcinfo</code> struct passed to the calling function
- itself. (This of course requires that you use the version-1
- calling conventions.) <code class="varname">resultTypeId</code> can be specified
- as <code class="literal">NULL</code> or as the address of a local variable to receive the
- function's result type OID. <code class="varname">resultTupleDesc</code> should be the
- address of a local <code class="structname">TupleDesc</code> variable. Check that the
- result is <code class="literal">TYPEFUNC_COMPOSITE</code>; if so,
- <code class="varname">resultTupleDesc</code> has been filled with the needed
- <code class="structname">TupleDesc</code>. (If it is not, you can report an error along
- the lines of <span class="quote">“<span class="quote">function returning record called in context that
- cannot accept type record</span>”</span>.)
- </p><div class="tip"><h3 class="title">Tip</h3><p>
- <code class="function">get_call_result_type</code> can resolve the actual type of a
- polymorphic function result; so it is useful in functions that return
- scalar polymorphic results, not only functions that return composites.
- The <code class="varname">resultTypeId</code> output is primarily useful for functions
- returning polymorphic scalars.
- </p></div><div class="note"><h3 class="title">Note</h3><p>
- <code class="function">get_call_result_type</code> has a sibling
- <code class="function">get_expr_result_type</code>, which can be used to resolve the
- expected output type for a function call represented by an expression
- tree. This can be used when trying to determine the result type from
- outside the function itself. There is also
- <code class="function">get_func_result_type</code>, which can be used when only the
- function's OID is available. However, these functions are not able
- to deal with functions declared to return <code class="structname">record</code>, and
- <code class="function">get_func_result_type</code> cannot resolve polymorphic types,
- so you should preferentially use <code class="function">get_call_result_type</code>.
- </p></div><p>
- Older, now-deprecated functions for obtaining
- <code class="structname">TupleDesc</code>s are:
- </p><pre class="programlisting">
- TupleDesc RelationNameGetTupleDesc(const char *relname)
- </pre><p>
- to get a <code class="structname">TupleDesc</code> for the row type of a named relation,
- and:
- </p><pre class="programlisting">
- TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
- </pre><p>
- to get a <code class="structname">TupleDesc</code> based on a type OID. This can
- be used to get a <code class="structname">TupleDesc</code> for a base or
- composite type. It will not work for a function that returns
- <code class="structname">record</code>, however, and it cannot resolve polymorphic
- types.
- </p><p>
- Once you have a <code class="structname">TupleDesc</code>, call:
- </p><pre class="programlisting">
- TupleDesc BlessTupleDesc(TupleDesc tupdesc)
- </pre><p>
- if you plan to work with Datums, or:
- </p><pre class="programlisting">
- AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
- </pre><p>
- if you plan to work with C strings. If you are writing a
- set-returning function, you can save the results of these functions in the
- <code class="structname">FuncCallContext</code> structure — use the
- <code class="structfield">tuple_desc</code> or <code class="structfield">attinmeta</code> field
- respectively.
- </p><p>
- When working with Datums, use:
- </p><pre class="programlisting">
- HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
- </pre><p>
- to build a <code class="structname">HeapTuple</code> given user data in Datum form.
- </p><p>
- When working with C strings, use:
- </p><pre class="programlisting">
- HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
- </pre><p>
- to build a <code class="structname">HeapTuple</code> given user data
- in C string form. <em class="parameter"><code>values</code></em> is an array of C strings,
- one for each attribute of the return row. Each C string should be in
- the form expected by the input function of the attribute data
- type. In order to return a null value for one of the attributes,
- the corresponding pointer in the <em class="parameter"><code>values</code></em> array
- should be set to <code class="symbol">NULL</code>. This function will need to
- be called again for each row you return.
- </p><p>
- Once you have built a tuple to return from your function, it
- must be converted into a <code class="type">Datum</code>. Use:
- </p><pre class="programlisting">
- HeapTupleGetDatum(HeapTuple tuple)
- </pre><p>
- to convert a <code class="structname">HeapTuple</code> into a valid Datum. This
- <code class="type">Datum</code> can be returned directly if you intend to return
- just a single row, or it can be used as the current return value
- in a set-returning function.
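For the Datum-based path, a minimal sketch of a function returning a single row might look like the following. The function name demo_make_pair is hypothetical, and the sketch assumes the function is declared in SQL as STRICT with two integer arguments and a two-column integer result (for example through OUT parameters). It compiles only against the PostgreSQL server headers.

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "access/htup_details.h"    /* for heap_form_tuple() */

PG_FUNCTION_INFO_V1(demo_make_pair);

/*
 * Hypothetical example: return the two integer arguments as one row.
 * ASSUMPTION: the SQL declaration yields a composite result with two
 * integer columns, and the function is declared STRICT.
 */
Datum
demo_make_pair(PG_FUNCTION_ARGS)
{
    TupleDesc   tupdesc;
    Datum       values[2];
    bool        nulls[2] = {false, false};  /* set true for a NULL column */
    HeapTuple   tuple;

    /* Obtain the TupleDesc describing our result row. */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("function returning record called in context "
                        "that cannot accept type record")));

    /* Bless the descriptor before forming tuples from Datums. */
    tupdesc = BlessTupleDesc(tupdesc);

    values[0] = Int32GetDatum(PG_GETARG_INT32(0));
    values[1] = Int32GetDatum(PG_GETARG_INT32(1));

    tuple = heap_form_tuple(tupdesc, values, nulls);

    /* Convert the tuple into a Datum and return it. */
    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}
```

Compared with the C-string path, this avoids formatting each value as text and parsing it again through the type input functions.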
- </p><p>
- An example appears in the next section.
- </p></div><div class="sect2" id="XFUNC-C-RETURN-SET"><div class="titlepage"><div><div><h3 class="title">37.10.8. Returning Sets</h3></div></div></div><p>
- C-language functions have two options for returning sets (multiple
- rows). In one method, called <em class="firstterm">ValuePerCall</em>
- mode, a set-returning function is called repeatedly (passing the same
- arguments each time) and it returns one new row on each call, until
- it has no more rows to return and signals that by returning NULL.
- The set-returning function (<acronym class="acronym">SRF</acronym>) must therefore
- save enough state across calls to remember what it was doing and
- return the correct next item on each call.
- In the other method, called <em class="firstterm">Materialize</em> mode,
- an SRF fills and returns a tuplestore object containing its
- entire result; then only one call occurs for the whole result, and
- no inter-call state is needed.
- </p><p>
- When using ValuePerCall mode, it is important to remember that the
- query is not guaranteed to be run to completion; that is, due to
- options such as <code class="literal">LIMIT</code>, the executor might stop
- making calls to the set-returning function before all rows have been
- fetched. This means it is not safe to perform cleanup activities in
- the last call, because that might not ever happen. It's recommended
- to use Materialize mode for functions that need access to external
- resources, such as file descriptors.
- </p><p>
- The remainder of this section documents a set of helper macros that
- are commonly used (though not required to be used) for SRFs using
- ValuePerCall mode. Additional details about Materialize mode can be
- found in <code class="filename">src/backend/utils/fmgr/README</code>. Also,
- the <code class="filename">contrib</code> modules in
- the <span class="productname">PostgreSQL</span> source distribution contain
- many examples of SRFs using both ValuePerCall and Materialize mode.
- </p><p>
- To use the ValuePerCall support macros described here,
- include <code class="filename">funcapi.h</code>. These macros work with a
- structure <code class="structname">FuncCallContext</code> that contains the
- state that needs to be saved across calls. Within the calling
- SRF, <code class="literal">fcinfo->flinfo->fn_extra</code> is used to
- hold a pointer to <code class="structname">FuncCallContext</code> across
- calls. The macros automatically fill that field on first use,
- and expect to find the same pointer there on subsequent uses.
- </p><pre class="programlisting">
- typedef struct FuncCallContext
- {
-     /*
-      * Number of times we've been called before
-      *
-      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
-      * incremented for you every time SRF_RETURN_NEXT() is called.
-      */
-     uint64 call_cntr;
-
-     /*
-      * OPTIONAL maximum number of calls
-      *
-      * max_calls is here for convenience only and setting it is optional.
-      * If not set, you must provide alternative means to know when the
-      * function is done.
-      */
-     uint64 max_calls;
-
-     /*
-      * OPTIONAL pointer to miscellaneous user-provided context information
-      *
-      * user_fctx is for use as a pointer to your own data to retain
-      * arbitrary context information between calls of your function.
-      */
-     void *user_fctx;
-
-     /*
-      * OPTIONAL pointer to struct containing attribute type input metadata
-      *
-      * attinmeta is for use when returning tuples (i.e., composite data types)
-      * and is not used when returning base data types. It is only needed
-      * if you intend to use BuildTupleFromCStrings() to create the return
-      * tuple.
-      */
-     AttInMetadata *attinmeta;
-
-     /*
-      * memory context used for structures that must live for multiple calls
-      *
-      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
-      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
-      * context for any memory that is to be reused across multiple calls
-      * of the SRF.
-      */
-     MemoryContext multi_call_memory_ctx;
-
-     /*
-      * OPTIONAL pointer to struct containing tuple description
-      *
-      * tuple_desc is for use when returning tuples (i.e., composite data types)
-      * and is only needed if you are going to build the tuples with
-      * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
-      * the TupleDesc pointer stored here should usually have been run through
-      * BlessTupleDesc() first.
-      */
-     TupleDesc tuple_desc;
-
- } FuncCallContext;
- </pre><p>
- </p><p>
- The macros to be used by an <acronym class="acronym">SRF</acronym> using this
- infrastructure are:
- </p><pre class="programlisting">
- SRF_IS_FIRSTCALL()
- </pre><p>
- Use this to determine if your function is being called for the first or a
- subsequent time. On the first call (only), call:
- </p><pre class="programlisting">
- SRF_FIRSTCALL_INIT()
- </pre><p>
- to initialize the <code class="structname">FuncCallContext</code>. On every function call,
- including the first, call:
- </p><pre class="programlisting">
- SRF_PERCALL_SETUP()
- </pre><p>
- to set up for using the <code class="structname">FuncCallContext</code>.
- </p><p>
- If your function has data to return in the current call, use:
- </p><pre class="programlisting">
- SRF_RETURN_NEXT(funcctx, result)
- </pre><p>
- to return it to the caller. (<code class="literal">result</code> must be of type
- <code class="type">Datum</code>, either a single value or a tuple prepared as
- described above.) Finally, when your function is finished
- returning data, use:
- </p><pre class="programlisting">
- SRF_RETURN_DONE(funcctx)
- </pre><p>
- to clean up and end the <acronym class="acronym">SRF</acronym>.
- </p><p>
- The memory context that is current when the <acronym class="acronym">SRF</acronym> is called is
- a transient context that will be cleared between calls. This means
- that you do not need to call <code class="function">pfree</code> on everything
- you allocated using <code class="function">palloc</code>; it will go away anyway. However, if you want to allocate
- any data structures to live across calls, you need to put them somewhere
- else. The memory context referenced by
- <code class="structfield">multi_call_memory_ctx</code> is a suitable location for any
- data that needs to survive until the <acronym class="acronym">SRF</acronym> is finished running. In most
- cases, this means that you should switch into
- <code class="structfield">multi_call_memory_ctx</code> while doing the
- first-call setup.
- Use <code class="literal">funcctx->user_fctx</code> to hold a pointer to
- any such cross-call data structures.
- (Data you allocate
- in <code class="structfield">multi_call_memory_ctx</code> will go away
- automatically when the query ends, so it is not necessary to free
- that data manually, either.)
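A minimal sketch of keeping cross-call state in user_fctx follows. The function demo_countdown and the DemoState structure are hypothetical names, and the sketch assumes an SQL declaration along the lines of RETURNS SETOF integer with STRICT. It compiles only against the PostgreSQL server headers.

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(demo_countdown);

/* Hypothetical cross-call state, allocated in multi_call_memory_ctx. */
typedef struct DemoState
{
    int32       next;           /* next value to return */
} DemoState;

/*
 * Hypothetical SRF: demo_countdown(n) returns n, n-1, ..., 1.
 * ASSUMPTION: declared in SQL as RETURNS SETOF integer, STRICT.
 */
Datum
demo_countdown(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    DemoState  *state;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();

        /* Allocate the state where it survives between calls. */
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
        state = (DemoState *) palloc0(sizeof(DemoState));
        state->next = PG_GETARG_INT32(0);
        funcctx->user_fctx = state;
        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();
    state = (DemoState *) funcctx->user_fctx;

    if (state->next > 0)
    {
        Datum       result = Int32GetDatum(state->next);

        state->next--;
        SRF_RETURN_NEXT(funcctx, result);
    }

    /* No pfree needed: the state goes away with its memory context. */
    SRF_RETURN_DONE(funcctx);
}
```

Anything allocated after the switch back to the old context lives in the transient per-call context and must not be referenced from DemoState.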
- </p><div class="warning"><h3 class="title">Warning</h3><p>
- While the actual arguments to the function remain unchanged between
- calls, if you detoast the argument values (which is normally done
- transparently by the
- <code class="function">PG_GETARG_<em class="replaceable"><code>xxx</code></em></code> macro)
- in the transient context then the detoasted copies will be freed on
- each cycle. Accordingly, if you keep references to such values in
- your <code class="structfield">user_fctx</code>, you must either copy them into the
- <code class="structfield">multi_call_memory_ctx</code> after detoasting, or ensure
- that you detoast the values only in that context.
- </p></div><p>
- A complete pseudo-code example looks like the following:
- </p><pre class="programlisting">
- Datum
- my_set_returning_function(PG_FUNCTION_ARGS)
- {
-     FuncCallContext  *funcctx;
-     Datum             result;
-     <em class="replaceable"><code>further declarations as needed</code></em>
-
-     if (SRF_IS_FIRSTCALL())
-     {
-         MemoryContext oldcontext;
-
-         funcctx = SRF_FIRSTCALL_INIT();
-         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-         /* One-time setup code appears here: */
-         <em class="replaceable"><code>user code</code></em>
-         <em class="replaceable"><code>if returning composite</code></em>
-             <em class="replaceable"><code>build TupleDesc, and perhaps AttInMetadata</code></em>
-         <em class="replaceable"><code>endif returning composite</code></em>
-         <em class="replaceable"><code>user code</code></em>
-         MemoryContextSwitchTo(oldcontext);
-     }
-
-     /* Each-time setup code appears here: */
-     <em class="replaceable"><code>user code</code></em>
-     funcctx = SRF_PERCALL_SETUP();
-     <em class="replaceable"><code>user code</code></em>
-
-     /* this is just one way we might test whether we are done: */
-     if (funcctx->call_cntr < funcctx->max_calls)
-     {
-         /* Here we want to return another item: */
-         <em class="replaceable"><code>user code</code></em>
-         <em class="replaceable"><code>obtain result Datum</code></em>
-         SRF_RETURN_NEXT(funcctx, result);
-     }
-     else
-     {
-         /* Here we are done returning items, so just report that fact. */
-         /* (Resist the temptation to put cleanup code here.) */
-         SRF_RETURN_DONE(funcctx);
-     }
- }
- </pre><p>
- </p><p>
- A complete example of a simple <acronym class="acronym">SRF</acronym> returning a composite type
- looks like:
- </p><pre class="programlisting">
- #include "postgres.h"
- #include "fmgr.h"
- #include "funcapi.h"
-
- PG_FUNCTION_INFO_V1(retcomposite);
-
- Datum
- retcomposite(PG_FUNCTION_ARGS)
- {
-     FuncCallContext     *funcctx;
-     int                  call_cntr;
-     int                  max_calls;
-     TupleDesc            tupdesc;
-     AttInMetadata       *attinmeta;
-
-     /* stuff done only on the first call of the function */
-     if (SRF_IS_FIRSTCALL())
-     {
-         MemoryContext   oldcontext;
-
-         /* create a function context for cross-call persistence */
-         funcctx = SRF_FIRSTCALL_INIT();
-
-         /* switch to memory context appropriate for multiple function calls */
-         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-
-         /* total number of tuples to be returned (SQL integer is signed) */
-         funcctx->max_calls = PG_GETARG_INT32(0);
-
-         /* Build a tuple descriptor for our result type */
-         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
-             ereport(ERROR,
-                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-                      errmsg("function returning record called in context "
-                             "that cannot accept type record")));
-
-         /*
-          * generate attribute metadata needed later to produce tuples from raw
-          * C strings
-          */
-         attinmeta = TupleDescGetAttInMetadata(tupdesc);
-         funcctx->attinmeta = attinmeta;
-
-         MemoryContextSwitchTo(oldcontext);
-     }
-
-     /* stuff done on every call of the function */
-     funcctx = SRF_PERCALL_SETUP();
-
-     call_cntr = funcctx->call_cntr;
-     max_calls = funcctx->max_calls;
-     attinmeta = funcctx->attinmeta;
-
-     if (call_cntr < max_calls)    /* do when there is more left to send */
-     {
-         char       **values;
-         HeapTuple    tuple;
-         Datum        result;
-
-         /*
-          * Prepare a values array for building the returned tuple.
-          * This should be an array of C strings which will
-          * be processed later by the type input functions.
-          */
-         values = (char **) palloc(3 * sizeof(char *));
-         values[0] = (char *) palloc(16 * sizeof(char));
-         values[1] = (char *) palloc(16 * sizeof(char));
-         values[2] = (char *) palloc(16 * sizeof(char));
-
-         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
-         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
-         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
-
-         /* build a tuple */
-         tuple = BuildTupleFromCStrings(attinmeta, values);
-
-         /* make the tuple into a datum */
-         result = HeapTupleGetDatum(tuple);
-
-         /* clean up (this is not really necessary) */
-         pfree(values[0]);
-         pfree(values[1]);
-         pfree(values[2]);
-         pfree(values);
-
-         SRF_RETURN_NEXT(funcctx, result);
-     }
-     else    /* do when there is no more left */
-     {
-         SRF_RETURN_DONE(funcctx);
-     }
- }
-
- </pre><p>
-
- One way to declare this function in SQL is:
- </p><pre class="programlisting">
- CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
-
- CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
- RETURNS SETOF __retcomposite
- AS '<em class="replaceable"><code>filename</code></em>', 'retcomposite'
- LANGUAGE C IMMUTABLE STRICT;
- </pre><p>
- A different way is to use OUT parameters:
- </p><pre class="programlisting">
- CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
- OUT f1 integer, OUT f2 integer, OUT f3 integer)
- RETURNS SETOF record
- AS '<em class="replaceable"><code>filename</code></em>', 'retcomposite'
- LANGUAGE C IMMUTABLE STRICT;
- </pre><p>
- Notice that in this method the output type of the function is formally
- an anonymous <code class="structname">record</code> type.
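- </p><p>
- The control flow of the example above can also be modeled outside the
- server. The following is a minimal sketch in plain C that needs no
- <span class="productname">PostgreSQL</span> headers. The names
- <code class="literal">demo_ctx</code>, <code class="literal">demo_first_call</code>,
- <code class="literal">demo_next_row</code>, and <code class="literal">demo_count_rows</code>
- are hypothetical; they only mimic the roles of
- <code class="structname">FuncCallContext</code>, the first-call branch, and the
- per-call branch. It illustrates the protocol under those assumptions and
- is not server code.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for FuncCallContext: state kept across calls. */
typedef struct demo_ctx
{
    uint64_t call_cntr;     /* rows returned so far */
    uint64_t max_calls;     /* total rows to return */
} demo_ctx;

/* Models the first-call branch: set up the cross-call state once. */
static void
demo_first_call(demo_ctx *ctx, uint32_t max_calls)
{
    ctx->call_cntr = 0;
    ctx->max_calls = max_calls;
}

/*
 * Models one per-call pass.  Returns true and fills row[] while rows
 * remain (the SRF_RETURN_NEXT case), and false once the set is
 * exhausted (the SRF_RETURN_DONE case).
 */
static bool
demo_next_row(demo_ctx *ctx, int32_t multiplier, int32_t row[3])
{
    if (ctx->call_cntr >= ctx->max_calls)
        return false;

    row[0] = 1 * multiplier;
    row[1] = 2 * multiplier;
    row[2] = 3 * multiplier;

    ctx->call_cntr++;       /* the counter advances once per returned row */
    return true;
}

/* Drives the model the way a caller would; returns the number of rows. */
static int
demo_count_rows(uint32_t max_calls, int32_t multiplier)
{
    demo_ctx    ctx;
    int32_t     row[3];
    int         nrows = 0;

    demo_first_call(&ctx, max_calls);
    while (demo_next_row(&ctx, multiplier, row))
    {
        assert(row[2] == 3 * multiplier);
        nrows++;
    }
    return nrows;
}
```
- </pre><p>
- Under this model, the equivalent of <code class="literal">retcomposite(2, 5)</code>
- yields the row <code class="literal">(5, 10, 15)</code> twice and then reports that
- the set is complete.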
- </p></div><div class="sect2" id="id-1.8.3.13.13"><div class="titlepage"><div><div><h3 class="title">37.10.9. Polymorphic Arguments and Return Types</h3></div></div></div><p>
- C-language functions can be declared to accept and
- return the polymorphic types
- <code class="type">anyelement</code>, <code class="type">anyarray</code>, <code class="type">anynonarray</code>,
- <code class="type">anyenum</code>, and <code class="type">anyrange</code>.
- See <a class="xref" href="extend-type-system.html#EXTEND-TYPES-POLYMORPHIC" title="37.2.5. Polymorphic Types">Section 37.2.5</a> for a more detailed explanation
- of polymorphic functions. When function arguments or return types
- are defined as polymorphic types, the function author cannot know
- in advance what data type the function will be called with, or what type
- it will need to return. Two routines are provided in <code class="filename">fmgr.h</code>
- to allow a version-1 C function to discover the actual data types
- of its arguments and the type it is expected to return. The routines are
- called <code class="literal">get_fn_expr_rettype(FmgrInfo *flinfo)</code> and
- <code class="literal">get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</code>.
- They return the result or argument type OID, or <code class="symbol">InvalidOid</code> if the
- information is not available.
- The structure <code class="literal">flinfo</code> is normally accessed as
- <code class="literal">fcinfo->flinfo</code>. The parameter <code class="literal">argnum</code>
- is zero based. <code class="function">get_call_result_type</code> can also be used
- as an alternative to <code class="function">get_fn_expr_rettype</code>.
- There is also <code class="function">get_fn_expr_variadic</code>, which can be used to
- find out whether variadic arguments have been merged into an array.
- This is primarily useful for <code class="literal">VARIADIC "any"</code> functions,
- since such merging will always have occurred for variadic functions
- taking ordinary array types.
- </p><p>
- For example, suppose we want to write a function to accept a single
- element of any type, and return a one-dimensional array of that type:
-
- </p><pre class="programlisting">
- PG_FUNCTION_INFO_V1(make_array);
- Datum
- make_array(PG_FUNCTION_ARGS)
- {
- ArrayType *result;
- Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
- Datum element;
- bool isnull;
- int16 typlen;
- bool typbyval;
- char typalign;
- int ndims;
- int dims[MAXDIM];
- int lbs[MAXDIM];
-
- if (!OidIsValid(element_type))
- elog(ERROR, "could not determine data type of input");
-
- /* get the provided element, being careful in case it's NULL */
- isnull = PG_ARGISNULL(0);
- if (isnull)
- element = (Datum) 0;
- else
- element = PG_GETARG_DATUM(0);
-
- /* we have one dimension */
- ndims = 1;
- /* and one element */
- dims[0] = 1;
- /* and lower bound is 1 */
- lbs[0] = 1;
-
- /* get required info about the element type */
- get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
-
- /* now build the array */
- result = construct_md_array(&element, &isnull, ndims, dims, lbs,
- element_type, typlen, typbyval, typalign);
-
- PG_RETURN_ARRAYTYPE_P(result);
- }
- </pre><p>
- </p><p>
- The following command declares the function
- <code class="function">make_array</code> in SQL:
-
- </p><pre class="programlisting">
- CREATE FUNCTION make_array(anyelement) RETURNS anyarray
- AS '<em class="replaceable"><code>DIRECTORY</code></em>/funcs', 'make_array'
- LANGUAGE C IMMUTABLE;
- </pre><p>
- </p><p>
- There is a variant of polymorphism that is only available to C-language
- functions: they can be declared to take parameters of type
- <code class="literal">"any"</code>. (Note that this type name must be double-quoted,
- since it's also a SQL reserved word.) This works like
- <code class="type">anyelement</code> except that it does not constrain different
- <code class="literal">"any"</code> arguments to be the same type, nor do they help
- determine the function's result type. A C-language function can also
- declare its final parameter to be <code class="literal">VARIADIC "any"</code>. This will
- match one or more actual arguments of any type (not necessarily the same
- type). These arguments will <span class="emphasis"><em>not</em></span> be gathered into an array
- as happens with normal variadic functions; they will just be passed to
- the function separately. The <code class="function">PG_NARGS()</code> macro and the
- methods described above must be used to determine the number of actual
- arguments and their types when using this feature. Also, users of such
- a function might wish to use the <code class="literal">VARIADIC</code> keyword in their
- function call, with the expectation that the function would treat the
- array elements as separate arguments. The function itself must implement
- that behavior if wanted, after using <code class="function">get_fn_expr_variadic</code> to
- detect that the actual argument was marked with <code class="literal">VARIADIC</code>.
- </p></div><div class="sect2" id="id-1.8.3.13.14"><div class="titlepage"><div><div><h3 class="title">37.10.10. Shared Memory and LWLocks</h3></div></div></div><p>
- Add-ins can reserve LWLocks and an allocation of shared memory on server
- startup. The add-in's shared library must be preloaded by specifying
- it in
- <a class="xref" href="runtime-config-client.html#GUC-SHARED-PRELOAD-LIBRARIES">shared_preload_libraries</a><a id="id-1.8.3.13.14.2.2" class="indexterm"></a>.
- Shared memory is reserved by calling:
- </p><pre class="programlisting">
- void RequestAddinShmemSpace(Size size)
- </pre><p>
- from your <code class="function">_PG_init</code> function.
- </p><p>
- LWLocks are reserved by calling:
- </p><pre class="programlisting">
- void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
- </pre><p>
- from <code class="function">_PG_init</code>. This will ensure that an array of
- <code class="literal">num_lwlocks</code> LWLocks is available under the name
- <code class="literal">tranche_name</code>. Use <code class="function">GetNamedLWLockTranche</code>
- to get a pointer to this array.
- </p><p>
- To avoid possible race conditions, each backend should use the LWLock
- <code class="function">AddinShmemInitLock</code> when connecting to and initializing
- its allocation of shared memory, as shown here:
- </p><pre class="programlisting">
- static mystruct *ptr = NULL;
-
- if (!ptr)
- {
- bool found;
-
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
- ptr = ShmemInitStruct("my struct name", size, &found);
- if (!found)
- {
- /* initialize the contents of the shmem area here */
- /* then acquire any requested LWLocks: */
- ptr->locks = GetNamedLWLockTranche("my tranche name");
- }
- LWLockRelease(AddinShmemInitLock);
- }
- </pre><p>
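- The role of the <code class="literal">found</code> flag in the fragment above can be
- shown with a small standalone model. The sketch below is plain C and does
- not use any <span class="productname">PostgreSQL</span> headers:
- <code class="literal">demo_shmem_init_struct</code>, <code class="literal">demo_attach</code>,
- and <code class="literal">demo_counter</code> are hypothetical names, and the
- registry is an ordinary array in process memory rather than shared memory.
- It assumes only the lookup-or-create behavior that the fragment relies on;
- returning <code class="literal">NULL</code> on failure is a simplification of this
- sketch.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_STRUCTS 8

/* Hypothetical registry standing in for the index of named areas. */
typedef struct demo_entry
{
    const char *name;       /* must outlive the registry (literals do) */
    void       *addr;
} demo_entry;

static demo_entry demo_index[DEMO_MAX_STRUCTS];
static int  demo_nentries = 0;

/*
 * Models the lookup-or-create contract: return the existing area and set
 * *found = true, or allocate a new one and set *found = false.
 */
static void *
demo_shmem_init_struct(const char *name, size_t size, bool *found)
{
    int         i;
    void       *addr;

    for (i = 0; i < demo_nentries; i++)
    {
        if (strcmp(demo_index[i].name, name) == 0)
        {
            *found = true;
            return demo_index[i].addr;
        }
    }

    *found = false;
    if (demo_nentries == DEMO_MAX_STRUCTS)
        return NULL;
    addr = malloc(size);
    if (addr == NULL)
        return NULL;

    demo_index[demo_nentries].name = name;
    demo_index[demo_nentries].addr = addr;
    demo_nentries++;
    return addr;
}

typedef struct demo_counter
{
    int         value;
} demo_counter;

/* Every caller attaches the same way; only the first one initializes. */
static demo_counter *
demo_attach(void)
{
    bool        found;
    demo_counter *ptr;

    /* a real add-in would hold AddinShmemInitLock around these steps */
    ptr = demo_shmem_init_struct("demo counter", sizeof(demo_counter), &found);
    if (ptr != NULL && !found)
        ptr->value = 0;     /* first attach: initialize the contents */
    return ptr;
}
```
- </pre><p>
- In this model a second call to <code class="literal">demo_attach</code> returns the
- pointer that the first call initialized and leaves its contents untouched,
- which is the reason the initialization is guarded by
- <code class="literal">!found</code>.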
- </p></div><div class="sect2" id="EXTEND-CPP"><div class="titlepage"><div><div><h3 class="title">37.10.11. Using C++ for Extensibility</h3></div></div></div><a id="id-1.8.3.13.15.2" class="indexterm"></a><p>
- Although the <span class="productname">PostgreSQL</span> backend is written in
- C, it is possible to write extensions in C++ if these guidelines are
- followed:
-
- </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
- All functions accessed by the backend must present a C interface
- to the backend; these C functions can then call C++ functions.
- For example, <code class="literal">extern "C"</code> linkage is required for
- backend-accessed functions. This is also necessary for any
- functions that are passed as pointers between the backend and
- C++ code.
- </p></li><li class="listitem"><p>
- Free memory using the appropriate deallocation method. For example,
- most backend memory is allocated using <code class="function">palloc()</code>, so use
- <code class="function">pfree()</code> to free it. Using C++
- <code class="function">delete</code> in such cases will fail.
- </p></li><li class="listitem"><p>
- Prevent exceptions from propagating into the C code (use a catch-all
- block at the top level of all <code class="literal">extern "C"</code> functions). This
- is necessary even if the C++ code does not explicitly throw any
- exceptions, because events like out-of-memory can still throw
- exceptions. Any exceptions must be caught and appropriate errors
- passed back to the C interface. If possible, compile C++ with
- <code class="option">-fno-exceptions</code> to eliminate exceptions entirely; in such
- cases, you must check for failures in your C++ code, e.g. check for
- NULL returned by <code class="function">new()</code>.
- </p></li><li class="listitem"><p>
- If calling backend functions from C++ code, be sure that the
- C++ call stack contains only plain old data structures
- (<acronym class="acronym">POD</acronym>). This is necessary because backend errors
- generate a distant <code class="function">longjmp()</code> that does not properly
- unwind a C++ call stack with non-POD objects.
- </p></li></ul></div><p>
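- The last guideline can be demonstrated with standard C alone. The sketch
- below uses <code class="function">setjmp()</code> and
- <code class="function">longjmp()</code> from <code class="filename">setjmp.h</code>;
- the names <code class="literal">demo_backend_call</code>,
- <code class="literal">demo_extension_code</code>, and <code class="literal">demo_run</code>
- are hypothetical and merely stand in for a backend function that reports an
- error, extension code that calls it, and the place where the error is
- handled. It is a simplified model, not the backend's actual error
- machinery.
- </p><pre class="programlisting">
```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for the place where errors are handled. */
static jmp_buf demo_error_jump;

/* Records whether the code after the backend call was reached. */
static bool demo_cleanup_ran;

/* Models a backend function that reports an error by jumping away. */
static void
demo_backend_call(bool fail)
{
    if (fail)
        longjmp(demo_error_jump, 1);
}

/* Models extension code whose cleanup sits after the backend call. */
static void
demo_extension_code(bool fail)
{
    demo_backend_call(fail);

    /*
     * Not reached when demo_backend_call() jumps.  Cleanup that C++ code
     * would leave to a destructor in this frame is bypassed the same way.
     */
    demo_cleanup_ran = true;
}

/* Runs the extension code; returns true if the error path was taken. */
static bool
demo_run(bool fail)
{
    demo_cleanup_ran = false;
    if (setjmp(demo_error_jump) != 0)
        return true;        /* arrived here through longjmp() */
    demo_extension_code(fail);
    return false;
}
```
- </pre><p>
- When <code class="literal">demo_run(true)</code> is called, control returns to the
- <code class="function">setjmp()</code> site and <code class="literal">demo_cleanup_ran</code>
- stays false, which is why frames between the backend call and the error
- handler should hold nothing that depends on cleanup code running.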
- </p><p>
- In summary, it is best to place C++ code behind a wall of
- <code class="literal">extern "C"</code> functions that interface to the backend,
- and avoid exception, memory, and call stack leakage.
- </p></div></div></body></html>
|