|
- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>18.4. Managing Kernel Resources</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets V1.79.1" /><link rel="prev" href="server-start.html" title="18.3. Starting the Database Server" /><link rel="next" href="server-shutdown.html" title="18.5. Shutting Down the Server" /></head><body><div class="sect1" id="KERNEL-RESOURCES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">18.4. Managing Kernel Resources</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="kernel-resources.html#SYSVIPC">18.4.1. Shared Memory and Semaphores</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#SYSTEMD-REMOVEIPC">18.4.2. systemd RemoveIPC</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#id-1.6.5.6.5">18.4.3. 
Resource Limits</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-MEMORY-OVERCOMMIT">18.4.4. Linux Memory Overcommit</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-HUGE-PAGES">18.4.5. Linux Huge Pages</a></span></dt></dl></div><p>
- <span class="productname">PostgreSQL</span> can sometimes exhaust various operating system
- resource limits, especially when multiple copies of the server are running
- on the same system, or in very large installations. This section explains
- the kernel resources used by <span class="productname">PostgreSQL</span> and the steps you
- can take to resolve problems related to kernel resource consumption.
- </p><div class="sect2" id="SYSVIPC"><div class="titlepage"><div><div><h3 class="title">18.4.1. Shared Memory and Semaphores</h3></div></div></div><a id="id-1.6.5.6.3.2" class="indexterm"></a><a id="id-1.6.5.6.3.3" class="indexterm"></a><p>
- <span class="productname">PostgreSQL</span> requires the operating system to provide
- inter-process communication (<acronym class="acronym">IPC</acronym>) features, specifically
- shared memory and semaphores. Unix-derived systems typically provide
- <span class="quote">“<span class="quote"><span class="systemitem">System V</span></span>”</span> <acronym class="acronym">IPC</acronym>,
- <span class="quote">“<span class="quote"><span class="systemitem">POSIX</span></span>”</span> <acronym class="acronym">IPC</acronym>, or both.
- <span class="systemitem">Windows</span> has its own implementation of
- these features and is not discussed here.
- </p><p>
- The complete lack of these facilities is usually manifested by an
- <span class="quote">“<span class="quote"><span class="errorname">Illegal system call</span></span>”</span> error upon server
- start. In that case there is no alternative but to reconfigure your
- kernel. <span class="productname">PostgreSQL</span> won't work without them.
- This situation is rare, however, among modern operating systems.
- </p><p>
- By default, <span class="productname">PostgreSQL</span> allocates
- a very small amount of System V shared memory, as well as a much larger
- amount of anonymous <code class="function">mmap</code> shared memory.
- Alternatively, a single large System V shared memory region can be used
- (see <a class="xref" href="runtime-config-resource.html#GUC-SHARED-MEMORY-TYPE">shared_memory_type</a>).
-
- In addition, a significant number of semaphores, which can be either
- System V or POSIX style, are created at server startup. Currently,
- POSIX semaphores are used on Linux and FreeBSD systems while other
- platforms use System V semaphores.
- </p><div class="note"><h3 class="title">Note</h3><p>
- Prior to <span class="productname">PostgreSQL</span> 9.3, only System V shared memory
- was used, so the amount of System V shared memory required to start the
- server was much larger. If you are running an older version of the
- server, please consult the documentation for your server version.
- </p></div><p>
- System V <acronym class="acronym">IPC</acronym> features are typically constrained by
- system-wide allocation limits.
- When <span class="productname">PostgreSQL</span> exceeds one of these limits,
- the server will refuse to start and
- should leave an instructive error message describing the problem
- and what to do about it. (See also <a class="xref" href="server-start.html#SERVER-START-FAILURES" title="18.3.1. Server Start-up Failures">Section 18.3.1</a>.) The relevant kernel
- parameters are named consistently across different systems; <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a> gives an overview. The methods to set
- them, however, vary. Suggestions for some platforms are given below.
- </p><div class="table" id="SYSVIPC-PARAMETERS"><p class="title"><strong>Table 18.1. <span class="systemitem">System V</span> <acronym class="acronym">IPC</acronym> Parameters</strong></p><div class="table-contents"><table class="table" summary="System V IPC Parameters" border="1"><colgroup><col /><col /><col /></colgroup><thead><tr><th>Name</th><th>Description</th><th>Values needed to run one <span class="productname">PostgreSQL</span> instance</th></tr></thead><tbody><tr><td><code class="varname">SHMMAX</code></td><td>Maximum size of shared memory segment (bytes)</td><td>at least 1kB, but the default is usually much higher</td></tr><tr><td><code class="varname">SHMMIN</code></td><td>Minimum size of shared memory segment (bytes)</td><td>1</td></tr><tr><td><code class="varname">SHMALL</code></td><td>Total amount of shared memory available (bytes or pages)</td><td>same as <code class="varname">SHMMAX</code> if bytes,
- or <code class="literal">ceil(SHMMAX/PAGE_SIZE)</code> if pages,
- plus room for other applications</td></tr><tr><td><code class="varname">SHMSEG</code></td><td>Maximum number of shared memory segments per process</td><td>only 1 segment is needed, but the default is much higher</td></tr><tr><td><code class="varname">SHMMNI</code></td><td>Maximum number of shared memory segments system-wide</td><td>like <code class="varname">SHMSEG</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNI</code></td><td>Maximum number of semaphore identifiers (i.e., sets)</td><td>at least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNS</code></td><td>Maximum number of semaphores system-wide</td><td><code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMSL</code></td><td>Maximum number of semaphores per set</td><td>at least 17</td></tr><tr><td><code class="varname">SEMMAP</code></td><td>Number of entries in semaphore map</td><td>see text</td></tr><tr><td><code class="varname">SEMVMX</code></td><td>Maximum value of semaphore</td><td>at least 1000 (The default is often 32767; do not change unless necessary)</td></tr></tbody></table></div></div><br class="table-break" /><p>
- <span class="productname">PostgreSQL</span> requires a few bytes of System V shared memory
- (typically 48 bytes, on 64-bit platforms) for each copy of the server.
- On most modern operating systems, this amount can easily be allocated.
- However, if you are running many copies of the server or you explicitly
- configure the server to use large amounts of System V shared memory (see
- <a class="xref" href="runtime-config-resource.html#GUC-SHARED-MEMORY-TYPE">shared_memory_type</a> and <a class="xref" href="runtime-config-resource.html#GUC-DYNAMIC-SHARED-MEMORY-TYPE">dynamic_shared_memory_type</a>), it may be necessary to
- increase <code class="varname">SHMALL</code>, which is the total amount of System V shared
- memory system-wide. Note that <code class="varname">SHMALL</code> is measured in pages
- rather than bytes on many systems.
- </p><p>
- Less likely to cause problems is the minimum size for shared
- memory segments (<code class="varname">SHMMIN</code>), which should be at most
- approximately 32 bytes for <span class="productname">PostgreSQL</span> (it is
- usually just 1). The limits on the number of segments system-wide
- (<code class="varname">SHMMNI</code>) and per-process (<code class="varname">SHMSEG</code>) are unlikely
- to cause a problem unless your system has them set to zero.
- </p><p>
- When using System V semaphores,
- <span class="productname">PostgreSQL</span> uses one semaphore per allowed connection
- (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
- (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>), allowed WAL sender process
- (<a class="xref" href="runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</a>), and allowed background
- process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>), in sets of 16.
- Each such set will
- also contain a 17th semaphore which contains a <span class="quote">“<span class="quote">magic
- number</span>”</span>, to detect collision with semaphore sets used by
- other applications. The maximum number of semaphores in the system
- is set by <code class="varname">SEMMNS</code>, which consequently must be at least
- as high as <code class="varname">max_connections</code> plus
- <code class="varname">autovacuum_max_workers</code> plus <code class="varname">max_wal_senders</code>,
- plus <code class="varname">max_worker_processes</code>, plus one extra for each 16
- allowed connections plus workers (see the formula in <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 18.1. System V IPC Parameters">Table 18.1</a>). The parameter <code class="varname">SEMMNI</code>
- determines the limit on the number of semaphore sets that can
- exist on the system at one time. Hence this parameter must be at
- least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</code>.
- Failures from the function <code class="function">semget</code>, which are
- usually reported with the confusing message <span class="quote">“<span class="quote">No space
- left on device</span>”</span>, can be worked around temporarily by lowering
- the number of allowed connections.
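- </p><p>
- The two formulas from Table 18.1 can be evaluated with a short script. The
- following is a minimal sketch; the function name
- <code class="literal">sysv_semaphore_needs</code> and the sample settings are illustrative
- assumptions, and the results cover one server only, without room for other applications.
- </p><pre class="programlisting">
```python
import math


def sysv_semaphore_needs(max_connections, autovacuum_max_workers,
                         max_wal_senders, max_worker_processes):
    """Return the minimum (SEMMNI, SEMMNS) for one server, per Table 18.1."""
    # One semaphore per allowed process, plus the constant 5 from the table.
    procs = (max_connections + autovacuum_max_workers
             + max_wal_senders + max_worker_processes + 5)
    # Semaphores are allocated in sets of 16.
    sets = math.ceil(procs / 16)
    # Each set also holds a 17th "magic number" semaphore.
    return sets, sets * 17


# Illustrative settings, not a recommendation.
semmni, semmns = sysv_semaphore_needs(100, 3, 10, 8)
print(f"SEMMNI at least {semmni}, SEMMNS at least {semmns}")
```
- </pre><p>
- Add the needs of any other servers and applications on the same machine to
- these figures before comparing them with the kernel limits.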
- </p><p>
- In some cases it might also be necessary to increase
- <code class="varname">SEMMAP</code> to be at least on the order of
- <code class="varname">SEMMNS</code>. If the system has this parameter
- (many do not), it defines the size of the semaphore
- resource map, in which each contiguous block of available semaphores
- needs an entry. When a semaphore set is freed it is either added to
- an existing entry that is adjacent to the freed block or it is
- registered under a new map entry. If the map is full, the freed
- semaphores get lost (until reboot). Fragmentation of the semaphore
- space could over time lead to fewer available semaphores than there
- should be.
- </p><p>
- Various other settings related to <span class="quote">“<span class="quote">semaphore undo</span>”</span>, such as
- <code class="varname">SEMMNU</code> and <code class="varname">SEMUME</code>, do not affect
- <span class="productname">PostgreSQL</span>.
- </p><p>
- When using POSIX semaphores, the number of semaphores needed is the
- same as for System V, that is one semaphore per allowed connection
- (<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
- (<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>), allowed WAL sender process
- (<a class="xref" href="runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</a>), and allowed background
- process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>).
- On the platforms where this option is preferred, there is no specific
- kernel limit on the number of POSIX semaphores.
- </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="systemitem">AIX</span>
- <a id="id-1.6.5.6.3.16.1.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- At least as of version 5.1, it should not be necessary to do
- any special configuration for such parameters as
- <code class="varname">SHMMAX</code>, as it appears this is configured to
- allow all memory to be used as shared memory. That is the
- sort of configuration commonly used for other databases such
- as <span class="application">DB2</span>.</p><p> It might, however, be necessary to modify the global
- <code class="command">ulimit</code> information in
- <code class="filename">/etc/security/limits</code>, as the default hard
- limits for file sizes (<code class="varname">fsize</code>) and numbers of
- files (<code class="varname">nofiles</code>) might be too low.
- </p></dd><dt><span class="term"><span class="systemitem">FreeBSD</span>
- <a id="id-1.6.5.6.3.16.2.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The default IPC settings can be changed using
- the <code class="command">sysctl</code> or
- <code class="command">loader</code> interfaces. The following
- parameters can be set using <code class="command">sysctl</code>:
- </p><pre class="screen">
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmall=32768</code></strong>
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmmax=134217728</code></strong>
- </pre><p>
- To make these settings persist over reboots, modify
- <code class="filename">/etc/sysctl.conf</code>.
- </p><p>
- These semaphore-related settings are read-only as far as
- <code class="command">sysctl</code> is concerned, but can be set in
- <code class="filename">/boot/loader.conf</code>:
- </p><pre class="programlisting">
- kern.ipc.semmni=256
- kern.ipc.semmns=512
- </pre><p>
- After modifying that file, a reboot is required for the new
- settings to take effect.
- </p><p>
- You might also want to configure your kernel to lock System V shared
- memory into RAM and prevent it from being paged out to swap.
- This can be accomplished using the <code class="command">sysctl</code>
- setting <code class="literal">kern.ipc.shm_use_phys</code>.
- </p><p>
- If you run <span class="productname">PostgreSQL</span> in FreeBSD jails with <span class="application">sysctl</span>'s
- <code class="literal">security.jail.sysvipc_allowed</code> enabled, <span class="application">postmaster</span>s
- running in different jails should be run by different operating system
- users. This improves security because it prevents non-root users
- from interfering with shared memory or semaphores in different jails,
- and it allows the PostgreSQL IPC cleanup code to function properly.
- (In FreeBSD 6.0 and later the IPC cleanup code does not properly detect
- processes in other jails, preventing the running of postmasters on the
- same port in different jails.)
- </p><p>
- <span class="systemitem">FreeBSD</span> versions before 4.0 work like
- old <span class="systemitem">OpenBSD</span> (see below).
- </p></dd><dt><span class="term"><span class="systemitem">NetBSD</span>
- <a id="id-1.6.5.6.3.16.3.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- In <span class="systemitem">NetBSD</span> 5.0 and later,
- IPC parameters can be adjusted using <code class="command">sysctl</code>,
- for example:
- </p><pre class="screen">
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kern.ipc.semmni=100</code></strong>
- </pre><p>
- To make these settings persist over reboots, modify
- <code class="filename">/etc/sysctl.conf</code>.
- </p><p>
- You will usually want to increase <code class="literal">kern.ipc.semmni</code>
- and <code class="literal">kern.ipc.semmns</code>,
- as <span class="systemitem">NetBSD</span>'s default settings
- for these are uncomfortably small.
- </p><p>
- You might also want to configure your kernel to lock System V shared
- memory into RAM and prevent it from being paged out to swap.
- This can be accomplished using the <code class="command">sysctl</code>
- setting <code class="literal">kern.ipc.shm_use_phys</code>.
- </p><p>
- <span class="systemitem">NetBSD</span> versions before 5.0
- work like old <span class="systemitem">OpenBSD</span>
- (see below), except that kernel parameters should be set with the
- keyword <code class="literal">options</code> not <code class="literal">option</code>.
- </p></dd><dt><span class="term"><span class="systemitem">OpenBSD</span>
- <a id="id-1.6.5.6.3.16.4.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- In <span class="systemitem">OpenBSD</span> 3.3 and later,
- IPC parameters can be adjusted using <code class="command">sysctl</code>,
- for example:
- </p><pre class="screen">
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.seminfo.semmni=100</code></strong>
- </pre><p>
- To make these settings persist over reboots, modify
- <code class="filename">/etc/sysctl.conf</code>.
- </p><p>
- You will usually want to
- increase <code class="literal">kern.seminfo.semmni</code>
- and <code class="literal">kern.seminfo.semmns</code>,
- as <span class="systemitem">OpenBSD</span>'s default settings
- for these are uncomfortably small.
- </p><p>
- In older <span class="systemitem">OpenBSD</span> versions,
- you will need to build a custom kernel to change the IPC parameters.
- Make sure that the options <code class="varname">SYSVSHM</code>
- and <code class="varname">SYSVSEM</code> are enabled, too. (They are by
- default.) The following shows an example of how to set the various
- parameters in the kernel configuration file:
- </p><pre class="programlisting">
- option SYSVSHM
- option SHMMAXPGS=4096
- option SHMSEG=256
-
- option SYSVSEM
- option SEMMNI=256
- option SEMMNS=512
- option SEMMNU=256
- </pre><p>
- </p></dd><dt><span class="term"><span class="systemitem">HP-UX</span>
- <a id="id-1.6.5.6.3.16.5.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The default settings tend to suffice for normal installations.
- On <span class="productname">HP-UX</span> 10, the factory default for
- <code class="varname">SEMMNS</code> is 128, which might be too low for larger
- database sites.
- </p><p>
- <acronym class="acronym">IPC</acronym> parameters can be set in the <span class="application">System
- Administration Manager</span> (<acronym class="acronym">SAM</acronym>) under
- <span class="guimenu">Kernel
- Configuration</span> → <span class="guimenuitem">Configurable Parameters</span>. Choose
- <span class="guibutton">Create A New Kernel</span> when you're done.
- </p></dd><dt><span class="term"><span class="systemitem">Linux</span>
- <a id="id-1.6.5.6.3.16.6.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The default maximum segment size is 32 MB, and the
- default maximum total size is 2097152
- pages. A page is almost always 4096 bytes except in unusual
- kernel configurations with <span class="quote">“<span class="quote">huge pages</span>”</span>
- (use <code class="literal">getconf PAGE_SIZE</code> to verify).
- </p><p>
- The shared memory size settings can be changed via the
- <code class="command">sysctl</code> interface. For example, to allow 16 GB:
- </p><pre class="screen">
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kernel.shmmax=17179869184</code></strong>
- <code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kernel.shmall=4194304</code></strong>
- </pre><p>
- In addition, these settings can be preserved between reboots in
- the file <code class="filename">/etc/sysctl.conf</code>. Doing that is
- highly recommended.
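- </p><p>
- The two values in the example are related through the page size:
- <code class="literal">kernel.shmall</code> is the byte limit divided by the page size, rounded up.
- The following sketch derives both lines for <code class="filename">/etc/sysctl.conf</code>
- from a byte count; the helper name <code class="literal">linux_shm_settings</code> is an
- assumption made for illustration.
- </p><pre class="programlisting">
```python
import os


def linux_shm_settings(limit_bytes, page_size=None):
    """Return sysctl.conf lines allowing limit_bytes of System V shared memory."""
    if page_size is None:
        # The same value that `getconf PAGE_SIZE` reports.
        page_size = os.sysconf("SC_PAGE_SIZE")
    # kernel.shmall is counted in pages; round up with integer arithmetic.
    pages = -(-limit_bytes // page_size)
    return [f"kernel.shmmax = {limit_bytes}", f"kernel.shmall = {pages}"]


# 16 GB with 4096-byte pages, matching the sysctl example above.
for line in linux_shm_settings(16 * 1024**3, page_size=4096):
    print(line)
```
- </pre><p>
- Leaving out <code class="literal">page_size</code> makes the sketch query the running system,
- which matters on kernels that do not use 4096-byte pages.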
- </p><p>
- Ancient distributions might not have the <code class="command">sysctl</code> program,
- but equivalent changes can be made by manipulating the
- <code class="filename">/proc</code> file system:
- </p><pre class="screen">
- <code class="prompt">#</code> <strong class="userinput"><code>echo 17179869184 >/proc/sys/kernel/shmmax</code></strong>
- <code class="prompt">#</code> <strong class="userinput"><code>echo 4194304 >/proc/sys/kernel/shmall</code></strong>
- </pre><p>
- </p><p>
- The remaining defaults are quite generously sized, and usually
- do not require changes.
- </p></dd><dt><span class="term"><span class="systemitem">macOS</span>
- <a id="id-1.6.5.6.3.16.7.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The recommended method for configuring shared memory in macOS
- is to create a file named <code class="filename">/etc/sysctl.conf</code>,
- containing variable assignments such as:
- </p><pre class="programlisting">
- kern.sysv.shmmax=4194304
- kern.sysv.shmmin=1
- kern.sysv.shmmni=32
- kern.sysv.shmseg=8
- kern.sysv.shmall=1024
- </pre><p>
- Note that in some macOS versions,
- <span class="emphasis"><em>all five</em></span> shared-memory parameters must be set in
- <code class="filename">/etc/sysctl.conf</code>, else the values will be ignored.
- </p><p>
- Beware that recent releases of macOS ignore attempts to set
- <code class="varname">SHMMAX</code> to a value that isn't an exact multiple of 4096.
- </p><p>
- <code class="varname">SHMALL</code> is measured in 4 kB pages on this platform.
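- </p><p>
- The three constraints above (all five parameters present, <code class="varname">SHMMAX</code>
- a multiple of 4096, and <code class="varname">SHMALL</code> counted in 4 kB pages) can be checked
- before editing the file. This is a rough sketch only; the function
- <code class="literal">check_macos_shm</code> and its dictionary input are hypothetical, and the
- requirement that <code class="varname">SHMALL</code> cover <code class="varname">SHMMAX</code>
- is taken from Table 18.1.
- </p><pre class="programlisting">
```python
def check_macos_shm(settings):
    """Return a list of problems found in a dict of kern.sysv.* values."""
    required = ["shmmax", "shmmin", "shmmni", "shmseg", "shmall"]
    # Some macOS versions ignore the file unless all five values are present.
    problems = [f"kern.sysv.{name} is not set" for name in required
                if name not in settings]
    if problems:
        return problems
    if settings["shmmax"] % 4096 != 0:
        problems.append("kern.sysv.shmmax is not a multiple of 4096")
    # shmall counts 4 kB pages; it should cover at least shmmax bytes.
    pages_needed = -(-settings["shmmax"] // 4096)
    if max(pages_needed, settings["shmall"]) != settings["shmall"]:
        problems.append(f"kern.sysv.shmall should be at least {pages_needed}")
    return problems


# The values from the sysctl.conf example above.
example = {"shmmax": 4194304, "shmmin": 1, "shmmni": 32, "shmseg": 8, "shmall": 1024}
print(check_macos_shm(example) or "no problems found")
```
- </pre><p>
- An empty list means the values are mutually consistent; it does not mean they
- are large enough for a particular installation.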
- </p><p>
- In older macOS versions, you will need to reboot to have changes in the
- shared memory parameters take effect. As of 10.5 it is possible to
- change all but <code class="varname">SHMMNI</code> on the fly, using
- <span class="application">sysctl</span>. But it's still best to set up your preferred
- values via <code class="filename">/etc/sysctl.conf</code>, so that the values will be
- kept across reboots.
- </p><p>
- The file <code class="filename">/etc/sysctl.conf</code> is only honored in macOS
- 10.3.9 and later. If you are running a previous 10.3.x release,
- you must edit the file <code class="filename">/etc/rc</code>
- and change the values in the following commands:
- </p><pre class="programlisting">
- sysctl -w kern.sysv.shmmax
- sysctl -w kern.sysv.shmmin
- sysctl -w kern.sysv.shmmni
- sysctl -w kern.sysv.shmseg
- sysctl -w kern.sysv.shmall
- </pre><p>
- Note that
- <code class="filename">/etc/rc</code> is usually overwritten by macOS system updates,
- so you should expect to have to redo these edits after each update.
- </p><p>
- In macOS 10.2 and earlier, instead edit these commands in the file
- <code class="filename">/System/Library/StartupItems/SystemTuning/SystemTuning</code>.
- </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.6 to 2.9 (Solaris
- 6 to Solaris 9)
- <a id="id-1.6.5.6.3.16.8.1.2" class="indexterm"></a>
- </span></dt><dd><p>
- The relevant settings can be changed in
- <code class="filename">/etc/system</code>, for example:
- </p><pre class="programlisting">
- set shmsys:shminfo_shmmax=0x2000000
- set shmsys:shminfo_shmmin=1
- set shmsys:shminfo_shmmni=256
- set shmsys:shminfo_shmseg=256
-
- set semsys:seminfo_semmap=256
- set semsys:seminfo_semmni=512
- set semsys:seminfo_semmns=512
- set semsys:seminfo_semmsl=32
- </pre><p>
- You need to reboot for the changes to take effect. See also
- <a class="ulink" href="http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html" target="_top">http://sunsite.uakom.sk/sunworldonline/swol-09-1997/swol-09-insidesolaris.html</a>
- for information on shared memory under older versions of Solaris.
- </p></dd><dt><span class="term"><span class="systemitem">Solaris</span> 2.10 (Solaris
- 10) and later<br /></span><span class="term"><span class="systemitem">OpenSolaris</span></span></dt><dd><p>
- In Solaris 10 and later, and OpenSolaris, the default shared memory and
- semaphore settings are good enough for most
- <span class="productname">PostgreSQL</span> applications. Solaris now defaults
- to a <code class="varname">SHMMAX</code> of one-quarter of system <acronym class="acronym">RAM</acronym>.
- To further adjust this setting, use a project setting associated
- with the <code class="literal">postgres</code> user. For example, run the
- following as <code class="literal">root</code>:
- </p><pre class="programlisting">
- projadd -c "PostgreSQL DB User" -K "project.max-shm-memory=(privileged,8GB,deny)" -U postgres -G postgres user.postgres
- </pre><p>
- </p><p>
- This command adds the <code class="literal">user.postgres</code> project and
- sets the shared memory maximum for the <code class="literal">postgres</code>
- user to 8GB, and takes effect the next time that user logs
- in, or when you restart <span class="productname">PostgreSQL</span> (not reload).
- The above assumes that <span class="productname">PostgreSQL</span> is run by
- the <code class="literal">postgres</code> user in the <code class="literal">postgres</code>
- group. No server reboot is required.
- </p><p>
- Other recommended kernel setting changes for database servers which will
- have a large number of connections are:
- </p><pre class="programlisting">
- project.max-shm-ids=(priv,32768,deny)
- project.max-sem-ids=(priv,4096,deny)
- project.max-msg-ids=(priv,4096,deny)
- </pre><p>
- </p><p>
- Additionally, if you are running <span class="productname">PostgreSQL</span>
- inside a zone, you may need to raise the zone resource usage
- limits as well. See "Chapter 2: Projects and Tasks" in the
- <em class="citetitle">System Administrator's Guide</em> for more
- information on <code class="literal">projects</code> and <code class="command">prctl</code>.
- </p></dd></dl></div></div><div class="sect2" id="SYSTEMD-REMOVEIPC"><div class="titlepage"><div><div><h3 class="title">18.4.2. systemd RemoveIPC</h3></div></div></div><a id="id-1.6.5.6.4.2" class="indexterm"></a><p>
- If <span class="productname">systemd</span> is in use, some care must be taken
- that IPC resources (including shared memory) are not prematurely
- removed by the operating system. This is especially of concern when
- installing PostgreSQL from source. Users of distribution packages of
- PostgreSQL are less likely to be affected, as
- the <code class="literal">postgres</code> user is then normally created as a system
- user.
- </p><p>
- The setting <code class="literal">RemoveIPC</code>
- in <code class="filename">logind.conf</code> controls whether IPC objects are
- removed when a user fully logs out. System users are exempt. This
- setting defaults to on in stock <span class="productname">systemd</span>, but
- some operating system distributions default it to off.
- </p><p>
- A typical observed effect when this setting is on is that shared memory
- objects used for parallel query execution are removed at apparently random
- times, leading to errors and warnings while attempting to open and remove
- them, like
- </p><pre class="screen">
- WARNING: could not remove shared memory segment "/PostgreSQL.1450751626": No such file or directory
- </pre><p>
- Different types of IPC objects (shared memory vs. semaphores, System V
- vs. POSIX) are treated slightly differently
- by <span class="productname">systemd</span>, so one might observe that some IPC
- resources are not removed in the same way as others. But it is not
- advisable to rely on these subtle differences.
- </p><p>
- A <span class="quote">“<span class="quote">user logging out</span>”</span> might happen as part of a maintenance
- job or manually when an administrator logs in as
- the <code class="literal">postgres</code> user or something similar, so it is hard
- to prevent in general.
- </p><p>
- Which users count as <span class="quote">“<span class="quote">system users</span>”</span> is determined
- at <span class="productname">systemd</span> compile time from
- the <code class="symbol">SYS_UID_MAX</code> setting
- in <code class="filename">/etc/login.defs</code>.
- </p><p>
- Packaging and deployment scripts should be careful to create
- the <code class="literal">postgres</code> user as a system user by
- using <code class="literal">useradd -r</code>, <code class="literal">adduser --system</code>,
- or equivalent.
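- </p><p>
- Whether an existing account falls in the system range can be estimated by
- comparing its user ID with <code class="symbol">SYS_UID_MAX</code>. The sketch below parses
- the text of a <code class="filename">login.defs</code> file; the function names and the fallback
- value of 999 are assumptions, and because <span class="productname">systemd</span> fixes
- the boundary at compile time, the file on a running system is only an approximation.
- </p><pre class="programlisting">
```python
import re


def sys_uid_max(login_defs_text, default=999):
    """Extract SYS_UID_MAX from login.defs text; default is an assumed fallback."""
    match = re.search(r"^\s*SYS_UID_MAX\s+(\d+)", login_defs_text, re.MULTILINE)
    return int(match.group(1)) if match else default


def is_system_user(uid, login_defs_text):
    """True if uid lies in the system range 0..SYS_UID_MAX."""
    return uid in range(0, sys_uid_max(login_defs_text) + 1)


# Hypothetical file contents; a real check would read /etc/login.defs.
sample = "UID_MIN 1000\nSYS_UID_MAX 999\n"
print(is_system_user(120, sample))
print(is_system_user(1001, sample))
```
- </pre><p>
- On a real system the user ID could come from
- <code class="literal">pwd.getpwnam("postgres").pw_uid</code>, assuming the account is named
- <code class="literal">postgres</code>.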
- </p><p>
- Alternatively, if the user account was created incorrectly or cannot be
- changed, it is recommended to set
- </p><pre class="programlisting">
- RemoveIPC=no
- </pre><p>
- in <code class="filename">/etc/systemd/logind.conf</code> or another appropriate
- configuration file.
- </p><div class="caution"><h3 class="title">Caution</h3><p>
- At least one of these two things has to be ensured, or the PostgreSQL
- server will be very unreliable.
- </p></div></div><div class="sect2" id="id-1.6.5.6.5"><div class="titlepage"><div><div><h3 class="title">18.4.3. Resource Limits</h3></div></div></div><p>
- Unix-like operating systems enforce various kinds of resource limits
- that might interfere with the operation of your
- <span class="productname">PostgreSQL</span> server. Of particular
- importance are limits on the number of processes per user, the
- number of open files per process, and the amount of memory available
- to each process. Each of these has a <span class="quote">“<span class="quote">hard</span>”</span> and a
- <span class="quote">“<span class="quote">soft</span>”</span> limit. The soft limit is what actually counts,
- but it can be changed by the user up to the hard limit. The hard
- limit can only be changed by the root user. The system call
- <code class="function">setrlimit</code> is responsible for setting these
- parameters. The shell's built-in command <code class="command">ulimit</code>
- (Bourne shells) or <code class="command">limit</code> (<span class="application">csh</span>) is
- used to control the resource limits from the command line. On
- BSD-derived systems the file <code class="filename">/etc/login.conf</code>
- controls the various resource limits set during login. See the
- operating system documentation for details. The relevant
- parameters are <code class="varname">maxproc</code>,
- <code class="varname">openfiles</code>, and <code class="varname">datasize</code>. For
- example:
- </p><pre class="programlisting">
- default:\
- ...
- :datasize-cur=256M:\
- :maxproc-cur=256:\
- :openfiles-cur=256:\
- ...
- </pre><p>
- (<code class="literal">-cur</code> is the soft limit. Append
- <code class="literal">-max</code> to set the hard limit.)
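- </p><p>
- The relationship between the soft and hard limits can also be observed from a
- program. The following sketch uses Python's standard <code class="literal">resource</code>
- module, which wraps <code class="function">getrlimit</code> and
- <code class="function">setrlimit</code>; the helper name
- <code class="literal">raise_soft_limit</code> is an assumption, and some platforms reject a
- soft open-files limit equal to an unlimited hard limit.
- </p><pre class="programlisting">
```python
import resource


def raise_soft_limit(which):
    """Raise the soft limit of one resource to its hard limit; return both."""
    soft, hard = resource.getrlimit(which)
    # An unprivileged process may raise its soft limit up to the hard limit,
    # but only root can raise the hard limit itself.
    resource.setrlimit(which, (hard, hard))
    return resource.getrlimit(which)


print("open files (soft, hard):", raise_soft_limit(resource.RLIMIT_NOFILE))
```
- </pre><p>
- The change applies only to the calling process and its children, which is why
- limits for the server are normally set in the environment that starts it.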
- </p><p>
- Kernels can also have system-wide limits on some resources.
- </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
- On <span class="productname">Linux</span>,
- <code class="filename">/proc/sys/fs/file-max</code> determines the
- maximum number of open files that the kernel will support. It can
- be changed by writing a different number into the file or by
- adding an assignment in <code class="filename">/etc/sysctl.conf</code>.
- The maximum limit of files per process is fixed at the time the
- kernel is compiled; see
- <code class="filename">/usr/src/linux/Documentation/proc.txt</code> for
- more information.
- </p></li></ul></div><p>
- </p><p>
- The <span class="productname">PostgreSQL</span> server uses one process
- per connection, so you should provide for at least as many processes
- as allowed connections, in addition to what you need for the rest
- of your system. This is usually not a problem, but if you run
- several servers on one machine, the process limit might become a constraint.
- </p><p>
- The factory default limit on open files is often set to
- <span class="quote">“<span class="quote">socially friendly</span>”</span> values that allow many users to
- coexist on a machine without using an inappropriate fraction of
- the system resources. If you run many servers on a machine, this
- is perhaps what you want, but on dedicated servers you might want to
- raise this limit.
- </p><p>
- On the other side of the coin, some systems allow individual
- processes to open large numbers of files; if more than a few
- processes do so then the system-wide limit can easily be exceeded.
- If you find this happening, and you do not want to alter the
- system-wide limit, you can set <span class="productname">PostgreSQL</span>'s <a class="xref" href="runtime-config-resource.html#GUC-MAX-FILES-PER-PROCESS">max_files_per_process</a> configuration parameter to
- limit the consumption of open files.
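- </p><p>
- To judge whether the system-wide limit is at risk, a rough upper bound
- on the files opened by connection processes can be computed as the
- number of connections multiplied by the per-process limit. The sketch
- below assumes a hypothetical helper name and example values, and it
- ignores the server's background processes:
- </p><pre class="programlisting">
```shell
# worst_case_open_files CONNECTIONS FILES_PER_PROCESS
# Upper bound on files opened by the server's connection processes.
# The function name and the numbers below are illustrative only.
worst_case_open_files() {
    echo $(( $1 * $2 ))
}

# Example: 100 connections with max_files_per_process set to 1000.
worst_case_open_files 100 1000
```
- </pre><p>
- If the result approaches the system-wide limit, lowering
- <code class="varname">max_files_per_process</code> is one way to reduce it.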
- </p></div><div class="sect2" id="LINUX-MEMORY-OVERCOMMIT"><div class="titlepage"><div><div><h3 class="title">18.4.4. Linux Memory Overcommit</h3></div></div></div><a id="id-1.6.5.6.6.2" class="indexterm"></a><a id="id-1.6.5.6.6.3" class="indexterm"></a><a id="id-1.6.5.6.6.4" class="indexterm"></a><p>
- In Linux 2.4 and later, the default virtual memory behavior is not
- optimal for <span class="productname">PostgreSQL</span>. Because of the
- way that the kernel implements memory overcommit, the kernel might
- terminate the <span class="productname">PostgreSQL</span> postmaster (the
- master server process) if the memory demands of either
- <span class="productname">PostgreSQL</span> or another process cause the
- system to run out of virtual memory.
- </p><p>
- If this happens, you will see a kernel message that looks like
- this (consult your system documentation and configuration on where
- to look for such a message):
- </p><pre class="programlisting">
- Out of Memory: Killed process 12345 (postgres).
- </pre><p>
- This indicates that the <code class="filename">postgres</code> process
- has been terminated due to memory pressure.
- Although existing database connections will continue to function
- normally, no new connections will be accepted. To recover,
- <span class="productname">PostgreSQL</span> will need to be restarted.
- </p><p>
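- A minimal sketch for finding such messages follows. The function name is
- hypothetical, and the search is case-insensitive and deliberately loose
- because the exact wording of the message differs between kernel versions:
- </p><pre class="programlisting">
```shell
# find_oom_kills [LOGFILE...]: print kernel log lines that report an
# OOM kill. With no argument, the log is read from standard input.
# The function name is illustrative only.
find_oom_kills() {
    grep -i 'out of memory: kill' "$@"
}
```
- </pre><p>
- It could be applied to a saved kernel log file, whose location depends on
- the distribution, or to the output of <code class="command">dmesg</code> through
- a pipe.
- </p><p>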
- One way to avoid this problem is to run
- <span class="productname">PostgreSQL</span> on a machine where you can
- be sure that other processes will not run the machine out of
- memory. If memory is tight, increasing the swap space of the
- operating system can help avoid the problem, because the
- out-of-memory (OOM) killer is invoked only when physical memory and
- swap space are exhausted.
- </p><p>
- If <span class="productname">PostgreSQL</span> itself is the cause of the
- system running out of memory, you can avoid the problem by changing
- your configuration. In some cases, it may help to lower memory-related
- configuration parameters, particularly
- <a class="link" href="runtime-config-resource.html#GUC-SHARED-BUFFERS"><code class="varname">shared_buffers</code></a>
- and <a class="link" href="runtime-config-resource.html#GUC-WORK-MEM"><code class="varname">work_mem</code></a>. In
- other cases, the problem may be caused by allowing too many connections
- to the database server itself. In many cases, it may be better to reduce
- <a class="link" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS"><code class="varname">max_connections</code></a>
- and instead make use of external connection-pooling software.
- </p><p>
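- As a crude illustration of how these parameters interact, the sketch
- below adds <code class="varname">shared_buffers</code> to one
- <code class="varname">work_mem</code> allocation per connection. The function
- name and all values are assumptions, and the result is only a rough
- figure, since a single query can use <code class="varname">work_mem</code>
- more than once:
- </p><pre class="programlisting">
```shell
# rough_memory_mb SHARED_BUFFERS_MB WORK_MEM_MB MAX_CONNECTIONS
# Crude estimate: shared_buffers plus one work_mem allocation per
# connection. Real usage can be higher, because one query may use
# work_mem several times. The function name is illustrative only.
rough_memory_mb() {
    echo $(( $1 + $2 * $3 ))
}

rough_memory_mb 4096 16 200   # example values only
```
- </pre><p>
- Comparing the result with the machine's physical memory gives a first
- hint as to whether <code class="varname">max_connections</code> or
- <code class="varname">work_mem</code> is set too high.
- </p><p>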
- On Linux 2.6 and later, it is possible to modify the
- kernel's behavior so that it will not <span class="quote">“<span class="quote">overcommit</span>”</span> memory.
- Although this setting will not prevent the <a class="ulink" href="https://lwn.net/Articles/104179/" target="_top">OOM killer</a> from being invoked
- altogether, it will lower the chances significantly and will therefore
- lead to more robust system behavior. This is done by selecting strict
- overcommit mode via <code class="command">sysctl</code>:
- </p><pre class="programlisting">
- sysctl -w vm.overcommit_memory=2
- </pre><p>
- or placing an equivalent entry in <code class="filename">/etc/sysctl.conf</code>.
- You might also wish to modify the related setting
- <code class="varname">vm.overcommit_ratio</code>. For details see the kernel documentation
- file <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/overcommit-accounting" target="_top">https://www.kernel.org/doc/Documentation/vm/overcommit-accounting</a>.
- </p><p>
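- In strict overcommit mode the kernel limits committed address space to
- roughly the swap space plus <code class="varname">vm.overcommit_ratio</code>
- percent of physical RAM. The sketch below computes that figure; the
- function name and the example sizes are assumptions, and memory reserved
- for huge pages is ignored:
- </p><pre class="programlisting">
```shell
# commit_limit_kb RAM_KB SWAP_KB RATIO
# Approximate commit limit under vm.overcommit_memory=2:
# swap plus RATIO percent of physical RAM. Huge pages are ignored.
# The function name is illustrative only.
commit_limit_kb() {
    echo $(( $2 + $1 * $3 / 100 ))
}

# Example: 16 GB of RAM, 2 GB of swap, overcommit ratio 50.
commit_limit_kb 16777216 2097152 50
```
- </pre><p>
- The value the kernel actually applies is reported as
- <code class="literal">CommitLimit</code> in <code class="filename">/proc/meminfo</code>.
- </p><p>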
- Another approach, which can be used with or without altering
- <code class="varname">vm.overcommit_memory</code>, is to set the process-specific
- <em class="firstterm">OOM score adjustment</em> value for the postmaster process to
- <code class="literal">-1000</code>, thereby guaranteeing it will not be targeted by the OOM
- killer. The simplest way to do this is to execute
- </p><pre class="programlisting">
- echo -1000 > /proc/self/oom_score_adj
- </pre><p>
- in the postmaster's startup script just before invoking the postmaster.
- Note that this action must be done as root, or it will have no effect;
- so a root-owned startup script is the easiest place to do it. If you
- do this, you should also set these environment variables in the startup
- script before invoking the postmaster:
- </p><pre class="programlisting">
- export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
- export PG_OOM_ADJUST_VALUE=0
- </pre><p>
- These settings will cause postmaster child processes to run with the
- normal OOM score adjustment of zero, so that the OOM killer can still
- target them at need. You could use some other value for
- <code class="envar">PG_OOM_ADJUST_VALUE</code> if you want the child processes to run
- with some other OOM score adjustment. (<code class="envar">PG_OOM_ADJUST_VALUE</code>
- can also be omitted, in which case it defaults to zero.) If you do not
- set <code class="envar">PG_OOM_ADJUST_FILE</code>, the child processes will run with the
- same OOM score adjustment as the postmaster, which is unwise since the
- whole point is to ensure that the postmaster has a preferential setting.
- </p><p>
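- The steps above could be combined in a root-owned startup script roughly
- as follows. The function name <code class="function">protect_postmaster</code>
- and its optional file argument are assumptions made for illustration; only
- the file path, the value <code class="literal">-1000</code>, and the two
- environment variables come from the description above:
- </p><pre class="programlisting">
```shell
# protect_postmaster [ADJ_FILE]
# Exempt the current shell from the OOM killer and arrange for
# postmaster child processes to reset their own score to zero.
# The function name and the ADJ_FILE argument are illustrative only.
protect_postmaster() {
    adj_file=${1:-/proc/self/oom_score_adj}
    # Must run as root when writing to the real /proc file.
    echo -1000 > "$adj_file" || return 1
    export PG_OOM_ADJUST_FILE="$adj_file"
    export PG_OOM_ADJUST_VALUE=0
}
```
- </pre><p>
- In a real startup script the function would be called with no argument,
- as root, immediately before the postmaster is invoked.
- </p><p>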
- Older Linux kernels do not offer <code class="filename">/proc/self/oom_score_adj</code>,
- but may have a previous version of the same functionality called
- <code class="filename">/proc/self/oom_adj</code>. This works the same except the disable
- value is <code class="literal">-17</code> not <code class="literal">-1000</code>.
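- </p><p>
- A startup script that has to work on both kinds of kernel could choose
- the file and value at run time. The sketch below assumes a hypothetical
- function name and an optional directory argument:
- </p><pre class="programlisting">
```shell
# oom_disable_setting [PROC_DIR]: print "FILE VALUE" for disabling OOM
# kills, preferring oom_score_adj and falling back to the older oom_adj.
# The function name and the PROC_DIR argument are illustrative only.
oom_disable_setting() {
    dir=${1:-/proc/self}
    if [ -e "$dir/oom_score_adj" ]; then
        echo "$dir/oom_score_adj -1000"
    elif [ -e "$dir/oom_adj" ]; then
        echo "$dir/oom_adj -17"
    else
        return 1
    fi
}
```
- </pre><p>
- The function returns a nonzero status if neither file exists, so the
- caller can skip the adjustment on systems without this facility.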
- </p><div class="note"><h3 class="title">Note</h3><p>
- Some vendors' Linux 2.4 kernels are reported to have early versions
- of the 2.6 overcommit <code class="command">sysctl</code> parameter. However, setting
- <code class="literal">vm.overcommit_memory</code> to 2
- on a 2.4 kernel that does not have the relevant code will make
- things worse, not better. It is recommended that you inspect
- the actual kernel source code (see the function
- <code class="function">vm_enough_memory</code> in the file <code class="filename">mm/mmap.c</code>)
- to verify what is supported in your kernel before you try this in a 2.4
- installation. The presence of the <code class="filename">overcommit-accounting</code>
- documentation file should <span class="emphasis"><em>not</em></span> be taken as evidence that the
- feature is there. If in any doubt, consult a kernel expert or your
- kernel vendor.
- </p></div></div><div class="sect2" id="LINUX-HUGE-PAGES"><div class="titlepage"><div><div><h3 class="title">18.4.5. Linux Huge Pages</h3></div></div></div><p>
- Using huge pages reduces overhead when using large contiguous chunks of
- memory, as <span class="productname">PostgreSQL</span> does, particularly when
- using large values of <a class="xref" href="runtime-config-resource.html#GUC-SHARED-BUFFERS">shared_buffers</a>. To use this
- feature in <span class="productname">PostgreSQL</span> you need a kernel
- with <code class="varname">CONFIG_HUGETLBFS=y</code> and
- <code class="varname">CONFIG_HUGETLB_PAGE=y</code>. You will also have to adjust
- the kernel setting <code class="varname">vm.nr_hugepages</code>. To estimate the
- number of huge pages needed, start <span class="productname">PostgreSQL</span>
- without huge pages enabled and check the
- postmaster's anonymous shared memory segment size, as well as the system's
- huge page size, using the <code class="filename">/proc</code> file system. This might
- look like:
- </p><pre class="programlisting">
- $ <strong class="userinput"><code>head -1 $PGDATA/postmaster.pid</code></strong>
- 4170
- $ <strong class="userinput"><code>pmap 4170 | awk '/rw-s/ && /zero/ {print $2}'</code></strong>
- 6490428K
- $ <strong class="userinput"><code>grep ^Hugepagesize /proc/meminfo</code></strong>
- Hugepagesize: 2048 kB
- </pre><p>
- <code class="literal">6490428</code> / <code class="literal">2048</code> gives approximately
- <code class="literal">3169.154</code>, so in this example we need at
- least <code class="literal">3170</code> huge pages, which we can set with:
- </p><pre class="programlisting">
- $ <strong class="userinput"><code>sysctl -w vm.nr_hugepages=3170</code></strong>
- </pre><p>
- A larger setting would be appropriate if other programs on the machine
- also need huge pages. Don't forget to add this setting
- to <code class="filename">/etc/sysctl.conf</code> so that it will be reapplied
- after reboots.
- </p><p>
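- The rounding step in the calculation above can be sketched as an integer
- ceiling division. The function name is hypothetical; the inputs are the
- segment size and huge page size from the example, both in kilobytes:
- </p><pre class="programlisting">
```shell
# huge_pages_needed SEGMENT_KB PAGE_KB
# Number of huge pages needed to cover the segment, rounded up.
# The function name is illustrative only.
huge_pages_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

huge_pages_needed 6490428 2048   # prints 3170
```
- </pre><p>
- Rounding up matters because a partially used huge page still has to be
- allocated in full.
- </p><p>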
- Sometimes the kernel is not able to allocate the desired number of huge
- pages immediately, so it might be necessary to repeat the command or to
- reboot. (Immediately after a reboot, most of the machine's memory
- should be available to convert into huge pages.) To check the current
- huge page allocation, use:
- </p><pre class="programlisting">
- $ <strong class="userinput"><code>grep Huge /proc/meminfo</code></strong>
- </pre><p>
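- To pick out a single figure from that output, a small helper along the
- following lines could be used. The function name and the optional file
- argument are assumptions made for illustration:
- </p><pre class="programlisting">
```shell
# meminfo_field NAME [FILE]: print the value of one /proc/meminfo field,
# for example HugePages_Total or HugePages_Free.
# The function name and the FILE argument are illustrative only.
meminfo_field() {
    awk -v key="$1:" '$1 == key {print $2}' "${2:-/proc/meminfo}"
}
```
- </pre><p>
- If <code class="literal">HugePages_Total</code> is lower than the number
- requested, the kernel could not allocate all of the pages.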
- </p><p>
- It may also be necessary to give the database server's operating system
- user permission to use huge pages by setting
- <code class="varname">vm.hugetlb_shm_group</code> via <span class="application">sysctl</span>, and/or
- give permission to lock memory with <code class="command">ulimit -l</code>.
- </p><p>
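- Because <code class="varname">vm.hugetlb_shm_group</code> takes a numeric group
- ID, the assignment can be derived from the server's operating system user.
- In the sketch below the function name is hypothetical, and the user name
- passed to it is whatever account runs the server on your system:
- </p><pre class="programlisting">
```shell
# hugetlb_group_line USER: print a sysctl.conf assignment that grants
# huge page access to USER's primary group.
# The function name is illustrative only.
hugetlb_group_line() {
    echo "vm.hugetlb_shm_group = $(id -g "$1")"
}
```
- </pre><p>
- The current locked-memory limit of a shell can be displayed with
- <code class="command">ulimit -l</code>.
- </p><p>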
- The default behavior for huge pages in
- <span class="productname">PostgreSQL</span> is to use them when possible and
- to fall back to normal pages if huge pages cannot be allocated. To enforce the use of huge
- pages, you can set <a class="xref" href="runtime-config-resource.html#GUC-HUGE-PAGES">huge_pages</a>
- to <code class="literal">on</code> in <code class="filename">postgresql.conf</code>.
- Note that with this setting <span class="productname">PostgreSQL</span> will fail to
- start if not enough huge pages are available.
- </p><p>
- For a detailed description of the <span class="productname">Linux</span> huge
- pages feature, see
- <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt" target="_top">https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt</a>.
- </p></div></div></body></html>
|