|
- <!DOCTYPE html>
-
- <html lang="en" data-content_root="../">
- <head>
- <meta charset="utf-8" />
- <meta name="viewport" content="width=device-width, initial-scale=1.0" />
- <meta property="og:title" content="Unicode Objects and Codecs" />
- <meta property="og:type" content="website" />
- <meta property="og:url" content="https://docs.python.org/3/c-api/unicode.html" />
- <meta property="og:site_name" content="Python documentation" />
- <meta property="og:description" content="Unicode Objects: Since the implementation of PEP 393 in Python 3.3, Unicode objects internally use a variety of representations, in order to allow handling the complete range of Unicode characters ..." />
- <meta property="og:image" content="https://docs.python.org/3/_static/og-image.png" />
- <meta property="og:image:alt" content="Python documentation" />
- <meta name="description" content="Unicode Objects: Since the implementation of PEP 393 in Python 3.3, Unicode objects internally use a variety of representations, in order to allow handling the complete range of Unicode characters ..." />
- <meta property="og:image:width" content="200" />
- <meta property="og:image:height" content="200" />
- <meta name="theme-color" content="#3776ab" />
-
- <title>Unicode Objects and Codecs — Python 3.12.3 documentation</title><meta name="viewport" content="width=device-width, initial-scale=1.0">
-
- <link rel="stylesheet" type="text/css" href="../_static/pygments.css?v=80d5e7a1" />
- <link rel="stylesheet" type="text/css" href="../_static/pydoctheme.css?v=bb723527" />
- <link id="pygments_dark_css" media="(prefers-color-scheme: dark)" rel="stylesheet" type="text/css" href="../_static/pygments_dark.css?v=b20cc3f5" />
-
- <script src="../_static/documentation_options.js?v=2c828074"></script>
- <script src="../_static/doctools.js?v=888ff710"></script>
- <script src="../_static/sphinx_highlight.js?v=dc90522c"></script>
-
- <script src="../_static/sidebar.js"></script>
-
- <link rel="search" type="application/opensearchdescription+xml"
- title="Search within Python 3.12.3 documentation"
- href="../_static/opensearch.xml"/>
- <link rel="author" title="About these documents" href="../about.html" />
- <link rel="index" title="Index" href="../genindex.html" />
- <link rel="search" title="Search" href="../search.html" />
- <link rel="copyright" title="Copyright" href="../copyright.html" />
- <link rel="next" title="Tuple Objects" href="tuple.html" />
- <link rel="prev" title="Byte Array Objects" href="bytearray.html" />
- <link rel="canonical" href="https://docs.python.org/3/c-api/unicode.html" />
-
-
-
-
-
- <style>
- @media only screen {
- table.full-width-table {
- width: 100%;
- }
- }
- </style>
- <link rel="stylesheet" href="../_static/pydoctheme_dark.css" media="(prefers-color-scheme: dark)" id="pydoctheme_dark_css">
- <link rel="shortcut icon" type="image/png" href="../_static/py.svg" />
- <script type="text/javascript" src="../_static/copybutton.js"></script>
- <script type="text/javascript" src="../_static/menu.js"></script>
- <script type="text/javascript" src="../_static/search-focus.js"></script>
- <script type="text/javascript" src="../_static/themetoggle.js"></script>
-
- </head>
- <body>
- <div class="mobile-nav">
- <div class="menu-wrapper">
- <nav class="menu" role="navigation" aria-label="main navigation">
- <div>
- <h3><a href="../contents.html">Table of Contents</a></h3>
- <ul>
- <li><a class="reference internal" href="#">Unicode Objects and Codecs</a><ul>
- <li><a class="reference internal" href="#unicode-objects">Unicode Objects</a><ul>
- <li><a class="reference internal" href="#unicode-type">Unicode Type</a></li>
- <li><a class="reference internal" href="#unicode-character-properties">Unicode Character Properties</a></li>
- <li><a class="reference internal" href="#creating-and-accessing-unicode-strings">Creating and accessing Unicode strings</a></li>
- <li><a class="reference internal" href="#locale-encoding">Locale Encoding</a></li>
- <li><a class="reference internal" href="#file-system-encoding">File System Encoding</a></li>
- <li><a class="reference internal" href="#wchar-t-support">wchar_t Support</a></li>
- </ul>
- </li>
- <li><a class="reference internal" href="#built-in-codecs">Built-in Codecs</a><ul>
- <li><a class="reference internal" href="#generic-codecs">Generic Codecs</a></li>
- <li><a class="reference internal" href="#utf-8-codecs">UTF-8 Codecs</a></li>
- <li><a class="reference internal" href="#utf-32-codecs">UTF-32 Codecs</a></li>
- <li><a class="reference internal" href="#utf-16-codecs">UTF-16 Codecs</a></li>
- <li><a class="reference internal" href="#utf-7-codecs">UTF-7 Codecs</a></li>
- <li><a class="reference internal" href="#unicode-escape-codecs">Unicode-Escape Codecs</a></li>
- <li><a class="reference internal" href="#raw-unicode-escape-codecs">Raw-Unicode-Escape Codecs</a></li>
- <li><a class="reference internal" href="#latin-1-codecs">Latin-1 Codecs</a></li>
- <li><a class="reference internal" href="#ascii-codecs">ASCII Codecs</a></li>
- <li><a class="reference internal" href="#character-map-codecs">Character Map Codecs</a></li>
- <li><a class="reference internal" href="#mbcs-codecs-for-windows">MBCS codecs for Windows</a></li>
- <li><a class="reference internal" href="#methods-slots">Methods &amp; Slots</a></li>
- </ul>
- </li>
- <li><a class="reference internal" href="#methods-and-slot-functions">Methods and Slot Functions</a></li>
- </ul>
- </li>
- </ul>
-
- </div>
- </nav>
- </div>
- </div>
-
-
-
- <div class="document">
- <div class="documentwrapper">
- <div class="bodywrapper">
- <div class="body" role="main">
-
- <section id="unicode-objects-and-codecs">
- <span id="unicodeobjects"></span><h1>Unicode Objects and Codecs<a class="headerlink" href="#unicode-objects-and-codecs" title="Link to this heading">¶</a></h1>
- <section id="unicode-objects">
- <h2>Unicode Objects<a class="headerlink" href="#unicode-objects" title="Link to this heading">¶</a></h2>
- <p>Since the implementation of <span class="target" id="index-0"></span><a class="pep reference external" href="https://peps.python.org/pep-0393/"><strong>PEP 393</strong></a> in Python 3.3, Unicode objects internally
- use a variety of representations, in order to allow handling the complete range
- of Unicode characters while staying memory efficient. There are special cases
- for strings where all code points are below 128, 256, or 65536; otherwise, code
- points must be below 1114112 (which is the full Unicode range).</p>
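- <p>The thresholds above can be illustrated with a small standalone sketch. It is not CPython code: <code class="docutils literal notranslate"><span class="pre">narrowest_width()</span></code> is a hypothetical helper that assumes the 128 and 256 cases both fit in one byte per character, the 65536 case in two, and everything else in four.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the C API: return the narrowest
 * per-character width in bytes (1, 2 or 4) able to hold every code point
 * in cps[0..n), or 0 if any value lies outside the Unicode range. */
int
narrowest_width(const uint32_t *cps, size_t n)
{
    uint32_t max_cp = 0;

    assert(cps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cps[i] >= 1114112) {
            return 0;               /* beyond the full Unicode range */
        }
        if (cps[i] > max_cp) {
            max_cp = cps[i];
        }
    }
    if (max_cp < 256) {
        return 1;                   /* covers the < 128 and < 256 cases */
    }
    if (max_cp < 65536) {
        return 2;
    }
    return 4;
}
```

- <p>Under these assumptions a single character above 65535 forces four bytes per character for the whole string, which is why the narrower cases are worth special handling.</p>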
- <p>UTF-8 representation is created on demand and cached in the Unicode object.</p>
- <div class="admonition note">
- <p class="admonition-title">Note</p>
- <p>The <a class="reference internal" href="#c.Py_UNICODE" title="Py_UNICODE"><code class="xref c c-type docutils literal notranslate"><span class="pre">Py_UNICODE</span></code></a> representation and the deprecated APIs that used it
- were removed in Python 3.12.
- See <span class="target" id="index-1"></span><a class="pep reference external" href="https://peps.python.org/pep-0623/"><strong>PEP 623</strong></a> for more information.</p>
- </div>
- <section id="unicode-type">
- <h3>Unicode Type<a class="headerlink" href="#unicode-type" title="Link to this heading">¶</a></h3>
- <p>These are the basic Unicode object types used for the Unicode implementation in
- Python:</p>
- <dl class="c type">
- <dt class="sig sig-object c" id="c.Py_UCS4">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UCS4</span></span></span><a class="headerlink" href="#c.Py_UCS4" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.Py_UCS2">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UCS2</span></span></span><a class="headerlink" href="#c.Py_UCS2" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.Py_UCS1">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UCS1</span></span></span><a class="headerlink" href="#c.Py_UCS1" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>These types are typedefs for unsigned integer types wide enough to contain
- characters of 32 bits, 16 bits and 8 bits, respectively. When dealing with
- single Unicode characters, use <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><code class="xref c c-type docutils literal notranslate"><span class="pre">Py_UCS4</span></code></a>.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c type">
- <dt class="sig sig-object c" id="c.Py_UNICODE">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE</span></span></span><a class="headerlink" href="#c.Py_UNICODE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>This is a typedef of <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code>, which is a 16-bit type or 32-bit type
- depending on the platform.</p>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.3: </span>In previous versions, this was a 16-bit type or a 32-bit type depending on
- whether you selected a “narrow” or “wide” Unicode version of Python at
- build time.</p>
- </div>
- </dd></dl>
-
- <dl class="c type">
- <dt class="sig sig-object c" id="c.PyASCIIObject">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyASCIIObject</span></span></span><a class="headerlink" href="#c.PyASCIIObject" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyCompactUnicodeObject">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyCompactUnicodeObject</span></span></span><a class="headerlink" href="#c.PyCompactUnicodeObject" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyUnicodeObject">
- <span class="k"><span class="pre">type</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicodeObject</span></span></span><a class="headerlink" href="#c.PyUnicodeObject" title="Link to this definition">¶</a><br /></dt>
- <dd><p>These subtypes of <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><code class="xref c c-type docutils literal notranslate"><span class="pre">PyObject</span></code></a> represent a Python Unicode object. In
- almost all cases, they shouldn’t be used directly, since all API functions
- that deal with Unicode objects take and return <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><code class="xref c c-type docutils literal notranslate"><span class="pre">PyObject</span></code></a> pointers.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c var">
- <dt class="sig sig-object c" id="c.PyUnicode_Type">
- <a class="reference internal" href="type.html#c.PyTypeObject" title="PyTypeObject"><span class="n"><span class="pre">PyTypeObject</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Type</span></span></span><a class="headerlink" href="#c.PyUnicode_Type" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>This instance of <a class="reference internal" href="type.html#c.PyTypeObject" title="PyTypeObject"><code class="xref c c-type docutils literal notranslate"><span class="pre">PyTypeObject</span></code></a> represents the Python Unicode type. It
- is exposed to Python code as <code class="docutils literal notranslate"><span class="pre">str</span></code>.</p>
- </dd></dl>
-
- <p>The following APIs are C macros and static inline functions for fast checks and
- access to internal read-only data of Unicode objects:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Check">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Check</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Check" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return true if the object <em>obj</em> is a Unicode object or an instance of a Unicode
- subtype. This function always succeeds.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_CheckExact">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_CheckExact</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_CheckExact" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return true if the object <em>obj</em> is a Unicode object, but not an instance of a
- subtype. This function always succeeds.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_READY">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_READY</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_READY" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">0</span></code>. This API is kept only for backward compatibility.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="deprecated">
- <p><span class="versionmodified deprecated">Deprecated since version 3.10: </span>This API does nothing since Python 3.12.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_GET_LENGTH">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_GET_LENGTH</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_GET_LENGTH" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the length of the Unicode string, in code points. <em>unicode</em> has to be a
- Unicode object in the “canonical” representation (not checked).</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_1BYTE_DATA">
- <a class="reference internal" href="#c.Py_UCS1" title="Py_UCS1"><span class="n"><span class="pre">Py_UCS1</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_1BYTE_DATA</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_1BYTE_DATA" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyUnicode_2BYTE_DATA">
- <a class="reference internal" href="#c.Py_UCS2" title="Py_UCS2"><span class="n"><span class="pre">Py_UCS2</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_2BYTE_DATA</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_2BYTE_DATA" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyUnicode_4BYTE_DATA">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_4BYTE_DATA</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_4BYTE_DATA" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4
- integer types for direct character access. No check is performed to verify that the
- canonical representation has the correct character size; use
- <a class="reference internal" href="#c.PyUnicode_KIND" title="PyUnicode_KIND"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_KIND()</span></code></a> to select the right function.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c macro">
- <dt class="sig sig-object c" id="c.PyUnicode_1BYTE_KIND">
- <span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_1BYTE_KIND</span></span></span><a class="headerlink" href="#c.PyUnicode_1BYTE_KIND" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyUnicode_2BYTE_KIND">
- <span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_2BYTE_KIND</span></span></span><a class="headerlink" href="#c.PyUnicode_2BYTE_KIND" title="Link to this definition">¶</a><br /></dt>
- <dt class="sig sig-object c" id="c.PyUnicode_4BYTE_KIND">
- <span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_4BYTE_KIND</span></span></span><a class="headerlink" href="#c.PyUnicode_4BYTE_KIND" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return values of the <a class="reference internal" href="#c.PyUnicode_KIND" title="PyUnicode_KIND"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_KIND()</span></code></a> macro.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.12: </span><code class="docutils literal notranslate"><span class="pre">PyUnicode_WCHAR_KIND</span></code> has been removed.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_KIND">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_KIND</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_KIND" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return one of the PyUnicode kind constants (see above) that indicate how many
- bytes per character this Unicode object uses to store its data. <em>unicode</em> has to
- be a Unicode object in the “canonical” representation (not checked).</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DATA">
- <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DATA</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DATA" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return a void pointer to the raw Unicode buffer. <em>unicode</em> has to be a Unicode
- object in the “canonical” representation (not checked).</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_WRITE">
- <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_WRITE</span></span></span><span class="sig-paren">(</span><span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">kind</span></span>, <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">data</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">index</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">value</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_WRITE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Write into a canonical representation <em>data</em> (as obtained with
- <a class="reference internal" href="#c.PyUnicode_DATA" title="PyUnicode_DATA"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DATA()</span></code></a>). This function performs no sanity checks, and is
- intended for usage in loops. The caller should cache the <em>kind</em> value and
- <em>data</em> pointer as obtained from other calls. <em>index</em> is the index in
- the string (starts at 0) and <em>value</em> is the new code point value which should
- be written to that location.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_READ">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_READ</span></span></span><span class="sig-paren">(</span><span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">kind</span></span>, <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">data</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">index</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_READ" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Read a code point from a canonical representation <em>data</em> (as obtained with
- <a class="reference internal" href="#c.PyUnicode_DATA" title="PyUnicode_DATA"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DATA()</span></code></a>). No checks or ready calls are performed.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
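- <p>As a rough mental model of the read and write operations described above, the following standalone sketch dispatches on a kind value to pick the element width. It is not CPython's implementation: <code class="docutils literal notranslate"><span class="pre">model_read()</span></code>, <code class="docutils literal notranslate"><span class="pre">model_write()</span></code> and the <code class="docutils literal notranslate"><span class="pre">KIND_*</span></code> constants are hypothetical names, and the constant values are assumed to equal the number of bytes per character.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PyUnicode_*_KIND constants. The numeric values are an
 * assumption chosen to equal the bytes stored per character. */
enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Model of a read: pick the element type from kind, then index.
 * The index is not range-checked. */
uint32_t
model_read(int kind, const void *data, size_t index)
{
    assert(kind == KIND_1BYTE || kind == KIND_2BYTE || kind == KIND_4BYTE);
    if (kind == KIND_1BYTE) {
        return ((const uint8_t *)data)[index];
    }
    if (kind == KIND_2BYTE) {
        return ((const uint16_t *)data)[index];
    }
    return ((const uint32_t *)data)[index];
}

/* Model of a write: the caller must guarantee that value fits the element
 * width, because neither the index nor the value is checked here. */
void
model_write(int kind, void *data, size_t index, uint32_t value)
{
    assert(kind == KIND_1BYTE || kind == KIND_2BYTE || kind == KIND_4BYTE);
    if (kind == KIND_1BYTE) {
        ((uint8_t *)data)[index] = (uint8_t)value;
    }
    else if (kind == KIND_2BYTE) {
        ((uint16_t *)data)[index] = (uint16_t)value;
    }
    else {
        ((uint32_t *)data)[index] = value;
    }
}
```

- <p>The model makes the caching advice concrete: the kind and the data pointer stay the same for every index, so a loop only needs to fetch them once.</p>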
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_READ_CHAR">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_READ_CHAR</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">index</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_READ_CHAR" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Read a character from a Unicode object <em>unicode</em>, which must be in the “canonical”
- representation. This is less efficient than <a class="reference internal" href="#c.PyUnicode_READ" title="PyUnicode_READ"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_READ()</span></code></a> if you
- do multiple consecutive reads.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_MAX_CHAR_VALUE">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_MAX_CHAR_VALUE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_MAX_CHAR_VALUE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the maximum code point that is suitable for creating another string
- based on <em>unicode</em>, which must be in the “canonical” representation. This is
- always an approximation but more efficient than iterating over the string.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_IsIdentifier">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_IsIdentifier</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_IsIdentifier" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> if the string is a valid identifier according to the language
- definition, section <a class="reference internal" href="../reference/lexical_analysis.html#identifiers"><span class="std std-ref">Identifiers and keywords</span></a>. Return <code class="docutils literal notranslate"><span class="pre">0</span></code> otherwise.</p>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.9: </span>The function does not call <a class="reference internal" href="sys.html#c.Py_FatalError" title="Py_FatalError"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_FatalError()</span></code></a> anymore if the string
- is not ready.</p>
- </div>
- </dd></dl>
-
- </section>
- <section id="unicode-character-properties">
- <h3>Unicode Character Properties<a class="headerlink" href="#unicode-character-properties" title="Link to this heading">¶</a></h3>
- <p>Unicode provides many different character properties. The most commonly needed ones
- are available through these macros, which are mapped to C functions depending on
- the Python configuration.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISSPACE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISSPACE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISSPACE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a whitespace character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISLOWER">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISLOWER</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISLOWER" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a lowercase character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISUPPER">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISUPPER</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISUPPER" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is an uppercase character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISTITLE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISTITLE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISTITLE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a titlecase character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISLINEBREAK">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISLINEBREAK</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISLINEBREAK" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a linebreak character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISDECIMAL">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISDECIMAL</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISDECIMAL" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a decimal character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISDIGIT">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISDIGIT</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISDIGIT" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a digit character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISNUMERIC">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISNUMERIC</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISNUMERIC" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a numeric character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISALPHA">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISALPHA</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISALPHA" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is an alphabetic character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISALNUM">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISALNUM</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISALNUM" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is an alphanumeric character.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_ISPRINTABLE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_ISPRINTABLE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_ISPRINTABLE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> or <code class="docutils literal notranslate"><span class="pre">0</span></code> depending on whether <em>ch</em> is a printable character.
- Nonprintable characters are those defined in the Unicode character
- database as “Other” or “Separator”, except the ASCII space (0x20), which is
- considered printable. (Note that printable characters in this context are
- those which should not be escaped when <a class="reference internal" href="../library/functions.html#repr" title="repr"><code class="xref py py-func docutils literal notranslate"><span class="pre">repr()</span></code></a> is invoked on a string.
- It has no bearing on the handling of strings written to <a class="reference internal" href="../library/sys.html#sys.stdout" title="sys.stdout"><code class="xref py py-data docutils literal notranslate"><span class="pre">sys.stdout</span></code></a> or
- <a class="reference internal" href="../library/sys.html#sys.stderr" title="sys.stderr"><code class="xref py py-data docutils literal notranslate"><span class="pre">sys.stderr</span></code></a>.)</p>
- </dd></dl>
-
- <p>These APIs can be used for fast direct character conversions:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TOLOWER">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TOLOWER</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TOLOWER" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to lower case.</p>
- <div class="deprecated">
- <p><span class="versionmodified deprecated">Deprecated since version 3.3: </span>This function uses simple case mappings.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TOUPPER">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TOUPPER</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TOUPPER" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to upper case.</p>
- <div class="deprecated">
- <p><span class="versionmodified deprecated">Deprecated since version 3.3: </span>This function uses simple case mappings.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TOTITLE">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TOTITLE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TOTITLE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to title case.</p>
- <div class="deprecated">
- <p><span class="versionmodified deprecated">Deprecated since version 3.3: </span>This function uses simple case mappings.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TODECIMAL">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TODECIMAL</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TODECIMAL" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to a non-negative decimal integer. Return
- <code class="docutils literal notranslate"><span class="pre">-1</span></code> if this is not possible. This function does not raise exceptions.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TODIGIT">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TODIGIT</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TODIGIT" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to a single digit integer. Return <code class="docutils literal notranslate"><span class="pre">-1</span></code> if
- this is not possible. This function does not raise exceptions.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_TONUMERIC">
- <span class="kt"><span class="pre">double</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_TONUMERIC</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_TONUMERIC" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Return the character <em>ch</em> converted to a double. Return <code class="docutils literal notranslate"><span class="pre">-1.0</span></code> if this is not
- possible. This function does not raise exceptions.</p>
- </dd></dl>
-
- <p>These APIs can be used to work with surrogates:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_IS_SURROGATE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_IS_SURROGATE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_IS_SURROGATE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Check if <em>ch</em> is a surrogate (<code class="docutils literal notranslate"><span class="pre">0xD800</span> <span class="pre">&lt;=</span> <span class="pre">ch</span> <span class="pre">&lt;=</span> <span class="pre">0xDFFF</span></code>).</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_IS_HIGH_SURROGATE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_IS_HIGH_SURROGATE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_IS_HIGH_SURROGATE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Check if <em>ch</em> is a high surrogate (<code class="docutils literal notranslate"><span class="pre">0xD800</span> <span class="pre">&lt;=</span> <span class="pre">ch</span> <span class="pre">&lt;=</span> <span class="pre">0xDBFF</span></code>).</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_IS_LOW_SURROGATE">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_IS_LOW_SURROGATE</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_IS_LOW_SURROGATE" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Check if <em>ch</em> is a low surrogate (<code class="docutils literal notranslate"><span class="pre">0xDC00</span> <span class="pre">&lt;=</span> <span class="pre">ch</span> <span class="pre">&lt;=</span> <span class="pre">0xDFFF</span></code>).</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.Py_UNICODE_JOIN_SURROGATES">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">Py_UNICODE_JOIN_SURROGATES</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">high</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">low</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.Py_UNICODE_JOIN_SURROGATES" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Join two surrogate characters and return a single <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><code class="xref c c-type docutils literal notranslate"><span class="pre">Py_UCS4</span></code></a> value.
- <em>high</em> and <em>low</em> are respectively the leading and trailing surrogates in a
- surrogate pair. <em>high</em> must be in the range [0xD800; 0xDBFF] and <em>low</em> must
- be in the range [0xDC00; 0xDFFF].</p>
- </dd></dl>
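- <p>The arithmetic behind joining a surrogate pair can be written out in plain C. The sketch below uses the standard UTF-16 decoding formula and the ranges stated above. The helper names are hypothetical, and the code is assumed to mirror the result of the macros rather than reproduce their source.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Range checks matching the documented surrogate ranges. */
static int is_high_surrogate(uint32_t ch)
{
    return 0xD800 <= ch && ch <= 0xDBFF;
}

static int is_low_surrogate(uint32_t ch)
{
    return 0xDC00 <= ch && ch <= 0xDFFF;
}

/* Combine a leading (high) and trailing (low) surrogate into one code point.
 * Each surrogate carries 10 payload bits; the result is offset by 0x10000. */
static uint32_t join_surrogates(uint32_t high, uint32_t low)
{
    assert(is_high_surrogate(high) && is_low_surrogate(low));
    return 0x10000 + (((high - 0xD800) << 10) | (low - 0xDC00));
}
```

- <p>For example, the pair 0xD83D, 0xDE00 joins to U+1F600: (0x3D &lt;&lt; 10) is 0xF400, adding 0x200 gives 0xF600, and the 0x10000 offset yields 0x1F600.</p>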
-
- </section>
- <section id="creating-and-accessing-unicode-strings">
- <h3>Creating and accessing Unicode strings<a class="headerlink" href="#creating-and-accessing-unicode-strings" title="Link to this heading">¶</a></h3>
- <p>To create Unicode objects and access their basic sequence properties, use these
- APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_New">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_New</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">maxchar</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_New" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><p>Create a new Unicode object. <em>maxchar</em> should be the true maximum code point
- to be placed in the string. As an approximation, it can be rounded up to the
- nearest value in the sequence 127, 255, 65535, 1114111.</p>
- <p>This is the recommended way to allocate a new Unicode object. Objects
- created using this function are not resizable.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
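- <p>The rounding rule for <em>maxchar</em> can be sketched as a small standalone helper. <code class="docutils literal notranslate"><span class="pre">round_up_maxchar</span></code> is a hypothetical name used for illustration and is not part of the C API. It maps a code point to the nearest value at or above it in the sequence 127, 255, 65535, 1114111.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Round a maximum code point up to the nearest documented boundary.
 * The input must be a valid code point (at most 0x10FFFF). */
static uint32_t round_up_maxchar(uint32_t max_code_point)
{
    assert(max_code_point <= 0x10FFFF);
    if (max_code_point <= 127) {
        return 127;        /* ASCII */
    }
    if (max_code_point <= 255) {
        return 255;        /* fits in one byte */
    }
    if (max_code_point <= 65535) {
        return 65535;      /* fits in two bytes */
    }
    return 1114111;        /* 0x10FFFF, the largest code point */
}
```

- <p>A string whose largest character is U+20AC would therefore use 65535 as its rounded <em>maxchar</em>.</p>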
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromKindAndData">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromKindAndData</span></span></span><span class="sig-paren">(</span><span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">kind</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">buffer</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromKindAndData" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><p>Create a new Unicode object with the given <em>kind</em> (possible values are
- <a class="reference internal" href="#c.PyUnicode_1BYTE_KIND" title="PyUnicode_1BYTE_KIND"><code class="xref c c-macro docutils literal notranslate"><span class="pre">PyUnicode_1BYTE_KIND</span></code></a> etc., as returned by
- <a class="reference internal" href="#c.PyUnicode_KIND" title="PyUnicode_KIND"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_KIND()</span></code></a>). The <em>buffer</em> must point to an array of <em>size</em>
- units of 1, 2 or 4 bytes per character, as given by the kind.</p>
- <p>If necessary, the input <em>buffer</em> is copied and transformed into the
- canonical representation. For example, if the <em>buffer</em> is a UCS4 string
- (<a class="reference internal" href="#c.PyUnicode_4BYTE_KIND" title="PyUnicode_4BYTE_KIND"><code class="xref c c-macro docutils literal notranslate"><span class="pre">PyUnicode_4BYTE_KIND</span></code></a>) and it consists only of codepoints in
- the UCS1 range, it will be transformed into UCS1
- (<a class="reference internal" href="#c.PyUnicode_1BYTE_KIND" title="PyUnicode_1BYTE_KIND"><code class="xref c c-macro docutils literal notranslate"><span class="pre">PyUnicode_1BYTE_KIND</span></code></a>).</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
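- <p>The narrowing described above amounts to finding the largest code point in the input and choosing the smallest character width that can hold it. The following standalone sketch illustrates that decision for a UCS4 buffer. <code class="docutils literal notranslate"><span class="pre">narrowest_kind</span></code> is a hypothetical helper, and its return value is assumed to be the number of bytes per character (1, 2 or 4), not a CPython constant.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Return the smallest width in bytes (1, 2 or 4) able to store every
 * code point in `buffer`. An empty buffer needs only 1 byte per unit. */
static int narrowest_kind(const uint32_t *buffer, size_t size)
{
    uint32_t max_code_point = 0;
    for (size_t i = 0; i < size; i++) {
        if (buffer[i] > max_code_point) {
            max_code_point = buffer[i];
        }
    }
    if (max_code_point <= 0xFF) {
        return 1;   /* UCS1 range */
    }
    if (max_code_point <= 0xFFFF) {
        return 2;   /* UCS2 range */
    }
    return 4;       /* needs full UCS4 */
}
```

- <p>A UCS4 buffer holding only "Hi" (0x48, 0x69) would be narrowed to one byte per character under this rule, while a single U+1F600 keeps the four-byte width.</p>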
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromStringAndSize">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromStringAndSize</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromStringAndSize" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object from the char buffer <em>str</em>. The bytes will be
- interpreted as being UTF-8 encoded. The buffer is copied into the new
- object.
- The return value might be a shared object, i.e. modification of the data is
- not allowed.</p>
- <p>This function raises <a class="reference internal" href="../library/exceptions.html#SystemError" title="SystemError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">SystemError</span></code></a> when:</p>
- <ul class="simple">
- <li><p><em>size</em> &lt; 0, or</p></li>
- <li><p><em>str</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code> and <em>size</em> &gt; 0.</p></li>
- </ul>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.12: </span><em>str</em> == <code class="docutils literal notranslate"><span class="pre">NULL</span></code> with <em>size</em> &gt; 0 is no longer allowed.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromString</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object from a UTF-8 encoded null-terminated char buffer
- <em>str</em>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromFormat">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromFormat</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">format</span></span>, <span class="p"><span class="pre">...</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromFormat" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Take a C <code class="xref c c-func docutils literal notranslate"><span class="pre">printf()</span></code>-style <em>format</em> string and a variable number of
- arguments, calculate the size of the resulting Python Unicode string and return
- a string with the values formatted into it. The variable arguments must be C
- types and must correspond exactly to the format characters in the ASCII-encoded
- <em>format</em> string.</p>
- <p>A conversion specifier contains two or more characters and has the following
- components, which must occur in this order:</p>
- <ol class="arabic simple">
- <li><p>The <code class="docutils literal notranslate"><span class="pre">'%'</span></code> character, which marks the start of the specifier.</p></li>
- <li><p>Conversion flags (optional), which affect the result of some conversion
- types.</p></li>
- <li><p>Minimum field width (optional).
- If specified as an <code class="docutils literal notranslate"><span class="pre">'*'</span></code> (asterisk), the actual width is given in the
- next argument, which must be of type <span class="c-expr sig sig-inline c"><span class="kt">int</span></span>, and the object to
- convert comes after the minimum field width and optional precision.</p></li>
- <li><p>Precision (optional), given as a <code class="docutils literal notranslate"><span class="pre">'.'</span></code> (dot) followed by the precision.
- If specified as <code class="docutils literal notranslate"><span class="pre">'*'</span></code> (an asterisk), the actual precision is given in
- the next argument, which must be of type <span class="c-expr sig sig-inline c"><span class="kt">int</span></span>, and the value to
- convert comes after the precision.</p></li>
- <li><p>Length modifier (optional).</p></li>
- <li><p>Conversion type.</p></li>
- </ol>
- <p>The conversion flag characters are:</p>
- <table class="docutils align-default">
- <thead>
- <tr class="row-odd"><th class="head"><p>Flag</p></th>
- <th class="head"><p>Meaning</p></th>
- </tr>
- </thead>
- <tbody>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">0</span></code></p></td>
- <td><p>The conversion will be zero padded for numeric values.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">-</span></code></p></td>
- <td><p>The converted value is left adjusted (overrides the <code class="docutils literal notranslate"><span class="pre">0</span></code>
- flag if both are given).</p></td>
- </tr>
- </tbody>
- </table>
- <p>The length modifiers for the following integer conversions (<code class="docutils literal notranslate"><span class="pre">d</span></code>, <code class="docutils literal notranslate"><span class="pre">i</span></code>,
- <code class="docutils literal notranslate"><span class="pre">o</span></code>, <code class="docutils literal notranslate"><span class="pre">u</span></code>, <code class="docutils literal notranslate"><span class="pre">x</span></code>, or <code class="docutils literal notranslate"><span class="pre">X</span></code>) specify the type of the argument
- (<span class="c-expr sig sig-inline c"><span class="kt">int</span></span> by default):</p>
- <table class="docutils align-default">
- <thead>
- <tr class="row-odd"><th class="head"><p>Modifier</p></th>
- <th class="head"><p>Types</p></th>
- </tr>
- </thead>
- <tbody>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">l</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><span class="kt">long</span></span> or <span class="c-expr sig sig-inline c"><span class="kt">unsigned</span><span class="w"> </span><span class="kt">long</span></span></p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">ll</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><span class="kt">long</span><span class="w"> </span><span class="kt">long</span></span> or <span class="c-expr sig sig-inline c"><span class="kt">unsigned</span><span class="w"> </span><span class="kt">long</span><span class="w"> </span><span class="kt">long</span></span></p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">j</span></code></p></td>
- <td><p><code class="xref c c-type docutils literal notranslate"><span class="pre">intmax_t</span></code> or <code class="xref c c-type docutils literal notranslate"><span class="pre">uintmax_t</span></code></p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">z</span></code></p></td>
- <td><p><code class="xref c c-type docutils literal notranslate"><span class="pre">size_t</span></code> or <code class="xref c c-type docutils literal notranslate"><span class="pre">ssize_t</span></code></p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">t</span></code></p></td>
- <td><p><code class="xref c c-type docutils literal notranslate"><span class="pre">ptrdiff_t</span></code></p></td>
- </tr>
- </tbody>
- </table>
- <p>The length modifier <code class="docutils literal notranslate"><span class="pre">l</span></code> for the following conversions <code class="docutils literal notranslate"><span class="pre">s</span></code> or <code class="docutils literal notranslate"><span class="pre">V</span></code> specifies
- that the type of the argument is <span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="n">wchar_t</span><span class="p">*</span></span>.</p>
- <p>The conversion specifiers are:</p>
- <table class="docutils align-default">
- <thead>
- <tr class="row-odd"><th class="head"><p>Conversion Specifier</p></th>
- <th class="head"><p>Type</p></th>
- <th class="head"><p>Comment</p></th>
- </tr>
- </thead>
- <tbody>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">%</span></code></p></td>
- <td><p><em>n/a</em></p></td>
- <td><p>The literal <code class="docutils literal notranslate"><span class="pre">%</span></code> character.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">d</span></code>, <code class="docutils literal notranslate"><span class="pre">i</span></code></p></td>
- <td><p>Specified by the length modifier</p></td>
- <td><p>The decimal representation of a signed C integer.</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">u</span></code></p></td>
- <td><p>Specified by the length modifier</p></td>
- <td><p>The decimal representation of an unsigned C integer.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">o</span></code></p></td>
- <td><p>Specified by the length modifier</p></td>
- <td><p>The octal representation of an unsigned C integer.</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">x</span></code></p></td>
- <td><p>Specified by the length modifier</p></td>
- <td><p>The hexadecimal representation of an unsigned C integer (lowercase).</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">X</span></code></p></td>
- <td><p>Specified by the length modifier</p></td>
- <td><p>The hexadecimal representation of an unsigned C integer (uppercase).</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">c</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><span class="kt">int</span></span></p></td>
- <td><p>A single character.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">s</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="kt">char</span><span class="p">*</span></span> or <span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="n">wchar_t</span><span class="p">*</span></span></p></td>
- <td><p>A null-terminated C character array.</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">p</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="kt">void</span><span class="p">*</span></span></p></td>
- <td><p>The hex representation of a C pointer.
- Mostly equivalent to <code class="docutils literal notranslate"><span class="pre">printf("%p")</span></code> except that it is guaranteed to
- start with the literal <code class="docutils literal notranslate"><span class="pre">0x</span></code> regardless of what the platform’s
- <code class="docutils literal notranslate"><span class="pre">printf</span></code> yields.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">A</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n">PyObject</span></a><span class="p">*</span></span></p></td>
- <td><p>The result of calling <a class="reference internal" href="../library/functions.html#ascii" title="ascii"><code class="xref py py-func docutils literal notranslate"><span class="pre">ascii()</span></code></a>.</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">U</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n">PyObject</span></a><span class="p">*</span></span></p></td>
- <td><p>A Unicode object.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">V</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n">PyObject</span></a><span class="p">*</span></span>, <span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="kt">char</span><span class="p">*</span></span> or <span class="c-expr sig sig-inline c"><span class="k">const</span><span class="w"> </span><span class="n">wchar_t</span><span class="p">*</span></span></p></td>
- <td><p>A Unicode object (which may be <code class="docutils literal notranslate"><span class="pre">NULL</span></code>) and a null-terminated
- C character array as a second parameter (which will be used,
- if the first parameter is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>).</p></td>
- </tr>
- <tr class="row-even"><td><p><code class="docutils literal notranslate"><span class="pre">S</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n">PyObject</span></a><span class="p">*</span></span></p></td>
- <td><p>The result of calling <a class="reference internal" href="object.html#c.PyObject_Str" title="PyObject_Str"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyObject_Str()</span></code></a>.</p></td>
- </tr>
- <tr class="row-odd"><td><p><code class="docutils literal notranslate"><span class="pre">R</span></code></p></td>
- <td><p><span class="c-expr sig sig-inline c"><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n">PyObject</span></a><span class="p">*</span></span></p></td>
- <td><p>The result of calling <a class="reference internal" href="object.html#c.PyObject_Repr" title="PyObject_Repr"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyObject_Repr()</span></code></a>.</p></td>
- </tr>
- </tbody>
- </table>
- <div class="admonition note">
- <p class="admonition-title">Note</p>
- <p>The width formatter unit is the number of characters rather than bytes.
- The precision formatter unit is the number of bytes or <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code>
- items (if the length modifier <code class="docutils literal notranslate"><span class="pre">l</span></code> is used) for <code class="docutils literal notranslate"><span class="pre">"%s"</span></code> and
- <code class="docutils literal notranslate"><span class="pre">"%V"</span></code> (if the <code class="docutils literal notranslate"><span class="pre">PyObject*</span></code> argument is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>), and a number of
- characters for <code class="docutils literal notranslate"><span class="pre">"%A"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%U"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%S"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%R"</span></code> and <code class="docutils literal notranslate"><span class="pre">"%V"</span></code>
- (if the <code class="docutils literal notranslate"><span class="pre">PyObject*</span></code> argument is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>).</p>
- </div>
- <div class="admonition note">
- <p class="admonition-title">Note</p>
- <p>Unlike C <code class="xref c c-func docutils literal notranslate"><span class="pre">printf()</span></code>, the <code class="docutils literal notranslate"><span class="pre">0</span></code> flag has an effect even when
- a precision is given for integer conversions (<code class="docutils literal notranslate"><span class="pre">d</span></code>, <code class="docutils literal notranslate"><span class="pre">i</span></code>, <code class="docutils literal notranslate"><span class="pre">u</span></code>, <code class="docutils literal notranslate"><span class="pre">o</span></code>,
- <code class="docutils literal notranslate"><span class="pre">x</span></code>, or <code class="docutils literal notranslate"><span class="pre">X</span></code>).</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.2: </span>Support for <code class="docutils literal notranslate"><span class="pre">"%lld"</span></code> and <code class="docutils literal notranslate"><span class="pre">"%llu"</span></code> added.</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.3: </span>Support for <code class="docutils literal notranslate"><span class="pre">"%li"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%lli"</span></code> and <code class="docutils literal notranslate"><span class="pre">"%zi"</span></code> added.</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.4: </span>Support for width and precision formatters for <code class="docutils literal notranslate"><span class="pre">"%s"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%A"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%U"</span></code>,
- <code class="docutils literal notranslate"><span class="pre">"%V"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%S"</span></code>, <code class="docutils literal notranslate"><span class="pre">"%R"</span></code> added.</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.12: </span>Support for conversion specifiers <code class="docutils literal notranslate"><span class="pre">o</span></code> and <code class="docutils literal notranslate"><span class="pre">X</span></code>.
- Support for length modifiers <code class="docutils literal notranslate"><span class="pre">j</span></code> and <code class="docutils literal notranslate"><span class="pre">t</span></code>.
- Length modifiers are now applied to all integer conversions.
- Length modifier <code class="docutils literal notranslate"><span class="pre">l</span></code> is now applied to conversion specifiers <code class="docutils literal notranslate"><span class="pre">s</span></code> and <code class="docutils literal notranslate"><span class="pre">V</span></code>.
- Support for variable width and precision <code class="docutils literal notranslate"><span class="pre">*</span></code>.
- Support for flag <code class="docutils literal notranslate"><span class="pre">-</span></code>.</p>
- <p>An unrecognized format character now sets a <a class="reference internal" href="../library/exceptions.html#SystemError" title="SystemError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">SystemError</span></code></a>.
- In previous versions it caused all the rest of the format string to be
- copied as-is to the result string, and any extra arguments discarded.</p>
- </div>
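- <p>A minimal sketch of how several of the specifiers above can be combined.
- The helper name <code class="docutils literal notranslate"><span class="pre">describe_item</span></code> and its parameters are hypothetical and
- chosen only for illustration; they are not part of the C API.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre>#include &lt;Python.h&gt;
-
- /* Hypothetical helper: build "NAME: count=N id=0xHHHH value=REPR".
-  * Returns a new reference, or NULL with an exception set. */
- static PyObject *
- describe_item(const char *name, Py_ssize_t count, unsigned int id,
-               PyObject *value)
- {
-     /* %s   takes a null-terminated UTF-8 const char*
-      * %zd  takes a Py_ssize_t (z length modifier, signed decimal)
-      * %04x takes an unsigned int, lowercase hex, zero padded to width 4
-      * %R   takes a PyObject* and formats it with PyObject_Repr() */
-     return PyUnicode_FromFormat("%s: count=%zd id=0x%04x value=%R",
-                                 name, count, id, value);
- }
- </pre></div></div>
- <p>Under these assumptions, calling <code class="docutils literal notranslate"><span class="pre">describe_item("spam",</span> <span class="pre">3,</span> <span class="pre">255,</span> <span class="pre">value)</span></code> with
- <em>value</em> being the string <code class="docutils literal notranslate"><span class="pre">'x'</span></code> should yield
- <code class="docutils literal notranslate"><span class="pre">spam:</span> <span class="pre">count=3</span> <span class="pre">id=0x00ff</span> <span class="pre">value='x'</span></code>.</p>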
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromFormatV">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromFormatV</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">format</span></span>, <span class="n"><span class="pre">va_list</span></span><span class="w"> </span><span class="n"><span class="pre">vargs</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromFormatV" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Identical to <a class="reference internal" href="#c.PyUnicode_FromFormat" title="PyUnicode_FromFormat"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_FromFormat()</span></code></a> except that it takes exactly two
- arguments: the <em>format</em> string and a <code class="docutils literal notranslate"><span class="pre">va_list</span></code> holding the values to format.</p>
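- <p>This form is typically useful inside a variadic wrapper. The sketch below
- assumes a hypothetical helper named <code class="docutils literal notranslate"><span class="pre">make_warning</span></code> that forwards its
- arguments and prefixes the result; it is an illustration, not an API
- provided by Python.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre>#include &lt;Python.h&gt;
- #include &lt;stdarg.h&gt;
-
- /* Hypothetical helper: format a message and prefix it with "warning: ".
-  * Returns a new reference, or NULL with an exception set. */
- static PyObject *
- make_warning(const char *format, ...)
- {
-     va_list vargs;
-     va_start(vargs, format);
-     /* Forward the caller's variable arguments as a single va_list. */
-     PyObject *body = PyUnicode_FromFormatV(format, vargs);
-     va_end(vargs);
-     if (body == NULL) {
-         return NULL;
-     }
-     /* %U inserts an existing Unicode object into the new string. */
-     PyObject *result = PyUnicode_FromFormat("warning: %U", body);
-     Py_DECREF(body);
-     return result;
- }
- </pre></div></div>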
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromObject">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromObject</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromObject" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Copy an instance of a Unicode subtype to a new true Unicode object if
- necessary. If <em>obj</em> is already a true Unicode object (not a subtype),
- return a new <a class="reference internal" href="../glossary.html#term-strong-reference"><span class="xref std std-term">strong reference</span></a> to the object.</p>
- <p>Objects other than Unicode or its subtypes will cause a <a class="reference internal" href="../library/exceptions.html#TypeError" title="TypeError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">TypeError</span></code></a>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromEncodedObject">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromEncodedObject</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">encoding</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromEncodedObject" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Decode an encoded object <em>obj</em> to a Unicode object.</p>
- <p><a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>, <a class="reference internal" href="../library/stdtypes.html#bytearray" title="bytearray"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytearray</span></code></a> and other
- <a class="reference internal" href="../glossary.html#term-bytes-like-object"><span class="xref std std-term">bytes-like objects</span></a>
- are decoded according to the given <em>encoding</em> and using the error handling
- defined by <em>errors</em>. Both can be <code class="docutils literal notranslate"><span class="pre">NULL</span></code> to have the interface use the default
- values (see <a class="reference internal" href="#builtincodecs"><span class="std std-ref">Built-in Codecs</span></a> for details).</p>
- <p>All other objects, including Unicode objects, cause a <a class="reference internal" href="../library/exceptions.html#TypeError" title="TypeError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">TypeError</span></code></a> to be
- set.</p>
- <p>The API returns <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if there was an error. The caller is responsible for
- decref’ing the returned object.</p>
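- <p>As an illustration, the following sketch decodes a raw byte buffer through
- an intermediate <code class="docutils literal notranslate"><span class="pre">bytes</span></code> object. The helper name <code class="docutils literal notranslate"><span class="pre">decode_latin1</span></code> is
- hypothetical, and the choice of the <code class="docutils literal notranslate"><span class="pre">"latin-1"</span></code> codec with <code class="docutils literal notranslate"><span class="pre">"strict"</span></code>
- error handling is an assumption made for the example.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre>#include &lt;Python.h&gt;
-
- /* Hypothetical helper: decode size bytes of Latin-1 data.
-  * Returns a new reference, or NULL with an exception set. */
- static PyObject *
- decode_latin1(const char *data, Py_ssize_t size)
- {
-     PyObject *raw = PyBytes_FromStringAndSize(data, size);
-     if (raw == NULL) {
-         return NULL;
-     }
-     /* Passing a str here instead of bytes would set TypeError. */
-     PyObject *text = PyUnicode_FromEncodedObject(raw, "latin-1", "strict");
-     Py_DECREF(raw);  /* the decoded result does not depend on raw */
-     return text;     /* NULL on error; otherwise the caller owns it */
- }
- </pre></div></div>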
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_GetLength">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_GetLength</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_GetLength" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Return the length of the Unicode object, in code points.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_CopyCharacters">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_CopyCharacters</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">to</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">to_start</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">from</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">from_start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">how_many</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_CopyCharacters" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Copy characters from one Unicode object into another. This function performs
- character conversion when necessary and falls back to <code class="xref c c-func docutils literal notranslate"><span class="pre">memcpy()</span></code> if
- possible. Returns <code class="docutils literal notranslate"><span class="pre">-1</span></code> and sets an exception on error, otherwise returns
- the number of copied characters.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Fill">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Fill</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">length</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">fill_char</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Fill" title="Link to this definition">¶</a><br /></dt>
- <dd><p>Fill a string with a character: write <em>fill_char</em> into
- <code class="docutils literal notranslate"><span class="pre">unicode[start:start+length]</span></code>.</p>
- <p>Fail if <em>fill_char</em> is bigger than the string's maximum character, or if the
- string has more than one reference.</p>
- <p>Return the number of written characters, or return <code class="docutils literal notranslate"><span class="pre">-1</span></code> and raise an
- exception on error.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_WriteChar">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_WriteChar</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">index</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">character</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_WriteChar" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Write a character to a string. The string must have been created through
- <a class="reference internal" href="#c.PyUnicode_New" title="PyUnicode_New"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_New()</span></code></a>. Since Unicode strings are supposed to be immutable,
- the string must not be shared, or have been hashed yet.</p>
- <p>This function checks that <em>unicode</em> is a Unicode object, that the index is
- not out of bounds, and that the object can be modified safely (i.e. that
- its reference count is one).</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
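- <p>A minimal sketch of filling a freshly created string one character at a
- time. The helper name <code class="docutils literal notranslate"><span class="pre">make_repeated</span></code> is hypothetical; the string is
- written only while this function holds the sole reference to it.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre>#include &lt;Python.h&gt;
-
- /* Hypothetical helper: build a string made of n copies of ch.
-  * Returns a new reference, or NULL with an exception set. */
- static PyObject *
- make_repeated(Py_UCS4 ch, Py_ssize_t n)
- {
-     /* The second argument is the largest code point the string must hold. */
-     PyObject *s = PyUnicode_New(n, ch);
-     if (s == NULL) {
-         return NULL;
-     }
-     for (Py_ssize_t i = 0; i &lt; n; i++) {
-         /* A negative result means an exception has been set. */
-         if (PyUnicode_WriteChar(s, i, ch) &lt; 0) {
-             Py_DECREF(s);
-             return NULL;
-         }
-     }
-     return s;
- }
- </pre></div></div>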
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_ReadChar">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_ReadChar</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">index</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_ReadChar" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Read a character from a string. This function checks that <em>unicode</em> is a
- Unicode object and the index is not out of bounds, in contrast to
- <a class="reference internal" href="#c.PyUnicode_READ_CHAR" title="PyUnicode_READ_CHAR"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_READ_CHAR()</span></code></a>, which performs no error checking.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Substring">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Substring</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">end</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Substring" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Return a substring of <em>unicode</em>, from character index <em>start</em> (included) to
- character index <em>end</em> (excluded). Negative indices are not supported.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUCS4">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUCS4</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">buffer</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">buflen</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">copy_null</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUCS4" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Copy the string <em>unicode</em> into a UCS4 buffer, including a null character, if
- <em>copy_null</em> is set. Returns <code class="docutils literal notranslate"><span class="pre">NULL</span></code> and sets an exception on error (in
- particular, a <a class="reference internal" href="../library/exceptions.html#SystemError" title="SystemError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">SystemError</span></code></a> if <em>buflen</em> is smaller than the length of
- <em>unicode</em>). <em>buffer</em> is returned on success.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUCS4Copy">
- <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUCS4Copy</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUCS4Copy" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Copy the string <em>unicode</em> into a new UCS4 buffer that is allocated using
- <a class="reference internal" href="memory.html#c.PyMem_Malloc" title="PyMem_Malloc"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyMem_Malloc()</span></code></a>. If this fails, <code class="docutils literal notranslate"><span class="pre">NULL</span></code> is returned with a
- <a class="reference internal" href="../library/exceptions.html#MemoryError" title="MemoryError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">MemoryError</span></code></a> set. The returned buffer always has an extra
- null code point appended.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- </section>
- <section id="locale-encoding">
- <h3>Locale Encoding<a class="headerlink" href="#locale-encoding" title="Link to this heading">¶</a></h3>
- <p>The current locale encoding can be used to decode text from the operating
- system.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeLocaleAndSize">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeLocaleAndSize</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">length</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeLocaleAndSize" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Decode a string from UTF-8 on Android and VxWorks, or from the current
- locale encoding on other platforms. The supported
- error handlers are <code class="docutils literal notranslate"><span class="pre">"strict"</span></code> and <code class="docutils literal notranslate"><span class="pre">"surrogateescape"</span></code>
- (<span class="target" id="index-2"></span><a class="pep reference external" href="https://peps.python.org/pep-0383/"><strong>PEP 383</strong></a>). The decoder uses the <code class="docutils literal notranslate"><span class="pre">"strict"</span></code> error handler if
- <em>errors</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>. <em>str</em> must end with a null character but
- cannot contain embedded null characters.</p>
- <p>Use <a class="reference internal" href="#c.PyUnicode_DecodeFSDefaultAndSize" title="PyUnicode_DecodeFSDefaultAndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeFSDefaultAndSize()</span></code></a> to decode a string from
- the <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and error handler</span></a>.</p>
- <p>This function ignores the <a class="reference internal" href="../library/os.html#utf8-mode"><span class="std std-ref">Python UTF-8 Mode</span></a>.</p>
- <div class="admonition seealso">
- <p class="admonition-title">See also</p>
- <p>The <a class="reference internal" href="sys.html#c.Py_DecodeLocale" title="Py_DecodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_DecodeLocale()</span></code></a> function.</p>
- </div>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span>The function now also uses the current locale encoding for the
- <code class="docutils literal notranslate"><span class="pre">surrogateescape</span></code> error handler, except on Android. Previously, <a class="reference internal" href="sys.html#c.Py_DecodeLocale" title="Py_DecodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_DecodeLocale()</span></code></a>
- was used for the <code class="docutils literal notranslate"><span class="pre">surrogateescape</span></code> error handler, and the current locale encoding was
- used for <code class="docutils literal notranslate"><span class="pre">strict</span></code>.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeLocale">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeLocale</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeLocale" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Similar to <a class="reference internal" href="#c.PyUnicode_DecodeLocaleAndSize" title="PyUnicode_DecodeLocaleAndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeLocaleAndSize()</span></code></a>, but compute the string
- length using <code class="xref c c-func docutils literal notranslate"><span class="pre">strlen()</span></code>.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_EncodeLocale">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_EncodeLocale</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_EncodeLocale" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Encode a Unicode object to UTF-8 on Android and VxWorks, or to the current
- locale encoding on other platforms. The
- supported error handlers are <code class="docutils literal notranslate"><span class="pre">"strict"</span></code> and <code class="docutils literal notranslate"><span class="pre">"surrogateescape"</span></code>
- (<span class="target" id="index-3"></span><a class="pep reference external" href="https://peps.python.org/pep-0383/"><strong>PEP 383</strong></a>). The encoder uses the <code class="docutils literal notranslate"><span class="pre">"strict"</span></code> error handler if
- <em>errors</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>. Return a <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> object. <em>unicode</em> cannot
- contain embedded null characters.</p>
- <p>Use <a class="reference internal" href="#c.PyUnicode_EncodeFSDefault" title="PyUnicode_EncodeFSDefault"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_EncodeFSDefault()</span></code></a> to encode a string to the
- <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and error handler</span></a>.</p>
- <p>This function ignores the <a class="reference internal" href="../library/os.html#utf8-mode"><span class="std std-ref">Python UTF-8 Mode</span></a>.</p>
- <div class="admonition seealso">
- <p class="admonition-title">See also</p>
- <p>The <a class="reference internal" href="sys.html#c.Py_EncodeLocale" title="Py_EncodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_EncodeLocale()</span></code></a> function.</p>
- </div>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span>The function now also uses the current locale encoding for the
- <code class="docutils literal notranslate"><span class="pre">surrogateescape</span></code> error handler, except on Android. Previously,
- <a class="reference internal" href="sys.html#c.Py_EncodeLocale" title="Py_EncodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_EncodeLocale()</span></code></a>
- was used for the <code class="docutils literal notranslate"><span class="pre">surrogateescape</span></code> error handler, and the current locale encoding was
- used for <code class="docutils literal notranslate"><span class="pre">strict</span></code>.</p>
- </div>
- </dd></dl>
-
- </section>
- <section id="file-system-encoding">
- <h3>File System Encoding<a class="headerlink" href="#file-system-encoding" title="Link to this heading">¶</a></h3>
- <p>Functions encoding to and decoding from the <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and
- error handler</span></a> (<span class="target" id="index-4"></span><a class="pep reference external" href="https://peps.python.org/pep-0383/"><strong>PEP 383</strong></a> and <span class="target" id="index-5"></span><a class="pep reference external" href="https://peps.python.org/pep-0529/"><strong>PEP 529</strong></a>).</p>
- <p>To encode file names to <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> during argument parsing, the <code class="docutils literal notranslate"><span class="pre">"O&"</span></code>
- converter should be used, passing <a class="reference internal" href="#c.PyUnicode_FSConverter" title="PyUnicode_FSConverter"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_FSConverter()</span></code></a> as the
- conversion function:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FSConverter">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FSConverter</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span>, <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">result</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FSConverter" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>ParseTuple converter: encode <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> objects – obtained directly or
- through the <a class="reference internal" href="../library/os.html#os.PathLike" title="os.PathLike"><code class="xref py py-class docutils literal notranslate"><span class="pre">os.PathLike</span></code></a> interface – to <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> using
- <a class="reference internal" href="#c.PyUnicode_EncodeFSDefault" title="PyUnicode_EncodeFSDefault"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_EncodeFSDefault()</span></code></a>; <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> objects are output as-is.
- <em>result</em> must be a <span class="c-expr sig sig-inline c"><a class="reference internal" href="bytes.html#c.PyBytesObject" title="PyBytesObject"><span class="n">PyBytesObject</span></a><span class="p">*</span></span> which must be released when it is
- no longer used.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.1.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.6: </span>Accepts a <a class="reference internal" href="../glossary.html#term-path-like-object"><span class="xref std std-term">path-like object</span></a>.</p>
- </div>
- </dd></dl>
-
- <p>To decode file names to <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> during argument parsing, the <code class="docutils literal notranslate"><span class="pre">"O&"</span></code>
- converter should be used, passing <a class="reference internal" href="#c.PyUnicode_FSDecoder" title="PyUnicode_FSDecoder"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_FSDecoder()</span></code></a> as the
- conversion function:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FSDecoder">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FSDecoder</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">obj</span></span>, <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">result</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FSDecoder" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>ParseTuple converter: decode <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a> objects – obtained either
- directly or indirectly through the <a class="reference internal" href="../library/os.html#os.PathLike" title="os.PathLike"><code class="xref py py-class docutils literal notranslate"><span class="pre">os.PathLike</span></code></a> interface – to
- <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a> using <a class="reference internal" href="#c.PyUnicode_DecodeFSDefaultAndSize" title="PyUnicode_DecodeFSDefaultAndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeFSDefaultAndSize()</span></code></a>; <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-class docutils literal notranslate"><span class="pre">str</span></code></a>
- objects are output as-is. <em>result</em> must be a <span class="c-expr sig sig-inline c"><a class="reference internal" href="#c.PyUnicodeObject" title="PyUnicodeObject"><span class="n">PyUnicodeObject</span></a><span class="p">*</span></span> which
- must be released when it is no longer used.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.2.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.6: </span>Accepts a <a class="reference internal" href="../glossary.html#term-path-like-object"><span class="xref std std-term">path-like object</span></a>.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeFSDefaultAndSize">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeFSDefaultAndSize</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeFSDefaultAndSize" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Decode a string from the <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and error handler</span></a>.</p>
- <p>If you need to decode a string from the current locale encoding, use
- <a class="reference internal" href="#c.PyUnicode_DecodeLocaleAndSize" title="PyUnicode_DecodeLocaleAndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeLocaleAndSize()</span></code></a>.</p>
- <div class="admonition seealso">
- <p class="admonition-title">See also</p>
- <p>The <a class="reference internal" href="sys.html#c.Py_DecodeLocale" title="Py_DecodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_DecodeLocale()</span></code></a> function.</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.6: </span>The <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem error handler</span></a> is now used.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeFSDefault">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeFSDefault</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeFSDefault" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Decode a null-terminated string from the <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and
- error handler</span></a>.</p>
- <p>If the string length is known, use
- <a class="reference internal" href="#c.PyUnicode_DecodeFSDefaultAndSize" title="PyUnicode_DecodeFSDefaultAndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeFSDefaultAndSize()</span></code></a>.</p>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.6: </span>The <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem error handler</span></a> is now used.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_EncodeFSDefault">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_EncodeFSDefault</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_EncodeFSDefault" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object to the <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and error
- handler</span></a>, and return <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>. Note that the resulting <a class="reference internal" href="../library/stdtypes.html#bytes" title="bytes"><code class="xref py py-class docutils literal notranslate"><span class="pre">bytes</span></code></a>
- object can contain null bytes.</p>
- <p>If you need to encode a string to the current locale encoding, use
- <a class="reference internal" href="#c.PyUnicode_EncodeLocale" title="PyUnicode_EncodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_EncodeLocale()</span></code></a>.</p>
- <div class="admonition seealso">
- <p class="admonition-title">See also</p>
- <p>The <a class="reference internal" href="sys.html#c.Py_EncodeLocale" title="Py_EncodeLocale"><code class="xref c c-func docutils literal notranslate"><span class="pre">Py_EncodeLocale()</span></code></a> function.</p>
- </div>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.2.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.6: </span>The <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem error handler</span></a> is now used.</p>
- </div>
- </dd></dl>
-
- </section>
- <section id="wchar-t-support">
- <h3>wchar_t Support<a class="headerlink" href="#wchar-t-support" title="Link to this heading">¶</a></h3>
- <p>Functions for converting to and from <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> strings, on platforms which support them:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FromWideChar">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FromWideChar</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="n"><span class="pre">wchar_t</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">wstr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FromWideChar" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object from the <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> buffer <em>wstr</em> of the given <em>size</em>.
- Passing <code class="docutils literal notranslate"><span class="pre">-1</span></code> as the <em>size</em> indicates that the function must itself compute the length,
- using <code class="xref c c-func docutils literal notranslate"><span class="pre">wcslen()</span></code>.
- Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> on failure.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsWideChar">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsWideChar</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="n"><span class="pre">wchar_t</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">wstr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsWideChar" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Copy the Unicode object contents into the <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> buffer <em>wstr</em>. At most
- <em>size</em> <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> characters are copied (excluding a possible trailing
- null termination character). Return the number of <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> characters
- copied or <code class="docutils literal notranslate"><span class="pre">-1</span></code> in case of an error.</p>
- <p>When <em>wstr</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, instead return the <em>size</em> that would be required
- to store all of <em>unicode</em> including a terminating null.</p>
- <p>Note that the resulting <span class="c-expr sig sig-inline c"><span class="n">wchar_t</span><span class="p">*</span></span>
- string may or may not be null-terminated. It is the responsibility of the caller
- to make sure that the <span class="c-expr sig sig-inline c"><span class="n">wchar_t</span><span class="p">*</span></span> string is null-terminated in case this is
- required by the application. Also, note that the <span class="c-expr sig sig-inline c"><span class="n">wchar_t</span><span class="p">*</span></span> string
- might contain null characters, which would cause the string to be truncated
- when used with most C functions.</p>
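- <p>The two-call pattern described above (query the required size, then copy) might look like the following minimal sketch. The helper name <code class="docutils literal notranslate"><span class="pre">copy_to_wide</span></code> is hypothetical and not part of the C API; it terminates the buffer itself instead of relying on the function to do so.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
- #include &lt;wchar.h&gt;
-
- /* Hypothetical helper (not part of the C API): copy `unicode` into a newly
-    allocated, explicitly null-terminated wchar_t buffer. On success, store
-    the length (excluding the terminator) in *length. On failure, return
-    NULL with an exception set. Free the result with PyMem_Free(). */
- static wchar_t *
- copy_to_wide(PyObject *unicode, Py_ssize_t *length)
- {
-     /* With wstr == NULL, the result counts the terminating null. */
-     Py_ssize_t needed = PyUnicode_AsWideChar(unicode, NULL, 0);
-     if (needed &lt; 0) {
-         return NULL;
-     }
-     wchar_t *buf = PyMem_New(wchar_t, needed);
-     if (buf == NULL) {
-         PyErr_NoMemory();
-         return NULL;
-     }
-     /* Ask for the characters only, leaving one slot for the terminator. */
-     Py_ssize_t copied = PyUnicode_AsWideChar(unicode, buf, needed - 1);
-     if (copied &lt; 0) {
-         PyMem_Free(buf);
-         return NULL;
-     }
-     buf[copied] = L'\0';
-     *length = copied;
-     return buf;
- }
- </pre></div>
- </div>
- <p>Because the buffer may contain embedded null characters, callers of such a helper should rely on the stored length rather than on <code class="docutils literal notranslate"><span class="pre">wcslen()</span></code>.</p>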
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsWideCharString">
- <span class="n"><span class="pre">wchar_t</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsWideCharString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsWideCharString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Convert the Unicode object to a wide character string. The output string
- always ends with a null character. If <em>size</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, write the number
- of wide characters (excluding the trailing null termination character) into
- <em>*size</em>. Note that the resulting <code class="xref c c-type docutils literal notranslate"><span class="pre">wchar_t</span></code> string might contain
- null characters, which would cause the string to be truncated when used with
- most C functions. If <em>size</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code> and the <span class="c-expr sig sig-inline c"><span class="n">wchar_t</span><span class="p">*</span></span> string
- contains null characters, a <a class="reference internal" href="../library/exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> is raised.</p>
- <p>Returns a buffer allocated by <a class="reference internal" href="memory.html#c.PyMem_New" title="PyMem_New"><code class="xref c c-macro docutils literal notranslate"><span class="pre">PyMem_New</span></code></a> (use
- <a class="reference internal" href="memory.html#c.PyMem_Free" title="PyMem_Free"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyMem_Free()</span></code></a> to free it) on success. On error, returns <code class="docutils literal notranslate"><span class="pre">NULL</span></code>
- and <em>*size</em> is undefined. Raises a <a class="reference internal" href="../library/exceptions.html#MemoryError" title="MemoryError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">MemoryError</span></code></a> if memory allocation
- fails.</p>
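- <p>As an illustration, the sketch below converts a string to a wide character buffer and back again with <code class="docutils literal notranslate"><span class="pre">PyUnicode_FromWideChar()</span></code>. The helper name <code class="docutils literal notranslate"><span class="pre">wide_round_trip</span></code> is hypothetical. It passes a non-<code class="docutils literal notranslate"><span class="pre">NULL</span></code> <em>size</em> so that embedded null characters are accepted and preserved.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
-
- /* Hypothetical helper (not part of the C API): return 1 if `unicode`
-    survives a round trip through a wchar_t buffer, 0 if it does not,
-    and -1 with an exception set on error. */
- static int
- wide_round_trip(PyObject *unicode)
- {
-     Py_ssize_t len;
-     wchar_t *wide = PyUnicode_AsWideCharString(unicode, &amp;len);
-     if (wide == NULL) {
-         return -1;
-     }
-     /* Pass the explicit length: wcslen() would stop at an embedded null. */
-     PyObject *back = PyUnicode_FromWideChar(wide, len);
-     PyMem_Free(wide);  /* The buffer is owned by the caller. */
-     if (back == NULL) {
-         return -1;
-     }
-     int cmp = PyUnicode_Compare(unicode, back);
-     Py_DECREF(back);
-     if (cmp == -1 &amp;&amp; PyErr_Occurred()) {
-         return -1;
-     }
-     return cmp == 0;
- }
- </pre></div>
- </div>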
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.2.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span>Raises a <a class="reference internal" href="../library/exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> if <em>size</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code> and the <span class="c-expr sig sig-inline c"><span class="n">wchar_t</span><span class="p">*</span></span>
- string contains null characters.</p>
- </div>
- </dd></dl>
-
- </section>
- </section>
- <section id="built-in-codecs">
- <span id="builtincodecs"></span><h2>Built-in Codecs<a class="headerlink" href="#built-in-codecs" title="Link to this heading">¶</a></h2>
- <p>Python provides a set of built-in codecs which are written in C for speed. All of
- these codecs are directly usable via the following functions.</p>
- <p>Many of the following APIs take two arguments, <em>encoding</em> and <em>errors</em>, which
- have the same semantics as the parameters of the same name of the built-in <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-func docutils literal notranslate"><span class="pre">str()</span></code></a> string object
- constructor.</p>
- <p>Setting <em>encoding</em> to <code class="docutils literal notranslate"><span class="pre">NULL</span></code> selects the default encoding,
- which is UTF-8. File system calls should use
- <a class="reference internal" href="#c.PyUnicode_FSConverter" title="PyUnicode_FSConverter"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_FSConverter()</span></code></a> for encoding file names. This uses the
- <a class="reference internal" href="../glossary.html#term-filesystem-encoding-and-error-handler"><span class="xref std std-term">filesystem encoding and error handler</span></a> internally.</p>
- <p>Error handling is set by <em>errors</em>, which may also be <code class="docutils literal notranslate"><span class="pre">NULL</span></code> to select
- the default handling defined for the codec. The default error handling for all
- built-in codecs is “strict” (<a class="reference internal" href="../library/exceptions.html#ValueError" title="ValueError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">ValueError</span></code></a> is raised).</p>
- <p>The codecs all use a similar interface. For simplicity, only deviations from the
- following generic APIs are documented.</p>
- <section id="generic-codecs">
- <h3>Generic Codecs<a class="headerlink" href="#generic-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the generic codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Decode">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Decode</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">encoding</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Decode" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the encoded string <em>str</em>.
- <em>encoding</em> and <em>errors</em> have the same meaning as the parameters of the same name
- in the <a class="reference internal" href="../library/stdtypes.html#str" title="str"><code class="xref py py-func docutils literal notranslate"><span class="pre">str()</span></code></a> built-in function. The codec to be used is looked up
- using the Python codec registry. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by
- the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsEncodedString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsEncodedString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">encoding</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsEncodedString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object and return the result as a Python bytes object.
- <em>encoding</em> and <em>errors</em> have the same meaning as the parameters of the same
- name in the Unicode <a class="reference internal" href="../library/stdtypes.html#str.encode" title="str.encode"><code class="xref py py-meth docutils literal notranslate"><span class="pre">encode()</span></code></a> method. The codec to be used is looked up
- using the Python codec registry. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by
- the codec.</p>
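- <p>The two generic functions can be combined to transcode data. The following sketch is one possible use, not a recommended recipe; the helper name <code class="docutils literal notranslate"><span class="pre">latin1_to_utf8</span></code> is hypothetical. It relies on <code class="docutils literal notranslate"><span class="pre">NULL</span></code> selecting UTF-8 as the encoding and “strict” as the error handler.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
-
- /* Hypothetical helper (not part of the C API): decode `size` bytes of
-    Latin-1 data and return them re-encoded as a UTF-8 bytes object.
-    Returns a new reference, or NULL with an exception set. */
- static PyObject *
- latin1_to_utf8(const char *data, Py_ssize_t size)
- {
-     /* errors == NULL selects the codec default, "strict". */
-     PyObject *text = PyUnicode_Decode(data, size, "latin-1", NULL);
-     if (text == NULL) {
-         return NULL;
-     }
-     /* encoding == NULL selects the default encoding, UTF-8. */
-     PyObject *encoded = PyUnicode_AsEncodedString(text, NULL, NULL);
-     Py_DECREF(text);
-     return encoded;
- }
- </pre></div>
- </div>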
- </dd></dl>
-
- </section>
- <section id="utf-8-codecs">
- <h3>UTF-8 Codecs<a class="headerlink" href="#utf-8-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the UTF-8 codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF8">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF8</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF8" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the UTF-8 encoded string
- <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF8Stateful">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF8Stateful</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">consumed</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF8Stateful" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>If <em>consumed</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, behave like <a class="reference internal" href="#c.PyUnicode_DecodeUTF8" title="PyUnicode_DecodeUTF8"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF8()</span></code></a>. If
- <em>consumed</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, trailing incomplete UTF-8 byte sequences will not be
- treated as an error. Those bytes will not be decoded and the number of bytes
- that have been decoded will be stored in <em>consumed</em>.</p>
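- <p>This behaviour is useful when input arrives in chunks that may end in the middle of a multi-byte sequence. A minimal sketch follows; the helper name <code class="docutils literal notranslate"><span class="pre">decode_utf8_chunk</span></code> and its <code class="docutils literal notranslate"><span class="pre">pending</span></code> parameter are hypothetical.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
-
- /* Hypothetical helper (not part of the C API): decode the complete UTF-8
-    sequences in `chunk` and store the number of trailing bytes that were
-    left undecoded in *pending. Returns a new reference, or NULL with an
-    exception set. */
- static PyObject *
- decode_utf8_chunk(const char *chunk, Py_ssize_t size, Py_ssize_t *pending)
- {
-     Py_ssize_t consumed = 0;
-     PyObject *text = PyUnicode_DecodeUTF8Stateful(chunk, size, NULL,
-                                                   &amp;consumed);
-     if (text == NULL) {
-         return NULL;
-     }
-     /* Bytes after `consumed` belong to an incomplete sequence. */
-     *pending = size - consumed;
-     return text;
- }
- </pre></div>
- </div>
- <p>A caller would typically prepend the pending bytes to the next chunk before decoding it.</p>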
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUTF8String">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUTF8String</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUTF8String" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using UTF-8 and return the result as a Python bytes
- object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was
- raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUTF8AndSize">
- <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUTF8AndSize</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">size</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUTF8AndSize" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.10.</em><p>Return a pointer to the UTF-8 encoding of the Unicode object, and
- store the size of the encoded representation (in bytes) in <em>size</em>. The
- <em>size</em> argument can be <code class="docutils literal notranslate"><span class="pre">NULL</span></code>; in this case no size will be stored. The
- returned buffer always has an extra null byte appended (not included in
- <em>size</em>), regardless of whether there are any other null code points.</p>
- <p>In the case of an error, <code class="docutils literal notranslate"><span class="pre">NULL</span></code> is returned with an exception set and no
- <em>size</em> is stored.</p>
- <p>This caches the UTF-8 representation of the string in the Unicode object, and
- subsequent calls will return a pointer to the same buffer. The caller is not
- responsible for deallocating the buffer. The buffer is deallocated and
- pointers to it become invalid when the Unicode object is garbage collected.</p>
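- <p>The following sketch shows why the stored <em>size</em> matters when the string may contain null code points. The helper name <code class="docutils literal notranslate"><span class="pre">utf8_has_embedded_null</span></code> is hypothetical.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
- #include &lt;string.h&gt;
-
- /* Hypothetical helper (not part of the C API): return 1 if the UTF-8 form
-    of `unicode` contains an embedded null byte, 0 if it does not, and -1
-    with an exception set on error. */
- static int
- utf8_has_embedded_null(PyObject *unicode)
- {
-     Py_ssize_t size;
-     /* The pointer refers to a cache owned by `unicode`: do not free it and
-        do not use it after the last reference to `unicode` is released. */
-     const char *utf8 = PyUnicode_AsUTF8AndSize(unicode, &amp;size);
-     if (utf8 == NULL) {
-         return -1;
-     }
-     /* strlen() stops at the first null byte, so a result that differs
-        from `size` reveals an embedded null. */
-     return strlen(utf8) != (size_t)size;
- }
- </pre></div>
- </div>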
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span>The return type is now <code class="docutils literal notranslate"><span class="pre">const</span> <span class="pre">char</span> <span class="pre">*</span></code> instead of <code class="docutils literal notranslate"><span class="pre">char</span> <span class="pre">*</span></code>.</p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.10: </span>This function is a part of the <a class="reference internal" href="stable.html#limited-c-api"><span class="std std-ref">limited API</span></a>.</p>
- </div>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUTF8">
- <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUTF8</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUTF8" title="Link to this definition">¶</a><br /></dt>
- <dd><p>As <a class="reference internal" href="#c.PyUnicode_AsUTF8AndSize" title="PyUnicode_AsUTF8AndSize"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_AsUTF8AndSize()</span></code></a>, but does not store the size.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span>The return type is now <code class="docutils literal notranslate"><span class="pre">const</span> <span class="pre">char</span> <span class="pre">*</span></code> instead of <code class="docutils literal notranslate"><span class="pre">char</span> <span class="pre">*</span></code>.</p>
- </div>
- </dd></dl>
-
- </section>
- <section id="utf-32-codecs">
- <h3>UTF-32 Codecs<a class="headerlink" href="#utf-32-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the UTF-32 codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF32">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF32</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">byteorder</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF32" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Decode <em>size</em> bytes from a UTF-32 encoded buffer string and return the
- corresponding Unicode object. <em>errors</em> (if non-<code class="docutils literal notranslate"><span class="pre">NULL</span></code>) defines the error
- handling. It defaults to “strict”.</p>
- <p>If <em>byteorder</em> is non-<code class="docutils literal notranslate"><span class="pre">NULL</span></code>, the decoder starts decoding using the given byte
- order:</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span><span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">-1</span><span class="o">:</span><span class="w"> </span><span class="n">little</span><span class="w"> </span><span class="n">endian</span>
- <span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="o">:</span><span class="w"> </span><span class="n">native</span><span class="w"> </span><span class="n">order</span>
- <span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="o">:</span><span class="w"> </span><span class="n">big</span><span class="w"> </span><span class="n">endian</span>
- </pre></div>
- </div>
- <p>If <code class="docutils literal notranslate"><span class="pre">*byteorder</span></code> is zero, and the first four bytes of the input data are a
- byte order mark (BOM), the decoder switches to this byte order and the BOM is
- not copied into the resulting Unicode string. If <code class="docutils literal notranslate"><span class="pre">*byteorder</span></code> is <code class="docutils literal notranslate"><span class="pre">-1</span></code> or
- <code class="docutils literal notranslate"><span class="pre">1</span></code>, any byte order mark is copied to the output.</p>
- <p>After completion, <em>*byteorder</em> is set to the current byte order at the end
- of input data.</p>
- <p>If <em>byteorder</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, the codec starts in native order mode.</p>
- <p>Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
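- <p>As a sketch of the BOM handling described above, the hypothetical helper <code class="docutils literal notranslate"><span class="pre">decode_utf32_detect</span></code> below starts in native order and reports the byte order that the decoder ended up using.</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span>#include &lt;Python.h&gt;
-
- /* Hypothetical helper (not part of the C API): decode UTF-32 data, letting
-    a leading BOM select the byte order. On success, store the byte order
-    reported by the decoder (-1, 0 or 1) in *detected. Returns a new
-    reference, or NULL with an exception set. */
- static PyObject *
- decode_utf32_detect(const char *data, Py_ssize_t size, int *detected)
- {
-     int byteorder = 0;  /* Native order; a BOM may switch it. */
-     PyObject *text = PyUnicode_DecodeUTF32(data, size, NULL, &amp;byteorder);
-     if (text == NULL) {
-         return NULL;
-     }
-     *detected = byteorder;
-     return text;
- }
- </pre></div>
- </div>
- <p>For input that begins with a big-endian BOM, such a helper would be expected to store <code class="docutils literal notranslate"><span class="pre">1</span></code> in <code class="docutils literal notranslate"><span class="pre">*detected</span></code> and to return a string without the BOM.</p>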
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF32Stateful">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF32Stateful</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">byteorder</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">consumed</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF32Stateful" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>If <em>consumed</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, behave like <a class="reference internal" href="#c.PyUnicode_DecodeUTF32" title="PyUnicode_DecodeUTF32"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF32()</span></code></a>. If
- <em>consumed</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, <a class="reference internal" href="#c.PyUnicode_DecodeUTF32Stateful" title="PyUnicode_DecodeUTF32Stateful"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF32Stateful()</span></code></a> will not treat
- trailing incomplete UTF-32 byte sequences (such as a number of bytes not divisible
- by four) as an error. Those bytes will not be decoded and the number of bytes
- that have been decoded will be stored in <em>consumed</em>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUTF32String">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUTF32String</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUTF32String" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return a Python bytes object using the UTF-32 encoding in native byte
- order. The result always starts with a BOM. Error handling is “strict”.
- Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="utf-16-codecs">
- <h3>UTF-16 Codecs<a class="headerlink" href="#utf-16-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the UTF-16 codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF16">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF16</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">byteorder</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF16" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Decode <em>size</em> bytes from a UTF-16 encoded buffer string and return the
- corresponding Unicode object. <em>errors</em> (if non-<code class="docutils literal notranslate"><span class="pre">NULL</span></code>) defines the error
- handling. It defaults to “strict”.</p>
- <p>If <em>byteorder</em> is non-<code class="docutils literal notranslate"><span class="pre">NULL</span></code>, the decoder starts decoding using the given byte
- order:</p>
- <div class="highlight-c notranslate"><div class="highlight"><pre><span></span><span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">-1</span><span class="o">:</span><span class="w"> </span><span class="n">little</span><span class="w"> </span><span class="n">endian</span>
- <span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="o">:</span><span class="w"> </span><span class="n">native</span><span class="w"> </span><span class="n">order</span>
- <span class="o">*</span><span class="n">byteorder</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">1</span><span class="o">:</span><span class="w"> </span><span class="n">big</span><span class="w"> </span><span class="n">endian</span>
- </pre></div>
- </div>
- <p>If <code class="docutils literal notranslate"><span class="pre">*byteorder</span></code> is zero, and the first two bytes of the input data are a
- byte order mark (BOM), the decoder switches to this byte order and the BOM is
- not copied into the resulting Unicode string. If <code class="docutils literal notranslate"><span class="pre">*byteorder</span></code> is <code class="docutils literal notranslate"><span class="pre">-1</span></code> or
- <code class="docutils literal notranslate"><span class="pre">1</span></code>, any byte order mark is copied to the output (where it will result in
- either a <code class="docutils literal notranslate"><span class="pre">\ufeff</span></code> or a <code class="docutils literal notranslate"><span class="pre">\ufffe</span></code> character).</p>
- <p>After completion, <code class="docutils literal notranslate"><span class="pre">*byteorder</span></code> is set to the current byte order at the end
- of input data.</p>
- <p>If <em>byteorder</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, the codec starts in native order mode.</p>
- <p>Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
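- <p>The start-of-input rule described above can be sketched in plain C. This is an illustration only, not CPython's implementation: <code class="docutils literal notranslate"><span class="pre">utf16_initial_byteorder</span></code> and its <code class="docutils literal notranslate"><span class="pre">skip</span></code> parameter are hypothetical names that are not part of the C API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the C API. Applies the documented
   start-of-input rule and returns the byte order a decoder would use:
   -1 little endian, 1 big endian, 0 native. *skip receives the number
   of leading BOM bytes that are dropped from the output. */
static int
utf16_initial_byteorder(const unsigned char *buf, size_t size,
                        int byteorder, size_t *skip)
{
    assert(skip != NULL);
    *skip = 0;
    /* An explicit order (-1 or 1) is kept; a BOM is then ordinary data. */
    if (byteorder != 0 || size < 2) {
        return byteorder;
    }
    if (buf[0] == 0xFF && buf[1] == 0xFE) {  /* U+FEFF stored little endian */
        *skip = 2;
        return -1;
    }
    if (buf[0] == 0xFE && buf[1] == 0xFF) {  /* U+FEFF stored big endian */
        *skip = 2;
        return 1;
    }
    return 0;  /* no BOM: stay in native order */
}
```

- <p>For example, calling the helper with the bytes <code class="docutils literal notranslate"><span class="pre">FF</span> <span class="pre">FE</span> <span class="pre">61</span> <span class="pre">00</span></code> and a byte order of <code class="docutils literal notranslate"><span class="pre">0</span></code> yields <code class="docutils literal notranslate"><span class="pre">-1</span></code> with two bytes skipped, while the same bytes with a byte order of <code class="docutils literal notranslate"><span class="pre">1</span></code> yield <code class="docutils literal notranslate"><span class="pre">1</span></code> with nothing skipped.</p>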
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF16Stateful">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF16Stateful</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">byteorder</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">consumed</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF16Stateful" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>If <em>consumed</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, behave like <a class="reference internal" href="#c.PyUnicode_DecodeUTF16" title="PyUnicode_DecodeUTF16"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF16()</span></code></a>. If
- <em>consumed</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, <a class="reference internal" href="#c.PyUnicode_DecodeUTF16Stateful" title="PyUnicode_DecodeUTF16Stateful"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF16Stateful()</span></code></a> will not treat
- trailing incomplete UTF-16 byte sequences (such as an odd number of bytes or a
- split surrogate pair) as an error. Those bytes will not be decoded and the
- number of bytes that have been decoded will be stored in <em>consumed</em>.</p>
- </dd></dl>
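- <p>As a rough sketch of what “trailing incomplete” means for UTF-16, the plain C function below computes how many leading bytes of a chunk form complete sequences. It assumes the input is otherwise well formed and is not CPython's implementation; <code class="docutils literal notranslate"><span class="pre">utf16_complete_prefix</span></code> is a hypothetical name that is not part of the C API.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the C API. Returns the number of
   leading bytes of buf that form complete UTF-16 sequences, assuming
   the input is otherwise well formed. little_endian selects how each
   2-byte code unit is read. */
static size_t
utf16_complete_prefix(const unsigned char *buf, size_t size, int little_endian)
{
    size_t usable = size - (size % 2);  /* drop a dangling odd byte */

    assert(buf != NULL || size == 0);
    if (usable >= 2) {
        const unsigned char *p = buf + usable - 2;
        unsigned int unit = little_endian
            ? ((unsigned int)p[0] | ((unsigned int)p[1] << 8))
            : (((unsigned int)p[0] << 8) | (unsigned int)p[1]);
        /* A high surrogate (U+D800..U+DBFF) still needs the low
           surrogate that follows it, so leave it for the next chunk. */
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            usable -= 2;
        }
    }
    return usable;
}
```

- <p>A caller that feeds data in chunks would typically keep the bytes past this point and prepend them to the next chunk, which is the role <em>consumed</em> plays for the real decoder.</p>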
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUTF16String">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUTF16String</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUTF16String" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return a Python bytes object using the UTF-16 encoding in native byte
- order. The result always starts with a BOM. Error handling is “strict”.
- Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="utf-7-codecs">
- <h3>UTF-7 Codecs<a class="headerlink" href="#utf-7-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the UTF-7 codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF7">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF7</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF7" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the UTF-7 encoded string
- <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUTF7Stateful">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUTF7Stateful</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">consumed</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUTF7Stateful" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>If <em>consumed</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, behave like <a class="reference internal" href="#c.PyUnicode_DecodeUTF7" title="PyUnicode_DecodeUTF7"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeUTF7()</span></code></a>. If
- <em>consumed</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, trailing incomplete UTF-7 base-64 sections will not
- be treated as an error. Those bytes will not be decoded and the number of
- bytes that have been decoded will be stored in <em>consumed</em>.</p>
- </dd></dl>
-
- </section>
- <section id="unicode-escape-codecs">
- <h3>Unicode-Escape Codecs<a class="headerlink" href="#unicode-escape-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the “Unicode Escape” codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeUnicodeEscape">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeUnicodeEscape</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeUnicodeEscape" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the Unicode-Escape encoded
- string <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsUnicodeEscapeString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsUnicodeEscapeString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsUnicodeEscapeString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using Unicode-Escape and return the result as a
- bytes object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was
- raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="raw-unicode-escape-codecs">
- <h3>Raw-Unicode-Escape Codecs<a class="headerlink" href="#raw-unicode-escape-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the “Raw Unicode Escape” codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeRawUnicodeEscape">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeRawUnicodeEscape</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeRawUnicodeEscape" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the Raw-Unicode-Escape
- encoded string <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsRawUnicodeEscapeString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsRawUnicodeEscapeString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsRawUnicodeEscapeString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using Raw-Unicode-Escape and return the result as
- a bytes object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception
- was raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="latin-1-codecs">
- <h3>Latin-1 Codecs<a class="headerlink" href="#latin-1-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the Latin-1 codec APIs. Latin-1 corresponds to the first 256 Unicode
- ordinals, and the codecs accept only these during encoding.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeLatin1">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeLatin1</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeLatin1" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the Latin-1 encoded string
- <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsLatin1String">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsLatin1String</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsLatin1String" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using Latin-1 and return the result as a Python bytes
- object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was
- raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="ascii-codecs">
- <h3>ASCII Codecs<a class="headerlink" href="#ascii-codecs" title="Link to this heading">¶</a></h3>
- <p>These are the ASCII codec APIs. Only 7-bit ASCII data is accepted. All other
- codes generate errors.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeASCII">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeASCII</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeASCII" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the ASCII encoded string
- <em>str</em>. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsASCIIString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsASCIIString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsASCIIString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using ASCII and return the result as a Python bytes
- object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was
- raised by the codec.</p>
- </dd></dl>
-
- </section>
- <section id="character-map-codecs">
- <h3>Character Map Codecs<a class="headerlink" href="#character-map-codecs" title="Link to this heading">¶</a></h3>
- <p>This codec is special in that it can be used to implement many different codecs
- (and this is in fact what was done to obtain most of the standard codecs
- included in the <code class="xref py py-mod docutils literal notranslate"><span class="pre">encodings</span></code> package). The codec uses mappings to encode and
- decode characters. The mapping objects provided must support the
- <a class="reference internal" href="../reference/datamodel.html#object.__getitem__" title="object.__getitem__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__getitem__()</span></code></a> mapping interface; dictionaries and sequences work well.</p>
- <p>These are the mapping codec APIs:</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeCharmap">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeCharmap</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">length</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">mapping</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeCharmap" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Create a Unicode object by decoding <em>length</em> bytes of the encoded string <em>str</em>
- using the given <em>mapping</em> object. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised
- by the codec.</p>
- <p>If <em>mapping</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, Latin-1 decoding will be applied. Otherwise,
- <em>mapping</em> must map byte ordinals (integers in the range from 0 to 255)
- to Unicode strings, integers (which are then interpreted as Unicode
- ordinals) or <code class="docutils literal notranslate"><span class="pre">None</span></code>. Unmapped data bytes (ones which cause a
- <a class="reference internal" href="../library/exceptions.html#LookupError" title="LookupError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">LookupError</span></code></a>, as well as ones which get mapped to <code class="docutils literal notranslate"><span class="pre">None</span></code>,
- <code class="docutils literal notranslate"><span class="pre">0xFFFE</span></code> or <code class="docutils literal notranslate"><span class="pre">'\ufffe'</span></code>) are treated as undefined mappings and cause
- an error.</p>
- </dd></dl>
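- <p>The lookup rule can be illustrated with a simplified plain C sketch. It assumes the mapping is a fixed 256-entry table of code points and models only the <code class="docutils literal notranslate"><span class="pre">0xFFFE</span></code> marker for undefined entries; the real function accepts any Python mapping object. <code class="docutils literal notranslate"><span class="pre">charmap_decode</span></code> and <code class="docutils literal notranslate"><span class="pre">UNDEFINED_MAPPING</span></code> are hypothetical names that are not part of the C API.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker for "no mapping", mirroring the documented 0xFFFE convention. */
#define UNDEFINED_MAPPING 0xFFFEu

/* Hypothetical helper, not part of the C API. Decodes length bytes of
   src through a 256-entry table of code points into out, which must
   have room for length entries. Returns the number of code points
   written, or -1 at the first undefined mapping (comparable to the
   "strict" error handler). */
static long
charmap_decode(const unsigned char *src, size_t length,
               const uint32_t table[256], uint32_t *out)
{
    assert(table != NULL && out != NULL);
    for (size_t i = 0; i < length; i++) {
        uint32_t cp = table[src[i]];  /* look the byte up in the table */
        if (cp == UNDEFINED_MAPPING) {
            return -1;                /* undefined mapping: report an error */
        }
        out[i] = cp;
    }
    return (long)length;
}
```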
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsCharmapString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsCharmapString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">mapping</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsCharmapString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Encode a Unicode object using the given <em>mapping</em> object and return the
- result as a bytes object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an
- exception was raised by the codec.</p>
- <p>The <em>mapping</em> object must map Unicode ordinal integers to bytes objects,
- integers in the range from 0 to 255 or <code class="docutils literal notranslate"><span class="pre">None</span></code>. Unmapped character
- ordinals (ones which cause a <a class="reference internal" href="../library/exceptions.html#LookupError" title="LookupError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">LookupError</span></code></a>), as well as ones mapped to
- <code class="docutils literal notranslate"><span class="pre">None</span></code> are treated as “undefined mapping” and cause an error.</p>
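- <p>As an illustration of the encoding rules, here is a small sketch, assuming CPython, that calls <code class="docutils literal notranslate"><span class="pre">PyUnicode_AsCharmapString()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>. The table and helper names are made up for the example.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_as_charmap = ctypes.pythonapi.PyUnicode_AsCharmapString
_as_charmap.argtypes = [ctypes.py_object, ctypes.py_object]
_as_charmap.restype = ctypes.py_object

# Hypothetical table: an integer in 0..255, a bytes object, and None.
TABLE = {ord("a"): 0x01, ord("b"): b"\x02\x03", ord("c"): None}


def is_undefined(text):
    """Return True if encoding fails because a character has no mapping."""
    try:
        _as_charmap(text, TABLE)
    except UnicodeEncodeError:
        return True
    return False


encoded = _as_charmap("ab", TABLE)   # one byte for "a", two bytes for "b"
none_fails = is_undefined("c")       # mapped to None
missing_fails = is_undefined("z")    # key absent, lookup raises KeyError
```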
- </dd></dl>
-
- <p>The following codec API is special in that it maps Unicode to Unicode.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Translate">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Translate</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">table</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Translate" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Translate a string by applying a character mapping table to it and return the
- resulting Unicode object. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the
- codec.</p>
- <p>The mapping table must map Unicode ordinal integers to Unicode ordinal integers
- or <code class="docutils literal notranslate"><span class="pre">None</span></code> (causing deletion of the character).</p>
- <p>Mapping tables need only provide the <a class="reference internal" href="../reference/datamodel.html#object.__getitem__" title="object.__getitem__"><code class="xref py py-meth docutils literal notranslate"><span class="pre">__getitem__()</span></code></a> interface; dictionaries
- and sequences work well. Unmapped character ordinals (ones which cause a
- <a class="reference internal" href="../library/exceptions.html#LookupError" title="LookupError"><code class="xref py py-exc docutils literal notranslate"><span class="pre">LookupError</span></code></a>) are left untouched and are copied as-is.</p>
- <p><em>errors</em> has the usual meaning for codecs. It may be <code class="docutils literal notranslate"><span class="pre">NULL</span></code> which indicates to
- use the default error handling.</p>
- </dd></dl>
-
- </section>
- <section id="mbcs-codecs-for-windows">
- <h3>MBCS codecs for Windows<a class="headerlink" href="#mbcs-codecs-for-windows" title="Link to this heading">¶</a></h3>
- <p>These are the MBCS codec APIs. They are currently only available on Windows and
- use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
- DBCS) is a class of encodings, not just one. The target encoding is defined by
- the user settings on the machine running the codec.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeMBCS">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeMBCS</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeMBCS" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> on Windows since version 3.7.</em><p>Create a Unicode object by decoding <em>size</em> bytes of the MBCS encoded string <em>str</em>.
- Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_DecodeMBCSStateful">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_DecodeMBCSStateful</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">size</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">consumed</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_DecodeMBCSStateful" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> on Windows since version 3.7.</em><p>If <em>consumed</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, behave like <a class="reference internal" href="#c.PyUnicode_DecodeMBCS" title="PyUnicode_DecodeMBCS"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeMBCS()</span></code></a>. If
- <em>consumed</em> is not <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, <a class="reference internal" href="#c.PyUnicode_DecodeMBCSStateful" title="PyUnicode_DecodeMBCSStateful"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_DecodeMBCSStateful()</span></code></a> will not decode
- a trailing lead byte, and the number of bytes that have been decoded will be stored
- in <em>consumed</em>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_AsMBCSString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_AsMBCSString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_AsMBCSString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> on Windows since version 3.7.</em><p>Encode a Unicode object using MBCS and return the result as a Python bytes
- object. Error handling is “strict”. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was
- raised by the codec.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_EncodeCodePage">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_EncodeCodePage</span></span></span><span class="sig-paren">(</span><span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">code_page</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">errors</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_EncodeCodePage" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> on Windows since version 3.7.</em><p>Encode the Unicode object using the specified code page and return a Python
- bytes object. Return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> if an exception was raised by the codec. Use the
- <code class="xref c c-macro docutils literal notranslate"><span class="pre">CP_ACP</span></code> code page to get the MBCS encoder.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- </dd></dl>
-
- </section>
- <section id="methods-slots">
- <h3>Methods & Slots<a class="headerlink" href="#methods-slots" title="Link to this heading">¶</a></h3>
- </section>
- </section>
- <section id="methods-and-slot-functions">
- <span id="unicodemethodsandslots"></span><h2>Methods and Slot Functions<a class="headerlink" href="#methods-and-slot-functions" title="Link to this heading">¶</a></h2>
- <p>The following APIs are capable of handling Unicode objects and strings on input
- (we refer to them as strings in the descriptions) and return Unicode objects or
- integers as appropriate.</p>
- <p>They all return <code class="docutils literal notranslate"><span class="pre">NULL</span></code> or <code class="docutils literal notranslate"><span class="pre">-1</span></code> if an exception occurs.</p>
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Concat">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Concat</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">left</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">right</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Concat" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Concatenate two strings, giving a new Unicode string.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Split">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Split</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">sep</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">maxsplit</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Split" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Split a string giving a list of Unicode strings. If <em>sep</em> is <code class="docutils literal notranslate"><span class="pre">NULL</span></code>, splitting
- will be done at all whitespace substrings. Otherwise, splits occur at the given
- separator. At most <em>maxsplit</em> splits will be done. If negative, no limit is
- set. Separators are not included in the resulting list.</p>
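- <p>A brief sketch of how <em>maxsplit</em> limits the result, assuming CPython and calling <code class="docutils literal notranslate"><span class="pre">PyUnicode_Split()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>. Only the explicit-separator case is shown.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_split = ctypes.pythonapi.PyUnicode_Split
_split.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.c_ssize_t]
_split.restype = ctypes.py_object

# A negative maxsplit places no limit on the number of splits.
all_parts = _split("a,b,c", ",", -1)

# maxsplit=1 performs at most one split, leaving the rest intact.
one_split = _split("a,b,c", ",", 1)
```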
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Splitlines">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Splitlines</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">keepends</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Splitlines" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Split a Unicode string at line breaks, returning a list of Unicode strings.
- CRLF is considered to be one line break. If <em>keepends</em> is <code class="docutils literal notranslate"><span class="pre">0</span></code>, the line break
- characters are not included in the resulting strings.</p>
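- <p>The effect of <em>keepends</em>, and the treatment of CRLF as a single break, can be checked with this sketch. It assumes CPython and calls <code class="docutils literal notranslate"><span class="pre">PyUnicode_Splitlines()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_splitlines = ctypes.pythonapi.PyUnicode_Splitlines
_splitlines.argtypes = [ctypes.py_object, ctypes.c_int]
_splitlines.restype = ctypes.py_object

TEXT = "one\r\ntwo\nthree"

# keepends == 0: line break characters are dropped.
without_ends = _splitlines(TEXT, 0)

# keepends != 0: each line keeps its terminator; "\r\n" stays together.
with_ends = _splitlines(TEXT, 1)
```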
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Join">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Join</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">separator</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">seq</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Join" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Join a sequence of strings using the given <em>separator</em> and return the resulting
- Unicode string.</p>
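- <p>Note the argument order: the separator comes first, the sequence second. A minimal sketch, assuming CPython and calling <code class="docutils literal notranslate"><span class="pre">PyUnicode_Join()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>:</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_join = ctypes.pythonapi.PyUnicode_Join
_join.argtypes = [ctypes.py_object, ctypes.py_object]
_join.restype = ctypes.py_object

# The separator is the first argument, the sequence of strings the second.
joined_list = _join("-", ["a", "b", "c"])
joined_tuple = _join(", ", ("x", "y"))
```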
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Tailmatch">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Tailmatch</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">substr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">end</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">direction</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Tailmatch" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return <code class="docutils literal notranslate"><span class="pre">1</span></code> if <em>substr</em> matches <code class="docutils literal notranslate"><span class="pre">unicode[start:end]</span></code> at the given tail end
- (<em>direction</em> == <code class="docutils literal notranslate"><span class="pre">-1</span></code> means to do a prefix match, <em>direction</em> == <code class="docutils literal notranslate"><span class="pre">1</span></code> a suffix match),
- <code class="docutils literal notranslate"><span class="pre">0</span></code> otherwise. Return <code class="docutils literal notranslate"><span class="pre">-1</span></code> if an error occurred.</p>
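- <p>The sign convention for <em>direction</em> is easy to mix up, so here is a small sketch, assuming CPython, that calls <code class="docutils literal notranslate"><span class="pre">PyUnicode_Tailmatch()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>. The file name is an arbitrary example.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_tailmatch = ctypes.pythonapi.PyUnicode_Tailmatch
_tailmatch.argtypes = [ctypes.py_object, ctypes.py_object,
                       ctypes.c_ssize_t, ctypes.c_ssize_t, ctypes.c_int]
_tailmatch.restype = ctypes.c_ssize_t

NAME = "report.txt"

# direction == 1: suffix match against NAME[0:len(NAME)].
is_suffix = _tailmatch(NAME, ".txt", 0, len(NAME), 1)

# direction == -1: prefix match against the same slice.
is_prefix = _tailmatch(NAME, "rep", 0, len(NAME), -1)

# ".txt" is a suffix, not a prefix, so a prefix match reports 0.
suffix_as_prefix = _tailmatch(NAME, ".txt", 0, len(NAME), -1)
```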
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Find">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Find</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">substr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">end</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">direction</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Find" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return the first position of <em>substr</em> in <code class="docutils literal notranslate"><span class="pre">unicode[start:end]</span></code> using the given
- <em>direction</em> (<em>direction</em> == <code class="docutils literal notranslate"><span class="pre">1</span></code> means to do a forward search, <em>direction</em> == <code class="docutils literal notranslate"><span class="pre">-1</span></code> a
- backward search). The return value is the index of the first match; a value of
- <code class="docutils literal notranslate"><span class="pre">-1</span></code> indicates that no match was found, and <code class="docutils literal notranslate"><span class="pre">-2</span></code> indicates that an error
- occurred and an exception has been set.</p>
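- <p>The following sketch, assuming CPython, calls <code class="docutils literal notranslate"><span class="pre">PyUnicode_Find()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code> to show forward and backward searches and the <code class="docutils literal notranslate"><span class="pre">-1</span></code> result for a missing substring. Returned indices are relative to the whole string, not to <em>start</em>.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_find = ctypes.pythonapi.PyUnicode_Find
_find.argtypes = [ctypes.py_object, ctypes.py_object,
                  ctypes.c_ssize_t, ctypes.c_ssize_t, ctypes.c_int]
_find.restype = ctypes.c_ssize_t

TEXT = "abcabc"

forward = _find(TEXT, "bc", 0, len(TEXT), 1)     # search from the left
backward = _find(TEXT, "bc", 0, len(TEXT), -1)   # search from the right
from_two = _find(TEXT, "bc", 2, len(TEXT), 1)    # index is still absolute
missing = _find(TEXT, "zz", 0, len(TEXT), 1)     # no match
```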
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_FindChar">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_FindChar</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="#c.Py_UCS4" title="Py_UCS4"><span class="n"><span class="pre">Py_UCS4</span></span></a><span class="w"> </span><span class="n"><span class="pre">ch</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">end</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">direction</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_FindChar" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a> since version 3.7.</em><p>Return the first position of the character <em>ch</em> in <code class="docutils literal notranslate"><span class="pre">unicode[start:end]</span></code> using
- the given <em>direction</em> (<em>direction</em> == <code class="docutils literal notranslate"><span class="pre">1</span></code> means to do a forward search,
- <em>direction</em> == <code class="docutils literal notranslate"><span class="pre">-1</span></code> a backward search). The return value is the index of the
- first match; a value of <code class="docutils literal notranslate"><span class="pre">-1</span></code> indicates that no match was found, and <code class="docutils literal notranslate"><span class="pre">-2</span></code>
- indicates that an error occurred and an exception has been set.</p>
- <div class="versionadded">
- <p><span class="versionmodified added">New in version 3.3.</span></p>
- </div>
- <div class="versionchanged">
- <p><span class="versionmodified changed">Changed in version 3.7: </span><em>start</em> and <em>end</em> are now adjusted to behave like <code class="docutils literal notranslate"><span class="pre">unicode[start:end]</span></code>.</p>
- </div>
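- <p>A corresponding sketch for single characters, assuming CPython and calling <code class="docutils literal notranslate"><span class="pre">PyUnicode_FindChar()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>. <code class="docutils literal notranslate"><span class="pre">Py_UCS4</span></code> is passed as a 32-bit unsigned integer, so the character is given as its code point.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_find_char = ctypes.pythonapi.PyUnicode_FindChar
_find_char.argtypes = [ctypes.py_object, ctypes.c_uint32,
                       ctypes.c_ssize_t, ctypes.c_ssize_t, ctypes.c_int]
_find_char.restype = ctypes.c_ssize_t

TEXT = "hello"

forward = _find_char(TEXT, ord("l"), 0, len(TEXT), 1)    # first "l"
backward = _find_char(TEXT, ord("l"), 0, len(TEXT), -1)  # last "l"
missing = _find_char(TEXT, ord("z"), 0, len(TEXT), 1)    # not present
```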
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Count">
- <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Count</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">substr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">start</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">end</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Count" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return the number of non-overlapping occurrences of <em>substr</em> in
- <code class="docutils literal notranslate"><span class="pre">unicode[start:end]</span></code>. Return <code class="docutils literal notranslate"><span class="pre">-1</span></code> if an error occurred.</p>
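- <p>To make "non-overlapping" concrete, this sketch (assuming CPython, calling <code class="docutils literal notranslate"><span class="pre">PyUnicode_Count()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>) counts a two-character pattern in a run of identical characters.</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_count = ctypes.pythonapi.PyUnicode_Count
_count.argtypes = [ctypes.py_object, ctypes.py_object,
                   ctypes.c_ssize_t, ctypes.c_ssize_t]
_count.restype = ctypes.c_ssize_t

TEXT = "aaaa"

# "aa" fits at offsets 0 and 2; the overlapping match at offset 1 is skipped.
whole = _count(TEXT, "aa", 0, len(TEXT))

# Restricting the range to TEXT[1:4] == "aaa" leaves room for one match.
sliced = _count(TEXT, "aa", 1, 4)
```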
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Replace">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Replace</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">substr</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">replstr</span></span>, <a class="reference internal" href="intro.html#c.Py_ssize_t" title="Py_ssize_t"><span class="n"><span class="pre">Py_ssize_t</span></span></a><span class="w"> </span><span class="n"><span class="pre">maxcount</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Replace" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Replace at most <em>maxcount</em> occurrences of <em>substr</em> in <em>unicode</em> with <em>replstr</em> and
- return the resulting Unicode object. <em>maxcount</em> == <code class="docutils literal notranslate"><span class="pre">-1</span></code> means replace all
- occurrences.</p>
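- <p>A short sketch of the <em>maxcount</em> argument, assuming CPython and calling <code class="docutils literal notranslate"><span class="pre">PyUnicode_Replace()</span></code> through <code class="docutils literal notranslate"><span class="pre">ctypes.pythonapi</span></code>:</p>

```python
import ctypes

# Bind the documented C function through ctypes (assumes CPython).
_replace = ctypes.pythonapi.PyUnicode_Replace
_replace.argtypes = [ctypes.py_object, ctypes.py_object,
                     ctypes.py_object, ctypes.c_ssize_t]
_replace.restype = ctypes.py_object

# maxcount == -1 replaces every occurrence.
replace_all = _replace("a-b-c", "-", "+", -1)

# maxcount == 1 replaces only the first occurrence.
replace_one = _replace("a-b-c", "-", "+", 1)
```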
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Compare">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Compare</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">left</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">right</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Compare" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Compare two strings and return <code class="docutils literal notranslate"><span class="pre">-1</span></code>, <code class="docutils literal notranslate"><span class="pre">0</span></code>, <code class="docutils literal notranslate"><span class="pre">1</span></code> for less than, equal, and greater than,
- respectively.</p>
- <p>This function returns <code class="docutils literal notranslate"><span class="pre">-1</span></code> upon failure, so one should call
- <a class="reference internal" href="exceptions.html#c.PyErr_Occurred" title="PyErr_Occurred"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyErr_Occurred()</span></code></a> to check for errors.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_CompareWithASCIIString">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_CompareWithASCIIString</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">string</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_CompareWithASCIIString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Compare a Unicode object, <em>unicode</em>, with <em>string</em> and return <code class="docutils literal notranslate"><span class="pre">-1</span></code>, <code class="docutils literal notranslate"><span class="pre">0</span></code>, <code class="docutils literal notranslate"><span class="pre">1</span></code> for less
- than, equal, and greater than, respectively. It is best to pass only
- ASCII-encoded strings, but the function interprets the input string as
- ISO-8859-1 if it contains non-ASCII characters.</p>
- <p>This function does not raise exceptions.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_RichCompare">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_RichCompare</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">left</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">right</span></span>, <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="n"><span class="pre">op</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_RichCompare" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Rich compare two Unicode strings and return one of the following:</p>
- <ul class="simple">
- <li><p><code class="docutils literal notranslate"><span class="pre">NULL</span></code> in case an exception was raised</p></li>
- <li><p><a class="reference internal" href="bool.html#c.Py_True" title="Py_True"><code class="xref c c-data docutils literal notranslate"><span class="pre">Py_True</span></code></a> or <a class="reference internal" href="bool.html#c.Py_False" title="Py_False"><code class="xref c c-data docutils literal notranslate"><span class="pre">Py_False</span></code></a> for successful comparisons</p></li>
- <li><p><a class="reference internal" href="object.html#c.Py_NotImplemented" title="Py_NotImplemented"><code class="xref c c-data docutils literal notranslate"><span class="pre">Py_NotImplemented</span></code></a> in case the type combination is unknown</p></li>
- </ul>
- <p>Possible values for <em>op</em> are <a class="reference internal" href="typeobj.html#c.Py_GT" title="Py_GT"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_GT</span></code></a>, <a class="reference internal" href="typeobj.html#c.Py_GE" title="Py_GE"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_GE</span></code></a>, <a class="reference internal" href="typeobj.html#c.Py_EQ" title="Py_EQ"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_EQ</span></code></a>,
- <a class="reference internal" href="typeobj.html#c.Py_NE" title="Py_NE"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_NE</span></code></a>, <a class="reference internal" href="typeobj.html#c.Py_LT" title="Py_LT"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_LT</span></code></a>, and <a class="reference internal" href="typeobj.html#c.Py_LE" title="Py_LE"><code class="xref c c-macro docutils literal notranslate"><span class="pre">Py_LE</span></code></a>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Format">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Format</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">format</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">args</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Format" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Return a new string object from <em>format</em> and <em>args</em>; this is analogous to
- <code class="docutils literal notranslate"><span class="pre">format</span> <span class="pre">%</span> <span class="pre">args</span></code>.</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_Contains">
- <span class="kt"><span class="pre">int</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_Contains</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">unicode</span></span>, <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">substr</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_Contains" title="Link to this definition">¶</a><br /></dt>
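- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Check whether <em>substr</em> is contained in <em>unicode</em>. Return <code class="docutils literal notranslate"><span class="pre">1</span></code> if it is,
- <code class="docutils literal notranslate"><span class="pre">0</span></code> if it is not, and <code class="docutils literal notranslate"><span class="pre">-1</span></code> with an exception set if there was an error.</p>
- <p><em>substr</em> must be a Unicode string; passing an object of any other type sets
- <code class="docutils literal notranslate"><span class="pre">TypeError</span></code>. It is not limited to a single character.</p>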
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_InternInPlace">
- <span class="kt"><span class="pre">void</span></span><span class="w"> </span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_InternInPlace</span></span></span><span class="sig-paren">(</span><a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">p_unicode</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_InternInPlace" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>Intern the argument <span class="c-expr sig sig-inline c"><span class="o">*</span><a class="reference internal" href="#c.PyUnicode_InternInPlace" title="p_unicode"><span class="n">p_unicode</span></a></span> in place. The argument must be the address of a
- pointer variable pointing to a Python Unicode string object. If there is an
- existing interned string that is the same as <span class="c-expr sig sig-inline c"><span class="o">*</span><a class="reference internal" href="#c.PyUnicode_InternInPlace" title="p_unicode"><span class="n">p_unicode</span></a></span>, it sets <span class="c-expr sig sig-inline c"><span class="o">*</span><a class="reference internal" href="#c.PyUnicode_InternInPlace" title="p_unicode"><span class="n">p_unicode</span></a></span> to
- it (releasing the reference to the old string object and creating a new
- <a class="reference internal" href="../glossary.html#term-strong-reference"><span class="xref std std-term">strong reference</span></a> to the interned string object), otherwise it leaves
- <span class="c-expr sig sig-inline c"><span class="o">*</span><a class="reference internal" href="#c.PyUnicode_InternInPlace" title="p_unicode"><span class="n">p_unicode</span></a></span> alone and interns it (creating a new <a class="reference internal" href="../glossary.html#term-strong-reference"><span class="xref std std-term">strong reference</span></a>).
- (Clarification: even though there is a lot of talk about references, think
- of this function as reference-neutral; you own the object after the call
- if and only if you owned it before the call.)</p>
- </dd></dl>
-
- <dl class="c function">
- <dt class="sig sig-object c" id="c.PyUnicode_InternFromString">
- <a class="reference internal" href="structures.html#c.PyObject" title="PyObject"><span class="n"><span class="pre">PyObject</span></span></a><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="sig-name descname"><span class="n"><span class="pre">PyUnicode_InternFromString</span></span></span><span class="sig-paren">(</span><span class="k"><span class="pre">const</span></span><span class="w"> </span><span class="kt"><span class="pre">char</span></span><span class="w"> </span><span class="p"><span class="pre">*</span></span><span class="n"><span class="pre">str</span></span><span class="sig-paren">)</span><a class="headerlink" href="#c.PyUnicode_InternFromString" title="Link to this definition">¶</a><br /></dt>
- <dd><em class="refcount">Return value: New reference.</em><em class="stableabi"> Part of the <a class="reference internal" href="stable.html#stable"><span class="std std-ref">Stable ABI</span></a>.</em><p>A combination of <a class="reference internal" href="#c.PyUnicode_FromString" title="PyUnicode_FromString"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_FromString()</span></code></a> and
- <a class="reference internal" href="#c.PyUnicode_InternInPlace" title="PyUnicode_InternInPlace"><code class="xref c c-func docutils literal notranslate"><span class="pre">PyUnicode_InternInPlace()</span></code></a>, returning either a new Unicode string
- object that has been interned, or a new (“owned”) reference to an earlier
- interned string object with the same value.</p>
- </dd></dl>
-
- </section>
- </section>
-
-
- <div class="clearer"></div>
- </div>
- </div>
- </div>
- <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
- <div class="sphinxsidebarwrapper">
- <div>
- <h3><a href="../contents.html">Table of Contents</a></h3>
- <ul>
- <li><a class="reference internal" href="#">Unicode Objects and Codecs</a><ul>
- <li><a class="reference internal" href="#unicode-objects">Unicode Objects</a><ul>
- <li><a class="reference internal" href="#unicode-type">Unicode Type</a></li>
- <li><a class="reference internal" href="#unicode-character-properties">Unicode Character Properties</a></li>
- <li><a class="reference internal" href="#creating-and-accessing-unicode-strings">Creating and accessing Unicode strings</a></li>
- <li><a class="reference internal" href="#locale-encoding">Locale Encoding</a></li>
- <li><a class="reference internal" href="#file-system-encoding">File System Encoding</a></li>
- <li><a class="reference internal" href="#wchar-t-support">wchar_t Support</a></li>
- </ul>
- </li>
- <li><a class="reference internal" href="#built-in-codecs">Built-in Codecs</a><ul>
- <li><a class="reference internal" href="#generic-codecs">Generic Codecs</a></li>
- <li><a class="reference internal" href="#utf-8-codecs">UTF-8 Codecs</a></li>
- <li><a class="reference internal" href="#utf-32-codecs">UTF-32 Codecs</a></li>
- <li><a class="reference internal" href="#utf-16-codecs">UTF-16 Codecs</a></li>
- <li><a class="reference internal" href="#utf-7-codecs">UTF-7 Codecs</a></li>
- <li><a class="reference internal" href="#unicode-escape-codecs">Unicode-Escape Codecs</a></li>
- <li><a class="reference internal" href="#raw-unicode-escape-codecs">Raw-Unicode-Escape Codecs</a></li>
- <li><a class="reference internal" href="#latin-1-codecs">Latin-1 Codecs</a></li>
- <li><a class="reference internal" href="#ascii-codecs">ASCII Codecs</a></li>
- <li><a class="reference internal" href="#character-map-codecs">Character Map Codecs</a></li>
- <li><a class="reference internal" href="#mbcs-codecs-for-windows">MBCS codecs for Windows</a></li>
- <li><a class="reference internal" href="#methods-slots">Methods &amp; Slots</a></li>
- </ul>
- </li>
- <li><a class="reference internal" href="#methods-and-slot-functions">Methods and Slot Functions</a></li>
- </ul>
- </li>
- </ul>
-
- </div>
- </div>
- <div id="sidebarbutton" title="Collapse sidebar">
- <span>«</span>
- </div>
-
- </div>
- <div class="clearer"></div>
- </div>
- <div class="footer">
- ©
- <a href="../copyright.html">
-
- Copyright
-
- </a>
- 2001-2024, Python Software Foundation.
- <br />
- This page is licensed under the Python Software Foundation License Version 2.
- <br />
- Examples, recipes, and other code in the documentation are additionally licensed under the Zero Clause BSD License.
- <br />
-
- See <a href="/license.html">History and License</a> for more information.<br />
-
-
- <br />
-
- The Python Software Foundation is a non-profit corporation.
- <a href="https://www.python.org/psf/donations/">Please donate.</a>
- <br />
- <br />
- Last updated on Apr 09, 2024 (13:47 UTC).
-
- <a href="/bugs.html">Found a bug</a>?
-
- <br />
-
- Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 7.2.6.
- </div>
-
- </body>
- </html>
|