<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Pattern Matching</title>
<link href="../docs-assets/Breadcrumbs.css" rel="stylesheet" rev="stylesheet" type="text/css">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="../docs-assets/Contents.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Progress.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Navigation.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Fonts.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Base.css" rel="stylesheet" rev="stylesheet" type="text/css">
<script>
function togglePopup(material_id) {
var popup = document.getElementById(material_id);
popup.classList.toggle("show");
}
</script>
<link href="../docs-assets/Popups.css" rel="stylesheet" rev="stylesheet" type="text/css">
<link href="../docs-assets/Colours.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body class="commentary-font">
<nav role="navigation">
<h1><a href="../index.html">
<img src="../docs-assets/Octagram.png" width=72 height=72>
</a></h1>
<ul><li><a href="../inweb/index.html">inweb</a></li>
</ul><h2>Foundation Module</h2><ul>
<li><a href="index.html"><span class="selectedlink">foundation</span></a></li>
<li><a href="../foundation-test/index.html">foundation-test</a></li>
</ul><h2>Example Webs</h2><ul>
<li><a href="../goldbach/index.html">goldbach</a></li>
<li><a href="../twinprimes/twinprimes.html">twinprimes</a></li>
<li><a href="../eastertide/index.html">eastertide</a></li>
</ul><h2>Repository</h2><ul>
<li><a href="https://github.com/ganelson/inweb"><img src="../docs-assets/github.png" height=18> github</a></li>
</ul><h2>Related Projects</h2><ul>
<li><a href="../../../inform/docs/index.html">inform</a></li>
<li><a href="../../../intest/docs/index.html">intest</a></li>
</ul>
</nav>
<main role="main">
<!--Weave of 'Pattern Matching' generated by Inweb-->
<div class="breadcrumbs">
<ul class="crumbs"><li><a href="../index.html">Home</a></li><li><a href="index.html">foundation</a></li><li><a href="index.html#4">Chapter 4: Text Handling</a></li><li><b>Pattern Matching</b></li></ul></div>
<p class="purpose">To provide a limited regular-expression parser.</p>
<ul class="toc"><li><a href="4-pm.html#SP1">&#167;1. Character types</a></li><li><a href="4-pm.html#SP3">&#167;3. Simple parsing</a></li><li><a href="4-pm.html#SP6">&#167;6. A Worse PCRE</a></li><li><a href="4-pm.html#SP14">&#167;14. Replacement</a></li></ul><hr class="tocbar">
<p class="commentary firstcommentary"><a id="SP1"></a><b>&#167;1. Character types. </b>We will define white space as spaces and tabs only, since the various kinds
of line terminator will always be stripped out before this is applied.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::white_space</span><button class="popup" onclick="togglePopup('usagePopup1')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup1">Usage of <span class="code-font"><span class="function-syntax">Regexp::white_space</span></span>:<br/><a href="4-pm.html#SP5">&#167;5</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">' '</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'\t'</span><span class="plain-syntax">)) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP2"></a><b>&#167;2. </b>The presence of <span class="extract"><span class="extract-syntax">:</span></span> among the identifier characters below is perhaps a bit surprising, since it's illegal in
C identifiers and has other meanings in other languages, but it's legal in C-for-Inform
identifiers.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::identifier_char</span><button class="popup" onclick="togglePopup('usagePopup2')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup2">Usage of <span class="code-font"><span class="function-syntax">Regexp::identifier_char</span></span>:<br/><a href="4-pm.html#SP13">&#167;13</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'_'</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">':'</span><span class="plain-syntax">) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'A'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'Z'</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'a'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'z'</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'0'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'9'</span><span class="plain-syntax">))) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP3"></a><b>&#167;3. Simple parsing. </b>The following finds the first place in a text where an opening pair of
characters, such as <span class="extract"><span class="extract-syntax">&lt;&lt;</span></span>, is followed later by a closing pair, such as <span class="extract"><span class="extract-syntax">&gt;&gt;</span></span>, and takes the
nearest such closing pair. It returns the position of the opening pair and sets
<span class="extract"><span class="extract-syntax">*len</span></span> to the length of the substring including both delimiters, or returns -1
if there is no such substring. This could
easily be done as a regular expression using <span class="extract"><span class="extract-syntax">Regexp::match</span></span>, but the routine
here is much quicker.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::find_expansion</span><span class="plain-syntax">(</span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">on1</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">on2</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">off1</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">off2</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax"> &lt; </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="identifier-syntax">i</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">) == </span><span class="identifier-syntax">on1</span><span class="plain-syntax">) &amp;&amp; (</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">+1) == </span><span class="identifier-syntax">on2</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax">=</span><span class="identifier-syntax">i</span><span class="plain-syntax">+2; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> &lt; </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="identifier-syntax">j</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">j</span><span class="plain-syntax">) == </span><span class="identifier-syntax">off1</span><span class="plain-syntax">) &amp;&amp; (</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">j</span><span class="plain-syntax">+1) == </span><span class="identifier-syntax">off2</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax"> = </span><span class="identifier-syntax">j</span><span class="plain-syntax">+2-</span><span class="identifier-syntax">i</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> -1;</span>
<span class="plain-syntax">}</span>
</pre>
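<p class="commentary">As an illustration only, the same search can be sketched in plain C over a
NUL-terminated wide string instead of a <span class="extract"><span class="extract-syntax">text_stream</span></span>. The name
<span class="extract"><span class="extract-syntax">find_expansion_w</span></span> and the use of <span class="extract"><span class="extract-syntax">wcslen</span></span> are assumptions made for this
sketch, and are not part of Foundation.
</p>
<pre class="displayed-code all-displayed-code code-font">
#include &lt;stddef.h&gt;
#include &lt;wchar.h&gt;

/* Hypothetical standalone analogue of Regexp::find_expansion. Returns the
   index of the opening pair and sets *len to the length including both
   delimiters, or returns -1 (leaving *len untouched) if nothing is found. */
int find_expansion_w(const wchar_t *text, wchar_t on1, wchar_t on2,
    wchar_t off1, wchar_t off2, int *len) {
    int n = (int) wcslen(text);
    for (int i = 0; i + 1 &lt; n; i++)
        if ((text[i] == on1) &amp;&amp; (text[i+1] == on2)) {
            /* look for the nearest closing pair after the opening pair */
            for (int j = i + 2; j + 1 &lt; n; j++)
                if ((text[j] == off1) &amp;&amp; (text[j+1] == off2)) {
                    *len = j + 2 - i;
                    return i;
                }
        }
    return -1;
}
</pre>
<p class="commentary">For the text <span class="extract"><span class="extract-syntax">abc &lt;&lt;name&gt;&gt; def</span></span>, with <span class="extract"><span class="extract-syntax">&lt;&lt;</span></span> and <span class="extract"><span class="extract-syntax">&gt;&gt;</span></span> as the delimiters, this
sketch returns 4 and sets the length to 8.
</p>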
<p class="commentary firstcommentary"><a id="SP4"></a><b>&#167;4. </b>Still more simply:
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::find_open_brace</span><span class="plain-syntax">(</span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax"> &lt; </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="identifier-syntax">i</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">) == </span><span class="character-syntax">'{'</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> -1;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP5"></a><b>&#167;5. </b>Note that we count the empty string as being white space. Again, this is
equivalent to <span class="extract"><span class="extract-syntax">Regexp::match(p, " *")</span></span>, but much faster.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::string_is_white_space</span><span class="plain-syntax">(</span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">LOOP_THROUGH_TEXT</span><span class="plain-syntax">(</span><span class="identifier-syntax">P</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-pm.html#SP1" class="function-link"><span class="function-syntax">Regexp::white_space</span></a><span class="plain-syntax">(</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">P</span><span class="plain-syntax">)) == </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
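<p class="commentary">The empty-string case falls out of the loop structure: with no characters to
examine, nothing can fail the test. A minimal standalone sketch over a
NUL-terminated wide string shows this. The names <span class="extract"><span class="extract-syntax">is_white_space_w</span></span> and
<span class="extract"><span class="extract-syntax">string_is_white_space_w</span></span> are hypothetical, chosen for this illustration only.
</p>
<pre class="displayed-code all-displayed-code code-font">
#include &lt;stddef.h&gt;

/* Spaces and tabs only, following the definition in Regexp::white_space. */
int is_white_space_w(wchar_t c) {
    return ((c == L' ') || (c == L'\t')) ? 1 : 0;
}

/* Returns 1 if every character is white space, which is vacuously
   the case for the empty string; returns 0 otherwise. */
int string_is_white_space_w(const wchar_t *text) {
    for (; *text; text++)
        if (is_white_space_w(*text) == 0)
            return 0;
    return 1;
}
</pre>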
<p class="commentary firstcommentary"><a id="SP6"></a><b>&#167;6. A Worse PCRE. </b>I originally wanted to call the function in this section <span class="extract"><span class="extract-syntax">a_better_sscanf</span></span>, then
thought perhaps <span class="extract"><span class="extract-syntax">a_worse_PCRE</span></span> would be more true. (PCRE is Philip Hazel's superb
C implementation of regular-expression parsing, but I didn't need its full strength,
and I didn't want to complicate the build process by linking to it.)
</p>
<p class="commentary">This is a very minimal regular expression parser, simply for convenience of parsing
short texts against particularly simple patterns. Here is an example of use:
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax"> = </span><span class="function-syntax">Regexp::create_mr</span><span class="plain-syntax">();</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="function-syntax">Regexp::match</span><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">L</span><span class="string-syntax">"fish (%d+) ([a-zA-Z_][a-zA-Z0-9_]*) *"</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PRINT</span><span class="plain-syntax">(</span><span class="string-syntax">"Fish number: %S\n"</span><span class="plain-syntax">, </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="element-syntax">exp</span><span class="plain-syntax">[0]);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PRINT</span><span class="plain-syntax">(</span><span class="string-syntax">"Fish name: %S\n"</span><span class="plain-syntax">, </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="element-syntax">exp</span><span class="plain-syntax">[1]);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="function-syntax">Regexp::dispose_of</span><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
</pre>
<p class="commentary">Note the <span class="extract"><span class="extract-syntax">L</span></span> at the front of the regex itself: this is a wide string.
</p>
<p class="commentary">This tries to match the given <span class="extract"><span class="extract-syntax">text</span></span> to see if it consists of the word fish,
then any amount of whitespace, then a string of digits which are copied into
<span class="extract"><span class="extract-syntax">mr.exp[0]</span></span>, then whitespace again, and then an alphanumeric identifier to be
copied into <span class="extract"><span class="extract-syntax">mr.exp[1]</span></span>, and finally optional whitespace. (If no match is
made, the contents of the found strings are undefined.)
</p>
<p class="commentary">Note that this differs from, for example, Perl's regular expression matcher
in several ways. The regular expression syntax is slightly different and in
general simpler. A match has to be made from start to end, so it's as if there
were an implicit <span class="extract"><span class="extract-syntax">^</span></span> at the front and <span class="extract"><span class="extract-syntax">$</span></span> at the back (in Perl terms). The
full match text is therefore always the entire text put in, so there's no
need to record this. In Perl, matching against <span class="extract"><span class="extract-syntax">m/(.*) plus (.*)/</span></span> would
set three subexpressions: number 0 would be the whole text matched, number
1 would be the first bracketed part, number 2 the second. Here, though, the
corresponding regex would be written <span class="extract"><span class="extract-syntax">L"(%c*) plus (%c*)"</span></span>, and the bracketed
terms would be subexpressions 0 and 1.
</p>
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax"> </span><span class="constant-syntax">5</span><span class="plain-syntax"> </span><span class="comment-syntax"> this many bracketed subexpressions can be extracted</span>
</pre>
<p class="commentary firstcommentary"><a id="SP7"></a><b>&#167;7. </b>The internal state of the matcher is stored as follows:
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_position</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">tpos</span><span class="plain-syntax">; </span><span class="comment-syntax"> position within text being matched</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">; </span><span class="comment-syntax"> position within pattern</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">bc</span><span class="plain-syntax">; </span><span class="comment-syntax"> count of bracketed subexpressions so far begun</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">bl</span><span class="plain-syntax">; </span><span class="comment-syntax"> bracket indentation level</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">bracket_nesting</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> which subexpression numbers (0, 1, 2, 3, 4) correspond to which nesting</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">brackets_start</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">], </span><span class="identifier-syntax">brackets_end</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> positions in text being matched, inclusive</span>
<span class="plain-syntax">} </span><span class="reserved-syntax">match_position</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>The structure match_position is private to this section.</li></ul>
<p class="commentary firstcommentary"><a id="SP8"></a><b>&#167;8. </b>It may appear that match texts are limited to 64 characters here, but they
are not. They are simply a little faster to access if short.
</p>
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">MATCH_TEXT_INITIAL_ALLOCATION</span><span class="plain-syntax"> </span><span class="constant-syntax">64</span>
</pre>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_result</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">match_text_storage</span><span class="plain-syntax">[</span><span class="constant-syntax">MATCH_TEXT_INITIAL_ALLOCATION</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> </span><span class="identifier-syntax">match_text_struct</span><span class="plain-syntax">;</span>
<span class="plain-syntax">} </span><span class="reserved-syntax">match_result</span><span class="plain-syntax">;</span>
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">no_matched_texts</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_result</span><span class="plain-syntax"> </span><span class="identifier-syntax">exp_storage</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">exp</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">exp_at</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">];</span>
<span class="plain-syntax">} </span><span class="reserved-syntax">match_results</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>The structure match_result is private to this section.</li><li>The structure match_results is accessed in 3/cla, 8/ws, 8/wm, 8/bf and here.</li></ul>
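<p class="commentary">The general idea of fixed inline storage which is fast for short texts, with
a fallback for longer ones, can be sketched in plain C as follows. This is an
assumed illustration of the technique only: the type <span class="extract"><span class="extract-syntax">small_text</span></span>, its functions
and the doubling growth policy are hypothetical, and how <span class="extract"><span class="extract-syntax">text_stream</span></span> actually
extends its storage is not shown in this section.
</p>
<pre class="displayed-code all-displayed-code code-font">
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;wchar.h&gt;

#define INLINE_CAPACITY 64 /* hypothetical; plays the role of MATCH_TEXT_INITIAL_ALLOCATION */

typedef struct small_text {
    wchar_t inline_storage[INLINE_CAPACITY];
    wchar_t *chars;  /* points at inline_storage, or at heap memory */
    size_t length;   /* characters stored, excluding the terminator */
    size_t capacity; /* characters available, including the terminator */
} small_text;

/* Initialise in place: the struct must not be copied afterwards, since
   chars may point into the struct's own inline storage. */
void small_text_init(small_text *st) {
    st-&gt;chars = st-&gt;inline_storage;
    st-&gt;length = 0;
    st-&gt;capacity = INLINE_CAPACITY;
    st-&gt;chars[0] = 0;
}

/* Append one character, moving to the heap once the inline storage is
   full. Returns 1 on success, 0 if memory could not be allocated. */
int small_text_put(small_text *st, wchar_t c) {
    if (st-&gt;length + 1 &gt;= st-&gt;capacity) {
        size_t new_capacity = 2 * st-&gt;capacity;
        wchar_t *bigger = malloc(new_capacity * sizeof(wchar_t));
        if (bigger == NULL) return 0;
        memcpy(bigger, st-&gt;chars, (st-&gt;length + 1) * sizeof(wchar_t));
        if (st-&gt;chars != st-&gt;inline_storage) free(st-&gt;chars);
        st-&gt;chars = bigger;
        st-&gt;capacity = new_capacity;
    }
    st-&gt;chars[st-&gt;length++] = c;
    st-&gt;chars[st-&gt;length] = 0;
    return 1;
}

/* Release any heap memory and return to the empty inline state. */
void small_text_dispose(small_text *st) {
    if (st-&gt;chars != st-&gt;inline_storage) free(st-&gt;chars);
    small_text_init(st);
}
</pre>
<p class="commentary">In this sketch a text of up to 63 characters never touches the heap, so creating
and disposing of it costs almost nothing, while longer texts still work.
</p>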
<p class="commentary firstcommentary"><a id="SP9"></a><b>&#167;9. </b>Match result objects are inherently ephemeral, and we can expect to be
creating them and throwing them away frequently. Both steps must be done
explicitly, with <span class="extract"><span class="extract-syntax">Regexp::create_mr</span></span> and <span class="extract"><span class="extract-syntax">Regexp::dispose_of</span></span>. Note that the storage required is on the C stack (unless some
result strings grow very large), so that it's very quick to allocate and
deallocate.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">match_results</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::create_mr</span><button class="popup" onclick="togglePopup('usagePopup3')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup3">Usage of <span class="code-font"><span class="function-syntax">Regexp::create_mr</span></span>:<br/><a href="4-pm.html#SP14">&#167;14</a><br/>Command Line Arguments - <a href="3-cla.html#SP11">&#167;11</a>, <a href="3-cla.html#SP12">&#167;12</a><br/>Web Structure - <a href="8-ws.html#SP7_3_2">&#167;7.3.2</a>, <a href="8-ws.html#SP7_3_3_2">&#167;7.3.3.2</a>, <a href="8-ws.html#SP7_3_3_2_1">&#167;7.3.3.2.1</a>, <a href="8-ws.html#SP7_2_1">&#167;7.2.1</a>, <a href="8-ws.html#SP7_2_2_1">&#167;7.2.2.1</a>, <a href="8-ws.html#SP7_2_2_3">&#167;7.2.2.3</a><br/>Web Modules - <a href="8-wm.html#SP9">&#167;9</a><br/>Build Files - <a href="8-bf.html#SP3">&#167;3</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">void</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="element-syntax">no_matched_texts</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="identifier-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="identifier-syntax">exp_at</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = -1;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::dispose_of</span><button class="popup" onclick="togglePopup('usagePopup4')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup4">Usage of <span class="code-font"><span class="function-syntax">Regexp::dispose_of</span></span>:<br/><a href="4-pm.html#SP10">&#167;10</a>, <a href="4-pm.html#SP14">&#167;14</a><br/>Command Line Arguments - <a href="3-cla.html#SP11">&#167;11</a><br/>Web Structure - <a href="8-ws.html#SP7_3_2">&#167;7.3.2</a>, <a href="8-ws.html#SP7_3_3_2">&#167;7.3.3.2</a>, <a href="8-ws.html#SP7_3_3_2_1">&#167;7.3.3.2.1</a>, <a href="8-ws.html#SP7_2_1">&#167;7.2.1</a>, <a href="8-ws.html#SP7_2_2_1">&#167;7.2.2.1</a>, <a href="8-ws.html#SP7_2_2_3">&#167;7.2.2.3</a><br/>Web Modules - <a href="8-wm.html#SP9">&#167;9</a><br/>Build Files - <a href="8-bf.html#SP3">&#167;3</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">STREAM_CLOSE</span><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="identifier-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">no_matched_texts</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">}</span>
</pre>
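<p class="commentary">The create-then-dispose lifecycle described above can be sketched in a
self-contained form. This is only a simplified model, not the Foundation code: the names
<span class="extract"><span class="extract-syntax">toy_results</span></span>,
<span class="extract"><span class="extract-syntax">toy_create</span></span>,
<span class="extract"><span class="extract-syntax">toy_record</span></span> and
<span class="extract"><span class="extract-syntax">toy_dispose</span></span> are hypothetical, plain
character buffers stand in for text streams, and an over-long text is truncated here,
which is an assumption of the sketch rather than the real behaviour.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBEXPRESSIONS 4   /* hypothetical stand-in for MAX_BRACKETED_SUBEXPRESSIONS */
#define INITIAL_ALLOCATION 64  /* hypothetical stand-in for MATCH_TEXT_INITIAL_ALLOCATION */

/* Each captured text lives in a fixed buffer inside the struct itself, so a
   results object declared as a local variable needs no heap allocation. */
typedef struct toy_results {
    int no_matched_texts;
    char storage[MAX_SUBEXPRESSIONS][INITIAL_ALLOCATION];
    char *exp[MAX_SUBEXPRESSIONS];   /* NULL until a text has been recorded */
    int exp_at[MAX_SUBEXPRESSIONS];  /* -1 means "no position recorded" */
} toy_results;

/* Returned by value: safe because no pointer into storage is set yet. */
toy_results toy_create(void) {
    toy_results mr;
    mr.no_matched_texts = 0;
    for (int i = 0; i < MAX_SUBEXPRESSIONS; i++) {
        mr.storage[i][0] = 0;
        mr.exp[i] = NULL;
        mr.exp_at[i] = -1;
    }
    return mr;
}

/* Record one captured text; returns its index, or -1 if there is no room. */
int toy_record(toy_results *mr, const char *text, int at) {
    assert(mr != NULL);
    int i = mr->no_matched_texts;
    if (i >= MAX_SUBEXPRESSIONS) return -1;
    strncpy(mr->storage[i], text, INITIAL_ALLOCATION - 1);
    mr->storage[i][INITIAL_ALLOCATION - 1] = 0;  /* truncation: sketch only */
    mr->exp[i] = mr->storage[i];
    mr->exp_at[i] = at;
    mr->no_matched_texts++;
    return i;
}

/* Explicit disposal: forget every recorded text. A NULL argument is ignored. */
void toy_dispose(toy_results *mr) {
    if (mr) {
        for (int i = 0; i < MAX_SUBEXPRESSIONS; i++) mr->exp[i] = NULL;
        mr->no_matched_texts = 0;
    }
}
```

<p class="commentary">A caller would declare <span class="extract"><span class="extract-syntax">toy_results mr = toy_create();</span></span>
as a local variable, use it, and call <span class="extract"><span class="extract-syntax">toy_dispose(&amp;mr)</span></span>
before leaving the function. Pointers into the embedded storage are only set once the
object sits at its final address, which is why returning the freshly created struct by value is safe.
</p>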
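<p class="commentary firstcommentary"><a id="SP10"></a><b>&#167;10. </b>So, then: the matcher itself.
<span class="extract"><span class="extract-syntax">Regexp::match</span></span> tests the whole text against the pattern and returns
<span class="extract"><span class="extract-syntax">TRUE</span></span> or <span class="extract"><span class="extract-syntax">FALSE</span></span>.
<span class="extract"><span class="extract-syntax">Regexp::match_from</span></span> starts at position <span class="extract"><span class="extract-syntax">x</span></span>
and returns the number of characters matched, which is 0 on failure. In both
cases a failed match disposes of the results.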
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::match</span><button class="popup" onclick="togglePopup('usagePopup5')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup5">Usage of <span class="code-font"><span class="function-syntax">Regexp::match</span></span>:<br/>Command Line Arguments - <a href="3-cla.html#SP11">&#167;11</a>, <a href="3-cla.html#SP12">&#167;12</a><br/>Web Structure - <a href="8-ws.html#SP7_3_2">&#167;7.3.2</a>, <a href="8-ws.html#SP7_3_3_2">&#167;7.3.3.2</a>, <a href="8-ws.html#SP7_3_3_2_1">&#167;7.3.3.2.1</a>, <a href="8-ws.html#SP7_2_1">&#167;7.2.1</a>, <a href="8-ws.html#SP7_2_2_1">&#167;7.2.2.1</a>, <a href="8-ws.html#SP7_2_2_3">&#167;7.2.2.3</a><br/>Web Modules - <a href="8-wm.html#SP9">&#167;9</a><br/>Build Files - <a href="8-bf.html#SP3">&#167;3</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) </span><a href="4-pm.html#SP10" class="function-link"><span class="function-syntax">Regexp::prepare</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">rv</span><span class="plain-syntax"> = (</span><a href="4-pm.html#SP11" class="function-link"><span class="function-syntax">Regexp::match_r</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">, </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">) &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">)?</span><span class="identifier-syntax">TRUE:FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">rv</span><span class="plain-syntax"> == </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">)) </span><a href="4-pm.html#SP9" class="function-link"><span class="function-syntax">Regexp::dispose_of</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">rv</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::match_from</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">x</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">allow_partial</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">match_to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">x</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">x</span><span class="plain-syntax"> &lt; </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) </span><a href="4-pm.html#SP10" class="function-link"><span class="function-syntax">Regexp::prepare</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_position</span><span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">tpos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">x</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">match_to</span><span class="plain-syntax"> = </span><a href="4-pm.html#SP11" class="function-link"><span class="function-syntax">Regexp::match_r</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">at</span><span class="plain-syntax">, </span><span class="identifier-syntax">allow_partial</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">match_to</span><span class="plain-syntax"> == -1) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">match_to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">x</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) </span><a href="4-pm.html#SP9" class="function-link"><span class="function-syntax">Regexp::dispose_of</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">match_to</span><span class="plain-syntax"> - </span><span class="identifier-syntax">x</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::prepare</span><button class="popup" onclick="togglePopup('usagePopup6')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup6">Usage of <span class="code-font"><span class="function-syntax">Regexp::prepare</span></span>:<br/><a href="4-pm.html#SP14">&#167;14</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">no_matched_texts</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_at</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = -1;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]) </span><span class="identifier-syntax">STREAM_CLOSE</span><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_storage</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">].</span><span class="element-syntax">match_text_struct</span><span class="plain-syntax"> =</span>
<span class="plain-syntax"> </span><a href="2-str.html#SP27" class="function-link"><span class="function-syntax">Streams::new_buffer</span></a><span class="plain-syntax">(</span>
<span class="plain-syntax"> </span><span class="constant-syntax">MATCH_TEXT_INITIAL_ALLOCATION</span><span class="plain-syntax">, </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_storage</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">].</span><span class="element-syntax">match_text_storage</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_storage</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">].</span><span class="element-syntax">match_text_struct</span><span class="plain-syntax">.</span><span class="element-syntax">stream_flags</span><span class="plain-syntax"> |= </span><span class="constant-syntax">FOR_RE_STRF</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = &amp;(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_storage</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">].</span><span class="element-syntax">match_text_struct</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">}</span>
</pre>
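<p class="commentary">The return conventions of the two entry points can be illustrated with a
cut-down model. This is a sketch under stated assumptions, not the real matcher: the
functions <span class="extract"><span class="extract-syntax">toy_match_r</span></span>,
<span class="extract"><span class="extract-syntax">toy_match</span></span> and
<span class="extract"><span class="extract-syntax">toy_match_from</span></span> are hypothetical, work on
ordinary C strings, and treat the pattern as literal text with no special characters.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the text position reached on success, or -1 on failure.
   With allow_partial set, running out of pattern ends the match early. */
int toy_match_r(const char *text, const char *pattern, int from, int allow_partial) {
    assert(text != NULL && pattern != NULL);
    int tpos = from, ppos = 0;
    while (text[tpos] || pattern[ppos]) {
        if (allow_partial && pattern[ppos] == 0) break;
        if (text[tpos] != pattern[ppos]) return -1;
        tpos++; ppos++;
    }
    return tpos;
}

/* Whole-text test: 1 if the entire text matches the pattern, else 0. */
int toy_match(const char *text, const char *pattern) {
    return (toy_match_r(text, pattern, 0, 0) >= 0) ? 1 : 0;
}

/* Match starting at position x; returns the number of characters consumed,
   which is 0 if the match fails or if x lies beyond the end of the text. */
int toy_match_from(const char *text, const char *pattern, int x, int allow_partial) {
    int match_to = x;
    if (x < (int) strlen(text)) {
        match_to = toy_match_r(text, pattern, x, allow_partial);
        if (match_to == -1) match_to = x;
    }
    return match_to - x;
}
```

<p class="commentary">One consequence of returning a length is that 0 does not distinguish a failed
match from a successful match of zero characters, so a caller that needs to tell them
apart would have to check some other way.
</p>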
<p class="commentary firstcommentary"><a id="SP11"></a><b>&#167;11. </b></p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::match_r</span><button class="popup" onclick="togglePopup('usagePopup7')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup7">Usage of <span class="code-font"><span class="function-syntax">Regexp::match_r</span></span>:<br/><a href="4-pm.html#SP10">&#167;10</a>, <a href="4-pm.html#SP11_5">&#167;11.5</a>, <a href="4-pm.html#SP14">&#167;14</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_position</span><span class="plain-syntax"> *</span><span class="identifier-syntax">scan_from</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">allow_partial</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_position</span><span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">scan_from</span><span class="plain-syntax">) </span><span class="identifier-syntax">at</span><span class="plain-syntax"> = *</span><span class="identifier-syntax">scan_from</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span><span class="plain-syntax"> { </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">while</span><span class="plain-syntax"> ((</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">tpos</span><span class="plain-syntax">)) || (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">])) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">allow_partial</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="constant-syntax">0</span><span class="plain-syntax">)) </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_1" class="named-paragraph-link"><span class="named-paragraph">Parentheses in the match pattern set up substrings to extract</span><span class="named-paragraph-number">11.1</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">chcl</span><span class="plain-syntax">, </span><span class="comment-syntax"> what class of characters to match: a </span><span class="extract"><span class="extract-syntax">*_CHARCLASS</span></span><span class="comment-syntax"> value</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">, </span><span class="identifier-syntax">range_to</span><span class="plain-syntax">, </span><span class="comment-syntax"> for </span><span class="extract"><span class="extract-syntax">LITERAL_CHARCLASS</span></span><span class="comment-syntax"> only</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">reverse</span><span class="plain-syntax"> = </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">; </span><span class="comment-syntax"> require a non-match rather than a match</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_2" class="named-paragraph-link"><span class="named-paragraph">Extract the character class to match from the pattern</span><span class="named-paragraph-number">11.2</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">, </span><span class="identifier-syntax">rep_to</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">; </span><span class="comment-syntax"> minimum and maximum number of repetitions</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">greedy</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="comment-syntax"> go for a maximal-length match if possible</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_3" class="named-paragraph-link"><span class="named-paragraph">Extract repetition markers from the pattern</span><span class="named-paragraph-number">11.3</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">reps</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_4" class="named-paragraph-link"><span class="named-paragraph">Count how many repetitions can be made here</span><span class="named-paragraph-number">11.4</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">reps</span><span class="plain-syntax"> &lt; </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> -1;</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> we can now accept anything from </span><span class="extract"><span class="extract-syntax">rep_from</span></span><span class="comment-syntax"> to </span><span class="extract"><span class="extract-syntax">reps</span></span><span class="comment-syntax"> repetitions</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">rep_from</span><span class="plain-syntax"> == </span><span class="identifier-syntax">reps</span><span class="plain-syntax">) { </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax"> += </span><span class="identifier-syntax">reps</span><span class="plain-syntax">; </span><span class="reserved-syntax">continue</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_5" class="named-paragraph-link"><span class="named-paragraph">Try all possible match lengths until we find a match</span><span class="named-paragraph-number">11.5</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="comment-syntax"> no match length worked, so no match</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> -1;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP11_6" class="named-paragraph-link"><span class="named-paragraph">Copy the bracketed texts found into the global strings</span><span class="named-paragraph-number">11.6</span></a></span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
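<p class="commentary">The overall shape of the loop above (read one character class, read its
repetition range, count how many repetitions are available, then try each acceptable
length until the rest of the pattern matches) can be sketched on a much smaller pattern
language. Everything here is an assumption made for illustration: the function
<span class="extract"><span class="extract-syntax">toy_rep_match</span></span> is hypothetical, it handles only
literal characters, <span class="extract"><span class="extract-syntax">.</span></span> for any character, and the
markers <span class="extract"><span class="extract-syntax">*</span></span>, <span class="extract"><span class="extract-syntax">+</span></span>
and <span class="extract"><span class="extract-syntax">?</span></span>, and it is always greedy.
</p>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Match the text from tpos against the pattern from ppos. Returns the text
   position reached when both are exhausted together, or -1 if no match. */
int toy_rep_match(const char *text, const char *pattern, int tpos, int ppos) {
    assert(text != NULL && pattern != NULL);
    while (text[tpos] || pattern[ppos]) {
        if (pattern[ppos] == 0) return -1;  /* text left over, pattern used up */

        /* the character class: '.' means any character, else a literal */
        char cl = pattern[ppos++];

        /* the repetition range for that class */
        int rep_from = 1, rep_to = 1;
        if (pattern[ppos] == '*') { rep_from = 0; rep_to = INT_MAX; ppos++; }
        else if (pattern[ppos] == '+') { rep_from = 1; rep_to = INT_MAX; ppos++; }
        else if (pattern[ppos] == '?') { rep_from = 0; rep_to = 1; ppos++; }

        /* count how many repetitions can be made here */
        int reps = 0;
        while ((reps < rep_to) && (text[tpos + reps]) &&
            ((cl == '.') || (text[tpos + reps] == cl))) reps++;
        if (reps < rep_from) return -1;

        /* only one length is possible, so no need to recurse */
        if (rep_from == reps) { tpos += reps; continue; }

        /* greedy: try the longest length first, then back off */
        for (int n = reps; n >= rep_from; n--) {
            int end = toy_rep_match(text, pattern, tpos + n, ppos);
            if (end >= 0) return end;
        }
        return -1;  /* no match length worked */
    }
    return tpos;
}
```

<p class="commentary">For instance, matching <span class="extract"><span class="extract-syntax">aaa</span></span> against
<span class="extract"><span class="extract-syntax">a*a</span></span> first lets the starred class take all three characters,
finds that the final <span class="extract"><span class="extract-syntax">a</span></span> then has nothing to match, and
succeeds only after backing off to two repetitions.
</p>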
<p class="commentary firstcommentary"><a id="SP11_1"></a><b>&#167;11.1. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Parentheses in the match pattern set up substrings to extract</span><span class="named-paragraph-number">11.1</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">'('</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> &lt; </span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">) </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bracket_nesting</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax">] = -1;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax"> &lt; </span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bracket_nesting</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax">] = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">brackets_start</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">bc</span><span class="plain-syntax">] = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">brackets_end</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax">] = -1;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">bl</span><span class="plain-syntax">++; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax">++; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">continue</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">')'</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">bl</span><span class="plain-syntax">--;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> &lt; </span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bracket_nesting</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax">] &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">))</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">brackets_end</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bracket_nesting</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax">]] = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">-1;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">continue</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
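<p class="commentary">As a rough illustration of the bookkeeping above, the following sketch numbers bracketed subexpressions in order of their opening bracket and reports which one each closing bracket ends. The names <span class="extract"><span class="extract-syntax">number_brackets</span></span> and <span class="extract"><span class="extract-syntax">MAX_SUBEXP</span></span> are hypothetical, it works on plain <span class="extract"><span class="extract-syntax">char</span></span> strings, it assumes a <span class="extract"><span class="extract-syntax">%</span></span> escapes the next character, and it ignores square-bracket sets, so it is a simplification rather than the code of this section.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBEXP 10 /* hypothetical limit, standing in for MAX_BRACKETED_SUBEXPRESSIONS */

/* For each ')' in the pattern, store the number of the subexpression it
   closes (or -1 if unbalanced or over the limit). Returns how many ')'
   were recorded. Subexpressions are numbered from 0 by opening bracket. */
int number_brackets(const char *pattern, int *closes, int max_closes) {
    assert(pattern != NULL && closes != NULL);
    int nesting[MAX_SUBEXP]; /* nesting[level] = subexpression open at that level */
    int bl = 0;              /* current bracket nesting level */
    int bc = 0;              /* count of opening brackets seen so far */
    int n = 0;
    for (int p = 0; pattern[p] != '\0'; p++) {
        if (pattern[p] == '%' && pattern[p + 1] != '\0') { p++; continue; } /* skip escaped char */
        if (pattern[p] == '(') {
            if (bl >= 0 && bl < MAX_SUBEXP) nesting[bl] = (bc < MAX_SUBEXP) ? bc : -1;
            bl++; bc++;
        } else if (pattern[p] == ')') {
            bl--;
            if (n < max_closes)
                closes[n++] = (bl >= 0 && bl < MAX_SUBEXP) ? nesting[bl] : -1;
        }
    }
    return n;
}
```

<p class="commentary">Keeping one slot per nesting level is what lets an inner closing bracket find its own opening bracket without disturbing the record for the enclosing one.
</p>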
<p class="commentary firstcommentary"><a id="SP11_2"></a><b>&#167;11.2. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Extract the character class to match from the pattern</span><span class="named-paragraph-number">11.2</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">len</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">chcl</span><span class="plain-syntax"> = </span><a href="4-pm.html#SP12" class="function-link"><span class="function-syntax">Regexp::get_cclass</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">len</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">range_from</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">range_to</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">reverse</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax"> += </span><span class="identifier-syntax">len</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
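<p class="commentary firstcommentary"><a id="SP11_3"></a><b>&#167;11.3. </b>This is standard regular-expression notation: <span class="extract"><span class="extract-syntax">+</span></span> means one or more repetitions, <span class="extract"><span class="extract-syntax">*</span></span> means zero or more, and a following <span class="extract"><span class="extract-syntax">?</span></span> makes the match non-greedy. I haven't bothered
to implement numeric repetition counts, which we won't need: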
</p>
<p class="commentary"><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Extract repetition markers from the pattern</span><span class="named-paragraph-number">11.3</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">chcl</span><span class="plain-syntax"> == </span><span class="constant-syntax">WHITESPACE_CHARCLASS</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">; </span><span class="identifier-syntax">rep_to</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">)-</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">'+'</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">; </span><span class="identifier-syntax">rep_to</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">)-</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">'*'</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">rep_to</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">)-</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">'?'</span><span class="plain-syntax">) { </span><span class="identifier-syntax">greedy</span><span class="plain-syntax"> = </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">; </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax">++; }</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
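<p class="commentary">The marker handling above can be sketched in isolation as follows. This is a minimal stand-alone version under stated assumptions: the <span class="extract"><span class="extract-syntax">repetition</span></span> struct and <span class="extract"><span class="extract-syntax">parse_repetition</span></span> are hypothetical names, the pattern is a plain <span class="extract"><span class="extract-syntax">char</span></span> string, and the defaults (exactly one repetition, greedy) are assumed, since they are set outside the code shown here.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the repetition state kept by the matcher. */
typedef struct {
    int rep_from; /* minimum number of repetitions */
    int rep_to;   /* maximum number of repetitions */
    int greedy;   /* nonzero: prefer the longest match */
} repetition;

/* Reads an optional '+' or '*' and then an optional '?' at pattern[*ppos],
   advancing *ppos past whatever was consumed. `remaining` is the number of
   text characters left, used as the upper bound for open-ended repeats.
   `is_whitespace_class` mirrors the special case for a space in the pattern. */
repetition parse_repetition(const char *pattern, int *ppos, int remaining,
                            int is_whitespace_class) {
    assert(pattern != NULL && ppos != NULL && remaining >= 0);
    repetition r = { 1, 1, 1 }; /* assumed defaults: exactly once, greedy */
    if (is_whitespace_class) { r.rep_from = 1; r.rep_to = remaining; }
    if (pattern[*ppos] == '+') {
        r.rep_from = 1; r.rep_to = remaining; (*ppos)++;
    } else if (pattern[*ppos] == '*') {
        r.rep_from = 0; r.rep_to = remaining; (*ppos)++;
    }
    if (pattern[*ppos] == '?') { r.greedy = 0; (*ppos)++; }
    return r;
}
```

<p class="commentary">Note that in this scheme a <span class="extract"><span class="extract-syntax">?</span></span> only switches off greediness; it does not by itself make the preceding character optional.
</p>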
<p class="commentary firstcommentary"><a id="SP11_4"></a><b>&#167;11.4. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Count how many repetitions can be made here</span><span class="named-paragraph-number">11.4</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">reps</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; ((</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">+</span><span class="identifier-syntax">reps</span><span class="plain-syntax">)) &amp;&amp; (</span><span class="identifier-syntax">reps</span><span class="plain-syntax"> &lt; </span><span class="identifier-syntax">rep_to</span><span class="plain-syntax">)); </span><span class="identifier-syntax">reps</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-pm.html#SP13" class="function-link"><span class="function-syntax">Regexp::test_cclass</span></a><span class="plain-syntax">(</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">+</span><span class="identifier-syntax">reps</span><span class="plain-syntax">), </span><span class="identifier-syntax">chcl</span><span class="plain-syntax">,</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">, </span><span class="identifier-syntax">range_to</span><span class="plain-syntax">, </span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, </span><span class="identifier-syntax">reverse</span><span class="plain-syntax">) == </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
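<p class="commentary">The counting loop can be pictured with this small sketch, which measures how long a run of matching characters is, capped at <span class="extract"><span class="extract-syntax">rep_to</span></span>. The function <span class="extract"><span class="extract-syntax">count_repetitions</span></span> and the predicate <span class="extract"><span class="extract-syntax">is_digit_char</span></span> are hypothetical; a function pointer stands in here for the character-class test, and the text is a plain C string.
</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Example predicate: a stand-in for testing membership of a character class. */
static int is_digit_char(int c) { return isdigit(c) != 0; }

/* Counts consecutive characters of text, starting at tpos, which satisfy
   in_class, stopping at the end of the string or after rep_to characters.
   Assumes tpos is no greater than the length of text. */
int count_repetitions(const char *text, int tpos, int rep_to, int (*in_class)(int)) {
    assert(text != NULL && tpos >= 0 && in_class != NULL);
    int reps = 0;
    while (reps < rep_to && text[tpos + reps] != '\0' &&
           in_class((unsigned char) text[tpos + reps]))
        reps++;
    return reps;
}
```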
<p class="commentary firstcommentary"><a id="SP11_5"></a><b>&#167;11.5. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Try all possible match lengths until we find a match</span><span class="named-paragraph-number">11.5</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax">, </span><span class="identifier-syntax">to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">reps</span><span class="plain-syntax">, </span><span class="identifier-syntax">dj</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">, </span><span class="identifier-syntax">from_tpos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">greedy</span><span class="plain-syntax">) { </span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">reps</span><span class="plain-syntax">; </span><span class="identifier-syntax">to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">rep_from</span><span class="plain-syntax">; </span><span class="identifier-syntax">dj</span><span class="plain-syntax"> = -1; }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax"> = </span><span class="identifier-syntax">from</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> != </span><span class="identifier-syntax">to</span><span class="plain-syntax">+</span><span class="identifier-syntax">dj</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> += </span><span class="identifier-syntax">dj</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">tpos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">from_tpos</span><span class="plain-syntax"> + </span><span class="identifier-syntax">j</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">try</span><span class="plain-syntax"> = </span><a href="4-pm.html#SP11" class="function-link"><span class="function-syntax">Regexp::match_r</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">at</span><span class="plain-syntax">, </span><span class="identifier-syntax">allow_partial</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">try</span><span class="plain-syntax"> &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">try</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
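<p class="commentary">To make the search order concrete, this sketch lists the match lengths in the order the loop above would try them: longest first when greedy, shortest first otherwise. The name <span class="extract"><span class="extract-syntax">candidate_lengths</span></span> is hypothetical, and the early return when too few repetitions are available is an added guard of this sketch, not something shown in the code above.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Writes into out[] the repetition counts to try, in order, and returns
   how many were written. rep_from is the minimum allowed; reps is the
   number of repetitions actually available in the text. */
int candidate_lengths(int rep_from, int reps, int greedy, int *out, int out_size) {
    assert(out != NULL && out_size >= 0);
    if (reps < rep_from) return 0; /* guard added for this sketch */
    int from = rep_from, to = reps, dj = 1;
    if (greedy) { from = reps; to = rep_from; dj = -1; }
    int n = 0;
    for (int j = from; j != to + dj && n < out_size; j += dj)
        out[n++] = j;
    return n;
}
```

<p class="commentary">Each candidate length is then a starting point for matching the rest of the pattern recursively, and the first one that succeeds wins, which is what gives greedy and non-greedy repetition their different results.
</p>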
<p class="commentary firstcommentary"><a id="SP11_6"></a><b>&#167;11.6. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Copy the bracketed texts found into the global strings</span><span class="named-paragraph-number">11.6</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">mr</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><a href="4-sm.html#SP15" class="function-link"><span class="function-syntax">Str::clear</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax"> = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">brackets_start</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> &lt;= </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="identifier-syntax">brackets_end</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">]; </span><span class="identifier-syntax">j</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">], </span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">j</span><span class="plain-syntax">));</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">exp_at</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">] = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">brackets_start</span><span class="plain-syntax">[</span><span class="identifier-syntax">i</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax">-&gt;</span><span class="element-syntax">no_matched_texts</span><span class="plain-syntax"> = </span><span class="identifier-syntax">at</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP11">&#167;11</a>.</li></ul>
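<p class="commentary">The copying step treats each bracketed subexpression as an inclusive range of text positions. A minimal sketch of that idea, using a fixed <span class="extract"><span class="extract-syntax">char</span></span> buffer in place of the text streams used above (the name <span class="extract"><span class="extract-syntax">copy_bracketed</span></span> and the bounds checks are assumptions of this sketch):
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies text[start..end] (inclusive) into out, truncating if the buffer
   is too small, and always null-terminating. Returns the number of
   characters copied. A range with end < start yields the empty string. */
size_t copy_bracketed(const char *text, int start, int end, char *out, size_t out_size) {
    assert(text != NULL && out != NULL && out_size > 0);
    size_t len = strlen(text);
    size_t n = 0;
    for (int j = start; j <= end; j++) {
        if (j < 0 || (size_t) j >= len) break; /* stay inside the text */
        if (n + 1 >= out_size) break;          /* leave room for the terminator */
        out[n++] = text[j];
    }
    out[n] = '\0';
    return n;
}
```

<p class="commentary">Because the end position is inclusive, a subexpression which matched nothing is recorded with an end one less than its start, and so copies as an empty string.
</p>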
<p class="commentary firstcommentary"><a id="SP12"></a><b>&#167;12. </b>So then: most characters in the pattern are taken literally (if the pattern
says <span class="extract"><span class="extract-syntax">q</span></span>, the only match is with a lower-case letter "q"), except that:
</p>
<ul class="items"><li>(a) a space means "one or more characters of white space";
</li><li>(b) <span class="extract"><span class="extract-syntax">%d</span></span> means any decimal digit;
</li><li>(c) <span class="extract"><span class="extract-syntax">%c</span></span> means any character at all;
</li><li>(d) <span class="extract"><span class="extract-syntax">%C</span></span> means any character which isn't white space;
</li><li>(e) <span class="extract"><span class="extract-syntax">%i</span></span> means any character from the identifier class (see above);
</li><li>(f) <span class="extract"><span class="extract-syntax">%p</span></span> means any character which can be used in the name of a Preform
nonterminal, which is to say, an identifier character or a hyphen;
</li><li>(g) <span class="extract"><span class="extract-syntax">%P</span></span> means the same or else a colon;
</li><li>(h) <span class="extract"><span class="extract-syntax">%t</span></span> means a tab;
</li><li>(i) <span class="extract"><span class="extract-syntax">%q</span></span> means a double-quote.
</li></ul>
<p class="commentary"><span class="extract"><span class="extract-syntax">%</span></span> followed by any other character makes a literal escape of that character.
Square brackets enclose literal alternatives; note, as usual with grep
engines, that <span class="extract"><span class="extract-syntax">[]xyz]</span></span> is legal and makes a set of four possibilities, the
first of which is a literal close square bracket. Within a set, a hyphen makes a
character range, and an initial <span class="extract"><span class="extract-syntax">^</span></span> negates the result. Otherwise everything
is literal.
</p>
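<p class="commentary">As a rough illustration of some of the escapes listed above, the sketch below tests a single character against one escape code. The name <span class="extract"><span class="extract-syntax">escape_matches</span></span> is hypothetical, and using the C library's <span class="extract"><span class="extract-syntax">isdigit</span></span> and <span class="extract"><span class="extract-syntax">isspace</span></span> is an assumption: the matcher's own notions of digit and white space may differ. The identifier and Preform classes are left out, since their definitions lie outside this passage.
</p>

```c
#include <assert.h>
#include <ctype.h>

/* Returns 1 if character c satisfies the escape %<code>, 0 if it does not,
   and -1 for the classes this sketch does not cover (%i, %p, %P). */
int escape_matches(char code, int c) {
    assert(c >= 0 && c <= 255);
    switch (code) {
        case 'd': return isdigit(c) ? 1 : 0;  /* any decimal digit */
        case 'c': return 1;                   /* any character at all */
        case 'C': return isspace(c) ? 0 : 1;  /* anything but white space */
        case 't': return (c == '\t') ? 1 : 0; /* a tab */
        case 'q': return (c == '"') ? 1 : 0;  /* a double-quote */
        case 'i': case 'p': case 'P': return -1;
    }
    /* Any other code is a literal escape, e.g. %( matches only '('. */
    return (c == (unsigned char) code) ? 1 : 0;
}
```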
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">ANY_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">1</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">DIGIT_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">2</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">WHITESPACE_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">3</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">NONWHITESPACE_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">4</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">IDENTIFIER_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">5</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">PREFORM_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">6</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">PREFORMC_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">7</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">LITERAL_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">8</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">TAB_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">9</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">QUOTE_CHARCLASS</span><span class="plain-syntax"> </span><span class="constant-syntax">10</span>
</pre>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::get_cclass</span><button class="popup" onclick="togglePopup('usagePopup8')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup8">Usage of <span class="code-font"><span class="function-syntax">Regexp::get_cclass</span></span>:<br/><a href="4-pm.html#SP11_2">&#167;11.2</a></span></button><span class="plain-syntax">(</span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> *</span><span class="identifier-syntax">from</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> *</span><span class="identifier-syntax">to</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> *</span><span class="identifier-syntax">reverse</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">] == </span><span class="character-syntax">'^'</span><span class="plain-syntax">) { </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">++; *</span><span class="identifier-syntax">reverse</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> { *</span><span class="identifier-syntax">reverse</span><span class="plain-syntax"> = </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">]) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'%'</span><span class="plain-syntax">:</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax"> = </span><span class="constant-syntax">2</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">]) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'d'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">DIGIT_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'c'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">ANY_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'C'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">NONWHITESPACE_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'i'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">IDENTIFIER_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'p'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">PREFORM_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'P'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">PREFORMC_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'q'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">QUOTE_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'t'</span><span class="plain-syntax">: </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TAB_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">; *</span><span class="identifier-syntax">to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">; </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">LITERAL_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">'['</span><span class="plain-syntax">:</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">+1;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">ppos</span><span class="plain-syntax"> += </span><span class="constant-syntax">2</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">while</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">]) &amp;&amp; (</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">[</span><span class="identifier-syntax">ppos</span><span class="plain-syntax">] != </span><span class="character-syntax">']'</span><span class="plain-syntax">)) </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax"> - </span><span class="constant-syntax">1</span><span class="plain-syntax">; *</span><span class="identifier-syntax">len</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax"> - *</span><span class="identifier-syntax">from</span><span class="plain-syntax"> + </span><span class="constant-syntax">2</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">LITERAL_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="character-syntax">' '</span><span class="plain-syntax">:</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">; </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">WHITESPACE_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> *</span><span class="identifier-syntax">len</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">; *</span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">; *</span><span class="identifier-syntax">to</span><span class="plain-syntax"> = </span><span class="identifier-syntax">ppos</span><span class="plain-syntax">; </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">LITERAL_CHARCLASS</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP13"></a><b>&#167;13. </b></p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::test_cclass</span><button class="popup" onclick="togglePopup('usagePopup9')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup9">Usage of <span class="code-font"><span class="function-syntax">Regexp::test_cclass</span></span>:<br/><a href="4-pm.html#SP11_4">&#167;11.4</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">chcl</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">range_to</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">drawn_from</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">reverse</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><span class="identifier-syntax">chcl</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">ANY_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax">) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">DIGIT_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">isdigit</span><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">)) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">WHITESPACE_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-chr.html#SP2" class="function-link"><span class="function-syntax">Characters::is_whitespace</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">)) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TAB_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'\t'</span><span class="plain-syntax">) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">NONWHITESPACE_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (!(</span><a href="4-chr.html#SP2" class="function-link"><span class="function-syntax">Characters::is_whitespace</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">))) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">QUOTE_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> != </span><span class="character-syntax">'\"'</span><span class="plain-syntax">) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">IDENTIFIER_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-pm.html#SP2" class="function-link"><span class="function-syntax">Regexp::identifier_char</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">)) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">PREFORM_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'-'</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'_'</span><span class="plain-syntax">) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'a'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'z'</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'0'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'9'</span><span class="plain-syntax">))) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">PREFORMC_CHARCLASS:</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'-'</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'_'</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">':'</span><span class="plain-syntax">) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'a'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'z'</span><span class="plain-syntax">)) ||</span>
<span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="character-syntax">'0'</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="character-syntax">'9'</span><span class="plain-syntax">))) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">LITERAL_CHARCLASS:</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">range_to</span><span class="plain-syntax"> &gt; </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">drawn_from</span><span class="plain-syntax">[</span><span class="identifier-syntax">range_from</span><span class="plain-syntax">] == </span><span class="character-syntax">'^'</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">++; </span><span class="identifier-syntax">reverse</span><span class="plain-syntax"> = </span><span class="identifier-syntax">reverse</span><span class="plain-syntax">?</span><span class="identifier-syntax">FALSE:TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax"> = </span><span class="identifier-syntax">range_from</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> &lt;= </span><span class="identifier-syntax">range_to</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c1</span><span class="plain-syntax"> = </span><span class="identifier-syntax">drawn_from</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">], </span><span class="identifier-syntax">c2</span><span class="plain-syntax"> = </span><span class="identifier-syntax">c1</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">j</span><span class="plain-syntax">+1 &lt; </span><span class="identifier-syntax">range_to</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">drawn_from</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">+1] == </span><span class="character-syntax">'-'</span><span class="plain-syntax">)) { </span><span class="identifier-syntax">c2</span><span class="plain-syntax"> = </span><span class="identifier-syntax">drawn_from</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">+2]; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> += </span><span class="constant-syntax">2</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &gt;= </span><span class="identifier-syntax">c1</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> &lt;= </span><span class="identifier-syntax">c2</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">reverse</span><span class="plain-syntax">) </span><span class="identifier-syntax">match</span><span class="plain-syntax"> = (</span><span class="identifier-syntax">match</span><span class="plain-syntax">)?</span><span class="identifier-syntax">FALSE:TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">match</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
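<p class="commentary">The bracketed-set logic in the literal case above can be tried in isolation. What follows is only a minimal sketch: <code>in_literal_class</code> is a hypothetical helper, not a function of the Foundation module, and it assumes the set is passed without its surrounding square brackets, with <code>from</code> and <code>to</code> as inclusive indexes into it.
</p>

```c
#include <stddef.h>

/* Hypothetical helper (not part of the Foundation module): tests whether the
   character c belongs to the set written at positions from..to (inclusive)
   of the wide string set, for example L"a-z0-9" or L"^aeiou". */
int in_literal_class(int c, const wchar_t *set, int from, int to) {
    int negate = 0, match = 0;
    /* A leading caret inverts the sense of the whole set */
    if ((to > from) && (set[from] == L'^')) { from++; negate = 1; }
    for (int j = from; j <= to; j++) {
        int c1 = set[j], c2 = c1;
        /* "x-y" denotes the inclusive range x..y; skip past both characters */
        if ((j+1 < to) && (set[j+1] == L'-')) { c2 = set[j+2]; j += 2; }
        if ((c >= c1) && (c <= c2)) { match = 1; break; }
    }
    return negate ? !match : match;
}
```

<p class="commentary">Under these assumptions, testing <code>'e'</code> against <code>L"aeiou"</code> with indexes 0 to 4 succeeds, while testing <code>'q'</code> against <code>L"^a-z"</code> with indexes 0 to 3 fails, because the caret reverses the result of the range test.
</p>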
<p class="commentary firstcommentary"><a id="SP14"></a><b>&#167;14. Replacement. </b>This routine handles searching and replacing. This time we
can match at substrings of the <span class="extract"><span class="extract-syntax">text</span></span> (i.e., we are not forced to match
from the start right to the end), and multiple replacements can be made.
For example,
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="function-syntax">Regexp::replace</span><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">L</span><span class="string-syntax">"[aeiou]"</span><span class="plain-syntax">, </span><span class="identifier-syntax">L</span><span class="string-syntax">"!"</span><span class="plain-syntax">, </span><span class="constant-syntax">REP_REPEATING</span><span class="plain-syntax">);</span>
</pre>
<p class="commentary">will turn the <span class="extract"><span class="extract-syntax">text</span></span> "goose eggs" into "g!!s! !ggs".
</p>
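<p class="commentary">The scanning strategy can be illustrated without the rest of the module. This is a simplified sketch, not the routine itself: <code>match_at</code> and <code>replace_sketch</code> are hypothetical names, plain <code>char</code> strings stand in for <code>text_stream</code>, and a single-character set stands in for a full regular expression. The <code>repeating</code> argument plays the role of <code>REP_REPEATING</code>.
</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a regular expression matcher: returns the length
   of a match at text[i] (always 1 here), or -1 if text[i] is not in set. */
static int match_at(const char *text, size_t i, const char *set) {
    return (text[i] && strchr(set, text[i])) ? 1 : -1;
}

/* Copies text into out (capacity cap, including the terminator), writing
   replacement in place of each match. If repeating is zero, only the first
   match is replaced and the rest of the text is copied unchanged.
   Returns the number of replacements made, or -1 if out is too small. */
int replace_sketch(const char *text, const char *set, const char *replacement,
    int repeating, char *out, size_t cap) {
    size_t n = 0, rl = strlen(replacement);
    int changes = 0;
    if (cap == 0) return -1;
    for (size_t i = 0; text[i]; ) {
        /* Once a change is made in non-repeating mode, stop looking for more */
        int len = ((changes == 0) || repeating) ? match_at(text, i, set) : -1;
        if (len >= 0) {
            if (n + rl >= cap) return -1;
            memcpy(out + n, replacement, rl); n += rl;
            i += (size_t) len;   /* resume scanning after the matched text */
            changes++;
        } else {
            if (n + 1 >= cap) return -1;
            out[n++] = text[i++];   /* no match here: copy one character */
        }
    }
    out[n] = 0;
    return changes;
}
```

<p class="commentary">With <code>repeating</code> set, calling this on "goose eggs" with the set "aeiou" and the replacement "!" reproduces the result above and reports four changes; with <code>repeating</code> clear, only the first vowel is replaced.
</p>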
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">REP_REPEATING</span><span class="plain-syntax"> </span><span class="constant-syntax">1</span>
<span class="definition-keyword">define</span> <span class="constant-syntax">REP_ATSTART</span><span class="plain-syntax"> </span><span class="constant-syntax">2</span>
</pre>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Regexp::replace</span><span class="plain-syntax">(</span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">replacement</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">options</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">TEMPORARY_TEXT</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_results</span><span class="plain-syntax"> </span><span class="identifier-syntax">mr</span><span class="plain-syntax"> = </span><a href="4-pm.html#SP9" class="function-link"><span class="function-syntax">Regexp::create_mr</span></a><span class="plain-syntax">();</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">changes</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax">=0, </span><span class="identifier-syntax">L</span><span class="plain-syntax">=</span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="identifier-syntax">L</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_position</span><span class="plain-syntax"> </span><span class="identifier-syntax">mp</span><span class="plain-syntax">; </span><span class="identifier-syntax">mp</span><span class="plain-syntax">.</span><span class="element-syntax">tpos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">i</span><span class="plain-syntax">; </span><span class="identifier-syntax">mp</span><span class="plain-syntax">.</span><span class="element-syntax">ppos</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">mp</span><span class="plain-syntax">.</span><span class="element-syntax">bc</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">mp</span><span class="plain-syntax">.</span><span class="element-syntax">bl</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><a href="4-pm.html#SP10" class="function-link"><span class="function-syntax">Regexp::prepare</span></a><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">try</span><span class="plain-syntax"> = </span><a href="4-pm.html#SP11" class="function-link"><span class="function-syntax">Regexp::match_r</span></a><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">, </span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">pattern</span><span class="plain-syntax">, &amp;</span><span class="identifier-syntax">mp</span><span class="plain-syntax">, </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">try</span><span class="plain-syntax"> &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">replacement</span><span class="plain-syntax">)</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax">=0; </span><span class="identifier-syntax">replacement</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">]; </span><span class="identifier-syntax">j</span><span class="plain-syntax">++) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax"> = </span><span class="identifier-syntax">replacement</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">];</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'%'</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">ind</span><span class="plain-syntax"> = </span><span class="identifier-syntax">replacement</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">] - </span><span class="character-syntax">'0'</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">ind</span><span class="plain-syntax"> &gt;= </span><span class="constant-syntax">0</span><span class="plain-syntax">) &amp;&amp; (</span><span class="identifier-syntax">ind</span><span class="plain-syntax"> &lt; </span><span class="constant-syntax">MAX_BRACKETED_SUBEXPRESSIONS</span><span class="plain-syntax">))</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">WRITE_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">, </span><span class="string-syntax">"%S"</span><span class="plain-syntax">, </span><span class="identifier-syntax">mr</span><span class="plain-syntax">.</span><span class="element-syntax">exp</span><span class="plain-syntax">[</span><span class="identifier-syntax">ind</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">, </span><span class="identifier-syntax">replacement</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">, </span><span class="identifier-syntax">replacement</span><span class="plain-syntax">[</span><span class="identifier-syntax">j</span><span class="plain-syntax">]);</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">left</span><span class="plain-syntax"> = </span><span class="identifier-syntax">L</span><span class="plain-syntax"> - </span><span class="identifier-syntax">try</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">changes</span><span class="plain-syntax">++;</span>
<span class="plain-syntax"> </span><a href="4-pm.html#SP9" class="function-link"><span class="function-syntax">Regexp::dispose_of</span></a><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">L</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">); </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="identifier-syntax">L</span><span class="plain-syntax">-</span><span class="identifier-syntax">left</span><span class="plain-syntax">-1;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">options</span><span class="plain-syntax"> &amp; </span><span class="constant-syntax">REP_REPEATING</span><span class="plain-syntax">) == </span><span class="constant-syntax">0</span><span class="plain-syntax">) { </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP14_1" class="named-paragraph-link"><span class="named-paragraph">Add the rest</span><span class="named-paragraph-number">14.1</span></a></span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">continue</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">, </span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">));</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">options</span><span class="plain-syntax"> &amp; </span><span class="constant-syntax">REP_ATSTART</span><span class="plain-syntax">) { </span><span class="named-paragraph-container code-font"><a href="4-pm.html#SP14_1" class="named-paragraph-link"><span class="named-paragraph">Add the rest</span><span class="named-paragraph-number">14.1</span></a></span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">; }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><a href="4-pm.html#SP9" class="function-link"><span class="function-syntax">Regexp::dispose_of</span></a><span class="plain-syntax">(&amp;</span><span class="identifier-syntax">mr</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">changes</span><span class="plain-syntax"> &gt; </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><a href="4-sm.html#SP17" class="function-link"><span class="function-syntax">Str::copy</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">altered</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">DISCARD_TEXT</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">changes</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP14_1"></a><b>&#167;14.1. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Add the rest</span><span class="named-paragraph-number">14.1</span></span><span class="comment-syntax"> =</span>
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">i</span><span class="plain-syntax">++; </span><span class="identifier-syntax">i</span><span class="plain-syntax">&lt;</span><span class="identifier-syntax">L</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax">++)</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">PUT_TO</span><span class="plain-syntax">(</span><span class="identifier-syntax">altered</span><span class="plain-syntax">, </span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">text</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">));</span>
</pre>
<ul class="endnotetexts"><li>This code is used in <a href="4-pm.html#SP14">&#167;14</a> (twice).</li></ul>
<nav role="progress"><div class="progresscontainer">
<ul class="progressbar"><li class="progressprev"><a href="4-taa.html">&#10094;</a></li><li class="progresschapter"><a href="P-abgtf.html">P</a></li><li class="progresschapter"><a href="1-fm.html">1</a></li><li class="progresschapter"><a href="2-dl.html">2</a></li><li class="progresschapter"><a href="3-em.html">3</a></li><li class="progresscurrentchapter">4</li><li class="progresssection"><a href="4-chr.html">chr</a></li><li class="progresssection"><a href="4-cst.html">cst</a></li><li class="progresssection"><a href="4-ws.html">ws</a></li><li class="progresssection"><a href="4-sm.html">sm</a></li><li class="progresssection"><a href="4-tf.html">tf</a></li><li class="progresssection"><a href="4-taa.html">taa</a></li><li class="progresscurrent">pm</li><li class="progresschapter"><a href="5-htm.html">5</a></li><li class="progresschapter"><a href="6-bf.html">6</a></li><li class="progresschapter"><a href="7-vn.html">7</a></li><li class="progresschapter"><a href="8-ws.html">8</a></li><li class="progressnext"><a href="5-htm.html">&#10095;</a></li></ul></div>
</nav><!--End of weave-->
</main>
</body>
</html>