<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<head>
|
|
<title>Tries and Avinues</title>
|
|
<link href="../docs-assets/Breadcrumbs.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<meta name="viewport" content="width=device-width, initial-scale=1">
|
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
|
<meta http-equiv="Content-Language" content="en-gb">
|
|
|
|
<link href="../docs-assets/Contents.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Progress.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Navigation.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Fonts.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Base.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<script>
|
|
function togglePopup(material_id) {
|
|
var popup = document.getElementById(material_id);
|
|
popup.classList.toggle("show");
|
|
}
|
|
</script>
|
|
|
|
<link href="../docs-assets/Popups.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
<link href="../docs-assets/Colours.css" rel="stylesheet" rev="stylesheet" type="text/css">
|
|
|
|
</head>
|
|
<body class="commentary-font">
|
|
<nav role="navigation">
|
|
<h1><a href="../index.html">
|
|
<img src="../docs-assets/Octagram.png" width=72 height=72>
|
|
</a></h1>
|
|
<ul><li><a href="../inweb/index.html">inweb</a></li>
|
|
</ul><h2>Foundation Module</h2><ul>
|
|
<li><a href="index.html"><span class="selectedlink">foundation</span></a></li>
|
|
<li><a href="../foundation-test/index.html">foundation-test</a></li>
|
|
</ul><h2>Example Webs</h2><ul>
|
|
<li><a href="../goldbach/index.html">goldbach</a></li>
|
|
<li><a href="../twinprimes/twinprimes.html">twinprimes</a></li>
|
|
<li><a href="../eastertide/index.html">eastertide</a></li>
|
|
</ul><h2>Repository</h2><ul>
|
|
<li><a href="https://github.com/ganelson/inweb"><img src="../docs-assets/github.png" height=18> github</a></li>
|
|
</ul><h2>Related Projects</h2><ul>
|
|
<li><a href="../../../inform/docs/index.html">inform</a></li>
|
|
<li><a href="../../../intest/docs/index.html">intest</a></li>
|
|
|
|
</ul>
|
|
</nav>
|
|
<main role="main">
|
|
<!--Weave of 'Tries and Avinues' generated by Inweb-->
|
|
<div class="breadcrumbs">
|
|
<ul class="crumbs"><li><a href="../index.html">Home</a></li><li><a href="index.html">foundation</a></li><li><a href="index.html#4">Chapter 4: Text Handling</a></li><li><b>Tries and Avinues</b></li></ul></div>
|
|
<p class="purpose">To examine heads and tails of text, to see how it may inflect.</p>
|
|
|
|
<ul class="toc"><li><a href="4-taa.html#SP1">§1. Tries</a></li><li><a href="4-taa.html#SP5">§5. Avinues</a></li><li><a href="4-taa.html#SP9">§9. Logging</a></li></ul><hr class="tocbar">
|
|
|
|
<p class="commentary firstcommentary"><a id="SP1" class="paragraph-anchor"></a><b>§1. Tries. </b>The standard data structure for searches through possible prefixes or
|
|
suffixes is a "trie". The term goes back to Edward Fredkin in 1961;
|
|
some pronounce it "try" and some "tree", and either would be a fair
|
|
description. Like hash tables, tries are a means of minimising string
|
|
comparisons when sorting through possible outcomes based on a text.
|
|
</p>
|
|
|
|
<p class="commentary">The trie is a tree with three kinds of node:
|
|
</p>
|
|
|
|
<ul class="items"><li>(a) "Heads". Every trie has exactly one such node, and it's always the root.
|
|
There are two versions of this: a start head represents matching from the
|
|
front of a text, whereas an end head represents matching from the back.
|
|
</li><li>(b) "Choices". A choice node has a given match character, say an "f", and
|
|
represents which node to go to next if this is the current character in the
|
|
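text. It must be a valid Unicode character; or <span class="extract"><span class="extract-syntax">TRIE_ANYTHING</span></span>, which
is a wildcard representing "any text of any length here"; or <span class="extract"><span class="extract-syntax">TRIE_ANY_GROUP</span></span> or
<span class="extract"><span class="extract-syntax">TRIE_NOT_GROUP</span></span>, which match any character in, or not in, a given group
of characters. Since a choice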
|
|
must always lead somewhere, <span class="extract"><span class="extract-syntax">on_success</span></span> must point to another node.
|
|
There can be any number of choices at a given position, so choice nodes
|
|
are always organised in linked lists joined by <span class="extract"><span class="extract-syntax">next</span></span>.
|
|
</li><li>(c) "Terminals", always leaves, which have match character set to the
|
|
impossible value <span class="extract"><span class="extract-syntax">TRIE_STOP</span></span>, and for which <span class="extract"><span class="extract-syntax">match_outcome</span></span> is non-null; thus,
|
|
different terminal nodes can result in different outcomes if they are ever
|
|
reached at the end of a successful scan. A terminal node is always the only item
|
|
in a list.
|
|
</li></ul>
|
|
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_START</span><span class="plain-syntax"> -1 </span><span class="comment-syntax"> head: the root of a trie parsing forwards from the start</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_END</span><span class="plain-syntax"> -2 </span><span class="comment-syntax"> head: the root of a trie parsing backwards from the end</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax"> </span><span class="constant-syntax">10003</span><span class="plain-syntax"> </span><span class="comment-syntax"> choice: match any text here</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_ANY_GROUP</span><span class="plain-syntax"> </span><span class="constant-syntax">10001</span><span class="plain-syntax"> </span><span class="comment-syntax"> choice: match any character from this group</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_NOT_GROUP</span><span class="plain-syntax"> </span><span class="constant-syntax">10002</span><span class="plain-syntax"> </span><span class="comment-syntax"> choice: match any character not in this group</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">TRIE_STOP</span><span class="plain-syntax"> -3 </span><span class="comment-syntax"> terminal: here's the outcome</span>
|
|
<span class="definition-keyword">define</span> <span class="constant-syntax">MAX_TRIE_GROUP_SIZE</span><span class="plain-syntax"> </span><span class="constant-syntax">26</span><span class="plain-syntax"> </span><span class="comment-syntax"> size of the allowable groups of characters</span>
|
|
</pre>
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">match_character</span><span class="plain-syntax">; </span><span class="comment-syntax"> or one of the special cases above</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">group_characters</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_TRIE_GROUP_SIZE</span><span class="plain-syntax">+1];</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">match_outcome</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">on_success</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">next</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">} </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>The structure match_trie is accessed in 2/mmr, 2/trs and here.</li></ul>
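<p class="commentary">As a rough illustration of the three kinds of node, the sketch below repeats
the structure and classifies a node by its match character alone. The enumeration
<span class="extract"><span class="extract-syntax">node_kind</span></span> and the function <span class="extract"><span class="extract-syntax">classify_node</span></span> are hypothetical names
invented for this example, and are not part of Foundation.
</p>

```c
#include <stddef.h>
#include <wchar.h>

#define TRIE_START -1
#define TRIE_END -2
#define TRIE_STOP -3
#define MAX_TRIE_GROUP_SIZE 26

typedef struct match_trie {
    int match_character; /* or one of the special cases above */
    wchar_t group_characters[MAX_TRIE_GROUP_SIZE+1];
    wchar_t *match_outcome;
    struct match_trie *on_success;
    struct match_trie *next;
} match_trie;

/* Hypothetical: the three kinds of node described in the commentary */
enum node_kind { HEAD_NODE, CHOICE_NODE, TERMINAL_NODE };

/* Hypothetical helper: decide the kind of a node from its match character */
enum node_kind classify_node(const match_trie *n) {
    if ((n->match_character == TRIE_START) || (n->match_character == TRIE_END))
        return HEAD_NODE;
    if (n->match_character == TRIE_STOP) return TERMINAL_NODE;
    return CHOICE_NODE; /* a character, or one of the wildcard values */
}
```

<p class="commentary">For instance, a start head whose <span class="extract"><span class="extract-syntax">on_success</span></span> is a single choice node
for "f", which in turn leads to a terminal, would classify as head, choice and
terminal respectively.
</p>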
|
|
<p class="commentary firstcommentary"><a id="SP2" class="paragraph-anchor"></a><b>§2. </b>We have just one routine for extending and scanning the trie: it either
|
|
tries to find whether a text <span class="extract"><span class="extract-syntax">p</span></span> leads to any outcome in the existing trie,
|
|
or else forcibly extends the existing trie to ensure that it does.
|
|
</p>
|
|
|
|
<p class="commentary">It might look as if calling <span class="extract"><span class="extract-syntax">Tries::search</span></span> always returns <span class="extract"><span class="extract-syntax">add_outcome</span></span> when
|
|
this is set, but this isn't true: if the trie already contains a node
|
|
representing how to deal with <span class="extract"><span class="extract-syntax">p</span></span>, we get whatever outcome is already
|
|
established.
|
|
</p>
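<p class="commentary">The point that an existing outcome wins over <span class="extract"><span class="extract-syntax">add_outcome</span></span> can be
seen in a much-simplified sketch. This is not the routine given here: it scans
forwards only, uses plain <span class="extract"><span class="extract-syntax">char</span></span> strings, has no wildcards or groups, and
lets a terminal share a list with choices. All of the <span class="extract"><span class="extract-syntax">sk_</span></span> names are
hypothetical.
</p>

```c
#include <stdlib.h>
#include <string.h>

#define SK_START -1 /* head node */
#define SK_STOP  -3 /* terminal node */

typedef struct sk_node {
    int ch;                     /* a character, or SK_START, or SK_STOP */
    const char *outcome;        /* non-NULL only for terminals */
    struct sk_node *on_success; /* first of the possible exits from here */
    struct sk_node *next;       /* next alternative in the same list */
} sk_node;

/* Nodes are never freed in this sketch */
static sk_node *sk_new(int ch, const char *outcome) {
    sk_node *n = calloc(1, sizeof(sk_node));
    if (n == NULL) abort();
    n->ch = ch; n->outcome = outcome;
    return n;
}

sk_node *sk_new_trie(void) { return sk_new(SK_START, NULL); }

/* Search for p; if add_outcome is non-NULL, extend the trie where it runs out */
const char *sk_search(sk_node *T, const char *p, const char *add_outcome) {
    sk_node *prev = T;
    size_t n = strlen(p);
    for (size_t i = 0; i <= n; i++) { /* i == n is the 0 terminator */
        int c = (unsigned char) p[i];
        int want = (c == 0) ? SK_STOP : c;
        sk_node *pt;
        for (pt = prev->on_success; pt; pt = pt->next)
            if (pt->ch == want) break;
        if (pt == NULL) {
            if (add_outcome == NULL) return NULL; /* failure! */
            pt = sk_new(want, (want == SK_STOP) ? add_outcome : NULL);
            pt->next = prev->on_success;
            prev->on_success = pt;
        }
        prev = pt;
    }
    return prev->outcome; /* existing outcome if there was one, else the new one */
}
```

<p class="commentary">Under these assumptions, searching an empty trie for "cat" with a null
<span class="extract"><span class="extract-syntax">add_outcome</span></span> fails; adding it with outcome "first" returns "first"; and a
later attempt to add "cat" with outcome "second" still returns "first".
</p>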
|
|
|
|
<p class="commentary">There are two motions to keep track of: our progress through the text <span class="extract"><span class="extract-syntax">p</span></span>
|
|
being scanned, and our progress through the trie which tells us how to scan it.
|
|
</p>
|
|
|
|
<p class="commentary">We scan the text either forwards or backwards, starting with the first or
|
|
last character and then working through, finishing with a 0 terminator.
|
|
(This is true even if working backwards: we pretend the character stored
|
|
before the text began is 0.) <span class="extract"><span class="extract-syntax">i</span></span> represents the index of our current position
|
|
in <span class="extract"><span class="extract-syntax">p</span></span>, and runs either from 0 up to <span class="extract"><span class="extract-syntax">N</span></span> or from <span class="extract"><span class="extract-syntax">N-1</span></span> down to <span class="extract"><span class="extract-syntax">-1</span></span>,
|
|
where <span class="extract"><span class="extract-syntax">N</span></span> is the number of characters in <span class="extract"><span class="extract-syntax">p</span></span>.
|
|
</p>
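<p class="commentary">The two directions of scan can be sketched in isolation as follows, using
the same <span class="extract"><span class="extract-syntax">start</span></span>, <span class="extract"><span class="extract-syntax">endpoint</span></span> and <span class="extract"><span class="extract-syntax">delta</span></span> values as the routine below.
The function <span class="extract"><span class="extract-syntax">scan_order</span></span> is a hypothetical helper written for this example,
and works on a plain C string rather than a <span class="extract"><span class="extract-syntax">text_stream</span></span>.
</p>

```c
#include <string.h>

/* Hypothetical helper: record in out[] the characters visited, in order,
   and return how many there were. out must have room for strlen(p)+1 chars. */
int scan_order(const char *p, int from_end, char *out) {
    int len = (int) strlen(p);
    int start = 0, endpoint = len, delta = 1;                       /* forwards */
    if (from_end) { start = len - 1; endpoint = -1; delta = -1; }   /* backwards */
    int count = 0;
    for (int i = start; i != endpoint + delta; i += delta) {
        char c = (i < 0) ? 0 : p[i]; /* i.e., zero at the two ends of the text */
        out[count++] = c;
    }
    return count;
}
```

<p class="commentary">In either direction the text "abc" is visited as four characters: the three
letters, in the appropriate order, and then the 0.
</p>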
|
|
|
|
<p class="commentary">We scan the trie using a pair of pointers. <span class="extract"><span class="extract-syntax">prev</span></span> is the last node we
|
|
successfully left, and <span class="extract"><span class="extract-syntax">pos</span></span> is the one we are currently at, which can be
|
|
either a terminal node or a choice node (in which case it's the head of
|
|
a linked list of such nodes).
|
|
</p>
|
|
|
|
<pre class="definitions code-font"><span class="definition-keyword">define</span> <span class="constant-syntax">MAX_TRIE_REWIND</span><span class="plain-syntax"> </span><span class="constant-syntax">10</span><span class="plain-syntax"> </span><span class="comment-syntax"> that should be far, far more rewinding than necessary</span>
|
|
</pre>
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="function-syntax">Tries::search</span><button class="popup" onclick="togglePopup('usagePopup1')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup1">Usage of <span class="code-font"><span class="function-syntax">Tries::search</span></span>:<br/><a href="4-taa.html#SP6">§6</a>, <a href="4-taa.html#SP8">§8</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">T</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">p</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">T</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="identifier-syntax">internal_error</span><span class="plain-syntax">(</span><span class="string-syntax">"no trie to search"</span><span class="plain-syntax">);</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">start</span><span class="plain-syntax">, </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax">, </span><span class="identifier-syntax">delta</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_1" class="named-paragraph-link"><span class="named-paragraph">Look at the root node of the trie, setting up the scan accordingly</span><span class="named-paragraph-number">2.1</span></a></span><span class="plain-syntax">;</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">prev</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">, *</span><span class="identifier-syntax">pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">T</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_4" class="named-paragraph-link"><span class="named-paragraph">Accept the current node of the trie</span><span class="named-paragraph-number">2.4</span></a></span><span class="plain-syntax">;</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">rewind_points</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_TRIE_REWIND</span><span class="plain-syntax">];</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">rewind_positions</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_TRIE_REWIND</span><span class="plain-syntax">];</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">rewind_prev_positions</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_TRIE_REWIND</span><span class="plain-syntax">];</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="identifier-syntax">start</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax"> != </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax">+</span><span class="identifier-syntax">delta</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax"> += </span><span class="identifier-syntax">delta</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">group</span><span class="plain-syntax">[</span><span class="constant-syntax">MAX_TRIE_GROUP_SIZE</span><span class="plain-syntax">+1];</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">g</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="comment-syntax"> size of group</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax"> = (</span><span class="identifier-syntax">i</span><span class="plain-syntax"><0)?0:(</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">p</span><span class="plain-syntax">, </span><span class="identifier-syntax">i</span><span class="plain-syntax">)); </span><span class="comment-syntax"> i.e., zero at the two ends of the text</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">c</span><span class="plain-syntax"> >= </span><span class="constant-syntax">0x20</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> <= </span><span class="constant-syntax">0x7f</span><span class="plain-syntax">)) </span><span class="identifier-syntax">c</span><span class="plain-syntax"> = </span><a href="4-chr.html#SP1" class="function-link"><span class="function-syntax">Characters::tolower</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">); </span><span class="comment-syntax"> normalise it within ASCII</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="constant-syntax">0x20</span><span class="plain-syntax">) { </span><span class="identifier-syntax">c</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax"> - </span><span class="identifier-syntax">delta</span><span class="plain-syntax">; } </span><span class="comment-syntax"> force any space to be equivalent to the final 0</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">pairc</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'<'</span><span class="plain-syntax">) </span><span class="identifier-syntax">pairc</span><span class="plain-syntax"> = </span><span class="character-syntax">'>'</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'>'</span><span class="plain-syntax">) </span><span class="identifier-syntax">pairc</span><span class="plain-syntax"> = </span><span class="character-syntax">'<'</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pairc</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">j</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">j</span><span class="plain-syntax"> = </span><span class="identifier-syntax">i</span><span class="plain-syntax">+</span><span class="identifier-syntax">delta</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> != </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax">; </span><span class="identifier-syntax">j</span><span class="plain-syntax"> += </span><span class="identifier-syntax">delta</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> </span><span class="identifier-syntax">ch</span><span class="plain-syntax"> = (</span><span class="identifier-syntax">j</span><span class="plain-syntax"><0)?0:(</span><a href="4-sm.html#SP13" class="function-link"><span class="function-syntax">Str::get_at</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">p</span><span class="plain-syntax">, </span><span class="identifier-syntax">j</span><span class="plain-syntax">));</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ch</span><span class="plain-syntax"> == </span><span class="identifier-syntax">pairc</span><span class="plain-syntax">) </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">g</span><span class="plain-syntax"> >= </span><span class="constant-syntax">MAX_TRIE_GROUP_SIZE</span><span class="plain-syntax">) { </span><span class="identifier-syntax">g</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="reserved-syntax">break</span><span class="plain-syntax">; }</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">group</span><span class="plain-syntax">[</span><span class="identifier-syntax">g</span><span class="plain-syntax">++] = </span><span class="identifier-syntax">ch</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">group</span><span class="plain-syntax">[</span><span class="identifier-syntax">g</span><span class="plain-syntax">] = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">g</span><span class="plain-syntax"> > </span><span class="constant-syntax">0</span><span class="plain-syntax">) </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="identifier-syntax">j</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'*'</span><span class="plain-syntax">) </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax"> -= </span><span class="identifier-syntax">delta</span><span class="plain-syntax">;</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">RewindHere:</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_2" class="named-paragraph-link"><span class="named-paragraph">Look through the possible exits from this position and move on if any match</span><span class="named-paragraph-number">2.2</span></a></span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax"> > </span><span class="constant-syntax">0</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">i</span><span class="plain-syntax"> = </span><span class="identifier-syntax">rewind_points</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">-1];</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">rewind_positions</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">-1];</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax"> = </span><span class="identifier-syntax">rewind_prev_positions</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">-1];</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">--;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">goto</span><span class="plain-syntax"> </span><span class="identifier-syntax">RewindHere</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">; </span><span class="comment-syntax"> failure!</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_3" class="named-paragraph-link"><span class="named-paragraph">We have run out of trie and must create a new exit to continue</span><span class="named-paragraph-number">2.3</span></a></span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">pos</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax">)) </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_4" class="named-paragraph-link"><span class="named-paragraph">Accept the current node of the trie</span><span class="named-paragraph-number">2.4</span></a></span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">pos</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax">)) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax">; </span><span class="comment-syntax"> success!</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">; </span><span class="comment-syntax"> failure!</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">)</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_5" class="named-paragraph-link"><span class="named-paragraph">We failed by running out of trie, so we must add a terminal node to make this string acceptable</span><span class="named-paragraph-number">2.5</span></a></span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_6" class="named-paragraph-link"><span class="named-paragraph">We failed by finishing at a non-terminal node, so we must add an outcome</span><span class="named-paragraph-number">2.6</span></a></span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
</pre>
|
|
<p class="commentary firstcommentary"><a id="SP2_1" class="paragraph-anchor"></a><b>§2.1. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Look at the root node of the trie, setting up the scan accordingly</span><span class="named-paragraph-number">2.1</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">start</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">p</span><span class="plain-syntax">); </span><span class="identifier-syntax">delta</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="identifier-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_END</span><span class="plain-syntax">) { </span><span class="identifier-syntax">start</span><span class="plain-syntax"> = </span><a href="4-sm.html#SP8" class="function-link"><span class="function-syntax">Str::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">p</span><span class="plain-syntax">)-1; </span><span class="identifier-syntax">endpoint</span><span class="plain-syntax"> = -1; </span><span class="identifier-syntax">delta</span><span class="plain-syntax"> = -1; }</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a>.</li></ul>
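<p class="commentary">As a minimal sketch of how the three values <span class="extract"><span class="extract-syntax">start</span></span>, <span class="extract"><span class="extract-syntax">endpoint</span></span> and <span class="extract"><span class="extract-syntax">delta</span></span> drive a scan in either direction, the following standalone function records the order in which characters would be visited. The function name <span class="extract"><span class="extract-syntax">scan_order</span></span> and its <span class="extract"><span class="extract-syntax">backwards</span></span> flag are hypothetical, standing in for the test on the root node's <span class="extract"><span class="extract-syntax">match_character</span></span>, and plain wide strings are assumed in place of the module's own text streams.
</p>

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of the Foundation module: copies the
   characters of p into out in the order the scan would visit them.
   A non-zero backwards flag mimics a trie whose root is TRIE_END.
   The caller must supply room for wcslen(p)+1 wide characters. */
size_t scan_order(const wchar_t *p, int backwards, wchar_t *out) {
    int len = (int) wcslen(p);
    int start = 0, endpoint = len, delta = 1;
    if (backwards) { start = len - 1; endpoint = -1; delta = -1; }
    size_t n = 0;
    /* endpoint is one step beyond the last character visited */
    for (int i = start; i != endpoint; i += delta) out[n++] = p[i];
    out[n] = 0;
    return n;
}
```

<p class="commentary">Because <span class="extract"><span class="extract-syntax">endpoint</span></span> is always one step past the final character in the direction of travel, the same loop condition serves both directions, and an empty string is handled without a special case.
</p>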
|
|
<p class="commentary firstcommentary"><a id="SP2_2" class="paragraph-anchor"></a><b>&#167;2.2. </b>In general, trie searches can be made more efficient if the trie is shuffled
|
|
so that the most recently matched exit in the list is moved to the top, as
|
|
this tends to make commonly used exits migrate upwards and rarities downwards.
|
|
But we aren't going to search these tries anything like intensively enough
|
|
to make it worth the trouble.
|
|
</p>
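<p class="commentary">For illustration only, since the code in this section deliberately does not do this, the shuffling described above can be sketched as a move-to-front search on a singly linked list. The type <span class="extract"><span class="extract-syntax">exit_node</span></span> and the function <span class="extract"><span class="extract-syntax">find_move_to_front</span></span> are hypothetical and much simpler than <span class="extract"><span class="extract-syntax">match_trie</span></span>.
</p>

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a list of exits from one node. */
typedef struct exit_node {
    int match_character;
    struct exit_node *next;
} exit_node;

/* Searches the list at *head for an exit matching c. On a hit the node is
   unlinked and relinked at the front, so that frequently used exits drift
   towards the top. Returns the matching node, or NULL if there is none. */
exit_node *find_move_to_front(exit_node **head, int c) {
    exit_node *prev = NULL;
    for (exit_node *e = *head; e; prev = e, e = e->next) {
        if (e->match_character == c) {
            if (prev) {               /* not already at the front */
                prev->next = e->next; /* unlink */
                e->next = *head;      /* relink at the top */
                *head = e;
            }
            return e;
        }
    }
    return NULL;
}
```

<p class="commentary">Note that reordering of this kind would conflict with keeping the exits sorted by <span class="extract"><span class="extract-syntax">match_character</span></span>, which is how new exits are inserted in this section.
</p>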
|
|
|
|
<p class="commentary">(The following cannot be a <span class="extract"><span class="extract-syntax">while</span></span> loop since C does not allow us to <span class="extract"><span class="extract-syntax">break</span></span>
|
|
or <span class="extract"><span class="extract-syntax">continue</span></span> out of an outer loop from an inner one.)
|
|
</p>
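<p class="commentary">The control-flow point can be seen in isolation in the sketch below, which is an assumption-laden toy rather than the trie code itself: the inner scan is written with a label and <span class="extract"><span class="extract-syntax">goto</span></span>, so that <span class="extract"><span class="extract-syntax">continue</span></span> and <span class="extract"><span class="extract-syntax">break</span></span> still refer to the enclosing <span class="extract"><span class="extract-syntax">for</span></span> loop. The function <span class="extract"><span class="extract-syntax">count_accepted</span></span> and the label <span class="extract"><span class="extract-syntax">Scan</span></span> are hypothetical names.
</p>

```c
#include <stddef.h>

/* Hypothetical example: returns the length of the longest prefix of text
   whose characters all occur in allowed. The inner scan over allowed is a
   label-and-goto loop, so continue and break act on the outer for loop. */
int count_accepted(const char *text, const char *allowed) {
    int accepted = 0;
    for (size_t i = 0; text[i]; i++) {
        size_t k = 0;
        Scan:
        if (allowed[k]) {
            if (allowed[k] == text[i]) {
                accepted++;
                continue;   /* moves the OUTER loop on to the next character */
            }
            k++;
            goto Scan;      /* try the next candidate */
        }
        break;              /* no candidate matched: leave the OUTER loop */
    }
    return accepted;
}
```

<p class="commentary">Had the inner scan been a genuine <span class="extract"><span class="extract-syntax">while</span></span> loop, the <span class="extract"><span class="extract-syntax">continue</span></span> would merely have restarted that inner loop.
</p>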
|
|
|
|
<p class="commentary"><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Look through the possible exits from this position and move on if any match</span><span class="named-paragraph-number">2.2</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">ambig</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">, </span><span class="identifier-syntax">unambig</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">point</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">point</span><span class="plain-syntax"> = </span><span class="identifier-syntax">pos</span><span class="plain-syntax">; </span><span class="identifier-syntax">point</span><span class="plain-syntax">; </span><span class="identifier-syntax">point</span><span class="plain-syntax"> = </span><span class="identifier-syntax">point</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">)</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-taa.html#SP3" class="function-link"><span class="function-syntax">Tries::is_ambiguous</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">point</span><span class="plain-syntax">)) </span><span class="identifier-syntax">ambig</span><span class="plain-syntax">++;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="identifier-syntax">unambig</span><span class="plain-syntax">++;</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">FauxWhileLoop:</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) || (</span><a href="4-taa.html#SP3" class="function-link"><span class="function-syntax">Tries::is_ambiguous</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">pos</span><span class="plain-syntax">) == </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">))</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><a href="4-taa.html#SP3" class="function-link"><span class="function-syntax">Tries::matches</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">pos</span><span class="plain-syntax">, </span><span class="identifier-syntax">c</span><span class="plain-syntax">)) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax">) </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">ambig</span><span class="plain-syntax"> > </span><span class="constant-syntax">0</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">ambig</span><span class="plain-syntax">+</span><span class="identifier-syntax">unambig</span><span class="plain-syntax"> > </span><span class="constant-syntax">1</span><span class="plain-syntax">)</span>
|
|
<span class="plain-syntax"> && (</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax"> < </span><span class="constant-syntax">MAX_TRIE_REWIND</span><span class="plain-syntax">)) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">rewind_points</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">] = </span><span class="identifier-syntax">i</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">rewind_positions</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">] = </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">rewind_prev_positions</span><span class="plain-syntax">[</span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">] = </span><span class="identifier-syntax">prev</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">rewind_sp</span><span class="plain-syntax">++;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_4" class="named-paragraph-link"><span class="named-paragraph">Accept the current node of the trie</span><span class="named-paragraph-number">2.4</span></a></span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">continue</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">goto</span><span class="plain-syntax"> </span><span class="identifier-syntax">FauxWhileLoop</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a>.</li></ul>
|
|
<p class="commentary firstcommentary"><a id="SP2_3" class="paragraph-anchor"></a><b>§2.3. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">We have run out of trie and must create a new exit to continue</span><span class="named-paragraph-number">2.3</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">new_pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">g</span><span class="plain-syntax"> > </span><span class="constant-syntax">0</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">nt</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRIE_ANY_GROUP</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">from</span><span class="plain-syntax"> = </span><span class="identifier-syntax">group</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">group</span><span class="plain-syntax">[0] == </span><span class="character-syntax">'!'</span><span class="plain-syntax">) { </span><span class="identifier-syntax">from</span><span class="plain-syntax">++; </span><span class="identifier-syntax">nt</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRIE_NOT_GROUP</span><span class="plain-syntax">; }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">group</span><span class="plain-syntax">[(</span><span class="reserved-syntax">int</span><span class="plain-syntax">) </span><a href="4-ws.html#SP1" class="function-link"><span class="function-syntax">Wide::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">group</span><span class="plain-syntax">)-1] == </span><span class="character-syntax">'!'</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">group</span><span class="plain-syntax">[(</span><span class="reserved-syntax">int</span><span class="plain-syntax">) </span><a href="4-ws.html#SP1" class="function-link"><span class="function-syntax">Wide::len</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">group</span><span class="plain-syntax">)-1] = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">nt</span><span class="plain-syntax"> = </span><span class="constant-syntax">TRIE_NOT_GROUP</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">nt</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">wcscpy</span><span class="plain-syntax">(</span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">, </span><span class="identifier-syntax">from</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="character-syntax">'*'</span><span class="plain-syntax">) </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span><span class="plain-syntax"> </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">c</span><span class="plain-syntax">);</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">else</span><span class="plain-syntax"> {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">ppoint</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">, *</span><span class="identifier-syntax">point</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">point</span><span class="plain-syntax"> = </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax">; </span><span class="identifier-syntax">point</span><span class="plain-syntax">; </span><span class="identifier-syntax">ppoint</span><span class="plain-syntax"> = </span><span class="identifier-syntax">point</span><span class="plain-syntax">, </span><span class="identifier-syntax">point</span><span class="plain-syntax"> = </span><span class="identifier-syntax">point</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> < </span><span class="identifier-syntax">point</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">ppoint</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> } </span><span class="reserved-syntax">else</span><span class="plain-syntax"> {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">ppoint</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">point</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">point</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">point</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> }</span>
|
|
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">new_pos</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="named-paragraph-container code-font"><a href="4-taa.html#SP2_4" class="named-paragraph-link"><span class="named-paragraph">Accept the current node of the trie</span><span class="named-paragraph-number">2.4</span></a></span><span class="plain-syntax">; </span><span class="reserved-syntax">continue</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a>.</li></ul>
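<p class="commentary">The second half of the code above keeps the list of exits in ascending order of <span class="extract"><span class="extract-syntax">match_character</span></span>: the new node goes in before the first exit with a larger value, or at the end if there is none. As a hedged sketch of the same ordering rule, here is an equivalent insertion written with a pointer to the link being updated, which avoids the separate cases for the head of the list. The names <span class="extract"><span class="extract-syntax">exit_node</span></span> and <span class="extract"><span class="extract-syntax">insert_exit_sorted</span></span> are hypothetical, and this is a restatement for illustration, not the module's own code.
</p>

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a list of exits from one node. */
typedef struct exit_node {
    int match_character;
    struct exit_node *next;
} exit_node;

/* Inserts new_node into the list at *head, before the first node whose
   match_character is strictly larger, or at the end if there is none.
   Nodes with equal values therefore stay in order of insertion. */
void insert_exit_sorted(exit_node **head, exit_node *new_node) {
    exit_node **link = head;
    while (*link && (*link)->match_character <= new_node->match_character)
        link = &(*link)->next;
    new_node->next = *link;
    *link = new_node;
}
```

<p class="commentary">If the special node types are given numeric values below those of ordinary characters, which is an assumption here since their definitions lie outside this section, then this ordering would place them ahead of literal characters in each list.
</p>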
|
|
<p class="commentary firstcommentary"><a id="SP2_4" class="paragraph-anchor"></a><b>§2.4. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">Accept the current node of the trie</span><span class="named-paragraph-number">2.4</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="identifier-syntax">internal_error</span><span class="plain-syntax">(</span><span class="string-syntax">"trie invariant broken"</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax"> = </span><span class="identifier-syntax">pos</span><span class="plain-syntax">; </span><span class="identifier-syntax">pos</span><span class="plain-syntax"> = </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a> (twice), <a href="4-taa.html#SP2_2">§2.2</a>, <a href="4-taa.html#SP2_3">§2.3</a>.</li></ul>
|
|
<p class="commentary firstcommentary"><a id="SP2_5" class="paragraph-anchor"></a><b>§2.5. </b>If <span class="extract"><span class="extract-syntax">pos</span></span> is <span class="extract"><span class="extract-syntax">NULL</span></span> then it follows that <span class="extract"><span class="extract-syntax">prev->on_success</span></span> is <span class="extract"><span class="extract-syntax">NULL</span></span>, since
|
|
this is how <span class="extract"><span class="extract-syntax">pos</span></span> was calculated; so to add a new terminal node we simply add
|
|
it there.
|
|
</p>
|
|
|
|
<p class="commentary"><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">We failed by running out of trie, so we must add a terminal node to make this string acceptable</span><span class="named-paragraph-number">2.5</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="identifier-syntax">on_success</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="constant-syntax">TRIE_STOP</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="identifier-syntax">on_success</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax"> = </span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a>.</li></ul>
|
|
<p class="commentary firstcommentary"><a id="SP2_6" class="paragraph-anchor"></a><b>§2.6. </b><span class="named-paragraph-container code-font"><span class="named-paragraph-defn">We failed by finishing at a non-terminal node, so we must add an outcome</span><span class="named-paragraph-number">2.6</span></span><span class="comment-syntax"> =</span>
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="identifier-syntax">on_success</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="constant-syntax">TRIE_STOP</span><span class="plain-syntax">);</span>
|
|
<span class="plain-syntax"> </span><span class="identifier-syntax">prev</span><span class="plain-syntax">-></span><span class="identifier-syntax">on_success</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax"> = </span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">add_outcome</span><span class="plain-syntax">;</span>
|
|
</pre>
|
|
<ul class="endnotetexts"><li>This code is used in <a href="4-taa.html#SP2">§2</a>.</li></ul>
|
|
<p class="commentary firstcommentary"><a id="SP3" class="paragraph-anchor"></a><b>§3. </b>Single nodes are matched thus:
|
|
</p>
|
|
|
|
<pre class="displayed-code all-displayed-code code-font">
|
|
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Tries::matches</span><button class="popup" onclick="togglePopup('usagePopup2')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup2">Usage of <span class="code-font"><span class="function-syntax">Tries::matches</span></span>:<br/><a href="4-taa.html#SP2_2">§2.2</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pos</span><span class="plain-syntax">, </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">c</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANY_GROUP</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">k</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">k</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">[</span><span class="identifier-syntax">k</span><span class="plain-syntax">]; </span><span class="identifier-syntax">k</span><span class="plain-syntax">++)</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">[</span><span class="identifier-syntax">k</span><span class="plain-syntax">])</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_NOT_GROUP</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">k</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (</span><span class="identifier-syntax">k</span><span class="plain-syntax"> = </span><span class="constant-syntax">0</span><span class="plain-syntax">; </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">[</span><span class="identifier-syntax">k</span><span class="plain-syntax">]; </span><span class="identifier-syntax">k</span><span class="plain-syntax">++)</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">c</span><span class="plain-syntax"> == </span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">[</span><span class="identifier-syntax">k</span><span class="plain-syntax">])</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> }</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="identifier-syntax">c</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax">}</span>
|
|
|
|
<span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="function-syntax">Tries::is_ambiguous</span><button class="popup" onclick="togglePopup('usagePopup3')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup3">Usage of <span class="code-font"><span class="function-syntax">Tries::is_ambiguous</span></span>:<br/><a href="4-taa.html#SP2_2">§2.2</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">pos</span><span class="plain-syntax">) {</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANYTHING</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_ANY_GROUP</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
|
|
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">pos</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> == </span><span class="constant-syntax">TRIE_NOT_GROUP</span><span class="plain-syntax">) </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">TRUE</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="constant-syntax">FALSE</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
<p class="commentary firstcommentary"><a id="SP4" class="paragraph-anchor"></a><b>§4. </b>Where:
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="function-syntax">Tries::new</span><button class="popup" onclick="togglePopup('usagePopup4')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup4">Usage of <span class="code-font"><span class="function-syntax">Tries::new</span></span>:<br/><a href="4-taa.html#SP2_3">§2.3</a>, <a href="4-taa.html#SP2_5">§2.5</a>, <a href="4-taa.html#SP2_6">§2.6</a>, <a href="4-taa.html#SP6">§6</a></span></button><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">mc</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">T</span><span class="plain-syntax"> = </span><span class="identifier-syntax">CREATE</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_trie</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax"> = </span><span class="identifier-syntax">mc</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
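<p class="commentary">As a rough illustration of how nodes made this way might be linked together, here is a self-contained sketch in plain C. The names <span class="code-font">toy_node</span>, <span class="code-font">toy_new</span>, <span class="code-font">toy_count</span> and <span class="code-font">toy_example</span> are hypothetical stand-ins, and <span class="code-font">calloc</span> takes the place of the <span class="code-font">CREATE</span> macro. Reading <span class="code-font">next</span> as "an alternative at the same position" and <span class="code-font">on_success</span> as "where to continue after a match" is an assumption inferred from the way the logging code walks a trie. The example also omits the head node which a real trie begins with.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in for match_trie, with only the fields Tries::new sets. */
typedef struct toy_node {
    int match_character;           /* character, or special code, to match */
    const wchar_t *match_outcome;  /* non-NULL where a match can finish */
    struct toy_node *on_success;   /* where to continue after a match here */
    struct toy_node *next;         /* alternative to try at the same position */
} toy_node;

/* Mirrors Tries::new: a fresh node with no outcome and no links. */
toy_node *toy_new(int mc) {
    toy_node *T = calloc(1, sizeof(toy_node));
    assert(T != NULL);             /* the sketch just aborts if memory runs out */
    T->match_character = mc;
    T->match_outcome = NULL;
    T->on_success = NULL;
    T->next = NULL;
    return T;
}

/* Count every node reachable from T through either kind of link. */
int toy_count(const toy_node *T) {
    int n = 0;
    for (; T; T = T->next) n += 1 + toy_count(T->on_success);
    return n;
}

/* Link three nodes by hand so that "an" and "at" share the node for 'a'. */
toy_node *toy_example(void) {
    toy_node *a = toy_new('a'), *n = toy_new('n'), *t = toy_new('t');
    a->on_success = n;             /* after 'a', try 'n'... */
    n->next = t;                   /* ...and if that fails, try 't' */
    n->match_outcome = L"article";
    t->match_outcome = L"preposition";
    return a;
}
```

<p class="commentary">Two words sharing a first letter need three nodes here rather than four, which is the saving a trie offers over a flat list of words.
</p>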
<p class="commentary firstcommentary"><a id="SP5" class="paragraph-anchor"></a><b>§5. Avinues. </b>A trie is only a limited form of finite state machine. We're not going to need
the whole power of finite state machines, but we do find it useful to chain a series of tries
together. The idea is to scan against one trie, then, if there's no result,
start again with the next, and so on. Inform therefore often matches text
against a linked list of tries: we'll call that an "avinue".
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">typedef</span><span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">the_trie</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">struct</span><span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">next</span><span class="plain-syntax">;</span>
<span class="plain-syntax">} </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax">;</span>
</pre>
<ul class="endnotetexts"><li>The structure match_avinue is accessed in 2/mmr, 2/trs and here.</li></ul>
<p class="commentary firstcommentary"><a id="SP6" class="paragraph-anchor"></a><b>§6. </b>An avinue starts out with a single trie, which itself has just a single
head node (of either sort).
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="function-syntax">Tries::new_avinue</span><span class="plain-syntax">(</span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">from_start</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">A</span><span class="plain-syntax"> = </span><span class="identifier-syntax">CREATE</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP4" class="function-link"><span class="function-syntax">Tries::new</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">from_start</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">A</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Tries::add_to_avinue</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">mt</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">from</span><span class="plain-syntax">, </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">to</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">mt</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) || (</span><span class="identifier-syntax">mt</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">)) </span><span class="identifier-syntax">internal_error</span><span class="plain-syntax">(</span><span class="string-syntax">"null trie"</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><a href="4-taa.html#SP2" class="function-link"><span class="function-syntax">Tries::search</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">mt</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax">, </span><span class="identifier-syntax">from</span><span class="plain-syntax">, </span><span class="identifier-syntax">to</span><span class="plain-syntax">);</span>
<span class="plain-syntax">}</span>
</pre>
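<p class="commentary">To make the starting state concrete, here is a minimal sketch in plain C of an avinue being created. All of the <span class="code-font">toy_</span> names are hypothetical, <span class="code-font">calloc</span> stands in for <span class="code-font">CREATE</span>, and the numeric values given to the two head-node codes are assumptions, since the real constants are defined elsewhere. The helper <span class="code-font">toy_avinue_is_usable</span> makes the same check as the guard in <span class="code-font">Tries::add_to_avinue</span>, but returns a flag instead of raising an internal error.
</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed values: the real head-node constants are defined elsewhere. */
#define TOY_TRIE_START 1
#define TOY_TRIE_END   2

/* Hypothetical cut-down trie node. */
typedef struct toy_trie {
    int match_character;
    struct toy_trie *on_success, *next;
} toy_trie;

/* Hypothetical stand-in for match_avinue: one link in a chain of tries. */
typedef struct toy_avinue {
    toy_trie *the_trie;
    struct toy_avinue *next;
} toy_avinue;

/* A trie consisting of nothing but its head node. */
toy_trie *toy_new_trie(int mc) {
    toy_trie *T = calloc(1, sizeof(toy_trie));
    assert(T != NULL);
    T->match_character = mc;
    T->on_success = NULL;
    T->next = NULL;
    return T;
}

/* Mirrors Tries::new_avinue: a chain of length one holding a head-only trie. */
toy_avinue *toy_new_avinue(int from_start) {
    toy_avinue *A = calloc(1, sizeof(toy_avinue));
    assert(A != NULL);
    A->next = NULL;
    A->the_trie = toy_new_trie(from_start);
    return A;
}

/* The condition Tries::add_to_avinue insists on before adding anything. */
int toy_avinue_is_usable(const toy_avinue *A) {
    return (A != NULL) && (A->the_trie != NULL);
}
```

<p class="commentary">Calling <span class="code-font">toy_new_avinue(TOY_TRIE_END)</span> gives a chain with no successor whose only trie is a lone head node; everything else is added later.
</p>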
<p class="commentary firstcommentary"><a id="SP7" class="paragraph-anchor"></a><b>§7. </b>The following duplicates an avinue. The copy is a new linked list, but it
points to the same sequence of tries: the tries themselves are shared, not copied.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="function-syntax">Tries::duplicate_avinue</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">A</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">F</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">, *</span><span class="identifier-syntax">FL</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">while</span><span class="plain-syntax"> (</span><span class="identifier-syntax">A</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">FN</span><span class="plain-syntax"> = </span><span class="identifier-syntax">CREATE</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">FN</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">FN</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax"> = </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">A</span><span class="plain-syntax"> = </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">FL</span><span class="plain-syntax">) </span><span class="identifier-syntax">FL</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax"> = </span><span class="identifier-syntax">FN</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">F</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">) </span><span class="identifier-syntax">F</span><span class="plain-syntax"> = </span><span class="identifier-syntax">FN</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">FL</span><span class="plain-syntax"> = </span><span class="identifier-syntax">FN</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">F</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
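<p class="commentary">The duplication above is a shallow copy, which the following plain C sketch tries to make visible. The <span class="code-font">toy_</span> names are hypothetical, the trie is reduced to a placeholder since its contents do not matter here, and <span class="code-font">malloc</span> and <span class="code-font">calloc</span> stand in for <span class="code-font">CREATE</span>.
</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder trie: only its address matters in this sketch. */
typedef struct toy_trie { int id; } toy_trie;

typedef struct toy_avinue {
    toy_trie *the_trie;
    struct toy_avinue *next;
} toy_avinue;

/* Helper for building a chain by hand: put trie T in front of rest. */
toy_avinue *toy_link(toy_trie *T, toy_avinue *rest) {
    toy_avinue *A = malloc(sizeof(toy_avinue));
    assert(A != NULL);
    A->the_trie = T;
    A->next = rest;
    return A;
}

/* Same shape as Tries::duplicate_avinue: new links, same tries, same order. */
toy_avinue *toy_duplicate(const toy_avinue *A) {
    toy_avinue *first = NULL, *last = NULL;
    for (; A; A = A->next) {
        toy_avinue *copy = calloc(1, sizeof(toy_avinue));
        assert(copy != NULL);
        copy->the_trie = A->the_trie;   /* shared with the original */
        copy->next = NULL;
        if (last) last->next = copy; else first = copy;
        last = copy;
    }
    return first;
}

/* 1 if B has distinct links from A but exactly the same tries in order. */
int toy_is_shallow_copy(const toy_avinue *A, const toy_avinue *B) {
    for (; A && B; A = A->next, B = B->next)
        if ((A == B) || (A->the_trie != B->the_trie)) return 0;
    return (A == NULL) && (B == NULL);
}
```

<p class="commentary">Because the tries are shared, anything later added to a trie through one avinue is also seen when searching through the other; only the order and membership of the two chains are independent.
</p>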
<p class="commentary firstcommentary"><a id="SP8" class="paragraph-anchor"></a><b>§8. </b>As noted above, searching an avinue is a matter of searching with each
trie in turn until one matches. If none does, the result is <span class="code-font">NULL</span>.
</p>
<pre class="displayed-code all-displayed-code code-font">
<span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="function-syntax">Tries::search_avinue</span><span class="plain-syntax">(</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">T</span><span class="plain-syntax">, </span><span class="reserved-syntax">text_stream</span><span class="plain-syntax"> *</span><span class="identifier-syntax">p</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">wchar_t</span><span class="plain-syntax"> *</span><span class="identifier-syntax">result</span><span class="plain-syntax"> = </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">while</span><span class="plain-syntax"> ((</span><span class="identifier-syntax">T</span><span class="plain-syntax">) && (</span><span class="identifier-syntax">result</span><span class="plain-syntax"> == </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">)) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">result</span><span class="plain-syntax"> = </span><a href="4-taa.html#SP2" class="function-link"><span class="function-syntax">Tries::search</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax">, </span><span class="identifier-syntax">p</span><span class="plain-syntax">, </span><span class="identifier-syntax">NULL</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">T</span><span class="plain-syntax"> = </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">return</span><span class="plain-syntax"> </span><span class="identifier-syntax">result</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
</pre>
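<p class="commentary">The fall-through behaviour can be sketched in plain C as follows. To keep the example short, each "trie" here is a hypothetical stand-in which recognises exactly one word, so <span class="code-font">toy_search</span> is only a placeholder for <span class="code-font">Tries::search</span>; the loop in <span class="code-font">toy_search_avinue</span> is the part which corresponds to the function above. The words and outcomes are invented for illustration.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Stand-in for a trie: it recognises a single word and yields one outcome. */
typedef struct toy_trie {
    const wchar_t *key;
    const wchar_t *outcome;
} toy_trie;

typedef struct toy_avinue {
    const toy_trie *the_trie;
    const struct toy_avinue *next;
} toy_avinue;

/* Placeholder for Tries::search: the outcome on a match, otherwise NULL. */
const wchar_t *toy_search(const toy_trie *T, const wchar_t *text) {
    return (wcscmp(T->key, text) == 0) ? T->outcome : NULL;
}

/* Try each trie in turn, stopping at the first which produces a result. */
const wchar_t *toy_search_avinue(const toy_avinue *A, const wchar_t *text) {
    const wchar_t *result = NULL;
    while ((A) && (result == NULL)) {
        result = toy_search(A->the_trie, text);
        A = A->next;
    }
    return result;
}

/* Search a fixed three-trie avinue; the first and third tries overlap. */
const wchar_t *toy_demo(const wchar_t *text) {
    static const toy_trie first  = { L"sheep", L"sheep" };
    static const toy_trie second = { L"ox",    L"oxen" };
    static const toy_trie third  = { L"sheep", L"sheeps" };
    static const toy_avinue c = { &third, NULL };
    static const toy_avinue b = { &second, &c };
    static const toy_avinue a = { &first, &b };
    return toy_search_avinue(&a, text);
}
```

<p class="commentary">Earlier tries take priority: the third trie also recognises "sheep", but it is never consulted for that word because the first trie has already answered.
</p>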
<p class="commentary firstcommentary"><a id="SP9" class="paragraph-anchor"></a><b>§9. Logging. </b></p>
<pre class="displayed-code all-displayed-code code-font">
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Tries::log_avinue</span><button class="popup" onclick="togglePopup('usagePopup5')"><span class="comment-syntax">?</span><span class="popuptext" id="usagePopup5">Usage of <span class="code-font"><span class="function-syntax">Tries::log_avinue</span></span>:<br/>Foundation Module - <a href="1-fm.html#SP8_3">§8.3</a></span></button><span class="plain-syntax">(</span><span class="constant-syntax">OUTPUT_STREAM</span><span class="plain-syntax">, </span><span class="reserved-syntax">void</span><span class="plain-syntax"> *</span><span class="identifier-syntax">vA</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *</span><span class="identifier-syntax">A</span><span class="plain-syntax"> = (</span><span class="reserved-syntax">match_avinue</span><span class="plain-syntax"> *) </span><span class="identifier-syntax">vA</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Avinue:\n"</span><span class="plain-syntax">); </span><span class="constant-syntax">INDENT</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">int</span><span class="plain-syntax"> </span><span class="identifier-syntax">n</span><span class="plain-syntax"> = </span><span class="constant-syntax">1</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">while</span><span class="plain-syntax"> (</span><span class="identifier-syntax">A</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Trie %d:\n"</span><span class="plain-syntax">, </span><span class="identifier-syntax">n</span><span class="plain-syntax">++); </span><span class="constant-syntax">INDENT</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><a href="4-taa.html#SP9" class="function-link"><span class="function-syntax">Tries::log</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">OUT</span><span class="plain-syntax">, </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">the_trie</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="constant-syntax">OUTDENT</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">A</span><span class="plain-syntax"> = </span><span class="identifier-syntax">A</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> </span><span class="constant-syntax">OUTDENT</span><span class="plain-syntax">;</span>
<span class="plain-syntax">}</span>
<span class="reserved-syntax">void</span><span class="plain-syntax"> </span><span class="function-syntax">Tries::log</span><span class="plain-syntax">(</span><span class="constant-syntax">OUTPUT_STREAM</span><span class="plain-syntax">, </span><span class="reserved-syntax">match_trie</span><span class="plain-syntax"> *</span><span class="identifier-syntax">T</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">for</span><span class="plain-syntax"> (; </span><span class="identifier-syntax">T</span><span class="plain-syntax">; </span><span class="identifier-syntax">T</span><span class="plain-syntax"> = </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">next</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">switch</span><span class="plain-syntax"> (</span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_START:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Start"</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_END:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"End"</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_ANYTHING:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Anything"</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_ANY_GROUP:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Group <%w>"</span><span class="plain-syntax">, </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_NOT_GROUP:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Negated group <%w>"</span><span class="plain-syntax">, </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">group_characters</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="identifier-syntax">TRIE_STOP:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"Stop"</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">case</span><span class="plain-syntax"> </span><span class="constant-syntax">0</span><span class="plain-syntax">: </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"00"</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">default:</span><span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"%c"</span><span class="plain-syntax">, </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_character</span><span class="plain-syntax">); </span><span class="reserved-syntax">break</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">        </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax">) </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">" --> %w"</span><span class="plain-syntax">, </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">match_outcome</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="identifier-syntax">WRITE</span><span class="plain-syntax">(</span><span class="string-syntax">"\n"</span><span class="plain-syntax">);</span>
<span class="plain-syntax"> </span><span class="reserved-syntax">if</span><span class="plain-syntax"> (</span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax">) {</span>
<span class="plain-syntax"> </span><span class="constant-syntax">INDENT</span><span class="plain-syntax">; </span><a href="4-taa.html#SP9" class="function-link"><span class="function-syntax">Tries::log</span></a><span class="plain-syntax">(</span><span class="identifier-syntax">OUT</span><span class="plain-syntax">, </span><span class="identifier-syntax">T</span><span class="plain-syntax">-></span><span class="element-syntax">on_success</span><span class="plain-syntax">); </span><span class="constant-syntax">OUTDENT</span><span class="plain-syntax">;</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax"> }</span>
<span class="plain-syntax">}</span>
</pre>
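<p class="commentary">For a sense of what such a log looks like, here is a simplified sketch in plain C. It is not the Foundation logging system: the <span class="code-font">toy_</span> names are hypothetical, output goes to a caller-supplied buffer through <span class="code-font">snprintf</span> rather than to an output stream, an explicit <span class="code-font">depth</span> argument replaces <span class="code-font">INDENT</span> and <span class="code-font">OUTDENT</span>, narrow characters replace wide ones, and the special node codes are left out.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical narrow-character version of a trie node. */
typedef struct toy_node {
    char match_character;
    const char *match_outcome;           /* NULL if no match ends here */
    const struct toy_node *on_success;   /* continuation after a match */
    const struct toy_node *next;         /* alternative at this position */
} toy_node;

/* Append one line per node to buf, which must already hold a string.
   Alternatives appear at the same depth; continuations two spaces deeper. */
void toy_log(char *buf, size_t cap, const toy_node *T, int depth) {
    for (; T; T = T->next) {
        size_t used = strlen(buf);
        if (T->match_outcome)
            snprintf(buf + used, cap - used, "%*s%c --> %s\n",
                2*depth, "", T->match_character, T->match_outcome);
        else
            snprintf(buf + used, cap - used, "%*s%c\n",
                2*depth, "", T->match_character);
        toy_log(buf, cap, T->on_success, depth + 1);
    }
}

/* Log a hand-built structure in which "an" and "at" share the node for 'a'. */
const char *toy_demo(void) {
    static char buf[128];
    static const toy_node t = { 't', "preposition", NULL, NULL };
    static const toy_node n = { 'n', "article", NULL, &t };
    static const toy_node a = { 'a', NULL, &n, NULL };
    buf[0] = 0;
    toy_log(buf, sizeof buf, &a, 0);
    return buf;
}
```

<p class="commentary">Here <span class="code-font">toy_demo()</span> produces three lines: <span class="code-font">a</span> at the left margin, then <span class="code-font">n --> article</span> and <span class="code-font">t --> preposition</span>, each indented by two spaces to show that they follow a successful match of the first letter.
</p>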
<nav role="progress"><div class="progresscontainer">
<ul class="progressbar"><li class="progressprev"><a href="4-tf.html">❮</a></li><li class="progresschapter"><a href="P-abgtf.html">P</a></li><li class="progresschapter"><a href="1-fm.html">1</a></li><li class="progresschapter"><a href="2-dl.html">2</a></li><li class="progresschapter"><a href="3-em.html">3</a></li><li class="progresscurrentchapter">4</li><li class="progresssection"><a href="4-chr.html">chr</a></li><li class="progresssection"><a href="4-cst.html">cst</a></li><li class="progresssection"><a href="4-ws.html">ws</a></li><li class="progresssection"><a href="4-sm.html">sm</a></li><li class="progresssection"><a href="4-tf.html">tf</a></li><li class="progresscurrent">taa</li><li class="progresssection"><a href="4-pm.html">pm</a></li><li class="progresschapter"><a href="5-htm.html">5</a></li><li class="progresschapter"><a href="6-bf.html">6</a></li><li class="progresschapter"><a href="7-vn.html">7</a></li><li class="progresschapter"><a href="8-ws.html">8</a></li><li class="progressnext"><a href="4-pm.html">❯</a></li></ul></div>
</nav><!--End of weave-->
</main>
</body>
</html>