<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>2/lc</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-gb">
<link href="../inweb.css" rel="stylesheet" rev="stylesheet" type="text/css">
</head>
<body>
<nav role="navigation">
<h1><a href="../webs.html">Sources</a></h1>
<ul>
<li><a href="../inweb/index.html">inweb</a></li>
</ul>
<h2>Foundation</h2>
<ul>
<li><a href="../foundation-module/index.html">foundation-module</a></li>
<li><a href="../foundation-test/index.html">foundation-test</a></li>
</ul>
</nav>
<main role="main">
<!--Weave of '2/lc' generated by 7-->
<ul class="crumbs"><li><a href="../webs.html">Source</a></li><li><a href="index.html">inweb</a></li><li><a href="index.html#2">Chapter 2: Parsing a Web</a></li><li><b>Line Categories</b></li></ul><p class="purpose">To store individual lines from webs, and to categorise them according to their meaning.</p>
<ul class="toc"><li><a href="#SP1">&#167;1. Line storage</a></li><li><a href="#SP3">&#167;3. Categories</a></li><li><a href="#SP5">&#167;5. Command codes</a></li></ul><hr class="tocbar">
<p class="inwebparagraph"><a id="SP1"></a><b>&#167;1. Line storage. </b>In the next section, we'll read in an entire web, building its hierarchical
structure of chapters, sections and eventually paragraphs. But before we do
that, we'll define the structure used to store a single line of the web.
</p>
<p class="inwebparagraph">Inweb markup uses the special characters <code class="display"><span class="extract">@</span></code> and <code class="display"><span class="extract">=</span></code> as
dividers, but only in column 1, so the important divisions between material
all effectively occur at line boundaries. This is a major point of
difference from, for example, CWEB, whose source is just a stream
of characters in which all white space is equivalent. Because Inweb source
is so tidily divisible into lines, we can usefully make each source line
correspond to one of these:
</p>
<pre class="display">
<span class="reserved">typedef</span><span class="plain"> </span><span class="reserved">struct</span><span class="plain"> </span><span class="reserved">source_line</span><span class="plain"> {</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">text_stream</span><span class="plain"> *</span><span class="identifier">text</span><span class="plain">; </span> <span class="comment">the text as read in</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">text_stream</span><span class="plain"> *</span><span class="identifier">text_operand</span><span class="plain">; </span> <span class="comment">meaning depends on category</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">text_stream</span><span class="plain"> *</span><span class="identifier">text_operand2</span><span class="plain">; </span> <span class="comment">meaning depends on category</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">category</span><span class="plain">; </span> <span class="comment">what sort of line this is: an <code class="display"><span class="extract">*_LCAT</span></code> value</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">command_code</span><span class="plain">; </span> <span class="comment">used only for <code class="display"><span class="extract">COMMAND_LCAT</span></code> lines: a <code class="display"><span class="extract">*_CMD</span></code> value</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">default_defn</span><span class="plain">; </span> <span class="comment">used only for <code class="display"><span class="extract">BEGIN_DEFINITION_LCAT</span></code> lines</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">is_commentary</span><span class="plain">; </span> <span class="comment">flag</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">function</span><span class="plain"> *</span><span class="identifier">function_defined</span><span class="plain">; </span> <span class="comment">if any C-like function is defined on this line</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">preform_nonterminal</span><span class="plain"> *</span><span class="identifier">preform_nonterminal_defined</span><span class="plain">; </span> <span class="comment">similarly</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">suppress_tangling</span><span class="plain">; </span> <span class="comment">if e.g., lines are tangled out of order</span>
<span class="reserved">int</span><span class="plain"> </span><span class="identifier">interface_line_identified</span><span class="plain">; </span> <span class="comment">only relevant during parsing of Interface lines</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">text_file_position</span><span class="plain"> </span><span class="identifier">source</span><span class="plain">; </span> <span class="comment">which file this was read in from, if any</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">section</span><span class="plain"> *</span><span class="identifier">owning_section</span><span class="plain">; </span> <span class="comment">for interleaved title lines, it's the one about to start</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">source_line</span><span class="plain"> *</span><span class="identifier">next_line</span><span class="plain">; </span> <span class="comment">within the owning section's linked list</span>
<span class="reserved">struct</span><span class="plain"> </span><span class="reserved">paragraph</span><span class="plain"> *</span><span class="identifier">owning_paragraph</span><span class="plain">; </span> <span class="comment">for lines falling under paragraphs; <code class="display"><span class="extract">NULL</span></code> if not</span>
<span class="plain">} </span><span class="reserved">source_line</span><span class="plain">;</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The structure source_line is accessed in 1/pc, 2/tr, 2/tp, 2/pm, 2/ec, 2/pn, 3/ta, 3/tw, 3/tt, 4/pl, 4/cl, 4/is, 4/ps, 5/tf and here.</p>
<p class="inwebparagraph"><a id="SP2"></a><b>&#167;2. </b></p>
<pre class="display">
<span class="reserved">source_line</span><span class="plain"> *</span><span class="functiontext">Lines::new_source_line</span><span class="plain">(</span><span class="reserved">text_stream</span><span class="plain"> *</span><span class="identifier">line</span><span class="plain">, </span><span class="reserved">text_file_position</span><span class="plain"> *</span><span class="identifier">tfp</span><span class="plain">) {</span>
<span class="reserved">source_line</span><span class="plain"> *</span><span class="identifier">sl</span><span class="plain"> = </span><span class="identifier">CREATE</span><span class="plain">(</span><span class="reserved">source_line</span><span class="plain">);</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;text</span><span class="plain"> = </span><span class="functiontext">Str::duplicate</span><span class="plain">(</span><span class="identifier">line</span><span class="plain">);</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;text_operand</span><span class="plain"> = </span><span class="functiontext">Str::new</span><span class="plain">();</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;text_operand2</span><span class="plain"> = </span><span class="functiontext">Str::new</span><span class="plain">();</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;category</span><span class="plain"> = </span><span class="constant">NO_LCAT</span><span class="plain">; </span> <span class="comment">that is, unknown category as yet</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;command_code</span><span class="plain"> = </span><span class="constant">NO_CMD</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;default_defn</span><span class="plain"> = </span><span class="constant">FALSE</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;is_commentary</span><span class="plain"> = </span><span class="constant">FALSE</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;function_defined</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;preform_nonterminal_defined</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;suppress_tangling</span><span class="plain"> = </span><span class="constant">FALSE</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;interface_line_identified</span><span class="plain"> = </span><span class="constant">FALSE</span><span class="plain">;</span>
<span class="reserved">if</span><span class="plain"> (</span><span class="identifier">tfp</span><span class="plain">) </span><span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;source</span><span class="plain"> = *</span><span class="identifier">tfp</span><span class="plain">; </span><span class="reserved">else</span><span class="plain"> </span><span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;source</span><span class="plain"> = </span><span class="functiontext">TextFiles::nowhere</span><span class="plain">();</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;owning_section</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;next_line</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="identifier">sl</span><span class="plain">-</span><span class="element">&gt;owning_paragraph</span><span class="plain"> = </span><span class="identifier">NULL</span><span class="plain">;</span>
<span class="reserved">return</span><span class="plain"> </span><span class="identifier">sl</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Lines::new_source_line is used in 2/tr (<a href="2-tr.html#SP6_1_2">&#167;6.1.2</a>), 2/tp (<a href="2-tp.html#SP1_1_5_1">&#167;1.1.5.1</a>).</p>
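<p class="inwebparagraph">As a rough illustration of how lines created this way might be chained through
<code class="display"><span class="extract">next_line</span></code>, here is a cut-down sketch in plain C. Everything named <code class="display"><span class="extract">mini_*</span></code> is
hypothetical: C strings stand in for <code class="display"><span class="extract">text_stream</span></code>, most fields are omitted, and the
list header with its first and last pointers is an assumption made for the
example, not a description of Inweb's <code class="display"><span class="extract">section</span></code> structure.
</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cut-down stand-in for source_line: plain C strings replace text_stream,
   and most fields are omitted. */
typedef struct mini_line {
    char *text;                  /* private copy of the text as read in */
    int category;                /* 0 plays the role of NO_LCAT */
    struct mini_line *next_line; /* next line in the same list, or NULL */
} mini_line;

/* Hypothetical list header holding the first and last lines of one section. */
typedef struct mini_section {
    mini_line *first_line;
    mini_line *last_line;
} mini_section;

/* Follows the shape of Lines::new_source_line: copy the text, then set every
   other field to a neutral default. Returns NULL if memory runs out. */
mini_line *mini_new_line(const char *text) {
    assert(text != NULL);
    mini_line *sl = malloc(sizeof(mini_line));
    if (sl == NULL) return NULL;
    sl->text = malloc(strlen(text) + 1);
    if (sl->text == NULL) { free(sl); return NULL; }
    strcpy(sl->text, text);
    sl->category = 0;
    sl->next_line = NULL;
    return sl;
}

/* Appends a line to the end of the section's list in constant time. */
void mini_append(mini_section *S, mini_line *sl) {
    assert(S != NULL && sl != NULL);
    if (S->last_line) S->last_line->next_line = sl;
    else S->first_line = sl;
    S->last_line = sl;
}
```

<p class="inwebparagraph">Setting every field to a default at creation, as the real constructor does, means
later passes can test a field such as <code class="display"><span class="extract">category</span></code> without worrying whether it was
ever assigned.
</p>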
<p class="inwebparagraph"><a id="SP3"></a><b>&#167;3. Categories. </b>The line categories are enumerated as follows. We briefly note what the text
operands (TO and TO2) are set to, if anything: most of the time they're blank.
Note that a few of these categories are needed only for the more cumbersome
version 1 syntax; version 2 removed the need for <code class="display"><span class="extract">BAR_LCAT</span></code>,
<code class="display"><span class="extract">INTERFACE_BODY_LCAT</span></code>, and <code class="display"><span class="extract">INTERFACE_LCAT</span></code>.
</p>
<pre class="definitions">
<span class="definitionkeyword">enum</span> <span class="constant">NO_LCAT</span><span class="definitionkeyword"> from </span><span class="constant">0</span> <span class="comment">(used when none has been set as yet)</span>
<span class="definitionkeyword">enum</span> <span class="constant">BAR_LCAT</span><span class="plain"> </span> <span class="comment">a bar line <code class="display"><span class="extract">@---------------</span></code>...</span>
<span class="definitionkeyword">enum</span> <span class="constant">BEGIN_CODE_LCAT</span><span class="plain"> </span> <span class="comment">an <code class="display"><span class="extract">@c</span></code>, <code class="display"><span class="extract">@e</span></code> or <code class="display"><span class="extract">@x</span></code> line below which is code, early code or extract</span>
<span class="definitionkeyword">enum</span> <span class="constant">BEGIN_DEFINITION_LCAT</span><span class="plain"> </span> <span class="comment">an <code class="display"><span class="extract">@d</span></code> definition: TO is term, TO2 is this line's part of defn</span>
<span class="definitionkeyword">enum</span> <span class="constant">C_LIBRARY_INCLUDE_LCAT</span><span class="plain"> </span> <span class="comment">C-like languages only: a <code class="display"><span class="extract">#include</span></code> for an ANSI C header file</span>
<span class="definitionkeyword">enum</span> <span class="constant">CHAPTER_HEADING_LCAT</span><span class="plain"> </span> <span class="comment">chapter heading line inserted automatically, not read from web</span>
<span class="definitionkeyword">enum</span> <span class="constant">CODE_BODY_LCAT</span><span class="plain"> </span> <span class="comment">the rest of the paragraph under an <code class="display"><span class="extract">@c</span></code> or <code class="display"><span class="extract">@e</span></code> or macro definition</span>
<span class="definitionkeyword">enum</span> <span class="constant">COMMAND_LCAT</span><span class="plain"> </span> <span class="comment">a <code class="display"><span class="extract">[[Command]]</span></code> line, with the operand set to the <code class="display"><span class="extract">*_CMD</span></code> value</span>
<span class="definitionkeyword">enum</span> <span class="constant">COMMENT_BODY_LCAT</span><span class="plain"> </span> <span class="comment">text following a paragraph header, which is all comment</span>
<span class="definitionkeyword">enum</span> <span class="constant">CONT_DEFINITION_LCAT</span><span class="plain"> </span> <span class="comment">subsequent lines of an <code class="display"><span class="extract">@d</span></code> definition</span>
<span class="definitionkeyword">enum</span> <span class="constant">DEFINITIONS_LCAT</span><span class="plain"> </span> <span class="comment">line holding the <code class="display"><span class="extract">@Definitions:</span></code> heading</span>
<span class="definitionkeyword">enum</span> <span class="constant">HEADING_START_LCAT</span><span class="plain"> </span> <span class="comment"><code class="display"><span class="extract">@h</span></code> paragraph start: TO is title, TO2 is rest of line</span>
<span class="definitionkeyword">enum</span> <span class="constant">INTERFACE_BODY_LCAT</span><span class="plain"> </span> <span class="comment">line within the interface, under this heading</span>
<span class="definitionkeyword">enum</span> <span class="constant">INTERFACE_LCAT</span><span class="plain"> </span> <span class="comment">line holding the <code class="display"><span class="extract">@Interface:</span></code> heading</span>
<span class="definitionkeyword">enum</span> <span class="constant">MACRO_DEFINITION_LCAT</span><span class="plain"> </span> <span class="comment">line on which a paragraph macro is defined with an <code class="display"><span class="extract">=</span></code> sign</span>
<span class="definitionkeyword">enum</span> <span class="constant">PARAGRAPH_START_LCAT</span><span class="plain"> </span> <span class="comment">simple <code class="display"><span class="extract">@</span></code> paragraph start: TO is blank, TO2 is rest of line</span>
<span class="definitionkeyword">enum</span> <span class="constant">PREFORM_GRAMMAR_LCAT</span><span class="plain"> </span> <span class="comment">InC only: line of Preform grammar</span>
<span class="definitionkeyword">enum</span> <span class="constant">PREFORM_LCAT</span><span class="plain"> </span> <span class="comment">InC only: opening line of a Preform nonterminal</span>
<span class="definitionkeyword">enum</span> <span class="constant">PURPOSE_BODY_LCAT</span><span class="plain"> </span> <span class="comment">continuation lines of purpose declaration</span>
<span class="definitionkeyword">enum</span> <span class="constant">PURPOSE_LCAT</span><span class="plain"> </span> <span class="comment">first line of purpose declaration; TO is rest of line</span>
<span class="definitionkeyword">enum</span> <span class="constant">SECTION_HEADING_LCAT</span><span class="plain"> </span> <span class="comment">section heading line, at top of file</span>
<span class="definitionkeyword">enum</span> <span class="constant">SOURCE_DISPLAY_LCAT</span><span class="plain"> </span> <span class="comment">commentary line beginning <code class="display"><span class="extract">&gt;&gt;</span></code> for display: TO is display text</span>
<span class="definitionkeyword">enum</span> <span class="constant">TEXT_EXTRACT_LCAT</span><span class="plain"> </span> <span class="comment">the rest of the paragraph under an <code class="display"><span class="extract">@x</span></code></span>
<span class="definitionkeyword">enum</span> <span class="constant">TYPEDEF_LCAT</span><span class="plain"> </span> <span class="comment">C-like languages only: a <code class="display"><span class="extract">typedef</span></code> which isn't a structure definition</span>
</pre>
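<p class="inwebparagraph">For readers more used to plain C, the same list can be sketched as an ordinary
<code class="display"><span class="extract">enum</span></code>, numbered from 0 in the order given above. The helper
<code class="display"><span class="extract">needed_only_by_version_1</span></code> is hypothetical and covers just the three categories
which the text names as unnecessary in version 2 syntax.
</p>

```c
#include <assert.h>

/* The categories in the order listed above; a plain C enum numbers them
   0, 1, 2, ... in sequence. */
enum line_category {
    NO_LCAT = 0, BAR_LCAT, BEGIN_CODE_LCAT, BEGIN_DEFINITION_LCAT,
    C_LIBRARY_INCLUDE_LCAT, CHAPTER_HEADING_LCAT, CODE_BODY_LCAT,
    COMMAND_LCAT, COMMENT_BODY_LCAT, CONT_DEFINITION_LCAT, DEFINITIONS_LCAT,
    HEADING_START_LCAT, INTERFACE_BODY_LCAT, INTERFACE_LCAT,
    MACRO_DEFINITION_LCAT, PARAGRAPH_START_LCAT, PREFORM_GRAMMAR_LCAT,
    PREFORM_LCAT, PURPOSE_BODY_LCAT, PURPOSE_LCAT, SECTION_HEADING_LCAT,
    SOURCE_DISPLAY_LCAT, TEXT_EXTRACT_LCAT, TYPEDEF_LCAT
};

/* Hypothetical helper: 1 for the three categories named above as needed only
   by the version 1 syntax, 0 for everything else. */
int needed_only_by_version_1(int cat) {
    assert(cat >= NO_LCAT && cat <= TYPEDEF_LCAT);
    switch (cat) {
        case BAR_LCAT:
        case INTERFACE_BODY_LCAT:
        case INTERFACE_LCAT:
            return 1;
    }
    return 0;
}
```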
<p class="inwebparagraph"><a id="SP4"></a><b>&#167;4. </b>We want to print these out nicely for the sake of a <code class="display"><span class="extract">-scan</span></code> analysis run
of Inweb:
</p>
<pre class="display">
<span class="reserved">char</span><span class="plain"> *</span><span class="functiontext">Lines::category_name</span><span class="plain">(</span><span class="reserved">int</span><span class="plain"> </span><span class="identifier">cat</span><span class="plain">) {</span>
<span class="reserved">switch</span><span class="plain"> (</span><span class="identifier">cat</span><span class="plain">) {</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">NO_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"(uncategorised)"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">BAR_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"BAR"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">BEGIN_CODE_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"BEGIN_CODE"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">BEGIN_DEFINITION_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"BEGIN_DEFINITION"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">C_LIBRARY_INCLUDE_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"C_LIBRARY_INCLUDE"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">CHAPTER_HEADING_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"CHAPTER_HEADING"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">CODE_BODY_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"CODE_BODY"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">COMMAND_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"COMMAND"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">COMMENT_BODY_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"COMMENT_BODY"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">CONT_DEFINITION_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"CONT_DEFINITION"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">DEFINITIONS_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"DEFINITIONS"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">HEADING_START_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"HEADING_START"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">INTERFACE_BODY_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"INTERFACE_BODY"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">INTERFACE_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"INTERFACE"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">MACRO_DEFINITION_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"MACRO_DEFINITION"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">PARAGRAPH_START_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"PARAGRAPH_START"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">PREFORM_GRAMMAR_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"PREFORM_GRAMMAR"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">PREFORM_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"PREFORM"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">PURPOSE_BODY_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"PURPOSE_BODY"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">PURPOSE_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"PURPOSE"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">SECTION_HEADING_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"SECTION_HEADING"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">SOURCE_DISPLAY_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"SOURCE_DISPLAY"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">TEXT_EXTRACT_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"TEXT_EXTRACT"</span><span class="plain">;</span>
<span class="reserved">case</span><span class="plain"> </span><span class="constant">TYPEDEF_LCAT</span><span class="plain">: </span><span class="reserved">return</span><span class="plain"> </span><span class="string">"TYPEDEF"</span><span class="plain">;</span>
<span class="plain">}</span>
<span class="reserved">return</span><span class="plain"> </span><span class="string">"(?unknown)"</span><span class="plain">;</span>
<span class="plain">}</span>
</pre>
<p class="inwebparagraph"></p>
<p class="endnote">The function Lines::category_name is used in 3/ta (<a href="3-ta.html#SP1_1">&#167;1.1</a>).</p>
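<p class="inwebparagraph">To suggest how such names might be put to use in a listing, here is a small
self-contained sketch. It is not the code in 3/ta: the <code class="display"><span class="extract">demo_*</span></code> names, the three
stand-in categories and the "NAME: text" layout are all assumptions made for
the example.
</p>

```c
#include <assert.h>
#include <stdio.h>

/* Three stand-in categories, so that the sketch is self-contained. */
enum { DEMO_NO_LCAT = 0, DEMO_BAR_LCAT, DEMO_CODE_BODY_LCAT };

/* Cut-down analogue of Lines::category_name for the stand-in categories. */
const char *demo_category_name(int cat) {
    switch (cat) {
        case DEMO_NO_LCAT: return "(uncategorised)";
        case DEMO_BAR_LCAT: return "BAR";
        case DEMO_CODE_BODY_LCAT: return "CODE_BODY";
    }
    return "(?unknown)";
}

/* Writes "NAME: text" into out. This layout is invented for the sketch and
   is not claimed to match what -scan really prints. Returns what snprintf
   returns: the length the full report needs, excluding the terminator. */
int demo_scan_report(char *out, size_t size, int cat, const char *text) {
    assert(out != NULL && text != NULL);
    return snprintf(out, size, "%s: %s", demo_category_name(cat), text);
}
```

<p class="inwebparagraph">As in the real function, a value outside the enumeration falls through the
<code class="display"><span class="extract">switch</span></code> and is reported as unknown rather than causing an error.
</p>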
<p class="inwebparagraph"><a id="SP5"></a><b>&#167;5. Command codes. </b>Command-category lines are further divided up into the following. Again,
some of these fell into disuse in version 2 syntax.
</p>
<pre class="definitions">
<span class="definitionkeyword">enum</span> <span class="constant">NO_CMD</span><span class="definitionkeyword"> from </span><span class="constant">0</span>
<span class="definitionkeyword">enum</span> <span class="constant">PAGEBREAK_CMD</span>
<span class="definitionkeyword">enum</span> <span class="constant">GRAMMAR_INDEX_CMD</span>
<span class="definitionkeyword">enum</span> <span class="constant">FIGURE_CMD</span>
<span class="definitionkeyword">enum</span> <span class="constant">TAG_CMD</span>
</pre>
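<p class="inwebparagraph">A command line's text has to be turned into one of these codes somewhere. The
following is only a sketch of that idea: <code class="display"><span class="extract">command_code_for</span></code> is a hypothetical
helper, and the spellings it recognises are assumptions made for the example
rather than a statement of what Inweb accepts.
</p>

```c
#include <assert.h>
#include <string.h>

enum command_code { NO_CMD = 0, PAGEBREAK_CMD, GRAMMAR_INDEX_CMD, FIGURE_CMD, TAG_CMD };

/* Hypothetical helper: maps the text found between [[ and ]] to a command
   code. The spellings tested here are assumptions made for the sketch;
   anything unrecognised comes back as NO_CMD. */
int command_code_for(const char *text) {
    assert(text != NULL);
    if (strcmp(text, "Page Break") == 0) return PAGEBREAK_CMD;
    if (strcmp(text, "Grammar Index") == 0) return GRAMMAR_INDEX_CMD;
    if (strncmp(text, "Figure:", 7) == 0) return FIGURE_CMD;  /* operand follows the colon */
    if (strncmp(text, "Tag:", 4) == 0) return TAG_CMD;        /* operand follows the colon */
    return NO_CMD;
}
```

<p class="inwebparagraph">Returning <code class="display"><span class="extract">NO_CMD</span></code> for unrecognised text matches the default which
<code class="display"><span class="extract">Lines::new_source_line</span></code> gives to <code class="display"><span class="extract">command_code</span></code>, so a caller could treat that value
as "no valid command found".
</p>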
<hr class="tocbar">
<ul class="toc"><li><a href="2-tr.html">Back to 'The Reader'</a></li><li><a href="2-tp.html">Continue with 'The Parser'</a></li></ul><hr class="tocbar">
<!--End of weave-->
</main>
</body>
</html>