How to use Inweb to weave or tangle a web already written.
- §1. All-in-one webs
- §4. Multi-section webs
- §7. Tangling
- §10. Weaving
- §13. Weave tags
- §14. Modules
- §17. The section catalogue
- §18. Makefile
- §19. Gitignore
- §20. README files
- §23. Semantic version numbering and build metadata
§1. All-in-one webs. A program written for use with Inweb is called a "web". Inweb was primarily designed for large, multi-section webs, but it can also be used in a much simpler way on smaller webs. In this documentation we'll call those "all-in-one webs", meaning that there is just a single source code file for the program.
Such a file should be a UTF-8 encoded plain text file with the file extension .inweb. The following is a "hello world" example, which can be found in the Inweb distribution as inweb/Examples/hellow.inweb:
Title: hellow
Author: Graham Nelson
Purpose: A minimal example of a C program written for inweb.
Language: C
@ =
#include <stdio.h>
int main(int argc, char *argv[]) {
    printf("Hello world!\n");
}
§2. This of course is just a regular C "hello world" program written below the @ = marker, and some metadata written above it. The metadata above is called the "contents section": for a larger web, it would expand out to something more like a contents page, though here it's more like a title page. The Title, Author and Purpose make no functional difference to the program produced - they are purely descriptive - but the Language setting is another matter, as we shall see.
The contents end, and the code begins, when the first "paragraph" begins. Code in an Inweb web is divided into paragraphs. The core Inform compiler currently has 8362 paragraphs, whereas hellow has just one. (If you are reading this documentation in a web page or a PDF, you will see that it's divided up into little numbered sections: those are individual paragraphs from the inweb web.) More on this below, but the use of an @ character in column 1 of the web file is what marks a paragraph break.
As mentioned earlier, there are two basic things we can do with a web: tangle, to make a program ready to compile and run; and weave, to make a comfortably legible version for human eyes instead. Let's now tangle:
$ inweb/Tangled/inweb inweb/Examples/hellow.inweb -tangle
web "hellow": 1 section(s) : 1 paragraph(s) : 9 line(s)
tangling <inweb/Examples/hellow.c> (written in C)
And inweb/Examples/hellow.c is now a regular C program which can then be compiled. If we had wanted it to be written somewhere else, or called something else, we could have used -tangle-to F to specify a file F to create instead.
In general, you never need to look at or edit tangled code, but if we take a look at this one to see what has happened, two things are worth noting.
- (a) First, the use of the #line C preprocessor feature, which ensures that any compilation errors occurring will be reported at the correct point of origin in the original Inweb file, not in the tangled file.
- (b) Secondly, notice that the main function has automatically been predeclared at the top of the file. Because Inweb does this for C programs, the programmer can freely call functions defined lower down in the source code, without having to write tiresome predeclarations or header files. (As it happens, there was no need in the case of main, but nor was there any harm.)
/* Tangled output generated by inweb: do not edit */
#include <stdio.h>
#line 9 "inweb/Examples/hellow.inweb"
int main(int argc, char *argv[]) ;
#line 8 "inweb/Examples/hellow.inweb"
int main(int argc, char *argv[]) {
    printf("Hello world!\n");
}
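Both effects noted in (a) and (b) can be reproduced by hand in a few lines of standard C. What follows is only an illustrative sketch, not something Inweb produced: the file name example.inweb, the line number 100 and the function names are all made up for the demonstration.

```c
#include <assert.h>

/* A predeclaration of the kind a tangler places near the top of the file:
   it lets code call reported_line() before the definition appears. */
int reported_line(void);

/* Uses reported_line() before it is defined, which is legal only
   because of the predeclaration above. */
int line_seen_by_caller(void) {
    int n = reported_line();
    assert(n > 0); /* line numbers are always positive */
    return n;
}

/* From the next line on, the compiler believes it is reading line 100 of a
   (hypothetical) file called example.inweb, and reports errors accordingly. */
#line 100 "example.inweb"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```

Calling line_seen_by_caller() here gives 100 rather than the function's true position in the C file, which is exactly the redirection that makes compiler errors point back into the web.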
§3. So much for tangling: we can also weave. hellow is so uninteresting to look at that this seems a good point to switch to inweb/Examples/twinprimes.inweb, a C program to find twin prime numbers. If we weave:
$ inweb/Tangled/inweb inweb/Examples/twinprimes.inweb -weave
web "twinprimes": 1 section(s) : 4 paragraph(s) : 48 line(s)
[Complete Program: HTML -> inweb/Examples/twinprimes.html]
As with tangling, we can override this destination with -weave-to F, telling Inweb to weave into just a single file (which in this instance it was going to do anyway) and call it F; or we can similarly -weave-into D, telling Inweb to weave a set of files into the directory D, rather than the usual Woven subdirectory of the web in question.
By default, -weave makes an HTML representation of the program. (On a larger web, with multiple sections, it would make a set of linked pages, but here there's just one.) This can then be looked at with a browser such as Chrome or Safari. HTML is not the only format we can produce. Inweb performs the weave by following a "pattern", and it has several patterns built in, notably HTML, Ebook and TeX.
Running Inweb with -weave-as P tells it to weave with pattern P; the plain command -weave is equivalent to -weave-as HTML. The Ebook pattern makes an EPUB file suitable for readers such as Apple's Books app, but that would be overkill for such a tiny program. Instead:
$ inweb/Tangled/inweb inweb/Examples/twinprimes.inweb -weave-as TeX
This will only work if you have the mathematical typesetting system TeX installed, and in particular, the pdftex tool. (This comes as part of the standard TeXLive distribution, so simply "installing TeX" on your platform will probably install pdftex automatically.) Now the response is like so:
$ inweb/Tangled/inweb inweb/Examples/twinprimes.inweb -weave-as TeX
web "twinprimes": 1 section(s) : 4 paragraph(s) : 48 line(s)
[Complete Program: PDF -> inweb/Examples/twinprimes.tex: 1pp 103K]
Inweb automatically creates twinprimes.tex and runs it through pdftex to produce twinprimes.pdf: it reads over the TeX log file to see how many pages that comes to, and reports back. All being well, the .tex and .log files are silently removed, leaving just twinprimes.pdf behind.
§4. Multi-section webs. The twinprimes.inweb example was a program so small that it could comfortably fit into one source file, but for really large programs, that would be madness. The core Inform compiler, for example, runs to about 210,000 lines of code, and distributes those across 418 source files called "sections", together with a special 419th section which forms its contents page. It's a matter of personal taste how much should be in a section, but an ideal section file might contain 500 to 1000 lines of material and weave to a standalone essay, describing and implementing a single well-defined component of the whole program.
In this documentation, we'll call such webs "multi-section".
A multi-section web is stored as a directory, whose name should be (a short version of) the name of the program. For example, Inweb's own source is in a directory called inweb. A web directory is a tidy, self-contained area in which the program can be written, compiled and used.
Inweb expects that a multi-section web will contain at least two source files, each of which is a UTF-8 encoded text file with the file extension .w. One source file is special, must always be called Contents.w, and must be directly stored in the web directory. All other section files are stored in subdirectories of the web directory:
- (a) If the web is still relatively small, there may only be a few of these, stored in a single subdirectory called Sections.
- (b) Alternatively (not additionally), a larger web can use chapter subdirectories called Manual, Preliminaries, Chapter 1, Chapter 2, ..., Appendix A, Appendix B, ...; preliminaries and appendices being optional. (There can't be a Chapter 0, though there can be Appendix A, B, C, ..., L.)
A multi-section web can contain a variety of other subdirectories as needed. Two in particular, Woven and Tangled, are automatically created by Inweb as needed to store the results of weaving and tangling, respectively: they are not intended to hold any material of lasting value, and can be emptied at any time and regenerated later.
§5. Uniquely, the Contents.w section provides neither typeset output nor compiled code: it is instead a roster telling Inweb about the rest of the web, and how the other sections are organised. It has a completely different syntax from all other sections. (It's essentially a fuller version of the top part of an all-in-one web file as demonstrated above, but now it occupies the whole file.)
The contents section opens with some bibliographic data. For example:
Title: inter
Author: Graham Nelson
Purpose: For handling intermediate Inform code
Language: InC
Licence: Artistic License 2.0
Version Number: 1
Version Name: Axion
This is simply a block of name-value pairs specifying some bibliographic details; there is then a skipped line, and the roster of sections begins.
Note that the program's Title need not be the same as the directory-name for the web, which is useful if the program has a long or file-system-unfriendly name. The Purpose should be brief enough to fit onto one line. Licence can also have the US spelling, License; Inweb treats these as equivalent. Version number and name are, of course, optional.
The Language is the programming language in which the code is written: much more on that later on, but for now, the important ones are probably C, InC and Plain Text.
§6. After the header block of details, then, we have the roster of sections. This is like a contents page — the order is the order in which the sections are presented on any website, or in any of the larger PDFs woven. For a short, unchaptered web, we might have for instance:
Sections
    Program Control
    Command Line and Configuration
    Scan Documentation
    HTML and Javascript
    Renderer
And then Inweb will expect to find, for instance, the section file Scan Documentation.w in the Sections directory.
A larger web, however, won't have a "Sections" directory. It may have a much longer roster, such as:
Preliminaries
    Preface
    Thematic Index
    Licence and Copyright Declaration
    BNF Grammar
Chapter 1: Definitions
"In which some globally-used constants are defined and the standard C libraries are interfaced with, with all the differences between platforms (Mac OS X, Windows, Linux, Solaris, Sugar/XO and so forth) taken care of once and for all."
    Basic Definitions
    Platform-Specific Definitions
... and so on...
Appendix A: The Standard Rules (Independent Inform 7)
"This is the body of Inform 7 source text automatically included with every project run through the Inform compiler, and which defines most of what end users see as the Inform language."
    SR0 - Preamble
    SR1 - Physical World Model
... and so on. Here the sections appear in directories called Preliminaries, Chapter 1, Chapter 2, ..., Appendix A. (There can't be a Chapter 0, though there can be Appendix B, C, ..., L; there can also be a Manual chapter, in the sense of documentation.)
In case of any doubt we can use the following command-line switch to see how Inweb is actually reading the sections of a web W:
$ inweb/Tangled/inweb W -catalogue -verbose
§7. Tangling. At this point, it may be worth experimenting with a second mathematical example: inweb/Examples/goldbach, which is to do with a problem in number theory called the Goldbach Conjecture. This is a multi-section web, though really only for the sake of an example: it's still a very small web.
This is once again a C program. Actually building and running this is a little trouble, of course, and because there are multiple source files, it's not so easy to keep track of whether the program is built up to date. So a convenience of Inweb is that it can make makefiles to help with this:
$ inweb/Tangled/inweb inweb/Examples/goldbach -makefile inweb/Examples/goldbach/goldbach.mk
With this done,
$ make -f inweb/Examples/goldbach/goldbach.mk
tangles and then compiles the program as necessary. The tangling part of that is nothing fancy - as before, it's just
$ inweb/Tangled/inweb inweb/Examples/goldbach -tangle
Assuming all goes well:
$ inweb/Examples/goldbach/Tangled/goldbach
should then print out some results.
§8. It is legal in some circumstances to tangle only part of a web. This is done by specifying a "range", much as will be seen later with weaving - but because it's not normally meaningful to tangle only part of a program, the possible ranges are much more restricted. In fact, the only partial tangles allowed are for chapters or sections marked in the Contents.w as being "Independent". For example:
Appendix A: The Standard Rules (Independent Inform 7)
declares that Appendix A is a sort of sidekick program, written in the language "Inform 7". As a result, it won't be included in a regular -tangle, and to obtain it we have to:
$ inweb/Tangled/inweb inform7 -tangle A
§9. In some C programs, it's useful to require that a header file be added to a tangle. This can be done by adding:
Header: H
to the contents page of a web. The header file H in question should then be stored in the web's Headers subdirectory. (At one time, the Foundation module used this to bring in a Windows-only header file.)
§10. Weaving. As with all-in-one webs, the commands for weaving are like so:
$ inweb inweb/Examples/goldbach -weave
$ inweb inweb/Examples/goldbach -weave-as TeX
This will produce single HTML or PDF files of the woven form of the whole program. (Note that the PDF file now has a cover page: on a web with just a single section, this wouldn't happen.) But with a growing web, that can be cumbersome.
§11. After setting -weave or -weave-as, we can also optionally choose a range. The default range is all, so up to now we have implicitly been running weaves like these:
$ inweb inweb/Examples/goldbach -weave all
$ inweb inweb/Examples/goldbach -weave-as TeX all
The opposite extreme from all is sections. This still weaves the entire web, but now cuts it up into individual files, one for each section. For example,
$ inweb inweb/Examples/goldbach -weave sections
makes a miniature website; files include some CSS, and:
inweb/Examples/goldbach/Woven/index.html
inweb/Examples/goldbach/Woven/S-tgc.html
inweb/Examples/goldbach/Woven/S-tsoe.html
Those abbreviated names S-tgc and S-tsoe are cut down from the full names of the sections involved, "The Goldbach Conjecture" and "The Sieve of Eratosthenes". Similarly,
$ inweb inweb/Examples/goldbach -weave-as TeX sections
creates the files:
inweb/Examples/goldbach/Woven/index.html
inweb/Examples/goldbach/Woven/S-tgc.pdf
inweb/Examples/goldbach/Woven/S-tsoe.pdf
The index file here is a table of contents offering links to the PDFs.
An intermediate level of granularity is the range chapters, which makes sense only for chaptered webs, and puts each chapter into its own file.
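The abbreviated names S-tgc and S-tsoe seen above look as if they are made from the initial letters of each word in the section title. The following sketch implements that guess and nothing more: the function name is hypothetical, the rule is inferred only from the examples in this documentation, and Inweb's real procedure may differ (for instance, in how it would handle two sections whose initials coincide).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Writes the lower-cased initial letter of each word of title into out,
   which holds at most cap bytes including the terminating NUL.
   An assumed rule inferred from the examples, not Inweb's own code. */
void abbreviate_title(const char *title, char *out, size_t cap) {
    assert(title != NULL && out != NULL && cap > 0);
    size_t n = 0;
    int at_word_start = 1;
    for (const char *p = title; *p != '\0'; p++) {
        unsigned char c = (unsigned char) *p;
        if (isspace(c)) {
            at_word_start = 1;
        } else {
            if (at_word_start && n + 1 < cap) out[n++] = (char) tolower(c);
            at_word_start = 0;
        }
    }
    out[n] = '\0';
}
```

Under this rule "The Goldbach Conjecture" becomes tgc and "The Sieve of Eratosthenes" becomes tsoe, matching the file names listed above.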
§12. Ranges can also be used to weave only part of a web:
- (a) In a chaptered web, chapters are abbreviated to just their numbers: for example, the range 2 means "just Chapter 2". The Preliminaries alone is P; the Manual, M. Appendix A, B, C are A, B, C and so on. (This is why Appendices can only run up to L.)
- (b) In an unchaptered web, S means "all the sections". This is almost but not quite the same as all: the cover sheet (a sort of title page) is omitted.
- (c) The abbreviation for a section makes a range of just that section. For example, S/tgc and S/tsoe in the Goldbach web example, or 2/ec for the "Enumerated Constants" section of Chapter 2 of Inweb itself. Note that running Inweb with -catalogue shows all the sections of a web, and their abbreviations. If it's a nuisance that these section ranges are hard to predict, run with -sequential to have them simply be X/s1, X/s2, ..., within each chapter, where X is the chapter range.
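To make the whole-chapter ranges in (a) and (b) concrete, here is a small sketch mapping a range to the subdirectory it refers to. The function name and its return convention are hypothetical, and this is not how Inweb itself parses ranges; it only encodes the rules stated above (digits for chapters, P, M, S, and A to L for appendices).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Maps a whole-chapter range to the subdirectory it names, writing the result
   into out (cap bytes). Returns 1 on success, 0 if the range is not one of the
   whole-chapter forms. Illustrative only: not taken from Inweb's source. */
int chapter_directory(const char *range, char *out, size_t cap) {
    assert(range != NULL && out != NULL && cap > 0);
    size_t len = strlen(range);
    if (len == 0) return 0;

    /* All digits: a numbered chapter. There is no Chapter 0. */
    if (strspn(range, "0123456789") == len) {
        if (range[0] == '0') return 0;
        snprintf(out, cap, "Chapter %s", range);
        return 1;
    }
    if (len != 1) return 0; /* e.g. "2/ec" names a section, not a chapter */
    switch (range[0]) {
        case 'P': snprintf(out, cap, "Preliminaries"); return 1;
        case 'M': snprintf(out, cap, "Manual"); return 1;
        case 'S': snprintf(out, cap, "Sections"); return 1;
    }
    /* Appendices stop at L, since M is already taken by the Manual. */
    if (range[0] >= 'A' && range[0] <= 'L') {
        snprintf(out, cap, "Appendix %c", range[0]);
        return 1;
    }
    return 0;
}
```

So chapter_directory("2", ...) yields "Chapter 2" and "L" yields "Appendix L", while "N" and "0" are rejected.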
§13. Weave tags. An alternative to a range is to specify a tag. Rather than weaving contiguous pieces of the web, this collates together all those paragraphs with a given tag. The result is a booklet of extracts.
Most paragraphs are never tagged. A tag is simply a word; paragraphs can have multiple tags, but for each individual tag they either have it or don't. A very few tags are automatically applied by Inweb:
If the program is for a C-like language, Inweb automatically tags any paragraph containing a typedef struct with the tag Structures. So, for example,
$ inweb/Tangled/inweb inweb -weave-tag Structures
weaves just the structure definitions culled from a much larger web; this can make a convenient reference. Similarly, any paragraph containing an illustration is automatically tagged Figures, and any paragraph in an InC web which defines Preform grammar is automatically tagged Preform. (In the Inform project, this is used to generate the PDF of the formal syntax of the language.)
All other tags must be typed by hand. If the line introducing a paragraph is marked at the end with ^"Fun", then that paragraph will be tagged as Fun, and so on. Paragraphs can have multiple tags:
@ ^"Algorithms" ^"History"
The original version of the program used an in-place insertion sort, but ...
A tag can optionally supply a caption. For example:
@ ^"Algorithms: Sorting rulebooks"
The original version of the program used an in-place insertion sort, but ...
Here the tag is just Algorithms, but when a -weave-tag Algorithms is performed, the caption text "Sorting rulebooks" will be used in a subheading in the resulting booklet.
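The split between tag and caption can be pictured with a short sketch. This is an assumption-laden illustration rather than Inweb's own parser: the function name is hypothetical, and it supposes that the first colon inside the quotes separates the tag from the caption and that spaces after the colon are not part of the caption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Splits a marker such as ^"Algorithms: Sorting rulebooks" into tag and
   caption (each buffer is cap bytes). The caption is left empty when the
   marker has none. Returns 1 on success, 0 if the marker is malformed. */
int parse_tag_marker(const char *marker, char *tag, char *caption, size_t cap) {
    assert(marker != NULL && tag != NULL && caption != NULL && cap > 0);
    size_t len = strlen(marker);
    if (len < 4 || marker[0] != '^' || marker[1] != '"' || marker[len - 1] != '"')
        return 0;

    const char *body = marker + 2; /* text between the quotes */
    size_t body_len = len - 3;
    const char *colon = memchr(body, ':', body_len);
    size_t tag_len = colon ? (size_t) (colon - body) : body_len;
    if (tag_len == 0) return 0;

    snprintf(tag, cap, "%.*s", (int) tag_len, body);
    caption[0] = '\0';
    if (colon != NULL) {
        const char *text = colon + 1;
        const char *end = body + body_len;
        while (text < end && *text == ' ') text++; /* skip spaces after the colon */
        snprintf(caption, cap, "%.*s", (int) (end - text), text);
    }
    return 1;
}
```

Given the marker from the example, this produces the tag Algorithms and the caption "Sorting rulebooks"; given ^"Fun", it produces the tag Fun and an empty caption.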
Beyond that, an entire section can be tagged from the Contents.w page. For example:
Sections
    The Goldbach Conjecture
    The Sieve of Eratosthenes ^"Greek"
tags every paragraph in the section "The Sieve of Eratosthenes" with the tag Greek. In this instance, a caption is not allowed.
Note that if we -weave-tag a tag which does not exist - or rather, which no paragraph in the range has - then rather than producing an empty document, Inweb will halt with an "empty weave request" error.
§14. Modules. Up to now, the webs described have all been self-contained: one web makes one program, and contains the code in its entirety. But Inweb also supports "modules". A module is simply a web which provides a component of a program but is not a program in its own right.
For example, all of the Inform tools (including Inweb itself) make use of a module called foundation, which is written in InC and provides facilities for managing memory, manipulating strings, filenames, and so on. On the other hand, the Inform project also includes a module called inter which is used only by the core compiler inform7 and by a wrapper utility also called inter; in fact, inform7 is entirely divided up into modules, some of which are used only by itself.
§15. It makes little sense to tangle a module on its own. Instead, a web which wishes to use a module needs to declare this on its Contents.w page. This is done with a list of "imports", after the metadata but before the list of sections. For example,
Import: foundation
Chapter 1
    Startup
...and so on. When this new web is tangled, the module's code will be tangled into it. Any functions or variables defined in the module will thus be available to the new web.
However, it makes perfectly good sense to weave a module. For example:
$ inweb/Tangled/inweb inweb/foundation-module -weave sections
§16. That's everything there is to say about modules, except where Inweb looks to find them. When it reads a request from a web W to import a module M, it looks for a web directory called M-module (note the hyphen). For example, Import: fruit would look for the directory fruit-module. Inweb tries the following locations, in sequence, until it finds it:
- (1) Directly inside W.
- (2) In the directory containing W (i.e., one directory higher up).
- (3) Directly inside Inweb's own web directory.
- (4) In the directory specified by -import-from D at the command line, if any.
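The search order above can be written out as a list of candidate paths. The sketch below only builds the paths, in the order (1) to (4); it does not touch the file system, and the function name, the fixed buffer size and the use of '/' as the separator are all assumptions made for illustration rather than details of Inweb.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODULE_PATH_MAX 256

/* Fills out[] with the places to try for module M imported by web W, in the
   order listed above, and returns how many there are (3, or 4 if import_from
   is given). All paths use '/' separators. A sketch, not Inweb's own code. */
int module_candidates(const char *web, const char *inweb_dir,
                      const char *import_from, const char *module,
                      char out[4][MODULE_PATH_MAX]) {
    assert(web != NULL && inweb_dir != NULL && module != NULL);
    int n = 0;

    /* (1) Directly inside W. */
    snprintf(out[n++], MODULE_PATH_MAX, "%s/%s-module", web, module);

    /* (2) In the directory containing W: drop W's last path component. */
    const char *slash = strrchr(web, '/');
    if (slash != NULL)
        snprintf(out[n++], MODULE_PATH_MAX, "%.*s/%s-module",
                 (int) (slash - web), web, module);
    else
        snprintf(out[n++], MODULE_PATH_MAX, "%s-module", module);

    /* (3) Directly inside Inweb's own web directory. */
    snprintf(out[n++], MODULE_PATH_MAX, "%s/%s-module", inweb_dir, module);

    /* (4) In the -import-from directory, if one was given. */
    if (import_from != NULL)
        snprintf(out[n++], MODULE_PATH_MAX, "%s/%s-module", import_from, module);
    return n;
}
```

For a web at projects/myweb importing fruit, with Inweb at inweb and -import-from lib, the candidates come out as projects/myweb/fruit-module, projects/fruit-module, inweb/fruit-module and lib/fruit-module; a caller would take the first one that exists.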
§17. The section catalogue. Inweb can do a handful of other things. One is to list the contents of a web:
- (a) -catalogue (or -catalog) lists the sections in the web.
- (b) -structures lists the sections, and all of the structure definitions made in them (for C-like languages).
- (c) -functions lists the sections, with all structure definitions and also all function definitions.
In addition, for debugging purposes, -scan shows how Inweb is parsing lines of source code in the web, and -verbose makes it generally print out more descriptive output.
§18. Makefile. As mentioned earlier, Inweb can construct a suitable makefile for a web:
$ inweb/Tangled/inweb W -makefile M
creates a makefile for the web W and stores it in M. For example,
$ inweb/Tangled/inweb inweb -makefile inweb/inweb.mk
The makefile is constructed using a prototype file called a "makescript". Ordinarily the script used will be the one stored in
W/makescript.txt
or, if no such file exists, the default one stored in Inweb:
inweb/Materials/makescript.txt
but this can be changed by using -prototype S, which tells Inweb to use S as the script. If a -prototype is given, then there's no need to specify any one web for Inweb to use: this allows Inweb to construct more elaborate makefiles for multi-web projects. (This is how the main makefile for the Inform project is constructed.)
To see how makescripts work, it's easiest simply to look at the default one.
§19. Gitignore. A similar convenience exists for users who want to use the git source control tool with a web: for example, uploading it to Github.
The files produced by weaving or tangling a web are not significant and should probably not be subject to source control: they should be "ignored", in git terminology. This means writing a special file called .gitignore which specifies the files to be ignored. The following does so for a web W:
$ inweb/Tangled/inweb W -gitignore W/.gitignore
Once again, Inweb does this by working from a script, this time called gitignorescript.txt.
§20. README files. Repositories at Github customarily have README.md files, in Markdown syntax, explaining what they are. These should probably include current version numbers, and it's a pain keeping those up to date. For really complicated repositories, containing multiple webs, some automation is essential, and once again Inweb can oblige.
$ inweb/Tangled/inweb W -write-me W/README.md
expands a script called READMEscript.txt into README.md. Alternatively, the script can be specified explicitly:
$ inweb/Tangled/inweb W -prototype MySpecialThang.txt -write-me W/README.md
§21. Everything in the script is copied over verbatim except where the @ character is used, which was chosen because it isn't significant in Github's form of Markdown. @name(args) is like a function call (or, in more traditional language, a macro): it expands out to something depending on the arguments. args is a comma-separated list of fragments of text, which can themselves contain further uses of @. (If these fragments of text need to contain commas or brackets, they can be put into single quotes: @thus(4,',') has two arguments, 4 and ,.) Three functions are built in:
- (a) @version(A) expands to the version number of A, which is normally the path to a web; it then produces the value of the [[Version Number]] for that web. But A can also be the filename of an Inform extension, provided that it ends in .i7x, or a few other Inform-specific things for which Inweb is able to deduce a version number.
- (b) @purpose(A) is the same, but for the [[Purpose]] of a web. It's blank for everything else.
- (c) @var(A,D) is more general, and reads the bibliographic datum D from the web indicated by A. In fact, @version(A) is an abbreviation for @var(A,Version Number) and @purpose(A) for @var(A,Purpose), so this is really the only one needed.
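The quoting rule for arguments, under which @thus(4,',') has the two arguments 4 and a comma, can be illustrated with a small splitter. This is a sketch under stated assumptions, not Inweb's implementation: the function name and the 64-byte argument limit are hypothetical, and it assumes that commas inside nested brackets belong to an inner function call and so do not split the outer list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARG_LEN 64

/* Splits args at top-level commas into out[], returning the number of
   arguments found (at most max). Single quotes protect commas and brackets
   and are not copied; commas inside nested (...) do not split. */
int split_macro_args(const char *args, char out[][MAX_ARG_LEN], int max) {
    assert(args != NULL && out != NULL && max > 0);
    int count = 0, depth = 0, in_quotes = 0;
    size_t len = 0;
    for (const char *p = args; ; p++) {
        char c = *p;
        if (c == '\0' || (c == ',' && !in_quotes && depth == 0)) {
            out[count][len] = '\0'; /* finish the current argument */
            count++;
            len = 0;
            if (c == '\0' || count == max) break;
            continue;
        }
        if (c == '\'') { in_quotes = !in_quotes; continue; }
        if (!in_quotes && c == '(') depth++;
        if (!in_quotes && c == ')' && depth > 0) depth--;
        if (len + 1 < MAX_ARG_LEN) out[count][len++] = c;
    }
    return count;
}
```

Passing the text between the outer brackets, 4,',' splits into 4 and a lone comma, while @b(1,2),3 splits into @b(1,2) and 3, leaving the inner call intact for later expansion.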
§22. It is also possible to define new functions. For example:
@define book(title, path, topic)
* @title - @topic. Ebook in Indoc format, stored at path @path.
@end
The definition lies between @define and @end commands. This one takes three parameters, and inside the definition, their values can be referred to as @title, @path and @topic. Functions are free to use other functions:
@define primary(program, language)
* @program - @purpose(@program) - __@version(@program)__
@end
However, each function needs to have been defined before any line on which it is actually expanded. A definition of one function A can refer to another function B not yet defined; but any actual use of A must be made after both A and B have been defined. So, basically, declare before use.
§23. Semantic version numbering and build metadata. When Inweb reads in a web, it also looks for a file called build.txt in the web's directory; if that isn't there, it looks for the same file in the current working directory; if that's not there either, never mind.
Such a file contains up to three text fields, all optional:
Prerelease: alpha.1
Build Date: 23 March 2020
Build Number: 6Q26
The bibliographic variables Prerelease and so on are then set from this file. (They can equally well be set by the Contents section of the web, and if so then that takes priority.)
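Each line of such a file is a name, a colon, and a value. As a rough sketch of how one line might be read (the function name is hypothetical and this is not Inweb's own reader), splitting at the first colon is enough:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Splits a line of the form "Name: value" at its first colon, copying the two
   halves into name and value (cap bytes each) with spaces after the colon
   skipped. Returns 1 on success, 0 if there is no colon or no name. */
int parse_field(const char *line, char *name, char *value, size_t cap) {
    assert(line != NULL && name != NULL && value != NULL && cap > 0);
    const char *colon = strchr(line, ':');
    if (colon == NULL || colon == line) return 0;

    snprintf(name, cap, "%.*s", (int) (colon - line), line);
    const char *rest = colon + 1;
    while (*rest == ' ') rest++;
    size_t n = strcspn(rest, "\r\n"); /* stop at the end of the line */
    snprintf(value, cap, "%.*s", (int) n, rest);
    return 1;
}
```

Applied to the line "Build Date: 23 March 2020", this gives the name Build Date and the value 23 March 2020.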
The Prerelease and Build Number, if given, are used in combination with the Version Number (set in the Contents) to produce the semantic version number, or semver, for the web. For example, if the Contents included:
Version Number: 6.2.12
then the semver would be 6.2.12-alpha.1+6Q26. This is accessible within the web as the variable Semantic Version Number.
For more on semvers, see: https://semver.org
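The combination described above follows the usual semver layout: the version number, then a hyphen and the prerelease if there is one, then a plus sign and the build number if there is one. A minimal sketch, with a hypothetical function name and not drawn from Inweb's source:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes version[-prerelease][+build] into out (cap bytes). prerelease and
   build may be NULL or empty, in which case their part is omitted. */
void compose_semver(const char *version, const char *prerelease,
                    const char *build, char *out, size_t cap) {
    assert(version != NULL && out != NULL && cap > 0);
    int has_pre = (prerelease != NULL && strlen(prerelease) > 0);
    int has_build = (build != NULL && strlen(build) > 0);
    snprintf(out, cap, "%s%s%s%s%s", version,
             has_pre ? "-" : "", has_pre ? prerelease : "",
             has_build ? "+" : "", has_build ? build : "");
}
```

With the values from the example, compose_semver("6.2.12", "alpha.1", "6Q26", ...) writes 6.2.12-alpha.1+6Q26; with no build.txt fields at all, the result is just 6.2.12.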
§24. A special advancing mechanism exists to update build numbers and dates. Running Inweb with -advance-build W checks the build date for web W: if it differs from today, then it is changed to today, and the build code is advanced by one.
Running -advance-build-file B does this for a stand-alone build file B, without need of a web.