<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>pyWeb Literate Programming</title>
    <meta name="generator" content="BBEdit 6.5.2" />
    <meta name="author" content="Steven F. Lott" />
    <link rel="StyleSheet" href="pyweb.css" type="text/css" />
</head>
<body>

<!-- title page -->
<p class="title"><em>pyWeb</em></p>
<p class="title">In Python, Yet Another Literate Programming Tool</p>
<p class="subtitle"><a href="mailto:s_lott@@yahoo.com">Steven F. Lott</a></p>

<h1>Table of Contents</h1>
<!--TOC-->

<h1>Introduction</h1>
<p>Literate programming was pioneered by Knuth as a method for
developing readable, understandable presentations of programs.
These present a program in a literate fashion for people
to read and understand, in parallel with a presentation as source text
for a compiler to process.  Both are generated from a common source file.
</p>
<p>
One intent is to synchronize the program source with the
documentation about that source.  If the program and the documentation
have a common origin, then the traditional gaps between intent 
(expressed in the documentation) and action (expressed in the
working program) are significantly reduced.
</p>
<p>Numerous tools have been developed based on Knuth's initial
work.  A relatively complete survey is available at sites
like <a href="http://www.literateprogramming.com/">Literate Programming</a>, 
and the OASIS
<a href="http://www.oasis-open.org/cover/xmlLitProg.html">XML Cover Pages: Literate Programming with SGML and XML</a>.
</p>
<p>The immediate predecessors to this <em>pyWeb</em> tool are 
<a href="http://www.ross.net/funnelweb/"><i>FunnelWeb</i></a>,
<a href="http://www.eecs.harvard.edu/~nr/noweb/"><i>noweb</i></a> and 
<a href="http://sourceforge.net/projects/nuweb/"><i>nuweb</i></a>.  The ideas lifted from these other
tools created the foundation for <em>pyWeb</em>.
</p>
<p>There are several Python-oriented literate programming tools.  
These include 
<a href="http://personalpages.tds.net/~edream/front.html"><i>LEO</i></a>, 
<a href="http://interscript.sourceforge.net/"><i>interscript</i></a>, 
<a href="http://www.danbala.com/python/lpy/"><i>lpy</i></a>, 
<a href="http://www.egenix.com/files/python/SoftwareDescriptions.html#py2html.py"><i>py2html</i></a>.
</p>
<p>The <i>FunnelWeb</i> tool is independent of any programming language
and only mildly dependent on T<small>E</small>X.
It has 19 commands, many of which duplicate features of HTML or 
L<sup><small>A</small></sup>T<small>E</small>X.
</p>
<p>The <i>noweb</i> tool was written by Norman Ramsey.
This tool uses a sophisticated multi-processing framework, via Unix
pipes, to permit flexible manipulation of the source file to tangle
and weave the programming language and documentation markup files.
</p>
<p>The <i>nuweb</i> Simple Literate Programming Tool was developed by
Preston Briggs (preston@@tera.com).  His work was supported by ARPA,
through ONR grant N00014-91-J-1989.  It is written
in C, and very focused on producing L<sup><small>A</small></sup>T<small>E</small>X documents.  It can 
produce HTML, but this is clearly added after the fact.  It cannot be 
easily extended, and is not object-oriented.
</p>
<p>The <i>LEO</i> tool is a structured GUI editor for creating
source.  It uses XML and <i>noweb</i>-style chunk management.  It is more
than a simple weave and tangle tool.</p>
<p>The <i>interscript</i> tool is very large and sophisticated, but doesn't gracefully
tolerate HTML markup in the document.  It can create a variety of 
markup languages from the interscript source, making it suitable for
creating HTML as well as L<sup><small>A</small></sup>T<small>E</small>X.</p>
<p>The <i>lpy</i> tool can produce very complex HTML representations of
a Python program.  It works by locating documentation markup embedded
in Python comments and docstrings.  This is called inverted literate 
programming.</p>
<p>The <i>py2html</i> tool does very sophisticated syntax coloring.</p>

<h2>Background</h2>

<p>The following is an almost verbatim quote from Briggs' <i>nuweb</i> documentation, and provides an apt summary of Literate Programming.</p>

<div>
<p class="quote">In 1984, Knuth introduced the idea of <em>literate programming</em> and
described a pair of tools to support the practise (Donald E. Knuth, <i>Literate Programming</i>, The Computer Journal <b>27</b> (1984), no. 2, 97-111.)
His approach was to combine Pascal code with T<small>E</small>X documentation to
produce a new language, <tt>WEB</tt>, that offered programmers a superior
approach to programming. He wrote several programs in <tt>WEB</tt>,
including <tt>weave</tt> and <tt>tangle</tt>, the programs used to support
literate programming.
The idea was that a programmer wrote one document, the web file, that
combined documentation written in T<small>E</small>X (Donald E. Knuth, 
<i>The T<small>E</small>Xbook</i>, Computers and Typesetting, 1986) with code (written in Pascal).
</p>
<p class="quote">
Running <tt>tangle</tt> on the web file would produce a complete
Pascal program, ready for compilation by an ordinary Pascal compiler.
The primary function of <tt>tangle</tt> is to allow the programmer to
present elements of the program in any desired order, regardless of
the restrictions imposed by the programming language. Thus, the
programmer is free to present his program in a top-down fashion,
bottom-up fashion, or whatever seems best in terms of promoting
understanding and maintenance.
</p>
<p class="quote">
Running <tt>weave</tt> on the web file would produce a T<small>E</small>X file, ready
to be processed by T<small>E</small>X. The resulting document included a variety of
automatically generated indices and cross-references that made it much
easier to navigate the code. Additionally, all of the code sections
were automatically prettyprinted, resulting in a quite impressive
document. 
</p>
<p class="quote">
Knuth also wrote the programs for T<small>E</small>X and <small><i>METAFONT</i></small>
entirely in <tt>WEB</tt>, eventually publishing them in book
form. These are probably the
largest programs ever published in a readable form.
</p>
</div>

<h2><em>pyWeb</em></h2>
<p><em>pyWeb</em> works with any 
programming language and any markup language.  This philosophy
comes from <i>FunnelWeb</i>,
<i>noweb</i> and <i>nuweb</i>.  The primary differences
between <em>pyWeb</em> and other tools are the following.</p>
<ul>
<li><em>pyWeb</em> is object-oriented, permitting easy extension.  
<i>noweb</i> extensions
are separate processes that communicate through a sophisticated protocol.
<i>nuweb</i> is not easily extended without rewriting and recompiling
the C programs.</li>
<li><em>pyWeb</em> is built in the very portable Python programming 
language.  This allows it to run anywhere that Python 2.1 runs, with
no additional tool or compiler dependencies.  This makes it a useful
tool for Python, Perl and Tcl programmers.</li>
<li><em>pyWeb</em> is much simpler than <i>FunnelWeb</i>, <i>LEO</i> or <i>Interscript</i>.  It has 
a very limited selection of commands, but can still produce 
complex programs and HTML documents.</li>
<li><em>pyWeb</em> does not invent its own markup language like <i>Interscript</i>.
Because <i>Interscript</i> has its own markup, it can generate L<sup><small>A</small></sup>T<small>E</small>X, HTML or other
output formats from a unique input format.  While this is powerful, it seems simpler to
avoid inventing yet another sophisticated markup language.  The language <em>pyWeb</em>
uses is very simple, and authors use their preferred markup language almost
exclusively.</li>
<li><em>pyWeb</em> supports the forward literate programming philosophy, 
where a source document creates programming language and markup language.
The alternative, deriving the document from markup embedded in 
program comments, called inverted literate programming, seems less appealing.
The disadvantage of inverted literate programming is that the final document
must describe the program elements in the exact order they are presented
to the compiler or interpreter.  This is not always the best way to explain 
a program, and a more flexible literate programming tool can prepare 
files in an order appropriate for successful exposition, freed from compiler
restrictions.</li>
<li><em>pyWeb</em> also specifically rejects some features of <i>nuweb</i>
and <i>FunnelWeb</i>.  These include the macro capability with parameter
substitution, and multiple references to a chunk.  These two capabilities
can be used to grow object-like applications from non-object programming
languages (<em>e.g.</em> C or Pascal).  Since most modern languages (Python,
Java, C++) are object-oriented, this macro capability is more of a problem
than a help.</li>
<li>Since <em>pyWeb</em> is built in the Python interpreter, a source document
can include Python expressions that are evaluated during weave operation to
produce time stamps, source file descriptions or other information in the woven 
or tangled output.</li>
</ul>

<p><em>pyWeb</em> works with any programming language and any markup language.
The initial release supports HTML completely, and 
L<sup><small>A</small></sup>T<small>E</small>X approximately.  The biggest
gap in the L<sup><small>A</small></sup>T<small>E</small>X support is 
a complete lack of understanding of the original markup in <i>nuweb</i>, and the
very real damage done to that markup when creating <em>pyWeb</em>.
</p>

<p>The following is extensively quoted from Briggs' <i>nuweb</i> documentation, 
and provides an excellent background in the advantages of the very
simple approach started by <i>nuweb</i> and adopted by <em>pyWeb</em>.</p>

<div>
<p class="quote">
The need to support arbitrary
programming languages has many consequences:</p>
<dl class="quote">
<dt>No prettyprinting</dt><dd class="quote"> Both <tt>WEB</tt> and <tt>CWEB</tt> are able to
  prettyprint the code sections of their documents because they
  understand the language well enough to parse it. Since we want to use
  <em>any</em> language, we've got to abandon this feature.
  However, we do allow particular individual formulas or fragments
  of L<sup><small>A</small></sup>T<small>E</small>X
  or HTML code to be formatted and still be part of the output files.</dd>
<dt>Limited index of identifiers</dt><dd class="quote"> Because <tt>WEB</tt> knows about Pascal,
  it is able to construct an index of all the identifiers occurring in
  the code sections (filtering out keywords and the standard type
  identifiers). Unfortunately, this isn't as easy in our case. We don't
  know what an identifier looks like in each language and we certainly
  don't know all the keywords.  We provide a mechanism to mark 
  identifiers, and we use a pretty standard pattern for recognizing
  identifiers in most programming languages.</dd>
</dl>
<p class="quote">
Of course, we've got to have some compensation for our losses or the
whole idea would be a waste. Here are the advantages I [Briggs] can see:
</p>

<dl class="quote">
    <dt>Simplicity</dt>
        <dd class="quote">The majority of the commands in <tt>WEB</tt> are concerned with control of the automatic prettyprinting. Since we don't prettyprint, many commands are eliminated. A further set of commands is subsumed by L<sup><small>A</small></sup>T<small>E</small>X  and may also be eliminated. As a result, our set of commands is reduced to only about seven members (explained in the next section). This simplicity is also reflected in the size of this tool, which is quite a bit smaller than the tools used with other approaches.</dd>

    <dt>No prettyprinting</dt>
        <dd class="quote">Everyone disagrees about how their code should look, so automatic formatting annoys many people. One approach is to provide ways to control the formatting. Our approach is simpler -- we perform no automatic formatting and therefore allow the programmer complete control of code layout.</dd>

    <dt>Control</dt>
        <dd class="quote">We also offer the programmer reasonably complete control of the layout of his output files (the files generated during tangling). Of course, this is essential for languages that are sensitive to layout; but it is also important in many practical situations, <em>e.g.</em>, debugging.</dd>

    <dt>Speed</dt>
        <dd class="quote">Since [<em>pyWeb</em>] doesn't do too much, it runs very quickly. It combines the functions of <tt>tangle</tt> and <tt>weave</tt> into a single program that performs both functions at once.</dd>

    <dt>Chunk numbers</dt>
        <dd class="quote">Inspired by the example of <i>noweb</i>, [<em>pyWeb</em>] refers to all program code chunks by a simple, ascending sequence number through the file.  This becomes the HTML anchor name, also.</dd>

    <dt>Multiple file output</dt>
        <dd class="quote">The programmer may specify more than one output file in a single [<em>pyWeb</em>] source file. This is required when constructing programs in a combination of languages (say, Fortran and C). It's also an advantage when constructing very large programs.</dd>
</dl>

</div>

<h2>Use Cases</h2>
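<p><em>pyWeb</em> supports two basic use cases, <i>Tangle Source Files</i> and <i>Weave Documentation</i>.
These are often combined into a single request of the application that will both
weave and tangle.  A third use case, <i>Tangle, Regression Test and Weave</i>, runs the two steps
separately so that regression test output can be included in the woven document.</p>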
<h3>Tangle Source Files</h3>
<p>A user initiates this process when they have a complete <tt>.w</tt> file that contains 
a description of source files.  These source files are described with <tt>@@o</tt> commands
in the <tt>.w</tt> file.</p>
<p>The use case is successful when the source files are produced.</p>
<p>Outside this use case, the user will debug those source files, possibly updating the
<tt>.w</tt> file.  This will lead to a need to restart this use case.</p>
<p>The use case is a failure when the source files cannot be produced, due to 
errors in the <tt>.w</tt> file.  These must be corrected based on information in log messages.</p>
<p>The sequence is simply <tt>./pyweb.py <i>theFile</i>.w</tt>.</p>

<h3>Weave Documentation</h3>
<p>A user initiates this process when they have a <tt>.w</tt> file that contains 
a description of a document to produce.  The document is described by the entire
<tt>.w</tt> file.</p>
<p>The use case is successful when the documentation file is produced.</p>
<p>Outside this use case, the user will edit the documentation file, possibly updating the
<tt>.w</tt> file.  This will lead to a need to restart this use case.</p>
<p>The use case is a failure when the documentation file cannot be produced, due to 
errors in the <tt>.w</tt> file.  These must be corrected based on information in log messages.</p>
<p>The sequence is simply <tt>./pyweb.py <i>theFile</i>.w</tt>.</p>

<h3>Tangle, Regression Test and Weave</h3>
<p>A user initiates this process when they have a <tt>.w</tt> file that contains 
a description of a document to produce.  The document is described by the entire
<tt>.w</tt> file.  Further, their final document should include regression test output 
from the source files created by the tangle operation.</p>
<p>The use case is successful when the documentation file is produced, including
current regression test output.</p>
<p>Outside this use case, the user will edit the documentation file, possibly updating the
<tt>.w</tt> file.  This will lead to a need to restart this use case.</p>
<p>The use case is a failure when the documentation file cannot be produced, due to 
errors in the <tt>.w</tt> file.  These must be corrected based on information in log messages.</p>
<p>The use case is a failure when the documentation file does not include current
regression test output.</p>
<p>The sequence is as follows:</p>
<pre>
./pyweb.py -xw -pi <i>theFile</i>.w
python <i>theTest</i> &gt;<i>aLog</i>
./pyweb.py -xt <i>theFile</i>.w
</pre>
<p>The first step excludes weaving and permits errors on the <tt>@@i</tt> command.  The <tt>-pi</tt> option
is necessary in the event that the log file does not yet exist.  The second step 
runs the regression test, creating a log file.  The third step weaves the final document,
including the regression test output.</p> 

<h2>Writing <em>pyWeb</em> .w Files</h2>
<p>The input to <em>pyWeb</em> is a <tt>.w</tt> file that consists of a
series of <i>Chunks</i>.  Each Chunk is either program source code to 
be <i>tangled</i> or it is documentation to be <i>woven</i>.  The bulk of
the file is typically documentation chunks that describe the program in
some human-oriented markup language like HTML 
or L<sup><small>A</small></sup>T<small>E</small>X.
</p>

<p>The <em>pyWeb</em> tool parses the input, and performs the
tangle and weave operations.  It tangles each individual output file
from the program source chunks.  It weaves the final documentation
file from the entire sequence of chunks provided, mixing the author's 
original documentation with the program source.
</p>

<p>The <i>Major</i> commands partition the input and define the
various chunks.  The <i>Minor</i> commands are used to control the
woven and tangled output from those chunks.
</p>

<h3>Major Commands</h3>
<p>There are three major commands that define the various chunks
in an input file.</p>
<dl>
    <dt><tt>@@o <i>file</i> @@{ <i>text</i> @@}</tt></dt>
        <dd>The <tt>@@o</tt> (output) command defines a named output file chunk.  The text is tangled to the named
        file with no alteration.  It is woven into the document
        in an appropriate fixed-width font.</dd>
    <dt><tt>@@d <i>name</i> @@{ <i>text</i> @@}</tt></dt>
        <dd>The <tt>@@d</tt> (define) command defines a named chunk of program source.  This text is tangled
        or woven when it is referenced by the <i>reference</i> minor command.</dd>
    <dt><tt>@@i <i>file</i></tt></dt>
        <dd>The <tt>@@i</tt> (include) command includes another file.  The previous chunk
        is ended.  The file is processed completely, then a new chunk
        is started for the text after the <tt>@@i</tt> command.</dd>
</dl>
<p>All material that is not explicitly in a <tt>@@o</tt> or <tt>@@d</tt> named chunk is
implicitly collected into a sequence of anonymous document source chunks.
These anonymous chunks form the backbone of the document that is woven.
The anonymous chunks are never tangled into output program source files.
</p>
<p>Note that white space (line breaks (<tt>'\n'</tt>), tabs and spaces) has no effect on the input parsing.
They are completely preserved on output.</p>

<p>The following example has three chunks: an anonymous chunk of
documentation, a named output chunk, and a second anonymous chunk of documentation.
</p>
<pre><code>
&lt;p&gt;Some HTML documentation that describes the following piece of the
program.&lt;/p&gt;
@@o myFile.py 
@@{
import math
print math.pi
@@}
&lt;p&gt;Some more HTML documentation.&lt;/p&gt;
</code></pre>

<h3>Minor Commands</h3>
<p>There are seven minor commands that cause content to be created where the 
command is referenced.</p>
<dl>
    <dt><tt>@@@@</tt></dt>
        <dd>The <tt>@@@@</tt> command creates a single <tt>@@</tt> in the output file.</dd>
    <dt><tt>@@&lt;<i>name</i>@@&gt;</tt></dt>
        <dd>The <i>name</i> references a named chunk.
        When tangling, the referenced chunk replaces the reference command.
        When weaving, a reference marker is used.  In HTML, this will be
        the <tt>&lt;A HREF=...&gt;</tt> markup.</dd>
    <dt><tt>@@f</tt></dt>
        <dd>The <tt>@@f</tt> command inserts a file cross reference.  This
        lists the name of each file created by an <tt>@@o</tt> command, and all of the various
        chunks that are concatenated to create this file.</dd>
    <dt><tt>@@m</tt></dt>
        <dd>The <tt>@@m</tt> command inserts a named chunk ("macro") cross reference.  This
        lists the name of each chunk created by an <tt>@@d</tt> command, and all of the various
        chunks that are concatenated to create the complete chunk.</dd>
    <dt><tt>@@u</tt></dt>
        <dd>The <tt>@@u</tt> command inserts a user identifier cross reference.  This
        lists each user identifier defined with the <tt>@@|</tt> command, and the
        chunks in which that identifier is defined or used.</dd>
    <dt><tt>@@|</tt></dt>
        <dd>A chunk may define user identifiers.  The list of defined identifiers is placed
in the chunk, set off by a <tt>@@|</tt> separator.</dd>
    <dt><tt>@@(<i>Python expression</i>@@)</tt></dt>
        <dd>The <i>Python expression</i> is evaluated and the result is tangled or
        woven in place.  A few global variables and modules are available.
        These are described <a href="#expressionContext">below</a>.</dd>
</dl>

<h3>Additional Features</h3>
<p>The named chunks (from both <tt>@@o</tt> and <tt>@@d</tt> commands) are assigned unique sequence numbers to simplify cross references.  In L<sup><small>A</small></sup>T<small>E</small>X it is possible 
to determine the page breaks and assign the sequence numbers based on
the physical pages.</p>
<p>Chunk names and file names are case sensitive.</p>
<p>Chunk names can be abbreviated.  A partial name can have a trailing ellipsis (...); this will be resolved to the full name.  The most typical use for this
is shown in the following example.</p>

<pre><code>
&lt;p&gt;Some HTML documentation.&lt;/p&gt;
@@o myFile.py 
@@{
@@&lt;imports of the various packages used@@&gt;
print math.pi,time.time()
@@}
&lt;p&gt;Some notes on the packages used.&lt;/p&gt;
@@d imports...
@@{
import math,time
@@| math time
@@}
&lt;p&gt;Some more HTML documentation.&lt;/p&gt;
</code></pre>

<ol>
<li>An anonymous chunk of documentation.</li>
<li>A named chunk that tangles the <tt>myFile.py</tt> output.  It has
a reference to the <i>imports of the various packages used</i> chunk.
Note that the full name of the chunk is essentially a line of 
documentation, traditionally done as a comment line in a non-literate
programming environment.</li>
<li>An anonymous chunk of documentation.</li>
<li>A named chunk with an abbreviated name.  The <i>imports...</i>
matches the complete name.  Set off after the <tt>@@|</tt> separator is
the list of identifiers defined in this chunk.</li>
<li>An anonymous chunk of documentation.</li>
</ol>
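<p>The abbreviation rule can be sketched in Python.  This is a minimal illustration, not
the actual <em>pyWeb</em> code: the function name <tt>resolve_name</tt> and the choice to raise
<tt>KeyError</tt> when a partial name matches no chunk or more than one chunk are assumptions
made for the example.</p>

```python
def resolve_name(name, full_names):
    """Expand an abbreviated chunk name ending in '...' to its full name.

    Hypothetical helper: full_names is the collection of complete chunk
    names seen so far.  Names without a trailing ellipsis are returned
    unchanged.
    """
    if not name.endswith("..."):
        return name
    prefix = name[:-3].rstrip()
    # Matching is case sensitive, like chunk names themselves.
    matches = [full for full in full_names if full.startswith(prefix)]
    if len(matches) != 1:
        raise KeyError("%r matches %d chunk names" % (name, len(matches)))
    return matches[0]


names = ["imports of the various packages used", "body of the aFunction"]
print(resolve_name("imports...", names))
```

<p>Requiring exactly one match is a design choice of this sketch; it keeps an ambiguous
abbreviation from silently picking the wrong chunk.</p>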

<p>Named chunks are concatenated from their various pieces.
This allows a named chunk to be broken into several pieces, simplifying
the description.  This is most often used when producing 
fairly complex output files.</p>

<pre><code>
&lt;p&gt;An anonymous chunk with some HTML documentation.&lt;/p&gt;
@@o myFile.py 
@@{
import math,time
@@}
&lt;p&gt;Some notes on the packages used.&lt;/p&gt;
@@o myFile.py
@@{
print math.pi,time.time()
@@}
&lt;p&gt;Some more HTML documentation.&lt;/p&gt;
</code></pre>

<ol>
<li>An anonymous chunk of documentation.</li>
<li>A named chunk that tangles the <tt>myFile.py</tt> output.  It has
the first part of the file.  In the woven document
this is marked with <tt>"="</tt>.</li>
<li>An anonymous chunk of documentation.</li>
<li>A named chunk that also tangles the <tt>myFile.py</tt> output. This
chunk's content is appended to the first chunk.  In the woven document
this is marked with <tt>"+="</tt>.</li>
<li>An anonymous chunk of documentation.</li>
</ol>
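<p>The concatenation of pieces can be sketched as follows.  This is an illustrative
sketch only; the function <tt>concatenate_pieces</tt> and its list-of-pairs input are
assumptions for the example and do not reflect the internal classes of <em>pyWeb</em>.</p>

```python
def concatenate_pieces(pieces):
    """Join (name, text) pieces that share a name, in order of appearance.

    Returns the combined text per name, and the marker a weaver might show
    for each piece: '=' for the first piece of a name, '+=' for later ones.
    """
    combined = {}
    markers = []
    for name, text in pieces:
        if name in combined:
            combined[name] += text
            markers.append((name, "+="))
        else:
            combined[name] = text
            markers.append((name, "="))
    return combined, markers


pieces = [
    ("myFile.py", "\nimport math,time\n"),
    ("myFile.py", "\nprint math.pi,time.time()\n"),
]
combined, markers = concatenate_pieces(pieces)
print(repr(combined["myFile.py"]))
print(markers)
```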

<p>Newline characters are preserved on input.  Because of this the output may appear to have excessive newlines.  In all of the above examples, each
named chunk was defined with the following.</p>
<pre><code>
@@{
import math,time
@@}
</code></pre>
<p>This puts a newline character before and after the import line.</p>

<p>One transformation is performed when tangling output.  The indentation
of a chunk reference is applied to the entire chunk.  This makes it
simpler to prepare source for languages (like Python) where indentation
is important.  It also gives the author control over how the final
tangled output looks.</p>

<p>Note that in the following example the <tt>myFile.py</tt> chunk uses the <tt>@@|</tt> command
to show that it defines the identifier <tt>aFunction</tt>.
</p>
<pre><code>
&lt;p&gt;An anonymous chunk with some HTML documentation.&lt;/p&gt;
@@o myFile.py 
@@{
def aFunction( a, b ):
    @@&lt;body of the aFunction@@&gt;
@@| aFunction @@}
&lt;p&gt;Some notes on the packages used.&lt;/p&gt;
@@d body...
@@{
"""doc string"""
return a + b
@@}
&lt;p&gt;Some more HTML documentation.&lt;/p&gt;
</code></pre>

<p>The tangled output from this will look like the following.
All of the newline characters are preserved, and the reference to
<i>body of the aFunction</i> is indented to match the prevailing
indent where it was referenced.  In the following example, 
explicit line markers of <b><tt>~</tt></b> are provided to make the blank lines 
more obvious.
</p>
<pre><code>
~
~def aFunction( a, b ):
~        
~    """doc string"""
~    return a + b
~
</code></pre>
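<p>The indentation rule can be sketched in a few lines of Python.  This is a simplified
model under stated assumptions, not the <em>pyWeb</em> implementation: the helper
<tt>indent_body</tt> is hypothetical, and the real tool may treat blank lines and
trailing white space differently.</p>

```python
def indent_body(body, indent):
    """Apply the reference's indentation to every line after the first.

    The first line needs no prefix because the text before the reference
    already supplies it.
    """
    return body.replace("\n", "\n" + indent)


before = "\ndef aFunction( a, b ):\n    "    # text up to the reference
body = '\n"""doc string"""\nreturn a + b\n'  # the referenced chunk
after = "\n"                                 # text after the reference

tangled = before + indent_body(body, "    ") + after
for line in tangled.split("\n"):
    print("~" + line)
```

<p>Printing each line with a leading <tt>~</tt> makes the preserved newlines visible, in the
same way as the example above.</p>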

<p>There are two possible implementations for evaluation of a Python
expression in the input.</p>
<ol>
<li>Create an <b>ExpressionCommand</b>, and append this to the current <b>Chunk</b>.
This will allow evaluation during weave processing and during tangle processing.  This
makes the entire weave (or tangle) context available to the expression, including
completed cross reference information.</li>
<li>Evaluate the expression during input parsing, and append the resulting text
as a <b>TextCommand</b> to the current <b>Chunk</b>.  This provides a common result
available to both weave and tangle, but the only context available is the <b>WebReader</b> and
the incomplete <b>Web</b>, built up to that point.</li>
</ol>
<a name="expressionContext"></a>
<p>In this implementation, we adopt the latter approach, and evaluate expressions immediately.
A simple global context is created with the following variables defined.</p>
<dl>
    <dt><tt>time</tt></dt><dd>This is the standard time module.</dd>
    <dt><tt>os</tt></dt><dd>This is the standard os module.</dd>
    <dt><tt>theLocation</tt></dt><dd>A tuple with the file name, first line number and last line number
    for the original expression's location.</dd>
    <dt><tt>theWebReader</tt></dt><dd>The <b>WebReader</b> instance doing the parsing.</dd>
    <dt><tt>thisApplication</tt></dt><dd>The name of the running <em>pyWeb</em> application.</dd>
    <dt><tt>__version__</tt></dt><dd>The version string in the <em>pyWeb</em> application.</dd>
</dl>
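<p>Immediate evaluation in a small global context can be sketched as follows.  This is a
hedged illustration: the function <tt>evaluate_expression</tt> is hypothetical, and the
application name and version string are placeholders rather than values taken from
<em>pyWeb</em>.</p>

```python
import os
import time


def evaluate_expression(expression, location, reader=None):
    """Evaluate a Python expression in a small global context.

    Hypothetical sketch: the application name and version string below
    are placeholders, not values taken from pyWeb.
    """
    context = {
        "time": time,
        "os": os,
        "theLocation": location,        # (file name, first line, last line)
        "theWebReader": reader,         # the parser instance, if any
        "thisApplication": "pyweb.py",  # placeholder
        "__version__": "0.0",           # placeholder
    }
    # The result becomes plain text, as it would for a TextCommand.
    return str(eval(expression, context))


print(evaluate_expression("os.path.basename(theLocation[0])",
                          ("docs/example.w", 10, 10)))
```

<p>Because the expression is evaluated while the input is parsed, it sees only what is
placed in this context; cross reference information from the completed Web is not
available.</p>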

<h2>Running <em>pyWeb</em> to Tangle and Weave</h2>

<p>Assuming that you have marked <tt>pyweb.py</tt> as executable,
you do the following.</p>
<pre>
./pyweb.py <i>file</i>...
</pre>
<p>This will tangle the <tt>@@o</tt> commands in each <i>file</i>.
It will also weave the output, and create <i>file</i>.html.
</p>

<h3>Command Line Options</h3>
<p>Currently, the following command line options are accepted.</p>
<dl>
    <dt><tt>-v</tt></dt>
        <dd>Verbose logging.  The default is changed by updating the 
        <a href="#log_setting">constructor</a>
        for <i>theLog</i> from <tt>Logger(standard)</tt> to <tt>Logger(verbose)</tt>.</dd>
    <dt><tt>-s</tt></dt>
        <dd>Silent operation.  The default is changed by updating the 
        <a href="#log_setting">constructor</a>
        for <i>theLog</i> from <tt>Logger(standard)</tt> to <tt>Logger(silent)</tt>.</dd>
    <dt><tt>-c <i>x</i></tt></dt>
        <dd>Change the command character from <tt>@@</tt> to <tt><i>x</i></tt>.
        The default is changed by updating the 
        <a href="#command_setting">constructor</a> for <i>theWebReader</i> from
        <tt>WebReader(f,'@@')</tt> to <tt>WebReader(f,'<i>x</i>')</tt>.</dd>
    <dt><tt>-w <i>weaver</i></tt></dt>
        <dd>Choose a particular documentation weaver, for instance 'Weaver', 'HTML', 'Latex', or 
        'HTMLPython'.  The default is based on the first few characters of the input file.
        You can change the default by updating the 
        <a href="#pick_language">language determination</a> call in the application
        main function from <tt>l= w.language()</tt> to <tt>l= HTML()</tt>.</dd>
    <dt><tt>-t <i>tangler</i></tt></dt>
        <dd>Choose a particular source file tangler, for instance 'Tangler' or 'TanglerMake'.
        The default is the make-friendly tangler.  The default is changed by updating the 
        <a href="#command_setting">constructor</a> for <i>theTangler</i> from
        <tt>TanglerMake()</tt> to <tt>Tangler()</tt>.</dd>
    <dt><tt>-xw</tt></dt>
        <dd>Exclude weaving.  This does tangling of source program files only.</dd>
    <dt><tt>-xt</tt></dt>
        <dd>Exclude tangling.  This does weaving of the document file only.</dd>
    <dt><tt>-p<i>command</i></tt></dt>
        <dd>Permit errors in the given list of commands.  The most common
        version is <tt>-pi</tt> to permit errors in locating an include file.
        This is done in the following scenario: pass 1 uses <tt>-xw -pi</tt> to exclude
        weaving and permit include-file errors; 
        the tangled program is run to create test results; pass 2 uses
        <tt>-xt</tt> to exclude tangling and include the test results.</dd>
</dl>


<h2>Restrictions</h2>
<p><em>pyWeb</em> requires Python 2.1 or newer.</p>
<p>Currently, input is not detabbed; Python users generally are discouraged from using tab characters in their files.</p>

<h2>Installation</h2>
<p>You must have <a href="http://www.python.org">Python 2.1</a>.</p>
<ol>
<li>Download and expand pyweb.zip.  You will get pyweb.css, pyweb.html, pyweb.pdf,
pyweb.py and pyweb.w.</li>
<li>If necessary, <tt>chmod +x pyweb.py</tt>.</li>
<li>If you like, <tt>cp pyweb.py /usr/local/bin/pyweb</tt> to make a global command.</li>
<li>Make a bootstrap copy of pyweb.py (I copy it to pyweb1.py).  
You can run <tt>./pyweb.py pyweb.w</tt> to generate the latest and greatest pyweb.py file,
as well as this documentation, pyweb.html.</li>
</ol>
<p>Be sure to save a bootstrap copy of pyweb.py before changing pyweb.w.  
Should your changes to pyweb.w introduce a bug into pyweb.py, you will need a fall-back version
of <em>pyWeb</em> that you can use in place of the one you just damaged.
</p>

<h2>Acknowledgements</h2>
<p>This application is very directly based on (derived from?) work that
 preceded this, particularly the following:</p>
<ul>
<li>Ross N. Williams' <a href="http://www.ross.net/funnelweb/"><i>FunnelWeb</i></a></li>
<li>Norman Ramsey's <a href="http://www.eecs.harvard.edu/~nr/noweb/"><i>noweb</i></a></li> 
<li>Preston Briggs' <a href="http://sourceforge.net/projects/nuweb/"><i>nuweb</i></a>, 
currently supported by Charles Martin and Marc W. Mengel</li>
</ul>
<p>Also, after using John Skaller's <a href="http://interscript.sourceforge.net/"><i>interscript</i></a>
for two large development efforts, I finally understood the feature set I really needed.
</p>

<h1>TODO:</h1>
<ol>
<li>Use the Decorator pattern to apply code chunk (<tt>@@d</tt>) vs. file chunk (<tt>@@o</tt>) decorations
when weaving a named chunk.  This should factor out the distinctions between these
two weave operations, separating the common parts.  This should remove codeBegin, codeEnd,
fileBegin and fileEnd and instead make these calls to a subclass of 
ChunkDecorator (DecorateCode vs. DecorateFile).  The decorator is instantiated when the
chunk is created, because the decorator refers to the chunk.  Weaving and Tangling will
call the decorator, which in turn calls the chunk to produce the final output.  Further, 
the NamedDocumentChunk (which is otherwise undecorated) is unified, meaning we really
only have one class of named chunk.  Indeed, we may only have one class of chunk, once
this activity is factored out.</li>
<li>Create an application that merges something like
<a href="ftp://starship.python.net/pub/crew/just/PyFontify.py">
PyFontify</a> module with <em>pyWeb</em> base classes to add Syntax coloring
to a Python-specific HTML weaver.</li> 
<li>The <b>createUsedBy()</b> method can be done incrementally by 
accumulating a list of forward references to chunks; as each
new chunk is added, any references to the chunk are removed from
the forward references list, and a call is made to the Web's
setUsage method.  References backward to already existing chunks
are easily resolved with a simple lookup.  The advantage of 
the incremental resolution is a simplification in the protocol
for using a Web instance.</li>
<li>Use the Builder pattern to provide an explicit WebBuilder instance
with the WebReader class to build the parse tree.   This can be overridden to,
for example, do incremental building in one pass.</li>
<li>Note that the Web is a lot like a NamedChunk; this could be factored out.
This will create a more proper Composition pattern implementation.</li>
</ol>

<h1>Design Overview</h1>
<p>This application breaks the overall problem into the following sub-problems.</p>
<ol>
<li>Reading and parsing the input.</li>
<li>Building an internal representation of the source Web.</li>
<li>Weaving a document file.</li>
<li>Tangling the desired program source files.</li>
</ol>
<p>A solution to the reading and parsing problem depends on a convenient 
tool for breaking up the input stream and a representation for the chunks of input.
Input decomposition is done with the Python <i>Splitter</i> pattern.  The representation
of the source document is done with the <i>Composition</i> pattern.
</p>
<p>The Splitter pattern is widely used in text processing, and has a long legacy
in a variety of languages and libraries.  A Splitter decomposes a string into
a sequence of strings using a split pattern.  There are many variant implementations.
One variant locates only a single occurrence (usually the left-most); this is
commonly implemented as a Find or Search string function.  Another variant locates
occurrences of a specific string or character, and discards the matching string or
character.
</p>
<p>
The variation on Splitter that we use in this application
creates each element in the resulting sequence as either (1) an instance of the 
split regular expression or (2) the text between split patterns.  By preserving 
the actual split text, we can define our splitting pattern with the regular
expression <tt>'@@.'</tt>.  This will split on any <tt>@@</tt> followed by a single character.
We can then examine the instances of the split RE to locate pyWeb commands.
</p>
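<p>As a minimal sketch of this variant (not the <b>WebReader</b> implementation itself),
a capturing group in the split pattern makes <tt>re.split()</tt> keep each matched
separator in the result.  The names <tt>COMMAND</tt>, <tt>SPLIT_PAT</tt> and <tt>split_web</tt>
are illustrative assumptions, and the sketch is written for a current Python 3 interpreter.</p>

```python
import re

# chr(64) is the at-sign command character; it is spelled this way so that
# pyWeb does not mistake this example for one of its own commands.
COMMAND = chr(64)

# The capturing group makes re.split() return the separators as well as
# the text between them.
SPLIT_PAT = re.compile("(" + re.escape(COMMAND) + ".)")

def split_web(text):
    """Return the alternating text and command tokens found in text."""
    return [token for token in SPLIT_PAT.split(text) if token]

sample = "intro " + COMMAND + "d name " + COMMAND + "{code" + COMMAND + "}"
tokens = split_web(sample)
```

<p>Because nothing is discarded, joining the tokens reproduces the original input; the 
two-character tokens can then be examined to locate commands.</p>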
<p>We could be a tad more specific and use the following as a split pattern:
<tt>'@@[doOifmu|&lt;&gt;(){}[\]]'</tt>.  This would silently ignore unknown commands, 
merging them in with the surrounding text.  This would leave the <tt>'@@@@'</tt> sequences 
completely alone, allowing us to replace <tt>'@@@@'</tt> with <tt>'@@'</tt> in
every text chunk.
</p>
<p>The Composition pattern is used to build up a parse tree of instances of
<b>Chunk</b>.  This parse tree is contained in the overall <b>Web</b>, which is a sequence
of Chunks.  Each named chunk may be a sequence of Chunks with a common name.
</p>
<p>Each chunk is composed of a sequence of instances of <b>Command</b>.  
Because of this uniform composition, the several operations (particularly
weave and tangle) can be 
delegated to each Chunk, and in turn, delegated to each Command that
composes a Chunk.
</p>
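<p>The delegation can be pictured with a cut-down sketch.  The class <tt>TextCommand</tt>, 
the <tt>add()</tt> and <tt>append()</tt> methods, and the use of a plain list as the output 
are assumptions made for illustration; the real classes carry considerably more state.</p>

```python
class TextCommand:
    """Hypothetical leaf command holding a block of text."""
    def __init__(self, text):
        self.text = text
    def weave(self, out):
        out.append(self.text)

class Chunk:
    """A chunk is a sequence of commands; weave is delegated to each one."""
    def __init__(self):
        self.commands = []
    def append(self, command):
        self.commands.append(command)
    def weave(self, out):
        for command in self.commands:
            command.weave(out)

class Web:
    """The web is a sequence of chunks; weave is delegated to each chunk."""
    def __init__(self):
        self.chunks = []
    def add(self, chunk):
        self.chunks.append(chunk)
    def weave(self, out):
        for chunk in self.chunks:
            chunk.weave(out)

web = Web()
first = Chunk()
first.append(TextCommand("Some prose. "))
first.append(TextCommand("More prose."))
web.add(first)
woven = []
web.weave(woven)
```

<p>The same shape would apply to a tangle operation: the Web asks each Chunk, and each Chunk 
asks each Command.</p>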
<p>The weaving operation depends on the target document markup language.
There are several approaches to this problem.  One is to use a markup language
unique to <em>pyweb</em>, and emit markup in the desired target language.
Another is to use a standard markup language and use converters to transform
the standard markup to the desired target markup.  The problem with the second
method is specifying the markup for actual source code elements in the
document.  These must be emitted in the proper markup language.
</p>
<p>Since the application must transform input into a specific markup language,
we opt to use the Strategy pattern to encapsulate markup language details.
Each alternative markup strategy is then a subclass of <b>Weaver</b>.  This 
simplifies adding additional markup languages without inventing a 
markup language unique to <em>pyweb</em>.
The author uses their preferred markup, and their preferred
toolset to convert to other output languages.
</p>
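<p>A simplified sketch of the Strategy idea follows.  The class names <tt>HTMLSketch</tt> and 
<tt>LatexSketch</tt>, the function <tt>weave_code</tt>, the markup they emit, and the 
method signatures are all assumptions chosen for brevity; they are not the signatures used by the 
<b>Weaver</b> classes defined later.</p>

```python
class Weaver:
    """Strategy interface: each markup language supplies its own decoration."""
    def codeBegin(self, name):
        raise NotImplementedError
    def codeEnd(self):
        raise NotImplementedError

class HTMLSketch(Weaver):
    """Illustrative HTML decoration for a code chunk."""
    def codeBegin(self, name):
        return "<pre><b>%s</b> =\n" % name
    def codeEnd(self):
        return "</pre>\n"

class LatexSketch(Weaver):
    """Illustrative LaTeX decoration for a code chunk."""
    def codeBegin(self, name):
        return "\\begin{verbatim}\n%% %s\n" % name
    def codeEnd(self):
        return "\\end{verbatim}\n"

def weave_code(weaver, name, body):
    """The caller never inspects which markup language is in use."""
    return weaver.codeBegin(name) + body + weaver.codeEnd()

html_text = weave_code(HTMLSketch(), "Imports", "import sys\n")
latex_text = weave_code(LatexSketch(), "Imports", "import sys\n")
```

<p>Supporting another markup language then means adding one more subclass; the code that 
walks the chunks is unchanged.</p>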
<p>The tangling operation produces output files.  In earlier tools,
some care was taken to understand the source code context for tangling, and
provide a correct indentation.  This required a command-line parameter
to turn off indentation for languages like Fortran, where indentation
is not used.  In <em>pyweb</em>, the indent of
the actual <tt>@@&lt;</tt> command is used to set the indent of the 
material that follows.  If all <tt>@@&lt;</tt> commands are presented at the
left margin, no indentation will be done.  This is a helpful simplification,
particularly for users of Python, where indentation is significant.
</p>
<p>The standard <b>Emitter</b> class handles this basic indentation.  A subclass can be 
created, if necessary, to handle more elaborate indentation rules.</p>

<h1>Implementation</h1>

<p>The implementation is contained in a file that both defines
the base classes and provides an overall <b>main()</b> function.  The <b>main()</b> 
function uses these base classes to weave and tangle the output files.
</p>
<p>An additional file provides a more sophisticated implementation,
adding features to an HTML weave subclass.
</p>

<h2><em>pyWeb</em> Base File</h2>

<p>The <em>pyWeb</em> base file is shown below:</p>

@o pyweb.py 
@{@<Shell Escape@>
@<DOC String@>
@<CVS Cruft and pyweb generator warning@>
@<Imports@>
@<Base Class Definitions@>
@<Application Class@>
@<Module Initialization of global variables@>
@<Interface Functions@>
@}

<p>The overhead elements are described in separate sub sections as follows:</p>
<ul>
<li>imports</li>
<li>doc string</li>
<li>other overheads: shell escape and CVS cruft</li>
</ul>
<p>The more important elements are described in separate sections:</p>
<ul>
<li>Base Class Definitions</li>
<li>Application Class and Main Functions</li>
<li>Module Initialization</li>
<li>Interface Functions</li>
</ul>

<h3>Python Library Imports</h3>

<p>The following Python library modules are used by this application.</p>
<ul>
<li>The <b>sys</b> module provides access to the command line arguments.</li>
<li>The <b>os</b> module provides OS-specific file and path manipulations; it is used
to transform the input file name into the output file name.</li>
<li>The <b>re</b> module provides regular expressions; these are used to 
parse the input file.</li>
<li>The <b>time</b> module provides a handy current-time string; this is used
by the HTML Weaver to write a closing timestamp on generated HTML files, 
as well as log messages.</li>
<li>The <b>getopt</b> module provides a simple command-line parsing interface.</li>
</ul>
<p>The following modules are used by specific subclasses for more specialized
purposes.</p>
<ul>
<li>The <b>tempfile</b> module provides a unique temporary file name; the <b>TanglerMake</b> class
uses this to create make-friendly files that are not
touched unless there is an actual content change.</li>
<li>The <b>filecmp</b> module compares files; this is used by the <b>TanglerMake</b> class.</li>
</ul>
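<p>The make-friendly idea can be sketched as follows: write to a temporary file, compare it 
with the existing target, and replace the target only when the content differs.  This is an 
illustration written for a current Python 3 interpreter, not the <b>TanglerMake</b> code; the 
function name <tt>write_if_changed</tt> is an assumption.</p>

```python
import filecmp
import os
import shutil
import tempfile

def write_if_changed(path, text):
    """Write text to path only when the content differs.

    Returns True when the file was created or replaced.
    """
    fd, temp_name = tempfile.mkstemp(suffix=".tmp")
    with os.fdopen(fd, "w") as temp:
        temp.write(text)
    # shallow=False forces a byte-by-byte comparison of the two files.
    if os.path.exists(path) and filecmp.cmp(temp_name, path, shallow=False):
        os.remove(temp_name)      # identical: leave the old timestamp alone
        return False
    shutil.move(temp_name, path)  # new or changed: install the new content
    return True
```

<p>Since an unchanged file keeps its modification time, <i>make</i> does not rebuild 
anything that depends on it.</p>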

@d Imports
@{import sys, os, re, time, getopt
import tempfile, filecmp
@| sys os re time getopt tempfile filecmp
@}

<h3>Python DOC String</h3>

<p>A Python <tt>__doc__</tt> string provides a standard vehicle for documenting
the module or the application program.  The usual style is to provide
a one-sentence summary on the first line.  This is followed by more 
detailed usage information.
</p>

@d DOC String 
@{"""pyWeb Literate Programming - tangle and weave tool.

Yet another simple literate programming tool derived from nuweb, 
implemented entirely in Python.  
This produces HTML (or LATEX) for any programming language.

Usage:
    pyweb [-vs] [-c x] [-w format] [-t format] file.w

Options:
    -v           verbose output
    -s           silent output
    -c x         change the command character from '@@' to x
    -w format    Use the given weaver for the final document.
                 The default is based on the input file, a leading '<'
                 indicates HTML, otherwise Latex.
    -t format    Use the given tangler to produce the output files.
    -xw          Exclude weaving
    -xt          Exclude tangling
    -pi          Permit include-command errors
    
    file.w       The input file, with @@o, @@d, @@i, @@[, @@{, @@|, @@<, @@f, @@m, @@u commands.
"""
@}

<h3>Other Python Overheads</h3>

<p>The shell escape is provided so that the user can define this
file as executable, and launch it directly from their shell.
The shell reads the first line of a file; when it finds the <tt>'#!'</tt> shell
escape, the remainder of the line is taken as the path to the binary program
that should be run.  The shell runs this binary, passing the name of the 
file as an argument.
</p>

@d Shell Escape
@{#!/usr/local/bin/python@}

<p>The CVS cruft is a standard way of placing CVS information into
a Python module so it is preserved.  See PEP (Python Enhancement Proposal) #8 for information
on recommended styles.
</p>

<p>We also sneak in the "DO NOT EDIT" warning that belongs in all generated application 
source files.</p>

@d CVS Cruft...
@{__version__ = """$Revision$"""

### DO NOT EDIT THIS FILE!
### It was created by @(thisApplication@), __version__='@(__version__@)'.
### From source @(theFile@) modified @(time.ctime(os.path.getmtime(theFile))@).
### In working directory '@(os.getcwd()@)'.
@| __version__ @}

<h2>Base Class Definitions</h2>

<p>There are three major class hierarchies that compose the base of this application.  These are
families of related classes that express the basic relationships among entities.</p>
<ul>
<li>Emitters - An <b>Emitter</b> creates an output file, either source code, Latex or HTML from
the chunks that make up the source file.  Two major subclasses are <b>Weaver</b>, which 
has a focus on markup output, and <b>Tangler</b> which has a focus on pure source output.
<b>HTML</b> and <b>Latex</b> are further specializations of the <b>Weaver</b> class.  
The <b>TanglerMake</b> subclass of the <b>Tangler</b> class is a make-friendly source-code emitter.</li>
<li>Chunks - a <b>Chunk</b> is a collection of <b>Command</b> instances.  This can be
either an anonymous chunk that will be sent directly to the output, 
or one of the classes of named chunks delimited by the
major <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt> commands.</li>
<li>Commands - A <b>Command</b> contains user input and creates output.  
This can be a block of text from the input file, 
one of the various kinds of cross reference commands (<tt>@@f</tt>, <tt>@@m</tt>, <tt>@@u</tt>) 
or a reference to a chunk (via the <tt>@@&lt;<i>name</i>@@&gt;</tt> sequence).</li>
</ul>
<p>Additionally, there are several supporting classes:</p>
<ul>
<li>a <b>Web</b> class for the interconnected web of Chunks.</li>
<li>an <b>EmitterFactory</b> class that generates instances of the Emitter
class hierarchy given a string parameter.  This allows extension of the class hierarchy
by a module which includes this module.</li>
<li>a <b>WebReader</b> class that parses the input, creating the Commands and Chunks.</li>
<li>an <b>Error</b> class for exceptions that are unique to this application.</li>
<li>Also, there is a potentially separate module, consisting of several
classes that are used to control logging.  
A global variable, <b>theLog</b>, references the global singleton 
instance of the <b>Logger</b> class.  Additionally, an <b>Event</b> class hierarchy describes
the individual events that are logged.
</li>
</ul>

@d Base Class Definitions 
@{
@<Error class - defines the errors raised@>
@<Logger classes - handle logging of status messages@>
@<Command class hierarchy - used to describe individual commands@>
@<Chunk class hierarchy - used to describe input chunks@>
@<Emitter class hierarchy - used to control output files@>
@<Web class - describes the overall "web" of chunks@>
@<WebReader class - parses the input file, building the Web structure@>
@<Operation class hierarchy - used to describe basic operations of the application@>
@}

<h3>Emitters</h3>

<p>An <b>Emitter</b> instance is responsible for control of an output file format.
This includes the necessary file naming, opening, writing and closing operations.
It also includes providing the correct markup for the file type.
</p>

<p>There are several subclasses of the <b>Emitter</b> superclass, specialized for various file
formats.
</p>
@d Emitter class hierarchy...
@{
@<Emitter superclass@>
@<Weaver subclass to control documentation production@>
@<LaTex subclass of Weaver@>
@<HTML subclass of Weaver@>
@<Tangler subclass of Emitter@>
@<Tangler subclass which is make-sensitive@>

@<Emitter Factory - used to generate emitter instances from parameter strings@>
@}

<p>An <b>Emitter</b> instance is created to contain the various details of
writing an output file.  Emitters are created as follows:
</p>
<ol>
<li>A <b>Web</b> object will create an <b>Emitter</b> to <em>weave</em> the final document.</li>
<li>A <b>Web</b> object will create an <b>Emitter</b> to <em>tangle</em> each file.</li>
</ol>
<p>Since each <b>Emitter</b> instance is responsible for the details of one file
type, different subclasses of Emitter are used when tangling source code files (<b>Tangler</b>) and 
weaving files that include source code plus markup (<b>Weaver</b>).  Further specialization is required
when weaving HTML or LaTex.
</p>
<p>In the case of tangling, the following algorithm is used:</p>
<ol>
<li>Visit each output <b>Chunk</b> (<tt>@@o</tt> or <tt>@@O</tt>), doing the following:
    <ol>
    <li>Open the <b>Tangler</b> instance using the target file name.</li>
    <li>Visit each <b>Chunk</b> directed to the file, calling the chunk's <b>tangle()</b> method.
        <ol>
        <li>Call the Tangler's <b>docBegin()</b> method.  This sets the Tangler's indents.</li>
        <li>Visit each <b>Command</b>, call the command's <b>tangle()</b> method.  For the text
            of the chunk, the
            text is written to the tangler using the <b>codeBlock()</b> method.  For
            references to other chunks, the referenced chunk is tangled using the 
            referenced chunk's <b>tangle()</b> method.</li>
        <li>Call the Tangler's <b>docEnd()</b> method.  This clears the Tangler's indents.</li>
        </ol>
    </li>
    </ol>
</li>
</ol>
<p>In the case of weaving, the following algorithm is used:</p>
<ol>
<li>If no Weaver is given, examine the first Command of the first Chunk and create a weaver
appropriate for the output format.  A leading '&lt;' indicates HTML, otherwise assume Latex.</li>
<li>Open the <b>Weaver</b> instance using the source file name.  This name is transformed
by the weaver to an output file name appropriate to the language.</li>
<li>Visit each sequential <b>Chunk</b> (anonymous, <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt>), doing the following:
    <ol>
    <li>Visit each <b>Chunk</b>, calling the Chunk's <b>weave()</b> method.
        <ol>
        <li>Call the Weaver's <b>docBegin()</b>, <b>fileBegin()</b> or <b>codeBegin()</b> method, 
        depending on the subclass of Chunk.  For 
        <b>fileBegin()</b> and <b>codeBegin()</b>, this writes the header for
        a code chunk in the weaver's markup language.  A slightly different decoration
        is applied by <b>fileBegin()</b> and <b>codeBegin()</b>.</li>
        <li>Visit each <b>Command</b>, call the Command's <b>weave()</b> method.  
            For ordinary text, the
            text is written to the Weaver using the <b>codeBlock()</b> method.  For
            references to other chunks, the referenced chunk is woven using 
            the Weaver's <b>referenceTo()</b> method.</li>
        <li>Call the Weaver's <b>docEnd()</b>, <b>fileEnd()</b> or <b>codeEnd()</b> method.  
        For <b>fileEnd()</b> or <b>codeEnd()</b>, this writes a trailer for
        a code chunk in the Weaver's markup language.</li>
        </ol>
    </li>
    </ol>
</li>
</ol>

<h4>Emitter Superclass</h4>

<h5>Usage</h5>
<p>The <b>Emitter</b> class is not a concrete class, and is never instantiated.  It
contains common features factored out of the <b>Weaver</b> and <b>Tangler</b> subclasses.</p>
<p>Inheriting from the Emitter class generally requires overriding one or more
of the core methods: <b>doOpen()</b>, <b>doClose()</b> and <b>doWrite()</b>.
A subclass of Tangler might override the code writing methods: 
<b>codeLine()</b>, <b>codeBlock()</b> or <b>codeFinish()</b>.
</p>

<h5>Design</h5>

<p>The <b>Emitter</b> class is an abstract superclass for all emitters.  It defines the basic
framework used to create and write to an output file.
This class follows the <i>Template</i> design pattern.  This design pattern
directs us to factor the basic open(), close() and write() methods into three-step algorithms.
</p>
<pre>
def open( self ):
    <i>common preparation</i>
    self.doOpen() <i>#overridden by subclasses</i>
    <i>common finish-up tasks</i>
</pre>
<p>The <i>common preparation</i> and <i>common finish-up</i> sections are generally internal 
housekeeping.  The <b>doOpen()</b> method would be overridden by subclasses to change the
basic behavior.
</p>
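<p>A runnable, cut-down illustration of the same three-step shape is given here.  The hook is 
named <b>doOpen()</b>, matching the method in the Emitter chunks; the subclass 
<tt>RecordingEmitter</tt> is hypothetical and exists only to show the hook being called.  
The sketch is written for a current Python 3 interpreter.</p>

```python
class Emitter:
    """Cut-down illustration of the three-step open()."""
    def __init__(self):
        self.fileName = ""
        self.linesWritten = 0
    def open(self, aFile):
        self.fileName = aFile      # common preparation
        self.doOpen(aFile)         # hook, overridden by subclasses
        self.linesWritten = 0      # common finish-up
    def doOpen(self, aFile):
        pass                       # default hook does nothing in this sketch

class RecordingEmitter(Emitter):
    """Hypothetical subclass that only records what it was asked to open."""
    def __init__(self):
        super().__init__()
        self.opened = []
    def doOpen(self, aFile):
        self.opened.append(aFile)

emitter = RecordingEmitter()
emitter.open("example.w")
```

<p>Callers only ever use <b>open()</b>; the subclass changes behavior without repeating the 
housekeeping.</p>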

<h5>Implementation</h5>

<p>The class has the following attributes:</p>
<ul>
<li><i>fileName</i>, the name of the current open file created by the
open method;</li>
<li><i>theFile</i>, the current open file created by the
open method;</li>
<li><i>context</i>, the indentation context stack, updated by setIndent, clrIndent 
and resetIndent methods;</li>
<li><i>indent</i>, the current indentation, the topmost value on the <i>context</i>
stack;</li>
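<li><i>lastIndent</i>, the last indent used when writing a line of source code;</li>
<li><i>linesWritten</i>, the total number of '\n' characters written to the current file;</li>
<li><i>totalFiles</i>, the number of files closed so far;</li>
<li><i>totalLines</i>, the sum of <i>linesWritten</i> over all closed files.</li>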
</ul>

@d Emitter superclass
@{
class Emitter:
    """Emit an output file; handling indentation context."""
    def __init__( self ):
        self.fileName= ""
        self.theFile= None
        self.context= [0]
        self.indent= 0
        self.lastIndent= 0
        self.linesWritten= 0
        self.totalFiles= 0
        self.totalLines= 0
    @<Emitter core open, close and write@>
    @<Emitter write a block of code@>
    @<Emitter indent control: set, clear and reset@>
@| Emitter @}

<p>The core <b>open()</b> method tracks the open files.
A subclass overrides a <b>doOpen()</b> method to apply any OS-specific operations
required to correctly name the output file, and
opens the file.
If any specific preamble is required by the output file format, this could be done 
in the <b>doOpen()</b> override.
This kind of feature, however, is discouraged.  The point
of <em>pyWeb</em> (and its predecessors, <i>noweb</i> and <i>nuweb</i>) is to 
be very simple, putting complete control in the hands of
the author.
</p>
<p>The <b>close()</b> method closes the file.
If some specific postamble is required, this can be part of a function
that overrides <b>doClose()</b>.
</p>
<p>The <b>write()</b> method is the lowest-level, unadorned write.
This does some additional counting as well as moving the
characters to the file.  Any further processing could be added in a function
that overrides <b>doWrite()</b>.
</p>
<p>The default implementations of the <b>doOpen()</b> and <b>doClose()</b> methods only print 
a status message and touch no files, making them safe for debugging.  The default 
<b>doWrite()</b> method prints to the standard output file.</p>

@d Emitter core...
@{
def open( self, aFile ):
    """Open a file."""
    self.fileName= aFile
    self.doOpen( aFile )
    self.linesWritten= 0
@<Emitter doOpen, to be overridden by subclasses@>
def close( self ):
    self.codeFinish()
    self.doClose()
    self.totalFiles += 1
    self.totalLines += self.linesWritten
@<Emitter doClose, to be overridden by subclasses@>
def write( self, text ):
    self.linesWritten += text.count('\n')
    self.doWrite( text )
@<Emitter doWrite, to be overridden by subclasses@>
@}

<p>The <b>doOpen()</b>, <b>doClose()</b> and <b>doWrite()</b> 
methods are overridden by the various subclasses to
perform the unique operations for the subclass.
</p>
@d Emitter doOpen... @{
def doOpen( self, aFile ):
    self.fileName= aFile
    print "> creating %r" % self.fileName
@}

@d Emitter doClose... @{
def doClose( self ):
    print "> wrote %d lines to %s" % ( self.linesWritten, self.fileName )
@}

@d Emitter doWrite... @{
def doWrite( self, text ):
    print text,
@}


<p>The <b>codeBlock()</b> method writes several lines of code.  It calls
the <b>codeLine()</b> method for each line of code after doing the correct indentation.
Often, the last line of code is incomplete, so it is left unterminated.
This last line of code also shows the indentation for any 
additional code to be tangled into this section.
</p>
<p>
Note that tab characters confuse the indent algorithm.  Tabs are 
not expanded to spaces in this application.  They should be expanded 
prior to creating a .w file.
</p>
<p>The algorithm is as follows:</p>
<ol>
<li>Save the topmost value of the context stack as the current indent.</li>
<li>Split the block of text on <tt>'\n'</tt> boundaries.</li>
<li>For each line (except the last), call <b>codeLine()</b> with the indented text, 
ending with a newline.</li>
<li>The string <b>split()</b> method will put a trailing 
zero-length element in the list if the original block ended with a
newline.  We drop this zero length piece to prevent writing a useless fragment 
of indent-only after the final <tt>'\n'</tt>.  
If the last line has content, call codeLine with the indented text, 
but do not write a trailing <tt>'\n'</tt>.</li>
<li>Save the length of the last line as the most recent indent.</li>
</ol>

@d Emitter write a block...
@{
def codeBlock( self, text ):
    """Indented write of a block of code."""
    self.indent= self.context[-1]
    lines= text.split( '\n' )
    for l in lines[:-1]:
        self.codeLine( '%s%s\n' % (self.indent*' ',l) )
    if lines[-1]:
        self.codeLine( '%s%s' % (self.indent*' ',lines[-1]) )
    self.lastIndent= self.indent+len(lines[-1])
@}

<p>The <b>codeLine()</b> method writes a single line of source code.
This is often overridden by weaver subclasses to transform source into
a form acceptable by the final weave file format.
</p>
<p>In the case of an HTML weaver, the HTML reserved characters
(<tt>&lt;</tt>, <tt>&gt;</tt>, <tt>&amp;</tt>, and <tt>&quot;</tt>) must be replaced in the output
of code.  However, since the author's original document sections contain
HTML these will not be altered.
</p>

@d Emitter write a block...
@{
def codeLine( self, aLine ):
    """Each individual line of code; often overridden by weavers."""
    self.write( aLine )
@}
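<p>As an illustration of the kind of override an HTML weaver needs, the following sketch 
escapes the reserved characters with the standard library's <tt>html.escape()</tt> function, 
which is available in current Python 3 releases.  The class <tt>EscapingEmitter</tt> is a 
hypothetical stand-in that collects output in a list; it is not the HTML weaver defined 
in this document.  Note that <tt>html.escape()</tt> with <tt>quote=True</tt> also replaces the 
single quote, which goes slightly beyond the four characters listed above.</p>

```python
import html

class EscapingEmitter:
    """Hypothetical stand-in for an HTML weaver; collects output in a list."""
    def __init__(self):
        self.output = []
    def write(self, text):
        self.output.append(text)
    def codeLine(self, aLine):
        """Replace the HTML reserved characters before writing the line."""
        self.write(html.escape(aLine, quote=True))

weaver = EscapingEmitter()
weaver.codeLine('if a < b and c > "d": x &= 1\n')
```

<p>Only code lines pass through <b>codeLine()</b>, so the author's own HTML in the document 
sections is left untouched.</p>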

<p>The <b>codeFinish()</b> method finishes writing any cached lines when
the emitter is closed.</p>

@d Emitter write a block...
@{
def codeFinish( self ):
    if self.lastIndent > 0:
        self.write('\n')
@}

<p>The <b>setIndent()</b> method pushes the last indent on the context stack.  
This is used when tangling source
to be sure that the included text is indented correctly with respect to the
surrounding text.
</p>
<p>The <b>clrIndent()</b> method discards the most recent indent from the context stack.  
This is used when finished
tangling a source chunk.  This restores the indent to the prevailing indent.
</p>
<p>The <b>resetIndent()</b> method removes all indent context information.
</p>


@d Emitter indent...
@{
def setIndent( self ):
    self.context.append( self.lastIndent )
def clrIndent( self ):
    self.context.pop()
def resetIndent( self ):
    self.context= [0]
@}
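<p>The interplay between <b>codeBlock()</b> and the indent stack can be traced with a small, 
self-contained sketch.  <tt>MiniEmitter</tt> is a hypothetical cut-down class that collects 
output in a list instead of a file; the indentation logic follows the chunks above.  As with 
the chunks in this document, the referenced text is assumed to begin with a newline.</p>

```python
class MiniEmitter:
    """Cut-down emitter: indentation logic only, output kept in a list."""
    def __init__(self):
        self.context = [0]
        self.lastIndent = 0
        self.buffer = []
    def codeBlock(self, text):
        indent = self.context[-1]
        lines = text.split("\n")
        for line in lines[:-1]:
            self.buffer.append(" " * indent + line + "\n")
        if lines[-1]:
            self.buffer.append(" " * indent + lines[-1])
        # Where the last (possibly partial) line ended.
        self.lastIndent = indent + len(lines[-1])
    def setIndent(self):
        self.context.append(self.lastIndent)
    def clrIndent(self):
        self.context.pop()

out = MiniEmitter()
out.codeBlock("def f():\n    ")        # text before a reference, ending at column 4
out.setIndent()                        # the referenced chunk inherits that column
out.codeBlock("\nx = 1\nreturn x\n")   # body of the referenced chunk
out.clrIndent()                        # back to the prevailing indent
result = "".join(out.buffer)
```

<p>The two statements of the referenced chunk come out indented by four spaces, the column at 
which the reference appeared, and the context stack returns to its initial state.</p>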

<h4>Weaver subclass of Emitter</h4>
<h5>Usage</h5>
<p>A Weaver is an Emitter that produces markup in addition to user source document
and code.  The Weaver class is abstract, and a concrete subclass must provide
markup in a specific language.</p>

<h5>Design</h5>
<p>The <b>Weaver</b> subclass defines an <b>Emitter</b> used to <em>weave</em> the final
documentation.  This involves decorating source code to make it
displayable.  It also involves creating references and cross
references among the various chunks.
</p>
<p>The <b>Weaver</b> class adds several methods to the basic <b>Emitter</b> methods.  These
additional methods are used exclusively when weaving, never when tangling.
</p>

<h5>Implementation</h5>
@d Weaver subclass...
@{
class Weaver( Emitter ):
    """Format various types of XRef's and code blocks when weaving."""
    @<Weaver doOpen, doClose and doWrite overrides@>
    # A possible Decorator interface
    @<Weaver document chunk begin-end@>
    @<Weaver code chunk begin-end@>
    @<Weaver file chunk begin-end@>
    @<Weaver reference command output@>
    @<Weaver cross reference output methods@>
@| Weaver @}

<p>The default for all weavers is to create an HTML file.  While not
truly universally applicable, it is a common-enough operation that
it might be justified in the parent class. 
</p>
<p>The <b>doClose()</b> method overrides the <b>Emitter</b> class <b>doClose()</b> method by closing the
actual file created by the <b>open()</b> method.
</p>
<p>The <b>doWrite()</b> method overrides the <b>Emitter</b> class <b>doWrite()</b> method by writing to the
actual file created by the <b>open()</b> method.
</p>

@d Weaver doOpen...
@{
def doOpen( self, aFile ):
    src, junk = os.path.splitext( aFile )
    self.fileName= src + '.html'
    self.theFile= open( self.fileName, "w" )
    theLog.event( WeaveStartEvent, "Weaving %r" % self.fileName )
def doClose( self ):
    self.theFile.close()
    theLog.event( WeaveEndEvent, "Wrote %d lines to %r" % 
        (self.linesWritten,self.fileName) )
def doWrite( self, text ):
    self.theFile.write( text )
@}

<p>The following functions all form part of an interface that could be 
removed to a separate class that is a kind of Decorator.  Each weaver file
format is really another of the possible decorators for woven output.  This
could separate the basic mechanism of weaving from the file-format issues
of latex and HTML.
</p>

<p>The <b>docBegin()</b> and <b>docEnd()</b> methods are used when weaving document text.
Typically, nothing is done before emitting these kinds of chunks.
However, putting a <tt>&lt;!--line number--&gt;</tt> comment is an example
of possible additional processing.
</p>

@d Weaver document...
@{
def docBegin( self, aChunk ):
    pass
def docEnd( self, aChunk ):
    pass
@}

<p>The <b>codeBegin()</b> method emits the necessary material prior to 
a chunk of source code, defined with the <tt>@@d</tt> command.
A subclass would override this to provide specific text
for the intended file type.
</p>
<p>The <b>codeEnd()</b> method emits the necessary material subsequent to 
a chunk of source code, defined with the <tt>@@d</tt> command.  The list of references
is also provided so that links or cross references to chunks that 
refer to this chunk can be emitted.
A subclass would override this to provide specific text
for the intended file type.
</p>

@d Weaver code...
@{
def codeBegin( self, aChunk ):
    pass
def codeEnd( self, aChunk, references ):
    pass
@}

<p>The <b>fileBegin()</b> method emits the necessary material prior to 
a chunk of source code, defined with the <tt>@@o</tt> or <tt>@@O</tt> command.
A subclass would override this to provide specific text
for the intended file type.
</p>
<p>The <b>fileEnd()</b> method emits the necessary material subsequent to 
a chunk of source code, defined with the <tt>@@o</tt> or <tt>@@O</tt> command.  
The list of references
is also provided so that links or cross references to chunks that 
refer to this chunk can be emitted.
A subclass would override this to provide specific text
for the intended file type.
</p>

@d Weaver file...
@{
def fileBegin( self, aChunk ):
    pass
def fileEnd( self, aChunk, references ):
    pass
@}

<p>The <b>referenceTo()</b> method emits a reference to 
a chunk of source code.  The reference is made with a
<tt>@@&lt;...@@&gt;</tt> reference form within a <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt> chunk.
The references are defined with the <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt> commands.  
A subclass would override this to provide specific text
for the intended file type.
</p>

@d Weaver reference...
@{
def referenceTo( self, name, sequence ):
    pass
@}

<p>The <b>xrefHead()</b> method puts decoration in front of cross-reference
output.  A subclass may override this to change the look of the final
woven document.
</p>
<p>The <b>xrefFoot()</b> method puts decoration after cross-reference
output.  A subclass may override this to change the look of the final
woven document.
</p>
<p>The <b>xrefLine()</b> method is used for both 
file and macro cross-references to show a name (either file name
or macro name) and a list of chunks that reference the file or macro.
</p>
<p>The <b>xrefDefLine()</b> method is used for the user identifier cross-reference.
This shows a name and a list of chunks that 
reference or define the name.  One of the chunks is identified as the
defining chunk, all others are referencing chunks.
</p>
<p>The default behavior simply writes the Python data structure used
to represent cross reference information.  A subclass may override this 
to change the look of the final woven document.
</p>

@d Weaver cross reference...
@{
def xrefHead( self ):
    pass
def xrefFoot( self ):
    pass
def xrefLine( self, name, refList ):
    """File Xref and Macro Xref detail line."""
    self.write( "%s: %r\n" % ( name, refList ) )
def xrefDefLine( self, name, defn, refList ):
    """User ID Xref detail line."""
    self.write( "%s: %s, %r\n" % ( name, defn, refList ) )
@}

<h4>LaTex subclass of Weaver</h4>
<h5>Usage</h5>
<p>An instance of <b>Latex</b> can be used by the <b>Web</b> object to 
weave an output document.  The instance is created outside the Web, and
given to the <b>weave()</b> method of the Web.
</p>
<pre>
w= Web()
WebReader(aFile).load( w  )
weave_latex= Latex()
w.weave( weave_latex )
</pre>

<h5>Design</h5>
<p>The <b>LaTex</b> subclass defines a Weaver that is customized to
produce LaTex output of code sections and cross reference information.
</p>
<p>Note that this implementation is incomplete, and possibly incorrect.
This is a badly damaged snapshot from the <i>nuweb</i> original source.
</p>
<h5>Implementation</h5>

@d LaTex subclass...
@{
class Latex( Weaver ):
    """Latex formatting for XRef's and code blocks when weaving."""
    @<LaTex doOpen override, close and write are the same as Weaver@>
    @<LaTex code chunk begin@>
    @<LaTex code chunk end@>
    @<LaTex file output begin@>
    @<LaTex file output end@>
    @<LaTex references summary at the end of a chunk@>
    @<LaTex write a line of code@>
    @<LaTex reference to a chunk@>
@| Latex @}

<p>The LaTex <b>doOpen()</b> method opens a .tex file by replacing the
source file's suffix with <tt>".tex"</tt> and opening the resulting file.
</p>

@d LaTex doOpen...
@{
def doOpen( self, aFile ):
    src, junk = os.path.splitext( aFile )
    self.fileName= src + '.tex'
    self.theFile= open( self.fileName, "w" )
    theLog.event( WeaveStartEvent, "Weaving %r" % self.fileName )
@}

<p>The LaTex <b>codeBegin()</b> method writes the header prior to a chunk
of source code.
</p>

@d LaTex code chunk begin
@{
def codeBegin( self, aChunk ):
    self.resetIndent()
    self.write("\\begin{flushleft} \\small")
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\begin{minipage}{\\linewidth}")
    self.write( " \\label{scrap%d}" % aChunk.seq )
    self.write( '\\verb@@"%s"@@~{\\footnotesize ' % aChunk.name )
    self.write( "\\NWtarget{nuweb%s}{%s}$\\equiv$"
        % (aChunk.name,aChunk.seq) )
    self.write( "\\vspace{-1ex}\n\\begin{list}{}{} \\item" )
@}

<p>The LaTex <b>codeEnd()</b> method writes the trailer subsequent to a chunk
of source code.
This calls the LaTex <b>references()</b> method to write a reference to the 
chunk that invokes this chunk.
</p>

@d LaTex code chunk end
@{
def codeEnd( self, aChunk, references ):
    self.write("{\\NWsep}\n\\end{list}")
    self.references( references )
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\end{minipage}\\\\[4ex]")
    self.write("\\end{flushleft}")
@}

<p>The LaTex <b>fileBegin()</b> method writes the header prior to the
creation of a tangled file.
</p>

@d LaTex file output begin
@{
def fileBegin( self, aChunk ):
    self.resetIndent()
    self.write("\\begin{flushleft} \\small")
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\begin{minipage}{\\linewidth}")
    self.write( " \\label{scrap%d}" % aChunk.seq )
    self.write( '\\verb@@"%s"@@~{\\footnotesize ' % aChunk.name )
    self.write( "\\NWtarget{nuweb%s}{%s}$\\equiv$"% (aChunk.name,aChunk.seq) )
    self.write( "\\vspace{-1ex}\n\\begin{list}{}{} \\item" )
@}

<p>The LaTex <b>fileEnd()</b> method writes the trailer subsequent to a tangled file.
This calls the LaTex <b>references()</b> method to write a reference to the 
chunk that invokes this chunk.
</p>

@d LaTex file output end
@{
def fileEnd( self, aChunk, references ):
    self.write("{\\NWsep}\n\\end{list}")
    self.references( references )
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\end{minipage}\\\\[4ex]")
    self.write("\\end{flushleft}")
@}

<p>The <b>references()</b> method writes a list of references after a chunk of code.
</p>

@d LaTex references summary...
@{
def references( self, references ):
    if references:
        self.write("\\vspace{-1ex}")
        self.write("\\footnotesize\\addtolength{\\baselineskip}{-1ex}")
        self.write("\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}")
        self.write("\\setlength{\\itemindent}{-\\leftmargin}}")
        for n,s in references:
            self.write("\\item \\NWtxtFileDefBy\\ %s (%s)" % (n,s) )
        self.write("\\end{list}")
    else:
        self.write("\\vspace{-2ex}")
@}

<p>The <b>codeLine()</b> method writes a single line of code to the weaver, 
providing the necessary LaTex markup.
</p>

@d LaTex write a line...
@{
def codeLine( self, aLine ):
    """Each individual line of code with LaTex decoration."""
    self.write( '\\mbox{}\\verb@@%s@@\\\\\n' % aLine.rstrip() )
@}

<p>The <b>referenceTo()</b> method writes a reference to another chunk of
code.  It uses <b>write()</b> directly so that the reference follows the current
indentation on the current line of code.
</p>

@d LaTex reference to...
@{
def referenceTo( self, name, sequence ):
    self.write( "\\NWlink{nuweb%s}{%s}$\\equiv$"% (name,sequence) )
@}

<h4>HTML subclass of Weaver</h4>
<h5>Usage</h5>
<p>An instance of <b>HTML</b> can be used by the <b>Web</b> object to 
weave an output document.  The instance is created outside the Web, and
given to the <b>weave()</b> method of the Web.
</p>
<pre>
w= Web()
WebReader(aFile).load( w  )
weave_html= HTML()
w.weave( weave_html )
</pre>

<h5>Design</h5>
<p>The <b>HTML</b> subclass defines a Weaver that is customized to
produce HTML output of code sections and cross reference information.
</p>
<p>All HTML chunks are identified by anchor names of the form <tt>pyweb<i>n</i></tt>.  Each
<i>n</i> is the unique chunk number, in sequential order.
</p>

<h5>Implementation</h5>

@d HTML subclass...
@{
class HTML( Weaver ):
    """HTML formatting for XRef's and code blocks when weaving."""
    @<HTML code chunk begin@>
    @<HTML code chunk end@>
    @<HTML output file begin@>
    @<HTML output file end@>
    @<HTML references summary at the end of a chunk@>
    @<HTML write a line of code@>
    @<HTML reference to a chunk@>
    @<HTML simple cross reference markup@>
@| HTML @}

<p>The <b>codeBegin()</b> method starts a chunk of code, defined with <tt>@@d</tt>, providing a label
and HTML tags necessary to set the code off visually.
</p>

@d HTML code chunk begin
@{
def codeBegin( self, aChunk ):
    self.resetIndent()
    self.write( '\n<a name="pyweb%s"></a>\n' % ( aChunk.seq ) )
    self.write( '<!--line number %s-->' % (aChunk.lineNumber()) )
    self.write( '<p><em>%s</em> (%s)&nbsp;%s</p>\n' 
        % (aChunk.fullName,aChunk.seq,aChunk.firstSecond) )
    self.write( "<pre><code>\n" )
@}

<p>The <b>codeEnd()</b> method ends a chunk of code, providing the HTML tags necessary 
to finish the code block visually.  This calls the references method to
write the list of chunks that reference this chunk.
</p>

@d HTML code chunk end
@{
def codeEnd( self, aChunk, references ):
    self.write( "\n</code></pre>\n" )
    self.write( '<p>&loz; <em>%s</em> (%s).' % (aChunk.fullName,aChunk.seq) )
    self.references( references )
    self.write( "</p>\n" )
@}

<p>The <b>fileBegin()</b> method starts a chunk of code, defined with <tt>@@o</tt> or <tt>@@O</tt>, providing a label
and HTML tags necessary to set the code off visually.
</p>

@d HTML output file begin
@{
def fileBegin( self, aChunk ):
    self.resetIndent()
    self.write( '\n<a name="pyweb%s"></a>\n' % ( aChunk.seq ) )
    self.write( '<!--line number %s-->' % (aChunk.lineNumber()) )
    self.write( '<p><tt>"%s"</tt> (%s)&nbsp;%s</p>\n' 
        % (aChunk.fullName,aChunk.seq,aChunk.firstSecond) )
    self.write( "<pre><code>\n" )
@}

<p>The <b>fileEnd()</b> method ends a chunk of code, providing the HTML tags necessary 
to finish the code block visually.  This calls the references method to
write the list of chunks that reference this chunk.
</p>

@d HTML output file end
@{
def fileEnd( self, aChunk, references ):
    self.write( "\n</code></pre>\n" )
    self.write( '<p>&loz; <tt>"%s"</tt> (%s).' % (aChunk.fullName,aChunk.seq) )
    self.references( references )
    self.write( "</p>\n" )
@}

<p>The <b>references()</b> method writes the list of chunks that refer to this chunk.
</p>
@d HTML references summary...
@{
def references( self, references ):
    if references:
        self.write( "  Used by ")
        for n,s in references:
            self.write( '<a href="#pyweb%s"><em>%s</em> (%s)</a>  ' % ( s,n,s ) )
        self.write( "." )
@}

<p>The <b>codeLine()</b> method writes an individual line of code for HTML purposes.
This encodes the four basic HTML entities (&lt;, &gt;, &amp;, &quot;) to prevent code from being interpreted
as HTML.
</p>
<p>The <b>htmlClean()</b> method does the basic HTML entity replacement.  This is factored out of
the basic <b>codeLine()</b> method so that subclasses can use this method, also.
</p>

@d HTML write a line of code
@{
def htmlClean( self, text ):
    """Replace basic HTML entities."""
    clean= text.replace( "&", "&amp;" ).replace( '"', "&quot;" )
    clean= clean.replace( "<", "&lt;" ).replace( ">", "&gt;" )
    return clean
def codeLine( self, aLine ):
    """Each individual line of code with HTML cleanup."""
    self.write( self.htmlClean(aLine) )
@}
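<p>The order of the replacements matters: the ampersand has to be handled first.
The following is a minimal, stand-alone sketch of that point, written for a current
Python 3 interpreter.  The function names <tt>html_clean</tt> and <tt>html_clean_wrong_order</tt>
are illustrative only and are not part of <em>pyWeb</em>.
</p>

```python
def html_clean(text):
    """Replace the four basic HTML entities, ampersand first."""
    # The ampersand is handled first; otherwise the "&" introduced by the
    # other replacements would itself be rewritten as "&amp;".
    clean = text.replace("&", "&amp;").replace('"', "&quot;")
    clean = clean.replace("<", "&lt;").replace(">", "&gt;")
    return clean


def html_clean_wrong_order(text):
    """The same idea with the ampersand handled last, shown only for contrast."""
    clean = text.replace("<", "&lt;").replace(">", "&gt;")
    return clean.replace("&", "&amp;")


if __name__ == "__main__":
    print(html_clean('if a < b and s == "x&y":'))
    print(html_clean_wrong_order("a < b"))  # the entity is escaped twice
```

<p>With the ampersand handled last, the entity for the less-than sign is itself escaped,
and the browser displays the entity text instead of the original character.
</p>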

<p>The <b>referenceTo()</b> method writes a reference to another chunk.  It uses the 
direct <b>write()</b> method so that the reference is indented properly with the
surrounding source code.
</p>

@d HTML reference to a chunk
@{
def referenceTo( self, aName, seq ):
    """Weave a reference to a chunk."""
    # Provide name to get a full reference.
    # Omit name to get a short reference.
    if aName:
        self.write( '<a href="#pyweb%s">&rarr;<em>%s</em> (%s)</a> ' 
            % ( seq, aName, seq ) )
    else:
        self.write( '<a href="#pyweb%s">(%s)</a> ' 
            % ( seq, seq ) )
@}

<p>The <b>xrefHead()</b> method writes the heading for any of the cross reference blocks created by
<tt>@@f</tt>, <tt>@@m</tt>, or <tt>@@u</tt>.  In this implementation, the cross references are simply definition lists. 
</p>
<p>The <b>xrefFoot()</b> method writes the footing for any of the cross reference blocks created by
<tt>@@f</tt>, <tt>@@m</tt>, or <tt>@@u</tt>.  In this implementation, the cross references are simply definition lists. 
</p>
<p>The <b>xrefLine()</b> method writes a line for the file or macro cross reference blocks created by
<tt>@@f</tt> or <tt>@@m</tt>.  In this implementation, the cross references are simply definition lists. 
</p>

@d HTML simple cross reference markup
@{
def xrefHead( self ):
    self.write( "<dl>\n" )
def xrefFoot( self ):
    self.write( "</dl>\n" )
def xrefLine( self, name, refList ):
    self.write( "<dt>%s:</dt><dd>" % name )
    for r in refList:
        self.write( '<a href="#pyweb%s">%s</a>  ' % (r,r) )
    self.write( "</dd>\n" )
@<HTML write user id cross reference line@>
@}

<p>The <b>xrefDefLine()</b> method writes a line for the user identifier cross reference blocks created by
<tt>@@u</tt>.  In this implementation, the cross references are simply definition lists.  The defining instance 
is included in sequence order with the other instances, but is bold and marked with a bullet (&bull;).
</p>

@d HTML write user id cross reference line
@{
def xrefDefLine( self, name, defn, refList ):
    self.write( "<dt>%s:</dt><dd>" % name )
    allPlaces= refList+[defn]
    allPlaces.sort()
    for r in allPlaces:
        if r == defn:
            self.write( '<a href="#pyweb%s"><b>&bull;%s</b></a>  ' 
                % (r,r) )
        else:
            self.write( '<a href="#pyweb%s">%s</a>  ' % (r,r) )
    self.write( "</dd>\n" )
@}

<h4>Tangler subclass of Emitter</h4>
<h5>Usage</h5>
<p>The <b>Tangler</b> class is concrete, and can tangle source files.  An
instance of <b>Tangler</b> is given to the <b>Web</b> class <b>tangle()</b> method.</p>
<pre>
w= Web()
WebReader( aFile ).load( w )
t= Tangler()
w.tangle( t )
</pre>

<h5>Design</h5>
<p>The <b>Tangler</b> subclass defines an Emitter used to <em>tangle</em> the various
program source files.  The superclass is used to simply emit correctly indented 
source code and do very little else that could corrupt or alter the output.
</p>
<p>Language-specific subclasses could be used to provide additional decoration,
for example by inserting <tt>#line</tt> directives that show the line number
in the original source file.
</p>
<h5>Implementation</h5>

@d Tangler subclass of Emitter...
@{
class Tangler( Emitter ):
    """Tangle output files."""
    @<Tangler doOpen, doClose and doWrite overrides@>
    @<Tangler code chunk begin@>
    @<Tangler code chunk end@>
@| Tangler @}

<p>The <b>doOpen()</b> method creates the named file; this is the default for all tanglers.
</p>
<p>This <b>doClose()</b> method overrides the <b>Emitter</b> class <b>doClose()</b> method by closing the
actual file created by <b>doOpen()</b>.
</p>
<p>This <b>doWrite()</b> method overrides the <b>Emitter</b> class <b>doWrite()</b> method by writing to the
actual file created by <b>doOpen()</b>.
</p>

@d Tangler doOpen...
@{
def doOpen( self, aFile ):
    self.fileName= aFile
    self.theFile= open( aFile, "w" )
    theLog.event( TangleStartEvent, "Tangling %r" % aFile )
def doClose( self ):
    self.theFile.close()
    theLog.event( TangleEndEvent, "Wrote %d lines to %r" 
        % (self.linesWritten,self.fileName) )
def doWrite( self, text ):
    self.theFile.write( text )
@}

<p>The <b>codeBegin()</b> method starts emitting a new chunk of code.
It does this by setting the Tangler's indent to the
prevailing indent at the start of the <tt>@@&lt;</tt> reference command.</p>

@d Tangler code chunk begin
@{
def codeBegin( self, aChunk ):
    self.setIndent()
@}

<p>The <b>codeEnd()</b> method finishes emitting a chunk of code.
It does this by resetting the Tangler's indent to the previous
setting.</p>

@d Tangler code chunk end
@{
def codeEnd( self, aChunk ):
    self.clrIndent()
@}

<h4>TanglerMake subclass of Tangler</h4>

<h5>Usage</h5>
<p>The <b>TanglerMake</b> class is concrete, and can tangle source files.  An
instance of <b>TanglerMake</b> is given to the <b>Web</b> class <b>tangle()</b> method.</p>
<pre>
w= Web()
WebReader( aFile ).load( w )
t= TanglerMake()
w.tangle( t )
</pre>

<h5>Design</h5>
<p>The <b>TanglerMake</b> subclass makes the <b>Tangler</b> used to <em>tangle</em> the various
program source files more make-friendly.  This subclass of <b>Tangler</b> 
does not <i>touch</i> an output file
where there is no change.  This is helpful when <em>pyWeb</em>'s output is
sent to <i>make</i>.  Using <b>TanglerMake</b> assures that only files with real changes
are rewritten, minimizing recompilation of an application for changes to
the associated documentation.
</p>

<h5>Implementation</h5>
@d Tangler subclass which is make-sensitive...
@{
class TanglerMake( Tangler ):
    """Tangle output files, leaving files untouched if there are no changes."""
    def __init__( self ):
        Tangler.__init__( self )
        self.tempname= None
    @<TanglerMake doOpen override, using a temporary file@>
    @<TanglerMake doClose override, comparing temporary to original@>
@| TanglerMake @}

<p>A <b>TanglerMake</b> creates a temporary file to collect the
tangled output.  When this file is completed, we can compare
it with the original file in this directory, avoiding
a "touch" if the new file is the same as the original.
</p>

@d TanglerMake doOpen...
@{
def doOpen( self, aFile ):
    self.tempname= tempfile.mktemp()
    self.theFile= open( self.tempname, "w" )
    theLog.event( TangleStartEvent, "Tangling %r" % aFile )
@}

<p>If there is a previous file, compare the temporary file and the previous file.  
If there was no previous file, or the files are different, rename the temporary file to replace the previous one;
otherwise, unlink the temporary file and discard it.  This preserves the original (with the original date
and time) if nothing has changed.
</p>

@d TanglerMake doClose...
@{
def doClose( self ):
    self.theFile.close()
    try:
        same= filecmp.cmp( self.tempname, self.fileName )
    except OSError,e:
        same= 0
    if same:
        theLog.event( SummaryEvent, "No change to %r" % (self.fileName) )
        os.remove( self.tempname )
    else:
        # Note that Windows requires the original file to be removed before the rename.
        os.rename( self.tempname, self.fileName )
        theLog.event( TangleEndEvent, "Wrote %d lines to %r" 
            % (self.linesWritten,self.fileName) )
@}
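<p>The compare-then-replace idea can be sketched on its own, outside the <b>Emitter</b> class hierarchy.
This is only an illustration under stated assumptions: it targets a current Python 3 interpreter,
the function name <tt>write_if_changed</tt> is hypothetical, and the use of <tt>tempfile.mkstemp()</tt> and
<tt>os.replace()</tt> is a substitution for the <tt>tempfile.mktemp()</tt> and <tt>os.rename()</tt> calls in the chunks above.
</p>

```python
import filecmp
import os
import tempfile


def write_if_changed(target, text):
    """Write text to target, leaving target untouched when nothing changed.

    Returns True if the target was created or replaced, False otherwise.
    """
    # Create the temporary file beside the target so that the final rename
    # never has to cross a file system boundary.
    directory = os.path.dirname(os.path.abspath(target))
    fd, temp_name = tempfile.mkstemp(dir=directory, text=True)
    with os.fdopen(fd, "w") as temp_file:
        temp_file.write(text)
    try:
        # shallow=False forces a comparison of the contents.
        same = filecmp.cmp(temp_name, target, shallow=False)
    except OSError:
        same = False  # no previous file, so there is nothing to preserve
    if same:
        os.remove(temp_name)
        return False
    # os.replace overwrites an existing target on POSIX and on Windows.
    os.replace(temp_name, target)
    return True
```

<p>When the function returns False, the target keeps its original modification time,
so a tool such as <i>make</i> sees no reason to rebuild anything that depends on it.
</p>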

<a name="emitterFactory"></a>
<h3>Emitter Factory</h3>

<h4>Usage</h4>
<p>We use the <i>Factory Method</i> design pattern to permit extending the <b>Emitter</b> class
hierarchy.  Any application that imports this basic <em>pyWeb</em> module can define
appropriate new subclasses, provide a subclass of this <b>EmitterFactory</b>, and use the
existing main program.
</p>
<pre>
import sys
import pyweb

class MyHTMLWeaver( pyweb.HTML ):
    ... (overrides to various methods) ...
    
class MyEmitterFactory( pyweb.EmitterFactory ):
    def mkEmitter( self, name ):
        """Make an Emitter - try superclass first, then locally defined."""
        s= pyweb.EmitterFactory.mkEmitter( self, name )
        if s: return s
        if name.lower() == 'myhtmlweaver': return MyHTMLWeaver()
        return None

if __name__ == "__main__":
    pyweb.main( MyEmitterFactory(), sys.argv ) 
</pre>

<h4>Design</h4>
<p>We use a <i>Chain of Responsibility</i>-like design for the <b>mkEmitter()</b> method.
A subclass first uses the parent class <b>mkEmitter()</b> to see if the name is recognized.
If it is not, then the subclass can match the added class names against the argument.
</p>

<h4>Implementation</h4>
<p>To emphasize the implementation, we provide an <b>EmitterFactorySuper</b> superclass
that creates only the basic <b>Weaver</b> and <b>Tangler</b> instances.  We subclass this
to create a more useful <b>EmitterFactory</b> that creates any of the emitters
defined in this base <em>pyWeb</em> module.
</p>

<p>The <b>EmitterFactorySuper</b> is a superclass that only 
recognizes the basic <b>Weaver</b> and <b>Tangler</b> emitters.
This must be subclassed to recognize the more useful emitters.</p>

@d Emitter Factory...
@{
class EmitterFactorySuper:
    def mkEmitter( self, name ):
        if name.lower() == 'weaver': return Weaver()
        elif name.lower() == 'tangler': return Tangler()
        return None
@}

<p>The <b>EmitterFactory</b> class is a subclass of <b>EmitterFactorySuper</b> that 
recognizes all of the various emitters defined in this module.  It also
shows how a subclass would be constructed.
</p>

@d Emitter Factory...
@{
class EmitterFactory( EmitterFactorySuper ):
    def mkEmitter( self, name ):
        """Make an Emitter - try superclass first, then locally defined."""
        s= EmitterFactorySuper.mkEmitter( self, name )
        if s: return s
        if name.lower() == 'html': return HTML()
        elif name.lower() == 'latex': return Latex()
        elif name.lower() == 'tanglermake': return TanglerMake()
        return None
@}
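<p>The lookup chain can be exercised in isolation with stand-in classes.
This is a sketch only, written for a current Python 3 interpreter: the emitter classes below are empty
placeholders for the real ones, and <tt>MyHTMLWeaver</tt> and <tt>MyEmitterFactory</tt> are
hypothetical application classes of the kind described under Usage.
</p>

```python
class Weaver:
    """Placeholder for the pyWeb Weaver class."""

class HTML(Weaver):
    """Placeholder for the pyWeb HTML weaver."""

class Tangler:
    """Placeholder for the pyWeb Tangler class."""


class EmitterFactorySuper:
    def mkEmitter(self, name):
        if name.lower() == "weaver":
            return Weaver()
        elif name.lower() == "tangler":
            return Tangler()
        return None


class EmitterFactory(EmitterFactorySuper):
    def mkEmitter(self, name):
        # Ask the parent first; only unrecognized names are handled here.
        s = EmitterFactorySuper.mkEmitter(self, name)
        if s:
            return s
        if name.lower() == "html":
            return HTML()
        return None


class MyHTMLWeaver(HTML):
    """Hypothetical application-specific weaver."""


class MyEmitterFactory(EmitterFactory):
    def mkEmitter(self, name):
        s = EmitterFactory.mkEmitter(self, name)
        if s:
            return s
        if name.lower() == "myhtmlweaver":
            return MyHTMLWeaver()
        return None


factory = MyEmitterFactory()
for requested in ("Weaver", "HTML", "MyHTMLWeaver", "NoSuchEmitter"):
    print(requested, "->", type(factory.mkEmitter(requested)).__name__)
```

<p>Each level only has to know the names it adds; a name that no level recognizes
falls through the whole chain and produces <b>None</b>.
</p>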

<h3>Chunks</h3>

<p>A <b>Chunk</b> is a piece of the input file.  It is a collection of <b>Command</b> instances.
A chunk can be woven or tangled to create output.</p>
<p>The two most important methods are the <b>weave()</b> and <b>tangle()</b> methods.  These
visit the commands of this chunk, producing the required output file.
</p>
<p>Additional methods (<b>startswith()</b>, <b>searchForRE()</b> and <b>usedBy()</b>)
 are used to examine the text of the <b>Command</b> instances within
the chunk.</p>
<p>A <b>Chunk</b> instance is created by the <b>WebReader</b> as the input file is parsed.
Each <b>Chunk</b> instance has one or more pieces of the original input text.  This text can be program source,
a reference command, or the documentation source.
</p>

@d Chunk class hierarchy...
@{
@<Chunk class@>
@<NamedChunk class@>
@<OutputChunk class@>
@<NamedDocumentChunk class@>
@}

<p>The <b>Chunk</b> class is both the superclass for this hierarchy and the implementation 
for anonymous chunks.  An anonymous chunk is always documentation in the 
target markup language.  No transformation is ever done on anonymous chunks.
</p>
<p>A <b>NamedChunk</b> is a chunk created with a <tt>@@d</tt> command.  
This is a chunk of source programming language, bracketed with <tt>@@{</tt> and <tt>@@}</tt>.
</p>
<p>An <b>OutputChunk</b> is a named chunk created with a <tt>@@o</tt> or <tt>@@O</tt> command.  
This must be a chunk of source programming language, bracketed with <tt>@@{</tt> and <tt>@@}</tt>.
</p>
<p>A <b>NamedDocumentChunk</b> is a named chunk created with a <tt>@@d</tt> command.  
This is a chunk of documentation in the target markup language,
 bracketed with <tt>@@[</tt> and <tt>@@]</tt>.
</p>

<h4>Chunk Superclass</h4>

<h5>Usage</h5>
<p>An instance of the <b>Chunk</b> class has a life that includes four important events:
creation, cross-reference, weave and tangle.</p>
<p>A <b>Chunk</b> is created by a <b>WebReader</b>, and associated with a <b>Web</b>.
There are several web append methods, depending on the exact subclass of <b>Chunk</b>.
The <b>WebReader</b> calls the chunk's <b>webAdd()</b> method to select the correct method
for appending and indexing the chunk.
Individual instances of <b>Command</b> are appended to the chunk.
The basic outline for creating a <b>Chunk</b> instance is as follows:</p>
<pre>
w= Web()
c= Chunk()
c.webAdd( w )
c.append( ...some Command... )
c.append( ...some Command... )
</pre>
<p>Before weaving or tangling, a cross reference is created for all
user identifiers in all of the <b>Chunk</b> instances.
This is done by: (1) visit each <b>Chunk</b> and call the 
<b>getUserIDRefs()</b> method to gather all identifiers; (2) for each identifier, 
visit each <b>Chunk</b> and call the <b>searchForRE()</b> method to find uses of
the identifier.</p>
<pre>
ident= []
for c in <i>the Web's named chunk list</i>:
    ident.extend( c.getUserIDRefs() )
for i in ident:
    pattern= re.compile('\W%s\W' % i)
    for c in <i>the Web's named chunk list</i>:
        c.searchForRE( pattern, self )
</pre>
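<p>The search pattern in the outline above brackets the identifier with <tt>\W</tt> on both sides.
The following stand-alone sketch, for a current Python 3 interpreter, shows what that pattern does and does not match.
The helper name <tt>identifier_pattern</tt> is illustrative, and the call to <tt>re.escape()</tt> is an
addition that the outline does not make.
</p>

```python
import re


def identifier_pattern(identifier):
    """Build a search pattern of the form used in the outline above."""
    # re.escape is an addition: it keeps an identifier that contains regular
    # expression metacharacters from being read as pattern syntax.
    return re.compile(r"\W%s\W" % re.escape(identifier))


pattern = identifier_pattern("Chunk")

# Found: the identifier has a non-word character on each side.
print(bool(pattern.search("class Chunk:")))
# Not found: "Chunk" appears only as part of longer identifiers.
print(bool(pattern.search("class NamedChunk( Chunk_base ):")))
# Not found: \W must consume a character, so an identifier at the very
# start or end of the text is missed.
print(bool(pattern.search("Chunk")))
```

<p>The last case is worth keeping in mind when reading a cross reference: an identifier that
begins or ends a block of text is not reported by a pattern of this form.
</p>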
<p>A <b>Chunk</b> is woven or tangled by the <b>Web</b>.  The basic outline for weaving is
as follows.  The tangling operation is essentially the same.</p>
<pre>
for c in <i>the Web's chunk list</i>:
    c.weave( self, aWeaver )
</pre>

<h5>Design</h5>
<p>The <b>Chunk</b> class contains the overall definitions for all of the
various specialized subclasses.  In particular, it contains the <b>append()</b>, <b>appendChar()</b>
and <b>appendText()</b> methods used by all of the various <b>Chunk</b> subclasses.
</p>

<p>When a <tt>@@@@</tt> construct is located in the input stream, the stream contains
three text tokens: material before the <tt>@@@@</tt>, the <tt>@@@@</tt>, 
and the material after the <tt>@@@@</tt>.
These three tokens are reassembled into a single block of text.  This reassembly
is accomplished by changing the chunk's state so that the next <b>TextCommand</b> is
appended onto the previous <b>TextCommand</b>.
</p>

<p>There are two operating states for instances of this class.  The state change
is accomplished on a call to the <b>appendChar()</b> method, and alters the behavior of
the <b>appendText()</b> method.  The <b>appendText()</b> method either:</p>
<ul>
<li>creates a <b>TextCommand</b> instance and appends it to this chunk,</li>
<li><em>or</em> appends the text block of a <b>TextCommand</b> to the end of the most recent 
<b>Command</b> instance (which must be a <b>TextCommand</b>), and then changes state back to 
ordinary append mode.</li>
</ul>

<p>Each subclass of <b>Chunk</b> has a particular type of text that it will process.  Anonymous chunks
only handle document text.  The <b>makeContent()</b> method
creates the appropriate <b>Command</b> instance for this <b>Chunk</b> subclass.
The <b>NamedChunk</b> subclass, which handles program source,
overrides this method to create a different command type.
</p>

<p>The <b>weave()</b> method of an anonymous <b>Chunk</b> uses the weaver's 
<b>docBegin()</b> and <b>docEnd()</b>
methods to insert text that is source markup.  Other subclasses will override this to 
use different <b>Weaver</b> methods for different kinds of text.
</p>

<h5>Implementation</h5>

<p>The <b>Chunk</b> constructor initializes the following instance variables:</p>
<ul>
<li><i>commands</i> is a sequence of the various <b>Command</b> instances that comprise this
chunk.</li>
<li><i>lastCommand</i> is used to force a character to be appended to the last
command (which must be a <b>TextCommand</b> instance) instead of appending a new command.</li>
<li><i>big_definition</i> is used to recognize a <tt>@@O</tt> definition instead of the
default <tt>@@o</tt> definition.</li>
<li><i>xref</i> is the list of user identifiers associated with
this chunk.  This attribute is always <b>None</b> for this class.
The <b>NamedChunk</b> subclass, however, can have user identifiers.</li>
<li><i>firstSecond</i> is used to hold a flag showing if this is the first
definition ('=') or a subsequent definition ('+=').</li>
<li><i>name</i> has the name of the chunk.  This is '' for anonymous chunks.</li>
<li><i>seq</i> has the sequence number associated with this chunk.  This is None
for anonymous chunks.</li>
</ul>


@d Chunk class
@{
class Chunk:
    """Anonymous piece of input file: will be output through the weaver only."""
    # construction and insertion into the web
    def __init__( self ):
        self.commands= [ ]
        self.lastCommand= None
        self.big_definition= None
        self.xref= None
        self.firstSecond= None
        self.name= ''
        self.seq= None
    def __str__( self ):
        return "\n".join( map( str, self.commands ) )
    @<Chunk append a command@>
    @<Chunk append a character@>
    @<Chunk append text@>
    @<Chunk add to the web@>
    def makeContent( self, text, lineNumber=0 ):
        return TextCommand( text, lineNumber )
    @<Chunk examination: starts with, matches pattern, references@>
    @<Chunk weave@>
    @<Chunk tangle@>
@| Chunk @}

<p>The <b>append()</b> method simply appends a <b>Command</b> instance to this chunk.</p>

@d Chunk append a command
@{
def append( self, command ):
    """Add another Command to this chunk."""
    self.commands.append( command )
@}

<p>When an <tt>@@@@</tt> construct is located, the <b>appendChar()</b> method:</p>
<ol>
<li>accumulates the <tt>@@</tt> character at the end of the previous <b>TextCommand</b>,</li>
<li>and changes the state of the chunk so that the next <b>TextCommand</b> is 
concatenated, also.</li>
</ol>

@d Chunk append a character
@{
def appendChar( self, text, lineNumber=0 ):
    """Append a single character to the most recent TextCommand."""
    if len(self.commands)==0 or not isinstance(self.commands[-1],TextCommand):
        self.commands.append( self.makeContent("",lineNumber) )
    self.commands[-1].text += text
    self.lastCommand= self.commands[-1]
@}

<p>The <b>appendText()</b> method appends a <b>TextCommand</b> to this chunk,
or it appends it to the most recent <b>TextCommand</b>.  This condition is
defined by the <b>appendChar()</b> method.
</p>
@d Chunk append text
@{
def appendText( self, text, lineNumber=0 ):
    """Add another TextCommand to this chunk or concatenate to the most recent TextCommand."""
    if self.lastCommand:
        assert len(self.commands)>=1 and isinstance(self.commands[-1],TextCommand)
        self.commands[-1].text += text
        self.lastCommand= None
    else:
        self.commands.append( self.makeContent(text,lineNumber) )
@}
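<p>The interplay between <b>appendChar()</b> and <b>appendText()</b> can be traced with a reduced model.
This sketch, for a current Python 3 interpreter, keeps only the append logic shown above;
the <tt>TextCommand</tt> class here is a bare placeholder, and the sample text is invented for the illustration.
It plays back the three tokens produced by a <tt>@@@@</tt> construct, followed by one ordinary block of text.
</p>

```python
class TextCommand:
    """Placeholder for the pyWeb TextCommand: it only holds a block of text."""
    def __init__(self, text, lineNumber=0):
        self.text = text
        self.lineNumber = lineNumber


class Chunk:
    """Only the append logic of the Chunk class, for illustration."""
    def __init__(self):
        self.commands = []
        self.lastCommand = None

    def makeContent(self, text, lineNumber=0):
        return TextCommand(text, lineNumber)

    def appendChar(self, text, lineNumber=0):
        if len(self.commands) == 0 or not isinstance(self.commands[-1], TextCommand):
            self.commands.append(self.makeContent("", lineNumber))
        self.commands[-1].text += text
        self.lastCommand = self.commands[-1]  # switch to the concatenate state

    def appendText(self, text, lineNumber=0):
        if self.lastCommand:
            self.commands[-1].text += text
            self.lastCommand = None           # back to the ordinary append state
        else:
            self.commands.append(self.makeContent(text, lineNumber))


# The at-sign, spelled as a hex escape so that this listing does not contain
# the character that pyWeb treats as a command prefix.
AT = "\x40"

c = Chunk()
c.appendText("write to user")   # material before the escaped character
c.appendChar(AT)                # the escaped character itself
c.appendText("example.com")     # material after it: concatenated
c.appendText(" Next block.")    # ordinary text: starts a new command

for command in c.commands:
    print(repr(command.text))
```

<p>The first three tokens end up in a single <b>TextCommand</b>; the fourth starts a new one,
because <b>appendText()</b> has already switched the chunk back to ordinary append mode.
</p>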

<p>The <b>webAdd()</b> method adds this chunk to the given document web.
Each subclass of the <b>Chunk</b> class must override this to be sure that the various
<b>Chunk</b> subclasses are indexed properly.  The
<b>Chunk</b> class uses the <b>add()</b> method
of the <b>Web</b> class to append an anonymous, unindexed chunk.
</p>

@d Chunk add to the web
@{
def webAdd( self, web ):
    """Add self to a Web as anonymous chunk."""
    web.add( self )
@}

<p>The <b>startswith()</b> method examines the first <b>Command</b> instance of this
<b>Chunk</b> instance to see if it starts
with the given prefix string.
</p>
<p>The <b>lineNumber()</b> method returns the line number of the first
<b>Command</b> in this chunk.  This provides some context for where the chunk
occurs in the original input file.
</p>
<p>A <b>NamedChunk</b> instance may define one or more identifiers.  This parent class
provides a dummy version of the <b>getUserIDRefs</b> method.  The <b>NamedChunk</b>
subclass overrides this to provide actual results.  By providing this
at the superclass level, the <b>Web</b> can easily gather identifiers without
knowing the actual subclass of <b>Chunk</b>.
</p>
<p>The <b>searchForRE()</b> method examines each <b>Command</b> instance to see if it matches
with the given regular expression.  If so, this can be reported to the Web instance
and accumulated as part of a cross reference for this <b>Chunk</b>.
</p>
<p>The <b>usedBy()</b> method visits each <b>Command</b> instance; a <b>Command</b>
instance calls the <b>Web</b> class <b>setUsage()</b> method to report the references 
from this <b>Chunk</b> to other <b>Chunks</b>.  This set of references can be reversed to identify
the chunks that refer to this chunk.
</p>

@d Chunk examination...
@{
def startswith( self, prefix ):
    """Examine the first command's starting text."""
    return len(self.commands) >= 1 and self.commands[0].startswith( prefix )
def searchForRE( self, rePat, aWeb ):
    """Visit each command, applying the pattern."""
    @<Chunk search for user identifiers done by iteration through each command@>
def usedBy( self, aWeb ):
    """Update web's used-by xref."""
    @<Chunk usedBy update done by iteration through each command@>
def lineNumber( self ):
    """Return the first command's line number or None."""
    if len(self.commands) >= 1:
        return self.commands[0].lineNumber
    return None
def getUserIDRefs( self ):
    return []
@}

<p>The chunk search in the <b>searchForRE()</b> method parallels weaving and tangling a <b>Chunk</b>.
The operation is delegated to each <b>Command</b> instance within the <b>Chunk</b> instance.
</p>

@d Chunk search...
@{
for c in self.commands:
    if c.searchForRE( rePat, aWeb ):
        return self
return None
@}

<p>The <b>usedBy()</b> update visits each Command instance.  It calls the <b>Command</b> class
<b>usedBy()</b> method, passing in the overall <b>Web</b> instance and this <b>Chunk</b> instance.
This allows the <b>Command</b> to generate a reference from this <b>Chunk</b> to another <b>Chunk</b>, 
and notify the <b>Web</b> instance of this reference.  
The <b>Command</b>, if it is a <b>ReferenceCommand</b>, will also update 
the <b>Chunk</b> instance <i>refCount</i> attribute.
</p>
<p>Note that an exception may be raised by this operation if a referenced
<b>Chunk</b> does not actually exist.  If a reference <b>Command</b> does raise an error, 
we append this <b>Chunk</b> information and reraise the error with the additional 
context information.
</p>

@d Chunk usedBy update...
@{
try:
    for t in self.commands:
        t.usedBy( aWeb, self )
except Error,e:
    raise Error,e.args+(self,)
@}
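<p>The effect of appending the chunk to the error can be seen in a reduced model.
This is a sketch for a current Python 3 interpreter, so it uses the <tt>except ... as</tt> form
in place of the older syntax in the chunk above.  The <tt>Error</tt> class is a placeholder for the
<em>pyWeb</em> exception, and <tt>MissingReference</tt> is a hypothetical command that always fails.
</p>

```python
class Error(Exception):
    """Placeholder for the pyWeb Error exception."""


class MissingReference:
    """Hypothetical command whose reference cannot be resolved."""
    def usedBy(self, aWeb, aChunk):
        raise Error("No chunk named 'missing'")


class Chunk:
    def __init__(self, name, commands):
        self.name = name
        self.commands = commands

    def usedBy(self, aWeb):
        try:
            for t in self.commands:
                t.usedBy(aWeb, self)
        except Error as e:
            # Re-raise with this chunk appended to the original arguments.
            raise Error(*(e.args + (self,)))


chunk = Chunk("main", [MissingReference()])
caught = None
try:
    chunk.usedBy(aWeb=None)
except Error as e:
    caught = e

message, where = caught.args
print("%s (in chunk %r)" % (message, where.name))
```

<p>Whoever finally reports the error can use the extra argument to say which chunk
contained the bad reference.
</p>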


<p>The <b>weave()</b> method weaves this chunk into the final document as follows:</p>
<ol>
<li>call the <b>Weaver</b> class <b>docBegin()</b> method.  This method does nothing for document content.</li>
<li>visit each <b>Command</b> instance: call the <b>Command</b> instance <b>weave()</b> method to 
emit the content of the <b>Command</b> instance</li>
<li>call the <b>Weaver</b> class <b>docEnd()</b> method.  This method does nothing for document content.</li>
</ol>
<p>Note that an exception may be raised by this operation if a referenced
<b>Chunk</b> does not actually exist.  If a reference <b>Command</b> does raise an error, 
we append this <b>Chunk</b> information and reraise the error with the additional 
context information.
</p>

@d Chunk weave
@{
def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from an anonymous chunk."""
    aWeaver.docBegin( self )
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error, e:
        raise Error,e.args+(self,)
    aWeaver.docEnd( self )
def weaveReferenceTo( self, aWeb, aWeaver ):
    """Create a reference to this chunk -- except for anonymous chunks."""
    raise Exception( "Cannot reference an anonymous chunk." )
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """Create a short reference to this chunk -- except for anonymous chunks."""
    raise Exception( "Cannot reference an anonymous chunk." )
@}

<p>Anonymous chunks cannot be tangled.  Any attempt indicates a serious
problem with this program or the input file.</p>

@d Chunk tangle
@{
def tangle( self, aWeb, aTangler ):
    """Create source code -- except anonymous chunks should not be tangled"""
    raise Error( 'Cannot tangle an anonymous chunk', self )
@}

<h4>NamedChunk class</h4>

<h5>Usage</h5>
<p>A <b>NamedChunk</b> is created and used almost identically to an anonymous <b>Chunk</b>.
The most significant difference is that a name is provided when the <b>NamedChunk</b> is created.
This name is used by the <b>Web</b> to organize the chunks.
</p>

<h5>Design</h5>

<p>A <b>NamedChunk</b> is created with a <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt> command.  
A <b>NamedChunk</b> contains programming language source
 when the brackets are <tt>@@{</tt> and <tt>@@}</tt>.  A
separate subclass, <b>NamedDocumentChunk</b>, is used when
the brackets are <tt>@@[</tt> and <tt>@@]</tt>.
</p>
<p>A <b>NamedChunk</b> can be both tangled into the output program files, and
woven into the output document file. 
</p>
<p>The <b>weave()</b> method of a <b>NamedChunk</b> uses the Weaver's 
<b>codeBegin()</b> and <b>codeEnd()</b>
methods to insert text that is program source and requires additional
markup to make it stand out from documentation.  Other subclasses can override this to 
use different <b>Weaver</b> methods for different kinds of text.
</p>

<h5>Implementation</h5>

<p>This class introduces some additional attributes.</p>
<ul>
<li><i>fullName</i> is the full name of the chunk.  It's possible for a 
chunk to be an abbreviated forward reference; full names cannot be resolved
until all chunks have been seen.</li>
<li><i>xref</i> is the list of user identifiers associated with this chunk.</li>
<li><i>refCount</i> is the count of references to this chunk.  If this is
zero, the chunk is unused; if this is more than one, this chunk is 
multiply used.  Either of these conditions is a possible error in the input. 
This is set by the <b>usedBy()</b> method.</li>
<li><i>name</i> has the name of the chunk.  Names can be abbreviated.</li>
<li><i>seq</i> has the sequence number associated with this chunk.  This
is set by the <b>Web</b> when the chunk's <b>webAdd()</b> method is called.</li>
</ul>

@d NamedChunk class
@{
class NamedChunk( Chunk ):
    """Named piece of input file: will be both tangled and woven."""
    def __init__( self, name ):
        Chunk.__init__( self )
        self.name= name
        self.seq= None
        self.fullName= None
        self.xref= []
        self.refCount= 0
    def __str__( self ):
        return "%r: %s" % ( self.name, Chunk.__str__(self) )
    def makeContent( self, text, lineNumber=0 ):
        return CodeCommand( text, lineNumber )
    @<NamedChunk user identifiers set and get@>
    @<NamedChunk add to the web@>
    @<NamedChunk weave@>
    @<NamedChunk tangle@>
@| NamedChunk @}

<p>The <b>setUserIDRefs()</b> method accepts a list of user identifiers that are
associated with this chunk.  These are provided after the <tt>@@|</tt> separator
in a <tt>@@d</tt> named chunk.  These are used by the <tt>@@u</tt> cross reference generator.
</p>
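<p>As a small illustration of the whitespace split, here is a minimal sketch in present-day Python.
<tt>SketchChunk</tt> is a hypothetical stand-in that copies only the two user-identifier methods;
it is not part of <em>pyWeb</em>.</p>

```python
class SketchChunk:
    """Hypothetical stand-in that keeps only the user-identifier behaviour."""
    def __init__(self):
        self.xref = []

    def setUserIDRefs(self, text):
        # Any run of spaces or newlines separates two identifiers.
        self.xref = text.split()

    def getUserIDRefs(self):
        return self.xref


chunk = SketchChunk()
chunk.setUserIDRefs(" InputEvent OptionsEvent\nReadEvent ")
ids = chunk.getUserIDRefs()
```

<p>Because <tt>split()</tt> is called with no argument, leading and trailing whitespace is
discarded, so the identifier list may be wrapped over several lines in the source file.</p>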

@d NamedChunk user identifiers...
@{
def setUserIDRefs( self, text ):
    """Save xref variable names."""
    self.xref= text.split()
def getUserIDRefs( self ):
    return self.xref
@}

<p>The <b>webAdd()</b> method adds this chunk to the given document <b>Web</b> instance.
Each class of <b>Chunk</b> must override this to be sure that the various
<b>Chunk</b> classes are indexed properly.  This class uses the <b>addNamed()</b> method
of the <b>Web</b> class to append a named chunk.
</p>

@d NamedChunk add to the web
@{
def webAdd( self, web ):
    """Add self to a Web as named chunk, update xrefs."""
    web.addNamed( self )
@}

<p>The <b>weave()</b> method weaves this chunk into the final document as follows:</p>
<ol>
<li>call
the <b>Weaver</b> class <b>codeBegin()</b> method.  This method emits the necessary markup
for code appearing in the woven output.</li>
<li>visit each <b>Command</b>, calling the command's <b>weave()</b> method to emit the command's content</li>
<li>call the <b>Weaver</b> class <b>codeEnd()</b> method.  This method emits the necessary markup
for code appearing in the woven output.</li>
</ol>

<p>The <b>weaveReferenceTo()</b> method weaves a reference to a chunk using both name and sequence number.
The <b>weaveShortReferenceTo()</b> method weaves a reference to a chunk using only the sequence number.
These references are created by <b>ReferenceCommand</b> instances within a chunk being woven.
</p>
<p>If a <b>ReferenceCommand</b> does raise an error during weaving,
we append this <b>Chunk</b> information and reraise the error with the additional 
context information.
</p>
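<p>The re-raise pattern can be sketched on its own as follows.  This is an illustration in
present-day Python syntax, not the tangled program text; <tt>FailingCommand</tt> and
<tt>SketchChunk</tt> are hypothetical names invented for the example.</p>

```python
class Error(Exception):
    """Application-specific exception, as in the Error class section."""
    pass


class FailingCommand:
    """Hypothetical command whose weave() always fails."""
    def weave(self, aWeb, aWeaver):
        raise Error("No full name for 'spam...'")


class SketchChunk:
    """Hypothetical chunk that only demonstrates the re-raise pattern."""
    def __init__(self, name, commands):
        self.name = name
        self.commands = commands

    def weave(self, aWeb, aWeaver):
        for t in self.commands:
            try:
                t.weave(aWeb, aWeaver)
            except Error as e:
                # Keep the original arguments and append this chunk as context.
                raise Error(*(e.args + (self,)))


chunk = SketchChunk("demo", [FailingCommand()])
try:
    chunk.weave(None, None)
    caught = None
except Error as e:
    caught = e.args
```

<p>The handler at the top level then finds the message in the first argument and the
offending chunk in the last one.</p>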

@d NamedChunk weave
@{
def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from a chunk of code."""
    # format as <pre> in a different-colored box
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.codeBegin( self )
    for t in self.commands:
        try:
            t.weave( aWeb, aWeaver )
        except Error,e:
            raise Error,e.args+(self,)
    aWeaver.codeEnd( self, aWeb.chunkReferencedBy( self.seq ) )
def weaveReferenceTo( self, aWeb, aWeaver ):
    """Create a reference to this chunk."""
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.referenceTo( self.fullName, self.seq )
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """Create a shortened reference to this chunk."""
    aWeaver.referenceTo( None, self.seq )
@}

<p>The <b>tangle()</b> method tangles this chunk into the output file as follows:</p>
<ol>
<li>call the <b>Tangler</b> class <b>codeBegin()</b> method to set indents properly.</li>
<li>visit each Command, calling the Command's <b>tangle()</b> method to emit the Command's content</li>
<li>call the <b>Tangler</b> class <b>codeEnd()</b> method to restore indents.</li>
</ol>
<p>If a <b>ReferenceCommand</b> does raise an error during tangling,
we append this Chunk information and reraise the error with the additional 
context information.
</p>
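<p>The order of calls in the three steps above can be checked with a recording stub.
The sketch below is only an illustration: <tt>RecordingTangler</tt>, <tt>SketchCode</tt> and
<tt>tangle_chunk()</tt> are hypothetical, and the real <b>Tangler</b> manages indentation
rather than keeping a list.</p>

```python
class RecordingTangler:
    """Hypothetical tangler that records calls instead of writing a file."""
    def __init__(self):
        self.calls = []

    def codeBegin(self, aChunk):
        self.calls.append(("codeBegin",))

    def codeBlock(self, text):
        self.calls.append(("codeBlock", text))

    def codeEnd(self, aChunk):
        self.calls.append(("codeEnd",))


class SketchCode:
    """Hypothetical command holding one block of source text."""
    def __init__(self, text):
        self.text = text

    def tangle(self, aWeb, aTangler):
        aTangler.codeBlock(self.text)


def tangle_chunk(commands, aWeb, aTangler):
    # The same three steps as NamedChunk.tangle(), without error handling.
    aTangler.codeBegin(None)
    for t in commands:
        t.tangle(aWeb, aTangler)
    aTangler.codeEnd(None)


tangler = RecordingTangler()
tangle_chunk([SketchCode("x = 1\n"), SketchCode("y = 2\n")], None, tangler)
```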

@d NamedChunk tangle
@{
def tangle( self, aWeb, aTangler ):
    """Create source code."""
    # use aWeb to resolve @@<namedChunk@@>
    # format as correctly indented source text
    aTangler.codeBegin( self )
    for t in self.commands:
        try:
            t.tangle( aWeb, aTangler )
        except Error,e:
            raise Error,e.args+(self,)
    aTangler.codeEnd( self )
@}

<h4>OutputChunk class</h4>
<h5>Usage</h5>
<p>An <b>OutputChunk</b> is created and used identically to a <b>NamedChunk</b>.
The difference between this class and the parent class is the decoration of 
the markup when weaving.
</p>

<h5>Design</h5>

<p>The <b>OutputChunk</b> class is a subclass of <b>NamedChunk</b> that handles 
file output chunks defined with <tt>@@o</tt> or <tt>@@O</tt>.  These are woven slightly
differently, to allow for a presentation of the file
chunks that is different from the presentation of the other named
chunks.
</p>
<p>The <b>weave()</b> method of an <b>OutputChunk</b> uses the Weaver's 
<b>fileBegin()</b> and <b>fileEnd()</b>
methods to insert text that is program source and requires additional
markup to make it stand out from documentation.  Other subclasses could override this to 
use different <b>Weaver</b> methods for different kinds of text.
</p>
<p>All other methods, including the <b>tangle()</b> method, are identical to <b>NamedChunk</b>.</p>

<h5>Implementation</h5>

@d OutputChunk class
@{
class OutputChunk( NamedChunk ):
    """Named piece of input file, defines an output tangle."""
    @<OutputChunk add to the web@>
    @<OutputChunk weave@>
@| OutputChunk @}

<p>The <b>webAdd()</b> method adds this chunk to the given document <b>Web</b>.
Each class of <b>Chunk</b> must override this to be sure that the various
<b>Chunk</b> classes are indexed properly.  This class uses the <b>addOutput()</b> method
of the <b>Web</b> class to append a file output chunk.
</p>

@d OutputChunk add to the web
@{
def webAdd( self, web ):
    """Add self to a Web as output chunk, update xrefs."""
    web.addOutput( self )
@}

<p>The <b>weave()</b> method weaves this chunk into the final document as follows:</p>
<ol>
<li>call the <b>Weaver</b> class <b>fileBegin()</b> method to emit proper markup for an output file chunk.</li>
<li>visit each <b>Command</b>, calling the Command's <b>weave()</b> method to emit the Command's content</li>
<li>call the <b>Weaver</b> class <b>fileEnd()</b> method to emit proper markup for an output file chunk.</li>
</ol>
<p>Tangling is inherited unchanged from <b>NamedChunk</b>; only the weaving
differs.</p>
<p>If a <b>ReferenceCommand</b> does raise an error during weaving,
we append this <b>Chunk</b> information and reraise the error with the additional 
context information.
</p>

@d OutputChunk weave
@{
def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from a chunk of code."""
    # format as <pre> in a different-colored box
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.fileBegin( self )
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error,e:
        raise Error,e.args+(self,)
    aWeaver.fileEnd( self, aWeb.chunkReferencedBy( self.seq ) )
@}

<h4>NamedDocumentChunk class</h4>
<h5>Usage</h5>
<p>A <b>NamedDocumentChunk</b> is created and used identically to a <b>NamedChunk</b>.
The difference between this class and the parent class is that this chunk
is only woven when referenced.  The original definition is silently skipped.
</p>

<h5>Design</h5>

<p>The <b>NamedDocumentChunk</b> class is a subclass of <b>NamedChunk</b> that handles 
named chunks defined with <tt>@@d</tt> and the <tt>@@[</tt>...<tt>@@]</tt> delimiters.  
These are woven slightly
differently, since they are document source, not programming language source.
</p>
<p>We're not as interested in the cross reference of named document chunks.
They can be used multiple times or never.  They are often referenced
by anonymous chunks.  While this chunk subclass participates in this data 
gathering, it is ignored for reporting purposes.</p>
<p>All other methods are identical to <b>NamedChunk</b>, except for <b>tangle()</b>, which raises an <b>Error</b>.</p>

<h5>Implementation</h5>

@d NamedDocumentChunk class
@{
class NamedDocumentChunk( NamedChunk ):
    """Named piece of input file with document source; woven only when referenced."""
    def makeContent( self, text, lineNumber=0 ):
        return TextCommand( text, lineNumber )
    @<NamedDocumentChunk weave@>
    @<NamedDocumentChunk tangle@>
@| NamedDocumentChunk @}

<p>The <b>weave()</b> method quietly ignores this chunk in the document.
A named document chunk is only included when it is referenced 
during weaving of another chunk (usually an anonymous document
chunk).
</p>
<p>The <b>weaveReferenceTo()</b> method inserts the content of this
chunk into the output document.  This is done in response to a
<b>ReferenceCommand</b> in another chunk.  
The <b>weaveShortReferenceTo()</b> method calls the <b>weaveReferenceTo()</b>
to insert the entire chunk.
</p>
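<p>The difference between weaving the definition and weaving a reference can be seen in a
short sketch.  The classes here (<tt>ListWeaver</tt>, <tt>SketchText</tt>,
<tt>SketchDocumentChunk</tt>) are hypothetical stand-ins written in present-day Python;
they mimic the rules described above without error handling.</p>

```python
class ListWeaver:
    """Hypothetical weaver that collects output in a list."""
    def __init__(self):
        self.out = []

    def write(self, text):
        self.out.append(text)


class SketchText:
    """Hypothetical text command."""
    def __init__(self, text):
        self.text = text

    def weave(self, aWeb, aWeaver):
        aWeaver.write(self.text)


class SketchDocumentChunk:
    """Hypothetical chunk with the same weaving rules as NamedDocumentChunk."""
    def __init__(self, commands):
        self.commands = commands

    def weave(self, aWeb, aWeaver):
        pass  # the definition itself produces no output

    def weaveReferenceTo(self, aWeb, aWeaver):
        for t in self.commands:
            t.weave(aWeb, aWeaver)  # a reference expands the body in place


weaver = ListWeaver()
boilerplate = SketchDocumentChunk([SketchText("Copyright notice.")])
boilerplate.weave(None, weaver)
after_definition = list(weaver.out)
boilerplate.weaveReferenceTo(None, weaver)
after_reference = list(weaver.out)
```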

@d NamedDocumentChunk weave
@{
def weave( self, aWeb, aWeaver ):
    """Ignore this when producing the document."""
    pass
def weaveReferenceTo( self, aWeb, aWeaver ):
    """On a reference to this chunk, expand the body in place."""
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error,e:
        raise Error,e.args+(self,)
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """On a reference to this chunk, expand the body in place."""
    self.weaveReferenceTo( aWeb, aWeaver )
@}

@d NamedDocumentChunk tangle
@{
def tangle( self, aWeb, aTangler ):
    """Raise an exception on an attempt to tangle."""
    raise Error( "Cannot tangle a chunk defined with @@[." )
@}

<h3>Commands</h3>

<p>The input stream is broken into individual commands, based on the
various <tt>@@<i>x</i></tt> strings in the file.  There are several subclasses of <b>Command</b>,
each used to describe a different command or block of text in the input.
</p>

<p>All instances of the <b>Command</b> class are created by a <b>WebReader</b> instance.  
In this case, a <b>WebReader</b> can be thought of as a factory for <b>Command</b> instances.
Each <b>Command</b> instance is appended to the sequence of commands that
belong to a <b>Chunk</b>.  A chunk may be as small as a single command, or a long sequence
of commands.</p>

<p>Each command instance responds to methods to examine the content, gather 
cross reference information and tangle a file or weave the final document.
</p>

@d Command class hierarchy...
@{
@<Command superclass@>
@<TextCommand class to contain a document text block@>
@<CodeCommand class to contain a program source code block@>
@<XrefCommand superclass for all cross-reference commands@>
@<FileXrefCommand class for an output file cross-reference@>
@<MacroXrefCommand class for a named chunk cross-reference@>
@<UserIdXrefCommand class for a user identifier cross-reference@>
@<ReferenceCommand class for chunk references@>
@}

<h4>Command Superclass</h4>

<h5>Usage</h5>
<p>A <b>Command</b> is created by the <b>WebReader</b>, and attached to a <b>Chunk</b>.
The Command participates in cross reference creation, weaving and tangling.
</p>
<p>The <b>Command</b> superclass is abstract, and has default methods factored out
of the various subclasses.  When a subclass is created, it will override some
of the methods provided in this superclass.
</p>
<pre>
class MyNewCommand( Command ):
    ... overrides for various methods ...
</pre>
<p>Additionally, a subclass of <b>WebReader</b> must be defined to parse the new command
syntax.  The main <b>process()</b> function must also be updated to use this new subclass
of <b>WebReader</b>.</p>

<h5>Design</h5>

<p>The <b>Command</b> superclass provides the parent class definition
for all of the various command types.  The most common command
is a block of text, which is woven or tangled.  The next most
common command is a reference to a chunk, which is woven as a 
mark-up reference, but tangled as an expansion of the source 
code.
</p>
<ul>
<li>The <b>startswith()</b> method examines any source text to see if
it begins with the given prefix text.</li>
<li>The <b>searchForRE()</b> method examines any source text to see if
it matches the given regular expression, usually a match for a user identifier.</li>
<li>The <b>usedBy()</b> method is ignored by all but the <b>ReferenceCommand</b> subclass,
which calls back to the web with the reference made by the
parent chunk.</li>
<li>The <b>weave()</b> method weaves this into the output.  If a document text
command, it is emitted directly; if a program source code command, 
markup is applied.  In the case of cross-reference commands,
the actual cross-reference content is emitted.  In the case of 
reference commands, they are woven as a reference to a named
chunk.</li>
<li>The <b>tangle()</b> method tangles this into the output.  If
this is a document text command, it is ignored; if this is a
program source code
command, it is indented and emitted.  In the case of cross-reference
commands, no output is produced.  In the case of reference
commands, the named chunk is indented and emitted.</li>
</ul>
<p>The attributes of a <b>Command</b> instance include the line number on which
the command began, in <i>lineNumber</i>.</p>

<h5>Implementation</h5>

@d Command superclass
@{
class Command:
    """A Command is the lowest level of granularity in the input stream."""
    def __init__( self, fromLine=0 ):
        self.lineNumber= fromLine
    def __str__( self ):
        return "at %r" % self.lineNumber
    def startswith( self, prefix ):
        return None
    def searchForRE( self, rePat, aWeb ):
        return None
    def usedBy( self, aWeb, aChunk ):
        pass
    def weave( self, aWeb, aWeaver ):
        pass
    def tangle( self, aWeb, aTangler ):
        pass
@| Command @}

<h4>TextCommand class</h4>
<h5>Usage</h5>

<p>A <b>TextCommand</b> is created by a <b>Chunk</b> or a <b>NamedDocumentChunk</b> when a 
<b>WebReader</b> calls the chunk's <b>appendChar()</b> or <b>appendText()</b> method.
This Command participates in cross reference creation, weaving and tangling.  When it is
created, the source line number is provided so that this text can be tied back
to the source document. 
</p>

<h5>Design</h5>
<p>An instance of the <b>TextCommand</b> class is a block of document text.  It can originate
in an anonymous block or a named chunk delimited with <tt>@@[</tt> and <tt>@@]</tt>.
</p>
<p>This subclass provides a concrete implementation for all of the methods.  Since
text is the author's original markup language, it is emitted directly to the weaver
or tangler.
</p>

<h5>Implementation</h5>


@d TextCommand class...
@{
class TextCommand( Command ):
    """A piece of document source text."""
    def __init__( self, text, fromLine=0 ):
        Command.__init__( self, fromLine )
        self.text= text
    def __str__( self ):
        return "at %r: %r..." % (self.lineNumber,self.text[:32])
    def startswith( self, prefix ):
        return self.text.startswith( prefix )
    def searchForRE( self, rePat, aWeb ):
        return rePat.search( self.text )
    def weave( self, aWeb, aWeaver ):
        aWeaver.write( self.text )
    def tangle( self, aWeb, aTangler ):
        aTangler.write( self.text )
@| TextCommand @}

<h4>CodeCommand class</h4>
<h5>Usage</h5>
<p>A <b>CodeCommand</b> is created by a <b>NamedChunk</b> when a 
<b>WebReader</b> calls the <b>appendText()</b> or <b>appendChar()</b> method.
The Command participates in cross reference creation, weaving and tangling.  When it is
created, the source line number is provided so that this text can be tied back
to the source document. 
</p>
<h5>Design</h5>
<p>An instance of the <b>CodeCommand</b> class is a block of program source code text.
It can originate in a named chunk (<tt>@@d</tt>) with a <tt>@@{</tt> and <tt>@@}</tt> delimiter.
Or it can be a file output chunk (<tt>@@o</tt>, <tt>@@O</tt>).
</p>
<p>It uses the <b>codeBlock()</b> method of a <b>Weaver</b> or <b>Tangler</b>.  The weaver will 
insert appropriate markup for this code.  The tangler will ensure that the prevailing
indentation is maintained.
</p>
<h5>Implementation</h5>

@d CodeCommand class...
@{
class CodeCommand( TextCommand ):
    """A piece of program source code."""
    def weave( self, aWeb, aWeaver ):
        aWeaver.codeBlock( self.text )
    def tangle( self, aWeb, aTangler ):
        aTangler.codeBlock( self.text )
@| CodeCommand @}

<h4>XrefCommand superclass</h4>
<h5>Usage</h5>
<p>An <b>XrefCommand</b> is created by the <b>WebReader</b> when any of the 
<tt>@@f</tt>, <tt>@@m</tt>, <tt>@@u</tt> commands are found in the input stream.
The Command is then appended to the current Chunk being built by the WebReader.
</p>

<h5>Design</h5>
<p>The <b>XrefCommand</b> superclass defines any common features of the
various cross-reference commands (<tt>@@f</tt>, <tt>@@m</tt>, <tt>@@u</tt>).
</p>
<p>The <b>formatXref()</b> method creates the body of a cross-reference
by the following algorithm:</p>
<ol>
<li>Use the <b>Weaver</b> class <b>xrefHead()</b> method to emit the cross-reference header.</li>
<li>Sort the keys in the cross-reference mapping.</li>
<li>Use the <b>Weaver</b> class <b>xrefLine()</b> method to emit each line of the cross-reference mapping.</li>
<li>Use the <b>Weaver</b> class <b>xrefFoot()</b> method to emit the cross-reference footer.</li>
</ol>
<p>If this command winds up in a tangle operation, that use
is illegal.  An exception is raised and processing stops.
</p>
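<p>The four-step algorithm can be sketched independently of any real weaver.  In the
example below, <tt>RecordingWeaver</tt> and <tt>format_xref()</tt> are hypothetical, and the
shape of the mapping (a name mapped to a list of sequence numbers) is an assumption made
for illustration.  It uses <tt>sorted()</tt> so that it also runs on current Python versions.</p>

```python
class RecordingWeaver:
    """Hypothetical weaver that records what it is asked to emit."""
    def __init__(self):
        self.lines = []

    def xrefHead(self):
        self.lines.append("head")

    def xrefLine(self, name, refList):
        self.lines.append((name, refList))

    def xrefFoot(self):
        self.lines.append("foot")


def format_xref(xref, aWeaver):
    aWeaver.xrefHead()                 # step 1: header
    for n in sorted(xref):             # step 2: keys in sorted order
        aWeaver.xrefLine(n, xref[n])   # step 3: one line per key
    aWeaver.xrefFoot()                 # step 4: footer


weaver = RecordingWeaver()
format_xref({"pyweb.py": [3], "pyweb.css": [1, 2]}, weaver)
```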
 
<h5>Implementation</h5>

@d XrefCommand superclass...
@{
class XrefCommand( Command ):
    """Any of the Xref-goes-here commands in the input."""
    def __str__( self ):
        return "at %r: cross reference" % (self.lineNumber)
    def formatXref( self, xref, aWeaver ):
        aWeaver.xrefHead()
        xk= xref.keys()
        xk.sort()
        for n in xk:
            aWeaver.xrefLine( n, xref[n] )
        aWeaver.xrefFoot()
    def tangle( self, aWeb, aTangler ):
        raise Error('Illegal tangling of a cross reference command.')
@| XrefCommand @}

<h4>FileXrefCommand class</h4>
<h5>Usage</h5>
<p>A <b>FileXrefCommand</b> is created by the <b>WebReader</b> when the 
<tt>@@f</tt> command is found in the input stream.
The Command is then appended to the current Chunk being built by the WebReader.
</p>
<h5>Design</h5>
<p>The <b>FileXrefCommand</b> class weave method gets the
file cross reference from the overall web instance, and uses
the  <b>formatXref()</b> method of the <b>XrefCommand</b> superclass to format this result.
</p>

<h5>Implementation</h5>

@d FileXrefCommand class...
@{
class FileXrefCommand( XrefCommand ):
    """A FileXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave a File Xref from @@o commands."""
        self.formatXref( aWeb.fileXref(), aWeaver )
@| FileXrefCommand @}

<h4>MacroXrefCommand class</h4>
<h5>Usage</h5>
<p>A <b>MacroXrefCommand</b> is created by the <b>WebReader</b> when the 
<tt>@@m</tt> command is found in the input stream.
The Command is then appended to the current Chunk being built by the WebReader.
</p>
<h5>Design</h5>

<p>The <b>MacroXrefCommand</b> class weave method gets the
named chunk (macro) cross reference from the overall web instance, and uses
the <b>formatXref()</b> method of the <b>XrefCommand</b> superclass to format this result.
</p>

<h5>Implementation</h5>

@d MacroXrefCommand class...
@{
class MacroXrefCommand( XrefCommand ):
    """A MacroXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave the Macro Xref from @@d commands."""
        self.formatXref( aWeb.chunkXref(), aWeaver )
@| MacroXrefCommand @}

<h4>UserIdXrefCommand class</h4>
<h5>Usage</h5>
<p>A <b>UserIdXrefCommand</b> is created by the <b>WebReader</b> when the 
<tt>@@u</tt> command is found in the input stream.
The Command is then appended to the current Chunk being built by the WebReader.
</p>
<h5>Design</h5>

<p>The <b>UserIdXrefCommand</b> class weave method gets the
user identifier cross reference information from the 
overall web instance.  It then formats this line using the following 
algorithm, which is similar to the algorithm in the <b>XrefCommand</b> superclass.
</p>
<ol>
<li>Use the <b>Weaver</b> class <b>xrefHead()</b> method to emit the cross-reference header.</li>
<li>Sort the keys in the cross-reference mapping.</li>
<li>Use the <b>Weaver</b> class <b>xrefDefLine()</b> method to emit each line of the cross-reference definition mapping.</li>
<li>Use the <b>Weaver</b> class <b>xrefFoot()</b> method to emit the cross-reference footer.</li>
</ol>
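<p>A sketch of this variant follows.  Each mapping value is unpacked into a definition and a
list of references before being handed to the weaver.  The names <tt>RecordingWeaver</tt>
and <tt>weave_user_ids()</tt> are hypothetical, and the sample sequence numbers are
invented for the example.</p>

```python
class RecordingWeaver:
    """Hypothetical weaver that records what it is asked to emit."""
    def __init__(self):
        self.lines = []

    def xrefHead(self):
        self.lines.append("head")

    def xrefDefLine(self, name, defn, refList):
        self.lines.append((name, defn, refList))

    def xrefFoot(self):
        self.lines.append("foot")


def weave_user_ids(ux, aWeaver):
    aWeaver.xrefHead()
    for u in sorted(ux):
        defn, refList = ux[u]          # defining chunk, then referencing chunks
        aWeaver.xrefDefLine(u, defn, refList)
    aWeaver.xrefFoot()


weaver = RecordingWeaver()
weave_user_ids({"Web": (40, [41, 52]), "Chunk": (12, [])}, weaver)
```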
<h5>Implementation</h5>

@d UserIdXrefCommand class...
@{
class UserIdXrefCommand( XrefCommand ):
    """A UserIdXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave a user identifier Xref from @@d commands."""
        ux= aWeb.userNamesXref()
        aWeaver.xrefHead()
        un= ux.keys()
        un.sort()
        for u in un:
            defn, refList= ux[u]
            aWeaver.xrefDefLine( u, defn, refList )
        aWeaver.xrefFoot()
@| UserIdXrefCommand @}

<h4>ReferenceCommand class</h4>
<h5>Usage</h5>
<p>A <b>ReferenceCommand</b> instance is created by a <b>WebReader</b> when
 a <tt>@@&lt;<i>name</i>@@&gt;</tt> construct is found in the input stream.  This is attached
 to the current <b>Chunk</b> being built by the WebReader.  </p>

<h5>Design</h5>
<p>During a weave, this creates a markup reference to
another <b>NamedChunk</b>.  During tangle, this actually includes the <b>NamedChunk</b> 
at this point in the tangled output file.
</p>

<p>The constructor creates several attributes of an instance
of a <b>ReferenceCommand</b>.
</p>
<ul>
<li><i>refTo</i>, the name of the chunk to which this refers, possibly 
elided with a trailing <tt>'...'</tt>.</li>
<li><i>fullName</i>, the full name of the chunk to which this refers.</li>
<li><i>sequenceList</i>, the list of sequence numbers of the chunks to which the name refers.</li>
</ul>

<h5>Implementation</h5>


@d ReferenceCommand class...
@{
class ReferenceCommand( Command ):
    """A reference to a named chunk, via @@<name@@>."""
    def __init__( self, refTo, fromLine=0 ):
        Command.__init__( self, fromLine )
        self.refTo= refTo
        self.fullName= None
        self.sequenceList= None
    def __str__( self ):
        return "at %r: reference to chunk %r" % (self.lineNumber,self.refTo)
    @<ReferenceCommand resolve this chunk name if it was abbreviated@>
    @<ReferenceCommand refers to chunk@>
    @<ReferenceCommand weave a reference to a chunk@>
    @<ReferenceCommand tangle a referenced chunk@>
@| ReferenceCommand @}

<p>The <b>resolve()</b> method queries the overall <b>Web</b> instance for the full
name and sequence number for this chunk reference.  This is used
by the <b>Weaver</b> class <b>referenceTo()</b> method to write the markup reference
to the chunk.
</p>
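<p>The resolution itself is done by the <b>Web</b> and is not shown in this section.  As a
rough idea of what expanding an elided name involves, here is a hypothetical
<tt>full_name_for()</tt> function.  It assumes that a trailing <tt>'...'</tt> marks a prefix
that must match exactly one known name; the actual <b>Web</b> methods may differ.</p>

```python
def full_name_for(name, knownNames):
    """Hypothetical resolver: expand a name elided with a trailing '...'."""
    if not name.endswith("..."):
        return name                    # already a full name
    prefix = name[:-3]
    matches = [k for k in knownNames if k.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    # Zero matches or several matches cannot be resolved safely.
    raise ValueError("%r matches %d names" % (name, len(matches)))


known = [
    "NamedChunk user identifiers set and get",
    "NamedChunk add to the web",
    "NamedChunk weave",
]
resolved = full_name_for("NamedChunk user identifiers...", known)
unchanged = full_name_for("NamedChunk weave", known)
```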

@d ReferenceCommand resolve...
@{
def resolve( self, aWeb ):
    """Reference to actual location of this chunk"""
    self.fullName, self.sequenceList = aWeb.chunkReference( self.refTo )
@}

<p>The <b>usedBy()</b> method is a request that is delegated by a <b>Chunk</b>;
it resolves the reference and calls the <b>setUsage()</b> method of the
overall <b>Web</b> instance to indicate that the parent chunk refers to
the named chunk.  This also updates the reference count for the named
chunk.
</p>

@d ReferenceCommand refers to chunk
@{
def usedBy( self, aWeb, aChunk ):
    self.resolve( aWeb )
    aWeb.setUsage( aChunk, self.fullName )
@}

<p>The <b>weave()</b> method inserts a markup reference to a named
chunk.  It uses the <b>Weaver</b> class <b>referenceTo()</b> method to format
this appropriately for the document type being woven.
</p>

@d ReferenceCommand weave...
@{
def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted reference to a chunk of code."""
    self.resolve( aWeb )
    aWeb.weaveChunk( self.fullName, aWeaver )
@}

<p>The <b>tangle()</b> method inserts the resolved chunk in this
place.  When a chunk is tangled, it sets the indent,
inserts the chunk and resets the indent.
</p>

@d ReferenceCommand tangle...
@{
def tangle( self, aWeb, aTangler ):
    """Create source code."""
    self.resolve( aWeb )
    aWeb.tangleChunk( self.fullName, aTangler )
@}

<h3>Error class</h3>
<h4>Usage</h4>
<p>An <b>Error</b> is raised whenever processing cannot continue.  Since it
is a subclass of Exception, it takes an arbitrary number of arguments.  The
first should be the basic message text.  Subsequent arguments provide 
additional details.  We will try to be sure that
all of our internal exceptions reference a specific chunk, if possible.
This means either including the chunk as an argument, or catching the 
exception and appending the current chunk to the exception's arguments.
</p>
<p>The
Python <tt>raise</tt> statement takes an instance of Error and passes it
to the enclosing <tt>try/except</tt> statement for processing.</p>
<p>The typical creation is as follows:</p>
<pre>
raise Error("No full name for %r" % chunk.name, chunk)
</pre> 
<p>A typical exception-handling suite might look like this:</p>
<pre>
try:
    ...something that may raise an Error or Exception...
except Error,e:
    print e.args # this is our internal Error
except Exception,w:
    print w.args # this is some other Python Exception
</pre>

<h4>Design</h4>

<p>The <b>Error</b> class is a subclass of <b>Exception</b> used to differentiate 
application-specific
exceptions from other Python exceptions.  It does no additional processing,
but merely creates a distinct class to facilitate writing <tt>except</tt> statements.
</p>

<h4>Implementation</h4>

@d Error class...
@{
class Error( Exception ): pass
@| Error @}

<h3>Logger Classes</h3>

<h4>Usage</h4>
<p>A single global variable <i>theLog</i> has an instance of the Logger.
This instance must be global to this entire module.  It is created
at the module scope.  See the <a href="#moduleInit">Module Initialization</a> section, below.
</p>
<pre>
theLog= Logger(standard)
</pre>
<p>Important application events are defined as subclasses of <b>Event</b>.  By
default, these classes behave somewhat like exceptions.  They are constructed with
an arbitrary list of values as their arguments.  The intent is to name and package
the arguments, so there are no methods to override.
</p>
<pre>
class MyEvent( Event ): pass
</pre>
<p>
When a log message needs to be written, the <b>event()</b> method of the logger
creates an instance of the given <b>Event</b> subclass with the desired arguments.  The new
<b>Event</b> selects its <b>LogActivity</b> object from the logger's configuration, and the
logger then calls the Event's <b>log()</b> method.
</p>
<pre>
theLog.event( MyEvent, "arg1", arg2, etc... )
</pre>
<p>The global logger instance can be configured to apply certain logging strategy
methods to each <b>Event</b> instance that is created.  The default strategies
are <b>LogReport</b>, <b>LogDebug</b> and <b>LogDiscard</b>.  These are applied
by the <b>event()</b> method after the <b>Event</b> instance is constructed.
The <b>LogReport</b> strategy writes a summary to stdout; the <b>LogDebug</b> 
strategy writes a very detailed line to stdout; the <b>LogDiscard</b> strategy silently
ignores the event.
</p>

<h4>Design</h4>
<p>An instance of the <b>Logger</b> class provides global context information
for all debugging activity.  The most important service is the
<b>event()</b> method; this method creates and then activates the given log event.
</p>
<p>The <b>Logger</b> class <b>event()</b> method constructs an <b>Event</b> instance.
The function accepts a sequence of arguments.  The first
argument must be an <b>Event</b> class.  The remaining arguments are arguments
given to the <b>Event</b> class constructor. 
</p>
<p>
The <b>LogActivity</b> instance determines what is done with this
class of event.  Two of the built-in <b>LogActivity</b> classes are <b>LogReport</b> and
<b>LogDiscard</b>.  An instance of <b>LogReport</b> will report the event.  An instance
of <b>LogDiscard</b> will silently discard the event.
</p>

<h4>Implementation</h4>

@d Logger classes...
@{
class Logger:
    def __init__( self, logConfig ):
        self.logConfig= logConfig
    def event( self, className, *args ):
        className( self, *args ).log()
@<LogActivity strategy class hierarchy - including LogReport and LogDiscard@>
@<Logger Event base class definitions@>
@<Logger Event subclasses are unique to this application@>
@<Global singletons that define the activities for each Event class@>
@| Logger Event ExecutionEvent @}

<p>The various <b>Event</b> subclasses are used to separate application
events for convenience in logging and debugging.  
When an <b>Event</b> instance is created, the <i>Logger</i> configuration
is used to attach the correct <b>LogActivity</b> strategy instance.
</p>
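<p>The lookup that attaches a strategy can be sketched on its own.  The function
<tt>action_for()</tt> below is hypothetical, and the configuration layout (a sequence of
pairs, each holding a list of class names and a strategy instance) is an assumption
made for the example.  As in the loop used by <b>Event</b>, the last matching entry wins.</p>

```python
class LogReport:
    """Stand-in for the reporting strategy."""
    pass


class LogDiscard:
    """Stand-in for the discarding strategy."""
    pass


def action_for(eventClassName, logConfig, default):
    """Return the strategy configured for an event class name."""
    action = default
    for classNames, candidate in logConfig:
        if eventClassName in classNames:
            action = candidate         # no break: a later entry overrides
    return action


report, discard = LogReport(), LogDiscard()
standard = [
    (["ErrorEvent", "WarningEvent"], report),
    (["ReadEvent", "InputEvent"], discard),
]
chosen_error = action_for("ErrorEvent", standard, report)
chosen_read = action_for("ReadEvent", standard, report)
chosen_other = action_for("SummaryEvent", standard, report)
```

<p>An event class that is not mentioned in the configuration falls back to the default,
which in <b>Event</b> is a <b>LogReport</b> instance.</p>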

@d Logger Event base class...
@{
class Event:
    def __init__( self, *args ):
        self.logger= args[0]
        self.args= args[1:]
        self.action= LogReport()    # default action
        self.time= time.time()
        for cl,a in self.logger.logConfig:
            if self.__class__.__name__ in cl:
                self.action= a
    def __str__( self ):
        return "%s" % ( ';'.join( self.args[:1]) )
    def __repr__( self ):
        return ("%s %s: %s" 
            % ( self.__class__.__name__, 
            time.strftime( '%c', time.localtime( self.time ) ),
            ';'.join( self.args[:1]) ) )
    def log( self ):
        self.action.log( self, self.logger )
class ExecutionEvent( Event ): 
    def __init__( self, *args ):
        apply( Event.__init__, ( self, ) + args )
@}

<p>These subclasses of Event are unique to this <em>pyWeb</em> application.
</p>
<p>The <b>ErrorEvent</b> overrides the <b>__str__()</b> method.  It could provide a slightly
different report.  In the current version, the display is the same as all other 
log messages.</p>

@d Logger Event subclasses...
@{
class InputEvent( Event ): pass
class OptionsEvent( Event ): pass
class ReadEvent( Event ): pass
class WeaveEvent( Event ): pass
class WeaveStartEvent( Event ): pass
class WeaveEndEvent( Event ): pass
class TangleEvent( Event ): pass
class TangleStartEvent( Event ): pass
class TangleEndEvent( Event ): pass
class SummaryEvent( Event ): pass
class ErrorEvent( Event ): 
    def __str__( self ):
        return "%s" % ( ';'.join( self.args[:1]) )
class WarningEvent( ErrorEvent ): pass
@| InputEvent OptionsEvent ReadEvent WeaveEvent WeaveStartEvent WeaveEndEvent
TangleEvent TangleStartEvent TangleEndEvent SummaryEvent ErrorEvent WarningEvent @}

<p>The strategies provide alternate activities to be taken
when a log event's <b>log()</b> method is called.
</p>

@d LogActivity strategy class...
@{
class LogActivity:
    def log( self, anEvent, aLogger ):
        pass
class LogReport( LogActivity ):
    def log( self, anEvent, aLogger ):
        print anEvent
class LogDebug( LogActivity ):
    def log( self, anEvent, aLogger ):
        print `anEvent`
class LogDiscard( LogActivity ):
    pass
@| LogActivity LogReport LogDebug LogDiscard @}

<h3>The Web Class</h3>

<p>The overall web of chunks and their cross references is carried in a 
single instance of the <b>Web</b> class that drives the weaving and tangling operations.  
Broadly, the functionality of a Web can be separated into several areas.
</p>
<ul>
<li>construction methods used by <b>Chunks</b> and <b>WebReader</b></li>
<li><b>Chunk</b> name resolution methods</li>
<li><b>Chunk</b> cross reference methods</li>
<li>miscellaneous access</li>
<li>tangle</li>
<li>weave</li>
</ul>

<p>A web instance has a number of attributes.</p>
<ul>
<li><i>sourceFileName</i>, the name of the original .w file.</li>
<li><i>chunkSeq</i>, the sequence of <b>Chunk</b> instances as seen in the input file.</li>
<li><i>output</i>, the <tt>@@o</tt> and <tt>@@O</tt> named <b>OutputChunk</b> chunks.  Each element of this 
dictionary is a sequence of chunks that have the same name.  The first is the
initial definition (marked with "="); all others are subsequent definitions
(marked with "+=").</li>
<li><i>named</i>, the <tt>@@d</tt> named <b>NamedChunk</b> chunks.  Each element of this 
dictionary is a sequence of chunks that have the same name.  The first is the
initial definition (marked with "="); all others are subsequent definitions
(marked with "+=").</li>
<li><i>usedBy</i>, the cross reference of chunks referenced by commands in other
chunks.</li>
<li><i>sequence</i>, is used to assign a unique sequence number to each
named chunk.</li>
</ul>

@d Web class...
@{
class Web:
    """The overall Web of chunks and their cross-references."""
    def __init__( self ):
        self.sourceFileName= None
        self.chunkSeq= []
        self.output= {}
        self.named= {}
        self.usedBy= {}
        self.sequence= 0
    def __str__( self ):
        return "chunks=%r" % self.chunkSeq
    @<Web construction methods used by Chunks and WebReader@>
    @<Web Chunk name resolution methods@>
    @<Web Chunk cross reference methods@>
    @<Web determination of the language from the first chunk@>
    @<Web tangle the output files@>
    @<Web weave the output document@>
@| Web @}

<p>During web construction, it is convenient to capture
information about the individual <b>Chunk</b> instances being appended to
the web.  This is done using a <i>Callback</i> design pattern.
Each subclass of <b>Chunk</b> provides an override for the <b>Chunk</b> class
<b>webAdd()</b> method.  This override calls one of the appropriate
web construction methods.</p>
<p>Also note that the full name for a chunk can be given
either as part of the definition, or as part of a reference.
Typically, the reference has the full name and the definition
has the elided name.  This allows a reference to a chunk
to contain a more complete description of the chunk.
</p>

@d Web construction...
@{
@<Web add a full chunk name, ignoring abbreviated names@>
@<Web add an anonymous chunk@>
@<Web add a named macro chunk@>
@<Web add an output file definition chunk@>
@}

<p>A name is only added to the known names when it is
a full name, not an abbreviation ending with <tt>"..."</tt>.
Abbreviated names are quietly skipped until the full name
is seen.
</p>

<p>The algorithm for the <b>addDefName()</b> method, then, is as follows:</p>
<ol>
<li>Use the <b>fullNameFor()</b> method to locate the full name.</li>
<li>If no full name was found (the result of <b>fullNameFor()</b> ends
with <tt>'...'</tt>), ignore this name as an abbreviation with no definition.</li>
<li>If this is a full name and the name was not in the 
<i>named</i> mapping, add this full name to the mapping.
</li>
</ol>

<p>This name resolution approach presents a problem when a chunk is
defined before it is referenced and the first definition
uses an abbreviated name.  This is an atypical construction
of an input document, however, since the intent is to provide
high-level summaries that have forward references to supporting
details.
</p>

@d Web add a full chunk name...
@{
def addDefName( self, name ):
    """Reference to or definition of a chunk name."""
    nm= self.fullNameFor( name )
    if nm[-3:] == '...':
        theLog.event( ReadEvent, "Abbreviated reference %r" % name )
        return None # first occurrence is a forward reference using an abbreviation
    if not self.named.has_key( nm ):
        self.named[ nm ]= []
        theLog.event( ReadEvent, "Adding chunk %r" % name )
    return nm
@}

<p>An anonymous <b>Chunk</b> is kept in the overall sequence of chunks, which is used for
weaving.
</p>
@d Web add an anonymous chunk
@{
def add( self, chunk ):
    """Add an anonymous chunk."""
    self.chunkSeq.append( chunk )
@}

<p>A named <b>Chunk</b> is defined with a <tt>@@d</tt> command.
It is collected into a mapping of <b>NamedChunk</b> instances.
An entry in the mapping is a sequence of chunks that have the
same name.  This sequence of chunks is used to produce the
weave or tangle output.
</p>
<p>All chunks are also placed in the overall sequence of chunks.
This overall sequence is used for weaving the document.
</p>
<p>The <b>addDefName()</b> method is used to resolve this name if
it is an abbreviation, or add it to the mapping if this
is the first occurrence of the name.  If the name cannot be
added, an instance of our <b>Error</b> class is raised.  If the name exists or 
was added, the chunk is appended to the chunk list associated
with this name.
</p>
<p>The web's sequence counter is incremented, and this 
unique sequence number sets the  <i>seq</i> attribute of the <b>Chunk</b>.
If the chunk list was empty, this is the first chunk, and the
<i>firstSecond</i> flag is set to "=".  If the chunk list was 
not empty, this is a subsequent chunk, and the
<i>firstSecond</i> flag is set to "+=".
</p>

@d Web add a named macro chunk
@{
def addNamed( self, chunk ):
    """Add a named chunk to a sequence with a given name."""
    self.chunkSeq.append( chunk )
    nm= self.addDefName( chunk.name )
    if nm:
        self.sequence += 1
        chunk.seq= self.sequence
        if self.named[nm]: chunk.firstSecond= '+='
        else: chunk.firstSecond= '='
        self.named[ nm ].append( chunk )
        theLog.event( ReadEvent, "Extending chunk %r from %r" % ( nm, chunk.name ) )
    else:
        raise Error("No full name for %r" % chunk.name, chunk)
@}
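<p>The numbering and flagging rule described above can be sketched in isolation.
This is a minimal illustration, not the application's code: <tt>SketchChunk</tt> and
<tt>SketchWeb</tt> are hypothetical stand-ins that keep only the attributes involved,
and name resolution through <b>addDefName()</b> is left out.</p>

```python
class SketchChunk:
    """Hypothetical stand-in for a NamedChunk; only the fields used here."""
    def __init__(self, name):
        self.name = name
        self.seq = None
        self.firstSecond = None

class SketchWeb:
    """Hypothetical stand-in for the Web class, reduced to addNamed()."""
    def __init__(self):
        self.named = {}
        self.sequence = 0
    def addNamed(self, chunk):
        chunkList = self.named.setdefault(chunk.name, [])
        self.sequence += 1
        chunk.seq = self.sequence
        # An empty list means this chunk is the initial definition.
        chunk.firstSecond = '+=' if chunkList else '='
        chunkList.append(chunk)

web = SketchWeb()
web.addNamed(SketchChunk('imports'))
web.addNamed(SketchChunk('imports'))
flags = [(c.seq, c.firstSecond) for c in web.named['imports']]
print(flags)
```

<p>Under these assumptions, the first chunk named <tt>imports</tt> receives sequence
number 1 and the flag "=", and the second receives sequence number 2 and the flag "+=".</p>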

<p>An output file definition <b>Chunk</b> is defined with an <tt>@@o</tt> or <tt>@@O</tt>
command.  It is collected into a mapping of <b>OutputChunk</b> instances.
An entry in the mapping is a sequence of chunks that have the
same name.  This sequence of chunks is used to produce the
weave or tangle output.
</p>
<p>Note that file names cannot be abbreviated.</p>
<p>All chunks are also placed in the overall sequence of chunks.
This overall sequence is used for weaving the document.
</p>
<p>If the name does not exist in the <i>output</i> mapping,
the name is added with an empty sequence of chunks.
In all cases, the chunk is 
appended to the chunk list associated
with this name.
</p>
<p>The web's sequence counter is incremented, and this 
unique sequence number sets the Chunk's <i>seq</i> attribute.
If the chunk list was empty, this is the first chunk, and the
<i>firstSecond</i> flag is set to "=".  If the chunk list was 
not empty, this is a subsequent chunk, and the
<i>firstSecond</i> flag is set to "+=".
</p>

@d Web add an output file definition chunk
@{
def addOutput( self, chunk ):
    """Add an output chunk to a sequence with a given name."""
    self.chunkSeq.append( chunk )
    if not self.output.has_key( chunk.name ):
        self.output[chunk.name] = []
        theLog.event( ReadEvent, "Adding chunk %r" % chunk.name )
    self.sequence += 1
    chunk.seq= self.sequence
    if self.output[chunk.name]: chunk.firstSecond= '+='
    else: chunk.firstSecond= '='
    self.output[chunk.name].append( chunk )
@}

<p>Web chunk name resolution has three aspects.  The first
is inflating elided names (those ending with <tt>...</tt>) to their
actual full names.  The second is finding the named chunk
in the web structure.  The third is returning a reference
to a specific chunk including the name and sequence number.
</p>
<p>Note that a chunk name actually refers to a sequence
of chunks.  Multiple definitions for a chunk are allowed, and
all of the definitions are concatenated to create the complete
chunk.  This complexity makes it unwise to return the sequence
of same-named chunks; therefore, we put the burden on the Web to 
process all chunks with a given name, in sequence.
</p>

<p>The <b>fullNameFor()</b> method resolves the full name for a chunk as follows:</p>
<ol>
<li>If the string is already in the <i>named</i> mapping, this is the full name</li>
<li>If the string ends in <tt>'...'</tt>, visit each key in the dictionary to see if the
key starts with the string up to the trailing <tt>'...'</tt>.  If a match is found, the dictionary
key is the full name.</li>
<li>Otherwise, treat this as a full name.</li>
</ol>

@d Web Chunk name resolution...
@{
def fullNameFor( self, name ):
    # resolve "..." names
    if self.named.has_key( name ): return name
    if name[-3:] == '...':
        best= []
        for n in self.named.keys():
            if n.startswith( name[:-3] ): best.append( n )
        if len(best) > 1:
            raise Error("Ambiguous abbreviation %r, matches %r" % ( name, best ) )
        elif len(best) == 1: 
            return best[0]
    return name
@}
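<p>The three resolution steps can be tried out on a plain dictionary.
The following is a sketch only: <tt>full_name_for</tt> is a hypothetical free function
that takes the mapping as an argument, and it raises the built-in <tt>ValueError</tt>
in place of the application's <b>Error</b> class.</p>

```python
def full_name_for(named, name):
    """Resolve a possibly abbreviated chunk name against the keys of named."""
    if name in named:
        return name
    if name.endswith('...'):
        prefix = name[:-3]
        matches = [n for n in named if n.startswith(prefix)]
        if len(matches) > 1:
            raise ValueError("Ambiguous abbreviation %r, matches %r" % (name, matches))
        if len(matches) == 1:
            return matches[0]
    # No match: the caller sees the name unchanged.
    return name

named = {'Web class definition': [], 'WebReader class definition': []}
resolved = full_name_for(named, 'WebReader...')
unresolved = full_name_for(named, 'Tangler...')
print(resolved)
print(unresolved)
```

<p>In this sketch <tt>'WebReader...'</tt> resolves to the one key with that prefix,
<tt>'Tangler...'</tt> comes back unchanged because nothing matches, and
<tt>'Web...'</tt> would be rejected as ambiguous because it is a prefix of both keys.</p>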

<p>The <b>_chunk()</b> method locates a named sequence of chunks by first determining the full name
for the identifying string.  If the full name is in the <i>named</i> mapping, the sequence
of chunks is returned.  Otherwise, an instance of our <b>Error</b> class is raised because the name
is unresolvable.
</p>
<p>It might be more helpful for debugging to emit this as an error in the
weave and tangle results and keep processing.  This would allow an author to
catch multiple errors in a single run of <em>pyWeb</em>.</p>
 
@d Web Chunk name resolution...
@{
def _chunk( self, name ):
    """Locate a named sequence of chunks."""
    nm= self.fullNameFor( name )
    if self.named.has_key( nm ):
        return self.named[nm]
    raise Error( "Cannot resolve %r in %r" % (name,self.named.keys()) )
@}

<p>The <b>chunkReference()</b> method returns the full name and the sequence numbers
of a chunk.  Given a short identifying string, the full name is easily resolved.
The sequence numbers, however, require that the chunks with that name be located.
The first of these chunks has the sequence number that can be used for all cross-references.
</p>

@d Web Chunk name resolution...
@{
def chunkReference( self, name ):
    """Provide a full name and sequence number for a (possibly abbreviated) name."""
    c= self._chunk( name )
    if not c:
        raise Error( 'Unknown named chunk %r' % name )
    return self.fullNameFor( name ), [ x.seq for x in c ]
@}

<p>Cross-reference support includes creating and reporting
on the various cross-references available in a web.  This includes
creating the <i>usedBy</i> list of chunks that reference a given chunk;
and returning the file, macro and user identifier cross references.
</p>

<p>Each <b>Chunk</b> has a list of <b>Reference</b> commands that shows the chunks
to which a chunk refers.  These relationships must be reversed to show
the chunks that refer to a given chunk.  This is done by traversing
the entire web of named chunks and recording each chunk-to-chunk reference
in a <i>usedBy</i> mapping.  This mapping has the referred-to chunk as 
the key, and a sequence of referring chunks as the value.
</p>

<p>The accumulation is initiated by the <b>createUsedBy()</b> method.  This
method visits each <b>Chunk</b>, calling the <b>usedBy()</b> method, 
passing in the <b>Web</b> instance
as an argument.  Each <b>Chunk</b> class <b>usedBy()</b> method, in turn, 
invokes the <b>usedBy()</b> method
of each <b>Command</b> instance in the chunk.  Most commands do nothing, 
but a <b>ReferenceCommand</b>
will call back to the <b>Web</b> class <b>setUsage()</b> method to record a reference.
</p>
<p>When the <b>createUsedBy()</b> method has accumulated the entire cross 
reference, it also assures that all chunks are used exactly once.</p>

<p>The <b>setUsage()</b> method accepts a chunk which contains a reference
command and the name to which the reference command
points (the "referent").  Each of the referent's chunk sequence numbers 
is used as the key in a mapping that lists references to this referent.
</p>
<p>The <b>chunkReferencedBy()</b> method resolves the chunk's name, returning
the list of chunks which refer to the given chunk.
</p>

@d Web Chunk cross reference methods...
@{
def createUsedBy( self ):
    """Compute a "used-by" table showing references to chunks."""
    for c in self.chunkSeq:
        c.usedBy( self )
    @<Web Chunk check reference counts are all one@>
def setUsage( self, aChunk, aRefName ):
    for c in self._chunk( aRefName ):
        self.usedBy.setdefault( c.seq, [] )
        self.usedBy[c.seq].append( (self.fullNameFor(aChunk.name),aChunk.seq) )
        c.refCount += 1
def chunkReferencedBy( self, seqNumber ):
    """Provide the list of places where a chunk is used."""
    return self.usedBy.setdefault( seqNumber, [] )
@}
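<p>The reversal of references into a used-by table can be sketched with plain
dictionaries.  The data below is invented for illustration, and the two input
mappings are hypothetical simplifications of what the <b>Chunk</b> and
<b>Command</b> instances carry.</p>

```python
# Hypothetical input: each referring chunk (name, seq) lists the names it references.
references = {
    ('main', 1): ['imports', 'helpers'],
    ('helpers', 3): ['imports'],
}
# Hypothetical definitions: each name maps to the sequence numbers of its chunks.
definitions = {'main': [1], 'imports': [2, 5], 'helpers': [3]}

used_by = {}
for referrer in sorted(references):
    for target_name in references[referrer]:
        # Every chunk that shares the referenced name records the referrer.
        for seq in definitions[target_name]:
            used_by.setdefault(seq, []).append(referrer)

print(used_by)
```

<p>Each key of <tt>used_by</tt> is the sequence number of a referred-to chunk, and each
value lists the referring chunks as (name, sequence number) pairs, which is the shape
that <b>setUsage()</b> builds.  Chunk 1 has no entry because nothing refers to it.</p>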

<p>We verify that the reference count for a
chunk is exactly one.  We don't gracefully tolerate multiple references to
a chunk or unreferenced chunks.</p>
@d Web Chunk check...
@{
for nm,cl in self.named.items():
   if len(cl) > 0:
       if cl[0].refCount == 0:
           theLog.event( WarningEvent, "No reference to %r" % nm )
       elif cl[0].refCount > 1:
           theLog.event( WarningEvent, "Multiple references to %r" % nm )
   else:
       theLog.event( WarningEvent, "No definition for %r" % nm )
@}

<p>The <b>fileXref()</b> method visits all named file output chunks in <i>output</i> and
collects the sequence numbers of each section in the sequence of chunks.
</p>
<p>The <b>chunkXref()</b> method uses the same algorithm as the <b>fileXref()</b> method,
but applies it to the <i>named</i> mapping.
</p>

@d Web Chunk cross reference methods...
@{
def fileXref( self ):
    fx= {}
    for f,cList in self.output.items():
        fx[f]= [ c.seq for c in cList ]
    return fx
def chunkXref( self ):
    mx= {}
    for n,cList in self.named.items():
        mx[n]= [ c.seq for c in cList ]
    return mx
@}

<p>The <b>userNamesXref()</b> method creates a mapping for each
user identifier.  The value for this mapping is a tuple
with the chunk that defined the identifier (via a <tt>@@|</tt> command), 
and a sequence of chunks that reference the identifier. 
</p>
<p>For example:
<tt>{ 'Web': ( 87, (88,93,96,101,102,104) ), 'Chunk': ( 53, (54,55,56,60,57,58,59) ) }</tt>, 
shows that the identifier
<tt>'Web'</tt> is defined in the chunk with sequence number 87, and referenced
in the sequence of chunks that follow.
</p>
<p>This works in two passes:</p>
<ul>
<li><b>_gatherUserId()</b> gathers all user identifiers</li>
<li><b>_updateUserId()</b> searches all text commands for the identifiers and
updates the <b>Web</b> class cross reference information.</li>
</ul>

@d Web Chunk cross reference methods...
@{
def userNamesXref( self ):
    ux= {}
    self._gatherUserId( self.named, ux )
    self._gatherUserId( self.output, ux )
    self._updateUserId( self.named, ux )
    self._updateUserId( self.output, ux )
    return ux
def _gatherUserId( self, chunkMap, ux ):
    @<collect all user identifiers from a given map into ux@>
def _updateUserId( self, chunkMap, ux ):
    @<find user identifier usage and update ux from the given map@>
@}

<p>User identifiers are collected by visiting each of the sequence of 
<b>Chunks</b> that share the
same name; within each component chunk, if the chunk has identifiers assigned
by the <tt>@@|</tt> command, these are seeded into the dictionary.
If the chunk does not permit identifiers, it simply returns an empty
list as a default action.
</p>
 
@d collect all user identifiers...
@{
for n,cList in chunkMap.items():
    for c in cList:
        for id in c.getUserIDRefs():
            ux[id]= ( c.seq, [] )
@}

<p>User identifiers are cross-referenced by visiting 
each of the sequence of <b>Chunks</b> that share the
same name; within each component chunk, visit each user identifier;
if the <b>Chunk</b> class <b>searchForRE()</b> method matches an identifier, 
this is appended to the sequence of chunks that reference the original user identifier.
</p>

@d find user identifier usage...
@{
# examine source for occurrences of all names in ux.keys()
for id in ux.keys():
    theLog.event( WeaveEvent, "References to %r" % id )
    idpat= re.compile( r'\W%s\W' % id )
    for n,cList in chunkMap.items():
        for c in cList:
            if c.seq != ux[id][0] and c.searchForRE( idpat, self ):
                ux[id][1].append( c.seq )
@}
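<p>The search pattern requires a non-word character on each side of the identifier.
The sketch below shows what that means in practice.  The helper
<tt>references_identifier</tt> is hypothetical, and the call to <tt>re.escape()</tt>
is an addition that the chunk above does not make.</p>

```python
import re

def references_identifier(identifier, text):
    """True if text contains identifier with a non-word character on each side."""
    pattern = re.compile(r'\W%s\W' % re.escape(identifier))
    return pattern.search(text) is not None

inside = references_identifier('Web', 'class Web: pass')
longer = references_identifier('Web', 'class WebReader: pass')
at_start = references_identifier('Web', 'Web()')
print(inside, longer, at_start)
```

<p>The identifier is found when it is surrounded by spaces or punctuation, and it is
not confused with a longer name such as <tt>WebReader</tt>.  An occurrence at the very
beginning or end of the searched text is not matched, because there is no character
on that side for <tt>\W</tt> to consume.</p>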

<p>The <b>language()</b> method determines the output language.
The determination of the language can be done a variety of ways.
One is to use command line parameters, another is to use the filename
extension on the input file.</p>
<p>We examine the first few characters of input.  A proper HTML, XHTML or
XML file begins with '&lt;!', '&lt;?' or '&lt;H'.  LaTeX files
typically begin with '%' or '\'.
</p>
<p>The <b>EmitterFactory</b> may be a better location for this function.</p>

@d Web determination of the language...
@{
def language( self, preferredWeaverClass=None ):
    """Construct a weaver appropriate to the document's language"""
    if preferredWeaverClass:
        return preferredWeaverClass()
    if self.chunkSeq[0].startswith('<'): return HTML()
    return Latex()
@}

<p>The <b>tangle()</b> method of the <b>Web</b> class performs 
the <b>tangle()</b> method for each <b>Chunk</b> of each
named output file.  Note that several chunks may share the file name, requiring
the file be composed of material in each chunk.
</p>
<p>During tangling of a chunk, the chunk may reference another
chunk.  This transitive tangling of an individual chunk is handled by the
<b>tangleChunk()</b> method.
</p>

@d Web tangle...
@{
def tangle( self, aTangler ):
    for f,c in self.output.items():
        aTangler.open( f )
        for p in c:
            p.tangle( self, aTangler )
        aTangler.close()
def tangleChunk( self, name, aTangler ):
    theLog.event( TangleEvent, "Tangling chunk %r" % name )
    for p in self._chunk(name):
        p.tangle( self, aTangler )
@}

<p>The <b>weave()</b> method of the <b>Web</b> class creates the final documentation.
This is done by stepping through each <b>Chunk</b> in sequence
and weaving the chunk into the resulting file via the <b>Chunk</b> class <b>weave()</b> method.
</p>
<p>During weaving of a chunk, the chunk may reference another
chunk.  When weaving a reference to a named chunk (output or ordinary programming
source defined with @@{), this does not lead to transitive weaving: only a
reference is put in from one chunk to another.  However, when weaving
a chunk defined with @@[, the chunk <i>is</i> expanded when weaving.
The decision is delegated to the referenced chunk.
</p>

@d Web weave...
@{
def weave( self, aWeaver ):
    aWeaver.open( self.sourceFileName )
    for c in self.chunkSeq:
        c.weave( self, aWeaver )
    aWeaver.close()
def weaveChunk( self, name, aWeaver ):
    theLog.event( WeaveEvent, "Weaving chunk %r" % name )
    chunkList= self._chunk(name)
    chunkList[0].weaveReferenceTo( self, aWeaver )
    for p in chunkList[1:]:
        p.weaveShortReferenceTo( self, aWeaver )
@}

<h3>The WebReader Class</h3>

<h4>Usage</h4>

<p>There are two forms of the constructor for a <b>WebReader</b>.  The 
initial <b>WebReader</b> instance is created with code like the following:
</p>
<pre>
p= WebReader( aFileName, command=aCommandCharacter )
</pre>
<p>
This will define the initial input file and the command character, both
of which are command-line parameters to the application.
</p>
<p>When processing an include file (with the @@i command), a child <b>WebReader</b>
instance is created with code like the following:
</p>
<pre>
c= WebReader( anIncludeName, parent=parentWebReader )
</pre>
<p>
This will define the included file, but will inherit the command 
character from the parent <b>WebReader</b>.  This will also include a 
reference from child to parent so that embedded Python expressions
can view the entire input context.
</p>

<h4>Design</h4>

<p>The <b>WebReader</b> class parses the input file into command blocks.
These are assembled into <b>Chunks</b>, and the <b>Chunks</b> are assembled into the document
<b>Web</b>.  Once this input pass is complete, the resulting <b>Web</b> can be tangled or
woven.
</p>

<p>The parser works by reading the entire file and splitting on <tt>@@.</tt> patterns.
The <b>split()</b> method of the Python <b>re</b> module will separate the input
and preserve the actual character sequence on which the input was split.
This breaks the input into blocks of text separated by the <tt>@@.</tt> characters.
</p>
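<p>The effect of splitting on a parenthesized pattern can be seen in a small
experiment.  To keep this sketch free of the document's own command sequences, it
uses <tt>!</tt> as a hypothetical command character; the call to <tt>re.escape()</tt>
is also an addition made for the sketch.</p>

```python
import re

command = '!'   # hypothetical command character chosen for this sketch
parse_pat = '(%s.)' % re.escape(command)
source = 'Some text.\n!d a name !{ code !}\nMore text.\n'

# The parentheses make re.split() keep each two-character command as a token.
tokens = re.split(parse_pat, source)
print(tokens)
```

<p>The result alternates between blocks of text and two-character commands, and
joining the tokens reproduces the input exactly, so no characters are lost.</p>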

<p>"Major" commands partition the input into <b>Chunks</b>.  The major commands 
are <tt>@@d</tt>, <tt>@@o</tt> and <tt>@@O</tt>, as well as the <tt>@@{</tt>, <tt>@@}</tt>, <tt>@@[</tt>, <tt>@@]</tt> brackets, and the <tt>@@i</tt> command
to include another file.
</p>
<p>"Minor" commands are inserted into a <b>Chunk</b> as a <b>Command</b>.  Blocks of text
are minor commands, as well as the <tt>@@&lt;<i>name</i>@@&gt;</tt> references, 
the various cross-reference commands (<tt>@@f</tt>, <tt>@@m</tt> and <tt>@@u</tt>).  
The <tt>@@@@</tt> escape is also
handled here so that all further processing is independent of any parsing.
</p>

<h4>Implementation</h4>

<p>The class has the following attributes:</p>
<ul>
<li><i>fileName</i> is used to pass the file name to the Web instance.</li>
<li><i>tokenList</i> is the completely tokenized input file.</li>
<li><i>token</i> is the most recently examined token.</li>
<li><i>tokenIndex</i> is an index through the tokenList.</li>
<li><i>lineNumber</i> is the count of <tt>'\n'</tt> characters seen in the tokens.</li>
<li><i>aChunk</i> is the current open Chunk.</li>
<li><i>parent</i> is the outer <b>WebReader</b> when processing a <tt>@@i</tt> command.</li>
<li><i>theWeb</i> is the current open Web.</li>
<li><i>permitList</i> is the list of commands that are permitted to fail.  This
is generally an empty list or <tt>('@@i',)</tt>.</li>
<li><i>command</i> is the command character; a WebReader will use the parent
command character if the parent is not <tt>None</tt>.</li>
<li><i>parsePat</i> is generated from the command character, and is used to parse
the input into tokens.</li>
</ul>

@d WebReader class...
@{
class WebReader:
    """Parse an input file, creating Commands and Chunks."""
    def __init__( self, fileName, parent=None, command='@@', permit=() ):
        self.fileName= fileName
        self.tokenList= []
        self.token= ""
        self.tokenIndex= 0
        self.tokenPushback= []
        self.lineNumber= 0
        self.aChunk= None
        self.parent= parent
        self.theWeb= None
        self.permitList= permit
        if self.parent: 
            self.command= self.parent.command
        else:
            self.command= command
        self.parsePat= '(%s.)' % self.command
        @<WebReader command literals@>
    @<WebReader tokenize the input@>
    @<WebReader location in the input stream@>
    @<WebReader handle a command string@>
    @<WebReader load the web@>
@| WebReader @}

<p>This tokenizer centralizes a single call to <b>nextToken()</b>.  This assures that
every token is examined by <b>nextToken()</b>, which permits accurate 
counting of the <tt>'\n'</tt> characters
and determining the line numbers of the input file.  This line number
information can then be attached to each <b>Command</b>, directing the user back to 
the correct line of the original input file.
</p>
<p>The tokenizer supports lookahead by allowing the parser to examine tokens
and then push them back into a pushBack stack.  Generally this is used for the
special case of parsing the @@i command, which has no @@-command terminator or
separator.  It ends with the following <tt>'\n'</tt>.
</p>
<p>Python permits a simplified double-ended queue for this kind
of token stream processing.  Ordinary tokens are fetched with a <tt>pop(0)</tt>, and
a pushback is done by prepending the pushback token with a <tt>tokenList = [ token ] + tokenList</tt>.
For this application, however, we need to keep a count of <tt>'\n'</tt>s seen, 
and we want to avoid double-counting <tt>'\n'</tt> pushed back into the token stream.
So we use a queue of tokens and a stack for pushback.
</p>

@d WebReader tokenize...
@{
def openSource( self ):
    theLog.event( InputEvent, "Processing %r" % self.fileName )
    file= open(self.fileName, 'r' ).read()
    self.tokenList= re.split(self.parsePat, file )
    self.lineNumber= 1
    self.tokenPushback= []
def nextToken( self ):
    self.lineNumber += self.token.count('\n')
    if self.tokenPushback:
        self.token= self.tokenPushback.pop()
    else:
        self.token= self.tokenList.pop(0)
    return self.token
def moreTokens( self ):
    return self.tokenList or self.tokenPushback
def pushBack( self, token ):
    self.tokenPushback.append( token )
def totalLines( self ):
    self.lineNumber += self.token.count('\n')
    return self.lineNumber-1
@}
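<p>The interplay of the token queue, the pushback stack and the line counter can be
sketched without any file input.  <tt>TokenStream</tt> is a hypothetical reduction of
the methods above, and <tt>!d</tt> stands in for a command using a hypothetical
<tt>!</tt> command character.</p>

```python
class TokenStream:
    """Hypothetical reduction of the tokenizer: a queue plus a pushback stack."""
    def __init__(self, tokens):
        self.tokenList = list(tokens)
        self.tokenPushback = []
        self.token = ''
        self.lineNumber = 1
    def nextToken(self):
        # Count the newlines of the token that is being left behind.
        self.lineNumber += self.token.count('\n')
        if self.tokenPushback:
            self.token = self.tokenPushback.pop()
        else:
            self.token = self.tokenList.pop(0)
        return self.token
    def pushBack(self, token):
        self.tokenPushback.append(token)
    def moreTokens(self):
        return bool(self.tokenList or self.tokenPushback)

stream = TokenStream(['line one\nline two\n', '!d', ' name '])
stream.nextToken()
command = stream.nextToken()
line_of_command = stream.lineNumber
stream.pushBack('pushed')
after_pushback = stream.nextToken()
print(command, line_of_command, after_pushback)
```

<p>Because the count is updated when a token is left behind, <tt>lineNumber</tt> is the
line on which the current token starts: here the command follows two complete lines
and is reported on line 3.  A pushed-back token is returned before anything still
waiting in the queue.</p>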

<p>The <b>location()</b> method provides the file name and 
range of lines for a particular command.  This allows error
messages as well as tangled or woven output 
to correctly reference the original input files.
</p>

@d WebReader location...
@{
def location( self ):
    return ( self.fileName, self.lineNumber, self.lineNumber+self.token.count("\n") )
@}

<p>Command recognition is done via a <i>Chain of Command</i>-like design.
There are two conditions: the command string is recognized or it is not recognized.</p>
<p>If the command is recognized, <b>handleCommand()</b> either:</p>
<ul>
<li>(for major commands) attaches the current <b>Chunk</b> (<i>self.aChunk</i>) to the 
current <b>Web</b> (<i>self.theWeb</i>), <em>or</em></li>
<li>(for minor commands) creates a <b>Command</b> and attaches it to the current 
<b>Chunk</b> (<i>self.aChunk</i>)</li>
</ul>
<p><em>and</em> returns a true result.</p>
<p>If the command is not recognized, <b>handleCommand()</b> returns false.</p>
<p>
A subclass can override <b>handleCommand()</b> to (1) call this superclass version;
(2) if the command is unknown to the superclass, 
then the subclass can attempt to process it;
(3) if the command is unknown to both classes, 
then return false.  Either a subclass will handle it, or the default activity taken
by <b>load()</b> is to treat the command as text, but also issue a warning.
</p>
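<p>The three-step override protocol might look like the following in a subclass.
Both classes are hypothetical stand-ins that only record what they handled, and the
commands use a hypothetical <tt>!</tt> command character.</p>

```python
class BaseReader:
    """Hypothetical stand-in for WebReader with only command dispatch."""
    def __init__(self):
        self.handled = []
    def handleCommand(self, token):
        if token[:2] in ('!d', '!o'):
            self.handled.append(('base', token))
            return True
        return False

class ExtendedReader(BaseReader):
    """Hypothetical subclass that adds one command of its own."""
    def handleCommand(self, token):
        # (1) Give the superclass the first chance.
        if BaseReader.handleCommand(self, token):
            return True
        # (2) Try the commands known only to this subclass.
        if token[:2] == '!x':
            self.handled.append(('extended', token))
            return True
        # (3) Unknown to both classes: the caller treats the token as text.
        return False

reader = ExtendedReader()
results = [reader.handleCommand(t) for t in ('!d', '!x', '!?')]
print(results)
print(reader.handled)
```

<p>The superclass command and the subclass command are both recognized, and the
unknown token produces a false result that leaves the decision to the caller.</p>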

@d WebReader handle a command...
@{
def handleCommand( self, token ):
    @<major commands segment the input into separate Chunks@>
    @<minor commands add Commands to the current Chunk@>
    elif token[:2] in (self.cmdlcurl,self.cmdlbrak):
        # These should be consumed as part of @@o and @@d parsing
        raise Error('Extra %r (possibly missing chunk name)' % token, self.aChunk)
    else:
        return None # did not recognize the command
    return 1 # did recognize the command
@}

<p>The following sequence of <b>if</b>-<b>elif</b> statements identifies
the major commands that partition the input into separate <b>Chunks</b>.
</p>
@d major commands...
@{
if token[:2] == self.cmdo or token[:2] == self.cmdo_big:
    @<start an OutputChunk, adding it to the web@>
elif token[:2] == self.cmdd:
    @<start a NamedChunk or NamedDocumentChunk, adding it to the web@>
elif token[:2] == self.cmdi:
    @<import another file@>
elif token[:2] in (self.cmdrcurl,self.cmdrbrak):
    @<finish a chunk, start a new Chunk adding it to the web@>
@}

<p>An output chunk has the form <tt>@@o <i>name</i> @@{ <i>content</i> @@}</tt>.
We use the first two tokens to name the <b>OutputChunk</b>.  We simply expect
the <tt>@@{</tt> separator.  We then attach all subsequent commands
to this chunk while waiting for the final <tt>@@}</tt> token to end the chunk.
</p>

@d start an OutputChunk...
@{
file= self.nextToken().strip()
self.aChunk= OutputChunk( file )
self.aChunk.webAdd( self.theWeb )
self.aChunk.big_definition= token[:2] == self.cmdo_big
self.expect( (self.cmdlcurl,) )
# capture an OutputChunk up to @@}
@}

<p>A named chunk has the form <tt>@@d <i>name</i> @@{ <i>content</i> @@}</tt> for
code and <tt>@@d <i>name</i> @@[ <i>content</i> @@]</tt> for document source.
We use the first two tokens to name the <b>NamedChunk</b> or <b>NamedDocumentChunk</b>.  
We expect either the <tt>@@{</tt> or <tt>@@[</tt> separator, and use the actual
token found to choose which subclass of <b>Chunk</b> to create.
We then attach all subsequent commands
to this chunk while waiting for the final <tt>@@}</tt> or <tt>@@]</tt> token to 
end the chunk.
</p>

@d start a NamedChunk...
@{
name= self.nextToken().strip()
# next token is @@{ or @@[
brack= self.expect( (self.cmdlcurl,self.cmdlbrak) )
if brack == self.cmdlcurl: 
    self.aChunk= NamedChunk( name )
else: 
    self.aChunk= NamedDocumentChunk( name )
self.aChunk.webAdd( self.theWeb )
# capture a NamedChunk up to @@} or @@]
@}

<p>An import command has the unusual form of <tt>@@i <i>name</i></tt>, with no trailing
separator.  When we encounter the <tt>@@i</tt> token, the next token will start with the
file name, but may continue with an anonymous chunk.  We require that all <tt>@@i</tt> commands
occur at the end of a line, and break on the <tt>'\n'</tt> which must occur after the file name.
This permits file names with embedded spaces.
</p>
<p>Once we have split the file name away from the rest of the following anonymous chunk,
we push the following token back into the token stream, so that it will be the 
first token examined at the top of the <b>load()</b> loop.
</p>
<p>We create a child <b>WebReader</b> instance to process the included file.  The entire file 
is loaded into the current <b>Web</b> instance.  A new, empty <b>Chunk</b> is created at the end
of the file so that processing can resume with an anonymous <b>Chunk</b>.
</p>

@d import another file
@{
# break this token on the '\n' and pushback the new token.
next= self.nextToken().split('\n',1)
self.pushBack('\n')
if len(next) > 1:
    self.pushBack( '\n'.join(next[1:]) )
incFile= next[0].strip()
try:
    include= WebReader( incFile, parent=self )
    include.load( self.theWeb )
except (Error,IOError),e:
    theLog.event( ErrorEvent, 
        "Problems with included file %s, output is incomplete." 
        % incFile )
    # Discretionary - sometimes we want total failure
    if self.cmdi in self.permitList: pass
    else: raise
self.aChunk= Chunk()
self.aChunk.webAdd( self.theWeb )
@}
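<p>The way the file name is separated from the text that follows an <tt>@@i</tt>
command can be shown on a single token.  The function <tt>split_include</tt> is a
hypothetical helper written for this sketch; it leaves out the pushback and the
creation of the child <b>WebReader</b>.</p>

```python
def split_include(token):
    """Split the token that follows an include command into (file name, remainder)."""
    parts = token.split('\n', 1)
    file_name = parts[0].strip()
    remainder = parts[1] if len(parts) > 1 else ''
    return file_name, remainder

name, rest = split_include(' my include file.w\nText of the next chunk.\n')
print(repr(name))
print(repr(rest))
```

<p>Splitting only on the first newline is what permits spaces inside the file name,
and everything after that newline is kept intact for the anonymous chunk that follows.</p>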

<p>When a <tt>@@}</tt> or <tt>@@]</tt> is found, this finishes a named chunk.  The next
text is therefore part of an anonymous chunk.
</p>
<p>Note that no check is made to assure that the previous <b>Chunk</b> was indeed a named
chunk or output chunk started with <tt>@@{</tt> or <tt>@@[</tt>.  
To do this, an attribute would be
needed for each <b>Chunk</b> subclass that indicated if a trailing bracket was necessary.
For the base <b>Chunk</b> class, this would be false, but for all other subclasses of
<b>Chunk</b>, this would be true.
</p>

@d finish a chunk...
@{
self.aChunk= Chunk()
self.aChunk.webAdd( self.theWeb )
@}

<p>The following sequence of <b>elif</b> statements identifies
the minor commands that add <b>Command</b> instances to the current open <b>Chunk</b>. 
</p>

@d minor commands...
@{
elif token[:2] == self.cmdpipe:
    @<assign user identifiers to the current chunk@>
elif token[:2] == self.cmdf:
    self.aChunk.append( FileXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdm:
    self.aChunk.append( MacroXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdu:
    self.aChunk.append( UserIdXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdlangl:
    @<add a reference command to the current chunk@>
elif token[:2] == self.cmdlexpr:
    @<add an expression command to the current chunk@>
elif token[:2] == self.cmdcmd:
    @<double at-sign replacement, append this character to previous TextCommand@>
@}

<p>User identifiers occur after a <tt>@@|</tt> in a <b>NamedChunk</b>.</p>
<p>Note that no check is made to assure that the previous <b>Chunk</b> was indeed a named
chunk or output chunk started with <tt>@@{</tt>.  
To do this, an attribute would be
needed for each <b>Chunk</b> subclass that indicated if user identifiers are permitted.
For the base <b>Chunk</b> class, this would be false, but for the <b>NamedChunk</b> class and
<b>OutputChunk</b> class, this would be true.
</p>

@d assign user identifiers... 
@{
# variable references at the end of a NamedChunk
# aChunk must be subclass of NamedChunk
# These are accumulated and expanded by @@u reference
self.aChunk.setUserIDRefs( self.nextToken().strip() )
@}

<p>A reference command has the form <tt>@@< <i>name</i> @@></tt>.  We accept three
tokens from the input; the middle token is the referenced name.
</p>

@d add a reference command...
@{
# get the name, introduce into the named Chunk dictionary
expand= self.nextToken().strip()
self.expect( (self.cmdrangl,) )
self.theWeb.addDefName( expand )
self.aChunk.append( ReferenceCommand( expand, self.lineNumber ) )
@}

<p>An expression command has the form <tt>@@( <i>Python Expression</i> @@)</tt>.  
We accept three
tokens from the input; the middle token is the expression.
</p>
<p>There are two alternative semantics for an embedded expression.</p>
<ul>
<li>Deferred Execution.  This requires definition of a new subclass of <b>Command</b>, 
<b>ExpressionCommand</b>, and appends it into the current <b>Chunk</b>.  At weave and
tangle time, this expression is evaluated.  The insert might look something like this:
<tt>aChunk.append( ExpressionCommand( expression, self.lineNumber ) )</tt>.
</li>
<li>Immediate Execution.  This simply creates a context and evaluates
the Python expression.  The output from the expression becomes a <b>TextCommand</b>, and
is appended to the current <b>Chunk</b>.</li>
</ul>
<p>We use the Immediate Execution semantics.</p>

@d add an expression command...
@{
# get the Python expression, create the expression command
expression= self.nextToken()
self.expect( (self.cmdrexpr,) )
try:
    theLocation= self.location()
    theWebReader= self
    theFile= self.fileName
    thisApplication= sys.argv[0]
    result= str(eval( expression ))
except Exception,e:
    result= '!!!Exception: %s' % e
    theLog.event( ReadEvent, 'Failure to process %r: result is %s' % ( expression, e ) )
self.aChunk.appendText( result, self.lineNumber )
@}
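
<p>The Immediate Execution semantics can be illustrated in isolation.  The following is only
a sketch: the <tt>evaluate_expression()</tt> helper and its explicit context dictionary are
assumptions made for illustration, whereas the chunk above evaluates the expression in the
scope of the reader method itself.</p>

```python
def evaluate_expression(expression, context):
    """Evaluate a Python expression immediately and return its text.

    On failure the error is folded into the returned text, mirroring the
    '!!!Exception' marker used by the chunk above.
    """
    try:
        # eval() sees only the names supplied in the context dictionary
        # (plus the builtins).
        return str(eval(expression, {}, dict(context)))
    except Exception as e:
        return '!!!Exception: %s' % e


# Hypothetical context; the names mirror the local variables in the chunk above.
context = {'theFile': 'example.w', 'thisApplication': 'pyweb.py'}
good = evaluate_expression('theFile.upper()', context)
bad = evaluate_expression('noSuchName', context)
```

<p>In this sketch a failing expression does not stop processing; the error text simply
takes the place of the expected result.</p>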

<p>A double command sequence (<tt>'@@@@'</tt>, when the command is an <tt>'@@'</tt>) has the
usual meaning of <tt>'@@'</tt> in the input stream.  We do this via 
the <b>appendChar()</b> method of the current <b>Chunk</b>.  This will append the 
character on the end of the most recent <b>TextCommand</b>, and then put the <b>Chunk</b> in a state
where the next <b>TextCommand</b> is also appended to the most recent, 
creating a single <b>TextCommand</b> with the <tt>'@@'</tt> in it.
</p>
@d double at-sign...
@{
# replace with '@@' here and now!
# Put this at the end of the previous chunk
# AND make sure the next chunk is appended to this.
self.aChunk.appendChar( self.command, self.lineNumber )
@}

<p>The <b>expect()</b> method examines the 
next token to see if it is the expected string.  If this is not found, a
standard type of error message is written.
</p>
<p>The <b>load()</b> method reads the entire input file as a sequence
of tokens, split up by the <b>openSource()</b> method.  Each token that appears
to be a command is passed to the <b>handleCommand()</b> method.  If
the <b>handleCommand()</b> method returns a true result, the command was recognized
and placed in the <b>Web</b>.  If <b>handleCommand()</b> returns a false result, the command
was unknown, and some default behavior is used.
</p>
<p>The <b>WebReader</b> is constructed with an optional <tt>permit</tt> list.
This names the commands for which failure is permitted.  Currently, only the <tt>@@i</tt> command
can be set to permit failure.  This allows including a file that does not yet 
exist.  The primary application of this option is when weaving test output.
The first pass of <em>pyWeb</em> tangles the program source files; they are
then run to create test output; the second pass of <em>pyWeb</em> weaves this
test output into the final document via the <tt>@@i</tt> command.
</p>

@d WebReader load...
@{
def expect( self, tokens ):
    if not self.moreTokens():
        raise Error("At %r: end of input, %r not found" % (self.location(),tokens) )
    t= self.nextToken()
    if t not in tokens:
        raise Error("At %r: expected %r, found %r" % (self.location(),tokens,t) )
    return t
def load( self, aWeb ):
    self.theWeb= aWeb
    self.aChunk= Chunk()
    self.aChunk.webAdd( self.theWeb )
    self.openSource()
    while self.moreTokens():
        token= self.nextToken()
        if len(token) >= 2 and token.startswith(self.command):
            if self.handleCommand( token ):
                continue
            else:
                @<other command-like sequences are appended as a TextCommand@>
        elif token:
            # accumulate non-empty block of text in the current chunk
            self.aChunk.appendText( token, self.lineNumber )
@}

@d other command-like sequences...
@{
theLog.event( ReadEvent, 'Unknown @@-command in input: %r' % token )
self.aChunk.appendText( token, self.lineNumber )
@}
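
<p>The token loop in <b>load()</b> can be pictured with a much smaller stand-in.  This sketch
is an assumption-laden simplification: <tt>load_tokens()</tt> and its <tt>known</tt> set are
hypothetical, tokens are simply sorted into lists instead of being added to a <b>Web</b>, and
<tt>$</tt> plays the part of the command character.</p>

```python
def load_tokens(tokens, command, known):
    """Sort tokens into recognized commands and plain text.

    `known` is a set of recognized two-character commands; anything else
    that looks like a command is kept as ordinary text and reported.
    """
    commands, text, unknown = [], [], []
    for token in tokens:
        if len(token) >= 2 and token.startswith(command):
            if token[:2] in known:
                commands.append(token)
                continue
            # Unknown command-like sequence: report it, keep it as text.
            unknown.append(token)
            text.append(token)
        elif token:
            # Non-empty block of ordinary text.
            text.append(token)
    return commands, text, unknown


# '$' stands in for the command character here.
commands, text, unknown = load_tokens(
    ['Some text ', '$d', ' a name ', '$z', '', ' more text'],
    command='$', known={'$d', '$o', '$i'})
```

<p>Unlike the real reader, this sketch does not consume the tokens that follow a recognized
command; it shows only how unknown command-like sequences fall through to ordinary text.</p>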


<p>The command character can be changed to permit
some flexibility when working with languages that make extensive
use of the <tt>@@</tt> symbol, <em>e.g.</em>, Perl.
The initialization of the <b>WebReader</b> is based on the selected 
command character.
</p>

@d WebReader command literals
@{
# major commands
self.cmdo= self.command+'o'
self.cmdo_big= self.command+'O'
self.cmdd= self.command+'d'
self.cmdlcurl= self.command+'{'
self.cmdrcurl= self.command+'}'
self.cmdlbrak= self.command+'['
self.cmdrbrak= self.command+']'
self.cmdi= self.command+'i'
# minor commands
self.cmdlangl= self.command+'<'
self.cmdrangl= self.command+'>'
self.cmdpipe= self.command+'|'
self.cmdlexpr= self.command+'('
self.cmdrexpr= self.command+')'
self.cmdf= self.command+'f'
self.cmdm= self.command+'m'
self.cmdu= self.command+'u'
self.cmdcmd= self.command+self.command
@}
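
<p>As an illustration of how the command literals follow the selected command character,
the sketch below builds a few of them in a dictionary.  The <tt>command_literals()</tt> helper
is hypothetical, and <tt>$</tt> is assumed as the replacement command character.</p>

```python
def command_literals(command):
    """Build the two-character command strings for a command character."""
    suffixes = {
        'cmdo': 'o', 'cmdd': 'd', 'cmdi': 'i',
        'cmdlcurl': '{', 'cmdrcurl': '}',
        'cmdf': 'f', 'cmdm': 'm', 'cmdu': 'u',
    }
    literals = {name: command + suffix for name, suffix in suffixes.items()}
    # The doubled command character stands for the character itself.
    literals['cmdcmd'] = command + command
    return literals


literals = command_literals('$')
```

<p>With this choice, an output chunk would be introduced by <tt>$o</tt> and a literal dollar
sign would be written as <tt>$$</tt>.</p>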

<h3>Operation Class Hierarchy</h3>
<p>This application performs three major operations: loading the document web, 
weaving and tangling.  Generally,
the use case is to perform a load, weave and tangle.  However, a less common use case
is to first load and tangle output files, run a regression test and then 
load and weave a result that includes the test output file.
</p>
<p>The <tt>-x</tt> option excludes one of the two output operations.  The <tt>-xw</tt> 
option excludes the weave pass, doing only the tangle operation.  The <tt>-xt</tt> option excludes
the tangle pass, doing only the weave operation.
</p>
<p>This two-pass operation might be embedded in the following type of Python program.</p>
<pre>
import pyweb, os
pyweb.tangle( "source.w" )
os.system( "python source.py >source.log" )
pyweb.weave( "source.w" )
</pre>
<p>The first step runs <em>pyWeb</em>, excluding the final weaving pass.  The second
step runs the tangled program, <tt>source.py</tt>, and produces test results in
a log file, <tt>source.log</tt>.  The third step runs <em>pyWeb</em> excluding the
tangle pass.  This produces a final document that includes the <tt>source.log</tt> 
test results.
</p>
<p>To accomplish this, we provide a class hierarchy that defines the various
operations of the <em>pyWeb</em> application.  This class hierarchy defines an extensible set of 
fundamental operations.  This gives us the flexibility to create a simple sequence
of operations and execute any combination of these.  It eliminates the need for a 
forest of <tt>if</tt>-statements to determine precisely what will be done.
</p>
<p>Each operation has the potential to update the state of the overall
application.   The partner to this operation hierarchy is the <b>Application</b> class,
which defines the application options, inputs and results.</p> 

@d Operation class hierarchy... @{
@<Operation superclass has common features of all operations@>
@<MacroOperation subclass that holds a sequence of other operations@>
@<WeaveOperation subclass initiates the weave operation@>
@<TangleOperation subclass initiates the tangle operation@>
@<LoadOperation subclass loads the document web@>
@}

<h4>Operation Class</h4>
<p>The <b>Operation</b> class embodies the basic operations of <em>pyWeb</em>.
The intent of this hierarchy is to provide both an easily expanded method of
adding new operations and an easily specified list of operations for a particular
run of <em>pyWeb</em>.</p>

<h5>Usage</h5>
<p>The overall process of the application is defined by an instance of <b>Operation</b>.
This instance may be the <b>WeaveOperation</b> instance, the <b>TangleOperation</b> instance
or a <b>MacroOperation</b> instance.
</p>
<p>The instance is constructed during parsing of the input parameters.  Then the 
<b>Operation</b> class <b>perform()</b> method is called to actually perform the
operation.  There are three standard <b>Operation</b> instances available: an instance
that is a macro and does both tangling and weaving, an instance that excludes tangling,
and an instance that excludes weaving.  These correspond to the command-line options.
</p>
<pre>
anOp= SomeOperation( <i>parameters</i> )
anOp.perform( <i>theWeb</i>, <i>theApplication</i> )
</pre>

<h5>Design</h5>
<p>The <b>Operation</b> class is the superclass for all operations.</p>

<h5>Implementation</h5>

@d Operation superclass... @{
class Operation:
    """An operation performed by pyWeb."""
    def __init__( self, name ):
        self.name= name
        self.start= None
    def __str__( self ):
        return self.name
    @<Operation perform method actually performs the operation@>
    @<Operation final summary method@>
@}

<p>The <b>perform()</b> method does the real work of the operation.
For the superclass, it merely logs a message and records the start time.  This is overridden 
by a subclass.</p>
@d Operation perform... @{
def perform( self, theWeb, theApplication ):
    theLog.event( ExecutionEvent, "Starting %s" % self.name )
    self.start= time.clock()
@}

<p>The <b>duration()</b> method returns the elapsed time since the operation was started.
The <b>summary()</b> method returns some basic processing
statistics for this operation.
</p>
@d Operation final... @{
def duration( self ):
    """Return duration of the operation."""
    # Windows clock() function is funny.
    return (self.start and time.clock()-self.start) or 0
def summary( self, theWeb, theApplication ):
    return "%s in %0.1f sec." % ( self.name, self.duration() )
@}
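
<p>The <tt>and</tt>/<tt>or</tt> expression in <b>duration()</b> yields zero when the operation
was never started; it also yields zero whenever the recorded start value is itself zero.
The sketch below spells out the "never started" case with an explicit test.  The <tt>Timed</tt> class is a
hypothetical stand-in, and <tt>time.perf_counter()</tt> is assumed here in place of
<tt>time.clock()</tt>, which newer Python versions no longer provide.</p>

```python
import time


class Timed:
    """Hypothetical stand-in for the timing part of Operation."""
    def __init__(self):
        self.start = None

    def begin(self):
        # Record the starting point, as perform() does in the superclass.
        self.start = time.perf_counter()

    def duration(self):
        # Explicit None test instead of the and/or idiom.
        if self.start is None:
            return 0
        return time.perf_counter() - self.start


idle = Timed()
busy = Timed()
busy.begin()
```

<p>An operation that was never performed reports a duration of zero, which is why the summary
of a skipped operation still formats cleanly.</p>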

<h4>MacroOperation Class</h4>
<p>A <b>MacroOperation</b> defines a composite operation; it is a sequence of
other operations.  When the macro is performed, it delegates to the 
sub-operations.</p>

<h5>Usage</h5>
<p>The instance is created during parsing of input parameters.  An instance of
this class is one of
the three standard operations available; it generally is the default, "do everything" 
operation.</p>

<h5>Design</h5>
<p>This class overrides the <b>perform()</b> method of the superclass.  It also adds
an <b>append()</b> method that is used to construct the sequence of operations.
</p>
<h5>Implementation</h5>

@d MacroOperation subclass... @{
class MacroOperation( Operation ):
    """An operation composed of a sequence of other operations."""
    def __init__( self, opSequence=None ):
        Operation.__init__( self, "Macro" )
        if opSequence: self.opSequence= opSequence
        else: self.opSequence= []
    def __str__( self ):
        return "; ".join( [ x.name for x in self.opSequence ] )
    @<MacroOperation perform method delegates the sequence of operations@>
    @<MacroOperation append adds a new operation to the sequence@>
    @<MacroOperation summary summarizes each step@>
@}

<p>Since the macro <b>perform()</b> method delegates to other operations,
it would be possible to short-cut argument processing by using the Python
<tt>*args</tt> construct to accept all arguments and pass them to each
sub-operation.  This implementation passes the two arguments explicitly.</p>

@d MacroOperation perform... @{
def perform( self, theWeb, theApplication ):
    for o in self.opSequence:
        o.perform(theWeb,theApplication)
@}

<p>Since this class is essentially a wrapper around the built-in sequence type, 
we delegate the <b>append()</b> operation directly to the underlying sequence.</p>

@d MacroOperation append... @{
def append( self, anOperation ):
    self.opSequence.append( anOperation )
@}

<p>The <b>summary()</b> method returns some basic processing
statistics for each step of this operation.
</p>
@d MacroOperation summary... @{
def summary( self, theWeb, theApplication ):
    return ", ".join( [ x.summary(theWeb,theApplication) for x in self.opSequence ] )
@}
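
<p>The delegation performed by a macro can be tried out with a stripped-down pair of classes.
This is a sketch only: the simplified <tt>Operation</tt> below records its name in a list
instead of logging and timing, a plain list stands in for the application, and
<tt>None</tt> stands in for the web.</p>

```python
class Operation:
    """Simplified operation: records its name when performed."""
    def __init__(self, name):
        self.name = name

    def perform(self, web, application):
        application.append(self.name)

    def summary(self, web, application):
        return "%s done" % self.name


class MacroOperation(Operation):
    """A sequence of operations that is itself an operation."""
    def __init__(self, opSequence=None):
        Operation.__init__(self, "Macro")
        self.opSequence = list(opSequence or [])

    def append(self, anOperation):
        self.opSequence.append(anOperation)

    def perform(self, web, application):
        # Delegate to each sub-operation in order.
        for o in self.opSequence:
            o.perform(web, application)

    def summary(self, web, application):
        return ", ".join(o.summary(web, application) for o in self.opSequence)


# A plain list stands in for the application; None stands in for the web.
performed = []
everything = MacroOperation([Operation("Load"), Operation("Tangle")])
everything.append(Operation("Weave"))
everything.perform(None, performed)
report = everything.summary(None, performed)
```

<p>Because the macro has the same interface as a single operation, the caller does not need
to know whether it is performing one step or several.</p>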

<h4>WeaveOperation Class</h4>
<p>The <b>WeaveOperation</b> defines the operation of weaving.  This operation
logs a message, and invokes the <b>weave()</b> method of the <b>Web</b> instance.
This operation also includes the basic decision on which weaver to use.  If a <b>Weaver</b> was
specified on the command line, this instance is used.  Otherwise, the first few characters
of the input are examined and a weaver is selected.
</p>

<h5>Usage</h5>
<p>An instance is created during parsing of input parameters.  The instance of this 
class is one of
the standard operations available; it is the "exclude tangling" option and it is
also an element of the "do everything" macro.</p>

<h5>Design</h5>
<p>This class overrides the <b>perform()</b> method of the superclass.
</p>
<h5>Implementation</h5>

@d WeaveOperation subclass... @{
class WeaveOperation( Operation ):
    """An operation that weaves a document."""
    def __init__( self ):
        Operation.__init__( self, "Weave" )
    @<WeaveOperation perform method does weaving of the document file@>
    @<WeaveOperation summary method provides line counts@>
@}

<a name="pick_language"></a>
<p>The language is picked just prior to weaving.  It is either (1) the language
specified on the command line, or, (2) if no language was specified, a language
is selected based on the first few characters of the input.</p>
<p>Weaving can only raise an exception when there is a reference to a chunk that
is never defined.</p>

@d WeaveOperation perform... @{
def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    if not theApplication.theWeaver: 
        # Examine first few chars of first chunk of web to determine language
        theApplication.theWeaver= theWeb.language() 
    try:
        theWeb.weave( theApplication.theWeaver )
    except Error,e:
        theLog.event( ErrorEvent, 
            "Problems weaving document from %s (weave file is faulty)." 
            % theWeb.sourceFileName )
        raise
@}

<p>The <b>summary()</b> method returns some basic processing
statistics for the weave operation.
</p>
@d WeaveOperation summary... @{
def summary( self, theWeb, theApplication ):
    if theApplication.theWeaver and theApplication.theWeaver.linesWritten > 0:
        return "%s %d lines in %0.1f sec." % ( self.name, theApplication.theWeaver.linesWritten, self.duration() )
    return "did not %s" % ( self.name, )
@}

<h4>TangleOperation Class</h4>
<p>The <b>TangleOperation</b> defines the operation of tangling.  This operation
logs a message, and invokes the <b>tangle()</b> method of the <b>Web</b> instance.
The tangler is the application's <b>Tangler</b> instance: either the default
<b>TanglerMake</b> or the tangler specified on the command line.
</p>

<h5>Usage</h5>
<p>An instance is created during parsing of input parameters.  The instance of this 
class is one of
the standard operations available; it is the "exclude weaving" option, and it is
also an element of the "do everything" macro.</p>

<h5>Design</h5>
<p>This class overrides the <b>perform()</b> method of the superclass.
</p>
<h5>Implementation</h5>

@d TangleOperation subclass... @{
class TangleOperation( Operation ):
    """An operation that tangles the output files."""
    def __init__( self ):
        Operation.__init__( self, "Tangle" )
    @<TangleOperation perform method does tangling of the output files@>
    @<TangleOperation summary method provides total lines tangled@>
@}

<p>Tangling can only raise an exception when a cross reference request (<tt>@@f</tt>, <tt>@@m</tt> or <tt>@@u</tt>)
occurs in a program code chunk.  Program code chunks are defined 
with any of <tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt>  and use <tt>@@{</tt> <tt>@@}</tt> brackets.
</p>

@d TangleOperation perform... @{
def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    try:
        theWeb.tangle( theApplication.theTangler )
    except Error,e:
        theLog.event( ErrorEvent, 
            "Problems tangling outputs from %s (tangle files are faulty)." 
            % theWeb.sourceFileName )
        raise
@}

<p>The <b>summary()</b> method returns some basic processing
statistics for the tangle operation.
</p>
@d TangleOperation summary... @{
def summary( self, theWeb, theApplication ):
    if theApplication.theTangler and theApplication.theTangler.linesWritten > 0:
        return "%s %d lines in %0.1f sec." % ( self.name, theApplication.theTangler.linesWritten, self.duration() )
    return "did not %s" % ( self.name, )
@}


<h4>LoadOperation Class</h4>
<p>The <b>LoadOperation</b> defines the operation of loading the web structure.  This operation
uses the application's <tt>webReader</tt> to actually do the load.
</p>

<h5>Usage</h5>
<p>An instance is created during parsing of the input parameters.  An instance of
this class is part of each of the weave, tangle and "do everything" operations.</p>

<h5>Design</h5>
<p>This class overrides the <b>perform()</b> method of the superclass.
</p>
<h5>Implementation</h5>

@d LoadOperation subclass... @{
class LoadOperation( Operation ):
    """An operation that loads the source web for a document."""
    def __init__( self ):
        Operation.__init__( self, "Load" )
    @<LoadOperation perform method loads the source web@>
    @<LoadOperation summary provides lines read@>
@}

<p>Trying to load the web involves two steps, either of which can raise 
exceptions due to incorrect inputs.</p>
<ul>
<li>The <b>WebReader</b> class <b>load()</b> method can raise exceptions for a number of 
syntax errors.
    <ul>
    <li>Missing closing brackets (<tt>@@}</tt>, <tt>@@]</tt> or <tt>@@&gt;</tt>).</li>
    <li>Missing opening bracket (<tt>@@{</tt> or <tt>@@[</tt>) after a chunk name (<tt>@@d</tt>, <tt>@@o</tt> or <tt>@@O</tt>).</li>
    <li>Extra opening brackets (<tt>@@{</tt> or <tt>@@[</tt>).</li>
    <li>The input file does not exist or is not readable.</li>
    </ul></li>
<li>The <b>Web</b> class <b>createUsedBy()</b> method can raise an exception when a 
chunk reference cannot be resolved to a named chunk.</li>
</ul>

@d LoadOperation perform... @{
def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    try:
        theApplication.webReader.load( theWeb )
        theWeb.createUsedBy()
    except (Error,IOError),e:
        theLog.event( ErrorEvent, 
            "Problems with source file %s, no output produced." 
            % theWeb.sourceFileName )
        raise
@}
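
<p>The log-and-re-raise pattern used by the operations can be shown on its own.  In this
sketch every name is a stand-in: <tt>Error</tt> is a local exception class,
<tt>load_with_logging()</tt> is hypothetical, and a list takes the place of the logger.</p>

```python
class Error(Exception):
    """Stand-in for the application's Error exception."""


def load_with_logging(load, source_name, log):
    """Run a load step; on failure, log a message and re-raise."""
    try:
        load()
    except (Error, IOError):
        log.append("Problems with source file %s, no output produced."
                   % source_name)
        # A bare raise passes the original exception on to the caller.
        raise


def failing_load():
    raise Error("missing closing bracket")


log = []
try:
    load_with_logging(failing_load, 'example.w', log)
    outcome = 'loaded'
except Error as e:
    outcome = 'failed: %s' % e
```

<p>The operation adds context about which source file failed, while the decision about how to
report the failure stays with the caller.</p>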

<p>The <b>summary()</b> method returns some basic processing
statistics for the load operation.
</p>
@d LoadOperation summary... @{
def summary( self, theWeb, theApplication ):
    return "%s %d lines in %0.1f sec." % ( self.name, theApplication.webReader.totalLines(), self.duration() )
@}

<h2>The Application Class</h2>

<h3>Design</h3>

<p>The <b>Application</b> class is provided so that the <b>Operation</b> instances
have an overall application to update.  This allows the <b>WeaveOperation</b> to 
provide the selected <b>Weaver</b> instance to the application.  It also provides a
central location for the various options and alternatives that might be accepted from
the command line.
</p>

<p>The constructor sets the default options for weaving and tangling.</p>

<p>The <b>parseArgs()</b> method uses the <tt>sys.argv</tt> sequence to 
parse the command line arguments and update the options.  This allows a
program to pre-process the arguments, passing other arguments to this module.
</p>

<p>The <b>process()</b> method processes a list of files.  This is either
the list of files passed as an argument, or it is the list of files
parsed by the <b>parseArgs()</b> method.
</p>

<p>The <b>parseArgs()</b> and <b>process()</b> methods are separated so that
a module can include this one, bypass command-line parsing, yet still perform
the basic operations simply and consistently.</p>
<p>For example:</p>
<pre>
import pyweb, getopt, sys
a= pyweb.Application( <i>My Emitter Factory</i> )
opt,arg= getopt.getopt( sys.argv[1:], '<i>My Unique Arguments</i>' )
    <i>My argument parsing</i>
a.process( <i>a File List</i> )
</pre>

<p>The <b>main()</b> function creates an <b>Application</b> instance and
calls the <b>parseArgs()</b> and <b>process()</b> methods to provide the
expected default behavior for this module when it is used as the main program.
</p>

<h3>Implementation</h3>

@d Application Class... 
@{
class Application:
    def __init__( self, ef=None ):
        @<Application default options@>
    @<Application parse command line@>
    @<Application class process all files@>
@}

<p><a name="log_setting"></a>
The first part of parsing the command line is 
setting default values that apply when parameters are omitted.
The default values are set as follows:
</p>
<ul>
<li><i>emitterFact</i> is an <b>EmitterFactory</b> instance that will
create the required <b>Weaver</b> and <b>Tangler</b> requested.</li>
<li><i>theTangler</i> is set to a <b>TanglerMake</b> instance 
to create the output files.</li>
<li><i>theWeaver</i> is set to <b>None</b> so that the input
language will be used to select an appropriate weaver.</li>
<li><i>commandChar</i> is set to <b><tt>@@</tt></b> as the 
default command introducer.</li>
<li><i>doWeave</i> and <i>doTangle</i> are instances of <b>Operation</b>
that describe the basic operations of the application.</li>
<li><i>theOperation</i> is an instance of <b>Operation</b> that describes
the default overall operation: both tangle and weave.</li>
<li><i>permitList</i> provides a list of commands that are permitted
to fail.  Typically this is empty, or contains <tt>@@i</tt> to allow the include
command to fail.</li>
<li><i>files</i> is the final list of argument files from the command line; 
these will be processed unless overridden in the call to <b>process()</b>.</li>
<li><i>webReader</i> is the <b>WebReader</b> instance created for the current
input file.</li>
</ul>

@d Application default options...
@{
if not ef: ef= EmitterFactory()
self.emitterFact= ef
self.theTangler= ef.mkEmitter( 'TanglerMake' )
self.theWeaver= None
self.commandChar= '@@'
loadOp= LoadOperation()
weaveOp= WeaveOperation()
tangleOp= TangleOperation()
self.doWeave= MacroOperation( [loadOp,weaveOp] )
self.doTangle= MacroOperation( [loadOp,tangleOp] )
self.theOperation= MacroOperation( [loadOp,tangleOp,weaveOp] )
self.permitList= []
self.files= []
self.webReader= None
@}

<p>The algorithm for parsing the command line parameters uses the standard library
<b>getopt</b> module.  This module has a <b>getopt()</b> function that accepts
the original arguments and normalizes them into two sequences: options and
arguments.  The options sequence contains tuples of option name and option value.
</p>
@d Application parse command line...
@{
def parseArgs( self, argv ):
    global theLog
    opts, self.files = getopt.getopt( argv[1:], 'hvsdc:w:t:x:p:' )
    for o,v in opts:
        if o == '-c': 
            theLog.event( OptionsEvent, "Setting command character to %r" % v )
            self.commandChar= v
        elif o == '-w': 
            theLog.event( OptionsEvent, "Setting weaver to %r" % v )
            self.theWeaver= self.emitterFact.mkEmitter( v )
        elif o == '-t': 
            theLog.event( OptionsEvent, "Setting tangler to %r" % v )
            self.theTangler= self.emitterFact.mkEmitter( v )
        elif o == '-x':
            if v.lower().startswith('w'): # skip weaving
                self.theOperation= self.doTangle
            elif v.lower().startswith('t'): # skip tangling
                self.theOperation= self.doWeave
            else:
                raise Exception( "Unknown -x option %r" % v )
        elif o == '-p':
            # save permitted errors, usual case is -pi to permit include errors
            self.permitList= [ '%s%s' % ( self.commandChar, c ) for c in v ]
        elif o == '-h': print __doc__
        elif o == '-v': theLog= Logger( verbose )
        elif o == '-s': theLog= Logger( silent )
        elif o == '-d': theLog= Logger( debug )
        else:
            raise Exception('unknown option %r' % o)
@}
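
<p>The normalization done by <b>getopt()</b> is easiest to see on a sample argument list.
The sketch below reuses the option string from <b>parseArgs()</b>; the argument values and
file names are made up for illustration, and the options are collected in a dictionary
rather than handled one by one.</p>

```python
import getopt

# Same option string as parseArgs(); a trailing colon means the option
# takes a value.
argv = ['pyweb.py', '-v', '-xw', '-c', '$', '-pi', 'first.w', 'second.w']
opts, files = getopt.getopt(argv[1:], 'hvsdc:w:t:x:p:')

# Collect option values in a dictionary for easy lookup.
values = dict(opts)
commandChar = values['-c']
# One permitted command per character of the -p value.
permitList = ['%s%s' % (commandChar, c) for c in values.get('-p', '')]
```

<p>Options without a value, such as <tt>-v</tt>, are reported with an empty string as their
value; everything after the last option is returned as the list of files.</p>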

<p>The <b>process()</b> method uses the current <b>Application</b> settings
to process each file as follows:</p>
<ol>
<li>Create a new <b>WebReader</b> for the <b>Application</b>, providing
the parameters required to process the input file.</li>
<li>Create a <b>Web</b> instance, <i>w</i> 
and set the Web's <i>sourceFileName</i> from the WebReader's <i>fileName</i>.</li>
<li>Perform the given command, typically a <b>MacroOperation</b>, 
which does some combination of load, tangle the output files and
weave the final document in the target language; if
necessary, examine the <b>Web</b> to determine the documentation language.</li>
<li>Log a performance summary line that shows the lines processed and the elapsed time for each operation.</li>
</ol>

<p>In the event of failure in any of the major processing steps, 
the operation logs a summary message, to clarify the state of 
the output files, and the exception is re-raised.
The re-raising is done so that <b>Error</b> and <b>IOError</b> exceptions are reported
in one place, by the <b>process()</b> method.</p>

@d Application class process all...
@{
def process( self, theFiles=None ):
    for f in theFiles or self.files:
        self.webReader= WebReader( f, command=self.commandChar, permit=self.permitList )
        try:
            w= Web()
            w.sourceFileName= self.webReader.fileName
            self.theOperation.perform( w, self )
        except Error,e:
            print '>', e.args[0]
            for a in e.args[1:]:
                print '...', a
        except IOError,e:
            print e
        theLog.event(SummaryEvent, 'pyWeb: %s' % self.theOperation.summary(w,self) )
@}

<a name="moduleInit"></a>
<h2>Module Initialization</h2>

<p>These global singletons define the logging that can be done.  These are used
in the logger configuration that is done as part of Module Initialization.</p>

@d Global singletons...
@{
_report= LogReport()
_discard= LogDiscard()
_debug= LogDebug()
@| _report _discard _debug @}

<p>A Log configuration tuple lists a number of <b>Event</b> subclasses 
that are associated with an instance
of a subclass of the <b>LogActivity</b> class.
Each alternative configuration is a list of association tuples.
A command-line parameter selects one of these configuration
alternatives. 
</p>
<p>A direct mapping from class to <b>LogActivity</b> instance might be a simpler block
of Python programming, but is relatively long-winded when specifying the configuration.
</p>

@d Module Initialization...
@{
standard = [
    (('ErrorEvent','WarningEvent'), _report ),
    (('ExecutionEvent',), _debug ),
    (('InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
      'TangleStartEvent', 'TangleEndEvent','SummaryEvent'), _report),
    (('OptionsEvent', 'ReadEvent', 
      'WeaveEvent', 'TangleEvent'),  _discard ) ]
silent = [
    (('ErrorEvent','SummaryEvent'), _report ),
    (('ExecutionEvent','WarningEvent',
      'InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
      'TangleStartEvent', 'TangleEndEvent'), _discard),
    (('OptionsEvent', 'ReadEvent', 
      'WeaveEvent', 'TangleEvent'),  _discard ) ]
verbose = [
    (('ErrorEvent','WarningEvent'), _report ),
    (('ExecutionEvent',
      'InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
      'TangleStartEvent', 'TangleEndEvent','SummaryEvent'), _report),
    (('OptionsEvent', 'ReadEvent',
      'WeaveEvent', 'TangleEvent'),  _report ) ]

theLog= Logger(standard)
@| standard silent verbose theLog @}
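
<p>To compare the two styles of configuration, a list of association tuples can be expanded
into the direct mapping mentioned above.  This is only a sketch of the idea, not a description
of how the <b>Logger</b> class works: the <tt>flatten()</tt> helper is hypothetical, the
configuration is abbreviated, and strings stand in for the <b>LogActivity</b> singletons.</p>

```python
def flatten(configuration):
    """Expand (event names, activity) pairs into a name-to-activity mapping."""
    mapping = {}
    for names, activity in configuration:
        for name in names:
            mapping[name] = activity
    return mapping


# Strings stand in for the LogActivity singletons.
standard = [
    (('ErrorEvent', 'WarningEvent'), 'report'),
    (('ExecutionEvent',), 'debug'),
    (('OptionsEvent', 'ReadEvent'), 'discard'),
]
mapping = flatten(standard)
```

<p>The tuple form keeps each activity on one line with all of its events, which is what makes
the alternative configurations easy to compare side by side.</p>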

<h2>Interface Functions</h2>

<p>There are three interface functions:  <b>main()</b>, <b>tangle()</b> and <b>weave()</b>.
The latter two are often termed "convenience functions".
</p>

<p>The main program is configured with an instance of an <b>EmitterFactory</b>.
This instance is used to resolve names of weavers and tanglers when parsing
parameters.  An application program that uses this module can do the following:
</p>
<ul>
<li>Subclass <b>Weaver</b> or <b>Tangler</b> to create a subclass with 
additional features.</li>
<li>Subclass <b>EmitterFactory</b> to add a reference to the new subclass.</li>
<li>Call <tt>pyweb.main( <i>newFactory</i>, sys.argv )</tt> to run the existing
main program with extra classes available to it.</li>
</ul>

<p>The <b>main()</b> function creates an <b>Application</b> instance
to parse the command line arguments and then process those arguments.
An <b>EmitterFactory</b> is passed as an argument so that another module
can be constructed to add features to this module.
See the example in the <a href="#emitterFactory">Emitter Factory</a> section, above.
</p>


@d Interface Functions...
@{
def main( ef, argv ):
    a= Application( ef )
    a.parseArgs( argv )
    a.process()

if __name__ == "__main__":
    main( EmitterFactory(), sys.argv )
@| main @}

<p>Two simple convenience functions are available for Python scripts
that look like the following:
</p>
<pre>
import pyweb
pyweb.tangle( 'aFile.w' )
...test procedure...
pyweb.weave( 'aFile.w' )
</pre>

@d Interface Functions... @{
def tangle( aFile ):
    """Tangle a single file, permitting errors on missing @@i files."""
    a= Application( EmitterFactory() )
    a.permitList= [ '%si' % a.commandChar ]
    a.theOperation= a.doTangle
    a.process( [aFile] )
    
def weave( aFile ):
    """Weave a single file."""
    a= Application( EmitterFactory() )
    a.theOperation= a.doWeave
    a.process( [aFile] )
@| tangle weave @}

<h1>Indices</h1>
<h2>Files</h2>
@f
<h2>Macros</h2>
@m
<h2>User Identifiers</h2>
@u

<hr />
<p><small>Created by @(thisApplication@) at @(time.asctime()@).</small></p>
<p><small>pyweb.__version__ '@(__version__@)'.</small></p>
<p><small>Source @(theFile@) modified @(time.ctime(os.path.getmtime(theFile))@).
</small></p>
<p><small>Working directory '@(os.getcwd()@)'.</small></p>

</body>
</html>
