pyWeb

In Python, Yet Another Literate Programming Tool

Steven F. Lott

Table of Contents

  1. Table of Contents
  2. Introduction
    1. Background
    2. pyWeb
    3. Use Cases
      1. Tangle Source Files
      2. Weave Source Files
      3. Tangle, Regression Test and Weave
    4. Writing pyWeb .w Files
      1. Major Commands
      2. Minor Commands
      3. Additional Features
    5. Running pyWeb to Tangle and Weave
    6. Restrictions
    7. Installation
    8. Acknowledgements
  3. TODO:
  4. Design Overview
  5. Implementation
    1. pyWeb Base File
      1. Python Library Imports
      2. Python DOC String
      3. Other Python Overheads
    2. Base Class Definitions
      1. Emitters
        1. Emitter Superclass
          1. Usage
          2. Design
          3. Implementation
        2. Weaver subclass of Emitter
          1. Usage
          2. Design
          3. Implementation
        3. LaTex subclass of Weaver
          1. Usage
          2. Design
          3. Implementation
        4. HTML subclass of Weaver
          1. Usage
          2. Design
          3. Implementation
        5. Tangler subclass of Emitter
          1. Usage
          2. Design
          3. Implementation
        6. TanglerMake subclass of Tangler
          1. Usage
          2. Design
          3. Implementation
      2. Emitter Factory
        1. Usage
        2. Design
        3. Implementation
      3. Chunks
        1. Chunk Superclass
          1. Usage
          2. Design
          3. Implementation
        2. NamedChunk class
          1. Usage
          2. Design
          3. Implementation
        3. OutputChunk class
          1. Usage
          2. Design
          3. Implementation
        4. NamedDocumentChunk class
          1. Usage
          2. Design
          3. Implementation
      4. Commands
        1. Command Superclass
          1. Usage
          2. Design
          3. Implementation
        2. TextCommand class
          1. Usage
          2. Design
          3. Implementation
        3. CodeCommand class
          1. Usage
          2. Design
          3. Implementation
        4. XrefCommand superclass
          1. Usage
          2. Design
          3. Implementation
        5. FileXrefCommand class
          1. Usage
          2. Design
          3. Implementation
        6. MacroXrefCommand class
          1. Usage
          2. Design
          3. Implementation
        7. UserIdXrefCommand class
          1. Usage
          2. Design
          3. Implementation
        8. ReferenceCommand class
          1. Usage
          2. Design
          3. Implementation
      5. Error class
        1. Usage
        2. Design
        3. Implementation
      6. Logger Classes
        1. Usage
        2. Design
      7. The Web Class
      8. The WebReader Class
        1. Usage
        2. Design
        3. Implementation
      9. Operation Class Hierarchy
        1. Operation Class
          1. Usage
          2. Design
          3. Implementation
        2. MacroOperation Class
          1. Usage
          2. Design
          3. Implementation
        3. WeaveOperation Class
          1. Usage
          2. Design
          3. Implementation
        4. TangleOperation Class
          1. Usage
          2. Design
          3. Implementation
        5. LoadOperation Class
          1. Usage
          2. Design
          3. Implementation
    3. The Application Class
      1. Design
      2. Implementation
    4. Module Initialization
    5. Interface Functions
  6. Indices
    1. Files
    2. Macros
    3. User Identifiers

Introduction

Literate programming was pioneered by Knuth as a method for developing readable, understandable presentations of programs. These would present a program in a literate fashion for people to read and understand; this would be in parallel with presentation as source text for a compiler to process and both would be generated from a common source file.

One intent is to synchronize the program source with the documentation about that source. If the program and the documentation have a common origin, then the traditional gaps between intent (expressed in the documentation) and action (expressed in the working program) are significantly reduced.

Numerous tools have been developed based on Knuth's initial work. A relatively complete survey is available at sites like Literate Programming, and the OASIS XML Cover Pages: Literate Programming with SGML and XML.

The immediate predecessors to this pyWeb tool are FunnelWeb, noweb and nuweb. The ideas lifted from these other tools created the foundation for pyWeb.

There are several Python-oriented literate programming tools. These include LEO, interscript, lpy and py2html.

The FunnelWeb tool is independent of any programming language and only mildly dependent on TEX. It has 19 commands, many of which duplicate features of HTML or LATEX.

The noweb tool was written by Norman Ramsey. This tool uses a sophisticated multi-processing framework, via Unix pipes, to permit flexible manipulation of the source file to tangle and weave the programming language and documentation markup files.

The nuweb Simple Literate Programming Tool was developed by Preston Briggs (preston@tera.com). His work was supported by ARPA, through ONR grant N00014-91-J-1989. It is written in C, and very focused on producing LATEX documents. It can produce HTML, but this is clearly added after the fact. It cannot be easily extended, and is not object-oriented.

The LEO tool is a structured GUI editor for creating source. It uses XML and noweb-style chunk management. It is more than a simple weave and tangle tool.

The interscript tool is very large and sophisticated, but doesn't gracefully tolerate HTML markup in the document. It can create a variety of markup languages from the interscript source, making it suitable for creating HTML as well as LATEX.

The lpy tool can produce very complex HTML representations of a Python program. It works by locating documentation markup embedded in Python comments and docstrings. This is called inverted literate programming.

The py2html tool does very sophisticated syntax coloring.

Background

The following is an almost verbatim quote from Briggs' nuweb documentation, and provides an apt summary of Literate Programming.

In 1984, Knuth introduced the idea of literate programming and described a pair of tools to support the practise (Donald E. Knuth, Literate Programming, The Computer Journal 27 (1984), no. 2, 97-111.) His approach was to combine Pascal code with TEX documentation to produce a new language, WEB, that offered programmers a superior approach to programming. He wrote several programs in WEB, including weave and tangle, the programs used to support literate programming. The idea was that a programmer wrote one document, the web file, that combined documentation written in TEX (Donald E. Knuth, The TEXbook, Computers and Typesetting, 1986) with code (written in Pascal).

Running tangle on the web file would produce a complete Pascal program, ready for compilation by an ordinary Pascal compiler. The primary function of tangle is to allow the programmer to present elements of the program in any desired order, regardless of the restrictions imposed by the programming language. Thus, the programmer is free to present his program in a top-down fashion, bottom-up fashion, or whatever seems best in terms of promoting understanding and maintenance.

Running weave on the web file would produce a TEX file, ready to be processed by TEX. The resulting document included a variety of automatically generated indices and cross-references that made it much easier to navigate the code. Additionally, all of the code sections were automatically prettyprinted, resulting in a quite impressive document.

Knuth also wrote the programs for TEX and METAFONT entirely in WEB, eventually publishing them in book form. These are probably the largest programs ever published in a readable form.

pyWeb

pyWeb works with any programming language and any markup language. This philosophy comes from FunnelWeb, noweb and nuweb. The primary differences between pyWeb and other tools are the following.

pyWeb works with any programming language and any markup language. The initial release supports HTML completely, and LATEX approximately. The biggest gap in the LATEX support is a complete lack of understanding of the original markup in nuweb, and the very real damage done to that markup when creating pyWeb.

The following is extensively quoted from Briggs' nuweb documentation, and provides an excellent background in the advantages of the very simple approach started by nuweb and adopted by pyWeb.

The need to support arbitrary programming languages has many consequences:

No prettyprinting
Both WEB and CWEB are able to prettyprint the code sections of their documents because they understand the language well enough to parse it. Since we want to use any language, we've got to abandon this feature. However, we do allow particular individual formulas or fragments of LATEX or HTML code to be formatted and still be part of the output files.
Limited index of identifiers
Because WEB knows about Pascal, it is able to construct an index of all the identifiers occurring in the code sections (filtering out keywords and the standard type identifiers). Unfortunately, this isn't as easy in our case. We don't know what an identifier looks like in each language and we certainly don't know all the keywords. We provide a mechanism to mark identifiers, and we use a pretty standard pattern for recognizing identifiers in almost all programming languages.

Of course, we've got to have some compensation for our losses or the whole idea would be a waste. Here are the advantages I [Briggs] can see:

Simplicity
The majority of the commands in WEB are concerned with control of the automatic prettyprinting. Since we don't prettyprint, many commands are eliminated. A further set of commands is subsumed by LATEX and may also be eliminated. As a result, our set of commands is reduced to only about seven members (explained in the next section). This simplicity is also reflected in the size of this tool, which is quite a bit smaller than the tools used with other approaches.
No prettyprinting
Everyone disagrees about how their code should look, so automatic formatting annoys many people. One approach is to provide ways to control the formatting. Our approach is simpler -- we perform no automatic formatting and therefore allow the programmer complete control of code layout.
Control
We also offer the programmer reasonably complete control of the layout of his output files (the files generated during tangling). Of course, this is essential for languages that are sensitive to layout; but it is also important in many practical situations, e.g., debugging.
Speed
Since [pyWeb] doesn't do too much, it runs very quickly. It combines the functions of tangle and weave into a single program that performs both functions at once.
Chunk numbers
Inspired by the example of noweb, [pyWeb] refers to all program code chunks by a simple, ascending sequence number through the file. This becomes the HTML anchor name, also.
Multiple file output
The programmer may specify more than one output file in a single [pyWeb] source file. This is required when constructing programs in a combination of languages (say, Fortran and C). It's also an advantage when constructing very large programs.

Use Cases

pyWeb supports two use cases, Tangle Source Files and Weave Documentation. These are often combined into a single request of the application that will both weave and tangle.

Tangle Source Files

A user initiates this process when they have a complete .w file that contains a description of source files. These source files are described with @o commands in the .w file.

The use case is successful when the source files are produced.

Outside this use case, the user will debug those source files, possibly updating the .w file. This will lead to a need to restart this use case.

The use case is a failure when the source files cannot be produced, due to errors in the .w file. These must be corrected based on information in log messages.

The sequence is simply ./pyweb.py theFile.w.

Weave Source Files

A user initiates this process when they have a .w file that contains a description of a document to produce. The document is described by the entire .w file.

The use case is successful when the documentation file is produced.

Outside this use case, the user will edit the documentation file, possibly updating the .w file. This will lead to a need to restart this use case.

The use case is a failure when the documentation file cannot be produced, due to errors in the .w file. These must be corrected based on information in log messages.

The sequence is simply ./pyweb.py theFile.w.

Tangle, Regression Test and Weave

A user initiates this process when they have a .w file that contains a description of a document to produce. The document is described by the entire .w file. Further, their final document should include regression test output from the source files created by the tangle operation.

The use case is successful when the documentation file is produced, including current regression test output.

Outside this use case, the user will edit the documentation file, possibly updating the .w file. This will lead to a need to restart this use case.

The use case is a failure when the documentation file cannot be produced, due to errors in the .w file. These must be corrected based on information in log messages.

The use case is a failure when the documentation file does not include current regression test output.

The sequence is as follows:

./pyweb.py -xw -pi theFile.w
python theTest>aLog
./pyweb.py -xt theFile.w

The first step excludes weaving and permits errors on the @i command. The -pi option is necessary in the event that the log file does not yet exist. The second step runs the regression test, creating a log file. The third step weaves the final document, including the regression test output.
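
The three steps can also be driven from a small script. The following is a minimal sketch written for a current Python 3, not part of pyWeb: the names regression_cycle and run_cycle are hypothetical, and the file names are the ones used in the sequence above.

```python
import subprocess

def regression_cycle(web_file, test_script, log_file, pyweb="./pyweb.py"):
    """Return the three steps as (argv, stdout_file_or_None) pairs."""
    return [
        # Pass 1: tangle only, permitting a missing include file.
        ([pyweb, "-xw", "-pi", web_file], None),
        # Run the regression test, capturing its output in the log file.
        (["python", test_script], log_file),
        # Pass 2: weave only, now that the log file exists.
        ([pyweb, "-xt", web_file], None),
    ]

def run_cycle(steps):
    """Run each step in order, stopping at the first failure."""
    for argv, log_file in steps:
        if log_file is None:
            subprocess.run(argv, check=True)
        else:
            with open(log_file, "w") as log:
                subprocess.run(argv, stdout=log, check=True)
```

Calling run_cycle(regression_cycle("theFile.w", "theTest", "aLog")) would perform the same sequence as the three shell commands.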

Writing pyWeb .w Files

The input to pyWeb is a .w file that consists of a series of Chunks. Each Chunk is either program source code to be tangled or it is documentation to be woven. The bulk of the file is typically documentation chunks that describe the program in some human-oriented markup language like HTML or LATEX.

The pyWeb tool parses the input, and performs the tangle and weave operations. It tangles each individual output file from the program source chunks. It weaves the final documentation file from the entire sequence of chunks provided, mixing the author's original documentation with the program source.

The Major commands partition the input and define the various chunks. The Minor commands are used to control the woven and tangled output from those chunks.

Major Commands

There are three major commands that define the various chunks in an input file.

@o file @{ text @}
The @o (output) command defines a named output file chunk. The text is tangled to the named file with no alteration. It is woven into the document in an appropriate fixed-width font.
@d name @{ text @}
The @d (define) command defines a named chunk of program source. This text is tangled or woven when it is referenced by the reference minor command.
@i file
The @i (include) command includes another file. The previous chunk is ended. The file is processed completely, then a new chunk is started for the text after the @i command.

All material that is not explicitly in a @o or @d named chunk is implicitly collected into a sequence of anonymous document source chunks. These anonymous chunks form the backbone of the document that is woven. The anonymous chunks are never tangled into output program source files.

Note that white space (line breaks ('\n'), tabs and spaces) has no effect on the input parsing. It is completely preserved on output.

The following example has three chunks: an anonymous chunk of documentation, a named output chunk, and another anonymous chunk of documentation.

<p>Some HTML documentation that describes the following piece of the
program.</p>
@o myFile.py 
@{
import math
print math.pi
@}
<p>Some more HTML documentation.</p>

Minor Commands

There are seven minor commands. Most cause content to be created where the command is referenced; the @| command instead marks the identifiers a chunk defines.

@@
The @@ command creates a single @ in the output file.
@<name@>
The name references a named chunk. When tangling, the referenced chunk replaces the reference command. When weaving, a reference marker is used. In HTML, this will be the <A HREF=...> markup.
@f
The @f command inserts a file cross reference. This lists the name of each file created by an @o command, and all of the various chunks that are concatenated to create this file.
@m
The @m command inserts a named chunk ("macro") cross reference. This lists the name of each chunk created by an @d command, and all of the various chunks that are concatenated to create the complete chunk.
@u
The @u command inserts a user identifier cross reference. This lists each user identifier defined with the @| command, along with the chunks in which it appears.
@|
A chunk may define user identifiers. The list of defined identifiers is placed in the chunk, set off by a @| separator.
@(Python expression@)
The Python expression is evaluated and the result is tangled or woven in place. A few global variables and modules are available. These are described below.

Additional Features

The named chunks (from both @o and @d commands) are assigned unique sequence numbers to simplify cross references. In LATEX it is possible to determine the page breaks and assign the sequence numbers based on the physical pages.

Chunk names and file names are case sensitive.

Chunk names can be abbreviated. A partial name can have a trailing ellipsis (...); this will be resolved to the full name. The most typical use for this is shown in the following example.

<p>Some HTML documentation.</p>
@o myFile.py 
@{
@<imports of the various packages used@>
print math.pi,time.time()
@}
<p>Some notes on the packages used.</p>
@d imports...
@{
import math,time
@| math time
@}
<p>Some more HTML documentation.</p>
  1. An anonymous chunk of documentation.
  2. A named chunk that tangles the myFile.py output. It has a reference to the imports of the various packages used chunk. Note that the full name of the chunk is essentially a line of documentation, traditionally done as a comment line in a non-literate programming environment.
  3. An anonymous chunk of documentation.
  4. A named chunk with an abbreviated name. The imports... matches the complete name. Set off after the @| separator is the list of identifiers defined in this chunk.
  5. An anonymous chunk of documentation.
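
The resolution of an abbreviated name can be sketched as a prefix match against the full names seen so far. This is an illustration only, written for a current Python 3: the function resolve_name is hypothetical, and raising an error when a prefix matches zero or several names is an assumption, not documented pyWeb behavior.

```python
def resolve_name(name, full_names):
    """Resolve a possibly abbreviated chunk name to its full name.

    Matching is case sensitive, as chunk names are.
    """
    if not name.endswith("..."):
        return name
    prefix = name[:-3].rstrip()
    matches = [full for full in full_names if full.startswith(prefix)]
    if len(matches) != 1:
        # Assumed policy: an unknown or ambiguous abbreviation is an error.
        raise KeyError("%r matches %d chunk names" % (name, len(matches)))
    return matches[0]

names = ["imports of the various packages used", "body of the aFunction"]
full = resolve_name("imports...", names)
```

Here full is the complete name "imports of the various packages used", so the @d imports... chunk is attached to the name referenced in myFile.py.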

Named chunks are concatenated from their various pieces. This allows a named chunk to be broken into several pieces, simplifying the description. This is most often used when producing fairly complex output files.

<p>An anonymous chunk with some HTML documentation.</p>
@o myFile.py 
@{
import math,time
@}
<p>Some notes on the packages used.</p>
@o myFile.py
@{
print math.pi,time.time()
@}
<p>Some more HTML documentation.</p>
  1. An anonymous chunk of documentation.
  2. A named chunk that tangles the myFile.py output. It has the first part of the file. In the woven document this is marked with "=".
  3. An anonymous chunk of documentation.
  4. A named chunk that also tangles the myFile.py output. This chunk's content is appended to the first chunk. In the woven document this is marked with "+=".
  5. An anonymous chunk of documentation.
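
The concatenation rule can be sketched with a dictionary that maps each name to the list of its pieces. This is a minimal illustration for a current Python 3, not pyWeb's own data structure: collect_chunks and its return values are hypothetical.

```python
def collect_chunks(pieces):
    """Group (name, text) pairs, given in file order, by name.

    Returns the grouped text and, for each piece, its sequence
    number, its name, and the marker used in the woven document.
    """
    web = {}
    markers = []
    for sequence, (name, text) in enumerate(pieces, start=1):
        # The first piece defines the chunk; later pieces append to it.
        marker = "+=" if name in web else "="
        web.setdefault(name, []).append(text)
        markers.append((sequence, name, marker))
    return web, markers

pieces = [
    ("myFile.py", "\nimport math,time\n"),
    ("myFile.py", "\nprint math.pi,time.time()\n"),
]
web, markers = collect_chunks(pieces)
tangled = "".join(web["myFile.py"])
```

The tangled text is the two pieces in file order, and markers records "=" for the first piece and "+=" for the second.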

Newline characters are preserved on input. Because of this the output may appear to have excessive newlines. In all of the above examples, each named chunk was defined with the following.


@{
import math,time
@}

This puts a newline character before and after the import line.

One transformation is performed when tangling output. The indentation of a chunk reference is applied to the entire chunk. This makes it simpler to prepare source for languages (like Python) where indentation is important. It also gives the author control over how the final tangled output looks.

Also, note that the myFile.py chunk in the following example uses the @| command to show that this chunk defines the identifier aFunction.

<p>An anonymous chunk with some HTML documentation.</p>
@o myFile.py 
@{
def aFunction( a, b ):
    @<body of the aFunction@>
@| aFunction @}
<p>Some notes on the packages used.</p>
@d body...
@{
"""doc string"""
return a + b
@}
<p>Some more HTML documentation.</p>

The tangled output from this will look like the following. All of the newline characters are preserved, and the reference to body of the aFunction is indented to match the prevailing indent where it was referenced. In the following example, explicit line markers of ~ are provided to make the blank lines more obvious.


~
~def aFunction( a, b ):
~        
~    """doc string"""
~    return a + b
~
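
The indentation rule can be sketched as follows. This is an illustration for a current Python 3 under stated assumptions, not the Tangler's actual code: the function tangle is hypothetical, it looks chunks up by their full names only, and it takes the indent to be the column at which the reference starts. White space on otherwise blank lines may differ from what pyWeb writes.

```python
import re

# A reference command: @<name@>
REFERENCE = re.compile(r"@<(.*?)@>")

def tangle(text, chunks):
    """Expand references in text, indenting each expansion to the
    column where its reference appeared."""
    result = []
    position = 0
    for match in REFERENCE.finditer(text):
        result.append(text[position:match.start()])
        so_far = "".join(result)
        # Column of the reference on its own line.
        column = len(so_far) - (so_far.rfind("\n") + 1)
        body = tangle(chunks[match.group(1)], chunks)
        # Every newline inside the expansion is followed by the indent.
        result.append(body.replace("\n", "\n" + " " * column))
        position = match.end()
    result.append(text[position:])
    return "".join(result)

chunks = {"body of the aFunction": '\n"""doc string"""\nreturn a + b\n'}
source = "\ndef aFunction( a, b ):\n    @<body of the aFunction@>\n"
output = tangle(source, chunks)
```

In output, the doc string and the return statement both carry the four-space indent of the reference.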

There are two possible implementations for evaluation of a Python expression in the input.

  1. Create an ExpressionCommand, and append this to the current Chunk. This will allow evaluation during weave processing and during tangle processing. This makes the entire weave (or tangle) context available to the expression, including completed cross reference information.
  2. Evaluate the expression during input parsing, and append the resulting text as a TextCommand to the current Chunk. This provides a common result available to both weave and tangle, but the only context available is the WebReader and the incomplete Web, built up to that point.

In this implementation, we adopt the latter approach, and evaluate expressions immediately. A simple global context is created with the following variables defined.

time
This is the standard time module.
os
This is the standard os module.
theLocation
A tuple with the file name, first line number and last line number for the original expression's location.
theWebReader
The WebReader instance doing the parsing.
thisApplication
The name of the running pyWeb application.
__version__
The version string in the pyWeb application.
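
Immediate evaluation against such a context can be sketched with the built-in eval function. This is a minimal illustration for a current Python 3, not the WebReader's own code: the function evaluate, its parameters, and the conversion of the result with str are assumptions. The variable names in the context are the ones listed above.

```python
import os
import time

def evaluate(expression, location, reader=None,
             application="pyweb.py", version="$Revision$"):
    """Evaluate a Python expression and return the text to insert."""
    context = {
        "time": time,
        "os": os,
        "theLocation": location,    # (file name, first line, last line)
        "theWebReader": reader,     # the parser instance, if any
        "thisApplication": application,
        "__version__": version,
    }
    # The expression sees only the names in context (plus the builtins).
    return str(eval(expression, context))

text = evaluate("os.path.basename(theLocation[0])", ("/tmp/theFile.w", 10, 10))
```

Because eval runs arbitrary code, a .w file containing expressions should be treated like any other program source.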

Running pyWeb to Tangle and Weave

Assuming that you have marked pyweb.py as executable, you do the following.

./pyweb.py file...

This will tangle the @o commands in each file. It will also weave the output, and create file.html.

Command Line Options

Currently, the following command line options are accepted.

-v
Verbose logging. The default is changed by updating the constructor for theLog from Logger(standard) to Logger(verbose).
-s
Silent operation. The default is changed by updating the constructor for theLog from Logger(standard) to Logger(silent).
-c x
Change the command character from @ to x. The default is changed by updating the constructor for theWebReader from WebReader(f,'@') to WebReader(f,'x').
-w weaver
Choose a particular documentation weaver, for instance 'Weaver', 'HTML', 'Latex', or 'HTMLPython'. The default is based on the first few characters of the input file. The default is changed by updating the language determination call in the application main function from l= w.language() to l= HTML().
-t tangler
Choose a particular source file tangler, for instance 'Tangler' or 'TanglerMake'. The default is the make-friendly tangler. The default is changed by updating the constructor for theTangler from TanglerMake() to Tangler().
-xw
Exclude weaving. This does tangling of source program files only.
-xt
Exclude tangling. This does weaving of the document file only.
-pcommand
Permit errors in the given list of commands. The most common version is -pi to permit errors in locating an include file. This is done in the following scenario: pass 1 uses -xw -pi to exclude weaving and permit include-file errors; the tangled program is run to create test results; pass 2 uses -xt to exclude tangling and include the test results.
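
These options fit the standard getopt module, where -xw is the option -x with the attached argument w, and -pi is the option -p with the argument i. The following sketch, written for a current Python 3, shows one way to parse them. It is not the application's own code: parse_options and the keys of the settings dictionary are hypothetical.

```python
import getopt

def parse_options(argv):
    """Parse pyWeb-style options; return (settings, input files)."""
    settings = {
        "verbosity": "standard",
        "command": "@",
        "weaver": None,            # None: decide from the input file
        "tangler": "TanglerMake",  # the make-friendly default
        "weave": True,
        "tangle": True,
        "permit": "",
    }
    options, files = getopt.getopt(argv, "vsc:w:t:x:p:")
    for flag, value in options:
        if flag == "-v":
            settings["verbosity"] = "verbose"
        elif flag == "-s":
            settings["verbosity"] = "silent"
        elif flag == "-c":
            settings["command"] = value
        elif flag == "-w":
            settings["weaver"] = value
        elif flag == "-t":
            settings["tangler"] = value
        elif flag == "-x":
            if value.startswith("w"):
                settings["weave"] = False
            elif value.startswith("t"):
                settings["tangle"] = False
        elif flag == "-p":
            settings["permit"] += value
    return settings, files

settings, files = parse_options(["-xw", "-pi", "theFile.w"])
```

For this argument list, weaving is excluded, errors on the @i command are permitted, and files is ["theFile.w"].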

Restrictions

pyWeb requires Python 2.1 or newer.

Currently, input is not detabbed; Python users generally are discouraged from using tab characters in their files.

Installation

You must have Python 2.1 or newer.

  1. Download and expand pyweb.zip. You will get pyweb.css, pyweb.html, pyweb.pdf, pyweb.py and pyweb.w.
  2. If necessary, chmod +x pyweb.py.
  3. If you like, cp pyweb.py /usr/local/bin/pyweb to make a global command.
  4. Make a bootstrap copy of pyweb.py (I copy it to pyweb1.py). You can run ./pyweb.py pyweb.w to generate the latest and greatest pyweb.py file, as well as this documentation, pyweb.html.

Be sure to save a bootstrap copy of pyweb.py before changing pyweb.w. Should your changes to pyweb.w introduce a bug into pyweb.py, you will need a fall-back version of pyWeb that you can use in place of the one you just damaged.

Acknowledgements

This application is very directly based on (derived from?) work that preceded this, particularly FunnelWeb, noweb and nuweb, the immediate predecessors described in the Introduction.

Also, after using John Skaller's interscript for two large development efforts, I finally understood the feature set I really needed.

TODO:

  1. Use the Decorator pattern to apply code chunk (@d) vs. file chunk (@o) decorations when weaving a named chunk. This should factor out the distinctions between these two weave operations, separating the common parts. This should remove codeBegin, codeEnd, fileBegin and fileEnd and instead make these calls to a subclass of ChunkDecorator (DecorateCode vs. DecorateFile). The decorator is instantiated when the chunk is created, because the decorator refers to the chunk. Weaving and Tangling will call the decorator, which in turn calls the chunk to produce the final output. Further, the NamedDocumentChunk (which is otherwise undecorated) is unified, meaning we really only have one class of named chunk. Indeed, we may only have one class of chunk, once this activity is factored out.
  2. Create an application that merges something like PyFontify module with pyWeb base classes to add Syntax coloring to a Python-specific HTML weaver.
  3. The createUsedBy() method can be done incrementally by accumulating a list of forward references to chunks; as each new chunk is added, any references to the chunk are removed from the forward references list, and a call is made to the Web's setUsage method. References backward to already existing chunks are easily resolved with a simple lookup. The advantage of the incremental resolution is a simplification in the protocol for using a Web instance.
  4. Use the Builder pattern to provide an explicit WebBuilder instance with the WebReader class to build the parse tree. This can be overridden to, for example, do incremental building in one pass.
  5. Note that the Web is a lot like a NamedChunk; this could be factored out. This will create a more proper Composition pattern implementation.

Design Overview

This application breaks the overall problem into the following sub-problems.

  1. Reading and parsing the input.
  2. Building an internal representation of the source Web.
  3. Weaving a document file.
  4. Tangling the desired program source files.

A solution to the reading and parsing problem depends on a convenient tool for breaking up the input stream and a representation for the chunks of input. Input decomposition is done with the Python Splitter pattern. The representation of the source document is done with the Composition pattern.

The Splitter pattern is widely used in text processing, and has a long legacy in a variety of languages and libraries. A Splitter decomposes a string into a sequence of strings using a split pattern. There are many variant implementations. One variant locates only a single occurrence (usually the left-most); this is commonly implemented as a Find or Search string function. Another variant locates occurrences of a specific string or character, and discards the matching string or character.

The variation on Splitter that we use in this application creates each element in the resulting sequence as either (1) an instance of the split regular expression or (2) the text between split patterns. By preserving the actual split text, we can define our splitting pattern with the regular expression '@.'. This will split on any @ followed by a single character. We can then examine the instances of the split RE to locate pyWeb commands.

We could be a tad more specific and use the following as a split pattern: '@[doOifmu|<>(){}[\]]'. This would silently ignore unknown commands, merging them in with the surrounding text. This would leave the '@@' sequences completely alone, allowing us to replace '@@' with '@' in every text chunk.
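
In Python, this variation on Splitter is what re.split does when the pattern is wrapped in a capturing group: the matched text is kept in the result. The following is a minimal sketch for a current Python 3 using the '@.' pattern. The function split_commands is hypothetical and is not the WebReader's own code.

```python
import re

def split_commands(text, command="@"):
    """Split text into alternating plain text and two-character commands.

    With one capturing group, re.split keeps the matched separators:
    even positions hold the text between commands, odd positions hold
    the commands themselves.
    """
    pattern = re.compile("(" + re.escape(command) + ".)")
    return pattern.split(text)

pieces = split_commands("doc @o f.py @{ x @} more")
commands = pieces[1::2]   # the split text: '@o', '@{', '@}'
```

The plain text between commands, including its white space, is preserved unchanged in the even positions of pieces.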

The Composition pattern is used to build up a parse tree of instances of Chunk. This parse tree is contained in the overall Web, which is a sequence of Chunks. Each named chunk may be a sequence of Chunks with a common name.

Each chunk is composed of a sequence of instances of Command. Because of this uniform composition, the several operations (particularly weave and tangle) can be delegated to each Chunk, and in turn, delegated to each Command that composes a Chunk.
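
The delegation can be sketched with a few small classes. The class names Web, Chunk, Command, TextCommand and CodeCommand are the ones used in this document. The method signatures and bodies below are illustrative assumptions for a current Python 3, not the implementation presented later.

```python
class Command:
    """One piece of a chunk; subclasses decide how to weave and tangle."""
    def __init__(self, text):
        self.text = text
    def weave(self, out):
        out.append(self.text)
    def tangle(self, out):
        out.append(self.text)

class TextCommand(Command):
    """Documentation text: woven, but (in this sketch) never tangled."""
    def tangle(self, out):
        pass

class CodeCommand(Command):
    """Program source: tangled as-is, woven inside assumed markup."""
    def weave(self, out):
        out.append("<pre>" + self.text + "</pre>")

class Chunk:
    """A sequence of commands; each operation is delegated to them."""
    def __init__(self, commands):
        self.commands = list(commands)
    def weave(self, out):
        for command in self.commands:
            command.weave(out)
    def tangle(self, out):
        for command in self.commands:
            command.tangle(out)

class Web:
    """A sequence of chunks; each operation is delegated to them."""
    def __init__(self, chunks):
        self.chunks = list(chunks)
    def weave(self):
        out = []
        for chunk in self.chunks:
            chunk.weave(out)
        return "".join(out)
    def tangle(self):
        out = []
        for chunk in self.chunks:
            chunk.tangle(out)
        return "".join(out)

web = Web([Chunk([TextCommand("<p>doc</p>")]),
           Chunk([CodeCommand("import math\n")])])
```

Neither Web nor Chunk needs to know which kind of Command it holds; adding a new command only requires a new subclass.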

The weaving operation depends on the target document markup language. There are several approaches to this problem. One is to use a markup language unique to pyweb, and emit markup in the desired target language. Another is to use a standard markup language and use converters to transform the standard markup to the desired target markup. The problem with the second method is specifying the markup for actual source code elements in the document. These must be emitted in the proper markup language.

Since the application must transform input into a specific markup language, we opt to use the Strategy pattern to encapsulate markup language details. Each alternative markup strategy is then a subclass of Weaver. This simplifies adding additional markup languages without inventing a markup language unique to pyweb. The author uses their preferred markup, and their preferred toolset to convert to other output languages.
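
A Strategy arrangement of this kind can be sketched as follows. The class names Weaver, HTML and LaTex and the method names codeBegin and codeEnd appear elsewhere in this document. Their signatures, the quote method, the weave_code function and the markup emitted are assumptions made for this illustration, written for a current Python 3.

```python
class Weaver:
    """Plain-text strategy: mark the start and end of a named chunk."""
    def codeBegin(self, name, sequence):
        return "%s (%d)=\n" % (name, sequence)
    def codeEnd(self, name, sequence):
        return "%s (%d).\n" % (name, sequence)
    def quote(self, source):
        return source

class HTML(Weaver):
    """HTML strategy: anchor, preformatted block, escaped source."""
    def codeBegin(self, name, sequence):
        return ('<a name="%d"></a><p><em>%s</em> (%d)=</p>\n<pre>'
                % (sequence, name, sequence))
    def codeEnd(self, name, sequence):
        return "</pre>\n<p><em>%s</em> (%d).</p>\n" % (name, sequence)
    def quote(self, source):
        return (source.replace("&", "&amp;")
                      .replace("<", "&lt;")
                      .replace(">", "&gt;"))

class LaTex(Weaver):
    """LaTeX strategy: verbatim environment around the source."""
    def codeBegin(self, name, sequence):
        return "\\textit{%s} (%d)=\n\\begin{verbatim}\n" % (name, sequence)
    def codeEnd(self, name, sequence):
        return "\\end{verbatim}\n"

def weave_code(weaver, name, sequence, source):
    """The caller is the same for every markup language."""
    return (weaver.codeBegin(name, sequence)
            + weaver.quote(source)
            + weaver.codeEnd(name, sequence))

plain = weave_code(Weaver(), "Imports", 2, "import sys\n")
```

Supporting another markup language then means writing one more subclass; weave_code and the rest of the application are unchanged.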

The tangling operation produces output files. In earlier tools, some care was taken to understand the source code context for tangling, and provide a correct indentation. This required a command-line parameter to turn off indentation for languages like Fortran, where indentation is not used. In pyweb, the indent of the actual @< command is used to set the indent of the material that follows. If all @< commands are presented at the left margin, no indentation will be done. This is a helpful simplification, particularly for users of Python, where indentation is significant.

The standard Emitter class handles this basic indentation. A subclass can be created, if necessary, to handle more elaborate indentation rules.

Implementation

The implementation is contained in a file that both defines the base classes and provides an overall main() function. The main() function uses these base classes to weave and tangle the output files.

An additional file provides a more sophisticated implementation, adding features to an HTML weave subclass.

pyWeb Base File

The pyWeb base file is shown below:

"pyweb.py" (1)=

Shell Escape (4) 
DOC String (3) 
CVS Cruft and pyweb generator warning (5) 
Imports (2) 
Base Class Definitions (6) 
Application Class (144) 
Module Initialization of global variables (149) 
Interface Functions (150)(151) 

"pyweb.py" (1).

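The overhead elements are described in separate sub sections: Python Library Imports, Python DOC String, and Other Python Overheads.

The more important elements are described in separate sections: Base Class Definitions, The Application Class, Module Initialization, and Interface Functions.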

Python Library Imports

The sys, os, re, time and getopt library modules are used by this application.

The tempfile and filecmp modules are used by specific subclasses for more specialized purposes.

Imports (2)=


import sys, os, re, time, getopt
import tempfile, filecmp

Imports (2). Used by pyweb.py (1) .

Python DOC String

A Python __doc__ string provides a standard vehicle for documenting the module or the application program. The usual style is to provide a one-sentence summary on the first line. This is followed by more detailed usage information.

DOC String (3)=

"""pyWeb Literate Programming - tangle and weave tool.

Yet another simple literate programming tool derived from nuweb, 
implemented entirely in Python.  
This produces HTML (or LATEX) for any programming language.

Usage:
    pyweb [-vs] [-c x] [-w format] [-t format] file.w

Options:
    -v           verbose output
    -s           silent output
    -c x         change the command character from '@' to x
    -w format    Use the given weaver for the final document.
                 The default is based on the input file, a leading '<'
                 indicates HTML, otherwise Latex.
    -t format    Use the given tangler to produce the output files.
    -xw          Exclude weaving
    -xt          Exclude tangling
    -pi          Permit include-command errors
    
    file.w       The input file, with @o, @d, @i, @[, @{, @|, @<, @f, @m, @u commands.
"""

DOC String (3). Used by pyweb.py (1) .

Other Python Overheads

The shell escape is provided so that the user can mark this file as executable and launch it directly from their shell. The operating system reads the first line of the file; when it finds the '#!' shell escape, the remainder of the line is taken as the path to the binary program that should be run. That binary is then run with the path of this file as an argument.

Shell Escape (4)=


#!/usr/local/bin/python

Shell Escape (4). Used by pyweb.py (1) .

The CVS cruft is a standard way of placing CVS information into a Python module so it is preserved. See PEP (Python Enhancement Proposal) #8 for information on recommended styles.

We also sneak in the "DO NOT EDIT" warning that belongs in all generated application source files.

CVS Cruft and pyweb generator warning (5)=


__version__ = """$Revision$"""

### DO NOT EDIT THIS FILE!
### It was created by ./pyweb.py, __version__='$Revision$'.
### From source pyweb.w modified Tue Jul 23 12:05:28 2002.
### In working directory '/Users/slott/Documents/Books/Python Book/pyWeb'.

CVS Cruft and pyweb generator warning (5). Used by pyweb.py (1) .

Base Class Definitions

There are three major class hierarchies that compose the base of this application. These are families of related classes that express the basic relationships among entities.

Additionally, there are several supporting classes:

Base Class Definitions (6)=



Error class - defines the errors raised (87) 
Logger classes - handle logging of status messages (88) 
Command class hierarchy - used to describe individual commands (74) 
Chunk class hierarchy - used to describe input chunks (52) 
Emitter class hierarchy - used to control output files (7) 
Web class - describes the overall "web" of chunks (92) 
WebReader class - parses the input file, building the Web structure (110) 
Operation class hierarchy - used to describe basic operations of the application (127) 

Base Class Definitions (6). Used by pyweb.py (1) .

Emitters

An Emitter instance is responsible for control of an output file format. This includes the necessary file naming, opening, writing and closing operations. It also includes providing the correct markup for the file type.

There are several subclasses of the Emitter superclass, specialized for various file formats.

Emitter class hierarchy - used to control output files (7)=



Emitter superclass (8) 
Weaver subclass to control documentation production (17) 
LaTex subclass of Weaver (24) 
HTML subclass of Weaver (33) 
Tangler subclass of Emitter (43) 
Tangler subclass which is make-sensitive (47) 

Emitter Factory - used to generate emitter instances from parameter strings (50)(51) 

Emitter class hierarchy - used to control output files (7). Used by Base Class Definitions (6) .

An Emitter instance is created to contain the various details of writing an output file. Emitters are created as follows:

  1. A Web object will create an Emitter to weave the final document.
  2. A Web object will create an Emitter to tangle each file.

Since each Emitter instance is responsible for the details of one file type, different subclasses of Emitter are used when tangling source code files (Tangler) and weaving files that include source code plus markup (Weaver). Further specialization is required when weaving HTML or LaTex.

In the case of tangling, the following algorithm is used:

  1. Visit each output Chunk (@o or @O), doing the following:
    1. Open the Tangler instance using the target file name.
    2. Visit each Chunk directed to the file, calling the chunk's tangle() method.
      1. Call the Tangler's docBegin() method. This sets the Tangler's indents.
      2. Visit each Command, calling the Command's tangle() method. The text of the chunk is written to the Tangler using the codeBlock() method. References to other chunks are tangled using the referenced chunk's tangle() method.
      3. Call the Tangler's docEnd() method. This clears the Tangler's indents.
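
The tangling steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pyWeb implementation: SketchTangler, SketchChunk and tangle_all are hypothetical names, a chunk's contents are simplified to a list of strings and nested chunks instead of Command objects, and indentation handling is left out. The sketch uses Python 3 syntax, whereas the listings in this document use Python 2.

```python
class SketchTangler:
    """Hypothetical stand-in for a Tangler; collects output in memory."""
    def __init__(self):
        self.files = {}      # file name -> list of text fragments
        self.current = None
        self.events = []     # records docBegin/docEnd calls for inspection
    def open(self, name):
        self.current = self.files.setdefault(name, [])
    def close(self):
        self.current = None
    def docBegin(self, chunk):
        self.events.append(("begin", chunk.name))
    def docEnd(self, chunk):
        self.events.append(("end", chunk.name))
    def codeBlock(self, text):
        self.current.append(text)

class SketchChunk:
    """Hypothetical chunk: parts are strings (text) or SketchChunks (references)."""
    def __init__(self, name, parts):
        self.name = name
        self.parts = parts
    def tangle(self, tangler):
        tangler.docBegin(self)
        for part in self.parts:
            if isinstance(part, SketchChunk):
                part.tangle(tangler)      # a reference: tangle the referenced chunk
            else:
                tangler.codeBlock(part)   # plain text of the chunk
        tangler.docEnd(self)

def tangle_all(output_chunks, tangler):
    """Open one output file per output chunk and tangle into it."""
    for chunk in output_chunks:
        tangler.open(chunk.name)
        chunk.tangle(tangler)
        tangler.close()

imports = SketchChunk("Imports", ["import sys\n"])
main = SketchChunk("demo.py",
                   ["#!/usr/bin/env python\n", imports, "print(sys.argv)\n"])
t = SketchTangler()
tangle_all([main], t)
result = "".join(t.files["demo.py"])
```

Here result holds the three lines of demo.py in order, with the text of the referenced Imports chunk spliced in at the point of reference.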

In the case of weaving, the following algorithm is used:

  1. If no Weaver is given, examine the first Command of the first Chunk and create a weaver appropriate for the output format. A leading '<' indicates HTML, otherwise assume Latex.
  2. Open the Weaver instance using the source file name. This name is transformed by the weaver to an output file name appropriate to the language.
  3. Visit each sequential Chunk (anonymous, @d, @o or @O), doing the following:
    1. Visit each Chunk, calling the Chunk's weave() method.
      1. Call the Weaver's docBegin(), fileBegin() or codeBegin() method, depending on the subclass of Chunk. For fileBegin() and codeBegin(), this writes the header for a code chunk in the weaver's markup language. A slightly different decoration is applied by fileBegin() and codeBegin().
      2. Visit each Command, call the Command's weave() method. For ordinary text, the text is written to the Weaver using the codeBlock() method. For references to other chunks, the referenced chunk is woven using the Weaver's referenceTo() method.
      3. Call the Weaver's docEnd(), fileEnd() or codeEnd() method. For fileEnd() or codeEnd(), this writes a trailer for a code chunk in the Weaver's markup language.
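
The weaving steps above can be sketched in the same spirit. This is an illustration only: choose_format, BEGIN_END, RecordingWeaver and weave_all are hypothetical names, chunks are reduced to (kind, text) pairs, and the dispatch on a kind string stands in for the dispatch on the Chunk subclass. Skipping leading whitespace before testing for '<' is an assumption of this sketch. Python 3 syntax is used.

```python
def choose_format(first_text):
    """Pick a weaver name from the first text of the document."""
    # Assumption: leading whitespace is ignored before looking for '<'.
    if first_text.lstrip().startswith("<"):
        return "html"
    return "latex"

# Which begin/end pair is used for each kind of chunk.
BEGIN_END = {
    "doc":  ("docBegin", "docEnd"),      # anonymous chunks
    "code": ("codeBegin", "codeEnd"),    # @d chunks
    "file": ("fileBegin", "fileEnd"),    # @o or @O chunks
}

class RecordingWeaver:
    """Hypothetical weaver that only records which methods were called."""
    def __init__(self):
        self.calls = []
    def __getattr__(self, name):
        # Reached only for attributes not found normally, i.e. the weaver methods.
        def record(*args):
            self.calls.append(name)
        return record

def weave_all(chunks, weaver):
    for kind, text in chunks:
        begin, end = BEGIN_END[kind]
        getattr(weaver, begin)(text)
        weaver.codeBlock(text)
        getattr(weaver, end)(text)

w = RecordingWeaver()
weave_all([("doc", "<html>"), ("file", "import sys\n"), ("code", "x = 1\n")], w)
```

After the call, w.calls lists one begin call, one codeBlock call and one end call per chunk, in document order.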

Emitter Superclass

Usage

The Emitter class is not a concrete class, and is never instantiated. It contains common features factored out of the Weaver and Tangler subclasses.

Inheriting from the Emitter class generally requires overriding one or more of the core methods: doOpen(), doClose() and doWrite(). A subclass of Tangler might override the code-writing methods: codeLine(), codeBlock() or codeFinish().

Design

The Emitter class is an abstract superclass for all emitters. It defines the basic framework used to create and write to an output file. This class follows the Template Method design pattern. This design pattern directs us to factor the basic open(), close() and write() methods into three-step algorithms.

def open( self ):
    # common preparation
    self.doOpen() # overridden by subclasses
    # common finish-up tasks

The common preparation and common finish-up sections are generally internal housekeeping. The doOpen() method is overridden by subclasses to change the basic behavior.
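
A runnable sketch of this three-step structure is shown here. It is not the Emitter class itself: SketchEmitter and UpperCaseNameEmitter are hypothetical names, only open() is modelled, and a log list replaces printing so that the effect can be inspected. Python 3 syntax is used.

```python
class SketchEmitter:
    """Hypothetical, cut-down emitter showing only the three-step open()."""
    def __init__(self):
        self.fileName = ""
        self.linesWritten = 0
        self.log = []
    def open(self, aFile):
        self.fileName = aFile      # common preparation
        self.doOpen(aFile)         # the step that subclasses override
        self.linesWritten = 0      # common finish-up
    def doOpen(self, aFile):
        self.log.append("default open of %r" % aFile)

class UpperCaseNameEmitter(SketchEmitter):
    """Overrides only the variable step; open() itself is inherited."""
    def doOpen(self, aFile):
        self.log.append("custom open of %r" % aFile.upper())

e = UpperCaseNameEmitter()
e.open("demo.py")
```

The subclass never redefines open(), so the housekeeping around the overridden step stays in one place.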

Implementation

The class has the following attributes:

Emitter superclass (8)=



class Emitter:
    """Emit an output file; handling indentation context."""
    def __init__( self ):
        self.fileName= ""
        self.theFile= None
        self.context= [0]
        self.indent= 0
        self.lastIndent= 0
        self.linesWritten= 0
        self.totalFiles= 0
        self.totalLines= 0
    Emitter core open, close and write (9) 
    Emitter write a block of code (13)(14)(15) 
    Emitter indent control: set, clear and reset (16) 

Emitter superclass (8). Used by Emitter class hierarchy - used to control output files (7) .

The core open() method tracks the open files. A subclass overrides the doOpen() method to apply any OS-specific operations required to correctly name the output file, and to open the file. If any specific preamble is required by the output file format, this could be done in the doOpen() override. This kind of feature, however, is discouraged. The point of pyWeb (and its predecessors, noweb and nuweb) is to be very simple, putting complete control in the hands of the author.

The close() method closes the file. If some specific postamble is required, this can be part of a function that overrides doClose().

The write() method is the lowest-level, unadorned write. This does some additional counting as well as moving the characters to the file. Any further processing could be added in a function that overrides doWrite().

The default implementations of doOpen() and doClose() only print a status message and touch no files, making them safe for debugging. The default doWrite() method prints to standard output.

Emitter core open, close and write (9)=



def open( self, aFile ):
    """Open a file."""
    self.fileName= aFile
    self.doOpen( aFile )
    self.linesWritten= 0
Emitter doOpen, to be overridden by subclasses (10) 
def close( self ):
    self.codeFinish()
    self.doClose()
    self.totalFiles += 1
    self.totalLines += self.linesWritten
Emitter doClose, to be overridden by subclasses (11) 
def write( self, text ):
    self.linesWritten += text.count('\n')
    self.doWrite( text )
Emitter doWrite, to be overridden by subclasses (12) 

Emitter core open, close and write (9). Used by Emitter superclass (8) .

The doOpen(), doClose() and doWrite() methods are overridden by the various subclasses to perform the unique operations for each subclass.

Emitter doOpen, to be overridden by subclasses (10)=



def doOpen( self, aFile ):
    self.fileName= aFile
    print "> creating %r" % self.fileName

Emitter doOpen, to be overridden by subclasses (10). Used by Emitter core open, close and write (9) .

Emitter doClose, to be overridden by subclasses (11)=



def doClose( self ):
    print "> wrote %d lines to %s" % ( self.linesWritten, self.fileName )

Emitter doClose, to be overridden by subclasses (11). Used by Emitter core open, close and write (9) .

Emitter doWrite, to be overridden by subclasses (12)=



def doWrite( self, text ):
    print text,

Emitter doWrite, to be overridden by subclasses (12). Used by Emitter core open, close and write (9) .
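
The line and file counting done by open(), write() and close() can be tried in isolation with a sketch like the following. CountingEmitter is a hypothetical name; its doWrite() captures text in a list instead of printing, and the call to codeFinish() made by the real close() is omitted. Python 3 syntax is used.

```python
class CountingEmitter:
    """Python 3 sketch of the open/write/close bookkeeping only."""
    def __init__(self):
        self.fileName = ""
        self.linesWritten = 0
        self.totalFiles = 0
        self.totalLines = 0
        self.captured = []
    def open(self, aFile):
        self.fileName = aFile
        self.linesWritten = 0
    def write(self, text):
        # Only newline characters are counted, so an unterminated
        # last line does not add to the total.
        self.linesWritten += text.count("\n")
        self.doWrite(text)
    def doWrite(self, text):
        self.captured.append(text)    # in memory instead of print
    def close(self):
        self.totalFiles += 1
        self.totalLines += self.linesWritten

e = CountingEmitter()
for name, text in [("a.py", "x = 1\ny = 2\n"), ("b.py", "z = 3")]:
    e.open(name)
    e.write(text)
    e.close()
```

Two files are counted, but only two lines, because the text written to b.py has no trailing newline.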

The codeBlock() method writes several lines of code. It calls the codeLine() method for each line of code after doing the correct indentation. Often, the last line of code is incomplete, so it is left unterminated. This last line of code also shows the indentation for any additional code to be tangled into this section.

Note that tab characters confuse the indent algorithm. Tabs are not expanded to spaces in this application. They should be expanded prior to creating a .w file.

The algorithm is as follows:

  1. Save the topmost value of the context stack as the current indent.
  2. Split the block of text on '\n' boundaries.
  3. For each line (except the last), call codeLine() with the indented text, ending with a newline.
  4. The string split() method will put a trailing zero-length element in the list if the original block ended with a newline. We drop this zero length piece to prevent writing a useless fragment of indent-only after the final '\n'. If the last line has content, call codeLine with the indented text, but do not write a trailing '\n'.
  5. Save the length of the last line as the most recent indent.

Emitter write a block of code (13)=



def codeBlock( self, text ):
    """Indented write of a block of code."""
    self.indent= self.context[-1]
    lines= text.split( '\n' )
    for l in lines[:-1]:
        self.codeLine( '%s%s\n' % (self.indent*' ',l) )
    if lines[-1]:
        self.codeLine( '%s%s' % (self.indent*' ',lines[-1]) )
    self.lastIndent= self.indent+len(lines[-1])

Emitter write a block of code (13). Used by Emitter superclass (8) .
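
To see what the split and the trailing-fragment rule do to a concrete block, the same steps can be written as a standalone function. indent_block is a hypothetical name used only for this sketch; it returns the pieces that would be passed to codeLine() together with the value that would be stored in lastIndent. Python 3 syntax is used.

```python
def indent_block(text, indent):
    """Return (pieces, last_indent) for one block of code."""
    pad = indent * " "
    lines = text.split("\n")
    # Every line except the last was followed by a newline in the input.
    pieces = [pad + line + "\n" for line in lines[:-1]]
    # A non-empty last element is an unterminated line; an empty one
    # only marks that the block ended with a newline and is dropped.
    if lines[-1]:
        pieces.append(pad + lines[-1])
    return pieces, indent + len(lines[-1])

pieces, last = indent_block("if x:\n    y = 1\nz", 2)
pieces2, last2 = indent_block("x = 1\n", 0)
```

In the first call the unterminated line "z" is written without a newline and last is 3 (two columns of indent plus one character). In the second call the block ends with a newline, so no fragment is written and last2 is 0.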

The codeLine() method writes a single line of source code. This is often overridden by weaver subclasses to transform source into a form acceptable by the final weave file format.

In the case of an HTML weaver, the HTML reserved characters (<, >, &, and ") must be replaced in the output of code. However, since the author's original document sections already contain HTML, these sections are not altered.

Emitter write a block of code (14)+=



def codeLine( self, aLine ):
    """Each individual line of code; often overridden by weavers."""
    self.write( aLine )

Emitter write a block of code (14). Used by Emitter superclass (8) .

The codeFinish() method finishes writing any cached lines when the emitter is closed.

Emitter write a block of code (15)+=



def codeFinish( self ):
    if self.lastIndent > 0:
        self.write('\n')

Emitter write a block of code (15). Used by Emitter superclass (8) .

The setIndent() method pushes the last indent on the context stack. This is used when tangling source to be sure that the included text is indented correctly with respect to the surrounding text.

The clrIndent() method discards the most recent indent from the context stack. This is used when finished tangling a source chunk. This restores the indent to the prevailing indent.

The resetIndent() method removes all indent context information.

Emitter indent control: set, clear and reset (16)=



def setIndent( self ):
    self.context.append( self.lastIndent )
def clrIndent( self ):
    self.context.pop()
def resetIndent( self ):
    self.context= [0]

Emitter indent control: set, clear and reset (16). Used by Emitter superclass (8) .
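
The effect of these three methods on the context stack can be followed with a small sketch. IndentContext is a hypothetical name holding only the indent bookkeeping; lastIndent is assigned directly here, as if codeBlock() had just written a partial line of that width. Python 3 syntax is used.

```python
class IndentContext:
    """Hypothetical holder for just the indent bookkeeping."""
    def __init__(self):
        self.context = [0]
        self.lastIndent = 0
    def setIndent(self):
        self.context.append(self.lastIndent)
    def clrIndent(self):
        self.context.pop()
    def resetIndent(self):
        self.context = [0]

ctx = IndentContext()
history = []

ctx.lastIndent = 4            # a reference found at column 4
ctx.setIndent()               # entering the referenced chunk
history.append(list(ctx.context))

ctx.lastIndent = 8            # a nested reference at column 8
ctx.setIndent()
history.append(list(ctx.context))

ctx.clrIndent()               # leaving the nested chunk
history.append(list(ctx.context))

ctx.resetIndent()             # start of a new top-level chunk
history.append(list(ctx.context))
```

Each setIndent() must be paired with one clrIndent(). An unpaired clrIndent() would empty the stack, and the next lookup of context[-1] would then fail.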

Weaver subclass of Emitter

Usage

A Weaver is an Emitter that produces markup in addition to user source document and code. The Weaver class is abstract, and a concrete subclass must provide markup in a specific language.

Design

The Weaver subclass defines an Emitter used to weave the final documentation. This involves decorating source code to make it displayable. It also involves creating references and cross references among the various chunks.

The Weaver class adds several methods to the basic Emitter methods. These additional methods are used exclusively when weaving, never when tangling.

Implementation

Weaver subclass to control documentation production (17)=



class Weaver( Emitter ):
    """Format various types of XRef's and code blocks when weaving."""
    Weaver doOpen, doClose and doWrite overrides (18) 
    # A possible Decorator interface
    Weaver document chunk begin-end (19) 
    Weaver code chunk begin-end (20) 
    Weaver file chunk begin-end (21) 
    Weaver reference command output (22) 
    Weaver cross reference output methods (23) 

Weaver subclass to control documentation production (17). Used by Emitter class hierarchy - used to control output files (7) .

The default for all weavers is to create an HTML file. While not truly universally applicable, it is a common-enough operation that it might be justified in the parent class.

The doClose() method overrides the Emitter class doClose() method by closing the actual file created by doOpen().

The doWrite() method overrides the Emitter class doWrite() method by writing to the actual file created by doOpen().

Weaver doOpen, doClose and doWrite overrides (18)=



def doOpen( self, aFile ):
    src, junk = os.path.splitext( aFile )
    self.fileName= src + '.html'
    self.theFile= open( self.fileName, "w" )
    theLog.event( WeaveStartEvent, "Weaving %r" % self.fileName )
def doClose( self ):
    self.theFile.close()
    theLog.event( WeaveEndEvent, "Wrote %d lines to %r" % 
        (self.linesWritten,self.fileName) )
def doWrite( self, text ):
    self.theFile.write( text )

Weaver doOpen, doClose and doWrite overrides (18). Used by Weaver subclass to control documentation production (17) .
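
The naming rule used by doOpen() can be checked on its own. woven_name is a hypothetical helper written for this sketch; it relies only on os.path.splitext() from the standard library, which splits off the last extension of the final path component. Python 3 syntax is used.

```python
import os.path

def woven_name(source, suffix=".html"):
    """Replace the extension of the source file name with the given suffix."""
    base, _ext = os.path.splitext(source)
    return base + suffix

html_name = woven_name("pyweb.w")
tex_name = woven_name("pyweb.w", ".tex")
dotted = woven_name("docs/my.web.w")
bare = woven_name("README")
```

Only the last extension is replaced, so "docs/my.web.w" becomes "docs/my.web.html", and a name with no extension simply gains the suffix.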

The following functions all form part of an interface that could be removed to a separate class that is a kind of Decorator. Each weaver file format is really another of the possible decorators for woven output. This could separate the basic mechanism of weaving from the file-format issues of latex and HTML.

The docBegin() and docEnd() methods are used when weaving document text. Typically, nothing is done before emitting these kinds of chunks. However, putting a <!--line number--> comment is an example of possible additional processing.

Weaver document chunk begin-end (19)=



def docBegin( self, aChunk ):
    pass
def docEnd( self, aChunk ):
    pass

Weaver document chunk begin-end (19). Used by Weaver subclass to control documentation production (17) .

The codeBegin() method emits the necessary material prior to a chunk of source code, defined with the @d command. A subclass would override this to provide specific text for the intended file type.

The codeEnd() method emits the necessary material subsequent to a chunk of source code, defined with the @d command. The list of references is also provided so that links or cross references to chunks that refer to this chunk can be emitted. A subclass would override this to provide specific text for the intended file type.

Weaver code chunk begin-end (20)=



def codeBegin( self, aChunk ):
    pass
def codeEnd( self, aChunk, references ):
    pass

Weaver code chunk begin-end (20). Used by Weaver subclass to control documentation production (17) .

The fileBegin() method emits the necessary material prior to a chunk of source code, defined with the @o or @O command. A subclass would override this to provide specific text for the intended file type.

The fileEnd() method emits the necessary material subsequent to a chunk of source code, defined with the @o or @O command. The list of references is also provided so that links or cross references to chunks that refer to this chunk can be emitted. A subclass would override this to provide specific text for the intended file type.

Weaver file chunk begin-end (21)=



def fileBegin( self, aChunk ):
    pass
def fileEnd( self, aChunk, references ):
    pass

Weaver file chunk begin-end (21). Used by Weaver subclass to control documentation production (17) .

The referenceTo() method emits a reference to a chunk of source code. The reference is made with a @<...@> reference form within a @d, @o or @O chunk. The referenced chunks are defined with the @d, @o or @O commands. A subclass would override this to provide specific text for the intended file type.

Weaver reference command output (22)=



def referenceTo( self, name, sequence ):
    pass

Weaver reference command output (22). Used by Weaver subclass to control documentation production (17) .

The xrefHead() method puts decoration in front of cross-reference output. A subclass may override this to change the look of the final woven document.

The xrefFoot() method puts decoration after cross-reference output. A subclass may override this to change the look of the final woven document.

The xrefLine() method is used for both file and macro cross-references to show a name (either file name or macro name) and a list of chunks that reference the file or macro.

The xrefDefLine() method is used for the user identifier cross-reference. This shows a name and a list of chunks that reference or define the name. One of the chunks is identified as the defining chunk, all others are referencing chunks.

The default behavior simply writes the Python data structure used to represent cross reference information. A subclass may override this to change the look of the final woven document.

Weaver cross reference output methods (23)=



def xrefHead( self ):
    pass
def xrefFoot( self ):
    pass
def xrefLine( self, name, refList ):
    """File Xref and Macro Xref detail line."""
    self.write( "%s: %r\n" % ( name, refList ) )
def xrefDefLine( self, name, defn, refList ):
    """User ID Xref detail line."""
    self.write( "%s: %s, %r\n" % ( name, defn, refList ) )

Weaver cross reference output methods (23). Used by Weaver subclass to control documentation production (17) .
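
The text produced by the two default detail lines can be previewed with the same format strings. xref_line and xref_def_line are hypothetical standalone versions that return the string instead of writing it; the names and chunk numbers in the calls are made up for illustration. Python 3 syntax is used.

```python
def xref_line(name, refList):
    """File or macro cross-reference detail line, as a string."""
    return "%s: %r\n" % (name, refList)

def xref_def_line(name, defn, refList):
    """User identifier cross-reference detail line, as a string."""
    return "%s: %s, %r\n" % (name, defn, refList)

file_line = xref_line("pyweb.py", [1])
ident_line = xref_def_line("Emitter", 8, [17, 43])
```

The reference list appears in its Python repr() form, which is why a subclass is expected to override these methods for readable output.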

LaTex subclass of Weaver

Usage

An instance of Latex can be used by the Web object to weave an output document. The instance is created outside the Web, and given to the weave() method of the Web.

w= Web()
WebReader(aFile).load( w  )
weave_latex= Latex()
w.weave( weave_latex )

Design

The LaTex subclass defines a Weaver that is customized to produce LaTex output of code sections and cross reference information.

Note that this implementation is incomplete, and possibly incorrect. This is a badly damaged snapshot from the nuweb original source.

Implementation

LaTex subclass of Weaver (24)=



class Latex( Weaver ):
    """Latex formatting for XRef's and code blocks when weaving."""
    LaTex doOpen override, close and write are the same as Weaver (25) 
    LaTex code chunk begin (26) 
    LaTex code chunk end (27) 
    LaTex file output begin (28) 
    LaTex file output end (29) 
    LaTex references summary at the end of a chunk (30) 
    LaTex write a line of code (31) 
    LaTex reference to a chunk (32) 

LaTex subclass of Weaver (24). Used by Emitter class hierarchy - used to control output files (7) .

The LaTex doOpen() method opens a .tex file by replacing the source file's suffix with ".tex" and opening the resulting file.

LaTex doOpen override, close and write are the same as Weaver (25)=



def doOpen( self, aFile ):
    src, junk = os.path.splitext( aFile )
    self.fileName= src + '.tex'
    self.theFile= open( self.fileName, "w" )
    theLog.event( WeaveStartEvent, "Weaving %r" % self.fileName )

LaTex doOpen override, close and write are the same as Weaver (25). Used by LaTex subclass of Weaver (24) .

The LaTex codeBegin() method writes the header prior to a chunk of source code.

LaTex code chunk begin (26)=



def codeBegin( self, aChunk ):
    self.resetIndent()
    self.write("\\begin{flushleft} \\small")
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\begin{minipage}{\\linewidth}")
    self.write( " \\label{scrap%d}" % aChunk.seq )
    self.write( '\\verb@"%s"@~{\\footnotesize ' % aChunk.name )
    self.write( "\\NWtarget{nuweb%s}{%s}$\\equiv$"
        % (aChunk.name,aChunk.seq) )
    self.write( "\\vspace{-1ex}\n\\begin{list}{}{} \\item" )

LaTex code chunk begin (26). Used by LaTex subclass of Weaver (24) .

The LaTex codeEnd() method writes the trailer subsequent to a chunk of source code. This calls the LaTex references() method to write a reference to the chunk that invokes this chunk.

LaTex code chunk end (27)=



def codeEnd( self, aChunk, references ):
    self.write("{\\NWsep}\n\\end{list}")
    self.references( references )
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\end{minipage}\\\\[4ex]")
    self.write("\\end{flushleft}")

LaTex code chunk end (27). Used by LaTex subclass of Weaver (24) .

The LaTex fileBegin() method writes the header prior to the creation of a tangled file.

LaTex file output begin (28)=



def fileBegin( self, aChunk ):
    self.resetIndent()
    self.write("\\begin{flushleft} \\small")
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\begin{minipage}{\\linewidth}")
    self.write( " \\label{scrap%d}" % aChunk.seq )
    self.write( '\\verb@"%s"@~{\\footnotesize ' % aChunk.name )
    self.write( "\\NWtarget{nuweb%s}{%s}$\\equiv$"% (aChunk.name,aChunk.seq) )
    self.write( "\\vspace{-1ex}\n\\begin{list}{}{} \\item" )

LaTex file output begin (28). Used by LaTex subclass of Weaver (24) .

The LaTex fileEnd() method writes the trailer subsequent to a tangled file. This calls the LaTex references() method to write a reference to the chunk that invokes this chunk.

LaTex file output end (29)=



def fileEnd( self, aChunk, references ):
    self.write("{\\NWsep}\n\\end{list}")
    self.references( references )
    if not aChunk.big_definition: # defined with 'O' instead of 'o'
        self.write("\\end{minipage}\\\\[4ex]")
    self.write("\\end{flushleft}")

LaTex file output end (29). Used by LaTex subclass of Weaver (24) .

The references() method writes a list of references after a chunk of code.

LaTex references summary at the end of a chunk (30)=



def references( self, references ):
    if references:
        self.write("\\vspace{-1ex}")
        self.write("\\footnotesize\\addtolength{\\baselineskip}{-1ex}")
        self.write("\\begin{list}{}{\\setlength{\\itemsep}{-\\parsep}")
        self.write("\\setlength{\\itemindent}{-\\leftmargin}}")
        for n,s in references:
            self.write("\\item \\NWtxtFileDefBy\\ %s (%s)" % (n,s) )
        self.write("\\end{list}")
    else:
        self.write("\\vspace{-2ex}")

LaTex references summary at the end of a chunk (30). Used by LaTex subclass of Weaver (24) .

The codeLine() method writes a single line of code to the weaver, providing the necessary LaTex markup.

LaTex write a line of code (31)=



def codeLine( self, aLine ):
    """Each individual line of code with LaTex decoration."""
    self.write( '\\mbox{}\\verb@%s@\\\\\n' % aLine.rstrip() )

LaTex write a line of code (31). Used by LaTex subclass of Weaver (24) .
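
The markup produced for one line can be inspected with a standalone copy of the format string. latex_code_line and breaks_verb are hypothetical helpers for this sketch. The second one is not part of pyWeb; it flags a limitation of the markup itself, since \verb@...@ ends at the first '@' it meets, so a line of code containing '@' cannot be wrapped this way. Python 3 syntax is used.

```python
def latex_code_line(aLine):
    """Return the LaTeX markup for one line of code."""
    return '\\mbox{}\\verb@%s@\\\\\n' % aLine.rstrip()

def breaks_verb(aLine):
    """True when the line contains the '@' used as the \\verb delimiter."""
    return "@" in aLine

marked_up = latex_code_line("x = 1   \n")
```

Trailing whitespace and the newline are stripped before the line is wrapped, and the markup supplies its own line break.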

The referenceTo() method writes a reference to another chunk of code. It uses write() directly so as to follow the current indentation on the current line of code.

LaTex reference to a chunk (32)=



def referenceTo( self, name, sequence ):
    self.write( "\\NWlink{nuweb%s}{%s}$\\equiv$"% (name,sequence) )

LaTex reference to a chunk (32). Used by LaTex subclass of Weaver (24) .

HTML subclass of Weaver

Usage

An instance of HTML can be used by the Web object to weave an output document. The instance is created outside the Web, and given to the weave() method of the Web.

w= Web()
WebReader(aFile).load( w  )
weave_html= HTML()
w.weave( weave_html )

Design

The HTML subclass defines a Weaver that is customized to produce HTML output of code sections and cross reference information.

All HTML chunks are identified by anchor names of the form pywebn. Each n is the unique chunk number, in sequential order.

Implementation

HTML subclass of Weaver (33)=



class HTML( Weaver ):
    """HTML formatting for XRef's and code blocks when weaving."""
    HTML code chunk begin (34) 
    HTML code chunk end (35) 
    HTML output file begin (36) 
    HTML output file end (37) 
    HTML references summary at the end of a chunk (38) 
    HTML write a line of code (39) 
    HTML reference to a chunk (40) 
    HTML simple cross reference markup (41) 

HTML subclass of Weaver (33). Used by Emitter class hierarchy - used to control output files (7) .

The codeBegin() method starts a chunk of code, defined with @d, providing a label and HTML tags necessary to set the code off visually.

HTML code chunk begin (34)=



def codeBegin( self, aChunk ):
    self.resetIndent()
    self.write( '\n<a name="pyweb%s"></a>\n' % ( aChunk.seq ) )
    self.write( '<!--line number %s-->' % (aChunk.lineNumber()) )
    self.write( '<p><em>%s</em> (%s)&nbsp;%s</p>\n' 
        % (aChunk.fullName,aChunk.seq,aChunk.firstSecond) )
    self.write( "<pre><code>\n" )

HTML code chunk begin (34). Used by HTML subclass of Weaver (33) .

The codeEnd() method ends a chunk of code, providing the HTML tags necessary to finish the code block visually. This calls the references() method to write the list of chunks that reference this chunk.

HTML code chunk end (35)=



def codeEnd( self, aChunk, references ):
    self.write( "\n</code></pre>\n" )
    self.write( '<p>&loz; <em>%s</em> (%s).' % (aChunk.fullName,aChunk.seq) )
    self.references( references )
    self.write( "</p>\n" )

HTML code chunk end (35). Used by HTML subclass of Weaver (33) .

The fileBegin() method starts a chunk of code, defined with @o or @O, providing a label and HTML tags necessary to set the code off visually.

HTML output file begin (36)=



def fileBegin( self, aChunk ):
    self.resetIndent()
    self.write( '\n<a name="pyweb%s"></a>\n' % ( aChunk.seq ) )
    self.write( '<!--line number %s-->' % (aChunk.lineNumber()) )
    self.write( '<p><tt>"%s"</tt> (%s)&nbsp;%s</p>\n' 
        % (aChunk.fullName,aChunk.seq,aChunk.firstSecond) )
    self.write( "<pre><code>\n" )

HTML output file begin (36). Used by HTML subclass of Weaver (33) .

The fileEnd() method ends a chunk of code, providing the HTML tags necessary to finish the code block visually. This calls the references() method to write the list of chunks that reference this chunk.

HTML output file end (37)=



def fileEnd( self, aChunk, references ):
    self.write( "\n</code></pre>\n" )
    self.write( '<p>&loz; <tt>"%s"</tt> (%s).' % (aChunk.fullName,aChunk.seq) )
    self.references( references )
    self.write( "</p>\n" )

HTML output file end (37). Used by HTML subclass of Weaver (33) .

The references() method writes the list of chunks that refer to this chunk.

HTML references summary at the end of a chunk (38)=



def references( self, references ):
    if references:
        self.write( "  Used by ")
        for n,s in references:
            self.write( '<a href="#pyweb%s"><em>%s</em> (%s)</a>  ' % ( s,n,s ) )
        self.write( "." )

HTML references summary at the end of a chunk (38). Used by HTML subclass of Weaver (33) .

The codeLine() method writes an individual line of code for HTML purposes. This encodes the four basic HTML entities (<, >, &, ") to prevent code from being interpreted as HTML.

The htmlClean() method does the basic HTML entity replacement. This is factored out of the basic codeLine() method so that subclasses can use this method, also.

HTML write a line of code (39)=



def htmlClean( self, text ):
    """Replace basic HTML entities."""
    clean= text.replace( "&", "&amp;" ).replace( '"', "&quot;" )
    clean= clean.replace( "<", "&lt;" ).replace( ">", "&gt;" )
    return clean
def codeLine( self, aLine ):
    """Each individual line of code with HTML cleanup."""
    self.write( self.htmlClean(aLine) )

HTML write a line of code (39). Used by HTML subclass of Weaver (33) .
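
The order of the replacements matters: "&" has to be handled first, otherwise the ampersands introduced by the other entities are escaped a second time. The sketch below shows this with a standalone copy of the method. html_clean and html_clean_wrong_order are hypothetical names; html.escape() is the Python 3 standard library function, shown only for comparison and not used by pyWeb. Python 3 syntax is used.

```python
import html

def html_clean(text):
    """Replace basic HTML entities, ampersand first."""
    clean = text.replace("&", "&amp;").replace('"', "&quot;")
    clean = clean.replace("<", "&lt;").replace(">", "&gt;")
    return clean

def html_clean_wrong_order(text):
    """Deliberately wrong: the ampersand is replaced last."""
    clean = text.replace("<", "&lt;").replace(">", "&gt;")
    return clean.replace("&", "&amp;")

line = 'if a < b and s == "x&y":'
good = html_clean(line)
bad = html_clean_wrong_order("a < b")
stdlib = html.escape(line)
```

For text without single quotes, html.escape() gives the same result as html_clean(); it additionally escapes the single quote character.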

The referenceTo() method writes a reference to another chunk. It uses the direct write() method so that the reference is indented properly with the surrounding source code.

HTML reference to a chunk (40)=



def referenceTo( self, aName, seq ):
    """Weave a reference to a chunk."""
    # Provide name to get a full reference.
    # Omit name to get a short reference.
    if aName:
        self.write( '<a href="#pyweb%s">&rarr;<em>%s</em> (%s)</a> ' 
            % ( seq, aName, seq ) )
    else:
        self.write( '<a href="#pyweb%s">(%s)</a> ' 
            % ( seq, seq ) )

HTML reference to a chunk (40). Used by HTML subclass of Weaver (33) .

The xrefHead() method writes the heading for any of the cross reference blocks created by @f, @m, or @u. In this implementation, the cross references are simply definition lists.

The xrefFoot() method writes the footing for any of the cross reference blocks created by @f, @m, or @u. In this implementation, the cross references are simply definition lists.

The xrefLine() method writes a line for the file or macro cross reference blocks created by @f or @m. In this implementation, the cross references are simply definition lists.

HTML simple cross reference markup (41)=



def xrefHead( self ):
    self.write( "<dl>\n" )
def xrefFoot( self ):
    self.write( "</dl>\n" )
def xrefLine( self, name, refList ):
    self.write( "<dt>%s:</dt><dd>" % name )
    for r in refList:
        self.write( '<a href="#pyweb%s">%s</a>  ' % (r,r) )
    self.write( "</dd>\n" )
HTML write user id cross reference line (42) 

HTML simple cross reference markup (41). Used by HTML subclass of Weaver (33) .

The xrefDefLine() method writes a line for the user identifier cross reference blocks created by @u. In this implementation, the cross references are simply definition lists. The defining instance is included in the correct order with the other instances, but is bold and marked with a bullet (•).

HTML write user id cross reference line (42)=



def xrefDefLine( self, name, defn, refList ):
    self.write( "<dt>%s:</dt><dd>" % name )
    allPlaces= refList+[defn]
    allPlaces.sort()
    for r in allPlaces:
        if r == defn:
            self.write( '<a href="#pyweb%s"><b>&bull;%s</b></a>  ' 
                % (r,r) )
        else:
            self.write( '<a href="#pyweb%s">%s</a>  ' % (r,r) )
    self.write( "</dd>\n" )

HTML write user id cross reference line (42). Used by HTML simple cross reference markup (41) .
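
As a rough illustration of the markup these methods produce, the following standalone sketch rebuilds the xrefDefLine() logic as a plain function that returns a string instead of calling self.write(). The function name xref_def_line and the sample sequence numbers are hypothetical and are not part of pyWeb.

```python
def xref_def_line(name, defn, ref_list):
    """Return the HTML for one user-identifier cross reference entry."""
    parts = ["<dt>%s:</dt><dd>" % name]
    # Merge the defining chunk into the references, in sequence order.
    for r in sorted(ref_list + [defn]):
        if r == defn:
            # The defining chunk is bold and marked with a bullet.
            parts.append('<a href="#pyweb%s"><b>&bull;%s</b></a>  ' % (r, r))
        else:
            parts.append('<a href="#pyweb%s">%s</a>  ' % (r, r))
    parts.append("</dd>\n")
    return "".join(parts)

# Identifier defined in chunk 43 and used in chunks 7 and 50.
html = xref_def_line("Tangler", 43, [7, 50])
```

The links come out in the order 7, 43, 50, with only the entry for chunk 43 wrapped in bold.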

Tangler subclass of Emitter

Usage

The Tangler class is concrete, and can tangle source files. An instance of Tangler is given to the Web class tangle() method.

w= Web()
WebReader( aFile ).load( w )
t= Tangler()
w.tangle( t )

Design

The Tangler subclass defines an Emitter used to tangle the various program source files. The superclass is used to simply emit correctly indented source code and do very little else that could corrupt or alter the output.

Language-specific subclasses could be used to provide additional decoration, for example by inserting #line directives that show the line number in the original source file.

Implementation

Tangler subclass of Emitter (43)=



class Tangler( Emitter ):
    """Tangle output files."""
    Tangler doOpen, doClose and doWrite overrides (44) 
    Tangler code chunk begin (45) 
    Tangler code chunk end (46) 

Tangler subclass of Emitter (43). Used by Emitter class hierarchy - used to control output files (7) .

This doOpen() method overrides the Emitter class doOpen() method. The default for all tanglers is to create the named file.

This doClose() method overrides the Emitter class doClose() method by closing the actual file created by open.

This doWrite() method overrides the Emitter class doWrite() method by writing to the actual file created by open.

Tangler doOpen, doClose and doWrite overrides (44)=



def doOpen( self, aFile ):
    self.fileName= aFile
    self.theFile= open( aFile, "w" )
    theLog.event( TangleStartEvent, "Tangling %r" % aFile )
def doClose( self ):
    self.theFile.close()
    theLog.event( TangleEndEvent, "Wrote %d lines to %r" 
        % (self.linesWritten,self.fileName) )
def doWrite( self, text ):
    self.theFile.write( text )

Tangler doOpen, doClose and doWrite overrides (44). Used by Tangler subclass of Emitter (43) .

The codeBegin() method starts emitting a new chunk of code. It does this by setting the Tangler's indent to the prevailing indent at the start of the @< reference command.

Tangler code chunk begin (45)=



def codeBegin( self, aChunk ):
    self.setIndent()

Tangler code chunk begin (45). Used by Tangler subclass of Emitter (43) .

The codeEnd() method ends emitting a new chunk of code. It does this by resetting the Tangler's indent to the previous setting.

Tangler code chunk end (46)=



def codeEnd( self, aChunk ):
    self.clrIndent()

Tangler code chunk end (46). Used by Tangler subclass of Emitter (43) .
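
The indent bookkeeping itself lives in the Emitter superclass and is not shown in this section. The following sketch is only one plausible reading of it: a stack of indents, where setIndent() pushes the indent of the line being written and clrIndent() pops it. The class IndentingEmitter and its attributes context, lastIndent and atLineStart are assumptions made for illustration, not the pyWeb implementation.

```python
class IndentingEmitter:
    """Hypothetical stand-in for the Emitter indentation bookkeeping."""
    def __init__(self):
        self.out = []
        self.context = [0]       # stack of indents; the top one is in force
        self.lastIndent = 0      # leading spaces of the line being written
        self.atLineStart = True

    def write(self, text):
        for line in text.splitlines(True):
            if self.atLineStart:
                # Prefix each new line with the indent currently in force.
                self.out.append(" " * self.context[-1])
                self.lastIndent = (self.context[-1]
                                   + len(line) - len(line.lstrip(" ")))
            self.out.append(line)
            self.atLineStart = line.endswith("\n")

    def setIndent(self):
        # Entering a referenced chunk: adopt the prevailing indent.
        self.context.append(self.lastIndent)

    def clrIndent(self):
        # Leaving the referenced chunk: restore the previous indent.
        self.context.pop()

e = IndentingEmitter()
e.write("def f():\n    ")      # text up to an indented @< reference
e.setIndent()                   # what codeBegin() does
e.write("x = 1\ny = 2\n")      # body of the referenced chunk
e.clrIndent()                   # what codeEnd() does
e.write("z = 3\n")
result = "".join(e.out)
```

In this sketch the second line of the referenced chunk picks up the four-space indent that was in force where the reference appeared, and the text after codeEnd() returns to column zero.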

TanglerMake subclass of Tangler

Usage

The TanglerMake class is concrete, and can tangle source files. An instance of TanglerMake is given to the Web class tangle() method.

w= Web()
WebReader( aFile ).load( w )
t= TanglerMake()
w.tangle( t )

Design

The TanglerMake subclass makes the Tangler used to tangle the various program source files more make-friendly. This subclass of Tangler does not touch an output file where there is no change. This is helpful when pyWeb's output is sent to make. Using TanglerMake assures that only files with real changes are rewritten, minimizing recompilation of an application for changes to the associated documentation.

Implementation

Tangler subclass which is make-sensitive (47)=



class TanglerMake( Tangler ):
    """Tangle output files, leaving files untouched if there are no changes."""
    def __init__( self ):
        Tangler.__init__( self )
        self.tempname= None
    TanglerMake doOpen override, using a temporary file (48) 
    TanglerMake doClose override, comparing temporary to original (49) 

Tangler subclass which is make-sensitive (47). Used by Emitter class hierarchy - used to control output files (7) .

A TanglerMake creates a temporary file to collect the tangled output. When this file is completed, we can compare it with the original file in this directory, avoiding a "touch" if the new file is the same as the original.

TanglerMake doOpen override, using a temporary file (48)=



def doOpen( self, aFile ):
    # remember the real target; doClose() compares against and renames to it
    self.fileName= aFile
    self.tempname= tempfile.mktemp()
    self.theFile= open( self.tempname, "w" )
    theLog.event( TangleStartEvent, "Tangling %r" % aFile )

TanglerMake doOpen override, using a temporary file (48). Used by Tangler subclass which is make-sensitive (47) .

If there is a previous file: compare the temporary file and the previous file. If there was no previous file or the files are different: rename temporary to replace previous; else: unlink temporary and discard it. This preserves the original (with the original date and time) if nothing has changed.

TanglerMake doClose override, comparing temporary to original (49)=



def doClose( self ):
    self.theFile.close()
    try:
        same= filecmp.cmp( self.tempname, self.fileName )
    except OSError,e:
        same= 0
    if same:
        theLog.event( SummaryEvent, "No change to %r" % (self.fileName) )
        os.remove( self.tempname )
    else:
        # note that Windows requires the original file to be removed first
        try: 
            os.remove( self.fileName )
        except OSError,e:
            pass
        os.rename( self.tempname, self.fileName )
        theLog.event( TangleEndEvent, "Wrote %d lines to %r" 
            % (self.linesWritten,self.fileName) )

TanglerMake doClose override, comparing temporary to original (49). Used by Tangler subclass which is make-sensitive (47) .
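
The compare-then-replace idea can be sketched on its own, outside the Emitter hierarchy. This is a minimal illustration in current Python, not the pyWeb implementation: the function name write_if_changed is hypothetical, and it uses tempfile.mkstemp() and os.replace() where the chunks above use tempfile.mktemp(), os.remove() and os.rename().

```python
import filecmp
import os
import tempfile

def write_if_changed(target, text):
    """Write text to target only when the content differs.

    Returns True if the target was (re)written, False if left untouched.
    """
    # Create the temporary file beside the target so the final rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(target))
    fd, temp_name = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as temp:
        temp.write(text)
    try:
        same = filecmp.cmp(temp_name, target, shallow=False)
    except OSError:
        same = False  # no previous file to compare against
    if same:
        os.remove(temp_name)  # keep the original and its timestamp
        return False
    os.replace(temp_name, target)  # overwrites an existing target
    return True
```

Because an unchanged target keeps its modification time, a make rule that depends on it would not fire after a documentation-only edit to the .w file.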

Emitter Factory

Usage

We use the Factory Method design pattern to permit extending the Emitter class hierarchy. Any application that imports this basic pyWeb module can define appropriate new subclasses, provide a subclass of this EmitterFactory, and use the existing main program.

import sys
import pyweb

class MyHTMLWeaver( pyweb.HTML ):
    ... (overrides to various methods) ...
    
class MyEmitterFactory( pyweb.EmitterFactory ):
    def mkEmitter( self, name ):
        """Make an Emitter - try superclass first, then locally defined."""
        s= pyweb.EmitterFactory.mkEmitter( self, name )
        if s: return s
        if name.lower() == 'myhtmlweaver': return MyHTMLWeaver()
        return None

if __name__ == "__main__":
    pyweb.main( MyEmitterFactory(), sys.argv ) 

Design

We use a Chain of Responsibility-like design for the mkEmitter() method. A subclass first uses the parent class mkEmitter() to see if the name is recognized. If it is not, then the subclass can match the added class names against the argument.

Implementation

To emphasize the implementation, we provide an EmitterFactorySuper superclass that creates only the basic Weaver and Tangler emitters. We subclass this to create a more useful EmitterFactory that creates any of the emitters in this base pyWeb module.

The EmitterFactorySuper is a superclass that only recognizes the basic Weaver and Tangler emitters. This must be subclassed to recognize the more useful emitters.

Emitter Factory - used to generate emitter instances from parameter strings (50)=



class EmitterFactorySuper:
    def mkEmitter( self, name ):
        if name.lower() == 'weaver': return Weaver()
        elif name.lower() == 'tangler': return Tangler()
        return None

Emitter Factory - used to generate emitter instances from parameter strings (50). Used by Emitter class hierarchy - used to control output files (7) .

The EmitterFactory class is a subclass of EmitterFactorySuper that recognizes all of the various emitters defined in this module. It also shows how a subclass would be constructed.

Emitter Factory - used to generate emitter instances from parameter strings (51)+=



class EmitterFactory( EmitterFactorySuper ):
    def mkEmitter( self, name ):
        """Make an Emitter - try superclass first, then locally defined."""
        s= EmitterFactorySuper.mkEmitter( self, name )
        if s: return s
        if name.lower() == 'html': return HTML()
        elif name.lower() == 'latex': return Latex()
        elif name.lower() == 'tanglermake': return TanglerMake()
        return None

Emitter Factory - used to generate emitter instances from parameter strings (51). Used by Emitter class hierarchy - used to control output files (7) .
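
The lookup chain can be exercised in isolation. In the sketch below the emitter classes are empty stand-ins for the real ones, the middle factory is abbreviated to two names, and MyHTMLWeaver and MyEmitterFactory are the hypothetical extension classes from the usage example.

```python
# Empty stand-ins for the real emitter classes.
class Weaver: pass
class HTML(Weaver): pass
class Latex(Weaver): pass
class Tangler: pass
class TanglerMake(Tangler): pass

class EmitterFactorySuper:
    def mkEmitter(self, name):
        if name.lower() == 'weaver': return Weaver()
        elif name.lower() == 'tangler': return Tangler()
        return None

class EmitterFactory(EmitterFactorySuper):
    def mkEmitter(self, name):
        # Ask the parent first, then try the names added at this level.
        s = EmitterFactorySuper.mkEmitter(self, name)
        if s: return s
        if name.lower() == 'latex': return Latex()
        elif name.lower() == 'tanglermake': return TanglerMake()
        return None

class MyHTMLWeaver(HTML): pass

class MyEmitterFactory(EmitterFactory):
    def mkEmitter(self, name):
        s = EmitterFactory.mkEmitter(self, name)
        if s: return s
        if name.lower() == 'myhtmlweaver': return MyHTMLWeaver()
        return None

factory = MyEmitterFactory()
```

A request for 'Tangler' is answered two levels up the chain, 'MyHTMLWeaver' is answered locally, and an unrecognized name falls through every level and yields None.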

Chunks

A Chunk is a piece of the input file. It is a collection of Command instances. A chunk can be woven or tangled to create output.

The two most important methods are the weave() and tangle() methods. These visit the commands of this chunk, producing the required output file.

Additional methods (startswith(), searchForRE() and usedBy()) are used to examine the text of the Command instances within the chunk.

A Chunk instance is created by the WebReader as the input file is parsed. Each Chunk instance has one or more pieces of the original input text. This text can be program source, a reference command, or the documentation source.

Chunk class hierarchy - used to describe input chunks (52)=



Chunk class (53) 
NamedChunk class (63) 
OutputChunk class (68) 
NamedDocumentChunk class (71) 

Chunk class hierarchy - used to describe input chunks (52). Used by Base Class Definitions (6) .

The Chunk class is both the superclass for this hierarchy and the implementation for anonymous chunks. An anonymous chunk is always documentation in the target markup language. No transformation is ever done on anonymous chunks.

A NamedChunk is a chunk created with a @d command. This is a chunk of source programming language, bracketed with @{ and @}.

An OutputChunk is a named chunk created with a @o or @O command. This must be a chunk of source programming language, bracketed with @{ and @}.

A NamedDocumentChunk is a named chunk created with a @d command. This is a chunk of documentation in the target markup language, bracketed with @[ and @].

Chunk Superclass

Usage

An instance of the Chunk class has a life that includes four important events: creation, cross-reference, weave and tangle.

A Chunk is created by a WebReader, and associated with a Web. There are several web append methods, depending on the exact subclass of Chunk. The WebReader calls the chunk's webAdd() method to select the correct method for appending and indexing the chunk. Individual instances of Command are appended to the chunk. The basic outline for creating a Chunk instance is as follows:

w= Web()
c= Chunk()
c.webAdd( w )
c.append( ...some Command... )
c.append( ...some Command... )

Before weaving or tangling, a cross reference is created for all user identifiers in all of the Chunk instances. This is done by: (1) visit each Chunk and call the getUserIDRefs() method to gather all identifiers; (2) for each identifier, visit each Chunk and call the searchForRE() method to find uses of the identifier.

ident= []
for c in the Web's named chunk list:
    ident.extend( c.getUserIDRefs() )
for i in ident:
    pattern= re.compile('\W%s\W' % i)
    for c in the Web's named chunk list:
        c.searchForRE( pattern, self )
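
The outline above can be approximated with runnable code. In this sketch the Web's named chunk list is modelled as a plain dictionary, and the names chunks and user_id_xref are hypothetical. The use of re.escape() is an addition; the outline interpolates the identifier directly.

```python
import re

# Hypothetical stand-in for the Web's named chunk list:
# sequence number -> (identifiers given after @|, chunk source text)
chunks = {
    1: (["Emitter"], "class Emitter:\n    pass\n"),
    2: (["Tangler"], "class Tangler( Emitter ):\n    pass\n"),
    3: ([], "t= Tangler()\n"),
}

def user_id_xref(chunks):
    """Map each identifier to (defining chunk, [chunks that mention it])."""
    xref = {}
    for seq, (idents, _) in chunks.items():
        for ident in idents:
            pattern = re.compile(r'\W%s\W' % re.escape(ident))
            uses = [s for s, (_, text) in sorted(chunks.items())
                    if pattern.search(text)]
            xref[ident] = (seq, uses)
    return xref

xref = user_id_xref(chunks)
```

Note that the pattern requires a non-word character on both sides of the identifier, so an identifier that is the very first or very last text of a chunk is not matched.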

A Chunk is woven or tangled by the Web. The basic outline for weaving is as follows. The tangling operation is essentially the same.

for c in the Web's chunk list:
    c.weave( self, aWeaver )

Design

The Chunk class contains the overall definitions for all of the various specialized subclasses. In particular, it contains the append(), appendChar() and appendText() methods used by all of the various Chunk subclasses.

When a @@ construct is located in the input stream, the stream contains three text tokens: material before the @@, the @@, and the material after the @@. These three tokens are reassembled into a single block of text. This reassembly is accomplished by changing the chunk's state so that the next TextCommand is appended onto the previous TextCommand.

There are two operating states for instances of this class. The state change is accomplished on a call to the appendChar() method, and alters the behavior of the appendText() method. The appendText() method either appends a new TextCommand to the chunk, or concatenates the text onto the most recent TextCommand.

Each subclass of Chunk has a particular type of text that it will process. Anonymous chunks only handle document text. The NamedChunk subclass that handles program source will override this method to create a different command type. The makeContent() method creates the appropriate Command instance for this Chunk subclass.

The weave() method of an anonymous Chunk uses the weaver's docBegin() and docEnd() methods to insert text that is source markup. Other subclasses will override this to use different Weaver methods for different kinds of text.

Implementation

The Chunk constructor initializes the following instance variables: commands, lastCommand, big_definition, xref, firstSecond, name and seq.

Chunk class (53)=



class Chunk:
    """Anonymous piece of input file: will be output through the weaver only."""
    # construction and insertion into the web
    def __init__( self ):
        self.commands= [ ]
        self.lastCommand= None
        self.big_definition= None
        self.xref= None
        self.firstSecond= None
        self.name= ''
        self.seq= None
    def __str__( self ):
        return "\n".join( map( str, self.commands ) )
    Chunk append a command (54) 
    Chunk append a character (55) 
    Chunk append text (56) 
    Chunk add to the web (57) 
    def makeContent( self, text, lineNumber=0 ):
        return TextCommand( text, lineNumber )
    Chunk examination: starts with, matches pattern, references (58) 
    Chunk weave (61) 
    Chunk tangle (62) 

Chunk class (53). Used by Chunk class hierarchy - used to describe input chunks (52) .

The append() method simply appends a Command instance to this chunk.

Chunk append a command (54)=



def append( self, command ):
    """Add another Command to this chunk."""
    self.commands.append( command )

Chunk append a command (54). Used by Chunk class (53) .

When an @@ construct is located, the appendChar() method:

  1. accumulates the @ character at the end of the previous TextCommand,
  2. and changes the state of the chunk so that the next TextCommand is concatenated, also.

Chunk append a character (55)=



def appendChar( self, text, lineNumber=0 ):
    """Append a single character to the most recent TextCommand."""
    if len(self.commands)==0 or not isinstance(self.commands[-1],TextCommand):
        self.commands.append( self.makeContent("",lineNumber) )
    self.commands[-1].text += text
    self.lastCommand= self.commands[-1]

Chunk append a character (55). Used by Chunk class (53) .

The appendText() method appends a TextCommand to this chunk, or it appends it to the most recent TextCommand. This condition is defined by the appendChar() method.

Chunk append text (56)=



def appendText( self, text, lineNumber=0 ):
    """Add another TextCommand to this chunk or concatenate to the most recent TextCommand."""
    if self.lastCommand:
        assert len(self.commands)>=1 and isinstance(self.commands[-1],TextCommand)
        self.commands[-1].text += text
        self.lastCommand= None
    else:
        self.commands.append( self.makeContent(text,lineNumber) )

Chunk append text (56). Used by Chunk class (53) .
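
The two-state reassembly can be seen with a trimmed-down copy of these methods. The classes below keep only the attributes needed for the demonstration, and the sample input is made up.

```python
class TextCommand:
    def __init__(self, text, lineNumber=0):
        self.text = text
        self.lineNumber = lineNumber

class Chunk:
    def __init__(self):
        self.commands = []
        self.lastCommand = None
    def makeContent(self, text, lineNumber=0):
        return TextCommand(text, lineNumber)
    def appendChar(self, text, lineNumber=0):
        if len(self.commands) == 0 or not isinstance(self.commands[-1], TextCommand):
            self.commands.append(self.makeContent("", lineNumber))
        self.commands[-1].text += text
        self.lastCommand = self.commands[-1]   # enter "concatenate next" state
    def appendText(self, text, lineNumber=0):
        if self.lastCommand:
            self.commands[-1].text += text
            self.lastCommand = None            # back to the normal state
        else:
            self.commands.append(self.makeContent(text, lineNumber))

# The input "mail to user@@example.com" arrives as three tokens.
c = Chunk()
c.appendText("mail to user", 1)
c.appendChar("@", 1)
c.appendText("example.com", 1)
c.appendText("next block", 2)
texts = [cmd.text for cmd in c.commands]
```

The three tokens around the @@ end up in a single TextCommand, while the following text starts a new one.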

The webAdd() method adds this chunk to the given document web. Each subclass of the Chunk class must override this to be sure that the various Chunk subclasses are indexed properly. The Chunk class uses the add() method of the Web class to append an anonymous, unindexed chunk.

Chunk add to the web (57)=



def webAdd( self, web ):
    """Add self to a Web as anonymous chunk."""
    web.add( self )

Chunk add to the web (57). Used by Chunk class (53) .

The startswith() method examines the first Command instance in this Chunk instance to see if it starts with the given prefix string.

The lineNumber() method returns the line number of the first Command in this chunk. This provides some context for where the chunk occurs in the original input file.

A NamedChunk instance may define one or more identifiers. This parent class provides a dummy version of the getUserIDRefs method. The NamedChunk subclass overrides this to provide actual results. By providing this at the superclass level, the Web can easily gather identifiers without knowing the actual subclass of Chunk.

The searchForRE() method examines each Command instance to see if it matches with the given regular expression. If so, this can be reported to the Web instance and accumulated as part of a cross reference for this Chunk.

The usedBy() method visits each Command instance; a Command instance calls the Web class setUsage() method to report the references from this Chunk to other Chunks. This set of references can be reversed to identify the chunks that refer to this chunk.

Chunk examination: starts with, matches pattern, references (58)=



def startswith( self, prefix ):
    """Examine the first command's starting text."""
    return len(self.commands) >= 1 and self.commands[0].startswith( prefix )
def searchForRE( self, rePat, aWeb ):
    """Visit each command, applying the pattern."""
    Chunk search for user identifiers done by iteration through each command (59) 
def usedBy( self, aWeb ):
    """Update web's used-by xref."""
    Chunk usedBy update done by iteration through each command (60) 
def lineNumber( self ):
    """Return the first command's line number or None."""
    return len(self.commands) >= 1 and self.commands[0].lineNumber
def getUserIDRefs( self ):
    return []

Chunk examination: starts with, matches pattern, references (58). Used by Chunk class (53) .

The chunk search in the searchForRE() method parallels weaving and tangling a Chunk. The operation is delegated to each Command instance within the Chunk instance.

Chunk search for user identifiers done by iteration through each command (59)=



for c in self.commands:
    if c.searchForRE( rePat, aWeb ):
        return self
return None

Chunk search for user identifiers done by iteration through each command (59). Used by Chunk examination: starts with, matches pattern, references (58) .

The usedBy() update visits each Command instance. It calls the Command class usedBy() method, passing in the overall Web instance and this Chunk instance. This allows the Command to generate a reference from this Chunk to another Chunk, and notify the Web instance of this reference. The Command, if it is a ReferenceCommand, will also update the Chunk instance refCount attribute.

Note that an exception may be raised by this operation if a referenced Chunk does not actually exist. If a reference Command does raise an error, we append this Chunk information and reraise the error with the additional context information.

Chunk usedBy update done by iteration through each command (60)=



try:
    for t in self.commands:
        t.usedBy( aWeb, self )
except Error,e:
    raise Error,e.args+(self,)

Chunk usedBy update done by iteration through each command (60). Used by Chunk examination: starts with, matches pattern, references (58) .
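
The effect of the reraise is that the exception's args tuple grows by one element for each Chunk it passes through. The sketch below shows this in Python 3 syntax. The Error class is a stand-in for the pyWeb exception, and the resolve() function, its message text and the chunk names are hypothetical.

```python
class Error(Exception):
    """Stand-in for the pyWeb Error exception class."""

def resolve(name):
    # Stand-in for a ReferenceCommand that cannot find its target chunk.
    raise Error("Attempt to reference undefined chunk", name)

def used_by(chunk_name, references):
    try:
        for r in references:
            resolve(r)
    except Error as e:
        # Python 3 spelling of: raise Error,e.args+(self,)
        raise Error(*(e.args + (chunk_name,)))

try:
    used_by("Chunk class (53)", ["No such chunk"])
except Error as e:
    context = e.args
```

Whoever finally reports the error can then show both the missing name and the chunk that contained the bad reference.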

The weave() method weaves this chunk into the final document as follows:

  1. call the Weaver class docBegin() method. This method does nothing for document content.
  2. visit each Command instance: call the Command instance weave() method to emit the content of the Command instance
  3. call the Weaver class docEnd() method. This method does nothing for document content.

Note that an exception may be raised by this operation if a referenced Chunk does not actually exist. If a reference Command does raise an error, we append this Chunk information and reraise the error with the additional context information.

Chunk weave (61)=



def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from an anonymous chunk."""
    aWeaver.docBegin( self )
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error, e:
        raise Error,e.args+(self,)
    aWeaver.docEnd( self )
def weaveReferenceTo( self, aWeb, aWeaver ):
    """Create a reference to this chunk -- except for anonymous chunks."""
    raise Exception( "Cannot reference an anonymous chunk." )
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """Create a short reference to this chunk -- except for anonymous chunks."""
    raise Exception( "Cannot reference an anonymous chunk." )

Chunk weave (61). Used by Chunk class (53) .

Anonymous chunks cannot be tangled. Any attempt indicates a serious problem with this program or the input file.

Chunk tangle (62)=



def tangle( self, aWeb, aTangler ):
    """Create source code -- except anonymous chunks should not be tangled"""
    raise Error( 'Cannot tangle an anonymous chunk', self )

Chunk tangle (62). Used by Chunk class (53) .

NamedChunk class

Usage

A NamedChunk is created and used almost identically to an anonymous Chunk. The most significant difference is that a name is provided when the NamedChunk is created. This name is used by the Web to organize the chunks.

Design

A NamedChunk is created with a @d, @o or @O command. A NamedChunk contains programming language source when the brackets are @{ and @}. A separate subclass of NamedDocumentChunk is used when the brackets are @[ and @].

A NamedChunk can be both tangled into the output program files, and woven into the output document file.

The weave() method of a NamedChunk uses the Weaver's codeBegin() and codeEnd() methods to insert text that is program source and requires additional markup to make it stand out from documentation. Other subclasses can override this to use different Weaver methods for different kinds of text.

Implementation

This class introduces some additional attributes.

NamedChunk class (63)=



class NamedChunk( Chunk ):
    """Named piece of input file: will be output through both the tangler and the weaver."""
    def __init__( self, name ):
        Chunk.__init__( self )
        self.name= name
        self.seq= None
        self.fullName= None
        self.xref= []
        self.refCount= 0
    def __str__( self ):
        return "%r: %s" % ( self.name, Chunk.__str__(self) )
    def makeContent( self, text, lineNumber=0 ):
        return CodeCommand( text, lineNumber )
    NamedChunk user identifiers set and get (64) 
    NamedChunk add to the web (65) 
    NamedChunk weave (66) 
    NamedChunk tangle (67) 

NamedChunk class (63). Used by Chunk class hierarchy - used to describe input chunks (52) .

The setUserIDRefs() method accepts a list of user identifiers that are associated with this chunk. These are provided after the @| separator in a @d named chunk. These are used by the @u cross reference generator.

NamedChunk user identifiers set and get (64)=



def setUserIDRefs( self, text ):
    """Save xref variable names."""
    self.xref= text.split()
def getUserIDRefs( self ):
    return self.xref

NamedChunk user identifiers set and get (64). Used by NamedChunk class (63) .

The webAdd() method adds this chunk to the given document Web instance. Each class of Chunk must override this to be sure that the various Chunk classes are indexed properly. This class uses the addNamed() method of the Web class to append a named chunk.

NamedChunk add to the web (65)=



def webAdd( self, web ):
    """Add self to a Web as named chunk, update xrefs."""
    web.addNamed( self )

NamedChunk add to the web (65). Used by NamedChunk class (63) .

The weave() method weaves this chunk into the final document as follows:

  1. call the Weaver class codeBegin() method. This method emits the necessary markup for code appearing in the woven output.
  2. visit each Command, calling the command's weave() method to emit the command's content
  3. call the Weaver class codeEnd() method. This method emits the necessary markup for code appearing in the woven output.

The weaveReferenceTo() method weaves a reference to a chunk using both name and sequence number. The weaveShortReferenceTo() method weaves a reference to a chunk using only the sequence number. These references are created by ReferenceCommand instances within a chunk being woven.

If a ReferenceCommand does raise an error during weaving, we append this Chunk information and reraise the error with the additional context information.

NamedChunk weave (66)=



def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from a chunk of code."""
    # format as <pre> in a different-colored box
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.codeBegin( self )
    for t in self.commands:
        try:
            t.weave( aWeb, aWeaver )
        except Error,e:
            raise Error,e.args+(self,)
    aWeaver.codeEnd( self, aWeb.chunkReferencedBy( self.seq ) )
def weaveReferenceTo( self, aWeb, aWeaver ):
    """Create a reference to this chunk."""
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.referenceTo( self.fullName, self.seq )
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """Create a shortened reference to this chunk."""
    aWeaver.referenceTo( None, self.seq )

NamedChunk weave (66). Used by NamedChunk class (63) .

The tangle() method tangles this chunk into the final document as follows:

  1. call the Tangler class codeBegin() method to set indents properly.
  2. visit each Command, calling the Command's tangle() method to emit the Command's content
  3. call the Tangler class codeEnd() method to restore indents.

If a ReferenceCommand does raise an error during tangling, we append this Chunk information and reraise the error with the additional context information.

NamedChunk tangle (67)=



def tangle( self, aWeb, aTangler ):
    """Create source code."""
    # use aWeb to resolve @<namedChunk@>
    # format as correctly indented source text
    aTangler.codeBegin( self )
    for t in self.commands:
        try:
            t.tangle( aWeb, aTangler )
        except Error,e:
            raise Error,e.args+(self,)
    aTangler.codeEnd( self )

NamedChunk tangle (67). Used by NamedChunk class (63) .

OutputChunk class

Usage

An OutputChunk is created and used identically to a NamedChunk. The difference between this class and the parent class is the decoration of the markup when weaving.

Design

The OutputChunk class is a subclass of NamedChunk that handles file output chunks defined with @o or @O. These are woven slightly differently, to allow for a presentation of the file chunks that is different from the presentation of the other named chunks.

The weave() method of an OutputChunk uses the Weaver's fileBegin() and fileEnd() methods to insert text that is program source and requires additional markup to make it stand out from documentation. Other subclasses could override this to use different Weaver methods for different kinds of text.

All other methods, including the tangle() method, are identical to NamedChunk.

Implementation

OutputChunk class (68)=



class OutputChunk( NamedChunk ):
    """Named piece of input file, defines an output tangle."""
    OutputChunk add to the web (69) 
    OutputChunk weave (70) 

OutputChunk class (68). Used by Chunk class hierarchy - used to describe input chunks (52) .

The webAdd() method adds this chunk to the given document Web. Each class of Chunk must override this to be sure that the various Chunk classes are indexed properly. This class uses the addOutput() method of the Web class to append a file output chunk.

OutputChunk add to the web (69)=



def webAdd( self, web ):
    """Add self to a Web as output chunk, update xrefs."""
    web.addOutput( self )

OutputChunk add to the web (69). Used by OutputChunk class (68) .
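
The three webAdd() overrides let the caller add any chunk without testing its class. The following sketch shows the dispatch with a stand-in Web that only records which append method was called. The log attribute and the sample chunk names are hypothetical.

```python
class Web:
    """Stand-in Web that only records which append method was used."""
    def __init__(self):
        self.log = []
    def add(self, chunk):
        self.log.append(("add", chunk.name))
    def addNamed(self, chunk):
        self.log.append(("addNamed", chunk.name))
    def addOutput(self, chunk):
        self.log.append(("addOutput", chunk.name))

class Chunk:
    name = ''
    def webAdd(self, web):
        web.add(self)

class NamedChunk(Chunk):
    def __init__(self, name):
        self.name = name
    def webAdd(self, web):
        web.addNamed(self)

class OutputChunk(NamedChunk):
    def webAdd(self, web):
        web.addOutput(self)

w = Web()
for c in (Chunk(), NamedChunk("imports"), OutputChunk("pyweb.py")):
    c.webAdd(w)   # the caller never inspects the chunk's class
```

Each chunk routes itself to the matching Web method, which is what allows the WebReader to treat all chunks uniformly.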

The weave() method weaves this chunk into the final document as follows:

  1. call the Weaver class fileBegin() method to emit proper markup for an output file chunk.
  2. visit each Command, calling the Command's weave() method to emit the Command's content
  3. call the Weaver class fileEnd() method to emit proper markup for an output file chunk.

Tangling is inherited unchanged from NamedChunk.

If a ReferenceCommand does raise an error during weaving, we append this Chunk information and reraise the error with the additional context information.

OutputChunk weave (70)=



def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted document from a chunk of code."""
    # format as <pre> in a different-colored box
    self.fullName= aWeb.fullNameFor( self.name )
    aWeaver.fileBegin( self )
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error,e:
        raise Error,e.args+(self,)
    aWeaver.fileEnd( self, aWeb.chunkReferencedBy( self.seq ) )

OutputChunk weave (70). Used by OutputChunk class (68) .

NamedDocumentChunk class

Usage

A NamedDocumentChunk is created and used identically to a NamedChunk. The difference between this class and the parent class is that this chunk is only woven when referenced. The original definition is silently skipped.

Design

The NamedDocumentChunk class is a subclass of NamedChunk that handles named chunks defined with @d and the @[...@] delimiters. These are woven slightly differently, since they are document source, not programming language source.

We're not as interested in the cross reference of named document chunks. They can be used multiple times or never. They are often referenced by anonymous chunks. While this chunk subclass participates in this data gathering, it is ignored for reporting purposes.

All other methods are identical to NamedChunk, except the tangle() method, which raises an error because document chunks are never tangled.

Implementation

NamedDocumentChunk class (71)=



class NamedDocumentChunk( NamedChunk ):
    """Named piece of input file with document source, woven only where referenced."""
    def makeContent( self, text, lineNumber=0 ):
        return TextCommand( text, lineNumber )
    NamedDocumentChunk weave (72) 
    NamedDocumentChunk tangle (73) 

NamedDocumentChunk class (71). Used by Chunk class hierarchy - used to describe input chunks (52) .

The weave() method quietly ignores this chunk in the document. A named document chunk is only included when it is referenced during weaving of another chunk (usually an anonymous document chunk).

The weaveReferenceTo() method inserts the content of this chunk into the output document. This is done in response to a ReferenceCommand in another chunk. The weaveShortReferenceTo() method calls the weaveReferenceTo() to insert the entire chunk.

NamedDocumentChunk weave (72)=



def weave( self, aWeb, aWeaver ):
    """Ignore this when producing the document."""
    pass
def weaveReferenceTo( self, aWeb, aWeaver ):
    """On a reference to this chunk, expand the body in place."""
    try:
        for t in self.commands:
            t.weave( aWeb, aWeaver )
    except Error,e:
        raise Error,e.args+(self,)
def weaveShortReferenceTo( self, aWeb, aWeaver ):
    """On a reference to this chunk, expand the body in place."""
    self.weaveReferenceTo( aWeb, aWeaver )

NamedDocumentChunk weave (72). Used by NamedDocumentChunk class (71) .
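
A small sketch can show the skip-then-expand behavior. It rests on several assumptions: the Web is modelled as a dictionary from chunk name to chunk, the weaver is a plain list that collects text, and the ReferenceCommand stand-in simply calls weaveReferenceTo() on its target. None of these stand-ins is the pyWeb implementation.

```python
class TextCommand:
    def __init__(self, text):
        self.text = text
    def weave(self, aWeb, aWeaver):
        aWeaver.append(self.text)

class ReferenceCommand:
    def __init__(self, refTo):
        self.refTo = refTo
    def weave(self, aWeb, aWeaver):
        # Assumed behavior: delegate to the referenced chunk.
        aWeb[self.refTo].weaveReferenceTo(aWeb, aWeaver)

class Chunk:
    def __init__(self, *commands):
        self.commands = list(commands)
    def weave(self, aWeb, aWeaver):
        for t in self.commands:
            t.weave(aWeb, aWeaver)

class NamedDocumentChunk(Chunk):
    def weave(self, aWeb, aWeaver):
        pass  # the definition itself produces no output
    def weaveReferenceTo(self, aWeb, aWeaver):
        for t in self.commands:
            t.weave(aWeb, aWeaver)

web = {"notice": NamedDocumentChunk(TextCommand("Copyright notice. "))}
doc = [
    web["notice"],
    Chunk(TextCommand("Intro. "), ReferenceCommand("notice"), TextCommand("Body.")),
]
out = []
for c in doc:
    c.weave(web, out)
woven = "".join(out)
```

The definition at the head of the document contributes nothing, and its text appears only at the point of reference inside the anonymous chunk.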

NamedDocumentChunk tangle (73)=



def tangle( self, aWeb, aTangler ):
    """Raise an exception on an attempt to tangle."""
    raise Error( "Cannot tangle a chunk defined with @[." )

NamedDocumentChunk tangle (73). Used by NamedDocumentChunk class (71) .

Commands

The input stream is broken into individual commands, based on the various @x strings in the file. There are several subclasses of Command, each used to describe a different command or block of text in the input.

All instances of the Command class are created by a WebReader instance. In this case, a WebReader can be thought of as a factory for Command instances. Each Command instance is appended to the sequence of commands that belong to a Chunk. A chunk may be as small as a single command, or a long sequence of commands.

Each command instance responds to methods to examine the content, gather cross reference information and tangle a file or weave the final document.

Command class hierarchy - used to describe individual commands (74)=



Command superclass (75) 
TextCommand class to contain a document text block (76) 
CodeCommand class to contain a program source code block (77) 
XrefCommand superclass for all cross-reference commands (78) 
FileXrefCommand class for an output file cross-reference (79) 
MacroXrefCommand class for a named chunk cross-reference (80) 
UserIdXrefCommand class for a user identifier cross-reference (81) 
ReferenceCommand class for chunk references (82) 

Command class hierarchy - used to describe individual commands (74). Used by Base Class Definitions (6) .

Command Superclass

Usage

A Command is created by the WebReader, and attached to a Chunk. The Command participates in cross reference creation, weaving and tangling.

The Command superclass is abstract, and has default methods factored out of the various subclasses. When a subclass is created, it will override some of the methods provided in this superclass.

class MyNewCommand( Command ):
    ... overrides for various methods ...

Additionally, a subclass of WebReader must be defined to parse the new command syntax. The main process() function must also be updated to use this new subclass of WebReader.
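
As a rough illustration of such a subclass, the sketch below uses a trimmed stand-in for the Command superclass, written for current Python rather than the dialect used in the chunks of this document. UpperCaseCommand and ListWriter are hypothetical names that are not part of pyWeb; ListWriter only assumes that a Weaver or Tangler offers a write() method.

```python
class Command:
    """Trimmed stand-in for the pyWeb Command superclass (assumption)."""
    def __init__(self, fromLine=0):
        self.lineNumber = fromLine
    def weave(self, aWeb, aWeaver):
        pass  # default: contribute nothing to the woven document
    def tangle(self, aWeb, aTangler):
        pass  # default: contribute nothing to the tangled source


class UpperCaseCommand(Command):
    """Hypothetical command that emits its text in upper case when woven."""
    def __init__(self, text, fromLine=0):
        Command.__init__(self, fromLine)
        self.text = text
    def weave(self, aWeb, aWeaver):
        aWeaver.write(self.text.upper())
    # tangle() is inherited, so this command is silent during tangling.


class ListWriter:
    """Assumed stand-in for a Weaver or Tangler; it only collects writes."""
    def __init__(self):
        self.parts = []
    def write(self, text):
        self.parts.append(text)


weaver, tangler = ListWriter(), ListWriter()
cmd = UpperCaseCommand("note", fromLine=12)
cmd.weave(None, weaver)
cmd.tangle(None, tangler)
```

Only weave() is overridden here; the inherited tangle() leaves the tangler untouched, which is the kind of selective override the superclass is meant to allow.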

Design

The Command superclass provides the parent class definition for all of the various command types. The most common command is a block of text, which is woven or tangled. The next most common command is a reference to a chunk, which is woven as a mark-up reference, but tangled as an expansion of the source code.

The attributes of a Command instance include the line number on which the command began, in lineNumber.

Implementation

Command superclass (75)=



class Command:
    """A Command is the lowest level of granularity in the input stream."""
    def __init__( self, fromLine=0 ):
        self.lineNumber= fromLine
    def __str__( self ):
        return "at %r" % self.lineNumber
    def startswith( self, prefix ):
        return None
    def searchForRE( self, rePat, aWeb ):
        return None
    def usedBy( self, aWeb, aChunk ):
        pass
    def weave( self, aWeb, aWeaver ):
        pass
    def tangle( self, aWeb, aTangler ):
        pass

Command superclass (75). Used by Command class hierarchy - used to describe individual commands (74) .

TextCommand class

Usage

A TextCommand is created by a Chunk or a NamedDocumentChunk when a WebReader calls the chunk's appendChar() or appendText() method. This Command participates in cross reference creation, weaving and tangling. When it is created, the source line number is provided so that this text can be tied back to the source document.

Design

An instance of the TextCommand class is a block of document text. It can originate in an anonymous block or a named chunk delimited with @[ and @].

This subclass provides a concrete implementation for all of the methods. Since text is the author's original markup language, it is emitted directly to the weaver or tangler.

Implementation

TextCommand class to contain a document text block (76)=



class TextCommand( Command ):
    """A piece of document source text."""
    def __init__( self, text, fromLine=0 ):
        Command.__init__( self, fromLine )
        self.text= text
    def __str__( self ):
        return "at %r: %r..." % (self.lineNumber,self.text[:32])
    def startswith( self, prefix ):
        return self.text.startswith( prefix )
    def searchForRE( self, rePat, aWeb ):
        return rePat.search( self.text )
    def weave( self, aWeb, aWeaver ):
        aWeaver.write( self.text )
    def tangle( self, aWeb, aTangler ):
        aTangler.write( self.text )

TextCommand class to contain a document text block (76). Used by Command class hierarchy - used to describe individual commands (74) .

CodeCommand class

Usage

A CodeCommand is created by a NamedChunk when a WebReader calls the appendText() or appendChar() method. The Command participates in cross reference creation, weaving and tangling. When it is created, the source line number is provided so that this text can be tied back to the source document.

Design

An instance of the CodeCommand class is a block of program source code text. It can originate in a named chunk (@d) with a @{ and @} delimiter. Or it can be a file output chunk (@o, @O).

It uses the codeBlock() methods of a Weaver or Tangler. The weaver will insert appropriate markup for this code. The tangler will ensure that the prevailing indentation is maintained.

Implementation

CodeCommand class to contain a program source code block (77)=



class CodeCommand( TextCommand ):
    """A piece of program source code."""
    def weave( self, aWeb, aWeaver ):
        aWeaver.codeBlock( self.text )
    def tangle( self, aWeb, aTangler ):
        aTangler.codeBlock( self.text )

CodeCommand class to contain a program source code block (77). Used by Command class hierarchy - used to describe individual commands (74) .

XrefCommand superclass

Usage

An XrefCommand is created by the WebReader when any of the @f, @m, @u commands are found in the input stream. The Command is then appended to the current Chunk being built by the WebReader.

Design

The XrefCommand superclass defines any common features of the various cross-reference commands (@f, @m, @u).

The formatXref() method creates the body of a cross-reference by the following algorithm:

  1. Use the Weaver class xrefHead() method to emit the cross-reference header.
  2. Sort the keys in the cross-reference mapping.
  3. Use the Weaver class xrefLine() method to emit each line of the cross-reference mapping.
  4. Use the Weaver class xrefFoot() method to emit the cross-reference footer.

If this command winds up in a tangle operation, that use is illegal. An exception is raised and processing stops.
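
The four steps above can be sketched in isolation. RecordingWeaver is an assumed stand-in for a real Weaver that merely records the calls it receives, format_xref is a free-function version of the method written for current Python, and the sample mapping is made up for illustration.

```python
class RecordingWeaver:
    """Assumed stand-in for a Weaver; records the xref calls it receives."""
    def __init__(self):
        self.calls = []
    def xrefHead(self):
        self.calls.append(("head",))
    def xrefLine(self, name, refList):
        self.calls.append(("line", name, refList))
    def xrefFoot(self):
        self.calls.append(("foot",))


def format_xref(xref, aWeaver):
    """Emit the header, one line per key in sorted order, then the footer."""
    aWeaver.xrefHead()
    for name in sorted(xref):
        aWeaver.xrefLine(name, xref[name])
    aWeaver.xrefFoot()


weaver = RecordingWeaver()
# Made-up sample: file name -> sequence numbers of the chunks that build it.
format_xref({"pyweb.py": [6], "README": [3, 4]}, weaver)
```

Because the keys are sorted as plain strings, "README" is emitted before "pyweb.py"; upper-case letters sort ahead of lower-case ones.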

Implementation

XrefCommand superclass for all cross-reference commands (78)=



class XrefCommand( Command ):
    """Any of the Xref-goes-here commands in the input."""
    def __str__( self ):
        return "at %r: cross reference" % (self.lineNumber)
    def formatXref( self, xref, aWeaver ):
        aWeaver.xrefHead()
        xk= xref.keys()
        xk.sort()
        for n in xk:
            aWeaver.xrefLine( n, xref[n] )
        aWeaver.xrefFoot()
    def tangle( self, aWeb, aTangler ):
        raise Error('Illegal tangling of a cross reference command.')

XrefCommand superclass for all cross-reference commands (78). Used by Command class hierarchy - used to describe individual commands (74) .

FileXrefCommand class

Usage

A FileXrefCommand is created by the WebReader when the @f command is found in the input stream. The Command is then appended to the current Chunk being built by the WebReader.

Design

The FileXrefCommand class weave method gets the file cross reference from the overall web instance, and uses the formatXref() method of the XrefCommand superclass to format this result.

Implementation

FileXrefCommand class for an output file cross-reference (79)=



class FileXrefCommand( XrefCommand ):
    """A FileXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave a File Xref from @o commands."""
        self.formatXref( aWeb.fileXref(), aWeaver )

FileXrefCommand class for an output file cross-reference (79). Used by Command class hierarchy - used to describe individual commands (74) .

MacroXrefCommand class

Usage

A MacroXrefCommand is created by the WebReader when the @m command is found in the input stream. The Command is then appended to the current Chunk being built by the WebReader.

Design

The MacroXrefCommand class weave method gets the named chunk (macro) cross reference from the overall web instance, and uses the formatXref() method of the XrefCommand superclass to format this result.

Implementation

MacroXrefCommand class for a named chunk cross-reference (80)=



class MacroXrefCommand( XrefCommand ):
    """A MacroXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave the Macro Xref from @d commands."""
        self.formatXref( aWeb.chunkXref(), aWeaver )

MacroXrefCommand class for a named chunk cross-reference (80). Used by Command class hierarchy - used to describe individual commands (74) .

UserIdXrefCommand class

Usage

A UserIdXrefCommand is created by the WebReader when the @u command is found in the input stream. The Command is then appended to the current Chunk being built by the WebReader.

Design

The UserIdXrefCommand class weave method gets the user identifier cross reference information from the overall web instance. It then formats this line using the following algorithm, which is similar to the algorithm in the XrefCommand superclass.

  1. Use the Weaver class xrefHead() method to emit the cross-reference header.
  2. Sort the keys in the cross-reference mapping.
  3. Use the Weaver class xrefDefLine() method to emit each line of the cross-reference definition mapping.
  4. Use the Weaver class xrefFoot() method to emit the cross-reference footer.

Implementation

UserIdXrefCommand class for a user identifier cross-reference (81)=



class UserIdXrefCommand( XrefCommand ):
    """A UserIdXref command."""
    def weave( self, aWeb, aWeaver ):
        """Weave a user identifier Xref from @d commands."""
        ux= aWeb.userNamesXref()
        aWeaver.xrefHead()
        un= ux.keys()
        un.sort()
        for u in un:
            defn, refList= ux[u]
            aWeaver.xrefDefLine( u, defn, refList )
        aWeaver.xrefFoot()

UserIdXrefCommand class for a user identifier cross-reference (81). Used by Command class hierarchy - used to describe individual commands (74) .

ReferenceCommand class

Usage

A ReferenceCommand instance is created by a WebReader when a @<name@> construct is found in the input stream. This is attached to the current Chunk being built by the WebReader.

Design

During a weave, this creates a markup reference to another NamedChunk. During tangle, this actually includes the NamedChunk at this point in the tangled output file.

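The constructor creates several attributes of an instance of a ReferenceCommand: refTo, the chunk name as written in the reference; and the full name and sequenceList, which start as None and are filled in by the resolve() method.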

Implementation

ReferenceCommand class for chunk references (82)=



class ReferenceCommand( Command ):
    """A reference to a named chunk, via @<name@>."""
    def __init__( self, refTo, fromLine=0 ):
        Command.__init__( self, fromLine )
        self.refTo= refTo
        self.fullName= None
        self.sequenceList= None
    def __str__( self ):
        return "at %r: reference to chunk %r" % (self.lineNumber,self.refTo)
    ReferenceCommand resolve this chunk name if it was abbreviated (83) 
    ReferenceCommand refers to chunk (84) 
    ReferenceCommand weave a reference to a chunk (85) 
    ReferenceCommand tangle a referenced chunk (86) 

ReferenceCommand class for chunk references (82). Used by Command class hierarchy - used to describe individual commands (74) .

The resolve() method queries the overall Web instance for the full name and sequence number for this chunk reference. This is used by the Weaver class referenceTo() method to write the markup reference to the chunk.

ReferenceCommand resolve this chunk name if it was abbreviated (83)=



def resolve( self, aWeb ):
    """Reference to actual location of this chunk"""
    self.fullName, self.sequenceList = aWeb.chunkReference( self.refTo )

ReferenceCommand resolve this chunk name if it was abbreviated (83). Used by ReferenceCommand class for chunk references (82) .

The usedBy() method is a request that is delegated by a Chunk; it resolves the reference and calls the setUsage() method of the overall Web instance to indicate that the parent chunk refers to the named chunk. This also updates the reference count for the named chunk.

ReferenceCommand refers to chunk (84)=



def usedBy( self, aWeb, aChunk ):
    self.resolve( aWeb )
    aWeb.setUsage( aChunk, self.fullName )

ReferenceCommand refers to chunk (84). Used by ReferenceCommand class for chunk references (82) .

The weave() method inserts a markup reference to a named chunk. It uses the Weaver class referenceTo() method to format this appropriately for the document type being woven.

ReferenceCommand weave a reference to a chunk (85)=



def weave( self, aWeb, aWeaver ):
    """Create the nicely formatted reference to a chunk of code."""
    self.resolve( aWeb )
    aWeb.weaveChunk( self.fullName, aWeaver )

ReferenceCommand weave a reference to a chunk (85). Used by ReferenceCommand class for chunk references (82) .

The tangle() method inserts the resolved chunk in this place. When a chunk is tangled, it sets the indent, inserts the chunk and resets the indent.

ReferenceCommand tangle a referenced chunk (86)=



def tangle( self, aWeb, aTangler ):
    """Create source code."""
    self.resolve( aWeb )
    aWeb.tangleChunk( self.fullName, aTangler )

ReferenceCommand tangle a referenced chunk (86). Used by ReferenceCommand class for chunk references (82) .

Error class

Usage

An Error is raised whenever processing cannot continue. Since it is a subclass of Exception, it takes an arbitrary number of arguments. The first should be the basic message text. Subsequent arguments provide additional details. We will try to be sure that all of our internal exceptions reference a specific chunk, if possible. This means either including the chunk as an argument, or catching the exception and appending the current chunk to the exception's arguments.

The Python raise statement takes an instance of Error and passes it to the enclosing try/except statement for processing.

The typical creation is as follows:

raise Error("No full name for %r" % chunk.name, chunk)

A typical exception-handling suite might look like this:

try:
    ...something that may raise an Error or Exception...
except Error,e:
    print e.args # this is our internal Error
except Exception,w:
    print w.args # this is some other Python Exception

Design

The Error class is a subclass of Exception used to differentiate application-specific exceptions from other Python exceptions. It does no additional processing, but merely creates a distinct class to facilitate writing except statements.

Implementation

Error class - defines the errors raised (87)=



class Error( Exception ): pass

Error class - defines the errors raised (87). Used by Base Class Definitions (6) .

Logger Classes

Usage

A single global variable theLog has an instance of the Logger. This instance must be global to this entire module. It is created at the module scope. See the Module Initialization section, below.

theLog= Logger(standard)

Important application events are defined as subclasses of Event. By default, these classes behave somewhat like exceptions. They are constructed with an arbitrary list of values as their arguments. The intent is to name and package the arguments, so there are no methods to override.

class MyEvent( Event ): pass

When a log message needs to be written, the event() method of the logger creates an instance of the given Event subclass with the desired arguments. It also attaches a LogActivity object to the Event, and calls the Event's log() method.

theLog.event( MyEvent, "arg1", arg2, etc... )

The global logger instance can be configured to apply certain logging strategy methods to each Event instance that is created. The default strategies are LogReport, LogDebug and LogDiscard. These are applied by the event() method after the Event instance is constructed. The LogReport strategy writes a summary to stdout; the LogDebug strategy writes a very detailed line to stdout; the LogDiscard strategy silently ignores the event.

Design

An instance of the Logger class provides global context information for all debugging activity. The most important service is the event() method; this method creates and then activates the given log event.

The Logger class event() method constructs an Event instance. The function accepts a sequence of arguments. The first argument must be an Event class. The remaining arguments are arguments given to the Event class constructor.

The LogActivity instance determines what is done with this class of event. Two of the built-in LogActivity classes are LogReport and LogDiscard. An instance of LogReport will report the event. An instance of LogDiscard will silently discard the event.
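
The interplay of Logger, Event and LogActivity can be sketched as follows, in current Python syntax. This is a simplified illustration, not the implementation given below: LogCollect is a hypothetical strategy that stores messages so the outcome can be inspected, and the defaultAction parameter is an assumption made for this sketch (the real Event falls back to LogReport).

```python
class LogActivity:
    """Base strategy: do nothing with the event."""
    def log(self, anEvent, aLogger):
        pass


class LogDiscard(LogActivity):
    """Silently ignore the event."""


class LogCollect(LogActivity):
    """Hypothetical strategy: keep messages in a list instead of printing."""
    def __init__(self):
        self.seen = []
    def log(self, anEvent, aLogger):
        self.seen.append(str(anEvent))


class Event:
    def __init__(self, logger, *args):
        self.logger = logger
        self.args = args
        self.action = logger.defaultAction
        # Pick the strategy configured for this class of event, if any.
        for classNames, activity in logger.logConfig:
            if self.__class__.__name__ in classNames:
                self.action = activity
    def __str__(self):
        return ";".join(self.args[:1])
    def log(self):
        self.action.log(self, self.logger)


class ReadEvent(Event): pass
class ErrorEvent(Event): pass


class Logger:
    def __init__(self, logConfig, defaultAction):
        self.logConfig = logConfig
        self.defaultAction = defaultAction
    def event(self, className, message):
        # Build the event, then let its attached strategy handle it.
        className(self, message).log()


collect = LogCollect()
theLog = Logger([(("ReadEvent",), LogDiscard())], defaultAction=collect)
theLog.event(ReadEvent, "reading chunk 12")
theLog.event(ErrorEvent, "no full name")
```

With this configuration the ReadEvent is discarded, while the ErrorEvent falls through to the default strategy and is collected.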

Implementation

Logger classes - handle logging of status messages (88)=



class Logger:
    def __init__( self, logConfig ):
        self.logConfig= logConfig
    def event( self, className, message ):
        className(self,message).log()
LogActivity strategy class hierarchy - including LogReport and LogDiscard (91) 
Logger Event base class definitions (89) 
Logger Event subclasses are unique to this application (90) 
Global singletons that define the activities for each Event class (148) 

Logger classes - handle logging of status messages (88). Used by Base Class Definitions (6) .

The various Event subclasses are used to separate application events for convenience in logging and debugging. When an Event instance is created, the Logger configuration is used to attach the correct LogActivity strategy instance.

Logger Event base class definitions (89)=



class Event:
    def __init__( self, *args ):
        self.logger= args[0]
        self.args= args[1:]
        self.action= LogReport()    # default action
        self.time= time.time()
        for cl,a in self.logger.logConfig:
            if self.__class__.__name__ in cl:
                self.action= a
    def __str__( self ):
        return "%s" % ( ';'.join( self.args[:1]) )
    def __repr__( self ):
        return ("%s %s: %s" 
            % ( self.__class__.__name__, 
            time.strftime( '%c', time.localtime( self.time ) ),
            ';'.join( self.args[:1]) ) )
    def log( self ):
        self.action.log( self, self.logger )
class ExecutionEvent( Event ): 
    def __init__( self, *args ):
        apply( Event.__init__, ( self, ) + args )

Logger Event base class definitions (89). Used by Logger classes - handle logging of status messages (88) .

These subclasses of Event are unique to this pyWeb application.

The ErrorEvent overrides the __str__() method. It could provide a slightly different report. In the current version, the display is the same as all other log messages.

Logger Event subclasses are unique to this application (90)=



class InputEvent( Event ): pass
class OptionsEvent( Event ): pass
class ReadEvent( Event ): pass
class WeaveEvent( Event ): pass
class WeaveStartEvent( Event ): pass
class WeaveEndEvent( Event ): pass
class TangleEvent( Event ): pass
class TangleStartEvent( Event ): pass
class TangleEndEvent( Event ): pass
class SummaryEvent( Event ): pass
class ErrorEvent( Event ): 
    def __str__( self ):
        return "%s" % ( ';'.join( self.args[:1]) )
class WarningEvent( ErrorEvent ): pass

Logger Event subclasses are unique to this application (90). Used by Logger classes - handle logging of status messages (88) .

The strategies provide alternate activities to be taken when a log event's log() method is called.

LogActivity strategy class hierarchy - including LogReport and LogDiscard (91)=



class LogActivity:
    def log( self, anEvent, aLogger ):
        pass
class LogReport( LogActivity ):
    def log( self, anEvent, aLogger ):
        print anEvent
class LogDebug( LogActivity ):
    def log( self, anEvent, aLogger ):
        print repr(anEvent)
class LogDiscard( LogActivity ):
    pass

LogActivity strategy class hierarchy - including LogReport and LogDiscard (91). Used by Logger classes - handle logging of status messages (88) .

The Web Class

The overall web of chunks and their cross references is carried in a single instance of the Web class that drives the weaving and tangling operations. Broadly, the functionality of a Web can be separated into several areas.

A web instance has a number of attributes.

Web class - describes the overall "web" of chunks (92)=



class Web:
    """The overall Web of chunks and their cross-references."""
    def __init__( self ):
        self.sourceFileName= None
        self.chunkSeq= []
        self.output= {}
        self.named= {}
        self.usedBy= {}
        self.sequence= 0
    def __str__( self ):
        return "chunks=%r" % self.chunkSeq
    Web construction methods used by Chunks and WebReader (93) 
    Web Chunk name resolution methods (98)(99)(100) 
    Web Chunk cross reference methods (101)(103)(104) 
    Web determination of the language from the first chunk (107) 
    Web tangle the output files (108) 
    Web weave the output document (109) 

Web class - describes the overall "web" of chunks (92). Used by Base Class Definitions (6) .

During web construction, it is convenient to capture information about the individual Chunk instances being appended to the web. This is done using a Callback design pattern. Each subclass of Chunk provides an override for the Chunk class webAdd() method. This override calls one of the appropriate web construction methods.

Also note that the full name for a chunk can be given either as part of the definition, or as part of a reference. Typically, the reference has the full name and the definition has the elided name. This allows a reference to a chunk to contain a more complete description of the chunk.

Web construction methods used by Chunks and WebReader (93)=



Web add a full chunk name, ignoring abbreviated names (94) 
Web add an anonymous chunk (95) 
Web add a named macro chunk (96) 
Web add an output file definition chunk (97) 

Web construction methods used by Chunks and WebReader (93). Used by Web class - describes the overall "web" of chunks (92) .

A name is only added to the known names when it is a full name, not an abbreviation ending with "...". Abbreviated names are quietly skipped until the full name is seen.

The algorithm for the addDefName() method, then, is as follows:

  1. Use the fullNameFor() method to locate the full name.
  2. If no full name was found (the result of fullNameFor() ends with '...'), ignore this name as an abbreviation with no definition.
  3. If this is a full name and the name was not in the named mapping, add this full name to the mapping.

This name resolution approach presents a problem when a chunk is defined before it is referenced and the first definition uses an abbreviated name. This is an atypical construction of an input document, however, since the intent is to provide high-level summaries that have forward references to supporting details.

Web add a full chunk name, ignoring abbreviated names (94)=



def addDefName( self, name ):
    """Reference to or definition of a chunk name."""
    nm= self.fullNameFor( name )
    if nm[-3:] == '...':
        theLog.event( ReadEvent, "Abbreviated reference %r" % name )
        return None # first occurrence is a forward reference using an abbreviation
    if not self.named.has_key( nm ):
        self.named[ nm ]= []
        theLog.event( ReadEvent, "Adding chunk %r" % name )
    return nm

Web add a full chunk name, ignoring abbreviated names (94). Used by Web construction methods used by Chunks and WebReader (93) .

An anonymous Chunk is kept in a sequence of chunks, used for weaving.

Web add an anonymous chunk (95)=



def add( self, chunk ):
    """Add an anonymous chunk."""
    self.chunkSeq.append( chunk )

Web add an anonymous chunk (95). Used by Web construction methods used by Chunks and WebReader (93) .

A named Chunk is defined with a @d command. It is collected into a mapping of NamedChunk instances. An entry in the mapping is a sequence of chunks that have the same name. This sequence of chunks is used to produce the weave or tangle output.

All chunks are also placed in the overall sequence of chunks. This overall sequence is used for weaving the document.

The addDefName() method is used to resolve this name if it is an abbreviation, or add it to the mapping if this is the first occurrence of the name. If the name cannot be added, an instance of our Error class is raised. If the name exists or was added, the chunk is appended to the chunk list associated with this name.

The web's sequence counter is incremented, and this unique sequence number sets the seq attribute of the Chunk. If the chunk list was empty, this is the first chunk and the firstSecond flag is set to "=". If the chunk list was not empty, this is a subsequent chunk and the firstSecond flag is set to "+=".
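
The numbering and flagging rule can be seen in a small stand-alone sketch in current Python syntax. MiniWeb and MiniChunk are hypothetical, trimmed stand-ins; abbreviation handling is left out, so every name is assumed to be a full name already.

```python
class MiniWeb:
    """Assumed, trimmed stand-in for Web; names are taken as already full."""
    def __init__(self):
        self.named = {}
        self.sequence = 0
    def addNamed(self, chunk):
        chunkList = self.named.setdefault(chunk.name, [])
        self.sequence += 1
        chunk.seq = self.sequence
        # "=" marks the first definition, "+=" marks every continuation.
        chunk.firstSecond = "+=" if chunkList else "="
        chunkList.append(chunk)


class MiniChunk:
    """Hypothetical chunk that only carries a name."""
    def __init__(self, name):
        self.name = name


web = MiniWeb()
a, b, c = MiniChunk("imports"), MiniChunk("main"), MiniChunk("imports")
for ch in (a, b, c):
    web.addNamed(ch)
flags = [(ch.seq, ch.firstSecond) for ch in (a, b, c)]
```

The second chunk named "imports" receives its own sequence number but is flagged "+=", which is what lets the woven document show it as a continuation of the first.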

Web add a named macro chunk (96)=



def addNamed( self, chunk ):
    """Add a named chunk to a sequence with a given name."""
    self.chunkSeq.append( chunk )
    nm= self.addDefName( chunk.name )
    if nm:
        self.sequence += 1
        chunk.seq= self.sequence
        if self.named[nm]: chunk.firstSecond= '+='
        else: chunk.firstSecond= '='
        self.named[ nm ].append( chunk )
        theLog.event( ReadEvent, "Extending chunk %r from %r" % ( nm, chunk.name ) )
    else:
        raise Error("No full name for %r" % chunk.name, chunk)

Web add a named macro chunk (96). Used by Web construction methods used by Chunks and WebReader (93) .

An output file definition Chunk is defined with an @o or @O command. It is collected into a mapping of OutputChunk instances. An entry in the mapping is a sequence of chunks that have the same name. This sequence of chunks is used to produce the weave or tangle output.

Note that file names cannot be abbreviated.

All chunks are also placed in the overall sequence of chunks. This overall sequence is used for weaving the document.

If the name does not exist in the output mapping, the name is added with an empty sequence of chunks. In all cases, the chunk is appended to the chunk list associated with this name.

The web's sequence counter is incremented, and this unique sequence number sets the Chunk's seq attribute. If the chunk list was empty, this is the first chunk and the firstSecond flag is set to "=". If the chunk list was not empty, this is a subsequent chunk and the firstSecond flag is set to "+=".

Web add an output file definition chunk (97)=



def addOutput( self, chunk ):
    """Add an output chunk to a sequence with a given name."""
    self.chunkSeq.append( chunk )
    if not self.output.has_key( chunk.name ):
        self.output[chunk.name] = []
        theLog.event( ReadEvent, "Adding chunk %r" % chunk.name )
    self.sequence += 1
    chunk.seq= self.sequence
    if self.output[chunk.name]: chunk.firstSecond= '+='
    else: chunk.firstSecond= '='
    self.output[chunk.name].append( chunk )

Web add an output file definition chunk (97). Used by Web construction methods used by Chunks and WebReader (93) .

Web chunk name resolution has three aspects. The first is inflating elided names (those ending with ...) to their actual full names. The second is finding the named chunk in the web structure. The third is returning a reference to a specific chunk including the name and sequence number.

Note that a chunk name actually refers to a sequence of chunks. Multiple definitions for a chunk are allowed, and all of the definitions are concatenated to create the complete chunk. This complexity makes it unwise to return the sequence of same-named chunks; therefore, we put the burden on the Web to process all chunks with a given name, in sequence.

The fullNameFor() method resolves the full name for a chunk as follows:

  1. If the string is already in the named mapping, this is the full name.
  2. If the string ends in '...', visit each key in the dictionary to see if the key starts with the string up to the trailing '...'. If a match is found, the dictionary key is the full name.
  3. Otherwise, treat this as a full name.
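
The resolution steps above can be tried out with a stand-alone function in current Python syntax. full_name_for is a hypothetical free-function variant that takes the named mapping as an argument. The ambiguity check mirrors the implementation that follows rather than the numbered steps, and ValueError is used here in place of the application's Error class so that the sketch stands alone.

```python
def full_name_for(named, name):
    """Resolve a possibly abbreviated chunk name against the known names."""
    if name in named:
        return name  # step 1: already a known full name
    if name.endswith("..."):
        prefix = name[:-3]
        matches = [n for n in named if n.startswith(prefix)]
        if len(matches) > 1:
            raise ValueError(
                "Ambiguous abbreviation %r, matches %r" % (name, matches))
        if len(matches) == 1:
            return matches[0]  # step 2: unique key with this prefix
    return name  # step 3: treat it as a full name


# Made-up sample of known chunk names.
named = {"Emitter superclass": [], "Weaver subclass of Emitter": []}
resolved = full_name_for(named, "Emitter...")
unknown = full_name_for(named, "Tangler...")
```

An abbreviation with no match comes back unchanged, still ending in '...', which is how a caller can tell that no full name is known yet.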

Web Chunk name resolution methods (98)=



def fullNameFor( self, name ):
    # resolve "..." names
    if self.named.has_key( name ): return name
    if name[-3:] == '...':
        best= []
        for n in self.named.keys():
            if n.startswith( name[:-3] ): best.append( n )
        if len(best) > 1:
            raise Error("Ambiguous abbreviation %r, matches %r" % ( name, best ) )
        elif len(best) == 1: 
            return best[0]
    return name

Web Chunk name resolution methods (98). Used by Web class - describes the overall "web" of chunks (92) .

The _chunk() method locates a named sequence of chunks by first determining the full name for the identifying string. If the full name is in the named mapping, the sequence of chunks is returned. Otherwise, an instance of our Error class is raised because the name is unresolvable.

It might be more helpful for debugging to emit this as an error in the weave and tangle results and keep processing. This would allow an author to catch multiple errors in a single run of pyWeb.

Web Chunk name resolution methods (99)+=



def _chunk( self, name ):
    """Locate a named sequence of chunks."""
    nm= self.fullNameFor( name )
    if self.named.has_key( nm ):
        return self.named[nm]
    raise Error( "Cannot resolve %r in %r" % (name,self.named.keys()) )

Web Chunk name resolution methods (99). Used by Web class - describes the overall "web" of chunks (92) .

The chunkReference() method returns the full name and sequence number of a chunk. Given a short identifying string, the full name is easily resolved. The chunk sequence number, however, requires that the first chunk be located. This first chunk has the sequence number that can be used for all cross-references.

Web Chunk name resolution methods (100)+=



def chunkReference( self, name ):
    """Provide a full name and sequence number for a (possibly abbreviated) name."""
    c= self._chunk( name )
    if not c:
        raise Error( 'Unknown named chunk %r' % name )
    return self.fullNameFor( name ), [ x.seq for x in c ]

Web Chunk name resolution methods (100). Used by Web class - describes the overall "web" of chunks (92) .

Cross-reference support includes creating and reporting on the various cross-references available in a web. This includes creating the usedBy list of chunks that reference a given chunk; and returning the file, macro and user identifier cross references.

Each Chunk has a list of Reference commands that shows the chunks to which a chunk refers. These relationships must be reversed to show the chunks that refer to a given chunk. This is done by traversing the entire web of named chunks and recording each chunk-to-chunk reference in a usedBy mapping. This mapping has the referred-to chunk as the key, and a sequence of referring chunks as the value.

The accumulation is initiated by the createUsedBy() method. This method visits each Chunk, calling the usedBy() method, passing in the Web instance as an argument. Each Chunk class usedBy() method, in turn, invokes the usedBy() method of each Command instance in the chunk. Most commands do nothing, but a ReferenceCommand will call back to the Web class setUsage() method to record a reference.

When the createUsedBy() method has accumulated the entire cross reference, it also assures that all chunks are used exactly once.

The setUsage() method accepts a chunk which contains a reference command and the name to which the reference command points (the "referent"). Each of the referent's chunk sequence numbers is used as the key in a mapping that lists references to this referent.

The chunkReferencedBy() method resolves the chunk's name, returning the list of chunks which refer to the given chunk.
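
The reversal of references into a used-by mapping can be sketched on plain data, in current Python syntax. build_used_by is a hypothetical helper, and the tuples and sample names below are simplifications chosen for this illustration; the real methods work on Chunk and Command instances.

```python
def build_used_by(references, named):
    """Invert chunk-to-chunk references into a used-by mapping.

    references: list of (referring_name, referring_seq, referent_name)
    named: referent name -> sequence numbers of the chunks defining it
    """
    used_by = {}
    for ref_name, ref_seq, referent in references:
        # Every chunk that contributes to the referent records the referrer.
        for seq in named[referent]:
            used_by.setdefault(seq, []).append((ref_name, ref_seq))
    return used_by


# Made-up sample: "imports" is defined in two pieces, chunks 2 and 4.
named = {"main": [1], "imports": [2, 4], "helpers": [3]}
references = [("main", 1, "imports"), ("main", 1, "helpers")]
used_by = build_used_by(references, named)
```

Both pieces of "imports" end up pointing back at chunk 1, so either piece can be woven with a note on where it is used.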

Web Chunk cross reference methods (101)=



def createUsedBy( self ):
    """Compute a "used-by" table showing references to chunks."""
    for c in self.chunkSeq:
        c.usedBy( self )
    Web Chunk check reference counts are all one (102) 
def setUsage( self, aChunk, aRefName ):
    for c in self._chunk( aRefName ):
        self.usedBy.setdefault( c.seq, [] )
        self.usedBy[c.seq].append( (self.fullNameFor(aChunk.name),aChunk.seq) )
        c.refCount += 1
def chunkReferencedBy( self, seqNumber ):
    """Provide the list of places where a chunk is used."""
    return self.usedBy.setdefault( seqNumber, [] )

Web Chunk cross reference methods (101). Used by Web class - describes the overall "web" of chunks (92) .

We verify that the reference count for a chunk is exactly one. We don't gracefully tolerate multiple references to a chunk or unreferenced chunks.

Web Chunk check reference counts are all one (102)=



for nm,cl in self.named.items():
   if len(cl) > 0:
       if cl[0].refCount == 0:
           theLog.event( WarningEvent, "No reference to %r" % nm )
       elif cl[0].refCount > 1:
           theLog.event( WarningEvent, "Multiple references to %r" % nm )
   else:
       theLog.event( WarningEvent, "No definition for %r" % nm )

Web Chunk check reference counts are all one (102). Used by Web Chunk cross reference methods (101) .

The fileXref() method visits all named file output chunks in output and collects the sequence numbers of each section in the sequence of chunks.

The chunkXref() method uses the same algorithm as the fileXref() method, but applies it to the named mapping.

Web Chunk cross reference methods (103)+=



def fileXref( self ):
    fx= {}
    for f,cList in self.output.items():
        fx[f]= [ c.seq for c in cList ]
    return fx
def chunkXref( self ):
    mx= {}
    for n,cList in self.named.items():
        mx[n]= [ c.seq for c in cList ]
    return mx

Web Chunk cross reference methods (103). Used by Web class - describes the overall "web" of chunks (92) .

The userNamesXref() method creates a mapping for each user identifier. The value for this mapping is a tuple with the chunk that defined the identifier (via a @| command), and a sequence of chunks that reference the identifier.

For example: { 'Web': ( 87, (88,93,96,101,102,104) ), 'Chunk': ( 53, (54,55,56,60,57,58,59) ) }, shows that the identifier 'Web' is defined in the chunk with sequence number 87, and referenced in the sequence of chunks that follow.

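This works in two passes: the first gathers every user identifier assigned by a @| command, and the second searches the chunks for references to those identifiers.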

Web Chunk cross reference methods (104)+=



def userNamesXref( self ):
    ux= {}
    self._gatherUserId( self.named, ux )
    self._gatherUserId( self.output, ux )
    self._updateUserId( self.named, ux )
    self._updateUserId( self.output, ux )
    return ux
def _gatherUserId( self, chunkMap, ux ):
    collect all user identifiers from a given map into ux (105) 
def _updateUserId( self, chunkMap, ux ):
    find user identifier usage and update ux from the given map (106) 

Web Chunk cross reference methods (104). Used by Web class - describes the overall "web" of chunks (92) .

User identifiers are collected by visiting each of the sequence of Chunks that share the same name; within each component chunk, if the chunk has identifiers assigned by the @| command, these are seeded into the dictionary. If the chunk does not permit identifiers, it simply returns an empty list as a default action.

collect all user identifiers from a given map into ux (105)=



for n,cList in chunkMap.items():
    for c in cList:
        for id in c.getUserIDRefs():
            ux[id]= ( c.seq, [] )

collect all user identifiers from a given map into ux (105). Used by Web Chunk cross reference methods (104) .

User identifiers are cross-referenced by visiting each of the sequence of Chunks that share the same name; within each component chunk, visit each user identifier; if the Chunk class searchForRE() method matches an identifier, this is appended to the sequence of chunks that reference the original user identifier.

find user identifier usage and update ux from the given map (106)=



# examine source for occurrences of all names in ux.keys()
for id in ux.keys():
    theLog.event( WeaveEvent, "References to %r" % id )
    idpat= re.compile( r'\W%s\W' % id )
    for n,cList in chunkMap.items():
        for c in cList:
            if c.seq != ux[id][0] and c.searchForRE( idpat, self ):
                ux[id][1].append( c.seq )

find user identifier usage and update ux from the given map (106). Used by Web Chunk cross reference methods (104) .
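
The gather and update passes can be sketched on plain data as follows. This is a hedged illustration only: the function user_names_xref and its two dictionary arguments are assumptions for the example, and the pattern uses \b word boundaries with re.escape() rather than the \W...\W pattern in the chunk above, a substitution made here so that identifiers at the very start or end of a chunk are also found.

```python
import re

def user_names_xref(chunks, defined_ids):
    """Build {identifier: (defining seq, [referencing seqs])}.

    chunks:      {seq: source text}           (hypothetical input shape)
    defined_ids: {seq: [identifier, ...]}     (hypothetical input shape)
    """
    ux = {}
    # Pass 1: record the defining chunk for each identifier.
    for seq, ids in defined_ids.items():
        for name in ids:
            ux[name] = (seq, [])
    # Pass 2: search every other chunk for each identifier.
    for name, (def_seq, refs) in ux.items():
        pat = re.compile(r'\b%s\b' % re.escape(name))
        for seq, text in sorted(chunks.items()):
            if seq != def_seq and pat.search(text):
                refs.append(seq)
    return ux

chunks = {
    1: "class Web:\n    pass\n",
    2: "w = Web()\n",
    3: "print('WebReader')\n",
}
xref = user_names_xref(chunks, {1: ["Web"]})
```

Chunk 3 is not reported because 'Web' occurs there only as part of the longer word 'WebReader'.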

The language() method determines the output language. The determination of the language can be done in a variety of ways. One is to use command line parameters, another is to use the filename extension on the input file.

We examine the first few characters of input. A proper HTML, XHTML or XML file begins with '<!', '<?' or '<H'. Latex files typically begin with '%' or '\'.

The EmitterFactory may be a better location for this function.

Web determination of the language from the first chunk (107)=



def language( self, preferredWeaverClass=None ):
    """Construct a weaver appropriate to the document's language"""
    if preferredWeaverClass:
        return preferredWeaverClass()
    if self.chunkSeq[0].startswith('<'): return HTML()
    return Latex()

Web determination of the language from the first chunk (107). Used by Web class - describes the overall "web" of chunks (92) .
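
The first-character test can be sketched on a plain string. The function guess_markup and its string return values are assumptions for this example (the method above returns weaver instances), and the lstrip() call that skips leading whitespace is also an addition not present in the method above.

```python
def guess_markup(text):
    """Return 'html' when the text starts with '<', otherwise 'latex'."""
    # Assumption: leading whitespace is ignored before testing.
    if text.lstrip().startswith('<'):
        return 'html'
    return 'latex'

html_guess = guess_markup('<!DOCTYPE html>\n<html>')
latex_guess = guess_markup('\\documentclass{article}')
comment_guess = guess_markup('% a LaTeX comment')
```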

The tangle() method of the Web class performs the tangle() method for each Chunk of each named output file. Note that several chunks may share the file name, so the file is composed of the material in each of those chunks, in order.

During tangling of a chunk, the chunk may reference another chunk. This transitive tangling of an individual chunk is handled by the tangleChunk() method.

Web tangle the output files (108)=



def tangle( self, aTangler ):
    for f,c in self.output.items():
        aTangler.open( f )
        for p in c:
            p.tangle( self, aTangler )
        aTangler.close()
def tangleChunk( self, name, aTangler ):
    theLog.event( TangleEvent, "Tangling chunk %r" % name )
    for p in self._chunk(name):
        p.tangle( self, aTangler )

Web tangle the output files (108). Used by Web class - describes the overall "web" of chunks (92) .
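
Transitive tangling amounts to a recursive expansion of references. The following is a minimal sketch under assumed data shapes: the function tangle_chunk, the ('ref', name) tuples and the list-of-parts layout are illustrative stand-ins for the Chunk and ReferenceCommand classes, not the pyWeb code.

```python
def tangle_chunk(name, named, out):
    """Append the full expansion of chunk `name` to the list `out`.

    `named` maps a chunk name to a list of parts that share that name;
    each part is a list of items, either plain text or ('ref', name).
    """
    for part in named[name]:
        for item in part:
            if isinstance(item, tuple) and item[0] == 'ref':
                # A reference is expanded in place, recursively.
                tangle_chunk(item[1], named, out)
            else:
                out.append(item)

named = {
    'main': [["import sys\n", ('ref', 'body')], ["# end\n"]],
    'body': [["print(sys.argv)\n"]],
}
out = []
tangle_chunk('main', named, out)
result = ''.join(out)
```

The two parts named 'main' are emitted in order, with the 'body' chunk expanded at the point of reference.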

The weave() method of the Web class creates the final documentation. This is done by stepping through each Chunk in sequence and weaving the chunk into the resulting file via the Chunk class weave() method.

During weaving of a chunk, the chunk may reference another chunk. When weaving a reference to a named chunk (output or ordinary programming source defined with @{), this does not lead to transitive weaving: only a reference is put in from one chunk to another. However, when weaving a chunk defined with @[, the chunk is expanded when weaving. The decision is delegated to the referenced chunk.

Web weave the output document (109)=



def weave( self, aWeaver ):
    aWeaver.open( self.sourceFileName )
    for c in self.chunkSeq:
        c.weave( self, aWeaver )
    aWeaver.close()
def weaveChunk( self, name, aWeaver ):
    theLog.event( WeaveEvent, "Weaving chunk %r" % name )
    chunkList= self._chunk(name)
    chunkList[0].weaveReferenceTo( self, aWeaver )
    for p in chunkList[1:]:
        p.weaveShortReferenceTo( self, aWeaver )

Web weave the output document (109). Used by Web class - describes the overall "web" of chunks (92) .

The WebReader Class

Usage

There are two forms of the constructor for a WebReader. The initial WebReader instance is created with code like the following:

p= WebReader( aFileName, command=aCommandCharacter )

This will define the initial input file and the command character, both of which are command-line parameters to the application.

When processing an include file (with the @i command), a child WebReader instance is created with code like the following:

c= WebReader( anIncludeName, parent=parentWebReader )

This will define the included file, but will inherit the command character from the parent WebReader. This will also include a reference from child to parent so that embedded Python expressions can view the entire input context.

Design

The WebReader class parses the input file into command blocks. These are assembled into Chunks, and the Chunks are assembled into the document Web. Once this input pass is complete, the resulting Web can be tangled or woven.

The parser works by reading the entire file and splitting on @. patterns. The split() method of the Python re module will separate the input and preserve the actual character sequence on which the input was split. This breaks the input into blocks of text separated by the @. characters.
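
The behavior of re.split() with a capturing group can be seen in a short example. The pattern is built the same way as the parsePat attribute shown later in this section; the input string is invented for illustration.

```python
import re

command = '@'
parse_pat = '(%s.)' % command      # '(@.)': the command character plus any one character
tokens = re.split(parse_pat, "text @d name @{ code @} more")
```

Because the pattern is wrapped in parentheses, the two-character commands are kept in the result, alternating with the blocks of text between them.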

"Major" commands partition the input into Chunks. The major commands are @d, @o and @O, as well as the @{, @}, @[, @] brackets, and the @i command to include another file.

"Minor" commands are inserted into a Chunk as a Command. Blocks of text are minor commands, as well as the @<name@> references, the various cross-reference commands (@f, @m and @u). The @@ escape is also handled here so that all further processing is independent of any parsing.

Implementation

The class has the following attributes:

WebReader class - parses the input file, building the Web structure (110)=



class WebReader:
    """Parse an input file, creating Commands and Chunks."""
    def __init__( self, fileName, parent=None, command='@', permit=() ):
        self.fileName= fileName
        self.tokenList= []
        self.token= ""
        self.tokenIndex= 0
        self.tokenPushback= []
        self.lineNumber= 0
        self.aChunk= None
        self.parent= parent
        self.theWeb= None
        self.permitList= permit
        if self.parent: 
            self.command= self.parent.command
        else:
            self.command= command
        self.parsePat= '(%s.)' % self.command
        WebReader command literals (126) 
    WebReader tokenize the input (111) 
    WebReader location in the input stream (112) 
    WebReader handle a command string (113) 
    WebReader load the web (124) 

WebReader class - parses the input file, building the Web structure (110). Used by Base Class Definitions (6) .

This tokenizer centralizes a single call to nextToken(). This assures that every token is examined by nextToken(), which permits accurate counting of the '\n' characters and determining the line numbers of the input file. This line number information can then be attached to each Command, directing the user back to the correct line of the original input file.

The tokenizer supports lookahead by allowing the parser to examine tokens and then push them back into a pushBack stack. Generally this is used for the special case of parsing the @i command, which has no @-command terminator or separator. It ends with the following '\n'.

Python permits a simplified double-ended queue for this kind of token stream processing. Ordinary tokens are fetched with a pop(0), and a pushback is done by prepending the pushback token with a tokenList = [ token ] + tokenList. For this application, however, we need to keep a count of '\n's seen, and we want to avoid double-counting '\n' pushed back into the token stream. So we use a queue of tokens and a stack for pushback.

WebReader tokenize the input (111)=



def openSource( self ):
    theLog.event( InputEvent, "Processing %r" % self.fileName )
    file= open(self.fileName, 'r' ).read()
    self.tokenList= re.split(self.parsePat, file )
    self.lineNumber= 1
    self.tokenPushback= []
def nextToken( self ):
    self.lineNumber += self.token.count('\n')
    if self.tokenPushback:
        self.token= self.tokenPushback.pop()
    else:
        self.token= self.tokenList.pop(0)
    return self.token
def moreTokens( self ):
    return self.tokenList or self.tokenPushback
def pushBack( self, token ):
    self.tokenPushback.append( token )
def totalLines( self ):
    self.lineNumber += self.token.count('\n')
    return self.lineNumber-1

WebReader tokenize the input (111). Used by WebReader class - parses the input file, building the Web structure (110) .
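
The queue-plus-stack arrangement can be exercised on its own. This is a sketch only: the TokenStream class and its method names are assumptions for the example, it reads from a string instead of a file, and it applies re.escape() to the command character, which the chunk above does not.

```python
import re

class TokenStream:
    """Token queue with a pushback stack and newline counting."""
    def __init__(self, text, command='@'):
        self.tokens = re.split('(%s.)' % re.escape(command), text)
        self.pushback = []
        self.token = ""
        self.line_number = 1

    def more(self):
        return bool(self.tokens or self.pushback)

    def next(self):
        # Count the newlines of the token we are leaving behind.
        self.line_number += self.token.count('\n')
        if self.pushback:
            self.token = self.pushback.pop()
        else:
            self.token = self.tokens.pop(0)
        return self.token

    def push_back(self, token):
        self.pushback.append(token)

ts = TokenStream("line one\nline two @d name")
first = ts.next()
second = ts.next()
line_at_command = ts.line_number
```

The line number advances only when the parser moves past a token, so the @d command here is reported on line 2.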

The location() method provides the file name and range of lines for a particular command. This allows error messages as well as tangled or woven output to correctly reference the original input files.

WebReader location in the input stream (112)=



def location( self ):
    return ( self.fileName, self.lineNumber, self.lineNumber+self.token.count("\n") )

WebReader location in the input stream (112). Used by WebReader class - parses the input file, building the Web structure (110) .

Command recognition is done via a Chain of Command-like design. There are two conditions: the command string is recognized or it is not recognized.

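If the command is recognized, handleCommand() either starts a new Chunk (for a major command) or adds a Command to the current Chunk (for a minor command), and returns a true result.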

If the command is not recognized, handleCommand() returns false.

A subclass can override handleCommand() to (1) call this superclass version; (2) if the command is unknown to the superclass, then the subclass can attempt to process it; (3) if the command is unknown to both classes, then return false. Either a subclass will handle it, or the default activity taken by load() is to treat the command as text, but also issue a warning.

WebReader handle a command string (113)=



def handleCommand( self, token ):
    major commands segment the input into separate Chunks (114) 
    minor commands add Commands to the current Chunk (119) 
    elif token[:2] in (self.cmdlcurl,self.cmdlbrak):
        # These should be consumed as part of @o and @d parsing
        raise Error('Extra %r (possibly missing chunk name)' % token, self.aChunk)
    else:
        return None # did not recognize the command
    return 1 # did recognize the command

WebReader handle a command string (113). Used by WebReader class - parses the input file, building the Web structure (110) .
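
The three-step override protocol described above might look like the following. This is an illustrative sketch: BaseReader, ExtendedReader and the '@x' command are hypothetical names invented for the example and are not part of pyWeb.

```python
class BaseReader:
    """Stand-in for a reader that knows only two commands."""
    def __init__(self):
        self.handled = []

    def handle_command(self, token):
        if token[:2] in ('@d', '@o'):
            self.handled.append(('base', token))
            return True
        return False   # did not recognize the command

class ExtendedReader(BaseReader):
    def handle_command(self, token):
        # (1) Give the superclass the first chance.
        if BaseReader.handle_command(self, token):
            return True
        # (2) Try the commands this subclass adds ('@x' is hypothetical).
        if token[:2] == '@x':
            self.handled.append(('extended', token))
            return True
        # (3) Unknown to both classes.
        return False

reader = ExtendedReader()
results = [reader.handle_command(t) for t in ('@d', '@x', '@z')]
```

A caller such as load() only needs the true or false result to decide whether to fall back to treating the token as text.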

The following sequence of if-elif statements identifies the major commands that partition the input into separate Chunks.

major commands segment the input into separate Chunks (114)=



if token[:2] == self.cmdo or token[:2] == self.cmdo_big:
    start an OutputChunk, adding it to the web (115) 
elif token[:2] == self.cmdd:
    start a NamedChunk or NamedDocumentChunk, adding it to the web (116) 
elif token[:2] == self.cmdi:
    import another file (117) 
elif token[:2] in (self.cmdrcurl,self.cmdrbrak):
    finish a chunk, start a new Chunk adding it to the web (118) 

major commands segment the input into separate Chunks (114). Used by WebReader handle a command string (113) .

An output chunk has the form @o name @{ content @}. We use the first two tokens to name the OutputChunk. We simply expect the @{ separator. We then attach all subsequent commands to this chunk while waiting for the final @} token to end the chunk.

start an OutputChunk, adding it to the web (115)=



file= self.nextToken().strip()
self.aChunk= OutputChunk( file )
self.aChunk.webAdd( self.theWeb )
self.aChunk.big_definition= token[:2] == self.cmdo_big
self.expect( (self.cmdlcurl,) )
# capture an OutputChunk up to @}

start an OutputChunk, adding it to the web (115). Used by major commands segment the input into separate Chunks (114) .

A named chunk has the form @d name @{ content @} for code and @d name @[ content @] for document source. We use the first two tokens to name the NamedChunk or NamedDocumentChunk. We expect either the @{ or @[ separator, and use the actual token found to choose which subclass of Chunk to create. We then attach all subsequent commands to this chunk while waiting for the final @} or @] token to end the chunk.

start a NamedChunk or NamedDocumentChunk, adding it to the web (116)=



name= self.nextToken().strip()
# next token is @{ or @[
brack= self.expect( (self.cmdlcurl,self.cmdlbrak) )
if brack == self.cmdlcurl: 
    self.aChunk= NamedChunk( name )
else: 
    self.aChunk= NamedDocumentChunk( name )
self.aChunk.webAdd( self.theWeb )
# capture a NamedChunk up to @} or @]

start a NamedChunk or NamedDocumentChunk, adding it to the web (116). Used by major commands segment the input into separate Chunks (114) .

An import command has the unusual form of @i name, with no trailing separator. When we encounter the @i token, the next token will start with the file name, but may continue with an anonymous chunk. We require that all @i commands occur at the end of a line, and break on the '\n' which must occur after the file name. This permits file names with embedded spaces.

Once we have split the file name away from the rest of the following anonymous chunk, we push the following token back into the token stream, so that it will be the first token examined at the top of the load() loop.

We create a child WebReader instance to process the included file. The entire file is loaded into the current Web instance. A new, empty Chunk is created at the end of the file so that processing can resume with an anonymous Chunk.

import another file (117)=



# break this token on the '\n' and pushback the new token.
next= self.nextToken().split('\n',1)
self.pushBack('\n')
if len(next) > 1:
    self.pushBack( '\n'.join(next[1:]) )
incFile= next[0].strip()
try:
    include= WebReader( incFile, parent=self )
    include.load( self.theWeb )
except (Error,IOError),e:
    theLog.event( ErrorEvent, 
        "Problems with included file %s, output is incomplete." 
        % incFile )
    # Discretionary - sometimes we want total failure
    if self.cmdi in self.permitList: pass
    else: raise
self.aChunk= Chunk()
self.aChunk.webAdd( self.theWeb )

import another file (117). Used by major commands segment the input into separate Chunks (114) .
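
The split of the token that follows @i can be shown on a plain string. The helper split_include is an assumption made for this example; it isolates only the string handling and leaves out the pushback and the child WebReader.

```python
def split_include(token):
    """Split the token after @i into (file name, remaining text)."""
    parts = token.split('\n', 1)            # break on the first newline only
    name = parts[0].strip()                 # embedded spaces are preserved
    rest = parts[1] if len(parts) > 1 else ''
    return name, rest

name, rest = split_include(" my file.w\nMore text follows.\n")
```

Only the surrounding whitespace is stripped, which is why a file name with embedded spaces survives intact.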

When a @} or @] is found, this finishes a named chunk. The next text is therefore part of an anonymous chunk.

Note that no check is made to assure that the previous Chunk was indeed a named chunk or output chunk started with @{ or @[. To do this, an attribute would be needed for each Chunk subclass that indicated if a trailing bracket was necessary. For the base Chunk class, this would be false, but for all other subclasses of Chunk, this would be true.

finish a chunk, start a new Chunk adding it to the web (118)=



self.aChunk= Chunk()
self.aChunk.webAdd( self.theWeb )

finish a chunk, start a new Chunk adding it to the web (118). Used by major commands segment the input into separate Chunks (114) .

The following sequence of elif statements identifies the minor commands that add Command instances to the current open Chunk.

minor commands add Commands to the current Chunk (119)=



elif token[:2] == self.cmdpipe:
    assign user identifiers to the current chunk (120) 
elif token[:2] == self.cmdf:
    self.aChunk.append( FileXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdm:
    self.aChunk.append( MacroXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdu:
    self.aChunk.append( UserIdXrefCommand(self.lineNumber) )
elif token[:2] == self.cmdlangl:
    add a reference command to the current chunk (121) 
elif token[:2] == self.cmdlexpr:
    add an expression command to the current chunk (122) 
elif token[:2] == self.cmdcmd:
    double at-sign replacement, append this character to previous TextCommand (123) 

minor commands add Commands to the current Chunk (119). Used by WebReader handle a command string (113) .

User identifiers occur after a @| in a NamedChunk.

Note that no check is made to assure that the previous Chunk was indeed a named chunk or output chunk started with @{. To do this, an attribute would be needed for each Chunk subclass that indicated if user identifiers are permitted. For the base Chunk class, this would be false, but for the NamedChunk class and OutputChunk class, this would be true.

assign user identifiers to the current chunk (120)=



# variable references at the end of a NamedChunk
# aChunk must be subclass of NamedChunk
# These are accumulated and expanded by @u reference
self.aChunk.setUserIDRefs( self.nextToken().strip() )

assign user identifiers to the current chunk (120). Used by minor commands add Commands to the current Chunk (119) .

A reference command has the form @<name @>. We accept three tokens from the input; the middle token is the referenced name.

add a reference command to the current chunk (121)=



# get the name, introduce into the named Chunk dictionary
expand= self.nextToken().strip()
self.expect( (self.cmdrangl,) )
self.theWeb.addDefName( expand )
self.aChunk.append( ReferenceCommand( expand, self.lineNumber ) )

add a reference command to the current chunk (121). Used by minor commands add Commands to the current Chunk (119) .

An expression command has the form @( Python Expression @). We accept three tokens from the input; the middle token is the expression.

There are two alternative semantics for an embedded expression.

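We use the Immediate Execution semantics: the expression is evaluated as soon as it is read, and its result is appended to the current Chunk as text.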

add an expression command to the current chunk (122)=



# get the Python expression, create the expression command
expression= self.nextToken()
self.expect( (self.cmdrexpr,) )
try:
    theLocation= self.location()
    theWebReader= self
    theFile= self.fileName
    thisApplication= sys.argv[0]
    result= str(eval( expression ))
except Exception,e:
    result= '!!!Exception: %s' % e
    theLog.event( ReadEvent, 'Failure to process %r: result is %s' % ( expression, e ) )
self.aChunk.appendText( result, self.lineNumber )

add an expression command to the current chunk (122). Used by minor commands add Commands to the current Chunk (119) .
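
Immediate evaluation with an error marker can be sketched as follows. The function evaluate_expression and the explicit context dictionary are assumptions for the example; the chunk above instead evaluates the expression in the local scope of the method, where names such as theFile are ordinary local variables.

```python
def evaluate_expression(expression, context):
    """Evaluate `expression` now; return its text or an error marker."""
    try:
        # Names in the expression are looked up in `context`.
        return str(eval(expression, {}, context))
    except Exception as e:
        return '!!!Exception: %s' % e

context = {'theFile': 'example.w'}
ok = evaluate_expression("theFile.upper()", context)
bad = evaluate_expression("1/0", context)
```

Note that eval() runs arbitrary Python, so this approach is only appropriate for input files that are trusted.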

A double command sequence ('@@', when the command is an '@') has the usual meaning of '@' in the input stream. We do this via the appendChar() method of the current Chunk. This will append the character on the end of the most recent TextCommand, and then put the Chunk in a state where the next TextCommand is also appended to the most recent, creating a single TextCommand with the '@' in it.

double at-sign replacement, append this character to previous TextCommand (123)=



# replace with '@' here and now!
# Put this at the end of the previous chunk
# AND make sure the next chunk is appended to this.
self.aChunk.appendChar( self.command, self.lineNumber )

double at-sign replacement, append this character to previous TextCommand (123). Used by minor commands add Commands to the current Chunk (119) .

The expect() method examines the next token to see if it is the expected string. If this is not found, a standard type of error message is written.

The load() method reads the entire input file as a sequence of tokens, split up by the openSource() method. Each token that appears to be a command is passed to the handleCommand() method. If the handleCommand() method returns a true result, the command was recognized and placed in the Web. If handleCommand() returns a false result, the command was unknown, and some default behavior is used.

The load() method takes an optional permit variable. This encodes commands where failure is permitted. Currently, only the @i command can be set to permit failure. This allows including a file that does not yet exist. The primary application of this option is when weaving test output. The first pass of pyWeb tangles the program source files; they are then run to create test output; the second pass of pyWeb weaves this test output into the final document via the @i command.

WebReader load the web (124)=



def expect( self, tokens ):
    if not self.moreTokens():
        raise Error("At %r: end of input, %r not found" % (self.location(),tokens) )
    t= self.nextToken()
    if t not in tokens:
        raise Error("At %r: expected %r, found %r" % (self.location(),tokens,t) )
    return t
def load( self, aWeb ):
    self.theWeb= aWeb
    self.aChunk= Chunk()
    self.aChunk.webAdd( self.theWeb )
    self.openSource()
    while self.moreTokens():
        token= self.nextToken()
        if len(token) >= 2 and token.startswith(self.command):
            if self.handleCommand( token ):
                continue
            else:
                other command-like sequences are appended as a TextCommand (125) 
        elif token:
            # accumulate non-empty block of text in the current chunk
            self.aChunk.appendText( token, self.lineNumber )

WebReader load the web (124). Used by WebReader class - parses the input file, building the Web structure (110) .

other command-like sequences are appended as a TextCommand (125)=



theLog.event( ReadEvent, 'Unknown @-command in input: %r' % token )
self.aChunk.appendText( token, self.lineNumber )

other command-like sequences are appended as a TextCommand (125). Used by WebReader load the web (124) .

The command character can be changed to permit some flexibility when working with languages that make extensive use of the @ symbol, e.g., Perl. The initialization of the WebReader is based on the selected command character.

WebReader command literals (126)=



# major commands
self.cmdo= self.command+'o'
self.cmdo_big= self.command+'O'
self.cmdd= self.command+'d'
self.cmdlcurl= self.command+'{'
self.cmdrcurl= self.command+'}'
self.cmdlbrak= self.command+'['
self.cmdrbrak= self.command+']'
self.cmdi= self.command+'i'
# minor commands
self.cmdlangl= self.command+'<'
self.cmdrangl= self.command+'>'
self.cmdpipe= self.command+'|'
self.cmdlexpr= self.command+'('
self.cmdrexpr= self.command+')'
self.cmdf= self.command+'f'
self.cmdm= self.command+'m'
self.cmdu= self.command+'u'
self.cmdcmd= self.command+self.command

WebReader command literals (126). Used by WebReader class - parses the input file, building the Web structure (110) .
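
The effect of choosing a different command character can be sketched outside the class. The function command_literals and the choice of '!' are assumptions for this example. The split pattern here also applies re.escape() to the command character; that is an addition, made because a character with a special meaning in regular expressions would otherwise change the pattern.

```python
import re

def command_literals(command='@'):
    """Build the two-character command strings for a command character."""
    suffixes = {
        'cmdo': 'o', 'cmdo_big': 'O', 'cmdd': 'd', 'cmdi': 'i',
        'cmdlcurl': '{', 'cmdrcurl': '}', 'cmdlbrak': '[', 'cmdrbrak': ']',
        'cmdlangl': '<', 'cmdrangl': '>', 'cmdpipe': '|',
        'cmdlexpr': '(', 'cmdrexpr': ')',
        'cmdf': 'f', 'cmdm': 'm', 'cmdu': 'u',
    }
    literals = {key: command + s for key, s in suffixes.items()}
    literals['cmdcmd'] = command + command   # the doubled-character escape
    return literals

lit = command_literals('!')
tokens = re.split('(%s.)' % re.escape('!'), "my @array = (1); !d name !{ x !}")
```

With '!' as the command character, the Perl text containing @array passes through as ordinary text.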

Operation Class Hierarchy

This application performs three major operations: loading the document web, weaving and tangling. Generally, the use case is to perform a load, weave and tangle. However, a less common use case is to first load and tangle output files, run a regression test and then load and weave a result that includes the test output file.

The -x option excludes one of the two output operations. The -xw option excludes the weave pass, doing only the tangle operation. The -xt option excludes the tangle pass, doing only the weave operation.

This two-pass operation might be embedded in the following type of Python program.

import pyweb, os
pyweb.tangle( "source.w" )
os.system( "python source.py >source.log" )
pyweb.weave( "source.w" )

The first step runs pyWeb, excluding the final weaving pass. The second step runs the tangled program, source.py, and produces test results in a log file, source.log. The third step runs pyWeb excluding the tangle pass. This produces a final document that includes the source.log test results.

To accomplish this, we provide a class hierarchy that defines the various operations of the pyWeb application. This class hierarchy defines an extensible set of fundamental operations. This gives us the flexibility to create a simple sequence of operations and execute any combination of these. It eliminates the need for a forest of if-statements to determine precisely what will be done.

Each operation has the potential to update the state of the overall application. The partner of this command hierarchy is the Application class, which defines the application options, inputs and results.

Operation class hierarchy - used to describe basic operations of the application (127)=



Operation superclass has common features of all operations (128) 
MacroOperation subclass that holds a sequence of other operations (131) 
WeaveOperation subclass initiates the weave operation (135) 
TangleOperation subclass initiates the tangle operation (138) 
LoadOperation subclass loads the document web (141) 

Operation class hierarchy - used to describe basic operations of the application (127). Used by Base Class Definitions (6) .

Operation Class

The Operation class embodies the basic operations of pyWeb. The intent of this hierarchy is to provide both an easily extended way of adding new operations and an easily specified list of operations for a particular run of pyWeb.

Usage

The overall process of the application is defined by an instance of Operation. This instance may be the WeaveOperation instance, the TangleOperation instance or a MacroOperation instance.

The instance is constructed during parsing of the input parameters. Then the Operation class perform() method is called to actually perform the operation. There are three standard Operation instances available: an instance that is a macro and does both tangling and weaving, an instance that excludes tangling, and an instance that excludes weaving. These correspond to the command-line options.

anOp= SomeOperation( parameters )
anOp.perform()

Design

The Operation class is the superclass for all operations.

Implementation

Operation superclass has common features of all operations (128)=



class Operation:
    """An operation performed by pyWeb."""
    def __init__( self, name ):
        self.name= name
        self.start= None
    def __str__( self ):
        return self.name
    Operation perform method actually performs the operation (129) 
    Operation final summary method (130) 

Operation superclass has common features of all operations (128). Used by Operation class hierarchy - used to describe basic operations of the application (127) .

The perform() method does the real work of the operation. For the superclass, it merely logs a message and records the start time. This is overridden by a subclass.

Operation perform method actually performs the operation (129)=



def perform( self, theWeb, theApplication ):
    theLog.event( ExecutionEvent, "Starting %s" % self.name )
    self.start= time.clock()

Operation perform method actually performs the operation (129). Used by Operation superclass has common features of all operations (128) .

The summary() method returns some basic processing statistics for this operation.

Operation final summary method (130)=



def duration( self ):
    """Return duration of the operation."""
    # Windows clock() function is funny.
    return (self.start and time.clock()-self.start) or 0
def summary( self, theWeb, theApplication ):
    return "%s in %0.1f sec." % ( self.name, self.duration() )

Operation final summary method (130). Used by Operation superclass has common features of all operations (128) .

MacroOperation Class

A MacroOperation defines a composite operation; it is a sequence of other operations. When the macro is performed, it delegates to the sub-operations.

Usage

The instance is created during parsing of input parameters. An instance of this class is one of the three standard operations available; it generally is the default, "do everything" operation.

Design

This class overrides the perform() method of the superclass. It also adds an append() method that is used to construct the sequence of operations.

Implementation

MacroOperation subclass that holds a sequence of other operations (131)=



class MacroOperation( Operation ):
    """An operation composed of a sequence of other operations."""
    def __init__( self, opSequence=None ):
        Operation.__init__( self, "Macro" )
        if opSequence: self.opSequence= opSequence
        else: self.opSequence= []
    def __str__( self ):
        return "; ".join( [ x.name for x in self.opSequence ] )
    MacroOperation perform method delegates the sequence of operations (132) 
    MacroOperation append adds a new operation to the sequence (133) 
    MacroOperation summary summarizes each step (134) 

MacroOperation subclass that holds a sequence of other operations (131). Used by Operation class hierarchy - used to describe basic operations of the application (127) .

Since the macro perform() method delegates to other operations, it is possible to short-cut argument processing by using the Python *args construct to accept all arguments and pass them to each sub-operation.

MacroOperation perform method delegates the sequence of operations (132)=



def perform( self, theWeb, theApplication ):
    for o in self.opSequence:
        o.perform(theWeb,theApplication)

MacroOperation perform method delegates the sequence of operations (132). Used by MacroOperation subclass that holds a sequence of other operations (131) .

Since this class is essentially a wrapper around the built-in sequence type, we delegate sequence related operations directly to the underlying sequence.

MacroOperation append adds a new operation to the sequence (133)=



def append( self, anOperation ):
    self.opSequence.append( anOperation )

MacroOperation append adds a new operation to the sequence (133). Used by MacroOperation subclass that holds a sequence of other operations (131) .

The summary() method returns some basic processing statistics for each step of this operation.

MacroOperation summary summarizes each step (134)=



def summary( self, theWeb, theApplication ):
    return ", ".join( [ x.summary(theWeb,theApplication) for x in self.opSequence ] )

MacroOperation summary summarizes each step (134). Used by MacroOperation subclass that holds a sequence of other operations (131) .
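The composite structure described above can be tried in isolation. The following is a minimal, self-contained sketch rather than the pyWeb source: the simplified Operation stub, the Step class, and the strings standing in for the Web and Application arguments are all assumptions made for the example.

```python
class Operation:
    """Stand-in for the pyWeb Operation superclass (assumed, simplified)."""
    def __init__(self, name):
        self.name = name
    def perform(self, theWeb, theApplication):
        raise NotImplementedError
    def summary(self, theWeb, theApplication):
        return "did not %s" % (self.name,)

class Step(Operation):
    """Hypothetical leaf operation that only records that it ran."""
    def __init__(self, name):
        Operation.__init__(self, name)
        self.ran = False
    def perform(self, theWeb, theApplication):
        self.ran = True
    def summary(self, theWeb, theApplication):
        if self.ran:
            return "%s done" % (self.name,)
        return Operation.summary(self, theWeb, theApplication)

class MacroOperation(Operation):
    """An operation composed of a sequence of other operations."""
    def __init__(self, opSequence=None):
        Operation.__init__(self, "Macro")
        self.opSequence = opSequence if opSequence else []
    def __str__(self):
        return "; ".join(x.name for x in self.opSequence)
    def append(self, anOperation):
        self.opSequence.append(anOperation)
    def perform(self, theWeb, theApplication):
        # Delegate to each sub-operation in order.
        for o in self.opSequence:
            o.perform(theWeb, theApplication)
    def summary(self, theWeb, theApplication):
        return ", ".join(x.summary(theWeb, theApplication) for x in self.opSequence)

macro = MacroOperation([Step("Load")])
macro.append(Step("Tangle"))
before = macro.summary("web", "app")   # placeholder arguments
macro.perform("web", "app")
after = macro.summary("web", "app")
```

Because the macro only iterates over its sequence, any object with perform() and summary() methods can take part, including another MacroOperation.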

WeaveOperation Class

The WeaveOperation defines the operation of weaving. This operation logs a message, and invokes the weave() method of the Web instance. This method also includes the basic decision on which weaver to use. If a Weaver was specified on the command line, this instance is used. Otherwise, the first few characters are examined and a weaver is selected.

Usage

An instance is created during parsing of input parameters. The instance of this class is one of the standard operations available; it is the "exclude tangling" option and it is also an element of the "do everything" macro.

Design

This class overrides the perform() method of the superclass.

Implementation

WeaveOperation subclass initiates the weave operation (135)=



class WeaveOperation( Operation ):
    """An operation that weaves a document."""
    def __init__( self ):
        Operation.__init__( self, "Weave" )
    WeaveOperation perform method does weaving of the document file (136) 
    WeaveOperation summary method provides line counts (137) 

WeaveOperation subclass initiates the weave operation (135). Used by Operation class hierarchy - used to describe basic operations of the application (127) .

The language is picked just prior to weaving. It is either (1) the language specified on the command line, or, (2) if no language was specified, a language is selected based on the first few characters of the input.

Weaving can only raise an exception when there is a reference to a chunk that is never defined.

WeaveOperation perform method does weaving of the document file (136)=



def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    if not theApplication.theWeaver: 
        # Examine first few chars of first chunk of web to determine language
        theApplication.theWeaver= theWeb.language() 
    try:
        theWeb.weave( theApplication.theWeaver )
    except Error,e:
        theLog.event( ErrorEvent, 
            "Problems weaving document from %s (weave file is faulty)." 
            % theWeb.sourceFileName )
        raise

WeaveOperation perform method does weaving of the document file (136). Used by WeaveOperation subclass initiates the weave operation (135) .
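The two-way choice of weaver can be illustrated with a small stand-alone sketch. The Web's language() method is defined elsewhere in this document; the functions below are only an assumption about how such a check might look, using a hypothetical rule that input starting with '<' is HTML and anything else is LaTeX. The names guess_language and choose_weaver are not part of pyWeb.

```python
def guess_language(first_text):
    """Hypothetical rule: a leading '<' suggests HTML, otherwise LaTeX."""
    if first_text.lstrip().startswith("<"):
        return "HTML"
    return "Latex"

def choose_weaver(command_line_weaver, first_text):
    """Prefer the weaver given on the command line; otherwise examine the input."""
    if command_line_weaver:
        return command_line_weaver
    return guess_language(first_text)

from_html = choose_weaver(None, "<html><head>")
from_latex = choose_weaver(None, "\\documentclass{article}")
explicit = choose_weaver("Latex", "<html><head>")
```

The command-line choice always wins; the input is examined only when no weaver was named.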

The summary() method returns some basic processing statistics for the weave operation.

WeaveOperation summary method provides line counts (137)=



def summary( self, theWeb, theApplication ):
    if theApplication.theWeaver and theApplication.theWeaver.linesWritten > 0:
        return "%s %d lines in %0.1f sec." % ( self.name, theApplication.theWeaver.linesWritten, self.duration() )
    return "did not %s" % ( self.name, )

WeaveOperation summary method provides line counts (137). Used by WeaveOperation subclass initiates the weave operation (135) .

TangleOperation Class

The TangleOperation defines the operation of tangling. This operation logs a message, and invokes the tangle() method of the Web instance, passing the Tangler held by the Application. If a Tangler was specified on the command line, that instance is used; otherwise, the default TanglerMake is used.

Usage

An instance is created during parsing of input parameters. The instance of this class is one of the standard operations available; it is the "exclude weaving" option, and it is also an element of the "do everything" macro.

Design

This class overrides the perform() method of the superclass.

Implementation

TangleOperation subclass initiates the tangle operation (138)=



class TangleOperation( Operation ):
    """An operation that tangles the output files."""
    def __init__( self ):
        Operation.__init__( self, "Tangle" )
    TangleOperation perform method does tangling of the output files (139) 
    TangleOperation summary method provides total lines tangled (140) 

TangleOperation subclass initiates the tangle operation (138). Used by Operation class hierarchy - used to describe basic operations of the application (127) .

Tangling can only raise an exception when a cross reference request (@f, @m or @u) occurs in a program code chunk. Program code chunks are defined with any of @d, @o or @O and use @{@} brackets.

TangleOperation perform method does tangling of the output files (139)=



def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    try:
        theWeb.tangle( theApplication.theTangler )
    except Error,e:
        theLog.event( ErrorEvent, 
            "Problems tangling outputs from %s (tangle files are faulty)." 
            % theWeb.sourceFileName )
        raise

TangleOperation perform method does tangling of the output files (139). Used by TangleOperation subclass initiates the tangle operation (138) .

The summary() method returns some basic processing statistics for the tangle operation.

TangleOperation summary method provides total lines tangled (140)=



def summary( self, theWeb, theApplication ):
    if theApplication.theTangler and theApplication.theTangler.linesWritten > 0:
        return "%s %d lines in %0.1f sec." % ( self.name, theApplication.theTangler.linesWritten, self.duration() )
    return "did not %s" % ( self.name, )

TangleOperation summary method provides total lines tangled (140). Used by TangleOperation subclass initiates the tangle operation (138) .

LoadOperation Class

The LoadOperation defines the operation of loading the web structure. This operation uses the application's webReader to actually do the load.

Usage

An instance is created during parsing of the input parameters. An instance of this class is part of each of the weave, tangle and "do everything" operations.

Design

This class overrides the perform() method of the superclass.

Implementation

LoadOperation subclass loads the document web (141)=



class LoadOperation( Operation ):
    """An operation that loads the source web for a document."""
    def __init__( self ):
        Operation.__init__( self, "Load" )
    LoadOperation perform method does tangling of the output files (142) 
    LoadOperation summary provides lines read (143) 

LoadOperation subclass loads the document web (141). Used by Operation class hierarchy - used to describe basic operations of the application (127) .

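Trying to load the web involves two steps: reading the source with the WebReader's load() method, and building the cross-references with the Web's createUsedBy() method. Either step can raise exceptions due to incorrect inputs.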

LoadOperation perform method does tangling of the output files (142)=



def perform( self, theWeb, theApplication ):
    Operation.perform( self, theWeb, theApplication )
    try:
        theApplication.webReader.load( theWeb )
        theWeb.createUsedBy()
    except (Error,IOError),e:
        theLog.event( ErrorEvent, 
            "Problems with source file %s, no output produced." 
            % theWeb.sourceFileName )
        raise

LoadOperation perform method does tangling of the output files (142). Used by LoadOperation subclass loads the document web (141) .
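The log-then-re-raise pattern used by the perform() methods can be sketched on its own. This is an illustration under stated assumptions, not the pyWeb code: the Error class is a stand-in, a plain list replaces the log, and load_web, bad_load and ok_step are hypothetical names.

```python
class Error(Exception):
    """Stand-in for the pyWeb Error class (assumed)."""

messages = []   # stand-in for the log

def load_web(reader_load, create_used_by, sourceFileName):
    """Run both load steps; on failure, log one message and re-raise."""
    try:
        reader_load()
        create_used_by()
    except (Error, IOError):
        messages.append(
            "Problems with source file %s, no output produced." % sourceFileName)
        raise   # the caller decides how to report the underlying error

def bad_load():
    raise Error("unknown chunk", "line 12")

def ok_step():
    pass

try:
    load_web(bad_load, ok_step, "example.w")
    outcome = "loaded"
except Error as e:
    outcome = "failed: %s" % (e.args[0],)
```

The operation adds context about the state of the outputs, while the original exception still reaches the outer handler unchanged.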

The summary() method returns some basic processing statistics for the load operation.

LoadOperation summary provides lines read (143)=



def summary( self, theWeb, theApplication ):
    return "%s %d lines in %0.1f sec." % ( self.name, theApplication.webReader.totalLines(), self.duration() )

LoadOperation summary provides lines read (143). Used by LoadOperation subclass loads the document web (141) .

The Application Class

Design

The Application class is provided so that the Operation instances have an overall application to update. This allows the WeaveOperation to provide the selected Weaver instance to the application. It also provides a central location for the various options and alternatives that might be accepted from the command line.

The constructor sets the default options for weaving and tangling.

The parseArgs() method uses the sys.argv sequence to parse the command line arguments and update the options. This allows a program to pre-process the arguments, passing other arguments to this module.

The process() method processes a list of files. This is either the list of files passed as an argument, or it is the list of files parsed by the parseArgs() method.

The parseArgs() and process() functions are separated so that a module can include this one, bypass command-line parsing, yet still perform the basic operations simply and consistently.

For example:

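import sys, getopt
import pyweb
a= pyweb.Application( myEmitterFactory )  # placeholder: your own EmitterFactory instance
opts,args= getopt.getopt( sys.argv[1:], 'ab:' )  # placeholder: your own option letters
for o,v in opts:
    pass  # placeholder: your own argument parsing
a.process( args )  # the list of files to process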

The main() function creates an Application instance and calls the parseArgs() and process() methods to provide the expected default behavior for this module when it is used as the main program.

Implementation

Application Class (144)=



class Application:
    def __init__( self, ef=None ):
        Application default options (145) 
    Application parse command line (146) 
    Application class process all files (147) 

Application Class (144). Used by pyweb.py (1) .

The first part of parsing the command line is setting default values that apply when parameters are omitted. The default values are set as follows:

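  • emitterFact is the EmitterFactory used to create emitters from their names; by default, a new EmitterFactory.
  • theTangler is the Tangler instance; by default, a TanglerMake.
  • theWeaver is the Weaver instance; by default None, so that the weaver is chosen by examining the input.
  • commandChar is the command character; by default '@'.
  • doWeave and doTangle are instances of Operation that describe the basic operations of the application.
  • theOperation is an instance of Operation that describes the default overall operation: both tangle and weave.
  • permitList provides a list of commands that are permitted to fail. Typically this is empty, or contains @i to allow the include command to fail.
  • files is the final list of argument files from the command line; these will be processed unless overridden in the call to process().
  • webReader is the WebReader instance created for the current input file.

    Application default options (145)=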

    
    
    if not ef: ef= EmitterFactory()
    self.emitterFact= ef
    self.theTangler= ef.mkEmitter( 'TanglerMake' )
    self.theWeaver= None
    self.commandChar= '@'
    loadOp= LoadOperation()
    weaveOp= WeaveOperation()
    tangleOp= TangleOperation()
    self.doWeave= MacroOperation( [loadOp,weaveOp] )
    self.doTangle= MacroOperation( [loadOp,tangleOp] )
    self.theOperation= MacroOperation( [loadOp,tangleOp,weaveOp] )
    self.permitList= []
    self.files= []
    self.webReader= None
    
    

    Application default options (145). Used by Application Class (144) .

    The algorithm for parsing the command line parameters uses the built-in getopt module. This module has a getopt() function that accepts the original arguments and normalizes them into two sequences: options and arguments. The options sequence contains tuples of option name and option value.

    Application parse command line (146)=

    
    
    def parseArgs( self, argv ):
        global theLog
        opts, self.files = getopt.getopt( argv[1:], 'hvsdc:w:t:x:p:' )
        for o,v in opts:
            if o == '-c': 
                theLog.event( OptionsEvent, "Setting command character to %r" % v )
                self.commandChar= v
            elif o == '-w': 
                theLog.event( OptionsEvent, "Setting weaver to %r" % v )
                self.theWeaver= self.emitterFact.mkEmitter( v )
            elif o == '-t': 
                theLog.event( OptionsEvent, "Setting tangler to %r" % v )
                self.theTangler= self.emitterFact.mkEmitter( v )
            elif o == '-x':
                if v.lower().startswith('w'): # skip weaving
                    self.theOperation= self.doTangle
                elif v.lower().startswith('t'): # skip tangling
                    self.theOperation= self.doWeave
                else:
                    raise Exception( "Unknown -x option %r" % v )
            elif o == '-p':
                # save permitted errors, usual case is -pi to permit include errors
                self.permitList= [ '%s%s' % ( self.commandChar, c ) for c in v ]
            elif o == '-h': print __doc__
            elif o == '-v': theLog= Logger( verbose )
            elif o == '-s': theLog= Logger( silent )
            elif o == '-d': theLog= Logger( debug )
            else:
                raise Exception('unknown option %r' % o)
    
    

    Application parse command line (146). Used by Application Class (144) .
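The way getopt() normalizes a command line can be seen in a short stand-alone sketch. The argument list below is made up for the illustration; the option string is the one used by parseArgs(), and only the -c and -p branches are reproduced.

```python
import getopt

# Hypothetical command line: change the command character, skip weaving,
# and permit the include command to fail.
argv = ["pyweb.py", "-c", "!", "-xw", "-pi", "doc.w", "other.w"]
opts, files = getopt.getopt(argv[1:], "hvsdc:w:t:x:p:")
# opts holds (option, value) tuples; files holds the remaining arguments.

commandChar = "@"
permitList = []
for o, v in opts:
    if o == "-c":
        commandChar = v
    elif o == "-p":
        # One permitted command per character of the option value.
        permitList = ["%s%s" % (commandChar, c) for c in v]
```

Options are handled in the order given, so in this sketch a -c option only affects the permitted commands when it appears before -p.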

    The process() function uses the current Application settings to process each file as follows:

    1. Create a new WebReader for the Application, providing the parameters required to process the input file.
    2. Create a Web instance, w, and set the Web's sourceFileName from the WebReader's fileName.
    3. Perform the given command, typically a MacroOperation, which does some combination of load, tangle the output files and weave the final document in the target language; if necessary, examine the Web to determine the documentation language.
    4. Print a performance summary line that shows the lines processed and the elapsed time.
    5. In the event of failure in any of the major processing steps, a summary message is produced, to clarify the state of the output files, and the exception is reraised. The re-raising is done so that all exceptions are handled by the outermost main program.

    Application class process all files (147)=

    
    
    def process( self, theFiles=None ):
        for f in theFiles or self.files:
            self.webReader= WebReader( f, command=self.commandChar, permit=self.permitList )
            try:
                w= Web()
                w.sourceFileName= self.webReader.fileName
                self.theOperation.perform( w, self )
            except Error,e:
                print '>', e.args[0]
                for a in e.args[1:]:
                    print '...', a
            except IOError,e:
                print e
            theLog.event(SummaryEvent, 'pyWeb: %s' % self.theOperation.summary(w,self) )
    
    

    Application class process all files (147). Used by Application Class (144) .
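The error report printed by process() treats the first exception argument as the headline and any remaining arguments as detail lines. A minimal sketch of that formatting follows; the Error stand-in, the report() helper, and the sample arguments are assumptions for the example.

```python
class Error(Exception):
    """Stand-in for the pyWeb Error class (assumed)."""

def report(e):
    """Return the lines that would be printed for an error: headline, then details."""
    lines = ["> %s" % (e.args[0],)]
    for a in e.args[1:]:
        lines.append("... %s" % (a,))
    return lines

lines = report(Error("No definition for chunk", "main loop", "aFile.w:42"))
```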

    Module Initialization

    These global singletons define the logging that can be done. These are used in the logger configuration that is done as part of Module Initialization.

    Global singletons that define the activities for each Event class (148)=

    
    
    _report= LogReport()
    _discard= LogDiscard()
    _debug= LogDebug()
    
    

    Global singletons that define the activities for each Event class (148). Used by Logger classes - handle logging of status messages (88) .

    A Log configuration tuple lists a number of Event subclasses that are associated with an instance of a subclass of the LogActivity class. Each alternative configuration is a list of association tuples. A command-line parameter selects one of these configuration alternatives.

    A direct mapping from class to LogActivity instance might be a simpler block of Python programming, but is relatively long-winded when specifying the configuration.

    Module Initialization of global variables (149)=

    
    
    standard = [
        (('ErrorEvent','WarningEvent'), _report ),
        (('ExecutionEvent',), _debug ),
        (('InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
          'TangleStartEvent', 'TangleEndEvent','SummaryEvent'), _report),
        (('OptionsEvent', 'ReadEvent', 
          'WeaveEvent', 'TangleEvent'),  _discard ) ]
    silent = [
        (('ErrorEvent','SummaryEvent'), _report ),
        (('ExecutionEvent','WarningEvent',
          'InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
          'TangleStartEvent', 'TangleEndEvent'), _discard),
        (('OptionsEvent', 'ReadEvent', 
          'WeaveEvent', 'TangleEvent'),  _discard ) ]
    verbose = [
        (('ErrorEvent','WarningEvent'), _report ),
        (('ExecutionEvent',
          'InputEvent','WeaveStartEvent', 'WeaveEndEvent', 
          'TangleStartEvent', 'TangleEndEvent','SummaryEvent'), _report),
        (('OptionsEvent', 'ReadEvent',
          'WeaveEvent', 'TangleEvent'),  _report ) ]
    
    theLog= Logger(standard)
    
    

    Module Initialization of global variables (149). Used by pyweb.py (1) .

    Interface Functions

    There are three interface functions: main(), tangle() and weave(). The latter two are often termed "convenience functions".

    The main program is configured with an instance of an EmitterFactory. This instance is used to resolve names of weavers and tanglers when parsing parameters. Because the EmitterFactory is passed as an argument, another module can be constructed to add features to this module; see the example in the Emitter Factory section, above.

    The main() function creates an Application instance to parse the command line arguments and then process those arguments.

    Interface Functions (150)=

    
    
    def main( ef, argv ):
        a= Application( ef )
        a.parseArgs( argv )
        a.process()
    
    if __name__ == "__main__":
        main( EmitterFactory(), sys.argv )
    
    

    Interface Functions (150). Used by pyweb.py (1) .

    Two simple convenience functions are available for Python scripts that look like the following:

    import pyweb
    pyweb.tangle( 'aFile.w' )
    ...test procedure...
    pyweb.weave( 'aFile.w' )
    

    Interface Functions (151)+=

    
    
    def tangle( aFile ):
        """Tangle a single file, permitting errors on missing @i files."""
        a= Application( EmitterFactory() )
        a.permitList= [ '%si' % a.commandChar ]
        a.theOperation= a.doTangle
        a.process( [aFile] )
        
    def weave( aFile ):
        """Weave a single file."""
        a= Application( EmitterFactory() )
        a.theOperation= a.doWeave
        a.process( [aFile] )
    
    

    Interface Functions (151). Used by pyweb.py (1) .

    Indices

    Files

    pyweb.py:
    1

    Macros

    Application Class:
    144
    Application class process all files:
    147
    Application default options:
    145
    Application parse command line:
    146
    Base Class Definitions:
    6
    CVS Cruft and pyweb generator warning:
    5
    Chunk add to the web:
    57
    Chunk append a character:
    55
    Chunk append a command:
    54
    Chunk append text:
    56
    Chunk class:
    53
    Chunk class hierarchy - used to describe input chunks:
    52
    Chunk examination: starts with, matches pattern, references:
    58
    Chunk search for user identifiers done by iteration through each command:
    59
    Chunk tangle:
    62
    Chunk usedBy update done by iteration through each command:
    60
    Chunk weave:
    61
    CodeCommand class to contain a program source code block:
    77
    Command class hierarchy - used to describe individual commands:
    74
    Command superclass:
    75
    DOC String:
    3
    Emitter Factory - used to generate emitter instances from parameter strings:
    50 51
    Emitter class hierarchy - used to control output files:
    7
    Emitter core open, close and write:
    9
    Emitter doClose, to be overridden by subclasses:
    11
    Emitter doOpen, to be overridden by subclasses:
    10
    Emitter doWrite, to be overridden by subclasses:
    12
    Emitter indent control: set, clear and reset:
    16
    Emitter superclass:
    8
    Emitter write a block of code:
    13 14 15
    Error class - defines the errors raised:
    87
    FileXrefCommand class for an output file cross-reference:
    79
    Global singletons that define the activities for each Event class:
    148
    HTML code chunk begin:
    34
    HTML code chunk end:
    35
    HTML output file begin:
    36
    HTML output file end:
    37
    HTML reference to a chunk:
    40
    HTML references summary at the end of a chunk:
    38
    HTML simple cross reference markup:
    41
    HTML subclass of Weaver:
    33
    HTML write a line of code:
    39
    HTML write user id cross reference line:
    42
    Imports:
    2
    Interface Functions:
    150 151
    LaTex code chunk begin:
    26
    LaTex code chunk end:
    27
    LaTex doOpen override, close and write are the same as Weaver:
    25
    LaTex file output begin:
    28
    LaTex file output end:
    29
    LaTex reference to a chunk:
    32
    LaTex references summary at the end of a chunk:
    30
    LaTex subclass of Weaver:
    24
    LaTex write a line of code:
    31
    LoadOperation perform method does tangling of the output files:
    142
    LoadOperation subclass loads the document web:
    141
    LoadOperation summary provides lines read:
    143
    LogActivity strategy class hierarchy - including LogReport and LogDiscard:
    91
    Logger Event base class definitions:
    89
    Logger Event subclasses are unique to this application:
    90
    Logger classes - handle logging of status messages:
    88
    MacroOperation append adds a new operation to the sequence:
    133
    MacroOperation perform method delegates the sequence of operations:
    132
    MacroOperation subclass that holds a sequence of other operations:
    131
    MacroOperation summary summarizes each step:
    134
    MacroXrefCommand class for a named chunk cross-reference:
    80
    Module Initialization of global variables:
    149
    NamedChunk add to the web:
    65
    NamedChunk class:
    63
    NamedChunk tangle:
    67
    NamedChunk user identifiers set and get:
    64
    NamedChunk weave:
    66
    NamedDocumentChunk class:
    71
    NamedDocumentChunk tangle:
    73
    NamedDocumentChunk weave:
    72
    Operation class hierarchy - used to describe basic operations of the application:
    127
    Operation final summary method:
    130
    Operation perform method actually performs the operation:
    129
    Operation superclass has common features of all operations:
    128
    OutputChunk add to the web:
    69
    OutputChunk class:
    68
    OutputChunk weave:
    70
    ReferenceCommand class for chunk references:
    82
    ReferenceCommand refers to chunk:
    84
    ReferenceCommand resolve this chunk name if it was abbreviated:
    83
    ReferenceCommand tangle a referenced chunk:
    86
    ReferenceCommand weave a reference to a chunk:
    85
    Shell Escape:
    4
    TangleOperation perform method does tangling of the output files:
    139
    TangleOperation subclass initiates the tangle operation:
    138
    TangleOperation summary method provides total lines tangled:
    140
    Tangler code chunk begin:
    45
    Tangler code chunk end:
    46
    Tangler doOpen, doClose and doWrite overrides:
    44
    Tangler subclass of Emitter:
    43
    Tangler subclass which is make-sensitive:
    47
    TanglerMake doClose override, comparing temporary to original:
    49
    TanglerMake doOpen override, using a temporary file:
    48
    TextCommand class to contain a document text block:
    76
    UserIdXrefCommand class for a user identifier cross-reference:
    81
    WeaveOperation perform method does weaving of the document file:
    136
    WeaveOperation subclass initiates the weave operation:
    135
    WeaveOperation summary method provides line counts:
    137
    Weaver code chunk begin-end:
    20
    Weaver cross reference output methods:
    23
    Weaver doOpen, doClose and doWrite overrides:
    18
    Weaver document chunk begin-end:
    19
    Weaver file chunk begin-end:
    21
    Weaver reference command output:
    22
    Weaver subclass to control documentation production:
    17
    Web Chunk check reference counts are all one:
    102
    Web Chunk cross reference methods:
    101 103 104
    Web Chunk name resolution methods:
    98 99 100
    Web add a full chunk name, ignoring abbreviated names:
    94
    Web add a named macro chunk:
    96
    Web add an anonymous chunk:
    95
    Web add an output file definition chunk:
    97
    Web class - describes the overall "web" of chunks:
    92
    Web construction methods used by Chunks and WebReader:
    93
    Web determination of the language from the first chunk:
    107
    Web tangle the output files:
    108
    Web weave the output document:
    109
    WebReader class - parses the input file, building the Web structure:
    110
    WebReader command literals:
    126
    WebReader handle a command string:
    113
    WebReader load the web:
    124
    WebReader location in the input stream:
    112
    WebReader tokenize the input:
    111
    XrefCommand superclass for all cross-reference commands:
    78
    add a reference command to the current chunk:
    121
    add an expression command to the current chunk:
    122
    assign user identifiers to the current chunk:
    120
    collect all user identifiers from a given map into ux:
    105
    double at-sign replacement, append this character to previous TextCommand:
    123
    find user identifier usage and update ux from the given map:
    106
    finish a chunk, start a new Chunk adding it to the web:
    118
    import another file:
    117
    major commands segment the input into separate Chunks:
    114
    minor commands add Commands to the current Chunk:
    119
    other command-like sequences are appended as a TextCommand:
    125
    start a NamedChunk or NamedDocumentChunk, adding it to the web:
    116
    start an OutputChunk, adding it to the web:
    115

    User Identifiers

    Chunk:
    53 63 117 118 121 124
    CodeCommand:
    63 77
    Command:
    54 75 76 78 82
    Emitter:
    8 17 43 51
    Error:
    60 61 62 66 67 70 72 73 78 87 96 98 99 100 113 117 124 136 139 142 147
    ErrorEvent:
    90 117 136 139 142 149
    Event:
    88 89 90
    ExecutionEvent:
    88 89 129 149
    FileXrefCommand:
    79 119
    HTML:
    3 33 39 107
    InputEvent:
    90 111 149
    Latex:
    3 24 51 107
    LogActivity:
    91
    LogDiscard:
    91 148
    LogReport:
    89 91 148
    Logger:
    88 146 149
    MacroXrefCommand:
    80 119
    NamedChunk:
    63 68 71 116 120
    NamedDocumentChunk:
    71 116
    OptionsEvent:
    90 146 149
    OutputChunk:
    68 115
    ReadEvent:
    90 94 96 97 122 125 149
    ReferenceCommand:
    82 121
    SummaryEvent:
    49 90 147 149
    TangleEndEvent:
    44 49 90 149
    TangleEvent:
    90 108 149
    TangleStartEvent:
    44 48 90 149
    Tangler:
    43 47 50
    TanglerMake:
    47 51 145
    TextCommand:
    53 55 56 71 76 77
    UserIdXrefCommand:
    81 119
    WeaveEndEvent:
    18 90 149
    WeaveEvent:
    90 106 109 149
    WeaveStartEvent:
    18 25 90 149
    Weaver:
    17 24 33 50 51
    Web:
    57 65 69 92 147
    WebReader:
    110 117 147
    XrefCommand:
    78 79 80 81
    __version__:
    5
    _debug:
    148 149
    _discard:
    148 149
    _report:
    148 149
    filecmp:
    2 49
    getopt:
    2 146
    main:
    150
    os:
    2 18 25 49
    re:
    2 106 111
    silent:
    3 146 149
    standard:
    149
    sys:
    2 122 150
    tangle:
    3 62 67 68 71 73 75 76 77 78 86 108 139 151
    tempfile:
    2 48
    theLog:
    18 25 44 48 49 94 96 97 102 106 108 109 111 117 122 125 129 136 139 142 146 147 149
    time:
    2 89 129 130
    verbose:
    3 146 149
    weave:
    3 61 66 70 72 75 76 77 79 80 81 85 109 136 151

    Created by ./pyweb.py at Tue Jul 23 12:05:53 2002.

    pyweb.__version__ '$Revision$'.

    Source pyweb.w modified Tue Jul 23 12:05:28 2002.

    Working directory '/Users/slott/Documents/Books/Python Book/pyWeb'.