There is no single way to display the hierarchical structure of a document, since a document could be anything from 'Julius Caesar' to the taxonomy of Drosophila protein sequences or an infrared spectrum. The options given here will work well for some documents and be irritating for others, so I'd welcome feedback and generic suggestions. For specific types of information (e.g. 3-dimensional structures of molecules), specific routines will have to be written. Most documents have a much more complex structure than is apparent at first sight, so the screen commonly gets covered with windows or huge, highly branched trees. The attraction of costwish is that once you know the 'best' way to analyse a document, you can easily write specific code to customise the display.
This menu determines the action when a node in the document is activated (Button-1). (The activation could come from the TOC, a subwindow TOC or buttons, the result of a search, or anywhere else that links to nodes are displayed.) I expect to develop this considerably.
A node consists of three components (none of which are mandatory):
If <FOO>...</FOO> contains both ASCII text and also other tags, it is called Mixed Content (HTML contains many examples of this). Mixed content is problematical to display if the tags are meant to denote hierarchy. A common use of mixed content (as in HTML) is when the tags can be regarded as operating on the text ("make this italic") and this part of a document can be regarded as an Event Stream.
Many documents contain both hierarchical structure (Chapter, Section, etc.) and event streams (formatting). costwish supports both, but concentrates on the hierarchy, whilst some other tools concentrate on events. Events are often processed either by creating special symbols for the starttags (and endtags) and embedding these in the ASCII string or by applying formatting instructions (e.g. Bold, Centre, Indent) specific to each tag (often through a translation table).
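The translation-table approach to events mentioned above can be sketched roughly as follows. This is an illustrative model in Python, not costwish code (costwish itself is written in Tcl); the event tuples, the `TRANSLATION` table and the `translate` function are all hypothetical names chosen for the example.

```python
# Illustrative model only; none of these names come from costwish.
# An event stream is a flat sequence of start-tag, text and end-tag events.
events = [
    ("start", "P"),
    ("text", "This is "),
    ("start", "I"),
    ("text", "mixed"),
    ("end", "I"),
    ("text", " content."),
    ("end", "P"),
]

# Hypothetical translation table: GI -> (symbol for start-tag, symbol for end-tag).
TRANSLATION = {
    "I": ("*", "*"),
    "P": ("", "\n"),
}


def translate(events, table):
    """Replace each tag event by its symbol and pass text through unchanged."""
    out = []
    for kind, value in events:
        if kind == "text":
            out.append(value)
        else:
            # Tags missing from the table are silently dropped in this sketch.
            start_symbol, end_symbol = table.get(value, ("", ""))
            out.append(start_symbol if kind == "start" else end_symbol)
    return "".join(out)


rendered = translate(events, TRANSLATION)
```

Note that nothing in this loop needs to know which element is inside which; that is what distinguishes event processing from the hierarchical view that costwish concentrates on.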
If costwish encounters a document of unknown structure whose tags are not previously known, it will render it as a tree in the TOC and as subtrees in the subwindows. The Node Action menu also allows the subwindows to be displayed as event streams, with tags rendered in angle brackets (<...>). (It's important to realise that costwish hasn't merely transcribed the input, but has recreated a normalised document from the ESIS stream.) At present costwish does not translate the event stream, but this will be added very shortly - the main difficulty is knowing how to translate a complete document this way, or whether subdocuments can be extracted. (NOTE: If costwish encounters HTML, it displays it fully rendered.)
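To make the point about recreating a document from the ESIS stream concrete, here is a minimal sketch in Python (not the Tcl that costwish uses). It assumes the line-oriented ESIS output of sgmls/nsgmls, where a line beginning with `(` opens an element, `)` closes it, `-` carries data and `A` carries an attribute for the next start-tag. The function name `esis_to_markup` is hypothetical, and only the `\n` data escape is handled.

```python
# Illustrative sketch only; esis_to_markup is a hypothetical name, not a costwish routine.
def esis_to_markup(lines):
    """Rebuild angle-bracket markup from ESIS lines (sgmls/nsgmls style)."""
    out = []
    pending = []  # attribute lines precede the start-tag line they belong to
    for line in lines:
        code, rest = line[:1], line[1:]
        if code == "A":
            parts = rest.split(" ", 2)  # name, declared type, value
            if len(parts) == 3:         # IMPLIED attributes have no value; skip them
                name, _kind, value = parts
                pending.append(f' {name}="{value}"')
        elif code == "(":
            out.append(f"<{rest}{''.join(pending)}>")
            pending = []
        elif code == ")":
            out.append(f"</{rest}>")
        elif code == "-":
            # Simplification: only the record-end escape is decoded here.
            out.append(rest.replace("\\n", "\n"))
        # Any other command character (e.g. the final "C") is ignored.
    return "".join(out)


esis = [
    "ATYPE CDATA sonnet",
    "(POEM",
    "(LINE",
    "-Shall I compare thee to a summer's day?",
    ")LINE",
    ")POEM",
    "C",
]
markup = esis_to_markup(esis)
```

The markup produced this way is generated from the parser's output, not copied from the input file, so it need not match the original source character for character.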
Attributes are displayed in a separate window (paned mode). If there are specific display routines for a GI, attributes can be rendered here as well. The content is also displayed in paned mode, but only where it is a direct grandchild of the node. (This is because all content is held in PEL nodes which are not displayable and are themselves children of displayable nodes.) Some of the other display types (toplevel windows) do not display attributes and content specifically.
In many cases the default display types (trees or event streams) are not very useful. In these cases special code can be written to display the node, and this code takes precedence over the defaults. This is particularly useful where the apparent hierarchy is misleading or too deep and where elements need to be merged together. Thus in the PLAY DTD a SPEECH contains a SPEAKER and several discrete LINEs. It's easy to format this attractively without having to include the tags, and this has been done in play/postproc.tcl through the SPEECH::displayNode routine. Although this requires some simple programming, much of it can be done by analogy. For more complex content (e.g. molecular structures) the raw data is almost meaningless and special routines are essential.
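The idea behind a routine like SPEECH::displayNode can be sketched as below. This is a Python model written for illustration and is not a transcription of play/postproc.tcl; the tuple representation of a node and the name `display_speech` are assumptions made for the example.

```python
# Illustrative model only; the real routine is Tcl and lives in play/postproc.tcl.
# A node is modelled here as (GI, children), and each child as (GI, text).
speech = ("SPEECH", [
    ("SPEAKER", "ANTONY"),
    ("LINE", "Friends, Romans, countrymen, lend me your ears;"),
    ("LINE", "I come to bury Caesar, not to praise him."),
])


def display_speech(node):
    """Merge SPEAKER and LINE children into one block, without showing tags."""
    _gi, children = node
    speakers = [text for child_gi, text in children if child_gi == "SPEAKER"]
    lines = [text for child_gi, text in children if child_gi == "LINE"]
    heading = ", ".join(speakers)
    body = "\n".join("    " + line for line in lines)
    return heading + "\n" + body


formatted = display_speech(speech)
```

Here the three child elements are merged into a single block of text, which is the kind of flattening that a generic tree display cannot do.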
There are two generic aspects of this display: what type of window to use and what sort of display within a window. They aren't totally independent, but this is a rough guide.
The output can be customised for a GI (e.g. FOO) by writing the specific routines:
These can be customised to reformat the data, transform it and use other rendering methods. They are read in when the appropriate DTD is required (from the DOCTYPE). In general they will still use a toplevel window with a text widget and buttons, but this isn't required.
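One way to picture how GI-specific routines take precedence over the generic display is a lookup table keyed by GI, with a fallback. The sketch below is a Python model under that assumption; it does not claim to show how costwish actually registers or finds its routines, and every name in it is hypothetical.

```python
# Illustrative model only; all names here are hypothetical.
DISPLAY_ROUTINES = {}  # GI -> specific display routine


def display_routine(gi):
    """Decorator that registers a specific routine for one GI."""
    def register(func):
        DISPLAY_ROUTINES[gi] = func
        return func
    return register


def generic_display(gi, content):
    """Fallback: show the content as an event stream with angle-bracket tags."""
    return f"<{gi}>{content}</{gi}>"


@display_routine("FOO")
def display_foo(gi, content):
    # A specific routine is free to reformat or transform the data.
    return content.upper()


def display_node(gi, content):
    """Use the specific routine for this GI if one exists, else the generic one."""
    routine = DISPLAY_ROUTINES.get(gi, generic_display)
    return routine(gi, content)


foo_output = display_node("FOO", "hello")
bar_output = display_node("BAR", "hello")
```

With this arrangement, adding support for a new DTD amounts to loading a file that registers more routines; elements without a specific routine continue to use the generic display.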
Note that the active areas (tags or buttons) display subnodes as new windows (recursively) under manual control. Each subnode 'knows' which node is its parent in the document (i.e. in the ESIS parse tree). (There is potential confusion in the terminology here, since tk widgets also have parents and children, and in this case the tkparent of a toplevel window is ".". Unless you are developing tk code, 'parent' and 'child' will always refer to the ESIS tree.) Because windows may proliferate in complex documents, there are several navigational aids:
After several strong suggestions I have made the default a Paned window.
The panes (which can be shifted left or right) are:
This is still being customised. At present the best way is to
There are three generic types of display for a node which has no specific display routine:
(NOTE: There is likely to be continual small adjustment to the displays, so don't be thrown if the display has changed from this description.)
Peter Murray-Rust
April 1996