$Id: README,v 1.10 1997/11/06 19:21:54 ken Exp $
SGML::SPGrove
A Perl 5 module for loading SGML,
XML, and HTML document instances
using James Clark's SP.
Ken MacLeod
ken@bitsko.slc.ut.us
INTRODUCTION
The SGML::SPGrove module links with James Clark's SGML Parser (SP)
to load SGML, XML, and HTML document instances. SPGrove supports
the Visitor design pattern for accessing the grove and also comes
with a module for performing simple rule-based transformations to
Perl objects.
See the file Changes for user-visible changes. See the `examples'
directory for examples of using `SPGrove'. `DOM' relates SPGrove
to the World Wide Web Consortium's Document Object Model.
Newer versions of this module can be found at
. SPGrove shares a mailing
list with Quilt. To subscribe to the Quilt mailing list, send a
message with the word `subscribe' in the Subject: field to
.
Copyright (C) 1997 Ken MacLeod
SPGrove is distributed under the same terms as SP. See the file
COPYING for distribution terms.
OVERVIEW
SGML::SPGrove takes a system identifier and passes it to SP to
parse, as each element is parsed from the document SPGrove builds
Perl objects to match. When done parsing, SPGrove returns an
SGML::SPGrove object that contains the root element of the parsed
document and an array (hopefully empty :-) of parser errors.
Elements of the document are SGML::Element objects. Elements
have a generic identifier (or name), attributes, and the contents
of the element. Attributes are stored as a Perl hash, with the
values as an array of scalars and SGML::SData objects. The
contents of an element may be more Elements, scalars, SData
objects, processing instruction (PI) objects, or Entities.
SGML::SData objects are replacements for character entity
references within the document. The Text::EntityMap perl module
can be used to map SData replacements from common character entity
sets to common output formats.
SGML::PI objects are processing instructions contained within the
document.
SGML::Entity, SGML::ExtEntity, and SGML::SubDocEntity are entity
references.
SGML::Notation objects define a notation used for entities and in
attributes.
SGML::Writer outputs all or part of a grove to a file or scalar.
SGML::Simple::Spec, SGML::Simple::SpecBuilder, and
SGML::Simple::BuilderBuilder work together to implement a simple
rule-based transformer for transforming document instances to Perl
objects. SpecBuilder takes a spec grove conforming to the
``SPGrove Simple Spec'' DTD and creates a specification object
that can be given to BuilderBuilder to create a Visitor package
that can be used to transform other groves to Perl objects.
Visitors and Builders are explained thoroughly in ``Design
Patterns: Elements of Reusable Object-Oriented Software'' by
Gamma, Helm, Johnson, and Vlissides, published by Addison-Wesley
(ISBN 0-201-63361-2).
INSTALLATION
SGML::SPGrove requires Perl 5, James Clark's SP (from the Jade
distribution), and the Perl modules Class-Eroot and
Class-Visitor. SP requires a C++ compiler.
The extra Perl modules are also available at SPGrove's source site.
1) SPGrove needs SP's `libsp.a' and include files. SP's `make
install' does not install these [I'm working on that, I should
have an RPM available soon]. Create a workarea for compiling
SP, compile it and keep the workarea until SPGrove is done.
Edit SPGrove's Makefile.PL to point `LIBS' to SP's `lib'
directory and `INC' to SP's three include directories. I've
left my templates in to point the way.
2) standard Perl module after that,
perl Makefile.PL
make
make test
make install
Just so you know, SPGrove's copy of SP's library is included
in the install, that's over a megabyte and a half.
FYI, a statically linked perl executable (`make perl') appears
to run significantly faster, in one test, 17 seconds instead
of 25 seconds.