author Sergey Poznyakoff <gray@gnu.org.ua> 2010-03-19 16:55:04 +0200
committer Sergey Poznyakoff <gray@gnu.org.ua> 2010-03-19 17:29:49 +0200
commit 004dc35b5d6c451184ed0983caee65e47c6b4091
tree 695091a2645b1d4d0b7375aa731b2cba818cd91a
parent 5c3643f95a9d5568b64875ddf96a7e007050de72
Further improvements in expat.

* src/expat.sci (xml-make-parser): Fix definition.
(xml-error-descr): New syntax.
* src/gamma-expat.c (xml-expat-version-string)
(xml-expat-version)
(xml-default-current)
(xml-error-string)
(xml-current-line-number)
(xml-current-byte-count): New functions.
(xml-primitive-make-parser): Include context data in the error
information.
(generic_xml_decl_handler): Account for possible NULLs.
* examples/xml-check.scm: New file.
* examples/xml-struct.scm: New file.
* examples/xmlck.scm: New file.
* examples/expat-info.scm: New file.
* examples/Makefile.am, examples/README: Update.
* doc/expat.texi: Update.

-rw-r--r--  doc/expat.texi           661
-rw-r--r--  examples/Makefile.am       8
-rw-r--r--  examples/README           31
-rw-r--r--  examples/expat-info.scm   12
-rw-r--r--  examples/xml-check.scm    47
-rw-r--r--  examples/xml-struct.scm   63
-rw-r--r--  examples/xmlck.scm        20
-rw-r--r--  src/expat.sci             43
-rw-r--r--  src/gamma-expat.c        123
-rw-r--r--  src/gamma-expat.h          2
10 files changed, 865 insertions, 145 deletions
diff --git a/doc/expat.texi b/doc/expat.texi
index e52a636..08b4137 100644
--- a/doc/expat.texi
+++ b/doc/expat.texi
@@ -4,8 +4,10 @@
4@c ******************************************************************* 4@c *******************************************************************
5@WRITEME 5@WRITEME
6 6
7The @samp{(gamma expat)} module provides interface with 7The @samp{(gamma expat)} module provides an interface to
8@command{libexpat}, a library for parsing @acronym{XML} documents. 8@command{libexpat}, a library for parsing @acronym{XML} documents.
9See @uref{http://expat.sourceforge.net}, for a description of the
10library.
9 11
10Usage: 12Usage:
11 13
@@ -14,14 +16,130 @@ Usage:
14@end lisp 16@end lisp
15 17
16@menu 18@menu
17* primitives:: Expat Primitives 19* expat basics::
20* creating parsers::
21* parsing::
22* errors::
18* handlers:: 23* handlers::
19@end menu 24@end menu
20 25
21@node primitives 26@node expat basics
22@section Expat Primitives 27@section Expat Basics
28
29Parsing of @acronym{XML} documents using Expat is based on
30user-defined callback functions. You create a @dfn{parser}
31object, and associate @dfn{callback} (or @dfn{handler}) functions with
32the events you are interested in. Such events may be, for instance,
33encountering an opening or closing tag, encountering a comment
34block, etc. Once the parser object is ready, you start feeding the
35document to it. As the parser recognizes @acronym{XML} constructs, it
36calls the callbacks that are registered for them.
37
38Parsers are created using the @code{xml-make-parser} function. In the
39simplest case, it takes no arguments, e.g.:
40
41@lisp
42(let ((parser (xml-make-parser)))
43 @dots{}
44@end lisp
45
46The function @code{xml-parse} takes the parser as its argument, reads
47the document from the current input stream and feeds it to the parser.
48Thus, the simplest program for parsing @acronym{XML} documents is:
49
50@lisp
51(use-modules ((gamma expat)))
52(xml-parse (xml-make-parser))
53@end lisp
54
55This program is perhaps not so useful, but you may already use it to
56check whether its input is a well-formed @acronym{XML} document.
57If @code{xml-parse} encounters an error, it signals the
58@code{gamma-xml-error} error. @xref{errors, error handling}, for a
59discussion on how to handle it.
60
61The @code{xml-make-parser} function takes optional arguments, which
62allow you to set callback functions for the new parser. For example, the
63following code sets the function @samp{elt-start} as a handler for
64start elements:
65
66@lisp
67(xml-make-parser #:start-element-handler elt-start)
68@end lisp
69
70The @code{#:start-element-handler} keyword informs the function that
71the argument following it is a handler for @acronym{XML} start elements.
72Any number of handlers may be set this way, e.g.:
73
74@lisp
75(xml-make-parser #:start-element-handler elt-start
76                 #:end-element-handler elt-end
77                 #:comment-handler comment)
78@end lisp
79
80Definitions of particular handler functions differ depending on their
81purpose, i.e. on the event they are defined to handle. For example,
82a start element handler must take two arguments.
83The first is the name of the tag, and the second is a list of
84attributes supplied for that tag. Thus, for example, the following
85start handler prints the tag and the number of attributes:
86
87@lisp
88(define (elt-start name attrs)
89  (format #t "~A (~A)~%" name (length attrs)))
90@end lisp
91
92For a detailed description of all available handlers and handler
93keywords, see @ref{handlers}.
94
95To further improve our example, suppose you need a program that will
96take an @acronym{XML} document as its input and create a description
97of its structure on output, showing element nesting levels by
98indenting their description. Here is how to write it.
99
100First, define handlers for start and end elements. The start element
101handler will print two indenting spaces for each level of ancestor
102elements, followed by the element name and its attributes and a
103newline. It will then increase the global level variable:
104
105@lisp
106(define level 0)
107
108(define (elt-start name attrs)
109  (display (make-string (* 2 level) #\space))
110  (display name)
111  (for-each
112   (lambda (x)
113     (display " ")
114     (display (car x))
115     (display "=")
116     (display (cdr x)))
117   attrs)
118  (newline)
119  (set! level (1+ level)))
120@end lisp
121
122The handler for end tags is simpler: it must only decrease the level:
123
124@lisp
125(define (elt-end name)
126  (set! level (1- level)))
127@end lisp
128
129Finally, create a parser and parse the input:
130
131@lisp
132(xml-parse (xml-make-parser #:start-element-handler elt-start
133                            #:end-element-handler elt-end))
134@end lisp
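
The two handlers in this hunk can be tried without the @samp{(gamma expat)} module by calling them by hand. The following is a minimal sketch and not part of the commit. It assumes that attributes arrive as a list of @samp{(name . value)} pairs, as the use of @code{car} and @code{cdr} in @code{elt-start} suggests. The four calls at the bottom are hypothetical stand-ins for the events a parser would presumably deliver for the input @samp{<a x="1"><b/></a>}.

```scheme
;; Current nesting depth, shared by both handlers.
(define level 0)

;; Print NAME indented by two spaces per ancestor, followed by its
;; attributes, then descend one level.
(define (elt-start name attrs)
  (display (make-string (* 2 level) #\space))
  (display name)
  (for-each
   (lambda (x)
     (display " ")
     (display (car x))
     (display "=")
     (display (cdr x)))
   attrs)
  (newline)
  (set! level (1+ level)))

;; Leaving an element only restores the previous depth.
(define (elt-end name)
  (set! level (1- level)))

;; Hand-written event sequence (an assumption, see above) for
;;   <a x="1"><b/></a>
(elt-start "a" '(("x" . "1")))
(elt-start "b" '())
(elt-end "b")
(elt-end "a")
;; Prints:
;; a x=1
;;   b
```

Because @code{level} is a global variable, it returns to 0 only if every start event is matched by an end event. A parser that stops on an error midway would leave it at a non-zero value.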
135
136
137
138@node creating parsers
139@section Creating XML Parsers
23@WRITEME 140@WRITEME
24 141
142@anchor{xml-primitive-make-parser}
25@deffn {Scheme procedure} xml-primitive-make-parser enc sep 143@deffn {Scheme procedure} xml-primitive-make-parser enc sep
26Return a new @acronym{XML} parser. If @var{enc} is given, it must be one of: 144Return a new @acronym{XML} parser. If @var{enc} is given, it must be one of:
27@samp{US-ASCII}, @samp{UTF-8}, @samp{UTF-16}, @samp{ISO-8859-1}. If @var{sep} 145@samp{US-ASCII}, @samp{UTF-8}, @samp{UTF-16}, @samp{ISO-8859-1}. If @var{sep}
@@ -63,23 +181,6 @@ and to:
63@end lisp 181@end lisp
64@end deffn 182@end deffn
65 183
66@deffn {Scheme procedure} xml-primitive-parse parser input isfinal
67Parse next piece of input. Arguments are:
68
69@table @var
70@item parser
71A parser returned from a previous call to
72@code{xml-primitive-make-parser} or @code{xml-make-parser}.
73
74@item input
75A piece of input text.
76
77@item isfinal
78Boolean value indicating whether @var{input} is the last part of
79input.
80@end table
81@end deffn
82
83@deffn {Scheme procedure} xml-primitive-set-handler parser key handler 184@deffn {Scheme procedure} xml-primitive-set-handler parser key handler
84Set @acronym{XML} handler for an event. Arguments are: 185Set @acronym{XML} handler for an event. Arguments are:
85 186
@@ -87,90 +188,77 @@ Set @acronym{XML} handler for an event. Arguments are:
87@item parser 188@item parser
88A valid @acronym{XML} parser 189A valid @acronym{XML} parser
89 190
191@anchor{handler-keyword}
90@item key 192@item key
91A key, identifying an event. 193A key, identifying the event. For example,
92 194@samp{#:start-element-handler} sets the handler that is called for start
93@table @asis 195tags.
94@kwindex start-element-handler
95@item #:start-element-handler
96
97@kwindex end-element-handler
98@item #:end-element-handler
99
100@kwindex character-data-handler
101@item #:character-data-handler
102
103@kwindex processing-instruction-handler
104@item #:processing-instruction-handler
105
106@kwindex comment-handler
107@item #:comment-handler
108
109@kwindex start-cdata-section-handler
110@item #:start-cdata-section-handler
111 196
112@kwindex end-cdata-section-handler 197@xref{handlers}, for its values and their meaning.
113@item #:end-cdata-section-handler
114 198
115@kwindex default-handler 199@item handler
116@item #:default-handler 200Handler procedure.
117 201@end table
118@kwindex default-handler-expand
119@item #:default-handler-expand
120
121@kwindex external-entity-ref-handler
122@item #:external-entity-ref-handler
123
124