author | Sergey Poznyakoff <gray@gnu.org> | 2019-07-10 09:54:32 +0300 |
---|---|---|
committer | Sergey Poznyakoff <gray@gnu.org> | 2019-07-10 09:54:32 +0300 |
commit | d36275fe9adf1428fd2476defda9e1fcda56988a | |
tree | 0d5c9b9aabac887a100934118955ed7736690a98 | |
parent | 5742ab5a037160a115144b3bf45cf3349df24635 | |
Improve docs
-rw-r--r-- | README | 71 |
-rw-r--r-- | wordsplit.3 | 14 |
2 files changed, 47 insertions, 38 deletions
@@ -1,15 +1,18 @@ | |||
1 | * Overview | 1 | * Overview |
2 | 2 | ||
3 | This package provides a set of C functions for splitting a string into | 3 | This package provides a set of C functions for parsing input strings. |
4 | words. The splitting process is highly configurable and allows for | 4 | Default parsing rules are are similar to those used in Bourne shell. |
5 | considerable flexibility. The default splitting rules are similar to | 5 | This includes tilde expansion, variable expansion, quote removal, word |
6 | those used in Bourne shell. The splitting process includes tilde | 6 | splitting, command substitution, and path expansion. Parsing is |
7 | expansion, variable expansion, quote removal, command substitution, | 7 | controlled by a number of settings which allow the caller to alter |
8 | and path expansion. Each of these phases can be turned off by the caller. | 8 | processing at each of these phases or even to disable any of them. |
9 | Thus, wordsplit can be used for parsing inputs in different formats, | ||
10 | from simple character-delimited entries, as in /etc/passwd, and up to | ||
11 | complex shell statements. | ||
9 | 12 | ||
10 | The following code fragment shows the basic usage: | 13 | The following code fragment shows the basic usage: |
11 | 14 | ||
12 | /* This variable controls the splitting */ | 15 | /* This variable controls parsing */ |
13 | wordsplit_t ws; | 16 | wordsplit_t ws; |
14 | int rc; | 17 | int rc; |
15 | 18 | ||
@@ -31,7 +34,7 @@ The following code fragment shows the basic usage: | |||
31 | /* Reclaim the allocated memory */ | 34 | /* Reclaim the allocated memory */ |
32 | wordsplit_free(&ws); | 35 | wordsplit_free(&ws); |
33 | 36 | ||
34 | For a detailed discussion, please see the man page wordsplit.3 inluded | 37 | For a detailed discussion, please see the man page wordsplit.3 included |
35 | in the package. | 38 | in the package. |
36 | 39 | ||
37 | * Description | 40 | * Description |
@@ -51,20 +54,25 @@ are for building the autotest-based testsuite: | |||
51 | 54 | ||
52 | * Incorporating wordsplit into your project | 55 | * Incorporating wordsplit into your project |
53 | 56 | ||
54 | The project is designed to be used as a git submodule. First, select | 57 | The project is designed to be used as a git submodule. To incorporate |
55 | the location DIR for the wordsplit directory within your project. Then | 58 | it into your project, first select the location for the wordsplit |
56 | add the submodule: | 59 | directory within your project. Then add the submodule at this |
60 | location. The rest is quite straightforward: you need to add | ||
61 | wordsplit.c to your sources and add both wordsplit.c and wordsplit.h | ||
62 | to the distributed files. | ||
63 | |||
64 | The following will describe each step in detail. For the rest of this | ||
65 | discussion it is supposed that 'wordsplit' is the name of the location | ||
66 | selected for the submodule. It is also supposed that your project | ||
67 | uses GNU autotools framework. If you are using plain makefiles, these | ||
68 | instructions are easy to convert to such use as well. | ||
57 | 69 | ||
58 | git submodule add git://git.gnu.org.ua/wordsplit.git DIR | 70 | To add the submodule do: |
59 | 71 | ||
60 | The rest is quite straightforward: you need to add wordsplit.c to your | 72 | git submodule add git://git.gnu.org.ua/wordsplit.git wordsplit |
61 | sources and add both wordsplit.c and wordsplit.h to the distributed files. | ||
62 | 73 | ||
63 | There are two methods of doing so: direct incorporation and | 74 | There are two methods of including the sources to the project: direct |
64 | incorporation via VPATH. The discussion below will describe both | 75 | incorporation and incorporation via VPATH. |
65 | methods based on the assumption that your project is using GNU | ||
66 | autotools framework. If you are using plain makefiles, these | ||
67 | instructions are easy to convert to such use as well. | ||
68 | 76 | ||
69 | ** Direct incorporation | 77 | ** Direct incorporation |
70 | 78 | ||
@@ -88,8 +96,8 @@ You can also put wordsplit.h in the noinst_HEADERS variable, if you like: | |||
88 | noinst_HEADERS = wordsplit/wordsplit.h | 96 | noinst_HEADERS = wordsplit/wordsplit.h |
89 | AM_CPPFLAGS = -I$(srcdir)/wordsplit | 97 | AM_CPPFLAGS = -I$(srcdir)/wordsplit |
90 | 98 | ||
91 | If you are building an installable library and wish to make wordsplit functions | 99 | If you are building an installable library and wish to export the |
92 | available, install wordsplit.h to $(pkgincludedir), e.g. | 100 | wordsplit API, install wordsplit.h to $(pkgincludedir), e.g. |
93 | 101 | ||
94 | lib_LTLIBRARIES = libmy.la | 102 | lib_LTLIBRARIES = libmy.la |
95 | libmy_la_SOURCES = main.c \ | 103 | libmy_la_SOURCES = main.c \ |
@@ -97,7 +105,7 @@ available, install wordsplit.h to $(pkgincludedir), e.g. | |||
97 | AM_CPPFLAGS = -I$(srcdir)/wordsplit | 105 | AM_CPPFLAGS = -I$(srcdir)/wordsplit |
98 | pkginclude_HEADERS = wordsplit/wordsplit.h | 106 | pkginclude_HEADERS = wordsplit/wordsplit.h |
99 | 107 | ||
100 | ** Vpath-based incorporation | 108 | ** VPATH-based incorporation |
101 | 109 | ||
102 | Modify the VPATH variable in your Makefile.am: | 110 | Modify the VPATH variable in your Makefile.am: |
103 | 111 | ||
@@ -105,13 +113,13 @@ Modify the VPATH variable in your Makefile.am: | |||
105 | 113 | ||
106 | Notice the use of "+=": it is necessary for the vpath builds to work. | 114 | Notice the use of "+=": it is necessary for the vpath builds to work. |
107 | 115 | ||
108 | Define the nodist_program_SOURCES variable: | 116 | Add wordsplit.c to the nodist_program_SOURCES variable: |
109 | 117 | ||
110 | nodist_program_SOURCES = wordsplit.c | 118 | nodist_program_SOURCES = wordsplit.c |
111 | 119 | ||
112 | The nodist_ prefix is necessary to prevent Make from trying to | 120 | The nodist_ prefix is necessary to prevent Make from trying to |
113 | distribute this file from the current directory (where it doesn't | 121 | distribute this file from the current directory (where it doesn't |
114 | exist of course). It will find it using VPATH during compilation. | 122 | exist of course). During compilation it will be located using VPATH. |
115 | 123 | ||
116 | Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to | 124 | Finally, add both wordsplit/wordsplit.c and wordsplit/wordsplit.h to |
117 | the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the | 125 | the EXTRA_DIST variable and modify AM_CPPFLAGS as shown in the |
@@ -196,7 +204,7 @@ Add the following lines to your configure.ac: | |||
196 | 204 | ||
197 | ** lib/Makefile.am | 205 | ** lib/Makefile.am |
198 | 206 | ||
199 | The makefile in lib must be modified to build the auxiliary program | 207 | The Makefile.am in lib must be modified to build the auxiliary program |
200 | wsp and create the testsuite script. This is done by the following | 208 | wsp and create the testsuite script. This is done by the following |
201 | fragment: | 209 | fragment: |
202 | 210 | ||
@@ -228,17 +236,18 @@ fragment: | |||
228 | * History | 236 | * History |
229 | 237 | ||
230 | First version of wordsplit appeared in March 2009 as a part of the | 238 | First version of wordsplit appeared in March 2009 as a part of the |
231 | Wydawca[1] project. Its main usage there was to assist in | 239 | Wydawca[1] project. Its main usage was to assist in configuration |
232 | configuration file parsing. The parser subsystem proved to be quite | 240 | file parsing. The parser subsystem proved to be quite useful and |
233 | useful and it soon forked into a separate project - Grecs[2]. This | 241 | soon evolved into a separate project - Grecs[2]. This package had been |
234 | package had been since used (as a git submodule) in a number of other | 242 | since used (as a git submodule) in a number of other projects, such as |
235 | projects, such as GNU Dico[3] and Direvent[4], to name a few. | 243 | GNU Dico[3] and Direvent[4], to name a few. |
236 | 244 | ||
237 | In 2010 the wordsplit sources were incorporated to the GNU | 245 | In 2010 the wordsplit sources were incorporated to the GNU |
238 | Mailutils[5] package, where they replaced the obsolete argcv module. | 246 | Mailutils[5] package, where they replaced the obsolete argcv module. |
239 | Mailutils uses its own configuration package, which meant that using | 247 | Mailutils uses its own configuration package, which meant that using |
240 | Grecs was not expedient. Therefore the sources had been exported from | 248 | Grecs was not expedient. Therefore the sources had been exported from |
241 | Grecs and are kept in sync with the changes in it. | 249 | Grecs. Since then both Mailutils and Grecs versions are periodically |
250 | synchronized. | ||
242 | 251 | ||
243 | Several other projects, such as GNU Rush[6] and fileserv[7], followed | 252 | Several other projects, such as GNU Rush[6] and fileserv[7], followed |
244 | the suite. It was therefore decided that it would be advisable to | 253 | the suite. It was therefore decided that it would be advisable to |
diff --git a/wordsplit.3 b/wordsplit.3
index 139c73e..e742030 100644
--- a/wordsplit.3
+++ b/wordsplit.3
@@ -333,7 +333,7 @@ The \fBWRDSF_ESCAPE\fR flag allows the caller to customize escape | |||
333 | sequences. If it is set, the \fBws_escape\fR member must be | 333 | sequences. If it is set, the \fBws_escape\fR member must be |
334 | initialized. This member provides escape tables for unquoted words | 334 | initialized. This member provides escape tables for unquoted words |
335 | (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each | 335 | (\fBws_escape[0]\fR) and quoted strings (\fBws_escape[1]\fR). Each |
336 | table is a string consisting of an even number of charactes. In each | 336 | table is a string consisting of an even number of characters. In each |
337 | pair of characters, the first one is a character that can appear after | 337 | pair of characters, the first one is a character that can appear after |
338 | backslash, and the following one is its translation. For example, the | 338 | backslash, and the following one is its translation. For example, the |
339 | above table of C escapes is represented as | 339 | above table of C escapes is represented as |
@@ -600,10 +600,10 @@ flag must be set. By default, it's value is \fB\(dq#\(dq\fR. | |||
600 | Escape tables for unquoted words (\fBws_escape[0]\fR) and quoted | 600 | Escape tables for unquoted words (\fBws_escape[0]\fR) and quoted |
601 | strings (\fBws_escape[1]\fR). These are used to translate escape | 601 | strings (\fBws_escape[1]\fR). These are used to translate escape |
602 | sequences (\fB\\\fIC\fR) into characters. Each table is a string | 602 | sequences (\fB\\\fIC\fR) into characters. Each table is a string |
603 | consisting of even number of charactes. In each pair of characters, | 603 | consisting of even number of characters. In each pair of characters, |
604 | the first one is a character that can appear after backslash, and the | 604 | the first one is a character that can appear after backslash, and the |
605 | following one is its representation. For example, the string | 605 | following one is its representation. For example, the string |
606 | \fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horisontal | 606 | \fB\(dqt\\tn\\n\(dq\fR translates \fB\\t\fR into horizontal |
607 | tabulation character and \fB\\n\fR into newline. | 607 | tabulation character and \fB\\n\fR into newline. |
608 | .B WRDSF_ESCAPE | 608 | .B WRDSF_ESCAPE |
609 | flag must be set if this member is initialized. | 609 | flag must be set if this member is initialized. |
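Note on the hunk above: the pair layout of an escape table can be illustrated with a small stand-alone sketch. It does not use wordsplit itself; translate_escape and unescape are hypothetical helpers written for this note, and passing unknown escapes through with the backslash dropped is a simplifying assumption, not a statement about what wordsplit does.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of wordsplit): look up character C,
 * which followed a backslash, in TABLE.  TABLE is a string of pairs:
 * the character that may appear after the backslash, then its
 * translation.  Returns the translation, or C itself if not listed.
 */
int
translate_escape(const char *table, int c)
{
  size_t i;

  for (i = 0; table[i] && table[i + 1]; i += 2)
    if ((unsigned char) table[i] == c)
      return (unsigned char) table[i + 1];
  return c;
}

/*
 * Hypothetical helper: copy SRC to DST, replacing each \C sequence
 * according to TABLE.  DST must have room for strlen(SRC) + 1 bytes.
 * Returns the length of the resulting string.
 */
size_t
unescape(char *dst, const char *src, const char *table)
{
  size_t n = 0;

  while (*src)
    {
      if (*src == '\\' && src[1])
        {
          dst[n++] = (char) translate_escape(table, (unsigned char) src[1]);
          src += 2;
        }
      else
        dst[n++] = *src++;
    }
  dst[n] = '\0';
  return n;
}
```

With the table written in C as "t\tn\n" (that is: 't', TAB, 'n', newline), the input a\tb\n is turned into 'a', TAB, 'b', newline.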
@@ -755,8 +755,8 @@ Default flags. This is a shortcut for: | |||
755 | WRDSF_SQUEEZE_DELIMS |\ | 755 | WRDSF_SQUEEZE_DELIMS |\ |
756 | WRDSF_CESCAPES)\fR, | 756 | WRDSF_CESCAPES)\fR, |
757 | 757 | ||
758 | i.e.: disable variable expansion and quote substituton, perform quote | 758 | i.e.: disable variable expansion and quote substitution, perform quote |
759 | removal, treat any number of consequtive delimiters as a single | 759 | removal, treat any number of consecutive delimiters as a single |
760 | delimiter, replace \fBC\fR escapes appearing in the input string with | 760 | delimiter, replace \fBC\fR escapes appearing in the input string with |
761 | the corresponding characters. | 761 | the corresponding characters. |
762 | .TP | 762 | .TP |
@@ -807,7 +807,7 @@ flag is set, and error code is returned. If this flag is set, the | |||
807 | function is called instead. This function is not supposed to return. | 807 | function is called instead. This function is not supposed to return. |
808 | .TP | 808 | .TP |
809 | .B WRDSF_WS | 809 | .B WRDSF_WS |
810 | Trim off any leading and trailind whitespace from the returned | 810 | Trim off any leading and trailing whitespace from the returned |
811 | words. This flag is useful if the \fIws_delim\fR member does not | 811 | words. This flag is useful if the \fIws_delim\fR member does not |
812 | contain whitespace characters. | 812 | contain whitespace characters. |
813 | .TP | 813 | .TP |
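Note on the hunk above: the effect that WRDSF_WS is described as having on each returned word can be sketched without the library. trim_ws below is a hypothetical helper written for this note, not wordsplit code. It shows why trimming matters when the delimiter set has no whitespace: splitting "root : /bin/sh" on ':' alone would otherwise leave the surrounding blanks in each word.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of wordsplit): strip leading and
 * trailing whitespace from S.  The trailing part is cut in place;
 * the returned pointer addresses the first non-whitespace character
 * inside the same buffer.
 */
char *
trim_ws(char *s)
{
  size_t len;

  /* Skip leading whitespace. */
  while (*s && isspace((unsigned char) *s))
    s++;

  /* Cut trailing whitespace. */
  len = strlen(s);
  while (len > 0 && isspace((unsigned char) s[len - 1]))
    len--;
  s[len] = '\0';

  return s;
}
```

Since the result points into the original buffer, the caller keeps ownership of that buffer and must not free the returned pointer separately.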
@@ -1007,7 +1007,7 @@ Undefined variable. This error is returned only if the | |||
1007 | \fBWRDSF_UNDEF\fR flag is set. | 1007 | \fBWRDSF_UNDEF\fR flag is set. |
1008 | .TP | 1008 | .TP |
1009 | .B WRDSE_NOINPUT | 1009 | .B WRDSE_NOINPUT |
1010 | Input exhausted. This is not acually an error. This code is returned | 1010 | Input exhausted. This is not actually an error. This code is returned |
1011 | if \fBwordsplit\fR (or \fBwordsplit_len\fR) is invoked in incremental | 1011 | if \fBwordsplit\fR (or \fBwordsplit_len\fR) is invoked in incremental |
1012 | mode and encounters end of input string. See the section | 1012 | mode and encounters end of input string. See the section |
1013 | .BR "INCREMENTAL MODE" . | 1013 | .BR "INCREMENTAL MODE" . |