author Sergey Poznyakoff <gray@gnu.org> 2018-05-22 13:45:39 +0300
committer Sergey Poznyakoff <gray@gnu.org> 2018-05-22 14:11:21 +0300
commit a531c7d1de4780ee53ffbd383329dfff55d425b0
tree 20290d0b4eb45069bfea28dc96a4c6f362d363b9 /doc
parent 085d771694655dea82c37247497db32eca228b4a
New feature: maxwords
This feature allows the user to limit the number of words returned by
a call to wordsplit. When the number of words in expansion reaches
the predefined limit, the rest of input line will be expanded and
returned as a single last word. For example, to parse a /etc/passwd
line:

    struct wordsplit ws;

    ws.ws_delim = ":";
    ws.ws_maxwords = 7;
    ws.ws_options = WRDSO_MAXWORDS;
    wordsplit(str, &ws,
              WRDSF_NOVAR | WRDSF_NOCMD | WRDSF_DELIM | WRDSF_OPTIONS);

* doc/wordsplit.3: Document the maxwords feature.
* include/wordsplit.h (wordsplit) <ws_maxwords> <ws_wordi>: New members.
(WRDSO_MAXWORDS): New option.
* src/wordsplit.c (WSP_RETURN_DELIMS): New macro.
(_wsplt_subsplit): Rewrite.
(wordsplit_init0): Don't reset node list.
(wordsplit_init): Initialize ws_wordi and the node list.
(wsnode_insert): Correctly insert lists.
(coalesce_segment): Additional safety check.
(wsnode_tail_coalesce): New static function.
(wordsplit_finish): Postprocess delimiters.
(expvar,expcmd): Use new _wsplt_subsplit.
(wordsplit_varexp): Don't try to expand delimiter nodes.
(skip_delim): Remove delimiter processing. It is now done in
wordsplit_finish.
(scan_word): New argument 'consume_all' instructs it to consume
the rest of input as one token.
(wordsplit_process_list): Handle wsp->ws_maxwords setting.
This also fixed a long-standing bug: quotes weren't processed
in WRDSF_NOSPLIT mode. See the testcase 59 (incremental nosplit).
(wordsplit_run): Rewrite.
(wordsplit_free): Free node list.
* tests/wordsplit.at: Update for the new wsp output format.
(incremental nosplit): Expect correct output.
Add tests for the maxwords feature.
* tests/wsp.c (maxwords): New flag.
Print the ws_wordi value as "TOTALS" at the end of each run.
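
The rule described above (once the limit is reached, the rest of the input becomes the last word) can be illustrated without linking against grecs. The following is a minimal sketch in plain C, not the wordsplit implementation: split_maxwords is a hypothetical helper introduced only for illustration. It assumes that every single delimiter character separates two fields (no squeezing of adjacent delimiters) and it performs no quoting or expansion, both of which the real wordsplit handles.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wordsplit: split INPUT on any
   character from DELIMS, storing at most MAXWORDS malloc'ed words
   in OUT.  Once MAXWORDS - 1 words have been stored, the rest of
   the input is returned unsplit as the last word.  Returns the
   number of words stored; the caller frees each word. */
size_t
split_maxwords(const char *input, const char *delims,
               size_t maxwords, char **out)
{
    size_t n = 0;
    const char *p = input;

    assert(input != NULL && delims != NULL && out != NULL);

    while (n < maxwords) {
        /* The last available slot takes everything that is left. */
        size_t len = (n + 1 == maxwords) ? strlen(p) : strcspn(p, delims);
        char *word = malloc(len + 1);

        if (word == NULL)
            break;              /* out of memory: return what we have */
        memcpy(word, p, len);
        word[len] = '\0';
        out[n++] = word;

        p += len;
        if (*p == '\0')
            break;              /* input exhausted */
        p++;                    /* skip exactly one delimiter */
    }
    return n;
}
```

Under these assumptions, calling split_maxwords with the input "Now is the time for all good men", the delimiter set " " and a limit of 3 stores the words "Now", "is" and "the time for all good men", matching the example in the man page changes below. With the delimiter set ":" and a limit of 7, a seven-field /etc/passwd style line yields seven words.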
Diffstat (limited to 'doc')
-rw-r--r-- doc/wordsplit.3 | 59
1 files changed, 55 insertions, 4 deletions
diff --git a/doc/wordsplit.3 b/doc/wordsplit.3
index a391b81..c3149fe 100644
--- a/doc/wordsplit.3
+++ b/doc/wordsplit.3
@@ -1,5 +1,5 @@
.\" This file is part of grecs -*- nroff -*-
-.\" Copyright (C) 2007-2016 Sergey Poznyakoff
+.\" Copyright (C) 2007-2018 Sergey Poznyakoff
.\"
.\" Grecs is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
@@ -14,7 +14,7 @@
.\" You should have received a copy of the GNU General Public License
.\" along with Grecs. If not, see <http://www.gnu.org/licenses/>.
.\"
-.TH WORDSPLIT 3 "February 20, 2018" "GRECS" "Grecs User Reference"
+.TH WORDSPLIT 3 "May 22, 2018" "GRECS" "Grecs User Reference"
.SH NAME
wordsplit \- split string into words
.SH SYNOPSIS
@@ -133,6 +133,17 @@ if (rc != WRDSE_NOINPUT)
wordsplit_free(&ws);
.EE
+.SH OPTIONS
+The number of flags is limited to 32 (the width of the \fBuint32_t\fR data
+type), and every bit is already occupied by a corresponding flag. However,
+\fBwordsplit\fR provides more features than that. Additional features
+can be requested by setting the corresponding
+\fIoption bit\fR in the \fBws_options\fR field of the \fBstruct
+wordsplit\fR argument. To inform wordsplit functions that this field
+is initialized, the \fBWRDSF_OPTIONS\fR flag must be set.
+.PP
+Option symbolic names begin with \fBWRDSO_\fR. They are discussed in
+detail in the subsequent chapters.
.SH EXPANSION
Expansion is performed on the input after it has been split into
words. There are several kinds of expansion, which of them are
@@ -392,7 +403,29 @@ for each such word using
.PP
When matching a pattern, the dot at the start of a name or immediately
following a slash must be matched explicitly, unless
-the \fBWRDSO_DOTGLOB\fR option is set,
+the \fBWRDSO_DOTGLOB\fR option is set.
+.SH LIMITING THE NUMBER OF WORDS
+The maximum number of words to be returned can be limited by setting
+the \fBws_maxwords\fR member to the desired count, and setting the
+\fBWRDSO_MAXWORDS\fR option, e.g.:
+.sp
+.EX
+struct wordsplit ws;
+ws.ws_maxwords = 3;
+ws.ws_options = WRDSO_MAXWORDS;
+wordsplit(str, &ws, WRDSF_DEFFLAGS|WRDSF_OPTIONS);
+.EE
+.PP
+If the actual number of words in the expanded input is greater than
+the supplied limit, the trailing part of the input will be returned in
+the last word. For example, if the input to the above fragment were
+\fBNow is the time for all good men\fR, then the returned words would be:
+.sp
+.EX
+"Now"
+"is"
+"the time for all good men"
+.EE
.SH WORDSPLIT_T STRUCTURE
The data type \fBwordsplit_t\fR has three members that contain
output data upon return from \fBwordsplit\fR or \fBwordsplit_len\fR,
@@ -410,6 +443,12 @@ from \fBwordsplit\fR.
Array of resulting words. Accessible upon successful return
from \fBwordsplit\fR.
.TP
+.BI "size_t " ws_wordi
+Total number of words processed. This field is intended for use with the
+.B WRDSF_INCREMENTAL
+flag. If that flag is not set, the following relation holds:
+.BR "ws_wordi == ws_wordc - ws_offs" .
+.TP
.BI "int " ws_errno
Error code, if the invocation of \fBwordsplit\fR or
\fBwordsplit_len\fR failed. This is the same value as returned from
@@ -435,6 +474,12 @@ flag is set, this member specifies the number of initial elements in
to fill with NULLs. These elements are not counted in the returned
.IR ws_wordc .
.TP
+.BI "size_t " ws_maxwords
+Maximum number of words to return. For this field to take effect, the
+\fBWRDSO_MAXWORDS\fR option and \fBWRDSF_OPTIONS\fR flag must be set.
+For a detailed discussion, see the chapter
+.BR "LIMITING THE NUMBER OF WORDS" .
+.TP
.BI "int " ws_flags
Contains flags passed to wordsplit on entry. Can be used as a
read-only member when using \fBwordsplit\fR in incremental mode or
@@ -804,6 +849,12 @@ Quote removal: handle octal escapes in doubly-quoted strings.
.TP
.B WRDSO_XESC_QUOTE
Quote removal: handle hex escapes in doubly-quoted strings.
+.TP
+.B WRDSO_MAXWORDS
+The \fBws_maxwords\fR member is initialized. This is used to control
+the number of words returned by a call to \fBwordsplit\fR. For a
+detailed discussion, refer to the chapter
+.BR "LIMITING THE NUMBER OF WORDS" .
.SH "ERROR CODES"
.TP
.BR WRDSE_OK ", " WRDSE_EOF
@@ -974,7 +1025,7 @@ Sergey Poznyakoff
.SH "BUG REPORTS"
Report bugs to <gray+grecs@gnu.org.ua>.
.SH COPYRIGHT
-Copyright \(co 2009-2014 Sergey Poznyakoff
+Copyright \(co 2009-2018 Sergey Poznyakoff
.br
.na
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
