diff options
Diffstat (limited to 'doc/vmod-dbrw.texi')
-rw-r--r-- | doc/vmod-dbrw.texi | 644 |
1 files changed, 644 insertions, 0 deletions
diff --git a/doc/vmod-dbrw.texi b/doc/vmod-dbrw.texi new file mode 100644 index 0000000..de83723 --- /dev/null +++ b/doc/vmod-dbrw.texi @@ -0,0 +1,644 @@ +\input texinfo @c -*-texinfo-*- +@smallbook +@c %**start of header +@setfilename vmod-dbrw.info +@settitle vmod-dbrw User Manual +@c %**end of header +@setchapternewpage odd + +@defcodeindex kw +@defcodeindex op +@syncodeindex fn cp +@syncodeindex vr cp +@syncodeindex ky cp +@syncodeindex pg cp +@syncodeindex tp cp +@syncodeindex kw cp +@syncodeindex op cp + +@include version.texi + +@ifinfo +@dircategory Varnish Cache Modules +@direntry +* vmod-dbrw: (vmod-dbrw). Database-driven rewrite rules. +@end direntry +@end ifinfo + +@copying +Published by the Free Software Foundation, +51 Franklin Street, Fifth Floor, +Boston, MA 02110-1301 USA + +Copyright @copyright{} 2013 Sergey Poznyakoff + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.3 or +any later version published by the Free Software Foundation; with no +Invariant Sections, no Front-Cover, and no Back-Cover texts. +A copy of the license is included in the section entitled ``GNU Free +Documentation License''. +@end copying + +@titlepage +@title Vmod-Dbrw User Manual +@subtitle version @value{VERSION}, @value{UPDATED} +@author Sergey Poznyakoff. +@page +@vskip 0pt plus 1filll +@insertcopying +@end titlepage + +@ifnothtml +@page +@summarycontents +@page +@end ifnothtml +@contents + +@ifnottex +@node Top +@top vmod-dbrw +This edition of the @cite{vmod-dbrw User Manual}, last updated @value{UPDATED}, +documents @command{vmod-dbrw} Version @value{VERSION}. +@end ifnottex + +@menu +* Intro:: What is vmod-dbrw. +* Overview:: How it works. +* Configuration:: +* Query:: How to Write a Query. +* Rewrite:: The @code{rewrite} Function. + +* Reporting Bugs:: How to Report a Bug. + +Appendices + +* Copying This Manual:: The GNU Free Documentation License. +* Concept Index:: Index of Concepts. + +@c Here are some other nodes which are really inferiors of the ones +@c already listed, mentioned here so you can get to them in one step: +@c +@c @detailmenu +@c @end detailmenu +@end menu + +@node Intro +@chapter Introduction to vmod-dbrw +@cindex Varnish +@cindex rewrite rules +@command{Vmod-dbrw} is a module for +@command{Varnish Cache}@footnote{@url{https://www.varnish-cache.org}} +which implements database-driven rewrite rules. These rules may be +similar to @code{RewriteRule} directives implemented by +@command{mod_rewrite}@footnote{@url{http://httpd.apache.org/docs/current/rewrite/}} +module in @command{Apache} or to @code{Redirect} directives of its +@command{mod_alias} module. What distinguishes the +@command{vmod-dbrw} rules from these, is that they are handled by +@command{Varnish}, before the request reaches the httpd server, and +that they are stored in an SQL database, which makes them easily +manageable. + +Some web sites implement thousands of rewrite rules. The purpose +of this module is to facilitate deploying and handling them. + +@node Overview +@chapter Overview +@cindex rewrite rules +@cindex database engines +@cindex supported databases +The rewrite rules are stored in a MySQL or PostgreSQL database. The +@command{vmod-dbrw} module does not impose any restrictions on its +schema. It only needs to know the SQL query which is to be used to +retrieve data. This query is supplied to the module, along with the +credentials for accessing the database, by calling the @code{config} +function in the @code{vcl_init} subroutine of the Varnish +configuration file. + +Once the module configured, the @code{rewrite} function can be called +in the appropriate place of the Varnish configuration file. Its argument +is a list of variable assignments separated with semicolons, each +assignment having the form @code{@var{name}=@var{value}}. When called, +@code{rewrite} expands the SQL query registered with the prior call to +@code{config} by replacing each @code{$@var{name}} +construct (a @dfn{variable reference}) with the corresponding +@var{value} from its argument. Similarly to the shell syntax, the +variable reference can also be written as @code{$@{@var{name}@}}. +This latter form can be used in contexts where the variable reference is +immediately followed by a letter, digit or underscore, to prevent it +from being counted as a part of the name. + +The expanded query is then sent to the database server. If it returns +a non-empty set, it is further handled depending on the number of +fields it contains. + +@cindex result interpretation +@anchor{result interpretation} +@cindex strict matching +@anchor{strict matching} +If the returned set has one or two columns, only the first tuple is +used and @code{rewrite} returns the value of its first column. + +@cindex regex matching +@anchor{regex matching} +Otherwise, if the returned set has three or more columns, the +regular expression matching is performed. For the purpose of +this discussion, let's refer to the columns as follows: @var{result}, +@var{regexp}, @var{value} and @var{flags}. The @var{flags} column is +optional. Any surplus columns are ignored. + +For each returned tuple, the @var{value} column undergoes variable +expansion, using the same algorithm as when preparing the query, and +the resulting string is matched with the @var{regexp} column, which is +treated as an extended POSIX regular expression. If the value matches +the expression, the @var{result} column is expanded by replacing +@dfn{backreferences}: each occurrence of @code{$@var{digit}} (where +@var{digit} stands for a decimal digit from @samp{0} through @samp{9}) +is replaced with the contents of the @var{digit}s parenthesized +subexpression in @var{regexp}. For compatibility with the traditional +usage, the @code{\@var{digit}} notation is also allowed. The +resulting value is then returned to the caller. + +@cindex flags +@anchor{flags} +Optional @var{flags} column is a comma-separated list of flags that +modify regular expression handling: + +@table @samp +@kindex NC +@kindex nocase +@cindex regular expression, case-insensitive +@cindex case-insensitive regular expression +@item NC +@itemx nocase +Treat @var{regexp} as case-insensitive regular expression. + +@kindex case +@cindex regular expression, case-sensitive +@cindex case-sensitive regular expression +@item case +Treat @var{regexp} as case-sensitive (default). + +@kindex QSA +@kindex qsappend +@cindex query string handling +@cindex append query string +@item QSA +@itemx qsappend +Treat the resulting value as URL; append any query string from the +original @var{value} to it. + +@kindex QSD +@kindex qsdiscard +@cindex query string handling +@cindex discard query string +@item QSD +@itemx qsdiscard +Treat the resulting value as URL; discard any query string attached to +the original @var{value}. + +@anchor{X-VMOD-DBRW-Status} +@vindex X-VMOD-DBRW-Status +@cindex redirection code +@cindex status code +@item redirect=@var{code} +@item R=@var{code} +On success, set the @samp{X-VMOD-DBRW-Status} header to @var{code}, +which must be a valid HTTP status code. +@end table + +If @var{regexp} or @var{value} is NULL, strict matching is assumed +(@pxref{strict matching}). + +If @var{flags} is NULL, it is ignored. + +@node Configuration +@chapter Configuration + +@deftypefn {function} void config (string @var{dbtype}, string @var{params}, @ + string @var{query}) +@cindex vcl_init +@cindex configuration +@cindex initialization +This function configures the module and provides it with the data necessary to +connect and use the database. It is normally called from the +@code{vcl_init} subroutine. + +Arguments: + +@table @var +@item dbtype +Type of the database to use. Valid values are @samp{mysql} and +@samp{pgsql}. + +@item params +Database connection parameters. This is a list of +@samp{@var{name}=@var{value}} assignments separated with semicolons. +The @var{value} part can be any sequence of characters, excepting +white space and semicolon. If any of these is to appear in it, they +must either be escaped by prepending them with a backslash, or the +entire @var{value} enclosed in a pair of (single or double) quotes. +The following @dfn{escape sequences} are allowed for use in +@var{value}: + +@cindex escape sequences +@cindex backslash interpretation +@float Table, backslash-interpretation +@caption{Backslash escapes} +@multitable @columnfractions 0.30 .5 +@item Sequence @tab Replaced with +@item \a @tab Audible bell character (@acronym{ASCII} 7) +@item \b @tab Backspace character (@acronym{ASCII} 8) +@item \f @tab Form-feed character (@acronym{ASCII} 12) +@item \n @tab Newline character (@acronym{ASCII} 10) +@item \r @tab Carriage return character (@acronym{ASCII} 13) +@item \t @tab Horizontal tabulation character (@acronym{ASCII} 9) +@item \v @tab Vertical tabulation character (@acronym{ASCII} 11) +@end multitable +@end float + +If a backslash is followed by a symbol other than listed above, it +is removed and the symbol following it is reproduced verbatim. + +The valid parameters are: + +@table @samp +@kindex debug +@cindex debugging level +@item debug=@var{n} +Set debugging level. Argument is a decimal number. + +@kindex server +@cindex database server +@item server=@var{host} +Name or IP address of the database server to connect to. If not +defined, localhost (@samp{127.0.0.1}) is assumed. For MySQL +databases, if @var{host} begins with a slash, its value is +taken to be the full pathname of the local UNIX socket to connect to. + +@kindex port +@cindex database server port +@item port=@var{n} +Port number on the @samp{server} to connect to. Default is @samp{3306} +for MySQL and 5432 for Postgres. + +@kindex database +@cindex database name +@cindex name, database +@item database=@var{name} +The name of the database to use. + +@kindex config +@cindex options, MySQL +@cindex MySQL options file +@item config=@var{filename} +(MySQL-specific) Read database access credentials and other +parameters from the MySQL options file @var{filename}. + +@kindex group +@item group=@var{name} +(MySQL-specific) Read credentials from section @var{name} of the +options file supplied with the @code{config} parameter. Default +section name is @samp{client}. + +@kindex cacert +@cindex SSL connection, MySQL +@cindex certificate authority file, SSL +@item cacert=@var{filename} +Use secure connection to the database server via SSL. The +@var{filename} argument is a full pathname of the certificate +authority file. + +@kindex options +@cindex options, PostgreSQL +@item options=@var{string} +(Postgres-specific) Connection options. + +@kindex user +@cindex database user name +@item user=@var{name} +Database user name. + +@kindex password +@cindex database password +@item password=@var{string} +Password to access the database. +@end table + +@kindex query +@cindex database query +@item query +The SQL query to use. It can contain variable references in the +(@code{$@var{name}} or @code{$@{@var{name}@}}), which will be replaced +with the actual value of the @var{name} argument to the function +@code{rewrite}. +@end table +@end deftypefn + +The example below configures @command{vmod-dbrw} to use MySQL database +@samp{rewrite}, with the user name @samp{varnish} and password @var{guessme}. + +@example +@group +import dbrw; + +sub vcl_init @{ + dbrw.config("mysql", + "database=rewrite;user=varnish;password=guessme", + @{"SELECT dest + FROM redirects + WHERE host='$host' + AND url='$url'"@}); +@} +@end group +@end example + +@node Query +@chapter Writing Queries +@cindex query +The query supplied to the @code{config} function depends on the +database schema and on the kind of matching required. To ensure the +best performance of the module it is important to design the database +and the query so that the database look up be as fast as possible. + +Suppose that you plan to use @command{vmod-dbrw} to implement +redirection rules based on strict matching (@pxref{strict matching}). + +The simplest database structure for this purpose (assuming MySQL) will be: + +@example +@group +CREATE TABLE redirects ( + id INT AUTO_INCREMENT, + host varchar(255) NOT NULL DEFAULT '', + url varchar(255) NOT NULL DEFAULT '', + dest varchar(255) DEFAULT NULL, + PRIMARY KEY (host,url) +); +@end group +@end example + +@noindent +The columns and their purpose are: + +@table @asis +@item id +An integer uniquely identifying the row. It is convenient for +managing the table (e.g. deleting the row). + +@item host +Host part of the incoming request. + +@item url +URL part of the incoming request. + +@item dest +Destination URL to redirect to. +@end table + +The rewrite function is to look for a row that has @samp{host} and +@samp{url} matching the incoming request and to redirect it to the +URL in the @samp{dest} column. The corresponding query is: + +@example +SELECT dest FROM redirects WHERE host='$host' AND url='$url' +@end example + +The variables @samp{host} and @samp{url} are supposed to contain the +actual host and URL parts of the incoming request. + +Handling regular expression matches is a bit trickier. Your query +should first return the rows that could match the request. Then the +@command{vmod-dbrw} engine will do the rest, by iterating over them +and finding the one that actually does. It will iterate over the rows +in the order they were returned by the database server, so it might be +necessary to sort them by some criterion beforehand. + +The following is an example table structure: + +@example +@group +CREATE TABLE rewrite ( + id INT AUTO_INCREMENT, + host varchar(255) NOT NULL DEFAULT '', + url varchar(255) NOT NULL DEFAULT '', + dest varchar(255) DEFAULT NULL, + value varchar(255) DEFAULT NULL, + pattern varchar(255) DEFAULT NULL, + flags char(64) DEFAULT NULL, + weight int NOT NULL DEFAULT '0', + KEY source (host,url) +); +@end group +@end example + +@noindent +The meaning of @code{id}, @code{host}, and @code{dest} is +the same as in the previous example. The meaning of @code{url} is +described below. Other columns are (@pxref{regex matching}): + +@table @asis +@item value +The value to be compared with the pattern. + +@item pattern +Regular expression to use. + +@item flags +Optional flags. + +@item weight +Relative weight of this row in the set. Rows will be sorted by this +column, in ascending order. +@end table + +@noindent +The simplest way to select candidate rows is by their @samp{host} +column: + +@example +@group +SELECT dest,pattern,value,flags +FROM rewrite +WHERE host='$host' +ORDER BY weight +@end group +@end example + +@noindent +One can further abridge the returned set by selecting only those +rows whose @code{url} column is the prefix of the requested URL: + +@example +@group +SELECT dest,pattern,value,flags +FROM rewrite +WHERE host='$host' +AND LOCATE(url,'$url')==1 +ORDER BY weight +@end group +@end example + +@noindent +Furthermore, the @code{url} column can contain a SQL wildcard pattern, +in which case the query will look like: + +@example +@group +SELECT dest,pattern,value,flags +FROM rewrite +WHERE host='$host' +AND '$url' like $url +ORDER BY weight +@end group +@end example + +@node Rewrite +@chapter The @code{rewrite} Function + +@deftypefn {function} string rewrite (string @var{args}) +This function is the working horse of the module. It rewrites its +argument using the database configured in the previous call to +@code{config} and returns the obtained value. + +To do so, it performs the following steps: + +@itemize @bullet +@item Parameter parsing + +The @var{args} parameter must be a list of +@code{@var{name}=@var{value}} pairs separated by semicolons. The +function parses this string and builds a symbol table. + +@item Variable expansion + +Using the symbol table built in the previous stage, each occurrence of +@code{$@var{name}} or @code{$@{@var{name}@}} is replaced with the actual +value of the variable @var{name} from the table. Expanding an +undefined variable is considered an error. + +@item Establishing the database connection + +Unless the connection has already been established by a prior call to +@code{rewrite}, the function establishes it using the parameters +supplied earlier in a call to @code{config}. If the connection fails, +the function returns NULL immediately. + +Database connections are persisting and thread-specific. This means +that each thread keeps its own connection to the database and attempts +to re-establish it if it goes down for some reason. + +@item Query execution + +The query is sent to the server and the resulting set collected from +it. + +@item Result interpretation + +The resulting set is interpreted as described in @ref{result +interpretation}. This results in a single value being returned to the +caller. +@end itemize +@end deftypefn + +@cindex vcl_recv +Assuming the database structure similar to the one discussed in the +previous chapter, the following example illustrates how to use +@code{rewrite} to redirect the incoming request: + +@example +@group +sub vcl_recv @{ + set req.http.X-Redirect-To = + dbrw.rewrite("host=" + req.http.Host + ";" + + "url=" + req.url); + if (req.http.X-Redirect-To != "") @{ + error(750, "Redirect"); + @} +@} +@end group +@end example + +Further handling of the 750 error should be performed in a traditional +way, e.g.: + +@vindex X-VMOD-DBRW-Status +@cindex vcl_error +@cindex vmod_std +@example +@group +import std; + +sub vcl_error @{ + if (obj.status == 750) @{ + set obj.http.Location = req.http.X-Redirect-To; + set obj.status = + std.integer(req.http.X-VMOD-DBRW-Status, 301); + return (deliver); + @} +@} +@end group +@end example + +The @code{X-VMOD-DBRW-Status} header, if set, contains the status code to be +returned to the client (@pxref{X-VMOD-DBRW-Status}). Notice the use +of the @command{vmod_std} module to cast it to integer. + +@cindex vmod_redirect +Using the @command{vmod_redirect}, the redirection can be further +simplified: + +@example +@group +import std; +import dbrw; +import redirect; + +sub vcl_recv @{ + set req.http.X-Redirect-To = + dbrw.rewrite("host=" + req.http.Host + ";" + + "url=" + req.url); + if (req.http.X-Redirect-To != "") @{ + error(redirect.location( + std.integer(req.http.X-VMOD-DBRW-Status, 301), + req.http.X-Redirect-To), "Redirection"); + @} +@} +@end group +@end example + +@noindent +The @code{vcl_error} subroutine is not needed in this case. + +@node Reporting Bugs +@chapter How to Report a Bug + + Email bug reports to @email{gray@@gnu.org}. + + As the purpose of bug reporting is to improve software, please be +sure to include a detailed information when reporting a bug. The minimum +information needed is: + +@itemize @bullet +@item Module version you use. +@item A description of the bug. +@item Conditions under which the bug appears. +@item It is often helpful to send the contents of @file{config.log} +file along with your bug report. This file is created after running +@command{./configure} in @command{vmod-dbrw} source root directory. +@end itemize + +@node Copying This Manual +@appendix GNU Free Documentation License +@include fdl.texi + +@node Concept Index +@unnumbered Concept Index + +This is a general index of all issues discussed in this manual + +@printindex cp + +@bye + |