\input texinfo @c -*-texinfo-*- @smallbook @c %**start of header @setfilename vmod-dbrw.info @settitle vmod-dbrw User Manual @c %**end of header @setchapternewpage odd @defcodeindex kw @defcodeindex op @syncodeindex fn cp @syncodeindex vr cp @syncodeindex ky cp @syncodeindex pg cp @syncodeindex tp cp @syncodeindex kw cp @syncodeindex op cp @include version.texi @ifinfo @dircategory Varnish Cache Modules @direntry * vmod-dbrw: (vmod-dbrw). Database-driven rewrite rules. @end direntry @end ifinfo @copying Published by the Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Copyright @copyright{} 2013-2018 Sergey Poznyakoff Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover, and no Back-Cover texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''. @end copying @titlepage @title Vmod-Dbrw User Manual @subtitle version @value{VERSION}, @value{UPDATED} @author Sergey Poznyakoff. @page @vskip 0pt plus 1filll @insertcopying @end titlepage @ifnothtml @page @summarycontents @page @end ifnothtml @contents @ifnottex @node Top @top vmod-dbrw This edition of the @cite{vmod-dbrw User Manual}, last updated @value{UPDATED}, documents @command{vmod-dbrw} Version @value{VERSION}. @end ifnottex @menu * Intro:: What is vmod-dbrw. * Overview:: How it works. * Configuration:: * Query:: How to Write a Query. * Rewrite:: The @code{rewrite} Function. * Reporting Bugs:: How to Report a Bug. Appendices * Copying This Manual:: The GNU Free Documentation License. * Concept Index:: Index of Concepts. @c Here are some other nodes which are really inferiors of the ones @c already listed, mentioned here so you can get to them in one step: @c @c @detailmenu @c @end detailmenu @end menu @node Intro @chapter Introduction to vmod-dbrw @cindex Varnish @cindex rewrite rules @command{Vmod-dbrw} is a module for @command{Varnish Cache}@footnote{@url{https://www.varnish-cache.org}} which implements database-driven rewrite rules. These rules may be similar to @code{RewriteRule} directives implemented by @command{mod_rewrite}@footnote{@url{http://httpd.apache.org/docs/current/rewrite/}} module in @command{Apache} or to @code{Redirect} directives of its @command{mod_alias} module. What distinguishes the @command{vmod-dbrw} rules from these, is that they are handled by @command{Varnish}, before the request reaches the httpd server, and that they are stored in an SQL database, which makes them easily manageable. Some web sites implement thousands of rewrite rules. The purpose of this module is to facilitate deploying and handling them. @node Overview @chapter Overview @cindex rewrite rules @cindex database engines @cindex supported databases Rewrite rules are stored in a MySQL or PostgreSQL database. The @command{vmod-dbrw} module does not impose any restrictions on its schema. It only needs to know the SQL query which is to be used to retrieve data. This query is supplied to the module, along with the credentials for accessing the database, by calling the @code{config} function in the @code{vcl_recv} subroutine of the Varnish configuration file. Once the module is configured, the @code{rewrite} function can be called in the appropriate place of the Varnish configuration file. Its argument is a list of variable assignments separated by semicolons, each assignment having the form @code{@var{name}=@var{value}}. When called, @code{rewrite} expands the SQL query registered with the prior call to @code{config} by replacing each @code{$@var{name}} construct (a @dfn{variable reference}) with the corresponding @var{value} from its argument. Similarly to the shell syntax, the variable reference can also be written as @code{$@{@var{name}@}}. This latter form can be used in contexts where the variable reference is immediately followed by a letter, digit or underscore, to prevent it from being counted as a part of the name. Special syntax is available for substituting default values and invoking built-in functions during the expansion of the query. @xref{Expansions}, for a detailed description of these. Having undergone expansions, the query is sent to the database server. The returned set of records (if non-empty) is processed depending on the number of fields it contains. @cindex result interpretation @anchor{result interpretation} @cindex strict matching @anchor{strict matching} If the returned set has one or two columns, only the first tuple is used and @code{rewrite} returns the value of its first column. @cindex regex matching @anchor{regex matching} Otherwise, if the returned set has three or more columns, the regular expression matching is performed. For the purpose of this discussion, let's refer to the columns as follows: @var{result}, @var{regexp}, @var{value} and @var{flags}. The @var{flags} column is optional. Any surplus columns are ignored. For each returned tuple, the @var{value} column undergoes variable expansion, using the same algorithm as when preparing the query, and the resulting string is matched with the @var{regexp} column, which is treated as an extended POSIX regular expression. If the value matches the expression, the @var{result} column is expanded by replacing @dfn{backreferences}: each occurrence of @code{$@var{digit}} (where @var{digit} stands for a decimal digit from @samp{0} through @samp{9}) is replaced by the contents of the @var{digit}s parenthesized subexpression in @var{regexp}. For compatibility with the traditional usage, the @code{\@var{digit}} notation is also allowed. The resulting value is then returned to the caller. @cindex flags @anchor{flags} Optional @var{flags} column is a comma-separated list of flags that modify regular expression handling: @table @samp @kindex NC @kindex nocase @cindex regular expression, case-insensitive @cindex case-insensitive regular expression @item NC @itemx nocase Treat @var{regexp} as case-insensitive regular expression. @kindex case @cindex regular expression, case-sensitive @cindex case-sensitive regular expression @item case Treat @var{regexp} as case-sensitive (default). @kindex QSA @kindex qsappend @cindex query string handling @cindex append query string @item QSA @itemx qsappend Treat the resulting value as URL; append any query string from the original @var{value} to it. @kindex QSD @kindex qsdiscard @cindex query string handling @cindex discard query string @item QSD @itemx qsdiscard Treat the resulting value as URL; discard any query string attached to the original @var{value}. @anchor{X-VMOD-DBRW-Status} @vindex X-VMOD-DBRW-Status @cindex redirection code @cindex status code @item redirect=@var{code} @item R=@var{code} On success, set the @samp{X-VMOD-DBRW-Status} header to @var{code}, which must be a valid HTTP status code. @end table If @var{regexp} or @var{value} is NULL, strict matching is assumed (@pxref{strict matching}). If @var{flags} is NULL, it is ignored. @node Configuration @chapter Configuration @deftypefn {function} void config (string @var{dbtype}, string @var{params}, @ string @var{query}) @cindex vcl_recv @cindex configuration @cindex initialization This function configures the module and provides it with the data necessary to connect and use the database. It is normally called from the @code{vcl_recv} subroutine. Arguments: @table @var @item dbtype Type of the database to use. Valid values are @samp{mysql} and @samp{pgsql}. @item params Database connection parameters. This is a list of @samp{@var{name}=@var{value}} assignments separated by semicolons. The @var{value} part can be any sequence of characters, excepting white space and semicolon. If @var{value} contains any of these, they either must be escaped by prepending them with a backslash, or the entire @var{value} must be enclosed in a pair of (single or double) quotes. The following @dfn{escape sequences} are allowed for use in @var{value}: @cindex escape sequences @cindex backslash interpretation @float Table, backslash-interpretation @caption{Backslash escapes} @multitable @columnfractions 0.30 .5 @item Sequence @tab Replaced by @item \a @tab Audible bell character (@acronym{ASCII} 7) @item \b @tab Backspace character (@acronym{ASCII} 8) @item \f @tab Form-feed character (@acronym{ASCII} 12) @item \n @tab Newline character (@acronym{ASCII} 10) @item \r @tab Carriage return character (@acronym{ASCII} 13) @item \t @tab Horizontal tabulation character (@acronym{ASCII} 9) @item \v @tab Vertical tabulation character (@acronym{ASCII} 11) @end multitable @end float If a backslash is immediately followed by a symbol not listed in the above table, it is removed and the symbol is reproduced verbatim. Valid parameters are: @table @samp @kindex debug @cindex debugging level @item debug=@var{n} Set debugging level. Argument is a decimal number. @kindex server @cindex database server @item server=@var{host} Name or IP address of the database server to connect to. If not defined, localhost (@samp{127.0.0.1}) is assumed. For MySQL databases, if @var{host} begins with a slash, its value is taken to be the full pathname of the local UNIX socket to connect to. @kindex port @cindex database server port @item port=@var{n} Port number on the @samp{server} to connect to. Default is @samp{3306} for MySQL and 5432 for Postgres. @kindex database @cindex database name @cindex name, database @item database=@var{name} The name of the database to use. @kindex config @cindex options, MySQL @cindex MySQL options file @item config=@var{filename} (MySQL-specific) Read database access credentials and other parameters from the MySQL options file @var{filename}. @kindex group @item group=@var{name} (MySQL-specific) Read credentials from section @var{name} of the options file supplied with the @code{config} parameter. Default section name is @samp{client}. @kindex cacert @cindex SSL connection, MySQL @cindex certificate authority file, SSL @item cacert=@var{filename} Use secure connection to the database server via SSL. The @var{filename} argument is a full pathname of the certificate authority file. @kwindex timeout @cindex timeout, idle, SQL @cindex idle timeout, SQL @item timeout=@var{n} Sets idle timeout for a single connection. The connection will be closed and opened again if the module is to use it after @var{n} or more seconds since its last use. Set @samp{timeout=-1} to disable idle timeout (session will remain open until the SQL server closes it). Set @samp{timeout=0} to close the connection after each request (not recommended). The default depends on the selected SQL backend. For MySQL, it equals the value of the @samp{wait_timeout} global variable. For PostgreSQL, it is @samp{-1}. @kindex options @cindex options, PostgreSQL @item options=@var{string} (Postgres-specific) Connection options. @kindex user @cindex database user name @item user=@var{name} Database user name. @kindex password @cindex database password @item password=@var{string} Password to access the database. @end table @kindex query @cindex database query @item query The SQL query to use. It can contain variable references (@code{$@var{name}} or @code{$@{@var{name}@}}), which will be expanded to the actual value of the @var{name} argument to the function @code{rewrite}. @xref{Expansions}, for details. @end table @end deftypefn The example below configures @command{vmod-dbrw} to use MySQL database @samp{rewrite}, with the user name @samp{varnish} and password @samp{guessme}. @example @group import dbrw; sub vcl_recv @{ dbrw.config("mysql", "database=rewrite;user=varnish;password=guessme", @{"SELECT dest FROM redirects WHERE host='$host' AND url='$url'"@}); @} @end group @end example @menu * Expansions:: @end menu @node Expansions @section Expansions @cindex expansions @cindex variable expansion @cindex expansion, variable The @samp{query} argument to the @code{dbrw.config} function normally contains variable references. A variable reference has the form @samp{$@var{variable}} or @samp{$@{@var{variable}@}}, where @var{variable} is the variable name. When the @code{dbrw.rewrite} function (@pxref{Rewrite}) is called, each such reference is expanded to the actual value of @var{variable} passed in the argument to that function. The two forms are entirely equivalent. The form with curly braces is normally used if the variable name is immediately followed by an alphanumeric symbol, which will otherwise be considered a part of it. This form also allows for specifying the action to take if the variable is undefined or expands to an empty value. During variable expansion, the forms below cause @code{dbrw.rewrite} to test for a variable that is unset or null (i.e., whose value is an empty string). Omitting the colon results in a test only for a variable that is unset. @table @asis @item $@{@var{variable}:-@var{word}@} @dfn{Use Default Values}. If @var{variable} is unset or null, the expansion of @var{word} is substituted. Otherwise, the value of @var{variable} is substituted. @item $@{@var{variable}:=@var{word}@} @dfn{Assign Default Values}. If @var{variable} is unset or null, the expansion of @var{word} is assigned to variable. The value of @var{variable} is then substituted. @item $@{@var{variable}:?@var{word}@} @dfn{Display Error if Null or Unset}. If @var{variable} is null or unset, the expansion of @var{word} (or a message to that effect if @var{word} is not present) is output to the current logging channel. Otherwise, the value of @var{variable} is substituted. @item $@{@var{variable}:+@var{word}@} @dfn{Use Alternate Value}. If @var{variable} is null or unset, nothing is substituted, otherwise the expansion of @var{word} is substituted. @end table @cindex command expansion @cindex expansion, command After expanding variables, the query undergoes @dfn{command expansion}. Syntactically, a command invocation is @example $(@var{cmd} @var{args}) @end example @noindent where @var{cmd} is the command name, and @var{args} is a list of arguments separated by whitespace. Arguments can in turn contain variable and command references. During command expansion, each invocation is replaced by the result of the call to function @var{cmd} with the supplied arguments. As of version @value{VERSION} of @command{vmod-dbrw}, only one function is declared: @deffn {Command} urlprefixes @var{uri} Expands to comma-separated list of path prefixes contained in @var{uri}, starting from the longest one (@var{uri} itself, with eventual query part stripped off). Single @samp{/} is not included in the list. Each list item is quoted. The expansion can be used in the @samp{IN ()} SQL conditional. @end deffn @node Query @chapter Writing Queries @cindex query The query supplied to the @code{config} function depends on the database schema and on the desired kind of matching (e.g. exact vs. wildcard). To ensure the best performance of the module it is important to design the schema and the query so that the database look up be as fast as possible. Suppose that you plan to use @command{vmod-dbrw} to implement redirection rules based on strict matching (@pxref{strict matching}). The simplest database structure for this purpose (assuming MySQL) will be: @example @group CREATE TABLE redirects ( id INT AUTO_INCREMENT, host varchar(255) NOT NULL DEFAULT '', url varchar(255) NOT NULL DEFAULT '', dest varchar(255) DEFAULT NULL, PRIMARY KEY (host,url) ); @end group @end example @noindent The columns and their purpose are: @table @asis @item id An integer uniquely identifying the row. It is useful for table management purposes (e.g. deleting the row). @item host Host part of the incoming request. @item url URL part of the incoming request. @item dest Destination URL to redirect to. @end table The rewrite function looks up a row that has @samp{host} and @samp{url} matching the incoming request and, if found, returns the value of its @samp{dest} column. The corresponding query is: @example SELECT dest FROM redirects WHERE host='$host' AND url='$url' @end example The variables @samp{host} and @samp{url} are supposed to contain the actual host and URL parts of the incoming request. Handling regular expression matches is a bit trickier. Your query should first return such rows that could possibly match the request. Then the @command{vmod-dbrw} engine will do the rest, by iterating over the returned set and finding the row that actually matches the request. It will iterate over the rows in the order they were returned by the database server, so it might be necessary to sort them by some criterion beforehand. The following is an example table structure: @example @group CREATE TABLE rewrite ( id INT AUTO_INCREMENT, host varchar(255) NOT NULL DEFAULT '', url varchar(255) NOT NULL DEFAULT '', dest varchar(255) DEFAULT NULL, value varchar(255) DEFAULT NULL, pattern varchar(255) DEFAULT NULL, flags char(64) DEFAULT NULL, weight int NOT NULL DEFAULT '0', KEY source (host,url) ); @end group @end example @noindent The meaning of @code{id}, @code{host}, and @code{dest} is the same as in the previous example. The meaning of @code{url} is described below. Other columns are (@pxref{regex matching}): @table @asis @item value The value to be compared with the pattern. @item pattern Regular expression to use. @item flags Optional flags. @item weight Relative weight of this row in the set. Rows will be sorted by this column, in ascending order. @end table @noindent The simplest way to select candidate rows is by their @samp{host} column: @example @group SELECT dest,pattern,value,flags FROM rewrite WHERE host='$host' ORDER BY weight @end group @end example @noindent One can further abridge the returned set by selecting only those rows whose @code{url} column is the prefix of the requested URL: @example @group SELECT dest,pattern,value,flags FROM rewrite WHERE host='$host' AND LOCATE(url,'$url')==1 ORDER BY weight @end group @end example @noindent Furthermore, the @code{url} column can contain a path prefix, which can be matched using the @code{IN} conditional: @example @group SELECT dest,pattern,value,flags FROM rewrite WHERE host='$host' AND url IN ($(urlprefixes $url)) ORDER BY weight @end group @end example Notice the use of the @samp{$(urlprefixes $url)}. This invokes the built-in @dfn{function} @code{urlprefixes}, which expands to comma-separated list of properly quoted pathname prefixes, constructed from its argument. For example, if @samp{$url} is @samp{/local/user/local?a=1}, then the expansion of @samp{$(urlprefixes $url)} is: @example '/local/user/local','/local/user','/local' @end example @node Rewrite @chapter The @code{rewrite} Function @deftypefn {function} string rewrite (string @var{args}) This function is the working horse of the module. It rewrites its argument using the database configured in the previous call to @code{config} and returns the obtained value. To do so, it performs the following steps: @itemize @bullet @item Parameter parsing The @var{args} parameter must be a list of @code{@var{name}=@var{value}} pairs separated by semicolons. The function parses this string and builds a symbol table. @item Variable expansion Using the symbol table built in the previous stage, each occurrence of @code{$@var{name}} or @code{$@{@var{name}@}} is replaced by the actual value of the variable @var{name} from the table. Expanding an undefined variable is considered an error. @item Establishing the database connection Unless the connection has already been established by a prior call to @code{rewrite}, the function establishes it using the parameters supplied earlier in a call to @code{config}. If the connection fails, the function returns NULL immediately. Database connections are persisting and thread-specific. This means that each thread keeps its own connection to the database and attempts to re-establish it if it goes down for some reason. @item Query execution The query is sent to the server and the resulting set collected from it. @item Result interpretation The resulting set is interpreted as described in @ref{result interpretation}. This results in a single value being returned to the caller. @end itemize @end deftypefn @cindex vcl_recv Assuming the database structure similar to the one discussed in the previous chapter, the following example illustrates how to use @code{rewrite} to redirect the incoming request. @example @group sub vcl_recv @{ dbrw.config("mysql", "database=rewrite;user=varnish;password=guessme", @{"SELECT dest FROM redirects WHERE host='$host' AND url='$url'"@}); set req.http.X-Redirect-To = dbrw.rewrite("host=" + req.http.Host + ";" + "url=" + req.url); if (req.http.X-Redirect-To != "") @{ return(synth(301, "Redirect")); @} @} @end group @end example The @samp{synth} sub must be provided in order to construct redirection responses: @vindex X-VMOD-DBRW-Status @cindex vcl_synth @cindex vmod_std @example @group import std; sub vcl_synth @{ if (resp.status == 301) @{ set resp.http.Location = req.http.X-Redirect-To; if (req.http.X-VMOD-DBRW-Status != "") @{ set resp.status = std.integer(req.http.X-VMOD-DBRW-Status, 301); @} return (deliver); @} @} @end group @end example The @code{X-VMOD-DBRW-Status} header, if set, contains the status code to be returned to the client (@pxref{X-VMOD-DBRW-Status}). Notice the use of the @command{vmod_std} module to cast it to integer. @node Reporting Bugs @chapter How to Report a Bug Email bug reports to @email{gray@@gnu.org}. As the purpose of bug reporting is to improve software, please be sure to include a detailed information when reporting a bug. The minimum information needed is: @itemize @bullet @item Module version you use. @item A description of the bug. @item Conditions under which the bug appears. @item It is often helpful to send the contents of @file{config.log} file along with your bug report. This file is created after running @command{./configure} in @command{vmod-dbrw} source root directory. @end itemize @node Copying This Manual @appendix GNU Free Documentation License @include fdl.texi @node Concept Index @unnumbered Concept Index This is a general index of all issues discussed in this manual @printindex cp @bye