aboutsummaryrefslogtreecommitdiff
path: root/doc/vmod-dbrw.texi
blob: 82adda2a6bc07f3c021904ffe003e41cf0a3659b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
\input texinfo @c -*-texinfo-*-
@smallbook
@c %**start of header
@setfilename vmod-dbrw.info
@settitle vmod-dbrw User Manual
@c %**end of header
@setchapternewpage odd

@defcodeindex kw
@defcodeindex op
@syncodeindex fn cp
@syncodeindex vr cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp
@syncodeindex kw cp
@syncodeindex op cp

@include version.texi

@ifinfo
@dircategory Varnish Cache Modules
@direntry
* vmod-dbrw: (vmod-dbrw).       Database-driven rewrite rules.
@end direntry
@end ifinfo

@copying
Published by the Free Software Foundation,
51 Franklin Street, Fifth Floor,
Boston, MA 02110-1301 USA

Copyright @copyright{} 2013-2017 Sergey Poznyakoff

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover, and no Back-Cover texts.
A copy of the license is included in the section entitled ``GNU Free
Documentation License''.
@end copying

@titlepage
@title Vmod-Dbrw User Manual
@subtitle version @value{VERSION}, @value{UPDATED}
@author Sergey Poznyakoff.
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@ifnothtml
@page
@summarycontents
@page
@end ifnothtml
@contents

@ifnottex
@node Top
@top vmod-dbrw
This edition of the @cite{vmod-dbrw User Manual}, last updated @value{UPDATED},
documents @command{vmod-dbrw} Version @value{VERSION}.
@end ifnottex

@menu
* Intro::                  What is vmod-dbrw.
* Overview::               How it works.
* Configuration::
* Query::                  How to Write a Query.
* Rewrite::                The @code{rewrite} Function.

* Reporting Bugs::         How to Report a Bug.

Appendices

* Copying This Manual::    The GNU Free Documentation License.
* Concept Index::          Index of Concepts.

@c Here are some other nodes which are really inferiors of the ones
@c already listed, mentioned here so you can get to them in one step:
@c
@c @detailmenu
@c @end detailmenu
@end menu

@node Intro
@chapter Introduction to vmod-dbrw
@cindex Varnish
@cindex rewrite rules
@command{Vmod-dbrw} is a module for
@command{Varnish Cache}@footnote{@url{https://www.varnish-cache.org}}
which implements database-driven rewrite rules.  These rules may be
similar to @code{RewriteRule} directives implemented by
@command{mod_rewrite}@footnote{@url{http://httpd.apache.org/docs/current/rewrite/}}
module in @command{Apache} or to @code{Redirect} directives of its
@command{mod_alias} module.  What distinguishes the
@command{vmod-dbrw} rules from these, is that they are handled by
@command{Varnish}, before the request reaches the httpd server, and 
that they are stored in an SQL database, which makes them easily
manageable.

Some web sites implement thousands of rewrite rules.  The purpose
of this module is to facilitate deploying and handling them.

@node Overview
@chapter Overview
@cindex rewrite rules
@cindex database engines
@cindex supported databases
Rewrite rules are stored in a MySQL or PostgreSQL database.  The
@command{vmod-dbrw} module does not impose any restrictions on its
schema.  It only needs to know the SQL query which is to be used to
retrieve data.  This query is supplied to the module, along with the
credentials for accessing the database, by calling the @code{config}
function in the @code{vcl_recv} subroutine of the Varnish
configuration file.

Once the module is configured, the @code{rewrite} function can be called
in the appropriate place of the Varnish configuration file.  Its argument
is a list of variable assignments separated with semicolons, each
assignment having the form @code{@var{name}=@var{value}}.  When called,
@code{rewrite} expands the SQL query registered with the prior call to
@code{config} by replacing each @code{$@var{name}}
construct (a @dfn{variable reference}) with the corresponding
@var{value} from its argument.  Similarly to the shell syntax, the
variable reference can also be written as @code{$@{@var{name}@}}.
This latter form can be used in contexts where the variable reference is
immediately followed by a letter, digit or underscore, to prevent it
from being counted as a part of the name.

The expanded query is then sent to the database server.  If it returns
a non-empty set, it is further handled depending on the number of
fields it contains.

@cindex result interpretation
@anchor{result interpretation}
@cindex strict matching
@anchor{strict matching}
If the returned set has one or two columns, only the first tuple is
used and @code{rewrite} returns the value of its first column.

@cindex regex matching
@anchor{regex matching}
Otherwise, if the returned set has three or more columns, the
regular expression matching is performed.  For the purpose of
this discussion, let's refer to the columns  as follows: @var{result},
@var{regexp}, @var{value} and @var{flags}.  The @var{flags} column is
optional.  Any surplus columns are ignored.

For each returned tuple, the @var{value} column undergoes variable
expansion, using the same algorithm as when preparing the query, and
the resulting string is matched with the @var{regexp} column, which is
treated as an extended POSIX regular expression.  If the value matches
the expression, the @var{result} column is expanded by replacing
@dfn{backreferences}: each occurrence of @code{$@var{digit}} (where
@var{digit} stands for a decimal digit from @samp{0} through @samp{9})
is replaced with the contents of the @var{digit}s parenthesized
subexpression in @var{regexp}.  For compatibility with the traditional
usage, the @code{\@var{digit}} notation is also allowed.  The
resulting value is then returned to the caller.

@cindex flags
@anchor{flags}
Optional @var{flags} column is a comma-separated list of flags that
modify regular expression handling:

@table @samp
@kindex NC
@kindex nocase
@cindex regular expression, case-insensitive
@cindex case-insensitive regular expression
@item NC
@itemx nocase
Treat @var{regexp} as case-insensitive regular expression.

@kindex case
@cindex regular expression, case-sensitive
@cindex case-sensitive regular expression
@item case
Treat @var{regexp} as case-sensitive (default).

@kindex QSA
@kindex qsappend
@cindex query string handling
@cindex append query string
@item QSA
@itemx qsappend
Treat the resulting value as URL; append any query string from the
original @var{value} to it.

@kindex QSD
@kindex qsdiscard
@cindex query string handling
@cindex discard query string
@item QSD
@itemx qsdiscard
Treat the resulting value as URL; discard any query string attached to
the original @var{value}.

@anchor{X-VMOD-DBRW-Status}
@vindex X-VMOD-DBRW-Status
@cindex redirection code
@cindex status code
@item redirect=@var{code}
@item R=@var{code}
On success, set the @samp{X-VMOD-DBRW-Status} header to @var{code},
which must be a valid HTTP status code. 
@end table

If @var{regexp} or @var{value} is NULL, strict matching is assumed
(@pxref{strict matching}).

If @var{flags} is NULL, it is ignored.

@node Configuration
@chapter Configuration

@deftypefn {function} void config (string @var{dbtype}, string @var{params}, @
                                   string @var{query})
@cindex vcl_recv
@cindex configuration
@cindex initialization
This function configures the module and provides it with the data necessary to
connect and use the database.  It is normally called from the
@code{vcl_recv} subroutine.

Arguments:

@table @var
@item dbtype
Type of the database to use.  Valid values are @samp{mysql} and
@samp{pgsql}.

@item params
Database connection parameters.  This is a list of
@samp{@var{name}=@var{value}} assignments separated with semicolons.
The @var{value} part can be any sequence of characters, excepting
white space and semicolon.  If any of these is to appear in it, they
must either be escaped by prepending them with a backslash, or the
entire @var{value} enclosed in a pair of (single or double) quotes.
The following @dfn{escape sequences} are allowed for use in
@var{value}:

@cindex escape sequences
@cindex backslash interpretation
@float Table, backslash-interpretation
@caption{Backslash escapes}
@multitable @columnfractions 0.30 .5
@item Sequence @tab Replaced with
@item \a @tab Audible bell character (@acronym{ASCII} 7)
@item \b @tab Backspace character (@acronym{ASCII} 8)
@item \f @tab Form-feed character (@acronym{ASCII} 12)
@item \n @tab Newline character (@acronym{ASCII} 10)
@item \r @tab Carriage return character (@acronym{ASCII} 13)
@item \t @tab Horizontal tabulation character (@acronym{ASCII} 9)
@item \v @tab Vertical tabulation character (@acronym{ASCII} 11)
@end multitable
@end float

If a backslash is followed by a symbol other than listed above, it
is removed and the symbol following it is reproduced verbatim.

Valid parameters are:

@table @samp
@kindex debug
@cindex debugging level
@item debug=@var{n}
Set debugging level.  Argument is a decimal number.

@kindex server
@cindex database server
@item server=@var{host}
Name or IP address of the database server to connect to.  If not
defined, localhost (@samp{127.0.0.1}) is assumed.  For MySQL
databases, if @var{host} begins with a slash, its value is
taken to be the full pathname of the local UNIX socket to connect to.

@kindex port
@cindex database server port
@item port=@var{n}
Port number on the @samp{server} to connect to.  Default is @samp{3306}
for MySQL and 5432 for Postgres.

@kindex database
@cindex database name
@cindex name, database
@item database=@var{name}
The name of the database to use.

@kindex config
@cindex options, MySQL
@cindex MySQL options file
@item config=@var{filename}
(MySQL-specific) Read database access credentials and other
parameters from the MySQL options file @var{filename}.

@kindex group
@item group=@var{name}
(MySQL-specific) Read credentials from section @var{name} of the
options file supplied with the @code{config} parameter.  Default
section name is @samp{client}.

@kindex cacert
@cindex SSL connection, MySQL
@cindex certificate authority file, SSL
@item cacert=@var{filename}
Use secure connection to the database server via SSL.  The
@var{filename} argument is a full pathname of the certificate
authority file.

@kindex options
@cindex options, PostgreSQL
@item options=@var{string}
(Postgres-specific) Connection options.

@kindex user
@cindex database user name
@item user=@var{name}
Database user name.

@kindex password
@cindex database password
@item password=@var{string}
Password to access the database.
@end table

@kindex query
@cindex database query
@item query
The SQL query to use.  It can contain variable references in the
(@code{$@var{name}} or @code{$@{@var{name}@}}), which will be replaced
with the actual value of the @var{name} argument to the function
@code{rewrite}.
@end table
@end deftypefn

The example below configures @command{vmod-dbrw} to use MySQL database
@samp{rewrite}, with the user name @samp{varnish} and password @var{guessme}.

@example
@group
import dbrw;

sub vcl_recv @{
    dbrw.config("mysql",
           "database=rewrite;user=varnish;password=guessme",
           @{"SELECT dest
                FROM redirects
               WHERE host='$host'
                AND url='$url'"@});
@}
@end group
@end example

@node Query
@chapter Writing Queries
@cindex query
The query supplied to the @code{config} function depends on the
database schema and on the kind of matching required.  To ensure the
best performance of the module it is important to design the database
and the query so that the database look up be as fast as possible.

Suppose that you plan to use @command{vmod-dbrw} to implement
redirection rules based on strict matching (@pxref{strict matching}).

The simplest database structure for this purpose (assuming MySQL) will be:

@example
@group
CREATE TABLE redirects (
  id INT AUTO_INCREMENT,
  host varchar(255) NOT NULL DEFAULT '',
  url varchar(255) NOT NULL DEFAULT '',
  dest varchar(255) DEFAULT NULL,
  PRIMARY KEY (host,url)
);
@end group
@end example

@noindent
The columns and their purpose are:

@table @asis
@item id
An integer uniquely identifying the row.  It is convenient for
managing the table (e.g. deleting the row).

@item host
Host part of the incoming request.

@item url
URL part of the incoming request.

@item dest
Destination URL to redirect to.
@end table

The rewrite function is to look for a row that has @samp{host} and
@samp{url} matching the incoming request and to redirect it to the
URL in the @samp{dest} column.  The corresponding query is:

@example
SELECT dest FROM redirects WHERE host='$host' AND url='$url'
@end example

The variables @samp{host} and @samp{url} are supposed to contain the
actual host and URL parts of the incoming request.

Handling regular expression matches is a bit trickier.  Your query
should first return the rows that could match the request.  Then the
@command{vmod-dbrw} engine will do the rest, by iterating over them
and finding the one that actually does.  It will iterate over the rows
in the order they were returned by the database server, so it might be
necessary to sort them by some criterion beforehand.

The following is an example table structure:

@example
@group
CREATE TABLE rewrite (
  id INT AUTO_INCREMENT,
  host varchar(255) NOT NULL DEFAULT '',
  url varchar(255) NOT NULL DEFAULT '',
  dest varchar(255) DEFAULT NULL,
  value varchar(255) DEFAULT NULL,
  pattern varchar(255) DEFAULT NULL,
  flags char(64) DEFAULT NULL,
  weight int NOT NULL DEFAULT '0',
  KEY source (host,url)
);
@end group
@end example

@noindent
The meaning of @code{id}, @code{host}, and @code{dest} is
the same as in the previous example.  The meaning of @code{url} is
described below.  Other columns are (@pxref{regex matching}):

@table @asis
@item value
The value to be compared with the pattern.

@item pattern
Regular expression to use.

@item flags
Optional flags.

@item weight
Relative weight of this row in the set.  Rows will be sorted by this
column, in ascending order.
@end table

@noindent
The simplest way to select candidate rows is by their @samp{host}
column:

@example
@group
SELECT dest,pattern,value,flags
FROM rewrite
WHERE host='$host'
ORDER BY weight
@end group
@end example

@noindent
One can further abridge the returned set by selecting only those
rows whose @code{url} column is the prefix of the requested URL:

@example
@group
SELECT dest,pattern,value,flags
FROM rewrite
WHERE host='$host'
AND LOCATE(url,'$url')==1
ORDER BY weight
@end group
@end example

@noindent
Furthermore, the @code{url} column can contain a SQL wildcard pattern,
in which case the query will look like:

@example
@group
SELECT dest,pattern,value,flags
FROM rewrite
WHERE host='$host'
AND '$url' like $url
ORDER BY weight
@end group
@end example

@node Rewrite
@chapter The @code{rewrite} Function

@deftypefn {function} string rewrite (string @var{args})
This function is the working horse of the module.  It rewrites its
argument using the database configured in the previous call to
@code{config} and returns the obtained value.

To do so, it performs the following steps:

@itemize @bullet
@item Parameter parsing

The @var{args} parameter must be a list of
@code{@var{name}=@var{value}} pairs separated by semicolons.  The
function parses this string and builds a symbol table.

@item Variable expansion

Using the symbol table built in the previous stage, each occurrence of
@code{$@var{name}} or @code{$@{@var{name}@}} is replaced with the actual
value of the variable @var{name} from the table.  Expanding an
undefined variable is considered an error.

@item Establishing the database connection

Unless the connection has already been established by a prior call to
@code{rewrite}, the function establishes it using the parameters
supplied earlier in a call to @code{config}.  If the connection fails,
the function returns NULL immediately.

Database connections are persisting and thread-specific.  This means
that each thread keeps its own connection to the database and attempts
to re-establish it if it goes down for some reason.

@item Query execution

The query is sent to the server and the resulting set collected from
it.

@item Result interpretation

The resulting set is interpreted as described in @ref{result
interpretation}.  This results in a single value being returned to the
caller.
@end itemize
@end deftypefn

@cindex vcl_recv
Assuming the database structure similar to the one discussed in the
previous chapter, the following example illustrates how to use
@code{rewrite} to redirect the incoming request.

@example
@group
sub vcl_recv @{
    dbrw.config("mysql",
           "database=rewrite;user=varnish;password=guessme",
           @{"SELECT dest
                FROM redirects
               WHERE host='$host'
                AND url='$url'"@});
    set req.http.X-Redirect-To =
        dbrw.rewrite("host=" + req.http.Host + ";" +
                     "url=" + req.url);
    if (req.http.X-Redirect-To != "") @{
        return(synth(301, "Redirect"));
    @}
@}
@end group
@end example

Further handling of the 301 response should be performed in a traditional
way, e.g.:

@vindex X-VMOD-DBRW-Status
@cindex vcl_synth
@cindex vmod_std
@example
@group
import std;

sub vcl_synth @{
    if (resp.status == 301) @{
        set resp.http.Location = req.http.X-Redirect-To;
        if (req.http.X-VMOD-DBRW-Status != "") @{
            set resp.status =
               std.integer(req.http.X-VMOD-DBRW-Status, 301);
        @}
        return (deliver);
    @}
@}
@end group
@end example

The @code{X-VMOD-DBRW-Status} header, if set, contains the status code to be
returned to the client (@pxref{X-VMOD-DBRW-Status}).  Notice the use
of the @command{vmod_std} module to cast it to integer.

@node Reporting Bugs
@chapter How to Report a Bug

  Email bug reports to @email{gray@@gnu.org}.

  As the purpose of bug reporting is to improve software, please be
sure to include a detailed information when reporting a bug. The minimum
information needed is:

@itemize @bullet
@item Module version you use.
@item A description of the bug.
@item Conditions under which the bug appears.
@item It is often helpful to send the contents of @file{config.log}
file along with your bug report. This file is created after running
@command{./configure} in @command{vmod-dbrw} source root directory.
@end itemize

@node Copying This Manual
@appendix GNU Free Documentation License
@include fdl.texi

@node Concept Index
@unnumbered Concept Index

This is a general index of all issues discussed in this manual

@printindex cp

@bye

Return to:

Send suggestions and report system problems to the System administrator.