MediaWiki Markup Translator
===========================
This package provides a Python framework for translating WikiMedia
articles to various formats. The present version supports
conversion to plain text, HTML, and Texinfo.

A command line converter utility is included.

Classes
=======

class ``WikiMarkup``
--------------------
The base class for all translator classes. Unless you plan to extend
wikitrans, you will never need to create objects of this
class. Instead, you will use one of its derived classes.

Constructor arguments common to all derived classes:

filename = *name*
  The file *name* is opened and used for input.
file = *fd*
  An already opened file *fd* is used for input.
text = *string*
  Input is taken from *string*, line by line.

lang = *code*
  Specifies the language version. Default is ``en``. This variable can be
  referred to as ``%(lang)s`` in the keyword arguments below.
html_base = *url*
  Base URL for cross-references. Default is
  ``http://%(lang)s.wikipedia.org/wiki/``.
image_base = *url*
  Base URL for images. Default is
  ``http://upload.wikimedia.org/wikipedia/commons/thumb/a/bf``.
media_base = *url*
  Base URL for media files. Default is
  ``http://www.mediawiki.org/xml/export-0.3``.
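
The ``%(lang)s`` notation is Python's mapping-based string formatting.
The following minimal sketch shows how such a placeholder expands. The
helper ``expand_base`` is hypothetical and is not part of the package;
how the classes perform the substitution internally is not described
here.

```python
# Hypothetical helper, not part of the package: it only illustrates how
# a %(lang)s placeholder expands under Python's %-style formatting.
DEFAULT_HTML_BASE = 'http://%(lang)s.wikipedia.org/wiki/'


def expand_base(template, lang='en'):
    """Return template with %(lang)s replaced by the language code."""
    return template % {'lang': lang}


print(expand_base(DEFAULT_HTML_BASE))        # http://en.wikipedia.org/wiki/
print(expand_base(DEFAULT_HTML_BASE, 'de'))  # http://de.wikipedia.org/wiki/
```

A URL passed as ``html_base`` can therefore stay language-neutral and
pick up whatever value is given for ``lang``.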


class ``TextWikiMarkup``
------------------------
Translates material in Wiki markup language to plain text. Usage::

   from WikiTrans.wiki2text import TextWikiMarkup

   markup = TextWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

Specific constructor arguments:

width = *N*
  Limit output width to *N* columns. Default is 78.
show_urls = *bool*
  Whether to show the URLs that links refer to. If *bool* is
  ``True`` (the default), the URL is displayed in parentheses next
  to the link text. If ``False``, only the link text is displayed.
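
The effect of these two arguments can be pictured with the standard
``textwrap`` module. This is only an illustration of the behaviour
described above, not the code used by ``TextWikiMarkup``; the function
names ``render_link`` and ``wrap`` are made up for the example.

```python
import textwrap


def render_link(text, url, show_urls=True):
    """Format a link as described for show_urls (illustrative only)."""
    return '%s (%s)' % (text, url) if show_urls else text


def wrap(paragraph, width=78):
    """Wrap a paragraph so that no output line exceeds width columns."""
    return textwrap.fill(paragraph, width=width)


link = render_link('door', 'http://en.wiktionary.org/wiki/door')
print(wrap('See %s for details.' % link, width=40))
```

With ``show_urls=False`` the same call to ``render_link`` returns just
``door``.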

class ``TextWiktionaryMarkup``
------------------------------
Translates material from Wiktionary to plain text. This class is
meant to provide a Wiktionary-specific form of
``TextWikiMarkup``. Currently, it differs from
``TextWikiMarkup`` only in that the default value for ``html_base``
is ``http://%(lang)s.wiktionary.org/wiki/``.


class ``TexiWikiMarkup``
------------------------
Translates Wiki markup to Texinfo source. Usage::

   from WikiTrans.wiki2texi import TexiWikiMarkup

   markup = TexiWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

Two markup-specific keywords control the sectioning model used.

sectioning_model = *model*
  Selects the Texinfo sectioning model for the output
  document. Possible values are:

  ``numbered``
     Top of document is marked with ``@top``. Headings (``=``, ``==``,
     ``===``, etc.) produce ``@chapter``,
     ``@section``, ``@subsection``, etc.
  ``unnumbered``
     Unnumbered sectioning: ``@top``, ``@unnumbered``, ``@unnumberedsec``,
     ``@unnumberedsubsec``.
  ``appendix``
     Sectioning suitable for appendix entries: ``@top``, ``@appendix``,
     ``@appendixsec``, ``@appendixsubsec``, etc.
  ``heading``
     Use heading directives to reflect sectioning: ``@majorheading``,
     ``@chapheading``, ``@heading``, ``@subheading``, etc.

sectioning_start = *n*
  Shift the resulting heading level by *n* positions. For example, with
  ``sectioning_model=numbered``, ``== A ==`` produces ``@section
  A`` on output. If ``sectioning_start=1`` is also given, the same
  heading produces ``@subsection A`` instead.
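
The interplay of the two keywords can be sketched as a table lookup.
The command lists below are copied from the descriptions above; the
function ``texinfo_heading`` and the clamping of deep levels are
assumptions made for the example and may differ from what
``TexiWikiMarkup`` actually does. The ``heading`` model is left out
because the text does not say how its commands line up with heading
levels.

```python
# Index 0 is the document top; index 1 corresponds to a "=" heading,
# index 2 to "==", and so on (as in the "numbered" description above).
SECTIONING = {
    'numbered': ['@top', '@chapter', '@section', '@subsection'],
    'unnumbered': ['@top', '@unnumbered', '@unnumberedsec',
                   '@unnumberedsubsec'],
    'appendix': ['@top', '@appendix', '@appendixsec', '@appendixsubsec'],
}


def texinfo_heading(title, level, model='numbered', sectioning_start=0):
    """Return the Texinfo line for a wiki heading with `level` "=" signs."""
    commands = SECTIONING[model]
    # Assumption: levels past the end of the table reuse the deepest command.
    index = min(level + sectioning_start, len(commands) - 1)
    return '%s %s' % (commands[index], title)


print(texinfo_heading('A', 2))                      # @section A
print(texinfo_heading('A', 2, sectioning_start=1))  # @subsection A
```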

class ``HtmlWikiMarkup``
------------------------
Translates Wiki markup to HTML. Usage::

   from WikiTrans.wiki2html import HtmlWikiMarkup

   markup = HtmlWikiMarkup(filename='input.txt')
   markup.parse()
   print(str(markup))

The supported keywords are the same as for the ``WikiMarkup`` class.

class ``HtmlWiktionaryMarkup``
------------------------------
Translates material from Wiktionary to HTML. This class is
meant to provide a Wiktionary-specific form of
``HtmlWikiMarkup``. Currently both classes are equivalent, except that
the default value for ``html_base`` in ``HtmlWiktionaryMarkup``
is ``http://%(lang)s.wiktionary.org/wiki/``.

The ``wikitrans`` utility
=========================
This command line utility converts the supplied text to a selected
output format. The usage syntax is::

  wikitrans [OPTIONS] ARG

If ARG looks like a URL, the wiki text to be converted will be
downloaded from that URL.

Otherwise, if the ``--base-url=URL`` option is given, ARG is treated as
the name of the page to get from the MediaWiki installation at ``URL``.

Otherwise, ARG is treated as the name of the file to read wiki
material from.
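
The three rules above amount to a simple decision chain, sketched
below. The function ``classify_arg`` is hypothetical, and the test for
"looks like a URL" (an ``http`` or ``https`` scheme with a host name)
is an assumption; the utility may use a different check.

```python
from urllib.parse import urlparse


def classify_arg(arg, base_url=None):
    """Return 'url', 'page' or 'file' according to the rules above."""
    parsed = urlparse(arg)
    # Assumption: a URL is anything with an http(s) scheme and a host.
    if parsed.scheme in ('http', 'https') and parsed.netloc:
        return 'url'
    if base_url is not None:
        return 'page'
    return 'file'


print(classify_arg('https://en.wiktionary.org/wiki/Special:Export/door'))
print(classify_arg('door', base_url='http://en.wiktionary.org'))
print(classify_arg('text.wiki'))
```

The three calls print ``url``, ``page`` and ``file``, respectively.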

Examples::

  wikitrans text.wiki

  wikitrans --base-url http://en.wiktionary.org door

  wikitrans https://en.wiktionary.org/wiki/Special:Export/door

Options are:

``--version``
  Show program's version number and exit.
``-h``, ``--help``
  Show a short usage summary and exit.
``-v``, ``--verbose``
  Verbose operation.
``-I ITYPE``, ``--input-type=ITYPE``
  Set input document type. *ITYPE* is one of: ``default`` or ``wiktionary``.
``-t OTYPE``, ``--to=OTYPE``, ``--type=OTYPE``
  Set output document type (``html`` (the default), ``texi``,
  ``text``, or ``dump``).
``-l LANG``, ``--lang=LANG``
  Set the input document language.
``-o KW=VAL``, ``--option=KW=VAL``
  Pass the keyword argument ``KW=VAL`` to the parser class constructor.
``-d DEBUG``, ``--debug=DEBUG``
  Set the debug level (0..100).
``-D``, ``--dump``
  Dump the parse tree and exit; same as ``--type=dump``.
``-b URL``, ``--base-url=URL``
  Set the base URL.
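
As an illustration of the ``--option`` syntax, the sketch below turns a
list of ``KW=VAL`` strings into a keyword dictionary. The function
``parse_keyword_options`` is made up for the example. Values are kept
as strings here, because whether and how the utility converts them
(for instance, ``width`` to an integer) is not stated.

```python
def parse_keyword_options(pairs):
    """Turn ['KW=VAL', ...] into {'KW': 'VAL', ...} (values stay strings)."""
    kwargs = {}
    for pair in pairs:
        # Split at the first "=" only, so values may contain "=" themselves.
        key, sep, value = pair.partition('=')
        if not sep or not key:
            raise ValueError('expected KW=VAL, got %r' % pair)
        kwargs[key] = value
    return kwargs


print(parse_keyword_options(['width=60', 'show_urls=False']))
```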

Note: when using ``--base-url`` or passing a URL as the argument (the 2nd and 3rd
use cases above), if the URL is in the ``wikipedia.org`` or ``wiktionary.org``
domain, the options ``--input-type`` and ``--lang`` are set automatically.
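
One way to picture this automatic detection is shown below. It is a
sketch under two assumptions that the text above does not confirm: the
language code is the leftmost label of the host name, and
``wikipedia.org`` maps to the ``default`` input type. The function
``guess_settings`` is hypothetical.

```python
from urllib.parse import urlparse


def guess_settings(url):
    """Guess --lang and --input-type from a URL, or return None."""
    host = (urlparse(url).hostname or '').lower()
    labels = host.split('.')
    if len(labels) < 3 or labels[-1] != 'org':
        return None
    site = labels[-2]
    if site not in ('wikipedia', 'wiktionary'):
        return None
    # Assumptions: leftmost label is the language; wikipedia -> 'default'.
    input_type = 'wiktionary' if site == 'wiktionary' else 'default'
    return {'lang': labels[0], 'input_type': input_type}


print(guess_settings('http://en.wiktionary.org'))
print(guess_settings('https://de.wikipedia.org/wiki/Special:Export/Berlin'))
```

For any other host the function returns ``None``, which corresponds to
leaving both options at the values given on the command line.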
