diff --git a/webfont.txt b/webfont.txt
new file mode 100644
index 0000000..591e980
--- a/dev/null
+++ b/webfont.txt
@@ -0,0 +1,604 @@
1 WEBSTER FONTS
2 =============
3
4 Fonts for the Webster 1913 Dictionary.
5 For version 0.50
6 Last edit May 5, 2001
7 ______________________________________
8 (This file contains some extended ASCII characters, and should be
9transmitted in binary mode)
10----------------------------------------------------------------------
11
12 This file describes a modified font for use in visualizing the
13text of the 1913 "Webster's Revised Unabridged Dictionary" (W1913),
14usable for the DOS operating system of IBM-compatible personal computers.
15The electronic version of that dictionary and this font were prepared by
16MICRA, Inc., Plainfield NJ, and are copyrighted (C) 1996 by MICRA, Inc.
17For details of permissions and restrictions on using these files, see
18the accompanying file "readme.web".
19 The special characters used in the electronic version of the Webster
201913 are required for visualizing unusual characters used in the
21etymology and pronunciation fields of the dictionary, in a form
22comparable to the way they appear in the original. Since there are
23more than 256 characters used in that dictionary, not all can be
24represented by single-byte codes; the remainder are represented by
25SGML-style "short-form" symbols of the type <x/, where x is a specific
26code for the symbol in the dictionary. (We use this form rather than
27the "entity" format "&xx;" because the ampersand is used frequently,
28and we prefer to leave the "<" as the only "escape" character.)
29See the "Short Form" section below for details about such characters.
30Note that the symbols used here are in some cases abbreviations
31(for compactness) of the ISO 8879 recommended symbols. If necessary,
32the table below allows simple replacement by alternate encodings.
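
As an illustration of the short-form convention described above, the following Python sketch counts the short-form symbols in a piece of text. The regular expression is an assumption (a name made of letters and digits, or a lone "?", between "<" and "/"), and the sample string is made up; paired tags such as <it>...</it> are deliberately not matched.

```python
import re
from collections import Counter

# Assumption: a short-form symbol is "<", then a name of letters/digits
# (or a single "?"), then "/". Paired tags such as <it>...</it> end in ">"
# and therefore do not match.
SHORT_FORM = re.compile(r"<([A-Za-z][A-Za-z0-9]*|\?)/")

def count_short_forms(text):
    """Return a Counter of the short-form symbol names found in text."""
    return Counter(m.group(1) for m in SHORT_FORM.finditer(text))

# Made-up sample line, not taken from the dictionary files.
sample = "The <ae/ligature, an unknown <?/ mark, and <it>italic</it> <amac/."
print(count_short_forms(sample))
```

A count like this may be a useful first step before building a replacement table, since it shows which symbols actually occur in a given file.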
33 This symbol font can be loaded in IBM-compatible (x86) computers
34running the DOS operating system by using the "font.bat" command file
35in the "utils" directory. The font files for 8x14 and 8x16 fonts are
36"web14.fnt" and "web16.fnt" respectively.
37 For those loading the Webster onto some machine other than an
38IBM-compatible running DOS, it will be necessary to provide a
39translation table, to convert these characters into a code that
40can be handled by that computer. For this reason, I attach an
41"explanation" for each character, for those who cannot view
42the original DOS font.
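
A translation table of the kind mentioned above might be sketched in Python as follows. The byte pairs are copied from a few rows of Table 2 (webfont hex code to Latin-1 hex code); the choice of Latin-1 as the target and the function names are assumptions of this sketch, and the table is deliberately incomplete.

```python
# A few webfont -> Latin-1 byte pairs copied from Table 2 (not the full table).
WEBFONT_TO_LATIN1 = {
    0x80: 0xC7,  # C cedilla
    0x81: 0xFC,  # u umlaut
    0x82: 0xE9,  # e acute
    0x84: 0xE4,  # a umlaut
    0x87: 0xE7,  # c cedilla
    0x91: 0xE6,  # ae ligature
    0x94: 0xF6,  # o umlaut
    0xA4: 0xF1,  # n tilde
    0xAB: 0xBD,  # one-half
    0xF8: 0xB0,  # degree sign
}

def build_table(pairs):
    """Build a 256-byte table for bytes.translate(); unlisted bytes map to themselves."""
    table = bytearray(range(256))
    for src, dst in pairs.items():
        table[src] = dst
    return bytes(table)

TABLE = build_table(WEBFONT_TO_LATIN1)

def webfont_to_latin1(data: bytes) -> bytes:
    """Translate webfont-coded bytes to Latin-1 using the partial table above."""
    return data.translate(TABLE)

print(webfont_to_latin1(b"caf\x82").decode("latin-1"))
```

High-order bytes that are not in the dictionary pass through unchanged here, which is wrong for characters that have no Latin-1 equivalent; a complete converter would need an explicit rule for every row of Table 2.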
43 The DOS-loadable font does not contain all of the characters needed
44to depict the etymologies or the pronunciations. In addition to an
45absence of several characters used in the pronunciations, no Greek letters are
46included. The Greek words appearing in the etymologies,
47when they are included, will be typed in a
48roman-letter transcription (See section on Greek transcription, below).
49Only a very few Greek words have been thus transcribed as of the
50present version (version 0.41).
51 Wherever the typists did not know the character to use, they
52usually inserted a reverse-video question mark (decimal 176).
53This appears in full-ASCII versions as <?/. This mark was used both for
54characters in non-ASCII fonts, and for unreadable characters (i.e.,
55characters smeared in the original or distorted in the copies available
56to the typists). The type in the original was in many places smeared and
57illegible at the left and right page margins; occasionally, small
58parts of words were blotted out by plain white space.
59 A character table for the high-order characters appears below.
60Under that is a list and description of most of the special characters
61used in the Webster files.
62 Note that there are yet some characters used in the etymologies,
63and some other symbols, which are not in this list. For example, the
64vowels with a double dot *underneath*, e.g. a (as in all) have no representation
65in this character set, and, where explicitly entered in the dictionary,
66are represented by <xdd/ where "x" is the letter, as in "<add/".
67
68ITALICS
69-------
70 In most places, italic font is represented by the tags <it>...</it>
71surrounding the italic text, or by some other tag which also implies
72italic font. In the pronunciations, however, where italicized vowels
73are used among non-italic and other special characters to indicate
74pronunciation, the special codes <ait/, <eit/, <iit/, <oit/, <uit/,
75are also used to indicate the italicized vowel.
76
77DIACRITICS
78-------------
79 The European grave and acute accents are represented by the
80standard (IBM PC) high-order codes. Other characters with diacritics
81are represented by special "entity" codes, and in some cases also
82are found in this special WEB1913 font, described below.
83 Vowels with a circle above (as in Swedish) are coded <xring/
84(x with a ring, or "degrees" mark over it); vowels with tilde over them
85are represented by <xtil/, where "x" is the vowel, as in <etil/ (<atil/
86also has code 238); letters with a dot above are represented by <xdot/
87-- letters with a dot below are represented by <xsdot/ ("subdot");
88vowels with the semi-long mark (a macron with a short perpendicular
89vertical stroke attached above) are represented by <xsl/; the
90circumflex vowels have codes on this list, but may also be represented
91as <xcir/; vowels with macrons above are <xmac/ (including <oomac/,
92the "oo" with an unbroken macron above the two letters, <aemac/ = the
93ligature ae with a macron [also 214 = \'d6], and <oemac/ the ligature
94oe with a macron [also 215 = \'d7]); vowels with umlauts or a crescent
95(breve) above have codes in this list, but may also be represented by
96<xum/ and <xcr/ respectively. There is an occasional hacek or caron mark
97(an inverted circumflex) in the original; such letters are coded <xcar/.
98The o with a caron has code 213, but no others are in this font list.
99The diaeresis is treated typographically as identical to the umlaut.
100 A special modification, used only for poetry (see entry "saturnian verse"
101under "saturnian") is a vowel with a macron, in which the macron is lighter
102than the usual macron, signifying a stressed syllable which has a short
103vowel sound. This is represented by <xsmac/ ("short mac").
104 Another special character used in pronunciations is an "n" with an underline (like
105a macron, but below the letter), used to represent the "ng" sound. This is coded
106<nsm/ ("n sub-macron"). The ligated th used in pronunciations to depict the
107"th" sound of "the" is coded as <th/.
108 NOTE: the letter combinations "fi" and "fl" are invariably printed as the
109ligatures &filig; and &fllig;, but these ligatures are not marked as such
110in this transcription, and the two letters are left as individuals.
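
The regular <x.../ diacritic codes described in this section could be expanded mechanically. The Python sketch below maps the suffixes named above to Unicode combining marks and composes the result; using Unicode as the target is an assumption of this sketch, and codes it does not recognize (for example <oomac/, <aemac/, <xsl/ and <xsmac/) are left untouched.

```python
import re
import unicodedata

# Suffixes named in the text, mapped to Unicode combining marks.
# Unicode as the target encoding is an assumption of this sketch.
COMBINING = {
    "ring": "\u030A",  # ring above
    "til":  "\u0303",  # tilde
    "dot":  "\u0307",  # dot above
    "sdot": "\u0323",  # dot below ("subdot")
    "cir":  "\u0302",  # circumflex
    "mac":  "\u0304",  # macron
    "um":   "\u0308",  # umlaut / diaeresis
    "cr":   "\u0306",  # crescent (breve)
    "car":  "\u030C",  # caron (hacek)
}

# One base letter, one known suffix, then the closing "/".
PATTERN = re.compile(r"<([A-Za-z])(%s)/" % "|".join(COMBINING))

def expand_diacritics(text):
    """Replace codes such as <amac/ with the composed Unicode character."""
    def repl(match):
        base, suffix = match.group(1), match.group(2)
        return unicodedata.normalize("NFC", base + COMBINING[suffix])
    return PATTERN.sub(repl, text)

print(expand_diacritics("<amac/ <ocar/ <nsdot/ <oomac/"))
```

Because the pattern requires exactly one base letter, two-letter codes such as <oomac/ never match and would need their own entries.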
111
112SPECIAL SYMBOLS
113 The dagger <dag/, double dagger <ddag/, and paragraph mark <para/ are rarely used.
114 The double prime, or "seconds" of a degree is sometimes represented by
115a double "light accent" (code 183 = \'b7). In other places, and in later
116versions, it is represented by <sec/ = hex a9, in the webfont.
117 The symbols "greater than" <gt/ and "less than" <lt/ are encountered only
118once, but are distinguished from the right- and left-angle brackets
119(> and <) because of possible typographical differences in some fonts.
120 The schwa is symbolized by <schwa/. It is not used in the
121pronunciations, but is mentioned as a symbol.
122 The right-pointing arrow is <rarr/, consistent with ISO 8879.
123
124----------------------------------
125Table 1
126----------------------------------
127Numbers
128 Hex codes
1291  
13011   (12 is a hard page break, 13 CR, 14 sect break)
13121  
13231  !"# $%&'(
133121 yz{|} ~ 79-7d 7e-82
134131 83-87 88-8c
135141 8d-91 92-96
136151 97-9b 9c-a0
137161 a1-a5 a6-aa
138171 ab-af b0-b4
139181 b5-b9 ba-be
140191 bf-c3 c4-c8
141201 c9-cd ce-d2
142211 d3-d7 d8-dc
143221 dd-e1 e2-e6
144231 e7-eb ec-f0
145241 f1-f5 f6-fa
146251 fb-ff
147
148=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
149Below is a complete list of the symbols used in the Webster ("webfont")
150which are encoded in the special font listed above, together with
151corresponding symbols in ISO 8879 and TeX coding. Much of this table was
152prepared by Rik Faith, to whom we express our appreciation.
153 The "nearest ASCII" equivalents are given for those who want to
154display the data as best one can in 7-bit simple ASCII symbols without
155using the "entity" symbols.
156=========================================================================
157----------------------------------
158Table 2
159----------------------------------
160
161Comments:
162 (1) The symbol in the "entity" column is the SGML-like symbol used in
163 the present Webster files; the symbol in the "ISO 8879" column is
164 the symbol for the same character given in "The user's guide to
165 ISO 8879" by Smith and Stutely.
166 (2) An asterisk "*" in the "entity" column means that this symbol and
167code value is not used in any form in the Webster 1913 electronic version.
168 (3) If no asterisk is in the "entity" column, and no other symbol is
169there, this means that in the Webster, only the hexadecimal representation
170was used (e.g. for \'d8, \'bd, and \'b8).
171 (4) \'b6 and \'b7, the heavy and light "accents", are never above a
172letter (these are not diacritical marks), but in-between letters, as the
173stress accent used in the headwords and pronunciations. The accent
174*follows* the syllable accented. The light accent \'b7 is also used as
175the "prime" in mathematical expressions (e.g. a\'b7 = "a prime"), or as
176 "minutes" in degrees-minutes-seconds, and when doubled (\'b7\'b7)
177serves as "double prime" in mathematical expressions, and as "seconds"
178in degrees-minutes-seconds. The character \'a9 (<sec/ or &Prime;) is
179also used to represent the double prime.
180 (5) Although the semilong vowels are in the table (e.g. "asl"
181= "a semilong"), most of the entries in the ASCII version dictionary
182use the <xsl/ symbol coding. If you know of any printers' names for
183these, do let me know.
184 (6) For some reason, the a breve and u breve have ISO codes (in the
185Latin-2 table), but the other vowels don't, in the Smith & Stutely book.
186Is this a mistake?
187 (7) The symbol <nsc/ is used for "N small capitals", used in
188pronunciations to represent the sound of the nasal N in French words.
189 (8) A weak accent (when not in pronunciations) is symbolized by <prime/, the "minutes" (of a degree) symbol. A strong accent is symbolized by <bprime/ ("bold prime", not an ISO entity).
190 (9) If you find any exceptions to these usage assertions, please
191let me know.
192----------------------------------------------------------------------------------------
193 webfont ISO 8879 latin1/ascii TeX nearest description
194------------------ ASCII
195oct dec hex entity oct dec hex
196--------------------------------------------------------------------------------
197025 21 15 * \S * section symbol
198
199074 60 3c lt 074 60 3c $<$ < less than
200076 62 3e gt 076 62 3e $>$ > greater than
201
202200 128 80 <Cced/ Ccedil 307 199 c7 \c{C} C C cedilla
203201 129 81 <uum/ uuml 374 252 fc \"u ue u umlaut (diaeresis)
204202 130 82 <eacute/ eacute 351 233 e9 \'e e e acute
205203 131 83 <acir/ acirc 342 226 e2 \^a a a circumflex
206204 132 84 <aum/ auml 344 228 e4 \"a ae a umlaut (diaeresis)
207205 133 85 <agrave/ agrave 340 224 e0 \`a a a grave
208206 134 86 <aring/ aring 345 229 e5 \aa a a ring above
209207 135 87 <cced/ ccedil 347 231 e7 \c{c} c c cedilla
210210 136 88 <ecir/ ecirc 352 234 ea \^e e e circumflex
211211 137 89 <eum/ euml 353 235 eb \"e e e umlaut (diaeresis)
212212 138 8a <egrave/ egrave 350 232 e8 \`e e e grave
213213 139 8b <ium/ iuml 357 239 ef \"i i i umlaut (diaeresis)
214214 140 8c <icir/ icirc 356 238 ee \^i i i circumflex
215215 141 8d <igrave/ igrave 354 236 ec \`i i i grave
216216 142 8e <Aum/ Auml A A umlaut
217217 143 8f Aring A A ring above
218
219220 144 90 <Eacute/ Eacute 311 201 c9 \'E e E acute
220221 145 91 <ae/ aelig 346 230 e6 \ae ae ligature ae
221222 146 92 <AE/ AElig 306 198 c6 \AE AE ligature AE
222223 147 93 <ocir/ ocirc 364 244 f4 \^o o o circumflex
223224 148 94 <oum/ ouml 366 246 f6 \"o oe o umlaut (diaeresis)
224225 149 95 <ograve/ ograve 362 242 f2 \`o o o grave
225226 150 96 <ucir/ ucirc 373 251 fb \^u u u circumflex
226227 151 97 <ugrave/ ugrave 371 249 f9 \`u u u grave
227230 152 98 <yum/ yuml y y umlaut
228231 153 99 <Oum/ Ouml O O umlaut
229232 154 9a <Uum/ Uuml 334 220 dc \"U U U umlaut (diaeresis)
230233 155 9b
231234 156 9c <pound/ pound 243 163 a3 \pounds * pound sign (British)
232235 157 9d *
233236 158 9e *
234237 159 9f *
235240 160 a0 <aacute/ aacute 341 225 e1 \'a a a acute
236241 161 a1 <iacute/ iacute 355 237 ed \'i i i acute
237242 162 a2 <oacute/ oacute 363 243 f3 \'o o o acute
238243 163 a3 <uacute/ uacute 372 250 fa \'u u u acute
239244 164 a4 <ntil/ ntilde 361 241 f1 \~n ny n tilde
240245 165 a5 <Ntil/ Ntilde NY N tilde
241246 166 a6 <frac23/ $\frac{2}{3}$ 2/3 two-thirds
242247 167 a7 <frac13/ $\frac{1}{3}$ 1/3 one-third
243250 168 a8 *
244251 169 a9 <sec/ Prime seconds (of degree or time)
245 Also, inches or double prime
246252 170 aa *
247253 171 ab <frac12/ 275 189 bd $\frac{1}{2}$ 1/2 one-half
248254 172 ac <frac14/ 274 188 bc $\frac{1}{4}$ 1/4 one-quarter
249255 173 ad *
250256 174 ae *
251257 175 af *
252260 176 b0 <?/ (?) Place-holder
253 for unknown or illegible character.
254261 177 b1 *
255262 178 b2 *
256263 179 b3 *
257264 180 b4 * $\updownarrow$ * vertical arrow
258265 181 b5 <hand/ * pointing hand
259 (printer's "fist")
260266 182 b6 <bprime/ \"{} '' bold accent
261 (used in pronunciations)
262267 183 b7 <prime/ prime 264 180 b4 \'{} ' light accent
263 (used in pronunciations)
264 also minutes (of arc or time)
265270 184 b8 <rdquo/ rdquo '' " close double quote
266271 185 b9 *
267272 186 ba * $\parallel$ || vertical double bar (l)
268273 187 bb *
269274 188 bc <sect/ sect \S * section mark
270275 189 bd <ldquo/ ldquo `` " open double quotes
271276 190 be <amac/ amacr \=a a a macron
272277 191 bf <lsquo/ lsquo ` ` left single quote
273
274300 192 c0 <nsm/ ng "n sub-macron"
275301 193 c1 <sharp/ sharp $\sharp$ # musical sharp
276302 194 c2 <flat/ flat $\flat$ * musical flat
277303 195 c3 * -- -- long dash (en-dash? )
278304 196 c4 * $-$ - horizontal line
279305 197 c5 <th/ (part 1) first part of th ligature
280 see 231 = e7 for part 2
281306 198 c6 <imac/ imacr \=i i i macron
282307 199 c7 <emac/ emacr \=e e e macron
283310 200 c8 <dsdot/ d Sanskrit/Tamil d dot
284311 201 c9 <nsdot/ n Sanskrit/Tamil n dot
285312 202 ca <tsdot/ t Sanskrit/Tamil t dot
286313 203 cb <ecr/ \u{e} e e breve
287314 204 cc <icr/ \u{i} i i breve
288315 205 cd *
289316 206 ce <ocr/ \u{o} o o breve
290317 207 cf - -- - short dash
291
292320 208 d0 -- mdash --- -- long (em) dash
293321 209 d1 <OE/ OElig \OE OE OE ligature
294322 210 d2 <oe/ oelig \oe oe oe ligature
295323 211 d3 <omac/ omacr \=o o o macron
296324 212 d4 <umac/ umacr \=u u u macron
297325 213 d5 <ocar/ \v{o} o o hacek
298326 214 d6 <aemac/ \=\ae ae ae ligature macron
299327 215 d7 <oemac/ \=\oe oe oe ligature macron
300330 216 d8 par $\parallel$ || double vertical
301 bar(s)
302331 217 d9 *
303332 218 da *
304333 219 db *
305334 220 dc <ucr/ ubreve \u{u} u u breve
306335 221 dd <acr/ abreve \u{a} a a breve
307336 222 de <cre/ ssmile \u{} ~ crescent
308 (like a breve, but vertically centered --
309 represents the short accent in poetic meter)
310337 223 df <ymac/ \=y y y macron
311
312340 224 e0 <asl/ a a "semilong"
313 (has a macron above with a short vertical
314 bar on top the center of the macron)
315 Used in pronunciations.
316341 225 e1 <esl/ e "semilong"
317342 226 e2 <isl/ i "semilong"
318343 227 e3 <osl/ o "semilong"
319344 228 e4 <usl/ u "semilong"
320345 229 e5 <adot/ a a with dot above
321346 230 e6 * mu small Greek mu
322347 231 e7 <th/ (part 2) second part of th ligature
323 see 197 = c5 for part 1
324350 232 e8 *
325351 233 e9 *
326352 234 ea *
327353 235 eb <edh/ edh 360 240 f0 th small eth
328354 236 ec *
329355 237 ed <thorn/ thorn 376 254 fe th small thorn
330356 238 ee <atil/ atilde \~a a a tilde
331357 239 ef <ndot/ n n with dot above
332
333360 240 f0 <rsdot/ \d{r} r r with a dot below
334361 241 f1 *
335362 242 f2 *
336363 243 f3 *
337364 244 f4 <yogh/ y small yogh
338365 245 f5 <mdash/ mdash --- -- em dash
339366 246 f6 <divide/ divide 367 247 f7 $\div$ / division sign
340367 247 f7 ap $\approx$ ~= "double tilde"
341370 248 f8 <deg/ deg 260 176 b0 ${}^\circ$ * degree sign
342371 249 f9 <middot/ $\bullet$ * bold middle dot
343372 250 fa * 267 183 b7 $\cdot$ * light middle dot
344373 251 fb <root/ radic $\surd$ * root sign
345374 252 fc *
346375 253 fd *
347376 254 fe *
348377 255 ff *
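
For readers who want the 7-bit "nearest ASCII" rendering, the corresponding column of Table 2 can be applied with a simple substitution. The Python sketch below copies a handful of rows from the table; the function name and the sample text are made up, and symbols without an entry are left as they are.

```python
import re

# Entity -> "nearest ASCII" pairs copied from a few rows of Table 2.
NEAREST_ASCII = {
    "uum": "ue", "aum": "ae", "oum": "oe",
    "eacute": "e", "ntil": "ny",
    "ae": "ae", "AE": "AE", "oe": "oe", "OE": "OE",
    "frac12": "1/2", "frac14": "1/4",
    "amac": "a", "mdash": "--",
}

ENTITY = re.compile(r"<([A-Za-z0-9]+)/")

def to_nearest_ascii(text):
    """Replace known short-form symbols; leave unknown ones untouched."""
    return ENTITY.sub(lambda m: NEAREST_ASCII.get(m.group(1), m.group(0)), text)

# Made-up sample text.
print(to_nearest_ascii("man<oe/uvre <mdash/ Z<uum/rich <yogh/"))
```

Leaving unknown symbols in place, rather than deleting them, keeps the output reversible for any symbol the table does not yet cover.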
349
350 ----------------------------------
351Table 3
352----------------------------------
353
354====================================================================
355The table below gives some additional information about some of the
356more commonly used entities
357-------------------------------------------------------------------
358Frequently used:
359decimal hex char definition
360 21 section symbol -- another section mark also at 188
361 (so that 21 can be used as a normal control
362 character)
363 126 ~ used by typists as a place-holder in word
364 combinations where an uncapitalized headword
365 should be.
366 128 80 <Cced/ c cedilla (uppercase)
367 129 81 <uum/ u umlaut
368 130 82 e acute
369 131 83 a circumflex
370 132 84 <aum/ a umlaut
371 133 85 a grave
372 134 86 <aring/ a with "ring" (circle) above (Swedish!)
373 135 87 <cced/ c cedilla
374 136 - 144 standard European set for IBM
375 136 88 <ecir/ e circumflex
376 137 89 <eum/ e umlaut (or e with dieresis above)
377 138 8a e grave
378 145 91 <ae/ = "ae" fused ligature
379 146 92 <AE/ = upper-case "ae" fused ligature
380 147 93 <ocir/ o circumflex
381 148 94 <oum/ o "umlaut", used mostly in "co<oum/peration,
382 Zo<oum/l." and in pronunciations
383 164 a4 <ntil/ Spanish "enye"
384 166 a6 <frac23/ two-thirds (fraction)
385 167 a7 <frac13/ one-third (fraction)
386 169 a9 <sec/ seconds of degree or time, or double-prime
387 171 ab <frac12/ one-half, as in the original IBM set
388 172 ac <frac14/ one-fourth (fraction)
389 176 b0 <?/ = (reverse-video question mark), used
390 to represent an uncodable or illegible character
391 180 b4 long vertical double-headed arrow (a reference mark)
392 181 b5 <hand/ = (the typographer's "fist")
393 Appearing as a "pointing hand" character
394 (for explanatory notes)
395 182 b6 bold accent in headwords
396 replaced in full ASCII version by double quote = "
397 183 b7 light accent in headwords
398 replaced within headwords in the full ASCII version
399 by an open-single-quote (` = ASCII 96, not the same
400 as 191, \'bf). This mark is used also
401 for minutes of a degree, and for "prime"
402 to modify variables in mathematical expressions.
403 -- two of these in sequence represent seconds
404 of a degree, or double prime. The seconds
405 symbol is also represented by <sec/ (hex a9).
406 184 b8 close double quotes (used with 189 [= \'bd], open quote)
407 186 ba vertical double bar - represents the symbol used
408 in the printed dictionary before a headword to
409 signify that the word was adopted without
410 anglicization from a foreign language
411 but in the full-ASCII version this function
412 uses \'d8 -- see 216
413 188 bc <sect/ section mark
414 - alternate to 21 (a control character)
415 189 bd open double quotes (used with 184, close quote)
416 190 be <amac/ a macron
417 191 bf <lsquo/ "left single quote"
418 single open quote mark (not same as ASCII 96)
419 192 c0 <nsm/ "n sub-macron", an n with a macron below --
420 represents the "ng" sound in pronunciations
421 193 c1 <sharp/ sharp - music notation
422 194 c2 <flat/ flat - music notation
423 195 c3 long dash, one pixel removed from left
424 will fuse with left long dash, char 208
425 196 c4 graphic horizontal line
426 195+208 combination for a very long dash. In the
427 original typing, the dash char 208 was used
428 for both non-breaking hyphen (in hyphenated
429 words), and for the em-dash used as an
430 introductory mark for various segments.
431 The em-dash should be distinguished from
432 the hyphen, but that conversion hasn't yet
433 been done.
434 In the full ASCII version, a double hyphen
435 "--" represents the em-dash.
436 197 c5 <th/ (part 1) first of a pair of characters
437 197+231 = used to represent the th ligature --
438 <th/ represents the "th" sound of "mother"
439 see 231 (e7) for part 2
440 198 c6 <imac/ = i macron
441 199 c7 <emac/ = e macron
442 200 c8 <dsdot/ Sanskrit/Tamil d with dot underneath
443 201 c9 <nsdot/ Sanskrit/Tamil n with dot underneath
444 202 ca <tsdot/ Sanskrit/Tamil t with dot underneath
445 203 cb <ecr/ = e with crescent (breve) above. Used
446 - in some etymologies and pronunciation
447 204 cc <icr/ = i with crescent (breve) above - used
448 - in some etymologies and pronunciation
449 206 ce <ocr/ = o with crescent (breve) above - used
450 - in some etymologies and pronunciation
451 207 cf short dash, used in hyphenated words, and in
452 breaking syllables where no accent is used. But
453 sometimes the typists used the normal hyphen [45],
454 or the long dash (decimal 208) for that purpose.
455 The normal hyphen is the same length as the long
456 dash, but one pixel higher in the character box.
457 # In headwords, in the full ASCII version, this
458 short dash is represented by the asterisk "*".
459 208 d0 <mdash/ = represents the long dash, used for the em
460 dash which often precedes certain sections within a
461 definition, and which separates some sections,
462 such as wordforms or collocations within a
463 collocation segment. This is replaced in the
464 full ASCII version by a double hyphen, "--".
465 210 d2 <oe/ = "oe" fused ligature
466 211 d3 <omac/ = o macron
467 212 d4 <umac/ = u macron
468 213 d5 <ocar/ o with caron (hacek) (inverted circumflex) above
469 214 d6 <aemac/ = "ae" ligature with a macron
470 215 d7 <oemac/ = "oe" ligature with a macron
471 216 d8 <par/ double vertical bar (short length; the long
472 length is the graphics character 186)
473 This precedes words marked with a double vertical bar in
474 the original dictionary, signifying that the word was
475 adopted directly into English without modification of
476 the spelling.
477 220 dc <ucr/ = u with crescent above - used in some etymologies
478 221 dd <acr/ = a with crescent above - used in some etymologies
479 222 de <cre/ = "crescent", an upward-curving crescent
480 used as a poetic meter mark
481 223 df <ymac/ = y macron (used in Anglo-Saxon?)
482 229 e5 <adot/ = a with a dot above (for pronunciations)
483 231 e7 <th/ (part 2) second of a two-character combination
484 197+231 = used to represent the th ligature in pronunciations
485 <th/ represents the "th" sound of "mother"
486 235 eb <edh/ = Old English and Icelandic "edh", (or "eth")
487 like a Greek delta with a hatch mark
488 through the ascender. Used to represent the
489 Anglo-Saxon/Icelandic/Gothic character,
490 in etymologies, pronounced like "th"
491 237 ed <thorn/ "thorn", an Old English and Icelandic
492 character, appears like a "p" with an extended
493 ascender.
494 Used to represent the
495 Anglo-Saxon/Icelandic/Gothic character,
496 in etymologies, pronounced like "th"
497 in "thorn" and also as in "brother"
498 238 ee <atil/ a with tilde above - in some etymologies
499 244 f4 <yogh/ like a script "3" or "z". Used in Old English
500 etymologies, analogous to "y"
501 247 f7 double tilde ("approximately equals").
502 used by typists as a place-holder in word
503 combinations where the capitalized headword
504 should be.
505 248 f8 <deg/ degrees (temperature or angle). Note: some
506 typists used a superscript "o" to signify
507 degrees. This must be corrected!
508 249 f9 middle dot (bold)
509 250 fa middle dot (light)
510 251 fb <root/ "root" sign used in etymologies, as in original
511 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
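
The full-ASCII replacements listed in the table above can be collected into one small routine. This Python sketch applies only the replacements stated there (codes 176, 182, 183, 207, 208 and the 197+231 pair); the function name and the sample bytes are made up, and the accent and short-dash replacements are described in this file only for headwords.

```python
# (old bytes, new bytes) pairs taken from the entries for codes
# 197+231, 176, 182, 183, 207 and 208 in the table above.
HEADWORD_REPLACEMENTS = [
    (b"\xc5\xe7", b"<th/"),  # 197+231: two-character th ligature
    (b"\xb0", b"<?/"),       # 176: unknown or illegible character
    (b"\xb6", b'"'),         # 182: bold accent
    (b"\xb7", b"`"),         # 183: light accent
    (b"\xcf", b"*"),         # 207: short dash between syllables
    (b"\xd0", b"--"),        # 208: long (em) dash
]

def headword_to_ascii(raw: bytes) -> bytes:
    """Apply the full-ASCII replacements to a webfont-coded headword."""
    for old, new in HEADWORD_REPLACEMENTS:
        raw = raw.replace(old, new)
    return raw

# Made-up sample bytes in the style of a headword.
print(headword_to_ascii(b"Dic\xb6tion\xcfa\xcfry").decode("ascii"))
```

The two-byte th pair is handled first so that neither of its halves can be mistaken for a single-byte code later in the list.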
512
513======================================
514 Greek transcription
515=====================================
516Greek letters are represented:
517 (capitals represent capital letters; lower-case represent lower-case)
518 #Note that "h" in transliterations is used individually, as eta, and
519 also in the combination "ch" (chi). Conversions to other codings
520 must first convert "ch" before converting "h", or at least verify
521 that an "h" to be converted has no preceding "c". "c" is not
522 otherwise used, so there is no ambiguity. Also, "ps" always
523 represents a psi; it could in theory occur as a pi-sigma
524 combination, but it doesn't. Occasionally, "th" was entered instead
525 of "q" to represent theta; these should be checked to verify that
526 they do not represent tau-eta, and converted to "q".
527
528(1) characters individually:
529 By the short-form notation <alpha/, <beta/, <gamma/, <lambda/ etc.
530 Capitalized letters are <ALPHA/, etc.
531(2) in words:
532 By inclusion within the markers <grk></grk>, using the following
533 roman-letter equivalents for the Greek letters:
534 Accents:
535 (a) aspirants -- used in front of the letter modified, which is
536usually in *front* of words beginning in vowels. Of two types:
537 ' (apostrophe) for the left-curving aspirant (spiritus lenis)
538 " (double quote) for the right-curving aspirant (spiritus asper)
539 (when the aspirant is on a letter inside a word, it is placed
540 in front of the letter it modifies.)
541 (the right-curving aspirant is also used over rho, which is
542 then usually transliterated "rh". The " in such cases is
543 placed in front of the r (for rho) which it modifies).
544 (b) normal accent (appearing as an acute accent in the original):
545 ` (left open quote, ASCII 96) -- placed after accented vowel
546 (c) grave accent (appearing as a grave accent in the original):
547 ~ (tilde, ASCII 126) -- placed after accented vowel. This is
548 rarely seen, as in <grk>to~ pa^n</grk> at "universe" or
549 <grk>ta~ gewrgika`</grk> (at "Georgic").
550 (d) curving accent (appearing as a rounded circumflex):
551 ^ (circumflex) -- placed after accented vowel
552 (e) "iota" subscript (ogonek) -- a comma placed after the vowel
553 having the subscript
554 (f) diaeresis:
555 the double dot found occasionally over the iota is
556 represented by a colon immediately after the iota,
557 as the i-diaeresis in <grk>Farisai:ko`s</grk> (at "pharisaic").
558
559 Where a letter has two accents, both are placed *after* the vowel.
560 Letters with an aspirant and an accent have the
561 aspirant before the letter, and the accent after it.
562 ------------------------
563
564
565The capitalized Greek letters are represented by the capitalized
566 versions of the letters shown here.
567-----------------------------------------
568 Greek letter transliteration
569 ------------ ---------------
570 alpha a
571 beta b
572 gamma g
573 delta d
574 epsilon e
575 zeta z
576 eta h
577 theta q (th was used in some earlier sections, but was
578 changed due to potential confusion with the
579 tau+eta combination, as in <grk>lyth`rios</grk>
580 (at "lyterian") or <grk>poihth`s</grk>
581 (at "maker") )
582 iota i
583 kappa k
584 lambda l
585 mu m
586 nu n
587 xi x
588 omicron o
589 pi p
590 rho r
591 sigma s (end form not distinguished here from middle
592 form within words, but when isolated, use <sigmat/
593 ("terminal sigma") for the end form)
594 tau t
595 upsilon y (Used for both "u" and "y" pronunciations)
596 phi f
597 chi ch (c is always followed by h, so the h component
598 is not confusable with eta)
599 psi ps (theoretically confusable with pi-sigma, but that
600 combination seems never to occur)
601 omega w
602
603 (Roman j, v, u are unused)
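
The transliteration table and accent rules above could be turned into a converter along the following lines. This Python sketch targets Unicode Greek, which is an assumption; it tries the digraphs "ch" and "ps" before single letters, attaches a breathing mark to the letter that follows it, and attaches accent marks to the vowel that precedes them. Restoring the final form of sigma at the end of a word is also an assumption of this sketch, since the transcription itself does not distinguish it. The input is assumed to be the text found between <grk> and </grk>.

```python
import re
import unicodedata

LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "x": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "y": "υ", "f": "φ", "w": "ω",
}
DIGRAPHS = {"ch": "χ", "ps": "ψ"}
BREATHINGS = {"'": "\u0313", '"': "\u0314"}        # placed before the letter
ACCENTS = {"`": "\u0301", "~": "\u0300", "^": "\u0342",
           ",": "\u0345", ":": "\u0308"}           # placed after the vowel
VOWELS = set("aehioyw")
IOTA_SUBSCRIPT_VOWELS = set("ahw")

def greek_to_unicode(text):
    """Convert the roman-letter transcription to Unicode Greek (a sketch)."""
    out, pending, prev, i = [], "", "", 0
    while i < len(text):
        ch, pair = text[i], text[i:i + 2].lower()
        if pair in DIGRAPHS or ch.lower() in LETTERS:
            # Digraphs are tried first, so "ch" is never read as c + eta.
            key = pair if pair in DIGRAPHS else ch.lower()
            greek = DIGRAPHS.get(key) or LETTERS[key]
            out.append((greek.upper() if ch.isupper() else greek) + pending)
            pending, prev = "", key
            i += len(key)
            continue
        if ch in BREATHINGS:
            pending += BREATHINGS[ch]      # wait for the letter it modifies
        elif ch in ACCENTS and prev in (
                IOTA_SUBSCRIPT_VOWELS if ch == "," else VOWELS):
            out.append(ACCENTS[ch])        # prev is kept: a second accent may follow
        else:
            out.append(ch)
            prev = ""
        i += 1
    result = unicodedata.normalize("NFC", "".join(out))
    # Assumption: a sigma at the end of a word takes the final form.
    return re.sub(r"σ\b", "ς", result)

print(greek_to_unicode("poihth`s"))
print(greek_to_unicode("Farisai:ko`s"))
```

A comma that follows alpha, eta or omega is always read as an iota subscript here, so ordinary punctuation directly after one of those vowels would be misread; a production converter would need a rule for that case.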
604
