From 4a458db06b28492a7e48b1a0560b35778e476482 Mon Sep 17 00:00:00 2001 From: Sergey Poznyakoff Date: Fri, 3 Feb 2012 00:08:07 +0200 Subject: Further work on ancillary files. * webfont.txt: Use Unicode, rewrite character table and Greek transliteration sections. * pronunc.txt: Update. * tagset.txt: Update. --- webfont.txt | 1046 ++++++++++++++++++++++++++++++----------------------------- 1 file changed, 529 insertions(+), 517 deletions(-) (limited to 'webfont.txt') diff --git a/webfont.txt b/webfont.txt index 591e980..d432fe5 100644 --- a/webfont.txt +++ b/webfont.txt @@ -1,88 +1,70 @@ WEBSTER FONTS ============= - Fonts for the Webster 1913 Dictionary. - For version 0.50 - Last edit May 5, 2001 - ______________________________________ - (This file contains some extended ASCII characters, and should be -transmitted in binary mode) ----------------------------------------------------------------------- - - This file describes a modified font for use in visualizing the -text of the 1913 "Webster's Revised Unabridged Dictionary" (W1913), -usable for the DOS operating system of IBM-compatible personal computers. -The electronic version of that dictionary and this font were prepared by -MICRA, Inc., Plainfield NJ, and are copyrighted (C) 1996 by MICRA, Inc. -For details of permissions and restrictions on using these files, see -the accompanying file "readme.web". - The special characters used in the electronic version of the Webster +* Overview + +This file describes special symbols and markup entities used in the +GNU Collaborative International Dictionary of English. + +* Introduction + +The special characters used in the electronic version of the Webster 1913 are required for visualizing unusual characters used in the etymology and pronunciation fields of the dictionary, in a form -comparable to the way they appear in the original. 
Since there are -more than 256 characters used in that dictionary, not all can be -represented by single-byte codes, and are instead represented by -SGML-style "short-form" symbols. (rather than the "entity" format -"&xx;" The ampersand is used frequently, and we prefer to leave -the "<" as the only "escape" character) of the type ... + +* Italics + +In most places, italic font is represented by the tags ... surrounding the italic text, or by some other tag which also implies -italic font. In the pronunciations, however, where italicized vowels +italic font. In the pronunciations, however, where italicized vowels are used among non-italic and other special characters to indicate -pronunciation, the special codes and <) because of possible typographical differences in some fonts. - The schwa is symbolized by +and <) because of possible typographical differences in some fonts. + +The schwa is symbolized by $ > greater than - -200 128 80 $ > > greater than + +200 128 80 , using the following - roman-letter equivalents for the Greek letters: - Accents: - (a) aspirants -- used in front of the letter modified, which is -usually in *front* of words beginning in vowels. Of two types: - ' (apostrophe) for the left-curving apirant (spiritus lenis) - " (double quote) for the right-curving aspirant (spiritus asper) - (when the aspirant is on a letter inside a word, it is placed - in front of the letter it modifies.) - (the left-curving aspirant is also used over rho, which is - then usually transliterated "rh". The " in such cases is - placed in front of the r (for rho) which it modifies). - (b) normal accent (appearing as an acute accent in the original): - ` (left open quote, ASCII ) -- placed after accented vowel - (b) grave accent (appearing as an grave accent in the original): - ~ (tilde, ASCII ) -- placed after accented vowel. This is - rarely seen, as in to~ pa^n at "universe" or - ta~ gewrgika` (at "Georgic"). 
- (c) curving accent (appearing as a rounded circumflex): - ^ (circumflex) -- placed after accented vowel - (d) "iota" subscript (ogonek)-- a comma placed after the vowel - having the subscript - (e) diaeresis: - the double dot found occasionally over the iota is - represented by a colon immediately after the iota, - as the i-diaeresis in Farisai:ko`s (at "pharisaic"). - - Where a letter has two accents, both are placed *after* the vowel - Letters with an aspirant and an accent have the - aspirant before the letter, and the accent after it. - ------------------------ - - -The capitalized Greek letters are represented by the capitalized - versions of the letters shown here. + 128 80 Ç , is a Greek +transliteration written in roman letters. The following rules are +used: + +** Aspirants + +Aspirants are represented by ' (apostrophe) and " (double quote) +placed in front of the letter modified. Apostrophe stands for +ψιλὸν πνεῦμα (ψιλή or spiritus lenis), and double quote stands for +δασὺ πνεῦμα (δασεία or spiritus asper). + + 'a -- ἀ + "a -- ἁ + +** Accents + +Accents are placed after the accented letter. The acute accent (ὀξεῖα) is +represented by ` (gravis). The grave accent (βαρεῖα) is represented +by ~ (tilde), and circumflex (περισπωμένη) is represented by +circumflex. Thus: + + a` -- ά + a~ -- ὰ + a^ -- ᾶ + +Some examples of the combined forms (aspirant + accent): + + 'a` -- ἄ + 'a~ -- ἂ + 'a^ -- ἆ + "a` -- ἄ + "a~ -- ἂ, + "a~ -- ἃ + + +** Iota subscriptum + +Iota subscript is represented by comma placed after the affected +vowel. If the vowel is accented, the comma is placed after the +accent mark. For example: + + a`, -- ᾴ + 'a`, -- ᾄ + +** Diaeresis + +Diaeresis is represented by a colon immediately after the affected +vowel. If the vowel is accented, the accent is placed after the +colon, e.g.: + + i: -- ϊ + i:^ -- ῗ + i:` -- ῒ + +** Letters + +The table below shows, for each Greek letter, the corresponding markup +entity and transliteration. 
The capitalized Greek letters are +represented by the capitalized versions of the letters shown here. + ----------------------------------------- Greek letter transliteration - ------------ --------------- - alpha a - beta b - gamma g - delta d - epsilon e - zeta z - eta h - theta q (th was used in some earier sections, but was - changed due to potential confusion with the - tau+eta combination, as in lyth`rios - (at "lyterian") or poihth`s - (at "maker") ) - iota i - kappa k - lambda l - mu m - nu n - xi x - omicron o - pi p - rho r - sigma s (end form not distinguished here from middle - form within words, but when isolated, use lyth`rios, at "lyterian") or ποιητής (poihth`s, +at "maker"). +[2] Final sigma is not distinguished here from middle sigma, but when +isolated, use 'archai:`zein ἀρχαῒζειν +zw^,on ζῷον +o'i^nos οἶνος +"ydra`rgyros ὑδράργυρος + +Local Variables: +mode: Outline +coding: utf-8 +End: + -- cgit v1.2.1
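
The transliteration rules added by this patch (aspirant in front of the
letter, accents after it, comma for iota subscript, colon for diaeresis)
can be sketched as a small converter. This is only an illustration, not
code from the GCIDE sources: the function name greek_from_translit, the
partial letter table (limited to letters that occur in the examples
above), the automatic choice of final sigma at the end of a word, and the
restriction of the iota-subscript comma to a, h and w are all assumptions
made for the sketch.

```python
import unicodedata

# Partial letter table (assumption): only letters evidenced by examples in
# the patch text. Letters not attested there (e.g. psi) are left out.
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "x": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "y": "υ", "f": "φ", "ch": "χ", "w": "ω",
}
# Aspirants are written in front of the letter they modify.
BREATHINGS = {
    "'": "\u0313",  # spiritus lenis
    '"': "\u0314",  # spiritus asper
}
# Accents, diaeresis and iota subscript are written after the letter.
MARKS = {
    "`": "\u0301",  # acute
    "~": "\u0300",  # grave
    "^": "\u0342",  # circumflex (perispomeni)
    ":": "\u0308",  # diaeresis
    ",": "\u0345",  # iota subscript
}
# Assumption: a comma counts as iota subscript only after these vowels;
# elsewhere it is treated as ordinary punctuation.
IOTA_SUBSCRIPT_VOWELS = {"a", "h", "w"}


def greek_from_translit(text):
    """Convert a roman transliteration to precomposed Unicode Greek."""
    out = []
    breathing_src = ""  # the ' or " as typed, kept in case no letter follows
    breathing = ""      # the corresponding combining character
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in BREATHINGS:
            out.append(breathing_src)  # flush an earlier, unused aspirant
            breathing_src, breathing = ch, BREATHINGS[ch]
            i += 1
            continue
        pair = text[i:i + 2].lower()
        key = pair if pair in LETTERS else ch.lower()
        if key not in LETTERS:
            # Not a transliterated letter: copy it through unchanged.
            out.append(breathing_src + ch)
            breathing_src = breathing = ""
            i += 1
            continue
        i += len(key)
        greek = LETTERS[key]
        # Assumption: an "s" not followed by a letter is a final sigma.
        if key == "s" and (i >= n or not text[i].isalpha()):
            greek = "ς"
        if ch.isupper():
            greek = greek.upper()
        marks = breathing
        breathing_src = breathing = ""
        while i < n and text[i] in MARKS:
            if text[i] == "," and key not in IOTA_SUBSCRIPT_VOWELS:
                break
            marks += MARKS[text[i]]
            i += 1
        out.append(greek + marks)
    out.append(breathing_src)
    # NFC folds letter + combining marks into precomposed characters.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    for word in ("zw^,on", "o'i^nos", '"ydra`rgyros'):
        print(word, greek_from_translit(word))
```

Building each letter from combining marks and normalizing to NFC keeps
the sketch free of a hand-written table of every accented form; the
aspirant is emitted before the accents so that the sequence composes to
the expected precomposed character.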