.. -*- coding: utf-8 -*-

=======================
 gadict HACKING guide.
=======================
.. contents::
   :local:

Dictionary source file format.
==============================

The source files use the dictd C5 file format. See::

  $ man 1 dictfmt

In short:

* Headwords are preceded by 5 or more underscore characters ``_`` and a blank
  line.
* All text until the next headword is considered the definition.
* Any leading ``@`` characters are stripped out, but the file is otherwise
  unchanged.

For convenience, the following additional conventions are used:

* Headwords are separated by ``;<SPACE>`` (and all are placed on a single
  line).
* UTF-8 encoding is used.
* Lines starting with ``#`` are stripped out (comment syntax).
* The first line with ``ABOUT:`` is used as the description of the dictionary.
* The first URL (line with ``http://``) is used as the dictionary home page.

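The conventions above can be illustrated with a minimal Python sketch. It is not the project's actual tooling: the function name ``parse_c5``, the returned tuple layout, and the exact treatment of blank lines are assumptions made for illustration, and the stripping of leading ``@`` characters is left out.

```python
import re

# Assumption: a separator is a line made only of 5 or more underscores.
SEPARATOR = re.compile(r"^_{5,}\s*$")


def parse_c5(text):
    """Return (about, url, entries) for a C5-style source text.

    entries is a list of (headwords, definition) tuples.
    """
    about = url = None
    chunks = []      # one list of raw lines per entry
    current = None   # stays None until the first separator is seen
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment convention: drop the whole line
        if SEPARATOR.match(line):
            current = []
            chunks.append(current)
            continue
        if about is None and "ABOUT:" in line:
            # First line with "ABOUT:" describes the dictionary.
            about = line.split("ABOUT:", 1)[1].strip()
        if url is None and "http://" in line:
            # First URL is taken as the dictionary home page.
            url = line[line.index("http://"):].split()[0]
        if current is not None:
            current.append(line)

    entries = []
    for chunk in chunks:
        # Skip the blank line(s) between the separator and the headwords.
        while chunk and not chunk[0].strip():
            chunk.pop(0)
        if not chunk:
            continue
        headwords = chunk[0].strip().split("; ")
        definition = "\n".join(chunk[1:]).strip()
        entries.append((headwords, definition))
    return about, url, entries
```

For example, an entry whose headword line reads ``cat; kitty`` yields the headword list ``["cat", "kitty"]``, and everything up to the next underscore line becomes its definition.
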
Comment syntax convention.
==========================

As the 'dictd -c5' format does not support comment syntax, we filter out all
lines that start with '#'.

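A filter of this kind could look like the following Python sketch. The helper name ``strip_comments`` is hypothetical, and the sketch assumes that only lines with ``#`` in the very first column count as comments.

```python
def strip_comments(lines):
    """Yield only the lines that do not start with '#'."""
    for line in lines:
        if not line.startswith("#"):
            yield line


# Usage: an indented '#' is kept, because the line does not start with it.
source = ["# editor note\n", "_____\n", "\n", "cat\n", "  # kept\n"]
filtered = list(strip_comments(source))
```

Because the helper is a generator, it can wrap an open file object and pass the remaining lines on to the formatting step without loading the whole dictionary into memory.
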
Dictionary file name convention.
================================

BNF form::

  FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
  PREFIX ::= "gadict"
  LANG ::= ISOCODE | ISOCODE "-" ISOCODE

where ``ISOCODE`` is an ISO 639-1 (2-letter) language code (currently ``en``,
``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.

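The grammar can be checked mechanically, for instance with a regular expression. The Python sketch below is only an illustration: the BNF does not define ``NAME``, so the pattern assumes it is a non-empty run of lowercase letters, digits or underscores, and the helper name ``parse_file_name`` and the sample file names are hypothetical.

```python
import re

# Assumption: NAME contains no "-" (the BNF leaves NAME undefined).
FILE_RE = re.compile(
    r"gadict-(?P<name>[a-z0-9_]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})?)\.dict-c5"
)


def parse_file_name(file_name):
    """Return (name, [isocodes]) or None if the name breaks the convention."""
    match = FILE_RE.fullmatch(file_name)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")


# Hypothetical file names, used only to exercise the pattern.
print(parse_file_name("gadict-demo-en-ru.dict-c5"))  # ('demo', ['en', 'ru'])
print(parse_file_name("gadict-demo-eng.dict-c5"))    # None
```

The pattern accepts any two-letter code rather than only ``en``, ``ru`` and ``uk``, since the text lists those as the codes currently in use rather than as a closed set.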