obsolete/HACKING.rst
changeset 337 33b08a8b7fb1
parent 301 1439e072640a
       
.. -*- coding: utf-8 -*-

=======================
 gadict HACKING guide.
=======================
.. contents::
   :local:
       
Dictionary source file format.
==============================

The source files use the dictd C5 file format. See::

  $ man 1 dictfmt

In short:

 * Each headword is preceded by 5 or more underscore characters ``_`` and a
   blank line.
 * All text until the next headword is considered the definition.
 * Any leading ``@`` characters are stripped out, but the file is otherwise
   unchanged.
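
The rules above can be sketched as a small parser. This is only an
illustration, not the code gadict actually uses: the name ``parse_c5`` is
hypothetical, the separator is assumed to sit alone on its line, and leading
``@`` characters are assumed to matter only on the headword line.

```python
import re

# Assumption: a separator line holds nothing but 5 or more underscores.
SEPARATOR = re.compile(r"^_{5,}\s*$")


def parse_c5(text):
    """Return a list of (headword_line, definition) pairs."""
    lines = text.splitlines()
    entries = []
    i = 0
    while i < len(lines):
        if not SEPARATOR.match(lines[i]):
            # Text before the first separator is ignored.
            i += 1
            continue
        # Skip the separator and the blank line(s) that follow it.
        i += 1
        while i < len(lines) and not lines[i].strip():
            i += 1
        if i >= len(lines):
            break
        # Assumption: leading '@' is stripped from the headword line only.
        headword = lines[i].lstrip("@").strip()
        i += 1
        # Everything up to the next separator is the definition.
        body = []
        while i < len(lines) and not SEPARATOR.match(lines[i]):
            body.append(lines[i])
            i += 1
        entries.append((headword, "\n".join(body).strip()))
    return entries
```

The headword line is returned unsplit, so any further convention layered on
top of the C5 format can be applied to it separately.
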
       
For convenience, the following conventions are also assumed:

 * Multiple headwords are separated by ``;<SPACE>`` and are all placed on a
   single line.
 * The UTF-8 encoding is used.
 * Lines starting with ``#`` are stripped out (comment syntax).
 * The first line with ``ABOUT:`` is used as the description of the dictionary.
 * The first URL (a line with ``http://``) is used as the dictionary home page.
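
A minimal sketch of the headword and metadata conventions follows. The
function names ``split_headwords`` and ``find_metadata`` are hypothetical, and
the sketch assumes that the description is whatever follows ``ABOUT:`` on its
line and that a URL ends at the first whitespace.

```python
def split_headwords(line):
    """Split a headword line such as 'colour; color' on '; '."""
    return [word for word in line.split("; ") if word]


def find_metadata(text):
    """Return (description, home_page) from the first matching lines.

    Either element is None when no matching line exists.
    """
    description = None
    home_page = None
    for line in text.splitlines():
        if description is None and "ABOUT:" in line:
            # Assumption: the description is the rest of the line.
            description = line.split("ABOUT:", 1)[1].strip()
        if home_page is None and "http://" in line:
            # Assumption: the URL runs until the next whitespace.
            start = line.index("http://")
            home_page = line[start:].split()[0]
    return description, home_page
```

To honour the UTF-8 convention, the input would be read with something like
``open(path, encoding="utf-8")`` before being passed to these helpers.
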
       
Comment syntax convention.
==========================

Because the ``dictfmt -c5`` format does not support comments, we filter out all
lines that start with ``#``.
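
Such a filter might look like the sketch below. The name ``strip_comments`` is
hypothetical, and the sketch assumes that ``#`` must be the very first
character of the line, so indented ``#`` characters inside a definition
survive.

```python
def strip_comments(text):
    """Drop every line whose first character is '#'."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not line.startswith("#")
    ]
    return "".join(kept)
```

The filtered text can then be handed to ``dictfmt`` without the comment lines
leaking into any definition.
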
       
Dictionary file name convention.
================================

BNF form::

  FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
  PREFIX ::= "gadict"
  LANG ::= ISOCODE | ISOCODE "-" ISOCODE

where ``ISOCODE`` is a two-letter ISO 639-1 language code (currently ``en``,
``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.
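
The grammar can be checked with a regular expression, as in the sketch below.
The grammar leaves ``NAME`` undefined, so the sketch assumes it consists of
letters, digits and underscores without hyphens; ``parse_file_name`` and the
file name ``gadict-voa-en-ru.dict-c5`` are hypothetical and serve only as
illustration.

```python
import re

# Assumption: NAME contains no hyphen, so LANG can be split off unambiguously.
FILE_RE = re.compile(
    r"gadict-(?P<name>[A-Za-z0-9_]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})?)\.dict-c5"
)


def parse_file_name(file_name):
    """Return (name, [language codes]) or None if the name does not match."""
    match = FILE_RE.fullmatch(file_name)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")
```

The pattern checks only the shape of each ``ISOCODE`` (two lowercase letters),
not whether it is a code that is actually assigned in ISO 639-1.
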