.. -*- coding: utf-8 -*-
.. include:: header.rst
gadict HACKING guide.
.. contents::
Document version.
.. include:: VERSION.rst
Versioning rules.
We use a **major.minor** scheme.
Until the dictionary reaches 5000 words, **major** stays at 0. **minor** is incremented from time to time.
Getting sources (VCS).
To clone the repository, run::
$ hg clone gadict-hg
To push to the repository you must have write permission. Then run::
$ hg push ssh://$
$ hg clone https://$
Browsing sources.
hgweb interface for the official repository.
SourceForge Allura interface for the official repository.
Dictionary source file format.
The source files use the dictd C5 file format. See::
$ man 1 dictfmt
* Headwords are preceded by 5 or more underscore characters ``_`` and a blank
  line.
* All text until the next headword is considered the definition.
* Any leading ``@`` characters are stripped out, but the file is
  otherwise unchanged.
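
The three rules above can be sketched as a small parser. This is only an illustration, not the code that ``dictfmt`` runs: the name ``parse_c5`` is hypothetical, and the sketch assumes that the headword is the first non-blank line after the underscore line and that any text before the first underscore line can be ignored.

```python
import re

# A separator is a line made of 5 or more underscores.
SEPARATOR = re.compile(r"^_{5,}\s*$")


def parse_c5(text):
    """Return a list of (headword_line, definition) pairs.

    Assumption: the headword is the first non-blank line after a
    separator; everything up to the next separator is the definition.
    """
    lines = text.splitlines()
    entries = []
    headword = None
    body = []
    i = 0
    while i < len(lines):
        if SEPARATOR.match(lines[i]):
            # Close the previous entry, if any.
            if headword is not None:
                entries.append((headword, "\n".join(body).strip("\n")))
            # Skip the separator and the blank line(s) after it.
            i += 1
            while i < len(lines) and not lines[i].strip():
                i += 1
            headword = lines[i].strip() if i < len(lines) else None
            body = []
            i += 1
            continue
        if headword is not None:
            # Leading '@' characters are stripped; the rest is kept as is.
            body.append(lines[i].lstrip("@"))
        i += 1
    if headword is not None:
        entries.append((headword, "\n".join(body).strip("\n")))
    return entries
```

Calling ``parse_c5`` on a file's contents gives one pair per entry, which is convenient for quick sanity checks before running ``dictfmt``.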
For convenience the following conventions are also used:
* Headwords are separated by ``;<SPACE>`` (and all are placed on a single
  line).
* UTF-8 encoding is used.
* Lines starting with ``#`` are stripped out (comment syntax).
* The first line with ``ABOUT:`` is used as the description of the dictionary.
* The first URL (line with ``http://``) is used as the dictionary home page.
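
The extra conventions can be illustrated with a few helper functions. This is a sketch under assumptions, not the project's actual tooling: the function names are hypothetical, the description is taken to be the text that follows ``ABOUT:`` on its line, and the home page is taken to be the first ``http://`` token found.

```python
import re


def strip_comments(text):
    """Drop every line that starts with '#'."""
    return "\n".join(
        line for line in text.splitlines() if not line.startswith("#")
    )


def split_headwords(headword_line):
    """Split a headword line on ';<SPACE>' into individual headwords."""
    return [hw.strip() for hw in headword_line.split("; ") if hw.strip()]


def find_metadata(text):
    """Return (description, home_page); each is None when not found."""
    about = None
    url = None
    for line in text.splitlines():
        if about is None and "ABOUT:" in line:
            # Assumption: the description is the rest of the line.
            about = line.partition("ABOUT:")[2].strip()
        if url is None:
            match = re.search(r"http://\S+", line)
            if match:
                url = match.group(0)
    return about, url
```

A typical use would be to call ``strip_comments`` first and run the other helpers on its result, so that commented-out lines never contribute headwords or metadata.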
Comment syntax convention.
Because the ``dictfmt -c5`` format has no comment syntax, we filter out all
lines that start with ``#``.
TODO convention.
Entries or parts of text that are not yet complete are marked with keywords:
urgent incomplete
The Makefile rule ``todo`` finds these occurrences in the sources::
$ make todo
Dictionary file name convention.
BNF form::
FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
PREFIX ::= "gadict"
where ``ISOCODE`` is a two-letter ISO 639-1 language code (currently ``en``,
``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.
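
A file name can be checked against this convention with a regular expression. The grammar above does not spell out ``NAME`` or ``LANG``, so the sketch below makes two loudly labeled assumptions: ``NAME`` is alphanumeric with no hyphen, and ``LANG`` is one or more two-letter codes joined by ``-``. The function name ``parse_dict_filename`` is hypothetical.

```python
import re

# Assumptions (not stated by the grammar): NAME has no '-', and
# LANG is one or more 2-letter codes separated by '-'.
FILE_RE = re.compile(
    r"gadict-(?P<name>[A-Za-z0-9]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})*)\.dict-c5"
)


def parse_dict_filename(filename):
    """Return (name, [language codes]) or None if the name does not fit."""
    match = FILE_RE.fullmatch(filename)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")
```

Returning ``None`` rather than raising keeps the helper easy to use in a loop over a directory listing.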
World wide dictionary formats and standards.
Dictionary writing system
Multi-Dictionary Formatter (MDF). It defines about 100 data
field markers.
FieldWorks Language Explorer (or FLEx, for short) is designed
to help field linguists perform many common language
documentation and analysis tasks.
LIFT (Lexicon Interchange FormaT) is an XML format for storing
lexical information, as used in the creation of dictionaries.
It's not necessarily the format for your lexicon.
Lexique Pro is an interactive lexicon viewer and editor, with
hyperlinks between entries, category views, dictionary
reversal, search, and export tools. It's designed to display
your data in a user-friendly format so you can distribute it
to others.
DEBII — Dictionary Editor and Browser
Register gadict dictionaries for dictd under Debian.
$ su
$ cat >>/etc/dictd/dictd.order <<EOF
$ dictdconfig --write
$ /etc/init.d/dictd restart
$ ^D
$ dictdconfig --list
$ dict -d gadict-dictabbr v
Typing IPA chars in Emacs.
To enter IPA characters, use the ``ipa`` input method. To enable it, type::
C-u C-\ ipa <enter>
Ordinary letters are typed as usual. To type special IPA characters, use the
following key sequences (or read the help in Emacs with ``M-x describe-input-method`` or ``C-h I``).
For vowels::
æ ae
ɑ o| or A
ɒ |o or /A
ʊ U
ɛ /3 or E
ɔ /c
ə /e
ʌ /v
ɪ I
For consonants::
θ th
ð dh
ʃ sh
ʧ tsh
ʒ zh or 3
ŋ ng
ɡ g
ɹ /r
Special chars::
ː : (colon)
ˈ ' (quote)
ˌ ` (back quote)
Alternatively, use the ``ipa-x-sampa`` or ``ipa-kirshenbaum`` input method (for help,
type ``C-h I ipa-x-sampa RET`` or ``C-h I ipa-kirshenbaum RET``).