.. -*- coding: utf-8 -*-
.. include:: header.rst
=======================
gadict HACKING guide.
=======================
.. contents::
:local:
Document version.
=================
.. include:: VERSION.rst
Versioning rules.
=================
We use a **major.minor** scheme.
Until we reach 5000 words, **major** stays at 0. **minor** is incremented from time to time.
Getting sources (VCS).
======================
To clone the repository, run::
$ hg clone http://hg.code.sf.net/p/gadict/code gadict-hg
To push to the repository you must have write permission; then run::
$ hg push ssh://$USER@hg.code.sf.net/p/gadict/code
or::
$ hg clone https://$USER@hg.code.sf.net/p/gadict/code
Browsing sources.
=================
http://hg.code.sf.net/p/gadict/code
hgweb interface for the official repository.
https://sourceforge.net/p/gadict/code/
SourceForge Allura interface for the official repository.
Dictionary source file format.
==============================
The source files use the dictd C5 file format. See::
$ man 1 dictfmt
In short:
* Headwords are preceded by 5 or more underscore characters ``_`` and a blank
  line.
* All text until the next headword is considered the definition.
* Any leading ``@`` characters are stripped out, but the file is
otherwise unchanged.
For convenience, the following conventions also apply:
* Headwords are separated by ``;<SPACE>`` (and all are placed on a single
  line).
* UTF-8 encoding is used.
* Lines starting with ``#`` are stripped out (comment syntax).
* The first line with ``ABOUT:`` is used as the description of the dictionary.
* The first URL (line with ``http://``) is used as the dictionary home page.
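
The conventions listed above can be illustrated with a short Python sketch.
It is not the project's actual converter: the function name ``parse_c5`` and
the returned structure are hypothetical, the description is assumed to be the
text following ``ABOUT:``, and the stripping of leading ``@`` characters is
left out.

```python
import re

# A separator line is five or more underscores; the headword line follows
# after one blank line.
SEPARATOR = re.compile(r"^_{5,}\s*$")
URL = re.compile(r"http://\S+")


def parse_c5(text):
    """Return (about, url, entries) for C5-style dictionary source text.

    entries is a list of (headwords, definition) tuples.
    """
    # Comment lines (starting with '#') are dropped before anything else.
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]

    # Assumption: the description is whatever follows the first "ABOUT:".
    about = next(
        (ln.split("ABOUT:", 1)[1].strip() for ln in lines if "ABOUT:" in ln),
        None,
    )
    # The first http:// URL found anywhere is taken as the home page.
    url = next((m.group(0) for m in map(URL.search, lines) if m), None)

    entries = []
    body = None  # definition lines of the entry being collected
    i = 0
    while i < len(lines):
        starts_entry = (
            SEPARATOR.match(lines[i])
            and i + 2 < len(lines)
            and not lines[i + 1].strip()
        )
        if starts_entry:
            # Several headwords share one line, separated by "; ".
            headwords = [h.strip() for h in lines[i + 2].split("; ")]
            body = []
            entries.append((headwords, body))
            i += 3
            continue
        if body is not None:
            body.append(lines[i])
        i += 1

    # Join each definition into one stripped string.
    entries = [(hw, "\n".join(b).strip()) for hw, b in entries]
    return about, url, entries
```

Calling ``parse_c5`` on the text of a source file yields the description, the
home page and one ``(headwords, definition)`` pair per entry, which is enough
to check a file by eye before running ``dictfmt``.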
Comment syntax convention.
==========================
Because the ``dictfmt -c5`` format does not support comments, we filter out all
lines that start with ``#``.
TODO convention.
================
Entries or parts of text that are not complete are marked with keywords:
TODO
incomplete
XXX
urgent incomplete
The Makefile rule ``todo`` finds these occurrences in the sources::
$ make todo
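
For readers without ``make`` at hand, a search of this kind can be
approximated in Python. The sketch below is only an illustration of the
keyword convention and is not what the Makefile rule actually runs; the name
``find_markers`` is hypothetical.

```python
import re

# Keywords from the convention above and their meaning.
MARKERS = {"TODO": "incomplete", "XXX": "urgent incomplete"}
MARKER_RE = re.compile(r"\b(TODO|XXX)\b")


def find_markers(text):
    """Return (line_number, keyword, meaning) for every marker in text."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in MARKER_RE.finditer(line):
            keyword = match.group(1)
            found.append((number, keyword, MARKERS[keyword]))
    return found
```

Line numbers start at 1, so the results can be used directly to jump to the
marked place in an editor.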
Dictionary file name convention.
================================
BNF form::
FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
PREFIX ::= "gadict"
LANG ::= ISOCODE | ISOCODE "-" ISOCODE
where ``ISOCODE`` is a two-letter ISO 639-1 language code (currently ``en``,
``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.
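
The grammar above maps directly onto a regular expression. The following
Python sketch checks a file name against it. Two points are assumptions that
the BNF does not state: ``NAME`` is taken to consist of lowercase letters,
digits and underscores without hyphens (otherwise ``NAME`` and ``LANG`` could
not be told apart), and ``ISOCODE`` is matched as any two lowercase letters.
The function name and the file name ``gadict-demo-en-ru.dict-c5`` are
hypothetical.

```python
import re

# FILE ::= "gadict" "-" NAME "-" LANG ".dict-c5"
# LANG ::= ISOCODE | ISOCODE "-" ISOCODE
NAME_RE = re.compile(
    r"gadict-(?P<name>[a-z0-9_]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})?)\.dict-c5"
)


def parse_dict_filename(filename):
    """Return (name, [language codes]) or None if the name is malformed."""
    match = NAME_RE.fullmatch(filename)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")


# Hypothetical file name used only for illustration.
print(parse_dict_filename("gadict-demo-en-ru.dict-c5"))
```

A stricter check could restrict ``ISOCODE`` to the codes currently in use
(``en``, ``ru``, ``uk``).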
World wide dictionary formats and standards.
============================================
http://en.wikipedia.org/wiki/Dictionary_writing_system
Dictionary writing system
http://www.sil.org/computing/shoebox/mdf.html
Multi-Dictionary Formatter (MDF). It defines about 100 data
field markers.
http://fieldworks.sil.org/flex/
FieldWorks Language Explorer (or FLEx, for short) is designed
to help field linguists perform many common language
documentation and analysis tasks.
http://code.google.com/p/lift-standard/
LIFT (Lexicon Interchange FormaT) is an XML format for storing
lexical information, as used in the creation of dictionaries.
It's not necessarily the format for your lexicon.
http://www.lexiquepro.com/
Lexique Pro is an interactive lexicon viewer and editor, with
hyperlinks between entries, category views, dictionary
reversal, search, and export tools. It's designed to display
your data in a user-friendly format so you can distribute it
to others.
http://deb.fi.muni.cz/index.php
DEBII — Dictionary Editor and Browser
Register gadict dictionaries for dictd under Debian.
====================================================
::
$ su
$ cat >>/etc/dictd/dictd.order <<EOF
gadict-dictabbr
/home/user/usr/share/dictd/
$ dictdconfig --write
$ /etc/init.d/dictd restart
$ ^D
$ dictdconfig --list
$ dict -d gadict-dictabbr v
Typing IPA chars in Emacs.
==========================
To enter IPA characters, use the ``ipa`` input method. To enable it, type::
C-u C-\ ipa <enter>
Ordinary letters are typed as usual. To type special IPA characters, use the following key
bindings (or read the help in Emacs with ``M-x describe-input-method`` or ``C-h I``).
Vowels::
æ ae
ɑ o| or A
ɒ |o or /A
ʊ U
ɛ /3 or E
ɔ /c
ə /e
ʌ /v
ɪ I
Consonants::
θ th
ð dh
ʃ sh
ʧ tsh
ʒ zh or 3
ŋ ng
ɡ g
ɹ /r
Special chars::
ː : (colon)
ˈ ' (quote)
ˌ ` (back quote)
Alternatively, use the ``ipa-x-sampa`` or ``ipa-kirshenbaum`` input method (for help,
type ``C-h I ipa-x-sampa RET`` or ``C-h I ipa-kirshenbaum RET``).