.. -*- coding: utf-8 -*-
======================
gadict HACKING guide
======================
.. contents::
   :local:

Versioning rules
================
We use a **major.minor** version scheme.
Until we reach 5000 words, **major** is 0. **minor** is updated from time to time.
Getting sources
===============
Cloning repository::

  $ hg clone http://hg.defun.work/gadict gadict
  $ hg clone http://hg.code.sf.net/p/gadict/code gadict-hg

Pushing changes::

  $ hg push ssh://$USER@hg.defun.work/gadict
  $ hg push ssh://$USER@hg.code.sf.net/p/gadict/code
  $ hg push https://$USER:$PASS@hg.code.sf.net/p/gadict/code

Browsing sources online
=======================
http://hg.defun.work/gadict
  hgweb at the home page.
http://hg.code.sf.net/p/gadict/code
  hgweb at the old home page (still supported as a mirror).
https://sourceforge.net/p/gadict/code/
  SourceForge Allura interface (not primary, a mirror).

Dictionary source file format
=============================
Source files use the dictd C5 file format. See::

  $ man 1 dictfmt

In short:

* Headwords are preceded by 5 or more underscore characters ``_`` and a blank
  line.
* All text until the next headword is considered the definition.
* Any leading ``@`` characters are stripped out, but the file is otherwise
  unchanged.

For convenience the following conventions also apply:

* Headwords are separated by ``;<SPACE>`` (and all are placed on a single line).
* UTF-8 encoding is used.
* Lines starting with ``#`` are stripped out (comment syntax).
* The first line with ``ABOUT:`` is used as the description of the dictionary.
* The first URL (a line with ``http://``) is used as the dictionary home page.
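
The headword rules above can be sketched in shell. This is only an illustration, not the project's build code: ``list_headwords`` is a hypothetical helper name, and the sketch assumes that a separator line contains nothing but underscores.

```shell
# Hypothetical helper, not part of gadict: print one headword per line.
# Assumes a separator line holds only underscores (5 or more), is followed
# by a blank line, and that the next line carries the headwords.
list_headwords() {
    awk '
        /^#/                 { next }              # comment lines are dropped
        /^_____+$/           { state = 1; next }   # separator line seen
        state == 1 && /^$/   { state = 2; next }   # blank line after separator
        state == 2 {                               # headword line
            n = split($0, hw, "; ")
            for (i = 1; i <= n; i++) print hw[i]
            state = 0
            next
        }
                             { state = 0 }         # any other line resets
    ' "$@"
}

# Small self-contained demonstration on an inline sample entry.
printf '%s\n' '_____' '' 'cat; kitty' 'a small animal' | list_headwords
```

The demonstration prints ``cat`` and ``kitty`` on separate lines. The helper reads the files given as arguments, or standard input when none are given.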
Comment syntax convention
=========================
As the ``dictfmt -c5`` format does not support comment syntax, we filter out all
lines that start with ``#``.
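
A minimal sketch of such a filter, assuming it runs before the text is handed to ``dictfmt``. The name ``strip_comments`` and the file names in the commented usage line are hypothetical, not taken from the project Makefile.

```shell
# Hypothetical helper: drop every line whose first character is '#'.
# Lines where '#' appears later (for example after a space) are kept.
strip_comments() {
    grep -v '^#' "$@"
}

# Assumed usage (file and database names are illustrative only):
#   strip_comments gadict_en-ru.gadict | dictfmt -c5 --utf8 gadict_en-ru
```

Only a ``#`` in the first column marks a comment here, which matches the rule stated above.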
TODO convention
===============
Entries or parts of text that are not complete are marked with keywords:

TODO
  incomplete
XXX
  urgent, incomplete

The Makefile rule ``todo`` finds these occurrences in the sources::

  $ make todo

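
What such a search might look like is sketched below. This is an assumption about the idea, not a copy of the actual Makefile rule; ``find_todo`` is a hypothetical name.

```shell
# Hypothetical equivalent of a "todo" search: print the line number and text
# of every line that contains TODO or XXX as a whole word.
find_todo() {
    grep -nwE 'TODO|XXX' "$@"
}

# Small self-contained demonstration on inline sample text.
printf '%s\n' 'done entry' 'TODO add IPA' 'XXX broken' | find_todo
```

The ``-w`` option keeps words such as ``XXXL`` from matching. When several files are passed, ``grep`` also prefixes each match with the file name.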
Dictionary file name convention
===============================
BNF form::

  FILE ::= "gadict_" NAME ".gadict"

``NAME`` may have the form ``ISOCODE "-" ISOCODE``, like ``en-ru``, where
``ISOCODE`` is an ISO 639-1 (2-letter) language code.

``NAME`` may also be a dictionary abbreviation name.

During dictionary compilation and releases the ``".gadict"`` suffix is changed to
an appropriate one, but the base name should be preserved as ``"gadict_" NAME``.
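
The naming rule can be checked with a short shell sketch. Both function names are hypothetical, and the language-pair check assumes lowercase 2-letter codes.

```shell
# Hypothetical helper: print NAME if the argument matches
# FILE ::= "gadict_" NAME ".gadict", otherwise return a non-zero status.
gadict_name() {
    base=${1##*/}                      # drop any directory part
    case $base in
        gadict_?*.gadict)
            name=${base#gadict_}       # strip the fixed prefix
            printf '%s\n' "${name%.gadict}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Hypothetical helper: succeed if NAME looks like ISOCODE "-" ISOCODE.
# Assumes lowercase 2-letter codes such as en-ru.
is_language_pair() {
    case $1 in
        [a-z][a-z]-[a-z][a-z]) return 0 ;;
        *) return 1 ;;
    esac
}

gadict_name gadict_en-ru.gadict
```

The last line prints ``en-ru``. A build step that changes the suffix can reuse the printed ``NAME`` to keep the ``"gadict_" NAME`` base name.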
World wide dictionary formats and standards
===========================================
http://en.wikipedia.org/wiki/Dictionary_writing_system
  Dictionary writing system
http://www.sil.org/computing/shoebox/mdf.html
  Multi-Dictionary Formatter (MDF). It defines about 100 data
  field markers.
http://fieldworks.sil.org/flex/
  FieldWorks Language Explorer (or FLEx, for short) is designed
  to help field linguists perform many common language
  documentation and analysis tasks.
http://code.google.com/p/lift-standard/
  LIFT (Lexicon Interchange FormaT) is an XML format for storing
  lexical information, as used in the creation of dictionaries.
  It's not necessarily the format for your lexicon.
http://www.lexiquepro.com/
  Lexique Pro is an interactive lexicon viewer and editor, with
  hyperlinks between entries, category views, dictionary
  reversal, search, and export tools. It's designed to display
  your data in a user-friendly format so you can distribute it
  to others.
http://deb.fi.muni.cz/index.php
  DEBII: Dictionary Editor and Browser

Register gadict dictionaries for dictd under Debian
===================================================
::

  $ su
  $ cat >>/etc/dictd/dictd.order <<EOF
  gadict-dictabbr
  /home/user/usr/share/dictd/
  EOF
  $ dictdconfig --write
  $ /etc/init.d/dictd restart
  $ ^D
  $ dictdconfig --list
  $ dict -d gadict-dictabbr v

Typing IPA chars in Emacs
=========================
To enter IPA chars use the IPA input method. To enable it type::

  C-u C-\ ipa <enter>

All chars from the alphabet are typed as usual. To type special IPA chars use the
following key bindings (or read the help in Emacs with ``M-x describe-input-method``
or ``C-h I``).

For vowels::

  æ   ae
  ɑ   o| or A
  ɒ   |o or /A
  ʊ   U
  ɛ   /3 or E
  ɔ   /c
  ə   /e
  ʌ   /v
  ɪ   I

For consonants::

  θ   th
  ð   dh
  ʃ   sh
  ʧ   tsh
  ʒ   zh or 3
  ŋ   ng
  ɡ   g
  ɹ   /r

Special chars::

  ː   : (colon)
  ˈ   ' (quote)
  ˌ   ` (back quote)

Alternatively use the ``ipa-x-sampa`` or ``ipa-kirshenbaum`` input method (for help
type ``C-h I ipa-x-sampa RET`` or ``C-h I ipa-kirshenbaum RET``).