HACKING.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Sun, 13 Mar 2016 01:40:59 +0200
Fix formatting and typos for automated translation to different dictionary format.

.. -*- coding: utf-8 -*-
.. include:: header.rst

=======================
 gadict HACKING guide.
=======================
.. contents::
   :local:


Document version.
=================

.. include:: VERSION.rst

Versioning rules.
=================

We use a **major.minor** scheme.

Until the dictionary reaches 5000 words, **major** stays 0. **minor** is
updated from time to time.

Getting sources (VCS).
======================

To clone the repository run::

  $ hg clone http://hg.code.sf.net/p/gadict/code gadict-hg

To push to the repository you must have write permission and run::

  $ hg push ssh://$USER@hg.code.sf.net/p/gadict/code

or clone over HTTPS with your user name (the clone URL then becomes the default
push target)::

  $ hg clone https://$USER@hg.code.sf.net/p/gadict/code

Browsing sources.
=================

  http://hg.code.sf.net/p/gadict/code
                hgweb interface for official repository.
  https://sourceforge.net/p/gadict/code/
                SourceForge Allura interface for official repository.

Dictionary source file format.
==============================

Source files use the dictd C5 file format. See::

  $ man 1 dictfmt

In short:

 * Headwords are preceded by 5 or more underscore characters ``_`` and a blank
   line.
 * All text until the next headword is considered the definition.
 * Any leading ``@`` characters are stripped out, but the file is
   otherwise unchanged.

For convenience the following assumptions are also made:

 * Headwords are separated by ``;<SPACE>`` (and all are placed on a single
   line).
 * UTF-8 encoding is used.
 * Lines starting with ``#`` are stripped out (comment syntax).
 * The first line with ``ABOUT:`` is used as the description of the dictionary.
 * The first URL (line with ``http://``) is used as the dictionary home page.
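
The rules above can be illustrated with a minimal Python sketch. It is not the
project's actual tooling: the function name ``parse_c5`` is hypothetical, and
the sketch ignores ``@`` stripping and the ``ABOUT:`` / home page conventions.
It only shows how separator lines, headword lines and comment lines relate.

```python
import re

# A separator is a line made of 5 or more underscores.
SEPARATOR = re.compile(r"^_{5,}\s*$")


def parse_c5(text):
    """Return a list of (headwords, definition) tuples (hypothetical helper)."""
    # Comment convention of this guide: drop lines that start with '#'.
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    entries = []
    i = 0
    while i < len(lines):
        is_entry_start = (
            SEPARATOR.match(lines[i])
            and i + 2 < len(lines)
            and lines[i + 1].strip() == ""
        )
        if is_entry_start:
            # Headwords share one line and are separated by ';<SPACE>'.
            headwords = [h.strip() for h in lines[i + 2].split("; ")]
            i += 3
            body = []
            # Everything up to the next separator is the definition.
            while i < len(lines) and not SEPARATOR.match(lines[i]):
                body.append(lines[i])
                i += 1
            entries.append((headwords, "\n".join(body).strip()))
        else:
            i += 1
    return entries
```

Calling ``parse_c5`` on a text with two separator-introduced entries yields two
tuples, each holding the list of headwords and the definition text.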

Comment syntax convention.
==========================

As the dictd C5 format does not support comment syntax, we filter out all
lines that start with ``#``.

TODO convention.
================

Entries or parts of text that are not complete are marked by keywords:

  TODO
                incomplete
  XXX
                urgent incomplete

The Makefile rule ``todo`` finds these occurrences in the sources::

  $ make todo
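
The body of the ``todo`` rule is not reproduced in this guide. As a rough
illustration of the kind of search it performs, the following Python sketch
reports lines that contain one of the two keywords. The function name
``find_todos`` is hypothetical, and the real rule may work differently.

```python
import re

# Match the keywords only as whole words, so "TODOS" is not reported.
TODO_RE = re.compile(r"\b(TODO|XXX)\b")


def find_todos(lines):
    """Yield (line_number, keyword, text) for every marked line."""
    for number, line in enumerate(lines, start=1):
        match = TODO_RE.search(line)
        if match:
            yield number, match.group(1), line.rstrip("\n")
```

Passing an open source file to ``find_todos`` gives the line numbers to
revisit; lines marked ``XXX`` are the urgent ones.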

Dictionary file name convention.
================================

BNF form::

  FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
  PREFIX ::= "gadict"
  LANG ::= ISOCODE | ISOCODE "-" ISOCODE

where ``ISOCODE`` is a two-letter ISO 639-1 language code (currently ``en``,
``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.
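
The grammar can be checked mechanically. The sketch below is one possible
translation into a Python regular expression. The BNF does not define
``NAME``, so the pattern assumes it consists of lowercase letters and digits
without hyphens; the helper name ``parse_file_name`` and the file names used
with it are hypothetical.

```python
import re

# FILE ::= "gadict" "-" NAME "-" LANG ".dict-c5"
# Assumption: NAME is [a-z0-9]+ (the BNF leaves NAME undefined).
FILE_RE = re.compile(
    r"gadict-(?P<name>[a-z0-9]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})?)\.dict-c5"
)


def parse_file_name(file_name):
    """Return (name, [iso codes]) or None if the name breaks the convention."""
    match = FILE_RE.fullmatch(file_name)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")
```

For example, a name of the form ``gadict-abbr-en-ru.dict-c5`` would be accepted
and split into the name ``abbr`` and the codes ``en`` and ``ru``, while a
three-letter language code would be rejected.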

World wide dictionary formats and standards.
============================================

  http://en.wikipedia.org/wiki/Dictionary_writing_system
                Dictionary writing system
  http://www.sil.org/computing/shoebox/mdf.html
                Multi-Dictionary Formatter (MDF). It defines about 100 data
                field markers.
  http://fieldworks.sil.org/flex/
                FieldWorks Language Explorer (or FLEx, for short) is designed
                to help field linguists perform many common language
                documentation and analysis tasks.
  http://code.google.com/p/lift-standard/
                LIFT (Lexicon Interchange FormaT) is an XML format for storing
                lexical information, as used in the creation of dictionaries.
                It's not necessarily the format for your lexicon.
  http://www.lexiquepro.com/
                Lexique Pro is an interactive lexicon viewer and editor, with
                hyperlinks between entries, category views, dictionary
                reversal, search, and export tools. It's designed to display
                your data in a user-friendly format so you can distribute it
                to others.
  http://deb.fi.muni.cz/index.php
                DEBII — Dictionary Editor and Browser

Register gadict dictionaries for dictd under Debian.
====================================================
::

  $ su
  $ cat >>/etc/dictd/dictd.order <<EOF
  gadict-dictabbr
  /home/user/usr/share/dictd/
  EOF
  $ dictdconfig --write
  $ /etc/init.d/dictd restart
  $ ^D
  $ dictdconfig --list
  $ dict -d gadict-dictabbr v

Typing IPA chars in Emacs.
==========================

To enter IPA characters use the ``ipa`` input method. To enable it type::

  C-u C-\ ipa <enter>

All letters of the alphabet are typed as usual. To type special IPA characters
use the following key bindings (or read the help in Emacs via
``M-x describe-input-method`` or ``C-h I``).

For vowels::

  æ  ae
  ɑ  o| or A
  ɒ  |o  or /A
  ʊ  U
  ɛ  /3 or E
  ɔ  /c
  ə  /e
  ʌ  /v
  ɪ  I

For consonants::

  θ  th
  ð  dh
  ʃ  sh
  ʧ  tsh
  ʒ  zh or 3
  ŋ  ng
  ɡ  g
  ɹ  /r

Special chars::

  ː  : (colon)
  ˈ  ' (quote)
  ˌ  ` (back quote)

Alternatively, use the ``ipa-x-sampa`` or ``ipa-kirshenbaum`` input method (for
help type ``C-h I ipa-x-sampa RET`` or ``C-h I ipa-kirshenbaum RET``).