HACKING.rst
author Oleksandr Gavenko <gavenkoa@gmail.com>
Fri, 13 Jul 2012 00:09:11 +0300
changeset 237 40f8ce891969
parent 233 d3670cd252ce
child 242 8deb7f09622a
permissions -rw-r--r--
Update to new 0.5 release.

.. -*- fill-column: 78 -*-

.. include:: header.rst

=======================
 gadict HACKING guide.
=======================
.. contents::

Document version.
=================

.. include:: VERSION.rst

Versioning rules.
=================

We use a **major.minor** scheme.

Until we reach 5000 words, **major** is 0. **minor** is updated from time to
time.

Getting sources (VCS).
======================

To clone the repository, run::

  $ hg clone http://hg.code.sf.net/p/gadict/code gadict-hg

To push to the repository you must have write permission; then run::

  $ hg push ssh://$USER@hg.code.sf.net/p/gadict/code

or clone with your user name in the URL, so that a plain ``hg push`` uses it::

  $ hg clone https://$USER@hg.code.sf.net/p/gadict/code
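
To avoid retyping the URL, the push location can be stored in the clone's
configuration. A minimal sketch, assuming Mercurial's ``[paths]`` section and
its ``default-push`` key; the helper name ``add_push_path`` is hypothetical and
not part of gadict::

  # Hypothetical helper: append a push path to the hgrc of the clone in $1.
  add_push_path() {
      printf '[paths]\ndefault-push = ssh://%s@hg.code.sf.net/p/gadict/code\n' \
          "$USER" >>"$1/.hg/hgrc"
  }

After ``add_push_path gadict-hg``, a plain ``hg push`` inside ``gadict-hg``
should go to the SSH URL.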

Browsing sources.
=================

  http://hg.code.sf.net/p/gadict/code
                hgweb interface for the official repository.
  https://sourceforge.net/p/gadict/code/
                SourceForge Allura interface for the official repository.

Dictionary source file format.
==============================

Source files use the dictd C5 file format. See::

  $ man 1 dictfmt

In short:

 * Headwords are preceded by 5 or more underscore characters ``_`` and a blank
   line.
 * All text until the next headword is considered the definition.
 * Any leading ``@`` characters are stripped out, but the file is
   otherwise unchanged.

For convenience, the following conventions also apply:

 * Multiple headwords are separated by ``;<SPACE>`` and all placed on a single
   line.
 * UTF-8 encoding is used.
 * Lines starting with ``#`` are stripped out (comment syntax).
 * The first line with ``ABOUT:`` is used as the description of the
   dictionary.
 * The first URL (a line with ``http://``) is used as the dictionary home
   page.
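
These conventions can be sketched as a few shell helpers. This is only an
illustration, not the project's build code: the function names are
hypothetical, the separator is assumed to be a line made only of underscores,
and comment lines are assumed to be removed already. ``grep -o`` is a GNU/BSD
extension::

  # Print the dictionary description: the text after "ABOUT:" on the
  # first line that contains it.
  c5_about() {
      sed -n '/ABOUT:/{s/^.*ABOUT:[[:space:]]*//p;q;}' "$1"
  }

  # Print the home page: the first "http://" URL found in the file.
  c5_homepage() {
      grep -o 'http://[^[:space:]]*' "$1" | head -n 1
  }

  # Print every headword on its own line, splitting "a; b" lines.
  c5_headwords() {
      awk '
          /^_____+$/         { state = 1; next }  # 5 or more underscores
          state == 1 && /^$/ { state = 2; next }  # blank line after them
          state == 2         { n = split($0, hw, "; ")
                               for (i = 1; i <= n; i++) print hw[i] }
                             { state = 0 }
      ' "$1"
  }

Each helper takes the path of a source file as its only argument.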

Comment syntax convention.
==========================

As the ``dictfmt -c5`` format does not support comment syntax, we filter out
all lines that start with ``#``.
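
A minimal sketch of such a filter, assuming a comment is any line whose first
character is ``#``; the name ``strip_comments`` is hypothetical::

  # Copy stdin to stdout, dropping lines that start with "#".
  strip_comments() {
      grep -v '^#'
  }

The filtered text can then be piped into ``dictfmt -c5``, which reads the
dictionary source from standard input.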

World wide dictionary formats and standards.
============================================

  http://en.wikipedia.org/wiki/Dictionary_writing_system
                Dictionary writing system
  http://www.sil.org/computing/shoebox/mdf.html
                Multi-Dictionary Formatter (MDF). It defines about 100 data
                field markers.
  http://fieldworks.sil.org/flex/
                FieldWorks Language Explorer (or FLEx, for short) is designed
                to help field linguists perform many common language
                documentation and analysis tasks.
  http://code.google.com/p/lift-standard/
                LIFT (Lexicon Interchange FormaT) is an XML format for storing
                lexical information, as used in the creation of dictionaries.
                It's not necessarily the format for your lexicon.
  http://www.lexiquepro.com/
                Lexique Pro is an interactive lexicon viewer and editor, with
                hyperlinks between entries, category views, dictionary
                reversal, search, and export tools. It's designed to display
                your data in a user-friendly format so you can distribute it
                to others.
  http://deb.fi.muni.cz/index.php
                DEBII — Dictionary Editor and Browser

Register gadict dictionaries for dictd under Debian.
====================================================
::

  $ su
  $ cat >>/etc/dictd/dictd.order <<EOF
  gadict-dictabbr
  /home/user/usr/share/dictd/
  EOF
  $ dictdconfig --write
  $ /etc/init.d/dictd restart
  $ ^D
  $ dictdconfig --list
  $ dict -d gadict-dictabbr v
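
Appending with ``cat >>`` adds a duplicate entry each time the step is
repeated. A sketch of an idempotent variant, using a hypothetical helper
``order_add`` that takes the order file and the dictionary name::

  # Append name $2 to order file $1 unless that exact line is already there.
  order_add() {
      grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >>"$1"
  }

For example, run ``order_add /etc/dictd/dictd.order gadict-dictabbr`` as root
before ``dictdconfig --write``.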


IPA chars.
==========

Here is a list of special IPA characters (encoded in UTF-8)::

  θ ð ʃ ŋ ʧ ʒ ı ɡ ː ˌ ˈ ˑ ˏ ˊ ˋ
  æ ɑ ɒ ʌ ʊ ɒ ɛ ə ɜ ɔ ɪ є ɚ

You can copy and paste them into a phonetic string.

TODO
----
::

            Front       Central     Back
         long  short long  short long  short
  Close   iː    ɪ                 uː    ʊ
  Mid           e     ɜː    ə     ɔː
  Open          æ           ʌ     ɑː    ɒ

      Diphthong          Triphthong
  Closing    Centring
   /eɪ/       /ɪə/        /aɪə/
   /aɪ/       /eə/        /ɑʊə/
   /ɔɪ/       /ʊə/
   /əʊ/
   /aʊ/

Old vs. new transcription.
--------------------------

From "Better English pronunciation."::

  Old  iː i e ɔː u uː ei ou ai au ɔi æ ɔ ʌ əː ɑː iə ɛə uə ə
  New  iː ɪ e ɔː ʊ uː eɪ əʊ aɪ aʊ ɔɪ æ ɒ ʌ ɜː ɑː ɪə eə ʊə ə

Also from Wikipedia::

  Old  æ e əː ʌɪ ɑʊ ɛə
  New  a ɛ ɜː aɪ aʊ eə

Emacs.
------

To enter IPA characters, use the ``ipa`` input method. To enable it, type::

  C-u C-\ ipa <enter>

Ordinary letters of the alphabet are typed as usual. To type special IPA
characters, use the following key bindings.

For vowels::

  æ ae
  ɑ o| (small letter o and vertical bar) or A (capital letter A)
  ɒ |o (vertical bar and small letter o) or /A
  ʊ U (capital letter U)
  ɛ /3 (slash three)
  ɔ /c
  ə /e
  ʌ /v
  ɪ I

For consonants::

  θ th
  ð dh
  ʃ sh
  ʒ zh or 3
  ŋ ng
  ɡ g

Special chars::

  ː : (colon)
  ˈ ' (single quote)
  ˌ ` (back quote)