About my c5 dictionary format usage.
author Oleksandr Gavenko <gavenkoa@gmail.com>
Sun, 13 Mar 2016 15:51:35 +0200
changeset 337 33b08a8b7fb1
parent 336 3c46d18b1ff0
child 338 61a9d2de0e3e
obsolete/HACKING.rst
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/obsolete/HACKING.rst	Sun Mar 13 15:51:35 2016 +0200
@@ -0,0 +1,48 @@
+.. -*- coding: utf-8 -*-
+
+=======================
+ gadict HACKING guide.
+=======================
+.. contents::
+   :local:
+
+Dictionary source file format.
+==============================
+
+The source files use the dictd C5 file format. See::
+
+  $ man 1 dictfmt
+
+In short:
+
+ * Headwords are preceded by 5 or more underscore characters ``_`` and a blank
+   line.
+ * All text until the next headword is considered the definition.
+ * Any leading ``@`` characters are stripped out, but the file is otherwise
+   unchanged.
+
+For convenience, the following conventions also apply:
+
+ * Headwords are separated by ``;<SPACE>`` (and all are placed on a single line).
+ * UTF-8 encoding is used.
+ * Lines starting with ``#`` are stripped out (comment syntax).
+ * The first line with ``ABOUT:`` is used as the dictionary description.
+ * The first URL (a line with ``http://``) is used as the dictionary home page.
+
+Comment syntax convention.
+==========================
+
+As the ``dictfmt -c5`` format does not support comment syntax, we filter out
+all lines that start with ``#``.
+
+Dictionary file name convention.
+================================
+
+BNF form::
+
+  FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5"
+  PREFIX ::= "gadict"
+  LANG ::= ISOCODE | ISOCODE "-" ISOCODE
+
+where ``ISOCODE`` is a two-letter ISO 639-1 language code (currently ``en``,
+``ru`` and ``uk`` are in use) and ``NAME`` is the abbreviated dictionary name.
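
The conventions described in the HACKING guide above (the file name grammar, ``#`` comment filtering and ``;<SPACE>`` headword separation) can be sketched in Python. This is only an illustration and is not part of the changeset. The function names are hypothetical, ``NAME`` is assumed to contain no ``-`` so that the grammar stays unambiguous, ``ISOCODE`` is assumed to be lowercase, and ``example`` is a made-up dictionary name.

```python
import re

# Pattern derived from the BNF: FILE ::= PREFIX "-" NAME "-" LANG ".dict-c5".
# Assumption: NAME has no "-" and ISOCODE is two lowercase ASCII letters.
FILE_RE = re.compile(
    r"gadict-(?P<name>[^-]+)-(?P<lang>[a-z]{2}(?:-[a-z]{2})?)\.dict-c5"
)


def parse_file_name(file_name):
    """Return (NAME, [ISOCODE, ...]) or None if the name does not match."""
    match = FILE_RE.fullmatch(file_name)
    if match is None:
        return None
    return match.group("name"), match.group("lang").split("-")


def strip_comments(text):
    """Drop every line that starts with '#', keeping all other lines as-is."""
    lines = text.splitlines(keepends=True)
    return "".join(line for line in lines if not line.startswith("#"))


def split_headwords(line):
    """Split a single headword line on the ';<SPACE>' separator."""
    return line.strip().split("; ")
```

For example, ``parse_file_name("gadict-example-en-ru.dict-c5")`` yields the name ``example`` and the language pair ``en``, ``ru``, while a name without the ``gadict`` prefix is rejected. Comment filtering would have to run before the file is passed to ``dictfmt``, since the C5 format itself has no comment syntax.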