How does it work?

The Model

Ontology

Our ontology comprises thousands of precisely defined, semantically simple English concepts, including universal semantic primitives (based on the work of Wierzbicka and Goddard) and core vocabulary from the Longman Dictionary. The concepts are organized into seven semantic categories. Each concept, even one with multiple senses (like the various meanings of "be"), is used consistently across all semantic representations.


Example of "be" from the Ontology
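To make the idea of a multi-sense concept more concrete, here is a minimal Python sketch of how such an entry might be stored. The class names, the category value, the sense labels, and the glosses are all assumptions made for illustration; they are not taken from the project's actual ontology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sense:
    label: str  # hypothetical sense label, e.g. "A"
    gloss: str  # short plain-English definition (invented for this sketch)


@dataclass
class Concept:
    lemma: str
    category: str  # one of the seven semantic categories (name assumed here)
    senses: list[Sense] = field(default_factory=list)

    def sense(self, label: str) -> Sense:
        """Return the sense with the given label, or raise KeyError."""
        for s in self.senses:
            if s.label == label:
                return s
        raise KeyError(f"{self.lemma} has no sense {label!r}")


def sense_id(concept: Concept, label: str) -> str:
    """Build a stable identifier so every representation refers to the same sense."""
    return f"{concept.lemma}-{concept.sense(label).label}"


be = Concept(
    lemma="be",
    category="event",
    senses=[
        Sense("A", "X has a property"),
        Sense("B", "X is at a place"),
        Sense("C", "X is the same thing as Y"),
    ],
)
```

Referring to a sense by a stable identifier such as `sense_id(be, "B")` is one simple way to keep usage consistent across representations.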

Semantic Representation

Existing semantic frameworks lacked the depth needed for minority languages. Therefore, we created a new system of simple concepts in simple propositions, richly annotated with features. These features, like Number (including Singular, Dual, Trial, Quadrial, Plural, and Paucal), Participant Tracking, Proximity, and many more, capture nuances often crucial for accurate translation. When biblical scholars have differing interpretations, we include alternative representations to reflect those viewpoints.


Example Semantic Representation for Gen 6:8
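The sketch below shows one possible way to model a simple proposition annotated with features, using the Number values listed above. The class names, the role labels, and the sample sentence are assumptions for illustration only and do not reproduce the project's real representation format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Number(Enum):
    SINGULAR = 1
    DUAL = 2
    TRIAL = 3
    QUADRIAL = 4
    PAUCAL = 5
    PLURAL = 6


@dataclass
class Participant:
    concept: str                      # ontology concept, e.g. "man"
    role: str                         # semantic role label (names assumed)
    number: Number = Number.SINGULAR
    features: dict[str, str] = field(default_factory=dict)  # other features, e.g. proximity


@dataclass
class Proposition:
    event: str                        # ontology concept for the event
    participants: list[Participant] = field(default_factory=list)
    # Alternative analyses, for passages that scholars interpret differently.
    alternatives: list["Proposition"] = field(default_factory=list)

    def by_role(self, role: str) -> Optional[Participant]:
        """Return the first participant with the given role, or None."""
        return next((p for p in self.participants if p.role == role), None)


# "Two men saw a house": an invented sentence, not taken from the project's data.
prop = Proposition(
    event="see",
    participants=[
        Participant("man", role="agent", number=Number.DUAL),
        Participant("house", role="patient"),
    ],
)
```

Recording Number explicitly on each participant means a target language with a dual can use it, while a language without one can fall back to its plural.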

Lexicon

Our lexicon stores all target language words, features, and forms, categorized into seven syntactic classes (nouns, verbs, adjectives, adverbs, adpositions, conjunctions, and particles). Linguists define the relevant features and forms for each language (e.g., noun gender, honorifics, class). Lexical spellout rules generate word forms, handling inflection and sound changes. Suppletive forms are stored directly in the lexicon.


Example Lexicon for the Tagalog language
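As a hedged illustration of how spellout rules and stored suppletive forms might interact, the sketch below uses English plurals so the data is easy to check. The `LexiconEntry` class and `spell_out_plural` function are hypothetical names, and the rule covers only a few regular English patterns.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LexiconEntry:
    stem: str
    word_class: str                                         # e.g. "noun"
    features: dict[str, str] = field(default_factory=dict)  # e.g. {"gender": "..."}
    forms: dict[str, str] = field(default_factory=dict)     # stored forms, e.g. suppletive ones


def spell_out_plural(entry: LexiconEntry) -> str:
    """Return the plural form: a stored form wins, otherwise apply the rules."""
    if "plural" in entry.forms:
        return entry.forms["plural"]
    stem = entry.stem
    if re.search(r"(s|x|z|ch|sh)$", stem):
        return stem + "es"            # church -> churches
    if re.search(r"[^aeiou]y$", stem):
        return stem[:-1] + "ies"      # city -> cities
    return stem + "s"                 # book -> books


book = LexiconEntry("book", "noun")
church = LexiconEntry("church", "noun")
city = LexiconEntry("city", "noun")
# "people" cannot be derived from "person" by rule, so it is stored in the entry.
person = LexiconEntry("person", "noun", forms={"plural": "people"})
```

Checking the stored forms first keeps the rules small: they only need to describe the regular patterns.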

Transfer Grammar

Because a truly language-neutral semantic representation is unattainable, our transfer grammar adapts the semantic representations to each target language. It restructures these representations, generating appropriate deep structures by creating grammatical relations from semantic roles, adjusting verb theta grids, handling relativization, marking noun relationships, building clause chains, resolving collocation issues, and more. The result is a target-language-specific deep structure.


Transfer Grammar Model

Complex concept insertion rules

⬇️

Feature adjustment rules

⬇️

Styles of direct speech

⬇️

Target tense/aspect/mood rules

⬇️

Relative clause strategies

⬇️

Collocation correction rules

⬇️

Genitival object-object relationships

⬇️

Theta grid adjustment rules

⬇️

Structural adjustment rules


An example of a structural adjustment rule that aggregates possessor noun phrases in the Tagalog language
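The ordered stages of the transfer grammar model can be sketched as a simple pipeline in which each stage holds rules that rewrite a structure and pass it on. The stage names follow the model above; the dictionary-based structure, the `run_transfer` function, and both sample rules are assumptions made for illustration, not the project's implementation.

```python
from typing import Callable

Structure = dict
Rule = Callable[[Structure], Structure]

# Stage order copied from the transfer grammar model.
TRANSFER_STAGES = [
    "complex concept insertion",
    "feature adjustment",
    "styles of direct speech",
    "target tense/aspect/mood",
    "relative clause strategies",
    "collocation correction",
    "genitival object-object relationships",
    "theta grid adjustment",
    "structural adjustment",
]


def run_transfer(structure: Structure, rules: dict[str, list[Rule]]) -> Structure:
    """Apply each stage's rules in the fixed stage order."""
    for stage in TRANSFER_STAGES:
        for rule in rules.get(stage, []):
            structure = rule(structure)
    return structure


def collapse_number(structure: Structure) -> Structure:
    """Hypothetical feature adjustment for a language marking only singular and plural."""
    out = dict(structure)
    if out.get("number") in {"dual", "trial", "quadrial", "paucal"}:
        out["number"] = "plural"
    return out


def mark_restructured(structure: Structure) -> Structure:
    """Placeholder structural adjustment: records the number value it received."""
    out = dict(structure)
    out["seen_by_structural_rule"] = out.get("number")
    return out


# The dict order does not matter; TRANSFER_STAGES decides when each rule runs.
rules = {
    "structural adjustment": [mark_restructured],
    "feature adjustment": [collapse_number],
}
result = run_transfer({"concept": "man", "number": "dual"}, rules)
```

Because the feature adjustment stage runs before structural adjustment, the later rule sees the already adjusted value `"plural"`.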

Synthesizing Grammar

Our synthesizing grammar mirrors the descriptive grammars used by field linguists. It handles agreement marking, lexical form selection, contextual affixation (prefixes, suffixes, infixes, circumfixes, suprafixes, and clitics), constituent ordering, sound changes, insertion of pronouns and switch-reference markers, and punctuation. Taking the transfer grammar's output (the deep structure) as input, it produces the final translated text.


Synthesizing Grammar Model

Feature copying rules

⬇️

Spellout rules

⬇️

Clitic rules

⬇️

Constituent movement rules

⬇️

Phrase structure rules   🔁   Pronoun rules

⬇️

Word morphophonemic rules

⬇️

Find and replace rules


An example of a spellout rule that inserts case markers in the Tagalog language
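For readers who want a rough feel for what a case-marker spellout rule does, here is a minimal Python sketch. The Tagalog markers themselves are standard, but the table layout, the function name, and the case labels (direct, genitive, oblique) follow a common textbook description and are assumptions; the project's own rule format and case analysis may differ.

```python
# Tagalog case markers, keyed by (noun type, case).
CASE_MARKERS = {
    ("common", "direct"): "ang",
    ("common", "genitive"): "ng",
    ("common", "oblique"): "sa",
    ("personal", "direct"): "si",
    ("personal", "genitive"): "ni",
    ("personal", "oblique"): "kay",
}


def insert_case_marker(noun: str, noun_type: str, case: str) -> str:
    """Place the marker selected by the noun's features in front of the noun."""
    marker = CASE_MARKERS[(noun_type, case)]
    return f"{marker} {noun}"


house = insert_case_marker("bahay", "common", "direct")
to_maria = insert_case_marker("Maria", "personal", "oblique")
```

The rule only reads features that earlier stages have already placed on the noun, which is why it can be a plain table lookup.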


An example of a spellout rule that supplies tense morphemes in the Kewa language


An example of a clitic rule that marks the objects of reciprocal actions in the Kewa language


An example of a phrase structure rule for clauses in the Tagalog language


An example of a morphophonemic rule for common ergative case markers in the Tagalog language

"A New Approach to Bible Translation"