20 January, 2013

Removing inferred roots from verb smart paradigms

In the Maltese resource grammar implementation I had some code that tried to extract the radicals from a so-called mamma verb form. For example:

classifyVerb "ħareġ"

would give (amongst other information) the radicals Ħ-R-Ġ in record form. This works well most of the time, but for some weak-root verbs the missing radical simply cannot be guessed from the mamma. For example, dar is the mamma of two distinct verbs, one with root D-W-R and another with root D-J-R.
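To make the ambiguity concrete, here is a minimal sketch in Python rather than GF. It is not the actual classifyVerb code: the function name guess_radicals and the "drop the vowels" heuristic are assumptions made purely for illustration.

```python
import unicodedata

# Simplified vowel set; real Maltese orthography has more cases (assumption).
VOWELS = set("aeiou")

def guess_radicals(mamma):
    """Naively collect the consonants of a mamma form.

    Hypothetical helper: treats the digraph 'għ' as a single consonant
    and skips every vowel. It has no way to recover a weak radical that
    does not surface in the written form.
    """
    mamma = unicodedata.normalize("NFC", mamma.lower())
    radicals = []
    i = 0
    while i < len(mamma):
        if mamma.startswith("għ", i):
            radicals.append("għ")
            i += 2
        elif mamma[i] in VOWELS:
            i += 1
        else:
            radicals.append(mamma[i])
            i += 1
    return radicals

print(guess_radicals("ħareġ"))  # all three radicals are visible
print(guess_radicals("dar"))    # only two consonants: the middle radical is lost
```

For dar the heuristic finds only D and R, and nothing in the string says whether the missing middle radical is W or J.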

The usual way of dealing with this is to have a less-smart fallback in your paradigm, which takes an explicit root in such ambiguous cases. But in this case we don’t even need the smarter version of the paradigm. The root-and-pattern verbs in Maltese form a closed set: no new verbs of this kind are being added to the language (all new verbs today enter as loan verbs). Furthermore, the list has already been compiled by Michael Spagnol in his PhD thesis, and we now even have it in database form here. I am using this to directly build a monolingual Maltese verb database in GF, and since I already have the radicals for all these verbs, there really is no need to try to determine them automatically in a smart paradigm. As my professor Aarne Ranta likes to say, “don’t guess what you know.”
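As a rough illustration of the "don't guess what you know" approach, the following Python sketch stores each verb together with its explicit root, so ambiguous forms simply appear as separate entries. The names ENTRIES, build_index and roots_for are hypothetical, and the three entries are just the examples mentioned in this post; the real data would come from the compiled root database.

```python
from collections import defaultdict

# (mamma, root) pairs. Only the examples from this post are listed here;
# the layout is an assumption, not the actual database schema.
ENTRIES = [
    ("ħareġ", ("ħ", "r", "ġ")),
    ("dar", ("d", "w", "r")),
    ("dar", ("d", "j", "r")),
]

def build_index(entries):
    """Group the known roots by mamma form, keeping the order of the entries."""
    index = defaultdict(list)
    for mamma, root in entries:
        index[mamma].append(root)
    return dict(index)

ROOTS_BY_MAMMA = build_index(ENTRIES)

def roots_for(mamma):
    """Return every known root for a mamma; an empty list means 'not in the closed set'."""
    return ROOTS_BY_MAMMA.get(mamma, [])

print(roots_for("dar"))    # two distinct roots, no guessing involved
print(roots_for("ħareġ"))  # exactly one root
```

Because the root is part of each entry, the two verbs sharing the mamma dar stay distinct, and nothing has to be inferred from the surface form.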