“the j stands for Jack”

20 January, 2013

Changing the verb implementation

Perhaps I saw the signs earlier than I would like to admit, but it has become clear now that my current implementation of Maltese verb morphology in GF has taken the wrong direction and needs to be significantly re-written. Having an inflection table with close to 1000 forms is not just an implementation headache; it is arguably not linguistically accurate either.

So the new plan, which follows what is done in the implementations for Italian and Finnish, is to remove pronominal suffixes from the verb’s inflection table, and instead use binding on the syntax level to produce these forms. Reducing the inflection table is the easy part, but getting the rest to produce correct results might be tricky since the stem sometimes changes depending on the pronoun being suffixed.

So anyway I have created a new branch to work on this, so that at any point I can switch back to the original implementation if I want to compare something or if I end up wanting to use that approach again.

23 October, 2012

Pronominal suffixes and transitivity

The verb morphology I am currently working on for Maltese definitely suffers from over-generation, in particular when it comes to derived verbs and pronominal suffixes. Derived verbs are often intransitive and interpreted as reflexive or passive, which makes the addition of direct object suffixes to them very awkward.

For example, take the root W-Ż-N in the first (underived) form: wiżen “he weighed”.
Adding some pronominal suffixes we get wiżnek “he weighed you”, wiżinlek … “he weighed … for you”, and wiżinhomlok “he weighed them for you”.

So far so good, but let’s now look at the seventh derived form of this root: ntiżen “he was weighed”.
Appending an indirect object pronoun is fine: ntiżinlek “he was weighed for you”. But when we try with a direct object it ceases to make sense, e.g. ntiżnek and ntiżinhomlok. The reflexive meaning taken on by this derived verb means direct object pronouns no longer make any sense when attached to it (even when in combination with an indirect object pronoun).

The problem is that I currently don’t know if these cases are detectable on a morphological level. In other words, if seventh form verbs never have any direct object pronouns attached then it is very simple to fix the over-generation, but it’s still a little early for me to tell whether such a general exclusion can be made.
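If such a general exclusion does turn out to hold, the fix could be as small as the following sketch. This is only an illustration in Python, not the GF implementation: the function name, the slot labels, and the idea that form VII is the only form to restrict are all hypothetical.

```python
# Hypothetical sketch: which pronominal-suffix slots to generate for a verb.
# Assumption (not yet verified): seventh-form verbs never take a direct
# object suffix, either alone or combined with an indirect object suffix.

ALL_SLOTS = {"none", "direct", "indirect", "direct+indirect"}

def allowed_suffix_slots(derived_form):
    """Return the suffix slots to generate for a given derived form number."""
    if derived_form == 7:
        # Drop every slot that involves a direct object.
        return {slot for slot in ALL_SLOTS if "direct" not in slot.split("+")}
    return set(ALL_SLOTS)

print(sorted(allowed_suffix_slots(7)))
```

With this rule, forms such as ntiżinlek would still be produced, while ntiżnek and ntiżinhomlok would not.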

24 August, 2012

Vowel length and negation

Continuing the previous post about vowel lengths, here are some remarks about the handling of the long vowel ie under negation (which is after all the suffixation of the letter x).

Consider the verbs waqaf, kiel, and ħa. Note that the latter two are irregular; however, I think they are still valid for the point I want to make. Their imperfect forms all consist of a stem which begins with the long vowel ie: nieqaf, tiekol, jieħu. Does this vowel get shortened under negation? Let’s see what the Maltese corpus has to say about this:


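| Prefix | -ieqaf | -ieqafx | -iqafx | -ieqfu | -ieqfux | -iqfux |
|---|---|---|---|---|---|---|
| n- | 1070 | 21 | 26 | 850 | 24 | 12 |
| t- | 2124 | 116 | 53 | 23 | 3 | 2 |
| j- | 2828 | 90 | 102 | 1390 | 51 | 58 |
| Totals | 6022 | 227 | 281 | 2263 | 78 | 72 |

| Prefix | -iekol | -iekolx | -ikolx | -ieklu | -ieklux | -iklux |
|---|---|---|---|---|---|---|
| n- | 292 | 3 | 8 | 752 | 7 | 8 |
| t- | 935 | 18 | 18 | 75 | 1 | 7 |
| j- | 1339 | 16 | 21 | 1747 | 30 | 18 |
| Totals | 2566 | 37 | 47 | 2574 | 38 | 33 |

| Prefix | -ieħu | -ieħux | -iħux | -ieħdu | -ieħdux | -iħdux |
|---|---|---|---|---|---|---|
| n- | 7191 | 23 | 58 | 11215 | 24 | 38 |
| t- | 17643 | 101 | 163 | 631 | 6 | 5 |
| j- | 33070 | 155 | 204 | 22682 | 113 | 103 |
| Totals | 57904 | 279 | 425 | 34528 | 143 | 146 |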

These are the totals of the negative forms, as percentages of the total occurrences of the corresponding positive form:

| Verb | Singular IE | Singular I | Plural IE | Plural I |
|---|---|---|---|---|
| waqaf | 3.76% | 4.66% | 3.45% | 3.18% |
| kiel | 1.44% | 1.83% | 1.48% | 1.28% |
| ħa | 0.48% | 0.73% | 0.41% | 0.42% |
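These percentages can be recomputed from the Totals rows of the three corpus tables. The sketch below assumes each figure is simply the negative total divided by the corresponding positive total; the variable and function names are my own. The results agree with the table to within 0.01 of a percentage point.

```python
# Totals copied from the three corpus tables.
# Each tuple: (positive, negative with ie, negative with i)
totals = {
    "waqaf": {"singular": (6022, 227, 281), "plural": (2263, 78, 72)},
    "kiel": {"singular": (2566, 37, 47), "plural": (2574, 38, 33)},
    "ħa": {"singular": (57904, 279, 425), "plural": (34528, 143, 146)},
}

def negative_share(positive, negative):
    """Negative tokens as a percentage of the positive tokens."""
    return 100 * negative / positive

for verb, numbers in totals.items():
    for number, (pos, neg_ie, neg_i) in numbers.items():
        print(verb, number,
              f"IE {negative_share(pos, neg_ie):.2f}%",
              f"I {negative_share(pos, neg_i):.2f}%")
```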

So what do all these numbers mean?
When considering the singular negative, the version without the long ie vowel is more common in all cases. As an example, ma nikolx is more common than ma niekolx, which would indicate that the former is really the correct form.

In the plural, though, it’s almost the complete opposite. To continue our example, the forms with ie (such as ma nieklux) are slightly more frequent overall than those without (such as ma niklux): 38 tokens against 33. However, the difference in frequency is less pronounced: 7% in the plural compared to 12% in the singular for the given example.
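How the 7% and 12% were worked out is not stated; one reading that reproduces both figures is the difference between the two negative totals for kiel, taken as a share of their sum. A small sketch under that assumption (the function name is hypothetical):

```python
def relative_gap(count_a, count_b):
    """Difference between two counts as a percentage of their sum."""
    return 100 * abs(count_a - count_b) / (count_a + count_b)

# Negative totals for kiel, from the corpus table
plural_gap = relative_gap(38, 33)    # -ieklux vs. -iklux
singular_gap = relative_gap(37, 47)  # -iekolx vs. -ikolx

print(f"plural: {plural_gap:.0f}%, singular: {singular_gap:.0f}%")
# prints "plural: 7%, singular: 12%"
```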

So here we have another indication of the correct spelling, but not exactly hard evidence. The more I try to rely on the corpus for these things, the more apparent it becomes that it is not really a reliable way of settling questions of minor orthographic differences.

21 August, 2012

Vowel length and pronominal suffixes in Maltese

Vowel length in Maltese seems to be one of those tricky things. The combination of pronominal suffixes with verbs ending in ‘a’ is a good example.

Direct Object suffixes

Think of the single verb form for “we saw you”: rajniek. Or should that be rajnik? Based on how it sounds as a native speaker, the latter shorter-vowel version seems more likely.

The Maltese corpus is not much help in deciding this. Just look at these frequency counts for tokens ending in jniek and jnik:

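| Rank | Token | Count |
|---|---|---|
| 1 | tajniek | 4 |
| 2 | rrispondejniek | 3 |
| 3 | smajniek | 3 |
| 4 | qtajniek | 2 |
| 5 | rajniek | 2 |
| 6 | drajniek | 1 |
| 7 | ħabbejniek | 1 |
| 8 | obdejniek | 1 |

| Rank | Token | Count |
|---|---|---|
| 1 | tajnik | 6 |
| 2 | għabbejnik | 2 |
| 3 | mejnik | 2 |
| 4 | rajnik | 2 |
| 5 | staqsejnik | 2 |
| 6 | avviċinajnik | 1 |
| 7 | għaddejnik | 1 |
| 8 | kkritikajnik | 1 |
| 9 | ħallejnik | 1 |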

In total, jniek occurs 17 times and jnik occurs 18 times. Note also the near-even split of the words which appear in both lists: tajniek (4) vs. tajnik (6), and rajniek (2) vs. rajnik (2).
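The two totals can be checked with a few lines of Python. This is just a sketch over the counts listed in the tables, not a query against the corpus itself, and the helper name is my own.

```python
from collections import Counter

# Token frequencies as listed in the two tables.
frequencies = Counter({
    "tajniek": 4, "rrispondejniek": 3, "smajniek": 3, "qtajniek": 2,
    "rajniek": 2, "drajniek": 1, "ħabbejniek": 1, "obdejniek": 1,
    "tajnik": 6, "għabbejnik": 2, "mejnik": 2, "rajnik": 2,
    "staqsejnik": 2, "avviċinajnik": 1, "għaddejnik": 1,
    "kkritikajnik": 1, "ħallejnik": 1,
})

def count_ending(freqs, ending):
    """Total occurrences of all tokens that end with `ending`."""
    return sum(n for token, n in freqs.items() if token.endswith(ending))

print(count_ending(frequencies, "jniek"), count_ending(frequencies, "jnik"))
# prints "17 18"
```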

But it turns out there is an explicit rule for this. According to “Grammatika Maltija” pg 166, whenever a verb ending in ‘a’ is going to have a pronominal suffix attached to it, the joining vowel becomes an ‘ie’. So tajna + ek = tajniek, even though when you say it it sounds a lot more like tajnik. The results from the corpus seem to confirm that I’m not the only one confused by this, although admittedly the numbers are probably too low to be statistically significant. While counter-intuitive, this rule seems pretty established, so we just accept it.

Indirect Object suffixes

What about indirect pronominal suffixes? Think of “we sang for you”, kantajnielek. Or is that kantajnilek? Again, the latter sounds like a more accurate transcription of the spoken form. The corpus reports 11 occurrences of tokens ending in jnielek, and 10 for jnilek: another even split that is too small to be statistically significant. “Maltese” by Borg and Azzopardi-Alexander claims the former is correct, with an ‘ie’.

Direct and Indirect Object suffixes

And what happens when you have both a direct and an indirect pronominal suffix? Here the evidence is much more polarised. Using the rule above, as in “Maltese”, the ‘ie’ remains. So you have the forms kantajniehulek and ftaħniehulek.

But the corpus contains exactly zero tokens which end with iehulek, and a whopping 92 which finish with ihulek. In this case the two sources directly contradict each other. Some personal communication on the Kelmet il-Malti Facebook group confirms that the above rule no longer applies, and the more natural principle of vowel length comes into play again. So kantajnihulek and ftaħnihulek are the correct forms, and the book is wrong.
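Putting the three cases together, the joining-vowel behaviour described in this post can be sketched as follows. The function name and the `both_objects` flag are hypothetical, and dropping the suffix’s own initial vowel (ek becoming k) is just my reading of “tajna + ek = tajniek”. It only covers verbs ending in ‘a’ and the examples given here.

```python
def attach_suffix(verb, suffix, both_objects=False):
    """Attach a pronominal suffix to a verb form ending in 'a'.

    A single suffix (direct or indirect) gets the joining vowel 'ie';
    a combined direct + indirect suffix gets the short 'i' instead.
    """
    if not verb.endswith("a"):
        raise ValueError("this sketch only handles verbs ending in 'a'")
    stem = verb[:-1]
    # A vowel-initial suffix such as 'ek' loses its own vowel.
    tail = suffix[1:] if suffix[0] in "aeiou" else suffix
    joining_vowel = "i" if both_objects else "ie"
    return stem + joining_vowel + tail

print(attach_suffix("tajna", "ek"))                           # tajniek
print(attach_suffix("kantajna", "lek"))                       # kantajnielek
print(attach_suffix("kantajna", "hulek", both_objects=True))  # kantajnihulek
```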

15 August, 2012

Liquid-medial strong verbs beginning with għ

Liquid-medial verbs are a subclass of the strong Maltese Semitic verbs, which have a liquid consonant (għ, l, m, n, r) as their second radical. Their paradigm is slightly different in that they sometimes require an extra vowel in conjugation. Whether this vowel is morphological or euphonic, I don’t know. Not all sources identify them as a subclass; some simply claim the vowel is inserted euphonically as needed. However, when the first radical is GĦ, this extra vowel is dropped again:

| Class | Root | Mamma (Perf P3 Sg Masc) | Imperfect P1 Sg | Imperfect P1 Pl | Template (prev column) |
|---|---|---|---|---|---|
| Strong Regular | K-T-B | kiteb | nikteb | niktbu | nvCCCv |
| Strong Liquid-Medial | S-R-Q | seraq | nisraq | nisirqu | nvCvCCv |
| Strong Liquid-Medial | GĦ-M-L | għamel | nagħmel | nagħmlu | nvCCCv |

This also crops up when adding some indirect object suffixes (P3 Sg Fem, and all Pl) in imperative/imperfect:

| Class | Root | Imperfect P2 Sg | Imperfect P2 Sg + I.O. P1 Sg | Imperfect P2 Sg + I.O. P1 Pl | Template (prev column) |
|---|---|---|---|---|---|
| Strong Regular | K-T-B | tikteb | tiktibli | tiktbilna | tvCCCilna |
| Strong Liquid-Medial | S-R-Q | tisraq | tisraqli | tisraqilna | tvCCvCilna |
| Strong Liquid-Medial | GĦ-M-L | tagħmel | tagħmilli | tagħmlilna | tvCCCilna |
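The templates in both tables can be read as fill-in patterns: each C takes the next radical, each v takes the next vowel, and everything else is literal. A minimal Python sketch of that reading (the function name is my own, and this is not how the GF code is written):

```python
def fill_template(template, radicals, vowels):
    """Fill a template such as 'nvCvCCv' with radicals and vowels in order."""
    radicals = iter(radicals)
    vowels = iter(vowels)
    out = []
    for symbol in template:
        if symbol == "C":
            out.append(next(radicals))   # next root consonant
        elif symbol == "v":
            out.append(next(vowels))     # next vowel
        else:
            out.append(symbol)           # literal prefix/suffix letter
    return "".join(out)

print(fill_template("nvCCCv", ["k", "t", "b"], ["i", "u"]))        # niktbu
print(fill_template("nvCvCCv", ["s", "r", "q"], ["i", "i", "u"]))  # nisirqu
print(fill_template("nvCCCv", ["għ", "m", "l"], ["a", "u"]))       # nagħmlu
```

Passing the radicals as a list keeps the digraph għ together as a single radical.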

8 August, 2012

Hundreds of forms, but nowhere to check them

A previous post showed just how many inflectional forms there are for a single verb in Maltese. But while writing the algorithms for producing such tables, I repeatedly find that for many of these forms, there is no real way of checking them for correctness, because no such other resource exists.

There is the Korpus Malti, but despite containing nearly 100 million tokens, there are numerous grammatically-correct verb forms which do not occur anywhere in the corpus. No traditional dictionary would contain every possible inflected form for each verb, for reasons of size, so in many cases I must simply resort to “best guesses” and intuition. There are so-called verb models which are used in Maltese verb conjugations, e.g. the verb lagħab (he played) should be conjugated as seraq (he stole), but that only covers radical-placement, not vowel changes. For example, which is correct: naqtgħak or naqtgħek? The former does not appear at all in the corpus, and the latter appears just once, from a public blog entry. Not exactly hard evidence, is it?

26 July, 2012

Full inflection table of a Maltese verb

The inflection table of the Maltese verb is formidable. Apart from tense/aspect and person, number & gender, a Maltese verb can also take suffixes for a direct object, for an indirect object, or for both a direct and indirect object. Add to this the “-x” suffix when the verb is negated, and you end up with no fewer than 952 unique forms for a single verb (the total number of combinations is 1152, but some combinations are non-existent).
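The breakdown behind the 1152 figure is not spelled out here. One decomposition that happens to reproduce it is sketched below; every factor in it is an assumption on my part (in particular that only third-person direct objects combine with an indirect object), so treat it as a plausibility check rather than the actual table layout.

```python
# Assumed breakdown, chosen because it reproduces the quoted total.
verb_forms = 7 + 7 + 2             # perfect + imperfect + imperative
object_slots = 1 + 7 + 7 + 3 * 7   # none, D.O., I.O., 3rd-person D.O. + I.O.
polarities = 2                     # positive and negated (-x)

total = verb_forms * object_slots * polarities
print(total)  # 1152
```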

Here’s the full table for the verb fetaħ (he opened).


19 July, 2012

Strongly-integrated loan verbs and weak-final quadriconsonantal roots

Splitting quadriliteral verbs into strong and weak is not universal in the literature. Borg and Azzopardi-Alexander, at least, make no mention of this, although their treatment of quad verbs feels a little lacking to me. But they do make the following distinctions:

  1. Repeated bi-radical base, e.g. GEMGEM (G-M-G-M)
  2. Repeated third radical (C3), e.g. GERBEB (G-R-B-B)
  3. Repeated first radical (C1) after the second (C2), e.g. ŻERŻAQ (Ż-R-Ż-Q)
  4. Addition of a fourth radical to a triradical base, e.g. ĦARBAT (Ħ-R-B-T)
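These four distinctions can be detected mechanically from the root alone. The sketch below is only an illustration: the function name is hypothetical, and treating every root that matches none of the first three patterns as type 4 is my own simplifying assumption.

```python
def classify_quad_root(root):
    """Classify a quadriliteral root given as 'C1-C2-C3-C4' (types 1 to 4)."""
    c1, c2, c3, c4 = root.split("-")
    if c1 == c3 and c2 == c4:
        return 1  # repeated bi-radical base, e.g. G-M-G-M
    if c3 == c4:
        return 2  # repeated third radical, e.g. G-R-B-B
    if c1 == c3:
        return 3  # first radical repeated after the second, e.g. Ż-R-Ż-Q
    return 4      # assumed: fourth radical added to a triradical base

for root in ["G-M-G-M", "G-R-B-B", "Ż-R-Ż-Q", "Ħ-R-B-T"]:
    print(root, classify_quad_root(root))
```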

They make no reference to weak radicals in quad verbs. They then go on to discuss “strongly-integrated loan verbs”, i.e. verbs of Romance or even possibly English origin which have taken on completely regular Semitic-style morphology. The examples given are KANTA, VINĊA, and SERVA, which correspond to the 3 different verb endings in Italian (cantare, vincere, and servire respectively).

Spagnol agrees with this, but goes further and actually classifies these verbs as quadriliteral verbs with the weak consonant J as the fourth radical. Here’s a table of some of the most common ones, including ones for which I could find no Romance-origin word.

| English | Romance origin | Għerq (Root) | Mamma (Perf P3 Sg Masc) | Imperative P2 Sg | Perfect P1 Sg | Perfect P3 Sg Fem |
|---|---|---|---|---|---|---|
| to sing | cantare | K-N-T-J | kanta | kanta | kantajt | kantat |
| to serve | servire | S-R-V-J | serva | servi | servejt | serviet |
| to win | vincere | V-N-Ċ-J | vinċa | vinċi | vinċejt | vinċiet |
| to ask | - | S-Q-S-J | saqsa | saqsi | saqsejt | saqsiet |
| to draw | - | P-N-Ġ-J | pinġa | pinġi | pinġejt | pinġiet |
| to enjoy | godere | G-W-D-J | gawda | gawdi | gawdejt | gawdiet |
| to talk | parlare | P-R-L-J | parla | parla | parlajt | parlat |
| to complete | - | L-S-T-J | lesta | lesti | lestejt | lestiet |
| to vary | variare | V-R-J-J | varja | varja | varjajt | varjat |

Looking at the vowel patterns, we end up with a very neat division:

| Romance ending | Mamma (Perf P3 Sg Masc) | Imperative P2 Sg | Perfect P1 Sg | Perfect P3 Sg Fem |
|---|---|---|---|---|
| -are | a | a | a | a |
| -ire/-ere/- | a | i | e | ie |

In other words, the vowel patterns are always the same, except for when the verb derives from a Romance -are verb.
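For the verbs in the table above, this division is regular enough to generate all four forms from a stem plus one of two classes. The following is a sketch only: the names are hypothetical, and the full endings (-ajt/-ejt, -at/-iet) are read off the listed forms rather than taken from a grammar.

```python
# Endings per class, read off the verb table above.
ENDINGS = {
    "are": {"mamma": "a", "imperative": "a",
            "perfect_p1_sg": "ajt", "perfect_p3_sg_fem": "at"},
    "other": {"mamma": "a", "imperative": "i",
              "perfect_p1_sg": "ejt", "perfect_p3_sg_fem": "iet"},
}

def loan_verb_forms(stem, verb_class):
    """Build the four forms for a stem; verb_class is 'are' or 'other'."""
    return {name: stem + ending for name, ending in ENDINGS[verb_class].items()}

print(loan_verb_forms("kant", "are"))
print(loan_verb_forms("serv", "other"))
```

Here "other" covers the -ire and -ere verbs as well as those with no Romance origin listed.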

16 July, 2012

Vowel-change patterns in the Maltese “hollow” verb (moħfi)

The behaviour of consonant radicals in Maltese morphology is always predictable, but vowel changes are a lot less so. Consider this list of Maltese “hollow verbs”, that is, verbs where the middle radical is the weak consonant w or j (of course there are many more hollow verbs than the ones listed here, but I chose the ones which to me are most “common”).

| English | Mamma (Perf P3 Sg Masc) | Għerq (Root) | Perfect P1 Sg | Imperative P2 Sg |
|---|---|---|---|---|
| to urinate | biel | B-W-L | bilt | bul |
| to kiss | bies | B-W-S | bist | bus |
| to take long | dam | D-W-M | domt | dum |
| to turn | dar | D-W-R | dort | dur |
| to taste | daq | D-W-Q | doqt | duq |
| to melt | dab | D-W-B | dobt | dub |
| to heal | fieq | F-J-Q | fiqt | fiq |
| to overflow | far | F-W-R | fort | fur |
| to bring | ġab | Ġ-J-B | ġibt | ġib |
| to sew | ħiet | Ħ-J-T | ħitt | ħit |
| to die | miet | M-W-T | mitt | mut |
| to wake up | qam | Q-W-M | qomt | qum |
| to want | ried | R-J-D | ridt | rid |
| to find | sab | S-J-B | sibt | sib |
| to become ready | sar | S-J-R | sirt | sir |
| to fast | sam | S-W-M | somt | sum |
| to drive | saq | S-W-Q | soqt | suq |
| to fly | tar | T-J-R | tirt | tir |
| to increase | żied | Ż-J-D | żidt | żid |

The following vowel-change patterns emerge:

| Long vowel in base form | Middle radical | Vowel in Perfect | Vowel in Imperative/Imperfect | Applicable verbs |
|---|---|---|---|---|
| a | W | o | u | dab, dam, dar, daq, far, qam, sam, saq |
| a | J | i | i | ġab, sab, sar, tar |
| ie | J | i | i | fieq, ħiet, ried, żied |
| ie | J | i | u | biel, bies |
| ie | W | i | u | miet |

Conclusions from this minor study:

  1. The long vowel in the base form (mamma) does not necessarily determine the middle radical.
  2. Even the base form combined with the root is not enough to determine the vowel changes in the perfect and imperative forms, as in the cases of fieq and biel. In such cases the imperative must be specified explicitly.
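Once the imperative is supplied explicitly, the perfect can be derived for every verb in the list above: ie always shortens to i, and a becomes o when the imperative has u and i otherwise. The sketch below encodes just that observation. The function name is hypothetical, it assumes single-letter first and last radicals, and it makes no claim beyond the 19 verbs listed.

```python
import unicodedata

def hollow_perfect_p1_sg(mamma, imperative):
    """Derive the Perfect P1 Sg of a hollow verb from mamma + imperative."""
    # Normalise so that letters like ġ and ż are single code points.
    mamma = unicodedata.normalize("NFC", mamma)
    imperative = unicodedata.normalize("NFC", imperative)
    c1, base_vowel, c3 = mamma[0], mamma[1:-1], mamma[-1]
    imperative_vowel = imperative[1:-1]
    if base_vowel == "ie":
        perfect_vowel = "i"
    elif base_vowel == "a":
        perfect_vowel = "o" if imperative_vowel == "u" else "i"
    else:
        raise ValueError("unexpected base vowel: " + base_vowel)
    return c1 + perfect_vowel + c3 + "t"

print(hollow_perfect_p1_sg("dam", "dum"))   # domt
print(hollow_perfect_p1_sg("fieq", "fiq"))  # fiqt
print(hollow_perfect_p1_sg("biel", "bul"))  # bilt
```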