After discussing Tibeto-Burman prefixes, and touching upon the *s- prefix in Tibeto-Burman, I sort of ran into the Indo-European s-mobile again.
For those unfamiliar with the s-mobile: it is an unexplained element *s that seems to optionally appear in front of words. There are several words even in English today that form an s-mobile pair, for example melt beside smelt, with some differentiation in meaning but not much.
I have been pondering the origin of this s-mobile for some time now. It is something that will occasionally bug the mind of any Indo-Europeanist, only to leave them confused and dissatisfied without a proper answer. I, too, do not have a proper answer, but I have a little theory that I'd like to explore.
Several ideas have been explored in the past. Some say it is simply an irregular shift *s > ø / _C. I don't like irregular shifts that are as widespread as this one is, since it is found in every single branch of Indo-European. If this was a shift that never quite caught on, I'd at least like to see one branch that got rid of the *s completely.
Another explanation that has been explored by some is that it might be the s-prefix as found in Semitic languages, which often has a causative meaning. This explanation would be very nice if all s-prefixed verbs could be explained as Semitic loanwords. The problem with this theory, though, is that the s-mobile seems to appear in front of nouns as well.
As an example we find Dutch stier 'bull', Old English stêor and OHG stior. In Old Norse we find the s-less form þjôrra. It won't come as a surprise to anyone that this word is related to Lat. taurus and Gr. ταῦρος. This seems to be a very early loan from Semitic; Arabic has ṯawr. It can't have gone in the other direction, because Semitic had access to a t, so there would be no reason to replace Indo-European *t with *ṯ. It should be noted that this s-mobile appears in Indo-European, but is absolutely impossible to find in Semitic. This fact led me to think that it must have been some sort of productive prefix in Indo-European.
If I were trying to connect Indo-European with Sino-Tibetan I wouldn't have hesitated to say that the PIE *s is the animal prefix in PST *s. But since such a claim would make me look like a maniac, I will not even go into that, but the resemblance is just a funny coincidence which I wanted to mention.
Now then, we have the strange situation of an element *s that can appear before both verbs and nouns. I asked myself what kind of element can appear before verbs and nouns in Indo-European. And then I realised that Indo-European has a lot of prefixes that can be placed before verbs and nouns, although they are only very productive in Graeco-Aryan: elements such as Skt. pra-, su-, a-, apa- etc., which all have direct reflexes in Greek as well.
So what if *s is a prefix like this as well? The s-mobile sometimes seems to give a somewhat intensifying meaning to verbs (this is what is often claimed, although I should really look into the semantics of that some time).
I was wondering if maybe the *s is a strongly reduced form of the prefix *h₁su-, which yields su- in Sanskrit and εὐ- in Greek. Semantically it would make sense: smelting is then 'well-melting', and a stier would be a 'good bull'. If the s-mobile really does come from a reduced form of *h₁su-, it would explain why this prefix is not found in any languages other than the Graeco-Aryan ones, since elsewhere the *s would be its reflex. Why both forms occur in Graeco-Aryan, though, remains unexplained.
Although the idea is pretty nifty, I'm still very hesitant about this hypothesis. Can pretonic *u really reduce to schwa and then disappear completely? Do we have a precedent for this? And then there's the laryngeal. Although its reflex won't commonly be found in the languages, you would expect indirect evidence in the form of lengthened vowels before an s-mobile in Vedic Sanskrit. As far as I am aware, this does not exist, but it is definitely worth looking into. If I can indeed find indirect evidence in Vedic, I'll be a lot more confident about this hypothesis.
I've been reading up on Proto-Tibeto-Burman/Sino-Tibetan lately. I've run into a problem with the aspirated consonants which I have not yet been able to solve.
In Proto-Tibeto-Burman we reconstruct two series of stops: voiceless and voiced. Tibetan, though, has three series of stops: voiceless, voiceless aspirated and voiced.
Voiceless and voiceless aspirated consonants can be accounted for as being in complementary distribution in all cases where these consonants are non-word-initial. In many other cases the voiceless aspirated consonants can also simply be explained as allophones, as is accurately done in Nathan Hill's article Aspirated and Unaspirated Voiceless Consonants in Old Tibetan.
One, to me, blatant omission in his article, though, is that he only looks at the phonetic distribution of these consonants. One could argue that there is a phonemic contrast that isn't immediately obvious from the surface form. Stephan Beyer addresses this in his book 'The Classical Tibetan Language' when discussing the results of the combination of stops with the verbal prefixes on page 174. I have taken the liberty to copy this table on this blog.
| ROOT INITIAL | N- | B- | G- | Ø- |
|---|---|---|---|---|
| K | Nkh | bk | dk | kh |
| KH | Nkh | kh | kh | kh |
| G | Ng | bk | dg | kh |
| T | Nth | bt | gt | th |
| TH | Nth | th | th | th |
| D | Nd | bt | gd | th |
| P | Nb | ph | db | ph |
| PH | Nph | ph | db | ph |
| B | Nb | b | db | b |
| C | Nch | bś | gś | ś |
| CH | Nch | bc | gc | ch |
| J | Nj | bź | gź | ź |
|  | Nj | bc | gź | ch |
| TS | Ntsh | bs | gs | s |
| TSH | Ntsh | bts | gts | tsh |
| DZ | Ndz | bz | gz | z |
|  | Ndz | bts | gz | tsh |
As you can see, Beyer needs two types of voiceless stops to account for the distribution of consonants. Beyer's aim is to accurately describe Tibetan rather than give proto-Tibetan forms so the voiceless aspirates might not be etymological voiceless aspirated stops, but then I don't get the distribution.
For example, we have the verb 'to wash'. Root: KRU-D, Conjugation: Class I
Present Nkhrud-pa, Perfect bkrus, Future bkru, Imperative khrus
For this root I could find an etymology, PTB *krəw 'to wash':
Written Burmese khyûi, Dimasa gru < *krəw
Jingpho krùt < *krəw-t
(Examples cited from Matisoff's Handbook of Proto-Tibeto-Burman)
A contrastive example would be the verb 'to carry, bring'. Root: KHYER, Conjugation: Class I
Present Nkhyer-pa, Perfect khyer, Future khyer, Imperative khyer
If k and kh are truly from one phoneme *k why is there this asymmetry? I fail to understand this, and Hill fails to give an answer to this.
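To make the asymmetry concrete, here is a minimal Python sketch that encodes only the velar rows of Beyer's table as quoted above and looks up the onset for each prefix. The names BEYER_VELARS, surface and contrasting_prefixes are my own, and I am assuming that the future of these two verbs follows the B- column, which is what the cited form bkru suggests.

```python
# Velar rows of Beyer's table as quoted in this post: prefix -> surface onset.
BEYER_VELARS = {
    "K":  {"N": "Nkh", "B": "bk", "G": "dk", "Ø": "kh"},
    "KH": {"N": "Nkh", "B": "kh", "G": "kh", "Ø": "kh"},
    "G":  {"N": "Ng",  "B": "bk", "G": "dg", "Ø": "kh"},
}

def surface(root_initial, prefix, rest):
    """Combine the tabulated onset with the remainder of the stem."""
    return BEYER_VELARS[root_initial][prefix] + rest

def contrasting_prefixes(a, b):
    """Prefixes under which two root initials surface differently."""
    return [p for p in ("N", "B", "G", "Ø")
            if BEYER_VELARS[a][p] != BEYER_VELARS[b][p]]

# 'to wash' (root initial K) versus 'to carry' (root initial KH)
print(surface("K", "N", "rud"), surface("K", "B", "rus"), surface("K", "Ø", "rus"))
print(surface("KH", "N", "yer"), surface("KH", "B", "yer"), surface("KH", "Ø", "yer"))
print(contrasting_prefixes("K", "KH"))
```

In this encoding K and KH only come apart after B- and G-, so the contrast is invisible in the present and the imperative. That is exactly the part a purely phonetic account leaves unexplained.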
Other, more obvious examples of Class I verbs are the following:
Nchiṅ-pa P bciṅs F bciṅ I chiṅs ‘to bind’
Nchad-pa P bśad F bśad I śod ‘to say’
These two are readily understood. In the second verb, ś (< *sy-) is the original form, which is turned into an affricate by the stop feature of the prefix *N-, while the first verb has a true consonant c (< *ty-).
Another word is the following:
Nbyin-pa P byiṅ F dbyuṅ I phyuṅs 'to send forth'
Matisoff does not mention this Tibetan form, but it seems to me that it is related to *pyiŋ- 'release, send forth', Written Burmese phyâñ, though I think the form implies a variant of this root with *u, i.e. *pyuŋ-.
And then there's:
Nphral-ba P phral F dbral I phrol 'to separate, to part' which is the active counterpart to Nbral-ba 'to be separated'.
Usually an active counterpart to an intransitive verb is made with the s- prefix, but *Nsbr- would not yield *Nphr-s but rather *sbr- as can be seen from
sbrud-pa P sbrus F sbru I sbrus 'to stir' from an earlier paradigm
*N/g-sbru-d, *b-sbru-s F *b/g-sbru I *sbru-s. It is not 100% certain that the cluster *sbr- is not a result of *g-sbr.
The etymology for Nphral-ba and Nbral-ba seems to be the reconstructed TB form *p/bral 'leave/depart, separate', which is sadly only reconstructed for TB because it seems to have an Old Chinese cognate.
Another, and probably the most solid etymology I could find for *Np reflected as *Nph is the following:
Nphur P phur 'to fly' from TB *pur Tankhul Naga puy, Magar bhur-ke, Thakali pyuhr-wa among others.
As you can see, these different inflections prove to be problematic. It is a shame that I couldn't find any of these "voiceless" and "voiceless aspirate" pairs in which both members had an unambiguous etymology. Nevertheless, I can see no conditioning based on the examples I've cited. I, too, would very much like to reconstruct only an opposition between voiceless and voiced stops, but I am currently not sure how to account for the reflexes that Beyer called the voiceless aspirates.
For kh and th one could imagine that it was simply a loss of prefix one way or the other, but for p/ph this does not seem to be an option, since you find the curious reflex Nph beside Nb.
I might be missing something, Tibeto-Burman is pretty new to me, but I can't find any conditioning. Any suggestions? Or a nudge towards someone who wrote an article figuring it all out?
One of the typologically puzzling things about Arabic, and Semitic languages in general, is that /i/ and /u/ very often contrast with /a/, but hardly ever with each other. This is usually an indication that two sounds are allophones, but that explanation cannot hold if these vowels can't freely interchange and are perceived as separate vowels.
Although this issue is an issue in the whole of Semitic, as far as I am aware, I am most familiar with Arabic, so I'll stick to using examples from this language.
Of course, there is one extremely productive pattern of 'minimal pairs' of vowels in the form of case endings.
Nom. rajul-un
Gen. rajul-in
Acc. rajul-an
So, sure, they seem quite phonemic in that context. But what I find puzzling is that in stem formations we normally can't find u and i contrasting.
To further research this I have made a table of the distribution of Arabic vowels in CVCVC roots. The table looks as follows:
| V1 \ V2 | a | i | u | ā | ī | ū |
|---|---|---|---|---|---|---|
| a | + | + | + | + | + | + |
| i | + | - | - | + | - | - |
| u | + | - | + | + | - | + |
| ā | - | + | - | - | - | - |
| ī | - | - | - | - | - | - |
| ū | - | - | - | - | - | - |
Several notes can be made about this table. I shaded the entry CaCiC, since it is difficult. The only word I can think of is malik 'king' (although doubtlessly there are more). Some people will probably know that this word is related to Hebrew mĕlĕḵ, which paradoxically points to a CVCC root. Is malik perhaps from *malk with an epenthetic vowel? It is very reminiscent of Dutch melk 'milk', which many people in fact pronounce [ˈmɛ.lǝk] rather than [ˈmɛlk].
Another thing that is strange is that, of the long vowels, only ā can occur in V1 position, and exclusively if it is followed by the vowel i. Could it perhaps be that CaCiC is indeed from *CaCC, and that CāCiC represents the original *CaCiC?
If this were true, the table of vowel distribution would look a lot more elegant.
| V1 \ V2 | a | i | u | ā | ī | ū |
|---|---|---|---|---|---|---|
| a | + | + | + | + | + | + |
| i | + | - | - | + | - | - |
| u | + | - | + | + | - | + |
There is an enormous problem with this reductionist approach, though. The vowel pattern CāCiC is associated with the meaning of a nomen agentis. It is quite productive: from the word kataba 'to write' we can form kātib 'writer'. That would be fine, if it weren't that Hebrew has this exact same pattern. Hebrew has the verb ṣāfăr 'to count' beside ṣôfēr 'scribe, writer' (lit. 'counter') (ô < *ā, ē < *i). If we assume that CāCiC is from *CaCiC, this must have been a common shift for Arabic, Hebrew and, I've been told, Aramaic as well. Could someone with knowledge of Akkadian or the Ethiopian languages let me know whether this pattern exists there and whether it has CāCiC or CaCiC?
So, after the discussion on CaCiC, let's continue regarding this vowel table. Maybe not completely surprising, but for allowed vowel distributions, Arabic disregards vowel length. CiCiC isn't allowed, whether the second i is long or not. Same goes for the other disallowed vowel combinations. I wonder what this implies. I have no experience with languages that have long vowels and limitations on their distribution, so I'm not sure what scenario is typologically plausible.
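The claim that length does not matter for V2 can be checked mechanically against the first table above. Below is a minimal Python sketch. It encodes only the rows with a short V1 (the long-V1 rows are almost entirely empty), and the names ATTESTED, is_attested and length_blind are my own, not established terminology.

```python
# Attested (+) second vowels after each short first vowel, as in the table above.
ATTESTED = {
    "a": {"a", "i", "u", "ā", "ī", "ū"},
    "i": {"a", "ā"},
    "u": {"a", "u", "ā", "ū"},
}
SHORT_OF = {"ā": "a", "ī": "i", "ū": "u"}

def is_attested(v1, v2):
    """True if the table marks the pattern C-v1-C-v2-C with a plus."""
    return v2 in ATTESTED.get(v1, set())

def length_blind(v1):
    """True if lengthening V2 never changes whether the pattern is attested."""
    return all(is_attested(v1, long_v) == is_attested(v1, short_v)
               for long_v, short_v in SHORT_OF.items())

for v1 in ATTESTED:
    print(v1, length_blind(v1))
```

For all three short first vowels the check comes out true, and no combination of i with u (in either order, long or short) is marked as attested.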
It is good that I made this table, for it has shown me some things that I was previously unaware of. I was under the impression that the distribution of u and i was identical, but I have found absolutely no examples of words with CiCiC, while CuCuC is in fact quite a common plural formation. As I knew before writing this, combinations of i and u in one root are impossible, which is mysterious. It almost looks like a sort of 'vowel disharmony', if I may coin that term.
I had written a large post proposing a fourth Proto-Semitic vowel *ǝ that would be affected by its surroundings, but would often simply surface as a or i. But once I put the distribution into a table, I became uncertain whether such a proposal would be feasible, and threw away most of that post.
It is true that i and also u sometimes have schwa-like properties. If malik indeed comes from *malk, that's obviously an example, but there are even more readily available examples in the form of the 'alif al-waṣl. When an Arabic word starts with a CC cluster, a vowel is placed in front of the first consonant to make the cluster pronounceable. For example *sm 'name' becomes (i)sm. When a vowel precedes it, this vowel is lost again; it is purely epenthetic. When the root contains no vowels, or an a or i, the value of the 'alif al-waṣl is i. But if the following vowel is a u, the 'alif al-waṣl is also u, as in *drus > (u)drus 'learn!'. This is in fact an example of vowel harmony. There are some nouns that violate this rule, though, like (i)mru' 'man'. Another strange thing is that the a in the definite article (a)l behaves just like the 'alif al-waṣl, except that it is always a in isolated pronunciation. Nevertheless it is quite obvious that this 'alif al-waṣl must have come from a subphonemic *ǝ.
Another example of a *ǝ is the i that is often used to break up clusters in a sentence. The apocopate verb especially often needs an extra i placed in between its final consonant and the following word.
If there was a *ǝ in the middle of words, would that help to explain the distribution of the vowels? It might. If we assume that all i were in fact *ǝ, we would understand why CiCuC and CuCiC do not occur, since the u would have affected the *ǝ to become a u. But it still does not explain why CiCiC and CiCīC do not occur, unless we assume that *ǝ and *ī turned a preceding *ǝ into a. Such an explanation is entirely ad hoc. Although it might be true, there is no indication that it was like that, and we would need comparative evidence to prove it.
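The harmony rule for the 'alif al-waṣl as stated above is simple enough to write down as a small Python sketch. This is a toy over romanised stems with short vowels only, and the function name add_wasl is my own.

```python
def add_wasl(stem):
    """Prefix the prosthetic vowel to a stem that begins with a consonant cluster.

    Rule as stated in the post: u if the first vowel of the stem is u,
    otherwise (no vowel, a or i) the prosthetic vowel is i.
    """
    first_vowel = next((ch for ch in stem if ch in "aiu"), None)
    prosthetic = "u" if first_vowel == "u" else "i"
    return prosthetic + stem

print(add_wasl("sm"))    # 'name'
print(add_wasl("drus"))  # 'learn!'
print(add_wasl("mru'"))  # 'man'
```

For mru' the rule predicts a u, while the actual form has i. That is the exception mentioned above, and the sketch makes no attempt to handle it.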
So to conclude, Arabic gives quite strong indications that i was in fact a *ǝ that was heavily affected by its surroundings, rather than an *i. This does not increase or decrease the number of phonemic vowels, but it may help in understanding the vocalic patterns of Arabic better.
There is no conclusive evidence, though, that i was *ǝ. One would have to look at deeper genetic relations (Afro-Asiatic? Maybe only Berbero-Semitic?). I do feel that one should probably place this *ǝ in Proto-Semitic times, if it exists. Hebrew vowel distribution is, as far as I can see, quite similar to that of Arabic.
I hope to soon dive into correspondences between Arabic and Berber verbal morphology with this hypothesis that i should be interpreted as a *ǝ. But before that I should probably consider the Arabic verbal morphology first, since I've only considered nouns of the type CVCVC so far. The vowel distribution in the verbal morphology becomes quite a bit more difficult, though.
That's right: after 3 years and a bit, I am now officially a Bachelor of Arts in Comparative Indo-European Linguistics. Yay me, and yay for shameless self-promotion!
So I finished my final Bachelor thesis with a score of 9/10, that is to say, pretty damn good. And therefore I shall treat you guys to this goodness: my thesis on the Consonant Gradation in the Indo-European Verb.
I am sure that it will lead to loads of discussion, because there is a lot to discuss, and even more is uncertain. But I am willing to discuss it all, it's an exciting subject. So enjoy!
[EDIT] Due to issues with rapidshare, I now uploaded my thesis to Mediafire (Thanks Tropylium!), please let me know if anyone runs into issues.
Recently I've been taking a class on fieldwork. In this class we have an informant who speaks Minangkabau, a Malay dialect spoken by about 5 million people around Padang (which was recently hit by quite a severe earthquake).
Last week it was my turn to elicit some words and sentences from the informant, and one of the things that was elicited was the word for 'leg', kaki. This word struck me as odd, but I had no idea why.
Just now my mom came in. It's a cold evening, and she had cold feet.
I tell her: wow, wat heb je kouwe kakkies 'wow, you have cold feet'. And then it struck me why the word had seemed so familiar: kakkies is a Bargoens word for 'feet'. And yes, this is indeed a loanword from Indonesian!
As the title says, I am often perplexed by Afro-Asiatic. I've learned some Arabic and Hebrew, followed a class on Comparative Semitic, I have a (hardly looked at) book on Egyptian, and I'm currently following a class on Riffian Berber and general Berber linguistics.
Studying these languages, it seems silly to deny that Proto-Afro-Asiatic must have existed. So I won't. But what always puzzles me is that the 'proof' for Afro-Asiatic is quite the opposite of the kind of proof we find for Indo-European.
Lexical items in Afro-Asiatic that are cognate are extremely hard to find. This is quite the opposite in Indo-European, where lexical items were the first things to draw attention to a relation between the languages.
But the morphology of Afro-Asiatic is disturbingly similar. Obvious are things like -t suffix for the feminine, but even personal endings of verbs are surprisingly similar in Afro-Asiatic.
This is completely unlike Indo-European. Sure, Sanskrit and Greek are grammatically almost clones of each other, but I make no secret of my belief that the relation between Sanskrit and Greek is a lot closer than some people claim. But when you try to reconstruct a uniform image of the verbal system, or even the morphology, by comparing Sanskrit to, say, Germanic, things get a lot more confusing.
And then we're talking about Germanic and Sanskrit. The time depth of Indo-European is a LOT less than that of Afro-Asiatic. Is there something inherent to the languages' structure that makes morphemes more resistant to change? That seems odd; structurally you could argue that Indo-European at an early stage (but post-syncope) was quite similar to the Afro-Asiatic languages.
Of course this 'morphological but not lexical' change resistance is more of a 'feeling' I get than anything I ever measured. So maybe I'm wrong about this. Maybe Afro-Asiatic is just as innovating in the morphological department as Indo-European, but just a whole lot more innovating in the lexical department.
This is me just rambling to a point that it's appallingly unscientific, but I guess it'll set some of your brains into motion, and that'll be enough. :-P
The 19th of October: that will be the date on which I will be defending my Bachelor thesis on Consonant Gradation in the Verbal System of Proto-Indo-European.
This defense is open to the public. If any reader happens to be around and wants to come, they are invited to leave a comment, and I'll provide more information.
After that I can officially call myself Bachelor of Arts in Comparative Indo-European Linguistics, which is kind of cool.
Hey guys! Long time no see. My Bachelor thesis was eating a lot of time, combined with work on the Greek Etymological Dictionary and me just simply enjoying my holiday. But I'm back, with this word that has been bothering me for some time now.
The word Skt. sthā- 'to stand' is, besides its double representation of the laryngeal, quite straightforward. If we look at its causative, though, something really funny happens. Usually a causative is formed by giving the root lengthened grade (from PIE *o in open syllables) and adding the suffix -aya-. Roots ending in vowels, though, would end up as **sthā-aya-, which is a rather unfortunate cluster of vowels. To remedy this, Sanskrit puts a -p- between the root and the suffix, resulting in sthāpaya- 'to cause to stand; to stop'.
Why a p? This is not at all a natural transitional consonant to put there. A y would be a lot more likely (and is quite common practice in Sanskrit). Since it cannot be readily understood on phonetic grounds, there are two options left: either the Vedic people were feeling funny and thought it'd be nice to come up with a completely nonsensical transition sound, or it is archaic.
As a historical linguist, I feel compelled to further research the archaic option. Indo-European has certain elements behind certain stems called 'stem extensions'. These are always simple consonants like *k, *p or *u. The function of these stem extensions has always been a bit mysterious. A nice example is the root *(s)ker- 'to cut' as found in Dutch scheren 'to shave' beside *(s)ker-p-, which we find in Old English sceorfan 'to bite'.
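The textbook rule just described can be put into a tiny Python sketch. It assumes the root is passed in already in its strengthened grade and in romanisation with precomposed letters. The function name causative_stem is my own, and the consonant-final example kār- (from kṛ- 'to make') is added purely for illustration.

```python
VOWELS = set("aāiīuūeo")

def causative_stem(root):
    """Add -aya- to a strengthened root, with -p- as a linker after a final vowel."""
    linker = "p" if root[-1] in VOWELS else ""
    return root + linker + "aya-"

print(causative_stem("sthā"))  # vowel-final root: the -p- is inserted
print(causative_stem("kār"))   # consonant-final root: no linker
```

The sketch only covers the phonetic motivation for the -p-. Any root that takes the -p- without ending in a vowel falls outside it.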
I believe that this p that shows up in Sanskrit might give us an indication of the original function of the *p-stem-extension. Maybe originally this was a way to form causatives from verbal stems, which was later replaced by the common textbook causative formation. A nice note to put with this is, that Anatolian indeed is unfamiliar with the textbook causative formation, so there's some indication that it's recent.
While most p-causatives in Vedic Sanskrit occur after Laryngeal final roots, there are a few verbs that show this p even without them ending in a vowel/laryngeal. These are r̥- 'to go'; ar-p-áya- 'cause to go' and kṣi- 'to dwell' kṣe-p-áya- 'cause to dwell'.
All in all, Sanskrit seems to give a strong indication that the *p stem extension is an old causative formation. Now we must look to see if there are any other words out there in other languages that seem to support this idea. Germanic *(s)ker- 'to shave/cut' ~ *(s)ker-p- 'to bite' might be seen as a reflex of this, though the difference there is rather more intensive than causative.
There is lots more to say about these stem extensions, and I'm nowhere near done figuring them out. There's some really odd stuff going on with the voice of these extensions for example. They seem to become pre-glottalised sometimes for no apparent reason.
As a final little side-note, sthāpaya- looks surprisingly like the Dutch verb stoppen 'to stop'. I don't buy the commonly cited Latin etymology stupere (it wouldn't explain why Dutch and English both have the vowel o rather than u, or English with u and Dutch with o). It can hardly be cognate either, since the vowels would be wrong, and Dutch p points to PIE *b, which is very odd to have in the first place. So until I make any significant breakthrough on this bizarre word (which even if it is from Latin has a difficult reconstruction), I'll consider it completely unrelated.
The other day I had a discussion about the Dutch verb willen 'to want'. It is a funny verb, because it formally has two past tenses: wou and wilde.
I was watching a movie in which the form wou was used in the subtitles and the person who I was watching it with pointed out that it looked silly and was incorrect. She claimed that wilde was the correct formal form. Luckily our lovely language hasn't been prescriptivised to a level that a perfectly correct form like wou is deemed incorrect, but it does show how people feel about it. Even I tend to avoid wou when writing formal letters.
The funniest thing is, wou is the historically correct form. willen belongs to a small class of funny Germanic verbs that are ja-verbs in the present, but behave as normal verbs in the preterite. So, willen goes back to *wiljan while its past tense is a perfectly normal Germanic preterite *wal. In other words, it's a strong verb.
In general though ja-presents are weak verbs, while those without a suffix are strong, and this is the reason why it was changed to wilde. For example rillen 'to shake' has a past tense rilde from *riljan and *riliða where in the preterite the *j-suffix shows up in its vocalised form *i. By analogy of this class of verbs, a secondary preterite of willen was easily made, making the verb regular rather than irregular.
What I find remarkable is that more 'formal' language generally tends to be a bit more archaic, but in this case people seem to prefer an analogically levelled form over a form that precedes it by well over 1000 years.
Not sure if any of you ever saw this, but there's a band called The Magnetic Fields who did a song on Ferdinand de Saussure, which is cool enough for me to justify posting it here.