Wednesday, August 24, 2016

Taking stock

Progress on the new book has reached a point that reminds me of a time when I was working for DEC (in the days when it was OK to call it "DEC" – rather than follow HR's instructions and give it the whole nine yards [well ten actually... syllables])
<autobiographical_note>
The biggest thing that happened in the 1990's (in an engineering sense) was a networking system called DECnet/OSI. As an article in the HP Journal says (but don't be misled by the name – in those days HP was just another competitor): 
The DECnet Phase V networking software presented the DECnet-VAX development team with a major challenge. ...[T]he Phase V architecture has substantial differences from Phase IV in many layers. For example, the session control layer now contains a global name service..... In most cases, the existing Phase IV code could not be adapted to the new architecture; it had to be redesigned and rewritten.  
HP Journal PDF (and regular readers will know how I feel about them). 
My one tangible souvenir of "Phase V"
This was a huge enterprise, involving dozens of  engineers (and writers) working on both sides of the Atlantic. And there were many interlinking plans, schedules and wall-charts, involving dependencies, critical paths... all that scheduley stuff; and tele-conferences (this was the BS era – Before Skype; I did take part in one videoconference, using some hugely expensive proprietary system). Managers played games of Schedule Chicken (committing their teams to an impossible date, knowing that another team was going to slip first).  
There was a cartoon on a poster at the time. A character (called Antworth, and with a rather formic profile – which suggests that he might have been popular in a Dilbert-like way, although I've had no luck with Google) was pointing at ("talking to" was the MBA-ese) a complex flow-chart detailing many interlinked "deliverables" (more MBA-ese). There was an insignificant little box in one corner, with the words 
then a miracle happens
in very small writing. A manager was saying Nice work, Antworth, but I think it could do with a bit more work just here [pointing at the miracle box]. Some wag in the Reading office had added the words 
Phase V xxx stable
I've cloaked the crucial module in mystery, because the Non-Disclosure Agreement I signed at the time makes it more than my pension's worth to reveal the unstable party.
</autobiographical_note>
So much for the preamble. My "then a miracle happens" moment is nigh. I've collected the data, made the tables (ready for conversion to HTML), written a lot of the text... I've "just" got to bring it all together in a Sigil file and add all the links, Table of Contents, Prelims, cover design... As wossname said:

"Any sufficiently advanced technology is indistinguishable from magic"

For magic read miracle. I just have to find the technology.

Meanwhile, here are the last few bits of text.

First draft of *AL* linking text

The letters "al" are divided here into seven sections. But counting exceptions, which have pronunciations both with and without the sound /l/, there are ten. These exceptions are in the sections for /æ/ (shallow and salmon), /ɔ:/ (small and walk), and /ɑ:/ (impala and calm). (In all cases, the variant without an /l/ sound is much the less common.)

The sound /ə/ – 79%
This sound is by far the most common. Even after systematic omissions (as described in the Introduction) there are still well over 300 listed here.

The sound /æ/ – 16%
The sound /eɪ/ – 3%
The sound /ɑ:/ – 0.1%
The sound /ɒ/ – percentage negligible
This sound, when represented by the spelling "al", always follows a /w/ phoneme (as in swallow or qualify, for example).
The sound /ʌ/ – percentage negligible

First draft of *EL* linking text

The spelling "el" can represent any one of six sounds. The sounds /e/, /ə/, and words with a Magic E (which makes no sound itself but changes the sound that the preceding vowel represents), account between them for 85% of words with the spelling "el".

The sound /e/ – 33%
Magic E – 31%
This (non-)sound's significance is misrepresented by the size of the following table because of systematic omissions as explained in the Introduction.
The sound /ə/ – 21%
The sound /ɪ/ – 12%
The sound /i:/ – 2%
The sound /eɪ/ – ½%

First draft of *IL* linking text

This spelling is used to represent only four sounds. One of these predominates.
The sound /ɪ/ – 83%
This is by far the most common of the four sounds represented by the spelling "il". The words listed here outnumber all other *IL* words by about 3:1, even without taking into account the more than 300 words excluded for reasons given in the Introduction.
The sound /aɪ/ – 11%

The sound /ə/ – 5%
The sound /i:/ – 1%
These are predominantly borrowings from languages derived from Latin, particularly French.

Update: 2016.08.24.17:15 – Added picture


Thursday, August 18, 2016

Joining up

First draft of *OL* linking text


The sound /ɒ/ – 37%

This (despite the relative sparseness of the words in this table – which is due chiefly to the absence of -ology words) is the most common of sounds represented by the letters *OL*. These omissions are explained in the Introduction.

The sound /ə/ – 30¼%
This is the second most common sound represented by the letters *OL*, but again the relative sizes of the tables might seem to belie this. As in the first case, omissions explained in the Introduction are the reason. In this case the omitted words have the spellings -ological and -ologically.

The sound /əʊ/ – 27½%

The sound Magic E – 6%

The sound /ʌ/ – percentage negligible

This sound occurs only in colo[u]r and its many derivatives. Some speakers use the sound /ɒ/, particularly in those derivatives.

The sound /ʊ/ – percentage negligible
This sound occurs only in the word wolf and its derivatives.

No sound – percentage negligible
This (lack of) sound occurs only in the word chocolate, and not always; younger speakers tend to enunciate the -ol- as /ə/. In chocolate's derivatives – in adult speech – the -ol- is almost always fully agglutinated (so that a child will give chocolate three syllables, while an adult will say chocolatey with the same syllable count).

The sound /ɜ:/ – percentage negligible
This sound occurs only in the word colonel and its derivatives.

The sound /ɔ:/ – percentage negligible
This sound occurs only in South African English, presumably reflecting its Afrikaans origins.


First draft of *UL* linking text

The sounds /ʊ/ and /jʊ/ – 80%
This vowel, either preceded or not by a /j/ glide (in largely predictable contexts), is present in a majority of *UL* words (although, because of the exclusions outlined in the Introduction, it is outnumbered in this collection by words listed in the next section).
The sound /ʌ/ – 16%

This represents an unusually low proportion for a 2nd-ranked phoneme. The preponderance of /ʊ/ and /jʊ/ sounds means that if a student meets a previously unknown *UL* word there are 4 chances in 5 that the letters will represent this phoneme.

The sounds /u:/ and /ju:/ – 3%
This vowel, either preceded or not by a /j/ glide (in largely predictable contexts), accounts for very few words. The /l/ is almost always sounded, except in some names (such as Leverhulme – /li:vəhju:m/).
The sounds /ə/ and /jə/ – percentage negligible
In the Macmillan English Dictionary only one *UL* word is transcribed with the sound /jə/ (formula). But as the note to that word says, many words with the sound /jʊ/ can often be heard with the /jə/ sound.

b

Monday, August 8, 2016

The last of the L notes

I'm coming to the end of my first pass over the words in the <vowel>l section. It has taken a long time, and is nowhere near finished. As I wrote some time ago, I am the victim of an obsession with order – in this case, the alphabetical sort; it would have been much more interesting to start with "<vowel>w". In fact, I still haven't decided how to deal with w, which – more than any other letter, I think (though I haven't yet done the research to confirm this) – exerts its influence both forwards and backwards (was and saw, for example).

Anyway, here goes:

First draft of "ul" >  /u:l/ notes

  1. mint julep
    The Macmillan English Dictionary transcription does not lengthen the /u/ in the British English case but does for American English. This is the reverse of its usual practice for this vowel: see for example (a random case) truly - /tru:li/ (Br) but /truli/ (Am). This seems to be a simple slip, but as the vowel sound in the audio sample is /ə/ it is impossible to tell whether the editors had a particular distinction in mind. Possibly, as this cocktail is associated with the Southern United States, there was an attempt to modify the vowel accordingly. But this does not typically affect the pronunciation of native speakers of British English.
  2. Uluru
    The transcription in the Macmillan English Dictionary has /ʊ/ in the first syllable, /u:/ in the second, and a different stress pattern (stressed on the /u:/, which in Macmillan English Dictionary is unstressed). As is often true with recent foreign borrowings, uncertainty and variation are common (especially when Political Correctness is added to the mutually-conflicting mixture of socio-linguistic pressures, with attitudes to the colonial response to aboriginal cultures).

First draft of "ul" >  /ju:l/ notes

  1. arugula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /ə/.
  2. formulaic
    Although the Macmillan English Dictionary gives the noun formula with /jə/, the same dictionary has /jʊ/ in the case of this adjective. Either unstressed vowel is acceptable, and there is no risk of misunderstanding.
  3. primula and spatula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /jə/, and the reverse is true of formula (Macmillan English Dictionary transcription /jə/ but audio /jʊ/). Generally, either vowel is acceptable, though /jʊ/ is more common in learned words such as copula, fibula, nebula, or uvula.
  4. tabula rasa
    Also sometimes pronounced with a plain /ʊ/ and no preceding glide. Either is acceptable.

First draft of "ul" >  /əl/ and /jəl/ notes

  1. formula
    The Macmillan English Dictionary prepends a /j/ to this transcription, but no other /ə/ is given this (in the context /jəl/). See also the note for formulaic. The ESOL student need not expend the energy to simulate these distinctions, which are surely coincidental.

Now for the text.


b

Friday, August 5, 2016

More notes

Things have been happening this Summer (documented from time to time in the original blog), and it is an unconscionable time since my last post here.

First draft of "ul" >  /ʊl/ notes

  1. bulldog
    This escapes the general exclusion of compound words that start bull-, because the relevance of the dog to bulls is shrouded in the history of cruel sports.
  2. bulletproof
    This is the sole representative of compound words ending -proof.
  3. bullfinch
    This escapes the general exclusion of compound words that start bull-, because there is no obvious connection between a bull and a bird.
  4. bullfrog
    This escapes the general exclusion of compound words that start bull-, because there is no obvious connection between a bull and a frog.
  5. bullwhip
    This escapes the general exclusion of compound words that start bull-, because there is no obvious connection between a bull and a whip.
  6. fitful
    This escapes the general exclusion of -ful words, because nothing is full. But it is the sole representative of words that use -ful to refer to an emotional state.
  7. mullah
    Some native speakers use the phoneme /ʌ/. Either is acceptable.
  8. needful
    This is unlike other -ful words, in that it does not refer to fullness in any sense. A person who is in need is not needful – they are needy; the needful is what needs to be done.

First draft of "ul" >  /jʊl/ notes

  1. arugula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /ə/.
  2. formulaic
    Although the Macmillan English Dictionary gives the noun formula with /jə/, the same dictionary has /jʊ/ in the case of this adjective. Either unstressed vowel is acceptable, and there is no risk of misunderstanding.
  3. primula and spatula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /jə/, and the reverse is true of formula (Macmillan English Dictionary transcription /jə/ but audio /jʊ/). Generally, either vowel is acceptable, though /jʊ/ is more common in learned words such as copula, fibula, nebula, or uvula.
  4. tabula rasa
    Also sometimes pronounced with a plain /ʊ/ and no preceding glide. Either is acceptable.

First draft of "ul" >  /ʌl/ notes

  1. adulthood
    This is the sole representative of compound words formed with the suffix -hood.
  2. agriculture
    This is the sole representative of compound words formed with the suffix -culture.
  3. bulkhead
    This escapes the general exclusion of compound words, as it does not involve a head (that is, a part of the body).
  4. catapult
    A variant with the /ʊ/ sound is both common and acceptable.
  5. culinary
    A variant with the /jʊ/ sound is both common and acceptable.
  6. culpability
    The Macmillan English Dictionary does not include this but many other dictionaries do. The link is to the Collins English Dictionary.
  7. ebullient
    A variant with the /ʊ/ sound is both common and acceptable. The Macmillan English Dictionary gives this as "American", but it is widely used in the UK.
  8. exculpate and pulmonary
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /ʊ/. Both are common and acceptable.
  9. mullah
    Some native speakers use the phoneme /ʊ/. Either is acceptable.
  10. multiaccess
    This is the sole representative of words formed with this prefix (followed by a free-standing word).
  11. multiply
    The verb has the last syllable /aɪ/. The much less common adverb (typically encountered in collocations such as "multiply-resistant") has the last syllable /i:/.
  12. stultifying
    The Macmillan English Dictionary, as published, does not include the bare infinitive (stultify), but the online Macmillan English Dictionary includes stultifying only as a headword, with no reference to stultify.
  13. sultan, sultana, and sultanate
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample approaches /ɒ/.
  14. terra nullius
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /ʊ/.
  15. ultra-
    This is a prefix, and can be attached to any adjective that refers to an expression of quantity.
b

Update: 2016.08.09:50 – Added revision of /jʊl/ notes.
  1. arugula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /ə/.
  2. formulaic
    Although the Macmillan English Dictionary gives the noun formula with /jə/, the same dictionary has /jʊ/ in the case of this adjective. Either unstressed vowel is acceptable, and there is no risk of misunderstanding.
  3. particular
    The Macmillan English Dictionary gives the spelling without a final s as the headword. But the noun is (much more often than not) plural. Click here to run a search at the British National Corpus that shows this: the corpus contains only two instances of particular as a noun; this is always used in expressions of quantity – a report that "omits no particular" is "correct in every particular". In contrast this search finds over 600 instances of particulars (which is always a noun – accounting for the uncluttered searchstring).
  4. primula and spatula
    The word has this transcription in the Macmillan English Dictionary, but the vowel sound in the audio sample is /jə/, and the reverse is true of formula (Macmillan English Dictionary transcription /jə/ but audio /jʊ/). Generally, either vowel is acceptable, though /jʊ/ is more common in learned words such as copula, fibula, nebula, or uvula.
  5. tabula rasa
    Also sometimes pronounced with a plain /ʊ/ and no preceding glide. Either is acceptable.

Friday, July 1, 2016

Introductory pieces for *UL*


Last time (here) I made a silly mistake, but it was a one-letter typo which people will either have missed or overlooked or not cared about: I announced a *UL* chart and wrote about another chart entirely.

Here now is the *UL* one – though as with the other one, it is not meant to be part of the book:


And here's a first draft of the resulting text:

The sounds /ʊ/ and /jʊ/ – 80%

This vowel, either preceded or not by a /j/ glide (in largely predictable contexts), is present in a majority of *UL* words (although, because of the exclusions outlined in the Introduction, it is outnumbered in this collection by words listed in the next section).

The sound /ʌ/ – 16%

This represents an unusually low proportion for a 2nd-ranked phoneme.

The sounds /u:/ and /ju:/ – 3%

This vowel, either preceded or not by a /j/ glide (in largely predictable contexts), accounts for very few words. The /l/ is almost always sounded, except in some names (such as Leverhulme – /li:vəhju:m/).


The sounds /ə/ and /jə/ – percentage negligible

In the Macmillan English Dictionary only one *UL* word is transcribed with the sound /jə/ (formula). But as the note to that word says, many words with the sound /jʊ/ can often be heard with the /jə/ sound.

Wednesday, June 15, 2016

Under the hood

It's too long since I last wrote here. But that's not for want of progress. It’s just that the swan’s legs are thrashing around frantically but the bird itself (in this cygnal metaphor, which represents the progress of the next #WVGTbk as a swan) seems to share the vitality of the proverbial Norwegian Blue (strange convention – it's nothing to do with a proverb, it's just a sketch).

But one review of the first book found the percentages (in the section heads – e.g. “Sounds that represent the sound /e/ – N%”) particularly interesting; maybe other reviews commented too, but not as clearly. And this made me reflect on the process of reaching those numbers. I wanted to make it more reliable and repeatable (a hangover from various brushes with ISO 9001 and the CMMI model during my last three to four years at Compaq/HP: a memorable catchphrase from that time was “If you can’t find the time to do it right, how’re you going to find the time to do it over?” – I say over rather than again because that’s how I first heard it [and its muscularity strikes me as worth preserving], although later speakers often felt it necessary to translate the American English).

So this time I'm showing something of what goes on under the hood as they say (without regard for the British English preference for the word 'bonnet' in that context (the naming of car parts) – under the bonnet might suggest Lizzie Bennet keeping a secret).

In this spreadsheet I've tried to quantify figures I've used in calculating which sound represents which proportion of *OL* words:



In the top three rows I calculate how many *OL* words there are in my chosen source: the third row carries the search string I use to calculate general exclusions. The phoneme-specific sections follow (either 4 or 2 rows each, depending on whether I could think of a phoneme-specific exclusion in my dictionary search – for example, the string &!*ology always excludes words with l preceded by the sound /ɒ/).
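
As a rough illustration of how an exclusion such as &!*ology narrows a search, here is a Python sketch. It assumes that the search string means "and not matching this wildcard pattern"; the dictionary's real query syntax is not described here. The function name search, the sample word list and the patterns are all invented for the example.

```python
from fnmatch import fnmatchcase


def search(words, include, exclude=()):
    """Return words matching `include` and none of the `exclude` wildcard patterns."""
    return [
        w for w in words
        if fnmatchcase(w, include)
        and not any(fnmatchcase(w, pattern) for pattern in exclude)
    ]


# Hypothetical sample, not taken from any dictionary search.
sample = ["biology", "cold", "colour", "geology", "policy", "wolf"]

all_ol = search(sample, "*ol*")                             # every word containing "ol"
without_ology = search(sample, "*ol*", exclude=["*ology"])  # general exclusion applied
```

The difference between the two result lists is the number of words the exclusion removes, which is the sort of figure the third row would carry.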

The last ten rows are all about what I think of as my cosmological constant – including everything not so far accounted for. There's no rhyme or reason for it; it just makes things work. I call it the balancing factor. In the last seven rows I share out the balancing factor according to the proportion of the things I know about. (A fair bit of approximation goes on here.)
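
The sharing-out can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name share_balancing_factor, the phoneme labels and every count below are made up for illustration and are not the figures in the spreadsheet.

```python
def share_balancing_factor(known_counts, total_words):
    """Spread the unaccounted-for words over the categories in proportion to known counts."""
    known_total = sum(known_counts.values())
    # Everything in the source that no category has claimed so far.
    balancing_factor = total_words - known_total
    adjusted = {
        label: count + balancing_factor * count / known_total
        for label, count in known_counts.items()
    }
    percentages = {
        label: 100 * value / total_words for label, value in adjusted.items()
    }
    return balancing_factor, percentages


# Hypothetical counts for three sounds and a hypothetical overall total.
known = {"/ɒ/": 40, "/ə/": 30, "/əʊ/": 30}
factor, shares = share_balancing_factor(known, total_words=120)
```

If the sharing is strictly proportional, each percentage comes out equal to that category's share of the known total, so any approximation in the real spreadsheet would show up as a departure from those values.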

Well, there you have it. Back to the grindstone...

b

Wednesday, May 25, 2016

The skeleton in the room...

...Or should that be the elephant in the cupboard? Anyway, the thing not mentioned.

First draft of another part of the Introduction


There is an admittedly uneasy blurring – in my approach, both here and in the first When Vowels Get Together book – between printed/written letters on the one hand and phonemes on the other. I look here at vowels "before an l" for example, and list words alphabetically (referring to letters), but the letter a represents the /ɒ/ phoneme only when it follows a /w/ phoneme (as in both "swallow" and "qualify"); in RBP, that is. In fact I was surprised that reviewers did not mention this – which, I suppose, might be regarded by some as a flaw.

My justification for this is based on the history of language development. Sounds always precede letters (except in special cases such as acronyms). Sometimes, the link between letters and phonemes remains firm (as in Castilian Spanish, which has a fairly reliable correspondence between letters and phonemes – nearly one-to-one, with a few exceptions). But in English this link is shakier.

The link is still there, though, when you consider the history of spellings. The common silent "gh" for example was originally an attempt to represent the sound /χ/ as in the Scottish "loch" or the German "Bach". In parts of Scotland, indeed, "night" is pronounced /nɪχt/ (as "night" was, at one time, in English); and in Northern Ireland a lake is a "lough", with (uniquely, among British English words) the final consonant /χ/.

In some cases letters have no phonemic value – as is often the case with silent letters. But in other cases there is such a link. There are various reasons for this:
  • The "b" in "debt" (Chaucer was writing "dette" in the fourteenth century, but later scholars imposed the "-bt" spelling in deference [some would say craven deference] to the Latin debitum.)
  • The Greek "ρ" with a spiritus fortis (also known as a "rough breathing") persuaded scholars to take the word "rime" (as used by Coleridge, for example) and insist that it should be spelt with an "rh".
In other cases a "silent letter" spelling was imposed by false analogy with another word with a silent letter that had once had a phonemic value. For example both "should" and "would" had one of these "real" silent letters (the words were sceolde and wolde, the past tenses of sculan and willan). But the past tense of another word that came to be used as a modal verb (like "would" and "should") was a word that Chaucer, for example, had spelt "koude" – with no phonemic "justification" for a silent l. So, basing their suggestion on a false analogy, language "experts" (thinking "modal verbs that end /ʊd/ should share the spelling -ould") introduced the spelling "could".

But quite often (I would guess more often than not, excepting Magic E spellings [where the e makes its presence felt, audibly, although it itself is not sounded]) the presence of a silent written letter does have some force with reference either to pronunciation at some stage in the development of the language or to etymology.

So while it would be wrong to say that written letters in English correspond to phonemes, quite often they make some reference to a real sound. Anyway, these books use alphabetical lists for convenience.

Update: 2016.05.27.12:30 – One line correction in purple. Oops.
Update: 2016.05.29.15:30 – Deletion of pleonastic "the presence of". REoops.