The World Atlas of Language Structures Online
Peter Brooks - 12 Sep 2010 15:31 GMT Here's an interesting resource:
http://wals.info/index
It's a little odd. It doesn't show English as a language in North America - nor, strangely, in Australia or South Africa.
Its tea map is interesting:
http://wals.info/feature/138?tg_format=map&v1=cd00&v2=c00d&v3=cccc&s=20&z3=3000& z2=2999&z1=2998
tsuidf - 12 Sep 2010 19:34 GMT > Here's an interesting resource: > [quoted text clipped - 4 lines] > > Its tea map is interesting: Yes, it is. But I am perturbed by its gender map, which purports to use the need for change in other words as the indicator for gender (ie, does the verb or adjective connected to a noun change depending on its grammatical gender?) and comes up with 3 genders for English. I see it was created by a German, which may explain something. Or not. I am heartened to note that more than half the languages annotated use *no* grammatical gender and that the seriously confusing examples (*5* for heaven's sake) are limited to places I'm not likely or hoping to find myself needing to express anything sophisticated.
best from Brussels, Stephanie
tsuidf - 12 Sep 2010 19:36 GMT > > Here's an interesting resource: > [quoted text clipped - 17 lines] > best from Brussels, > Stephanie PS -- I'm also irritated that whilst there is a map of words for 'tea' there is no corresponding effort for 'cat'. I should have thought that the world must be largely divided into 'k-t' words and 'm-ow' words (unlike dogs, or other animals, which are multifarious in this respect I think). Perhaps some future PhD effort?
Peter Brooks - 12 Sep 2010 20:15 GMT > PS -- I'm also irritated that whilst there is a map of words for 'tea' > there is no corresponding effort for 'cat'. I should have thought > that the world must be largely divided into 'k-t' words and 'm-ow' > words (unlike dogs, or other animals, which are multifarious in this > respect I think). Perhaps some future PhD effort? The tea map isn't quite right for English either, since we also talk about 'having a cup of chah'.
Christian Weisgerber - 12 Sep 2010 22:05 GMT > > http://wals.info/index > > Yes, it is. But I am perturbed by its gender map, which purports to > use the need for change in other words as the indicator for gender > (ie, does the verb or adjective connected to a noun changed depending > on its grammatical gender?) and comes up with 3 genders for English. Obviously they based this on the English third person singular pronouns. I didn't like that particular categorization either and it points to a problem with such structural overviews: the decision isn't always straightforward whether to put a language in the have or have not category.
By chance I was looking at some other list of structural features tonight and it included "phonemic consonant length" as a yes/no feature. How, I wondered, would you classify French, where the question about geminate consonants is answered "no, except for ..."? Russian?
 Signature Christian "naddy" Weisgerber naddy@mips.inka.de
Athel Cornish-Bowden - 19 Sep 2010 09:15 GMT >>> http://wals.info/index >> [quoted text clipped - 14 lines] > question about geminate consonants is answered "no, except for > ..."? Russian? There is a problem with any classification that forces answers into a yes/no straitjacket. There was a web site that I saw years ago that listed the diacritical marks used in different languages, and to my surprise I saw English listed as requiring more than, say, French or German (or even Vietnamese, for heaven's sake), on the basis (largely) of the great many loanwords that are occasionally used in English. As far as I can see "née" and "café" are about the only everyday English words that need an é (and many people, including many owners of cafés, omit it from "cafe"), and to list English among the languages that need an acute accent on that basis seems absurd. There is also the problem of what constitutes a diacritical mark: to an English speaker the dot on an i is not a diacritical mark, but to a Turkish speaker it is; to a Spanish speaker the tilde on an ñ is not a diacritical mark, but to an English speaker it is. I'm straying a bit from the subject, but the point is that trying to make hard classifications of messy data is usually a recipe for getting things wrong.
 Signature athel
Lewis - 19 Sep 2010 12:05 GMT > As far as I can see "née" and "café" are about the only everyday > English words that need an é (and many people, including many owners > of cafés, résumé and exposé? fiancé?
 Signature "He uses statistics as a drunken man uses lamp-posts... for support rather than illumination." - Andrew Lang (1844-1912)
Peter Moylan - 19 Sep 2010 13:50 GMT > There is a problem with any classification that forces answers into a > yes/no straitjacket. There was a web site that I saw years ago that [quoted text clipped - 12 lines] > point is that trying to make hard classifications of messy data is > usually a recipe for getting things wrong. A linguist would not make that mistake, I believe. The real problem is that web sites can be created by people who have no competence in the area that they are covering. I myself have competence in the creation of web pages, and if I chose to abuse that it would let me create essays on topics where I know absolutely nothing.
In the long term, we will probably learn to check the credentials of the person who is making assertions. Meanwhile, we are stuck with "Wikipaedia, right or wrong".
I could come up with a few more English words that use diacriticals; but I won't, because I would certainly not support the suggestion that English uses more diacriticals than Vietnamese.
 Signature Peter Moylan, Newcastle, NSW, Australia. http://www.pmoylan.org For an e-mail address, see my web page.
Robert Bannister - 20 Sep 2010 02:36 GMT >>>> http://wals.info/index >>> [quoted text clipped - 31 lines] > point is that trying to make hard classifications of messy data is > usually a recipe for getting things wrong. Once upon a time, it was normal to use diereses on a number of words, although nowadays, it seems confined to a few names like Zoë. In addition, in British English, there are words like "blessèd, agèd" and so on. I still find uncapitalised "cafe" very jarring.
I agree with you that letters with diacritics are perceived in other languages as separate letters from the unmarked ones, although French, for example, does not do a different alphabetical sort for "e, é, è, ê". Of course, a rather large number of English speakers think that diacritics are some kind of unnecessary foreign frill.
 Signature Rob Bannister
Stan Brown - 20 Sep 2010 02:59 GMT > As > far as I can see "née" and "café" are about the only everyday English > words that need an é (and many people, including many owners of cafés, > omit it from "cafe"), I don't think I've ever seen "café"; it's always "cafe" as far as I can remember.
But you missed "fiancé" and "fiancée", which I believe are more common than "née".
 Signature Stan Brown, Oak Road Systems, Tompkins County, New York, USA http://OakRoadSystems.com Shikata ga nai...
Athel Cornish-Bowden - 29 Sep 2010 20:19 GMT >> As >> far as I can see "née" and "café" are about the only everyday English [quoted text clipped - 6 lines] > But you missed "fiancé" and "fiancée", which I believe are more common > than "née". Indeed. I forgot about those, and they are indeed in everyday use. WIWAL my elders used to complain a lot about "cafe", but I think that battle is lost.
 Signature athel
Mark Brader - 24 Sep 2010 05:23 GMT Athel Cornish-Bowden:
> There was a web site that I saw years ago that listed the diacritical > marks used in different languages, and to my surprise I saw English > listed as requiring more than, say, French or German ...
> As far as I can see "née" and "café" are about the only everyday English > words that need an é ... There aren't *any* English words that need accents. That's why there are no accents on the keyboards normally used with English.
> There is also the problem as to what constitute diacritical marks, > to an English speaker the dot on an i is not a diacritical mark, > but to a Turkish speaker it is... Athel's right to say that one language may see a mark as a diacritical when another does not, but I don't think that example is right. My understanding is that i (which is capitalized as I-dot) and dotless-i (capitalized as I) are simply two different letters in Turkish.
 Signature Mark Brader, Toronto | "Astronauts practice landing on laptops" msb@vex.net | --Ft. Myers, FL, News-Press, March 13, 1994
My text in this article is in the public domain.
Garrett Wollman - 24 Sep 2010 16:55 GMT >Athel's right to say that one language may see a mark as a diacritical >when another does not, but I don't think that example is right. My >understanding is that i (which is capitalized as I-dot) and dotless-i >(capitalized as I) are simply two different letters in Turkish. By what objective standard can you distinguish these two cases?
-GAWollman
 Signature Garrett A. Wollman | What intellectual phenomenon can be older, or more oft wollman@bimajority.org| repeated, than the story of a large research program Opinions not shared by| that impaled itself upon a false central assumption my employers. | accepted by all practitioners? - S.J. Gould, 1993
Evan Kirshenbaum - 24 Sep 2010 18:36 GMT >>Athel's right to say that one language may see a mark as a diacritical >>when another does not, but I don't think that example is right. My >>understanding is that i (which is capitalized as I-dot) and dotless-i >>(capitalized as I) are simply two different letters in Turkish. > > By what objective standard can you distinguish these two cases? By what objective standard do you determine that "U" and "V" or "I" and "J" are now different letters? Typically you look at things like how people enumerate their alphabets (and what gets a separate name) and how they alphabetize things.
 Signature Evan Kirshenbaum +------------------------------------ Still with HP Labs |Pious Jews have a category of SF Bay Area (1982-) |questions that can harmlessly be Chicago (1964-1982) |allowed to go without an answer |until the Messiah comes. I suspect evan.kirshenbaum@gmail.com |that this is one of them. | Joseph C. Fineman http://www.kirshenbaum.net/
Robert Bannister - 25 Sep 2010 01:16 GMT >>> Athel's right to say that one language may see a mark as a diacritical >>> when another does not, but I don't think that example is right. My [quoted text clipped - 7 lines] > how people enumerate their alphabets (and how what get separate names) > and how they alphabetize things. There is perhaps a small difference: to my mind, the French letters e, é, è and to a lesser extent ê are separate, different letters*, but we have to admit that the French do not classify them as such to the extent where an alphabetical sort will place one before the other. By contrast, there is the Dutch ij and I think there's one like that in Spanish too - however, no diacritics involved.
In other words, one language may distinguish clearly different sounds by using a different letter or by adding a diacritic mark. These may or may not be considered as separate letters. Consider the difference between written Serbian and Croatian - what are different letters in Serbian, appearing widely separated in the alphabet, are represented by Latin letters with diacritics. I've never seen a Croatian dictionary, so I don't know how they sort these different letters, if they do at all.
* I doubt there are many left nowadays who lack the fonts for these characters, but just in case, the letters were plain e, e acute, e grave and e circumflex.
 Signature Rob Bannister
Evan Kirshenbaum - 25 Sep 2010 02:01 GMT >>>> Athel's right to say that one language may see a mark as a diacritical >>>> when another does not, but I don't think that example is right. My [quoted text clipped - 10 lines] > There is perhaps a small difference: to my mind, the French letters > e, é, è and to a lesser extent ê are separate, different letters*, Only if they consider them such, which I don't believe they do.
> but we have to admit that the French do not classify them as such to > the extent where an alphabetical sort will place one before the > other. By contrast, there is the Dutch ij and I think there's one > like that in Spanish too - however, no diacritics involved. Spanish considers "ch" and "ll" to be separate letters (named /tSe/ and /eje/ (or /eZe/), respectively). They also consider "ñ" (/Enje/) different from "n" (/Ene/). On the other hand, "á" is simply "'a' con acento". The letters "u" and "ü" are also the same.
> In other words, one language may distinguish clearly different > sounds by using a different letter or by adding a diacritic > mark. These may or may not be considered as separate > letters. That's the point. Dotted and dotless "i" are separate letters in Turkish because they consider them to be separate letters. Similarly, "a" and "ä" are separate letters in Swedish, but not in German.
> Consider the difference between written Serbian and Croatian - what > are different letters in Serbian, appearing widely separated in the > alphabet, are represented by Latin letters with diacritics. I've > never seen a Croatian dictionary, so I don't know how they sort > these different letters, if they do at all. According to Wikipedia
http://en.wikipedia.org/wiki/Croatian_alphabet
the letters we would consider to have diacritics or, in some cases, be two-letter sequences are considered to be separate letters there.
 Signature Evan Kirshenbaum +------------------------------------ Still with HP Labs |People think it must be fun to be a SF Bay Area (1982-) |super genius, but they don't Chicago (1964-1982) |realize how hard it is to put up |with all the idiots in the world. evan.kirshenbaum@gmail.com | Calvin
http://www.kirshenbaum.net/
mb - 26 Sep 2010 08:25 GMT On Sep 24, 6:01 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> wrote:
> That's the point. Dotted and dotless "i" are separate letters in > Turkish because they consider them to be separate letters. Deciding if to keep letters classified together or separately generally has to do with the mechanisms of the language in question. In the Turkish example, the dotless /I/ is the back equivalent of the front, dotted /i/, exactly as back /a/ corresponds to front /e/, /o/ to /ö/ and /u/ to /ü/. Given that vowel harmony is one of the keystones of how Turkish works, confusing the dotted with the undotted vowels would be as bad for a Turkish speaker as considering a and e the same letter would be to you.
Lewis - 25 Sep 2010 02:36 GMT >>>Athel's right to say that one language may see a mark as a diacritical >>>when another does not, but I don't think that example is right. My >>>understanding is that i (which is capitalized as I-dot) and dotless-i >>>(capitalized as I) are simply two different letters in Turkish. >> >> By what objective standard can you distinguish these two cases?
> By what objective standard to you determine that "U" and "V" or "I" > and "J" are now different letters? Typically you look at things like > how people enumerate their alphabets (and how what get separate names) > and how they alphabetize things. Spanish is odd in that it has single letters that are made up of multiple characters. The 30 letter alphabet goes
A B C CH D E F G H I J K L LL M N Ñ O P Q R RR S T U V W (X) Y Z
and a proper alphabetizing routine will put CH after C and LL after L.
X is special because while it is common in Mexico, it is not used much elsewhere, and I had one text book as a child that omitted it (from Spain, I think, as it had Vosotros in it as well).
 Signature Penny, I'm a physicist. I have a working knowledge of the entire universe and everything it contains.
Roland Hutchinson - 27 Sep 2010 23:33 GMT >>>>Athel's right to say that one language may see a mark as a diacritical >>>>when another does not, but I don't think that example is right. My [quoted text clipped - 12 lines] > > A B C CH D E F G H I J K L LL M N Ñ O P Q R RR S T U V W (X) Y Z
> and a proper alphabetizing routine will put CH after C and LL after L. Used to. Although, as I understand it, the digraphs CH and LL are still considered separate letters, the officially approved collating sequence has changed within living memory.
Quoth Wikipedia (s.v. "collating sequence"): 'Spanish treated (until 1997) "CH" and "LL" as single letters, giving an ordering of cinco, credo, chispa and lomo, luz, llama. This is not true anymore since in 1997 RAE adopted the more conventional usage, and now LL is collated between LK and LM, and CH between CG and CI. The six accented or umlauted characters Á, É, Í, Ó, Ú, Ü are treated as the original letters A, E, I, O, U, for example: radio, ráfaga, rana, rápido, rastrillo. The only Spanish specific collating question is Ñ (eñe) as a different letter collated after N.'
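The two collating orders described in that quotation can be sketched in a few lines of Python. This is only an illustration of the rules as quoted (CH and LL as single letters before the change, Ñ after N, accented vowels folded onto plain ones); the helper names are made up for the example, and it ignores tie-breaking between accented and unaccented spellings.

```python
import unicodedata

# Alphabets as described in the thread; helper names here are hypothetical.
MODERN = list("abcdefghijklmnñopqrstuvwxyz")
TRADITIONAL = list("abc") + ["ch"] + list("defghijkl") + ["ll"] + list("mnñopqrstuvwxyz")

_PLACEHOLDER = "\ue000"  # private-use code point that shields ñ from accent stripping


def fold(word):
    """Lowercase and drop accents/diaeresis (á -> a, ü -> u) but keep ñ."""
    text = unicodedata.normalize("NFC", word).lower().replace("ñ", _PLACEHOLDER)
    text = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return text.replace(_PLACEHOLDER, "ñ")


def tokens(word, digraphs):
    """Split a folded word into letters, treating the given digraphs as units."""
    out, i = [], 0
    while i < len(word):
        step = 2 if word[i:i + 2] in digraphs else 1
        out.append(word[i:i + step])
        i += step
    return out


def make_key(alphabet):
    """Build a sort key that ranks each letter by its position in `alphabet`."""
    rank = {letter: n for n, letter in enumerate(alphabet)}
    digraphs = {letter for letter in alphabet if len(letter) == 2}

    def key(word):
        # Anything outside the alphabet sorts after all letters.
        return [rank.get(t, len(alphabet)) for t in tokens(fold(word), digraphs)]

    return key


traditional_key = make_key(TRADITIONAL)  # CH and LL collate as single letters
modern_key = make_key(MODERN)            # CH and LL collate as c+h and l+l

words = ["chispa", "credo", "cinco", "llama", "luz", "lomo"]
print(sorted(words, key=traditional_key))
print(sorted(words, key=modern_key))
```

With the traditional key the CH and LL words fall after the other C and L words; with the modern key they sort letter by letter like any other word.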
 Signature Roland Hutchinson
He calls himself "the Garden State's leading violist da gamba," ... comparable to being ruler of an exceptionally small duchy. --Newark (NJ) Star Ledger ( http://tinyurl.com/RolandIsNJ )
Evan Kirshenbaum - 28 Sep 2010 02:12 GMT >>>>>Athel's right to say that one language may see a mark as a diacritical >>>>>when another does not, but I don't think that example is right. My [quoted text clipped - 30 lines] > Spanish specific collating question is Ñ (eñe) as a different > letter collated after N.' The RAE hasn't told the people responsible for the online version of their dictionary:
ch.
1. f. Dígrafo que, por representar un solo sonido consonántico de articulación africada, palatal y sorda, como en _mucho_ o _noche_, es considerado desde 1803 cuarta letra del abecedario español. Su nombre es che.
ORTOGR. En la escritura es inseparable.
Nothing about it having changed in 1997. (This is from the 22nd edition, published in 2001.) "Ll" has much the same definition, aside from the actual pronunciation. "D" is defined as the "quinta letra del abecedario español", which only works if "ch" is the fourth.
 Signature Evan Kirshenbaum +------------------------------------ Still with HP Labs |Marge: You liked Rashomon. SF Bay Area (1982-) |Homer: That's not how *I* remember Chicago (1964-1982) | it.
evan.kirshenbaum@gmail.com
http://www.kirshenbaum.net/
James Hogg - 28 Sep 2010 07:06 GMT >>>>>> Athel's right to say that one language may see a mark as a diacritical >>>>>> when another does not, but I don't think that example is right. My [quoted text clipped - 42 lines] > from the actual pronunciation. "D" is defined as the "quinta letra > del abecedario español", which only works if "ch" is the fourth. So evidently I can be forgiven for not knowing about the change.
 Signature James
Roland Hutchinson - 29 Sep 2010 18:30 GMT >>>>>>Athel's right to say that one language may see a mark as a >>>>>>diacritical when another does not, but I don't think that example is [quoted text clipped - 27 lines] >> now LL is collated between LK and LM, and CH between CG and CI. The six >> accented or umlauted characters Á, É, Í, Ó, Ú, Ü are treated as
>> the original letters A, E, I, O, U, for example: radio, ráfaga, rana, >> rápido, rastrillo. The only Spanish specific collating question is Ñ [quoted text clipped - 16 lines] > from the actual pronunciation. "D" is defined as the "quinta letra del > abecedario español", which only works if "ch" is the fourth. Oh, that doesn't contradict what I (and WikiP) said. (And I checked the RAE dictionary entries before posting, even.) CH and LL are still considered a letter, and still have their place (sequentially) in the alphabet -- only the collating sequence has been changed, presumably to avoid difficulties for foreigners and (especially) computers.
The Spanish-language Wikipedia puts this more clearly (s.v. "Idioma español"):
El español se escribe mediante una variante del alfabeto latino con la letra adicional "ñ" y los dígrafos "ch" y "ll", consideradas letras del abecedario desde 1803 (cuarta edición del DRAE), debido a que representan un solo sonido, distinto de las letras que lo componen.
Así, el alfabeto español está formado por 27 letras y 2 digrafos:
* a, b, c, ch, d, e, f, g, h, i, j, k, l, ll, m, n, ñ, o, p, q, r, s, t, u, v, w, x, y, z.
Durante el X Congreso de la Asociación de Academias de la Lengua Española (Madrid, 1994), se acordó adoptar el alfabeto latino universal, en el cual ch y ll no son letras independientes, lo que afecta a la alfabetización de las palabras que contengan esas dos letras, que desde entonces deben aparecer ordenadas en el lugar que les corresponde dentro de la c y la l. Sin embargo, de acuerdo con las Academias, esta reforma «afecta únicamente al proceso de ordenación alfabética de las palabras, no a la composición del abecedario, del que los dígrafos ch y ll siguen formando parte».[196]
My (not too flowing, first-draft quality) translation:
Spanish is written using a variant of the Latin alphabet with the additional letter "ñ" and the digraphs "ch" and "ll", considered letters of the alphabet since 1803 (fourth edition of the DRAE), due to the fact that they [each] represent a single sound, differing from the letters that they are made of.
Thus, the Spanish alphabet is made up of 27 letters and two digraphs:
* a, b, c, ch, d, e, f, g, h, i, j, k, l, ll, m, n, ñ, o, p, q, r, s, t, u, v, w, x, y, z.
During the 10th Congress of the Association of Academies of the Spanish Language (Madrid, 1994), it was agreed to adopt the universal Latin alphabet, in which ch and ll are not independent letters, which affects the alphabetization of words that contain those two letters, which since then must appear sorted in their corresponding places within the letters c and l. Nevertheless, according to the Academies, this reform "only affects the process of alphabetization of words, not the makeup of the alphabet, of which the digraphs ch and ll continue to be a part."[196]
The reference [196] is: DPD, Ed. Santillana, 2005, pág. 5-6.
(DPD = Diccionario Panhispánico de Dudas)
 Signature Roland Hutchinson
He calls himself "the Garden State's leading violist da gamba," ... comparable to being ruler of an exceptionally small duchy. --Newark (NJ) Star Ledger ( http://tinyurl.com/RolandIsNJ )
James Hogg - 28 Sep 2010 07:05 GMT >>>>> Athel's right to say that one language may see a mark as a diacritical >>>>> when another does not, but I don't think that example is right. My [quoted text clipped - 26 lines] > rastrillo. The only Spanish specific collating question is Ñ (eñe) as a > different letter collated after N.' I didn't know about that change. Thanks for pointing it out.
 Signature James
Peter Brooks - 25 Sep 2010 03:57 GMT On Sep 24, 7:36 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> wrote:
> -- > Evan Kirshenbaum +------------------------------------ [quoted text clipped - 5 lines] > | Joseph C. Fineman > Isn't it a good thing that this will never happen - it'd be a sad and tedious world in which there were fewer questions than answers.
Nick - 25 Sep 2010 12:33 GMT > On Sep 24, 7:36 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> > wrote: [quoted text clipped - 10 lines] > Isn't it a good thing that this will never happen - it'd be a sad and > tedious world in which there were fewer questions than answers. Luckily, there are an infinity of questions of the form "is n equal to n-1 + 1?" (eg, "is three hundred and seventy equal to three hundred and sixty-nine plus one?") to which the answer is "yes".
So there will clearly always be more questions than answers.
Sits back and waits...
 Signature Online waterways route planner | http://canalplan.eu Plan trips, see photos, check facilities | http://canalplan.org.uk
Skitt - 25 Sep 2010 19:15 GMT >>> -- >>> Evan Kirshenbaum [quoted text clipped - 22 lines] > > Sits back and waits... Why?
Lewis - 26 Sep 2010 05:49 GMT >> On Sep 24, 7:36 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> >> wrote: [quoted text clipped - 10 lines] >> Isn't it a good thing that this will never happen - it'd be a sad and >> tedious world in which there were fewer questions than answers.
> Luckily, there are an infinity of questions of the form "is n equal to > n-1 + 1?" (eg, "is three hundred and seventy equal to three hundred and > sixty-nine plus one?") to which the answer is "yes".
> So there will clearly always be more questions than answers.
> Sits back and waits... The old My infinity is bigger than yours position?
Consider, The set of all integers is, of course, infinite.
The set of all even numbers is infinite.
The set of all numbers evenly divisible by 7 is infinite.
The set of all prime numbers is infinite.
The set of all numbers evenly divisible by a 1,817 digit long prime number is... wait for it... infinite.
But "obviously" the first set is 'larger' than the latter set, right?
No. Infinite is infinite. Every set is the same size, it is only the frequency pattern that is different.
 Signature I thought that they were angels, but to my surprise, we climbed aboard their starship, we headed for the skies.
Evan Kirshenbaum - 26 Sep 2010 18:08 GMT >> Luckily, there are an infinity of questions of the form "is n equal >> to n-1 + 1?" (eg, "is three hundred and seventy equal to three [quoted text clipped - 21 lines] > No. Infinite is infinite. Every set is the same size, it is only the > frequency pattern that is different. There's a Mr. Cantor in the lobby who would like to have a word with you.
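For readers who want the result behind that remark, here is a brief sketch in standard notation (not taken from any post in the thread). Every set Lewis lists really is the same size as the integers, because each can be paired off with them one-to-one; multiples of 7 serve as the example. Cantor's point is that this stops being true once you reach the real numbers.

```latex
% Multiples of 7 pair off one-to-one with the integers:
\[
  f : \mathbb{Z} \to 7\mathbb{Z}, \qquad f(n) = 7n
  \quad\text{is a bijection, so}\quad |7\mathbb{Z}| = |\mathbb{Z}| = \aleph_0 .
\]
% Cantor's theorem: no set can be paired off with its own power set, hence
\[
  |\mathbb{N}| \;<\; |\mathcal{P}(\mathbb{N})| \;=\; |\mathbb{R}| \;=\; 2^{\aleph_0} .
\]
```

So "infinite is infinite" holds for the countable examples given, but not for every infinite set.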
 Signature Evan Kirshenbaum +------------------------------------ Still with HP Labs |The mystery of government is not how SF Bay Area (1982-) |Washington works, but how to make it Chicago (1964-1982) |stop. | P.J. O'Rourke evan.kirshenbaum@gmail.com
http://www.kirshenbaum.net/
Roland Hutchinson - 27 Sep 2010 23:41 GMT >>> Luckily, there are an infinity of questions of the form "is n equal to >>> n-1 + 1?" (eg, "is three hundred and seventy equal to three hundred [quoted text clipped - 24 lines] > There's a Mr. Cantor in the lobby who would like to have a word with > you. How many rooms does he want this time?
 Signature Roland Hutchinson
He calls himself "the Garden State's leading violist da gamba," ... comparable to being ruler of an exceptionally small duchy. --Newark (NJ) Star Ledger ( http://tinyurl.com/RolandIsNJ )
R H Draney - 28 Sep 2010 03:38 GMT Roland Hutchinson filted:
>>> No. Infinite is infinite. Every set is the same size, it is only the >>> frequency pattern that is different. [quoted text clipped - 3 lines] > >How many rooms does he want this time? Enough for some whoopee....r
 Signature Me? Sarcastic? Yeah, right.
Peter Brooks - 26 Sep 2010 08:01 GMT > > On Sep 24, 7:36 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> > > wrote: [quoted text clipped - 18 lines] > > Sits back and waits... Firstly, they are all really the same question. Secondly, each question can be matched to an answer - actually, it's better than that, each question can be mapped to at least three answers "yes" "no", "don't know".
So from the first we get more answers than questions - nobody has stipulated right answers.
From the second, if you disagree with the first, there are a countable number of both. That is, questions & answers have the same level of infinity.
I think there's a third option. You could construct an uncountable series of questions. Then, though, you'd have an uncountable number of answers. So they'd share a different order of infinity.
Evan Kirshenbaum - 26 Sep 2010 18:12 GMT > Firstly, they are all really the same question. > Secondly, each question can be matched to an answer - actually, it's [quoted text clipped - 10 lines] > I think there's a third option. You could construct an uncountable > series of questions. Not if the questions are of finite length (and drawn from a countable alphabet).
> Then, though, you'd have an uncountable number of answers. So they'd > share a different order of infinity.
 Signature Evan Kirshenbaum +------------------------------------ Still with HP Labs |Sorry, captain. Convenient SF Bay Area (1982-) |technobabble levels are dangerously Chicago (1964-1982) |low.
evan.kirshenbaum@gmail.com
http://www.kirshenbaum.net/
Peter Brooks - 27 Sep 2010 05:00 GMT On Sep 26, 7:12 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> wrote:
> > Firstly, they are all really the same question. > > Secondly, each question can be matched to an answer - actually, it's [quoted text clipped - 26 lines] > > http://www.kirshenbaum.net/
Peter Brooks - 27 Sep 2010 05:41 GMT On Sep 26, 7:12 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> wrote:
> > Firstly, they are all really the same question. > > Secondly, each question can be matched to an answer - actually, it's [quoted text clipped - 13 lines] > Not if the questions are of finite length (and drawn from a countable > alphabet). They'd be very tedious and pointless questions, but a series of questions based on each digit of the expansion of all real numbers would. Quite a lot of these questions, of course would not be of finite length.
> > Then, though, you'd have an uncountable number of answers. So they'd > > share a different order of infinity. [quoted text clipped - 8 lines] > > http://www.kirshenbaum.net/
Nick - 27 Sep 2010 08:21 GMT > On Sep 26, 7:12 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> > wrote: [quoted text clipped - 23 lines] >> > Then, though, you'd have an uncountable number of answers. So they'd >> > share a different order of infinity. And Skitt wondered why I "sat back and waited".
 Signature Online waterways route planner | http://canalplan.eu Plan trips, see photos, check facilities | http://canalplan.org.uk
Joe Fineman - 27 Sep 2010 22:13 GMT > On Sep 24, 7:36 pm, Evan Kirshenbaum <evan.kirshenb...@gmail.com> > wrote: [quoted text clipped - 10 lines] > Isn't it a good thing that this will never happen - it'd be a sad > and tedious world in which there were fewer questions than answers. I reckon there's more things true than are told, And more things told than are true. -- Kipling
 Signature --- Joe Fineman joe_f@verizon.net
||: We are all strong enough to bear the misfortunes of others. :||
Athel Cornish-Bowden - 29 Sep 2010 21:03 GMT >> Athel's right to say that one language may see a mark as a diacritical >> when another does not, but I don't think that example is right. My >> understanding is that i (which is capitalized as I-dot) and dotless-i >> (capitalized as I) are simply two different letters in Turkish. > > By what objective standard can you distinguish these two cases? I can't speak for Mark, but I would distinguish them according to the way they are sorted in alphabetical order. In French e, é, è and ê are all jumbled up in indexes, so they are variants of one letter, but in Spanish n and ñ are kept separate, so they are different letters (more important, c and ch are kept separate, so although Chile and Colombia are spelt the same in English and Spanish, Chile comes before Colombia in an English list, but Colombia comes before Chile in a Spanish list). I think Mark is right that in Turkish dotted and undotted i are sorted separately, so they are regarded as different letters.
 Signature athel
Adam Funk - 29 Sep 2010 22:01 GMT > I can't speak for Mark, but I would distinguish them according to the > way they are sorted in alphabetical order. In French e, é, è and ê are [quoted text clipped - 5 lines] > I think Mark is right that in Turkish dotted and undotted i are sorted > separately, so they are regarded as different letters. Is this what LC_COLLATE determines?
 Signature Some say the world will end in fire; some say in segfaults. [XKCD 312]
Christian Weisgerber - 29 Sep 2010 23:18 GMT > > I can't speak for Mark, but I would distinguish them according to the > > way they are sorted in alphabetical order. In French e, é, è and ê are [quoted text clipped - 5 lines] > > Is this what LC_COLLATE determines? Yes.
 Signature Christian "naddy" Weisgerber naddy@mips.inka.de
Christian Weisgerber - 29 Sep 2010 22:26 GMT > I can't speak for Mark, but I would distinguish them according to the > way they are sorted in alphabetical order. In French e, é, è and ê are > all jumbled up in indexes, so they are variants of one letter, but in > Spanish n and ñ are kept separate, so they are different letters (more Good luck with that approach.
German has two sorting orders, both standardized in DIN 5007: Variant 1 treats ä, ö, ü as equal to a, o, u. Variant 2 has ä, ö, ü as equivalent to ae, oe, ue. (Both treat ß as ss.)
The first variant is used in encyclopedias and such, the second one in lists of names (phone books).
And Wikipedia mentions that there are three different schemes used in Austrian phone books...
 Signature Christian "naddy" Weisgerber naddy@mips.inka.de
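The difference between the two variants can be sketched in a few lines of Python. This is only an illustration of the substitutions listed in the post (ä, ö, ü as a, o, u versus ae, oe, ue, and ß as ss in both); the function names are invented here, and any tie-breaking rules the real standard may contain are not modelled.

```python
# Substitution tables taken from the description in the post above.
VARIANT_1 = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
VARIANT_2 = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def din5007_1(word):
    """Sort key for the dictionary-style variant: umlauts sort as base vowels."""
    return word.lower().translate(VARIANT_1)


def din5007_2(word):
    """Sort key for the phone-book-style variant: umlauts sort as vowel + e."""
    return word.lower().translate(VARIANT_2)


names = ["Muffel", "Müller", "Mutz"]
print(sorted(names, key=din5007_1))  # ['Muffel', 'Müller', 'Mutz']
print(sorted(names, key=din5007_2))  # ['Müller', 'Muffel', 'Mutz']
```

The same three names come out in a different order depending on the variant, which is why sorting order alone does not settle whether ä is "a letter" in German.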
Joachim Pense - 29 Sep 2010 23:28 GMT On 29.09.2010 23:26, Christian Weisgerber wrote:
>> I can't speak for Mark, but I would distinguish them according to the >> way they are sorted in alphabetical order. In French e, é, è and ê are [quoted text clipped - 13 lines] > And Wikipedia mentions that there are three different schemes used > in Austrian phone books... And ä, ö, ü, ß do not count as letters of the alphabet.
An ä is not considered a "variant of an a". If anything, it would be considered a variant of an e.
Is "ä" a letter? Can you say "der Buchstabe ä" like you say "der Buchstabe x"? Probably yes, but in a way "ä" is less of a letter than "x".
Joachim
Garrett Wollman - 29 Sep 2010 23:34 GMT >Good luck with that approach. > >German has two sorting orders, both standardized in DIN 5007: >Variant 1 treats ä, ö, ü as equal to a, o, u. >Variant 2 has ä, ö, ü as equivalent to ae, oe, ue. >(Both treat ß as ss.) So which one do you get when you call setlocale(LC_COLLATE, "de_DE.ISO8859-15")?
(At least on my system, it appears that you get "variant 1": (a,<a'>,<a!>,<a/>>,<aa>,<a:>,<a?>,<ae>);\ but I have no idea if that's universal.[1])
-GAWollman
[1] I also don't know what the notation means. I assume that <a:> is a-with-umlaut, <ae> is ae-ligature, and <aa> is a-ring (Swedish /o/). They are not listed in this collation sequence in the same order as they are in the ISO 8859-15 table.
 Signature Garrett A. Wollman | What intellectual phenomenon can be older, or more oft wollman@bimajority.org| repeated, than the story of a large research program Opinions not shared by| that impaled itself upon a false central assumption my employers. | accepted by all practitioners? - S.J. Gould, 1993
Christian Weisgerber - 30 Sep 2010 12:04 GMT > >German has two sorting orders, both standardized in DIN 5007: > >Variant 1 treats ä, ö, ü as equal to a, o, u. [quoted text clipped - 3 lines] > So which one do you get when you call setlocale(LC_COLLATE, > "de_DE.ISO8859-15")? Most likely variant 1 if collation is actually implemented.
> (At least on my system, it appears that you get "variant 1": > (a,<a'>,<a!>,<a/>>,<aa>,<a:>,<a?>,<ae>);\ > but I have no idea if that's universal.[1]) > > [1] I also don't know what the notation means. See colldef/map.ISO8859-15.
 Signature Christian "naddy" Weisgerber naddy@mips.inka.de
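For anyone who wants to try the setlocale question on their own system, here is a hedged Python sketch using the standard `locale` module, which wraps the same C library calls. The helper name `sorted_in_locale` is invented for this example, and whether a locale such as "de_DE.ISO8859-15" is installed, and which order it yields, depends entirely on the system.

```python
import locale


def sorted_in_locale(words, locale_name):
    """Sort words by the collation rules of locale_name.

    Returns None if the system does not provide that locale.
    Hypothetical helper; results vary from system to system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale not installed here
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the old setting


result = sorted_in_locale(["Muffel", "Müller", "Mutz"], "de_DE.ISO8859-15")
print(result if result is not None else "locale not available on this system")
```

No expected output is given, since the thread itself notes that the result may not be universal.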
Lewis - 29 Sep 2010 23:54 GMT >> I can't speak for Mark, but I would distinguish them according to the >> way they are sorted in alphabetical order. In French e, é, è and ê are >> all jumbled up in indexes, so they are variants of one letter, but in >> Spanish n and ñ are kept separate, so they are different letters (more
> Good luck with that approach.
> German has two sorting orders, both standardized in DIN 5007: > Variant 1 treats ä, ö, ü as equal to a, o, u. > Variant 2 has ä, ö, ü as equivalent to ae, oe, ue. > (Both treat ß as ss.)
> The first variant is used in encyclopedias and such, the second one > in lists of names (phone books).
> And Wikipedia mentions that there are three different schemes used > in Austrian phone books... I'm reasonably sure that our printed White Pages (back when we had those things in the 1980's and early 90's before broadband) had "M" and "Mc" last names separated.
 Signature Competent? How are we going to compete with that?
Robert Bannister - 30 Sep 2010 02:02 GMT >>> I can't speak for Mark, but I would distinguish them according to the >>> way they are sorted in alphabetical order. In French e, é, è and ê [quoted text clipped - 18 lines] > things in the 1980's and early 90's before broadband) had "M" and "Mc" > last names separated. Ours have all the "Mac" and "Mc" names between Mab and Mad, but (if I remember correctly, which is not a sure bet) separate from any other Mac names.
 Signature Rob Bannister
Nick Spalding - 30 Sep 2010 10:07 GMT Lewis wrote, in <slrnia7gsj.1dra.g.kreme@ibook-g4.local> on 29 Sep 2010 22:54:11 GMT:
> >> I can't speak for Mark, but I would distinguish them according to the > >> way they are sorted in alphabetical order. In French e, é, è and ê are [quoted text clipped - 17 lines] > things in the 1980's and early 90's before broadband) had "M" and "Mc" > last names separated. The Irish phone books have "M" separate from "Mc" and "Mac" which are grouped together sorted by what comes after them, as:
McAdam, James
Macadam, Thomas
McAdam, William
 Signature Nick Spalding BrE/IrE
Peter Duncanson (BrE) - 30 Sep 2010 11:26 GMT >Lewis wrote, in <slrnia7gsj.1dra.g.kreme@ibook-g4.local> > on 29 Sep 2010 22:54:11 GMT: [quoted text clipped - 26 lines] >Macadam, Thomas >McAdam, William In Northern Ireland phone books (and probably the rest of the UK) M', Mc and Mac are all treated as though they are Mac. Normal alphabetic sorting is then used. For instance:
McEwen
Macey
McFaddef [sic]
McFadden
 Signature Peter Duncanson, UK (in alt.usage.english)
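The rule described above (treat M', Mc and Mac as though they were all Mac, then sort normally) is easy to sketch in Python. The name `mac_key` is invented for this illustration, and the sketch assumes a plain ASCII apostrophe in M' names.

```python
import re

# Leading M', Mc or Mac, in any letter case (ASCII apostrophe assumed).
PREFIX = re.compile(r"^(?:mac|mc|m')", re.IGNORECASE)


def mac_key(surname):
    """Sort key that files M', Mc and Mac names as if all were spelt Mac."""
    return PREFIX.sub("mac", surname).lower()


names = ["McFadden", "Macey", "McEwen", "McFaddef"]
print(sorted(names))               # plain sort: Macey comes first
print(sorted(names, key=mac_key))  # ['McEwen', 'Macey', 'McFaddef', 'McFadden']
```

With the key applied, the four names from the example come out in the same order as in the phone book listing quoted in the post.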
Nick Spalding - 30 Sep 2010 12:11 GMT Peter Duncanson (BrE) wrote, in <mto8a6hgfhq9q1922ib2gh0qckllmi9lrq@4ax.com> on Thu, 30 Sep 2010 11:26:18 +0100:
> >Lewis wrote, in <slrnia7gsj.1dra.g.kreme@ibook-g4.local> > > on 29 Sep 2010 22:54:11 GMT: [quoted text clipped - 35 lines] > McFaddef [sic] > McFadden I looked for a M' but didn't spot one in the Dublin book.
 Signature Nick Spalding BrE/IrE
Peter Duncanson (BrE) - 30 Sep 2010 13:04 GMT >Peter Duncanson (BrE) wrote, in ><mto8a6hgfhq9q1922ib2gh0qckllmi9lrq@4ax.com> [quoted text clipped - 41 lines] > >I looked for a M' but didn't spot one in the Dublin book. I can't find an M' in my local phone book either. The information about the treatment of M', Mc and Mac is printed as a note at the beginning of the M names.
There are people named M'Gee in various parts of the world:
Cathy M'Gee on Facebook: http://www.facebook.com/Buddha.MGee
and a fine collection of M'Gees and other M'-s in this Google Books item: http://books.google.co.uk/books?id=juPn5mjLa3UC&pg=PT238&lpg=PT238&dq=%22M%27gee%22&source=bl&ots=3Ya5lQdysU&sig=si7TxtJi1Bpt7hHNRcdayKDT8kQ&hl=en&ei=i3ukTKr1CsiD4QaO5bznDQ&sa=X&oi=book_result&ct=result&resnum=8&ved=0CDAQ6AEwBzgo#v=onepage&q=%22M%27gee%22&f=false or http://tinyurl.com/3ynnjk8
 Signature Peter Duncanson, UK (in alt.usage.english)
Athel Cornish-Bowden - 29 Sep 2010 20:57 GMT > Athel Cornish-Bowden: >> There was a web site that I saw years ago that listed the diacritical [quoted text clipped - 13 lines] > Athel's right to say that one language may see a mark as a diacritical > when another does not, but I don't think that example is right. No. As so often, I spoke too soon. I think that Turkish dotted and undotted are more like ñ and n in Spanish: two different letters. What I meant was that the dot has a specific meaning and function in Turkish that it doesn't have in most other languages.
> My > understanding is that i (which is capitalized as I-dot) and dotless-i > (capitalized as I) are simply two different letters in Turkish.
 Signature athel
Mark Brader - 29 Sep 2010 23:13 GMT Athel Cornish-Bowden:
>>> There is also the problem as to what constitute diacritical marks, >>> to an English speaker the dot on an i is not a diacritical mark, >>> but to a Turkish speaker it is... Mark Brader:
>> Athel's right to say that one language may see a mark as a diacritical >> when another does not, but I don't think that example is right. Athel Cornish-Bowden:
> No. As so often, I spoke too soon. I think that Turkish dotted and > undotted are more like ñ and n in Spanish: two different letters. What > I meant was that the dot has a specific meaning and function in Turkish > that it doesn't have in most other languages. But this is also wrong: if they're just two separate letters, the dot doesn't have a meaning of its own. Rather, it's the combination of "I" and dot that together has a meaning. The dot in Turkish "i" is (as I understand it) like the diagonal stroke in Q or the Norwegian Ø: it's just a part of the letter without its own meaning or function, and the fact that if you removed it you'd get a different letter is incidental.
 Signature Mark Brader | "Some societies define themselves by being open to new Toronto | influences, others define their identity by resisting. msb@vex.net | In either case, they take the consequences." --Donna Richoux My text in this article is in the public domain.
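The "two different letters" view shows up clearly in case conversion: Turkish pairs i with İ and ı with I, whereas language-neutral case mapping pairs i with I. A minimal Python sketch, with the helper names `tr_upper` and `tr_lower` invented for this illustration:

```python
# U+0130 is capital I with dot above, U+0131 is small dotless i.
TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})
TR_LOWER = str.maketrans({"\u0130": "i", "I": "\u0131"})


def tr_upper(text):
    """Uppercase text using the Turkish pairing i/I-dot and dotless-i/I."""
    return text.translate(TR_UPPER).upper()


def tr_lower(text):
    """Lowercase text using the Turkish pairing I-dot/i and I/dotless-i."""
    return text.translate(TR_LOWER).lower()


print("i".upper())       # language-neutral mapping gives plain I
print(tr_upper("iki"))   # Turkish mapping keeps the dot on both capitals
print(tr_lower("ISI"))   # and plain capital I lowers to dotless i
```

The two Turkish letters are remapped first, so the ordinary case conversion that follows only has to deal with the remaining letters.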
Stan Brown - 14 Sep 2010 03:11 GMT > Here's an interesting resource: > [quoted text clipped - 6 lines] > > http://wals.info/feature/138?tg_format=map&v1=cd00&v2=c00d&v3=cccc&s=20&z3=3000&z2=2999&z1=2998 I see a map legend, but no map.
 Signature Stan Brown, Oak Road Systems, Tompkins County, New York, USA http://OakRoadSystems.com Shikata ga nai...