Discussion Groups / ESL Teaching / October 2005

Frequently used English words list?

Xing Qiu - 24 Oct 2005 00:56 GMT
Hi,

    I just wonder whether there is a list of the 20,000 most frequently used
English words. I think my vocabulary is around 15,000 words and I have no
problem reading English literature in general.  However, from time to time I
meet unfamiliar words, so I think maybe I should simply spend some time
memorizing another couple of thousand words.  I searched the Internet but
couldn't find exactly what I want (most word lists I've found contain only
1,000~5,000 words).

Thanks,
Xing
John Ramsay - 24 Oct 2005 01:05 GMT
> Hi,
>
[quoted text clipped - 8 lines]
> Thanks,
> Xing

Thorndike and Lorge - the most common 30,000 words. Used to develop the
 vocabulary section of IQ test 1954

See article below.

Vocabulary Resources for Material Writers

From The Materials Writers Newsletter
The Newsletter of the Materials Writers' National Special Interest Group
of the Japan Association of Language Teachers
Vol. IV, No. 3, October 1996
John Bauman
Enterprise Training Group

Material written for ESL students needs to use somewhat simplified
vocabulary and structure if it is to be accessible to lower and
intermediate level students. In terms of vocabulary, a writer can try to
"keep it simple" while writing, but a more rigorous approach is to compare
a text with a list of words prepared for this purpose. A variety of lists
of words are available, as well as different ways to use them. In this
article, I will briefly list and describe some lists. I'll also discuss a
program that will analyze a text, and give some links for further
exploration of this topic on the internet. Links to sites mentioned are
given in the "Web Links" section at the end of this article.

Teaching and Learning Vocabulary (Nation 1990) contains a good general
discussion of this topic. Nation doesn't hesitate to quantify the issue.
His model of an ideal vocabulary teaching sequence starts with the most
frequent 2,000 words, which he calls general service vocabulary. Everybody
needs to know these words; they make up about 87% of an average written
text. After this point, general frequency becomes less useful as a guide
to what words to teach. Students are better off studying a list of words
specific to their field of interest or need, if one can be found. For the
student aiming at English-language higher education, Nation's 800-word
University Word List is appropriate. After this, the remaining vocabulary
of English is of too little frequency to merit direct study. Skills such as
analyzing word parts, context guessing, etc. can be taught.

The number of different words used will depend on the level of the text.
Writers of material for ESL learners also have to decide which words to
use, or, in a larger sense, to which population of words they should
restrict themselves. Here a list becomes necessary. Many have been
developed over the years. The following remain relevant.

The General Service List

The General Service List (GSL) (West 1953) is the specific list of 2,000
words that Nation refers to when he writes about the "first 2,000 words."
It's based on written texts, it's old, and it's not in frequency order,
though frequency numbers are given. The source of the frequency
information is even earlier than the publication date, being derived from
Thorndike and Lorge (1944). But the list was not compiled based on
frequency alone. It was created to be an ideal vocabulary for ESL students
to start out with. Through the 1970s, a lot of material, particularly
graded readers, was based on this list. Even today, much of this material
is sold and used. The GSL is out of print, and somewhat out of favor. The
list is available as a component of the Vocabprofile program described
below and, in a slightly different form, on this web page.

Thorndike and Lorge

The Teacher's Word Book of 30,000 Words (Thorndike and Lorge, 1944) was
created as a resource for elementary and high school teachers in the
United States. It is still frequently cited, though computer-produced
corpora have largely replaced it as an authority on the frequency of
words. For example, it's the source of the words above the 2,000 word
level in the vocabulary test in Nation (1990). It's old: it's based on a
compilation of pre-WW2, non-computerized word counts totaling about 18
million written words. As published, it's not in frequency order, but
frequency ranks are given for each word.

The University Word List

The University Word List (UWL) (in Nation, 1990) is a list of academic
vocabulary composed of about 800 words. It's designed for students who
plan to study in an English-language college or university. Essentially,
it's the most common 800 words in academic texts, excluding the 2,000
words of the GSL. This list is structurally linked to the GSL. A student
who studies the GSL, followed by the UWL, will find no repetition of
words. The list is divided into 11 parts. Part one has the greatest
frequency and range, part two next, etc. This list is also a component of
the Vocabprofile program.

The Brown Corpus

The Brown Corpus (Francis and Kucera, 1982) is the earliest computerized
study of English vocabulary. It is an analysis of 1 million words
published in the United States in 1961. It's also kind of old, but it's
more consistent in its definition of "word" (as a lemma) than the earlier
lists. The 1982 publication, which includes both alphabetical and
frequency order lists of the words, is a very useful resource.

The LOB Corpus

The LOB Corpus (Hofland and Johansson, 1982) is a study of 1 million words
of British text published in 1961. It was designed to be a British
counterpart to the Brown Corpus.

The Cambridge English Lexicon

The Cambridge English Lexicon (CEL) (Hindmarsh, 1980) is a list of 4,470
words, prepared with reference to the GSL, Thorndike and Lorge, Brown,
other sources, and the author's experience as an ESL teacher and material
developer. Each item is graded from 1 to 5. The most useful aspect of the
list is that the different meanings of the words are also graded on the
same scale. Only the CEL and the GSL give separate information on the
different meanings of common words (though, of course, dictionaries do
also). The GSL gives actual frequency numbers for the different meanings,
but the age of the data and the fact that it was gathered by hand may make
the CEL a more reliable source for an indication of the relative
importance to students of different meanings of words. The grading in the
CEL is not based solely on frequency.

Modern Corpora

These days, much is heard about corpora from dictionary publishers, who
all boast about the enormous corpora that their learner dictionaries are
based on. The British publishers are particularly enthusiastic about this,
using either the CoBuild corpus or the British National Corpus (BNC) as a
source of lexicographic information. Both of these corpora contain more
than 100 million words. Limited access to them is possible through the
internet; see the links on the Collocations Homepage listed below.
Depending on your purpose, it may be more useful to access these corpora
in pre-digested form through the dictionaries based on them. A lemmatized
frequency list of the BNC has been prepared by Adam Kilgarriff and is
available for FTP.

Vocabprofile

Vocabprofile is a freeware program for PCs that will compare a given text
with any properly formatted list. Up to three lists can be compared at a
time. The output will report what percent of the words in the text are on
each of the lists. It will also print the text with the words marked to
indicate which list they are on, or if they aren't on a list. Vocabprofile
is available for FTP at the URL below. The three lists that come with the
program are the first 1,000 words of the GSL, the second 1,000 words of
the GSL and the UWL.
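
As a rough illustration of the kind of comparison described above, here is
a minimal Python sketch. It is not Vocabprofile's actual code or file
format. The function names `tokenize` and `profile` and the tiny word sets
are invented for the example, and it matches exact word forms only, so
inflected forms such as "lists" would count as off-list unless they are
added to a list.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def profile(text, word_lists):
    """Return (percentages, marked_text) for `text` against named word lists.

    `word_lists` maps a list name to a set of lowercase words. Each token is
    credited to the first list that contains it, otherwise to "off-list".
    """
    tokens = tokenize(text)
    counts = {name: 0 for name in word_lists}
    counts["off-list"] = 0
    marked = []
    for tok in tokens:
        for name, words in word_lists.items():
            if tok in words:
                counts[name] += 1
                marked.append(f"{tok}[{name}]")
                break
        else:
            # No list contained this token.
            counts["off-list"] += 1
            marked.append(f"{tok}[off-list]")
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    percentages = {name: 100.0 * n / total for name, n in counts.items()}
    return percentages, " ".join(marked)

# Tiny made-up lists standing in for the real GSL and UWL files.
lists = {
    "first_1000": {"the", "a", "is", "of", "and", "to", "in"},
    "second_1000": {"simple", "list"},
    "uwl": {"analyze", "data"},
}
percentages, marked = profile("The list is simple to analyze", lists)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
print(marked)
```

To try it on real material, the sets in `lists` would be loaded from
whatever word list files are at hand, one set per list.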

Concluding Remarks

None of these resources is ideal. Thorndike and Lorge and the GSL are old,
old enough that the English of today surely differs significantly.
However, the core vocabulary of English changes more slowly, so at the
frequency level of the first 2,000 words this may be less of a problem.
The GSL offers some advantages as a standard. It was specifically designed
as a teaching vocabulary list. It has a long history of use, both in
teaching materials and in second language acquisition research. A program
to compare it with a given text is readily available. Of the lists above,
only the CEL was also compiled for the purpose of facilitating the
creation of teaching materials. It's more modern than the GSL, but appears
to have had less impact. It is not conveniently available for computerized
text comparison.

The Brown Corpus, the LOB Corpus and the lemmatized list from the BNC are
useful because they give the lists in frequency order. This allows a
population of words to be defined much more precisely, and individual
words to be compared with each other. But these lists were prepared for
linguistic research, not teachers. They're lists of lemmas, which means
that words are listed more than once if they can act as more than one part
of speech. Some derived forms are also considered as separate lemmas, such
as comparative and superlative forms of adjectives. These factors affect
both the frequency rankings of words and the number of words that appear
on a list. In other words, a list of 1,000 words taken from the GSL or CEL
would contain more than 1,000 lemmas. These corpus-based lists need
substantial adjustment to make them appropriate as vocabulary standards.
These adjustments have already been made to the GSL and CEL.
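
To make the effect of lemma-based counting concrete, the Python sketch
below merges entries that differ only in part of speech and re-ranks the
result. It is an illustration only: the function name `merge_lemmas` is
made up, the counts are invented rather than taken from any corpus, and it
does not handle derived forms such as comparatives, which would need a
separate mapping back to their base words.

```python
from collections import defaultdict

def merge_lemmas(lemma_entries):
    """Collapse (word, part_of_speech, frequency) entries into one entry per
    word, summing frequencies, and return (word, total) pairs sorted from
    most to least frequent (ties broken alphabetically)."""
    totals = defaultdict(int)
    for word, _pos, freq in lemma_entries:
        totals[word] += freq
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Invented counts for illustration only; not taken from any real corpus.
entries = [
    ("light", "noun", 300),
    ("light", "adjective", 150),
    ("light", "verb", 50),
    ("house", "noun", 400),
    ("run", "verb", 350),
    ("run", "noun", 100),
]
ranked = merge_lemmas(entries)
for word, total in ranked:
    print(word, total)
```

In this toy data, "house" is the most frequent single lemma, but once the
separate entries for "light" and "run" are combined it drops to third
place, and six lemma entries shrink to three words.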

An author of EFL material has many vocabulary options available. I hope
this discussion of resources is useful and that the bibliography and the
internet sites below will be helpful in finding the items that will serve
your specific needs.

Links to sites mentioned

Adam Kilgarriff
http://www.itri.brighton.ac.uk/~Adam.Kilgarriff/
Links to his lemmatized, frequency order version of the BNC are here.

John Higgins
http://www.marlodge.supanet.com/index.html
Here you can find Vocabprofile as well as links to other programs.

Bibliography

Francis, W.N. and Kucera, H. (1982). Frequency Analysis of English Usage.
Houghton Mifflin, Boston
Hindmarsh, R. (1980). Cambridge English Lexicon. Cambridge University
Press, Cambridge
Hofland, K. and Johansson, S. (1982). Word Frequencies in British and
American English. NAVF, Bergen
Nation, I.S.P. (1990). Teaching and Learning Vocabulary. Newbury House,
New York
Thorndike, E.L. and Lorge, I. (1944). The Teacher's Word Book of 30,000
Words. Teachers College, Columbia University, New York
West, M. (1953). A General Service List of English Words. Longman, London

CV - 24 Oct 2005 21:41 GMT
> Hi,
>
[quoted text clipped - 8 lines]
> Thanks,
> Xing

Hmmm, I wonder if there is anywhere on the web you can test your vocabulary,
like answering a quiz and getting an approximate word count as a result?
Might be interesting to try.
CV
Einde O'Callaghan - 24 Oct 2005 23:21 GMT
> Hi,
>
[quoted text clipped - 5 lines]
> couldn't find exactly what I want (most word lists I've found contain only
> 1,000~5,000 words).

I believe one of the commissions of the EU has developed lists of core
words and functions that are essential at various levels of language
acquisition for all the major EU languages. They are designed to help
prepare textbooks and as guidelines for foreign language examinations in
the different EU countries. I'm not certain whether they are available
on-line - I'll check during the next few days.

However, at your level of competence I don't think memorising a few
thousand random words is the solution to your "problem" - I don't really
think it IS a problem. I think it would be far more useful to read a
large quantity of general literature (and specialist literature in your
various areas of interest) and note the words you don't know. This will
enable you to develop your own list of vocabulary that is useful for you
to learn.

Regards, Einde O'Callaghan
Jan - 25 Oct 2005 08:16 GMT
Einde,

Are you talking about the Common European Framework?

If you are, then it is a rather vague list of 'can do' statements: a
functional view of the language. People in some countries have taken it
further for the different languages, but I don't know how far. And in
any case, functional language like this is apparently sometimes hard to
pin down. I heard of a Collins COBUILD experiment in which native
speakers were recorded in conversations that should bring up the
language of 'recommending' and 'advice' (talking about visiting a
holiday place). Never once did the speakers use 'should'!

Xing Qiu,

Why don't you just read a bit more in English? It's surely more fun
than learning an abstract list of words, especially because you get
more information about how the word is used. Words don't usually like
being alone: they hang out with the other words, contexts and register
they belong with.

Some learners I know make marks in their dictionary every time they
look up a word when they are reading: if a word has three or more
marks, they probably need to spend a bit more time trying to remember
it. And there are plenty of books now that use simple English, for
example the Penguin Readers, or the Oxford Bookworms.

Good luck!

Jan
Einde O'Callaghan - 25 Oct 2005 21:37 GMT
> Einde,
>
> Are you talking about the Common European Framework?

I think that's what it's called.

> If you are, then it is a rather vague list of 'can do' statements: a
> functional view of the language. People in some countries have taken it
[quoted text clipped - 4 lines]
> language of 'recommending' and 'advice' (talking about visiting a
> holiday place). Never once did the speakers use 'should'!

I'm sure I saw lists of English WORDS, not just functions, while
attending a presentation of a new series of books that were based on
this framework.

Regards, Einde O'Callaghan

P.S. I agree with your suggestion about reading and dictionary work.
credoquaabsurdum - 27 Oct 2005 03:58 GMT
Here in Greece, it's been the Common European Framework this and the
CEF that for the last two years. I paid forty bucks for the commission
report and still haven't gotten past page five.

Many, many publishers have jumped on the CEF bandwagon and promised
that their books are based on it. The recognized experts then disagree
and point out circuitously that the publishers are simple thieves bent
on commercial success and devil-take-the-hindmost (new information,
indeed). Quite frankly, for all practical purposes, it seems to be a
lot of dignified talk and no action, in the classic European
tradition...

The "can-do statements" might as well be called "I-think-I-can,
I-think-I-can statements," for all the real-world effectiveness we've
seen to date.

> > Einde,
> >
[quoted text clipped - 18 lines]
>
> P.S. I agree with your suggestion about reading and dictionary work.
 