Monthly Archives: August 2015

More Vocabulary lists and papers for conferences

The last two months have seen some members of the team taking time out: one headed off to carry out research at MIT and two introduced us to their new daughters! Other members of the team have been on holiday, not all to sunny climes!

However, the work has continued and, from a research perspective, we have been looking at a collection of Arabic core vocabularies to analyse the differences between our own Doha AAC lists and other lists of words frequently used on the web, in conversational situations and for language learning.

The Doha Arabic AAC lists are made up of a collection of the most commonly used words as collated by special needs teachers, therapists (e.g. speech therapists and occupational therapists) and parents. These lists also include the referents for symbols from AAC user workbooks, AAC devices, therapist progress notes of symbols worked on in therapy, and commonly used symbol signage around special needs centres and facilities.

The most frequently used Arabic words have come from several sources: individuals’ comments on the Aljazeera websites, which were often posted in colloquial Arabic and collected by Dr Wajdi Zaghouani; another list of words collected in lectures; the KELLY Project (Keywords for Language Learning for young and adults alike); and Buckwalter and Parkinson’s Frequency Dictionary of Arabic: Core Vocabulary for Learners.

There were also several lists based on words needed to encourage literacy skills, such as the Supreme Education Council standards (kindergarten and Grades 1, 2 and 3) and Ahmad Oweini and Katia Hazoury’s list of sight words, based on a collection of words gathered from popular reading books in Lebanon (grades K to 3).

On the English side, the word lists have come from the research collected early on in the project, linked to the work of Hill and Romich, Balandin and Iacono, Banajee et al., Van Tatenhove and Beukelman et al. Some frequency lists are based on the General Core Vocabulary (GCV) measure.

The analysis of these lists has been written up in a paper for the 6th Workshop on Speech and Language Processing for Assistive Technologies in Dresden, held as part of the larger Interspeech conference, and will be published after the event in November 2015. In essence, we took our Doha lists and compared them to the other collections to see whether there were any major differences and which words we also needed to include in our lists to develop symbols that would aid communication and literacy skills. We found not only that there were several differences in the vocabularies but also that, in comparison to the English lists, there were many more nouns.
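As a rough illustration of the kind of comparison described above, here is a minimal Python sketch. It is not the code used for the paper: the function name `compare_word_lists`, the list names and the sample words are all placeholders, and real Arabic lists would probably also need spelling and diacritic normalisation before matching.

```python
def compare_word_lists(own_list, reference_lists):
    """Map each word missing from own_list to the reference lists containing it."""
    # Light normalisation only (trim and lower-case); an assumption for this sketch.
    own = {word.strip().lower() for word in own_list}
    missing = {}
    for list_name, words in reference_lists.items():
        for word in words:
            key = word.strip().lower()
            if key and key not in own:
                missing.setdefault(key, set()).add(list_name)
    return missing


# Placeholder data, not taken from the Doha AAC lists.
doha = ["I", "want", "water", "house"]
references = {
    "frequency_dictionary": ["I", "want", "not", "water"],
    "sight_words": ["the", "not", "house"],
}

missing = compare_word_lists(doha, references)

# Words found in the most reference lists come first, as candidates for new symbols.
candidates = sorted(missing, key=lambda w: (-len(missing[w]), w))
print(candidates)
```

Ranking the missing words by how many other lists contain them is just one possible way of deciding which words to consider first.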

In English, Boenisch and Soto state that the use of nouns goes from 7% in the top 100 words to 20% in the top 300, whereas in MSA the corresponding frequency levels are 26% and 45% according to Buckwalter and Parkinson’s lists. When looking at the English AAC user list this appears to be true, but when looking at the Doha AAC lists there are many more nouns, and one has to wonder whether this is due to the make-up of the Arabic language or whether it is simply much easier to develop symbols related to concrete objects than to abstract feelings, concepts or happenings!
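For anyone wanting to try the same kind of count on their own lists, the share of nouns in the top N words can be worked out along these lines. This is only a sketch under assumptions: the function name `noun_share`, the part-of-speech labels and the toy word list are invented for illustration, and it presumes each word in a frequency-ranked list has already been tagged.

```python
def noun_share(ranked_words, top_n):
    """Return the percentage of nouns among the first top_n (word, pos) pairs."""
    top = ranked_words[:top_n]
    if not top:
        return 0.0
    noun_count = sum(1 for _, pos in top if pos == "noun")
    return 100.0 * noun_count / len(top)


# Toy frequency-ranked list with hypothetical part-of-speech tags.
ranked = [
    ("I", "pronoun"),
    ("want", "verb"),
    ("water", "noun"),
    ("go", "verb"),
    ("house", "noun"),
    ("book", "noun"),
]

print(noun_share(ranked, 4))  # share of nouns in the top 4 words
print(noun_share(ranked, 6))  # share of nouns in the top 6 words
```

Running the same function with 100 and 300 on a tagged list would give figures comparable to the percentages quoted above.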

Word cloud made with WordItOut

More analysis will need to be done in the coming months, but in the meantime the voting sessions continue with the acceptance of symbols, and this process was explained in another poster for the ASSETS 2015 conference. The support for literacy skills for Arabic AAC users will be the topic of a poster at Communication Matters in the UK, and a paper on our participatory approach to the development of the Arabic Symbol Dictionary will be presented at AAATE 2015, also in the first week of September 2015.


W. Zaghouani, “Critical Survey of the Freely Available Arabic Corpora,” In the Proceedings of the International Conference on Language Resources and Evaluation (LREC’2014), OSACT Workshop. Reykjavik, Iceland, 26-31 May 2014.

A. Kilgarriff, F. Charalabopoulou, M. Gavrilidou, J. B. Johannessen, S. Khalil, S. J. Kokkinakis and E. Volodina, “Corpus-based vocabulary lists for language learners for nine languages,” Language Resources and Evaluation, 1–43. 2013.

W. Zaghouani, B. Mohit, N. Habash, O. Obeid, N. Tomeh, and K. Oflazer, “Large-scale Arabic Error Annotation: Guidelines and Framework,” In the Proceedings of the International Conference on Language Resources and Evaluation (LREC’2014). Reykjavik, Iceland, 26-31 May 2014.

A. Oweini and K. Hazoury, “Towards a Sight Word List in Arabic,” International Review of Education, 56(4), 457–478. 2010.

K. Hill, and B. Romich, 100 Frequently Used Core Words. Accessed May 2015

K. Hill, and B. Romich, “A summary measure clinical report for characterizing AAC performance,” Proceedings of the RESNA ’01 Annual Conference, Reno, NV. pp 55-57. 2001.

J. Boenisch and G. Soto, “The Oral Core Vocabulary of Typically Developing English-Speaking School-Aged Children: Implications for AAC Practice,” Augmentative and Alternative Communication, pp. 77–84. 2015.

S. Balandin and T. Iacono, “A few well-chosen words,” Augmentative and Alternative Communication, 14(September), 147–161. 1998.

M. Banajee, C. DiCarlo, and S. Buras Stricklin, “Core Vocabulary Determination for Toddlers,” Augmentative and Alternative Communication, 19(2), 67–73. 2003.

D. R. Beukelman, K. M. Yorkston, M. Poblete, and C. Naranjo, “Frequency of Word Occurrence in Communication Samples Produced by Adult Communication Aid Users,” Journal of Speech and Hearing Disorders, 49(4), 360–367. 1984.

T. Buckwalter and D. Parkinson, “A frequency dictionary of Arabic: Core vocabulary for learners,” Routledge. 2014.

G. M. Van Tatenhove, “Building Language Competence With Students Using AAC Devices: Six Challenges,” Perspectives on Augmentative and Alternative Communication, 18(2), 38–47. 2009.

P. Hatch, L. Geist, and K. Erickson, “Teaching Core Vocabulary Words and Symbols to Students with Complex Communication Needs,” Presented at Assistive Technology Industry Association, 2015. Retrieved 19/2/2015 from (Accessed 14 June 2015).