
Arabic Speech Corpus shared by Dr. Nawar Halabi

Respond symbol with audio

If you have been using our Arabic symbols page you will have noticed that we have made every phoneme for our lexical entries available as a sound file, so that you can hear how it is pronounced. You can see the audio links at the bottom of the symbol for ‘respond’ in the picture beside this text. This can help those who have literacy difficulties as well as those who wish to learn Arabic.

Nawar, who has been part of our Tawasol Symbols project from the beginning while successfully completing his PhD, has made this possible with the development of an Arabic Speech Corpus with support from the University of Southampton and MicrolinkPC.

The synthesised speech output that results from this corpus is a very natural sounding voice, recorded using Levantine Arabic, as heard in and around Damascus. Levantine Arabic is considered one of the three main Arabic dialects and differs from Gulf Arabic in some aspects of grammar and pronunciation. However, when phonemes are read aloud they are often nearer Modern Standard Arabic, and when they are combined there is less dialectal impact.

The corpus has been made available for download as a zip file and is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. As the Arabic Speech Corpus website says, the package includes:

  • 1813 .wav files containing spoken utterances.
  • 1813 .lab files containing text utterances.
  • 1813 .TextGrid files containing the phoneme labels with time stamps of the boundaries where these occur in the .wav files. These files can be opened using Praat software.
  • phonetic-transcript.txt which has the form “[wav_filename]” “[Phoneme Sequence]” in every line.
  • orthographic-transcript.txt which has the form “[wav_filename]” “[Orthographic Transcript]” in every line. Orthography is in Buckwalter Format which is friendlier where there is software that does not read Arabic script. It can be easily converted back to Arabic.
  • There is an extra 18 minutes of fully annotated corpus (separate from above, but with the same structure as above) which was used to evaluate the corpus (see PhD thesis). Feel free to use this in your applications.
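
As a rough illustration of how the two transcript files might be read, here is a minimal Python sketch. It assumes each line holds two fields wrapped in straight double quotes, as the description above suggests, and it uses the commonly published Buckwalter table; the corpus may use a variant, so please check its documentation. The function names and the sample line are made up for illustration and are not part of the corpus.

```python
import re

# Commonly published Buckwalter-to-Arabic table (letters, then diacritics).
BUCKWALTER_TO_ARABIC = {
    "'": "\u0621", "|": "\u0622", ">": "\u0623", "&": "\u0624",
    "<": "\u0625", "}": "\u0626", "A": "\u0627", "b": "\u0628",
    "p": "\u0629", "t": "\u062A", "v": "\u062B", "j": "\u062C",
    "H": "\u062D", "x": "\u062E", "d": "\u062F", "*": "\u0630",
    "r": "\u0631", "z": "\u0632", "s": "\u0633", "$": "\u0634",
    "S": "\u0635", "D": "\u0636", "T": "\u0637", "Z": "\u0638",
    "E": "\u0639", "g": "\u063A", "_": "\u0640", "f": "\u0641",
    "q": "\u0642", "k": "\u0643", "l": "\u0644", "m": "\u0645",
    "n": "\u0646", "h": "\u0647", "w": "\u0648", "Y": "\u0649",
    "y": "\u064A", "F": "\u064B", "N": "\u064C", "K": "\u064D",
    "a": "\u064E", "u": "\u064F", "i": "\u0650", "~": "\u0651",
    "o": "\u0652", "`": "\u0670", "{": "\u0671",
}

# Assumed line layout: "wav_filename" "transcript"
LINE_PATTERN = re.compile(r'"([^"]*)"\s+"([^"]*)"')


def parse_transcript_line(line):
    """Split one transcript line into (wav_filename, transcript)."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unexpected transcript line: {line!r}")
    return match.group(1), match.group(2)


def buckwalter_to_arabic(text):
    """Map each Buckwalter character to Arabic script; keep anything else."""
    return "".join(BUCKWALTER_TO_ARABIC.get(ch, ch) for ch in text)


# Made-up sample line: 'ktAb' is the Buckwalter spelling of the word for 'book'.
wav_name, transcript = parse_transcript_line('"example.wav" "ktAb"')
arabic = buckwalter_to_arabic(transcript)
print(wav_name, arabic)
```

The same parser should work for phonetic-transcript.txt, where the second field would be left as a phoneme sequence rather than converted.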

Please contact Nawar Halabi by email for further information.

News from the ISAAC conference and recent work

ISAAC film festival poster

The ISAAC 2016 conference in Toronto has seen the launch of our film about Mohammed and his use of the Tawasol symbols for praying. The film illustrates the importance of personalising and localising communication charts to suit user needs. The setting takes you to Qatar and straight into a Doha home, where one can see the difference that listening to participants can make in this sort of project.

Share and Believe, A Symbolic Journey

Mohammed uses his Augmentative and Alternative Communication (AAC) aid to express his feelings about the Tawasol symbols and what he has achieved. We would like to thank him and his family for their support whilst we have been working to develop freely available symbols that can be used alongside any other symbol set but take into account Gulf and other Arabic cultural, religious and social settings. The team have been working in collaboration with AAC users, families, teachers and professionals in Doha, Qatar, and hope to offer many more symbols in the future that will also help those with literacy and language skill difficulties, as well as being useful for signage etc.

 

The team feel this has been one of the most important outcomes of the Arabic Symbol Dictionary: a freely available set of symbols that can work with any other symbol set to support Arabic AAC users and those with literacy skill difficulties, and that can be used in the local environment. We have worked hard with local participants to achieve a mix of Qatari and Arabic dress and religious culture, and to take into account social etiquette and sensitivities. Much more has to be done and we are working hard to increase the vocabulary in the coming months.

At the conference we were lucky enough to have two papers accepted and here are the PowerPoints that went with the presentations. The ISAAC Conference program provides links to the abstracts.

Core Vocabularies: Same or different for Bilingual Language Learning and Literacy Skill building with Symbols?

Developing an Arabic Symbol Dictionary for AAC users: Bridging the Cultural, Social and Linguistic Gap.

Finally, in the last few weeks we have been working with CommuniKate and Joe Reddington to add all our symbols to two general communication charts in English and Arabic, which can be personalised as the charts are built using PowerPoint slides. The system has been developed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License and we are very grateful for the support Joe and Kate have given us with the project.

The English test sample chart is available and is best seen using the Firefox browser. Here is a screen grab of the Arabic version, which is still being worked on as we want it to work with text to speech in the same way as the English version. When you select a symbol, the word appears in the window and the text to speech reads it out. At present the English version is using eSpeak, but we need to find a good Arabic voice and to get the sentence construction right, with the appropriate character and word changes as the symbols are selected.
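
To make the select-then-speak idea concrete, here is a small Python sketch. It is not how the CommuniKate charts are built (they run in a browser); it simply assumes the eSpeak command-line program is installed and uses its `-v` voice option. The function names and the sample words are hypothetical.

```python
import shutil
import subprocess


def build_espeak_command(words, voice="en"):
    """Join the selected symbol labels and build an eSpeak command line."""
    message = " ".join(words)
    return ["espeak", "-v", voice, message]


def speak(words, voice="en"):
    """Speak the message if the espeak program is on the PATH."""
    command = build_espeak_command(words, voice)
    if shutil.which("espeak") is None:
        return False  # eSpeak not installed, nothing is spoken
    subprocess.run(command, check=False)
    return True


selected = []  # labels of the symbols chosen so far (the message window)
for label in ["I", "want", "water"]:  # hypothetical selections
    selected.append(label)

command = build_espeak_command(selected)
print(command)
```

Calling `speak(selected)` would read the message aloud where eSpeak is available. Simply joining labels with spaces is exactly what falls short for Arabic, where word forms change with the sentence.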

Arabic Communication chart

Adding elements to symbols to enhance their use

AAC symbols need to be bespoke, personalised and relevant to the time of communication as well as the setting and task being undertaken. However, this is not always possible in the time available with on the spot conversations. Where there is time to adapt symbols, the process often has to be carried out in special programs. To overcome the need to search out these special programs or apps, Tom Lam has developed a very simple online application that allows those looking for symbols on our web site or from any other site to add elements to the original symbol. Changing the usage of a symbol to fit the needs of a particular language (Lundälv et al., 2006) is also important and may require arrows going in different directions, for example from left to right to denote the past in Arabic but the future in English.

They are sitting

Symbol Creator with a symbol for sitting used to make the phrase ‘They are sitting’

We provided examples of how this could be done in a previous blog and now you can experiment and develop your own symbols using the ‘Symbol Creator’ on the Tawasol symbols website. It is possible to add borders, background colours, text labels, arrows, and plus or minus symbols that can provide plurals or signs for more or less. Other symbols can be added in miniature on top of the first symbol to offer gender differences etc. As this is on the web, it is not possible to change the order of the items once they have been added, so the first item will go to the back and so on. But you can delete any of the symbols when you highlight them and re-upload them to get the order right! We are looking into how we can make this process easier.

Resizing is possible but the canvas has been set to 500×500 pixels to fit with the original size of all the Tawasol symbols.  However, you can save the results in several formats and carry out any other adaptations in other graphical packages.  Because the Symbol Creator is online it is important to save the final version as a download as soon as possible!   This process will wipe what has been done but you can always upload the image again.

Please do try the Symbol Creator and if you could fill in the quick survey to give us some guidance for making future improvements that would be wonderful. 

Although the tool will not offer all that can be achieved with a sophisticated commercial program, it will provide an instant method of adapting symbols.  There are other online options such as those offered by ARASAAC for symbol creation and phrase making. 

Of course, this is only the beginning of a process as Amy Speech and Language Inc demonstrate in their examples of  communication boards or stories for symbol users and Lessonpix has a sharing page that provides more resources.

text2picto screen grab

Text2Picto example of a text to symbol translator. CCL KU Leuven

The exciting bit is when one can generate text to symbol sentences that make sense, or symbol to text sentences that allow both the symbol user and their friends and family to communicate more easily across the airwaves! Have fun with the Text2Picto beta online text to symbol processor, and see Sevens et al. (2015) to learn more about the issues of sentence generation.

 

References

Lundälv M, Mühlenbock K, Farre B, Brännström A. SYMBERED – a Symbol-Concept Editing Tool. LREC – Language Resources and Evaluation Conference, Genoa, 2006, 1476-81.

Sevens L, Vandeghinste V, Schuurman I, Van Eynde F. Natural Language Generation from Pictographs. Proceedings of the 15th European Workshop on Natural Language Generation (ENLG 2015), Brighton, UK, 2015. http://picto.ccl.kuleuven.be/publications.html

Tawasol symbol website goes live

The Tawasol symbol website has been available for the last two months for beta testing.  There are still many updates and fixes to be done but now the site has been submitted to Google and can be found by searching for Tawasol Symbols!

We have been keeping statistics since October and, bearing in mind that we have all been working on the site, there are some figures to share. There have been 684 views, with 38% coming from new visitors and 62% from returning visitors. The visitors come from the following countries:

     Country          Sessions   % Sessions
1.   United Kingdom   62         52.54%
2.   Qatar            31         26.27%
3.   United States    16         13.56%
4.   Saudi Arabia     4          3.39%
5.   Brazil           2          1.69%
6.   France           1          0.85%
7.   Ireland          1          0.85%
8.   Japan            1          0.85%

There have been 21 downloads of symbol files from the home page, with more downloads of the Arabic files than of the English ones. Many of these will have been tests: 12 downloads came from the UK, 7 from Qatar and 2 from the USA. The downloads by file type were:

     Event Label   Total Events   % Total Events
1.   Arabic zip    12             57.14%
2.   Arabic rar    6              28.57%
3.   English zip   3              14.29%
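
As a quick check, the percentage columns in the two tables above can be reproduced from the raw counts. The short Python sketch below does this, using only the figures reported here; the function and variable names are ours.

```python
def percentage_shares(counts):
    """Return each count as a percentage of the total, to two decimal places."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


sessions = {
    "United Kingdom": 62, "Qatar": 31, "United States": 16,
    "Saudi Arabia": 4, "Brazil": 2, "France": 1, "Ireland": 1, "Japan": 1,
}
downloads = {"Arabic zip": 12, "Arabic rar": 6, "English zip": 3}

session_shares = percentage_shares(sessions)
download_shares = percentage_shares(downloads)

for label, share in session_shares.items():
    print(f"{label}: {share:.2f}%")
for label, share in download_shares.items():
    print(f"{label}: {share:.2f}%")
```

The session counts add up to 118 and the download events to 21, which matches the download total quoted above.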

We are still building the dictionary and the only entries seen on the Tawasol symbol website are those entries that have both Arabic and English lexical concepts.  The Symbol Dictionary Management system has many more entries that still require work.

Individual words or phrases can be searched for, or browsed via category selections. Once the symbols appear in the chosen language, they can be selected to see more information and their links to other symbols of similar meaning or to the entry in the other language. So a search for ‘camel’ will bring up the English choices, which then offer the choices in Arabic.

search for camel

Search for ‘camel’ in English to see the selection offered

camel with information

Select the camel that you want to see with further information relating to that lexical entry

camel with Arabic label

You are now viewing the Arabic lexical entry with the available information if you are using the English side of the website

Arabic version of website

The Arabic side of the website provides the user with a similar view.

In the coming months there will be over 500 Arabic / English lexical entries (with their appropriate symbols), covering the most commonly used words in both languages for AAC use and for spoken and written language learning. These words and phrases will be a combination of lists collected from AAC users in both languages and words collected by external researchers from speakers and written works and published as the most frequently used in both languages.

Tawasol Symbol Website Development

The website development has begun as more symbols are being added to the database, and over 120 have been accepted by participants voting via the symbol management system. These will be made available on the tawasolsymbols.org website when it is launched.

Sadly many of the web addresses linked to the use of the word ‘tawasol’ had been taken.  The team voted on a collection of addresses that could be used and it was decided that we should also have a re-direct from arabicsymbols.org.

Then we collected the options for website designs provided by Dana, our graphic designer, and added them to a Google form in order to have a voting session on which was considered the best option. See below…

Website sample

It turned out that Dana’s last version, No 4, came out top by 21 votes to 19, these being the sums across the different criteria. This has provided the basis for the wire frames that have now been submitted to the team for further comments.

The team decided that where possible in-house designed symbols should appear as guides to content.  Pages should be simple and short and work well on portable devices.

The responsive design and accessibility criteria have led to some restrictions, in particular on the width of presentation and the number of symbols that can be viewed at once. Two separate sites have been prepared, with English and Arabic on offer via a WordPress content management system, which means anyone with a login can update basic content.

wireframes for website

Issues with downloading symbol files were detected early on in the trials with emails being received from beta testers pointing out the corruption of the Arabic labels. This was resolved when it was discovered that in Windows the process of zipping data caused the corruption to occur – this did not appear to happen on iOS or Mac systems.  A .rar compression format is now offered as well and this has solved the problem.
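
We did not record the exact mechanism, but one common cause of this kind of corruption is a zip tool storing file names in a legacy code page without marking them as UTF-8. The Python sketch below is only an illustration under that assumption, and it also assumes the Arabic labels were used as file names. It writes an archive with an Arabic file name, checks that the UTF-8 flag (general purpose bit 11) is set, and shows what a tool that ignores the flag might display. The file name is a made-up example.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: the file name is stored as UTF-8

# Hypothetical symbol file labelled with the Arabic word for 'camel'.
arabic_name = "\u062C\u0645\u0644.png"

# Write the archive in memory; Python's zipfile sets the UTF-8 flag itself
# whenever a file name contains non-ASCII characters.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(arabic_name, b"placeholder image bytes")

# Read it back and inspect the stored name and flag.
with zipfile.ZipFile(buffer) as archive:
    info = archive.infolist()[0]
    name_read_back = info.filename
    flagged_utf8 = bool(info.flag_bits & UTF8_FLAG)

# What a tool might show if it ignored the flag and assumed code page 437.
garbled = arabic_name.encode("utf-8").decode("cp437")

print(name_read_back, flagged_utf8)
print(garbled)
```

If an archive lacks the flag, the receiving program has to guess the encoding, which is where labels in non-Latin scripts tend to go wrong.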

In-house beta testing revealed other, very basic issues, such as news not appearing and missing links, which were dealt with. The second phase of development could now start with the introduction of an API (application programming interface) to host the dictionary database and filtering system.