Recent updates have made the toolbar more robust and more compatible with browsers. As a result, ATbar now works better with Google Docs, Facebook (except in Chrome at present) and Twitter – text to speech and word prediction work when you change the background look and feel using the painter’s palette (change page style). The colour overlay can be toggled on and off for all websites and supports click-through, so you can go to another site even if you have the coloured overlay in place.
Work on the style sheet issues has also meant that the dialog boxes in ATbar do not always take on the style sheet of the target website. This also saves time when implementing new plugins and adding new things to the toolbar: there are fewer style clashes, so it is easier to customise the toolbar for particular websites.
The latest version of the Festival Speech Synthesis System is now available as an option on the ATbar Market place website. The Festival plugin works in a similar way to the commercial Acapela TTS plugin.
ATbar now works with all the free speech synthesisers such as eSpeak and MBROLA, but the quality of some of the voices is still a challenge for most listeners.
Nawar Halabi has very kindly provided an introductory video of the Arabic version of ATbar and we have uploaded it to YouTube.
YouTube video overview of ATbar in Arabic
Nawar has also been testing the Arabic version as part of our maintenance programme. We have found that Arabic text is sometimes mis-aligned, and there are occasions when the CSS of the website needs to be isolated from the toolbar. Otherwise all the plugins appear to have worked well in the last few months.
Testing the Arabic dictionary
Testing the spell checker, text to speech and word prediction.
Where failures were reported, these were double checked and found to be due to the word not being a partial word or not being in the dictionaries – usually due to an English-speaking person trying to cut and paste Arabic words!
A visit to the Assistive Technology Industry Association 2013 conference, where the Microsoft team kindly showed me how we could work in Arabic and English, plus the arrival of our Dell tablet with Windows 8, has made us look in depth at the issue of Qatari Arabic support in Windows.
We downloaded the language pack and changed the keyboard and all seemed well, but it appears from the email I received from their product advisor that there is no Windows Arabic voice at present.
“I researched the question to see if Windows 8 supports Arabic (namely Qatari dialect) text to speech. Unfortunately, at this time, Windows 8 does not support it. Only certain languages are included in the built in software.”
So back to the drawing board for the ATbar desktop option – Narrator is not going to speak in Arabic unless someone has found an Arabic Windows system with a well hidden free voice from Microsoft! If anyone has found a solution to this problem please do let us know!
After more research, and thanks to a recent development with Arabic eSpeak, we now have a free voice. Testing has shown that the voice needs to be improved, but with future work on the phonetics this is something that could be done. The aim is to ship NVDA with the ATbar desktop version and the Arabic eSpeak voice. It will not really be an acceptable voice where a Nuance or Acapela option is available.
The Windows 8 mobile OS has the potential to support more Arabic options and offers translation from OCR although the actual text is still not 100% correct – Spot the problem!
We have been testing online speech recognition systems offered by Google Chrome and they really are not very successful in the Arabic dialects offered. Below is an example of Speech Recognizer in Arabic.
The TalkTyper system uses Speech Recognizer for speech recognition as well as text to speech – the latter uses a very good voice in Arabic – we are still exploring which voice is used but it sounds like Nuance Maged in Arabic.
Watch this spot for updates next week linked to the ATbar desktop app and ATbar TTS.
We have set up a series of YouTube videos that include:
Text resizing, font style changes and line spacing. This video has no audio but shows how a user can select the magnifier on the toolbar to enlarge text without resizing the graphics. This tends to give more readable text than zooming with the browser's Ctrl and + shortcut, which also enlarges the graphics. However, this feature does not work when Flash has been used within a webpage or when fonts have fixed sizes or styles. The same applies to increased line spacing, which is also demonstrated.
The second video demonstrates how the A.I.Type word prediction works as well as spell checking when writing a blog using WordPress. Use the HTML mode when working in the edit box rather than the Visual mode and then you will also be able to use the text to speech to aid proof reading.
In his recent updates to the toolbar, Seb has enabled the A.I.Type word prediction with keyboard access and text to speech for simple text boxes, in both Arabic and English.
The word prediction button needs to be selected before entering text. You can use the ‘Esc’ key to ignore a prediction and close the dialog box, or use Ctrl+Alt and the word's position as a number to insert the required word.
Word Prediction in WordPress
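The keyboard selection described above can be sketched roughly as follows. This is only an illustration, not the ATbar code: the function name `choose_prediction`, the key names and the assumption that word positions are counted from 1 are all ours.

```python
from typing import Optional, Sequence


def choose_prediction(predictions: Sequence[str], key: str) -> Optional[str]:
    """Return the word to insert, or None if nothing should be inserted.

    Hypothetical helper: `key` is the key pressed together with Ctrl+Alt
    (a digit), or "Escape" to dismiss the prediction dialog.
    """
    if key == "Escape":
        # The user ignored the predictions and closed the dialog.
        return None
    if len(key) == 1 and key in "123456789":
        index = int(key) - 1  # assumed: positions are shown starting at 1
        if index < len(predictions):
            return predictions[index]
    # Any other key, or a position with no word, inserts nothing.
    return None


words = ["the", "then", "there"]
print(choose_prediction(words, "2"))
print(choose_prediction(words, "Escape"))
```

With the sample list, pressing 2 picks the second word, while Esc or a position beyond the end of the list inserts nothing.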
We have found that the prediction and text to speech work with HTML views of text boxes in WordPress and Blogger but not the Visual mode which overrides the ATbar.
The text needs to be highlighted before the text to speech button is selected. There may be a pause before you hear the speech.
Over the summer the team have been investigating the issues around TTS in Arabic, and Edrees Abdu Alkinani has completed his MSc report, which has made interesting reading as it summarises many of the findings. It was noted that Arabic TTS synthesis did not have the early successes of European languages due to the limitations in Natural Language Processing (NLP) and the complexities of using diacritics as substitutes for vowel combinations. However, with the advances in NLP and Digital Signal Processing (DSP), plus automatic diacritizers, progress has been made in the commercial world, where there are now several attractive Arabic synthesised voices, as will be seen in an evaluation to follow.
Issue No 1 – Lack of diacritics on web pages.
The Learning Resource - Arabic language
English speakers may wonder at the reasons for the difficulties with Arabic TTS, but it takes no more than a cursory glance at the written language to see that, with 14 different diacritic marks and 34 phonemes (28 consonants and only six vowels), the combinations may cause TTS problems. As Edrees pointed out, “كُتُبْ” means books and “كَتَبَ” means wrote – the only difference you will notice is the type of marks used above the letters.
TEFL world wiki - English vowel sounds
Compare this with the 12 basic English vowel sounds, which are written with no accents or diacritics. We may complain about our odd pronunciation of some written words – rough, cough, though, thorough and through – but at least some of the letters are different and we cannot leave any out. Yet this is what is happening with written Arabic on the web: the diacritics are being left out. That is the number one problem for a text to speech engine.
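To see why missing diacritics are such a problem, here is a small Python sketch using the two words mentioned earlier. It is only an illustration: the helper name `strip_diacritics` is ours, and removing every Unicode combining mark (category Mn) is an assumed stand-in for what happens when a web author types Arabic without the marks.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include the Arabic harakat."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# The two words from the example, written with explicit code points.
books = "\u0643\u064F\u062A\u064F\u0628\u0652"  # kaf+damma, ta+damma, ba+sukun: "books"
wrote = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf+fatha, ta+fatha, ba+fatha: "wrote"

# With the marks the two strings differ; without them they are identical.
print(books == wrote)
print(strip_diacritics(books) == strip_diacritics(wrote))
```

Once the marks are gone, both words are the same three letters, so a TTS engine has to work out from the context which pronunciation was meant.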
Issue No 2 – The differences between the way the TTS is developed and the resulting output.
Research has shown that, although there are now a few text to speech engines, they are commercial and even these vary in quality. The MBROLA project links to work carried out in the open source world, but at present it has been impossible to achieve success with the code offered in the various repositories for evaluation purposes. However, Edrees has supplied the team with these comments based on the demonstrators offered by the various organisations and companies.
MBROLA project: MBROLA has two Arabic voices, available as a recorded audio file. The speed of speech is slow, and the quality poor. Moreover, the pronunciation is hard to understand – even for an Arabic speaker. The stress pattern is often incorrect and the distinction between words unclear. The most difficult words to understand have letters like “ أ” ‘A’, “ ض” ‘th’, “ ل” ‘L’.
Acapela Group: Acapela offers two good quality voices, one male and one female. The pronunciation for words with and without diacritic marks is understandable, with accurate stress patterns. There are three letters which appear to cause some difficulty: “ ج” ‘j’, “ ا” ‘a’, “ ك” ‘k’. The pronunciation of numbers in all situations is good.
Nuance Vocalizer: Nuance provide a very clear male voice with clear pronunciation. The only problem is that the system produces speech without taking into account diacritics. Words which have letters like “ ق” ‘q’, “ ش” ‘sh’, and “ ض” ‘th’ may cause problems, but the speed of speech used in the online demo is good. Numbers are not clearly enunciated due to the lack of diacritics.
Loquendo: Loquendo offer a recording of a male and a female voice on their site, as the Arabic voice has only been available since October 2010. The system has good sound quality and clear speech. The example on the website has diacritic marks, but as it is a small sample it is hard to judge the overall quality, although it appears to be good.
Issue No 3 – Further Development of eSpeak with Arabic.
The current version of MBROLA does not appear to run with the Arabic voice files and there seem to be very few people who have had success. So this is work in progress…
I have just been looking at conversion tools and will set up a category for links to other items that I come across. In this case it is an open source conversion tool called txt2audio – “WebTool to convert text to audio format content. This tool will allow users to post some text and take the output as mp3 file.” There is also a French option using Google Translate: Leknoppix on GitHub.