Testing times with Arabic Windows 8 and Arabic eSpeak.

A visit to the Assistive Technology Industry Association 2013 conference, where the Microsoft team kindly showed me how we could work in Arabic and English, plus the arrival of our Dell tablet with Windows 8, has made us look at the issue of Qatari Arabic support in Windows in depth.

Qatari keyboard

Qatari Arabic language pack

We downloaded the language pack and changed the keyboard and all seemed well, but it appears from the email I received from their product advisor that there is no Windows Arabic voice at present.

“I researched the question to see if Windows 8 supports Arabic (namely Qatari dialect) text to speech. Unfortunately, at this time, Windows 8 does not support it. Only certain languages are included in the built in software.”

So back to the drawing board for the ATbar desktop option – Narrator is not going to speak in Arabic unless someone has found an Arabic Windows system with a well hidden free voice from Microsoft!   If anyone has found a solution to this problem please do let us know!

eSpeak

eSpeak logo

More research, and thanks to a recent development with Arabic eSpeak, we now have a free voice.  Testing has shown that the voice needs to be improved, but with future work on the phonetics this is something that could be done.  The aim is to ship NVDA with the ATbar desktop version and the Arabic eSpeak voice.  It will not really be an acceptable voice where a Nuance or Acapela option is available.

translation into Arabic

The Windows 8 mobile OS has the potential to support more Arabic options and offers translation from OCR, although the actual text is still not 100% correct – Spot the problem!

Nuance has a choice of Arabic voices for mobile and has added speech recognition, but none of our team has been able to test its success rates.  Google has also rolled out speech recognition in Arabic for Android phones.

We have been testing online speech recognition systems offered by Google Chrome and they really are not very successful in the Arabic dialects offered.  Below is an example of Speech Recognizer in Arabic.

speech recognizer

The TalkTyper system uses Speech Recognizer for speech recognition as well as text to speech – the latter uses a very good voice in Arabic – we are still exploring which voice is used but it sounds like Nuance Maged in Arabic.

Watch this space for updates next week linked to the ATbar desktop app and ATbar TTS.

Update on the Wiktionary issues for the Arabic ATbar dictionary

In the last blog Nawar mentioned the issues we are having with the Arabic version of Wiktionary and its presentation of definitions and alternative words when selecting text on Arabic web sites.  The Wiktionary pages do not appear to be as well organised in Arabic as they are in English.  They are incomplete and often return incorrect results or no results.

Arabic wiktionary homepage

In a previous blog we showed a diagram that highlighted the importance of organising the stems related to words along with the definitions taken from Wiktionary. The way the words are presented with their changing meanings is important, and Maraim and Nawar have been discussing the use of crowd sourcing to achieve a successful outcome. This is not something that can be done immediately if we want to make a useful dictionary that makes the most of open source software alongside content that is also open and accessible to all.

Maraim has written a blog about the subject in Arabic.  She explains the concept of crowd sourcing and provides examples of three different dictionaries – Lingoz, Wordia and Collins that have all used this technique to gather data.

Arabic Spell Checker

Maraim Masoud and I (I am Nawar Habib) have been aiming to improve the accuracy of the Arabic spell checker currently running on ATbar.  We have done some research into previous work in the area. The spell checker currently running is an ASpell instance using a word list of common Arabic words. It produces good spelling suggestions for long Arabic words (longer than 4 letters) because of the high diffusion between long Arabic words (which is probably true in any language). High diffusion means that a typing error in one word is unlikely to produce another correct word. Arabic roots, on the other hand, are 3 or 4-letter words, so a typing error (changing one letter or omitting a letter) would very likely produce another correct root or even another Arabic language construct such as a connective or preposition. Even if the word produced by the error is not an Arabic word, the spelling suggestions might sometimes be confusing for short words because of the many alternative possibilities.
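The idea of diffusion can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the tiny word list is made up for the example (it is not the ATbar or ASpell list), and `one_edit_apart` and `neighbours` are hypothetical helper names. It counts how many other words in a list lie a single substitution, insertion or deletion away from a given word. A short root tends to have many such neighbours, while a long word has few or none.

```python
def one_edit_apart(a, b):
    """True if b can be reached from a by one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ.
        return sum(x != y for x, y in zip(a, b)) == 1
    if len(a) > len(b):
        a, b = b, a
    # b is one letter longer: deleting some letter of b must give a.
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def neighbours(word, lexicon):
    """All lexicon entries that a single typing error could turn `word` into."""
    return [other for other in lexicon if one_edit_apart(word, other)]


# Toy word list, invented for this example only.
LEXICON = ["كتب", "كذب", "كسب", "كتم", "كلب", "قلب", "مستشفى", "مكتبة"]

short_root = neighbours("كتب", LEXICON)     # many close neighbours: low diffusion
long_word = neighbours("مستشفى", LEXICON)   # no close neighbours: high diffusion
```

With this toy list the three-letter root has four one-edit neighbours and the six-letter word has none, which is the behaviour described above.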

Ayaspell is a project aimed at producing an Arabic word list mainly for spell-checking purposes. The creators of Ayaspell also provide a Hunspell based spell checker equipped with their word list. The main issue with their work is that they used traditional Arabic dictionaries as their word source which contain Arabic words that are no longer used. This would confuse the spell checker and decrease the diffusion talked about above in this post. This is the only documented word list we have found and we did a brief test on the Hunspell implementation which did not show good results.

Hence to improve our spell-checker we should:

1- Make sure popular words are added to our word list (the ability to do that exists).
2- Hunspell and ASpell use phonetic codes to represent words as they sound when spoken. This helps in giving suggestions that are close not just in spelling but also in pronunciation. Arabic is quite different: Arabic words sound as they are written (with some exceptions such as confusing ة with ه, or ي with ى, or ى with ا, or أ with ا), hence spelling errors mostly happen accidentally (button proximity). The phonetic code should still be utilized in Arabic, but new methods should be added to calculate the distance between words more accurately (such as adding grammar checking).
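One possible way to take button proximity and the confusable letter pairs into account is a weighted edit distance. The sketch below is an assumption for illustration, not the method used in ATbar, ASpell or Hunspell: the key rows are a simplified version of the common Arabic keyboard layout, only same-row neighbours count as adjacent, the cost of 0.5 is arbitrary, and the names `substitution_cost` and `weighted_distance` are hypothetical.

```python
# Simplified key rows of a common Arabic keyboard layout (an assumption).
KEY_ROWS = [
    ["ض", "ص", "ث", "ق", "ف", "غ", "ع", "ه", "خ", "ح", "ج", "د"],
    ["ش", "س", "ي", "ب", "ل", "ا", "ت", "ن", "م", "ك", "ط"],
    ["ئ", "ء", "ؤ", "ر", "لا", "ى", "ة", "و", "ز", "ظ"],
]

# Keys that sit next to each other in the same row.
ADJACENT = set()
for row in KEY_ROWS:
    for left, right in zip(row, row[1:]):
        ADJACENT.add(frozenset((left, right)))

# Letter pairs the post lists as commonly confused.
CONFUSABLE = {
    frozenset(pair)
    for pair in [("ة", "ه"), ("ي", "ى"), ("ى", "ا"), ("أ", "ا")]
}


def substitution_cost(a, b):
    """Cheaper substitution for neighbouring keys and confusable letters."""
    if a == b:
        return 0.0
    pair = frozenset((a, b))
    if pair in CONFUSABLE or pair in ADJACENT:
        return 0.5  # arbitrary illustrative weight
    return 1.0


def weighted_distance(a, b):
    """Levenshtein distance with the weighted substitution cost above."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [float(i)]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1.0,                            # deletion
                cur[j - 1] + 1.0,                         # insertion
                prev[j - 1] + substitution_cost(ca, cb),  # substitution
            ))
        prev = cur
    return prev[-1]
```

Suggestions could then be ranked by this distance, so that a word differing only by ة/ه or by a neighbouring key is offered ahead of one that differs by an unrelated letter.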

We had a problem with Wiktionary’s service API. Wiktionary, when asked for a word definition, conducts an exact-match search on Arabic words, so if the submitted word has prefixes or suffixes or a definite article, the word will not be found. To solve this we are creating a light stemmer that operates as a preprocessor before the word is looked up in the dictionary. The light stemmer has a small CPU footprint because it does not use a word list (only grammar rules), unlike heavy stemmers, which use a word list to increase accuracy at the cost of performance.
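To show what a rule-only light stemmer might look like, here is a minimal sketch. It is not the ATbar stemmer: the affix lists are a small illustrative selection, the minimum stem length of three letters is an assumption, and `light_stem` is a hypothetical name. It strips at most one prefix and one suffix and never consults a word list.

```python
# Illustrative affix lists (assumptions), longest prefixes first.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]
MIN_STEM_LENGTH = 3  # assumed lower bound so short roots are left alone


def light_stem(word):
    """Strip one known prefix and one known suffix using rules only."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            word = word[:-len(suffix)]
            break
    return word


stem_with_article = light_stem("الكتاب")     # definite article removed
stem_with_both = light_stem("والمدرسة")      # conjunction, article and suffix removed
stem_of_root = light_stem("كتب")             # short root is left unchanged
```

The stemmed form would then be sent to the dictionary lookup in place of the selected word. Because the rules are blind to meaning, a light stemmer like this will sometimes strip letters that belong to the word itself, which is the accuracy trade-off mentioned above.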

ATbar has a choice of voices and a colour overlay plugin.

Arabic voice choices

We have been experimenting with voices on the ATbar as there has been some discussion about using a male voice as this may be more acceptable to some users.  We really would value your input into these thoughts.  The English version of the toolbar now has Lucy (F) and Peter (M) and the Arabic voices are now Leila (F) and Mehdi (M).  This additional service comes thanks to Insipio and the work Lars and Magnus carried out over the last few weeks.

The time taken for the text to speech to return from the server has been reduced from 4 to 2 seconds – we will monitor whether this has any other unforeseen consequences.

Magnus has also added another plugin as standard to the Arabic and English toolbars: a colour overlay plugin that will allow users to read websites with less glare. There is a choice of colours – cream, pink, pale blue and pale green.  We hope this will help those who have visual stress and find the glare of black on white hard to read, as well as those with other specific learning difficulties such as dyslexia.  If you are using the Chrome, Safari or Firefox browsers you will also be able to click through the overlay and even write in most forms.  Sadly Opera and Internet Explorer do not support a click through ability. There is a step by step guide on the ATbar wiki in English and Arabic.

pink overlay

wordpress with blue overlay

Looking to the coming months

Arabic Dictionary

Nawar has been working on a new dictionary that will offer users a word list that is more useful for Arabic speakers – it will be based on prefixes and suffixes with root words that may then link with Arabic Wiktionary if it exists, and he is hoping to adapt the way the results are presented. It is hoped this will be finished by the end of March 2013.

dictionary plan

CSS issues

Magnus is working on the CSS issues that occur with some Arabic websites – we have recently been running evaluations on a series of important sites to see how prevalent the problem actually is with the poor presentation of our dialog boxes.  I will be blogging about the particular issues and we will be illustrating the results of the changes as they happen.   It is possible to change the dialog box for individual sites as has been done in the past but this is not the answer as sites constantly change so we need to find a robust solution that works for all.  This will be finished by the end of January.

TTS free voices

TTS work has been on-going and Nawar has tried the Euler/Mbrola route which despite much experimentation has not been successful so far.  eSpeak experiments are ongoing for the desktop version and it is hoped that we can still find a solution for both desktop and  web based toolbar TTS functions by the end of March 2013.

Meetings with Mesar at ATSummit resulted in a discussion about NVDA being used for text to speech as well as a screen reader – in other words developing a way for the program to respond to selected text that has been visually highlighted as well as offer more options to reduce the verbosity for dyslexic users.

Arabic ATbar Desktop version (Windows XP, 7)

We want to have a free TTS for the Windows desktop ATbar when we link it to NVDA. At present the desktop version links to Narrator, which does not read in all applications but offers good selected text to speech and screen reading feedback in Wordpad, Notepad and Internet Explorer. It also works with all individual letters typed, as well as for all actions on the Windows desktop and with system operations. The help file has useful keyboard shortcuts.

The ATbar desktop version once installed launches at start up and has menu buttons for text to speech, coloured overlays, an onscreen keyboard and magnification as mentioned in our previous news update.

Desktop ATbar

There is an ATbar desktop download available.  There is now also a portable version of the desktop ATbar that can be used on USB pendrives – see the lower menu button on the ATbar website.  The work on the desktop version has been completed and is awaiting any comments from users.

Spell Checking Plugin Update

The spell checking plugin has been updated to further record spelling errors. It now also records the sentence containing the error to provide context for the spell checking service. However, in order to comply with the Data Protection Act 1998, we ask users if they would like to provide the data anonymously.

When spell checking is complete, the user is asked if they would like to submit anonymous usage data. This data is displayed to ensure they know what they are submitting.
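The consent step described here can be pictured with a short sketch. This is not the plugin's actual code (the plugin runs in the browser): it is a Python illustration in which the field names, `build_report`, `preview` and `submit_if_consented` are all hypothetical. The report carries no user identifier, the preview is what the user would be shown, and nothing is sent unless consent is given.

```python
import json


def build_report(misspelled, chosen, sentence):
    """Collect only the error, the chosen correction and its sentence."""
    return {"misspelled": misspelled, "chosen": chosen, "sentence": sentence}


def preview(report):
    """Readable text of exactly what would be submitted."""
    # ensure_ascii=False keeps Arabic text readable instead of \u escapes.
    return json.dumps(report, ensure_ascii=False, indent=2)


def submit_if_consented(report, consent, send):
    """Call `send(report)` only when the user has agreed."""
    if not consent:
        return False
    send(report)
    return True


report = build_report("مدرسه", "مدرسة", "ذهبت إلى مدرسه اليوم")
outbox = []  # stands in for the real network call in this sketch
declined = submit_if_consented(report, consent=False, send=outbox.append)
accepted = submit_if_consented(report, consent=True, send=outbox.append)
```

Passing the sending function in as a parameter keeps the consent check separate from the transport, so the same check can be exercised without any network access.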

Spell checking plugin - asking user to submit anonymous usage data

End of Year update

Spell Checking Service

The spell checking service has been updated and analysed by Nawar and one of the conclusions is that the error checking for long single words is relatively accurate without context.  However, with words that are small and typed incorrectly there are two problems.   One is that the word can be changed to another word that is not appropriate for the context but is spelt correctly, so the mistake is not picked up.  The second problem is that if one small error has been made in a short word there are often too many options as to how this word could be spelt.  The spell checker does not cope with grammatical errors and is unable to see the context of words.

ATbar spelling service

Magnus has found that because the spell checker does not ‘use’ any words around the error, he is having to develop a system that will record the words typed prior to the error and then capture a few words after the error.   This is not as easy as it sounds!  The service for correcting errors is in place, without the sentences, at present.
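As a rough picture of what capturing the words around an error involves, the sketch below takes an already tokenised sentence and returns a few words either side of the misspelling. It is an assumption for illustration only: the window of three words and the name `context_window` are hypothetical, and the hard part in a live text box (knowing when the user has finished typing the words that follow) is not addressed.

```python
def context_window(words, error_index, before=3, after=3):
    """Return the words just before and just after the misspelled word."""
    start = max(0, error_index - before)
    return {
        "before": words[start:error_index],
        "error": words[error_index],
        "after": words[error_index + 1:error_index + 1 + after],
    }


sentence = "I went to teh shop to buy bread".split()
window = context_window(sentence, sentence.index("teh"))
first_word_window = context_window(sentence, 0)
```

Slicing handles the edges without special cases: an error in the first word simply gets an empty "before" list.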

Server Side Support

All aspects of the websites and toolbar that required the move to ‘https’ have now been moved.  This may not appear to be important to users but it has been done to allow the ATbar and its services to be used on any secure sites such as banking services etc.  The ‘https’ is a way of telling people that the connection is encrypted and that the site is who it says it is – Magnus has obtained SSL certificates for the majority of our services – these will expire in 2015.  The ATbar and its services now sit on a new virtual server.  We are still looking at the possibility of having a redundant server in case the one we are using fails, but this is a costly exercise.

As part of this process all versions of ATbar are now automatically updated. 

For the latest version of ATbar please find it here: https://core.atbar.org/atbar/en/latest/atbar.min.js

Documentation

Documentation is available on a wiki and on GitHub.

Github screen shot

Instructions are available in Arabic and English

wiki screen shot

Dictionary

We have looked into possible alternative dictionaries instead of using Wiktionary.  Wiktionary has a very limited word list and poor definitions when used in Arabic. Of the freely available dictionaries, Word Reference looked promising as it has a comprehensive English to Arabic translation database which is also a dictionary. It has an API but sadly no Arabic > Arabic with definitions or even stems.

One of the problems we face is that true Arabic dictionaries are structured in a different way to western ones. Many of the dictionaries we have looked at include some stem information but lack the more comprehensive information required to help users (example).

We need to understand the use of the dictionary required on ATbar in order to be able to provide the correct service.  So any comments would be very welcome. 

Desktop ATbar

We have developed a Desktop ATbar with magnification, screen reading, colour overlay with screen ruler and an on-screen keyboard. It is still in the beta version and we are in the process of improving its accessibility such as tab order and icon improvements. It is hoped a final release will be available next week. The toolbar has been tested on Windows 7/8 and should be backward compatible – it has not been developed for the Mac OS.

Desktop ATbar

The code for the toolbar is open source and available for download from GitHub. We have included concise and comprehensive inline documentation between code segments. Several free open source libraries have been used as part of the project and adjusted to suit our needs.

Now, we are making sure the toolbar is easier to install and there are several issues to consider:

  • Anti-Viruses blocking the toolbar.
  • Installing newer versions of the bar on-top of old ones.
  • Making the bar easy to use with shortcuts while avoiding shortcut collisions.

Please do leave your comments on any items we have discussed.

All good wishes for the New Year.  Till 2013

Automatic updating of ATbar and Arabic Desktop Toolbar

Magnus has been working on moving ATbar onto new servers, which has meant that the ATkit framework has been re-engineered for greater stability and will now automatically provide all users with the latest version of ATbar with no need to manually update.   Magnus has also finished the first version of the spell checking service and this is now ready for testing.  There is a new wiki with plugin guides and more information about the ATbar and the ATkit framework.  On the Services page there is also a link to a forum for any questions that may arise.

Nawar Halabi has joined the team to help with our desktop toolbar. We are really grateful to have his expertise in Arabic and desktop programming. He has been developing the Arabic Desktop Toolbar.

Arabic Desktop Toolbar

The new Arabic toolbar is an open source Windows application built using the C# programming language on top of the .NET Framework. The main purpose of the toolbar is to provide a launch pad for the Ease of Access options provided by Windows and other accessibility applications to Arab desktop computer users.  We hope to gradually provide the same functions that are available on ATbar (which is for web users).

At present the toolbar is available for beta testing and includes four functions: an onscreen keyboard, a screen reader, a magnifier and colour overlays.  The first two use the Windows built-in onscreen keyboard and Narrator, and the other two are bespoke applications.  The window magnifier and the colour overlays can be adapted to suit the user.   There is the ability to open a preferences window where it is possible to customize  the size and behaviour of the magnifier (lens or docked); change the colour of overlays and change whether the toolbar windows are always on top (see screenshots).

Toolbar magnifier

Toolbar

Toolbar colour overlay

Preferences

Please help us to improve the toolbar by downloading the beta version from GitHub.

Observations so far:

  • A good Arabic screen reader is required to replace Windows Narrator. We are looking into the use of NVDA with the Windows Arabic voice.  There appear to be no freely available text to speech voices in Arabic at present.
  • The toolbar should be made in a way that accepts plugins (similar to the ATbar browser version) to make it easier to add new features.
  • The toolbar has been built to be accessible to any input device and could take any language support – at present it is only available in Arabic.   It can be used in High contrast mode and has its own desktop icon.

ATBar Services and wiki site now available – Spell Checker service developed for additional support.

ATbar services offers links to other parts of the ATkit such as the marketplace of plugins, news and statistics and a new area for services to improve the ATbar.

spell checker service in English

Arabic spell checker services

The Spell Checker service allows users to log in and adapt the spell checking feature on the toolbar by correcting words found in the spell checking dictionary and adding new options for the error correction list. This feature is available in Arabic and English and allows all those who log in to add suitable corrections for words that have been misspelled where no suitable correction has already been supplied. The alternative words provided by users will go into a moderated database.  Once checked the words will appear in the spell checker.
Dictionary plugin

The ATbar wiki has been set up to work in English and Arabic and will be where all the supporting information about the entire ATkit can be found, from guides to the framework.

Work is ongoing to produce guides for all the plugins that have been developed.  The guides for the standard toolbar plugins have been completed in English and are being translated into Arabic.

Arabic ATbar spell checking update

Magnus has added an extended Arabic dictionary to our spell checker, which has resulted in better error correction. This new dictionary is twenty times larger than the one used originally and builds on the original Aspell dictionary.  We are also able to supplement the database with additional words.

Arabic ATbar spell checker

Alaa has been testing the checker and noticed an error on the web page that we use for trying the toolbar.  This time the words offered as alternatives made sense and could be used when she was making mistakes.

Database for spelling errors

We now have a database that records the word that has been misspelled and saves the error alongside the word that has been chosen from the correction list, or notes the fact that the user has ignored the offered words.  The database handles all languages, but the words in Arabic appear incomprehensible to readers due to the UTF-8 coding.
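A minimal sketch of such a log, and of one common reason Arabic text can look unreadable when a database is inspected, is given below. Everything in it is an assumption for illustration: the table and column names are hypothetical, SQLite stands in for whatever database the service really uses, and reading UTF-8 bytes as Latin-1 is only one possible cause of the garbled display.

```python
import sqlite3


def create_log(conn):
    """Create a hypothetical table for spelling errors and user choices."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS spelling_log (
               id INTEGER PRIMARY KEY,
               language TEXT NOT NULL,
               misspelled TEXT NOT NULL,
               chosen TEXT,               -- NULL when suggestions were ignored
               ignored INTEGER NOT NULL
           )"""
    )


def record(conn, language, misspelled, chosen=None):
    """Store the error with the chosen correction, or mark it as ignored."""
    conn.execute(
        "INSERT INTO spelling_log (language, misspelled, chosen, ignored) "
        "VALUES (?, ?, ?, ?)",
        (language, misspelled, chosen, 1 if chosen is None else 0),
    )


conn = sqlite3.connect(":memory:")
create_log(conn)
record(conn, "ar", "مدرسه", "مدرسة")  # user picked a correction
record(conn, "en", "teh")             # user ignored the suggestions
rows = conn.execute(
    "SELECT misspelled, chosen, ignored FROM spelling_log ORDER BY id"
).fetchall()

# The same UTF-8 bytes read with the wrong encoding look like nonsense,
# but the stored data is intact and can be recovered.
raw = "مدرسة".encode("utf-8")
garbled = raw.decode("latin-1")
restored = garbled.encode("latin-1").decode("utf-8")
```

If the stored bytes are valid UTF-8, as in this sketch, the words themselves are not damaged. Only the tool used to view them needs to be told the right encoding.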

Word Prediction update

During the summer we noticed that AIType was cutting off the initial letters when used with Internet Explorer and then we had a complete collapse of the service for a short time. This caused some concern and when we tested the Windows desktop version of the software we could not reach the servers as quickly as expected.

AIType were amazingly quick in reassuring us that they had had some server issues but these were immediately rectified. We checked our servers and the problems happened to occur just as Magnus was updating our servers as well.

Happily the speed issues have been resolved and so have the problems that were occurring with Internet Explorer. Both the Arabic and English word prediction are working well at the moment, as has been illustrated in the graphic whilst writing this blog. I have been using the HTML view in WordPress. If you use the Visual mode in rich text editors there tends to be a problem with the dialog box with the word selections not appearing, possibly due to the JavaScript edit box overriding the word prediction, whereas a simple edit box always works. As many of the menu items on the rich text editor tool bar are not accessible via keyboard, this may not be an issue for those who do not use a mouse.

word prediction

Word prediction used with a simple edit box - HTML mode in WordPress


The image shown is a test of the program in Arabic
Arabic word prediction