Category Archives: ATKit

End of Year update

Spell Checking Service

The spell checking service has been updated and analysed by Nawar. One of the conclusions is that error checking for long single words is relatively accurate without context.  However, short words that are typed incorrectly cause two problems.  The first is that the typed word can turn out to be another real word that is spelt correctly but is not appropriate for the context, so the mistake is not picked up.  The second is that if one small error has been made in a short word, there are often too many options as to how the word could be spelt.  The spell checker does not cope with grammatical errors and is unable to see the context of words.
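The short-word problem can be illustrated with a minimal sketch. This is not the ATbar spell checker's code: the `edits1` and `suggestions` functions and the tiny English `VOCAB` word list are assumptions made for illustration only. It generates every string one edit away from the typed word and keeps those found in the word list.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string that is one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def suggestions(word, vocabulary):
    """Known words that are exactly one edit away from `word`."""
    return sorted(edits1(word) & vocabulary)


# Hypothetical word list, for illustration only.
VOCAB = {"cat", "cot", "cut", "at", "act", "coat", "cart",
         "car", "can", "cap", "accessibility"}

short_options = suggestions("cst", VOCAB)          # short word, one error
long_options = suggestions("accessibilty", VOCAB)  # long word, one error
print(short_options)
print(long_options)
```

With this word list the short misspelling has three equally plausible corrections, while the long one has a single candidate. Without the surrounding words there is nothing to choose between the three.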

ATbar spelling service

Magnus has found that because the spell checker does not ‘use’ any of the words around the error, he is having to develop a system that will record the words typed prior to the error and then capture a few words after it.   This is not as easy as it sounds!  The service for correcting errors is in place, but at present it works without the surrounding sentences.
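The idea of recording words either side of an error can be sketched as follows. This is only an illustration under simple assumptions (the text is already split into words and the position of the error is known); the `context_window` function and its `before` and `after` parameters are hypothetical names, not part of the ATbar service.

```python
def context_window(words, error_index, before=3, after=2):
    """Return the words around the misspelt word at `error_index`."""
    start = max(0, error_index - before)  # do not run off the start of the text
    return {
        "before": words[start:error_index],
        "error": words[error_index],
        "after": words[error_index + 1:error_index + 1 + after],
    }


words = "I went to teh shop for bread".split()
ctx = context_window(words, 3)
print(ctx)
```

In a live toolbar the difficulty lies elsewhere: the words after the error do not exist yet when the error is typed, so the capture has to wait until the user has carried on writing.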

Server Side Support

All parts of the websites and toolbar that needed to move to ‘https’ have now been moved.  This may not appear to be important to users, but it has been done to allow the ATbar and its services to be used on secure sites such as banking services.  The ‘https’ is a way of telling people that you are a trusted source. Magnus has obtained SSL certificates for the majority of our services; these will expire in 2015.  The ATbar and its services now sit on a new virtual server.  We are still looking into the possibility of having a redundant server in case the one we are using fails, but this is a costly exercise.

As part of this process all versions of ATbar are now automatically updated. 

The latest version of ATbar is available here: https://core.atbar.org/atbar/en/latest/atbar.min.js

Documentation

Documentation is available on a wiki and on GitHub.

GitHub screen shot

Instructions are available in Arabic and English

wiki screen shot

Dictionary

We have looked into possible alternatives to Wiktionary as a dictionary.  Wiktionary has a very limited word list and poor definitions when used in Arabic. Of the freely available dictionaries, Word Reference looked promising as it has a comprehensive English to Arabic translation database which is also a dictionary. It has an API, but sadly it offers no Arabic to Arabic entries with definitions or even stems.

One of the problems we face is that true Arabic dictionaries are structured in a different way to western ones. Many of the dictionaries we have looked at include some stem information but lack the more comprehensive information required to help users (example).

We need to understand the use of the dictionary required on ATbar in order to be able to provide the correct service.  So any comments would be very welcome. 

Desktop ATbar

We have developed a Desktop ATbar with magnification, screen reading, colour overlay with screen ruler and an on-screen keyboard. It is still in beta and we are in the process of improving its accessibility, including the tab order and icons. It is hoped a final release will be available next week. The toolbar has been tested on Windows 7/8 and should be backward compatible; it has not been developed for Mac OS.


The code for the toolbar is open source and available for download from GitHub. We have included concise but comprehensive inline documentation between code segments. Several free open source libraries have been used as part of the project and adjusted to suit our needs.

We are now making the toolbar easier to install, and there are several issues to consider:

  • Anti-virus software blocking the toolbar.
  • Installing newer versions of the bar on top of old ones.
  • Making the bar easy to use with shortcuts while avoiding shortcut collisions.

Please do leave your comments on any items we have discussed.

All good wishes for the New Year.  Till 2013

Updates on the progress on Arabic spell checking, TTS, Word Prediction and the ATKit

The last few weeks since the Christmas break have flown by with a flurry of activity which in retrospect seems at times to have made us feel as if we have been going two steps forward only to have to go at least one, if not more, steps backward!  But there have been some breakthroughs in the areas of Spell Checking, Text to Speech, Word Prediction and the ATKit website.

Spell Checking

Thanks to Mashael AlKadi we have a really clear evaluation of the spell checker titled Dyslexic Typing Errors in Arabic (PDF download) and also thank you to Mina Monta who commented that:

  • “Some of the words are correct in spell & in the meaning but AT spell checker detect that those are wrong words
  • In the suggested word list, there is no sorting according to the priority of the suggested word (according to the relativity between the suggested word & the original wrong word)
  • Some of the suggested words are wrong in spell
  • The number of the suggested words is to high comparing with MS Word spell checker.
  • MS Word is better in detecting the wrong words in grammar (the word has correct spell) “
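Two of these points, the missing priority order and the length of the suggestion list, can be sketched together. The example below is an assumption for illustration and not the method used by the AT spell checker or by MS Word: it orders candidates by plain Levenshtein edit distance from the typed word and keeps only the closest few. The names `levenshtein`, `rank_suggestions` and `limit` are hypothetical, and the English words are used only to keep the example readable.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions
    needed to turn string `a` into string `b`."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def rank_suggestions(typed, candidates, limit=3):
    """Closest candidates first; ties are broken alphabetically."""
    ranked = sorted(candidates, key=lambda word: (levenshtein(typed, word), word))
    return ranked[:limit]


top = rank_suggestions("speling", ["sampling", "spending", "peeling", "spelling"])
print(top)
```

Here "spelling" comes first because it is one edit away, and "sampling" is dropped because it is three edits away. A production checker would probably also weight candidates by word frequency and by the error patterns found in the evaluation.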

Sadly, research into English spell checkers has revealed that they are not as accurate as we had hoped when it comes to avoiding false errors and catching real-word errors or homophones, as can be seen from this presentation about online spell checking.

I asked Mashael whether adding a new corpus would help as Seb has succeeded in collecting a larger Arabic corpus and has put in some code to make it possible to add this extended vocabulary.   However, Mashael’s comment was:

“regarding adding new words, do you mean expanding the tool’s dictionary? I don’t think you should worry beacuse it was working very well expect for certain remarks that I’ve said such as the tool’s behavior with words attached to prepositions. In such case only some adjustments should be applied to the tool’s mechanism and I think it will work great.”

So with the support of Erik and Mina in our last meeting, it has been decided that we will work on particular improvements as a future aim with the help of our Arabic speaking colleagues.

Text to Speech

It has been a bit of a trial and error period, starting with the withdrawal of Google Translate. We were aware this might happen, but had rather hoped there could be a reprieve as this was a free option, although in the tests carried out with 5 Arabic speaking students the results were poor in comparison to the Acapela and Vocalizer voices. It is also sad given the time spent on this work, as a free TTS on the toolbar was something we had proved was possible to achieve.  Microsoft Speak Method was also tried and tested, but the TTS appeared to leave off initial sounds and the voice was unacceptable to our beta testers.

We also learnt that NVDA in Arabic was only going to work with the Arabic TTS offered by Microsoft, and that eSpeak and Festival with the Mbrola project were still an uphill struggle.

As a research project and definitely not for profit we also wondered if we could go back to Google Translate but the agreement  specifically says  “The program may be used only by registered researchers and their teams, and access may not be shared with others.”

Meanwhile Fadwa Mohamad kindly visited King Abdulaziz City for Science and Technology (KACST) over the Christmas period and met Professor Ibrahim A. Almosallam, who has been in touch to say that they are developing an Arabic Text to Speech application, but it has yet to be released.  I am enquiring as to whether this is a desktop application or a VAAS (Voice as a Service) system such as that offered by Acapela in Arabic.

Seb then spent time working on the Acapela VAAS system and this was shown to work well in all the tests although there are issues when a whole page is read out.  It is felt that it might be more appropriate to restrict the call on the servers and just allow text to be highlighted and then spoken.  We now have to negotiate the way we can work with this system, as the final output needs to be free to the user.

There is also the option of building a new Arabic voice and this is being explored, although it would take time and effort to generate the corpus, normalise the output and beta test, even when there are engines available to achieve this aim. A newly built Arabic voice needs further discussion, but we have the connections in place.

Word Prediction

Seb has been able to show how this feature for the toolbar is possible in English and the background architecture is in place for the Arabic version pending the language pack.

ATKit website

It has been agreed that the mock-up of the ATKit website that was available as a demonstrator should be taken forward and developed.  This has been completed with the ability to add plugins, both free and those that require payment (for instance where a TTS requires a fee). Users can register, build their own toolbar and save the results.  The next step is a completed Arabic translation and the ability to author plugins …

Arabic ATKit