Arabic Word Prediction

Modelling text prediction systems in low- and high-inflected languages

Abstract – Text prediction was initially proposed to help people with a low text composition speed compose messages faster. After the significant advances of recent years, text prediction methods can now benefit anyone entering text messages or commands, provided they are adequately integrated into the application's user interface. Text prediction methods are based on different statistical and linguistic properties of natural languages, so they depend heavily on the language concerned. To discuss general issues of text prediction, it is therefore necessary to propose abstract descriptions of the methods used. This paper presents a number of models applied to text prediction. Some are oriented to low-inflected languages, while others are for high-inflected languages. All of these models have been implemented and their results are compared. The models presented may be useful for future discussion. Finally, some comments are made on the comparison of previously published results.

TurboType

TurboType adds word prediction to all text editors. This is not just about selecting words from a dictionary: TurboType predicts and selects the word you are most likely to type. You can add your own words to the dictionary if you often use technical or specialized terms. TurboType can also expand a short word into a longer piece of text, so you can sign your email in a snap! The program is highly customizable.

Project:Possibility Word Predictor

Overview

The goal of this software is to let users with limited mobility input text into a computer more accurately and quickly. The software offers suggestions based on context and on the characters entered so far.

To implement word-level prediction, the application reads input text files. These can be all the documents on the user's local machine, all of their emails, or any other text that helps the application bootstrap its word-level contextual prediction.

The application maintains three sorted maps, whose keys are:

  1. A unigram map: individual words in the input text.
  2. A bigram map: pairs of consecutive words in the input text.
  3. A trigram map: sequences of three consecutive words in the input text.
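
The three maps can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: the function names `tokenize` and `build_ngram_maps`, the regular-expression tokenizer, and the choice to store occurrence counts as map values are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def build_ngram_maps(text):
    """Return (unigrams, bigrams, trigrams) as maps from n-gram to count."""
    tokens = tokenize(text)
    unigrams = Counter(tokens)
    # zip over shifted copies of the token list yields consecutive pairs/triples.
    bigrams = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return unigrams, bigrams, trigrams


unigrams, bigrams, trigrams = build_ngram_maps(
    "the cat sat on the mat and the cat slept"
)

# Python dicts are not sorted maps, so the keys are sorted on demand here.
for pair in sorted(bigrams):
    print(pair, bigrams[pair])
```

A sorted container would let prefix lookups run without a full scan; sorting on demand keeps the sketch short.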

Initially, the user enters some input text (a word or two). As the user types the letters of a word, the application presents them with predictions. Once a word or two has been entered, the user can ask the application to make predictions based on the usage context. The application searches the maps and presents a set of predictions based on the bigrams and trigrams formed by the text the user has entered. The user can accept or reject the suggestions. In both cases, the user's input is fed back into all the maps so that context-based prediction becomes more refined.
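
The interaction described above can be sketched as a small Python class. This is an illustrative sketch under stated assumptions, not the project's implementation: the class `WordPredictor` and its methods `learn`, `complete`, and `predict_next` are hypothetical names, and trying the two-word (trigram) context before falling back to the one-word (bigram) context is an assumed ranking strategy that the description does not specify.

```python
import bisect
from collections import Counter, defaultdict


class WordPredictor:
    def __init__(self):
        self.unigrams = Counter()
        # Maps a context of one or two words to a Counter of following words.
        self.followers = defaultdict(Counter)
        self.sorted_words = []  # sorted vocabulary for prefix lookups

    def learn(self, words):
        """Feed a sequence of words (training text or user input) into the maps."""
        for i, word in enumerate(words):
            if word not in self.unigrams:
                bisect.insort(self.sorted_words, word)
            self.unigrams[word] += 1
            if i >= 1:
                self.followers[(words[i - 1],)][word] += 1
            if i >= 2:
                self.followers[(words[i - 2], words[i - 1])][word] += 1

    def complete(self, prefix, limit=3):
        """Suggest known words that start with the letters typed so far."""
        start = bisect.bisect_left(self.sorted_words, prefix)
        matches = []
        for word in self.sorted_words[start:]:
            if not word.startswith(prefix):
                break
            matches.append(word)
        matches.sort(key=lambda w: (-self.unigrams[w], w))
        return matches[:limit]

    def predict_next(self, context, limit=3):
        """Suggest the next word: two-word context first, then one-word context."""
        for n in (2, 1):
            key = tuple(context[-n:])
            if len(key) == n and key in self.followers:
                ranked = sorted(
                    self.followers[key].items(), key=lambda kv: (-kv[1], kv[0])
                )
                return [word for word, _ in ranked[:limit]]
        return []


predictor = WordPredictor()
predictor.learn("the cat sat on the mat and the cat slept".split())
print(predictor.complete("s"))
print(predictor.predict_next(["on", "the"]))
# Whatever the user ends up typing is fed back in, refining later predictions.
predictor.learn(["the", "mat"])
```

Calling `learn` on the text the user finally enters, whether or not a suggestion was accepted, is one way to model the feedback step.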

Features

  • A demonstration of state-of-the-art word prediction technology that can be applied to other applications in the future

Current Issues

  • Slow startup/initialization time

Future Plans

  • Code cleanup
  • Integrate with non-OS specific “on-screen keyboards”
  • Ability to import impure text formats such as .doc and .html
  • GUI for importing files
  • Offer the word predictor as an XML-RPC service

1 thought on “Arabic Word Prediction”

  1. davidbanes

    Word prediction is an area we have struggled to find a solution for, but is clearly a major priority for us – across a range of needs, including learning disability and physical needs, we find that the lack of a good solution is a major “hole” in our toolkit. To date, the best we have been able to offer is the Google Scrib’d solution. All input gratefully received.