-=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- (c) WidthPadding Industries 1987 0|82|0 -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=- -=+=-
SoCoder -> Blogs Home -> Blogs


 
spinal
Created : 22 December 2008
 
System : Mac

Keyboard + dictionary...



I have been attempting to make a keyboard with sort of predictive text, by reading a dictionary file, but i'm failing.

source

I'm not sure if it's just taking too long to search the file, or if i'm just doing it wrong




Oh, well.

 

Comments


Monday, 22 December 2008, 03:59
Jayenkai
Yup, takes too long to scan..
Especially from a file!
And, mostly especially morely because you're checking "STRING" vs "STRING"
remember that "STRING" = 6 chrs.. so that's 6 checks at the most..
If you've
HAPP
vs
HAPPENING
HAPPILY
HAPPINESS
HAPPY
etc, then that's a whole lot of checks to do. *4 for each.

For Stringy Things, I converted everything into numbers.
1x32 bit integer can hold 6 letters..
(1-26 per chr, fits into 5 bits, 6*5 into 32 bits, with a couple of bits spare to suggest how long the word is.)
Longer words = more numbers, but since anything over 18 letters is getting stupid, I capped it at that.


Stringy Things' Shitz (Don't worry, it shrinks when it compiles!!)

checkword() usage..
Make CWrd, inwrd and RWrd string variables.
Place the word to Check into CWrd.
use checkword()
RWrd should now contain the word, if it's in the wordlist, or blank if it's not.

..
Nuts, I changed it..
There's a WFnd, in there, too!
Meh, you'll figure it out!

And anyway, that's just to check a word. If you want to do more than that, you'll need to play around with the bits (literally!) to see what other words there are in the list.

Let me know if you're really really stuck
Monday, 22 December 2008, 04:25
shroom_monk
If it's taking too long to search the file, what I'd do is split the dictionary up into 26 lists, one for each letter of the alphabet. Then, look at the first letter of the word, and scan that list, taking less time. If that's not enough, you could try 26*26 different lists, one for each combo of the first 2 letters of the alphabet.
Monday, 22 December 2008, 06:25
JL235
First remove the file IO, that should only ever be performed once. Then you need to try to make the time-complexity for searching for words as small as possibe. I'd recommend using a tree based data-structure for this.

I once saw a very fast dictionary built with a Trie.

Also indexing solely by the order of letters might work well as your typing, but it'll mess up if you want it to spell check too. i.e. if I type 'spallin' then it should offer me 'spelling'. That would be more useful then predictive words alone. For that I would look into indexing by both the order of characters and by techniques such as the levenshtein distance, in the same tree. i.e. so that from a node I can see all possiblities.

Maybe nodes could also be reference counted based on word usage so common words gotten through by certain paths are given a higher priority then less common words. i.e. Lets say I write 'col' and am then offered 'colour' and 'colonize'. If 9 out of 10 times I pick 'colour' then in the future 'colour' should be top of the list of possiblities.

Finally if it's slow building the trees then you might also want to look into some way you can pregenerate them and so load an already indexed and fully calculated data-structure.

|edit| Hmmmmmm, maybe I should try building this? |edit|
Monday, 22 December 2008, 06:31
Jayenkai
Actually, yeah. The tree thing's a good idea..
I wrote one of those, once, but as soon as Stringy Things tried hunting for a word with a "?" (Scrabble - Blank tile) it ended up having to hunt through the whole damn tree anyway.. And since the tree's so complex, it completely buggered it up, and took about 5 seconds to hunt through!
D'oh!
And it took ages to get it working in the first place!!

But, since you're not using blanks, a tree would most certainly be quicker.