In the previous post, I mentioned that the next thing to be developed in the Aksesi project would be a management console. After publishing that post, I realized that it would be yet another boring application with 90% of its logic encapsulated in CRUD operations. When I decided to take part in the Get Noticed competition, my main goal was to learn new things. To make a long story short: the next step won't be a management console; it will be gesture recognition with Artificial Intelligence.
I would like to remind you that all of the sources are available in my GitHub repository. Moreover, in the README file you can find a short project description along with its main assumptions.
As far as I'm aware, the most suitable model for gesture recognition will be a neural network. I don't want to implement one from scratch, so I did some research and found two promising open-source frameworks. The first one is called Neuroph. I have used it once in a university project; unfortunately, it seems to be very slow. Its main competitor, considerably faster, is called Encog. One thing to be said for Encog is that it has a fantastic guidebook full of well-described examples. On top of that, its API seems to be more readable and user-friendly.
How many potential problems do I see? A lot. First of all, I know how a neural network works, but I'm not fluent in using one. It will probably force me to spend plenty of hours reading and analyzing the theory.
Secondly, there could be some problems with gesture representation. In what format should we pass a gesture to the neural network? Moreover, we need a big data set to train the network, so building some kind of gesture generator will probably be necessary.
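Such a generator could be as simple as taking a hand-drawn template gesture and producing noisy variants of it. Here is a minimal sketch of that idea; the class and method names are my own invention, not part of the Aksesi code base, and a point is assumed to be a two-element `double[]`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical training-data generator: takes a template gesture
// (a list of (x, y) points) and produces noisy variants by adding
// Gaussian jitter to every point.
public class GestureGenerator {

    private final Random random = new Random();

    /** Returns a copy of the template with Gaussian noise added to each point. */
    public List<double[]> jitter(List<double[]> template, double noise) {
        List<double[]> variant = new ArrayList<>();
        for (double[] p : template) {
            variant.add(new double[]{
                p[0] + random.nextGaussian() * noise,
                p[1] + random.nextGaussian() * noise
            });
        }
        return variant;
    }
}
```

Calling `jitter` a few hundred times per template gesture would already give the network a much richer training set than a handful of hand-drawn samples.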
Another problem will be scaling and normalization. A neural network requires a constant number of inputs, while gestures made by users do not always consist of the same number of points.
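One common way to deal with this, borrowed from gesture-recognition literature such as the $1 recognizer, is to resample the drawn path into a fixed number of evenly spaced points. The sketch below is only an illustration of that technique, not the final Aksesi implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Path resampling: whatever the raw gesture length, the output always
// contains exactly n points spaced evenly along the path, which gives
// the network a constant-size input.
public class GestureResampler {

    public static List<double[]> resample(List<double[]> points, int n) {
        double interval = pathLength(points) / (n - 1);
        List<double[]> result = new ArrayList<>();
        result.add(points.get(0).clone());
        List<double[]> work = new ArrayList<>(points);
        double accumulated = 0;
        for (int i = 1; i < work.size(); i++) {
            double[] prev = work.get(i - 1);
            double[] curr = work.get(i);
            double d = distance(prev, curr);
            if (d > 0 && accumulated + d >= interval) {
                double t = (interval - accumulated) / d;
                double[] q = {prev[0] + t * (curr[0] - prev[0]),
                              prev[1] + t * (curr[1] - prev[1])};
                result.add(q);
                work.add(i, q);      // continue measuring from the new point
                accumulated = 0;
            } else {
                accumulated += d;
            }
        }
        while (result.size() < n) {  // guard against floating-point drift
            result.add(points.get(points.size() - 1).clone());
        }
        return result;
    }

    private static double pathLength(List<double[]> pts) {
        double length = 0;
        for (int i = 1; i < pts.size(); i++) {
            length += distance(pts.get(i - 1), pts.get(i));
        }
        return length;
    }

    private static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }
}
```

After resampling, every gesture has exactly `n` points, so feeding the network `2 * n` input values (the x and y of each point) becomes straightforward.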
The next problem is connected with point coordinates. jQuery provides coordinates relative to the top-left corner of the document, so depending on where the drawing area is placed on the page, the points will have completely different (x, y) values. They will have to be scaled or shifted somehow.
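One way to make the page-absolute coordinates irrelevant is to normalize each gesture against its own bounding box: shift the points so the box starts at (0, 0) and scale them into a unit square. A minimal sketch of that idea (again, names are illustrative and a point is a `double[]` pair):

```java
import java.util.ArrayList;
import java.util.List;

// Coordinate normalization: translate the gesture so its bounding box
// starts at (0, 0) and scale it into a unit square. The absolute
// position of the drawing area on the page then no longer matters.
public class GestureNormalizer {

    public static List<double[]> normalize(List<double[]> points) {
        double minX = Double.MAX_VALUE, minY = Double.MAX_VALUE;
        double maxX = -Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
        for (double[] p : points) {
            minX = Math.min(minX, p[0]);
            minY = Math.min(minY, p[1]);
            maxX = Math.max(maxX, p[0]);
            maxY = Math.max(maxY, p[1]);
        }
        double width = Math.max(maxX - minX, 1e-9);   // avoid division by zero
        double height = Math.max(maxY - minY, 1e-9);  // for degenerate gestures
        List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            result.add(new double[]{(p[0] - minX) / width,
                                    (p[1] - minY) / height});
        }
        return result;
    }
}
```

An extra benefit is that normalized gestures are also size-independent: a small circle and a large circle end up looking the same to the network.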
From my point of view, there is a lot of work to do, but it's worth it. After implementing the network and feeding it with training data, Aksesi will be able to support a much wider range of gestures: not only simple lines but also circles, squares, triangles, and so on.