The main reason for writing this post is to introduce the idea of self-learning equalizers to the general public, and to find out about any misapplication of ML theory in my implementation of one.
The initial idea for self-learning equalizers came from my laziness: having to set up the equalizer every time I listen to a song of a different genre. This happens frequently if you are an avid listener of songs that span multiple genres like classical, trance, hip-hop, blues, etc. People want optimal sound; what they don't want is manual labor to get that optimal sound out of a computer. And they sure don't want a heated computer!
That's when it hit me: why not make the player learn this stuff and be done with it? There have been previous attempts, and there are even commercial products based on this idea. Again, I am too lazy to pay, especially for a closed commercial system that only runs where the vendor wants it to run.
My first attempt was to implement something deterministic: a function that gives you the correct band gains when you input the current buffer and the target band gains. It went well but wasn't that good, and it shipped with the first binary release of Vrok (well, as of writing, the second one is only available to me).
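The post doesn't show the deterministic function itself, so the following is only an illustrative guess at its general shape, not Vrok's actual code: start from the user's target gain for each band and pull it down in proportion to how loud that band already is in the current buffer. The function name, the `damping` constant, and the formula are all my own assumptions.

```python
def deterministic_gains(band_avgs, target_gains, damping=0.5):
    """Illustrative sketch only: for each band, take the user's target
    gain and subtract a term proportional to how loud that band
    already is in the current buffer."""
    return [t - damping * b for b, t in zip(band_avgs, target_gains)]
```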
My second attempt is to bring ML into this, which should hopefully make it better. The input to the ML system is the current buffer (B, which has 512 frames) and the output is shaped by the target gains (T, which can be set by the user). What the system needs to do is output T', a set of new gains that are suitable for listening. And there is one big fat assumption: "the user MUST change T whenever he dislikes the sound output". Without this, everything fails.

The implementation is quite simple, which I made deliberately so because CPU usage needs to stay below 3%. There is no separate learning stage and usage stage; the system does both at once. The hypothesis function being used is H(b) = θb, where b is the average input value for a given band, which makes the whole thing much easier to comprehend. As stated earlier, if the user doesn't like the sound he changes the target; the system then sees that as the new desired output and trains itself with the input data from the buffer. The learning is slow and clumsy; it works most of the time, but sometimes it fails to find the user's dislikes. For this, a separate weight was added to reduce the gain when the buffer's average for a given band is higher; the user may still raise the target gain if he or she wants a further boost of that band.
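The online scheme above can be sketched as a per-band linear hypothesis H(b) = θb trained with simple gradient (LMS-style) updates whenever the user changes T. This is a minimal sketch under my own assumptions, not Vrok's actual implementation: the band split via plain chunked averaging (a real equalizer would use an FFT or filter bank), the number of bands, and the learning rate are all illustrative choices.

```python
NUM_BANDS = 10
LEARNING_RATE = 0.01

# One weight theta per band: hypothesis H(b) = theta * b,
# where b is the average value of the buffer in that band.
theta = [1.0] * NUM_BANDS

def band_averages(buffer, num_bands=NUM_BANDS):
    """Split a 512-frame buffer into chunks and average each one.
    (Chunked averaging stands in for a real band split here.)"""
    chunk = len(buffer) // num_bands
    return [sum(buffer[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(num_bands)]

def predict(b):
    """T' = H(b): the suggested gain for each band."""
    return [theta[i] * b[i] for i in range(NUM_BANDS)]

def train(b, target):
    """One LMS-style step: when the user changes T, treat T as the
    desired output for the current band averages b."""
    for i in range(NUM_BANDS):
        error = target[i] - theta[i] * b[i]
        theta[i] += LEARNING_RATE * error * b[i]
```

Because training happens only on the rare occasions the user moves a slider, and prediction is one multiply per band, this kind of update easily stays within a small CPU budget.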
So in the end, this is the sum of a deterministic function and an ML-based solution.
EDIT: Vrok itself has changed a lot, and the results of testing this system turned out more negative than I expected, so this feature is currently removed from Vrok. If any readers know of ways to do this with less CPU time, suggestions are welcome.