You might have a somewhat unhealthy obsession with machine learning. I definitely do… I’ve been using myself as a personal lab rat for data science activities. It’s been rough, collecting data on myself and all, but the exciting part is that now I get to tell you about it.
Over the past couple of weeks, I’ve been logging my web searches, along with a rating of my experience while browsing. I did this with the intention of
Well, maybe not… Maybe I just did it for fun. As with most data science projects, a very large part of this torturous activity is actually collecting the data. I have had a lot of great ideas that simply weren’t feasible because of the data collection step. A couple of failed Arduino projects (tried the brain implant, didn’t work) are sitting somewhere, lost in the folders of one of my computers. Most of these brilliant ideas were related to my personal health or to automating something I do every day.
For example, I spent a solid month working on software that would help me sleep, because I’m an insomniac. So the first part was to come up with a way to monitor when I woke up, and when I went to sleep… And you’re going to laugh, but I used a webcam to train a neural network that could tell whether my eyes were open or closed.
Turns out, detecting open or closed eyes was actually really difficult, but I had a model that worked decently and I was ready to start using it. So what went wrong? Life is full of variables, some you can control and many you can’t. I made myself a test set, and the predictions weren’t even close, because I’m brutally inconsistent.
The idea of a completely software-based, high-level application for collecting some sort of data definitely appealed to me when I started my web-browsing project. I was going to build a web browser in Python that logged my search queries (typed into a search box in-browser) and asked me to rate my experience on a scale of one to ten every time I closed it,
At the very inception of the project, I ran into my first big hiccup. I could not for the life of me get any form of WebKit working in GTK+ 3. I nearly quit entirely rather than switch to a different GUI framework. In the end, I settled on Qt for two main reasons:
After building my minimalist browser (glad to be back in Chrome), I made the close button request a number and dump it to disk, and set up a little script to log my web-browsing habits.
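For the curious, here is a minimal sketch of what the logging side of that could look like. The JSON-lines file format and the names `log_session` and `load_sessions` are assumptions made up for illustration, not the code from the actual browser. The idea is that the close handler would call `log_session` with the session's queries and the number typed into the rating prompt.

```python
import json
import time
from pathlib import Path


def log_session(log_path, queries, rating):
    """Append one browser session (its queries and a 1-10 rating) as a JSON line."""
    if not 1 <= rating <= 10:
        raise ValueError("rating must be between 1 and 10")
    record = {
        "closed_at": time.time(),  # Unix timestamp of when the browser was closed
        "queries": list(queries),  # every search typed during the session
        "rating": int(rating),     # the number requested by the close button
    }
    # Append mode keeps earlier sessions intact; one JSON object per line.
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_sessions(log_path):
    """Read every logged session back as a list of dicts."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

An append-only file with one record per line is about the simplest thing that survives a crash mid-session, which is why I would reach for it in a throwaway project like this.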
After about two weeks, I was pretty confident in the amount of data I had, which was about 2,500 browser sessions, so I loaded my data into Python and decided on my model. I decided to use the TF-IDF transformer along with my good ole buddy,
After getting my model up and running, I was able to input a query and get a rough estimate of how my day was going at that time. The predictions were unusually negative: most came back below 5, and the highest I remember was 7.
Because I’m a cynical individual, and I tend to keep my browser open for hours on end, I rated my experience on a pretty personally biased scale…
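To give a feel for the query-to-rating idea, here is a small standard-library sketch. It is a stand-in rather than the model I actually trained: it weights words with TF-IDF, then predicts a rating as the similarity-weighted average of the ratings of past queries. All function names and the toy data are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a query and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(text, idf):
    """Unit-length TF-IDF vector as a sparse dict; unseen terms are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def predict_rating(query, documents, ratings, idf):
    """Similarity-weighted average of the ratings of past queries."""
    q = tfidf_vector(query, idf)
    sims = [cosine(q, tfidf_vector(d, idf)) for d in documents]
    total = sum(sims)
    if total == 0:
        # Nothing in common with any past query: fall back to the overall mean.
        return sum(ratings) / len(ratings)
    return sum(s * r for s, r in zip(sims, ratings)) / total


if __name__ == "__main__":
    # Toy data invented for this example, not taken from the real log.
    past_queries = ["python gtk webkit error", "qt webengine tutorial", "funny cat videos"]
    past_ratings = [2, 5, 8]
    idf = fit_idf(past_queries)
    print(predict_rating("gtk tutorial", past_queries, past_ratings, idf))
```

Note that a model like this can only ever echo the ratings it was fed. If every rating sits at the grumpy end of the scale, so will every prediction.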
After training this model, I can confidently say I was completely disappointed in myself. I did get some enjoyment out of predicting the other way around, guessing which phrases I Googled that made me so upset, but it wasn’t enough to redeem the time I spent on the project in the name of science. Like the other projects, this one was definitely a flop. Regardless of how poorly I think of the thing I created, I did really enjoy making it. Hopefully in the future I’ll be able to revisit this, or a similar idea, and make something cooler. Maybe I’ll pay someone to use the browser for me so I don’t have to.
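The "other way around" direction can be approximated without any model at all. The sketch below is one simple way to do it, assuming the data is just a list of queries with a rating each: it averages the rating of every query a term appears in and lists the terms with the lowest averages. The function names are hypothetical, and this is not necessarily how I did it at the time.

```python
import re
from collections import defaultdict


def mean_rating_by_term(queries, ratings, min_count=1):
    """Map each term to the mean rating of the queries that contain it."""
    totals = defaultdict(lambda: [0.0, 0])  # term -> [sum of ratings, count]
    for query, rating in zip(queries, ratings):
        # set() so a term repeated inside one query is only counted once
        for term in set(re.findall(r"[a-z0-9]+", query.lower())):
            totals[term][0] += rating
            totals[term][1] += 1
    return {t: s / n for t, (s, n) in totals.items() if n >= min_count}


def most_upsetting_terms(queries, ratings, k=5, min_count=1):
    """Return the k terms with the lowest mean rating (ties broken alphabetically)."""
    means = mean_rating_by_term(queries, ratings, min_count)
    return sorted(means, key=lambda t: (means[t], t))[:k]
```

Raising `min_count` is worth it on real data, since a word seen once in a single bad session would otherwise look like the root of all misery.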
It’s a painful experience, losing a project you loved so much because one day you woke up and said:
But failing is part of the fun. Data science is fun because of experimentation. Testing things out is always welcome, and never up for debate. Hopefully I can come to terms with this defeat and counteract it later down the line with more data, a better browser (likely written in Kivy), and less bias. Maybe one day I’ll revisit that old sleeping project. Either way, I’m excited to do whatever it is that I do.