Saturday, 4 October 2014

Introduction to the fitness band experiment

This started as a silly idea. It was inspired by someone's project in a Coursera course (Data Analysis and Statistical Inference) that I evaluated as a volunteer: the other student had analyzed his own data from a Fitbit Flex. I almost forgot about it. Later, in another Coursera course (Getting and Cleaning Data, from the Data Science Specialization), we processed data in R, and that data also contained accelerometer readings from an experiment with several human subjects (see Human Activity Recognition Using Smartphones Data Set).

The idea came up again when I was discussing my dissertation topic with my supervisor, and we arrived at an experiment combining fitness bands with sentiment written on Twitter.


Experiment theory

So I defined a sentiment vs. measured data experiment and started with a pilot (only 1 human subject, me):
  • fitness band
    • wear the fitness band for 14 days
    • measure steps and sleep
    • try to keep on track with personal goals (steps per day, hours of sleep per day)
  • sentiment
    • tweet sentiment for 14 days
    • ideally write 25 tweets per day
    • describe sentiment as the activity I am doing at a specific time of day, with a focus on the feelings I connect with that activity
If this leads to something interesting, I would extend it to 30 people, split by gender into two groups of 15, and each group split by age segment (teens, 20s, 30s, 40s and 50s) with 3 human subjects per segment.

How it went in practice

At first I wanted to choose the Fitbit Flex, but I chose the Jawbone UP24 instead because it offers a smart wake-up alarm.

The experiment ran from the late afternoon of 18 August until noon on 1 September.

Tweets

I wrote 280 tweets, which is 20 per day on average, and 34,737 characters in total, which is 124 per tweet on average.
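A quick check of these averages can be sketched as follows. The totals come from the post and the 14-day length from the experiment design; the variable names are only illustrative. Dividing the characters by the number of tweets gives about 124, while dividing them by the number of days gives a much larger figure.

```python
# Totals reported in the post; DAYS is the planned length of the pilot.
TOTAL_TWEETS = 280
TOTAL_CHARS = 34_737
DAYS = 14

tweets_per_day = TOTAL_TWEETS / DAYS
chars_per_tweet = TOTAL_CHARS / TOTAL_TWEETS
chars_per_day = TOTAL_CHARS / DAYS

print(f"tweets per day:  {tweets_per_day:.0f}")
print(f"chars per tweet: {chars_per_tweet:.0f}")
print(f"chars per day:   {chars_per_day:.0f}")
```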

Steps and sleep

I took 94,591 steps in total, which is 6,756 per day on average. I slept 100 hours and 8 minutes in total, which is not bad: 7 hours and 9 minutes per night on average. For reference, my daily goals were 10,000 steps and 6.5 hours of sleep per night.
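The same kind of arithmetic, compared against the goals, might look like the sketch below. The totals and goals are the ones stated above; treating the pilot as exactly 14 days and nights is an assumption, and the names are illustrative.

```python
DAYS = 14                        # assumed: 14 full days and nights
TOTAL_STEPS = 94_591
TOTAL_SLEEP_MIN = 100 * 60 + 8   # 100 hours and 8 minutes
STEP_GOAL = 10_000               # steps per day
SLEEP_GOAL_MIN = int(6.5 * 60)   # 6.5 hours per night, in minutes

steps_per_day = TOTAL_STEPS / DAYS
sleep_min_per_night = round(TOTAL_SLEEP_MIN / DAYS)
sleep_h, sleep_m = divmod(sleep_min_per_night, 60)

# How far the averages are from the goals.
step_goal_share = steps_per_day / STEP_GOAL
sleep_surplus_min = sleep_min_per_night - SLEEP_GOAL_MIN

print(f"steps per day: {int(steps_per_day)} ({step_goal_share:.0%} of goal)")
print(f"sleep per night: {sleep_h} h {sleep_m} min ({sleep_surplus_min:+d} min vs. goal)")
```

So the sleep goal was met on average, while the step goal was missed by roughly a third.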

Bias

If you need to write 20 tweets per day and you sleep 6.5 hours per night, you have to write them in the remaining 17.5 hours. That means writing a tweet every 52.5 minutes, so there is some pressure to tweet as much as you can, and it is a little annoying.
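The interval is simply the waking time divided by the number of tweets. A minimal sketch (the function name is hypothetical) also shows what the original target of 25 tweets per day would have required:

```python
def minutes_between_tweets(tweets_per_day: int, sleep_hours: float) -> float:
    """Average gap between tweets if they are spread evenly over waking time."""
    awake_minutes = (24 - sleep_hours) * 60
    return awake_minutes / tweets_per_day

print(minutes_between_tweets(20, 6.5))  # 52.5, the pace actually achieved
print(minutes_between_tweets(25, 6.5))  # 42.0, the pace the 25-tweet target implies
```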

Writing sentiment is really difficult: what exactly does it mean to express something when you are limited to a maximum of 140 characters? And is it enough to describe an activity you are currently doing, have recently done, or will do in the near future, with a focus on feelings and emotions?

Another thing is that you try to present yourself as optimistic rather than pessimistic. So I constantly had a bad feeling that I kept writing only about the positive things happening to me.

Walking (and therefore the step count) is driven by the need to get somewhere. Movement is triggered by activity, so on a lazy day I didn't move much and I was also lazy about writing tweets. Some mornings had no steps at all and little or nothing on Twitter.

Improvements

I am thinking about using the Garmin Vívofit in the next experiment. It can measure not only steps with an accelerometer but also pulse, using a heart rate monitor connected via ANT+.

In that case, wearing a heart rate monitor for 14 days, including while sleeping, would be a real challenge. Another issue is that Garmin doesn't (yet) provide an API for reading detailed data from the web at hourly or finer granularity.

What's next

I am at the beginning. I have downloaded all the tweets via the Twitter API and turned them into a CSV file, which I will publish in a later blog post on this topic. I have started doing the same with the Jawbone data via the Jawbone API, with a pilot implementation that I still need to finalize so that it produces CSV files. Then I can follow up with sentiment extraction from the downloaded tweets, which I will most likely do with the NLTK Python library.
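As a rough illustration of the tweets-to-CSV step, here is a minimal sketch using only Python's standard library. It assumes the tweets are already downloaded into a list of dictionaries; the field names `created_at` and `text`, the two sample records, and the helper `tweets_to_csv` are placeholders for illustration, not the real export or the real tweets.

```python
import csv
import io

# Placeholder records standing in for downloaded tweets (not real data).
tweets = [
    {"created_at": "2014-08-18T17:05:00", "text": "Example tweet, with a comma"},
    {"created_at": "2014-08-18T18:10:00", "text": 'Another "example" tweet'},
]

def tweets_to_csv(rows, out):
    """Write tweets to a CSV stream, adding the character count of each tweet."""
    writer = csv.DictWriter(out, fieldnames=["created_at", "text", "length"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "length": len(row["text"])})

# Write to an in-memory buffer here; a real run would pass an open file instead.
buffer = io.StringIO()
tweets_to_csv(tweets, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Using `csv.DictWriter` rather than joining strings by hand matters here because tweets often contain commas and quotes, which the `csv` module escapes correctly.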
