Commit df6e38a8 by Alexander Henkel

### work on baseline

parent 12b73e04
@@ -10,7 +10,11 @@ To evaluate my personalization approach I observe different factors and their im

\section{Supervised personalization}

The following experiments show theoretical aspects of the personalization process.

\subsection{Transfer learning based on ground truth data}

These experiments build a baseline for how a personalized model could perform if perfect labeling were possible. Additionally, the influence of label noise is analyzed. First, the base model is trained on the ground truth data of the training set for each participant. After that, the resulting model is evaluated with the test data and the given metrics. Then $n\%$ of the \textit{null} labels and $h\%$ of the hand wash labels are flipped. Again the base model is trained and evaluated with the new data. This is repeated over different values of $n$ and $h$.

The plots of \figref{arg1} and \figref{arg1} show the resulting evaluations over increasing noise. First we concentrate on a) and b) of \figref{arg1}. Here $n=0$ and noise is added only to the hand wash labels. We can see that noise values up to around $40\%-50\%$ have only a small impact on specificity and sensitivity. If the noise increases further, the sensitivity tends to decrease. For specificity there seems to be no trend; only the deflections become more extreme.
But models trained on additional data with up to ${\sim}70\%$ noise on the hand wash labels can still benefit in comparison to the general model. In contrast, noise on \textit{null} labels leads to much worse performance, as can be seen in the plots of c) and d). Values of $n$ below $0.1$ already lead to drastic decreases in specificity and sensitivity. To better illustrate this, \figref{arg1} shows plots of noise on \textit{null} labels in a range from $0\%$ to $10\%$. The specificity drops to ${\sim}0.55$ for $n<0.015$ and seems to converge to ${\sim}0.5$. All models trained on these labels achieve lower specificity than the general model. For noise values around $0-2\%$ the sensitivity can be higher than that of the general model, but for larger noise it also decreases drastically.

\figref{arg1} shows the resulting F1 and S scores. These clarify that noise on hand wash labels has only a minor impact on model performance, whereas even a small amount of noise on \textit{null} labels drastically reduces the performance, leading to personalized models that are worse than the base model. I attribute this to the imbalance of the labels. A typical daily recording of around 8 hours contains ${\sim}20,000$ labeled windows. A single hand wash action of $20$ seconds covers $15$ windows. If a user washed their hands $10$ times a day, this would yield $150$ \textit{hw} labels and $19,850$ \textit{null} labels. Even $50\%$ of noise on the \textit{hw} labels results in only ${\sim}0.3\%$ false labels overall, but already $1\%$ of flipped \textit{null} labels leads to ${\sim}56\%$ false hand wash labels. The flipped labels would therefore have a higher impact on the training than the original hand wash data.

\begin{itemize}
\item As baseline
\item Observe label flips in noise/hand wash sections $\rightarrow$ label noise
\end{itemize}
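The label-flipping step described above can be sketched as follows. This is a minimal illustration, not the actual experiment code: it assumes binary window labels with $0$ for \textit{null} and $1$ for hand wash, and the function name \texttt{flip\_labels} is hypothetical.

```python
import numpy as np


def flip_labels(labels, n_pct, h_pct, rng=None):
    """Simulate label noise: flip n_pct% of the null labels (0)
    and h_pct% of the hand wash labels (1)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = labels.copy()
    null_idx = np.flatnonzero(labels == 0)
    hw_idx = np.flatnonzero(labels == 1)
    # draw the requested fraction of each class without replacement
    flip_null = rng.choice(null_idx, size=int(len(null_idx) * n_pct / 100),
                           replace=False)
    flip_hw = rng.choice(hw_idx, size=int(len(hw_idx) * h_pct / 100),
                         replace=False)
    noisy[flip_null] = 1  # null -> hand wash
    noisy[flip_hw] = 0    # hand wash -> null
    return noisy
```

Sweeping $n$ and $h$ over a grid and retraining the base model on each noisy copy reproduces the evaluation loop described in the text.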
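The imbalance argument can be checked with the window and wash counts given in the text (the exact percentages depend on rounding and on whether one normalizes by all windows or by one class):

```python
total_windows = 20_000      # ~8 h daily recording
windows_per_wash = 15       # one 20 s hand wash action
washes_per_day = 10

hw_labels = washes_per_day * windows_per_wash   # 150 hand wash windows
null_labels = total_windows - hw_labels         # 19,850 null windows

# 50% noise on hand wash labels, as a share of all labels
hw_noise_share = 0.5 * hw_labels / total_windows            # well under 1%

# 1% of null labels flipped: share of false labels among all hw labels
flipped = 0.01 * null_labels
false_hw_share = flipped / (flipped + hw_labels)            # roughly 0.57
```

So the flipped \textit{null} windows outnumber the genuine hand wash windows, which matches the conclusion that they dominate the training signal.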