\chapter*{Abstract}
Wearable sensors like smartwatches offer an excellent opportunity for human activity recognition (HAR). They are available to a broad user base and can be used in everyday life. Due to the variety of users, the detection model must be able to recognize different movement patterns. Recent research has demonstrated that personalized recognition performs better than general recognition. However, additional labeled data from the user is required, which is time-consuming and labor-intensive to annotate. While common personalization approaches reduce the necessary labeled training data, the labeling process remains dependent on some user interaction.
In this work, I present a personalization approach in which training data labels are derived from implicit user feedback obtained during the normal use of a HAR application. The general model predicts labels, which are then refined by various denoising filters based on Convolutional Neural Networks and Autoencoders. The previously obtained user feedback assists this process. High-confidence data is then used to fine-tune the recognition model via transfer learning. No changes to the model architecture are required, so personalization can easily be added to an existing application.
Analysis in the context of hand wash detection demonstrates that a significant performance increase can be achieved. Moreover, I compare my approach with a traditional personalization method to confirm its robustness. Finally, I evaluate the process in a real-world experiment in which participants wear a smartwatch daily for a month.
\chapter{Zusammenfassung}
\chapter{Introduction}\label{chap:introduction}
Detecting and monitoring people's activities can be the basis for observing user behavior and well-being. Human Activity Recognition (HAR) is a growing research area in many fields like healthcare~\cite{Zhou2020Apr, Wang2019Dec}, elder care~\cite{Jalal2014Jul, Hong2008Dec}, fitness tracking~\cite{Nadeem2020Oct} or entertainment~\cite{Lara2012Nov}. In particular, technical improvements in wearable sensors like smartwatches allow integration into everyday life across a wide user base~\cite{Weiss2016Feb, Jobanputra2019Jan, Bulling2014Jan}.
One application scenario in healthcare is the observation of various diseases such as Obsessive-Compulsive Disorder (OCD). For example, the detection of hand washing activities can be used to derive the frequency or excessiveness that occurs in some people with OCD. Moreover, it is possible to diagnose and even treat such diseases outside a clinical setting~\cite{Ferreri2019Dec, Briffault2018May}. If excessive hand washing is detected, Just-in-Time Interventions can be presented to the user, offering enormous potential for promoting health behavior change~\cite{10.1007/s12160-016-9830-8}.
State-of-the-art Human Activity Recognition methods are supervised deep neural networks built on concepts like convolutional layers or Long Short-Term Memory (LSTM). These require large amounts of training data to achieve good performance. Since the movement patterns of each human are unique, the performance of activity detection can differ, so training data from a wide variety of humans is necessary to generalize to new users. Accordingly, it has been shown that personalized models achieve better accuracy than user-independent models~\cite{Hossain2019Jul, Lin2020Mar}.
To personalize a model, retraining on new unseen sensor data is necessary. Obtaining the ground truth labels is crucial for most deep learning techniques. However, the annotation process is time- and cost-intensive. Typically, training data is labeled by hand in controlled environments; in a real-world scenario, the user would have to take over the major part.
This would require considerable user interaction and a certain expertise, which contradicts usability.
There has been various research on preprocessing data to make it usable for training. A good trade-off turned out to be semi-supervised learning or active learning, where a general base model labels the data and, in uncertain cases, relies on user interaction~\cite{siirtola2019importance, Siirtola2019Nov}. Here a small amount of labeled data is combined with a larger unlabeled part to improve the detection model. However, some explicit user interaction is still required for personalization, so there is an overhead in the usage of a HAR application.
My work aims to personalize a detection model without increasing user interaction. Information for labeling is drawn from indicators that arise during the use of the application. These can be derived from user feedback to actions triggered by the predictions of the underlying recognition model. Moreover, personalization should be an additional, separate component, so no change to the model architecture is required.
At first, all new unseen sensor data is labeled by the same general model that is used for activity recognition. These model predictions are corrected to a certain extent by pre-trained filters, and high-confidence labels are considered for personalization. In addition, the previously obtained indicators are used to further refine the data and generate a valid training set. Therefore the process of manual labeling can be skipped and replaced by an automatic combination of the available indications. With the newly collected and labeled training data, the previous model can be fine-tuned in an incremental learning approach~\cite{Amrani2021Jan, Siirtola2019May, Sztyler2017Mar}. For neural networks, it has been shown that transfer learning offers high performance with decent computation time~\cite{Chen2020Apr}. In combination, this leads to a personalized model with improved performance in detecting specific gestures of an individual user.
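The confidence-based selection step of this pipeline can be sketched as follows. This is a minimal illustration only: the thresholds of $0.9$/$0.1$ and the binary \textit{hw}/\textit{null} labeling are assumed values, not the exact configuration of the filters used in this work.

```python
def select_pseudo_labels(probs, pos_thresh=0.9, neg_thresh=0.1):
    """Keep only the general model's high-confidence predictions as
    pseudo labels for fine-tuning; uncertain windows are discarded.
    `probs` is the predicted hand-wash probability per window.
    Thresholds are illustrative, not the values used in the thesis."""
    idx, labels = [], []
    for i, p in enumerate(probs):
        if p >= pos_thresh:
            idx.append(i)
            labels.append(1)  # confident hand-wash window
        elif p <= neg_thresh:
            idx.append(i)
            labels.append(0)  # confident null window
    return idx, labels

# toy predictions of the general model over eight windows
probs = [0.02, 0.05, 0.95, 0.98, 0.50, 0.60, 0.03, 0.92]
idx, labels = select_pseudo_labels(probs)
print(idx)     # windows kept for the personalization training set
print(labels)  # their pseudo labels
```

The two windows with probabilities $0.50$ and $0.60$ are dropped entirely; in the full approach, such uncertain regions are additionally refined by the denoising filters and the user-feedback indicators before fine-tuning.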
I applied the described personalization process to a hand washing detection application that is used to observe the behavior of OCD patients. During the observation, the user answers requested evaluations whenever the application detects hand washing. For mispredictions, the user has the opportunity to reject evaluations. Depending on how the user reacts to the evaluations, conclusions are drawn about the correctness of the predictions, which yields the required indicators.
The contributions of my work are as follows:
\begin{itemize}
\item [1.] A personalization approach which can be added to an existing HAR application and does not require additional user interaction or changes in the model architecture.
\item [2.] Different indicator-assisted refinement methods for generated labels, based on Convolutional networks and Fully Connected Autoencoders.
\item [3.] Demonstration that a personalized model resulting from this approach outperforms the general model and can achieve similar performance to a supervised personalization.
\item [4.] Comparison to a common active learning method.
\item [5.] Presentation of real-world experiments which confirm applicability to a broad user base.
\end{itemize}
\todo{structure of this work}
\chapter{Discussion}\label{chap:Discussion}
The experiments of the previous section provide several insights into personalization in the context of hand washing.
\begin{itemize}
\item[1.] \textbf{Personalization leads to a performance increase.}
As the experiments of \secref{sec:expSupervisedPerso} have shown, retraining the general model with personal data positively impacts the detection performance. Using ground truth data increases the F1 score by ${\sim}0.207$ and the S score by ${\sim}0.076$ on average.
\item[2.] \textbf{The influence of label noise depends on the unbalanced data.}
Since most activities in daily usage of the application are non-hand-washing, the resulting dataset used for personalization is quite imbalanced. Therefore the pseudo labels for \textit{null} and \textit{hw} have a different impact on learning. Even a few \textit{null} samples incorrectly labeled as hand washing lead to significant performance decreases, whereas wrong \textit{null} labels do not stand out among the many other correct labels, so several such labels can occur without strong performance changes.
\item[3.] \textbf{The use of soft labels makes the training more resistant to label noise.}
Soft labels that can depict uncertainty reduce the fitting to errors. In particular, wrong hand washing labels with lower class affiliations yield a better-performing model than their hardened values. However, smoothing correct labels can also have a negative impact.
\item[4.] \textbf{Pseudo labels must be filtered and denoised.}
Relying only on labels predicted by the general model as training data results in a personalized model that is worse than the general model. Even the inclusion of user feedback alone is not enough to achieve higher performance. Only a wide variety of samples containing no false-positive labels achieves higher performance than the general model.
\item[5.] \textbf{Pseudo-labeled data can reach nearly supervised performance.}
The combination of denoising filters and user feedback generates training data that results in a personalized model reaching similar F1 and S scores to supervised training.
\item[6.] \textbf{Missing feedback has only a minor impact.} Most of the filter configurations are robust against missing \textit{false} indicators. Moreover, they achieve similar performance with just $40\%$ of the \textit{correct} indicators.
\item[7.] \textbf{This personalization approach outperforms active learning.} It achieves a higher S score with a similar amount of user interaction.
\end{itemize}
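Why a softened wrong label damps its error signal (insight 3 above) can be seen directly from the binary cross-entropy. The probabilities below are invented for illustration and are not values from the experiments.

```python
import math

def cross_entropy(target, p):
    """Binary cross-entropy of predicted hand-wash probability p
    against a (possibly soft) target label."""
    p = min(max(p, 1e-7), 1 - 1e-7)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

# The model is fairly sure this window is *null* (p = 0.1),
# but the pseudo label wrongly marks it as hand washing.
p = 0.1
loss_hard = cross_entropy(1.0, p)  # hard wrong label: full error signal
loss_soft = cross_entropy(0.6, p)  # soft wrong label: damped error signal
print(loss_hard, loss_soft)
```

The hard wrong label produces a loss of about $2.30$, while the softened label of $0.6$ reduces it to about $1.42$, so the model is pulled less strongly toward the erroneous target.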
The real-world experiment summarizes these findings and combines the different aspects to achieve the best possible personalization. The pillar of this approach is the ability to evaluate various personalized models and compare them. The quality estimation makes it possible to find the best personalized model for each new recording. Therefore erroneous data that would lead to a heavily noisy training set can be detected and filtered out. Since the best-performing personalizations mostly depend on just a small amount of additional training data, it is sufficient if, among several days of recordings, only a few well-usable ones exist.
\extend{When full experiments are done}
\section{Future work}
The performance of the personalization heavily depends on the quality of the pseudo-labels. Therefore the filter configurations used for denoising them have a significant impact. More work on hyper-parameter tuning can lead to further improvements.
Additionally, other sources of indicators can be considered. For example, Bluetooth beacons can be placed at the sinks. The distance between the watch and the sink can be estimated if the watch is within range. A short distance indicates that the user is probably washing their hands. This indicator can be handled similarly to \textit{manual} feedback.
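Such a beacon-based indicator could be derived roughly as sketched below, using the common log-distance path-loss model for Bluetooth RSSI. The calibrated 1\,m RSSI of $-59$\,dBm, the path-loss exponent of $2.0$, and the $1.5$\,m "at the sink" cutoff are purely illustrative assumptions.

```python
def beacon_distance(rssi_dbm, tx_power_dbm=-59.0, path_loss_exp=2.0):
    """Rough distance estimate (metres) from a Bluetooth beacon's RSSI
    using the log-distance path-loss model. tx_power_dbm is the RSSI
    expected at 1 m; both parameters are illustrative assumptions and
    would need per-device calibration in practice."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exp))

def at_sink(rssi_dbm, max_dist_m=1.5):
    """Hypothetical indicator: treat the user as 'at the sink' when the
    estimated beacon distance falls below the cutoff."""
    return beacon_distance(rssi_dbm) <= max_dist_m

print(at_sink(-59))  # RSSI near the 1 m calibration value
print(at_sink(-80))  # weak signal, roughly 10 m away
```

Because RSSI fluctuates strongly indoors, a real implementation would smooth the signal over several seconds before deriving an indicator from it.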
\chapter{Conclusion}\label{chap:conclusion}
In this work, I have elaborated a personalization process for human activity recognition in the context of hand washing observation. My approach utilizes indirect user feedback to automatically refine training data for fine-tuning the general detection model. I described the generation of pseudo labels from predictions of the general model and introduced several approaches to denoise them. For evaluation, I showed common supervised metrics and defined a quality estimation that relies only on the user feedback. An actual implementation extends the existing application and allows real-world experiments.
I evaluated personalization in general on a theoretical basis with supervised data. These experiments revealed the impact of noise in the highly imbalanced data and how soft labels can counter training errors. Based on these insights, several constellations and filter approaches for training data have been implemented to analyze the behavior of the resulting models under the different aspects. I found that just using the predictions of the base model decreases performance, since they contain too much label noise. However, even relying only on data covered by user feedback does not surpass the general model, although this training data hardly contains false labels. Therefore more sophisticated denoising approaches were implemented that generate training data consisting of a variety of samples with as few incorrect labels as possible. This data leads to personalized models that achieve higher F1 and S scores than the general model. Some configurations even result in performance similar to supervised training.
Furthermore, I compared my personalization approach with an active learning implementation as a common personalization method. The sophisticated filter configurations achieve higher S scores, confirming my approach's robustness. The real-world experiment in cooperation with the University of Basel offered a great opportunity to evaluate my personalization approach on a large variety of users and their feedback behaviors. It confirms that in most cases, personalized models outperform the general model. Overall, personalization would reduce the false detections by $XX\%$ and increase correct detections by $XX\%$.
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/example_dataset.png}}
\caption[Example synthetic data set]{\textbf{Example synthetic data set.} Plot over multiple windows on the x axis and their activity label on the y axis. Indicators mark sections where the running mean of the labels exceeds the threshold and would have triggered a user feedback. Correct indicators are sections where the ground truth data has the activity label for hand washing, and false indicators mark \textit{null} activities. Neutral indicators are already covered by the following indicator. Manual indicators mark sections where the model missed a hand wash detection.}
\label{fig:exampleSyntheticSataset}
\end{centering}
\end{figure}
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/example_dataset_feedback.png}}
\caption[Example synthetic data set indicator intervals]{\textbf{Example synthetic data set indicator intervals.} Highlighted areas of the corresponding indicator intervals. Red areas are \textit{false} intervals, green areas are \textit{correct}/\textit{manual} intervals, and gray areas are \textit{neutral} intervals.}
\label{fig:exampleSyntheticIntervals}
\end{centering}
\end{figure}
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/example_pseudo_filter_cnn.png}}
\caption[Example pseudo filter CNN]{\textbf{Example pseudo filter CNN.} Plot of two \textit{positive} intervals to which the convolutional neural network filter approach was applied. Values for \textit{hw} of predictions and pseudo labels are plotted in orange and magenta, respectively.}
\label{fig:examplePseudoFilterCNN}
\end{centering}
\end{figure}
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/example_pseudo_filter_fcndae.png}}
\caption[Example pseudo filter FCN-dAE]{\textbf{Example pseudo filter FCN-dAE.} Plot of two \textit{positive} intervals to which the fully convolutional network denoising autoencoder (FCN-dAE) filter approach was applied. Values for \textit{hw} of predictions and pseudo labels are plotted in orange and magenta, respectively.}
\label{fig:examplePseudoFilterFCNdAE}
\end{centering}
\end{figure}
\subfloat[convLSTM3-dAE]
{\includegraphics[width=\textwidth]{figures/approach/example_pseudo_filter_convLSTMdAE3.png}}
\caption[Example pseudo filter convLSTM-dAE]{\textbf{Example pseudo filter convLSTM-dAE.} Plot of two \textit{positive} intervals to which the three convolutional LSTM denoising autoencoder (convLSTM-dAE) filter approaches were applied. Values for \textit{hw} of predictions and pseudo labels are plotted in orange and magenta, respectively.}
\label{fig:examplePseudoFilterconvLSTM}
\end{centering}
\end{figure}
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/example_pseudo_filter_score.png}}
\caption[Example pseudo filter score]{\textbf{Example pseudo filter score.} Plot of two \textit{positive} intervals to which the naive filter approach was applied. Values for \textit{hw} of predictions and pseudo labels are plotted in orange and magenta, respectively.}
\label{fig:examplePseudoFilterScore}
\end{centering}
\end{figure}
\subfloat[Predicted \textit{hw} values]
{\includegraphics[width=\textwidth]{figures/approach/example_pseudo_plot_hw.png}}
\caption[Pseudo labels of example synthetic data set]{\textbf{Pseudo labels of example synthetic data set.} Plot of predicted pseudo labels in orange. (a) shows predictions of \textit{null} values and (b) shows predictions of \textit{hw} values.}
\label{fig:examplePseudoSataset}
\end{centering}
\end{figure}
\begin{figure}[htbp]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/PersonalizationImplementation.pdf}}
\caption[Personalization implementation]{\textbf{Personalization implementation.} The personalization process as implemented on the server. Entries highlighted in red depict new recordings which have not been used before, and the new personalization entry based on them. Highlighted in green is the best performing personalization model.}
\label{fig:personalizationImplementation}
\end{centering}
\end{figure}
\begin{figure}[t]
\begin{centering}
\makebox[\textwidth]{\includegraphics[width=\textwidth]{figures/approach/personalization_pipeline.png}}
\caption[Personalization interface]{\textbf{Personalization interface.} Screenshot of the interface of the personalization implementation for the user 'Participant3'. Each box represents one personalization run. The test sets have been selected by hand in advance. The first run does not include new training recordings for personalization, so only the potential best running mean settings for the general model are computed. The upper left number is the ID of a run, and the blue number below gives the ID of the model on which the personalization depends. A green ID represents the best performing model, which would be transmitted to the user.}
\label{fig:personalizationPipeline}
\end{centering}
\end{figure}
%\multicolumn{1}{c}{} & \multicolumn{1}{c}{Total} & \multicolumn{1}{c}{$\hat{P}$} & \multicolumn{1}{c}{$\hat{N}$} & \multicolumn{1}{c}{$n$}\\
\end{tabular}
\caption[Confusion matrix]{\textbf{Confusion matrix.} Evaluation scheme of hand wash predictions, split into true positives (TP), false negatives (FN), false positives (FP) and true negatives (TN).}
\label{tab:confusionMatrix}
\end{table}
\texttt{scope\_corrected\_null} & Same as \texttt{all\_corrected\_null}, but only use labels within \textit{false, positive} intervals\\
\texttt{all\_corrected\_null\_hwgt} & Like \texttt{all\_corrected\_null}, but additionally set all labels within a \textit{positive} interval to their ground truth values\\
\texttt{scope\_corrected\_null\_hwgt} & Like \texttt{all\_corrected\_null\_hwgt}, but only use labels within \textit{false, positive} intervals\\
\hline
\texttt{all\_null\_hwgt} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals to their ground truth value \\
\texttt{all\_null\_score} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the naive approach \\
\texttt{all\_null\_deepconv} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the CNN approach \\
\texttt{all\_null\_convlstm1} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM1-dAE approach \\
\texttt{all\_null\_convlstm2} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM2-dAE approach \\
\texttt{all\_null\_convlstm3} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM3-dAE approach \\
\hline
\texttt{all\_cnn\_convlstm3} & Apply the CNN approach to all labels. Use pseudo labels with high confidence or within \textit{false, positive} intervals. Set labels within \textit{false} intervals to \textit{null} and apply the convLSTM3-dAE approach to \textit{positive} intervals \\
\texttt{all\_cnn\_convlstm3\_hard} & Like \texttt{all\_cnn\_convlstm3}, but set high confidence values to their corresponding hard label. Exclude all \textit{hw} labels which are not inside a \textit{positive} interval \\
\texttt{all\_cnn\_convlstm2\_hard} & Same as \texttt{all\_cnn\_convlstm3\_hard}, but with the convLSTM2-dAE approach\\
\bottomrule
\end{tabularx}
\caption[Filter configurations]{\textbf{Filter configurations.} Description of how the different denoising approaches are combined.}
\label{tab:filterConfigurations}
\end{table}
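As an illustration, the shared structure of the \texttt{all\_null\_*} configurations above can be sketched as follows. This is a minimal sketch and not the thesis implementation; the function and argument names (\texttt{apply\_filter\_config}, \texttt{denoise\_fn}) are hypothetical, and the denoising filter is stubbed out rather than an actual convLSTM-dAE.

```python
import numpy as np

def apply_filter_config(pseudo_labels, positive_intervals, denoise_fn):
    """Combine denoising steps as in an ``all_null_*`` configuration.

    pseudo_labels      -- array of soft hand-wash scores in [0, 1]
    positive_intervals -- list of (start, end) index pairs confirmed as
                          hand washing through user feedback
    denoise_fn         -- denoising filter (e.g. the convLSTM3-dAE in
                          ``all_null_convlstm3``) applied per interval
    """
    # "all_null": default every pseudo label to the null class ...
    refined = np.zeros_like(pseudo_labels)
    # ... then re-label only the user-confirmed positive intervals
    # with the output of the chosen denoising approach.
    for start, end in positive_intervals:
        refined[start:end] = denoise_fn(pseudo_labels[start:end])
    return refined

# Toy usage with an identity "filter" standing in for a real denoiser:
labels = np.array([0.1, 0.9, 0.8, 0.2, 0.7])
out = apply_filter_config(labels, [(1, 3)], lambda window: window)
```

Swapping \texttt{denoise\_fn} for the ground-truth lookup, the naive score approach, or one of the autoencoder filters yields the respective rows of the table.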
{\includegraphics[width=\textwidth]{figures/experiments/supervised_evolution_all_l2-sp.png}}
\caption[Personalization evolution comparison]{\textbf{Personalization evolution comparison.} Plot of model evaluations for each iteration step of models trained with different filter configurations. The three graphs show training results using 50, 100 and 150 training epochs. In (a), freezing the feature layers of the model is used as regularization; in (b), l2-sp regularization is used.}
\label{fig:evolutionAll}
\end{centering}
\end{figure}