Commit 242f70ee authored by burcharr

automatic writing commit ...

parent cbd4ba2f
......@@ -56,7 +56,7 @@ For some subjects, the smart watch application did not work properly, i.e. not s
Because of the smoothing that was applied to the data, at least some consecutive windows must be classified into the positive class, which means that a real hand washing procedure needs to last around $10\,s$ or longer. In practice, washing one's hands can take less time, in which case the system will not detect it properly. A detection can even be missed if, for some period in the middle of a washing procedure, the washing intensity is low enough for the model to misclassify it as noise.
Our theoretical results could therefore not be reached in the real life scenario. All in all, the system was able to correctly detect most hand washing procedures, and is therefore somewhat effective at this task.
It is not entirely clear why the theoretical results could not be fully reached in the real-life scenario. One possible cause lies in the assumptions made during the recording of the data sets, i.e. the way the hands were washed during the recordings could be too different from unbiased real world washing. In order to improve the performance in the real world, further research has to be conducted. All in all, the system was able to correctly detect most hand washing procedures, and is therefore somewhat effective at this task.
We also expected that a higher intensity or a longer duration of the hand washing would have a positive influence on the detection probability of the model on the smart watch. For the duration, this follows from the smoothing; for the intensity, it can be assumed that the system reaches higher certainties with high intensity than with low intensity washing, as intense washing is likely more separable from less intense activities. However, the results showed a significantly positive correlation value only for intensity and detection rate, whereas the detection rate and hand washing duration seemed to be mostly uncorrelated. This may again be due to the relatively small sample size. Especially for the longer washing tasks of 30s and 35s, there were only 2 examples, out of which one was not detected. This may have had a big influence on the absence of a positive correlation value in the evaluation results.
......@@ -104,10 +104,11 @@ In this work, we described the development, training and evaluation of a powerfu
We theoretically evaluated different designs of neural networks on three related problems of hand washing detection, including the separation of hand washing from other activities, the separation of hand washing from compulsive hand washing and the separation of hand washing from compulsive hand washing and from other activities at the same time. For this task, we used hand washing data, data of simulated compulsive hand washing, and data of other activities which was collected from publicly available data sets. After training and evaluation, we selected the best functioning system based on several metrics, including the F1 score and the harmonic mean of sensitivity and specificity, which we called S score. The dominating models, DeepConvLSTM and DeepConvLSTM-A were both based on a deep convolutional neural network joined with an LSTM layer. For DeepConvLSTM-A, which performed slightly better than DeepConvLSTM, we added an attention mechanism, in order to allow the model to flexibly focus on more relevant sections of its input. The designed models were able to beat baselines such as a random forest classifier and a support vector machine, as well as chance level baselines by a large margin.
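The two selection metrics named above can be written down compactly. The following is a minimal sketch, assuming binary confusion-matrix counts as input; the function names `f1_score` and `s_score` and the zero-division handling are illustrative choices, not taken from the thesis code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall (sensitivity)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def s_score(tp: int, fp: int, tn: int, fn: int) -> float:
    """S score: harmonic mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if sensitivity + specificity == 0.0:
        return 0.0
    return 2.0 * sensitivity * specificity / (sensitivity + specificity)


# Made-up counts for illustration only (not results from the thesis).
example_f1 = f1_score(tp=90, fp=25, fn=10)
example_s = s_score(tp=90, fp=25, tn=75, fn=10)
```

Because both scores are harmonic means, a model can only score well if both of the combined quantities are high; a classifier that always predicts the positive class has perfect sensitivity but an S score of zero.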
In a practical evaluation using 5 subjects, we tested DeepConvLSTM-A on the hand washing detection task in a real world and every day environment, as well as in a fixed schedule hand washing test. The system ran on a smart watch, which was used to monitor the users wrist movements in real-time and tried to correctly detect hand washing. The accuracy of this test was lower than expected ($28,33\,\%$). Some false positives appeared for different activities, many of which were washing related, which must be ruled out in the future.
In a practical evaluation using 5 subjects, we tested DeepConvLSTM-A on the hand washing detection task in a real world and every day environment, as well as in a fixed schedule hand washing test. The system ran on a smart watch, which was used to monitor the user's wrist movements in real time and tried to correctly detect hand washing. The sensitivity of this test was lower than expected ($28.33\,\%$, or $50\,\%$ if the correct wrist was used). Furthermore, around 4 false positives per day appeared for different activities, many of which were washing related. They included but were not limited to
High amounts of false positives should be ruled out in the future.
In the second test of the practical evaluation, subjects performed intensive and long hand washing repetitions, which were more easy to detect. The systems performance here was much closer to our the results of the theoretical evaluation of our models (sensitivity $76\,\%$ vs $90\,\%$).
In the second test of the practical evaluation, subjects performed intensive and long hand washing repetitions, which were closer to our lab recorded washing data (including the simulated compulsive data) and thus easier to detect. The system's performance here was much closer to the results of the theoretical evaluation of our models (a sensitivity of $76\,\%$, or $82.5\,\%$ if the correct wrist was used, vs. $90\,\%$ in the theoretical evaluation).
Hence, the evaluation results suggest that the developed system is able to properly detect hand washing in many cases. The specificity and sensitivity of the system is high, but leaves some room for improvement.
Hence, the evaluation results suggest that the developed system is able to properly detect hand washing in many cases. The theoretical specificity ($75.1\,\%$) and sensitivity ($90\,\%$) of the system are high, but the practical application shows some room for improvement.
In conclusion, the application of wrist worn sensor data to the detection of hand washing and compulsive hand washing remains an interesting and open field of research, with many possible areas of application. Especially the detection of compulsive hand washing would be a world's first, and seems promising for future usage in the treatment of OCD patients. Due to the possibility of directly running neural network models on wrist worn smart watches, interventions could be generated in real time and with low latency.
In conclusion, the application of wrist worn sensor data to the detection of hand washing and compulsive hand washing remains an interesting and open field of research, with many possible areas of application. In particular, the detection of compulsive hand washing would be a world first, and seems promising for future use in the treatment of OCD patients. Because neural network models can run directly on wrist worn smart watches, interventions could be generated in real time and with a latency below 15 seconds.
# Introduction
In this thesis we aim to develop several neural network based machine learning methods that can be used to detect hand washing and compulsive hand washing on inertial sensor data of wrist worn devices. We evaluate different approaches for multiple scenarios of hand washing classification. We examine the real world applicability of the developed approach with multiple users.
## Motivation
### Hand washing detection
Hand washing is an important part of every human's personal hygiene. We wash our hands multiple times each day. Washing ones hands can remove dirt or grease and importantly helps to prevent infection with pathogens @noauthor_when_2020. There are many occasions, in which it is desired that we wash our hands, among which are @noauthor_when_2020:
Hand washing is an important part of every human's personal hygiene. Washing one's hands can remove dirt or grease and, importantly, helps to prevent infection with pathogens @noauthor_when_2020. There are many occasions on which we should wash our hands, including @noauthor_when_2020:
- After using the toilet
- Before and after preparing or eating food
......@@ -17,18 +15,18 @@ Added to that, hand washing using soap or disinfectants is also part of the work
In order to monitor the effectiveness and frequency of hand washing, we could use a sensor based computer system to detect the activity of hand washing and its duration. Further advanced systems could also be used to predict the quality of the hand washing. These systems could then be used to reduce the risk of contaminations or infections by ameliorating the hygiene of their users.
### Obsessive-Compulsive Disorders
While it is usually really helpful and a basic part of hygiene, hand washing can also be overdone, i.e. be too frequent or be done too thoroughly. One example of persons for which overly excessive hand washing is a problem, is the small percentage of humans suffering from Obsessive-Compulsive Disorders (OCD). OCD affects about $1-3\,\%$ of humans during their life @valleni-basile_frequency_1994, @fawcett_women_2020. OCD appears in the form of obsessions, that lead to compulsive behavior. There are multiple subgroups of obsessions and compulsions, including contamination concerns, symmetry and precision concerns, saving concerns and more @stein_obsessive-compulsive_2002. These concerns lead to respective compulsive behavior: Symmetry and precision concerns lead to arranging and ordering, saving concerns lead to hoarding and contamination concerns can lead to excessive washing, bathing and showering. This work will focus on detecting hand washing and also try to tell apart hand washing from compulsive hand washing of OCD patients.
The separation of compulsive hand washing from ordinary hand washing is an even harder problem than just hand washing detection itself. It is unclear, whether it is possible to predict the type of hand washing with high probability, as there is no previous work in this area. It is reasonable to assume, that their are strong similarities between the kinds of hand washing, as well as subtle differences, e.g. in intensity and length.
While it is usually helpful and a basic part of hygiene, hand washing can also be overdone, i.e. be too frequent or be done too thoroughly. One group of people for whom excessive hand washing is a problem is the small percentage of humans suffering from Obsessive-Compulsive Disorders (OCD). OCD affects about $1-3\,\%$ of humans during their life @valleni-basile_frequency_1994, @fawcett_women_2020. OCD appears in the form of obsessions that lead to compulsive behavior. There are multiple subgroups of obsessions and compulsions, including contamination concerns, symmetry and precision concerns, saving concerns and more @stein_obsessive-compulsive_2002. These concerns lead to corresponding compulsive behavior: symmetry and precision concerns lead to arranging and ordering, saving concerns lead to hoarding, and contamination concerns can lead to excessive washing, bathing and showering, including compulsive hand washing. This work will focus on detecting hand washing and also try to tell apart hand washing from compulsive hand washing of OCD patients.
One method of treatment for clinical cases of OCD is exposure and response prevention (ERP) therapy @meyer_modification_1966 @whittal_treatment_2005. Using this method, patients who suffer from OCD are exposed to situations in which their obsessions are stimulated, and they are helped to prevent compulsive reactions to the stimulation. The patients can then "get used" to the situation in a sense, and thus the reaction to the stimulation will be weakened over time. This means that their quality of life is improved, as the severity of their OCD declines.
A successful, i.e. reliable and accurate, system for compulsive hand washing detection could be used to intervene whenever compulsive hand washing is detected. It could therefore help psychologists and their patients in the treatment of the symptoms. It could help the user to stop the compulsive behavior by issuing a warning. Such a warning could be a vibration of the device, or a sound that is played upon the detection of compulsive behavior. However, the hypothesis of usefulness is yet to be tested, as no such system exists as of now. Therefore we want to develop a system that can not only detect hand washing with low latency and in real time, but also discriminate between usual hand washing and obsessive-compulsive hand washing at the same time. The system could then, as described, be used in ERP therapy sessions, but also in every day life, to prevent compulsive hand washing.
The separation of compulsive hand washing from ordinary hand washing is an even harder problem than hand washing detection itself. It is unclear whether it is possible to predict the type of hand washing with high probability, as there is no previous work in this area. It is reasonable to assume that there are strong similarities between compulsive hand washing and non-compulsive hand washing, as well as subtle differences, e.g. in intensity and duration of the washing.
### Wrist worn sensors
Different types of sensors can be used to detect activities such as hand washing. It is possible to detect hand washing from RGB camera data to some extent. However, in order for this to work, we would need to place a camera at every place and room a subject could want to wash their hands at. This is unfeasible for most applications of hand washing detection, and could be very expensive. Added to that it might be problematic to place cameras inside wash or bath rooms for privacy reasons. Thus, a better alternative could be body worn, camera-less devices.
Different types of sensors can be used to detect activities such as hand washing. It is possible to detect hand washing from RGB camera data to some extent. However, in order for this to work, we would need to place a camera in every place and room in which a subject could want to wash their hands. This is infeasible for most applications of hand washing detection, and could be very expensive. Furthermore, it might be problematic to place cameras inside washrooms or bathrooms for privacy reasons. Thus, a better alternative could be body worn, camera-less devices.
Inertial measurement units (IMUs) can measure different types of time series movement data, e.g. the acceleration or angular velocity of the device they are embedded in. IMUs are embedded in most modern smart phones and smart watches, which makes them easily available. For hand washing detection, especially the movement of the hands and wrists can contain information that can help us classify hand washing. Therefore, we can use a smart watch and its embedded IMU to try to predict whether a user is washing their hands or not. Added to that, if the user is washing their hands, we could try to predict if they are washing them in an obsessive-compulsive way or not. Another advantage of using a smart watch would be, that they usually have in-built vibration motors or even speakers. These means could be used to intervene, whenever compulsive hand washing is detected, as described above. Therefore, wrist worn sensors, especially those embedded into the very versatile smart watch systems, are used in this work. The wrist worn devices can also be used to execute machine learning models in real time, using publicly available libraries, e.g. on smart watches running Wear OS.
Inertial measurement units (IMUs) can measure different types of time series movement data, e.g. the acceleration or angular velocity of the device they are embedded in. IMUs are embedded in most modern smart phones and smart watches, which makes them easily available. For hand washing detection, the movement of the hands and wrists in particular can contain information that helps us classify hand washing. Therefore, we can use a smart watch and its embedded IMU to try to predict whether a user is washing their hands or not. In addition, if the user is washing their hands, we could try to predict whether they are washing them in an obsessive-compulsive way or not. Another advantage of smart watches is that they usually have built-in vibration motors or even speakers. These could be used to intervene whenever compulsive hand washing is detected, as described above. Therefore, wrist worn sensors, especially those embedded in smart watch systems, are used in this work. The wrist worn devices can also be used to execute machine learning models in real time, using publicly available libraries, e.g. on smart watches running Wear OS.
## Goals
In this work, we want to develop a method for the real time detection of hand washing and compulsive hand washing. We also want to test the method and report meaningful statistics of its success. Further, we want to test parts of the developed method in a real world scenario. We then want to draw conclusions on the applicability of the developed systems in the real world.
......@@ -37,7 +35,11 @@ In this work, we want to develop a method for the real time detection of hand wa
We want to show that neural network based classification methods can be applied to the recognition of hand washing. We want to base our method on sensor data from inertial measurement sensors in smart watches or other wrist worn IMU-equipped devices. We want to detect the hand washing in real time and directly on the mobile device, i.e. on a wrist worn wearable such as a smart watch. Doing so, we would be able to give instant feedback to the user of the device.
### Separation of hand washing and compulsive hand washing
Added to the detection of hand washing, the detection of obsessive-compulsive hand washing is part of our goals. We want to be able to separate compulsive hand washing from non compulsive hand washing, based on the inertial motion data. Especially for the scenario of possible interventions used for the treatment of OCD, this separation is crucial, as OCD patients do also wash their hands in non compulsive ways and we do not want to intervene for these kinds of hand washing procedures.
In addition to the detection of hand washing, the detection of obsessive-compulsive hand washing is part of our goals. We want to be able to separate compulsive hand washing from non-compulsive hand washing, based on inertial motion data. This separation is crucial for possible interventions in the treatment of OCD, as OCD patients also wash their hands in non-compulsive ways and we do not want to intervene for these kinds of hand washing procedures.
### Real world evaluation
We want to evaluate the most promising of the developed models in a real world evaluation, in order to obtain a realistic estimate of its applicability in the task of hand washing detection. We want to report results of an evaluation with multiple subjects to obtain a meaningful performance estimation. From this estimation we want to draw conclusions on the applicability of the developed system in real world therapy scenarios. Added to that, we want to derive future improvements, that could be applied to the system.
\ No newline at end of file
We want to evaluate the most promising of the developed models in a real world evaluation, in order to obtain a realistic estimate of its applicability in the task of hand washing detection. We want to report results of an evaluation with multiple subjects to obtain a meaningful performance estimation. From this estimation we want to draw conclusions on the applicability of the developed system in real world therapy scenarios. In addition, we want to derive future improvements that could be applied to the system.
TODO merge if needed or remove:
In this thesis we aim to develop several neural network based machine learning methods that can be used to detect hand washing and compulsive hand washing on inertial sensor data of wrist worn devices. We evaluate different approaches for multiple scenarios of hand washing classification. We examine the real world applicability of the developed approach with multiple users.
\ No newline at end of file
......@@ -34,6 +34,6 @@ reviewer2: "TBD"
declaration: Hiermit erkläre ich, dass ich diese Arbeit selbstständig verfasst habe, keine anderen als die angegebenen Quellen/Hilfsmittel verwendet habe und alle Stellen, die wörtlich oder sinngemäß aus veröffentlichten Schriften entnommen wurden, als solche kenntlich gemacht habe. Darüber hinaus erkläre ich, dass diese Arbeit nicht, auch nicht auszugsweise, bereits für eine andere Prüfung angefertigt wurde.
#abstract
abstract-de: Die automatische Erkennung von Händewaschen und zwanghaftem Händewaschen hat mehrere Anwendungsbereiche in Arbeits- und medizinischen Umgebungen. Die Erkennung von Händewaschen kann in zur Überprüfung der Einhaltung von Hygieneregeln eingesetzt werden, da das Händewaschen eine der wichtigsten Komponenten der persönlichen Hygiene ist. Allerdings kann das Händewaschen auch übertrieben werden, was bedeutet, dass es für die Haut und die allgemeine Gesundheit schädlich sein kann. Manche Patienten mit Zwangsstörungen waschen sich zwanghaft und zu häufig die Hände auf diese schädliche Weise. Die automatische Erkennung von zwanghaftem Händewaschen kann bei der Behandlung dieser Patienten helfen. Ziel dieser Arbeit ist es, auf neuronalen Netzen basierende Methoden zu entwickeln, die in der Lage sind, Händewaschen und zwanghaftes Händewaschen in Echtzeit auf einem am Handgelenk getragenen Gerät zu erkennen, wobei die Daten der Bewegungssensoren des am Handgelenk getragenen Geräts verwendet werden. Wir erreichen eine hohe Genauigkeit für beide Aufgaben und evaluieren Teile der Arbeit mit Probanden in einem realen Experiment, um die starke theoretische Leistung zu bestätigen.
abstract-en: The automatic detection of hand washing and compulsive hand washing has multiple areas of application in work and medical environments. Hand washing detection can be used in compliance and hygiene scenarios, as hand washing is one of the main components of personal hygiene. However, hand washing can also be overdone, which means it can be unhealthy for the skin and general health. Patients with obsessive-compulsive disorder sometimes compulsively wash their hands in such a harmful way. In order to help with their treatment, the automatic detection of compulsive hand washing can possibly be applied. This thesis aims to develop neural network based methods which are able to detect hand washing as well as compulsive hand washing in real time on a wrist worn device using intertial motion sensor data of said wrist worn device. We achieve high accuracy for both tasks and evaluate parts of the work on subjects in a real world experiment, in order to confirm the strong theoretical performance achieved.
abstract-de: Die automatische Erkennung von Händewaschen und zwanghaftem Händewaschen hat mehrere Anwendungsbereiche in Arbeits- und medizinischen Umgebungen. Die Erkennung kann zur Überprüfung der Einhaltung von Hygieneregeln eingesetzt werden, da das Händewaschen eine der wichtigsten Komponenten der persönlichen Hygiene ist. Allerdings kann das Waschen auch übertrieben werden, was bedeutet, dass es für die Haut und die allgemeine Gesundheit schädlich sein kann. Manche Patienten mit Zwangsstörungen waschen sich zwanghaft und zu häufig die Hände auf diese schädliche Weise. Die automatische Erkennung von zwanghaftem Händewaschen kann bei der Behandlung dieser Patienten helfen. Ziel dieser Arbeit ist es, auf neuronalen Netzen basierende Methoden zu entwickeln, die in der Lage sind, Händewaschen und zwanghaftes Händewaschen in Echtzeit auf einem am Handgelenk getragenen Gerät zu erkennen, wobei die Daten der Bewegungssensoren des am Handgelenk getragenen Geräts verwendet werden. Die entwickelte Methode erreicht eine hohe Genauigkeit für beide Aufgaben und Teile der Arbeit wurden mit Probanden in einem realen Experiment evaluiert, um die starke theoretische Leistung (F1 score von 89,2 % bzw. 96,6 %) zu bestätigen.
abstract-en: The automatic detection of hand washing and compulsive hand washing has multiple areas of application in work and medical environments. The detection can be used in compliance and hygiene scenarios, as hand washing is one of the main components of personal hygiene. However, the washing can also be overdone, which means it can be unhealthy for the skin and general health. Patients with obsessive-compulsive disorder sometimes compulsively wash their hands in such a harmful way. In order to help with their treatment, the automatic detection of compulsive hand washing can possibly be applied. This thesis aims to develop neural network based methods which are able to detect hand washing as well as compulsive hand washing in real time on a wrist worn device using inertial motion sensor data of said wrist worn device. We achieve high accuracy for both tasks and evaluate parts of the work on subjects in a real world experiment, in order to confirm the strong theoretical performance (F1 score of 89.2 % and 96.6 %) achieved.
---
......@@ -14,20 +14,20 @@ Gesture recognition, in general, uses similar methods as the more difficult huma
## Human activity recognition
\label{section:har}
Recognizing more than one gesture or body movement in combination in a temporal context and deriving the current activity of the user is called human activity recognition (HAR). In this task, we want to detect more general activities, compared to a shorter and simpler gestures. An activity can include many distinguishable gestures. However, the same activity will not always include all of the same gestures and the gestures included could be in a different order for every repetition. Activities are less repetitive than gestures, and harder to detect in general @zhu_wearable_2011. However, Zhu et al. have shown that the combined detection of multiple different gestures can be used in HAR tasks too @zhu_wearable_2011, which makes sense, because a human activity can consist of many gestures. Nevertheless, most methods used for HAR consist of more direct applications of machine learning to the data, without the detour of detecting specific gestures contained in the execution of an activity.
Recognizing more than one gesture or body movement in combination in a temporal context and deriving the current activity of the user is called human activity recognition (HAR). In this task, we want to detect more general activities, compared to the shorter and simpler gestures. An activity can include many distinguishable gestures. However, the same activity will not always include all of the same gestures and the gestures included could be in a different order for every repetition. Activities are less repetitive than gestures, and harder to detect in general @zhu_wearable_2011. However, Zhu et al. have shown that the combined detection of multiple different gestures can be used in HAR tasks too @zhu_wearable_2011, which makes sense, because a human activity can consist of many gestures. Nevertheless, most methods used for HAR consist of more direct applications of machine learning to the data, without the detour of detecting specific gestures contained in the execution of an activity.
Methods used in HAR include classical machine learning methods as well as deep learning @liu_overview_2021 @bulling_tutorial_2014. The classical machine learning methods rely on features of the data obtained by feature engineering. The required feature engineering is the creation of meaningful statistics or calculations based on the time frame for which the activity should be predicted. The features can be frequency-domain based and time-domain based, but usually both are used at the same time to train these conventional models @liu_overview_2021. The classical machine learning methods include but are not limited to Random Forests (RFC), Hidden Markov Models (HMM), Support Vector Machines (SVM), the $k$-nearest neighbors algorithm and more.
#### Deep neural networks
Recently, deep neural networks have taken over the role of the state of the art machine learning method in the area of human activity recognition @bock_improving_2021, @liu_overview_2021. Deep neural networks are universal function approximators @bishop_pattern_2006, and are known for being easy to use on "raw" data. They are "artificial neural networks" consisting of multiple layers, where each layer contains a certain amount of nodes that are connected to the nodes of the following layer. The connections are each assigned a weight, and the weighted sum over the values of all the previous connected nodes is used to calculate the value of a node in the next layer. Simple neural networks where all nodes of a layer are connected to all nodes in the following layer are often called "fully connected neural networks" (FC-NN or FC).
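The weighted-sum computation of a fully connected layer described above can be sketched in a few lines. This is an illustrative sketch only: the function name `dense_layer`, the bias terms and the optional activation function are assumptions added for completeness and are not taken from the text.

```python
from typing import Callable, List, Optional


def dense_layer(
    inputs: List[float],
    weights: List[List[float]],
    biases: List[float],
    activation: Optional[Callable[[float], float]] = None,
) -> List[float]:
    """Compute one fully connected layer.

    weights[j][i] is the weight of the connection from input node i
    to output node j. Each output node is the weighted sum over all
    input nodes (plus an assumed bias term), optionally passed
    through an activation function.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(value) if activation else value)
    return outputs


def relu(value: float) -> float:
    """Rectified linear unit, one common choice of activation."""
    return max(0.0, value)


# Two input nodes feeding two output nodes.
layer_out = dense_layer([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0])
layer_out_relu = dense_layer([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0], relu)
```

Stacking several such layers, each feeding its outputs to the next, gives the multi-layer structure described in the paragraph above.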
The connections' parameters are optimized using forward passes through the network of nodes, followed by the execution of the backpropagation algorithm, and an optimization step. We can accumulate all the gradients with regard to a loss function for each of the parameters and for a small subset of the data passed and perform "stochastic gradient decent" (SGD). SGD or alternative similar optimization methods like the commonly used ADAM @kingma_adam_2017 optimizer perform a parameter update step. After many such updates and if the training works well, the network parameters will have been updated to values that lead to a lower value of the loss function for the training data. However, there is no guarantee of convergence whatsoever. As mentioned above, deep neural networks can, in theory, be used to approximate arbitrary functions. Nevertheless, the parameters for the perfect approximation cannot be easily found, and empirical testing has revealed that neural networks do need a lot of training data in order to perform well, compared to classical machine learning methods. In return, with enough data, deep neural networks often outperform the classical machine learning methods.
The connections' parameters are optimized using forward passes through the network of nodes, followed by the execution of the backpropagation algorithm, and an optimization step. We can accumulate the gradients of a loss function with respect to each of the parameters for a small subset of the data and perform "stochastic gradient descent" (SGD). SGD, or similar optimization methods like the commonly used Adam optimizer @kingma_adam_2017, then performs a parameter update step. After many such updates, and if the training works well, the network parameters will have been updated to values that lead to a lower value of the loss function for the training data. However, there is no guarantee of convergence. As mentioned above, deep neural networks can, in theory, be used to approximate arbitrary functions. Nevertheless, the parameters for the perfect approximation cannot be easily found, and empirical testing has revealed that neural networks need a lot of training data in order to perform well, compared to classical machine learning methods. In return, with enough data, deep neural networks often outperform classical machine learning methods.
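To make the update loop concrete, the following minimal sketch runs mini-batch SGD on a deliberately tiny model, a single linear unit $y = w x + b$ with a squared-error loss. The model, the data, the learning rate and the function name `sgd_fit` are all assumptions chosen for illustration; for this model the gradients can be written out by hand, whereas a deep network would obtain them via backpropagation.

```python
import random
from typing import List, Tuple


def sgd_fit(
    xs: List[float],
    ys: List[float],
    lr: float = 0.1,
    batch_size: int = 8,
    epochs: int = 200,
    seed: int = 0,
) -> Tuple[float, float]:
    """Fit y = w * x + b with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the data in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            # Accumulate the gradient of the mean squared error
            # over the current mini-batch.
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * error * xs[i] / len(batch)
                grad_b += 2.0 * error / len(batch)
            # Parameter update step.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Noise-free toy data generated from w = 2, b = 1.
xs = [i / 32.0 - 1.0 for i in range(64)]
ys = [2.0 * x + 1.0 for x in xs]
w_fit, b_fit = sgd_fit(xs, ys)
```

On this convex toy problem the parameters approach the generating values; for deep networks the loss is non-convex, which is why no comparable convergence guarantee holds.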
###### Convolutional neural networks (CNNs)
are neural networks that are not fully connected, but work by using convolutions with a kernel that we slide over the input. CNNs were first introduced for handwritten character recognition @lecun_backpropagation_1989 @le_cun_handwritten_1990 (1989, 1990), but were later revived for computer vision tasks @krizhevsky_imagenet_2012 (2012), after more computational power was available on modern devices to train them. Since the rise of CNNs in computer vision, most computer vision problems are solved with them. The convolutions work by moving filter windows with learnable parameters (also called kernels) over the input @albawi_understanding_2017. As opposed to a fully connected network, the weights are shared over many of the nodes, because the same filters are applied over the full size of the input. CNNs have fewer parameters to train than a fully connected network with the same number of nodes, which makes them easier to train. They are generally expected to perform better than FC networks, especially on image related tasks. The filters can be 2-dimensional (2d), like for images (e.g. a 5x5 filter moved across the two axes of an image) or 1-dimensional (1d), which can e.g. be used to slide a kernel along the time dimension of a sensor recording. Even in the 1-dimensional case, fewer parameters are needed compared to the application of a fully connected network. Thus, the 1-dimensional CNN is expected to be easier to train and achieve a better performance.
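The 1-dimensional case can be illustrated with a short sketch that slides one kernel along the time axis of a single-channel signal. The helper `conv1d`, the example signal and the kernel values are made up for illustration; as in most deep learning libraries, the operation is implemented as a cross-correlation without padding.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Slide a kernel along a 1-d signal (no padding, stride 1).

    The same kernel weights are reused at every time step, which is
    the weight sharing that keeps the parameter count small.
    """
    k = len(kernel)
    return [
        sum(signal[t + j] * kernel[j] for j in range(k))
        for t in range(len(signal) - k + 1)
    ]


# One channel of a hypothetical sensor recording with 7 time steps.
signal = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
kernel = [-1.0, 0.0, 1.0]  # responds to rising or falling slopes
feature_map = conv1d(signal, kernel)  # [2.0, 2.0, 0.0, -2.0, -2.0]

# Parameter comparison for mapping 7 inputs to 5 outputs (biases ignored):
conv_params = len(kernel)                           # 3 shared weights
dense_params = len(signal) * len(feature_map)       # 35 individual weights
```

The shared kernel needs 3 weights here, while a fully connected layer producing the same number of outputs would need 35, and the gap widens as the input window grows.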
###### Recurrent neural networks (RNNs)
are similar to feed forward neural networks, with the difference being that they have access to information from a previous time step. The simplest version of an RNN is a single node that takes the input $\mathbf{x}_t$ and its own output $\mathbf{h}_{t-1}$ from the last time step as inputs. RNNs can be trained on time series data and are able to interprete temporal connections and dependencies in the data to some extent. Recurrent neural networks are trained using "back propagation through time" @mozer_focused_1995. This means that we have to run a forwards pass of multiple time steps through the network first, followed by a back propagation that sums up over all the different time steps and their gradients. For "long" runs, i.e. if the network is supposed to take into account many time steps, there is the "vanishing gradient problem" @hochreiter_vanishing_1998. With an increasing amount of time steps, the gradients become smaller and smaller, making it harder or impossible to properly train the recurrent neural network.
are similar to feed forward neural networks, with the difference being that they have access to information from a previous time step. The simplest version of an RNN is a single node that takes the input $\mathbf{x}_t$ and its own output $\mathbf{h}_{t-1}$ from the last time step as inputs. RNNs can be trained on time series data and are able to interpret temporal connections and dependencies in the data to some extent. Recurrent neural networks are trained using "back propagation through time" @mozer_focused_1995. This means that we have to run a forward pass of multiple time steps through the network first, followed by a back propagation that sums up over all the different time steps and their gradients. For "long" runs, i.e. if the network is supposed to take into account many time steps, there is the "vanishing gradient problem" @hochreiter_vanishing_1998. With an increasing number of time steps, the gradients become smaller and smaller, making it harder or impossible to properly train the recurrent neural network.
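The single-node RNN and the vanishing gradient can be illustrated with scalars. The sketch below assumes a $\tanh$ activation and hand-picked weights (`w_x`, `w_h`, `b` are hypothetical values, not learned ones); it tracks how strongly the final state still depends on the initial state, which is a product of one chain-rule factor per time step.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One step of a single-node RNN: the new state from input and old state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(xs, w_x=0.5, w_h=0.9, b=0.0, h0=0.0):
    """Forward pass over a whole sequence, returning all hidden states."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


def grad_wrt_initial_state(xs, w_x=0.5, w_h=0.9, b=0.0, h0=0.0):
    """Derivative of the final state with respect to h0 (chain rule over time)."""
    h = h0
    grad = 1.0
    for x_t in xs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        grad *= w_h * (1.0 - h * h)  # derivative of tanh times recurrent weight
    return grad


short_grad = grad_wrt_initial_state([1.0] * 5)
long_grad = grad_wrt_initial_state([1.0] * 50)
```

Because every factor has a magnitude below one in this setting, the product shrinks roughly exponentially with the sequence length, so `long_grad` is many orders of magnitude smaller than `short_grad`.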
###### Long short-term memory (LSTM)
......@@ -85,13 +85,14 @@ score(\mathbf{h}_T,\mathbf{h}_s) &= \mathbf{h}_t^T\mathbf{W}_{\alpha}\mathbf{h}_
\end{align}
\end{figure}
Note that the calculation of $\alpha_t$ is done with the softmax function as shown in eqn. \ref{eqn:attent_lstm_sm}, although this is not explicitly mentioned by the authors of the paper. This makes sure that the weights $\alpha$ used for the weighted sum, always sum up to 1.
Note that the calculation of $\alpha_t$ is done with the softmax function as shown in eqn. \ref{eqn:attent_lstm_sm}, although this is not explicitly mentioned by the authors of the paper. This makes sure that the weights $\alpha$ used for the weighted sum always sum up to 1.
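As a small illustration of why the softmax guarantees weights that sum to 1, the following sketch normalizes a list of attention scores and forms the weighted sum of hidden states. The scores are taken as given inputs here; how they are computed from the hidden states is left out, and all names and values are assumptions for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax: positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(hidden_states, scores):
    """Weighted sum of the hidden states, using softmax-normalized scores."""
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h[d] for a, h in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return alphas, context


# Three hypothetical 2-dimensional hidden states with made-up scores.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = attention_context(states, [0.2, 1.5, -0.3])
```

Since the weights are positive and sum to 1, the context vector is a convex combination of the hidden states, i.e. each of its components lies between the smallest and largest corresponding component of the inputs.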
Zeng et al. evaluate their approach on 3 data sets and report state-of-the-art performance, beating the initial DeepConvLSTM.
Another study by Singh et al. combines DeepConvLSTM with a self-attention mechanism @singh_deep_2021. The attention mechanism is very similar to the one used by Zeng et al. @zeng_understanding_2018, where the mechanism consists of a layer that follows the LSTM layers in the DeepConvLSTM network. Instead of utilizing a score layer which uses both $h_t$ and $h_T$, Singh et al. find the weights $\mathbf{\alpha}$ by applying the softmax function to the output of a fully connected layer, for each $h_t$, without taking into account $h_T$. Other than that, the two attention mechanisms are pretty similar. Singh et al. also report a statistically significant increase in performance compared to the initial DeepConvLSTM, although the evaluate their approach on different data sets than Zeng et al..
\label{deepconvlstm_att}
Another study by Singh et al. combines DeepConvLSTM with a self-attention mechanism @singh_deep_2021. The attention mechanism is very similar to the one used by Zeng et al. @zeng_understanding_2018, where the mechanism consists of a layer that follows the LSTM layers in the DeepConvLSTM network. Instead of utilizing a score layer which uses the relation of each $h_t$ to $h_T$, Singh et al. find the weights $\mathbf{\alpha}$ by applying the softmax function to the output of a fully connected layer through which they pass the concatenated $h_t$ values. Instead of taking into account only the relations of each $h_t$ to $h_T$ separately, they use one layer to jointly calculate all the attention weights. Other than that, the two attention mechanisms are very similar. Singh et al. also report a statistically significant increase in performance compared to the initial DeepConvLSTM, although they evaluate their approach on different data sets than Zeng et al.
For HAR, DeepConvLSTM and the models derived from it are the state of the art machine learning methods, as their consistently outperform other model architectures on the available benchmarks and data sets.
For HAR, DeepConvLSTM and the models derived from it are the state-of-the-art machine learning methods, as they consistently outperform other model architectures on the available benchmarks and data sets.
## Hand washing
To our knowledge, no study has ever tried to separately predict compulsive hand washing as opposed to non-compulsive hand washing.
......@@ -99,8 +100,8 @@ To our knowledge, no study has ever tried to separately predict compulsive hand
Most studies that try to automatically detect hand washing are aiming for compliance improvements, i.e. trying to increase or measure the frequency of hand washes or assessing or improving the quality of hand washes.
Hand washing compliance can be measured using different tools. Jain et al. @jain_low-cost_2009 use an RFID-based system to check whether health care workers comply with hand washing frequency requirements. However, the system is merely used to make sure all workers entering an emergency care unit have washed their hands. Bakshi et al. @bakshi_feature_2021 developed a hand washing detection data set with RGB video data, and showed a valid way to extract SIFT-descriptors from it for further research. Llorca et al. showed a vision based system for automatic hand washing quality assessment @llorca_vision-based_2011 based on the detection of skin in RGB images using optical techniques such as optical flow estimation.
A study by Li et al. @li_wristwash_2018 is able to recognize 13 steps of a hand washing procedure with an accuracy of $85\,\%$. They employ a sliding window feature based hidden markov model approach. Wang et al. explore using sensor armbands to assess the users compliance with given hand washing hygiene guidelines @wang_accurate_2020. They run a classifier using XGBoost and are mostly able to separate the different steps of the scripted hand washing routine.
Added to that, Cao et al. @cao_awash_2021 developed a system that similarly detects different steps of a scripted hand washing routine and prompts the user, if they confuse the order of the steps or forget one of the steps. The technology is aimed at elderly patients with dementia. Their system is able to detect which step of hand washing is currently conducted based on wrist motion data using an LSTM based neural network. However, none of the three systems mentioned in this paragraph are meant to separate hand washing from other activities.
A study by Li et al. @li_wristwash_2018 is able to recognize 13 steps of a hand washing procedure on wrist motion data with an accuracy of $85\,\%$. They employ a sliding window, feature based hidden Markov model approach and run a continuous recognition. Wang et al. explore using sensor armbands to assess the users' compliance with given hand washing hygiene guidelines @wang_accurate_2020. They run a classifier using XGBoost and are mostly able to separate the different steps of the scripted hand washing routine.
Added to that, Cao et al. @cao_awash_2021 developed a system that similarly detects different steps of a scripted hand washing routine and prompts the user if they confuse the order of the steps or forget one of the steps. The technology is aimed at elderly patients with dementia. Their system is able to detect which step of hand washing is currently conducted based on wrist motion data using an LSTM based neural network. However, none of the three systems mentioned in this paragraph is meant to separate hand washing from other activities. These models are trained to tell apart the different steps of hand washing, as they are defined in their respective studies. The models used in these studies are not tested on a null class, i.e. they are not tested on activities other than hand washing. Thus, they can only be used for the detection of steps of hand washing, but not for the detection of hand washing in real life.
In order to separate hand washing from other activities, Mondol et al. employ a simple feed forward neural network. Their network consists of a few linear layers and can be used to detect hand washing @sayeed_mondol_hawad_2020. Their method seeks to specifically eliminate false positives by trying to detect out-of-distribution (OOD) samples, i.e. samples that are very different from the ones seen by the model during training. To this end, they fit a conditional Gaussian distribution to the network's features in the last layer before the output layer (penultimate layer).
......
......@@ -14,11 +14,12 @@ In all tables of this chapter, the best values for a specific metric will be hig
The values for the metrics specificity and sensitivity will be reported in the tables, but not discussed separately, because they are included in the more meaningful metrics F1 score and S score. The results generally show that achieving a high value in only one of specificity and sensitivity, at the cost of a low value in the other, leads to worse performance in the F1 score and S score.
### Distinguishing hand washing from all other activities
For the first task of classifying hand washing in contrast to non hand washing activities, we report the results with and without the application of label smoothing. The results without label smoothing and without normalization are shown in table \ref{tbl:washing} and @fig:p1_metrics.
For the first task of classifying hand washing in contrast to non hand washing activities, we report the results with and without the application of label smoothing. The results without label smoothing are shown in table \ref{tbl:washing}. In @fig:p1_metrics, the resulting scores for problem 1 with and without smoothing are shown.
\input{tables/washing.tex}
![F1 score and S score for problem 1](img/washing.pdf){#fig:p1_metrics width=98%}
![F1 score and S score for problem 1](img/washing_all.pdf){#fig:p1_metrics width=105%}
As we can see, without label smoothing, the neural networks outperformed the conventional machine learning methods by a large margin. The best neural network method outperforms the best traditional method by a difference of nearly $0.2$ for the F1 score and by around $0.1$ for the S score. Between the neural network methods themselves, the differences can become really small, especially between the top performing DeepConvLSTM and DeepConvLSTM-A. While DeepConvLSTM reaches a slightly better F1 score of $0.853$, DeepConvLSTM-A reaches $0.847$. However, if we take into consideration the S score, DeepConvLSTM-A ($0.758$) is ahead of DeepConvLSTM ($0.756$). The convolutional neural network (CNN, $0.750$) and the LSTM with attention mechanism (LSTM-A, $0.708$) also reach similar levels of performance on both metrics, with the CNN outperforming the LSTM-A only in the S score. We can see that, like in the preliminary validation, normalization did not lead to the desired performance advantage. For the neural network methods, activating the normalization leads to a decrease of $0.01$ to $0.1$ in the F1 score and of $0.07$ to $0.15$ in the S score.
......@@ -26,9 +27,7 @@ As we can see, without label smoothing, the neural networks outperformed the con
\input{tables/washing_rm.tex}
![F1 score and S score for problem 1, with smoothing](img/washing_rm.pdf){#fig:p1_metrics_rm width=98%}
With label smoothing, we can reach an increased performance with all of the model classes, including the traditional machine learning methods RFC and SVM. The results with a 20 prediction wide average filter smoothing can be seen in table \ref{tbl:washing_rm} and @fig:p1_metrics_rm. The top performing neural network architectures do not change with the smoothing. However, the performance measures increase. DeepConvLSTM has the best F1 score ($0.892$), followed by LSTM-A ($0.891$), DeepConvLSTM-A ($0.890$) and CNN ($0.888$). These results are higher by about $0.03$ to $0.05$ compared to utilizing the raw predictions, without smoothing. In the S score metric, DeepConvLSTM-A performs best ($0.819$), followed by DeepConvLSTM-A ($0.814$) and CNN ($0.808$). For the S score, the advantage of the label smoothing is bigger in general, between $0.05$ to $0.06$ for all model classes except the LSTM, which only improves by $0.015$. RFC and SVM do not improve with the label smoothing, their scores decrease by about $0.04$ for both of the metrics.
With label smoothing, we can reach an increased performance with all of the model classes, including the traditional machine learning methods RFC and SVM. The results with a 20 prediction wide average filter smoothing can be seen in table \ref{tbl:washing_rm} and @fig:p1_metrics. The top performing neural network architectures do not change with the smoothing. However, the performance measures increase. DeepConvLSTM has the best F1 score ($0.892$), followed by LSTM-A ($0.891$), DeepConvLSTM-A ($0.890$) and CNN ($0.888$). These results are higher by about $0.03$ to $0.05$ compared to utilizing the raw predictions, without smoothing. In the S score metric, DeepConvLSTM-A performs best ($0.819$), followed by DeepConvLSTM-A ($0.814$) and CNN ($0.808$). For the S score, the advantage of the label smoothing is bigger in general, between $0.05$ and $0.06$ for all model classes except the LSTM, which only improves by $0.015$. RFC and SVM do not improve with the label smoothing; their scores decrease by about $0.04$ for both of the metrics.
The models running on normalized data also profit from the label smoothing, however they still cannot reach the performance of the non normalized models.
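The effect of an average filter over consecutive predictions can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure used for the reported results: it assumes a trailing window over per-window positive-class probabilities and a decision threshold of $0.5$, and the function names are hypothetical.

```python
def smooth_predictions(probs, width=20):
    """Trailing running mean over the last `width` predictions (assumed form)."""
    smoothed = []
    for i in range(len(probs)):
        window = probs[max(0, i - width + 1):i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def to_labels(probs, threshold=0.5):
    """Binary decisions from (smoothed) probabilities; threshold is assumed."""
    return [p >= threshold for p in probs]


# A single isolated positive prediction is averaged away ...
spike = [0.0] * 30 + [1.0] + [0.0] * 30
spike_labels = to_labels(smooth_predictions(spike))

# ... whereas a sustained run of positive predictions survives the filter.
run = [0.0] * 30 + [1.0] * 30 + [0.0] * 30
run_labels = to_labels(smooth_predictions(run))
```

Under these assumptions, isolated false positives are suppressed, while a positive decision requires that at least half of the last 20 predictions are positive, which also means that very short positive episodes are filtered out.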
......@@ -37,21 +36,19 @@ For the special case of the models initially trained on problem 3 which were the
\FloatBarrier
### Distinguishing compulsive hand washing from non-compulsive hand washing
The results without smoothing of predictions for the second task, distinguishing compulsive hand washing from non compulsive hand washing can be seen in table \ref{tbl:only_conv_hw} and @fig:p2_metrics. In terms of the F1 score metric, the LSTM model performs best ($0.926$). It is closely followed by DeepConvLSTM-A ($0.922$) and DeepConvLSTM ($0.918$). However, the RFC also performs surprisingly well, with an F1 score of $0.891$, even beating the CNN ($0.883$) and FC networks ($0.886$). Due to the imbalance of classes in the test set ($70.6\,\%$ of samples correspond to the positive class), the majority classifier reaches an F1 score of $0.828$. The S score is best for DeepConvLSTM ($0.869$) and LSTM ($0.862$), followed by LSTM-A ($0.848$) and DeepConvLSTM-A ($0.846$). The baseline methods RFC ($0.734$) and SVM ($0.701$) fail to reach similar S scores as the neural network based methods.
The results without smoothing of predictions for the second task, distinguishing compulsive hand washing from non-compulsive hand washing, can be seen in table \ref{tbl:only_conv_hw}. In @fig:p2_metrics, the results with and without smoothing are shown. In terms of the F1 score metric, the LSTM model performs best ($0.926$). It is closely followed by DeepConvLSTM-A ($0.922$) and DeepConvLSTM ($0.918$). However, the RFC also performs surprisingly well, with an F1 score of $0.891$, even beating the CNN ($0.883$) and FC networks ($0.886$). Due to the imbalance of classes in the test set ($70.6\,\%$ of samples correspond to the positive class), the majority classifier reaches an F1 score of $0.828$. The S score is best for DeepConvLSTM ($0.869$) and LSTM ($0.862$), followed by LSTM-A ($0.848$) and DeepConvLSTM-A ($0.846$). The baseline methods RFC ($0.734$) and SVM ($0.701$) fail to reach S scores similar to those of the neural network based methods.
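The F1 score of the majority classifier follows from the class balance alone: a classifier that always predicts the positive class has a recall of 1 and a precision equal to the share of positive samples. The short check below reproduces the value; the sample count of 1000 is a hypothetical choice that only fixes the proportions.

```python
def f1_score(tp, fp, fn):
    """F1 score as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


n_samples = 1000                   # hypothetical test set size
tp = round(0.706 * n_samples)      # all positives are predicted positive
fp = n_samples - tp                # all negatives are predicted positive too
fn = 0                             # no positive sample is ever missed
majority_f1 = f1_score(tp, fp, fn)
```

This yields $2 \cdot 0.706 / 1.706 \approx 0.828$, which is why an F1 score in this range does not indicate that a model has learned anything beyond the class distribution.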
![F1 score and S score for problem 2](img/only_conv_hw.pdf){#fig:p2_metrics width=98%}
![F1 score and S score for problem 2](img/only_conv_hw_all.pdf){#fig:p2_metrics width=105%}
\input{tables/only_conv_hw.tex}
As for problem 1, applying normalization to the input data worsens the performance of almost all classifiers. The performance loss in the F1 score ranges from $0.024$ (LSTM) to $0.11$ (CNN). For the FC network, the normalization leads to a slight performance increase of $0.01$. The S score performance decrease when we apply normalization is between $0.27$ (CNN) and $0.128$ (DeepConvLSTM-A). As with the F1 scores, the FC network profits from the normalization, here by a difference in S score of $0.035$. SVM and RFC also do not perform better with the application of normalization.
The results for task 2 with the application of smoothing are shown in table \ref{tbl:only_conv_hw_rm} and @fig:p2_metrics_rm. Similarly to problem 1, smoothing helps to further increase the performance of all classifiers. All neural network based methods reach F1 scores of over $0.95$. The best F1 score is achieved with DeepConvLSTM-A ($0.966$), the second best with LSTM ($0.965$). The differences remain small for this problem, as DeepConvLSTM ($0.963$) and LSTM-A ($0.961$) also achieve very similar scores. There is a small gap, after which the RFC ($0.922$) and SVM ($0.914$) follow. The traditional methods do not profit as much from the smoothing as the neural network based methods.
The results for task 2 with the application of smoothing are shown in table \ref{tbl:only_conv_hw_rm} and @fig:p2_metrics. Similarly to problem 1, smoothing helps to further increase the performance of all classifiers. All neural network based methods reach F1 scores of over $0.95$. The best F1 score is achieved with DeepConvLSTM-A ($0.966$), the second best with LSTM ($0.965$). The differences remain small for this problem, as DeepConvLSTM ($0.963$) and LSTM-A ($0.961$) also achieve very similar scores. There is a small gap, after which the RFC ($0.922$) and SVM ($0.914$) follow. The traditional methods do not profit as much from the smoothing as the neural network based methods.
The S scores of the neural network based models are also high, with the highest score being $0.911$ (DeepConvLSTM-A), followed by $0.910$ (LSTM), $0.909$ (DeepConvLSTM) and $0.908$ (LSTM-A). The values of CNN ($0.897$) and FC ($0.893$) are not far off either. However, the classical methods RFC ($0.761$) and SVM ($0.724$) do not reach the same level of performance, with the S score gap to the neural network based models even becoming a little bit bigger after the application of smoothing.
![F1 score and S score for problem 2, with smoothing](img/only_conv_hw_rm.pdf){#fig:p2_metrics_rm width=98%}
\input{tables/only_conv_hw_rm.tex}
Also, as in task 2 without smoothing, normalization brings about small to large performance decreases for all the neural network based models except for the FC network. The FC F1 score rises by $0.01$ when normalization is applied and its S score rises by $0.014$.
......