Commit 0213765f authored by burcharr's avatar burcharr 💬

automatic writing commit ...

parent f8657062
......@@ -50,7 +50,7 @@ To conclude the results of problem 3, the overall performance of this more diffi
### Practical applicability
The data from the real world evaluation with our test subjects shows that not all real world hand washing procedures are detected by our smart watch system. Overall, the system's sensitivity was only $28.33\,\%$ in the evaluation of a "normal day", which is much lower than in the theoretical results. However, this was to be expected to some degree, since real hand washing takes many forms and patterns that are unlikely to all be captured during the explicit recording of training data. In addition, the hand washing detection depended on the wrist on which the watch was worn, at least for some subjects.
The data from the real world evaluation with our test subjects shows that not all real world hand washing procedures are detected by our smart watch system. Overall, the system's sensitivity was only $28.33\,\%$ in the evaluation of a "normal day", which is much lower than in the theoretical results. However, this was to be expected to some degree, since real hand washing takes many forms and patterns that are unlikely to all be captured during the explicit recording of training data. In addition, the hand washing detection depended on the wrist on which the watch was worn, at least for some subjects. The performance was significantly worse if the watch was worn on the right wrist. This is likely due to the hand washing data used for training being collected almost exclusively with smart watches worn on the left wrist. If the data from subjects wearing the watch on the right wrist is left out, the overall detection sensitivity rises to $50\,\%$.
For some subjects, the smart watch application did not work properly, i.e. it did not start running in the background as desired, which is why their results could not be included in the reported results. It is also possible that other users' smart watch applications were inactive for some of the time, so that some hand washing procedures may have been missed during this time.
......@@ -60,11 +60,11 @@ Our theoretical results could therefore not be reached in the real life scenario
We also expected that a higher intensity or a longer duration of the hand washing would have a positive influence on the detection probability of the model on the smart watch. This seems logical for the longer duration due to the smoothing, but also for the intensity. It can be assumed that the system reaches higher certainties for high intensity washing than for low intensity washing, as it is likely more separable from less intense activities. However, the results showed a significantly positive correlation value only for intensity and detection rate, whereas the detection rate and hand washing duration seemed to be mostly uncorrelated. This may again be due to the relatively small sample size. For the longer washing tasks of 30s and 35s in particular, there were only 2 examples, one of which was not detected. This may have had a big influence on the absence of a positive correlation value in the evaluation results.
In addition, the system detected an average of 4 false positives per subject per day. These false positives could lead to annoyance and ultimately to users losing trust in the detection capabilities of the system. However, the number found here in the everyday task also varied a lot from subject to subject. Mainly, washing activities led to false positives, which was to be expected, because they involve movements similar to those of hand washing. Other activities also led to false positives, which also confirmed the theoretical results' high, but not very high, specificity.
In addition, the system detected an average of 4 false positives per subject per day. These false positives could lead to annoyance and ultimately to users losing trust in the detection capabilities of the system. However, the number found here in the everyday task also varied a lot from subject to subject. Mainly, washing activities led to false positives, which was to be expected, because they involve movements similar to those of hand washing. Other activities also led to false positives, which also confirmed that the high, but not very high, specificity of the theoretical results does not lead to the total avoidance of false positives.
The test of scenario 2, the task of intensively washing for at least 30 seconds, yielded a much higher accuracy. Per subject, the washing was detected in $76\,\%$ of washing repetitions on average. Compared to the sensitivity of $90\,\%$ reached for problem 1 with smoothing, this is lower by only $14$ percentage points. The discrepancy here is much smaller than in the everyday scenario. This could be due to the fact that the training data for hand washing procedures was also collected in a more controlled environment, so that the recorded patterns were more similar. The results of the evaluation for scenario 2 are thus better than the results for scenario 1.
In total, the practical evaluation showed some weaknesses and some strengths of the system. As the sample size is small, and system instabilities occurred, the results have to be interpreted carefully. The evaluation is valid, especially for the false positives and the activities provoking them. However, the low sensitivity found in the every day task does not match the much higher sensitivity found in the intensive hand washing task, and the differences between subjects were huge for scenario 1.
In total, the practical evaluation showed some weaknesses and some strengths of the system. As the sample size is small, and system instabilities occurred, the results have to be interpreted carefully. The evaluation is valid, especially for the false positives and the activities provoking them. However, the low sensitivity found in the every day task does not match the much higher sensitivity found in the intensive hand washing task, and the differences between subjects were huge for scenario 1. Part of the reason for this is the difference in performance on the left and right wrists respectively.
## Comparison of goals to results
#### Detection of hand washing in real time from inertial motion sensors
......@@ -110,4 +110,4 @@ In the second test of the practical evaluation, subjects performed intensive and
Hence, the evaluation results suggest that the developed system is able to properly detect hand washing in many cases. The specificity and sensitivity of the system are high, but leave some room for improvement.
In conclusion, the application of wrist worn sensor data to the detection of hand washing and compulsive hand washing remains an interesting and open field of research, with many possible areas of application. The detection of obsessive hand washing in particular would be a world first, and seems promising for future usage in the treatment of OCD patients. Due to the possibility of directly running neural network models on wrist worn smart watches, interventions could be generated in real time and with low latency.
In conclusion, the application of wrist worn sensor data to the detection of hand washing and compulsive hand washing remains an interesting and open field of research, with many possible areas of application. The detection of compulsive hand washing in particular would be a world first, and seems promising for future usage in the treatment of OCD patients. Due to the possibility of directly running neural network models on wrist worn smart watches, interventions could be generated in real time and with low latency.
......@@ -23,7 +23,7 @@ The separation of compulsive hand washing from ordinary hand washing is an even
One method of treatment for clinical cases of OCD is exposure and response prevention (ERP) therapy @meyer_modification_1966 @whittal_treatment_2005. Using this method, patients who suffer from OCD are exposed to situations in which their obsessions are stimulated, and they are helped to prevent compulsive reactions to the stimulation. The patients can then "get used" to the situation in a sense, and thus the reaction to the stimulation will be weakened over time. This means that their quality of life is improved, as the severity of their OCD declines.
A successful, i.e. reliable and accurate, system for obsessive hand washing detection could be used to intervene whenever compulsive hand washing is detected. It could therefore help psychologists and their patients in the treatment of the symptoms. It could help the user to stop the compulsive behavior by issuing a warning. Such a warning could be a vibration of the device, or a sound that is played upon the detection of compulsive behavior. However, the hypothesis of usefulness is yet to be tested, as no such system exists as of now. Therefore we want to develop a system that can not only detect hand washing with low latency and in real time, but also discriminate between usual hand washing and obsessive-compulsive hand washing at the same time. The system could then, as described, be used in ERP therapy sessions, but also in everyday life, to prevent compulsive hand washing.
A successful, i.e. reliable and accurate, system for compulsive hand washing detection could be used to intervene whenever compulsive hand washing is detected. It could therefore help psychologists and their patients in the treatment of the symptoms. It could help the user to stop the compulsive behavior by issuing a warning. Such a warning could be a vibration of the device, or a sound that is played upon the detection of compulsive behavior. However, the hypothesis of usefulness is yet to be tested, as no such system exists as of now. Therefore we want to develop a system that can not only detect hand washing with low latency and in real time, but also discriminate between usual hand washing and obsessive-compulsive hand washing at the same time. The system could then, as described, be used in ERP therapy sessions, but also in everyday life, to prevent compulsive hand washing.
### Wrist worn sensors
Different types of sensors can be used to detect activities such as hand washing. It is possible to detect hand washing from RGB camera data to some extent. However, in order for this to work, we would need to place a camera at every place and room a subject could want to wash their hands at. This is unfeasible for most applications of hand washing detection, and could be very expensive. Added to that it might be problematic to place cameras inside wash or bath rooms for privacy reasons. Thus, a better alternative could be body worn, camera-less devices.
......
......@@ -23,7 +23,7 @@ Recently, deep neural networks have taken over the role of the state of the art
The connections' parameters are optimized using forward passes through the network of nodes, followed by the execution of the backpropagation algorithm, and an optimization step. We can accumulate all the gradients with regard to a loss function for each of the parameters and for a small subset of the data passed and perform "stochastic gradient descent" (SGD). SGD or alternative similar optimization methods like the commonly used Adam @kingma_adam_2017 optimizer perform a parameter update step. After many such updates and if the training works well, the network parameters will have been updated to values that lead to a lower value of the loss function for the training data. However, there is no guarantee of convergence whatsoever. As mentioned above, deep neural networks can, in theory, be used to approximate arbitrary functions. Nevertheless, the parameters for the perfect approximation cannot be easily found, and empirical testing has revealed that neural networks do need a lot of training data in order to perform well, compared to classical machine learning methods. In return, with enough data, deep neural networks often outperform the classical machine learning methods.
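The update loop described above can be illustrated with a minimal sketch. It is not the training code used in this work: it assumes a toy linear model `y = w * x + b`, a mean squared error loss, and hand-derived gradients instead of backpropagation through a network. The function name `sgd_fit` and all hyperparameters are hypothetical.

```python
import random


def sgd_fit(xs, ys, lr=0.1, epochs=200, batch_size=4, seed=0):
    """Fit y ~ w * x + b with minibatch SGD on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # a new random order of the data in every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w, grad_b = 0.0, 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]  # forward pass and residual
                grad_w += 2.0 * error * xs[i]    # d(error^2) / dw
                grad_b += 2.0 * error            # d(error^2) / db
            # Parameter update step with the gradient averaged over the batch.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


# Toy data generated from y = 3x + 1 (illustrative only).
xs = [i / 10.0 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = sgd_fit(xs, ys)
```

On this noise-free toy problem the parameters approach the generating values. For a deep network, the loss is not convex and no such guarantee exists.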
###### Convolutional neural networks (CNNs)
are neural networks that are not fully connected, but work by using convolutions with a kernel that is slid over the input. CNNs were first introduced for handwritten character recognition @lecun_backpropagation_1989 @le_cun_handwritten_1990 (1989, 1990), but were later revived for computer vision tasks @krizhevsky_imagenet_2012 (2012), after more computational power was available on modern devices to train them. Since the rise of CNNs in computer vision, most computer vision problems are solved with them. The convolutions work by moving filter windows with learnable parameters (also called kernels) over the input @albawi_understanding_2017. As opposed to a fully connected network, the weights are shared over many of the nodes, because the same filters are applied over the full size of the input. CNNs have fewer parameters to train than a fully connected network with the same number of nodes, which makes them easier to train. They are generally expected to perform better than FC networks, especially on image related tasks. The filters can be 2-dimensional, like for images (e.g. a 5x5 filter moved across the two axes of an image) or 1-dimensional, which can e.g. be used to slide a kernel along the time dimension of a sensor recording. Even in the 1-dimensional case, fewer parameters are needed compared to the application of a fully connected network. Thus, the 1-dimensional CNN is expected to be easier to train and achieve a better performance.
are neural networks that are not fully connected, but work by using convolutions with a kernel that is slid over the input. CNNs were first introduced for handwritten character recognition @lecun_backpropagation_1989 @le_cun_handwritten_1990 (1989, 1990), but were later revived for computer vision tasks @krizhevsky_imagenet_2012 (2012), after more computational power was available on modern devices to train them. Since the rise of CNNs in computer vision, most computer vision problems are solved with them. The convolutions work by moving filter windows with learnable parameters (also called kernels) over the input @albawi_understanding_2017. As opposed to a fully connected network, the weights are shared over many of the nodes, because the same filters are applied over the full size of the input. CNNs have fewer parameters to train than a fully connected network with the same number of nodes, which makes them easier to train. They are generally expected to perform better than FC networks, especially on image related tasks. The filters can be 2-dimensional (2d), like for images (e.g. a 5x5 filter moved across the two axes of an image) or 1-dimensional (1d), which can e.g. be used to slide a kernel along the time dimension of a sensor recording. Even in the 1-dimensional case, fewer parameters are needed compared to the application of a fully connected network. Thus, the 1-dimensional CNN is expected to be easier to train and achieve a better performance.
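As a small illustration of the 1-dimensional case, the following sketch slides a kernel along the time axis of a single sensor channel and compares the number of weights to a fully connected layer producing the same number of outputs. The function `conv1d`, the window length of 150 samples and the kernel size of 3 are assumptions chosen for the example, not values taken from this work.

```python
def conv1d(signal, kernel):
    """Slide `kernel` over `signal` without padding ("valid" mode).

    As in most deep learning frameworks, this is a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    k = len(kernel)
    return [
        sum(signal[t + j] * kernel[j] for j in range(k))
        for t in range(len(signal) - k + 1)
    ]


# The same three weights are reused at every time step (weight sharing).
output = conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0])

# Parameter comparison for a hypothetical window of 150 samples (biases ignored).
window_length = 150
kernel_size = 3
num_outputs = window_length - kernel_size + 1
conv_weights = kernel_size                 # shared across all positions
fc_weights = window_length * num_outputs   # one weight per input-output pair
```

Here the convolution needs 3 weights, whereas a fully connected layer with the same input and output sizes needs 22200.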
###### Recurrent neural networks (RNNs)
......@@ -94,7 +94,7 @@ Another study by Singh et al. combines DeepConvLSTM with a self-attention mechan
For HAR, DeepConvLSTM and the models derived from it are the state of the art machine learning methods, as they consistently outperform other model architectures on the available benchmarks and data sets.
## Hand washing
To our knowledge, no study has ever tried to separately predict obsessive hand washing as opposed to non-obsessive hand washing.
To our knowledge, no study has ever tried to separately predict compulsive hand washing as opposed to non-compulsive hand washing.
Most studies that try to automatically detect hand washing are aiming for compliance improvements, i.e. trying to increase or measure the frequency of hand washes or assessing or improving the quality of hand washes.
Hand washing compliance can be measured using different tools. Jain et al. @jain_low-cost_2009 use an RFID-based system to check whether health care workers comply with hand washing frequency requirements. However, the system is merely used to make sure all workers entering an emergency care unit have washed their hands. Bakshi et al. @bakshi_feature_2021 developed a hand washing detection data set with RGB video data, and showed a valid way to extract SIFT-descriptors from it for further research. Llorca et al. showed a vision based system for automatic hand washing quality assessment @llorca_vision-based_2011 based on the detection of skin in RGB images using optical techniques such as optical flow estimation.
......
......@@ -59,9 +59,9 @@ Also, like in task 2 two without smoothing, normalization brings about small to
\FloatBarrier
### Classifying hand washing and obsessive hand washing separately and distinguishing from other activities
### Classifying hand washing and compulsive hand washing separately and distinguishing from other activities
The three class problem of classifying hand washing, obsessive hand washing and other activities is harder than the other two problems, as it contains them both at once. The resulting confusion matrix for each of the neural network classifiers is shown in @fig:confusion. The version trained on the normalized data is shown on the left, while the version of the same network class trained on the non-normalized data is shown on the right. Each confusion matrix shows what percentage of the true labels of a class was classified into each of the three available classes. Optimally, the diagonal values would all be $1.0$ and the off-diagonal values all $0.0$. The matrices are color-coded with the same value ranges, so that they are directly comparable.
The three class problem of classifying hand washing, compulsive hand washing and other activities is harder than the other two problems, as it contains them both at once. The resulting confusion matrix for each of the neural network classifiers is shown in @fig:confusion. The version trained on the normalized data is shown on the left, while the version of the same network class trained on the non-normalized data is shown on the right. Each confusion matrix shows what percentage of the true labels of a class was classified into each of the three available classes. Optimally, the diagonal values would all be $1.0$ and the off-diagonal values all $0.0$. The matrices are color-coded with the same value ranges, so that they are directly comparable.
![Confusion matrices for all neural network based classifiers with and without normalization of the sensor data](img/confusion.pdf){#fig:confusion width=98%}
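To make the row-wise reading of these matrices concrete, the following sketch computes a confusion matrix in which each row (true class) is normalized to sum to $1.0$, together with the mean of its diagonal. The labels are made up for illustration and the function names are hypothetical; this is not the evaluation code of this work.

```python
def normalized_confusion_matrix(true_labels, predicted_labels, num_classes):
    """Return a matrix whose entry [t][p] is the share of class t predicted as p."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows of classes without any samples are left as all zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def mean_diagonal(matrix):
    """Average share of correctly classified samples per class."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


# Illustrative labels: 0 = other, 1 = hand washing, 2 = compulsive hand washing.
true_labels = [0, 0, 1, 1, 2, 2, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 2, 0, 2]
matrix = normalized_confusion_matrix(true_labels, predicted_labels, 3)
score = mean_diagonal(matrix)
```

Because each row is normalized separately, a class with few samples contributes as much to the mean diagonal value as a class with many samples.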
......@@ -100,9 +100,9 @@ The mean diagonal value of the confusion matrix upholds almost the same ordering
### Scenario 1: One day of evaluation
In the first scenario, the 5 (TODO) subjects reported an average of $4.75$ hand washing procedures on the day on which they evaluated the system.
Per subject, there were $4.75$ ($\pm 3.3$) hand washing procedures. Out of those, $1.75$ ($\pm 2.06$) were correctly identified. The accuracy per subject was $28.33\,\%$ ($\pm 37.9\,\%$). The highest accuracy for a subject was $80\,\%$ out of 5 hand washes, the lowest was $0\,\%$ out of 4 hand washes.
Per subject, there were $4.75$ ($\pm 3.3$) hand washing procedures. Out of those, $1.75$ ($\pm 2.06$) were correctly identified. The accuracy per subject was $28.33\,\%$ ($\pm 37.9\,\%$). The highest accuracy for a subject was $80\,\%$ out of 5 hand washes, the lowest was $0\,\%$ out of 4 hand washes. Of all hand washing procedures conducted over the day by the subjects, $35.8\,\%$ were detected correctly.
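The accuracy per subject and the share of all hand washes detected are two different averages, which is why they need not agree. The sketch below shows the distinction using made-up counts; the numbers are purely hypothetical and are not the study data.

```python
# Hypothetical (washes, detected) counts per subject, NOT the study data.
subjects = [(5, 4), (4, 0), (10, 2)]

# Mean of the per-subject detection rates: every subject has the same weight.
per_subject_rates = [detected / washes for washes, detected in subjects]
mean_per_subject = sum(per_subject_rates) / len(per_subject_rates)

# Pooled detection rate: every hand wash has the same weight.
total_washes = sum(washes for washes, _ in subjects)
total_detected = sum(detected for _, detected in subjects)
pooled_rate = total_detected / total_washes
```

In this example the per-subject mean is about $33.3\,\%$ while the pooled rate is about $31.6\,\%$, because subjects with many hand washes dominate the pooled value.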
Some subjects wore the smart watch on the left wrist instead of the right wrist, and reported worse results for that.
Some subjects wore the smart watch on the right wrist instead of the left wrist, and reported worse results for that. Leaving out hand washes conducted with the smart watch worn on the right wrist, the detection sensitivity rises to $50\,\%$.
The duration and intensity of the hand washing process also played a role.
The correlation of the duration of the hand washing with the detection rate is $-0.039$. However, the raw data contains only 2 "longer" hand washes over 30 seconds, the rest being in the range of 10 to 25 seconds.
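The text does not state which correlation coefficient was used. Assuming a Pearson correlation between the duration of each hand wash and a binary detection outcome, the computation could be sketched as follows. The data values are invented for illustration and the helper `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical durations in seconds and detection outcomes (1 = detected).
durations = [10, 15, 15, 20, 25, 30, 35]
detected = [0, 1, 0, 0, 1, 1, 0]
r = pearson(durations, detected)

# Sanity check on perfectly linear data.
r_linear = pearson([1, 2, 3], [2, 4, 6])
```

With so few long hand washes, a single changed outcome can shift such a coefficient noticeably, which is consistent with the caution expressed above.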
......@@ -125,5 +125,6 @@ Some subjects also reported difficulties with the smart watch application (not p
### Scenario 2: Controlled intensive hand washing
In scenario 2, the subjects each washed their hands at least 3 times. Some subjects voluntarily agreed to perform more repetitions, which leads to more than 3 washing detection results per subject. The detection accuracy per subject was $76\,\%$ ($\pm 25\,\%$), with the highest being $100\,\%$ and the lowest being $50\,\%$.
The mean accuracy over all repetitions, not split by subject, was $73.7\,\%$. For scenario 2, one user moved the smart watch from the left wrist to the right wrist after two repetitions. The first two repetitions were not detected, while the two repetitions with the smart watch worn on the right wrist were detected correctly.
The mean accuracy over all repetitions, not split by subject, was $73.7\,\%$. For scenario 2, one user moved the smart watch from the right wrist to the left wrist after two repetitions. The first two repetitions were not detected, while the two repetitions with the smart watch worn on the left wrist were detected correctly. Leaving out hand washes conducted with the smart watch worn on the right wrist, the detection sensitivity rises to $78.6\,\%$, and the detection accuracy per subject is $82.5\,\%$ ($\pm 23.6\,\%$).