To separate non-obsessive from obsessive hand washing, data of obsessive hand washing must also be included. To record such data, real patients can be asked to wear a sensor during their daily lives.
### Data used in our data set
We used hand washing data and "compulsive" hand washing data recorded at the University of Basel and the University of Freiburg as our "positive" class data. This data was recorded on multiple occasions and using different paradigms. We mainly used data recorded at $50\,$Hz using a smart watch application. Data was recorded on two occasions, in 2019 and 2020. The data from 2019 includes hand washing data and, in addition, simulated "compulsive" hand washing. For the simulated compulsive hand washing, subjects were asked to ... (TODO: ask Phil for the protocols).
We used the simulated compulsive hand washing data as our compulsive hand washing data, as we did not have access to recordings of actual obsessive hand washing. Thus, whenever we write about "compulsive" hand washing data in this thesis, the simulated hand washing is meant.
The recording and labeling of the data are not part of this work.
In addition, multiple data sets from other studies were used. Our selection includes publicly available data sets, each of which contains wrist-worn sensor data from at least one arm. Not all of the data sets were recorded at a frequency of $50\,$Hz; we therefore resampled all obtained data to our fixed frequency using linear interpolation.
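To make the resampling step concrete, the following is a minimal sketch of linear-interpolation resampling with NumPy (the function name and the placeholder data are ours; the actual preprocessing code is not shown in this section):

```python
import numpy as np

def resample_linear(signal: np.ndarray, f_in: float, f_out: float = 50.0) -> np.ndarray:
    """Resample a (T, C) sensor recording from f_in Hz to f_out Hz, channel-wise."""
    t_in = np.arange(signal.shape[0]) / f_in       # original sample times in seconds
    t_out = np.arange(0.0, t_in[-1], 1.0 / f_out)  # target sample times at f_out Hz
    # np.interp is 1-D, so each sensor channel is interpolated separately.
    return np.stack([np.interp(t_out, t_in, signal[:, c])
                     for c in range(signal.shape[1])], axis=1)

acc_20hz = np.random.randn(2000, 3)          # placeholder for a 20 Hz recording
acc_50hz = resample_linear(acc_20hz, 20.0)   # upsampled to 50 Hz, shape ~(5000, 3)
```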
\begin{table}[]
\begin{tabular}{|l|l|l|l|}
\hline
No & Dataset name & Contained activities (excerpt) & Recording frequency \\ \hline
1 & 2019 & Hand washing and compulsive hand washing & 50 Hz \\ \hline
2 & 2020 & Different hand washing activities & 50 Hz \\ \hline
3 & 2020 long-term & All-day recordings of activities of daily living & 50 Hz \\ \hline
4 & WISDM & Movement (walking, jogging, stairs, sitting, ...) & 20 Hz \\ \hline
5 & RealWorld & Movement (walking, jogging, stairs, sitting, ...) & 50 Hz \\ \hline
6 & REALDISP & Movement and fitness exercises & 50 Hz \\ \hline
7 & PAMAP2 & Movement, sports, household chores, desk work & 100 Hz \\ \hline
\end{tabular}
\caption{Data sets used in our combined data set. Data sets 1 to 3 stem from Freiburg / Basel; the rest are external data sets.}
\label{tbl:datasets}
\end{table}
The external data sets used are:
- WISDM @kwapisz_activity_2011
- RealWorld @sztyler_-body_2016
- REALDISP @banos_benchmark_2012
- PAMAP2 @reiss_introducing_2012
The external data sets were collected and converted by Daniel Homm, and analyzed and resampled by us. Their contents can be seen in table \ref{tbl:datasets}.
### Specifications of the resulting data set
The final data set contains a total of 14.4 million 6-dimensional data points. From these 14.4 million data points, we created windows of 150 samples (3 seconds) in length, with $50\,\%$ overlap. This left us with about 194,000 windows. Of those windows, about 15,750 ($8.2\,\%$) contained hand washing and about 178,500 ($91.8\,\%$) contained other activities or idle time. Of the roughly 15,750 hand washing windows, about 10,250 ($65\,\%$) were compulsive hand washing windows and about 5,500 ($35\,\%$) were non-compulsive washing. We note that for most machine learning methods it makes sense to balance the training set with regard to the classes, in order to avoid biases towards the more frequent classes. For certain machine learning algorithms, the class imbalance problem can also be addressed by importance weighting of the different classes. To train a neural network, the loss function can likewise be weighted by the class frequencies.
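As an illustration of the windowing and class weighting described above, the following sketch shows both steps, assuming NumPy and PyTorch (the helper name and the use of `CrossEntropyLoss` are our own choices for the example):

```python
import numpy as np
import torch
import torch.nn as nn

def make_windows(data: np.ndarray, length: int = 150, overlap: float = 0.5) -> np.ndarray:
    """Cut a (T, 6) recording into overlapping windows of shape (N, length, 6)."""
    step = int(length * (1.0 - overlap))          # 75 samples at 50 % overlap
    starts = range(0, data.shape[0] - length + 1, step)
    return np.stack([data[s:s + length] for s in starts])

# Inverse-frequency class weights from the approximate window counts above
# (Null vs. hand washing); each class then contributes equally in expectation.
counts = torch.tensor([178_500.0, 15_750.0])
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=weights)   # frequency-weighted loss
```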
## Description of different classification problems
\label{sec:classification_problems}
The DeepConvLSTM and its modifications are considered state of the art in Human Activity Recognition tasks. We apply our implementation to the hand washing classification problem. DeepConvLSTM combines the advantages of convolutional layers and LSTMs. We implement it using the original design, with four convolutional layers followed by two LSTM layers and a classification layer. As in the convolutional neural network, we use $64$ filters in each of the layers, a kernel size of $9$ and a stride of $1$. During preliminary testing, leaving out one LSTM layer, as proposed by Bock et al. @bock_improving_2021, did not yield a significantly different performance. Thus, we use two layers, as was done in the original study. The LSTM layers each have a hidden size of $128$. The classification layer has output size $2$ or $3$. This results in a network with around 346,000 learnable parameters.
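A minimal PyTorch sketch consistent with the configuration stated above is given below (activation functions and other unstated details are assumptions; with ReLU activations, the sketch comes out at roughly the quoted 346,000 parameters for the 2-class case):

```python
import torch
import torch.nn as nn

class DeepConvLSTM(nn.Module):
    """Sketch: four 1-D convolutions, two LSTM layers, one classification layer."""
    def __init__(self, n_channels: int = 6, n_classes: int = 2,
                 n_filters: int = 64, kernel_size: int = 9, hidden_size: int = 128):
        super().__init__()
        layers, in_ch = [], n_channels
        for _ in range(4):                        # four convolutional layers over time
            layers += [nn.Conv1d(in_ch, n_filters, kernel_size, stride=1), nn.ReLU()]
            in_ch = n_filters
        self.conv = nn.Sequential(*layers)
        self.lstm = nn.LSTM(n_filters, hidden_size, num_layers=2, batch_first=True)
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels); Conv1d expects (batch, channels, time)
        z = self.conv(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.lstm(z)                     # hidden states for every time step
        return self.fc(out[:, -1])                # classify from the last time step

model = DeepConvLSTM()
print(sum(p.numel() for p in model.parameters()))  # ~346,000
```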
#### DeepConvLSTM with attention mechanism (DeepConvLSTM-A)
To our knowledge, no previous work exists that couples DeepConvLSTM with the exact attention mechanism used in LSTM-A. Only after starting the work on this thesis did we find out that a similar approach had been tried by Singh et al. @singh_deep_2021. We combine the two methods DeepConvLSTM and LSTM-A, ending up with DeepConvLSTM-A. The attention mechanism is implemented in exactly the same way as in the LSTM with attention mechanism by Zeng et al. @zeng_understanding_2018, and is therefore different from the one used by Singh et al. Because we found out about the work of Singh et al. late, we did not add their version to the list of architectures we tried; the resulting architecture for our version of a DeepConvLSTM with attention mechanism is still different from theirs. The data is first passed through the four convolutional layers with the same configuration as for DeepConvLSTM. It is then passed through the LSTM, which has only one layer here, and the hidden states generated over the series of time steps are combined with the weighted sum as in LSTM-A. Afterwards, these results are passed through the fully connected classification layer. The DeepConvLSTM-A model has around 230,000 parameters that need to be trained.
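The attention variant can be sketched analogously (again an assumption-laden sketch: we reuse the convolutional stack from the DeepConvLSTM sketch above in its default configuration and assume the bilinear score function $score(\mathbf{h}_T,\mathbf{h}_s)=\mathbf{h}_T^T\mathbf{W}_{\alpha}\mathbf{h}_s$ from LSTM-A; with these choices the model lands near the quoted 230,000 parameters):

```python
class DeepConvLSTMA(DeepConvLSTM):
    """Sketch: DeepConvLSTM with one LSTM layer and a Zeng-style attention head."""
    def __init__(self, hidden_size: int = 128, **kwargs):
        super().__init__(hidden_size=hidden_size, **kwargs)
        # Replace the two-layer LSTM with a single layer, as described above.
        self.lstm = nn.LSTM(64, hidden_size, num_layers=1, batch_first=True)
        self.W_alpha = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm(z)                       # h: (batch, steps, hidden)
        h_T = h[:, -1]                            # last hidden state as the query
        # score(h_T, h_s) = h_T^T W_alpha h_s, evaluated for every time step s
        scores = torch.einsum('bh,hk,bsk->bs', h_T, self.W_alpha, h)
        alpha = torch.softmax(scores, dim=1)      # attention weights over time
        context = (alpha.unsqueeze(-1) * h).sum(dim=1)  # weighted sum of states
        return self.fc(context)
```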
### Training routines and hyperparameter search
### Practical evaluation
For the practical evaluation, we asked TODO XY subjects to test the system in practice. We defined two different paradigms, one for real world performance evaluation and one for explicit evaluation of the model running on the smart watch. To this end, the model with the best performance on the test set of task 1, i.e., the general detection of hand washing, was exported to be executed on the watch inside the described smart watch application. We limited the testing to these scenarios because we did not have access to subjects who actually wash their hands compulsively. The scenarios were:
1. The subjects wear a smart watch for one day. During this time, whenever they wash their hands, the watch may or may not detect the hand washing procedure. The subjects note down whether or not the hand washing was recognized correctly.
2. The subjects specifically go to the bathroom to wash their hands 3 times to test the recognition. They note down whether or not the hand washing was recognized correctly. The hand washing is supposed to be done thoroughly and intensively (at least 30 seconds of washing per repetition).
Scenario 1 can be used to evaluate the real world performance of the classifier in day-to-day living. It is supposed to gather information about the use cases in which the system works well, but also about the cases in which it fails. There are many conceivable activities of daily living that are not included in the data set, i.e., activities unseen by the classifier. Such activities might be problematic, as they are unlikely to perfectly resemble any Null class activities the classifier was trained on. The test in scenario 1 is supposed to uncover some of these activities, which might be detected as false positives. In addition, by having the subjects note down whether the detection worked every time they washed their hands, we also get an estimate of the sensitivity of the system beyond what the theoretical evaluation yielded.
Scenario 2 is used to check whether the system works correctly most of the time when we know for certain that intensive washing is involved. It also ensures the subjects' active compliance by making the hand washing activity their main focus. In scenario 1, it is possible that a subject forgets to take notes at some point, which is less likely in the controlled hand washing scenario.
Together, the two scenarios provide a basis for estimating the real world performance of the system.
The invitation to the hand washing evaluation (German language) with the exact description of the two scenarios can be found in the appendix.
They evaluate their approach on three data sets and report state-of-the-art performance, beating the initial DeepConvLSTM.
Another study by Singh et al. combines DeepConvLSTM with a self-attention mechanism @singh_deep_2021. The attention mechanism is very similar to the one used by Zeng et al. @zeng_understanding_2018, where the mechanism consists of a layer that follows the LSTM layers in the DeepConvLSTM network. Instead of using a weighted sum, Singh et al. find the weights $\mathbf{\alpha}$ by applying the softmax function to the output of a fully connected layer. They also report a statistically significant increase in performance compared to the initial DeepConvLSTM.
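A possible reading of that construction, as a short sketch (shapes and names are ours, not taken from the paper): the fully connected layer produces one scalar score per time step, and the softmax over those scores yields the weights.

```python
import torch
import torch.nn as nn

h = torch.randn(8, 118, 128)                 # LSTM outputs: (batch, steps, hidden)
score_layer = nn.Linear(128, 1)              # one scalar score per time step
alpha = torch.softmax(score_layer(h).squeeze(-1), dim=1)   # attention weights
context = (alpha.unsqueeze(-1) * h).sum(dim=1)             # attended summary
```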
## Hand washing
To our knowledge, no study has ever tried to separately predict obsessive hand washing as opposed to non-obsessive hand washing.