Commit 2d275867 authored by Alexander Henkel's avatar Alexander Henkel

work on experiments

parent 26624d2c
@@ -52,6 +52,7 @@ The neural network model I want to personalize has been implemented and trained
To personalize the model, I use transfer learning in a domain-adaptation manner as described in \secref{sec:relWorkPersonalization}. This requires additional labeled training data. Due to condition $1)$, all sensor data of a user is available, however without any labels. Therefore, I generate pseudo labels based on the predictions of the general activity recognition model. Additionally, these labels are refined as described in the following \secref{sec:approachLabeling}. The use of pseudo labels leads to a supervised training setup. This allows the model architecture to remain unchanged, and the main parts of the original training implementation and hyper parameter settings elaborated by Robin Burchard~\cite{robin2021} can be reused.
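As an illustration, the following minimal PyTorch-style sketch shows how such soft pseudo labels could be derived from the predictions of the general model; \texttt{base\_model} and \texttt{windows} are placeholder names and not part of the actual implementation.
\begin{verbatim}
# Sketch: soft pseudo labels from the general model's predictions
# (illustrative placeholder code, not the thesis implementation)
import torch

def generate_pseudo_labels(base_model, windows):
    """windows: tensor of unlabeled sensor windows, shape [N, ...]."""
    base_model.eval()
    with torch.no_grad():
        logits = base_model(windows)           # [N, num_classes]
        probs = torch.softmax(logits, dim=-1)  # prediction confidences
    # The class probabilities serve as soft pseudo labels and are
    # refined afterwards by the filters of the labeling step.
    return probs
\end{verbatim}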
\subsubsection{Regularization}\label{sec:approachRegularization}
To avoid over-fitting to the target during multiple iterations of personalization, I try two different approaches. The first approach is to freeze the feature layers of the model. As shown by Yosinski et al., feature layers tend to be more generalizable and can be better transferred to the new domain~\cite{Yosinski2014}. Therefore, the personalization is applied only to the classification layer. Thus, fewer parameters have to be fine-tuned, which results in less computation time, and a small amount of training data can already have a significant impact on the model. In the second approach, I apply the L2-SP penalty to the optimization as proposed by Xuhong et al.~\cite{xuhong2018explicit}. Here, the regularization restricts the search space to the vicinity of the initial model parameters. Therefore, information learned during pre-training is preserved even over multiple fine-tuning iterations. This allows all parameters to be adjusted, which offers more flexibility in fine-tuning. To test which approach fits best, I compare them in \secref{??}.
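For reference, the L2-SP penalty of Xuhong et al.~\cite{xuhong2018explicit} adds the following term to the training objective, where $w^0$ denotes the parameters of the pre-trained starting point, $w_S$ the parameters shared with it, $w_{\bar{S}}$ the remaining (newly added) parameters, and $\alpha$, $\beta$ the regularization strengths:
\begin{align}
\Omega(w) = \frac{\alpha}{2} \left\lVert w_S - w_S^0 \right\rVert_2^2 + \frac{\beta}{2} \left\lVert w_{\bar{S}} \right\rVert_2^2
\end{align}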
@@ -188,7 +189,7 @@ Additionally plots of the simulated test recordings and filtered hand wash inter
\input{figures/approach/personalization_pipeline}
\section{Active learning}\label{sec:approachActiveLearning}
To evaluate my personalization approach, I compare it with a simple implementation of semi-supervised active learning. Similar to my approach, the base model is used to predict on the new unseen datasets. These predictions can then be used to calculate the informativeness of the corresponding samples. The idea is to determine instances where the model is uncertain about the label. To do that, I calculate the entropy $H$ of the predictions for a sample $x_i$, which is defined as
\begin{align}
......
@@ -9,7 +9,7 @@ To evaluate my personalization approach I use different metrics which rely on gr
\section{Supervised personalization}
The following experiments show theoretical aspects of the personalization process.
\subsection{Transfer learning based on ground truth data}\label{sec:expTransferLearningGT}
These experiments build a baseline for how a personalized model could perform if perfect labeling were possible. Additionally, the influence of label noise is analyzed. First, the base model is trained on the ground truth data of the training set for each participant. After that, the resulting model is evaluated with the test data and the given metrics. Then $n\%$ of the \textit{null} labels and $h\%$ of the hand wash labels are flipped. Again, the base model is trained and evaluated with the new data. This is repeated over different values for $n$ and $h$. The plots of \figref{fig:supervisedNoisyAllSpecSen} and \figref{fig:supervisedNoisyAllF1S} show the resulting mean evaluations of all participants with increasing noise.
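The noise injection can be sketched as follows, assuming crisp binary labels with $1$ for hand washing and $0$ for \textit{null}; the function is illustrative and not an excerpt of the actual experiment code.
\begin{verbatim}
import numpy as np

def flip_labels(labels, null_ratio, hw_ratio, seed=0):
    """Flip null_ratio of the null labels and hw_ratio of the hand wash labels."""
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    null_idx = np.flatnonzero(labels == 0)
    hw_idx = np.flatnonzero(labels == 1)
    flip_null = rng.choice(null_idx, int(len(null_idx) * null_ratio), replace=False)
    flip_hw = rng.choice(hw_idx, int(len(hw_idx) * hw_ratio), replace=False)
    labels[flip_null] = 1   # null -> hw: false positive labels
    labels[flip_hw] = 0     # hw -> null: false negative labels
    return labels
\end{verbatim}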
%\input{figures/experiments/supervised_random_noise_hw_all}
@@ -23,30 +23,45 @@ First we concentrate on (a) of \figref{fig:supervisedNoisyAllSpecSen}. Here $n=0
I attribute the high performance loss to the imbalance of the labels. A typical daily recording of around 12 hours contains $28,800$ labeled windows. A single hand wash action of $20$ seconds covers ${\sim}13$ windows. If a user washed their hands $10$ times a day, this would lead to $130$ \textit{hw} labels and $28,670$ \textit{null} labels. Even $50\%$ of noise on the \textit{hw} labels results in only ${\sim}0.2\%$ of false data. But already $1\%$ of flipped \textit{null} labels leads to ${\sim}68\%$ of false hand wash labels. So they would have a higher impact on the training than the original hand wash data. As the S score of \figref{fig:supervisedNoisyPart} shows, it is possible that the personalized model benefits from additional data if the ratio of noise in \textit{null} labels is smaller than ${\sim}0.2\%$. The training data of the experiments contains $270,591$ \textit{null} labels and $2,058$ hand wash labels. So ${\sim}0.2\%$ noise would lead to ${\sim}541$ false \textit{hw} labels, which is ${\sim}20\%$ of the hand wash labels in the training data. As a rule of thumb, I claim that the training data should contain less than ${\sim}20\%$ false hand wash labels, whereas the amount of incorrect \textit{null} labels does not require particular focus.
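As a worked example of the ${\sim}68\%$ figure under the assumptions above:
\begin{align}
0.01 \cdot 28,670 \approx 287 \text{ false \textit{hw} labels}, \qquad \frac{287}{287 + 130} \approx 0.688.
\end{align}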
\subsection{Hard vs. Soft labels}\label{sec:expHardVsSoft}
In these experiments, I would like to show the effect of noise in soft labels compared to crisp labels. Similar to before, different amounts of label flips are applied to the training data. Then the labels are smoothed to a degree $s\in [0, 0.49]$. As seen before, noise on \textit{hw} labels does not have a significant impact on the performance. Therefore, not much change in performance due to different smoothing is expected. This is confirmed by \figref{fig:supervisedSoftNoiseHW}. Only for larger noise values can a trend be detected, which is a slight increase of the S score. I therefore focus on noise in \textit{null} labels. \figref{fig:supervisedSoftNoiseNull} gives detailed insights into the performance impact. For all noise values, the specificity increases with higher smoothing, which becomes clearer for more noise. But the sensitivity seems to decrease slightly, especially for higher noise rates. Overall, the F1 score and S score benefit from smoothing. In the case of $0.2\%$ noise, the personalized models trained on smoothed false labels, unlike those without smoothing, can reach a higher S score than the base model.
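The smoothing step can be sketched as follows. This reflects my reading of the smoothing degree $s$, namely that a crisp \textit{hw} label of $1$ becomes $1-s$ and a crisp \textit{null} label of $0$ becomes $s$; the code is illustrative and not an excerpt of the experiment implementation.
\begin{verbatim}
import numpy as np

def smooth_labels(labels, idx, s):
    """Smooth the crisp labels at the given indices by degree s in [0, 0.49]."""
    soft = labels.astype(float)
    soft[idx] = np.where(labels[idx] == 1, 1.0 - s, s)
    return soft
\end{verbatim}
Applied to the indices that were flipped before, this would yield noisy soft-label training data analogous to the setup described above.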
\input{figures/experiments/supervised_soft_noise_hw}
\input{figures/experiments/supervised_soft_noise_null}
\subsubsection{Negative impact of soft labels}\label{sec:expNegImpSoftLabel}
In the next step, I want to observe whether smoothing could have a negative effect if correct labels are smoothed. Therefore, I repeat the previous experiment but do not flip the randomly selected labels and just apply the smoothing $s$ to them. Again, no major changes in the performance due to noise in \textit{hw} labels are expected, which can also be seen in the left graph of \figref{fig:supervisedFalseSoftNoise}. In the case of wrongly smoothed \textit{null} labels, we can see a negative trend in the S score for higher smoothing values, as shown in the right graph. For a greater portion of smoothed labels, the smoothing value has a higher influence on the model's performance. But for noise values $\leq 0.2\%$, all personalized models still achieve higher S scores than the general models. Therefore, it seems that the personalization benefits from using soft labels. To make sure that the performance increase of smoothing false labels outweighs the drawbacks of falsely smoothed correct labels, I combined both experiments. This setup is oriented towards what happens to the labels if one of the denoising filters were applied to a hand wash section. First, a certain ratio $n$ of \textit{null} labels is flipped. This expresses the case in which the filter falsely classifies a \textit{null} label as hand washing. The false labels are smoothed to value $s$. After that, the same ratio $n$ of correct \textit{hw} labels is smoothed to value $s$. This is equivalent to smoothing the label boundaries of a hand wash action. The resulting performance of the personalizations can be seen in \figref{fig:supervisedSoftNoiseBoth}. The performance increase of smoothing false labels and the performance decrease of smoothing correct labels seem to cancel out for smaller values of $n$. For larger values, the performance slightly increases for larger $s$. For the refinement of the training data, I concentrate on using soft labels mainly within hand washing sections at the activity borders, since there the chance of false labeling is higher.
%\input{figures/experiments/supervised_false_soft_noise_hw}
%\input{figures/experiments/supervised_false_soft_noise_null}
\input{figures/experiments/supervised_false_soft_noise}
\input{figures/experiments/supervised_soft_noise_both}
%\begin{itemize}
% \item Which impact does hardened labels have against soft labels
% \item flip labels and smooth out
%\end{itemize}
\section{Evaluation of different pseudo label generations}
In this section, I describe the evaluation of different pseudo labeling approaches using the filters introduced in \secref{sec:approachFilterConfigurations}. For each filter configuration, the base model is used to predict the labels of the training sets and create pseudo labels. After that, the filter is applied to the pseudo labels. To determine the quality of the pseudo labels, they are evaluated against the ground truth values using soft versions of the metrics: $Sensitivity^{soft}$, $Specificity^{soft}$, $F_1^{soft}$, and $S^{soft}$. The general model is then trained on the refined pseudo labels. All resulting models are evaluated on their test sets and the mean over all of them is computed. \figref{fig:pseudoModelsEvaluation} shows a bar plot of the metrics for all filter configurations. In terms of performance, I concentrate on the values of the S score.
\subsubsection{Baseline configurations}
The configurations \texttt{all}, \texttt{high\_conf}, \texttt{scope}, \texttt{all\_corrected\_null}, \texttt{scope\_corrected\_null}, \texttt{all\_corrected\_null\_hwgt} and \texttt{scope\_corrected\_null\_hwgt} lead to a lower performance than the base model. \figref{fig:pseudoModelsTrainingData} (b) gives insights into why. The configurations \texttt{all}, \texttt{scope}, \texttt{all\_corrected\_null} and \texttt{all\_corrected\_null\_hwgt} lead to training data which contain a higher amount of false positive labels, i.e. \textit{null} samples which are labeled as hand washing. As seen in \secref{sec:expTransferLearningGT}, especially this kind of noise has a highly negative impact on the training. We can also see that, in comparison to the \texttt{all} configuration, \texttt{all\_corrected\_null} and \texttt{all\_corrected\_null\_hwgt} have a similar amount of false positive labels, although all labels within \textit{false} intervals are set to \textit{null}. From this it can be concluded that there are too many false positive outliers which are not covered by user feedback. Therefore, relying on all predictions and just correcting the labels inside user feedback still results in too much noise. However, a reduction of false positive labels alone does not necessarily lead to a higher performance of the resulting model. The configurations \texttt{scope}, \texttt{scope\_corrected\_null} and \texttt{scope\_corrected\_null\_hwgt} illustrate this. Whereas \texttt{scope} contains some false positives, all of them are cleaned in the other two configurations. But both reach a lower S score than \texttt{scope}, despite better evaluation values of the training data. The resulting models of \texttt{scope\_corrected\_null} and \texttt{scope\_corrected\_null\_hwgt} have a slightly higher specificity, resulting from the fewer false positive labels, but their sensitivity is significantly lower. The scoped configurations rely only on samples which are covered by \textit{positive} or \textit{false} intervals. Therefore, most of this data was predicted as hand washing, i.e. also the \textit{null} samples are similar to hand washing movements. Correcting these labels creates a dataset consisting mainly of similar hand washing movements which are labeled differently. So the training data lacks variety in \textit{null} samples, which penalizes the accuracy. Similarly, the dataset of the \texttt{high\_conf} configuration does not contain many false positive labels but performs worse than the \texttt{all} configuration. The model achieves a slightly higher sensitivity but a lower specificity. The dataset contains only samples for which the prediction has already been made with a high degree of certainty. Therefore, the model tends to overfit.
\subsubsection{All null configurations}
For all \texttt{all\_null\_*} configurations, nearly no false positive labels are included. Moreover, the training data consists of all available samples. In \figref{fig:pseudoModelsTrainingData} (a) you can see that the training data of the configuration \texttt{all\_null\_hwgt} reaches an almost perfect score. There are a few false negative labels, which can occur due to manual intervals that do not cover the entire hand washing process. The resulting model reaches nearly the same performance as the supervised trained model. Also, all other \texttt{all\_null\_*} models achieve roughly the supervised performance. Their training data just contain some false negative labels, which do not have a significant impact on the training process.
\subsubsection{All cnn configurations}
In the case of the \texttt{all\_cnn\_*} configurations, the training data obtain similar evaluation values as for the \texttt{all\_null\_*} configurations, but they contain fewer samples. This also leads to a slightly lower performance. Additionally, the dataset of \texttt{all\_cnn\_convlstm3} contains $0.4642\%$ false positive labels. Therefore, the resulting model performs worse, similar to the general model. The false positive labels result from sections where multiple outliers occur but did not trigger an evaluation event. Therefore, it is possible that the CNN filter does not clean them and they are included in the training set. The idea of the \texttt{all\_cnn\_*\_hard} configurations is to exclude exactly these cases. Moreover, the soft labels of high confidence predictions are set to their crisp value to counter the problem of smoothed correct labels, as seen in \secref{sec:expNegImpSoftLabel}. The respective models also achieve almost the same performance as the supervised model.
\input{figures/experiments/supervised_pseudo_models}
\input{figures/experiments/supervised_pseudo_models_training_data}
\subsection{Influence of missing feedback}
The following experiment shows the impact of missing user feedback on the training data and the resulting model performance. As before, the base model is trained on data which is refined with the different filter configurations. But in this case, only $f\%$ of the \textit{false} and $c\%$ of the \textit{correct} indicators exist. All others are replaced with neutral indicators.
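A sketch of how such missing feedback could be simulated is given below; the indicator representation is hypothetical and only serves to illustrate the subsampling.
\begin{verbatim}
import numpy as np

def subsample_feedback(indicators, keep_false, keep_correct, seed=0):
    """indicators: list of (kind, interval) with kind in {'false', 'correct'}.
    Keep roughly the given fractions; replace the rest with neutral indicators."""
    rng = np.random.default_rng(seed)
    result = []
    for kind, interval in indicators:
        keep = keep_false if kind == 'false' else keep_correct
        result.append((kind if rng.random() < keep else 'neutral', interval))
    return result
\end{verbatim}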
\begin{itemize}
\item Setup of filters
\item Comparison
@@ -55,6 +70,7 @@ In this section, I describe the evaluation of different pseudo labeling approach
\end{itemize}
\section{Evaluation over iteration steps}
In this section, I compare the performance of the personalized models between iteration steps. For this, the base model is applied to one of the training data sets of a participant, which is refined by one of the filter configurations. After that, the resulting personalized model is evaluated. This step is repeated over all training sets, where the previous base model is replaced by the new model. Additionally, I evaluate the performance of a single iteration step by always training and evaluating the base model on the respective training data. I repeat this experiment with different amounts of training epochs and for the two regularization approaches of \secref{sec:approachRegularization}.
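The iteration scheme can be sketched as follows; \texttt{refine}, \texttt{fine\_tune} and \texttt{evaluate} are injected placeholder callables and not the actual interfaces of the implementation.
\begin{verbatim}
def personalize_iteratively(base_model, training_sets, test_set,
                            refine, fine_tune, evaluate, incremental=True):
    """incremental=True carries the personalized model over to the next
    recording; incremental=False reproduces the single-iteration baseline."""
    model, scores = base_model, []
    for recording in training_sets:
        start = model if incremental else base_model
        labels = refine(start, recording)          # pseudo labels + filter configuration
        model = fine_tune(start, recording, labels)
        scores.append(evaluate(model, test_set))   # e.g. specificity, sensitivity, F1, S
    return scores
\end{verbatim}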
\begin{itemize}
\item How does the personalized model evolve over multiple training steps
\end{itemize}
@@ -64,6 +80,10 @@ In this section, I describe the evaluation of different pseudo labeling approach
specificity, sensitivity, f1, S1
\subsection{Comparison of active learning with my approach}
To confirm the robustness of my personalization approach, I compare it with a common active learning implementation as introduced in \secref{sec:approachActiveLearning}. To find an appropriate selection of the hyper parameters $B$, $s$, $h$, the use of weighting, and the number of epochs, I use a grid search approach. \tabref{tab:activeLearningGridSearch} shows the covered values for the hyper parameters.
\input{figures/experiments/table_active_learning_grid_search}
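The grid search itself reduces to an exhaustive loop over all parameter combinations, as sketched below; the example value grid is hypothetical, the actual values are those listed in \tabref{tab:activeLearningGridSearch}, and \texttt{run\_trial} stands for one training plus evaluation run.
\begin{verbatim}
from itertools import product

def grid_search(run_trial, grid):
    """Evaluate every hyper parameter combination and keep the best S score."""
    best_params, best_score = None, float('-inf')
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = run_trial(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# hypothetical example grid -- the real values are given in the table above
example_grid = {'B': [60, 120], 's': [0.2, 0.4], 'h': [0.5, 0.9],
                'use_weighting': [True, False], 'epochs': [10, 50]}
\end{verbatim}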
\begin{itemize}
\item Performance
\item User interaction
......
@@ -7,20 +7,20 @@
Configuration Name & Description\\ \midrule
\texttt{all} & Use all predictions as pseudo labels without any pre-processing \\
\texttt{high\_conf} & Use high confidence predictions as pseudo labels\\
\texttt{scope} & Just use pseudo labels within \textit{false, positive} intervals\\
\texttt{all\_corrected\_null} & Correct all pseudo labels within a \textit{false} interval to a \textit{null} label\\
\texttt{scope\_corrected\_null} & Same as \texttt{all\_corrected\_null} but just use labels within \textit{false, positive} intervals\\
\texttt{all\_corrected\_null\_hwgt} & Like \texttt{all\_corrected\_null} but additionally set all labels within a \textit{positive} interval to their ground truth values\\
\texttt{scope\_corrected\_null\_hwgt} & Like \texttt{all\_corrected\_null\_hwgt} but just use labels within \textit{false, positive} intervals\\
\texttt{all\_null\_hwgt} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals to their ground truth value \\
\texttt{all\_null\_score} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the naive approach \\
\texttt{all\_null\_deepconv} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the CNN approach \\
\texttt{all\_null\_fcndae} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the FCN-dAE approach \\
\texttt{all\_null\_convlstm1} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM1-dAE approach \\
\texttt{all\_null\_convlstm2} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM2-dAE approach \\
\texttt{all\_null\_convlstm3} & Set all pseudo labels to \textit{null} and within \textit{positive} intervals apply the convLSTM3-dAE approach \\
\texttt{all\_cnn\_convlstm3} & Apply the CNN approach to all labels. Use pseudo labels with high confidence or within \textit{false, positive} intervals. Set labels within a \textit{false} interval to \textit{null} and apply the convLSTM3-dAE approach to \textit{positive} intervals \\
\texttt{all\_cnn\_convlstm3\_hard} & Like \texttt{all\_cnn\_convlstm3} but set high confidence values to their corresponding hard label. Exclude all \textit{hw} labels which are not inside a \textit{positive} interval \\
......