Evaluation results for the teams that submitted predictions for the NAILS pilot task are below.

BA = Balanced Accuracy

QUT (2 submissions):
BA: 0.82810 Timestamp: 1506039026.829094
BA: 0.85281 Timestamp: 1506039013.06217
Best scoring approach for QUT had a BA=0.8528

ARL17 (5 submissions):
BA: 0.88386 Timestamp: 1505857403.6826353
BA: 0.87243 Timestamp: 1505314319.1925228
BA: 0.85257 Timestamp: 1505222934.571628
BA: 0.84591 Timestamp: 1505154204.2565637
BA: 0.77234 Timestamp: 1503679734.9772093
Best scoring approach for ARL17 had a BA=0.8839

Accuracy in the NAILS pilot task was measured using balanced accuracy on the withheld test set. Each team was ranked by the highest balanced accuracy among its submitted prediction runs. The test set comprised 18000 feature vectors, each of which had to be labelled as either a target (900 entries where label=1) or a standard (17100 entries where label=-1). Details of how these accuracies break down across experimental participants (and similar) are available in the overview paper.
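
Because only 900 of the 18000 test entries are targets, plain accuracy would reward a classifier that labels everything as a standard. The sketch below illustrates this, assuming the usual definition of balanced accuracy as the mean of the per-class recalls; the actual scoring script used for the task is not shown here, and the function name balanced_accuracy is hypothetical.

```python
def balanced_accuracy(y_true, y_pred, positive=1, negative=-1):
    """Mean of target recall and standard recall (assumed definition)."""
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = sum(1 for t in y_true if t == negative)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")
    # Correctly labelled targets and correctly labelled standards
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == negative and p == negative)
    return 0.5 * (tp / n_pos + tn / n_neg)


# Same class sizes as the test set: 900 targets (1), 17100 standards (-1)
y_true = [1] * 900 + [-1] * 17100

# A trivial classifier that labels every entry as a standard
all_standard = [-1] * 18000

plain_accuracy = sum(t == p for t, p in zip(y_true, all_standard)) / len(y_true)
ba = balanced_accuracy(y_true, all_standard)

print(plain_accuracy)  # 0.95
print(ba)              # 0.5
```

Under this definition the trivial classifier scores 0.95 on plain accuracy but only 0.5 on balanced accuracy, so the BA values listed above should be read against a chance-level baseline of 0.5.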