Research Purpose Use of Test Collection
[NTCIR Data Home (English)]
[NTCIR Data Home (Japanese)]
NTCIR-12 Lifelog Task Outline
This pilot Lifelog task aimed to begin the comparative evaluation of information access and retrieval systems operating over personal lifelog data. The task consists of two subtasks, and participants could take part in either one or both independently.
The two subtasks are:
- Lifelog Semantic Access Task (LSAT) to explore search and retrieval from lifelogs
- Lifelog Insight Task (LIT) to explore knowledge mining and visualisation of lifelogs.
Lifelog Semantic Access Task (LSAT)
In this subtask, participants have to retrieve a number of specific moments in a lifelogger's life. We define moments as semantic events, or activities that happened throughout the day. The task is best compared to the known-item search task from TRECVid. Tasks can be undertaken in an interactive or automatic manner. Submissions include the start and end time of the day when the retrieved moments took place. Evaluation uses standard metrics based on MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain). The LSAT task includes 48 search tasks, generated by the lifeloggers and guided by Kahneman's lifestyle activities (Kahneman et al., 2004). The tasks consist of Topic, Description and Narrative (see below).
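As a rough illustration of the two metrics named above, the following Python sketch computes average precision, its mean over topics (MAP), and NDCG at a cutoff. It is not the official evaluation code: the function names and the moment identifiers are hypothetical, each ranked list is assumed to contain no duplicate items, and the NDCG variant shown uses the common log2(rank + 1) discount, which may differ from the variant used in the task.

```python
import math
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision for one topic (assumes no duplicate ids in the ranking)."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item_id in enumerate(ranked_ids, start=1):
        if item_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of per-topic average precision over the topics listed in qrels."""
    if not qrels:
        return 0.0
    total = sum(average_precision(runs.get(topic, []), relevant)
                for topic, relevant in qrels.items())
    return total / len(qrels)


def ndcg(ranked_ids: List[str], gains: Dict[str, float], k: int) -> float:
    """NDCG@k using gain / log2(rank + 1) as the discounted gain."""
    dcg = sum(gains.get(item_id, 0.0) / math.log2(rank + 1)
              for rank, item_id in enumerate(ranked_ids[:k], start=1))
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 1)
               for rank, gain in enumerate(ideal_gains, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

For example, `average_precision(["m1", "m2", "m3"], {"m1", "m3"})` averages the precision at rank 1 (1/1) and at rank 3 (2/3) over the two relevant moments.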
Lifelog Insight Task (LIT)
The aim of this subtask is to gain insights into the lifelogger's life. It follows the idea of the Quantified Self movement, which focuses on visualising knowledge mined from self-tracking data to provide "self-knowledge through numbers" (for ideas and examples of personal information visualisation, see http://quantifiedself.com/data-visualization/). Participants were asked to provide insights about the lifelog data that support the lifelogger in reflecting upon the data, facilitate filtering, and provide efficient and effective means of visualising the data. Note that this task was not evaluated; instead, participants were asked to present a demo at the NTCIR-12 conference. The LIT task includes ten information needs, generated by the lifeloggers and representing the concept of reflection from lifelogs. The tasks consist of Topic, Description and Narrative.
NTCIR-12 Lifelog Task Dataset
The NTCIR Lifelog test collection consists of data from three lifeloggers for a period of about one month each. The data consists of a large collection of wearable camera images (about 2 per minute) and an XML description of the lifelogger's semantic locations (e.g. Starbucks cafe, McDonalds restaurant, home, work) and physical activities (e.g. walking, transport, cycling) at a granularity of one minute.
Because lifelog data is typically visual in nature, and in order to reduce barriers to participation, the output of the CAFFE CNN-based visual concept detector was included in the test collection as additional metadata. This classifier provided labels and probabilities of occurrence for 1,000 objects in every image. The accuracy of the CAFFE visual concept detector is representative of the current generation of off-the-shelf visual analytics tools. To anonymise the data, all faces in the images were blurred and only semantic locations were provided.
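One simple way to use the concept detector output is to select the images in which a given label exceeds a probability threshold. The Python sketch below assumes the scores have already been loaded into a dictionary mapping image identifiers to label probabilities; that in-memory layout, the function name, the image identifiers, and the labels are all hypothetical and do not reflect the actual file format of the collection.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory layout: image id -> {concept label: probability}
ConceptScores = Dict[str, Dict[str, float]]


def images_with_concept(scores: ConceptScores, label: str,
                        threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (image id, probability) pairs for one label, most confident first."""
    matches = [(image_id, probs[label])
               for image_id, probs in scores.items()
               if probs.get(label, 0.0) >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Illustrative values only, not taken from the collection.
example_scores: ConceptScores = {
    "img_0001": {"coffee mug": 0.91, "laptop": 0.40},
    "img_0002": {"coffee mug": 0.12, "laptop": 0.88},
    "img_0003": {"coffee mug": 0.67},
}
mug_images = images_with_concept(example_scores, "coffee mug", threshold=0.6)
```

The threshold trades precision for recall: a lower value returns more candidate images at the cost of more false detections.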
Details of the dataset are as follows:
- Number of Lifeloggers: 3
- Size of the Collection (GB): 18.18GB
- Size of the Collection (Images): 88,124 images
- Size of the Collection (Locations): 130 locations
- Size of the Collection (Visual Concepts): 825MB
- Number of Ad-hoc Search (LSAT) Topics: 48
- Number of LIT Topics: 10
In addition to the test dataset, there was also a small dry-run dataset released consisting of one day of data from one lifelogger and ten topics.
To obtain a copy of the document data collection, please send an email to:
Access to the LifeLog datasets
The datasets can be downloaded from the NTCIR-Lifelog website. Participants need to fill in organisation and user-level data agreement forms to access the data. Each link is password protected and each organisation will receive a unique username and password to access the data.
Supporting Materials (to be used with trec_eval)
- Moment Level Qrels: ntcir12_lifelog_lsat_moment_qrels.txt
- Image Level Qrels: ntcir12_lifelog_lsat_image_qrels.txt
- Image to Moment Mapping File: ntcir12_lifelog_lsat_image_to_moment_mapping.csv
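The files listed above could be loaded along the following lines. This Python sketch assumes the qrels files follow the standard four-column trec_eval layout (topic id, iteration, item id, relevance) and that the mapping CSV has the image id in its first column and the moment id in its second, with no header row; neither assumption is confirmed here, and the function names and sample identifiers are hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, Set


def parse_qrels(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each topic id to the set of item ids judged relevant (relevance > 0)."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic_id, _iteration, item_id, relevance = parts
        if int(relevance) > 0:
            relevant[topic_id].add(item_id)
    return dict(relevant)


def parse_image_to_moment(csv_text: str) -> Dict[str, str]:
    """Map image id -> moment id, assuming two columns and no header row."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


# Illustrative input only, not taken from the released files.
sample_qrels = [
    "001 0 moment_17 1",
    "001 0 moment_42 0",
    "002 0 moment_03 1",
]
sample_mapping = "img_0001,moment_17\nimg_0002,moment_17\nimg_0003,moment_03\n"

qrels = parse_qrels(sample_qrels)
image_to_moment = parse_image_to_moment(sample_mapping)
```

With real data, the lines would come from an opened file, for example `parse_qrels(open(path, encoding="utf-8"))`, where `path` points at the downloaded qrels file.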
Topics (for both LSAT and LIT sub-tasks)
- LSAT sub-task topics (English): ntcir12_lifelog_lsat_topics_english.docx
- LSAT sub-task topics (Chinese): ntcir12_lifelog_lsat_topics_chinese.docx
- LIT sub-task topics (English): ntcir12_lifelog_lit_topics_english.docx
- LIT sub-task topics (Chinese): ntcir12_lifelog_lit_topics_chinese.docx
NTCIR-12 Lifelog Task data set available from NII
The test collection and data from NII are available free of charge.
NTCIR-12 Lifelog Supporting Materials and Topics are downloadable from NII/IDR at:
NTCIR-12 Lifelog Task Evaluation Results
NTCIR-12 Lifelog Task Evaluation Results
Last Updated 27 June 2016
- Further details can be found in the task overview paper:
Gurrin, C., Joho, H., Hopfgartner, F., Zhou, L., and Albatal, R. (2016). Overview of NTCIR-12 Lifelog Task. In Proceedings of the 12th NTCIR Conference, Tokyo, June 7-10, 2016.
- NTCIR-12 Lifelog website: http://ntcir-lifelog.computing.dcu.ie/
- NTCIR-12 Conference Proceedings (Lifelog)
The test collection was constructed and used for the NTCIR project. It is usable only for research purposes.
The document collection included in the test collection was made available
to NII for use in the NTCIR project free of charge or for a fee. The providers
of the document data understand the importance of such test collections
in research on information access technologies and have kindly given their
permission to use the data for research purposes. Please remember that
the document data in the NTCIR test collection is copyrighted and has commercial
value as data. To maintain a good relationship with the data producers/providers,
we researchers must be reliable partners and use the data only for research
purposes under the user agreement, and we must use the data carefully so
as not to violate copyright.
- Updated on: 2016-07-25