NTCIREVAL

NTCIREVAL by Tetsuya Sakai

NTCIREVAL is a toolkit for computing various retrieval effectiveness metrics.
It supports NTCIR and TREC ad hoc retrieval evaluation, diversified search evaluation, group fairness evaluation, and NTCIR-8 Community QA Task evaluation, among others.

NTCIREVAL can compute metrics such as:

-Average Precision

-Q-measure

-nDCG

-Expected Reciprocal Rank (ERR)

-Graded Average Precision (GAP)

-Rank-Biased Precision (RBP)

-Expected Blended Ratio (EBR)

-intentwise Rank-Biased Utility (iRBU)

-Normalised Cumulative Utility (NCU)

-Condensed-List versions of the above metrics

-Bpref

-D#-measures and DIN#-measures for diversity evaluation

-Intent-Aware (IA) metrics and P+Q# for diversity evaluation

-Group Fairness and Relevance (GFR)

For details, please refer to the README file included in the tar file.
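
As a rough illustration of what two of the listed metrics measure, the following Python sketch computes Average Precision from binary relevance flags and nDCG from graded gain values. It is not NTCIREVAL's code or interface: the function names, the input format, and the log2(rank + 1) discount (one common nDCG formulation) are assumptions made for this example. The README remains the authoritative description of how NTCIREVAL defines each measure.

```python
import math


def average_precision(rels, num_relevant):
    """Average Precision for one topic.

    rels: binary relevance flags (1 or 0) of the retrieved documents, in rank order.
    num_relevant: total number of relevant documents known for the topic.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant document's rank
    return total / num_relevant


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, ideal_gains, cutoff):
    """nDCG at the given cutoff.

    gains: gain values of the retrieved documents, in rank order.
    ideal_gains: gain values of all known relevant documents for the topic.
    """
    ideal = dcg(sorted(ideal_gains, reverse=True)[:cutoff])
    if ideal == 0:
        return 0.0
    return dcg(gains[:cutoff]) / ideal


if __name__ == "__main__":
    # Hypothetical ranked list: relevant documents at ranks 1 and 3, 3 relevant in total.
    print(average_precision([1, 0, 1, 0], num_relevant=3))
    # Hypothetical graded gains for the top 3 retrieved documents.
    print(ndcg([3, 0, 1], ideal_gains=[3, 1, 1], cutoff=3))
```

Both functions score a single topic; averaging the per-topic scores over a topic set gives the usual mean values reported for a run.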

Download

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.230130.tar.gz

(Group Fairness and Relevance measures implemented for the FairWeb task)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.200626.tar.gz

(A minor update to avoid a warning message at compilation time; a few normalised measures added)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.200520.tar.gz

(New measures EBR and iRBU implemented; a script for computing measures based on continuous gain values added)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.190617.tar.gz

(Fixed a very minor bug that does not affect measure computations)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.190111.tar.gz

(Fixed a bug introduced in 161017; added a few evaluation measures)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.161017.tar.gz (a minor bug in ntcir_eval irec fixed, 2016-10-17, thanks to Dr Tomohiro Manabe)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.160507.tar.gz (now contains a script for creating topic-by-run matrices from nev files, 2016-05-07)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.141207.tar.gz (now Mac-compatible thanks to Dr. Makoto P. Kato, 2014-12-07)

https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.130507.tar.gz (makefile fixed on 2013-05-07)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.120718.tar.gz (bug in the NEVIAPQ2sharpnev script fixed on 2012-07-18)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.120528.tar.gz (bug for computing 1CLICK T-measure fixed on 2012-05-28)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.120508.tar.gz (updated on 2012-05-10)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.110426.tar.gz (updated on 2011-04-26)
https://research.nii.ac.jp/ntcir/tools/NTCIREVAL.100728.tar.gz (updated on 2010-07-28)

References

-Tetsuya Sakai: Metrics, Statistics, Tests, PROMISE Winter School 2013: Bridging between Information Retrieval and Databases (LNCS 8173), 2014.

-Tetsuya Sakai: How to Run an Evaluation Task: with a Primary Focus on Ad Hoc Information Retrieval, Information Retrieval Evaluation in a Changing World – Lessons Learned from 20 Years of CLEF, Springer, 2019.

-Tetsuya Sakai and Zhaohao Zeng: Retrieval Evaluation Measures that Agree with Users' SERP Preferences: Traditional, Preference-based, and Diversity Measures, ACM TOIS, 39(2), Article No.14, 2020.

-Tetsuya Sakai, Jin Young Kim, and Inho Kang: A Versatile Framework for Evaluating Ranked Lists in terms of Group Fairness and Relevance, ACM TOIS, to appear, 2023.