[ntcir:157] ntcir-5 web dry run document data

To those who are interested in Web information retrieval,

# Apologies for multiple copies.

We started the WEB Task of the Fifth NTCIR Workshop at the
beginning of this October, and we are now ready to distribute
the web document data set for the dry run.

The total net size of the document data set is about 230 GB, and
it includes about 20 million web pages crawled from the *.jp domain
in 2004.

The data set is available only to participants in the task. If
you are interested in it, please visit the following web page.
We will continue to accept new participants for some time.


Best regards,

NTCIR-5 WEB Task organizers