First International Workshop on SCIentific DOCument Analysis
(SCIDOCA 2016)
associated with JSAI International Symposia on AI 2016 (IsAI-2016)

November 15 - 16, 2016

Raiosha Building, Hiyoshi Campus, Keio University, Kanagawa, Japan

Aims and Scope

The recent proliferation of scientific papers and technical documents has become an obstacle to the efficient acquisition of new information in various fields. It is almost impossible for individual researchers to check and read all related documents. Even retrieving relevant documents is becoming harder and harder. This workshop gathers researchers and experts who work on scientific document analysis from various perspectives, and invites technical paper presentations and system demonstrations that cover any aspect of scientific document analysis.

Important Dates

Workshop: November 15 - 16, 2016

Submission Deadline: September 26, 2016
Notification: October 11, 2016
Camera-ready due: October 18, 2016

Registration

Please register for the workshop at the registration page of JSAI International Symposia on AI 2016.

Topics

Relevant topics include, but are not limited to, the following:

text analysis
document structure analysis
logical structure analysis
figure and table analysis
citation analysis of scientific and technical documents
scientific information assimilation
summarization and visualization
knowledge discovery/mining from scientific papers and data
similar document retrieval
entity and relation linking between documents and knowledge base
survey generation
resources for scientific document analysis
document understanding in general
NLP systems aiming for scientific documents including tagging, parsing, coreference, etc.

Invited Speakers

Jin-Dong Kim, Database Center for Life Science (DBCLS)

Randy Goebel, University of Alberta

Submissions

There are two types of submissions:

1. Full paper
We welcome and encourage the submission of high-quality, original papers that are not simultaneously submitted for publication elsewhere. The paper should not exceed 14 pages including figures, references, etc. Papers of this type will appear in the IsAI2016-SCIDOCA proceedings, so they are considered archival and are subject to post-proceedings selection.

2. Extended abstract
We also welcome extended abstracts, which describe work-in-progress contributions, small, focused contributions, etc. The paper should not exceed 4 pages including references. Accepted papers will be made available online at the workshop website, but the workshop proceedings can be considered non-archival. For this type, we also allow submissions of papers that are under review or have recently been published in a conference or a journal.

For both types, papers should be written in English, formatted according to the Springer Verlag LNCS style, which can be obtained from here, and submitted in PDF form. Anonymity is not required. If you use a Word file, please follow the formatting instructions, then convert the file to PDF and submit it at the paper submission page.

You can submit your paper at "https://easychair.org/conferences/?conf=scidoca2016". If you cannot submit your paper through the EasyChair system for any reason, please send an email to "ksatoh[at]nii.ac.jp".

If a paper is accepted, at least one author of the paper must register for the workshop and present it. Please register for the workshop at the registration page.

Post Proceedings

We are currently negotiating with Springer Verlag about publishing post-proceedings of selected papers in the LNAI series.

SCIDOCA2016 Program (November 15, 2016)

10:00-10:10: Opening

10:10-11:30: Information extraction (20*4)

Yoshinobu Kano. Table Data Extraction for Text Mining in Neuroscience Papers
Chien-Xuan Tran, Minh-Le Nguyen and Ken Satoh. A Study of Open Information Extraction from Legal Texts
Shuhei Kondo and Yuji Matsumoto. Automatic Annotation of Species-Names using KNApSAcK database
Akiko Aizawa, Takeshi Sagara, Goran Topic and Vitor Castro. Wikification of Technical Terms with Term Decomposition and Expansion (extended abstract)

11:30-13:30: Lunch

13:30-14:40: Invited talk 1

Jin-Dong Kim, Database Center for Life Science (DBCLS), Japan

Title: Toward Linked and Shared Resources of Scientific Literature Annotation
Abstract: As efficient access to the content of scientific literature is more and more desired, annotation to scientific literature is more and more recognized as an important resource, and there are now a number of large projects to develop annotations to large bodies of scientific literature, e.g. the whole PubMed or PubMed Central. While the community of text mining has invested a lot for the productivity of literature annotation, there is also a growing interest in exchange and storage for literature annotation. The talk will begin with an introduction to various efforts for scientific literature annotation, and discuss issues for shared scientific literature annotation, from three different aspects: productivity, interoperability, and accessibility.

14:40-15:00: Break

15:00-16:10: Information Retrieval 1 (20*2+30)

Takeshi Abekawa and Akiko Aizawa. SideNoter: Scholarly Paper Browsing System based on Text Annotation
Yuta Kobayashi, Hiroki Teranishi, Masashi Shimbo and Yuji Matsumoto. Learning scientific paper representations from text and citation graphs
Kimitaka Asatani, Ochi Masanao and Junichiro Mori. Detecting Research Trend of Academic Field in Latent Space

16:10-16:40: Break

16:40-18:00: Text analysis (20*4)

Mai Omura, Hiroyuki Shindo and Yuji Matsumoto. Information structure analysis of abstracts in multiple domains using word embeddings
Kazutaka Kinugawa and Yoshimasa Tsuruoka. Developing a Supervised Text Summarizer with Academic Papers in Biomedical Sciences
Aya Iwamoto, Hiroshi Noji, Hiroyuki Shindo and Yuji Matsumoto. Toward Coreference Resolution in Scientific Domain
Paul Reisert, Naoya Inoue, Naoaki Okazaki and Kentaro Inui. Towards Recognizing Logic in Argumentative Texts

19:00- Informal Workshop Dinner at:
Yuzen Tatsukichi
Address: Hiyoshi-honcho Kohoku-ku Yokohama-shi, Kanagawa, Japan
Tel: 045-563-6198

SCIDOCA2016 Program (November 16, 2016)

10:20-11:30: Invited talk 2

Randy Goebel, Alberta Machine Intelligence Institute, University of Alberta

Title: What is required to develop scalable semantics?
Abstract: The long term promise of Newell and Simon’s physical symbol systems hypothesis has historically relied on many decades of research on formalizing semantics in a manner that provided the basis for mechanical extraction and processing of semantic content. The application of this hypothesis has exploited a variety of semantical theories, from the earliest accounts of denotational semantics of formal philosophers and logicians like Frege, to the sophisticated representations of natural language in higher order intensional logics, like Montague, to the rapid proliferation of technologies like the resource description framework (RDF) as the basis for metadata components underlying the semantic web.
The current era of “big data” has created the challenge of being more deliberate about the development and deployment of technologies and methods to create semantics carrying content, especially when considering applications like general question answering, literature-based discovery, and general knowledge curation. A central focus of research on big data and natural language information extraction covers a broad spectrum of research methodologies, including the spectrum from fix-relation extraction to open relation extraction, deep parsing versus information retrieval-based question answering, and shallow versus deep formal representations of language to support reasoning and inference.
The idea of scalable semantics is to develop a kind of semantic data stack that can provide guidance in the development and deployment of semantic information extraction, appropriate to application goals, for example in the spectrum of sentiment analysis of Tweets all the way to deep question answering on full texts.
Here we present some of the challenges of the development of such a scalable semantics stack, by providing examples of both boundaries between layers, and the kind of integration of multiple levels that will be required for multi-scale semantics.

11:30-13:30: Lunch

13:30-14:30: Information Retrieval 2 (20*3)

Masaharu Yoshioka, Tao Zhu and Shinjiroh Hara. Multi-faceted figure retrieval system of research papers for nano-crystal device development researchers
Son Nguyen Truong, Nguyen Le Minh and Ken Satoh. Approaches for personalized Information Retrieval Systems in legal texts
Vu Tran, Minh Nguyen and Ken Satoh. Ranking Legal Text Pairs with Logic-Embedded Deep Neural Models

14:30-14:40: Closing Remarks

Workshop Co-Chairs

Yuji Matsumoto, Nara Institute of Science and Technology, Japan
Hiroshi Noji, Nara Institute of Science and Technology, Japan

Workshop Accounting Officer

Ken Satoh, National Institute of Informatics, Japan

Program Committee Members

Yuji Matsumoto, NAIST
Hiroyuki Shindo, NAIST
Ken Satoh, NII
Kentaro Inui, Tohoku University
Naoya Inoue, Tohoku University
Akiko Aizawa, NII
Yusuke Miyao, NII
Takeshi Abekawa, NII
Hidetsugu Nanba, Hiroshima City University
Yoshimasa Tsuruoka, University of Tokyo
Junichiro Mori, University of Tokyo
Yoshinobu Kano, Shizuoka University

For any inquiries concerning the workshop, please send an email to "noji[at]naist.ac.jp"

SCIDOCA 2016 home page http://research.nii.ac.jp/~ksatoh/scidoca2016
