

Unsupervised segmentation of bibliographic elements with latent permutations

File: LNCS6724_254.pdf (full-text file)
Size: 252.33 kB
Format: Adobe PDF

Title: Unsupervised segmentation of bibliographic elements with latent permutations
Authors: Masada, Tomonari / Shibata, Yuichiro / Oguri, Kiyoshi
Issue Date: 2011
Publisher: Springer Verlag
Citation: Lecture Notes in Computer Science, 6724, pp.254-267; 2011
Abstract: This paper introduces a novel approach for large-scale unsupervised segmentation of bibliographic elements. The problem is to segment a word token sequence representing a citation into subsequences, each corresponding to a different bibliographic element, e.g. authors, paper title, journal name, or publication year. Each bibliographic element should be represented by contiguous word tokens; we call this the contiguity constraint. We should therefore infer a sequence of assignments of word tokens to bibliographic elements that satisfies this constraint. Many HMM-based methods solve this problem by prescribing fixed transition patterns among bibliographic elements. In this paper, we use generalized Mallows models (GMM) in a Bayesian multi-topic model, an approach effectively applied to document structure learning by Chen et al. [4], and infer a permutation of latent topics, each of which can be interpreted as one of the bibliographic elements. According to the inferred permutation, we arrange the order of the draws from a multinomial distribution defined over topics. In this manner, we obtain an ordered sequence of topic assignments satisfying the contiguity constraint. We do not need to prescribe any transition patterns among bibliographic elements; we only need to specify the number of bibliographic elements. However, the method proposed by Chen et al. works for our problem only after modification. The main contribution of this paper is to propose strategies that make their method work for our problem as well.
Description: Workshops on Web Information Systems Engineering, WISE 2010: 1st International Symposium on Web Intelligent Systems and Services, WISS 2010, 2nd International Workshop on Mobile Business Collaboration, MBC 2010 and 1st Int. Workshop on CISE 2010; Hong Kong; 12 December 2010 through 14 December 2010; Code 86805
URI: http://hdl.handle.net/10069/26163
ISSN: 0302-9743
Rights: © 2011 Springer-Verlag. / The original publication is available at www.springerlink.com
Type: Conference Paper
Text Version: author
Appears in Collections: 060 Conference Presentations

Citation URI: http://hdl.handle.net/10069/26163
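
The generative idea described in the abstract (draw topics from a multinomial, then arrange the draws according to a permutation so that each topic occupies one contiguous run) can be illustrated with a minimal sketch. This is not the authors' implementation and it shows only the forward, generative direction, not the inference of the permutation. The function names, the dispersion parameters `rho`, and the topic weights `theta` are hypothetical choices made for illustration; the permutation is sampled through the inversion-count representation commonly used for generalized Mallows models.

```python
import math
import random


def sample_inversion_vector(rho, rng):
    """Sample GMM inversion counts (hypothetical helper).

    rho holds K-1 dispersion parameters. Entry v[j] takes a value in
    {0, ..., K-1-j} with probability proportional to exp(-rho[j] * v[j]).
    """
    K = len(rho) + 1
    v = []
    for j, r in enumerate(rho):
        max_v = K - 1 - j
        weights = [math.exp(-r * k) for k in range(max_v + 1)]
        v.append(rng.choices(range(max_v + 1), weights=weights)[0])
    return v


def inversions_to_permutation(v):
    """Turn inversion counts into a permutation of 0..K-1.

    v[j] is the number of items larger than j that precede j.
    Items are inserted from largest to smallest, so inserting j at
    index v[j] places exactly v[j] larger items before it.
    """
    K = len(v) + 1
    perm = [K - 1]
    for j in range(K - 2, -1, -1):
        perm.insert(v[j], j)
    return perm


def ordered_topic_assignments(n_tokens, theta, rho, rng):
    """Draw topics for n_tokens tokens and arrange them contiguously.

    theta: multinomial weights over K topics (assumed given here).
    Returns the sampled permutation and the ordered assignment sequence.
    """
    K = len(theta)
    draws = rng.choices(range(K), weights=theta, k=n_tokens)
    counts = [draws.count(k) for k in range(K)]
    perm = inversions_to_permutation(sample_inversion_vector(rho, rng))
    # Emit all tokens of each topic together, in permutation order.
    z = [k for k in perm for _ in range(counts[k])]
    return perm, z


if __name__ == "__main__":
    rng = random.Random(0)
    perm, z = ordered_topic_assignments(12, [0.4, 0.3, 0.2, 0.1], [1.0, 1.0, 1.0], rng)
    print("permutation:", perm)
    print("assignments:", z)
```

Because every topic's tokens are emitted in a single run, the resulting sequence satisfies the contiguity constraint by construction, and only the number of topics has to be fixed in advance.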


