XPath-wrapper induction for data extraction

Tran N.-K. (College of Engineering and Technology, Vietnam National University, Hanoi, Viet Nam)
Ha Q.-T.
Pham K.-C. (Computer Science Department, University of Illinois, Urbana-Champaign, United States)

Proceedings - 2010 International Conference on Asian Language Processing, IALP 2010 (2010), pp. 150-153

DOI: 10.1109/IALP.2010.33

Indexed in: Scopus

Conference Paper

English

Keywords: Amount of information; Data extraction; Human being; Structured information; Template-based; User query; Wrapper induction; Natural language processing systems
Abstract
The Web contains an enormous amount of information formatted for human readers, which makes it difficult for computers to extract relevant content from various sources. This paper presents an XPath-wrapper induction algorithm that leverages user queries and template-based sites to extract structured information. Our experiments show an average accuracy of 94%. © 2010 IEEE.
