Jing Chen, Qing Li: Concept Hierarchy Construction by Combining Spectral Clustering and Subsumption Estimation. WISE 2006:199-209

With the rapid development of the Web, how to add structural guidance (in the form of concept hierarchies) to Web document navigation has become a hot research topic. In this paper, we present a method for the automatic acquisition of concept hierarchies. Given a set of concepts, each concept is regarded as a vertex in an undirected, weighted graph. The problem of concept hierarchy construction is then transformed into a modified graph partitioning problem and solved by spectral methods. Because the undirected graph cannot accurately depict the hyponymy information regarding the concepts, subsumption estimation is introduced to guide the spectral clustering algorithm. Experiments on real data show very encouraging results.
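
To make the graph formulation concrete, the following is a minimal sketch of plain spectral bipartitioning, not the modified partitioning or the subsumption-guided procedure of the paper. It assumes an unnormalized graph Laplacian and a split on the sign of the Fiedler vector; the function name `spectral_bipartition`, the concept names, and the similarity weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_bipartition(weights):
    """Split the vertices of an undirected weighted graph into two groups.

    Assumption: uses the unnormalized Laplacian L = D - W and the sign of
    the eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    """
    w = np.asarray(weights, dtype=float)
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = eigenvectors[:, 1]
    return fiedler >= 0


# Hypothetical concepts and pairwise similarities (symmetric, zero diagonal).
concepts = ["apple", "pear", "car", "truck"]
similarity = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]

labels = spectral_bipartition(similarity)
groups = sorted(
    [
        sorted(c for c, flag in zip(concepts, labels) if flag),
        sorted(c for c, flag in zip(concepts, labels) if not flag),
    ]
)
print(groups)
```

Applying such a split recursively to each group would yield a tree of clusters, but an undirected graph alone does not say which concept is the more general one; that is the gap the abstract assigns to subsumption estimation.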

Jing Chen, Zhigang Zhang, Qing Li, Xiaoming Li: A Pattern-Based Voting Approach for Concept Discovery on the Web. APWeb 2005:109-120

Automatically discovering concepts is not only a fundamental task in knowledge capturing and ontology engineering processes, but also a key step of many applications in information retrieval. For such a task, pattern-based and statistics-based approaches are widely used, and the former have turned out to be more precise. However, the effective patterns in such approaches are usually defined manually, which takes much time and human labor and covers only a limited set of effective patterns. In our research, we obtain patterns automatically through frequent sequence mining. A voting approach is then presented that can determine whether a sentence contains a concept and accurately identify it. Our algorithm includes three steps: pattern mining, pattern refining, and concept discovery. In our experimental study, we use several traditional measures (precision, recall, and F1 value) to evaluate the performance of our approach. The experimental results not only verify the validity of the approach, but also illustrate the relationship between performance and the parameters of the algorithm.
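
For reference, the three evaluation measures named above can be computed as in the sketch below. It assumes that discovered and reference concepts are compared as exact-match sets, which the abstract does not specify; the function name `precision_recall_f1` and the sample concept sets are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for two collections of concepts.

    Assumption: a predicted concept counts as correct only if it matches
    a gold concept exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical output of a concept discovery run and its gold standard.
predicted = {"ontology", "search engine", "the"}
gold = {"ontology", "search engine", "clustering", "graph"}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```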

Jing Chen, Qing Li, Ling Feng: Refining the Results of Automatic e-Textbook Construction by Clustering. ICWL 2005:311-319

The abundance of knowledge-rich information on the World Wide Web makes compiling an online e-textbook both possible and necessary. The authors of [7] proposed an approach to automatically generate an e-textbook by mining the ranking lists of the search engine. However, the performance of the approach was degraded by Web pages that were relevant but did not actually discuss the desired concept. In this paper, we extend the work in [7] by applying a clustering approach before the mining process. The clustering approach serves as a post-processing stage for the original results retrieved by the search engine, and aims to reach an optimum state in which all Web pages assigned to a concept discuss that exact concept.

Jing Chen, Qing Li, Weijia Jia: Automatically Generating an E-textbook on the Web. World Wide Web (WWW) 8(4):377-394 (2005)

Nowadays, people tend to learn from the Web because it is convenient and rich in free information. The main means of learning on the Web is submitting a query to a search engine and then browsing through the returned results to find relevant information. Although a search engine such as Google works quite well in many cases, the results returned are often not appropriate for learning purposes. In this paper, we present a novel approach to automatically generate an E-textbook for a user-specified topic hierarchy. Such a technology can ease the learning process to a great extent.