
Summary of Computer doctoral thesis: Mining weighted sequential patterns in sequence database


Document information:

The objective of the thesis is to propose a solution to mine the weighted sequential patterns between sequences in sequence databases with time interval and quantitative sequence databases with time interval.
Content extracted from the document:
MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

TRAN HUY DUONG

MINING WEIGHTED SEQUENTIAL PATTERNS IN SEQUENCE DATABASE

Major: Information System
Major code: 62 48 01 04

SUMMARY OF COMPUTER DOCTORAL THESIS

Ha Noi – 2021

The thesis has been completed at: Graduate University of Science and Technology, Vietnam Academy of Science and Technology

Supervisor 1: Dr. Nguyen Truong Thang
Supervisor 2: Prof. Dr. Vu Duc Thi

Reviewer 1: …
Reviewer 2: …
Reviewer 3: …

The thesis shall be defended in front of the Thesis Committee at Vietnam Academy of Science and Technology - Graduate University of Science and Technology, at … hour …, date … month … year 2021

This thesis could be found at:
- The National Library of Vietnam
- The Library of Graduate University of Science and Technology

INTRODUCTION

1. Overview

Mining frequent sequential patterns is an important problem in data mining and has been studied by many researchers. Sequential pattern mining has many real-life applications, since data is encoded as sequences in fields such as bioinformatics, e-learning, market basket analysis, text analysis, and webpage click-stream analysis. This field of research emerged in the 1990s with the seminal paper of Agrawal and Srikant [2]. Conventional sequential pattern mining does not use any additional information. Sequence data can, however, be extended with many kinds of additional information, such as the weights of the items in a sequence, the quantities of the items in a sequence, and the time intervals between them. For sequence databases with time intervals, the recorded time intervals make it possible to analyze how long it takes for the subsequent parts of a sequential pattern to appear.
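As a minimal sketch of the kinds of additional information mentioned above (item weights, item quantities, and time intervals), the following Python snippet represents one quantitative sequence with time stamps. All names and values here (`WEIGHTS`, `sequence`, `time_intervals`, `sequence_utility`) are hypothetical and chosen only for illustration. The utility measure, quantity multiplied by weight, is an assumption borrowed from common practice in high utility mining and is not necessarily the formula used in the thesis.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: the names and values are illustrative, not from the thesis.
# External weight (importance) of each item.
WEIGHTS: Dict[str, float] = {"a": 0.5, "b": 1.0, "c": 2.0}

# One quantitative sequence with time stamps: (time, {item: quantity}).
Element = Tuple[int, Dict[str, int]]
sequence: List[Element] = [
    (0, {"a": 2}),
    (3, {"b": 1, "c": 1}),
    (10, {"a": 1}),
]


def time_intervals(seq: List[Element]) -> List[int]:
    """Time elapsed between consecutive elements of the sequence."""
    return [later[0] - earlier[0] for earlier, later in zip(seq, seq[1:])]


def sequence_utility(seq: List[Element], weights: Dict[str, float]) -> float:
    """Sum of quantity * weight over all item occurrences (assumed definition)."""
    return sum(
        quantity * weights[item]
        for _, items in seq
        for item, quantity in items.items()
    )


print(time_intervals(sequence))             # [3, 7]
print(sequence_utility(sequence, WEIGHTS))  # 4.5
```

In this toy model, a plain sequence database would keep only the order of the items, while the time stamps, quantities, and weights are the extra information that weighted and high utility mining can exploit.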
Studies so far have focused on detecting sequential patterns with time intervals between the components of time-interval sequence databases, where the time interval is a well-defined numerical value. Classical sequential pattern mining algorithms consider neither the importance of each data item in the sequence (its weight) nor the quantity of each data item in the sequence (its quantitative value). In practice, however, each data series has a different importance, which includes both its internal (quantitative) value and its external (weighted) value in the database. Up to now, there have not been many studies on weighted sequential pattern mining in time-interval sequence databases that consider both the weight of each item in the data series and the time interval between the sequences. Likewise, there have not been many studies on high utility sequential pattern mining in quantitative sequence databases with time intervals that consider the weight of each item in the data series, the quantitative value of the items, and the time interval between the sequences. That is the reason for proposing the thesis "Mining Weighted Sequential Patterns in Sequence Database". The thesis proposes and solves the problems of weighted frequent sequential pattern mining in sequence databases with time intervals and high utility sequential pattern mining in quantitative sequence databases with time intervals.

2. Research aim and area

The objective of the thesis is to propose a solution to mine weighted sequential patterns between sequences in sequence databases with time intervals and in quantitative sequence databases with time intervals. The thesis focuses on proposing solutions to:

• Mining weighted sequential patterns in sequence databases with time intervals. The sequential patterns found are called weighted sequential patterns with time intervals.
• Mining weighted sequential patterns in quantitative sequence databases with time intervals. The sequential patterns found are called high utility sequential patterns with time intervals.

I focus on researching and proposing new algorithms for mining frequent sequential patterns, demonstrating their correctness and completeness, analyzing their computational complexity, and testing and analyzing the significance of the mined frequent sequential patterns.

3. Research methodology

The thesis studies sequences, the weights of the items in a sequence, the time intervals between sequences, sequence databases with time intervals, and quantitative sequence databases with time intervals. It also studies current work, algorithms, and methods for pattern mining on sequence databases and quantitative sequence databases that involve time distance, weight, and high utility.

4. New contributions of the thesis

The main contributions of the thesis are to propose and solve the following issues:

• Propose an algorithm for mining top-k frequent sequential patterns that takes into account the weights of items and the time intervals in sequence databases with time intervals. The results of this work are published in [CT1].
• Propose two algorithms for mining high utility sequential patterns that take into account the weights of items, the quantitative value of each item, and the time intervals in quantitative sequence databases with time intervals. The results of this work are published in [CT2], [CT3], [CT4], [CT5].

5. The thesis layout

The thesis consists of an introduction, three content chapters, and a conclusion:

• Introduction: Presents an overview of the thesis; research objectives, objects and scope, research methods, the main contributio ...
