Summary of Computer doctoral thesis: Mining weighted sequential patterns in sequence database
Pages: 25
File type: pdf
Size: 1.04 MB
Document information:
The objective of the thesis is to propose a solution to mine the weighted sequential patterns between sequences in sequence databases with time interval and quantitative sequence databases with time interval.
Content extracted from the document:
MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

TRAN HUY DUONG

MINING WEIGHTED SEQUENTIAL PATTERNS IN SEQUENCE DATABASE

Major: Information System
Major code: 62 48 01 04

SUMMARY OF COMPUTER DOCTORAL THESIS

Ha Noi – 2021

The thesis has been completed at: Graduate University of Science and Technology, Vietnam Academy of Science and Technology

Supervisor 1: Dr. Nguyen Truong Thang
Supervisor 2: Prof. Dr. Vu Duc Thi
Reviewer 1: …
Reviewer 2: …
Reviewer 3: …

The thesis shall be defended in front of the Thesis Committee at the Vietnam Academy of Science and Technology, Graduate University of Science and Technology, at ….… hour……, date…… month….… year 2021.

This thesis can be found at:
- The National Library of Vietnam
- The Library of Graduate University of Science and Technology

INTRODUCTION

1. Overview

Mining frequent sequential patterns is an important problem in data mining and has been studied by many researchers. Sequential pattern mining has many real-life applications because data is encoded as sequences in many fields, such as bioinformatics, e-learning, market basket analysis, text analysis, and webpage click-stream analysis. This field of research emerged in the 1990s with the seminal paper of Agrawal and Srikant [2]. Conventional sequential pattern mining usually does not take additional information into account. However, sequence data can be extended in many ways, for example with information about the weights of the items in the sequence, the quantities of the items in the sequence, or the time interval. For sequence databases with time intervals, the time interval between occurrences in the data series makes it possible to analyze after how long the sequential patterns will appear.
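The three kinds of additional information mentioned above (item weights, item quantities, and time intervals) can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the thesis's data model: the `Element` class, the `WEIGHTS` table, the toy `SEQUENCE`, and both helper functions are hypothetical names introduced here, and the quantity-times-weight sum follows a common convention in high utility mining rather than a definition taken from the thesis.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Element:
    """One itemset of a sequence (hypothetical structure, not from the thesis)."""
    time: int               # timestamp of the itemset, in arbitrary units
    items: Dict[str, int]   # item -> quantity observed at this timestamp


# Hypothetical weights expressing how important each item is.
WEIGHTS: Dict[str, float] = {"a": 0.9, "b": 0.4, "c": 0.7}

# One toy sequence with timestamps 0, 3 and 10.
SEQUENCE: List[Element] = [
    Element(0, {"a": 2}),
    Element(3, {"b": 1, "c": 4}),
    Element(10, {"a": 1}),
]


def time_intervals(sequence: List[Element]) -> List[int]:
    """Return the time gaps between consecutive itemsets."""
    return [later.time - earlier.time
            for earlier, later in zip(sequence, sequence[1:])]


def weighted_quantity(sequence: List[Element], weights: Dict[str, float]) -> float:
    """Sum quantity * weight over every item occurrence (assumed convention)."""
    return sum(quantity * weights[item]
               for element in sequence
               for item, quantity in element.items.items())


if __name__ == "__main__":
    print(time_intervals(SEQUENCE))
    print(weighted_quantity(SEQUENCE, WEIGHTS))
```

A plain sequence database would keep only the item names and their order; the timestamp, quantity, and weight fields are exactly the extra information that such a representation discards.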
Studies so far have focused on detecting time-interval sequential patterns that occur between components in sequence databases with time intervals, where the time interval is a well-defined numerical value. Classical sequential pattern mining algorithms consider neither the importance of each data item in the sequence (its weight) nor the quantity of each data item in the sequence (its quantitative value). In practice, however, each data series has a different importance, which includes both the internal (quantitative) and the external (weighted) value of the data series in the database. To date, few studies have addressed weighted sequential pattern mining in sequence databases with time intervals, which considers both the weight of each item in the data series and the time interval between the sequences. Likewise, few studies have addressed high utility sequential pattern mining in quantitative sequence databases with time intervals, which considers the weight of each item in the data series, the quantitative value of the items, and the time interval between the sequences. This is the motivation for the thesis "Mining Weighted Sequential Patterns in Sequence Database". The thesis formulates and solves two problems: weighted frequent sequential pattern mining in sequence databases with time intervals, and high utility sequential pattern mining in quantitative sequence databases with time intervals.

2. Research aim and area

The objective of the thesis is to propose a solution for mining weighted sequential patterns between sequences in sequence databases with time intervals and in quantitative sequence databases with time intervals. The thesis focuses on proposing solutions to:
• Mining weighted sequential patterns in sequence databases with time intervals. The sequential patterns found are called weighted sequential patterns with time interval.
• Mining weighted sequential patterns in quantitative sequence databases with time intervals. The sequential patterns found are called high utility sequential patterns with time interval.

I focus on researching and proposing new algorithms for mining frequent sequential patterns, demonstrating their correctness and completeness, analyzing their computational complexity, and testing and analyzing the significance of the mined frequent sequential patterns.

3. Research methodology

The thesis studies sequences, the weights of the items in a sequence, the time intervals between sequences, sequence databases with time intervals, and quantitative sequence databases with time intervals. It also covers current studies, algorithms, and methods for pattern mining on sequence databases and quantitative sequence databases that involve time distance, weight, and high utility.

4. New contributions of the thesis

The main contributions of the thesis are to propose and solve the following issues:
• An algorithm for mining top-k frequent sequential patterns that takes into account the weights of items and the time intervals in sequence databases with time intervals. The results of this work are published in [CT1].
• Two algorithms for mining high utility sequential patterns that take into account the weights of items, the quantitative value of each item, and the time intervals in quantitative sequence databases with time intervals. The results of this work are published in [CT2], [CT3], [CT4], [CT5].

5. The thesis layout

The thesis consists of an introduction, three content chapters, and a conclusion:
• Introduction: presents an overview of the thesis; research objectives, objects and scope, research methods, the main contributio ...