Acknowledgements | pp. 4-5 |
Abstract | pp. 5-6 |
Chapter 1. Introduction | pp. 9-15 |
1.1 Introduction to the topic and its relevance | pp. 9-10 |
1.2 Goals, objectives and structure of the thesis | pp. 10-11 |
1.3 Basic concepts of clustering | pp. 11-15 |
Chapter 2. Dimensionality reduction using the fractal dimension | pp. 15-60 |
2.1 Application of fractal theory to data sets | pp. 15-31 |
2.1.1 General information about fractals | pp. 15-20 |
2.1.2 Hausdorff distance | pp. 20-21 |
2.1.3 Mathematical foundations of fractal compression | pp. 21-23 |
2.1.4 Methods of fractal data mining | pp. 23-25 |
2.1.5 Understanding a table as a fractal | pp. 25-29 |
2.1.6 Dimensionality reduction | pp. 29-31 |
2.2 Design of the algorithm | pp. 31-60 |
2.2.1 Overview of the domain search algorithm | pp. 31-34 |
2.2.2 Creating a list of possible domain structures | pp. 34-35 |
2.2.3 Calculating the number of different domain values for each set | pp. 35-36 |
2.2.4 Creating a list of all possible domain structures | pp. 36-40 |
2.2.4.1 Searching for the optimal set of domains and analysis of results | pp. 36-38 |
2.2.4.2 Brute-force algorithm for searching domains | pp. 38-40 |
2.2.5 Algorithm for finding the optimal number of domains and the number of distinct values per domain, with analysis of the results | pp. 40-45 |
2.2.6 Improved algorithm using MapReduce | pp. 45-60 |
2.2.6.1 Why we use Hadoop MapReduce | pp. 45-48 |
2.2.6.2 The problem of counting unique values in big datasets | pp. 48-51 |
2.2.6.3 Designing the calculation algorithm using MapReduce | pp. 51-60 |
Chapter 3. Clustering of the obtained domains | pp. 60-91 |
3.1 Clustering of the obtained domains | pp. 60-61 |
3.2 Review of clustering algorithms | pp. 61-67 |
3.3 The grid-based clustering algorithm Halite | pp. 67-74 |
3.4 Improving the Halite algorithm | pp. 74-91 |
3.4.1 Improved MDL model for determining the relevance of cluster axes | pp. 74-81 |
3.4.2 Application of the Laplacian filter and its behavior at the boundaries and corners of the space | pp. 81-91 |
Chapter 4. Experiments and conclusion | pp. 91-104 |
4.1 Review of test cases | pp. 91-96 |
4.2 The results of the experiments | pp. 96-102 |
4.3 Conclusion | pp. 102-104 |
References | pp. 104-105 |