Reactive Scheduling of Online Analytical Processing on a Hadoop/HBase Cluster

Abstract	pp. 7-9
Abstract (Chinese)	pp. 10-17
1 INTRODUCTION	pp. 17-21
1.1 Online Analytical Processing: From Centralized System to Distributed System	pp. 17-18
1.1.1 Motivations	p. 17
1.1.2 Elements of explored solutions	pp. 17-18
1.2 Issues	pp. 18-19
1.2.1 Deploy multidimensional data over a cluster	p. 18
1.2.2 Query a warehouse based on an HBase cluster	pp. 18-19
1.3 Contributions	pp. 19-20
1.4 Structure of the thesis	pp. 20-21
2 STATE OF THE ART	pp. 21-34
2.1 Data Warehouse and OLAP	pp. 21-28
2.1.1 Foundations	p. 21
2.1.2 Multidimensional model	pp. 21-24
2.1.3 Functional architecture of an OLAP system	pp. 24-25
2.1.4 Storage models	pp. 25-28
2.2 Hadoop Ecosystem	pp. 28-32
2.2.1 Hadoop Framework	pp. 28-29
2.2.2 MapReduce	pp. 29-30
2.2.3 HDFS: The Hadoop Distributed File System	pp. 30-31
2.2.4 HBase	pp. 31-32
2.3 Data warehouse in a distributed environment	pp. 32-33
2.3.1 Fragmentation of Warehouse	pp. 32-33
2.3.2 Warehouse on a distributed database	p. 33
2.4 Conclusion	pp. 33-34
3 MULTIDIMENSIONAL DATA ON DISTRIBUTED STORAGE	pp. 34-44
3.1 Use Cases	pp. 34-35
3.2 Conceptual model for multidimensional data	pp. 35-38
3.2.1 Schema and Instance of Dimension	pp. 36-37
3.2.2 Facts and Aggregates	pp. 37-38
3.2.3 Local Instances of Dimension	p. 38
3.3 Identification of multidimensional data	pp. 38-40
3.3.1 Definition and identification of multidimensional chunks	pp. 38-39
3.3.2 Construction of chunk blocks	pp. 39-40
3.4 Multidimensional data indexing	pp. 40-43
3.4.1 Indexes on different aggregation levels	pp. 40-42
3.4.2 Indexes on chunk blocks	p. 42
3.4.3 CCB Index Operations	pp. 42-43
3.5 Conclusion	pp. 43-44
4 REACTIVE SCHEDULING POLICY	pp. 44-53
4.1 Presentation of query processing phases	pp. 44-45
4.2 Rewriting the client request	p. 45
4.3 Locating useful data for the query	pp. 45-46
4.4 Query Scheduling	pp. 46-47
4.5 Execution plan and optimization of execution	p. 47
4.6 Query execution and task scheduling	pp. 47-52
4.6.1 Our Scheduling Policy	pp. 48-50
4.6.2 Monitoring and updating the status of the execution	pp. 50-51
4.6.3 Assembly of the result	p. 51
4.6.4 Scheduling Implementation	pp. 51-52
4.7 Conclusion	pp. 52-53
5 PROTOTYPE AND EXPERIMENTATION	pp. 53-64
5.1 Prototype Architecture	pp. 53-57
5.1.1 Our data model based on HBase	pp. 53-55
5.1.2 Presentation of the scheduling engine services for distributed storage	pp. 55-57
5.2 Prototype implementation	pp. 57-60
5.2.1 Hadoop/HBase deployment	pp. 57-58
5.2.2 OLAP Client Interface	pp. 58-59
5.2.3 Experimental Infrastructure	pp. 59-60
5.3 Experiments	pp. 60-63
5.3.1 Test Scenario	p. 60
5.3.2 Stress Scenario	pp. 60-61
5.3.3 Results	pp. 61-63
5.4 Conclusion	pp. 63-64
6 CONCLUSIONS AND PERSPECTIVES	pp. 64-71
6.1 Evaluation and contributions	pp. 64-66
6.1.1 Identification and indexing of multidimensional data	pp. 64-65
6.1.2 Implementation and Query Optimization	pp. 65-66
6.1.3 Prototype of services	p. 66
6.2 Limitations and perspectives	pp. 66-71
6.2.1 Management and maintenance of the distributed data warehouse	pp. 66-67
6.2.2 Maintenance and adaptation of CCB Index structures according to changes in the distributed warehouse	pp. 67-68
6.2.3 Evolution and optimization of the query processing method	p. 68
6.2.4 Design and integration of methods by services architecture	pp. 68-71
References	pp. 71-75
PUBLICATION	pp. 75-76
Acknowledgments	p. 76
Dedication	p. 76