Research on Efficient Document Clustering Based on a Simple Sub-document Framework

Abstract (in Chinese)
Abstract
Chapter 1 Introduction
    1.1 Research Background and Importance
    1.2 Related Work in Multi-topic Document Clustering
    1.3 Related Work in Text Segmentation
    1.4 Research Objectives
    1.5 Research Contributions
    1.6 Structure of Thesis
Chapter 2 Related Theories and Techniques
    2.1 Text Mining
    2.2 Document Segmentation
        2.2.1 Document Segmentation Methods
        2.2.2 Topic Modelling Techniques in Text Segmentation
    2.3 Document Clustering
        2.3.1 Hierarchical and Partitional Clustering
    2.4 Related Work
        2.4.1 Existing Approach in Document Clustering
        2.4.2 Existing Approach in Document Segmentation
            2.4.2.1 Linear Text Segmentation
            2.4.2.2 Hierarchical Text Segmentation
        2.4.3 Cluster Matching using Query Processing
    2.5 Summary
Chapter 3 Document Representation, Notations and Implementation
    3.1 Notations
    3.2 Proposed Sub-document based Framework
    3.3 Identification of sub-documents
        3.3.1 TextTiling
        3.3.2 LDA based Document Segmentation
            3.3.2.1 Labelling the LDA
    3.4 Document and Sub-document Representation
        3.4.1 Sub-document Based Representation
        3.4.2 Sub-Document Set based Representation
        3.4.3 Document based Representation
        3.4.4 Vector Space Model based Clustering
    3.5 Dataset in Experiment 1
    3.6 Data Preprocessing
    3.7 Summary
Chapter 4 Evaluation and Analysis
    4.1 Evaluation Metrics in Document Segmentation
    4.2 Evaluation Metrics in Document Clustering
    4.3 Document Segmentation Evaluation Model
    4.4 Document Clustering Model Selection
    4.5 Parameters Values for Overlapping and Disjoint Clustering
    4.6 Summary
Chapter 5 Efficient Query Processing for Cluster Matching
    5.1 Database Used in Cluster Matching
    5.2 Query Optimization
    5.3 Cost-effectiveness based Optimization
    5.4 Query Processing in Cluster Matching
    5.5 Automatic and Manual Query Processing
    5.6 Experimental Results of Query Processing
    5.7 Comparison Analysis of Queries Using Different Databases
    5.8 Summary
Chapter 6 Experiment and Analysis
    6.1 Parameters Settings in Identification of Sub-documents
    6.2 Performance Comparison between LDA and TextTiling in Experiment 1
    6.3 Sub-document Cross Clustering
    6.4 Sub-document Set Cross Clustering
    6.5 Sub-document based Framework with Different Clustering Methods
    6.6 Sub-Document Set Clustering in Comparison to Traditional Clustering
    6.7 Performance Evaluation of Sub-document Based Framework
    6.8 Experiment 2: Results and Discussions
        6.8.1 Dataset in Experiment 2
        6.8.2 Document Segmentation
        6.8.3 Performance of Bisecting LDA using Sub-document based Framework
    6.9 Summary
Chapter 7 Conclusions
    7.1 Summary of Research
    7.2 Future Research and Recommendations
List of Research Publications during the Doctoral Research
References
Acknowledgement

Total length of the thesis: 111 pages.