Research and Implementation of Intelligent Information Retrieval Based on Classification

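1. Research and Implementation of Intelligent Information Retrieval Based on Classification	pp. 1-38
Chapter One Introduction	pp. 7-9
·Research background	p. 7
·Research approach and main work of this thesis	pp. 7-8
·Organization of the thesis	pp. 8-9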
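Chapter Two Web Page Preprocessing and Chinese Word Segmentation	pp. 9-14
·Web page preprocessing	pp. 9-10
·Chinese word segmentation	pp. 10-14
·Overview of Chinese word segmentation	p. 10
·Chinese character encoding	p. 10
·Building the segmentation dictionary	pp. 10-11
·Dictionary lookup	pp. 11-12
·Description of the segmentation process	p. 12
·Recognition of out-of-vocabulary words	pp. 12-13
·Analysis of space utilization and lookup time complexity	pp. 13-14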
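Chapter Three Feature Extraction	pp. 14-18
·Purpose of feature selection	p. 14
·Introduction to and evaluation of common feature extraction methods	pp. 14-16
·Document frequency	pp. 14-15
·Mutual information	p. 15
·Information gain	p. 15
·CHI statistic (chi-square goodness-of-fit test)	pp. 15-16
·Feature extraction method used in this thesis	pp. 16-17
·Description of the feature extraction algorithm	pp. 17-18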
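Chapter Four Chinese Web Page Classification	pp. 18-23
·Definition of a class	p. 18
·Overview of web page classification	pp. 18-19
·Introduction to KNN-based document classification	pp. 19-20
·Machine representation of documents	pp. 20-21
·Traditional feature weighting methods	p. 20
·Feature weighting method used in this thesis	pp. 20-21
·Description of the classification algorithm	pp. 21-23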
Chapter Five Indexing and Search	pp. 23-25
·Inverted files	p. 23
·Searching inverted files	pp. 23-24
·Search method used in this thesis	pp. 24-25
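Chapter Six Overall System Design and Experiments	pp. 25-30
·Overall system structure diagram	pp. 25-26
·Introduction to the functional modules	pp. 26-27
·Chinese web page data set	pp. 27-28
·Results and evaluation of the Chinese web page classification experiments	pp. 28-29
·Discussion	pp. 29-30
Chapter Seven Future Work	pp. 30-31
Acknowledgements	pp. 31-32
References	pp. 32-38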
2. Research and Implementation of Intelligent Information Retrieval Based on Classification	pp. 38-75
Abstract	pp. 38-40
Chapter One Preface	pp. 40-43
·Background of the study	pp. 40-41
·Research approach and main work of this paper	pp. 41-42
·Organization of the paper	pp. 42-43
Chapter Two Web Page Preprocessing and Chinese Word Segmentation	pp. 43-49
·Web page preprocessing	pp. 43-44
·Chinese word segmentation	pp. 44-49
·Overview of Chinese word segmentation	p. 44
·Chinese character encoding	pp. 44-45
·Construction of the segmentation lexicon	pp. 45-46
·Dictionary lookup	p. 46
·Description of the segmentation process	pp. 46-47
·Recognition of out-of-vocabulary words	pp. 47-48
·Space utilization and lookup time complexity analysis	pp. 48-49
Chapter Three Feature Selection	pp. 49-54
·Purpose of feature selection	pp. 49-50
·Introduction to and evaluation of common feature selection methods	pp. 50-51
·Document frequency	p. 50
·Mutual information	pp. 50-51
·Information gain	p. 51
·The χ² statistic	p. 51
·Feature extraction method used in this paper	pp. 51-52
·Description of the feature selection algorithm	pp. 52-54
Chapter Four Chinese Web Page Classification	pp. 54-60
·Definition of a class	p. 54
·Overview of document classification	pp. 54-55
·Introduction to KNN-based document classification	pp. 55-56
·Machine representation of web pages	pp. 56-58
·Traditional feature weighting methods	pp. 56-57
·Feature weighting method adopted in this system	pp. 57-58
·Description of the classification algorithm	pp. 58-60
Chapter Five Indexing and Retrieval	pp. 60-62
·Inverted files	pp. 60-61
·Searching inverted files	p. 61
·Search methods of the system	pp. 61-62
Chapter Six Overall Design and Experiments	pp. 62-67
·Overall system structure diagram	pp. 62-63
·Introduction to the functional modules	pp. 63-64
·Chinese web page training sets	pp. 64-65
·Results and evaluation of Chinese web page classification	pp. 65-66
·Discussion	pp. 66-67
Chapter Seven Future Work	pp. 67-68
References	pp. 68-75
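3. A Survey of Key Technologies for Text Search Engines	pp. 75-115
Chapter One Overview of Text Search Engines	pp. 75-78
·Preface	p. 75
·A brief history of search engines	p. 75
·Common search engines	pp. 75-78
·Directory search engines	pp. 75-76
·Full-text search engines	p. 76
·Meta-search engines	pp. 76-77
·Summary	pp. 77-78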
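Chapter Two Web Spiders	pp. 78-83
·Preface	p. 78
·Common search strategies	pp. 78-80
·IP-address-based search strategy	p. 78
·Depth-first search strategy	p. 78
·Breadth-first search strategy	pp. 78-79
·Search strategy based on content evaluation	p. 79
·Search strategy based on evaluation of future reward value	pp. 79-80
·Search strategy based on reinforcement learning	p. 80
·Issues to consider in crawler design	pp. 80-81
·Websites and web crawlers	pp. 81-82
·Summary	pp. 82-83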
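Chapter Three Chinese Word Segmentation	pp. 83-89
·Preface	p. 83
·Current state of research on Chinese word segmentation	pp. 83-86
·String-matching-based methods	pp. 83-84
·Understanding-based segmentation methods	p. 84
·Statistics-based segmentation methods	pp. 84-85
·Other methods	pp. 85-86
·Evaluation criteria for segmentation methods	pp. 86-87
·Difficulties in word segmentation	pp. 87-89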
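Chapter Four Feature Selection	pp. 89-91
·Preface	p. 89
·Common feature selection methods	pp. 89-91
·Document frequency	p. 89
·Information gain	pp. 89-90
·CHI statistic	p. 90
·Mutual information	pp. 90-91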
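Chapter Five Classification and Clustering	pp. 91-97
·Preface	p. 91
·Definition of a class	pp. 91-92
·Classification algorithms	pp. 92-94
·Simple vector distance classification	p. 92
·Bayesian algorithm	pp. 92-93
·KNN algorithm	pp. 93-94
·Voting-based methods	p. 94
·Clustering	pp. 94-97
·Overview of clustering	p. 94
·Common clustering methods	pp. 94-97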
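Chapter Six Indexing	pp. 97-100
·Preface	p. 97
·Key techniques in indexing	pp. 97-100
·Lexical analysis of text	pp. 97-98
·Selection of index terms	p. 98
·Dictionary	pp. 98-99
·Inverted files	pp. 99-100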
Chapter Seven Retrieval Techniques	pp. 100-103
·Boolean logic model	p. 100
·Fuzzy logic model	p. 100
·Vector space model	pp. 100-101
·Probabilistic retrieval model	pp. 101-103
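Chapter Eight Ranking of Search Results	pp. 103-106
·Preface	p. 103
·Ranking algorithm weighted by term frequency and position	pp. 103-104
·PageRank ranking method	p. 104
·HillTop ranking method	pp. 104-106
Chapter Nine Conclusion	pp. 106-108
·Summary of contents	p. 106
·Outlook for search engine technology	pp. 106-108
References	pp. 108-115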
4. A Survey of Key Technologies for Text Search Engines	pp. 115-164
Chapter One A Survey of Text Search Engines	pp. 118-123
§1.1 Preface	p. 118
§1.2 History of search engines	pp. 118-119
§1.3 Common search engines	pp. 119-123
§1.3.1 Directory search engines	pp. 119-120
§1.3.2 Full-text search engines	p. 120
§1.3.3 Meta-search engines	pp. 120-121
§1.3.4 Evaluation of the kinds of search engines	pp. 121-123
Chapter Two Web Crawlers	pp. 123-128
§2.1 Preface	p. 123
§2.2 Search strategies of web crawlers	pp. 123-124
§2.2.1 Search strategy based on IP address	p. 123
§2.2.2 Depth-first search strategy	pp. 123-124
§2.2.3 Breadth-first search strategy	p. 124
§2.3 Issues to consider in the design of a web crawler	pp. 124-126
§2.4 Websites and web crawlers	pp. 126-127
§2.5 Brief summary	pp. 127-128
Chapter Three Chinese Word Segmentation	pp. 128-136
§3.1 Preface	pp. 128-129
§3.2 Current state of Chinese segmentation methods	pp. 129-133
§3.2.1 String-matching-based segmentation methods	pp. 129-130
§3.2.2 Understanding-based segmentation methods	p. 130
§3.2.3 Statistics-based segmentation methods	pp. 130-131
§3.2.4 Other segmentation methods	pp. 131-133
§3.3 Evaluation of segmentation methods	pp. 133-134
§3.4 Difficulties in word segmentation	pp. 134-136
Chapter Four Feature Selection	pp. 136-139
§4.1 Preface	p. 136
§4.2 Common feature extraction methods	pp. 136-139
§4.2.1 Document frequency	pp. 136-137
§4.2.2 Information gain	p. 137
§4.2.3 The χ² statistic	pp. 137-138
§4.2.4 Mutual information	pp. 138-139
Chapter Five Classification and Clustering	pp. 139-146
§5.1 Preface	p. 139
§5.2 Definition of a class	pp. 139-140
§5.3 Methods of classification	pp. 140-143
§5.3.1 Simple vector distance classification	pp. 140-141
§5.3.2 Bayesian classification	p. 141
§5.3.3 KNN algorithm	pp. 141-142
§5.3.4 Voting-based method	pp. 142-143
§5.4 Clustering	pp. 143-146
§5.4.1 Overview of clustering	p. 143
§5.4.2 Clustering procedure	pp. 143-146
Chapter Six Indexing and Search	pp. 146-150
§6.1 Significance of indexing	p. 146
§6.2 Core techniques in indexing	pp. 146-150
§6.2.1 Lexical analysis of text	pp. 146-147
§6.2.2 Choice of index terms	pp. 147-148
§6.2.3 Lexicon	p. 148
§6.2.4 Inverted files	pp. 148-150
Chapter Seven Search Technology	pp. 150-153
§7.1 Boolean logic model	p. 150
§7.2 Fuzzy logic model	p. 150
§7.3 Vector space model	pp. 150-152
§7.4 Probabilistic retrieval model	pp. 152-153
Chapter Eight Ranking of Search Results	pp. 153-155
§8.1 Preface	p. 153
§8.2 Ranking method based on term frequency weighting	pp. 153-154
§8.3 PageRank ranking method	pp. 154-155
§8.4 Hilltop ranking method	p. 155
Chapter Nine Conclusion	pp. 155-158
·Summary	pp. 155-156
·Prospects of search engine technology	pp. 156-158
References	pp. 158-164