Research on Speaker Recognition Algorithms Based on Deep Learning Models

Abstract
Acknowledgements
1 Introduction
1.1 Background
1.2 Speaker Recognition
1.3 Fundamentals of Speaker Recognition
1.4 Research Questions
1.5 Contribution of the Thesis
1.6 Thesis Structure
2 Literature Review
2.1 Introduction
2.2 Deep Learning
2.2.1 Shallow vs Deep Architecture: Why do we need Deep Architecture?
2.2.2 Approach towards Deep Learning
2.3 Deep Belief Network
2.4 Introduction to Speech Features
2.4.1 Speech Features Categorization
2.5 Mel Frequency Cepstral Coefficients
2.5.1 Major steps
2.5.2 Explanation
2.5.3 Delta and Delta-Delta coefficients
2.6 Support Vector Machines
2.6.1 SVM as large-margin boundary classifier
3 Deep Hybrid Features for Speaker Recognition
3.1 Introduction
3.1.1 Restricted Boltzmann Machine
3.1.2 Contrastive Divergence Algorithm
3.1.3 Learning Audio Data with RBM
3.2 Convolutional Deep Belief Networks for Speaker Identification
3.3 Deep Hybrid Features (DHyF)
3.3.1 Previous Work
3.3.2 Speaker Recognition Pipeline
3.3.3 Features Learning
3.3.4 Bag of Words Analogy
3.3.5 Classification
3.4 Experiment and Results
3.5 Conclusion
4 Convolutional Data for Deep Audio Learning
4.1 Introduction
4.2 Convolutional Data
4.3 Proposed Approach
4.4 Initial Experimentation
4.5 Future Direction on Convolutional Data
5 The Supervector and i-vector Paradigms for Speaker Recognition
5.1 Introduction
5.2 Supervectors
5.3 i-vectors
5.4 NIST i-vector challenge
5.5 Baseline: Cosine Distance Scoring
5.6 Performance Metric
5.7 Late Fusion Approach
5.8 Results
5.9 Conclusion
6 Automatic Speech Recognition of Urdu
6.1 Introduction
6.2 Background
6.3 Previous Work on Urdu ASR
6.4 Methodology
6.4.1 Mel Frequency Cepstral Coefficients
6.4.2 Classification Techniques
6.4.3 Linear Discriminant Analysis
6.5 Experimental Setup
6.5.1 Dataset
6.5.2 Confusion Matrix
6.5.3 Comparison with DWT features
6.6 Conclusion
7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
A Useful Resources
A.1 Software Tools
A.2 Useful Links
B Algorithms
B.1 MFCC Calculation
B.2 Contrastive Divergence
C Author's Publications (first author/co-author)
Bibliography
Author's Résumé and Research Achievements During Study
Thesis Data Sheet