
Double Attention Mechanism for Sentence Embedding

Abstract (p. 3)
摘要 (Abstract in Chinese) (pp. 4-8)
1 Introduction (pp. 8-12)
    1.1 Natural Language Processing Overview (pp. 8-10)
    1.2 Motivation (pp. 10-11)
    1.3 Goal and Contribution (pp. 11-12)
2 Background (pp. 12-28)
    2.1 Neural Networks (pp. 12-17)
        2.1.1 Definition (p. 12)
        2.1.2 A Single Neuron (pp. 12-14)
        2.1.3 Feedforward Neural Network (pp. 14-16)
        2.1.4 Backpropagation Algorithm (pp. 16-17)
    2.2 Convolutional Neural Network (pp. 17-20)
        2.2.1 Overview of the CNN Architecture (p. 17)
        2.2.2 Convolutional Layers (pp. 17-18)
        2.2.3 Pooling Layers (pp. 18-19)
        2.2.4 Fully Connected Layers (pp. 19-20)
        2.2.5 Training a CNN (p. 20)
    2.3 Recurrent Neural Network (pp. 20-24)
        2.3.1 LSTM Recurrent Neural Network (pp. 22-24)
        2.3.2 The Bidirectional RNN (p. 24)
    2.4 Word Embedding (pp. 24-27)
        2.4.1 Word2Vec (pp. 25-26)
        2.4.2 GloVe (pp. 26-27)
    2.5 Attention Mechanism in Deep Learning (pp. 27-28)
3 Related Work (pp. 28-33)
    3.1 Unsupervised Models for Sentence Embedding (pp. 28-32)
        3.1.1 The Paragraph Vector (pp. 28-29)
        3.1.2 The Skip-Thought Model (pp. 29-31)
        3.1.3 The FastSent Model (pp. 31-32)
        3.1.4 The Sequential (Denoising) Autoencoders Model (p. 32)
    3.2 Supervised Models for Sentence Embedding (pp. 32-33)
        3.2.1 Model without Attention Mechanism (pp. 32-33)
        3.2.2 Model with Attention Mechanism (p. 33)
4 Methodology (pp. 33-38)
    4.1 Word Embedding (pp. 33-35)
    4.2 The Bidirectional LSTM with Self-Attention Mechanism (pp. 35-36)
    4.3 The Convolutional Neural Network Based on Attention Pooling (pp. 36-38)
5 Implementation (pp. 38-64)
    5.1 Implementation of the Word Embedding Model (pp. 38-48)
        5.1.1 Data Presentation (p. 38)
        5.1.2 Concatenating All Review Text into One String (pp. 38-40)
        5.1.3 Tokenization into Sentences (pp. 40-41)
        5.1.4 Cleaning and Splitting Sentences into Words (p. 41)
        5.1.5 Setting the Numerical Parameters (pp. 41-42)
        5.1.6 Training Our Word2Vec Model (pp. 42-43)
        5.1.7 Storing and Loading (p. 43)
        5.1.8 Model Visualization (pp. 43-47)
        5.1.9 Most Similar Words (pp. 47-48)
    5.2 Implementation of the Proposed Method (pp. 48-57)
        5.2.1 Presentation of the Datasets (p. 48)
        5.2.2 Cleaning the Dataset (pp. 48-49)
        5.2.3 Building the Vocabulary of the Dataset (pp. 49-51)
        5.2.4 Choosing the Maximum Sequence Length (pp. 51-53)
        5.2.5 Building the Training, Validation, and Test Sets (p. 53)
        5.2.6 Implementation Details of the Model (pp. 53-56)
        5.2.7 Training the Model (pp. 56-57)
    5.3 Experimental Results (pp. 57-64)
        5.3.1 Comparison Systems (pp. 57-58)
        5.3.2 Optimal Parameter Settings (pp. 58-62)
        5.3.3 Results Comparison (pp. 62-64)
Conclusion (pp. 64-65)
References (pp. 65-67)

The full thesis is 67 pages.