Abstract (in Chinese) | pp. 3-7 |
Abstract | pp. 7-8 |
Chapter 1 Introduction | pp. 12-48 |
1.1 P value variability is an enormous challenge in statistical analysis | pp. 12-24 |
1.1.1 Rapid development of bioinformatics | pp. 12-13 |
1.1.2 Advances in biotechnology for biological and clinical analysis | pp. 13-17 |
1.1.3 Statistical feature-selection techniques in biological and clinical analysis | pp. 17-19 |
1.1.4 P value variability in statistical feature-selection methods | pp. 19-24 |
1.2 The relevance of power and reproducibility | pp. 24-35 |
1.2.1 Variability is inherent in the P value | pp. 24-25 |
1.2.2 Impact of P value variability on reproducibility in genomics | pp. 25-26 |
1.2.3 Impact of P value variability on reproducibility in proteomics | pp. 26-29 |
1.2.4 Multiple factors influence power | pp. 29-32 |
1.2.5 Power can be raised by network-based methods | pp. 32-35 |
1.3 Overlooked issues in feature selection for omics data analysis | pp. 35-48 |
1.3.1 Research on comparative statistical feature-selection methods | pp. 35-40 |
1.3.2 Normalization methods | pp. 40-42 |
1.3.3 Multiple testing corrections | pp. 42-44 |
1.3.4 Data heterogeneity | pp. 44-48 |
Chapter 2 Experimental section | pp. 48-58 |
2.1 Sample-to-sample P value variability | pp. 48-51 |
2.1.1 Dataset design | pp. 48-50 |
2.1.2 Statistical feature-selection methods | pp. 50-51 |
2.2 How P values vary in multivariate analysis | pp. 51-52 |
2.2.1 Dataset design | p. 51 |
2.2.2 Student's two-sample t-test | pp. 51-52 |
2.3 Some overlooked issues in univariate statistical feature selection in omics data | pp. 52-58 |
2.3.1 Dataset design | pp. 52-54 |
2.3.2 Univariate statistical feature-selection methods | pp. 54-55 |
2.3.3 Normalization methods (upstream) | pp. 55-56 |
2.3.4 Multiple testing corrections (downstream) | pp. 56-58 |
Chapter 3 P value variability and its implications in multiple testing | pp. 58-74 |
3.1 Results | pp. 58-71 |
3.1.1 P value variability of the t-test | pp. 58-65 |
3.1.2 P value variability of the U test | pp. 65-69 |
3.1.3 The t-test P value is more variable | p. 69 |
3.1.4 Impacts of P value instability in feature selection: a toy example | pp. 69-70 |
3.1.5 Impacts of P value instability in feature selection: a real clear cell renal | pp. 70-71 |
3.2 Discussions | pp. 71-72 |
3.2.1 The t-test is quite powerful, even in non-normal or mixed scenarios | pp. 71-72 |
3.2.2 Implications of P value instability for biological gene or protein selection | p. 72 |
3.3 Conclusions | pp. 72-74 |
Chapter 4 With great power comes great reproducibility | pp. 74-82 |
4.1 Results: P value and effect size variability | pp. 74-76 |
4.2 Discussions | pp. 76-82 |
4.2.1 Signal boosting transformations (SBTs) | pp. 76-77 |
4.2.2 Network-based statistical testing (NBST) | pp. 77-78 |
4.2.3 Additional metrics | pp. 78-79 |
4.2.4 Determination of phenotypic relevance is more important | pp. 79-82 |
Chapter 5 Implications of upstream/downstream processes for feature selection | pp. 82-100 |
5.1 Results | pp. 84-97 |
5.1.1 Impact of normalization methods on simulated and real datasets | pp. 84-88 |
5.1.2 Impact of multiple testing corrections on simulated and real datasets | pp. 88-93 |
5.1.3 Impact of normalization methods/multiple testing corrections on data with varying heterogeneity | pp. 93-97 |
5.2 Discussions | pp. 97-99 |
5.2.1 Univariate statistical feature-selection outcome depends strongly on the normalization method | pp. 97-98 |
5.2.2 Dangers of multiple testing corrections, especially for small-sample-size data | p. 98 |
5.2.3 Heterogeneity affects the consequences of normalization methods/multiple testing corrections | pp. 98-99 |
5.3 Conclusions | pp. 99-100 |
Chapter 6 Conclusion and outlook | pp. 100-102 |
6.1 Conclusion | p. 100 |
6.2 Outlook | pp. 100-102 |
References | pp. 102-117 |
Notes on publications and participation in scientific research | pp. 117-118 |
Acknowledgements | p. 118 |