Keywords: big data; data mining; knowledge discovery; medicine
Abstract: The past decade has seen explosive growth in digitized medical data. This trend offers medical practitioners an unparalleled opportunity to assess the effectiveness of treatments using summary statistics and to offer patients more personalized treatments based on predictive analytics. To exploit this opportunity, statisticians and computer scientists need to work and communicate effectively with medical practitioners to ensure that data are measured properly, that sufficient volumes of heterogeneous data are collected to ensure patient privacy, and that the probabilities and sources of error associated with data sampling are understood. Such interdisciplinary collaborations are likely to lead to more effective methods for explaining probabilities, possible errors, and risks associated with treatment options to patients. This chapter introduces some online resources to help medical practitioners with little or no background in summary and predictive statistics learn basic statistical concepts and carry out data analysis on their personal computers using R, a high-level programming language that requires relatively little training. Readers who are interested only in understanding basic statistical concepts may want to skip the subsection on R.