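About the Book
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you are familiar with the R programming language and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you will learn:
• Why exploratory data analysis is a key preliminary step in data science
• How random sampling can reduce bias and yield a higher-quality dataset, even with big data
• How the principles of experimental design yield definitive answers to questions
• How to use regression to estimate outcomes and detect anomalies
• Key classification techniques for predicting which categories a record belongs to
• Statistical machine learning methods that learn from data
• Unsupervised learning methods for extracting meaning from unlabeled data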
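About the Authors
Peter Bruce founded and grew the Institute for Statistics Education at Statistics.com, which now offers about 90 statistics courses, nearly half of them aimed at data scientists.
Andrew Bruce has more than 30 years of experience in statistics and data science across academia, government, and business. He holds a Ph.D. in statistics from the University of Washington and has published numerous papers in peer-reviewed journals.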
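Praise for the Book
"This book is not yet another statistics textbook, nor is it a machine learning manual. It is better: with clear explanations and abundant examples, it connects practical statistical terms and principles to today's data mining jargon and practice. It is a great reference for data science novices and experts alike."
- Galit Shmueli (lead author of the best-selling "Data Mining for Business Analytics" series and a renowned professor at Tsing Hua University, Taiwan)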
Table of Contents
Preface
1. Exploratory Data Analysis
Elements of Structured Data
Further Reading
Rectangular Data
Data Frames and Indexes
Nonrectangular Data Structures
Further Reading
Estimates of Location
Mean
Median and Robust Estimates
Example: Location Estimates of Population and Murder Rates
Further Reading
Estimates of Variability
Standard Deviation and Related Estimates
Estimates Based on Percentiles
Example: Variability Estimates of State Population
Further Reading
Exploring the Data Distribution
Percentiles and Boxplots
Frequency Table and Histograms
Density Estimates
Further Reading
Exploring Binary and Categorical Data
Mode
Expected Value
Further Reading
Correlation
Scatterplots
Further Reading
Exploring Two or More Variables
Hexagonal Binning and Contours (Plotting Numeric versus Numeric Data)
Two Categorical Variables
Categorical and Numeric Data
Visualizing Multiple Variables
Further Reading
Summary
2. Data and Sampling Distributions
Random Sampling and Sample Bias
Bias
Random Selection
Size versus Quality: When Does Size Matter?
Sample Mean versus Population Mean
Further Reading
Selection Bias
Regression to the Mean
Further Reading
Sampling Distribution of a Statistic
Central Limit Theorem
Standard Error
Further Reading
The Bootstrap
Resampling versus Bootstrapping
Further Reading
Confidence Intervals
Further Reading
Normal Distribution
Standard Normal and QQ-Plots
Long-Tailed Distributions
Further Reading
Student's t-Distribution
Further Reading
Binomial Distribution
Further Reading
Poisson and Related Distributions
Poisson Distributions
Exponential Distribution
Estimating the Failure Rate
……
3. Statistical Experiments and Significance Testing
4. Regression and Prediction
5. Classification
6. Statistical Machine Learning
7. Unsupervised Learning
Bibliography
Index
Practical Statistics for Data Scientists (Reprint Edition)