Big Data Analytics: An R Implementation (Reprint Edition, English) [Big Data Analytics with R]

By Simon Walkowiak


Publisher: Southeast University Press
Foreign-language title: Big Data Analytics with R




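  Big Data Analytics with R (reprint edition, English) opens with a brief account of the Big Data field and its current industry standards. It then introduces the development, structure, real-world applications, and shortcomings of the R language, followed by a review of the main R functions used for data management and transformation. Readers will learn about cloud-based Big Data solutions (for example Amazon EC2 instances and Amazon RDS, Microsoft Azure and its HDInsight clusters) and how to connect R to relational and non-relational databases (such as MongoDB and HBase). Beyond this, the book covers Big Data tools such as Apache Hadoop, HDFS, and MapReduce, as well as other R-compatible tools such as Apache Spark with its machine learning library Spark MLlib, and H2O.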


  Simon Walkowiak is a cognitive neuroscientist and a managing director of Mind Project Ltd, a Big Data and Predictive Analytics consultancy based in London, United Kingdom. As a former data curator at the UK Data Service (UKDS, University of Essex), Europe's largest socio-economic data repository, Simon has extensive experience in processing and managing large-scale datasets such as censuses, sensor and smart meter data, telecommunication data, and well-known governmental and social surveys such as the British Social Attitudes survey, Labour Force surveys, Understanding Society, National Travel survey, and many other socio-economic datasets collected and deposited by Eurostat, World Bank, Office for National Statistics, Department of Transport, NatCen, and International Energy Agency, to mention just a few. Simon has delivered numerous data science and R training courses at public institutions and international companies. He has also taught a course in Big Data Methods in R at major UK universities and at the prestigious Big Data and Analytics Summer School organized by the Institute of Analytics and Data Science (IADS).




Chapter 1: The Era of Big Data
Big Data - The monster re-defined
Big Data toolbox - dealing with the giant
Hadoop - the elephant in the room
Hadoop Spark-ed up
R - The unsung Big Data hero

Chapter 2: Introduction to R Programming Language and Statistical Environment
Learning R
Revisiting R basics
Getting R and RStudio ready
Setting the URLs to R repositories
R data structures
Data frames
Exporting R data objects
Applied data science with R
Importing data from different formats
Exploratory Data Analysis
Data aggregations and contingency tables
Hypothesis testing and statistical inference
Tests of differences
Independent t-test example (with power and effect size estimates)
ANOVA example
Tests of relationships
An example of Pearson's r correlations
Multiple regression example
Data visualization packages

Chapter 3: Unleashing the Power of R from Within
Traditional limitations of R
Out-of-memory data
Processing speed
To the memory limits and beyond
Data transformations and aggregations with the ff and ffbase packages
Generalized linear models with the ff and ffbase packages
Logistic regression example with ffbase and biglm
Expanding memory with the bigmemory package
Parallel R
From bigmemory to faster computations
An apply() example with the big.matrix object
A for() loop example with the ffdf object
Using apply() and for() loop examples on a data.frame
A parallel package example
A foreach package example
The future of parallel processing in R
Utilizing Graphics Processing Units with R
Multi-threading with Microsoft R Open distribution
Parallel machine learning with H2O and R
Boosting R performance with the data.table package and other tools
Fast data import and manipulation with the data.table package
Data import with data.table
Lightning-fast subsets and aggregations on data.table
Chaining, more complex aggregations, and pivot tables with data.table
Writing better R code

Chapter 4: Hadoop and MapReduce Framework for R
Hadoop architecture
Hadoop Distributed File System
MapReduce framework
A simple MapReduce word count example
Other Hadoop native tools
Learning Hadoop
A single-node Hadoop in Cloud
Deploying Hortonworks Sandbox on Azure
A word count example in Hadoop using Java
A word count example in Hadoop using the R language
RStudio Server on a Linux RedHat/CentOS virtual machine
Installing and configuring RHadoop packages
HDFS management and MapReduce in R - a word count example
HDInsight - a multi-node Hadoop cluster on Azure
Creating your first HDInsight cluster
Creating a new Resource Group
Deploying a Virtual Network
Creating a Network Security Group
Setting up and configuring an HDInsight cluster
Starting the cluster and exploring Ambari
Connecting to the HDInsight cluster and installing RStudio Server
Adding a new inbound security rule for port 8787
Editing the Virtual Network's public IP address for the head node
Smart energy meter readings analysis example - using R on HDInsight cluster

Chapter 5: R with Relational Database Management Systems (RDBMSs)
Relational Database Management Systems (RDBMSs)
A short overview of used RDBMSs
Structured Query Language (SQL)
SQLite with R
Preparing and importing data into a local SQLite database
Connecting to SQLite from RStudio
MariaDB with R on an Amazon EC2 instance
Preparing the EC2 instance and RStudio Server for use
Preparing MariaDB and data for use
Working with MariaDB from RStudio
PostgreSQL with R on Amazon RDS
Launching an Amazon RDS database instance
Preparing and uploading data to Amazon RDS
Remotely querying PostgreSQL on Amazon RDS from RStudio

Chapter 6: R with Non-Relational (NoSQL) Databases
Introduction to NoSQL databases
Review of leading non-relational databases
MongoDB with R
Introduction to MongoDB
MongoDB data models
Installing MongoDB with R on Amazon EC2
Processing Big Data using MongoDB with R
Importing data into MongoDB and basic MongoDB commands
MongoDB with R using the rmongodb package
MongoDB with R using the RMongo package
MongoDB with R using the mongolite package
HBase with R
Azure HDInsight with HBase and RStudio Server
Importing the data to HDFS and HBase
Reading and querying HBase using the rhbase package

Chapter 7: Faster than Hadoop - Spark with R
Spark for Big Data analytics
Spark with R on a multi-node HDInsight cluster
Launching HDInsight with Spark and R/RStudio
Reading the data into HDFS and Hive
Getting the data into HDFS
Importing data from HDFS to Hive
Bay Area Bike Share analysis using SparkR

Chapter 8: Machine Learning Methods for Big Data in R
What is machine learning?
Supervised and unsupervised machine learning methods
Classification and clustering algorithms
Machine learning methods with R
Big Data machine learning tools
GLM example with Spark and R on the HDInsight cluster
Preparing the Spark cluster and reading the data from HDFS
Logistic regression in Spark with R
Naive Bayes with H2O on Hadoop with R
Running an H2O instance on Hadoop with R
Reading and exploring the data in H2O
Naive Bayes on H2O with R
Neural Networks with H2O on Hadoop with R
How do Neural Networks work?
Running Deep Learning models on H2O

Chapter 9: The Future of R - Big, Fast, and Smart Data
The current state of Big Data analytics with R
Out-of-memory data on a single machine
Faster data processing with R
Hadoop with R
Spark with R
R with databases
Machine learning with R
The future of R
Big Data
Fast data
Smart data
Where to go next