10 new books added to Big Book of R

21 July 2021

In this post I’m highlighting 10 new books added to Big Book of R. Thank you to the authors for writing them, and thanks to R Posts you might have missed, where I found a bunch of these.

Hiring Data Scientists and Machine Learning Engineers

Roy Keyes

It’s quite possible that the only thing more confusing than defining data science is actually hiring data scientists. Hiring Data Scientists and Machine Learning Engineers is a concise, practical guide to cut through the confusion. Whether you’re the founder of a brand new startup, the senior vice president in charge of “digital transformation” at a global industrial company, the leader of a new analytics effort at a non-profit, or a junior manager of a machine learning team at a tech giant, this book will help walk you through the important questions you need to answer to determine what role and which skills you should hire for, how to source applicants, how to assess those applicants’ skills, and how to set your new hires up for success. Special emphasis is placed on in-office vs remote hiring situations.

https://www.bigbookofr.com/career-community.html#hiring-data-scientists-and-machine-learning-engineers

Introduction to Machine Learning Interviews Book

Chip Huyen

This book is the result of the collective wisdom of many people who have sat on both sides of the table and who have spent a lot of time thinking about the hiring process. It was written with candidates in mind, but hiring managers who saw the early drafts told me that they found it helpful to learn how other companies are hiring, and to rethink their own process.

The book consists of two parts. The first part provides an overview of the machine learning interview process, what types of machine learning roles are available, what skills each role requires, what kinds of questions are often asked, and how to prepare for them. This part also explains the interviewers’ mindset and what kind of signals they look for.

The second part consists of over 200 knowledge questions, each noted with its level of difficulty — interviews for more senior roles should expect harder questions — that cover important concepts and common misconceptions in machine learning.

https://www.bigbookofr.com/career-community.html#introduction-to-machine-learning-interviews-book

Spatial Microsimulation with R

Robin Lovelace and Morgane Dumont

Imagine a world in which data on companies, households and governments were widely available. Imagine, further, that researchers and decision-makers acting in the public interest had tools enabling them to test and model such data to explore different scenarios of the future. People would be able to make more informed decisions, based on the best available evidence. In this technocratic dreamland pressing problems such as climate change, inequality and poor human health could be solved.

These are the types of real-world issues that we hope the methods in this book will help to address. Spatial microsimulation can provide new insights into complex problems and, ultimately, lead to better decision-making. By shedding new light on existing information, the methods can help shift decision-making processes away from ideological bias and towards evidence-based policy.

https://www.bigbookofr.com/geospatial.html#spatial-microsimulation-with-r

Reproducible Medical Research with R

Peter D.R. Higgins, MD, PhD, MSc

This is a book for anyone in the medical field interested in analyzing the data available to them to better understand health, disease, or the delivery of care. This could include nurses, dieticians, psychologists, and PhDs in related fields, as well as medical students, residents, fellows, or doctors in practice.
I expect that most learners will be using this book in their spare time at night and on weekends, as the health training curricula are already packed full of information, and there is no room to add skills in reproducible research to the standard curriculum. This book is designed for self-teaching, and many hints and solutions will be provided to avoid roadblocks and frustration. Many learners find themselves wanting to develop reproducible research skills after they have finished their training, and after they have become comfortable with their clinical role. This is the time when they identify and want to address problems faced by patients in their practice with the data they have before them. This book is for you.

https://www.bigbookofr.com/life-sciences.html#reproducible-medical-research-with-r

R for Water Resources Data Science

Ryan Peek and Rich Pauloo

Consists of two courses:

Introductory:
This course is most relevant and targeted at folks who work with data, from analysts and program staff to engineers and scientists. This course provides an introduction to the power and possibility of a reproducible programming language (R) by demonstrating how to import, explore, visualize, analyze, and communicate different types of data. Using water resources based examples, this course guides participants through basic data science skills and strategies for continued learning and use of R.

Intermediate:
In this course, we will move more quickly, assume familiarity with basic R skills, and also assume that the participant has working experience with more complex workflows, operations, and code-bases. Each module in this course functions as a “stand-alone” lesson, and can be read linearly, or out of order according to your needs and interests. Each module doesn’t necessarily require familiarity with the previous module.

This course emphasizes intermediate scripting skills like iteration, functional programming, writing functions, and controlling project workflows for better reproducibility and efficiency. It also covers approaches to working with more complex data structures like lists and timeseries data, the fundamentals of building Shiny Apps, pulling water resources data from APIs, intermediate mapmaking and spatial data processing, and integrating version control in projects with git.

https://www.bigbookofr.com/life-sciences.html#r-for-water-resources-data-science-1

Book of R: A First Course in Programming and Statistics

Tilman M. Davies

The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis.

You’ll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. You’ll even learn how to create impressive data visualizations with R’s basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package.

https://www.bigbookofr.com/r-programming.html#book-of-r-a-first-course-in-programming-and-statistics

Data visualisation using R, for researchers who don’t use R

Emily Nordmann, Phil McAleer, Wilhelmiina Toivo, Helena Paterson, Lisa DeBruine

In this tutorial, we aim to provide a practical introduction to data visualisation using R, specifically aimed at researchers who have little to no prior experience of using R. First we detail the rationale for using R for data visualisation and introduce the “grammar of graphics” that underlies data visualisation using the ggplot2 package. The tutorial then walks the reader through how to replicate plots that are commonly available in point-and-click software such as histograms and boxplots, as well as showing how the code for these “basic” plots can be easily extended to less commonly available options such as violin-boxplots.

https://www.bigbookofr.com/data-visualization.html#data-visualisation-using-r-for-researchers-who-dont-use-r

R for Conservation and Development Projects: A Primer for Practitioners

Nathan Whitmore

This book is aimed at conservation and development practitioners who need to learn and use R in a part-time professional context. It gives people with a non-technical background a set of skills to graph, map, and model in R. It also provides background on data integration in project management and covers fundamental statistical concepts. The book aims to demystify R and give practitioners the confidence to use it.

Key Features:

• Viewing data science as part of a greater knowledge and decision making system
• Foundation sections on inference, evidence, and data integration
• Plain English explanations of R functions
• Relatable examples which are typical of activities undertaken by conservation and development organisations in the developing world
• Worked examples showing how data analysis can be incorporated into project reports

https://www.bigbookofr.com/life-sciences.html#r-for-conservation-and-development-projects-a-primer-for-practitioners

One Way ANOVA with R: Completely Randomized Design – Between Groups

Bruce Dudek

This document can be a standalone “how-to” document for R users. However, it is primarily intended for students in the APSY510/511 statistics sequence at the University at Albany. It is a fairly thorough treatment of graphical and inferential evaluation of one-factor designs. It presumes prior background coverage of the ANOVA logic from standard textbooks such as Howell or Maxwell, Delaney and Kelley (2017). The analyses are intended to parallel and exhaust the methods already covered with SPSS, and to extend them to additional topics.

https://www.bigbookofr.com/statistics.html#one-way-anova-with-r-completely-randomized-design—between-groups

An Open Compendium of Soil Datasets

Tomislav Hengl

(Not R-specific, but looks really relevant)

This is a public compendium of global, regional, national and sub-national soil samples and/or soil profile datasets (points with Observations and Measurements of soil properties and characteristics). Datasets listed here, assuming compatible open license, are afterwards imported into the Global compilation of soil chemical and physical properties and soil classes and eventually used to create better open soil information across countries. The specific objectives of this initiative are:

• To enable data digitization, import and binding + harmonization
• To accelerate research collaboration and networking
• To enable development of more accurate / more usable global and regional soil property and class maps (typically published via https://OpenLandMap.org)

https://www.bigbookofr.com/life-sciences.html#an-open-compendium-of-soil-datasets

Bonus book: Project Management Fundamentals for Data Analysts

This is a book I self-published a few weeks ago. I’ve received a lot of excellent reviews. It’s a quick read and packed with info that’ll help you for the rest of your career.

In Project Management Fundamentals for Data Analysts, I’ve boiled the concepts down to the bare essentials which can be read in under 15 minutes – you can certainly fit that into your crazy schedule (and it will help your future schedule not be so chaotic!).

These concepts can be used to great effect on their own if you wish to never read another word on the topic. It’ll also provide a solid foundation if you want to dive deeper into more formal courses or sophisticated theory.

https://oscarbaruffa.com/pm/
