Introducing Data Science – the book

Everybody should be data literate.

Data literacy is the ability to read, understand, create, and communicate data as information. Much like literacy as a general concept, data literacy focuses on the competencies involved in working with data. [definition from Wikipedia]

Our society is built on, and driven by, a hurricane of data, and our lives are shaped by it, in both big and small ways.  Throughout the book we shine a light on the role data plays in our everyday existence, and how to use technology to take control of that data.

The book gives the reader a solid grounding in the core of data science: a set of skills that equates to some undefined, but tangible, level of data literacy.  It weaves together a broad and eclectic range of concepts into a coherent and conjoined journey, using logical, hands-on and engaging examples, exercises and anecdotes.

Beginning Data Science, IoT, and AI on Single Board Computers: Core Skills and Real-World Application with the BBC micro:bit and XinaBox

This book was co-written by me (Philip) and my co-author Pradeeka Seneviratne.  We were also helped by the support team at Apress (especially Natalie and Jessica – thanks :).  The term ‘we’ below refers collectively to everyone who helped see this book through to publication.

Hardware agnostic:

In the book we use BBC micro:bit and XinaBox hardware for the activities, but we go to great pains to explain that the hardware is just an enabler – it doesn’t matter what hardware you use (micro:bit, Arduino, RPi, Adafruit, SparkFun, XinaBox etc).

The hardware requirements are outlined in such a way that you could implement the examples with any hardware system, and the software requirements are outlined in natural language, to make it easy to port.

You could even read through the book and treat the exercises as thought experiments:

The core value of the book is the journey. In each chapter we look at increasingly sophisticated technology, and how it can be used to undertake data science activities. We show how to build the technology too, but that is NOT the point of the book at all.   You could even say that the technology is a barrier to undertaking the data science journey outlined in the book – a barrier that the book expressly tries to lower.

Coding skills not required:

This is NOT a coding book.  Nevertheless, unless you are able to buy digital instruments that meet all the needs outlined in the book, some coding is going to be necessary.

We provide all of the code needed for the book on our supporting website, so you don’t have to write a single line yourself (as long as you are using the same hardware, of course).

Coding IS fun though, and it’s a useful skill to have (just look at other links on my blog site – I do it as a hobby :).  In the book we show how to put the code together, using MakeCode where possible or MicroPython where not. All code is explained in ‘natural language’ first.  Then the natural language is linked to the coding syntax, to show how natural language code translates into computer code.  Finally a full listing of the code is provided, and a digital copy of this is available on the resources website.  And our code samples are as simple as possible – we do just enough to get the job done.
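To give a flavour of that natural-language-first approach, here is a minimal sketch. It is not taken from the book: the function names, the 25 degree limit and the sample readings are all made up for illustration, and it is plain Python rather than MicroPython on a micro:bit, so it runs on any computer without hardware. The natural language comes first, as comments, and each step is then linked to a line of code.

```python
# Natural language first:
#   1. Take a temperature reading.
#   2. If the reading is above a limit, report "too warm".
#   3. Otherwise report "ok".
#   4. Repeat for every reading we have.

def classify_reading(temperature_c, limit_c=25.0):
    """Steps 2 and 3: compare one reading against the limit."""
    if temperature_c > limit_c:   # step 2: above the limit
        return "too warm"
    return "ok"                   # step 3: everything else


def classify_all(readings, limit_c=25.0):
    """Step 4: repeat the check for every reading."""
    return [classify_reading(r, limit_c) for r in readings]


if __name__ == "__main__":
    # Step 1: made-up values standing in for a real sensor (an assumption).
    sample_readings = [21.5, 24.9, 25.0, 27.3]
    for reading, label in zip(sample_readings, classify_all(sample_readings)):
        print(f"{reading:.1f} C -> {label}")
```

On real hardware only step 1 would change: the made-up list would be replaced by whatever your sensor provides.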

As we say in the book: you don’t need to program your fridge or your toaster, or even your Tesla, before you use it.  It can be the same for the digital instruments we build as well.  Not everyone is or wants to be a coder, and the book talks about how coding is not a key skill in data literacy / data science.

There is some cool stuff in the book for people interested in coding, but if your interest is more towards the data science side of things then you will be able to work through the book, and find a great deal of value in doing so.

A narrative journey:

The journey begins with using a simple thermometer and ends with building an AI-enabled weather station which uses a Machine Learning engine to predict the likelihood of rain.  The road takes in correlations and ethics and even glances at the concept of free will!  We provide simple and understandable definitions for vague and technically confusing concepts, and we give real-world examples and applications of everything we discuss.  We are not experts, we are YOU: we have figured out how to use the technology of the day to enhance our data literacy and we want to share this with you.

In every chapter we explore more and more sophisticated technologies.  We work step-by-step through them and see how they can deliver outcomes that are of value to us.  We look at how these technologies unlock techniques and skills that can be used in our data science endeavours.

You will learn how to use all sorts of technologies, including Bluetooth, Wi-Fi, IoT and AI / ML, and also see what they are useful for.  We look at these through the lens of data science and ask how they can be of value to us, how they meet our data science needs or expand our capabilities.  We then look at practical ways to implement this.

Cut through the myths and jargon

IoT and AI are EASY TO USE, seriously!  But as soon as an expert tries to explain them the average person gets lost very quickly: mired in details and acronyms that obfuscate rather than illuminate.

You do not need to know how to build a car or construct a road to be able to drive to the shop. And we do not need to be engineers or PhD graduates to USE complex technologies, including IoT and AI.

And that is the key to the book. Using stuff. Not making it, or reviewing tediously long bullet lists of technical features, or trying to memorise vague aspects of a protocol.  I know very little about any of the protocols associated with IoT, but I happily and successfully use it on a daily basis.  It’s that kind of book.

At the end of the book you will have a range of practical skills which you will be able to use to do stuff, and you will have an ample understanding of the technologies you are using.  I am not sure how well it will prepare you for exams, but it will enhance (or reinforce) your data literacy.

An eclectic range of skills and information

When I set out to write this book I had a crisis of confidence: by no definition am I a ‘Data Scientist’ (I was, sort of, 20-odd years ago when I worked in quantitative research; but that was a different world).  What right did I have to try to lay down foundations for data literacy?

I am lucky to have had a colourful and varied career with at least four significant career pivots.  I spent years teaching in high school, I was a freelance programmer for a spell, and I spent a decade doing cutting-edge market research and a decade in software and hardware development. Most recently I’ve been offering consultancy services in the EDTECH sector, currently as COO for XinaBox.

I drew on all these experiences to craft a journey that is greater than the sum of its parts – a book that required all these different perspectives to come to fruition. A practical, pragmatic and accessible tome: my idea of the core set of skills that any data scientist, or data literate individual, needs to have.

It is a bit bold in places:

  • I have taken positions on things that do not command consensus, e.g.  I had to provide usable definitions for nebulous terms like “data science” and “intelligence”.
  • I have explained some things in ways that I know purists and technical gurus will find simplistic or flawed, e.g. I have explained the Internet of Things from the perspective of the hardware that it encompasses and not the protocols that comprise it.  I am going to get moaned at for that in the pub I am sure 🙁
  • I have omitted or spent very little time on topics that I KNOW the experts insist are key (I interviewed a few beforehand).  An example of this is Python: the best data scientists I know are proficient with Python and its use is embedded in their data science endeavours. The reader will find plenty packed into the limited space we had in the book, and the journey is self-contained and complete.

Whilst crafting this journey the yard-stick was Occam’s razor: if a line of code, or a buzz-word, or an explanation, or a concept was not key to the narrative then it ended up on the chopping board.  I have never read the MQTT specification – I know how to use it and I use it quite a lot; and I will show you how to use it in Chapter 9.

Charitable works:

Pradeeka and I have agreed to donate all royalties from the first 6 months to undertake charitable activities:

We will be buying up copies of the book and the hardware used in the book and we will be donating these free to teachers who operate in disadvantaged circumstances.  100% of the royalties will be invested in this way.

There’s always one thing to moan about:

I have one very minor moan: I had to use Amerikan spelling. And every time I had to change CORRECT spelling to dumbed-down-US-pidgin-spelling a small part of me died.

Tbh it was surprising to me to find how much I cared about this.  Perhaps it was some kind of transference mechanism – it was quite stressful writing the book and holding a full time job.  So it took a bit longer, but they do say that the best things usually do!

Thank you and I hope you enjoy the journey 🙂