Working with Large Data Sets

In this chapter, we take a step forward in our data science experimentation by significantly increasing the volume of data that we collect. So far we have worked with a dozen data points, and that was adequate for our needs. But sometimes correlations (and other interesting features of data) are subtle and hard to find. The larger our data set, the more likely it is that the answers we are looking for lie within it.

Shortly after the book was finalised, the people at XinaBox dropped a bombshell: support for reading from and writing to a microSD card directly from MakeCode. It is a game changer for data science applications, effectively giving the micro:bit unlimited persistent memory. If you have an IM01 micro:bit bridge from XinaBox, use that and skip all the code relating to the Python-based file system, which is limited and a bit finicky to use.

If you do not have access to microSD capabilities for your micro:bit, then this chapter will be invaluable. We use MicroPython to take advantage of the built-in file system and write data to persistent memory in CSV format.
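Once a CSV file has been copied off the micro:bit, it can be read back into a program for analysis. The following is a minimal sketch in standard Python, assuming the two-column layout (temperature, then humidity) that the listings on this page write. The function name parse_readings and the sample values are made up for illustration and do not come from the book.

```python
def parse_readings(text):
    """Turn 'temperature,humidity' lines into a list of (float, float) tuples."""
    readings = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        temp_c, humidity = line.split(",")
        readings.append((float(temp_c), float(humidity)))
    return readings


# Made-up sample data in the same layout as we.csv
sample = "21.5,40.2\n21.6,40.1\n"
readings = parse_readings(sample)

# Simple summary statistic: mean temperature across all readings
mean_temp = sum(t for t, _ in readings) / len(readings)
print(readings, mean_temp)
```

In practice the text would come from reading the copied file rather than from a string literal.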

The code used is listed below:

Section 4.2:

Download pre-compiled hex file (in a .zip folder)

with open('hello.txt', 'w') as my_file:
    my_file.write("Hello, World!")
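To check that the write worked, the file can be opened again and read back. This is a small sketch rather than a listing from the book; it reuses the hello.txt name from the snippet above, and the variable name content is an illustrative choice.

```python
# Write a short text file, then read it back to confirm the contents.
with open('hello.txt', 'w') as my_file:
    my_file.write("Hello, World!")

# With no mode given, open() opens the file for reading
with open('hello.txt') as my_file:
    content = my_file.read()

print(content)
```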

Section 4.7:

Download pre-compiled hex file (in a .zip folder)

from microbit import *
from SW01 import SW01

import os
import sys

SW01 = SW01()
d = 0

try:
    d = os.size("we.csv")
except:
    pass

if (d > 0):
    display.show("*")
    sys.exit()

# Create an empty log file; the with block closes it straight away
with open("we.csv", 'w') as new_file:
    pass

while True:
    try:
        with open("we.csv") as a:
            cont = a.read()
            cont = cont + str(SW01.getTempC()) + "," + str(SW01.getHumidity()) + "\n"
    except:
        display.show(".")
        sys.exit()

    with open("we.csv", 'w') as b:
        b.write(cont)
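The loop above adds a row by reading the whole file, adding one line to the text, and writing everything back. That pattern can be wrapped in a helper, sketched here in plain Python so it can be tried on a computer without a micro:bit or sensor. The name append_row, the file name demo.csv, and the sample readings are hypothetical and not part of the book's code.

```python
def append_row(filename, temp_c, humidity):
    """Add one 'temperature,humidity' line by rewriting the whole file."""
    try:
        with open(filename) as f:
            content = f.read()
    except OSError:
        content = ""  # file does not exist yet, so start from nothing
    content = content + str(temp_c) + "," + str(humidity) + "\n"
    with open(filename, 'w') as f:
        f.write(content)


# Start from an empty file, then add two made-up readings
with open("demo.csv", 'w') as f:
    f.write("")
append_row("demo.csv", 21.5, 40.2)
append_row("demo.csv", 21.6, 40.1)

with open("demo.csv") as f:
    result = f.read()
print(result)
```

Because the whole file is held in memory on every pass, the amount of data that can be logged this way is bounded by available RAM as well as by storage space.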

Section 4.8:

Download pre-compiled hex file (in a .zip folder)

from microbit import *
from SW01 import SW01

import os
import sys

SW01 = SW01()
d = 0

try:
    d = os.size("we.csv")
except:
    pass

if (d > 0):
    display.show("*")
    sys.exit()

bp = False

while True:
    if button_a.was_pressed():
        # Button A starts logging: create an empty file and close it again
        with open("we.csv", "w") as new_file:
            pass
        bp = True
    if bp:
        try:
            with open("we.csv") as a:
                cont = a.read()
                cont = cont + str(SW01.getTempC()) + "," + str(SW01.getHumidity()) + "\n"
        except:
            # Show "." if the file could not be read
            display.show(".")
        with open("we.csv", 'w') as b:
            b.write(cont)

        sleep(1440000)      # 1,440,000 ms = 24 minutes; change this value to set the logging interval
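The micro:bit sleep() function takes its argument in milliseconds, so the value above corresponds to 24 minutes. A small helper makes the conversion explicit when choosing a different interval. The name minutes_to_ms is an illustrative assumption, not something from the book or the micro:bit library.

```python
def minutes_to_ms(minutes):
    """Convert a logging interval in minutes to milliseconds for sleep()."""
    return int(minutes * 60 * 1000)


interval_ms = minutes_to_ms(24)   # the interval used in the listing above
short_ms = minutes_to_ms(0.5)     # a 30-second interval, handy for testing
print(interval_ms, short_ms)
```

On the micro:bit, the result would be passed straight to sleep(), for example sleep(minutes_to_ms(24)).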

Errata:

Page 88, Figure 4.2, and Page 89, Figure 4.3

Both images show "Perssure", which should be "Pressure". In addition, Page 90, Figure 4.4 shows three data fields. Because we dropped humidity from the experiment a few paragraphs earlier, Figure 4.4 should show only two data fields.

Thanks to @Pycro1 for pointing these out 🙂