Building COVID-19 Dashboard in Golang with Google BigQuery

It has been almost half a year since the first case relating to the COVID-19 pandemic in Singapore, the country I am now working at, was confirmed. Two days after the first case was confirmed in Singapore, eight travellers entering Malaysia, my home country, from Singapore were confirmed to be infected as well.

Since then, we were asked to work from home as travel restriction is applied in both countries. While the situation is not getting better, it’s quite disappointing to know that there are still people believing that COVID-19 is a hoax.

Fortunately, there are still a lot more people working hard in this tough period. Earlier on, my friend who is doing research in Colorado told me that she’s working hard with a group of scientists to educate the public about the virus.

🎨  People endured hours-long queues to enter Singapore from Malaysia before the travel restrictions to curb the spread of COVID-19 came into force. 🎨 

In addition, to aid the researchers and data scientists in an effort to combat the pandemic disease, Google BigQuery also decided to host a repository of public datasets from JHU CSSE (Johns Hopkins Center for Systems Science and Engineering). With the public datasets, we can now query up to 1TB for free each month on COVID-19 datasets and the queries over COVID-19 data are free (until 15th of September 2020).

In my previous article, I talked about how Google BigQuery could work together with Google Data Studio to render beautiful reports without any coding. Thus, in this article, I will show how we can write a simple web client in Golang to fetch data from the BigQuery via its API.

BigQuery Public Datasets Programme

There are a huge number of datasets hosted by Google where we can access and integrate them into our applications but Google pays for the storage. Using the public datasets, we only need to pay for the queries we perform on the data.

🎨  There are a lot of public datasets available in the GCP Marketplace. 🎨 

In order to access the public datasets, we first need to enable them through the Google BigQuery documentation (I find this to be quite funny because Google makes the enabling link to be so hidden). In the “Using the Web UI” page, as shown in the screenshot below, we can then find an URL which will let us open the public datasets project manually through browser (Remember to update the &page=project to your project in GCP).

There are also detailed steps written in the documentation of Data Analytics Products (Yes, the same info is spread all over different places).

🎨  The link to enable the public datasets in the web UI. 🎨 

The COVID-19 Dataset

Once we have done the steps above, we shall see the public datasets, including the COVID-19 datasets, available in our Google BigQuery. The dataset that I will be using in this article is the covid19_jhu_csse, a daily updated data repository for COVID-19 from JHU CSSE.

🎨  The covid19_jhu_csse dataset. 🎨 

There are four tables under the dataset where the first three recording the number of confirmed cases, the number of reported deaths, and the number of recovered cases, respectively, in each of the country or region.

The interesting about the first three tables is that they recorded the numbers of each day in a separated column. Hence, every day, there will be one new column added to three of the tables. I’m not sure why they do so but this actually requires us to write our own client in order to get the data. Google Data Studio cannot work well with dynamic column names.

🎨  A column for each of the day. 🎨 

Luckily, there is a fourth table called summary which actually has just one date column and every record for each day is one row instead of one column. This is a more SQL-friendly table and can be integrated with Google Data Studio easily.

🎨  The summary table is more SQL-friendly because the date is stored in just one column. 🎨 

In this article, I will demonstrate using 1st, 2nd, and 4th table in order to show how we can programmatically get the data through the BigQuery API.

BigQuery Client Library for Golang

There are many client libraries of Google BigQuery for different types of programming languages, including C#. In this article, we choose to use Golang.

Before we proceed, we need to make sure that we have already enabled the BigQuery API for our project in the GCP. From the GCP Cloud Console, we will get the credential which will allow us to connect to the Google BigQuery and thus we must keep this credential file in a safe and secret place.

Now we can proceed to build our Golang client.

Firstly, we need to install the client library using go get command.

go get -u cloud.google.com/go/bigquery

Secondly, we need to initialise a Google BigQuery client.

ctx := context.Background()
client, err := bigquery.NewClient(ctx, projectID)

if err != nil {
    log.Fatalf("bigquery.NewClient: %v", err)
}

defer client.Close()

Querying the Tables

Next, we can start to query the data in the BigQuery.

rows, err := queryData1(ctx, client)
if err != nil {
    log.Fatal(err) 
}

queryResult := processQueryResult1(rows)

If we have other different queries for different tables or even datasets, we can continue to query in the same way as above.

So what does queryData1 look like? It is basically as simple as follows.

func queryData1(ctx context.Context, client *bigquery.Client) (*bigquery.RowIterator, error) { 
    query := client.Query("<SQL here>")
    
    return query.Read(ctx)
}

For example, if we are fetching the date as well as numbers of confirmed cases and deaths, we will be using the the following SQL.

`SELECT 
    CAST(date as STRING) as date, 
    IFNULL(confirmed, 0) as confirmed_cases, 
    IFNULL(deaths, 0) as deaths 
FROM ` + "`bigquery-public-data.covid19_jhu_csse.summary`" + ` ORDER BY date;

There are a few things to take note here is the use of CAST.

It casts the date field to string otherwise we may encounter problems such as having error of “schema field date of type DATE is not assignable to struct field date of type time.Time” when we unmarshal the returned JSON from the BigQuery in Golang later. The reason why I choose CAST is because casting from a date type to a string is independent of time zone and is of the form YYYY-MM-DD.

In addition, we also use IFNULL to make sure that the value in the confirmed_cases and deaths are always non-negative integers. In the original tables, the numbers can be null.

Now, we just need to have a struct where we can apply RowIterator.Next() to load each row into it. The struct that corresponding to the SQL above is as follows.

type QueryResultDataRow struct { 
    Date           string `bigquery:"date"` 
    ConfirmedCases int64  `bigquery:"confirmed_cases"` 
    Deaths         int64  `bigquery:"deaths"`
}

To iterate, we can use the code below.

func processQueryResult1(iter *bigquery.RowIterator) []QueryResultDataRow { 
    var result []QueryResultDataRow

    for { 
        var row QueryResultDataRow

        err := iter.Next(&row)

        if err == iterator.Done { 
            break 
        }

        if err != nil { 
            log.Print(err) 
            continue 
        }

        result = append(result, row) 
    }
    
    return result
}

Here, I’d like to share that there was a mistake I made when I wrote the code above. I forgot that I should end the for loop when the iterator is done, i.e. when err == iterator.Done. So the return statement will never reach. Please take note of this when you are writing this type of iteration.

Challenge: The Tables Having Dates as Columns

If you would like to challenge yourself to retrieve the data from the tables having dates as their columns, it is possible too, just with a few challenges.

First challenge is that we are not sure when the dataset will be updated. So, we can never be sure for the value of the last column. Since the dataset will be updated daily, to be safe, we can let the date of two days ago to be the last column in our query.

Second challenge is the format of the date. We cannot use the Golang magical reference date (Mon, Jan 2 15:04:05 MST 2006) to format the date because of the underscores found in the column name. There is a very interesting discussion about the origin of the magical reference date on Stack Overflow, in case you are interested, but it’s not important here. Hence, we will use the following code to format the date instead.

latestDateInQuery := fmt.Sprintf("_%v_%v_%v", int(d.Month()), d.Day(), d.Year() - 2000)

So the following code will help us to get the count from the second latest, if not the latest, column.

latestDate := time.Now().AddDate(0, 0, -2)
latestDateInQuery := fmt.Sprintf("_%v_%v_%v", int(latestDate.Month()), latestDate.Day(), latestDate.Year()-2000)

Once we get the column name, we can then use it in the following query.

`SELECT 
    IFNULL(province_state, "") AS place, 
    country_region, 
    latitude, 
    longitude, 
    (` + latestDateInQuery + `) AS count 
FROM ` + "`bigquery-public-data.covid19_jhu_csse.confirmed_cases`;"

Visualising the Data

With the queries above, we can then easily generate results with Google Charts. Here, I use the Line Chart and the GeoChart.

🎨  The COVID-19 dashboard powered by Golang and Google BigQuery. 🎨 

There is an interesting feature in GeoChart is that, by default, when we are using latitude and longitude instead of the address to identify the places, the text shown on the map tooltip will be the latitude and longitude, which is not user friendly. However, we can actually change the text by putting a description column right after the longitude column, as discussed over here on Google Groups. It’s interesting because this is said to be an undocumented support for such a column. So we’re not sure where this will stop working.

Next, I am using web page done with Material Design to display the charts. Please enjoy the following screenshots.

🎨  Charts showing the situation in both of my beloved countries. 🎨 
🎨  Top 10 countries having the most confirmed cases. 🎨 
🎨  The global situation where we locate the places with latitudes and longitudes. 🎨 

That’s all for the COVID-19 dashboard done using Golang and Google BigQuery. Also, thanks to JHU CSSE and Google, we are able to access to such an important data for free.

Finally, I’d like to wish all of you and your loved ones to stay safe and healthy.

🎨  A nurse checks the temperature of a visitor as part of the COVID-19 screening procedure. (Photo Credit: The Straits Times) 🎨 

Nice Report Always Comes with Colourful Charts

Normally, we will show only tabular data in our online report module. However, tables are not exciting. So, why don’t we add some charts to the report to better describe the data? Thanks to Google Charts, web developers can add colourful charts to the web pages easily.

Step 1: The Required Libraries

The first thing we need is the Google API Loader which allows us later to easily load APIs from Google. So, with the help of it, we can make use of the Google Hosted Libraries.

 type="text/javascript" src="https://www.google.com/jsapi">

After that, we will load the API which will help us drawing charts. To do so, we will have the following JavaScript code.

google.load("visualization", "1.1", { packages: ["corechart"] });

Material Design

What is Google Material Design? There is an interesting video from Google Design team sharing the ideas behind it. (Image Credit: Google Design)
What is Google Material Design? There is an interesting video from Google Design team sharing the ideas behind it. (Image Credit: Google Design)

Recently, in order to support a common look-and-feel across all Google properties and apps running on Google platforms, Material Design is introduced to Google Charts as well.

There are more than 20 of charts available in Google Chart Gallery. Depends on which chart that you would like to use, you need to load different API in order to use the Material version of each chart. I played around with some of them which are useful in my use cases.

The first chart that I use is the Column Chart which is to draw vertical bar chart. So, the API can be called as follows.

google.load("visualization", "1.1", { packages: ["bar"] });

Other than vertical bar chart, another common graphs used in report will be Line Chart. The API of Line Chart can be called as follows.

google.load("visualization", "1.1", { packages: ["line"] });

Line Chart is available on Google Charts.
Line Chart is available on Google Charts.

The third chart that I use is Timeline. This chart is different from the two charts introduced above because so far I still can’t find the non-Material version of it. So, the only way to call the API of Timeline is as follows.

google.load("visualization", "1.1", { packages: ["timeline"] });

Timeline is a chart describing the happening events over time.
Timeline is a chart describing the happening events over time.

Step 2: The Data

After we have loaded API of the chart that we want to use, then we need to pump in the data.

var data = new google.visualization.DataTable();
data.addColumn('string', 'Branch');
data.addColumn('number', 'Sales');
data.addRows([
    ['Ang Mo Kio', 1205.80],
    ['Bedok', 828.90],
    ['Clementi', 2222.10],
    ['Dhoby Ghaut', 3180.00]
]);

In the code above, I created a data table with 2 columns. Then I added 4 rows of data with addRows.

The addRows part can be done using a simple for loop. However, due to the fact that not all browsers support Tailing Comma, the for loop needs to have additional step to remove Tailing Comma.

Step 3: Chart Render

After we have the data, now we can proceed to draw the chart. For example, if we want to draw a Column Chart for the data table above, then we will use the following code.

var chart = new google.visualization.ColumnChart(document.getElementById('chart_div'));chart.draw(data);

The HTML element chart_div above is just an empty div where the chart should be rendered at.

We will then be able to get the following diagram (Non-Material).

A simple Column Chart.
A simple Column Chart.

Customization of Charts with Options

Google Charts allows us to customize the diagram. For example, we can add title for the diagram, horizontal axis, and vertical axis.

var options = {
    title: 'Sales of Branches',
    hAxis: {
        title: 'Branch'
    },
    vAxis: {
        title: 'Amount (SGD)',
        minValue: 0
    },
    legend: {
        position: 'none'
    }
 };

chart.draw(data, options);

In the code above, I not only added titles, but I also force the vertical axis to start from 0 and hide the legend by setting its position to none.

Column chart is now updated with helpful titles.
Column chart is now updated with helpful titles.

What we have seen so far is the non-Material Column Chart. So how will a Material Column Chart look like?

To get a Material Column Chart, we will change the code above to the following.

var options = {
    chart: {
        title: 'Sales of Branches',
        subtitle: '2015 First Quarter'
    },
    axes: {
        y: { 0: { label: 'Amount (SGD)' } },
        x: { 0: { label: 'Branch' } } 
    },
    legend: {
        position: 'none'
    }
};

var chart = new google.charts.Bar(document.getElementById('chart_div'));

chart.draw(data, options);

Then we will be able to get the Material version of the chart. Now, we are even able to define a subtitle for the diagram.

Yup, we successfully upgraded our chart to the Material version.
Yup, we successfully upgraded our chart to the Material version.

In case you would like convert a non-Material Column Chart to a Material version, you can do so with the code below too.

chart.draw(data, google.charts.Bar.convertOptions(options));

Challenge with Tab in Bootstrap

When I was adding Google Charts to one of my web page with tabs using Bootstrap framework, I realized there was a problem with the display of the rendered diagram. The labels in the chart are incorrectly positioned. This problem has been discussed on Stack Overflow as well.

 

Display Issues of Google Charts in Bootstrap Tabs
Display Issues of Google Charts in Bootstrap Tabs

Interestingly, if the chart happens to be in the first tab which is visible by default, then there won’t be any display issue on the chart. This problem only occurs when the charts are located in those subsequent tabs which are hidden during the first load of the page. When user clicks on any of those subsequent tabs, then the display issue will happen.

So one obvious solution is actually to only call the Google API to render the graph when the tab is clicked. To be safe enough, I actually put a delay to the click event of the tab so that the chart will be drawn 3 seconds after the the corresponding tab is clicked. This seems to help fixing the display problem.

Similar to my solution, there is also a better alternative which is to bind the draw function to the show event of the tab, as discussed on Stack Overflow. I like the solution too because of its cleanliness of the code.

Try More with Google Charts

Google Charts is undoubtedly a very easy-to-use solution for web developers to present their data. So, please try it out and be amazed by the number of charts available on Google Charts.

Summer 2015 Self-Learning Project

This article is part of my Self-Learning in this summer. To read the other topics in this project, please click here to visit the project overview page.

Summer Self-Learning Banner