Exploiting the data cubes

In this section we will use a web-based front-end to illustrate the operations on data cubes.

The sample data cube that we will be working with contains data about jobseekers without current employment in the Flemish region. We will use this cube to explore the evolution in the number of jobseekers aged over 50 and their level of education.

As a first step, we need to open the data cube on the web: open a browser window and go to http://vmopencube01.deri.ie/resource/OpenCubeOLAPBrowser.

Next, we have to select the cube containing the data that we are interested in. Click the “Please select a cube to visualize” list, and select the cube called “Cube educationlevel non-working job-seekers”.

(Screenshot: choose cube)

The screen now shows the dimensions and the measures that can be selected within this data cube. Going back to the goal that we set – to look into the evolution in the number of jobseekers aged over 50 and their level of education, in the Flemish region – we can determine which dimensions and measures we should select:

Once we have selected these dimensions, the screen changes to show dropdowns for Columns and Rows, but we don’t see any data yet. We also need to select the measure that we want to see data for; otherwise we will still see nothing except column and row headers.

As soon as we select the measure, the screen is updated to show the corresponding data from the cube (if you selected dimensions in a different order, the table may look different).

The first dimension that we selected has been set as the columns of our table, the second as the rows. We now see what is meant by Age group Level1: the data has been organised into groups for ages 0 to 24, 25 to 49, and 50+.
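The table that the browser builds at this point can be thought of as a cross-tabulation: one dimension supplies the columns, the other supplies the rows, and the measure is summed into each cell. The following is a minimal sketch of that idea in Python. It assumes that the year is the column dimension and the age group is the row dimension, and all numbers are invented for illustration; it does not reproduce how the OpenCube browser itself is implemented.

```python
from collections import defaultdict

# Invented observations: (year, age_group, jobseekers). Not real figures.
observations = [
    (1999, "0-24", 10),
    (1999, "25-49", 20),
    (1999, "50+", 5),
    (2000, "0-24", 12),
    (2000, "25-49", 18),
    (2000, "50+", 7),
    (2000, "50+", 3),  # a second observation for the same cell
]


def cross_tab(rows):
    """Sum the measure into table[age_group][year]."""
    table = defaultdict(lambda: defaultdict(int))
    for year, age_group, count in rows:
        table[age_group][year] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {row_key: dict(cols) for row_key, cols in table.items()}


table = cross_tab(observations)
for age_group, by_year in table.items():
    print(age_group, by_year)
```

Observations that fall into the same cell are added together, which is why the two 50+ entries for 2000 end up as a single value.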

At this stage, the data that we see does not yet meet our goal:

To get to the data that we want to see, we need to refine our selections further. In the dimensions panel, select the dimension that we are missing, “the highest level of an educational programme the person has successfully completed”, and then set that dimension as the value for Rows. Note also that a new panel has appeared: under Filter, select the age group that we are interested in, 50+.

(Screenshot: 50+)

Note how at this stage, we’re not selecting the other dimensions since they introduce distinctions that we have not specified as a part of our goal.
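Conceptually, the filter on the age group is a slice of the cube, and the dimensions that we leave unselected are simply summed away. The sketch below illustrates this under stated assumptions: the field names (`year`, `age`, `education`, `province`, `count`), the education values and all numbers are hypothetical, and the function `slice_and_tabulate` is an illustrative helper, not part of the OpenCube tools.

```python
from collections import defaultdict

# Invented observations; field names and numbers are made up for illustration.
observations = [
    {"year": 1999, "age": "50+", "education": "low", "province": "Antwerpen", "count": 4},
    {"year": 1999, "age": "50+", "education": "low", "province": "Limburg", "count": 6},
    {"year": 1999, "age": "50+", "education": "high", "province": "Antwerpen", "count": 2},
    {"year": 1999, "age": "25-49", "education": "low", "province": "Limburg", "count": 9},
    {"year": 2000, "age": "50+", "education": "low", "province": "Antwerpen", "count": 5},
    {"year": 2000, "age": "50+", "education": "high", "province": "Limburg", "count": 3},
]


def slice_and_tabulate(rows, filters, row_dim, col_dim, measure="count"):
    """Keep observations matching every filter, then sum the measure
    into table[row value][column value]. Dimensions that are neither
    filtered nor shown (here: province) are aggregated away."""
    table = defaultdict(lambda: defaultdict(int))
    for obs in rows:
        if all(obs[dim] == value for dim, value in filters.items()):
            table[obs[row_dim]][obs[col_dim]] += obs[measure]
    return {row_key: dict(cols) for row_key, cols in table.items()}


# Filter: age group 50+; Rows: education level; Columns: year.
result = slice_and_tabulate(observations, {"age": "50+"}, "education", "year")
print(result)
```

Because `province` is not mentioned in the call, the two 1999 observations for low education are added into one cell, which mirrors why unselected dimensions introduce no extra distinctions in the table.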

Since we have fewer rows than columns, we can view more data on screen if we swap those around and increase the number of rows shown:

The take-aways from what we have done so far can be summarized as follows:

Now that we have answered our question about the evolution of the number of jobseekers over 50 and their education level, what if we wanted to see the geographical distribution of those jobseekers? Remember that the cube still contains a couple of dimensions that we haven’t selected.

Starting from the set-up that we have above, under Dimensions let’s select the dimension labeled “The country or geographic area to which the measured statistical phenomenon relates”, and within that select the levels Province and District. Since we have selected a fourth dimension, it is added as a filter, with the first of its values applied. Our data has changed, since it now shows numbers for the Province of Antwerpen, but our goal of seeing the geographical spread has not quite been reached. To see the data in the way that we want, we need to make sure that the new dimension is displayed as either columns or rows: let’s select the geography dimension for Rows, and we get this:

(Screenshot: swapping)

One caveat here is that the time dimension has now become a filter, so we are only shown data for 1999 unless we select a different year. If we are no longer interested in that dimension, we can remove it.

You will remember that we selected two levels in the geography dimension: Province and District. The table currently shows data at the level of the Province. The data for Districts can however also be shown, grouped by province, by clicking the [+] next to each Province, as shown here:

Without going into further detail, it is easy to imagine many more ways of looking at these data, simply by selecting different dimensions, different levels within dimensions, and different combinations of filter values. In fact, all of these are instances of the operations that we discussed when we introduced data cubes: slicing and dicing, rolling up and drilling down.
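To make the last two operations concrete, the sketch below shows rolling up and drilling down on the Province and District levels used earlier. Showing totals per Province corresponds to a roll-up, and clicking the [+] next to a Province corresponds to a drill-down into its Districts. The counts are invented, and `roll_up` and `drill_down` are illustrative helper names rather than functions of any OpenCube component.

```python
from collections import defaultdict

# Invented counts per (province, district); the numbers are made up.
observations = [
    ("Antwerpen", "Antwerpen", 30),
    ("Antwerpen", "Mechelen", 10),
    ("Antwerpen", "Turnhout", 15),
    ("Limburg", "Hasselt", 12),
    ("Limburg", "Tongeren", 8),
]


def roll_up(rows):
    """Aggregate district-level counts to the province level."""
    totals = defaultdict(int)
    for province, _district, count in rows:
        totals[province] += count
    return dict(totals)


def drill_down(rows, province):
    """Show the district-level counts that make up one province."""
    detail = defaultdict(int)
    for row_province, district, count in rows:
        if row_province == province:
            detail[district] += count
    return dict(detail)


by_province = roll_up(observations)
antwerpen_detail = drill_down(observations, "Antwerpen")
print(by_province)
print(antwerpen_detail)
```

The district values for a province always add up to the rolled-up province total, which is a quick consistency check when moving between levels.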