2015-11-30

Cassandra TIme Series Bucketing

Intro

Bucketing is one of the most important techniques when working with time series data in Cassandra. This post has it's roots in two very popular blog entries:

The posts are very well written and the pretty much describe all of the standard techniques when it comes down to working with time series data in Cassandra. But to be honest there isn't all that much code in them. This is partly to a fact that almost every project has it's own specifics and from my experience it often happens that even within a relatively small team there will be multiple implementations on how to bucket and access the time series data.

The Case for Bucketing

For some time now I'm in the world if IoT and I find that explaining everything with a help of a simple temperature sensor is the best method to discuss the subject. Previously mentioned articles are also a good read. This section is sort of a warm up. Theoretically in most of the use cases we'll want to access temperature readings by some sensor Id and we know where this sensor is located. In the most simple case sensor id becomes the long row in cassandra and the readings are stored in it and kept sorted by time etc. However in some cases the temperature may be read very often and this could cause the wide row to grow to a proportion that is not manageable by cassandra so the data has to be split among multiple long rows. The easiest method to make this split is to make multiple long rows based on the measurement timestamp.

How big should my buckets be?

It may vary from project to project, but it depends on two important factors. How many readings are you storing per single measurement and how often the measurement is happening. For instance if you are recording a reading once per day you probably don't even need the bucketing. Also if you are recording it once per hour the project you are working on probably wont't last long enough for you to run into problem. It applies to seconds too, but only for the most trivial case where you are making a single reading. If you go into frequencies where something is happening on the milliseconds level you will most definetly need bucketing. The most complex project I worked up until now had time bucketing on a level of a single minute. meaning every minute, new bucket. But that project is not in the IoT world, In that world I'm using partitions on a month basis.

10 000 feet Bucketing View

Main problem is how to calculate the bucket based on measurement time stamp. Also keep in mind there might be differences between the timezones, in a distributed system a very advisable practice is to save everything in the UTC format. If we decided that wee need bucketing per day it could be something as simple as the following:

    FastDateFormat dateFormat = FastDateFormat.getInstance(
        "yyyy-MM-dd", TimeZone.getTimeZone("UTC"));

    public String dateBucket(Date date) {
        return dateFormat.format(date;
    }
    
That's it, combine this with your sensor Id and you get buckets on a day level basis. Now the problem is how to retrieve the measurements from buckets. Especially if you have to fetch the measurements across multiple buckets. We'll go over this in the next section.

Anything goes

Bare in mind that you should keep buckets in time series data easy to maintain. Also try to avoid having multiple implementation for the same thing in your code base. This section will not provide 100% implemented examples but will be more on a level of a pseudo code.

When you are fetching the data from the buckets, you will have two types of query. One is to fetch data out from the bucket without any restrictions on measurement time stamp. The other is when you will want to start from a certain position within the bucket. Again there is a question of ordering and sorting the retrieved data. I worked in systems having all sorts of practices there, most of the time reversing was done with a help of a specific boolean flag but my opinion is this should be avoided. It's best to stick to the from and to parameters and order the data according to them. i.e.

        from:   01/01/2016
        to:     02/02/2016
        returns: ascending

        from:   02/02/2016
        to:     01/01/2016
        returns: descending
    
That way you don't have to break you head and think about various flags passed over the levels in your code.

Here is a bit of pseudo code:

        // constructor of your iterator object

        startPartition = dateBucket(from);
        endPartition = dateBucket(to);

        lastFetchedToken = null;

        bucketMoveCount = 0;

        String statement = "SELECT * FROM readings"

        // from past experience, somehow the driver takes out data the fastest
        // if it fetches 3000 items at once, would be interesting to drill down
        // why is this so :)

        int fetchSize = 3000;

        if (from.isBefore(to)) {
            select = statement + " ORDER BY measurement_timestamp ASC LIMIT " + fetchSize;
            selectFromBoundary = statement + " AND measurement_timestamp > ? ORDER BY measurement_timestamp ASC LIMIT " + fetchSize;

            partitionDiff = -1f;
        } else {
            selectNormal = statement + " LIMIT " + fetchSize;
            selectFromBoundary = statement + " AND measurement_timestamp < ? LIMIT " + fetchSize;

            partitionDiff = 1f;
        }
    
Partition could move by hour, day, minute. It all depends on how you decide to implement it. You will have to do some time based calculations there I recommend using Joda-Time there. Now when you defined how init of an iterator looks like, it's time to do some iterations over it:
    public List<Row> getNextPage() {

        List<Row> resultOut = new ArrayList<>();

        boolean continueFromPreviousBucket = false;

        do {
            ResultSet resultSet =
                    lastFetchedToken == null ?
                            session.execute(new SimpleStatement(select, currentBucket)) :
                            session.execute(new SimpleStatement(selectFromBoundary, currentBucket, lastToken));

            List<Row> result = resultSet.all();

            if (result.size() == fetchSize) {
                if (continueFromPreviousBucket) {
                    resultOut.addAll(result.subList(0, fetchSize - resultOut.size()));
                } else {
                    resultOut = result;
                }

                lastFetchedToken = resultOut.get(resultOut.size() - 1).getUUID("measurement_timestamp");

            } else if (result.size() == 0) {
                currentBucket = calculateNextBucket();
                bucketMoveCount++;

            } else if (result.size() < fetchSize) {
                currentBucket = calculateNextBucket();
                bucketMoveCount++;

                lastFetchedToken = null;

                if (continueFromPreviousBucket) {
                    resultOut.addAll(result.subList(0, Math.min(result.size(), fetchSize - resultOut.size())));
                } else {
                    resultOut = result;
                }

                continueFromPreviousBucket = true;
            }

            if (resultOut.size() == fetchSize
                    || bucketMoveCount >= MAX_MOVE_COUNT
                    || Math.signum(currentBucket.compareTo(endPartition)) != okPartitionDiff) {
                break;
            }

        } while (true);

        return result;
    }
    

This is just a high level overview of how to move among the buckets. Actual implementation would actually be significantly different from project to project. My hope for this post is that you give the problems I faced a thought before you run into them.

No comments: