Avatar photo

Order and Limit Results of Grouped Queries (Hooray!)

Greetings Keen community! I’d like to make a quick feature announcement that will (hopefully) make many of you happy 😊

At Keen IO we’ve created a platform for collecting and analyzing data. In addition to the ability to count the individuals who performed a particular action, the API includes the ability to group results by one or more properties of the events (similar to the GROUP BY clause in SQL). For example: count the number of individuals who made a purchase and group by the country they live in. This makes it possible to see who made purchases in the United States versus Australia or elsewhere.

This grouping functionality can be very powerful, but there’s one annoying drawback: if there are many different values for your group_by property then the results can get quite large. (In the example above note all of the tiny slivers representing countries with only a handful of purchases.) What if I’m only interested in the top 5 or 10? Until now the only option was to post-process the response on the client (e.g. using Python or JavaScript) to sort and then discard the unwanted groups.

Today I’m excited to announce that, by popular demand, we’ve made this much easier! We recently added a feature called order_by that allows you to rank and return only the results that you’re most interested in. (To those familiar with SQL: this works very much like the ORDER BY clause, as you might expect.)

The order_by parameter orders results returned by a group_by query. The feature includes the ability to specify ascending (ASC) or descending (DESC) ordering, and allows you to order by multiple properties and/or by the result of the analysis.

Most importantly the new order_by feature includes the ability to limit the number of groups that are returned (again, mirroring the SQL LIMIT clause). This type of analysis can help answer important questions such as:

  • Who are the top 100 game players in the US?
  • What are the top 10 most popular article titles from last week?
  • Which 5 authors submitted the most number of articles last week?
  • What are the top 3 grossing states based on sum purchases during Black Friday?

order_by can be used with any Keen query that has a group_by, which in turn can be used with most Keen analysis types. (limit can be used with any order_by query.) For more details on the exact API syntax please check out the order_by API docs.

There is one important caveat to call out: using order_by and limit in and of itself won’t make your queries faster or cheaper, because Keen still has to compute the full result in order to be able to sort and truncate it. But being able to have the API take care of this clean-up for you can be a real time saver; during our brief internal beta I’ve already come to rely on it as a key part of my Keen analysis toolbox.

I’d like to extend a huge thanks to our developer community for all the honest constructive feedback they’ve given us over the years (on this issue and many others). You’re all critical in helping us understand where we can focus our engineering efforts to provide the most value. On that note: we have many more product enhancements on the radar for 2018, so if you want to place your votes we’re all ears! Feedback (both positive and negative) on the order_by feature is also welcome, of course. Please reach out to us at team@keen.io any time 🚀

Cheers,
Kevin Litwack | Platform Engineer