
Beyond Funneldamentals: A More Powerful Behavior Analysis API

A couple of weeks back, you may have seen that we released a slew of new features for Keen IO, including some new capabilities for our funnels API. If you know what funnels are and how they work, you were probably like, “Hey, yay, great, fun!” But if you’re anything like me (I hadn’t really used funnels before I started working on them here), you might have been more like, “Um, yay, great, fun? Now we can fill all those… data bottles more easily?”

So, for both you newbies and you sophisticated old hats, consider this a crash course in funnels — and in what Keen IO can now do to help you out with them.

Background: What Do Funnels Do, Anyway?

(Feel free to skip this section if you already have an understanding of what Keen’s funnels are.)

The best place to start, I think, is with an excerpt from our funnel documentation:

A funnel is a flow of events that a user performs on their way to reaching a goal. This flow could be a checkout process, registration, or lead conversion. When analyzing a funnel, you are concerned with the number of users that successfully make it to the next step, as well as the number of users that drop off. This will show you where your flow loses the most users, so you know where to focus your attention. The result from a funnel is a list of counts for each of the steps you specify. If you would like to read more, check out our blog posts on Cohort Analysis and Retention Analysis.

For example, your funnel could have these steps:

  1. Successful completion of an app’s tutorial.
  2. Creation of content in the app.
  3. Sharing of content with another user.

A funnel analysis with those steps would work like this:

  1. Count the number of unique users who completed the app’s tutorial.
  2. Of the users who were counted in step 1, count the number of them who created content.
  3. Of the users who were counted in step 2, count the number of them who shared content.

My takeaway from this was that funnels work across a series of steps, and that each step returns a count of the unique actors who also appear in every previous step. In the above example, for instance, an actor who was counted for step 3 must have been counted for steps 1 and 2 as well.
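To make that narrowing concrete, here is a minimal in-memory sketch in Python. It is not how Keen computes funnels: the funnel_counts helper and the sample user IDs are made up for illustration, and it ignores timeframes and event ordering entirely. It only shows the "each step is counted within the previous step" idea using set intersection.

```python
from typing import Dict, List, Optional, Set


def funnel_counts(steps: List[str], events: Dict[str, List[str]]) -> List[int]:
    """Count unique actors per step, keeping only actors seen in every earlier step.

    `events` maps a collection name to the actor IDs found in it
    (hypothetical sample data, not a Keen data structure).
    """
    counts: List[int] = []
    surviving: Optional[Set[str]] = None  # actors who made it through so far
    for collection in steps:
        actors = set(events.get(collection, []))  # set() drops duplicate events
        surviving = actors if surviving is None else surviving & actors
        counts.append(len(surviving))
    return counts


events = {
    "tutorial_completed": ["ann", "bob", "cal", "dee", "bob"],
    "content_created": ["ann", "bob", "eve"],  # eve never did the tutorial
    "content_shared": ["bob", "dee"],          # dee never created content
}

print(funnel_counts(["tutorial_completed", "content_created", "content_shared"], events))
# prints [4, 2, 1]
```

Note that eve and dee fall out of later steps even though they have events there, because they were not counted in the step before.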

Steps with Actors

You’ll note that we said that, for each step, the funnel returns a count of unique actors. So, if we sent a request like this:

[
    {"event_collection": "tutorial_completed",
     "actor_property": "user.id"},
    {"event_collection": "content_created",
     "actor_property": "user.id"},
    {"event_collection": "content_shared",
     "actor_property": "user.id"}
]

… then our results might look something like this:

{
    "result": [300, 120, 80],
    ⋮
}

This is great and all, but what if we wanted to see an actual list of those specific users who had completed a step — say, all the users who had created content in step 2? Enter with_actors!

For every step where with_actors is set to true, the unique values of the actor_property for that step will be returned, meaning that this:

[
    {"event_collection": "tutorial_completed",
     "actor_property": "user.id"},
    {"event_collection": "content_created",
     "actor_property": "user.id",
     "with_actors": true},
    {"event_collection": "content_shared",
     "actor_property": "user.id"}
]

… will return something like this:

{
    "result": [300, 120, 80],
    "actors": [
      null,
      ["alice@test.com", "bob@test.com", "carla@test.com", … ],
      null
    ],
    ⋮
}

Note that if you’ve set with_actors to true for some steps but not others, the steps where it hasn’t been set will just return a null in the actors list. If with_actors hasn’t been set for any of the steps, actors won’t be returned at all.
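If you are consuming a response like this in code, both cases need handling: a null entry for a step and a missing actors key altogether. Here is a small Python sketch of one way to do that. The actors_by_step helper is hypothetical, and the response dict is hand-written sample data shaped like the example above, not real API output.

```python
from typing import Any, Dict, List


def actors_by_step(response: Dict[str, Any]) -> Dict[int, List[str]]:
    """Map step index -> actor list, for the steps that returned actors.

    Returns an empty dict when the response has no "actors" key at all,
    i.e. when with_actors was not set on any step.
    """
    actors = response.get("actors")
    if actors is None:
        return {}
    # Steps without with_actors come back as null (None in Python); skip them.
    return {i: ids for i, ids in enumerate(actors) if ids is not None}


# Hand-written sample shaped like the response above (step indexes are 0-based).
response = {
    "result": [5, 3, 1],
    "actors": [None, ["alice@test.com", "bob@test.com", "carla@test.com"], None],
}

print(actors_by_step(response))
# prints {1: ['alice@test.com', 'bob@test.com', 'carla@test.com']}
print(actors_by_step({"result": [5, 3, 1]}))
# prints {}
```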

Optional Steps

Remember when I said that any actor counted in step 3 must have been in step 2 as well? Well, imagine now that we wanted to know which users have shared content through our service OR which users have viewed content. We might design that funnel like so:

  1. User completes tutorial.
  2. User creates content.
  3. User has viewed content.
  4. User shares content with others.

The thing is, as we’ve ordered this funnel right now, we’ll only count those users who have shared content if they’ve also viewed content, which could leave out a substantial group of content-sharing users. We could swap the order around, but we’d still end up with the same result. What we really want is 3 or 4, not 3 and 4.

Here’s where optional comes in. If we set optional to true on a step, that step still gets its own count, but it won’t affect the data in future steps, so we can design a funnel like this:

[
    {"event_collection": "tutorial_completed",
     "actor_property": "user.id"},
    {"event_collection": "content_created",
     "actor_property": "user.id"},
    {"event_collection": "content_viewed",
     "optional": true,
     "actor_property": "user.id"},
    {"event_collection": "content_shared",
     "actor_property": "user.id"}
]

… and we can expect a result like this:

{
    "result": [300, 120, 60, 80],
    ⋮
}
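Here is a rough Python sketch of how I picture optional working, under the assumption that an optional step is counted against the actors who survived the earlier steps but does not narrow the set handed to later steps. The funnel_with_optional helper and the sample data are invented for illustration, and timeframes and event ordering are ignored.

```python
from typing import Any, Dict, List, Optional, Set


def funnel_with_optional(steps: List[Dict[str, Any]],
                         events: Dict[str, List[str]]) -> List[int]:
    """Funnel counts where steps marked "optional" do not filter later steps."""
    counts: List[int] = []
    surviving: Optional[Set[str]] = None
    for step in steps:
        actors = set(events.get(step["event_collection"], []))
        matched = actors if surviving is None else surviving & actors
        counts.append(len(matched))
        # Only non-optional steps narrow the set passed to the next step.
        if not step.get("optional", False):
            surviving = matched
    return counts


events = {
    "tutorial_completed": ["ann", "bob", "cal", "dee", "eve"],
    "content_created": ["ann", "bob", "cal", "dee"],
    "content_viewed": ["ann", "bob"],
    "content_shared": ["bob", "cal", "dee"],
}
steps = [
    {"event_collection": "tutorial_completed"},
    {"event_collection": "content_created"},
    {"event_collection": "content_viewed", "optional": True},
    {"event_collection": "content_shared"},
]

print(funnel_with_optional(steps, events))
# prints [5, 4, 2, 3]
```

If the content_viewed step were not optional, the last count in this sample would drop to 1, since only bob both viewed and shared.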

Inverted Steps

OK, quick detour! There’s this well-known phenomenon called survivorship bias, where we have a tendency to consider the things that “survived” a process as more important than the things that dropped out along the way. There’s a famous story involving airplanes in WWII that I won’t get into here, but suffice it to say that negative data can sometimes be just as telling as positive data, in terms of why those drop-outs are dropping out.

What if we’re curious specifically about the drop-offs from our tutorial? Well, this is where inverted steps come in. If we set inverted to true on a step, we match negatively against it: an inverted step 2 returns every actor counted in step 1 who isn’t in step 2. So a funnel like this:

[
    {"event_collection": "tutorial_completed",
     "actor_property": "user.id"},
    {"event_collection": "content_created",
     "actor_property": "user.id",
     "inverted": true
    }
]

… will return something like this:

{
    "result": [300, 180],
    ⋮
}

… that’s a lot of users dropping off! We should look into that.
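In set terms, an inverted step is just a difference rather than an intersection. A tiny Python sketch, with made-up user IDs and ignoring timeframes:

```python
# Hypothetical sample data: actor IDs seen in each collection.
tutorial_completed = {"ann", "bob", "cal", "dee", "eve"}
content_created = {"ann", "bob", "zed"}  # zed never completed the tutorial

# A normal step 2 keeps actors who are in both sets.
created = tutorial_completed & content_created

# An inverted step 2 keeps actors from step 1 who are NOT in step 2.
dropped_off = tutorial_completed - content_created

print(len(tutorial_completed), len(dropped_off))
# prints 5 3
print(sorted(dropped_off))
# prints ['cal', 'dee', 'eve']
```

Because the difference starts from the step 1 actors, zed never shows up in the drop-off list.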

The Big Finish

Now see what happens when we combine all of these features into one for the big pièce de résistance:

[
    {"event_collection": "tutorial_completed",
     "actor_property": "user.id"},
    {"event_collection": "content_created",
     "actor_property": "user.id",
     "inverted": true,
     "optional": true,
     "with_actors": true},
    {"event_collection": "content_created",
     "actor_property": "user.id"},
    {"event_collection": "content_shared",
     "actor_property": "user.id"}
]

This gives us all the steps of our original funnel, but with a quick check on all the users who dropped off before the second step, like so:

  1. Completed tutorial
  2. Didn’t create content (optional)
  3. Created content
  4. Shared content

… the end result being:

{
    "result": [300, 180, 120, 80],
    "actors": [
      null,
      ["user_1", "user_2", … ],
      null,
      null
    ],
    ⋮
}
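One nice property of pairing an inverted optional step with its regular twin is that the two counts should add back up to the step before them. A quick Python check using the counts from the example result above (the variable names are mine, not part of the API):

```python
# Counts copied from the example "result" above.
result = [300, 180, 120, 80]
completed, did_not_create, created, shared = result

# The inverted step and its non-inverted twin split the step 1 actors in two.
assert did_not_create + created == completed

drop_off_rate = did_not_create / completed  # share of tutorial finishers who never created
share_rate = shared / created               # share of creators who went on to share

print(f"dropped off after tutorial: {drop_off_rate:.0%}")
# prints dropped off after tutorial: 60%
print(f"creators who shared: {share_rate:.0%}")
# prints creators who shared: 67%
```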

OK, that’s the basics! Hopefully, this helped you see the light at the end of the funnel. (I make no apologies for my funnel puns.) If you have questions — about funnels in general, or how Keen can help you out with them — feel free to hit me up!