Data Extractions

Data Extractions allow you to get your event data out of Keen IO. We strongly believe that you should always have full access to all of your data, and we aim to make extracting it as simple and painless as possible.

The following types of Data Extractions are currently supported in the Keen Analysis API:

Data Extraction to JSON

Technical Reference: Extraction Resource

A JSON data extraction is done using an HTTP GET request like this:

https://api.keen.io/3.0/projects/<project_id>/queries/extraction?api_key=<read_key>&event_collection=<event_collection>

Extractions take the following parameters:

  • api_key (optional) - The API Key for the project containing the data you are analyzing. The API key can alternatively be provided in the request header. See Authentication for more information.
  • event_collection (required) - The name of the event collection you are analyzing.
  • filters (optional) - Filters are used to narrow down the events used in an analysis request based on event property values.
  • timeframe (optional) - A Timeframe specifies the events to use for analysis based on a window of time.
  • email (optional) - If an email address is specified, an email will be sent to it when your extraction is ready for download. If email is not specified, your extraction will be processed synchronously and your data will be returned as JSON.
  • latest (optional) - Use this parameter to specifically request the most recent events added to a given collection. Extract up to 100 of your most recent events.
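
As a rough illustration, the request URL can be assembled from these parameters in Python. This is a sketch rather than an official client: the helper name build_extraction_url and the placeholder values are hypothetical, and it assumes that filters are sent as a JSON-encoded list and that a relative timeframe string such as this_7_days is accepted. Check the Extraction Resource reference for the exact formats.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.keen.io/3.0/projects/{project_id}/queries/extraction"


def build_extraction_url(project_id, read_key, event_collection,
                         filters=None, timeframe=None, email=None, latest=None):
    """Build an extraction URL; optional parameters are left out when None."""
    params = {"api_key": read_key, "event_collection": event_collection}
    if filters is not None:
        # Assumption: filters travel as a JSON-encoded list in the query string.
        params["filters"] = json.dumps(filters)
    if timeframe is not None:
        params["timeframe"] = timeframe
    if email is not None:
        params["email"] = email
    if latest is not None:
        params["latest"] = latest
    return BASE_URL.format(project_id=project_id) + "?" + urlencode(params)


# Placeholder project ID, key, and collection name, for illustration only.
url = build_extraction_url("PROJECT_ID", "READ_KEY", "logins",
                           timeframe="this_7_days")
print(url)
```

Sending a GET request to the resulting URL with any HTTP client should return the JSON response described in this section.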

The result is an array of events like this:

{
    "result": [
        {
            "keen": {
                "created_at": "2012-07-30T21:21:46.566000+00:00",
                "timestamp": "2012-07-30T21:21:46.566000+00:00"
            },
            "user": {
                "email": "dan@keen.io",
                "id": "4f4db6c7777d66ffff000000"
            },
            "user_agent": {
                "browser": "chrome",
                "browser_version": "20.0.1132.57",
                "platform": "macos"
            }
        },
        {
            "keen": {
                "created_at": "2012-07-30T21:40:05.386000+00:00",
                "timestamp": "2012-07-30T21:40:05.386000+00:00"
            },
            "user": {
                "email": "michelle@keen.io",
                "id": "4fa2cccccf546ffff000006"
            },
            "user_agent": {
                "browser": "chrome",
                "browser_version": "20.0.1132.57",
                "platform": "macos"
            }
        }
    ]
}
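
A response of this shape could be consumed along the following lines. This is a minimal sketch: response_body is a trimmed, hand-copied version of the sample (two events, fewer properties), and the helper event_time is a hypothetical name. It reads the events from the "result" key and parses keen.timestamp into a timezone-aware datetime.

```python
import json
from datetime import datetime

# Trimmed copy of the sample response, standing in for a real HTTP response body.
response_body = """
{"result": [
  {"keen": {"created_at": "2012-07-30T21:40:05.386000+00:00",
            "timestamp": "2012-07-30T21:40:05.386000+00:00"},
   "user": {"email": "michelle@keen.io", "id": "4fa2cccccf546ffff000006"}},
  {"keen": {"created_at": "2012-07-30T21:21:46.566000+00:00",
            "timestamp": "2012-07-30T21:21:46.566000+00:00"},
   "user": {"email": "dan@keen.io", "id": "4f4db6c7777d66ffff000000"}}
]}
"""

# The events live under the top-level "result" key.
events = json.loads(response_body)["result"]


def event_time(event):
    """Parse the event's keen.timestamp into a timezone-aware datetime."""
    return datetime.fromisoformat(event["keen"]["timestamp"])


# Print events oldest first, with the email of the user who triggered each one.
ordered = sorted(events, key=event_time)
for event in ordered:
    print(event_time(event).isoformat(), event["user"]["email"])
```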

Data Extraction to CSV file

Technical Reference: Extraction Resource

You can perform a file extraction at any time from the Keen.io website or via API. Shortly after requesting an extract from the Keen.io website, you will get an email letting you know that the extract is ready for download. The larger your extraction, the longer it will take to get the email. When you click the link, your CSV download will begin immediately (check the bottom of your browser – you should see the download progress there).

Here’s what the API request looks like:

https://api.keen.io/3.0/projects/<project_id>/queries/extraction?api_key=<read_key>&event_collection=<event_collection>&email=<email>

Latest Events Extractions

Add a ‘latest’ parameter to your extraction request to get back only the most recent events, for example the last 5 or the last 10. You can request up to 100 of the most recent events.

https://api.keen.io/3.0/projects/<project_id>/queries/extraction?api_key=<read_key>&event_collection=<event_collection>&latest=<number>
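
Because latest is capped at 100, it can be worth checking the value before building the request. The sketch below assumes a hypothetical helper named latest_events_url and placeholder credentials; the API's own behavior for out-of-range values is not described here, so the check is purely client-side.

```python
from urllib.parse import urlencode

MAX_LATEST = 100  # upper bound stated in this section


def latest_events_url(project_id, read_key, event_collection, latest):
    """Build a latest-events extraction URL, rejecting out-of-range counts."""
    if not 1 <= latest <= MAX_LATEST:
        raise ValueError(f"latest must be between 1 and {MAX_LATEST}, got {latest}")
    query = urlencode({
        "api_key": read_key,
        "event_collection": event_collection,
        "latest": latest,
    })
    return f"https://api.keen.io/3.0/projects/{project_id}/queries/extraction?{query}"


# Placeholder values, for illustration only.
latest_url = latest_events_url("PROJECT_ID", "READ_KEY", "logins", 10)
print(latest_url)
```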

The result is an array of your custom events and properties. Here’s an example using two sample login events:

{
    "result": [
        {
            "keen": {
                "created_at": "2012-07-30T21:21:46.566000+00:00",
                "timestamp": "2012-07-30T21:21:46.566000+00:00"
            },
            "user": {
                "email": "dan@keen.io",
                "id": "4f4db6c7777d66ffff000000"
            },
            "user_agent": {
                "browser": "chrome",
                "browser_version": "20.0.1132.57",
                "platform": "macos"
            }
        },
        {
            "keen": {
                "created_at": "2012-07-30T21:40:05.386000+00:00",
                "timestamp": "2012-07-30T21:40:05.386000+00:00"
            },
            "user": {
                "email": "michelle@keen.io",
                "id": "4fa2cccccf546ffff000006"
            },
            "user_agent": {
                "browser": "chrome",
                "browser_version": "20.0.1132.57",
                "platform": "macos"
            }
        }
    ]
}

Notes on Data Extraction

Technical Reference: Extraction Resource

Here is some additional info related to data extraction:

  • If you don’t specify any filters, your extract will include every event in an Event Collection. All Event Properties are included for each event in the extract, so the files can get quite large. Use timeframes and filters to narrow the set of events that you extract.
  • Every event in your extract will have a keen.timestamp property. That’s the value used for sorting events by Timeframe. The timezone of this timestamp is UTC.
  • There is currently no way to specify the order of the properties (columns) in your extract file. They might not come out in the order you expect, but they will all be there.
  • Extractions are done by Event Collection. If you want to extract 100% of your data from Keen, you’ll need to run the extraction for each Event Collection.
  • You can also programmatically request extractions via the Extraction Resource or using Saved Queries in our API. The Data Extraction APIs can be used to, for example, set up a nightly job that will have the data you need ready and waiting in your inbox in the morning.
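
Since extractions run per Event Collection, a scheduled job would need one request per collection. The following sketch only builds the request URLs; the function name nightly_extraction_urls, the collection names, and the email address are hypothetical, and the scheduling itself (for example, a cron entry) and the HTTP client are left to you.

```python
from urllib.parse import urlencode


def nightly_extraction_urls(project_id, read_key, collections, email):
    """Return one emailed-extraction URL per event collection."""
    base = f"https://api.keen.io/3.0/projects/{project_id}/queries/extraction"
    return {
        name: base + "?" + urlencode({
            "api_key": read_key,
            "event_collection": name,
            # Including email requests an emailed download link per the text above.
            "email": email,
        })
        for name in collections
    }


# Placeholder project, key, collections, and address, for illustration only.
urls = nightly_extraction_urls("PROJECT_ID", "READ_KEY",
                               ["logins", "purchases"], "you@example.com")
for name, url in urls.items():
    print(name, url)
```

Issuing a GET request to each URL from the job would then queue one extraction per collection.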
