Data Preparation with Dataiku

Visual Flow

The Dataiku visual flow lets coders and non-coders alike build data
pipelines from datasets, recipes that join and transform those datasets,
and predictive models.

The visual flow also supports code and reusable plugin elements for
customization and advanced functions.

Connect to Leading Data Sources

Dataiku provides connectors to over 25 leading data sources, on-premises
and in the cloud, including Amazon S3, Azure Blob Storage, Google Cloud
Storage, Snowflake, SQL databases, NoSQL databases, HDFS, and more.

Data Preparation, Data Wrangling and Data Cleansing

Dataiku provides an easy-to-use visual interface that speeds up data
preparation. Join, group, and aggregate datasets, or clean, normalize,
enrich, and deduplicate records, all with a few clicks.

Best of all, Dataiku captures these steps in a reproducible recipe as part
of the visual flow.
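
For readers who think in code, the kind of steps such a recipe records (deduplicate, clean, join, aggregate) can be sketched in plain Python. This is only an illustration of the operations themselves, not Dataiku's API or how its recipes are implemented; the sample data and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data; names and values are illustrative only.
customers = [
    {"customer_id": 1, "name": " Alice "},
    {"customer_id": 2, "name": "BOB"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 5.0},
    {"order_id": 11, "customer_id": 1, "amount": 5.0},  # duplicate record
    {"order_id": 12, "customer_id": 2, "amount": 7.5},
]


def deduplicate(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def clean_name(name):
    """Normalize a name: trim whitespace and use title case."""
    return name.strip().title()


def total_by_customer(customers, orders):
    """Join orders to customers on customer_id, then sum amounts per customer."""
    names = {c["customer_id"]: clean_name(c["name"]) for c in customers}
    totals = defaultdict(float)
    for order in deduplicate(orders, "order_id"):
        totals[names[order["customer_id"]]] += order["amount"]
    return dict(totals)


result = total_by_customer(customers, orders)
print(result)
```

Because each step is a small, pure function applied in a fixed order, rerunning the script on the same input gives the same output, which is the same property that makes a recorded recipe reproducible.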

90+ Built-in Data Transformers

Dataiku includes over 90 built-in data transformers for common data
manipulations such as binning, concatenation, currency conversion, date
conversion, filtering, splitting, and more.
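
To make two of these manipulations concrete, here is a minimal Python sketch of fixed-width binning and date conversion. It illustrates the general techniques only and makes no claim about how Dataiku's transformers work internally; the function names, the bin width, and the input date format are assumptions chosen for the example.

```python
from datetime import datetime


def bin_value(value, width):
    """Return the lower edge of the fixed-width bin that contains `value`."""
    return (value // width) * width


def to_iso_date(text, fmt="%d/%m/%Y"):
    """Parse a date string in the given format and return it as YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()


# Hypothetical inputs: ages binned into decades, a day-first date normalized.
ages = [3, 17, 42, 68]
binned = [bin_value(age, 10) for age in ages]
iso = to_iso_date("25/12/2023")

print(binned)
print(iso)
```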

During data analysis, Dataiku saves time by suggesting functions based on
the data type.

Even when the library has no transformer for a task, users can quickly
write formulas similar to those used in spreadsheets to accomplish almost
any data transformation.

Geospatial Data Preparation

Dataiku provides built-in transformation functions for working with
geospatial data.

These functions include extracting latitude and longitude from geo point
data and vice versa; GeoIP lookup, which resolves location data such as
country, region/state, city, and postal code from an IP address; and geo
join, which connects datasets using geographic coordinates.
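
The ideas behind geo point extraction and geo join can be sketched in plain Python. The sketch assumes geo points are written as WKT text, `POINT(longitude latitude)`, and treats a geo join as pairing rows that lie within a chosen great-circle distance of each other. Both are assumptions made for illustration, and none of the function names below belong to Dataiku.

```python
import math
import re

# Matches WKT points such as "POINT(2.3522 48.8566)" (longitude first).
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_point(wkt):
    """Extract (latitude, longitude) from a WKT point string."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon


def make_point(lat, lon):
    """Build a WKT point string from latitude and longitude."""
    return f"POINT({lon} {lat})"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres, using a mean Earth radius."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def geo_join(left, right, max_km, key="geo"):
    """Pair each left row with every right row within `max_km` of it."""
    pairs = []
    for l_row in left:
        l_lat, l_lon = parse_point(l_row[key])
        for r_row in right:
            r_lat, r_lon = parse_point(r_row[key])
            if haversine_km(l_lat, l_lon, r_lat, r_lon) <= max_km:
                pairs.append((l_row, r_row))
    return pairs


# Hypothetical datasets: one customer location and two store locations.
customers = [{"customer": "c1", "geo": "POINT(2.2945 48.8584)"}]
stores = [
    {"store": "Paris", "geo": make_point(48.8566, 2.3522)},
    {"store": "London", "geo": make_point(51.5074, -0.1278)},
]
matches = geo_join(customers, stores, max_km=10)
print([(c["customer"], s["store"]) for c, s in matches])
```

The nested loop compares every pair of rows, which is fine for a small illustration; a real geo join over large datasets would normally use a spatial index instead.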

Geographic plugins provide further capabilities for geocoding and more.