The Dataiku visual flow lets coders and non-coders alike easily build
data pipelines from datasets, recipes that join and transform those
datasets, and predictive models.
The visual flow also supports code and reusable plugin elements for
customization and advanced functions.
Connect to Leading Data Sources
Dataiku provides connectors to over 25 leading data sources on-premises
and in the cloud, including Amazon S3, Azure Blob Storage, Google Cloud
Storage, Snowflake, SQL databases, NoSQL databases, HDFS, and more.
Data Preparation, Data Wrangling and Data Cleansing
Dataiku provides an easy-to-use visual interface that dramatically speeds
data preparation. Join and group datasets or aggregate, clean, normalize,
enrich, and deduplicate records, all with a few clicks.
Best of all, Dataiku captures these steps in a reproducible recipe as part of
the visual flow.
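
To make the preparation steps above concrete, here is a minimal plain-Python sketch of deduplicating, normalizing, joining, and aggregating records. It is not Dataiku code and does not use any Dataiku API; the sample rows, field names, and helper functions are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data; the field names are assumptions for illustration.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 20.0},
    {"order_id": 1, "customer_id": "c1", "amount": 20.0},  # duplicate record
    {"order_id": 2, "customer_id": "c2", "amount": 35.5},
    {"order_id": 3, "customer_id": "c1", "amount": 10.0},
]
customers = [
    {"customer_id": "c1", "country": " france "},
    {"customer_id": "c2", "country": "GERMANY"},
]


def deduplicate(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def normalize_country(rows):
    """Clean the country field: trim whitespace and title-case it."""
    return [{**row, "country": row["country"].strip().title()} for row in rows]


def join(left, right, key):
    """Inner join two lists of rows on a shared key."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def total_by(rows, group_key, value_key):
    """Group rows by one field and sum a numeric field per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


clean_orders = deduplicate(orders, "order_id")
enriched = join(clean_orders, normalize_country(customers), "customer_id")
totals_by_country = total_by(enriched, "country", "amount")
```

Each helper corresponds to one step that a visual recipe would record, so the same sequence can be rerun on new data.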
90+ Built-in Data Transformers
Dataiku includes over 90 built-in data transformers for common data
manipulations like binning, concatenation, currency conversions, date
conversions, filtering, splitting, and more.
During data analysis, Dataiku saves time by suggesting functions based
on the data type.
Even when a transformer doesn’t exist in the library, users can quickly write
formulas similar to those used in spreadsheets to accomplish almost any
data transformation task.
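
As a rough illustration of what such transformations do, the following plain-Python sketch shows fixed-width binning, a date conversion, and a spreadsheet-style computed column. The function names, the input date format, and the sample row are assumptions made for this example; this is not Dataiku's transformer library or its formula language.

```python
from datetime import datetime


def bin_value(value, width):
    """Fixed-width binning: return the lower bound of the bin holding value."""
    return (value // width) * width


def to_iso_date(text, input_format="%d/%m/%Y"):
    """Date conversion: parse text in input_format, return YYYY-MM-DD."""
    return datetime.strptime(text, input_format).date().isoformat()


def add_total(row):
    """Formula-style computed column, like `=price * qty` in a spreadsheet."""
    return {**row, "total": round(row["price"] * row["qty"], 2)}


# Hypothetical input row; the column names are illustrative only.
row = {"age": 37, "signup": "05/03/2021", "price": 19.99, "qty": 3}

transformed = add_total(row)
transformed["age_bin"] = bin_value(row["age"], 10)
transformed["signup"] = to_iso_date(row["signup"])
```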
Geospatial Data Preparation
Dataiku provides built-in transformation functions for working with
geospatial data.
These functions can extract latitude and longitude from geo point data
and vice versa; resolve an IP address to location data such as country,
region/state, city, and postal code (GeoIP location); and connect
datasets using geographic coordinates (geo join).
Geographic plugins provide further capabilities, such as geocoding and
more.
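
The geo point conversion and geo join ideas can be sketched in plain Python. This example assumes geo points are written as WKT strings of the form `POINT(longitude latitude)` and treats a geo join as a nearest-neighbor match by great-circle distance, which is only one possible join rule. All function names and sample rows are hypothetical; none of this is a Dataiku API.

```python
import math
import re

# Assumption: geo points are WKT strings such as "POINT(2.3522 48.8566)",
# with longitude first and latitude second.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def point_to_lat_lon(wkt):
    """Extract (latitude, longitude) from a WKT point string."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon


def lat_lon_to_point(lat, lon):
    """Build a WKT point string from latitude and longitude."""
    return f"POINT({lon} {lat})"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_join(left, right):
    """Attach to each left row the name of the closest right row."""
    joined = []
    for row in left:
        lat, lon = point_to_lat_lon(row["geopoint"])
        best = min(
            right,
            key=lambda r: haversine_km(lat, lon, *point_to_lat_lon(r["geopoint"])),
        )
        joined.append({**row, "nearest": best["name"]})
    return joined


# Hypothetical sample rows.
stores = [
    {"name": "Paris", "geopoint": "POINT(2.3522 48.8566)"},
    {"name": "Berlin", "geopoint": "POINT(13.4050 52.5200)"},
]
customers = [
    {"name": "Lyon", "geopoint": "POINT(4.8357 45.7640)"},
    {"name": "Potsdam", "geopoint": "POINT(13.0645 52.3906)"},
]
matches = nearest_join(customers, stores)
```

A distance-threshold join (keep pairs closer than a chosen number of kilometers) would reuse the same `haversine_km` helper with a filter instead of `min`.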