"We were disappointed, if not surprised, to see that data wrangling still takes the lion’s share of time in a typical data professional’s day. Data preparation and cleansing takes valuable time away from real data science work and has a negative impact on overall job satisfaction."
2020 State of Data Science: Moving From Hype Toward Maturity, Anaconda
DataPrep is designed for computational notebooks, the most popular environment among data scientists.
DataPrep is built on Pandas and Dask DataFrames and integrates seamlessly with other Python libraries.
DataPrep is free, open-source software released under the MIT license. Anyone can reuse DataPrep code for any purpose.
DataPrep.EDA is the fastest and easiest EDA tool in Python. It lets data scientists understand a Pandas or Dask DataFrame in seconds with a few lines of code.
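As a rough illustration of the "few lines of code" claim, the sketch below builds a small, made-up DataFrame and passes it to `plot` and `plot_missing` from `dataprep.eda`. The column names, the sample values, and the helper names `make_sample_frame` and `explore` are hypothetical and exist only for this example. The returned objects are meant to be displayed inline in a notebook cell.

```python
import pandas as pd


def make_sample_frame() -> pd.DataFrame:
    """Build a tiny, made-up DataFrame to explore (illustrative data only)."""
    return pd.DataFrame(
        {
            "age": [23, 35, 41, None, 29],
            "city": ["Vancouver", "Burnaby", "Vancouver", "Surrey", None],
            "income": [48000.0, 72000.0, 65000.0, 58000.0, 51000.0],
        }
    )


def explore(df: pd.DataFrame):
    """Return DataPrep.EDA overviews for df; they render inline in a notebook."""
    # Imported lazily so the sample data can be built without dataprep installed.
    from dataprep.eda import plot, plot_missing

    overview = plot(df)          # per-column distributions and summary statistics
    missing = plot_missing(df)   # where the missing values are located
    return overview, missing


df = make_sample_frame()
```

In a notebook, calling `explore(df)` and placing either returned object on the last line of a cell should display the interactive output.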
DataPrep.Clean aims to provide a large number of functions with a unified interface for cleaning and standardizing data of various semantic types in a Pandas or Dask DataFrame.
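The unified interface can be sketched as follows: each cleaning function takes a DataFrame and a column name and returns a DataFrame. The example uses `clean_email` and `clean_country` from `dataprep.clean`. The sample records and the helper names `make_contacts` and `standardize` are hypothetical, and the exact names of the output columns are best checked against the installed version's documentation.

```python
import pandas as pd


def make_contacts() -> pd.DataFrame:
    """Build made-up contact records with inconsistently formatted values."""
    return pd.DataFrame(
        {
            "email": ["Alice@Example.com", "bob@example", "carol@example.org", None],
            "country": ["USA", "canada", "CA", "United States"],
        }
    )


def standardize(df: pd.DataFrame) -> pd.DataFrame:
    """Apply two DataPrep.Clean functions that share the same call shape."""
    # Imported lazily so the sample data can be built without dataprep installed.
    from dataprep.clean import clean_country, clean_email

    df = clean_email(df, "email")      # validate and standardize email addresses
    df = clean_country(df, "country")  # map country variants to one format
    return df


contacts = make_contacts()
```

Because both calls follow the same `clean_*(df, column)` pattern, adding another semantic type to the pipeline is a one-line change.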
DataPrep.Connector provides an intuitive, open-source API wrapper that speeds up development by standardizing calls to multiple web APIs into one simple workflow.
DataPrep.Connector also supports loading data from databases through SQL queries. With one line of code, you can speed up pandas.read_sql by 10X with 3X less memory usage!
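A minimal sketch of the one-line database load, assuming a PostgreSQL server is reachable: `read_sql` from `dataprep.connector` takes a connection string and a SQL query. The credentials, host, database name, and the helpers `build_conn_string` and `load_table` are hypothetical placeholders for this example.

```python
from urllib.parse import quote_plus


def build_conn_string(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a PostgreSQL connection URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def load_table(conn: str, query: str):
    """Run the query through DataPrep.Connector and return the result."""
    # Imported lazily so the URL helper works without dataprep installed.
    from dataprep.connector import read_sql

    return read_sql(conn, query)  # the single call that replaces pandas.read_sql


# Placeholder credentials; replace with real values before calling load_table.
conn = build_conn_string("analyst", "p@ss/word", "localhost", 5432, "sales")
```

With a live database, `load_table(conn, "SELECT * FROM orders")` would return the query result; the table name `orders` is likewise an assumption.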
© 2022 SFU Database System Lab. MIT Licensed.