DataFrame is a tabular data structure for data analysis in Pharo.
To install DataFrame, open the Playground (Ctrl+OW) in a fresh Pharo image and execute the following Metacello script (select it and press the Do-it button or Ctrl+D):
    Metacello new
      baseline: 'DataFrame';
      repository: 'github://PolyMathOrg/DataFrame/src';
      load.
Data frames are one of the essential parts of the data science toolkit. They are specialized data structures for tabular data sets that provide a simple and powerful API for summarizing, cleaning, and manipulating a wealth of data sources that are currently cumbersome to use.
A data frame is like a database inside a variable. It is an object that can be created, modified, copied, serialized, debugged, inspected, and garbage collected. It lets you work with your data quickly, using just a few lines of code. The DataFrame project is similar to the pandas library in Python or the built-in data.frame class in R.
In this section I show a very simple example of creating a small data frame. For more advanced examples, please check the DataFrame Booklet.
    weather := DataFrame withRows: #(
      (2.4 true rain)
      (0.5 true rain)
      (-1.2 true snow)
      (-2.3 false -)
      (3.2 true rain)).
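A freshly created data frame has no meaningful column names yet. The following is a minimal sketch of a possible next step, not a definitive reference. It assumes the installed DataFrame version provides `columnNames:`, `numberOfRows`, `numberOfColumns`, and `column:`; the booklet is the place to confirm the exact selectors. The variable `sample` and its column names are made up for illustration, and the frame is kept to two rows so the snippet can run on its own in the Playground.

```smalltalk
"A two-row frame built the same way as the weather example above.
 'sample' is a hypothetical variable name used only in this sketch."
sample := DataFrame withRows: #(
  (2.4 true rain)
  (-1.2 true snow)).

"Assumed API: give the columns names (chosen here for illustration)."
sample columnNames: #(temperature precipitation type).

"Assumed API: query the shape of the frame.
 Two rows were passed to withRows:, each with three values."
sample numberOfRows.
sample numberOfColumns.

"Assumed API: fetch one column by name; the result is expected to be a DataSeries."
sample column: #temperature.
```

Evaluating each expression with Print-it or Inspect-it in the Playground shows its result, which is a quick way to check whether these selectors behave as assumed in your image.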
For more information, please read Data Analysis Made Simple with Pharo DataFrame, a booklet that serves as the main source of documentation for the DataFrame project. It describes the complete API of the DataFrame and DataSeries data structures and provides examples for each method.