Intel® Modin* Get Started Sample

This get started sample shows how to run distributed Pandas workloads with the Intel® Distribution of Modin* package. It also demonstrates software products that are part of the Intel® AI Analytics Toolkit (AI Kit).

| Property | Description |
|:---|:---|
| Category | Get started sample |
| What you will learn | Basic Intel® Distribution of Modin* programming model for Intel processors |
| Time to complete | 5-8 minutes |

Purpose

Intel Distribution of Modin* uses Ray or Dask to provide an effortless way to speed up your Pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Intel Distribution of Modin* provides seamless integration and compatibility with existing Pandas code.

In this sample, you will run Intel Distribution of Modin*-accelerated Pandas functions and note the performance gain when compared to "stock" (aka standard) Pandas functions.
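The comparison described above can be sketched in a few lines. This is a minimal illustration, not the notebook's code: the helper names (`load_backends`, `time_call`, `column_means`) and the frame size are assumptions chosen for the example. It relies only on the fact that Modin exposes the Pandas API under `modin.pandas`, so the same function can be timed against either library. Any backend that is not installed is skipped.

```python
import importlib
import time


def load_backends():
    """Return {label: module} for each DataFrame library that imports cleanly."""
    backends = {}
    for label, module_name in (("pandas", "pandas"), ("modin", "modin.pandas")):
        try:
            backends[label] = importlib.import_module(module_name)
        except ImportError:
            # Library not installed in this environment; skip it.
            pass
    return backends


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def column_means(pd_module, n_rows=100_000, n_cols=8):
    """Build a numeric frame with the given pandas-compatible module and average each column."""
    data = {f"col{i}": range(i, i + n_rows) for i in range(n_cols)}
    frame = pd_module.DataFrame(data)
    return frame.mean()


if __name__ == "__main__":
    for label, module in load_backends().items():
        _, seconds = time_call(column_means, module)
        print(f"{label}: column means took {seconds:.3f} s")
```

On a frame this small the timings mostly show the mechanics of the comparison; the notebook itself is the reference for the workloads and the measured gains.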

| Optimized for | Description |
|:---|:---|
| OS | 64-bit Linux: Ubuntu 18.04 or higher |
| Hardware | Intel® Atom® processors; Intel® Core™ processor family; Intel® Xeon® processor family; Intel® Xeon® Scalable Performance processor family |
| Software | Intel® Distribution of Modin*, Intel® AI Analytics Toolkit |

Key Implementation Details

This get started sample code is implemented for CPU using the Python language. The example assumes you have Pandas and Modin installed inside a conda environment.

Environment Setup

Install Intel Distribution of Modin in a new conda environment.

Note: Replace python=3.x with your own Python version.
```
conda create -n aikit-modin python=3.x -y
conda activate aikit-modin
conda install modin-all -c intel -y
```
Install matplotlib.
```
conda install -c intel matplotlib -y
```
Install Jupyter Notebook.

Skip this step if you are working on the Intel DevCloud.
```
conda install jupyter nb_conda_kernels -y
```
Create a new kernel for Jupyter Notebook based on your activated conda environment.
```
conda install ipykernel
python -m ipykernel install --user --name usr_modin
```
This step is optional if you plan to open the notebook on your local server.
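Before opening the notebook, it can help to confirm that the packages installed above are visible to the active interpreter. The following is a small sketch using only the standard library; the function name `missing_packages` and the list in `REQUIRED` are assumptions for this example, derived from the install commands in this section.

```python
from importlib.util import find_spec

# Top-level import names for the packages installed in the steps above.
REQUIRED = ("pandas", "modin", "matplotlib", "ipykernel")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Run it with the aikit-modin environment activated; a non-empty list usually means the script is running under a different Python than the one conda installed into.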

Run the Sample

You can run the Jupyter notebook with the sample code on your local server, or download the sample code from the notebook as a Python file and run it locally or on the Intel DevCloud. Visit the Intel® Distribution of Modin Getting Started Guide for more information.

Run the Sample in Jupyter Notebook

To open the Jupyter notebook on your local server:

Activate the conda environment.
```
conda activate aikit-modin
```
Start the Jupyter notebook server.
```
jupyter notebook
```
Open the IntelModin_GettingStarted.ipynb file in the Notebook Dashboard.
Run the cells in the Jupyter notebook sequentially by clicking the Run button.

Run the Sample in the Intel® DevCloud for oneAPI JupyterLab

If you do not already have an account, request an Intel® DevCloud account at Create an Intel® DevCloud Account.
Open https://devcloud.intel.com/oneapi/get_started/ in your browser and locate the Connect with Jupyter Lab* section (near the bottom).
Click the Sign in to Connect button. (If you are already signed in, the button should say Launch JupyterLab*.)
If the samples are not already present in your Intel® DevCloud account, download them.
- From JupyterLab, select File > New > Terminal.
- In the terminal, clone the samples from GitHub:
```
git clone https://github.com/oneapi-src/oneAPI-samples.git
```
Set up the environment in the terminal:
- Source the oneAPI conda environment:
```
source /opt/intel/oneapi/setvars.sh --force
```
- Refer to Environment Setup to set up the environment.
In JupyterLab, navigate to the IntelModin_GettingStarted.ipynb file and open it.
To change the kernel, click Kernel > Change kernel > usr_modin.
Run the sample code and read the explanations in the notebook.

Run the Python Script Locally

Convert IntelModin_GettingStarted.ipynb to a Python file in one of the following ways:
- Open the notebook in Jupyter and download it as a Python file (see Jupyter_Save_Py.jpg in this directory, an image from the daal4py Hello World sample).
- Run the following command to convert the notebook file to a Python script:
```
jupyter nbconvert --to python IntelModin_GettingStarted.ipynb
```
Run the Python script.
```
ipython IntelModin_GettingStarted.py
```
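To make the conversion step less opaque, here is a rough standard-library sketch of what extracting code from a notebook involves. It is not a replacement for `jupyter nbconvert`: the function names are hypothetical, markdown cells are simply dropped, and cell contents are copied verbatim, so any IPython-specific syntax in a cell is not translated.

```python
import json


def notebook_to_source(notebook):
    """Join the code cells of a parsed .ipynb document into one script string."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip() + "\n")
    # Separate cells with a blank line.
    return "\n".join(chunks)


def convert_file(ipynb_path, py_path):
    """Read a notebook file and write its code cells to a Python file."""
    with open(ipynb_path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(py_path, "w", encoding="utf-8") as handle:
        handle.write(notebook_to_source(notebook))
```

For the actual sample, prefer the nbconvert command shown above; the sketch only illustrates that a notebook is a JSON document whose code cells can be concatenated into a script.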

Run the Sample on the Intel® DevCloud in Batch Mode

This sample runs in batch mode, so you must have a script for batch processing.

Convert IntelModin_GettingStarted.ipynb to a Python file.

jupyter nbconvert --to python IntelModin_GettingStarted.ipynb

Create a shell script, run-modin-sample.sh, that activates the conda environment and runs the sample.
```
source activate aikit-modin
ipython IntelModin_GettingStarted.py
```

Submit a job that requests a compute node to run the sample code.

qsub -l nodes=1:xeon:ppn=2 -d . run-modin-sample.sh -o output.txt

The -o output.txt option redirects the output of the script to the output.txt file.

Additional information about requesting a compute node in the Intel DevCloud:

To run a script on the DevCloud, request a compute node using node properties such as gpu, xeon, fpga_compile, and fpga_runtime. For more information about the node properties, run the pbsnodes command.

Provide this node information when you submit a batch job with the qsub command, and change the command to fit the node you are using. The table shows the command for each node type, with hello-world.sh standing in for your batch script. This sample is implemented for CPU, so use the CPU (xeon) command.

| Node | Command |
|:---|:---|
| GPU | qsub -l nodes=1:gpu:ppn=2 -d . hello-world.sh |
| CPU | qsub -l nodes=1:xeon:ppn=2 -d . hello-world.sh |
| FPGA Compile Time | qsub -l nodes=1:fpga_compile:ppn=2 -d . hello-world.sh |
| FPGA Runtime | qsub -l nodes=1:fpga_runtime:ppn=2 -d . hello-world.sh |
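The qsub commands above all follow one pattern, with only the node property and script name changing. A small helper can make that pattern explicit. This is an illustrative sketch: `qsub_command` and its parameters are hypothetical names, and the function only builds the command string in the same argument order this README uses; it does not submit anything.

```python
import shlex


def qsub_command(script, node_property="xeon", nodes=1, ppn=2, output=None):
    """Build a qsub command line in the form used in this README."""
    args = ["qsub", "-l", f"nodes={nodes}:{node_property}:ppn={ppn}", "-d", ".", script]
    if output is not None:
        # Same placement as in the submit command shown earlier in this section.
        args += ["-o", output]
    return shlex.join(args)


if __name__ == "__main__":
    print(qsub_command("run-modin-sample.sh", output="output.txt"))
```

Passing a different `node_property` (for example one reported by pbsnodes) yields the corresponding command for that node type.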

Run the Sample in Visual Studio Code*

You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, and browse and download samples.

The basic steps to build and run a sample using VS Code include:

Download a sample using the extension Code Sample Browser for Intel® oneAPI Toolkits.
Configure the oneAPI environment with the extension Environment Configurator for Intel® oneAPI Toolkits.
Open a terminal in VS Code by clicking Terminal > New Terminal.
Run the sample in the VS Code terminal using the instructions in this README.

On Linux, you can debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.

To learn more about the extensions, see Using Visual Studio Code with Intel® oneAPI Toolkits.

After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.

Expected Printed Output:

Expected cell output is shown in IntelModin_GettingStarted.ipynb.

Related Samples

Several sample programs are available for you to try, many of which can be compiled and run in a similar fashion. Experiment with running the various samples on different kinds of compute nodes or adjust their source code to experiment with different workloads.

License

Code samples are licensed under the MIT license. See License.txt for details.

Third-party program licenses can be found in third-party-programs.txt.
