This getting-started sample shows how to run batch linear regression with daal4py, the Python API package powered by the oneAPI Data Analytics Library (oneDAL). It also demonstrates how to use software that builds on oneDAL and ships in the Intel® AI Analytics Toolkit (AI Kit).
Property | Description |
---|---|
Category | Get started sample |
What you will learn | Basic daal4py programming model for Intel CPUs |
Time to complete | 5 minutes |
daal4py is a simplified API to Intel® oneDAL that gives data scientists and machine learning users fast access to the framework. It provides an abstraction over Intel® oneDAL that you can use directly or integrate into your own framework.
In this sample, you will run a batch Linear Regression model with oneDAL daal4py library memory objects. You will also learn how to train a model and save the information to a file.
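As a rough illustration of that flow, the following sketch trains a batch linear regression model with daal4py and pickles it to disk. It assumes daal4py and NumPy are installed. The synthetic data, the helper names (train_linear_regression, save_model, load_model), and the models/linear_regression_sketch.pkl path are hypothetical and are not taken from the sample's own script.

```python
import pickle
from pathlib import Path


def train_linear_regression(X, y):
    """Train a batch linear regression model and return the daal4py model."""
    # Imported lazily so the save/load helpers stay usable without daal4py.
    import daal4py as d4p

    algorithm = d4p.linear_regression_training(interceptFlag=True)
    return algorithm.compute(X, y).model


def save_model(model, path):
    """Pickle a model to path, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a previously pickled model."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    try:
        import numpy as np
        import daal4py  # noqa: F401  (only checks availability)
    except ImportError:
        print("daal4py or NumPy not found; activate the AI Kit conda environment first.")
    else:
        # Hypothetical synthetic data: 100 rows, 13 features, one response column.
        rng = np.random.default_rng(0)
        X = rng.random((100, 13))
        y = X @ rng.random((13, 1))

        model_path = "models/linear_regression_sketch.pkl"  # hypothetical file name
        save_model(train_linear_regression(X, y), model_path)
        print(load_model(model_path).Beta)
```

The helpers keep training separate from persistence, so the same save and load functions could be reused for other picklable daal4py results.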
Optimized for | Description |
---|---|
OS |
|
Hardware |
|
Software | Intel® AI Analytics Toolkit |
This get started sample code is implemented for CPUs using the Python language. The example assumes you have daal4py and scikit-learn installed inside a conda environment, similar to what is delivered with the installation of the Intel® Distribution for Python* as part of the Intel® AI Analytics Toolkit.
-
Install Intel® AI Analytics Toolkit.
If you use the Intel® DevCloud, skip this step. The toolkit is already installed for you.
The oneAPI Data Analytics Library is ready for use once you finish the Intel® AI Analytics Toolkit installation and run the post-installation script.
You can refer to the oneAPI main page for toolkit installation and the Toolkit Getting Started Guide for Linux for post-installation steps and scripts.
-
Set up your Intel® AI Analytics Toolkit environment.
Source the setvars script located in the root of your oneAPI installation.
-
Linux Sudo:
. /opt/intel/oneapi/setvars.sh
-
Linux User:
. ~/intel/oneapi/setvars.sh
-
Windows:
"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
For more information on environment variables, see Use the setvars Script for Linux or macOS, or Windows.
-
Activate the conda environment.
-
If you have root access to your oneAPI installation path or if you use the Intel® DevCloud:
The Intel Python environment is active by default. However, if you activated another environment, you can return to the Intel Python environment with the following command:
source activate base
-
If you do not have root access to your oneAPI installation path:
By default, the Intel® AI Analytics Toolkit is installed in the /opt/intel/oneapi folder, which requires root privileges to manage. To manage your conda environment without root access, clone the desired conda environment with the following command:
conda create --name usr_intelpython --clone base
Then activate your conda environment with the following command:
source activate usr_intelpython
-
Install Jupyter Notebook.
If you use the Intel DevCloud, skip this step.
conda install jupyter nb_conda_kernels
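After the setup steps above, a quick check like the following can confirm that the expected conda environment is active and that daal4py and scikit-learn are importable. This is a minimal sketch, not part of the sample: the environment_report helper is hypothetical, CONDA_DEFAULT_ENV is the variable conda sets on activation, and reading ONEAPI_ROOT assumes the setvars script exports it on your installation.

```python
import importlib.util
import os
import sys


def environment_report(env=os.environ):
    """Collect a few facts about the active Python and conda environment."""
    return {
        "python": sys.executable,
        # Set by conda when an environment is activated.
        "conda_env": env.get("CONDA_DEFAULT_ENV"),
        # Assumption: setvars exports ONEAPI_ROOT; None means it was not found.
        "oneapi_root": env.get("ONEAPI_ROOT"),
        "daal4py_installed": importlib.util.find_spec("daal4py") is not None,
        "sklearn_installed": importlib.util.find_spec("sklearn") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If daal4py_installed is False, the active environment is probably not the Intel Python environment (base or the cloned usr_intelpython).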
You can run the sample code in a Jupyter notebook or as a Python script locally or in the Intel DevCloud.
To open the Jupyter notebook on your local server:
-
Activate the conda environment.
source activate base # or source activate usr_intelpython
-
Start the Jupyter notebook server.
jupyter notebook
-
Open the IntelPython_daal4py_GettingStarted.ipynb file in the Notebook Dashboard.
-
Run the cells in the Jupyter notebook sequentially by clicking the Run button.
-
Activate the conda environment.
source activate base # or source activate usr_intelpython
-
Run the Python script.
python IntelPython_daal4py_GettingStarted.py
The script saves the output files in the included models and results directories.
Here's our model:
NumberOfBetas: 14
NumberOfResponses: 1
InterceptFlag: False
Beta: array(
[[ 0.00000000e+00 -1.05416344e-01 5.25259886e-02 4.26844883e-03
2.76607367e+00 -2.82517989e+00 5.49968304e+00 3.48833264e-03
-8.73247684e-01 1.74005447e-01 -8.38917510e-03 -3.28044397e-01
1.58423529e-02 -4.57542900e-01]],
dtype=float64, shape=(1, 14))
NumberOfFeatures: 13
Here is one of our loaded model's features:
[[ 0.00000000e+00 -1.05416344e-01 5.25259886e-02 4.26844883e-03
2.76607367e+00 -2.82517989e+00 5.49968304e+00 3.48833264e-03
-8.73247684e-01 1.74005447e-01 -8.38917510e-03 -3.28044397e-01
1.58423529e-02 -4.57542900e-01]]
[CODE_SAMPLE_COMPLETED_SUCCESFULLY]
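To make the printed Beta array easier to read, the sketch below applies it by hand to a single input row. It assumes the usual oneDAL layout, in which the first entry of Beta is the intercept (zero here because InterceptFlag is False) and the remaining 13 entries are the per-feature coefficients. The predict helper and the all-ones input row are hypothetical and serve only as an illustration.

```python
# Beta values copied from the output above: 14 entries for 13 features.
beta = [
    0.00000000e+00, -1.05416344e-01, 5.25259886e-02, 4.26844883e-03,
    2.76607367e+00, -2.82517989e+00, 5.49968304e+00, 3.48833264e-03,
    -8.73247684e-01, 1.74005447e-01, -8.38917510e-03, -3.28044397e-01,
    1.58423529e-02, -4.57542900e-01,
]


def predict(beta_row, features):
    """Return intercept + sum(coefficient * feature) for one input row."""
    intercept, coefficients = beta_row[0], beta_row[1:]
    if len(features) != len(coefficients):
        raise ValueError(
            f"expected {len(coefficients)} features, got {len(features)}"
        )
    return intercept + sum(c * x for c, x in zip(coefficients, features))


if __name__ == "__main__":
    row = [1.0] * 13  # hypothetical input row, not real data
    print(predict(beta, row))
```

This is what the linear regression prediction step computes per row, which explains why NumberOfBetas (14) is one more than NumberOfFeatures (13).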
-
Open the following link in your browser: https://jupyter.oneapi.devcloud.intel.com/
-
In the Notebook Dashboard, navigate to the IntelPython_daal4py_GettingStarted.ipynb file and open it.
-
Run the sample code and read the explanations in the notebook.
This sample includes the run.sh script for batch processing.
Submit a job that requests a compute node to run the sample code:
qsub -l nodes=1:xeon:ppn=2 -d . run.sh
Additional information about requesting a compute node in the Intel DevCloud:
To run a script in the DevCloud, you need to request a compute node using node properties such as gpu, xeon, fpga_compile, fpga_runtime, and others. For more information about the node properties, execute the pbsnodes command.
This node information must be provided when you submit a job to run your sample in batch mode with the qsub command. The commands below are written for the hello-world.sh script; to run this sample, substitute run.sh and choose the node property that fits the node you are using. Nodes in bold are compatible with this sample:
Node | Command |
---|---|
GPU | qsub -l nodes=1:gpu:ppn=2 -d . hello-world.sh |
**CPU** | qsub -l nodes=1:xeon:ppn=2 -d . hello-world.sh |
FPGA Compile Time | qsub -l nodes=1:fpga_compile:ppn=2 -d . hello-world.sh |
FPGA Runtime | qsub -l nodes=1:fpga_runtime:ppn=2 -d . hello-world.sh |
The script saves the output files in the included models and results directories.
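The qsub commands in the node table all follow one pattern, which the following sketch assembles for a given node property. The build_qsub_command helper is hypothetical. It only builds and prints the command and does not submit a job.

```python
import shlex


def build_qsub_command(node_property, script="run.sh", ppn=2):
    """Build the qsub argument list for one node with the given property."""
    return [
        "qsub",
        "-l", f"nodes=1:{node_property}:ppn={ppn}",
        "-d", ".",
        script,
    ]


if __name__ == "__main__":
    # xeon is the CPU node property used for this sample.
    print(shlex.join(build_qsub_command("xeon")))
```

Building the command as a list keeps each argument separate, so it could be passed to subprocess.run unchanged on a DevCloud login node.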
You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, and browse and download samples.
The basic steps to build and run a sample using VS Code include:
-
Download a sample using the extension Code Sample Browser for Intel® oneAPI Toolkits.
-
Configure the oneAPI environment with the extension Environment Configurator for Intel® oneAPI Toolkits.
-
Open a Terminal in VS Code by clicking Terminal > New Terminal.
-
Run the sample in the VS Code terminal using the instructions in this document.
On Linux, you can debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.
To learn more about the extensions, see Using Visual Studio Code with Intel® oneAPI Toolkits.
After learning how to use the extensions for Intel oneAPI Toolkits, return to this document for instructions on how to build and run a sample.
Several sample programs are available for you to try, many of which can be compiled and run in a similar fashion. Experiment with running the various samples on different kinds of compute nodes or adjust their source code to experiment with different workloads.
If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits.
Code samples are licensed under the MIT license. See License.txt for details.
Third-party program licenses can be found here: third-party-programs.txt