cuML Processor
cuML is a suite of GPU-accelerated machine learning algorithms built on NVIDIA’s RAPIDS AI framework.
It provides familiar scikit-learn-style APIs while leveraging the power of CUDA to achieve significant speedups for large datasets.
cuML includes a wide range of algorithms, such as linear regression, k-means clustering, PCA, and t-SNE, all optimized for execution on NVIDIA GPUs. It is designed for data scientists and developers who need to scale machine learning workflows efficiently on GPUs.
The cuML processor allows you to perform the following operations:
- Write custom code for defining transformations on Spark DataFrames. 
- Write custom code for processing input records at runtime. 
To get started, use the Code Snippets available in the processor’s configuration.

Click a topic to view its sample code and data set.

Copy the code, close the search topic to return to the processor’s configuration, and provide the code snippet as inline Python code.
Processor Configuration
Read about the cuML processor configuration fields in this section.
Overwrite Python Executable
Select to override the default Python executable path.
Python Path
Enter the Python executable path.
Utilize Python Virtual Environment
Enable this option to use a Python virtual environment for the cuML processor.
Environment Details
Select the environment name and version.
Example: Production - v1.
Environment details are visible in the drop-down list once they are created and saved.
Environment Type
Choose whether to use a Python or Micromamba environment for the virtual environment setup.
Python Environment Type
Python Packages
Enter the Python package names separated by newlines, or upload a package list. Options to upload a package list and to download a sample package list for reference are available.

Micromamba Environment Type
Python Version for Micromamba Environment
Specify the version of Python to be used in the Micromamba environment. Example: 3.10
Include Micromamba libraries
Enable this option to include Micromamba-specific libraries when setting up the environment.
Package Management
Choose whether to use PIP or Micromamba for managing and installing packages in the environment.
Python Packages
Enter Python package names separated by newlines or upload a package list.
Download a sample package list for reference.
Code Input
Enter inline code or upload a Python script that contains the processing logic.
- Inline: This option enables you to write Python code in the text editor. When it is selected, an additional field, Python Code, appears.
- Upload: This option enables you to upload one or more Python scripts (.py files) and Python packages (.egg/.zip files). You must specify the module name (which must be part of the uploaded files or package) and the name of the function that the cuML processor will call.
  - When you select Upload, the UPLOAD FILE option appears on the screen. Browse and select the files to be used in the cuML processor.
  - An additional field, Import Module, also appears on the screen when the Upload option is selected.
Python Code
For the Inline input type, write custom Python code directly in the text editor. Use CODE SNIPPETS for quick reference.
Import Module
Specify the name of the module that contains the function to be called by the cuML processor. The drop-down list shows the uploaded files.
The drop-down list shows only .py files. You can also type a module name if it does not appear in the drop-down list.
Function Name
Enter the name of the function defined in the Python code that will be called during execution.
Add Configuration: Enables you to add additional properties.
Use these properties to pass configuration parameters to the cuML processor.
You can provide configuration parameters to the cuML processor as key-value pairs. These parameters are made available as a dictionary, passed as the second argument to the function specified in the Function Name field. That function therefore takes two arguments: (df, config_map)
The first argument is the DataFrame, and the second argument is a dictionary that contains the configuration parameters as key-value pairs.
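The two-argument contract described above can be sketched as follows. This is a minimal illustration rather than the processor's own code: the function names `process` and `read_settings`, the keys `filter_column` and `threshold`, and their defaults are hypothetical. It also assumes that `df` is a Spark DataFrame, that configuration values may arrive as strings, and that the processor uses the returned DataFrame as its output.

```python
def read_settings(config_map):
    """Pull typed settings out of the configuration dictionary.

    The keys and defaults used here are hypothetical examples.
    """
    column = config_map.get("filter_column", "score")
    # float() accepts both numbers and numeric strings such as "0.5".
    threshold = float(config_map.get("threshold", 0.0))
    return column, threshold


def process(df, config_map):
    """Function to name in the Function Name field (the name is illustrative).

    df         -- the input DataFrame (assumed to be a Spark DataFrame)
    config_map -- dict of the key-value pairs added under Add Configuration
    """
    column, threshold = read_settings(config_map)
    # Keep only the rows whose value in `column` exceeds the threshold.
    return df.filter(df[column] > threshold)
```

With this sketch, adding the pairs `filter_column = amount` and `threshold = 100` under Add Configuration would keep only the rows whose `amount` is greater than 100.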
Ask AI Assistant
Use the AI assistant feature to simplify the creation of Python code.
It allows you to generate complex Python code effortlessly, using natural language inputs as your guide.
Describe your desired expression in plain, conversational language. The AI assistant will interpret your instructions and transform them into functional Python code.
Tailor code to your specific requirements, whether it’s for data transformation, filtering, calculations, or any other processing task.
Note: Press Ctrl + Space to list input columns and Ctrl + Enter to submit your request.
Input Example:
Create a column called inactive_days by calculating difference between last_login_date and current date and give those records whose inactive_days is more than 60 days.
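For reference, a request like the one above could be satisfied by PySpark code along the following lines. This is a hand-written sketch, not actual assistant output, and the generated code may differ. It assumes `df` is a Spark DataFrame with a date-typed `last_login_date` column; the function name `find_inactive_users` is hypothetical.

```python
def find_inactive_users(df, config_map):
    """Add inactive_days and keep records inactive for more than 60 days."""
    # Imported here so this module loads even where PySpark is not installed.
    from pyspark.sql import functions as F

    # datediff(end, start) returns the number of days from start to end.
    with_days = df.withColumn(
        "inactive_days",
        F.datediff(F.current_date(), F.col("last_login_date")),
    )
    # config_map is unused here; it is kept to match the (df, config_map) signature.
    return with_days.filter(F.col("inactive_days") > 60)
```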
Notes
Optionally, enter notes in the Notes tab and save the configuration.