Computing environment for Artificial Intelligence (AI) model development

Developing AI involves collecting data, training models, and deploying them. The tools for AI are developed primarily on Ubuntu Linux, but most of the software for AI development can be run from packaged containers based on Ubuntu Linux. Only a few pieces of software are needed on the host to run the containers.

The basic operating system (OS) configurations are:

Source code and container management:

Software needed on top of the base OS:

  • openssh for securely pulling and uploading source code.
  • git for pulling and tracking source code.
  • docker-ce for building and running containers.
  • docker-compose for defining the build and run commands for containers.

Check which packages are installed on Debian-based Ubuntu Linux:

apt list --installed | grep openssh
apt list --installed | grep git
apt list --installed | grep docker-ce
apt list --installed | grep docker-compose
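
The grep checks above match any package whose name contains the search string, so "git" can return many unrelated lines. As a complementary check, the sketch below tests whether the executables themselves are on the PATH. The function name missing_commands is made up for this example, and the executable names (ssh, git, docker, docker-compose) are assumptions about what the four packages install.

```shell
#!/bin/sh
# Print every command name from the argument list that is NOT found on PATH.
# (missing_commands is a hypothetical helper, not part of any package.)
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Assumed executable names for the packages listed above.
missing="$(missing_commands ssh git docker docker-compose)"

if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All required commands found."
fi
```

A command that is reported missing may still be installed under a different name or outside the PATH, so treat the result as a hint rather than proof.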

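Install any missing packages. On Ubuntu the OpenSSH client is packaged as openssh-client, and docker-ce is distributed from Docker's own apt repository, which must be added before the install command can find it.

sudo apt-get update
sudo apt-get install openssh-client git docker-ce docker-compose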

Training models is computationally expensive, and highly parallel graphics processing units (GPUs) can reduce training times to something more reasonable. The host computer needs a few packages so that Docker containers can access its GPUs. Two of the main packages for training AI models, TensorFlow and PyTorch, are currently configured to use only Nvidia GPUs through the CUDA API, though support for AMD Radeon GPUs is in progress in PyTorch. GPU access from Docker containers has also been available only on Linux, though Windows support may be in progress.
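
Once the host is set up, GPU access from a container can be tried by running nvidia-smi inside a CUDA-enabled image. The sketch below only prints the command so it can be inspected first. It assumes Docker 19.03 or later (which provides the --gpus flag) and the Nvidia container runtime on the host. The function name gpu_check_command and the CUDA_IMAGE variable are made up for this example, and the image name is a placeholder to replace with an image you actually use.

```shell
#!/bin/sh
# Print the docker command that runs nvidia-smi inside the given image.
# $1: name of a CUDA-enabled image (placeholder, supply your own).
# (gpu_check_command is a hypothetical helper for illustration.)
gpu_check_command() {
    printf 'docker run --rm --gpus all %s nvidia-smi\n' "$1"
}

# CUDA_IMAGE is a hypothetical environment variable; when it is unset,
# a placeholder is printed in its place.
gpu_check_command "${CUDA_IMAGE:-<your-cuda-image>}"
```

If the printed command is run and the container shows the same GPU table as the host, the container can see the GPU.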

The following are links to setting up GPU drivers and drivers for docker images.

Check Nvidia driver installation.

apt list --installed | grep nvidia

Check that the GPU is available to the Nvidia drivers.

nvidia-smi
Sun Apr 25 13:48:31 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 28%   32C    P8     7W / 151W |     19MiB /  8097MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
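
For scripts, the full table is awkward to parse. The sketch below uses the query options of nvidia-smi to get one comma-separated line per GPU and reformats it. The function name summarize_gpu_line is made up for this example, and the choice of fields (name, driver version, total memory) is only one possibility.

```shell
#!/bin/sh
# Reformat one CSV line of the form "name, driver_version, memory.total".
# (summarize_gpu_line is a hypothetical helper for illustration.)
summarize_gpu_line() {
    printf '%s\n' "$1" |
        awk -F', ' '{ printf "GPU: %s | driver: %s | memory: %s\n", $1, $2, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU, without the CSV header row.
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader |
    while IFS= read -r line; do
        summarize_gpu_line "$line"
    done
else
    echo "nvidia-smi not found; is the Nvidia driver installed?"
fi
```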

Published by LearnIoTAI

A partnership of technology professionals sharing their knowledge of Artificial Intelligence (AI) and Internet of Things (IoT) devices, helping people get started in the convergence of these two growing and exciting fields.
