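### Container Description
The TensorFlow in this container is built from source with NVIDIA TensorRT integration, which makes it simpler to accelerate model inference.
For model training, the environment also integrates Uber Horovod and OpenMPI, which makes single-node or multi-node multi-GPU training easier to set up.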
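As a rough illustration of what multi-GPU training with Horovod can look like in this environment, here is a minimal sketch that uses Horovod's Keras API. The model, the synthetic data, the base learning rate, and the `scaled_learning_rate` helper are all assumptions made for the example and are not part of the container. The imports are guarded so the file can still be loaded on a machine without TensorFlow or Horovod.

```python
try:
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd
except ImportError:  # allows importing this file outside the container
    tf = hvd = None


def scaled_learning_rate(base_lr, world_size):
    """Hypothetical helper: scale the learning rate linearly with the worker count."""
    return base_lr * world_size


def train():
    hvd.init()

    # Pin each process to a single GPU, selected by its rank on the local node.
    gpus = tf.config.experimental.list_physical_devices("GPU")
    if gpus:
        tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

    # Synthetic data standing in for a real dataset (assumption).
    x = tf.random.normal((256, 32))
    y = tf.random.uniform((256,), maxval=10, dtype=tf.int32)

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(10),
    ])

    # Wrap the optimizer so gradients are averaged across all workers.
    optimizer = tf.keras.optimizers.SGD(scaled_learning_rate(0.01, hvd.size()))
    optimizer = hvd.DistributedOptimizer(optimizer)

    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        # Needed with TensorFlow 2.2 so Keras uses the wrapped optimizer's gradients.
        experimental_run_tf_function=False,
    )

    # Start every worker from the same weights, taken from rank 0.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
    model.fit(x, y, batch_size=32, epochs=1, callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)


if __name__ == "__main__" and hvd is not None:
    train()
```

If the script were saved as `train.py` (a hypothetical file name), it could be launched on four GPUs of one node with `horovodrun -np 4 python train.py`.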
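Because GPUs compute so quickly, the speed at which the CPU prepares data can become the bottleneck. To speed up numerical work on the CPU and reduce data-processing time, the container also installs a high-performance numerical library, Intel Math Kernel Library (Intel MKL). With it integrated, Scikit-learn (machine learning) and NumPy/SciPy (numerical and scientific computing) can run their computations at close to the best performance the CPU offers.
This container is designed for deep learning developers working in computer vision, so it also includes OpenCV (computer vision), imgaug (image augmentation), pydicom (reading medical images), and GDCM (Grassroots DICOM, a DICOM image library).
The container also integrates ```cdr/code-server```, a browser-based version of Visual Studio Code. After logging in, users can develop in this cloud-hosted IDE.
This container was built in June 2020.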
### Requirements
Before using this container, install NVIDIA driver version 418.39 or later on the host machine.
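One way to check this requirement before pulling the image is sketched below. It compares the driver version reported by `nvidia-smi` against 418.39. The function names are hypothetical, and the script assumes `nvidia-smi` is on the host's `PATH`; if it is not, the query simply returns `None`.

```python
import subprocess

MINIMUM_DRIVER = "418.39"  # minimum driver version stated above


def parse_version(text):
    """Turn a string such as '440.36' or '440.33.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed, minimum=MINIMUM_DRIVER):
    """Return True if the installed driver version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_driver_version():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("Could not query the NVIDIA driver (is nvidia-smi installed?)")
    elif driver_meets_minimum(version):
        print("Driver {} satisfies the {} minimum".format(version, MINIMUM_DRIVER))
    else:
        print("Driver {} is older than {}".format(version, MINIMUM_DRIVER))
```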
### Download
Run the following command in a terminal:
```bash
docker pull moeaidb/aigo:cu10.1-dnn7.6-gpu-tf-cv-vsc-20.06
```
### Usage
#### Example 1: Using the code server service
Mount the current directory (```$PWD```) to ```/workspace``` inside the container, and have the code server service listen on host port ```9999```:
```bash
# Choose which host port code server should listen on.
host_port=9999
# Start the container and capture its container ID.
container_id=$(nvidia-docker run -d --rm -p ${host_port}:8080 \
    -v "$PWD":/workspace \
    moeaidb/aigo:cu10.1-dnn7.6-gpu-tf-cv-vsc-20.06)
# Give the service a moment to start.
sleep 3
# Print the password to the screen.
docker exec ${container_id} cat /root/.config/code-server/config.yaml | grep password:
```
After you run the commands above in a terminal, the password for logging in to code server should be displayed:
```bash
password: 909655d9e1902d6a01a35b26
```
This means the code server service is now running inside the container. Next, open a browser and go to http://[your_ip]:9999/ to reach the code server page.
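Example 1 pauses for a fixed three seconds before reading the password, which may be too short on a slow machine. As an alternative, the sketch below polls the service over HTTP until it answers. The `wait_for_http` function is a hypothetical helper, not something shipped with the container. It treats any HTTP response, including an error status, as a sign that the service is up. A bare TCP connection test is avoided because Docker can accept connections on a published port before the service inside the container is listening.

```python
import time
import urllib.error
import urllib.request

# Bypass any proxy settings, since the target is a local service.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_http(url, timeout=30.0, interval=0.5):
    """Return True once the URL answers over HTTP, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=2.0):
                return True
        except urllib.error.HTTPError:
            # The server replied with an error status, so it is running.
            return True
        except OSError:
            # Connection refused, reset, or timed out: not ready yet.
            time.sleep(interval)
    return False
```

With the port chosen in Example 1, the call would be `wait_for_http("http://127.0.0.1:9999/")`.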
#### Example 2: Checking that code server works
Open the code server page, enter the password obtained earlier, and press submit to log in.
![](https://i.imgur.com/cGxZIOF.png)
After logging in, you will see the following screen:
![](https://i.imgur.com/qN0EW9V.png)
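Click the extensions icon in the lower left (red box at the lower left of the figure below) to open the extensions page. We suggest installing ```code runner```, a handy small extension. Type code runner in the search box and click the green Install button (red box at the upper right of the figure below) to install it.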
![](https://i.imgur.com/aB3FTmY.png)
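Once the installation finishes, click reload extension. We will use code runner shortly to run a Python script.
Next, click the file explorer icon on the left (left red box in the figure below), then click the new-file icon to its right (right red box in the figure below). Type ```tmp.py``` and press Enter.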
![](https://i.imgur.com/eA37kAz.png)
Enter the following in ```tmp.py```:
```python
import tensorflow as tf  # Load TensorFlow
print(tf.__version__)  # Print the TensorFlow version
```
Finally, click the play button in the upper right (red box in the figure below) to run ```tmp.py```.
![](https://i.imgur.com/UA1N4eY.png)
The output shown in the figure above indicates that TensorFlow loads successfully and that its version is 2.2.0.
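Since this is a GPU container, a natural follow-up is to check whether TensorFlow can also see the GPUs. The sketch below is one way to extend the version check; `tensorflow_summary` is a hypothetical helper name. The import is guarded so the script reports a message instead of failing where TensorFlow is not installed.

```python
def tensorflow_summary():
    """Return a one-line summary of the TensorFlow version and visible GPU count."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not importable in this environment"
    # list_physical_devices is available in TensorFlow 2.1 and later.
    gpus = tf.config.list_physical_devices("GPU")
    return "TensorFlow {} sees {} GPU(s)".format(tf.__version__, len(gpus))


print(tensorflow_summary())
```

A GPU count of zero inside the container would suggest that the container was started without GPU access or that the host driver does not meet the requirement.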
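#### Example 3: Showing package information in a terminal
The AIGO container includes a small program, ```versions_summary```, that quickly shows which packages are installed in the container and in which versions. Run the following command in a terminal: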
```bash
nvidia-docker run -it --privileged --rm moeaidb/aigo:cu10.1-dnn7.6-gpu-tf-cv-vsc-20.06 versions_summary
```
After it runs, you should see output similar to the following:
```
System INFO:
Python v3.8.3
NVIDIA Driver v440.36
CUDA v10.1.243-1
cuDNN v7.6.5.32-1+cuda10.1
NCCL v2.4.8-1+cuda10.1
Installed Python3 Packages:
[Base]:
tensorflow v2.2.0
horovod v0.19.4
mpi4py v3.0.3
numba v0.49.1
[Numerical]:
numexpr v2.7.1
numpy v1.18.4
scipy v1.4.1
[Data Science]:
sklearn v0.23.1
pandas v1.0.4
matplotlib v3.2.1
seaborn v0.10.1
bokeh v2.0.2
jupyterlab v2.1.4
pyodbc v4.0.30
yacs v0.1.7
[NLP]:
[CV]:
cv2 v4.3.0
imgaug v0.4.0
pydicom v1.2.2
skimage v0.17.2
```
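The same kind of information can be gathered from inside a Python session. The sketch below is not how ```versions_summary``` is implemented (its source is not shown here); it is a simplified stand-in that imports each module by name and reads its `__version__` attribute. The function names and the module list are assumptions for illustration.

```python
import importlib


def module_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def versions_report(module_names):
    """Build a text report with one line per module."""
    lines = []
    for name in module_names:
        version = module_version(name)
        status = "not installed" if version is None else "v{}".format(version)
        lines.append("{:<12} {}".format(name, status))
    return "\n".join(lines)


if __name__ == "__main__":
    # Import names as they appear in the listing above (sklearn, cv2, skimage, ...).
    print(versions_report(["tensorflow", "numpy", "sklearn", "cv2", "pydicom"]))
```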
### Package Information
| Package / software / library | Version | Description |
|:---------|:---------:|:---------|
| [TensorFlow](https://www.tensorflow.org) | 2.2 | An open source machine learning library for research and production. An open-source framework for developing AI models, maintained by Google. |
| [Python](https://docs.python.org/3.7/whatsnew/changelog.html#python-3-7-3-final) | 3.8.3 | Python is powerful... and fast; plays well with others; runs everywhere; is friendly & easy to learn; is Open. This environment uses Python 3.8, which is considerably faster than Python 3.6 at string processing and file searching. |
| [Horovod](https://github.com/horovod/horovod) | 0.19.4 | Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. With Uber Horovod, AI training can be accelerated across multiple GPUs with little effort. |
| [OpenMPI](https://www.open-mpi.org) | 4.0.3 | A High Performance Message Passing Library. Required by Uber Horovod; it supports communication across GPUs and across server nodes. |
| [NVIDIA CUDA](https://developer.nvidia.com/cuda-toolkit) | 10.1.243-1 | The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. CUDA is the development framework NVIDIA provides for its GPUs; GPU-accelerated AI frameworks call the APIs it provides. |
| [NVIDIA cuDNN](https://developer.nvidia.com/cudnn) | 7.6.5.32-1+cuda10.1 | The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN is the library NVIDIA provides specifically for deep neural network development. |
| [NVIDIA TensorRT](https://developer.nvidia.com/tensorrt) | 6.0.1-1+cuda10.1 | NVIDIA TensorRT® is a platform for high-performance deep learning inference. At deployment time, NVIDIA TensorRT can optimize a model, or convert a single-precision model to half precision where appropriate, so that inference runs faster. |
| [NVIDIA Collective Communications Library (NCCL)](https://developer.nvidia.com/nccl) | 2.4.8-1+cuda10.1 | The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs. When training on multiple GPUs, TensorFlow can use NVIDIA NCCL to speed up communication between them. |
| [Intel Math Kernel Library (Intel MKL)](https://software.intel.com/en-us/mkl) | 2020.0-088 | Intel® Math Kernel Library (Intel® MKL) optimizes code with minimal effort for future generations of Intel® processors. Provides fast numerical computation on Intel CPUs. |
| [NumPy](https://www.numpy.org) (Intel-MKL-accelerated) | 1.18.4 | NumPy is the fundamental package for scientific computing with Python. A widely used numerical computing package (accelerated with Intel MKL). |
| [SciPy](https://www.scipy.org) (Intel-MKL-accelerated) | 1.4.1 | SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering. A widely used scientific computing package that provides basic algorithms and statistical methods (accelerated with Intel MKL). |
| [Scikit-learn](https://scikit-learn.org/stable/#) (Intel-MKL-accelerated) | 0.23.1 | Machine Learning in Python. A widely used machine learning package that provides basic algorithms and statistical methods (accelerated with Intel MKL). |
| [OpenCV](https://opencv.org) (Intel-MKL-accelerated) | 4.3.0 | OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. Used for image processing and for building image-related machine learning models. |
| [imgaug](https://github.com/aleju/imgaug) | 0.4.0 | Image augmentation for machine learning experiments. Used for data augmentation. |
| [pydicom](https://pydicom.github.io/pydicom/stable/getting_started.html) | 1.2.2 | Pydicom is a pure Python package for working with DICOM files such as medical images, reports, and radiotherapy objects. Used to read medical images. |
| [gdcm](https://sourceforge.net/projects/gdcm/) | 3.0.6 | Grassroots DiCoM is a C++ library for DICOM medical files. It is accessible from Python, C#, Java and PHP. It supports RAW, JPEG, JPEG 2000, JPEG-LS, RLE and deflated transfer syntax. pydicom needs this library in order to read compressed medical images. |
| [Numba](http://numba.pydata.org) | 0.49.1 | Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. In favorable cases, JIT-compiled Python code can run hundreds to thousands of times faster. |
| [Numexpr](https://github.com/pydata/numexpr) | 2.7.1 | Fast numerical array expression evaluator for Python, NumPy, PyTables, pandas, bcolz and more. Optimized evaluation of mathematical expressions can be up to 4 times faster. |
| [pyodbc](https://github.com/mkleehammer/pyodbc) | 4.0.30 | pyodbc is an open source Python module that makes accessing ODBC databases simple. Used to connect to databases. |
| [Jupyterlab](https://github.com/jupyterlab/jupyterlab) | 2.1.4 | An extensible environment for interactive and reproducible computing, based on the Jupyter Notebook and Architecture. Code runs, records, and notes can all be stored and organized in notebooks. |
| [pandas](https://pandas.pydata.org) | 1.0.4 | pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Builds and organizes data tables, and offers simple ways to visualize them. |
| [Matplotlib](https://matplotlib.org) | 3.2.1 | Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. A data visualization package that can draw bar charts, histograms, scatter plots, and more. |
| [Seaborn](https://seaborn.pydata.org) | 0.10.1 | Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. A high-level plotting API built on Matplotlib; it accepts a data table and groups the data automatically before plotting. |
| [Bokeh](https://bokeh.pydata.org/en/latest/) | 2.0.2 | Bokeh is an interactive visualization library that targets modern web browsers for presentation. Can be embedded in web pages for interactive data presentation. |
| [code server](https://github.com/cdr/code-server) | 3.4.1 | VS Code in the browser. A cloud-hosted version of Visual Studio Code. |