【2024年最新版】【Windows10】NVIDIAグラフィックカードのGPUをTensorFlowに認識させて動かす

カテゴリー【Python、Windows、Google】

【Windows10】NVIDIAグラフィックカードのGPUをTensorFlowに認識させて動かす

POSTED BY
2024-01-13

【Windows10】Python統合開発環境Anaconda3メモ
 【Anaconda3】指定した仮想環境でJupyter Notebookを動かす

続きです。当方の個人パソコンにはNVIDIAのグラフィックカードを差しているので、TensorFlowにGPUを認識させられるはずですので、やってみました。

TensorFlowのGPU認識はそれぞれのソフトウェアが決められたバージョン同士でないとかたくなに認識してくれませんでした（全部最新を入れてみたが認識せず）

当方が成功した組み合わせは以下のとおりでした。

【ハードウェア】

・Windows 10 Pro 64-bit
・Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz 2.81GHz
・32.0GB RAM
・NVIDIA GeForce GTX 1050 Ti 1341MHz Memory 4096MB GDDR5

【ソフトウェア】

・Anaconda3-2020.11-Windows-x86_64

・vs_community__824172766.1610265689
※↑Visual Studioが必須かどうか不明ですが一応入れました
Visual Studio 2019 コミュニティエディション
https://visualstudio.microsoft.com/ja/downloads/

・NVIDIA cuda_10.1.105_418.96_win10
NVIDIA CUDAをダウンロード・インストールします。バージョンは10.1です。

https://developer.nvidia.com/cuda-10.1-download-archive-base?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal

インストールが完了したら、下記

・NVIDIA cudnn-10.1-windows10-x64-v7.6.5.32
NVIDIA cuDNNのZIPファイルをダウンロードします。バージョンはCUDA10.1用のcuDNN7.6です。
ダウンロードにはユーザー登録が必要です（無料）。

ここから7.6を探してダウンロード、ZIPを展開します。
https://developer.nvidia.com/rdp/cudnn-archive

展開したら、中身を手動でCUDAのプログラムフォルダに放り込みます。

cuda/binの中身を
↓
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/binへコピー。
 
cuda/includeの中身を
↓
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/includeへコピー。
 
cuda/lib/x64の中身を
↓
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/lib/x64へコピー。

TensorFlow GPU用の仮想環境を作成しセットアップ・確認

ここからはAnaconda Promptでの作業です。

CUDA Driverの確認

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105

cuDNNライブラリパスが通っているかの確認

where cudnn64_7.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\cudnn64_7.dll

Pythonバージョンは3.7で仮想環境を作成

conda create -n tf-gpu python=3.7
conda activate tf-gpu

TensorFlow-GPUは2.2をインストール

(tf-gpu) C:\Users\hogeuser>pip install tensorflow-gpu==2.2

いよいよGPUの認識確認

Pythonを立ち上げて、以下のように打ちます。

(tf-gpu) C:\Users\hogeuser>python
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

device_type: "GPU"が表示されれば成功です。当方環境は以下のように認識されました。

(tf-gpu) C:\Users\hogeuser>python
Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
2021-01-13 23:40:28.746568: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
>>> device_lib.list_local_devices()
2021-01-13 23:40:40.077791: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-01-13 23:40:40.085936: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2a602c459e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-13 23:40:40.086023: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-01-13 23:40:40.087743: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-01-13 23:40:40.110202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.455GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2021-01-13 23:40:40.110372: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-01-13 23:40:40.503797: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-01-13 23:40:40.731304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-01-13 23:40:40.751724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-01-13 23:40:40.970555: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-01-13 23:40:41.178489: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-01-13 23:40:41.392853: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-01-13 23:40:41.393067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-01-13 23:40:41.973926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-13 23:40:41.974051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2021-01-13 23:40:41.974834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N
2021-01-13 23:40:41.975317: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-01-13 23:40:41.975628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 2562 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-01-13 23:40:41.978480: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2a630e56380 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-01-13 23:40:41.978576: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5241415701586966866
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 15841878375801700475
physical_device_desc: "device: XLA_CPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 2686628660
locality {
  bus_id: 1
  links {
  }
}
incarnation: 13497597365978317240
physical_device_desc: "device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 4541417229844871161
physical_device_desc: "device: XLA_GPU device"
]
>>>

cuDNNメモリエラー回避環境変数の設定

Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILEDエラーの対処法のとおり環境変数TF_FORCE_GPU_ALLOW_GROWTHをtrueにセットしておくこと。

次回はTensorFlow公式のチュートリアルを進める（３）にて、上記無事認識されたGPUに明示的に計算させてみて、CPUで行った結果と比較してみます。

【次の記事】TensorFlow公式のチュートリアルを進める（３）

【前の記事】【Anaconda3】指定した仮想環境でJupyter Notebookを動かす

Android 　iPhone/iPad 　Flutter 　MacOS 　Windows 　Debian 　Ubuntu 　CentOS 　FreeBSD 　RaspberryPI 　HTML/CSS 　C/C++ 　PHP 　Java 　JavaScript 　Node.js 　Swift 　Python 　MatLab 　Amazon/AWS 　CORESERVER 　Google 　仮想通貨　 LINE 　OpenAI/ChatGPT 　IBM Watson 　Microsoft Azure 　Xcode 　VMware 　MySQL 　PostgreSQL 　Redis 　Groonga 　Git/GitHub 　Apache 　nginx 　Postfix 　SendGrid 　Hackintosh 　Hardware 　Fate/Grand Order 　ウマ娘　将棋　ドラレコ

【WEBMASTER/管理人】

自営業プログラマーです。お仕事ください！
ご連絡は以下アドレスまでお願いします★

【キーワード検索】

【最近の記事】【全部の記事】

ファイアウォール内部のGradio/WebUIを外部からProxyPassを通して使う
オープンソースリップシンクエンジンSadTalkerをDebianで動かす
ファイアウォール内部のOpenAPI/FastAPIのdocsを外部からProxyPassで呼ぶ
Debian 12でsshからshutdown -h nowしても電源が切れない場合
【Windows＆Mac】アプリのフルスクリーンを解除する方法
Debian 12でtsコマンドが見つからないcommand not found
Debian 12でsyslogやauth.logが出力されない場合
Debian 12で固定IPアドレスを使う設定をする
Debian 12 bookwormでNVIDIA RTX4060Ti-OC16GBを動かす
【Debian】apt updateでCD-ROMがどうのこうの言われエラーになる場合

【カテゴリーリンク】