Description
I have a single 8-GPU machine with a faulty GPU 0.
I'm running imagenet_example.py on the other 7 GPUs by specifying gpus=[1,2,3,4,5,6,7] in the Trainer, i.e. I do not want to use GPU 0.
However, when I run nvidia-smi, the Trainer's PID shows up on all 8 GPUs, just with lower memory usage on GPU 0 (see output below). Training is also roughly 4x slower than the equivalent non-Lightning code. I don't see this behavior if I manually set CUDA_VISIBLE_DEVICES=1,2,3,4,5,6,7 and then pass gpus=7 to the Trainer. It also works fine when using a single GPU with, say, gpus=[1].
I'm not sure if it's relevant, but I also see gpu=0 in the tqdm progress bar.
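For clarity, the relevant part of the setup is just the Trainer construction (the model is the one from imagenet_example.py; 'dp' as the backend is my assumption here, consistent with the single PID showing up on every GPU):

import pytorch_lightning as pl

# model = ...  (the LightningModule from imagenet_example.py)
trainer = pl.Trainer(
    gpus=[1, 2, 3, 4, 5, 6, 7],   # GPU 0 excluded on purpose (faulty)
    distributed_backend='dp',     # assumption; a single PID on all GPUs suggests dp
)
# trainer.fit(model)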
nvidia-smi with Trainer(gpus=[1,2,3,4,5,6,7]) and CUDA_VISIBLE_DEVICES unset
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     40155      C   python                                       719MiB |
|    1     40155      C   python                                      6003MiB |
|    2     40155      C   python                                      6019MiB |
|    3     40155      C   python                                      6019MiB |
|    4     40155      C   python                                      6019MiB |
|    5     40155      C   python                                      6019MiB |
|    6     40155      C   python                                      6019MiB |
|    7     40155      C   python                                      6019MiB |
+-----------------------------------------------------------------------------+
nvidia-smi with Trainer(gpus=7) and CUDA_VISIBLE_DEVICES=1,2,3,4,5,6,7
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    1     34452      C   python                                      6003MiB |
|    2     34452      C   python                                      6019MiB |
|    3     34452      C   python                                      6019MiB |
|    4     34452      C   python                                      6019MiB |
|    5     34452      C   python                                      6019MiB |
|    6     34452      C   python                                      6019MiB |
|    7     34452      C   python                                      6019MiB |
+-----------------------------------------------------------------------------+
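For comparison, this is a sketch of the manual workaround that behaves correctly (the key point is hiding GPU 0 before torch/CUDA is initialized; 'dp' is again my assumption):

import os

# Hide GPU 0 from CUDA before importing anything that initializes it,
# then count GPUs 0..6 inside the process.
os.environ['CUDA_VISIBLE_DEVICES'] = '1,2,3,4,5,6,7'

import pytorch_lightning as pl

trainer = pl.Trainer(gpus=7, distributed_backend='dp')
# trainer.fit(model)

With this setup nvidia-smi shows no allocation on GPU 0 and the ~4x slowdown disappears.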
Expected behavior
The process should only use the GPUs specified via gpus=[1,2,3,4,5,6,7], without requiring CUDA_VISIBLE_DEVICES to be set manually.
Environment
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2
Python version: 3.8
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti
GPU 4: GeForce RTX 2080 Ti
GPU 5: GeForce RTX 2080 Ti
GPU 6: GeForce RTX 2080 Ti
GPU 7: GeForce RTX 2080 Ti
Nvidia driver version: 418.87.00
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.18.1
[pip] pytorch-lightning==0.6.0
[pip] torch==1.4.0
[pip] torch-lr-finder==0.1.2
[pip] torchvision==0.5.0
[conda] blas 1.0 mkl
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py38he904b0f_0
[conda] mkl_fft 1.0.15 py38ha843d7b_0
[conda] mkl_random 1.1.0 py38h962f231_0
[conda] pytorch 1.4.0 py3.8_cuda10.1.243_cudnn7.6.3_0 pytorch
[conda] pytorch-lightning 0.6.0 pypi_0 pypi
[conda] torch-lr-finder 0.1.2 pypi_0 pypi
[conda] torchvision 0.5.0 py38_cu101 pytorch