Closed
Labels
question, stale
Description
What is the problem?
I tried the custom environment example with PyTorch, but it fails with the error "RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 'mat1' in call to _th_addmm".
When I switch the framework to TensorFlow, it runs properly. I can't find any documentation on what I need to change to use a custom environment with PyTorch.
System environment:
Ray: latest pip version
Python: 3.7.7
PyTorch: 1.5.0
OS: Ubuntu 18
Reproduction
import gym
from gym.spaces import Discrete, Box
from ray import tune


class SimpleCorridor(gym.Env):
    """Corridor in which the agent walks right until it reaches the end."""

    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(0.0, self.end_pos, shape=(1, ))

    def reset(self):
        self.cur_pos = 0
        # Note: the observation is a list holding a Python int.
        return [self.cur_pos]

    def step(self, action):
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        # Same here: the observation is an int list, the reward an int.
        return [self.cur_pos], 1 if done else 0, done, {}


tune.run(
    "PPO",
    config={
        "env": SimpleCorridor,
        "num_workers": 1,
        "env_config": {"corridor_length": 5},
        "use_pytorch": True,  # switching to TensorFlow makes the error go away
    })
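One likely cause, offered as an assumption rather than a confirmed diagnosis: reset() and step() return a list holding a Python int, which can end up as an int64 (Long) tensor on the PyTorch side, while the policy's linear layers hold Float weights. A minimal sketch of a possible workaround is to return float32 NumPy arrays that match the Box observation space. FloatCorridor and _obs are hypothetical names, and the class is written without gym so it runs standalone; in the real script the same method bodies would go in SimpleCorridor(gym.Env).

```python
import numpy as np


class FloatCorridor:
    """Plain-Python stand-in for SimpleCorridor (hypothetical name, no gym dependency)."""

    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0

    def _obs(self):
        # Assumption: a float32 array avoids the Long tensor reported in the error.
        return np.array([self.cur_pos], dtype=np.float32)

    def reset(self):
        self.cur_pos = 0
        return self._obs()

    def step(self, action):
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        reward = 1.0 if done else 0.0  # float reward instead of int
        return self._obs(), reward, done, {}


env = FloatCorridor({"corridor_length": 5})
first_obs = env.reset()
next_obs, reward, done, info = env.step(1)

# For comparison: the dtype NumPy infers from a bare list of ints.
int_obs_dtype = np.array([0]).dtype
print(first_obs.dtype, int_obs_dtype)
```

If this guess is right, declaring the space with an explicit dtype, for example Box(0.0, self.end_pos, shape=(1,), dtype=np.float32), would keep the declared space and the returned observations consistent.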