Same reward throughout the training in DDPG #1233

Open
Siddhu2502 opened this issue May 20, 2024 · 0 comments
Siddhu2502 commented May 20, 2024

agent = DRLAgent(env=env_train)
DDPG_PARAMS = {
    "batch_size": 4096,
    "buffer_size": 1000000,
    "learning_rate": 0.0003,
    "learning_starts": 100,
    "tau": 0.02,
}

model_ddpg = agent.get_model("ddpg", model_kwargs=DDPG_PARAMS)

# Train the DDPG agent
trained_ddpg = agent.train_model(
    model=model_ddpg,
    tb_log_name="ddpg",
    total_timesteps=50000,
)
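One thing worth noting: the DDPG_PARAMS above does not set Stable-Baselines3's `action_noise` argument, and DDPG's policy is deterministic, so without exploration noise it can saturate and repeat the same action at every step. Below is a minimal, self-contained NumPy sketch of the Ornstein-Uhlenbeck noise process commonly paired with DDPG; the `theta`/`sigma` values are illustrative, not taken from this issue.

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=0):
        self.mu = mu * np.ones(size)
        self.theta = theta
        self.sigma = sigma
        self.state = self.mu.copy()
        self.rng = np.random.default_rng(seed)

    def sample(self):
        # Mean-reverting step: dx = theta * (mu - x) + sigma * N(0, 1)
        dx = self.theta * (self.mu - self.state) \
            + self.sigma * self.rng.standard_normal(len(self.state))
        self.state = self.state + dx
        return self.state

# Perturb a deterministic action and clip back into the action space
noise = OUNoise(size=3)
noisy_action = np.clip(np.zeros(3) + noise.sample(), -1.0, 1.0)
```

With Stable-Baselines3 (which FinRL wraps), the equivalent would be passing an `action_noise` instance (e.g. `NormalActionNoise` or `OrnsteinUhlenbeckActionNoise` from `stable_baselines3.common.noise`) into the model's keyword arguments.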
----------------------------------
| time/              |           |
|    episodes        | 4         |
|    fps             | 29        |
|    time_elapsed    | 189       |
|    total_timesteps | 5608      |
| train/             |           |
|    actor_loss      | -11.6     |
|    critic_loss     | 0.0618    |
|    learning_rate   | 0.0003    |
|    n_updates       | 5507      |
|    reward          | 0.5398047 |
----------------------------------
day: 1401, episode: 10
begin_total_asset: 100000.00
end_total_asset: 259256.35
total_reward: 159256.35
total_cost: 138.56
total_trades: 72857
Sharpe: 0.778
=================================
----------------------------------
| time/              |           |
|    episodes        | 8         |
|    fps             | 29        |
|    time_elapsed    | 386       |
|    total_timesteps | 11216     |
| train/             |           |
|    actor_loss      | -3.94     |
|    critic_loss     | 0.0111    |
|    learning_rate   | 0.0003    |
|    n_updates       | 11115     |
|    reward          | 0.5398047 |
----------------------------------
----------------------------------
| time/              |           |
|    episodes        | 12        |
|    fps             | 28        |
|    time_elapsed    | 584       |
|    total_timesteps | 16824     |
| train/             |           |
|    actor_loss      | -1.22     |
|    critic_loss     | 0.0419    |
|    learning_rate   | 0.0003    |
|    n_updates       | 16723     |
|    reward          | 0.5398047 |
----------------------------------

I am trying to use DDPG with the StockTradingEnv provided by FinRL. The reward is the same across all episodes, and the same issue shows up when plotting the buys, sells, and holds of the stocks:

df_account_value, df_actions = DRLAgent.DRL_prediction(
    model=trained_ddpg,
    environment=e_trade_gym,
)
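The all-zero actions can be confirmed directly on the returned frame. A small sketch with a toy DataFrame standing in for `df_actions` (the ticker column names here are hypothetical):

```python
import pandas as pd

# Stand-in for the df_actions frame returned by DRL_prediction
df_actions = pd.DataFrame({"AAPL": [0, 0, 0], "MSFT": [0, 0, 0]})

# True if every cell is zero
all_zero = (df_actions == 0).all().all()
# Fraction of rows where no trade is made at all
frac_zero_rows = (df_actions == 0).all(axis=1).mean()
print(all_zero, frac_zero_rows)  # True 1.0
```

If `frac_zero_rows` is 1.0 on the real output, the policy never trades, which is consistent with the constant `reward` value in the training logs above.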

The entire actions table is just 0s from the first row onward. Performance is far worse than SAC, and training DDPG for 1,000 timesteps gives the same result as training for 10k timesteps.

Am I missing something? Is it the hyperparameters?

@ndronen @lcavalie @dubodog @kruzel

@Siddhu2502 changed the title from "Same reward through the training in DDPG" to "Same reward thought the training in DDPG" on May 20, 2024