pushing this all to git before cleaning it. putting this thing on my fucking Dell Poweredge bc god knows im not waiting 2 hours to run another ML test on this 12-core machine

This commit is contained in:
klein panic
2025-01-27 15:29:04 -05:00
parent 9642bd4292
commit 638e0d82a4
24 changed files with 22165 additions and 0 deletions

File diff suppressed because it is too large


@@ -0,0 +1,268 @@
# I got Bitched. Fuck Python
Below is a breakdown of how **`main.py`** works, what files it generates, and how all the pieces (LSTM model training, Optuna tuning, DQN trading agent, etc.) fit together. I'll also cover the CSV structure, how to interpret metrics, how to handle the various files, and how to potentially adapt this code for real-time predictions or a live trading bot.
---
## 1. **High-Level Flow of `main.py`**
1. **Imports & Logging Setup**
- Imports all necessary libraries (numpy, pandas, sklearn, TensorFlow, XGBoost, Optuna, Stable Baselines, etc.)
- Sets up basic logging with timestamps and message levels.
2. **Argument Parsing**
- Uses `argparse` to expect one argument: the path to the CSV file.
- You run the script like `python main.py your_data.csv`.
3. **Data Loading & Preprocessing**
- **`load_data(file_path)`**: Reads the CSV with pandas, renames columns (e.g., `time` → `Date`, `open` → `Open`, etc.), sorts by date, and returns a cleaned DataFrame.
- **`calculate_technical_indicators(df)`**:
- Computes various indicators (SMA, EMA, RSI, MACD, ADX, OBV).
- Drops rows with `NaN` values (because rolling windows can produce NaNs).
- After these steps, the data is ready for feature selection.
4. **Feature Selection & Scaling**
- Chooses certain columns as features (`feature_columns`) plus the target (`Close`).
- Scales features and target with `MinMaxScaler`, which normalizes values to [0, 1].
- Prepares sequences of length `window_size` (default = 15) for LSTM training via **`create_sequences()`** (see the sketch after this list).
5. **Train/Validation/Test Split**
- Splits sequences into 70% for training, 15% for validation, and 15% for testing.
- This yields `X_train`, `X_val`, `X_test` and corresponding `y_train`, `y_val`, `y_test`.
6. **Device Configuration**
- Checks for GPUs and configures TensorFlow to allow memory growth on the available GPU(s).
7. **Model Building and Hyperparameter Tuning**
- **`build_advanced_lstm(...)`**: Creates a multi-layer, bidirectional LSTM with optional dropout, user-defined optimizer, learning rate, etc.
- **Optuna**:
- The `objective(trial)` function defines which hyperparameters to search.
- It trains an LSTM for each set of hyperparameters.
- Minimizes the validation MAE to find the best combination.
- Runs `study.optimize(objective, n_trials=50)` to try up to 50 hyperparameter sets.
- The best hyperparameters are then retrieved (`study.best_params`).
8. **Train the Best LSTM Model**
- Re-builds the LSTM with the best hyperparameters found.
- Uses callbacks (`EarlyStopping`, `ReduceLROnPlateau`) for better generalization.
- Trains up to 300 epochs or until early stopping.
9. **Model Evaluation**
- **`evaluate_model(...)`**:
- Gets predictions (`model.predict(X_test)`).
- Inverse-transforms them to the original scale (because we had scaled them).
- Computes **MSE**, **RMSE**, **MAE**, **R²**, and **directional accuracy**.
- Saves a plot called **`actual_vs_predicted.png`** comparing actual and predicted test prices.
- Prints the first 40 predictions in a tabular format.
10. **Save Model & Scalers**
- Saves the Keras model to **`optimized_lstm_model.h5`**.
- _(Keras warns this is a legacy format and suggests using `.keras` file extension—more on that later.)_
- Saves the scalers to **`scaler_features.save`** and **`scaler_target.save`** using `joblib`.
11. **Reinforcement Learning (DQN) Setup**
- Defines a custom Gym environment **`StockTradingEnv`** that simulates a simple stock trading scenario:
- Discrete action space: **0 = Sell**, **1 = Hold**, **2 = Buy**.
- Observations: scaled features + current balance + shares held + cost basis.
- Reward is the net worth minus the initial balance, i.e. cumulative profit so far.
- Uses **Stable Baselines 3** (`DQN`) to train an agent in that environment for 100,000 timesteps.
- Saves the agent as **`dqn_stock_trading.zip`**.
12. **Finally**:
- The script ends after the RL agent training completes.
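For reference, here is a minimal sketch of the windowing and chronological split from steps 4–5 (it mirrors the script's `create_sequences`; `scaled_features` and `scaled_target` are assumed to come from the `MinMaxScaler` step):
```python
import numpy as np

def create_sequences(features, target, window_size=15):
    # Each sample is `window_size` consecutive feature rows; the label is
    # the target value immediately after the window.
    X, y = [], []
    for i in range(len(features) - window_size):
        X.append(features[i:i + window_size])
        y.append(target[i + window_size])
    return np.array(X), np.array(y)

X, y = create_sequences(scaled_features, scaled_target, window_size=15)

# Chronological (unshuffled) 70/15/15 split, as in step 5.
train_end = int(len(X) * 0.70)
val_end = train_end + int(len(X) * 0.15)
X_train, X_val, X_test = X[:train_end], X[train_end:val_end], X[val_end:]
y_train, y_val, y_test = y[:train_end], y[train_end:val_end], y[val_end:]
```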
---
## 2. **Explanation of Generated Files**
After you run `main.py`, you typically end up with these files:
1. **`optimized_lstm_model.h5`**
- The final trained LSTM model, saved in the HDF5 format.
- **Keras** now recommends using `model.save('optimized_lstm_model.keras')` or `keras.saving.save_model(model, 'optimized_lstm_model.keras')` for a more modern format.
2. **`scaler_features.save`**
- A `joblib` file containing the fitted `MinMaxScaler` for the features.
3. **`scaler_target.save`**
- Another `joblib` file for the target variable's `MinMaxScaler`.
4. **`actual_vs_predicted.png`**
- A PNG plot comparing the actual vs. predicted close prices from the test set.
5. **`dqn_stock_trading.zip`**
- The trained DQN agent from Stable Baselines 3.
6. **`dqn_stock_tensorboard/`**
- A directory containing TensorBoard logs for the DQN training process.
- You can inspect these logs by running `tensorboard --logdir=./dqn_stock_tensorboard`.
7. **Other legacy or auxiliary files** you may have in the same folder:
- **`enhanced_lstm_model.h5`**, **`prediction_vs_actual.png`**, **`policy.pth`**, **`_stable_baselines3_version`**, etc., come from either old runs or intermediate attempts. You can clean them up if you no longer need them.
---
## 3. **Your CSV (`time,open,high,low,close,Volume`)**
An example snippet:
```
time,open,high,low,close,Volume
2024-01-08T09:30:00-05:00,59.23,59.69,59.03,59.53,4335
...
```
- The script renames these columns to `Date`, `Open`, `High`, `Low`, `Close`, `Volume`.
- It sorts by `Date` and starts computing features.
Because the script sorts by `Date` in ascending order, make sure all timestamps share the same format (ISO 8601 with UTC offset, as above) so they order correctly.
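For reference, the load-and-rename step boils down to something like this (a sketch mirroring the script's `load_data`):
```python
import pandas as pd

df = pd.read_csv("your_data.csv", parse_dates=["time"])
df = df.rename(columns={"time": "Date", "open": "Open", "high": "High",
                        "low": "Low", "close": "Close"})
df = df.sort_values("Date").reset_index(drop=True)
```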
---
## 4. **Interpreting the Evaluation Metrics**
When the script prints:
- **MSE (Mean Squared Error)**
- **RMSE (Root MSE)**
- **MAE (Mean Absolute Error)**
- **R² (Coefficient of Determination)**
- **Directional Accuracy**
### Example Output Interpretation
- **MSE: 0.0838** → The average of squared errors on the original (inversely scaled) price scale is relatively low.
- **RMSE: 0.2895** → The square root of that MSE, so on average the model's predictions deviate by about 0.29 from the actual close price.
- **MAE: 0.1836** → On average, the absolute deviation is ~0.18.
- **R²: 0.993** → Very high, suggests the model explains 99.3% of the variance in price.
However, a **directional accuracy** of ~0.48 suggests the model is not great at predicting whether the price goes up or down from one timestep to the next. It's close to random guessing (50%). This can happen if the model is good at capturing overall magnitude but not short-term direction.
If you need the model to be directionally correct more often (for trading), consider:
- Shifting the target to be the price change or return (rather than the absolute price).
- Using a classification-based approach (up/down) or building a custom loss function that focuses more on directional accuracy (see the sketch below).
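Neither variant is implemented in `main.py` yet; as a rough sketch, re-labeling the target could look like this (assumes a DataFrame with the renamed `Close` column):
```python
# Option 1: regress on returns instead of price levels.
df["Return"] = df["Close"].pct_change()

# Option 2: binary up/down label for a classification model.
df["Direction"] = (df["Close"].shift(-1) > df["Close"]).astype(int)

# Drop the first row (no return) and the last row (no next bar to label).
df = df.dropna(subset=["Return"]).iloc[:-1]
```
For the classification route, the LSTM head would change to `Dense(1, activation='sigmoid')` with a `binary_crossentropy` loss.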
---
## 5. **How to Improve or Change the Metric Outputs**
1. **Custom Metrics**:
- You can add them to `model.compile(metrics=[...])` if they're supported by Keras.
- Or you can compute them manually in the `evaluate_model` function (like you already do for R², directional accuracy, etc.).
2. **Reducing Warnings**:
- **HDF5 warning**: Instead of `best_model.save('optimized_lstm_model.h5')`, do:
```python
best_model.save('optimized_lstm_model.keras')
```
Or:
```python
keras.saving.save_model(best_model, 'optimized_lstm_model.keras')
```
- **Gym vs. Gymnasium warning** in Stable Baselines:
- You can switch to Gymnasium by installing `gymnasium` and adapting the environment accordingly:
```python
import gymnasium as gym
```
Then subclass `gymnasium.Env` in your custom environment (or register it and use `gymnasium.make(...)`); a sketch of the changed `reset`/`step` signatures follows this list.
- But as long as it's working, the warning is mostly informational.
3. **Remove Unused Files**:
- If certain files are no longer used or were generated by old runs, just delete them to keep your workspace clean.
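A hedged sketch of what the Gymnasium port changes: `reset()` takes a `seed` keyword and returns `(obs, info)`, and `step()` returns a five-tuple with separate `terminated`/`truncated` flags. The toy environment below shows only the signatures, not the trading logic:
```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces

class MinimalTradingEnv(gym.Env):
    """Toy env illustrating the Gymnasium reset/step API."""

    def __init__(self, n_steps=100):
        super().__init__()
        self.n_steps = n_steps
        self.current_step = 0
        self.action_space = spaces.Discrete(3)  # 0=Sell, 1=Hold, 2=Buy
        self.observation_space = spaces.Box(0.0, 1.0, shape=(4,), dtype=np.float32)

    def _get_obs(self):
        return np.zeros(4, dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)          # seeds self.np_random
        self.current_step = 0
        return self._get_obs(), {}        # Gymnasium returns (obs, info)

    def step(self, action):
        self.current_step += 1
        terminated = self.current_step >= self.n_steps  # natural end of the data
        truncated = False                 # no external time limit here
        return self._get_obs(), 0.0, terminated, truncated, {}
```
Stable Baselines 3 2.x is built on Gymnasium, which is why it warns when handed an old-style `gym.Env`.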
---
## 6. **Using the DQN Agent**
### How It's Being Trained
- **`StockTradingEnv`** is a simplified environment that steps through your historical data row by row (`self.max_steps = len(df)`).
- Each step, you pick an action (Sell, Hold, or Buy).
- The environment updates your balance, shares held, cost basis, and net worth accordingly.
- The reward is `(net_worth - initial_balance)`, i.e. how much you've gained or lost so far (a cumulative figure, not a per-step change).
### How to Deploy It
1. **After Training**: You have **`dqn_stock_trading.zip`** saved.
2. **Load the Model** in a separate script or Jupyter notebook:
```python
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv
# Recreate the same environment
env = StockTradingEnv(your_dataframe)
env = DummyVecEnv([lambda: env])
# Load the trained agent
model = DQN.load("dqn_stock_trading.zip", env=env)
```
3. **Run Predictions**:
```python
obs = env.reset()
done = False
while not done:
# Model predicts the best action
action, _states = model.predict(obs, deterministic=True)
obs, reward, done, info = env.step(action)
env.render()
```
This will step through the environment again, but now with your trained agent. In a real-time scenario, you'd need a streaming environment that updates with new data in small increments (e.g., each new minute's bar).
---
## 7. **Transition to Real-Time (“Live”) Predictions**
1. **Live Price Feed**:
- You would replace the static CSV with a real-time feed (e.g., an API from a broker or a data provider).
- Keep a rolling window of the last `window_size` data points, compute your indicators on the fly.
2. **Online or Incremental Updates**:
- For an LSTM, you typically retrain or fine-tune it with new data over time. Or you load the existing model and just do forward passes for the new window.
- The code that constructs sequences would run each time you get a new data point, but typically you'd keep a queue or buffer of the recent `N` bars (see the sketch after this list).
3. **Deploying the DQN**:
- Similarly, in a real environment, each new bar triggers `env.step(action)`. The environment's “current step” is the latest bar.
- You might have to rewrite the environment's logic so it only advances by one bar at a time in real time, rather than iterating over the entire historical dataset.
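A hedged sketch of that rolling-buffer idea (it assumes the saved model/scalers from section 2 are on disk, that `window_size=15` matches training, and that each incoming `feature_row` already contains the computed indicators for the latest bar):
```python
from collections import deque

import joblib
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("optimized_lstm_model.keras")
scaler_features = joblib.load("scaler_features.save")
scaler_target = joblib.load("scaler_target.save")

window_size = 15
buffer = deque(maxlen=window_size)  # most recent feature rows, oldest dropped first

def on_new_bar(feature_row):
    """Call once per incoming bar; returns a price forecast once the buffer fills."""
    buffer.append(feature_row)
    if len(buffer) < window_size:
        return None  # not enough history yet
    window = scaler_features.transform(np.array(buffer))   # (15, n_features)
    pred_scaled = model.predict(window[np.newaxis, ...], verbose=0)
    return scaler_target.inverse_transform(pred_scaled)[0, 0]
```
Keep in mind the indicators themselves (RSI-14, SMA-10, etc.) need their own warm-up history, so the live feed must retain more than just the last 15 bars.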
---
## 8. **Summary**
- **`main.py`** orchestrates:
1. Data Loading + Preprocessing
2. Feature Engineering (SMA, EMA, RSI, MACD, ADX, OBV)
3. LSTM Hyperparameter Tuning with Optuna
4. Best Model Training + Saving + Evaluation
5. Simple RL Environment + DQN Training + Saving
- **Key Files** Generated:
- `optimized_lstm_model.h5` (or `.keras`) → your final Keras LSTM model.
- `scaler_features.save`, `scaler_target.save` → joblib-saved scalers.
- `actual_vs_predicted.png` → visual of test set predictions.
- `dqn_stock_trading.zip` → trained RL agent.
- `dqn_stock_tensorboard/` → logs for the RL training.
- **Interpreting Metrics**:
- High R² with lower directional accuracy implies it fits magnitudes well but struggles with sign changes.
- Potential improvement: feature engineering for short-term direction or a classification approach for up vs. down.
- **Using the DQN Agent**:
- `StockTradingEnv` is a toy environment stepping over historical data.
- Real-time adaptation requires modifying how the environment receives data.
- **Warnings**:
- Switch `.h5` → `.keras` to remove the Keras format warning.
- Possibly switch from Gym to Gymnasium to remove the stable-baselines3 compatibility warning.
---
### Next Steps / Tips
1. **Clean Up Legacy Files**: If you have old models or references (like `enhanced_lstm_model.h5`), remove or rename them.
2. **Custom Loss / Custom Metrics**: If you want to focus on direction, consider a custom loss function or a classification-based approach.
3. **Try Different RL Algorithms**: DQN is just one method. PPO, A2C, or SAC might handle continuous or more complex action spaces (see the sketch after this list).
4. **Hyperparameter Range**: Expand or refine your Optuna search space, for instance by trying different `window_size` values or different dropout regularization strategies.
5. **Feature Engineering**: More sophisticated indicators or external features (e.g., news sentiment, fundamental data) might help.
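For item 3, swapping algorithms in Stable Baselines 3 is mostly a one-line change; a sketch with PPO (reusing `StockTradingEnv` and `your_dataframe` from section 6, with an arbitrary timestep budget):
```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: StockTradingEnv(your_dataframe)])
model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="./ppo_stock_tensorboard/")
model.learn(total_timesteps=100_000)
model.save("ppo_stock_trading")
```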
All in all, your script is already quite comprehensive. You have an advanced LSTM pipeline for regression plus a DQN pipeline for RL. The main things to refine will be:
- **Data quality**
- **Indicator relevance**
- **Directional vs. magnitude accuracy**
- **Live streaming vs. historical backtesting**
Once you address those, your system will be closer to a real-time AI/bot capable of forecasting or trading on new data.


@@ -0,0 +1 @@
2.4.1

Binary file not shown.



@@ -0,0 +1,123 @@
{
"policy_class": {
":type:": "<class 'abc.ABCMeta'>",
":serialized:": "gAWVMAAAAAAAAACMHnN0YWJsZV9iYXNlbGluZXMzLmRxbi5wb2xpY2llc5SMCURRTlBvbGljeZSTlC4=",
"__module__": "stable_baselines3.dqn.policies",
"__annotations__": "{'q_net': <class 'stable_baselines3.dqn.policies.QNetwork'>, 'q_net_target': <class 'stable_baselines3.dqn.policies.QNetwork'>}",
"__doc__": "\n Policy class with Q-Value Net and target net for DQN\n\n :param observation_space: Observation space\n :param action_space: Action space\n :param lr_schedule: Learning rate schedule (could be constant)\n :param net_arch: The specification of the policy and value networks.\n :param activation_fn: Activation function\n :param features_extractor_class: Features extractor to use.\n :param features_extractor_kwargs: Keyword arguments\n to pass to the features extractor.\n :param normalize_images: Whether to normalize images or not,\n dividing by 255.0 (True by default)\n :param optimizer_class: The optimizer to use,\n ``th.optim.Adam`` by default\n :param optimizer_kwargs: Additional keyword arguments,\n excluding the learning rate, to pass to the optimizer\n ",
"__init__": "<function DQNPolicy.__init__ at 0x7f6194e245e0>",
"_build": "<function DQNPolicy._build at 0x7f6194e24680>",
"make_q_net": "<function DQNPolicy.make_q_net at 0x7f6194e24720>",
"forward": "<function DQNPolicy.forward at 0x7f6194e247c0>",
"_predict": "<function DQNPolicy._predict at 0x7f6194e24860>",
"_get_constructor_parameters": "<function DQNPolicy._get_constructor_parameters at 0x7f6194e24900>",
"set_training_mode": "<function DQNPolicy.set_training_mode at 0x7f6194e249a0>",
"__abstractmethods__": "frozenset()",
"_abc_impl": "<_abc._abc_data object at 0x7f6194e22e40>"
},
"verbose": 1,
"policy_kwargs": {},
"num_timesteps": 100000,
"_total_timesteps": 100000,
"_num_timesteps_at_start": 0,
"seed": null,
"action_noise": null,
"start_time": 1737967108562402423,
"learning_rate": 0.001,
"tensorboard_log": "./dqn_stock_tensorboard/",
"_last_obs": {
":type:": "<class 'numpy.ndarray'>",
":serialized:": "gAWVtQAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJZAAAAAAAAAACZUeDh1fXg4anh4OP+EeDjsq9AzXS00OLOwR7EevdgzAACAPzMCODoHqXg4B6l4OM9ZeDgRUnE/AAAAAAAAAACUjAVudW1weZSMBWR0eXBllJOUjAJmNJSJiIeUUpQoSwOMATyUTk5OSv////9K/////0sAdJRiSwFLEIaUjAFDlHSUUpQu"
},
"_last_episode_starts": {
":type:": "<class 'numpy.ndarray'>",
":serialized:": "gAWVdAAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJYBAAAAAAAAAAGUjAVudW1weZSMBWR0eXBllJOUjAJiMZSJiIeUUpQoSwOMAXyUTk5OSv////9K/////0sAdJRiSwGFlIwBQ5R0lFKULg=="
},
"_last_original_obs": {
":type:": "<class 'numpy.ndarray'>",
":serialized:": "gAWVtQAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJZAAAAAAAAAAEkLeDhnUXg49Ex4OK1beDgMVd8zojo6OCTptrF6Ve4zAACAP7SYhjpzWng4nKl4OIU4eDgRUnE/AAAAAAAAAACUjAVudW1weZSMBWR0eXBllJOUjAJmNJSJiIeUUpQoSwOMATyUTk5OSv////9K/////0sAdJRiSwFLEIaUjAFDlHSUUpQu"
},
"_episode_num": 4,
"use_sde": false,
"sde_sample_freq": -1,
"_current_progress_remaining": 0.0,
"_stats_window_size": 100,
"ep_info_buffer": {
":type:": "<class 'collections.deque'>",
":serialized:": "gAWVIAAAAAAAAACMC2NvbGxlY3Rpb25zlIwFZGVxdWWUk5QpS2SGlFKULg=="
},
"ep_success_buffer": {
":type:": "<class 'collections.deque'>",
":serialized:": "gAWVIAAAAAAAAACMC2NvbGxlY3Rpb25zlIwFZGVxdWWUk5QpS2SGlFKULg=="
},
"_n_updates": 24750,
"observation_space": {
":type:": "<class 'gymnasium.spaces.box.Box'>",
":serialized:": "gAWVHgIAAAAAAACMFGd5bW5hc2l1bS5zcGFjZXMuYm94lIwDQm94lJOUKYGUfZQojAVkdHlwZZSMBW51bXB5lIwFZHR5cGWUk5SMAmY0lImIh5RSlChLA4wBPJROTk5K/////0r/////SwB0lGKMBl9zaGFwZZRLEIWUjANsb3eUjBJudW1weS5jb3JlLm51bWVyaWOUjAtfZnJvbWJ1ZmZlcpSTlCiWQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlGgLSxCFlIwBQ5R0lFKUjA1ib3VuZGVkX2JlbG93lGgTKJYQAAAAAAAAAAEBAQEBAQEBAQEBAQEBAQGUaAiMAmIxlImIh5RSlChLA4wBfJROTk5K/////0r/////SwB0lGJLEIWUaBZ0lFKUjARoaWdolGgTKJZAAAAAAAAAAAAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD+UaAtLEIWUaBZ0lFKUjA1ib3VuZGVkX2Fib3ZllGgTKJYQAAAAAAAAAAEBAQEBAQEBAQEBAQEBAQGUaB1LEIWUaBZ0lFKUjAhsb3dfcmVwcpSMAzAuMJSMCWhpZ2hfcmVwcpSMAzEuMJSMCl9ucF9yYW5kb22UTnViLg==",
"dtype": "float32",
"_shape": [
16
],
"low": "[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]",
"bounded_below": "[ True True True True True True True True True True True True\n True True True True]",
"high": "[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]",
"bounded_above": "[ True True True True True True True True True True True True\n True True True True]",
"low_repr": "0.0",
"high_repr": "1.0",
"_np_random": null
},
"action_space": {
":type:": "<class 'gymnasium.spaces.discrete.Discrete'>",
":serialized:": "gAWVoQEAAAAAAACMGWd5bW5hc2l1bS5zcGFjZXMuZGlzY3JldGWUjAhEaXNjcmV0ZZSTlCmBlH2UKIwBbpSMFW51bXB5LmNvcmUubXVsdGlhcnJheZSMBnNjYWxhcpSTlIwFbnVtcHmUjAVkdHlwZZSTlIwCaTiUiYiHlFKUKEsDjAE8lE5OTkr/////Sv////9LAHSUYkMIAwAAAAAAAACUhpRSlIwFc3RhcnSUaAhoDkMIAAAAAAAAAACUhpRSlIwGX3NoYXBllCmMBWR0eXBllGgOjApfbnBfcmFuZG9tlIwUbnVtcHkucmFuZG9tLl9waWNrbGWUjBBfX2dlbmVyYXRvcl9jdG9ylJOUjAVQQ0c2NJRoG4wUX19iaXRfZ2VuZXJhdG9yX2N0b3KUk5SGlFKUfZQojA1iaXRfZ2VuZXJhdG9ylIwFUENHNjSUjAVzdGF0ZZR9lChoJooQ7/ZkFxYWRa/mnzjTOmwveYwDaW5jlIoQE+rIIlf7HpzUNhPkWy3PDXWMCmhhc191aW50MzKUSwGMCHVpbnRlZ2VylEonpHERdWJ1Yi4=",
"n": "3",
"start": "0",
"_shape": [],
"dtype": "int64",
"_np_random": "Generator(PCG64)"
},
"n_envs": 1,
"buffer_size": 10000,
"batch_size": 64,
"learning_starts": 1000,
"tau": 1.0,
"gamma": 0.99,
"gradient_steps": 1,
"optimize_memory_usage": false,
"replay_buffer_class": {
":type:": "<class 'abc.ABCMeta'>",
":serialized:": "gAWVNQAAAAAAAACMIHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbi5idWZmZXJzlIwMUmVwbGF5QnVmZmVylJOULg==",
"__module__": "stable_baselines3.common.buffers",
"__annotations__": "{'observations': <class 'numpy.ndarray'>, 'next_observations': <class 'numpy.ndarray'>, 'actions': <class 'numpy.ndarray'>, 'rewards': <class 'numpy.ndarray'>, 'dones': <class 'numpy.ndarray'>, 'timeouts': <class 'numpy.ndarray'>}",
"__doc__": "\n Replay buffer used in off-policy algorithms like SAC/TD3.\n\n :param buffer_size: Max number of element in the buffer\n :param observation_space: Observation space\n :param action_space: Action space\n :param device: PyTorch device\n :param n_envs: Number of parallel environments\n :param optimize_memory_usage: Enable a memory efficient variant\n of the replay buffer which reduces by almost a factor two the memory used,\n at a cost of more complexity.\n See https://github.com/DLR-RM/stable-baselines3/issues/37#issuecomment-637501195\n and https://github.com/DLR-RM/stable-baselines3/pull/28#issuecomment-637559274\n Cannot be used in combination with handle_timeout_termination.\n :param handle_timeout_termination: Handle timeout termination (due to timelimit)\n separately and treat the task as infinite horizon task.\n https://github.com/DLR-RM/stable-baselines3/issues/284\n ",
"__init__": "<function ReplayBuffer.__init__ at 0x7f6194f2b740>",
"add": "<function ReplayBuffer.add at 0x7f6194f2b880>",
"sample": "<function ReplayBuffer.sample at 0x7f6194f2b920>",
"_get_samples": "<function ReplayBuffer._get_samples at 0x7f6194f2b9c0>",
"_maybe_cast_dtype": "<staticmethod(<function ReplayBuffer._maybe_cast_dtype at 0x7f6194f2ba60>)>",
"__abstractmethods__": "frozenset()",
"_abc_impl": "<_abc._abc_data object at 0x7f6194f35280>"
},
"replay_buffer_kwargs": {},
"train_freq": {
":type:": "<class 'stable_baselines3.common.type_aliases.TrainFreq'>",
":serialized:": "gAWVeAAAAAAAAACMJXN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbi50eXBlX2FsaWFzZXOUjAlUcmFpbkZyZXGUk5RLBIwIYnVpbHRpbnOUjAdnZXRhdHRylJOUaACMElRyYWluRnJlcXVlbmN5VW5pdJSTlIwEU1RFUJSGlFKUhpSBlC4="
},
"use_sde_at_warmup": false,
"exploration_initial_eps": 1.0,
"exploration_final_eps": 0.02,
"exploration_fraction": 0.1,
"target_update_interval": 1000,
"_n_calls": 100000,
"max_grad_norm": 10,
"exploration_rate": 0.02,
"lr_schedule": {
":type:": "<class 'function'>",
":serialized:": "gAWV3AQAAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwOX21ha2VfZnVuY3Rpb26Uk5QoaACMDV9idWlsdGluX3R5cGWUk5SMCENvZGVUeXBllIWUUpQoSwFLAEsASwFLBUsTQzSVAZcAdAEAAAAAAAAAAAAAAgCJAXwApgEAAKsBAAAAAAAAAACmAQAAqwEAAAAAAAAAAFMAlE6FlIwFZmxvYXSUhZSMEnByb2dyZXNzX3JlbWFpbmluZ5SFlIynL2hvbWUva2xlaW4vY29kZVdTL1Byb2plY3RzL01pZGFzVGVjaG5vbG9naWVzTExDL01pZGFzVGVjaG5vbG9naWVzL3NyYy9NYWNoaW5lLUxlYXJuaW5nL0xTVE0tcHl0aG9uL3ZlbnYvbGliL3B5dGhvbjMuMTEvc2l0ZS1wYWNrYWdlcy9zdGFibGVfYmFzZWxpbmVzMy9jb21tb24vdXRpbHMucHmUjAg8bGFtYmRhPpSMIWdldF9zY2hlZHVsZV9mbi48bG9jYWxzPi48bGFtYmRhPpRLYUMa+IAApWWoTqhO0DtN0SxO1CxO0SZP1CZPgACUQwCUjA52YWx1ZV9zY2hlZHVsZZSFlCl0lFKUfZQojAtfX3BhY2thZ2VfX5SMGHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbpSMCF9fbmFtZV9flIwec3RhYmxlX2Jhc2VsaW5lczMuY29tbW9uLnV0aWxzlIwIX19maWxlX1+UjKcvaG9tZS9rbGVpbi9jb2RlV1MvUHJvamVjdHMvTWlkYXNUZWNobm9sb2dpZXNMTEMvTWlkYXNUZWNobm9sb2dpZXMvc3JjL01hY2hpbmUtTGVhcm5pbmcvTFNUTS1weXRob24vdmVudi9saWIvcHl0aG9uMy4xMS9zaXRlLXBhY2thZ2VzL3N0YWJsZV9iYXNlbGluZXMzL2NvbW1vbi91dGlscy5weZR1Tk5oAIwQX21ha2VfZW1wdHlfY2VsbJSTlClSlIWUdJRSlGgAjBJfZnVuY3Rpb25fc2V0c3RhdGWUk5RoI32UfZQoaBqMCDxsYW1iZGE+lIwMX19xdWFsbmFtZV9flIwhZ2V0X3NjaGVkdWxlX2ZuLjxsb2NhbHM+LjxsYW1iZGE+lIwPX19hbm5vdGF0aW9uc19flH2UjA5fX2t3ZGVmYXVsdHNfX5ROjAxfX2RlZmF1bHRzX1+UTowKX19tb2R1bGVfX5RoG4wHX19kb2NfX5ROjAtfX2Nsb3N1cmVfX5RoAIwKX21ha2VfY2VsbJSTlGgCKGgHKEsBSwBLAEsBSwFLE0MIlQGXAIkBUwCUaAkpjAFflIWUaA6MBGZ1bmOUjBljb25zdGFudF9mbi48bG9jYWxzPi5mdW5jlEuFQwj4gADYDxKICpRoEowDdmFslIWUKXSUUpRoF05OaB8pUpSFlHSUUpRoJWhBfZR9lChoGowEZnVuY5RoKYwZY29uc3RhbnRfZm4uPGxvY2Fscz4uZnVuY5RoK32UaC1OaC5OaC9oG2gwTmgxaDNHP1BiTdLxqfyFlFKUhZSMF19jbG91ZHBpY2tsZV9zdWJtb2R1bGVzlF2UjAtfX2dsb2JhbHNfX5R9lHWGlIZSMIWUUpSFlGhKXZRoTH2UdYaUhlIwLg=="
},
"batch_norm_stats": [],
"batch_norm_stats_target": [],
"exploration_schedule": {
":type:": "<class 'function'>",
":serialized:": "gAWVcAQAAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwOX21ha2VfZnVuY3Rpb26Uk5QoaACMDV9idWlsdGluX3R5cGWUk5SMCENvZGVUeXBllIWUUpQoSwFLAEsASwFLBEsTQzyVA5cAZAF8AHoKAACJAmsEAAAAAHICiQFTAIkDZAF8AHoKAACJAYkDegoAAHoFAACJAnoLAAB6AAAAUwCUTksBhpQpjBJwcm9ncmVzc19yZW1haW5pbmeUhZSMpy9ob21lL2tsZWluL2NvZGVXUy9Qcm9qZWN0cy9NaWRhc1RlY2hub2xvZ2llc0xMQy9NaWRhc1RlY2hub2xvZ2llcy9zcmMvTWFjaGluZS1MZWFybmluZy9MU1RNLXB5dGhvbi92ZW52L2xpYi9weXRob24zLjExL3NpdGUtcGFja2FnZXMvc3RhYmxlX2Jhc2VsaW5lczMvY29tbW9uL3V0aWxzLnB5lIwEZnVuY5SMG2dldF9saW5lYXJfZm4uPGxvY2Fscz4uZnVuY5RLc0M4+IAA2AwN0BAi0QwioGzSCzLQCzLYExaISuATGJhB0CAy0RwysHO4VbF70RtDwGzRG1LRE1LQDFKUQwCUjANlbmSUjAxlbmRfZnJhY3Rpb26UjAVzdGFydJSHlCl0lFKUfZQojAtfX3BhY2thZ2VfX5SMGHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbpSMCF9fbmFtZV9flIwec3RhYmxlX2Jhc2VsaW5lczMuY29tbW9uLnV0aWxzlIwIX19maWxlX1+UjKcvaG9tZS9rbGVpbi9jb2RlV1MvUHJvamVjdHMvTWlkYXNUZWNobm9sb2dpZXNMTEMvTWlkYXNUZWNobm9sb2dpZXMvc3JjL01hY2hpbmUtTGVhcm5pbmcvTFNUTS1weXRob24vdmVudi9saWIvcHl0aG9uMy4xMS9zaXRlLXBhY2thZ2VzL3N0YWJsZV9iYXNlbGluZXMzL2NvbW1vbi91dGlscy5weZR1Tk5oAIwQX21ha2VfZW1wdHlfY2VsbJSTlClSlGgfKVKUaB8pUpSHlHSUUpRoAIwSX2Z1bmN0aW9uX3NldHN0YXRllJOUaCV9lH2UKGgajARmdW5jlIwMX19xdWFsbmFtZV9flIwbZ2V0X2xpbmVhcl9mbi48bG9jYWxzPi5mdW5jlIwPX19hbm5vdGF0aW9uc19flH2UKGgKjAhidWlsdGluc5SMBWZsb2F0lJOUjAZyZXR1cm6UaDF1jA5fX2t3ZGVmYXVsdHNfX5ROjAxfX2RlZmF1bHRzX1+UTowKX19tb2R1bGVfX5RoG4wHX19kb2NfX5ROjAtfX2Nsb3N1cmVfX5RoAIwKX21ha2VfY2VsbJSTlEc/lHrhR64Ue4WUUpRoOUc/uZmZmZmZmoWUUpRoOUc/8AAAAAAAAIWUUpSHlIwXX2Nsb3VkcGlja2xlX3N1Ym1vZHVsZXOUXZSMC19fZ2xvYmFsc19flH2UdYaUhlIwLg=="
}
}

Binary file not shown.


@@ -0,0 +1,472 @@
import os
import sys
import argparse
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import logging
from tabulate import tabulate
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.optimizers import Adam, Nadam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.losses import Huber
import xgboost as xgb
import optuna
from optuna.integration import KerasPruningCallback
# Reinforcement Learning
import gym
from gym import spaces
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv
# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
##############################
# 1. Data Loading & Indicators
##############################
def load_data(file_path):
logging.info(f"Loading data from: {file_path}")
try:
data = pd.read_csv(file_path, parse_dates=['time'])
except FileNotFoundError:
logging.error(f"File not found: {file_path}")
sys.exit(1)
except pd.errors.ParserError as e:
logging.error(f"Error parsing CSV file: {e}")
sys.exit(1)
except Exception as e:
logging.error(f"Unexpected error: {e}")
sys.exit(1)
rename_mapping = {
'time': 'Date',
'open': 'Open',
'high': 'High',
'low': 'Low',
'close': 'Close'
}
data.rename(columns=rename_mapping, inplace=True)
# Sort by Date
data.sort_values('Date', inplace=True)
data.reset_index(drop=True, inplace=True)
logging.info(f"Data columns after renaming: {data.columns.tolist()}")
logging.info("Data loaded and sorted successfully.")
return data
def compute_rsi(series, window=14):
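    # RSI via simple rolling means of gains and losses (a common simplification
    # of Wilder's smoothing), mapped onto the 0-100 range.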
delta = series.diff()
gain = delta.where(delta > 0, 0).rolling(window=window).mean()
loss = -delta.where(delta < 0, 0).rolling(window=window).mean()
RS = gain / loss
return 100 - (100 / (1 + RS))
def compute_macd(series, span_short=12, span_long=26, span_signal=9):
ema_short = series.ewm(span=span_short, adjust=False).mean()
ema_long = series.ewm(span=span_long, adjust=False).mean()
macd_line = ema_short - ema_long
signal_line = macd_line.ewm(span=span_signal, adjust=False).mean()
return macd_line - signal_line # MACD histogram
def compute_adx(df, window=14):
"""
Example ADX calculation (pseudo-real):
    You can implement a full ADX formula if you'd like.
Here, we do a slightly more robust approach than rolling std.
"""
# True range
df['H-L'] = df['High'] - df['Low']
df['H-Cp'] = (df['High'] - df['Close'].shift(1)).abs()
df['L-Cp'] = (df['Low'] - df['Close'].shift(1)).abs()
tr = df[['H-L', 'H-Cp', 'L-Cp']].max(axis=1)
tr_rolling = tr.rolling(window=window).mean()
# Simplistic to replicate ADX-like effect
adx_placeholder = tr_rolling / df['Close']
df.drop(['H-L','H-Cp','L-Cp'], axis=1, inplace=True)
return adx_placeholder
def compute_obv(df):
# On-Balance Volume
signed_volume = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0)
return signed_volume.cumsum()
def compute_bollinger_bands(series, window=20, num_std=2):
"""
Bollinger Bands: middle=MA, upper=MA+2*std, lower=MA-2*std
Return the band width or separate columns.
"""
sma = series.rolling(window=window).mean()
std = series.rolling(window=window).std()
upper = sma + num_std * std
lower = sma - num_std * std
bandwidth = (upper - lower) / sma # optional metric
return upper, lower, bandwidth
def compute_mfi(df, window=14):
"""
Money Flow Index: uses typical price, volume, direction.
For demonstration.
"""
typical_price = (df['High'] + df['Low'] + df['Close']) / 3
money_flow = typical_price * df['Volume']
# Positive or negative
df_shift = typical_price.shift(1)
flow_positive = money_flow.where(typical_price > df_shift, 0)
flow_negative = money_flow.where(typical_price < df_shift, 0)
# Sum over window
pos_sum = flow_positive.rolling(window=window).sum()
neg_sum = flow_negative.rolling(window=window).sum()
# RSI-like formula
mfi = 100 - (100 / (1 + pos_sum / (neg_sum + 1e-9)))
return mfi
def calculate_technical_indicators(df):
logging.info("Calculating technical indicators...")
df['RSI'] = compute_rsi(df['Close'], window=14)
df['MACD'] = compute_macd(df['Close'])
df['OBV'] = compute_obv(df)
df['ADX'] = compute_adx(df)
# Bollinger
upper_bb, lower_bb, bb_width = compute_bollinger_bands(df['Close'], window=20)
df['BB_Upper'] = upper_bb
df['BB_Lower'] = lower_bb
df['BB_Width'] = bb_width
# MFI
df['MFI'] = compute_mfi(df)
# Simple/EMA
df['SMA_5'] = df['Close'].rolling(window=5).mean()
df['SMA_10'] = df['Close'].rolling(window=10).mean()
df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
# STD
df['STDDEV_5'] = df['Close'].rolling(window=5).std()
df.dropna(inplace=True)
logging.info("Technical indicators calculated successfully.")
return df
##############################
# 2. Parse Arguments
##############################
def parse_arguments():
parser = argparse.ArgumentParser(description='Train LSTM and DQN models for stock trading.')
parser.add_argument('csv_path', type=str, help='Path to the CSV data file.')
return parser.parse_args()
##############################
# 3. Main
##############################
def main():
args = parse_arguments()
csv_path = args.csv_path
# 1) Load data
data = load_data(csv_path)
data = calculate_technical_indicators(data)
# 2) Build feature set
# We deliberately EXCLUDE 'Close' from the features so the model doesn't trivially see it.
# Instead, rely on advanced indicators + OHLC + Volume.
feature_columns = [
'Open', 'High', 'Low', 'Volume',
'RSI', 'MACD', 'OBV', 'ADX',
'BB_Upper', 'BB_Lower', 'BB_Width',
'MFI', 'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5'
]
target_column = 'Close' # still used for label/evaluation
# Keep only these columns + Date + target
data = data[['Date'] + feature_columns + [target_column]].dropna()
# 3) Scale data
from sklearn.preprocessing import MinMaxScaler
scaler_features = MinMaxScaler()
scaler_target = MinMaxScaler()
scaled_features = scaler_features.fit_transform(data[feature_columns])
scaled_target = scaler_target.fit_transform(data[[target_column]]).flatten()
# 4) Create sequences for LSTM
def create_sequences(features, target, window_size=15):
X, y = [], []
for i in range(len(features) - window_size):
X.append(features[i:i+window_size])
y.append(target[i+window_size])
return np.array(X), np.array(y)
window_size = 15
X, y = create_sequences(scaled_features, scaled_target, window_size)
# 5) Train/Val/Test Split
train_size = int(len(X) * 0.7)
val_size = int(len(X) * 0.15)
test_size = len(X) - train_size - val_size
X_train, X_val, X_test = (
X[:train_size],
X[train_size:train_size+val_size],
X[train_size+val_size:]
)
y_train, y_val, y_test = (
y[:train_size],
y[train_size:train_size+val_size],
y[train_size+val_size:]
)
logging.info(f"X_train: {X_train.shape}, X_val: {X_val.shape}, X_test: {X_test.shape}")
logging.info(f"y_train: {y_train.shape}, y_val: {y_val.shape}, y_test: {y_test.shape}")
# 6) Device Config
def configure_device():
gpus = tf.config.list_physical_devices('GPU')
if gpus:
try:
for gpu in gpus:
tf.config.experimental.set_memory_growth(gpu, True)
logging.info(f"{len(gpus)} GPU(s) detected and configured.")
except RuntimeError as e:
logging.error(e)
else:
logging.info("No GPU detected, using CPU.")
configure_device()
# 7) Build LSTM
from tensorflow.keras.regularizers import l2
def build_lstm(input_shape, units=128, dropout=0.3, lr=1e-3):
model = Sequential()
# Example: 2 stacked LSTM layers
model.add(Bidirectional(LSTM(units, return_sequences=True, kernel_regularizer=l2(1e-4)), input_shape=input_shape))
model.add(Dropout(dropout))
model.add(Bidirectional(LSTM(units, return_sequences=False, kernel_regularizer=l2(1e-4))))
model.add(Dropout(dropout))
model.add(Dense(1, activation='linear'))
optimizer = Adam(learning_rate=lr)
model.compile(loss=Huber(), optimizer=optimizer, metrics=['mae'])
return model
# 8) Train LSTM (you can still do Optuna if you like, omitted here for brevity)
model_lstm = build_lstm((X_train.shape[1], X_train.shape[2]), units=128, dropout=0.3, lr=1e-3)
early_stop = EarlyStopping(patience=15, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(factor=0.5, patience=5, min_lr=1e-6)
model_lstm.fit(
X_train, y_train,
validation_data=(X_val, y_val),
epochs=100,
batch_size=32,
callbacks=[early_stop, reduce_lr],
verbose=1
)
# 9) Evaluate
def evaluate_lstm(model, X_test, y_test):
y_pred_scaled = model.predict(X_test).flatten()
        # Clamp predictions to the target scaler's [0, 1] range before inverse-transforming:
y_pred_scaled = np.clip(y_pred_scaled, 0, 1)
y_pred = scaler_target.inverse_transform(y_pred_scaled.reshape(-1,1)).flatten()
y_true = scaler_target.inverse_transform(y_test.reshape(-1,1)).flatten()
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
# Direction
direction_true = np.sign(np.diff(y_true))
direction_pred = np.sign(np.diff(y_pred))
directional_acc = np.mean(direction_true == direction_pred)
logging.info(f"LSTM Test -> MSE={mse:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}, R2={r2:.4f}, DirAcc={directional_acc:.4f}")
# Quick Plot
plt.figure(figsize=(12,6))
plt.plot(y_true[:100], label='Actual')
plt.plot(y_pred[:100], label='Predicted')
plt.title("LSTM: Actual vs Predicted (first 100 test points)")
plt.legend()
plt.savefig("lstm_actual_vs_pred.png")
plt.close()
evaluate_lstm(model_lstm, X_test, y_test)
# Save
model_lstm.save("improved_lstm_model.keras")
import joblib
joblib.dump(scaler_features, "improved_scaler_features.pkl")
joblib.dump(scaler_target, "improved_scaler_target.pkl")
##############################
# 10) Reinforcement Learning
##############################
class StockTradingEnv(gym.Env):
"""
Improved RL Env that:
- excludes raw 'Close' from observation
- includes transaction cost (optional)
- uses step-based PnL as reward
"""
metadata = {'render.modes': ['human']}
def __init__(self, df, initial_balance=10000, transaction_cost=0.001):
super().__init__()
self.df = df.reset_index(drop=True)
self.initial_balance = initial_balance
self.balance = initial_balance
self.net_worth = initial_balance
self.current_step = 0
self.max_steps = len(df)
# Add transaction cost in decimal form (0.001 => 0.1%)
self.transaction_cost = transaction_cost
self.shares_held = 0
self.cost_basis = 0
        # We exclude 'Close' from the features so the agent can't directly observe the bar's closing price
self.obs_columns = [
'Open', 'High', 'Low', 'Volume',
'RSI', 'MACD', 'OBV', 'ADX',
'BB_Upper', 'BB_Lower', 'BB_Width',
'MFI', 'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5'
]
# We'll normalize features with the same scaler used for LSTM. If you want EXACT same scaling:
# you can pass the same 'scaler_features' object into this environment.
self.scaler = MinMaxScaler().fit(df[self.obs_columns])
# Or load from a pkl if you prefer: joblib.load("improved_scaler_features.pkl")
self.action_space = spaces.Discrete(3) # 0=Sell, 1=Hold, 2=Buy
self.observation_space = spaces.Box(
low=0.0, high=1.0,
shape=(len(self.obs_columns) + 3,), # + balance, shares, cost_basis
dtype=np.float32
)
def reset(self):
self.balance = self.initial_balance
self.net_worth = self.initial_balance
self.shares_held = 0
self.cost_basis = 0
self.current_step = 0
return self._get_obs()
def step(self, action):
# Current row
row = self.df.iloc[self.current_step]
current_price = row['Close']
prev_net_worth = self.net_worth
if action == 2: # Buy
shares_bought = int(self.balance // current_price)
if shares_bought > 0:
cost = shares_bought * current_price
fee = cost * self.transaction_cost
self.balance -= (cost + fee)
# Weighted average cost basis
prev_shares = self.shares_held
self.shares_held += shares_bought
self.cost_basis = (
(self.cost_basis * prev_shares) + (shares_bought * current_price)
) / self.shares_held
elif action == 0: # Sell
if self.shares_held > 0:
revenue = self.shares_held * current_price
fee = revenue * self.transaction_cost
self.balance += (revenue - fee)
self.shares_held = 0
self.cost_basis = 0
# Recompute net worth
self.net_worth = self.balance + self.shares_held * current_price
self.current_step += 1
done = (self.current_step >= self.max_steps - 1)
# *Step-based* reward => daily PnL
reward = self.net_worth - prev_net_worth
obs = self._get_obs()
return obs, reward, done, {}
def _get_obs(self):
row = self.df.iloc[self.current_step][self.obs_columns]
# Scale
        scaled = self.scaler.transform(row.values.reshape(1, -1))[0]  # sklearn expects a 2-D array
additional = np.array([
self.balance / self.initial_balance,
self.shares_held / 100.0,
self.cost_basis / (self.initial_balance+1e-9)
], dtype=np.float32)
obs = np.concatenate([scaled, additional]).astype(np.float32)
return obs
def render(self, mode='human'):
profit = self.net_worth - self.initial_balance
print(f"Step: {self.current_step}, "
f"Balance: {self.balance:.2f}, "
f"Shares: {self.shares_held}, "
f"NetWorth: {self.net_worth:.2f}, "
f"Profit: {profit:.2f}")
##############################
# 11) Train DQN
##############################
def train_dqn(env):
logging.info("Training DQN agent with improved environment...")
model = DQN(
'MlpPolicy',
env,
verbose=1,
learning_rate=1e-3,
buffer_size=50000,
learning_starts=1000,
batch_size=64,
tau=0.99,
gamma=0.99,
train_freq=4,
target_update_interval=1000,
exploration_fraction=0.1,
exploration_final_eps=0.02,
tensorboard_log="./dqn_enhanced_tensorboard/"
)
model.learn(total_timesteps=50000)
model.save("improved_dqn_agent")
return model
# Initialize environment with the same data
# *In a real scenario, you might feed a different dataset or do a train/test split
# for the RL environment, too.
rl_env = StockTradingEnv(data, initial_balance=10000, transaction_cost=0.001)
vec_env = DummyVecEnv([lambda: rl_env])
dqn_model = train_dqn(vec_env)
logging.info("Finished DQN training. You can test with a script like 'use_dqn.py' or do an internal test here.")
if __name__ == "__main__":
main()


@@ -0,0 +1,238 @@
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
import optuna
import matplotlib.pyplot as plt
import logging
import sys
import os
# Force TensorFlow to use CPU if no GPU is available
if not any([os.environ.get('CUDA_VISIBLE_DEVICES'), os.environ.get('NVIDIA_VISIBLE_DEVICES')]):
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
# Initialize logger
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')
logger = logging.getLogger()
# Custom functions for technical indicators
def compute_sma(data, period):
return data.rolling(window=period).mean()
def compute_ema(data, period):
return data.ewm(span=period, adjust=False).mean()
def compute_rsi(data, period=14):
delta = data.diff(1)
gain = (delta.where(delta > 0, 0)).rolling(window=period).mean()
loss = (-delta.where(delta < 0, 0)).rolling(window=period).mean()
rs = gain / loss
return 100 - (100 / (1 + rs))
def compute_macd(data, fast_period=12, slow_period=26, signal_period=9):
fast_ema = compute_ema(data, fast_period)
slow_ema = compute_ema(data, slow_period)
macd = fast_ema - slow_ema
signal = compute_ema(macd, signal_period)
return macd, signal
def compute_atr(high, low, close, period=14):
tr = np.maximum(high - low, np.maximum(abs(high - close.shift(1)), abs(low - close.shift(1))))
return tr.rolling(window=period).mean()
def compute_adx(high, low, close, period=14):
tr = compute_atr(high, low, close, period)
plus_dm = (high - high.shift(1)).where((high - high.shift(1)) > (low.shift(1) - low), 0).fillna(0)
minus_dm = (low.shift(1) - low).where((low.shift(1) - low) > (high - high.shift(1)), 0).fillna(0)
plus_di = 100 * (plus_dm.rolling(window=period).sum() / tr)
minus_di = 100 * (minus_dm.rolling(window=period).sum() / tr)
dx = (abs(plus_di - minus_di) / (plus_di + minus_di)) * 100
return dx.rolling(window=period).mean()
# Load and preprocess data
def load_and_preprocess_data(file_path):
logger.info("Loading data...")
try:
df = pd.read_csv(file_path)
if 'time' not in df.columns:
logger.error("The CSV file must contain a 'time' column.")
sys.exit(1)
df['time'] = pd.to_datetime(df['time'], errors='coerce', utc=True)
invalid_time_count = df['time'].isna().sum()
if invalid_time_count > 0:
logger.warning(f"Dropping {invalid_time_count} rows with invalid datetime values.")
df = df.dropna(subset=['time'])
# Ensure required columns exist
required_columns = ['open', 'high', 'low', 'close', 'Volume']
for col in required_columns:
if col not in df.columns:
logger.warning(f"Missing column '{col}' in the data. Filling with default values.")
df[col] = 0
# Rename Volume column to lowercase for consistency
if 'Volume' in df.columns:
df.rename(columns={'Volume': 'volume'}, inplace=True)
except FileNotFoundError:
logger.error(f"File not found: {file_path}")
sys.exit(1)
except Exception as e:
logger.error(f"Error loading file: {e}")
sys.exit(1)
df['day'] = df['time'].dt.date
# Aggregate 5-minute data into daily data
daily_data = df.groupby('day').agg({
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last',
'volume': 'sum'
}).reset_index()
# Generate technical indicators
logger.info("Calculating technical indicators...")
daily_data['SMA_10'] = compute_sma(daily_data['close'], period=10)
daily_data['EMA_10'] = compute_ema(daily_data['close'], period=10)
daily_data['RSI'] = compute_rsi(daily_data['close'], period=14)
daily_data['MACD'], daily_data['MACD_signal'] = compute_macd(daily_data['close'])
daily_data['ATR'] = compute_atr(daily_data['high'], daily_data['low'], daily_data['close'], period=14)
daily_data['ADX'] = compute_adx(daily_data['high'], daily_data['low'], daily_data['close'], period=14)
# Drop NaN rows due to indicators
daily_data = daily_data.dropna()
# Scale data
logger.info("Scaling data...")
scaler = MinMaxScaler()
feature_columns = ['open', 'high', 'low', 'volume', 'SMA_10', 'EMA_10', 'RSI', 'MACD', 'MACD_signal', 'ATR', 'ADX']
scaled_features = scaler.fit_transform(daily_data[feature_columns])
return scaled_features, daily_data['close'].values, scaler
# Create sequences for LSTM
def create_sequences(data, target, window_size):
logger.info(f"Creating sequences with window size {window_size}...")
X, y = [], []
for i in range(len(data) - window_size):
X.append(data[i:i + window_size])
y.append(target[i + window_size])
return np.array(X), np.array(y)
# Define objective function for hyperparameter tuning
def objective(trial):
logger.info("Running Optuna trial...")
# Suggest hyperparameters
num_lstm_layers = trial.suggest_int("num_lstm_layers", 2, 4)
lstm_units = trial.suggest_int("lstm_units", 64, 256, step=64)
dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5, step=0.1)
learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)
# Build model
inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))
x = inputs
for _ in range(num_lstm_layers):
x = Bidirectional(LSTM(lstm_units, return_sequences=True, kernel_regularizer="l2"))(x)
x = Dropout(dropout_rate)(x)
x = LSTM(lstm_units, return_sequences=False, kernel_regularizer="l2")(x)
x = Dropout(dropout_rate)(x)
outputs = Dense(1, activation="linear")(x)
model = Model(inputs, outputs)
optimizer = Adam(learning_rate=learning_rate)
model.compile(optimizer=optimizer, loss="mean_squared_error", metrics=["mae"])
# Train model
early_stopping = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)
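    # NOTE: X_test doubles as the validation set here (and in the final fit
    # below), so early stopping peeks at the test data and the reported test
    # metrics will be optimistic. A separate validation split would avoid this.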
model.fit(
X_train, y_train,
validation_data=(X_test, y_test),
epochs=50,
batch_size=32,
callbacks=[early_stopping],
verbose=0
)
# Evaluate model
loss, mae = model.evaluate(X_test, y_test, verbose=0)
return mae
# Run hyperparameter tuning and train final model
if __name__ == "__main__":
if len(sys.argv) < 2:
logger.error("Please provide the CSV file path as an argument.")
sys.exit(1)
file_path = sys.argv[1] # Get the file path from command-line arguments
window_size = 30
# Load and preprocess data
data, target, scaler = load_and_preprocess_data(file_path)
# Create sequences
X, y = create_sequences(data, target, window_size)
# Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)  # chronological split; random_state is ignored when shuffle=False
# Run Optuna
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
# Train final model with best hyperparameters
best_params = study.best_params
logger.info(f"Best Hyperparameters: {best_params}")
inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))
x = inputs
for _ in range(best_params["num_lstm_layers"]):
x = Bidirectional(LSTM(best_params["lstm_units"], return_sequences=True, kernel_regularizer="l2"))(x)
x = Dropout(best_params["dropout_rate"])(x)
x = LSTM(best_params["lstm_units"], return_sequences=False, kernel_regularizer="l2")(x)
x = Dropout(best_params["dropout_rate"])(x)
outputs = Dense(1, activation="linear")(x)
model = Model(inputs, outputs)
optimizer = Adam(learning_rate=best_params["learning_rate"])
model.compile(optimizer=optimizer, loss="mean_squared_error", metrics=["mae"])
# Callbacks
early_stopping = EarlyStopping(monitor="val_loss", patience=15, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5, min_lr=1e-5)
logger.info("Training final model...")
history = model.fit(
X_train, y_train,
validation_data=(X_test, y_test),
epochs=300,
batch_size=32,
callbacks=[early_stopping, reduce_lr]
)
# Save model
model.save("optimized_lstm_model.keras")
logger.info("Model saved as optimized_lstm_model.keras.")
# Evaluate model
loss, mae = model.evaluate(X_test, y_test)
logger.info(f"Final Model Test Loss: {loss}, Test MAE: {mae}")
# Make predictions and plot
y_pred = model.predict(X_test)
plt.figure(figsize=(10, 6))
plt.plot(y_test, label="Actual Prices")
plt.plot(y_pred, label="Predicted Prices")
plt.legend()
plt.title("Model Prediction vs Actual")
plt.xlabel("Time Steps")
plt.ylabel("Price")
plt.savefig("prediction_vs_actual.png")
plt.show()
logger.info("Predictions complete and saved to plot.")


@@ -0,0 +1,452 @@
import os
import sys
import argparse # Added for argument parsing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import logging
from tabulate import tabulate
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout, Bidirectional
from tensorflow.keras.optimizers import Adam, Nadam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.losses import Huber
import xgboost as xgb
import optuna
from optuna.integration import KerasPruningCallback
# For Reinforcement Learning
import gym
from gym import spaces
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv
# To handle parallelization
import multiprocessing
# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # Suppress INFO and WARNING messages
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
# 1. Data Loading and Preprocessing
def load_data(file_path):
logging.info(f"Loading data from: {file_path}")
try:
# Parse 'time' column as dates
data = pd.read_csv(file_path, parse_dates=['time'])
except FileNotFoundError:
logging.error(f"File not found: {file_path}")
sys.exit(1)
except pd.errors.ParserError as e:
logging.error(f"Error parsing CSV file: {e}")
sys.exit(1)
except Exception as e:
logging.error(f"Unexpected error: {e}")
sys.exit(1)
# Rename columns to match script expectations
rename_mapping = {
'time': 'Date',
'open': 'Open',
'high': 'High',
'low': 'Low',
'close': 'Close'
}
data.rename(columns=rename_mapping, inplace=True)
logging.info(f"Data columns after renaming: {data.columns.tolist()}")
# Sort and reset index
data.sort_values('Date', inplace=True)
data.reset_index(drop=True, inplace=True)
logging.info("Data loaded and sorted successfully.")
return data
def compute_rsi(series, window=14):
delta = series.diff()
gain = (delta.where(delta > 0, 0)).rolling(window=window).mean()
loss = (-delta.where(delta < 0, 0)).rolling(window=window).mean()
RS = gain / loss
RSI = 100 - (100 / (1 + RS))
return RSI
def compute_macd(series, span_short=12, span_long=26, span_signal=9):
ema_short = series.ewm(span=span_short, adjust=False).mean()
ema_long = series.ewm(span=span_long, adjust=False).mean()
MACD = ema_short - ema_long
signal = MACD.ewm(span=span_signal, adjust=False).mean()
return MACD - signal
def compute_adx(df, window=14):
# Placeholder for ADX calculation
return df['Close'].rolling(window=window).std() # Simplistic placeholder
def compute_obv(df):
# On-Balance Volume calculation
OBV = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0).cumsum()
return OBV
def calculate_technical_indicators(df):
logging.info("Calculating technical indicators...")
df['SMA_5'] = df['Close'].rolling(window=5).mean()
df['SMA_10'] = df['Close'].rolling(window=10).mean()
df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
df['STDDEV_5'] = df['Close'].rolling(window=5).std()
df['RSI'] = compute_rsi(df['Close'], window=14)
df['MACD'] = compute_macd(df['Close'])
df['ADX'] = compute_adx(df)
df['OBV'] = compute_obv(df)
df.dropna(inplace=True) # Drop rows with NaN values after feature engineering
logging.info("Technical indicators calculated successfully.")
return df
# Argument Parsing
def parse_arguments():
parser = argparse.ArgumentParser(description='Train LSTM and DQN models for stock trading.')
parser.add_argument('csv_path', type=str, help='Path to the CSV data file.')
return parser.parse_args()
def main():
# Parse command-line arguments
args = parse_arguments()
csv_path = args.csv_path
# Load and preprocess data
data = load_data(csv_path)
data = calculate_technical_indicators(data)
# Feature selection
feature_columns = ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']
target_column = 'Close'
data = data[['Date'] + feature_columns + [target_column]]
data.dropna(inplace=True)
# Scaling
scaler_features = MinMaxScaler()
scaler_target = MinMaxScaler()
scaled_features = scaler_features.fit_transform(data[feature_columns])
scaled_target = scaler_target.fit_transform(data[[target_column]]).flatten()
# Create sequences for LSTM
def create_sequences(features, target, window_size=15):
X, y = [], []
for i in range(len(features) - window_size):
X.append(features[i:i+window_size])
y.append(target[i+window_size])
return np.array(X), np.array(y)
window_size = 15
X, y = create_sequences(scaled_features, scaled_target, window_size)
# Split data into training, validation, and testing sets
train_size = int(len(X) * 0.7)
val_size = int(len(X) * 0.15)
test_size = len(X) - train_size - val_size
X_train, X_val, X_test = X[:train_size], X[train_size:train_size+val_size], X[train_size+val_size:]
y_train, y_val, y_test = y[:train_size], y[train_size:train_size+val_size], y[train_size+val_size:]
logging.info(f"Scaled training features shape: {X_train.shape}")
logging.info(f"Scaled validation features shape: {X_val.shape}")
logging.info(f"Scaled testing features shape: {X_test.shape}")
logging.info(f"Scaled training target shape: {y_train.shape}")
logging.info(f"Scaled validation target shape: {y_val.shape}")
logging.info(f"Scaled testing target shape: {y_test.shape}")
# 2. Device Configuration
def configure_device():
gpus = tf.config.list_physical_devices('GPU')
if gpus:
try:
for gpu in gpus:
tf.config.experimental.set_memory_growth(gpu, True)
logging.info(f"{len(gpus)} GPU(s) detected and configured.")
except RuntimeError as e:
logging.error(e)
else:
logging.info("No GPU detected, using CPU.")
configure_device()
# 3. Model Building
def build_advanced_lstm(input_shape, hyperparams):
model = Sequential()
for i in range(hyperparams['num_lstm_layers']):
return_sequences = True if i < hyperparams['num_lstm_layers'] - 1 else False
model.add(Bidirectional(LSTM(
hyperparams['lstm_units'],
return_sequences=return_sequences,
kernel_regularizer=tf.keras.regularizers.l2(0.001)
)))
model.add(Dropout(hyperparams['dropout_rate']))
model.add(Dense(1, activation='linear'))
if hyperparams['optimizer'] == 'Adam':
optimizer = Adam(learning_rate=hyperparams['learning_rate'], decay=hyperparams['decay'])
elif hyperparams['optimizer'] == 'Nadam':
optimizer = Nadam(learning_rate=hyperparams['learning_rate'])
else:
optimizer = Adam(learning_rate=hyperparams['learning_rate'])
model.compile(optimizer=optimizer, loss=Huber(), metrics=['mae'])
return model
def build_xgboost_model(X_train, y_train, hyperparams):
model = xgb.XGBRegressor(
objective='reg:squarederror',
n_estimators=hyperparams['n_estimators'],
max_depth=hyperparams['max_depth'],
learning_rate=hyperparams['learning_rate'],
subsample=hyperparams['subsample'],
colsample_bytree=hyperparams['colsample_bytree'],
random_state=42,
n_jobs=-1
)
model.fit(X_train.reshape(X_train.shape[0], -1), y_train)
return model
# 4. Hyperparameter Tuning with Optuna
def objective(trial):
# Hyperparameter suggestions
num_lstm_layers = trial.suggest_int('num_lstm_layers', 1, 3)
lstm_units = trial.suggest_categorical('lstm_units', [32, 64, 96, 128])
dropout_rate = trial.suggest_float('dropout_rate', 0.1, 0.5)
learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-2)
optimizer_name = trial.suggest_categorical('optimizer', ['Adam', 'Nadam'])
decay = trial.suggest_float('decay', 0.0, 1e-4)
hyperparams = {
'num_lstm_layers': num_lstm_layers,
'lstm_units': lstm_units,
'dropout_rate': dropout_rate,
'learning_rate': learning_rate,
'optimizer': optimizer_name,
'decay': decay
}
model = build_advanced_lstm((X_train.shape[1], X_train.shape[2]), hyperparams)
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
lr_reduce = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)
history = model.fit(
X_train, y_train,
epochs=100,
batch_size=16,
validation_data=(X_val, y_val),
callbacks=[early_stop, lr_reduce, KerasPruningCallback(trial, 'val_loss')],
verbose=0
)
val_mae = min(history.history['val_mae'])
return val_mae
# Optuna study
logging.info("Starting hyperparameter optimization with Optuna...")
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
best_params = study.best_params
logging.info(f"Best Hyperparameters from Optuna: {best_params}")
# 5. Train the Best LSTM Model
best_model = build_advanced_lstm((X_train.shape[1], X_train.shape[2]), best_params)
early_stop = EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)
lr_reduce = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)
logging.info("Training the best LSTM model with optimized hyperparameters...")
history = best_model.fit(
X_train, y_train,
epochs=300,
batch_size=16,
validation_data=(X_val, y_val),
callbacks=[early_stop, lr_reduce],
verbose=1
)
# 6. Evaluate the Model
def evaluate_model(model, X_test, y_test):
logging.info("Evaluating model...")
y_pred_scaled = model.predict(X_test).flatten()
y_pred_scaled = np.clip(y_pred_scaled, 0, 1) # Ensure predictions are within [0,1]
y_pred = scaler_target.inverse_transform(y_pred_scaled.reshape(-1, 1)).flatten()
y_test_actual = scaler_target.inverse_transform(y_test.reshape(-1, 1)).flatten()
mse = mean_squared_error(y_test_actual, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test_actual, y_pred)
r2 = r2_score(y_test_actual, y_pred)
# Directional Accuracy
direction_actual = np.sign(np.diff(y_test_actual))
direction_pred = np.sign(np.diff(y_pred))
directional_accuracy = np.mean(direction_actual == direction_pred)
logging.info(f"Test MSE: {mse}")
logging.info(f"Test RMSE: {rmse}")
logging.info(f"Test MAE: {mae}")
logging.info(f"Test R2 Score: {r2}")
logging.info(f"Directional Accuracy: {directional_accuracy}")
# Plot Actual vs Predicted
plt.figure(figsize=(14, 7))
plt.plot(y_test_actual, label='Actual Price')
plt.plot(y_pred, label='Predicted Price')
plt.title('Actual vs Predicted Prices')
plt.xlabel('Time Step')
plt.ylabel('Price')
plt.legend()
plt.grid(True)
plt.savefig('actual_vs_predicted.png') # Save the plot
plt.close()
logging.info("Actual vs Predicted plot saved as 'actual_vs_predicted.png'")
# Tabulate first 40 predictions
table = [[i, round(actual, 2), round(pred, 2)] for i, (actual, pred) in enumerate(zip(y_test_actual[:40], y_pred[:40]))]
headers = ["Index", "Actual Price", "Predicted Price"]
print(tabulate(table, headers=headers, tablefmt="pretty"))
return mse, rmse, mae, r2, directional_accuracy
mse, rmse, mae, r2, directional_accuracy = evaluate_model(best_model, X_test, y_test)
# 7. Save the Model and Scalers
best_model.save('optimized_lstm_model.h5')
import joblib
joblib.dump(scaler_features, 'scaler_features.save')
joblib.dump(scaler_target, 'scaler_target.save')
logging.info("Model and scalers saved as 'optimized_lstm_model.h5', 'scaler_features.save', and 'scaler_target.save'.")
# 8. Reinforcement Learning: Deep Q-Learning for Trading Actions
class StockTradingEnv(gym.Env):
"""
A simple stock trading environment for OpenAI gym
"""
metadata = {'render.modes': ['human']}
def __init__(self, df, initial_balance=10000):
super(StockTradingEnv, self).__init__()
self.df = df.reset_index()
self.initial_balance = initial_balance
self.balance = initial_balance
self.net_worth = initial_balance
self.max_steps = len(df)
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
# Actions: 0 = Sell, 1 = Hold, 2 = Buy
self.action_space = spaces.Discrete(3)
# Observations: [normalized features + balance + shares held + cost basis]
self.observation_space = spaces.Box(low=0, high=1, shape=(len(feature_columns) + 3,), dtype=np.float32)
def reset(self):
self.balance = self.initial_balance
self.net_worth = self.initial_balance
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
return self._next_observation()
def _next_observation(self):
obs = self.df.loc[self.current_step, feature_columns].values
        # Normalize features by their max to keep them roughly in [0, 1];
        # guard against an all-zero row to avoid division by zero
        max_val = np.max(obs)
        obs = obs / max_val if max_val != 0 else obs
# Append balance, shares held, and cost basis
additional = np.array([
self.balance / self.initial_balance,
self.shares_held / 100, # Assuming a maximum of 100 shares for normalization
self.cost_basis / self.initial_balance
])
return np.concatenate([obs, additional])
def step(self, action):
current_price = self.df.loc[self.current_step, 'Close']
if action == 2: # Buy
            total_possible = self.balance // current_price
            shares_bought = int(total_possible)
if shares_bought > 0:
self.balance -= shares_bought * current_price
self.shares_held += shares_bought
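                # Weighted-average cost basis; shares_held already includes
                # the shares just bought, so subtract them to recover the
                # previous position size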
self.cost_basis = (self.cost_basis * (self.shares_held - shares_bought) + shares_bought * current_price) / self.shares_held
elif action == 0: # Sell
if self.shares_held > 0:
self.balance += self.shares_held * current_price
self.shares_held = 0
self.cost_basis = 0
# Hold does nothing
self.net_worth = self.balance + self.shares_held * current_price
self.current_step += 1
done = self.current_step >= self.max_steps - 1
        # Reward: cumulative profit (net worth minus starting balance),
        # not the per-step change in net worth
        reward = self.net_worth - self.initial_balance
obs = self._next_observation()
return obs, reward, done, {}
def render(self, mode='human', close=False):
profit = self.net_worth - self.initial_balance
print(f'Step: {self.current_step}')
print(f'Balance: {self.balance}')
print(f'Shares held: {self.shares_held} (Cost Basis: {self.cost_basis})')
print(f'Net worth: {self.net_worth}')
print(f'Profit: {profit}')
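# Quick smoke test (illustrative only, never called here): roll the
# environment forward with random actions to check observation shapes and
# reward bookkeeping before handing it to the DQN.
def _smoke_test_env(df, n_steps=5):
    env = StockTradingEnv(df)
    obs = env.reset()
    for _ in range(n_steps):
        obs, reward, done, _ = env.step(env.action_space.sample())
        if done:
            break
    return obs.shape, reward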
def train_dqn_agent(env):
logging.info("Training DQN Agent...")
try:
model = DQN(
'MlpPolicy',
env,
verbose=1,
learning_rate=1e-3,
buffer_size=10000,
learning_starts=1000,
batch_size=64,
tau=1.0,
gamma=0.99,
train_freq=4,
target_update_interval=1000,
exploration_fraction=0.1,
exploration_final_eps=0.02,
tensorboard_log="./dqn_stock_tensorboard/"
)
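        # 100k timesteps spans many episodes; DummyVecEnv auto-resets the env
        # each time it reaches the end of the dataframe (one episode is
        # roughly len(df) steps)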
model.learn(total_timesteps=100000)
model.save("dqn_stock_trading")
logging.info("DQN Agent trained and saved as 'dqn_stock_trading.zip'.")
return model
except Exception as e:
logging.error(f"Error training DQN Agent: {e}")
sys.exit(1)
# Initialize trading environment
trading_env = StockTradingEnv(data)
trading_env = DummyVecEnv([lambda: trading_env])
# Train DQN agent
dqn_model = train_dqn_agent(trading_env)
if __name__ == "__main__":
main()

Binary files in this commit are not shown in the diff (one is a 56 KiB image).

@@ -0,0 +1,9 @@
- OS: Linux-6.1.0-30-amd64-x86_64-with-glibc2.36 # 1 SMP PREEMPT_DYNAMIC Debian 6.1.124-1 (2025-01-12)
- Python: 3.11.2
- Stable-Baselines3: 2.4.1
- PyTorch: 2.5.1+cu124
- GPU Enabled: False
- Numpy: 1.26.4
- Cloudpickle: 3.1.1
- Gymnasium: 1.0.0
- OpenAI Gym: 0.26.2


@@ -0,0 +1,241 @@
import argparse
import gym
import numpy as np
import pandas as pd
from tabulate import tabulate
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv
###############################
# 1. HELPER FUNCTIONS
###############################
def compute_rsi(series, window=14):
delta = series.diff()
gain = (delta.where(delta > 0, 0)).rolling(window=window).mean()
loss = (-delta.where(delta < 0, 0)).rolling(window=window).mean()
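    # If loss is 0 over the window, RS is inf (RSI -> 100) or NaN (0/0),
    # and NaN rows are dropped later by dropna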
RS = gain / loss
RSI = 100 - (100 / (1 + RS))
return RSI
def compute_macd(series, span_short=12, span_long=26, span_signal=9):
ema_short = series.ewm(span=span_short, adjust=False).mean()
ema_long = series.ewm(span=span_long, adjust=False).mean()
macd_line = ema_short - ema_long
signal_line = macd_line.ewm(span=span_signal, adjust=False).mean()
return macd_line - signal_line # MACD histogram
def compute_adx(df, window=14):
    # Placeholder: rolling std of Close stands in for a true ADX calculation
return df['Close'].rolling(window=window).std()
def compute_obv(df):
# On-Balance Volume calculation
OBV = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0).cumsum()
return OBV
def compute_technical_indicators(df):
df['SMA_5'] = df['Close'].rolling(window=5).mean()
df['SMA_10'] = df['Close'].rolling(window=10).mean()
df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
df['STDDEV_5'] = df['Close'].rolling(window=5).std()
df['RSI'] = compute_rsi(df['Close'], 14)
df['MACD'] = compute_macd(df['Close'])
df['ADX'] = compute_adx(df)
df['OBV'] = compute_obv(df)
df.dropna(inplace=True)
return df
###############################
# 2. ENVIRONMENT DEFINITION
###############################
class StockTradingEnv(gym.Env):
"""
Simple environment using older Gym API for SB3.
"""
def __init__(self, df, initial_balance=10000):
super().__init__()
self.df = df.reset_index(drop=True)
self.initial_balance = initial_balance
self.balance = initial_balance
self.net_worth = initial_balance
self.max_steps = len(df)
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
self.feature_columns = [
'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5',
'RSI', 'MACD', 'ADX', 'OBV', 'Volume',
'Open', 'High', 'Low'
]
self.action_space = gym.spaces.Discrete(3)
self.observation_space = gym.spaces.Box(
low=0.0, high=1.0,
shape=(len(self.feature_columns)+3,),
dtype=np.float32
)
def reset(self):
self.balance = self.initial_balance
self.net_worth = self.initial_balance
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
return self._next_observation()
def step(self, action):
current_price = self.df.loc[self.current_step, 'Close']
# BUY
if action == 2:
total_possible = self.balance // current_price
shares_bought = int(total_possible)
if shares_bought > 0:
prev_shares = self.shares_held
self.balance -= shares_bought * current_price
self.shares_held += shares_bought
self.cost_basis = (
(self.cost_basis * prev_shares) + (shares_bought * current_price)
) / self.shares_held
# SELL
elif action == 0:
if self.shares_held > 0:
self.balance += self.shares_held * current_price
self.shares_held = 0
self.cost_basis = 0
self.net_worth = self.balance + self.shares_held * current_price
self.current_step += 1
done = (self.current_step >= self.max_steps - 1)
reward = self.net_worth - self.initial_balance
obs = self._next_observation()
return obs, reward, done, {}
def _next_observation(self):
row = self.df.loc[self.current_step, self.feature_columns].values
max_val = np.max(row) if np.max(row) != 0 else 1.0
row_norm = row / max_val
additional = np.array([
self.balance / self.initial_balance,
self.shares_held / 100.0,
self.cost_basis / self.initial_balance
], dtype=np.float32)
obs = np.concatenate([row_norm, additional]).astype(np.float32)
return obs
def render(self, mode='human'):
profit = self.net_worth - self.initial_balance
print(f"Step: {self.current_step} | "
f"Balance: {self.balance:.2f} | "
f"Shares: {self.shares_held} | "
f"NetWorth: {self.net_worth:.2f} | "
f"Profit: {profit:.2f}")
###############################
# 3. ARGUMENT PARSING
###############################
def parse_arguments():
parser = argparse.ArgumentParser(description="Use a trained DQN model to run a stock trading simulation.")
parser.add_argument("-s", "--show-steps", type=int, default=15,
help="Number of final steps to display in the summary (default: 15, max: 300).")
return parser.parse_args()
###############################
# 4. MAIN FUNCTION
###############################
def main():
args = parse_arguments()
# Bound how many steps we show at the end
steps_to_display = min(args.show_steps, 300)
# 1) Load CSV
df = pd.read_csv('BAT.csv')
rename_mapping = {
'time': 'Date',
'open': 'Open',
'high': 'High',
'low': 'Low',
'close': 'Close'
}
    df.rename(columns=rename_mapping, inplace=True)
    # Normalize the volume column name *before* computing indicators:
    # compute_obv() reads df['Volume'], so renaming afterwards would
    # raise a KeyError on CSVs with a lowercase 'volume' column
    if 'volume' in df.columns and 'Volume' not in df.columns:
        df.rename(columns={'volume': 'Volume'}, inplace=True)
    df.sort_values('Date', inplace=True)
    df.reset_index(drop=True, inplace=True)
    df = compute_technical_indicators(df)
# 2) Instantiate environment
raw_env = StockTradingEnv(df)
vec_env = DummyVecEnv([lambda: raw_env])
# 3) Load your DQN model
model = DQN.load("dqn_stock_trading.zip", env=vec_env)
    # 4) Run inference
    # Step the unwrapped environment directly: DummyVecEnv auto-resets an env
    # the moment its episode ends, which would wipe the final balance and net
    # worth before the summary below could read them.
    obs = raw_env.reset()
    done = False
    total_reward = 0.0
    step_data = []
    step_count = 0
    while not done:
        step_count += 1
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, done, info = raw_env.step(int(action))
        total_reward += reward
        step_data.append({
            "Step": step_count,
            "Action": int(action),
            "Reward": reward,
            "Balance": raw_env.balance,
            "Shares": raw_env.shares_held,
            "NetWorth": raw_env.net_worth
        })
    final_net_worth = raw_env.net_worth
    final_profit = final_net_worth - raw_env.initial_balance
# 5) Print final summary
print("\n=== DQN Agent Finished ===")
print(f"Total Steps Taken: {step_count}")
print(f"Final Net Worth: {final_net_worth:.2f}")
print(f"Final Profit: {final_profit:.2f}")
print(f"Sum of Rewards: {total_reward:.2f}")
# Count actions
buy_count = sum(1 for x in step_data if x["Action"] == 2)
sell_count = sum(1 for x in step_data if x["Action"] == 0)
hold_count = sum(1 for x in step_data if x["Action"] == 1)
print(f"Actions Taken -> BUY: {buy_count}, SELL: {sell_count}, HOLD: {hold_count}")
# 6) Show the last N steps, where N=steps_to_display
last_n = step_data[-steps_to_display:] if len(step_data) > steps_to_display else step_data
rows = []
for d in last_n:
rows.append([
d["Step"], d["Action"], f"{d['Reward']:.2f}",
f"{d['Balance']:.2f}", d["Shares"], f"{d['NetWorth']:.2f}"
])
headers = ["Step", "Action", "Reward", "Balance", "Shares", "NetWorth"]
print(f"\n== Last {steps_to_display} Steps ==")
print(tabulate(rows, headers=headers, tablefmt="pretty"))
if __name__ == "__main__":
main()


@@ -0,0 +1,134 @@
import os
import sys
import pandas as pd
import tensorflow as tf
from stable_baselines3.common.vec_env import DummyVecEnv
import gym
from gym import spaces
import numpy as np
import logging
# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # Suppress INFO and WARNING messages
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
class StockTradingEnv(gym.Env):
"""
A minimal stock trading environment for testing DummyVecEnv.
"""
metadata = {'render.modes': ['human']}
def __init__(self, df, initial_balance=10000):
super(StockTradingEnv, self).__init__()
self.df = df.reset_index()
self.initial_balance = initial_balance
self.balance = initial_balance
self.net_worth = initial_balance
self.max_steps = len(df)
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
# Actions: 0 = Sell, 1 = Hold, 2 = Buy
self.action_space = spaces.Discrete(3)
# Observations: [normalized features + balance + shares held + cost basis]
# For simplicity, we'll use the same features as your main script
feature_columns = ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']
self.observation_space = spaces.Box(low=0, high=1, shape=(len(feature_columns) + 3,), dtype=np.float32)
def reset(self):
self.balance = self.initial_balance
self.net_worth = self.initial_balance
self.current_step = 0
self.shares_held = 0
self.cost_basis = 0
return self._next_observation()
def _next_observation(self):
obs = self.df.loc[self.current_step, ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']].values
        # Append account state scaled to roughly [0, 1] (the raw feature
        # values themselves are not normalized in this minimal test env)
additional = np.array([
self.balance / self.initial_balance,
self.shares_held / 100, # Assuming a maximum of 100 shares for normalization
self.cost_basis / self.initial_balance
])
return np.concatenate([obs, additional])
def step(self, action):
current_price = self.df.loc[self.current_step, 'Close']
if action == 2: # Buy
total_possible = self.balance // current_price
shares_bought = total_possible
if shares_bought > 0:
self.balance -= shares_bought * current_price
self.shares_held += shares_bought
self.cost_basis = (self.cost_basis * (self.shares_held - shares_bought) + shares_bought * current_price) / self.shares_held
elif action == 0: # Sell
if self.shares_held > 0:
self.balance += self.shares_held * current_price
self.shares_held = 0
self.cost_basis = 0
# Hold does nothing
self.net_worth = self.balance + self.shares_held * current_price
self.current_step += 1
done = self.current_step >= self.max_steps - 1
        # Reward: cumulative profit (net worth minus starting balance)
        reward = self.net_worth - self.initial_balance
obs = self._next_observation()
return obs, reward, done, {}
def render(self, mode='human', close=False):
profit = self.net_worth - self.initial_balance
print(f'Step: {self.current_step}')
print(f'Balance: {self.balance}')
print(f'Shares held: {self.shares_held} (Cost Basis: {self.cost_basis})')
print(f'Net worth: {self.net_worth}')
print(f'Profit: {profit}')
def main(file_path):
# Check if file exists
if not os.path.exists(file_path):
logging.error(f"File not found: {file_path}")
sys.exit(1)
logging.info("File exists.")
# Load a small portion of the data
try:
data = pd.read_csv(file_path, nrows=5)
logging.info("Data loaded successfully:")
print(data.head())
except Exception as e:
logging.error(f"Error loading data: {e}")
sys.exit(1)
# Check TensorFlow GPU availability
gpus = tf.config.list_physical_devices('GPU')
logging.info("TensorFlow GPU Availability:")
print(gpus)
# Check DummyVecEnv import and initialization
try:
# Initialize a minimal environment for testing
test_env = StockTradingEnv(data)
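        # NOTE: `data` holds only the first 5 raw CSV rows, with none of the
        # indicator columns, so constructing the env works but reset()/step()
        # would raise a KeyError; this only verifies imports and wiring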
env = DummyVecEnv([lambda: test_env])
logging.info("DummyVecEnv imported and initialized successfully.")
except Exception as e:
logging.error(f"Error initializing DummyVecEnv: {e}")
if __name__ == "__main__":
if len(sys.argv) != 2:
logging.error("Usage: python verify_setup.py <path_to_csv>")
sys.exit(1)
file_path = sys.argv[1]
main(file_path)