pushing this all to git before cleaning it. putting this thing on my fucking Dell PowerEdge bc god knows im not waiting 2 hours to run another ML test on this 12-core machine
This commit is contained in:
20227
src/Machine-Learning/LSTM-python/BAT.csv
Normal file
File diff suppressed because it is too large
268
src/Machine-Learning/LSTM-python/README.md
Normal file
@@ -0,0 +1,268 @@

# I got Bitched. Fuck Python

Below is a breakdown of how **`main.py`** works, what files it generates, and how all the pieces (LSTM model training, Optuna tuning, DQN trading agent, etc.) fit together. I'll also cover the CSV structure, how to interpret the metrics, how to handle the various files, and how to potentially adapt this code for real-time predictions or a live trading bot.

---

## 1. **High-Level Flow of `main.py`**

1. **Imports & Logging Setup**

   - Imports all necessary libraries (numpy, pandas, sklearn, TensorFlow, XGBoost, Optuna, Stable Baselines, etc.).
   - Sets up basic logging with timestamps and message levels.

2. **Argument Parsing**

   - Uses `argparse` to expect one argument: the path to the CSV file.
   - You run the script like `python main.py your_data.csv`.

3. **Data Loading & Preprocessing**

   - **`load_data(file_path)`**: Reads the CSV with pandas, renames columns (e.g., `time` → `Date`, `open` → `Open`, etc.), sorts by date, and returns a cleaned DataFrame.
   - **`calculate_technical_indicators(df)`**:
     - Computes various indicators (SMA, EMA, RSI, MACD, ADX, OBV).
     - Drops rows with `NaN` values (because rolling windows produce NaNs at the start of the series).
   - After these steps, the data is ready for feature selection.

4. **Feature Selection & Scaling**

   - Chooses certain columns as features (`feature_columns`) plus the target (`Close`).
   - Scales features and target with `MinMaxScaler`, which normalizes values to [0, 1].
   - Prepares sequences of length `window_size` (default = 15) for LSTM training via **`create_sequences()`** (see the sketch after this list).

5. **Train/Validation/Test Split**

   - Splits sequences into 70% for training, 15% for validation, and 15% for testing.
   - This yields `X_train`, `X_val`, `X_test` and the corresponding `y_train`, `y_val`, `y_test`.

6. **Device Configuration**

   - Checks for GPUs and configures TensorFlow to allow memory growth on the available GPU(s).

7. **Model Building and Hyperparameter Tuning**

   - **`build_advanced_lstm(...)`**: Creates a multi-layer, bidirectional LSTM with optional dropout, a user-defined optimizer, learning rate, etc.
   - **Optuna**:
     - The `objective(trial)` function defines which hyperparameters to search.
     - It trains an LSTM for each set of hyperparameters.
     - Minimizes the validation MAE to find the best combination.
   - Runs `study.optimize(objective, n_trials=50)` to try up to 50 hyperparameter sets.
   - The best hyperparameters are then retrieved (`study.best_params`).

8. **Train the Best LSTM Model**

   - Re-builds the LSTM with the best hyperparameters found.
   - Uses callbacks (`EarlyStopping`, `ReduceLROnPlateau`) for better generalization.
   - Trains up to 300 epochs or until early stopping.

9. **Model Evaluation**

   - **`evaluate_model(...)`**:
     - Gets predictions (`model.predict(X_test)`).
     - Inverse-transforms them to the original scale (because the targets were scaled).
     - Computes **MSE**, **RMSE**, **MAE**, **R²**, and **directional accuracy**.
     - Saves a plot called **`actual_vs_predicted.png`** comparing actual and predicted test prices.
     - Prints the first 40 predictions in a tabular format.

10. **Save Model & Scalers**

    - Saves the Keras model to **`optimized_lstm_model.h5`**.
      - _(Keras warns this is a legacy format and suggests the `.keras` file extension; more on that later.)_
    - Saves the scalers to **`scaler_features.save`** and **`scaler_target.save`** using `joblib`.

11. **Reinforcement Learning (DQN) Setup**

    - Defines a custom Gym environment **`StockTradingEnv`** that simulates a simple stock trading scenario:
      - Discrete action space: **0 = Sell**, **1 = Hold**, **2 = Buy**.
      - Observations: scaled features + current balance + shares held + cost basis.
      - Reward is the change in net worth.
    - Uses **Stable Baselines 3** (`DQN`) to train an agent in that environment for 100,000 timesteps.
    - Saves the agent as **`dqn_stock_trading.zip`**.

12. **Finally**:

    - The script ends after the RL agent training completes.
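
For reference, here is the windowing that `create_sequences()` performs, mirroring the function defined in `main.py`: each training sample is the previous `window_size` rows of scaled features, and its label is the next scaled `Close` value.

```python
import numpy as np

def create_sequences(features, target, window_size=15):
    """Build (samples, window_size, n_features) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(features) - window_size):
        X.append(features[i:i + window_size])   # the trailing window of features
        y.append(target[i + window_size])       # the value right after the window
    return np.array(X), np.array(y)

# e.g. scaled_features.shape == (N, n_features), scaled_target.shape == (N,)
# X, y = create_sequences(scaled_features, scaled_target, window_size=15)
```
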
---

## 2. **Explanation of Generated Files**

After you run `main.py`, you typically end up with these files:

1. **`optimized_lstm_model.h5`**

   - The final trained LSTM model, saved in the HDF5 format.
   - **Keras** now recommends `model.save('optimized_lstm_model.keras')` or `keras.saving.save_model(model, 'optimized_lstm_model.keras')` for the more modern format.

2. **`scaler_features.save`**

   - A `joblib` file containing the fitted `MinMaxScaler` for the features.

3. **`scaler_target.save`**

   - Another `joblib` file for the target variable's `MinMaxScaler`. (A sketch of reloading the model and both scalers follows this list.)

4. **`actual_vs_predicted.png`**

   - A PNG plot comparing the actual vs. predicted close prices from the test set.

5. **`dqn_stock_trading.zip`**

   - The trained DQN agent from Stable Baselines 3.

6. **`dqn_stock_tensorboard/`**

   - A directory containing TensorBoard logs for the DQN training process.
   - You can inspect these logs by running `tensorboard --logdir=./dqn_stock_tensorboard`.

7. **Other legacy or auxiliary files** you may have in the same folder:

   - **`enhanced_lstm_model.h5`**, **`prediction_vs_actual.png`**, **`policy.pth`**, **`_stable_baselines3_version`**, etc., come from old runs or intermediate attempts. You can clean them up if you no longer need them.
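
A minimal sketch of reloading these artifacts later for inference, assuming the file names listed above; `predict_next_close` and `window` are illustrative names, not functions from `main.py`:

```python
import joblib
import numpy as np
from tensorflow.keras.models import load_model

# Load the trained model and the fitted scalers saved by main.py
model = load_model("optimized_lstm_model.h5")        # or "optimized_lstm_model.keras"
scaler_features = joblib.load("scaler_features.save")
scaler_target = joblib.load("scaler_target.save")

def predict_next_close(window):
    """window: the last `window_size` rows of raw feature values,
    shaped (window_size, n_features), in the training column order."""
    scaled = scaler_features.transform(window)              # scale to [0, 1]
    y_scaled = model.predict(scaled[np.newaxis, ...])       # add a batch dimension
    return scaler_target.inverse_transform(y_scaled)[0, 0]  # back to price units
```
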
---

## 3. **Your CSV (`time,open,high,low,close,Volume`)**

An example snippet:

```
time,open,high,low,close,Volume
2024-01-08T09:30:00-05:00,59.23,59.69,59.03,59.53,4335
...
```

- The script renames these columns to `Date`, `Open`, `High`, `Low`, `Close`, `Volume`.
- It sorts by `Date` and then computes the technical indicators.

Because the script sorts by `Date` and expects ascending order, make sure all timestamps follow the same format so they parse and sort correctly (a quick check is sketched below).
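
If you want to sanity-check the timestamp column before training, a small standalone snippet like this (not part of `main.py`; `your_data.csv` is a placeholder) will surface rows that don't parse and confirm the ordering:

```python
import pandas as pd

df = pd.read_csv("your_data.csv")

# Parse the ISO-8601 timestamps; malformed values become NaT instead of raising
parsed = pd.to_datetime(df["time"], errors="coerce", utc=True)

bad_rows = df[parsed.isna()]
if not bad_rows.empty:
    print(f"{len(bad_rows)} rows have unparseable timestamps:")
    print(bad_rows.head())

# Confirm the data is already in ascending order
print("Already sorted:", parsed.is_monotonic_increasing)
```
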
---

## 4. **Interpreting the Evaluation Metrics**

When it finishes, the script prints:

- **MSE (Mean Squared Error)**
- **RMSE (Root MSE)**
- **MAE (Mean Absolute Error)**
- **R² (Coefficient of Determination)**
- **Directional Accuracy** (defined in the sketch after this list)
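
Directional accuracy here is the fraction of timesteps where the predicted price move has the same sign as the actual move, i.e. the same calculation the evaluation code performs:

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Fraction of steps where predicted and actual price changes share a sign."""
    direction_true = np.sign(np.diff(y_true))
    direction_pred = np.sign(np.diff(y_pred))
    return np.mean(direction_true == direction_pred)
```
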
### Example Output Interpretation

- **MSE: 0.0838** → The average of the squared errors on the original (inverse-transformed) price scale is relatively low.
- **RMSE: 0.2895** → The square root of that MSE, so on average the model's predictions deviate by about 0.29 from the actual close price.
- **MAE: 0.1836** → On average, the absolute deviation is ~0.18.
- **R²: 0.993** → Very high; it suggests the model explains 99.3% of the variance in price.

However, a **directional accuracy** of ~0.48 suggests the model is not great at predicting whether the price goes up or down from one timestep to the next. It's close to random guessing (50%). This can happen if the model is good at capturing overall magnitude but not short-term direction.

If you need the model to be directionally correct more often (for trading), consider the options below; a sketch of the target transformation follows this list.

- Shifting the target to be the price change or return (rather than the absolute price).
- Using a classification-based approach (up/down) or building a custom loss function that focuses more on directional accuracy.
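
A minimal sketch of both options, assuming a DataFrame with the script's `Close` column; `make_direction_targets` is an illustrative helper, not part of `main.py`. Either output would replace the scaled price target before `create_sequences()` is called:

```python
import pandas as pd

def make_direction_targets(df: pd.DataFrame):
    """Build alternative targets from the Close column."""
    close = df["Close"]

    # Option 1: regression on the next-step percentage return instead of raw price
    next_return = close.pct_change().shift(-1)

    # Option 2: binary classification label, 1 if the next close is higher
    up_down = (close.shift(-1) > close).astype(int)

    # The last row has no "next" value, so drop it before training
    return next_return.iloc[:-1], up_down.iloc[:-1]
```
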
---

## 5. **How to Improve or Change the Metric Outputs**

1. **Custom Metrics**:

   - You can add them to `model.compile(metrics=[...])` if they're supported by Keras.
   - Or you can compute them manually in the `evaluate_model` function (as you already do for R², directional accuracy, etc.).

2. **Reducing Warnings**:

   - **HDF5 warning**: Instead of `best_model.save('optimized_lstm_model.h5')`, do:

     ```python
     best_model.save('optimized_lstm_model.keras')
     ```

     Or:

     ```python
     keras.saving.save_model(best_model, 'optimized_lstm_model.keras')
     ```

   - **Gym vs. Gymnasium warning** in Stable Baselines:

     - You can switch to Gymnasium by installing `gymnasium` and adapting the environment accordingly (see the sketch after this list):

       ```python
       import gymnasium as gym
       ```

     - But as long as it's working, the warning is mostly informational.

3. **Remove Unused Files**:

   - If certain files are no longer used or were generated by old runs, just delete them to keep your workspace clean.
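
The practical difference is the method signatures: under Gymnasium, `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`. Here is a toy, self-contained sketch of those signatures (`MinimalTradingEnv` is illustrative, not a drop-in replacement for `StockTradingEnv`):

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces

class MinimalTradingEnv(gym.Env):
    """Toy environment illustrating the Gymnasium reset()/step() signatures."""

    def __init__(self, prices):
        super().__init__()
        self.prices = np.asarray(prices, dtype=np.float32)
        self.action_space = spaces.Discrete(3)      # 0=Sell, 1=Hold, 2=Buy
        self.observation_space = spaces.Box(low=0.0, high=np.inf, shape=(1,), dtype=np.float32)
        self.t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)                    # seeds self.np_random
        self.t = 0
        return self.prices[self.t:self.t + 1], {}   # (observation, info)

    def step(self, action):
        self.t += 1
        obs = self.prices[self.t:self.t + 1]
        reward = 0.0                                 # plug your PnL logic in here
        terminated = self.t >= len(self.prices) - 1  # ran out of historical bars
        truncated = False                            # no external time limit
        return obs, reward, terminated, truncated, {}
```
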
---

## 6. **Using the DQN Agent**

### How It's Being Trained

- **`StockTradingEnv`** is a simplified environment that steps through your historical data row by row (`self.max_steps = len(df)`).
- Each step, you pick an action (Sell, Hold, or Buy).
- The environment updates your balance, shares held, cost basis, and net worth accordingly.
- The reward is `(net_worth - initial_balance)`, i.e. how much you've gained or lost.

### How to Deploy It

1. **After Training**: You have **`dqn_stock_trading.zip`** saved.

2. **Load the Model** in a separate script or Jupyter notebook:

   ```python
   from stable_baselines3 import DQN
   from stable_baselines3.common.vec_env import DummyVecEnv

   # Recreate the same environment
   env = StockTradingEnv(your_dataframe)
   env = DummyVecEnv([lambda: env])

   # Load the trained agent
   model = DQN.load("dqn_stock_trading.zip", env=env)
   ```

3. **Run Predictions**:

   ```python
   obs = env.reset()
   done = False
   while not done:
       # Model predicts the best action
       action, _states = model.predict(obs, deterministic=True)
       obs, reward, done, info = env.step(action)
       env.render()
   ```

This will step through the environment again, but now with your trained agent. In a real-time scenario, you'd need a streaming environment that updates with new data in small increments (e.g., each new minute's bar).
---

## 7. **Transition to Real-Time ("Live") Predictions**

1. **Live Price Feed**:

   - You would replace the static CSV with a real-time feed (e.g., an API from a broker or a data provider).
   - Keep a rolling window of the last `window_size` data points and compute your indicators on the fly.

2. **Online or Incremental Updates**:

   - For an LSTM, you typically retrain or fine-tune it with new data over time, or load the existing model and just do forward passes for the new window.
   - The code that constructs sequences would run each time you get a new data point, but typically you'd keep a queue or buffer of the recent `N` bars (see the sketch after this list).

3. **Deploying the DQN**:

   - Similarly, in a real environment, each new bar triggers `env.step(action)`. The environment's "current step" is the latest bar.
   - You might have to rewrite the environment's logic so it only advances by one bar at a time in real time, rather than iterating over the entire historical dataset.
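
A minimal sketch of that rolling buffer, assuming the saved model and scalers from Section 2; `on_new_bar` and `bar_features` are illustrative names, and wiring this up to an actual data feed is left out:

```python
from collections import deque

import joblib
import numpy as np
from tensorflow.keras.models import load_model

WINDOW_SIZE = 15
model = load_model("optimized_lstm_model.keras")          # or the .h5 file
scaler_features = joblib.load("scaler_features.save")
scaler_target = joblib.load("scaler_target.save")

# Holds the most recent WINDOW_SIZE feature rows; old rows fall off automatically
buffer = deque(maxlen=WINDOW_SIZE)

def on_new_bar(bar_features):
    """Call this every time the feed delivers a new bar's feature row."""
    buffer.append(bar_features)
    if len(buffer) < WINDOW_SIZE:
        return None                                        # not enough history yet

    window = scaler_features.transform(np.array(buffer))
    y_scaled = model.predict(window[np.newaxis, ...], verbose=0)
    return scaler_target.inverse_transform(y_scaled)[0, 0]
```
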
---

## 8. **Summary**

- **`main.py`** orchestrates:

  1. Data Loading + Preprocessing
  2. Feature Engineering (SMA, EMA, RSI, MACD, ADX, OBV)
  3. LSTM Hyperparameter Tuning with Optuna
  4. Best Model Training + Saving + Evaluation
  5. Simple RL Environment + DQN Training + Saving

- **Key Files Generated**:

  - `optimized_lstm_model.h5` (or `.keras`) → your final Keras LSTM model.
  - `scaler_features.save`, `scaler_target.save` → joblib-saved scalers.
  - `actual_vs_predicted.png` → visual of test-set predictions.
  - `dqn_stock_trading.zip` → trained RL agent.
  - `dqn_stock_tensorboard/` → logs for the RL training.

- **Interpreting Metrics**:

  - High R² with lower directional accuracy implies the model fits magnitudes well but struggles with sign changes.
  - Potential improvement: feature engineering for short-term direction, or a classification approach for up vs. down.

- **Using the DQN Agent**:

  - `StockTradingEnv` is a toy environment stepping over historical data.
  - Real-time adaptation requires modifying how the environment receives data.

- **Warnings**:

  - Switch `.h5` → `.keras` to remove the Keras format warning.
  - Possibly switch from Gym to Gymnasium to remove the stable-baselines3 compatibility warning.

---

### Next Steps / Tips

1. **Clean Up Legacy Files**: If you have old models or references (like `enhanced_lstm_model.h5`), remove or rename them.
2. **Custom Loss / Custom Metrics**: If you want to focus on direction, consider a custom loss function or a classification-based approach.
3. **Try Different RL Algorithms**: DQN is just one method. PPO, A2C, or SAC might handle continuous or more complex action spaces better.
4. **Hyperparameter Range**: Expand or refine your Optuna search space, for instance by trying different `window_size` values or different dropout regularization strategies.
5. **Feature Engineering**: More sophisticated indicators or external features (e.g., news sentiment, fundamental data) might help.

All in all, your script is already quite comprehensive. You have an advanced LSTM pipeline for regression plus a DQN pipeline for RL. The main things to refine will be:

- **Data quality**
- **Indicator relevance**
- **Directional vs. magnitude accuracy**
- **Live streaming vs. historical backtesting**

Once you address those, your system will be closer to a real-time AI/bot capable of forecasting or trading on new data.

@@ -0,0 +1 @@
2.4.1
BIN
src/Machine-Learning/LSTM-python/actual_vs_predicted.png
Normal file
Binary file not shown.
123
src/Machine-Learning/LSTM-python/data
Normal file
@@ -0,0 +1,123 @@
|
|||||||
|
{
|
||||||
|
"policy_class": {
|
||||||
|
":type:": "<class 'abc.ABCMeta'>",
|
||||||
|
":serialized:": "gAWVMAAAAAAAAACMHnN0YWJsZV9iYXNlbGluZXMzLmRxbi5wb2xpY2llc5SMCURRTlBvbGljeZSTlC4=",
|
||||||
|
"__module__": "stable_baselines3.dqn.policies",
|
||||||
|
"__annotations__": "{'q_net': <class 'stable_baselines3.dqn.policies.QNetwork'>, 'q_net_target': <class 'stable_baselines3.dqn.policies.QNetwork'>}",
|
||||||
|
"__doc__": "\n Policy class with Q-Value Net and target net for DQN\n\n :param observation_space: Observation space\n :param action_space: Action space\n :param lr_schedule: Learning rate schedule (could be constant)\n :param net_arch: The specification of the policy and value networks.\n :param activation_fn: Activation function\n :param features_extractor_class: Features extractor to use.\n :param features_extractor_kwargs: Keyword arguments\n to pass to the features extractor.\n :param normalize_images: Whether to normalize images or not,\n dividing by 255.0 (True by default)\n :param optimizer_class: The optimizer to use,\n ``th.optim.Adam`` by default\n :param optimizer_kwargs: Additional keyword arguments,\n excluding the learning rate, to pass to the optimizer\n ",
|
||||||
|
"__init__": "<function DQNPolicy.__init__ at 0x7f6194e245e0>",
|
||||||
|
"_build": "<function DQNPolicy._build at 0x7f6194e24680>",
|
||||||
|
"make_q_net": "<function DQNPolicy.make_q_net at 0x7f6194e24720>",
|
||||||
|
"forward": "<function DQNPolicy.forward at 0x7f6194e247c0>",
|
||||||
|
"_predict": "<function DQNPolicy._predict at 0x7f6194e24860>",
|
||||||
|
"_get_constructor_parameters": "<function DQNPolicy._get_constructor_parameters at 0x7f6194e24900>",
|
||||||
|
"set_training_mode": "<function DQNPolicy.set_training_mode at 0x7f6194e249a0>",
|
||||||
|
"__abstractmethods__": "frozenset()",
|
||||||
|
"_abc_impl": "<_abc._abc_data object at 0x7f6194e22e40>"
|
||||||
|
},
|
||||||
|
"verbose": 1,
|
||||||
|
"policy_kwargs": {},
|
||||||
|
"num_timesteps": 100000,
|
||||||
|
"_total_timesteps": 100000,
|
||||||
|
"_num_timesteps_at_start": 0,
|
||||||
|
"seed": null,
|
||||||
|
"action_noise": null,
|
||||||
|
"start_time": 1737967108562402423,
|
||||||
|
"learning_rate": 0.001,
|
||||||
|
"tensorboard_log": "./dqn_stock_tensorboard/",
|
||||||
|
"_last_obs": {
|
||||||
|
":type:": "<class 'numpy.ndarray'>",
|
||||||
|
":serialized:": "gAWVtQAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJZAAAAAAAAAACZUeDh1fXg4anh4OP+EeDjsq9AzXS00OLOwR7EevdgzAACAPzMCODoHqXg4B6l4OM9ZeDgRUnE/AAAAAAAAAACUjAVudW1weZSMBWR0eXBllJOUjAJmNJSJiIeUUpQoSwOMATyUTk5OSv////9K/////0sAdJRiSwFLEIaUjAFDlHSUUpQu"
|
||||||
|
},
|
||||||
|
"_last_episode_starts": {
|
||||||
|
":type:": "<class 'numpy.ndarray'>",
|
||||||
|
":serialized:": "gAWVdAAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJYBAAAAAAAAAAGUjAVudW1weZSMBWR0eXBllJOUjAJiMZSJiIeUUpQoSwOMAXyUTk5OSv////9K/////0sAdJRiSwGFlIwBQ5R0lFKULg=="
|
||||||
|
},
|
||||||
|
"_last_original_obs": {
|
||||||
|
":type:": "<class 'numpy.ndarray'>",
|
||||||
|
":serialized:": "gAWVtQAAAAAAAACMEm51bXB5LmNvcmUubnVtZXJpY5SMC19mcm9tYnVmZmVylJOUKJZAAAAAAAAAAEkLeDhnUXg49Ex4OK1beDgMVd8zojo6OCTptrF6Ve4zAACAP7SYhjpzWng4nKl4OIU4eDgRUnE/AAAAAAAAAACUjAVudW1weZSMBWR0eXBllJOUjAJmNJSJiIeUUpQoSwOMATyUTk5OSv////9K/////0sAdJRiSwFLEIaUjAFDlHSUUpQu"
|
||||||
|
},
|
||||||
|
"_episode_num": 4,
|
||||||
|
"use_sde": false,
|
||||||
|
"sde_sample_freq": -1,
|
||||||
|
"_current_progress_remaining": 0.0,
|
||||||
|
"_stats_window_size": 100,
|
||||||
|
"ep_info_buffer": {
|
||||||
|
":type:": "<class 'collections.deque'>",
|
||||||
|
":serialized:": "gAWVIAAAAAAAAACMC2NvbGxlY3Rpb25zlIwFZGVxdWWUk5QpS2SGlFKULg=="
|
||||||
|
},
|
||||||
|
"ep_success_buffer": {
|
||||||
|
":type:": "<class 'collections.deque'>",
|
||||||
|
":serialized:": "gAWVIAAAAAAAAACMC2NvbGxlY3Rpb25zlIwFZGVxdWWUk5QpS2SGlFKULg=="
|
||||||
|
},
|
||||||
|
"_n_updates": 24750,
|
||||||
|
"observation_space": {
|
||||||
|
":type:": "<class 'gymnasium.spaces.box.Box'>",
|
||||||
|
":serialized:": "gAWVHgIAAAAAAACMFGd5bW5hc2l1bS5zcGFjZXMuYm94lIwDQm94lJOUKYGUfZQojAVkdHlwZZSMBW51bXB5lIwFZHR5cGWUk5SMAmY0lImIh5RSlChLA4wBPJROTk5K/////0r/////SwB0lGKMBl9zaGFwZZRLEIWUjANsb3eUjBJudW1weS5jb3JlLm51bWVyaWOUjAtfZnJvbWJ1ZmZlcpSTlCiWQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlGgLSxCFlIwBQ5R0lFKUjA1ib3VuZGVkX2JlbG93lGgTKJYQAAAAAAAAAAEBAQEBAQEBAQEBAQEBAQGUaAiMAmIxlImIh5RSlChLA4wBfJROTk5K/////0r/////SwB0lGJLEIWUaBZ0lFKUjARoaWdolGgTKJZAAAAAAAAAAAAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD8AAIA/AACAPwAAgD+UaAtLEIWUaBZ0lFKUjA1ib3VuZGVkX2Fib3ZllGgTKJYQAAAAAAAAAAEBAQEBAQEBAQEBAQEBAQGUaB1LEIWUaBZ0lFKUjAhsb3dfcmVwcpSMAzAuMJSMCWhpZ2hfcmVwcpSMAzEuMJSMCl9ucF9yYW5kb22UTnViLg==",
|
||||||
|
"dtype": "float32",
|
||||||
|
"_shape": [
|
||||||
|
16
|
||||||
|
],
|
||||||
|
"low": "[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]",
|
||||||
|
"bounded_below": "[ True True True True True True True True True True True True\n True True True True]",
|
||||||
|
"high": "[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]",
|
||||||
|
"bounded_above": "[ True True True True True True True True True True True True\n True True True True]",
|
||||||
|
"low_repr": "0.0",
|
||||||
|
"high_repr": "1.0",
|
||||||
|
"_np_random": null
|
||||||
|
},
|
||||||
|
"action_space": {
|
||||||
|
":type:": "<class 'gymnasium.spaces.discrete.Discrete'>",
|
||||||
|
":serialized:": "gAWVoQEAAAAAAACMGWd5bW5hc2l1bS5zcGFjZXMuZGlzY3JldGWUjAhEaXNjcmV0ZZSTlCmBlH2UKIwBbpSMFW51bXB5LmNvcmUubXVsdGlhcnJheZSMBnNjYWxhcpSTlIwFbnVtcHmUjAVkdHlwZZSTlIwCaTiUiYiHlFKUKEsDjAE8lE5OTkr/////Sv////9LAHSUYkMIAwAAAAAAAACUhpRSlIwFc3RhcnSUaAhoDkMIAAAAAAAAAACUhpRSlIwGX3NoYXBllCmMBWR0eXBllGgOjApfbnBfcmFuZG9tlIwUbnVtcHkucmFuZG9tLl9waWNrbGWUjBBfX2dlbmVyYXRvcl9jdG9ylJOUjAVQQ0c2NJRoG4wUX19iaXRfZ2VuZXJhdG9yX2N0b3KUk5SGlFKUfZQojA1iaXRfZ2VuZXJhdG9ylIwFUENHNjSUjAVzdGF0ZZR9lChoJooQ7/ZkFxYWRa/mnzjTOmwveYwDaW5jlIoQE+rIIlf7HpzUNhPkWy3PDXWMCmhhc191aW50MzKUSwGMCHVpbnRlZ2VylEonpHERdWJ1Yi4=",
|
||||||
|
"n": "3",
|
||||||
|
"start": "0",
|
||||||
|
"_shape": [],
|
||||||
|
"dtype": "int64",
|
||||||
|
"_np_random": "Generator(PCG64)"
|
||||||
|
},
|
||||||
|
"n_envs": 1,
|
||||||
|
"buffer_size": 10000,
|
||||||
|
"batch_size": 64,
|
||||||
|
"learning_starts": 1000,
|
||||||
|
"tau": 1.0,
|
||||||
|
"gamma": 0.99,
|
||||||
|
"gradient_steps": 1,
|
||||||
|
"optimize_memory_usage": false,
|
||||||
|
"replay_buffer_class": {
|
||||||
|
":type:": "<class 'abc.ABCMeta'>",
|
||||||
|
":serialized:": "gAWVNQAAAAAAAACMIHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbi5idWZmZXJzlIwMUmVwbGF5QnVmZmVylJOULg==",
|
||||||
|
"__module__": "stable_baselines3.common.buffers",
|
||||||
|
"__annotations__": "{'observations': <class 'numpy.ndarray'>, 'next_observations': <class 'numpy.ndarray'>, 'actions': <class 'numpy.ndarray'>, 'rewards': <class 'numpy.ndarray'>, 'dones': <class 'numpy.ndarray'>, 'timeouts': <class 'numpy.ndarray'>}",
|
||||||
|
"__doc__": "\n Replay buffer used in off-policy algorithms like SAC/TD3.\n\n :param buffer_size: Max number of element in the buffer\n :param observation_space: Observation space\n :param action_space: Action space\n :param device: PyTorch device\n :param n_envs: Number of parallel environments\n :param optimize_memory_usage: Enable a memory efficient variant\n of the replay buffer which reduces by almost a factor two the memory used,\n at a cost of more complexity.\n See https://github.com/DLR-RM/stable-baselines3/issues/37#issuecomment-637501195\n and https://github.com/DLR-RM/stable-baselines3/pull/28#issuecomment-637559274\n Cannot be used in combination with handle_timeout_termination.\n :param handle_timeout_termination: Handle timeout termination (due to timelimit)\n separately and treat the task as infinite horizon task.\n https://github.com/DLR-RM/stable-baselines3/issues/284\n ",
|
||||||
|
"__init__": "<function ReplayBuffer.__init__ at 0x7f6194f2b740>",
|
||||||
|
"add": "<function ReplayBuffer.add at 0x7f6194f2b880>",
|
||||||
|
"sample": "<function ReplayBuffer.sample at 0x7f6194f2b920>",
|
||||||
|
"_get_samples": "<function ReplayBuffer._get_samples at 0x7f6194f2b9c0>",
|
||||||
|
"_maybe_cast_dtype": "<staticmethod(<function ReplayBuffer._maybe_cast_dtype at 0x7f6194f2ba60>)>",
|
||||||
|
"__abstractmethods__": "frozenset()",
|
||||||
|
"_abc_impl": "<_abc._abc_data object at 0x7f6194f35280>"
|
||||||
|
},
|
||||||
|
"replay_buffer_kwargs": {},
|
||||||
|
"train_freq": {
|
||||||
|
":type:": "<class 'stable_baselines3.common.type_aliases.TrainFreq'>",
|
||||||
|
":serialized:": "gAWVeAAAAAAAAACMJXN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbi50eXBlX2FsaWFzZXOUjAlUcmFpbkZyZXGUk5RLBIwIYnVpbHRpbnOUjAdnZXRhdHRylJOUaACMElRyYWluRnJlcXVlbmN5VW5pdJSTlIwEU1RFUJSGlFKUhpSBlC4="
|
||||||
|
},
|
||||||
|
"use_sde_at_warmup": false,
|
||||||
|
"exploration_initial_eps": 1.0,
|
||||||
|
"exploration_final_eps": 0.02,
|
||||||
|
"exploration_fraction": 0.1,
|
||||||
|
"target_update_interval": 1000,
|
||||||
|
"_n_calls": 100000,
|
||||||
|
"max_grad_norm": 10,
|
||||||
|
"exploration_rate": 0.02,
|
||||||
|
"lr_schedule": {
|
||||||
|
":type:": "<class 'function'>",
|
||||||
|
":serialized:": "gAWV3AQAAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwOX21ha2VfZnVuY3Rpb26Uk5QoaACMDV9idWlsdGluX3R5cGWUk5SMCENvZGVUeXBllIWUUpQoSwFLAEsASwFLBUsTQzSVAZcAdAEAAAAAAAAAAAAAAgCJAXwApgEAAKsBAAAAAAAAAACmAQAAqwEAAAAAAAAAAFMAlE6FlIwFZmxvYXSUhZSMEnByb2dyZXNzX3JlbWFpbmluZ5SFlIynL2hvbWUva2xlaW4vY29kZVdTL1Byb2plY3RzL01pZGFzVGVjaG5vbG9naWVzTExDL01pZGFzVGVjaG5vbG9naWVzL3NyYy9NYWNoaW5lLUxlYXJuaW5nL0xTVE0tcHl0aG9uL3ZlbnYvbGliL3B5dGhvbjMuMTEvc2l0ZS1wYWNrYWdlcy9zdGFibGVfYmFzZWxpbmVzMy9jb21tb24vdXRpbHMucHmUjAg8bGFtYmRhPpSMIWdldF9zY2hlZHVsZV9mbi48bG9jYWxzPi48bGFtYmRhPpRLYUMa+IAApWWoTqhO0DtN0SxO1CxO0SZP1CZPgACUQwCUjA52YWx1ZV9zY2hlZHVsZZSFlCl0lFKUfZQojAtfX3BhY2thZ2VfX5SMGHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbpSMCF9fbmFtZV9flIwec3RhYmxlX2Jhc2VsaW5lczMuY29tbW9uLnV0aWxzlIwIX19maWxlX1+UjKcvaG9tZS9rbGVpbi9jb2RlV1MvUHJvamVjdHMvTWlkYXNUZWNobm9sb2dpZXNMTEMvTWlkYXNUZWNobm9sb2dpZXMvc3JjL01hY2hpbmUtTGVhcm5pbmcvTFNUTS1weXRob24vdmVudi9saWIvcHl0aG9uMy4xMS9zaXRlLXBhY2thZ2VzL3N0YWJsZV9iYXNlbGluZXMzL2NvbW1vbi91dGlscy5weZR1Tk5oAIwQX21ha2VfZW1wdHlfY2VsbJSTlClSlIWUdJRSlGgAjBJfZnVuY3Rpb25fc2V0c3RhdGWUk5RoI32UfZQoaBqMCDxsYW1iZGE+lIwMX19xdWFsbmFtZV9flIwhZ2V0X3NjaGVkdWxlX2ZuLjxsb2NhbHM+LjxsYW1iZGE+lIwPX19hbm5vdGF0aW9uc19flH2UjA5fX2t3ZGVmYXVsdHNfX5ROjAxfX2RlZmF1bHRzX1+UTowKX19tb2R1bGVfX5RoG4wHX19kb2NfX5ROjAtfX2Nsb3N1cmVfX5RoAIwKX21ha2VfY2VsbJSTlGgCKGgHKEsBSwBLAEsBSwFLE0MIlQGXAIkBUwCUaAkpjAFflIWUaA6MBGZ1bmOUjBljb25zdGFudF9mbi48bG9jYWxzPi5mdW5jlEuFQwj4gADYDxKICpRoEowDdmFslIWUKXSUUpRoF05OaB8pUpSFlHSUUpRoJWhBfZR9lChoGowEZnVuY5RoKYwZY29uc3RhbnRfZm4uPGxvY2Fscz4uZnVuY5RoK32UaC1OaC5OaC9oG2gwTmgxaDNHP1BiTdLxqfyFlFKUhZSMF19jbG91ZHBpY2tsZV9zdWJtb2R1bGVzlF2UjAtfX2dsb2JhbHNfX5R9lHWGlIZSMIWUUpSFlGhKXZRoTH2UdYaUhlIwLg=="
|
||||||
|
},
|
||||||
|
"batch_norm_stats": [],
|
||||||
|
"batch_norm_stats_target": [],
|
||||||
|
"exploration_schedule": {
|
||||||
|
":type:": "<class 'function'>",
|
||||||
|
":serialized:": "gAWVcAQAAAAAAACMF2Nsb3VkcGlja2xlLmNsb3VkcGlja2xllIwOX21ha2VfZnVuY3Rpb26Uk5QoaACMDV9idWlsdGluX3R5cGWUk5SMCENvZGVUeXBllIWUUpQoSwFLAEsASwFLBEsTQzyVA5cAZAF8AHoKAACJAmsEAAAAAHICiQFTAIkDZAF8AHoKAACJAYkDegoAAHoFAACJAnoLAAB6AAAAUwCUTksBhpQpjBJwcm9ncmVzc19yZW1haW5pbmeUhZSMpy9ob21lL2tsZWluL2NvZGVXUy9Qcm9qZWN0cy9NaWRhc1RlY2hub2xvZ2llc0xMQy9NaWRhc1RlY2hub2xvZ2llcy9zcmMvTWFjaGluZS1MZWFybmluZy9MU1RNLXB5dGhvbi92ZW52L2xpYi9weXRob24zLjExL3NpdGUtcGFja2FnZXMvc3RhYmxlX2Jhc2VsaW5lczMvY29tbW9uL3V0aWxzLnB5lIwEZnVuY5SMG2dldF9saW5lYXJfZm4uPGxvY2Fscz4uZnVuY5RLc0M4+IAA2AwN0BAi0QwioGzSCzLQCzLYExaISuATGJhB0CAy0RwysHO4VbF70RtDwGzRG1LRE1LQDFKUQwCUjANlbmSUjAxlbmRfZnJhY3Rpb26UjAVzdGFydJSHlCl0lFKUfZQojAtfX3BhY2thZ2VfX5SMGHN0YWJsZV9iYXNlbGluZXMzLmNvbW1vbpSMCF9fbmFtZV9flIwec3RhYmxlX2Jhc2VsaW5lczMuY29tbW9uLnV0aWxzlIwIX19maWxlX1+UjKcvaG9tZS9rbGVpbi9jb2RlV1MvUHJvamVjdHMvTWlkYXNUZWNobm9sb2dpZXNMTEMvTWlkYXNUZWNobm9sb2dpZXMvc3JjL01hY2hpbmUtTGVhcm5pbmcvTFNUTS1weXRob24vdmVudi9saWIvcHl0aG9uMy4xMS9zaXRlLXBhY2thZ2VzL3N0YWJsZV9iYXNlbGluZXMzL2NvbW1vbi91dGlscy5weZR1Tk5oAIwQX21ha2VfZW1wdHlfY2VsbJSTlClSlGgfKVKUaB8pUpSHlHSUUpRoAIwSX2Z1bmN0aW9uX3NldHN0YXRllJOUaCV9lH2UKGgajARmdW5jlIwMX19xdWFsbmFtZV9flIwbZ2V0X2xpbmVhcl9mbi48bG9jYWxzPi5mdW5jlIwPX19hbm5vdGF0aW9uc19flH2UKGgKjAhidWlsdGluc5SMBWZsb2F0lJOUjAZyZXR1cm6UaDF1jA5fX2t3ZGVmYXVsdHNfX5ROjAxfX2RlZmF1bHRzX1+UTowKX19tb2R1bGVfX5RoG4wHX19kb2NfX5ROjAtfX2Nsb3N1cmVfX5RoAIwKX21ha2VfY2VsbJSTlEc/lHrhR64Ue4WUUpRoOUc/uZmZmZmZmoWUUpRoOUc/8AAAAAAAAIWUUpSHlIwXX2Nsb3VkcGlja2xlX3N1Ym1vZHVsZXOUXZSMC19fZ2xvYmFsc19flH2UdYaUhlIwLg=="
|
||||||
|
}
|
||||||
|
}
|
||||||
Binary file not shown.
Binary file not shown.
BIN
src/Machine-Learning/LSTM-python/dqn_stock_trading.zip
Normal file
Binary file not shown.
BIN
src/Machine-Learning/LSTM-python/enhanced_lstm_model.h5
Normal file
Binary file not shown.
472
src/Machine-Learning/LSTM-python/main.py
Normal file
@@ -0,0 +1,472 @@
|
|||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import argparse
|
||||||
|
import numpy as np
|
||||||
|
import pandas as pd
|
||||||
|
import matplotlib.pyplot as plt
|
||||||
|
import seaborn as sns
|
||||||
|
import logging
|
||||||
|
from tabulate import tabulate
|
||||||
|
|
||||||
|
from sklearn.preprocessing import MinMaxScaler
|
||||||
|
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
|
||||||
|
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
|
||||||
|
|
||||||
|
import tensorflow as tf
|
||||||
|
from tensorflow.keras.models import Sequential
|
||||||
|
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
|
||||||
|
from tensorflow.keras.optimizers import Adam, Nadam
|
||||||
|
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
|
||||||
|
from tensorflow.keras.losses import Huber
|
||||||
|
|
||||||
|
import xgboost as xgb
|
||||||
|
import optuna
|
||||||
|
from optuna.integration import KerasPruningCallback
|
||||||
|
|
||||||
|
# Reinforcement Learning
|
||||||
|
import gym
|
||||||
|
from gym import spaces
|
||||||
|
from stable_baselines3 import DQN
|
||||||
|
from stable_baselines3.common.vec_env import DummyVecEnv
|
||||||
|
|
||||||
|
# Suppress TensorFlow warnings
|
||||||
|
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
|
||||||
|
|
||||||
|
##############################
|
||||||
|
# 1. Data Loading & Indicators
|
||||||
|
##############################
|
||||||
|
def load_data(file_path):
|
||||||
|
logging.info(f"Loading data from: {file_path}")
|
||||||
|
try:
|
||||||
|
data = pd.read_csv(file_path, parse_dates=['time'])
|
||||||
|
except FileNotFoundError:
|
||||||
|
logging.error(f"File not found: {file_path}")
|
||||||
|
sys.exit(1)
|
||||||
|
except pd.errors.ParserError as e:
|
||||||
|
logging.error(f"Error parsing CSV file: {e}")
|
||||||
|
sys.exit(1)
|
||||||
|
except Exception as e:
|
||||||
|
logging.error(f"Unexpected error: {e}")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
rename_mapping = {
|
||||||
|
'time': 'Date',
|
||||||
|
'open': 'Open',
|
||||||
|
'high': 'High',
|
||||||
|
'low': 'Low',
|
||||||
|
'close': 'Close'
|
||||||
|
}
|
||||||
|
data.rename(columns=rename_mapping, inplace=True)
|
||||||
|
|
||||||
|
# Sort by Date
|
||||||
|
data.sort_values('Date', inplace=True)
|
||||||
|
data.reset_index(drop=True, inplace=True)
|
||||||
|
logging.info(f"Data columns after renaming: {data.columns.tolist()}")
|
||||||
|
logging.info("Data loaded and sorted successfully.")
|
||||||
|
|
||||||
|
return data
|
||||||
|
|
||||||
|
def compute_rsi(series, window=14):
|
||||||
|
delta = series.diff()
|
||||||
|
gain = delta.where(delta > 0, 0).rolling(window=window).mean()
|
||||||
|
loss = -delta.where(delta < 0, 0).rolling(window=window).mean()
|
||||||
|
RS = gain / loss
|
||||||
|
return 100 - (100 / (1 + RS))
|
||||||
|
|
||||||
|
def compute_macd(series, span_short=12, span_long=26, span_signal=9):
|
||||||
|
ema_short = series.ewm(span=span_short, adjust=False).mean()
|
||||||
|
ema_long = series.ewm(span=span_long, adjust=False).mean()
|
||||||
|
macd_line = ema_short - ema_long
|
||||||
|
signal_line = macd_line.ewm(span=span_signal, adjust=False).mean()
|
||||||
|
return macd_line - signal_line # MACD histogram
|
||||||
|
|
||||||
|
def compute_adx(df, window=14):
|
||||||
|
"""
|
||||||
|
Example ADX calculation (pseudo-real):
|
||||||
|
You can implement a full ADX formula if you’d like.
|
||||||
|
Here, we do a slightly more robust approach than rolling std.
|
||||||
|
"""
|
||||||
|
# True range
|
||||||
|
df['H-L'] = df['High'] - df['Low']
|
||||||
|
df['H-Cp'] = (df['High'] - df['Close'].shift(1)).abs()
|
||||||
|
df['L-Cp'] = (df['Low'] - df['Close'].shift(1)).abs()
|
||||||
|
tr = df[['H-L', 'H-Cp', 'L-Cp']].max(axis=1)
|
||||||
|
tr_rolling = tr.rolling(window=window).mean()
|
||||||
|
|
||||||
|
# Simplistic to replicate ADX-like effect
|
||||||
|
adx_placeholder = tr_rolling / df['Close']
|
||||||
|
df.drop(['H-L','H-Cp','L-Cp'], axis=1, inplace=True)
|
||||||
|
return adx_placeholder
|
||||||
|
|
||||||
|
def compute_obv(df):
|
||||||
|
# On-Balance Volume
|
||||||
|
signed_volume = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0)
|
||||||
|
return signed_volume.cumsum()
|
||||||
|
|
||||||
|
def compute_bollinger_bands(series, window=20, num_std=2):
|
||||||
|
"""
|
||||||
|
Bollinger Bands: middle=MA, upper=MA+2*std, lower=MA-2*std
|
||||||
|
Return the band width or separate columns.
|
||||||
|
"""
|
||||||
|
sma = series.rolling(window=window).mean()
|
||||||
|
std = series.rolling(window=window).std()
|
||||||
|
upper = sma + num_std * std
|
||||||
|
lower = sma - num_std * std
|
||||||
|
bandwidth = (upper - lower) / sma # optional metric
|
||||||
|
return upper, lower, bandwidth
|
||||||
|
|
||||||
|
def compute_mfi(df, window=14):
|
||||||
|
"""
|
||||||
|
Money Flow Index: uses typical price, volume, direction.
|
||||||
|
For demonstration.
|
||||||
|
"""
|
||||||
|
typical_price = (df['High'] + df['Low'] + df['Close']) / 3
|
||||||
|
money_flow = typical_price * df['Volume']
|
||||||
|
# Positive or negative
|
||||||
|
df_shift = typical_price.shift(1)
|
||||||
|
flow_positive = money_flow.where(typical_price > df_shift, 0)
|
||||||
|
flow_negative = money_flow.where(typical_price < df_shift, 0)
|
||||||
|
# Sum over window
|
||||||
|
pos_sum = flow_positive.rolling(window=window).sum()
|
||||||
|
neg_sum = flow_negative.rolling(window=window).sum()
|
||||||
|
# RSI-like formula
|
||||||
|
mfi = 100 - (100 / (1 + pos_sum / (neg_sum + 1e-9)))
|
||||||
|
return mfi
|
||||||
|
|
||||||
|
def calculate_technical_indicators(df):
|
||||||
|
logging.info("Calculating technical indicators...")
|
||||||
|
|
||||||
|
df['RSI'] = compute_rsi(df['Close'], window=14)
|
||||||
|
df['MACD'] = compute_macd(df['Close'])
|
||||||
|
df['OBV'] = compute_obv(df)
|
||||||
|
df['ADX'] = compute_adx(df)
|
||||||
|
|
||||||
|
# Bollinger
|
||||||
|
upper_bb, lower_bb, bb_width = compute_bollinger_bands(df['Close'], window=20)
|
||||||
|
df['BB_Upper'] = upper_bb
|
||||||
|
df['BB_Lower'] = lower_bb
|
||||||
|
df['BB_Width'] = bb_width
|
||||||
|
|
||||||
|
# MFI
|
||||||
|
df['MFI'] = compute_mfi(df)
|
||||||
|
|
||||||
|
# Simple/EMA
|
||||||
|
df['SMA_5'] = df['Close'].rolling(window=5).mean()
|
||||||
|
df['SMA_10'] = df['Close'].rolling(window=10).mean()
|
||||||
|
df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
|
||||||
|
df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
|
||||||
|
# STD
|
||||||
|
df['STDDEV_5'] = df['Close'].rolling(window=5).std()
|
||||||
|
|
||||||
|
df.dropna(inplace=True)
|
||||||
|
logging.info("Technical indicators calculated successfully.")
|
||||||
|
return df
|
||||||
|
|
||||||
|
##############################
|
||||||
|
# 2. Parse Arguments
|
||||||
|
##############################
|
||||||
|
def parse_arguments():
|
||||||
|
parser = argparse.ArgumentParser(description='Train LSTM and DQN models for stock trading.')
|
||||||
|
parser.add_argument('csv_path', type=str, help='Path to the CSV data file.')
|
||||||
|
return parser.parse_args()
|
||||||
|
|
||||||
|
##############################
|
||||||
|
# 3. Main
|
||||||
|
##############################
|
||||||
|
def main():
|
||||||
|
args = parse_arguments()
|
||||||
|
csv_path = args.csv_path
|
||||||
|
|
||||||
|
# 1) Load data
|
||||||
|
data = load_data(csv_path)
|
||||||
|
data = calculate_technical_indicators(data)
|
||||||
|
|
||||||
|
# 2) Build feature set
|
||||||
|
# We deliberately EXCLUDE 'Close' from the features so the model doesn't trivially see it.
|
||||||
|
# Instead, rely on advanced indicators + OHLC + Volume.
|
||||||
|
feature_columns = [
|
||||||
|
'Open', 'High', 'Low', 'Volume',
|
||||||
|
'RSI', 'MACD', 'OBV', 'ADX',
|
||||||
|
'BB_Upper', 'BB_Lower', 'BB_Width',
|
||||||
|
'MFI', 'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5'
|
||||||
|
]
|
||||||
|
target_column = 'Close' # still used for label/evaluation
|
||||||
|
|
||||||
|
# Keep only these columns + Date + target
|
||||||
|
data = data[['Date'] + feature_columns + [target_column]].dropna()
|
||||||
|
|
||||||
|
# 3) Scale data
|
||||||
|
from sklearn.preprocessing import MinMaxScaler
|
||||||
|
scaler_features = MinMaxScaler()
|
||||||
|
scaler_target = MinMaxScaler()
|
||||||
|
|
||||||
|
scaled_features = scaler_features.fit_transform(data[feature_columns])
|
||||||
|
scaled_target = scaler_target.fit_transform(data[[target_column]]).flatten()
|
||||||
|
|
||||||
|
# 4) Create sequences for LSTM
|
||||||
|
def create_sequences(features, target, window_size=15):
|
||||||
|
X, y = [], []
|
||||||
|
for i in range(len(features) - window_size):
|
||||||
|
X.append(features[i:i+window_size])
|
||||||
|
y.append(target[i+window_size])
|
||||||
|
return np.array(X), np.array(y)
|
||||||
|
|
||||||
|
window_size = 15
|
||||||
|
X, y = create_sequences(scaled_features, scaled_target, window_size)
|
||||||
|
|
||||||
|
# 5) Train/Val/Test Split
|
||||||
|
train_size = int(len(X) * 0.7)
|
||||||
|
val_size = int(len(X) * 0.15)
|
||||||
|
test_size = len(X) - train_size - val_size
|
||||||
|
|
||||||
|
X_train, X_val, X_test = (
|
||||||
|
X[:train_size],
|
||||||
|
X[train_size:train_size+val_size],
|
||||||
|
X[train_size+val_size:]
|
||||||
|
)
|
||||||
|
y_train, y_val, y_test = (
|
||||||
|
y[:train_size],
|
||||||
|
y[train_size:train_size+val_size],
|
||||||
|
y[train_size+val_size:]
|
||||||
|
)
|
||||||
|
|
||||||
|
logging.info(f"X_train: {X_train.shape}, X_val: {X_val.shape}, X_test: {X_test.shape}")
|
||||||
|
logging.info(f"y_train: {y_train.shape}, y_val: {y_val.shape}, y_test: {y_test.shape}")
|
||||||
|
|
||||||
|
# 6) Device Config
|
||||||
|
def configure_device():
|
||||||
|
gpus = tf.config.list_physical_devices('GPU')
|
||||||
|
if gpus:
|
||||||
|
try:
|
||||||
|
for gpu in gpus:
|
||||||
|
tf.config.experimental.set_memory_growth(gpu, True)
|
||||||
|
logging.info(f"{len(gpus)} GPU(s) detected and configured.")
|
||||||
|
except RuntimeError as e:
|
||||||
|
logging.error(e)
|
||||||
|
else:
|
||||||
|
logging.info("No GPU detected, using CPU.")
|
||||||
|
configure_device()
|
||||||
|
|
||||||
|
# 7) Build LSTM
|
||||||
|
from tensorflow.keras.regularizers import l2
|
||||||
|
def build_lstm(input_shape, units=128, dropout=0.3, lr=1e-3):
|
||||||
|
model = Sequential()
|
||||||
|
# Example: 2 stacked LSTM layers
|
||||||
|
model.add(Bidirectional(LSTM(units, return_sequences=True, kernel_regularizer=l2(1e-4)), input_shape=input_shape))
|
||||||
|
model.add(Dropout(dropout))
|
||||||
|
model.add(Bidirectional(LSTM(units, return_sequences=False, kernel_regularizer=l2(1e-4))))
|
||||||
|
model.add(Dropout(dropout))
|
||||||
|
model.add(Dense(1, activation='linear'))
|
||||||
|
|
||||||
|
optimizer = Adam(learning_rate=lr)
|
||||||
|
model.compile(loss=Huber(), optimizer=optimizer, metrics=['mae'])
|
||||||
|
return model
|
||||||
|
|
||||||
|
# 8) Train LSTM (you can still do Optuna if you like, omitted here for brevity)
|
||||||
|
model_lstm = build_lstm((X_train.shape[1], X_train.shape[2]), units=128, dropout=0.3, lr=1e-3)
|
||||||
|
|
||||||
|
early_stop = EarlyStopping(patience=15, restore_best_weights=True)
|
||||||
|
reduce_lr = ReduceLROnPlateau(factor=0.5, patience=5, min_lr=1e-6)
|
||||||
|
|
||||||
|
model_lstm.fit(
|
||||||
|
X_train, y_train,
|
||||||
|
validation_data=(X_val, y_val),
|
||||||
|
epochs=100,
|
||||||
|
batch_size=32,
|
||||||
|
callbacks=[early_stop, reduce_lr],
|
||||||
|
verbose=1
|
||||||
|
)
|
||||||
|
|
||||||
|
# 9) Evaluate
|
||||||
|
def evaluate_lstm(model, X_test, y_test):
|
||||||
|
y_pred_scaled = model.predict(X_test).flatten()
|
||||||
|
# If we forcibly clamp predictions to [0,1], do so, else skip:
|
||||||
|
y_pred_scaled = np.clip(y_pred_scaled, 0, 1)
|
||||||
|
y_pred = scaler_target.inverse_transform(y_pred_scaled.reshape(-1,1)).flatten()
|
||||||
|
y_true = scaler_target.inverse_transform(y_test.reshape(-1,1)).flatten()
|
||||||
|
|
||||||
|
mse = mean_squared_error(y_true, y_pred)
|
||||||
|
rmse = np.sqrt(mse)
|
||||||
|
mae = mean_absolute_error(y_true, y_pred)
|
||||||
|
r2 = r2_score(y_true, y_pred)
|
||||||
|
|
||||||
|
# Direction
|
||||||
|
direction_true = np.sign(np.diff(y_true))
|
||||||
|
direction_pred = np.sign(np.diff(y_pred))
|
||||||
|
directional_acc = np.mean(direction_true == direction_pred)
|
||||||
|
|
||||||
|
logging.info(f"LSTM Test -> MSE={mse:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}, R2={r2:.4f}, DirAcc={directional_acc:.4f}")
|
||||||
|
|
||||||
|
# Quick Plot
|
||||||
|
plt.figure(figsize=(12,6))
|
||||||
|
plt.plot(y_true[:100], label='Actual')
|
||||||
|
plt.plot(y_pred[:100], label='Predicted')
|
||||||
|
plt.title("LSTM: Actual vs Predicted (first 100 test points)")
|
||||||
|
plt.legend()
|
||||||
|
plt.savefig("lstm_actual_vs_pred.png")
|
||||||
|
plt.close()
|
||||||
|
|
||||||
|
evaluate_lstm(model_lstm, X_test, y_test)
|
||||||
|
|
||||||
|
# Save
|
||||||
|
model_lstm.save("improved_lstm_model.keras")
|
||||||
|
import joblib
|
||||||
|
joblib.dump(scaler_features, "improved_scaler_features.pkl")
|
||||||
|
joblib.dump(scaler_target, "improved_scaler_target.pkl")
|
||||||
|
|
||||||
|
##############################
|
||||||
|
# 10) Reinforcement Learning
|
||||||
|
##############################
|
||||||
|
class StockTradingEnv(gym.Env):
|
||||||
|
"""
|
||||||
|
Improved RL Env that:
|
||||||
|
- excludes raw 'Close' from observation
|
||||||
|
- includes transaction cost (optional)
|
||||||
|
- uses step-based PnL as reward
|
||||||
|
"""
|
||||||
|
metadata = {'render.modes': ['human']}
|
||||||
|
def __init__(self, df, initial_balance=10000, transaction_cost=0.001):
|
||||||
|
super().__init__()
|
||||||
|
|
||||||
|
self.df = df.reset_index(drop=True)
|
||||||
|
self.initial_balance = initial_balance
|
||||||
|
self.balance = initial_balance
|
||||||
|
self.net_worth = initial_balance
|
||||||
|
self.current_step = 0
|
||||||
|
self.max_steps = len(df)
|
||||||
|
|
||||||
|
# Add transaction cost in decimal form (0.001 => 0.1%)
|
||||||
|
self.transaction_cost = transaction_cost
|
||||||
|
|
||||||
|
self.shares_held = 0
|
||||||
|
self.cost_basis = 0
|
||||||
|
|
||||||
|
# Suppose we exclude 'Close' from features to remove direct see of final price
|
||||||
|
self.obs_columns = [
|
||||||
|
'Open', 'High', 'Low', 'Volume',
|
||||||
|
'RSI', 'MACD', 'OBV', 'ADX',
|
||||||
|
'BB_Upper', 'BB_Lower', 'BB_Width',
|
||||||
|
'MFI', 'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5'
|
||||||
|
]
|
||||||
|
|
||||||
|
# We'll normalize features with the same scaler used for LSTM. If you want EXACT same scaling:
|
||||||
|
# you can pass the same 'scaler_features' object into this environment.
|
||||||
|
self.scaler = MinMaxScaler().fit(df[self.obs_columns])
|
||||||
|
# Or load from a pkl if you prefer: joblib.load("improved_scaler_features.pkl")
|
||||||
|
|
||||||
|
self.action_space = spaces.Discrete(3) # 0=Sell, 1=Hold, 2=Buy
|
||||||
|
self.observation_space = spaces.Box(
|
||||||
|
low=0.0, high=1.0,
|
||||||
|
shape=(len(self.obs_columns) + 3,), # + balance, shares, cost_basis
|
||||||
|
dtype=np.float32
|
||||||
|
)
|
||||||
|
|
||||||
|
def reset(self):
|
||||||
|
self.balance = self.initial_balance
|
||||||
|
self.net_worth = self.initial_balance
|
||||||
|
self.shares_held = 0
|
||||||
|
self.cost_basis = 0
|
||||||
|
self.current_step = 0
|
||||||
|
return self._get_obs()
|
||||||
|
|
||||||
|
def step(self, action):
|
||||||
|
# Current row
|
||||||
|
row = self.df.iloc[self.current_step]
|
||||||
|
current_price = row['Close']
|
||||||
|
|
||||||
|
prev_net_worth = self.net_worth
|
||||||
|
|
||||||
|
if action == 2: # Buy
|
||||||
|
shares_bought = int(self.balance // current_price)
|
||||||
|
if shares_bought > 0:
|
||||||
|
cost = shares_bought * current_price
|
||||||
|
fee = cost * self.transaction_cost
|
||||||
|
self.balance -= (cost + fee)
|
||||||
|
# Weighted average cost basis
|
||||||
|
prev_shares = self.shares_held
|
||||||
|
self.shares_held += shares_bought
|
||||||
|
self.cost_basis = (
|
||||||
|
(self.cost_basis * prev_shares) + (shares_bought * current_price)
|
||||||
|
) / self.shares_held
|
||||||
|
|
||||||
|
elif action == 0: # Sell
|
||||||
|
if self.shares_held > 0:
|
||||||
|
revenue = self.shares_held * current_price
|
||||||
|
fee = revenue * self.transaction_cost
|
||||||
|
self.balance += (revenue - fee)
|
||||||
|
self.shares_held = 0
|
||||||
|
self.cost_basis = 0
|
||||||
|
|
||||||
|
# Recompute net worth
|
||||||
|
self.net_worth = self.balance + self.shares_held * current_price
|
||||||
|
self.current_step += 1
|
||||||
|
done = (self.current_step >= self.max_steps - 1)
|
||||||
|
|
||||||
|
# *Step-based* reward => daily PnL
|
||||||
|
reward = self.net_worth - prev_net_worth
|
||||||
|
|
||||||
|
obs = self._get_obs()
|
||||||
|
return obs, reward, done, {}
|
||||||
|
|
||||||
|
def _get_obs(self):
|
||||||
|
row = self.df.iloc[self.current_step][self.obs_columns]
|
||||||
|
# Scale
|
||||||
|
scaled = self.scaler.transform([row])[0]
|
||||||
|
|
||||||
|
additional = np.array([
|
||||||
|
self.balance / self.initial_balance,
|
||||||
|
self.shares_held / 100.0,
|
||||||
|
self.cost_basis / (self.initial_balance+1e-9)
|
||||||
|
], dtype=np.float32)
|
||||||
|
|
||||||
|
obs = np.concatenate([scaled, additional]).astype(np.float32)
|
||||||
|
return obs
|
||||||
|
|
||||||
|
def render(self, mode='human'):
|
||||||
|
profit = self.net_worth - self.initial_balance
|
||||||
|
print(f"Step: {self.current_step}, "
|
||||||
|
f"Balance: {self.balance:.2f}, "
|
||||||
|
f"Shares: {self.shares_held}, "
|
||||||
|
f"NetWorth: {self.net_worth:.2f}, "
|
||||||
|
f"Profit: {profit:.2f}")
|
||||||
|
|
||||||
|
##############################
|
||||||
|
# 11) Train DQN
|
||||||
|
##############################
|
||||||
|
def train_dqn(env):
|
||||||
|
logging.info("Training DQN agent with improved environment...")
|
||||||
|
model = DQN(
|
||||||
|
'MlpPolicy',
|
||||||
|
env,
|
||||||
|
verbose=1,
|
||||||
|
learning_rate=1e-3,
|
||||||
|
buffer_size=50000,
|
||||||
|
learning_starts=1000,
|
||||||
|
batch_size=64,
|
||||||
|
tau=0.99,
|
||||||
|
gamma=0.99,
|
||||||
|
train_freq=4,
|
||||||
|
target_update_interval=1000,
|
||||||
|
exploration_fraction=0.1,
|
||||||
|
exploration_final_eps=0.02,
|
||||||
|
tensorboard_log="./dqn_enhanced_tensorboard/"
|
||||||
|
)
|
||||||
|
model.learn(total_timesteps=50000)
|
||||||
|
model.save("improved_dqn_agent")
|
||||||
|
return model
|
||||||
|
|
||||||
|
# Initialize environment with the same data
|
||||||
|
# *In a real scenario, you might feed a different dataset or do a train/test split
|
||||||
|
# for the RL environment, too.
|
||||||
|
rl_env = StockTradingEnv(data, initial_balance=10000, transaction_cost=0.001)
|
||||||
|
vec_env = DummyVecEnv([lambda: rl_env])
|
||||||
|
|
||||||
|
dqn_model = train_dqn(vec_env)
|
||||||
|
logging.info("Finished DQN training. You can test with a script like 'use_dqn.py' or do an internal test here.")
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
|
|
||||||
BIN
src/Machine-Learning/LSTM-python/optimized_lstm_model.h5
Normal file
Binary file not shown.
BIN
src/Machine-Learning/LSTM-python/optimized_lstm_model.keras
Normal file
Binary file not shown.
@@ -0,0 +1,238 @@
|
|||||||
|
import pandas as pd
|
||||||
|
import numpy as np
|
||||||
|
from sklearn.preprocessing import MinMaxScaler
|
||||||
|
from sklearn.model_selection import train_test_split
|
||||||
|
from tensorflow.keras.models import Model, load_model
|
||||||
|
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, Bidirectional
|
||||||
|
from tensorflow.keras.optimizers import Adam
|
||||||
|
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
|
||||||
|
import optuna
|
||||||
|
import matplotlib.pyplot as plt
|
||||||
|
import logging
|
||||||
|
import sys
|
||||||
|
import os
|
||||||
|
|
||||||
|
# Force TensorFlow to use CPU if no GPU is available
|
||||||
|
if not any([os.environ.get('CUDA_VISIBLE_DEVICES'), os.environ.get('NVIDIA_VISIBLE_DEVICES')]):
|
||||||
|
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
|
||||||
|
|
||||||
|
# Initialize logger
|
||||||
|
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')
|
||||||
|
logger = logging.getLogger()
|
||||||
|
|
||||||
|
# Custom functions for technical indicators
|
||||||
|
def compute_sma(data, period):
|
||||||
|
return data.rolling(window=period).mean()
|
||||||
|
|
||||||
|
def compute_ema(data, period):
|
||||||
|
return data.ewm(span=period, adjust=False).mean()
|
||||||
|
|
||||||
|
def compute_rsi(data, period=14):
|
||||||
|
delta = data.diff(1)
|
||||||
|
gain = (delta.where(delta > 0, 0)).rolling(window=period).mean()
|
||||||
|
loss = (-delta.where(delta < 0, 0)).rolling(window=period).mean()
|
||||||
|
rs = gain / loss
|
||||||
|
return 100 - (100 / (1 + rs))
|
||||||
|
|
||||||
|
def compute_macd(data, fast_period=12, slow_period=26, signal_period=9):
|
||||||
|
fast_ema = compute_ema(data, fast_period)
|
||||||
|
slow_ema = compute_ema(data, slow_period)
|
||||||
|
macd = fast_ema - slow_ema
|
||||||
|
signal = compute_ema(macd, signal_period)
|
||||||
|
return macd, signal
|
||||||
|
|
||||||
|
def compute_atr(high, low, close, period=14):
|
||||||
|
tr = np.maximum(high - low, np.maximum(abs(high - close.shift(1)), abs(low - close.shift(1))))
|
||||||
|
return tr.rolling(window=period).mean()
|
||||||
|
|
||||||
|
def compute_adx(high, low, close, period=14):
|
||||||
|
tr = compute_atr(high, low, close, period)
|
||||||
|
plus_dm = (high - high.shift(1)).where((high - high.shift(1)) > (low.shift(1) - low), 0).fillna(0)
|
||||||
|
minus_dm = (low.shift(1) - low).where((low.shift(1) - low) > (high - high.shift(1)), 0).fillna(0)
|
||||||
|
plus_di = 100 * (plus_dm.rolling(window=period).sum() / tr)
|
||||||
|
minus_di = 100 * (minus_dm.rolling(window=period).sum() / tr)
|
||||||
|
dx = (abs(plus_di - minus_di) / (plus_di + minus_di)) * 100
|
||||||
|
return dx.rolling(window=period).mean()
|
||||||
|
|
||||||
|
# Load and preprocess data
|
||||||
|
def load_and_preprocess_data(file_path):
|
||||||
|
logger.info("Loading data...")
|
||||||
|
try:
|
||||||
|
df = pd.read_csv(file_path)
|
||||||
|
if 'time' not in df.columns:
|
||||||
|
logger.error("The CSV file must contain a 'time' column.")
|
||||||
|
sys.exit(1)
|
||||||
|
df['time'] = pd.to_datetime(df['time'], errors='coerce', utc=True)
|
||||||
|
invalid_time_count = df['time'].isna().sum()
|
||||||
|
if invalid_time_count > 0:
|
||||||
|
logger.warning(f"Dropping {invalid_time_count} rows with invalid datetime values.")
|
||||||
|
df = df.dropna(subset=['time'])
|
||||||
|
|
||||||
|
# Ensure required columns exist
|
||||||
|
required_columns = ['open', 'high', 'low', 'close', 'Volume']
|
||||||
|
for col in required_columns:
|
||||||
|
if col not in df.columns:
|
||||||
|
logger.warning(f"Missing column '{col}' in the data. Filling with default values.")
|
||||||
|
df[col] = 0
|
||||||
|
# Rename Volume column to lowercase for consistency
|
||||||
|
if 'Volume' in df.columns:
|
||||||
|
df.rename(columns={'Volume': 'volume'}, inplace=True)
|
||||||
|
except FileNotFoundError:
|
||||||
|
logger.error(f"File not found: {file_path}")
|
||||||
|
sys.exit(1)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error loading file: {e}")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
df['day'] = df['time'].dt.date
|
||||||
|
|
||||||
|
    # Aggregate 5-minute data into daily data
    daily_data = df.groupby('day').agg({
        'open': 'first',
        'high': 'max',
        'low': 'min',
        'close': 'last',
        'volume': 'sum'
    }).reset_index()

    # Generate technical indicators
    logger.info("Calculating technical indicators...")
    daily_data['SMA_10'] = compute_sma(daily_data['close'], period=10)
    daily_data['EMA_10'] = compute_ema(daily_data['close'], period=10)
    daily_data['RSI'] = compute_rsi(daily_data['close'], period=14)
    daily_data['MACD'], daily_data['MACD_signal'] = compute_macd(daily_data['close'])
    daily_data['ATR'] = compute_atr(daily_data['high'], daily_data['low'], daily_data['close'], period=14)
    daily_data['ADX'] = compute_adx(daily_data['high'], daily_data['low'], daily_data['close'], period=14)

    # Drop NaN rows due to indicators
    daily_data = daily_data.dropna()

    # Scale data
    logger.info("Scaling data...")
    scaler = MinMaxScaler()
    feature_columns = ['open', 'high', 'low', 'volume', 'SMA_10', 'EMA_10', 'RSI', 'MACD', 'MACD_signal', 'ATR', 'ADX']
    scaled_features = scaler.fit_transform(daily_data[feature_columns])

    return scaled_features, daily_data['close'].values, scaler


# Create sequences for LSTM
def create_sequences(data, target, window_size):
    logger.info(f"Creating sequences with window size {window_size}...")
    X, y = [], []
    for i in range(len(data) - window_size):
        X.append(data[i:i + window_size])
        y.append(target[i + window_size])
    return np.array(X), np.array(y)


# Define objective function for hyperparameter tuning
def objective(trial):
    logger.info("Running Optuna trial...")
    # Suggest hyperparameters
    num_lstm_layers = trial.suggest_int("num_lstm_layers", 2, 4)
    lstm_units = trial.suggest_int("lstm_units", 64, 256, step=64)
    dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5, step=0.1)
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True)

    # Build model
    inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))
    x = inputs
    for _ in range(num_lstm_layers):
        x = Bidirectional(LSTM(lstm_units, return_sequences=True, kernel_regularizer="l2"))(x)
        x = Dropout(dropout_rate)(x)
    x = LSTM(lstm_units, return_sequences=False, kernel_regularizer="l2")(x)
    x = Dropout(dropout_rate)(x)
    outputs = Dense(1, activation="linear")(x)

    model = Model(inputs, outputs)
    optimizer = Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss="mean_squared_error", metrics=["mae"])

    # Train model
    early_stopping = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)
    model.fit(
        X_train, y_train,
        validation_data=(X_test, y_test),
        epochs=50,
        batch_size=32,
        callbacks=[early_stopping],
        verbose=0
    )

    # Evaluate model
    loss, mae = model.evaluate(X_test, y_test, verbose=0)
    return mae


# Run hyperparameter tuning and train final model
if __name__ == "__main__":
    if len(sys.argv) < 2:
        logger.error("Please provide the CSV file path as an argument.")
        sys.exit(1)

    file_path = sys.argv[1]  # Get the file path from command-line arguments
    window_size = 30

    # Load and preprocess data
    data, target, scaler = load_and_preprocess_data(file_path)

    # Create sequences
    X, y = create_sequences(data, target, window_size)

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, shuffle=False)

    # Run Optuna
    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=20)

    # Train final model with best hyperparameters
    best_params = study.best_params
    logger.info(f"Best Hyperparameters: {best_params}")

    inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))
    x = inputs
    for _ in range(best_params["num_lstm_layers"]):
        x = Bidirectional(LSTM(best_params["lstm_units"], return_sequences=True, kernel_regularizer="l2"))(x)
        x = Dropout(best_params["dropout_rate"])(x)
    x = LSTM(best_params["lstm_units"], return_sequences=False, kernel_regularizer="l2")(x)
    x = Dropout(best_params["dropout_rate"])(x)
    outputs = Dense(1, activation="linear")(x)

    model = Model(inputs, outputs)
    optimizer = Adam(learning_rate=best_params["learning_rate"])
    model.compile(optimizer=optimizer, loss="mean_squared_error", metrics=["mae"])

    # Callbacks
    early_stopping = EarlyStopping(monitor="val_loss", patience=15, restore_best_weights=True)
    reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5, min_lr=1e-5)

    logger.info("Training final model...")
    history = model.fit(
        X_train, y_train,
        validation_data=(X_test, y_test),
        epochs=300,
        batch_size=32,
        callbacks=[early_stopping, reduce_lr]
    )

    # Save model
    model.save("optimized_lstm_model.keras")
    logger.info("Model saved as optimized_lstm_model.keras.")

    # Evaluate model
    loss, mae = model.evaluate(X_test, y_test)
    logger.info(f"Final Model Test Loss: {loss}, Test MAE: {mae}")

    # Make predictions and plot
    y_pred = model.predict(X_test)
    plt.figure(figsize=(10, 6))
    plt.plot(y_test, label="Actual Prices")
    plt.plot(y_pred, label="Predicted Prices")
    plt.legend()
    plt.title("Model Prediction vs Actual")
    plt.xlabel("Time Steps")
    plt.ylabel("Price")
    plt.savefig("prediction_vs_actual.png")
    plt.show()

    logger.info("Predictions complete and saved to plot.")
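The script above persists only the trained network (`optimized_lstm_model.keras`); the `MinMaxScaler` returned by `load_and_preprocess_data` is not saved, so a standalone inference script has to rebuild it from the same CSV. A minimal sketch of a one-step-ahead prediction with the saved model, assuming the functions above are importable as a module (the name `lstm_daily` is hypothetical):

# Hedged sketch: reuse the saved Keras model for a single next-day prediction.
import numpy as np
from tensorflow.keras.models import load_model

from lstm_daily import load_and_preprocess_data  # hypothetical module name for the script above

model = load_model("optimized_lstm_model.keras")
features, _close, _scaler = load_and_preprocess_data("BAT.csv")

window_size = 30  # must match the value used at training time
last_window = features[-window_size:]                  # shape: (window_size, n_features)
next_close = model.predict(last_window[np.newaxis, ...])[0, 0]
print(f"Predicted next daily close (target was trained unscaled): {next_close:.2f}")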
@@ -0,0 +1,452 @@
import os
import sys
import argparse  # Added for argument parsing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import logging
from tabulate import tabulate

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout, Bidirectional
from tensorflow.keras.optimizers import Adam, Nadam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.losses import Huber

import xgboost as xgb

import optuna
from optuna.integration import KerasPruningCallback

# For Reinforcement Learning
import gym
from gym import spaces
from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv

# To handle parallelization
import multiprocessing

# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # Suppress INFO and WARNING messages

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# 1. Data Loading and Preprocessing
def load_data(file_path):
    logging.info(f"Loading data from: {file_path}")
    try:
        # Parse 'time' column as dates
        data = pd.read_csv(file_path, parse_dates=['time'])
    except FileNotFoundError:
        logging.error(f"File not found: {file_path}")
        sys.exit(1)
    except pd.errors.ParserError as e:
        logging.error(f"Error parsing CSV file: {e}")
        sys.exit(1)
    except Exception as e:
        logging.error(f"Unexpected error: {e}")
        sys.exit(1)

    # Rename columns to match script expectations
    rename_mapping = {
        'time': 'Date',
        'open': 'Open',
        'high': 'High',
        'low': 'Low',
        'close': 'Close'
    }
    data.rename(columns=rename_mapping, inplace=True)

    logging.info(f"Data columns after renaming: {data.columns.tolist()}")

    # Sort and reset index
    data.sort_values('Date', inplace=True)
    data.reset_index(drop=True, inplace=True)
    logging.info("Data loaded and sorted successfully.")
    return data


def compute_rsi(series, window=14):
    delta = series.diff()
    gain = (delta.where(delta > 0, 0)).rolling(window=window).mean()
    loss = (-delta.where(delta < 0, 0)).rolling(window=window).mean()
    RS = gain / loss
    RSI = 100 - (100 / (1 + RS))
    return RSI


def compute_macd(series, span_short=12, span_long=26, span_signal=9):
    ema_short = series.ewm(span=span_short, adjust=False).mean()
    ema_long = series.ewm(span=span_long, adjust=False).mean()
    MACD = ema_short - ema_long
    signal = MACD.ewm(span=span_signal, adjust=False).mean()
    return MACD - signal


def compute_adx(df, window=14):
    # Placeholder for ADX calculation
    return df['Close'].rolling(window=window).std()  # Simplistic placeholder

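# --- Hedged sketch (not part of the original script) -------------------------
# compute_adx above is an explicit placeholder (rolling std of Close, not ADX).
# A Wilder-style ADX could look roughly like this; EWM smoothing with
# alpha=1/window is used here as an approximation of Wilder's smoothing.
def compute_adx_wilder(df, window=14):
    high, low, close = df['High'], df['Low'], df['Close']
    up = high.diff()
    down = -low.diff()
    plus_dm = pd.Series(np.where((up > down) & (up > 0), up, 0.0), index=df.index)
    minus_dm = pd.Series(np.where((down > up) & (down > 0), down, 0.0), index=df.index)
    tr = pd.concat([high - low,
                    (high - close.shift()).abs(),
                    (low - close.shift()).abs()], axis=1).max(axis=1)
    atr = tr.ewm(alpha=1 / window, adjust=False).mean()
    plus_di = 100 * plus_dm.ewm(alpha=1 / window, adjust=False).mean() / atr
    minus_di = 100 * minus_dm.ewm(alpha=1 / window, adjust=False).mean() / atr
    dx = 100 * (plus_di - minus_di).abs() / (plus_di + minus_di)
    return dx.ewm(alpha=1 / window, adjust=False).mean()
# -----------------------------------------------------------------------------
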
def compute_obv(df):
    # On-Balance Volume calculation
    OBV = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0).cumsum()
    return OBV


def calculate_technical_indicators(df):
    logging.info("Calculating technical indicators...")
    df['SMA_5'] = df['Close'].rolling(window=5).mean()
    df['SMA_10'] = df['Close'].rolling(window=10).mean()
    df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
    df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
    df['STDDEV_5'] = df['Close'].rolling(window=5).std()
    df['RSI'] = compute_rsi(df['Close'], window=14)
    df['MACD'] = compute_macd(df['Close'])
    df['ADX'] = compute_adx(df)
    df['OBV'] = compute_obv(df)
    df.dropna(inplace=True)  # Drop rows with NaN values after feature engineering
    logging.info("Technical indicators calculated successfully.")
    return df


# Argument Parsing
def parse_arguments():
    parser = argparse.ArgumentParser(description='Train LSTM and DQN models for stock trading.')
    parser.add_argument('csv_path', type=str, help='Path to the CSV data file.')
    return parser.parse_args()


def main():
    # Parse command-line arguments
    args = parse_arguments()
    csv_path = args.csv_path

    # Load and preprocess data
    data = load_data(csv_path)
    data = calculate_technical_indicators(data)

    # Feature selection
    feature_columns = ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']
    target_column = 'Close'
    data = data[['Date'] + feature_columns + [target_column]]
    data.dropna(inplace=True)

    # Scaling
    scaler_features = MinMaxScaler()
    scaler_target = MinMaxScaler()

    scaled_features = scaler_features.fit_transform(data[feature_columns])
    scaled_target = scaler_target.fit_transform(data[[target_column]]).flatten()

    # Create sequences for LSTM
    def create_sequences(features, target, window_size=15):
        X, y = [], []
        for i in range(len(features) - window_size):
            X.append(features[i:i+window_size])
            y.append(target[i+window_size])
        return np.array(X), np.array(y)

    window_size = 15
    X, y = create_sequences(scaled_features, scaled_target, window_size)

    # Split data into training, validation, and testing sets
    train_size = int(len(X) * 0.7)
    val_size = int(len(X) * 0.15)
    test_size = len(X) - train_size - val_size

    X_train, X_val, X_test = X[:train_size], X[train_size:train_size+val_size], X[train_size+val_size:]
    y_train, y_val, y_test = y[:train_size], y[train_size:train_size+val_size], y[train_size+val_size:]

    logging.info(f"Scaled training features shape: {X_train.shape}")
    logging.info(f"Scaled validation features shape: {X_val.shape}")
    logging.info(f"Scaled testing features shape: {X_test.shape}")
    logging.info(f"Scaled training target shape: {y_train.shape}")
    logging.info(f"Scaled validation target shape: {y_val.shape}")
    logging.info(f"Scaled testing target shape: {y_test.shape}")

    # 2. Device Configuration
    def configure_device():
        gpus = tf.config.list_physical_devices('GPU')
        if gpus:
            try:
                for gpu in gpus:
                    tf.config.experimental.set_memory_growth(gpu, True)
                logging.info(f"{len(gpus)} GPU(s) detected and configured.")
            except RuntimeError as e:
                logging.error(e)
        else:
            logging.info("No GPU detected, using CPU.")

    configure_device()

    # 3. Model Building
    def build_advanced_lstm(input_shape, hyperparams):
        model = Sequential()
        for i in range(hyperparams['num_lstm_layers']):
            return_sequences = True if i < hyperparams['num_lstm_layers'] - 1 else False
            model.add(Bidirectional(LSTM(
                hyperparams['lstm_units'],
                return_sequences=return_sequences,
                kernel_regularizer=tf.keras.regularizers.l2(0.001)
            )))
            model.add(Dropout(hyperparams['dropout_rate']))
        model.add(Dense(1, activation='linear'))

        if hyperparams['optimizer'] == 'Adam':
            optimizer = Adam(learning_rate=hyperparams['learning_rate'], decay=hyperparams['decay'])
        elif hyperparams['optimizer'] == 'Nadam':
            optimizer = Nadam(learning_rate=hyperparams['learning_rate'])
        else:
            optimizer = Adam(learning_rate=hyperparams['learning_rate'])

        model.compile(optimizer=optimizer, loss=Huber(), metrics=['mae'])
        return model

    def build_xgboost_model(X_train, y_train, hyperparams):
        model = xgb.XGBRegressor(
            objective='reg:squarederror',
            n_estimators=hyperparams['n_estimators'],
            max_depth=hyperparams['max_depth'],
            learning_rate=hyperparams['learning_rate'],
            subsample=hyperparams['subsample'],
            colsample_bytree=hyperparams['colsample_bytree'],
            random_state=42,
            n_jobs=-1
        )
        model.fit(X_train.reshape(X_train.shape[0], -1), y_train)
        return model

    # 4. Hyperparameter Tuning with Optuna
    def objective(trial):
        # Hyperparameter suggestions
        num_lstm_layers = trial.suggest_int('num_lstm_layers', 1, 3)
        lstm_units = trial.suggest_categorical('lstm_units', [32, 64, 96, 128])
        dropout_rate = trial.suggest_float('dropout_rate', 0.1, 0.5)
        learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1e-2)
        optimizer_name = trial.suggest_categorical('optimizer', ['Adam', 'Nadam'])
        decay = trial.suggest_float('decay', 0.0, 1e-4)

        hyperparams = {
            'num_lstm_layers': num_lstm_layers,
            'lstm_units': lstm_units,
            'dropout_rate': dropout_rate,
            'learning_rate': learning_rate,
            'optimizer': optimizer_name,
            'decay': decay
        }

        model = build_advanced_lstm((X_train.shape[1], X_train.shape[2]), hyperparams)

        early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
        lr_reduce = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)

        history = model.fit(
            X_train, y_train,
            epochs=100,
            batch_size=16,
            validation_data=(X_val, y_val),
            callbacks=[early_stop, lr_reduce, KerasPruningCallback(trial, 'val_loss')],
            verbose=0
        )

        val_mae = min(history.history['val_mae'])
        return val_mae

    # Optuna study
    logging.info("Starting hyperparameter optimization with Optuna...")
    study = optuna.create_study(direction='minimize')
    study.optimize(objective, n_trials=50)

    best_params = study.best_params
    logging.info(f"Best Hyperparameters from Optuna: {best_params}")

    # 5. Train the Best LSTM Model
    best_model = build_advanced_lstm((X_train.shape[1], X_train.shape[2]), best_params)

    early_stop = EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)
    lr_reduce = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6)

    logging.info("Training the best LSTM model with optimized hyperparameters...")
    history = best_model.fit(
        X_train, y_train,
        epochs=300,
        batch_size=16,
        validation_data=(X_val, y_val),
        callbacks=[early_stop, lr_reduce],
        verbose=1
    )

    # 6. Evaluate the Model
    def evaluate_model(model, X_test, y_test):
        logging.info("Evaluating model...")
        y_pred_scaled = model.predict(X_test).flatten()
        y_pred_scaled = np.clip(y_pred_scaled, 0, 1)  # Ensure predictions are within [0,1]
        y_pred = scaler_target.inverse_transform(y_pred_scaled.reshape(-1, 1)).flatten()
        y_test_actual = scaler_target.inverse_transform(y_test.reshape(-1, 1)).flatten()

        mse = mean_squared_error(y_test_actual, y_pred)
        rmse = np.sqrt(mse)
        mae = mean_absolute_error(y_test_actual, y_pred)
        r2 = r2_score(y_test_actual, y_pred)

        # Directional Accuracy
        direction_actual = np.sign(np.diff(y_test_actual))
        direction_pred = np.sign(np.diff(y_pred))
        directional_accuracy = np.mean(direction_actual == direction_pred)

        logging.info(f"Test MSE: {mse}")
        logging.info(f"Test RMSE: {rmse}")
        logging.info(f"Test MAE: {mae}")
        logging.info(f"Test R2 Score: {r2}")
        logging.info(f"Directional Accuracy: {directional_accuracy}")

        # Plot Actual vs Predicted
        plt.figure(figsize=(14, 7))
        plt.plot(y_test_actual, label='Actual Price')
        plt.plot(y_pred, label='Predicted Price')
        plt.title('Actual vs Predicted Prices')
        plt.xlabel('Time Step')
        plt.ylabel('Price')
        plt.legend()
        plt.grid(True)
        plt.savefig('actual_vs_predicted.png')  # Save the plot
        plt.close()
        logging.info("Actual vs Predicted plot saved as 'actual_vs_predicted.png'")

        # Tabulate first 40 predictions
        table = [[i, round(actual, 2), round(pred, 2)] for i, (actual, pred) in enumerate(zip(y_test_actual[:40], y_pred[:40]))]
        headers = ["Index", "Actual Price", "Predicted Price"]
        print(tabulate(table, headers=headers, tablefmt="pretty"))

        return mse, rmse, mae, r2, directional_accuracy

    mse, rmse, mae, r2, directional_accuracy = evaluate_model(best_model, X_test, y_test)

    # 7. Save the Model and Scalers
    best_model.save('optimized_lstm_model.h5')
    import joblib
    joblib.dump(scaler_features, 'scaler_features.save')
    joblib.dump(scaler_target, 'scaler_target.save')
    logging.info("Model and scalers saved as 'optimized_lstm_model.h5', 'scaler_features.save', and 'scaler_target.save'.")

    # 8. Reinforcement Learning: Deep Q-Learning for Trading Actions
    class StockTradingEnv(gym.Env):
        """
        A simple stock trading environment for OpenAI gym
        """
        metadata = {'render.modes': ['human']}

        def __init__(self, df, initial_balance=10000):
            super(StockTradingEnv, self).__init__()

            self.df = df.reset_index()
            self.initial_balance = initial_balance
            self.balance = initial_balance
            self.net_worth = initial_balance
            self.max_steps = len(df)
            self.current_step = 0
            self.shares_held = 0
            self.cost_basis = 0

            # Actions: 0 = Sell, 1 = Hold, 2 = Buy
            self.action_space = spaces.Discrete(3)

            # Observations: [normalized features + balance + shares held + cost basis]
            self.observation_space = spaces.Box(low=0, high=1, shape=(len(feature_columns) + 3,), dtype=np.float32)

        def reset(self):
            self.balance = self.initial_balance
            self.net_worth = self.initial_balance
            self.current_step = 0
            self.shares_held = 0
            self.cost_basis = 0
            return self._next_observation()

        def _next_observation(self):
            obs = self.df.loc[self.current_step, feature_columns].values
            # Normalize features by their max to ensure [0,1] range
            obs = obs / np.max(obs)
            # Append balance, shares held, and cost basis
            additional = np.array([
                self.balance / self.initial_balance,
                self.shares_held / 100,  # Assuming a maximum of 100 shares for normalization
                self.cost_basis / self.initial_balance
            ])
            return np.concatenate([obs, additional])

        def step(self, action):
            current_price = self.df.loc[self.current_step, 'Close']

            if action == 2:  # Buy
                total_possible = self.balance // current_price
                shares_bought = total_possible
                if shares_bought > 0:
                    self.balance -= shares_bought * current_price
                    self.shares_held += shares_bought
                    self.cost_basis = (self.cost_basis * (self.shares_held - shares_bought) + shares_bought * current_price) / self.shares_held
            elif action == 0:  # Sell
                if self.shares_held > 0:
                    self.balance += self.shares_held * current_price
                    self.shares_held = 0
                    self.cost_basis = 0
            # Hold does nothing

            self.net_worth = self.balance + self.shares_held * current_price
            self.current_step += 1

            done = self.current_step >= self.max_steps - 1

            # Reward: change in net worth
            reward = self.net_worth - self.initial_balance

            obs = self._next_observation()

            return obs, reward, done, {}

        def render(self, mode='human', close=False):
            profit = self.net_worth - self.initial_balance
            print(f'Step: {self.current_step}')
            print(f'Balance: {self.balance}')
            print(f'Shares held: {self.shares_held} (Cost Basis: {self.cost_basis})')
            print(f'Net worth: {self.net_worth}')
            print(f'Profit: {profit}')

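    # Note (added observation): _next_observation() scales each row by that row's own
    # maximum, so identical indicator values can map to different observation values
    # on different days; reusing the already fitted scaler_features here would likely
    # give the DQN a more consistent state representation.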
    def train_dqn_agent(env):
        logging.info("Training DQN Agent...")
        try:
            model = DQN(
                'MlpPolicy',
                env,
                verbose=1,
                learning_rate=1e-3,
                buffer_size=10000,
                learning_starts=1000,
                batch_size=64,
                tau=1.0,
                gamma=0.99,
                train_freq=4,
                target_update_interval=1000,
                exploration_fraction=0.1,
                exploration_final_eps=0.02,
                tensorboard_log="./dqn_stock_tensorboard/"
            )
            model.learn(total_timesteps=100000)
            model.save("dqn_stock_trading")
            logging.info("DQN Agent trained and saved as 'dqn_stock_trading.zip'.")
            return model
        except Exception as e:
            logging.error(f"Error training DQN Agent: {e}")
            sys.exit(1)

    # Initialize trading environment
    trading_env = StockTradingEnv(data)
    trading_env = DummyVecEnv([lambda: trading_env])

    # Train DQN agent
    dqn_model = train_dqn_agent(trading_env)


if __name__ == "__main__":
    main()
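main.py persists three inference artifacts: `optimized_lstm_model.h5` plus the two fitted scalers saved with joblib. A minimal sketch of reloading them for a single prediction; the window construction assumes the same 15-step window and the same 13-column feature order used in `main()`:

# Hedged sketch: reload the saved artifacts and predict one close price.
import joblib
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('optimized_lstm_model.h5', compile=False)  # compile is not needed for inference
scaler_features = joblib.load('scaler_features.save')
scaler_target = joblib.load('scaler_target.save')

# `recent` is assumed to be a (15, 13) array of the most recent engineered
# feature rows, in the same column order as feature_columns in main().
def predict_next_close(recent):
    window = scaler_features.transform(recent)[np.newaxis, ...]
    scaled_pred = model.predict(window)[0, 0]
    return scaler_target.inverse_transform([[scaled_pred]])[0][0]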
BIN  src/Machine-Learning/LSTM-python/policy.optimizer.pth  Normal file (binary, not shown)
BIN  src/Machine-Learning/LSTM-python/policy.pth  Normal file (binary, not shown)
BIN  src/Machine-Learning/LSTM-python/prediction_vs_actual.png  Normal file (binary, not shown; 56 KiB)
BIN  src/Machine-Learning/LSTM-python/pytorch_variables.pth  Normal file (binary, not shown)
BIN  src/Machine-Learning/LSTM-python/scaler.save  Normal file (binary, not shown)
BIN  src/Machine-Learning/LSTM-python/scaler_features.save  Normal file (binary, not shown)
BIN  src/Machine-Learning/LSTM-python/scaler_target.save  Normal file (binary, not shown)
9  src/Machine-Learning/LSTM-python/system_info.txt  Normal file
@@ -0,0 +1,9 @@
- OS: Linux-6.1.0-30-amd64-x86_64-with-glibc2.36 # 1 SMP PREEMPT_DYNAMIC Debian 6.1.124-1 (2025-01-12)
- Python: 3.11.2
- Stable-Baselines3: 2.4.1
- PyTorch: 2.5.1+cu124
- GPU Enabled: False
- Numpy: 1.26.4
- Cloudpickle: 3.1.1
- Gymnasium: 1.0.0
- OpenAI Gym: 0.26.2
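system_info.txt matches the layout of Stable-Baselines3's environment report, so it was most likely captured with something along these lines (a guess, not confirmed anywhere in the commit):

# Hedged sketch: regenerate system_info.txt with SB3's built-in reporter.
import stable_baselines3 as sb3

# get_system_info() returns (info_dict, info_str); writing the string out
# reproduces the "- OS: ... / - Python: ..." lines shown above.
_, info_str = sb3.get_system_info(print_info=False)
with open("system_info.txt", "w") as f:
    f.write(info_str)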
241  src/Machine-Learning/LSTM-python/use_dqn.py  Normal file
@@ -0,0 +1,241 @@
import argparse
import gym
import numpy as np
import pandas as pd
from tabulate import tabulate

from stable_baselines3 import DQN
from stable_baselines3.common.vec_env import DummyVecEnv


###############################
# 1. HELPER FUNCTIONS
###############################
def compute_rsi(series, window=14):
    delta = series.diff()
    gain = (delta.where(delta > 0, 0)).rolling(window=window).mean()
    loss = (-delta.where(delta < 0, 0)).rolling(window=window).mean()
    RS = gain / loss
    RSI = 100 - (100 / (1 + RS))
    return RSI


def compute_macd(series, span_short=12, span_long=26, span_signal=9):
    ema_short = series.ewm(span=span_short, adjust=False).mean()
    ema_long = series.ewm(span=span_long, adjust=False).mean()
    macd_line = ema_short - ema_long
    signal_line = macd_line.ewm(span=span_signal, adjust=False).mean()
    return macd_line - signal_line  # MACD histogram


def compute_adx(df, window=14):
    # Placeholder for ADX calculation
    return df['Close'].rolling(window=window).std()


def compute_obv(df):
    # On-Balance Volume calculation
    OBV = (np.sign(df['Close'].diff()) * df['Volume']).fillna(0).cumsum()
    return OBV


def compute_technical_indicators(df):
    df['SMA_5'] = df['Close'].rolling(window=5).mean()
    df['SMA_10'] = df['Close'].rolling(window=10).mean()
    df['EMA_5'] = df['Close'].ewm(span=5, adjust=False).mean()
    df['EMA_10'] = df['Close'].ewm(span=10, adjust=False).mean()
    df['STDDEV_5'] = df['Close'].rolling(window=5).std()
    df['RSI'] = compute_rsi(df['Close'], 14)
    df['MACD'] = compute_macd(df['Close'])
    df['ADX'] = compute_adx(df)
    df['OBV'] = compute_obv(df)
    df.dropna(inplace=True)
    return df


###############################
# 2. ENVIRONMENT DEFINITION
###############################
class StockTradingEnv(gym.Env):
    """
    Simple environment using older Gym API for SB3.
    """
    def __init__(self, df, initial_balance=10000):
        super().__init__()
        self.df = df.reset_index(drop=True)
        self.initial_balance = initial_balance
        self.balance = initial_balance
        self.net_worth = initial_balance
        self.max_steps = len(df)
        self.current_step = 0
        self.shares_held = 0
        self.cost_basis = 0

        self.feature_columns = [
            'SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5',
            'RSI', 'MACD', 'ADX', 'OBV', 'Volume',
            'Open', 'High', 'Low'
        ]

        self.action_space = gym.spaces.Discrete(3)
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0,
            shape=(len(self.feature_columns)+3,),
            dtype=np.float32
        )

    def reset(self):
        self.balance = self.initial_balance
        self.net_worth = self.initial_balance
        self.current_step = 0
        self.shares_held = 0
        self.cost_basis = 0
        return self._next_observation()

    def step(self, action):
        current_price = self.df.loc[self.current_step, 'Close']

        # BUY
        if action == 2:
            total_possible = self.balance // current_price
            shares_bought = int(total_possible)
            if shares_bought > 0:
                prev_shares = self.shares_held
                self.balance -= shares_bought * current_price
                self.shares_held += shares_bought
                self.cost_basis = (
                    (self.cost_basis * prev_shares) + (shares_bought * current_price)
                ) / self.shares_held

        # SELL
        elif action == 0:
            if self.shares_held > 0:
                self.balance += self.shares_held * current_price
                self.shares_held = 0
                self.cost_basis = 0

        self.net_worth = self.balance + self.shares_held * current_price
        self.current_step += 1
        done = (self.current_step >= self.max_steps - 1)

        reward = self.net_worth - self.initial_balance
        obs = self._next_observation()
        return obs, reward, done, {}

    def _next_observation(self):
        row = self.df.loc[self.current_step, self.feature_columns].values
        max_val = np.max(row) if np.max(row) != 0 else 1.0
        row_norm = row / max_val

        additional = np.array([
            self.balance / self.initial_balance,
            self.shares_held / 100.0,
            self.cost_basis / self.initial_balance
        ], dtype=np.float32)

        obs = np.concatenate([row_norm, additional]).astype(np.float32)
        return obs

    def render(self, mode='human'):
        profit = self.net_worth - self.initial_balance
        print(f"Step: {self.current_step} | "
              f"Balance: {self.balance:.2f} | "
              f"Shares: {self.shares_held} | "
              f"NetWorth: {self.net_worth:.2f} | "
              f"Profit: {profit:.2f}")


###############################
# 3. ARGUMENT PARSING
###############################
def parse_arguments():
    parser = argparse.ArgumentParser(description="Use a trained DQN model to run a stock trading simulation.")
    parser.add_argument("-s", "--show-steps", type=int, default=15,
                        help="Number of final steps to display in the summary (default: 15, max: 300).")
    return parser.parse_args()


###############################
# 4. MAIN FUNCTION
###############################
def main():
    args = parse_arguments()
    # Bound how many steps we show at the end
    steps_to_display = min(args.show_steps, 300)

    # 1) Load CSV
    df = pd.read_csv('BAT.csv')
    rename_mapping = {
        'time': 'Date',
        'open': 'Open',
        'high': 'High',
        'low': 'Low',
        'close': 'Close'
    }
    df.rename(columns=rename_mapping, inplace=True)
    # Normalize the volume column name before computing indicators,
    # since compute_obv() expects a 'Volume' column.
    if 'volume' in df.columns and 'Volume' not in df.columns:
        df.rename(columns={'volume': 'Volume'}, inplace=True)
    df.sort_values('Date', inplace=True)
    df.reset_index(drop=True, inplace=True)
    df = compute_technical_indicators(df)

    # 2) Instantiate environment
    raw_env = StockTradingEnv(df)
    vec_env = DummyVecEnv([lambda: raw_env])

    # 3) Load your DQN model
    model = DQN.load("dqn_stock_trading.zip", env=vec_env)

    # 4) Run inference
    obs = vec_env.reset()
    done = [False]
    total_reward = 0.0
    step_data = []
    step_count = 0

    underlying_env = vec_env.envs[0]

    while not done[0]:
        step_count += 1
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, done, info = vec_env.step(action)
        reward_scalar = reward[0]
        total_reward += reward_scalar

        step_data.append({
            "Step": step_count,
            "Action": int(action[0]),
            "Reward": reward_scalar,
            "Balance": underlying_env.balance,
            "Shares": underlying_env.shares_held,
            "NetWorth": underlying_env.net_worth
        })

    final_net_worth = underlying_env.net_worth
    final_profit = final_net_worth - underlying_env.initial_balance

    # 5) Print final summary
    print("\n=== DQN Agent Finished ===")
    print(f"Total Steps Taken: {step_count}")
    print(f"Final Net Worth: {final_net_worth:.2f}")
    print(f"Final Profit: {final_profit:.2f}")
    print(f"Sum of Rewards: {total_reward:.2f}")

    # Count actions
    buy_count = sum(1 for x in step_data if x["Action"] == 2)
    sell_count = sum(1 for x in step_data if x["Action"] == 0)
    hold_count = sum(1 for x in step_data if x["Action"] == 1)
    print(f"Actions Taken -> BUY: {buy_count}, SELL: {sell_count}, HOLD: {hold_count}")

    # 6) Show the last N steps, where N=steps_to_display
    last_n = step_data[-steps_to_display:] if len(step_data) > steps_to_display else step_data
    rows = []
    for d in last_n:
        rows.append([
            d["Step"], d["Action"], f"{d['Reward']:.2f}",
            f"{d['Balance']:.2f}", d["Shares"], f"{d['NetWorth']:.2f}"
        ])
    headers = ["Step", "Action", "Reward", "Balance", "Shares", "NetWorth"]
    print(f"\n== Last {steps_to_display} Steps ==")
    print(tabulate(rows, headers=headers, tablefmt="pretty"))


if __name__ == "__main__":
    main()
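The summary printed by use_dqn.py reports absolute profit only; a buy-and-hold baseline over the same rows makes that number easier to judge. A small sketch that could be appended to the end of `main()` above (variable names follow the script; this is an illustration, not part of the commit):

# Hedged sketch: compare the agent's final net worth against buy-and-hold
# over the same evaluation window. `df`, `underlying_env` and `final_net_worth`
# are the variables already defined in use_dqn.py's main().
initial_balance = underlying_env.initial_balance
first_price = df.loc[0, 'Close']
last_price = df.loc[len(df) - 1, 'Close']

shares = initial_balance // first_price
buy_hold_net_worth = shares * last_price + (initial_balance - shares * first_price)

print(f"Buy & Hold Net Worth: {buy_hold_net_worth:.2f}")
print(f"DQN minus Buy & Hold: {final_net_worth - buy_hold_net_worth:.2f}")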
134  src/Machine-Learning/LSTM-python/verify_setup.py  Normal file
@@ -0,0 +1,134 @@
import os
import sys
import pandas as pd
import tensorflow as tf
from stable_baselines3.common.vec_env import DummyVecEnv
import gym
from gym import spaces
import numpy as np
import logging

# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # Suppress INFO and WARNING messages

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')


class StockTradingEnv(gym.Env):
    """
    A minimal stock trading environment for testing DummyVecEnv.
    """
    metadata = {'render.modes': ['human']}

    def __init__(self, df, initial_balance=10000):
        super(StockTradingEnv, self).__init__()

        self.df = df.reset_index()
        self.initial_balance = initial_balance
        self.balance = initial_balance
        self.net_worth = initial_balance
        self.max_steps = len(df)
        self.current_step = 0
        self.shares_held = 0
        self.cost_basis = 0

        # Actions: 0 = Sell, 1 = Hold, 2 = Buy
        self.action_space = spaces.Discrete(3)

        # Observations: [normalized features + balance + shares held + cost basis]
        # For simplicity, we'll use the same features as your main script
        feature_columns = ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']
        self.observation_space = spaces.Box(low=0, high=1, shape=(len(feature_columns) + 3,), dtype=np.float32)

    def reset(self):
        self.balance = self.initial_balance
        self.net_worth = self.initial_balance
        self.current_step = 0
        self.shares_held = 0
        self.cost_basis = 0
        return self._next_observation()

    def _next_observation(self):
        obs = self.df.loc[self.current_step, ['SMA_5', 'SMA_10', 'EMA_5', 'EMA_10', 'STDDEV_5', 'RSI', 'MACD', 'ADX', 'OBV', 'Volume', 'Open', 'High', 'Low']].values
        # Normalize additional features
        additional = np.array([
            self.balance / self.initial_balance,
            self.shares_held / 100,  # Assuming a maximum of 100 shares for normalization
            self.cost_basis / self.initial_balance
        ])
        return np.concatenate([obs, additional])

    def step(self, action):
        current_price = self.df.loc[self.current_step, 'Close']

        if action == 2:  # Buy
            total_possible = self.balance // current_price
            shares_bought = total_possible
            if shares_bought > 0:
                self.balance -= shares_bought * current_price
                self.shares_held += shares_bought
                self.cost_basis = (self.cost_basis * (self.shares_held - shares_bought) + shares_bought * current_price) / self.shares_held
        elif action == 0:  # Sell
            if self.shares_held > 0:
                self.balance += self.shares_held * current_price
                self.shares_held = 0
                self.cost_basis = 0
        # Hold does nothing

        self.net_worth = self.balance + self.shares_held * current_price
        self.current_step += 1

        done = self.current_step >= self.max_steps - 1

        # Reward: change in net worth
        reward = self.net_worth - self.initial_balance

        obs = self._next_observation()

        return obs, reward, done, {}

    def render(self, mode='human', close=False):
        profit = self.net_worth - self.initial_balance
        print(f'Step: {self.current_step}')
        print(f'Balance: {self.balance}')
        print(f'Shares held: {self.shares_held} (Cost Basis: {self.cost_basis})')
        print(f'Net worth: {self.net_worth}')
        print(f'Profit: {profit}')


def main(file_path):
    # Check if file exists
    if not os.path.exists(file_path):
        logging.error(f"File not found: {file_path}")
        sys.exit(1)
    logging.info("File exists.")

    # Load a small portion of the data
    try:
        data = pd.read_csv(file_path, nrows=5)
        logging.info("Data loaded successfully:")
        print(data.head())
    except Exception as e:
        logging.error(f"Error loading data: {e}")
        sys.exit(1)

    # Check TensorFlow GPU availability
    gpus = tf.config.list_physical_devices('GPU')
    logging.info("TensorFlow GPU Availability:")
    print(gpus)

    # Check DummyVecEnv import and initialization
    try:
        # Initialize a minimal environment for testing
        test_env = StockTradingEnv(data)
        env = DummyVecEnv([lambda: test_env])
        logging.info("DummyVecEnv imported and initialized successfully.")
    except Exception as e:
        logging.error(f"Error initializing DummyVecEnv: {e}")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        logging.error("Usage: python verify_setup.py <path_to_csv>")
        sys.exit(1)
    file_path = sys.argv[1]
    main(file_path)