In this article, we will walk through building a deep learning application that forecasts stock prices for major Nifty 50 companies in India. We will use LSTM (Long Short-Term Memory) networks for the prediction model, fetch data from Yahoo Finance, and build a user-friendly interface with Streamlit. Let’s get started!

Project Structure

First, let’s set up the project structure. Your project directory should look like this:

stock_forecasting_app/
├── data/                 # Folder to save CSV files of each company
├── models/               # Folder to save trained models for each company
├── data_fetch.py         # Script to fetch historical stock data
├── train_models.py       # Script to train LSTM models
├── utils.py              # Utility functions and constants
├── config.ini            # Configuration file
├── requirements.txt      # Required Python packages
└── frontend.py           # Streamlit UI script

Step 1: Setting Up the Environment

Create a virtual environment for your project and activate it:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Create a requirements.txt file in the project root folder (stock_forecasting_app/) and add the following packages to it:

numpy==1.26.4  # TensorFlow 2.16.1 requires numpy<2.0
pandas==2.2.2
plotly==5.22.0
scikit_learn==1.3.2
streamlit==1.36.0
tensorflow==2.16.1
tensorflow_intel==2.16.1; sys_platform == "win32"  # Windows-only package
yfinance==0.2.40

Install the required packages:

pip install -r requirements.txt

Step 2: Fetching Historical Data

Create a utils.py file in the project root folder (stock_forecasting_app/) to store company names with their NSE ticker codes in the form Yahoo Finance expects. We will use companies from the NIFTY 50 index (the dictionary below contains 43 of them; extend it as needed):

nifty50_companies = {
    "Adani Enterprises": "ADANIENT.NS",
    "Asian Paints": "ASIANPAINT.NS",
    "Axis Bank": "AXISBANK.NS",
    "Bajaj Finance": "BAJFINANCE.NS",
    "Bharat Petroleum": "BPCL.NS",
    "Bharti Airtel": "BHARTIARTL.NS",
    "Britannia": "BRITANNIA.NS",
    "Cipla": "CIPLA.NS",
    "Coal India": "COALINDIA.NS",
    "Divi's Laboratories": "DIVISLAB.NS",
    "Dr. Reddy's Laboratories": "DRREDDY.NS",
    "Eicher Motors": "EICHERMOT.NS",
    "Grasim Industries": "GRASIM.NS",
    "HCL Technologies": "HCLTECH.NS",
    "HDFC Bank": "HDFCBANK.NS",
    "Hero MotoCorp": "HEROMOTOCO.NS",
    "Hindalco": "HINDALCO.NS",
    "Hindustan Unilever": "HINDUNILVR.NS",
    "ICICI Bank": "ICICIBANK.NS",
    "IndusInd Bank": "INDUSINDBK.NS",
    "Infosys": "INFY.NS",
    "ITC": "ITC.NS",
    "JSW Steel": "JSWSTEEL.NS",
    "Kotak Mahindra Bank": "KOTAKBANK.NS",
    "Larsen & Toubro": "LT.NS",
    "Mahindra & Mahindra": "M&M.NS",
    "Maruti Suzuki": "MARUTI.NS",
    "Nestle India": "NESTLEIND.NS",
    "NTPC": "NTPC.NS",
    "Oil and Natural Gas Corporation": "ONGC.NS",
    "Power Grid Corporation": "POWERGRID.NS",
    "Reliance Industries": "RELIANCE.NS",
    "State Bank of India": "SBIN.NS",
    "Shree Cement": "SHREECEM.NS",
    "Sun Pharmaceutical": "SUNPHARMA.NS",
    "Tata Consultancy Services": "TCS.NS",
    "Tata Motors": "TATAMOTORS.NS",
    "Tata Steel": "TATASTEEL.NS",
    "Tech Mahindra": "TECHM.NS",
    "Titan Company": "TITAN.NS",
    "UltraTech Cement": "ULTRACEMCO.NS",
    "UPL": "UPL.NS",
    "Wipro": "WIPRO.NS"
}

Create a config.ini file in the project root folder (stock_forecasting_app/) to store the configuration:

[data]
no_of_data_days = 20000

[model]
no_of_epochs = 10
batch_size = 32
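
The scripts in the following steps read these values with Python's configparser module. As a minimal sketch, the same values can be read with fallbacks so that a missing file or key does not crash a script. The helper name read_settings and the choice of fallback defaults are assumptions for illustration and are not part of the project files:

```python
import configparser

def read_settings(path="config.ini"):
    """Hypothetical helper: read config.ini, falling back to the article's values."""
    config = configparser.ConfigParser()
    config.read(path)  # a missing file is skipped silently, leaving the parser empty
    return {
        "no_of_data_days": config.getint("data", "no_of_data_days", fallback=20000),
        "no_of_epochs": config.getint("model", "no_of_epochs", fallback=10),
        "batch_size": config.getint("model", "batch_size", fallback=32),
    }

if __name__ == "__main__":
    print(read_settings())
```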

We’ll create a script to fetch historical stock data for the companies listed in utils.py. The script is meant to be run every night (for example, from a scheduler) so that the data stays up to date.

data_fetch.py

import os
import configparser
import yfinance as yf
import pandas as pd
from datetime import datetime, timedelta
from utils import nifty50_companies

config = configparser.ConfigParser()
config.read("config.ini")

no_of_data_days = int(config['data']['no_of_data_days'])

def fetch_and_append_data(company, ticker):
    # Make sure the output folder exists before writing to it
    os.makedirs("data", exist_ok=True)
    filename = f"data/{ticker}.csv"
    
    # Default window: go back no_of_data_days from today
    end_date = datetime.today().date()
    start_date = end_date - timedelta(days = no_of_data_days)  # duration back
    
    # Check if file exists
    if os.path.exists(filename):
        # Load existing data
        existing_data = pd.read_csv(filename, index_col='Date', parse_dates=True)
        last_date = existing_data.index[-1].date()
        
        # Fetch data from last date + 1 day to today
        start_date = last_date + timedelta(days=1)
        
        # Nothing to request if the file is already up to date (end date is exclusive)
        if start_date >= end_date:
            print(f"No new data available for {company}.")
            return
    
    # Fetch data from start date to end date
    stock_data = yf.download(ticker, start=start_date, end=end_date)
    
    # Append or save new data
    if not stock_data.empty:
        if os.path.exists(filename):
            stock_data.to_csv(filename, mode='a', header=False)
            print(f"Appended data for {company} successfully.")
        else:
            stock_data.to_csv(filename)
            print(f"Saved data for {company} from {start_date} to {end_date}.")
    else:
        print(f"No new data available for {company}.")

def main():
    for company, ticker in nifty50_companies.items():
        fetch_and_append_data(company, ticker)

if __name__ == "__main__":
    main()

This script fetches data for each company in nifty50_companies, going back the number of days specified in the configuration file, and saves it to a CSV file named after the ticker symbol. We specify a very large number of days so that the first run pulls each company's full history, as far back as Yahoo Finance provides it. Later runs only append the rows that are new since the last saved date.
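
The date window is the core of the incremental update. The sketch below isolates that logic in a hypothetical helper (fetch_window is not part of data_fetch.py) so it can be checked without calling Yahoo Finance. It assumes, as yfinance documents for its download function, that the end date is exclusive:

```python
from datetime import date, timedelta

def fetch_window(today, last_saved=None, lookback_days=20000):
    """Hypothetical helper: return (start, end) for the next download, or None.

    today       -- the date the script runs on (used as the exclusive end date)
    last_saved  -- date of the last row already in the CSV, or None on the first run
    """
    if last_saved is None:
        start = today - timedelta(days=lookback_days)  # first run: full history
    else:
        start = last_saved + timedelta(days=1)         # later runs: only new rows
    if start >= today:
        return None                                    # nothing new to request
    return start, today

# Five days after the last saved row: a window is returned
print(fetch_window(date(2024, 7, 10), last_saved=date(2024, 7, 5)))
# Last saved row is yesterday: nothing to fetch until tomorrow
print(fetch_window(date(2024, 7, 10), last_saved=date(2024, 7, 9)))
```

Keeping this decision in one small function makes it easy to see why running the script twice on the same day does not create duplicate rows.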

Step 3: Training the LSTM Model

We’ll create another script to train an LSTM model for each company. This script is also meant to be run nightly, after the data fetch, to update the models.

train_models.py

import os
import warnings
import numpy as np
import pandas as pd
import configparser
warnings.filterwarnings("ignore")
from utils import nifty50_companies
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

config = configparser.ConfigParser()
config.read("config.ini")

no_of_epochs = int(config['model']['no_of_epochs'])
batch_size = int(config['model']['batch_size'])

def preprocess_data(data):
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(data['Close'].values.reshape(-1, 1))
    return scaled_data, scaler

def build_lstm_model(X, Y):
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(X.shape[1], 1)))
    model.add(LSTM(50, return_sequences=False))
    model.add(Dense(25))
    model.add(Dense(1))

    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(X, Y, batch_size = batch_size, epochs = no_of_epochs)

    return model

def train_and_save_models():
    # Make sure the output folder exists before saving models
    os.makedirs("models", exist_ok=True)
    for company, ticker in nifty50_companies.items():
        filename = f"data/{ticker}.csv"
        model_filename = f"models/{ticker}_model.keras"
        
        # Check if data file exists
        if os.path.exists(filename):
            # Load data
            print(f"Training started for {company}.. Please wait..!!")
            data = pd.read_csv(filename, index_col='Date', parse_dates=True)
            scaled_data, scaler = preprocess_data(data)
            
            # Prepare dataset: each sample is 100 consecutive closes, the target is the next close
            X, Y = [], []
            for i in range(len(scaled_data) - 100):
                X.append(scaled_data[i:(i + 100), 0])
                Y.append(scaled_data[i + 100, 0])
            X, Y = np.array(X), np.array(Y)
            X = X.reshape(X.shape[0], X.shape[1], 1)
            
            # Build and train model
            model = build_lstm_model(X, Y)
            
            # Save model
            model.save(model_filename)
            print(f"Saved model for {company} as {model_filename}.")
            print("-----------------------------------------------")
        else:
            print(f"No data file found for {company}.")

def main():
    train_and_save_models()

if __name__ == "__main__":
    main()

This script reads the data, preprocesses it, trains an LSTM model for each company, and saves each model under its ticker symbol.
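
To make the preprocessing concrete, here is a minimal sketch of the scaling and windowing steps on a toy series. It uses a window of 3 instead of the script's 100 and plain NumPy min-max scaling in place of MinMaxScaler; the function names are hypothetical:

```python
import numpy as np

def min_max_scale(values):
    """Scale a 1-D array to [0, 1]; also return (min, max) for inverting later."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo), (lo, hi)

def make_windows(series, window):
    """Turn a 1-D series into (X, y): each row of X holds `window` consecutive
    values and y holds the value that immediately follows that window."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    # LSTM layers expect input shaped (samples, timesteps, features)
    return X.reshape(-1, window, 1), y

prices = np.array([100.0, 102.0, 101.0, 105.0, 107.0, 110.0])
scaled, (lo, hi) = min_max_scale(prices)
X, y = make_windows(scaled, window=3)
print(X.shape, y.shape)
```

Six prices with a window of 3 give three training samples: the first window covers prices 1 to 3 and its target is price 4, and so on until the last price is used as a target.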

Step 4: Building the User Interface with Streamlit

Finally, we’ll build a user interface using Streamlit. This interface will allow users to load the saved data, visualize it, and make predictions with the trained models.

frontend.py

import numpy as np
import pandas as pd
import streamlit as st
import plotly.graph_objects as go
from utils import nifty50_companies
from tensorflow.keras.models import load_model
from sklearn.preprocessing import MinMaxScaler

def load_data(company):
    filename = f"data/{company}.csv"
    data = pd.read_csv(filename, index_col='Date', parse_dates=True)
    return data

def load_model_file(company):
    model_filename = f"models/{company}_model.keras"
    model = load_model(model_filename)
    return model

def make_predictions(model, data, num_predictions, scaler):
    # Start from the last 100 scaled closes, shaped (samples, timesteps, features)
    test_data = data[-100:].reshape(1, 100, 1)
    predictions = []
    for _ in range(num_predictions):
        predicted_value = model.predict(test_data, verbose=0)[0, 0]
        predictions.append(predicted_value)
        # Slide the window: drop the oldest value and append the new prediction
        test_data = np.roll(test_data, -1, axis=1)
        test_data[0, -1, 0] = predicted_value
    predictions = scaler.inverse_transform(np.array(predictions).reshape(-1, 1))
    return predictions.flatten()

def convert_df_to_csv(df):
    return df.to_csv(index=True).encode('utf-8')

def main():
    st.title("Stock Price Forecasting App")

    company = st.selectbox("Select Company", list(nifty50_companies.keys()))

    if st.button("Get Data"):
        data = load_data(nifty50_companies[company])
        st.session_state.data = data
        st.session_state.company = company

    if "data" in st.session_state:
        st.subheader(f"{st.session_state.company} Original Data: Last 5 Records")
        st.table(st.session_state.data.tail(5))

        st.subheader(f"{st.session_state.company} Complete Data Visualization")
        fig = go.Figure()
        fig.add_trace(go.Scatter(x=st.session_state.data.index, y=st.session_state.data['Open'], mode='lines', name='Open'))
        fig.add_trace(go.Scatter(x=st.session_state.data.index, y=st.session_state.data['High'], mode='lines', name='High'))
        fig.add_trace(go.Scatter(x=st.session_state.data.index, y=st.session_state.data['Low'], mode='lines', name='Low'))
        fig.add_trace(go.Scatter(x=st.session_state.data.index, y=st.session_state.data['Close'], mode='lines', name='Close'))
        fig.update_layout(xaxis_title='Date', yaxis_title='Price')
        st.plotly_chart(fig)

        num_predictions = st.number_input("No of Next Predictions(1-90 days):", 
                                          min_value=1, max_value=90, value=60)

        if st.button("Forecast"):
            model = load_model_file(nifty50_companies[st.session_state.company])
            scaler = MinMaxScaler(feature_range=(0, 1))
            scaled_data = scaler.fit_transform(st.session_state.data['Close'].values.reshape(-1, 1))
            predictions = make_predictions(model, scaled_data, num_predictions, scaler)
            st.session_state.predictions = predictions
            st.session_state.num_predictions = num_predictions

    if "predictions" in st.session_state:
        # Forecast dates use business days ('B') because the exchange is closed on weekends;
        # exchange holidays are not accounted for
        forecast_index = pd.date_range(start=st.session_state.data.index[-1] + pd.Timedelta(days=1), periods=st.session_state.num_predictions, freq='B')
        forecast_df = pd.DataFrame({
            "Date": forecast_index,
            "Forecasted_value": st.session_state.predictions
        })
        forecast_df["Forecasted_value"] = forecast_df["Forecasted_value"].astype("float64").round(2)
        st.subheader(f"Forecasted Values for {st.session_state.company}")
        st.table(forecast_df)

        st.subheader(f"{st.session_state.company} Forecasting")
        fig = go.Figure()
        fig.add_trace(go.Scatter(x=st.session_state.data.index, y=st.session_state.data['Close'], mode='lines', name='Original Data'))
        fig.add_trace(go.Scatter(x=forecast_index, y=st.session_state.predictions, mode='lines', name='Forecasted Data'))
        fig.update_layout(xaxis_title='Date', yaxis_title='Price')
        st.plotly_chart(fig)

        st.subheader("Download Data and Results")
        historical_csv = convert_df_to_csv(st.session_state.data)
        forecast_csv = convert_df_to_csv(forecast_df)

        col1, col2, col3 = st.columns(3)
        with col1:
            st.download_button(label="Download Original Data",
                               data=historical_csv,
                               file_name=f"{nifty50_companies[st.session_state.company]}_historical_data.csv",
                               mime='text/csv')
        
        with col2:
            st.download_button(label="Download Forecasted Results",
                               data=forecast_csv,
                               file_name=f"{nifty50_companies[st.session_state.company]}_forecast_results.csv",
                               mime='text/csv')
        with col3:
            if st.button("Reset"):
                st.session_state.clear()
                st.rerun()

if __name__ == "__main__":
    main()

Explanation:

  1. Select Company: The user can select a company from the Nifty 50 list.
  2. Get Data: This button loads the selected company’s historical data and displays the last 5 records.
  3. Visualize Data: The app plots the complete price history, showing Open, High, Low, and Close prices.
  4. Forecast: After selecting the number of predictions (between 1 and 90), the app forecasts future stock prices.
  5. Display Results: The forecasted values are displayed in a table and plotted alongside the original data.
  6. Download Data: Users can download the historical data and forecast results as CSV files.
  7. Reset: This button clears the session state and reloads the app.
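
The Forecast step above is recursive: each predicted value is fed back in as the newest input for the next step, so errors can accumulate over long horizons. Below is a minimal sketch of that loop. It uses a stand-in predictor (the mean of the window) instead of the trained LSTM so that it runs without TensorFlow; recursive_forecast and predict_next are hypothetical names:

```python
import numpy as np

def recursive_forecast(predict_next, history, window, steps):
    """Forecast `steps` values by repeatedly predicting one step ahead and
    appending each prediction to the input window."""
    buffer = np.asarray(history, dtype=float)[-window:].copy()
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_next(buffer))
        forecasts.append(next_value)
        buffer = np.append(buffer[1:], next_value)  # drop oldest, add newest
    return np.array(forecasts)

# Stand-in for the model: the average of the current window. This is an
# assumption purely for illustration; the app calls the trained LSTM here.
def predict_next(window_values):
    return window_values.mean()

print(recursive_forecast(predict_next, [1.0, 2.0, 3.0, 4.0], window=2, steps=3))
```

Because every step after the first depends on earlier predictions rather than real prices, forecasts near the 90-day limit should be read with more caution than those for the next few days.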

Step 5: How to Execute the Project

1. Fetch the dataset from Yahoo Finance:

python data_fetch.py

When you execute this command, data for each company is fetched from Yahoo Finance and stored as a CSV file in the data/ directory, one file per company listed in utils.py.

2. Train the LSTM models for each company:

python train_models.py

When you execute this command, the data is read and preprocessed, a model is built and trained for each company, and the trained models are stored in the models/ directory.

3. Run the Streamlit App:

streamlit run frontend.py

This opens the UI in your web browser at http://localhost:8501/, where you can use the web application. You can view the data along with its visualization, forecast the next 1 to 90 days, see the forecast plotted alongside the original data, and download both the dataset and the results.

Conclusion

In this article, we have built a stock price forecasting application using deep learning with LSTM models. We created scripts for fetching data and training models, and built a user-friendly interface using Streamlit.

Future Work

This project can be extended in several ways: more sophisticated models, proper model evaluation, tuning of the relevant parameters to improve accuracy, additional data sources, more input features, improved UI elements, and deployment to the cloud.

If you found this article helpful, please give it a like and share it with your friends. We’d love to hear your thoughts in the comments below. Happy coding!
