Inspiration
I built this app to help everyday people get a better idea of where stock prices might be heading. Investing in the stock market can be confusing and risky, especially if you’re not sure when to buy or sell, so I wanted to create a simple tool that lets users check past stock prices and see future predictions made with machine learning. I also added user login features to make it personal and secure. The goal is to make stock market insights easy and useful for everyone, whether you're just starting out or already investing.
What it does
This web app is designed to help users explore and understand stock price trends, even if they don’t have a background in finance or data science. After creating an account or logging in, users can enter the stock ticker symbol of any publicly traded company (like AAPL for Apple or TSLA for Tesla) to instantly access its historical stock price data. The app pulls this information directly from Yahoo Finance.
Once the data is loaded, users can see recent price changes, as well as a full chart showing how the stock has performed over time. But it doesn't stop there: the app also uses a simple machine learning model (Linear Regression) to analyze the historical data and predict what the stock price might be on the next day.
Along with the prediction, the app shows how well the model performs using metrics like R-squared and Mean Squared Error, which help users understand how accurate the predictions are. Users can also view visual comparisons between the actual stock prices and the predicted ones through clean, easy-to-read graphs.
The app includes user-friendly features like sign-up, login, and logout, and even allows users to update their password or delete their account entirely. All user data is stored in a PostgreSQL database, with passwords hashed rather than kept in plain text.
In short, this project combines finance, data science, and web development to provide an interactive platform where anyone, from beginners to experienced investors, can explore stock trends, see predictions, and make more informed decisions about the stock market.
How we built it
This project is a full-stack machine learning web application built using Python and various supporting libraries for data handling, model training, and web interface development.
Frontend & UI: I used Streamlit to build the interactive web interface. Streamlit’s widget system enabled rapid prototyping and integration of user inputs, dynamic visualizations, and layout control, all within Python.
Authentication System: User authentication (Sign Up, Login, Password Update, Delete Account) is handled via a PostgreSQL backend. I used psycopg2 to interact with the database, and SHA-256 hashing (hashlib) to securely store user passwords. Session handling is managed using st.session_state to maintain login state across pages.
Data Source: I used yfinance to fetch historical stock price data based on user-specified ticker symbols and date ranges. This data is retrieved on request and cached using @st.cache_data to improve performance.
Machine Learning Model: The core predictive model is a Linear Regression model from scikit-learn.
Features: Dates (converted to ordinal values)
Target: Adjusted Close prices (Adj Close, with fallback to Close)
The model is trained on the historical price data to predict future values, specifically the next day’s price.
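As a rough illustration of this setup, here is a minimal sketch that fits a straight line to (date ordinal, price) pairs and extrapolates one day ahead. It uses the closed-form least-squares formulas in plain Python rather than scikit-learn, so it is not the app's code, although for a single feature scikit-learn's LinearRegression should produce the same line. The function names and the sample prices are made up for illustration.

```python
from datetime import date, timedelta


def fit_trend(dates, prices):
    """Least-squares line through (date ordinal, price) pairs.

    Returns (slope, intercept). Hypothetical helper, not the app's code.
    """
    xs = [d.toordinal() for d in dates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(prices) / n
    # Centering x before squaring keeps the arithmetic well-conditioned,
    # since date ordinals are large numbers (several hundred thousand).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, prices))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next_day(dates, prices):
    """Extrapolate the fitted line to the calendar day after the last date."""
    slope, intercept = fit_trend(dates, prices)
    next_day = max(dates) + timedelta(days=1)
    return next_day, slope * next_day.toordinal() + intercept


# Made-up prices that rise by exactly 1.0 per day.
sample_dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(5)]
sample_prices = [100.0, 101.0, 102.0, 103.0, 104.0]
next_day, predicted = predict_next_day(sample_dates, sample_prices)
```

Because the only input is the date, the prediction is just the trend line extended by one day, which is why the model captures overall direction but not day-to-day swings.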
Model Evaluation: I calculate and display:
R² (R-squared) – to evaluate goodness of fit
Mean Squared Error (MSE) – to quantify prediction error
Visualization: Matplotlib is used to create two key plots:
Historical stock prices over time
Actual vs. Predicted prices (including the next day’s prediction)
Deployment Readiness: The app is modular and ready for deployment via Streamlit Cloud, Docker, or other hosting platforms. With environment variables set for database credentials, it can be easily ported to production with secure configurations.
Challenges we ran into
While building this project, I faced a number of interesting and sometimes tricky challenges, covering everything from user authentication to modeling stock prices. Here's a breakdown of what I ran into along the way:
🔐 1. User Authentication & Secure Credential Management
Implementing user authentication in Streamlit wasn’t as straightforward as using frameworks like Flask or Django, which have built-in tools for it. I had to get a bit creative.
First, I set up a PostgreSQL database to store user data. To make sure passwords were protected, I hashed them using SHA-256 before storing them: no plain-text passwords, ever.
To protect the database from SQL injection attacks, I made sure to use parameterized SQL queries with psycopg2. On top of that, since Streamlit doesn’t maintain state across reruns by default, I had to manage login sessions manually using st.session_state. That way, I could keep track of who was logged in and what actions they could access, though it definitely got tricky because Streamlit re-runs the script every time a user interacts with a widget.
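The hashing and parameterized-query pattern described here can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in sqlite3 with an in-memory database as a stand-in for PostgreSQL; with psycopg2 the same idea applies, but the placeholder is %s rather than ?. The users table layout and the function names are assumptions, not the app's actual schema.

```python
import hashlib
import sqlite3


def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def create_user(conn, username: str, password: str) -> bool:
    """Insert a user; return False if the username is already taken."""
    try:
        # Values are passed separately from the SQL text, so user input
        # is never interpolated into the query string.
        conn.execute(
            "INSERT INTO users (username, password_hash) VALUES (?, ?)",
            (username, hash_password(password)),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def check_login(conn, username: str, password: str) -> bool:
    """Return True when the stored hash matches the supplied password."""
    row = conn.execute(
        "SELECT password_hash FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    return row is not None and row[0] == hash_password(password)


# In-memory stand-in for the PostgreSQL database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (username TEXT PRIMARY KEY, password_hash TEXT NOT NULL)"
)
```

With this pattern, a classic injection string such as ' OR '1'='1 is treated as an ordinary username that matches no row, rather than as part of the SQL.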
📉 2. Stock Data Fetching and Validation
Using the yfinance library made it easy to pull stock data, but I quickly realized it’s not always reliable.
Sometimes, if a user entered a wrong or non-existent ticker symbol, yfinance would fail silently or return an empty DataFrame. To prevent the app from crashing, I had to write validation logic to catch and handle these cases.
Another issue was with missing columns: some stocks didn’t have the Adj Close column, which I used for modeling. In those cases, I had to fall back to using the Close column and inform the user with a warning message about the difference. Plus, I added more error handling to deal with things like network issues or rate-limiting from Yahoo’s API.
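A minimal sketch of that validation and fallback logic might look like the following. The helper name is hypothetical, and it assumes the downloaded data behaves like a pandas DataFrame with flat column names (an .empty flag and a .columns collection); it is an illustration of the idea, not the app's actual code.

```python
def pick_price_column(data):
    """Choose the column to model, or raise if the download is unusable.

    `data` is expected to look like a pandas DataFrame: it needs an
    `.empty` flag and a `.columns` collection. Hypothetical helper.
    Returns (column_name, used_fallback).
    """
    if data is None or data.empty:
        raise ValueError("No price data returned; check the ticker symbol.")
    columns = list(data.columns)
    if "Adj Close" in columns:
        return "Adj Close", False
    if "Close" in columns:
        # The caller can show a warning that unadjusted prices are in use.
        return "Close", True
    raise ValueError("Price data has neither 'Adj Close' nor 'Close'.")
```

Returning the fallback flag alongside the column name lets the UI decide how to warn the user, while the ValueError cases can be caught and shown as a friendly error instead of a crash.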
🧠 3. ML Modeling with Limited Features
For the prediction model, I decided to use Linear Regression from scikit-learn. It’s fast, easy to implement, and a good starting point, but stock data is noisy and complex, so this came with limitations.
Feature Selection
I kept it simple by using only one feature: the date. That means I didn’t include things like trading volume, moving averages, or macroeconomic indicators, all of which could improve predictions. I knew this would limit accuracy, but the goal was to demonstrate how ML could be integrated into the app, not to build a full trading engine.
Overfitting vs. Underfitting
Linear models tend to underfit volatile stock data, so while the predictions showed general trends, they weren’t great for short-term forecasting. To evaluate performance, I used R² (R-squared) to see how well the model fit the data, and MSE (Mean Squared Error) to measure how far off the predictions were.
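For reference, the two metrics can be written out in a few lines of plain Python. This is only a sketch of the definitions with made-up numbers; in the app itself they would more likely come from scikit-learn's mean_squared_error and r2_score functions. The names mse and r_squared below are illustrative.

```python
def mse(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares).

    Undefined when every actual value is identical (total sum is zero).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration.
actual = [10.0, 12.0, 14.0, 16.0]
perfect = [10.0, 12.0, 14.0, 16.0]
flat = [13.0, 13.0, 13.0, 13.0]  # always predicts the mean of `actual`
```

A perfect fit gives an MSE of 0 and an R² of 1, while a model that always predicts the mean gives an R² of 0. MSE is in squared price units, so it is easiest to interpret relative to the stock's price level.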
Date Handling for Predictions
One interesting challenge was working with date values. Since the model used ordinal numbers for dates, I had to carefully convert predictions back into datetime format for display. That also meant I had to consider weekends and holidays, days when the market is closed, even though the model would still try to predict them.
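The conversion and the weekend issue can be illustrated with the standard library alone. The sketch below is an assumption about how this could be handled, with hypothetical function names; it skips Saturdays and Sundays but does not know about exchange holidays.

```python
from datetime import date, timedelta


def ordinal_to_date(ordinal: int) -> date:
    """Convert a model input (proleptic Gregorian ordinal) back to a date."""
    return date.fromordinal(ordinal)


def next_weekday(day: date) -> date:
    """Return the first Monday-to-Friday date strictly after `day`.

    Exchange holidays are NOT handled here; that would need a
    market calendar, which this sketch does not assume.
    """
    candidate = day + timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate
```

For example, calling next_weekday on a Friday returns the following Monday, so a "next day" prediction made after Friday's close would be labeled with a day the market is at least normally open.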
Prediction Interpretation
Linear Regression draws a straight line through the data, assuming the price will keep rising or falling in a consistent way, which rarely reflects reality in the stock market. So I added clear labels and disclaimers to let users know this is just a basic trend model, not financial advice or a trading recommendation.
📊 4. Visualization and Plot Rendering
Visualizing the results was a big part of making the app feel interactive and useful, but there were a few bumps here too.
I used Matplotlib for plotting, but integrating multiple figures on the same Streamlit page required careful use of st.pyplot() and manual control of figure objects.
Another challenge was keeping the plots reactive to user inputs like ticker symbols and date ranges. I had to use caching and state tracking to prevent unnecessary re-renders or data fetches.
Also, when plotting predicted vs. actual prices, I had to map the ordinal date values back to real dates, otherwise the x-axis labels wouldn’t make sense to users.
⚙️ 5. State Management in Streamlit
One of the most persistent challenges was dealing with state in Streamlit. Since it’s stateless by design (the script re-runs every time the user interacts), building a multi-step flow like login, navigating to the profile, or running predictions required careful management using st.session_state.
I used it to store:
Login status
Username
Temporary user data like profile settings
I also had to implement manual page navigation logic to move between sections like Login, Sign Up, Home, and Profile, while protecting access to sensitive areas. Finally, when users logged out or deleted their account, I had to clear the session state properly to avoid leaving behind any stale data.
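The login, logout, and page-protection logic can be sketched against any dict-like store. In the app that store would be st.session_state; here a plain dict stands in so the example runs without Streamlit. The key names, the function names, and the choice of which pages are public are all assumptions made for illustration.

```python
def log_in(state, username):
    """Record a successful login in a dict-like session store."""
    state["logged_in"] = True
    state["username"] = username


def log_out(state):
    """Remove every key so nothing from the old session lingers."""
    for key in list(state.keys()):
        del state[key]


def can_view(state, page):
    """Allow public pages for everyone and other pages only when logged in."""
    public_pages = {"Login", "Sign Up"}  # assumed set of public pages
    if page in public_pages:
        return True
    return "logged_in" in state and bool(state["logged_in"])
```

Checking can_view at the top of each protected section, before rendering anything, fits Streamlit's rerun model: the whole script executes again on every interaction, so the guard is re-evaluated each time.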
🧮 6. Database Integration
Using PostgreSQL with psycopg2 gave me flexibility and reliability on the backend, but working with raw SQL required attention to detail.
I had to:
Use parameterized queries consistently to avoid injection attacks.
Manage connection and cursor lifecycles to prevent leaks or crashes, especially under rapid user interaction.
Handle edge cases like duplicate usernames during sign-up, blank form submissions, and deletion of accounts that might not exist.
Think ahead about deployment and security by preparing the code to pull database credentials from environment variables, so nothing sensitive is hardcoded.
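As one possible way to keep credentials out of the code, the sketch below builds connection settings from environment variables. The variable names (DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD), the defaults, and the function name are hypothetical; the project does not state which names it uses.

```python
import os


def load_db_settings(env=None):
    """Build connection settings from environment variables.

    The variable names are hypothetical. `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    required = ("DB_NAME", "DB_USER", "DB_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env.get("DB_HOST", "localhost"),
        "port": int(env.get("DB_PORT", "5432")),
        "dbname": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

The dictionary keys match keyword arguments that psycopg2.connect accepts, so the result could be passed as psycopg2.connect(**settings). Failing early on missing variables makes a misconfigured deployment obvious at startup rather than at the first login attempt.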
Accomplishments that we're proud of
Looking back at the project, there are several accomplishments that I’m really proud of, especially considering the challenges and the learning curve involved.
✅ Built a Full-Stack ML Web App from Scratch
One of the biggest wins was putting together a complete end-to-end application that combines data science, web development, and database integration. From the frontend UI to the backend database to the ML model in the middle, I was able to stitch everything together into a smooth and functional experience.
🔐 Implemented Secure User Authentication
Setting up a user login system that is actually secure was something I’d never done before in Streamlit. I managed to:
Hash passwords properly using SHA-256
Use parameterized queries to avoid SQL injection
Handle sessions manually with st.session_state
It’s simple but secure, and it works well for a prototype.
📊 Integrated Machine Learning in a Real-Time Web App
It felt great to integrate a working Linear Regression model that pulls real stock data, trains on it instantly, and provides predictions, all in a few seconds. Even though the model is basic, it’s a live demo of machine learning in action, and it helps users understand trends visually and interactively.
📈 Clean, Interactive Visualizations
Being able to plot both historical stock prices and the predicted trend line, and make those plots respond to user input (like ticker and date range), really brought the app to life. It was also satisfying to handle date conversions and formatting correctly so that everything was readable and user-friendly.
💾 Successfully Connected to a PostgreSQL Database
Setting up a real PostgreSQL backend, writing clean SQL queries, and using it to manage user accounts was another important milestone. It gave the project a layer of real-world functionality, moving beyond a typical single-user demo app.
🚀 Learned a Lot Along the Way
Finally, I'm proud of how much I learned, not just about libraries like yfinance, scikit-learn, and psycopg2, but also about:
Data validation and error handling
App structure and state flow in Streamlit
Balancing simplicity with usability and accuracy
This project helped me grow across multiple areas of development, and seeing it work end-to-end, with users logging in, fetching live stock data, and getting predictions, made it all worth it.
What we learned
This project was a great learning experience that taught me a lot across different areas, from software development to machine learning and data handling. Here are some key takeaways:
🔐 The Importance of Secure Authentication
Building a user login system from scratch made me realize how critical it is to protect user data. Hashing passwords, preventing SQL injection, and managing sessions securely are foundational, even in seemingly simple apps.
📉 Handling Real-World Data Isn’t Always Easy
Working with live stock data through yfinance taught me that data can be messy or incomplete. You can’t always rely on perfect inputs or API responses, so you need robust validation, error handling, and fallback plans to keep your app stable.
🧠 Simple Models Have Their Limits
Using Linear Regression on stock prices was a great starting point, but I learned that financial data is complex, noisy, and often non-linear. Good predictions require richer features and more advanced techniques, but starting simple helped me understand the basics of training and evaluating models.
📊 Visualizations Need Thoughtful Design
Plotting data interactively in Streamlit was more challenging than expected. I learned how to manage multiple plots, keep them responsive to user input, and convert data formats so charts are clear and meaningful.
⚙️ State Management in Streamlit Requires Care
Streamlit’s rerun-based model means you have to plan carefully for things like user sessions and navigation. Using st.session_state effectively was key to providing a smooth user experience without unauthorized access or confusing app behavior.
🧮 Integrating a Database Adds Complexity but is Essential
Connecting to PostgreSQL introduced me to the realities of database management in web apps, from writing secure queries to handling edge cases. It made the app more realistic and prepared me for deploying full-stack applications.
🚀 Building an End-to-End System Requires Cross-Disciplinary Skills
This project reinforced how important it is to combine multiple skills (backend, frontend, data science, security, and user experience) to build something practical and user-friendly.
Overall, this journey improved my confidence in handling both the technical challenges and design decisions needed to build real-world ML-powered apps.
What's next for Stocktells
Stocktells has laid a solid foundation, but there’s plenty of exciting room to grow and improve. Here’s what I’m planning to work on next:
🚀 Enhance the Machine Learning Model
Right now, the app uses a simple Linear Regression model with only the date as a feature. The next step is to explore more sophisticated algorithms like Random Forests, Gradient Boosting, or even LSTM neural networks that can better capture the complex patterns and volatility in stock prices. I also want to include additional features such as trading volume, moving averages, or external factors like news sentiment to improve prediction accuracy.
🔍 Improve Data Quality and Handling
To make the app more robust, I plan to add smarter data validation and pre-processing: for example, handling missing data more gracefully, better managing non-trading days like weekends and holidays, and incorporating multiple data sources to cross-check and enrich stock information.
🔐 Upgrade Security and User Management
I want to implement stronger security features such as email verification, password reset via email, and possibly OAuth integration for third-party logins (Google, GitHub). Improving session management and scaling user authentication will help make the app production-ready.
📱 Build a More Interactive and User-Friendly UI
Right now, the interface is functional but basic. I’d like to improve the UI/UX with richer interactivity like customizable charts, historical data comparisons, and more detailed model explanations to make the insights more accessible and actionable for users.
☁️ Deploy on a Scalable Cloud Platform
Moving from local or simple hosting to a scalable cloud platform will allow Stocktells to handle more users, automate data updates, and improve performance.
🤝 Add Collaborative Features and Social Sharing
Future versions could include features like user portfolios, watchlists, alerts, and even social sharing so users can discuss predictions and stock trends with others.
📈 Expand Beyond Stocks
Eventually, Stocktells could expand to cover other financial instruments like cryptocurrencies, ETFs, or commodities, broadening its usefulness.
Stocktells is just getting started, and I’m excited to keep improving it, making stock data and machine learning accessible, transparent, and useful for everyone.
Built With
- matplotlib
- postgresql
- python
- scikit-learn
- streamlit
- yfinance