BaseLine Repo Scanner

About the Project

Inspiration

The inspiration for the Baseline GitHub Repository Checker came from the recurring need to quickly evaluate the health and quality of a codebase. In many development teams, especially larger ones, ensuring that repositories adhere to certain standards, use up-to-date dependencies, and follow best practices is a constant challenge. Manually checking each repository is tedious and error-prone. I wanted to create a tool that automates this process, providing a clear and actionable "baseline" report for any given GitHub repository. The idea was to build something that could help developers and managers get a snapshot of a project's technical state at a glance.

What I Learned

This project was a great learning experience. Here are some of the key takeaways:

Full-Stack Development with Modern Tools: I deepened my understanding of building a full-stack application using a modern MERN-like stack. On the frontend, I got to work with Vite for the first time and was impressed by its speed. I also used TanStack Query for managing server state, which was a great way to handle data fetching, caching, and synchronization.
Real-time Communication: Implementing real-time progress updates with Socket.IO was a fantastic challenge. It taught me a lot about managing WebSocket connections between a React client and a Node.js/Express backend.
Interacting with External APIs: I gained valuable experience using the GitHub API via the @octokit/rest client. I learned how to authenticate, fetch repository data, and handle API rate limits.
Asynchronous Job Processing: The backend uses a job queue to handle the repository scanning process asynchronously. This keeps long-running scans out of the HTTP request/response cycle, so the API can respond immediately while the scan continues in the background, which makes for a better user experience.
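
To illustrate the rate-limit point, here is a minimal sketch of deciding how long to wait before the next GitHub API call. It assumes the response headers are available as a plain object with lowercase names. The helper name `msUntilRetry` is hypothetical and not taken from the project. `x-ratelimit-remaining`, `x-ratelimit-reset` and `retry-after` are headers the GitHub REST API sends.

```javascript
// Hypothetical helper: how many milliseconds to wait before the next API call.
// `headers` is a plain object mapping lowercase header names to values.
function msUntilRetry(headers, nowMs = Date.now()) {
  // Secondary rate limits may send `retry-after` (seconds to wait).
  const retryAfter = Number(headers['retry-after']);
  if (Number.isFinite(retryAfter) && retryAfter > 0) {
    return retryAfter * 1000;
  }

  // Primary limit: wait only if no requests remain in the current window.
  // `x-ratelimit-reset` is the window's reset time in UTC epoch seconds.
  const remaining = Number(headers['x-ratelimit-remaining']);
  const resetEpochSeconds = Number(headers['x-ratelimit-reset']);
  if (remaining === 0 && Number.isFinite(resetEpochSeconds)) {
    return Math.max(0, resetEpochSeconds * 1000 - nowMs);
  }

  return 0; // Budget left (or headers missing): no need to wait.
}
```

A caller could pause for the returned duration before retrying. Whether the project retries automatically or surfaces the limit to the user is not stated in this write-up.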

How I Built It

The project is divided into two main parts: a frontend and a backend.

Backend:
- I started by setting up a Node.js server with Express.
- I used Mongoose to define schemas for Scan and Job models and connected to a MongoDB database to store the scan results.
- The core of the backend is the repoAnalyzer service, which clones a GitHub repository, parses its files, and performs various checks.
- I implemented a simple job queue system to handle scan requests asynchronously. When a new scan is requested, a job is added to the queue.
- Socket.IO is used to emit events from the backend to the frontend, providing real-time updates on the scan progress.
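```
// Example of a backend route (imports added; CommonJS assumed)
const express = require('express');
const { body } = require('express-validator');

const app = express();
app.use(express.json()); // parse JSON bodies so req.body.repoUrl is available

app.post('/api/scan', [
    body('repoUrl').isURL().withMessage('A valid repository URL is required.'),
], async (req, res) => {
    // ... validation and job creation logic
});
```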
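
The queue-plus-events flow described in the backend list can be sketched as below. This is a simplified, in-memory illustration under stated assumptions, not the project's actual code. The class name `ScanQueue`, the event names and the payload shapes are all hypothetical. The `worker` argument stands in for the repoAnalyzer service and `onEvent` stands in for a Socket.IO emit. The sketch runs one job at a time and does not persist jobs to MongoDB.

```javascript
// Hypothetical in-memory queue: runs one scan job at a time, in FIFO order.
class ScanQueue {
  constructor(worker, onEvent = () => {}) {
    this.worker = worker;   // async (job) => result; stands in for repoAnalyzer
    this.onEvent = onEvent; // (name, payload) => void; stands in for a socket emit
    this.pending = [];
    this.running = false;
    this.nextId = 1;
  }

  // Enqueue a job and return its id right away, so a route handler can respond.
  add(repoUrl) {
    const job = { id: this.nextId++, repoUrl, status: 'queued' };
    this.pending.push(job);
    this.onEvent('scan:queued', { jobId: job.id });
    this.drain(); // not awaited: processing continues in the background
    return job.id;
  }

  // Process queued jobs sequentially until the queue is empty.
  async drain() {
    if (this.running) return; // a drain loop is already active
    this.running = true;
    try {
      while (this.pending.length > 0) {
        const job = this.pending.shift();
        job.status = 'running';
        this.onEvent('scan:started', { jobId: job.id });
        try {
          const result = await this.worker(job);
          job.status = 'completed';
          this.onEvent('scan:completed', { jobId: job.id, result });
        } catch (err) {
          // A failed scan is reported and does not stop later jobs.
          job.status = 'failed';
          this.onEvent('scan:failed', { jobId: job.id, error: err.message });
        }
      }
    } finally {
      this.running = false;
    }
  }
}
```

In a route handler, `add` would be called after validation and its return value sent back to the client, which can then follow the job through the emitted events.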
Frontend:
- The frontend is a single-page application built with React and Vite.
- I used React Router to manage navigation between different pages like the home page, the scan page, and the scan details page.
- Material-UI and Tailwind CSS were used to create a clean and modern user interface.
- TanStack Query handles all the data fetching from the backend API.
- A custom useSocket hook was created to manage the Socket.IO connection and listen for real-time events from the backend.
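
As an illustration of the listener management a hook like useSocket has to do, here is a framework-free sketch. The function name `subscribeToScan`, the event names and the `scanId` payload field are assumptions, not taken from the project. The only thing it relies on is that the socket object has `on(event, fn)` and `off(event, fn)` methods, which a Socket.IO client socket provides.

```javascript
// Hypothetical helper that a hook such as useSocket could call inside an effect.
// `socket` is any object exposing on(event, fn) and off(event, fn).
function subscribeToScan(socket, scanId, handlers) {
  // Ignore events that belong to other scans sharing the same connection,
  // and tolerate handlers that were not provided.
  const forThisScan = (fn) => (payload) => {
    if (payload && payload.scanId === scanId && typeof fn === 'function') {
      fn(payload);
    }
  };

  const listeners = {
    'scan:progress': forThisScan(handlers.onProgress),
    'scan:completed': forThisScan(handlers.onCompleted),
    'scan:failed': forThisScan(handlers.onFailed),
  };

  for (const [event, listener] of Object.entries(listeners)) {
    socket.on(event, listener);
  }

  // Return a cleanup function that detaches exactly the listeners added above.
  return () => {
    for (const [event, listener] of Object.entries(listeners)) {
      socket.off(event, listener);
    }
  };
}
```

Returning the cleanup function matches the shape React expects from an effect, so a hook could return it directly and avoid leaking listeners when a component unmounts.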

Challenges I Faced

Parsing and Analyzing Code: One of the biggest challenges was to write code that could robustly parse different types of source code files to look for specific patterns (e.g., outdated dependencies in package.json). I used libraries like @babel/parser for JavaScript, but handling different languages and file formats can be complex.
Handling Large Repositories: Scanning very large repositories can take a significant amount of time and resources. I had to optimize the cloning and analysis process to avoid timeouts and excessive memory usage. The asynchronous job queue helped a lot here.
Error Handling: Building a resilient system that can gracefully handle errors (e.g., invalid repository URLs, private repositories, API failures) was a key focus. I implemented comprehensive error handling on both the frontend and backend.
State Management with Real-time Data: Synchronizing the frontend state with the real-time data coming from the backend via WebSockets was tricky. It required careful state management to ensure the UI updated smoothly without unnecessary re-renders. For example, updating a progress bar can be represented with the following equation: $$ \text{progress} = \frac{\text{completed\_tasks}}{\text{total\_tasks}} \times 100\% $$
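
The progress formula above translates directly into code. The sketch below is an assumption-laden illustration rather than the project's implementation. The names `progressPercent` and `applyProgress` and the event fields `completedTasks` and `totalTasks` are hypothetical. It guards against division by zero, clamps the result, and returns the previous state object unchanged for stale or duplicate events, which lets React skip a re-render when the state is stored with `useState` or `useReducer`.

```javascript
// Percentage of completed tasks, clamped to the range [0, 100].
function progressPercent(completedTasks, totalTasks) {
  if (!Number.isFinite(completedTasks) || !Number.isFinite(totalTasks) || totalTasks <= 0) {
    return 0; // Nothing to measure yet; also avoids dividing by zero.
  }
  const percent = (completedTasks / totalTasks) * 100;
  return Math.min(100, Math.max(0, percent));
}

// Hypothetical state update for an incoming progress event.
// Progress only moves forward; out-of-order or repeated events are ignored.
function applyProgress(state, event) {
  const next = progressPercent(event.completedTasks, event.totalTasks);
  if (next <= state.percent) return state; // same object: nothing changed
  return { ...state, percent: next };
}
```

For example, `progressPercent(3, 12)` evaluates to `25`, and applying an event that reports 1 of 4 tasks to a state already at 50 percent returns the original state object.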

Built With

jest
mern
octokit

Updates

Mayur katla started this project — Oct 06, 2025 01:41 PM EDT
