Congressional Resource Allocation Analysis

Inspiration

The inspiration for this project stemmed from a desire to bridge the gap in educational attainment. In the United States, educational opportunities can vary significantly based on factors such as income, location, and school resources. Our team felt a responsibility to use our skills to address this issue. The chance to collaborate and innovate with like-minded individuals fueled our enthusiasm.

What it does

Our project focuses on analyzing and recommending resource allocation strategies to enhance educational attainment in congressional districts. It addresses the socioeconomic factors influencing educational outcomes, with a particular focus on Texas, while making comparisons with other states.

How we built it

We started by gathering data from the "2017-18 Civil Rights Data Collection (CRDC)" folder, focusing exclusively on specific datasets that we believed held key insights. These datasets included "ID 22 SCH - Title I Status.csv," "LEA Characteristics.csv," and "Enrollment.csv."

With our data in hand, we created a pipeline to transform it into actionable information. Python scripts played a pivotal role, especially the filter.py script, which cleaned the data, and the CsvtoSql.py script that imported the tables into our PostgreSQL database. We established a robust database system that efficiently managed and queried the data.

The magic of our project lay in the SQL schemas. These schemas organized the data and allowed us to create the school_summary table, which condensed the information into a format suitable for analysis.

Data visualization was the final piece of the puzzle. The dataVisualization.py script leveraged the summarized data to create graphs and visual representations that effectively conveyed our findings. It's through these visualizations that our recommendations took shape.

Challenges we ran into

Like any project, ours was not without its challenges. Data cleaning was a time-consuming process, and we encountered inconsistencies in the raw data that required careful handling.

Setting up a local database through PostgreSQL and allowing our team to connect to it was quite the challenge, as it took a lot of tinkering with different settings and firewall rules to find out what would allow incoming connections from external sources.

The SQL schemas and database management presented a learning curve. We had to design a schema that would facilitate effective data analysis and make it easy to generate insights.

Incorporating data visualization techniques was also a challenge, as we aimed to create visuals that would be both informative and persuasive.

Accomplishments that we're proud of

We're particularly proud of successfully transforming complex, raw data into meaningful insights and actionable recommendations. Our ability to navigate challenges in data cleaning, database management, and data visualization demonstrates our team's dedication and problem-solving skills.

What we learned

We learned how to set up a local database through PostgreSQL, and how to allow our team to connect to it. By adjusting the firewall rules on the host computer along with a few configuration settings, our team was able to connect and interact with the local database.

We also learned the importance of data preprocessing and database management. Initially, the raw data was challenging to work with, and we had to implement data-cleaning techniques to ensure accuracy and consistency. By using Python scripts, we converted the data into a format that could be efficiently imported into a PostgreSQL database.

Custom SQL schemas were crucial to our success. They allowed us to structure and organize the data effectively. Our SQL queries facilitated the creation of a new table called school_summary, which provided a clear and summarized view of the data, a crucial step in generating insights for resource allocation recommendations.

What's next for Congressional Resource Allocation Analysis

Moving forward, we aim to refine our analysis, incorporate more data sources, and expand our recommendations to encompass a broader geographical scope. Additionally, we plan to engage with education stakeholders to discuss the findings and potential implementation of resource allocation strategies. Our project represents the beginning of a larger effort to bridge the educational attainment gap, and we are excited to continue this vital work.

Built With

pandas
postgresql
psycopg2
python
remote-server
sql
sqlalchemy
sqlite

Submitted to

Rowdy Datathon 2023

Created by

I worked on the SQL scripts alongside Jacob. I also looked at the data and tried to understand it to come up with a plan for our analysis.

Carly Munoz
I worked closely with Alex on developing the python scripts required for filtering and transforming the CSV data into database format, and also providing a visualization of the finalized data for analysis.

Brandon Hawkins
Brandon and I worked on various Python scripts. These scripts helped us to initially filter data within the CSV. We then converted these CSV files to our SQL tables using another script. After we finished organizing the data in SQL we created another script to visualize the data.

JaR448
I worked on the back-end of the project. I allowed proper data to be queried from the data tables we created. Also contributed to debugging the python scripts.

Jquintero08

Updates

Carly Munoz started this project — Oct 29, 2023 12:58 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.