Green Lens

The user uploads a company ESG report.
After the report is analysed, a radar chart covering each metric gives users an overview.
Users can see the claims identified in the report, along with the reasons each claim was flagged.
Users can toggle between metrics to see the specific claims and analysis for each one.

Inspiration

There are many claims about sustainability, but it is very hard to verify them.

Businesses are releasing ESG and sustainability reports that make claims about environmental impact reduction, ethical sourcing, or carbon neutrality. However, investors, regulators, and the general public find it challenging to evaluate the veracity of these assertions since they are frequently ambiguous, selective, or deceptive.

This leads to a serious issue:

Greenwashing skews judgment.
ESG reporting is becoming less trustworthy.
Regulators lack scalable tools to validate claims.

To close this gap, we were motivated to create Green Lens.

Green Lens uses AI to automatically analyse ESG reports, identify environmental claims, and assess them against the Seven Sins of Greenwashing, producing evidence-backed risk assessments in place of laborious manual audits.

What it does

The user uploads an ESG or sustainability report (PDF).
The system parses the document's text, structure, and tables.
Environmental claims are extracted automatically.
Each claim is assessed against the Seven Sins of Greenwashing.

The front-end shows:

Risk scores (per sin)
Evidence snippets from the report
Summary of key insights

How we built it

Technical Flow

Frontend

The frontend is a Streamlit dashboard where users can explore the results. It includes:

a radar chart to show Seven Sins risk scores
an evidence panel to inspect supporting text from the report
a summary section with key findings
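
As a rough illustration of the data behind the radar chart, the sketch below turns per-sin scores into the `r` and `theta` lists that a polar plot typically expects. The 0-to-1 score range, the function name `radar_series`, and the sample scores are assumptions made for the example, not details taken from the project.

```python
SINS = [
    "Hidden Trade-Off", "No Proof", "Vagueness", "Irrelevance",
    "Lesser of Two Evils", "Fibbing", "False Labels",
]

def radar_series(scores: dict[str, float]) -> tuple[list[float], list[str]]:
    """Return (r, theta) lists describing a closed radar polygon.

    Missing sins default to 0.0; values are clamped to the assumed 0..1 range.
    """
    r = [min(1.0, max(0.0, scores.get(sin, 0.0))) for sin in SINS]
    theta = list(SINS)
    # Repeat the first point so the polygon closes when drawn.
    return r + r[:1], theta + theta[:1]

# Hypothetical scores; 1.4 is out of range on purpose to show clamping.
r, theta = radar_series({"No Proof": 0.8, "Vagueness": 1.4})
```

Lists like these could be handed to a polar trace such as Plotly's `Scatterpolar` and shown in Streamlit with `st.plotly_chart`, although the charting library used by the project is not stated here.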

Backend

The backend is built with FastAPI and handles the document analysis pipeline.

How the pipeline works

Step 1. PDF parsing

The system first reads the uploaded ESG or sustainability report PDF and extracts:

the main text
document structure

Step 2. Claim extraction

Next, the system identifies environmental or sustainability-related claims in the report. This step uses a ClimateBERT-based model to detect and group relevant claims.
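
The shape of this step can be sketched as "sentences in, claims out". The example below deliberately swaps the ClimateBERT-based model for a simple keyword check so that it runs without any ML dependencies; the cue list, the function names, and the sample text are all hypothetical, and claim grouping is left out.

```python
import re
from typing import Callable

# Hypothetical cue list: a stand-in for the ClimateBERT-based classifier.
CUES = ("carbon neutral", "net zero", "emissions", "renewable", "sustainable", "recycl")

def keyword_classifier(sentence: str) -> bool:
    """Return True if the sentence contains any environmental cue."""
    lowered = sentence.lower()
    return any(cue in lowered for cue in CUES)

def extract_claims(text: str, is_claim: Callable[[str], bool] = keyword_classifier) -> list[str]:
    """Split text into sentences and keep those the classifier accepts."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s and is_claim(s)]

sample = (
    "Our offices are carbon neutral. Revenue grew 4% this year. "
    "We use 100% renewable electricity."
)
claims = extract_claims(sample)
```

Because `is_claim` is just a callable, a model-backed classifier could be passed in place of `keyword_classifier` without changing the rest of the function.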

Step 3. Seven Sins analysis

Each extracted claim is then checked against the Seven Sins of Greenwashing using separate detection modules. These modules look for patterns such as:

Hidden Trade-Off
No Proof
Vagueness
Irrelevance
Lesser of Two Evils
Fibbing
False Labels
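
One way to picture the separate detection modules is as a list of small functions that share a common return type. The sketch below is a minimal illustration under assumptions: the `Finding` structure, the 0-to-1 scores, the term list, and the two heuristics are invented for the example and are not the project's actual detectors.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class Finding:
    sin: str
    score: float  # assumed risk score in the range 0..1
    reason: str

Detector = Callable[[str], Finding]

VAGUE_TERMS = re.compile(
    r"\b(eco-friendly|green|natural|sustainable|environmentally friendly)\b", re.I
)
HAS_NUMBER = re.compile(r"\d")

def detect_vagueness(claim: str) -> Finding:
    """Score rises with the number of vague marketing terms in the claim."""
    hits = VAGUE_TERMS.findall(claim)
    score = min(1.0, 0.5 * len(hits))
    reason = f"vague terms: {', '.join(hits)}" if hits else "no vague terms found"
    return Finding("Vagueness", score, reason)

def detect_no_proof(claim: str) -> Finding:
    """Heuristic: a claim that cites no figures is treated as unsupported."""
    if HAS_NUMBER.search(claim):
        return Finding("No Proof", 0.0, "claim cites at least one figure")
    return Finding("No Proof", 0.7, "claim cites no figures")

# Adding a sin means appending another function to this list.
DETECTORS: list[Detector] = [detect_vagueness, detect_no_proof]

def analyse_claim(claim: str) -> list[Finding]:
    return [detector(claim) for detector in DETECTORS]

findings = analyse_claim("Our packaging is eco-friendly and natural.")
```

Keeping each sin in its own function lets a module that depends on language patterns sit next to one that depends on missing evidence, without either needing to know about the other.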

Step 4. RAG + LLM reasoning layer

After that, the system retrieves supporting evidence from the report and uses an LLM to:

judge how credible the claim is
explain why it may be risky
connect the score to actual evidence in the text
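
The retrieval half of this step can be sketched without any external services. The example below ranks report passages by word overlap with the claim, as a simple stand-in for whatever retrieval method the project uses, and then builds a prompt for the LLM. The function names, the prompt wording, and the sample passages are assumptions, and the LLM call itself is omitted.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(claim: str, passages: list[str], k: int = 2) -> list[str]:
    """Return up to k passages ranked by Jaccard overlap with the claim."""
    claim_tokens = tokens(claim)

    def overlap(passage: str) -> float:
        passage_tokens = tokens(passage)
        union = claim_tokens | passage_tokens
        return len(claim_tokens & passage_tokens) / len(union) if union else 0.0

    ranked = sorted(passages, key=overlap, reverse=True)
    # Drop passages that share no words with the claim.
    return [p for p in ranked[:k] if overlap(p) > 0]

def build_prompt(claim: str, evidence: list[str]) -> str:
    """Assemble a prompt that ties the judgement to numbered evidence."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(evidence, start=1))
    return (
        "Assess the credibility of this environmental claim using only the evidence.\n"
        f"Claim: {claim}\n"
        f"Evidence:\n{numbered}\n"
        "Answer with a risk level, a short explanation, and the evidence numbers used."
    )

# Hypothetical report passages and claim.
passages = [
    "Scope 1 emissions fell from 120 to 95 kilotonnes between 2021 and 2023.",
    "The board met six times during the year.",
    "Emissions data were verified by an external auditor.",
]
claim = "We reduced our emissions by 20% since 2021."
evidence = retrieve(claim, passages)
prompt = build_prompt(claim, evidence)
```

Numbering the evidence in the prompt gives the model something concrete to cite, which is what allows a score to be traced back to specific text in the report.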

Step 5. Score aggregation

The outputs from all modules are combined into a structured set of risk scores.
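
A minimal sketch of the aggregation, assuming each module emits `(sin, score)` pairs per claim and that the report-level score for a sin is the mean of its claim scores. The averaging rule is an assumption; taking the maximum, or weighting by claim prominence, would be equally plausible choices.

```python
from collections import defaultdict
from statistics import mean

def aggregate(findings: list[tuple[str, float]]) -> dict[str, float]:
    """Average per-claim scores into one score per sin, rounded to 2 decimals."""
    by_sin: dict[str, list[float]] = defaultdict(list)
    for sin, score in findings:
        by_sin[sin].append(score)
    return {sin: round(mean(scores), 2) for sin, scores in by_sin.items()}

# Hypothetical per-claim findings.
scores = aggregate([
    ("Vagueness", 0.5),
    ("Vagueness", 1.0),
    ("No Proof", 0.7),
])
```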

Step 6. Final output

The backend returns the results as structured JSON, which the frontend uses to generate the dashboard visualisations.
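
For illustration, the response could be modelled with a small set of typed structures and serialised to JSON. The field names below (`scores`, `evidence`, `summary`, and so on) are hypothetical and are not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Evidence:
    sin: str
    claim: str
    snippet: str      # supporting text quoted from the report
    explanation: str  # why the claim was flagged

@dataclass
class AnalysisResult:
    scores: dict[str, float]                       # one risk score per sin
    evidence: list[Evidence] = field(default_factory=list)
    summary: str = ""

# Hypothetical result for a single flagged claim.
result = AnalysisResult(
    scores={"Vagueness": 0.75},
    evidence=[
        Evidence(
            sin="Vagueness",
            claim="Our packaging is eco-friendly.",
            snippet="Our packaging is eco-friendly.",
            explanation="No measurable criterion is given.",
        )
    ],
    summary="One vague claim found.",
)
payload = json.dumps(asdict(result), indent=2)
```

A single agreed structure like this is one way to keep the frontend and backend aligned: the radar chart reads `scores`, the evidence panel reads `evidence`, and the summary section reads `summary`.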

Challenges we ran into

One of the biggest challenges was that ESG reports are messy. PDF layouts vary a lot, which makes parsing inconsistent.

Another challenge was that greenwashing detection is not just a text classification problem. Assessing a risky claim properly often requires surrounding context, a check for missing evidence, or comparison against the rest of the document.

We also found that some of the Seven Sins overlap conceptually, which makes clean category boundaries difficult.

On the engineering side, aligning the frontend and backend schema took careful design, and the ML backend introduced dependency and setup complexity.

Accomplishments that we're proud of

Built a working end-to-end prototype from PDF upload to risk visualization
Designed a modular Seven Sins pipeline
Connected risk categories to actual evidence in the report
Created a frontend structure that can cleanly support real backend outputs
Framed greenwashing detection as an explainability problem, not just a scoring problem

What we learned

Greenwashing detection is not just about labelling text. It needs context, evidence, and explainability.

We also learned that document quality matters a lot. If the PDF structure is weak, downstream NLP becomes much harder.

A modular architecture was important because each sin requires slightly different logic. Some depend more on language patterns, while others depend more on missing evidence or contextual judgment.

Most importantly, users trust a system more when it shows why something was flagged, not just what score it got.