Japan FinSight - Catalyst Intelligence Platform
Inspiration
The Tokyo Stock Exchange forced 3,800 companies to publish capital efficiency improvement plans in 2023—the largest forced corporate transformation in modern history.
The opportunity: 20-30% filed weak, vague plans. These 760-1,140 companies are prime activist targets for the next 5-7 years.
The problem: Finding them requires analyzing thousands of Japanese regulatory filings across three siloed data sources. No one can do this systematically at scale.
I built the platform to find activist targets before the activists do.
What it does
Catalyst Intelligence Platform combines three data sources with a natural-language query layer to identify companies under reform pressure:
1. Reform Pressure Ranking - Scores the 2,327 TSE companies in the dataset on a 0-100 scale
2. Cross-Shareholding Network Analysis - Maps reciprocal ownership (A owns B, B owns A). When one company faces activist pressure to sell its stake, the counterparty comes under pressure to sell as well, creating cascade effects.
3. Large Shareholding Tracker - Monitors 5%+ ownership filings (EDINET Doc 350) to identify activist campaigns.
4. Natural Language SQL - Ask questions in plain English:
- "Show me non-compliant companies with activist pressure"
- "Find cross-shareholdings > 5% where both companies have PBR < 1.0"
The edge: Multi-catalyst detection—find companies with activist pressure + weak TSE response + cross-holding unwind before they're priced in.
How I built it
Data: 2,327 TSE companies, EDINET Doc 350 (activist filings), EDINET Doc 120 (shareholdings)
Pipeline:
- Custom EDINET scrapers extract Japanese PDFs
- Gemini 2.5 Flash extracts structured data (investor names, ownership %, company names)
- PostgreSQL stores normalized data (activist_filings, shareholdings, tse_reform_status, corporate_entity)
- Reform pressure algorithm weights multiple catalysts
- Cross-shareholding detector finds bidirectional relationships via SQL self-joins
- Claude Sonnet 4.5 converts natural language to SQL
- Flask dashboard renders results with English company names
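The cross-shareholding step in the pipeline above can be illustrated with a minimal sketch. It uses an in-memory SQLite database rather than PostgreSQL so that it runs on its own. The `shareholdings` table name comes from the pipeline description, but the column names (`holder_code`, `issuer_code`, `ownership_pct`) and the sample EDINET codes are assumptions made for illustration, not the platform's actual schema.

```python
import sqlite3

# In-memory stand-in for the PostgreSQL table; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shareholdings ("
    " holder_code TEXT, issuer_code TEXT, ownership_pct REAL)"
)
conn.executemany(
    "INSERT INTO shareholdings VALUES (?, ?, ?)",
    [
        ("E0001", "E0002", 6.2),  # A owns B
        ("E0002", "E0001", 5.1),  # B owns A, so A and B form a reciprocal pair
        ("E0001", "E0003", 3.0),  # one-way holding, no reciprocal row
        ("E0004", "E0003", 7.5),  # one-way holding, no reciprocal row
    ],
)

# Self-join: row a (X holds Y) matches row b (Y holds X).
# The a.holder_code < a.issuer_code filter keeps each pair once,
# which avoids reporting both (A, B) and (B, A).
RECIPROCAL_SQL = """
SELECT a.holder_code, a.issuer_code, a.ownership_pct, b.ownership_pct
FROM shareholdings AS a
JOIN shareholdings AS b
  ON a.holder_code = b.issuer_code
 AND a.issuer_code = b.holder_code
WHERE a.holder_code < a.issuer_code
ORDER BY a.holder_code, a.issuer_code
"""


def find_reciprocal_pairs(connection):
    """Return (company_a, company_b, a_owns_b_pct, b_owns_a_pct) tuples."""
    return connection.execute(RECIPROCAL_SQL).fetchall()


pairs = find_reciprocal_pairs(conn)
for company_a, company_b, a_pct, b_pct in pairs:
    print(f"{company_a} <-> {company_b}: {a_pct}% / {b_pct}%")
```

Ordering the pair by code is one simple way to avoid double-counting. A real schema would probably also need to handle multiple filings per pair over time.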
Tech: Python, Flask, PostgreSQL, Gemini API, Claude API, BeautifulSoup, Pydantic, SQLAlchemy
Key innovation: Multi-catalyst scoring—first platform to combine activist filings + TSE compliance + cross-holdings in one interface.
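As a rough sketch of what multi-catalyst scoring could look like, the example below combines three normalized signals into a 0-100 score. The signal names, the weights, and the 0.5 threshold are all hypothetical. The platform's actual weighting is not described here, so treat this as one possible shape rather than the real algorithm.

```python
from dataclasses import dataclass

# Hypothetical weights; they sum to 1.0 so the score stays within 0-100.
WEIGHTS = {
    "activist_pressure": 0.40,
    "weak_tse_response": 0.35,
    "cross_holding_exposure": 0.25,
}


@dataclass
class CatalystSignals:
    """Each signal is assumed to be pre-normalized to the range 0.0-1.0."""
    activist_pressure: float
    weak_tse_response: float
    cross_holding_exposure: float


def reform_pressure_score(signals):
    """Weighted sum of the clamped signals, scaled to 0-100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(getattr(signals, name), 0.0), 1.0)  # clamp to [0, 1]
        total += weight * value
    return round(100 * total, 1)


def active_catalysts(signals, threshold=0.5):
    """Count how many signals are at or above the (assumed) threshold."""
    return sum(getattr(signals, name) >= threshold for name in WEIGHTS)


example = CatalystSignals(
    activist_pressure=0.9, weak_tse_response=0.6, cross_holding_exposure=0.1
)
print(reform_pressure_score(example), active_catalysts(example))
```

Keeping the catalyst count separate from the weighted score makes it possible to filter for companies where all three signals fire, independent of how the weights are tuned.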
Challenges we ran into
- EDINET extraction complexity - Doc 350/120 have inconsistent formats. Solution: Pydantic schemas + LLM extraction with retry logic
- Company name matching - Same company has different names across datasets. Solution: EDINET code as primary key + English name enrichment
- Cross-shareholding networks - Detecting bidirectional relationships without double-counting. Solution: SQL self-joins + bidirectional matching
- Reform pressure scoring - No ground truth for "weak responder." Solution: Weighted scoring based on activist investment framework
- SQL injection risk - LLM-generated SQL could contain unsafe statements. Solution: Whitelist SELECT queries only + parameterized execution
- Data completeness - Extraction ongoing. Solution: Dashboard shows data quality status
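The SELECT-only whitelist and parameterized execution mentioned in the list above might look something like the sketch below. The function names, the keyword blocklist, and the `pbr` column are assumptions for illustration; `tse_reform_status` is the only name taken from the pipeline description. SQLite stands in for PostgreSQL so the example runs on its own.

```python
import re
import sqlite3

# Keywords that should never appear in a generated read-only query.
# This blocklist is a hypothetical second check on top of the SELECT test.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|attach|pragma)\b",
    re.IGNORECASE,
)


def is_safe_select(sql):
    """Accept only a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # reject stacked statements
        return False
    if not re.match(r"(?is)^\s*select\b", statement):
        return False
    return FORBIDDEN.search(statement) is None


def run_generated_query(connection, sql, params=()):
    """Validate the generated SQL, then execute it with bound parameters."""
    if not is_safe_select(sql):
        raise ValueError("only single SELECT statements are allowed")
    return connection.execute(sql, params).fetchall()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tse_reform_status (edinet_code TEXT, pbr REAL)")
    db.execute("INSERT INTO tse_reform_status VALUES ('E0001', 0.8)")
    # User-supplied values travel as parameters, never as string fragments.
    print(run_generated_query(
        db, "SELECT edinet_code FROM tse_reform_status WHERE pbr < ?", (1.0,)
    ))
```

A check like this is deliberately conservative: it would also reject a legitimate query whose string literal contains a semicolon or a blocked keyword. Running generated queries under a read-only database role would be a stronger complement to text-level filtering.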
Accomplishments that we're proud of
✅ Analyzed 2,327 TSE companies with reform pressure scores
✅ Cross-shareholding network detection with cascade risk analysis
✅ Natural language SQL generates correct queries from plain English
✅ Multi-catalyst detection finds triple-threat companies
✅ Production dashboard deployed at japanfinsight.com/catalyst-intelligence
✅ English name enrichment for international accessibility
What we learned
Multi-catalyst analysis is the edge - Single metrics (just PBR or just TSE compliance) are commoditized. Real alpha comes from the intersection of 3+ catalysts.
LLM extraction works at scale - Gemini 2.5 Flash handles Japanese financial PDFs at a cost of $0.01-0.05 per document.
Natural language reduces friction - Financial analysts want "show me X with Y" not SQL joins. Example queries are critical for discovery.
Cross-shareholding unwinding is underappreciated - Cascade effects can rapidly free up shareholder registers that are 40% locked up in cross-holdings. Most platforms miss this entirely.
Domain expertise drives scoring - An activist investment framework drawn from industry panels informed the weighting algorithm, which beats naive single-metric screens.
What's next for Japan FinSight
Near-term (1-3 months):
- Backfill Doc 350/120 extraction (target: 1,000+ filings, complete shareholding network)
- Add TSE reform plan quality scoring via LLM
- Saved queries + email alerts for new activist filings
Monetization:
- Institutional: API access, custom dashboards, and alerts at $999/mo
- Research partnerships: Co-invest with activist funds
Vision: Become the Bloomberg Terminal for Japanese corporate governance—the platform for identifying and tracking reform opportunities in Japan's $6 trillion equity market.