About ModuLens

Inspiration

ModuLens was born from a frustrating yet increasingly common experience in AI research: being unable to adversarially test LLMs or explore legitimate interdisciplinary questions because of overzealous content filters. As researchers working across fields that can raise serious ethical issues, we encountered significant barriers when trying to investigate how language models handle sensitive but academically important topics.

The final straw came during a project exploring AI's potential for crisis intervention, where our perfectly legitimate queries about suicide prevention techniques were repeatedly blocked, preventing valuable research that could potentially save lives. We realized that while content moderation is crucial, current systems often lack the nuance to distinguish harmful content from genuine academic inquiry.

What We Learned

Building ModuLens taught us several valuable lessons:

  1. The complexity of modern AI moderation systems and their inconsistent handling of context
  2. The delicate balance between responsible access and ethical guardrails
  3. The effectiveness of different bypass strategies across various model providers
  4. The critical importance of strict authentication and usage monitoring for tools with dual-use potential

Perhaps most importantly, we gained deeper insight into how AI safety research itself can be impeded by the very systems designed to protect users, creating a paradoxical situation where improving safety requires testing its boundaries.

How We Built It

ModuLens was developed as a Python-based platform with both CLI and web interfaces. We built the system with a modular architecture made up of:

  1. A core engine that orchestrates bypass strategies and model interactions
  2. A strategy layer with multiple interchangeable bypass techniques
  3. A strong authentication system to ensure responsible access
  4. Comprehensive logging for research accountability
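
To make the last two components more concrete, here is a minimal sketch of how an access check and an audit trail could be tied together so that every request, allowed or denied, leaves a record. This is not the actual ModuLens code: the names AuditLog, AccessGate, and token_digest are hypothetical, the log is kept in memory only, and a real deployment would need persistent storage and a proper identity provider.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass, field


def token_digest(token: str) -> str:
    """Return the SHA-256 hex digest of an access token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    """Hypothetical in-memory, append-only audit trail."""
    records: list = field(default_factory=list)

    def record(self, user: str, provider: str, prompt: str, allowed: bool) -> dict:
        entry = {
            "ts": time.time(),
            "user": user,
            "provider": provider,
            # Keep a digest instead of the raw prompt to limit sensitive data in logs.
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "allowed": allowed,
        }
        self.records.append(entry)
        return entry


class AccessGate:
    """Hypothetical gate: checks a user's token and logs every attempt."""

    def __init__(self, approved: dict, log: AuditLog):
        self._approved = approved  # maps user -> SHA-256 hex digest of their token
        self._log = log

    def check(self, user: str, token: str, provider: str, prompt: str) -> bool:
        expected = self._approved.get(user, "")
        # Constant-time comparison avoids leaking information through timing.
        allowed = hmac.compare_digest(expected, token_digest(token))
        self._log.record(user, provider, prompt, allowed)
        return allowed
```

Logging denied attempts alongside allowed ones is a deliberate choice in this sketch: for a dual-use tool, failed access attempts are often the most useful signal for usage monitoring.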

The system supports multiple LLM providers (Google, Cohere, OpenAI, Anthropic) through their respective APIs, allowing researchers to compare moderation behaviors across different models.
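
Comparing moderation behavior across providers is easier when every response is reduced to one common shape. The sketch below illustrates that idea under stated assumptions: the payload layouts for "provider_a" and "provider_b" are invented for illustration and do not match any real provider's API, and NormalizedResponse, PARSERS, and normalize are hypothetical names rather than ModuLens internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class NormalizedResponse:
    """Provider-independent view of a single model response."""
    provider: str
    text: str
    blocked: bool
    block_reason: Optional[str] = None


def parse_provider_a(raw: dict) -> NormalizedResponse:
    # Assumed (hypothetical) shape: {"output": str, "flagged": bool, "reason": str}
    return NormalizedResponse(
        provider="provider_a",
        text=raw.get("output", ""),
        blocked=bool(raw.get("flagged", False)),
        block_reason=raw.get("reason"),
    )


def parse_provider_b(raw: dict) -> NormalizedResponse:
    # Assumed (hypothetical) shape:
    # {"choices": [{"text": str}], "safety": {"blocked": bool, "category": str}}
    choices = raw.get("choices") or [{}]
    safety = raw.get("safety") or {}
    return NormalizedResponse(
        provider="provider_b",
        text=choices[0].get("text", ""),
        blocked=bool(safety.get("blocked", False)),
        block_reason=safety.get("category"),
    )


# Registry of parsers; supporting another provider means adding one entry.
PARSERS: Dict[str, Callable[[dict], NormalizedResponse]] = {
    "provider_a": parse_provider_a,
    "provider_b": parse_provider_b,
}


def normalize(provider: str, raw: dict) -> NormalizedResponse:
    """Convert a raw provider payload into a NormalizedResponse."""
    try:
        parser = PARSERS[provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider}") from None
    return parser(raw)
```

With responses in this form, a comparison across models reduces to comparing the blocked and block_reason fields for the same query.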

Challenges We Faced

Developing ModuLens presented several significant challenges:

  1. Ethical Boundaries: Defining the line between legitimate research and potential misuse required careful consideration and implementation of strict guardrails.

  2. Technical Hurdles: Each model provider has different API structures, rate limits, and response formats that needed to be normalized.

  3. Strategy Development: Creating effective bypass strategies that work for legitimate research queries without enabling harmful content generation required extensive testing and refinement.

  4. Authentication System: Building a secure verification system that maintains academic freedom while preventing unauthorized access proved technically challenging.

  5. Documentation: Clearly communicating the project's legitimate research purpose while providing comprehensive ethical guidelines required careful messaging.

Throughout development, we maintained a commitment to responsible AI research, consulting with ethics experts and implementing strong safeguards to ensure ModuLens serves its intended purpose: advancing our understanding of AI systems while improving moderation for legitimate use cases.
