Inspiration

Briefly was created to tackle information overload in the academic and research community. Research papers can be long and packed with complex information, which makes it difficult and time-consuming for researchers to extract the main points. That is where we come in: we built a tool that turns these lengthy papers into short, easy-to-understand summaries, each paired with a relevant image. Briefly uses AI technologies such as OpenAI's GPT and DALL-E APIs to provide researchers with concise summaries of research articles, so they can quickly grasp the key insights. The aim is to make scholarly information easier to access and comprehend, helping researchers keep up with the latest developments in their fields more efficiently.

What it does

The application lets users scroll through concise summaries of publicly available research articles. By default, it displays top-ranked research articles based on trending topics, but users can also enter their own topics in the search field. The publication date of each article is shown in the bottom right corner. Users can tap the bottom of the screen to open the full PDF of the paper in a new browser window, where it can be downloaded. Each article also carries a catchy phrase, and the background is dynamically generated to match the style of the image. Users can share, like, and dislike articles. The image generation model used in the backend does not depict sensitive topics or classified information; for such articles, the application displays a dummy image instead.

How we built it

When we set out to build our application, we started with a simple goal: to make research easier for everyone. The task was challenging, so we broke it down into manageable steps. First, we wanted to get the right content, so we connected our application to arXiv. This allowed us to fetch the most relevant research papers based on the keywords provided by the user. It was like building a bridge between the vast ocean of academic papers and our application.

Next, we turned to the tricky task of making these complex research papers easier to understand. We integrated our application with OpenAI's GPT API (text-davinci-003), which took text from the fetched papers and produced brief, easy-to-understand summaries. We also wanted to support these summaries visually to make the content even more accessible. Because long summaries tended to produce vague images, we used GPT to generate a simple phrase from each summary and used that phrase as the prompt for a text-to-image generator. For text-to-image generation, we used the DALL-E API, which generated images relevant to the summaries and gave a visual interpretation of the text. For the end user, we added a CTA (call to action) with a catchy phrase generated by GPT.

Finally, we used Flask APIs to integrate everything. They carry the summaries and images from OpenAI's GPT and DALL-E to our application. The client is a React Native app that runs on all three platforms: iOS, Android, and Web. What we ended up with is a tool that can take a keyword, dig up relevant research papers, and present them to the user as simple summaries with related images. It's our way of taking the hassle out of academic research, and we're pretty proud of it!
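The first steps of the pipeline described above can be sketched as follows. This is a minimal illustration, not the project's actual code. It assumes the papers are fetched through arXiv's public query API, which returns an Atom feed, and it uses the abstract as a stand-in for the paper text. The function names and the prompt wording are hypothetical. The calls to GPT and DALL-E are left out; the sketch only builds the request URL, parses the feed, and prepares the two prompts (one for the summary, one for the short image phrase).

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumption: papers are fetched via arXiv's public query API (Atom feed).
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def build_arxiv_query_url(keyword, max_results=5):
    """Build a search URL for the user's keyword (hypothetical helper)."""
    params = {
        "search_query": f"all:{keyword}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(xml_text):
    """Extract title, abstract, date, and PDF link from an Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        pdf_url = None
        for link in entry.findall("atom:link", ATOM):
            if link.get("title") == "pdf":
                pdf_url = link.get("href")
        papers.append({
            # Collapse line breaks and repeated spaces in the feed text.
            "title": " ".join(entry.findtext("atom:title", "", ATOM).split()),
            "abstract": " ".join(entry.findtext("atom:summary", "", ATOM).split()),
            "published": entry.findtext("atom:published", "", ATOM),
            "pdf_url": pdf_url,
        })
    return papers


def build_summary_prompt(paper, max_words=80):
    """Prompt for the summarization step (wording is illustrative only)."""
    return (
        f"Summarize the following research paper in at most {max_words} words "
        "for a general academic reader.\n\n"
        f"Title: {paper['title']}\n\n"
        f"Text: {paper['abstract']}\n\n"
        "Summary:"
    )


def build_image_phrase_prompt(summary):
    """Prompt that condenses a summary into a short text-to-image phrase."""
    return (
        "Condense the summary below into one short, concrete phrase that a "
        "text-to-image model can draw.\n\n"
        f"Summary: {summary}\n\n"
        "Phrase:"
    )
```

In such a setup, a Flask route would fetch the URL from `build_arxiv_query_url`, pass the response body to `parse_arxiv_feed`, send each summary prompt to GPT, and then send the resulting phrase to the image model.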

Challenges we ran into

  • Text-to-image generation: Text-to-image generation was a daunting task because DALL-E takes time to generate images. DALL-E's images were also less relevant than those from Stable Diffusion, but Stable Diffusion has to be deployed on CUDA hardware, where we hit memory limitations, and we ran into environment issues with our client application. As a result, we had to experiment with different models.
  • Defining Prompts: Structuring the prompts fed into GPT was difficult, both for the final summary and for the short text used to generate images.
  • Latency Challenges: We faced latency challenges because each article requires multiple GPT and DALL-E calls, plus downloading the paper from arXiv.

Accomplishments that we're proud of

We take pride in developing an app that offers a simple and efficient solution, letting people scroll through concise summaries of research papers without using up valuable research time. Some notable accomplishments include:
  • Conceptualizing an innovative idea that addresses a pressing academic issue.
  • Successfully creating a fully functional solution, handling all aspects from inception to completion.
  • Overcoming various technical challenges along the way, ensuring progress without getting stuck.
  • Implementing prompt engineering techniques to enhance the app's performance.
  • Additionally, we are proud of our team's collaborative efforts, enabling us to develop and deliver a fully functional application within a short timeframe.

What we learned

  • Gaining hands-on experience in building a functional app that provides concise summaries of research papers, strengthening our software development skills.
  • Developing proficiency in leveraging AI technologies, such as OpenAI's GPT API and Stable Diffusion, for automated summarization and image generation, expanding our knowledge in AI-driven applications.
  • Cultivating expertise in user interface design and user experience considerations, optimizing the app's usability and engaging the target audience effectively.
  • Improving project management skills by coordinating a team effort to deliver a fully functional application within a defined timeline, gaining valuable experience in teamwork and collaboration.
  • Developing problem-solving skills by overcoming technical challenges and implementing prompt engineering techniques to ensure the app's performance and reliability.
  • Gaining insights into the ethical considerations of information dissemination and user-generated content, understanding the importance of responsible AI use and protecting sensitive information.
  • Fostering an entrepreneurial mindset by identifying market needs, conceptualizing innovative solutions, and bringing them to fruition through the development and deployment of the app.
  • Expanding our professional network and creating opportunities for collaboration by showcasing our app development skills and domain expertise within the academic and research community.

What's next for Briefly

Some ideas we plan to work on:

  • Decreasing Latency: Latency can be significantly decreased by performing tasks in parallel wherever possible. For example, we could process multiple articles at once or make API calls simultaneously using Python's “concurrent.futures” module or similar packages.

  • Addressing Discrepancies and Enhancing Contextual Coherence: We might need to introduce a post-processing phase after the initial text generation to check the contextual consistency and accuracy of the generated content. Furthermore, it is crucial to explore different prompt designs and tune GPT-3 parameters such as temperature and max tokens to obtain contextually relevant outputs.

  • Addressing Ethical Concerns: We want to be transparent with users about the tool and about the information they provide, and to include disclaimers indicating that the summaries and images are AI-generated and may not perfectly reflect the content of the paper.

  • Recommender system: For a personalized experience, we could let users choose their preferred topics and then tailor the content shown to them based on these preferences. Over time, we could store user interaction data in a user database (e.g., which articles they read or ignore) and use it to further refine these recommendations.

  • Generating Quality Images: The quality of AI-generated images depends largely on the model used. We are planning to fine-tune models such as Stable Diffusion and Openjourney. The prompt we provide to the model also plays a crucial role in the generated image's quality and relevance, so we might need to experiment with different prompt engineering techniques to get better results.
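As a rough illustration of the latency idea in the list above, the sketch below processes several papers at once with a thread pool from Python's `concurrent.futures` module. It is a sketch under stated assumptions, not the project's implementation: `process_paper`, `process_papers_in_parallel`, and the `summarize` and `make_image` callables are hypothetical stand-ins for the GPT and image-generation calls. Threads suit this workload because each step mostly waits on network responses.

```python
from concurrent.futures import ThreadPoolExecutor


def process_paper(paper, summarize, make_image):
    """Run the per-paper steps in order: summary first, then image.

    `summarize` and `make_image` are hypothetical callables that would
    wrap the GPT and text-to-image API calls.
    """
    summary = summarize(paper)
    image_url = make_image(summary)
    return {"title": paper["title"], "summary": summary, "image_url": image_url}


def process_papers_in_parallel(papers, summarize, make_image, max_workers=4):
    """Process several papers concurrently instead of one after another.

    Executor.map returns results in the same order as the input list,
    so the feed order shown to the user is preserved.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda p: process_paper(p, summarize, make_image), papers)
        )
```

The two steps for a single paper stay sequential because the image prompt depends on the summary; only independent papers run in parallel. In practice, `max_workers` would need to respect the rate limits of the APIs being called.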
