Inspiration

Many of the AI tools being released focus on generating and enriching content. That is really fantastic, but what intrigues me is the ability to use these tools to cut content down and make it more digestible. I was playing around with OpenAI's playground, using the text-davinci-003 model to summarize text for a 2nd grader. On a whim, I grabbed some article text from Fox News and started fashioning prompts for the article. I was really surprised at how good it was at doing more than just summarizing: it could identify bias, extract factual statements, and more.

What it does

First, the problem statement.

Staying up to date with the news is hard.

Take a look at this sample from the WSJ (Wall Street Journal):

  • news websites are overwhelming w/ their layout and organization
    • where do you start, where do you end
    • articles compete for attention (headline sizes, page spread, etc)
    • who decides which articles should have more attention, and how
  • articles are dense/verbose
    • it takes multiple minutes (3-10 or more!) to read an article. how are you expected to have enough time in your day to read all the pertinent articles?
    • too much fluff/filler/speculation obscures the actual news
  • headlines suck
    • either too brief (e.g. competing for your time) to be informative
    • or too sensational (e.g. competing for your attention) to let you accurately weigh their importance
  • some news sources have hidden agendas and flavor the writing
    • are you being fed subtly biased/subjective statements (intentionally or not)
    • there are good unbiased/fair news sites but maybe they don't cover your locality, or the topics you're interested in

Then, the solution statement

Staying informed can be easier, faster.

Now, take a look at abridged news.ai (Wall Street Journal):

  • [x] We use a tried and true format designed for throughput (The list view)
    • [x] Articles are given uniform treatment
    • [x] Newest (most recently published) on top, scroll down until you're tired of reading news
    • [x] Filter by category
    • [ ] Filter by keywords, news sources, locality
    • [ ] Mobile friendly
  • [x] We use AI to extract and efficiently present the important information. Not summaries, critical thinking analysis!
    • [x] What does the author of the article want to convey (summary)? What do they want you to believe (agenda)?
    • [x] Is there political bias in the article, if so, what is it (agenda)?
    • [x] What are the objective statements of the article? (i.e., factual statements)
    • [x] What are the subjective statements of the article? (i.e., statements charged w/ emotion or bias)
    • [x] Does the article pertain to local, national or global concerns?
    • [ ] Ask your own questions of the article (future milestone)
    • [ ] Ask your own questions (in the context of all articles) (super future milestone)
  • [x] We ingest from as many news sources as possible so you're exposed to as much news as possible.
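
The list-view behavior described above (uniform treatment, newest on top, filter by category) can be sketched roughly as follows. This is only an illustration, not the project's actual code: the `ArticleCard` shape and the `listView` function are hypothetical names I made up for the example.

```typescript
// Hypothetical shape for an article shown in the list view; field names are assumptions.
interface ArticleCard {
  title: string;
  category: string;
  publishedAt: string; // ISO 8601 timestamp
}

// Return articles newest-first, optionally restricted to a single category.
// filter() produces a new array, so the caller's array is not reordered.
function listView(articles: ArticleCard[], category?: string): ArticleCard[] {
  return articles
    .filter((a) => category === undefined || a.category === category)
    .sort((a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt));
}
```

Calling `listView(articles)` gives the full feed, and `listView(articles, "business")` gives the same ordering limited to one category.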

How we built it

Abridged News uses existing news feed APIs and the OpenAI completion API to generate its content.

  • A job crawls the news feeds for the latest news.
    • See server/src/news/news-ingest.service.ts for news ingestion
  • A different job selects batches of new records and analyzes them with OpenAI's API.
    • See server/src/news/political-bias.service.ts::getCompletionParams for the prompt formation.
  • Data is persisted in a SQLite database and exposed via a GraphQL endpoint
  • UI is provided by a React frontend that consumes the GraphQL endpoint
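
To give a feel for the analysis step, here is a minimal sketch of how a single multi-question prompt could be assembled. It is not a copy of `getCompletionParams`: the function name `buildCompletionParams`, the question wording, and the `max_tokens` and `temperature` values are all assumptions for illustration. The field names follow the request body of OpenAI's completion API, and text-davinci-003 is the model I used in the playground.

```typescript
// Request fields used by OpenAI's completion API.
interface CompletionParams {
  model: string;
  prompt: string;
  max_tokens: number;
  temperature: number;
}

// Illustrative questions paraphrased from the feature list; the real prompt wording may differ.
const QUESTIONS: string[] = [
  "Summarize what the author wants to convey.",
  "Is there political bias in the article? If so, what is it?",
  "List the objective (factual) statements.",
  "List the subjective statements.",
  "Does the article pertain to local, national, or global concerns?",
];

// Build one request that sends the article once and asks every question about it.
function buildCompletionParams(
  articleText: string,
  questions: string[] = QUESTIONS,
): CompletionParams {
  const numbered = questions.map((q, i) => `${i + 1}. ${q}`).join("\n");
  const prompt =
    `Article:\n"""\n${articleText.trim()}\n"""\n\n` +
    `Answer each question about the article above, numbering your answers to match:\n` +
    `${numbered}\n`;
  // max_tokens and temperature are placeholder values, not tuned settings.
  return { model: "text-davinci-003", prompt, max_tokens: 512, temperature: 0 };
}
```

The returned object would then be passed to whatever OpenAI client the server uses; that call is left out here because it needs an API key and network access.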

Challenges we ran into

For the sake of the hackathon and to optimize free credit utilization, I'm using a complex prompt that asks many questions of an article.

This seems to lead to slightly lower quality than when the OpenAI completion API is called w/ a single question at a time. But asking one question at a time would have used up my available tokens too quickly, because the article context has to be repeated for each separate prompt.
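
The trade-off is easy to see with some rough arithmetic. The sketch below counts prompt tokens only (it ignores completion tokens), and the numbers are made up for illustration: a 1,500-token article and five 20-token questions. The function names are hypothetical.

```typescript
// Prompt tokens for one combined request: the article is sent once, plus all questions.
function combinedPromptTokens(articleTokens: number, questionTokens: number[]): number {
  return articleTokens + questionTokens.reduce((sum, q) => sum + q, 0);
}

// Prompt tokens when each question gets its own request: the article is resent every time.
function separatePromptTokens(articleTokens: number, questionTokens: number[]): number {
  return questionTokens.reduce((sum, q) => sum + articleTokens + q, 0);
}

const exampleQuestionTokens = [20, 20, 20, 20, 20];
console.log(combinedPromptTokens(1500, exampleQuestionTokens)); // 1600
console.log(separatePromptTokens(1500, exampleQuestionTokens)); // 7600
```

With these example numbers, asking the questions separately costs almost five times as many prompt tokens per article.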

There's still quite a bit of data cleanup that needs to happen. If credits weren't a concern, I would probably feed the responses back to OpenAI with some sort of quality prompt to flag analyses that need to be retried/tweaked.
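
A cheaper first pass that costs no credits would be a local structural check before any model-based review. To be clear, this is a different technique from the quality prompt mentioned above and is not something the project does today. The sketch assumes the completion comes back as numbered answers ("1. ...", "2. ..."), and both function names are hypothetical. It would also be confused by an answer that itself contains a numbered list.

```typescript
// Split a completion like "1. ...\n2. ..." into answers keyed by question number.
// Unnumbered, non-empty lines are treated as continuations of the previous answer.
function parseNumberedAnswers(completion: string): Map<number, string> {
  const answers = new Map<number, string>();
  const pattern = /^\s*(\d+)[.)]\s*(.*)$/;
  let current: number | null = null;
  for (const line of completion.split("\n")) {
    const match = pattern.exec(line);
    if (match) {
      current = Number(match[1]);
      answers.set(current, match[2].trim());
    } else if (current !== null && line.trim() !== "") {
      answers.set(current, `${answers.get(current) ?? ""} ${line.trim()}`.trim());
    }
  }
  return answers;
}

// Flag an analysis for retry when any expected answer is missing or empty.
function needsRetry(completion: string, expectedAnswers: number): boolean {
  const answers = parseNumberedAnswers(completion);
  for (let i = 1; i <= expectedAnswers; i++) {
    const answer = answers.get(i);
    if (answer === undefined || answer === "") return true;
  }
  return false;
}
```

Records flagged this way could be re-queued for the analysis job, leaving a model-based quality check for the responses that are well-formed but possibly wrong.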

The power and effectiveness of the project are not well demonstrated by my current news data source, which provides (relatively) fair, unbiased news. The appeal and usefulness of the project would be better demonstrated by ingesting a broader range of news sources, including biased ones. The ultimate goal is to aggregate all the major news sources, regardless of party affiliation.

Accomplishments that we're proud of

It works. It does the thing. It's a complete MVP. This is the first project I've ever worked on utilizing AI.

What we learned

AI has become really accessible recently. There's been an explosion of services that make adoption easy for developers who don't have any training or experience w/ AI.

Using the available AI APIs (in this case OpenAI) has proven to be an incredible facilitator of rapid MVP development.

Ultimately, this product would likely need to host and operate its own AI model due to the volume of queries it would need to run. But the APIs served as a great way to prototype and prove a concept.

What's next for abridged news.ai

If the project is well received, it might be time to pay for some subscriptions and start ingesting and processing data at volume. I'd also like to expand the news source integrations.

I'd love to enable users to do Q&A against recent (and historic) news, with queries like "Are there any articles that rebut the claims in this article?" or "I'd like to know about any articles that mention Elon Musk's acquisitions".

From a data quality standpoint, I'd love to implement the ability for users to flag bad data and trigger a review/re-processing.
