Developing a drug costs approximately $1.5B. To cut that cost, pharmaceutical companies are always trying to decide whether they should go ahead with clinical trials for a new drug. However, there has been a lack of information on similar new drugs that have undergone clinical trials. What if we could predict the probability of success of a drug currently undergoing clinical trials by comparing it to other clinical trials on similar drugs?

What it does

Information on clinical trials of similar drugs is available from the trial registry. However, the results of a study might only be published in a journal. Smart CT compares your trial against all the other completed or active studies for the condition set in your clinical trial.


  1. The user inputs their National Clinical Trial (NCT) number (the trial they want to compare with).
  2. Smart CT pulls the trial's information, such as the drug name and condition.
  3. The user confirms the trial details.
  4. Smart CT searches for similar drugs from a list of drugs.
  5. Smart CT generates a list of similar trials based on the design characteristics of the NCT number provided in Step 1.
  6. Smart CT extracts the corresponding PubMed abstract for each active or completed study.
  7. NLP tries to determine the success or failure of every study narrowed down by the above criteria.
  8. Smart CT calculates the overall success rate across the similar trials using the algorithm we made.
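
As a rough illustration of Step 8, the sketch below assumes the simplest possible scoring: the share of successes among similar trials that the NLP step could label either way. The function name and the "unknown" label are hypothetical, and the algorithm we actually made may weigh trials differently.

```python
from collections import Counter


def overall_success_rate(outcomes):
    """Return the fraction of successes among trials labelled success or failure.

    `outcomes` is a list of strings: "success", "failure", or "unknown"
    (hypothetical labels for the output of the NLP step).
    Trials labelled "unknown" are left out of the denominator.
    Returns None when no trial has a usable label.
    """
    counts = Counter(outcomes)
    decided = counts["success"] + counts["failure"]
    if decided == 0:
        return None
    return counts["success"] / decided


# Made-up labels for five similar trials, for illustration only.
labels = ["success", "failure", "success", "unknown", "success"]
rate = overall_success_rate(labels)
print(f"Overall success rate: {rate:.0%}")
```

Leaving undecided trials out of the denominator keeps an abstract that the NLP could not classify from counting as a failure.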

How we built it

Using an API, we were able to extract information about the clinical trial (CT). We also searched for similar drugs by querying the API and fetching the results of their trials. Using the PubMed API, we searched by the NCT numbers of the trials on similar drugs. The collected data was parsed by our algorithm, which kept only industry-sponsored studies in Phase I, II or III. Only completed studies with results were used to search PubMed. We trained the model by manually reviewing 200 studies, which taught our NLP to recognize success or failure. Then the overall success rate of the drug was calculated.
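
The filtering and PubMed lookup described above could be sketched roughly as follows. This is a minimal sketch under assumptions, not our actual code: the trial dictionaries and their field names (`sponsor_class`, `phase`, `status`, `has_results`) are hypothetical, the NCT IDs are placeholders, and the lookup assumes the NCBI E-utilities `esearch` endpoint with PubMed's `[si]` (Secondary Source ID) field tag, which may not be the PubMed interface we used during the hackathon. The code only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode

# Phases kept by the filter, as described in the text.
ALLOWED_PHASES = {"Phase I", "Phase II", "Phase III"}

# NCBI E-utilities search endpoint (assumed here for illustration).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def is_eligible(trial):
    """Keep industry-sponsored, Phase I-III, completed trials that have results."""
    return (
        trial.get("sponsor_class") == "Industry"
        and trial.get("phase") in ALLOWED_PHASES
        and trial.get("status") == "Completed"
        and trial.get("has_results") is True
    )


def pubmed_search_url(nct_id):
    """Build an esearch URL that looks up PubMed records linked to an NCT ID."""
    params = {"db": "pubmed", "term": f"{nct_id}[si]", "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Placeholder records with hypothetical field names, for illustration only.
trials = [
    {"nct_id": "NCT00000001", "sponsor_class": "Industry", "phase": "Phase II",
     "status": "Completed", "has_results": True},
    {"nct_id": "NCT00000002", "sponsor_class": "Other", "phase": "Phase III",
     "status": "Completed", "has_results": True},
    {"nct_id": "NCT00000003", "sponsor_class": "Industry", "phase": "Phase I",
     "status": "Recruiting", "has_results": False},
]

eligible = [t for t in trials if is_eligible(t)]
for trial in eligible:
    print(trial["nct_id"], pubmed_search_url(trial["nct_id"]))
```

Filtering before building any PubMed query keeps the number of requests small, which matters when time and API limits are tight.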

Challenges we ran into

The PubMed API kept changing during the hackathon, and the ChEMBL API has been deactivated. Time and resource limitations were also huge challenges.

Accomplishments that we're proud of and What we learned

We found a lot of great APIs and methods for extracting the data. It was definitely a great learning experience, as we discovered many different ways of interacting with different APIs.

What's next for Smart CT

Give Smart CT access to more information, especially by searching for current Investigational New Drugs in sources like the FDA. Enhance the algorithm that computes the probability of success by combining other factors, like which trial phase the drug is currently in. Include more information from CT to narrow down our results, like the number of participants and the start date.
