Summary (Textual Analysis of Quarterly Inflation Reports Published by the Bank of England)
Purpose: To study the Quarterly Inflation Reports published by the Bank of England, to understand the effect of revising higher-order moments, and to draw insights from the reports that help predict their effect on the economy.
Framework: To achieve this, we perform textual analysis on the reports available from the Bank of England website. We apply dimensionality reduction to determine the most important features (words) for our purpose.
Data Use: We scrape the Bank of England website and download 100 PDF documents published by the Bank of England since 1992. We then extract the text from each PDF and combine it into one large corpus string.
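The link-collection part of the scraping step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual scraper: the class `PdfLinkCollector`, the helper `find_pdf_links`, the sample HTML, and the `example.org` URLs are all hypothetical, since the summary does not describe the page layout or the tools used. Downloading the files and extracting text from the PDFs are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkCollector(HTMLParser):
    """Collects absolute URLs of every <a> link whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page URL.
            self.pdf_urls.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_urls


# Hypothetical listing page; the real site layout is not given in the summary.
sample_html = """
<ul>
  <li><a href="/reports/1993/february.pdf">February 1993</a></li>
  <li><a href="/about">About</a></li>
  <li><a href="https://example.org/reports/1993/may.PDF">May 1993</a></li>
</ul>
"""
links = find_pdf_links(sample_html, "https://example.org/inflation-report")
print(links)
```

In a real run the HTML would come from the downloaded listing page, and each collected URL would be fetched and passed to a PDF text extractor.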
Model and Analytics: We tokenize this string into words. We then remove unnecessary symbols, stop words, and all words that are not in the English dictionary. This step is essential for data cleaning. After that, we calculate Term Frequency and Inverse Document Frequency (TF-IDF) to measure the importance of each word by how often it occurs compared to other words. We then convert these weights into a matrix.
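The cleaning and weighting steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the project's code: the stop-word list and the "dictionary" are tiny stand-ins, the sample sentences are invented, and the text is kept as one string per report, because the inverse document frequency needs more than one document. The formula used is the plain variant, tf × log(N / df); libraries often apply smoothed variants that give slightly different numbers.

```python
import math
import re
from collections import Counter

# Tiny illustrative stand-ins; a real run would load full word lists.
STOP_WORDS = {"the", "is", "to", "of", "and", "a", "in", "at"}
DICTIONARY = {"inflation", "rose", "fell", "target", "growth",
              "output", "remains", "above", "below"}


def clean_tokens(text):
    """Lowercase, keep alphabetic tokens, drop stop words and non-dictionary words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and t in DICTIONARY]


def tf_idf_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] is the TF-IDF of word j in document i."""
    tokenized = [clean_tokens(doc) for doc in documents]
    vocabulary = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        row = [(counts[w] / total) * math.log(n_docs / doc_freq[w])
               for w in vocabulary]
        matrix.append(row)
    return vocabulary, matrix


# Invented sample sentences standing in for the extracted report text.
docs = [
    "Inflation rose above the target in Q3 2008.",
    "Output growth fell; inflation remains below target.",
    "Inflation fell and output growth rose.",
]
vocabulary, matrix = tf_idf_matrix(docs)
print(vocabulary)
print(matrix[0])
```

A word that appears in every document, such as "inflation" in this sample, gets a weight of zero under this variant, while a word confined to one document is weighted most heavily.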
Validation: From this matrix we can identify how the occurrence of words affects the insights conveyed by the Quarterly Inflation Reports. We perform dimensionality reduction on this matrix to reduce the number of features and find the weight assigned to each feature. This helps us determine the important features in the matrix. We can go further and apply clustering techniques to segregate the data and derive important statistical insights.
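The summary does not name the dimensionality reduction technique. As one possible illustration, the sketch below assumes a rank-1 truncated SVD, computed by power iteration in plain Python, and reads the entries of the leading component as the weights assigned to each feature. The function `top_component`, the word list, and the matrix values are hypothetical and are not results from the study.

```python
import math


def top_component(matrix, iterations=200):
    """Power iteration on A^T A: returns the leading right singular vector of A.

    A uniform positive start is used, which works for a non-negative
    matrix such as TF-IDF weights.
    """
    n_features = len(matrix[0])
    v = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(iterations):
        # u = A v (one score per document), then w = A^T u (one weight per feature).
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        w = [sum(matrix[i][j] * u[i] for i in range(len(matrix)))
             for j in range(n_features)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Hypothetical weights: rows are documents, columns are words.
words = ["growth", "inflation", "target"]
A = [
    [0.0, 0.9, 0.1],
    [0.1, 0.8, 0.0],
    [0.2, 0.7, 0.1],
]
weights = top_component(A)
# Rank the features by their weight in the leading component.
ranked = sorted(zip(words, weights), key=lambda pair: -pair[1])
# Each document reduced to a single score along that component.
scores = [sum(a * b for a, b in zip(row, weights)) for row in A]
print(ranked)
print(scores)
```

Keeping more than one component, or clustering the reduced document scores, would follow the same pattern with a library routine in place of the hand-written iteration.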
Impact: This study leads to a better understanding of financial terms and of the Bank of England's decisions in producing inflation reports every quarter, compared with the actual trend in inflation. Further insights can be drawn, as textual analysis opens up a much wider scope of understanding.