Perera K, Karunarathne D.D, Siriwardena A, Balaretnaraja D
Abstract— The vast number of publicly available electronic financial documents and document repositories, together with their rapid growth, poses a great challenge for understanding, managing, and structuring the information they contain. For several reasons, the content of these documents is open to a variety of differing interpretations, which results in ambiguity. Annotating these data with semantics constrains inconsistent interpretation and facilitates better reuse and interoperability. We propose a semi-supervised approach for creating annotations for the text extracted from financial documents. The supervised component involves human experts in the annotation process. The unsupervised, machine-based component recommends Financial Industry Business Ontology (FIBO) terms for document sections based on the Okapi similarity measure. The resulting annotations can be used to infer knowledge from important sentences or document sections, supporting better understanding and decision making. Our annotation results indicate that similar pairs of sections share more FIBO terms than dissimilar pairs of sections.
Keywords— Ontology, Annotation, Semantic Web, Financial Data
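As an illustration of the machine-based recommendation step summarized in the abstract, the following is a minimal sketch that ranks ontology terms against a document section. It assumes that the Okapi similarity measure refers to the standard Okapi BM25 ranking function. The function names, the parameter values `k1` and `b`, and the term descriptions are hypothetical placeholders and are not taken from the FIBO release or from our implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each token.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in set(query_tokens):
            if t not in tf:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def recommend_terms(section_text, term_descriptions, top_n=3):
    """Return up to top_n (label, score) pairs with a positive score."""
    labels = list(term_descriptions)
    docs = [tokenize(label + " " + term_descriptions[label]) for label in labels]
    scores = bm25_scores(tokenize(section_text), docs)
    ranked = sorted(zip(labels, scores), key=lambda p: p[1], reverse=True)
    return [(label, s) for label, s in ranked[:top_n] if s > 0]


# Hypothetical term descriptions, for illustration only.
terms = {
    "Loan": "money lent to a borrower that must be repaid with interest",
    "Bond": "debt security issued by a borrower to investors",
    "Equity": "ownership share in a company held by shareholders",
}
section = "The company repaid the loan and the accrued interest in full."
recommended = recommend_terms(section, terms)
print([label for label, _ in recommended])
```

In this sketch each ontology term is treated as a small document and the section text as the query, so terms whose label or description shares rarer words with the section rank higher. A human expert could then accept or reject the recommended terms, which would correspond to the supervised component.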