Document Searching Engine Using Term Similarity Vector Space Model on English and Indonesian Document

Handojo, Andreas and Wibowo, Adi and RIA, YOVITA (2015) Document Searching Engine Using Term Similarity Vector Space Model on English and Indonesian Document. In: ICSIIT 2015, 12-03-2015 - 14-03-2015, Bali - Indonesia.

    Abstract

    As technology develops, the number of digital documents grows significantly, which makes finding a particular document more difficult. Search engines have therefore become indispensable. Search engines usually conduct a search simply by looking at the similarity between the keywords entered by the user and the terms in a document. In this research, we implement the Term Similarity Vector Space Model (TSVSM), a method that also considers the relationships between the terms in a document. The relationship between terms is calculated from how frequently they occur together in a paragraph. This makes it possible to find documents that do not contain the exact keywords entered but do contain terms related to those keywords. We apply TSVSM to English-language documents from the CiteseerX journal collection [1] and to Indonesian-language documents from the journal collection of the Petra Christian University Research Center (both in PDF format, 14,000 documents in total). The application was built using Microsoft Visual Basic .NET 2005 and PHP. Testing shows that the method can establish relationships between related terms and can thereby find documents that do not contain the keywords but do contain terms related to them. Searching the Indonesian-language journals takes relatively longer than searching the English-language journals.
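
The idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything here is an assumption made for illustration: the cosine-style normalization of paragraph co-occurrence counts, the way query weights are expanded, and all function names (`cooccurrence`, `term_similarity`, `expand_query`, `score`). The sample documents are invented. Python is used only for brevity; the original application was written in Visual Basic .NET and PHP.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def cooccurrence(paragraphs):
    """For every pair of terms, count the paragraphs in which both occur.

    counts[t][t] holds the number of paragraphs that contain t.
    """
    counts = defaultdict(Counter)
    for para in paragraphs:
        terms = sorted(set(tokenize(para)))
        for t in terms:
            counts[t][t] += 1
        for t, u in combinations(terms, 2):
            counts[t][u] += 1
            counts[u][t] += 1
    return counts


def term_similarity(counts, t, u):
    """Assumed measure: co-occurrence count normalized cosine-style."""
    if t not in counts or u not in counts:
        return 0.0
    return counts[t][u] / math.sqrt(counts[t][t] * counts[u][u])


def expand_query(counts, query):
    """Give every term related to a keyword a weight equal to its similarity."""
    weights = Counter()
    for q in tokenize(query):
        if q not in counts:
            continue
        for u in counts[q]:
            weights[u] = max(weights[u], term_similarity(counts, q, u))
    return weights


def score(doc_text, weights):
    """Cosine similarity between the expanded query and the document's term counts."""
    tf = Counter(tokenize(doc_text))
    dot = sum(weights[t] * tf[t] for t in tf)
    norm = math.sqrt(sum(w * w for w in weights.values())) * math.sqrt(
        sum(c * c for c in tf.values())
    )
    return dot / norm if norm else 0.0


# Invented sample collection; blank lines would separate paragraphs.
docs = {
    "a": "search engines rank documents",
    "b": "engines crawl pages",
    "c": "cooking pasta needs water",
}
paragraphs = [p for text in docs.values() for p in text.split("\n\n")]
counts = cooccurrence(paragraphs)
weights = expand_query(counts, "search")
scores = {name: score(text, weights) for name, text in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy collection, document `b` never mentions "search" but still receives a nonzero score, because "engines" shares a paragraph with "search" in document `a`. Document `c` shares no related terms and scores zero. A plain keyword match would have returned only document `a`.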

    Item Type: Conference or Workshop Item (Paper)
    Uncontrolled Keywords: Term Similarity Vector Space Model, Term Co-Occurrence Vector, Term Co-Occurrence Matrix, Search Engine.
    Subjects: Q Science > QA Mathematics > QA76 Computer software
    Divisions: Faculty of Industrial Technology > Informatics Engineering Department
    Depositing User: Admin
    Date Deposited: 14 Jul 2015 23:42
    Last Modified: 22 Jul 2015 21:02
    URI: https://repository.petra.ac.id/id/eprint/17095
