Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Oscar Karnalim

doi:10.28932/jutisi.v1i2.578


Published: Aug 30, 2015


Oscar Karnalim

Maranatha Christian University

Abstract

Using byte code as the information source is a novel approach that enables a Java archive search engine to be built without relying on any resource other than the Java archive itself [1]. Unfortunately, its effectiveness is limited, since some relevant documents may not be retrieved because of vocabulary mismatch. In this research, a vector space model (VSM) is extended with semantic relatedness to overcome the vocabulary mismatch issue in a Java archive search engine. In pursuit of the most effective retrieval model, several variants of the retrieval equations are also proposed and evaluated: summing up all related terms, substituting a non-existing term with its most related term, logarithmic normalization, context-specific relatedness, and low-rank query-related retrieved documents. In general, semantic relatedness improves recall at the cost of reduced precision. We also propose a scheme that takes advantage of relatedness without being affected by its disadvantage (VSM + treating non-retrieved documents as low-rank retrieved documents using semantic relatedness). This scheme ensures that documents matched only through relatedness are ranked lower than documents with a standard exact-match score. It yields 1.754% higher effectiveness than our standard VSM.
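The final scheme in the abstract (exact-match results first, then otherwise non-retrieved documents ranked by relatedness) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the tf-idf weighting, the cosine scoring, the `RELATEDNESS` table and its values, the averaging in `related_score`, and the sample archive names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical relatedness table with made-up values in [0, 1].
# A real system would derive these from a lexical resource.
RELATEDNESS = {
    frozenset(("image", "picture")): 0.9,
    frozenset(("parse", "read")): 0.6,
}


def relatedness(a, b):
    """Return 1.0 for identical terms, else the table value (0.0 if unknown)."""
    if a == b:
        return 1.0
    return RELATEDNESS.get(frozenset((a, b)), 0.0)


def tfidf_vectors(docs):
    """Build one tf-idf vector (term -> weight) per document, plus the idf map."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in Counter(terms).items()}
        for name, terms in docs.items()
    }
    return vectors, idf


def cosine(q_vec, d_vec):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * d_vec.get(term, 0.0) for term, w in q_vec.items())
    norm_q = math.sqrt(sum(w * w for w in q_vec.values()))
    norm_d = math.sqrt(sum(w * w for w in d_vec.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def related_score(query_terms, d_vec):
    """Average, over query terms, of the best relatedness to any document term."""
    if not query_terms or not d_vec:
        return 0.0
    best = [max(relatedness(q, t) for t in d_vec) for q in query_terms]
    return sum(best) / len(best)


def search(query_terms, docs):
    """Rank exact-match documents first, then relatedness-only documents."""
    vectors, idf = tfidf_vectors(docs)
    q_vec = {t: c * idf.get(t, 1.0) for t, c in Counter(query_terms).items()}
    exact, fallback = [], []
    for name, d_vec in vectors.items():
        score = cosine(q_vec, d_vec)
        if score > 0:
            exact.append((score, name))
        else:
            # The standard VSM would not retrieve this document at all.
            rel = related_score(query_terms, d_vec)
            if rel > 0:
                fallback.append((rel, name))
    by_score_then_name = lambda pair: (-pair[0], pair[1])
    exact.sort(key=by_score_then_name)
    fallback.sort(key=by_score_then_name)
    # Concatenation keeps every relatedness-only hit below every exact hit.
    return [name for _, name in exact] + [name for _, name in fallback]


# Hypothetical archives, each reduced to a list of extracted terms.
docs = {
    "a.jar": ["image", "resize"],
    "b.jar": ["picture", "viewer"],
    "c.jar": ["socket", "server"],
}
print(search(["image"], docs))  # ['a.jar', 'b.jar']
```

Keeping the two result lists separate and concatenating them, rather than mixing both scores into a single number, is one simple way to guarantee the ordering the abstract describes: the relatedness-only document `b.jar` is recovered, which helps recall, but it can never outrank an exact match.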


How to Cite

O. Karnalim, “Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine”, JuTISI, vol. 1, no. 2, Aug. 2015.

Issue

Vol 1 No 2 (2015): JuTISI

Section

Articles

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium.

