Fuzzy document filter for the internet

conference contribution
posted on 2017-12-06, 00:00 authored by Deepani Guruge, Russel Stonier
Current major search engines on the web retrieve too many documents, of which only a small fraction are relevant to the user query. We propose a new fuzzy document-filtering algorithm that filters out documents irrelevant to the user query from the output of Internet search engines. The algorithm takes the output of the 'Google' search engine as its basic input and processes it to retain the documents most relevant to the query. The clustering algorithm is based on fuzzy c-means, with simple modifications to the membership function formulation and cluster prototype initialisation. It classifies input documents into three predefined clusters. Finally, the clustered and context-based ranked URLs are presented to the user. The effectiveness of the algorithm was tested using data provided by the eighth Text REtrieval Conference (TREC8) [25] and also with on-line data. Experimental results were evaluated using the error matrix method, precision, recall and clustering validity measures.
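
For readers unfamiliar with the underlying technique, the following is a minimal sketch of textbook fuzzy c-means with three clusters. It is not the authors' modified algorithm: the abstract does not specify their changes to the membership function or to prototype initialisation, so this sketch uses the standard membership update and random initial memberships. The function name `fuzzy_c_means`, its parameters, and the toy feature vectors are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c=3, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (illustrative, not the paper's variant).

    points: list of equal-length numeric tuples (e.g. document features).
    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumption: random initial memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Cluster prototypes: means weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Standard membership update from distances to the prototypes.
        new_u = []
        for p in points:
            dists = [math.dist(p, ctr) for ctr in centers]
            if min(dists) == 0.0:
                # Point coincides with a prototype: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                new_u.append(
                    [1.0 / sum((dj / dk) ** exponent for dk in dists)
                     for dj in dists]
                )
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical 2-D feature vectors standing in for document features.
docs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.1)]
centers, memberships = fuzzy_c_means(docs, c=3)
```

Each row of `memberships` gives graded degrees of belonging rather than a hard label, which is what allows a filter of this kind to rank documents within a cluster instead of only accepting or rejecting them.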

Funding

Category 1 - Australian Competitive Grants (this includes ARC, NHMRC)

History

Start Page

39

End Page

53

Number of Pages

15

Start Date

2004-01-01

ISBN-13

9780646443799

Location

Cairns, Qld.

Publisher

University of Technology Sydney

Place of Publication

Sydney, N.S.W.

Peer Reviewed

  • Yes

Open Access

  • No

External Author Affiliations

Faculty of Informatics and Communication; TBA Research Institute

Era Eligible

  • Yes

Name of Conference

Australasian Data Mining Conference
