Esen A. Ozkarahan scite author profile

A new algorithm for document clustering is introduced. The base concept of the algorithm, the cover coefficient (CC) concept, provides a means of estimating the number of clusters within a document database and relates indexing and clustering analytically. The CC concept is also used to identify the cluster seeds and to form clusters with these seeds. It is shown that the complexity of the clustering process is very low. The retrieval experiments show that the information-retrieval effectiveness of the algorithm is compatible with a very demanding complete linkage clustering method that is known to have good retrieval performance. The experiments also show that the algorithm is 15.1 to 63.5 (with an average of 47.5) percent better than four other clustering algorithms in cluster-based information retrieval. The experiments have validated the indexing-clustering relationships and the complexity of the algorithm and have shown improvements in retrieval effectiveness. In the experiments, two document databases are used: TODS214 and INSPEC. The latter is a common database with 12,684 documents.
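The abstract does not give the CC formulas, so the following is only a minimal sketch under a stated assumption: that the cover coefficient between documents i and j is c_ij = (1 / row_sum_i) * sum over terms k of d_ik * d_jk / col_sum_k, and that the number of clusters is estimated as the sum of the diagonal entries c_ii. The function names and the toy matrix are hypothetical and are not taken from the paper.

```python
def cover_coefficient_matrix(D):
    """Return the document-by-document cover coefficient matrix C.

    D is a list of rows (documents) over columns (terms) holding
    non-negative weights. ASSUMED definition (not stated in the abstract):
        c_ij = (1 / row_sum_i) * sum_k d_ik * d_jk / col_sum_k
    """
    m, n = len(D), len(D[0])
    row_sums = [sum(row) for row in D]
    col_sums = [sum(D[i][k] for i in range(m)) for k in range(n)]
    if any(s == 0 for s in row_sums) or any(s == 0 for s in col_sums):
        raise ValueError("every document and every term needs a nonzero weight")
    return [
        [
            sum(D[i][k] * D[j][k] / col_sums[k] for k in range(n)) / row_sums[i]
            for j in range(m)
        ]
        for i in range(m)
    ]


def estimate_cluster_count(D):
    """Estimate the number of clusters as the sum of the diagonal of C (assumed)."""
    C = cover_coefficient_matrix(D)
    return sum(C[i][i] for i in range(len(C)))


# Hypothetical toy database: two pairs of identical documents, no shared terms
# between the pairs, so two clusters would be expected.
D = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
n_clusters = estimate_cluster_count(D)
```

Under this definition each row of C sums to 1, so a diagonal entry near 1 marks a document that shares little with the others, and the toy matrix yields an estimate of two clusters.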


Two partitioning type clustering algorithms

Can, Ozkarahan. 1984. J. Am. Soc. Inf. Sci.


In this article, two partitioning type clustering algorithms are presented. Both algorithms use the same method for selecting cluster seeds; however, assignment of documents to the seeds is different. The first algorithm uses a new concept called “cover coefficient,” and it is a single‐pass algorithm. The second one uses a conventional measure for document assignment to the cluster seeds and is a multipass algorithm. The concept of clustering, a model for seed oriented partitioning, the new centroid generation approach, and an illustration for both algorithms are also presented in the article.


Performance evaluation of a relational associative processor

Ozkarahan, Schuster, Sevcik. 1977. ACM Trans. Database Syst.


An associative processor called RAP has been designed to provide hardware support for the use and manipulation of databases. RAP is particularly suited for supporting relational databases. In this paper, the relational operations provided by the RAP hardware are described, and a representative approach to providing the same relational operations with conventional software and hardware is devised. Analytic models are constructed for RAP and the conventional system. The execution times of several of the operations are shown to be vastly improved with RAP for large relations.


Computation of term/document discrimination values by use of the cover coefficient concept

Can, Ozkarahan. 1987. J. Am. Soc. Inf. Sci.


Indexing in information retrieval (IR) is used to obtain a suitable vocabulary of index terms and optimum assignment of these terms to documents for increasing the effectiveness and efficiency of an IR system. The concept of term discrimination value (TDV) is one of the criteria used for index-term selection. In this article, a new concept called the cover coefficient (CC) is used in computing TDVs. After a brief introduction to the theory of indexing and the CC concept, an efficient way of computing TDVs by use of the CC concept, index-term selection, and weight modification are discussed. It is also shown that the computational cost of the CC approach in the calculation of TDVs compares favorably with the cost of a different approach that uses similarity coefficients. Furthermore, the TDVs obtained by the CC approach are consistent with those of the latter approach.
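The abstract does not state how a TDV is derived from the CC concept. One plausible reading, shown here purely as an assumption, is to take the TDV of a term as the change in the CC-based estimate of the number of clusters when that term is removed: a term whose removal lowers the estimate is treated as a good discriminator. The helper names, the handling of emptied rows and columns, and the toy matrix are all hypothetical; this brute-force sketch is not the efficient computation the article refers to.

```python
def cluster_count(D):
    """CC-based estimate of the number of clusters (ASSUMED definition).

    Uses c_ii = (1 / row_sum_i) * sum_k d_ik^2 / col_sum_k and sums over i.
    Documents or terms with zero total weight are skipped (hypothetical
    handling, chosen so that removing a term never divides by zero).
    """
    m, n = len(D), len(D[0]) if D else 0
    col_sums = [sum(D[i][k] for i in range(m)) for k in range(n)]
    total = 0.0
    for row in D:
        row_sum = sum(row)
        if row_sum == 0:
            continue  # document left with no terms
        total += sum(
            row[k] * row[k] / col_sums[k] for k in range(n) if col_sums[k] > 0
        ) / row_sum
    return total


def term_discrimination_values(D):
    """TDV_k = cluster_count(D) - cluster_count(D without term k) (assumed)."""
    base = cluster_count(D)
    n = len(D[0])
    values = []
    for k in range(n):
        reduced = [row[:k] + row[k + 1:] for row in D]
        values.append(base - cluster_count(reduced))
    return values


# Hypothetical toy database: term 0 occurs in both documents,
# terms 1 and 2 each occur in only one document.
D = [
    [1, 1, 0],
    [1, 0, 1],
]
tdv = term_discrimination_values(D)
```

In this toy case the shared term receives a negative value and the two document-specific terms receive positive values, which matches the usual intuition that terms spread across all documents discriminate poorly.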


A clustering scheme

Can, Ozkarahan. 1983. SIGIR Forum



Esen A. Ozkarahan

Concepts and effectiveness of the cover-coefficient-based clustering methodology for text databases
