Path-based Connectivity for Clustering Genome Sequences
Clustering is an unsupervised data mining tool. In bioinformatics, clustering genome sequences groups related biological sequences when no additional supervision is available. Sequence clusters often correspond to gene or protein families, which can help in determining tertiary structures. Extracting such hidden and valuable structure from a data set of genome sequences can benefit from better clustering methods, such as the recently popular spectral clustering. In this study, we apply spectral clustering and its improved variations to the sequence clustering task, in an effort to develop a novel approach for improving it.