
This repository contains the code for Integrative Clustering and Supervised Feature selection algorithm based on R language.


bowenxin/ICS_feature_selection


ICS_feature_selection

To address the challenge of selecting informative, representative, and non-redundant features from large volumes of data, we propose an integrative clustering and supervised (ICS) feature selection approach. In our framework, unsupervised clustering reduces feature redundancy by exploring the correlation among features, while supervised learning selects informative and representative features by examining the relevance between features and target outputs. The algorithm was implemented in R and validated on two datasets: a CT image dataset and a clinical factor dataset.
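The two-step idea described above can be illustrated with a minimal sketch. It is not the repository's implementation: it assumes plain hierarchical clustering on feature correlation in place of consensus clustering, ranks features by univariate Cox p-value, and the helper name `ics_sketch` and the toy data are hypothetical.

```r
library(survival)  # provides Surv() and coxph()

# Hypothetical helper, NOT the repository's code: cluster correlated
# features, then keep the most prognostic feature from each cluster.
ics_sketch <- function(features, time, status, k = 2) {
  # Unsupervised step: features with high |correlation| are close together.
  d <- as.dist(1 - abs(cor(features)))
  cluster <- cutree(hclust(d, method = "average"), k = k)

  # Supervised step: univariate Cox p-value for each feature.
  pvals <- vapply(names(features), function(f) {
    fit <- coxph(Surv(time, status) ~ features[[f]])
    unname(summary(fit)$coefficients[1, "Pr(>|z|)"])
  }, numeric(1))

  # Keep the feature with the smallest p-value in each cluster.
  picked <- tapply(names(pvals), cluster, function(fs) fs[which.min(pvals[fs])])
  as.vector(picked)
}

# Toy data (made up): two redundant pairs, only the "a" pair drives survival.
set.seed(1)
n <- 100
a <- rnorm(n)
b <- rnorm(n)
toy <- data.frame(a = a, a2 = a + rnorm(n, sd = 0.05),
                  b = b, b2 = b + rnorm(n, sd = 0.05))
time <- rexp(n, rate = exp(0.8 * a))
status <- rep(1, n)

picked <- ics_sketch(toy, time, status, k = 2)
print(picked)
```

Because one feature is kept per cluster, the selected set contains a single representative of each redundant pair rather than both near-duplicates.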

Citation

For ICS feature selection method

Xin, Bowen, et al. "Integrative Clustering and Supervised Feature Selection for Clinical Applications." 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV). IEEE, 2018.

For machine learning projects of lung cancer survival analysis

Wang, Linlin, Taotao Dong, Bowen Xin, Chongrui Xu, Meiying Guo, Huaqi Zhang, Dagan Feng, Xiuying Wang, and Jinming Yu. "Integrative nomogram of CT imaging, clinical, and hematological features for survival prediction of patients with locally advanced non-small cell lung cancer." European Radiology 29, no. 6 (2019): 2958-2967.

Prerequisites

  • R environment
  • packages: dplyr, survival, randomForestSRC, stats, ConsensusClusterPlus
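One possible way to install the packages listed above is sketched here. It assumes the CRAN packages are installed with `install.packages()` and that ConsensusClusterPlus is installed from Bioconductor through BiocManager; `stats` ships with R and needs no installation.

```r
# CRAN packages
install.packages(c("dplyr", "survival", "randomForestSRC"))

# ConsensusClusterPlus is distributed through Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("ConsensusClusterPlus")
```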

Functions

  • Get stable features over repeated measurements: getStableCor_func(df1, df2, threshold=0.7, method="pearson")
  • Get prognostic features by rank (Cox): getBestCox(dataset, nFeature=5, rankby="p.value", p.threshold = 0.05)
  • Get prognostic features by rank (RSF): getBestRSF(dataset, nFeature, seed = 1000, ntree = 2000)
  • Get ICS features (Cox): getBestCluster(dataset, nFeature = 5, maxK = 15, optK = NULL,rankby = "p.value")
  • Get ICS features (RSF): getBestCluster2(dataset, nFeature = 5, maxK = 15, optK = NULL,rankby = "p.value")
  • Fit multivariate model (Cox): multi_cox(dataset, features)
  • Fit multivariate model (RSF): multi_rsf(dataset, features)
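As an illustration of what a stability filter with a `threshold` and `method` argument might do, here is a minimal sketch. It assumes that a feature counts as stable when its values in two repeated measurements correlate at or above the threshold; the helper `stable_features_sketch` and the toy data are hypothetical, and the repository's `getStableCor_func` may work differently.

```r
# Hypothetical helper, NOT the repository's getStableCor_func:
# keep features whose two measurements correlate at or above `threshold`.
stable_features_sketch <- function(df1, df2, threshold = 0.7, method = "pearson") {
  shared <- intersect(names(df1), names(df2))
  r <- vapply(shared, function(f) {
    cor(df1[[f]], df2[[f]], method = method)
  }, numeric(1))
  names(r)[r >= threshold]
}

# Toy data (made up): "stable" is reproduced across scans, "noisy" is not.
set.seed(1)
x <- rnorm(50)
scan1 <- data.frame(stable = x, noisy = rnorm(50))
scan2 <- data.frame(stable = x + rnorm(50, sd = 0.1), noisy = rnorm(50))

stable <- stable_features_sketch(scan1, scan2, threshold = 0.7)
print(stable)
```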
