Feature Selection
Feature_scML fs func
fs module is feature selection function, including F-score algorithm (fscore), the squared coefficient of variation (cv2), principal component analysis (pca), random forest classifier (rfc), ANOVA Pvalue (ano), the maximal information coefficient measure (mic), ReliefF (turf), and linear support vector machine (linearsvm). These methods are referred in the index.
$Feature_scML fs -h
usage: fs
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT
Input train data (CSV)
-m {fscore,pca,cv2,rfc,mic,turf,linearsvm}, --method {fscore,pca,cv2,rfc,mic,turf,linearsvm}
Feature selection method
-o OUTPUT, --output OUTPUT
Output directory
--njobs NJOBS The number of jobs to run in parallel (Only turf,
default=1)
Command
Parameters |
optional |
Descripton |
|---|---|---|
–method, -m |
fscore, pca, cv2, rfc, mic, turf, linearsvm |
|
–input,-i |
filename path |
input filename path (CSV format) |
–output, -o |
output directory |
output directory (default:Current directory) |
–njobs |
int, default=1 |
The number of jobs to run in parallel |
Example
# default start, end, and step
$Feature_scML fs -i example.csv -m cv2
The result will generate two dataframe. As follow:
example_cv2.csv
Feature |
cv2_score |
CCKBR |
122.898 |
IFITM1 |
15.624 |
CLDN10 |
11.051 |
… |
… |
The dataframe sorted by the features of the original data.
example_cv2_data.csv
Label |
CCKBR |
IFITM1 |
… |
6 |
229.00 |
32.21 |
… |
3 |
25.14 |
76.82 |
… |
6 |
154.65 |
52.62 |
… |