Commit 2aa1031

add dump nice to regression demo
1 parent 1440dc9 commit 2aa1031

4 files changed, +36 -11 lines changed

demo/binary_classification/README

+6 -5
@@ -4,10 +4,11 @@ Run: ./runexp.sh
 
 Format of input: LIBSVM format
 
-Format of featmap.txt:
-<featureid> <featurename> <q or i>\n
+Format of ```featmap.txt: <featureid> <featurename> <q or i or int>\n ```:
+- Feature id must be from 0 to number of features, in sorted order.
+- i means this feature is binary indicator feature
+- q means this feature is a quantitative value, such as age, time, can be missing
+- int means this feature is integer value (when int is hinted, the decision boundary will be integer)
 
-q means continuous quantities, i means indicator features.
-Feature id must be from 0 to num_features, in sorted order.
 
-Detailed explaination: https://github.com/tqchen/xgboost/wiki/Binary-Classification
+Explainations: https://github.com/tqchen/xgboost/wiki/Binary-Classification
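
The featmap.txt format described in the README change above can be checked with a short script. The following is only a sketch, not part of the commit: the function name `parse_featmap` and the sample feature names are hypothetical, fields are assumed to be separated by whitespace, and "from 0 to number of features, in sorted order" is read as "ids are consecutive and start at 0", which is an interpretation rather than something the README states.

```python
import io

# Feature types named in the README: indicator, quantitative, integer.
VALID_TYPES = {"i", "q", "int"}


def parse_featmap(lines):
    """Parse featmap lines into a list of (feature_id, name, type) tuples.

    Assumptions (not confirmed by the commit): fields are whitespace
    separated, names contain no whitespace, blank lines are ignored,
    and ids must be consecutive starting at 0.
    """
    entries = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 3:
            raise ValueError("line %d: expected 3 fields, got %d" % (lineno, len(parts)))
        fid, name, ftype = int(parts[0]), parts[1], parts[2]
        if ftype not in VALID_TYPES:
            raise ValueError("line %d: unknown feature type %r" % (lineno, ftype))
        if fid != len(entries):
            raise ValueError("line %d: feature id %d, expected %d" % (lineno, fid, len(entries)))
        entries.append((fid, name, ftype))
    return entries


# Made-up sample content with one feature of each type.
sample = io.StringIO("0\tage\tq\n1\tis_member\ti\n2\tcount\tint\n")
entries = parse_featmap(sample)
for fid, name, ftype in entries:
    print(fid, name, ftype)
```

Passing an open file object instead of the `io.StringIO` sample would validate a real featmap.txt in the same way.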

demo/regression/README

+9 -1
@@ -2,4 +2,12 @@ Demonstrating how to use XGBoost accomplish regression tasks on computer hardwar
 
 Run: ./runexp.sh
 
-Format of input: LIBSVM format
+Format of input: LIBSVM format
+
+Format of ```featmap.txt: <featureid> <featurename> <q or i or int>\n ```:
+- Feature id must be from 0 to number of features, in sorted order.
+- i means this feature is binary indicator feature
+- q means this feature is a quantitative value, such as age, time, can be missing
+- int means this feature is integer value (when int is hinted, the decision boundary will be integer)
+
+Explainations: https://github.com/tqchen/xgboost/wiki/Regression

demo/regression/mapfeat.py

+14 -3
@@ -10,12 +10,23 @@
     for i in xrange( 0,6 ):
         fo.write( ' %d:%s' %(i,arr[i+2]) )
 
-    if arr[0] not in fmap.keys():
+    if arr[0] not in fmap:
         fmap[arr[0]] = cnt
         cnt += 1
 
-    fo.write( ' %d:1' % fmap[arr[0]] )
-
+    fo.write( ' %d:1' % fmap[arr[0]] )
     fo.write('\n')
 
 fo.close()
+
+# create feature map for machine data
+fo = open('featmap.txt', 'w')
+# list from machine.names
+names = ['vendor','MYCT', 'MMIN', 'MMAX', 'CACH', 'CHMIN', 'CHMAX', 'PRP', 'ERP' ];
+
+for i in xrange(0,6):
+    fo.write( '%d\t%s\tint\n' % (i, names[i+1]))
+
+for v, k in sorted( fmap.iteritems(), key = lambda x:x[1] ):
+    fo.write( '%d\tvendor=%s\ti\n' % (k, v))
+fo.close()
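
The committed script targets Python 2 (`xrange`, `dict.iteritems`). For readers on Python 3, the feature-map step might look roughly like the sketch below. It is not the author's code: the helper name `write_featmap` is hypothetical, the vendor names and ids in `fmap` are made up for illustration, and the output goes to an in-memory buffer instead of featmap.txt so the example has no side effects.

```python
import io


def write_featmap(out, numeric_names, vendor_ids):
    """Write one featmap line per feature to the file-like object `out`.

    numeric_names: names of the integer-valued features, ids 0..len-1.
    vendor_ids: dict mapping vendor name -> indicator feature id.
    """
    # Integer features come first, with ids 0, 1, 2, ...
    for i, name in enumerate(numeric_names):
        out.write("%d\t%s\tint\n" % (i, name))
    # One indicator feature per vendor, written in ascending id order.
    for vendor, fid in sorted(vendor_ids.items(), key=lambda kv: kv[1]):
        out.write("%d\tvendor=%s\ti\n" % (fid, vendor))


# Column names as listed in the commit (taken from machine.names).
names = ['vendor', 'MYCT', 'MMIN', 'MMAX', 'CACH', 'CHMIN', 'CHMAX', 'PRP', 'ERP']

# Hypothetical vendor -> id mapping; in the script, ids start at 6.
fmap = {"acme": 6, "globex": 7}

buf = io.StringIO()
write_featmap(buf, names[1:7], fmap)  # names[1:7] matches names[i+1] for i in 0..5
print(buf.getvalue(), end="")
```

Replacing `buf` with `open('featmap.txt', 'w')` inside a `with` block would write the file and close it automatically.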

demo/regression/runexp.sh

+7 -2
@@ -7,5 +7,10 @@ python mknfold.py machine.txt 1
 ../../xgboost machine.conf
 # output predictions of test data
 ../../xgboost machine.conf task=pred model_in=0002.model
-# print the boosters of 00002.model in dump.raw.txt
-../../xgboost machine.conf task=dump model_in=0002.model name_dump=dump.raw.txt
+# print the boosters of 0002.model in dump.raw.txt
+../../xgboost machine.conf task=dump model_in=0002.model name_dump=dump.raw.txt
+# print the boosters of 0002.model in dump.nice.txt with feature map
+../../xgboost machine.conf task=dump model_in=0002.model fmap=featmap.txt name_dump=dump.nice.txt
+
+# cat the result
+cat dump.nice.txt
