This repository was archived by the owner on Aug 9, 2021. It is now read-only.

Question about encoding text (categorical) features in the end-to-end machine learning project #139

@LelandYan

Why do sklearn's LabelEncoder() and pandas' factorize() give different results?

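from sklearn.preprocessing import LabelEncoder

# `housing` is the DataFrame loaded earlier in the notebook
encoder = LabelEncoder()
housing_cat = housing["ocean_proximity"]
# Encode the same column with scikit-learn and with pandas
housing_cat_encoded1 = encoder.fit_transform(housing_cat)
housing_cat_encoded2, housing_categories = housing_cat.factorize()
# Compare the first ten codes from each encoder
print(housing_cat_encoded1[:10])
print(housing_cat_encoded2[:10])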

Why do the values in housing_cat_encoded1 range from 0 to 4, while the values in housing_cat_encoded2 range from 0 to 2?
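
A likely explanation, sketched on made-up data: LabelEncoder numbers the categories in sorted order, while factorize() numbers them in order of first appearance. If the first ten rows contain only three distinct categories, factorize() would show only 0-2 there, whereas LabelEncoder can show any of 0-4. The sample series below is hypothetical and only stands in for housing["ocean_proximity"].

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample standing in for housing["ocean_proximity"]
housing_cat = pd.Series(
    ["NEAR BAY", "<1H OCEAN", "INLAND", "NEAR BAY", "ISLAND", "NEAR OCEAN", "INLAND"]
)

# LabelEncoder: codes follow the sorted order of the unique labels
encoder = LabelEncoder()
sk_codes = encoder.fit_transform(housing_cat)

# factorize(): codes follow the order in which labels first appear
pd_codes, pd_categories = housing_cat.factorize()

# factorize(sort=True): sorts the unique labels first, matching LabelEncoder
sorted_codes, sorted_categories = housing_cat.factorize(sort=True)

print(list(encoder.classes_))   # sorted labels
print(list(pd_categories))      # labels in order of first appearance
print(list(sk_codes))           # [3, 0, 1, 3, 2, 4, 1]
print(list(pd_codes))           # [0, 1, 2, 0, 3, 4, 2]
```

Both encodings are valid; they only differ in which integer each category gets. To map codes back to labels, use encoder.classes_ for the first and the returned categories index for the second.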
