DecisionTree, DecisionTreeModel, and RandomForest should be in three (or two) different repositories #2

Open
@olekscode

I believe that DecisionTree can be used not only for machine learning, but also for many other applications (for example, to contain expert knowledge). So it would be nice to have it as a standalone project that lives in a separate repository.
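To illustrate the non-ML use case, here is a minimal Python sketch of a decision tree that only stores hand-written expert rules. The names `Leaf`, `Split`, `decide`, and the triage rules are all hypothetical, invented for this example; they are not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """Terminal node holding a decision."""
    decision: str


@dataclass
class Split:
    """Internal node: a yes/no question with one subtree per answer."""
    question: Callable[[dict], bool]
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def decide(node: Node, case: dict) -> str:
    """Walk from the root to a leaf, answering each question for `case`."""
    while isinstance(node, Split):
        node = node.if_true if node.question(case) else node.if_false
    return node.decision


# Hand-written rules (made up for illustration); no training is involved.
triage = Split(
    question=lambda c: c["temperature_c"] >= 38.0,
    if_true=Split(
        question=lambda c: c["days_sick"] > 3,
        if_true=Leaf("see a doctor"),
        if_false=Leaf("rest and monitor"),
    ),
    if_false=Leaf("no action"),
)
```

Nothing here knows about datasets or learning algorithms, which is the argument for keeping the structure in its own repository.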

Then there is DecisionTreeModel - a machine learning model that is used for building decision trees. This should be a separate repository that contains an abstract class DecisionTreeModel and several algorithms implemented as subclasses:

  • C4.5
  • ID3
  • etc.

DecisionTreeModel repository should depend on DecisionTree repository.
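The proposed layering could look roughly like the Python sketch below. It assumes categorical attributes stored as dicts; the `DecisionTree` class is a stand-in for whatever the standalone repository would export, and `choose_attribute`, `fit`, and `predict` are hypothetical method names, not the existing API.

```python
import math
from abc import ABC, abstractmethod
from collections import Counter


class DecisionTree:
    """Stand-in for the standalone tree: a leaf label, or an attribute with one child per value."""

    def __init__(self, label=None, attribute=None, children=None):
        self.label = label
        self.attribute = attribute
        self.children = children or {}

    def decide(self, row):
        node = self
        while node.attribute is not None:
            # Raises KeyError for attribute values never seen while building.
            node = node.children[row[node.attribute]]
        return node.label


class DecisionTreeModel(ABC):
    """Abstract learner: subclasses only decide how to pick the split attribute."""

    def fit(self, rows, labels):
        self.tree = self._build(rows, labels, set(rows[0]))
        return self

    def predict(self, row):
        return self.tree.decide(row)

    def _build(self, rows, labels, attributes):
        majority = Counter(labels).most_common(1)[0][0]
        if len(set(labels)) == 1 or not attributes:
            return DecisionTree(label=majority)
        best = self.choose_attribute(rows, labels, attributes)
        children = {}
        for value in {r[best] for r in rows}:
            subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
            sub_rows, sub_labels = zip(*subset)
            children[value] = self._build(list(sub_rows), list(sub_labels), attributes - {best})
        return DecisionTree(attribute=best, children=children)

    @abstractmethod
    def choose_attribute(self, rows, labels, attributes):
        ...


def entropy(labels):
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


class ID3(DecisionTreeModel):
    """Picks the attribute with the highest information gain."""

    def choose_attribute(self, rows, labels, attributes):
        def gain(attribute):
            remainder = 0.0
            for value in {r[attribute] for r in rows}:
                part = [l for r, l in zip(rows, labels) if r[attribute] == value]
                remainder += len(part) / len(labels) * entropy(part)
            return entropy(labels) - remainder

        # Sorting makes tie-breaking between equal gains deterministic.
        return max(sorted(attributes), key=gain)
```

In this layout a C4.5 subclass would override the same `choose_attribute` hook (for example with gain ratio instead of plain information gain), and the dependency points one way only: the model code imports the tree, never the reverse.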

Finally, I don't remember well how RandomForest works, but if it is an ensembling algorithm that combines the outputs of several DecisionTreeModels, then you can put it into a separate repository and add a dependency on DecisionTreeModel.

However, if it is the kind of DecisionTreeModel that generates random decision trees and averages their outputs, then I would make it a subclass of the abstract DecisionTreeModel and keep it in the same repository.
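The first option (an ensemble that holds several models) could be sketched in Python as follows. This is a simplified illustration under stated assumptions: `MajorityStub` is a hypothetical stand-in for any DecisionTreeModel with `fit` and `predict`, each model is trained on a bootstrap sample, and the per-split random feature selection used by typical random forests is left out.

```python
import random
from collections import Counter


class MajorityStub:
    """Hypothetical stand-in for a DecisionTreeModel: predicts the most common training label."""

    def fit(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, row):
        return self.label


class RandomForest:
    """Ensemble by composition: holds several models rather than being one."""

    def __init__(self, make_model, n_models=5, seed=0):
        self.make_model = make_model
        self.n_models = n_models
        self.rng = random.Random(seed)
        self.models = []

    def fit(self, rows, labels):
        n = len(rows)
        self.models = []
        for _ in range(self.n_models):
            # Bootstrap sample: draw n training examples with replacement.
            picks = [self.rng.randrange(n) for _ in range(n)]
            model = self.make_model()
            model.fit([rows[i] for i in picks], [labels[i] for i in picks])
            self.models.append(model)
        return self

    def predict(self, row):
        # Majority vote for class labels; a regression forest would average instead.
        votes = Counter(m.predict(row) for m in self.models)
        return votes.most_common(1)[0][0]
```

Because `RandomForest` only calls `fit` and `predict` on its members, it needs nothing from the tree repository directly, which fits the separate-repository option. If it were instead made a subclass of DecisionTreeModel, it would have to produce a single tree-like result, which is the distinction the two options above turn on.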

This way we can have a clean separation and each module can be used independently.
