Some problems in model development #28

Open
@Echo9573

Description

Hi, I ran into some issues while developing SQLFlow models:

  • Analysts usually manipulate data with DataFrames and feed them directly to a Keras model, which is convenient for debugging. SQLFlow's tf-codegen uses a dataset instead, which adds a learning cost.
  • Connecting local development with SQLFlow is troublesome. To debug a model configured under SQLFlow models locally, you have to implement a train.py yourself, including reading data, defining feature columns, etc. However, the local train.py and the train.py generated by SQLFlow do not always behave consistently.
  • An analysis task usually consists of feature engineering -> data preprocessing -> model training (prediction). At present, the model zoo only covers the last step. In practice, sharing a model between operations means sharing the whole chain, including all of the data processing. I hope SQLFlow will also support custom data preprocessing and include it in the design of the model zoo.
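
To illustrate the third point, here is a minimal sketch in plain Python of bundling feature engineering and preprocessing with the model, so that whoever reuses the model runs the same chain. Every name in it (`Pipeline`, `add_ratio`, `scale_views`, `toy_model`) is hypothetical. None of it is part of SQLFlow or its model zoo, and the toy model only stands in for a trained Keras model.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Step = Callable[[List[Row]], List[Row]]


class Pipeline:
    """Hypothetical bundle: preprocessing steps plus a model, shared as one unit."""

    def __init__(self, steps: List[Step], model: Callable[[Row], float]):
        self.steps = steps
        self.model = model

    def transform(self, rows: List[Row]) -> List[Row]:
        # Apply every preprocessing step in order.
        for step in self.steps:
            rows = step(rows)
        return rows

    def predict(self, rows: List[Row]) -> List[float]:
        # Prediction always goes through the same preprocessing chain.
        return [self.model(row) for row in self.transform(rows)]


def add_ratio(rows: List[Row]) -> List[Row]:
    # Feature engineering: derive a new column from two raw columns.
    return [{**r, "ratio": r["clicks"] / r["views"]} for r in rows]


def scale_views(rows: List[Row]) -> List[Row]:
    # Data preprocessing: rescale a raw column to thousands.
    return [{**r, "views": r["views"] / 1000.0} for r in rows]


def toy_model(row: Row) -> float:
    # Stand-in for a trained model (assumption, not a real Keras model).
    return 2.0 * row["ratio"] + row["views"]


pipeline = Pipeline([add_ratio, scale_views], toy_model)
predictions = pipeline.predict([{"clicks": 50.0, "views": 1000.0}])
```

Because the steps travel with the model, a local debugging run and a shared run would go through the same transformations, which is the consistency the second point asks for.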

Metadata

Labels

DataScience: Some issue about the application in data science
DiDi: The issue publisher is from DiDi
