Discussion: Auto-creation of Graphene Enums

In PR #98 I made an attempt to improve the auto generation of Graphene Enums, but it turned out that this feature still needs some discussion. Let's clarify here what we really want to achieve first.

In the following:
* g-enum = Graphene Enum types
* sa-enum = SQLAlchemy Enum type
* py-enum = Python Enum type
* sql-enum = Database enum data type

Some databases support sql-enums as native data types; on others they are emulated, typically with a named CHECK constraint. Note that both g-enums and sa-enums can be based either on py-enums or on simple lists of values.

For the examples below, I use the following imports:

```
from enum import Enum as PyEnum
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
```

When g-enums are created from sa-enums by `graphene_sqlalchemy`, they must be given a GraphQL type name. We must decide how these names are generated.

### 1) sa-enums based on py-enums

Example 1:

```
class PetKind(PyEnum):
    cat = 1
    dog = 2

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum(PetKind))
```

I think it's clear that in this case, the g-enum should be derived from `PetKind` and thus get the same name `PetKind`. Since GraphQL and Python have the same conventions for type/class names, this case is all good and fine.

### 1a) sa-enums based on py-enums, with SQL name

Example 1a:

```
class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum(PetKind, name='kind_of_pet'))
```

The sa-enums can be declared with a `name` argument, which is used as the name of the constraint or of the sql-enum type in the database. In this case, I think the name of the py-enum should take precedence for us; the generated g-enum should still be derived from `PetKind`, and thus have the name `PetKind`, not `KindOfPet`.

### 2) sa-enums based on values, with SQL name

Example 2:

```
class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog', name='kind_of_pet'))
```

Note: The name here is passed as an argument to `Enum`, not to `Column`, i.e. the sql-enum name is `kind_of_pet`, while the column name is `kind`. This case is interesting.

First, no matter which name we choose, it will not follow the GraphQL convention of writing type names in PascalCase. (I could not find this officially required or recommended, but I think it's a strong convention that also resonates with the Python class name convention.) So I believe an automatic name conversion should happen; the GraphQL name should be `Kind` or `KindOfPet` rather than `kind` or `kind_of_pet`.

Second, I think in this case the sql-enum name should take precedence, since it describes the enum itself and should be characteristic for the enum, while the column name only describes its usage relative to the model. Also, different column names could be used for the same enum. For example, we could have another table:

```
class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog', name='kind_of_pet'))
```

So in this case, I suggest the g-enum should be created like:

```
graphene.Enum('KindOfPet', [('CAT', 'cat'), ('DOG', 'dog')])
```

I.e. the g-enum should be named `KindOfPet` and not `Kind` or `FavoritePetKind`. Note that in addition to the type name, the symbol names have also been converted, but this is a separate issue that I discuss below in section 7.
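To illustrate the name conversion proposed in this section, here is a minimal sketch. The helper `to_type_name` is hypothetical (it is not an existing `graphene_sqlalchemy` function), and the exact conversion rules are an assumption on my part:

```python
import re


def to_type_name(name: str) -> str:
    """Convert a snake_case name (e.g. an sql-enum name) to PascalCase.

    Hypothetical helper, only meant to illustrate the proposed conversion.
    """
    # Split on anything that is not a letter or digit and drop empty parts.
    parts = [part for part in re.split(r"[^0-9A-Za-z]+", name) if part]
    # Upper-case only the first character so that inner capitals survive.
    return "".join(part[0].upper() + part[1:] for part in parts)


print(to_type_name("kind_of_pet"))  # KindOfPet
print(to_type_name("kind"))         # Kind
```

Names that already are in PascalCase pass through unchanged, so the same helper could be applied to all candidate names.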

### 3) sa-enums based on values, without SQL name

Example 3:

```
class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog'))
```

In this case, I think it would make sense to create a g-enum type named `Kind` after the column, since we have no other clue how the type should be named. However, as in the example above, the same type could also be used in a differently named column:

```
class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog'))
```

We have two options here. First, we could create another g-enum type named `FavoritePetKind` with the same values. Second, we could re-use the `Kind` type from above, but then it would not be clear which name to use: the first name, `Kind`, or the last name, `FavoritePetKind`, or the shortest or the longest name? We would need to specify an artificial rule for picking the name. So I suggest simply creating two g-enum types with the same values, but different names, for the two columns.

But we still have a problem: what if the column names are equal but the values differ, as in the following example?

Example 3b:

```
class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog'))
    eye_color = sa.Column(sa.Enum('amber', 'brown'))

class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('thinker', 'doer'))
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog'))
    eye_color = sa.Column(sa.Enum('blue', 'brown'))
```

My suggestion to solve this problem is to add the model name as a prefix and auto-generate the g-enum types `PetKind`, `PetEyeColor`, `PersonKind`, `PersonFavoritePetKind` and `PersonEyeColor`. We could try to be smart and reuse g-enums with the same values, adding the prefix only when there are value clashes, but I think for the sake of consistency and simplicity we should always add the model name as prefix, not only when there are conflicts.
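Taken together, the naming rules from sections 1 to 3 could be sketched as follows. This is only an illustration of the proposed precedence (py-enum name, then sql-enum name, then model name plus column name); `pick_enum_name` and its parameters are hypothetical and take plain strings instead of real SQLAlchemy objects:

```python
import re
from enum import Enum
from typing import Optional, Type


def _pascal(name: str) -> str:
    """Convert a snake_case name to PascalCase (hypothetical helper)."""
    parts = [part for part in re.split(r"[^0-9A-Za-z]+", name) if part]
    return "".join(part[0].upper() + part[1:] for part in parts)


def pick_enum_name(
    model_name: str,
    column_name: str,
    sql_name: Optional[str] = None,
    py_enum: Optional[Type[Enum]] = None,
) -> str:
    """Choose the GraphQL type name for an auto-generated g-enum."""
    if py_enum is not None:
        # Cases 1 and 1a: the py-enum name wins, even if an SQL name exists.
        return py_enum.__name__
    if sql_name:
        # Case 2: the sql-enum name describes the enum itself.
        return _pascal(sql_name)
    # Case 3: no other clue, so always prefix the column with the model name.
    return _pascal(model_name) + _pascal(column_name)


class PetKind(Enum):
    cat = 1
    dog = 2


print(pick_enum_name("Pet", "kind", sql_name="kind_of_pet", py_enum=PetKind))  # PetKind
print(pick_enum_name("Person", "favorite_pet_kind", sql_name="kind_of_pet"))   # KindOfPet
print(pick_enum_name("Person", "eye_color"))                                   # PersonEyeColor
```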

### 4) Should we enforce an Enum postfix? 

Should we also enforce a postfix of `Enum` for g-enum names? Note that this is not a GraphQL convention, but it may make sense to avoid name clashes with other types in the GraphQL schema, particularly when we auto-create names. I.e. in the example above, the generated g-enum would be `PetEyeColorEnum` instead of `PetEyeColor`. And do we want to use the postfix only for auto generated names or also add one to g-enums derived from py-enums or generated from sql-enum names? Of course, the postfix would not be added when it already exists.

Currently I think we should *not* add such a postfix, particularly if we auto-create types with the model name as prefix as suggested above, since then name clashes are much less likely. And when enums already have names, they would normally also not clash with model class names. I can't think of an example where this would be a problem.
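If we decided to enforce a postfix after all, the rule "add it only when it does not already exist" could look like this small sketch (`with_enum_postfix` is a hypothetical name):

```python
def with_enum_postfix(type_name: str, postfix: str = "Enum") -> str:
    """Append the postfix unless the type name already ends with it."""
    if type_name.endswith(postfix):
        return type_name
    return type_name + postfix


print(with_enum_postfix("PetEyeColor"))  # PetEyeColorEnum
print(with_enum_postfix("PetSortEnum"))  # PetSortEnum
```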

### 5) Retrieving g-enums

It is sometimes necessary to refer to the auto-generated g-enums, like when you are using them in GraphQL input types. In the example above, you may want to provide a query that takes a `PetKind` as argument and returns all pets of that kind. To do that, you need to refer to the g-enum generated from the `pets.kind` column.

I think we should provide two methods for this, one using the g-enum name as argument, another one using the column as argument. The following calls would then return the same g-enum:

```
get_enum_by_name('PetKind')

get_enum_for_column(Pet.kind)
```

The first method could also be used for retrieving the sort enums that `graphene_sqlalchemy` generates for models, to be used as arguments for sorting query results. There should also be a method for getting the sort g-enum that takes the model as parameter. These should return the same sort g-enum:

```
get_enum_by_name('PetSortEnum')

get_sort_enum_for_model(Pet)
```

Should `get_enum_for_column` and `get_sort_enum_for_model` automatically create and register a g-enum when nothing has been registered yet? I think so; that's better than throwing an error. This would allow retrieving a column g-enum for use in an argument *before* the corresponding column has been converted, or a sort g-enum without explicitly creating it first using a `set_sort_enum_for_model` function. Of course the latter should be provided to allow customizing the generated sort enum, but otherwise you would not even need it.
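The "create and register on first access" behavior could be sketched like this. Note the assumptions: `EnumRegistry` is a hypothetical stand-in for the real registry, plain Python enums stand in for g-enums, and the column is described by plain strings instead of a SQLAlchemy column object:

```python
from enum import Enum
from typing import Dict, Sequence, Tuple, Type


class EnumRegistry:
    """Hypothetical registry that creates column enums lazily."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Type[Enum]] = {}
        self._by_column: Dict[Tuple[str, str], Type[Enum]] = {}

    def get_enum_by_name(self, name: str) -> Type[Enum]:
        # Raises KeyError when no enum has been registered under this name.
        return self._by_name[name]

    def get_enum_for_column(
        self, model_name: str, column_name: str, values: Sequence[str]
    ) -> Type[Enum]:
        key = (model_name, column_name)
        if key not in self._by_column:
            # Nothing registered yet: create the enum now and remember it.
            type_name = model_name + "".join(
                part.capitalize() for part in column_name.split("_")
            )
            members = [(value.upper(), value) for value in values]
            enum_type = Enum(type_name, members)
            self._by_column[key] = enum_type
            self._by_name[type_name] = enum_type
        return self._by_column[key]


registry = EnumRegistry()
pet_kind = registry.get_enum_for_column("Pet", "kind", ["cat", "dog"])
# Both lookups return the very same enum type.
assert registry.get_enum_by_name("PetKind") is pet_kind
assert registry.get_enum_for_column("Pet", "kind", ["cat", "dog"]) is pet_kind
```

The important property is that both retrieval paths return the identical type, since a GraphQL schema must not contain two different types with the same name.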

### 6) Use the registry or utility or enum module?

To always retrieve the same g-enum that has been generated for a column or as a sort enum, the g-enums need to be stored in the `graphene_sqlalchemy` registry. The methods for retrieving will therefore naturally be methods of the Registry class. Should we also provide functions in the `utility` module with the same names, that take the registry as an optional parameter, using the global registry as default? Or maybe we should put all the enum support in a separate `enum` module instead? The functions for creating sort enums could then also live there. I think that makes sense.
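A module-level function that takes the registry as an optional parameter could be sketched as follows. The `Registry` class here is a deliberately tiny stand-in and `register_enum` is a hypothetical method name; only the "fall back to the global registry" pattern is the point:

```python
from typing import Dict, Optional


class Registry:
    """Minimal stand-in for the real registry class."""

    def __init__(self) -> None:
        self._enums: Dict[str, object] = {}

    def register_enum(self, name: str, enum_type: object) -> None:
        self._enums[name] = enum_type

    def get_enum_by_name(self, name: str) -> object:
        return self._enums[name]


_global_registry: Optional[Registry] = None


def get_global_registry() -> Registry:
    """Return the shared registry, creating it on first use."""
    global _global_registry
    if _global_registry is None:
        _global_registry = Registry()
    return _global_registry


def get_enum_by_name(name: str, registry: Optional[Registry] = None) -> object:
    """Module-level convenience wrapper around the registry method."""
    if registry is None:
        registry = get_global_registry()
    return registry.get_enum_by_name(name)
```

With this layout the function and the method share one implementation, so it matters little whether the wrapper ends up in the utility module or in a separate enum module.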

### 7) Values of g-enums

GraphQL has the convention ([recommended in the specs](https://graphql.github.io/graphql-spec/draft/#sec-Enum-Value)) that enum values are named like `ENUM_VALUE` instead of `enum_value` or `EnumValue`. On the database, no such convention exists, and sql-enums often use lowercase names. Python enums also often use uppercase member names, but this is not really a convention, and they are also often lowercase.

So the question is: Should we alter the names of enum values to be uppercase in the g-enum? I think so. Graphene already adapts the names of fields from Python to GraphQL conventions. It does *not* rename enum values, but leaves it up to you to follow the conventions, which I think makes sense there. However, in the case of `graphene_sqlalchemy` we cannot define the names as we like, but must take those from the database. So I think in that case it makes sense to transform the *names* of g-enum values from the database to Graphene so that they follow GraphQL conventions. The *values* of the g-enum values should not be transformed. See the example above:

```
graphene.Enum('KindOfPet', [('CAT', 'cat'), ('DOG', 'dog')])  # names transformed, values not
```

For those who really don't want to transform enum value names, we could support a registry attribute `convert_enum_value_names: bool` that would be True by default but could be set to False to deactivate the automatic conversion.
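As a sketch of how the conversion and the opt-out could work together: the helpers `to_enum_value_name` and `enum_members` are hypothetical, and the flag is passed as a function parameter here only to keep the example self-contained:

```python
import re
from typing import List, Sequence, Tuple


def to_enum_value_name(value: str) -> str:
    """Turn a database enum value into an upper-case GraphQL-style name."""
    # Replace runs of characters that are not letters or digits by "_".
    name = re.sub(r"[^0-9A-Za-z]+", "_", value).strip("_")
    return name.upper()


def enum_members(
    values: Sequence[str], convert_enum_value_names: bool = True
) -> List[Tuple[str, str]]:
    """Build (name, value) pairs; only the names are ever transformed."""
    return [
        (to_enum_value_name(value) if convert_enum_value_names else value, value)
        for value in values
    ]


print(enum_members(["cat", "dog"]))
# [('CAT', 'cat'), ('DOG', 'dog')]
print(enum_members(["cat", "dog"], convert_enum_value_names=False))
# [('cat', 'cat'), ('dog', 'dog')]
```

The resulting list has the shape that is passed to `graphene.Enum` in the example above.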

Please let me hear your opinions, particularly if you disagree or see issues I have forgotten to consider. I will then create a PR with the functionality we agreed upon.
