Discussion: Auto-creation of Graphene Enums #208

Closed
@Cito

In PR #98 I made an attempt to improve the auto generation of Graphene Enums, but it turned out that this feature still needs some discussion. Let's clarify here what we really want to achieve first.

In the following:

  • g-enum = Graphene Enum types
  • sa-enum = SQLAlchemy Enum type
  • py-enum = Python Enum type
  • sql-enum = Database enum data type

The sql-enums are supported as actual data types by some databases; on others they are emulated using string columns with CHECK constraints. Note that both g-enums and sa-enums can be based either on py-enums or on simple lists of values.

For the examples below, I use the following imports:

from enum import Enum as PyEnum
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

When g-enums are created from sa-enums by graphene_sqlalchemy, they must be given a GraphQL type name. We must decide how these names are generated.

1) sa-enums based on py-enums

Example 1:

class PetKind(PyEnum):
    cat = 1
    dog = 2

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum(PetKind))

I think it's clear that in this case, the g-enum should be derived from PetKind and thus get the same name PetKind. Since GraphQL and Python have the same conventions for type/class names, this case is all good and fine.

1a) sa-enums based on py-enums, with SQL name

Example 1a:

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum(PetKind, name='kind_of_pet'))

The sa-enums can be declared with a name argument, which is used as the name of the CHECK constraint or of the sql-enum type on the database. In this case, I think the name of the py-enum should take precedence for us; the generated g-enum should still be derived from PetKind, and thus have the name PetKind, not KindOfPet.

2) sa-enums based on values, with SQL name

Example 2:

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog', name='kind_of_pet'))

Note: The name here is passed as argument to Enum, not Column. I.e. the sql-enum name is kind_of_pet, while the column name is kind. This case is interesting.

First, no matter which name we choose, it will not follow GraphQL type name conventions according to which type names are written in PascalCase (though I don't find it officially required or recommended, I think it's a strong convention that also resonates with the Python class name convention). So I believe an automatic name conversion should happen; the GraphQL name should be Kind or KindOfPet rather than kind or kind_of_pet.

Second, I think in this case the sql-enum name should take precedence, since it describes the enum itself and should be characteristic for the enum, while the column name only describes its usage relative to the model. Also, different column names could be used for the same enum. For example, we could have another table:

class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog', name='kind_of_pet'))

So in this case, I suggest the g-enum should be created like:

graphene.Enum('KindOfPet', [('CAT', 'cat'), ('DOG', 'dog')])

I.e. the g-enum should be named KindOfPet and not Kind or FavoritePetKind. Note that in addition to the type name, the symbol names have also been converted, but this is a separate issue I will discuss below as point 7.
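The name conversion suggested above could be sketched as follows. This is only an illustration using the standard library; the helper name to_type_name is hypothetical and not part of graphene_sqlalchemy.

```python
import re


def to_type_name(sql_name: str) -> str:
    """Convert a snake_case SQL name to a PascalCase GraphQL type name.

    Hypothetical helper, for illustration only.
    """
    # Split on underscores and any other non-alphanumeric characters,
    # dropping empty parts caused by leading or repeated separators.
    parts = [part for part in re.split(r"[^0-9a-zA-Z]+", sql_name) if part]
    # Uppercase only the first character so inner capitals are preserved.
    return "".join(part[0].upper() + part[1:] for part in parts)


print(to_type_name("kind_of_pet"))  # KindOfPet
print(to_type_name("kind"))  # Kind
```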

3) sa-enums based on values, without SQL name

Example 3:

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog'))

In this case, I think it would make sense to create a g-enum type named Kind after the column, since we have no other clue how the type should be named. However, as in the example above, the same type could be also used in a differently named column:

class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog'))

We have two options here: First, we create another g-enum type named FavoritePetKind with the same values. Or we re-use the Kind type from above. But in that case it would not be clear which name to use: The first name, Kind, or the last name, FavoritePetKind, or the shortest name or the longest name? We would need to specify an artificial rule for picking the name. So I suggest simply creating two g-enum types with the same values, but different names, for the two columns.

But we still have a problem. What if the column names are equal but the values differ, as in the following?

Example 3b:

class Pet(Base):
    __tablename__ = "pets"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('cat', 'dog'))
    eye_color = sa.Column(sa.Enum('amber', 'brown'))

class Person(Base):
    __tablename__ = "persons"
    name = sa.Column(sa.String(30), primary_key=True)
    kind = sa.Column(sa.Enum('thinker', 'doer'))
    favorite_pet_kind = sa.Column(sa.Enum('cat', 'dog'))
    eye_color = sa.Column(sa.Enum('blue', 'brown'))

My suggestion to solve this problem is to add the model name as a prefix and auto-generate the g-enum types PetKind, PetEyeColor, PersonKind, PersonFavoritePetKind and PersonEyeColor. We could try to be smart and reuse g-enums with the same values, adding the prefix only when there are value clashes, but I think for the sake of consistency and simplicity we should always add the model name as prefix, not only when there are conflicts.
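Taken together, the precedence suggested in points 1 to 3 could look roughly like the sketch below. The function choose_enum_type_name and its parameters are assumptions made for illustration; a real implementation would read the py-enum class and the SQL name from the SQLAlchemy column type instead of taking them as arguments.

```python
from enum import Enum
from typing import Optional, Type


def to_pascal_case(name: str) -> str:
    """Turn snake_case into PascalCase (hypothetical helper)."""
    return "".join(part[:1].upper() + part[1:] for part in name.split("_") if part)


def choose_enum_type_name(
    model_name: str,
    column_name: str,
    py_enum: Optional[Type[Enum]] = None,
    sql_name: Optional[str] = None,
) -> str:
    """Pick a GraphQL type name following the precedence proposed above."""
    if py_enum is not None:
        # Cases 1 and 1a: the Python enum class name wins.
        return py_enum.__name__
    if sql_name:
        # Case 2: fall back to the converted SQL enum name.
        return to_pascal_case(sql_name)
    # Case 3: no other clue, so use model name plus column name.
    return model_name + to_pascal_case(column_name)


class PetKind(Enum):
    cat = 1
    dog = 2


print(choose_enum_type_name("Pet", "kind", py_enum=PetKind, sql_name="kind_of_pet"))
print(choose_enum_type_name("Person", "favorite_pet_kind", sql_name="kind_of_pet"))
print(choose_enum_type_name("Person", "eye_color"))
```

The three calls print PetKind, KindOfPet and PersonEyeColor, matching examples 1a, 2 and 3b.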

4) Should we enforce an Enum postfix?

Should we also enforce a postfix of Enum for g-enum names? Note that this is not a GraphQL convention, but it may make sense to avoid name clashes with other types in the GraphQL schema, particularly when we auto-create names. I.e. in the example above, the generated g-enum would be PetEyeColorEnum instead of PetEyeColor. And do we want to use the postfix only for auto generated names or also add one to g-enums derived from py-enums or generated from sql-enum names? Of course, the postfix would not be added when it already exists.

Currently I think we should not add such a postfix, particularly if we auto-create types with the model name as prefix as suggested above, since then name clashes are much less likely. And when enums already have names, they would normally also not clash with model class names. I can't think of an example where this would be a problem.
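For comparison, if we did decide to enforce the postfix, the rule "do not add it when it already exists" would be tiny. The helper name with_enum_postfix is hypothetical.

```python
def with_enum_postfix(type_name: str, postfix: str = "Enum") -> str:
    """Append the postfix unless the name already ends with it."""
    return type_name if type_name.endswith(postfix) else type_name + postfix


print(with_enum_postfix("PetEyeColor"))  # PetEyeColorEnum
print(with_enum_postfix("PetSortEnum"))  # PetSortEnum
```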

5) Retrieving g-enums

It is sometimes necessary to refer to the auto-generated g-enums, like when you are using them in GraphQL input types. In the example above, you may want to provide a query that takes a PetKind as argument and returns all pets of that kind. To do that, you need to refer to the g-enum generated from the pets.kind column.

I think we should provide two methods for this, one using the g-enum name as argument, another one using the column as argument. The following calls would then return the same g-enum:

get_enum_by_name('PetKind')

get_enum_for_column(Pet.kind)

The first method could also be used for retrieving the sort enums that graphene_sqlalchemy generates for models, to be used as arguments for sorting query results. There should also be a method for getting the sort g-enum that takes the model as parameter. These should return the same sort g-enum:

get_enum_by_name('PetSortEnum')

get_sort_enum_for_model(Pet)

Should the get_enum_for_column and get_sort_enum_for_model automatically create and register a g-enum when nothing has been registered yet? I think so, that's better than throwing an error. This would allow retrieving a column g-enum for usage in an argument used before the corresponding column, or a sort g-enum without explicitly creating it first using a set_sort_enum_for_model function. Of course the latter should be provided to allow customizing the generated sort enum, but otherwise you would not even need it.
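The create-on-first-access behaviour could be sketched like this. It is a simplified stand-in, not the actual graphene_sqlalchemy Registry: the class EnumRegistry is hypothetical, the standard library Enum replaces graphene.Enum, and a column is represented by a (model name, column name) pair with explicitly passed values instead of a real SQLAlchemy column.

```python
from enum import Enum
from typing import Dict, Sequence, Tuple, Type

# Stand-in for a real SQLAlchemy column: (model name, column name).
ColumnKey = Tuple[str, str]


class EnumRegistry:
    """Hypothetical registry that creates enums lazily on first access."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Type[Enum]] = {}
        self._by_column: Dict[ColumnKey, Type[Enum]] = {}

    def get_enum_by_name(self, name: str) -> Type[Enum]:
        # Raises KeyError if no enum with that name has been registered.
        return self._by_name[name]

    def get_enum_for_column(
        self, model: str, column: str, values: Sequence[str]
    ) -> Type[Enum]:
        key = (model, column)
        if key not in self._by_column:
            # Create and register the enum instead of raising an error.
            type_name = model + "".join(p.capitalize() for p in column.split("_"))
            enum_type = Enum(type_name, [(v.upper(), v) for v in values])
            self._by_column[key] = enum_type
            self._by_name[type_name] = enum_type
        return self._by_column[key]


registry = EnumRegistry()
pet_kind = registry.get_enum_for_column("Pet", "kind", ["cat", "dog"])
# Both lookups return the very same enum type.
print(registry.get_enum_by_name("PetKind") is pet_kind)
```

With this approach the order of definitions does not matter: whoever asks for the enum first triggers its creation, and every later caller gets the registered type.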

6) Use the registry or utility or enum module?

To always retrieve the same g-enum that has been generated for a column or as a sort enum, the g-enums need to be stored in the graphene_sqlalchemy registry. The methods for retrieving will therefore naturally be methods of the Registry class. Should we also provide functions in the utility module with the same names, that take the registry as an optional parameter, using the global registry as default? Or maybe we should put all the enum support in a separate enum module instead? The functions for creating sort enums could then also live there. I think that makes sense.

7) Values of g-enums

GraphQL has the convention (recommended in the specs) that enum values are named like ENUM_VALUE instead of enum_value or EnumValue. On the database, no such convention exists, and sql-enums often use lowercase names. Python enums also often use uppercase member names, but this is not really a convention, and they are also often lowercase.

So the question is: Should we alter the names of enum values to be uppercase in the g-enum? I think so. Graphene already adapts the names of fields from Python to GraphQL. It does not rename enum values, though, and leaves it up to you to follow the conventions, which I think makes sense. However, in the case of graphene_sqlalchemy we cannot define the names as we like, but must take them from the database. So I think in that case it makes sense to transform the names of g-enum values coming from the database so that they follow GraphQL conventions. The underlying values of the g-enum members should not be transformed. See the example above:

graphene.Enum('KindOfPet', [('CAT', 'cat'), ('DOG', 'dog')])  # names transformed, values not

For those who really don't want to transform enum value names, we could support a registry attribute convert_enum_value_names: bool that would be True by default but could be set to False to deactivate the automatic conversion.
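A minimal sketch of the conversion and the opt-out flag follows. The standard library Enum is used as a stand-in for graphene.Enum, and the helpers to_enum_value_name and make_enum are hypothetical; only the flag name convert_enum_value_names comes from the proposal above, and here it is a plain function argument rather than a registry attribute.

```python
import re
from enum import Enum
from typing import Sequence, Type


def to_enum_value_name(value: str) -> str:
    """Convert enum_value or EnumValue to ENUM_VALUE."""
    # Separate camelCase humps: "EnumValue" becomes "Enum_Value".
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", value)
    # Collapse any non-alphanumeric runs into single underscores.
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_")
    return name.upper()


def make_enum(
    type_name: str,
    values: Sequence[str],
    convert_enum_value_names: bool = True,
) -> Type[Enum]:
    """Build an enum whose member names are converted but whose values are not."""
    names = [
        to_enum_value_name(v) if convert_enum_value_names else v for v in values
    ]
    return Enum(type_name, list(zip(names, values)))


KindOfPet = make_enum("KindOfPet", ["cat", "dog"])
print(KindOfPet.CAT.value)  # cat
print(KindOfPet.DOG.value)  # dog
```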

Please let me hear your opinions, particularly if you disagree or if there are issues I have forgotten to consider. I will then create a PR with the functionality we agreed upon.
