Schema type checker #335

abigailalice · 2024-08-02T01:43:03Z

This should close #274 and #186, and potentially #291. The semantics are primarily based on @KaneTW's gist in #274, except with two improvements:

It handles array types
It returns an error if two columns in a TableSchema will have identical names, instead of passing silently

It also displays information in a slightly different way, trying to show as many errors as it can rather than bailing on the first. It also adds a number of test cases to ensure it behaves correctly for the primitive types, higher kinded types, and a few tests to ensure it fails in cases where it should fail. I'm mostly putting this forward to see what other people think, as I'm not sure if it should have other changes before being accepted. Some of my (admittedly bikesheddy) thoughts on the current state:

I wasn't able to re-export it from Rel8 without creating an import cycle, so it's only exported via its own module being public. For some reason the module wouldn't compile unless it imported Rel8, getting a type error that Columns wasn't injective otherwise. I'm assuming the issue is that something not being in scope is effecting type checking, but I've never actually seen an error like that, so really aren't sure. I'm also not really sure if Rel8.Table.Verify is the best name for the module, I just picked Table pretty arbitrarily.
I reused tests/Rel8/Generic/Rel8able/Test.hs quite a bit to add some things necessary for my own test cases, but I'm not really sure if that's an appropriate place for them. I could move them if that's preferred.
The current work won't show KaneTW's original work, since I just copy-pasted it. Not sure if there's a proper way to base work off a gist, or if it's even an issue, but I thought it should still be said
I kind of want to have a version of the type checking function which also reads the data from the table after checking the schema, as having a valid schema isn't enough to ensure existing data can be read by rel8, which can be important when doing migrations. However, this isn't possible just using the Statement type. I could just add a version using Session which does this, but for people using hasql-transaction they would still need to rewrite that function manually. This isn't a terribly difficult thing for users of the library to do themselves, since anyone using it is already going to need to convert from Statement to Transaction, and can just read from the table after type checking, but it just feels like an oversite to not have this. For the moment the issue is explained in the documentation for the function.

It should also be said that #59 is kind of exploring similar ideas, and @KaneTW also was exploring some other ideas for interfaces in other changes to the original gist.

KaneTW · 2024-08-02T04:42:01Z

There's a way to tag people in the commit text:

Co-authored-by: NAME <[email protected]>

For me it would be:

Co-authored-by: David Kraeutmann <[email protected]>

abigailalice · 2024-08-02T15:46:00Z

Oh, I think Fixed n and Char also have a bug, due to not handling modifiers correctly. I'll work on that, as well as add the co-author disclaimer.

The failed build seems to just be from an unrelated test which happens to fail non-determinstically, using too precise rational comparison. I could fix that too, though I'm not sure if the 1e-15 is supposed to be correct, or just picked arbitrarily. I've noticed a few non-deterministic tests while working on this.

abigailalice · 2024-08-03T19:40:11Z

Okay, I fixed the bugs for those corner cases, added some test cases to cover them, and added proper attribution. My only major concern now is that I needed to force push to correct the first commit message, and that may have affected other commits in the middle. I'm newish to git in groups, so let me know if that's an issue, and I can just make another branch without the issue, and issue a new pull request. It does say I want to merge 9 commits in, and that is certainly not the case. Otherwise, unless any of the bikeshedding is cause for concern, I'm happy with the current state of things.

I should say I've only tested this superficially on a live project; it was able to detect some bugs in my code in some rarely inserted into tables (due to fields which had been removed in Haskell, but not in the database). But otherwise most of the testing is from the test cases themselves. Primarily they work by taking a schema, using showCreateTable to generate the table, ensuring the typechecker doesn't detect any errors, then inserting some random data into it, but there are a few tests which also generate bad tables to ensure the type checker does fail when it should as well.

ocharles · 2024-08-05T13:07:00Z

Thanks for all your hard work here @abigailalice and @KaneTW! There's quite a lot here so it might take me a while to get through this as I'm really busy at work at the moment, but I will try and get to this 🙏

abigailalice · 2024-08-05T18:10:55Z

Take your time, just remember that the middle commits aren't anything related to the actual PR, only the first and last are.

ocharles · 2024-08-10T19:39:09Z

Just remember that the middle commits aren't anything related to the actual PR, only the first and last are.

This makes it very difficult to review. Could you rebase your branch so the changes are on master? Don't worry too much about commit attribution - we squash merge here anyway, so we can sort that separately.

This closes circuithub#274 and circuithub#186, allowing the generation of CREATE TABLE statements from a TableSchema, as well as checking a TableSchema against a database to determine if the tables are defined correctly in the database to be read from/written to by rel8. It also adds tests for the creation and type checking of tables, to ensure they succeed/fail in appropriate cases. Co-authored-by: David Kraeutmann <[email protected]>

abigailalice · 2024-08-12T22:13:11Z

Okay @ocharles, I think it should be at an acceptable state to review. Sorry about the difficulty. I should say the issues stated in (1), (2), and (4) in the original PR are still present, assuming anyone feels strongly about them.

abigailalice · 2024-08-31T15:31:28Z

@ocharles, is there something preventing this from being accepted, or have you just not had time to get around to it? Not an issue if it's the latter, just wanted to make sure if I misunderstood anything.

ocharles · 2024-08-31T15:48:17Z

@abigailalice purely time constraints I'm afraid! It's still on my radar. I'll be on holiday in a few days so this might need to wait until mid September

TeofilC

Thanks for this @abigailalice and apologies that this has been waiting for a review for so long!

This is very nice! I appreciate that you added some tests, and that the exposed functions have nice comments.

I've left a few suggestions, mostly little ideas for making things clearer.
If we resolve those I think this should be good to merge.

TeofilC · 2025-05-07T15:32:48Z

src/Rel8/Table/Verify.hs

+
+
+attrsToMap :: [Attribute Result] -> M.Map String (Attribute Result)
+attrsToMap = foldMap (\attr -> M.singleton (T.unpack $ attr.attribute.attname) attr)


Suggested change

attrsToMap = foldMap (\attr -> M.singleton (T.unpack $ attr.attribute.attname) attr)

attrsToMap = M.fromList . map (\attr -> (T.unpack attr.attribute.name, attr))

Something like this is likely to have better performance and is a bit clearer (at least to me)

TeofilC · 2025-05-07T15:36:00Z

src/Rel8/Table/Verify.hs

+                 , typeName = info.typeName}])
+  where
+    go :: [(String, TypeInfo)] -> M.Map String (NonEmpty.NonEmpty TypeInfo)
+    go = M.fromListWith (<>) . (fmap . fmap) pure


Suggested change

go = M.fromListWith (<>) . (fmap . fmap) pure

go = M.fromListWith (<>) . map (\(name, typeInfo) -> (name, NonEmpty.singleton typeInfo))

This is a bit more verbose but a bit clearer

TeofilC · 2025-05-07T15:38:29Z

src/Rel8/Table/Verify.hs

+-- In the event multiple columns have the same name, this will fail silently. To
+-- handle that case, see @'checkedShowCreateTable'@


Suggested change

-- In the event multiple columns have the same name, this will fail silently. To

-- handle that case, see @'checkedShowCreateTable'@

-- In the event that multiple columns have the same name, this will fail silently. To

-- handle that case, see 'checkedShowCreateTable'

TeofilC · 2025-05-07T16:10:09Z

src/Rel8/Table/Verify.hs

+-- table. This does not show relationships like primary or foreign keys, but can
+-- still be useful to see what types rel8 will expect of the underlying database.
+--
+-- In the event multiple columns have the same name, this will return a map of


Suggested change

-- In the event multiple columns have the same name, this will return a map of

-- In the event that multiple columns have the same name, this will return a map of

TeofilC · 2025-05-07T16:10:43Z

src/Rel8/Table/Verify.hs

+-- names to the labels identifying the column.
+checkedShowCreateTable :: Rel8able k => TableSchema (k Name) -> Either (M.Map String (NonEmpty [String])) String
+checkedShowCreateTable schema = case checkedSchemaToTypeMap schema.columns of
+    Left e -> Left $ (fmap . fmap) (\typ -> typ.label) e


Suggested change

Left e -> Left $ (fmap . fmap) (\typ -> typ.label) e

Left e -> Left $ map (\(name, typ) -> (name, typ.label)) e

Slightly clearer

TeofilC · 2025-05-07T16:20:35Z

src/Rel8/Table/Verify.hs

+--
+-- However, it is possible for migrations to cause valid data to become invalid,
+-- in ways not detectable by this function, if the migration code changes the
+-- schema correctly but doesn't handle the value-level constraints correctly, so


Suggested change

-- schema correctly but doesn't handle the value-level constraints correctly, so

-- schema correctly but doesn't handle the value-level constraints correctly. So

TeofilC · 2025-05-07T16:20:47Z

src/Rel8/Table/Verify.hs

+-- in ways not detectable by this function, if the migration code changes the
+-- schema correctly but doesn't handle the value-level constraints correctly, so
+-- it is a good idea to both read from the tables and check the schema for errors
+-- in a transaction during the migration. This former will catch value-level


Suggested change

-- in a transaction during the migration. This former will catch value-level

-- in a transaction during the migration. The former will catch value-level

TeofilC · 2025-05-07T16:29:08Z

src/Rel8/Table/Verify.hs

+-- This is used by @'schemaErrors'@ to more conveniently group every table an
+-- application relies on together, for typechecking the postgresql schemas
+-- together in a single batch.


Suggested change

-- This is used by @'schemaErrors'@ to more conveniently group every table an

-- application relies on together, for typechecking the postgresql schemas

-- together in a single batch.

-- This is used by @'schemaErrors'@ to conveniently group every table an

-- application relies on for typechecking the postgresql schemas

-- together in a single batch.

TeofilC · 2025-05-07T16:29:38Z

src/Rel8/Table/Verify.hs

+-- |@'getSchemaErrors'@ checks whether the provided schemas have correct postgresql
+-- column names and types to allow reading and writing from their equivalent Haskell
+-- types, returning a list of errors if that is not the case. The function does not
+-- crash on an encountered bug, instead leaving it to the caller to decide how


Suggested change

-- crash on an encountered bug, instead leaving it to the caller to decide how

-- crash on encountering a bug, instead leaving it to the caller to decide how

TeofilC · 2025-05-07T16:30:41Z

src/Rel8/Table/Verify.hs

+-- This function does nothing to check that the conflict target of an @Upsert@
+-- are valid for the schema, nor can it prevent invalid uses of @unsafeDefault@.
+-- However, it should be enough to catch the most likely errors.
+getSchemaErrors :: [SomeTableSchema] -> HS.Statement () (Maybe Text)


Do we want Text or String here. Some of the other functions return String but this returns Text. I don't think it makes much of a difference but I wanted to check whether you wanted it to be consistent.

abigailalice force-pushed the verify branch from 1218f14 to 4ce75b4 Compare August 3, 2024 19:28

abigailalice force-pushed the verify branch from 4ce75b4 to 81a52b0 Compare August 12, 2024 19:45

abigailalice force-pushed the verify branch from 72e4408 to a23cd82 Compare August 12, 2024 22:06

Merge branch 'master' into verify

5ee157b

Merge branch 'master' into verify

e8b9fe6

TeofilC requested changes May 7, 2025

View reviewed changes



		attrsToMap :: [Attribute Result] -> M.Map String (Attribute Result)
		attrsToMap = foldMap (\attr -> M.singleton (T.unpack $ attr.attribute.attname) attr)

	attrsToMap = foldMap (\attr -> M.singleton (T.unpack $ attr.attribute.attname) attr)
	attrsToMap = M.fromList . map (\attr -> (T.unpack attr.attribute.name, attr))

	go = M.fromListWith (<>) . (fmap . fmap) pure
	go = M.fromListWith (<>) . map (\(name, typeInfo) -> (name, NonEmpty.singleton typeInfo))

		-- In the event multiple columns have the same name, this will fail silently. To
		-- handle that case, see @'checkedShowCreateTable'@

	-- In the event multiple columns have the same name, this will return a map of
	-- In the event that multiple columns have the same name, this will return a map of

	Left e -> Left $ (fmap . fmap) (\typ -> typ.label) e
	Left e -> Left $ map (\(name, typ) -> (name, typ.label)) e

	-- schema correctly but doesn't handle the value-level constraints correctly, so
	-- schema correctly but doesn't handle the value-level constraints correctly. So

	-- in a transaction during the migration. This former will catch value-level
	-- in a transaction during the migration. The former will catch value-level

	-- crash on an encountered bug, instead leaving it to the caller to decide how
	-- crash on encountering a bug, instead leaving it to the caller to decide how

Schema type checker #335

Are you sure you want to change the base?

Schema type checker #335

Uh oh!

Conversation

abigailalice commented Aug 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

KaneTW commented Aug 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abigailalice commented Aug 2, 2024

Uh oh!

abigailalice commented Aug 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ocharles commented Aug 5, 2024

Uh oh!

abigailalice commented Aug 5, 2024

Uh oh!

ocharles commented Aug 10, 2024

Uh oh!

abigailalice commented Aug 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abigailalice commented Aug 31, 2024

Uh oh!

ocharles commented Aug 31, 2024

Uh oh!

TeofilC left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TeofilC May 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

abigailalice commented Aug 2, 2024 •

edited

Loading

KaneTW commented Aug 2, 2024 •

edited

Loading

abigailalice commented Aug 3, 2024 •

edited

Loading

abigailalice commented Aug 12, 2024 •

edited

Loading

TeofilC May 7, 2025 •

edited

Loading