This repository was archived by the owner on Sep 20, 2022. It is now read-only.

GeCo I2b2 data source: implementation of steps 1 and 2 #6

Open · wants to merge 82 commits into base: dev
Conversation

@mickmis (Contributor) commented Dec 23, 2021

Here is a working implementation of the i2b2-medco data source plugin for GeCo.

Summary:

  • implementation of an i2b2 docker image with test data
    • this new image does not include the i2b2 demo data (only the structure), so build/test/deployment is much faster; the CI is also configured to cache the layers of the docker builds
  • integration with the geco deployment, to easily reuse the same database
  • use of the data source plugin interface definition from GeCo's SDK package
  • implementation of an i2b2 XML API client enabling ontology browsing and explore queries (see the sketch after this list)
  • implementation of the GeCo data source interface enabling ontology browsing, explore queries, and cohort management
  • implementation of the database structure and operations for the data source's own database, which contains the explore query history and the saved cohorts
    • it loads its own structure at init when it finds the database to be empty
  • tests / CI / Makefile / deployment / etc.
    • notably, in internal: tests of the plugin through GeCo's data manager
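
As a rough illustration of the XML API client mentioned above, here is a minimal sketch; the message types, field names, and endpoint path are hypothetical stand-ins, not the plugin's actual ones:

```go
package i2b2

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"net/http"
)

// ontologyRequest and ontologyResponse are hypothetical stand-ins for the
// i2b2 XML messages; the real plugin defines its own types.
type ontologyRequest struct {
	XMLName xml.Name `xml:"message_body"`
	Parent  string   `xml:"get_children>parent"`
}

type ontologyResponse struct {
	XMLName  xml.Name `xml:"message_body"`
	Concepts []struct {
		Key  string `xml:"key"`
		Name string `xml:"name"`
	} `xml:"concepts>concept"`
}

// getOntologyChildren posts an XML request to the i2b2 ontology cell and
// decodes the XML response.
func getOntologyChildren(apiURL, parent string) (*ontologyResponse, error) {
	body, err := xml.Marshal(ontologyRequest{Parent: parent})
	if err != nil {
		return nil, err
	}

	resp, err := http.Post(apiURL+"/OntologyService/getChildren", "application/xml", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("i2b2 returned status %d", resp.StatusCode)
	}

	var parsed ontologyResponse
	if err := xml.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return nil, err
	}
	return &parsed, nil
}
```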

@mickmis changed the title from Step2 to GeCo I2b2 data source: implementation of steps 1 and 2 on Dec 23, 2021
@mickmis marked this pull request as ready for review on December 23, 2021 16:37
@f-marino left a comment


The core of what we agreed on has been implemented.
@mickmis I left just a few comments to address in the code. In addition to those, I think adding a description (even a small one) to the packages would be useful.


Next steps to have a fully working plugin:

  1. Implement search box and survival curves operations
  2. Factor out all GeCo dependencies
  3. Define how plugins are loaded in GeCo (a sketch using Go's standard plugin package follows below)
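
For step 3, one candidate is Go's standard plugin package, which matches the -buildmode=plugin Makefile target discussed further down. A minimal sketch of the loading side (the NewDataSource symbol name and its signature are a hypothetical convention; defining that convention is precisely what this step is about):

```go
package main

import (
	"fmt"
	"log"
	"plugin"
)

func main() {
	// Open the shared object produced by `go build -buildmode=plugin`.
	p, err := plugin.Open("./build/datasource.so")
	if err != nil {
		log.Fatalf("loading plugin: %v", err)
	}

	// Look up an exported symbol; "NewDataSource" and its signature are a
	// hypothetical convention that step 3 would have to define.
	sym, err := p.Lookup("NewDataSource")
	if err != nil {
		log.Fatalf("symbol lookup: %v", err)
	}

	factory, ok := sym.(func() (interface{}, error))
	if !ok {
		log.Fatalf("unexpected symbol type %T", sym)
	}

	ds, err := factory()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("loaded data source: %v\n", ds)
}
```

Note that Go's plugin package requires the host binary and the plugin to be built with the same toolchain, GOOS/GOARCH, and dependency versions, which is also why the cross-compilation question comes up in the Makefile review below.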

@@ -0,0 +1,3 @@
[submodule "third_party/geco"]
path = third_party/geco
url = [email protected]:ldsec/geco.git


To be deleted once the geco parts used by this plugin (i.e., as far as I understand, the dev deployment) are extracted to a public repo.

@@ -0,0 +1,35 @@
-- pl/pgsql function that returns (maximum lim) ontology elements whose paths contain the given search_string.

CREATE OR REPLACE FUNCTION i2b2metadata.get_ontology_elements(search_string varchar, lim integer DEFAULT 10)


this function is not up to date

@mickmis (Contributor, Author)


Indeed, none of the functions in 50-stored-procedures are currently used or up to date. I included them in anticipation of the next implementations, but they will likely need to be modified.

I can remove them for clarity if you prefer, or leave them here, maybe with a README saying what I just wrote.


Fine, don't touch them, I'll take care of it.


// ddlLoaded checks whether the data source's schema (and therefore its DDL) has already been created.
const ddlLoaded = `SELECT EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = $1);`

// createDeleteSchemaFunctions defines helper functions to create and delete the data source's schema.
const createDeleteSchemaFunctions = `


I guess you want to create a schema every time you create a datasource, to support the case where multiple datasources' info is stored on the same DB. If this is the case, why aren't you creating the explore query and saved cohorts tables in the corresponding schema? And if it is not the case, why are you doing it?

@mickmis (Contributor, Author)


The intention was more to move the database structure creation logic into the code, rather than rely on the deployment to do so. I find this simplifies the deployment since:

  1. in the test deployment we have less control over the database, since its deployment is done through geco
  2. in the production deployment the plugin is loaded by geco and not as an independent runtime, and I wanted to avoid having to run devops tooling from geco

And yes, another objective was to support multiple data sources of the same type.

> why aren't you creating the explore query and saved cohorts tables in the corresponding schema

They are created, through a separate statement, ddlStatement.
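
A minimal sketch of that init logic, assuming the plugin uses database/sql with the lib/pq driver, and that the ddlLoaded and ddlStatement constants from the code under review are in scope (the initDatabase wrapper itself is hypothetical):

```go
package database

import (
	"database/sql"
	"fmt"

	"github.com/lib/pq"
)

// initDatabase loads the data source's database structure at init if its
// schema does not exist yet.
func initDatabase(db *sql.DB, schemaName string) error {
	var exists bool
	if err := db.QueryRow(ddlLoaded, schemaName).Scan(&exists); err != nil {
		return fmt.Errorf("checking schema existence: %w", err)
	}
	if exists {
		return nil
	}

	// Create the schema first; the tables in ddlStatement are created
	// unqualified and land in the schema configured through the
	// connection string (see the search_path discussion below).
	if _, err := db.Exec("CREATE SCHEMA IF NOT EXISTS " + pq.QuoteIdentifier(schemaName)); err != nil {
		return fmt.Errorf("creating schema: %w", err)
	}
	if _, err := db.Exec(ddlStatement); err != nil {
		return fmt.Errorf("loading DDL: %w", err)
	}
	return nil
}
```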


It's probably because of my limited proficiency in SQL, but in ddlStatement I don't see the schema in which the two tables are created. Aren't the tables created in the default schema (i.e., public) in this case?

@mickmis (Contributor, Author)


So for this, the connection string passed to the PostgreSQL driver contains the schema (here): if the schema is not specified in the SQL query, it defaults to the one passed to the driver. In this case, the name of the schema is configurable.
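
Concretely, lib/pq (and pgx) forward run-time parameters present in the connection string, including search_path, to PostgreSQL at connection startup, so unqualified table names resolve to the configured schema. A minimal sketch (the DSN values are placeholders):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // registers the "postgres" driver
)

func main() {
	// search_path is forwarded to the server at startup, so unqualified
	// table names in queries resolve to this schema.
	dsn := "host=localhost dbname=geco user=geco password=geco sslmode=disable search_path=my_datasource_schema"

	db, err := sql.Open("postgres", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var searchPath string
	if err := db.QueryRow("SHOW search_path").Scan(&searchPath); err != nil {
		log.Fatal(err)
	}
	fmt.Println("effective search_path:", searchPath)
}
```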

Comment on lines +28 to +29
go-build-plugin:
go build -buildmode=plugin -v -o ./build/ ./cmd/...


Suggested change:

```diff
-go-build-plugin:
-	go build -buildmode=plugin -v -o ./build/ ./cmd/...
+go-build-plugin: export GOOS=linux
+go-build-plugin:
+	go build -buildmode=plugin -v -o ./build/ ./cmd/...
```

I guess here we should always cross-compile.

@mickmis (Contributor, Author)


I would say it depends on what is done on the geco side, as the produced binary for the plugin should be compatible with the geco binary.


Never mind; actually I had to work a lot to make it work with the dockerized geco. I'll take care of it.

@mickmis (Contributor, Author) commented Jan 18, 2022

@f-marino @romainbou
I'm done with the feedback on both PRs (modulo small additional things depending on @f-marino's answers); in terms of time spent, this was mostly producing additional documentation.

FYI I have about 1 hour left to spend. I don't know how you would like me to spend it: additional fixes, a meeting for a debrief/walkthrough of the code, or anything else.

@f-marino

There is only one last comment to address.
For the remaining hour, I'd say let's keep it there for the moment, and let's see how my work on the integration evolves, in case unexpected issues emerge that need @mickmis's experience.

@f-marino

@mickmis everything went smoothly enough with the integration, so I didn't need your help.

The last thing you can do for the remaining hour is add some comments/documentation about the i2b2 image (everything under /build/i2b2): a small comment for the most relevant files describing what they do, with the most important parameters to take into account in case we want to modify something, if any.

In particular, I noticed that the dump of the XML requests addressed to i2b2 is no longer in the i2b2 logs; could you tell us which parameter we have to tweak to get them back?

@mickmis (Contributor, Author) commented Feb 18, 2022

> @mickmis everything went smoothly enough with the integration, so I didn't need your help.

Great news!

> The last thing you can do for the remaining hour is add some comments/documentation about the i2b2 image (everything under /build/i2b2): a small comment for the most relevant files describing what they do, with the most important parameters to take into account in case we want to modify something, if any.

OK, this should be done now: I've added several READMEs that should contain all the information needed.

> In particular, I noticed that the dump of the XML requests addressed to i2b2 is no longer in the i2b2 logs; could you tell us which parameter we have to tweak to get them back?

If you mean the dump by i2b2 itself, it is controlled by the AXIS2_LOGLEVEL environment variable of the i2b2 docker image, which you can set to DEBUG. I don't recommend it though, as it is usually way too verbose.

If you mean the dump by the data source (which I recommend), it is logged at the debug level.
As you may have noticed, the logging is actually controlled outside of the data source through a provided Logger logrus.FieldLogger, so this logging level must be set by the component that inits the data source.
As an example, in the tests the level is set to debug, see pkg/datasource/datasource_test.go:31.
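
For reference, a minimal sketch of that wiring (logrus and its FieldLogger interface are real; the NewDataSource constructor here is a hypothetical stand-in for the plugin's actual one):

```go
package main

import (
	"github.com/sirupsen/logrus"
)

// NewDataSource is a hypothetical stand-in for the plugin's constructor,
// which receives a logrus.FieldLogger instead of creating its own logger.
func NewDataSource(logger logrus.FieldLogger) interface{} {
	logger.Debug("data source initialized")
	return struct{}{}
}

func main() {
	// The data source logs the i2b2 XML requests at debug level, so the
	// component that instantiates it decides whether they appear.
	logger := logrus.New()
	logger.SetLevel(logrus.DebugLevel)

	_ = NewDataSource(logger)
}
```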

@f-marino

In MedCo, we were able to see the XML requests in the logs when browsing the ontologies or performing queries (like the one in the picture below), but that is not the case here.

[image: screenshot of the MedCo logs showing a dumped XML request]

So setting AXIS2_LOGLEVEL to DEBUG should do the trick, right?

@mickmis (Contributor, Author) commented Feb 18, 2022

> So setting AXIS2_LOGLEVEL to DEBUG should do the trick, right?

Yes, however the output is pretty unreadable, so I suggest logging it at the data source level instead, cf. my previous response.
