Skip to content

Stop place import script #151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Stop place import script #151

wants to merge 5 commits into from

Conversation

culka
Copy link
Contributor

@culka culka commented Mar 17, 2025

Basic stop place import script for testing


This change is Reviewable

importer.py Outdated

import sys

if sys.version_info[0] != 3:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python2-tulkki ei koskaan pääse suorittamaan tätä ehtoa, koska lähdekoodissa on käytetty ominaisuuksia, joita python2 ei tunne

eli tän iffin voi poistaa

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

README.md Outdated

### How to use

Modify the `importer.py` script by redefining the source database details and the target graphql address and admin secret. By default the script runs from the local jore3 test database and uses the base local Jore4 Hasura instance as the target.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Käytä ennemmin jotain kirjastoa, joka lukee env-tiedoston tai käytä suoraan ympäristömuuttujia.

Esim. https://pypi.org/project/django-environ/ on helppokäyttöinen (eikä vaadi djangoa depsinään vaikka django nimessä onkin)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nyt arvot luetaan .env tiedostosta.

importer.py Outdated

for stopArea in stopAreas:
areaStops = get_jore3_stops_for_area(stopArea['pysalueid'])
if (areaStops):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tämä iffi on turha. Toki iteraatiolla jää nuo alustukset tekemättä, mutta sillä ei oikeasti ole mitään väliä.

Lisäksi jos poistat tämän, voit muuttaa get_jore4_stops_for_area-funktion generaattoriksi, jolloin ei tarvii ladata kaikkea kamaa listaan (muistiin) -- tällä nyt ei luultavasti ole pahemmin merkitystä, mutta voi hyvinkin nopeattaa importtia, kun muistivaraukset vähenee

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Muutettu generaattoriksi

importer.py Outdated
jore3DatabaseUrl = "localhost:1433"
jore3DatabaseName = "jore3testdb"

def get_jore3_stops_for_area(pysakki_alue):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jos vähän modailet tätä kutsuvaa koodia, niin voit muuttaa tämän generaattoriksi (https://wiki.python.org/moin/Generators)

Käytännössä poista rivit 22 39 ja 42 ja muuta rivi 37 muotoon yield row

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tulipa opiskeltua miten nämä toimivat, mutta nyt taitaa olla generaattorina.

importer.py Outdated
if (tulos):
return tulos

return ''

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huono käytäntö palauttaa muaalla list, mutta siinä tapauksessa tyhjä merkkijono, kun ei löytynyt mitään.

myös tyhjä lista on pythonissa falsy eli tästä syystä ei tartte palauttaa tyhjää merkkijonoa:

>>> bool('')
False
>>> bool([])
False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
for v in values:
result_dict.setdefault(v['label'], []).append(v)
return result_dict
return ''

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tyhjä dict on myös falsy pythonissa:

>>> bool({})
False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
'validity_end': x['validity_end']
} for x in stop_points]
for v in values:
result_dict.setdefault(v['label'], []).append(v)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tää on (mielestäni) funktionaalisesti sama asia, kun rivien 108 - 117 toteutus, eikä looppaile kahdesti listaa läpi ja tallentele väliaikaista listaa:

for x in stop_points:
    result_dict.setdefault(x['label'], []).append({
            'label': x['label'],
            'lat': x['measured_location']['coordinates'][1],
            'lon': x['measured_location']['coordinates'][0],
            'priority': str(x['priority']),
            'validity_start': x['validity_start'],
            'validity_end': x['validity_end']
    })

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
for row in cursor:
tulos.append(row)

if (tulos):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pythonissa ei käytetä sulkeita if:n expression ympärillä eli pitäisi olla

if tulos:
    ...

ja tässä tosiaan ihan turha iffi, kun return tulos korvaa rivit 39 - 42 ilman että toiminnallisuus muuttuu ('' on falsy)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poistettu muutenkin generaattorimuutoksen myötä

importer.py Outdated
for row in cursor:
tulos.append(row)

if (tulos):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rivit 85 - 88 voidaan korvata yhdellä returnilla: return tulos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
validityEnd = stopPoint['validity_end']
quayInput.append(quayInputForJore3Stop(stop, stopPoint['label'], validityStart, validityEnd , lon, lat))
latCoords.append(lat)
lonCoords.append(lon)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tekisin ehkä vaan yhen listan coords, jossa lat ja lon ois tuplena eli tässä koodi ois:

coords.append((lat, lon))

tai käyttäis esim. named-tuplea tai dataclassia (https://docs.python.org/3/library/dataclasses.html) jotta saa coordinate-tyypille järkevät attribuuttien nimet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jätin nämä nyt ennalleen lopun keskiarvon laskemista varten.

@culka culka force-pushed the stop-place-import-script branch from 6b972e2 to 815b891 Compare March 18, 2025 07:02
Copy link
Contributor Author

@culka culka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 16 unresolved discussions (waiting on @janneronkko)

importer.py Outdated

import sys

if sys.version_info[0] != 3:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
jore3DatabaseUrl = "localhost:1433"
jore3DatabaseName = "jore3testdb"

def get_jore3_stops_for_area(pysakki_alue):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tulipa opiskeltua miten nämä toimivat, mutta nyt taitaa olla generaattorina.

importer.py Outdated
inner join jr_lij_pysakkialue pa on (p.pysalueid = pa.pysalueid)
inner join jr_esteettomyys e on (p.soltunnus = e.tunnus)
inner join jr_varustelutiedot_uusi vt on (p.soltunnus = vt.tunnus)
where p.pysalueid = '{pysakki_alue}'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
for row in cursor:
tulos.append(row)

if (tulos):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poistettu muutenkin generaattorimuutoksen myötä

importer.py Outdated
if (tulos):
return tulos

return ''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
'validity_end': x['validity_end']
} for x in stop_points]
for v in values:
result_dict.setdefault(v['label'], []).append(v)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated
for v in values:
result_dict.setdefault(v['label'], []).append(v)
return result_dict
return ''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

importer.py Outdated

for stopArea in stopAreas:
areaStops = get_jore3_stops_for_area(stopArea['pysalueid'])
if (areaStops):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Muutettu generaattoriksi

importer.py Outdated
validityEnd = stopPoint['validity_end']
quayInput.append(quayInputForJore3Stop(stop, stopPoint['label'], validityStart, validityEnd , lon, lat))
latCoords.append(lat)
lonCoords.append(lon)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jätin nämä nyt ennalleen lopun keskiarvon laskemista varten.

README.md Outdated

### How to use

Modify the `importer.py` script by redefining the source database details and the target graphql address and admin secret. By default the script runs from the local jore3 test database and uses the base local Jore4 Hasura instance as the target.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nyt arvot luetaan .env tiedostosta.

@culka culka force-pushed the stop-place-import-script branch from 815b891 to b3131ec Compare March 18, 2025 09:45
@culka culka force-pushed the stop-place-import-script branch from b3131ec to 5ba22ae Compare March 20, 2025 07:49
@jarkkoka jarkkoka requested a review from janneronkko March 20, 2025 12:20
Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 17 unresolved discussions (waiting on @janneronkko)


README.md line 550 at r4 (raw file):

GRAPHQL_SECRET=
JORE3_USERNAME=
JORE3_PASSWORD=

The README.md doesn't tell what are the database preconditions for running the Python script.

Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 19 unresolved discussions (waiting on @culka and @janneronkko)


README.md line 542 at r4 (raw file):

By default the script runs from the local jore3 test database and uses the base local Jore4 Hasura instance as the target.
You can change the source database and target Hasura instance by creating a `.env` file in the same directory as the script.

Documentation is needed on what are the compatible versions of jore4-hasura' and other microservices. Actually, I think it would be better to make changes to the docker-compose.custom.yml` file.


README.md line 544 at r4 (raw file):

You can change the source database and target Hasura instance by creating a `.env` file in the same directory as the script.

Set the values for variables you want to set:

It is necessary to state how the data is expected to be exported to Azure. By using Azure's Hasura endpoint directly or taking a snapshot of a local database. If the latter is the case, what is the required pg_dump command?

- Save stop details in the stop_place - quay model
- Use requirements.txt to define libraries
- Update README
- Update docker setup
@culka culka force-pushed the stop-place-import-script branch from 5ba22ae to 587d5e1 Compare March 24, 2025 11:10
Copy link
Contributor Author

@culka culka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, 19 unresolved discussions (waiting on @janneronkko and @jarkkoka)


README.md line 542 at r4 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

Documentation is needed on what are the compatible versions of jore4-hasura' and other microservices. Actually, I think it would be better to make changes to the docker-compose.custom.yml` file.

Updated the file and startup script


README.md line 550 at r4 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

The README.md doesn't tell what are the database preconditions for running the Python script.

Added

Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r5, 1 of 5 files at r6.
Reviewable status: 3 of 8 files reviewed, 22 unresolved discussions (waiting on @culka and @janneronkko)


stopimport/Dockerfile line 11 at r6 (raw file):

COPY .env .

CMD ["python", "importer.py"]

Line break missing at the end


stopimport/run-stopimport.sh line 1 at r6 (raw file):

#!/usr/bin/env bash

The containing directory should be renamed from stopimport to stop_registry_import (note the underscores to improve reading) since plain "stopimport" is misleading, since the actual Java application also imports stops. We need to make it clearer what this Python application does.

In addition, I think this script should be renamed to run-stop-reg-import.sh.


stopimport/run-stopimport.sh line 3 at r6 (raw file):

#!/usr/bin/env bash

docker build -t stop-importer .

Rename stop-importer to stop-reg-importer. We need to make it clearer what this Docker container does, since the actual Java application also imports stops. We don't want to confuse new developers coming to this project in the future.

@culka culka force-pushed the stop-place-import-script branch from 0608f38 to ac6e8e9 Compare March 28, 2025 08:12
Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 6 files at r7.
Reviewable status: 3 of 9 files reviewed, 24 unresolved discussions (waiting on @culka and @janneronkko)


README.md line 537 at r7 (raw file):

---

## Jore3 stop import script

Use "Jore 3" spelling here as is used consistently in this README.


README.md line 545 at r7 (raw file):

### How to use

To run the stop registry importer script you need to have a Jore3 database, a populated Jore4 routes database, Jore4 Hasura and Jore4 Tiamat running.

Let's fix the Jore spelling to be consistent with the rest of this README file.

Change this sentence to the form:

"To run the stop register importer script, you must have a Jore 3 database, a populated Jore 4 routes database, and the jore4-hasura and jore4-tiamat microservices running."


README.md line 561 at r7 (raw file):

You should have run the base Jore3 importer first which ensures the Jore4 database has the required scheduled stop points. Then run the stop registry import script which will match scheduled stop points in the Jore4 routes database with stops in the Jore3 database and generate GraphQL mutations to Hasura/Tiamat according to the data. The script will also link the generated stop registry stops with the scheduled stop points by their NeTEx ID using Hasura for the mutation.

Let's be consistent in Jore spelling with the rest of the README file:

  • Jore3 => Jore 3
  • Jore4 => Jore 4

Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 9 files reviewed, 25 unresolved discussions (waiting on @culka and @janneronkko)


stop-registry-importer/importer.py line 35 at r7 (raw file):

        with conn.cursor(as_dict=True) as cursor:

            cursor.execute(f"""select * from jr_pysakki p

The SQL query does not conform to the general SQL style guide used e.g. in the jore4-hasura repository where SELECT, FROM and other such keywords are in uppercase.

Copy link
Contributor

@jarkkoka jarkkoka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 9 files reviewed, 27 unresolved discussions (waiting on @culka and @janneronkko)


stop-registry-importer/importer.py line 51 at r7 (raw file):

        with conn.cursor(as_dict=True) as cursor:

            cursor.execute("""select pa.nimi, pa.pysalueid, s.sollistunnus, s.solkirjain from jr_pysakki p

For better reading, FROM should start from a new line.


stop-registry-importer/importer.py line 55 at r7 (raw file):

            inner join jr_lij_pysakkialue pa on (p.pysalueid = pa.pysalueid)
            where p.pysalueid in (
            select jp.pysalueid from jr_pysakki jp

For better reading, FROM should start from a new line

@culka culka force-pushed the stop-place-import-script branch from ac6e8e9 to 83b6e96 Compare March 28, 2025 12:05
@culka culka force-pushed the stop-place-import-script branch from be16dc8 to 9a8853d Compare April 3, 2025 13:56
Copy link
Contributor Author

@culka culka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 10 files reviewed, 27 unresolved discussions (waiting on @janneronkko and @jarkkoka)


README.md line 537 at r7 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

Use "Jore 3" spelling here as is used consistently in this README.

Both spellings seem to be used in this file though?


README.md line 545 at r7 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

Let's fix the Jore spelling to be consistent with the rest of this README file.

Change this sentence to the form:

"To run the stop register importer script, you must have a Jore 3 database, a populated Jore 4 routes database, and the jore4-hasura and jore4-tiamat microservices running."

Done.


README.md line 561 at r7 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

Let's be consistent in Jore spelling with the rest of the README file:

  • Jore3 => Jore 3
  • Jore4 => Jore 4

Done.


stopimport/Dockerfile line 11 at r6 (raw file):

Previously, jarkkoka (Jarkko Kaura) wrote…

Line break missing at the end

There is line break at the end?

@culka culka force-pushed the stop-place-import-script branch from 9a8853d to 34c9ef8 Compare April 7, 2025 06:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants