Hello everyone!
The Problem
First of all, thanks for the fantastic work! I would like to get skills-airflow and skills-api up and running, but the instructions provided don't seem to be enough for me to do so. Maybe we can clarify things here and improve the documentation together.
What I did so far for skills-airflow
- set up the virtual environment in the skills-airflow repo (Python 3.6.0) and pip installed requirements.txt and requirements_dev.txt
- installed PostgreSQL and created a database I called `daw_db`
- updated `config/api_v1_db_config.yaml` to

```yaml
PGPORT: 5432
PGHOST: localhost
PGDATABASE: daw_db
PGUSER: daw_db
PGPASSWORD:
```
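In case it helps narrow things down, a minimal check along these lines (assuming psycopg2 and PyYAML are available) should show whether the settings above reach the database at all:

```python
# check_db.py -- sanity-check the settings in config/api_v1_db_config.yaml
# (assumes psycopg2 and PyYAML are installed in the virtualenv)
import yaml
import psycopg2

with open('config/api_v1_db_config.yaml') as f:
    cfg = yaml.safe_load(f)

conn = psycopg2.connect(
    host=cfg['PGHOST'],
    port=cfg['PGPORT'],
    dbname=cfg['PGDATABASE'],
    user=cfg['PGUSER'],
    password=cfg['PGPASSWORD'] or '',  # the empty password in the YAML loads as None
)
print('connected to', conn.get_dsn_parameters().get('dbname'))
conn.close()
```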
- the `alembic upgrade head` command fails for me; not sure whether that's important?
- set up the following S3 buckets, which are empty right now (a quick existence check is sketched below the list):
  - my-geo-bucket
  - my-job-postings
  - my-labeled-postings
  - my-model-cache
  - my-onet
  - my-output-tables
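To confirm the buckets exist and are reachable, something like this boto3 sketch should work (it assumes boto3 is installed and AWS credentials are configured):

```python
# check_buckets.py -- verify the (currently empty) buckets exist and are
# reachable with the configured AWS credentials (assumes boto3)
import boto3
from botocore.exceptions import ClientError

buckets = [
    'my-geo-bucket', 'my-job-postings', 'my-labeled-postings',
    'my-model-cache', 'my-onet', 'my-output-tables',
]

s3 = boto3.client('s3')
for name in buckets:
    try:
        s3.head_bucket(Bucket=name)  # raises ClientError if missing or forbidden
        print(name, 'ok')
    except ClientError as e:
        print(name, 'FAILED:', e.response['Error']['Code'])
```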
- I copied example_config.yaml to config.yaml
- running `airflow scheduler` gives the following output:

```
[2019-03-28 12:01:06,790] {__init__.py:51} INFO - Using executor SequentialExecutor
____________ _____________
____ |__( )_________ __/__ /________ __
____ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
[2019-03-28 12:01:07,490] {jobs.py:1477} INFO - Starting the scheduler
[2019-03-28 12:01:07,490] {jobs.py:1485} INFO - Running execute loop for -1 seconds
[2019-03-28 12:01:07,491] {jobs.py:1486} INFO - Processing each file at most -1 times
[2019-03-28 12:01:07,491] {jobs.py:1489} INFO - Searching for files in /Users/matthausheer/airflow/dags
[2019-03-28 12:01:07,504] {jobs.py:1491} INFO - There are 19 files in /Users/matthausheer/airflow/dags
[2019-03-28 12:01:07,506] {jobs.py:1534} INFO - Resetting orphaned tasks for active dag runs
[2019-03-28 12:01:07,517] {dag_processing.py:453} INFO - Launched DagFileProcessorManager with pid: 34311
[2019-03-28 12:01:07,536] {settings.py:51} INFO - Configured default timezone <Timezone [UTC]>
[2019-03-28 12:01:07,568] {dag_processing.py:663} ERROR - Cannot use more than 1 thread when using sqlite. Setting parallelism to 1
[2019-03-28 12:01:08,002] {jobs.py:1559} INFO - Harvesting DAG parsing results
[2019-03-28 12:01:09,630] {jobs.py:1559} INFO - Harvesting DAG parsing results
...
```
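I also notice the `Cannot use more than 1 thread when using sqlite` line, which suggests the scheduler is still using the default SQLite metadata database rather than Postgres. To see whether the 19 DAG files at least parse, a quick DagBag check along these lines should work (run inside the skills-airflow virtualenv, with the default DAGs folder from the log):

```python
# check_dags.py -- see whether the DAG files in the configured dags folder
# import cleanly
from airflow.models import DagBag

dagbag = DagBag()  # loads from the configured DAGs folder
print('dags loaded:', list(dagbag.dags))
print('import errors:', dagbag.import_errors)  # {filepath: traceback}
```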
What I did so far for skills-api
- set up the virtual environment (Python 2.7.11) in the skills-api repo and installed requirements.txt
- ran `bin/make_config.sh`, specifying `postgresql://localhost/daw_db`
- ran `python server.py runserver`, which starts a server, but requesting `http://127.0.0.1:5000/v1/jobs` gives this error:

```
ProgrammingError: (psycopg2.ProgrammingError) relation "jobs_alternate_titles" does not exist
LINE 3: FROM jobs_alternate_titles) AS anon_1
```
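My suspicion is that this missing relation ties back to the `alembic upgrade head` failure above, i.e. the tables were never created. A short SQLAlchemy sketch like this (using the same connection URL as above) should show which tables actually exist in `daw_db`:

```python
# list_tables.py -- list the tables that actually exist in daw_db, to see
# whether the alembic migration created anything (assumes SQLAlchemy)
from sqlalchemy import create_engine, inspect

engine = create_engine('postgresql://localhost/daw_db')
inspector = inspect(engine)
print(inspector.get_table_names())  # jobs_alternate_titles should be here
```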
The Question
- What exactly do I have to place into the S3 buckets, in which format, and with what naming conventions?
- Did I miss anything else?
Some help would be greatly appreciated!
Cheers