Description
Hi, when I run python LLM_R2.py
to get the query rewrite results for the JOB benchmark, the code fails with the following error:
#########################################
this code is running on: cuda:0
#########################################
Traceback (most recent call last):
  File "/home/syy/LLM-R2/src/LLM_R2.py", line 720, in <module>
    LLM_R2(dataset, method, num_promos)
  File "/home/syy/LLM-R2/src/LLM_R2.py", line 542, in LLM_R2
    promo_pool_pos = get_pool('/home/syy/LLM-R2/data/data_llmr2/pools/pos_pool_' + dataset + '_updated.csv', method)
  File "/home/syy/LLM-R2/src/LLM_R2.py", line 514, in get_pool
    embeddings = batcher(batch2, pool_df['db_id'].tolist())
  File "/home/syy/LLM-R2/src/LLM_R2.py", line 54, in batcher
    sent_features = prepare_enc_data(sentences, pre_lang_model, db_ids)
  File "/home/syy/LLM-R2/src/encoder.py", line 324, in prepare_enc_data
    nodes = [traversePlan(get_physical_tree(db_id, sql_input)) for sql_input in query_dataset[i]]
  File "/home/syy/LLM-R2/src/encoder.py", line 324, in <listcomp>
    nodes = [traversePlan(get_physical_tree(db_id, sql_input)) for sql_input in query_dataset[i]]
  File "/home/syy/LLM-R2/src/encoder.py", line 73, in traversePlan
    root_dict = root[0]
IndexError: list index out of range
I'm sure the input path and dataset
in LLM_R2.py are set correctly, because I have successfully obtained query rewrite results for the TPC-H and DSB benchmarks. I'm not sure why this error happens.
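To narrow this down, a check like the one below could list which queries yield an unusable plan before traversePlan indexes root[0]. This is only a sketch under assumptions: find_empty_plans and its get_plan argument are hypothetical names that are not part of the repository, and I am assuming that get_physical_tree(db_id, sql) returns an empty list (rather than raising) when it cannot obtain a plan for a query.

```python
from typing import Callable, List, Sequence, Tuple


def find_empty_plans(
    db_ids: Sequence[str],
    queries: Sequence[str],
    get_plan: Callable[[str, str], list],
) -> List[Tuple[int, str, str]]:
    """Return (index, db_id, reason) for every query whose plan is unusable.

    get_plan is a hypothetical stand-in for a planner call such as
    get_physical_tree(db_id, sql); its exact behavior is an assumption.
    """
    problems = []
    for i, (db_id, sql) in enumerate(zip(db_ids, queries)):
        try:
            root = get_plan(db_id, sql)
        except Exception as exc:  # planner or connection failure
            problems.append((i, db_id, f"error: {exc}"))
            continue
        if not root:  # empty list or None: root[0] would raise IndexError
            problems.append((i, db_id, "empty plan"))
    return problems


if __name__ == "__main__":
    # Stand-in planner for illustration only: returns a plan for one db_id
    # and an empty list for any other.
    def fake_plan(db_id, sql):
        return [{"Plan": {}}] if db_id == "tpch" else []

    print(find_empty_plans(["tpch", "job_syn"], ["SELECT 1", "SELECT 1"], fake_plan))
```

Passing the real planner function in place of fake_plan, together with the db_id and query columns of the pool CSV, would show whether every JOB query fails or only some of them.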
In addition, I found that the JOB benchmark queries in /LLM-R2/data/data_llmr2/pools/neg_pool_job_syn_updated.csv, /LLM-R2/data/data_llmr2/pools/pos_pool_job_syn_updated.csv, /LLM-R2/data/data_llmr2/queries/queries_job_syn_test.csv,
etc. are different from the official JOB benchmark queries [1]: https://www.vldb.org/pvldb/vol9/p204-leis.pdf. Could you point me to the source of your queries so that I can understand them better?
[1] Viktor Leis et al. How Good Are Query Optimizers, Really?