
Bug during trying on a large graph #75

@ccMrCaesar


Hi Dr. Zhang,
I am currently using SEAL for link prediction on a large graph (more than 100k nodes and 1 million edges) for study purposes. When the first "Enclosing subgraph extraction" step was almost done (97%), an error occurred:

" File "/lustre/home/Desktop/SEAL/Python/util_functions.py", line 163, in subgraph_extraction_labeling
nodes.remove(ind[1])
KeyError: 17196 "

The rest of the output was:

"The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/lustre/home/Desktop/my_seal/Python/Main.py", line 160, in
train_graphs, test_graphs, max_n_label = links2subgraphs(
File "/lustre/home/Desktop/my_seal/Python/util_functions.py", line 132, in links2subgraphs
train_graphs = helper(A, train_pos, 1) + helper(A, train_neg, 0)
File "/lustre/home/Desktop/my_seal/Python/util_functions.py", line 118, in helper
results = results.get()
File "/lustre/opt/cascadelake/linux-centos7-x86_64/gcc-4.8.5/miniconda3-4.8.2-5yczksexambgeule63z3smwiwrbokjtu/envs/mytorch/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 17196
"
So it seems that something goes wrong in the multiprocessing module.
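For context, `results.get()` re-raises in the parent whatever exception a worker raised, so the `KeyError` may well originate in the worker function (`subgraph_extraction_labeling`) rather than in `multiprocessing` itself. Below is a minimal sketch of that behavior. It swaps in a thread pool for the process pool so that it runs without a `__main__` guard, and `remove_missing` and `run` are hypothetical stand-ins, not SEAL code.

```python
from multiprocessing.pool import ThreadPool


def remove_missing(key):
    # Hypothetical worker: removing an absent element from a set raises KeyError.
    nodes = {1, 2, 3}
    nodes.remove(key)
    return nodes


def run(keys):
    # map_async returns an AsyncResult; .get() re-raises the worker's exception
    # in the caller, which is why pool.py appears at the bottom of the traceback.
    with ThreadPool(2) as pool:
        results = pool.map_async(remove_missing, keys)
        return results.get()


try:
    run([1, 17196])
    caught = None
except KeyError as exc:
    caught = exc.args[0]

print(caught)
```

The key printed here is the one the worker failed on, which matches the shape of the traceback above: the last frame is in `pool.py`, but the failing statement is the `remove` call inside the worker.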

I tried the command from your README, "python Main.py --train-name PB_train.txt --test-name PB_val.txt --hop 1 --save-model", and it works fine. The error only appears when I use my own edge list file. I also tried with and without the "--max-nodes-per-hop 100" option, but the problem persists.
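One possible cause worth checking, stated as an assumption rather than a confirmed diagnosis: if the edge list contains a self-loop (both endpoints equal), a set built from the two endpoints holds that node only once, so removing both endpoints in turn fails on the second `remove`. The sketch below reproduces that failure and scans an edge list for such rows. `find_self_loops`, `remove_endpoints`, and the sample edges are hypothetical, and only `nodes.remove(ind[1])` is actually visible in the traceback.

```python
def find_self_loops(edges):
    """Return (row_index, node) pairs for rows whose two endpoints are equal."""
    return [(i, u) for i, (u, v) in enumerate(edges) if u == v]


def remove_endpoints(u, v, neighbors=()):
    # Assumption: both target nodes are removed from one node set, one after
    # the other; the traceback only shows the second removal.
    nodes = {u, v, *neighbors}
    nodes.remove(u)
    nodes.remove(v)  # raises KeyError when u == v, since u is already gone
    return nodes


edges = [(1, 2), (17196, 17196), (3, 4)]  # made-up sample data
loops = find_self_loops(edges)
print(loops)

try:
    remove_endpoints(17196, 17196, neighbors=(5, 6))
    error_key = None
except KeyError as exc:
    error_key = exc.args[0]

print(error_key)
```

If a scan like this finds self-loops in the custom edge list, dropping those rows before training would be a cheap experiment. If it finds none, the cause lies elsewhere.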

I noticed that you suggest using PyTorch Geometric for large graphs. Is this error fixable in the current code, or do you have any clue about what causes it?

Thank you for your work!
