Hello, I am using your paper for my own thesis research. Looking at the workflow.jpg diagram, the arrows are a bit confusing. I am trying to use the Skip-Gram ngram-ngram method. From what I understand, I have to go through the steps corpus2vocab -> corpus2pairs -> pairs2sgns. But pairs2sgns requires an "--input_vector_file" argument. I don't know what that is, and the previous steps didn't generate one. I assume it is a file containing the resulting word embedding vectors, but if I already had that, I wouldn't need the tool. Do I have to run the original word2vec SG method, save a .vec model, and use it here? I read the research paper and didn't find an answer there either. I also tried pairs2vocab, but it doesn't generate the input vector file either.
A separate issue is with corpus2pairs: it generates 4 different .txt files (pairs.txt_0, pairs.txt_1, pairs.txt_2, pairs.txt_3) when I give the argument "--pairs_file ./pairs.txt". Do I then have to run pairs2sgns on each of the pairs files? Do I generate a different output vector file for each? Do the vector files get overwritten or appended to?