Description
The shapes saved in the embedding pkl file for atom_embedding and block_embedding seem to be incorrect: for the exact same system (but different frames in an MD trajectory), I've found that arrays with different dimensions are saved. The behavior seems to depend on how many PDBs are processed at once (i.e. how many entries there are in the CSV passed to 'process_pdbs.py'). For example, I computed the embedding for one PDB within a batch of 60 PDBs and again within a batch of 2 PDBs. The shapes from the batch of 60 were:
id: 1_2_f6.pdb_ABCDEFGHIJK_L_LIG
graph_embedding: shape (32,)
block_embedding: shape (0, 32)
atom_embedding: shape (635, 32)
block_id: shape (155,)
atom_id: shape (859,)
And from the batch of 2:
id: 1_2_f6.pdb_ABCDEFGHIJK_L_LIG
graph_embedding: shape (32,)
block_embedding: shape (155, 32)
atom_embedding: shape (859, 32)
block_id: shape (155,)
atom_id: shape (859,)
But both had exactly the same embedding:
"1_2_f6.pdb_ABCDEFGHIJK_L_LIG": [-0.004074007738381624, 0.3173443675041199, -0.11195822060108185, -0.01592610776424408, -0.06759312003850937, 0.164351224899292, 0.024977976456284523, -0.16799166798591614, -0.09120690077543259, -0.1385965496301651, 0.46841853857040405, -0.018586348742246628, 0.17197035253047943, 0.13865599036216736, -0.06969504058361053, -0.2753834128379822, -0.00586903840303421, -0.12763790786266327, 0.051467981189489365, 0.41604381799697876, -0.26409682631492615, -0.12303176522254944, 0.010855807922780514, -0.09284979850053787, 0.03329434618353844, -0.015801483765244484, 0.2621055543422699, -0.10459930449724197, 0.10786740481853485, 0.03054526075720787, 0.12038268893957138, -0.23565126955509186]
So I don't think this seriously affects the final embeddings.
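For anyone who wants to check their own output, here is a minimal sketch of the consistency check I have in mind: the number of rows in block_embedding and atom_embedding should match the lengths of block_id and atom_id. It assumes each record is a dict keyed by the field names printed above; how records are actually laid out in the pkl file is not shown here, so the `find_shape_mismatches` and `make_record` helpers are hypothetical and the arrays are synthetic placeholders built from the reported shapes.

```python
import numpy as np


def find_shape_mismatches(record):
    """Return a description of every embedding whose row count differs from its id array.

    `record` is assumed to be a dict with the keys printed in the report
    (block_embedding, block_id, atom_embedding, atom_id).
    """
    problems = []
    for level in ("block", "atom"):
        n_rows = np.asarray(record[f"{level}_embedding"]).shape[0]
        n_ids = np.asarray(record[f"{level}_id"]).shape[0]
        if n_rows != n_ids:
            problems.append(
                f"{level}_embedding has {n_rows} rows but {level}_id has {n_ids} entries"
            )
    return problems


def make_record(n_block_rows, n_atom_rows, n_blocks=155, n_atoms=859, dim=32):
    """Build a synthetic record (zeros only) with the given shapes, for illustration."""
    return {
        "graph_embedding": np.zeros(dim),
        "block_embedding": np.zeros((n_block_rows, dim)),
        "atom_embedding": np.zeros((n_atom_rows, dim)),
        "block_id": np.zeros(n_blocks, dtype=int),
        "atom_id": np.zeros(n_atoms, dtype=int),
    }


# Shapes as reported for the batch of 60 and the batch of 2.
batch_of_60 = make_record(n_block_rows=0, n_atom_rows=635)
batch_of_2 = make_record(n_block_rows=155, n_atom_rows=859)

for name, record in (("batch of 60", batch_of_60), ("batch of 2", batch_of_2)):
    mismatches = find_shape_mismatches(record)
    print(name, "->", mismatches if mismatches else "shapes are consistent")
```

With the reported shapes, this flags both block_embedding and atom_embedding for the batch of 60 and nothing for the batch of 2.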