I have some questions about the dataset that you used.
I am currently working on a project whose task is similar to that of your paper (DeepREAL), using the most recent version of the IUPHAR interaction dataset.
However, the dataset statistics reported in the DeepREAL paper are much larger than what I obtain. I downloaded the most recent version, and it is unusual for the amount of data to decrease when a dataset is upgraded to a newer version. I have already checked the CSV datasets (ikey2smiles_glass_ango_opo_new_combined.csv, train.csv, valid.csv) on your GitHub and Zenodo pages, but the number of records does not match the paper. (For example, I saw that train.csv in the interaction folder contains only about 2250 records.) Could you tell me the actual number of records you used? If the statistics in the paper are correct, how did you obtain that IUPHAR data?