I've successfully uploaded a dataset (a subset of PDB), but its labels are unusual in that they are matrices. Storing matrices, ndarrays, or sparse arrays as a column in a `.csv` is not ideal. If you write to and read from these files with `pandas`, you quickly end up with issues where `\t` and `\n` characters mess up the parsing. For now I have uploaded a separate pickle file containing a dictionary of my labels, but this is probably something the team should consider if you want full datasets to be available in a single file.
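To illustrate the round-trip problem, here is a minimal sketch. The IDs and matrices are made up for the example and are not from my actual dataset. The matrix goes in as an ndarray and comes back as its printed string form, embedded newlines included:

```python
import io

import numpy as np
import pandas as pd

# Made-up example data: one small matrix label per structure ID.
df = pd.DataFrame({
    "id": ["1abc", "2xyz"],
    "label": [np.eye(2, dtype=int), np.ones((2, 2), dtype=int)],
})

# Round-trip the frame through CSV in memory.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

original_label = df.loc[0, "label"]
restored_label = restored.loc[0, "label"]

# The label was written as the array's printed form, so it is read back as text.
print(type(original_label).__name__)
print(type(restored_label).__name__)
print(repr(restored_label))
```

Recovering the array from that string means hand-parsing it, and for larger matrices NumPy's printed form is truncated with `...`, so the data may not be recoverable at all.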
Perhaps we could automate pulling a separate labels file whenever a dataset is loaded. This would make no difference to the end user, since the extra step could be hidden behind the API. Let me know your thoughts 😃.
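As a rough sketch of what I mean, assuming the labels file is a pickled dict keyed by the same ID column as the CSV. The function name `load_dataset_with_labels`, the file names, and the `id` column are all hypothetical, not part of the existing API:

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def load_dataset_with_labels(csv_path, labels_path, id_column="id"):
    """Hypothetical loader: read the CSV, then attach labels from a pickle.

    Assumes the pickle holds a dict mapping each value in `id_column`
    to its label matrix.
    """
    frame = pd.read_csv(csv_path)
    with open(labels_path, "rb") as handle:
        labels = pickle.load(handle)

    # Fail loudly if any row has no matching label.
    missing = set(frame[id_column]) - set(labels)
    if missing:
        raise KeyError(f"No labels found for: {sorted(missing)}")

    # Keep each matrix as a single object-typed cell.
    frame["label"] = pd.Series(
        [labels[key] for key in frame[id_column]],
        index=frame.index,
        dtype=object,
    )
    return frame


# Usage with made-up data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "pdb_subset.csv"
    labels_path = Path(tmp) / "pdb_subset_labels.pkl"

    pd.DataFrame(
        {"id": ["1abc", "2xyz"], "sequence": ["MKV", "GAL"]}
    ).to_csv(csv_path, index=False)
    with open(labels_path, "wb") as handle:
        pickle.dump({"1abc": np.eye(3), "2xyz": np.zeros((3, 3))}, handle)

    dataset = load_dataset_with_labels(csv_path, labels_path)

print(dataset.columns.tolist())
print(dataset.loc[0, "label"].shape)
```

One caveat: unpickling runs arbitrary code, so for files downloaded from a shared repository a format such as `.npz` (via `np.savez` and `np.load`) might be a safer choice than pickle.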