cdflib can, at least under some circumstances, read a CDF file through an invalid (nonexistent) path formed by appending characters to the end of a valid path.
>>> import cdflib
>>> a = cdflib.cdfread.CDF('/media/erjo/juice/datasets/2023/07/13/JUICE_L1a_RPWI-LF-SID7_20230713T043826_V02.cdfINVALID')
>>> a.cdf_info()
CDFInfo(CDF=PosixPath('/tmp/tmp1btjyu9h.cdf'), Version='3.9.0', Encoding=6, Majority='Row_major', rVariables=[], zVariables=['Epoch', 'SCET', 'TIME_RELATIVE', 'HW_SWITCHES_1', 'HW_SWITCHES_2', 'ARTEFACTS', 'COMPONENT_MASK', 'SNAPSHOT_NUMBER', 'SAMPLING_RATE', 'N_SAMPLES', 'SEQ_COUNTER', 'DATA'], Attributes=[{'Acknowledgement': 'Global'}, {'Data_type': 'Global'}, {'Data_version': 'Global'}, {'Dataset_ID': 'Global'}, {'Descriptor': 'Global'}, {'Discipline': 'Global'}, {'DOI': 'Global'}, {'Generated_by': 'Global'}, {'Generated_with_software': 'Global'}, {'Generation_date': 'Global'}, {'Generation_time_UTC': 'Global'}, {'git_log_message_DC': 'Global'}, {'git_log_message_HF': 'Global'}, {'git_log_message_LF': 'Global'}, {'git_log_message_LP': 'Global'}, {'git_log_message_MB': 'Global'}, {'git_log_message_MM': 'Global'}, {'git_log_message_PL': 'Global'}, {'HTTP_LINK': 'Global'}, {'Instrument_type': 'Global'}, {'LINK_TEXT': 'Global'}, {'LINK_TITLE': 'Global'}, {'Loaded_SPICE_kernels': 'Global'}, {'Local_TM_source_files': 'Global'}, {'Logical_file_id': 'Global'}, {'Logical_source': 'Global'}, {'Logical_source_description': 'Global'}, {'Mission_group': 'Global'}, {'Parents': 'Global'}, {'PDS_collection_id': 'Global'}, {'PDS_start_time': 'Global'}, {'PDS_stop_time': 'Global'}, {'PI_affiliation': 'Global'}, {'PI_name': 'Global'}, {'Project': 'Global'}, {'RPWI_FSW_version': 'Global'}, {'Rules_of_use': 'Global'}, {'SDUS_updates': 'Global'}, {'Skeleton_version': 'Global'}, {'Software_version': 'Global'}, {'Source_name': 'Global'}, {'Spacecraft_clock_to_TT2000_time_conversion_linear_approximation_epoch': 'Global'}, {'Spacecraft_clock_to_TT2000_time_conversion_type': 'Global'}, {'spase_DatasetResourceID': 'Global'}, {'CATDESC': 'Variable'}, {'DISPLAY_TYPE': 'Variable'}, {'FIELDNAM': 'Variable'}, {'FILLVAL': 'Variable'}, {'FORMAT': 'Variable'}, {'LABLAXIS': 'Variable'}, {'MONOTON': 'Variable'}, {'TIME_BASE': 'Variable'}, {'UNITS': 'Variable'}, {'VALIDMIN': 'Variable'}, {'VALIDMAX': 
'Variable'}, {'VAR_NOTES': 'Variable'}, {'VAR_TYPE': 'Variable'}, {'DEPEND_0': 'Variable'}], Copyright='\nCommon Data Format (CDF)\nhttps://cdf.gsfc.nasa.gov\nSpace Physics Data Facility\nNASA/Goddard Space Flight Center\nGreenbelt, Maryland 20771 USA\n(User support: gsfc-cdf-support@lists.nasa.gov)\n', Checksum=True, Num_rdim=0, rDim_sizes=[], Compressed=True, LeapSecondUpdate=None)
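Until this behavior is understood, a caller who needs strict path handling could check the path before handing it to cdflib. The following is a minimal sketch using only the standard library; require_existing_file is a hypothetical helper name, not part of cdflib.

```python
from pathlib import Path


def require_existing_file(path):
    """Return `path` as a Path if it names an existing regular file.

    Hypothetical helper, not part of cdflib. Raises FileNotFoundError
    when the path does not point at a regular file.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    return p
```

The returned path could then be passed on, for example as cdflib.cdfread.CDF(require_existing_file(s)).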
There seems to be an upper bound on how many characters can be appended to a valid path before an error is raised. For this particular example, 210 is the smallest number of extra characters I found that triggers an error.
>>> s = '/media/erjo/juice/datasets/2023/07/13/JUICE_L1a_RPWI-LF-SID7_20230713T043826_V02.cdf' + 'A'*210
>>> len(s)
294
>>> a = cdflib.cdfread.CDF(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nonhome_data/work_files/JUICE/pipeline_code/normal/rpwi_pipeline_venv/lib/python3.11/site-packages/cdflib/cdfread.py", line 90, in __init__
    if not path.is_file():
           ^^^^^^^^^^^^^^
  File "/nonstd_installs/pyenv/versions/3.11.14/lib/python3.11/pathlib.py", line 1267, in is_file
    return S_ISREG(self.stat().st_mode)
                   ^^^^^^^^^^^
  File "/nonstd_installs/pyenv/versions/3.11.14/lib/python3.11/pathlib.py", line 1013, in stat
    return os.stat(self, follow_symlinks=follow_symlinks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 74] Bad message: '/media/erjo/juice/datasets/2023/07/13/JUICE_L1a_RPWI-LF-SID7_20230713T043826_V02.cdfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'
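A possible reading of the threshold, offered as an assumption and not something this report confirms: many Linux filesystems limit a single file name to 255 bytes, and the file name here is 46 characters long, so 210 extra characters is exactly where the last path component reaches 256 bytes. The sketch below only does that arithmetic. The 255-byte limit (NAME_MAX in the code) is assumed, and the sketch does not explain why the error is Errno 74 and not the more usual "File name too long".

```python
from pathlib import PurePosixPath

NAME_MAX = 255  # assumed limit; the real value depends on the filesystem
BASE = '/media/erjo/juice/datasets/2023/07/13/JUICE_L1a_RPWI-LF-SID7_20230713T043826_V02.cdf'


def final_component_bytes(path):
    """Length in bytes of the last path component (the file name)."""
    return len(PurePosixPath(path).name.encode('utf-8'))


for extra in (209, 210):
    n = final_component_bytes(BASE + 'A' * extra)
    print(extra, n, n > NAME_MAX)
# 209 255 False
# 210 256 True
```

On a real system, os.pathconf(directory, 'PC_NAME_MAX') reports the actual limit for a given directory.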
Prepending characters to a valid path, by contrast, does not allow the file to be read.
>>> a = cdflib.cdfread.CDF('INVALID/media/erjo/juice/datasets/2023/07/13/JUICE_L1a_RPWI-LF-SID7_20230713T043826_V02.cdf')
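One candidate explanation for the difference between appending and prepending, stated as an assumption and not verified against the cdflib source here: if the loader, after failing to find the given path, retries with the suffix replaced by .cdf (for example via pathlib's with_suffix), then any text appended after the final dot would be discarded, while a prepended prefix would survive. The sketch shows only how pathlib treats the two kinds of path; the /data/example names are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical paths standing in for the real ones in this report.
appended = PurePosixPath('/data/example.cdfINVALID')
prepended = PurePosixPath('INVALID/data/example.cdf')

# Everything after the last dot in the file name counts as the suffix.
print(appended.suffix)                # .cdfINVALID

# Replacing the suffix drops the appended text and yields the valid path.
print(appended.with_suffix('.cdf'))   # /data/example.cdf

# A prefix is untouched by suffix replacement, so the path stays invalid.
print(prepended.with_suffix('.cdf'))  # INVALID/data/example.cdf
```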
This behavior has been observed for