floating point issues in CoTeDe #214

@katieannemills

Some of the warnings CoTeDe is throwing appear to come from the floating point calculations underlying CoTeDe's spike check. Running only CoTeDe_spike.py, without parallelization, random profiles generate warnings like the following:

/AutoQC/miniconda/lib/python2.7/site-packages/cotede/qctests/spike.py:63: RuntimeWarning: invalid value encountered in greater
  flag[np.nonzero(self.features['spike'] > threshold)] = flag_bad
/AutoQC/miniconda/lib/python2.7/site-packages/cotede/qctests/spike.py:64: RuntimeWarning: invalid value encountered in less_equal
  flag[np.nonzero(self.features['spike'] <= threshold)] = flag_good

Re-running from scratch produces the same warning, but on a different profile (the sample of quota data I was running this on is attached: quota_subset.txt).

Digging in a bit, the invalid values are NaNs appearing in the masked array self.features['spike']. If I dump this array without the mask for a profile that produced this error, I get:

[  6.93226434e-310  -1.20000000e-001   1.20000000e-001   1.40000000e-001
  -1.35000000e+000  -6.30000000e-001               nan]

Then, if I run the exact same thing again, the test doesn't throw a warning on the same profile and instead produces

[  6.94815498e-310  -1.20000000e-001   1.20000000e-001   1.40000000e-001
  -1.35000000e+000  -6.30000000e-001   6.94812563e-310]
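For anyone who wants to look at their own profiles the same way, here is a minimal sketch of inspecting the raw buffer under a masked array. The helper name `hidden_nans` is hypothetical, and the mask on the first and last points is an assumption made for illustration; the values are copied from the first dump above.

```python
import numpy as np


def hidden_nans(arr):
    """Return the indices where the raw buffer holds NaN under the mask."""
    raw = np.ma.getdata(arr)        # underlying values, mask ignored
    mask = np.ma.getmaskarray(arr)  # always a full boolean array
    return np.nonzero(np.isnan(raw) & mask)[0]


# Values from the first dump in this issue.
# Assumption: only the two end points are masked.
spike = np.ma.masked_array(
    [6.93226434e-310, -0.12, 0.12, 0.14, -1.35, -0.63, np.nan],
    mask=[True, False, False, False, False, False, True],
)

print(spike.data)          # the dump "without the mask"
print(hidden_nans(spike))  # positions of NaNs that the mask hides
```

Printing `spike` itself shows the end points as masked, which is why the NaN only becomes visible when the raw data is dumped.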

The first and last entries in the array come out differently on each run, sometimes ending up as NaN and potentially corrupting the result of this test.
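Values around 6.9e-310 are subnormal doubles, which is what arbitrary bytes often look like when read as a float. That would be consistent with those two slots never being written, although nothing in this issue confirms it. Whatever the cause, one possible way to keep such values out of the comparison is to compare only the unmasked entries. The sketch below is not CoTeDe's code: the threshold, the flag values, and the mask are assumptions chosen for illustration.

```python
import numpy as np

# Values from the first dump in this issue.
# Assumption: only the two end points are masked.
spike = np.ma.masked_array(
    [6.93226434e-310, -0.12, 0.12, 0.14, -1.35, -0.63, np.nan],
    mask=[True, False, False, False, False, False, True],
)

threshold = 0.13            # hypothetical threshold
flag_good, flag_bad = 1, 4  # hypothetical flag values

flag = np.zeros(spike.shape, dtype="i1")

# Indices of the entries that are not masked.
valid_idx = np.nonzero(~np.ma.getmaskarray(spike))[0]
values = np.ma.getdata(spike)[valid_idx]

# Only unmasked values reach the comparison, so whatever sits under
# the mask (including NaN) can neither warn nor set a flag.
flag[valid_idx[values > threshold]] = flag_bad
flag[valid_idx[values <= threshold]] = flag_good

print(flag)
```

With this approach the masked end points keep a flag of 0, meaning not evaluated, no matter what the buffer holds beneath the mask.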

@castelao @s-good thoughts?
