Memory efficient data input options #104

@iancze

Description

Is your feature request related to a problem or opportunity? Please describe.
The Gridder object has been strict about its inputs: it expects uu and vv in kilolambda with shape (nchan, nvis), and weights with shape (nchan, nvis).

Describe the solution you'd like
This strictness can be cumbersome, especially when working with ALMA spectral line datasets with a large number of channels. The measurement set stores the baselines more efficiently, in meters, so they are the same for every channel, and when the weights are the same for each channel, only one weight is stored per baseline. On disk, uu, vv, and weights therefore have shape (nvis) instead of (nchan, nvis). This can be a considerable memory saving for large visibility datasets with hundreds or even thousands of channels.
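To put rough numbers on the saving, here is a small back-of-the-envelope sketch. The dataset size (`nvis`, `nchan`) and the helper name `array_megabytes` are made up for illustration, and it assumes each array is stored as float64.

```python
import math

import numpy as np


def array_megabytes(shape, dtype=np.float64):
    """Size in MB (10^6 bytes) of a dense array with this shape and dtype."""
    return math.prod(shape) * np.dtype(dtype).itemsize / 1e6


# Hypothetical dataset size, chosen only for illustration.
nvis, nchan = 1_000_000, 1_000

compact_mb = array_megabytes((nvis,))        # one value per baseline
dense_mb = array_megabytes((nchan, nvis))    # one value per channel and baseline

print(f"(nvis,) array:       {compact_mb:.1f} MB")
print(f"(nchan, nvis) array: {dense_mb:.1f} MB")
```

The dense layout is larger by a factor of nchan for each of uu, vv, and weights.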

Describe alternatives you've considered
At minimum, we could port the visread convenience routines that convert these quantities into MPoL, or just point out that they exist in visread. This might make life easier for the user in that they can keep the file size on disk small, but it may still pose memory problems during inference.
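As a rough idea of what such a convenience routine might look like, here is a minimal NumPy sketch. The function names `meters_to_klambda` and `broadcast_weights` are hypothetical and are not taken from the visread or MPoL APIs; the sketch assumes baselines in meters with shape (nvis,) and channel frequencies in Hz with shape (nchan,).

```python
import numpy as np

C_M_PER_S = 299_792_458.0  # speed of light in m/s


def meters_to_klambda(baseline_m, chan_freq_hz):
    """Convert baselines in meters, shape (nvis,), to kilolambda, shape (nchan, nvis).

    Hypothetical helper for illustration; not the visread API.
    """
    baseline_m = np.asarray(baseline_m, dtype=np.float64)
    chan_freq_hz = np.asarray(chan_freq_hz, dtype=np.float64)
    wavelengths_m = C_M_PER_S / chan_freq_hz  # shape (nchan,)
    # Broadcasting (1, nvis) / (nchan, 1) gives (nchan, nvis); 1e-3 converts lambda to kilolambda.
    return 1e-3 * baseline_m[np.newaxis, :] / wavelengths_m[:, np.newaxis]


def broadcast_weights(weight, nchan):
    """Present per-baseline weights, shape (nvis,), with shape (nchan, nvis).

    np.broadcast_to returns a read-only view, so no per-channel copy is made.
    """
    weight = np.asarray(weight)
    return np.broadcast_to(weight, (nchan, weight.shape[0]))
```

Note that the converted uu and vv arrays are fully dense, which is why this route keeps the file on disk small but does not by itself reduce memory use during inference.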

A more advanced option would be to adjust the Gridder, or add an alternate Gridder class, to take in measurement set-like data products and perform the gridding in a memory-efficient manner. This could be helpful, but should only be worked on after we've done proper memory profiling of a whole image synthesis procedure. It could be that the image optimization itself (and its associated derivatives) is the largest bottleneck anyway.
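One way the memory-efficient variant could work is to convert and grid a single channel at a time, so that only (nvis,)-sized temporaries exist at any moment. The sketch below is an assumption about how that might look, not how the Gridder is implemented: the name `grid_weights_per_channel` is hypothetical, it only accumulates weights (not visibilities) with `np.histogram2d`, and it ignores details such as Hermitian conjugate points.

```python
import numpy as np

C_M_PER_S = 299_792_458.0  # speed of light in m/s


def grid_weights_per_channel(u_m, v_m, weight, chan_freq_hz, edges_klambda):
    """Yield one gridded-weight array per channel, converting units on the fly.

    u_m, v_m, weight have shape (nvis,); chan_freq_hz has shape (nchan,);
    edges_klambda holds the cell edges used for both the u and v axes.
    Hypothetical sketch: peak memory scales with nvis rather than nchan * nvis.
    """
    u_m = np.asarray(u_m, dtype=np.float64)
    v_m = np.asarray(v_m, dtype=np.float64)
    weight = np.asarray(weight, dtype=np.float64)
    for freq_hz in chan_freq_hz:
        scale = 1e-3 * freq_hz / C_M_PER_S  # meters -> kilolambda for this channel
        gridded, _, _ = np.histogram2d(
            v_m * scale,  # first axis (rows) follows v
            u_m * scale,  # second axis (columns) follows u
            bins=[edges_klambda, edges_klambda],
            weights=weight,
        )
        yield gridded


# Tiny made-up example: two baselines, two channels, a 4x4 grid.
edges = np.linspace(-1.0, 1.0, 5)
channels = list(
    grid_weights_per_channel(
        u_m=[100.0, -200.0],
        v_m=[50.0, 80.0],
        weight=[1.0, 2.0],
        chan_freq_hz=[C_M_PER_S, 2 * C_M_PER_S],
        edges_klambda=edges,
    )
)
```

Whether this is worth building still depends on the profiling mentioned above, since it only reduces the memory footprint of the gridding step.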

Labels: enhancement (New feature or request)
