Open Planck, CSFD, and Lenz2017 maps with memory mapping #69
Open
hombit wants to merge 1 commit into gregreen:master from
I've found that RAM pressure is high in several pipelines, especially when running things in parallel. See, e.g., SNAD Viewer, which we run on a virtual machine with just 8 GB of RAM, or LightCurveLynx, which has to duplicate dust maps when using multiprocessing.
This may be solved by using memory mapping instead of loading the data into memory. When I was looking for low-hanging fruit for those changes, I found that CSFD is really easy to switch to memory mapping: just changing `.flatten()` to `.ravel()` prevents memory copying.

Similar changes are applied to `HEALPixFITSQuery`. I've also moved dtype casting and scaling from the constructor to `.query()`, so no memory copy happens for Planck and Lenz2017. I've run benchmarks and found no difference in `.query()` performance for either scalar or 10,000-element input.
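For reviewers who want to see the effect in isolation, here is a minimal sketch of the two ideas using plain NumPy. It does not reproduce the actual code in this PR: the file `toy_map.dat`, the helper `query`, and the `scale` value are hypothetical stand-ins, and the array is a toy rather than a real dust map.

```python
import os
import tempfile

import numpy as np

# Write a small stand-in "map" to disk (hypothetical data, not a real dust map).
path = os.path.join(tempfile.mkdtemp(), "toy_map.dat")
np.arange(12, dtype=np.float32).reshape(3, 4).tofile(path)

# Open it memory-mapped: pages are read from disk only when they are touched.
mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(3, 4))

flat_copy = mapped.flatten()  # always allocates a new in-memory array
flat_view = mapped.ravel()    # returns a view for contiguous data, no copy

print(np.shares_memory(mapped, flat_copy))
print(np.shares_memory(mapped, flat_view))


def query(pixel_values, idx, scale=2.0):
    """Hypothetical helper: cast and scale only the requested pixels.

    Indexing first means the dtype conversion and multiplication touch
    len(idx) elements instead of the whole map.
    """
    return pixel_values[idx].astype(np.float64) * scale


result = query(flat_view, np.array([1, 5]))
print(result)
```

Doing the cast and scaling inside the query keeps the full-size array untouched on disk, at the cost of repeating a small amount of arithmetic per call.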