Interested in GSoC 2026 Project 6: Interface for post-simulation analysis crawling of WESTPA simulations. #5268
-
Hi Kunj. Welcome.
I think MDAKit or westpa/westpa would probably work best, with the latter being a little bit easier on the maintenance side.
Well, the PR for the source code change is linked here. The code was written so that one could read any supported trajectory format with MDAnalysis and save it to the HDF5 Framework. How important or useful that is for the reverse direction, I'll leave up to you. It's not mandatory to reuse that code.
I would probably have MDA do the parallelization (if possible) rather than run MDA multiple times. As for why, I won't give away the answer; I'll let you think about it.
Those logistics are something for you to plan and think over (and include in the pre-proposal). Overall, we want to feed in the west.h5 (and the iter_XXXX.h5 files) and get the
Unfortunately, on the WESTPA end we're quite a small team, so we don't have a bunch of "good-first-issues" all lined up, but we have some ideas in westpa/westpa#321 that might spark something. The tutorials suite/westpa-test-system repos should provide enough examples for you to play around with.
-
Thank you for taking the time to answer each of my questions!
-
Hey @jeremyleung521 Thanking You,
-
Hi MDAnalysis team (and mentors @jeremyleung521 and @ltchong),
I'm Kunj, a CS student looking to work on Project 6 for GSoC 2026. I have built similar CLI-based apps before and have some experience with multiprocessing as well. My proficiency is in Python, and I am familiar with the Git/GitHub development workflow.
I have already
- `west.h5` + `traj_segs` files
- `w_crawl` docs and Tutorial 7.5

I plan to apply for this project and have a few targeted questions below to make sure I understand the exact scope before writing my proposal. Happy to discuss or share initial ideas.
The project says “New MDAnalysis/MDAKit parser”. Should this be added as a core MDAnalysis reader (like the existing topology/trajectory parsers), or is an MDAKit preferred? Also, since this is a WESTPA collaboration project, would any parts ideally live in the westpa/westpa repo (like the Project 5 dashboard discussion)?
The description mentions that “Code that translates the topology in the HDF5 Framework to that of MDAnalysis has already been written and included in the source code of v2022.13.” How complete is it, and is it meant to be the foundation for the new parser?
Since `w_crawl` already does parallel analysis over segments and the skills explicitly list multiprocessing, I’m thinking the CLI tool could automatically parallelize simple MDAnalysis calls across iterations using `multiprocessing` (with a fallback serial mode). Is this the direction you have in mind, or would you prefer the parser itself to be lightweight and let users handle parallelism on top of the `Universe`?
From the HDF5 wiki, `west.h5` contains metadata and links, but actual coordinates live in separate `traj_segs/iter_XXXXXX.h5` files (and topology isn’t stored directly). How should the parser build a full MDAnalysis Universe? For example, should it treat segments as separate trajectories, support iteration/segment selection, automatically include weights from `seg_index`, or still require the user to provide a reference topology file?
To get familiar and demonstrate engagement, are there any beginner-friendly issues or small tasks in the MDAnalysis or WESTPA repos related to HDF5 readers or WESTPA support that I could pick up first?
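To make the parallelization question above more concrete, here is a minimal sketch of the "parallel across iterations, with a serial fallback" idea using only the standard library. Everything in it is an assumption for illustration: `iter_h5_path`, `analyze_iteration`, and `run_over_iterations` are hypothetical names, the six-digit zero padding is inferred from the `iter_XXXXXX.h5` pattern, and the per-iteration function is a stand-in for a real MDAnalysis calculation rather than anything from WESTPA or MDAnalysis.

```python
from multiprocessing import Pool
from pathlib import Path


def iter_h5_path(sim_root, n_iter):
    """Build the per-iteration file path.

    Assumes the traj_segs/iter_XXXXXX.h5 layout with six zero-padded digits.
    """
    return Path(sim_root) / "traj_segs" / f"iter_{n_iter:06d}.h5"


def analyze_iteration(task):
    """Hypothetical per-iteration analysis.

    A real version would open the file and run an MDAnalysis calculation;
    this placeholder only returns the iteration number and the file name.
    """
    sim_root, n_iter = task
    return n_iter, iter_h5_path(sim_root, n_iter).name


def run_over_iterations(sim_root, iterations, n_workers=1, func=analyze_iteration):
    """Apply func to every iteration, serially or with a process pool."""
    tasks = [(sim_root, n) for n in iterations]
    if n_workers <= 1:
        # Serial fallback: same results, no worker processes.
        return [func(task) for task in tasks]
    # Pool.map preserves input order, so results line up with `iterations`.
    with Pool(processes=n_workers) as pool:
        return pool.map(func, tasks)
```

Calling `run_over_iterations("sim", range(1, 11), n_workers=4)` from a script guarded by `if __name__ == "__main__":` would fan the ten iterations out over four processes. Whether this per-iteration fan-out is preferable to letting MDAnalysis handle the parallelism itself is exactly the open design question.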
Really looking forward to your thoughts!
Best Regards,
Kunj Sinha