Write ALARA output to SQLite#75
Conversation
gonuke
left a comment
There was a problem hiding this comment.
I started reviewing this a while ago and never got around to finishing it.
Here are some specific suggestions.
Overall, I think this PR is mixing things based on what your task was at the time and should maybe be broken down somehwat, aka separation of concerns.
Fundamentally, if you are going to rely on the ADF structure that we have developed for holding ALARA output data, then I think this PR should focus on methods that modify the ADF and then write it to SQL. Everything else is for a specific use case that can come in a future PR that focuses on how to process a variety of data that you have already generated using these new capabilities.
|
What if we limit |
|
To repeat the comments from #35:
Notably, this particular approach for generating the ADF makes sense for the current "campaign" of problems, but may not always make sense as we consider different variations on how we set up problems. |
Co-authored-by: Paul Wilson <paul.wilson@wisc.edu>
gonuke
left a comment
There was a problem hiding this comment.
I think it's important to separate the method for knowing the values of t_irr_flat from the modifying of the ADF. More generally, parsing a run_label may become the least common way to know this information.
There was a problem hiding this comment.
It looks like you only need this file to get the number of groups. Could that just be an input/command-line parameter? That is, in stead of providing a filename for the group bounds, you can just provide an integer number of groups.
| - 64 | ||
| flux_file : /filespace/a/asrajendra/research/activationDB/ref_flux_files/iter_dt_flux_2.0986E14 | ||
| flux_file : ../calc_dwell_dir/ref_flux_files/iter_dt_flux_2.0986E14 | ||
| vit_j_file : ../data/vit-j-175-bins.txt No newline at end of file |
There was a problem hiding this comment.
See previous comment - since this file only used to determine the number of groups - maybe this entry is just the number of groups
| vit_j_file : ../data/vit-j-175-bins.txt | |
| n_groups : 175 |
| inputs = yaml.safe_load(yaml_file) | ||
| return inputs | ||
|
|
||
| def modify_adf(adf, norm_flux_arr, t_irr_arr, inputs): |
There was a problem hiding this comment.
Adding docstrings early can help the reviewer know your intent for the data structures.
| num_pulses = inputs['num_pulses'] | ||
| duty_cycles = inputs['duty_cycles'] | ||
|
|
||
| # Extract the number of pulses and duty cycles from the run label | ||
| pulse_num_dc = adf['run_lbl'].str.extract(r'_(\d+)p_(\d+)_').astype(int) | ||
|
|
||
| # Map num_pulses and duty_cycles to an index | ||
| pulse_idx = pulse_num_dc[0].map({pulse_num: i for i, pulse_num in enumerate(num_pulses)}) | ||
| duty_cycle_idx = (pulse_num_dc[1]/100).map({duty_cycle: i for i, duty_cycle in enumerate(duty_cycles)}) | ||
|
|
||
| # Add flattened irradiation time: | ||
| adf['t_irr_flat'] = t_irr_arr.T[pulse_idx.to_numpy(), duty_cycle_idx.to_numpy()] | ||
| adf['t_irr_flat'] = aop.convert_times(adf['t_irr_flat'], from_unit='y', to_unit='s') |
There was a problem hiding this comment.
Separation of concerns: All of this seems like it's in the wrong place. All the data for the new columns in the ADF should be computed before you make the call the modify the ADF. Remember, there may be many ways to compute the data for the new columns depending on what the original source of the data is, and you want flexibility to mix and match.
Specifically, I thought we discussed that you shouldn't rely on decoding information from the run_label to get this information, anyway. If you still want to have that as one pathway to determine this data, that should be it's own method with only that purpose.
|
I've created a new script to run and test the methods in |
gonuke
left a comment
There was a problem hiding this comment.
The separation is getting better here. I think there's still room for more. My design sense is that there will be two separate python files (each with their own tests, perhaps).
- Generic methods for processing an ADF and saving it in an SQL database
- Specific methods for processing your particular set of test cases
I think this PR should focus on 1, where the tests are based on some kind of manually generated ADF to ensure that the various methods to prepare an ADF for SQL are working correctly.
A separate PR - or several - can focus on:
- generating several test cases based on num pulses, duty cycle, etc (maybe this is done??)
- loading those cases into an ADF
- processing the metadata and modifying the ADF with that metadata
| cursor = sqlite_conn.cursor() | ||
| sqlite_conn.commit() | ||
| result = cursor.fetchall() | ||
| cursor.close() |
There was a problem hiding this comment.
My python SQLite is rusty, but can you elaborate on what we need this for? I think I know why we need the commit(), but not sure why we need the cursor and fetchall?
There was a problem hiding this comment.
The cursor serves as a connection between the SQLite database and the local state of the database. While I was making this function, I was running a cursor.execute() statement to grab all the columns in the database, and cursor.fetchall() to grab all the rows, followed by a print statement on result to make sure it was doing what I expected. I've written out something similar now and moved it to the testing script. With the way it was written here, cursor.fetchall() wouldn't actually do much on its own.
| import sqlite3 | ||
|
|
||
|
|
||
| def calc_time_params(active_burn_time, duty_cycle_list, num_pulses): |
There was a problem hiding this comment.
This is still a special use case and either belongs somewhere else (your test code?) for now
There was a problem hiding this comment.
I've moved it to a different pair of scripts (to process and test methods for specific cases) that I'll PR shortly
|
I think |
|
I've added a short function called |
gonuke
left a comment
There was a problem hiding this comment.
Definitely a cleaner implementation. We can maybe think about a different file name than script_template.py now.
|
|
||
|
|
||
| def write_to_sqlite(adf, db_name="activation_results.db"): |
There was a problem hiding this comment.
Should we open (and close) the SQL connection elsewhere and pass it into this function (which then may do less...)
There was a problem hiding this comment.
I made a different function to close the connection since doing it in this function would complicate testing of the connection.commit() operation. I've now set this up such that an open connection is passed into this function
This PR introduces a structure to the sqlite activation results database and writes some preliminary results to it.
The
run_lblcolumn is used to keep track of the ALARA runs from which results were produced (using the number of pulses, duty cycle %, and active burn time).The
block_namecolumn has the effect of tracking the parent nuclide. The (flattened) irradiation time in seconds, flux spectrum shape, and number density all have separate columns.This database should be expanded to include the average flux magnitude (pending #14)
fixes #35