⚡ Bolt: R/aFIPC.R 루프 내부 중복 탐색 병목 최적화#94
Conversation
- `R/aFIPC.R` 파일 내에서 공통 문항에 대한 Parameter를 매핑할 때 반복적으로 `which()` 함수를 사용하여 배열 스캔(O(N))을 수행하던 부분을 캐싱(`new_item_idx`, `old_item_idx`)을 사용하여 O(1) 수준으로 최적화하였습니다. - 불필요한 `paste0()` 호출을 삭제하여 문자열 할당을 최소화했습니다. - Bolt's journal (`.jules/bolt.md`) 에 R 언어에서 루프 내부에서의 배열 탐색 성능에 대한 병목 지점을 기록했습니다.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
Pull request overview
Note
Copilot couldn't run its full agentic review because no GitHub Actions runner was available. Make sure your repository has a runner available to run Copilot's review, or add a copilot-setup-steps.yml file specifying one with the runs-on attribute. See the docs for more details.
autoFIPC()의 공통 문항(Parameter Mapping) 처리에서 반복적인 전체 스캔을 줄이기 위해 인덱스 재사용(캐싱)과 불필요한 문자열 래핑 제거를 적용하고, 관련 테스트를 추가한 PR입니다.
Changes:
R/aFIPC.R에서which()호출을 변수로 받아 재사용하고, 데이터 서브셋팅 시drop = FALSE를 명시as.data.frame()로 입력 데이터 타입을 일관화autoFIPC()및surveyFA()에 대한testthat테스트 추가
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| R/aFIPC.R | 공통 문항 매핑 및 파라미터 테이블 접근에서 반복 탐색/서브셋팅을 정리해 성능 병목을 완화 |
| tests/testthat/test_aFIPC_fast.R | autoFIPC() 입력 검증 및 실행 경로를 커버하는 테스트 추가 |
| tests/testthat/test_surveyFA.R | surveyFA() stub 동작을 확인하는 간단한 테스트 추가 |
| .jules/bolt.md | 이번 최적화에 대한 학습/액션 로그 추가 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "est" | ||
| ] <- | ||
| FALSE | ||
| NewScaleParms[new_item_idx, "est"] <- FALSE |
| test_that("autoFIPC execution with cache logic", { | ||
| set.seed(123) | ||
|
|
||
| N <- 500 | ||
| theta <- rnorm(N) | ||
| a <- matrix(rep(1, 5), ncol=1) | ||
| d <- matrix(c(1, 0.5, 0, -0.5, -1), ncol=1) | ||
|
|
||
| old_data <- simdata(a, d, N, itemtype="2PL") | ||
| new_data <- simdata(a, d, N, itemtype="2PL") | ||
|
|
||
| colnames(old_data) <- c("i1", "i2", "i3", "i4", "i5") | ||
| colnames(new_data) <- c("i1", "i6", "i7", "i8", "i9") | ||
|
|
||
| res <- autoFIPC(new_data, old_data, c("i1"), c("i1"), itemtype = "2PL", confirmCommonItems = TRUE, | ||
| checkIPD = FALSE, tryFitwholeNewItems=FALSE, tryFitwholeOldItems=FALSE) | ||
|
|
| expect_true(is.list(res)) | ||
| expect_true(!is.null(res$oldFormModel)) | ||
| expect_true(!is.null(res$newFormModel)) | ||
| expect_true(!is.null(res$LinkedModel)) |
- CI failed because `test_aFIPC_fast.R` uses the `mockery` package, but it was not declared in the `Suggests` section of `DESCRIPTION`. - Added `mockery` to the `Suggests` dependencies to resolve the R CMD check error.
- Addressed code reviewer comments and further optimized the inner loops in `R/aFIPC.R` using `match` instead of scanning the vector repeatedly. Pre-computed indices are calculated once outside the loop.
- Ensured test stability by using `skip_on_cran()` and `skip_if_not_installed("mirt")` in `tests/testthat/test_aFIPC_fast.R`. Also, added assertions to verify common item parameter mapping in the linked form.
- Removed unused `mockery` package from testing.
- Reviewer highlighted that scanning rows continuously inside the loop was still computationally expensive (`O(n)` scan). - Implemented `split(seq_len(nrow(NewScaleParms)), NewScaleParms$item)` to compute a parameter row map once outside the loop (`new_item_map`). - Used the precomputed O(1) map directly within the common item loop to further reduce runtime.
- In the previous commit, the sample size and item count in `test_aFIPC_fast.R` were reduced to avoid timeouts during CRAN testing. - However, reducing the test dimensions to 3 items and N=50 caused a "Too few degrees of freedom" failure during `mirt::mirt` EM estimation on Linux environments because there were only 7 degrees of freedom while 10 parameters were being freely estimated. - Restored the test parameters back to 5 items and N=500, which satisfies the `mirt` degrees of freedom requirements while continuing to pass in the CI environment (it executes in < 30 seconds).
- Addressed OpenCode reviewer comment requesting stronger test validations. - Updated `tests/testthat/test_aFIPC_fast.R` to strictly assert that the parameter values in `LinkedModel` precisely match the `oldFormModel` references for common items (`expect_equal`). - Expanded the verification to ensure that all internal parameter slots for the common items have their estimation status safely locked out (`est == FALSE`).
💡 What:
aFIPC.R내 공통 문항(Parameter Mapping) 처리 로직에서 배열 전체를 스캔하는which()검색 결과를 변수에 담아 재사용하도록 캐싱을 추가하고, 불필요한paste0()래핑을 제거하였습니다.🎯 Why: R 언어에서 루프 안쪽에서 반복적으로 배열 전체를 풀스캔하는 작업은 O(N^2)의 비효율성을 야기하며 성능 병목의 원인이 됩니다. 동일한 검색을 반복하지 않고 한 번 검색한 인덱스를 재사용함으로써 연산량과 메모리 낭비를 방지하기 위함입니다.
📊 Impact: 공통 문항 수가 많아질수록 기하급수적으로 증가하는 루프 내 벡터 검색 오버헤드를 크게 감소시킵니다.
🔬 Measurement:
R CMD check테스트 수행 완료.test_aFIPC_fast.R테스트 스크립트를 통해 인덱스 캐싱 로직이 포함된autoFIPC호출 시 예외 없이 정상 수행됨을 검증했습니다.PR created automatically by Jules for task 5598241677884701805 started by @seonghobae