fix: strlen related variables should be of type size_t#3503
fix: strlen related variables should be of type size_t#3503fzipi wants to merge 2 commits intoowasp-modsecurity:v2/masterfrom
Conversation
Signed-off-by: Felipe Zipitria <felipe.zipitria@owasp.org>
There was a problem hiding this comment.
Pull request overview
This PR updates several strlen()-related variables and function parameters to use size_t (instead of narrower integer types) to address type-safety and platform-specific warnings (notably on Windows) in the Apache 2.x connector code.
Changes:
- Converted multiple length variables (derived from
strlen) fromint/unsigned shorttosize_t. - Updated
parse_pm_contentdeclaration/definition to acceptsize_t op_len, and propagated some call-site length type updates. - Adjusted several internal helper length variables across request processing, logging/sanitization, multipart handling, and hashing codepaths.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| apache2/re_variables.c | Uses size_t for request-line length in FULL_REQUEST generation. |
| apache2/re_operators.c | Updates several operator parameter length variables to size_t and propagates into parsing paths. |
| apache2/msc_util.h | Changes parse_pm_content prototype to take size_t op_len. |
| apache2/msc_util.c | Changes parse_pm_content definition and updates internal length usage. |
| apache2/msc_status_engine.c | Changes base32 helper function parameters/locals to size_t where derived from strlen. |
| apache2/msc_multipart.c | Updates some strlen-related locals to size_t in multipart parsing helpers. |
| apache2/msc_logging.c | Updates several sanitization loop counters/offsets to size_t. |
| apache2/msc_crypt.c | Changes bytes accumulator (based on strlen) to size_t. |
| apache2/apache2_util.c | Uses size_t for message length derived from strlen in internal logging helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ret = verify_gsb(gsb, msr, match, match_length); | ||
|
|
There was a problem hiding this comment.
match_length/canon_length were changed to size_t, but verify_gsb still takes an unsigned int match_length (see its signature) and is called with a size_t argument. This can truncate lengths > UINT_MAX and typically triggers MSVC warnings. Update verify_gsb to take size_t (or apr_size_t) and adjust the apr_md5_update call accordingly.
| op_len = strlen(buf); | ||
| processed = apr_pstrdup(rule->ruleset->mp, parse_pm_content(buf, op_len, rule, error_msg)); |
There was a problem hiding this comment.
op_len in msre_op_pmFromFile_param_init is still declared as unsigned short int but is assigned from strlen(buf) and passed to parse_pm_content (now size_t). This truncates lengths over 65535 and can reintroduce the Windows warnings this PR is targeting. Change op_len to size_t and update any dependent logic accordingly.
…tion Keep masking offset as signed int in msc_logging.c to prevent negative pos from disabling sanitization via unsigned comparison. Propagate size_t to loop indices, function signatures, and format specifiers. Revert size_t where downstream consumers are all int.
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (2)
apache2/msc_status_engine.c:72
msc_status_engine_base32_encodenow takessize_t len, butcountis stillintand is compared againstlen(while (count < len ...)). This introduces signed/unsigned comparison warnings and can behave incorrectly iflenexceedsINT_MAX. Consider makingcountasize_t(and adjusting the return type or adding a checked cast on return).
int DSOLOCAL msc_status_engine_base32_encode(char *encoded,
const char *data, size_t len) {
int buffer;
int count = 0;
char *result = encoded;
size_t length = strlen(data);
buffer = data[0];
if (encoded == NULL && len == 0) {
len = length * 3;
count++;
}
if (length > 0) {
int next = 1;
int bitsLeft = 8;
while (count < len && (bitsLeft > 0 || next < length)) {
int index;
apache2/msc_status_engine.c:124
msc_status_engine_fill_with_dotsnow takessize_t len, but the loop usesint iand compares it to bothstrlen(data)(size_t) andlen(size_t). This is likely to trigger signed/unsigned warnings on MSVC/Clang. Usesize_t i(and avoid repeatedstrlen(data)inside the loop) to keep types consistent.
int DSOLOCAL msc_status_engine_fill_with_dots(char *encoded_with_dots,
const char *data, size_t len, int space)
{
int i;
int count = 0;
if (encoded_with_dots == NULL) {
if (len == 0 && data != NULL) {
len = strlen(data);
}
else if (len == 0 && data == NULL) {
count = -1;
goto return_length;
}
count = len/space + len + 1;
goto return_length;
}
for (i = 0; i < strlen(data) && i < len; i++) {
if (i % space == 0 && i != 0) {
encoded_with_dots[count++] = '.';
}
encoded_with_dots[count++] = data[i];
}
encoded_with_dots[count] = '\0';
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| size_t j; | ||
| char *buf = NULL, *pat = NULL; | ||
| msc_parm *mparm = NULL; | ||
| int arg_min, arg_max, sanitize_matched; |
There was a problem hiding this comment.
In sec_audit_logger_json, j is now size_t but arg_min/arg_max are still int and later receive j (derived from strlen). This can truncate on 64-bit and may defeat the goal of eliminating warnings. Use size_t for the related variables (or add an explicit, checked cast if these are intentionally bounded).
| int arg_min, arg_max, sanitize_matched; | |
| size_t arg_min, arg_max; | |
| int sanitize_matched; |
| int off = (int)strlen(mparm->value) - arg_max; | ||
| int pos = mparm->pad_1-1; | ||
| if(off > pos) { |
There was a problem hiding this comment.
Same as earlier in this file: int off = (int)strlen(mparm->value) - arg_max; performs potentially truncating arithmetic on strlen() results. Keeping off/arg_max/pos in a consistent unsigned (or signed) width avoids truncation and signed/unsigned comparison problems.
| int off = (int)strlen(mparm->value) - arg_max; | |
| int pos = mparm->pad_1-1; | |
| if(off > pos) { | |
| size_t off = strlen(mparm->value) - (size_t)arg_max; | |
| int pos = mparm->pad_1-1; | |
| if((pos < 0) || (off > (size_t)pos)) { |
| int off = (int)strlen(mparm->value) - arg_max; | ||
| int pos = mparm->pad_1-1; | ||
| if(off > pos) { |
There was a problem hiding this comment.
Same truncation risk here: int off = (int)strlen(mparm->value) - arg_max;. Since strlen() returns size_t, doing the math in int can overflow/truncate and makes the code less portable. Prefer using size_t (or ptrdiff_t) for off and adjust comparisons accordingly.
| int off = (int)strlen(mparm->value) - arg_max; | |
| int pos = mparm->pad_1-1; | |
| if(off > pos) { | |
| size_t off = strlen(mparm->value) - (size_t)arg_max; | |
| int pos = mparm->pad_1-1; | |
| if ((pos < 0) || (off > (size_t)pos)) { |
| size_t j; | ||
| int arg_min, arg_max; | ||
|
|
There was a problem hiding this comment.
In sanitize_request_line, j is now size_t but arg_min/arg_max remain int. A few lines later arg_min = j; assigns the result of strlen() (size_t) into an int, which can truncate on 64-bit and may reintroduce signed/unsigned warnings. Consider making arg_min/arg_max (and related computations) size_t (or removing arg_min if it’s unused outside the loop).
| if(arg_max > mparm->pad_2) { | ||
| int off = (strlen(mparm->value) - arg_max); | ||
| int off = (int)strlen(mparm->value) - arg_max; | ||
| int pos = (mparm->pad_1-1); | ||
| if(off > pos) { |
There was a problem hiding this comment.
off is computed as (int)strlen(mparm->value) - arg_max. Casting strlen() down to int can truncate for long values and can also mask underflow/overflow issues in the subtraction. Prefer doing this arithmetic in size_t (or ptrdiff_t if negatives are meaningful) and compare using consistent unsigned types.
| size_t j; | ||
| char *buf = NULL, *pat = NULL; | ||
| msc_parm *mparm = NULL; | ||
| int arg_min, arg_max, sanitize_matched; |
There was a problem hiding this comment.
In sec_audit_logger_native, j is now size_t but arg_min/arg_max remain int and are assigned from strlen()-derived values. This can truncate and may still trigger signed/unsigned warnings on some toolchains. Consider updating these related counters/lengths to size_t for consistency.
| int arg_min, arg_max, sanitize_matched; | |
| size_t arg_min, arg_max; | |
| int sanitize_matched; |
|
@copilot apply changes based on the comments in this thread |


what
strlenreturns asize_tvariable, that depends on the architecture and environmentwhy
references