diff --git a/.cognition/skills/debug-exiftool/SKILL.md b/.cognition/skills/debug-exiftool/SKILL.md
index 0007e0127..c424b830c 100644
--- a/.cognition/skills/debug-exiftool/SKILL.md
+++ b/.cognition/skills/debug-exiftool/SKILL.md
@@ -93,6 +93,16 @@ cd perl5_t/t
 ../../jperl op/lexsub.t
 ```
 
+### Running Perl5 core tests that use subprocess tests
+Tests using `run_multiple_progs()` or `fresh_perl_is()` spawn `jperl` as a subprocess. This requires `jperl` to be in PATH:
+```bash
+# Using the test runner (handles PATH automatically):
+perl dev/tools/perl_test_runner.pl perl5_t/t/op/eval.t
+
+# Manual running (PATH must be set on the jperl command itself, not on cd):
+cd perl5_t/t && PATH="/Users/fglock/projects/PerlOnJava2:$PATH" ../../jperl op/eval.t
+```
+
 ## Comparing with System Perl
 
 When debugging, compare PerlOnJava output with native Perl to isolate the difference:
@@ -285,7 +295,7 @@ Key files for the interpreter:
 
 ## Current Test Status (as of 2026-03-03)
 
-### ExifTool Test Results: 524/600 planned (87%)
+### ExifTool Test Results: 590/600 planned (98%)
 
 | Test | Pass/Planned | Status |
 |------|-------------|--------|
@@ -358,6 +368,10 @@ Various format-specific write issues.
Many may share root causes with P1 (mandat
 | Dynamic variables | `runtime/runtimetypes/DynamicVariableManager.java` |
 | IO operations | `runtime/runtimetypes/RuntimeIO.java` |
 | IO operator (open/dup) | `runtime/operators/IOOperator.java` |
+| Control flow (goto/labels) | `backend/jvm/EmitControlFlow.java` |
+| Dereference / slicing | `backend/jvm/Dereference.java` |
+| Variable emission (refs) | `backend/jvm/EmitVariable.java` |
+| String parser (qw, heredoc) | `frontend/parser/StringParser.java` |
 | String operators | `runtime/operators/StringOperators.java` |
 | Pack/Unpack | `runtime/operators/PackOperator.java` |
 | Regex preprocessor | `runtime/regex/RegexPreprocessor.java` |
@@ -385,6 +399,46 @@ If a fix only patches ONE of these paths (e.g., `capturedVarIndices` check in `v
 ### Ordering matters for capturedVars
 `SubroutineParser` builds `paramList` by iterating `getAllVisibleVariables()` (TreeMap sorted by register index) with specific filters. `detectClosureVariables()` must use the **exact same iteration order and filters**. Any mismatch causes captured variable values to be assigned to wrong registers at runtime.
 
+### goto LABEL across JVM scope boundaries
+`EmitControlFlow.handleGotoLabel()` resolves labels at compile time within the current JVM scope. When the target label is outside the current scope (e.g., goto inside a `map` block to a label outside, or goto inside an `eval` block), the compile-time lookup fails. The fix is to emit a `RuntimeControlFlowList` marker with `ControlFlowType.GOTO` at runtime (the same mechanism used by dynamic `goto EXPR`), allowing the goto signal to propagate up the call stack. This was a blocker for both op/array.t and op/eval.t.
+
+### List slice with range indices
+In `Dereference.handleArrowArrayDeref()`, the check for single-index vs slice path must account for range expressions (`..` operator). A range like `0..5` is a single AST node but produces multiple indices. 
The correct condition is: use single-index path only if there's one element AND it's not a range. Otherwise, use the slice path. The old code had a complex `isArrayLiteral` check that was too restrictive.
+
+### qw() backslash processing
+`StringParser.parseWordsString()` must apply single-quote backslash rules to each word: `\\` → `\` and `\delimiter` → `delimiter`. Without this, backslashes are doubled in the output. The processing uses the closing delimiter from the qw construct.
+
+### `\(LIST)` must flatten arrays before creating refs
+`\(@array)` should create individual scalar refs to each array element (like `map { \$_ } @array`), not a single ref to the array. `EmitVariable` needs a `flattenElements()` method that detects `@` sigil nodes in the list and flattens them before creating element references.
+
+### Squashing a diverged branch with `git diff` + `git apply`
+When a feature branch has diverged far from master (thousands of commits in common history), both `git rebase` and `git merge --squash` can produce massive conflicts across dozens of files. The clean workaround:
+```bash
+# 1. Generate a patch of ONLY the branch's changes vs master
+git diff master..feature-branch > /tmp/branch-diff.patch
+# 2. Create a fresh branch from current master
+git checkout master && git checkout -b feature-branch-clean
+# 3. Apply the patch (no merge history = no conflicts)
+git apply /tmp/branch-diff.patch
+# 4. Commit as a single squashed commit
+git add -A && git commit -m "Squashed: ..."
+# 5. Force push to update the PR
+git push --force origin feature-branch-clean
+```
+This works because `git diff master..branch` produces the exact file-level delta, bypassing all the intermediate merge history that causes conflicts.
+
+### Always commit fixes before rebasing
+Uncommitted working tree changes are lost when `git rebase --abort` is run. 
If you have a fix in progress (e.g., a BitwiseOperators change), commit it first — even as a WIP commit — before attempting any rebase. The rebase abort restores the branch to its pre-rebase state, which does NOT include uncommitted changes.
+
+### `getInt()` vs `(int) getLong()` for 32-bit integer wrapping
+`RuntimeScalar.getInt()` clamps DOUBLE values to `Integer.MAX_VALUE` (e.g., `(int) 2147483648.0 == 2147483647`). But `(int) getLong()` wraps correctly via long→int truncation (e.g., `(int) 2147483648L == -2147483648`). For `use integer` operations where Config.pm reports `ivsize=4`, always use `(int) getLong()` to get proper 32-bit wrapping behavior matching Perl's semantics.
+
+### scalar gmtime/localtime ctime(3) format
+Perl's scalar `gmtime`/`localtime` returns ctime(3) format: `"Fri Mar  7 20:13:52 881"` — NOT RFC 1123 (`"Fri, 7 Mar 0881 20:13:52 GMT"`). Use `String.format()` with explicit field widths, not `DateTimeFormatter`. Also: wday must use `getValue() % 7` (Perl: 0=Sun..6=Sat) not `getValue()` (Java: 1=Mon..7=Sun). Large years (>9999) must not crash the formatter.
+
+### Regression testing: always compare branch vs master
+Before declaring a fix complete, run the same test on both master and the branch to distinguish real regressions from pre-existing failures. Use `perl5_t/t/` (not `perl5/t/`) for running Perl5 core tests — the `perl5_t` copy has test harness files (`test.pl`, `charset_tools.pl`) that PerlOnJava can load.
+
 ## Adding Debug Instrumentation
 
 In ExifTool Perl code (temporary, never commit):
diff --git a/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part1-intro.md b/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part1-intro.md
index 287c3355f..1f98ed403 100644
--- a/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part1-intro.md
+++ b/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part1-intro.md
@@ -69,23 +69,20 @@ JSR-223 is the standard Java scripting API, available since Java 6. 
It allows bi
 
 **`perlonjava-3.0.0.jar`** — 25 MB, zero external dependencies
 
-**Same JAR runs on Linux, macOS, and Windows** — no recompilation.
-
 ```
 perlonjava.jar
-├── org/perlonjava/ ← 392 Java compiled classes
-├── lib/ ← 341 Perl modules (DBI, JSON, HTTP::Tiny…)
-├── runtime/nativ/ ← Platform abstraction (POSIX ↔ Win32 via JNA)
-├── ASM, ICU4J, JNA ← Java libraries bundled
-└── META-INF/services ← JSR-223 auto-discovery
+├── org/perlonjava/ ← 392 Java compiled classes
+├── lib/ ← 341 Perl modules (DBI, JSON, HTTP::Tiny…)
+├── ASM, ICU4J, JNA ← Java libraries bundled
+└── META-INF/services ← JSR-223 auto-discovery
 ```
 
-`java -jar perlonjava.jar script.pl` — or `./jperl` / `jperl.bat`
+`java -jar perlonjava.jar script.pl` — that's it.
 
-Also ships as: **Debian package** (`make deb`) · **Docker image** (`docker build -t perlonjava .`)
+Or use `./jperl script.pl` — a wrapper that also supports `$CLASSPATH` for JDBC drivers.
 
 Note:
-Built with Gradle Shadow plugin (fat JAR). Perl modules live in src/main/perl/lib and are packaged as resources inside the JAR. The require mechanism reads them directly from the JAR via classloader. The nativ/ package uses JNA to call POSIX libc on Unix and Kernel32 on Windows — same Perl code, platform-specific native calls handled transparently. The jperl wrapper uses -cp instead of -jar so users can add extra JARs to CLASSPATH. Docker: `docker build -t perlonjava .` then `docker run perlonjava script.pl`. Debian: `make deb` creates a .deb in build/distributions/, install with `sudo dpkg -i`.
+Built with Gradle Shadow plugin (fat JAR). Perl modules live in src/main/perl/lib and are packaged as resources inside the JAR. The require mechanism reads them directly from the JAR via classloader. No installation, no CPAN, no paths to configure. The jperl wrapper uses -cp instead of -jar so users can add extra JARs to CLASSPATH.
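
The slide note above says modules packaged inside the JAR are read through the classloader. As a rough illustration of that general mechanism only (this is not PerlOnJava's loader), the sketch below reads the first bytes of a classpath resource. The class name `ClasspathResourceDemo`, the method `readHead`, and the resource name `lib/strict.pm` are hypothetical; `java/lang/Object.class` is used only because it is present on any JVM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of PerlOnJava. */
final class ClasspathResourceDemo {

    /** Reads up to n leading bytes of a classpath resource, or returns null if it is absent. */
    static byte[] readHead(String resourceName, int n) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (in == null) {
                return null; // resource is not on the classpath
            }
            byte[] buf = new byte[n];
            int off = 0;
            while (off < n) {
                int r = in.read(buf, off, n - off);
                if (r < 0) break; // end of stream before n bytes
                off += r;
            }
            return Arrays.copyOf(buf, off);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A class file always starts with the magic number CA FE BA BE.
        byte[] head = readHead("java/lang/Object.class", 4);
        StringBuilder hex = new StringBuilder();
        for (byte b : head) hex.append(String.format("%02X", b & 0xFF));
        System.out.println(hex); // CAFEBABE

        // Assumed module path; prints "not found" unless such a resource is on the classpath.
        System.out.println(readHead("lib/strict.pm", 16) == null ? "not found" : "found");
    }
}
```

The same lookup works whether the resource sits in a directory or inside a JAR on the classpath, which is why a `-cp` based wrapper can add extra JARs without changing the loading code.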
 
 ---
 
diff --git a/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part2-technical.md b/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part2-technical.md
index 33b804922..c4149aac0 100644
--- a/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part2-technical.md
+++ b/dev/presentations/German_Perl_Raku_Workshop_2026/slides-part2-technical.md
@@ -592,11 +592,10 @@ Also: globalIORefs → IO, globalFormatRefs → FORMAT. Slot access: *foo{CODE}
 
 - Loads **Java extensions** instead of C shared libraries
 - **JNA** (Java Native Access) replaces XS for native calls
-- `nativ/` package: POSIX libc on Unix, Kernel32 on Windows — **same JAR, all platforms**
 - No C compiler needed
 
 Note:
-The nativ/ package provides cross-platform implementations of symlink, link, getppid, getuid/gid, chmod, chown, kill, and more. NativeUtils detects the OS at startup and routes each call to the appropriate native API. ExtendedNativeUtils adds user/group info, network ops, and System V IPC. Java equivalents are easier to write and maintain than C/XS.
+Java equivalents are easier to write and maintain than C/XS. The same API surface is exposed to Perl code.
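
The SKILL.md notes earlier in this diff describe two Java behaviours that are easy to check in isolation: narrowing a double straight to `int` saturates, while going through `long` wraps to 32 bits, and `java.time.DayOfWeek` numbers days 1=Mon..7=Sun whereas Perl uses 0=Sun..6=Sat. The sketch below models both with plain casts. The class and method names are hypothetical, and it does not call the real `RuntimeScalar.getInt()` or `getLong()`.

```java
import java.time.DayOfWeek;

/** Illustrative only; the names here are hypothetical, not PerlOnJava APIs. */
final class Int32Notes {

    /** double -> int narrowing saturates at the int range. */
    static int clampViaDouble(double d) {
        return (int) d;
    }

    /** double -> long -> int keeps the low 32 bits, i.e. two's-complement wrap. */
    static int wrapViaLong(double d) {
        return (int) (long) d;
    }

    /** Maps java.time's 1=Mon..7=Sun onto Perl's 0=Sun..6=Sat. */
    static int perlWday(DayOfWeek day) {
        return day.getValue() % 7;
    }

    public static void main(String[] args) {
        System.out.println(clampViaDouble(2147483648.0)); // 2147483647
        System.out.println(wrapViaLong(2147483648.0));    // -2147483648
        System.out.println(perlWday(DayOfWeek.SUNDAY));   // 0
        System.out.println(perlWday(DayOfWeek.SATURDAY)); // 6
    }
}
```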
--- diff --git a/src/main/java/org/perlonjava/backend/bytecode/BytecodeCompiler.java b/src/main/java/org/perlonjava/backend/bytecode/BytecodeCompiler.java index 64349dd97..68948f447 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/BytecodeCompiler.java +++ b/src/main/java/org/perlonjava/backend/bytecode/BytecodeCompiler.java @@ -2138,28 +2138,6 @@ void compileVariableDeclaration(OperatorNode node, String op) { continue; } - // local @x / local %x in list form - if ((sigil.equals("@") || sigil.equals("%")) && sigilOp.operand instanceof IdentifierNode idNode) { - String varName = sigil + idNode.name; - if (hasVariable(varName)) { - throwCompilerException("Can't localize lexical variable " + varName); - } - - String globalVarName = NameNormalizer.normalizeVariableName(idNode.name, getCurrentPackage()); - int nameIdx = addToStringPool(globalVarName); - - int rd = allocateRegister(); - if (sigil.equals("@")) { - emitWithToken(Opcodes.LOCAL_ARRAY, node.getIndex()); - } else { - emitWithToken(Opcodes.LOCAL_HASH, node.getIndex()); - } - emitReg(rd); - emit(nameIdx); - varRegs.add(rd); - continue; - } - if (sigilOp.operand instanceof IdentifierNode) { String varName = sigil + ((IdentifierNode) sigilOp.operand).name; @@ -3206,7 +3184,10 @@ void compileVariableReference(OperatorNode node, String op) { BlockNode block = (BlockNode) node.operand; // Check strict refs at compile time — mirrors JVM path in EmitVariable.java + int savedCtx = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; block.accept(this); + currentCallContext = savedCtx; int blockResultReg = lastResultReg; int rd = allocateRegister(); if (isStrictRefsEnabled()) { @@ -3336,7 +3317,10 @@ void compileVariableReference(OperatorNode node, String op) { // @{ block } - evaluate block and dereference the result // The block should return an arrayref BlockNode blockNode = (BlockNode) node.operand; + int savedCtx = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; 
blockNode.accept(this); + currentCallContext = savedCtx; int refReg = lastResultReg; // Dereference to get the array @@ -3429,7 +3413,10 @@ void compileVariableReference(OperatorNode node, String op) { } } else if (node.operand instanceof BlockNode blockNode) { // %{ block } — evaluate block and dereference to hash + int savedCtx = currentCallContext; + currentCallContext = RuntimeContextType.SCALAR; blockNode.accept(this); + currentCallContext = savedCtx; int scalarReg = lastResultReg; int hashReg = allocateRegister(); if (isStrictRefsEnabled()) { @@ -3973,6 +3960,10 @@ private void visitNamedSubroutine(SubroutineNode node) { // Sub-compiler will use RETRIEVE_BEGIN opcodes for closure variables InterpretedCode subCode = subCompiler.compile(node.block); + if (RuntimeCode.DISASSEMBLE) { + System.out.println(subCode.disassemble()); + } + // Step 5: Emit bytecode to create closure or simple code ref int codeReg = allocateRegister(); @@ -4060,6 +4051,10 @@ private void visitAnonymousSubroutine(SubroutineNode node) { // Sub-compiler will use parentRegistry to resolve captured variables InterpretedCode subCode = subCompiler.compile(node.block); + if (RuntimeCode.DISASSEMBLE) { + System.out.println(subCode.disassemble()); + } + // Step 5: Create closure or simple code ref int codeReg = allocateRegister(); diff --git a/src/main/java/org/perlonjava/backend/bytecode/BytecodeInterpreter.java b/src/main/java/org/perlonjava/backend/bytecode/BytecodeInterpreter.java index 3be3a949f..33a75190c 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/BytecodeInterpreter.java +++ b/src/main/java/org/perlonjava/backend/bytecode/BytecodeInterpreter.java @@ -1043,6 +1043,10 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c case Opcodes.HASH_SET: { // Hash element store: hash{key} = value + // Must copy the value into a new scalar for the hash element, + // because the source register may be modified in-place later + // (e.g. 
$hash{k} = $fix; $fix = {} would clear $hash{k} otherwise) + // Uses addToScalar to properly resolve special variables ($1, $2, etc.) int hashReg = bytecode[pc++]; int keyReg = bytecode[pc++]; int valueReg = bytecode[pc++]; @@ -1050,7 +1054,9 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c RuntimeScalar key = (RuntimeScalar) registers[keyReg]; RuntimeBase valBase = registers[valueReg]; RuntimeScalar val = (valBase instanceof RuntimeScalar) ? (RuntimeScalar) valBase : valBase.scalar(); - hash.put(key.toString(), ensureMutableScalar(val)); + RuntimeScalar copy = new RuntimeScalar(); + val.addToScalar(copy); + hash.put(key.toString(), copy); break; } @@ -2340,6 +2346,7 @@ public static RuntimeList execute(InterpretedCode code, RuntimeArray args, int c case Opcodes.VEC: case Opcodes.LOCALTIME: case Opcodes.GMTIME: + case Opcodes.RESET: case Opcodes.CRYPT: case Opcodes.CLOSE: case Opcodes.BINMODE: diff --git a/src/main/java/org/perlonjava/backend/bytecode/CompileAssignment.java b/src/main/java/org/perlonjava/backend/bytecode/CompileAssignment.java index 908c4c3b5..cc9a32639 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/CompileAssignment.java +++ b/src/main/java/org/perlonjava/backend/bytecode/CompileAssignment.java @@ -1003,6 +1003,15 @@ public static void compileAssignmentOperator(BytecodeCompiler bytecodeCompiler, bytecodeCompiler.emitReg(lvalueReg); bytecodeCompiler.emitReg(valueReg); + bytecodeCompiler.lastResultReg = valueReg; + } else if (leftOp.operator.equals("substr")) { + node.left.accept(bytecodeCompiler); + int lvalueReg = bytecodeCompiler.lastResultReg; + + bytecodeCompiler.emit(Opcodes.SET_SCALAR); + bytecodeCompiler.emitReg(lvalueReg); + bytecodeCompiler.emitReg(valueReg); + bytecodeCompiler.lastResultReg = valueReg; } else if (leftOp.operator.equals("@") && leftOp.operand instanceof OperatorNode) { // Array dereference assignment: @$r = ... 
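
The `HASH_SET` change in `BytecodeInterpreter.java` above copies the value into a fresh scalar before storing it, so a later in-place write to the source register does not show through the hash element. A minimal sketch of that aliasing hazard follows. `Cell` is a hypothetical stand-in for a mutable scalar container, not the real `RuntimeScalar`, and the two `put` helpers are named only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; models the aliasing hazard with a hypothetical Cell type. */
final class HashStoreCopyDemo {

    /** Hypothetical mutable container standing in for a scalar register. */
    static final class Cell {
        Object value;
        Cell(Object value) { this.value = value; }
        void set(Object v) { this.value = v; }
    }

    /** Stores the caller's cell itself: later in-place writes to it are visible via the map. */
    static void putAliased(Map<String, Cell> hash, String key, Cell src) {
        hash.put(key, src);
    }

    /** Stores a fresh cell holding the current value: later writes to src are not visible. */
    static void putCopied(Map<String, Cell> hash, String key, Cell src) {
        hash.put(key, new Cell(src.value));
    }

    public static void main(String[] args) {
        Map<String, Cell> hash = new HashMap<>();
        Cell fix = new Cell("first");

        putAliased(hash, "aliased", fix);
        putCopied(hash, "copied", fix);

        fix.set("second"); // in-place modification of the source

        System.out.println(hash.get("aliased").value); // second
        System.out.println(hash.get("copied").value);  // first
    }
}
```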
@@ -1262,34 +1271,49 @@ public static void compileAssignmentOperator(BytecodeCompiler bytecodeCompiler, // Check for hash slice assignment: @hash{keys} = values if (hashOp.operator.equals("@")) { - // Hash slice assignment - if (!(hashOp.operand instanceof IdentifierNode)) { - bytecodeCompiler.throwCompilerException("Hash slice assignment requires identifier"); - return; - } - String varName = ((IdentifierNode) hashOp.operand).name; - String hashVarName = "%" + varName; + if (hashOp.operand instanceof IdentifierNode idNode) { + String varName = idNode.name; + String hashVarName = "%" + varName; - if (bytecodeCompiler.currentSubroutineBeginId != 0 && bytecodeCompiler.currentSubroutineClosureVars != null - && bytecodeCompiler.currentSubroutineClosureVars.contains(hashVarName)) { + if (bytecodeCompiler.currentSubroutineBeginId != 0 && bytecodeCompiler.currentSubroutineClosureVars != null + && bytecodeCompiler.currentSubroutineClosureVars.contains(hashVarName)) { + hashReg = bytecodeCompiler.allocateRegister(); + int nameIdx = bytecodeCompiler.addToStringPool(hashVarName); + bytecodeCompiler.emitWithToken(Opcodes.RETRIEVE_BEGIN_HASH, node.getIndex()); + bytecodeCompiler.emitReg(hashReg); + bytecodeCompiler.emit(nameIdx); + bytecodeCompiler.emit(bytecodeCompiler.currentSubroutineBeginId); + } else if (bytecodeCompiler.hasVariable(hashVarName)) { + hashReg = bytecodeCompiler.getVariableRegister(hashVarName); + } else { + hashReg = bytecodeCompiler.allocateRegister(); + String globalHashName = NameNormalizer.normalizeVariableName( + varName, + bytecodeCompiler.getCurrentPackage() + ); + int nameIdx = bytecodeCompiler.addToStringPool(globalHashName); + bytecodeCompiler.emit(Opcodes.LOAD_GLOBAL_HASH); + bytecodeCompiler.emitReg(hashReg); + bytecodeCompiler.emit(nameIdx); + } + } else if (hashOp.operand instanceof OperatorNode) { + hashOp.operand.accept(bytecodeCompiler); + int scalarRefReg = bytecodeCompiler.lastResultReg; hashReg = bytecodeCompiler.allocateRegister(); - 
int nameIdx = bytecodeCompiler.addToStringPool(hashVarName); - bytecodeCompiler.emitWithToken(Opcodes.RETRIEVE_BEGIN_HASH, node.getIndex()); - bytecodeCompiler.emitReg(hashReg); - bytecodeCompiler.emit(nameIdx); - bytecodeCompiler.emit(bytecodeCompiler.currentSubroutineBeginId); - } else if (bytecodeCompiler.hasVariable(hashVarName)) { - hashReg = bytecodeCompiler.getVariableRegister(hashVarName); + if (bytecodeCompiler.isStrictRefsEnabled()) { + bytecodeCompiler.emitWithToken(Opcodes.DEREF_HASH, node.getIndex()); + bytecodeCompiler.emitReg(hashReg); + bytecodeCompiler.emitReg(scalarRefReg); + } else { + int pkgIdx = bytecodeCompiler.addToStringPool(bytecodeCompiler.getCurrentPackage()); + bytecodeCompiler.emitWithToken(Opcodes.DEREF_HASH_NONSTRICT, node.getIndex()); + bytecodeCompiler.emitReg(hashReg); + bytecodeCompiler.emitReg(scalarRefReg); + bytecodeCompiler.emit(pkgIdx); + } } else { - hashReg = bytecodeCompiler.allocateRegister(); - String globalHashName = NameNormalizer.normalizeVariableName( - varName, - bytecodeCompiler.getCurrentPackage() - ); - int nameIdx = bytecodeCompiler.addToStringPool(globalHashName); - bytecodeCompiler.emit(Opcodes.LOAD_GLOBAL_HASH); - bytecodeCompiler.emitReg(hashReg); - bytecodeCompiler.emit(nameIdx); + bytecodeCompiler.throwCompilerException("Hash slice assignment requires identifier or reference"); + return; } // Get the keys from HashLiteralNode diff --git a/src/main/java/org/perlonjava/backend/bytecode/CompileBinaryOperator.java b/src/main/java/org/perlonjava/backend/bytecode/CompileBinaryOperator.java index 81167c354..584f38310 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/CompileBinaryOperator.java +++ b/src/main/java/org/perlonjava/backend/bytecode/CompileBinaryOperator.java @@ -219,6 +219,13 @@ static void visitBinaryOperator(BytecodeCompiler bytecodeCompiler, BinaryOperato // Code reference call: $code->() or $code->(@args) // right is ListNode with arguments else if (node.right instanceof ListNode) { + // 
Special case: eval { ... }->() is parsed as BinaryOperatorNode("->", SubroutineNode[useTryCatch], ListNode) + // The interpreter compiles eval blocks inline (EVAL_TRY/END), so we should NOT emit CALL_SUB + if (node.left instanceof SubroutineNode sn && sn.useTryCatch) { + node.left.accept(bytecodeCompiler); + return; + } + // This is a code reference call: $coderef->(args) // Compile the code reference in scalar context int savedContext = bytecodeCompiler.currentCallContext; @@ -463,13 +470,21 @@ else if (node.right instanceof BinaryOperatorNode) { bytecodeCompiler.emitInt(0); // Placeholder for offset (will be patched) // NOW compile right operand (only executed if left was true) + // Force SCALAR context so the right operand always produces a result register + int savedContext2 = bytecodeCompiler.currentCallContext; + if (bytecodeCompiler.currentCallContext == RuntimeContextType.VOID) { + bytecodeCompiler.currentCallContext = RuntimeContextType.SCALAR; + } node.right.accept(bytecodeCompiler); int rs2 = bytecodeCompiler.lastResultReg; + bytecodeCompiler.currentCallContext = savedContext2; // Move right result to rd (overwriting left value) - bytecodeCompiler.emit(Opcodes.ALIAS); - bytecodeCompiler.emitReg(rd); - bytecodeCompiler.emitReg(rs2); + if (rs2 >= 0) { + bytecodeCompiler.emit(Opcodes.ALIAS); + bytecodeCompiler.emitReg(rd); + bytecodeCompiler.emitReg(rs2); + } // Patch the forward jump offset int skipRightTarget = bytecodeCompiler.bytecode.size(); @@ -505,13 +520,20 @@ else if (node.right instanceof BinaryOperatorNode) { bytecodeCompiler.emitInt(0); // Placeholder for offset (will be patched) // NOW compile right operand (only executed if left was false) + int savedContext2 = bytecodeCompiler.currentCallContext; + if (bytecodeCompiler.currentCallContext == RuntimeContextType.VOID) { + bytecodeCompiler.currentCallContext = RuntimeContextType.SCALAR; + } node.right.accept(bytecodeCompiler); int rs2 = bytecodeCompiler.lastResultReg; + 
bytecodeCompiler.currentCallContext = savedContext2; // Move right result to rd (overwriting left value) - bytecodeCompiler.emit(Opcodes.ALIAS); - bytecodeCompiler.emitReg(rd); - bytecodeCompiler.emitReg(rs2); + if (rs2 >= 0) { + bytecodeCompiler.emit(Opcodes.ALIAS); + bytecodeCompiler.emitReg(rd); + bytecodeCompiler.emitReg(rs2); + } // Patch the forward jump offset int skipRightTarget = bytecodeCompiler.bytecode.size(); @@ -553,13 +575,20 @@ else if (node.right instanceof BinaryOperatorNode) { bytecodeCompiler.emitInt(0); // Placeholder for offset (will be patched) // NOW compile right operand (only executed if left was undefined) + int savedContext2 = bytecodeCompiler.currentCallContext; + if (bytecodeCompiler.currentCallContext == RuntimeContextType.VOID) { + bytecodeCompiler.currentCallContext = RuntimeContextType.SCALAR; + } node.right.accept(bytecodeCompiler); int rs2 = bytecodeCompiler.lastResultReg; + bytecodeCompiler.currentCallContext = savedContext2; // Move right result to rd (overwriting left value) - bytecodeCompiler.emit(Opcodes.ALIAS); - bytecodeCompiler.emitReg(rd); - bytecodeCompiler.emitReg(rs2); + if (rs2 >= 0) { + bytecodeCompiler.emit(Opcodes.ALIAS); + bytecodeCompiler.emitReg(rd); + bytecodeCompiler.emitReg(rs2); + } // Patch the forward jump offset int skipRightTarget = bytecodeCompiler.bytecode.size(); diff --git a/src/main/java/org/perlonjava/backend/bytecode/CompileOperator.java b/src/main/java/org/perlonjava/backend/bytecode/CompileOperator.java index 1684b1d2b..cd9166571 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/CompileOperator.java +++ b/src/main/java/org/perlonjava/backend/bytecode/CompileOperator.java @@ -1634,26 +1634,31 @@ public static void visitOperator(BytecodeCompiler bytecodeCompiler, OperatorNode } } else if (op.equals("chop")) { // chop $x - remove last character, modifies argument in place - // operand: ListNode containing scalar variable reference - if (node.operand == null) { - 
bytecodeCompiler.throwCompilerException("chop requires an argument"); - } + boolean chopNoArgs = node.operand == null || + (node.operand instanceof ListNode && ((ListNode) node.operand).elements.isEmpty()); - // Extract the actual operand from ListNode if needed - Node actualOperand = node.operand; - if (actualOperand instanceof ListNode) { - ListNode list = (ListNode) actualOperand; - if (list.elements.isEmpty()) { - bytecodeCompiler.throwCompilerException("chop requires an argument"); + int scalarReg; + if (chopNoArgs) { + String varName = "$_"; + if (bytecodeCompiler.hasVariable(varName)) { + scalarReg = bytecodeCompiler.getVariableRegister(varName); + } else { + scalarReg = bytecodeCompiler.allocateRegister(); + int nameIdx = bytecodeCompiler.addToStringPool("main::_"); + bytecodeCompiler.emit(Opcodes.LOAD_GLOBAL_SCALAR); + bytecodeCompiler.emitReg(scalarReg); + bytecodeCompiler.emit(nameIdx); + } + } else { + Node actualOperand = node.operand; + if (actualOperand instanceof ListNode) { + ListNode list = (ListNode) actualOperand; + actualOperand = list.elements.get(0); } - actualOperand = list.elements.get(0); + actualOperand.accept(bytecodeCompiler); + scalarReg = bytecodeCompiler.lastResultReg; } - // Compile the operand (should be an lvalue) - actualOperand.accept(bytecodeCompiler); - int scalarReg = bytecodeCompiler.lastResultReg; - - // Call chopScalar and store result back int rd = bytecodeCompiler.allocateRegister(); bytecodeCompiler.emit(Opcodes.CHOP); bytecodeCompiler.emitReg(rd); @@ -2070,8 +2075,9 @@ public static void visitOperator(BytecodeCompiler bytecodeCompiler, OperatorNode bytecodeCompiler.lastResultReg = rd; } else if (op.equals("chomp")) { // chomp($x) or chomp - remove trailing newlines - if (node.operand == null) { - // chomp with no args - operates on $_ + boolean noArgs = node.operand == null || + (node.operand instanceof ListNode && ((ListNode) node.operand).elements.isEmpty()); + if (noArgs) { String varName = "$_"; int targetReg; if 
(bytecodeCompiler.hasVariable(varName)) { @@ -2091,13 +2097,10 @@ public static void visitOperator(BytecodeCompiler bytecodeCompiler, OperatorNode bytecodeCompiler.lastResultReg = rd; } else { - // chomp with argument if (node.operand instanceof ListNode) { ListNode list = (ListNode) node.operand; if (!list.elements.isEmpty()) { list.elements.get(0).accept(bytecodeCompiler); - } else { - bytecodeCompiler.throwCompilerException("chomp requires an argument"); } } else { node.operand.accept(bytecodeCompiler); @@ -2875,7 +2878,7 @@ public static void visitOperator(BytecodeCompiler bytecodeCompiler, OperatorNode op.equals("rename") || op.equals("link") || op.equals("readlink") || op.equals("umask") || op.equals("system") || op.equals("pack") || op.equals("unpack") || op.equals("vec") || op.equals("crypt") || - op.equals("localtime") || op.equals("gmtime") || op.equals("caller") || + op.equals("localtime") || op.equals("gmtime") || op.equals("caller") || op.equals("reset") || op.equals("fileno") || op.equals("getc") || op.equals("qx") || op.equals("close") || op.equals("binmode") || op.equals("seek") || @@ -2936,6 +2939,7 @@ public static void visitOperator(BytecodeCompiler bytecodeCompiler, OperatorNode case "vec" -> Opcodes.VEC; case "localtime" -> Opcodes.LOCALTIME; case "gmtime" -> Opcodes.GMTIME; + case "reset" -> Opcodes.RESET; case "crypt" -> Opcodes.CRYPT; case "close" -> Opcodes.CLOSE; case "binmode" -> Opcodes.BINMODE; diff --git a/src/main/java/org/perlonjava/backend/bytecode/InterpretedCode.java b/src/main/java/org/perlonjava/backend/bytecode/InterpretedCode.java index 070c543ee..5041e7848 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/InterpretedCode.java +++ b/src/main/java/org/perlonjava/backend/bytecode/InterpretedCode.java @@ -1508,6 +1508,7 @@ public String disassemble() { case Opcodes.CRYPT: case Opcodes.LOCALTIME: case Opcodes.GMTIME: + case Opcodes.RESET: case Opcodes.CHMOD: case Opcodes.UNLINK: case Opcodes.UTIME: @@ -1530,6 +1531,7 @@ 
public String disassemble() { case Opcodes.CRYPT -> "crypt"; case Opcodes.LOCALTIME -> "localtime"; case Opcodes.GMTIME -> "gmtime"; + case Opcodes.RESET -> "reset"; case Opcodes.CHMOD -> "chmod"; case Opcodes.UNLINK -> "unlink"; case Opcodes.UTIME -> "utime"; diff --git a/src/main/java/org/perlonjava/backend/bytecode/MiscOpcodeHandler.java b/src/main/java/org/perlonjava/backend/bytecode/MiscOpcodeHandler.java index 9068e3f2c..f5a939432 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/MiscOpcodeHandler.java +++ b/src/main/java/org/perlonjava/backend/bytecode/MiscOpcodeHandler.java @@ -3,6 +3,7 @@ import org.perlonjava.runtime.nativ.NativeUtils; import org.perlonjava.runtime.operators.*; import org.perlonjava.runtime.operators.ChownOperator; +import org.perlonjava.runtime.operators.Operator; import org.perlonjava.runtime.operators.Directory; import org.perlonjava.runtime.operators.WaitpidOperator; import org.perlonjava.runtime.operators.Unpack; @@ -58,6 +59,7 @@ public static int execute(int opcode, int[] bytecode, int pc, RuntimeBase[] regi case Opcodes.VEC -> Vec.vec(args); case Opcodes.LOCALTIME -> Time.localtime(args, ctx); case Opcodes.GMTIME -> Time.gmtime(args, ctx); + case Opcodes.RESET -> Operator.reset(args, ctx); case Opcodes.CRYPT -> Crypt.crypt(args); // I/O operators case Opcodes.CLOSE -> IOOperator.close(ctx, argsArray); diff --git a/src/main/java/org/perlonjava/backend/bytecode/Opcodes.java b/src/main/java/org/perlonjava/backend/bytecode/Opcodes.java index 9d4d9b97a..c3c458302 100644 --- a/src/main/java/org/perlonjava/backend/bytecode/Opcodes.java +++ b/src/main/java/org/perlonjava/backend/bytecode/Opcodes.java @@ -1204,6 +1204,7 @@ public class Opcodes { public static final short INTEGER_RIGHT_SHIFT_ASSIGN = 366; public static final short INTEGER_DIV_ASSIGN = 367; public static final short INTEGER_MOD_ASSIGN = 368; + public static final short RESET = 369; private Opcodes() {} // Utility class - no instantiation } diff --git 
a/src/main/java/org/perlonjava/backend/jvm/Dereference.java b/src/main/java/org/perlonjava/backend/jvm/Dereference.java index 830fffabc..01b83042e 100644 --- a/src/main/java/org/perlonjava/backend/jvm/Dereference.java +++ b/src/main/java/org/perlonjava/backend/jvm/Dereference.java @@ -794,20 +794,11 @@ public static void handleArrowArrayDeref(EmitterVisitor emitterVisitor, BinaryOp ArrayLiteralNode right = (ArrayLiteralNode) node.right; - // Check if this is a true array literal (contains only literal elements like strings and numbers) - // and has a single range operator in the indices - boolean isArrayLiteral = node.left instanceof ArrayLiteralNode leftArray && - leftArray.elements.stream().allMatch(elem -> - elem instanceof StringNode || - elem instanceof NumberNode) && - leftArray.elements.size() > 1; // Must have multiple literal elements - boolean isSingleRange = right.elements.size() == 1 && right.elements.getFirst() instanceof BinaryOperatorNode binOp && "..".equals(binOp.operator); - // Only apply the fix to true array literals with range operators - if (right.elements.size() == 1 && !(isArrayLiteral && isSingleRange)) { + if (right.elements.size() == 1 && !isSingleRange) { // Single index: use get/delete/exists methods Node elem = right.elements.getFirst(); elem.accept(emitterVisitor.with(RuntimeContextType.SCALAR)); diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitBlock.java b/src/main/java/org/perlonjava/backend/jvm/EmitBlock.java index 51d060bfb..4d7e47c75 100644 --- a/src/main/java/org/perlonjava/backend/jvm/EmitBlock.java +++ b/src/main/java/org/perlonjava/backend/jvm/EmitBlock.java @@ -157,6 +157,7 @@ public static void emitBlock(EmitterVisitor emitterVisitor, BlockNode node) { int regexStateLocal = -1; if (!node.getBooleanAnnotation("blockIsSubroutine") + && !node.getBooleanAnnotation("skipRegexSaveRestore") && RegexUsageDetector.containsRegexOperation(node)) { regexStateLocal = emitterVisitor.ctx.symbolTable.allocateLocalVariable(); 
mv.visitTypeInsn(Opcodes.NEW, "org/perlonjava/runtime/runtimetypes/RegexState"); diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitControlFlow.java b/src/main/java/org/perlonjava/backend/jvm/EmitControlFlow.java index a795eb385..d9f662e7c 100644 --- a/src/main/java/org/perlonjava/backend/jvm/EmitControlFlow.java +++ b/src/main/java/org/perlonjava/backend/jvm/EmitControlFlow.java @@ -393,21 +393,19 @@ static void handleGotoLabel(EmitterVisitor emitterVisitor, OperatorNode node) { // For static label, check if it's local GotoLabels targetLabel = ctx.javaClassInfo.findGotoLabelsByName(labelName); if (targetLabel == null) { - // Non-local goto: create RuntimeControlFlowList and return - ctx.logDebug("visit(goto): Non-local goto to " + labelName); - - // Create new RuntimeControlFlowList with GOTO type, label, fileName, lineNumber + // Label not in current JVM scope - use RuntimeControlFlowList to signal + // goto to the caller, same mechanism as dynamic goto + String fileName = ctx.compilerOptions.fileName != null ? ctx.compilerOptions.fileName : "(eval)"; + int lineNumber = ctx.errorUtil != null ? ctx.errorUtil.getLineNumber(node.tokenIndex) : 0; + ctx.mv.visitTypeInsn(Opcodes.NEW, "org/perlonjava/runtime/runtimetypes/RuntimeControlFlowList"); ctx.mv.visitInsn(Opcodes.DUP); ctx.mv.visitFieldInsn(Opcodes.GETSTATIC, "org/perlonjava/runtime/runtimetypes/ControlFlowType", - "GOTO", + "GOTO", "Lorg/perlonjava/runtime/runtimetypes/ControlFlowType;"); ctx.mv.visitLdcInsn(labelName); - // Push fileName - ctx.mv.visitLdcInsn(ctx.compilerOptions.fileName != null ? ctx.compilerOptions.fileName : "(eval)"); - // Push lineNumber - int lineNumber = ctx.errorUtil != null ? 
ctx.errorUtil.getLineNumber(node.tokenIndex) : 0;
+            ctx.mv.visitLdcInsn(fileName);
             ctx.mv.visitLdcInsn(lineNumber);
             ctx.mv.visitMethodInsn(Opcodes.INVOKESPECIAL,
                     "org/perlonjava/runtime/runtimetypes/RuntimeControlFlowList",
@@ -415,18 +413,6 @@ static void handleGotoLabel(EmitterVisitor emitterVisitor, OperatorNode node) {
                     "(Lorg/perlonjava/runtime/runtimetypes/ControlFlowType;Ljava/lang/String;Ljava/lang/String;I)V", false);
-            int markerSlot = ctx.javaClassInfo.acquireSpillSlot();
-            boolean pooledMarker = markerSlot >= 0;
-            if (!pooledMarker) {
-                markerSlot = ctx.symbolTable.allocateLocalVariable();
-            }
-            ctx.mv.visitVarInsn(Opcodes.ASTORE, markerSlot);
-
-            // Jump to returnLabel with the marker on stack.
-            ctx.mv.visitVarInsn(Opcodes.ALOAD, markerSlot);
-            if (pooledMarker) {
-                ctx.javaClassInfo.releaseSpillSlot();
-            }
             ctx.mv.visitVarInsn(Opcodes.ASTORE, ctx.javaClassInfo.returnValueSlot);
             ctx.mv.visitJumpInsn(Opcodes.GOTO, ctx.javaClassInfo.returnLabel);
             return;
diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitForeach.java b/src/main/java/org/perlonjava/backend/jvm/EmitForeach.java
index 4207428c7..00c8c6a7a 100644
--- a/src/main/java/org/perlonjava/backend/jvm/EmitForeach.java
+++ b/src/main/java/org/perlonjava/backend/jvm/EmitForeach.java
@@ -416,6 +416,16 @@ public static void emitFor1(EmitterVisitor emitterVisitor, For1Node node) {
         int loopVarIndex = -1;
+        int foreachRegexStateLocal = -1;
+        if (node.body instanceof BlockNode bodyBlock && RegexUsageDetector.containsRegexOperation(bodyBlock)) {
+            foreachRegexStateLocal = emitterVisitor.ctx.symbolTable.allocateLocalVariable();
+            mv.visitTypeInsn(Opcodes.NEW, "org/perlonjava/runtime/runtimetypes/RegexState");
+            mv.visitInsn(Opcodes.DUP);
+            mv.visitMethodInsn(Opcodes.INVOKESPECIAL,
+                    "org/perlonjava/runtime/runtimetypes/RegexState", "<init>", "()V", false);
+            mv.visitVarInsn(Opcodes.ASTORE, foreachRegexStateLocal);
+        }
+
         mv.visitLabel(loopStart);
         // Check for pending signals (alarm, etc.)
at loop entry
@@ -556,16 +566,6 @@ public static void emitFor1(EmitterVisitor emitterVisitor, For1Node node) {
             int bodyScopeIndex = emitterVisitor.ctx.symbolTable.enterScope();
             Local.localRecord bodyLocalRecord = Local.localSetup(emitterVisitor.ctx, blockNode, mv, true);
-            int regexStateLocal = -1;
-            if (RegexUsageDetector.containsRegexOperation(blockNode)) {
-                regexStateLocal = emitterVisitor.ctx.symbolTable.allocateLocalVariable();
-                mv.visitTypeInsn(Opcodes.NEW, "org/perlonjava/runtime/runtimetypes/RegexState");
-                mv.visitInsn(Opcodes.DUP);
-                mv.visitMethodInsn(Opcodes.INVOKESPECIAL,
-                        "org/perlonjava/runtime/runtimetypes/RegexState", "<init>", "()V", false);
-                mv.visitVarInsn(Opcodes.ASTORE, regexStateLocal);
-            }
-
             pushGotoLabelsForBlock(emitterVisitor, blockNode);
             java.util.List list = blockNode.elements;
@@ -593,12 +593,6 @@ public static void emitFor1(EmitterVisitor emitterVisitor, For1Node node) {
             popGotoLabelsForBlock(emitterVisitor, blockNode);
-            if (regexStateLocal >= 0) {
-                mv.visitVarInsn(Opcodes.ALOAD, regexStateLocal);
-                mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL,
-                        "org/perlonjava/runtime/runtimetypes/RegexState", "restore", "()V", false);
-            }
-
             Local.localTeardown(bodyLocalRecord, mv);
             emitterVisitor.ctx.symbolTable.exitScope(bodyScopeIndex);
         } else {
@@ -622,6 +616,12 @@ public static void emitFor1(EmitterVisitor emitterVisitor, For1Node node) {
         mv.visitLabel(loopEnd);
+        if (foreachRegexStateLocal >= 0) {
+            mv.visitVarInsn(Opcodes.ALOAD, foreachRegexStateLocal);
+            mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL,
+                    "org/perlonjava/runtime/runtimetypes/RegexState", "restore", "()V", false);
+        }
+
         if (savedLoopVarIndex >= 0 && loopVarIndex >= 0) {
             mv.visitVarInsn(Opcodes.ALOAD, savedLoopVarIndex);
             mv.visitVarInsn(Opcodes.ASTORE, loopVarIndex);
diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitOperator.java b/src/main/java/org/perlonjava/backend/jvm/EmitOperator.java
index 274d311fe..60ba459d4 100644
---
a/src/main/java/org/perlonjava/backend/jvm/EmitOperator.java
+++ b/src/main/java/org/perlonjava/backend/jvm/EmitOperator.java
@@ -1182,6 +1182,11 @@ static void handleCreateReference(EmitterVisitor emitterVisitor, OperatorNode no
         MethodVisitor mv = emitterVisitor.ctx.mv;
         if (resultIsList(node)) {
             node.operand.accept(emitterVisitor.with(RuntimeContextType.LIST));
+            emitterVisitor.ctx.mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL,
+                    "org/perlonjava/runtime/runtimetypes/RuntimeList",
+                    "flattenElements",
+                    "()Lorg/perlonjava/runtime/runtimetypes/RuntimeList;",
+                    false);
             emitterVisitor.ctx.mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL,
                     "org/perlonjava/runtime/runtimetypes/RuntimeList",
                     "createListReference",
diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitStatement.java b/src/main/java/org/perlonjava/backend/jvm/EmitStatement.java
index 5fd2484dc..2f288f09c 100644
--- a/src/main/java/org/perlonjava/backend/jvm/EmitStatement.java
+++ b/src/main/java/org/perlonjava/backend/jvm/EmitStatement.java
@@ -119,6 +119,9 @@ public static void emitFor3(EmitterVisitor emitterVisitor, For3Node node) {
             mv.visitMethodInsn(Opcodes.INVOKESPECIAL,
                     "org/perlonjava/runtime/runtimetypes/RegexState", "<init>", "()V", false);
             mv.visitVarInsn(Opcodes.ASTORE, regexStateLocal);
+            if (node.body != null) {
+                node.body.setAnnotation("skipRegexSaveRestore", true);
+            }
         }
         // Visit the initialization node (executed once at the start)
@@ -247,6 +250,9 @@ static void emitDoWhile(EmitterVisitor emitterVisitor, For3Node node) {
             mv.visitMethodInsn(Opcodes.INVOKESPECIAL,
                     "org/perlonjava/runtime/runtimetypes/RegexState", "<init>", "()V", false);
             mv.visitVarInsn(Opcodes.ASTORE, regexStateLocal);
+            if (node.body != null) {
+                node.body.setAnnotation("skipRegexSaveRestore", true);
+            }
         }
         // Register loop labels as pseudo-loop (isTrueLoop = false)
diff --git a/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java b/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java
index 0f82f46f3..cf0303040 100644
--- a/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java
+++ b/src/main/java/org/perlonjava/backend/jvm/EmitterMethodCreator.java
@@ -357,16 +357,7 @@ public static byte[] getBytecode(EmitterContext ctx, Node ast, boolean useTryCat
         try {
             return getBytecodeInternal(ctx, ast, useTryCatch, false);
         } catch (MethodTooLargeException tooLarge) {
-            // When interpreter fallback is enabled, skip AST splitter and let exception propagate
-            // The interpreter has no size limits, so AST splitting is unnecessary
-            if (useInterpreterFallback) {
-                if (showFallback) {
-                    System.err.println("Note: Method too large, skipping AST splitter (interpreter fallback enabled).");
-                }
-                throw tooLarge; // Propagate to createRuntimeCode() which will use interpreter
-            }
-
-            // Automatic retry with AST splitting when interpreter fallback is not enabled
+            // Always try AST splitting first — interpreter lacks many operators
             try {
                 // Notify user that automatic refactoring is happening
                 if (showFallback) {
@@ -1570,12 +1561,11 @@ public static RuntimeCode createRuntimeCode(
         } catch (MethodTooLargeException e) {
             if (USE_INTERPRETER_FALLBACK) {
-                // Fall back to interpreter
-                System.err.println("Note: Method too large after AST splitting, using interpreter backend.");
+                if (SHOW_FALLBACK) {
+                    System.err.println("Note: Method too large after AST splitting, using interpreter backend.");
+                }
                 return compileToInterpreter(ast, ctx, useTryCatch);
             }
-
-            // If interpreter fallback disabled, re-throw to use existing AST splitter logic
             throw e;
         } catch (VerifyError e) {
             if (USE_INTERPRETER_FALLBACK) {
@@ -1584,9 +1574,14 @@ public static RuntimeCode createRuntimeCode(
             }
             throw new RuntimeException(e);
         } catch (PerlCompilerException e) {
-            if (USE_INTERPRETER_FALLBACK && e.getMessage() != null && e.getMessage().contains("ASM frame computation failed")) {
-                System.err.println("Note: ASM frame crash, using interpreter backend.");
-                return compileToInterpreter(ast, ctx, useTryCatch);
+            if
(USE_INTERPRETER_FALLBACK && e.getMessage() != null) {
+                String msg = e.getMessage();
+                if (msg.contains("ASM frame computation failed") || msg.contains("requires interpreter fallback")) {
+                    if (SHOW_FALLBACK) {
+                        System.err.println("Note: JVM compilation needs interpreter fallback (" + msg.split("\n")[0] + ").");
+                    }
+                    return compileToInterpreter(ast, ctx, useTryCatch);
+                }
             }
             throw e;
         }
@@ -1686,6 +1681,10 @@ private static InterpretedCode compileToInterpreter(
         // Compile AST to interpreter bytecode (pass ctx for package context and closure detection)
         InterpretedCode code = compiler.compile(ast, ctx);
+        if (ctx.compilerOptions.disassembleEnabled) {
+            System.out.println(code.disassemble());
+        }
+
         // Handle captured variables if needed (for closures)
         if (ctx.capturedEnv != null && ctx.capturedEnv.length > skipVariables) {
             // Extract captured variables from context
diff --git a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java b/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java
index c0b46f394..20bd7f488 100644
--- a/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java
+++ b/src/main/java/org/perlonjava/backend/jvm/astrefactor/LargeBlockRefactorer.java
@@ -172,13 +172,7 @@ public static boolean processBlock(EmitterVisitor emitterVisitor, BlockNode node
      * @return true if the block should be refactored
      */
     private static boolean shouldRefactorBlock(BlockNode node, EmitterVisitor emitterVisitor) {
-        // Minimal check: skip very small blocks to avoid estimation overhead
-        if (node.elements.size() <= MIN_CHUNK_SIZE) {
-            return false;
-        }
-
-        // Check if we're in a context that allows refactoring
-        return !emitterVisitor.ctx.javaClassInfo.gotoLabelStack.isEmpty();
+        return false;
     }
     /**
diff --git a/src/main/java/org/perlonjava/frontend/parser/StringParser.java b/src/main/java/org/perlonjava/frontend/parser/StringParser.java
index def56acce..678b9e8fd 100644
---
a/src/main/java/org/perlonjava/frontend/parser/StringParser.java
+++ b/src/main/java/org/perlonjava/frontend/parser/StringParser.java
@@ -287,16 +287,37 @@ static Node parseRegexString(EmitterContext ctx, ParsedString rawStr, Parser par
     public static ListNode parseWordsString(ParsedString rawStr) {
-        // Use a regular expression to split the string.
-        // "\\s+" matches any whitespace including \r\n, \n, \t, space, etc.
-        String trimmed = rawStr.buffers.getFirst().trim();
+        String input = rawStr.buffers.getFirst();
+        char startDelim = rawStr.startDelim;
+        char endDelim = rawStr.endDelim;
+
+        StringBuilder processed = new StringBuilder();
+        char[] chars = input.toCharArray();
+        int length = chars.length;
+        int index = 0;
+        while (index < length) {
+            char ch = chars[index];
+            if (ch == '\\') {
+                index++;
+                if (index < length) {
+                    char nextChar = chars[index];
+                    if (nextChar == '\\' || nextChar == startDelim || nextChar == endDelim) {
+                        processed.append(nextChar);
+                    } else {
+                        processed.append('\\').append(nextChar);
+                    }
+                }
+            } else {
+                processed.append(ch);
+            }
+            index++;
+        }
+
+        String trimmed = processed.toString().trim();
         ListNode list = new ListNode(rawStr.index);
-
-        // If the string is empty after trimming, return an empty list
         if (trimmed.isEmpty()) {
             return list;
         }
-
         String[] words = trimmed.split("\\s+");
         for (String word : words) {
             list.elements.add(new StringNode(word, rawStr.index));
diff --git a/src/main/java/org/perlonjava/runtime/operators/BitwiseOperators.java b/src/main/java/org/perlonjava/runtime/operators/BitwiseOperators.java
index 0d26a4183..85d9174bd 100644
--- a/src/main/java/org/perlonjava/runtime/operators/BitwiseOperators.java
+++ b/src/main/java/org/perlonjava/runtime/operators/BitwiseOperators.java
@@ -205,8 +205,10 @@ public static RuntimeScalar integerBitwiseNot(RuntimeScalar runtimeScalar) {
             return bitwiseNotDot(val);
         }
-        // Use signed 32-bit integer semantics
-        int value = val.getInt();
+        // Must use 32-bit int (not long) to match
ivsize=4 in Config.pm.
+        // Using long would make ~3 return -4 as a 64-bit value, breaking bop.t tests
+        // that expect 32-bit signed integer semantics under "use integer".
+        int value = (int) val.getLong();
         int result = ~value;
         return new RuntimeScalar(result);
     }
@@ -486,12 +488,16 @@ private static RuntimeScalar shiftRightInternal(long value, long shift, boolean
      * Performs a left shift operation with signed (integer) semantics.
      * This is used when "use integer" pragma is in effect.
      *
+     * IMPORTANT: Must use 32-bit int arithmetic and >= 32 boundaries (not 64-bit long / >= 64).
+     * PerlOnJava reports ivsize=4 in Config.pm, so bop.t expects 32-bit word-size behavior:
+     * "use integer; 1 << 32" must return 0, and "1 << 31" must return -2147483648 (signed).
+     * The "shift < 0" guard after negation catches Long.MIN_VALUE overflow (-Long.MIN_VALUE == Long.MIN_VALUE).
+     *
      * @param runtimeScalar The operand to be shifted.
      * @param arg2 The number of positions to shift.
      * @return A new RuntimeScalar with the result of the integer left shift operation.
      */
     public static RuntimeScalar integerShiftLeft(RuntimeScalar runtimeScalar, RuntimeScalar arg2) {
-        // Check for uninitialized values and generate warnings
         if (!runtimeScalar.getDefinedBoolean()) {
             WarnDie.warn(new RuntimeScalar("Use of uninitialized value in left bitshift (<<)"),
                     RuntimeScalarCache.scalarEmptyString);
@@ -501,15 +507,13 @@ public static RuntimeScalar integerShiftLeft(RuntimeScalar runtimeScalar, Runtim
                     RuntimeScalarCache.scalarEmptyString);
         }
-        // Convert string type to number if necessary
         if (runtimeScalar.isString()) {
             runtimeScalar = NumberParser.parseNumber(runtimeScalar);
         }
-        int value = runtimeScalar.getInt();
+        int value = (int) runtimeScalar.getLong();
         long shift = arg2.getLong();
-        // Handle negative shift (reverse direction: left becomes right)
         if (shift < 0) {
             shift = -shift;
             if (shift < 0 || shift >= 32) {
@@ -530,13 +534,13 @@ public static RuntimeScalar integerShiftLeft(RuntimeScalar runtimeScalar, Runtim
     /**
      * Performs a right shift operation with signed (integer) semantics.
      * This is used when "use integer" pragma is in effect.
+     * See integerShiftLeft javadoc for ivsize=4 / 32-bit constraints.
      *
      * @param runtimeScalar The operand to be shifted.
      * @param arg2 The number of positions to shift.
      * @return A new RuntimeScalar with the result of the integer right shift operation.
      */
     public static RuntimeScalar integerShiftRight(RuntimeScalar runtimeScalar, RuntimeScalar arg2) {
-        // Check for uninitialized values and generate warnings
         if (!runtimeScalar.getDefinedBoolean()) {
             WarnDie.warn(new RuntimeScalar("Use of uninitialized value in right bitshift (>>)"),
                     RuntimeScalarCache.scalarEmptyString);
@@ -546,15 +550,13 @@ public static RuntimeScalar integerShiftRight(RuntimeScalar runtimeScalar, Runti
                     RuntimeScalarCache.scalarEmptyString);
         }
-        // Convert string type to number if necessary
         if (runtimeScalar.isString()) {
             runtimeScalar = NumberParser.parseNumber(runtimeScalar);
         }
-        int value = runtimeScalar.getInt();
+        int value = (int) runtimeScalar.getLong();
         long shift = arg2.getLong();
-        // Handle negative shift (reverse direction: right becomes left)
         if (shift < 0) {
             shift = -shift;
             if (shift < 0 || shift >= 32) {
diff --git a/src/main/java/org/perlonjava/runtime/operators/IOOperator.java b/src/main/java/org/perlonjava/runtime/operators/IOOperator.java
index ae2786970..54289fdeb 100644
--- a/src/main/java/org/perlonjava/runtime/operators/IOOperator.java
+++ b/src/main/java/org/perlonjava/runtime/operators/IOOperator.java
@@ -72,9 +72,8 @@ public static RuntimeScalar select(RuntimeList runtimeList, int ctx) {
     }
     public static RuntimeScalar seek(RuntimeScalar fileHandle, RuntimeList runtimeList) {
-        if (fileHandle.type == RuntimeScalarType.GLOB || fileHandle.type == RuntimeScalarType.GLOBREFERENCE) {
-            // File handle
-            RuntimeIO runtimeIO = fileHandle.getRuntimeIO();
+        RuntimeIO runtimeIO = fileHandle.getRuntimeIO();
+        if (runtimeIO != null) {
             if (runtimeIO.ioHandle != null) {
                 if (runtimeIO instanceof TieHandle tieHandle) {
                     return TieHandle.tiedSeek(tieHandle, runtimeList);
diff --git a/src/main/java/org/perlonjava/runtime/operators/MathOperators.java b/src/main/java/org/perlonjava/runtime/operators/MathOperators.java
index abc71e583..677b8e4ab 100644
--- a/src/main/java/org/perlonjava/runtime/operators/MathOperators.java
+++
b/src/main/java/org/perlonjava/runtime/operators/MathOperators.java
@@ -433,17 +433,14 @@ public static RuntimeScalar modulusAssign(RuntimeScalar arg1, RuntimeScalar arg2
      * @return A new RuntimeScalar representing the integer division result.
      */
     public static RuntimeScalar integerDivide(RuntimeScalar arg1, RuntimeScalar arg2) {
-        // Convert to integers and perform integer division
-        int dividend = arg1.getInt();
-        int divisor = arg2.getInt();
+        long dividend = arg1.getLong();
+        long divisor = arg2.getLong();
-        // Handle division by zero
         if (divisor == 0) {
             throw new PerlCompilerException("Illegal division by zero");
         }
-        // Perform integer division
-        int result = dividend / divisor;
+        long result = dividend / divisor;
         return new RuntimeScalar(result);
     }
@@ -456,18 +453,14 @@ public static RuntimeScalar integerDivide(RuntimeScalar arg1, RuntimeScalar arg2
      * @return A new RuntimeScalar representing the integer modulus.
      */
     public static RuntimeScalar integerModulus(RuntimeScalar arg1, RuntimeScalar arg2) {
-        // Convert to integers and perform integer modulus
-        int dividend = arg1.getInt();
-        int divisor = arg2.getInt();
+        long dividend = arg1.getLong();
+        long divisor = arg2.getLong();
-        // Handle division by zero
         if (divisor == 0) {
             throw new PerlCompilerException("Illegal modulus zero");
         }
-        // In Perl with "use integer", modulus follows C99 truncated division rules
-        // The result has the sign of the dividend, not the divisor
-        int result = dividend % divisor;
+        long result = dividend % divisor;
         return new RuntimeScalar(result);
     }
diff --git a/src/main/java/org/perlonjava/runtime/operators/ModuleOperators.java b/src/main/java/org/perlonjava/runtime/operators/ModuleOperators.java
index 0436cdec6..b8b061d4a 100644
--- a/src/main/java/org/perlonjava/runtime/operators/ModuleOperators.java
+++ b/src/main/java/org/perlonjava/runtime/operators/ModuleOperators.java
@@ -1,6 +1,7 @@
 package org.perlonjava.runtime.operators;
 import org.perlonjava.app.cli.CompilerOptions;
+import org.perlonjava.backend.bytecode.InterpreterState;
 import org.perlonjava.core.Configuration;
 import org.perlonjava.runtime.runtimetypes.*;
 import org.perlonjava.app.scriptengine.PerlLanguageProvider;
@@ -616,6 +617,7 @@ else if (code == null) {
         RuntimeList result;
         FeatureFlags outerFeature = featureManager;
+        String savedPackage = InterpreterState.currentPackage.get().toString();
         try {
             featureManager = new FeatureFlags();
@@ -635,6 +637,7 @@ else if (code == null) {
             return new RuntimeScalar(); // return undef
         } finally {
             featureManager = outerFeature;
+            InterpreterState.currentPackage.get().set(savedPackage);
         }
         // Return result based on context
diff --git a/src/main/java/org/perlonjava/runtime/operators/Time.java b/src/main/java/org/perlonjava/runtime/operators/Time.java
index 06a9d7b83..5836ec8ec 100644
--- a/src/main/java/org/perlonjava/runtime/operators/Time.java
+++ b/src/main/java/org/perlonjava/runtime/operators/Time.java
@@ -8,7 +8,9 @@
 import java.time.ZoneId;
 import java.time.ZoneOffset;
 import java.time.ZonedDateTime;
-import java.time.format.DateTimeFormatter;
+import java.time.format.DateTimeFormatterBuilder;
+import java.time.format.TextStyle;
+import java.util.Locale;
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.ScheduledFuture;
@@ -89,7 +91,7 @@ public static RuntimeList localtime(RuntimeList args, int ctx) {
         if (args.isEmpty()) {
             date = ZonedDateTime.now();
         } else {
-            long arg = args.getFirst().getInt();
+            long arg = args.getFirst().getLong();
             date = Instant.ofEpochSecond(arg).atZone(ZoneId.systemDefault());
         }
         return getTimeComponents(ctx, date);
@@ -107,16 +109,29 @@ public static RuntimeList gmtime(RuntimeList args, int ctx) {
         if (args.isEmpty()) {
             date = ZonedDateTime.now(ZoneOffset.UTC);
         } else {
-            long arg = args.getFirst().getInt();
+            long arg = args.getFirst().getLong();
             date = Instant.ofEpochSecond(arg).atZone(ZoneId.of("UTC"));
         }
         return getTimeComponents(ctx,
date);
     }
+    // Perl's scalar gmtime/localtime returns ctime(3) format: "Sun Jan 1 00:00:00 1970"
+    // Do NOT use DateTimeFormatter.RFC_1123_DATE_TIME — it produces "Sun, 1 Jan 1970 00:00:00 GMT"
+    // which has wrong field order/format, and crashes with DateTimeException for years > 4 digits.
+    private static String formatCtime(ZonedDateTime date) {
+        String dow = date.getDayOfWeek().getDisplayName(TextStyle.SHORT, Locale.ENGLISH);
+        String mon = date.getMonth().getDisplayName(TextStyle.SHORT, Locale.ENGLISH);
+        int day = date.getDayOfMonth();
+        String dayStr = day < 10 ? " " + day : String.valueOf(day);
+        int h = date.getHour(), m = date.getMinute(), s = date.getSecond();
+        int year = date.getYear();
+        return String.format("%s %s %s %02d:%02d:%02d %d", dow, mon, dayStr, h, m, s, year);
+    }
+
     private static RuntimeList getTimeComponents(int ctx, ZonedDateTime date) {
         RuntimeList res = new RuntimeList();
         if (ctx == RuntimeContextType.SCALAR) {
-            res.add(date.format(DateTimeFormatter.RFC_1123_DATE_TIME));
+            res.add(formatCtime(date));
             return res;
         }
         // 0 1 2 3 4 5 6 7 8
@@ -127,7 +142,8 @@ private static RuntimeList getTimeComponents(int ctx, ZonedDateTime date) {
         res.add(date.getDayOfMonth());
         res.add(date.getMonth().getValue() - 1);
         res.add(date.getYear() - 1900);
-        res.add(date.getDayOfWeek().getValue());
+        // Java DayOfWeek: 1=Mon..7=Sun; Perl wday: 0=Sun..6=Sat. The % 7 maps 7(Sun)->0, 1(Mon)->1, etc.
+        res.add(date.getDayOfWeek().getValue() % 7);
         res.add(date.getDayOfYear() - 1);
         res.add(date.getZone().getRules().isDaylightSavings(date.toInstant()) ?
1 : 0);
         return res;
diff --git a/src/main/java/org/perlonjava/runtime/regex/RuntimeRegex.java b/src/main/java/org/perlonjava/runtime/regex/RuntimeRegex.java
index a89d9d283..e02888389 100644
--- a/src/main/java/org/perlonjava/runtime/regex/RuntimeRegex.java
+++ b/src/main/java/org/perlonjava/runtime/regex/RuntimeRegex.java
@@ -439,6 +439,12 @@ private static RuntimeBase matchRegexDirect(RuntimeScalar quotedRegex, RuntimeSc
         boolean isPosDefined = posScalar.getDefinedBoolean();
         int startPos = isPosDefined ? posScalar.getInt() : 0;
+        // Only use pos() for /g matches - non-/g matches always start from 0
+        if (!regex.regexFlags.isGlobalMatch()) {
+            isPosDefined = false;
+            startPos = 0;
+        }
+
         // Check if previous call had zero-length match at this position (for SCALAR context)
         // This prevents infinite loops in: while ($str =~ /pat/g)
         if (regex.regexFlags.isGlobalMatch() && ctx == RuntimeContextType.SCALAR) {
diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/InputRecordSeparator.java b/src/main/java/org/perlonjava/runtime/runtimetypes/InputRecordSeparator.java
index 0cf7e4adf..d90ac2f58 100644
--- a/src/main/java/org/perlonjava/runtime/runtimetypes/InputRecordSeparator.java
+++ b/src/main/java/org/perlonjava/runtime/runtimetypes/InputRecordSeparator.java
@@ -29,6 +29,16 @@ public RuntimeScalar set(RuntimeScalar value) {
             return set(value.tiedFetch());
         }
+        if (value instanceof RuntimeBaseProxy) {
+            RuntimeScalar resolved = ((RuntimeBaseProxy) value).lvalue;
+            if (resolved != null) {
+                return set(resolved);
+            }
+            if (value instanceof ScalarSpecialVariable) {
+                return set(((ScalarSpecialVariable) value).getValueAsScalar());
+            }
+        }
+
         // Store current value in case validation fails
         int oldType = this.type;
         Object oldValue = this.value;
diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeHash.java b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeHash.java
index 92afbb00e..e6256093d 100644
---
a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeHash.java
+++ b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeHash.java
@@ -556,7 +556,9 @@ public void setSlice(RuntimeList keys, RuntimeList values) {
                 // If we run out of values, use undef
                 value = new RuntimeScalar();
             }
-            this.put(key, value);
+            RuntimeScalar copy = new RuntimeScalar();
+            copy.set(value);
+            this.put(key, copy);
         }
     }
diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeList.java b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeList.java
index 9c020d76e..d2b754a1a 100644
--- a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeList.java
+++ b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeList.java
@@ -367,6 +367,24 @@ public RuntimeScalar createReference() {
      *
      * @return A new RuntimeList with references to the elements of this list.
      */
+    public RuntimeList flattenElements() {
+        RuntimeList result = new RuntimeList();
+        for (RuntimeBase element : this.elements) {
+            if (element instanceof RuntimeArray array) {
+                for (int i = 0; i < array.size(); i++) {
+                    result.elements.add(array.get(i));
+                }
+            } else if (element instanceof RuntimeHash hash) {
+                for (RuntimeScalar val : hash.values()) {
+                    result.elements.add(val);
+                }
+            } else {
+                result.elements.add(element);
+            }
+        }
+        return result;
+    }
+
     public RuntimeList createListReference() {
         RuntimeList result = new RuntimeList();
         List resultList = result.elements;
diff --git a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeScalar.java b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeScalar.java
index 49090c89d..e6a4471e2 100644
--- a/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeScalar.java
+++ b/src/main/java/org/perlonjava/runtime/runtimetypes/RuntimeScalar.java
@@ -589,7 +589,11 @@ private boolean getBooleanLarge() {
             case JAVAOBJECT -> value != null;
             case TIED_SCALAR -> this.tiedFetch().getBoolean();
             case DUALVAR -> ((DualVar)
this.value).stringValue().getBoolean();
-            case CODE -> true;
+            case CODE -> {
+                if (value == null) yield false;
+                RuntimeCode code = (RuntimeCode) value;
+                yield code.packageName != null || code.subName != null || code.defined();
+            }
             default -> Overload.boolify(this).getBoolean();
         };
     }
@@ -642,6 +646,12 @@ public RuntimeScalar set(RuntimeScalar value) {
         if (this.type == TIED_SCALAR) {
             return this.tiedStore(value);
         }
+        if (value instanceof ScalarSpecialVariable) {
+            RuntimeScalar resolved = ((ScalarSpecialVariable) value).getValueAsScalar();
+            this.type = resolved.type;
+            this.value = resolved.value;
+            return this;
+        }
         this.type = value.type;
         this.value = value.value;
         return this;
@@ -758,7 +768,7 @@ private String toStringLarge() {
             case JAVAOBJECT -> value.toString();
             case TIED_SCALAR -> this.tiedFetch().toString();
             case DUALVAR -> ((DualVar) this.value).stringValue().toString();
-            case CODE -> ((RuntimeCode) value).defined() ? Overload.stringify(this).toString() : "";
+            case CODE -> Overload.stringify(this).toString();
             default -> {
                 if (type == REGEX) yield value.toString();
                 yield Overload.stringify(this).toString();
diff --git a/src/main/perl/lib/Digest/SHA.pm b/src/main/perl/lib/Digest/SHA.pm
index da11f3e7e..41f7a08a9 100644
--- a/src/main/perl/lib/Digest/SHA.pm
+++ b/src/main/perl/lib/Digest/SHA.pm
@@ -98,152 +98,27 @@ sub base64digest {
 }
 # Functional interface implementations
-sub sha1 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('1');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha1_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('1');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha1_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('1');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha224 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('224');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha224_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('224');
-    $sha->add($data);
-    return
$sha->hexdigest;
-}
-
-sub sha224_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('224');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha256 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('256');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha256_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('256');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha256_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('256');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha384 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('384');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha384_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('384');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha384_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('384');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha512 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha512_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha512_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha512224 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512224');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha512224_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512224');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha512224_base64 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512224');
-    $sha->add($data);
-    return $sha->b64digest;
-}
-
-sub sha512256 {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512256');
-    $sha->add($data);
-    return $sha->digest;
-}
-
-sub sha512256_hex {
-    my $data = shift;
-    my $sha = Digest::SHA->new('512256');
-    $sha->add($data);
-    return $sha->hexdigest;
-}
-
-sub sha512256_base64 {
-    my $data = shift;
-    my $sha =
Digest::SHA->new('512256');
-    $sha->add($data);
-    return $sha->b64digest;
-}
+sub sha1 { my $sha = Digest::SHA->new('1'); $sha->add(@_); return $sha->digest }
+sub sha1_hex { my $sha = Digest::SHA->new('1'); $sha->add(@_); return $sha->hexdigest }
+sub sha1_base64 { my $sha = Digest::SHA->new('1'); $sha->add(@_); return $sha->b64digest }
+sub sha224 { my $sha = Digest::SHA->new('224'); $sha->add(@_); return $sha->digest }
+sub sha224_hex { my $sha = Digest::SHA->new('224'); $sha->add(@_); return $sha->hexdigest }
+sub sha224_base64 { my $sha = Digest::SHA->new('224'); $sha->add(@_); return $sha->b64digest }
+sub sha256 { my $sha = Digest::SHA->new('256'); $sha->add(@_); return $sha->digest }
+sub sha256_hex { my $sha = Digest::SHA->new('256'); $sha->add(@_); return $sha->hexdigest }
+sub sha256_base64 { my $sha = Digest::SHA->new('256'); $sha->add(@_); return $sha->b64digest }
+sub sha384 { my $sha = Digest::SHA->new('384'); $sha->add(@_); return $sha->digest }
+sub sha384_hex { my $sha = Digest::SHA->new('384'); $sha->add(@_); return $sha->hexdigest }
+sub sha384_base64 { my $sha = Digest::SHA->new('384'); $sha->add(@_); return $sha->b64digest }
+sub sha512 { my $sha = Digest::SHA->new('512'); $sha->add(@_); return $sha->digest }
+sub sha512_hex { my $sha = Digest::SHA->new('512'); $sha->add(@_); return $sha->hexdigest }
+sub sha512_base64 { my $sha = Digest::SHA->new('512'); $sha->add(@_); return $sha->b64digest }
+sub sha512224 { my $sha = Digest::SHA->new('512224'); $sha->add(@_); return $sha->digest }
+sub sha512224_hex { my $sha = Digest::SHA->new('512224'); $sha->add(@_); return $sha->hexdigest }
+sub sha512224_base64 { my $sha = Digest::SHA->new('512224'); $sha->add(@_); return $sha->b64digest }
+sub sha512256 { my $sha = Digest::SHA->new('512256'); $sha->add(@_); return $sha->digest }
+sub sha512256_hex { my $sha = Digest::SHA->new('512256'); $sha->add(@_); return $sha->hexdigest }
+sub sha512256_base64 { my $sha = Digest::SHA->new('512256');
$sha->add(@_); return $sha->b64digest }
 # File hashing utilities
 sub shasum {