
Conversation

@ana-pantilie (Contributor) commented Nov 19, 2025

@ana-pantilie changed the title from "Value costing: insertCoin, unionValue" to "Value costing: insertCoin, unionValue, scaleValue" on Nov 20, 2025

@ana-pantilie force-pushed the ana/value-costing-final branch 3 times, most recently from 1e44958 to abdc6ed on December 5, 2025 at 14:15
@ana-pantilie requested a review from @Unisay on December 18, 2025 at 19:10
@github-actions bot commented Dec 18, 2025

PR Preview Action v1.6.3

🚀 View preview at
https://IntersectMBO.github.io/plutus/pr-preview/cost-models/pr-7435/

Built to branch gh-pages at 2025-12-24 12:16 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

"memory": {
"arguments": {
"intercept": 24,
"slope": 21
Member
How are the slope and intercept determined?

Contributor Author

See the note.

generateConstrainedValueWithMaxPolicyAndQuantity numPolicies tokensPerPolicy maxBound

-- | Generate constrained Value with information about max-size policy and quantity
generateConstrainedValueWithMaxPolicyAndQuantity
Member

Did you intend to use this somewhere, perhaps in scaleValueArgs? Otherwise there's not much point factoring this out.

Contributor Author

I think this is a remnant of the experimentation which happened on this branch. I'll revert the change if I have time.

- 2 words for the key 'Int' closure
- 2 words for the value 'Int' closure
Putting this all together, we arrive at the following memory usage function:
1*n + 4 + 7 + 1
Member

It's unclear what the "4", "7" and "1" refer to. Please make it more obvious.

Contributor Author (@ana-pantilie, Dec 23, 2025)

They come from drawing out the closures based on my understanding of GHC's memory allocation and counting the number of words (which are mentioned in the note). I don't think this methodology is the best, so we might end up removing this note in the end.

Member

The note is useful, just organize it better and make it easier to follow. There's a lot of "1 word for this", "2 words for that", etc., and it is non-obvious what the "4", "7" and "1" correspond to.

... and the allocations for the 'IntMap' and nested 'Map' themselves.
The 'Map' and 'IntMap' are implemented as a self-balancing binary tree and a big-endian patricia tree,
respectively. Since we are interested in the worst-case memory usage, we assume that
for each application of the builtin function both of the trees have to be completely reallocated.
Member

This only applies to unionValue, since in insertCoin and scaleValue there's only one tree.

Also, insertCoin never reallocates the whole tree, even in the worst case.

"intercept": 45,
"slope": 21
},
"type": "linear_in_w"
Member

I'd still like to know if insertCoin is CPU bound or memory bound.

Contributor Author

I don't know, I haven't had time to check this yet.

Member

It's important to know this because if it exhausts memory 10 times faster than it exhausts CPU, it indicates that the memory cost is nonsensical and can severely limit the usefulness of Value.

Contributor Author

I had time today to do this; here are the results. I created and ran the following test:

compiledInsertCoin :: CompiledCode BuiltinValue
compiledInsertCoin =
  $$(compile [||insertCoin||])
    `unsafeApplyCode` liftCodeDef "foo"
    `unsafeApplyCode` liftCodeDef "bar"
    `unsafeApplyCode` liftCodeDef 10
    `unsafeApplyCode` liftCodeDef (mkValueInData 1000000)

mkValueInData :: Int -> BuiltinValue
mkValueInData n =
  let currSymbols = (\x -> B.mkB $ "cS" <> B.encodeUtf8 (B.toBuiltin (Text.pack (show x)))) <$> take n [1 ..]
      tokN = B.mkB "tN"
      entries =
        B.mkMap
          [ ( cS
            , B.mkMap [(tokN, B.mkI 100)]
            )
          | cS <- currSymbols
          ]
   in B.unsafeDataAsValue entries

The resulting budget is:

CPU:                   935_697
Memory:                  1_786
AST Size:                   12
Flat Size:          17_888_928

I would say that's definitely not memory bound, unless I didn't construct the example properly.

Member

You can do a rough calculation like this: when you insertCoin into a Value of max depth 10:

CPU = (18413 * 10 + 356924) / 10000000000
= 0.00541054% of the transaction CPU limit

Memory = (21 * 10 + 45) / 14000000
= 0.001821428% of the transaction memory limit

So we are good.
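For reference, the same arithmetic written out as a quick sketch, assuming the per-transaction limits quoted above (10,000,000,000 CPU units and 14,000,000 memory units):

```haskell
-- Sketch only: reproduces the rough calculation above for one insertCoin call
-- into a Value of depth 10, as a fraction of the assumed transaction limits.
cpuFraction, memFraction :: Double
cpuFraction = (18413 * 10 + 356924) / 10000000000  -- ~5.4e-5, i.e. ~0.0054 %
memFraction = (21 * 10 + 45) / 14000000            -- ~1.8e-5, i.e. ~0.0018 %
```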

respectively. Since we are interested in the worst-case memory usage, we assume that
for each application of the builtin function both of the trees have to be completely reallocated.
Since we are dealing with a nested 'Map' with the structure 'Map CurrencySymbol (Map TokenName Integer)',
we can consider only the worst-case scenario where the map is flattened, i.e., each currency symbol maps
Member

I don't know why this is the worst case memory-wise, and I highly doubt this is the worst case for insertCoin. When inserting a node into a Map, usually the deeper the Map is, the more allocation is needed. I imagine a deep Map also makes rebalancing trickier.

Contributor Author (@ana-pantilie, Dec 23, 2025)

I don't know why this is the worst case memory-wise, and I highly doubt this is the worst case for insertCoin.

It's not, I was mistaken. @kwxm, @Unisay and I had a meeting a few days ago where we discussed this. I think the structure of the map doesn't matter when it comes to how much memory is allocated (for one map), only the number of nodes does. But that doesn't really matter much, because we care about the new memory allocated by running the builtin, not the total memory.

I imagine a deep Map also makes rebalancing trickier.

You're probably right, but it's tricky to figure out what the worst case is and how much memory is allocated.

I think we need to approach memory costing for the Value builtins similarly to the way we approach CPU costing: by experimentally finding the worst case and generating the data we can base predictions on. I don't think we currently do this for the memory costing of any of the other builtins, so we would also need to add the required code infrastructure to generate the models using R.

At the same time, @kwxm said we gave up on doing accurate memory costing a while ago, but I don't know the reason behind this.

Member

It is almost always sufficient to determine the amount of allocation by understanding how the algorithm works. I don't even know how you can accurately measure allocations from a specific builtin call - seems quite tricky.

For insertCoin, the key is to be aware that rebalancing allocates at most O(log n) nodes. I think it's more than enough to assume that it allocates ValueMaxDepth number of nodes and be done with it. If you want to be more precise, feel free to read the paper to determine the exact worst case (which should still be far easier than trying to measure it accurately), but for memory cost we don't need to be that precise.
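A minimal sketch of what that assumption amounts to, reusing the 21-words-per-entry and 12-word-constant figures from the memory note (the function name and the depth parameter are illustrative, not definitions from the PR):

```haskell
-- Illustrative only: assume an insertion allocates at most one node per level
-- of the tree, so charge for at most ValueMaxDepth entries at 21 words each.
insertCoinMemWords :: Integer -> Integer
insertCoinMemWords valueMaxDepth = 21 * valueMaxDepth + 12
```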

@zliu41 (Member) left a comment

The Value builtins should now be added to PlutusLedgerApi.Common.Versions.

@zliu41 (Member) left a comment

Looks good but we need @kwxm to sign off on this.

"intercept": 45,
"slope": 21
},
"type": "linear_in_w"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do a rough calculation like this: when you insertCoin into a Value of max depth 10:

CPU = (18413 * 10 + 356924) / 10000000000
= 0.00541054% of the transaction CPU limit

Memory = (21 * 10 + 45) / 14000000
= 0.001821428% of the transaction memory limit

So we are good.

@kwxm (Contributor) left a comment

Here are some initial comments; I'll continue to review the PR (in particular, the benchmarking code: it's going to take me a while to grasp the details of that) and add more comments later. It all looks pretty sensible so far though.

-- | a*x + b*y + c*x*y + I
data TwoVariableWithInteractionFunction = TwoVariableWithInteractionFunction
{ intercept :: Intercept
, slopex :: Slope
Contributor

I'd suggest not including x and y in the parameter names here since it's conceivable that we could re-use this for something else where it's applied to other parameters. Similar costing functions use slope1 and slope2 instead, and I suppose we could have slope12 as well; alternatively, this is a type of polynomial (and the graph isn't a plane, so it doesn't really make sense to call the coefficients "slopes"), so we could use the already existing Coefficient00, Coefficient01, Coefficient10, and Coefficient11 instead, although that's a bit lengthy.

This thing is in fact an instance of the TwoVariableQuadraticFunction type (with the x^2 and y^2 coefficients equal to zero), but it'd probably be overkill to use that instead.
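One possible shape for that renaming, written out as a sketch (the field names are only a suggestion; Intercept and Slope are assumed to be the types already used in the snippet above):

```haskell
-- Sketch of the suggested renaming: number the coefficients (1, 2, 12) rather
-- than tying them to x and y, and prefix the fields to avoid name clashes.
data TwoVariableWithInteractionFunction = TwoVariableWithInteractionFunction
  { twoVariableWithInteractionFunctionIntercept :: Intercept
  , twoVariableWithInteractionFunctionSlope1    :: Slope
  , twoVariableWithInteractionFunctionSlope2    :: Slope
  , twoVariableWithInteractionFunctionSlope12   :: Slope
  }
```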

(ModelTwoArgumentsConstantCost c) = lazy $ \_ _ -> CostLast c
runTwoArgumentModel
(ModelTwoArgumentsAddedSizes (OneVariableLinearFunction intercept slope)) =
(ModelTwoArgumentsAddedSizes (OneVariableLinearFunction intercept' slope)) =
Contributor

Are these all intercept' now because TwoVariableWithInteractionFunction contains a field called intercept? It seems questionable to change a large number of variable names so that the simple form of the name can be used in a small number of places. Note that other similar types have unwieldy field names like oneVariableLinearFunctionIntercept to avoid clashes and leave the simple names available for use elsewhere.

benchname <- regex("([[:alnum:]_]+)/(\\d+)(?:/(\\d+))?(?:/(\\d+))?(?:/(\\d+))?")
## We have benchmark names like "AddInteger/11/22", the numbers representing the sizes of
## the inputs to the benchmark. This extracts the name and up to three numbers, returning
## "NA" for any that are missing. If we ever have builtins with more than three arguments
Contributor (@kwxm, Jan 5, 2026)

This comment's outdated now because we do in fact have builtins with more than three arguments and have had for some time. I think this is the first time we've had to base costing on an argument after the third one though.

## constant anyway).

numbercols = c("x_mem", "y_mem", "z_mem")
numbercols = c("x_mem", "y_mem", "z_mem", "w_mem")
Contributor

A very small quibble: when I've been working with benchmark results in R I've been in the habit of calling the arguments x, y, z, u, v, and w, as in the toframe script. I don't think that notation's actually made it into the codebase anywhere before though. I'm not sure what we'll do if we have to worry about a seventh argument at some time in the future. It's a bit late now, but maybe we should have called the arguments x1, x2, x3, ...

filtered <- data %>%
    filter.and.check.nonempty(fname) %>%
    discard.overhead ()
filtered$off <- 1 # offset to force inclusion of intercept in the model
Contributor

What happens if you let lm infer the intercept? Do you get some huge number? I think it's generally regarded as bad practice to force models to go through specific points since it can cause all kinds of distortion. This will probably give you estimates smaller than the real value for small inputs, but maybe that's acceptable if it gives better estimates for bigger values. Admittedly, letting lm infer the intercept means that you sometimes end up with a negative intercept, and there's code in this file to adjust it if that happens.

Contributor (@kwxm, Jan 5, 2026)

Oh wait, that gets ignored anyway because it does infer a negative intercept: when you run generate-cost-model it says (amongst other things)

** WARNING: a negative coefficient -3216.982664 for (Intercept) occurred in the model for unValueData. This has been adjusted to 0.001.
** WARNING: a negative coefficient -764.798877 for (Intercept) occurred in the model for unionValue. This has been adjusted to 0.001.
** WARNING: a negative coefficient -2163.609705 for (Intercept) occurred in the model for scaleValue. This has been adjusted to 0.001.

So in the JSON file these all end up with an intercept of 1000 anyway. I'm not sure what's going on here...

var1
var2 =
printf
"%d + %d + %d*%s + %d*%s + %d*%s*%s"
Contributor (@kwxm, Jan 5, 2026)

Suggested change
"%d + %d + %d*%s + %d*%s + %d*%s*%s"
"%d + %d*%s + %d*%s + %d*%s*%s"

There's an extra %d at the start.

unValueDataModel <- linearInX ("UnValueData")

# Y wrapped with `TotalSize` (contained value size)
scaleValueModel <- linearInY ("ScaleValue")
Contributor

I'm still puzzled by the shape of the data here, but making it linear in y does give a good overapproximation for all values of y. The downward bend at the start drags the regression line down quite a bit and you end up with a negative intercept, but later that gets adjusted to 1000 and we end up with a line that lies just above all of the benchmark times.

Comment on lines +294 to +300
, LookupCoin
, InsertCoin
, ValueContains
, UnionValue
, ScaleValue
, ValueData
, UnValueData
Contributor

Suggested change
, LookupCoin
, InsertCoin
, ValueContains
, UnionValue
, ScaleValue
, ValueData
, UnValueData
, InsertCoin
, LookupCoin
, UnionValue
, ValueContains
, ValueData
, UnValueData
, ScaleValue

^ That's the same order as the constructors in Builtins.hs. It probably doesn't matter, but let's be consistent.

@kwxm (Contributor) left a comment

I've added some more comments. I still need to check the benchmarking code though (not that I expect any problems, since the benchmark results look quite reasonable).

| UnValueData'cpu'arguments'intercept
| UnValueData'cpu'arguments'slope
| UnValueData'memory'arguments
| InsertCoin'cpu'arguments'intercept
Contributor

Maybe we should also put the parameter names in the same order as the constructors in Builtins.hs (and also for V2 and V3). Again, it probably doesn't matter, but it'd make it a bit easier to match up the parameters with the relevant builtins.

<> Benchmarks.Unit.makeBenchmarks gen
<> Benchmarks.Values.makeBenchmarks gen
<> Benchmarks.Values.makeBenchmarks
gen
Contributor

Why is this on a new line? Is fourmolu being awkward?

Shortcircuits if either value is empty.
Since 'unionValue' is commutative, we switch the arguments whenever the second
Contributor

This means that we effectively always evaluate unionValue x y with the total size of x greater than the total size of y, right? In the costing benchmarks for value it says

therefore we consider two Values where the first is a sub-value of the second.

and that's the case where the total size of x is less than the total size of y. For a horrible moment I thought that that was the wrong way round, but it doesn't matter because the implementation will swap them anyway. In fact maybe that's better, because it'll include the (very small) swapping time in the costs as well.
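A toy illustration of the swap being discussed (the type and names here are stand-ins for the builtin's actual representation and merge, not the PR's code):

```haskell
import qualified Data.Map.Strict as Map

-- Toy model of a Value and of a union that always keeps the larger argument
-- on the left; the swap is safe because the union is commutative.
type ToyValue = Map.Map String (Map.Map String Integer)

totalSize :: ToyValue -> Int
totalSize = sum . map Map.size . Map.elems

toyUnionValue :: ToyValue -> ToyValue -> ToyValue
toyUnionValue x y
  | totalSize x >= totalSize y = merge x y
  | otherwise                  = merge y x
  where
    merge = Map.unionWith (Map.unionWith (+))
```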

-- gets back a 'Data' value it'll traverse all of it.
benchWith params name term =
    bench name $
        whnf (handleEvaluationErrors . evaluateCekNoEmit params) term
Contributor

I may have to change this to use something like evaluateCekLikeInProd (see https://github.com/IntersectMBO/plutus-private/issues/1722), in which case the log messages will disappear again.

Also, I wonder if changing from a non-emitting evaluator to one which does emit will affect the benchmark performance. I guess we can check that by running a few benchmarks with both versions of the evaluator.

@kwxm (Contributor) left a comment

This looks pretty much OK. I've added quite a lot of minor comments, and I think the memory cost for scaleValue needs to be updated.

Since 'scaleValue' needs to traverse the entire Value, we may fix the structure
of the Value to be a flattened map with a single token name per policy ID.
To ensure worst-case performance, we fix the resulting scaled quantities to
be 'maxBound'. -}
Contributor

I wonder if this really is the worst case. All of the multiplications are (maxBound/2) * 2, and perhaps multiplication by 2 will be extra-efficient. Also, is there any danger that caching will speed things up, since we're doing the same multiplication each time? If we're really paranoid we could try something like generating random numbers close to maxBound and setting the scalar and the amount to be the square root of that, or something close. I suspect that most of the work will be done traversing the values, so the cost of the multiplications won't in fact make any significant difference (I could be wrong though!).
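If we did want to try the "random numbers near maxBound" idea, a rough sketch could look like this (purely hypothetical; the PR's generators work differently, and the Double square root is precise enough for choosing benchmark inputs):

```haskell
import System.Random (StdGen, randomR)

-- Sketch: pick a random target near maxBound and split it into two factors of
-- roughly equal magnitude, so the benchmarked multiplications are not just
-- (maxBound/2) * 2.
genScalePair :: StdGen -> ((Integer, Integer), StdGen)
genScalePair gen =
  let hi             = toInteger (maxBound :: Int)
      (target, gen') = randomR (hi `div` 2, hi) gen
      scalar         = floor (sqrt (fromIntegral target :: Double))
      amount         = target `div` scalar
   in ((scalar, amount), gen')
```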

insertCoinArgs gen = do
    lookupArgs <- lookupCoinArgs gen
    let noOfBenchs = length lookupArgs
    amounts <- genZeroOrMaxAmount gen noOfBenchs
Contributor

Just to be annoying, I'd go for "Benches" here instead of "Benchs".

----------------------------------------------------------------------------------------------------
-- ScaleValue --------------------------------------------------------------------------------------

scaleValueBenchmark :: StdGen -> Benchmark
Contributor

I'd still like to work out why the graph is that odd shape (curved at first but then straightening out). The benchmark inputs look perfectly reasonable, so it's a bit mystifying. I don't think we need to let that hold us up though.

## Overview
The memory model for the Value builtin 'unionValue' estimates the heap allocation
Contributor

This isn't just about unionValue: it talks about other builtins later. Presumably when this gets merged it'll need to be updated to deal with the other value builtins too, which might require significant rewriting.

For each unique (currency symbol, token name) pair:
Outer 'Map' (CurrencySymbol to inner 'Map'):
- 1 word for the 'Bin' closure
Contributor

This is talking about the implementation of the Map type: there's nothing to tell the reader what Bin and Tip are, and searching the Plutus code won't help. Maybe include the definition of Map so that we know what this is talking about.
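For reference, this is (essentially) the representation in Data.Map.Internal from the containers package, which is what the Bin/Tip word counts above refer to:

```haskell
-- From containers' Data.Map.Internal (lightly abbreviated): a size-balanced
-- binary tree whose Bin nodes carry the subtree size, a key, a value and the
-- two subtrees; Tip is the empty tree.
data Map k a
  = Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
  | Tip

type Size = Int
```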

For 'unionValue', worst-case assumes disjoint sets of pairs in both 'Value's being united:
Memory = 21*n + 12 + 21*m + 12 = 21*(n + m) + 24
Contributor

It doesn't say what m is, but presumably m and n are the sizes of the arguments.

## Memory Analysis Assumptions
We assume for each builtin application:
1. Both trees must be completely reallocated (worst-case)
Contributor

The reader needs to know that Maps are implemented as trees here. If you look at the definition of Value it's not obvious that trees are involved anywhere.

## Memory Analysis Assumptions
We assume for each builtin application:
Contributor

each builtin application

This is just about unionValue, right?

Remember, an integer is allocated as 3 words (pointer + closure data, where the data is max
of two words size), so for 'scaleValue':
Memory = 3*n
Contributor

This seems incorrect. If you're scaling a value with n nodes, in general the entire internal structure of the value will have to be re-created, so the total memory size of the output will be the same as the total memory size of the input, no? That would mean that the memory used by scaleValue on a value with n nodes is the total memory usage of a value with n nodes, which I think is 21n+12: that's a lot more than 3*n.
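Restated as a formula, using the 21-words-per-entry and 12-word-constant figures from the note (the function name is just for illustration):

```haskell
-- Sketch of the reviewer's estimate: rebuilding the whole structure of a
-- Value with n entries allocates roughly the full per-entry amount again.
scaleValueMemWords :: Integer -> Integer
scaleValueMemWords n = 21 * n + 12
```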

**Total for IntMap: 7 words (constant)**
## Final Formula
Contributor

The final formula for what? I think this is the total memory usage of a value with n entries, but it'd be helpful if it said that explicitly. I'd separate out this section and then use the formula in discussions of specific builtins later.
