Adversarial Audit Paper: Ajna ERC-4626 Vaults + CRE Automation

Date: 2026-03-24
Scope: ajna-finance/4626-ajna-vault, ajna-finance/4626-ajna-vault-keeper, 4626fun/4626, and official CRE documentation.

1. Executive Summary

The Ajna ERC-4626 vault design is technically coherent but economically liveness-sensitive: users mint and redeem shares against totalAssets(), yet actual exits are hard-gated by buffer liquidity (maxWithdraw() and maxRedeem() are capped by BUFFER.total() in Vault.sol). Safety therefore depends not only on smart-contract correctness but also on continuous, correct offchain rebalancing from Ajna buckets back into the buffer.

The biggest user-loss risk is not a classic reentrancy bug; it is valuation-to-liquidity mismatch under stress. totalAssets() in Vault.sol counts bucket value (lpToValue) plus buffer value, but redemptions are paid from buffer only. In auction/bad-debt/debt-lock regimes, assets can be economically impaired or operationally trapped while shares still reference accounting value, creating unfair exits and bank-run ordering.

The biggest operational/liveness risk is the keeper control-plane and config surface (4626-ajna-vault-keeper). Critical behavior depends on KEEPER_INTERVAL_MS, oracle mode (ONCHAIN_ORACLE_PRIMARY/ORACLE_API_URL/FIXED_PRICE), subgraph availability (EXIT_ON_SUBGRAPH_FAILURE), and gas fallback behavior. A wrong-but-valid config can keep the system "running" while reallocating liquidity badly.

CRE improves determinism, replay resistance, and observability in the local workspace (runtimeBridge idempotency keys, replay nonces, queue dedupe), but the current production shape is still mostly an HTTP bridge to a shared hot key (/cre/keeper/tend|report|sweep). CRE should be treated as strong orchestration and policy infrastructure, not as a safety layer in its own right. Used correctly, it materially helps; used as a thin wrapper around centralized writes, it mostly shifts failure domains.

2. Repository Map

| Repo | Files reviewed | Purpose | Audit relevance |
| --- | --- | --- | --- |
| ajna-finance/4626-ajna-vault (latest head, local clone in tmp/audit-sources/4626-ajna-vault) | src/Vault.sol, src/Buffer.sol, src/VaultAuth.sol, src/AjnaVaultLibrary.sol, interfaces/*, test/*, config/vault-config.example.json, script/Vault.s.sol, README.md | Onchain vault/accounting/liquidity logic | Core asset safety, share integrity, role risk, withdrawal behavior |
| ajna-finance/4626-ajna-vault-keeper (latest head, local clone in tmp/audit-sources/4626-ajna-vault-keeper) | src/keeper.ts, src/utils/env.ts, src/utils/transaction.ts, src/oracle/*, src/subgraph/poolHealth.ts, src/utils/scheduler.ts, .env.example, README.md, integration tests | Offchain rebalance decision and execution logic | Primary liveness + economic decision engine |
| 4626fun/4626 (local workspace) | cre/cre-workflows/*, cre/cre-workflows/_shared/*, cre/utils/onchain.ts, frontend/api/_handlers/cre/*, frontend/server/_lib/cre/runtimeBridge.ts, frontend/server/_lib/keeprRegistry.ts, frontend/server/keepr/xmtpQueueExecutor.ts, cre/README.md | CRE orchestration, queueing, runtime sink, execution authority | Determines replay/idempotency, authority boundaries, CRE migration feasibility |
| Chainlink CRE Docs (official) | docs.chain.link/cre/, simulation, limits, consensus, deploy access, service quotas | Runtime guarantees and constraints | Required to judge simulation confidence vs production readiness |

3. Implemented System Architecture

Onchain components

  • Vault.sol
    • ERC-4626 entrypoint (deposit, mint, withdraw, redeem)
    • totalAssets() = buffer value + removed collateral value + Ajna bucket values
    • maxWithdraw/maxRedeem are buffer-limited.
  • Buffer.sol
    • Internal quote accounting (total, Mana) with onlyVault and lock.
  • VaultAuth.sol
    • Admin/keeper/swapper roles and mutable policy (pause, fees, bufferRatio, minBucketIndex).
  • AjnaVaultLibrary.sol
    • Move primitives + constraints (_checkBufferRatio, _validDestination, role-gating for move paths).

Offchain components

  • Legacy Ajna keeper (4626-ajna-vault-keeper)
    • Periodic run loop (startScheduler + run)
    • Oracle/subgraph checks, bucket selection, move execution.
  • CRE workflows (cre/cre-workflows/*)
    • vault-keeper, cca-finalization, keepr-action-queue, runtime-*, payout-integrity, ajna-bucket-manager.
  • Vercel bridge handlers
    • /api/cre/keeper/*, /api/cre/runtime/*, /api/cre/vaults/active.
  • Queue execution path
    • keepr_actions queue + xmtpQueueExecutor.ts, including canonical CSW path for Ajna rebucket (setMinBucketIndex only).

Data dependencies

  • Ajna pool onchain state (LUP, HTP, bucket LP/deposits, bankruptcy time, debt lock).
  • Oracle sources:
    • CoinGecko API (ORACLE_API_URL)
    • Chronicle onchain (ONCHAIN_ORACLE_ADDRESS, staleness checks)
    • Optional FIXED_PRICE.
  • Subgraph for unsettled auction discovery (SUBGRAPH_URL).
  • CRE runtime storage in Postgres:
    • cre_runtime_records, cre_runtime_decisions, cre_runtime_replay_nonces
    • keepr_actions, keepr_vaults, keepr_vault_automation.

Control/authority map

| Component | Critical actions | Authority | Notes |
| --- | --- | --- | --- |
| VaultAuth.sol | pause/unpause, role changes, fee/cap/buffer/minBucket changes | admin | Single highest-risk role |
| AjnaVaultLibrary.moveFromBuffer | move buffer -> pool | keeper only | Enforces role via AUTH.isKeeper(msg.sender) |
| AjnaVaultLibrary.move / moveToBuffer | move within pool / pool -> buffer | admin or keeper | Central to rebalance and exits |
| recoverCollateral/returnQuoteToken | collateral unwind/recovery | admin or swapper | Pauses vault via removedCollateralValue path |
| Keeper wallet (KEEPR_PRIVATE_KEY) via bridge | tend, report, sweep* | bearer token + keeper key | HTTP bridge centralization risk |
| Ajna canonical CSW path | setMinBucketIndex | creator canonical CSW + embedded EOA owner | Explicit fail-closed admin-match checks |

Exit path map

| Flow | Happy path | Failure mode |
| --- | --- | --- |
| Deposit/mint | User assets -> vault -> buffer accounting; shares minted | If paused/cap/fee config invalid, entry blocked |
| Withdraw/redeem | Shares burned -> buffer accounting decreases -> assets transferred | If BUFFER.total() insufficient, reverts (hard liveness gate) |
| Keeper refill | moveToBuffer from Ajna buckets | Fails if out-of-range, debt lock, bad debt, oracle/subgraph issues |
| Emergency collateral recovery | recoverCollateral -> vault paused -> returnQuoteToken | Operator-dependent recovery; can remain paused until quote returned |

Direct answers to Q1–Q5

  • Q1 (exact flow)
    • Deposit/mint: Vault.deposit/Vault.mint -> _deposit -> BUFFER.addQuoteToken + _fill.
    • Withdraw/redeem: Vault.withdraw/Vault.redeem -> _withdraw -> BUFFER.removeQuoteToken + _wash -> transfer to receiver.
    • Rebalance: move, moveToBuffer, moveFromBuffer in Vault.sol through AjnaVaultLibrary.
    • Emergency: VaultAuth.pause or recoverCollateral path (removedCollateralValue), then returnQuoteToken.
  • Q2 (asset location)
    • Assets sit in the user wallet before entry; after entry they are vault-controlled balances and Ajna LP exposure. The buffer is the accounting reserve used for exits.
  • Q3 (onchain vs offchain)
    • Asset state transitions are onchain; timing and policy execution are offchain (keeper/CRE).
  • Q4 (safety-critical for exits)
    • BUFFER.total maintenance, keeper liveness, Ajna unwindability, and role/key integrity.
  • Q5 (non-custodial vs operator-liveness)
    • It is non-custodial in the key-ownership sense, but user exit safety is materially dependent on operator liveness.

4. Critical Assumptions

| Assumption | Where it lives (code/config/offchain) | Why it matters | What breaks if false |
| --- | --- | --- | --- |
| Buffer can be refilled quickly enough | Vault.sol + keeper cadence/config | Withdrawals are buffer-only | Redemptions revert; first-exit advantage |
| totalAssets approximates economic reality | Vault.sol.totalAssets, Ajna state reads | Share pricing fairness | Share value can overstate realizable exits |
| Keeper picks economically safe buckets | keeper.ts, oracle inputs, OPTIMAL_BUCKET_DIFF | Prevents toxic placement | Wrong bucket concentration and trapped liquidity |
| Oracle inputs are sane and timely | oracle/*, env mode selection | Drives target bucket index | Persistent misallocation from stale/wrong price |
| Subgraph bad-debt check is trustworthy | poolHealth.ts, EXIT_ON_SUBGRAPH_FAILURE | Health gate for unsafe pool states | Fail-open can rebalance in unhealthy state |
| Admin behaves and secures keys | VaultAuth.sol | Full policy/role control | Malicious or mistaken config changes can harm users |
| Keeper keys are not compromised | .env / bridge handlers | Executes writes | Unauthorized tend/report/sweep writes |
| Ajna auth admin matches canonical CSW for Ajna automation | ajnaManager.ts, xmtpQueueExecutor.ts | Fail-closed canonical execution | Ajna rebucket actions blocked or misrouted |
| CRE sink idempotency keys are unique | runtimeBridge.ts, runtime workflows | Prevent duplicate actions | Duplicate/omitted actions under replay |
| Queue status transitions remain atomic | keepr/actions/_updateStatus.ts | At-most-once practical semantics | Stuck/duplicated operational actions |
| CRE simulation approximates production | CRE docs | Confidence before deploy | False confidence if DON behavior diverges |
| Operational runbooks are complete | offchain ops | Fast recovery on incident | Extended liveness outages |

5. ERC-4626 Accounting Review

  • Q6: totalAssets() in Vault.sol adds BUFFER.lpToValue(bufferLps) + removedCollateralValue + Σ lpToValue(bucket) and converts WAD to asset decimals. This is accounting value, not guaranteed immediate exit value. It can overstate practical exitability during debt lock/auction/bad debt because user exits are still buffer-only.
  • Q7:
    • Fee math uses _getFee (ceilDiv) and _getAssetsWithFee; previews map through fee-adjusted paths (previewDeposit, previewRedeem).
    • _decimalsOffset() is correctly overridden (18 - assetDecimals) in Vault.sol, fixing known non-18 decimal share scaling risk.
    • Classic first-depositor inflation attack is reduced by OZ virtual share/asset mechanics; donation-based manipulation is dampened because direct token donations to vault are not directly included in buffer accounting.
  • Q8: Mismatch exists by design:
    1. assets counted in totalAssets,
    2. assets liquid now (buffer),
    3. assets only liquid after keeper/market operations (Ajna buckets),
    4. assets potentially impaired/trapped by Ajna states.
  • Q9: Yes, users can get economically unfair outcomes. Shares can price off accounting while actual exit path is constrained by buffer + keeper timing.
  • Q10: Yes. Early withdrawers consume finite buffer; later withdrawers can revert until rebalance. This is material safety/economic risk, not just UX.
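
The ordering effect described in Q9 and Q10 can be illustrated with a small offchain model. This is a sketch under stated assumptions, not the vault's code: amounts are plain numbers in asset units, each request is assumed to be fully entitled at the accounting share price, and a request larger than the remaining buffer reverts in full, mirroring a buffer-limited cap. The names simulateExits, RedeemRequest, and RedeemOutcome are hypothetical.

```typescript
// Hypothetical model of buffer-only exits; not taken from Vault.sol.
interface RedeemRequest {
  user: string;
  assets: number; // assets the user is entitled to at the accounting share price
}

interface RedeemOutcome {
  user: string;
  paid: number;
  reverted: boolean;
}

// Process requests in arrival order against a finite buffer.
// A request that exceeds the remaining buffer reverts in full.
function simulateExits(bufferTotal: number, requests: RedeemRequest[]): RedeemOutcome[] {
  let remaining = bufferTotal;
  return requests.map((req) => {
    if (req.assets > remaining) {
      return { user: req.user, paid: 0, reverted: true };
    }
    remaining -= req.assets;
    return { user: req.user, paid: req.assets, reverted: false };
  });
}

// Three users with valid claims, but only 100 units of buffer before the next refill.
const outcomes = simulateExits(100, [
  { user: "alice", assets: 60 },
  { user: "bob", assets: 60 },
  { user: "carol", assets: 40 },
]);
console.log(outcomes.map((o) => `${o.user}:${o.reverted ? "reverted" : o.paid}`).join(" "));
// alice:60 bob:reverted carol:40
```

All three claims are equally valid on an accounting basis; which ones succeed depends only on arrival order and request size relative to the remaining buffer.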

6. Buffer and Redemption Risk Review

  • Q11: Static bufferRatio is not stress-adaptive; sufficiency depends on redemption velocity, cadence, and unwindability.
  • Q12: Buffer depletes; maxWithdraw/maxRedeem drop with BUFFER.total; further withdrawals revert.
  • Q13: Yes, structural bank-run dynamic exists.
  • Q14: Economic/safety issue: ordering-dependent access to liquidity can force involuntary hold-through-stress.
  • Q15: Yes, an attacker can pin/drain buffer through timing and market-state stress.
  • Q16: Drift vectors include interest/time drift, fee-on-transfer/non-standard token behavior assumptions, and out-of-band balance changes.

7. Ajna Market-Structure Review

  • Q17: Bucket placement controls realized lender economics; accounting may look healthy while practical yield/risk worsens.
  • Q18: Keeper assumes LUP/HTP + minBucketIndex gates and skip logic for bad debt/bankruptcy/debt lock.
  • Q19: Partially valid under fast moves; lag + data quality can invalidate assumptions quickly.
  • Q20: Yes, adverse selection exposure exists as passive liquidity provider.
  • Q21: Yes, deterministic optimal-bucket logic can over-concentrate.
  • Q22: Yes, strategy can chase yield into fragile or hard-to-exit buckets under stale/mispriced conditions.
  • Q23: Yes, liquidity can appear productive while practically trapped.

8. Keeper Risk Review

  • Q24: run() gates on pause/bad debt, updates interest, drains, checks in-range/dust/bankruptcy/debt-lock, then rebalances.
  • Q25: Critical config: KEEPER_INTERVAL_MS, OPTIMAL_BUCKET_DIFF, BUFFER_PADDING, MIN_MOVE_AMOUNT, MIN_TIME_SINCE_BANKRUPTCY, MAX_AUCTION_AGE, EXIT_ON_SUBGRAPH_FAILURE, ONCHAIN_ORACLE_PRIMARY, ONCHAIN_ORACLE_MAX_STALENESS, FIXED_PRICE, HALT_KEEPER_IF_LUP_BELOW_HTP.
  • Q26: Trusted inputs: RPC, subgraph, CoinGecko/Chronicle/fixed price, env config, keeper key custody.
  • Q27:
    • offline keeper: no rebalance, buffer decay risk;
    • subgraph fail: fail-open or fail-closed by config;
    • stale/wrong oracle: bad target or abort;
    • fixed misprice: deterministic bad policy;
    • bad debt/live auction: abort;
    • gas estimation fail: default-gas fallback;
    • LUPBelowHTP: optional hard halt.
  • Q28: Yes, safe halts exist.
  • Q29: Yes, can continue while economically wrong (e.g., fixed misprice, stale accepted inputs, subgraph fail-open).
  • Q30: Yes, keeper delays can cause material user harm.
  • Q31: A 12-hour cadence is a high latency risk in volatile markets.
  • Q32: Deterministic path dependency can be exploited around timing.
  • Q33: Yes, timing manipulation can induce bad/skip outcomes.
  • Q34: Yes, read/decide/write race conditions exist.
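
Several answers above (Q25, Q27, Q29) turn on configuration that is valid but unsafe. One way to catch that class of error is a startup guard that refuses to run with unsafe production settings. The sketch below is an assumption about how such a guard could look, not the keeper's actual env handling: the field names mirror the variables named in this report, while the KeeperConfig shape, the environment flag, and the interval ceiling are hypothetical.

```typescript
// Hypothetical config shape; the keeper's real env parsing may differ.
interface KeeperConfig {
  environment: "production" | "staging";
  keeperIntervalMs: number; // KEEPER_INTERVAL_MS
  fixedPrice?: number; // FIXED_PRICE, unset when not used
  exitOnSubgraphFailure: boolean; // EXIT_ON_SUBGRAPH_FAILURE
  onchainOraclePrimary: boolean; // ONCHAIN_ORACLE_PRIMARY
  onchainOracleMaxStaleness?: number; // ONCHAIN_ORACLE_MAX_STALENESS
}

// Returns a list of problems; an empty list means the config passes the guard.
function validateKeeperConfig(cfg: KeeperConfig, maxIntervalMs: number): string[] {
  const problems: string[] = [];
  if (cfg.environment !== "production") return problems;

  if (cfg.fixedPrice !== undefined) {
    problems.push("FIXED_PRICE must not be set in production");
  }
  if (!cfg.exitOnSubgraphFailure) {
    problems.push("EXIT_ON_SUBGRAPH_FAILURE must be true (fail-closed)");
  }
  if (cfg.keeperIntervalMs > maxIntervalMs) {
    problems.push("KEEPER_INTERVAL_MS exceeds the allowed ceiling");
  }
  if (cfg.onchainOraclePrimary && cfg.onchainOracleMaxStaleness === undefined) {
    problems.push("ONCHAIN_ORACLE_MAX_STALENESS is required with an onchain primary oracle");
  }
  return problems;
}

const riskyConfig: KeeperConfig = {
  environment: "production",
  keeperIntervalMs: 43200000, // 12 hours
  fixedPrice: 1,
  exitOnSubgraphFailure: false,
  onchainOraclePrimary: true,
};
const configProblems = validateKeeperConfig(riskyConfig, 3600000);
console.log(configProblems.length); // 4
```

A guard like this only rejects known-bad combinations; it does not make the remaining settings economically correct.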

9. Oracle and Data Dependency Review

  • Q35: Oracle dependence is reintroduced in keeper price path (getPrice) for bucket targeting.
  • Q36:
    • CoinGecko/API: centralized/outage/rate-limit risk;
    • Chronicle/onchain: stronger integrity, still staleness risk;
    • fixed price: highest misconfig risk;
    • CRE + Chainlink feeds: strongest integrity if directly wired into deterministic gates.
  • Q37: Safest target path is deterministic CRE policy + Chainlink feeds + strict stale/deviation guards.
  • Q38: The easiest input to misconfigure is FIXED_PRICE.
  • Q39: Most robust to latency/manipulation is CRE consensus + Chainlink feed path if used as primary policy input.
  • Q40: Yes, stale but valid-looking prices can systematically misplace liquidity.
  • Q41: Yes, disagreement/drift can cause oscillation and bad rebalances.
  • Q42: Yes, stale reference can move vault from safer to fragile bucket.

10. Smart Contract Security Review

  • Q43: Privileged roles are admin, keeper, swapper (VaultAuth.sol).
  • Q44: Yes, role misuse or compromise can directly or indirectly cause loss.
  • Q45:
    • pause/unpause, cap/fee/minBucket updates are admin-controlled;
    • move functions are role-gated through library checks;
    • emergency recovery is operator-mediated.
  • Q46: Reentrancy locks exist; primary residual risks are stale-state economics and offchain timing, not classic reentrancy.
  • Q47: Hidden assumptions exist in buffer-ratio checks and decimal conversions under dynamic conditions.
  • Q48: Yes, contracts assume Ajna valuation proxies that may diverge from immediate realizability.

11. Attack Scenarios

| Scenario | Preconditions | Exploit path | Impact | Detectability | Mitigation | Severity |
| --- | --- | --- | --- | --- | --- | --- |
| 49. Share price manipulation | Thin liquidity + timing edge | Time deposits/redeems around stale accounting and buffer asymmetry | Fairness distortion | Medium | faster cadence, anti-MEV pathing, liquidity-aware previews | Medium |
| 50. Donation/inflation attack | Direct transfer attempts | Donate assets to skew accounting | Low practical exploitability in this design | High | maintain donation-invariant tests | Low |
| 51. Sandwich around deposit/redeem | Public mempool | Front-run keeper/market state then user op | User execution fairness loss | Medium | private orderflow/jitter | Medium |
| 52. Strategic buffer exhaustion | Large/coord exits | Drain buffer before stress | Exit liveness failure, ordering unfairness | High | dynamic buffer + emergency refill playbook | Critical |
| 53. Keeper timing exploitation | Predictable cadence | Shift state near runs to induce skips/bad moves | Economic drag | Medium | event-driven triggers + randomized windows | High |
| 54. Wrong-bucket via stale oracle | Stale accepted input | Deterministic mis-targeting | Yield loss + trap risk | Medium | strict freshness/deviation checks | High |
| 55. Dust-state griefing | Many tiny buckets | Force skip-heavy behavior | Operational drag | High | periodic dust cleanup policy | Medium |
| 56. Borrower toxic flow | Informed borrowers | Borrow against passive placement | MTM and realized losses | Medium | conservative targeting policy | High |
| 57. Auction/bankruptcy trap | Liquidation/bad debt state | Liquidity remains where exits constrained | Severe liveness/economic harm | High | fail-closed stress policy | Critical |
| 58. Operator key compromise | Keeper key leak | Unauthorized bridge writes | Operational and economic damage | Medium-High | HSM/MPC signer + rotation/runbooks | High |
| 59. Misconfigured buffer ratio | Admin/config error | Ratio set too low/high | Exit failures or excess yield drag | High | bounded config guardrails + staged rollout | High |
| 60. Misconfigured fixed price | Human error | Wrong fixed price accepted | Systematic bad rebalancing | High | disable in prod or require strict controls | High |
| 61. CRE workflow bug repeat/skip | Workflow defect | Duplicate/omitted operations | Liveness and consistency failures | Medium | invariant tests + canary deployment | High |
| 62. AI advisory misinfluence | AI output over-trusted | Human follows wrong advisory | Process error; low direct code risk now | High | keep AI non-authoritative | Low |
| 63. Replay/duplication in CRE flows | Retries/duplicate triggers | Same intent posted multiple times | Mostly mitigated by idempotency; residual race risk | Medium | strict sink/executor idempotency | Medium |
| 64. Deterministic vs AI divergence | Conflicting outputs | Operator follows AI over checks | Delayed/incorrect response | Medium | deterministic precedence in runbooks | Informational |

12. CRE Design Review

  • Q65: Full replacement is premature; best near-term is CRE-led scheduling/policy with narrowly scoped deterministic execution.
  • Q66: Best CRE candidates are scheduling, monitoring, deterministic policy checks, queueing/deduping, checkpointing, and alerting.
  • Q67: Must remain deterministic/minimal: action construction, auth checks, idempotency, owner verification, allowlists.
  • Q68: Never rely on AI for write auth, safety gating, liquidation-sensitive actions, or emergency actions.
  • Q69: CRE should be scheduler + monitor + deterministic policy engine + constrained tx orchestrator + HITL escalation.
  • Q70:
    • reusable: runtime-*, keepr-action-queue, sink/idempotency, registry filtering;
    • demo-grade aspects: heavy HTTP bridge dependence and prototype native-write fallback pathing;
    • hardening needed: key management, stricter auth defaults, invariant/chaos testing.
  • Q71: CRE adds real value (determinism, replay protection, observability) but also complexity.
  • Q72: Key CRE risks are misconfig, secret handling, trigger duplication, HTTP dependency, persistence mismatch, write authority, and early-access maturity risk.
  • Q73:
    • simulation: strong but not production-equivalent DON behavior;
    • production readiness: depends on deployment hardening and quotas;
    • institutional resilience: requires mature ops controls beyond baseline.
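
The replay and duplication controls discussed in Q70 to Q72 rest on two separate checks: a replay nonce that is unique per request, and an idempotency key that is unique per intent. The following in-memory sketch shows how the two interact. It is a stand-in for illustration only: the class, the key format, and the result labels are hypothetical and do not reproduce runtimeBridge.ts or the Postgres-backed tables.

```typescript
// In-memory stand-in for a persistent sink; all names are hypothetical.
type SubmitResult = "accepted" | "duplicate" | "replayed";

interface RuntimeIntent {
  idempotencyKey: string; // identifies the intent (what should happen once)
  replayNonce: string; // identifies the request (this particular delivery)
}

// Deterministic key: the same workflow, vault, action, and window map to one key.
function buildIdempotencyKey(workflow: string, vault: string, action: string, windowStartSec: number): string {
  return [workflow, vault.toLowerCase(), action, String(windowStartSec)].join(":");
}

class InMemorySink {
  private readonly seenNonces = new Set<string>();
  private readonly acceptedKeys = new Set<string>();

  submit(intent: RuntimeIntent): SubmitResult {
    // Same request delivered twice: reject as a replay.
    if (this.seenNonces.has(intent.replayNonce)) return "replayed";
    this.seenNonces.add(intent.replayNonce);
    // New request for an intent that was already accepted: reject as a duplicate.
    if (this.acceptedKeys.has(intent.idempotencyKey)) return "duplicate";
    this.acceptedKeys.add(intent.idempotencyKey);
    return "accepted";
  }
}

const sink = new InMemorySink();
const tendKey = buildIdempotencyKey("vault-keeper", "0xABC", "tend", 1700000000);
const firstResult = sink.submit({ idempotencyKey: tendKey, replayNonce: "n1" });
const replayResult = sink.submit({ idempotencyKey: tendKey, replayNonce: "n1" });
const retryResult = sink.submit({ idempotencyKey: tendKey, replayNonce: "n2" });
console.log(firstResult, replayResult, retryResult); // accepted replayed duplicate
```

A single-process set is trivially atomic; a shared database is not, so a real sink would need the check and the insert to happen as one atomic step, for example behind a uniqueness constraint. That gap is where the residual race risk noted in this report would live.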

13. Stress Test Outcomes

| Stress scenario | Immediate effect | Medium-term effect | User impact | Recovery path | Severity |
| --- | --- | --- | --- | --- | --- |
| 74. 20/35/50% collateral shocks | LUP/HTP shifts | More skips/lock risk; bad debt at larger shocks | Liveness pressure -> realized impairment risk | stress policy + operator intervention | High/Critical |
| 75. Rapid deleveraging | Utilization shifts quickly | Target bucket stale | Fairness and yield degradation | faster/event-driven loop | High |
| 76. Active liquidation auctions | Keeper may abort | Refill delays | Exit liveness degradation | auction-aware runbooks | High |
| 77. Bad debt emergence | Health gate trips | Prolonged no-rebalance | Severe liveness + economic risk | manual intervention/pool recovery | Critical |
| 78. 25/50/80% TVL redeem pre-rebalance | Buffer exhaustion | Revert windows persist | first-exit advantage | higher dynamic buffer + emergency process | High/Critical |
| 79. Stale oracle/bad price | Wrong target bucket | Repeated misallocation | realized yield loss | strict stale/deviation checks | High |
| 80. Keeper offline 12/24/72h | No actions | Stale positioning compounds | 72h can be severe | failover + paging | Medium/High/Critical |
| 81. Subgraph failure | Debt visibility loss | fail-open unsafe continuation possible | latent risk accumulation | fail-closed setting + alerts | Medium/High |
| 82. Admin mistake | Unsafe policy update | Misbehavior under "valid" code | broad immediate risk | multisig/timelock + change control | High |
| 83. CRE workflow outage | Missed schedule windows | queue backlog | mostly liveness | fallback runner + recovery runbook | Medium |
| 84. CRE duplicate trigger | repeated sink attempts | usually deduped, residual races | operational noise/retry churn | stronger dedupe + transition guards | Medium |
| 85. AI advisory nonsense | bad AI verdict text | deterministic checks still pass | low direct risk currently | keep advisory-only | Low |

14. Findings

Critical

  • Title: Buffer-only exits create structural bank-run liveness risk

    • Severity: Critical
    • Affected component: Vault.sol + offchain keeper loop
    • Evidence: maxWithdraw/maxRedeem cap by BUFFER.total; withdraw/redeem consume buffer path only; keeper must refill via moveToBuffer
    • Why it matters: Exit fairness becomes timing-dependent; late users can be locked out during stress
    • Exploitability: High under panic, no advanced exploit needed
    • Recommended fix: Dynamic buffer policy + faster/event-driven refill + explicit stress controls and user-facing liquidity state
  • Title: Accounting value can diverge from realizable exit value

    • Severity: Critical
    • Affected component: Vault.sol.totalAssets, Ajna exposure
    • Evidence: totalAssets sums bucket valuations while withdrawals remain buffer-only
    • Why it matters: Shares can look solvent while exits fail or become economically unfavorable
    • Exploitability: High in auction/bad-debt/debt-lock windows
    • Recommended fix: Add liquidity-aware risk metrics and stronger rebalance SLOs

High

  • Title: Oracle/config path can produce systematic wrong rebalances

    • Severity: High
    • Affected component: keeper oracle stack (oracle/price.ts, env.ts)
    • Evidence: CoinGecko/Chronicle/fixed mode switching and fallback behavior
    • Why it matters: Wrong inputs drive deterministic wrong bucket choices
    • Exploitability: Medium-High
    • Recommended fix: disable fixed price in production, enforce freshness/deviation bounds, add source quorum
  • Title: Shared keeper wallet via HTTP bridge is concentrated authority

    • Severity: High
    • Affected component: /api/cre/keeper/*, KEEPR_PRIVATE_KEY
    • Evidence: Bridge endpoints execute writes from shared signer
    • Why it matters: Key compromise or auth failure affects broad surface
    • Exploitability: Medium
    • Recommended fix: HSM/MPC signing, scoped route permissions, key rotation + incident runbooks
  • Title: Subgraph failure can fail-open on bad-debt gate

    • Severity: High
    • Affected component: poolHealth.ts, EXIT_ON_SUBGRAPH_FAILURE
    • Evidence: Empty-auction fallback unless fail-closed configured
    • Why it matters: Keeper may continue with degraded safety visibility
    • Exploitability: Medium
    • Recommended fix: production fail-closed + immediate alerting
  • Title: Keeper cadence/default latency too slow for stress

    • Severity: High
    • Affected component: KEEPER_INTERVAL_MS defaults and scheduler
    • Evidence: 12h default in keeper repo
    • Why it matters: liquidity/risk drift between runs
    • Exploitability: High via market speed
    • Recommended fix: shorter cadence + event triggers + bounded jitter
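
The "bounded jitter" part of the cadence recommendation can be sketched as a delay function that perturbs each interval by a capped random fraction, so run times are harder to predict while the worst-case gap stays bounded. This is a minimal illustration, not the scheduler in src/utils/scheduler.ts; nextRunDelayMs and its parameters are hypothetical, and the random source is injectable so the behaviour can be tested.

```typescript
// Hypothetical helper: delay until the next keeper run, with bounded jitter.
// jitterFraction = 0.1 means the delay falls within +/-10% of the base interval.
function nextRunDelayMs(
  baseIntervalMs: number,
  jitterFraction: number,
  random: () => number = Math.random, // expected to return a value in [0, 1)
): number {
  if (baseIntervalMs <= 0) {
    throw new Error("baseIntervalMs must be positive");
  }
  if (jitterFraction < 0 || jitterFraction >= 1) {
    throw new Error("jitterFraction must be in [0, 1)");
  }
  // Map random() from [0, 1) to [-1, 1), then scale by the allowed fraction.
  const offset = (random() * 2 - 1) * jitterFraction * baseIntervalMs;
  return Math.round(baseIntervalMs + offset);
}

// With a one-hour base and 10% jitter, delays stay between 54 and 66 minutes.
const earliest = nextRunDelayMs(3600000, 0.1, () => 0);
const centered = nextRunDelayMs(3600000, 0.1, () => 0.5);
console.log(earliest, centered); // 3240000 3600000
```

Jitter only reduces predictability; it does not replace event-driven triggers for fast-moving pool state.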

Medium

  • Title: Deterministic target bucket logic is gameable around timing

    • Severity: Medium
    • Affected component: keeper.ts (optimalBucket = priceIndex + diff)
    • Evidence: simple deterministic offset and predictable schedule
    • Why it matters: adversaries can shape pre/post-run state
    • Exploitability: Medium
    • Recommended fix: richer policy and randomized execution windows
  • Title: Read-decide-write race remains in keeper and CRE loops

    • Severity: Medium
    • Affected component: offchain orchestration
    • Evidence: separate read/decision/transaction stages
    • Why it matters: stale actions can become wrong actions
    • Exploitability: Medium
    • Recommended fix: pre-submit revalidation + postcondition checks
  • Title: CRE adds resilience but also complexity

    • Severity: Medium
    • Affected component: runtime-*, bridge, DB schema coupling
    • Evidence: replay/idempotency controls with cross-system dependencies
    • Why it matters: new failure planes
    • Exploitability: Medium (operational)
    • Recommended fix: hardening, chaos tests, canary + rollback flow
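
The "pre-submit revalidation" fix for the read-decide-write race can be sketched as a check that compares the snapshot a plan was built on with a fresh read taken immediately before submission. The types, thresholds, and the LUP/HTP condition below are assumptions made for illustration; they are not the keeper's or the CRE workflows' actual checks.

```typescript
// Hypothetical snapshot and plan types; field names are illustrative.
interface PoolSnapshot {
  blockNumber: number;
  lup: number;
  htp: number;
}

interface PlannedMove {
  toBucket: number;
  amount: number;
  basedOn: PoolSnapshot; // state that was read when the decision was made
}

interface Revalidation {
  submit: boolean;
  reason: string;
}

// Re-read state just before sending and abort if the decision is no longer grounded.
function revalidate(plan: PlannedMove, latest: PoolSnapshot, maxBlockLag: number, maxLupDrift: number): Revalidation {
  if (latest.blockNumber - plan.basedOn.blockNumber > maxBlockLag) {
    return { submit: false, reason: "decision snapshot is too old" };
  }
  if (latest.lup < latest.htp) {
    return { submit: false, reason: "LUP below HTP in latest state" };
  }
  const drift = Math.abs(latest.lup - plan.basedOn.lup) / plan.basedOn.lup;
  if (drift > maxLupDrift) {
    return { submit: false, reason: "LUP moved beyond tolerance since decision" };
  }
  return { submit: true, reason: "state unchanged within tolerance" };
}

const plan: PlannedMove = {
  toBucket: 4156,
  amount: 1000,
  basedOn: { blockNumber: 100, lup: 1.0, htp: 0.9 },
};
const stillValid = revalidate(plan, { blockNumber: 102, lup: 1.001, htp: 0.9 }, 5, 0.005);
const tooOld = revalidate(plan, { blockNumber: 110, lup: 1.0, htp: 0.9 }, 5, 0.005);
console.log(stillValid.submit, tooOld.submit); // true false
```

A matching postcondition check after the transaction confirms (for example, that the buffer moved in the intended direction) would cover the other half of the recommendation.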

Low

  • Title: Administrative centralization lacks visible governance hardening evidence

    • Severity: Low
    • Affected component: VaultAuth.sol
    • Evidence: broad onlyAdmin mutability
    • Why it matters: human/admin key error risk
    • Exploitability: Medium via process weakness
    • Recommended fix: multisig + timelock + alerts
  • Title: Recovery path is operator-dependent

    • Severity: Low
    • Affected component: recoverCollateral/returnQuoteToken
    • Evidence: pause requires trusted restitution flow
    • Why it matters: recovery quality drives downtime
    • Exploitability: Low direct, medium operational
    • Recommended fix: tested runbooks and bounded SLAs

Informational

  • Title: AI advisory path is non-authoritative (good pattern)
    • Severity: Informational
    • Affected component: /api/cre/keeper/_aiAssess.ts, payout-integrity/main.ts
    • Evidence: deterministic fallback verdict and advisory use only
    • Why it matters: preserves deterministic safety boundary
    • Exploitability: N/A
    • Recommended fix: preserve this invariant

15. Recommended Architecture

Choice: C. Move scheduling/decisioning to CRE but keep onchain execution narrowly scoped.

Rationale:

  1. Core risk is policy timing + data quality + liveness (CRE is strong here).
  2. Safety-critical writes should remain minimal, deterministic, and allowlisted.
  3. Current local stack already includes strong building blocks: runtime idempotency, queue dedupe, canonical CSW checks.

Migration shape:

  1. CRE handles schedule, monitoring, deterministic policy synthesis, and queueing.
  2. Hardened execution layer performs allowlisted writes with strict preflight checks.
  3. AI remains advisory-only.
  4. Legacy keeper path remains fallback during migration, then retired after parity/SLO proof.

16. Production Readiness Verdict

  • smart contract safety: 7/10

  • accounting integrity: 6/10

  • redemption fairness: 4/10

  • liveness robustness: 4/10

  • oracle/data robustness: 5/10

  • operator risk: 4/10

  • CRE suitability: 6/10

  • overall deployability: 5/10

  • Would this be deployed with real user funds today?
    Only in a constrained pilot, not mainnet scale.

  • Under what strict conditions only?

    • conservative TVL caps;
    • higher dynamic buffer targets;
    • faster/event-driven keeper cadence;
    • fail-closed production settings on critical dependencies;
    • hardened signer custody and tested incident runbooks.
  • Minimum blockers before mainnet scale

    1. Buffer-liveness hardening with explicit stress policy.
    2. Oracle/data policy hardening (no unsafe fixed-price operations).
    3. Operator key and governance hardening (multisig/timelock/HSM).
    4. Proven CRE+queue reliability under replay/outage/duplication tests.
    5. User-facing disclosure of realizable exit vs accounting value.

17. Missing Information / Next Artifacts Needed

  • deployed addresses per active vault and environment
  • exact production vault config JSONs
  • production keeper env/config values (KEEPER_INTERVAL_MS, oracle mode, fail-open/closed flags)
  • formal pool/bucket selection policy and approved risk limits
  • admin/keeper/swapper ownership model (multisig/timelock/signer inventory)
  • CRE workflow IDs, deployment targets, and secret custody model
  • incident runbooks for buffer depletion, oracle outage, key compromise, CRE outage
  • SLO/SLI targets for rebalance latency and redemption availability
  • postmortem and escalation templates