The Physics of Memory (aka can Javascript ECS?)

Lobsters Hottest Papers

Summary

This article presents a detailed benchmark comparing ECS and OOP architectures in JavaScript for a 2D physics simulation, testing memory locality and performance across multiple dimensions including broad-phase algorithms and sorting strategies, with results on an M4 Mac.

<p><a href="https://lobste.rs/s/q8vdre/physics_memory_aka_can_javascript_ecs">Comments</a></p>
Original Article
View Cached Full Text

Cached at: 07/02/26, 06:08 AM

# The Physics of Memory - dmurph.com Source: [https://www.dmurph.com/posts/2026/06/ecs_vs_oop_benchmark/ecs_vs_oop_benchmark.html](https://www.dmurph.com/posts/2026/06/ecs_vs_oop_benchmark/ecs_vs_oop_benchmark.html) I’ve heard that many algorithms, like sorting, are improved quite a bit when the accessed data has high memory\-locality \(e\.g\. the CPU can load all of the relevant information in it’s L1/L2 cache for processing, instead of hitting RAM\)\. A common way to work with this is using the Entity Component System \(ECS\) software architecture\. However, it can be difficult to achieve this memory layout for interpreted or OOP languages \(Java, Python, Javscript\), as they often don’t offer the developer as much control over their object memory location\. My question is: > > **Is it possible to use an ECS\-style architecture in Javascript? And for applicable operations, does that actually do better than objects \+ V8’s garbage collection?** To test this, I created a complex\-enough benchmark of… “balls bouncing around in a box”\! This follows standard 2d physics engine techniques: - A “**broad\-phase**” collision step \(AKA “do bounding boxes intersect?”\) - A “**narrow\-phase**” collision step \(AKA “given these intersecting boxes, resolve collisions\!”\) And of course, I got very carried away and ended up creating a ton of benchmark types below\. Try it\! ## Background \- What is ECS? If you’ve spent any time in game development, you’ve probably heard of**Entity Component System \(ECS\)**\. If OOP is a row\-based database \(where each object stores every attribute in one memory chunk\), ECS is a column\-based database, storing each attribute \(or collection of attributes\) in separate column arrays\. Put another way \(and ou can probably find much better descriptions of ECS elsewhere\)… instead of building complex class hierarchies like a`GameEntity`that inherits from`MovingObject`which inherits from`PhysicalObject`etc, ECS breaks everything down into three distinct concepts: 1. **Entities**: These are nothing more than unique integer IDs\. They don’t contain any data or behavior\. 2. **Components**: Plain data structures that represent individual facets of state \(e\.g\.`Position`,`Velocity`,`Color`\)\. They contain no logic\. Each entity can have a dynamic set of components, from 0 to all\. 3. **Systems**: Functions or loops that execute the game logic\. They query entities given a specific set of components to then operate on that data\. ## Benchmark Choices I went overboard and tested across five dimensions: - **Language**:**Javascript**\(testing V8 engine optimizations & TypedArrays\) vs\.**WASM**\(AssemblyScript for extreme memory control\)\. - **Architecture**:**Traditional OOP**\(arrays of object references\) vs\.**ECS**\(cache\-friendly Structure of Arrays\)\. - **Broad\-phase**:**Spatial AVL Tree**\(O\(Nlog⁡N\)O\(N \\log N\)balanced tree\) vs\.**Sweep & Prune**\(sort on X\-axis, sweep for overlaps\)\. - **Sorting Strategy \(for S&P\)**: Tested**Insertion Sort**\(the textbook favorite for nearly\-sorted data\),**Hybrid Quicksort**, a zero\-copy**Hybrid Mergesort**, and JS**Native**`Array\.sort\(\)`\. - **Motion Coherence**:**Wandering**\(smooth movement; array stays nearly sorted\),**Erratic**\(random teleporting; extreme cache thrashing\), and**Static**\(zero motion but with permanent collisions; isolates pure memory read speed with some collisions\)\. Anyways, let’s see how it works\! This is hosted on[GitHub](https://github.com/dmurph/ecs-on-web)if you want to play around with it\. ## Macbook M4 Results Running all these configurations locally on my**Apple M4 Pro chip**\(plugged in, 200 frames\) creates a lot of numbers\. If you want the full dataset of all 6 runs, check out the[Appendix: Full Benchmark Dataset](https://www.dmurph.com/posts/2026/06/ecs_vs_oop_benchmark/ecs_vs_oop_benchmark.html#full-benchmark-dataset)below\. To cut through information overload, let’s focus on 3 main hypotheses: ### H1: Spatial Trees vs\. Sweep & Prune #### **Contiguous 1D memory sweeps consistently outperform hierarchical 2D spatial trees, even when both are stored in flat arrays\.** At 15,000 wandering entities, the flat array Sweep & Prune \(`WASM ECS S&P Quick`\) runs in**1\.81 ms**\(a 9\.02x speedup over the baseline\), while the flat array spatial tree \(`ECS Tree`\) runs in**9\.97 ms**\(only a 1\.64x speedup\)\. Broad\-phase SystemArchitectureAvg Frame Time99th PercentileSpeedup vs OOP Tree**WASM ECS S&P \(Quick\)**Flat 1D Array**1\.809 ms****2\.031 ms****9\.02x****ECS \(Custom SoA\) S&P \(Quick\)**Flat 1D Array4\.668 ms6\.111 ms3\.50x**ECS Tree**2D Spatial BVH9\.970 ms43\.666 ms1\.64x**OOP Tree**2D Spatial BVH16\.321 ms40\.756 ms1\.00xWhy is the tree so much slower, even though it uses flat arrays and avoids JavaScript heap objects? - **Nested Indirection**: While Sweep & Prune has a single layer of index indirection \(`posX\[indices\[i\]\]`\), the AABB Tree requires**nested indirection**\. During traversal, the CPU must look up a child index \(`treeLeft\[idx\]`\) and then use that to look up the node’s bounds \(`treeMinX\[leftChild\]`\)\. This non\-linear jumping across arrays prevents the CPU’s hardware prefetcher from predicting the next memory address\. - **Branch Misprediction**: The tree traversal loop is filled with conditional branches \(e\.g\., deciding whether to descend left, right, or both based on overlap\)\. At 15,000 moving entities, these branches are highly unpredictable, causing frequent CPU instruction pipeline flushes\. Sweep & Prune, by contrast, is a simple, highly predictable linear sweep\. - **Temporal Coherence**: Because entities only move a tiny amount frame\-to\-frame \(high spatial coherence\), the sorted`indices`in Sweep & Prune remain nearly contiguous\. This keeps index lookups highly localized in the cache, whereas tree traversal always jumps across distant nodes\. --- ### H2: JavaScript vs\. WebAssembly #### **WebAssembly is not required to realize the cache\-locality benefits of Data\-Oriented Design, but it provides a major secondary boost by enabling unchecked memory access\.** Simply migrating from OOP \(Arrays of Objects\) to ECS \(flat TypedArrays\) in pure JavaScript yields a**1\.58x to 24\.9x speedup**depending on spatial coherence\. Runtime & Memory LayoutContenderAvg Frame TimeSpeedup vs OOP**WebAssembly \(SoA\)**WASM ECS S&P \(Quick\)**2\.141 ms****3\.91x****JavaScript \(SoA\)**ECS \(Custom SoA\) S&P \(Quick\)5\.318 ms1\.58x**JavaScript \(AoS\)**OOP S&P \(Quick\)8\.380 ms1\.00xMoving to AssemblyScript WASM provides an additional ~2\.5x speedup \(down to**2\.11 ms**\)\. The primary driver of this boost is the ability to use the`unchecked\(\)`operator\. This tells the compiler to bypass array boundary checks, allowing it to emit highly optimized vector instructions and maximize CPU instruction throughput\. --- ## Interesting Learnings There were a number of interesting things I discovered while making this\! ### 1\. The Spatial Coherence Sorting Trap \(Why we need Hybrid Sorts\) #### **Textbook Insertion Sort is a performance trap for real\-time physics; a hybrid sort is required to prevent worst\-case frame jank\.** Under**Wandering**\(high coherence\), the array remains nearly sorted, and Insertion Sort is extremely fast \(**4\.21 ms**\)\. However, under**Erratic**\(low coherence\) workloads \(such as explosions or teleporting\), the array becomes fully unsorted\. Insertion Sort collapses to its quadratic worst\-case \(O\(N2\)O\(N^2\)\), ballooning frame times to**36\.39 ms**and causing severe jank\. To solve this, we implemented a**Hybrid Sort**\(Quicksort or Mergesort that switches to Insertion Sort when a subarray falls below 12 elements\)\. This eliminates the recursive stack overhead for small partitions while guaranteeingO\(Nlog⁡N\)O\(N \\log N\)performance under chaotic motion \(**5\.60 ms**\)\. **Why switch to insertion sort when the subarray falls below 12 elements?**This eliminates the relatively high call\-stack and buffer overhead of divide\-and\-conquer recursion for tiny arrays, where insertion sort’s near\-zero setup overhead is very fast\. Changing to this improved performance\! \(I forget how much though\) ### 2\. The Library Trade\-off \(bitECS vs\. Custom SoA\) #### **Using an ECS library is slightly slower than a hand\-tuned SoA, but remains highly competitive and is the optimal choice for real\-world engineering\.** At 15,000 erratic entities,`bitECS S&P \(Quick\)`runs in**9\.56 ms**compared to our custom ECS at**5\.60 ms**\. While 2x slower, it still delivers a massive**14x speedup**over the OOP baseline \(131\.6 ms\)\. Why is bitECS slower? - **No Property Packing**: In our ECS \(Custom SoA\), we pack`y`,`w`, and`h`into a single`posYwh`array \(`posYwh\[i \* 3 \+ 0\]`\), meaning all three values load into a single CPU cache line\. By default,`bitECS`allocates a**separate TypedArray**for every property in a component \(`y`,`w`, and`h`are separate arrays\), forcing the CPU to fetch from three separate memory streams\. - **Query & ID Mapping Overhead**: bitECS must query archetypes and map sparse Entity IDs to internal array slots, introducing minor indirection\. However, for a real game, the usability, API safety, and dynamic entity management of a library like`bitECS`are far more valuable than saving a few milliseconds of micro\-optimization\. Plus, this can all go away by doing one of the performance optimization tips below, and extracting the physics data to a custom structure before operating on it\. ### 3\. The JavaScript Float32 Gotcha #### **Switching to 32\-bit floats \(Float32Array\) in pure JavaScript degrades performance due to V8’s double\-precision promotion overhead\.** At 15,000 entities, the`Float32Array`implementation of our ECS \(Custom SoA\) was**~5% slower**than the`Float64Array`version \(5\.60 ms vs\. 5\.32 ms\)\. Why? JavaScript numbers are always 64\-bit doubles\. When V8 reads from a`Float32Array`, it must convert the 32\-bit float to a 64\-bit double at runtime before performing math\. When writing back, it must convert it back\. The overhead of these runtime conversions outweighs the L1/L2 cache density gains\. In native environments \(like WASM, C\+\+, or Rust\),`f32`is processed directly on FPU registers without conversion, making it a pure performance win\. ### 4\. The Physics of Determinism #### **Cross\-algorithm trajectory determinism is a false economy once contact resolution order diverges\.** Because trees and sorted sweeps traverse and resolve contacts in different orders, their rounding errors accumulate differently\. This causes the simulation paths to diverge almost immediately \(by frame 2\)\. Focus instead on statistical correctness and local stability rather than matching trajectories exactly across different paradigms\. ### 5\. DOD and Frame\-Time Stability \(The 99th Percentile\) #### **Data\-Oriented Design provides rock\-solid frame rate stability by eliminating garbage collection and cache\-miss spikes\.** If you look at the 99th percentile numbers in our benchmarks, a striking pattern emerges: in the OOP implementations, the 99th percentile is often**2x to 3x higher**than the average frame time\. In the ECS \(Custom SoA\) and WASM implementations, the 99th percentile is almost**identical**to the average\. In OOP, entities are scattered across the heap\. As they move and interact, the JavaScript engine’s garbage collector is constantly triggered, and the CPU frequently stalls waiting for pointer lookups\. This causes sporadic frame drops \(micro\-stutter\)\. Because ECS uses pre\-allocated, flat TypedArrays, memory access is 100% predictable and GC overhead is zero, guaranteeing perfectly smooth frame delivery\. --- ### 🏆 Grand Finale **Question:**What seems to be the fastest reasonably architected code that we can come up with? **Verdict:**Pitting the top spatial trees \(`WASM Tree`,`JS ECS Tree`\) against the Sweep & Prune champions \(`WASM Quick S&P`,`JS Quick S&P`\) across 15,000 chaotic colliding bodies: ContenderParadigmArchitectureRuntime \(Erratic\)Verdict**WASM Quick S&P**ECS1D Flat Array**2\.14 ms****🏆 Winner \(~7\.9x faster than WASM Tree\)****JS Quick S&P**ECS1D Flat Array5\.32 msBest pure Javascript engine**WASM Tree**ECS2D Spatial BVH16\.98 msPeak spatial tree performance**JS ECS Tree**ECS2D Spatial BVH16\.64 msJS spatial tree baseline#### **My Takeaways:** 1. **Cache Locality \> Algorithmic Complexity**: At 15,000 entities, pointer\-chasing and unpredictable tree branching cannot compete with the contiguous L1/L2 cache locality of a flat 1D array sort—even though trees have a better theoretical Big\-O complexity\. 2. **You Don’t Need WASM for ECS Wins**: Simply switching your JavaScript codebase to a flat Structure of Arrays \(SoA\) layout yields up to a**24x speedup**over OOP\. WASM is the cherry on top \(another 2\.5x\), not the entry ticket\. 3. **Pragmatism Wins**: While a hand\-tuned SoA is the absolute fastest, using a production ECS library like`bitECS`still gives you a massive**14x speedup**over OOP while providing a clean, scalable API\. IMO, for 99% of applications using a library is the correct engineering choice\. --- ## Potential Improvements This section outlines how to optimize the S&P broadphase even further for production\-grade game engines\. ### 1\. Eliminating Index Indirection via the ETL Physics Buffer Pattern #### **We can eliminate index indirection during the Sweep phase by extracting and physically permuting a lightweight physics buffer\.** In our simple ECS S&P, the sweep phase must read coordinates indirectly:`posX\[indices\[i\]\]`\. While we could physically sort all of our component arrays to make this contiguous, doing so in a real game is highly impractical because of archetype fragmentation and the waste of permuting non\-physics components \(like inventories or AI state\)\. Instead, we can use an**Extract\-Transform\-Load \(ETL\) Physics Buffer**pattern: 1. **Extract**: At the start of the broadphase, we copy*only*the required spatial fields \(`x, y, w, h, id`\) from the main ECS arrays into a dedicated, flat`PhysicsBuffer`\. 2. **Transform \(Sort & Permute\)**: We sort the`indices`of this buffer, and then physically permute*only*this lightweight`PhysicsBuffer`\. 3. **Contiguous Sweep**: The Sweep phase can now read the sorted`PhysicsBuffer`sequentially \(`posX\[i\]`instead of`posX\[indices\[i\]\]`\)\. 4. **Load \(Output\)**: The broadphase outputs the resulting contact pairs \(Entity IDs\) to the narrowphase, leaving the main ECS database completely untouched\. **Why this is highly practical:** - **Universal Compatibility**: Because the data is extracted into an isolated buffer, this pattern is**100% compatible with any ECS library \(like bitECS\)**\. You simply query the library, populate the buffer, solve, and write back\. - **Incredibly Fast Copies**: StraightO\(N\)O\(N\)sequential data copying in JavaScript \(using`\.set\(\)`or simple loops\) is extremely fast\. Modern CPUs are highly optimized for sequential memory copies, easily saturating memory bandwidth while keeping the prefetcher happy\. - **Measurable Gains**: Implementing this in our benchmark yielded an additional**~6% speedup**at 15,000 entities \(reducing frame times from**5\.32 ms**to**5\.01 ms**under erratic loads\)\. ### 2\. Spatial Hash Grids \(Alternative Spatial Partitioning\) #### **For uniform\-sized entities, a flat\-array Spatial Hash Grid can offerO\(1\)O\(1\)average\-case performance with zero sorting or tree\-balancing overhead\.** While we compared Sweep & Prune against a Dynamic AABB Tree, another highly cache\-friendly alternative is the**Spatial Hash Grid**\. - Unlike the AABB Tree \(which requires recursive, non\-linear tree traversal\), a Spatial Hash Grid maps 2D coordinates into a 1D grid array\. - Entities are inserted into grid cells in a singleO\(N\)O\(N\)pass, and queries are performed by checking neighboring cells inO\(1\)O\(1\)time\. - This approach is extremely cache\-friendly when implemented using flat arrays, and it completely avoids the sorting step of Sweep & Prune, making it highly optimal at very large scales\. ### 3\. Multi\-Axis Sweep & Prune #### **Sorting on the axis of highest spatial variance prevents Sweep & Prune from degrading under alignment pathological cases\.** Our simple Sweep & Prune only sorts along the X\-axis\. If many entities align vertically \(e\.g\., falling down a narrow corridor\), they all overlap on the X projection, causing the sweep phase to degrade towardO\(N2\)O\(N^2\)\. - A production\-grade Sweep & Prune calculates the**variance**of entity positions along both the X and Y axes\. - It then dynamically chooses to sort and sweep along the axis with the highest variance, minimizing the number of overlapping intervals\. ### 4\. WebAssembly SIMD Vectorization #### **We can use WASM SIMD \(Single Instruction Multiple Data\) to check four bounding box intersections simultaneously\.** Since our`PhysicsBuffer`stores coordinates contiguously, we can load 128\-bit SIMD registers with the bounds of four entities at once\. The CPU can then perform the AABB overlap checks in parallel with a single instruction, potentially quadrupling the throughput of the sweep phase\. --- ## Am I missing anything? What am I missing? What other configurations should we add? Did I mess something up? ## FAQs - **Why not use Web Workers?**Because in a microbenchmark, putting things in Web Workers can actually obscure the real bottlenecks\. If we spin up multiple threads, the browser might optimize the main thread in a vacuum, leading to “microbenchmark syndrome\.” Keeping it single\-threaded makes sure we are directly comparing how the browser’s engine manages memory access under active simulation load\. - **Why not C or C\+\+ instead of AssemblyScript?**I attempted to use C\+\+ here, but there were two main hurdles that made it slower than AssemblyScript:- **Pointer Aliasing**: AssemblyScript allowed me to easily use`unchecked\(\)`to disable pointer aliasing protection, which was a bit harder to do with C\+\+ and clang, but definitely not impossible\. - **Optimized Math Operations**: I could not figure out how to get the math library calls faster in C\+\+ to WASM \- somehow AssemblyScript links in really fast math calls for sin/cos/pow/etc, and I failed to get that working with C\+\+\. ## Appendix ### Full Benchmark Dataset For the statistics nerds and performance purists, here is the full, unadulterated dataset of our run of**200 frames**across all 18 registered simulator configurations\. **1\. Static Behavior \(Absolute Coherence \- 5,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Insertion\)0\.148 ms0\.178 ms4\.43xWASM ECS S&P \(Merge\)0\.182 ms0\.226 ms3\.60xWASM ECS S&P \(Quick\)0\.202 ms0\.243 ms3\.24xECS \(Custom SoA\) S&P \(Insertion\)0\.434 ms0\.484 ms1\.51xECS \(Custom SoA\) S&P \(Quick\)0\.445 ms0\.525 ms1\.47xECS \(Custom SoA\) S&P \(Merge\)0\.445 ms0\.507 ms1\.47xECS \(Custom SoA\) S&P \(Native\)0\.601 ms0\.807 ms1\.09xOOP S&P \(Native\)0\.604 ms0\.821 ms1\.08xOOP S&P \(Insertion\)0\.654 ms0\.912 ms1\.00xOOP S&P \(Quick\)0\.706 ms1\.067 ms0\.93xOOP S&P \(Merge\)0\.720 ms1\.044 ms0\.91xWASM Tree0\.742 ms0\.859 ms0\.88xECS Tree0\.887 ms1\.052 ms0\.74xbitECS S&P \(Insertion\)0\.888 ms0\.967 ms0\.74xbitECS S&P \(Quick\)0\.957 ms1\.067 ms0\.68xbitECS S&P \(Merge\)1\.066 ms2\.337 ms0\.61xOOP Tree1\.104 ms1\.782 ms0\.59xbitECS S&P \(Native\)1\.351 ms3\.841 ms0\.48x **2\. Wandering Behavior \(High Coherence \- 5,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Insertion\)0\.318 ms0\.355 ms3\.54xWASM ECS S&P \(Merge\)0\.343 ms0\.483 ms3\.28xWASM ECS S&P \(Quick\)0\.374 ms0\.472 ms3\.01xECS \(Custom SoA\) S&P \(Insertion\)0\.845 ms0\.963 ms1\.33xECS \(Custom SoA\) S&P \(Merge\)0\.856 ms1\.090 ms1\.31xECS \(Custom SoA\) S&P \(Quick\)0\.858 ms0\.994 ms1\.31xECS \(Custom SoA\) S&P \(Native\)1\.059 ms1\.723 ms1\.06xOOP S&P \(Insertion\)1\.124 ms1\.676 ms1\.00xOOP S&P \(Quick\)1\.201 ms1\.769 ms0\.94xOOP S&P \(Merge\)1\.325 ms1\.885 ms0\.85xOOP S&P \(Native\)1\.379 ms2\.104 ms0\.81xbitECS S&P \(Insertion\)1\.588 ms2\.603 ms0\.71xbitECS S&P \(Quick\)1\.662 ms2\.384 ms0\.68xbitECS S&P \(Merge\)1\.773 ms2\.556 ms0\.63xbitECS S&P \(Native\)1\.829 ms2\.673 ms0\.61xWASM Tree2\.233 ms2\.614 ms0\.50xECS Tree2\.353 ms2\.941 ms0\.48xOOP Tree2\.938 ms11\.642 ms0\.38x **3\. Erratic Behavior \(Low Coherence \- 5,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Quick\)0\.462 ms0\.719 ms28\.71xWASM ECS S&P \(Merge\)0\.475 ms0\.740 ms27\.91xECS \(Custom SoA\) S&P \(Quick\)1\.070 ms1\.605 ms12\.39xECS \(Custom SoA\) S&P \(Merge\)1\.125 ms1\.637 ms11\.79xECS \(Custom SoA\) S&P \(Native\)1\.487 ms2\.149 ms8\.92xOOP S&P \(Quick\)1\.549 ms1\.884 ms8\.56xOOP S&P \(Merge\)1\.649 ms2\.101 ms8\.04xbitECS S&P \(Quick\)1\.675 ms2\.174 ms7\.92xbitECS S&P \(Merge\)1\.743 ms2\.373 ms7\.61xbitECS S&P \(Native\)1\.966 ms2\.413 ms6\.74xOOP S&P \(Native\)2\.052 ms2\.765 ms6\.46xWASM ECS S&P \(Insertion\)3\.433 ms4\.153 ms3\.86xECS Tree4\.422 ms5\.360 ms3\.00xWASM Tree4\.485 ms4\.973 ms2\.96xECS \(Custom SoA\) S&P \(Insertion\)4\.511 ms6\.598 ms2\.94xOOP Tree5\.509 ms12\.983 ms2\.41xbitECS S&P \(Insertion\)11\.583 ms19\.861 ms1\.14xOOP S&P \(Insertion\)13\.258 ms18\.408 ms1\.00x **4\. Static Behavior \(Absolute Coherence \- 15,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Insertion\)0\.959 ms1\.085 ms5\.43xWASM ECS S&P \(Merge\)1\.112 ms1\.349 ms4\.69xWASM ECS S&P \(Quick\)1\.170 ms1\.449 ms4\.45xECS \(Custom SoA\) S&P \(Insertion\)3\.162 ms3\.653 ms1\.65xECS \(Custom SoA\) S&P \(Quick\)3\.293 ms3\.769 ms1\.58xECS \(Custom SoA\) S&P \(Merge\)3\.395 ms3\.954 ms1\.53xWASM Tree3\.587 ms5\.116 ms1\.45xECS \(Custom SoA\) S&P \(Native\)3\.902 ms4\.741 ms1\.34xECS Tree4\.320 ms5\.095 ms1\.21xOOP S&P \(Insertion\)5\.210 ms6\.260 ms1\.00xOOP S&P \(Native\)5\.423 ms7\.281 ms0\.96xOOP S&P \(Quick\)5\.614 ms6\.560 ms0\.93xOOP S&P \(Merge\)5\.995 ms9\.221 ms0\.87xOOP Tree6\.746 ms9\.319 ms0\.77xbitECS S&P \(Quick\)7\.456 ms10\.520 ms0\.70xbitECS S&P \(Merge\)7\.447 ms9\.227 ms0\.70xbitECS S&P \(Native\)7\.712 ms9\.912 ms0\.68xbitECS S&P \(Insertion\)8\.448 ms14\.122 ms0\.62x **5\. Wandering Behavior \(High Coherence \- 15,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Insertion\)1\.557 ms1\.859 ms4\.71xWASM ECS S&P \(Merge\)1\.767 ms2\.186 ms4\.15xWASM ECS S&P \(Quick\)1\.809 ms2\.031 ms4\.05xECS \(Custom SoA\) S&P \(Insertion\)4\.489 ms6\.483 ms1\.63xECS \(Custom SoA\) S&P \(Quick\)4\.668 ms6\.111 ms1\.57xECS \(Custom SoA\) S&P \(Merge\)4\.781 ms5\.670 ms1\.53xECS \(Custom SoA\) S&P \(Native\)5\.658 ms8\.731 ms1\.30xOOP S&P \(Insertion\)7\.335 ms9\.044 ms1\.00xOOP S&P \(Merge\)7\.540 ms9\.290 ms0\.97xOOP S&P \(Quick\)7\.645 ms10\.421 ms0\.96xOOP S&P \(Native\)8\.036 ms11\.292 ms0\.91xbitECS S&P \(Quick\)9\.077 ms10\.931 ms0\.81xWASM Tree9\.083 ms10\.938 ms0\.81xbitECS S&P \(Merge\)9\.215 ms11\.554 ms0\.80xbitECS S&P \(Native\)9\.674 ms11\.345 ms0\.76xECS Tree9\.970 ms43\.666 ms0\.74xbitECS S&P \(Insertion\)10\.730 ms37\.820 ms0\.68xOOP Tree16\.321 ms40\.756 ms0\.45x **6\. Erratic Behavior \(Low Coherence \- 15,000 Entities\)**SystemAvg Frame Time99th PercentileSpeedupWASM ECS S&P \(Quick\)2\.141 ms2\.588 ms61\.92xWASM ECS S&P \(Merge\)2\.226 ms2\.548 ms59\.56xECS \(Custom SoA\) S&P \(Quick\)5\.318 ms5\.765 ms24\.93xECS \(Custom SoA\) S&P \(Merge\)5\.668 ms6\.334 ms23\.39xECS \(Custom SoA\) S&P \(Native\)6\.991 ms8\.146 ms18\.97xOOP S&P \(Quick\)8\.380 ms9\.820 ms15\.82xOOP S&P \(Merge\)8\.826 ms12\.094 ms15\.02xbitECS S&P \(Quick\)9\.692 ms11\.258 ms13\.68xbitECS S&P \(Merge\)10\.037 ms12\.722 ms13\.21xOOP S&P \(Native\)10\.645 ms14\.415 ms12\.45xbitECS S&P \(Native\)10\.877 ms12\.864 ms12\.19xECS Tree16\.639 ms19\.107 ms7\.97xWASM Tree16\.976 ms19\.827 ms7\.81xOOP Tree29\.261 ms74\.175 ms4\.53xWASM ECS S&P \(Insertion\)32\.518 ms35\.499 ms4\.08xECS \(Custom SoA\) S&P \(Insertion\)36\.896 ms41\.801 ms3\.59xbitECS S&P \(Insertion\)101\.341 ms122\.182 ms1\.31xOOP S&P \(Insertion\)132\.579 ms140\.137 ms1\.00x

Similar Articles

Echo-Memory: A Controlled Study of Memory in Action World Models

Hugging Face Daily Papers

Echo-Memory presents a controlled study of memory mechanisms in action-conditioned world models, revealing that memory structure and capacity significantly impact open-domain return performance beyond replay fidelity. The study introduces a matched evaluation protocol and finds that raw context and state-space recurrence are strong mechanisms.

Agent Memory: An Anatomy

Hacker News Top

An exploration of the components and design decisions behind agent memory libraries, clarifying the gap between cognitive science terminology and engineering implementation.

@mem0ai: https://x.com/mem0ai/status/2064383137338233179

X AI KOLs Timeline

This article analyzes GitHub Copilot's memory architecture, which uses structured memory objects anchored to specific code citations and employs just-in-time verification to combat knowledge staleness. With memory enabled, Copilot's pull request merge rate improved from 83% to 90% in an A/B test on real developers.

@mem0ai: https://x.com/mem0ai/status/2054580022049198513

X AI KOLs Timeline

This article explains how memory works in Codex CLI, OpenAI's open-source coding agent. It describes the memory architecture based on markdown files, the write path with phased extraction and consolidation, and the read path using keyword search, all designed for predictability and low retrieval cost.