The problem with a fixed count value is that it results in widely varying
performance and memory overhead, in proportions that depend on the usage
pattern of each script.
E.g. if a script keeps using the same 1M objects, then a threshold of
10K objects results in GC cycles that free only about 1% of the objects,
which is hugely wasteful in terms of performance. On the other hand,
if a script only uses 100 objects, then a threshold of 10K means it
uses 100 times more memory than it actually needs before GC triggers.
Now the threshold is a target memory usage factor (a multiple of the
minimum needed memory) which the GC aims for. This keeps the GC's
performance and memory impact proportionally constant.
The default aims at a memory usage factor of 5, i.e. roughly 80% garbage
and 20% live objects when each GC cycle triggers.
The factor is only a target because the actual overhead is not known
until GC completes. However, most scripts exhibit consistent enough
behavior that the real overhead stays within 10% of the goal even when
the usage pattern changes over time.
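As a rough sketch of one way to realize this (the struct, field, and macro
names below are assumptions, not necessarily the actual identifiers in the
tree), the threshold is recomputed after each sweep from the number of
surviving objects, so the next collection triggers once the heap has grown
to roughly the target multiple of the live set:

    /* Sketch only -- identifier names are assumptions. */
    #define GC_FACTOR 5  /* target memory usage: ~5x the live set */

    struct gc_state {
        unsigned int ncount;   /* objects allocated since the last GC */
        unsigned int nlive;    /* objects that survived the last sweep */
        unsigned int nthresh;  /* allocation count that triggers the next GC */
    };

    /* Called at the end of a collection cycle. */
    void gc_update_threshold(struct gc_state *gc)
    {
        /* Aim for ~80% garbage / ~20% live at the next collection:
           allow (GC_FACTOR - 1) * nlive further allocations. */
        gc->nthresh = gc->nlive * (GC_FACTOR - 1);
        if (gc->nthresh < 1000)
            gc->nthresh = 1000;  /* floor for tiny scripts (assumed value) */
        gc->ncount = 0;
    }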
Within the v8 bench test suite, the actual GC threshold count varies
between ~50K and ~500K. Only one test (raytrace.js) stabilizes below
10K (the previous fixed default), at about 9K, and its score decreases
by ~5%.
The splay.js score increases about 12-fold (or 50-fold compared to the
previous commit, which counts properties), other tests improve
considerably less or not at all, and the overall score increases by
nearly 40%.
Also, change the count type from int to unsigned int to double the
range. Preferably it should be even bigger, maybe uint64_t.
For instance, the splay.js v8 bench test surpasses a million non-garbage
allocations within a few seconds (that's why its score increased so much
with proportional overhead), and the default GC threshold is 5 times
that number.
Properties are not negligible in the overall allocation count.
This decreases the performance of scripts that use objects with many
properties: the GC threshold remains the same (10K) but is reached more
quickly because more allocations are counted, so GC runs more frequently
and wastes more time overall.
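A minimal sketch of the counting change, assuming hypothetical names for
the interpreter state and its GC fields:

    #include <stdlib.h>

    /* Sketch only -- the struct and field names are assumptions. */
    struct interp {
        unsigned int gccounter;  /* allocations since the last collection */
        unsigned int gcthresh;   /* allocation count that triggers a GC */
    };

    /* Previously only object allocations bumped the counter; counting
       property allocations too means the fixed 10K threshold is reached
       sooner, so collections run more often. */
    void *alloc_property(struct interp *I, size_t size)
    {
        if (++I->gccounter >= I->gcthresh) {
            /* run a collection (omitted in this sketch) */
        }
        return calloc(1, size);
    }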
A good example of the performance impact is the splay.js v8 bench test,
whose score this commit reduces by a factor of 5.
We're not changing the threshold here because it's arbitrary anyway; the
next commit will change it in a way that allows more proportional
control of the GC overhead.
When scanning an iterator object, the iterated object was marked
unconditionally. Now it's marked only if it isn't already marked, like
all other object markings.
This code had been incorrect for years but wasn't really an issue before
commit 331c5ec: marking an object twice merely used some extra CPU cycles
and was otherwise harmless, unless there were reference cycles, and
apparently iterators never (or almost never) participated in cycles, so
misbehavior was hard or impossible to trigger.
However, since 331c5ec, marking an object means inserting it into a
linked list whose nodes are embedded in the objects themselves, so
marking the same object twice now corrupts the list.
A corrupted list means that some objects are skipped during scanning,
so they don't get marked even when they should be, and they are then
freed while still referenced by other objects, resulting in random
use-after-free errors.
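A sketch of the before/after behavior, with illustrative names for the
mark flag and the intrusive list link:

    /* Sketch only -- names are illustrative. */
    struct object {
        int marked;
        struct object *gcnext;    /* intrusive mark-list link (since 331c5ec) */
        struct object *iterated;  /* the object an iterator walks over */
    };

    void mark_object(struct object *list, struct object *obj)
    {
        obj->marked = 1;
        obj->gcnext = list->gcnext;  /* insert into the mark list */
        list->gcnext = obj;
    }

    void scan_iterator(struct object *list, struct object *it)
    {
        /* Before: the iterated object was marked unconditionally, which
           re-links an already-listed object and corrupts the list.
           After: mark it only if it hasn't been marked yet. */
        if (it->iterated && !it->iterated->marked)
            mark_object(list, it->iterated);
    }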
The following functions are no longer restricted to 16-bit integer values:
String.fromCharCode()
String.prototype.charCodeAt()
repr() will not escape SMP characters, as doing so would require conversion to
surrogate pairs, but will encode these characters as UTF-8. Unicode characters
in the BMP will still be escaped with \uXXXX as before.
JSON.stringify() only escapes control characters, so it represents all
non-ASCII characters as UTF-8.
We do no automatic conversions to/from surrogate pairs. Code that worked with
surrogate pairs should not be affected by these changes.
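For illustration, here is a self-contained encoder (not the actual
routines used in the tree) showing why no surrogate-pair conversion is
needed: a supplementary-plane code point is emitted directly as a 4-byte
UTF-8 sequence.

    #include <stdio.h>

    /* Illustrative only: encode a Unicode code point as UTF-8. Code points
       above U+FFFF (the SMP and beyond) become 4 bytes directly, with no
       detour through UTF-16 surrogate pairs. */
    static int encode_utf8(unsigned long cp, char *out)
    {
        if (cp < 0x80) {
            out[0] = (char)cp;
            return 1;
        } else if (cp < 0x800) {
            out[0] = (char)(0xC0 | (cp >> 6));
            out[1] = (char)(0x80 | (cp & 0x3F));
            return 2;
        } else if (cp < 0x10000) {
            out[0] = (char)(0xE0 | (cp >> 12));
            out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[2] = (char)(0x80 | (cp & 0x3F));
            return 3;
        } else {
            out[0] = (char)(0xF0 | (cp >> 18));
            out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[3] = (char)(0x80 | (cp & 0x3F));
            return 4;
        }
    }

    int main(void)
    {
        char buf[4];
        int i, n = encode_utf8(0x1F600, buf);  /* an SMP code point */
        for (i = 0; i < n; ++i)
            printf("%02X ", (unsigned char)buf[i]);
        putchar('\n');  /* prints: F0 9F 98 80 */
        return 0;
    }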
We're supposed to check whether a string converted to an integer and back
yields the same string, while also returning the integer's value. We were
unintentionally letting integers with a leading zero through.
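A sketch of the intended round-trip check (the helper name and signature
are hypothetical):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical helper: return 1 and store the value in *out if the
       string is the canonical decimal form of an integer, else return 0.
       "007" must fail because it does not round-trip to itself. */
    static int is_canonical_int(const char *s, int *out)
    {
        char buf[32];
        char *end;
        long v = strtol(s, &end, 10);
        if (end == s || *end != '\0')
            return 0;
        snprintf(buf, sizeof buf, "%ld", v);
        if (strcmp(buf, s) != 0)
            return 0;  /* e.g. "007" formats back as "7" */
        *out = (int)v;
        return 1;
    }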
We normally build as one compilation unit with "one.c", but we should
still be able to build each source file as a separate compilation unit
if desired.
We can't know at compile time that the 'arguments' object will not be used
by code executed via eval, so err on the side of caution and always create
the arguments object if eval can be called.
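A sketch of the compile-time decision, with made-up AST and function names:
if a direct call to eval appears anywhere in the function, conservatively
force creation of the arguments object.

    #include <string.h>

    /* Sketch only -- the AST shape and names are illustrative. */
    enum ast_type { AST_CALL, AST_IDENT, AST_OTHER };

    struct ast {
        enum ast_type type;
        const char *name;    /* identifier name for AST_IDENT */
        struct ast *callee;  /* callee expression for AST_CALL */
        struct ast *a, *b;   /* child expressions/statements */
    };

    /* Return 1 if the function body may call eval; in that case the
       compiler must always materialize 'arguments', because the eval'd
       source could reference it and we cannot prove otherwise. */
    static int may_call_eval(struct ast *node)
    {
        if (!node)
            return 0;
        if (node->type == AST_CALL && node->callee &&
            node->callee->type == AST_IDENT &&
            !strcmp(node->callee->name, "eval"))
            return 1;
        return may_call_eval(node->callee) ||
               may_call_eval(node->a) || may_call_eval(node->b);
    }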