There were certain cases where an unrenderable texture view could be
created from a renderable parent texture (e.g. it has multiple mipmap
levels). This would leave renderView as NULL, which would cause a
crash.
Currently if you create a texture from dimensions (assumed to be a
render target), its mipmap count differs depending on whether you
provide an options table. Now it will consistently be 1.
If the target index is missing, the state will apply to all targets.
Fixes undefined behavior when setting color state in a pass with
multiple color attachments.
They are copied by value. Upon popping, they are pushed as temporary
vectors. Matrices are allocated on the heap, everything else is stored
in the Variant itself.
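As a rough illustration of that storage split, here is a minimal sketch assuming a tagged union. The names (Variant, variant_from_vec3, variant_from_mat4, variant_destroy) are hypothetical and are not taken from the actual source; only the idea of "small vectors inline, matrices on the heap" comes from the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical variant: small vectors live inline, matrices on the heap.
typedef enum { VARIANT_VEC3, VARIANT_MAT4 } VariantType;

typedef struct {
  VariantType type;
  union {
    float vec3[3]; // stored in the variant itself
    float* mat4;   // 16 floats, heap-allocated and owned by the variant
  } value;
} Variant;

// Copies a vec3 by value into the variant.
Variant variant_from_vec3(const float v[3]) {
  Variant variant = { .type = VARIANT_VEC3 };
  memcpy(variant.value.vec3, v, 3 * sizeof(float));
  return variant;
}

// Copies a mat4 into a heap allocation owned by the variant.
Variant variant_from_mat4(const float m[16]) {
  Variant variant = { .type = VARIANT_MAT4 };
  variant.value.mat4 = malloc(16 * sizeof(float));
  assert(variant.value.mat4 != NULL);
  memcpy(variant.value.mat4, m, 16 * sizeof(float));
  return variant;
}

// Releases heap storage, if the variant owns any.
void variant_destroy(Variant* variant) {
  if (variant->type == VARIANT_MAT4) {
    free(variant->value.mat4);
    variant->value.mat4 = NULL;
  }
}
```

Because the data is copied, later changes to the source vector do not affect the stored value.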
This enables a few things:
- Custom vector methods can be added/replaced.
- Vector methods can be called as non-methods e.g. vec3.normalize(v)
- Possibility for vector constants in an __index metamethod.
This was measured as increasing overhead by 3.5% when creating temporary
vectors (2ns), which is arbitrarily deemed acceptable.
- Allow multisampled render pass to have a single-sample depth attachment
- Add a new depthResolve feature, indicating whether it's supported
- Fix stencil load/save
- Minor changes to render pass caching
- Currently the depth resolve is done using the first sample. A future
improvement would be to expose/use the min/max/average resolve modes.
- Put channel into thread module file.
- Make thread internals private.
- Handle more thread bookkeeping in thread module instead of Lua API.
- Fix a few race conditions/leaks nobody was probably ever going to hit.
Zip archives weren't enumerating in the root directory when they were
mounted with a non-empty mountpoint. Additionally, zips mounted at the
root directory weren't listing files properly. This fixes both by
normalizing the mountpoint prefix (it had a prepended slash when it was
empty, which messed up hashing), and ensuring there is a "root node" in
the tree with an empty string.
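A minimal sketch of the normalization idea, assuming the prefix is stored without leading or trailing slashes so that an empty mountpoint and "/" both become the empty string. The function names are hypothetical and this is not the actual filesystem code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Strips leading and trailing slashes from a mountpoint and writes the result
// to `out` (capacity `size`). Returns the normalized length, or -1 if the
// result does not fit. An empty or "/" mountpoint normalizes to "".
int normalize_mountpoint(const char* mountpoint, char* out, size_t size) {
  assert(mountpoint && out && size > 0);
  size_t length = strlen(mountpoint);
  while (length > 0 && mountpoint[0] == '/') mountpoint++, length--;
  while (length > 0 && mountpoint[length - 1] == '/') length--;
  if (length >= size) return -1;
  memcpy(out, mountpoint, length);
  out[length] = '\0';
  return (int) length;
}

// Joins a normalized mountpoint and an archive-relative path. No slash is
// added when either side is empty, so root-mounted entries keep their
// original keys and the mountpoint itself maps to the archive's root node.
int join_mounted_path(const char* mountpoint, const char* path, char* out, size_t size) {
  size_t m = strlen(mountpoint), p = strlen(path);
  size_t separator = (m > 0 && p > 0) ? 1 : 0;
  if (m + separator + p >= size) return -1;
  memcpy(out, mountpoint, m);
  if (separator) out[m] = '/';
  memcpy(out + m + separator, path, p);
  out[m + separator + p] = '\0';
  return (int) (m + separator + p);
}
```

With this shape, the key that gets hashed for a root-mounted file is just the file's own path, with no stray leading slash.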
When a memory block is used for host-visible memory, its mapped pointer
is tracked with the block. If that memory is freed and later re-used
for some non-mappable memory, the pointer never gets cleared, and so
code thinks the memory is mappable and tries to use the pointer.
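The fix can be pictured with a small sketch, assuming a hypothetical MemoryBlock record (the struct and function names are illustrative, not the real allocator's). The acquire path only writes the pointer for host-visible memory, so the release path has to clear it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical bookkeeping for a block of device memory.
typedef struct {
  bool inUse;
  void* pointer; // non-NULL only while the block backs mapped memory
} MemoryBlock;

// Claims a block. `mapping` is the mapped address for host-visible memory and
// is ignored for memory that cannot be mapped.
void block_acquire(MemoryBlock* block, bool hostVisible, void* mapping) {
  assert(!block->inUse);
  block->inUse = true;
  if (hostVisible) block->pointer = mapping;
}

// Releases a block. Clearing the pointer here is the important part: without
// it, a later non-mappable user of the block would inherit a stale pointer.
void block_release(MemoryBlock* block) {
  assert(block->inUse);
  block->inUse = false;
  block->pointer = NULL;
}
```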
Although the name is unfortunate, this allows access to lovr.headset
when no window is opened or when the graphics module is disabled. This
requires the XR_MND_headless extension to be supported by the runtime.
Only when a readback is read back before a pass is created.
Should really change gpu to know if the frame has started yet and adjust
the tick index accordingly.
- Check for layers before enabling
- Check for instance/device extensions before enabling
Fixes unfriendly errors when running on a system without validation layers
installed.
Uses same table approach as OpenXR code.
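A minimal sketch of a table-driven check, assuming the list of supported names has already been queried (in Vulkan that would come from vkEnumerateInstanceLayerProperties or vkEnumerateInstanceExtensionProperties). The Candidate struct and select_supported function are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// One row per optional layer or extension.
typedef struct {
  const char* name;
  bool wanted;   // whether the caller asked for it
  bool* enabled; // optional output flag, may be NULL
} Candidate;

static bool is_supported(const char* name, const char* const* supported, size_t count) {
  for (size_t i = 0; i < count; i++) {
    if (strcmp(name, supported[i]) == 0) return true;
  }
  return false;
}

// Writes the names that are both wanted and supported to `out` (which must
// have room for `tableCount` entries) and returns how many were written.
// Each row's `enabled` flag, if present, records the outcome.
size_t select_supported(const Candidate* table, size_t tableCount,
                        const char* const* supported, size_t supportedCount,
                        const char** out) {
  assert(table && out);
  size_t n = 0;
  for (size_t i = 0; i < tableCount; i++) {
    bool ok = table[i].wanted && is_supported(table[i].name, supported, supportedCount);
    if (table[i].enabled) *table[i].enabled = ok;
    if (ok) out[n++] = table[i].name;
  }
  return n;
}
```

The returned names can then be passed to instance or device creation, and anything missing is skipped instead of failing creation with an opaque error.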
Some Android header defines DEPTH, which clashes with a symbol in the
OpenXR driver. This change just stops using Android headers in there
and declares more granular private functions. It also removes a few
unused private os functions.
- Allow parent CMake projects to expose symbols more easily
- Allow for custom plugins folder
- Include directories are always relative to lovr's source dir
Co-authored-by: Ilya Chelyadin <ilya77105@gmail.com>
ModelData:getTriangles currently adds a fresh set of vertices for every
mesh in a node. This is technically correct, but it wastes space when 2
nodes reference the same set of vertices with different index buffers,
which is pretty common when a node has multiple materials. It also
breaks ODE, which doesn't like it when vertices outnumber indices too
much.
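One way to avoid the duplication, sketched under the assumption that each mesh can be identified by the vertex set it reads from: give every distinct vertex set a single base offset in the combined vertex array and reuse it when the set repeats. MeshRef, MAX_SOURCES and assign_base_vertices are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOURCES 16

// Hypothetical mesh reference: which vertex set it reads, and how big it is.
typedef struct {
  uint32_t source;      // id of the vertex set (must be < MAX_SOURCES)
  uint32_t vertexCount; // number of vertices in that set
} MeshRef;

// Assigns each distinct vertex set one base offset in the combined vertex
// array, reusing the offset when a set repeats. Writes the base offset of
// every mesh to `baseVertex` and returns the total number of vertices emitted.
uint32_t assign_base_vertices(const MeshRef* meshes, uint32_t count, uint32_t* baseVertex) {
  uint32_t bases[MAX_SOURCES] = { 0 };
  uint8_t seen[MAX_SOURCES] = { 0 };
  uint32_t total = 0;
  for (uint32_t i = 0; i < count; i++) {
    uint32_t source = meshes[i].source;
    assert(source < MAX_SOURCES);
    if (!seen[source]) {
      seen[source] = 1;
      bases[source] = total;
      total += meshes[i].vertexCount;
    }
    baseVertex[i] = bases[source];
  }
  return total;
}
```

Each mesh's indices are then offset by its base vertex, so two meshes sharing a vertex set contribute their indices but only one copy of the vertices.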
- Add helper functions for creating shapes to avoid duplication between
newShape and newShapeCollider.
- Add lovr.physics.newMeshShape and lovr.physics.newTerrainShape
- Register TerrainShape so it has all the base Shape methods
- Smooth out a few TerrainShape warnings
Fixes easily-encounterable GPU OOM on discrete cards.
Currently when mapping CPU-accessible GPU memory, there are only two
types of memory: write and read.
The "write" allocations try to use the special 256MB pinned memory
region, with the thought that since this memory is usually for vertices,
uniforms, etc. it should be fast.
However, this memory is also used for staging buffers for buffers and
textures, which can easily exceed the 256MB (or 246MB on NV) limit upon
creating a handful of large textures.
To fix this, we're going to separate WRITE mappings into STREAM and
STAGING. STREAM will act like the old CPU_WRITE mapping type and use
the same memory type. STAGING will use plain host-visible memory and
avoid hogging the precious 256MB memory region.
STAGING also uses a different allocation strategy. Instead of creating
a big buffer with a zone for each tick, it's a more traditional linear
allocator that allocates in 4MB chunks and condemns the chunk if it ever
fills up. This is a better fit for staging buffer lifetimes since there's
usually a bunch of them at startup and then a small/sporadic amount
afterwards. The buffer doesn't need to double in size, and it doesn't
need to be kept around after the transfers are issued. The memory
really is single-use and won't roll over from frame to frame like the
other scratchpads.
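The STAGING strategy can be sketched roughly as follows. This is a CPU-only illustration using malloc in place of GPU buffers, and the names (StagingAllocator, staging_alloc, staging_collect) are hypothetical; giving an oversized request its own chunk is also an assumption, not something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE (4u << 20) // 4MB chunks, as described above

typedef struct Chunk {
  struct Chunk* next; // link in the condemned list
  size_t size;
  size_t cursor;      // offset of the next free byte
  char* data;
} Chunk;

typedef struct {
  Chunk* current;   // chunk currently being filled
  Chunk* condemned; // full chunks awaiting destruction
} StagingAllocator;

static Chunk* chunk_create(size_t size) {
  Chunk* chunk = malloc(sizeof(Chunk));
  assert(chunk);
  chunk->data = malloc(size);
  assert(chunk->data);
  chunk->next = NULL;
  chunk->size = size;
  chunk->cursor = 0;
  return chunk;
}

// Returns `size` bytes at an offset aligned to `align` (a power of two).
// When the current chunk can't fit the request it is condemned and a fresh
// chunk takes over; the old chunk is never reused.
void* staging_alloc(StagingAllocator* allocator, size_t size, size_t align) {
  assert(align > 0 && (align & (align - 1)) == 0);
  Chunk* chunk = allocator->current;
  size_t offset = chunk ? (chunk->cursor + align - 1) & ~(align - 1) : 0;
  if (!chunk || offset + size > chunk->size) {
    if (chunk) {
      chunk->next = allocator->condemned;
      allocator->condemned = chunk;
    }
    chunk = allocator->current = chunk_create(size > CHUNK_SIZE ? size : CHUNK_SIZE);
    offset = 0;
  }
  chunk->cursor = offset + size;
  return chunk->data + offset;
}

// Frees all condemned chunks and returns how many were freed. Call this only
// once the transfers that read from them have finished.
size_t staging_collect(StagingAllocator* allocator) {
  size_t count = 0;
  while (allocator->condemned) {
    Chunk* chunk = allocator->condemned;
    allocator->condemned = chunk->next;
    free(chunk->data);
    free(chunk);
    count++;
  }
  return count;
}
```

Nothing here doubles in size or rolls over between frames: a burst of uploads at startup creates a few chunks, and they disappear once collected.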
There's a "portability enumeration" extension and flag you have to set
to get Vulkan to work on macOS. If you don't set it, Vulkan hides the
MoltenVK runtime since it's not 100% conformant. The flag was added
unconditionally, but it needs to only be added when the extension is
active.
- When a memory block is freed, any allocators that were using it need
to null out their memory blocks. Otherwise they would just keep using
the freed block.
- The emergency morgue expunge doesn't work. It would be nice if it
did, but if the emergency expunge deletes an object that exists in a
command buffer that's still being recorded, it causes all kinds of
problems and corrupts the command buffer. You'd either need to submit
these command buffers early before deleting the object (this is super
tricky) or just prevent this entirely, maybe by growing the morgue
infinitely or throwing an error if it fills up. Either way, creating
and releasing textures in a loop without submitting work will
eventually throw an 'out of memory' error. None of this is
satisfactory and I'm not sure how to solve this well yet.
- To compromise, increase the size of the morgue a bit so emergency
flushes happen slightly less often.
The "system" button on the Valve Index controller may not be exposed to
applications through OpenXR. The Oculus runtime throws an error when a
binding for that button is attempted.
When multiview is not supported (although technically lovr requires it),
the renderSize limit for array layers was zero, which meant no render
passes would work. Instead, make sure it's at least 1, which is more
correct.
The animation compute shader was not specializing the workgroup size
properly, so it was only working on GPUs with a subgroup size of 32.
The Quest 1 has a subgroup size of 32 and the Quest 2 has a subgroup
size of 64, so this resulted in hand models breaking on Quest 2 only!
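To see why a size mismatch breaks only some GPUs, here is a small arithmetic sketch. It assumes one plausible failure mode: the host computes the dispatch count from the subgroup size while the shader still runs with a fixed 32-thread workgroup. That assumption and the function names are illustrative, not taken from the actual code.

```c
#include <assert.h>
#include <stdint.h>

// Number of workgroups needed to cover `count` items with `workgroupSize`
// threads per group (rounds up).
uint32_t group_count(uint32_t count, uint32_t workgroupSize) {
  assert(workgroupSize > 0);
  return (count + workgroupSize - 1) / workgroupSize;
}

// Items actually processed when the host dispatches for `assumedSize` threads
// per group but the shader runs with `actualSize` threads per group.
uint32_t items_covered(uint32_t count, uint32_t assumedSize, uint32_t actualSize) {
  uint32_t covered = group_count(count, assumedSize) * actualSize;
  return covered < count ? covered : count;
}
```

Under this assumption, 1000 vertices dispatched for a size of 64 but executed with a size of 32 cover only 512 vertices, while matching sizes cover all of them.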
A null character is a valid part of a Lua string. When such a string is
sent through the channel, its length should be stored as well so that it
can be correctly reconstructed on the other thread.
The bug was triggered with this code:
s1 = 'a \0 b'
print(#s1) -- 5
ch:push(s1)
s2 = ch:pop()
print(#s2) -- 2
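On the C side, the fix amounts to carrying an explicit length instead of relying on strlen. A minimal sketch, assuming a hypothetical StringPayload struct; in a Lua binding the pointer and length would typically come from lua_tolstring and be handed back with lua_pushlstring.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical channel payload: raw bytes plus an explicit length.
typedef struct {
  char* data;
  size_t length;
} StringPayload;

// Copies exactly `length` bytes, including any embedded '\0' characters.
// A terminator is appended for convenience, but `length` is authoritative.
StringPayload payload_create(const char* data, size_t length) {
  StringPayload payload = { malloc(length + 1), length };
  assert(payload.data);
  memcpy(payload.data, data, length);
  payload.data[length] = '\0';
  return payload;
}

// Frees the payload's copy of the bytes.
void payload_destroy(StringPayload* payload) {
  free(payload->data);
  payload->data = NULL;
  payload->length = 0;
}
```

For the string from the example above, strlen on the copied bytes stops at the embedded null, while the stored length keeps all 5 bytes.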
- glowTexture is on by default, but still requires the glow flag.
- occlusionTexture is named ambientOcclusion, and is on by default,
but is still not used by any builtin shaders/helpers.
Sigh, back to getPass. I don't even know at this point. Basically now
that we came up with a half-solution for temp buffers, it makes sense to
apply this to passes as well, since we aren't going with the workstream
idea and temp passes are more convenient than retained passes.
- They no longer live in temporary memory, but in a dedicated pool.
- There are error checks for using a temporary buffer after it's invalid
- However, these are imperfect, and could be improved. One idea is to
avoid recycling a temporary buffer until its refcount decays (i.e.
Lua finally decides to garbage collect it). This would explode
memory usage sometimes, so it could only be enabled when
t.graphics.debug is true.