- lovr.headset.getPassthrough returns current passthrough mode
- lovr.headset.setPassthrough sets the passthrough mode
- nil --> uses the default passthrough mode for the headset
- bool --> false = opaque, true = one of the transparent modes
- string --> explicit PassthroughMode
- lovr.headset.getPassthroughModes returns a table of supported modes
Creates a lightweight copy of a Model, for situations where a single
model needs to be rendered with multiple poses in a single frame, which
is currently not possible.
Enables automatic CPU/GPU timing for all passes. Defaults to true
when graphics debugging is active, but can be enabled/disabled manually.
When active, Pass:getStats will return submitTime and gpuTime table
keys, respectively indicating CPU time the Pass took to record and the
time the Pass took to run on the GPU. These have a delay of a few
frames.
This doesn't include a way to get "global" timing info for a submit.
This information would be useful because it doesn't require lovrrs to
sum all the timing info for all the passes and it would include other
work like transfers, synchronization, and CPU waits. However, this is
more challenging to implement under the current architecture and will be
deferred to later. Even if this were added, these per-pass timings will
remain useful.
- Make sure to reset barriers for compute/canvas resources too
- Delay stream ending so OpenXR layout transitions actually go in an
active command buffer.
If you switch to/from a compute shader and the other shader is either
nil or a graphics shader, clear bindings.
Maybe if you switch to/from nil the bindings shouldn't be cleared, but
this is a bit more complicated to implement and it's not clear that
there's any reason not to treat nil shaders as graphics shaders.
Previously, if you wanted to run compute operations that depend on the
results of prior compute operations, you had to put these in 2 different
passes, because logically all of the compute calls in a pass run "at the
same time" (or we're at least giving the GPU the freedom to do that).
Having to set up an entirely new pass just to synchronize 2 :compute
calls is pretty cumbersome, and incurs extra overhead. It would be
possible to change things so *every* :compute call waits for previous
computes to finish, but this would destroy GPU parallelism.
The Pass:barrier method lets compute calls within a pass synchronize
with each other, without requiring multiple passes. Adding a barrier
basically means "hey, wait for all the :compute calls before the barrier
to finish before running future :computes".
This lets things remain highly parallel but allows them to be easily
synchronized when needed.
Pass stores draw commands rather than sending them to Vulkan
immediately.
The main motivation is to allow more flexibility in the Lua API. Passes
are now regular objects, aren't invalidated whenever submit is called,
and can cache their draws across multiple frames. Draws can also be
internally culled, sorted, and batched.
Some API methods (tallies) are missing, and there are still some bugs to
fix, notably with background color.
View count is well-defined to be 2 with the current view configuration,
and people should be able to rely on getViewCount even before the views
are tracked. It returns the number of views in the view configuration,
not the number of views with valid data.
- If timestamp is zero (before .update is called), return empty data
instead of erroring.
- Check for valid position/orientation separately, and return empty data
for anything that's invalid. Previously both position/orientation
were used if either was valid, which returns undefined results.
- Add Buffer:newReadback
- Add Buffer:getData
- Buffer:getPointer works with permanent buffers
- Buffer:setData works with permanent buffers
- Buffer:clear works with permanent buffers
- Add Texture:newReadback
- Add Texture:getPixels
- Add Texture:setPixels
- Add Texture:clear
- Add Texture:generateMipmaps
- Buffer readbacks can now return tables in addition to Blobs using Readback:getData
Tally is coming back soon with an improved API, it's temporarily removed
since it made the transfer rework a bit easier.
Note that synchronous readbacks (Buffer:getData, Texture:getPixels)
internally call lovr.graphics.submit, so they invalidate existing Pass
objects. This will be improved soon.
If the number of skinned vertices in a Model doesn't fit in a single
dispatch (~2M vertices on 32-sized subgroups or ~500K vertices on
8-sized subgroups), split it into multiple dispatches.
- Pass:mesh accepts tables for vertices/indices
- Add Pass:setVertexFormat to set format used for table-based meshes
- Pass:send accepts tables for buffers
- Pass:send supports arbitrarily nested structs/arrays for push constants
- Buffer formats support arbitrarily nested structs/arrays
- Zero-length buffers are valid and represent structs
- Fields can have names using 'name'
- Field types can be tables of other fields (structs)
- Fields can have 'length' key
- newBuffer syntax has been changed to put format first (old version
still works)
- Buffers can be created from shader variables, avoiding need to declare
matching format.
- Pass:clear/Pass:read use byte offsets instead of indices
- Pass:copy uses byte offsets when copying a Buffer to a Buffer
- Deprecate lovr.graphics.getBuffer (tables can be used instead)
- core/spv just returns the type of image variables instead of trying to
validate them.
- When Shader is loading resources, it will reject combined image
samplers, uniform/texel buffers, and input attachments, with better
error messages that include the binding number of the invalid resource.
There were certain cases where an unrenderable texture view could be
created from a renderable parent texture (e.g. it has multiple mipmap
levels). This would leave renderView as NULL, which would cause a
crash.
If the target index is missing, the state will apply to all targets.
Fixes undefined behavior when setting color state in a pass with
multiple color attachments.
They are copied by value. Upon popping, they are pushed as temporary
vectors. Matrices are allocated on the heap, everything else is stored
in the Variant itself.
- Allow multisampled render pass to have a single-sample depth attachment
- Add a new depthResolve feature, indicating whether it's supported
- Fix stencil load/save
- Minor changes to render pass caching
- Currently the depth resolve is done using the first sample. A future
improvement would be to expose/use the min/max/average resolve modes.
- Put channel into thread module file.
- Make thread internals private.
- Handle more thread bookkeeping in thread module instead of Lua API.
- Fix a few race conditions/leaks nobody was probably ever going to hit.
Zip archives weren't enumerating in the root directory when they were
mounted with a non-empty mountpoint. Additionally, zips mounted at the
root directory weren't listing files properly. This fixes both by
normalizing the mountpoint prefix (it had a prepended slash when it was
empty, which messed up hashing), and ensuring there is a "root node" in
the tree with an empty string.
Although the name is unfortunate, this allows access to lovr.headset
when no window is opened or when the graphics module is disabled. This
requires the XR_MND_headless extension to be supported by the runtime.
Only when a readback is read back before a pass is created.
Should really change gpu to know if the frame has started yet and adjust
the tick index accordingly.
Some Android header defines DEPTH, which clashes with a symbol in the
OpenXR driver. This change just stops using Android headers in there
and declares more granular private functions. It also removes a few
unused private os functions.
ModelData:getTriangles currently adds a fresh set of vertices for every
mesh in a node. This is technically correct, but it wastes space when 2
nodes reference the same set of vertices with different index buffers,
which is pretty common when a node has multiple materials. It also
breaks ODE, who doesn't like it when vertices outnumber indices too
much.
- Add helper functions for creating shapes to avoid duplication between
newShape and newShapeCollider.
- Add lovr.physics.newMeshShape and lovr.physics.newTerrainShape
- Register TerrainShape so it has all the base Shape methods
- Smooth out a few TerrainShape warnings
Fixes easily-encounterable GPU OOM on discrete cards.
Currently when mapping CPU-accessible GPU memory, there are only two
types of memory: write and read.
The "write" allocations try to use the special 256MB pinned memory
region, with the thought that since this memory is usually for vertices,
uniforms, etc. it should be fast.
However, this memory is also used for staging buffers for buffers and
textures, which can easily exceed the 256MB (or 246MB on NV) limit upon
creating a handful of large textures.
To fix this, we're going to separate WRITE mappings into STREAM and
STAGING. STREAM will act like the old CPU_WRITE mapping type and use
the same memory type. STAGING will use plain host-visible memory and
avoid hogging the precious 256MB memory region.
STAGING also uses a different allocation strategy. Instead of creating
a big buffer with a zone for each tick, it's a more traditional linear
allocator that allocates in 4MB chunks and condemns the chunk if it ever
fills up. This is a better fit for staging buffer lifetimes since there's
usually a bunch of them at startup and then a small/sporadic amount
afterwards. The buffer doesn't need to double in size, and it doesn't
need to be kept around after the transfers are issued. The memory
really is single-use and won't roll over from frame to frame like the
other scratchpads.
The "system" button on Valve Index controller may not be exposed to
applications through OpenXR. Oculus runtime throws error when binding
for that button is attempted.
The animation compute shader was not specializing the workgroup size
properly, so it was only working on GPUs with a subgroup size of 32.
The Quest 1 has a subgroup size of 32 and the Quest 2 has a subgroup
size of 64, so this resulted in hand models breaking on Quest 2 only!
A null-char is valid part of Lua string. When such a string is sent
through the channel, its length should be stored as well to be able to
correctly reconstruct it on the other thread.
The bug was triggered with this code:
s1 = 'a \0 b'
print(#s1) -- 5
ch:push(s1)
s2 = ch:pop()
print(#s2) -- 2
- glowTexture is on by default, but still requires the glow flag.
- occlusionTexture is named ambientOcclusion, and is on by default,
but is still not used by any builtin shaders/helpers.
Sigh, back to getPass. I don't even know at this point. Basically now
that we came up with a half-solution for temp buffers, it makes sense to
apply this to passes as well, since we aren't going with the workstream
idea and temp passes are more convenient than retained passes.
- They no longer live in temporary memory, but in a dedicated pool.
- There are error checks for using a temporary buffer after it's invalid
- However, these are imperfect, and could be improved. One idea is to
avoid recycling a temporary buffer until its refcount decays (i.e.
Lua finally decides to garbage collect it). This would explode
memory usage sometimes, so it could only be enabled when
t.graphics.debug is true.
A lot of clean up can happen now that C doesn't push delayed errors to
Lua. This was happening for Pico and WebVR, neither of which are used
anymore.
Also default vsync to true but force it off if VR is active.
The sync was totally wrong here. It's a bit better now. However there
are some general sync issues that need to be fixed. Basically a Pass
that does reads and writes or multiple writes doesn't work properly, for
various reasons. I think sync needs to be split into 2 phases -- first
process all the reads and merge barrier bits into the barrier of the
last writer, then process all the writes and set 'final' resource state
for stuff in the pass. Due to branch prediction it may be better to
have 2 separate lists -- one for reads and one for writes. And I'm not
100% sure on how to reconcile a Pass that is doing reads and writes to
the same resource yet, still thinking about it.