mirror of https://github.com/bjornbytes/lovr.git synced 2024-07-04 21:43:34 +00:00
Commit graph

160 commits

Author SHA1 Message Date
bjorn 8b78a4d3b5 core/gpu: depth-only passes can include fragment shader; 2024-02-20 10:19:16 -08:00
bjorn 7b39a30600 Fix sType field in gpu_xr_acquire; 2024-02-09 19:01:55 -08:00
Bjorn d23164235b
Merge pull request #732 from bjornbytes/model-vertex-compression
Model Vertex Compression
2024-01-31 13:35:58 -08:00
bjorn 287769f1f2 Fix 32 bit sync flag truncation; Fix sync validation; 2024-01-29 02:10:40 -08:00
bjorn 6516cca39d Switch to VK_KHR_synchronization2; 2024-01-29 00:58:11 -08:00
bjorn 1ede3ef012 gpu_wait_idle also updates tick;
So you don't have to call `gpu_wait_tick` right after.
2024-01-25 00:13:25 -08:00
bjorn 1e02c16c5d 'sample' TextureFeature no longer implies linear filtering support;
Basically it's somewhat common for depth-stencil formats to not support
linear filtering and that is kind of annoying because you can't create
depth textures with the `sample` usage.  Instead we'll just ignore the
LINEAR format feature bit for now.

In the future I'd like to fix this by silently demoting an individual
texture's filtering to nearest when linear is not supported for the
format, but this requires per-texture sampler settings which isn't done
yet.
2024-01-20 21:21:00 -08:00
bjorn 5434dd6add Add sn10x3 data type;
The unpacking code might not be working properly...
2024-01-20 17:39:36 -08:00
bjorn e1abf0b332 Refactor render passes for thread safety;
Currently pipeline compilation accesses the render pass cache, which
presents thread safety challenges.  The framebuffer and render pass
caches are also slow and gross.

This adds a `gpu_pass` object which is basically just a VkRenderPass
object.  The graphics module creates and caches these in
`lovrPassSetCanvas`.  They are used when compiling pipelines and
beginning render passes.

Framebuffers are no longer cached but are just created and immediately
condemned dynamically when beginning a render pass.  This is fine,
because framebuffers are super cheap.

There's still technically a thread safety issue with the `gpu_pass`
object caching, but it's much easier to solve that with a lock in
`lovrPassSetCanvas` compared to trying to make core/gpu's render pass
cache thread safe.

This is all still a temporary measure until we can use a more
"ergonomic" render pass API like dynamic rendering.

Oh also we stopped using automatic layout transitions because they seem
to be pessimistic in drivers and require tying render pass objects to
texture usage which is annoying.  So now we do attachment layout
transitions manually before/after beginning/ending a render pass, which
isn't so bad.
2024-01-13 17:35:43 -08:00
bjorn 5c46d6169a Fix depth write not working when depth test is disabled;
A classic
2024-01-13 03:05:39 -08:00
bjorn 83f106b89f Fix render pass cache when compiling pipelines;
Comparing the lower 32 bits to the full hash was producing false
negatives and causing unnecessary render pass creation when creating
pipelines.
2024-01-09 14:53:27 -08:00
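To illustrate the kind of bug described in 83f106b89f, here is a minimal sketch of a cache lookup that compares a 32-bit truncated key against a full 64-bit hash. The `CacheEntry` type and both function names are hypothetical and are not taken from core/gpu; the sketch only shows why the truncated comparison misses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical cache entry; the real core/gpu layout may differ.
typedef struct {
  uint64_t hash; // full 64-bit hash of the render pass description
} CacheEntry;

// Buggy comparison: the stored key was narrowed to 32 bits, so it can only
// equal the full hash when the upper 32 bits of that hash happen to be zero.
static bool matches_truncated(uint32_t storedLow, uint64_t fullHash) {
  return (uint64_t) storedLow == fullHash;
}

// Fixed comparison: both keys are compared at full width.
static bool matches_full(const CacheEntry* entry, uint64_t fullHash) {
  assert(entry);
  return entry->hash == fullHash;
}
```

With a hash such as `0x123456789abcdef0`, the truncated comparison reports a miss even though the entry is present, which would cause a redundant render pass to be created.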
bjorn f1530e4d29 Disable fragment stage if there are no color attachments;
Fixes a validation layer error, and may improve performance.  I
think this technically means you can't do discard/FragDepth to adjust
depth buffer values, but that's kinda niche.
2024-01-09 11:56:30 -08:00
bjorn 3cf76c81e7 Refactor shader stages; rm ShaderType;
Goal is to support more combinations of shader stages, both for
vertex-only shaders and mesh/raytracing shaders in the future.  In
general most of the logic that conflated "stage count" with "shader
type" has been changed to look at individual shader stages.
2024-01-05 14:59:19 -08:00
bjorn 655ebfe626 Something something core/gpu supports descriptor arrays;
Arrays of bindings are really bad for API usability so the existing
single-descriptor API remains backwards compatible -- if you specify a
count of zero then the old "by-value" union entry is used, but you can
specify a count > 0 and then it will use an array of bindings.
2023-12-08 21:45:34 -08:00
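The count convention from 655ebfe626 could look roughly like the sketch below. The `Resource` and `Binding` types and the helper names are assumptions made for illustration, not the actual core/gpu declarations; only the rule "count of zero means the by-value entry, count > 0 means an array" comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for a buffer/texture/sampler reference.
typedef struct { uint32_t id; } Resource;

typedef struct {
  uint32_t number; // binding slot
  uint32_t count;  // 0 selects the by-value entry, > 0 selects the array
  union {
    Resource value;        // single-descriptor path (backwards compatible)
    const Resource* array; // descriptor-array path, `count` elements
  };
} Binding;

// Number of descriptors the binding contributes.
static uint32_t binding_descriptor_count(const Binding* binding) {
  return binding->count == 0 ? 1 : binding->count;
}

// Returns the index-th resource regardless of which path is in use.
static const Resource* binding_get(const Binding* binding, uint32_t index) {
  assert(index < binding_descriptor_count(binding));
  return binding->count == 0 ? &binding->value : &binding->array[index];
}
```

Callers that never set `count` keep working unchanged, since a zero-initialized count falls through to the by-value entry.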
bjorn 858bc7ffa0 Fix morgue overflow;
The morgue is a fixed-size queue for GPU resources that are waiting to
be destroyed.  There's been an annoying issue with it for a while where
destroying too many objects at once will trigger a "Morgue overflow!"
error.  Even innocuous projects that create more than 1024 textures will
see this during a normal quit.

One way to solve this problem is to make the queue unbounded instead of
bounded.  However, this can hide problems and lead to more catastrophic
failure modes.

A better solution is to add "backpressure", where we avoid putting
things in the queue if it's full, or find some way to deal with them.
In this case it means finding a way to destroy stuff in the morgue when
it's full, to make space for more victims.

We weren't able to add backpressure reliably before, because command
buffers could have commands that reference the condemned resources.
This was mostly a problem for texture transfers -- if you create
thousands of textures in a loop, we'd have a giant command buffer with
commands to transfer pixels to the textures.  If these textures were
destroyed before submitting anything, the morgue would fill up, and we
wouldn't have any way to clear space because there was still a pending
command buffer that needs to act on the textures!

A simple change is to flush all pending transfers whenever a buffer or
texture is destroyed.  This lets us add backpressure to the morgue
because we can guarantee that there are no pending command buffers that
refer to an object in the morgue.

For backpressure, we try to destroy the oldest object in the morgue if
the GPU is done using it.  If that doesn't work, we'll wait on the fence
for its tick and destroy it.  This *should* always work, although in an
extreme case you could vkDeviceWaitIdle and clear out the entire morgue.

It should also be noted that in general command buffers need to be
flushed when destroying objects that they refer to.  However, for our
particular usage patterns, we only need to flush state.stream when a
buffer or texture is destroyed.  Pass objects already refcount their
buffers and textures and their commands are software command buffers, so
they don't require any special handling.  Other objects like shaders,
pipelines, descriptor set layouts, etc. all survive until shutdown, so
those don't impact anything either.
2023-11-30 05:45:13 -08:00
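The backpressure scheme described in 858bc7ffa0 can be sketched as a bounded ring buffer that evicts its oldest entry when full. Everything here is a simplified assumption: the `Gpu` struct fakes fence waits with a counter, `MORGUE_SIZE` is tiny for readability, and none of the names are taken from the lovr source.

```c
#include <assert.h>
#include <stdint.h>

#define MORGUE_SIZE 4 // power of two; tiny for illustration only

typedef struct {
  int handle;    // stand-in for a GPU object
  uint32_t tick; // last tick in which the GPU may use the object
} Victim;

typedef struct {
  Victim data[MORGUE_SIZE];
  uint32_t head; // total victims removed so far
  uint32_t tail; // total victims added so far
} Morgue;

// Hypothetical GPU state: every tick <= finishedTick has completed.
typedef struct {
  uint32_t finishedTick;
  uint32_t waits;     // number of times we had to block
  uint32_t destroyed; // number of objects actually destroyed
} Gpu;

// Stand-in for waiting on the fence that belongs to `tick`.
static void gpu_wait_for_tick(Gpu* gpu, uint32_t tick) {
  if (gpu->finishedTick < tick) {
    gpu->finishedTick = tick;
    gpu->waits++;
  }
}

// Destroys the oldest victim, blocking only if the GPU still uses it.
static void morgue_expunge_oldest(Morgue* morgue, Gpu* gpu) {
  assert(morgue->head != morgue->tail);
  Victim* victim = &morgue->data[morgue->head % MORGUE_SIZE];
  gpu_wait_for_tick(gpu, victim->tick);
  gpu->destroyed++; // a real backend would call the matching destroy function
  morgue->head++;
}

// Adds a victim, making room first when the queue is full (backpressure).
static void morgue_condemn(Morgue* morgue, Gpu* gpu, int handle, uint32_t tick) {
  if (morgue->tail - morgue->head == MORGUE_SIZE) {
    morgue_expunge_oldest(morgue, gpu);
  }
  morgue->data[morgue->tail++ % MORGUE_SIZE] = (Victim) { handle, tick };
}
```

The sketch relies on the guarantee explained in the message: because pending transfers are flushed when a buffer or texture is destroyed, waiting on the victim's tick is enough to make destruction safe.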
bjorn d375e96c13 sRGB storage image views take 2;
There were numerous problems with the previous effort to add support for
linear views of sRGB storage textures.  Here's another attempt:

- Images are always created with the linear version of their format.
- The default texture view uses the sRGB format if the parent is sRGB.
- Use ImageViewUsageCreateInfo to specify the usage for render/storage views.
- sRGB image views always have their storage bit forcibly cleared.

The storage view now behaves more like the existing renderView -- if we
detect that you couldn't use the default texture view for storage, we'll
create one that is guaranteed to be usable for storage bindings (by
clearing the sRGB flag on it).
2023-11-28 22:47:17 -08:00
bjorn e0e1bc68f9 Add support for cubemap arrays;
- Cubemaps can have any layer count that is a multiple of 6.
- A cubemap with more than 6 layers will be a cubemap array image view.
  - This isn't perfect because it conflates regular cubemaps with
    6-layer cubemap arrays.
- Enable the vk feature, handle the spv feature, add getPixel helper.
2023-11-10 11:15:16 -08:00
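The layer rule from e0e1bc68f9 reduces to two small checks, sketched below. The enum and function names are hypothetical, and rejecting a layer count of zero is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { VIEW_CUBE, VIEW_CUBE_ARRAY } CubeViewType;

// A cubemap may have any layer count that is a multiple of 6.
// Assumption: zero layers is rejected.
static bool cubemap_layers_valid(uint32_t layers) {
  return layers > 0 && layers % 6 == 0;
}

// More than 6 layers selects a cubemap array view.  Exactly 6 layers is
// always treated as a regular cubemap, which is the conflation the
// message mentions.
static CubeViewType cubemap_view_type(uint32_t layers) {
  assert(cubemap_layers_valid(layers));
  return layers > 6 ? VIEW_CUBE_ARRAY : VIEW_CUBE;
}
```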
bjorn 62810a195c rm Pass:append;
This was an experiment that was never documented/announced.
2023-11-09 16:08:14 -08:00
bjorn c4cda0a7bb core/gpu: rm TRANSIENT texture usage;
It's implied when the usage is just RENDER
2023-11-08 14:54:10 -08:00
bjorn e8945763c2 core/gpu: allocation callbacks; 2023-11-08 14:45:04 -08:00
bjorn 038db88cb7 Consolidate texture format features;
- 'sample' now implies both sample and linear filtering (practically always
  true for all formats lovr supports)
- 'render' now includes 'blend' for color formats (also practically
  always true except for r32f on some old mobile GPUs)
- 'blit' now includes 'blitsrc'/'blitdst' because lovr doesn't support
  blitting between textures with different formats
- 'atomic' is removed because lovr doesn't really support atomic images yet
2023-11-02 15:33:29 -07:00
bjorn eac68d2fe4 Skip creating texture views for transfer-only textures;
Vulkan forbids it!
2023-11-02 13:50:55 -07:00
bjorn e9743a2fb8 Secretly create and bind linear views of sRGB storage textures; 2023-10-31 17:14:09 -07:00
bjorn 1d82e7f66c gpu_texture refactoring; 2023-10-30 18:41:51 -07:00
bjorn 9a276e5f9a Tally fixups;
- rm :getTallyData, it's totally lame, just do a readback
  - rm gpu_tally_get_data too, webgpu doesn't support it anyway
- Clamp tally copy count so it doesn't overflow buffer
- Tally buffer offset's gotta be a multiple of 4
- Return nil instead of 2 values when tally buffer isn't set
- Copy correct number of tallies (multiply by view count instead of max
  view count)
- Skip occlusion queries entirely if no tally buffer was set
2023-10-02 10:20:52 -07:00
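Two of the fixes in 9a276e5f9a (clamping the copy count and requiring a 4-byte-aligned offset) might be sketched as follows. This assumes one 32-bit result per tally per view, which the commit message does not spell out, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TALLY_STRIDE 4u // assumed size of one tally result in bytes

// The destination offset must be a multiple of 4.
static bool tally_offset_valid(uint32_t offset) {
  return offset % 4 == 0;
}

// Number of results to copy: tallies times the active view count, clamped
// to the space left in the buffer after `offset`.
static uint32_t tally_copy_count(uint32_t tallies, uint32_t viewCount, uint32_t bufferSize, uint32_t offset) {
  assert(tally_offset_valid(offset) && offset <= bufferSize);
  uint32_t wanted = tallies * viewCount;
  uint32_t room = (bufferSize - offset) / TALLY_STRIDE;
  return wanted < room ? wanted : room;
}
```

For example, 10 tallies over 2 views want 20 results, but a 32-byte buffer written from offset 16 only has room for 4, so the copy is clamped to 4.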
bjorn f66ac0820c Fix Android probably; 2023-09-20 21:17:24 -07:00
bjorn 479983fede gpu: surface improvements;
Restores ability to open window after initializing graphics module.

Surface is created lazily instead of being required upfront.

Use native platform handles instead of GLFW's callbacks.

Some minor reorganization around core/gpu present API and xr transitions.

Linux links against libxcb/libX11/libX11-xcb for XGetXCBConnection.
2023-09-20 21:17:24 -07:00
bjorn f318c68796 Buffers can be cleared to values other than zero; 2023-09-18 23:05:27 -07:00
bjorn dabbd449a8 Format support considers both linear/srgb encodings; 2023-07-10 19:21:11 -07:00
bjorn 22202af27f Revert "gpu: prefer rgb10a2 for window swapchains;"
This reverts commit b7a00c82d1.

I think srgb-encoded rgb10a2 swapchains require manual gamma correction
in shaders (?!), which we aren't quite ready for yet because shaders don't
know the "color space" of their canvas textures.
2023-07-08 15:47:58 -07:00
bjorn b7a00c82d1 gpu: prefer rgb10a2 for window swapchains;
We don't support transparent desktop windows, so losing alpha bits is fine.
2023-07-08 15:02:42 -07:00
bjorn 313fc953cc Pass:append;
Copies draws from one pass onto another one.  Experimental.
2023-06-09 21:34:39 -07:00
bjorn f90cd237ca Add new occlusion query API; 2023-05-03 23:08:45 -07:00
bjorn 28553dda9f core/gpu: skip texture allocator init for unsupported formats;
Should avoid spurious errors during init for devices that don't support
all the depth formats.
2023-05-03 21:00:41 -07:00
bjorn 452ee5c7c6 Pass rework;
Pass stores draw commands rather than sending them to Vulkan
immediately.

The main motivation is to allow more flexibility in the Lua API.  Passes
are now regular objects, aren't invalidated whenever submit is called,
and can cache their draws across multiple frames.  Draws can also be
internally culled, sorted, and batched.

Some API methods (tallies) are missing, and there are still some bugs to
fix, notably with background color.
2023-05-02 00:06:01 -07:00
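The idea behind 452ee5c7c6, recording draws into the pass and only processing them at submit time, can be sketched as below. This is a toy model under stated assumptions: the `Draw` and `Pass` layouts, the fixed `MAX_DRAWS` limit, and sorting by a pipeline id are all hypothetical and much simpler than what a real renderer would do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DRAWS 64 // arbitrary limit for the sketch

typedef struct {
  uint32_t pipeline; // hypothetical pipeline id
  uint32_t vertexCount;
} Draw;

typedef struct {
  Draw draws[MAX_DRAWS];
  uint32_t count;
} Pass;

// Recording only appends to the pass; nothing reaches the GPU yet.
static void pass_draw(Pass* pass, uint32_t pipeline, uint32_t vertexCount) {
  assert(pass->count < MAX_DRAWS);
  pass->draws[pass->count++] = (Draw) { pipeline, vertexCount };
}

static int compare_pipeline(const void* a, const void* b) {
  uint32_t x = ((const Draw*) a)->pipeline;
  uint32_t y = ((const Draw*) b)->pipeline;
  return (x > y) - (x < y);
}

// Sorts a copy of the draws by pipeline and returns how many pipeline binds
// a replay would need.  The pass itself is left untouched, so it can be
// submitted again on a later frame.
static uint32_t pass_submit(const Pass* pass) {
  Draw sorted[MAX_DRAWS];
  memcpy(sorted, pass->draws, pass->count * sizeof(Draw));
  qsort(sorted, pass->count, sizeof(Draw), compare_pipeline);
  uint32_t binds = 0;
  for (uint32_t i = 0; i < pass->count; i++) {
    if (i == 0 || sorted[i].pipeline != sorted[i - 1].pipeline) binds++;
    // A real backend would record the GPU draw for sorted[i] here.
  }
  return binds;
}
```

Because submission works on a copy, the same pass can be submitted across multiple frames without being re-recorded, which matches the motivation given in the message.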
bjorn f98306e786 rm transfer passes; rm Tally for now;
- Add Buffer:newReadback
- Add Buffer:getData
- Buffer:getPointer works with permanent buffers
- Buffer:setData works with permanent buffers
- Buffer:clear works with permanent buffers
- Add Texture:newReadback
- Add Texture:getPixels
- Add Texture:setPixels
- Add Texture:clear
- Add Texture:generateMipmaps
- Buffer readbacks can now return tables in addition to Blobs using Readback:getData

Tally is coming back soon with an improved API, it's temporarily removed
since it made the transfer rework a bit easier.

Note that synchronous readbacks (Buffer:getData, Texture:getPixels)
internally call lovr.graphics.submit, so they invalidate existing Pass
objects.  This will be improved soon.
2023-04-29 18:31:03 -07:00
bjorn b402745f6c Update comment; 2023-04-27 19:52:24 -07:00
bjorn b64fdc937f Move mapped buffers from core/gpu into graphics module; 2023-04-27 19:48:12 -07:00
bjorn cb44549205 Fix more prototypes; 2023-04-25 21:45:30 -07:00
bjorn ad4978f692 Fix prototypes; 2023-04-25 21:37:14 -07:00
bjorn 6a030ef4f2 Add shader debug info when t.graphics.debug is set; 2023-04-19 20:49:34 -07:00
bjorn ba0412182b gpu: nickname shaders; 2023-04-05 21:53:39 -07:00
bjorn 086aef1a79 Convert pipeline type bool into enum;
It's a little more readable.
Also batch compute passes for models with multiple skins, since they can
all run at the same time.
2023-03-31 18:44:31 -07:00
bjorn 06c150ce4e Pass:setBlendMode/Pass:setColorWrite take optional target index;
If the target index is missing, the state will apply to all targets.
Fixes undefined behavior when setting color state in a pass with
multiple color attachments.
2023-02-05 15:07:33 -08:00
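The optional target index from 06c150ce4e can be modeled with a sentinel that means "index omitted". The sentinel `ALL_TARGETS`, the `MAX_COLOR_TARGETS` value, and the types below are assumptions for this sketch, not the actual lovr definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COLOR_TARGETS 4    // assumed attachment limit
#define ALL_TARGETS UINT32_MAX // hypothetical sentinel for a missing index

typedef enum { BLEND_NONE, BLEND_ALPHA, BLEND_ADD } BlendMode;

typedef struct {
  BlendMode blend[MAX_COLOR_TARGETS];
} PipelineState;

// With no index the mode applies to every color target; otherwise only the
// requested target changes.
static void set_blend_mode(PipelineState* state, uint32_t target, BlendMode mode) {
  if (target == ALL_TARGETS) {
    for (uint32_t i = 0; i < MAX_COLOR_TARGETS; i++) {
      state->blend[i] = mode;
    }
  } else {
    assert(target < MAX_COLOR_TARGETS);
    state->blend[target] = mode;
  }
}
```

Writing every slot in the "missing index" case is what avoids leaving the state of the other attachments undefined when a pass has multiple color targets.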
bjorn 017c2136fd Merge branch 'master' into dev 2023-01-24 18:36:27 -08:00
bjorn 4e01f84070 Depth resolves;
- Allow multisampled render pass to have a single-sample depth attachment
- Add a new depthResolve feature, indicating whether it's supported
- Fix stencil load/save
- Minor changes to render pass caching
- Currently the depth resolve is done using the first sample.  A future
  improvement would be to expose/use the min/max/average resolve modes.
2023-01-22 23:30:57 -08:00
bjorn b24350fb31 gpu: macOS also tries linking to MoltenVK; 2023-01-21 15:27:14 -08:00
bjorn 5bb3f50d77 gpu: enable VK_EXT_swapchain_colorspace when available; 2023-01-20 22:02:32 -08:00
bjorn 14610333ab gpu: reduce tick count to 2;
4 is likely excessive, especially for VR, and increases memory usage and
cache misses.
2023-01-18 17:58:28 -08:00
bjorn 32cc6d52e7 gpu: cleanup;
- Surface stuff in struct;
- Struct for extensions;
2023-01-18 17:57:03 -08:00