Skip to content
This repository was archived by the owner on Jan 7, 2023. It is now read-only.

Conversation

YangShi-Intel
Copy link

On Gordenpeak platform, this state flush function have problem.
It will cause GPU read some garbage data sometime.
So add time wait after cache flush.
After stress test, It can keep GPU read the right state data now.
Will continue to optimize this solution.

Signed-off-by: Yang Shi [email protected]

Sergii Romantsov and others added 30 commits September 8, 2018 01:28
Building of 32bit mesa with meson causes issue:
"implicit declaration of function ‘__builtin_ia32_clflush’".
Fixed by adding msse2 compilation flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 314879f (i965: Fix asynchronous mappings on !LLC platforms.)
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit 97fcccb)
[Andres Gomez: resolve trivial conflicts]
Signed-off-by: Andres Gomez <[email protected]>

Conflicts:
	src/intel/tools/meson.build
There were two bugs working together to make things mostly work: I wasn't
dividing the VPM output size available by the size of a batch (vertex),
but I also had the size of the VPM reduced by a factor of 8.

Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it
seems also my intermittent varying failures.

Fixes: 1561e49 ("v3d: Emit the VCM_CACHE_SIZE packet.")
(cherry picked from commit a91b158)
The Vulkan 1.1.81 spec says:

    "It is legal for offset.x + extent.width or offset.y + extent.height
    to exceed the dimensions of the framebuffer - the scissor test still
    applies as defined above. Rasterization does not produce fragments
    outside of the framebuffer, so such fragments never have the scissor
    test performed on them."

Elsewhere, the Vulkan 1.1.81 spec says:

    "The application must ensure (using scissor if necessary) that all
    rendering is contained within the render area, otherwise the pixels
    outside of the render area become undefined and shader side effects
    may occur for fragments outside the render area. The render area
    must be contained within the framebuffer dimensions."

Unfortunately, there's some room for interpretation here as to what the
consequences are of having the render area set to exactly the
framebuffer dimensions and having a scissor that is larger than the
framebuffer.  Given that GL and other APIs provide automatic clipping to
the framebuffer, it makes sense that applications would assume that
Vulkan does this as well.  It costs us very little to play it safe and
just clamp client-provided scissors to the framebuffer dimensions.
Fortunately, the user is required to provide us with at least one
scissor so we don't need to handle the case where they don't.

Fixes: fb2a5ce "anv: Emit DRAWING_RECTANGLE once at driver..."
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 465e5a8)
VI uses addrlib so it's unaffected.

Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit a1b9a00)
… on SI/CI

Cc: 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit d4e5228)
Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit da72b62)
important for debugging

Cc: 18.1 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit 662db03)
Cc: 18.2 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit a5f35aa)
Acked-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 34a17a4)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 6f00785)
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
CC: 18.2 <[email protected]>
(cherry picked from commit f6e09db)
Since commit af3685d various OpenGL applications regressed
on the classic mesa radeon driver.

Signed-off-by: Christopher Egert <[email protected]>
CC: 18.1 18.2 <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
(cherry picked from commit 51995f6)
This fixes the situation where we'd send a shader with just the
header and no data.

piglit/glsl-max-varyings test was causing this to happen, and
the renderer fix was breaking it.

v2: drop fprintf

Fixes: a8987b8 "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Erik Faye-Lund <[email protected]>
(cherry picked from commit 240af61)
Building of 32bit mesa with meson causes linkage issue:
"undefined reference to `util_get_process_name'"
Fixed by adding link-with mesa_util for xmlconfig primary.

v2: Removed '[]', commit message corrected.

v3: Reverted changes in gbm and glx libraries.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843
Fixes: 2e1e651 "util: extract get_process_name from xmlconfig.c"
Cc: Marek Olšák <[email protected]>
Cc: Dylan Baker <[email protected]>
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
(cherry picked from commit bbe551f)
We don't need to wait before drawing to the fake front buffer, as front
buffer rendering by definition is allowed to produce artifacts.

Fixes hangs in some cases when re-using the fake front buffer, due to it
still being busy (i.e. in use for presentation).

Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/106404
Bugzilla: https://bugs.freedesktop.org/107757
Tested-by: Olivier Fourdan <[email protected]>
Reviewed-by: Thomas Hellstrom <[email protected]>
(cherry picked from commit aefac10)
Otherwise they are not exported.

CC: 18.2 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]
Signed-off-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit d4bf954)
It's actually just the opposite.

This fixes the new Sascha conditionalrender demo.

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit abdf396)
Bumping to 64 should be safe enough.

Fixes some crashes with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit b9f6521)
This was wrong for descriptor #0 when all of them are indirect.
This is because indirect_offset was 0 and we emitted a
"normal" descriptor pointer for nothing.

While we are at it remove
radv_userdata_info::indirect_offset which is useless.

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit aa30205)
LLVM 6 isn't affected.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 063264d)
Let say, we first bind a graphics pipeline that needs indirect
descriptors sets. The userdata pointers will be emitted at draw
time. Then if we bind a compute pipeline that doesn't need any
indirect descriptors, the driver will re-emit them for all
grpahics stages.

To avoid this to happen, just check the bind point type.

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 748f4cc)
Indirect descriptors only need one entry, we don't have to
emit a location for every descriptors.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 9de062e)
When using Freecad, I was getting intermittent segfaults inside of
mesa.  I traced it down to this path in st_cb_drawpixels.c where the
result of pipe_transfer_map wasn't being checked.  In my case, it was
returning NULL because nouveau_bo_new returned ENOENT.  I'm by no
means a mesa developer, but this patch solves the problem for me and
seems reasonable enough.

v2: Marek - also unmap the PBO and release the texture, and call
    the make_texture function sooner for less cleanup

Cc: 18.1 18.2 <[email protected]>
(cherry picked from commit 936e0dc)
…macro

Fixes the following building error, happening when building both intel and broadcom:

Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
  File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
    p = Parser(sys.argv[2])
IndexError: list index out of range

header-gen macro is already defined by Intel genxml building rules
and the existing header-gen does not have the $(PRIVATE_VER) argument,
infact the bash command line logged in the building error is missing
exactly $(PRIVATE_VER) argument

Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk
solves the building error, another possible way is to keep the gen rules
commands expanded and not use the macros.

Fixes: 7f80a9f ("vc4: Introduce XML-based packet header generation like Intel's.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
(cherry picked from commit 3341429)
Fixes the following building error:

In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b10216 ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
(cherry picked from commit 9158e0b)
Fixes the following building error in vc4 build:

In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b10216 ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
(cherry picked from commit cc3b99b)
Otherwise using 32 user SGPRs would be broken.

CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit d97c892)
Apparently for compute there are only 16 instead of the 32 for the
graphics path.

Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0

CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 0dd8189)
If we end up never taking the loop that writes ret, we can end up with
an uninitialized value, and if we're *really* unlucky, that value can
be -1, causing us to go down an error-path instead of a success path.

This was obviously not intended, so let's just initialize this to zero.

Noticed by Valgrind:

Conditional jump or move depends on uninitialised value(s)
   at 0xBA640A0: virgl_drm_winsys_resource_cache_create (virgl_drm_winsys.c:348)
   by 0xBA62FCF: virgl_buffer_create (virgl_buffer.c:170)
   by 0xBA605AC: virgl_resource_create (virgl_resource.c:60)
   by 0xBCF816F: bufferobj_data (st_cb_bufferobjects.c:344)
   by 0xBCF816F: st_bufferobj_data (st_cb_bufferobjects.c:390)
   by 0xBB7E836: vbo_use_buffer_objects (vbo_exec_api.c:1136)
   by 0xBCFCC6E: st_create_context_priv (st_context.c:414)
   by 0xBCFD3CD: st_create_context (st_context.c:590)
   by 0xBBB30CA: st_api_create_context (st_manager.c:896)
   by 0xB981E76: dri_create_context (dri_context.c:155)
   by 0xB97BDCE: driCreateContextAttribs (dri_util.c:473)
   by 0x5288331: dri3_create_context_attribs (dri3_glx.c:309)
   by 0x5264D64: glXCreateContextAttribsARB (create_context.c:78)

Fixes: a8987b8 ("virgl: add driver for virtio-gpu 3D (v2)")
Signed-off-by: Erik Faye-Lund <[email protected]>
(cherry picked from commit eaa7185)
Those operations do not map to actual hardware instructions, therefore
those should always be lowered to 32-bit instructions.

Fixes: 009c54a "nv50/ir: Split 64-bit integer MAD/MUL operations"

Signed-off-by: Pierre Moreau <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
(cherry picked from commit 21b92b3)
Rafael Antognolli and others added 5 commits December 3, 2018 11:47
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE.
Whenever it happens, we check the current type of primitive and
enable/disable object preemption.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

Signed-off-by: Rafael Antognolli <[email protected]>
Cc: Kenneth Graunke <[email protected]>
(am from https://patchwork.freedesktop.org/patch/210952/)
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is.  Previously,
I had always done the decoding/encoding.

Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.

Reviewed-by: Tapani Pälli <[email protected]>
(cherry picked from commit 337a808)
…pport

Fixes Skqp's unitTest_EGLImageTest test.

For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.

While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
(cherry picked from commit a5c39ed)
…ort it on gen8+

Dual source blending behaviour is undefined when shader doesn't
have second color output, dismissing fragment in such situation
leads to a hang on gen8+ if depth test in enabled.

Since blending cannot be gracefully fixed in such case and the result
is undefined - blending is simply disabled.

v2 (Kenneth Graunke):
 - Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND
 - Also whack BLEND_STATE[] to keep the two in sync, since we're not
   sure exactly which copy of the redundant info the hardware will use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit eca4a65)
(cover letter https://patchwork.freedesktop.org/series/51006/)

FROMLIST: i965: SIMD32 heuristics debug flag

Added a new DEBUG_HEUR32 flag to INTEL_DEBUG flags for enabling SIMD32
selection heuristics.

(am from https://patchwork.freedesktop.org/patch/256764/)

FROMLIST: i965: SIMD32 heuristics control data

Added a new structure for holding SIMD32 heuristics control data. The
control data itself will be fetched from drirc.

(am from https://patchwork.freedesktop.org/patch/256806/)

FROMLIST: i965: SIMD32 heuristics control data from drirc

To be able to test the heuristics with different parameters, they can be
controlled via environment variables through drirc.

(am from https://patchwork.freedesktop.org/patch/256788/)

FROMLIST: mesa: Helper functions for counting set bits in a mask

(am from https://patchwork.freedesktop.org/patch/256765/)

FROMLIST: i965/fs: Save the instruction count of each dispatch width

The SIMD32 selection heuristics will use this information for deciding whether
SIMD32 shaders should be used.

(am from https://patchwork.freedesktop.org/patch/256793/)

FROMLIST: i965/fs: SIMD32 selection heuristic based on grouped texture fetches

The function goes through the compiled shader and checks how many grouped
texture fetches there are. This is a simple heuristic which gets rid of most
of the regressions when enabling SIMD32 shaders but still retains some of
the benefits.

(am from https://patchwork.freedesktop.org/patch/256798/)

FROMLIST: i965/fs: Enable all SIMD32 heuristics

There are three simple heuristics for SIMD32 shader enabling:

- How many MRTs does the shader write into?
- How many grouped texture fetches does the shader have?
- How many instructions does the SIMD32 shader have compared to the SIMD16
   shader?

For testing purposes, the heuristics can be controlled via these environment
variables:

simd32_heuristic_mrt_check
- Enables MRT write check
- Default: true

simd32_heuristic_max_mrts
- How many MRT writes the heuristic allows
- Default: 1

simd32_heuristic_grouped_check
- Enables grouped texture fetch check
- Default: true

simd32_heuristic_grouped_sends
- How many grouped texture fetches the heuristic allows
- Default: 6

simd32_heuristic_inst_check
- Enables SIMD32 vs. SIMD16 instruction count check
- Default: true

simd32_heuristic_inst_ratio
- SIMD32 vs. SIMD16 instruction count ratio the heuristic allows
- Default: 2.3

SIMD32 shaders will not be compiled also when SIMD16 compilation fails or
spills.

(am from https://patchwork.freedesktop.org/patch/256766/)
@renchenglei
Copy link
Contributor

Hi @tpalli, does this change look good for you?

@js0701
Copy link
Contributor

js0701 commented Dec 5, 2018

@renchenglei Why it include so many commits here? totally 44 files changed

@renchenglei
Copy link
Contributor

@js0701 This may be caused by one new round of mesa rebase, @YangShi-Intel will help rebase the PR. :)

On Gordenpeak platform, this state flush function have problem.
It will cause GPU read some garbage data sometime.
So add time wait after cache flush.
After stress test, It can keep GPU read the right state data now.
Will continue to optimize this solution.

Signed-off-by: Yang Shi <[email protected]>
@jennycao
Copy link
Contributor

@tpalli @strassek
Would you pls help to review the new version patch which have been rebased on latest code.
If no more comments, Pls help to merge it .

@tpalli tpalli requested a review from strassek December 10, 2018 06:07
@tpalli
Copy link
Contributor

tpalli commented Dec 10, 2018

Yeah, this is the best solution we have for now.

Copy link

@sysopenci sysopenci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autobuild started from pull-request-approved on this PR.

FAILURE: CheckBug Bad comments/Bugs

For more information, see: /absp/builders/celadon-autobuild/builds/89

device->no_hw = physical_device->no_hw;
device->lost = false;
device->needwait = false;
sprintf(procPath,"/proc/%d/comm",getpid());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does skqp populate device->instance->app_info.app_name? If so, that's what we should run the strcmp against and avoid the fread.

char procPath[WAIT_DETECT_MAX_LEN]={0};
char procName[WAIT_DETECT_MAX_LEN]={0};
int count = 0;
int need = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.