Test Group Failure: System.Runtime.Tests outerloop #56567

@josalem

Noticed these failures when I was investigating some disabled tracing tests in #56507. These failures are unrelated to the tests I turned back on in that PR, so I looked at the history.

net6.0-Linux-Debug-x64-CoreCLR_release-Ubuntu.1804.Amd64.Open

/datadisks/disk1/work/B3F20994/w/C4E20A47/e /datadisks/disk1/work/B3F20994/w/C4E20A47/e
  Discovering: System.Runtime.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Runtime.Tests (found 28 of 6255 test cases)
  Starting:    System.Runtime.Tests (parallel test collections = on, max threads = 2)
./RunTests.sh: line 162: 11202 Killed                  "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Runtime.Tests.runtimeconfig.json --depsfile System.Runtime.Tests.deps.json xunit.console.dll System.Runtime.Tests.dll -xml testResults.xml -nologo -nocolor -trait category=OuterLoop -notrait category=IgnoreForCI -notrait category=failing $RSP_FILE
/datadisks/disk1/work/B3F20994/w/C4E20A47/e
----- end Thu Jul 29 01:24:36 UTC 2021 ----- exit code 137 ----------------------------------------------------------
exit code 137 means SIGKILL Killed eg by kill
ulimit -c value: unlimited
[ 2439.914551] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2439.914551] 251 total pagecache pages
[ 2439.914552] 0 pages in swap cache
[ 2439.914553] Swap cache stats: add 0, delete 0, find 0/0
[ 2439.914553] Free swap  = 0kB
[ 2439.914553] Total swap = 0kB
[ 2439.914554] 2097038 pages RAM
[ 2439.914554] 0 pages HighMem/MovableOnly
[ 2439.914555] 58679 pages reserved
[ 2439.914555] 0 pages cma reserved
[ 2439.914555] 0 pages hwpoisoned
[ 2439.914556] Tasks state (memory values in pages):
[ 2439.914556] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2439.914560] [    447]     0   447    43216      215   331776        0             0 systemd-journal
[ 2439.914562] [    470]     0   470    24428       43    94208        0             0 lvmetad
[ 2439.914563] [    476]     0   476    11204      566   131072        0         -1000 systemd-udevd
[ 2439.914564] [    523]     0   523     3005      229    69632        0             0 hv_kvp_daemon
[ 2439.914565] [    896] 62583   896    35489      133   184320        0             0 systemd-timesyn
[ 2439.914566] [   1024]   100  1024    20021      151   176128        0             0 systemd-network
[ 2439.914567] [   1062]   101  1062    17697      173   176128        0             0 systemd-resolve
[ 2439.914569] [   1319]     0  1319    20058     3259   204800        0             0 python3
[ 2439.914570] [   1332]     0  1332    15545      168   155648        0             0 systemd-logind
[ 2439.914571] [   1333]     0  1333    42739     1957   229376        0             0 networkd-dispat
[ 2439.914572] [   1336]     0  1336    40270       32    86016        0             0 lxcfs
[ 2439.914573] [   1338]   103  1338    12514      160   143360        0          -900 dbus-daemon
[ 2439.914574] [   1366]     0  1366    72000      214   188416        0             0 accounts-daemon
[ 2439.914575] [   1372]     0  1372    27605       56   114688        0             0 irqbalance
[ 2439.914576] [   1381]     0  1381     7084       51    94208        0             0 atd
[ 2439.914577] [   1382]   102  1382    66817      364   163840        0             0 rsyslogd
[ 2439.914578] [   1391]     0  1391     7938       73    98304        0             0 cron
[ 2439.914579] [   1393]     0  1393   226267     6655   286720        0          -999 containerd
[ 2439.914580] [   1397]     0  1397     4104       38    73728        0             0 agetty
[ 2439.914581] [   1408]     0  1408     3723       32    69632        0             0 agetty
[ 2439.914582] [   1436]     0  1436    72221      197   200704        0             0 polkitd
[ 2439.914583] [   1622]     0  1622     1128       17    53248        0             0 none
[ 2439.914584] [   1785]     0  1785    18076      181   176128        0         -1000 sshd
[ 2439.914585] [   1806]     0  1806    96545     4082   266240        0             0 python3
[ 2439.914586] [   2473]  1000  2473     2899       66    65536        0             0 helix.sh
[ 2439.914588] [   2928]     0  2928   247469    11662   483328        0          -500 dockerd
[ 2439.914589] [   3295]  1000  3295    44341     6852   241664        0             0 python3
[ 2439.914590] [   3299]   106  3299     7150       46    94208        0             0 uuidd
[ 2439.914591] [   3313]  1000  3313    63593     7085   270336        0             0 python3
[ 2439.914592] [   3314]  1000  3314   124773    11968   348160        0             0 python3
[ 2439.914593] [  11190]  1000 11190     1158       16    57344        0             0 sh
[ 2439.914594] [  11192]  1000 11192     1158       17    57344        0             0 execute.sh
[ 2439.914595] [  11194]  1000 11194     2932       83    69632        0             0 bash
[ 2439.914596] [  11202]  1000 11202  2906815  1915794 15781888        0             0 dotnet
[ 2439.914597] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/helix.service,task=dotnet,pid=11202,uid=1000
[ 2439.914636] Out of memory: Killed process 11202 (dotnet) total-vm:11627260kB, anon-rss:7663176kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:15412kB oom_score_adj:0
[ 2440.040540] oom_reaper: reaped process 11202 (dotnet), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Waiting a few seconds for any dump to be written..
cat /proc/sys/kernel/core_pattern: /home/helixbot/dotnetbuild/dumps/core.%u.%p
cat /proc/sys/kernel/core_uses_pid: 0
cat: /proc/sys/kernel/coredump_filter: No such file or directory
cat /proc/sys/kernel/coredump_filter:
Looking around for any Linux dump..
... found no dump in /datadisks/disk1/work/B3F20994/w/C4E20A47/e
+ export _commandExitCode=137

and

net6.0-Linux-Debug-x64-CoreCLR_release-SLES.15.Amd64.Open

~/work/A42C0904/w/A0C3088C/e ~/work/A42C0904/w/A0C3088C/e
  Discovering: System.Runtime.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Runtime.Tests (found 28 of 6255 test cases)
  Starting:    System.Runtime.Tests (parallel test collections = on, max threads = 2)
./RunTests.sh: line 162: 19114 Killed                  "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Runtime.Tests.runtimeconfig.json --depsfile System.Runtime.Tests.deps.json xunit.console.dll System.Runtime.Tests.dll -xml testResults.xml -nologo -nocolor -trait category=OuterLoop -notrait category=IgnoreForCI -notrait category=failing $RSP_FILE
~/work/A42C0904/w/A0C3088C/e
----- end Thu Jul 29 01:40:46 UTC 2021 ----- exit code 137 ----------------------------------------------------------
exit code 137 means SIGKILL Killed eg by kill
ulimit -c value: unlimited
dmesg: read kernel buffer failed: Operation not permitted
Waiting a few seconds for any dump to be written..
cat /proc/sys/kernel/core_pattern: /home/helixbot/dotnetbuild/dumps/core.%u.%p
cat /proc/sys/kernel/core_uses_pid: 0
cat: /proc/sys/kernel/coredump_filter: No such file or directory
cat /proc/sys/kernel/coredump_filter:
Looking around for any Linux dump..
... found no dump in /home/helixbot/work/A42C0904/w/A0C3088C/e

Both appear to be the same failure, with little to no other diagnostic information. I see a few other failures in the AzDO history going back to at least June 24th, and more failures all the way back to early May. The logs for those builds are gone, so I can't verify that they are the same failure. I stopped going back in the history at May, so I'm not sure how far back this goes.

Based on the history, this test looks potentially flaky: it routinely passes but occasionally fails. The failures seem to come in pairs, e.g., if one test run fails, there is another failure within a run of the other. All records of the test in AzDO have the exact same duration, 00:01:00.00, regardless of pass or fail, so I'm not sure how much I trust these records.

I couldn't find an issue tracking this, but feel free to dup if there is already one.
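
For anyone triaging similar logs, here is a minimal sketch of how the numbers in the Ubuntu log above fit together. It decodes the 137 exit code as 128 + signal number, pulls the fields out of the kernel's "Out of memory: Killed process" line, and converts the page counts from the task table into kB. The helper names (`decode_exit_code`, `parse_oom_kill`, `pages_to_kb`) are made up for illustration and are not part of any Helix or runtime tooling, and the 4 kB page size is an assumption (the usual x86-64 default).

```python
import re
import signal

# Assumption: 4 kB pages, the usual default on x86-64 Linux.
PAGE_KB = 4

# Matches the kernel's OOM-killer summary line as it appears in dmesg.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def decode_exit_code(code):
    """Return the signal name for a shell exit code of the form 128+N, else None."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # Not a signal number known on this platform.
        return None


def parse_oom_kill(line):
    """Extract pid, process name and memory figures from an OOM-kill line, else None."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m.group("pid")),
        "name": m.group("name"),
        "total_vm_kb": int(m.group("total_vm_kb")),
        "anon_rss_kb": int(m.group("anon_rss_kb")),
    }


def pages_to_kb(pages):
    """Convert a page count from the dmesg task table to kB."""
    return pages * PAGE_KB


if __name__ == "__main__":
    line = (
        "[ 2439.914636] Out of memory: Killed process 11202 (dotnet) "
        "total-vm:11627260kB, anon-rss:7663176kB, file-rss:0kB, shmem-rss:0kB, "
        "UID:1000 pgtables:15412kB oom_score_adj:0"
    )
    print(decode_exit_code(137))
    print(parse_oom_kill(line))
    # rss column for pid 11202 and the "pages RAM" line, both in pages
    print(pages_to_kb(1915794), pages_to_kb(2097038))
```

Under the 4 kB assumption, the 1915794 resident pages listed for pid 11202 come to 7663176 kB, which matches the anon-rss in the kill line, and the 2097038 pages of RAM come to 8388152 kB. So the test process was holding about 91% of the machine's memory, with no swap configured, when it was killed. The SLES log has no equivalent data because `dmesg` was not permitted there.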
