This project is mirrored from https://github.com/llvm/llvm-project.git.
  1. Apr 23, 2021
    • Duncan P. N. Exon Smith's avatar
      Coverage: Document how to collect a profile without a filesystem · d4ee603c
      Duncan P. N. Exon Smith authored
      The profiling runtime was designed to work without static initializers
      or a filesystem (see 117cf2bd and others). The no-static-initializers
      part was already documented, but this part was missed before.
      
      Differential Revision: https://reviews.llvm.org/D101000
      d4ee603c
    • Kai Nacke's avatar
      Fix the triple used in llvm-mca. · 832340ca
      Kai Nacke authored
      lookupTarget() can update the passed triple argument. This happens
      when no triple is given on the command line and the architecture
      argument does not match the architecture in the default triple.
      
      For example, when -march=aarch64 is passed on the command line and
      the default triple is x86_64-windows-msvc, the triple is changed
      to aarch64-windows-msvc.
      
      However, this triple is not saved, and later in the code the triple
      is constructed again from the triple name, which at this point is
      still the default triple. Thus the default triple is passed to the
      constructor of the MCSubtargetInfo instance.
      
      The triple is only used to determine the object file format, and by
      chance the AArch64 target also supports the COFF file format, so all
      is fine. However, the AArch64 target does not support all available
      binary file formats, e.g. XCOFF and GOFF, and llvm-mca crashes in
      those cases.
      
      The fix is to update the triple name with the changed triple name
      from the target lookup. Then the default object file format for the
      architecture is used, ELF in the example.
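      The bug pattern can be illustrated with a small sketch (Python for illustration; llvm-mca itself is C++ and the names here are hypothetical): a lookup routine rewrites the triple it is handed, but the caller discards the result and keeps rebuilding state from the stale default name.

```python
# Hypothetical sketch of the stale-triple bug pattern described above.
def lookup_target(arch, triple):
    """Mimics lookupTarget(): may rewrite the triple to match the arch."""
    default_arch = triple.split("-")[0]
    if arch != default_arch:
        # Replace the architecture component, keep OS/environment.
        triple = "-".join([arch] + triple.split("-")[1:])
    return triple

default_triple = "x86_64-windows-msvc"

# Buggy flow: the updated triple is discarded, so later code
# reconstructs state from the stale default triple name.
_ = lookup_target("aarch64", default_triple)
stale = default_triple  # still x86_64-windows-msvc

# Fixed flow: save the triple returned by the lookup.
fixed = lookup_target("aarch64", default_triple)
```

      Here fixed ends up as aarch64-windows-msvc while stale keeps the default, mirroring how the wrong triple reached the MCSubtargetInfo constructor.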
      
      Reviewed By: andreadb, abhina.sreeskantharajan
      
      Differential Revision: https://reviews.llvm.org/D100992
      832340ca
    • peter klausler's avatar
      [flang] (NFC) Document Fortran feature history · 47283e15
      peter klausler authored
      Add flang/docs/FortranFeatureHistory.md
      
      Differential Revision: https://reviews.llvm.org/D101081
      47283e15
    • Peter Collingbourne's avatar
      scudo: Use a table to look up the LSB for computing the odd/even mask. NFCI. · 4e88e587
      Peter Collingbourne authored
      In the most common case we call computeOddEvenMaskForPointerMaybe()
      from quarantineOrDeallocateChunk(), in which case we need to look up
      the class size from the SizeClassMap in order to compute the LSB. Since
      we need to do a lookup anyway, we may as well look up the LSB itself
      and avoid computing it every time.
      
      While here, switch to a slightly more efficient way of computing the
      odd/even mask.
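      The change can be sketched as follows (Python for illustration; scudo is C++ and the size-class values here are hypothetical): instead of recomputing the least significant bit of the class size on every deallocation, precompute it once per size class and do a single table load.

```python
# Illustrative size classes (hypothetical values, not scudo's real map).
SIZE_CLASSES = [32, 48, 64, 96, 128, 256]

def lsb(x):
    """Index of the least significant set bit (like a count-trailing-zeros)."""
    return (x & -x).bit_length() - 1

# Precomputed once: class index -> LSB of that class's size.
LSB_TABLE = [lsb(size) for size in SIZE_CLASSES]

def lsb_for_class(class_id):
    # One table load instead of a size lookup plus an LSB computation each time.
    return LSB_TABLE[class_id]
```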
      
      Differential Revision: https://reviews.llvm.org/D101018
      4e88e587
    • Vitaly Buka's avatar
      Revert "[sanitizer] Use COMPILER_RT_EMULATOR with gtests" · 68632826
      Vitaly Buka authored
      Missed review comments.
      
      This reverts commit e2508296.
      68632826
    • Philip Reames's avatar
      [SCEV] Compute ranges for lshr recurrences · 424d6cb9
      Philip Reames authored
      Straight forward extension to the recently added infrastructure which was pioneered with shl.
      
      Differential Revision: https://reviews.llvm.org/D99687
      424d6cb9
    • Petr Hosek's avatar
      [Driver] Specify -ccc-install-dir for linux-cross test · 45340efb
      Petr Hosek authored
      This avoids test failures where extra files exist in the tree, such
      as the standard library built using the runtimes build.
      
      Differential Revision: https://reviews.llvm.org/D101023
      45340efb
    • Philip Reames's avatar
      Revert "[instcombine] Exploit UB implied by nofree attributes" · 15e19a25
      Philip Reames authored
      This change effectively reverts 86664638, but since there have been some changes on top and I wanted to leave the tests in, it's not a mechanical revert.
      
      Why revert this now?  Two main reasons:
      1) There are continuing discussions around what the semantics of nofree are.  I am increasingly uncomfortable with the seeming possibility that we might redefine nofree in a way incompatible with these changes.
      2) There was a reported miscompile triggered by this change (https://github.com/emscripten-core/emscripten/issues/9443).  At first, I was making good progress on tracking down the issues exposed, and those issues appeared to be unrelated latent bugs.  Now that we've found at least one bug in the original change, and the investigation has stalled, I'm no longer comfortable leaving this in tree.  In retrospect, I probably should have reverted this earlier and investigated the issues once the triggering change was out of tree.
      15e19a25
    • Craig Topper's avatar
      [RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi. · e01c419e
      Craig Topper authored
      These instructions don't really exist, but we have ways we can
      emulate them.
      
      .vv will swap operands and use vmsle(u).vv. .vi will adjust the
      immediate and use vmsgt(u).vi when possible. For .vx we need to
      use some of the multi-instruction sequences from the V extension
      spec.
      
      For unmasked vmsge(u).vx we use:
        vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
      
      For cases where mask and maskedoff are the same value, we have
      vmsge{u}.vx v0, va, x, v0.t, which is the vd==v0 case that
      requires a temporary, so we use:
        vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
      
      For other masked cases we use this sequence:
        vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
      We trust that register allocation will prevent vd in vmslt{u}.vx
      from being v0 since v0 is still needed by the vmxor.
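      The rewrites above rest on simple comparison identities, sketched here in Python (illustrative only; the real lowering emits RISC-V vector instructions): a >= b is b <= a for the .vv form, x >= imm is x > imm - 1 for the .vi form, and the unmasked .vx form negates a less-than via vmnand of a mask with itself.

```python
def vmsge_vv(a, b):
    # Emulated by swapping operands and using vmsle: a >= b  <=>  b <= a.
    return b <= a

def vmsge_vi(x, imm):
    # Emulated by decrementing the immediate and using vmsgt:
    # x >= imm  <=>  x > imm - 1 (valid while imm - 1 stays encodable).
    return x > imm - 1

def vmsge_vx_unmasked(x, scalar):
    # vmslt followed by vmnand of the result with itself, i.e. a NOT:
    # x >= s  <=>  not (x < s).
    lt = x < scalar
    return not (lt and lt)  # vmnand.mm vd, vd, vd
```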
      
      Differential Revision: https://reviews.llvm.org/D100925
      e01c419e
    • Craig Topper's avatar
      [RISCV] Add missing tests for vector type for second operand of vmsgt and vmsgtu IR intrinsics. · d77d56ac
      Craig Topper authored
      Refactor to use new multiclass instead of individual patterns.
      
      We already supported this due to SEW=64 on RV32, but we didn't have
      test cases for all the types we supported.
      
      Part of D100925
      d77d56ac
    • Craig Topper's avatar
      [RISCV] Support vector type for second operand of vmfge and vmfgt IR intrinsics. · 9524a055
      Craig Topper authored
      We don't have instructions for these, but can swap the operands
      to use vmle/vmflt. This makes the IR interface more consistent and
      simplifies the frontend implementation.
      
      Part of D100925
      9524a055
    • Vitaly Buka's avatar
      37e14581
    • Vitaly Buka's avatar
    • Vitaly Buka's avatar
      e2508296
    • Andrzej Warzynski's avatar
      [flang] Update recently added OpenMP tests to use the new driver · 43831d62
      Andrzej Warzynski authored
      Switching from `%f18` to `%flang_fc1` in LIT tests added in
      https://reviews.llvm.org/D91159. This way these tests are run with the
      new driver, `flang-new`, when enabled (i.e. when
      `FLANG_BUILD_NEW_DRIVER` is set).
      
      Differential Revision: https://reviews.llvm.org/D101078
      43831d62
    • Fangrui Song's avatar
      Temporarily revert the code part of D100981 "Delete le32/le64 targets" · ef5e7f90
      Fangrui Song authored
      This partially reverts commit 77ac823f.
      
      Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934).
      Temporarily brings back the code part to give them some time for migration.
      ef5e7f90
    • Vitaly Buka's avatar
      149d5a8c
    • Craig Topper's avatar
      [RISCV] Turn splat shuffles of vector loads into strided load with stride of x0. · 70254ccb
      Craig Topper authored
      Implementations are allowed to optimize an x0 stride to perform
      fewer memory accesses. This is the case in SiFive cores.
      
      No idea if this is the case in other implementations. We might
      need a tuning flag for this.
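      A stride of x0 means every element is loaded from the same address, which is what makes the strided load usable as a splat. A sketch of the semantics (Python for illustration; the function name is hypothetical):

```python
def strided_load(mem, base, stride, n):
    """Load n elements starting at base, advancing by stride bytes each time."""
    return [mem[base + i * stride] for i in range(n)]

mem = {0: 7, 8: 9, 16: 11}

# A normal strided load walks through memory...
gather = strided_load(mem, 0, 8, 3)   # [7, 9, 11]

# ...but a zero (x0) stride rereads one location: a splat. An
# implementation may legally perform a single memory access here.
splat = strided_load(mem, 8, 0, 3)    # [9, 9, 9]
```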
      
      Reviewed By: frasercrmck, arcbbb
      
      Differential Revision: https://reviews.llvm.org/D100815
      70254ccb
    • Raphael Isemann's avatar
      [lldb] Fix that the expression commands --top-level flag overwrites --allow-jit false · d616a6bd
      Raphael Isemann authored
      The `--allow-jit` flag allows the user to force the IR interpreter to run the
      provided expression.
      
      The `--top-level` flag parses and injects the code as if it's in the top
      level scope of a source file.
      
      Both flags just change the ExecutionPolicy of the expression:
      * `--allow-jit true` -> doesn't change anything (it's the default)
      * `--allow-jit false` -> ExecutionPolicyNever
      * `--top-level` -> ExecutionPolicyTopLevel
      
      Passing `--allow-jit false` and `--top-level` currently causes the
      `--top-level` flag to silently overwrite the ExecutionPolicy value that
      was set by `--allow-jit false`. There isn't any ExecutionPolicy value
      that says "top-level but only interpret", so I would say we reject this
      combination of flags until someone finds time to refactor the top-level
      feature out of the ExecutionPolicy enum.
      
      The SBExpressionOptions suffer from a similar symptom, as `SetTopLevel`
      and `SetAllowJIT` just silently disable each other. But those functions
      don't have any error handling, so there's not a lot we can do about this
      in the meantime.
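      The proposed behavior can be sketched as follows (Python for illustration; the names are hypothetical, and the real logic lives in LLDB's C++ option parsing): both flags map onto a single ExecutionPolicy value, so an incompatible pair is rejected instead of silently resolving to whichever flag is applied last.

```python
from enum import Enum

class ExecutionPolicy(Enum):
    ALWAYS = "always"        # default: JIT allowed
    NEVER = "never"          # --allow-jit false: interpreter only
    TOP_LEVEL = "top-level"  # --top-level

def resolve_policy(allow_jit, top_level):
    # There is no "top-level but interpret-only" policy, so reject the
    # combination instead of letting one flag clobber the other.
    if top_level and not allow_jit:
        raise ValueError("--allow-jit false is incompatible with --top-level")
    if top_level:
        return ExecutionPolicy.TOP_LEVEL
    return ExecutionPolicy.ALWAYS if allow_jit else ExecutionPolicy.NEVER
```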
      
      Reviewed By: labath, kastiglione
      
      Differential Revision: https://reviews.llvm.org/D91780
      d616a6bd
    • Craig Topper's avatar
      [RISCV] Use stack temporary to splat two GPRs into SEW=64 vector on RV32. · 77f14c96
      Craig Topper authored
      Rather than splatting each GPR separately and doing bit manipulation
      to merge them in the vector domain, copy the data to the stack
      and splat it using a strided load with x0 stride. At least on
      some implementations this vector load is optimized to not do
      a load for each element.
      
      This is equivalent to how we move i64 to f64 on RV32.
      
      I've only implemented this for the intrinsic fallbacks in this
      patch. I think we do similar splatting/shifting/oring in other
      places. If this is approved, I'll refactor the others to share
      the code.
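      The idea can be sketched as follows (Python for illustration; the real code emits two stores and a strided vector load): store the two 32-bit halves contiguously to a stack slot, reread the 8 bytes as one 64-bit element, and splat it with a stride-0 load.

```python
import struct

def splat_i64_from_gprs(lo32, hi32, vl):
    """Model: spill two 32-bit GPRs to a stack slot, reload as one i64,
    then splat it with a stride-0 (x0) strided vector load."""
    # Store the halves to the "stack" (little-endian, low word first).
    stack = struct.pack("<II", lo32, hi32)
    # Reload the slot as a single 64-bit element.
    (value,) = struct.unpack("<Q", stack)
    # Stride-0 load: every lane reads the same 8 bytes.
    return [value] * vl
```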
      
      Differential Revision: https://reviews.llvm.org/D101002
      77f14c96
    • Krzysztof Parzyszek's avatar
      [Hexagon] Add HVX intrinsics for conditional vector loads/stores · deda60fc
      Krzysztof Parzyszek authored
      Intrinsics for the following instructions are added. The intrinsic
      name is "int_hexagon_<inst>[_128B]", e.g.
        int_hexagon_V6_vL32b_pred_ai        for 64-byte version
        int_hexagon_V6_vL32b_pred_ai_128B   for 128-byte version
      
      V6_vL32b_pred_ai        if (Pv4) Vd32 = vmem(Rt32+#s4)
      V6_vL32b_pred_pi        if (Pv4) Vd32 = vmem(Rx32++#s3)
      V6_vL32b_pred_ppu       if (Pv4) Vd32 = vmem(Rx32++Mu2)
      V6_vL32b_npred_ai       if (!Pv4) Vd32 = vmem(Rt32+#s4)
      V6_vL32b_npred_pi       if (!Pv4) Vd32 = vmem(Rx32++#s3)
      V6_vL32b_npred_ppu      if (!Pv4) Vd32 = vmem(Rx32++Mu2)
      
      V6_vL32b_nt_pred_ai     if (Pv4) Vd32 = vmem(Rt32+#s4):nt
      V6_vL32b_nt_pred_pi     if (Pv4) Vd32 = vmem(Rx32++#s3):nt
      V6_vL32b_nt_pred_ppu    if (Pv4) Vd32 = vmem(Rx32++Mu2):nt
      V6_vL32b_nt_npred_ai    if (!Pv4) Vd32 = vmem(Rt32+#s4):nt
      V6_vL32b_nt_npred_pi    if (!Pv4) Vd32 = vmem(Rx32++#s3):nt
      V6_vL32b_nt_npred_ppu   if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt
      
      V6_vS32b_pred_ai        if (Pv4) vmem(Rt32+#s4) = Vs32
      V6_vS32b_pred_pi        if (Pv4) vmem(Rx32++#s3) = Vs32
      V6_vS32b_pred_ppu       if (Pv4) vmem(Rx32++Mu2) = Vs32
      V6_vS32b_npred_ai       if (!Pv4) vmem(Rt32+#s4) = Vs32
      V6_vS32b_npred_pi       if (!Pv4) vmem(Rx32++#s3) = Vs32
      V6_vS32b_npred_ppu      if (!Pv4) vmem(Rx32++Mu2) = Vs32
      
      V6_vS32Ub_pred_ai       if (Pv4) vmemu(Rt32+#s4) = Vs32
      V6_vS32Ub_pred_pi       if (Pv4) vmemu(Rx32++#s3) = Vs32
      V6_vS32Ub_pred_ppu      if (Pv4) vmemu(Rx32++Mu2) = Vs32
      V6_vS32Ub_npred_ai      if (!Pv4) vmemu(Rt32+#s4) = Vs32
      V6_vS32Ub_npred_pi      if (!Pv4) vmemu(Rx32++#s3) = Vs32
      V6_vS32Ub_npred_ppu     if (!Pv4) vmemu(Rx32++Mu2) = Vs32
      
      V6_vS32b_nt_pred_ai     if (Pv4) vmem(Rt32+#s4):nt = Vs32
      V6_vS32b_nt_pred_pi     if (Pv4) vmem(Rx32++#s3):nt = Vs32
      V6_vS32b_nt_pred_ppu    if (Pv4) vmem(Rx32++Mu2):nt = Vs32
      V6_vS32b_nt_npred_ai    if (!Pv4) vmem(Rt32+#s4):nt = Vs32
      V6_vS32b_nt_npred_pi    if (!Pv4) vmem(Rx32++#s3):nt = Vs32
      V6_vS32b_nt_npred_ppu   if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
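      The predicated forms above share one semantic: the memory operation happens only when the predicate (or its negation, for the npred variants) is true; otherwise the destination register or memory is left untouched. A sketch (Python for illustration; the models are simplified to one value per vector):

```python
def v_load_pred(pred, negate, mem, addr, vd_old):
    """Model of V6_vL32b_[n]pred_*: conditional vector load.
    If the (possibly negated) predicate is false, Vd keeps its old value."""
    take = (not pred) if negate else pred
    return mem[addr] if take else vd_old

def v_store_pred(pred, negate, mem, addr, vs):
    """Model of V6_vS32b_[n]pred_*: conditional vector store.
    If the (possibly negated) predicate is false, memory is unchanged."""
    take = (not pred) if negate else pred
    if take:
        mem[addr] = vs
```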
      deda60fc
    • Joseph Huber's avatar
      [OpenMP] Add function for setting LIBOMPTARGET_INFO at runtime · 2b6f2008
      Joseph Huber authored
      Summary:
      This patch adds a new runtime function __tgt_set_info_flag that allows the
      user to set the information level at runtime without using the environment
      variable. Using this will require an extern function, but it will eventually
      be added to an auxiliary library of OpenMP support functions.
      
      This patch required moving the current InfoLevel to a global variable which must
      be instantiated by each plugin.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D100774
      2b6f2008
    • Raphael Isemann's avatar
      Fix memory leak in MicrosoftDemangleNodes's Node::toString · ae209aa9
      Raphael Isemann authored
      The buffer we turn into a std::string here is malloc'd and should be
      free'd before we return from this function.
      
      Follow up to LLDB leak fixes such as D100806.
      
      Reviewed By: mstorsjo, rupprecht, MaskRay
      
      Differential Revision: https://reviews.llvm.org/D100843
      ae209aa9
    • peter klausler's avatar
      [flang] Fix spurious errors from runtime derived type table construction · 803f1e46
      peter klausler authored
      Andrzej W. @ Arm discovered that the runtime derived type table
      building code in semantics was detecting fatal errors in the tests
      that the f18 driver wasn't printing.  This patch fixes f18 so that
      these messages are printed; however, the messages were not valid user
      errors, and the rest of this patch fixes them up.
      
      There were two sources of the bogus errors.  One was that the runtime
      derived type information table builder was calculating the shapes of
      allocatable and pointer array components in derived types, and then
      complaining that they weren't constant or LEN parameter values, which
      of course they couldn't be since they have to have deferred shapes
      and those bounds were expressions like LBOUND(component,dim=1).
      
      The second was that f18 was forwarding the actual LEN type parameter
      expressions of a type instantiation too far into the uses of those
      parameters in various expressions in the declarations of components;
      when an actual LEN type parameter is not a constant value, it needs
      to remain a "bare" type parameter inquiry so that it will be lowered
      to a descriptor inquiry and acquire a captured expression value.
      
      Fixing this up properly involved: moving some code into new utility
      function templates in Evaluate/tools.h, tweaking the rewriting of
      conversions in expression folding to elide needless integer kind
      conversions of type parameter inquiries, making type parameter
      inquiry folding *not* replace bare LEN type parameters with
      non-constant actual parameter values, and cleaning up some
      altered test results.
      
      Differential Revision: https://reviews.llvm.org/D101001
      803f1e46
    • Jianzhou Zhao's avatar
      [dfsan] Track origin at loads · 7fdf2709
      Jianzhou Zhao authored
      The first version of origin tracking tracks only memory stores. Although
      this is sufficient for understanding correct flows, it is hard to figure
      out where an undefined value is read from. To find reads of undefined
      values, we still have to do a reverse binary search from the last store
      in the chain, with printing and logging at possible code paths. This is
      quite inefficient.
      
      Tracking memory load instructions can help in this case. The main issues
      with tracking loads are performance and code size overheads.
      
      With tracking only stores, the code size overhead is 38%,
      memory overhead is 1x, and CPU overhead is 3x. In practice #load is much
      larger than #store, so both code size and CPU overhead increase. The
      first blocker is code size overhead: linking fails if we inline load
      tracking. The workaround is using external function calls to propagate
      metadata; this is also the workaround ASan uses. The CPU overhead
      is ~10x. This is a trade-off between debuggability and performance,
      and will be used only when debugging cases where tracking only stores
      is not enough.
      
      Reviewed By: gbalats
      
      Differential Revision: https://reviews.llvm.org/D100967
      7fdf2709
    • Arthur O'Dwyer's avatar
      [libc++] [test] Fix nodiscard_extensions.pass.cpp in _LIBCPP_DEBUG mode. · 5dfbcc5a
      Arthur O'Dwyer authored
      `std::clamp(2, 1, 3, std::greater<int>())` has UB because (1 > 3) is false.
      Swap the operands to fix the _LIBCPP_ASSERT failure in this test.
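      The precondition can be sketched in Python (illustrative; std::clamp itself is C++): with a descending comparator like std::greater, the bounds must also be given in descending order, i.e. comp(hi, lo) must be false, otherwise behavior is undefined.

```python
def clamp(value, lo, hi, comp):
    # Mirrors std::clamp's precondition: comp(hi, lo) must be false,
    # i.e. the bounds are ordered consistently with the comparator.
    assert not comp(hi, lo), "undefined behavior in std::clamp"
    if comp(value, lo):
        return lo
    if comp(hi, value):
        return hi
    return value

greater = lambda a, b: a > b

# Fixed call from the test: bounds swapped to match std::greater.
result = clamp(2, 3, 1, greater)   # 2 lies within [3, 1] under >
```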
      5dfbcc5a
    • Irina Dobrescu's avatar
      [flang][openmp] Add General Semantic Checks for Allocate Directive · 123ae425
      Irina Dobrescu authored
      This patch adds semantic checks for the General Restrictions of the
      Allocate Directive.
      
      Since the requires directive is not yet implemented in Flang, the
      restriction:
      ```
      allocate directives that appear in a target region must
      specify an allocator clause unless a requires directive with the
      dynamic_allocators clause is present in the same compilation unit
      ```
      will need to be updated at a later time.
      
      A different patch will be made with the Fortran specific restrictions of
      this directive.
      
      I have used the code from https://reviews.llvm.org/D89395 for the
      CheckObjectListStructure function.
      
      Co-authored-by: Isaac Perry <isaac.perry@arm.com>
      
      Reviewed By: clementval, kiranchandramohan
      
      Differential Revision: https://reviews.llvm.org/D91159
      123ae425
    • Sanjay Patel's avatar
      11232037
    • Arthur O'Dwyer's avatar
      b98b6d99
    • Arthur O'Dwyer's avatar
  2. Apr 22, 2021