Commits · 088d38890ccee92d5ef6ae13ec1c50f9b0083866 · Panda / LLVM project

This project is mirrored from https://github.com/llvm/llvm-project.git.

Apr 08, 2022

[mlir][Arithmetic] Add constant folder for negf. · 088d3889
jacquesguan authored 3 years ago
```
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D123293
```
088d3889

[Clang][Fortify] drop inline decls when redeclared · 301e0d91

serge-sans-paille authored 3 years ago

When an inline builtin declaration is shadowed by an actual declaration, we must
reference the actual declaration, even if it's not the last, following GCC
behavior.

This fixes #54715

Differential Revision: https://reviews.llvm.org/D123308

301e0d91

[builtin_object_size] Basic support for posix_memalign · aa15ea47

serge-sans-paille authored 3 years ago

It actually implements support for seeing through loads, using alias analysis to
refine the result.

This is rather limited, but I didn't want to rely on more than the analysis
available at that point (to be gentle with compilation time), and it does seem to
catch common scenarios, as showcased by the included tests.
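
A minimal sketch of the kind of code this affects, assuming a POSIX system and a GCC- or Clang-compatible compiler. The function name `objectSizeAfterMemalign` is hypothetical. Whether the builtin reports the allocation size depends on the compiler version and optimization level; when the size is unknown, `__builtin_object_size(p, 0)` evaluates to `(size_t)-1`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocates 128 bytes through posix_memalign and asks
// the compiler how large it believes the pointed-to object is.
std::size_t objectSizeAfterMemalign() {
  void* p = nullptr;
  if (posix_memalign(&p, 64, 128) != 0)
    return static_cast<std::size_t>(-1);  // allocation failed, size unknown
  assert(p != nullptr);
  // The pointer is read back from 'p' after the call, which is the
  // "seeing through loads" situation the commit message describes.
  const std::size_t size = __builtin_object_size(p, 0);
  std::free(p);
  return size;
}
```

A caller should therefore be prepared for either 128 or the "unknown" value.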

Differential Revision: https://reviews.llvm.org/D122431

aa15ea47

[clang][deps] Ensure deterministic filename case · b672638d

Jan Svoboda authored 3 years ago

The dependency scanner can reuse a single FileManager instance across multiple translation units. This may lead to non-deterministic output depending on which TU gets processed first.

One of the problems is that Clang uses DirectoryEntry::getName in the header search algorithm. This function returns the path that was first used to construct the (shared) entry in FileManager. Using DirectoryEntryRef::getName instead preserves the case as it was spelled out for the current "get directory entry" request.
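
The difference between the two accessors can be modeled with a small sketch. `DirCache`, `Entry` and `EntryRef` are hypothetical stand-ins, not Clang's `FileManager`, `DirectoryEntry` or `DirectoryEntryRef`, and the case-folded map key is only an assumption used to imitate a case-insensitive file system.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Shared entry: remembers only the spelling of the first request.
struct Entry { std::string firstSpelling; };

// Per-request handle: points at the shared entry but keeps its own spelling.
struct EntryRef {
  const Entry* entry;
  std::string requestedSpelling;
};

class DirCache {
public:
  EntryRef get(const std::string& path) {
    assert(!path.empty() && "expected a non-empty path");
    std::string key = path;
    std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    // try_emplace leaves an existing entry untouched, so the first spelling wins.
    auto it = entries.try_emplace(key, Entry{path}).first;
    return EntryRef{&it->second, path};
  }

private:
  std::map<std::string, Entry> entries;  // keyed by case-folded path
};
```

Two requests that differ only in case share one `Entry`, so `firstSpelling` depends on request order, while `requestedSpelling` does not.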

rdar://90647508

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D123229

b672638d

Reland "[RISCV][NFC] Moving RVV intrinsic type related util to llvm/Support" · fc2d8326

Kito Cheng authored 3 years ago

Reland Note: We've resolved the circular dependency issue between llvm/lib/Support and
llvm/TableGen.

Differential Revision: https://reviews.llvm.org/D121984

fc2d8326

Bump minimum toolchain version · 4c72deb6

Tobias Hieta authored 3 years ago

RFC: https://discourse.llvm.org/t/rfc-increasing-the-gcc-and-clang-requirements-to-support-c-17-in-llvm

Following the policy here: https://llvm.org/docs/DeveloperPolicy.html#toolchain

This forum post here will be updated with the timeline and status: https://discourse.llvm.org/t/important-new-toolchain-requirements-to-build-llvm-will-most-likely-be-landing-within-a-week-prepare-your-buildbots/61447

Reviewed By: mehdi_amini, jyknight, jhenderson, cor3ntin, MaskRay

Differential Revision: https://reviews.llvm.org/D122976

4c72deb6

Introduce branchless sorting functions for sort3, sort4 and sort5. · 194d1965

Marco Gelmi authored 3 years ago

We are introducing branchless variants for sort3, sort4 and sort5.
These sorting functions have been generated using Reinforcement
Learning and aim to replace __sort3, __sort4 and __sort5 variants
for integral types.
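
For readers unfamiliar with the idea, a branchless three-element sort can be sketched as a fixed sequence of compare-exchange steps. This is a generic sorting network, not the sequence generated for libc++, and the names `condSwap` and `sort3Branchless` are hypothetical. Whether `std::min`/`std::max` compile to conditional moves depends on the compiler and target.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Orders a and b without an explicit if/else on their values.
inline void condSwap(std::uint32_t& a, std::uint32_t& b) {
  const std::uint32_t lo = std::min(a, b);
  const std::uint32_t hi = std::max(a, b);
  a = lo;
  b = hi;
}

// Three compare-exchange steps always executed in the same order.
inline void sort3Branchless(std::uint32_t& x, std::uint32_t& y,
                            std::uint32_t& z) {
  condSwap(y, z);  // now y <= z
  condSwap(x, z);  // now z holds the maximum of all three
  condSwap(x, y);  // order the remaining two
  assert(x <= y && y <= z);
}
```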

The libc++ benchmarks were run on isolated machines for Skylake, ARM and
AMD architectures and achieve a statistically significant improvement in
sorting random integers on test cases from sort1 to sort262144 for
uint32 and uint64.

A full performance overview for Intel Skylake, AMD and Arm can be
found here: https://bit.ly/3AtesYf

Reviewed By: ldionne, #libc, philnik

Spies: daniel.mankowitz, mgrang, Quuxplusone, andreamichi, philnik, libcxx-commits, nilayvaish, kristof.beyls

Differential Revision: https://reviews.llvm.org/D118029

194d1965

compiler-rt: Add udivmodei5 to builtins and add bitint library · bf2dc4b3

Matthias Gehre authored 3 years ago

According to the RFC [0], this review contains the compiler-rt parts of large integer division for _BitInt.

It adds the functions
```
/// Computes the unsigned division of a / b for two large integers
/// composed of n significant words.
/// Writes the quotient to quo and the remainder to rem.
///
/// \param quo The quotient represented by n words. Must be non-null.
/// \param rem The remainder represented by n words. Must be non-null.
/// \param a The dividend represented by n + 1 words. Must be non-null.
/// \param b The divisor represented by n words. Must be non-null.
///
/// \note The word order is in host endianness.
/// \note Might modify a and b.
/// \note The storage of 'a' needs to hold n + 1 elements because some
///       implementations need extra scratch space in the most significant word.
///       The value of that word is ignored.
COMPILER_RT_ABI void __udivmodei5(su_int *quo, su_int *rem, su_int *a,
                                  su_int *b, unsigned int n);

/// Computes the signed division of a / b.
/// See __udivmodei5 for details.
COMPILER_RT_ABI void __divmodei5(su_int *quo, su_int *rem, su_int *a, su_int *b,
                                 unsigned int words);
```
into builtins.
In addition it introduces a new "bitint" library containing only those new functions,
which is meant as a way to provide those when using libgcc as runtime.
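
To illustrate what a word-array division computes, here is a simple shift-and-subtract sketch. It is not the compiler-rt implementation: `udivmodWords` and its helpers are hypothetical, it uses `std::vector` instead of raw pointers, it assumes least-significant-word-first order, and it does not use the extra scratch word that `__udivmodei5` reserves in `a`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Word = std::uint32_t;
using Words = std::vector<Word>;  // least-significant word first (assumption)

static bool isZero(const Words& v) {
  for (Word w : v)
    if (w != 0) return false;
  return true;
}

// Returns true when a >= b; both operands must have the same length.
static bool greaterOrEqual(const Words& a, const Words& b) {
  for (std::size_t i = a.size(); i-- > 0;)
    if (a[i] != b[i]) return a[i] > b[i];
  return true;
}

// Computes a -= b modulo 2^(32 * n).
static void subtractInPlace(Words& a, const Words& b) {
  std::uint64_t borrow = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const std::uint64_t diff = std::uint64_t{a[i]} - b[i] - borrow;
    a[i] = static_cast<Word>(diff);
    borrow = diff >> 63;  // set when the 64-bit subtraction wrapped around
  }
}

// Shifts v left by one bit, feeding lowBit into bit 0; returns the bit shifted out.
static Word shiftLeftOne(Words& v, Word lowBit) {
  for (Word& w : v) {
    const Word out = w >> 31;
    w = (w << 1) | lowBit;
    lowBit = out;
  }
  return lowBit;
}

// Hypothetical helper: quo = a / b and rem = a % b, one quotient bit per step.
void udivmodWords(const Words& a, const Words& b, Words& quo, Words& rem) {
  const std::size_t n = a.size();
  assert(b.size() == n && "operands must have the same word count");
  assert(!isZero(b) && "division by zero");
  quo.assign(n, 0);
  rem.assign(n, 0);
  for (std::size_t bit = 32 * n; bit-- > 0;) {
    const Word next = (a[bit / 32] >> (bit % 32)) & 1u;
    const Word carry = shiftLeftOne(rem, next);
    if (carry != 0 || greaterOrEqual(rem, b)) {
      subtractInPlace(rem, b);
      quo[bit / 32] |= Word{1} << (bit % 32);
    }
  }
}
```

The loop runs once per bit, so it is far slower than a word-at-a-time algorithm; it is meant only to make the quotient/remainder contract concrete.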

[0] https://discourse.llvm.org/t/rfc-add-support-for-division-of-large-bitint-builtins-selectiondag-globalisel-clang/60329

Differential Revision: https://reviews.llvm.org/D120327

bf2dc4b3

[mlir][NFC] Drop a few unnecessary includes from Pass.h · 36d3efea
River Riddle authored 3 years ago

36d3efea
[CSKY] Correct the alignment of FPR register · 3d4ca8a8
Zi Xuan Wu authored 3 years ago
```
The alignment of FPR64 and sFPR64 declared in RegisterClass should be 32 bit.
```
3d4ca8a8

[mlir] Add support for operation-produced successor arguments in BranchOpInterface · 0c789db5

Markus Böck authored 3 years ago

This patch revamps the BranchOpInterface a bit and allows a proper implementation of what was previously `getMutableSuccessorOperands` for operations that internally produce values for some of the block arguments. A motivating example for this would be an invoke op with an error handling path:
```
invoke %function(%0)
  label ^success ^error(%1 : i32)

^error(%e: !error, %arg0 : i32):
  ...
```
The advantage of this is that any users of `BranchOpInterface` can still reason about the remaining block argument operands (such as `%1` in the example above), as well as make use of the modifying capabilities to add more operands, erase an operand, etc.

The way this patch implements that functionality is via a new class called `SuccessorOperands`, which is now returned by `getSuccessorOperands`. It basically contains an `unsigned` denoting how many operation-produced operands exist, as well as a `MutableOperandRange`, which holds the usual forwarded operands we are used to. The produced operands are assumed to be the first few block arguments, followed by the forwarded operands. The role of `SuccessorOperands` is to provide various utility functions to modify and query the successor arguments from a `BranchOpInterface`.
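
The index mapping described above can be sketched in plain C++. `SuccessorOperandsSketch` and its member names are hypothetical, and operands are modeled as strings rather than MLIR values; the real class is defined in the patch itself.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model: a count of produced operands plus a mutable list of
// forwarded operands that follow them in block-argument order.
class SuccessorOperandsSketch {
public:
  SuccessorOperandsSketch(unsigned producedCount,
                          std::vector<std::string>& forwarded)
      : producedCount(producedCount), forwarded(forwarded) {}

  // Total number of block arguments covered: produced first, then forwarded.
  unsigned size() const {
    return producedCount + static_cast<unsigned>(forwarded.size());
  }

  bool isOperandProduced(unsigned index) const { return index < producedCount; }

  // Returns the forwarded operand feeding a block argument, or nothing when
  // the operation itself produces that argument.
  std::optional<std::string> forwardedOperand(unsigned index) const {
    assert(index < size() && "block argument index out of range");
    if (isOperandProduced(index)) return std::nullopt;
    return forwarded[index - producedCount];
  }

  // Mutation goes through to the underlying operand list.
  void append(const std::string& operand) { forwarded.push_back(operand); }

private:
  unsigned producedCount;
  std::vector<std::string>& forwarded;
};
```

For the `^error(%e, %arg0)` block in the invoke example, one produced operand and the forwarded list `{"%1"}` would give `%e` no forwarded operand and map `%arg0` to `%1`.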

Differential Revision: https://reviews.llvm.org/D123062

0c789db5

[asan] Always skip first object from dl_iterate_phdr · 795b07f5

Michael Forney authored 3 years ago

All platforms return the main executable as the first dl_phdr_info.
FreeBSD, NetBSD, Solaris, and Linux-musl place the executable name
in the dlpi_name field of this entry. It appears that only Linux-glibc
uses the empty string.

To make this work generically on all platforms, unconditionally
skip the first object (as is currently done for FreeBSD and NetBSD).
This fixes first DSO detection on Linux-musl. It also would likely
fix detection on Solaris/Illumos if it were to gain PIE support
(since dlpi_addr would not be NULL).

Additionally, only skip the Linux VDSO on Linux.

Finally, use the empty string as the "seen first dl_phdr_info"
marker rather than (char *)-1. If there was no other object, we
would try to dereference it for a string comparison.
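
The selection logic can be sketched over a plain list of object names instead of a real `dl_iterate_phdr` callback. `firstDsoName` is a hypothetical helper, the `"linux-"` prefix test for the VDSO is an assumption of this sketch, and a boolean replaces the string marker the commit message mentions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the name of the first shared object after the main executable,
// or an empty string when there is none.
std::string firstDsoName(const std::vector<std::string>& objects,
                         bool isLinux) {
  bool seenFirst = false;
  for (const std::string& name : objects) {
    if (!seenFirst) {
      // The first entry is the main executable on every platform, whatever
      // its name field contains, so skip it unconditionally.
      seenFirst = true;
      continue;
    }
    // Only Linux has a VDSO entry to ignore (prefix check is an assumption).
    if (isLinux && name.rfind("linux-", 0) == 0) continue;
    assert(seenFirst);
    return name;
  }
  return "";
}
```

The same call works whether the first entry is named (musl, the BSDs) or empty (glibc).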

Reviewed By: MaskRay, vitalybuka

Differential Revision: https://reviews.llvm.org/D119515

795b07f5

[llvm-profgen] Filter out invalid LBR ranges. · 8a0406dc

Hongtao Yu authored 3 years ago

The profiler can sometimes give us an LBR trace that implicates bogus code ranges. For example,

    0xc5acb56/0xc66c6c0 0xc628195/0xf31fbb0 0xc611261/0xc628130 0xc5c1a21/0xc6111c0 0x1f7edfd3/0xc5c3a50 0xc5c154f/0x1f7edec0 0xe8eed07/0xc5c11e0

Note that the first two pairs are supposed to form a linear execution range. In this case, it is [0xf31fbb0, 0xc5acb56], which doesn't make sense.

Such bogus ranges should be ruled out to avoid generating a bad profile. I'm fixing this for both CS and non-CS cases.
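
A minimal sketch of one such check, assuming each pair is source/target with the newest branch first, and assuming that "bogus" means the range start lies above its end. The actual criteria in llvm-profgen may differ; `Branch`, `Range` and `validRanges` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Branch { std::uint64_t source; std::uint64_t target; };
struct Range { std::uint64_t start; std::uint64_t end; };

// Builds the linear ranges between consecutive branches and drops any range
// whose start address is above its end address.
std::vector<Range> validRanges(const std::vector<Branch>& lbr) {
  std::vector<Range> ranges;
  for (std::size_t i = 0; i + 1 < lbr.size(); ++i) {
    // Execution runs from the target of the older branch (i + 1) up to the
    // source of the newer branch (i).
    const Range r{lbr[i + 1].target, lbr[i].source};
    if (r.start > r.end) continue;  // bogus range, skip it
    assert(r.start <= r.end);
    ranges.push_back(r);
  }
  return ranges;
}
```

With the first three pairs of the trace above, the range [0xf31fbb0, 0xc5acb56] is dropped and [0xc628130, 0xc628195] is kept.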

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D123271

8a0406dc

[CSKY] support select instruction in floating type · 208f93c1

Zi Xuan Wu authored 3 years ago

In FPUv3, there is an fsel.32/64 instruction to select float/double type data.
In FPUv2, split the block and use branch and move instructions to select float/double type data.

208f93c1

[demangler] Support C23 _BitInt type · a23652f6

Senran Zhang authored 3 years ago

Reviewed By: #libc_abi, aaron.ballman, urnathan

Differential Revision: https://reviews.llvm.org/D122530

a23652f6

NFC: Silence unused function 'scaleAndAdd' in release build. · 497f87bb
Stella Laurenzo authored 3 years ago
```
Differential Revision: https://reviews.llvm.org/D123354
```
497f87bb
[RISCV][NFC] Add missing lit.local.cfg in test/CodeGen/MIR/RISCV/ · 5286c7ae
Kito Cheng authored 3 years ago

5286c7ae
[gn build] Port 690085c9 · a5daf81d
LLVM GN Syncbot authored 3 years ago

a5daf81d

[libomptarget] Implement pointer lookup as 5.1 spec. · c1a6fe19

Ye Luo authored 3 years ago

As described in the OpenMP 5.1 spec, section
2.21.7.2 Pointer Initialization for Device Data Environments

Reviewed By: RaviNarayanaswamy

Differential Revision: https://reviews.llvm.org/D123093

c1a6fe19

[RISCV] Fixing stack offset for RVV object with vararg in stack. · 9c5aedfb

Kito Cheng authored 3 years ago

We found that LLVM generates the wrong stack offset for RVV objects when the
stack contains variable arguments, because the vararg part was not counted
when calculating offsets for RVV stack objects.

Also update the stack layout diagram to include the vararg part.

Stack layout ref:
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/riscv/riscv.cc#L3941

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D123180

9c5aedfb

[RISCV] Pre-commit for fixing stack offset for RVV object · 7a123890
Kito Cheng authored 3 years ago
```
Reviewed By: rogfer01, frasercrmck

Differential Revision: https://reviews.llvm.org/D123179
```
7a123890

[RISCV] Store/restore RISCVMachineFunctionInfo into MIR YAML file · 690085c9

Kito Cheng authored 3 years ago

RISCVMachineFunctionInfo has some fields, like VarArgsFrameIndex and
VarArgsSaveSize, that are calculated at the ISel lowering stage. That info is
not contained in MIR files, so test cases that rely on those fields
can't be reproduced correctly from MIR dump files.

This patch adds MIR read/write support for those fields.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123178

690085c9

[NFC] Remove unused variable in CodeGenModules · 74b56e02
Chuanqi Xu authored 3 years ago
```
This eliminates an unused-variable warning
```
74b56e02

Add support for atomic memory copy lowering · da41214d

Evgeniy Brevnov authored 3 years ago

Currently, the utility supports lowering of non-atomic memory transfer routines only. This patch adds support for the atomic version of memcopy. This may be useful for targets that do not support atomic memcopy.
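
Conceptually, lowering an atomic memory copy means replacing one call with a loop of per-element atomic loads and stores. The sketch below shows that shape in C++, assuming 32-bit elements and relaxed ordering as the closest standard C++ analogue; `elementwiseAtomicCopy` is a hypothetical name, and the patch operates on LLVM IR rather than on C++ source.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies 'count' elements one at a time. Each element is read and written
// with a single atomic access, so no element is ever observed half-written.
void elementwiseAtomicCopy(std::atomic<std::uint32_t>* dst,
                           const std::atomic<std::uint32_t>* src,
                           std::size_t count) {
  assert((count == 0 || (dst != nullptr && src != nullptr)) &&
         "non-empty copy needs valid pointers");
  for (std::size_t i = 0; i < count; ++i) {
    const std::uint32_t value = src[i].load(std::memory_order_relaxed);
    dst[i].store(value, std::memory_order_relaxed);
  }
}
```

The copy as a whole is not atomic; only each element access is.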

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118443

da41214d

[mlir][LLVMIR] Add more vector predication intrinsic ops. · 5bd7b0ef

jacquesguan authored 3 years ago

This revision adds float unary, ternary and float/integer reduction intrinsic ops.

Differential Revision: https://reviews.llvm.org/D123189

5bd7b0ef

[InferAddressSpaces] Fix assert on invalid bitcast placement · 26b14c3e

Austin Kerbow authored 3 years ago

Similar to the problem in 0bb25b46, bitcasts that are inserted must
dominate all uses. When rewriting "values" with "new values" that have
the updated address space, we may replace the "new value" with a bitcast
if one of the original users is an addresspace cast. This bitcast must
be inserted before ALL users, not only before the addresspace cast.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122964

26b14c3e

[RISCV][NFC] Use defvar to simplify pattern definitions. · a55c19c4
jacquesguan authored 3 years ago
```
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D123292
```
a55c19c4

[InstCombine] fold more constant divisor to select-of-constants divisor · 467cbb62

Chenbing Zheng authored 3 years ago

By adding a parameter to the function FoldOpIntoSelect, we can fold more ops into a select.
For this example, we tend to fold the division instruction,
so we no longer care whether the SelectInst has one use.

This patch resolves the TODO left in InstCombine/div.ll.
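
The arithmetic behind the fold can be checked with a small sketch, assuming the pattern is a constant dividend divided by a select of two constants. The values 42, 12 and -3 and the function names are illustrative choices, not taken from the commit.

```cpp
#include <cassert>
#include <initializer_list>

// Before the fold: the division is performed on whichever constant is selected.
constexpr int beforeFold(bool cond) { return 42 / (cond ? 12 : -3); }

// After the fold: both divisions are done at compile time and only the
// select remains (42 / 12 == 3 and 42 / -3 == -14 with truncating division).
constexpr int afterFold(bool cond) { return cond ? 3 : -14; }

// Confirms that both forms agree for every input.
inline void checkFold() {
  for (bool cond : {false, true})
    assert(beforeFold(cond) == afterFold(cond));
}
```

Because the division disappears entirely, the fold can pay off even when the select has other users, which matches the motivation given above.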

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122967

467cbb62

[mlir] Width parameterization of BitEnum attributes · 21949de6

Jeremy Furtek authored 3 years ago

This diff contains:

- Parameterization of bit enum attributes in OpBase.td by bit width (e.g. 32
  and 64). Previously, all enums were 32 bits. This brings enum functionality in
  line with other integer attributes, and allows for bit enums greater than 32
  bits.
- SPIRV and Vector dialects were updated to use bit enum attributes with an
  explicit bit width

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D123095

21949de6

NFC: Eliminate warning for unused type alias FnTraitsT in release builds. · 145574fa
Stella Laurenzo authored 3 years ago
```
Differential Revision: https://reviews.llvm.org/D123351
```
145574fa

[ORC] Fix handling of casts in llvm.global_ctors. · a76209c2

Lang Hames authored 3 years ago

Removes a bogus dyn_cast_or_null that was breaking cast-expression handling when
parsing llvm.global_ctors.

The intent of this code was to identify Functions nested within cast
expressions, but the offending dyn_cast_or_null was actually blocking that:
Since a function is not a cast expression, we would set FuncC to null and break
the loop without finding the Function. The cast was not necessary either:
Functions are already Constants, and we didn't need to do anything
ConstantExpr-specific with FuncC, so we could just drop the cast.
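
The intended traversal can be sketched with a toy constant hierarchy. `Kind`, `Constant` and `findFunction` are hypothetical and stand in for LLVM's `Constant`, `ConstantExpr` and `Function`; the point is only that the loop keeps the operand as a general constant while peeling casts.

```cpp
#include <cassert>

enum class Kind { Function, CastExpr, Other };

// Toy constant: a cast wraps exactly one operand, other kinds wrap nothing.
struct Constant {
  Kind kind;
  const Constant* operand;
};

// Peels any number of cast expressions and returns the function underneath,
// or nullptr when the innermost constant is not a function.
const Constant* findFunction(const Constant* c) {
  while (c != nullptr && c->kind == Kind::CastExpr) {
    assert(c->operand != nullptr && "a cast wraps exactly one operand");
    // No narrowing to "cast expression" here: the operand may already be
    // the function, and narrowing would turn it into a null pointer.
    c = c->operand;
  }
  return (c != nullptr && c->kind == Kind::Function) ? c : nullptr;
}
```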

Thanks to Jonas Hahnfeld for tracking this down.

http://llvm.org/PR54797

a76209c2

DebugInfo: Consider the type of NTTP when simplifying template names · 1cee3d9d

David Blaikie authored 3 years ago

Since the NTTP may need to be cast to the type when rebuilding the name,
check that the type can be rebuilt when determining whether a template
name can be simplified.

1cee3d9d

[MSAN] extend prctl interceptor to support PR_SCHED_CORE · 0713053e
Kevin Athey authored 3 years ago
```
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D122851
```
0713053e

[trace][intel pt] Create a common accessor for live and postmortem data · e0cfe20a

Walter Erquinigo authored 3 years ago

Some parts of the code have to distinguish between live and postmortem threads
to figure out how to get some data, e.g. thread trace buffers. This makes the
code less generic and more error prone. An example of that is that we have
two different decoders: LiveThreadDecoder and PostMortemThreadDecoder. They
exist because getting the trace buffer is different for each case.

The problem doesn't stop there. Soon we'll have even more kinds of data, like
the context switch trace, whose fetching will be different for live and
postmortem processes.

As a way to fix this, I'm creating a common API for accessing thread data,
which is able to figure out how to handle the postmortem and live cases on
behalf of the caller. As a result of that, I was able to eliminate the two
decoders and unify them into a simpler one. Not only that, our TraceSave
functionality only worked for live threads, but now it can also work for
postmortem processes, which might not be useful now, but might be in the future.

This common API is OnThreadBinaryDataRead. More information in the inline
documentation.
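
The shape of such a callback-based accessor can be sketched as follows. Only the name `OnThreadBinaryDataRead` comes from the commit message; `TraceSketch`, the callback signature and the in-memory maps are assumptions made for illustration and do not reflect LLDB's real types.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Buffer = std::vector<std::uint8_t>;
using Callback = std::function<bool(const Buffer&)>;

class TraceSketch {
public:
  TraceSketch(bool isLive, std::map<int, Buffer> buffers)
      : isLive(isLive), buffers(std::move(buffers)) {}

  // Single entry point: the caller never needs to know where the data lives.
  bool onThreadBinaryDataRead(int tid, const Callback& callback) const {
    assert(callback && "a callable callback is required");
    const Buffer* data =
        isLive ? readFromLiveProcess(tid) : readFromTraceFile(tid);
    return data != nullptr && callback(*data);
  }

private:
  // Both stand-ins read from the same map here; in a real tracer one would
  // query the running process and the other would read saved files.
  const Buffer* readFromLiveProcess(int tid) const { return lookup(tid); }
  const Buffer* readFromTraceFile(int tid) const { return lookup(tid); }

  const Buffer* lookup(int tid) const {
    auto it = buffers.find(tid);
    return it == buffers.end() ? nullptr : &it->second;
  }

  bool isLive;
  std::map<int, Buffer> buffers;
};
```

A decoder written against this one method works unchanged for both kinds of trace, which is the unification the commit describes.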

Differential Revision: https://reviews.llvm.org/D123281

e0cfe20a

[trace][intel pt] Create a class for the libipt decoder wrapper · 6423b502

Walter Erquinigo authored 3 years ago

As we soon will need to decode multiple raw traces for the same thread,
having a class that encapsulates the decoding of a single raw trace is
a stepping stone that will make the coming features easier to implement.

So, I'm creating a LibiptDecoder class with that purpose. I refactored
the code and it's now much more readable. Besides that, more comments
were added. With this new structure, it's also easier to implement unit
tests.

Differential Revision: https://reviews.llvm.org/D123106

6423b502

[test][DSE] Precommit more assume tests · 47130384
Arthur Eubanks authored 3 years ago

47130384

Fix format specifier. NFCI. · 627f55b3

Jorge Gorbe Moya authored 3 years ago

Using a portable format specifier avoids a "format specifies type
'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned
long') [-Werror,-Wformat]" error depending on the exact definition of
`uint64_t`.
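
A minimal sketch of a portable way to print a `uint64_t`, assuming the `PRIu64` macro from `<cinttypes>` is the kind of specifier meant; the helper name `formatU64` is hypothetical.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit unsigned value. PRIu64 expands to the right length
// modifier whether uint64_t is 'unsigned long' or 'unsigned long long'.
std::string formatU64(std::uint64_t value) {
  char buffer[32];  // 20 digits plus the terminator fit comfortably
  const int written =
      std::snprintf(buffer, sizeof buffer, "%" PRIu64, value);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof buffer);
  (void)written;  // silence unused-variable warnings in release builds
  return std::string(buffer);
}
```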

627f55b3

[llvm-symbolizer] Fix line offset for inline site. · 1da67ece

Zequan Wu authored 3 years ago

This fixes the issue when the current line offset is actually for the next range.

Maintain a current code range with the current line offset and cache the next
file/line offset. Update the file/line offset after finishing the current range.

Differential Revision: https://reviews.llvm.org/D123151

1da67ece

[lld-macho][nfc] Give non-text ConcatOutputSections order-independent finalization · b440c257

Jez Ng authored 3 years ago

This diff is motivated by my work to add proper DWARF unwind support. As
detailed in PR50956, functions that need DWARF unwind need to have
compact unwind entries synthesized for them. These CU entries encode an
offset within `__eh_frame` that points to the corresponding DWARF FDE.

In order to encode this offset during
`UnwindInfoSectionImpl::finalize()`, we need to first assign values to
`InputSection::outSecOff` for each `__eh_frame` subsection. But
`__eh_frame` is ordered after `__unwind_info` (according to ld64 at
least), which puts us in a bit of a bind: `outSecOff` gets assigned
during finalization, but `__eh_frame` is being finalized after
`__unwind_info`.

But it occurred to me that there's no real need for most
ConcatOutputSections to be finalized sequentially. It's only necessary
for text-containing ConcatOutputSections that may contain branch relocs
which may need thunks. ConcatOutputSections containing other types of
data can be finalized in any order.

This diff moves the finalization logic for non-text sections into a
separate `finalizeContents()` method. This method is called before
section address assignment & unwind info finalization takes place. In
theory we could call these `finalizeContents()` methods in parallel, but
in practice it seems to be faster to do it all on the main thread.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D123279

b440c257

[AMDGPU] Fix handling of gfx10 LDS misaligned access bug · 16cf9e6d

Stanislav Mekhanoshin authored 3 years ago

It was only handled for FLAT initially because we did not have
unaligned DS instruction lowering. Now it is implemented, but
the bug is not handled.

Differential Revision: https://reviews.llvm.org/D123338

16cf9e6d