Commits · 225ccf0c50a8d4e03b51f2f09a8e8776328363d8 · Panda / LLVM project

This project is mirrored from https://github.com/llvm/llvm-project.git.

Feb 04, 2021

[clang][cli] Command line round-trip for HeaderSearch options · 225ccf0c

Jan Svoboda authored 4 years ago

This patch implements generation of remaining header search arguments.
It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse.

This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass.
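
A minimal sketch of the round-trip idea, in Python rather than Clang's C++: parse arguments into options, generate arguments back from those options, parse again, and check that both parses agree. The `HeaderSearchOpts` class and the `parse`, `generate`, and `round_trip` helpers are hypothetical illustrations, not Clang's actual implementation; only three flags are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class HeaderSearchOpts:
    # Toy stand-in for header search options; not Clang's real data structure.
    include_dirs: list = field(default_factory=list)  # filled from -I
    system_dirs: list = field(default_factory=list)   # filled from -isystem
    verbose: bool = False                             # set by -v


def parse(args):
    """Parse a flat argument list into options."""
    opts = HeaderSearchOpts()
    it = iter(args)
    for arg in it:
        if arg == "-I":
            opts.include_dirs.append(next(it))
        elif arg == "-isystem":
            opts.system_dirs.append(next(it))
        elif arg == "-v":
            opts.verbose = True
        else:
            raise ValueError(f"unknown argument: {arg}")
    return opts


def generate(opts):
    """Generate an argument list that parses back to the same options."""
    args = []
    for d in opts.include_dirs:
        args += ["-I", d]
    for d in opts.system_dirs:
        args += ["-isystem", d]
    if opts.verbose:
        args.append("-v")
    return args


def round_trip(args):
    """Parse, generate, parse again, and check that nothing was lost."""
    first = parse(args)
    second = parse(generate(first))
    assert first == second, "generation dropped or changed an option"
    return second
```

If `generate` forgot to emit one of the fields, the assertion in `round_trip` would fail for any input that sets it, which is how a round-trip exercises the generation code on every test run.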

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D94472

225ccf0c

[Support] Indent multi-line descr of enum cli options. · e3f02302

Joachim Meyer authored 4 years ago

As noted in https://reviews.llvm.org/D93459, multi-line descriptions
of clEnumValN and the like are formatted poorly.
This patch adds support for indenting them correctly.

Reviewed By: serge-sans-paille

Differential Revision: https://reviews.llvm.org/D93494

e3f02302

[AMDGPU] Save all lanes for reserved VGPRs · 6c59dc47

Sebastian Neubauer authored 4 years ago

When SGPRs are spilled to VGPRs, they can overwrite any lane. We need
to preserve the value of inactive lanes in function calls, so we save
the register even if it is marked as caller saved.

Also, teach buildPrologSpill to work when no registers are free like in
CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir and update the comment on
findScratchNonCalleeSaveRegister as it is not used anymore to realign
the stack pointer since D95865.

Differential Revision: https://reviews.llvm.org/D95946

6c59dc47

[clangd] Detect rename conflicts within enclosing scope · 5eec9a38

Kirill Bobyrev authored 4 years ago

This patch allows detecting conflicts with variables defined in the current
CompoundStmt or If/While/For variable init statements.

Reviewed By: hokein

Differential Revision: https://reviews.llvm.org/D95925

5eec9a38

[Syntax] Support condition for IfStmt. · 6c1a2330
Haojian Wu authored 4 years ago
```
Differential Revision: https://reviews.llvm.org/D95782
```
6c1a2330

[mlir][Linalg] Generalize the definition of a Linalg contraction. · f245b7ad

Nicolas Vasilache authored 4 years ago

This revision defines a Linalg contraction in general terms:

  1. Has 2 input and 1 output shapes.
  2. Has at least one reduction dimension.
  3. Has only projected permutation indexing maps.
  4. Its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field
    (AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary
    operations that may change the type (e.g. for mixed-precision).

As a consequence, when vectorization of such an op occurs, the only special
behavior is that the (unique) MulOpType is vectorized into a
`vector.contract`. All other ops are handled in a generic fashion.

In the future, we may wish to allow more input arguments and elementwise and
constant operations that do not involve the reduction dimension(s).

A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32.
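
As an illustration of the body form `u5(u1(c) + u2(u3(a) * u4(b)))`, here is a scalar Python sketch of a mixed-precision matmul in the spirit of matmul_i8_i8_i32. It assumes that u3 and u4 sign-extend i8 to i32 and that u1, u2 and u5 are the identity; the helper names are hypothetical, and the 32-bit wraparound is modeled by hand because Python integers are unbounded. It shows only the scalar semantics, not the vectorization.

```python
def sext_i8_to_i32(x):
    """Assumed u3/u4: read the low 8 bits of x as a signed i8 and widen it."""
    x &= 0xFF
    return x - 0x100 if x >= 0x80 else x


def wrap_i32(x):
    """Model two's-complement wraparound of a 32-bit signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def matmul_i8_i8_i32(a, b, c):
    """Return c + a @ b with i8 inputs and an i32 accumulator."""
    m, k, n = len(a), len(b), len(b[0])
    out = [row[:] for row in c]
    for i in range(m):
        for j in range(n):
            for r in range(k):  # r is the single reduction dimension
                # Multiply in the widened type (the MulOpType step).
                prod = wrap_i32(sext_i8_to_i32(a[i][r]) * sext_i8_to_i32(b[r][j]))
                # Accumulate into the output (the AddOpType step).
                out[i][j] = wrap_i32(out[i][j] + prod)
    return out
```

The two inputs, one output, and one reduction loop correspond to conditions 1 and 2 of the definition; the indexing `a[i][r]`, `b[r][j]`, `out[i][j]` uses only projections of the loop indices, as condition 3 requires.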

Differential revision: https://reviews.llvm.org/D95939

f245b7ad

Give this test a target triple. · 3b9de993
Richard Smith authored 4 years ago

3b9de993

Fix miscompile when performing template instantiation of non-dependent · cde8d2fd

Richard Smith authored 4 years ago

doubly-nested implicit CXXConstructExprs.

Ensure that we transform the parameter initializer using
TransformInitializer rather than TransformExpr so that we properly strip
down and rebuild the initialization, including any necessary
CXXBindTemporaryExprs. Otherwise we can end up forgetting to destroy
temporary objects used to construct a constructor parameter.

cde8d2fd

[mlir][Linalg] NFC - Extract a standalone LinalgInterfaces · 1029c82c

Nicolas Vasilache authored 4 years ago

This separation improves the layering and paves the way for more interfaces coming up in the future.

Differential revision: https://reviews.llvm.org/D95941

1029c82c

[hip][cuda] Enable extended lambda support on Windows. · a2fdf9d4

Michael Liao authored 5 years ago

- On Windows, extended lambdas have extra issues because the numbering
  schemes differ between the host compilation (Microsoft C++ ABI)
  and the device compilation (Itanium C++ ABI). An additional device-side
  lambda number is required per lambda for the host compilation to
  correctly mangle the device-side lambda name.
- A hybrid numbering context `MSHIPNumberingContext` is introduced to
  number a lambda for both host- and device-compilations.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D69322

a2fdf9d4

[CSSPGO][llvm-profgen] Compress recursive cycles in calling context · ac14bb14

wlei authored 4 years ago

This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and makes context promotion deterministic.
In the implementation, we recognize adjacent repeated frames as cycles and deduplicate them over multiple rounds of iteration.
For example, consider an input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The first iteration removes all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The second iteration removes all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
The third iteration removes the remaining adjacent repeated frames of size 3, so in the end we get the compressed output:
[“a”, “b”, “c”, “d”]

Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the maximum size of duplicated frames; the default, -1, means no size limit.
Added unit tests and a regression test for this.
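
The iteration described above can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the llvm-profgen code: the function name `compress_recursion` and its `max_size` parameter are hypothetical, with `max_size` modeling the size limit of the `compress-recursion` switch (-1 meaning no limit).

```python
def compress_recursion(frames, max_size=-1):
    """Remove adjacent repeated frame sequences, smallest cycle size first."""
    result = list(frames)
    # A cycle of length n needs at least 2 * n frames to repeat once.
    limit = len(result) // 2 if max_size < 0 else max_size
    size = 1
    while size <= limit and size <= len(result) // 2:
        out = []
        for frame in result:
            out.append(frame)
            # Drop the newest `size` frames if they repeat the `size` frames
            # immediately before them.
            if len(out) >= 2 * size and out[-size:] == out[-2 * size:-size]:
                del out[-size:]
        result = out
        size += 1
    return result
```

Running it on the stack from the example yields the same intermediate results: size 1 removes the doubled "a", size 2 removes the doubled "b", "c", and size 3 removes the doubled "a", "b", "c". Passing `max_size=1` stops after the first round.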

Differential Revision: https://reviews.llvm.org/D93556

ac14bb14

Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context" · 6bccdcdb
wlei authored 4 years ago
```
This reverts commit 0609f257.
```
6bccdcdb
Revert "[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation" · 08e8bb60
wlei authored 4 years ago
```
This reverts commit 1714ad23.
```
08e8bb60

[ASTReader] Always rebuild a cached module that has errors · a2c1054c

Ben Barham authored 4 years ago

A module in the cache with an error should just be a cache miss. If
allowing errors (with -fallow-pcm-with-compiler-errors), a rebuild is
needed so that the appropriate diagnostics are output and in case search
paths have changed. If not allowing errors, the module was built
*allowing* errors and thus should be rebuilt regardless.

Reviewed By: akyrtzi

Differential Revision: https://reviews.llvm.org/D95989

a2c1054c

[NFC] Fix the noprofile attribute comment · b42ccdf3
Petr Hosek authored 4 years ago

b42ccdf3

[lldb] Convert more assertTrue to assertEqual (NFC) · 0ed758b2

Dave Lee authored 4 years ago

Follow up to D95813, this converts multiline assertTrue to assertEqual.
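
A small, self-contained sketch of why such a conversion is useful, using Python's standard `unittest` module. The `Example` class and the `failure_message` helper are hypothetical and exist only for this illustration; they are not taken from the lldb test suite.

```python
import unittest


class Example(unittest.TestCase):
    def check_with_assert_true(self, actual, expected):
        # On failure this can only report that the condition was false.
        self.assertTrue(actual == expected)

    def check_with_assert_equal(self, actual, expected):
        # On failure this reports both values that were compared.
        self.assertEqual(actual, expected)


def failure_message(check, actual, expected):
    """Return the assertion message produced by `check`, or None on success."""
    try:
        check(actual, expected)
    except AssertionError as err:
        return str(err)
    return None


case = Example()
print(failure_message(case.check_with_assert_true, 3, 4))
print(failure_message(case.check_with_assert_equal, 3, 4))
```

The second message names the two mismatched values, which makes a failing test easier to diagnose from a log alone.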

Differential Revision: https://reviews.llvm.org/D95899

0ed758b2

[NFC][Coroutine] Remove redundant comment · 9511fa2d

Chuanqi Xu authored 4 years ago

The functionality in the TODO was added before:
https://reviews.llvm.org/rGb3a722e66b75328ab5e2eb5c8572022cb083855b

9511fa2d

[Transforms/IPO] Use range-based for loops (NFC) · be374758
Kazu Hirata authored 4 years ago

be374758
[TableGen] Use ListSeparator (NFC) · 643c00f7
Kazu Hirata authored 4 years ago

643c00f7
[Support] Drop unnecessary const from return types (NFC) · b4de30f6
Kazu Hirata authored 4 years ago
```
Identified with const-return-type.
```
b4de30f6

Fix the guaranteed alignment of memory returned by malloc/new on Darwin · aade0ec2

Akira Hatanaka authored 4 years ago

The guaranteed alignment is 16 bytes on Darwin.

rdar://73431623

Differential Revision: https://reviews.llvm.org/D95910

aade0ec2

[test] Pin spir-codegen.ll to legacy PM · 781a1b1e

Arthur Eubanks authored 4 years ago

-polly-enable-delicm is not supported under the new PM but is tested here:
  Assertion `!EnableDeLICM && "This option is not implemented"' failed.

781a1b1e

[CSSPGO][llvm-profgen] Aggregate samples on call frame trie to speed up profile generation · 1714ad23

wlei authored 4 years ago

For CS profile generation, call stack unwinding is time-consuming because each LBR entry needs linear time to generate its context (hashing, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the results (sample counters) on it, deferring context compression and string generation to the end of unwinding.

Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that the raw sample can simply be recorded on the leaf node; the path from root to leaf represents its calling context. In the end, it traverses the trie and generates the contexts on the fly.

Results:
Our internal branch shows about a 5X speed-up on some large workloads in the SPEC06 benchmark.
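
The trie aggregation can be illustrated with a simplified Python sketch. It is not the llvm-profgen implementation: `FrameNode`, `record`, and `contexts` are hypothetical names, the " @ " separator is an arbitrary choice for the example, and for brevity each sample walks from the root instead of pushing and popping a leaf pointer during unwinding.

```python
class FrameNode:
    """One call frame in the trie; `count` holds samples recorded on it."""

    def __init__(self, name):
        self.name = name
        self.children = {}
        self.count = 0

    def child(self, name):
        # Reuse the existing child so identical stacks share one path.
        if name not in self.children:
            self.children[name] = FrameNode(name)
        return self.children[name]


def record(root, stack, count=1):
    """Record a sample on the leaf reached by walking `stack` from the root."""
    node = root
    for frame in stack:
        node = node.child(frame)
    node.count += count


def contexts(root):
    """Traverse the trie once and build each context string only at the end."""
    result = {}

    def walk(node, path):
        if node.count:
            result[" @ ".join(path)] = node.count
        for child in node.children.values():
            walk(child, path + [child.name])

    walk(root, [])
    return result
```

Recording a sample costs one counter increment on a node, and string concatenation happens once per distinct context during the final traversal rather than once per sample.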

Differential Revision: https://reviews.llvm.org/D94110

1714ad23

[CSSPGO][llvm-profgen] Compress recursive cycles in calling context · 0609f257

wlei authored 4 years ago

This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and makes context promotion deterministic.
In the implementation, we recognize adjacent repeated frames as cycles and deduplicate them over multiple rounds of iteration.
For example, consider an input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The first iteration removes all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
The second iteration removes all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
The third iteration removes the remaining adjacent repeated frames of size 3, so in the end we get the compressed output:
[“a”, “b”, “c”, “d”]

Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator.
Added a switch `compress-recursion` to control the maximum size of duplicated frames; the default, -1, means no size limit.
Added unit tests and a regression test for this.

Differential Revision: https://reviews.llvm.org/D93556

0609f257

[MLIR] Fix building unittests in in-tree build · c95c0db2
Isuru Fernando authored 4 years ago
```
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95978
```
c95c0db2

Make the folder more robust against op fold() methods that generate a type mismatch · a1d5bdf8

Mehdi Amini authored 4 years ago

We could extend this with an interface that allows dialects to perform a type
conversion, but that would make the folder create operations, which isn't
the case at the moment and isn't necessarily always desirable.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D95991

a1d5bdf8

[OpenMP][NVPTX] Take functions in `deviceRTLs` as `convergent` · 0f0ce3c1

Shilei Tian authored 4 years ago

The OpenMP device compiler (like other SPMD compilers) assumes that
functions are convergent by default to avoid invalid transformations, such as
the one reported in https://bugs.llvm.org/show_bug.cgi?id=49021.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95971

0f0ce3c1

[OpenMPIRBuilder] Implement collapseLoops. · 26b5be66

Michael Kruse authored 4 years ago

The collapseLoops method implements a transformation that facilitates the implementation of the collapse clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop.
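
The general idea behind collapsing a loop nest can be sketched in Python: iterate a single flat counter over the product of the trip counts and recover the original induction variables with division and remainder. This illustrates the concept only, not the code that OpenMPIRBuilder emits; `collapsed_indices` is a hypothetical helper.

```python
def collapsed_indices(trip_counts):
    """Yield the index tuples of a loop nest using one flat loop."""
    total = 1
    for n in trip_counts:
        total *= n
    for flat in range(total):  # the single collapsed loop
        rest = flat
        indices = []
        # The innermost loop varies fastest, so peel it off first.
        for n in reversed(trip_counts):
            indices.append(rest % n)
            rest //= n
        yield tuple(reversed(indices))
```

The collapsed loop visits the same index tuples in the same order as the original nest, so a method written for a single loop can distribute its iterations without knowing about the nesting.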

This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D93268

26b5be66

[lldb] Rollback to using i386 for the watch simulator · e3bb1c80

Jonas Devlieghere authored 4 years ago

I switched the watch simulator test from i386 to x86_64, but
apparently that's not supported on the bots. Roll back to using i386 and
solve the original issue by passing the target, similar to what I did
in TestSimulatorPlatform.py.

e3bb1c80

[CSSPGO][llvm-profgen] Pseudo probe based CS profile generation · c82b24f4

wlei authored 4 years ago

This change implements the profile generation infrastructure for pseudo probes in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counters and branch counters and aggregated into a sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of a function sample is recorded by going through each probe in the range, and the callsite target sample is recorded by extracting the callsite probe from the branch's source.

Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**Implementation**

- Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation.
- `populateBodySamplesWithProbes` reads the range counter and is responsible for recording function body samples and inferring the caller's body samples.
- `populateBoundarySamplesWithProbes` reads the branch counter and is responsible for recording call site target samples.
- Each sample is recorded with its calling context (named `ContextId`). Recall that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two parts: one from the concatenation of the probe stack strings and the other from the leaf frame probe.
- Added a regression test.

Test Plan:

ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92998

c82b24f4

[AArch64][GlobalISel] Change store value type from p0 -> s64 to import patterns · 56fcd4ea

Jessica Paquette authored 4 years ago

Similar to the G_PTR_ADD + G_LOAD twiddling we do in `preISelLower`.

The imported patterns expect scalars only, so they can't handle things like

```
 G_STORE %ptr1, %ptr2
```

To get around this, use s64 instead.

(This probably makes a good portion of the manual selection code for G_STORE
dead.)

This is a 0.2% geomean code size improvement on CTMark at -Os.

(Best is consumer-typeset @ -0.7%)

Differential Revision: https://reviews.llvm.org/D95908

56fcd4ea

Revert "[InstrProfiling] Use !associated metadata for counters, data and values" · b9953141
Nico Weber authored 4 years ago
```
This reverts commit 97ba5cde.
Still breaks tests: https://reviews.llvm.org/D76802#2540647
```
b9953141

[AArch64][GlobalISel] Emit G_ASSERT_ZEXT in assignValueToAddress for ZExt params · a1f6bb20

Jessica Paquette authored 4 years ago

When we have a zeroext parameter coming in on the stack, build

```
%x = G_LOAD ...
%x_assert_zext = G_ASSERT_ZEXT %x, narrow_size
%trunc = G_TRUNC %x_assert_zext
```

Rather than just loading into the truncated type.

This allows us to optimize cases like this: https://godbolt.org/z/vfjhW8

Differential Revision: https://reviews.llvm.org/D95805

a1f6bb20

[libc++] [P0879] constexpr std::sort · 493f1407

Arthur O'Dwyer authored 4 years ago

This completes libc++'s implementation of
P0879 "Constexpr for swap and swap related functions."
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0879r0.html

For the feature-macro adjustment, see
https://cplusplus.github.io/LWG/issue3256

Differential Revision: https://reviews.llvm.org/D93661

493f1407

[Docs] Add some documentation for constructor homing, a debug info optimization (-fuse-ctor-homing) · 26e9c990
Amy Huang authored 4 years ago
```
Adding this, since there's currently no documentation about this.

Differential Revision: https://reviews.llvm.org/D95911
```
26e9c990

[clang-tidy] Use new mapping matchers · c0199b2a

Stephen Kelly authored 4 years ago

Use mapAnyOf() and matchers based on it.

Use of binaryOperation() means that modernize-loop-convert and
readability-container-size-empty can now be used with rewritten binary
operators.

Differential Revision: https://reviews.llvm.org/D94131

c0199b2a

PR44325 (and duplicates): don't issue -Wzero-as-null-pointer-constant · 1f06f419

Richard Smith authored 4 years ago

when rewriting 'a < b' as '(a <=> b) < 0'.

It's pretty common for comparison category types to use a pointer or
pointer-to-member type as their '0' parameter.

1f06f419

Revert "[LTO] Use lto::backend for code generation." · 7db390cc
Florian Hahn authored 4 years ago
```
This reverts commit 6a59f056, because
it is causing failures on green dragon.
```
7db390cc
Revert "[LTO] Add option enable NewPM with LTOCodeGenerator." · 0a17664b
Florian Hahn authored 4 years ago
```
This reverts commit 7a6a2cc8 because
it is causing failures on green dragon.
```
0a17664b
Revert "[LTOCodeGenerator] Use lto::Config for options (NFC)." · b0a8e41c
Florian Hahn authored 4 years ago
```
This reverts commit 0d487cf8 because
it is causing failures on green dragon.
```
b0a8e41c