This project is mirrored from https://github.com/llvm/llvm-project.git.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer or owner.
Last successful update.
- Feb 04, 2021
-
-
Jan Svoboda authored
This patch implements generation of remaining header search arguments. It's done manually in C++ as opposed to TableGen, because we need the flexibility and don't anticipate reuse. This patch also tests the generation of header search options via a round-trip. This way, the code gets exercised whenever Clang is built and tested in asserts mode. All `check-clang` tests pass. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D94472
-
Joachim Meyer authored
As noted in https://reviews.llvm.org/D93459, the formatting of multi-line descriptions of clEnumValN and the like is unfavorable. This patch adds support for indenting them correctly. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D93494
-
Sebastian Neubauer authored
When SGPRs are spilled to VGPRs, they can overwrite any lane. We need to preserve the value of inactive lanes in function calls, so we save the register even if it is marked as caller saved. Also, teach buildPrologSpill to work when no registers are free like in CodeGen/AMDGPU/pei-scavenge-vgpr-spill.mir and update the comment on findScratchNonCalleeSaveRegister as it is not used anymore to realign the stack pointer since D95865. Differential Revision: https://reviews.llvm.org/D95946
-
Kirill Bobyrev authored
This patch allows detecting conflicts with variables defined in the current CompoundStmt or If/While/For variable init statements. Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D95925
-
Haojian Wu authored
Differential Revision: https://reviews.llvm.org/D95782
-
Nicolas Vasilache authored
This revision defines a Linalg contraction in general terms: 1. Has 2 input shapes and 1 output shape. 2. Has at least one reduction dimension. 3. Has only projected permutation indexing maps. 4. Its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field (AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary operations that may change the type (e.g. for mixed-precision). As a consequence, when vectorization of such an op occurs, the only special behavior is that the (unique) MulOpType is vectorized into a `vector.contract`. All other ops are handled in a generic fashion. In the future, we may wish to allow more input arguments and elementwise and constant operations that do not involve the reduction dimension(s). A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32. Differential revision: https://reviews.llvm.org/D95939
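As a rough illustration of the body `u5(u1(c) + u2(u3(a) * u4(b)))` for a matmul with i8 inputs and an i32 accumulator, here is a minimal Python sketch. It is not MLIR and not the code from the patch: the helper `to_i32` and the choice of which unary operations are casts versus identities are assumptions made only for this example.

```python
def to_i32(x):
    """Wrap an integer to signed 32-bit, standing in for a cast to i32 (hypothetical helper)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def matmul_i8_i8_i32(a, b, c):
    """Accumulate a (m x k, i8 values) times b (k x n, i8 values) into c (m x n, i32 values)."""
    m, k, n = len(a), len(b), len(b[0])
    out = [row[:] for row in c]
    for i in range(m):
        for j in range(n):
            for r in range(k):  # r is the reduction dimension
                # Assumed mapping: u3 and u4 widen the i8 operands, u1 and u2 are
                # identities, and u5 wraps the sum back to i32.
                out[i][j] = to_i32(out[i][j] + to_i32(a[i][r]) * to_i32(b[r][j]))
    return out


result = matmul_i8_i8_i32([[127, -128]], [[127], [-128]], [[1]])
```

The products here exceed the i8 range, which is why the operands are widened before the multiply-accumulate.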
-
Richard Smith authored
-
Richard Smith authored
doubly-nested implicit CXXConstructExprs. Ensure that we transform the parameter initializer using TransformInitializer rather than TransformExpr so that we properly strip down and rebuild the initialization, including any necessary CXXBindTemporaryExprs. Otherwise we can end up forgetting to destroy temporary objects used to construct a constructor parameter.
-
Nicolas Vasilache authored
This separation improves the layering and paves the way for more interfaces coming up in the future. Differential revision: https://reviews.llvm.org/D95941
-
Michael Liao authored
- On Windows, extended lambdas have extra issues because the numbering schemes differ between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI). An additional device-side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host and device compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322
-
wlei authored
This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and makes context promotion deterministic. In the implementation, we recognize adjacent repeated frames as cycles and deduplicate them over multiple rounds of iteration. For example, consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] In the end, we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default of -1 means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556
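The round-by-round deduplication described in this message can be sketched in Python as follows. This is only an illustration of the idea, not the llvm-profgen implementation: the function name `compress_recursion` and the parameter `max_cycle_size` are hypothetical, with `max_cycle_size=-1` chosen to mirror the "no size limit" default of the `compress-recursion` switch.

```python
def compress_recursion(context, max_cycle_size=-1):
    """Remove adjacent repeated frame sequences, smallest cycle size first."""
    frames = list(context)
    # A negative limit means "no size limit"; a cycle can span at most half the stack.
    limit = len(frames) // 2 if max_cycle_size < 0 else max_cycle_size
    size = 1
    while size <= limit and size * 2 <= len(frames):
        out = []
        for frame in frames:
            out.append(frame)
            # Drop the trailing block whenever it repeats the block right before it.
            while len(out) >= 2 * size and out[-size:] == out[-2 * size:-size]:
                del out[-size:]
        frames = out
        size += 1
    return frames


stack = ["a", "a", "b", "c", "a", "b", "c", "b", "c", "d"]
compressed = compress_recursion(stack)
```

On the example stack from the message, limiting the cycle size to 1 or 2 reproduces the two intermediate lists, and the unlimited run ends at the four-frame result.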
-
Ben Barham authored
A module in the cache with an error should just be a cache miss. If allowing errors (with -fallow-pcm-with-compiler-errors), a rebuild is needed so that the appropriate diagnostics are output and in case search paths have changed. If not allowing errors, the module was built *allowing* errors and thus should be rebuilt regardless. Reviewed By: akyrtzi Differential Revision: https://reviews.llvm.org/D95989
-
Petr Hosek authored
-
Dave Lee authored
Follow up to D95813, this converts multiline assertTrue to assertEqual. Differential Revision: https://reviews.llvm.org/D95899
-
Chuanqi Xu authored
The functionality in the TODO was added before: https://reviews.llvm.org/rGb3a722e66b75328ab5e2eb5c8572022cb083855b
-
Kazu Hirata authored
-
Kazu Hirata authored
-
Kazu Hirata authored
Identified with const-return-type.
-
Akira Hatanaka authored
The guaranteed alignment is 16 bytes on Darwin. rdar://73431623 Differential Revision: https://reviews.llvm.org/D95910
-
Arthur Eubanks authored
-polly-enable-delicm is not supported under the new PM but is tested here: Assertion `!EnableDeLICM && "This option is not implemented"' failed.
-
wlei authored
For CS profile generation, call stack unwinding is time-consuming because each LBR entry needs linear time to generate the context (hash, compression, string concatenation). This change speeds this up by grouping all the call frames within one LBR sample into a trie and aggregating the result (sample counter) on it, deferring context compression and string generation to the end of unwinding. Specifically, it uses `StackLeaf` as the top frame on the stack and manipulates it dynamically (popping or pushing a trie node) during virtual unwinding, so that the raw sample can simply be recorded on the leaf node; the path from root to leaf represents its calling context. In the end, it traverses the trie and generates the context on the fly. Results: Our internal branch shows about a 5X speed-up on some large workloads in the SPEC06 benchmark. Differential Revision: https://reviews.llvm.org/D94110
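The trie-based aggregation described in this message might look roughly like the following Python sketch. It is a simplified illustration under stated assumptions, not the llvm-profgen code: `FrameTrieNode`, `collect_contexts`, and the `" @ "` separator are hypothetical names chosen for this example.

```python
class FrameTrieNode:
    """One call frame in the trie; samples are counted on the node they land on."""

    def __init__(self, frame=None, parent=None):
        self.frame = frame
        self.parent = parent
        self.children = {}
        self.count = 0

    def child(self, frame):
        """Push a frame: reuse the existing child node or create a new one."""
        node = self.children.get(frame)
        if node is None:
            node = FrameTrieNode(frame, self)
            self.children[frame] = node
        return node


def collect_contexts(root):
    """Traverse the trie once, building a context string only for nodes with samples."""
    results = {}
    pending = [(root, [])]
    while pending:
        node, path = pending.pop()
        if node.count:
            results[" @ ".join(path)] = node.count
        for frame, child in node.children.items():
            pending.append((child, path + [frame]))
    return results


root = FrameTrieNode()
leaf = root.child("main").child("foo")  # push two frames
leaf.count += 3                         # record samples on the current leaf
leaf = leaf.parent                      # pop back to main
leaf = leaf.child("bar")                # push a different callee
leaf.count += 1
leaf = leaf.parent.child("foo")         # the same context reuses the same node
leaf.count += 2
contexts = collect_contexts(root)
```

Pushing and popping only moves a pointer within the trie, so the string for each context is built once at the end instead of once per sample.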
-
wlei authored
This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and makes context promotion deterministic. In the implementation, we recognize adjacent repeated frames as cycles and deduplicate them over multiple rounds of iteration. For example, consider an input context string stack: [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The first iteration removes all adjacent repeated frames of size 1: [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”] The second iteration removes all adjacent repeated frames of size 2: [“a”, “b”, “c”, “a”, “b”, “c”, “d”] In the end, we get the compressed output: [“a”, “b”, “c”, “d”] Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator. Added a switch `compress-recursion` to control the size of duplicated frames; the default of -1 means no size limit. Added unit tests and a regression test for this. Differential Revision: https://reviews.llvm.org/D93556
-
Isuru Fernando authored
Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D95978
-
Mehdi Amini authored
We could extend this with an interface to allow dialects to perform a type conversion, but that would make the folder create operations, which isn't the case at the moment and isn't necessarily always desirable. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95991
-
Shilei Tian authored
OpenMP device compiler (similar to other SPMD compilers) assumes that functions are convergent by default to avoid invalid transformations, such as the bug (https://bugs.llvm.org/show_bug.cgi?id=49021). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D95971
-
Michael Kruse authored
The collapseLoops method implements a transformation facilitating the implementation of the collapse-clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop. This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D93268
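The general idea of reducing a loop nest to a single loop can be illustrated with a small Python sketch. This is not how the method operates on IR; the generator `collapse_loops` is a hypothetical helper, and it assumes each loop has a known trip count and is normalized to start at 0 with step 1.

```python
def collapse_loops(trip_counts):
    """Yield the per-loop index tuples visited by one collapsed loop."""
    total = 1
    for n in trip_counts:
        total *= n
    # A single loop over the product of all trip counts replaces the nest.
    for linear in range(total):
        indices = []
        rest = linear
        # Recover each original index, innermost loop first.
        for n in reversed(trip_counts):
            indices.append(rest % n)
            rest //= n
        yield tuple(reversed(indices))


visited = list(collapse_loops([2, 3]))
```

The collapsed loop visits the same index tuples in the same order as the original nest, which is what lets a single-loop transformation be applied afterwards.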
-
Jonas Devlieghere authored
I switched the watch simulator test from i386 to using x86_64, but apparently that's not supported on the bots. Roll back to using i386 and solve the original issue by passing the target, similar to what I did in TestSimulatorPlatform.py.
-
wlei authored
This change implements profile generation infra for pseudo probe in llvm-profgen. During virtual unwinding, the raw profile is extracted into range counter and branch counter and aggregated to a sample counter map indexed by the call stack context. This change introduces the last step and produces the eventual profile. Specifically, the body of a function sample is recorded by going through each probe in the range, and the callsite target sample is recorded by extracting the callsite probe from the branch's source. Please refer to https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen. **Implementation** - Extended `PseudoProbeProfileGenerator` for pseudo probe based profile generation. - `populateBodySamplesWithProbes`, which reads the range counter, is responsible for recording function body samples and inferring the caller's body samples. - `populateBoundarySamplesWithProbes`, which reads the branch counter, is responsible for recording call site target samples. - Each sample is recorded with its calling context (named `ContextId`). Recall that the probe based context key doesn't include the leaf frame probe info, so the `ContextId` string is created from two parts: one from the concatenation of the probe stack strings and the other from the leaf frame probe. - Added regression test Test Plan: ninja & ninja check-llvm Differential Revision: https://reviews.llvm.org/D92998
-
Jessica Paquette authored
Similar to the G_PTR_ADD + G_LOAD twiddling we do in `preISelLower`. The imported patterns expect scalars only, so they can't handle things like

```
G_STORE %ptr1, %ptr2
```

To get around this, use s64 instead. (This probably makes a good portion of the manual selection code for G_STORE dead.) This is a 0.2% geomean code size improvement on CTMark at -Os. (Best is consumer-typeset @ -0.7%) Differential Revision: https://reviews.llvm.org/D95908
-
Nico Weber authored
This reverts commit 97ba5cde. Still breaks tests: https://reviews.llvm.org/D76802#2540647
-
Jessica Paquette authored
When we have a zeroext parameter coming in on the stack, build

```
%x = G_LOAD ...
%x_assert_zext = G_ASSERT_ZEXT %x, narrow_size
%trunc = G_TRUNC %x_assert_zext
```

rather than just loading into the truncated type. This allows us to optimize cases like this: https://godbolt.org/z/vfjhW8 Differential Revision: https://reviews.llvm.org/D95805
-
Arthur O'Dwyer authored
This completes libc++'s implementation of P0879 "Constexpr for swap and swap related functions." http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0879r0.html For the feature-macro adjustment, see https://cplusplus.github.io/LWG/issue3256 Differential Revision: https://reviews.llvm.org/D93661
-
Amy Huang authored
Adding this, since there's currently no documentation about this. Differential Revision: https://reviews.llvm.org/D95911
-
Stephen Kelly authored
Use mapAnyOf() and matchers based on it. Use of binaryOperation() means that modernize-loop-convert and readability-container-size-empty can now be used with rewritten binary operators. Differential Revision: https://reviews.llvm.org/D94131
-
Richard Smith authored
when rewriting 'a < b' as '(a <=> b) < 0'. It's pretty common for comparison category types to use a pointer or pointer-to-member type as their '0' parameter.
-
Florian Hahn authored
This reverts commit 6a59f056, because it is causing failures on green dragon.
-
Florian Hahn authored
This reverts commit 7a6a2cc8 because it is causing failures on green dragon.
-
Florian Hahn authored
This reverts commit 0d487cf8 because it is causing failures on green dragon.
-