This project is mirrored from https://github.com/llvm/llvm-project.git.
Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer or owner.
Last successful update .
- Apr 23, 2021
-
-
Duncan P. N. Exon Smith authored
The profiling runtime was designed to work without static initializers or a filesystem (see 117cf2bd and others). The no-static-initializers part was already documented, but the no-filesystem part was missed. Differential Revision: https://reviews.llvm.org/D101000
-
Kai Nacke authored
lookupTarget() can update the passed triple argument. This happens when no triple is given on the command line, and the architecture argument does not match the architecture in the default triple. For example, when -march=aarch64 is passed on the command line and the default triple is x86_64-windows-msvc, the triple is changed to aarch64-windows-msvc. However, this triple is not saved, and later in the code, the triple is constructed again from the triple name, which is the default triple at this point. Thus the default triple is passed to the constructor of the MCSubtargetInfo instance. The triple is only used to determine the object file format, and by chance, the AArch64 target also uses the COFF file format, and all is fine. However, the AArch64 target does not support all available binary file formats, e.g. XCOFF and GOFF, and llvm-mca crashes in this case. The fix is to update the triple name with the changed triple name for the target lookup. Then the default object file format for the architecture is used, in the example ELF. Reviewed By: andreadb, abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D100992
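The stale-name problem described in this commit message can be sketched with a toy model. `ToyTriple`, `toyLookupTarget`, and `resolveTripleName` are hypothetical stand-ins invented for illustration; they are not LLVM classes or the llvm-mca code, and the sketch only shows why the name has to be refreshed after the lookup.

```cpp
#include <string>

// Hypothetical stand-in for a target triple; not the LLVM Triple class.
struct ToyTriple {
  std::string arch, os, env;
  std::string str() const { return arch + "-" + os + "-" + env; }
};

// Mimics a lookup that rewrites the triple when an explicit architecture
// is given and differs from the architecture in the default triple.
inline void toyLookupTarget(const std::string &march, ToyTriple &triple) {
  if (!march.empty() && march != triple.arch)
    triple.arch = march;
}

// Returns the triple name that later code would use to rebuild the triple.
// With saveUpdatedName == false the name captured before the lookup is kept
// (the behaviour the commit message describes as the bug); with true the
// name is refreshed from the possibly rewritten triple (the described fix).
inline std::string resolveTripleName(ToyTriple triple, const std::string &march,
                                     bool saveUpdatedName) {
  std::string tripleName = triple.str(); // captured before the lookup
  toyLookupTarget(march, triple);        // may change triple.arch
  if (saveUpdatedName)
    tripleName = triple.str();
  return tripleName;
}
```

With the default triple x86_64-windows-msvc and `-march=aarch64`, the stale variant still yields the x86_64 name, while the refreshed variant yields aarch64-windows-msvc.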
-
peter klausler authored
Add flang/docs/FortranFeatureHistory.md Differential Revision: https://reviews.llvm.org/D101081
-
Peter Collingbourne authored
In the most common case we call computeOddEvenMaskForPointerMaybe() from quarantineOrDeallocateChunk(), in which case we need to look up the class size from the SizeClassMap in order to compute the LSB. Since we need to do a lookup anyway, we may as well look up the LSB itself and avoid computing it every time. While here, switch to a slightly more efficient way of computing the odd/even mask. Differential Revision: https://reviews.llvm.org/D101018
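A minimal sketch of the idea, assuming the mask is chosen by the pointer bit at the position of the class size's least significant set bit. The names `SizeClassInfo`, `makeSizeClass`, and `oddEvenMask`, and the constant `0x5555`, are assumptions for illustration and are not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Index of the least significant set bit of a non-zero size.
inline unsigned leastSignificantSetBit(std::uintptr_t size) {
  assert(size != 0 && "size must be non-zero");
  unsigned bit = 0;
  while ((size & 1) == 0) {
    size >>= 1;
    ++bit;
  }
  return bit;
}

// Hypothetical per-class record: the LSB is computed once when the table
// is built, so the hot path only reads it instead of recomputing it.
struct SizeClassInfo {
  std::uintptr_t size;
  unsigned sizeLSB;
};

inline SizeClassInfo makeSizeClass(std::uintptr_t size) {
  return SizeClassInfo{size, leastSignificantSetBit(size)};
}

// Selects 0x5555 or 0xAAAA depending on the parity of the block index.
// For blocks laid out back to back from a suitably aligned base, the bit at
// position sizeLSB alternates from one block to the next.
inline std::uint32_t oddEvenMask(std::uintptr_t ptr, unsigned sizeLSB) {
  return 0x5555u << ((ptr >> sizeLSB) & 1);
}
```

For a 48-byte class the cached LSB is 4, so blocks at offsets 0, 48, and 96 get alternating masks.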
-
Vitaly Buka authored
Missed review comments. This reverts commit e2508296.
-
Philip Reames authored
Straightforward extension to the recently added infrastructure which was pioneered with shl. Differential Revision: https://reviews.llvm.org/D99687
-
Petr Hosek authored
This avoids test failures where extra files exist in the tree, such as the standard library built using the runtimes build. Differential Revision: https://reviews.llvm.org/D101023
-
Philip Reames authored
This change effectively reverts 86664638, but since there have been some changes on top and I wanted to leave the tests in, it's not a mechanical revert. Why revert this now? Two main reasons: 1) There is continuing discussion around what the semantics of nofree should be. I am getting increasingly uncomfortable with the seeming possibility that we might redefine nofree in a way incompatible with these changes. 2) There was a reported miscompile triggered by this change (https://github.com/emscripten-core/emscripten/issues/9443). At first, I was making good progress on tracking down the issues exposed and those issues appeared to be unrelated latent bugs. Now that we've found at least one bug in the original change, and the investigation has stalled, I'm no longer comfortable leaving this in tree. In retrospect, I probably should have reverted this earlier and investigated the issues once the triggering change was out of tree.
-
Craig Topper authored
These instructions don't really exist, but we have ways we can emulate them. .vv will swap operands and use vmsle(u).vv. .vi will adjust the immediate and use vmsgt(u).vi when possible. For .vx we need to use some of the multiple instruction sequences from the V extension spec.
For unmasked vmsge(u).vx we use:
  vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
For cases where mask and maskedoff are the same value we have vmsge{u}.vx v0, va, x, v0.t, which is the vd==v0 case that requires a temporary, so we use:
  vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
For other masked cases we use this sequence:
  vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
We trust that register allocation will prevent vd in vmslt{u}.vx from being v0 since v0 is still needed by the vmxor. Differential Revision: https://reviews.llvm.org/D100925
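The three sequences can be checked against the intended result with a small lane-by-lane model. This is a sketch under stated assumptions: it models only the signed form, uses 8 lanes, and treats inactive lanes as keeping the old destination value; `Vec`, `Mask`, and all function names are invented for illustration and are not LLVM or RISC-V APIs.

```cpp
#include <array>
#include <bitset>
#include <cstddef>

constexpr std::size_t kLanes = 8;
using Vec = std::array<int, kLanes>;
using Mask = std::bitset<kLanes>;

// Models vmslt.vx: active lanes get a[i] < x, inactive lanes keep oldDest.
inline Mask vmsltVx(const Vec &a, int x, const Mask &active, const Mask &oldDest) {
  Mask r;
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = active[i] ? (a[i] < x) : oldDest[i];
  return r;
}

// Intended result: active lanes get a[i] >= x, inactive lanes get maskedoff.
inline Mask vmsgeReference(const Vec &a, int x, const Mask &mask, const Mask &maskedoff) {
  Mask r;
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = mask[i] ? (a[i] >= x) : maskedoff[i];
  return r;
}

// Unmasked: vmslt.vx vd, va, x; vmnand.mm vd, vd, vd
inline Mask vmsgeUnmasked(const Vec &a, int x) {
  Mask all;
  all.set();
  Mask vd = vmsltVx(a, x, all, Mask{});
  return ~(vd & vd);
}

// mask == maskedoff (vd starts as v0): vmslt.vx vt, va, x; vmandnot.mm vd, vd, vt
inline Mask vmsgeMaskIsMaskedoff(const Vec &a, int x, const Mask &v0) {
  Mask all;
  all.set();
  Mask vt = vmsltVx(a, x, all, Mask{});
  Mask vd = v0;
  return vd & ~vt;
}

// Other masked cases: vmslt.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
inline Mask vmsgeMasked(const Vec &a, int x, const Mask &v0, const Mask &maskedoff) {
  Mask vd = vmsltVx(a, x, v0, maskedoff);
  return vd ^ v0;
}
```

In the last sequence, the xor with v0 flips only the active lanes (turning "less than" into "greater or equal") and leaves the inactive lanes, where v0 is 0, holding maskedoff.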
-
Craig Topper authored
Refactor to use new multiclass instead of individual patterns. We already supported this due to SEW=64 on RV32, but we didn't have test cases for all the types we supported. Part of D100925
-
Craig Topper authored
We don't have instructions for these, but can swap the operands to use vmle/vmflt. This makes the IR interface more consistent and simplifies the frontend implementation. Part of D100925
-
Vitaly Buka authored
-
Vitaly Buka authored
QEMU just ignores MADV_DONTNEED https://github.com/qemu/qemu/blob/b1cffefa1b163bce9aebc3416f562c1d3886eeaa/linux-user/syscall.c#L11941 Depends on D100998. Differential Revision: https://reviews.llvm.org/D101031
-
Vitaly Buka authored
Differential Revision: https://reviews.llvm.org/D100998
-
Andrzej Warzynski authored
Switching from `%f18` to `%flang_fc1` in LIT tests added in https://reviews.llvm.org/D91159. This way these tests are run with the new driver, `flang-new`, when enabled (i.e. when `FLANG_BUILD_NEW_DRIVER` is set). Differential Revision: https://reviews.llvm.org/D101078
-
Fangrui Song authored
This partially reverts commit 77ac823f. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.
-
Vitaly Buka authored
-
Craig Topper authored
Implementations are allowed to optimize an x0 stride to perform fewer memory accesses. This is the case in SiFive cores. No idea if this is the case in other implementations. We might need a tuning flag for this. Reviewed By: frasercrmck, arcbbb Differential Revision: https://reviews.llvm.org/D100815
-
Raphael Isemann authored
The `--allow-jit` flag allows the user to force the IR interpreter to run the provided expression. The `--top-level` flag parses and injects the code as if it's in the top-level scope of a source file. Both flags just change the ExecutionPolicy of the expression:
* `--allow-jit true` -> doesn't change anything (it's the default)
* `--allow-jit false` -> ExecutionPolicyNever
* `--top-level` -> ExecutionPolicyTopLevel
Passing `--allow-jit false` and `--top-level` currently causes the `--top-level` to silently overwrite the ExecutionPolicy value that was set by `--allow-jit false`. There isn't any ExecutionPolicy value that says "top-level but only interpret", so I would say we reject this combination of flags until someone finds time to refactor the top-level feature out of the ExecutionPolicy enum. The SBExpressionOptions suffer from a similar symptom, as `SetTopLevel` and `SetAllowJIT` just silently disable each other. But those functions don't have any error handling, so there is not a lot we can do about this in the meantime. Reviewed By: labath, kastiglione Differential Revision: https://reviews.llvm.org/D91780
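The flag-to-policy mapping and the rejected combination can be sketched as follows. This is an illustrative model, not the LLDB implementation: `ExecutionPolicy::Default`, `PolicyResult`, `resolvePolicy`, and the error text are assumptions, and only the `Never` and `TopLevel` names echo the commit message.

```cpp
#include <optional>
#include <string>

// Toy policy enum; Default stands for "flag left the policy unchanged".
enum class ExecutionPolicy { Default, Never, TopLevel };

// Hypothetical result type: ok == false means the flags were rejected.
struct PolicyResult {
  bool ok;
  ExecutionPolicy policy;
  std::string error;
};

// allowJit is empty when --allow-jit was not passed at all.
inline PolicyResult resolvePolicy(std::optional<bool> allowJit, bool topLevel) {
  const bool jitDisabled = allowJit.has_value() && !*allowJit;
  // No policy value means "top-level but only interpret", so reject the
  // combination instead of letting one flag silently overwrite the other.
  if (jitDisabled && topLevel)
    return {false, ExecutionPolicy::Default,
            "--allow-jit false cannot be combined with --top-level"};
  if (topLevel)
    return {true, ExecutionPolicy::TopLevel, ""};
  if (jitDisabled)
    return {true, ExecutionPolicy::Never, ""};
  return {true, ExecutionPolicy::Default, ""};
}
```

Returning an explicit error keeps the conflict visible to the caller, which is the behaviour the commit message argues for at the command level.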
-
Craig Topper authored
Rather than splatting each separately and doing bit manipulation to merge them in the vector domain, copy the data to the stack and splat it using a strided load with x0 stride. At least on some implementations this vector load is optimized to not do a load for each element. This is equivalent to how we move i64 to f64 on RV32. I've only implemented this for the intrinsic fallbacks in this patch. I think we do similar splatting/shifting/oring in other places. If this is approved, I'll refactor the others to share the code. Differential Revision: https://reviews.llvm.org/D101002
-
Krzysztof Parzyszek authored
Intrinsics for the following instructions are added. The intrinsic name is "int_hexagon_<inst>[_128B]", e.g. int_hexagon_V6_vL32b_pred_ai for 64-byte version int_hexagon_V6_vL32b_pred_ai_128B for 128-byte version V6_vL32b_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4) V6_vL32b_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3) V6_vL32b_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2) V6_vL32b_nt_pred_ai if (Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_pred_pi if (Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_pred_ppu if (Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vL32b_nt_npred_ai if (!Pv4) Vd32 = vmem(Rt32+#s4):nt V6_vL32b_nt_npred_pi if (!Pv4) Vd32 = vmem(Rx32++#s3):nt V6_vL32b_nt_npred_ppu if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt V6_vS32b_pred_ai if (Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_pred_pi if (Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_pred_ppu if (Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32b_npred_ai if (!Pv4) vmem(Rt32+#s4) = Vs32 V6_vS32b_npred_pi if (!Pv4) vmem(Rx32++#s3) = Vs32 V6_vS32b_npred_ppu if (!Pv4) vmem(Rx32++Mu2) = Vs32 V6_vS32Ub_pred_ai if (Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_pred_pi if (Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_pred_ppu if (Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32Ub_npred_ai if (!Pv4) vmemu(Rt32+#s4) = Vs32 V6_vS32Ub_npred_pi if (!Pv4) vmemu(Rx32++#s3) = Vs32 V6_vS32Ub_npred_ppu if (!Pv4) vmemu(Rx32++Mu2) = Vs32 V6_vS32b_nt_pred_ai if (Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_pred_pi if (Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_pred_ppu if (Pv4) vmem(Rx32++Mu2):nt = Vs32 V6_vS32b_nt_npred_ai if (!Pv4) vmem(Rt32+#s4):nt = Vs32 V6_vS32b_nt_npred_pi if (!Pv4) vmem(Rx32++#s3):nt = Vs32 V6_vS32b_nt_npred_ppu if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
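The naming rule "int_hexagon_<inst>[_128B]" stated in this commit message can be written out as a tiny helper. `hexagonIntrinsicName` is a hypothetical function invented for illustration; it only builds the string and is not part of LLVM.

```cpp
#include <string>

// Builds the intrinsic name for an instruction such as "V6_vL32b_pred_ai".
// The "_128B" suffix selects the 128-byte version; without it the name
// refers to the 64-byte version.
inline std::string hexagonIntrinsicName(const std::string &inst, bool is128B) {
  return "int_hexagon_" + inst + (is128B ? "_128B" : "");
}
```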
-
Joseph Huber authored
Summary: This patch adds a new runtime function __tgt_set_info_flag that allows the user to set the information level at runtime without using the environment variable. Using this will require an extern function, but it will eventually be added into an auxiliary library for OpenMP support functions. This patch required moving the current InfoLevel to a global variable which must be instantiated by each plugin. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D100774
-
Raphael Isemann authored
The buffer we turn into a std::string here is malloc'd and should be free'd before we return from this function. Follow up to LLDB leak fixes such as D100806. Reviewed By: mstorsjo, rupprecht, MaskRay Differential Revision: https://reviews.llvm.org/D100843
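The pattern behind this fix can be sketched in a few lines: copy the malloc'd buffer into a `std::string`, and make sure the buffer is freed on every path out of the function. `makeMallocedName`, `FreeDeleter`, and `takeAsString` are hypothetical names used only for this illustration; the actual function touched by the commit is not shown in the message.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical producer that returns a malloc'd, NUL-terminated buffer
// which the caller is responsible for freeing.
inline char *makeMallocedName(const char *text) {
  std::size_t n = std::strlen(text) + 1;
  char *buf = static_cast<char *>(std::malloc(n));
  if (buf)
    std::memcpy(buf, text, n);
  return buf;
}

// Deleter so std::unique_ptr releases the buffer with free(), not delete.
struct FreeDeleter {
  void operator()(void *p) const { std::free(p); }
};

// Copies the buffer into a std::string; the unique_ptr frees the buffer
// when the function returns, including on early returns or exceptions.
inline std::string takeAsString(char *mallocedBuffer) {
  std::unique_ptr<char, FreeDeleter> owner(mallocedBuffer);
  return owner ? std::string(owner.get()) : std::string();
}
```

Wrapping the pointer in a `std::unique_ptr` with a `free`-based deleter is one way to do this; an explicit `free` call before each return works as well.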
-
peter klausler authored
Andrzej W. @ Arm discovered that the runtime derived type table building code in semantics was detecting fatal errors in the tests that the f18 driver wasn't printing. This patch fixes f18 so that these messages are printed; however, the messages were not valid user errors, and the rest of this patch fixes them up. There were two sources of the bogus errors. One was that the runtime derived type information table builder was calculating the shapes of allocatable and pointer array components in derived types, and then complaining that they weren't constant or LEN parameter values, which of course they couldn't be since they have to have deferred shapes and those bounds were expressions like LBOUND(component,dim=1). The second was that f18 was forwarding the actual LEN type parameter expressions of a type instantiation too far into the uses of those parameters in various expressions in the declarations of components; when an actual LEN type parameter is not a constant value, it needs to remain a "bare" type parameter inquiry so that it will be lowered to a descriptor inquiry and acquire a captured expression value. Fixing this up properly involved: moving some code into new utility function templates in Evaluate/tools.h, tweaking the rewriting of conversions in expression folding to elide needless integer kind conversions of type parameter inquiries, making type parameter inquiry folding *not* replace bare LEN type parameters with non-constant actual parameter values, and cleaning up some altered test results. Differential Revision: https://reviews.llvm.org/D101001
-
Jianzhou Zhao authored
The first version of origin tracking tracks only memory stores. Although this is sufficient for understanding correct flows, it is hard to figure out where an undefined value is read from. To find where undefined values are read, we still have to do a reverse binary search from the last store in the chain with printing and logging at possible code paths. This is quite inefficient. Tracking memory load instructions can help in this case. The main issues of tracking loads are performance and code size overheads. With tracking only stores, the code size overhead is 38%, memory overhead is 1x, and cpu overhead is 3x. In practice #load is much larger than #store, so both code size and cpu overhead increase. The first blocker is code size overhead: link fails if we inline tracking loads. The workaround is using external function calls to propagate metadata. This is also the workaround ASan uses. The cpu overhead is ~10x. This is a trade-off between debuggability and performance, and will be used only when debugging cases where tracking only stores is not enough. Reviewed By: gbalats Differential Revision: https://reviews.llvm.org/D100967
-
Arthur O'Dwyer authored
`std::clamp(2, 1, 3, std::greater<int>())` has UB because (1 > 3) is false. Swap the operands to fix the _LIBCPP_ASSERT failure in this test.
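A short sketch of why the operands have to be swapped: `std::clamp(v, lo, hi, comp)` requires that `hi` does not compare less than `lo` under `comp`, and under `std::greater<int>` the "low" bound is the numerically larger value. The helper `clampDescending` is a hypothetical name used only for this illustration.

```cpp
#include <algorithm>
#include <functional>

// Clamps v between a and b under a descending ordering (std::greater).
// The bounds are ordered first so that the precondition of std::clamp,
// !comp(hi, lo), always holds regardless of how a and b were passed.
inline int clampDescending(int v, int a, int b) {
  const int lo = std::max(a, b); // "lowest" under std::greater
  const int hi = std::min(a, b); // "highest" under std::greater
  return std::clamp(v, lo, hi, std::greater<int>());
}
```

The well-formed version of the call from the commit message is therefore `std::clamp(2, 3, 1, std::greater<int>())`, which leaves 2 unchanged.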
-
Irina Dobrescu authored
This patch adds semantic checks for the General Restrictions of the Allocate Directive. Since the requires directive is not yet implemented in Flang, the restriction: ``` allocate directives that appear in a target region must specify an allocator clause unless a requires directive with the dynamic_allocators clause is present in the same compilation unit ``` will need to be updated at a later time. A different patch will be made with the Fortran specific restrictions of this directive. I have used the code from https://reviews.llvm.org/D89395 for the CheckObjectListStructure function. Co-authored-by: Isaac Perry <isaac.perry@arm.com> Reviewed By: clementval, kiranchandramohan Differential Revision: https://reviews.llvm.org/D91159
-
Sanjay Patel authored
-
Arthur O'Dwyer authored
Reviewed as part of https://reviews.llvm.org/D100737
-
Arthur O'Dwyer authored
Reviewed as part of https://reviews.llvm.org/D100737
-
- Apr 22, 2021
-
-
Alexey Bataev authored
We can skip the check for undefs when trying to find matching perfect/shuffled tree entries; undefs can be ignored completely, improving the final cost/vectorization results. Differential Revision: https://reviews.llvm.org/D101061
-
Hongtao Yu authored
1. Remove unnecessary filtering code. 2. Add llvm-profgen to tool substitutions. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D101006
-
Peter Steinfeld authored
We were erroneously not taking into account the constant values of LEN type parameters of parameterized derived types when checking for argument compatibility. The required checks are identical to those for assignment compatibility. Since argument compatibility is checked in .../lib/Evaluate and assignment compatibility is checked in .../lib/Semantics, I moved the common code into .../lib/Evaluate/tools.cpp and changed the assignment compatibility checking code to call it. After implementing these new checks, tests in resolve53.f90 were failing because the tests were erroneous. I fixed these tests and added new tests to call03.f90 to test argument passing of parameterized derived types more completely. Differential Revision: https://reviews.llvm.org/D100989
-
Arnamoy Bhattacharyya authored
Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D100883
-
Nemanja Ivanovic authored
The previous commits just missed some pointer casts and ended up producing warnings.
-
Nemanja Ivanovic authored
Another addition for compatibility with XLC. The functions have the same overloads so just add it as a preprocessor define.
-
Nemanja Ivanovic authored
Add these overloads for compatibility with XLC. This is a word load-and-splat.
-
Nemanja Ivanovic authored
Add these overloads for compatibility with XLC. This is a doubleword load-and-splat.
-
Nemanja Ivanovic authored
Add the overloads for compatibility with XLC.
-
Nemanja Ivanovic authored
Add the overloads for compatibility with XLC.
-