This project is mirrored from https://github.com/llvm/llvm-project.git.
  1. Apr 23, 2021
    • Duncan P. N. Exon Smith's avatar
      Coverage: Document how to collect a profile without a filesystem · d4ee603c
      Duncan P. N. Exon Smith authored
      The profiling runtime was designed to work without static initializers
      or a filesystem (see 117cf2bd and others). The no-static-initializers
      part was already documented, but this part was missed before.
      
      Differential Revision: https://reviews.llvm.org/D101000
      d4ee603c
    • Kai Nacke's avatar
      Fix the triple used in llvm-mca. · 832340ca
      Kai Nacke authored
      lookupTarget() can update the passed triple argument. This happens
      when no triple is given on the command line and the architecture
      argument does not match the architecture in the default triple.
      
      For example, when -march=aarch64 is passed on the command line and
      the default triple is x86_64-windows-msvc, the triple is changed
      to aarch64-windows-msvc.
      
      However, this triple is not saved, and later in the code the triple
      is constructed again from the triple name, which at this point is
      still the default triple. Thus the default triple is passed to the
      constructor of the MCSubtargetInfo instance.
      
      The triple is only used to determine the object file format, and by
      chance the AArch64 target also supports the COFF file format, so all
      is fine. However, the AArch64 target does not support all available
      binary file formats, e.g. XCOFF and GOFF, and llvm-mca crashes in
      those cases.
      
      The fix is to update the triple name with the changed triple name
      from the target lookup. Then the default object file format for the
      architecture is used, ELF in the example.
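      The bug pattern can be illustrated with a small sketch (Python for illustration; llvm-mca itself is C++ and the names here are hypothetical): a lookup routine rewrites the triple it is handed, but the caller discards the result and keeps rebuilding state from the stale default name.

```python
# Hypothetical sketch of the stale-triple bug pattern described above.
def lookup_target(arch, triple):
    """Mimics lookupTarget(): may rewrite the triple to match the arch."""
    default_arch = triple.split("-")[0]
    if arch != default_arch:
        # Replace the architecture component, keep OS/environment.
        triple = "-".join([arch] + triple.split("-")[1:])
    return triple

default_triple = "x86_64-windows-msvc"

# Buggy flow: the updated triple is discarded, so later code
# reconstructs state from the stale default triple name.
_ = lookup_target("aarch64", default_triple)
stale = default_triple  # still x86_64-windows-msvc

# Fixed flow: save the triple returned by the lookup.
fixed = lookup_target("aarch64", default_triple)
```

      Here fixed ends up as aarch64-windows-msvc while stale keeps the default, mirroring how the wrong triple reached the MCSubtargetInfo constructor.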
      
      Reviewed By: andreadb, abhina.sreeskantharajan
      
      Differential Revision: https://reviews.llvm.org/D100992
      832340ca
    • peter klausler's avatar
      [flang] (NFC) Document Fortran feature history · 47283e15
      peter klausler authored
      Add flang/docs/FortranFeatureHistory.md
      
      Differential Revision: https://reviews.llvm.org/D101081
      47283e15
    • Peter Collingbourne's avatar
      scudo: Use a table to look up the LSB for computing the odd/even mask. NFCI. · 4e88e587
      Peter Collingbourne authored
      In the most common case we call computeOddEvenMaskForPointerMaybe()
      from quarantineOrDeallocateChunk(), in which case we need to look up
      the class size from the SizeClassMap in order to compute the LSB. Since
      we need to do a lookup anyway, we may as well look up the LSB itself
      and avoid computing it every time.
      
      While here, switch to a slightly more efficient way of computing the
      odd/even mask.
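      The change can be sketched as follows (Python for illustration; scudo is C++ and the size-class values here are hypothetical): instead of recomputing the least significant bit of the class size on every deallocation, precompute it once per size class and do a single table load.

```python
# Illustrative size classes (hypothetical values, not scudo's real map).
SIZE_CLASSES = [32, 48, 64, 96, 128, 256]

def lsb(x):
    """Index of the least significant set bit (like a count-trailing-zeros)."""
    return (x & -x).bit_length() - 1

# Precomputed once: class index -> LSB of that class's size.
LSB_TABLE = [lsb(size) for size in SIZE_CLASSES]

def lsb_for_class(class_id):
    # One table load instead of a size lookup plus an LSB computation each time.
    return LSB_TABLE[class_id]
```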
      
      Differential Revision: https://reviews.llvm.org/D101018
      4e88e587
    • Vitaly Buka's avatar
      Revert "[sanitizer] Use COMPILER_RT_EMULATOR with gtests" · 68632826
      Vitaly Buka authored
      Missed review comments.
      
      This reverts commit e2508296.
      68632826
    • Philip Reames's avatar
      [SCEV] Compute ranges for lshr recurrences · 424d6cb9
      Philip Reames authored
      Straight forward extension to the recently added infrastructure which was pioneered with shl.
      
      Differential Revision: https://reviews.llvm.org/D99687
      424d6cb9
    • Petr Hosek's avatar
      [Driver] Specify -ccc-install-dir for linux-cross test · 45340efb
      Petr Hosek authored
      This avoids test failures where extra files exist in the tree, such
      as the standard library built using the runtimes build.
      
      Differential Revision: https://reviews.llvm.org/D101023
      45340efb
    • Philip Reames's avatar
      Revert "[instcombine] Exploit UB implied by nofree attributes" · 15e19a25
      Philip Reames authored
      This change effectively reverts 86664638, but since there have been some changes on top and I wanted to leave the tests in, it's not a mechanical revert.
      
      Why revert this now?  Two main reasons:
      1) There are continuing discussions around what the semantics of nofree are.  I am increasingly uncomfortable with the seeming possibility that we might redefine nofree in a way incompatible with these changes.
      2) There was a reported miscompile triggered by this change (https://github.com/emscripten-core/emscripten/issues/9443).  At first, I was making good progress on tracking down the issues exposed, and those issues appeared to be unrelated latent bugs.  Now that we've found at least one bug in the original change, and the investigation has stalled, I'm no longer comfortable leaving this in tree.  In retrospect, I probably should have reverted this earlier and investigated the issues once the triggering change was out of tree.
      15e19a25
    • Craig Topper's avatar
      [RISCV] Add IR intrinsics for vmsge(u).vv/vx/vi. · e01c419e
      Craig Topper authored
      These instructions don't really exist, but we have ways we can
      emulate them.
      
      .vv will swap operands and use vmsle(u).vv. .vi will adjust the
      immediate and use vmsgt(u).vi when possible. For .vx we need to
      use some of the multi-instruction sequences from the V extension
      spec.
      
      For unmasked vmsge(u).vx we use:
        vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd
      
      For cases where mask and maskedoff are the same value, we have
      vmsge{u}.vx v0, va, x, v0.t, which is the vd==v0 case that
      requires a temporary, so we use:
        vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
      
      For other masked cases we use this sequence:
        vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
      We trust that register allocation will prevent vd in vmslt{u}.vx
      from being v0 since v0 is still needed by the vmxor.
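      The rewrites above rest on simple comparison identities, sketched here in Python (illustrative only; the real lowering emits RISC-V vector instructions): a >= b is b <= a for the .vv form, x >= imm is x > imm - 1 for the .vi form, and the unmasked .vx form negates a less-than via vmnand of a mask with itself.

```python
def vmsge_vv(a, b):
    # Emulated by swapping operands and using vmsle: a >= b  <=>  b <= a.
    return b <= a

def vmsge_vi(x, imm):
    # Emulated by decrementing the immediate and using vmsgt:
    # x >= imm  <=>  x > imm - 1 (valid while imm - 1 stays encodable).
    return x > imm - 1

def vmsge_vx_unmasked(x, scalar):
    # vmslt followed by vmnand of the result with itself, i.e. a NOT:
    # x >= s  <=>  not (x < s).
    lt = x < scalar
    return not (lt and lt)  # vmnand.mm vd, vd, vd
```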
      
      Differential Revision: https://reviews.llvm.org/D100925
      e01c419e
    • Craig Topper's avatar
      [RISCV] Add missing tests for vector type for second operand of vmsgt and vmsgtu IR intrinsics. · d77d56ac
      Craig Topper authored
      Refactor to use new multiclass instead of individual patterns.
      
      We already supported this due to SEW=64 on RV32, but we didn't have
      test cases for all the types we supported.
      
      Part of D100925
      d77d56ac
    • Craig Topper's avatar
      [RISCV] Support vector type for second operand of vmfge and vmfgt IR intrinsics. · 9524a055
      Craig Topper authored
      We don't have instructions for these, but can swap the operands
      to use vmle/vmflt. This makes the IR interface more consistent and
      simplifies the frontend implementation.
      
      Part of D100925
      9524a055
    • Vitaly Buka's avatar
      37e14581
    • Vitaly Buka's avatar
    • Vitaly Buka's avatar
      e2508296
    • Andrzej Warzynski's avatar
      [flang] Update recently added OpenMP tests to use the new driver · 43831d62
      Andrzej Warzynski authored
      Switching from `%f18` to `%flang_fc1` in LIT tests added in
      https://reviews.llvm.org/D91159. This way these tests are run with the
      new driver, `flang-new`, when enabled (i.e. when
      `FLANG_BUILD_NEW_DRIVER` is set).
      
      Differential Revision: https://reviews.llvm.org/D101078
      43831d62
    • Fangrui Song's avatar
      Temporarily revert the code part of D100981 "Delete le32/le64 targets" · ef5e7f90
      Fangrui Song authored
      This partially reverts commit 77ac823f.
      
      Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934).
      Temporarily brings back the code part to give them some time for migration.
      ef5e7f90
    • Vitaly Buka's avatar
      149d5a8c
    • Craig Topper's avatar
      [RISCV] Turn splat shuffles of vector loads into strided load with stride of x0. · 70254ccb
      Craig Topper authored
      Implementations are allowed to optimize an x0 stride to perform
      fewer memory accesses. This is the case in SiFive cores.
      
      No idea if this is the case in other implementations. We might
      need a tuning flag for this.
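      A stride of x0 means every element is loaded from the same address, which is what makes the strided load usable as a splat. A sketch of the semantics (Python for illustration; the function name is hypothetical):

```python
def strided_load(mem, base, stride, n):
    """Load n elements starting at base, advancing by stride bytes each time."""
    return [mem[base + i * stride] for i in range(n)]

mem = {0: 7, 8: 9, 16: 11}

# A normal strided load walks through memory...
gather = strided_load(mem, 0, 8, 3)   # [7, 9, 11]

# ...but a zero (x0) stride rereads one location: a splat. An
# implementation may legally perform a single memory access here.
splat = strided_load(mem, 8, 0, 3)    # [9, 9, 9]
```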
      
      Reviewed By: frasercrmck, arcbbb
      
      Differential Revision: https://reviews.llvm.org/D100815
      70254ccb
    • Raphael Isemann's avatar
      [lldb] Fix that the expression commands --top-level flag overwrites --allow-jit false · d616a6bd
      Raphael Isemann authored
      The `--allow-jit` flag allows the user to force the IR interpreter to run the
      provided expression.
      
      The `--top-level` flag parses and injects the code as if it's in the top
      level scope of a source file.
      
      Both flags just change the ExecutionPolicy of the expression:
      * `--allow-jit true` -> doesn't change anything (it's the default)
      * `--allow-jit false` -> ExecutionPolicyNever
      * `--top-level` -> ExecutionPolicyTopLevel
      
      Passing `--allow-jit false` and `--top-level` currently causes the
      `--top-level` flag to silently overwrite the ExecutionPolicy value that
      was set by `--allow-jit false`. There isn't any ExecutionPolicy value
      that says "top-level but only interpret", so I would say we reject this
      combination of flags until someone finds time to refactor the top-level
      feature out of the ExecutionPolicy enum.
      
      The SBExpressionOptions suffer from a similar symptom, as `SetTopLevel`
      and `SetAllowJIT` just silently disable each other. But those functions
      don't have any error handling, so there's not a lot we can do about this
      in the meantime.
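      The proposed behavior can be sketched as follows (Python for illustration; the names are hypothetical, and the real logic lives in LLDB's C++ option parsing): both flags map onto a single ExecutionPolicy value, so an incompatible pair is rejected instead of silently resolving to whichever flag is applied last.

```python
from enum import Enum

class ExecutionPolicy(Enum):
    ALWAYS = "always"        # default: JIT allowed
    NEVER = "never"          # --allow-jit false: interpreter only
    TOP_LEVEL = "top-level"  # --top-level

def resolve_policy(allow_jit, top_level):
    # There is no "top-level but interpret-only" policy, so reject the
    # combination instead of letting one flag clobber the other.
    if top_level and not allow_jit:
        raise ValueError("--allow-jit false is incompatible with --top-level")
    if top_level:
        return ExecutionPolicy.TOP_LEVEL
    return ExecutionPolicy.ALWAYS if allow_jit else ExecutionPolicy.NEVER
```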
      
      Reviewed By: labath, kastiglione
      
      Differential Revision: https://reviews.llvm.org/D91780
      d616a6bd
    • Craig Topper's avatar
      [RISCV] Use stack temporary to splat two GPRs into SEW=64 vector on RV32. · 77f14c96
      Craig Topper authored
      Rather than splatting each GPR separately and doing bit manipulation
      to merge them in the vector domain, copy the data to the stack
      and splat it using a strided load with x0 stride. At least on
      some implementations this vector load is optimized to not do
      a load for each element.
      
      This is equivalent to how we move i64 to f64 on RV32.
      
      I've only implemented this for the intrinsic fallbacks in this
      patch. I think we do similar splatting/shifting/oring in other
      places. If this is approved, I'll refactor the others to share
      the code.
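      The idea can be sketched as follows (Python for illustration; the real code emits two stores and a strided vector load): store the two 32-bit halves contiguously to a stack slot, reread the 8 bytes as one 64-bit element, and splat it with a stride-0 load.

```python
import struct

def splat_i64_from_gprs(lo32, hi32, vl):
    """Model: spill two 32-bit GPRs to a stack slot, reload as one i64,
    then splat it with a stride-0 (x0) strided vector load."""
    # Store the halves to the "stack" (little-endian, low word first).
    stack = struct.pack("<II", lo32, hi32)
    # Reload the slot as a single 64-bit element.
    (value,) = struct.unpack("<Q", stack)
    # Stride-0 load: every lane reads the same 8 bytes.
    return [value] * vl
```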
      
      Differential Revision: https://reviews.llvm.org/D101002
      77f14c96
    • Krzysztof Parzyszek's avatar
      [Hexagon] Add HVX intrinsics for conditional vector loads/stores · deda60fc
      Krzysztof Parzyszek authored
      Intrinsics for the following instructions are added. The intrinsic
      name is "int_hexagon_<inst>[_128B]", e.g.
        int_hexagon_V6_vL32b_pred_ai        for 64-byte version
        int_hexagon_V6_vL32b_pred_ai_128B   for 128-byte version
      
      V6_vL32b_pred_ai        if (Pv4) Vd32 = vmem(Rt32+#s4)
      V6_vL32b_pred_pi        if (Pv4) Vd32 = vmem(Rx32++#s3)
      V6_vL32b_pred_ppu       if (Pv4) Vd32 = vmem(Rx32++Mu2)
      V6_vL32b_npred_ai       if (!Pv4) Vd32 = vmem(Rt32+#s4)
      V6_vL32b_npred_pi       if (!Pv4) Vd32 = vmem(Rx32++#s3)
      V6_vL32b_npred_ppu      if (!Pv4) Vd32 = vmem(Rx32++Mu2)
      
      V6_vL32b_nt_pred_ai     if (Pv4) Vd32 = vmem(Rt32+#s4):nt
      V6_vL32b_nt_pred_pi     if (Pv4) Vd32 = vmem(Rx32++#s3):nt
      V6_vL32b_nt_pred_ppu    if (Pv4) Vd32 = vmem(Rx32++Mu2):nt
      V6_vL32b_nt_npred_ai    if (!Pv4) Vd32 = vmem(Rt32+#s4):nt
      V6_vL32b_nt_npred_pi    if (!Pv4) Vd32 = vmem(Rx32++#s3):nt
      V6_vL32b_nt_npred_ppu   if (!Pv4) Vd32 = vmem(Rx32++Mu2):nt
      
      V6_vS32b_pred_ai        if (Pv4) vmem(Rt32+#s4) = Vs32
      V6_vS32b_pred_pi        if (Pv4) vmem(Rx32++#s3) = Vs32
      V6_vS32b_pred_ppu       if (Pv4) vmem(Rx32++Mu2) = Vs32
      V6_vS32b_npred_ai       if (!Pv4) vmem(Rt32+#s4) = Vs32
      V6_vS32b_npred_pi       if (!Pv4) vmem(Rx32++#s3) = Vs32
      V6_vS32b_npred_ppu      if (!Pv4) vmem(Rx32++Mu2) = Vs32
      
      V6_vS32Ub_pred_ai       if (Pv4) vmemu(Rt32+#s4) = Vs32
      V6_vS32Ub_pred_pi       if (Pv4) vmemu(Rx32++#s3) = Vs32
      V6_vS32Ub_pred_ppu      if (Pv4) vmemu(Rx32++Mu2) = Vs32
      V6_vS32Ub_npred_ai      if (!Pv4) vmemu(Rt32+#s4) = Vs32
      V6_vS32Ub_npred_pi      if (!Pv4) vmemu(Rx32++#s3) = Vs32
      V6_vS32Ub_npred_ppu     if (!Pv4) vmemu(Rx32++Mu2) = Vs32
      
      V6_vS32b_nt_pred_ai     if (Pv4) vmem(Rt32+#s4):nt = Vs32
      V6_vS32b_nt_pred_pi     if (Pv4) vmem(Rx32++#s3):nt = Vs32
      V6_vS32b_nt_pred_ppu    if (Pv4) vmem(Rx32++Mu2):nt = Vs32
      V6_vS32b_nt_npred_ai    if (!Pv4) vmem(Rt32+#s4):nt = Vs32
      V6_vS32b_nt_npred_pi    if (!Pv4) vmem(Rx32++#s3):nt = Vs32
      V6_vS32b_nt_npred_ppu   if (!Pv4) vmem(Rx32++Mu2):nt = Vs32
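      The predicated forms above share one semantic: the memory operation happens only when the predicate (or its negation, for the npred variants) is true; otherwise the destination register or memory is left untouched. A sketch (Python for illustration; the models are simplified to one value per vector):

```python
def v_load_pred(pred, negate, mem, addr, vd_old):
    """Model of V6_vL32b_[n]pred_*: conditional vector load.
    If the (possibly negated) predicate is false, Vd keeps its old value."""
    take = (not pred) if negate else pred
    return mem[addr] if take else vd_old

def v_store_pred(pred, negate, mem, addr, vs):
    """Model of V6_vS32b_[n]pred_*: conditional vector store.
    If the (possibly negated) predicate is false, memory is unchanged."""
    take = (not pred) if negate else pred
    if take:
        mem[addr] = vs
```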
      deda60fc
    • Joseph Huber's avatar
      [OpenMP] Add function for setting LIBOMPTARGET_INFO at runtime · 2b6f2008
      Joseph Huber authored
      Summary:
      This patch adds a new runtime function __tgt_set_info_flag that allows the
      user to set the information level at runtime without using the environment
      variable. Using this will require an extern function, but it will eventually
      be added to an auxiliary library of OpenMP support functions.
      
      This patch required moving the current InfoLevel to a global variable which must
      be instantiated by each plugin.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D100774
      2b6f2008
    • Raphael Isemann's avatar
      Fix memory leak in MicrosoftDemangleNodes's Node::toString · ae209aa9
      Raphael Isemann authored
      The buffer we turn into a std::string here is malloc'd and should be
      free'd before we return from this function.
      
      Follow up to LLDB leak fixes such as D100806.
      
      Reviewed By: mstorsjo, rupprecht, MaskRay
      
      Differential Revision: https://reviews.llvm.org/D100843
      ae209aa9
    • peter klausler's avatar
      [flang] Fix spurious errors from runtime derived type table construction · 803f1e46
      peter klausler authored
      Andrzej W. @ Arm discovered that the runtime derived type table
      building code in semantics was detecting fatal errors in the tests
      that the f18 driver wasn't printing.  This patch fixes f18 so that
      these messages are printed; however, the messages were not valid user
      errors, and the rest of this patch fixes them up.
      
      There were two sources of the bogus errors.  One was that the runtime
      derived type information table builder was calculating the shapes of
      allocatable and pointer array components in derived types, and then
      complaining that they weren't constant or LEN parameter values, which
      of course they couldn't be since they have to have deferred shapes
      and those bounds were expressions like LBOUND(component,dim=1).
      
      The second was that f18 was forwarding the actual LEN type parameter
      expressions of a type instantiation too far into the uses of those
      parameters in various expressions in the declarations of components;
      when an actual LEN type parameter is not a constant value, it needs
      to remain a "bare" type parameter inquiry so that it will be lowered
      to a descriptor inquiry and acquire a captured expression value.
      
      Fixing this up properly involved: moving some code into new utility
      function templates in Evaluate/tools.h, tweaking the rewriting of
      conversions in expression folding to elide needless integer kind
      conversions of type parameter inquiries, making type parameter
      inquiry folding *not* replace bare LEN type parameters with
      non-constant actual parameter values, and cleaning up some
      altered test results.
      
      Differential Revision: https://reviews.llvm.org/D101001
      803f1e46
    • Jianzhou Zhao's avatar
      [dfsan] Track origin at loads · 7fdf2709
      Jianzhou Zhao authored
      The first version of origin tracking tracks only memory stores. Although
      this is sufficient for understanding correct flows, it is hard to figure
      out where an undefined value is read from. To find reads of undefined
      values, we still have to do a reverse binary search from the last store
      in the chain, with printing and logging at possible code paths. This is
      quite inefficient.
      
      Tracking memory load instructions can help in this case. The main issues
      with tracking loads are performance and code size overheads.
      
      With tracking only stores, the code size overhead is 38%,
      memory overhead is 1x, and CPU overhead is 3x. In practice #load is much
      larger than #store, so both code size and CPU overhead increase. The
      first blocker is code size overhead: linking fails if we inline load
      tracking. The workaround is using external function calls to propagate
      metadata; this is also the workaround ASan uses. The CPU overhead
      is ~10x. This is a trade-off between debuggability and performance,
      and will be used only when debugging cases where tracking only stores
      is not enough.
      
      Reviewed By: gbalats
      
      Differential Revision: https://reviews.llvm.org/D100967
      7fdf2709
    • Arthur O'Dwyer's avatar
      [libc++] [test] Fix nodiscard_extensions.pass.cpp in _LIBCPP_DEBUG mode. · 5dfbcc5a
      Arthur O'Dwyer authored
      `std::clamp(2, 1, 3, std::greater<int>())` has UB because (1 > 3) is false.
      Swap the operands to fix the _LIBCPP_ASSERT failure in this test.
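      The precondition can be sketched in Python (illustrative; std::clamp itself is C++): with a descending comparator like std::greater, the bounds must also be given in descending order, i.e. comp(hi, lo) must be false, otherwise behavior is undefined.

```python
def clamp(value, lo, hi, comp):
    # Mirrors std::clamp's precondition: comp(hi, lo) must be false,
    # i.e. the bounds are ordered consistently with the comparator.
    assert not comp(hi, lo), "undefined behavior in std::clamp"
    if comp(value, lo):
        return lo
    if comp(hi, value):
        return hi
    return value

greater = lambda a, b: a > b

# Fixed call from the test: bounds swapped to match std::greater.
result = clamp(2, 3, 1, greater)   # 2 lies within [3, 1] under >
```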
      5dfbcc5a
    • Irina Dobrescu's avatar
      [flang][openmp] Add General Semantic Checks for Allocate Directive · 123ae425
      Irina Dobrescu authored
      This patch adds semantic checks for the General Restrictions of the
      Allocate Directive.
      
      Since the requires directive is not yet implemented in Flang, the
      restriction:
      ```
      allocate directives that appear in a target region must
      specify an allocator clause unless a requires directive with the
      dynamic_allocators clause is present in the same compilation unit
      ```
      will need to be updated at a later time.
      
      A different patch will be made with the Fortran specific restrictions of
      this directive.
      
      I have used the code from https://reviews.llvm.org/D89395 for the
      CheckObjectListStructure function.
      
      Co-authored-by: Isaac Perry <isaac.perry@arm.com>
      
      Reviewed By: clementval, kiranchandramohan
      
      Differential Revision: https://reviews.llvm.org/D91159
      123ae425
    • Sanjay Patel's avatar
      11232037
    • Arthur O'Dwyer's avatar
      b98b6d99
    • Arthur O'Dwyer's avatar
  2. Apr 22, 2021