class LoopVectorizationCostModel

Declaration

class LoopVectorizationCostModel { /* full declaration omitted */ };

Description

Estimates the expected speedup from vectorization. Vectorization is often not profitable, and this can happen for a number of reasons. This class mainly attempts to predict the expected speedup or slowdown for the supported instruction set. It uses TargetTransformInfo to query the different backends for the cost of individual operations.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1133

Member Variables

private unsigned int NumPredStores = 0
private MapVector<llvm::Instruction*, uint64_t> MinBWs: Map of scalar integer values to the smallest bitwidth they can be legally represented as. The vector equivalents of these values should be truncated to this type.
private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::BasicBlock*, 4>> PredicatedBBsAfterVectorization: A set containing all BasicBlocks that are known to be present after vectorization as a predicated block.
private llvm::ScalarEpilogueLowering ScalarEpilogueStatus = CM_ScalarEpilogueAllowed: Records whether it is allowed to have the original scalar loop execute at least once. This may be needed as a fallback loop in case runtime aliasing/dependence checks fail, or to handle the tail/remainder iterations when the trip count is unknown or doesn't divide by the VF, or as a peel-loop to handle gaps in interleave-groups. Under optsize and when the trip count is very small we don't allow any iterations to execute in the scalar loop.
private bool FoldTailByMasking = false: All blocks of the loop are to be masked in order to fold the tail of scalar iterations.
private DenseMap<llvm::ElementCount, llvm::LoopVectorizationCostModel::ScalarCostsTy> InstsToScalarize: A map holding scalar costs for different vectorization factors. The presence of a cost for an instruction in the mapping indicates that the instruction will be scalarized when vectorizing with the associated vectorization factor. The entries are VF-ScalarCostsTy pairs.
private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> Uniforms: Holds the instructions known to be uniform after vectorization. The data is collected per VF.
private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> Scalars: Holds the instructions known to be scalar after vectorization. The data is collected per VF.
private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> ForcedScalars: Holds the instructions (address computations) that are forced to be scalarized.
private llvm::LoopVectorizationCostModel::ReductionChainMap InLoopReductionChains: PHINodes of the reductions that should be expanded in-loop along with their associated chains of reduction operations, in program order from top (PHI) to bottom.
private DenseMap<llvm::Instruction*, llvm::Instruction*> InLoopReductionImmediateChains: A map of in-loop reduction operations and their immediate chain operand. FIXME: This can be removed once reductions can be costed correctly in VPlan. This was added to allow quick lookup of the in-loop operations without having to loop through InLoopReductionChains.
private llvm::LoopVectorizationCostModel::DecisionList WideningDecisions
public llvm::Loop* TheLoop: The loop that we evaluate.
public llvm::PredicatedScalarEvolution& PSE: Predicated scalar evolution analysis.
public llvm::LoopInfo* LI: Loop Info analysis.
public llvm::LoopVectorizationLegality* Legal: Vectorization legality.
public const llvm::TargetTransformInfo& TTI: Vector target information.
public const llvm::TargetLibraryInfo* TLI: Target Library Info.
public llvm::DemandedBits* DB: Demanded bits analysis.
public llvm::AssumptionCache* AC: Assumption cache.
public llvm::OptimizationRemarkEmitter* ORE: Interface to emit optimization remarks.
public const llvm::Function* TheFunction
public const llvm::LoopVectorizeHints* Hints: Loop Vectorize Hint.
public llvm::InterleavedAccessInfo& InterleaveInfo: The interleave access information contains groups of interleaved accesses with the same stride and close to each other.
public SmallPtrSet<const llvm::Value*, 16> ValuesToIgnore: Values to ignore in the cost model.
public SmallPtrSet<const llvm::Value*, 16> VecValuesToIgnore: Values to ignore in the cost model when VF > 1.
public SmallPtrSet<llvm::Type*, 16> ElementTypesInLoop: All element types found in the loop.
public SmallVector<llvm::VectorizationFactor, 8> ProfitableVFs: Profitable vector factors.

Method Overview

public LoopVectorizationCostModel(llvm::ScalarEpilogueLowering SEL, llvm::Loop * L, llvm::PredicatedScalarEvolution & PSE, llvm::LoopInfo * LI, llvm::LoopVectorizationLegality * Legal, const llvm::TargetTransformInfo & TTI, const llvm::TargetLibraryInfo * TLI, llvm::DemandedBits * DB, llvm::AssumptionCache * AC, llvm::OptimizationRemarkEmitter * ORE, const llvm::Function * F, const llvm::LoopVectorizeHints * Hints, llvm::InterleavedAccessInfo & IAI)
public bool blockNeedsPredicationForAnyReason(llvm::BasicBlock * BB) const
public SmallVector<llvm::LoopVectorizationCostModel::RegisterUsage, 8> calculateRegisterUsage(ArrayRef<llvm::ElementCount> VFs)
public bool canTruncateToMinimalBitwidth(llvm::Instruction * I, llvm::ElementCount VF) const
public bool canVectorizeReductions(llvm::ElementCount VF) const
public void collectElementTypesForWidening()
public void collectInLoopReductions()
public void collectInstsToScalarize(llvm::ElementCount VF)
private void collectLoopScalars(llvm::ElementCount VF)
private void collectLoopUniforms(llvm::ElementCount VF)
public void collectUniformsAndScalars(llvm::ElementCount VF)
public void collectValuesToIgnore()
private llvm::FixedScalableVFPair computeFeasibleMaxVF(unsigned int ConstTripCount, llvm::ElementCount UserVF, bool FoldTailByMasking)
public llvm::FixedScalableVFPair computeMaxVF(llvm::ElementCount UserVF, unsigned int UserIC)
private int computePredInstDiscount(llvm::Instruction * PredInst, llvm::LoopVectorizationCostModel::ScalarCostsTy & ScalarCosts, llvm::ElementCount VF)
private llvm::LoopVectorizationCostModel::VectorizationCostTy expectedCost(llvm::ElementCount VF, SmallVectorImpl<llvm::LoopVectorizationCostModel::InstructionVFPair> * Invalid = nullptr)
private SmallVector<llvm::Value *, 4> filterExtractingOperands(Instruction::op_range Ops, llvm::ElementCount VF) const
public bool foldTailByMasking() const
private llvm::InstructionCost getConsecutiveMemOpCost(llvm::Instruction * I, llvm::ElementCount VF)
private llvm::InstructionCost getGatherScatterCost(llvm::Instruction * I, llvm::ElementCount VF)
public const llvm::LoopVectorizationCostModel::ReductionChainMap & getInLoopReductionChains() const
private llvm::LoopVectorizationCostModel::VectorizationCostTy getInstructionCost(llvm::Instruction * I, llvm::ElementCount VF)
private llvm::InstructionCost getInstructionCost(llvm::Instruction * I, llvm::ElementCount VF, llvm::Type *& VectorTy)
private llvm::InstructionCost getInterleaveGroupCost(llvm::Instruction * I, llvm::ElementCount VF)
public const InterleaveGroup<llvm::Instruction> * getInterleavedAccessGroup(llvm::Instruction * Instr)
private llvm::ElementCount getMaxLegalScalableVF(unsigned int MaxSafeElements)
private llvm::ElementCount getMaximizedVFForTarget(unsigned int ConstTripCount, unsigned int SmallestType, unsigned int WidestType, llvm::ElementCount MaxSafeVF, bool FoldTailByMasking)
private llvm::InstructionCost getMemInstScalarizationCost(llvm::Instruction * I, llvm::ElementCount VF)
private llvm::InstructionCost getMemoryInstructionCost(llvm::Instruction * I, llvm::ElementCount VF)
public const MapVector<llvm::Instruction *, uint64_t> & getMinimalBitwidths() const
private Optional<llvm::InstructionCost> getReductionPatternCost(llvm::Instruction * I, llvm::ElementCount VF, llvm::Type * VectorTy, TTI::TargetCostKind CostKind)
private llvm::InstructionCost getScalarizationOverhead(llvm::Instruction * I, llvm::ElementCount VF) const
public std::pair<unsigned int, unsigned int> getSmallestAndWidestTypes()
private llvm::InstructionCost getUniformMemOpCost(llvm::Instruction * I, llvm::ElementCount VF)
public Optional<unsigned int> getVScaleForTuning() const
public llvm::InstructionCost getVectorCallCost(llvm::CallInst * CI, llvm::ElementCount VF, bool & NeedToScalarize) const
public llvm::InstructionCost getVectorIntrinsicCost(llvm::CallInst * CI, llvm::ElementCount VF) const
public llvm::InstructionCost getWideningCost(llvm::Instruction * I, llvm::ElementCount VF)
public llvm::LoopVectorizationCostModel::InstWidening getWideningDecision(llvm::Instruction * I, llvm::ElementCount VF) const
public bool interleavedAccessCanBeWidened(llvm::Instruction * I, llvm::ElementCount VF = ElementCount::getFixed(1))
public void invalidateCostModelingDecisions()
public bool isAccessInterleaved(llvm::Instruction * Instr)
private bool isCandidateForEpilogueVectorization(const llvm::Loop & L, const llvm::ElementCount VF) const
private bool isEpilogueVectorizationProfitable(const llvm::ElementCount VF) const
public bool isInLoopReduction(llvm::PHINode * Phi) const
public bool isLegalGatherOrScatter(llvm::Value * V, llvm::ElementCount VF = ElementCount::getFixed(1))
public bool isLegalMaskedLoad(llvm::Type * DataType, llvm::Value * Ptr, llvm::Align Alignment) const
public bool isLegalMaskedStore(llvm::Type * DataType, llvm::Value * Ptr, llvm::Align Alignment) const
public bool isMoreProfitable(const llvm::VectorizationFactor & A, const llvm::VectorizationFactor & B) const
public bool isOptimizableIVTruncate(llvm::Instruction * I, llvm::ElementCount VF)
public bool isPredicatedInst(llvm::Instruction * I, llvm::ElementCount VF)
public bool isProfitableToScalarize(llvm::Instruction * I, llvm::ElementCount VF) const
public bool isScalarAfterVectorization(llvm::Instruction * I, llvm::ElementCount VF) const
public bool isScalarEpilogueAllowed() const
public bool isScalarWithPredication(llvm::Instruction * I, llvm::ElementCount VF) const
public bool isUniformAfterVectorization(llvm::Instruction * I, llvm::ElementCount VF) const
public bool memoryInstructionCanBeWidened(llvm::Instruction * I, llvm::ElementCount VF = ElementCount::getFixed(1))
private bool needsExtract(llvm::Value * V, llvm::ElementCount VF) const
public bool requiresScalarEpilogue(llvm::ElementCount VF) const
public bool runtimeChecksRequired()
public llvm::VectorizationFactor selectEpilogueVectorizationFactor(const llvm::ElementCount MaxVF, const llvm::LoopVectorizationPlanner & LVP)
public unsigned int selectInterleaveCount(llvm::ElementCount VF, unsigned int LoopCost)
public bool selectUserVectorizationFactor(llvm::ElementCount UserVF)
public llvm::VectorizationFactor selectVectorizationFactor(const llvm::ElementCountSet & CandidateVFs)
public void setCostBasedWideningDecision(llvm::ElementCount VF)
public void setWideningDecision(llvm::Instruction * I, llvm::ElementCount VF, llvm::LoopVectorizationCostModel::InstWidening W, llvm::InstructionCost Cost)
public void setWideningDecision(const InterleaveGroup<llvm::Instruction> * Grp, llvm::ElementCount VF, llvm::LoopVectorizationCostModel::InstWidening W, llvm::InstructionCost Cost)
public bool useActiveLaneMaskForControlFlow() const
private bool useEmulatedMaskMemRefHack(llvm::Instruction * I, llvm::ElementCount VF)
public bool useOrderedReductions(const llvm::RecurrenceDescriptor & RdxDesc) const

Methods

¶LoopVectorizationCostModel(
    llvm::ScalarEpilogueLowering SEL,
    llvm::Loop* L,
    llvm::PredicatedScalarEvolution& PSE,
    llvm::LoopInfo* LI,
    llvm::LoopVectorizationLegality* Legal,
    const llvm::TargetTransformInfo& TTI,
    const llvm::TargetLibraryInfo* TLI,
    llvm::DemandedBits* DB,
    llvm::AssumptionCache* AC,
    llvm::OptimizationRemarkEmitter* ORE,
    const llvm::Function* F,
    const llvm::LoopVectorizeHints* Hints,
    llvm::InterleavedAccessInfo& IAI)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1135

Parameters

llvm::ScalarEpilogueLowering SEL
llvm::Loop* L
llvm::PredicatedScalarEvolution& PSE
llvm::LoopInfo* LI
llvm::LoopVectorizationLegality* Legal
const llvm::TargetTransformInfo& TTI
const llvm::TargetLibraryInfo* TLI
llvm::DemandedBits* DB
llvm::AssumptionCache* AC
llvm::OptimizationRemarkEmitter* ORE
const llvm::Function* F
const llvm::LoopVectorizeHints* Hints
llvm::InterleavedAccessInfo& IAI

¶bool blockNeedsPredicationForAnyReason(
    llvm::BasicBlock* BB) const

Description

Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1522

Parameters

llvm::BasicBlock* BB

¶SmallVector<llvm::LoopVectorizationCostModel::
                RegisterUsage,
            8>
calculateRegisterUsage(
    ArrayRef<llvm::ElementCount> VFs)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1210

Parameters

ArrayRef<llvm::ElementCount> VFs

Returns

Returns information about the register usages of the loop for the given vectorization factors.

¶bool canTruncateToMinimalBitwidth(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1288

Parameters

llvm::Instruction* I
llvm::ElementCount VF

Returns

True if instruction \p I can be truncated to a smaller bitwidth for vectorization factor \p VF.

¶bool canVectorizeReductions(
    llvm::ElementCount VF) const

Description

Returns true if the target machine supports all of the reduction variables found for the given VF.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1431

Parameters

llvm::ElementCount VF

¶void collectElementTypesForWidening()

Description

Collect all element types in the loop for which widening is needed.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1216

¶void collectInLoopReductions()

Description

Split reductions into those that happen in the loop and those that happen outside. In-loop reductions are collected into InLoopReductionChains.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1220

¶void collectInstsToScalarize(
    llvm::ElementCount VF)

Description

Collects the instructions to scalarize for each predicated instruction in the loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1385

Parameters

llvm::ElementCount VF

¶void collectLoopScalars(llvm::ElementCount VF)

Description

Collect the instructions that are scalar after vectorization. An instruction is scalar if it is known to be uniform or will be scalarized during vectorization. collectLoopScalars should only add non-uniform nodes to the list if they are used by a load/store instruction that is marked as CM_Scalarize. Non-uniform scalarized instructions will be represented by VF values in the vectorized loop, each corresponding to an iteration of the original scalar loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1743

Parameters

llvm::ElementCount VF

¶void collectLoopUniforms(llvm::ElementCount VF)

Description

Collect the instructions that are uniform after vectorization. An instruction is uniform if we represent it with a single scalar value in the vectorized loop corresponding to each vector iteration. Examples of uniform instructions include pointer operands of consecutive or interleaved memory accesses. Note that although uniformity implies an instruction will be scalar, the reverse is not true. In general, a scalarized instruction will be represented by VF scalar values in the vectorized loop, each corresponding to an iteration of the original scalar loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1734

Parameters

llvm::ElementCount VF
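
To make the distinction between uniform and scalarized values concrete, the sketch below hand-vectorizes a simple loop with a fixed VF of 4. It is an illustration only and is not taken from LoopVectorize.cpp; `VF` and `addOneVectorized` are hypothetical names. The base pointer is computed once per vector iteration (uniform), whereas a scalarized instruction would instead be replicated once per lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vectorization factor for this illustration.
constexpr std::size_t VF = 4;

// Adds 1 to every element; assumes A.size() is a multiple of VF.
void addOneVectorized(std::vector<int> &A) {
  assert(A.size() % VF == 0 && "sketch has no remainder loop");
  for (std::size_t I = 0; I < A.size(); I += VF) {
    // Uniform: one base pointer per vector iteration serves all VF lanes
    // of the consecutive access.
    int *Base = A.data() + I;
    // Stand-in for a single wide load/add/store over VF lanes.
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      Base[Lane] += 1;
  }
}
```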

¶void collectUniformsAndScalars(
    llvm::ElementCount VF)

Description

Collect Uniform and Scalar values for the given \p VF. The sets depend on the cost model's decision for load/store instructions, which may be vectorized as interleaved accesses, vectorized as gather/scatter, or scalarized.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1390

Parameters

llvm::ElementCount VF

¶void collectValuesToIgnore()

Description

Collect values we want to ignore in the cost model.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1213

¶llvm::FixedScalableVFPair computeFeasibleMaxVF(
    unsigned int ConstTripCount,
    llvm::ElementCount UserVF,
    bool FoldTailByMasking)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1580

Parameters

unsigned int ConstTripCount
llvm::ElementCount UserVF
bool FoldTailByMasking

Returns

An upper bound for the vectorization factors for both fixed and scalable vectorization, where the minimum-known number of elements is a power-of-2 larger than zero. If scalable vectorization is disabled or unsupported, then the scalable part will be equal to ElementCount::getScalable(0).

¶llvm::FixedScalableVFPair computeMaxVF(
    llvm::ElementCount UserVF,
    unsigned int UserIC)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1151

Parameters

llvm::ElementCount UserVF
unsigned int UserIC

Returns

An upper bound for the vectorization factors (both fixed and scalable). If the factors are 0, vectorization and interleaving should be avoided up front.

¶int computePredInstDiscount(
    llvm::Instruction* PredInst,
    llvm::LoopVectorizationCostModel::
        ScalarCostsTy& ScalarCosts,
    llvm::ElementCount VF)

Description

Returns the expected difference in cost from scalarizing the expression feeding a predicated instruction \p PredInst. The instructions to scalarize and their scalar costs are collected in \p ScalarCosts. A non-negative return value implies the expression will be scalarized. Currently, only single-use chains are considered for scalarization.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1722

Parameters

llvm::Instruction* PredInst
llvm::LoopVectorizationCostModel::ScalarCostsTy& ScalarCosts
llvm::ElementCount VF
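
The sign convention can be sketched as follows, assuming the discount is accumulated as vector cost minus scalar cost over the instructions of a single-use chain. `ChainCost`, `computeDiscount` and `shouldScalarize` are hypothetical names used only for this illustration, and the costs are plain integers rather than llvm::InstructionCost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-instruction cost pair; not an LLVM type.
struct ChainCost {
  int VectorCost; // cost if the instruction stays vectorized
  int ScalarCost; // cost if it is scalarized (all lanes)
};

// Sums (vector cost - scalar cost) over a single-use chain.
int computeDiscount(const std::vector<ChainCost> &Chain) {
  int Discount = 0;
  for (const ChainCost &C : Chain) {
    assert(C.VectorCost >= 0 && C.ScalarCost >= 0);
    Discount += C.VectorCost - C.ScalarCost;
  }
  return Discount;
}

// A non-negative discount means scalarizing is no more expensive.
bool shouldScalarize(const std::vector<ChainCost> &Chain) {
  return computeDiscount(Chain) >= 0;
}
```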

¶llvm::LoopVectorizationCostModel::
    VectorizationCostTy
    expectedCost(
        llvm::ElementCount VF,
        SmallVectorImpl<
            llvm::LoopVectorizationCostModel::
                InstructionVFPair>* Invalid =
            nullptr)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1612

Parameters

llvm::ElementCount VF
SmallVectorImpl<llvm::LoopVectorizationCostModel::InstructionVFPair>* Invalid = nullptr

¶SmallVector<llvm::Value*, 4>
filterExtractingOperands(
    Instruction::op_range Ops,
    llvm::ElementCount VF) const

Description

Returns a range containing only operands needing to be extracted.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1771

Parameters

Instruction::op_range Ops
llvm::ElementCount VF

¶bool foldTailByMasking() const

Description

Returns true if all loop blocks should be masked to fold the tail of the loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1510
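
As a rough picture of what folding the tail by masking means, the sketch below rounds the trip count up to a multiple of a fixed VF and guards each lane with a mask, so no scalar remainder loop is needed. This is a hand-written illustration, not the code the vectorizer emits; `VF` and `doubleWithFoldedTail` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Doubles every element without a scalar remainder loop.
// Returns the number of vector iterations executed.
std::size_t doubleWithFoldedTail(std::vector<int> &A) {
  const std::size_t N = A.size();
  std::size_t VectorIterations = 0;
  // The loop runs ceil(N / VF) times, i.e. the trip count is rounded up.
  for (std::size_t I = 0; I < N; I += VF, ++VectorIterations) {
    for (std::size_t Lane = 0; Lane < VF; ++Lane) {
      // Per-lane mask: lanes past the original trip count stay inactive.
      const bool Active = I + Lane < N;
      if (Active)
        A[I + Lane] *= 2;
    }
  }
  assert(VectorIterations == (N + VF - 1) / VF);
  return VectorIterations;
}
```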

¶llvm::InstructionCost getConsecutiveMemOpCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

The cost computation for widening instruction \p I with consecutive memory access.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1644

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶llvm::InstructionCost getGatherScatterCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

The cost computation for Gather/Scatter instruction.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1640

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶const llvm::LoopVectorizationCostModel::
    ReductionChainMap&
    getInLoopReductionChains() const

Description

Return the chain of instructions representing an inloop reduction.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1533

¶llvm::LoopVectorizationCostModel::
    VectorizationCostTy
    getInstructionCost(llvm::Instruction* I,
                       llvm::ElementCount VF)

Description

Returns the execution time cost of an instruction for a given vector width. A vector width of one means scalar.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1617

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶llvm::InstructionCost getInstructionCost(
    llvm::Instruction* I,
    llvm::ElementCount VF,
    llvm::Type*& VectorTy)

Description

The cost-computation logic from getInstructionCost which provides the vector type as an output parameter.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1621

Parameters

llvm::Instruction* I
llvm::ElementCount VF
llvm::Type*& VectorTy

¶llvm::InstructionCost getInterleaveGroupCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

The cost computation for interleaving group of memory instructions.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1637

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶const InterleaveGroup<llvm::Instruction>*
getInterleavedAccessGroup(
    llvm::Instruction* Instr)

Description

Get the interleaved access group that \p Instr belongs to.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1487

Parameters

llvm::Instruction* Instr

¶llvm::ElementCount getMaxLegalScalableVF(
    unsigned int MaxSafeElements)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1595

Parameters

unsigned int MaxSafeElements

Returns

The maximum legal scalable VF, based on the safe max number of elements.

¶llvm::ElementCount getMaximizedVFForTarget(
    unsigned int ConstTripCount,
    unsigned int SmallestType,
    unsigned int WidestType,
    llvm::ElementCount MaxSafeVF,
    bool FoldTailByMasking)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1587

Parameters

unsigned int ConstTripCount
unsigned int SmallestType
unsigned int WidestType
llvm::ElementCount MaxSafeVF
bool FoldTailByMasking

Returns

The maximized element count based on the target's vector registers and the loop trip count, but limited to a maximum safe VF. This is a helper function of computeFeasibleMaxVF.
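
A much simplified version of this kind of bound can be sketched as below, assuming a fixed-width target and ignoring scalable vectors, tail folding and the smallest type. The function names and parameters (`powerOf2Floor`, `maximizedFixedVF`, `WidestRegisterBits`) are hypothetical and do not reproduce the actual implementation.

```cpp
#include <algorithm>
#include <cassert>

// Largest power of two that is <= X (X must be non-zero).
unsigned powerOf2Floor(unsigned X) {
  assert(X != 0 && "undefined for zero");
  unsigned P = 1;
  while (P <= X / 2)
    P *= 2;
  return P;
}

// Simplified upper bound on a fixed VF. All parameters are hypothetical
// stand-ins: register and type widths are in bits, and ConstTripCount == 0
// means the trip count is unknown.
unsigned maximizedFixedVF(unsigned WidestRegisterBits, unsigned WidestTypeBits,
                          unsigned MaxSafeVF, unsigned ConstTripCount) {
  assert(WidestTypeBits != 0 && WidestTypeBits <= WidestRegisterBits);
  assert(MaxSafeVF != 0);
  unsigned VF = powerOf2Floor(WidestRegisterBits / WidestTypeBits);
  // Never exceed the maximum safe VF.
  VF = std::min(VF, powerOf2Floor(MaxSafeVF));
  // A known, small trip count caps the useful VF as well.
  if (ConstTripCount != 0 && ConstTripCount < VF)
    VF = powerOf2Floor(ConstTripCount);
  return VF;
}
```

For instance, 128-bit registers with a widest type of 32 bits give a bound of 4 in this sketch, which drops to 2 if the loop is known to run only 3 times.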

¶llvm::InstructionCost getMemInstScalarizationCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

The cost computation for scalarized memory instruction.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1634

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶llvm::InstructionCost getMemoryInstructionCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

Calculate vectorization cost of memory instruction \p I.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1631

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶const MapVector<llvm::Instruction*, uint64_t>&
getMinimalBitwidths() const

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1233

Returns

The smallest bitwidth each instruction can be represented with. The vector equivalents of these instructions should be truncated to this type.

¶Optional<llvm::InstructionCost>
getReductionPatternCost(
    llvm::Instruction* I,
    llvm::ElementCount VF,
    llvm::Type* VectorTy,
    TTI::TargetCostKind CostKind)

Description

Return the cost of instructions in an inloop reduction pattern, if I is part of that pattern.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1627

Parameters

llvm::Instruction* I
llvm::ElementCount VF
llvm::Type* VectorTy
TTI::TargetCostKind CostKind

¶llvm::InstructionCost getScalarizationOverhead(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Description

Estimate the overhead of scalarizing an instruction. This is a convenience wrapper for the type-based getScalarizationOverhead API.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1654

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶std::pair<unsigned int, unsigned int>
getSmallestAndWidestTypes()

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1179

Returns

The size (in bits) of the smallest and widest types in the code that needs to be vectorized. Values that remain scalar, such as 64-bit loop indices, are ignored.

¶llvm::InstructionCost getUniformMemOpCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

The cost calculation for load/store instruction \p I with a uniform pointer. Load: scalar load + broadcast. Store: scalar store + (loop-invariant value stored ? 0 : extract of last element).

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1650

Parameters

llvm::Instruction* I
llvm::ElementCount VF
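
The load and store formulas above can be written out as a small function. This is a sketch under stated assumptions: the unit costs are passed in as plain integers (in LLVM they would come from TargetTransformInfo queries), address computation cost is ignored, and `UniformMemOpCosts` and `uniformMemOpCost` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical unit costs supplied by the caller.
struct UniformMemOpCosts {
  int ScalarMemOp; // one scalar load or store
  int Broadcast;   // splat a scalar into all vector lanes
  int ExtractLast; // extract the last vector element
};

// Load:  scalar load + broadcast.
// Store: scalar store + (stored value is loop invariant ? 0 : extract).
int uniformMemOpCost(const UniformMemOpCosts &C, bool IsLoad,
                     bool StoredValueIsLoopInvariant) {
  assert(C.ScalarMemOp >= 0 && C.Broadcast >= 0 && C.ExtractLast >= 0);
  if (IsLoad)
    return C.ScalarMemOp + C.Broadcast;
  return C.ScalarMemOp + (StoredValueIsLoopInvariant ? 0 : C.ExtractLast);
}
```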

¶Optional<unsigned int> getVScaleForTuning() const

Description

Convenience function that returns the value of vscale_range if and only if vscale_range.min == vscale_range.max; otherwise it returns the value returned by the corresponding TTI method.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1570
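
The selection logic described here can be sketched with std::optional standing in for LLVM's Optional. `VScaleRange` and `vscaleForTuning` are hypothetical names, and the target's fallback value is passed in as a parameter instead of being queried from the target.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a function's vscale_range attribute.
struct VScaleRange {
  unsigned Min;
  std::optional<unsigned> Max; // no value means unbounded
};

// Returns the exact vscale when the range pins it down; otherwise defers
// to a target-provided tuning value (which may itself be absent).
std::optional<unsigned>
vscaleForTuning(const std::optional<VScaleRange> &Range,
                std::optional<unsigned> TargetTuningValue) {
  assert(!Range || !Range->Max || Range->Min <= *Range->Max);
  if (Range && Range->Max && Range->Min == *Range->Max)
    return Range->Max;
  return TargetTuningValue;
}
```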

¶llvm::InstructionCost getVectorCallCost(
    llvm::CallInst* CI,
    llvm::ElementCount VF,
    bool& NeedToScalarize) const

Description

Estimate cost of a call instruction CI if it were vectorized with factor VF. Return the cost of the instruction, including scalarization overhead if it's needed. The flag NeedToScalarize shows if the call needs to be scalarized - i.e. either vector version isn't available, or is too expensive.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1552

Parameters

llvm::CallInst* CI
llvm::ElementCount VF
bool& NeedToScalarize

¶llvm::InstructionCost getVectorIntrinsicCost(
    llvm::CallInst* CI,
    llvm::ElementCount VF) const

Description

Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF. Return the cost of the instruction, including scalarization overhead if it's needed.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1545

Parameters

llvm::CallInst* CI
llvm::ElementCount VF

¶llvm::InstructionCost getWideningCost(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

Return the vectorization cost for the given instruction \p I and vector width \p VF.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1349

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶llvm::LoopVectorizationCostModel::InstWidening
getWideningDecision(llvm::Instruction* I,
                    llvm::ElementCount VF) const

Description

Return the cost model decision for the given instruction \p I and vector width \p VF. Return CM_Unknown if this instruction did not pass through the cost modeling.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1333

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool interleavedAccessCanBeWidened(
    llvm::Instruction* I,
    llvm::ElementCount VF =
        ElementCount::getFixed(1))

Description

Returns true if \p I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1477

Parameters

llvm::Instruction* I
llvm::ElementCount VF = ElementCount::getFixed(1)

¶void invalidateCostModelingDecisions()

Description

Invalidates decisions already taken by the cost model.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1561

¶bool isAccessInterleaved(llvm::Instruction* Instr)

Description

Check if \p Instr belongs to any interleaved access group.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1481

Parameters

llvm::Instruction* Instr

¶bool isCandidateForEpilogueVectorization(
    const llvm::Loop& L,
    const llvm::ElementCount VF) const

Description

Determines if we have the infrastructure to vectorize loop \p L and its epilogue, assuming the main loop is vectorized by \p VF.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1779

Parameters

const llvm::Loop& L
const llvm::ElementCount VF

¶bool isEpilogueVectorizationProfitable(
    const llvm::ElementCount VF) const

Description

Returns true if epilogue vectorization is considered profitable, and false otherwise. \p VF is the vectorization factor chosen for the original loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1785

Parameters

const llvm::ElementCount VF

¶bool isInLoopReduction(llvm::PHINode* Phi) const

Description

Returns true if the Phi is part of an inloop reduction.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1538

Parameters

llvm::PHINode* Phi

¶bool isLegalGatherOrScatter(
    llvm::Value* V,
    llvm::ElementCount VF =
        ElementCount::getFixed(1))

Description

Returns true if the target machine can represent \p V as a masked gather or scatter operation.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1415

Parameters

llvm::Value* V
llvm::ElementCount VF = ElementCount::getFixed(1)

¶bool isLegalMaskedLoad(
    llvm::Type* DataType,
    llvm::Value* Ptr,
    llvm::Align Alignment) const

Description

Returns true if the target machine supports masked load operation for the given \p DataType and kind of access to \p Ptr.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1408

Parameters

llvm::Type* DataType
llvm::Value* Ptr
llvm::Align Alignment

¶bool isLegalMaskedStore(
    llvm::Type* DataType,
    llvm::Value* Ptr,
    llvm::Align Alignment) const

Description

Returns true if the target machine supports masked store operation for the given \p DataType and kind of access to \p Ptr.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1401

Parameters

llvm::Type* DataType
llvm::Value* Ptr
llvm::Align Alignment

¶bool isMoreProfitable(
    const llvm::VectorizationFactor& A,
    const llvm::VectorizationFactor& B) const

Description

Returns true if the per-lane cost of VectorizationFactor A is lower than that of B.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1557

Parameters

const llvm::VectorizationFactor& A
const llvm::VectorizationFactor& B
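
A per-lane cost comparison of this kind can be done without division by cross-multiplying, as in the sketch below. It assumes fixed widths and plain integer costs; `Candidate` and `isCheaperPerLane` are hypothetical names and not part of the class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical (cost, width) pair; Width is the number of lanes.
struct Candidate {
  std::uint64_t Cost;
  std::uint64_t Width;
};

// True if A's per-lane cost (Cost / Width) is strictly lower than B's.
// Cross-multiplying avoids integer-division rounding.
bool isCheaperPerLane(const Candidate &A, const Candidate &B) {
  assert(A.Width != 0 && B.Width != 0);
  return A.Cost * B.Width < B.Cost * A.Width;
}
```

With integer division, costs of 7 over 4 lanes and 3 over 2 lanes both round down to 1 per lane. The cross-multiplied form still ranks the second candidate as cheaper (1.5 versus 1.75 per lane).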

¶bool isOptimizableIVTruncate(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

Returns true if instruction \p I is an optimizable truncate whose operand is an induction variable. Such a truncate will be removed by adding a new induction variable with the destination type.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1360

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool isPredicatedInst(llvm::Instruction* I,
                      llvm::ElementCount VF)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1448

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool isProfitableToScalarize(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1239

Parameters

llvm::Instruction* I
llvm::ElementCount VF

Returns

True if it is more profitable to scalarize instruction \p I for vectorization factor \p VF.

¶bool isScalarAfterVectorization(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Description

Returns true if \p I is known to be scalar after vectorization.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1271

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool isScalarEpilogueAllowed() const

Description

Returns true if a scalar epilogue is allowed, i.e. it has not been disallowed due to optsize or a loop hint annotation.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1505

¶bool isScalarWithPredication(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Description

Returns true if \p I is an instruction that will be scalarized with predication when vectorizing \p I with vectorization factor \p VF. Such instructions include conditional stores and instructions that may divide by zero.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1442

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool isUniformAfterVectorization(
    llvm::Instruction* I,
    llvm::ElementCount VF) const

Description

Returns true if \p I is known to be uniform after vectorization.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1255

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool memoryInstructionCanBeWidened(
    llvm::Instruction* I,
    llvm::ElementCount VF =
        ElementCount::getFixed(1))

Description

Returns true if \p I is a memory instruction with consecutive memory access that can be widened.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1470

Parameters

llvm::Instruction* I
llvm::ElementCount VF = ElementCount::getFixed(1)

¶bool needsExtract(llvm::Value* V,
                  llvm::ElementCount VF) const

Description

Returns true if \p V is expected to be vectorized and it needs to be extracted.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1754

Parameters

llvm::Value* V
llvm::ElementCount VF

¶bool requiresScalarEpilogue(
    llvm::ElementCount VF) const

Description

Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1493

Parameters

llvm::ElementCount VF
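
The effect of a required scalar epilogue on the iteration split can be sketched as follows. This is a simplified model under stated assumptions: a fixed VF, a trip count known up front, and a hypothetical helper `splitTripCount`. It is not taken from the implementation.

```cpp
#include <cassert>
#include <cstdint>

struct TripSplit {
  uint64_t VectorIters; // iterations of the vector loop body
  uint64_t ScalarIters; // original iterations left to the scalar epilogue
};

// Hypothetical helper: split TripCount original iterations for a fixed VF.
TripSplit splitTripCount(uint64_t TripCount, uint64_t VF,
                         bool RequiresScalarEpilogue) {
  assert(VF > 0 && "vectorization factor must be non-zero");
  uint64_t Remainder = TripCount % VF;
  // When an epilogue is mandatory and VF divides the trip count evenly,
  // hand one full vector's worth of iterations to the scalar loop so the
  // final original iteration still runs there.
  if (RequiresScalarEpilogue && Remainder == 0 && TripCount >= VF)
    Remainder = VF;
  return {(TripCount - Remainder) / VF, Remainder};
}
```

With a trip count of 16 and a VF of 4, the split is 4 vector iterations and no scalar iterations when the epilogue is optional, and 3 vector iterations plus 4 scalar iterations when it is required.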

¶bool runtimeChecksRequired()

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1155

Returns

True if runtime checks are required for vectorization, and false otherwise.

¶llvm::VectorizationFactor
selectEpilogueVectorizationFactor(
    const llvm::ElementCount MaxVF,
    const llvm::LoopVectorizationPlanner& LVP)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1165

Parameters

const llvm::ElementCount MaxVF
const llvm::LoopVectorizationPlanner& LVP

¶unsigned int selectInterleaveCount(
    llvm::ElementCount VF,
    unsigned int LoopCost)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1185

Parameters

llvm::ElementCount VF
unsigned int LoopCost

Returns

The desired interleave count. If an interleave count has been specified by metadata, that value is returned. Otherwise, the interleave count is computed and returned. \p VF and \p LoopCost are the selected vectorization factor and the cost of the selected VF.

¶bool selectUserVectorizationFactor(
    llvm::ElementCount UserVF)

Description

Set up cost-based decisions for the user vectorization factor.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1170

Parameters

llvm::ElementCount UserVF

Returns

True if the UserVF is a feasible VF to be chosen.

¶llvm::VectorizationFactor
selectVectorizationFactor(
    const llvm::ElementCountSet& CandidateVFs)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1162

Parameters

const llvm::ElementCountSet& CandidateVFs

Returns

The most profitable vectorization factor and the cost of that VF. This method checks every VF in \p CandidateVFs. If UserVF is not ZERO then this vectorization factor will be selected if vectorization is possible.
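
A minimal sketch of what "most profitable" could mean when comparing candidates is shown below. It assumes fixed-width candidates and compares estimated cost per lane; `Candidate`, `isCheaperPerLane`, and `pickBest` are hypothetical names, and the real selection may weigh further factors.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::VectorizationFactor with a fixed width.
struct Candidate {
  unsigned Width; // lanes per vector iteration
  uint64_t Cost;  // estimated cost of one vector iteration
};

// A beats B when its cost per lane is lower. Cross-multiplying avoids
// integer division: CostA / WidthA < CostB / WidthB.
bool isCheaperPerLane(const Candidate &A, const Candidate &B) {
  return A.Cost * B.Width < B.Cost * A.Width;
}

// Start from the scalar loop and switch only when a candidate is
// strictly cheaper per lane than the best seen so far.
Candidate pickBest(const Candidate &Scalar,
                   const std::vector<Candidate> &VFs) {
  assert(Scalar.Width == 1 && "baseline must be the scalar loop");
  Candidate Best = Scalar;
  for (const Candidate &C : VFs)
    if (isCheaperPerLane(C, Best))
      Best = C;
  return Best;
}
```

Starting from the scalar candidate means the result has width 1 whenever no vector candidate is cheaper per lane, which matches the observation that vectorization is not always profitable.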

¶void setCostBasedWideningDecision(
    llvm::ElementCount VF)

Description

A memory access instruction may be vectorized in more than one way, and the form of the instruction after vectorization depends on cost. This function makes cost-based decisions for Load/Store instructions and collects them in a map. The decision map is used for building the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved together with the widening decision in order to avoid redundant calculations.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1194

Parameters

llvm::ElementCount VF
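
The "decide once, cache decision and cost" pattern described above can be sketched as follows. Integer ids stand in for `llvm::Instruction*` and `llvm::ElementCount`, and the `Widening` enumerators, `Decision`, and `decide` are hypothetical names chosen for the example rather than the class's actual types.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical subset of widening strategies for a load or store.
enum class Widening { Widen, Interleave, GatherScatter, Scalarize };

struct Decision {
  Widening Kind;
  uint64_t Cost;
};

// Key: (instruction id, fixed VF).
using DecisionMap = std::map<std::pair<int, unsigned>, Decision>;

// Pick the cheapest legal strategy and remember it with its cost, so a
// later query for the same (instruction, VF) recomputes nothing.
const Decision &decide(DecisionMap &Cache, int InstId, unsigned VF,
                       const std::map<Widening, uint64_t> &Costs) {
  auto Key = std::make_pair(InstId, VF);
  auto It = Cache.find(Key);
  if (It != Cache.end())
    return It->second;
  assert(!Costs.empty() && "need at least one legal strategy");
  auto Best = Costs.begin();
  for (auto C = Costs.begin(); C != Costs.end(); ++C)
    if (C->second < Best->second)
      Best = C;
  return Cache.emplace(Key, Decision{Best->first, Best->second})
      .first->second;
}
```

Keying the map on the pair keeps decisions for different vectorization factors independent, since the cheapest strategy at one VF need not be the cheapest at another.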

¶void setWideningDecision(
    llvm::Instruction* I,
    llvm::ElementCount VF,
    llvm::LoopVectorizationCostModel::InstWidening
        W,
    llvm::InstructionCost Cost)

Description

Save vectorization decision \p W and \p Cost taken by the cost model for instruction \p I and vector width \p VF.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1306

Parameters

llvm::Instruction* I
llvm::ElementCount VF
llvm::LoopVectorizationCostModel::InstWidening W
llvm::InstructionCost Cost

¶void setWideningDecision(
    const InterleaveGroup<llvm::Instruction>* Grp,
    llvm::ElementCount VF,
    llvm::LoopVectorizationCostModel::InstWidening
        W,
    llvm::InstructionCost Cost)

Description

Save vectorization decision \p W and \p Cost taken by the cost model for interleaving group \p Grp and vector width \p VF.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1314

Parameters

const InterleaveGroup<llvm::Instruction>* Grp
llvm::ElementCount VF
llvm::LoopVectorizationCostModel::InstWidening W
llvm::InstructionCost Cost
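
When a decision applies to a whole interleave group, the bookkeeping has to avoid counting the group cost once per member. The sketch below shows one plausible scheme: every member records the same decision, but only one designated member carries the cost. This is an assumption about how such a map could be maintained, not something the description above states, and `GroupSketch`, `CostCarrier`, and `recordGroupDecision` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

enum class Widening { Widen, Interleave, GatherScatter, Scalarize };

struct Decision {
  Widening Kind;
  uint64_t Cost;
};

// Key: (instruction id, fixed VF).
using DecisionMap = std::map<std::pair<int, unsigned>, Decision>;

// Hypothetical group: member instruction ids plus the id that is
// charged the cost of the whole group.
struct GroupSketch {
  std::vector<int> Members;
  int CostCarrier;
};

void recordGroupDecision(DecisionMap &Map, const GroupSketch &Grp,
                         unsigned VF, Widening W, uint64_t Cost) {
  assert(VF >= 2 && "expected a vector VF");
  for (int Id : Grp.Members) {
    // Every member shares W; only the carrier is charged the group cost.
    uint64_t MemberCost = (Id == Grp.CostCarrier) ? Cost : 0;
    Map[{Id, VF}] = Decision{W, MemberCost};
  }
}
```

Summing the recorded costs over all members then yields the group cost exactly once.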

¶bool useActiveLaneMaskForControlFlow() const

Description

Returns true if we're tail-folding and want to use the active lane mask for vector loop control flow.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1514
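
To make the idea of an active lane mask driving the loop concrete, the sketch below models a tail-folded loop with a fixed, hypothetical `VF` of 4. The mask both guards each lane and decides whether another vector iteration runs; the function names are illustrative and the scalar lane loops stand in for masked vector operations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Lane L is active when Base + L is still inside the trip count.
std::array<bool, VF> activeLaneMask(std::size_t Base,
                                    std::size_t TripCount) {
  std::array<bool, VF> Mask{};
  for (std::size_t L = 0; L < VF; ++L)
    Mask[L] = Base + L < TripCount;
  return Mask;
}

// Tail-folded loop with no scalar remainder loop. The mask guards each
// lane, and the loop continues while the first lane of the mask is active.
void addOneTailFolded(std::vector<int> &Data) {
  std::size_t Base = 0;
  std::array<bool, VF> Mask = activeLaneMask(Base, Data.size());
  while (Mask[0]) {
    for (std::size_t L = 0; L < VF; ++L) {
      if (Mask[L]) {
        assert(Base + L < Data.size()); // guaranteed by the mask
        ++Data[Base + L];
      }
    }
    Base += VF;
    Mask = activeLaneMask(Base, Data.size());
  }
}
```

For six elements the loop runs twice: once with all four lanes active and once with only the first two, so no element past the end is touched.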

¶bool useEmulatedMaskMemRefHack(
    llvm::Instruction* I,
    llvm::ElementCount VF)

Description

Returns true if an artificially high cost for emulated masked memrefs should be used.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1659

Parameters

llvm::Instruction* I
llvm::ElementCount VF

¶bool useOrderedReductions(
    const llvm::RecurrenceDescriptor& RdxDesc)
    const

Description

Returns true if we should use strict in-order reductions for the given RdxDesc. This is true if the -enable-strict-reductions flag is passed, the IsOrdered flag of RdxDesc is set and we do not allow reordering of FP operations.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1226

Parameters

const llvm::RecurrenceDescriptor& RdxDesc
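
The sketch below restates the three conditions from the description as a plain predicate and shows why reordering matters for floating-point reductions. The parameter and function names are hypothetical stand-ins for the flag, the `RdxDesc` property, and the reordering permission; the two-lane sum models what a VF = 2 vector loop with per-lane partial sums would compute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Follows the three conditions in the description above.
bool shouldUseOrderedReduction(bool StrictFlagPassed, bool IsOrdered,
                               bool ReorderingAllowed) {
  return StrictFlagPassed && IsOrdered && !ReorderingAllowed;
}

// Strict in-order sum: one accumulator, source order preserved.
double sumInOrder(const std::vector<double> &V) {
  double Acc = 0.0;
  for (double X : V)
    Acc += X;
  return Acc;
}

// Reordered sum: one partial sum per lane, combined after the loop.
double sumTwoLanes(const std::vector<double> &V) {
  assert(V.size() % 2 == 0 && "sketch handles even sizes only");
  double Lane0 = 0.0, Lane1 = 0.0;
  for (std::size_t I = 0; I < V.size(); I += 2) {
    Lane0 += V[I];
    Lane1 += V[I + 1];
  }
  return Lane0 + Lane1;
}
```

For the input `{1e100, 1.0, -1e100, 1.0}` under IEEE double arithmetic, `sumInOrder` yields 1.0 while `sumTwoLanes` yields 2.0, because the small terms are absorbed at different points. An ordered reduction keeps the first behaviour.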
