class LoopVectorizationCostModel
Declaration
class LoopVectorizationCostModel { /* full declaration omitted */ };
Description
LoopVectorizationCostModel estimates the expected speedup from vectorization. In many cases vectorization is not profitable, and this can happen for a number of reasons. This class mainly attempts to predict the expected speedup or slowdown due to the supported instruction set. It uses TargetTransformInfo to query the different backends for the cost of different operations.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1133
Member Variables
- private unsigned int NumPredStores = 0
- private MapVector<llvm::Instruction*, uint64_t> MinBWs
- Map of scalar integer values to the smallest bitwidth they can be legally represented as. The vector equivalents of these values should be truncated to this type.
- private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::BasicBlock*, 4>> PredicatedBBsAfterVectorization
- A set containing all BasicBlocks that are known to be present after vectorization as a predicated block.
- private llvm::ScalarEpilogueLowering ScalarEpilogueStatus = CM_ScalarEpilogueAllowed
- Records whether it is allowed to have the original scalar loop execute at least once. This may be needed as a fallback loop in case runtime aliasing/dependence checks fail, or to handle the tail/remainder iterations when the trip count is unknown or doesn't divide by the VF, or as a peel-loop to handle gaps in interleave-groups. Under optsize and when the trip count is very small we don't allow any iterations to execute in the scalar loop.
- private bool FoldTailByMasking = false
- All blocks of the loop are to be masked to fold the tail of scalar iterations.
- private DenseMap<llvm::ElementCount, llvm::LoopVectorizationCostModel::ScalarCostsTy> InstsToScalarize
- A map holding scalar costs for different vectorization factors. The presence of a cost for an instruction in the mapping indicates that the instruction will be scalarized when vectorizing with the associated vectorization factor. The entries are VF-ScalarCostsTy pairs.
- private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> Uniforms
- Holds the instructions known to be uniform after vectorization. The data is collected per VF.
- private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> Scalars
- Holds the instructions known to be scalar after vectorization. The data is collected per VF.
- private DenseMap<llvm::ElementCount, SmallPtrSet<llvm::Instruction*, 4>> ForcedScalars
- Holds the instructions (address computations) that are forced to be scalarized.
- private llvm::LoopVectorizationCostModel::ReductionChainMap InLoopReductionChains
- PHINodes of the reductions that should be expanded in-loop along with their associated chains of reduction operations, in program order from top (PHI) to bottom.
- private DenseMap<llvm::Instruction*, llvm::Instruction*> InLoopReductionImmediateChains
- A map of in-loop reduction operations and their immediate chain operand. FIXME: This can be removed once reductions can be costed correctly in VPlan. This was added to allow quick lookup of the in-loop operations, without having to loop through InLoopReductionChains.
- private llvm::LoopVectorizationCostModel::DecisionList WideningDecisions
- public llvm::Loop* TheLoop
- The loop that we evaluate.
- public llvm::PredicatedScalarEvolution& PSE
- Predicated scalar evolution analysis.
- public llvm::LoopInfo* LI
- Loop Info analysis.
- public llvm::LoopVectorizationLegality* Legal
- Vectorization legality.
- public const llvm::TargetTransformInfo& TTI
- Vector target information.
- public const llvm::TargetLibraryInfo* TLI
- Target Library Info.
- public llvm::DemandedBits* DB
- Demanded bits analysis.
- public llvm::AssumptionCache* AC
- Assumption cache.
- public llvm::OptimizationRemarkEmitter* ORE
- Interface to emit optimization remarks.
- public const llvm::Function* TheFunction
- public const llvm::LoopVectorizeHints* Hints
- Loop Vectorize Hint.
- public llvm::InterleavedAccessInfo& InterleaveInfo
- The interleave access information contains groups of interleaved accesses with the same stride and close to each other.
- public SmallPtrSet<const llvm::Value*, 16> ValuesToIgnore
- Values to ignore in the cost model.
- public SmallPtrSet<const llvm::Value*, 16> VecValuesToIgnore
- Values to ignore in the cost model when VF > 1.
- public SmallPtrSet<llvm::Type*, 16> ElementTypesInLoop
- All element types found in the loop.
- public SmallVector<llvm::VectorizationFactor, 8> ProfitableVFs
- Profitable vector factors.
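The role of the scalar epilogue described for ScalarEpilogueStatus can be illustrated with a small model of how a known trip count is divided between the vector loop and the scalar remainder loop. This is only a sketch: `IterationSplit` and `splitIterations` are hypothetical names that do not exist in LLVM, and the rule that a required epilogue takes one full vector step when the trip count divides evenly is an assumption of this model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch; not part of LLVM.
struct IterationSplit {
  std::uint64_t VectorIterations; // iterations executed by the vector loop
  std::uint64_t ScalarIterations; // iterations left for the scalar epilogue
};

// Splits TripCount original iterations between a vector loop that covers VF
// scalar iterations per step and a scalar epilogue that covers the remainder.
// When EpilogueRequired is set (for example to handle gaps in interleave
// groups) and the trip count divides evenly, one full vector step is handed
// back to the scalar loop so that it runs at least once (assumption).
inline IterationSplit splitIterations(std::uint64_t TripCount, unsigned VF,
                                      bool EpilogueRequired) {
  assert(VF > 0 && "vectorization factor must be positive");
  std::uint64_t Remainder = TripCount % VF;
  if (EpilogueRequired && Remainder == 0 && TripCount >= VF)
    Remainder = VF;
  return {(TripCount - Remainder) / VF, Remainder};
}
```

With a trip count of 10 and VF = 4, the model leaves two iterations for the scalar loop; with a trip count of 8 and a required epilogue, it leaves four.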
Method Overview
- public LoopVectorizationCostModel(llvm::ScalarEpilogueLowering SEL, llvm::Loop * L, llvm::PredicatedScalarEvolution & PSE, llvm::LoopInfo * LI, llvm::LoopVectorizationLegality * Legal, const llvm::TargetTransformInfo & TTI, const llvm::TargetLibraryInfo * TLI, llvm::DemandedBits * DB, llvm::AssumptionCache * AC, llvm::OptimizationRemarkEmitter * ORE, const llvm::Function * F, const llvm::LoopVectorizeHints * Hints, llvm::InterleavedAccessInfo & IAI)
- public bool blockNeedsPredicationForAnyReason(llvm::BasicBlock * BB) const
- public SmallVector<llvm::LoopVectorizationCostModel::RegisterUsage, 8> calculateRegisterUsage(ArrayRef<llvm::ElementCount> VFs)
- public bool canTruncateToMinimalBitwidth(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool canVectorizeReductions(llvm::ElementCount VF) const
- public void collectElementTypesForWidening()
- public void collectInLoopReductions()
- public void collectInstsToScalarize(llvm::ElementCount VF)
- private void collectLoopScalars(llvm::ElementCount VF)
- private void collectLoopUniforms(llvm::ElementCount VF)
- public void collectUniformsAndScalars(llvm::ElementCount VF)
- public void collectValuesToIgnore()
- private llvm::FixedScalableVFPair computeFeasibleMaxVF(unsigned int ConstTripCount, llvm::ElementCount UserVF, bool FoldTailByMasking)
- public llvm::FixedScalableVFPair computeMaxVF(llvm::ElementCount UserVF, unsigned int UserIC)
- private int computePredInstDiscount(llvm::Instruction * PredInst, llvm::LoopVectorizationCostModel::ScalarCostsTy & ScalarCosts, llvm::ElementCount VF)
- private llvm::LoopVectorizationCostModel::VectorizationCostTy expectedCost(llvm::ElementCount VF, SmallVectorImpl<llvm::LoopVectorizationCostModel::InstructionVFPair> * Invalid = nullptr)
- private SmallVector<llvm::Value *, 4> filterExtractingOperands(Instruction::op_range Ops, llvm::ElementCount VF) const
- public bool foldTailByMasking() const
- private llvm::InstructionCost getConsecutiveMemOpCost(llvm::Instruction * I, llvm::ElementCount VF)
- private llvm::InstructionCost getGatherScatterCost(llvm::Instruction * I, llvm::ElementCount VF)
- public const llvm::LoopVectorizationCostModel::ReductionChainMap & getInLoopReductionChains() const
- private llvm::LoopVectorizationCostModel::VectorizationCostTy getInstructionCost(llvm::Instruction * I, llvm::ElementCount VF)
- private llvm::InstructionCost getInstructionCost(llvm::Instruction * I, llvm::ElementCount VF, llvm::Type *& VectorTy)
- private llvm::InstructionCost getInterleaveGroupCost(llvm::Instruction * I, llvm::ElementCount VF)
- public const InterleaveGroup<llvm::Instruction> * getInterleavedAccessGroup(llvm::Instruction * Instr)
- private llvm::ElementCount getMaxLegalScalableVF(unsigned int MaxSafeElements)
- private llvm::ElementCount getMaximizedVFForTarget(unsigned int ConstTripCount, unsigned int SmallestType, unsigned int WidestType, llvm::ElementCount MaxSafeVF, bool FoldTailByMasking)
- private llvm::InstructionCost getMemInstScalarizationCost(llvm::Instruction * I, llvm::ElementCount VF)
- private llvm::InstructionCost getMemoryInstructionCost(llvm::Instruction * I, llvm::ElementCount VF)
- public const MapVector<llvm::Instruction *, uint64_t> & getMinimalBitwidths() const
- private Optional<llvm::InstructionCost> getReductionPatternCost(llvm::Instruction * I, llvm::ElementCount VF, llvm::Type * VectorTy, TTI::TargetCostKind CostKind)
- private llvm::InstructionCost getScalarizationOverhead(llvm::Instruction * I, llvm::ElementCount VF) const
- public std::pair<unsigned int, unsigned int> getSmallestAndWidestTypes()
- private llvm::InstructionCost getUniformMemOpCost(llvm::Instruction * I, llvm::ElementCount VF)
- public Optional<unsigned int> getVScaleForTuning() const
- public llvm::InstructionCost getVectorCallCost(llvm::CallInst * CI, llvm::ElementCount VF, bool & NeedToScalarize) const
- public llvm::InstructionCost getVectorIntrinsicCost(llvm::CallInst * CI, llvm::ElementCount VF) const
- public llvm::InstructionCost getWideningCost(llvm::Instruction * I, llvm::ElementCount VF)
- public llvm::LoopVectorizationCostModel::InstWidening getWideningDecision(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool interleavedAccessCanBeWidened(llvm::Instruction * I, llvm::ElementCount VF = ElementCount::getFixed(1))
- public void invalidateCostModelingDecisions()
- public bool isAccessInterleaved(llvm::Instruction * Instr)
- private bool isCandidateForEpilogueVectorization(const llvm::Loop & L, const llvm::ElementCount VF) const
- private bool isEpilogueVectorizationProfitable(const llvm::ElementCount VF) const
- public bool isInLoopReduction(llvm::PHINode * Phi) const
- public bool isLegalGatherOrScatter(llvm::Value * V, llvm::ElementCount VF = ElementCount::getFixed(1))
- public bool isLegalMaskedLoad(llvm::Type * DataType, llvm::Value * Ptr, llvm::Align Alignment) const
- public bool isLegalMaskedStore(llvm::Type * DataType, llvm::Value * Ptr, llvm::Align Alignment) const
- public bool isMoreProfitable(const llvm::VectorizationFactor & A, const llvm::VectorizationFactor & B) const
- public bool isOptimizableIVTruncate(llvm::Instruction * I, llvm::ElementCount VF)
- public bool isPredicatedInst(llvm::Instruction * I, llvm::ElementCount VF)
- public bool isProfitableToScalarize(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool isScalarAfterVectorization(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool isScalarEpilogueAllowed() const
- public bool isScalarWithPredication(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool isUniformAfterVectorization(llvm::Instruction * I, llvm::ElementCount VF) const
- public bool memoryInstructionCanBeWidened(llvm::Instruction * I, llvm::ElementCount VF = ElementCount::getFixed(1))
- private bool needsExtract(llvm::Value * V, llvm::ElementCount VF) const
- public bool requiresScalarEpilogue(llvm::ElementCount VF) const
- public bool runtimeChecksRequired()
- public llvm::VectorizationFactor selectEpilogueVectorizationFactor(const llvm::ElementCount MaxVF, const llvm::LoopVectorizationPlanner & LVP)
- public unsigned int selectInterleaveCount(llvm::ElementCount VF, unsigned int LoopCost)
- public bool selectUserVectorizationFactor(llvm::ElementCount UserVF)
- public llvm::VectorizationFactor selectVectorizationFactor(const llvm::ElementCountSet & CandidateVFs)
- public void setCostBasedWideningDecision(llvm::ElementCount VF)
- public void setWideningDecision(llvm::Instruction * I, llvm::ElementCount VF, llvm::LoopVectorizationCostModel::InstWidening W, llvm::InstructionCost Cost)
- public void setWideningDecision(const InterleaveGroup<llvm::Instruction> * Grp, llvm::ElementCount VF, llvm::LoopVectorizationCostModel::InstWidening W, llvm::InstructionCost Cost)
- public bool useActiveLaneMaskForControlFlow() const
- private bool useEmulatedMaskMemRefHack(llvm::Instruction * I, llvm::ElementCount VF)
- public bool useOrderedReductions(const llvm::RecurrenceDescriptor & RdxDesc) const
Methods
¶LoopVectorizationCostModel(
llvm::ScalarEpilogueLowering SEL,
llvm::Loop* L,
llvm::PredicatedScalarEvolution& PSE,
llvm::LoopInfo* LI,
llvm::LoopVectorizationLegality* Legal,
const llvm::TargetTransformInfo& TTI,
const llvm::TargetLibraryInfo* TLI,
llvm::DemandedBits* DB,
llvm::AssumptionCache* AC,
llvm::OptimizationRemarkEmitter* ORE,
const llvm::Function* F,
const llvm::LoopVectorizeHints* Hints,
llvm::InterleavedAccessInfo& IAI)
LoopVectorizationCostModel(
llvm::ScalarEpilogueLowering SEL,
llvm::Loop* L,
llvm::PredicatedScalarEvolution& PSE,
llvm::LoopInfo* LI,
llvm::LoopVectorizationLegality* Legal,
const llvm::TargetTransformInfo& TTI,
const llvm::TargetLibraryInfo* TLI,
llvm::DemandedBits* DB,
llvm::AssumptionCache* AC,
llvm::OptimizationRemarkEmitter* ORE,
const llvm::Function* F,
const llvm::LoopVectorizeHints* Hints,
llvm::InterleavedAccessInfo& IAI)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1135
Parameters
- llvm::ScalarEpilogueLowering SEL
- llvm::Loop* L
- llvm::PredicatedScalarEvolution& PSE
- llvm::LoopInfo* LI
- llvm::LoopVectorizationLegality* Legal
- const llvm::TargetTransformInfo& TTI
- const llvm::TargetLibraryInfo* TLI
- llvm::DemandedBits* DB
- llvm::AssumptionCache* AC
- llvm::OptimizationRemarkEmitter* ORE
- const llvm::Function* F
- const llvm::LoopVectorizeHints* Hints
- llvm::InterleavedAccessInfo& IAI
¶bool blockNeedsPredicationForAnyReason(
llvm::BasicBlock* BB) const
bool blockNeedsPredicationForAnyReason(
llvm::BasicBlock* BB) const
Description
Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1522
Parameters
- llvm::BasicBlock* BB
¶SmallVector<llvm::LoopVectorizationCostModel::
RegisterUsage,
8>
calculateRegisterUsage(
ArrayRef<llvm::ElementCount> VFs)
SmallVector<llvm::LoopVectorizationCostModel::
RegisterUsage,
8>
calculateRegisterUsage(
ArrayRef<llvm::ElementCount> VFs)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1210
Parameters
- ArrayRef<llvm::ElementCount> VFs
Returns
Returns information about the register usages of the loop for the given vectorization factors.
¶bool canTruncateToMinimalBitwidth(
llvm::Instruction* I,
llvm::ElementCount VF) const
bool canTruncateToMinimalBitwidth(
llvm::Instruction* I,
llvm::ElementCount VF) const
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1288
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
Returns
True if instruction \p I can be truncated to a smaller bitwidth for vectorization factor \p VF.
¶bool canVectorizeReductions(
llvm::ElementCount VF) const
bool canVectorizeReductions(
llvm::ElementCount VF) const
Description
Returns true if the target machine supports all of the reduction variables found for the given VF.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1431
Parameters
- llvm::ElementCount VF
¶void collectElementTypesForWidening()
void collectElementTypesForWidening()
Description
Collect all element types in the loop for which widening is needed.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1216
¶void collectInLoopReductions()
void collectInLoopReductions()
Description
Split reductions into those that happen in the loop, and those that happen outside. In-loop reductions are collected into InLoopReductionChains.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1220
¶void collectInstsToScalarize(
llvm::ElementCount VF)
void collectInstsToScalarize(
llvm::ElementCount VF)
Description
Collects the instructions to scalarize for each predicated instruction in the loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1385
Parameters
- llvm::ElementCount VF
¶void collectLoopScalars(llvm::ElementCount VF)
void collectLoopScalars(llvm::ElementCount VF)
Description
Collect the instructions that are scalar after vectorization. An instruction is scalar if it is known to be uniform or will be scalarized during vectorization. collectLoopScalars should only add non-uniform nodes to the list if they are used by a load/store instruction that is marked as CM_Scalarize. Non-uniform scalarized instructions will be represented by VF values in the vectorized loop, each corresponding to an iteration of the original scalar loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1743
Parameters
- llvm::ElementCount VF
¶void collectLoopUniforms(llvm::ElementCount VF)
void collectLoopUniforms(llvm::ElementCount VF)
Description
Collect the instructions that are uniform after vectorization. An instruction is uniform if we represent it with a single scalar value in the vectorized loop corresponding to each vector iteration. Examples of uniform instructions include pointer operands of consecutive or interleaved memory accesses. Note that although uniformity implies an instruction will be scalar, the reverse is not true. In general, a scalarized instruction will be represented by VF scalar values in the vectorized loop, each corresponding to an iteration of the original scalar loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1734
Parameters
- llvm::ElementCount VF
¶void collectUniformsAndScalars(
llvm::ElementCount VF)
void collectUniformsAndScalars(
llvm::ElementCount VF)
Description
Collect uniform and scalar values for the given \p VF. The sets depend on the cost model's decisions for load/store instructions, which may be vectorized as interleaved accesses, as gather/scatter operations, or scalarized.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1390
Parameters
- llvm::ElementCount VF
¶void collectValuesToIgnore()
void collectValuesToIgnore()
Description
Collect values we want to ignore in the cost model.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1213
¶llvm::FixedScalableVFPair computeFeasibleMaxVF(
unsigned int ConstTripCount,
llvm::ElementCount UserVF,
bool FoldTailByMasking)
llvm::FixedScalableVFPair computeFeasibleMaxVF(
unsigned int ConstTripCount,
llvm::ElementCount UserVF,
bool FoldTailByMasking)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1580
Parameters
- unsigned int ConstTripCount
- llvm::ElementCount UserVF
- bool FoldTailByMasking
Returns
An upper bound for the vectorization factors for both fixed and scalable vectorization, where the minimum-known number of elements is a power-of-2 larger than zero. If scalable vectorization is disabled or unsupported, then the scalable part will be equal to ElementCount::getScalable(0).
¶llvm::FixedScalableVFPair computeMaxVF(
llvm::ElementCount UserVF,
unsigned int UserIC)
llvm::FixedScalableVFPair computeMaxVF(
llvm::ElementCount UserVF,
unsigned int UserIC)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1151
Parameters
- llvm::ElementCount UserVF
- unsigned int UserIC
Returns
An upper bound for the vectorization factors (both fixed and scalable). If the factors are 0, vectorization and interleaving should be avoided up front.
¶int computePredInstDiscount(
llvm::Instruction* PredInst,
llvm::LoopVectorizationCostModel::
ScalarCostsTy& ScalarCosts,
llvm::ElementCount VF)
int computePredInstDiscount(
llvm::Instruction* PredInst,
llvm::LoopVectorizationCostModel::
ScalarCostsTy& ScalarCosts,
llvm::ElementCount VF)
Description
Returns the expected difference in cost from scalarizing the expression feeding a predicated instruction \p PredInst. The instructions to scalarize and their scalar costs are collected in \p ScalarCosts. A non-negative return value implies the expression will be scalarized. Currently, only single-use chains are considered for scalarization.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1722
Parameters
- llvm::Instruction* PredInst
- llvm::LoopVectorizationCostModel::ScalarCostsTy& ScalarCosts
- llvm::ElementCount VF
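The sign convention of the discount can be made concrete with a toy calculation. The sketch below assumes each instruction in the chain has a single vector cost and a per-lane scalar cost, and it ignores refinements such as block-execution probability. `ChainEntryCost`, `predInstDiscount`, and `shouldScalarizeChain` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-instruction cost pair for this sketch; not an LLVM type.
struct ChainEntryCost {
  int VectorCost;        // cost if the instruction stays vectorized
  int ScalarCostPerLane; // cost of one scalar copy of the instruction
};

// Discount = sum over the chain of (vector cost - total scalarized cost).
// A scalarized instruction is modeled as VF scalar copies (assumption).
inline int predInstDiscount(const std::vector<ChainEntryCost> &Chain,
                            unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  int Discount = 0;
  for (const ChainEntryCost &E : Chain)
    Discount += E.VectorCost - E.ScalarCostPerLane * static_cast<int>(VF);
  return Discount;
}

// A non-negative discount means scalarizing the chain is no worse than
// keeping it vectorized, so the chain would be scalarized.
inline bool shouldScalarizeChain(const std::vector<ChainEntryCost> &Chain,
                                 unsigned VF) {
  return predInstDiscount(Chain, VF) >= 0;
}
```

For a chain with costs (10, 2) and (6, 1) at VF = 4, the discount is (10 - 8) + (6 - 4) = 4, so this model would scalarize the chain.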
¶llvm::LoopVectorizationCostModel::
VectorizationCostTy
expectedCost(
llvm::ElementCount VF,
SmallVectorImpl<
llvm::LoopVectorizationCostModel::
InstructionVFPair>* Invalid =
nullptr)
llvm::LoopVectorizationCostModel::
VectorizationCostTy
expectedCost(
llvm::ElementCount VF,
SmallVectorImpl<
llvm::LoopVectorizationCostModel::
InstructionVFPair>* Invalid =
nullptr)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1612
Parameters
- llvm::ElementCount VF
- SmallVectorImpl<llvm::LoopVectorizationCostModel::InstructionVFPair>* Invalid = nullptr
¶SmallVector<llvm::Value*, 4>
filterExtractingOperands(
Instruction::op_range Ops,
llvm::ElementCount VF) const
SmallVector<llvm::Value*, 4>
filterExtractingOperands(
Instruction::op_range Ops,
llvm::ElementCount VF) const
Description
Returns a range containing only operands needing to be extracted.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1771
Parameters
- Instruction::op_range Ops
- llvm::ElementCount VF
¶bool foldTailByMasking() const
bool foldTailByMasking() const
Description
Returns true if all loop blocks should be masked to fold the tail of the loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1510
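To show what folding the tail by masking means for the iteration space, the following sketch computes the number of vector iterations and the per-lane mask for one of them. It is a simplified model for fixed-width vectors and a known trip count; `foldedVectorIterations` and `laneMask` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// With the tail folded into the vector loop, the loop runs
// ceil(TripCount / VF) vector iterations and no scalar remainder loop.
inline std::uint64_t foldedVectorIterations(std::uint64_t TripCount,
                                            unsigned VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return (TripCount + VF - 1) / VF;
}

// Mask for one vector iteration: a lane is active only if the scalar
// iteration it stands for is still inside the original trip count.
inline std::vector<bool> laneMask(std::uint64_t TripCount, unsigned VF,
                                  std::uint64_t VectorIteration) {
  assert(VF > 0 && "vectorization factor must be positive");
  std::vector<bool> Mask(VF);
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    Mask[Lane] = VectorIteration * VF + Lane < TripCount;
  return Mask;
}
```

For a trip count of 10 and VF = 4 the model runs three vector iterations, and only the first two lanes of the last one are active.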
¶llvm::InstructionCost getConsecutiveMemOpCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getConsecutiveMemOpCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
The cost computation for widening instruction \p I with consecutive memory access.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1644
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶llvm::InstructionCost getGatherScatterCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getGatherScatterCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
The cost computation for a gather/scatter instruction.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1640
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶const llvm::LoopVectorizationCostModel::
ReductionChainMap&
getInLoopReductionChains() const
const llvm::LoopVectorizationCostModel::
ReductionChainMap&
getInLoopReductionChains() const
Description
Returns the chain of instructions representing an in-loop reduction.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1533
¶llvm::LoopVectorizationCostModel::
VectorizationCostTy
getInstructionCost(llvm::Instruction* I,
llvm::ElementCount VF)
llvm::LoopVectorizationCostModel::
VectorizationCostTy
getInstructionCost(llvm::Instruction* I,
llvm::ElementCount VF)
Description
Returns the execution time cost of an instruction for a given vector width. Vector width of one means scalar.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1617
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶llvm::InstructionCost getInstructionCost(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::Type*& VectorTy)
llvm::InstructionCost getInstructionCost(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::Type*& VectorTy)
Description
The cost-computation logic from getInstructionCost which provides the vector type as an output parameter.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1621
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
- llvm::Type*& VectorTy
¶llvm::InstructionCost getInterleaveGroupCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getInterleaveGroupCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
The cost computation for an interleaved group of memory instructions.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1637
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶const InterleaveGroup<llvm::Instruction>*
getInterleavedAccessGroup(
llvm::Instruction* Instr)
const InterleaveGroup<llvm::Instruction>*
getInterleavedAccessGroup(
llvm::Instruction* Instr)
Description
Get the interleaved access group that \p Instr belongs to.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1487
Parameters
- llvm::Instruction* Instr
¶llvm::ElementCount getMaxLegalScalableVF(
unsigned int MaxSafeElements)
llvm::ElementCount getMaxLegalScalableVF(
unsigned int MaxSafeElements)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1595
Parameters
- unsigned int MaxSafeElements
Returns
The maximum legal scalable VF, based on the safe maximum number of elements.
¶llvm::ElementCount getMaximizedVFForTarget(
unsigned int ConstTripCount,
unsigned int SmallestType,
unsigned int WidestType,
llvm::ElementCount MaxSafeVF,
bool FoldTailByMasking)
llvm::ElementCount getMaximizedVFForTarget(
unsigned int ConstTripCount,
unsigned int SmallestType,
unsigned int WidestType,
llvm::ElementCount MaxSafeVF,
bool FoldTailByMasking)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1587
Parameters
- unsigned int ConstTripCount
- unsigned int SmallestType
- unsigned int WidestType
- llvm::ElementCount MaxSafeVF
- bool FoldTailByMasking
Returns
The maximized element count based on the target's vector registers and the loop trip count, but limited to a maximum safe VF. This is a helper function of computeFeasibleMaxVF.
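The interplay of register width, element width, trip count, and the safe maximum can be sketched for the fixed-width case as follows. This is a deliberately reduced model: it ignores SmallestType and FoldTailByMasking, takes the register width as an explicit `RegisterBits` argument (an assumption, since the real method queries the target), and uses the hypothetical names `powerOf2Floor` and `maximizedFixedVF`.

```cpp
#include <algorithm>
#include <cassert>

// Largest power of two that is <= X, or 0 when X is 0.
inline unsigned powerOf2Floor(unsigned X) {
  if (X == 0)
    return 0;
  unsigned P = 1;
  while (P <= X / 2)
    P *= 2;
  return P;
}

// Simplified model: as many WidestType-bit elements as fit in one register,
// rounded down to a power of two, then limited by the maximum safe VF and by
// a known constant trip count (0 means the trip count is unknown).
inline unsigned maximizedFixedVF(unsigned ConstTripCount, unsigned WidestType,
                                 unsigned RegisterBits, unsigned MaxSafeVF) {
  assert(WidestType > 0 && MaxSafeVF > 0 && "widths must be positive");
  unsigned VF = powerOf2Floor(RegisterBits / WidestType);
  if (VF == 0)
    VF = 1; // the widest type does not fit in a register: stay scalar
  VF = std::min(VF, powerOf2Floor(MaxSafeVF));
  if (ConstTripCount != 0 && ConstTripCount < VF)
    VF = powerOf2Floor(ConstTripCount);
  return VF;
}
```

With 128-bit registers and 32-bit elements the model yields VF = 4; a constant trip count of 3 with 8-bit elements is clamped to VF = 2.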
¶llvm::InstructionCost getMemInstScalarizationCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getMemInstScalarizationCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
The cost computation for a scalarized memory instruction.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1634
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶llvm::InstructionCost getMemoryInstructionCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getMemoryInstructionCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
Calculate vectorization cost of memory instruction \p I.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1631
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶const MapVector<llvm::Instruction*, uint64_t>&
getMinimalBitwidths() const
const MapVector<llvm::Instruction*, uint64_t>&
getMinimalBitwidths() const
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1233
Returns
The smallest bitwidth each instruction can be represented with. The vector equivalents of these instructions should be truncated to this type.
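As a rough illustration of what a minimal bitwidth is, the sketch below derives a width from a mask of the bits that are actually needed from a 64-bit value. The rounding to a power of two and the lower bound of 8 bits are assumptions of this sketch, and `minimalBitwidth` is a hypothetical helper rather than the analysis LLVM runs.

```cpp
#include <cassert>
#include <cstdint>

// Returns the smallest power-of-two bit width, at least 8, that can hold
// every set bit of DemandedMask (both bounds are assumptions of this sketch).
inline unsigned minimalBitwidth(std::uint64_t DemandedMask) {
  // Count how many low-order bits are needed to cover the highest set bit.
  unsigned Bits = 0;
  while (DemandedMask != 0) {
    ++Bits;
    DemandedMask >>= 1;
  }
  assert(Bits <= 64 && "a 64-bit mask cannot need more than 64 bits");
  unsigned Width = 8;
  while (Width < Bits)
    Width *= 2;
  return Width;
}
```

A value whose demanded bits fit in the mask 0xFF could be carried in 8-bit vector lanes under this model, while 0x1FF needs 16-bit lanes.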
¶Optional<llvm::InstructionCost>
getReductionPatternCost(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::Type* VectorTy,
TTI::TargetCostKind CostKind)
Optional<llvm::InstructionCost>
getReductionPatternCost(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::Type* VectorTy,
TTI::TargetCostKind CostKind)
Description
Returns the cost of instructions in an in-loop reduction pattern, if \p I is part of that pattern.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1627
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
- llvm::Type* VectorTy
- TTI::TargetCostKind CostKind
¶llvm::InstructionCost getScalarizationOverhead(
llvm::Instruction* I,
llvm::ElementCount VF) const
llvm::InstructionCost getScalarizationOverhead(
llvm::Instruction* I,
llvm::ElementCount VF) const
Description
Estimate the overhead of scalarizing an instruction. This is a convenience wrapper for the type-based getScalarizationOverhead API.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1654
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶std::pair<unsigned int, unsigned int>
getSmallestAndWidestTypes()
std::pair<unsigned int, unsigned int>
getSmallestAndWidestTypes()
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1179
Returns
The size (in bits) of the smallest and widest types in the code that needs to be vectorized. We ignore values that remain scalar, such as 64-bit loop indices.
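The scan described here amounts to a minimum/maximum over element widths that skips values known to stay scalar. The sketch below assumes a flat list of values with a precomputed flag. `LoopValue` and `smallestAndWidest` are hypothetical names, and the starting values for an empty list are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical description of one value in the loop; not an LLVM type.
struct LoopValue {
  unsigned Bits;    // size of the value's type in bits
  bool StaysScalar; // true for values that remain scalar, e.g. loop indices
};

// Returns {smallest, widest} type size in bits over all values that will be
// vectorized. With no such values the result is {UINT_MAX, 8} (assumption).
inline std::pair<unsigned, unsigned>
smallestAndWidest(const std::vector<LoopValue> &Values) {
  unsigned Smallest = std::numeric_limits<unsigned>::max();
  unsigned Widest = 8;
  for (const LoopValue &V : Values) {
    assert(V.Bits > 0 && "type sizes must be positive");
    if (V.StaysScalar)
      continue; // ignored, like a 64-bit induction variable
    Smallest = std::min(Smallest, V.Bits);
    Widest = std::max(Widest, V.Bits);
  }
  return {Smallest, Widest};
}
```

A loop that converts 8-bit data to 32-bit data while indexing with a 64-bit scalar counter would give {8, 32} in this model.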
¶llvm::InstructionCost getUniformMemOpCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getUniformMemOpCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
The cost calculation for load/store instruction \p I with a uniform pointer. Load: scalar load + broadcast. Store: scalar store + (0 if the stored value is loop invariant, otherwise the cost of extracting the last element).
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1650
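The two formulas in the description can be written out directly. In the sketch below the individual costs are plain integers supplied by the caller, whereas the real method obtains them from the target. `UniformMemCosts`, `uniformLoadCost`, and `uniformStoreCost` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical bundle of target costs for this sketch; not an LLVM type.
struct UniformMemCosts {
  int ScalarLoad;  // cost of one scalar load
  int Broadcast;   // cost of splatting a scalar into all vector lanes
  int ScalarStore; // cost of one scalar store
  int ExtractLast; // cost of extracting the last vector element
};

// Load from a uniform pointer: one scalar load, then a broadcast.
inline int uniformLoadCost(const UniformMemCosts &C) {
  assert(C.ScalarLoad >= 0 && C.Broadcast >= 0 && "costs are non-negative");
  return C.ScalarLoad + C.Broadcast;
}

// Store to a uniform pointer: one scalar store, plus an extract of the last
// element unless the stored value is loop invariant.
inline int uniformStoreCost(const UniformMemCosts &C,
                            bool StoredValueIsLoopInvariant) {
  assert(C.ScalarStore >= 0 && C.ExtractLast >= 0 && "costs are non-negative");
  return C.ScalarStore + (StoredValueIsLoopInvariant ? 0 : C.ExtractLast);
}
```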
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶Optional<unsigned int> getVScaleForTuning() const
Optional<unsigned int> getVScaleForTuning() const
Description
Convenience function that returns the value of vscale_range if and only if vscale_range.min == vscale_range.max; otherwise it returns the value returned by the corresponding TTI method.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1570
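The selection rule reads naturally as a small function over optional values. The sketch below uses `std::optional` in place of LLVM's `Optional`, models the function attribute as a hypothetical `VScaleRange` struct, and takes the target-provided fallback as a parameter. Treating a maximum of 0 as "unbounded" is an assumption of this sketch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model of a vscale_range attribute; Max == 0 is treated as
// "no upper bound" (assumption of this sketch).
struct VScaleRange {
  unsigned Min;
  unsigned Max;
};

// Returns the exact vscale when the range pins it to a single value,
// otherwise whatever the target suggests for tuning (possibly nothing).
inline std::optional<unsigned>
vscaleForTuning(const std::optional<VScaleRange> &Range,
                std::optional<unsigned> TargetFallback) {
  if (Range) {
    assert((Range->Max == 0 || Range->Min <= Range->Max) &&
           "malformed vscale range");
    if (Range->Max != 0 && Range->Min == Range->Max)
      return Range->Max;
  }
  return TargetFallback;
}
```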
¶llvm::InstructionCost getVectorCallCost(
llvm::CallInst* CI,
llvm::ElementCount VF,
bool& NeedToScalarize) const
llvm::InstructionCost getVectorCallCost(
llvm::CallInst* CI,
llvm::ElementCount VF,
bool& NeedToScalarize) const
Description
Estimate cost of a call instruction CI if it were vectorized with factor VF. Return the cost of the instruction, including scalarization overhead if it's needed. The flag NeedToScalarize shows if the call needs to be scalarized - i.e. either vector version isn't available, or is too expensive.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1552
Parameters
- llvm::CallInst* CI
- llvm::ElementCount VF
- bool& NeedToScalarize
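The relationship between the returned cost and the NeedToScalarize flag can be sketched with plain integers. The model below assumes the scalarized cost is VF scalar calls plus a fixed scalarization overhead, and that a vector variant may or may not exist. `CallCosts` and `vectorCallCost` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <optional>

// Hypothetical cost inputs for one call site; not an LLVM type.
struct CallCosts {
  int ScalarCall;                // cost of one scalar call
  int ScalarizationOverhead;     // cost of unpacking/packing vector operands
  std::optional<int> VectorCall; // cost of a vector variant, if one exists
};

// Returns the cheaper of the scalarized call and the vector variant, and
// reports through NeedToScalarize which of the two was chosen.
inline int vectorCallCost(const CallCosts &C, unsigned VF,
                          bool &NeedToScalarize) {
  assert(VF > 0 && "vectorization factor must be positive");
  int ScalarizedCost =
      C.ScalarCall * static_cast<int>(VF) + C.ScalarizationOverhead;
  NeedToScalarize = true;
  if (C.VectorCall && *C.VectorCall < ScalarizedCost) {
    NeedToScalarize = false;
    return *C.VectorCall;
  }
  return ScalarizedCost;
}
```

The flag stays true both when no vector variant is available and when the variant is not cheaper than scalarizing, which matches the two cases named in the description.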
¶llvm::InstructionCost getVectorIntrinsicCost(
llvm::CallInst* CI,
llvm::ElementCount VF) const
llvm::InstructionCost getVectorIntrinsicCost(
llvm::CallInst* CI,
llvm::ElementCount VF) const
Description
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF. Return the cost of the instruction, including scalarization overhead if it's needed.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1545
Parameters
- llvm::CallInst* CI
- llvm::ElementCount VF
¶llvm::InstructionCost getWideningCost(
llvm::Instruction* I,
llvm::ElementCount VF)
llvm::InstructionCost getWideningCost(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
Return the vectorization cost for the given instruction \p I and vector width \p VF.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1349
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶llvm::LoopVectorizationCostModel::InstWidening
getWideningDecision(llvm::Instruction* I,
llvm::ElementCount VF) const
llvm::LoopVectorizationCostModel::InstWidening
getWideningDecision(llvm::Instruction* I,
llvm::ElementCount VF) const
Description
Return the cost model decision for the given instruction \p I and vector width \p VF. Return CM_Unknown if this instruction did not pass through the cost modeling.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1333
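The lookup behaviour, including the CM_Unknown result for instructions that never went through cost modeling, can be modeled with a map keyed by an (instruction, VF) pair. In the sketch below instructions are represented by integer ids and costs by plain integers. The class name `WideningDecisionTable` is hypothetical, and of the enumerators only CM_Unknown and CM_Scalarize are named in this document; the others are assumed for illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Decision kinds for this sketch. Only CM_Unknown and CM_Scalarize are named
// in the surrounding text; the remaining enumerators are assumptions.
enum class InstWidening {
  CM_Unknown,
  CM_Widen,
  CM_Interleave,
  CM_GatherScatter,
  CM_Scalarize
};

// Hypothetical stand-in for the per-(instruction, VF) decision storage.
class WideningDecisionTable {
  using Key = std::pair<int, unsigned>; // (instruction id, VF)
  std::map<Key, std::pair<InstWidening, int>> Decisions;

public:
  // Records the decision and its cost for one instruction at one VF.
  void set(int InstId, unsigned VF, InstWidening W, int Cost) {
    assert(VF > 0 && "vectorization factor must be positive");
    Decisions[std::make_pair(InstId, VF)] = std::make_pair(W, Cost);
  }

  // Returns the recorded decision, or CM_Unknown if none was recorded.
  InstWidening get(int InstId, unsigned VF) const {
    auto It = Decisions.find(std::make_pair(InstId, VF));
    return It == Decisions.end() ? InstWidening::CM_Unknown : It->second.first;
  }
};
```

Because the key includes the VF, a decision recorded for VF = 4 says nothing about the same instruction at VF = 8.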
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶bool interleavedAccessCanBeWidened(
llvm::Instruction* I,
llvm::ElementCount VF =
ElementCount::getFixed(1))
bool interleavedAccessCanBeWidened(
llvm::Instruction* I,
llvm::ElementCount VF =
ElementCount::getFixed(1))
Description
Returns true if \p I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1477
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF = ElementCount::getFixed(1)
¶void invalidateCostModelingDecisions()
void invalidateCostModelingDecisions()
Description
Invalidates decisions already taken by the cost model.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1561
¶bool isAccessInterleaved(llvm::Instruction* Instr)
bool isAccessInterleaved(llvm::Instruction* Instr)
Description
Check if \p Instr belongs to any interleaved access group.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1481
Parameters
- llvm::Instruction* Instr
¶bool isCandidateForEpilogueVectorization(
const llvm::Loop& L,
const llvm::ElementCount VF) const
bool isCandidateForEpilogueVectorization(
const llvm::Loop& L,
const llvm::ElementCount VF) const
Description
Determines if we have the infrastructure to vectorize loop \p L and its epilogue, assuming the main loop is vectorized by \p VF.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1779
Parameters
- const llvm::Loop& L
- const llvm::ElementCount VF
¶bool isEpilogueVectorizationProfitable(
const llvm::ElementCount VF) const
bool isEpilogueVectorizationProfitable(
const llvm::ElementCount VF) const
Description
Returns true if epilogue vectorization is considered profitable, and false otherwise. \p VF is the vectorization factor chosen for the original loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1785
Parameters
- const llvm::ElementCount VF
¶bool isInLoopReduction(llvm::PHINode* Phi) const
bool isInLoopReduction(llvm::PHINode* Phi) const
Description
Returns true if \p Phi is part of an in-loop reduction.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1538
Parameters
- llvm::PHINode* Phi
¶bool isLegalGatherOrScatter(
llvm::Value* V,
llvm::ElementCount VF =
ElementCount::getFixed(1))
bool isLegalGatherOrScatter(
llvm::Value* V,
llvm::ElementCount VF =
ElementCount::getFixed(1))
Description
Returns true if the target machine can represent \p V as a masked gather or scatter operation.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1415
Parameters
- llvm::Value* V
- llvm::ElementCount VF = ElementCount::getFixed(1)
¶bool isLegalMaskedLoad(
llvm::Type* DataType,
llvm::Value* Ptr,
llvm::Align Alignment) const
bool isLegalMaskedLoad(
llvm::Type* DataType,
llvm::Value* Ptr,
llvm::Align Alignment) const
Description
Returns true if the target machine supports masked load operation for the given \p DataType and kind of access to \p Ptr.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1408
Parameters
- llvm::Type* DataType
- llvm::Value* Ptr
- llvm::Align Alignment
¶bool isLegalMaskedStore(
llvm::Type* DataType,
llvm::Value* Ptr,
llvm::Align Alignment) const
bool isLegalMaskedStore(
llvm::Type* DataType,
llvm::Value* Ptr,
llvm::Align Alignment) const
Description
Returns true if the target machine supports masked store operation for the given \p DataType and kind of access to \p Ptr.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1401
Parameters
- llvm::Type* DataType
- llvm::Value* Ptr
- llvm::Align Alignment
¶bool isMoreProfitable(
const llvm::VectorizationFactor& A,
const llvm::VectorizationFactor& B) const
bool isMoreProfitable(
const llvm::VectorizationFactor& A,
const llvm::VectorizationFactor& B) const
Description
Returns true if the per-lane cost of VectorizationFactor A is lower than that of B.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1557
Parameters
- const llvm::VectorizationFactor& A
- const llvm::VectorizationFactor& B
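A per-lane cost comparison like the one described above can be sketched without division by cross-multiplying. This is only an illustration: `CandidateVF` and `isMoreProfitableSketch` are hypothetical names, the sketch assumes fixed (non-scalable) widths, and it is not the actual LLVM implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::VectorizationFactor (fixed-width lanes only).
struct CandidateVF {
  std::uint64_t Width; // number of lanes, assumed >= 1
  std::uint64_t Cost;  // estimated cost of one vector iteration
};

// True if A's cost per lane is strictly lower than B's.
//   CostA / WidthA < CostB / WidthB   <=>   CostA * WidthB < CostB * WidthA
// Cross-multiplying avoids the rounding that integer division would introduce.
bool isMoreProfitableSketch(const CandidateVF &A, const CandidateVF &B) {
  return A.Cost * B.Width < B.Cost * A.Width;
}
```

With these assumptions, a 4-lane candidate costing 10 (2.5 per lane) beats a scalar candidate costing 4, while candidates with equal per-lane cost compare as not more profitable in either direction.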
¶bool isOptimizableIVTruncate(
llvm::Instruction* I,
llvm::ElementCount VF)
bool isOptimizableIVTruncate(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
Returns true if instruction \p I is an optimizable truncate whose operand is an induction variable. Such a truncate will be removed by adding a new induction variable with the destination type.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1360
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
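The idea behind removing a truncate of an induction variable can be shown at the source level: instead of truncating the wide counter on every iteration, a second counter of the destination type is stepped alongside it. The two functions below are hypothetical illustrations of that equivalence, not code taken from LLVM.

```cpp
#include <cstdint>
#include <vector>

// Original form: the 64-bit induction variable is truncated each iteration.
std::vector<std::uint8_t> withTruncate(std::uint64_t N) {
  std::vector<std::uint8_t> Out;
  for (std::uint64_t I = 0; I < N; ++I)
    Out.push_back(static_cast<std::uint8_t>(I));
  return Out;
}

// Rewritten form: a second induction variable of the destination type
// replaces the truncate. Unsigned wrap-around (modulo 256) matches the
// result of truncating the wide counter.
std::vector<std::uint8_t> withNarrowIV(std::uint64_t N) {
  std::vector<std::uint8_t> Out;
  std::uint8_t Narrow = 0;
  for (std::uint64_t I = 0; I < N; ++I) {
    Out.push_back(Narrow);
    ++Narrow;
  }
  return Out;
}
```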
¶bool isPredicatedInst(llvm::Instruction* I,
llvm::ElementCount VF)
bool isPredicatedInst(llvm::Instruction* I,
llvm::ElementCount VF)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1448
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶bool isProfitableToScalarize(
llvm::Instruction* I,
llvm::ElementCount VF) const
bool isProfitableToScalarize(
llvm::Instruction* I,
llvm::ElementCount VF) const
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1239
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
Returns
True if it is more profitable to scalarize instruction \p I for vectorization factor \p VF.
¶bool isScalarAfterVectorization(
llvm::Instruction* I,
llvm::ElementCount VF) const
bool isScalarAfterVectorization(
llvm::Instruction* I,
llvm::ElementCount VF) const
Description
Returns true if \p I is known to be scalar after vectorization.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1271
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶bool isScalarEpilogueAllowed() const
bool isScalarEpilogueAllowed() const
Description
Returns true if a scalar epilogue is allowed, that is, it has not been disallowed due to optsize or a loop hint annotation.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1505
¶bool isScalarWithPredication(
llvm::Instruction* I,
llvm::ElementCount VF) const
bool isScalarWithPredication(
llvm::Instruction* I,
llvm::ElementCount VF) const
Description
Returns true if \p I is an instruction that will be scalarized with predication when vectorizing \p I with vectorization factor \p VF. Such instructions include conditional stores and instructions that may divide by zero.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1442
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
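To illustrate what "scalarized with predication" means for an instruction that may divide by zero, the sketch below performs the division one lane at a time and only for lanes whose mask bit is set. The lane count `VF`, the function name, and the choice to leave inactive lanes at zero are all assumptions made for this example.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Each lane is handled as a separate scalar operation guarded by its mask
// bit, so a zero divisor in an inactive lane is never evaluated.
std::array<int, VF> predicatedDiv(const std::array<int, VF> &Num,
                                  const std::array<int, VF> &Den,
                                  const std::array<bool, VF> &Mask) {
  std::array<int, VF> Out{}; // inactive lanes stay 0 in this sketch
  for (std::size_t L = 0; L < VF; ++L)
    if (Mask[L])
      Out[L] = Num[L] / Den[L];
  return Out;
}
```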
¶bool isUniformAfterVectorization(
llvm::Instruction* I,
llvm::ElementCount VF) const
bool isUniformAfterVectorization(
llvm::Instruction* I,
llvm::ElementCount VF) const
Description
Returns true if \p I is known to be uniform after vectorization.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1255
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶bool memoryInstructionCanBeWidened(
llvm::Instruction* I,
llvm::ElementCount VF =
ElementCount::getFixed(1))
bool memoryInstructionCanBeWidened(
llvm::Instruction* I,
llvm::ElementCount VF =
ElementCount::getFixed(1))
Description
Returns true if \p I is a memory instruction with consecutive memory access that can be widened.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1470
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF = ElementCount::getFixed(1)
¶bool needsExtract(llvm::Value* V,
llvm::ElementCount VF) const
bool needsExtract(llvm::Value* V,
llvm::ElementCount VF) const
Description
Returns true if \p V is expected to be vectorized and it needs to be extracted.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1754
Parameters
- llvm::Value* V
- llvm::ElementCount VF
¶bool requiresScalarEpilogue(
llvm::ElementCount VF) const
bool requiresScalarEpilogue(
llvm::ElementCount VF) const
Description
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1493
Parameters
- llvm::ElementCount VF
¶bool runtimeChecksRequired()
bool runtimeChecksRequired()
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1155
Returns
True if runtime checks are required for vectorization, and false otherwise.
¶llvm::VectorizationFactor
selectEpilogueVectorizationFactor(
const llvm::ElementCount MaxVF,
const llvm::LoopVectorizationPlanner& LVP)
llvm::VectorizationFactor
selectEpilogueVectorizationFactor(
const llvm::ElementCount MaxVF,
const llvm::LoopVectorizationPlanner& LVP)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1165
Parameters
- const llvm::ElementCount MaxVF
- const llvm::LoopVectorizationPlanner& LVP
¶unsigned int selectInterleaveCount(
llvm::ElementCount VF,
unsigned int LoopCost)
unsigned int selectInterleaveCount(
llvm::ElementCount VF,
unsigned int LoopCost)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1185
Parameters
- llvm::ElementCount VF
- unsigned int LoopCost
Returns
The desired interleave count. If interleave count has been specified by metadata it will be returned. Otherwise, the interleave count is computed and returned. VF and LoopCost are the selected vectorization factor and the cost of the selected VF.
¶bool selectUserVectorizationFactor(
llvm::ElementCount UserVF)
bool selectUserVectorizationFactor(
llvm::ElementCount UserVF)
Description
Set up cost-based decisions for the user vectorization factor.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1170
Parameters
- llvm::ElementCount UserVF
Returns
true if the UserVF is a feasible VF to be chosen.
¶llvm::VectorizationFactor
selectVectorizationFactor(
const llvm::ElementCountSet& CandidateVFs)
llvm::VectorizationFactor
selectVectorizationFactor(
const llvm::ElementCountSet& CandidateVFs)
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1162
Parameters
- const llvm::ElementCountSet& CandidateVFs
Returns
The most profitable vectorization factor and the cost of that VF. This method checks every VF in \p CandidateVFs. If UserVF is not ZERO then this vectorization factor will be selected if vectorization is possible.
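The "check every candidate and keep the most profitable" step can be sketched as a simple loop. Everything here is a simplification under stated assumptions: `Choice`, `selectBestSketch`, and the `ExpectedCost` callback are hypothetical, widths are fixed, the scalar loop (width 1) is the baseline, and the UserVF override mentioned above is not modelled.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical result type: the chosen width and its estimated cost.
struct Choice {
  std::uint64_t Width;
  std::uint64_t Cost;
};

// Starts from the scalar loop and replaces the current best whenever a
// candidate has a strictly lower cost per lane (compared by
// cross-multiplication to avoid integer division).
Choice selectBestSketch(
    const std::vector<std::uint64_t> &CandidateWidths,
    const std::function<std::uint64_t(std::uint64_t)> &ExpectedCost) {
  Choice Best{1, ExpectedCost(1)};
  for (std::uint64_t W : CandidateWidths) {
    if (W <= 1)
      continue; // the scalar baseline is already accounted for
    Choice C{W, ExpectedCost(W)};
    if (C.Cost * Best.Width < Best.Cost * C.Width)
      Best = C;
  }
  return Best;
}
```

Because the comparison is strict, a wider candidate that merely ties the current best on per-lane cost does not replace it in this sketch.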
¶void setCostBasedWideningDecision(
llvm::ElementCount VF)
void setCostBasedWideningDecision(
llvm::ElementCount VF)
Description
A memory access instruction may be vectorized in more than one way, and the form of the instruction after vectorization depends on cost. This function makes cost-based decisions for load and store instructions and collects them in a map. The decisions map is used to build the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved with the widening decision to avoid redundant calculations.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1194
Parameters
- llvm::ElementCount VF
¶void setWideningDecision(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::LoopVectorizationCostModel::InstWidening
W,
llvm::InstructionCost Cost)
void setWideningDecision(
llvm::Instruction* I,
llvm::ElementCount VF,
llvm::LoopVectorizationCostModel::InstWidening
W,
llvm::InstructionCost Cost)
Description
Save vectorization decision \p W and \p Cost taken by the cost model for instruction \p I and vector width \p VF.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1306
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
- llvm::LoopVectorizationCostModel::InstWidening W
- llvm::InstructionCost Cost
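One way to picture the bookkeeping is a table keyed by the (instruction, VF) pair that stores the decision together with its cost, so that neither has to be recomputed later. The sketch below is an assumption-laden illustration: instructions are represented by integer ids rather than `llvm::Instruction*`, `Widening` is a hypothetical enum rather than LLVM's `InstWidening`, and costs are plain integers.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical set of widening strategies for this sketch only.
enum class Widening { Unknown, Widen, Interleave, GatherScatter, Scalarize };

class WideningDecisions {
  using Key = std::pair<unsigned, unsigned>; // (instruction id, VF)
  std::map<Key, std::pair<Widening, std::uint64_t>> Table;

public:
  // Record the decision and its cost for one instruction at one VF.
  void set(unsigned InstrId, unsigned VF, Widening W, std::uint64_t Cost) {
    Table[Key(InstrId, VF)] = std::make_pair(W, Cost);
  }

  // Look up a recorded decision; Unknown means nothing was recorded.
  Widening decision(unsigned InstrId, unsigned VF) const {
    auto It = Table.find(Key(InstrId, VF));
    return It == Table.end() ? Widening::Unknown : It->second.first;
  }

  // Look up the saved cost; 0 if nothing was recorded.
  std::uint64_t cost(unsigned InstrId, unsigned VF) const {
    auto It = Table.find(Key(InstrId, VF));
    return It == Table.end() ? 0 : It->second.second;
  }
};
```

Keying on the pair means the same instruction can carry different decisions for different vectorization factors.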
¶void setWideningDecision(
const InterleaveGroup<llvm::Instruction>* Grp,
llvm::ElementCount VF,
llvm::LoopVectorizationCostModel::InstWidening
W,
llvm::InstructionCost Cost)
void setWideningDecision(
const InterleaveGroup<llvm::Instruction>* Grp,
llvm::ElementCount VF,
llvm::LoopVectorizationCostModel::InstWidening
W,
llvm::InstructionCost Cost)
Description
Save vectorization decision \p W and \p Cost taken by the cost model for interleaving group \p Grp and vector width \p VF.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1314
Parameters
- const InterleaveGroup<llvm::Instruction>* Grp
- llvm::ElementCount VF
- llvm::LoopVectorizationCostModel::InstWidening W
- llvm::InstructionCost Cost
¶bool useActiveLaneMaskForControlFlow() const
bool useActiveLaneMaskForControlFlow() const
Description
Returns true if we're tail-folding and want to use the active lane mask for vector loop control flow.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1514
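As a rough picture of a tail-folded loop whose control flow is driven by an active lane mask, the sketch below computes a per-lane mask (lane L is active when `Base + L < TripCount`) and uses it both to guard the per-lane work and to decide whether to run another vector iteration. The names `activeLaneMask` and `incrementAll`, and the fixed `VF` of 4, are assumptions for this example; it is a scalar emulation, not generated vector code.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Lane L is active if the scalar iteration it represents is in range.
std::array<bool, VF> activeLaneMask(std::size_t Base, std::size_t TripCount) {
  std::array<bool, VF> Mask{};
  for (std::size_t L = 0; L < VF; ++L)
    Mask[L] = Base + L < TripCount;
  return Mask;
}

// Adds 1 to every element. There is no scalar remainder loop: the final,
// partial vector iteration simply runs with some lanes masked off.
void incrementAll(std::vector<int> &Data) {
  std::size_t Base = 0;
  std::array<bool, VF> Mask = activeLaneMask(Base, Data.size());
  while (Mask[0]) { // lane 0 inactive means no iterations remain
    for (std::size_t L = 0; L < VF; ++L)
      if (Mask[L])
        ++Data[Base + L];
    Base += VF;
    Mask = activeLaneMask(Base, Data.size());
  }
}
```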
¶bool useEmulatedMaskMemRefHack(
llvm::Instruction* I,
llvm::ElementCount VF)
bool useEmulatedMaskMemRefHack(
llvm::Instruction* I,
llvm::ElementCount VF)
Description
Returns true if an artificially high cost for emulated masked memrefs should be used.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1659
Parameters
- llvm::Instruction* I
- llvm::ElementCount VF
¶bool useOrderedReductions(
const llvm::RecurrenceDescriptor& RdxDesc)
const
bool useOrderedReductions(
const llvm::RecurrenceDescriptor& RdxDesc)
const
Description
Returns true if we should use strict in-order reductions for the given RdxDesc. This is true if the -enable-strict-reductions flag is passed, the IsOrdered flag of RdxDesc is set and we do not allow reordering of FP operations.
Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1226
Parameters
- const llvm::RecurrenceDescriptor& RdxDesc
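Why the order of a floating-point reduction matters can be seen by comparing a strict in-order sum with a sum that accumulates into two independent partial sums, as a width-2 vectorized reduction might. Both functions are hypothetical illustrations, and the example assumes the compiler is not permitted to reassociate FP operations (for example, no fast-math flags).

```cpp
#include <cstddef>
#include <vector>

// Strict in-order reduction: same association as the original scalar loop.
double sumInOrder(const std::vector<double> &X) {
  double Acc = 0.0;
  for (double V : X)
    Acc += V;
  return Acc;
}

// Reassociated reduction: even and odd elements accumulate into separate
// partial sums that are combined only at the end.
double sumTwoLanes(const std::vector<double> &X) {
  double Lane[2] = {0.0, 0.0};
  for (std::size_t I = 0; I < X.size(); ++I)
    Lane[I % 2] += X[I];
  return Lane[0] + Lane[1];
}
```

For an input such as {1e30, 1.0, -1e30, 1.0}, the in-order sum absorbs the first 1.0 into 1e30 and loses it, while the two-lane sum cancels the large values in one lane and keeps both 1.0 values in the other, so the two results differ.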