class InnerLoopVectorizer

Declaration

class InnerLoopVectorizer { /* full declaration omitted */ };

Description

InnerLoopVectorizer vectorizes loops which contain only one basic block to a specified vectorization factor (VF). This class performs the widening of scalars into vectors, or multiple scalars. This class also implements the following features:

  • It inserts an epilogue loop for handling loops that don't have iteration counts that are known to be a multiple of the vectorization factor.
  • It handles the code generation for reduction variables.
  • Scalarization (implementation using scalars) of un-vectorizable instructions.

InnerLoopVectorizer does not perform any vectorization-legality checks, and relies on the caller to check for the different legality aspects. The InnerLoopVectorizer relies on the LoopVectorizationLegality class to provide information about the induction and reduction variables that were found for a given vectorization factor.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:438
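
The split between a widened loop and a scalar epilogue can be illustrated in plain C++, without any LLVM APIs. The following is a minimal sketch under stated assumptions: `vectorTripCount` and `incrementAll` are hypothetical names, VF = 4 and UF = 2 are arbitrary choices, and the inner lane loop merely stands in for what would be vector instructions in generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: this is not LLVM code.
// Number of iterations covered by the widened loop:
// TripCount - TripCount % (VF * UF).
std::size_t vectorTripCount(std::size_t tripCount, std::size_t vf,
                            std::size_t uf) {
  assert(vf > 0 && uf > 0);
  return tripCount - tripCount % (vf * uf);
}

// Adds 1 to every element using a "vector" loop followed by a scalar epilogue.
void incrementAll(std::vector<int> &data) {
  constexpr std::size_t VF = 4, UF = 2;  // assumed factors
  const std::size_t n = data.size();
  const std::size_t vtc = vectorTripCount(n, VF, UF);
  std::size_t i = 0;
  // Widened loop: each iteration covers VF * UF original iterations.
  for (; i < vtc; i += VF * UF)
    for (std::size_t lane = 0; lane < VF * UF; ++lane)
      data[i + lane] += 1;
  // Scalar epilogue: the remaining n % (VF * UF) iterations.
  for (; i < n; ++i)
    data[i] += 1;
}
```

With 19 elements and VF * UF = 8, the widened loop in this sketch covers 16 iterations and the epilogue handles the remaining 3.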

Member Variables

protected llvm::Loop* OrigLoop
The original loop.
protected llvm::PredicatedScalarEvolution& PSE
A wrapper around ScalarEvolution used to add runtime SCEV checks. Applies dynamic knowledge to simplify SCEV expressions and converts them to a more usable form.
protected llvm::LoopInfo* LI
Loop Info.
protected llvm::DominatorTree* DT
Dominator Tree.
protected llvm::AAResults* AA
Alias Analysis.
protected const llvm::TargetLibraryInfo* TLI
Target Library Info.
protected const llvm::TargetTransformInfo* TTI
Target Transform Info.
protected llvm::AssumptionCache* AC
Assumption Cache.
protected llvm::OptimizationRemarkEmitter* ORE
Interface to emit optimization remarks.
protected llvm::ElementCount VF
The vectorization SIMD factor to use. Each vector will have this many vector elements.
protected llvm::ElementCount MinProfitableTripCount
protected unsigned int UF
The vectorization unroll factor to use. Each scalar is vectorized to this many different vector instructions.
protected IRBuilder<> Builder
The builder that we use.
protected llvm::BasicBlock* LoopVectorPreHeader
The vector-loop preheader.
protected llvm::BasicBlock* LoopScalarPreHeader
The scalar-loop preheader.
protected llvm::BasicBlock* LoopMiddleBlock
Middle Block between the vector and the scalar.
protected llvm::BasicBlock* LoopExitBlock
The unique ExitBlock of the scalar loop if one exists. Note that there can be multiple exiting edges reaching this block.
protected llvm::BasicBlock* LoopScalarBody
The scalar loop body.
protected SmallVector<llvm::BasicBlock*, 4> LoopBypassBlocks
A list of all bypass blocks. The first block is the entry of the loop.
protected SmallVector<llvm::Instruction*, 4> PredicatedInstructions
Store instructions that were predicated.
protected llvm::Value* TripCount = nullptr
Trip count of the original loop.
protected llvm::Value* VectorTripCount = nullptr
Trip count of the widened loop (TripCount - TripCount % (VF*UF))
The legality analysis.
protected llvm::LoopVectorizationCostModel* Cost
The profitability analysis.
protected bool AddedSafetyChecks = false
protected DenseMap<llvm::PHINode*, llvm::Value*> IVEndValues
protected llvm::BlockFrequencyInfo* BFI
BFI and PSI are used to check for profile guided size optimizations.
protected llvm::ProfileSummaryInfo* PSI
protected bool OptForSizeBasedOnProfile
protected GeneratedRTChecks& RTChecks
Structure to hold information about generated runtime checks, responsible for cleaning the checks, if vectorization turns out unprofitable.
protected SmallMapVector<const llvm::RecurrenceDescriptor*, llvm::PHINode*, 4> ReductionResumeValues

Method Overview

  • public InnerLoopVectorizer(llvm::Loop * OrigLoop, llvm::PredicatedScalarEvolution & PSE, llvm::LoopInfo * LI, llvm::DominatorTree * DT, const llvm::TargetLibraryInfo * TLI, const llvm::TargetTransformInfo * TTI, llvm::AssumptionCache * AC, llvm::OptimizationRemarkEmitter * ORE, llvm::ElementCount VecWidth, llvm::ElementCount MinProfitableTripCount, unsigned int UnrollFactor, llvm::LoopVectorizationLegality * LVL, llvm::LoopVectorizationCostModel * CM, llvm::BlockFrequencyInfo * BFI, llvm::ProfileSummaryInfo * PSI, GeneratedRTChecks & RTChecks)
  • public bool areSafetyChecksAdded()
  • protected void clearReductionWrapFlags(llvm::VPReductionPHIRecipe * PhiR, llvm::VPTransformState & State)
  • protected void collectPoisonGeneratingRecipes(llvm::VPTransformState & State)
  • protected llvm::BasicBlock * completeLoopSkeleton(llvm::MDNode * OrigLoopID)
  • protected llvm::Value * createBitOrPointerCast(llvm::Value * V, llvm::VectorType * DstVTy, const llvm::DataLayout & DL)
  • protected void createInductionResumeValues(std::pair<BasicBlock *, Value *> AdditionalBypass = {nullptr, nullptr})
  • protected void createVectorLoopSkeleton(llvm::StringRef Prefix)
  • public virtual std::pair<BasicBlock *, Value *> createVectorizedLoopSkeleton()
  • protected void emitIterationCountCheck(llvm::BasicBlock * Bypass)
  • protected llvm::BasicBlock * emitMemRuntimeChecks(llvm::BasicBlock * Bypass)
  • protected llvm::BasicBlock * emitSCEVChecks(llvm::BasicBlock * Bypass)
  • protected void fixCrossIterationPHIs(llvm::VPTransformState & State)
  • protected void fixFirstOrderRecurrence(llvm::VPFirstOrderRecurrencePHIRecipe * PhiR, llvm::VPTransformState & State)
  • public void fixNonInductionPHIs(llvm::VPlan & Plan, llvm::VPTransformState & State)
  • protected void fixReduction(llvm::VPReductionPHIRecipe * Phi, llvm::VPTransformState & State)
  • public void fixVectorizedLoop(llvm::VPTransformState & State, llvm::VPlan & Plan)
  • protected void fixupIVUsers(llvm::PHINode * OrigPhi, const llvm::InductionDescriptor & II, llvm::Value * VectorTripCount, llvm::Value * EndValue, llvm::BasicBlock * MiddleBlock, llvm::BasicBlock * VectorHeader, llvm::VPlan & Plan)
  • public virtual llvm::Value * getBroadcastInstrs(llvm::Value * V)
  • protected llvm::Value * getOrCreateTripCount(llvm::BasicBlock * InsertBlock)
  • protected llvm::Value * getOrCreateVectorTripCount(llvm::BasicBlock * InsertBlock)
  • public llvm::PHINode * getReductionResumeValue(const llvm::RecurrenceDescriptor & RdxDesc)
  • public void packScalarIntoVectorValue(llvm::VPValue * Def, const llvm::VPIteration & Instance, llvm::VPTransformState & State)
  • protected virtual void printDebugTracesAtEnd()
  • protected virtual void printDebugTracesAtStart()
  • public void scalarizeInstruction(llvm::Instruction * Instr, llvm::VPReplicateRecipe * RepRecipe, const llvm::VPIteration & Instance, bool IfPredicateInstr, llvm::VPTransformState & State)
  • protected void sinkScalarOperands(llvm::Instruction * PredInst)
  • protected void truncateToMinimalBitwidths(llvm::VPTransformState & State)
  • public bool useOrderedReductions(const llvm::RecurrenceDescriptor & RdxDesc)
  • public void vectorizeInterleaveGroup(const InterleaveGroup<llvm::Instruction> * Group, ArrayRef<llvm::VPValue *> VPDefs, llvm::VPTransformState & State, llvm::VPValue * Addr, ArrayRef<llvm::VPValue *> StoredValues, llvm::VPValue * BlockInMask = nullptr)
  • public void widenCallInstruction(llvm::CallInst & CI, llvm::VPValue * Def, llvm::VPUser & ArgOperands, llvm::VPTransformState & State)
  • public virtual ~InnerLoopVectorizer()

Methods

InnerLoopVectorizer(
    llvm::Loop* OrigLoop,
    llvm::PredicatedScalarEvolution& PSE,
    llvm::LoopInfo* LI,
    llvm::DominatorTree* DT,
    const llvm::TargetLibraryInfo* TLI,
    const llvm::TargetTransformInfo* TTI,
    llvm::AssumptionCache* AC,
    llvm::OptimizationRemarkEmitter* ORE,
    llvm::ElementCount VecWidth,
    llvm::ElementCount MinProfitableTripCount,
    unsigned int UnrollFactor,
    llvm::LoopVectorizationLegality* LVL,
    llvm::LoopVectorizationCostModel* CM,
    llvm::BlockFrequencyInfo* BFI,
    llvm::ProfileSummaryInfo* PSI,
    GeneratedRTChecks& RTChecks)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:440

Parameters

llvm::Loop* OrigLoop
llvm::PredicatedScalarEvolution& PSE
llvm::LoopInfo* LI
llvm::DominatorTree* DT
const llvm::TargetLibraryInfo* TLI
const llvm::TargetTransformInfo* TTI
llvm::AssumptionCache* AC
llvm::OptimizationRemarkEmitter* ORE
llvm::ElementCount VecWidth
llvm::ElementCount MinProfitableTripCount
unsigned int UnrollFactor
llvm::LoopVectorizationLegality* LVL
llvm::LoopVectorizationCostModel* CM
llvm::BlockFrequencyInfo* BFI
llvm::ProfileSummaryInfo* PSI
GeneratedRTChecks& RTChecks

bool areSafetyChecksAdded()

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:484

void clearReductionWrapFlags(
    llvm::VPReductionPHIRecipe* PhiR,
    llvm::VPTransformState& State)

Description

Clear NSW/NUW flags from reduction instructions if necessary.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:562

Parameters

llvm::VPReductionPHIRecipe* PhiR
llvm::VPTransformState& State

void collectPoisonGeneratingRecipes(
    llvm::VPTransformState& State)

Description

Collect poison-generating recipes that may generate a poison value that is used after vectorization, even when their operands are not poison. Those recipes meet the following conditions:

  • They contribute to the address computation of a recipe generating a widened memory load/store (VPWidenMemoryInstructionRecipe or VPInterleaveRecipe).
  • Such a widened memory load/store has at least one underlying Instruction that is in a basic block that needs predication, and after vectorization the generated instruction won't be predicated.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:625

Parameters

llvm::VPTransformState& State

llvm::BasicBlock* completeLoopSkeleton(
    llvm::MDNode* OrigLoopID)

Description

Complete the loop skeleton by adding debug MDs, creating appropriate conditional branches in the middle block, preparing the builder and running the verifier. Return the preheader of the completed vector loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:614

Parameters

llvm::MDNode* OrigLoopID

llvm::Value* createBitOrPointerCast(
    llvm::Value* V,
    llvm::VectorType* DstVTy,
    const llvm::DataLayout& DL)

Description

Returns a bitcasted value to the requested vector type. Also handles bitcasts of vector<float> <-> vector<pointer> types.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:581

Parameters

llvm::Value* V
llvm::VectorType* DstVTy
const llvm::DataLayout& DL

void createInductionResumeValues(
    std::pair<BasicBlock*, Value*>
        AdditionalBypass = {nullptr, nullptr})

Description

Create new phi nodes for the induction variables to resume the iteration count in the scalar epilogue from where the vectorized loop left off. In cases where the loop skeleton is more complicated (e.g. epilogue vectorization) and the resume values can come from an additional bypass block, the \p AdditionalBypass pair provides information about the bypass block and the end value on the edge from bypass to this loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:608

Parameters

std::pair<BasicBlock*, Value*> AdditionalBypass = {nullptr, nullptr}

void createVectorLoopSkeleton(
    llvm::StringRef Prefix)

Description

Emit basic blocks (prefixed with \p Prefix) for the iteration check, vector loop preheader, middle block and scalar preheader.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:600

Parameters

llvm::StringRef Prefix

virtual std::pair<BasicBlock*, Value*>
createVectorizedLoopSkeleton()

Description

Create a new empty loop that will contain vectorized instructions later on, while the old loop will be used as the scalar remainder. Control flow is generated around the vectorized (and scalar epilogue) loops consisting of various checks and bypasses. Return the pre-header block of the new loop and the start value for the canonical induction, if it is != 0. The latter is the case when vectorizing the epilogue loop. In the case of epilogue vectorization, this function is overridden to handle the more complex control flow around the loops.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:474

void emitIterationCountCheck(
    llvm::BasicBlock* Bypass)

Description

Emit a bypass check to see if the vector trip count is zero, including if it overflows.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:586

Parameters

llvm::BasicBlock* Bypass
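
To make the overflow case concrete, here is a minimal sketch in plain C++. It assumes the check is an unsigned comparison of the trip count against VF * UF, and it models the trip count as a backedge-taken count plus one in 8-bit arithmetic so that the wrap-around is easy to see. `shouldBypassVectorLoop` is a hypothetical helper, not an LLVM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when the vector loop should be skipped.
bool shouldBypassVectorLoop(std::uint8_t backedgeTakenCount, std::uint8_t vf,
                            std::uint8_t uf) {
  assert(vf > 0 && uf > 0);
  // Trip count = backedge-taken count + 1; wraps to 0 when the count is 255.
  const std::uint8_t tripCount =
      static_cast<std::uint8_t>(backedgeTakenCount + 1);
  const unsigned step = static_cast<unsigned>(vf) * uf;
  // Fewer than VF * UF iterations means the vector trip count is zero.
  // A wrapped trip count of 0 is caught by the same unsigned comparison.
  return tripCount < step;
}
```

In this model, a backedge-taken count of 255 wraps to a trip count of 0, so the same comparison that rejects short loops also sends the overflowed case to the scalar loop.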

llvm::BasicBlock* emitMemRuntimeChecks(
    llvm::BasicBlock* Bypass)

Description

Emit bypass checks to check any memory assumptions we may have made. Returns the block containing the checks or nullptr if no checks have been added.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:596

Parameters

llvm::BasicBlock* Bypass
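
A typical memory assumption is that two pointer ranges do not overlap. The sketch below shows the idea in plain C++ rather than in generated IR: a runtime overlap test guards the widened path, and the scalar loop serves as the bypass. `rangesOverlap` and `scaleBy2` are hypothetical names, and VF = 4 is an assumption chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the byte ranges [a, a+n) and [b, b+n) overlap.
bool rangesOverlap(const float *a, const float *b, std::size_t n) {
  const auto aBegin = reinterpret_cast<std::uintptr_t>(a);
  const auto bBegin = reinterpret_cast<std::uintptr_t>(b);
  const std::uintptr_t bytes = n * sizeof(float);
  return aBegin < bBegin + bytes && bBegin < aBegin + bytes;
}

// Computes dst[i] = src[i] * 2 with scalar-loop semantics.
void scaleBy2(float *dst, const float *src, std::size_t n) {
  assert(n == 0 || (dst != nullptr && src != nullptr));
  constexpr std::size_t VF = 4;  // assumed vectorization factor
  std::size_t i = 0;
  if (!rangesOverlap(dst, src, n)) {
    // Widened path: load VF elements, then store VF results.
    for (; i + VF <= n; i += VF) {
      std::array<float, VF> v;
      for (std::size_t lane = 0; lane < VF; ++lane)
        v[lane] = src[i + lane];
      for (std::size_t lane = 0; lane < VF; ++lane)
        dst[i + lane] = v[lane] * 2.0f;
    }
  }
  // Scalar loop: the bypass target when the check fails, and the remainder
  // loop otherwise.
  for (; i < n; ++i)
    dst[i] = src[i] * 2.0f;
}
```

When `dst` aliases `src + 1`, each store feeds the next load, so loading four elements before storing would change the result; the check routes that case to the scalar loop.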

llvm::BasicBlock* emitSCEVChecks(
    llvm::BasicBlock* Bypass)

Description

Emit a bypass check to see if all of the SCEV assumptions we've had to make are correct. Returns the block containing the checks or nullptr if no checks have been added.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:591

Parameters

llvm::BasicBlock* Bypass

void fixCrossIterationPHIs(
    llvm::VPTransformState& State)

Description

Handle all cross-iteration phis in the header.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:551

Parameters

llvm::VPTransformState& State

void fixFirstOrderRecurrence(
    llvm::VPFirstOrderRecurrencePHIRecipe* PhiR,
    llvm::VPTransformState& State)

Description

Create the exit value of first order recurrences in the middle block and update their users.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:555

Parameters

llvm::VPFirstOrderRecurrencePHIRecipe* PhiR
llvm::VPTransformState& State
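
A first-order recurrence is a value used in one iteration that was produced in the previous one. The sketch below, in plain C++ with hypothetical names (`diffScalar`, `diffWidened`) and an assumed VF of 4, shows how the widened loop can splice the last lane of the previous vector with the current one, and how the value live after the loop is taken from the last lane of the final vector. It illustrates the concept only and does not reproduce the IR that LLVM emits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;  // assumed vectorization factor

// Scalar reference: out[i] = a[i] - prev, where prev is the previous a[i].
// Returns the recurrence value that is live after the loop.
int diffScalar(const std::vector<int> &a, int init, std::vector<int> &out) {
  int prev = init;
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = a[i] - prev;
    prev = a[i];
  }
  return prev;
}

// Widened form; a.size() is assumed to be a multiple of VF.
int diffWidened(const std::vector<int> &a, int init, std::vector<int> &out) {
  assert(a.size() % VF == 0 && out.size() >= a.size());
  std::array<int, VF> prevVec{};
  prevVec[VF - 1] = init;  // only the last lane of the initial vector matters
  for (std::size_t i = 0; i < a.size(); i += VF) {
    std::array<int, VF> cur;
    for (std::size_t lane = 0; lane < VF; ++lane)
      cur[lane] = a[i + lane];
    // Splice: last lane of the previous vector, then the first VF-1 lanes
    // of the current vector.
    std::array<int, VF> shifted;
    shifted[0] = prevVec[VF - 1];
    for (std::size_t lane = 1; lane < VF; ++lane)
      shifted[lane] = cur[lane - 1];
    for (std::size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = cur[lane] - shifted[lane];
    prevVec = cur;
  }
  // Exit value: the last lane of the final vector.
  return prevVec[VF - 1];
}
```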

void fixNonInductionPHIs(
    llvm::VPlan& Plan,
    llvm::VPTransformState& State)

Description

Fix the non-induction PHIs in \p Plan.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:515

Parameters

llvm::VPlan& Plan
llvm::VPTransformState& State

void fixReduction(llvm::VPReductionPHIRecipe* Phi,
                  llvm::VPTransformState& State)

Description

Create code for the loop exit value of the reduction.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:559

Parameters

llvm::VPReductionPHIRecipe* Phi
llvm::VPTransformState& State
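
The loop exit value of a vectorized reduction has to combine the per-lane partial results into one scalar. A minimal sketch in plain C++, assuming an integer sum, VF = 4, and a hypothetical function name `sumWidened`:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical illustration of a sum reduction with VF = 4 lanes.
// a.size() is assumed to be a multiple of VF.
int sumWidened(const std::vector<int> &a) {
  constexpr std::size_t VF = 4;
  assert(a.size() % VF == 0);
  // Vector accumulator, started at the identity value of the reduction.
  std::array<int, VF> acc{};
  for (std::size_t i = 0; i < a.size(); i += VF)
    for (std::size_t lane = 0; lane < VF; ++lane)
      acc[lane] += a[i + lane];
  // Exit value: horizontal reduction of the lanes into a single scalar.
  return std::accumulate(acc.begin(), acc.end(), 0);
}
```

Integer addition is associative, so the per-lane partial sums can be combined in any order without changing the result.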

void fixVectorizedLoop(
    llvm::VPTransformState& State,
    llvm::VPlan& Plan)

Description

Fix the vectorized code, taking care of header phis, live-outs, and more.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:481

Parameters

llvm::VPTransformState& State
llvm::VPlan& Plan

void fixupIVUsers(
    llvm::PHINode* OrigPhi,
    const llvm::InductionDescriptor& II,
    llvm::Value* VectorTripCount,
    llvm::Value* EndValue,
    llvm::BasicBlock* MiddleBlock,
    llvm::BasicBlock* VectorHeader,
    llvm::VPlan& Plan)

Description

Set up the values of the IVs correctly when exiting the vector loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:545

Parameters

llvm::PHINode* OrigPhi
const llvm::InductionDescriptor& II
llvm::Value* VectorTripCount
llvm::Value* EndValue
llvm::BasicBlock* MiddleBlock
llvm::BasicBlock* VectorHeader
llvm::VPlan& Plan

virtual llvm::Value* getBroadcastInstrs(
    llvm::Value* V)

Description

Create a broadcast instruction. This method generates a broadcast instruction (shuffle) for loop-invariant values and for the induction value. If the value is the induction variable, it is extended to the sequence N, N+1, and so on. This is needed because each iteration of the loop corresponds to a SIMD element.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:526

Parameters

llvm::Value* V
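
The two cases described above, a splat of a loop-invariant value and a stepped induction vector, can be sketched in plain C++. `Vec`, `broadcast`, and `inductionVector` are hypothetical names, and VF = 4 is an assumption; the real method emits IR instead of computing values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4;  // assumed vectorization factor
using Vec = std::array<int, VF>;

// Splat: every lane holds the same loop-invariant value.
Vec broadcast(int v) {
  Vec r;
  r.fill(v);
  return r;
}

// Induction: lane k holds n + k, so each lane maps to one scalar iteration.
Vec inductionVector(int n) {
  assert(VF > 0);
  Vec r = broadcast(n);
  for (std::size_t lane = 0; lane < VF; ++lane)
    r[lane] += static_cast<int>(lane);
  return r;
}
```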

llvm::Value* getOrCreateTripCount(
    llvm::BasicBlock* InsertBlock)

Description

Returns (and creates if needed) the original loop trip count.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:574

Parameters

llvm::BasicBlock* InsertBlock

llvm::Value* getOrCreateVectorTripCount(
    llvm::BasicBlock* InsertBlock)

Description

Returns (and creates if needed) the trip count of the widened loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:577

Parameters

llvm::BasicBlock* InsertBlock

llvm::PHINode* getReductionResumeValue(
    const llvm::RecurrenceDescriptor& RdxDesc)

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:530

Parameters

const llvm::RecurrenceDescriptor& RdxDesc

void packScalarIntoVectorValue(
    llvm::VPValue* Def,
    const llvm::VPIteration& Instance,
    llvm::VPTransformState& State)

Description

Construct the vector value of a scalarized value \p Def one lane at a time.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:501

Parameters

llvm::VPValue* Def
const llvm::VPIteration& Instance
llvm::VPTransformState& State

virtual void printDebugTracesAtEnd()

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:630

virtual void printDebugTracesAtStart()

Description

Allow subclasses to override and print debug traces before/after VPlan execution, when trace information is requested.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:629

void scalarizeInstruction(
    llvm::Instruction* Instr,
    llvm::VPReplicateRecipe* RepRecipe,
    const llvm::VPIteration& Instance,
    bool IfPredicateInstr,
    llvm::VPTransformState& State)

Description

A helper function to scalarize a single Instruction in the innermost loop. Generates the scalar instance of \p Instr for the lane and part identified by \p Instance. Uses the VPValue operands from \p RepRecipe instead of \p Instr's operands.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:496

Parameters

llvm::Instruction* Instr
llvm::VPReplicateRecipe* RepRecipe
const llvm::VPIteration& Instance
bool IfPredicateInstr
llvm::VPTransformState& State

void sinkScalarOperands(
    llvm::Instruction* PredInst)

Description

Iteratively sink the scalarized operands of a predicated instruction into the block that was created for it.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:567

Parameters

llvm::Instruction* PredInst

void truncateToMinimalBitwidths(
    llvm::VPTransformState& State)

Description

Shrinks vector element sizes to the smallest bitwidth they can be legally represented as.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:571

Parameters

llvm::VPTransformState& State

bool useOrderedReductions(
    const llvm::RecurrenceDescriptor& RdxDesc)

Description

Returns true if the reordering of FP operations is not allowed, but we are able to vectorize with strict in-order reductions for the given RdxDesc.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:519

Parameters

const llvm::RecurrenceDescriptor& RdxDesc
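
Floating-point addition is not associative, which is why reordering can be disallowed. The sketch below contrasts three hypothetical functions in plain C++: a scalar reference, a reassociated reduction with two per-lane partial sums, and a strict in-order form that processes two elements per iteration but keeps a single accumulator. An interleaving of 2 is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: adds the elements strictly in source order.
float sumInOrder(const std::vector<float> &a) {
  float s = 0.0f;
  for (float x : a)
    s += x;
  return s;
}

// Reassociated: two partial sums combined after the loop.
// a.size() is assumed to be even.
float sumReassociated(const std::vector<float> &a) {
  assert(a.size() % 2 == 0);
  float lane0 = 0.0f, lane1 = 0.0f;
  for (std::size_t i = 0; i < a.size(); i += 2) {
    lane0 += a[i];
    lane1 += a[i + 1];
  }
  return lane0 + lane1;
}

// Ordered: two elements per iteration, but a single accumulator updated in
// the original order, so the sequence of additions matches sumInOrder.
float sumOrdered(const std::vector<float> &a) {
  assert(a.size() % 2 == 0);
  float s = 0.0f;
  for (std::size_t i = 0; i < a.size(); i += 2) {
    s += a[i];
    s += a[i + 1];
  }
  return s;
}
```

For the input {1e20f, 1.0f, -1e20f, 1.0f}, and assuming IEEE-754 single-precision evaluation, the in-order and ordered sums agree while the reassociated sum differs, because the small terms are absorbed at different points.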

void vectorizeInterleaveGroup(
    const InterleaveGroup<llvm::Instruction>*
        Group,
    ArrayRef<llvm::VPValue*> VPDefs,
    llvm::VPTransformState& State,
    llvm::VPValue* Addr,
    ArrayRef<llvm::VPValue*> StoredValues,
    llvm::VPValue* BlockInMask = nullptr)

Description

Try to vectorize interleaved access group \p Group with the base address given in \p Addr, optionally masking the vector operations if \p BlockInMask is non-null. Use \p State to translate given VPValues to IR values in the vectorized loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:508

Parameters

const InterleaveGroup<llvm::Instruction>* Group
ArrayRef<llvm::VPValue*> VPDefs
llvm::VPTransformState& State
llvm::VPValue* Addr
ArrayRef<llvm::VPValue*> StoredValues
llvm::VPValue* BlockInMask = nullptr
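
For a load group, the idea behind interleaved-access vectorization is to replace several strided accesses with one wide contiguous access followed by strided shuffles. A minimal sketch in plain C++, assuming an interleave factor of 2 and VF = 4; `Deinterleaved` and `loadInterleaved` are hypothetical names, and masking is not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4;      // assumed vectorization factor
constexpr std::size_t Factor = 2;  // assumed interleave factor

struct Deinterleaved {
  std::array<int, VF> even;  // members at index 0 of each group
  std::array<int, VF> odd;   // members at index 1 of each group
};

// Reads VF * Factor contiguous elements starting at base, then separates the
// two interleaved streams.
Deinterleaved loadInterleaved(const int *base) {
  assert(base != nullptr);
  // One wide load covering all members of the group.
  std::array<int, VF * Factor> wide;
  for (std::size_t k = 0; k < VF * Factor; ++k)
    wide[k] = base[k];
  // Strided "shuffles" extract each member into its own vector.
  Deinterleaved r{};
  for (std::size_t lane = 0; lane < VF; ++lane) {
    r.even[lane] = wide[Factor * lane];
    r.odd[lane] = wide[Factor * lane + 1];
  }
  return r;
}
```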

void widenCallInstruction(
    llvm::CallInst& CI,
    llvm::VPValue* Def,
    llvm::VPUser& ArgOperands,
    llvm::VPTransformState& State)

Description

Widen a single call instruction within the innermost loop.

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:477

Parameters

llvm::CallInst& CI
llvm::VPValue* Def
llvm::VPUser& ArgOperands
llvm::VPTransformState& State

virtual ~InnerLoopVectorizer()

Declared at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:464