class BlockFrequencyInfoImpl

Declaration

template <class BT>
class BlockFrequencyInfoImpl : private BlockFrequencyInfoImplBase { /* full declaration omitted */ };

Description

Shared implementation for block frequency analysis. This is a shared implementation of BlockFrequencyInfo and MachineBlockFrequencyInfo, and calculates the relative frequencies of blocks.

LoopInfo defines a loop as a "non-trivial" SCC dominated by a single block, which is called the header. A given loop, L, can have sub-loops, which are loops within the subgraph of L that exclude its header. (A "trivial" SCC consists of a single block that does not have a self-edge.)

In addition to loops, this algorithm has limited support for irreducible SCCs, which are SCCs with multiple entry blocks. Irreducible SCCs are discovered on the fly, and modelled as loops with multiple headers.

The headers of an irreducible sub-SCC consist of its entry blocks and all nodes that are targets of a backedge within it (excluding backedges within true sub-loops). Block frequency calculations act as if a block is inserted that intercepts all the edges to the headers. All backedges and entries point to this block. Its successors are the headers, which split the frequency evenly.

This algorithm leverages BlockMass and ScaledNumber to maintain precision, separates mass distribution from loop scaling, and dithers to eliminate probability mass loss.

The implementation is split between BlockFrequencyInfoImpl, which knows the type of graph being modelled (BasicBlock vs. MachineBasicBlock), and BlockFrequencyInfoImplBase, which doesn't. The base class uses \a BlockNode, a wrapper around a uint32_t. BlockNode is numbered from 0 in reverse post-order. This gives two advantages: it's easy to compare the relative ordering of two nodes, and maps keyed on BlockT can be represented by vectors.

This algorithm is O(V+E), unless there is irreducible control flow, in which case it's O(V*E) in the worst case.

These are the main stages:

0. Reverse post-order traversal (\a initializeRPOT()).

   Run a single post-order traversal and save it (in reverse) in RPOT. All other stages make use of this ordering. Save a lookup from BlockT to BlockNode (the index into RPOT) in Nodes.

1. Loop initialization (\a initializeLoops()).

   Translate LoopInfo/MachineLoopInfo into a form suitable for the rest of the algorithm. In particular, store the immediate members of each loop in reverse post-order.

2. Calculate mass and scale in loops (\a computeMassInLoops()).

   For each loop (bottom-up), distribute mass through the DAG resulting from ignoring backedges and treating sub-loops as a single pseudo-node. Track the backedge mass distributed to the loop header, and use it to calculate the loop scale (number of loop iterations). Immediate members that represent sub-loops will already have been visited and packaged into a pseudo-node.

   Distributing mass in a loop is a reverse post-order traversal through the loop. Start by assigning full mass to the loop header. For each node in the loop:

   - Fetch and categorize the weight distribution for its successors. If this is a packaged sub-loop, the weight distribution is stored in \a LoopData::Exits. Otherwise, fetch it from BranchProbabilityInfo.

   - Each successor is categorized as \a Weight::Local, a local edge within the current loop, \a Weight::Backedge, a backedge to the loop header, or \a Weight::Exit, any successor outside the loop. The weight, the successor, and its category are stored in \a Distribution. There can be multiple edges to each successor.

   - If there's a backedge to a non-header, there's an irreducible SCC. The usual flow is temporarily aborted. \a computeIrreducibleMass() finds the irreducible SCCs within the loop, packages them up, and restarts the flow.

   - Normalize the distribution: scale weights down so that their sum fits in 32 bits, and coalesce multiple edges to the same node.

   - Distribute the mass accordingly, dithering to minimize mass loss, as described in \a distributeMass().

   In the case of irreducible loops, instead of a single loop header, there will be several. The computation of backedge masses is similar, but instead of having a single backedge mass, there will be one backedge per loop header. In these cases, each backedge will carry a mass proportional to the edge weights along the corresponding path.

   At the end of propagation, the full mass assigned to the loop will be distributed among the loop headers proportionally according to the mass flowing through their backedges.

   Finally, calculate the loop scale from the accumulated backedge mass.

3. Distribute mass in the function (\a computeMassInFunction()).

   Finally, distribute mass through the DAG resulting from packaging all loops in the function. This uses the same algorithm as distributing mass in a loop, except that there are no exit or backedge edges.

4. Unpackage loops (\a unwrapLoops()).

   Initialize each block's frequency to a floating point representation of its mass.

   Visit loops top-down, scaling the frequencies of each loop's immediate members by the frequency of the loop's pseudo-node.

5. Convert frequencies to a 64-bit range (\a finalizeMetrics()).

   Using the min and max frequencies as a guide, translate floating point frequencies to an appropriate range in uint64_t.

It has some known flaws.

- The model of irreducible control flow is a rough approximation.

  Modelling irreducible control flow exactly involves setting up and solving a group of infinite geometric series. Such precision is unlikely to be worthwhile, since most of our algorithms give up on irreducible control flow anyway.

  Nevertheless, we might find that we need to get closer. Here's a sort of TODO list for the model with diminishing returns, to be completed as necessary.

  - The headers for the \a LoopData representing an irreducible SCC include non-entry blocks. When these extra blocks exist, they indicate a self-contained irreducible sub-SCC. We could treat them as sub-loops, rather than arbitrarily shoving the problematic blocks into the headers of the main irreducible SCC.

  - Entry frequencies are assumed to be evenly split between the headers of a given irreducible SCC, which is the only option if we need to compute mass in the SCC before its parent loop. Instead, we could partially compute mass in the parent loop, and stop when we get to the SCC. Here, we have the correct ratio of entry masses, which we can use to adjust their relative frequencies. Compute mass in the SCC, and then continue propagation in the parent.

  - We can propagate mass iteratively through the SCC, for some fixed number of iterations. Each iteration starts by assigning the entry blocks their backedge mass from the prior iteration. The final mass for each block (and each exit, and the total backedge mass used for computing loop scale) is the sum of all iterations. (Running this until fixed point would "solve" the geometric series by simulation.)
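The relationship between backedge mass, loop scale (stage 2), and unpackaging (stage 4) can be illustrated with plain doubles. This is a minimal sketch, not the LLVM code: the real implementation works on BlockMass and ScaledNumber, the helper names loopScale and unwrapLoop are hypothetical, and the sketch assumes the scale is the sum of the geometric series 1 + m + m^2 + ..., where m is the backedge mass.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of LLVM: expected number of iterations of a
// loop whose header gets back `BackedgeMass` (a fraction in [0, 1)) of the
// mass it sends out on every trip. Summing 1 + m + m^2 + ... gives 1 / (1 - m).
double loopScale(double BackedgeMass) {
  assert(BackedgeMass >= 0.0 && BackedgeMass < 1.0 && "mass must be in [0, 1)");
  return 1.0 / (1.0 - BackedgeMass);
}

// Hypothetical helper: turn the per-trip masses of a loop's immediate members
// into frequencies by multiplying with the loop scale and with the frequency
// of the pseudo-node that stands for the loop in its parent.
std::vector<double> unwrapLoop(const std::vector<double> &MemberMass,
                               double PseudoNodeFreq, double Scale) {
  std::vector<double> Freq;
  Freq.reserve(MemberMass.size());
  for (double Mass : MemberMass)
    Freq.push_back(Mass * PseudoNodeFreq * Scale);
  return Freq;
}
```

Under these assumptions, a backedge mass of 0.75 gives a scale of 4, so a header with mass 1.0 inside a loop whose pseudo-node has frequency 0.5 ends up with frequency 2.0.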

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:848

Inherits from: BlockFrequencyInfoImplBase

Templates

BT

Member Variables

private const llvm::BlockFrequencyInfoImpl::BranchProbabilityInfoT* BPI = nullptr
private const llvm::BlockFrequencyInfoImpl::LoopInfoT* LI = nullptr
private const llvm::BlockFrequencyInfoImpl::FunctionT* F = nullptr
private std::vector<const BlockT*> RPOT
private DenseMap<llvm::BlockFrequencyInfoImpl::BlockKeyT, std::pair<BlockNode, BFICallbackVH>> Nodes

Method Overview

  • public BlockFrequencyInfoImpl<BlockT>()
  • private void applyIterativeInference()
  • public void calculate(const llvm::BlockFrequencyInfoImpl::FunctionT & F, const llvm::BlockFrequencyInfoImpl::BranchProbabilityInfoT & BPI, const llvm::BlockFrequencyInfoImpl::LoopInfoT & LI)
  • private void computeIrreducibleMass(llvm::BlockFrequencyInfoImplBase::LoopData * OuterLoop, std::list<LoopData>::iterator Insert)
  • private void computeMassInFunction()
  • private bool computeMassInLoop(llvm::BlockFrequencyInfoImplBase::LoopData & Loop)
  • private void computeMassInLoops()
  • private llvm::BlockFrequencyInfoImplBase::Scaled64 discrepancy(const llvm::BlockFrequencyInfoImpl::ProbMatrixType & ProbMatrix, const std::vector<Scaled64> & Freq) const
  • private void findReachableBlocks(std::vector<const BlockT *> & Blocks) const
  • public void forgetBlock(const llvm::BlockFrequencyInfoImpl::BlockT * BB)
  • public const llvm::BlockFrequencyInfoImpl::BranchProbabilityInfoT & getBPI() const
  • private const llvm::BlockFrequencyInfoImpl::BlockT * getBlock(const llvm::BlockFrequencyInfoImplBase::BlockNode & Node) const
  • public llvm::BlockFrequency getBlockFreq(const llvm::BlockFrequencyInfoImpl::BlockT * BB) const
  • private std::string getBlockName(const llvm::BlockFrequencyInfoImplBase::BlockNode & Node) const
  • public Optional<uint64_t> getBlockProfileCount(const llvm::Function & F, const llvm::BlockFrequencyInfoImpl::BlockT * BB, bool AllowSynthetic = false) const
  • public llvm::BlockFrequencyInfoImplBase::Scaled64 getFloatingBlockFreq(const llvm::BlockFrequencyInfoImpl::BlockT * BB) const
  • public const llvm::BlockFrequencyInfoImpl::FunctionT * getFunction() const
  • private size_t getIndex(const llvm::BlockFrequencyInfoImpl::rpot_iterator & I) const
  • private llvm::BlockFrequencyInfoImplBase::BlockNode getNode(const llvm::BlockFrequencyInfoImpl::BlockT * BB) const
  • private llvm::BlockFrequencyInfoImplBase::BlockNode getNode(const llvm::BlockFrequencyInfoImpl::rpot_iterator & I) const
  • public Optional<uint64_t> getProfileCountFromFreq(const llvm::Function & F, uint64_t Freq, bool AllowSynthetic = false) const
  • private void initTransitionProbabilities(const std::vector<const BlockT *> & Blocks, const DenseMap<const llvm::BlockFrequencyInfoImpl::BlockT *, size_t> & BlockIndex, llvm::BlockFrequencyInfoImpl::ProbMatrixType & ProbMatrix) const
  • private void initializeLoops()
  • private void initializeRPOT()
  • public bool isIrrLoopHeader(const llvm::BlockFrequencyInfoImpl::BlockT * BB)
  • private void iterativeInference(const llvm::BlockFrequencyInfoImpl::ProbMatrixType & ProbMatrix, std::vector<Scaled64> & Freq) const
  • private bool needIterativeInference() const
  • public llvm::raw_ostream & print(llvm::raw_ostream & OS) const
  • public llvm::raw_ostream & printBlockFreq(llvm::raw_ostream & OS, const llvm::BlockFrequencyInfoImpl::BlockT * BB) const
  • private bool propagateMassToSuccessors(llvm::BlockFrequencyInfoImplBase::LoopData * OuterLoop, const llvm::BlockFrequencyInfoImplBase::BlockNode & Node)
  • private llvm::BlockFrequencyInfoImpl::rpot_iterator rpot_begin() const
  • private llvm::BlockFrequencyInfoImpl::rpot_iterator rpot_end() const
  • public void setBlockFreq(const llvm::BlockFrequencyInfoImpl::BlockT * BB, uint64_t Freq)
  • private bool tryToComputeMassInFunction()
  • public void verifyMatch(BlockFrequencyInfoImpl<BT> & Other) const

Methods

BlockFrequencyInfoImpl<BlockT>()

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1010

void applyIterativeInference()

Description

Apply an iterative post-processing step to infer correct counts for irreducible loops.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:983

void calculate(
    const llvm::BlockFrequencyInfoImpl::FunctionT&
        F,
    const llvm::BlockFrequencyInfoImpl::
        BranchProbabilityInfoT& BPI,
    const llvm::BlockFrequencyInfoImpl::LoopInfoT&
        LI)

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1014

Parameters

const llvm::BlockFrequencyInfoImpl::FunctionT& F
const llvm::BlockFrequencyInfoImpl::BranchProbabilityInfoT& BPI
const llvm::BlockFrequencyInfoImpl::LoopInfoT& LI

void computeIrreducibleMass(
    llvm::BlockFrequencyInfoImplBase::LoopData*
        OuterLoop,
    std::list<LoopData>::iterator Insert)

Description

Compute mass in (and package up) irreducible SCCs. Find the irreducible SCCs in \c OuterLoop, add them to \a Loops (in front of \c Insert), and call \a computeMassInLoop() on each of them. If \c OuterLoop is \c nullptr, it refers to the top-level function.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:944

Parameters

llvm::BlockFrequencyInfoImplBase::LoopData* OuterLoop
std::list<LoopData>::iterator Insert

void computeMassInFunction()

Description

Compute mass in the top-level function. Uses \a tryToComputeMassInFunction() and \a computeIrreducibleMass() to compute mass in the top-level function.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:964

bool computeMassInLoop(
    llvm::BlockFrequencyInfoImplBase::LoopData&
        Loop)

Description

Compute mass in a particular loop. Assign mass to \c Loop's header, and then for each block in \c Loop in reverse post-order, distribute mass to its successors. Only visits nodes that have not been packaged into sub-loops.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:920

Parameters

llvm::BlockFrequencyInfoImplBase::LoopData& Loop

Returns

\c true unless there's an irreducible backedge.
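The traversal described above can be sketched on a toy graph. This is a simplified illustration under stated assumptions, not the LLVM implementation: the Edge and LoopMassResult types and the function distributeMassInLoop are hypothetical, masses are plain doubles with no dithering, members are numbered in reverse post-order with the header at index 0, and any target index past the last member counts as outside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge type: target node and branch probability.
struct Edge {
  std::size_t To;
  double Prob;
};

// Hypothetical result type for the sketch.
struct LoopMassResult {
  bool Reducible = true;     // false if a backedge to a non-header was seen
  std::vector<double> Mass;  // mass per loop member
  double BackedgeMass = 0.0; // mass flowing back to the header
  double ExitMass = 0.0;     // mass leaving the loop
};

// Succs[I] lists the successors of member I. Members are in reverse
// post-order, member 0 is the header, and targets >= Succs.size() lie
// outside the loop.
LoopMassResult distributeMassInLoop(const std::vector<std::vector<Edge>> &Succs) {
  const std::size_t N = Succs.size();
  assert(N > 0 && "a loop needs at least a header");
  LoopMassResult R;
  R.Mass.assign(N, 0.0);
  R.Mass[0] = 1.0; // start with full mass on the header
  for (std::size_t Node = 0; Node < N; ++Node) {
    for (const Edge &E : Succs[Node]) {
      const double Share = R.Mass[Node] * E.Prob;
      if (E.To >= N) {
        R.ExitMass += Share;     // exit edge
      } else if (E.To == 0) {
        R.BackedgeMass += Share; // backedge to the header
      } else if (E.To > Node) {
        R.Mass[E.To] += Share;   // local edge, forward in reverse post-order
      } else {
        R.Reducible = false;     // backedge to a non-header: give up
        return R;
      }
    }
  }
  return R;
}
```

In this sketch a `false` result plays the role of the aborted flow: the caller would package the irreducible SCC and try again.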

void computeMassInLoops()

Description

Compute mass in all loops. For each loop bottom-up, call \a computeMassInLoop(). \a computeMassInLoop() aborts (and returns \c false) on loops that contain irreducible sub-SCCs. In that case, use \a computeIrreducibleMass() and then re-enter \a computeMassInLoop().

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:956

llvm::BlockFrequencyInfoImplBase::Scaled64
discrepancy(
    const llvm::BlockFrequencyInfoImpl::
        ProbMatrixType& ProbMatrix,
    const std::vector<Scaled64>& Freq) const

Description

Compute the discrepancy between current block frequencies and the probability matrix.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1005

Parameters

const llvm::BlockFrequencyInfoImpl::ProbMatrixType& ProbMatrix
const std::vector<Scaled64>& Freq
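One plausible way to measure such a discrepancy is to compare each block's frequency with the inflow implied by its predecessors. The sketch below assumes that measure (a sum of absolute differences) and uses doubles in place of Scaled64. The ProbMatrix alias and the name flowDiscrepancy are hypothetical, and the real method may normalize or weight the result differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for ProbMatrixType: Matrix[I] lists (J, P) pairs,
// one per edge J -> I that is taken with probability P.
using ProbMatrix = std::vector<std::vector<std::pair<std::size_t, double>>>;

// Sum over all blocks of |Freq[I] - inflow(I)|, where inflow(I) is the
// frequency arriving from the predecessors of I.
double flowDiscrepancy(const ProbMatrix &Matrix, const std::vector<double> &Freq) {
  assert(Matrix.size() == Freq.size() && "one row per block expected");
  double Total = 0.0;
  for (std::size_t I = 0; I < Matrix.size(); ++I) {
    double Inflow = 0.0;
    for (const std::pair<std::size_t, double> &Jump : Matrix[I])
      Inflow += Freq[Jump.first] * Jump.second;
    Total += std::fabs(Freq[I] - Inflow);
  }
  return Total;
}
```

A result of zero means every block's frequency is exactly explained by its predecessors, which is the fixed point an iterative scheme aims for.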

void findReachableBlocks(
    std::vector<const BlockT*>& Blocks) const

Description

Find all blocks to apply inference on, that is, blocks reachable from the entry and backward reachable from exits along edges with positive probability.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:993

Parameters

std::vector<const BlockT*>& Blocks
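The selection rule can be sketched as two breadth-first searches whose results are intersected. This is an illustration only: the Edge, Graph, and Adjacency types and the names reachableFrom and findReachable are hypothetical, block 0 is assumed to be the entry, and a block with no successors at all is assumed to be an exit.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical edge type: target block index and branch probability.
struct Edge {
  std::size_t To;
  double Prob;
};
using Graph = std::vector<std::vector<Edge>>;
using Adjacency = std::vector<std::vector<std::size_t>>;

// Breadth-first search over Adj, starting from every node in Starts.
static std::vector<bool> reachableFrom(const Adjacency &Adj,
                                       const std::vector<std::size_t> &Starts) {
  std::vector<bool> Seen(Adj.size(), false);
  std::queue<std::size_t> Work;
  for (std::size_t S : Starts) {
    if (!Seen[S]) {
      Seen[S] = true;
      Work.push(S);
    }
  }
  while (!Work.empty()) {
    const std::size_t Node = Work.front();
    Work.pop();
    for (std::size_t Next : Adj[Node]) {
      if (!Seen[Next]) {
        Seen[Next] = true;
        Work.push(Next);
      }
    }
  }
  return Seen;
}

// Blocks reachable from the entry and able to reach an exit, following only
// edges with positive probability. Returned in increasing index order.
std::vector<std::size_t> findReachable(const Graph &G) {
  assert(!G.empty() && "block 0 is the entry");
  Adjacency Fwd(G.size()), Bwd(G.size());
  std::vector<std::size_t> Exits;
  for (std::size_t Src = 0; Src < G.size(); ++Src) {
    if (G[Src].empty())
      Exits.push_back(Src); // assumption: no successors means exit block
    for (const Edge &E : G[Src]) {
      if (E.Prob > 0.0) {
        Fwd[Src].push_back(E.To);
        Bwd[E.To].push_back(Src);
      }
    }
  }
  const std::vector<bool> FromEntry =
      reachableFrom(Fwd, std::vector<std::size_t>(1, 0));
  const std::vector<bool> ToExit = reachableFrom(Bwd, Exits);
  std::vector<std::size_t> Blocks;
  for (std::size_t I = 0; I < G.size(); ++I)
    if (FromEntry[I] && ToExit[I])
      Blocks.push_back(I);
  return Blocks;
}
```

Blocks behind a zero-probability edge, blocks that are never entered, and blocks stuck in an endless self-loop are all dropped by the intersection.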

void forgetBlock(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB)

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1043

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB

const llvm::BlockFrequencyInfoImpl::
    BranchProbabilityInfoT&
    getBPI() const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1054

const llvm::BlockFrequencyInfoImpl::BlockT*
getBlock(const llvm::BlockFrequencyInfoImplBase::
             BlockNode& Node) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:885

Parameters

const llvm::BlockFrequencyInfoImplBase::BlockNode& Node

llvm::BlockFrequency getBlockFreq(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1019

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB

std::string getBlockName(
    const llvm::BlockFrequencyInfoImplBase::
        BlockNode& Node) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:966

Parameters

const llvm::BlockFrequencyInfoImplBase::BlockNode& Node

Optional<uint64_t> getBlockProfileCount(
    const llvm::Function& F,
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB,
    bool AllowSynthetic = false) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1023

Parameters

const llvm::Function& F
const llvm::BlockFrequencyInfoImpl::BlockT* BB
bool AllowSynthetic = false

llvm::BlockFrequencyInfoImplBase::Scaled64
getFloatingBlockFreq(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1050

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB

const llvm::BlockFrequencyInfoImpl::FunctionT*
getFunction() const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1012

size_t getIndex(
    const llvm::BlockFrequencyInfoImpl::
        rpot_iterator& I) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:877

Parameters

const llvm::BlockFrequencyInfoImpl::rpot_iterator& I

llvm::BlockFrequencyInfoImplBase::BlockNode
getNode(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:883

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB

llvm::BlockFrequencyInfoImplBase::BlockNode
getNode(const llvm::BlockFrequencyInfoImpl::
            rpot_iterator& I) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:879

Parameters

const llvm::BlockFrequencyInfoImpl::rpot_iterator& I

Optional<uint64_t> getProfileCountFromFreq(
    const llvm::Function& F,
    uint64_t Freq,
    bool AllowSynthetic = false) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1030

Parameters

const llvm::Function& F
uint64_t Freq
bool AllowSynthetic = false

void initTransitionProbabilities(
    const std::vector<const BlockT*>& Blocks,
    const DenseMap<
        const llvm::BlockFrequencyInfoImpl::
            BlockT*,
        size_t>& BlockIndex,
    llvm::BlockFrequencyInfoImpl::ProbMatrixType&
        ProbMatrix) const

Description

Build a matrix of probabilities with transitions (edges) between the blocks: ProbMatrix[I] holds pairs (J, P), where Pr[J -> I | J] = P.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:997

Parameters

const std::vector<const BlockT*>& Blocks
const DenseMap<const llvm::BlockFrequencyInfoImpl::BlockT*, size_t>& BlockIndex
llvm::BlockFrequencyInfoImpl::ProbMatrixType& ProbMatrix
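The layout of the matrix is easiest to see by building one from a successor list. The sketch below assumes blocks are already identified by their index, so no BlockIndex lookup is needed. The Edge type, the ProbMatrix alias, and the name buildProbMatrix are hypothetical, and probabilities are doubles rather than Scaled64.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical edge type: target block index and branch probability.
struct Edge {
  std::size_t To;
  double Prob;
};

// Hypothetical stand-in for ProbMatrixType: row I lists (J, P) pairs with
// Pr[J -> I | J] = P, so each row describes the incoming edges of a block.
using ProbMatrix = std::vector<std::vector<std::pair<std::size_t, double>>>;

// Invert a successor list into the incoming-edge layout described above.
ProbMatrix buildProbMatrix(const std::vector<std::vector<Edge>> &Succs) {
  ProbMatrix Matrix(Succs.size());
  for (std::size_t Src = 0; Src < Succs.size(); ++Src) {
    for (const Edge &E : Succs[Src]) {
      assert(E.To < Succs.size() && "edge leaves the block list");
      if (E.Prob > 0.0) // zero-probability edges carry no frequency
        Matrix[E.To].push_back(std::make_pair(Src, E.Prob));
    }
  }
  return Matrix;
}
```

Storing incoming edges per row means the frequency of block I can be recomputed from a single pass over row I.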

void initializeLoops()

Description

Initialize loop data. Build up \a Loops using \a LoopInfo. \a LoopInfo gives us a mapping from each block to the deepest loop it's in, but we need the inverse. For each loop, we store in reverse post-order its "immediate" members, defined as the header, the headers of immediate sub-loops, and all other blocks in the loop that are not in sub-loops.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:902

void initializeRPOT()

Description

Run (and save) a post-order traversal. Saves a reverse post-order traversal of all the nodes in \a F.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:893
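The idea of saving a reversed post-order together with a block-to-index lookup can be sketched on a toy graph. The names below (Successors, Ordering, Unreachable, visitPostOrder, computeRPOT) are hypothetical, blocks are plain indices with block 0 as the entry, and a recursive depth-first search stands in for whatever traversal the real code uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Successors = std::vector<std::vector<std::size_t>>;

// Hypothetical result type: RPOT holds blocks in reverse post-order, and
// Nodes maps a block to its index in RPOT.
struct Ordering {
  std::vector<std::size_t> RPOT;
  std::vector<std::size_t> Nodes;
};

// Marker stored in Nodes for blocks the traversal never reached.
const std::size_t Unreachable = static_cast<std::size_t>(-1);

// Depth-first search that records each block after all of its successors.
static void visitPostOrder(const Successors &Succs, std::size_t Block,
                           std::vector<bool> &Visited,
                           std::vector<std::size_t> &PostOrder) {
  Visited[Block] = true;
  for (std::size_t Next : Succs[Block])
    if (!Visited[Next])
      visitPostOrder(Succs, Next, Visited, PostOrder);
  PostOrder.push_back(Block);
}

Ordering computeRPOT(const Successors &Succs) {
  assert(!Succs.empty() && "block 0 is the entry");
  std::vector<bool> Visited(Succs.size(), false);
  Ordering Result;
  visitPostOrder(Succs, 0, Visited, Result.RPOT);
  std::reverse(Result.RPOT.begin(), Result.RPOT.end());
  Result.Nodes.assign(Succs.size(), Unreachable);
  for (std::size_t Index = 0; Index < Result.RPOT.size(); ++Index)
    Result.Nodes[Result.RPOT[Index]] = Index;
  return Result;
}
```

Because indices follow reverse post-order, comparing two indices is enough to tell a forward edge from a potential backedge, which is what the later stages rely on.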

bool isIrrLoopHeader(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB)

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1037

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB

void iterativeInference(
    const llvm::BlockFrequencyInfoImpl::
        ProbMatrixType& ProbMatrix,
    std::vector<Scaled64>& Freq) const

Description

Run iterative inference for a probability matrix and initial frequencies.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:988

Parameters

const llvm::BlockFrequencyInfoImpl::ProbMatrixType& ProbMatrix
std::vector<Scaled64>& Freq
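A fixed-point iteration of this kind can be sketched as repeated sweeps that recompute each block's frequency from its predecessors. This is a simplified stand-in and very likely differs from the real method: it uses doubles, pins the entry block (assumed to be block 0) at its initial frequency, updates in place, and stops on a tolerance or a sweep limit. The ProbMatrix alias and the name inferFrequencies are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for ProbMatrixType: row I lists (J, P) pairs, one
// per edge J -> I that is taken with probability P.
using ProbMatrix = std::vector<std::vector<std::pair<std::size_t, double>>>;

// Repeatedly set Freq[I] to the inflow from I's predecessors until no block
// changes by more than Tolerance or MaxSweeps is reached. Returns the number
// of sweeps performed.
unsigned inferFrequencies(const ProbMatrix &Matrix, std::vector<double> &Freq,
                          unsigned MaxSweeps, double Tolerance) {
  assert(Matrix.size() == Freq.size() && !Freq.empty());
  unsigned Sweep = 0;
  while (Sweep < MaxSweeps) {
    ++Sweep;
    double MaxChange = 0.0;
    // Assumption: block 0 is the entry and keeps its frequency.
    for (std::size_t I = 1; I < Matrix.size(); ++I) {
      double Inflow = 0.0;
      for (const std::pair<std::size_t, double> &Jump : Matrix[I])
        Inflow += Freq[Jump.first] * Jump.second;
      MaxChange = std::max(MaxChange, std::fabs(Inflow - Freq[I]));
      Freq[I] = Inflow;
    }
    if (MaxChange < Tolerance)
      break;
  }
  return Sweep;
}
```

For a block that loops back to itself with probability 0.5, each sweep halves the remaining error, so the frequency approaches 2 times the entry frequency, matching the geometric series 1 + 0.5 + 0.25 + ....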

bool needIterativeInference() const

Description

Decide whether we want to apply iterative inference for a given function. The current implementation for computing relative block frequencies does not correctly handle control-flow graphs containing irreducible loops. To resolve the problem, we apply a post-processing step, which iteratively updates block frequencies based on the frequencies of their predecessors. This corresponds to finding the stationary point of the Markov chain by an iterative method, also known as "PageRank computation". The algorithm takes at most O(|E| * IterativeBFIMaxIterations) steps but typically converges faster.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:980

llvm::raw_ostream& print(
    llvm::raw_ostream& OS) const

Description

Print the frequencies for the current function. Prints the frequencies for the blocks in the current function. Blocks are printed in the natural iteration order of the function, rather than reverse post-order. This provides two advantages: writing -analyze tests is easier (since blocks come out in source order), and even unreachable blocks are printed. \a BlockFrequencyInfoImplBase::print() only knows reverse post-order, so we need to override it here.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1067

Parameters

llvm::raw_ostream& OS

llvm::raw_ostream& printBlockFreq(
    llvm::raw_ostream& OS,
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1072

Parameters

llvm::raw_ostream& OS
const llvm::BlockFrequencyInfoImpl::BlockT* BB

bool propagateMassToSuccessors(
    llvm::BlockFrequencyInfoImplBase::LoopData*
        OuterLoop,
    const llvm::BlockFrequencyInfoImplBase::
        BlockNode& Node)

Description

Propagate to a block's successors. In the context of distributing mass through \c OuterLoop, divide the mass currently assigned to \c Node between its successors.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:910

Parameters

llvm::BlockFrequencyInfoImplBase::LoopData* OuterLoop
const llvm::BlockFrequencyInfoImplBase::BlockNode& Node

Returns

\c true unless there's an irreducible backedge.

llvm::BlockFrequencyInfoImpl::rpot_iterator
rpot_begin() const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:874

llvm::BlockFrequencyInfoImpl::rpot_iterator
rpot_end() const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:875

void setBlockFreq(
    const llvm::BlockFrequencyInfoImpl::BlockT*
        BB,
    uint64_t Freq)

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1041

Parameters

const llvm::BlockFrequencyInfoImpl::BlockT* BB
uint64_t Freq

bool tryToComputeMassInFunction()

Description

Try to compute mass in the top-level function. Assign mass to the entry block, and then for each block in reverse post-order, distribute mass to its successors. Skips nodes that have been packaged into loops.

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:930

Returns

\c true unless there's an irreducible backedge.

void verifyMatch(
    BlockFrequencyInfoImpl<BT>& Other) const

Declared at: llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h:1076

Parameters

BlockFrequencyInfoImpl<BT>& Other