class DoubleAPFloat

Declaration

class DoubleAPFloat : public APFloatBase { /* full declaration omitted */ };

Description

A self-contained host- and target-independent arbitrary-precision floating-point software implementation. APFloat uses bignum integer arithmetic as provided by static functions in the APInt class. The library will work with bignum integers whose parts are any unsigned type at least 16 bits wide, but 64 bits is recommended.

It is written for clarity rather than speed, in particular with a view to use in the front-end of a cross compiler, so that target arithmetic can be correctly performed on the host. Performance should nonetheless be reasonable, particularly for its intended use. It may be useful as a base implementation for a run-time library during development of a faster target-specific one.

All 5 rounding modes in the IEEE-754R draft are handled correctly for all implemented operations. Currently implemented operations are add, subtract, multiply, divide, fused-multiply-add, conversion-to-float, conversion-to-integer and conversion-from-integer. New rounding modes (e.g. away from zero) can be added with three or four lines of code.

Four formats are built in: IEEE single precision, double precision, quadruple precision, and x87 80-bit extended double (when operating with full extended precision). Adding a new format that obeys IEEE semantics only requires adding two lines of code: a declaration and definition of the format.

All operations return the status of that operation as an exception bit-mask, so multiple operations can be done consecutively with their results or-ed together. The returned status can be useful for compiler diagnostics; e.g., inexact, underflow and overflow can be easily diagnosed on constant folding, and compiler optimizers can determine what exceptions would be raised by folding operations and optimize, or perhaps not optimize, accordingly. At present, underflow tininess is detected after rounding; it should be straightforward to add support for the before-rounding case too.
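
The idea of returning a status bit-mask and or-ing the results of consecutive operations can be sketched in standalone C++. This is an illustration only, not LLVM code: the `Status` enum, its values, and the `toInt32` helper are hypothetical names chosen for this example, and the conversion uses host `double` arithmetic.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical status flags for this sketch; each flag occupies its own bit
// so that several results can be combined with bitwise or.
enum Status : unsigned {
  StatusOK = 0x00,
  StatusInvalidOp = 0x01,
  StatusInexact = 0x10,
};

// Hypothetical helper: converts V to a 32-bit signed integer, truncating
// toward zero, and reports what happened as a status mask.
unsigned toInt32(double V, std::int32_t &Out) {
  // NaN, infinities and out-of-range values cannot be converted.
  if (std::isnan(V) || V >= 2147483648.0 || V <= -2147483649.0) {
    Out = 0;
    return StatusInvalidOp;
  }
  double Truncated = std::trunc(V);
  Out = static_cast<std::int32_t>(Truncated);
  // The conversion is inexact when a fractional part was discarded.
  return Truncated == V ? StatusOK : StatusInexact;
}
```

A caller can run several conversions and inspect one combined mask afterwards, for example `unsigned S = toInt32(2.5, A) | toInt32(1e10, B);`, which sets both the inexact and the invalid-operation bits.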
The library reads hexadecimal floating point numbers as per C99, and correctly rounds if necessary according to the specified rounding mode. Syntax is required to have been validated by the caller. It also converts floating point numbers to hexadecimal text as per the C99 %a and %A conversions. The output precision (or alternatively the natural minimal precision) can be specified; if the requested precision is less than the natural precision, the output is correctly rounded for the specified rounding mode.

It also reads decimal floating point numbers and correctly rounds according to the specified rounding mode. Conversion to decimal text is not currently implemented.

Non-zero finite numbers are represented internally as a sign bit, a 16-bit signed exponent, and the significand as an array of integer parts. After normalization of a number of precision P, the exponent is within the range of the format, and if the number is not denormal the P-th bit of the significand is set as an explicit integer bit. For denormals the most significant bit is shifted right so that the exponent is maintained at the format's minimum, so that the smallest denormal has just the least significant bit of the significand set.

The sign of zeroes and infinities is significant; the exponent and significand of such numbers are not stored, but have known implicit (deterministic) values: 0 for the significands, 0 for the zero exponent, and all 1 bits for the infinity exponent. For NaNs the sign and significand are deterministic, although not really meaningful, and are preserved in non-conversion operations. The exponent is implicitly all 1 bits.

APFloat does not provide any exception handling beyond default exception handling. Signaling NaNs are represented following the "should" clause of IEEE-754R 2008 section 6.2.1: a signaling NaN is encoded with the first bit of its trailing significand set to 0.

TODO: Some features that may or may not be worth adding:

  • Binary to decimal conversion (hard).
  • Optional ability to detect underflow tininess before rounding.
  • New formats: x87 in single and double precision mode (IEEE apart from extended exponent range) (hard).
  • New operations: sqrt, IEEE remainder, C90 fmod, nexttoward.
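
For readers unfamiliar with the C99 hexadecimal floating-point syntax mentioned in the description, the following sketch shows the text format using the host C library rather than APFloat. The helper names `parseHexFloat` and `toHexFloat` are hypothetical. In this notation `0x1.8p1` means 1.5 × 2^1, i.e. 3.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Parses a C99 hexadecimal floating-point string with the host library.
// As with the library described above, the syntax is assumed to be valid.
double parseHexFloat(const std::string &Text) {
  return std::strtod(Text.c_str(), nullptr);
}

// Formats a double using the C99 %a conversion at its natural precision.
// The exact spelling of the result may vary between C libraries.
std::string toHexFloat(double V) {
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "%a", V);
  return std::string(Buf);
}
```

Because a hexadecimal significand maps directly onto binary digits, formatting at natural precision and parsing the result back should reproduce the original value exactly, which is not guaranteed for short decimal text.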

Declared at: llvm/include/llvm/ADT/APFloat.h:602

Inherits from: APFloatBase

Member Variables

private const llvm::fltSemantics* Semantics
private std::unique_ptr<APFloat[]> Floats

Inherited from APFloatBase:

public static integerPartWidth = APInt::APINT_BITS_PER_WORD
public static rmNearestTiesToEven = RoundingMode::NearestTiesToEven
public static rmTowardPositive = RoundingMode::TowardPositive
public static rmTowardNegative = RoundingMode::TowardNegative
public static rmTowardZero = RoundingMode::TowardZero
public static rmNearestTiesToAway = RoundingMode::NearestTiesToAway

Method Overview

  • public DoubleAPFloat(const llvm::fltSemantics & S)
  • public DoubleAPFloat(const llvm::fltSemantics & S, llvm::APFloatBase::uninitializedTag)
  • public DoubleAPFloat(const llvm::fltSemantics & S, llvm::APFloatBase::integerPart)
  • public DoubleAPFloat(const llvm::fltSemantics & S, const llvm::APInt & I)
  • public DoubleAPFloat(const llvm::fltSemantics & S, llvm::APFloat && First, llvm::APFloat && Second)
  • public DoubleAPFloat(const llvm::detail::DoubleAPFloat & RHS)
  • public DoubleAPFloat(llvm::detail::DoubleAPFloat && RHS)
  • public llvm::APFloatBase::opStatus add(const llvm::detail::DoubleAPFloat & RHS, llvm::APFloatBase::roundingMode RM)
  • private llvm::APFloatBase::opStatus addImpl(const llvm::APFloat & a, const llvm::APFloat & aa, const llvm::APFloat & c, const llvm::APFloat & cc, llvm::APFloatBase::roundingMode RM)
  • private llvm::APFloatBase::opStatus addWithSpecial(const llvm::detail::DoubleAPFloat & LHS, const llvm::detail::DoubleAPFloat & RHS, llvm::detail::DoubleAPFloat & Out, llvm::APFloatBase::roundingMode RM)
  • public llvm::APInt bitcastToAPInt() const
  • public bool bitwiseIsEqual(const llvm::detail::DoubleAPFloat & RHS) const
  • public void changeSign()
  • public llvm::APFloatBase::cmpResult compare(const llvm::detail::DoubleAPFloat & RHS) const
  • public llvm::APFloatBase::cmpResult compareAbsoluteValue(const llvm::detail::DoubleAPFloat & RHS) const
  • public llvm::APFloatBase::opStatus convertFromAPInt(const llvm::APInt & Input, bool IsSigned, llvm::APFloatBase::roundingMode RM)
  • public llvm::APFloatBase::opStatus convertFromSignExtendedInteger(const llvm::APFloatBase::integerPart * Input, unsigned int InputSize, bool IsSigned, llvm::APFloatBase::roundingMode RM)
  • public Expected<llvm::APFloatBase::opStatus> convertFromString(llvm::StringRef, llvm::APFloatBase::roundingMode)
  • public llvm::APFloatBase::opStatus convertFromZeroExtendedInteger(const llvm::APFloatBase::integerPart * Input, unsigned int InputSize, bool IsSigned, llvm::APFloatBase::roundingMode RM)
  • public unsigned int convertToHexString(char * DST, unsigned int HexDigits, bool UpperCase, llvm::APFloatBase::roundingMode RM) const
  • public llvm::APFloatBase::opStatus convertToInteger(MutableArrayRef<llvm::APFloatBase::integerPart> Input, unsigned int Width, bool IsSigned, llvm::APFloatBase::roundingMode RM, bool * IsExact) const
  • public llvm::APFloatBase::opStatus divide(const llvm::detail::DoubleAPFloat & RHS, llvm::APFloatBase::roundingMode RM)
  • public llvm::APFloatBase::opStatus fusedMultiplyAdd(const llvm::detail::DoubleAPFloat & Multiplicand, const llvm::detail::DoubleAPFloat & Addend, llvm::APFloatBase::roundingMode RM)
  • public llvm::APFloatBase::fltCategory getCategory() const
  • public bool getExactInverse(llvm::APFloat * inv) const
  • public llvm::APFloat & getFirst()
  • public const llvm::APFloat & getFirst() const
  • public llvm::APFloat & getSecond()
  • public const llvm::APFloat & getSecond() const
  • public bool isDenormal() const
  • public bool isInteger() const
  • public bool isLargest() const
  • public bool isNegative() const
  • public bool isSmallest() const
  • public void makeInf(bool Neg)
  • public void makeLargest(bool Neg)
  • public void makeNaN(bool SNaN, bool Neg, const llvm::APInt * fill)
  • public void makeSmallest(bool Neg)
  • public void makeSmallestNormalized(bool Neg)
  • public void makeZero(bool Neg)
  • public llvm::APFloatBase::opStatus mod(const llvm::detail::DoubleAPFloat & RHS)
  • public llvm::APFloatBase::opStatus multiply(const llvm::detail::DoubleAPFloat & RHS, llvm::APFloatBase::roundingMode RM)
  • public bool needsCleanup() const
  • public llvm::APFloatBase::opStatus next(bool nextDown)
  • public llvm::APFloatBase::opStatus remainder(const llvm::detail::DoubleAPFloat & RHS)
  • public llvm::APFloatBase::opStatus roundToIntegral(llvm::APFloatBase::roundingMode RM)
  • public llvm::APFloatBase::opStatus subtract(const llvm::detail::DoubleAPFloat & RHS, llvm::APFloatBase::roundingMode RM)
  • public void toString(SmallVectorImpl<char> & Str, unsigned int FormatPrecision, unsigned int FormatMaxPadding, bool TruncateZero = true) const

Methods

DoubleAPFloat(const llvm::fltSemantics& S)

Declared at: llvm/include/llvm/ADT/APFloat.h:614

Parameters

const llvm::fltSemantics& S

DoubleAPFloat(const llvm::fltSemantics& S,
              llvm::APFloatBase::uninitializedTag)

Declared at: llvm/include/llvm/ADT/APFloat.h:615

Parameters

const llvm::fltSemantics& S
llvm::APFloatBase::uninitializedTag

DoubleAPFloat(const llvm::fltSemantics& S,
              llvm::APFloatBase::integerPart)

Declared at: llvm/include/llvm/ADT/APFloat.h:616

Parameters

const llvm::fltSemantics& S
llvm::APFloatBase::integerPart

DoubleAPFloat(const llvm::fltSemantics& S,
              const llvm::APInt& I)

Declared at: llvm/include/llvm/ADT/APFloat.h:617

Parameters

const llvm::fltSemantics& S
const llvm::APInt& I

DoubleAPFloat(const llvm::fltSemantics& S,
              llvm::APFloat&& First,
              llvm::APFloat&& Second)

Declared at: llvm/include/llvm/ADT/APFloat.h:618

Parameters

const llvm::fltSemantics& S
llvm::APFloat&& First
llvm::APFloat&& Second

DoubleAPFloat(
    const llvm::detail::DoubleAPFloat& RHS)

Declared at: llvm/include/llvm/ADT/APFloat.h:619

Parameters

const llvm::detail::DoubleAPFloat& RHS

DoubleAPFloat(llvm::detail::DoubleAPFloat&& RHS)

Declared at: llvm/include/llvm/ADT/APFloat.h:620

Parameters

llvm::detail::DoubleAPFloat&& RHS

llvm::APFloatBase::opStatus add(
    const llvm::detail::DoubleAPFloat& RHS,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:639

Parameters

const llvm::detail::DoubleAPFloat& RHS
llvm::APFloatBase::roundingMode RM

llvm::APFloatBase::opStatus addImpl(
    const llvm::APFloat& a,
    const llvm::APFloat& aa,
    const llvm::APFloat& c,
    const llvm::APFloat& cc,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:607

Parameters

const llvm::APFloat& a
const llvm::APFloat& aa
const llvm::APFloat& c
const llvm::APFloat& cc
llvm::APFloatBase::roundingMode RM

llvm::APFloatBase::opStatus addWithSpecial(
    const llvm::detail::DoubleAPFloat& LHS,
    const llvm::detail::DoubleAPFloat& RHS,
    llvm::detail::DoubleAPFloat& Out,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:610

Parameters

const llvm::detail::DoubleAPFloat& LHS
const llvm::detail::DoubleAPFloat& RHS
llvm::detail::DoubleAPFloat& Out
llvm::APFloatBase::roundingMode RM

llvm::APInt bitcastToAPInt() const

Declared at: llvm/include/llvm/ADT/APFloat.h:663

bool bitwiseIsEqual(
    const llvm::detail::DoubleAPFloat& RHS) const

Declared at: llvm/include/llvm/ADT/APFloat.h:662

Parameters

const llvm::detail::DoubleAPFloat& RHS

void changeSign()

Declared at: llvm/include/llvm/ADT/APFloat.h:648

llvm::APFloatBase::cmpResult compare(
    const llvm::detail::DoubleAPFloat& RHS) const

Declared at: llvm/include/llvm/ADT/APFloat.h:661

Parameters

const llvm::detail::DoubleAPFloat& RHS
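
The difference between a value comparison such as compare and a representation comparison such as bitwiseIsEqual can be illustrated with host doubles. The sketch below is not LLVM code: `CmpResult`, `compareValues` and `bitwiseEqual` are hypothetical stand-ins, and the four-way result is an assumption about how an IEEE-style comparison reports NaN operands.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical four-way comparison result.
enum CmpResult { CmpLessThan, CmpEqual, CmpGreaterThan, CmpUnordered };

// Compares by numeric value; any NaN operand makes the result unordered.
CmpResult compareValues(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return CmpUnordered;
  if (A < B)
    return CmpLessThan;
  if (A > B)
    return CmpGreaterThan;
  return CmpEqual;
}

// Compares the stored bit patterns, ignoring numeric meaning.
bool bitwiseEqual(double A, double B) {
  std::uint64_t X, Y;
  std::memcpy(&X, &A, sizeof X);
  std::memcpy(&Y, &B, sizeof Y);
  return X == Y;
}
```

The two notions disagree in both directions: +0 and -0 compare equal but have different bits, while a NaN has the same bits as itself yet is unordered with respect to itself.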

llvm::APFloatBase::cmpResult compareAbsoluteValue(
    const llvm::detail::DoubleAPFloat& RHS) const

Declared at: llvm/include/llvm/ADT/APFloat.h:649

Parameters

const llvm::detail::DoubleAPFloat& RHS

llvm::APFloatBase::opStatus convertFromAPInt(
    const llvm::APInt& Input,
    bool IsSigned,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:670

Parameters

const llvm::APInt& Input
bool IsSigned
llvm::APFloatBase::roundingMode RM

llvm::APFloatBase::opStatus
convertFromSignExtendedInteger(
    const llvm::APFloatBase::integerPart* Input,
    unsigned int InputSize,
    bool IsSigned,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:671

Parameters

const llvm::APFloatBase::integerPart* Input
unsigned int InputSize
bool IsSigned
llvm::APFloatBase::roundingMode RM

Expected<llvm::APFloatBase::opStatus>
convertFromString(llvm::StringRef,
                  llvm::APFloatBase::roundingMode)

Declared at: llvm/include/llvm/ADT/APFloat.h:664

Parameters

llvm::StringRef
llvm::APFloatBase::roundingMode

llvm::APFloatBase::opStatus
convertFromZeroExtendedInteger(
    const llvm::APFloatBase::integerPart* Input,
    unsigned int InputSize,
    bool IsSigned,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:674

Parameters

const llvm::APFloatBase::integerPart* Input
unsigned int InputSize
bool IsSigned
llvm::APFloatBase::roundingMode RM

unsigned int convertToHexString(
    char* DST,
    unsigned int HexDigits,
    bool UpperCase,
    llvm::APFloatBase::roundingMode RM) const

Declared at: llvm/include/llvm/ADT/APFloat.h:677

Parameters

char* DST
unsigned int HexDigits
bool UpperCase
llvm::APFloatBase::roundingMode RM

llvm::APFloatBase::opStatus convertToInteger(
    MutableArrayRef<
        llvm::APFloatBase::integerPart> Input,
    unsigned int Width,
    bool IsSigned,
    llvm::APFloatBase::roundingMode RM,
    bool* IsExact) const

Declared at: llvm/include/llvm/ADT/APFloat.h:667

Parameters

MutableArrayRef<llvm::APFloatBase::integerPart> Input
unsigned int Width
bool IsSigned
llvm::APFloatBase::roundingMode RM
bool* IsExact

llvm::APFloatBase::opStatus divide(
    const llvm::detail::DoubleAPFloat& RHS,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:642

Parameters

const llvm::detail::DoubleAPFloat& RHS
llvm::APFloatBase::roundingMode RM

llvm::APFloatBase::opStatus fusedMultiplyAdd(
    const llvm::detail::DoubleAPFloat&
        Multiplicand,
    const llvm::detail::DoubleAPFloat& Addend,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:645

Parameters

const llvm::detail::DoubleAPFloat& Multiplicand
const llvm::detail::DoubleAPFloat& Addend
llvm::APFloatBase::roundingMode RM
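
What distinguishes a fused multiply-add from a separate multiply and add is the number of roundings. The following sketch shows this with host doubles and `std::fma`; it does not use DoubleAPFloat, and the function names are hypothetical.

```cpp
#include <cmath>

// Two roundings: the product is rounded to double, then the sum is rounded.
double mulThenAdd(double A, double B, double C) {
  // volatile keeps the compiler from contracting this into a fused operation.
  volatile double Product = A * B;
  return Product + C;
}

// One rounding: the exact value of A * B + C is rounded a single time.
double fusedMulAdd(double A, double B, double C) {
  return std::fma(A, B, C);
}
```

With A = B = 1 + 2^-27 and C = -(1 + 2^-26), the exact product is 1 + 2^-26 + 2^-54. Rounding the product first drops the 2^-54 term, so the two-step version returns 0, while the fused version returns 2^-54.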

llvm::APFloatBase::fltCategory getCategory() const

Declared at: llvm/include/llvm/ADT/APFloat.h:651
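
The class description distinguishes non-zero finite numbers, zeroes, infinities and NaNs. A classification along those lines can be sketched for host doubles as follows. The `Category` enum and `categorize` function are hypothetical, and treating denormals as part of the non-zero finite category is an assumption of this sketch.

```cpp
#include <cmath>

// Hypothetical category enum with one entry per kind of value.
enum Category { CatInfinity, CatNaN, CatNormal, CatZero };

Category categorize(double V) {
  if (std::isnan(V))
    return CatNaN;
  if (std::isinf(V))
    return CatInfinity;
  if (V == 0.0)
    return CatZero;  // matches both +0 and -0
  return CatNormal;  // any non-zero finite value, denormals included
}
```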

bool getExactInverse(llvm::APFloat* inv) const

Declared at: llvm/include/llvm/ADT/APFloat.h:688

Parameters

llvm::APFloat* inv
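
The page does not describe getExactInverse, so the following is only a sketch of the usual meaning of an "exact inverse": the reciprocal is reported only when it is exactly representable, which for binary floating point means the value is a power of two. The code works on host doubles, `exactInverse` is a hypothetical name, and rejecting reciprocals that are not normal numbers is an assumption of this sketch.

```cpp
#include <cmath>

// Returns true and stores 1/V in *Inv (if Inv is non-null) only when the
// reciprocal is exact: V must be a finite, non-zero power of two and the
// reciprocal must be a normal number.
bool exactInverse(double V, double *Inv) {
  if (!std::isfinite(V) || V == 0.0)
    return false;
  int Exp;
  // frexp yields V == Mant * 2^Exp with |Mant| in [0.5, 1).
  double Mant = std::frexp(V, &Exp);
  if (std::fabs(Mant) != 0.5)
    return false;  // not a power of two
  double Reciprocal = 1.0 / V;
  if (!std::isnormal(Reciprocal))
    return false;  // overflowed or became denormal
  if (Inv)
    *Inv = Reciprocal;
  return true;
}
```

A typical use is replacing a division by a constant with a multiplication, which preserves the result only when the reciprocal is exact.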

llvm::APFloat& getFirst()

Declared at: llvm/include/llvm/ADT/APFloat.h:634

const llvm::APFloat& getFirst() const

Declared at: llvm/include/llvm/ADT/APFloat.h:635

llvm::APFloat& getSecond()

Declared at: llvm/include/llvm/ADT/APFloat.h:636

const llvm::APFloat& getSecond() const

Declared at: llvm/include/llvm/ADT/APFloat.h:637

bool isDenormal() const

Declared at: llvm/include/llvm/ADT/APFloat.h:680

bool isInteger() const

Declared at: llvm/include/llvm/ADT/APFloat.h:683

bool isLargest() const

Declared at: llvm/include/llvm/ADT/APFloat.h:682

bool isNegative() const

Declared at: llvm/include/llvm/ADT/APFloat.h:652

bool isSmallest() const

Declared at: llvm/include/llvm/ADT/APFloat.h:681

void makeInf(bool Neg)

Declared at: llvm/include/llvm/ADT/APFloat.h:654

Parameters

bool Neg

void makeLargest(bool Neg)

Declared at: llvm/include/llvm/ADT/APFloat.h:656

Parameters

bool Neg

void makeNaN(bool SNaN,
             bool Neg,
             const llvm::APInt* fill)

Declared at: llvm/include/llvm/ADT/APFloat.h:659

Parameters

bool SNaN
bool Neg
const llvm::APInt* fill

void makeSmallest(bool Neg)

Declared at: llvm/include/llvm/ADT/APFloat.h:657

Parameters

bool Neg

void makeSmallestNormalized(bool Neg)

Declared at: llvm/include/llvm/ADT/APFloat.h:658

Parameters

bool Neg

void makeZero(bool Neg)

Declared at: llvm/include/llvm/ADT/APFloat.h:655

Parameters

bool Neg
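
Judging by their names, makeSmallest, makeSmallestNormalized and makeLargest set the value to the smallest denormal, the smallest normal, and the largest finite magnitude of the format, with Neg selecting the sign. That reading is an assumption. The corresponding values for a host double can be obtained from `std::numeric_limits`, as in this sketch with hypothetical names.

```cpp
#include <limits>

// The three extreme finite magnitudes of the host double format.
struct Extremes {
  double Smallest;            // smallest denormal
  double SmallestNormalized;  // smallest normal number
  double Largest;             // largest finite number
};

Extremes doubleExtremes(bool Neg) {
  typedef std::numeric_limits<double> Limits;
  double Sign = Neg ? -1.0 : 1.0;
  return {Sign * Limits::denorm_min(), Sign * Limits::min(),
          Sign * Limits::max()};
}
```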

llvm::APFloatBase::opStatus mod(
    const llvm::detail::DoubleAPFloat& RHS)

Declared at: llvm/include/llvm/ADT/APFloat.h:644

Parameters

const llvm::detail::DoubleAPFloat& RHS

llvm::APFloatBase::opStatus multiply(
    const llvm::detail::DoubleAPFloat& RHS,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:641

Parameters

const llvm::detail::DoubleAPFloat& RHS
llvm::APFloatBase::roundingMode RM

bool needsCleanup() const

Declared at: llvm/include/llvm/ADT/APFloat.h:632

llvm::APFloatBase::opStatus next(bool nextDown)

Declared at: llvm/include/llvm/ADT/APFloat.h:665

Parameters

bool nextDown
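
Assuming next steps to the adjacent representable value, downward when nextDown is true and upward otherwise, the same operation on a host double can be sketched with `std::nextafter`. The wrapper name `nextValue` is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Returns the representable double adjacent to V in the chosen direction.
double nextValue(double V, bool NextDown) {
  double Infinity = std::numeric_limits<double>::infinity();
  return std::nextafter(V, NextDown ? -Infinity : Infinity);
}
```

The spacing is not symmetric around powers of two: the value just above 1.0 is 1 + 2^-52, while the value just below it is 1 - 2^-53.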

llvm::APFloatBase::opStatus remainder(
    const llvm::detail::DoubleAPFloat& RHS)

Declared at: llvm/include/llvm/ADT/APFloat.h:643

Parameters

const llvm::detail::DoubleAPFloat& RHS
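
The class description lists "IEEE remainder" and "C90 fmod" as distinct operations. Assuming remainder and mod above correspond to those two definitions respectively, the difference can be seen with the host library functions `std::remainder` and `std::fmod`. The wrapper names are hypothetical.

```cpp
#include <cmath>

// C fmod: the quotient is truncated toward zero, so the result has the
// sign of X and a magnitude smaller than |Y|.
double cMod(double X, double Y) { return std::fmod(X, Y); }

// IEEE remainder: the quotient is rounded to the nearest integer (ties to
// even), so the result lies in [-|Y|/2, |Y|/2].
double ieeeRemainder(double X, double Y) { return std::remainder(X, Y); }
```

For 5 and 3, fmod gives 2 (quotient 1) while the IEEE remainder gives -1 (quotient 2, the nearest integer to 5/3).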

llvm::APFloatBase::opStatus roundToIntegral(
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:647

Parameters

llvm::APFloatBase::roundingMode RM
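
The effect of each of the five rounding modes on rounding to an integral value can be sketched with host doubles. The `Rounding` enum and `roundToIntegralSketch` function are hypothetical; the mode names mirror the inherited constants listed under Member Variables. The sketch assumes the host floating-point environment is in its default round-to-nearest mode, which `std::nearbyint` relies on.

```cpp
#include <cmath>

// Hypothetical enum naming the five rounding modes.
enum class Rounding {
  NearestTiesToEven,
  TowardPositive,
  TowardNegative,
  TowardZero,
  NearestTiesToAway,
};

double roundToIntegralSketch(double V, Rounding RM) {
  switch (RM) {
  case Rounding::TowardPositive:
    return std::ceil(V);
  case Rounding::TowardNegative:
    return std::floor(V);
  case Rounding::TowardZero:
    return std::trunc(V);
  case Rounding::NearestTiesToAway:
    return std::round(V);  // halfway cases go away from zero
  case Rounding::NearestTiesToEven:
    break;
  }
  // Assumes the default rounding mode, in which nearbyint rounds halfway
  // cases to the even neighbor.
  return std::nearbyint(V);
}
```

The modes only differ on non-integral inputs. For 2.5, ties-to-even gives 2 and ties-to-away gives 3; for -2.5, toward-positive and toward-zero give -2 while toward-negative gives -3.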

llvm::APFloatBase::opStatus subtract(
    const llvm::detail::DoubleAPFloat& RHS,
    llvm::APFloatBase::roundingMode RM)

Declared at: llvm/include/llvm/ADT/APFloat.h:640

Parameters

const llvm::detail::DoubleAPFloat& RHS
llvm::APFloatBase::roundingMode RM

void toString(SmallVectorImpl<char>& Str,
              unsigned int FormatPrecision,
              unsigned int FormatMaxPadding,
              bool TruncateZero = true) const

Declared at: llvm/include/llvm/ADT/APFloat.h:685

Parameters

SmallVectorImpl<char>& Str
unsigned int FormatPrecision
unsigned int FormatMaxPadding
bool TruncateZero = true