BigRational for .NET 6.0, .NET 7 and .NET 8.0 alpha

A new implementation is under development:
BigRat.cs (as single file solution, just copy and try)
New basics, smaller, more performance on operator level, description in short.
Maybe already the better choice.

Introduction

BigRational - a novel rational number class.

There are already many implementations of rational number classes called Rational, BigRat, BigRational, Fraction etc.
For C# the useful ones are based on a combination of System.Numerics.BigInteger for numerator and denominator,
which in fact has the benefit of a really simple implementation.
It works well, but in practice has several disadvantages:

Performance is limited by System.Numerics.BigInteger which is not primarily designed for rational arithmetic.
Using BigInteger leads to a lot of unnecessary memory allocations for all intermediate results at runtime which means extra work for the GC.
Based on BigInteger the struct size of a Rational is unnecessary big, therefore also inefficient to use in Vector or Matrix classes.
The system requires a lot of internal memory copies bad for the overall performance.
There is no way to benefit from the performance improvements in .NET Core, like using Spans, Buffers and Intrinsics.

BigRational is based on a completely new system to achieve better efficiency and performance.
The interface itself is of course the same as for all other numeric types, supporting the basic arithmetic operations via operators, conversions from and to the numeric system types, string formatting and parsing, binary serialization, etc.
As an immutable, readonly struct with only one array field, the layout is as small as possible and similar to a System.String.
Internal the class has its own big-integer calculation core, highly optimized for the purposes of rational arithmetics, encapsulated in a stack machine.
This stack machine is public and offers a second interface layer with a machine language like instruction set.
Using this interface allows optimization on function level with performance improvements by factor 4 to 10.

Example: Classic cross product for a Vector3 type, this one with: BigRational X, Y, Z;

public static Vector3R Cross(Vector3R a, Vector3R b)
{ 
  var x = a.Y * b.Z - a.Z * b.Y; 
  var y = a.Z * b.X - a.X * b.Z;
  var z = a.X * b.Y - a.Y * b.X;
  return new Vector3R(x, y, z);
}

The same function can be optimized using the stack machine (cpu):

public static Vector3R Cross(Vector3R a, Vector3R b)
{
  var cpu = BigRational.task_cpu;
  cpu.mul(a.X, b.Y); cpu.mul(a.Y, b.X); cpu.sub();
  cpu.mul(a.Z, b.X); cpu.mul(a.X, b.Z); cpu.sub();
  cpu.mul(a.Y, b.Z); cpu.mul(a.Z, b.Y); cpu.sub();
  return new Vector3R(cpu.pop_rat(), cpu.pop_rat(), cpu.pop_rat());
}

The optimized version performs 4 x faster, only 3 final normalizations and memory allocs are necessary (instead of 9 for the first version).

As the example shows, using the stack machine directly is not much more difficult or less readable.
Furthermore, this design separates the BigRational struct from the calculation core, the final number object is always normalized what makes the handling, comparsions etc. much easier and doesn't impact the performance of the calculations.

Data Layout

All data, numerator and denominator, is stored in a single variable-length uint array.
This gives the BigRational struct itself a machine word size and therefore can be passed around internal as fast as possible.

public readonly struct BigRational
{
  private readonly uint[] p; // the one and only field

The array starts with a uint header containing a sign-bit and the number of following uint digits for the numerator.
Immediately followed by the same structure for the denominator. The order is little-endian.
A null array, as default value of the struct, represents the number 0 (zero).

Rat P Struct

This simple structure has several advantages:

Exchange, transfer, serialization and marshalling is easy, for example via COM.
The calculation core can directly access the integer parts without conversions.
As Span it is very efficient to build vectors or multidimensional matrices.
The content is easy to debug.

The disadvantage is that small numbers require an array even if they would fit in a machine word.
This is the price for a optimal throughput and less code as there are no special cases to handle.

Stack machine

The stack machine, named CPU, is implemented as nested class in BigRational to get access to the private member.
The class has only two private fields: An array of arrays of uint (uint[][] p) and an index (int i) that marks the current stack top.

public struct BigRational 
{  
  public class CPU
  {
    uint[][] p; int i;

A push operation increases the index and can use the buffer at p[i], which mostly already has the required size.
A pop operation does nothing else than decrement the index.
Each uint[] buffer in the stack represents a rational number with the same data layout as in struct BigRational.

Internally the stack is used excessively for all basic arithmetic operations and so no further memory management is necessary.
However, in consequence, for every single calculation step with BigRational numbers it needs a instance of such stack-machine object what is only efficient when it is as shared object always available. There is a thread-local static instance called task_cpu for this, which is exposed as a static property in BigRational.

public static CPU task_cpu
{
  get { return _cpu ??= new CPU(); }
}
[ThreadStatic] private static CPU? _cpu;

Thread static means that each thread has its own instance of a task_cpu when needed, ensuring thread safety.

This simple system has the advantage that it is also easy to debug:

The system implies that the stack machine exposes an instruction set.
These instructions, for readability as short words, are self explaining basic instructions like add(), sub(), mul(), div(), different push() and pop(), bit level operations like shl(), shr(), but also higher level instructions like pow(), exp(), log(), sin(), tan() etc.

Since the rational numbers can have arbitrary size, it is important for the performance to avoid unnecessary memory copies.
This explains several versions of same instructions, for example:

  cpu.push(x); // implies a memory copy
  cpu.push(y); // implies a memory copy
  cpu.add();  
  // For performance it is better to write:
  cpu.add(x, y); // no copies necessary, same result

In general there is a difference to other stack machines, where the entries can be quickly copied, duplicated etc. since they are mostly in machine word size.
Here we have buffers of arbitrary size as stack entries and for performance it is all about to avoid unnecessary memory copies.
To do this there are swap instructions, cpu.swp(), which internally exchange buffer pointers and this operation is fast since a pointer has machine word size.

String formatting

Compared to other numeric types, string formatting for Rational has some special features.
First of all, there is support for all Standard- and the Custom formats for numeric types like double and decimal and
is exposed by the interface functions of IFormattable and ISpanFormattable.
If a format is specified, the output should be identical to that of an equivalent double value, with the same conventions for rounding, exponential form etc.
Selfexplaining, with the difference that the specified number of decimal digits for rational is not limited.

The default ToString already shows differences.
First there is the automatic detection of repeating decimals that allow a exact string representation when possible.
The position where the repetition starts is marked bei a ' (aphostrop):

  x = (BigRational)1 / 3;
  s = x.ToString();       // "0,'3"    instead of "0.3333333333333..."	
  x = (BigRational)3227 / 555;
  s = x.ToString();       // "5.8'144" instead of "5.8144144144144..."

The adequate parse function supports the extended syntax completely:

  x = BigRational.Parse("0.'3");
  x = BigRational.Parse("123.456'789");

For standard ToString the number of decimal places is limited to 32 by default since there must be some limit.
If the number has a higher precision than will fit in this range, … (horizontal ellipsis, U+2026) are appendet.

  x = MathR.PI(digits: 20); 
  s = x.ToString();         // "3.14159265358979323846"      
  x = MathR.PI(digits: 40);
  s = x.ToString();         // "3.1415926535897932384626433832795…"
  x = MathR.PI(digits: 1000) * BigRational.Pow(10, 1000);
  s = x.ToString();         // "3.1415926535897932384626433832795…E+1000"

This is necessary to distinguish between exact number representations and reduced ones.
eg. "0.3333" for BigRational is exactly 0.3333 and not 1/3 or "0.'3".
So the trailing ellipsis … makes it absolutely clear that the string representation doesn't exactly match the original numeric value.
NOTE: In this reduced form the digits are not rounded.

For compatibility, no decimal repetition marks ' or trailing ellipsis … are inserted when using standard formats.
Furthermore, format R (Round Trip), since this format has to ensure that the number can read back without loss of precision, for the case that the decimal representation does not fit in the (specified) range, the number will formatted in fraction representation. eg."1/3"
Therefore and in difference to other numbers, the number extension for R has also a meaning:

  x = MathR.PI(32);
  s = x.ToString();         // "3.1415926535897932384626433832795…"      
  s = x.ToString("E");      // "3.141593E+000"
  s = x.ToString("e10");    // "3.1415926536e+000"
  s = x.ToString("E40");    // "3.1415926535897932384626433832795000000000E+000"
  s = x.ToString("F");      // "3.14"
  s = x.ToString("F40");    // "3.1415926535897932384626433832795000000000"
  s = x.ToString("R");      // "6283185307179586476925286766559/2000000000000000000000000000000"
  s = x.ToString("R40");    // "3.1415926535897932384626433832795"
  s = x.ToString("0.####"); // "3.1416"

Furthermore there are additional non-standard formats.
Format L is identic to standard format F but without trailing zeros.

  x = MathR.PI(32);
  s = x.ToString("F40");    // "3.1415926535897932384626433832795000000000"
  s = x.ToString("L40");    // "3.1415926535897932384626433832795"
  x = MathR.PI(1000);
  s = x.ToString("L5000");  // "3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819644288109756659334461284756482337867831652712019091456485669234603486104543266482133936072602491412737245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094330572703657595919530921861173819326117931051185480744623799627495673518857527248912279381830119491298336733624406566430860213949463952247371907021798609437027705392171762931767523846748184676694051320005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235420199561121290219608640344181598136297747713099605187072113499999983729780499510597317328160963185950244594553469083026425223082533446850352619311881710100031378387528865875332083814206171776691473035982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989"

Format Q as a non-standard format for an exact representation as fraction:

  x = (BigRational)8 / 7; 
  s = x.ToString();         // "1.'142857"      
  s = x.ToString("Q");      // "8/7"
  x = MathR.Sqrt(2, 30);
  s = x.ToString();         // "1.4142135623730950488016887242096…"      
  s = x.ToString("Q");      // "141421356237309504880168872421/100000000000000000000000000000"

Exceptions

Division and modulo with BigRational throws a System.DivideByZeroException if the dividend is 0 (zero).
No other type of numeric exception is thrown, since overflows are not inherent.
However, there is a possibility to get a System.OverflowException when a number with its digits is too big to fit in a C# array.
Currently, the maximum size of an array for .NET is 2GB, which corresponds to a theoretical maximum BigRational integer value of: 2^34359738240 - 1.

No arithmetic exceptions are thrown when using the stack machine. Division by zero gives a NaN value as for floating point arithmetic.
Therefore, no try-catch handling is required at this interface layer to keep the stack in track.
However, a wrong stack index as program bug leads of course to an IndexOutOfRangeException.

Performance

In general, the performance increases by at least a factor of 2, by using the stack machine up to a factor of 20 and more.
This applies especially to geometric algorithms where precision is important and not so much calculating with gigantic numbers.

For pure integer arithmetic, BigRational is only slightly faster than BigInteger - about 5 to 10%.
However, this is highly dependent on the runtime configuration and processor architecture.
BigRational is currently only tuned for X64, RyuJIT and the latest processor generations.
For .NET 7 as the latest preview release, BigRational can be about 20% faster than BigInteger, as shown by Benchmarks for the performance-critical bottleneck functions.
But in general, the performance of integer calculations can also be significantly increased by using the stack machine.

Project URL: https://github.com/c-ohle/RationalNumerics