Notes on Decimal Arithmetic in Redshift

Introduction

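DECIMAL is the usual choice when you need exact fractional values rather than floating-point approximations. However, when DECIMAL values are used in arithmetic, the precision and scale of the result are derived from the operand types, and they can differ from the precision and scale of the DECIMAL column where the result is stored.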

https://docs.aws.amazon.com/ja_jp/redshift/latest/dg/r_numeric_computations201.html

Precision and Scale of DECIMAL Calculation Results

Precision and Scale

First, a basic understanding of precision and scale:

  • Precision: Total number of digits
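  • Scale: Number of digits to the right of the decimal point

The following examples, written in Oracle's NUMBER notation, show how precision and scale affect the stored value:

 Input Data   | NUMBER Type Definition | Stored Data
--------------+------------------------+-------------------------------------------
 7,456,123.89 | NUMBER                 | 7456123.89
 7,456,123.89 | NUMBER(*,1)            | 7456123.9
 7,456,123.89 | NUMBER(9)              | 7456124
 7,456,123.89 | NUMBER(9,2)            | 7456123.89
 7,456,123.89 | NUMBER(9,1)            | 7456123.9
 7,456,123.89 | NUMBER(6)              | (Not accepted because it exceeds precision)
 7,456,123.89 | NUMBER(7,-2)           | 7456100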
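
The rows above can be reproduced with a small Python sketch. The helper name store is hypothetical, and half-up rounding is an assumption made for illustration; this is not how any particular database implements its storage.

```python
from decimal import Decimal, ROUND_HALF_UP

def store(value: str, precision: int, scale: int) -> Decimal:
    """Round `value` to `scale` fractional digits, then check it fits in `precision` digits."""
    # 10**(-scale): 0.01 for scale=2, 1 for scale=0, 1E+2 for scale=-2.
    quantum = Decimal(1).scaleb(-scale)
    # Assumption: ties round half-up.
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    # A (precision, scale) type holds at most precision - scale digits before the decimal point.
    if abs(rounded) >= Decimal(10) ** (precision - scale):
        raise ValueError(f"{value} does not fit in ({precision},{scale})")
    return rounded

print(store("7456123.89", 9, 2))        # 7456123.89
print(store("7456123.89", 9, 1))        # 7456123.9
print(store("7456123.89", 9, 0))        # 7456124
print(int(store("7456123.89", 7, -2)))  # 7456100
```

Calling store("7456123.89", 6, 0) raises ValueError, which corresponds to the "not accepted" row.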

DECIMAL Arithmetic Specifications in Redshift

For example, consider the following series of commands. Column a is decimal(8,2) and column b is decimal(8,7); the query divides a by b and stores the result in c, which is decimal(38,23). You might expect c to hold a value accurate to 23 decimal places, but the scale of the division result is determined by the operand types, not by the destination column. Here the result is 0.33330000000000000000000: only 4 decimal places are computed, and the remaining digits are padded with zeros.

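-- Recreate the test table: a is decimal(8,2), b is decimal(8,7), c is the wide destination column.
drop table test;
create table test(a decimal(8,2), b decimal(8,7),c decimal(38,23));
insert into test values(1,3,null);

-- Check the initial row, then store a/b in c as a new row and check again.
select * from test;
insert into test(c) select a/b from test;
select * from test;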

Execution log:

mydb=# drop table test;
DROP TABLE
mydb=# create table test(a decimal(8,2), b decimal(8,7),c decimal(38,23));
CREATE TABLE
mydb=# insert into test values(1,3,null);
INSERT 0 1
mydb=#
mydb=# select * from test;
  a   |     b     | c
------+-----------+---
 1.00 | 3.0000000 |
(1 row)

mydb=# insert into test(c) select a/b from test;
INSERT 0 1
mydb=# select * from test;
  a   |     b     |             c
------+-----------+---------------------------
 1.00 | 3.0000000 |
      |           | 0.33330000000000000000000
(2 rows)

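This happens because Redshift derives the precision and scale of a result from the operand types, using the formulas below (excerpted from the manual). Here p1 and s1 are the precision and scale of the first operand, and p2 and s2 are those of the second.

This case is a division, so the scale is max(4,s1+p2-s2+1) = max(4,2+8-7+1) = 4 and the precision is p1-s1+s2+scale = 8-2+7+4 = 17.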

 Operation | Category  | Formula
-----------+-----------+--------------------------
 + or -    | Scale     | max(s1,s2)
 + or -    | Precision | max(p1-s1,p2-s2)+1+scale
 *         | Scale     | s1+s2
 *         | Precision | p1+p2+1
 /         | Scale     | max(4,s1+p2-s2+1)
 /         | Precision | p1-s1+s2+scale
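
As a rough aid, the formulas can be written as a short Python function. The name result_type is hypothetical, and the sketch covers only the formulas listed here; it does not model any adjustment Redshift applies when the computed precision exceeds the maximum of 38.

```python
def result_type(op: str, p1: int, s1: int, p2: int, s2: int) -> tuple[int, int]:
    """Return (precision, scale) of `operand1 op operand2` per the formulas above."""
    if op in ("+", "-"):
        scale = max(s1, s2)
        precision = max(p1 - s1, p2 - s2) + 1 + scale
    elif op == "*":
        scale = s1 + s2
        precision = p1 + p2 + 1
    elif op == "/":
        scale = max(4, s1 + p2 - s2 + 1)
        precision = p1 - s1 + s2 + scale
    else:
        raise ValueError(f"unsupported operator: {op}")
    return precision, scale

# decimal(8,2) combined with decimal(8,7), as in the example:
print(result_type("/", 8, 2, 8, 7))  # (17, 4)
print(result_type("*", 8, 2, 8, 7))  # (17, 9)
print(result_type("+", 8, 2, 8, 7))  # (14, 7)
```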

The table below summarizes the values for this example. The precision and scale of the calculation result differ from those of the destination column c, so the extra decimal places available in c are never filled. Choose data types with this behavior in mind.

 Column             | Variable                 | Value
--------------------+--------------------------+-------
 a                  | p1                       | 8
 a                  | s1                       | 2
 b                  | p2                       | 8
 b                  | s2                       | 7
 Calculation result | Precision (total digits) | 17
 Calculation result | Scale (decimal digits)   | 4
 c (destination)    | Precision (total digits) | 38
 c (destination)    | Scale (decimal digits)   | 23
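
The effect on the stored value can be imitated with Python's decimal module. This is only a sketch: the helper to_scale is hypothetical, and cutting off the extra digits by truncation is an assumption (for 1/3, rounding gives the same digits).

```python
from decimal import Decimal, ROUND_DOWN

# Operand types from the example: a is decimal(8,2), b is decimal(8,7).
p1, s1 = 8, 2
p2, s2 = 8, 7

# Scale of a division result according to the formula in the manual.
result_scale = max(4, s1 + p2 - s2 + 1)

def to_scale(value: Decimal, scale: int) -> Decimal:
    """Cut or zero-pad `value` to exactly `scale` fractional digits (truncation is an assumption)."""
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_DOWN)

quotient = to_scale(Decimal(1) / Decimal(3), result_scale)  # a / b at the result scale
stored = to_scale(quotient, 23)                             # widened to the scale of c

print(result_scale)  # 4
print(quotient)      # 0.3333
print(stored)        # 0.3333 followed by 19 zeros
```

Widening to 23 decimal places only appends zeros, because the digits beyond the fourth place were never computed.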

Notes

These formulas are not the same in every database. For comparison, SQL Server documents its own rules here:

https://docs.microsoft.com/ja-jp/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15
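
To make the difference concrete, the sketch below compares the division rule for the two systems. The function names are hypothetical, and the SQL Server formula (scale max(6,s1+p2+1), precision p1-s1+s2+scale) is transcribed from the page linked above as I understand it; check it against the current documentation. Neither function models the adjustments made when precision exceeds 38.

```python
def redshift_division(p1: int, s1: int, p2: int, s2: int) -> tuple[int, int]:
    """(precision, scale) of a division per the Redshift formulas quoted in this post."""
    scale = max(4, s1 + p2 - s2 + 1)
    return p1 - s1 + s2 + scale, scale

def sqlserver_division(p1: int, s1: int, p2: int, s2: int) -> tuple[int, int]:
    """(precision, scale) of a division per the SQL Server rule (assumed from its documentation)."""
    scale = max(6, s1 + p2 + 1)
    return p1 - s1 + s2 + scale, scale

# decimal(8,2) / decimal(8,7), as in the example:
print(redshift_division(8, 2, 8, 7))   # (17, 4)
print(sqlserver_division(8, 2, 8, 7))  # (24, 11)
```

Under these assumptions, the same division keeps 11 decimal places in SQL Server and 4 in Redshift.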

References

https://tech.tvisioninsights.co.jp/entry/2018/08/22/100000

https://docs.aws.amazon.com/ja_jp/redshift/latest/dg/r_numeric_computations201.html

https://odashinsuke.hatenablog.com/entry/20100720/1279628893

https://docs.microsoft.com/ja-jp/sql/t-sql/data-types/precision-scale-and-length-transact-sql?redirectedfrom=MSDN&view=sql-server-ver15
