Multiplying ttmath Ints by doubles
Hi,
I came across this library while trying to solve a problem. I am trying to sum a very large group of numbers, and the rounding errors I am getting are just too big. I am converting double-precision floating-point numbers into Ints * 2^(Const Int) so that I can eliminate these errors. I want to use as few high-precision numbers as possible to reduce runtime, so I really only want to use a high-precision integer as a store and then for one multiplication at the end. It is this final multiplication that is hurting me.
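For illustration, the conversion described above (a double rewritten as an integer times a power of two) could be sketched as follows. This is only a sketch using the standard library: the names ScaledInt, to_scaled and from_scaled are hypothetical, and it picks a separate exponent for each value, whereas the post uses one constant exponent, so mantissas would still have to be shifted to a common exponent before being summed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: value == mantissa * 2^exponent, exactly.
struct ScaledInt {
    std::int64_t mantissa;
    int exponent;
};

// Split a finite double into an integer mantissa and a power-of-two exponent.
ScaledInt to_scaled(double x) {
    assert(std::isfinite(x));
    int e = 0;
    double frac = std::frexp(x, &e);  // x == frac * 2^e, with 0.5 <= |frac| < 1 (or frac == 0)
    // A double carries at most 53 significant bits, so frac * 2^53 is an exact integer.
    std::int64_t m = static_cast<std::int64_t>(std::ldexp(frac, 53));
    return {m, e - 53};
}

// Rebuild the double; exact because |mantissa| < 2^53.
double from_scaled(ScaledInt s) {
    return std::ldexp(static_cast<double>(s.mantissa), s.exponent);
}
```

Once every term is an integer on a common scale, the additions themselves are exact, and the only rounding left is in the final conversion back to double.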
Briefly, my code looks like this:
#include <cstdarg>
#include <ttmath.h>
double FunctionSum(int numArgs, ...) { // the variadic arguments are doubles
    __int64 var1, var2; // these are working variables and arrays, they will fit into __int64
    double dbl1, dbl2; // unavoidable
    ttmath::Int<4> sum1(0), sum2(0);
    // code ...
    return FinalSum (= sum1*dbl1 + sum2*dbl2);
}
The final line is where I am having the problem. If I use __int64, the compiler accepts this line and the algorithm works well, except in cases where I get integer overflow. If I use the library's Int type instead, as here, I get the following error:
binary '*' : no operator found which takes a right-hand operand of type 'double'
Can anyone help me with a workaround, please? I looked for a manual and I didn't find one, and have a really hard time understanding the header files, so I don't know what else to do.
Declaring everything as high-precision variables is also out of the question, because 5-minute simulation runtimes turn into hours or days.
Thanks
> return FinalSum (= sum1*dbl1 + sum2*dbl2);
Even if there were such an operator, you would lose precision. Consider this:
10001 (int) * 0.00001 (double). The exact product is 0.10001, so with an integer result you end up with zero.
First change Int<> to Big<> with a correct size of the mantissa and then do the multiplication. Sample:
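#include <ttmath/ttmath.h>
#include <stdint.h>
#include <iostream>

double FunctionSum()
{
    int64_t var1, var2; // working variables from the original question (unused here)
    double dbl1, dbl2;
    ttmath::Int<4> sum1(0), sum2(0);

    // example values, too large for a 64-bit integer
    sum1 = "100000000000000000000000";
    sum2 = "200000000000000000000000";
    dbl1 = 0.000456;
    dbl2 = 0.000123;

    // convert the integer sums to Big<exponent words, mantissa words>
    ttmath::Big<1, 4> dsum1 = sum1;
    ttmath::Big<1, 4> dsum2 = sum2;

    // multiply in Big<> arithmetic so the fractional factors are kept
    ttmath::Big<1, 4> res = dsum1 * dbl1 + dsum2 * dbl2;
    return res.ToDouble();
}

int main()
{
    std::cout << FunctionSum() << std::endl;
}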
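The precision loss mentioned in the reply can also be seen without ttmath. The sketch below uses plain 64-bit integers and standard C++ only; the two function names are hypothetical, and the first one merely imitates what an integer-valued Int * double operator would have to do with the fractional part.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical: the product forced back into an integer, as an
// integer-valued "Int * double" operator would have to do.
std::int64_t multiply_as_integer(std::int64_t a, double b) {
    double p = static_cast<double>(a) * b;
    assert(std::fabs(p) < 9.0e18);        // guard: the cast below must not overflow
    return static_cast<std::int64_t>(p);  // the fractional part is discarded here
}

// The same product kept in floating point.
double multiply_as_floating(std::int64_t a, double b) {
    return static_cast<double>(a) * b;
}
```

With the numbers from the reply, multiply_as_integer(10001, 0.00001) gives 0, while multiply_as_floating(10001, 0.00001) stays close to 0.10001. That is the reason for converting to Big<> before multiplying.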
Thanks Tomek, I will try this :)
J