08-10-2017, 06:08 PM
(This post was last modified: 08-12-2017, 05:39 PM by MarkHaysHarris777.)
Greetings,
In this segment I am providing a utility that will count and analyze the digits of our pi_digits file; the C code is posted below with instructions for use.
Aside from the historically valid reasons in computer science for calculating PI to many, many digits of accuracy, there are three primary reasons why we might want to engage in this activity with our SoC(s); but first, what this does not tell us ...
If the calculation were very slow, or if it were not accurate, the test would tell us nothing about the nature of the latency, nor about the reason for the failure. This is a both-and proposition: if the calculation is both fast and accurate (the definition of efficient performance), then we can say a great deal about the nature of the system used to perform it. Such a system would be virtually flawless at anything "computable", and it could also be considered extremely reliable in operation, having completed hundreds of billions of operations in a very short period of time without failure!
So what? What does this tell us, and why is it important?
The primary reasons for performing PI calculations on modern SoC(s) are the following:
1) demonstrate normalized system performance
2) demonstrate accurate system operation
3) demonstrate normalized output for statistical verification
The first two are most important. The conjectured normality of PI's digits (the hypothesis that, over a long enough run, every digit occurs with essentially the same frequency) means that a very large calculation provides a normalized set of operations, hundreds of billions of them; if the calculation is both accurate and fast, we can say, taking everything into account (cache, bus, RAM, pipelining, SoC, clock, OS, and user space), that the "system" as a whole is highly efficient. If we are comparing two (or more) SoC(s), and both perform similarly on the test, then we can reasonably compare and rank them without having to know the specifics of any one component of either system.
Both of our test SoC(s) perform relatively similarly: they complete the test quickly and accurately, both in under 25 minutes. Yes, one is 3 minutes 16 seconds faster, but the difference is not an order of magnitude; it is not ten times as fast, nor even twice as fast. We have, however, demonstrated in a normalized way that the Rock64 is the more efficient SoC (board and system).
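To put that gap in perspective, here is a rough back-of-envelope figure; the exact run times are not restated here, so assume, purely for illustration, that the slower board finishes right at the 25-minute mark:
Code:
% hypothetical figures: slower run assumed to take ~25 min = 1500 s,
% so the 3 min 16 s (196 s) gap works out to
\[ \frac{196\ \mathrm{s}}{1500\ \mathrm{s}} \approx 0.13 \]
% i.e. roughly a 13% advantage -- well short of twice as fast, let alone ten times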
To test the normality conjecture (which has never been proven in a mathematically rigorous way) that the digits of PI are in fact uniformly distributed, we can run the following utility, which counts and analyzes the digits.
pi_digit_dist.c
Code:
// pi_digit_dist.c v0.6
//
// Mark H. Harris
// 08-10-2017 v0.2 initial utility
//
// changelog
// v0.3 added standard deviation
// v0.4 added formatting
// added minimal error checking
// fixed for loop inits
// no longer requires -std=c99
// v0.5 adjusted min max
// min is first minimum
// max is last maximum
// adjusted formatting
// v0.6 added Sqrs to stats
//
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define CONST_MOD 48
#define FILE_OPEN -1
#define PARM_USE -2
int main(int argc, char** argv) {
    // Definitions
    int rc = 0, max=0, min=0;
    FILE* pi_digits;
    char* pi_file;
    int c, i, n, digits, seek_pos;
    int counters[10] = {0};
    float dsum=0, dmean=0, dsqrs=0, sdev, ssdev, dvar;
    double ssqrs=0;
    // Parse the command-line arguments
    if (argc==4) {
        printf("%s\n\n", argv[0]);
        digits = atoi(argv[2]);
        pi_file = argv[1];
        seek_pos = atoi(argv[3]);
    }
    else {
        printf("\nUsage:\n\npi_digit_dist <pi_digits_filename> <sample_no> <seek_offset>\n\n");
        return PARM_USE;
    }
    // Open the digits file and seek to the requested offset
    pi_digits = fopen(pi_file, "r");
    if (pi_digits != NULL)
        fseek(pi_digits, seek_pos, SEEK_SET);
    else {
        printf("Error: could not open file %s\n\n", argv[1]);
        return FILE_OPEN;
    }
    // Count the occurrences of each digit
    for (i=0; i<digits; i++) {
        c = fgetc(pi_digits);
        if (c == EOF)                    // stop early if the file runs out of characters
            break;
        n = c - CONST_MOD;               // map ASCII '0'..'9' onto 0..9
        if (n >= 0 && n <= 9)            // ignore anything that is not a decimal digit
            counters[n] += 1;
    }
    // Print the digit counts, track min/max, and accumulate the sum
    for (i=0; i<10; i++) {
        printf(" %d = %d\n", i,counters[i]);
        if (counters[i]<counters[min])   // min is the first minimum
            min = i;
        if (counters[i]>=counters[max])  // max is the last maximum
            max = i;
        dsum += counters[i];
    }
    // Calculate the standard deviations
    dmean = dsum / 10.0;
    for (i=0; i<10; i++) {
        dsqrs += pow( (float) counters[i] - dmean, 2);
        ssqrs += pow( (double) counters[i], 2);
    }
    dvar = dsqrs / 9.0;                  // sample variance
    ssdev = sqrt(dvar);                  // sample deviation s
    sdev = sqrt(dsqrs / 10.0);           // population deviation σ
    // Print statistics
    printf("\n");
    printf(" max = %d (%d)\n", max,counters[max]);
    printf(" min = %d (%d)\n", min,counters[min]);
    printf("\n");
    printf(" sum = %14.3f Σx\n", dsum);
    printf(" mean = %14.3f μ\n", dmean);
    printf(" sqrs = %14.3f Σ(x-μ)²\n", dsqrs);
    printf(" Sqrs = %14.3f Σ(x)²\n", ssqrs);
    printf(" variance = %14.3f Σ(x-μ)²/(n-1)\n", dvar);
    printf(" σ = %14.3f √(Σ(x-μ)²/n)\n", sdev);
    printf(" s = %14.3f √(Σ(x-μ)²/(n-1))\n", ssdev);
    // Close the digits file
    fclose(pi_digits);
    printf("\n");
    return rc;
}
To compile the code:
gcc -Wall -o pi_digit_dist pi_digit_dist.c -lm
( the -lm flag is necessary to link the math library, libm, which provides the functions declared in math.h )
To run the code:
./pi_digit_dist PI_1000_K.txt 7000 700000
In the sample above the utility counts the occurrences of each digit in the PI file; it begins at offset 700000 in the file and counts through 7000 digits. A sample run is provided below:
Code:
rock64@rock64:~/Python/PI_million$ ./pi_digit_dist pi_1000k_out 7000 700000
./pi_digit_dist
0 = 708
1 = 728
2 = 739
3 = 653
4 = 645
5 = 727
6 = 742
7 = 679
8 = 688
9 = 691
max = 6 (742)
min = 4 (645)
sum = 7000.0000 Σx
mean = 700.0000 μ
sqrs = 10762.0000 Σ(x-μ)²
variance = 1195.7778 Σ(x-μ)²/(n-1)
σ = 32.8055 √(Σ(x-μ)²/n)
s = 34.5800 √(Σ(x-μ)²/(n-1))
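For reference, σ above is the population standard deviation and s is the sample standard deviation; the formulas below are not from the original post, they are simply the textbook definitions that the utility implements over the ten digit counters:
Code:
% n = 10 digit bins, x_i = counters[i]
\[ \mu = \frac{1}{n}\sum_{i=0}^{9} x_i, \qquad
   \sigma = \sqrt{\frac{1}{n}\sum_{i=0}^{9}\left(x_i-\mu\right)^{2}}, \qquad
   s = \sqrt{\frac{1}{n-1}\sum_{i=0}^{9}\left(x_i-\mu\right)^{2}} \]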
We could also run it over the entire file:
Code:
rock64@rock64:~/Python/PI_million$ ./pi_digit_dist pi_1000k_out 1000000 0
./pi_digit_dist
0 = 99959
1 = 99758
2 = 100026
3 = 100229
4 = 100230
5 = 100359
6 = 99548
7 = 99800
8 = 99985
9 = 100106
max = 5 (100359)
min = 6 (99548)
sum = 1000000.0000 Σx
mean = 100000.0000 μ
sqrs = 550908.0000 Σ(x-μ)²
variance = 61212.0000 Σ(x-μ)²/(n-1)
σ = 234.7143 √(Σ(x-μ)²/n)
s = 247.4106 √(Σ(x-μ)²/(n-1))
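As a rough sanity check (my own back-of-envelope calculation, not part of the original utility): if the digits really are independent and uniformly distributed, each counter is approximately a binomial count with N trials and probability 1/10, so its expected spread about the mean N/10 is:
Code:
% expected per-digit spread under the uniformity (normality) hypothesis
\[ \mathrm{SD}(x_i) = \sqrt{N \cdot 0.1 \cdot 0.9}
   \approx 25.1 \ \ (N = 7000), \qquad
   = 300 \ \ (N = 1{,}000{,}000) \]
% the observed s values, roughly 34.6 and 247.4, are of the same order of
% magnitude (one a bit above, one a bit below), which is what ordinary random
% fluctuation around a uniform distribution looks like
A formal chi-square goodness-of-fit test over the same ten counters would make this comparison rigorous, but even the rough figures are consistent with the normality conjecture.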
These statistics also verify the content of the file indirectly: because the digits of PI do not vary, the statistics describing the distribution of its digits will not change either.
marcushh777
please join us for a chat @ irc.pine64.xyz:6667 or ssl irc.pine64.xyz:6697
( I regret that I am not able to respond to personal messages; let's meet on irc! )