Aliquot Sequence Research  2.0
Compute properties of the sum-of-proper-divisors function.
Functions
moewsmoews_sieve.c File Reference

Functions to sieve blocks of sigma function. More...

Functions

sieve_config_tmoews_init_sieve (size_t bound, size_t seg_len)
 creates a sieve_config_t which is shared by a pool of workers More...
 
void moews_destroy_sieve (sieve_config_t *cfg)
 free's memory associated with sieve_config_t More...
 
sieve_worker_tmoews_init_worker (const sieve_config_t *cfg)
 creates a worker object which can be used to run the sieve More...
 
void destroy_worker (sieve_worker_t *worker)
 free's memory associated with sieve_worker_t More...
 
void moews_sieve_odd_standard (sieve_worker_t *worker, const uint64_t seg_start)
 runs sieve for odd sigma(m) More...
 
void moews_sieve_odd_squared (sieve_worker_t *worker, const uint64_t seg_start)
 takes odd m in range and runs sieve for sigma(m * m) More...
 
uint64_t moews_lookup_sigma_m (const sieve_worker_t *worker, uint64_t m)
 lookup sigma(m) from m in a sieved block More...
 
uint64_t moews_lookup_sigma_m_squared (const sieve_worker_t *worker, uint64_t m)
 lookup sigma(m * m) from m in a sieved block More...
 
bool moews_check_prime (const sieve_worker_t *worker, uint64_t m)
 checks if odd numbers in range (seg_start, seg_start + seg_len) is prime More...
 
size_t moews_estimate_heap_usage (size_t bound, size_t seg_len, size_t num_workers)
 estimates how much heap the sieving buffers will require More...
 

Detailed Description

Functions to sieve blocks of sigma function.

Author
Anton Mosunov and Gavin Guinn (gavin.nosp@m.guin.nosp@m.n1@gm.nosp@m.ail..nosp@m.com)
Date
Originally Apr 18, 2014, modified 2021-12-27

PERFORMANCE SUMMARY

The parameter seg_len has will increase sieve runtime by upto a order-of-magnitude if set to low/high. If seg_len is set to low the sieve will execute far more instructions than needed to complete the same outcome, as it's performance is improves with larger values of seg_len. I believe^ the amount of instructions increases exponentially as seg_len shrinks, judging by the cachegrind instruction lookup counts. However if seg_len is set to high the sieve will start thrashing at the cache missing reads/writes in the intermediate sieving steps thus degrading performance. I believe^^ this effect is more pronounced at bounds lower than ~10^9 as seg_len's that result in a reasonable amount of operations also happen to require sieving buffers that fit (very approximately) into my 32mb LL cache.

TODO

test ^ and ^^

PERFORMANCE

The performance of this sieve is highly variable on the value of seg_len, unfortunately not in a clearly tractable way. It is clear that if seg_len is set to low (whatever that means) the sieve will do a lot of additional operations for the same result. This selected cachegrind/time output from 2 pom_yang run is illustrative:

./cli --bound=100000000 --seg_len=400000 --num_locks=1000000 --preimage_count_bits=1 --num_thread=12
==*== I refs: 19,091,262,194
==*==
==*== D refs: 26,107,875,360 (25,263,200,724 rd + 844,674,636 wr)
==*== D1 misses: 268,703,487 ( 255,239,558 rd + 13,463,929 wr)
==*== LLd misses: 406,127 ( 274,082 rd + 132,045 wr)
==*== D1 miss rate: 1.0% ( 1.0% + 1.6% )
6.28s user 0.07s system 901% cpu 0.704 total
./cli --bound=100000000 --seg_len=12500 --num_locks=1000000 --preimage_count_bits=1 --num_thread=12
==*== I refs: 190,995,976,640
==*==
==*== D refs: 4,448,783,095 (3,658,655,908 rd + 790,127,187 wr)
==*== D1 misses: 304,675,262 ( 291,274,461 rd + 13,400,801 wr)
==*== LLd misses: 8,850,772 ( 5,407,998 rd + 3,442,774 wr)
==*== D1 miss rate: 6.8% ( 8.0% + 1.7% )
28.27s user 0.04s system 1052% cpu 2.689 total

Note that the seg_len=12500 is running an order of magnitude more instructions and far fewer references on the data cache, these references are relatively inaccurate.

The seg_len value can also degrade performace if it is set to high:

./cli --bound=100000000 --seg_len=4000000 --num_locks=$((10**6)) --preimage_count_bits=1 --num_thread=12
==*== I refs: 13,701,814,468
==*==
==*== D refs: 3,755,366,091 (2,955,265,779 rd + 800,100,312 wr)
==*== D1 misses: 312,794,587 ( 298,947,273 rd + 13,847,314 wr)
==*== LLd misses: 6,388,217 ( 2,678,555 rd + 3,709,662 wr)
==*== D1 miss rate: 8.3% ( 10.1% + 1.7% )
12.94s user 0.36s system 744% cpu 1.786 total

Comparing to the seg_len=400000 run we can observe that slightly less instructions are being executed but we are still in the same order of magnitude. More interestingly we can notice this run made an order of magnitude fewer references to the data cache and.. those references were much less accurate, likely accounting for the performace degradation. The vast majority of these misses originate from interactions with the sigma and numbers buffers.

SEE

Files in the cg directory for more details.

ALGORITHM

This is a rewrite of Anton Mosunov's implementation of the [Moews and Moews] sieving algorithm, prepared to utilized in the tabulation of non-aliquots, sect. 3 of [Chum et al.]. This implementation includes undocumented optimization on the original method of [Moews and Moews] that are essential for performance, I can't make any sense of why it works however. The _meows_naive_sieve implements the algorithm as described in the paper, it is much slower.

CITATIONS

-> [Moews and Moews] Moews, D. and Moews, P. C. (1991). A search for aliquot cycles below 10 10. Mathematics of Computation, 57(196):849.

-> [Chum et al.] Chum, K., Guy, R. K., Jacobson, J. M. J., and Mosunov, A. S. (2018). Numerical and statistical analysis of aliquot sequences. Experimental Mathematics, 29(4):414–425.

Function Documentation

◆ destroy_worker()

void destroy_worker ( sieve_worker_t worker)

free's memory associated with sieve_worker_t

Parameters
workerto be free'd

◆ moews_check_prime()

bool moews_check_prime ( const sieve_worker_t worker,
uint64_t  m 
)

checks if odd numbers in range (seg_start, seg_start + seg_len) is prime

Parameters
workerthat has previously run moews_sieve_odd_standard
mcheck if this odd number in range (seg_start, seg_start + seg_len) is prime
Returns
bool is prime

◆ moews_destroy_sieve()

void moews_destroy_sieve ( sieve_config_t cfg)

free's memory associated with sieve_config_t

Parameters
cfgto be free'd

◆ moews_estimate_heap_usage()

size_t moews_estimate_heap_usage ( size_t  bound,
size_t  seg_len,
size_t  num_workers 
)

estimates how much heap the sieving buffers will require

Parameters
boundsieve all odd sigma upto to bound
seg_lensize of range to sieve in a step
num_workersnumber of threads working
Returns
size_t approx bytes to be consumed

◆ moews_init_sieve()

sieve_config_t* moews_init_sieve ( size_t  bound,
size_t  seg_len 
)

creates a sieve_config_t which is shared by a pool of workers

Parameters
boundsieving all odd sigma upto to bound
seg_lensize of range to sieve in a step
Returns
sieve_config_t* config struct which is free'd by calling moews_destroy_sieve

◆ moews_init_worker()

sieve_worker_t* moews_init_worker ( const sieve_config_t cfg)

creates a worker object which can be used to run the sieve

Parameters
cfginitalized sieve configuration struct
Returns
sieve_worker_t*

◆ moews_lookup_sigma_m()

uint64_t moews_lookup_sigma_m ( const sieve_worker_t worker,
uint64_t  m 
)

lookup sigma(m) from m in a sieved block

Parameters
workersee definition, has run moews_sieve_odd_standard last
mnumber to lookup
Returns
uint64_t sigma(m)

◆ moews_lookup_sigma_m_squared()

uint64_t moews_lookup_sigma_m_squared ( const sieve_worker_t worker,
uint64_t  m 
)

lookup sigma(m * m) from m in a sieved block

Parameters
workersee definition, has run moews_sieve_odd_squared last
mnumber to lookup
Returns
uint64_t sigma(m * m)

◆ moews_sieve_odd_squared()

void moews_sieve_odd_squared ( sieve_worker_t worker,
uint64_t  seg_start 
)

takes odd m in range and runs sieve for sigma(m * m)

Parameters
workersee struct definiton
seg_startstart of range to sieve

◆ moews_sieve_odd_standard()

void moews_sieve_odd_standard ( sieve_worker_t worker,
uint64_t  seg_start 
)

runs sieve for odd sigma(m)

Parameters
workersee struct definiton
seg_startstart of range to sieve