Wall of Numbers — https://qz.com

**Generating prime number is damn hard! **There is no pattern to prime numbers, we just need to pass over all numbers and check them one by one

Few days ago I’ve started distributed caching system project on top of TreeScale PubSub system. General idea is to have completely distributed caching environment, but there was some strong requirement to not try to cache data which is already exists.

With existing caching systems, when you are trying to add data which is already exists, they are receiving full data, making some checksum (around 200 byte) and exchanging that checksum over replicated endpoints.

My idea was to attach individual prime number to every transferred data and multiply them with cache service prime numbers to get 8 byte unique number (multiplication of prime numbers). So that it will give ability to add few times more performance, because we will be working with numbers, instead of 200+ bytes and it is less bytes to transfer.

You may think WTF? What is he talking about? , you are right, let me simplify things. *I needed to generate new prime number for each cached data! *In order to have more efficient unique identifier and the exact path where is located that data.

Problem with finding Prime Number

The algorithm of finding prime number is very very simple (simple doesn’t mean efficient, or fast).

This implementation is fine and will work without any issue for every range. but it could take a years to generate for example 100000th prime number.

So after implementing whole system prototype, I started testing distributed cache. It worked and it’s really really got super performance. **BUT! **when I started to cache more than 1000 items, every item started to be saved SLOWER and SLOWER, only because generating next prime number is taking too match time.

In modern software development we are not dealing with math problems, we just using libraries for that :) And using my basic math background I figured out that I cant write new Prime numbers algorithm and get Nobel prize (I would love, but I don’t have time for that :) ).

Hacking Prime Numbers!

After having some Google-ing around Prime numbers issue, I found this interesting website https://primes.utm.edu/lists/small/millions/ where they have actually generated first 50Mln prime numbers, and you can download them as a text documents.

Ohh that’s interesting, now I have an idea how to solve my problem!

Great ideas are super simple Great ideas are super simple

So I wrote a grabber to get all 50Mln prime numbers and Encode them as a 4 byte BigEndian numbers so that I can write binary file and easily navigate between them by just calculating index. For example: 5000th prime number BigEndian bytes index would be 5000*4, so I can grab it by just seeking a file to that position. It’s quite efficient compared to Prime number generation.

Here is the implementation for that https://github.com/tigranbs/prime-numbers

You can ask “Ok, what if we will have more than 50Mln data points?”. I would say “hopefully we wouldn’t have it!”, but in general I’m planning to build some aggregator to merge indexed data and keep them with less prime number indexed.

Conclusion

Keep your software away from Prime Numbers! They don’t have a specific pattern, so they are resource heavy!

🎉 You reached to the end! Don’t forget to *Recommend *!