
Solve : 7-zip Cannot Compress?

Answer»

I have the first billion digits of pi in .txt format. The file is 1.5 GB. I'm planning to share it on the internet, but I don't want to put a 1.5 GB text file up there, so I tried to compress it using 7-zip. Unfortunately, 7-zip failed to compress it. It came up with the message: "system cannot allocate the required amount of memory."
Is there any way to fix this problem? Any help is appreciated. Thank you!

Seems to be a common problem faced by many people. Found a link you might wanna see.

Among compression tools, 7-zip is unusually hard on memory. Anyhow, why bother, when it is already being shared? You are only going to hammer your upload and web storage when a big research place has already done it. Also, if I want 1 billion places of pi, I (presumably) want to be able to trust they are the right ones (OK, I know 3.141592653 already), so where do I go: MIT, or some guy on the web I never heard of?

Uncompressed (1 GB) at MIT

(Folks, don't click on this link in a browser unless you have a lot of time on your hands... right click and choose Save as...)

http://stuff.mit.edu/afs/sipb/contrib/pi/pi-billion.txt

7-zip compressed (490 MB)

http://micronetsoftware.com/pi_day/pi/pi.7z

I downloaded the uncompressed text and compressed it with WinRAR (64-bit version) on the "best" compression setting. It took 13 minutes and gave a smaller file than the 7-zip archive at Micronet:

     1,000,000,002 pi-billion.txt
       514,753,983 pi.7z
       434,964,971 pi-billion.rar

If I had to use 7-zip I would probably use some method of splitting the original file into chunks, or maybe if you ask 7-zip to split the archive you can reduce the memory hit?
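
The chunk-splitting idea above can be sketched in a few lines of Python. This is only an illustration: the function name `split_file`, the `.001`-style suffixes, and the 100 MB default are assumptions chosen for the example, and nothing here establishes that smaller inputs actually lower 7-zip's memory use.

```python
from pathlib import Path


def split_file(src, chunk_size=100_000_000, buf_size=1 << 20):
    """Split src into numbered pieces of at most chunk_size bytes.

    Pieces are written next to src as <name>.001, <name>.002, ...
    (a hypothetical naming scheme) and their paths are returned in order.
    """
    src = Path(src)
    parts = []
    with src.open("rb") as fin:
        index = 1
        while True:
            # Read the first block of the next piece; empty means end of file.
            block = fin.read(min(buf_size, chunk_size))
            if not block:
                break
            part = src.with_name(f"{src.name}.{index:03d}")
            with part.open("wb") as fout:
                written = 0
                while block:
                    fout.write(block)
                    written += len(block)
                    if written >= chunk_size:
                        break
                    # Never read past the end of the current piece.
                    block = fin.read(min(buf_size, chunk_size - written))
            parts.append(part)
            index += 1
    return parts
```

Each piece could then be compressed on its own. The pieces are copied through a small buffer, so the script itself needs very little RAM regardless of the input size.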

Your figure of 1.5 GB for the plain text file seems a bit big.

Using a very simple method you can reduce a billion digits to about 500 million bytes with little effort. A single decimal digit only needs 4 bits to represent its value. Converted to binary as one very long integer, it would take even less: about 415 million bytes, since a decimal digit carries log2(10), or roughly 3.32, bits. Forget the decimal point. We all know the first three digits are 3.14 anyway.

At the moment, I am trying to compress the 1-billion-place text file using 7-zip "ultra" compression, splitting the archive into 100 MB chunks. It estimates 35 minutes to completion, and Process Explorer's memory figure ("working set") has stabilised at 682 MB. I can imagine 7-zip would not be the compression tool of choice if RAM were limited. WinRAR never went over 300 MB.
Time to complete has gone up to 48 minutes... [EDIT] It took 49 minutes and created 5 files:

100,000,000 pi-billion.7z.001
100,000,000 pi-billion.7z.002
100,000,000 pi-billion.7z.003
100,000,000 pi-billion.7z.004
 40,941,343 pi-billion.7z.005

Is the objective speed or size?
If you want speed and size, do it in machine code (assembly) by hand. This is a one-time thing, right? So you write a specific ASM program to do it once. The output will be a self-extracting EXE file that prints to the output device.
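
The four-bits-per-digit idea raised earlier in the thread can be sketched as follows. This is a minimal illustration, not anyone's actual tool: the names `pack_digits`, `unpack_digits`, and `PAD`, and the choice of the nibble 0xF as filler for an odd digit count, are assumptions made for the example.

```python
import math

PAD = 0xF  # filler nibble for an odd number of digits (assumed convention)


def pack_digits(digits):
    """Pack a string of ASCII decimal digits two per byte (4 bits each)."""
    if digits.strip("0123456789"):
        raise ValueError("expected decimal digits only")
    nibbles = [ord(c) - ord("0") for c in digits]
    if len(nibbles) % 2:
        nibbles.append(PAD)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack_digits(packed):
    """Reverse pack_digits, dropping any filler nibble."""
    out = []
    for byte in packed:
        for nibble in (byte >> 4, byte & 0xF):
            if nibble != PAD:
                out.append(str(nibble))
    return "".join(out)


def packed_size(n_digits):
    """Bytes needed at 4 bits per digit."""
    return (n_digits + 1) // 2


def binary_size(n_digits):
    """Approximate bytes needed if the digits are stored as one binary integer."""
    return math.ceil(n_digits * math.log2(10) / 8)
```

For a billion digits, `packed_size` gives 500,000,000 bytes and `binary_size` gives roughly 415 million. A real run over the full file would pack it in even-length chunks rather than building one huge list in memory.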
Quote from: Geek-9pm on November 13, 2013, 11:07:20 PM

Is the objective speed or size?

The objective was reduced size, for web sharing. Time taken to compress is not, I think, an issue.

This approach might be fun:

With curl.exe http://curl.haxx.se/ you can download a range of pi digits so if you want to examine them you don't even need the billion digit file to be stored locally.

Store the first 10 digits of Pi after the decimal point in pi-10-digits.txt (the range starts at 2 because byte offsets count from 0 and the file begins with "3.")

curl -o pi-10-digits.txt -r 2-11 http://stuff.mit.edu/afs/sipb/contrib/pi/pi-billion.txt

result:

1415926535

Store the 50 digits of Pi starting at the 70th digit in pi-50-digits-from-70.txt (the range is inclusive, so 50 bytes starting at offset 71 end at offset 120)

curl -o pi-50-digits-from-70.txt -r 71-120 http://stuff.mit.edu/afs/sipb/contrib/pi/pi-billion.txt

result:

40628620899862803482534211706798214808651328230664
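
The offset arithmetic behind those ranges is easy to get wrong by one, so here is a small sketch of it. It assumes the file begins with "3." (so the n-th digit after the decimal point sits at zero-based byte offset n + 1) and that the range is inclusive at both ends, as curl's -r option is. The helper name `curl_range` is made up for the example.

```python
def curl_range(first_digit, count):
    """Return the inclusive 'start-end' byte range for `count` digits.

    first_digit is 1-based and counts digits after the decimal point.
    Assumes the file starts with "3.", so digit n is at byte offset n + 1.
    """
    if first_digit < 1 or count < 1:
        raise ValueError("first_digit and count must be at least 1")
    start = first_digit + 1
    end = start + count - 1  # inclusive end, hence the -1
    return f"{start}-{end}"


print(curl_range(1, 10))   # 2-11
print(curl_range(70, 50))  # 71-120
```

The string returned can be passed straight to curl after -r.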



Salmon Trout, you are beyond brilliant!
How do you find such things?
Are you living inside of Google?

Came back, and saw all these great, helpful replies... thanks very much. And Salmon Trout, the curl.exe was very interesting! Thanks!

You can use curl.exe to get a range of bytes (or in this case, characters) from a local file. You have to convert the file path to a file URL starting with file:/// and using forward slashes instead of backslashes:

From the local file D:\Pi\pi-billion.txt, create a text file with the first 10 digits of Pi after the decimal point (the range starts at 2 because byte offsets count from 0 and the file begins with "3."):

curl -o pi-10-digits.txt -r 2-11 file:///d:/pi/pi-billion.txt

Thus you see that curl.exe can be used as a local file splitter.
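
For comparison, the same local byte-range read can be sketched in Python without curl. The function name `read_byte_range` is hypothetical. It follows the same convention as curl's -r option: zero-based offsets, inclusive at both ends.

```python
def read_byte_range(path, start, end):
    """Return bytes start..end (zero-based, inclusive) from a local file."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    with open(path, "rb") as f:
        f.seek(start)                    # jump straight to the first wanted byte
        return f.read(end - start + 1)   # inclusive range, hence the +1
```

Because the file is opened and then seeked, only the requested bytes are read, so this stays fast even on the billion-digit file. A call such as `read_byte_range(r"D:\Pi\pi-billion.txt", 2, 11)` would correspond to the curl command above.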


I found a pi calculator program for Windows by Fabrice Bellard http://www.bellard.org/pi/pi2700e9/tpi-0.9.3-win.zip

I calculated the first billion places of Pi on my home PC (AMD Phenom II 945, 4GB RAM, Windows 7 64 bit) and it took 1 hour 4 minutes. I chose to limit RAM usage to 1 GB and also to store some of the intermediate results on disk rather than keep them in RAM. If I had gone for the all-RAM option it would have been a lot quicker but the PC would have been quite slow at doing other tasks.

I compared the first million digits of my calculation with the first million of the MIT billion-digit file and they are the same.
Next: compare the full billion. Of course it won't prove that either set is "right"...
Pi' are round...
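
The full-billion comparison mentioned above could be sketched like this. It is a plain byte-for-byte comparison read in blocks. It assumes both files use exactly the same layout (same leading "3.", no line breaks or headers in one but not the other), which has not been checked here for the calculator's output. The name `first_difference` is made up for the example.

```python
def first_difference(path_a, path_b, buf_size=1 << 20):
    """Return the zero-based offset of the first differing byte, or None.

    If one file is a prefix of the other, the length of the shorter
    file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(buf_size)
            b = fb.read(buf_size)
            if a != b:
                # Locate the exact position inside this pair of blocks.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended with no difference found
            offset += len(a)
```

A result of None would mean the two files agree on every byte, which, as noted, still would not prove that either set is "right".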

