Parallella, Raspberry Pi, FPGA & All That Stuff

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 791534953
RAC: 1247636

Quote:

Doesn't fftw 3.3.3 use neon if available? Might be worth trying instead of sticking with 3.3.2.

Even 3.3.2 uses NEON (look for "neon" in the wisdom posted above). The current version is 3.3.4 to which we might upgrade with the next app release I guess.

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139002861
RAC: 0

Quote:
Quote:
Doesn't fftw 3.3.3 use neon if available? Might be worth trying instead of sticking with 3.3.2.

Even 3.3.2 uses NEON (look for "neon" in the wisdom posted above). The current version is 3.3.4 to which we might upgrade with the next app release I guess.


Apparently NEON support was added with 3.3.1 beta1. However, 3.3.3 switched to 128-bit NEON instructions, which are said to provide a speedup even on processors with a 64-bit NEON pipe (see the FFTW release notes page).

The release version would be better.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 791534953
RAC: 1247636

Quote:
Quote:
Quote:
Doesn't fftw 3.3.3 use neon if available? Might be worth trying instead of sticking with 3.3.2.

Even 3.3.2 uses NEON (look for "neon" in the wisdom posted above). The current version is 3.3.4 to which we might upgrade with the next app release I guess.


Apparently NEON support was added with 3.3.1 beta1. However, 3.3.3 switched to 128-bit NEON instructions, which are said to provide a speedup even on processors with a 64-bit NEON pipe (see the FFTW release notes page).

The release version would be better.

I remember I tested 3.3.3 for Raspi and Android some time ago and found no significant improvement back then, but I should try 3.3.4 now. It will be a while before I have meaningful results, because to make an apples-to-apples comparison one needs to generate new wisdom first, which can take some time.

poppageek
Joined: 13 Aug 10
Posts: 259
Credit: 2473733872
RAC: 0

Thank you for supporting the Raspberry Pi and continuing to develop the app.

jason
Joined: 25 Mar 16
Posts: 1
Credit: 255071
RAC: 0

Quote:

For users who are running the non-beta version of the BRP app on the RPi 3, here's a wisdom file to try: store this in a file /etc/fftw/wisdomf (create the directory as needed) and restart BOINC.

Unfortunately this will not work for the BETA version of the app, which comes with pre-canned wisdom that cannot be replaced. I'll change that in the next version of the app.

Seems to help a bit and brings per task (CPU) runtime down below 9.5h for the first 4 tasks I've tried it with.

(fftw-3.3.2 fftwf_wisdom #x4a633eef #xb5a95564 #x91014bdd #x9c85ce5f
  (fftwf_dft_vrank_geq1_register 1 #x10048 #x10048 #x0 #xe9490235 #x35dbe44c #x93e9b2b1 #x4e9133c7)
  (fftwf_codelet_n2fv_16_neon 0 #x10448 #x10448 #x0 #x19bc193a #xdc910cad #x050dbf79 #x265533b0)
  (fftwf_dft_vrank_geq1_register 0 #x11048 #x11048 #x0 #x1dbb535b #x45a03fd1 #x17a58ce1 #x0f083ed2)
  (fftwf_dft_r2hc_register 0 #x11048 #x11048 #x0 #x92778231 #xf2c5be82 #xbf854e1f #xcdce7520)
  (fftwf_codelet_r2cfII_4 2 #x11048 #x11048 #x0 #x583c6dad #xcad0b14f #xd60d8871 #x3c3e732b)
  (fftwf_dft_vrank_geq1_register 1 #x10048 #x10048 #x0 #xf34d137e #x6e517ca2 #xea4876fa #x7285cf99)
  (fftwf_dft_buffered_register 0 #x11048 #x11048 #x0 #x617ea872 #x4f8387c0 #xc0e3f3b1 #x32b873cd)
  (fftwf_codelet_r2cf_4 2 #x11048 #x11048 #x0 #x1ccbb87b #xe43cf57c #xeb78f271 #x2bc4f22f)
  (fftwf_codelet_hc2cfdft2_4 0 #x11048 #x11048 #x0 #xc338dbbd #x81477318 #xc96aed6b #xb15ea60a)
  (fftwf_dft_vrank_geq1_register 0 #x10448 #x10448 #x0 #x09af9b00 #xa03bd811 #x26398994 #x31e5b135)
  (fftwf_dft_indirect_register 0 #x10048 #x10048 #x0 #x65a9d712 #xfa146285 #x829effba #x57598691)
  (fftwf_rdft_rank0_register 6 #x10448 #x10448 #x0 #xe74e9bbe #xd45decda #xc0f8c735 #x34c1a8ef)
  (fftwf_rdft_rank0_register 3 #x11048 #x11048 #x0 #xa3218bf8 #x1e4e02e5 #xf3ad505f #xc8d6e15d)
  (fftwf_codelet_q1_2 0 #x11048 #x11048 #x0 #x460f2bdc #x4aa37cb4 #x5c9974cb #x6f00dfca)
  (fftwf_dft_indirect_register 0 #x10448 #x10448 #x0 #x7bf71e1b #x4917bb5f #xd9d15633 #xf582acff)
  (fftwf_dft_r2hc_register 0 #x10448 #x10448 #x0 #x677e78b0 #xad96893c #x78204cfe #x023ab8d6)
  (fftwf_codelet_t1fv_2_neon 0 #x10048 #x10048 #x0 #xf837784a #xe72939cb #x379e76e3 #x8e126882)
  (fftwf_codelet_t1_4 0 #x10048 #x10048 #x0 #x86de9d2d #x51c61173 #x653af340 #x91ee094f)
  (fftwf_dft_vrank_geq1_register 1 #x11048 #x11048 #x0 #x38767c90 #x01ee70b5 #xb6e53cd8 #x51a820b2)
  (fftwf_dft_r2hc_register 0 #x10048 #x10048 #x0 #xc4998e57 #xf8f0c9f6 #x65134c9f #x1b3ee283)
  (fftwf_rdft_rank0_register 2 #x10048 #x10048 #x0 #xa037d173 #x67f7ab80 #x5845994a #x641865eb)
  (fftwf_codelet_q1fv_8_neon 0 #x11048 #x11048 #x0 #xd1bb3633 #x91bc40c2 #x20e3bbdc #x4f21b78b)
  (fftwf_codelet_t1fv_12_neon 0 #x10448 #x10448 #x0 #x4cb0c81a #x014af06f #x3fbb4580 #x23913d12)
  (fftwf_dft_nop_register 0 #x11048 #x11048 #x0 #xe1547730 #xce0f0276 #x1f492e5e #xa455fbfa)
  (fftwf_dft_vrank_geq1_register 0 #x11048 #x11048 #x0 #x1f032d84 #x8c4d1b96 #xdb1f2c30 #xb7dd028c)
  (fftwf_codelet_t3fv_16_neon 0 #x10448 #x10448 #x0 #xd28d84aa #x2f5c0613 #x99a566eb #x0767192a)
  (fftwf_codelet_t1fv_8_neon 1 #x11048 #x11048 #x0 #x5cf4974a #xe8483178 #xfe9b3550 #x48db71ba)
)

Hi, can you please share the parameters that you passed to "fftwf-wisdom" when you generated the wisdom file?

I compiled FFTW 3.3.4 and want to try it, but I don't know the parameters for generating the wisdom file.
I looked into the source code and figured out "-x rif4194304".
But the generated file contains fewer lines, so I think there must be more combinations.
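
To compare two wisdom files by more than line count, one could list the plan entries and pick out the NEON codelets. The following is only a minimal sketch: the function name `summarize_wisdom` is made up for illustration, the sample text is a short excerpt of the wisdom quoted above, and the regular expression simply assumes that every entry looks like the lines in that posting.

```python
import re

def summarize_wisdom(text):
    """Count plan entries in FFTW wisdom text and list the NEON codelets.

    Assumes (based only on the posted file) that each entry starts with
    "(fftwf_<name> <integer> ..." on its own line.
    """
    names = re.findall(r"^[ \t]*\((fftwf_\w+)[ \t]+\d+[ \t]", text, flags=re.MULTILINE)
    neon = sorted({name for name in names if name.endswith("_neon")})
    return len(names), neon

# Short excerpt of the wisdom posted above (header plus three entries).
sample = """(fftw-3.3.2 fftwf_wisdom #x4a633eef #xb5a95564 #x91014bdd #x9c85ce5f
  (fftwf_dft_vrank_geq1_register 1 #x10048 #x10048 #x0 #xe9490235 #x35dbe44c #x93e9b2b1 #x4e9133c7)
  (fftwf_codelet_n2fv_16_neon 0 #x10448 #x10448 #x0 #x19bc193a #xdc910cad #x050dbf79 #x265533b0)
  (fftwf_codelet_r2cf_4 2 #x11048 #x11048 #x0 #x1ccbb87b #xe43cf57c #xeb78f271 #x2bc4f22f)
)"""

count, neon = summarize_wisdom(sample)
print(count, neon)
```

Feeding it the contents of /etc/fftw/wisdomf and of a freshly generated file would show whether the new file is missing entries or just uses different codelets.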

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 791534953
RAC: 1247636

Quote:


Hi, can you please share the parameters that you passed to "fftwf-wisdom" when you generated the wisdom file?

I compiled FFTW 3.3.4 and want to try it, but I don't know the parameters for generating the wisdom file.
I looked into the source code and figured out "-x rif4194304".
But the generated file contains fewer lines, so I think there must be more combinations.

I tried "-x" ("exhaustive") first, but after more than 10 days (sic!) I stopped that and set a time limit of 10 hours instead. The FFT performed is a 3 * 2^22 real-to-complex in-place transform (the factor 3 comes from the command line options, and the 2^22 is dictated by the length of the data, i.e. the number of samples), which gives a size of 3 * 4194304 = 12582912, so

./fftwf-wisdom -v -t 10 -n -o wisdomf rif12582912

"-t 10" sets the time limit to 10 hours, "-o wisdomf" names the output file, and "-n" ignores any existing system wisdom and starts wisdom generation from scratch.

I ran fftwf-wisdom in parallel with E@H tasks to simulate similar loads on the processor, cache and memory bus.
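
For anyone who wants to adapt the command to a different transform length, here is a small sketch of how the problem name and the command line fit together. The helper names `wisdom_problem` and `wisdom_command` are invented for this example; the flags are the ones from the command above, and the "rif" prefix stands for real data, in-place, forward transform.

```python
def wisdom_problem(factor, log2_samples):
    """Build the fftwf-wisdom problem name for a real, in-place, forward FFT."""
    size = factor * 2 ** log2_samples
    # r = real data, i = in-place, f = forward, followed by the transform size
    return "rif" + str(size)

def wisdom_command(problem, hours, out_file="wisdomf"):
    """Assemble the argument list used in the post above."""
    # -v: verbose, -t: time limit in hours, -n: ignore system wisdom, -o: output file
    return ["./fftwf-wisdom", "-v", "-t", str(hours), "-n", "-o", out_file, problem]

problem = wisdom_problem(3, 22)
print(problem)
print(" ".join(wisdom_command(problem, 10)))
```

With factor 3 and 2^22 samples this reproduces the command line given above; the "-x rif4194304" attempt corresponds to leaving out the factor 3.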

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 791534953
RAC: 1247636

Quote:
Thank you for supporting the Raspberry Pi and continuing to develop the app.

You're welcome, and thank you for contributing to E@H.

I love the idea behind the Raspberry Pi, so in my spare time I try to make them productive on E@H. I have a total of 11 Raspis at home now, most of them doing E@H plus some other more or less useful (but fun) stuff, ranging from astronomy to cat surveillance:

RPi 1 A : 2 x
RPi 1 B : 3 x
RPi 1 A+: 1 x
RPi 1 B+: 1 x
RPi 2 B : 2 x
RPi 3 B : 2 x

I'd like to have a few Pi Zeros as well, but they are currently sold out.

poppageek
Joined: 13 Aug 10
Posts: 259
Credit: 2473733872
RAC: 0

The Pis are fun to play with. I only have three now but plan on ordering my first Pi 3 next week. A few more will follow later.

I love this project and the Raspberry Pi lets me continue to contribute during the summer when I have to shut down the GPUs due to heat.

Good to keep an eye on felines. Never know when they will decide you are no longer useful. ;-)

Cheers!

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 791534953
RAC: 1247636

I finally managed to get some GW tasks (here running 2 in parallel) to finish successfully on the Pi3:

https://einsteinathome.org/task/550861762

Just under 6 days. This was mostly a fun exercise, but it is at least useful for testing the code path in the app for hosts without Intel SSE2, which is otherwise unused on E@H.

EDIT: In this configuration the Raspi 3 should draw less than 5 W, so we are talking about ca. 0.4 kWh per task, which should be in the same ballpark as moderately modern desktop systems.

EDIT^2: Running 4 GW tasks in parallel, it will take ca. 7 days to finish, still well within the 14-day deadline. It requires active cooling to keep it from throttling down though, so let's say 6 W in total, or less than 0.3 kWh per task, or ca. 0.1 EUR per task in (German) electricity cost.
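
The per-task figures in the two edits above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the power draw, runtimes and task counts are the rounded upper bounds from the post, and the electricity price of 0.30 EUR/kWh is an assumed value for illustration, not a number from the thread.

```python
def energy_per_task_kwh(power_w, days, parallel_tasks):
    """Energy per task in kWh for a host drawing power_w watts for `days` days."""
    total_kwh = power_w * days * 24 / 1000.0
    return total_kwh / parallel_tasks

# Figures from the post: 2 tasks in ~6 days at ~5 W, 4 tasks in ~7 days at ~6 W.
two_tasks = energy_per_task_kwh(5, 6, 2)
four_tasks = energy_per_task_kwh(6, 7, 4)

ASSUMED_PRICE_EUR_PER_KWH = 0.30  # assumption, not taken from the thread

print(round(two_tasks, 3), "kWh per task with 2 tasks in parallel")
print(round(four_tasks, 3), "kWh per task with 4 tasks in parallel")
print(round(four_tasks * ASSUMED_PRICE_EUR_PER_KWH, 3), "EUR per task at the assumed price")
```

Running four tasks at once costs a day more of wall-clock time but spreads the (slightly higher) power draw over twice as many tasks, which is why the energy per task drops.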

Quote:

Good to keep an eye on felines. Never know when they will decide you are no longer useful. ;-)

LOL! I also decided to add motion detection to the setup so you can catch the few moments while I'm away when the cats actually do something other than sleeping ;-)

Cheers
HB

Anonymous

Quote:


EDIT:
some pictures of the potato chip cooling PI3 cruncher:


HB,

I'll see your bet and raise you 3 more tubes/fans. :>)
