O3AS Broken on RDNA3 (Linux, opencl-ati), Worked a Few Months Ago, Need Help Building Test Env

Paul
Joined: 3 May 07
Posts: 130
Credit: 1836062221
RAC: 478674
Topic 232049

A few months ago, I got new drivers, and O3AS, which had been my best performing E@H app, started throwing errors within a few seconds of starting.  I recently tried it again, and the error looks the same as far as I can remember.  Here's an example:

https://einsteinathome.org/task/1722984570

But it used to work, so I'm confused.  This isn't entirely out of character for RDNA3, though.  At the same time, another E@H app -- maybe MeerKAT? -- started working when it never had worked before.  And there are yet other problems, like crashes, that I'm not even thinking about right now.

In any case, I could use some help troubleshooting it.  This seems like a good place to start because it's 100% reproducible and occurs very quickly.  I've reported some of the past problems upstream, but they say they cannot help without being able to debug the live code for themselves.  I pointed them to the source page, but neither they nor I can quite get started building a test environment.  I would appreciate it if someone could help me build a test environment, so I can document it and explain it to the AMDGPU devs so they can reproduce the problem.  I realize this will require a bit more interaction, but there's no hurry; we can work asynchronously here or DM or maybe personal e-mail?  Whatever works.

ahorek's team
Joined: 16 Dec 05
Posts: 39
Credit: 249686477
RAC: 3077

It appears to be an issue related to incompatibility with Fedora's drivers.

https://einsteinathome.org/cs/content/all-sky-gravitational-wave-search-o3-v107-tasks-compilation-fail-ldlld-error-undefined-symbo

https://github.com/ROCm/ROCm/issues/3575

Only the developers can confirm whether there is a way around it in the code. You could try contacting Oliver Behnke.

Paul
Joined: 3 May 07
Posts: 130
Credit: 1836062221
RAC: 478674

Yeah, thanks.  I believe I have contacted the right people, but that's a new name.

tictoc
Joined: 1 Jan 13
Posts: 47
Credit: 7788387618
RAC: 8043699

There really shouldn't be any issues running O3AS on a 7900 XTX.

I see that you also have an A750 in that system.  Do you have mesa-libOpenCL installed?  There can be conflicts between the two OpenCL drivers.
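One way to see which OpenCL drivers are registered is to list the ICD files the loader reads. The sketch below is only an illustration: it assumes the usual Linux location `/etc/OpenCL/vendors` (a distribution may use a different directory), and `list_opencl_icds` is a made-up helper name, not part of any driver or BOINC tool.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each *.icd file name to the driver library it names.

    The default directory is an assumption about a typical Linux setup.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    icds = {}
    for icd_file in sorted(directory.glob("*.icd")):
        # Each .icd file is a small text file holding a library name or path.
        icds[icd_file.name] = icd_file.read_text(errors="replace").strip()
    return icds


if __name__ == "__main__":
    for name, library in list_opencl_icds().items():
        print(f"{name}: {library}")
```

If more than one entry shows up for the same GPU, that would be a hint that two OpenCL drivers are competing for it.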

Paul
Joined: 3 May 07
Posts: 130
Credit: 1836062221
RAC: 478674

Interesting suggestion.  I'm not sure why I don't, since I have everything else from mesa, but no, I don't have mesa-libOpenCL installed.  I wonder if I ran into this conflict before, removed it, and forgot about it.

I think I finally see what ahorek's team was saying about the error.  It now looks to me like a simple case of a missing function.  I'm not sure why O3AS is the only app that calls it, out of all the ones I run or have tried recently, but that is what the error suggests.

Paul
Joined: 3 May 07
Posts: 130
Credit: 1836062221
RAC: 478674

So, follow-up question: what is __printf_alloc()?  The suggestion above is that there is something wrong with Fedora.  But, that isn't a good explanation for the symptom.  I can find __printf_alloc() in both llvm libs and rocm-comgr.  So, it doesn't *seem* like it's missing.
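For anyone who wants to repeat that search, here is a rough sketch of how to check which files mention the name. It only looks for the raw bytes of the symbol name, so a hit does not prove the symbol is exported or linkable; `files_mentioning` is a hypothetical helper, and the library paths to scan are left to the reader.

```python
from pathlib import Path


def files_mentioning(symbol, paths):
    """Return the paths whose raw bytes contain the symbol name."""
    needle = symbol.encode("ascii")
    hits = []
    for path in paths:
        p = Path(path)
        try:
            data = p.read_bytes()
        except OSError:
            continue  # unreadable, missing, or a directory: skip it
        if needle in data:
            hits.append(str(p))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_symbol.py __printf_alloc <library> [<library> ...]
    if len(sys.argv) >= 3:
        for hit in files_mentioning(sys.argv[1], sys.argv[2:]):
            print(hit)
```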

It also looks like an internal call, so I'm really confused as to how any library could be built if it were missing internal calls.  Something doesn't add up.

So, I'm back to my original question.  Can someone please help me actually build a test environment for E@H?  It seems like the only way I can satisfy people who are willing to help me is if I can actually figure out how to build and test this app for myself.

ahorek's team
Joined: 16 Dec 05
Posts: 39
Credit: 249686477
RAC: 3077

btw what is your glibc version?

ldd --version

__printf_alloc is included in glibc 2.34+, but Einstein apps usually include all libraries statically for compatibility with older systems. The app is simply searching for a function in a dynamic library that is either missing on your system or doesn't match the expected version.
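If it is easier than reading the ldd output, the same check can be sketched in Python. This is only an illustration: the 2.34 threshold is the figure quoted in this post, not something the script verifies, and `parse_version` and `glibc_at_least` are made-up helper names.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.39' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_at_least(minimum, detected=None):
    """Compare glibc versions; return None if the C library is not glibc."""
    if detected is None:
        lib, version = platform.libc_ver()
        if lib != "glibc" or not version:
            return None
        detected = version
    # Tuples compare numerically, so 2.9 correctly sorts below 2.34.
    return parse_version(detected) >= parse_version(minimum)


if __name__ == "__main__":
    print("libc:", platform.libc_ver())
    print("glibc >= 2.34:", glibc_at_least("2.34"))
```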

Unfortunately, the current O3AS source doesn’t appear to be publicly available, so only the Einstein developers can assist. They may try building it with different flags, libraries, or configurations...
