GPU not picking up work...need help.

floyd
Joined: 12 Sep 11
Posts: 133
Credit: 186610495
RAC: 0

Quote:
Comparing the post at Rosetta here - I can't see anything abnormal.


You mean the post labeled (as of now) "Posted 1586 days ago"? To be clearer, I referred to Rosetta because a host there, which could be the same one as here at Einstein, has 100 percent download errors too. It also has hundreds of errors at Asteroids, though work still gets through there.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5888
Credit: 119796583082
RAC: 25755561

@ Henry Bundy.

It's not really a good idea to pick an existing thread and then hijack it, unless your problem is exactly the same as that of the person who started the thread.

In this case, the thread starter couldn't get GPU work because the necessary CUDA libs were not installed on the particular machine. The OS was Linux and the solution was to install the CUDA libs. Your problem is not the same - you seem to be blaming the project Admins for being derelict in their duty. It really doesn't reflect well on the complainant if basic checks are not made before issuing a complaint like this.

Quote:
I haven't been able to get any work units for several weeks.


Actually, you have, as your full tasks list shows. The scheduler has sent you lots of tasks recently (194 still showing in the online database) all of which have errored out with a download error. None of us can say precisely what caused this. However, if you click on one of the task IDs for an errored task (I chose the most recent task from your full list), you get the following stderr output:-

7.6.12

WU download error: couldn't get input files:

LATeah0131E.dat
-224 (permanent HTTP error)
permanent HTTP error

This would seem to indicate that something on your machine is interfering with the proper download of the required data file LATeah0131E.dat. Computation cannot start without this file and it seems as if something is preventing it from being downloaded. Do you have any ideas what this could be? Have you installed any security/anti-virus type software that might be interfering?
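If you have a lot of errored tasks to check, a small script can pull the file name and error code out of each stderr block. This is only a rough sketch: the function name parse_download_error is made up for illustration, and it assumes the stderr text has the same layout as the excerpt above (a "WU download error" line, then the file names, then a line with the numeric code).

```python
import re

# Stderr text copied from the errored task quoted above.
SAMPLE_STDERR = """7.6.12

WU download error: couldn't get input files:

LATeah0131E.dat
-224 (permanent HTTP error)
permanent HTTP error
"""

def parse_download_error(stderr_text):
    """Hypothetical helper: return files, code and reason, or None if no download error."""
    lines = [ln.strip() for ln in stderr_text.splitlines() if ln.strip()]
    files, code, reason = [], None, None
    in_block = False
    for ln in lines:
        if ln.startswith("WU download error"):
            in_block = True
            continue
        if not in_block:
            continue
        # A line such as "-224 (permanent HTTP error)" ends the list of files.
        m = re.match(r"(-\d+)\s*\((.+)\)$", ln)
        if m:
            code, reason = int(m.group(1)), m.group(2)
            break
        files.append(ln)
    if not in_block:
        return None
    return {"files": files, "code": code, "reason": reason}

print(parse_download_error(SAMPLE_STDERR))
```

If every errored task reports the same file and the same code, that points at one blocked download rather than a fault with individual tasks.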

Quote:
The Server Status page says that none of the Work Generator Programs are running. Yet, some people appear to be getting work units. I would really like the people who run this project to say what is going on.


You need to realise that the thing to look at is NOT the status of any work generator program but rather the number of 'Tasks to send'. There are usually several thousand in each category. Once the number available drops below a set 'low water mark' the work generator program will fire up temporarily to replenish the supply. This action is quite quick so the programs are not running for most of the time. So you are just seeing normal behaviour for those programs. There is potentially a real problem ONLY if you see essentially zero tasks ready to send.
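To picture why the work generators show as "not running" most of the time, here is a toy simulation of the low water mark idea. It is not the project's actual server code; the function names and all the numbers (starting supply, low water mark, batch size, tasks sent per tick) are invented for illustration.

```python
def replenish_if_low(tasks_to_send, low_water_mark, batch_size):
    """Return (new_count, generator_ran). The generator runs only below the mark."""
    if tasks_to_send < low_water_mark:
        return tasks_to_send + batch_size, True
    return tasks_to_send, False

def simulate(ticks, start, low_water_mark, batch_size, sent_per_tick):
    """Drain the supply each tick and count how often the generator had to run."""
    count = start
    runs = 0
    for _ in range(ticks):
        count = max(0, count - sent_per_tick)  # scheduler hands out tasks
        count, ran = replenish_if_low(count, low_water_mark, batch_size)
        runs += ran
    return count, runs

# Hypothetical numbers: the generator is idle on most ticks.
final_count, generator_runs = simulate(
    ticks=20, start=3000, low_water_mark=1000, batch_size=2500, sent_per_tick=400
)
print(final_count, generator_runs)
```

With these made-up values the generator fires on only a few of the 20 ticks, yet the supply never runs dry, which matches what the Server Status page shows.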

I had a look at your host's most recent scheduler contact. Here is an excerpt.

2015-12-06 00:54:04.5441 [PID=27752]    [send] CPU: req 91377.76 sec, 3.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] CUDA: req 21780.00 sec, 1.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] Intel GPU: req 21780.00 sec, 1.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] work_req_seconds: 91377.76 secs

....

2015-12-06 00:54:04.5450 [PID=27752] [send] stopping work search - daily quota exceeded (24>=24)
2015-12-06 00:54:04.5450 [PID=27752] [mixed] sending locality work second
2015-12-06 00:54:04.5452 [PID=27752] Daily result quota 24 exceeded for host 7124785
2015-12-06 00:54:04.5481 [PID=27752] [debug] [HOST#7124785] MSG(high) No work sent
2015-12-06 00:54:04.5481 [PID=27752] [debug] [HOST#7124785] MSG(high) (reached daily quota of 24 tasks)
2015-12-06 00:54:04.5481 [PID=27752] [debug] [HOST#7124785] MSG( low) Project has no jobs available


This shows that you are asking for both CPU work and GPU work, and the scheduler is refusing to send any because your host has already reached its daily quota of 24 tasks, having trashed too many tasks. Only you can fix that particular problem. Floyd has already pointed you in the correct direction.
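If you want to pick the relevant bits out of a scheduler log excerpt like the one above without reading every line, something like the following would do. It is a sketch only: the regular expressions are written against the handful of lines quoted here, not against any documented log format, and the name summarize is hypothetical.

```python
import re

# A few lines copied from the scheduler log excerpt above.
LOG_EXCERPT = """\
2015-12-06 00:54:04.5441 [PID=27752]    [send] CPU: req 91377.76 sec, 3.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] CUDA: req 21780.00 sec, 1.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] Intel GPU: req 21780.00 sec, 1.00 instances; est delay 0.00
2015-12-06 00:54:04.5441 [PID=27752]    [send] work_req_seconds: 91377.76 secs
2015-12-06 00:54:04.5450 [PID=27752] [send] stopping work search - daily quota exceeded (24>=24)
2015-12-06 00:54:04.5452 [PID=27752] Daily result quota 24 exceeded for host 7124785
"""

# Patterns assumed from the excerpt; other log versions may differ.
REQUEST = re.compile(r"\[send\] (?P<resource>[^:\n]+): req (?P<seconds>[\d.]+) sec")
QUOTA = re.compile(r"Daily result quota (?P<quota>\d+) exceeded for host (?P<host>\d+)")

def summarize(log_text):
    """Return (seconds requested per resource, (quota, host id) or None)."""
    requests = {m["resource"]: float(m["seconds"]) for m in REQUEST.finditer(log_text)}
    q = QUOTA.search(log_text)
    quota = (int(q["quota"]), int(q["host"])) if q else None
    return requests, quota

requests, quota = summarize(LOG_EXCERPT)
print(requests)
print(quota)
```

A non-empty requests result together with a quota hit means the client is asking correctly and the refusal is coming from the quota, not from a lack of available work.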

To help you in finding a solution, you should think about any recent software changes on your machine. When was the last time it was successfully returning work? What changes did you make at about that time?

Cheers,
Gary.
