fail/cpn: (Database)Campaign no longer loses jobs

Until now the JobServer silently lost jobs while claiming to be finished.
The workaround was to restart the campaign until all jobs were finished
according to both the database and the campaign's output.
This change fixes the underlying problem, so a single campaign run suffices
and no jobs are lost anymore.
Debugging this was awful and took us quite some time...

Change-Id: Ie6c982cc3b2ce11128941f1f13be563bae22565c
Michael Lenz
2014-01-15 12:45:23 +01:00
parent abd9decf0b
commit 9c984b9704
3 changed files with 41 additions and 12 deletions


@@ -38,7 +38,15 @@ public:
     */
    static bool rcvMsg(int sockfd, google::protobuf::Message& msg);
    /**
     * Receive Protobuf-generated message and just drop it
     * @param sockfd open socket descriptor to read from
     * \return false if message reception failed
     */
    static bool dropMsg(int sockfd);
private:
    static char * getBuf(int sockfd, int *size);
    static ssize_t safe_write(int fd, const void *buf, size_t count);
    static ssize_t safe_read(int fd, void *buf, size_t count);
};
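
The hunk above only declares dropMsg(); its implementation is not part of this excerpt. As a rough illustration of what "receive a message and just drop it" can look like on a stream socket, here is a minimal sketch. It assumes, purely for illustration, that each message is framed by a 4-byte length prefix in network byte order; the actual wire format used by the project is not shown here. The names read_exact and drop_framed_msg are hypothetical and do not come from the commit.

```cpp
#include <arpa/inet.h>  // ntohl, htonl
#include <unistd.h>     // read
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read exactly `count` bytes from `fd`,
// retrying on EINTR and on short reads. Returns false on error or EOF.
static bool read_exact(int fd, void *buf, size_t count)
{
    char *p = static_cast<char *>(buf);
    while (count > 0) {
        ssize_t n = read(fd, p, count);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, try again
            return false;                  // real I/O error
        }
        if (n == 0) return false;          // peer closed the connection
        p += n;
        count -= static_cast<size_t>(n);
    }
    return true;
}

// Sketch of a dropMsg-like function.
// ASSUMPTION: a message is a 4-byte big-endian length followed by that
// many payload bytes. The payload is consumed and discarded so the
// stream stays aligned on the next message boundary.
bool drop_framed_msg(int sockfd)
{
    uint32_t net_len = 0;
    if (!read_exact(sockfd, &net_len, sizeof net_len)) return false;

    size_t remaining = ntohl(net_len);
    char scratch[4096];  // fixed buffer, no allocation per message
    while (remaining > 0) {
        size_t chunk = remaining < sizeof scratch ? remaining : sizeof scratch;
        if (!read_exact(sockfd, scratch, chunk)) return false;
        remaining -= chunk;
    }
    return true;
}
```

Discarding through a fixed scratch buffer keeps the receiver in sync with the sender without parsing the payload. Whether the real dropMsg() works this way or reuses getBuf() is not visible in this excerpt.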